Introduction

NumPy arrays are stored in contiguous memory blocks. When creating large arrays, the total memory required (elements * dtype size) may exceed available RAM or the process address space limit. On 32-bit systems, the per-process limit is 2-4 GB. On 64-bit systems, the limit is physical RAM plus swap.
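That requirement is easy to estimate up front. As a quick sanity check, here is the calculation for the (200000, 10000) float64 shape used in the examples below:

```python
import numpy as np

# Memory required = number of elements * bytes per element
shape = (200_000, 10_000)
n_elements = np.prod(shape)
nbytes = n_elements * np.dtype(np.float64).itemsize
print(f"{nbytes / 2**30:.1f} GiB")  # ~14.9 GiB for float64
```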

Symptoms

  • numpy.core._exceptions.MemoryError: Unable to allocate 14.9 GiB for an array with shape (200000, 10000) and data type float64
  • System becomes unresponsive during allocation (thrashing)
  • Process killed by OOM killer on Linux
  • Allocation works on machine A but fails on machine B with less RAM

Common Causes

  • Loading entire datasets into memory without chunking
  • Using float64 when float32 would suffice (2x memory)
  • Creating intermediate arrays during computation that double memory
  • 32-bit Python process hitting 2 GB address space limit
  • Memory fragmentation preventing large contiguous allocation
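The intermediate-array cause above is worth illustrating: an expression like `c = a * 2 + b` allocates temporaries, roughly doubling peak memory. A minimal sketch of avoiding this with in-place ufuncs (array sizes here are illustrative):

```python
import numpy as np

a = np.ones(1_000_000, dtype=np.float32)
b = np.ones(1_000_000, dtype=np.float32)

# c = a * 2 + b would allocate two temporaries (a * 2, then the sum).
# In-place ufuncs with out= reuse an existing buffer instead:
np.multiply(a, 2, out=a)  # a *= 2, no temporary
np.add(a, b, out=a)       # a += b, no temporary
```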

Step-by-Step Fix

  1. Check current memory usage and array size:

```python
import psutil

arr_size = 200_000 * 10_000 * 8  # float64 = 8 bytes per element
print(f"Required: {arr_size / 1e9:.1f} GB")

mem = psutil.virtual_memory()
print(f"Available: {mem.available / 1e9:.1f} GB")
```

  2. Use smaller data types:

```python
import numpy as np

# Instead of the default float64 (8 bytes per element):
arr = np.zeros((200_000, 10_000), dtype=np.float32)  # 50% memory reduction
arr = np.zeros((200_000, 10_000), dtype=np.float16)  # 75% reduction
arr = np.zeros((200_000, 10_000), dtype=np.int32)    # 50% reduction
```

  3. Use memory-mapped files for out-of-core computation:

```python
import numpy as np

# Create the array on disk instead of in RAM
arr = np.memmap('large_array.dat', dtype='float32', mode='w+',
                shape=(200_000, 10_000))

# Access works like a normal array; data is stored on disk
arr[0, 0] = 3.14
arr.flush()  # Ensure data is written to disk
```
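Reopening the file later with mode='r' maps it read-only, so pages are loaded from disk on demand rather than pulled into RAM up front. A small sketch (the shape and temporary path are illustrative):

```python
import os
import tempfile
import numpy as np

path = os.path.join(tempfile.mkdtemp(), 'small.dat')

# Write phase: create and populate the file-backed array
w = np.memmap(path, dtype='float32', mode='w+', shape=(100, 100))
w[0, 0] = 3.14
w.flush()
del w  # Release the writer

# Read phase: mode='r' maps the file read-only, loading pages on demand
r = np.memmap(path, dtype='float32', mode='r', shape=(100, 100))
print(r[0, 0])  # ~3.14
```

The shape is not stored in the .dat file, so it must be passed again when reopening; saving via np.save and reloading with np.load(..., mmap_mode='r') avoids that.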

  4. Process data in chunks. Note that np.load(filename)[start:end] would read the entire file on every iteration; opening it once with mmap_mode='r' makes each slice read only that chunk from disk:

```python
import numpy as np

def process_in_chunks(filename, n_rows=200_000, chunk_size=10_000):
    total = np.zeros(10_000, dtype=np.float64)
    data = np.load(filename, mmap_mode='r')  # Memory-map, don't load
    for start in range(0, n_rows, chunk_size):
        end = min(start + chunk_size, n_rows)
        chunk = np.asarray(data[start:end])  # Read only this chunk
        total += chunk.sum(axis=0)
        del chunk  # Free memory immediately
    return total
```
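A quick way to sanity-check chunked processing is to compare it against the full-array result on a small file. A self-contained sketch (shapes, chunk size, and the temporary path are illustrative):

```python
import os
import tempfile
import numpy as np

path = os.path.join(tempfile.mkdtemp(), 'data.npy')
rng = np.random.default_rng(0)
np.save(path, rng.standard_normal((1_000, 50)))

# Full-array reference result
full = np.load(path).sum(axis=0)

# Chunked version: memory-map the file and accumulate per-chunk sums
data = np.load(path, mmap_mode='r')
total = np.zeros(50)
for start in range(0, data.shape[0], 100):
    total += np.asarray(data[start:start + 100]).sum(axis=0)

print(np.allclose(total, full))  # True
```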
  5. Use Dask for distributed out-of-core arrays:

```python
import dask.array as da
import numpy as np

# Creates a chunked, lazy array; nothing is loaded into memory yet
arr = da.zeros((200_000, 10_000), chunks=(10_000, 10_000), dtype=np.float32)
result = arr.mean(axis=0).compute()  # Loads only the chunks it needs
```

Prevention

  • Monitor memory with tracemalloc during development
  • Use arr.nbytes to check an array's memory footprint
  • Prefer generators and chunked processing for large datasets
  • Consider pyarrow for columnar data with efficient memory layout
  • Set resource limits: ulimit -v 8000000 (8 GB virtual memory) to catch issues early
  • Use numba with @njit for computation without creating intermediate arrays
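The tracemalloc suggestion above works for NumPy buffers because NumPy reports its data allocations to Python's tracer. A minimal sketch of measuring peak allocation during development (the array size is illustrative):

```python
import tracemalloc
import numpy as np

tracemalloc.start()
arr = np.ones((1_000, 1_000), dtype=np.float64)  # ~8 MB buffer
current, peak = tracemalloc.get_traced_memory()
print(f"peak: {peak / 1e6:.1f} MB")
tracemalloc.stop()
```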