Introduction
NumPy arrays are stored in contiguous memory blocks. When creating large arrays, the total memory required (number of elements × dtype size in bytes) may exceed available RAM or the process address space limit. On 32-bit systems, the per-process limit is 2-4 GB; on 64-bit systems, the practical limit is physical RAM plus swap.
Symptoms
- `MemoryError: Unable to allocate 15.3 GiB for an array with shape (200000, 10000) and data type float64`
- System becomes unresponsive during allocation (thrashing)
- Process killed by OOM killer on Linux
- Allocation works on machine A but fails on machine B with less RAM
Common Causes
- Loading entire datasets into memory without chunking
- Using float64 when float32 would suffice (2x memory)
- Creating intermediate arrays during computation that double memory
- 32-bit Python process hitting 2 GB address space limit
- Memory fragmentation preventing large contiguous allocation
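
The intermediate-array cause above is worth illustrating: an expression like `c = a + b` allocates a brand-new temporary the size of the operands, while in-place operations reuse an existing buffer. A minimal sketch (the names `a`, `b`, `c` are illustrative):

```python
import numpy as np

a = np.ones((1_000, 1_000), dtype=np.float64)  # ~8 MB
b = np.ones_like(a)

# a + b allocates a new ~8 MB temporary before binding it to c:
c = a + b

# In-place variants reuse a's existing buffer, so peak memory stays flat:
a += b                          # equivalent to np.add(a, b, out=a)
np.multiply(a, 2.0, out=a)      # out= directs the result into a

print(a.nbytes)  # 8000000 bytes either way; the savings are in temporaries
```

For long expressions, chaining `out=` calls (or using `numexpr`) keeps peak memory close to the size of the inputs rather than a multiple of it.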
Step-by-Step Fix
1. Check current memory usage and array size:

```python
import psutil

arr_size = 200_000 * 10_000 * 8  # float64 = 8 bytes per element
print(f"Required: {arr_size / 1e9:.1f} GB")

mem = psutil.virtual_memory()
print(f"Available: {mem.available / 1e9:.1f} GB")
```
2. Use smaller data types:

```python
import numpy as np

# Instead of the default float64
arr = np.zeros((200_000, 10_000), dtype=np.float32)  # 50% memory reduction
arr = np.zeros((200_000, 10_000), dtype=np.float16)  # 75% reduction
arr = np.zeros((200_000, 10_000), dtype=np.int32)    # 50% reduction
```
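
When the data already exists as float64, down-casting with `astype` produces a smaller copy; a sketch that verifies both the savings (via `nbytes`) and the precision loss before discarding the original:

```python
import numpy as np

big = np.linspace(0, 1, 1_000_000, dtype=np.float64)  # 8 MB
small = big.astype(np.float32)                        # 4 MB copy

print(big.nbytes, small.nbytes)  # 8000000 4000000
# Confirm the rounding error is acceptable before dropping the float64 data
assert np.allclose(big, small, atol=1e-6)
del big  # Release the float64 buffer once the float32 copy is validated
```

Note that float16 has only ~3 decimal digits of precision and a maximum value of 65504, so validate ranges carefully before using it.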
3. Use memory-mapped files for out-of-core computation:

```python
import numpy as np

# Create the array on disk instead of in RAM
arr = np.memmap('large_array.dat', dtype='float32', mode='w+',
                shape=(200_000, 10_000))

# Access works like a normal array; data is stored on disk
arr[0, 0] = 3.14
arr.flush()  # Ensure data is written to disk
```
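
A round-trip sketch of the memmap workflow: write in `'w+'` mode, flush, then reopen read-only with `mode='r'`. A small shape and a temporary path are used here so the example runs quickly:

```python
import numpy as np
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'large_array.dat')
shape = (1_000, 100)

mm = np.memmap(path, dtype='float32', mode='w+', shape=shape)
mm[0, 0] = 3.14
mm.flush()   # Push dirty pages to disk
del mm       # Close the writable map

# Reopen read-only; the data persists on disk, not in RAM
ro = np.memmap(path, dtype='float32', mode='r', shape=shape)
print(float(ro[0, 0]))  # ~3.14 (float32 rounding)
```

The shape and dtype are not stored in the raw file, so they must be passed again when reopening (or use `np.lib.format.open_memmap`, which writes a self-describing `.npy` header).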
4. Process data in chunks:

```python
import numpy as np

def process_in_chunks(filename, chunk_size=10_000):
    total = np.zeros(10_000, dtype=np.float64)
    data = np.load(filename, mmap_mode='r')  # Memory-map; nothing loaded yet
    for start in range(0, 200_000, chunk_size):
        end = min(start + chunk_size, 200_000)
        chunk = np.asarray(data[start:end])  # Read only this chunk from disk
        total += chunk.sum(axis=0)
        del chunk  # Free the chunk before the next iteration
    return total
```

Note the `mmap_mode='r'` argument: a plain `np.load(filename)[start:end]` would read the entire file into memory on every iteration and defeat the purpose of chunking.
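
A runnable miniature of the chunked-sum pattern above, using a smaller shape so it executes quickly (the file path and sizes are illustrative):

```python
import numpy as np
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'data.npy')
rng = np.random.default_rng(0)
full = rng.random((1_000, 50))
np.save(path, full)

total = np.zeros(50, dtype=np.float64)
data = np.load(path, mmap_mode='r')          # Memory-mapped .npy file
for start in range(0, data.shape[0], 200):
    chunk = np.asarray(data[start:start + 200])  # Only this slice is read
    total += chunk.sum(axis=0)

# The chunked result matches the all-at-once computation
print(np.allclose(total, full.sum(axis=0)))
```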
5. Use Dask for distributed out-of-core arrays:

```python
import dask.array as da
import numpy as np

# Creates a lazily evaluated, chunked array; nothing is loaded into memory
arr = da.zeros((200_000, 10_000), chunks=(10_000, 10_000), dtype=np.float32)
result = arr.mean(axis=0).compute()  # Only materializes the chunks it needs
```
Prevention
- Monitor memory with `tracemalloc` during development
- Check an array's memory footprint with `arr.nbytes`
- Prefer generators and chunked processing for large datasets
- Consider `pyarrow` for columnar data with an efficient memory layout
- Set resource limits to catch issues early: `ulimit -v 8000000` (8 GB virtual memory)
- Use `numba` with `@njit` to compile loops that avoid creating intermediate arrays
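
The `tracemalloc` suggestion can be sketched as follows; NumPy (1.13+) reports its allocations through tracemalloc, so array buffers show up in the traced totals:

```python
import tracemalloc

import numpy as np

tracemalloc.start()
arr = np.zeros((1_000, 1_000), dtype=np.float64)  # ~8 MB allocation
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"Peak traced: {peak / 1e6:.1f} MB")  # At least ~8 MB
```

Wrapping suspect code paths this way during development surfaces oversized allocations long before they hit the `MemoryError` in production.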