Introduction

SQLAlchemy's QueuePool reuses database connections for performance, but when every pooled connection is in use and max_overflow is exhausted, new requests fail with TimeoutError: QueuePool limit of size 5 overflow 10 reached. This happens under high concurrency, when connections are not returned to the pool (uncommitted transactions, unclosed sessions), or when the database server drops idle connections that the pool still considers live. The default pool_size of 5 is often too small for production workloads, and unclosed sessions are the most common cause of pool exhaustion.

Symptoms

```bash
sqlalchemy.exc.TimeoutError: QueuePool limit of size 5 overflow 10 reached, connection timed out, timeout 30.00
```

Or:

```bash
sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) FATAL: remaining connection slots are reserved for non-replication superuser connections
```

Connection leak detection:

```python
# Check pool status
print(engine.pool.status())
# Reports pool size, current overflow, and checked-out connections.
# When the app is idle, the checked-out and overflow counts should drop
# back to zero; if they keep climbing, connections are leaking.
```
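For continuous leak detection, SQLAlchemy's pool events can count checkouts against checkins. A minimal sketch, using an in-memory SQLite URL as a stand-in for your real database (QueuePool is forced explicitly, since SQLite defaults to a different pool class):

```python
from sqlalchemy import create_engine, event, text
from sqlalchemy.pool import QueuePool

# sqlite stand-in URL for illustration; use your real DSN in production
engine = create_engine("sqlite://", poolclass=QueuePool)

counters = {"checkout": 0, "checkin": 0}

@event.listens_for(engine, "checkout")
def on_checkout(dbapi_conn, connection_record, connection_proxy):
    counters["checkout"] += 1

@event.listens_for(engine, "checkin")
def on_checkin(dbapi_conn, connection_record):
    counters["checkin"] += 1

# A clean acquire/release cycle keeps the two counts equal
conn = engine.connect()
conn.execute(text("SELECT 1"))
conn.close()
print(counters)
```

A steadily growing gap between checkouts and checkins points to a code path that never releases its connection.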

Common Causes

  • Sessions not closed: session.close() (or scoped_session.remove()) not called after use
  • Pool size too small: Default pool_size=5 insufficient for concurrent requests
  • Long-running transactions: Connections held during slow queries or external API calls
  • Connection leaks in error paths: Exception thrown before session.close()
  • pool_recycle too long: Database server closes idle connections before the pool recycles them
  • max_overflow set too low: Burst traffic exceeds pool_size + max_overflow
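The error-path leak is easy to reproduce: if an exception escapes before the session is closed and something still references the session (here simulated by a list, standing in for e.g. a request object), the connection never returns to the pool. A minimal sketch with an in-memory SQLite stand-in URL:

```python
from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker
from sqlalchemy.pool import QueuePool

engine = create_engine("sqlite://", poolclass=QueuePool)  # stand-in URL
Session = sessionmaker(bind=engine)

leaked = []  # simulates a long-lived reference keeping the session alive

def handler():
    session = Session()
    leaked.append(session)
    session.execute(text("SELECT 1"))  # session now holds a pooled connection
    raise RuntimeError("boom")         # raised before any session.close()

try:
    handler()
except RuntimeError:
    pass

# The leaked session's connection is still checked out of the pool
print(engine.pool.checkedout())
```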

Step-by-Step Fix

Step 1: Configure connection pool properly

```python
from sqlalchemy import create_engine

engine = create_engine(
    "postgresql://user:pass@localhost/dbname",
    pool_size=20,        # increased from the default of 5
    max_overflow=30,     # allow 30 overflow connections during bursts
    pool_timeout=30,     # wait 30 seconds before raising TimeoutError
    pool_recycle=1800,   # recycle connections every 30 minutes
    pool_pre_ping=True,  # verify connection before use
)
```

Step 2: Use context manager for session lifecycle

```python
from contextlib import contextmanager
from sqlalchemy.orm import sessionmaker

Session = sessionmaker(bind=engine)

@contextmanager
def db_session():
    """Context manager that guarantees session cleanup."""
    session = Session()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()  # always returns the connection to the pool

# Usage
with db_session() as session:
    ...  # run queries; commit, rollback, and close are handled automatically
```
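Wired to an in-memory SQLite engine for a quick, self-contained check, the manager commits on success and rolls back when an exception escapes the block:

```python
from contextlib import contextmanager
from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker

engine = create_engine("sqlite://")  # stand-in URL for illustration
Session = sessionmaker(bind=engine)

@contextmanager
def db_session():
    session = Session()
    try:
        yield session
        session.commit()
    except Exception:
        session.rollback()
        raise
    finally:
        session.close()  # connection returns to the pool on every path

with db_session() as s:
    s.execute(text("CREATE TABLE t (x INTEGER)"))

try:
    with db_session() as s:
        s.execute(text("INSERT INTO t VALUES (1)"))
        raise RuntimeError("boom")   # triggers the rollback path
except RuntimeError:
    pass

with db_session() as s:
    count = s.execute(text("SELECT COUNT(*) FROM t")).scalar()
print(count)  # 0 -> the failed block was rolled back, not committed
```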

Step 3: Monitor pool health

```python
import logging

# Enable pool logging to detect leaks
logging.basicConfig()
logging.getLogger("sqlalchemy.pool").setLevel(logging.DEBUG)

# Pool status snapshot for monitoring
def pool_status():
    pool = engine.pool
    return {
        "pool_size": pool.size(),
        "checked_in": pool.checkedin(),
        "checked_out": pool.checkedout(),
        "overflow": pool.overflow(),
    }

# Call periodically or from a /health endpoint
print(pool_status())
# e.g. {'pool_size': 20, 'checked_in': 18, 'checked_out': 2, 'overflow': 0}
```

Prevention

  • Use context managers or FastAPI/Flask middleware to guarantee session.close()
  • Set pool_size based on expected concurrency, not the default of 5
  • Enable pool_pre_ping to detect and replace stale connections automatically
  • Set pool_recycle below your database server's idle timeout
  • Add connection pool monitoring to your application's health checks
  • Use scoped_session for thread-safe session management in web applications
  • Profile slow queries -- a single slow query can hold a connection for minutes
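The scoped_session pattern from the list above ties session lifetime to the current thread, so repeated calls in one request share a session and a single remove() releases it. A minimal sketch with an in-memory SQLite stand-in:

```python
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

engine = create_engine("sqlite://")  # stand-in URL for illustration
Session = scoped_session(sessionmaker(bind=engine))

# Within one thread, Session() always returns the same session object
a = Session()
b = Session()
print(a is b)  # True

Session.remove()  # closes the thread's session, returning its connection

c = Session()
print(a is c)  # False -> a fresh session after remove()
```

In a web app, call Session.remove() from a teardown hook so every request ends with its connection back in the pool.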