## Introduction

Serverless and cloud-hosted databases (Aurora Serverless, Neon, Supabase, Cloud SQL) aggressively terminate idle connections to conserve resources. Applications that hold connections open during periods of inactivity encounter `connection reset by peer` or `broken pipe` errors when the next query runs.

## Symptoms

- Intermittent `connection reset by peer` or `EOF` errors on the first query after an idle period
- `FATAL: terminating connection due to administrator command` in PostgreSQL logs
- `MySQL server has gone away` (ERROR 2006) on AWS RDS or Aurora
- Connection pool reports healthy connections that fail on first use
- Errors correlate with periods of low traffic (nights, weekends)

## Common Causes

- Cloud provider idle timeout (typically 5-15 minutes) is shorter than the application connection pool's max idle time
- Load balancer or proxy (ALB, nginx) dropping idle TCP connections
- NAT gateway connection tracking timeout expiring (350 seconds on AWS)
- Application not sending TCP keepalive probes on database connections
- Connection pool not validating connections before checkout
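The keepalive-related causes come down to socket options. As a minimal sketch using plain Python `socket` (database drivers set these options for you through their own parameters, and `TCP_KEEPIDLE`/`TCP_KEEPINTVL`/`TCP_KEEPCNT` are Linux-specific names), enabling probes looks like this:

```python
import socket

# Create a TCP socket and enable keepalive probes on it.
# Values mirror a 30s idle / 10s interval / 6 probe policy.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

if hasattr(socket, "TCP_KEEPIDLE"):  # Linux only
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 30)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 10)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 6)
```

Without `SO_KEEPALIVE`, an idle connection generates no traffic at all, so a NAT gateway or load balancer sees nothing to keep its tracking entry alive.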

## Step-by-Step Fix

1. **Identify the timeout source by checking provider settings**:

   ```sql
   -- PostgreSQL: check idle and keepalive settings
   SHOW idle_in_transaction_session_timeout;
   SHOW tcp_keepalives_idle;
   SHOW tcp_keepalives_interval;
   SHOW tcp_keepalives_count;
   ```

2. **Enable TCP keepalive on the database server**:

   ```sql
   ALTER SYSTEM SET tcp_keepalives_idle = 30;
   ALTER SYSTEM SET tcp_keepalives_interval = 10;
   ALTER SYSTEM SET tcp_keepalives_count = 6;
   SELECT pg_reload_conf();
   ```
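These three settings determine how quickly a dead peer is noticed. A quick sanity check (plain arithmetic; the 350s figure is AWS's NAT gateway tracking timeout mentioned under Common Causes):

```python
# Worst-case time to detect a dead peer: wait keepalives_idle
# seconds, then send keepalives_count probes spaced
# keepalives_interval seconds apart.
keepalives_idle = 30
keepalives_interval = 10
keepalives_count = 6

detection_time = keepalives_idle + keepalives_interval * keepalives_count
print(detection_time)  # 90 seconds

# The first probe must go out before the NAT tracking entry expires,
# so the idle setting must stay below the 350s AWS NAT timeout.
assert keepalives_idle < 350
```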
3. **Configure the connection pool to test connections before checkout**:

   ```python
   # SQLAlchemy with pre-ping
   from sqlalchemy import create_engine

   engine = create_engine(
       "postgresql+psycopg2://user:pass@host:5432/db",
       pool_size=10,
       max_overflow=20,
       pool_pre_ping=True,  # Tests connection before each checkout
       pool_recycle=180,    # Recycle connections every 3 minutes
       connect_args={
           "keepalives": 1,
           "keepalives_idle": 30,
           "keepalives_interval": 10,
           "keepalives_count": 6,
       },
   )
   ```
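Conceptually, `pool_pre_ping` validates a pooled connection with a cheap round-trip before handing it out, and discards it on failure. A toy sketch of that idea, using a hypothetical zero-argument `connect` factory and connections with a `ping()` method (real drivers issue something like `SELECT 1`):

```python
import collections

class PrePingPool:
    """Toy pool: validate each connection before handing it out.

    `connect` is any zero-argument factory returning an object with a
    ping() method that raises OSError on a dead connection
    (hypothetical interface, for illustration only).
    """

    def __init__(self, connect):
        self._connect = connect
        self._idle = collections.deque()

    def checkout(self):
        while self._idle:
            conn = self._idle.popleft()
            try:
                conn.ping()      # cheap round-trip, e.g. SELECT 1
                return conn      # still alive: reuse it
            except OSError:
                pass             # silently discard the dead connection
        return self._connect()   # pool empty or all stale: dial fresh

    def checkin(self, conn):
        self._idle.append(conn)
```

The point of this design is that the application never sees a stale connection: the error is absorbed inside checkout, at the cost of one extra round-trip per query.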

4. **For Node.js/pg, configure keepalive**:

   ```javascript
   const { Pool } = require('pg');

   const pool = new Pool({
     connectionString: process.env.DATABASE_URL,
     max: 20,
     idleTimeoutMillis: 30000,
     connectionTimeoutMillis: 5000,
     keepAlive: true,
     keepAliveInitialDelayMillis: 10000,
   });

   // Log unexpected errors on idle clients so they don't crash the process
   pool.on('error', (err) => {
     console.error('Unexpected pool error', err);
   });
   ```

5. **Implement retry logic at the application level**:

   ```python
   import time

   from sqlalchemy import text
   from sqlalchemy.exc import DatabaseError

   # `engine` is the SQLAlchemy engine created in step 3

   def execute_with_retry(query, params, max_retries=3):
       for attempt in range(max_retries):
           try:
               with engine.connect() as conn:
                   return conn.execute(text(query), params)
           except (ConnectionError, OSError, DatabaseError):
               if attempt == max_retries - 1:
                   raise
               engine.dispose()  # Discard the pool's stale connections
               time.sleep(0.5 * (2 ** attempt))
   ```
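The `0.5 * (2 ** attempt)` sleep produces an exponential backoff schedule. Computed out for `max_retries=3` (the last attempt re-raises instead of sleeping):

```python
# Seconds slept after each failed attempt in the retry loop above.
max_retries = 3
delays = [0.5 * (2 ** attempt) for attempt in range(max_retries - 1)]
print(delays)  # [0.5, 1.0]
```

In production you would usually add random jitter to these delays so that many clients hitting the same outage don't retry in lockstep.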

## Prevention

- Set `pool_recycle` to about 60% of the provider's idle timeout (e.g., 180s for a 300s timeout)
- Always enable `pool_pre_ping` or equivalent connection testing
- Configure TCP keepalive at both the OS and application levels
- Use PgBouncer or ProxySQL as a local connection pooler with shorter idle timeouts
- Monitor connection errors by type and alert on `connection reset` frequency
- For serverless databases, use provider-specific connection poolers (Neon proxy, Supavisor)
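The first rule can be captured as a one-liner (`recycle_seconds` is a hypothetical helper name; 0.6 encodes the 60% rule of thumb from the list above):

```python
def recycle_seconds(provider_idle_timeout_s, fraction=0.6):
    """Suggest a pool_recycle value as a fraction of the provider's
    idle timeout, so connections are retired before the server or a
    NAT gateway drops them. (Hypothetical helper for illustration.)"""
    return int(provider_idle_timeout_s * fraction)

print(recycle_seconds(300))  # 180 -- matches the example above
print(recycle_seconds(350))  # 210 -- sized against the AWS NAT timeout
```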