Introduction

Network operations are inherently unreliable. Connections can time out due to temporary DNS issues, server overload, network congestion, or load balancer health checks. Without retry logic, a single transient failure causes the entire operation to fail. The requests library does not retry by default.

Symptoms

  • requests.exceptions.Timeout: HTTPSConnectionPool(host='api.example.com', port=443): Read timed out. (read timeout=10)
  • requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
  • Intermittent failures that succeed on manual retry
  • urllib3.exceptions.MaxRetryError after exhausting retries

Common Causes

  • No retry configuration on HTTP client
  • Timeout values too aggressive for the target service
  • Server-side slow responses under load
  • Network instability between client and server
  • DNS resolution delays causing connection timeouts

Step-by-Step Fix

  1. Configure urllib3 Retry with exponential backoff:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()

retry_strategy = Retry(
    total=3,
    backoff_factor=1,
    status_forcelist=[429, 500, 502, 503, 504],
    allowed_methods=["HEAD", "GET", "PUT", "DELETE", "OPTIONS", "TRACE", "POST"],
    raise_on_status=False,
)

adapter = HTTPAdapter(max_retries=retry_strategy)
session.mount("https://", adapter)
session.mount("http://", adapter)

response = session.get("https://api.example.com/data", timeout=10)
```
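For intuition, `backoff_factor` controls a delay that doubles on each successive retry. A small sketch of that doubling schedule (a helper of my own for illustration, not urllib3's internal `get_backoff_time`, whose first-sleep behavior varies between urllib3 versions):

```python
def backoff_schedule(backoff_factor, retries):
    """Illustrative doubling schedule: backoff_factor * 2**n for retry n."""
    return [backoff_factor * (2 ** n) for n in range(retries)]

# With backoff_factor=1 and total=3 as above, delays roughly double each retry:
print(backoff_schedule(1, 3))  # [1, 2, 4]
```

Raising `backoff_factor` stretches the whole schedule proportionally, which is usually the first knob to turn when a service needs more recovery time.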

  2. Custom retry with jitter for production workloads:

```python
import random
import time

import requests

def request_with_retry(url, max_retries=5, base_delay=1, max_delay=60, **kwargs):
    """Retry with exponential backoff and jitter."""
    for attempt in range(max_retries):
        try:
            response = requests.get(url, timeout=10, **kwargs)
            response.raise_for_status()
            return response
        except (requests.exceptions.Timeout,
                requests.exceptions.ConnectionError) as e:
            if attempt == max_retries - 1:
                raise
            delay = min(base_delay * (2 ** attempt) + random.uniform(0, 1), max_delay)
            print(f"Attempt {attempt + 1} failed: {e}. Retrying in {delay:.1f}s")
            time.sleep(delay)
    return None

# Usage
response = request_with_retry("https://api.example.com/data")
```

  3. Retry only idempotent operations safely:

```python
from urllib3.util.retry import Retry

# Safe: retries only GET and HEAD (idempotent)
safe_retry = Retry(
    total=3,
    status_forcelist=[500, 502, 503, 504],
    allowed_methods=["HEAD", "GET"],
)

# For POST/PUT, only retry on connection errors, not status codes
post_retry = Retry(
    connect=3,
    read=0,               # don't retry read timeouts for POST
    redirect=3,
    status_forcelist=[],  # no status code retries for non-idempotent requests
    allowed_methods=["POST"],
)
```
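A method is idempotent when repeating it leaves the server in the same state, which is why GET and HEAD are safe to retry blindly. As a reference point, a small guard based on the methods RFC 7231 defines as idempotent (the helper name is my own):

```python
# Methods RFC 7231 defines as idempotent; POST and PATCH are excluded.
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"})

def safe_to_retry(method):
    """Return True if a request with this HTTP method can be repeated blindly."""
    return method.upper() in IDEMPOTENT_METHODS

print(safe_to_retry("get"))   # True
print(safe_to_retry("POST"))  # False
```

A check like this is useful in a shared HTTP wrapper, so non-idempotent requests never pick up a retry policy by accident.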

  4. Configure the connection pool alongside retries:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
retry = Retry(total=3, backoff_factor=0.5, status_forcelist=[502, 503, 504])

adapter = HTTPAdapter(
    max_retries=retry,
    pool_connections=10,  # number of host pools to cache
    pool_maxsize=50,      # max connections kept per host
    pool_block=False,
)
session.mount("https://", adapter)

# Connection pool settings prevent connection exhaustion during retries
```

Prevention

  • Always configure retries for production HTTP clients
  • Use the tenacity library for more sophisticated retry patterns:

```python
import requests
from tenacity import retry, stop_after_attempt, wait_exponential, retry_if_exception_type

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=10),
    retry=retry_if_exception_type((requests.exceptions.Timeout,
                                   requests.exceptions.ConnectionError)),
)
def fetch_data():
    return requests.get("https://api.example.com/data", timeout=10).json()
```

  • Set the connect timeout (e.g. 5s) and read timeout (e.g. 30s) separately
  • Monitor retry rates in application metrics to detect upstream issues
  • Implement the circuit breaker pattern with pybreaker for sustained failures
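The circuit breaker idea itself needs no library: after N consecutive failures the breaker "opens" and rejects calls for a cooldown period, then lets one trial call through. A minimal pure-Python sketch of the pattern (not pybreaker's actual API; names and defaults are my own):

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: opens after max_failures consecutive failures,
    rejects calls until reset_timeout elapses, then allows one trial call."""

    def __init__(self, max_failures=5, reset_timeout=30.0):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: refusing call")
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0  # any success closes the circuit
        return result
```

Wrapping `session.get(...)` in `breaker.call(...)` stops a client from hammering an upstream service that is already down, which retries alone can make worse.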