Introduction
Network operations are inherently unreliable. Connections can time out due to temporary DNS issues, server overload, network congestion, or load balancer health checks. Without retry logic, a single transient failure causes the entire operation to fail. The requests library does not retry by default.
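This no-retry default can be verified without any network call: a fresh `Session` mounts an `HTTPAdapter` whose retry budget is zero, so the first transient failure surfaces as an exception. A minimal sketch, assuming requests and urllib3 are installed (the host URL is illustrative):

```python
import requests

# A fresh Session mounts a default HTTPAdapter with a retry total of 0,
# so any transient failure propagates immediately to the caller.
session = requests.Session()
adapter = session.get_adapter("https://api.example.com")
print(adapter.max_retries.total)  # 0: no retries by default
```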
Symptoms
- `requests.exceptions.Timeout: HTTPSConnectionPool(host='api.example.com', port=443): Read timed out. (read timeout=10)`
- `requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))`
- Intermittent failures that succeed on manual retry
- `urllib3.exceptions.MaxRetryError` after exhausting retries
Common Causes
- No retry configuration on HTTP client
- Timeout values too aggressive for the target service
- Server-side slow responses under load
- Network instability between client and server
- DNS resolution delays causing connection timeouts
Step-by-Step Fix
1. Configure urllib3 `Retry` with exponential backoff:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()

retry_strategy = Retry(
    total=3,
    backoff_factor=1,
    status_forcelist=[429, 500, 502, 503, 504],
    allowed_methods=["HEAD", "GET", "PUT", "DELETE", "OPTIONS", "TRACE", "POST"],
    raise_on_status=False,
)

adapter = HTTPAdapter(max_retries=retry_strategy)
session.mount("https://", adapter)
session.mount("http://", adapter)

response = session.get("https://api.example.com/data", timeout=10)
```
2. Custom retry with jitter for production workloads:

```python
import time
import random
import requests

def request_with_retry(url, max_retries=5, base_delay=1, max_delay=60, **kwargs):
    """Retry with exponential backoff and jitter."""
    for attempt in range(max_retries):
        try:
            response = requests.get(url, timeout=10, **kwargs)
            response.raise_for_status()
            return response
        except (requests.exceptions.Timeout,
                requests.exceptions.ConnectionError) as e:
            if attempt == max_retries - 1:
                raise
            delay = min(base_delay * (2 ** attempt) + random.uniform(0, 1), max_delay)
            print(f"Attempt {attempt + 1} failed: {e}. Retrying in {delay:.1f}s")
            time.sleep(delay)
    return None

# Usage
response = request_with_retry("https://api.example.com/data")
```
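The delay schedule of that helper can be checked in isolation. Ignoring the random jitter term, `min(base_delay * 2**attempt, max_delay)` with the defaults above produces waits of 1, 2, 4, 8, and 16 seconds, and the cap takes over once the exponential term exceeds `max_delay`:

```python
def backoff_delays(max_retries=5, base_delay=1, max_delay=60):
    """Pre-jitter delays: min(base_delay * 2**attempt, max_delay) per attempt."""
    return [min(base_delay * (2 ** attempt), max_delay)
            for attempt in range(max_retries)]

print(backoff_delays())   # [1, 2, 4, 8, 16]
print(backoff_delays(8))  # cap kicks in: [1, 2, 4, 8, 16, 32, 60, 60]
```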
3. Retry only idempotent operations safely:

```python
from urllib3.util.retry import Retry

# Safe: retries only GET and HEAD (idempotent)
safe_retry = Retry(
    total=3,
    status_forcelist=[500, 502, 503, 504],
    allowed_methods=["HEAD", "GET"],
)

# For POST/PUT, only retry on connection errors, not status codes
post_retry = Retry(
    connect=3,
    read=0,  # Don't retry read timeouts for POST
    redirect=3,
    status_forcelist=[],  # No status code retries for non-idempotent requests
    allowed_methods=["POST"],
)
```
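As a cross-check, urllib3's own defaults follow the same idempotency rule: `Retry.DEFAULT_ALLOWED_METHODS` covers the idempotent HTTP methods and deliberately excludes POST. A quick sketch, assuming urllib3 >= 1.26 (where `allowed_methods` replaced the older `method_whitelist`):

```python
from urllib3.util.retry import Retry

default_methods = set(Retry.DEFAULT_ALLOWED_METHODS)
print(sorted(default_methods))

# POST is absent: urllib3 will not retry it unless you opt in explicitly
print("POST" in default_methods)  # False
print("GET" in default_methods)   # True
```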
4. Configure the connection pool along with retries:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
retry = Retry(total=3, backoff_factor=0.5, status_forcelist=[502, 503, 504])

adapter = HTTPAdapter(
    max_retries=retry,
    pool_connections=10,
    pool_maxsize=50,
    pool_block=False,
)
session.mount("https://", adapter)

# Connection pool settings prevent connection exhaustion during retries
```
Prevention
- Always configure retries for production HTTP clients
- Use the `tenacity` library for more sophisticated retry patterns:

```python
import requests
from tenacity import retry, stop_after_attempt, wait_exponential, retry_if_exception_type

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=10),
    retry=retry_if_exception_type((requests.exceptions.Timeout,
                                   requests.exceptions.ConnectionError)),
)
def fetch_data():
    return requests.get("https://api.example.com/data", timeout=10).json()
```
- Set connect timeout (5s) and read timeout (30s) separately
- Monitor retry rates in application metrics to detect upstream issues
- Implement a circuit breaker pattern with `pybreaker` for sustained failures
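The separate-timeout advice above combines naturally with the retry setup from the fix steps. A small session factory, as a sketch (the 3-retry budget, 0.5 backoff factor, and host URL are illustrative choices, not requirements):

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

CONNECT_TIMEOUT = 5   # fail fast if the TCP connection can't be established
READ_TIMEOUT = 30     # allow a slow server time to send the response body

def make_session(total_retries=3):
    """Session with retries mounted; pass the (connect, read) tuple per request."""
    session = requests.Session()
    retry = Retry(total=total_retries, backoff_factor=0.5,
                  status_forcelist=[429, 500, 502, 503, 504])
    adapter = HTTPAdapter(max_retries=retry)
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session

session = make_session()
# response = session.get("https://api.example.com/data",
#                        timeout=(CONNECT_TIMEOUT, READ_TIMEOUT))
```

requests accepts either a single number or a `(connect, read)` tuple for `timeout`; the tuple lets you keep connection failures fast while tolerating slow response bodies.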