What's Actually Happening
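AWS Lambda functions can fail during cold start initialization: the init phase (runtime startup plus any code that runs outside the handler) takes too long, and the function times out before the handler can process the event. For on-demand functions the init phase is limited to 10 seconds; if it does not finish in time, Lambda retries initialization during the first invocation, and that retry counts against the configured function timeout.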
The Error You'll See
Lambda timeout:
{
  "errorMessage": "2024-01-01T00:00:00.000Z Task timed out after 3.00 seconds",
  "errorType": "LambdaTimeoutException"
}

Init timeout:
{
  "errorMessage": "Initialization timed out",
  "errorType": "LambdaInitTimeoutException"
}

CloudWatch logs:
START RequestId: abc-123
Initializing function...
Init Duration: 3000.00 ms
END RequestId: abc-123
REPORT RequestId: abc-123 Duration: 3000.00 ms Billed Duration: 3000 ms Memory Size: 128 MB Max Memory Used: 64 MB

Why This Happens
1. Large deployment package: more code to download, unpack, and load during init
2. Heavy dependencies: many modules imported at startup
3. VPC configuration: ENI setup adds latency
4. Low memory: CPU is allocated in proportion to memory, so small functions initialize slowly
5. Short timeout: the function timeout is lower than init time plus handler time
6. Database connections: establishing connections during init is slow
Step 1: Analyze Cold Start Duration
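```bash
# Find recent cold starts (INIT_START is logged at the start of each init phase):
aws logs filter-log-events \
  --log-group-name /aws/lambda/my-function \
  --filter-pattern "INIT_START" \
  --limit 10

# Look for init duration:
# INIT_START Runtime Version: ...
# INIT_REPORT Init Duration: 2500.00 ms

# Check average invocation duration over the last hour.
# Note: the Duration metric does not include cold start (init) time.
# The date syntax below is GNU date; adjust it on macOS/BSD.
aws cloudwatch get-metric-statistics \
  --namespace AWS/Lambda \
  --metric-name Duration \
  --dimensions Name=FunctionName,Value=my-function \
  --statistics Average \
  --period 300 \
  --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)"

# Monitor cold starts:
# "Init Duration" appears only on the REPORT line of cold invocations:
aws logs tail /aws/lambda/my-function --since 1h | grep "Init Duration"

# Review asynchronous invocation settings (retries, maximum event age):
aws lambda list-function-event-invoke-configs --function-name my-function
```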
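To turn the raw log lines into numbers, the `Init Duration` field can be extracted and summarized. The following is a minimal sketch, assuming the log lines have already been fetched as plain strings (for example from `aws logs tail`); the function name `summarize_init_durations` is hypothetical, not part of any AWS tooling.

```python
import re
from statistics import mean

# Matches the "Init Duration: 123.45 ms" field that Lambda adds to the
# REPORT line of cold invocations only.
INIT_RE = re.compile(r"Init Duration: ([\d.]+) ms")


def summarize_init_durations(log_lines):
    """Return count, average, and max init duration (ms) found in log lines."""
    durations = []
    for line in log_lines:
        # Log tools may prefix lines with a timestamp, so search instead of
        # requiring the line to start with "REPORT".
        if "REPORT RequestId" not in line:
            continue
        match = INIT_RE.search(line)
        if match:
            durations.append(float(match.group(1)))
    if not durations:
        return {"cold_starts": 0, "avg_ms": None, "max_ms": None}
    return {
        "cold_starts": len(durations),
        "avg_ms": mean(durations),
        "max_ms": max(durations),
    }
```

Warm invocations have no `Init Duration` field, so they are skipped; the ratio of cold starts to total REPORT lines is a rough indicator of how often users hit a cold start.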
Step 2: Optimize Package Size
```bash
# Check deployment package size (in bytes):
aws lambda get-function --function-name my-function | jq '.Configuration.CodeSize'

# List function dependencies:
# For Node.js:
du -sh node_modules/
ls node_modules/ | wc -l

# For Python:
du -sh .package/
pip list --local

# Use Lambda layers for large dependencies:
# Create layer (the zip must contain nodejs/ at its root):
mkdir -p layer/nodejs
cp package.json layer/nodejs/
(cd layer/nodejs && npm install --production)
(cd layer && zip -r ../layer.zip nodejs)

# Publish layer:
aws lambda publish-layer-version \
  --layer-name my-deps \
  --zip-file fileb://layer.zip

# Update function to use layer (--layers replaces the function's whole layer list):
aws lambda update-function-configuration \
  --function-name my-function \
  --layers arn:aws:lambda:us-east-1:123456789:layer:my-deps:1

# Use webpack/esbuild for Node.js to bundle only the code that is used:
cat > webpack.config.js << 'EOF'
const path = require('path');

module.exports = {
  mode: 'production',
  target: 'node',
  entry: './src/index.js',
  output: {
    path: path.resolve(__dirname, 'dist'),
    filename: 'index.js',
    libraryTarget: 'commonjs2'
  }
};
EOF
```
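Before moving dependencies into a layer or a bundler, it helps to know which ones dominate the package. The sketch below ranks the entries of a dependency directory by on-disk size; the helper names `dir_size` and `largest_dependencies` are illustrative, and it assumes the dependencies sit in a single directory such as `node_modules/` or a Python package folder.

```python
import os


def dir_size(path):
    """Total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def largest_dependencies(deps_dir, top=10):
    """Return (name, bytes) for the largest entries in deps_dir, biggest first."""
    sizes = []
    for entry in os.scandir(deps_dir):
        if entry.is_dir(follow_symlinks=False):
            sizes.append((entry.name, dir_size(entry.path)))
        elif entry.is_file(follow_symlinks=False):
            sizes.append((entry.name, entry.stat().st_size))
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:top]
```

Calling `largest_dependencies("node_modules")` from the project root would list the ten biggest packages, which are usually the first candidates for removal, replacement, or bundling.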
Step 3: Reduce Initialization Code
```python
# Python - Move imports inside handler:

# BAD: All imports at top (slow cold start)
import pandas
import numpy
import boto3
import requests

def handler(event, context):
    ...  # handler code

# GOOD: Lazy imports (the cost moves to the first invocation that needs them)
def handler(event, context):
    import pandas  # Only import when needed
    import numpy
    ...  # handler code

# Initialize connections outside handler (reuse):
# BAD: New client every invocation
def handler(event, context):
    client = boto3.client('dynamodb')
    ...  # process

# GOOD: Reuse connection
import boto3
dynamodb = boto3.resource('dynamodb')  # Initialized once per container

def handler(event, context):
    table = dynamodb.Table('my-table')
    ...  # process
```
```javascript
// Node.js - Lazy loading:

// BAD: All imports at top
const aws = require('aws-sdk');
const _ = require('lodash');
const moment = require('moment');

// GOOD: Lazy load (an alternative to the block above, not the same file)
let aws, _, moment;

const getAWS = () => {
  if (!aws) aws = require('aws-sdk');
  return aws;
};

exports.handler = async (event) => {
  const AWS = getAWS();
  const dynamodb = new AWS.DynamoDB(); // DynamoDB is a constructor, so it needs `new`
  // process
};
```
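A small helper can make lazy loading explicit and show what each import costs. This is a sketch only: `lazy_import` is a hypothetical helper, and `json` stands in for a genuinely heavy dependency. Python already caches modules in `sys.modules`, so the `lru_cache` here mainly ensures the timing is logged once per execution environment.

```python
import importlib
import time
from functools import lru_cache


@lru_cache(maxsize=None)
def lazy_import(module_name):
    """Import a module on first use, log how long it took, and cache it."""
    started = time.perf_counter()
    module = importlib.import_module(module_name)
    elapsed_ms = (time.perf_counter() - started) * 1000
    print(f"imported {module_name} in {elapsed_ms:.1f} ms")
    return module


def handler(event, context):
    # Only the code path that needs the module pays the import cost.
    if event.get("format") == "json":
        json = lazy_import("json")  # stand-in for a heavy dependency
        return json.dumps({"ok": True})
    return "ok"
```

The trade-off is that the first invocation taking the slow path absorbs the import time instead of the init phase, so this helps most when heavy modules are needed only by a minority of requests.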
Step 4: Increase Memory and Timeout
```bash
# Check current memory:
aws lambda get-function-configuration --function-name my-function | jq '.MemorySize'

# Check current timeout:
aws lambda get-function-configuration --function-name my-function | jq '.Timeout'

# Increase memory (also increases CPU):
aws lambda update-function-configuration \
  --function-name my-function \
  --memory-size 1024  # 1GB, more CPU = faster init

# Increase timeout:
aws lambda update-function-configuration \
  --function-name my-function \
  --timeout 30  # 30 seconds

# Use memory-based profiling:
# More memory = more CPU = faster initialization
# Test with: 256MB, 512MB, 1024MB, 2048MB
```
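More memory costs more per millisecond but often finishes sooner, so the cheapest setting is not always the smallest. The sketch below compares measured durations across memory sizes. The price constant is an assumption (an x86 on-demand rate per GB-second; check current pricing for your region and architecture), and the function names are hypothetical.

```python
# Assumed on-demand price per GB-second; verify against current AWS pricing.
PRICE_PER_GB_SECOND = 0.0000166667


def cost_per_invocation(memory_mb, duration_ms, price=PRICE_PER_GB_SECOND):
    """Compute cost of one invocation in USD (excluding the per-request fee)."""
    return (memory_mb / 1024) * (duration_ms / 1000) * price


def cheapest_setting(measurements, max_duration_ms=None):
    """Pick the memory size with the lowest cost, optionally under a latency cap.

    measurements maps memory size in MB to a measured duration in ms.
    Returns None if no setting meets the latency cap.
    """
    candidates = {
        mem: dur
        for mem, dur in measurements.items()
        if max_duration_ms is None or dur <= max_duration_ms
    }
    if not candidates:
        return None
    return min(candidates, key=lambda mem: cost_per_invocation(mem, candidates[mem]))
```

With made-up measurements such as `{256: 2400, 512: 1100, 1024: 520, 2048: 300}`, `cheapest_setting` returns `1024`: it is both faster and cheaper per invocation than 256 MB, because duration drops faster than the price rises.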
Step 5: Configure Provisioned Concurrency
```bash
# Enable provisioned concurrency for warm starts:

# Publish a version (provisioned concurrency cannot target $LATEST,
# or an alias that points to $LATEST):
VERSION=$(aws lambda publish-version \
  --function-name my-function \
  --query Version --output text)

# Create alias:
aws lambda create-alias \
  --function-name my-function \
  --name production \
  --function-version "$VERSION"

# Configure provisioned concurrency:
aws lambda put-provisioned-concurrency-config \
  --function-name my-function \
  --qualifier production \
  --provisioned-concurrent-executions 5

# Check status:
aws lambda get-provisioned-concurrency-config \
  --function-name my-function \
  --qualifier production

# For Auto Scaling:
aws application-autoscaling register-scalable-target \
  --service-namespace lambda \
  --resource-id function:my-function:production \
  --scalable-dimension lambda:function:ProvisionedConcurrency \
  --min-capacity 2 \
  --max-capacity 10

aws application-autoscaling put-scaling-policy \
  --service-namespace lambda \
  --scalable-dimension lambda:function:ProvisionedConcurrency \
  --resource-id function:my-function:production \
  --policy-name my-policy \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration file://scaling-policy.json
```
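The `put-scaling-policy` command reads its target-tracking settings from `scaling-policy.json`, which the steps above do not show. The following sketch generates one plausible version of that file. The 0.7 target (scale when 70% of provisioned concurrency is in use) is an assumed starting point rather than a recommendation from the original steps, and the helper names are hypothetical.

```python
import json


def build_scaling_policy(target_utilization=0.7):
    """Build a target-tracking configuration for provisioned concurrency."""
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    return {
        "TargetValue": target_utilization,
        "PredefinedMetricSpecification": {
            # Predefined Application Auto Scaling metric for Lambda.
            "PredefinedMetricType": "LambdaProvisionedConcurrencyUtilization"
        },
    }


def write_scaling_policy(path="scaling-policy.json", target_utilization=0.7):
    """Write the configuration to the file that put-scaling-policy reads."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(build_scaling_policy(target_utilization), handle, indent=2)
```

A lower target keeps more warm capacity in reserve at a higher cost; a higher target is cheaper but lets more requests spill over to on-demand cold starts during bursts.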
Step 6: Optimize VPC Configuration
```bash
# VPC cold starts were historically slower due to ENI attachment

# Check if function uses VPC:
aws lambda get-function-configuration --function-name my-function | jq '.VpcConfig'

# Use AWS Lambda VPC improvements (automatic):
# Functions now use Hyperplane ENIs, which are set up when the function is
# created or its VPC settings change, not on each cold start (faster)

# Reduce VPC cold start:
# 1. Use provisioned concurrency for VPC functions
# 2. Increase memory (more CPU, faster initialization)

# Alternative: Use VPC-less architecture:
# - Use VPC endpoints for AWS services
# - Use RDS Proxy for database connections
# - Use HTTP APIs instead of private APIs

# For RDS:
# Use RDS Proxy (--auth takes a list, and the JSON keys are case-sensitive):
aws rds create-db-proxy \
  --db-proxy-name my-proxy \
  --engine-family MYSQL \
  --auth '[{"AuthScheme": "SECRETS", "SecretArn": "arn:aws:secretsmanager:..."}]' \
  --role-arn arn:aws:iam::...:role/lambda-role \
  --vpc-subnet-ids subnet-xxx subnet-yyy \
  --vpc-security-group-ids sg-zzz
```
Step 7: Optimize Database Connections
```python
# Python - Connection pooling:

import os
from psycopg2 import pool

# Connection pool lives outside the handler so warm invocations reuse it
connection_pool = None

def get_connection():
    """Create the pool on first use, then hand out a pooled connection."""
    global connection_pool
    if connection_pool is None:
        connection_pool = pool.SimpleConnectionPool(
            minconn=1,
            maxconn=10,
            host=os.environ['DB_HOST'],
            database=os.environ['DB_NAME'],
            user=os.environ['DB_USER'],
            password=os.environ['DB_PASSWORD']
        )
    return connection_pool.getconn()

def handler(event, context):
    conn = get_connection()
    try:
        # Use connection
        pass
    finally:
        # Return the connection to the pool instead of closing it
        connection_pool.putconn(conn)
```
```javascript
// Node.js - Connection reuse:

const { Pool } = require('pg');
let pool;

// Create the pool on first use and reuse it across warm invocations
const getPool = () => {
  if (!pool) {
    pool = new Pool({
      host: process.env.DB_HOST,
      database: process.env.DB_NAME,
      user: process.env.DB_USER,
      password: process.env.DB_PASSWORD,
      max: 5,
      idleTimeoutMillis: 30000
    });
  }
  return pool;
};

exports.handler = async (event) => {
  const db = getPool();
  const client = await db.connect();
  try {
    // Use client
  } finally {
    client.release();
  }
};
```
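A cached connection can go stale while an execution environment sits idle, so reuse usually needs a cheap health check before the connection is handed out. The sketch below illustrates that pattern with the standard-library `sqlite3` module as a stand-in for a real database driver; the names `_connection` and `get_connection` are illustrative, and a production version would catch the driver's own error types.

```python
import sqlite3

_connection = None  # lives for the lifetime of the execution environment


def get_connection():
    """Return a cached connection, reconnecting if the cached one is unusable."""
    global _connection
    if _connection is not None:
        try:
            _connection.execute("SELECT 1")  # cheap health check
            return _connection
        except sqlite3.Error:
            _connection = None  # stale or closed; fall through and reconnect
    _connection = sqlite3.connect(":memory:")
    return _connection


def handler(event, context):
    conn = get_connection()
    return conn.execute("SELECT 1 + 1").fetchone()[0]
```

Only the first invocation in each environment, or the first after a dropped connection, pays the connection cost; every other invocation pays for one trivial query.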
Step 8: Use Lambda SnapStart (Java)
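```bash
# For Java functions, enable SnapStart:
aws lambda update-function-configuration \
  --function-name my-java-function \
  --snap-start '{"ApplyOn": "PublishedVersions"}'

# Wait for the configuration update to finish before publishing:
aws lambda wait function-updated --function-name my-java-function

# Publish version (the snapshot is taken when the version is published):
aws lambda publish-version --function-name my-java-function

# SnapStart caches initialized state
# Significantly reduces cold start time
# It applies to published versions and aliases, not to $LATEST

# Requirements:
# - Java 11 or later
# - Supported runtimes: java11, java17, java21
# - Must be stateless or handle state restoration
```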
Step 9: Monitor and Alert
```bash
# CloudWatch metrics for cold starts:
# - Duration (metric; does not include init time)
# - Init Duration (in the REPORT log line)

# Create CloudWatch alarm (threshold is in milliseconds):
aws cloudwatch put-metric-alarm \
  --alarm-name lambda-cold-start-high \
  --alarm-description "Lambda cold starts taking too long" \
  --namespace AWS/Lambda \
  --metric-name Duration \
  --dimensions Name=FunctionName,Value=my-function \
  --statistic Average \
  --period 60 \
  --threshold 2000 \
  --comparison-operator GreaterThanThreshold \
  --evaluation-periods 3

# Use Lambda Insights for detailed metrics
# (--layers replaces the existing layer list, so include any layers already attached):
aws lambda update-function-configuration \
  --function-name my-function \
  --layers arn:aws:lambda:us-east-1:580247275435:layer:LambdaInsightsExtension:14

# X-Ray tracing:
aws lambda update-function-configuration \
  --function-name my-function \
  --tracing-config Mode=Active

# View traces in X-Ray console
```
Step 10: Test Cold Start Performance
```bash
# Create test script:
cat << 'EOF' > test-cold-start.sh
#!/bin/bash

FUNCTION=$1
PAYLOAD='{"test": "cold-start"}'

echo "=== Testing Cold Start ==="

for i in {1..5}; do
  # Force cold start by updating config
  aws lambda update-function-configuration \
    --function-name "$FUNCTION" \
    --description "Force cold start $i" > /dev/null 2>&1

  # Wait until the update has been applied
  aws lambda wait function-updated --function-name "$FUNCTION"

  # Invoke function (timing includes CLI and network overhead; %N needs GNU date)
  start=$(date +%s%N)
  aws lambda invoke \
    --function-name "$FUNCTION" \
    --payload "$PAYLOAD" \
    --cli-binary-format raw-in-base64-out \
    response.json > /dev/null 2>&1
  end=$(date +%s%N)

  duration=$(( (end - start) / 1000000 ))
  echo "Invocation $i: ${duration}ms"
done

rm -f response.json
EOF

chmod +x test-cold-start.sh
./test-cold-start.sh my-function

# Compare with provisioned concurrency
# (requires a published version or alias; $LATEST is not allowed):
aws lambda put-provisioned-concurrency-config \
  --function-name my-function \
  --qualifier production \
  --provisioned-concurrent-executions 1

# Test warm start by invoking the alias directly:
time aws lambda invoke \
  --function-name my-function:production \
  --payload '{"test": "warm-start"}' \
  --cli-binary-format raw-in-base64-out \
  response.json
# Should be faster
```
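Wall-clock timing from the CLI mixes network and process overhead with the function's own latency. A more direct check is to request the log tail with the invocation (`--log-type Tail` on `aws lambda invoke`, which returns a base64-encoded `LogResult`) and look for the `Init Duration` field. The sketch below decodes that value; the function name is hypothetical.

```python
import base64
import re

INIT_RE = re.compile(r"Init Duration: ([\d.]+) ms")


def init_duration_from_log_result(log_result_b64):
    """Return the init duration in ms for a cold start, or None for a warm one."""
    log_text = base64.b64decode(log_result_b64).decode("utf-8", errors="replace")
    match = INIT_RE.search(log_text)
    return float(match.group(1)) if match else None
```

A return value of `None` means the invocation reused a warm environment, which is what invocations served by provisioned concurrency should normally show.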
Lambda Cold Start Timeout Checklist
| Check | Command | Expected |
|---|---|---|
| Package size | get-function (CodeSize) | < 50 MB zipped |
| Memory | get-function-configuration (MemorySize) | Sufficient |
| Timeout | get-function-configuration (Timeout) | > init time |
| Init duration | CloudWatch logs | < timeout |
| Dependencies | Check package | Minimized |
| VPC config | get-function-configuration (VpcConfig) | Necessary? |
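The checklist can be applied mechanically to a function's configuration. The sketch below assumes a dict shaped like the JSON returned by `get-function-configuration` (`CodeSize` in bytes, `MemorySize` in MB, `Timeout` in seconds, `VpcConfig`) plus an init duration taken from the logs. The function name and the 128 MB memory heuristic are assumptions for illustration.

```python
MAX_ZIPPED_BYTES = 50 * 1024 * 1024  # checklist threshold for package size


def check_configuration(config, init_duration_ms):
    """Return a list of human-readable warnings for a function configuration."""
    warnings = []
    if config.get("CodeSize", 0) >= MAX_ZIPPED_BYTES:
        warnings.append("package is 50 MB or larger")
    if config.get("Timeout", 0) * 1000 <= init_duration_ms:
        warnings.append("timeout does not exceed init duration")
    if config.get("MemorySize", 0) <= 128:
        warnings.append("memory is at the minimum; consider testing larger sizes")
    # VpcConfig may be missing, null, or present with an empty VpcId.
    if (config.get("VpcConfig") or {}).get("VpcId"):
        warnings.append("function is attached to a VPC; confirm it is necessary")
    return warnings
```

An empty list means none of the checklist items flagged a problem; it does not prove that the cold start is fast enough, only that the obvious misconfigurations are absent.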
Verify the Fix
```bash
# After optimizing cold start

# 1. Check init duration reduced
aws logs tail /aws/lambda/my-function --since 10m | grep "Init Duration"
# Expected: < 1000 ms

# 2. Test invocation succeeds
aws lambda invoke --function-name my-function output.json
# Expected: StatusCode: 200 and no FunctionError field

# 3. Check response time
time aws lambda invoke --function-name my-function --cli-binary-format raw-in-base64-out --payload '{}' response.json && cat response.json
# Expected: response received

# 4. Monitor under load
# Run multiple concurrent invocations
# Expected: all succeed

# 5. Check CloudWatch metrics (GNU date syntax)
aws cloudwatch get-metric-statistics \
  --namespace AWS/Lambda \
  --metric-name Duration \
  --dimensions Name=FunctionName,Value=my-function \
  --statistics Average \
  --period 300 \
  --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
  --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
# Expected: reduced duration

# 6. Verify provisioned concurrency
aws lambda get-provisioned-concurrency-config --function-name my-function --qualifier production
# Expected: Status: READY
```
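The "monitor under load" step can be scripted with a small thread pool. The sketch below is transport-agnostic: `invoke` is any zero-argument callable that raises on failure, so it can wrap an SDK call or an HTTP request. The function name `run_load_test` and its defaults are assumptions for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load_test(invoke, requests=20, concurrency=5):
    """Call invoke() `requests` times with up to `concurrency` in flight.

    Returns (successes, failures, latencies_ms).
    """
    def timed_call(_):
        started = time.perf_counter()
        try:
            invoke()
            ok = True
        except Exception:
            ok = False
        return ok, (time.perf_counter() - started) * 1000

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_call, range(requests)))

    successes = sum(1 for ok, _ in results if ok)
    latencies = [ms for _, ms in results]
    return successes, requests - successes, latencies
```

With boto3, `invoke` could wrap `client.invoke(FunctionName="my-function", Payload=b"{}")`. Since a timed-out function still returns status code 200 with a `FunctionError` field, the wrapper should raise when that field is present so the failure is counted.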