# Fix AWS DynamoDB Throttling

Your application starts throwing ProvisionedThroughputExceededException errors. Latency spikes. Users complain about timeouts. DynamoDB is throttling your requests, and you need to figure out why and how to fix it.

DynamoDB throttling happens when you exceed your read or write capacity. With on-demand mode, throttling is rare but can still occur. With provisioned capacity, it's all about proper capacity planning and understanding your access patterns.
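Before touching capacity settings, make sure your clients retry throttled requests with exponential backoff and jitter; the AWS SDKs do this by default, and tuning the retry behavior is often the quickest mitigation. A minimal sketch of a "full jitter" backoff schedule (the base delay and cap are illustrative values, not SDK defaults):

```javascript
// Exponential backoff with full jitter: sleep a random amount between
// 0 and min(cap, base * 2^attempt) before each retry.
function backoffDelayMs(attempt, baseMs = 50, capMs = 5000, random = Math.random) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return random() * ceiling;
}

// The un-jittered ceiling doubles each attempt until it hits the cap.
for (let attempt = 0; attempt < 8; attempt++) {
  console.log(attempt, Math.min(5000, 50 * 2 ** attempt));
}
```

Jitter matters here: without it, a fleet of throttled clients retries in lockstep and re-creates the same traffic spike that caused the throttling.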

## Diagnosis Commands

First, identify which table is being throttled:

```bash
aws dynamodb list-tables \
  --query 'TableNames'
```

Check the table's capacity mode:

```bash
aws dynamodb describe-table \
  --table-name my-table \
  --query 'Table.[TableName,BillingModeSummary.BillingMode,ProvisionedThroughput]'
```

For provisioned tables, see current capacity:

```bash
aws dynamodb describe-table \
  --table-name my-table \
  --query 'Table.ProvisionedThroughput.[ReadCapacityUnits,WriteCapacityUnits]'
```

Check CloudWatch metrics for throttling events:

```bash
aws cloudwatch get-metric-statistics \
  --namespace AWS/DynamoDB \
  --metric-name ThrottledRequests \
  --dimensions Name=TableName,Value=my-table \
  --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%SZ) \
  --period 300 \
  --statistics Sum \
  --output table
```

The `date -d '1 hour ago'` syntax is GNU date; on macOS, use `date -u -v-1H +%Y-%m-%dT%H:%M:%SZ` instead.

Check consumed vs provisioned capacity:

```bash
aws cloudwatch get-metric-statistics \
  --namespace AWS/DynamoDB \
  --metric-name ConsumedReadCapacityUnits \
  --dimensions Name=TableName,Value=my-table \
  --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%SZ) \
  --period 300 \
  --statistics Sum \
  --output table

aws cloudwatch get-metric-statistics \
  --namespace AWS/DynamoDB \
  --metric-name ConsumedWriteCapacityUnits \
  --dimensions Name=TableName,Value=my-table \
  --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%SZ) \
  --period 300 \
  --statistics Sum \
  --output table
```
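The `Sum` statistic is the total units consumed over each period, so divide by the period length before comparing against the provisioned per-second rate. A small helper for that conversion (the numbers below are illustrative):

```javascript
// Convert a CloudWatch Sum datapoint into average consumed capacity
// units per second, then into percent utilization of the provisioned rate.
function utilizationPercent(sumConsumed, periodSeconds, provisionedUnits) {
  const avgPerSecond = sumConsumed / periodSeconds;
  return (avgPerSecond * 100) / provisionedUnits;
}

// e.g. 24,000 RCUs consumed over a 300 s period on a table
// provisioned at 100 RCU:
console.log(utilizationPercent(24000, 300, 100)); // 80
```

Sustained utilization near 100% means throttling is inevitable at any traffic spike; sustained low utilization with throttling points at a hot partition instead.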

Check read throttle events specifically (throttling while the table has spare overall capacity often points to a hot partition):

```bash
aws cloudwatch get-metric-statistics \
  --namespace AWS/DynamoDB \
  --metric-name ReadThrottleEvents \
  --dimensions Name=TableName,Value=my-table \
  --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%SZ) \
  --period 60 \
  --statistics Sum
```

Get the table's GSIs and their capacity:

```bash
aws dynamodb describe-table \
  --table-name my-table \
  --query 'Table.GlobalSecondaryIndexes[*].[IndexName,ProvisionedThroughput.ReadCapacityUnits,ProvisionedThroughput.WriteCapacityUnits]'
```

Check if auto scaling is enabled:

```bash
aws application-autoscaling describe-scalable-targets \
  --service-namespace dynamodb \
  --resource-ids table/my-table

aws application-autoscaling describe-scaling-policies \
  --service-namespace dynamodb \
  --resource-id table/my-table
```

## Common Causes and Solutions

### Insufficient Provisioned Capacity

The most straightforward cause: you're simply exceeding your provisioned capacity.

Increase provisioned capacity (both values must be supplied, even if only one changes):

```bash
aws dynamodb update-table \
  --table-name my-table \
  --provisioned-throughput ReadCapacityUnits=100,WriteCapacityUnits=50
```

Increase capacity for a GSI:

```bash
aws dynamodb update-table \
  --table-name my-table \
  --global-secondary-index-updates '[{"Update":{"IndexName":"my-gsi","ProvisionedThroughput":{"ReadCapacityUnits":50,"WriteCapacityUnits":25}}}]'
```

### Enable Auto Scaling

Auto scaling adjusts capacity based on actual usage:

```bash
# Create a scalable target for reads
aws application-autoscaling register-scalable-target \
  --service-namespace dynamodb \
  --scalable-dimension dynamodb:table:ReadCapacityUnits \
  --resource-id table/my-table \
  --min-capacity 10 \
  --max-capacity 200

# Create a scalable target for writes
aws application-autoscaling register-scalable-target \
  --service-namespace dynamodb \
  --scalable-dimension dynamodb:table:WriteCapacityUnits \
  --resource-id table/my-table \
  --min-capacity 10 \
  --max-capacity 200

# Create scaling policy for reads
aws application-autoscaling put-scaling-policy \
  --service-namespace dynamodb \
  --scalable-dimension dynamodb:table:ReadCapacityUnits \
  --resource-id table/my-table \
  --policy-name read-scaling-policy \
  --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration file://read-scaling.json
```

Where read-scaling.json contains:

```json
{
  "PredefinedMetricSpecification": {
    "PredefinedMetricType": "DynamoDBReadCapacityUtilization"
  },
  "TargetValue": 70.0,
  "ScaleOutCooldown": 60,
  "ScaleInCooldown": 60
}
```

### Switch to On-Demand Mode

On-demand mode handles capacity automatically but costs more for predictable workloads:

```bash
aws dynamodb update-table \
  --table-name my-table \
  --billing-mode PAY_PER_REQUEST
```

Note: You can switch a table's billing mode only once per 24 hours, in either direction.

### Hot Partition Problem

DynamoDB partitions data by partition key. If one partition gets too much traffic, it throttles even if total capacity is sufficient.

CloudWatch does not expose per-partition metrics directly. To identify hot partition keys, enable CloudWatch Contributor Insights for DynamoDB, which reports the most-accessed and most-throttled keys:

```bash
aws dynamodb update-contributor-insights \
  --table-name my-table \
  --contributor-insights-action ENABLE

aws dynamodb describe-contributor-insights \
  --table-name my-table
```

Solutions for hot partitions:

1. Use a more distributed partition key:

```javascript
// Bad - all users in one partition
{ PK: "USER", SK: "user123" }

// Better - users distributed across partitions
{ PK: "USER#A#user123", SK: "PROFILE" } // 'A' is a hash prefix
```

2. Add a random suffix:

```javascript
// Add random suffix for write-heavy workloads
const suffix = Math.floor(Math.random() * 10);
const partitionKey = `ITEM#${itemId}#${suffix}`;
// Note: reads must now fan out across all 10 suffixes
```

3. Use write sharding:

```javascript
// When writing, distribute across N shards
const shardId = hashCode(itemId) % 10;
const partitionKey = `ITEM#${shardId}#${itemId}`;
```
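Sharded writes imply sharded reads: a reader must either recompute the shard from the same hash, or fan out across every shard and merge the results. A sketch of both sides, using a toy `hashCode` (a real implementation would issue one DynamoDB query per shard key):

```javascript
const SHARD_COUNT = 10;

// Toy deterministic string hash; any stable hash works,
// as long as writers and readers use the same one.
function hashCode(s) {
  let h = 0;
  for (const ch of s) h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  return h;
}

// Point lookups recompute the shard from the item id.
function partitionKeyFor(itemId) {
  const shardId = hashCode(itemId) % SHARD_COUNT;
  return `ITEM#${shardId}#${itemId}`;
}

// Queries that can't recompute the hash (e.g. "all items")
// fan out one key per shard and merge the results.
function allShardKeys(prefix) {
  return Array.from({ length: SHARD_COUNT }, (_, i) => `${prefix}#${i}`);
}
```

The trade-off is explicit: write throughput spreads across `SHARD_COUNT` partitions, at the cost of up to `SHARD_COUNT` read requests for scatter-gather queries.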

### Inefficient Queries

Full table scans consume capacity for every item read, not just the items that match the filter. Track scan activity with the `ReturnedItemCount` metric filtered to the `Scan` operation:

```bash
aws cloudwatch get-metric-statistics \
  --namespace AWS/DynamoDB \
  --metric-name ReturnedItemCount \
  --dimensions Name=TableName,Value=my-table Name=Operation,Value=Scan \
  --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%SZ) \
  --period 300 \
  --statistics Sum
```

Replace scans with queries:

```bash
# Bad: Scan
aws dynamodb scan \
  --table-name my-table \
  --filter-expression "attribute_exists(email)"

# Good: Query with index
aws dynamodb query \
  --table-name my-table \
  --index-name email-index \
  --key-condition-expression "email = :email" \
  --expression-attribute-values '{":email":{"S":"user@example.com"}}'
```
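The capacity gap between the two is easy to quantify: a scan is billed on every byte it reads, while a query is billed only on the items it touches. A back-of-envelope comparison, assuming eventually consistent reads (0.5 RCU per 4 KB block; the table and item sizes are illustrative):

```javascript
// RCUs for an eventually consistent read of `bytes` of data:
// 0.5 RCU per 4 KB block, rounded up.
function eventuallyConsistentRCUs(bytes) {
  return 0.5 * Math.ceil(bytes / 4096);
}

// Scanning a 1 GB table to find one user by email:
const scanCost = eventuallyConsistentRCUs(1024 ** 3);

// Querying an email GSI that returns one 2 KB item:
const queryCost = eventuallyConsistentRCUs(2048);

console.log(scanCost, queryCost); // 131072 0.5
```

A single unindexed lookup can consume more capacity than hours of indexed traffic, which is why one background job doing scans can throttle an entire table.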

### Large Items

Large items consume more capacity units. One read capacity unit covers a strongly consistent read of up to 4 KB (eventually consistent reads cost half), and one write capacity unit covers a write of up to 1 KB. An 8 KB strongly consistent read therefore costs 2 RCUs, and a 3 KB write costs 3 WCUs.
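As a sketch of the billing arithmetic (sizes round up to the next 4 KB block for reads and the next 1 KB for writes):

```javascript
// One RCU = one strongly consistent read of up to 4 KB;
// an eventually consistent read costs half.
function readUnits(bytes, stronglyConsistent = true) {
  const blocks = Math.ceil(bytes / 4096);
  return stronglyConsistent ? blocks : blocks / 2;
}

// One WCU = one write of up to 1 KB.
function writeUnits(bytes) {
  return Math.ceil(bytes / 1024);
}

console.log(readUnits(4096));        // 1
console.log(readUnits(8192));        // 2
console.log(readUnits(8192, false)); // 1
console.log(writeUnits(3000));       // 3
```

Note the asymmetry: writes are billed at four times the granularity of reads, so trimming item size pays off most on write-heavy tables.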

Check item sizes:

```bash
aws dynamodb scan \
  --table-name my-table \
  --projection-expression "PK" \
  --return-consumed-capacity TOTAL
```

Solutions:

1. Split large items:

```javascript
// Instead of one large item
{ PK: "USER#123", data: {...largeObject} }

// Split into smaller items
{ PK: "USER#123", SK: "PROFILE", data: {...profileData} }
{ PK: "USER#123", SK: "PREFERENCES", data: {...prefs} }
{ PK: "USER#123", SK: "SETTINGS", data: {...settings} }
```

2. Compress large attributes:

```javascript
const zlib = require('zlib');

// Compress before writing (store as a base64 string or binary attribute)
const compressed = zlib.gzipSync(JSON.stringify(largeData)).toString('base64');

// Decompress after reading
const restored = JSON.parse(zlib.gunzipSync(Buffer.from(compressed, 'base64')).toString());
```

### GSI Throttling

GSIs have separate capacity and can throttle independently:

```bash
aws cloudwatch get-metric-statistics \
  --namespace AWS/DynamoDB \
  --metric-name ConsumedWriteCapacityUnits \
  --dimensions Name=TableName,Value=my-table Name=GlobalSecondaryIndexName,Value=my-gsi \
  --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%SZ) \
  --period 300 \
  --statistics Sum
```

Increase GSI capacity:

```bash
aws dynamodb update-table \
  --table-name my-table \
  --global-secondary-index-updates '[{
    "Update": {
      "IndexName": "my-gsi",
      "ProvisionedThroughput": {
        "ReadCapacityUnits": 100,
        "WriteCapacityUnits": 50
      }
    }
  }]'
```

### Burst Credit Exhaustion

DynamoDB allows bursting above provisioned capacity using accumulated credits. When credits are exhausted, throttling occurs.

Check for burst patterns:

```bash
aws cloudwatch get-metric-statistics \
  --namespace AWS/DynamoDB \
  --metric-name ConsumedReadCapacityUnits \
  --dimensions Name=TableName,Value=my-table \
  --start-time $(date -u -d '24 hours ago' +%Y-%m-%dT%H:%M:%SZ) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%SZ) \
  --period 3600 \
  --statistics Sum,Maximum,Average
```

Solution: Set provisioned capacity to handle your typical peak load, not just average.
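DynamoDB retains up to roughly five minutes (300 seconds) of unused capacity as burst credits, on a best-effort basis, so the maximum burst headroom is easy to estimate:

```javascript
// Maximum accumulated burst credit: up to ~300 seconds of unused
// provisioned capacity (a best-effort figure, not a guarantee).
function maxBurstCredits(provisionedUnits) {
  return provisionedUnits * 300;
}

// A table provisioned at 100 RCU can bank up to ~30,000 RCUs of burst:
console.log(maxBurstCredits(100)); // 30000
```

This is why a spike can succeed once and then throttle on the next occurrence: the first spike drains the credits, and they only refill while traffic stays below the provisioned rate.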

### Adaptive Capacity

DynamoDB's adaptive capacity automatically shifts throughput toward hot partitions, but it can never push the table beyond its total provisioned throughput.

If throttling persists, give adaptive capacity more headroom by increasing provisioned capacity temporarily:

```bash
aws dynamodb update-table \
  --table-name my-table \
  --provisioned-throughput ReadCapacityUnits=200,WriteCapacityUnits=100
```

Then reduce after traffic stabilizes:

```bash
aws dynamodb update-table \
  --table-name my-table \
  --provisioned-throughput ReadCapacityUnits=100,WriteCapacityUnits=50
```

## Verification Steps

After making changes, monitor throttling:

```bash
# Check throttling events
aws cloudwatch get-metric-statistics \
  --namespace AWS/DynamoDB \
  --metric-name ThrottledRequests \
  --dimensions Name=TableName,Value=my-table \
  --start-time $(date -u -d '30 minutes ago' +%Y-%m-%dT%H:%M:%SZ) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%SZ) \
  --period 60 \
  --statistics Sum

# Check capacity utilization
aws cloudwatch get-metric-statistics \
  --namespace AWS/DynamoDB \
  --metric-name ConsumedReadCapacityUnits \
  --dimensions Name=TableName,Value=my-table \
  --start-time $(date -u -d '30 minutes ago' +%Y-%m-%dT%H:%M:%SZ) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%SZ) \
  --period 60 \
  --statistics Sum
```

Test with actual workload:

```bash
# Simple write test
aws dynamodb put-item \
  --table-name my-table \
  --item '{"PK":{"S":"TEST#1"},"data":{"S":"test data"}}' \
  --return-consumed-capacity TOTAL

# Simple read test
aws dynamodb get-item \
  --table-name my-table \
  --key '{"PK":{"S":"TEST#1"}}' \
  --return-consumed-capacity TOTAL
```

Set up CloudWatch alarms:

```bash
aws cloudwatch put-metric-alarm \
  --alarm-name dynamodb-throttling \
  --alarm-description "DynamoDB throttling detected" \
  --namespace AWS/DynamoDB \
  --metric-name ThrottledRequests \
  --dimensions Name=TableName,Value=my-table \
  --statistic Sum \
  --period 60 \
  --threshold 1 \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --evaluation-periods 1 \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:alerts
```

Monitor capacity utilization. Because this alarm uses a `Sum` statistic, the threshold must be expressed in total consumed units over the period, not as a percentage: provisioned rate × period × target fraction. For a table provisioned at 100 RCU with a 300-second period, 80% utilization is 100 × 300 × 0.8 = 24,000:

```bash
aws cloudwatch put-metric-alarm \
  --alarm-name dynamodb-capacity-high \
  --alarm-description "DynamoDB read capacity utilization over 80%" \
  --namespace AWS/DynamoDB \
  --metric-name ConsumedReadCapacityUnits \
  --dimensions Name=TableName,Value=my-table \
  --statistic Sum \
  --period 300 \
  --threshold 24000 \
  --comparison-operator GreaterThanThreshold \
  --evaluation-periods 2 \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:alerts
```
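A tiny helper makes the threshold arithmetic reusable across tables (the 100 RCU, 300 s, and 80% figures are illustrative):

```javascript
// Alarm threshold in consumed units for a Sum-statistic capacity metric:
// provisioned per-second rate * period length * target utilization percent.
function alarmThreshold(provisionedUnits, periodSeconds, targetPercent) {
  return (provisionedUnits * periodSeconds * targetPercent) / 100;
}

// 80% of 100 provisioned RCUs over a 300-second period:
console.log(alarmThreshold(100, 300, 80)); // 24000
```

Recompute and update the alarm whenever provisioned capacity changes, or the threshold silently drifts out of sync with the table.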