## Introduction

MongoDB's balancer migrates chunks between shards to maintain even data distribution. When a chunk migration fails because of network issues, size limits, or concurrent modifications, the chunk stays on its source shard. A chunk that is too large to migrate and cannot be split is marked as jumbo and skipped by the balancer, leading to data imbalance and hot shards.

## Symptoms

- `config.chunks` shows chunks with the `jumbo: true` flag
- Balancer logs show `chunk migration failed` with specific error messages
- One shard has significantly more data than the others
- Queries routed to the overloaded shard have higher latency
- `db.printShardingStatus()` shows uneven chunk distribution
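
To put a number on the imbalance, the chunk documents can be tallied per shard. The following is a minimal sketch, not an official tool: `summarizeChunks` is a hypothetical helper, and it assumes each input document carries the `shard` and `jumbo` fields seen in `config.chunks`. The sample array is made up for illustration.

```javascript
// Hypothetical helper: count total and jumbo chunks per shard.
// `chunks` is an array of documents shaped like those in config.chunks;
// only the `shard` and `jumbo` fields are read.
function summarizeChunks(chunks) {
  const summary = {};
  for (const c of chunks) {
    if (!summary[c.shard]) {
      summary[c.shard] = { total: 0, jumbo: 0 };
    }
    summary[c.shard].total += 1;
    if (c.jumbo === true) {
      summary[c.shard].jumbo += 1;
    }
  }
  return summary;
}

// Illustrative sample data, not real cluster output
const sampleChunks = [
  { shard: "shard1", jumbo: true },
  { shard: "shard1" },
  { shard: "shard1" },
  { shard: "shard2" }
];

console.log(JSON.stringify(summarizeChunks(sampleChunks)));
// prints {"shard1":{"total":3,"jumbo":1},"shard2":{"total":1,"jumbo":0}}
```

In a shell session the input could instead come from `db.getSiblingDB("config").chunks.find().toArray()`.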

## Common Causes

- Chunk exceeds the configured maximum chunk size and cannot be split, for example when many documents share a single shard key value
- Network timeout during migration of large chunks
- Concurrent writes to the chunk during migration causing a version conflict
- Recipient shard running out of disk space during migration
- Balancer window too restrictive to let migrations complete
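
When size is the suspected cause, it helps to compare a chunk's measured size against the limit. The sketch below assumes a `stats` object with a `size` field in bytes, as returned by the `dataSize` command. `exceedsChunkLimit` is a hypothetical helper, and the 128 MB limit in the example is an assumed value that should be replaced with the cluster's configured chunk size.

```javascript
// Hypothetical helper: report whether a chunk is larger than the allowed size.
// `stats.size` is assumed to be in bytes (as in the dataSize command result).
// `maxChunkSizeMB` is supplied by the caller; it is not read from the cluster.
function exceedsChunkLimit(stats, maxChunkSizeMB) {
  const limitBytes = maxChunkSizeMB * 1024 * 1024;
  return stats.size > limitBytes;
}

// Illustrative values only: a 200 MB chunk checked against an assumed 128 MB limit
const sampleStats = { size: 200 * 1024 * 1024, numObjects: 50000 };
console.log(exceedsChunkLimit(sampleStats, 128)); // prints true
```

In a shell session, `stats` could come from `db.runCommand({ dataSize: "mydb.mycollection", keyPattern: { shardKeyField: 1 }, min: chunk.min, max: chunk.max })`, where `shardKeyField` stands in for the real shard key.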

## Step-by-Step Fix

1. **Identify failed migrations and jumbo chunks**:
```javascript
// Find jumbo chunks
db.getSiblingDB("config").chunks.find({ jumbo: true }).forEach(function(c) {
  print("Jumbo chunk: " + c.ns + " | shard: " + c.shard + " | min: " + JSON.stringify(c.min));
});

// Check balancer status
sh.getBalancerState();
sh.isBalancerRunning();
```

2. **Clear the jumbo flag and retry migration**:
```javascript
// Clear jumbo flag for a specific chunk
// (recent MongoDB versions also provide the clearJumboFlag admin command,
// which avoids editing config.chunks directly)
db.getSiblingDB("config").chunks.updateOne(
  { ns: "mydb.mycollection", jumbo: true },
  { $unset: { jumbo: "" } }
);

// Or clear all jumbo flags
db.getSiblingDB("config").chunks.updateMany(
  { jumbo: true },
  { $unset: { jumbo: "" } }
);
```

3. **Split the oversized chunk before migration**:
```javascript
// Find the chunk range
var chunk = db.getSiblingDB("config").chunks.findOne({
  ns: "mydb.mycollection",
  jumbo: true
});

// Split the chunk at its median point
sh.splitFind("mydb.mycollection", chunk.min);

// Or split manually at a chosen shard key value inside the chunk's range
// (placeholder value; chunk.min is already a boundary and cannot be used)
db.adminCommand({ split: "mydb.mycollection", middle: { shardKeyField: "value" } });
```

4. **Manually move the chunk to a specific shard**:
```javascript
db.adminCommand({
  moveChunk: "mydb.mycollection",
  find: { shardKeyField: "value" },
  to: "shard2",
  _secondaryThrottle: true,
  _waitForDelete: true
});
```

5. **Check migration logs for specific error causes**:
```javascript
// On the mongos
db.adminCommand({ getLog: "global" }).log.filter(function(line) {
  return line.match(/moveChunk|migrate|jumbo/i);
});
```
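
6. **Adjust balancer settings for the migration window**:
```javascript
// Make sure the balancer is enabled
sh.setBalancerState(true);

// Set a wider balancer window (example times; adjust to your off-peak hours)
db.getSiblingDB("config").settings.updateOne(
  { _id: "balancer" },
  { $set: { activeWindow: { start: "22:00", stop: "06:00" } } },
  { upsert: true }
);

// Optional tuning; confirm the parameter exists in your server version first
db.adminCommand({
  setParameter: 1,
  balancerBulkMigrateBatchSize: 2
});
```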
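
To sanity-check a window before relying on it, it can help to test whether a given time of day falls inside it. This is a minimal sketch under stated assumptions: the window is an object of the form `{ start: "HH:MM", stop: "HH:MM" }`, the start is treated as inclusive and the stop as exclusive (an assumption, not confirmed server behavior), and `toMinutes` and `isWithinBalancerWindow` are hypothetical helper names.

```javascript
// Hypothetical helper: convert "HH:MM" to minutes since midnight.
function toMinutes(hhmm) {
  const match = /^([01]\d|2[0-3]):([0-5]\d)$/.exec(hhmm);
  if (!match) {
    throw new Error("Expected HH:MM, got: " + hhmm);
  }
  return Number(match[1]) * 60 + Number(match[2]);
}

// Returns true when `time` ("HH:MM") falls inside the window.
// Assumption: start is inclusive and stop is exclusive.
function isWithinBalancerWindow(activeWindow, time) {
  const start = toMinutes(activeWindow.start);
  const stop = toMinutes(activeWindow.stop);
  const t = toMinutes(time);
  if (start <= stop) {
    return t >= start && t < stop;
  }
  // The window wraps past midnight, for example 23:00 to 06:00
  return t >= start || t < stop;
}

const exampleWindow = { start: "23:00", stop: "06:00" };
console.log(isWithinBalancerWindow(exampleWindow, "02:30")); // prints true
console.log(isWithinBalancerWindow(exampleWindow, "12:00")); // prints false
```

Handling the wrap-around case separately matters because overnight windows, where the start time is later than the stop time, are a common choice for off-peak balancing.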