MongoDB Chunk Migration Failed in Sharded Cluster

How to diagnose and retry failed chunk migrations in MongoDB sharded clusters.


Introduction

MongoDB's balancer migrates chunks between shards to keep data evenly distributed. A migration can fail for several reasons, including network issues, a full recipient shard, or concurrent modifications. Separately, when a chunk grows past the configured chunk size and cannot be split, the balancer flags it as jumbo and skips it. Both cases lead to data imbalance and hot shards.

Symptoms

- `config.chunks` shows chunks with the `jumbo: true` flag
- Balancer logs show `chunk migration failed` with specific error messages
- One shard has significantly more data than the others
- Queries routed to the overloaded shard have higher latency
- `db.printShardingStatus()` shows uneven chunk distribution
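
To make the uneven-distribution symptom concrete, here is a minimal sketch of how chunk metadata could be summarized per shard. It works on plain objects shaped like `config.chunks` documents (the `shard` and `jumbo` fields are the ones used elsewhere in this article). The function names `summarizeChunks` and `isImbalanced`, the sample data, and the 2x threshold are illustrative assumptions, not MongoDB behavior.

```javascript
// Hypothetical helper: counts chunks and jumbo chunks per shard.
// Input: array of objects shaped like config.chunks documents.
function summarizeChunks(chunks) {
  const perShard = {};
  for (const c of chunks) {
    if (!perShard[c.shard]) {
      perShard[c.shard] = { chunks: 0, jumbo: 0 };
    }
    perShard[c.shard].chunks += 1;
    if (c.jumbo === true) {
      perShard[c.shard].jumbo += 1;
    }
  }
  return perShard;
}

// Hypothetical check: true when the busiest shard holds more than
// `ratio` times as many chunks as the least loaded shard.
function isImbalanced(perShard, ratio = 2) {
  const counts = Object.values(perShard).map((s) => s.chunks);
  if (counts.length < 2) return false;
  return Math.max(...counts) > ratio * Math.min(...counts);
}

// Sample data (made up for illustration)
const sampleChunks = [
  { ns: "mydb.mycollection", shard: "shard1", jumbo: true },
  { ns: "mydb.mycollection", shard: "shard1" },
  { ns: "mydb.mycollection", shard: "shard1" },
  { ns: "mydb.mycollection", shard: "shard2" }
];

const chunkSummary = summarizeChunks(sampleChunks);
console.log(chunkSummary);
console.log("imbalanced:", isImbalanced(chunkSummary));
```

In a shell session, the input could come from `db.getSiblingDB("config").chunks.find().toArray()`. Shards that own no chunks at all do not appear in the summary, so the check understates the imbalance in that case.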

Common Causes

- Chunk exceeds the configured chunk size and cannot be split, typically because many documents share the same shard key value
- Network timeout during migration of large chunks
- Concurrent writes to the chunk during migration causing a version conflict
- Recipient shard running out of disk space during migration
- Balancer window too narrow for migrations to complete

Step-by-Step Fix

1. Identify failed migrations and jumbo chunks:

```javascript
// Find jumbo chunks
// (on MongoDB 5.0+, config.chunks identifies the collection by uuid instead of ns)
db.getSiblingDB("config").chunks.find({ jumbo: true }).forEach(function (c) {
  print("Jumbo chunk: " + c.ns + " | shard: " + c.shard + " | min: " + JSON.stringify(c.min));
});

// Check balancer status
sh.getBalancerState();
sh.isBalancerRunning();
```

2. Clear the jumbo flag and retry the migration:

```javascript
// Clear the jumbo flag for a specific chunk.
// (MongoDB 4.2.3+ also provides the clearJumboFlag admin command,
// which avoids editing config.chunks directly.)
db.getSiblingDB("config").chunks.updateOne(
  { ns: "mydb.mycollection", jumbo: true },
  { $unset: { jumbo: "" } }
);

// Or clear all jumbo flags
db.getSiblingDB("config").chunks.updateMany(
  { jumbo: true },
  { $unset: { jumbo: "" } }
);
```

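3. Split the oversized chunk before migration:

```javascript
// Find the chunk range
var chunk = db.getSiblingDB("config").chunks.findOne({
  ns: "mydb.mycollection",
  jumbo: true
});

// Split the chunk at its median point.
// (sh.splitAt needs a point strictly inside the range, so chunk.min cannot be used there.)
sh.splitFind("mydb.mycollection", chunk.min);

// Or split manually at a chosen shard key value.
// "splitValue" is a placeholder; it must lie strictly between chunk.min and chunk.max.
db.adminCommand({ split: "mydb.mycollection", middle: { shardKeyField: "splitValue" } });
```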
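
A split point passed to `middle` has to fall strictly inside the chunk's range. As a rough sketch, assuming a single-field numeric shard key (the field name `shardKeyField` follows the placeholder used in this article, and `buildSplitCommand` is a hypothetical helper, not a shell function), a candidate split point could be computed like this:

```javascript
// Hypothetical helper: builds a `split` command document for a chunk whose
// shard key is a single numeric field. Returns null when no split point exists.
function buildSplitCommand(ns, chunkDoc, keyField) {
  const lo = chunkDoc.min[keyField];
  const hi = chunkDoc.max[keyField];
  // MinKey/MaxKey bounds and non-numeric keys are outside this sketch's scope.
  if (typeof lo !== "number" || typeof hi !== "number") return null;
  const mid = lo + Math.floor((hi - lo) / 2);
  // The split point must be strictly greater than min and strictly less than max.
  if (mid <= lo || mid >= hi) return null;
  return { split: ns, middle: { [keyField]: mid } };
}

// Example chunk range (made-up values)
const exampleChunk = { min: { shardKeyField: 100 }, max: { shardKeyField: 200 } };
const splitCmd = buildSplitCommand("mydb.mycollection", exampleChunk, "shardKeyField");
console.log(JSON.stringify(splitCmd));
```

The resulting document could then be passed to `db.adminCommand()`. Note that the midpoint of the key range is not necessarily the median of the data, and a chunk whose documents all share one shard key value cannot be split at all.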

4. Manually move the chunk to a specific shard:

```javascript
db.adminCommand({
  moveChunk: "mydb.mycollection",
  find: { shardKeyField: "value" },
  to: "shard2",
  _secondaryThrottle: true,
  _waitForDelete: true
});
```
5. Check migration logs for specific error causes:

```javascript
// On the mongos
db.adminCommand({ getLog: "global" }).log.filter(function (line) {
  return line.match(/moveChunk|migrate|jumbo/i);
});
```
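6. Adjust balancer settings for the migration window:

```javascript
// Make sure the balancer is enabled
sh.setBalancerState(true);

// Set a wider balancer window (the times below are example values)
db.getSiblingDB("config").settings.updateOne(
  { _id: "balancer" },
  { $set: { activeWindow: { start: "22:00", stop: "06:00" } } },
  { upsert: true }
);
```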
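
Transient failures such as network timeouts often succeed on a later attempt, so a manual `moveChunk` can be wrapped in a small retry loop. The sketch below is an illustration under stated assumptions: `retryCommand` is a hypothetical helper, `runCommand` stands in for something like `function (c) { return db.adminCommand(c); }`, and `sleepFn` stands in for the shell's `sleep()`. It treats both a thrown error and a result without `ok: 1` as a failure, since shells differ in how they report command errors.

```javascript
// Hypothetical helper: runs a command up to `maxAttempts` times,
// waiting longer between attempts (simple exponential backoff).
function retryCommand(runCommand, command, maxAttempts, sleepFn, baseDelayMs = 1000) {
  let lastError = null;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const result = runCommand(command);
      if (result && result.ok === 1) {
        return { ok: true, attempts: attempt, result: result };
      }
      lastError = (result && result.errmsg) || "command did not return ok: 1";
    } catch (e) {
      lastError = e.message;
    }
    // Back off before the next attempt, but not after the last one.
    if (attempt < maxAttempts) {
      sleepFn(baseDelayMs * Math.pow(2, attempt - 1));
    }
  }
  return { ok: false, attempts: maxAttempts, error: lastError };
}

// Demonstration with a stand-in runner that fails twice, then succeeds.
let runnerCalls = 0;
const fakeRunner = function (cmd) {
  runnerCalls += 1;
  if (runnerCalls < 3) throw new Error("simulated network timeout");
  return { ok: 1 };
};
const recordedDelays = [];
const retryOutcome = retryCommand(
  fakeRunner,
  { moveChunk: "mydb.mycollection", find: { shardKeyField: "value" }, to: "shard2" },
  5,
  function (ms) { recordedDelays.push(ms); }
);
console.log(retryOutcome.ok, retryOutcome.attempts, recordedDelays);
```

Retrying only makes sense for transient causes. A jumbo chunk or a full recipient shard will fail the same way on every attempt until the underlying problem is fixed.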

Prevention

- Use an appropriate shard key that distributes data evenly
- Monitor chunk sizes and split proactively before they grow too large
- Restrict the balancer to off-peak hours by setting `activeWindow` on the `balancer` document in `config.settings`
- Ensure all shards have sufficient disk headroom (at least 20% free)
- Monitor chunk distribution with `db.collection.getShardDistribution()`
- Avoid schema designs that create unsplittable chunks (all documents with the same shard key)
- Set `chunkSize` to an appropriate value (the default is 128 MB on MongoDB 6.0+ and 64 MB on earlier versions; a smaller value makes individual migrations faster)
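
An off-peak balancer window that crosses midnight is easy to misread. As a small illustration, assuming window bounds are written as `HH:MM` strings, the hypothetical helper below (not part of MongoDB) checks whether a given time falls inside a window, including the wrap-around case. Treating the start as inclusive and the stop as exclusive is a choice made for this sketch.

```javascript
// Converts "HH:MM" to minutes since midnight.
function toMinutes(hhmm) {
  const parts = hhmm.split(":").map(Number);
  return parts[0] * 60 + parts[1];
}

// Hypothetical helper: true if `now` lies in [start, stop), where the
// window may wrap past midnight (for example 22:00 to 06:00).
function inWindow(now, start, stop) {
  const n = toMinutes(now);
  const s = toMinutes(start);
  const e = toMinutes(stop);
  if (s <= e) {
    return n >= s && n < e;
  }
  // Wrapping window: late evening or early morning.
  return n >= s || n < e;
}

console.log(inWindow("23:30", "22:00", "06:00"));
console.log(inWindow("12:00", "22:00", "06:00"));
console.log(inWindow("03:00", "01:00", "05:00"));
```

A check like this can help confirm that a planned window actually covers the hours when migrations are expected to run.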