## Introduction

MongoDB limits each aggregation pipeline stage to 100MB of RAM. When a `$group` or `$sort` stage processes more data than fits in this limit, the pipeline fails with `MongoServerError: Exceeded memory limit for $group, but didn't allow external sort`. This is a common issue when aggregating large datasets without proper optimization.
## Symptoms

- `MongoServerError: Exceeded memory limit for $group` or `$sort`
- `Sort exceeded memory limit of 104857600 bytes`
- Aggregation works in development with small datasets but fails in production
- `explain()` output shows a stage with high `nReturned` and `executionTimeMillis`
- Pipeline works with `$limit` but fails on the full collection
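Once `allowDiskUse` is enabled (step 1 below), explain output reports a `usedDisk` flag on stages that spilled. As a sketch, you might scan an explain document for spilling stages like this — note that `explainDoc` here is a simplified, hand-written example, and real `explain("executionStats")` output is more deeply nested and varies across server versions:

```javascript
// Minimal sketch: find pipeline stages that spilled to disk in an
// aggregation explain document. NOTE: `explainDoc` is a simplified,
// hand-written example, not real server output.
function stagesThatSpilled(explainDoc) {
  return (explainDoc.stages || [])
    .filter((stage) => stage.usedDisk === true)
    .map((stage) => Object.keys(stage).find((key) => key.startsWith("$")));
}

const explainDoc = {
  stages: [
    { $cursor: { executionStats: {} } },
    { $group: { _id: "$customer_id" }, usedDisk: true },
    { $sort: { total: -1 }, usedDisk: false }
  ]
};

console.log(stagesThatSpilled(explainDoc)); // [ '$group' ]
```

A stage that shows `usedDisk: true` is the one to target with the filtering and indexing fixes below.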
## Common Causes

- `$group` stage producing too many unique groups to fit in 100MB
- `$sort` on a large unindexed collection without `allowDiskUse`
- `$unwind` on arrays with many elements creating a Cartesian product explosion
- Missing `$match` stage early in the pipeline to filter data before expensive stages
- Aggregation on collections with large documents that individually consume significant memory
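The first cause is easy to sanity-check with arithmetic: a `$group` stage must keep one accumulator entry per unique `_id` in memory. A rough back-of-envelope sketch, where both the group count and the per-group byte cost are illustrative assumptions rather than measured values:

```javascript
// Back-of-envelope estimate of $group memory. Both inputs below are
// illustrative assumptions -- measure the real distinct-key count on
// your own collection before trusting the result.
const uniqueGroups = 2_000_000;            // assumed distinct customer_ids
const bytesPerGroup = 112;                 // assumed key + accumulator + overhead
const stageLimitBytes = 100 * 1024 * 1024; // the 100MB per-stage limit

const estimatedBytes = uniqueGroups * bytesPerGroup; // 224,000,000 bytes
console.log(estimatedBytes > stageLimitBytes); // true: expect the error without allowDiskUse
```

If the estimate lands anywhere near the limit, plan on `allowDiskUse` or an earlier `$match` from the start rather than waiting for production data volumes to trip the error.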
## Step-by-Step Fix

1. **Enable disk use for the aggregation:**

   ```javascript
   db.orders.aggregate([
     { $match: { status: "completed" } },
     { $group: { _id: "$customer_id", total: { $sum: "$amount" } } },
     { $sort: { total: -1 } }
   ], {
     allowDiskUse: true // Allows stages to spill to disk
   });
   ```
2. **Add an early `$match` stage to reduce data before expensive stages:**

   ```javascript
   // BAD: groups entire collection then filters
   db.orders.aggregate([
     { $group: { _id: "$customer_id", total: { $sum: "$amount" } } },
     { $match: { total: { $gt: 1000 } } }
   ]);

   // GOOD: filter first, then group
   db.orders.aggregate([
     { $match: { amount: { $gt: 50 }, created_at: { $gte: ISODate("2026-01-01") } } },
     { $group: { _id: "$customer_id", total: { $sum: "$amount" } } },
     { $match: { total: { $gt: 1000 } } }
   ]);
   ```
3. **Use `$project` to reduce document size before grouping:**

   ```javascript
   db.orders.aggregate([
     { $match: { status: "completed" } },
     { $project: { customer_id: 1, amount: 1, _id: 0 } }, // Only needed fields
     { $group: { _id: "$customer_id", total: { $sum: "$amount" } } }
   ]);
   ```
4. **Create an index to support the aggregation:**

   ```javascript
   // Index to support the $match and $sort stages
   db.orders.createIndex({ status: 1, created_at: -1, customer_id: 1, amount: 1 });

   // Check if the index is being used
   db.orders.explain("executionStats").aggregate([
     { $match: { status: "completed" } }
   ]);
   ```
5. **Break large aggregations into smaller batches:**

   ```javascript
   // Process by date ranges instead of the entire collection
   const startDate = new Date("2026-01-01");
   const endDate = new Date("2026-04-01");
   const results = [];

   for (let d = new Date(startDate); d < endDate; d.setMonth(d.getMonth() + 1)) {
     const monthEnd = new Date(d);
     monthEnd.setMonth(monthEnd.getMonth() + 1);

     const monthResult = db.orders.aggregate([
       { $match: { created_at: { $gte: d, $lt: monthEnd } } },
       { $group: { _id: "$customer_id", total: { $sum: "$amount" } } }
     ]).toArray();

     results.push(...monthResult);
   }
   ```
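The date arithmetic in the batching loop can be factored into a plain helper that yields `[start, end)` windows. This part runs outside MongoDB, so it can be unit-tested on its own; `monthWindows` is our hypothetical helper name, not a MongoDB API:

```javascript
// Compute consecutive [start, end) month windows covering a date range.
// Extracted from the batching loop above so the boundary logic can be
// tested without a database. `monthWindows` is a hypothetical helper.
function monthWindows(start, end) {
  const windows = [];
  for (let d = new Date(start); d < end; d.setMonth(d.getMonth() + 1)) {
    const monthEnd = new Date(d);
    monthEnd.setMonth(monthEnd.getMonth() + 1);
    // Clamp the last window so it never extends past the overall end date
    windows.push([new Date(d), monthEnd < end ? monthEnd : new Date(end)]);
  }
  return windows;
}

const windows = monthWindows(new Date("2026-01-01"), new Date("2026-04-01"));
console.log(windows.length); // 3 windows: Jan, Feb, Mar
```

Each window's pair of dates can then be passed to the `$match` stage from the batching step in place of the inline loop, which keeps the MongoDB call sites small and the date logic in one place.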