## Introduction
MongoDB Cursor Not Found errors occur when an application attempts to retrieve more results from a cursor that the server has already cleaned up, producing `CursorNotFound: Cursor id not found` exceptions during iteration. This error commonly affects long-running queries, large result set iterations, change stream consumers, and batch processing jobs. By default, MongoDB cursors time out after 10 minutes (600 seconds) of inactivity: if the application doesn't call `getMore` within this window, the server automatically closes the cursor to free resources.

Common causes include slow processing between batch fetches that exceeds the cursor timeout, network interruptions breaking cursor sessions, mongos router changes during sharded cluster operations, accidentally issued cursor kill commands, application crashes before cursor completion, change stream resume failures after replica set failover, batch sizes large enough that handling a single batch outlasts the timeout, server-side cursor garbage collection, and driver bugs in cursor continuation handling.

The fix requires understanding the cursor lifecycle, implementing proper timeout configuration, optimizing batch sizes, and adding resilience patterns for cursor recovery. This guide provides production-proven troubleshooting for CursorNotFound errors across the Node.js, Python, Java, and Go drivers and various MongoDB deployment patterns (replica sets, sharded clusters, Atlas).
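Drivers surface this condition as server error code 43 (`CursorNotFound`). A minimal helper for recognizing it in error-handling code might look like the sketch below; the helper name is hypothetical (not a driver API), and it assumes a Node.js-driver-style error object carrying `code` and/or `codeName` fields.

```javascript
// Recognize a CursorNotFound error regardless of how the driver wraps it.
// Hypothetical helper: assumes an error object with a numeric `code`
// and/or a string `codeName` property, as the Node.js driver provides.
function isCursorNotFound(err) {
  return err != null && (err.code === 43 || err.codeName === 'CursorNotFound');
}

// Example error shapes:
console.log(isCursorNotFound({ code: 43, codeName: 'CursorNotFound' }));   // true
console.log(isCursorNotFound({ code: 50, codeName: 'MaxTimeMSExpired' })); // false
```

The sections below use this error code (43) when deciding whether a failed iteration is worth retrying.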
## Symptoms
- `MongoServerError: Cursor not found with id: <id>`
- `CursorNotFoundException: cursor id '1234567890' not found`
- Change stream throws "cursor not found" after failover
- Batch iteration works for first N documents then fails
- `getMore` command fails with cursor not found
- Cursor disappears during long-running aggregation pipeline
- Error occurs at consistent time intervals (around 10 minutes)
- Retry of same query succeeds temporarily
- Multiple applications sharing cursor cause conflicts
- Cursor killed during server maintenance or restart
## Common Causes
- Cursor idle timeout (10 minutes default) exceeded between batch fetches
- Processing logic too slow between `getMore` requests
- Network partition between client and MongoDB server
- Mongos router restart during cursor iteration (sharded clusters)
- Replica set failover during cursor lifetime
- Application crashed before cursor exhausted
- `noCursorTimeout` flag not set for long-running cursors
- Cursor explicitly killed via `killCursors` command
- Driver not properly maintaining cursor session
- Change stream not configured with resume tokens
- Server restart or step-down during cursor iteration
- Cursor resource limits exceeded on server
## Step-by-Step Fix
### 1. Diagnose cursor timeout issues
Understand cursor lifecycle:
```javascript
// MongoDB cursor lifecycle:
// 1. Query returns first batch + cursor ID
// 2. Client iterates, calls getMore for additional batches
// 3. If getMore not called within 10 minutes, cursor times out
// 4. Server garbage collects cursor (frees memory)
// 5. Subsequent getMore fails with CursorNotFound

// The idle timeout is controlled by the server-wide cursorTimeoutMillis
// parameter (default 600000 ms); it is not configurable per cursor,
// but an individual cursor can opt out with the noCursorTimeout flag

// Monitor active cursors
db.currentOp({ "active": true, "op": "getmore" })

// Shows active getMore operations:
// {
//   "opid": 12345,
//   "op": "getmore",
//   "ns": "db.collection",
//   "cursorid": 1234567890,
//   "durationMillis": 5000,
//   "client": "192.168.1.1:54321"
// }

// Check for timed-out cursors in logs
// Look for messages like:
//   "cursor 1234567890 not found"
//   "CursorNotFound: cursor id ... not valid or not found"
```
Identify slow cursor iteration:
```javascript
// Profile cursor operations to find slow iteration
db.setProfilingLevel(2);  // Log all operations

// Run your query, then check the profile
db.system.profile.find({ op: "getmore" }).sort({ ts: -1 }).limit(10)

// Output shows getMore timing:
// {
//   "op": "getmore",
//   "ns": "mydb.mycollection",
//   "cursorid": 1234567890,
//   "durationMillis": 15000,  // Time for this getMore
//   "ts": ISODate("2024-01-15T10:00:00Z")
// }

// If durationMillis is high, the server is slow returning batches
// If gaps between getMore calls exceed 10 minutes, the cursor times out

// Calculate time between batches
// If processing each document takes 1 second and batch size is 101:
//   101 documents × 1 second = 101 seconds between getMore calls
//   Well under the 10-minute timeout

// But if batch size is 1000 and processing takes 1 second each:
//   1000 seconds = 16.7 minutes > 10-minute timeout
//   The cursor times out before all batches are fetched
```
### 2. Fix cursor timeout with noCursorTimeout
Use the noCursorTimeout flag:
```javascript
// Node.js (MongoDB Native Driver)
// Option 1: find with noCursorTimeout
const cursor = db.collection('myCollection')
  .find({ status: 'active' })
  .addCursorFlag('noCursorTimeout', true);

try {
  for await (const doc of cursor) {
    // Process document (can take arbitrarily long)
    await processDocument(doc);
  }
} finally {
  // IMPORTANT: always close the cursor when done
  await cursor.close();
}

// Option 2: aggregation with noCursorTimeout
const aggCursor = db.collection('myCollection').aggregate([
  { $match: { status: 'active' } },
  { $lookup: { from: 'other', localField: '_id', foreignField: 'refId', as: 'refs' } }
], {
  allowDiskUse: true,
  batchSize: 100
}).addCursorFlag('noCursorTimeout', true);
```

```python
# Python (PyMongo)
cursor = db.collection.find(
    {"status": "active"},
    no_cursor_timeout=True  # Enable noCursorTimeout
)

try:
    for doc in cursor:
        process_document(doc)
finally:
    cursor.close()  # Always close when done
```

```java
// Java (MongoDB Java Driver)
FindIterable<Document> iterable = collection.find(eq("status", "active"))
    .noCursorTimeout(true);  // Enable noCursorTimeout

try (MongoCursor<Document> cursor = iterable.iterator()) {
    while (cursor.hasNext()) {
        Document doc = cursor.next();
        processDocument(doc);
    }
}
```

```go
// Go (mongo-go-driver)
opts := options.Find().SetNoCursorTimeout(true)
cursor, err := collection.Find(ctx, filter, opts)
if err != nil {
    return err
}
defer cursor.Close(ctx) // Always close when done

for cursor.Next(ctx) {
    var doc bson.M
    if err := cursor.Decode(&doc); err != nil {
        return err
    }
    processDocument(doc)
}
if err := cursor.Err(); err != nil {
    return err
}
```
Important caveats:
```javascript
// WARNING: noCursorTimeout cursors hold server resources (memory)
// until they are exhausted or explicitly closed

// Note: cursors opened inside a driver session are still bound to the
// session's lifetime (30-minute logical session timeout by default),
// even with noCursorTimeout set

// Best practices:
// 1. ALWAYS close cursors in a finally block
// 2. Don't leave cursors open indefinitely
// 3. Monitor for leaked cursors
// 4. Use with batchSize to control memory

// Manually kill leaked cursors if needed
db.runCommand({ killCursors: "myCollection", cursors: [NumberLong("1234567890")] })

// Check open cursor counts (serverStatus)
db.serverStatus().metrics.cursor.open

// Output:
// {
//   "noTimeout": 5,  // noCursorTimeout cursors
//   "pinned": 2,     // Pinned cursors (e.g. in transactions)
//   "total": 150     // Total open cursors
// }

// If noTimeout is high, investigate unclosed cursors
```
### 3. Optimize batch size
Tune batchSize for workload:
```javascript
// batchSize controls how many documents are returned per batch
// Default: the first batch is 101 documents (or 1 MB); subsequent
// batches are limited by the 16 MB message size unless batchSize is set

// Small batchSize: more getMore calls, less client memory
// Large batchSize: fewer getMore calls, more memory, longer per-batch processing

// Node.js - set batchSize
const cursor = db.collection('myCollection')
  .find({})
  .batchSize(50);  // Smaller batches for faster turnaround

// Or for aggregation
const aggCursor = db.collection('myCollection').aggregate([
  { $match: { status: 'active' } }
], { batchSize: 50 });
```

```python
# Python - set batch_size
cursor = db.collection.find(
    {"status": "active"},
    batch_size=50,  # Smaller batches
    no_cursor_timeout=True
)
```

```javascript
// Calculate an optimal batchSize:
// 1. Measure average document size (mongosh)
doc = db.collection.findOne()
avgDocSize = Object.bsonsize(doc)  // ~1KB typical

// 2. Consider available client memory
//    batchSize × avgDocSize ≈ client memory needed per batch

// 3. Consider processing time
//    batchSize × processingTimePerDoc < 600 seconds (10-min timeout)
//    Example: 100 docs × 0.5 seconds = 50 seconds (safe)
//    Example: 1000 docs × 1 second = 1000 seconds (timeout!)

// 4. Adjust based on network RTT
//    High latency: larger batchSize to amortize round trips
//    Low latency: smaller batchSize for better streaming
```
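The timing rule above can be captured in a small helper. This is an illustrative sketch (the function name and the 50% safety margin are assumptions, not a driver API): it returns the largest batchSize whose worst-case processing time stays comfortably under the cursor timeout.

```javascript
// Largest batchSize whose processing time stays under the cursor
// timeout, with a safety margin. Hypothetical helper, not a driver API.
//   perDocSeconds:  measured average processing time per document
//   timeoutSeconds: server cursor idle timeout (default 600)
//   margin:         fraction of the timeout we allow ourselves to use
function maxSafeBatchSize(perDocSeconds, timeoutSeconds = 600, margin = 0.5) {
  const budgetSeconds = timeoutSeconds * margin;
  return Math.max(1, Math.floor(budgetSeconds / perDocSeconds));
}

console.log(maxSafeBatchSize(1));   // 300 — 1 s/doc leaves 300 docs per batch
console.log(maxSafeBatchSize(0.5)); // 600 — faster processing allows larger batches
```

Measure `perDocSeconds` under realistic load; if processing time is highly variable, size the batch against the worst case rather than the average.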
### 4. Handle change stream cursor errors
Change stream resume pattern:
```javascript
// Node.js - change stream with resume
const { MongoClient } = require('mongodb');

async function runChangeStream() {
  const client = new MongoClient('mongodb://localhost:27017');
  await client.connect();

  const collection = client.db('mydb').collection('myCollection');
  let resumeToken = null;

  while (true) {
    try {
      const pipeline = [
        { $match: { operationType: { $in: ['insert', 'update', 'replace'] } } }
      ];
      const options = { fullDocument: 'updateLookup' };
      if (resumeToken) {
        options.resumeAfter = resumeToken;  // Resume from last processed event
      }

      const changeStream = collection.watch(pipeline, options);

      for await (const change of changeStream) {
        resumeToken = change._id;  // Save resume token
        await processChange(change);
      }
    } catch (err) {
      if (err.code === 43) {  // CursorNotFound: resumable
        console.log('Cursor not found, resuming...');
        continue;  // Loop reopens the stream with resumeToken
      }
      console.error('Change stream error:', err);
      await new Promise(resolve => setTimeout(resolve, 5000));  // Backoff, then retry
    }
  }
}
```

```python
# Python - change stream with resume
import time
from pymongo.errors import CursorNotFound, ConnectionFailure

def run_change_stream():
    resume_token = None

    while True:
        try:
            pipeline = [{'$match': {'operationType': {'$in': ['insert', 'update']}}}]
            resume_kwargs = {'resume_after': resume_token} if resume_token else {}

            with collection.watch(pipeline, **resume_kwargs) as stream:
                for change in stream:
                    resume_token = change['_id']
                    process_change(change)

        except (CursorNotFound, ConnectionFailure):
            print("Cursor lost, resuming...")
            continue  # Loop retries with resume_token
        except Exception as e:
            print(f"Fatal error: {e}")
            time.sleep(5)  # Backoff
```
Change stream configuration:
```javascript
// Configure the change stream for resilience
const options = {
  // Resume from a saved token
  resumeAfter: lastKnownResumeToken,

  // Or start at a specific cluster time instead
  // (mutually exclusive with resumeAfter):
  // startAtOperationTime: new Timestamp({ t: Math.floor(Date.now() / 1000), i: 1 }),

  // Full document lookup (reduces need to re-query the collection)
  fullDocument: 'updateLookup',

  // Collation for consistent ordering
  collation: { locale: 'simple' },

  // Max await time (how long each getMore waits for new changes)
  maxAwaitTimeMS: 60000,  // 1 minute

  // Batch size
  batchSize: 100
};

// The change stream cursor can also time out or be killed,
// but the resume-token pattern above handles this transparently
```
### 5. Fix sharded cluster cursor issues
Handle mongos failover:
```javascript
// In sharded clusters, cursors are managed by mongos
// If mongos fails during cursor iteration, the cursor is lost

// Solution 1: retry logic for cursor errors
async function iterateWithRetry(query, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    const cursor = db.collection.find(query).addCursorFlag('noCursorTimeout', true);
    try {
      const results = [];
      for await (const doc of cursor) {
        results.push(doc);
      }
      return results;
    } catch (err) {
      if (err.code === 43 && i < maxRetries - 1) {  // CursorNotFound
        console.log(`Cursor lost, retrying... (${i + 1}/${maxRetries})`);
        continue;  // Retry
      }
      throw err;  // Other error or max retries exceeded
    } finally {
      await cursor.close();
    }
  }
}

// Solution 2: use a consistent read preference so reads target
// stable members across the cursor's lifetime
const cursor = db.collection.find(query, {
  readPreference: 'primaryPreferred',  // Or 'secondaryPreferred'
  noCursorTimeout: true
});

// Solution 3: paginate instead of holding a cursor open
// More resilient for sharded clusters
async function paginateCollection(query, pageSize = 1000) {
  const results = [];
  let lastId = null;

  while (true) {
    const pageQuery = lastId ? { ...query, _id: { $gt: lastId } } : query;

    const page = await db.collection.find(pageQuery)
      .sort({ _id: 1 })
      .limit(pageSize)
      .toArray();

    results.push(...page);

    if (page.length < pageSize) break;  // Last page

    lastId = page[page.length - 1]._id;
  }

  return results;
}
```
### 6. Handle server-side cursor cleanup
Prevent cursor garbage collection:
```javascript
// MongoDB automatically cleans up cursors:
// - Idle cursors time out after 10 minutes
// - Closed connections release their cursors
// - A server restart clears all cursors

// To prevent cleanup during long operations:

// 1. Never let the cursor sit idle for long. There is no "ping" for a
// cursor: the idle timer only resets when a getMore actually runs, and
// calling tryNext() on a timer is unsafe (it consumes a document).
// Instead, drain each batch quickly and do the slow work from memory:
async function processInBatches(cursor, handler) {
  while (await cursor.hasNext()) {
    // Drain everything the driver has buffered client-side
    const batch = [await cursor.next()];
    while (cursor.bufferedCount() > 0) {
      batch.push(await cursor.next());
    }
    // Slow processing now happens between getMore calls; keep
    // batchSize small enough that this stays under 10 minutes
    for (const doc of batch) {
      await handler(doc);
    }
  }
}

// Usage
const cursor = db.collection.find({}).batchSize(100);
try {
  await processInBatches(cursor, processDocument);
} finally {
  await cursor.close();
}

// 2. Use checkpointed queries for idempotent operations
// If the cursor fails, restart from a known point
async function processWithCheckpoint() {
  const checkpoint = await getCheckpoint();  // Last processed _id

  const query = checkpoint ? { _id: { $gt: checkpoint } } : {};

  try {
    const cursor = db.collection.find(query).sort({ _id: 1 });

    for await (const doc of cursor) {
      await processDocument(doc);
      await saveCheckpoint(doc._id);  // Save progress
    }
  } catch (err) {
    if (err.code === 43) {
      // Cursor lost: restart from checkpoint
      console.log('Cursor lost, will resume from checkpoint');
      return processWithCheckpoint();  // Retry
    }
    throw err;
  }
}
```
### 7. Debug cursor issues
Enable cursor debugging:
```javascript
// MongoDB server-side profiling
db.setProfilingLevel(2);  // Log all operations

// Find cursor-related operations
db.system.profile.find({
  op: { $in: ["query", "getmore"] },
  ts: { $gt: new Date(Date.now() - 3600000) }  // Last hour
}).sort({ ts: -1 }).limit(20)

// Check cursor metrics
db.serverStatus().metrics.cursor

// Output:
// {
//   "timedOut": 15,      // Cursors that timed out
//   "open": {
//     "noTimeout": 5,    // noCursorTimeout cursors
//     "pinned": 2,       // Pinned cursors (e.g. in transactions)
//     "total": 150       // Total open cursors
//   }
// }

// If timedOut is high, applications are not closing cursors
// If noTimeout is high, investigate unclosed cursors

// Driver-side debugging (Node.js)
const client = new MongoClient(uri, {
  monitorCommands: true  // Emit command monitoring events
});

client.on('commandStarted', (event) => {
  if (event.commandName === 'getMore') {
    console.log('getMore:', event.command);
  }
});

client.on('commandFailed', (event) => {
  if (event.commandName === 'getMore') {
    console.error('getMore failed:', event.failure);
  }
});
```
## Prevention
- Set `noCursorTimeout` for long-running cursor operations
- Always close cursors in finally blocks
- Use appropriate batchSize to keep getMore frequency high
- Implement checkpoint/resume patterns for batch processing
- Configure change streams with resume tokens
- Monitor server cursor metrics for leaks
- Use pagination instead of cursors for user-facing features
- Document cursor timeout behavior in runbooks
- Test cursor resilience under failover scenarios
- Consider serverless patterns (Atlas Functions) for cursor operations
## Related Errors
- **MongoDB connection timeout**: Network connectivity issues
- **MongoDB max time exceeded**: Query exceeded maxTimeMS
- **MongoDB transaction aborted**: Transaction conflict or timeout
- **MongoDB write concern error**: Replica set write failure
- **Change stream invalidated**: Collection dropped or renamed