Introduction
WiredTiger has been MongoDB's default storage engine since version 3.2. It provides document-level concurrency control, compression, and cache management. WiredTiger errors can affect data integrity, query performance, and server stability. They range from cache pressure warnings to severe corruption that requires data recovery.
Symptoms
WiredTiger errors manifest in various severity levels:
```text
# Cache pressure warnings
WT_CACHE_STUCK: application threads blocked
Cache stuck for threshold milliseconds
WiredTiger record store exceeded cache size

# Checkpoint errors
Checkpoint failed: I/O error
Cannot create checkpoint: disk full
Checkpoint stuck waiting for eviction

# Corruption errors
WiredTiger corruption detected
WT_PANIC: WiredTiger library panic
Error reading WiredTiger metadata

# File handle exhaustion
Too many open files: WiredTiger
Cannot open WiredTiger file

# In logs
{"t":{"$date":"..."},"s":"E","c":"STORAGE","id":22423,"msg":"WiredTiger error","attr":{"error":"WT_PANIC: WiredTiger library panic"}}
{"msg":"Cache stuck","attr":{"duration":60000}}
{"msg":"Checkpoint failed","attr":{"error":"No space left on device"}}
```
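Because the log entries shown above are JSON documents, they can be sorted into the symptom groups mechanically. The following is a minimal sketch, assuming one JSON document per log line; the function name `classifyLogLine` and the category labels are illustrative and not part of MongoDB.

```javascript
// Hypothetical helper: map one structured log line to a symptom category.
// The category names are illustrative labels, not MongoDB terminology.
function classifyLogLine(line) {
  let entry;
  try {
    entry = JSON.parse(line);
  } catch (err) {
    return null; // not a structured (JSON) log line
  }
  if (entry === null || typeof entry !== "object") return null;

  const attr = entry.attr || {};
  const text = [entry.msg, attr.error, attr.message]
    .filter((part) => typeof part === "string")
    .join(" ");

  if (/WT_PANIC|corrupt/i.test(text)) return "corruption";
  if (/cache stuck|WT_CACHE/i.test(text)) return "cache";
  if (/too many open files/i.test(text)) return "file-handles";
  if (/checkpoint/i.test(text)) return "checkpoint";
  return null;
}

const sample =
  '{"s":"E","c":"STORAGE","id":22423,"msg":"WiredTiger error",' +
  '"attr":{"error":"WT_PANIC: WiredTiger library panic"}}';
console.log(classifyLogLine(sample)); // corruption
```

Corruption is tested first so that a panic message that also mentions the cache or a checkpoint is still reported under the most severe category.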
Common Causes
1. Cache size misconfiguration - Cache too large for available RAM
2. Eviction pressure - Dirty pages exceeding eviction threshold
3. Checkpoint I/O failures - Disk issues during checkpoint creation
4. WiredTiger file corruption - Hardware failure or abrupt shutdown
5. File descriptor limits - OS limit on open files too restrictive
6. Journal issues - Journal file corruption or sync failures
7. Compression codec errors - Invalid compression settings
Step-by-Step Fix
Step 1: Diagnose WiredTiger State
Check WiredTiger statistics:
```javascript
// Run in mongosh

// WiredTiger cache statistics
db.serverStatus().wiredTiger.cache

// Key metrics:
// "bytes currently in the cache" vs "maximum bytes configured"
// "tracked dirty bytes in the cache" - should be < 5% of cache
// "pages evicted by application threads" - high values indicate pressure

// Complete WiredTiger stats
db.serverStatus().wiredTiger

// Assertion counters raised since the server started
db.serverStatus().asserts
```
Check WiredTiger connection state:
```javascript
// Connection statistics
db.serverStatus().wiredTiger.connection

// Look for:
// "files currently open" - compare to OS limit
// "memory allocations" - should not be excessive
```
Step 2: Fix Cache Overflow
When a cache stuck message appears:
```javascript
// Current cache metrics
let cache = db.serverStatus().wiredTiger.cache
let percentUsed = cache["bytes currently in the cache"] / cache["maximum bytes configured"]
let percentDirty = cache["tracked dirty bytes in the cache"] / cache["maximum bytes configured"]

print("Cache usage: " + (percentUsed * 100).toFixed(1) + "%")
print("Dirty pages: " + (percentDirty * 100).toFixed(1) + "%")

// Dirty pages > 5% indicates write pressure
// Dirty pages > 20% triggers aggressive eviction
```
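The 5% and 20% thresholds can be turned into a small classifier. This is a sketch only: `cachePressure` and its level names are hypothetical, and the sample numbers are made up for illustration.

```javascript
// Hypothetical helper: classify cache pressure from the "cache" section of
// db.serverStatus().wiredTiger. Defaults follow the 5% / 20% figures above.
function cachePressure(cache, dirtyTargetPct = 5, dirtyTriggerPct = 20) {
  const max = cache["maximum bytes configured"];
  const usedPct = (cache["bytes currently in the cache"] / max) * 100;
  const dirtyPct = (cache["tracked dirty bytes in the cache"] / max) * 100;

  let level = "ok";
  if (dirtyPct >= dirtyTriggerPct) level = "aggressive-eviction";
  else if (dirtyPct >= dirtyTargetPct) level = "write-pressure";

  return { usedPct, dirtyPct, level };
}

// Example with made-up numbers: 4 GiB cache, 3 GiB used, 0.5 GiB dirty.
const GiB = 1024 ** 3;
const report = cachePressure({
  "maximum bytes configured": 4 * GiB,
  "bytes currently in the cache": 3 * GiB,
  "tracked dirty bytes in the cache": 0.5 * GiB,
});
console.log(report.level); // write-pressure
```

In mongosh the same function can be called as `cachePressure(db.serverStatus().wiredTiger.cache)`.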
Adjust cache size:
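```bash
# Edit mongod.conf
sudo nano /etc/mongod.conf
```

```yaml
storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 4  # Reduce if the system is under memory pressure
      # Or leave unset for the default: 50% of (RAM - 1 GB), minimum 256 MB
```

```bash
sudo systemctl restart mongod
```

Adjust eviction thresholds:

```yaml
storage:
  wiredTiger:
    engineConfig:
      # Eviction thresholds are WiredTiger options passed through configString;
      # mongod.conf has no dedicated eviction keys.
      # eviction_dirty_target: dirty percentage at which background eviction starts
      # eviction_dirty_trigger: dirty percentage at which application threads evict
      configString: "eviction_dirty_target=5,eviction_dirty_trigger=20"
```

Step 3: Handle Checkpoint Failures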
Check checkpoint status:
```javascript
// Checkpoint statistics
// (on older releases the checkpoint counters appear under wiredTiger.transaction)
db.serverStatus().wiredTiger.checkpoint

// Look for:
// "checkpoints created" - should increment regularly
// "time spent creating checkpoints" - should not be excessive
```
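Since the checkpoint counter should increment regularly, one way to spot a stalled checkpoint is to compare two samples taken some time apart. The sketch below assumes the caller passes the stat key, because the exact name depends on the server version; `checkpointsAdvanced` is a hypothetical helper.

```javascript
// Hypothetical helper: compare two samples of a monotonically increasing
// checkpoint counter. The stat key is a parameter because its exact name
// depends on the server version.
function checkpointsAdvanced(before, after, key) {
  const a = Number(before[key]);
  const b = Number(after[key]);
  if (!Number.isFinite(a) || !Number.isFinite(b)) {
    throw new Error('stat "' + key + '" missing from one of the samples');
  }
  return b > a;
}

// Example with made-up samples.
const key = "checkpoints created";
console.log(checkpointsAdvanced({ [key]: 41 }, { [key]: 42 }, key)); // true

// In mongosh, take two samples a little over a minute apart
// (checkpoints run every 60 seconds by default):
//   const s1 = db.serverStatus().wiredTiger.checkpoint
//   sleep(70 * 1000)
//   const s2 = db.serverStatus().wiredTiger.checkpoint
//   checkpointsAdvanced(s1, s2, "checkpoints created")
```

Throwing on a missing key avoids reporting a stall when the counter simply lives under a different name.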
If checkpoint fails due to disk:
```bash
# Check disk space
df -h /var/lib/mongodb

# If disk full, see fix-mongodb-disk-full article
```
If checkpoint stuck:
```javascript
// Force checkpoint (can temporarily block operations)
db.adminCommand({ fsync: 1, lock: false })

// Or restart MongoDB to clear checkpoint state
```
Step 4: Fix File Handle Exhaustion
Check file limits:
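```bash
# Open-file limit of the running mongod process
grep "open files" /proc/$(pgrep -x mongod)/limits

# Number of files mongod currently has open
lsof -p $(pgrep -x mongod) | wc -l

# Limit for the current shell (may differ from the mongod process limit)
ulimit -n
```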
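The limit and the open-file count can be combined into a single headroom figure. This is a sketch under the assumption that the text of `/proc/<pid>/limits` and the open-file count have already been collected; `fileHandleHeadroom` is a hypothetical name and the sample text is illustrative.

```javascript
// Hypothetical helper: extract the soft "Max open files" limit from the text
// of /proc/<pid>/limits and compare it with the number of open files.
function fileHandleHeadroom(limitsText, openCount) {
  const match = /^Max open files\s+(\S+)\s+(\S+)/m.exec(limitsText);
  if (!match) throw new Error("no 'Max open files' line found");
  const softLimit = match[1] === "unlimited" ? Infinity : Number(match[1]);
  return { softLimit, openCount, usedPct: (openCount / softLimit) * 100 };
}

// Illustrative sample of the relevant part of /proc/<pid>/limits.
const limitsSample =
  "Limit                     Soft Limit           Hard Limit           Units\n" +
  "Max open files            64000                64000                files\n";
console.log(fileHandleHeadroom(limitsSample, 16000).usedPct); // 25
```

Note that `lsof` output includes a header line and entries that are not file descriptors, so the count it gives is an upper bound rather than an exact figure.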
Increase limits:
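```bash
# Temporary (current shell only)
ulimit -n 65535

# Permanent for login sessions - edit /etc/security/limits.conf
sudo nano /etc/security/limits.conf
```

```text
mongod soft nofile 65535
mongod hard nofile 65535
```

```bash
# mongod.conf has no setting for file descriptor limits.
# When mongod runs as a systemd service, limits.conf is not applied to it;
# set LimitNOFILE=65535 under [Service] in a drop-in instead:
#   sudo systemctl edit mongod

# Restart MongoDB
sudo systemctl restart mongod
```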
Step 5: Handle Corruption
Detect corruption:
```javascript
// Validate a collection (replace "collection" with the collection name)
db.collection.validate()

// Full validation (more thorough, slower)
db.collection.validate({ full: true })

// validate accepts collection names only; it cannot check WiredTiger.wt
// or other WiredTiger metadata files
```
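The result of `validate()` can be reduced to a pass/fail summary so that many collections can be checked in one pass. A minimal sketch, assuming the result carries the `valid`, `errors`, and `warnings` fields; `summarizeValidation` and the collection name `orders` are hypothetical.

```javascript
// Hypothetical helper: reduce a validate() result to a pass/fail summary.
// It relies on the "valid", "errors" and "warnings" fields of the result.
function summarizeValidation(name, result) {
  const errors = result.errors || [];
  const warnings = result.warnings || [];
  return {
    collection: name,
    ok: result.valid === true && errors.length === 0,
    errors,
    warnings,
  };
}

// Example with a made-up result for a made-up collection.
const demo = summarizeValidation("orders", {
  valid: false,
  errors: ["example error text"],
  warnings: [],
});
console.log(demo.ok); // false

// In mongosh, list the collections of the current database that fail:
//   db.getCollectionNames()
//     .map((n) => summarizeValidation(n, db.getCollection(n).validate()))
//     .filter((r) => !r.ok)
```

Validation can be slow on large collections, so running it across a whole database is best kept for a maintenance window.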
If corruption found:
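```bash
# Stop MongoDB immediately
sudo systemctl stop mongod

# Check WiredTiger files
ls -la /var/lib/mongodb/WiredTiger*

# WiredTiger recovery (run as the user that owns the data files)
mongod --repair --dbpath /var/lib/mongodb

# Or use the standalone wt utility. salvage works on a single object, so pass
# its URI; WT_OBJECT_URI is a placeholder for a table: or file: name.
# Older wt builds require a URI for verify as well.
wt -h /var/lib/mongodb verify
wt -h /var/lib/mongodb salvage "$WT_OBJECT_URI"
```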
For severe corruption:
```bash
# Backup existing data
sudo cp -r /var/lib/mongodb /var/lib/mongodb-corrupted-backup

# If replica set member - resync from healthy member
# If standalone - restore from backup

# For replica set secondary
sudo rm -rf /var/lib/mongodb/*
sudo systemctl start mongod  # Will perform initial sync
```
Step 6: Fix Journal Issues
Check journal statistics:
```javascript
// Journal stats
db.serverStatus().wiredTiger.log

// Look for:
// "log bytes written" vs "log bytes of payload data"
// "log sync operations" - sync calls to journal
// "log sync time duration (usecs)" - should not be excessive
```
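Whether sync time is excessive is easier to judge as an average per sync. The sketch below divides the cumulative sync time by the number of sync calls; the key names `log sync operations` and `log sync time duration (usecs)` are assumptions about the stat names and should be checked against the server's actual output, and `avgLogSyncMs` is a hypothetical helper.

```javascript
// Hypothetical helper: average journal sync latency in milliseconds, computed
// from two counters in db.serverStatus().wiredTiger.log. The key names are
// assumptions; check them against your server's output.
function avgLogSyncMs(log) {
  const syncs = Number(log["log sync operations"]);
  const micros = Number(log["log sync time duration (usecs)"]);
  if (!Number.isFinite(syncs) || !Number.isFinite(micros)) {
    throw new Error("expected log sync counters not found");
  }
  return syncs === 0 ? 0 : micros / syncs / 1000;
}

// Example with made-up counters: 200 syncs taking 500,000 microseconds in total.
console.log(
  avgLogSyncMs({
    "log sync operations": 200,
    "log sync time duration (usecs)": 500000,
  })
); // 2.5
```

Both counters are cumulative since startup, so the difference between two samples gives a better picture of current latency than a single reading.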
If journal corrupted:
```bash
# Stop MongoDB
sudo systemctl stop mongod

# Remove journal files (data loss risk!)
sudo rm /var/lib/mongodb/journal/WiredTigerLog.*

# Run repair
mongod --repair --dbpath /var/lib/mongodb

# Restart
sudo systemctl start mongod
```
Step 7: Configure Compression Settings
Check current compression:
```javascript
// Collection statistics show compression
db.collection.stats()
// wiredTiger.creationString lists the block_compressor in use;
// the "compressed pages" counters indicate compression effectiveness

// WiredTiger engine config
db.serverStatus().wiredTiger
```
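A rough compression ratio can also be derived from the collection statistics. This sketch assumes the `size` field (uncompressed data bytes) and the `storageSize` field (bytes allocated on disk) of the stats document; `compressionRatio` is a hypothetical helper and the numbers are made up.

```javascript
// Hypothetical helper: rough compression ratio from collection stats, using
// "size" (uncompressed data bytes) and "storageSize" (bytes allocated on disk).
function compressionRatio(stats) {
  if (!stats.storageSize) return null; // empty collection or missing field
  return stats.size / stats.storageSize;
}

// Example with made-up numbers; in mongosh: compressionRatio(db.collection.stats())
console.log(compressionRatio({ size: 3000000, storageSize: 1000000 })); // 3
```

The figure is approximate, since the allocated storage may include space that has been freed but not yet reused.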
Adjust compression if needed:
```yaml
storage:
  wiredTiger:
    collectionConfig:
      blockCompressor: "snappy"  # Options: none, snappy, zlib, zstd
    indexConfig:
      prefixCompression: true
```

Verification
Verify WiredTiger health:
```javascript
// 1. Cache healthy (usage should be < 95%)
let cache = db.serverStatus().wiredTiger.cache
let used = cache["bytes currently in the cache"] / cache["maximum bytes configured"]
print("Cache usage: " + (used * 100).toFixed(1) + "%")

// 2. Dirty pages low (should be < 20%)
let dirty = cache["tracked dirty bytes in the cache"] / cache["maximum bytes configured"]
print("Dirty pages: " + (dirty * 100).toFixed(1) + "%")

// 3. Assertion counters not growing unexpectedly
db.serverStatus().asserts

// 4. Checkpoints running
db.serverStatus().wiredTiger.checkpoint

// 5. Validate collections
db.collection.validate()
```
System-level verification:
```bash
# No errors in logs (should show no recent errors)
grep -i "WiredTiger error" /var/log/mongodb/mongod.log | tail -20

# File handles sufficient (should be well below limit)
lsof -p $(pgrep mongod) | wc -l

# Disk space adequate
df -h /var/lib/mongodb
```
Common Pitfalls
- Setting cache too large - Leaves insufficient RAM for OS and connections
- Ignoring cache pressure warnings - Can lead to stuck state and timeouts
- Not monitoring dirty pages - High dirty ratio indicates write backlog
- Abrupt shutdown without checkpoint - Recovery may lose recent writes
- Setting file limits too low - WiredTiger needs file handles for tables
Best Practices
- Leave 10-20% RAM free for OS and connections beyond WiredTiger cache
- Monitor cache dirty percentage with alerts at 10% and 20%
- Use snappy compression for balanced performance and space savings
- Set file descriptor limits to 64K or higher
- Ensure clean shutdowns to allow checkpoint completion
- Monitor checkpoint duration for disk performance issues
- Schedule maintenance windows for validate operations
Related Issues
- MongoDB Memory Limit Exceeded
- MongoDB Disk Full
- MongoDB Index Build Failed
- MongoDB Oplog Error