## Introduction

Cassandra tombstones are markers for deleted data. When a single query scans more tombstones than `tombstone_warn_threshold` (default: 1000), Cassandra logs a warning. Above `tombstone_failure_threshold` (default: 100000), it aborts the read with a `TombstoneOverwhelmingException`. High tombstone counts degrade read performance because Cassandra must read every tombstone in the scanned range and reconcile it against live data before it can return results.
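The two thresholds can be sketched as a small shell helper. This is only an illustration of the comparison, not Cassandra's own code: the function name `classify_tombstones` is hypothetical, the defaults mirror the `cassandra.yaml` defaults of 1000 and 100000, and the strict "greater than" comparison is an assumption.

```bash
# Hypothetical helper: classify a per-query tombstone count against the
# warn and failure thresholds (defaults: 1000 and 100000).
classify_tombstones() {
  local count="$1"
  local warn="${2:-1000}"
  local fail="${3:-100000}"
  if [ "$count" -gt "$fail" ]; then
    echo "fail"   # query would be aborted
  elif [ "$count" -gt "$warn" ]; then
    echo "warn"   # query succeeds but a warning is logged
  else
    echo "ok"
  fi
}
```

For example, `classify_tombstones 5000` falls between the default thresholds, so the query would still succeed but log a warning.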
## Symptoms

- `Read 1000 live rows and 100001 tombstone cells` in the Cassandra logs
- `TombstoneOverwhelmingException` or read timeouts on queries
- `nodetool tablestats` shows high `Average tombstones per slice` or `Maximum tombstones per slice` values
- Queries that previously ran fast now time out consistently
- The `tombstone_warn_threshold` or `tombstone_failure_threshold` set in `cassandra.yaml` is exceeded
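To find the worst offenders quickly, the log message listed above can be reduced to its two numbers. The following is a minimal sketch: the function name `extract_tombstone_reads` is hypothetical, and it assumes the log lines contain the exact wording `Read N live rows and M tombstone cells`.

```bash
# Hypothetical helper: read Cassandra log lines on stdin and print
# "<live_rows> <tombstones>" for every tombstone read message.
extract_tombstone_reads() {
  sed -nE 's/.*Read ([0-9]+) live rows and ([0-9]+) tombstone cells.*/\1 \2/p'
}
```

Assuming the log lives at `/var/log/cassandra/system.log` (a common default, but check your installation), `extract_tombstone_reads < /var/log/cassandra/system.log | sort -k2,2nr | head` lists the reads with the most tombstones first.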
## Common Causes

- Frequent DELETE operations whose tombstones have not yet been compacted away
- Repeatedly writing rows with a TTL; each expired cell becomes a tombstone
- Range scans over partitions with many deleted rows
- SizeTieredCompactionStrategy not compacting tombstone-heavy SSTables
- An application pattern of writing and then deleting temporary data
## Step-by-Step Fix

1. **Check tombstone counts per table:**

```bash
nodetool tablestats mykeyspace
# Look for:
#   Read Count, Read Latency
#   Average / Maximum tombstones per slice

# Use sstablemetadata for detailed per-SSTable tombstone info.
# Table directories carry an ID suffix, and the tool expects Data.db files.
sstablemetadata /var/lib/cassandra/data/mykeyspace/mytable-*/*-Data.db
```
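When a table has many SSTables, the `sstablemetadata` output gets long. The sketch below filters it down to the SSTables with the most droppable tombstones. It assumes the output contains lines beginning with `SSTable:` and `Estimated droppable tombstones:`; verify these labels against your Cassandra version. The function name and the 0.2 cutoff are illustrative choices.

```bash
# Hypothetical helper: read sstablemetadata output on stdin and print
# "<ratio> <sstable>" for each SSTable whose estimated droppable
# tombstone ratio is at or above the given minimum (default 0.2).
droppable_tombstones() {
  local min="${1:-0.2}"
  awk -v min="$min" '
    /^SSTable:/ { sstable = $2 }
    /^Estimated droppable tombstones:/ {
      if ($4 + 0 >= min + 0) print $4, sstable
    }
  '
}
```

A typical invocation would pipe the command from step 1 into it, for example `sstablemetadata ... | droppable_tombstones 0.3`.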
2. **Force compaction to clear tombstones:**

```bash
# Major compaction merges all SSTables and removes tombstones
# past gc_grace_seconds
nodetool compact mykeyspace mytable

# Check tombstone count after compaction
nodetool tablestats mykeyspace.mytable
```
3. **Adjust tombstone thresholds temporarily:**

```yaml
# /etc/cassandra/cassandra.yaml
tombstone_warn_threshold: 5000       # Default: 1000
tombstone_failure_threshold: 200000  # Default: 100000
```

4. **Run repairs to ensure tombstone replication:**
```bash
# Run repair on all nodes before reducing gc_grace_seconds
nodetool repair mykeyspace mytable

# If the table lives on a single node, you can reduce gc_grace_seconds.
# ALTER TABLE is CQL, so run it through cqlsh rather than the shell.
cqlsh -e "ALTER TABLE mykeyspace.mytable WITH gc_grace_seconds = 86400;"  # 1 day
```
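The effect of `gc_grace_seconds` is simple arithmetic: a tombstone only becomes eligible for removal once its deletion time plus `gc_grace_seconds` has passed. A minimal sketch, with a hypothetical function name and the default of 864000 seconds (10 days) when no value is given:

```bash
# Hypothetical helper: print the epoch second at which a tombstone
# written at <deleted_at> becomes eligible for removal by compaction.
tombstone_purgeable_at() {
  local deleted_at="$1"
  local gc_grace="${2:-864000}"   # Cassandra default: 10 days
  echo $(( deleted_at + gc_grace ))
}
```

With `gc_grace_seconds = 86400`, a delete issued at epoch 1700000000 can be purged from 1700086400 onward. Eligibility is not removal: the tombstone is only dropped when a compaction actually processes the SSTable that holds it.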
5. **Redesign the query to avoid tombstone-heavy scans:**

```sql
-- BAD: range scan hitting many tombstones
-- SELECT * FROM events WHERE user_id = 123 AND event_date > '2025-01-01';

-- GOOD: narrow the partition key
SELECT * FROM events
WHERE user_id = 123 AND event_bucket = '2026-04' AND event_date > '2026-04-01';

-- Use an IN clause instead of a range scan when possible
SELECT * FROM events WHERE user_id = 123 AND event_id IN ('a', 'b', 'c');
```
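The bucketed query above only works if writers and readers derive `event_bucket` the same way. Assuming a monthly bucket in `YYYY-MM` form taken from an ISO `YYYY-MM-DD` date (the source shows the value `'2026-04'` but does not say how it is computed), the derivation could look like this:

```bash
# Hypothetical helper: derive a monthly bucket (YYYY-MM) from an
# ISO date string (YYYY-MM-DD) by keeping the first seven characters.
event_bucket_for() {
  printf '%s\n' "$1" | cut -c1-7
}
```

Smaller buckets (weekly or daily) keep partitions and their tombstones smaller, at the cost of more partitions to query for long time ranges.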