Introduction

PostgreSQL autovacuum removes dead tuples and prevents transaction ID wraparound. When it falls behind on large tables with high update/delete rates, table bloat accumulates, query performance degrades, and in extreme cases the database refuses new write transactions to prevent XID wraparound.

Symptoms

- `pg_stat_user_tables` shows `n_dead_tup` much higher than `n_live_tup`
- Table bloat causing sequential scans to read many empty pages
- Queries on affected tables slow down gradually over days
- `age(datfrozenxid)` approaching 2 billion on critical databases
- Autovacuum logs show `autovacuum: running` on the same table for hours without completing
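
The last symptom can also be checked from a live session rather than from the logs. The query below is a minimal sketch, assuming PostgreSQL 10 or later, where `pg_stat_activity` exposes the `backend_type` column. It lists the autovacuum workers currently running and how long each has been working.

```sql
-- List autovacuum workers that are running right now, longest-running first.
-- Assumes PostgreSQL 10+ for the backend_type column.
SELECT
    pid,
    now() - xact_start AS running_for,
    query              -- typically "autovacuum: VACUUM schema.table"
FROM pg_stat_activity
WHERE backend_type = 'autovacuum worker'
ORDER BY xact_start;
```

A worker that shows the same table across repeated checks over several hours matches the symptom described above.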

Common Causes

- `autovacuum_vacuum_scale_factor` default (0.2) is too high for large tables: vacuum only triggers once about 20% of the rows are dead
- `autovacuum_max_workers` insufficient for the number of busy tables
- `maintenance_work_mem` too small, causing vacuum to process dead tuples in small batches, each requiring another pass over the indexes
- Long-running transactions preventing vacuum from removing dead tuples
- Autovacuum being throttled by `autovacuum_vacuum_cost_delay`
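
To make the first cause concrete: autovacuum vacuums a table once its dead tuples exceed `autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples`. With the defaults (50 and 0.2), a table of 100 million rows needs about 20 million dead tuples before autovacuum starts. The query below is a rough sketch that estimates this trigger point per table. It assumes the global settings apply and ignores any per-table overrides stored in `reloptions`.

```sql
-- Estimate how many dead tuples each table needs before autovacuum triggers.
-- Assumption: global settings only; per-table reloptions are not considered.
SELECT
    c.oid::regclass AS table_name,
    GREATEST(c.reltuples, 0)::bigint AS estimated_rows,
    s.n_dead_tup,
    (current_setting('autovacuum_vacuum_threshold')::bigint
     + current_setting('autovacuum_vacuum_scale_factor')::float8
       * GREATEST(c.reltuples, 0))::bigint AS vacuum_trigger_point
FROM pg_class c
JOIN pg_stat_user_tables s ON s.relid = c.oid
WHERE c.relkind = 'r'
ORDER BY c.reltuples DESC
LIMIT 20;
```

Tables where `n_dead_tup` stays just below `vacuum_trigger_point` for long periods are good candidates for a lower per-table scale factor.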

Step-by-Step Fix

1. **Identify tables where autovacuum is falling behind**:
   ```sql
   SELECT
       schemaname,
       relname,
       n_live_tup,
       n_dead_tup,
       CASE WHEN n_live_tup > 0
            THEN round(100.0 * n_dead_tup / n_live_tup, 1)
            ELSE 0
       END AS dead_pct,
       last_autovacuum,
       last_vacuum
   FROM pg_stat_user_tables
   WHERE n_dead_tup > 10000
   ORDER BY n_dead_tup DESC
   LIMIT 20;
   ```

2. **Tune per-table autovacuum settings**:
   ```sql
   -- For a high-churn table, make autovacuum more aggressive
   ALTER TABLE orders SET (
       autovacuum_vacuum_scale_factor = 0.01,
       autovacuum_vacuum_threshold = 100,
       autovacuum_analyze_scale_factor = 0.01,
       autovacuum_analyze_threshold = 50
   );
   ```

3. **Increase global autovacuum resources**:
   ```sql
   ALTER SYSTEM SET autovacuum_max_workers = 6;
   ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 1000;
   ALTER SYSTEM SET autovacuum_vacuum_cost_delay = '2ms';
   ALTER SYSTEM SET maintenance_work_mem = '1GB';
   -- Reload applies most of these settings; on many PostgreSQL versions
   -- autovacuum_max_workers only takes effect after a server restart
   SELECT pg_reload_conf();
   ```

4. **Run manual VACUUM on the worst tables**:
   ```sql
   -- To vacuum several tables in parallel, run each statement in its own session
   VACUUM (VERBOSE, ANALYZE) orders;
   VACUUM (VERBOSE, ANALYZE) order_items;

   -- Use VACUUM FULL for extreme bloat (rewrites the table under an exclusive lock)
   -- VACUUM FULL orders;
   ```

5. **Check for long-running transactions blocking vacuum**:
   ```sql
   SELECT
       pid,
       now() - xact_start AS duration,
       state,
       query
   FROM pg_stat_activity
   WHERE state != 'idle'
     AND xact_start < now() - interval '1 hour'
   ORDER BY xact_start;

   -- Terminate if appropriate (replace 12345 with a pid from the query above)
   SELECT pg_terminate_backend(12345);
   ```

6. **Monitor XID wraparound risk**:
   ```sql
   SELECT
       datname,
       age(datfrozenxid) AS xid_age,
       round(100.0 * age(datfrozenxid) / 2000000000, 1) AS wraparound_pct
   FROM pg_database
   ORDER BY age(datfrozenxid) DESC;
   ```
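
Once the steps above have been applied, it can help to confirm that they took effect. The queries below are a sketch rather than a definitive checklist. `orders` is the example table name used in the per-table tuning step, `pending_restart` assumes PostgreSQL 9.5 or later, and `pg_stat_progress_vacuum` assumes PostgreSQL 9.6 or later.

```sql
-- 1. Per-table overrides are stored in pg_class.reloptions.
SELECT oid::regclass AS table_name, reloptions
FROM pg_class
WHERE relname = 'orders';   -- example table name from the tuning step

-- 2. Settings changed with ALTER SYSTEM: pending_restart = true means the
--    new value is not active until the server is restarted.
SELECT name, setting, unit, pending_restart
FROM pg_settings
WHERE name IN ('autovacuum_max_workers',
               'autovacuum_vacuum_cost_limit',
               'autovacuum_vacuum_cost_delay',
               'maintenance_work_mem');

-- 3. Progress of any VACUUM currently running in this database
--    (manual or autovacuum).
SELECT
    p.pid,
    p.relid::regclass AS table_name,
    p.phase,
    p.heap_blks_scanned,
    p.heap_blks_total,
    round(100.0 * p.heap_blks_scanned / NULLIF(p.heap_blks_total, 0), 1) AS scanned_pct
FROM pg_stat_progress_vacuum p
WHERE p.datname = current_database();
```

If `scanned_pct` does not advance between checks, the vacuum is probably being throttled or blocked, which points back to the cost settings or to a long-running transaction.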

Prevention

- Set per-table autovacuum parameters for high-churn tables
- Monitor the `n_dead_tup / n_live_tup` ratio and alert when it exceeds 10%
- Set `statement_timeout` and `idle_in_transaction_session_timeout` so application sessions cannot hold transactions open for long periods
- Run manual VACUUM during maintenance windows for the busiest tables
- Monitor XID age and alert when it exceeds 1 billion
- Use `pg_repack` for online table bloat removal without long-held exclusive locks
- Keep `autovacuum_vacuum_cost_delay` low (2ms or less) on modern SSDs
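
The two monitoring items above can be expressed as queries for a scheduled job or monitoring agent to run. This is a sketch under stated assumptions: the 10% and 1 billion thresholds come from the list above, while the 10000-row floor that skips tiny tables is an assumption borrowed from the identification query in the fix steps.

```sql
-- Tables whose dead-tuple ratio exceeds 10% (alert candidates).
SELECT
    schemaname,
    relname,
    n_live_tup,
    n_dead_tup,
    round(100.0 * n_dead_tup / n_live_tup, 1) AS dead_pct
FROM pg_stat_user_tables
WHERE n_live_tup > 0
  AND n_dead_tup > 10000              -- assumed floor to ignore small tables
  AND n_dead_tup > 0.10 * n_live_tup
ORDER BY dead_pct DESC;

-- Tables in the current database whose XID age exceeds 1 billion.
SELECT
    c.oid::regclass AS table_name,
    age(c.relfrozenxid) AS xid_age
FROM pg_class c
WHERE c.relkind IN ('r', 'm', 't')    -- tables, materialized views, TOAST
  AND age(c.relfrozenxid) > 1000000000
ORDER BY xid_age DESC;
```

Either query returning rows is a reasonable alert condition. The per-table XID query also shows which relation is holding back `datfrozenxid` for the database.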