Database Filesystem Full During WAL/Redo Log Archiving

Introduction

When Write-Ahead Log (WAL) or redo log archiving falls behind, log files accumulate on disk until the filesystem reaches 100% capacity. At this point, the database stops accepting writes, and in severe cases, even reads fail. This is a critical production incident.
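
To make "archiving falls behind" measurable, the following is a minimal sketch for PostgreSQL. It reports how full the filesystem holding the WAL directory is and counts the `*.ready` marker files in `archive_status/`, each of which stands for one segment still waiting to be archived. The function names `fs_used_percent` and `count_ready_segments` are illustrative, not part of any tool, and the WAL directory path is whatever applies to your installation.

```shell
# Sketch only: helper names are hypothetical; pass your own WAL directory.

# Print the integer percentage used on the filesystem that holds directory $1.
fs_used_percent() {
  # df -P prints one POSIX-format line per filesystem; column 5 is capacity, e.g. "87%".
  df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Count WAL segments still waiting to be archived under WAL directory $1.
# PostgreSQL keeps one <segment>.ready marker per pending segment in archive_status/.
count_ready_segments() {
  find "$1/archive_status" -maxdepth 1 -type f -name '*.ready' | wc -l | tr -d ' '
}
```

For example, `fs_used_percent /var/lib/postgresql/16/main/pg_wal` followed by `count_ready_segments /var/lib/postgresql/16/main/pg_wal` gives a quick picture of how close the disk is to full and how large the archive backlog is. A backlog count that keeps growing suggests the archive command is failing or too slow.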

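1. Measure current WAL usage and identify what is retaining it before changing any retention limits:

```sql
-- Check current WAL usage (WAL written since the last checkpoint's redo point;
-- pg_control_checkpoint() exposes redo_lsn, it has no restart_lsn column)
SELECT pg_walfile_name(pg_current_wal_lsn()) AS current_wal_file,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), redo_lsn)) AS used_wal
FROM pg_control_checkpoint();

-- Check oldest required WAL
SELECT slot_name, restart_lsn, active FROM pg_replication_slots;
```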

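2. Free space by moving archived WAL to alternative storage:

```bash
# Move already-archived WAL files to a temp location on a different disk.
# ARCHIVE_DIR is a placeholder (assumption): the local directory archive_command copies into.
# Leave pg_wal/ and pg_wal/archive_status/ alone; the latter holds only .ready/.done markers.
ARCHIVE_DIR=/path/to/local/wal_archive
mkdir -p /mnt/backup/pg_wal_archive
mv "$ARCHIVE_DIR"/* /mnt/backup/pg_wal_archive/
```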
3. Remove inactive replication slots that are holding WAL:

```sql
SELECT slot_name, active, restart_lsn FROM pg_replication_slots;
-- If a slot is inactive and holding WAL:
SELECT pg_drop_replication_slot('orphaned_standby_slot');
```
4. Fix the archive command and restart archiving:

```sql
-- Check archive status
SELECT * FROM pg_stat_archiver;

-- Verify archive_command is correct
SHOW archive_command;

-- Fix and reload
ALTER SYSTEM SET archive_command = 'wal-g wal-push %p';
SELECT pg_reload_conf();
```
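
Before relying on a changed `archive_command`, it can help to see exactly what the server will execute for a given segment. The sketch below expands the placeholders PostgreSQL substitutes in `archive_command`: `%p` (path of the file to archive), `%f` (file name only), and `%%` (a literal `%`). The function name `expand_archive_command` and the segment name in the usage note are illustrative assumptions. The function only prints the expanded command and does not run it.

```shell
# Sketch only: expand_archive_command is a hypothetical helper, not a PostgreSQL tool.
# Usage: expand_archive_command TEMPLATE PATH FILE
# The body runs in a subshell, so its variables do not leak into the caller.
expand_archive_command() (
  template=$1
  path=$2
  file=$3
  out=
  while [ -n "$template" ]; do
    case $template in
      %p*) out=$out$path; template=${template#"%p"} ;;   # %p -> path of the segment
      %f*) out=$out$file; template=${template#"%f"} ;;   # %f -> file name only
      %%*) out=$out%;     template=${template#"%%"} ;;   # %% -> literal percent sign
      *)   rest=${template#?}                            # any other character is copied as-is
           out=$out${template%"$rest"}
           template=$rest ;;
    esac
  done
  printf '%s\n' "$out"
)
```

For instance, `expand_archive_command 'wal-g wal-push %p' pg_wal/000000010000000000000001 000000010000000000000001` prints the command line for that one segment. Running the printed command by hand as the database OS user, from the data directory, is one way to surface credential or path errors that otherwise only show up as a rising `failed_count` in `pg_stat_archiver`.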

On Oracle, force archiving of the current redo log, then purge old archived logs:

```sql
-- Archive current log
ALTER SYSTEM ARCHIVE LOG CURRENT;

-- Delete archived logs older than 2 days using RMAN
-- rman target /
-- DELETE ARCHIVELOG UNTIL TIME 'SYSDATE-2';
```