# PostgreSQL Index Corrupted: Reindex and Recovery
Your queries are returning wrong results, or you're seeing these errors:
ERROR: index "users_pkey" contains unexpected zero page at block 0
ERROR: could not read block 123 in file "base/16384/16543": read only 0 of 8192 bytes
ERROR: duplicate key value violates unique constraint "users_email_key"Index corruption is serious but usually recoverable. Let's diagnose and fix it.
Signs of Index Corruption
Symptoms
- Queries return incorrect results (missing rows or extra rows)
- UNIQUE constraint violations on valid inserts
- ORDER BY returns rows in wrong order
- Queries with WHERE clauses return wrong results
- ERROR messages mentioning index files or blocks
- Database crashes when accessing certain tables
Verify Corruption
```sql -- Install amcheck extension CREATE EXTENSION IF NOT EXISTS amcheck;
-- Check btree indexes for corruption SELECT bt_index_check(c.oid) FROM pg_index i JOIN pg_class c ON c.oid = i.indexrelid WHERE c.relkind = 'i';
-- More thorough check (includes parent table check) SELECT bt_index_parent_check(c.oid) FROM pg_index i JOIN pg_class c ON c.oid = i.indexrelid WHERE c.relkind = 'i';
-- Check specific index SELECT bt_index_check('users_pkey'::regclass); SELECT bt_index_parent_check('users_email_key'::regclass); ```
If these functions return any error, the index is corrupted.
Check All Indexes in a Database
```sql -- List all indexes with their tables SELECT n.nspname AS schema, c.relname AS index_name, t.relname AS table_name, pg_size_pretty(pg_relation_size(c.oid)) AS index_size, i.indisunique AS is_unique, i.indisprimary AS is_primary FROM pg_class c JOIN pg_index i ON c.oid = i.indexrelid JOIN pg_class t ON t.oid = i.indrelid JOIN pg_namespace n ON n.oid = c.relnamespace WHERE c.relkind = 'i' ORDER BY pg_relation_size(c.oid) DESC;
-- Check all btree indexes DO $$ DECLARE r RECORD; has_error BOOLEAN := false; BEGIN FOR r IN SELECT c.oid, c.relname, n.nspname FROM pg_class c JOIN pg_index i ON c.oid = i.indexrelid JOIN pg_namespace n ON n.oid = c.relnamespace WHERE c.relkind = 'i' AND c.relname NOT LIKE 'pg_%' LOOP BEGIN PERFORM bt_index_check(r.oid); RAISE NOTICE 'Index %.%: OK', r.nspname, r.relname; EXCEPTION WHEN OTHERS THEN RAISE WARNING 'Index %.%: CORRUPTED - %', r.nspname, r.relname, SQLERRM; has_error := true; END; END LOOP; IF has_error THEN RAISE WARNING 'One or more indexes are corrupted!'; END IF; END $$; ```
Common Causes of Index Corruption
- 1.Hardware failures - Disk errors, memory errors, power failures
- 2.Operating system crashes - Unclean shutdowns
- 3.PostgreSQL crashes - During write operations
- 4.File system issues - NFS problems, file system corruption
- 5.Bug in PostgreSQL - Rare but possible in certain versions
- 6.Copy-on-write filesystems - Btrfs/ZFS without proper configuration
Fixing Corrupted Indexes
Method 1: REINDEX (Simple)
For non-critical systems or during maintenance windows:
```sql -- Rebuild a specific index REINDEX INDEX users_pkey;
-- Rebuild all indexes on a table REINDEX TABLE users;
-- Rebuild all indexes in a schema REINDEX SCHEMA public;
-- Rebuild all indexes in the database REINDEX DATABASE mydb; ```
Warning: REINDEX locks the table against writes. For production systems, use REINDEX CONCURRENTLY.
Method 2: REINDEX CONCURRENTLY (Production)
For PostgreSQL 12+, rebuild indexes without locking:
```sql -- Rebuild index without blocking writes REINDEX INDEX CONCURRENTLY users_pkey;
-- Rebuild all indexes on a table concurrently REINDEX TABLE CONCURRENTLY users;
-- Rebuild all indexes in a schema concurrently REINDEX SCHEMA CONCURRENTLY public; ```
Note: CONCURRENTLY takes longer but allows reads and writes during rebuild.
Method 3: Manual Index Recreation
If REINDEX fails, recreate indexes manually:
```sql -- 1. Get index definition SELECT indexdef FROM pg_indexes WHERE indexname = 'users_email_key';
-- 2. Create new index with different name CREATE UNIQUE INDEX CONCURRENTLY users_email_key_new ON users(email);
-- 3. Drop old index in a transaction BEGIN; DROP INDEX users_email_key; ALTER INDEX users_email_key_new RENAME TO users_email_key; COMMIT;
-- For primary keys, you need to handle constraints -- 1. Create new unique index CREATE UNIQUE INDEX CONCURRENTLY users_pkey_new ON users(id);
-- 2. Drop constraint and recreate ALTER TABLE users DROP CONSTRAINT users_pkey; ALTER TABLE users ADD CONSTRAINT users_pkey PRIMARY KEY USING INDEX users_pkey_new; ```
Fixing Specific Corruption Types
Corrupted Primary Key Index
```sql -- Check primary key SELECT bt_index_check('users_pkey'::regclass);
-- If corrupted, rebuild REINDEX INDEX CONCURRENTLY users_pkey;
-- If REINDEX fails, recreate manually CREATE UNIQUE INDEX CONCURRENTLY users_pkey_new ON users(id); ALTER TABLE users DROP CONSTRAINT users_pkey; ALTER TABLE users ADD CONSTRAINT users_pkey PRIMARY KEY USING INDEX users_pkey_new; ```
Corrupted Unique Constraint Index
```sql -- Find and fix duplicate issues first SELECT email, count(*) FROM users GROUP BY email HAVING count(*) > 1;
-- Resolve duplicates DELETE FROM users a USING users b WHERE a.id > b.id AND a.email = b.email;
-- Rebuild unique index REINDEX INDEX CONCURRENTLY users_email_key; ```
Corrupted Foreign Key Index
Foreign keys don't require indexes, but a corrupted index used for FK checks:
```sql -- Find the index SELECT indexname FROM pg_indexes WHERE tablename = 'orders' AND indexname LIKE '%user_id%';
-- Rebuild REINDEX INDEX CONCURRENTLY orders_user_id_idx; ```
Corrupted System Catalog Indexes
If system indexes are corrupted:
```bash # Stop PostgreSQL sudo systemctl stop postgresql
# Start in single-user mode postgres --single -D /var/lib/postgresql/16/main postgres
# At the prompt REINDEX DATABASE postgres;
# Exit Ctrl+D
# Start PostgreSQL normally sudo systemctl start postgresql ```
Checking for Data Corruption
Index corruption is often accompanied by or confused with data corruption:
```sql -- Install pageinspect for deep inspection CREATE EXTENSION IF NOT EXISTS pageinspect;
-- Check a table's pages SELECT * FROM page_header(get_raw_page('users', 0));
-- Check for dead tuples CREATE EXTENSION IF NOT EXISTS pgstattuple; SELECT * FROM pgstattuple('users');
-- Check index pages SELECT * FROM page_header(get_raw_page('users_pkey', 0)); ```
If data pages are corrupted:
```sql -- Create new table from what's readable CREATE TABLE users_recovered AS SELECT * FROM users;
-- Check row counts SELECT count(*) FROM users; SELECT count(*) FROM users_recovered;
-- Replace if counts match BEGIN; DROP TABLE users; ALTER TABLE users_recovered RENAME TO users; -- Recreate indexes and constraints COMMIT; ```
Preventing Index Corruption
Hardware and System Configuration
```conf # postgresql.conf
# Ensure data durability fsync = on synchronous_commit = on full_page_writes = on
# WAL checksums wal_log_hints = on
# Data checksums (set at initdb) # Check if enabled: SELECT data_checksums FROM pg_database LIMIT 1; ```
# Enable checksums on existing database (PostgreSQL 12+)
pg_checksums -D /var/lib/postgresql/16/main --enable
# Note: Requires PostgreSQL to be stoppedRegular Maintenance
```sql -- Schedule regular index checks CREATE OR REPLACE FUNCTION check_all_indexes() RETURNS TABLE(index_name text, status text) LANGUAGE plpgsql AS $$ BEGIN RETURN QUERY SELECT c.relname::text, CASE WHEN bt_index_check(c.oid, true) THEN 'OK' ELSE 'CORRUPTED' END FROM pg_class c JOIN pg_namespace n ON c.relnamespace = n.oid WHERE c.relkind = 'i' AND n.nspname NOT IN ('pg_catalog', 'pg_toast'); RETURN; END; $$;
-- Run weekly SELECT * FROM check_all_indexes(); ```
Proper Filesystem Configuration
For copy-on-write filesystems (Btrfs, ZFS):
```bash # Turn off copy-on-write for PostgreSQL data chattr +C /var/lib/postgresql/16/main
# For ZFS, set recordsize zfs set recordsize=8K pool/postgresql
# Ensure proper sync behavior # ZFS: zfs set sync=standard pool/postgresql ```
Verification After Repair
```sql -- Verify all indexes are healthy DO $$ DECLARE r RECORD; BEGIN FOR r IN SELECT c.oid, c.relname FROM pg_class c JOIN pg_namespace n ON c.relnamespace = n.oid WHERE c.relkind = 'i' AND n.nspname NOT LIKE 'pg_%' LOOP PERFORM bt_index_check(r.oid); RAISE NOTICE 'Index %: verified', r.relname; END LOOP; RAISE NOTICE 'All indexes verified successfully'; END $$;
-- Verify table data integrity SELECT schemaname, relname, n_live_tup, n_dead_tup, last_vacuum, last_analyze FROM pg_stat_user_tables ORDER BY n_dead_tup DESC;
-- Run ANALYZE to update statistics ANALYZE VERBOSE; ```
Monitoring for Corruption
Create a monitoring function:
```sql CREATE OR REPLACE FUNCTION monitor_index_health() RETURNS TABLE( schema_name text, index_name text, table_name text, index_size text, status text ) LANGUAGE plpgsql AS $$ BEGIN FOR schema_name, index_name, table_name, index_size IN SELECT n.nspname::text, c.relname::text, t.relname::text, pg_size_pretty(pg_relation_size(c.oid))::text FROM pg_class c JOIN pg_index i ON c.oid = i.indexrelid JOIN pg_class t ON t.oid = i.indrelid JOIN pg_namespace n ON n.oid = c.relnamespace WHERE c.relkind = 'i' AND n.nspname NOT IN ('pg_catalog', 'pg_toast') ORDER BY pg_relation_size(c.oid) DESC LOOP BEGIN PERFORM bt_index_check((schema_name || '.' || index_name)::regclass); status := 'OK'; EXCEPTION WHEN OTHERS THEN status := 'CORRUPTED: ' || SQLERRM; END; RETURN NEXT; END LOOP; RETURN; END; $$;
-- Run check SELECT * FROM monitor_index_health() WHERE status != 'OK'; ```
Quick Reference
| Action | Command |
|---|---|
| Check index | SELECT bt_index_check('index_name'); |
| Check index with heap | SELECT bt_index_parent_check('index_name'); |
| Rebuild index | REINDEX INDEX index_name; |
| Rebuild concurrently | REINDEX INDEX CONCURRENTLY index_name; |
| Rebuild table indexes | REINDEX TABLE table_name; |
| Rebuild schema indexes | REINDEX SCHEMA schema_name; |
| Get index definition | SELECT indexdef FROM pg_indexes WHERE indexname='name'; |
| Create index concurrently | CREATE INDEX CONCURRENTLY name ON table(col); |
| Check checksums | SELECT data_checksums FROM pg_database LIMIT 1; |