Introduction

A database backup can fail to restore when the backup file is incomplete or the disk reports an error. This guide walks through diagnosing and resolving the failure step by step.

Symptoms

Typical error output:

```text
Error: Database operation failed
Check database error logs for details
Verify connection and query syntax
```

Common Causes

  1. Lock conflict or deadlock between concurrent transactions
  2. Resource exhaustion (connections, disk, memory)
  3. Replication or failover configuration issue
  4. Data corruption or constraint violation
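
Disk exhaustion, one of the causes listed above, is quick to rule out before going further. The following is a minimal sketch, assuming a POSIX `df` and `awk` are available; `has_free_space` is a hypothetical helper name, and the data directory path in the example is an assumption that depends on your installation.

```bash
#!/usr/bin/env bash
# Sketch: report whether a directory has at least the required free space.
# has_free_space is a hypothetical helper, not part of PostgreSQL.

# Usage: has_free_space <directory> <required_kb>
has_free_space() {
    local dir=$1 required_kb=$2 available_kb
    # df -P prints POSIX-format output; -k reports 1024-byte blocks.
    # Column 4 of the second line is the available space.
    available_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ -n "$available_kb" ] && [ "$available_kb" -ge "$required_kb" ]
}

# Example (assumed data directory path; adjust for your install):
# has_free_space /var/lib/postgresql 1048576 || echo "less than 1 GiB free"
```

The function returns a non-zero status when the directory cannot be inspected, so a missing path is treated the same as a full disk.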

Step-by-Step Fix

Step 1: Check Current State

```bash
# Check database status
systemctl status postgresql
# View recent error log entries (the log path varies by distribution)
tail -n 50 /var/log/postgresql/postgresql.log
# Count current sessions
psql -c "SELECT count(*) FROM pg_stat_activity;"
```
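
Since this guide concerns restores that fail on an incomplete backup file, it may also be worth checking the dump itself at this stage. The sketch below assumes a plain-format (SQL text) dump produced by `pg_dump`, which ends with a "PostgreSQL database dump complete" comment; `dump_is_complete` is a hypothetical helper name, and the check does not apply to custom-format or directory-format archives.

```bash
#!/usr/bin/env bash
# Sketch: detect a truncated plain-format pg_dump file.
# dump_is_complete is a hypothetical helper name.

# Usage: dump_is_complete <plain-format-dump.sql>
dump_is_complete() {
    local dump_file=$1
    # A missing or empty file cannot be a complete dump.
    [ -s "$dump_file" ] || return 1
    # pg_dump writes this comment near the end of a plain-format dump.
    tail -n 20 "$dump_file" | grep -q 'PostgreSQL database dump complete'
}

# Example (the file name is a placeholder):
# dump_is_complete backup.sql || echo "dump looks truncated"
```

A passing check only shows that the dump was written to the end; it does not prove that the contents are free of corruption.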

Step 2: Identify Root Cause

```sql
-- Run these statements in psql.
-- Check active queries
SELECT * FROM pg_stat_activity WHERE state = 'active';
-- View locks
SELECT * FROM pg_locks;
-- Check replication status
SELECT * FROM pg_stat_replication;
```

Step 3: Apply Primary Fix

```sql
-- Primary fix: check and resolve the issue
-- View blocking queries
SELECT * FROM pg_stat_activity WHERE state = 'active';

-- Kill the blocking query if needed ('problem_query' is a placeholder)
SELECT pg_terminate_backend(pid) FROM pg_stat_activity
WHERE query = 'problem_query' AND pid <> pg_backend_pid();

-- Check for locks that are still waiting
SELECT * FROM pg_locks WHERE NOT granted;
```
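
The queries in this step show active sessions and ungranted locks separately, which leaves the reader to work out who is blocking whom. One possible way to pair them up is the built-in `pg_blocking_pids()` function (available since PostgreSQL 9.6). This is a sketch rather than a definitive query: `blocking_sessions_sql` is a hypothetical helper that only prints the SQL, so it can be reviewed before being piped to `psql`.

```bash
#!/usr/bin/env bash
# Sketch: print a query that pairs each blocked session with its blocker.
# blocking_sessions_sql is a hypothetical helper; it runs nothing itself.
blocking_sessions_sql() {
    cat <<'SQL'
SELECT blocked.pid    AS blocked_pid,
       blocked.query  AS blocked_query,
       blocking.pid   AS blocking_pid,
       blocking.query AS blocking_query
FROM pg_stat_activity AS blocked
JOIN pg_stat_activity AS blocking
  ON blocking.pid = ANY (pg_blocking_pids(blocked.pid))
ORDER BY blocked.pid;
SQL
}

# Example: review the output, then run it.
# blocking_sessions_sql | psql
```

The `blocking_pid` column gives a specific target for `pg_terminate_backend`, which is usually safer than matching on query text.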

Step 4: Apply Alternative Fix

```bash
# Alternative: check logs and restart
# View detailed error log
tail -n 100 /var/log/postgresql/postgresql.log

# Check database size
psql -c "SELECT pg_size_pretty(pg_database_size('mydb'));"

# Restart database if needed (requires root privileges)
systemctl restart postgresql
```

Step 5: Verify the Fix

```bash
psql -c "SELECT version();"
# Should return PostgreSQL version
psql -c "SELECT count(*) FROM pg_stat_activity;"
# Check connection count
```
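
After a restart, the server may need a moment before it accepts connections, so the verification commands can fail if run immediately. A minimal sketch of a retry loop follows; `retry` is a hypothetical helper name, and the example assumes the standard `pg_isready` client utility is installed and pointed at the default host and port.

```bash
#!/usr/bin/env bash
# Sketch: run a command until it succeeds or the attempts run out.
# retry is a hypothetical helper name.

# Usage: retry <attempts> <delay_seconds> <command> [args...]
retry() {
    local attempts=$1 delay=$2 i
    shift 2
    for ((i = 1; i <= attempts; i++)); do
        "$@" && return 0
        # Sleep between attempts, but not after the last one.
        if ((i < attempts)); then sleep "$delay"; fi
    done
    return 1
}

# Example: wait up to about 30 seconds for the server to accept connections.
# retry 30 1 pg_isready -q && psql -c "SELECT version();"
```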

Common Pitfalls

  • Not monitoring lock wait times
  • Using long-running transactions
  • Ignoring replication lag
  • Not backing up before schema changes

Best Practices

  • Monitor database metrics continuously
  • Set appropriate lock timeouts
  • Test backups and restores regularly
  • Keep planner statistics up to date (run ANALYZE or rely on autovacuum)
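
As an illustration of the lock timeout practice above, the `lock_timeout` setting can be applied to a single client invocation through the `PGOPTIONS` environment variable, which libpq clients such as `psql` read at connection time. This is a sketch under stated assumptions: `with_lock_timeout` is a hypothetical wrapper name, and the database and table names in the example are placeholders.

```bash
#!/usr/bin/env bash
# Sketch: run one command with a PostgreSQL lock_timeout applied.
# with_lock_timeout is a hypothetical wrapper name.

# Usage: with_lock_timeout <timeout> <command> [args...]
with_lock_timeout() {
    local timeout=$1
    shift
    # The -c option passes a server setting for this connection only.
    # Any options already present in PGOPTIONS are preserved.
    PGOPTIONS="${PGOPTIONS:+$PGOPTIONS }-c lock_timeout=$timeout" "$@"
}

# Example (placeholder database and table names):
# with_lock_timeout 5s psql -d mydb -c "ALTER TABLE orders ADD COLUMN note text;"
```

Scoping the timeout to one command keeps a schema change from waiting indefinitely on a lock without altering the server-wide default.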