Fix Database Connection Refused After Failover - Site Down

Introduction

After a database failover (automatic or manual), applications may keep trying to connect to the old primary, which is now down or read-only. Users then see database connection refused errors, and the site can go down completely. This is especially common with connection pooling libraries that cache connection details and do not automatically reconnect to the new primary.

Symptoms

Application returns 500 Internal Server Error to all requests
Application logs show "Connection refused" or "could not connect to server"
Database failover completed successfully, but the application is still targeting the old primary
Connection pool shows all connections as broken or stale
Direct connection to the new primary database works but the application fails
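
To tell a refused connection apart from a working one, it can help to probe the database port on both the old and the new host from an application server. The sketch below uses only the Python standard library. The function name `can_connect` is an illustrative assumption and not part of any particular setup.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the hostname and attempts a TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures
        return False
```

For example, `can_connect("new-primary-host", 5432)` returning True while the same call for the old host returns False suggests the database itself is healthy and the application is simply pointed at the wrong endpoint.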

Common Causes

Application connection string points to the old primary IP/hostname
DNS record for database endpoint not updated after failover
Connection pool holding stale connections to the old primary
Read replica promoted to primary but application still writing to old endpoint
Connection retry logic not configured or retry count too low
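
The first two causes can be checked from an application server by looking at which host the connection string names and which addresses that host currently resolves to. This is a minimal sketch using the standard library. The helper names `database_host` and `resolved_addresses` are hypothetical, and the `resolver` parameter exists only so the lookup can be swapped out in tests.

```python
import socket
from urllib.parse import urlsplit


def database_host(url: str) -> str:
    """Extract the hostname from a connection URL such as DATABASE_URL."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError("connection URL has no hostname")
    return host


def resolved_addresses(host: str, port: int = 5432,
                       resolver=socket.getaddrinfo) -> set:
    """Return the set of IP addresses the host currently resolves to."""
    # Each getaddrinfo entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return {info[4][0] for info in resolver(host, port, proto=socket.IPPROTO_TCP)}
```

If the address of the new primary is not in `resolved_addresses(database_host(url))`, the DNS record or a cache in front of it is probably still serving the old primary.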

Step-by-Step Fix

1. Verify the new primary database is accepting connections:
```bash
# PostgreSQL
psql -h new-primary-host -U appuser -d appdb -c "SELECT 1"
# MySQL
mysql -h new-primary-host -u appuser -p appdb -e "SELECT 1"
```
2. Update the application connection string:
```bash
# Update environment variable or config file
export DATABASE_URL="postgresql://appuser:password@new-primary-host:5432/appdb"
# Or update the config file
sudo nano /etc/myapp/config.yml
```
3. Restart the application to clear the connection pool:
```bash
sudo systemctl restart myapp
# Or for containerized apps:
docker restart myapp-container
```
4. If using a connection pooler (PgBouncer, ProxySQL), restart it:
```bash
sudo systemctl restart pgbouncer
# Verify it is connecting to the new primary
psql -h localhost -p 6432 -U appuser -d appdb -c "SELECT inet_server_addr()"
```
5. Update DNS if the application connects via hostname:
```bash
# Update the database hostname to point to the new primary IP
# Then flush DNS caches on application servers
sudo resolvectl flush-caches
# On older systemd releases: sudo systemd-resolve --flush-caches
```
6. Implement automatic reconnection in the application:
```python
# Python SQLAlchemy: detect stale pooled connections automatically
import os

from sqlalchemy import create_engine

DATABASE_URL = os.environ["DATABASE_URL"]

# pool_pre_ping sends a test query on each checkout and replaces the
# connection if the check fails
engine = create_engine(DATABASE_URL, pool_pre_ping=True)
```
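
A successful `SELECT 1` only shows that the server accepts connections; a read-only standby would pass that check too. As a hedged sketch for PostgreSQL, the helper below asks the server whether it is still in recovery. The name `is_writable_primary` is an illustrative assumption, and `conn` can be any DB-API 2.0 connection, for example one opened with `psycopg2.connect`.

```python
def is_writable_primary(conn) -> bool:
    """Return True if the PostgreSQL server behind conn is not in recovery.

    conn is assumed to be a DB-API 2.0 connection to a PostgreSQL server.
    """
    cur = conn.cursor()
    try:
        # pg_is_in_recovery() is true on a standby and false on a primary
        cur.execute("SELECT pg_is_in_recovery()")
        (in_recovery,) = cur.fetchone()
    finally:
        cur.close()
    return not in_recovery
```

Running this against the host named in the connection string before restarting the application helps confirm that the new endpoint can accept writes, not just reads.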

Prevention

Put a connection pooler or proxy (PgBouncer, ProxySQL) in front of the database so that a failover only requires repointing the pooler, not every application instance
Configure pool_pre_ping or equivalent to detect stale connections
Use database hostnames (not IPs) in connection strings, with a short DNS TTL, so that a failover needs only a DNS change
Implement automatic connection retry with exponential backoff
Test database failover procedures regularly in staging environments
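
Retry with exponential backoff can be sketched in a few lines of standard-library Python. This is one possible shape under stated assumptions, not a definitive implementation: the name `retry_with_backoff` and its parameters are hypothetical, and `operation` is any zero-argument callable that opens a connection or runs a query.

```python
import random
import time


def retry_with_backoff(operation, retry_on=(ConnectionError,), attempts=5,
                       base_delay=0.5, max_delay=30.0, sleep=time.sleep):
    """Call operation() until it succeeds or attempts are exhausted.

    The delay doubles after each failure (capped at max_delay), with random
    jitter so that many clients do not retry in lockstep.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
```

With SQLAlchemy, passing `retry_on=(OperationalError,)` (from `sqlalchemy.exc`) would cover dropped or refused connections. Retrying is safest for reads and idempotent writes, since a statement that failed mid-flight may or may not have been applied.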
