What's Actually Happening

A Ceph distributed storage cluster is reporting HEALTH_WARN or HEALTH_ERR status. Depending on the cause, storage operations may be degraded or fail outright.

The Error You'll See

```bash
$ ceph health detail
HEALTH_WARN 1 OSDs are down; 1 OSDs are out; Degraded data redundancy
```

OSD down:

```bash
HEALTH_ERR 3 OSDs down; 3 OSDs out
```

Monitor quorum lost:

```bash
HEALTH_ERR 1 mons down, quorum mon1,mon2
```

PG error:

```bash
HEALTH_WARN 2 pg incomplete; 1 pg stuck
```

Why This Happens

  1. OSD failures - Object storage daemons down or out
  2. Monitor issues - Monitor quorum problems
  3. Network problems - Nodes cannot communicate
  4. Disk failures - OSD storage disks failed
  5. Configuration errors - Wrong cluster settings
  6. PG issues - Placement groups degraded or stuck
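
The health summary string itself usually names the failing subsystem, so a rough first triage can be scripted. A minimal sketch, assuming typical `ceph health detail` wording; the `triage` helper and its category names are hypothetical, not part of Ceph:

```bash
#!/bin/bash
# Hypothetical triage helper: map a `ceph health detail` summary line to
# one of the failure classes above. Pattern order matters: monitor quorum
# problems are checked before OSD and PG states.
triage() {
  case "$1" in
    *"mons down"*)                      echo "monitor" ;;
    *"OSDs down"*|*"OSDs are down"*)    echo "osd" ;;
    *" pg "*)                           echo "pg" ;;
    *)                                  echo "unknown" ;;
  esac
}

triage "HEALTH_ERR 1 mons down, quorum mon1,mon2"    # monitor
triage "HEALTH_WARN 2 pg incomplete; 1 pg stuck"     # pg
```

In practice you would feed it `$(ceph health detail)` and jump to the matching step below.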

Step 1: Check Ceph Cluster Status

```bash
# Check cluster health:
ceph health
ceph health detail

# Detailed status:
ceph status
ceph -s

# Watch cluster events live:
ceph -w

# Check OSD tree:
ceph osd tree

# Check OSD status:
ceph osd stat

# Check OSD dump:
ceph osd dump

# Check monitor status:
ceph mon stat

# Check monitor dump:
ceph mon dump

# Check PG status:
ceph pg stat

# Check PG dump:
ceph pg dump

# Check cluster config:
ceph config dump

# Check service logs (quote the glob so the shell doesn't expand it):
journalctl -u 'ceph-osd@*' -f
journalctl -u 'ceph-mon@*' -f
```

Step 2: Fix OSD Down Issues

```bash
# List down OSDs:
ceph osd tree down

# Locate a specific OSD:
ceph osd find 0

# Check OSD service:
systemctl status ceph-osd@0

# Start OSD:
systemctl start ceph-osd@0

# Check OSD process:
ps aux | grep ceph-osd

# Check OSD log:
tail -f /var/log/ceph/ceph-osd.0.log

# List daemon crash reports:
ceph crash ls

# Archive crashes:
ceph crash archive-all

# Check OSD weight:
ceph osd tree | grep osd.0

# Check which OSDs are out:
ceph osd tree | grep out

# Mark OSD in:
ceph osd in osd.0

# Mark OSD out (data rebalances away from it):
ceph osd out osd.0

# Mark OSD down:
ceph osd down osd.0

# Destroy OSD (keeps the ID reusable):
ceph osd destroy osd.0 --yes-i-really-mean-it

# Purge OSD (removes it from the cluster entirely):
ceph osd purge osd.0 --yes-i-really-mean-it

# Check OSD disk:
lsblk
fdisk -l

# Check OSD data:
ceph-volume lvm list

# Recreate OSD:
ceph-volume lvm create --bluestore --data /dev/sdb
```

Step 3: Fix Monitor Quorum Issues

```bash
# Check monitor status:
ceph mon stat

# List monitors:
ceph mon dump

# Check quorum:
ceph quorum_status

# Check a specific monitor:
systemctl status ceph-mon@mon1

# Start monitor:
systemctl start ceph-mon@mon1

# Check monitor log:
tail -f /var/log/ceph/ceph-mon.mon1.log

# Check monmap:
ceph mon getmap -o /tmp/monmap
monmaptool --print /tmp/monmap

# Add monitor:
ceph mon add mon3 192.168.1.3:6789

# Remove monitor:
ceph mon remove mon3

# Check monitor rank:
ceph mon dump | grep rank

# Check which monitors are in quorum:
ceph quorum_status | grep quorum_names

# If quorum is lost, recover from a surviving monitor.
# Stop the monitors, then extract the monmap from a good one:
ceph-mon -i mon1 --extract-monmap /tmp/monmap

# Inject it into the broken monitor:
ceph-mon -i mon2 --inject-monmap /tmp/monmap

# Restart the monitor afterwards:
systemctl start ceph-mon@mon2

# Check monitor store:
ls -la /var/lib/ceph/mon/ceph-mon1/
```
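
Quorum requires a strict majority: floor(N/2) + 1 monitors, which is why a 3-mon cluster tolerates one failure and a 5-mon cluster two. The arithmetic, as a quick sketch (the `quorum_needed` helper is illustrative, not a Ceph command):

```bash
#!/bin/bash
# Majority quorum for N monitors: floor(N/2) + 1.
# Tolerated failures: N minus that majority.
quorum_needed() { echo $(( $1 / 2 + 1 )); }

for n in 1 3 5; do
  echo "mons=$n need=$(quorum_needed $n) tolerate=$(( n - (n / 2 + 1) ))"
done
```

This is also why even monitor counts buy nothing: 4 monitors need 3 for quorum, the same fault tolerance as 3.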

Step 4: Check Network Connectivity

```bash
# Test connectivity between nodes:
ping ceph-node1
ping ceph-node2

# Check Ceph ports:
nc -zv ceph-node1 6789   # Monitor
nc -zv ceph-node1 6800   # OSD (OSDs use ports 6800-7300)

# Check firewall rules:
iptables -L -n

# Allow Ceph ports:
iptables -I INPUT -p tcp --dport 6789 -j ACCEPT
iptables -I INPUT -p tcp --dport 6800:7300 -j ACCEPT

# Using ufw:
ufw allow 6789/tcp
ufw allow 6800:7300/tcp

# Check network interfaces:
ip addr show

# Check public and cluster networks:
ceph config get mon public_network
ceph config get mon cluster_network

# Fix network config:
ceph config set mon public_network 192.168.1.0/24
ceph config set mon cluster_network 192.168.2.0/24

# Check monitor addresses:
ceph config get mon mon_host

# Check for a network partition (hosts missing from the tree):
ceph osd tree | grep host

# Test from each node:
for node in node1 node2 node3; do
  echo "Testing $node:"
  ping -c 2 $node
done

# Check MTU:
ip link show | grep mtu

# Check latency:
ping -c 10 ceph-node1 | grep rtt
```
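
MTU mismatches between nodes are a classic cause of flapping OSDs: small heartbeat packets pass while large replication frames are dropped. A sketch for comparing MTUs across saved `ip link show` output from two nodes; the `mtu_of` helper is illustrative, not a Ceph tool:

```bash
#!/bin/bash
# Illustrative helper: extract the MTU of a named interface from
# `ip link show` style output (read on stdin), then compare two nodes.
mtu_of() {
  awk -v ifc="$1" '$2 == (ifc ":") {
    for (i = 1; i <= NF; i++) if ($i == "mtu") print $(i + 1)
  }'
}

# Sample output captured from two nodes (hypothetical values):
node1_links="2: eth0: <BROADCAST,MULTICAST,UP> mtu 9000 qdisc mq state UP"
node2_links="2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state UP"

a=$(echo "$node1_links" | mtu_of eth0)
b=$(echo "$node2_links" | mtu_of eth0)
[ "$a" != "$b" ] && echo "MTU mismatch on eth0: $a vs $b"
```

If jumbo frames (MTU 9000) are enabled anywhere, they must be enabled end to end, including switches.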

Step 5: Check OSD Disk Issues

```bash
# Check disk status:
lsblk
fdisk -l

# Check for disk errors:
dmesg | grep -i error
dmesg | grep -i sdb

# Check disk health:
smartctl -a /dev/sdb

# Check I/O statistics:
iostat -x

# Check disk space:
df -h /var/lib/ceph/osd/ceph-0

# Check OSD data directory:
ls -la /var/lib/ceph/osd/ceph-0/

# Check BlueStore label:
ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-0

# Check OSD metadata:
ceph osd metadata osd.0

# Check OSD devices:
ceph-volume lvm list

# Check for disk corruption:
# run fsck on the underlying device (only while the OSD is stopped)

# Check journal (legacy FileStore only):
ceph-osd --osd-data /var/lib/ceph/osd/ceph-0 --check-journal

# Recover OSD if the disk was replaced:
ceph-volume lvm zap /dev/sdb
ceph-volume lvm create --bluestore --data /dev/sdb --osd-id 0

# Check for slow OSDs:
ceph osd perf

# Check per-OSD utilization:
ceph osd df tree
```

Step 6: Fix Placement Group Issues

```bash
# Check PG status:
ceph pg stat

# List problematic PGs:
ceph pg dump_stuck inactive
ceph pg dump_stuck undersized
ceph pg dump_stuck stale
ceph pg dump_stuck degraded

# Query a specific PG:
ceph pg 1.0 query

# Check PG mapping:
ceph pg map 1.0

# List incomplete PGs:
ceph pg ls incomplete

# Force PG recovery:
ceph pg force-recovery 1.0

# Force PG backfill:
ceph pg force-backfill 1.0

# Give up on unfound objects (last resort, may lose data):
ceph pg 1.0 mark_unfound_lost revert

# Check PG autoscaler:
ceph osd pool autoscale-status

# Adjust PG count:
ceph osd pool set pool1 pg_num 256

# Check PG autoscale mode:
ceph osd pool get pool1 pg_autoscale_mode

# Enable autoscale:
ceph osd pool set pool1 pg_autoscale_mode on

# Check PG history:
ceph pg 1.0 query | grep history

# Cancel forced recovery/backfill:
ceph pg cancel-force-recovery 1.0
ceph pg cancel-force-backfill 1.0
```
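
When dozens of PGs are stuck, a per-state tally makes it obvious whether recovery is progressing between checks. A sketch that counts states from saved `ceph pg dump_stuck` output; the `count_states` helper and the column layout (pgid first, state second) are assumptions based on typical output:

```bash
#!/bin/bash
# Illustrative: tally stuck PGs per state from saved `ceph pg dump_stuck`
# output. Assumes a header row, then the state in the second column.
count_states() {
  awk 'NR > 1 { n[$2]++ } END { for (s in n) print s, n[s] }' "$1" | sort
}

# Hypothetical sample capture:
cat > /tmp/stuck.txt <<'EOF'
PG_STAT STATE UP UP_PRIMARY
1.0 undersized+degraded [0,1] 0
1.3 undersized+degraded [2,1] 2
2.1 stale [0] 0
EOF

count_states /tmp/stuck.txt
```

Run it before and after a `force-recovery`; a shrinking tally means the cluster is healing, a static one means the PGs are genuinely stuck.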

Step 7: Check Pool Configuration

```bash
# List pools:
ceph osd pool ls
ceph osd pool ls detail

# Check pool stats:
ceph osd pool stats

# Check all pool settings:
ceph osd pool get pool1 all

# Check pool size (replication):
ceph osd pool get pool1 size

# Set pool size:
ceph osd pool set pool1 size 3

# Check min_size:
ceph osd pool get pool1 min_size

# Set min_size:
ceph osd pool set pool1 min_size 2

# Check crush rule:
ceph osd pool get pool1 crush_rule

# Check crush map:
ceph osd crush dump

# Check crush tree:
ceph osd crush tree

# Check device classes:
ceph osd crush class ls

# List crush rules:
ceph osd crush rule ls

# Check bucket weights:
ceph osd tree

# Check per-pool usage:
ceph df detail

# Check pool quota:
ceph osd pool get-quota pool1

# Set quota:
ceph osd pool set-quota pool1 max_objects 10000
```
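
The relationship size >= min_size >= 1 determines availability: writes pause once fewer than min_size replicas are up, and min_size equal to size means any single OSD failure blocks I/O. A hedged sketch of that sanity check (the `check_pool_sizes` helper is hypothetical, not a Ceph command; feed it the values from the `ceph osd pool get` calls above):

```bash
#!/bin/bash
# Illustrative sanity check for replication settings. With size=3 and
# min_size=2, a pool keeps serving I/O through one replica failure.
check_pool_sizes() {
  local size=$1 min_size=$2
  if [ "$min_size" -gt "$size" ]; then
    echo "invalid: min_size > size"
  elif [ "$min_size" -eq "$size" ]; then
    echo "fragile: any single OSD failure blocks I/O"
  elif [ "$min_size" -lt 2 ]; then
    echo "risky: min_size 1 can acknowledge writes on one copy"
  else
    echo "ok"
  fi
}

check_pool_sizes 3 2   # ok
check_pool_sizes 3 3   # fragile
```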

Step 8: Check Cluster Configuration

```bash
# Check global config:
ceph config dump

# Check monitor config:
ceph config get mon

# Check OSD config:
ceph config get osd

# Check a specific setting:
ceph config get osd osd_max_scrubs

# Set configuration:
ceph config set osd osd_max_scrubs 3

# Check config change history:
ceph config log

# Roll back to an earlier config version (number from 'ceph config log'):
ceph config reset <version>

# Check required release flags:
ceph osd dump | grep require

# Check OSD heartbeat settings:
ceph config get osd osd_heartbeat_interval
ceph config get osd osd_heartbeat_grace

# Check client op timeout:
ceph config get client rados_osd_op_timeout

# Check client mount timeout:
ceph config get client client_mount_timeout

# Check async recovery threshold:
ceph config get osd osd_async_recovery_min_cost

# Check scrub settings:
ceph config get osd osd_scrub_sleep
```

Step 9: Check and Repair OSDs

```bash
# Check OSD status:
ceph osd tree

# Repair an OSD's placement groups:
ceph osd repair osd.0

# Deep scrub:
ceph osd deep-scrub osd.0

# Check scrubbing activity:
ceph pg dump | grep -i scrub

# Force scrub:
ceph osd scrub osd.0

# Watch recovery progress:
ceph -w | grep recovery

# Check recovery/backfill priorities:
ceph config get osd osd_recovery_priority
ceph config get osd osd_recovery_op_priority

# Adjust recovery priority:
ceph config set osd osd_recovery_priority 1

# Check the balancer:
ceph balancer status

# Pause recovery:
ceph osd set norecovery
ceph osd set nobackfill

# Resume recovery:
ceph osd unset norecovery
ceph osd unset nobackfill

# Check OSD flags:
ceph osd dump | grep flags

# Set flags:
ceph osd set pause
ceph osd set noup
ceph osd set nodown

# Unset flags:
ceph osd unset pause
ceph osd unset noup
ceph osd unset nodown

# Set an OSD's reweight:
ceph osd reweight osd.0 1.0

# Rebalance by utilization:
ceph osd reweight-by-utilization
```

Step 10: Ceph Verification Script

```bash
# Create verification script:
cat << 'EOF' > /usr/local/bin/check-ceph.sh
#!/bin/bash

echo "=== Ceph Health ==="
ceph health 2>/dev/null || echo "Cannot connect to cluster"

echo ""
echo "=== Cluster Status ==="
ceph -s 2>/dev/null || echo "Cannot get status"

echo ""
echo "=== OSD Status ==="
ceph osd tree 2>/dev/null || echo "Cannot get OSD tree"

echo ""
echo "=== OSD Stats ==="
ceph osd stat 2>/dev/null || echo "Cannot get OSD stats"

echo ""
echo "=== Monitor Status ==="
ceph mon stat 2>/dev/null || echo "Cannot get monitor status"

echo ""
echo "=== Monitor Dump ==="
ceph mon dump 2>/dev/null | head -20 || echo "Cannot get monitor dump"

echo ""
echo "=== PG Status ==="
ceph pg stat 2>/dev/null || echo "Cannot get PG status"

echo ""
echo "=== Stuck PGs ==="
ceph pg dump_stuck stale inactive undersized degraded 2>/dev/null | head -10 || echo "No stuck PGs"

echo ""
echo "=== Pool List ==="
ceph osd pool ls detail 2>/dev/null | head -20 || echo "Cannot list pools"

echo ""
echo "=== OSD Perf ==="
ceph osd perf 2>/dev/null | head -10 || echo "Cannot get OSD perf"

echo ""
echo "=== OSD Metadata ==="
ceph osd metadata 2>/dev/null | grep -E "osd_id|hostname|device" | head -20 || echo "Cannot get metadata"

echo ""
echo "=== Recovery Status ==="
# Use ceph -s here; ceph -w streams forever and would hang the script
ceph -s 2>/dev/null | grep -E "recovery|rebalance" || echo "No active recovery"

echo ""
echo "=== Network Connectivity ==="
for node in $(ceph osd tree 2>/dev/null | grep host | awk '{print $4}' | head -5); do
  if [ -n "$node" ]; then
    echo "Node: $node"
    ping -c 2 -W 2 $node 2>&1 | tail -2
  fi
done

echo ""
echo "=== Ceph Services ==="
systemctl list-units --type=service 2>/dev/null | grep ceph | head -20

echo ""
echo "=== Recent Logs ==="
journalctl -u 'ceph-osd@*' --no-pager -n 5 2>/dev/null | tail -10 || echo "No OSD logs"

echo ""
echo "=== Recommendations ==="
echo "1. Check OSD services are running on all nodes"
echo "2. Verify the monitor quorum has enough members"
echo "3. Check disk health on OSD nodes"
echo "4. Allow ports 6789 and 6800-7300 in the firewall"
echo "5. Review stuck PGs and force recovery if needed"
echo "6. Check pool min_size for degraded pools"
echo "7. Verify CRUSH map rules"
EOF

chmod +x /usr/local/bin/check-ceph.sh

# Usage:
/usr/local/bin/check-ceph.sh
```

Ceph Cluster Health Checklist

| Check | Expected |
|---|---|
| Health status | HEALTH_OK |
| OSDs | All OSDs up and in |
| Monitor quorum | Majority of monitors in quorum |
| PGs | All active+clean, none stuck |
| Network connectivity | All nodes reachable |
| Disk health | No I/O errors |
| Pool size | At least min_size OSDs available |

Verify the Fix

```bash
# After fixing the Ceph cluster:

# 1. Check health (expect HEALTH_OK):
ceph health

# 2. Check status (all OSDs up, monitors in quorum):
ceph -s

# 3. Check OSDs (all up and in):
ceph osd tree

# 4. Check PGs (all active+clean):
ceph pg stat

# 5. Monitor recovery (no recovery/backfill warnings):
ceph -w

# 6. Test storage (object should store successfully):
rados -p pool1 put testfile /tmp/testfile
```

  • [Fix GlusterFS Volume Not Mounting](/articles/fix-glusterfs-volume-not-mounting)
  • [Fix NFS Mount Failed](/articles/fix-nfs-mount-still-pointing-to-old-file-server-after-migration)
  • [Fix MinIO Bucket Not Accessible](/articles/fix-minio-bucket-not-accessible)