Fix Message Broker Disk High Watermark Reached Error

Introduction

Message brokers monitor disk usage using low and high watermarks. When disk usage crosses the high watermark, the broker stops accepting new messages to prevent disk exhaustion and potential data corruption. This protection makes publishes fail immediately for every topic and partition stored on the affected disk.
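The watermark check can be pictured as a simple comparison. The sketch below is illustrative only: the function name `watermark_state` and the 75/85 thresholds are assumptions, and the "warning" state between the two watermarks is an assumed interpretation, since the exact behaviour depends on the broker.

```bash
#!/usr/bin/env bash
# Hypothetical helper: classify disk usage against low and high watermarks.
# Arguments: used percent, low watermark percent, high watermark percent.
watermark_state() {
  local used=$1 low=$2 high=$3
  if [ "$used" -ge "$high" ]; then
    echo "blocked"    # at or above the high watermark: new messages are rejected
  elif [ "$used" -ge "$low" ]; then
    echo "warning"    # between the watermarks: writes still accepted (assumed)
  else
    echo "ok"         # below the low watermark
  fi
}

watermark_state 91 75 85   # prints "blocked"
```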

Symptoms

Producers receive RESOURCE_ERROR or disk full exceptions when attempting to publish
Broker logs contain disk watermark exceeded or free disk space below threshold warnings
Broker rejects connections or enters read-only mode for affected partitions
Consumer processing stalls as no new messages arrive
Monitoring dashboards show disk usage above 90% with flatlined production rates

Common Causes

Log retention period set too high, accumulating more data than disk capacity
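Consumer lag building a backlog on brokers that keep messages until they are consumed or acknowledged (Kafka itself deletes by time and size regardless of consumer position)
Unexpected traffic spike producing messages faster than retention or compaction can reclaim space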
Disk not sized to handle peak message volume plus retention buffer
Compaction or deletion policies failing silently, preventing old segment cleanup

Step-by-Step Fix

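1. Check current disk usage and watermark configuration: Identify how far over the threshold the disk has gone.
```bash
df -h /var/lib/kafka/data
# Check disk and retention settings in server.properties.
# Stock Apache Kafka has no built-in disk watermark property, so "log.disk"
# only matches on distributions that add one; log.dirs and log.retention.*
# are standard settings.
grep -E "log\.(disk|dirs|retention)" /etc/kafka/server.properties
```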
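To put a number on "how far over the threshold", the `df` output can be parsed in a small script. This is a minimal sketch: the function names are hypothetical, and the watermark value is passed in by hand because it depends on how your broker or monitoring is configured.

```bash
#!/usr/bin/env bash
# Extract the integer "Capacity" column from `df -P` output read on stdin.
parse_used_pct() {
  awk 'NR == 2 { gsub("%", "", $5); print $5 }'
}

# Used-space percentage of the filesystem holding the given path.
# df -P forces the portable one-line-per-filesystem format.
disk_used_pct() {
  df -P "$1" | parse_used_pct
}

# Report how far a path's filesystem is from a given high watermark (percent).
report_watermark() {
  local path=$1 high=$2 used
  used=$(disk_used_pct "$path")
  [ -n "$used" ] || return 1   # df failed or produced no data row
  if [ "$used" -ge "$high" ]; then
    echo "${path}: ${used}% used, $((used - high)) points over the ${high}% watermark"
  else
    echo "${path}: ${used}% used, $((high - used)) points under the ${high}% watermark"
  fi
}
```

For example, `report_watermark /var/lib/kafka/data 85` reports the data directory's usage relative to an assumed 85% watermark.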
2. Reduce log retention period temporarily to free disk space: Lower retention so that older segments become eligible for deletion at the next retention check.
```bash
# Reduce retention to 1 hour temporarily
kafka-configs.sh --bootstrap-server localhost:9092 \
  --alter --entity-type topics --entity-name my-topic \
  --add-config retention.ms=3600000
```
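Because the lowered retention is meant to be temporary, it helps to prepare the command that removes the override at the same time. The sketch below only prints the commands rather than running them; `hours_to_ms`, `retention_commands`, and the `BOOTSTRAP` default are illustrative names, while `--add-config` and `--delete-config` are standard `kafka-configs.sh` options.

```bash
#!/usr/bin/env bash
# Assumed default; override with your own bootstrap server.
BOOTSTRAP="${BOOTSTRAP:-localhost:9092}"

# Convert a retention period in hours to the milliseconds retention.ms expects.
hours_to_ms() {
  echo $(( $1 * 60 * 60 * 1000 ))
}

# Print (do not run) the command that lowers retention for a topic and the
# command that later removes the override so the topic falls back to its default.
retention_commands() {
  local topic=$1 hours=$2
  echo "kafka-configs.sh --bootstrap-server $BOOTSTRAP --alter --entity-type topics --entity-name $topic --add-config retention.ms=$(hours_to_ms "$hours")"
  echo "kafka-configs.sh --bootstrap-server $BOOTSTRAP --alter --entity-type topics --entity-name $topic --delete-config retention.ms"
}

retention_commands my-topic 1
```

Run the second printed command once disk usage has dropped back below the watermark, otherwise the topic keeps only one hour of data.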
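3. Make sure log cleanup can run: Kafka has no command that forces cleanup on demand. Segments past retention are deleted at the next retention check (log.retention.check.interval.ms, 5 minutes by default), and the log cleaner threads handle compacted topics. Verify these broker settings in server.properties.
```properties
# Broker-wide default policy: delete segments that exceed retention
log.cleanup.policy=delete
# Log cleaner threads, used for compacted topics
log.cleaner.enable=true
log.cleaner.threads=4
```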
4. Delete or truncate non-critical topics: Remove temporary or test topics consuming disk space.
```bash
kafka-topics.sh --bootstrap-server localhost:9092 --delete --topic test-topic
kafka-topics.sh --bootstrap-server localhost:9092 --delete --topic staging-events
```
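Before deleting anything, it is worth confirming which topics actually hold the space. Kafka stores each partition in its own `<topic>-<partition>` directory under the data directory, so a plain `du` ranking is usually enough. The helper name `largest_dirs` is hypothetical.

```bash
#!/usr/bin/env bash
# List the N largest subdirectories of a directory (sizes in KiB, largest first).
# Arguments: directory, number of entries to show (default 10).
largest_dirs() {
  local dir=$1 count=${2:-10}
  du -sk "$dir"/*/ 2>/dev/null | sort -rn | head -n "$count"
}
```

For example, `largest_dirs /var/lib/kafka/data 10` lists the ten largest partition directories on the broker.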
5. Expand disk capacity or add broker nodes: If the workload has permanently outgrown current capacity, add disk or brokers and rebalance partitions onto the new capacity.
```bash
# Generate a proposed reassignment plan across brokers 0-3 (moves nothing by itself)
kafka-reassign-partitions.sh --bootstrap-server localhost:9092 \
  --generate --topics-to-move-json-file topics.json --broker-list "0,1,2,3"
```
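The reassignment tool reads the list of topics to move from a small JSON file. A minimal sketch for producing that file is shown below; the helper name `make_topics_json` and the topic names are placeholders.

```bash
#!/usr/bin/env bash
# Print a topics-to-move JSON document for the topic names given as arguments.
make_topics_json() {
  local first=1 topic
  printf '{"version":1,"topics":['
  for topic in "$@"; do
    [ "$first" -eq 1 ] || printf ','   # comma between entries, not before the first
    printf '{"topic":"%s"}' "$topic"
    first=0
  done
  printf ']}\n'
}
```

Typical use is `make_topics_json my-topic other-topic > topics.json`. The `--generate` run prints a proposed plan; to apply it, save the proposed assignment to a file and pass it to the same tool with `--reassignment-json-file` and `--execute`.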

Prevention

Set disk high watermark at 85% of total capacity to provide adequate buffer before disk exhaustion
Configure automated alerts at 70% (warning) and 80% (critical) disk usage
Size disk to handle at least 3x the expected daily message volume at peak retention
Enable log compaction for topics where only the latest value per key matters
Monitor consumer lag continuously, as persistent lag is a common cause of retention backlog
Implement tiered storage to offload older segments to cheaper object storage
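The sizing and alerting guidance above can be turned into a rough per-broker estimate. The formula below is an assumption for illustration, not taken from this guide: daily volume times retention days times replication factor, spread evenly across brokers, then scaled so steady-state usage sits at a target utilisation such as the 70% warning level. The function name and input figures are hypothetical.

```bash
#!/usr/bin/env bash
# Rough per-broker disk estimate in GB, using integer arithmetic that rounds up.
# Arguments: daily volume (GB), retention (days), replication factor,
#            broker count, target utilisation (percent).
per_broker_disk_gb() {
  local daily_gb=$1 retention_days=$2 replication=$3 brokers=$4 target_pct=$5
  local total=$(( daily_gb * retention_days * replication ))
  local per_broker=$(( (total + brokers - 1) / brokers ))
  echo $(( (per_broker * 100 + target_pct - 1) / target_pct ))
}

per_broker_disk_gb 200 7 3 4 70   # prints 1500
```

Real clusters rarely spread partitions perfectly evenly, so treat the result as a lower bound and add margin for skew and traffic spikes.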
