Introduction
The Go http.Server runs an internal accept loop that calls net.Listener.Accept() to receive incoming connections. When the socket encounters an unrecoverable error (file descriptor exhaustion, address already in use on restart, or TCP stack issues), Serve stops accepting new connections and returns an error. If that return value is ignored, or the loop degenerates into rapid-fire transient errors, the server process continues running, appearing healthy but serving no traffic.
Symptoms
- `accept tcp [::]:8080: accept4: too many open files`
- Server process running but not accepting connections
- `http: Accept error: accept tcp [::]:8080: use of closed network connection`
- `panic: accept tcp [::]:8080: accept: cannot assign requested address`
- Health check endpoint returns 200 but actual requests time out
```
2024/01/15 10:30:00 accept tcp [::]:8080: accept4: too many open files
2024/01/15 10:30:01 accept tcp [::]:8080: accept4: too many open files
# Repeated rapidly - connections not being accepted
# Process still running, health check still responding
# But no new client connections can be established
```

Common Causes
- File descriptor limit reached (ulimit -n)
- TCP TIME_WAIT connections consuming all available sockets
- Port not released from previous instance (SO_REUSEADDR not set)
- Kernel parameter `net.core.somaxconn` too low for traffic spike
- Accept errors in tight loop without backoff
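Most of these causes can be checked directly from /proc before changing anything. A Linux-only sketch; `$$` (this shell's PID) stands in for the server's PID:

```shell
# Open file descriptors held by a process, vs. its limit
ls /proc/$$/fd | wc -l
grep "open files" /proc/$$/limits

# Sockets stuck in TIME_WAIT (state 06 in /proc/net/tcp;
# "ss -tan state time-wait" shows the same)
awk '$4 == "06"' /proc/net/tcp | wc -l

# Current listen backlog ceiling (also: sysctl net.core.somaxconn)
cat /proc/sys/net/core/somaxconn
```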
Step-by-Step Fix
1. Increase file descriptor limits:

```bash
# Check current limits
ulimit -n   # Often 1024 by default

# Increase for the process
ulimit -n 65536

# System-wide limit
echo "fs.file-max = 2097152" >> /etc/sysctl.conf
sysctl -p

# Per-service in systemd (/etc/systemd/system/myapp.service):
# [Service]
# LimitNOFILE=65536
```
2. Handle accept errors gracefully in a custom listener:

```go
import (
	"log"
	"net"
	"sync"
	"time"
)

type errorCountingListener struct {
	net.Listener
	errors int
	mu     sync.Mutex
}

func (l *errorCountingListener) Accept() (net.Conn, error) {
	conn, err := l.Listener.Accept()
	if err != nil {
		l.mu.Lock()
		l.errors++
		errs := l.errors
		l.mu.Unlock()

		if errs > 100 {
			log.Fatalf("Too many accept errors (%d), restarting", errs)
		}

		// Brief backoff to avoid a tight error loop
		time.Sleep(5 * time.Millisecond)
		return nil, err
	}

	l.mu.Lock()
	l.errors = 0 // Reset on success
	l.mu.Unlock()
	return conn, nil
}
```
3. Configure TCP socket options:

```go
import (
	"context"
	"log"
	"net"
	"net/http"
	"syscall"
	"time"
)

// SO_REUSEADDR must be set before bind to allow quick restart, so use
// ListenConfig.Control rather than setting it after net.Listen
// (by then the socket is already bound and the option has no effect).
lc := net.ListenConfig{
	Control: func(network, address string, c syscall.RawConn) error {
		var sockErr error
		if err := c.Control(func(fd uintptr) {
			sockErr = syscall.SetsockoptInt(int(fd), syscall.SOL_SOCKET, syscall.SO_REUSEADDR, 1)
		}); err != nil {
			return err
		}
		return sockErr
	},
}

listener, err := lc.Listen(context.Background(), "tcp", ":8080")
if err != nil {
	log.Fatal(err)
}

srv := &http.Server{
	Handler:           mux,
	ReadHeaderTimeout: 10 * time.Second,
	IdleTimeout:       120 * time.Second,
}
log.Fatal(srv.Serve(listener))
```

Note that Go's net package already sets SO_REUSEADDR on TCP listeners on Unix-like systems; the Control hook matters mainly for options Go does not set by default, such as SO_REUSEPORT.
4. Tune kernel TCP parameters:

```bash
# Increase listen backlog
sysctl -w net.core.somaxconn=4096

# Reduce TIME_WAIT duration
sysctl -w net.ipv4.tcp_fin_timeout=15

# Enable reuse of TIME_WAIT sockets
sysctl -w net.ipv4.tcp_tw_reuse=1

# Increase local port range
sysctl -w net.ipv4.ip_local_port_range="1024 65535"
```
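`sysctl -w` does not survive a reboot; to persist the settings, a conventional drop-in file can be used (the filename is illustrative):

```shell
# Persist the settings across reboots
cat > /etc/sysctl.d/90-accept-tuning.conf <<'EOF'
net.core.somaxconn = 4096
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 1024 65535
EOF

# Reload all sysctl configuration
sysctl --system
```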
5. Monitor listener health:

```go
import (
	"log"
	"net"
	"os/exec"
	"time"
)

func monitorListener(listener net.Listener) {
	go func() {
		ticker := time.NewTicker(30 * time.Second)
		defer ticker.Stop()
		for range ticker.C {
			// Check /proc/net/tcp for connection states,
			// or use netstat/ss programmatically
			out, err := exec.Command("ss", "-s").Output()
			if err != nil {
				log.Printf("ss failed: %v", err)
				continue
			}
			log.Printf("Socket stats: %s", out)
		}
	}()
}
```
Prevention
- Set `LimitNOFILE` in systemd service files to 65536 or higher
- Configure `ReadHeaderTimeout` and `IdleTimeout` on http.Server
- Monitor file descriptor usage: `lsof -p <pid> | wc -l`
- Set up alerts on accept error rates in application logs
- Use `SO_REUSEADDR` and `SO_REUSEPORT` for zero-downtime restarts
- Implement health checks that verify actual connection acceptance
- In Kubernetes, configure `terminationGracePeriodSeconds` for graceful listener shutdown