Introduction

A segmentation fault (segfault) occurs when a program attempts to access memory it is not permitted to access. A very common cause is a null pointer dereference, where code reads or writes through a pointer whose value is NULL, often because a function returned NULL on failure and the caller did not check. The kernel logs each event with the faulting address, the instruction pointer (ip, the RIP register on x86-64), and a page-fault error code. Without debugging symbols and core dumps, segfaults can be very difficult to diagnose in production.

Symptoms

  • Application crashes with Segmentation fault (core dumped)
  • dmesg shows myapp[12345]: segfault at 0000000000000000 ip 00007f1234567890 (older kernels print rip instead of ip)
  • Service restarts repeatedly due to crash loop
  • journalctl shows process 12345 (myapp) was killed by signal 11 (SEGV)
  • Application works for some inputs but crashes on specific data

Common Causes

  • Null pointer dereference (accessing ptr->field when ptr is NULL/0x0)
  • Use-after-free: accessing memory that has already been freed
  • Buffer overflow: writing beyond the bounds of an allocated array
  • Stack overflow from deep recursion or large local arrays
  • Double-free, or use of an uninitialized (wild) pointer
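
Two of these causes can be made harder to hit with small defensive habits. The sketch below is illustrative only: `free_and_null` and `copy_bounded` are hypothetical helper names, not functions from any library, and they assume heap memory obtained from `malloc` and NUL-terminated strings.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: free *pp and reset it to NULL, so a second call is
 * harmless (free(NULL) is a no-op) and later NULL checks are meaningful. */
void free_and_null(char **pp)
{
    if (pp != NULL) {
        free(*pp);
        *pp = NULL;
    }
}

/* Hypothetical helper: copy src into dst without writing past dst_size bytes.
 * Returns 0 if the whole string fit, -1 on truncation or invalid arguments. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0) {
        return -1;
    }
    int needed = snprintf(dst, dst_size, "%s", src);
    return (needed >= 0 && (size_t)needed < dst_size) ? 0 : -1;
}
```

Resetting one pointer does not protect other copies of the same address, so this reduces the risk of double-free and use-after-free rather than eliminating it.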

Step-by-Step Fix

  1. Check the kernel log for segfault details:
```bash
dmesg -T | grep segfault | tail -5
# Example output:
# myapp[12345]: segfault at 0000000000000000 ip 00005555555551a2
# sp 00007fffffffe120 error 4 in myapp[555555554000+2000]
# error 4 = user-mode read of an unmapped page; with address 0x0, a null pointer read
# error 6 = user-mode write to an unmapped page; with address 0x0, a null pointer write
```
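
The error value in the log line is a bit field. As a rough aid for reading it, the sketch below decodes the low bits of the x86 page-fault error code; `describe_pf_error` and the `PF_*` constant names are hypothetical, and other architectures report faults differently.

```c
#include <stdio.h>

/* Low bits of the x86 page-fault error code printed after "error" in dmesg. */
enum {
    PF_PROT  = 1 << 0, /* 0: page not present, 1: protection violation */
    PF_WRITE = 1 << 1, /* 0: read access, 1: write access */
    PF_USER  = 1 << 2, /* 0: kernel mode, 1: user mode */
    PF_RSVD  = 1 << 3, /* reserved bit set in a page-table entry */
    PF_INSTR = 1 << 4  /* fault happened on an instruction fetch */
};

/* Hypothetical helper: write a readable description of code into buf
 * (at most size bytes, always NUL-terminated) and return buf. */
char *describe_pf_error(unsigned code, char *buf, size_t size)
{
    snprintf(buf, size, "%s %s, %s%s",
             (code & PF_USER) ? "user-mode" : "kernel-mode",
             (code & PF_INSTR) ? "instruction fetch"
                               : (code & PF_WRITE) ? "write" : "read",
             (code & PF_PROT) ? "protection violation" : "page not present",
             (code & PF_RSVD) ? ", reserved bit set" : "");
    return buf;
}
```

With this decoding, error 4 reads as a user-mode read of a page that is not present, and error 6 as the corresponding write.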
  2. Enable core dumps for post-mortem analysis:
```bash
# Allow unlimited core size (applies to the current shell and its children)
ulimit -c unlimited
# Set core pattern for predictable naming (%e = executable name, %p = PID)
echo "/tmp/core.%e.%p" | sudo tee /proc/sys/kernel/core_pattern
# For systemd services, add to unit file:
# [Service]
# LimitCORE=infinity
sudo systemctl daemon-reload
# Restart the service afterwards so the new limit takes effect
```
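
The limit that `ulimit -c` sets can also be raised by the process itself at startup, which helps when the launch environment is not under your control. A minimal sketch, assuming a POSIX system; `enable_core_dumps` is a hypothetical function name.

```c
#include <sys/resource.h>

/* Hypothetical helper: raise the soft RLIMIT_CORE to the hard limit.
 * Returns 0 on success, -1 on failure (errno is set by the failing call). */
int enable_core_dumps(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        return -1;
    }
    /* An unprivileged process may raise its soft limit only up to the hard limit. */
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_CORE, &lim);
}
```

This affects only the calling process and its children. If the hard limit is 0, no core file is written, and where the file ends up is still decided by `core_pattern`.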
  3. Analyze the core dump with gdb:
```bash
gdb /usr/bin/myapp /tmp/core.myapp.12345
(gdb) bt full          # Full backtrace with local variables
(gdb) info registers   # Register state at crash
(gdb) list             # Source code around crash point
(gdb) print ptr        # Inspect the suspect pointer (ptr is a placeholder name)
(gdb) frame 3          # Go to a specific frame in the stack
```
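  4. Use addr2line if you do not have the core dump:
```bash
# Take the ip value and the mapping base (the number in brackets) from dmesg.
# For a position-independent executable (PIE) or shared library, subtract the base:
#   0x5555555551a2 - 0x555555554000 = 0x11a2
# This assumes the bracketed base is where the file's image starts in memory.
addr2line -f -e /usr/bin/myapp 0x11a2
# Returns the function name and location, e.g. /home/dev/myapp/src/main.c:42
# For a non-PIE executable, pass the ip value unchanged.
# The binary must contain debug info (built with -g) to resolve file and line.
```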
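
For position-independent binaries, the address to look up is usually the offset of ip from the mapping base printed in brackets, not the raw ip. The sketch below extracts both from a log line in the format shown in step 1 and computes that offset; `segfault_offset` is a hypothetical name, and the parsing assumes the `ip <hex>` and `[<base>+<size>]` fields appear exactly as in that example.

```c
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: parse "ip <hex>" and the trailing "[<base>+<size>]"
 * from a dmesg segfault line and store ip - base in *offset.
 * Returns 0 on success, -1 if a field is missing or ip is outside the mapping. */
int segfault_offset(const char *line, uint64_t *offset)
{
    uint64_t ip, base, size;
    const char *ip_field = strstr(line, " ip ");
    const char *map_field = strrchr(line, '['); /* last bracket holds base+size */

    if (ip_field == NULL || map_field == NULL) {
        return -1;
    }
    if (sscanf(ip_field, " ip %" SCNx64, &ip) != 1) {
        return -1;
    }
    if (sscanf(map_field, "[%" SCNx64 "+%" SCNx64, &base, &size) != 2) {
        return -1;
    }
    if (ip < base || ip - base >= size) {
        return -1;
    }
    *offset = ip - base;
    return 0;
}
```

The resulting offset is what would be passed to `addr2line -e`. If the lookup prints `??:0`, the mapping may not start at the beginning of the file's image, or the binary may lack debug info.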
  5. Compile with debug symbols and sanitizers for development:
```bash
gcc -g -fsanitize=address,undefined -o myapp myapp.c
# AddressSanitizer catches: buffer overflows, use-after-free, double-free
# UndefinedBehaviorSanitizer catches: null pointer dereference, signed overflow
./myapp
# Will print exact location of the bug at runtime
```
  6. Apply the fix and verify:
```c
// Before (buggy):
struct config *cfg = load_config();
printf("%s\n", cfg->database); // cfg could be NULL

// After (fixed):
struct config *cfg = load_config();
if (cfg == NULL) {
    fprintf(stderr, "Failed to load configuration\n");
    return 1;
}
printf("%s\n", cfg->database);
```
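
To see the fixed pattern end to end, here is a small self-contained sketch. The article does not show `struct config` or `load_config`, so the struct layout, the `load_config_from` loader (which treats the first line of a file as the database name), and `print_database` are all hypothetical stand-ins.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical config type; only the database field appears in the article. */
struct config {
    char database[128];
};

/* Hypothetical loader: reads the first line of path as the database name.
 * Returns NULL on any failure, so callers must check the result. */
struct config *load_config_from(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        return NULL;
    }

    struct config *cfg = calloc(1, sizeof *cfg);
    if (cfg != NULL && fgets(cfg->database, sizeof cfg->database, fp) == NULL) {
        free(cfg); /* empty or unreadable file */
        cfg = NULL;
    }
    if (cfg != NULL) {
        cfg->database[strcspn(cfg->database, "\n")] = '\0'; /* strip newline */
    }
    fclose(fp);
    return cfg;
}

/* Mirrors the fixed snippet: report the failure and return non-zero
 * instead of dereferencing a NULL pointer. */
int print_database(const char *path)
{
    struct config *cfg = load_config_from(path);
    if (cfg == NULL) {
        fprintf(stderr, "Failed to load configuration\n");
        return 1;
    }
    printf("%s\n", cfg->database);
    free(cfg);
    return 0;
}
```

Calling `print_database` with a missing file returns 1 and prints the error message, where the unchecked version would have crashed with signal 11.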

Prevention

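  • Always check return values before using them: NULL from functions such as malloc and fopen, and -1 from calls such as connect
  • Use AddressSanitizer (-fsanitize=address) for test builds in CI/CD pipelines; its runtime overhead usually makes it unsuitable for production builds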
  • Enable core dumps in development and staging environments
  • Use static analysis tools (cppcheck, clang-tidy) to catch null pointer risks before the code runs
  • Implement graceful error handling with clear error messages instead of allowing crashes