Self-Hosted Linux Debugging and Diagnostics: gdb vs lldb vs rr Guide for Server Administrators

Introduction

When a production service crashes, hangs, or behaves unexpectedly, a debugger is often the only tool that can reveal what’s really happening inside a running process. For Linux server administrators and SRE teams, proficiency with debugging tools separates reactive firefighting from proactive incident resolution.

Modern Linux debuggers have evolved far beyond their 1990s origins. Today’s tools support reverse execution, Python scripting, remote debugging, and seamless integration with core dumps — capabilities that make them indispensable for diagnosing complex server-side issues.

In this guide, we compare three powerful Linux debugging tools: gdb (the GNU Debugger, the industry standard), lldb (LLVM’s modern debugger with superior scripting), and rr (a record-and-replay debugger, originally developed at Mozilla, for deterministic debugging).

Tool Comparison

gdb — The Universal Debugger

gdb is the original and most widely deployed debugger for Linux systems. It supports dozens of target architectures, programming languages (C, C++, Go, Rust, Fortran, and more), and debugging scenarios from live process attachment to post-mortem core dump analysis.

Key Features:

Live process attach/detach without restart
Core dump analysis for post-mortem debugging
Remote debugging via gdbserver (embedded, containers, VMs)
Python scripting API for custom commands and automation
Reverse debugging (limited, via process record)
Thread-aware debugging with scheduler locking
Expression evaluation in multiple languages
TUI (Text User Interface) mode for terminal-based visual debugging

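Strengths: Near-universal availability: virtually every Linux distribution packages gdb, although minimal server and container images often do not install it by default. Extensive scripting capabilities via Python. Mature and battle-tested over 30+ years. The gold standard for core dump analysis.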

Limitations: Command-line interface can be intimidating for newcomers. Reverse debugging is slow (single-stepping with record/replay). Limited structured output for automation (though the Machine Interface protocol helps).

lldb — The Modern Contender

lldb is LLVM’s debugger, designed from the ground up with modern architecture and excellent scripting capabilities. It uses a modular, library-based design that makes it ideal for integration into IDEs and custom tooling.

Key Features:

Full Python scripting with deep debugger state access
Structured output via the SB (Scripting Bridge) API
Expression parser based on Clang/LLVM (excellent C++ support)
Watchpoints with conditional triggers
Reverse debugging via rr (rr is built around gdb as its frontend; lldb support is newer and less mature)
REPL-style interface with command history and tab completion
Parallel debugging of multiple targets
Native macOS support (in addition to Linux)

Strengths: Superior scripting and automation capabilities. First-class C++ debugging with LLVM’s expression parser. Clean, modular architecture makes it extensible. Can drive rr recordings for reverse debugging, although that path is less mature than rr’s gdb frontend.

Limitations: Smaller community and ecosystem compared to gdb. Some gdb-specific extensions and scripts require porting. Less legacy architecture support.

rr — Deterministic Replay Debugging

rr (Record and Replay) takes a fundamentally different approach: instead of debugging a live process, rr records an entire process execution, then allows you to replay it deterministically — forward AND backward — as many times as needed. This transforms debugging from a one-shot investigation into a repeatable process.

Key Features:

Full reverse execution (step backward, reverse-continue, reverse-finish)
Deterministic replay — the same recording always produces the same execution
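Low recording overhead for single-threaded workloads (~1.3x slowdown); rr runs all recorded threads on a single core, so heavily parallel services slow down considerably more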
Watchpoints that work in reverse (find when a variable last changed)
Chaos mode for finding race conditions
Seamless integration with gdb (gdb frontend)
Recording of whole process trees, including shared memory between processes that are all inside the recording
Syscall buffer for high-throughput I/O workloads

Strengths: Reverse debugging is a game-changer for complex bugs. Deterministic replay means you can share recordings with teammates (bundle the trace with rr pack first so it is self-contained). Low single-threaded overhead makes it viable for CI/CD pipelines and, with care, targeted production captures. Chaos mode makes timing-dependent bugs far more likely to surface.

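Limitations: Linux only, and only on supported CPUs: primarily Intel and AMD x86/x86-64, plus some AArch64 cores in newer releases. Requires access to hardware performance counters, which may conflict with profiling tools and is often unavailable inside cloud VMs. All recorded threads run on one core, so parallel workloads lose their parallelism during recording. Recording overhead, while low, may still affect timing-sensitive applications. Not all CPU features are supported (some AVX-512 variants).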
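Since the performance-counter requirement is the most common reason rr refuses to start, it can help to check it before planning a recording. The sketch below assumes the usual rr requirement that kernel.perf_event_paranoid be 1 or lower; the function names rr_paranoid_ok and rr_host_check are illustrative and not part of rr.

```shell
# rr_paranoid_ok LEVEL: succeed if the given perf_event_paranoid value
# is permissive enough for rr (assumption: rr needs 1 or lower).
rr_paranoid_ok() {
    [ "$1" -le 1 ]
}

# rr_host_check: read the current kernel setting and report on it.
rr_host_check() {
    level=$(cat /proc/sys/kernel/perf_event_paranoid 2>/dev/null) || {
        echo "cannot read perf_event_paranoid (not Linux, or /proc is restricted)"
        return 1
    }
    if rr_paranoid_ok "$level"; then
        echo "perf_event_paranoid=$level: ok for rr"
    else
        echo "perf_event_paranoid=$level: too restrictive"
        echo "try: sudo sysctl kernel.perf_event_paranoid=1"
        return 1
    fi
}
```

Running rr_host_check on the target host before an incident is cheaper than discovering the restriction mid-outage. A passing check does not guarantee that the hypervisor actually exposes the counters.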

Comparison Table

Feature	gdb	lldb	rr
Architecture	Monolithic (C, now C++)	Modular C++/LLVM	Record-replay engine
Reverse Debugging	Limited (process record)	Via rr (less mature)	Full (native)
Python Scripting	Extensive API	Superior SB API	Via gdb frontend
Core Dump Analysis	Mature, comprehensive	Growing support	N/A (needs recording)
Live Process Attach	Yes	Yes	Record first, then debug
Production Overhead	Zero (when not attached)	Zero (when not attached)	~1.3x single-threaded, higher for parallel workloads (during recording)
Remote Debugging	gdbserver (mature)	lldb-server (good)	File-based
C++ Support	Good (GCC-focused)	Excellent (Clang/LLVM)	Inherits from frontend
Recording Portability	N/A	N/A	Portable after rr pack, to hosts with a compatible CPU
Best For	Core dumps, embedded, legacy	IDE integration, scripting	Complex/heisenbugs, intermittent failures

Practical Server Debugging Workflows

Debugging a Running Service with gdb

# Find the PID of a misbehaving service
PID=$(systemctl show --property MainPID --value myservice)

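# Attach gdb to the running process (usually needs root or the same user;
# every thread stays paused until you continue or detach)
sudo gdb -p "$PID"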

# Inside gdb:
(gdb) info threads                    # List all threads
(gdb) thread apply all bt             # Backtrace all threads
(gdb) frame 3                         # Switch to frame 3
(gdb) info locals                     # Show local variables
(gdb) print *request                  # Dereference a pointer
(gdb) detach                          # Detach without killing
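
On a production service, the interactive session above keeps the process paused for as long as you are typing. A non-interactive variant is sketched below: it attaches, dumps every thread's backtrace, and exits. The function name dump_backtraces is illustrative; -p, -batch and -ex are standard gdb options.

```shell
# dump_backtraces PID: print a backtrace of every thread, then let gdb exit.
dump_backtraces() {
    pid=$1
    # Reject anything that is not a positive integer. systemctl reports
    # MainPID=0 when the service is not running.
    case $pid in
        ''|*[!0-9]*|0)
            echo "usage: dump_backtraces PID" >&2
            return 2
            ;;
    esac
    # -batch makes gdb exit after the last command; on exit it detaches
    # from a process it attached to, so the service keeps running.
    gdb -p "$pid" -batch \
        -ex "set pagination off" \
        -ex "thread apply all bt"
}

# Example (run as root or as the service's user):
#   dump_backtraces "$(systemctl show --property MainPID --value myservice)" > bt.txt
```

The pause is then limited to the time gdb needs to load symbols and walk the stacks, which is usually easier to fit inside a health-check interval than a manual session.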

Core Dump Analysis with gdb

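# Enable core dumps in the current shell. This does not affect systemd
# services; for those, set LimitCORE=infinity in the unit instead.
ulimit -c unlimited
# Writing core_pattern requires root. /var/coredumps must already exist
# and be writable by the user the service runs as.
echo "/var/coredumps/core.%e.%p.%t" | sudo tee /proc/sys/kernel/core_pattern

# After a crash, analyze the core dump (the glob must match exactly one file)
gdb /usr/bin/myservice /var/coredumps/core.myservice.*
(gdb) bt full                         # Full backtrace with locals
(gdb) info registers                  # Register state at crash
(gdb) frame 0                         # Crash site
(gdb) list                            # Source code around crash
(gdb) print errno                     # Last error code (needs libc debug info; may be stale)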
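
When several crashes have accumulated, the glob above matches more than one file. The sketch below picks the newest core for a binary and produces a batch report. It assumes the core.%e.%p.%t naming pattern shown above; newest_core and core_report are illustrative helper names.

```shell
# newest_core DIR NAME: print the most recently modified core file for NAME.
# Assumes file names of the form core.NAME.PID.TIME without newlines.
newest_core() {
    ls -t "$1"/core."$2".* 2>/dev/null | head -n 1
}

# core_report BINARY DIR: print a full backtrace and registers for the
# newest core file that belongs to BINARY.
core_report() {
    core=$(newest_core "$2" "$(basename "$1")")
    if [ -z "$core" ]; then
        echo "no core file for $1 found in $2" >&2
        return 1
    fi
    gdb -batch -ex "bt full" -ex "info registers" "$1" "$core"
}

# Example:
#   core_report /usr/bin/myservice /var/coredumps > crash-report.txt
```

Note that %e expands to the process name, which the kernel truncates to 15 characters, so long binary names will not match their full basename.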

rr Recording for Intermittent Failures

The power of rr shines when debugging crashes that happen once every thousand requests:

# Record the execution of a failing test
rr record ./my_server --test-reproducer

# Replay and debug
rr replay
(gdb) continue                       # Run until crash
(gdb) bt                             # See where it crashed
(gdb) reverse-step                   # Step backward one source line (reverse-stepi for one instruction)
(gdb) print counter                  # What was the value before?
(gdb) watch -l counter               # Watch the memory location that counter occupies
(gdb) reverse-continue               # Run backward until the watchpoint hits
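
For a failure that appears once in many runs, the recording step usually has to be repeated until the failure is captured. The sketch below keeps only the failing trace. It assumes rr record exits with the recorded program's exit status and that rr honors the _RR_TRACE_DIR environment variable for the trace location; record_until_failure is an illustrative name.

```shell
# record_until_failure MAX CMD [ARGS...]
# Record CMD up to MAX times. On the first failing run, print the directory
# holding its trace and return 0. Return 1 if every run passed.
record_until_failure() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        trace_dir=$(mktemp -d) || return 2
        # Send the program's own output to stderr so that stdout carries
        # only the trace directory.
        if _RR_TRACE_DIR="$trace_dir" rr record "$@" >&2; then
            rm -rf "$trace_dir"      # passing run: discard its trace
        else
            echo "$trace_dir"
            return 0
        fi
        i=$((i + 1))
    done
    return 1
}

# Example:
#   dir=$(record_until_failure 1000 ./my_server --test-reproducer) \
#       && rr replay "$dir/latest-trace"
```

Discarding passing traces matters in practice, because each recording can take a noticeable amount of disk space.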

Automated Bug Detection with rr Chaos Mode

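rr’s chaos mode randomly perturbs thread scheduling while recording to expose race conditions. Because replay is deterministic, the randomness has to be applied at record time: record repeatedly with --chaos until a run fails, then replay that failing trace.

# Record under chaos mode until the test suite fails
for i in $(seq 1 100); do
    if ! rr record --chaos ./my_multithreaded_app --test-suite; then
        echo "Failure captured on iteration $i"
        break
    fi
done

# Replay the most recent recording (the failing run, if one was captured)
rr replay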

This technique finds concurrency bugs that might otherwise go undetected for months in production.

Why Self-Host Your Debugging Infrastructure?

Running debugging tools on your own servers — rather than relying on cloud-based debugging services — offers several critical advantages for security-conscious organizations:

Data sovereignty: Core dumps and process recordings contain sensitive data: memory contents, encryption keys, user data, and proprietary algorithms. Keeping all debugging data local makes it easier to meet data protection requirements such as GDPR, HIPAA, and SOC 2, provided the dumps themselves are access-controlled and retained only as long as needed.

No external dependencies: When a production outage is costing thousands per minute, you cannot afford to wait for a cloud debugging service to become available. Self-hosted tools work even during network partitions or cloud provider outages.

Custom automation: Python scripting in gdb and lldb enables custom debugging workflows tailored to your specific application architecture. Automate repetitive investigation steps, generate structured reports, and integrate with your monitoring stack.

Zero-cost scaling: Debug as many processes as you need — there are no per-seat licenses, API rate limits, or usage quotas when using open-source debugging tools.

For complementary server diagnostics, see our network diagnostics tools guide and Linux network interface diagnostics.

FAQ

Can I use gdb and lldb interchangeably?

Not entirely. While both debug ELF binaries and support similar commands, their scripting APIs and advanced features differ significantly. gdb scripts won’t run in lldb without porting. For most basic debugging tasks (setting breakpoints, inspecting variables, backtraces), either tool works fine. Choose gdb if you need mature core dump support or remote embedded debugging. Choose lldb if you need superior Python scripting or C++ expression evaluation.
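
To give a feel for how much porting is involved, the sketch below maps a few of the gdb commands used in this guide to their lldb counterparts. The function name gdb_to_lldb is illustrative, and the table is far from complete; the LLVM project publishes a full GDB to LLDB command map.

```shell
# gdb_to_lldb "GDB COMMAND": print the lldb equivalent, or fail if the
# command is not in this small table.
gdb_to_lldb() {
    case $1 in
        "bt")                  echo "thread backtrace" ;;
        "thread apply all bt") echo "thread backtrace all" ;;
        "info threads")        echo "thread list" ;;
        "info locals")         echo "frame variable --no-args" ;;
        "info registers")      echo "register read" ;;
        "detach")              echo "process detach" ;;
        *)                     return 1 ;;
    esac
}

# Example:
#   gdb_to_lldb "info threads"
```

Interactive commands translate almost one to one. The real porting cost lies in Python scripts, because gdb's Python module and lldb's SB API have different object models.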

Does rr work with Docker containers?

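Yes, with some configuration. rr needs the ptrace and perf_event_open system calls, which Docker’s default capability set and seccomp profile restrict. Allow them with --cap-add=SYS_PTRACE --security-opt seccomp=unconfined. The host must also expose hardware performance counters to the container (kernel.perf_event_paranoid set to 1 or lower), which many virtualized cloud instances do not. For Kubernetes, you’ll need a privileged container or a custom security context. Many teams run rr recordings inside the same container image as their production service to ensure binary compatibility.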
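
As a sketch of how those flags fit together, the helper below prints a docker run command line without executing it. The name rr_docker_cmd and the image name in the example are hypothetical, and the image is assumed to have rr installed.

```shell
# rr_docker_cmd IMAGE CMD [ARGS...]
# Print a docker command line that records CMD under rr inside IMAGE.
rr_docker_cmd() {
    image=$1
    shift
    echo docker run --rm -it \
        --cap-add=SYS_PTRACE \
        --security-opt seccomp=unconfined \
        "$image" rr record "$@"
}

# Example (prints the command; review it, then run it yourself):
#   rr_docker_cmd myservice:debug ./my_server --test-reproducer
```

Printing instead of executing keeps the sketch safe to paste. Disabling the seccomp profile widens the container's attack surface, so this is better suited to a dedicated debugging host than to a shared production node.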

How do I debug a multi-threaded deadlock?

For a process that is already hung, start with gdb: attach and run thread apply all bt to see which thread is blocked on which lock. When you need to know how the threads got there, rr is the most effective tool. Record the execution once, then replay and set breakpoints at each locking function. Use rr replay with gdb to step through the lock acquisition order in both threads. With reverse execution, you can trace backward from the deadlock point to see exactly which thread acquired which lock and in what order, something that is nearly impossible with live debugging alone.

What’s the best way to debug a crashing daemon that only fails under load?

Use rr to record the daemon under load until it crashes. Since rr recordings are deterministic, you can replay the crash as many times as needed. Alternatively, configure systemd to capture core dumps: set LimitCORE=infinity in the service unit file and DefaultLimitCORE=infinity in /etc/systemd/system.conf. Core dumps capture the exact state at crash time, allowing offline analysis with gdb.
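
For the systemd route, a per-service drop-in avoids editing the packaged unit file. The sketch below writes one; the function name write_core_dropin and the file name coredump.conf are illustrative choices, while the drop-in directory layout and the LimitCORE directive are standard systemd.

```shell
# write_core_dropin DIR: create DIR/coredump.conf with an unlimited
# core size for the service that owns the drop-in directory.
write_core_dropin() {
    mkdir -p "$1" || return 1
    cat > "$1/coredump.conf" <<'EOF'
[Service]
LimitCORE=infinity
EOF
}

# Example (as root):
#   write_core_dropin /etc/systemd/system/myservice.service.d
#   systemctl daemon-reload
#   systemctl restart myservice
```

The restart is required because resource limits are applied when the service process starts.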

How do I debug a service without stopping it?

gdb supports a non-stop mode in which stopping one thread leaves the others running. Enable it with set non-stop on before attaching to or starting the program; the setting does not change the behavior of a session that is already attached. lldb does not offer a working equivalent in current releases; the target.non-stop-mode setting found in older versions was never fully implemented. Non-stop mode matters for production services where stopping all threads would trigger health check failures or client timeouts. It has limitations: you cannot reliably examine data structures that other threads are modifying concurrently.

Is there a web-based interface for Linux debugging?

gdb and lldb are fundamentally CLI tools, but several frontends build on them, including browser-based ones such as gdbgui. gdb’s built-in TUI mode (gdb -tui) provides a terminal-based split-screen view with source code and command input. For remote collaboration, you can combine gdb with tty-share or tmux for shared debugging sessions. For automated analysis, gdb’s Machine Interface (gdb -i=mi) emits structured, line-oriented records. The format is not JSON, but it is regular enough for a frontend to parse and convert for custom web dashboards.
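
To illustrate what MI output looks like to a consumer, the sketch below classifies a single MI line by its leading character, following the record types described in the gdb manual. The function name mi_record_kind is illustrative, and a real frontend would go on to parse the key="value" payload as well.

```shell
# mi_record_kind LINE: print the kind of GDB/MI output record LINE is.
mi_record_kind() {
    # Commands may carry a numeric token that gdb echoes back as a
    # prefix (e.g. 12^done); strip it before looking at the type marker.
    rest=$(printf '%s\n' "$1" | sed 's/^[0-9]*//')
    case $rest in
        '^'*)     echo result ;;
        '*'*)     echo exec-async ;;
        '='*)     echo notify-async ;;
        '~'*)     echo console-stream ;;
        '@'*)     echo target-stream ;;
        '&'*)     echo log-stream ;;
        '(gdb)'*) echo prompt ;;
        *)        echo unknown ;;
    esac
}

# Example:
#   mi_record_kind '*stopped,reason="breakpoint-hit"'
```

A dashboard backend would typically read gdb's output line by line, dispatch on this kind, and convert result and async records into JSON for the browser.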


