Introduction
Remote Direct Memory Access (RDMA) is the backbone of high-performance computing: it lets a network adapter move data directly between the memory of two servers, bypassing the operating system kernel and, for one-sided operations, the remote CPU. Technologies like InfiniBand, RoCE (RDMA over Converged Ethernet), and iWARP power everything from financial trading systems to distributed databases and HPC clusters.
Managing an RDMA fabric requires specialized tools for diagnostics, performance testing, and fabric communication. This guide compares three essential open-source RDMA infrastructure toolkits: rdma-core (the standard userspace library suite), perftest (RDMA performance benchmarking), and libfabric (the OpenFabrics Interfaces framework).
Comparison Table
| Feature | rdma-core | perftest | libfabric |
|---|---|---|---|
| GitHub Stars | ⭐ 2,261 | ⭐ 973 | ⭐ 797 |
| Purpose | Core RDMA userspace libraries | RDMA performance testing | Provider-agnostic fabric API |
| Maintained By | Linux RDMA Community | Linux RDMA Community | OpenFabrics Alliance |
| Protocol Support | InfiniBand, RoCE, iWARP | InfiniBand, RoCE | InfiniBand, RoCE, iWARP, TCP, UDP, shm |
| Key Components | libibverbs, librdmacm, infiniband-diags | ib_send_bw, ib_write_bw, ib_read_bw | fi_msg, fi_rma, fi_tagged |
| Programming Model | Verbs API (low-level) | CLI benchmarking tools | libfabric API (provider abstraction) |
| Latest Version | v54.0 (2026) | 24.10.0 (2026) | v2.0.0 (2026) |
| Installation | Distro packages + source | Distro packages + source | Distro packages + source |
| Dependencies | Kernel RDMA subsystem | rdma-core (libibverbs, librdmacm) | Provider-dependent (rdma-core for the verbs provider) |
| Monitoring/Diag | ✅ infiniband-diags (ibstat, perfquery, and others) | ❌ None | ❌ None |
rdma-core
rdma-core is the canonical userspace library suite for Linux RDMA. It provides the Verbs API implementation (libibverbs), connection management (librdmacm), and a set of diagnostic and management tools. Nearly every userspace RDMA application on Linux ultimately links against rdma-core.
Key components:
- libibverbs: The core Verbs library for queue pair (QP) creation, memory registration, and work request posting
- librdmacm: RDMA connection manager for establishing connections between endpoints
- infiniband-diags tools: Diagnostic utilities including ibstat, ibstatus, ibping, ibnetdiscover, and perfquery (ibdiagnet and ibdiagpath are often used alongside them but ship separately in ibutils)
- rdma utility: Modern command-line tool for RDMA device and resource management (distributed with iproute2 rather than rdma-core itself)
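Installation (Ubuntu/Debian): install the rdma-core, ibverbs-utils, and infiniband-diags packages with apt. Together they provide the libraries, the ibv_* device utilities, and the diagnostic tools listed above.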
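Diagnostics workflow for checking fabric health: list the RDMA devices with ibv_devices, inspect port state and link rate with ibstat, and read per-port error counters with perfquery.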
rdma-core is the foundation — you install it on every RDMA-capable node in your cluster. Its diagnostic tools are essential for troubleshooting fabric issues, verifying link health, and monitoring congestion.
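The port state and rate that ibstat and ibstatus report are also exposed by the kernel under /sys/class/infiniband, which is convenient for scripted health checks. The following is a minimal sketch, assuming the usual sysfs layout of one directory per device with a ports subdirectory; the function name read_port_status is illustrative and not part of rdma-core.

```python
from pathlib import Path


def read_port_status(sysfs_root="/sys/class/infiniband"):
    """Map (device, port) to the sysfs 'state' and 'rate' strings of that port."""
    status = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return status  # no RDMA devices registered, or not a Linux host
    for dev in sorted(root.iterdir()):
        ports_dir = dev / "ports"
        if not ports_dir.is_dir():
            continue
        for port in sorted(ports_dir.iterdir()):
            entry = {}
            for attr in ("state", "rate"):
                attr_file = port / attr
                # A missing attribute is recorded as None rather than raising
                entry[attr] = attr_file.read_text().strip() if attr_file.is_file() else None
            status[(dev.name, port.name)] = entry
    return status


if __name__ == "__main__":
    for (dev, port), info in read_port_status().items():
        print(f"{dev} port {port}: state={info['state']} rate={info['rate']}")
```

On a host without RDMA devices the function returns an empty dictionary, so it can run unchanged on every node of a mixed cluster.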
perftest
perftest is the standard RDMA performance testing suite. It provides a set of microbenchmarks that measure the raw throughput and latency of your RDMA fabric — essential for baseline performance validation, regression testing, and hardware comparison.
Key benchmarks:
- ib_send_bw / ib_send_lat — RDMA Send throughput and latency
- ib_write_bw / ib_write_lat — RDMA Write throughput and latency
- ib_read_bw / ib_read_lat — RDMA Read throughput and latency
- ib_atomic_bw / ib_atomic_lat — Atomic operation benchmarks
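Installation and basic testing: install the perftest package from your distribution (it depends on rdma-core), then run any benchmark with --help to confirm the binaries are on the PATH and to list the available options.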
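Benchmark workflow (client/server model): start a benchmark such as ib_write_bw with no host argument on the server, then run the same command with the same options on the client, adding the server’s hostname or IP address as the final argument. Results are printed when the run completes.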
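Performance tuning example (adjusting QP attributes): the benchmarks accept options such as -q for the number of queue pairs, -s for the message size, -m for the path MTU, and -t for the send queue depth. Change one parameter at a time and re-run to see its effect on throughput.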
perftest is the go-to tool when you need to validate that your RDMA fabric is performing at line rate, diagnose throughput anomalies, or compare different hardware configurations. Run it after every firmware upgrade, driver update, or fabric topology change.
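When the same benchmark has to be repeated after every change, it helps to generate the server and client command lines from one place so both sides always carry identical options. The sketch below only builds argument lists and does not run anything; the helper name perftest_cmd and the host and device names in the usage note are illustrative assumptions, while -d, -s, -q, and --report_gbits are standard perftest options.

```python
def perftest_cmd(tool, server=None, device=None, size=None, qps=None):
    """Build an argv list for a perftest benchmark.

    With server=None the result is the listening (server) side; passing a
    hostname or IP address produces the matching client side.
    """
    cmd = [tool]
    if tool.endswith("_bw"):
        cmd.append("--report_gbits")  # bandwidth tests: report in Gb/s
    if device is not None:
        cmd += ["-d", device]         # RDMA device to use
    if size is not None:
        cmd += ["-s", str(size)]      # message size in bytes
    if qps is not None:
        cmd += ["-q", str(qps)]       # number of queue pairs
    if server is not None:
        cmd.append(server)            # client side: server address goes last
    return cmd


if __name__ == "__main__":
    print(" ".join(perftest_cmd("ib_write_bw", device="mlx5_0", qps=4)))
    print(" ".join(perftest_cmd("ib_write_bw", server="node2", device="mlx5_0", qps=4)))
```

The two printed lines differ only in the trailing server address, which is the property to preserve when scripting regression runs.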
libfabric (OpenFabrics Interfaces)
libfabric (also known as OFI) provides a provider-agnostic API for fabric communication services. Unlike rdma-core’s Verbs API, which requires an RDMA device (a hardware adapter or a software-emulated one such as SoftRoCE), libfabric abstracts the transport layer: applications written against libfabric can run over InfiniBand, RoCE, TCP, UDP, shared memory, and other providers without code changes.
Key features:
- Provider abstraction: Applications use the same API regardless of the underlying transport
- Multiple endpoint types: Connection-oriented reliable messaging (FI_EP_MSG), reliable datagram (FI_EP_RDM), and unreliable datagram (FI_EP_DGRAM)
- Tag matching: Tagged message matching for MPI implementations, offloaded to hardware where the provider supports it
- RMA operations: Remote memory access with completion semantics
- Active development: Used by MPICH, DAOS, and numerous HPC middleware projects
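Installation and provider discovery: on Debian and Ubuntu the library and its utilities are packaged as libfabric-dev and libfabric-bin. Run fi_info to list the providers and endpoints available on a node, or fi_info -p followed by a provider name (for example verbs or tcp) to restrict the output to one provider.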
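Running libfabric performance tests: the fi_pingpong utility that ships with libfabric runs as a server when started without an address and as a client when given the server’s address; the -p option selects the provider. The fabtests suite from the libfabric repository adds further bandwidth and latency benchmarks.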
libfabric is ideal for developers building portable HPC applications and for organizations that need to support heterogeneous fabric environments. It’s the foundation of MPICH’s OFI networking layer and is increasingly used in distributed storage systems like DAOS.
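One practical consequence of the provider model is that the transport can be chosen at launch time. libfabric reads the FI_PROVIDER environment variable to restrict which providers it considers, so the same binary can be pointed at tcp on a laptop and verbs on a cluster. The sketch below assumes fi_info may or may not be installed; the helper names are illustrative.

```python
import os
import shutil
import subprocess


def env_for_provider(provider, base_env=None):
    """Return a copy of the environment with FI_PROVIDER set to `provider`."""
    env = dict(os.environ if base_env is None else base_env)
    env["FI_PROVIDER"] = provider
    return env


def fi_info_for(provider):
    """Return the output of `fi_info -p <provider>`, or None if fi_info is not installed."""
    exe = shutil.which("fi_info")
    if exe is None:
        return None
    result = subprocess.run([exe, "-p", provider], capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    print(env_for_provider("tcp", {"PATH": "/usr/bin"}))
    print(fi_info_for("tcp"))
```

Passing env_for_provider("verbs") as the env argument of subprocess.run when launching an application is one way to pin the provider per job without touching the application code.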
RDMA Network Topology Comparison
Understanding your fabric topology is critical for performance tuning. Here’s how each tool helps:
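rdma-core (fabric discovery): on InfiniBand fabrics, ibnetdiscover walks the subnet and prints the switches, channel adapters, and the links between them, while iblinkinfo reports the state, width, and speed of each link.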
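perftest (latency by hop count): perftest measures end-to-end latency between two hosts, so run ib_send_lat or ib_write_lat between node pairs separated by different numbers of switch hops and compare the results.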
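libfabric (provider-level view): fi_info lists the fabrics, domains, and endpoint types that each provider exposes on a node; adding -v prints the full attribute set for each entry.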
Why Self-Host Your RDMA Infrastructure
Building and managing your own RDMA infrastructure gives you complete control over network performance characteristics that cloud providers abstract away. In RDMA environments, every microsecond of latency and every gigabit of bandwidth matters — cloud abstractions that add even 5-10 microseconds of overhead negate RDMA’s primary advantage. Running your own fabric lets you optimize from the application down to the physical link layer.
The open-source RDMA tooling ecosystem is mature and production-hardened. The Linux kernel’s RDMA subsystem has been stable for over a decade, and tools like rdma-core and perftest are used in production at national laboratories, financial institutions, and hyperscale data centers. The cost of entry is surprisingly accessible — used InfiniBand adapters and switches are available on the secondary market, and RoCE v2 lets you run RDMA over standard Ethernet switches with appropriate NICs.
For organizations running HPC workloads, our HPC MPI implementations guide covering OpenMPI, MPICH, and MVAPICH builds on the RDMA foundation — MPI libraries use Verbs and libfabric as their transport layer. For measuring end-to-end application performance, our network performance measurement guide with perfSONAR, iPerf, and SmokePing provides the higher-layer monitoring that complements RDMA-level benchmarks. For containerized HPC environments, our HPC container runtimes comparison shows how to preserve RDMA performance within containers using Apptainer, Charliecloud, and Podman.
FAQ
Do I need InfiniBand hardware to use RDMA?
No. While InfiniBand is the traditional RDMA fabric, RoCE v2 (RDMA over Converged Ethernet) allows RDMA over standard Ethernet networks with compatible NICs. Many modern NICs from NVIDIA/Mellanox (ConnectX series), Intel, and Broadcom support RoCE. You can also use SoftRoCE (RXE), a software RDMA implementation in the Linux kernel whose userspace provider ships with rdma-core, for development and testing without RDMA-capable hardware at all.
What’s the difference between the Verbs API and libfabric?
The Verbs API (provided by rdma-core’s libibverbs) is the low-level, hardware-specific interface that directly maps to RDMA adapter capabilities. libfabric provides a higher-level, provider-agnostic API that can run over Verbs, TCP, shared memory, and other transports. Use Verbs when you need maximum performance on known hardware; use libfabric when you need portability across transport types or are building on existing libfabric-based middleware like MPICH.
How do I troubleshoot RDMA performance issues?
A systematic RDMA troubleshooting workflow:
- Check link health with ibstat (rdma-core) and ibdiagnet (ibutils)
- Measure raw throughput with ib_write_bw (perftest) to baseline hardware performance
- Check for congestion with perfquery counters (rdma-core)
- Test with libfabric to identify provider-level issues
- Verify PCIe bandwidth — an x8 Gen3 slot can bottleneck a 100Gbps link
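The PCIe limit in the last step can be checked with a little arithmetic. The sketch below uses the published per-lane signalling rates and line encodings for PCIe generations 1 to 5 and ignores packet (TLP) overhead, so real usable bandwidth is somewhat lower; the function names are illustrative.

```python
# (transfer rate per lane in GT/s, line-encoding efficiency) per PCIe generation
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),      # 8b/10b encoding
    2: (5.0, 8 / 10),      # 8b/10b encoding
    3: (8.0, 128 / 130),   # 128b/130b encoding
    4: (16.0, 128 / 130),
    5: (32.0, 128 / 130),
}


def pcie_gbps(gen, lanes):
    """Raw one-direction PCIe bandwidth in Gb/s after line encoding."""
    rate, efficiency = PCIE_GENERATIONS[gen]
    return rate * lanes * efficiency


def slot_bottlenecks_link(gen, lanes, link_gbps):
    """True if the PCIe slot cannot carry the network link's full rate."""
    return pcie_gbps(gen, lanes) < link_gbps


if __name__ == "__main__":
    for gen, lanes in [(3, 8), (3, 16), (4, 8)]:
        bw = pcie_gbps(gen, lanes)
        verdict = "bottleneck" if slot_bottlenecks_link(gen, lanes, 100) else "ok"
        print(f"Gen{gen} x{lanes}: {bw:.1f} Gb/s -> {verdict} for a 100 Gb/s link")
```

An x8 Gen3 slot works out to roughly 63 Gb/s, well short of a 100 Gb/s link, while x16 Gen3 or x8 Gen4 (about 126 Gb/s) leaves headroom.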
Can I run RDMA inside Docker containers?
Yes. Use the --device=/dev/infiniband/ flag or the RDMA cgroup device plugin. For Kubernetes, the RDMA device plugin (k8s-rdma-sriov-dev-plugin) enables RDMA in pods. Containers share the host’s RDMA kernel stack, so performance overhead is minimal (sub-microsecond), but you need to ensure the container has access to the appropriate device files and the IPC_LOCK capability for memory registration.
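As a rough illustration of the device and capability requirements, the sketch below assembles a docker run command that passes through every node found under /dev/infiniband and grants IPC_LOCK with an unlimited memlock ulimit. It only builds the argument list; the image name my-rdma-image and the helper name are hypothetical, and the device nodes present depend on the host’s adapters and drivers.

```python
from pathlib import Path


def docker_rdma_args(image, command, dev_dir="/dev/infiniband"):
    """Build a `docker run` argv that exposes the host's RDMA device nodes."""
    args = [
        "docker", "run", "--rm",
        "--cap-add=IPC_LOCK",        # needed to pin (register) memory
        "--ulimit", "memlock=-1",    # lift the locked-memory limit
    ]
    devices = Path(dev_dir)
    if devices.is_dir():
        for node in sorted(devices.iterdir()):
            args.append(f"--device={node}")  # e.g. uverbs0, rdma_cm
    args.append(image)
    args.extend(command)
    return args


if __name__ == "__main__":
    # "my-rdma-image" is a placeholder for an image with rdma-core installed
    print(" ".join(docker_rdma_args("my-rdma-image", ["ibv_devinfo"])))
```

Running ibv_devinfo inside the container is a quick way to confirm that the devices are visible before starting a real workload.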
How does RDMA compare to DPDK for low-latency networking?
RDMA and DPDK serve different purposes. RDMA provides CPU-bypass for data movement — the NIC writes directly to application memory without kernel involvement, ideal for storage, databases, and HPC messaging. DPDK provides kernel-bypass for packet processing — the application polls the NIC directly, ideal for network functions, load balancers, and packet processing. They’re complementary: many financial trading systems use DPDK for feed handlers and RDMA for inter-server state replication.
What monitoring should I set up for an RDMA fabric?
At minimum, monitor:
- Link status and speed via rdma link and ibstat
- Port counters (errors, discards, congestion) via perfquery
- Temperature for InfiniBand switches and adapters
- Bandwidth utilization per port
- QP errors and retries for application-level debugging
Export these metrics to Prometheus using the infiniband_exporter and visualize in Grafana for a comprehensive RDMA monitoring dashboard.
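For a quick check before a full exporter is in place, the per-port counters can be read straight from sysfs and compared between two snapshots. This is a minimal sketch, assuming the counters directory that the Linux InfiniBand stack exposes per port; the device name mlx5_0 and both function names are illustrative, and the set of counters available varies by driver.

```python
import time
from pathlib import Path


def read_counters(counters_dir):
    """Read every integer counter file in a sysfs counters directory."""
    counters = {}
    directory = Path(counters_dir)
    if not directory.is_dir():
        return counters
    for counter_file in sorted(directory.iterdir()):
        try:
            counters[counter_file.name] = int(counter_file.read_text().strip())
        except (OSError, ValueError):
            continue  # skip unreadable or non-numeric entries
    return counters


def counter_increases(before, after):
    """Return the counters that grew between two snapshots, with their deltas."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] > before[name]
    }


if __name__ == "__main__":
    # Example path; substitute the device and port present on your host
    path = "/sys/class/infiniband/mlx5_0/ports/1/counters"
    first = read_counters(path)
    time.sleep(1)
    print(counter_increases(first, read_counters(path)))
```

Data counters will normally increase on a busy port; growth in error counters such as symbol_error or port_rcv_errors is what deserves an alert.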