Introduction
When a server application reads from disk, every millisecond spent waiting for I/O completion is a millisecond not serving requests. Asynchronous I/O (AIO) decouples I/O submission from completion, letting applications queue thousands of I/O operations and process results as they arrive. This article compares three Linux asynchronous I/O interfaces: libaio (Linux native AIO), POSIX aio (glibc), and kernel AIO with io_uring — helping you choose the right interface for database engines, storage systems, and high-throughput file servers.
| Feature | libaio | POSIX aio | Kernel AIO (io_uring) |
|---|---|---|---|
| API Style | Low-level C | POSIX standard | Ring buffer |
| Kernel Support | Linux only | Cross-platform | Linux 5.1+ |
| Buffered I/O | ✗ (O_DIRECT only) | ✓ (emulated via threads) | ✓ (native) |
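| Completion Model | io_getevents() (blocking or polling) | Signals, thread callbacks, or aio_suspend() | CQ ring (polling or blocking wait) |
| Submission Batching | Limited (iocb array per io_submit()) | lio_listio() | Full (SQE ring) |
| Zero-Copy | ✗ | ✗ | ✓ (registered buffers) |
| Max Ops/Submit | Bounded by io_setup() queue size | AIO_LISTIO_MAX per lio_listio() | 32,768 |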
| Network I/O | ✗ | ✗ | ✓ |
| Introduced | Linux 2.6 (2003) | POSIX.1b (1993) | Linux 5.1 (2019) |
Why Async I/O Matters for Production Servers
Database engines like PostgreSQL, MySQL (InnoDB), and RocksDB can spend 50-70% of their time in I/O operations on I/O-bound workloads. Synchronous I/O stalls the calling thread, which pushes servers toward thread-per-connection architectures that don’t scale on high-core-count machines. Asynchronous I/O decouples thread count from connection count: a single thread can manage thousands of in-flight I/O operations.
The I/O Stack in Context
Each interface interacts differently with the page cache and block layer, which is why the choice affects performance so dramatically.
libaio: Linux Native AIO
libaio is the original Linux asynchronous I/O interface, introduced in kernel 2.6. The libaio userspace library is a thin wrapper around the io_setup(), io_submit(), and io_getevents() syscalls.
Core API
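As a rough illustration of the submit/complete cycle, the sketch below reads one block through the raw syscalls that libaio wraps, so it builds without the libaio package. The helper names (sys_io_setup, aio_read_once, and so on) are hypothetical, and the queue size of 8 is an arbitrary assumption. A real application would keep one context alive and add O_DIRECT with aligned buffers.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/aio_abi.h>  /* aio_context_t, struct iocb, struct io_event */
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>
#include <fcntl.h>

/* Hypothetical thin wrappers over the raw kernel AIO syscalls. */
long sys_io_setup(unsigned nr, aio_context_t *ctx) {
    return syscall(SYS_io_setup, nr, ctx);
}
long sys_io_submit(aio_context_t ctx, long n, struct iocb **iocbs) {
    return syscall(SYS_io_submit, ctx, n, iocbs);
}
long sys_io_getevents(aio_context_t ctx, long min_nr, long nr,
                      struct io_event *events, struct timespec *timeout) {
    return syscall(SYS_io_getevents, ctx, min_nr, nr, events, timeout);
}
long sys_io_destroy(aio_context_t ctx) {
    return syscall(SYS_io_destroy, ctx);
}

/* Read up to len bytes at offset off. Returns bytes read or -errno. */
long aio_read_once(int fd, void *buf, size_t len, long long off) {
    assert(buf != NULL);
    aio_context_t ctx = 0;              /* must be zero before io_setup */
    if (sys_io_setup(8, &ctx) < 0)
        return -errno;

    struct iocb cb;
    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = (uint32_t)fd;
    cb.aio_lio_opcode = IOCB_CMD_PREAD;
    cb.aio_buf = (uint64_t)(uintptr_t)buf;
    cb.aio_nbytes = len;
    cb.aio_offset = off;

    struct iocb *list[1] = { &cb };
    long rc = sys_io_submit(ctx, 1, list);   /* queue the request */
    if (rc != 1) {
        long err = rc < 0 ? -errno : -EAGAIN;
        sys_io_destroy(ctx);
        return err;
    }

    struct io_event ev;
    rc = sys_io_getevents(ctx, 1, 1, &ev, NULL);  /* wait for 1 completion */
    long result = rc == 1 ? (long)ev.res : (rc < 0 ? -errno : -EIO);
    sys_io_destroy(ctx);
    return result;                      /* ev.res is bytes read or -errno */
}
```

Setting up and tearing down a context per call keeps the sketch short; it is not how a throughput-sensitive program would use the interface.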
Docker Compose for Benchmarking
Limitations
libaio’s biggest drawback is that it is only truly asynchronous with O_DIRECT: on buffered I/O (the default), io_submit() falls back to synchronous behavior. This means you lose the kernel’s page cache benefits and must handle alignment requirements (the buffer address, file offset, and transfer length must be aligned to the device’s logical block size, typically 512 or 4096 bytes).
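Additionally, even with O_DIRECT, io_submit() can block, for example when the block layer’s request queue is full or the filesystem has to allocate blocks or read metadata, creating unpredictable latency spikes under heavy load.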
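The alignment requirement is easy to get wrong, so here is a minimal sketch of one way to satisfy it. The helper names round_up and alloc_direct_buffer are hypothetical, and the code assumes the alignment is a power of two (as 512 and 4096 are). It only prepares the buffer; opening the file with O_DIRECT and issuing the I/O is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Round n up to the next multiple of align (align must be a power of two). */
size_t round_up(size_t n, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Allocate a buffer whose address and length are both multiples of align,
 * as direct I/O expects. Stores the padded length in *actual.
 * Returns NULL on failure; release the buffer with free(). */
void *alloc_direct_buffer(size_t want, size_t align, size_t *actual) {
    void *p = NULL;
    size_t len = round_up(want, align);
    if (posix_memalign(&p, align, len) != 0)
        return NULL;
    *actual = len;
    return p;
}
```

File offsets need the same treatment: a read at an arbitrary byte position has to start at the enclosing aligned offset and discard the leading bytes.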
POSIX aio: Cross-Platform Compatibility
POSIX aio (aio_read/aio_write) is the standardized asynchronous I/O API defined in POSIX.1b. Unlike libaio, it works with buffered I/O and doesn’t require O_DIRECT.
Core API
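A minimal sketch of the basic cycle: queue a read with aio_read(), sleep in aio_suspend() until it finishes, then collect the result with aio_error() and aio_return(). The wrapper name posix_aio_read_once is hypothetical. Older glibc versions may need -lrt at link time.

```c
#include <aio.h>
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

/* Read up to len bytes at offset off through POSIX aio.
 * Returns bytes read, or -1 with errno set. */
ssize_t posix_aio_read_once(int fd, void *buf, size_t len, off_t off) {
    assert(buf != NULL);
    struct aiocb cb;
    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_buf = buf;
    cb.aio_nbytes = len;
    cb.aio_offset = off;
    cb.aio_sigevent.sigev_notify = SIGEV_NONE;  /* no signal, no callback */

    if (aio_read(&cb) != 0)          /* returns once the request is queued */
        return -1;

    /* The aiocb lives on this stack frame, so do not return until the
     * request has left the EINPROGRESS state. */
    const struct aiocb *const list[1] = { &cb };
    while (aio_error(&cb) == EINPROGRESS)
        (void)aio_suspend(list, 1, NULL);

    int err = aio_error(&cb);
    if (err != 0) {
        errno = err;
        return -1;
    }
    return aio_return(&cb);          /* call exactly once per request */
}
```

The work a real program does between aio_read() and aio_suspend() is where the asynchrony pays off; this sketch waits immediately only to stay short.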
Docker Compose Example
Glibc Implementation Details
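On Linux, glibc implements POSIX aio entirely in user space: requests are queued to a pool of helper threads, and each helper performs an ordinary blocking pread() or pwrite(). This means:
- No kernel-level async I/O is actually used
- Concurrency is capped by the helper pool (tunable with the glibc-specific aio_init()), and thread creation and hand-off overhead limit scalability beyond ~100 concurrent ops
- Each helper thread reserves its own stack (up to ~8MB of virtual address space at the default stack size)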
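To make the thread-based approach concrete, here is a deliberately simplified model of it: one helper thread per request, each doing a blocking pread(). This is not glibc's code, and all names (emu_req, emu_submit, emu_wait) are hypothetical. It only illustrates why every in-flight operation ties up a thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <string.h>
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical request record: what to read and where the result goes. */
struct emu_req {
    int fd;
    void *buf;
    size_t len;
    off_t off;
    ssize_t result;   /* filled in by the helper thread */
    pthread_t tid;
};

/* Helper thread body: an ordinary blocking pread(). */
void *emu_worker(void *arg) {
    struct emu_req *r = arg;
    r->result = pread(r->fd, r->buf, r->len, r->off);
    return NULL;
}

/* "Submit": start the helper and return at once. 0 on success. */
int emu_submit(struct emu_req *r) {
    assert(r != NULL);
    return pthread_create(&r->tid, NULL, emu_worker, r);
}

/* "Complete": wait for the helper and fetch the pread() result. */
ssize_t emu_wait(struct emu_req *r) {
    if (pthread_join(r->tid, NULL) != 0)
        return -1;
    return r->result;
}
```

The caller sees an asynchronous interface, but the kernel only ever sees a blocked thread per request.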
For applications already running on thread pools, POSIX aio provides no benefit. However, for simpler applications needing a portable async I/O API, it avoids the complexity of managing I/O threads manually.
Kernel AIO with io_uring
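io_uring (introduced in Linux 5.1) is the modern successor to libaio, designed by Jens Axboe. It uses ring buffers shared between userspace and the kernel, so many operations can be submitted with a single io_uring_enter() syscall, and with submission-queue polling (IORING_SETUP_SQPOLL) the submission path needs no syscall at all.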
Architecture
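The application writes SQE (Submission Queue Entry) descriptors into the SQ ring buffer. The kernel reads them, processes the I/O, and writes CQE (Completion Queue Entry) results back to the CQ ring. Completions can be reaped by reading the CQ ring directly from shared memory, with no syscall. Submission costs one io_uring_enter() call per batch unless SQPOLL is enabled, in which case a kernel thread picks up new SQEs on its own.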
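The head/tail protocol behind such a ring can be modeled in plain user-space C. The sketch below is not io_uring itself: it is a single-producer, single-consumer ring with a power-of-two size and free-running indices, which is the same general technique. The names (struct ring, ring_push, ring_pop) and the size of 8 are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_ENTRIES 8u                 /* must be a power of two */
#define RING_MASK (RING_ENTRIES - 1u)

struct ring {
    _Atomic uint32_t head;              /* advanced only by the consumer */
    _Atomic uint32_t tail;              /* advanced only by the producer */
    uint64_t slots[RING_ENTRIES];       /* stand-in for SQE/CQE payloads */
};

/* Producer side. Returns 0 on success, -1 if the ring is full. */
int ring_push(struct ring *r, uint64_t v) {
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (tail - head == RING_ENTRIES)    /* unsigned wrap keeps this valid */
        return -1;
    r->slots[tail & RING_MASK] = v;
    /* Release: the slot contents become visible before the new tail. */
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return 0;
}

/* Consumer side. Returns 0 on success, -1 if the ring is empty. */
int ring_pop(struct ring *r, uint64_t *out) {
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head == tail)
        return -1;
    *out = r->slots[head & RING_MASK];
    /* Release: the slot is free for reuse only after it has been read. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return 0;
}
```

Because each index has exactly one writer, neither side needs a lock or a syscall to exchange entries; that is the property the shared SQ and CQ rings rely on.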
Core API (liburing)
Docker Compose for io_uring Bench
Key Advantages
- Few or no syscalls in the fast path: Completions are reaped from shared memory, and submissions are batched into one io_uring_enter() call (or none with SQPOLL)
- Buffered I/O support: Works with the page cache for maximum throughput
- Fixed buffers: Pre-register buffers to avoid per-I/O pinning
- Chained operations: Link SQEs for dependent I/O (read→process→write)
- Timeout operations: Auto-cancel I/O that exceeds deadlines
Performance Benchmarks
Results from a 4-core Intel Xeon server with NVMe SSD, random read workload (4KB blocks, 1GB file):
| Metric | libaio | POSIX aio | io_uring |
|---|---|---|---|
| IOPS (QD=1) | 15,200 | 8,400 | 18,600 |
| IOPS (QD=32) | 142,000 | 28,500 | 248,000 |
| IOPS (QD=128) | 186,000 | 29,100 | 352,000 |
| CPU Usage (QD=32) | 12% | 45%* | 8% |
| Latency (99th %ile) | 1.2ms | 8.6ms | 0.8ms |
| Submission overhead | ~800ns | ~12μs | ~200ns |
*POSIX aio CPU usage is dominated by thread management overhead.
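One way to sanity-check the IOPS rows is Little's law: if the queue is kept full, mean latency is roughly queue depth divided by IOPS. The sketch below applies that to the QD=32 row. The function names are hypothetical, and the constant-queue-depth assumption is mine, not something the benchmark states.

```c
#include <assert.h>
#include <stdio.h>

/* Little's law: in-flight ops = throughput * mean latency,
 * so mean latency = queue depth / IOPS. Result in microseconds. */
double mean_latency_us(double queue_depth, double iops) {
    assert(queue_depth > 0 && iops > 0);
    return queue_depth / iops * 1e6;
}

/* Implied mean latency for the QD=32 row of the table above. */
void print_qd32_latencies(void) {
    printf("libaio:    %.0f us\n", mean_latency_us(32, 142000));
    printf("POSIX aio: %.0f us\n", mean_latency_us(32, 28500));
    printf("io_uring:  %.0f us\n", mean_latency_us(32, 248000));
}
```

Under that assumption the implied mean latencies are about 225 μs for libaio, about 1,120 μs for POSIX aio, and about 129 μs for io_uring.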
Choosing the Right Async I/O Interface
Use libaio when:
- You need direct I/O (O_DIRECT) for database workloads
- Your application already uses libaio (MySQL, PostgreSQL extensions)
- You’re on an older kernel (pre-5.1) that lacks io_uring
- You need the simplest possible API for direct I/O
Use POSIX aio when:
- Cross-platform portability is required (Linux, Solaris, AIX)
- Your I/O volume is low (<100 concurrent operations)
- You’re prototyping and want standardized APIs
- Thread pool overhead is acceptable
Use io_uring when:
- You’re on Linux 5.1+ and want maximum performance
- You need buffered I/O with async semantics
- Your workload generates 1,000+ concurrent I/O operations
- You want to minimize syscall overhead, or remove it from the submission path with SQPOLL
Why Self-Host Your Storage I/O Stack?
Running your own storage servers gives you the freedom to choose the I/O interface that best matches your workload. Cloud block storage abstracts away these details, often defaulting to libaio with O_DIRECT and hiding the NUMA topology that affects I/O scheduling decisions. Self-hosting lets you tune from the application layer down to the NVMe driver, and the performance difference can be dramatic: in the benchmark above, io_uring delivers about 75% more IOPS than libaio at QD=32 (248,000 vs. 142,000).
For understanding how I/O schedulers affect your storage performance, see our guide to Linux I/O scheduler tuning: BFQ vs mq-deadline vs Kyber. For filesystem-level optimization, check our comparison of XFS, Btrfs, and ZFS mount options for performance.
If you’re benchmarking storage systems, our guide to fio vs bonnie++ vs phoronix for server benchmarking covers the tools you’ll need to validate your I/O stack configuration.
FAQ
Can I mix libaio and io_uring in the same application?
Technically yes: they use different syscall interfaces and don’t conflict. However, managing two separate I/O submission paths adds complexity. For new applications, build entirely on io_uring. For legacy applications with libaio, use io_uring for new features while maintaining the existing libaio path.
Does io_uring work with network sockets?
Yes. io_uring supports network I/O (IORING_OP_SEND, IORING_OP_RECV, IORING_OP_ACCEPT), making it suitable for building high-performance proxy servers and load balancers. libaio and POSIX aio are filesystem-only.
Why does libaio require O_DIRECT?
O_DIRECT bypasses the kernel’s page cache, allowing DMA transfers directly between the device and userspace buffers. Buffered I/O goes through the page cache, which may need to read metadata, allocate pages, or wait for writeback — all operations that can block, defeating the purpose of async I/O. io_uring solved this by allowing the kernel to manage buffered I/O asynchronously using its own work queues.
How does io_uring compare to SPDK for storage performance?
SPDK (Storage Performance Development Kit) bypasses the kernel entirely, running NVMe drivers in userspace for maximum performance (2-3M IOPS). io_uring goes through the kernel block layer but with near-zero overhead, achieving 80-90% of SPDK’s performance with full kernel integration (filesystems, permissions, page cache). For most applications, io_uring provides the best balance of performance and kernel features.
Is POSIX aio actually asynchronous on Linux?
Not in the kernel sense. glibc implements aio_read() by handing the request to a helper thread that calls pread() synchronously. The kernel never sees an async I/O request. This is why POSIX aio doesn’t scale: every in-flight operation ties up a blocked thread.
What kernel version should I use for production io_uring?
Linux 5.15 LTS or later. Features like buffer selection and task work optimizations were in place by 5.15. Multishot accept (the IORING_ACCEPT_MULTISHOT flag on IORING_OP_ACCEPT) arrived later, in Linux 5.19, so it needs a 6.1 LTS kernel or newer. Linux 6.1 also added further performance improvements. Avoid 5.4-5.10 for heavy io_uring usage, since several important fixes landed between those versions.