Introduction
Filesystem performance tuning is one of the most impactful yet overlooked aspects of Linux server administration. Moving from default mount options to optimized ones can mean a 2-3x throughput improvement for database workloads, a 50% latency reduction for web servers, or a backup job that completes in 2 hours instead of 8.
This guide compares mount option and dataset property tuning across the three dominant Linux filesystems: XFS (the default for RHEL/CentOS and high-performance databases), Btrfs (the default for openSUSE and Fedora with advanced features), and ZFS (the enterprise-grade filesystem with integrated volume management). We focus on practical tuning parameters that deliver measurable performance improvements.
Why Tune Your Filesystem?
Most Linux distributions ship with conservative filesystem defaults that prioritize data integrity over performance. For example, XFS defaults to attr2 and inode64 but uses relatime rather than noatime, so some reads still trigger access-time metadata writes. Btrfs also defaults to relatime but enables Copy-on-Write for everything, including VM images and database files, where it degrades performance. ZFS defaults to a 128KB recordsize, which is suboptimal for both small-file web servers and large-file media archives.
Server workloads exhibit predictable I/O patterns: databases perform random reads/writes, web servers serve many small static files, backup tools stream large sequential writes, and virtual machines combine all of these. Tuning mount options and dataset properties to match your workload is essential for production performance.
For ZFS snapshot management after tuning, see our ZFS snapshot replication guide. For disk health monitoring to catch drive failures before they impact performance, check our disk health monitoring guide. For Kubernetes storage infrastructure, our Kubernetes storage comparison covers container-native options.
Comparison Table
| Feature | XFS | Btrfs | ZFS |
|---|---|---|---|
| Default Journaling | Metadata only | Copy-on-Write | Copy-on-Write (ZIL) |
| Checksumming | Metadata only | Data + metadata (default; data checksums are off with nodatacow) | Data + metadata (always) |
| Compression | No | zlib, lzo, zstd | lz4, gzip, zstd |
| Deduplication | No (external tools) | Offline (bedup, duperemove) | Online (ZFS dedup) |
| Snapshots | No (dm-thin external) | Built-in (read-write) | Built-in (read-only, clone) |
| RAID Support | No (mdadm external) | Built-in (RAID 0,1,5,6,10) | Built-in (RAID-Z, mirrors) |
| Max Filesystem Size | 8 EiB | 16 EiB | 256 ZiB |
| Max File Size | 8 EiB | 16 EiB | 16 EiB |
| TRIM/Discard | Mount option: discard | Mount option: discard | Pool property: autotrim=on |
| RHEL Support | Default (full support) | Deprecated in RHEL 7.4, removed in RHEL 8 | Not supported (DKMS) |
XFS Mount Options
XFS is the default filesystem on RHEL/CentOS 7+ and excels at large file I/O and high concurrency. Its allocation groups enable parallel I/O across multiple CPUs.
Key Mount Options
| Option | Description | Impact |
|---|---|---|
| noatime | Disable access time updates | Reduces metadata writes; 10-20% improvement for read-heavy workloads |
| nobarrier | Disable write barriers | Improves write throughput but risks data loss on power failure; removed from XFS in kernel 4.19, so it applies only to older kernels |
| inode64 | Allow inodes above 2TB | Default on current kernels; enables better inode distribution on filesystems >2TB |
| allocsize=64k | Set end-of-file preallocation size for buffered writes | Larger values improve streaming write performance |
| largeio | Optimize for large I/O | Reports a larger preferred I/O size (st_blksize) to applications; beneficial for >256KB I/O patterns |
| swalloc | Stripe-width aligned allocation | Important for RAID arrays; align to stripe width |
| logbsize=256k | Log buffer size | Larger log buffers improve metadata-heavy workloads |
| attr2 | Extended attribute format | Default since kernel 2.6; enables in-inode attributes |
Recommended Configurations
Database server (e.g., PostgreSQL, MySQL):
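A minimal sketch of a database-oriented entry, built only from the options in the table above. The device /dev/sdb1, the mount point /var/lib/postgresql, and the helper name xfs_db_fstab are hypothetical placeholders. The function prints an /etc/fstab line for review instead of editing the file, and nobarrier is left out because of its data-loss risk.

```shell
# Print an /etc/fstab entry for an XFS database volume.
# Device and mount point are hypothetical; pass your own as arguments.
xfs_db_fstab() {
    dev=${1:-/dev/sdb1}
    mnt=${2:-/var/lib/postgresql}
    # noatime: skip access-time updates; logbsize=256k: larger log buffers.
    printf '%s %s xfs noatime,inode64,logbsize=256k 0 0\n' "$dev" "$mnt"
}

xfs_db_fstab
```

Review the printed line, append it to /etc/fstab by hand, and mount the volume to confirm the options are accepted by your kernel.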
Web server (many small files):
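As a rough illustration, the mount command for a static-content volume might look like the following. /dev/sdc1, /var/www, and the helper name xfs_web_mount_cmd are assumptions for the example; the command is printed rather than executed.

```shell
# Build the mount command for an XFS volume holding static web content.
# Device and mount point are hypothetical; nothing is mounted here.
xfs_web_mount_cmd() {
    dev=${1:-/dev/sdc1}
    mnt=${2:-/var/www}
    echo "mount -t xfs -o noatime,nodiratime,inode64 $dev $mnt"
}

xfs_web_mount_cmd
```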
The nodiratime option disables directory access time updates, which matters for web servers that touch thousands of files per request. Because noatime already implies nodiratime, listing both is redundant but harmless.
Backup/Archive storage:
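One possible entry for a backup volume, sketched under the assumption of a hypothetical device /dev/sdd1 mounted at /backup (the helper name xfs_backup_fstab is also made up for this example). It prints the /etc/fstab line so it can be checked first.

```shell
# Print an /etc/fstab entry for an XFS backup/archive volume.
# Device and mount point are hypothetical placeholders.
xfs_backup_fstab() {
    dev=${1:-/dev/sdd1}
    mnt=${2:-/backup}
    # allocsize=1m: larger preallocation for streaming writes;
    # largeio: advertise a larger preferred I/O size to applications.
    printf '%s %s xfs noatime,inode64,allocsize=1m,largeio 0 0\n' "$dev" "$mnt"
}

xfs_backup_fstab
```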
allocsize=1m improves streaming write throughput for large backup archives.
Before/After Benchmark
Using fio to test random read performance:
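A hedged sketch of such a test: the helper below (fio_randread_cmd, a name invented here) prints a 4k random-read fio invocation for a target directory. The directory /mnt/test and the job sizes are illustrative assumptions, not measured recommendations, and no results are implied.

```shell
# Print a fio command for a 4k random-read test against a directory.
# /mnt/test is a hypothetical mount point; sizes and depths are illustrative.
fio_randread_cmd() {
    dir=${1:-/mnt/test}
    echo "fio --name=randread --directory=$dir --rw=randread --bs=4k" \
         "--size=1G --numjobs=4 --iodepth=32 --ioengine=libaio --direct=1" \
         "--runtime=60 --time_based --group_reporting"
}

fio_randread_cmd
```

Run the printed command once with the default mount options and once after remounting with the tuned options, then compare the reported IOPS and latency.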
Btrfs Mount Options
Btrfs offers Copy-on-Write with snapshots, compression, and integrated RAID — but its performance profile differs significantly from XFS.
Key Mount Options
| Option | Description | Impact |
|---|---|---|
| noatime | Disable access time updates | Same benefit as XFS; reduces CoW overhead |
| compress=zstd | Enable transparent compression | Reduces disk I/O; zstd provides best speed/size ratio |
| autodefrag | Auto-defragment files | Helps with fragmented VM images; can increase write amplification |
| space_cache=v2 | Free space cache format | v2 is faster and more scalable than v1 (default for new filesystems since btrfs-progs 5.15) |
| ssd | SSD optimization mode | Enables SSD-specific I/O patterns (detected automatically) |
| nodatacow | Disable CoW for newly created files, filesystem-wide | Helps VM images, databases, and torrent downloads; also disables checksums and compression for those files |
| discard=async | Async TRIM for SSDs | Better performance than sync discard |
| commit=120 | Transaction commit interval in seconds (default 30) | Longer intervals reduce metadata writes but increase the data at risk after a crash |
Recommended Configurations
General purpose with compression:
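A minimal sketch combining the options from the table above. The device /dev/sde1, the mount point /data, and the helper name btrfs_general_fstab are hypothetical; the function only prints the /etc/fstab line.

```shell
# Print an /etc/fstab entry for a general-purpose Btrfs volume with compression.
# Device and mount point are hypothetical placeholders.
btrfs_general_fstab() {
    dev=${1:-/dev/sde1}
    mnt=${2:-/data}
    printf '%s %s btrfs noatime,compress=zstd,space_cache=v2,discard=async 0 0\n' \
        "$dev" "$mnt"
}

btrfs_general_fstab
```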
Database/VM storage (disable CoW):
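A sketch for a dedicated database or VM filesystem, assuming a hypothetical device /dev/sdf1 mounted at /var/lib/libvirt (btrfs_nocow_fstab is an illustrative name). Compression is omitted on purpose, since nodatacow disables both compression and data checksums for new files.

```shell
# Print an /etc/fstab entry for a Btrfs volume with CoW disabled.
# Device and mount point are hypothetical placeholders.
btrfs_nocow_fstab() {
    dev=${1:-/dev/sdf1}
    mnt=${2:-/var/lib/libvirt}
    # nodatacow affects newly created files across the whole filesystem.
    printf '%s %s btrfs noatime,nodatacow,space_cache=v2 0 0\n' "$dev" "$mnt"
}

btrfs_nocow_fstab
```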
Performance comparison with and without compression:
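The original figures are not reproduced here. One way to run such a comparison yourself, sketched under assumptions: two hypothetical mount points, /mnt/btrfs-plain (no compression) and /mnt/btrfs-zstd (compress=zstd), each receive the same sequential write of roughly 50% compressible data. The helper name compress_compare_cmds is invented, and the commands are printed, not executed.

```shell
# Print matching fio sequential-write commands for two Btrfs mounts,
# one without compression and one with compress=zstd (both hypothetical).
compress_compare_cmds() {
    for dir in "${1:-/mnt/btrfs-plain}" "${2:-/mnt/btrfs-zstd}"; do
        echo "fio --name=seqwrite --directory=$dir --rw=write --bs=1M --size=2G" \
             "--buffer_compress_percentage=50 --refill_buffers --end_fsync=1"
    done
}

compress_compare_cmds
```

Comparing the bandwidth reported for each run gives a first-order view of what compression costs or gains on your hardware.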
Disabling CoW for Specific Directories
CoW is the biggest Btrfs performance bottleneck for random-write workloads:
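A short sketch of the per-directory approach. The path /var/lib/libvirt/images and the helper name nocow_dir_cmds are assumptions; the function prints the commands in the order they should be run on a Btrfs filesystem.

```shell
# Print the commands that create a directory and mark it No_COW.
# The default path is a hypothetical example.
nocow_dir_cmds() {
    dir=${1:-/var/lib/libvirt/images}
    echo "mkdir -p $dir"
    echo "chattr +C $dir"
    # lsattr -d should then list the 'C' attribute for the directory.
    echo "lsattr -d $dir"
}

nocow_dir_cmds
```

Files created inside the directory afterwards inherit the attribute.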
Important: chattr +C only works on empty directories or new empty files. Existing data retains CoW. Create the directory, set +C, then populate.
ZFS Dataset Properties
ZFS treats each dataset as an independently tunable filesystem with inheritable properties.
Key Dataset Properties
| Property | Default | Recommended | Impact |
|---|---|---|---|
| recordsize | 128K | Match workload I/O size | 16K for PostgreSQL, 1M for media archives |
| compression | off | lz4 or zstd | lz4 is fast enough to improve throughput on compressible data |
| atime | on | off | Reduces metadata writes on reads |
| sync | standard | disabled (for non-critical data) | Eliminates ZIL writes; use with caution |
| primarycache | all | metadata (for large-file workloads) | Keeps metadata in ARC; data from disk |
| secondarycache | all | metadata (for SSD L2ARC) | Prevents L2ARC pollution with streaming data |
| logbias | latency | throughput (for large writes) | Uses indirect sync for better throughput |
| redundant_metadata | all | most | Reduces metadata duplication |
Recommended Configurations
PostgreSQL database:
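A minimal sketch using the values from the table above. The pool and dataset name tank/pgdata and the helper name zfs_pg_cmd are hypothetical; the command is printed for review.

```shell
# Print a zfs create command for a PostgreSQL dataset.
# tank/pgdata is a hypothetical dataset name.
zfs_pg_cmd() {
    ds=${1:-tank/pgdata}
    echo "zfs create -o recordsize=16K -o compression=lz4 -o atime=off $ds"
}

zfs_pg_cmd
```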
Media/archive storage:
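For an existing dataset, the same idea can be expressed as a series of zfs set calls. The sketch below assumes a hypothetical dataset tank/media and an illustrative helper name zfs_media_cmds; it prints one command per property.

```shell
# Print zfs set commands that tune an existing dataset for large media files.
# tank/media is a hypothetical dataset name.
zfs_media_cmds() {
    ds=${1:-tank/media}
    for prop in recordsize=1M compression=lz4 atime=off; do
        echo "zfs set $prop $ds"
    done
}

zfs_media_cmds
```

As noted elsewhere in this guide, property changes affect only data written afterwards.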
VM image storage:
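One possible layout, assuming the VM disk is backed by a zvol rather than a file on a dataset. The name tank/vm-disk0, the 50G size, the 16K volblocksize, and the helper name zfs_vm_zvol_cmd are all illustrative assumptions; match volblocksize to the guest's typical I/O size.

```shell
# Print a zfs create command for a zvol-backed VM disk.
# Name, size, and block size are hypothetical examples.
zfs_vm_zvol_cmd() {
    vol=${1:-tank/vm-disk0}
    size=${2:-50G}
    echo "zfs create -V $size -o volblocksize=16K -o compression=lz4 $vol"
}

zfs_vm_zvol_cmd
```

volblocksize can only be set when the zvol is created, so it is worth deciding before any data is written.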
Disabling sync for build/test directories:
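A short sketch, assuming a hypothetical scratch dataset tank/build whose contents can be regenerated (zfs_nosync_cmds is an invented helper name). It prints the command that disables sync writes and a second command to confirm the setting.

```shell
# Print commands to disable synchronous writes on a scratch dataset
# and to verify the resulting value. tank/build is hypothetical.
zfs_nosync_cmds() {
    ds=${1:-tank/build}
    echo "zfs set sync=disabled $ds"
    echo "zfs get -H -o value sync $ds"
}

zfs_nosync_cmds
```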
Measuring ARC Performance
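A minimal sketch of one way to estimate the ARC hit ratio, assuming OpenZFS on Linux, where counters are exposed in /proc/spl/kstat/zfs/arcstats as name, type, and value columns. The helper name arc_hit_ratio is made up for this example; it accepts an alternative file so the parsing can be checked on a machine without ZFS.

```shell
# Compute the ARC hit ratio (percent) from a kstat-style arcstats file.
# Defaults to the OpenZFS-on-Linux path; pass another file to test parsing.
arc_hit_ratio() {
    stats=${1:-/proc/spl/kstat/zfs/arcstats}
    awk '$1 == "hits"   { h = $3 }
         $1 == "misses" { m = $3 }
         END {
             if (h + m > 0) printf "%.1f\n", 100 * h / (h + m)
             else print "n/a"
         }' "$stats"
}

# Only query the live counters when the kstat file is present.
if [ -r /proc/spl/kstat/zfs/arcstats ]; then
    arc_hit_ratio
fi
```

The counters are cumulative since module load, so sample the ratio before and after a representative workload rather than reading it once.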
Cross-Filesystem Benchmarking with fio
Here’s a comprehensive test suite to compare filesystem performance:
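A hedged sketch of such a suite: the helper below (fio_suite_cmds, a name invented here) prints four fio invocations covering random and sequential reads and writes. The target directory /mnt/test, block sizes, and job counts are illustrative assumptions to adapt to your workload.

```shell
# Print fio commands for four basic I/O patterns against one directory.
# Each spec is name:rw:blocksize; /mnt/test is a hypothetical mount point.
fio_suite_cmds() {
    dir=${1:-/mnt/test}
    for spec in randread:randread:4k randwrite:randwrite:4k \
                seqread:read:1M seqwrite:write:1M; do
        name=${spec%%:*}   # text before the first colon
        rest=${spec#*:}    # text after the first colon
        rw=${rest%%:*}
        bs=${rest#*:}
        echo "fio --name=$name --directory=$dir --rw=$rw --bs=$bs --size=1G" \
             "--numjobs=4 --iodepth=32 --ioengine=libaio --direct=1" \
             "--runtime=60 --time_based --group_reporting"
    done
}

fio_suite_cmds
```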
Run identical tests on each filesystem to establish a performance baseline before and after tuning.
Tuning Checklist
- Always set noatime: single biggest win, zero data integrity risk
- Match recordsize to workload I/O: ZFS 128K default is wrong for most databases
- Enable compression on ZFS/Btrfs: lz4/zstd overhead is negligible; often a net throughput win
- Disable CoW for Btrfs VM/database directories with chattr +C
- Set nobarrier on XFS only with battery-backed RAID controllers, and only on kernels older than 4.19, where the option still exists: unsafe on consumer SSDs
- Use space_cache=v2 on Btrfs: significantly faster than v1 for large filesystems
- Profile before tuning: run iostat -x 1 during peak load to identify your actual I/O pattern
- Document your changes: mount options aren’t version-controlled; add comments in /etc/fstab
FAQ
Does noatime break any applications?
Very few. The main affected applications are mutt (checks mailbox access time for new mail detection) and some backup tools that use atime to determine which files changed. Modern alternatives exist for both: mutt can use maildir flags, and backup tools should use filesystem snapshots or find -mtime instead. Most distributions now default to relatime which is a good compromise.
Why is my Btrfs database performance so poor?
Almost certainly Copy-on-Write fragmentation. Database files experience random in-place writes, which CoW filesystems handle by writing new blocks instead of overwriting, causing severe fragmentation. The fix: chattr +C on the database directory before creating data files, or mount the filesystem with nodatacow (the option applies to the whole filesystem, not to a single subvolume).
Is ZFS compression worth the CPU cost?
For almost all modern servers, yes. lz4 compression typically adds <5% CPU overhead while reducing disk I/O by 30-60% on compressible data (logs, text, source code). The reduced I/O often makes compression a net throughput win — less data to read from disk means faster overall performance. Only skip compression for incompressible data like pre-compressed video or encrypted archives.
How do I know which filesystem is best for my workload?
Run fio benchmarks (see above) on test partitions with your expected I/O pattern. General rules: PostgreSQL/MySQL → XFS or ZFS with small recordsize; VM storage → XFS or ZFS with volblocksize matching; media server → ZFS with large recordsize and compression; general purpose desktop → Btrfs with compression for snapshot flexibility.
Can I change mount options without unmounting?
Some options can be changed with a remount: mount -o remount,noatime /mnt/point. However, options that change on-disk structures (for example, switching Btrfs to space_cache=v2) may require a full unmount and mount. recordsize is not a mount option: ZFS dataset properties change instantly with zfs set and affect only new writes. Test remount changes in a non-production environment first.