Introduction

Filesystem performance tuning is one of the most impactful yet overlooked aspects of Linux server administration. The difference between default mount options and optimized ones can mean 2-3x throughput improvement for database workloads, 50% latency reduction for web servers, or the difference between a backup job completing in 2 hours versus 8 hours.

This guide compares mount option and dataset property tuning across the three dominant Linux filesystems: XFS (the default for RHEL/CentOS and high-performance databases), Btrfs (the default for openSUSE and Fedora with advanced features), and ZFS (the enterprise-grade filesystem with integrated volume management). We focus on practical tuning parameters that deliver measurable performance improvements.

Why Tune Your Filesystem?

Most Linux distributions ship with conservative filesystem defaults that prioritize data integrity and compatibility over performance. For example, XFS and Btrfs are both mounted with the kernel default relatime unless noatime is set, so reads can still trigger access-time metadata writes (roughly once a day per file, or whenever the file has been modified since it was last accessed). Btrfs also enables Copy-on-Write for everything, including VM images and database files, where it degrades performance. ZFS defaults to a 128KB recordsize, which is too large for small random I/O such as database pages and smaller than ideal for large-file media archives.
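To see which access-time mode your mounts actually use, a small helper along these lines may be useful. It is a minimal sketch: the function name atime_modes and its output layout are made up for illustration, and it assumes the usual /proc/mounts layout (device, mount point, type, options), where a mount that lists neither noatime nor relatime is using strict atime.

```shell
# atime_modes [FILE]: print "<mountpoint> <fstype> <atime mode>" for XFS and
# Btrfs entries. FILE defaults to /proc/mounts. Illustrative helper only.
atime_modes() {
    mounts_file="${1:-/proc/mounts}"
    awk '$3 == "xfs" || $3 == "btrfs" {
        mode = "strictatime"              # no atime-related option listed
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++) {
            if (opts[i] == "noatime") { mode = "noatime" }
            else if (opts[i] == "relatime") { mode = "relatime" }
        }
        print $2, $3, mode
    }' "$mounts_file"
}

# Usage: atime_modes            (reads /proc/mounts)
```

Any line reporting relatime or strictatime is a candidate for the noatime change discussed in the sections that follow.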

Server workloads exhibit predictable I/O patterns: databases perform random reads/writes, web servers serve many small static files, backup tools stream large sequential writes, and virtual machines combine all of these. Tuning mount options and dataset properties to match your workload is essential for production performance.

For ZFS snapshot management after tuning, see our ZFS snapshot replication guide. For disk health monitoring to catch drive failures before they impact performance, check our disk health monitoring guide. For Kubernetes storage infrastructure, our Kubernetes storage comparison covers container-native options.

Comparison Table

| Feature | XFS | Btrfs | ZFS |
| --- | --- | --- | --- |
| Default Journaling | Metadata only | Copy-on-Write | Copy-on-Write (ZIL) |
| Checksumming | Metadata only | Data + metadata (on by default; off for nodatacow files) | Data + metadata (always) |
| Compression | No | zlib, lzo, zstd | lz4, gzip, zstd |
| Deduplication | No (external tools) | Offline (bedup, duperemove) | Online (ZFS dedup) |
| Snapshots | No (dm-thin external) | Built-in (read-write) | Built-in (read-only, clone) |
| RAID Support | No (mdadm external) | Built-in (RAID 0,1,5,6,10) | Built-in (RAID-Z, mirrors) |
| Max Filesystem Size | 8 EiB | 16 EiB | 256 ZiB |
| Max File Size | 8 EiB | 16 EiB | 16 EiB |
| TRIM/Discard | Mount option: discard | Mount option: discard | Pool property: autotrim=on |
| RHEL Support | Default (full support) | Deprecated in RHEL 7.4, removed in RHEL 8 | Not supported by Red Hat (third-party OpenZFS packages, e.g. DKMS) |

XFS Mount Options

XFS is the default filesystem on RHEL/CentOS 7+ and excels at large file I/O and high concurrency. Its allocation groups enable parallel I/O across multiple CPUs.

Key Mount Options

| Option | Description | Impact |
| --- | --- | --- |
| noatime | Disable access time updates | Reduces metadata writes; 10-20% improvement for read-heavy workloads |
| nobarrier | Disable write barriers | Improves write throughput but risks data loss on power failure; deprecated in kernel 4.10 and removed from XFS in 4.19, so newer kernels reject it |
| inode64 | Allow inodes to be placed anywhere on the filesystem | Default since kernel 3.7; gives better inode distribution on multi-terabyte filesystems |
| allocsize=64k | Speculative preallocation size for buffered writes | Larger values improve streaming write performance |
| largeio | Report a larger preferred I/O size (st_blksize) to applications | Beneficial for >256KB I/O patterns |
| swalloc | Stripe-width aligned allocation | Important for RAID arrays; align to stripe width |
| logbsize=256k | Log buffer size | Larger log buffers improve metadata-heavy workloads |
| attr2 | Extended attribute format | Default since kernel 2.6; enables in-inode attributes |

Database server (e.g., PostgreSQL, MySQL):

# /etc/fstab
# nobarrier is left out: XFS removed the option in kernel 4.19, and mounts that specify it fail there
/dev/sdb1 /var/lib/postgresql xfs noatime,largeio,inode64,logbsize=256k 0 0

# Check current XFS geometry
xfs_info /var/lib/postgresql

# Tune allocation group count at mkfs time
mkfs.xfs -d agcount=16 -l size=256m /dev/sdb1

Web server (many small files):

# /etc/fstab
/dev/sdc1 /var/www xfs noatime,nodiratime,inode64,attr2 0 0

The nodiratime option disables directory access time updates. It is already implied by noatime, so listing it is harmless but redundant. The saving matters for web servers that stat thousands of files while serving requests.

Backup/Archive storage:

# /etc/fstab
# nobarrier is left out: XFS removed the option in kernel 4.19
/dev/sdd1 /backup xfs noatime,allocsize=1m,largeio,inode64 0 0

allocsize=1m improves streaming write throughput for large backup archives.

Before/After Benchmark

Using fio to test random read performance:

# Default mount options
fio --name=randread --ioengine=libaio --rw=randread --bs=4k --size=1G --numjobs=4 --runtime=30 --group_reporting
# Default: ~45K IOPS

# With noatime,nodiratime,inode64
# Tuned: ~52K IOPS (+15%)
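When comparing before and after runs, it helps to compute the relative change the same way every time. The helper below is a minimal sketch; the name pct_change is illustrative, and the inputs are simply the approximate IOPS figures quoted above.

```shell
# pct_change BEFORE AFTER: relative change from BEFORE to AFTER in percent,
# rounded to one decimal place. Illustrative helper, not part of fio.
pct_change() {
    awk -v before="$1" -v after="$2" 'BEGIN {
        if (before == 0) { print "n/a"; exit 1 }
        printf "%.1f\n", (after - before) / before * 100
    }'
}

# The figures quoted above: ~45K IOPS default, ~52K IOPS tuned.
pct_change 45000 52000    # prints 15.6
```

Run each configuration several times and compare averages; a single 30-second run can easily vary by a few percent on its own.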

Btrfs Mount Options

Btrfs offers Copy-on-Write with snapshots, compression, and integrated RAID — but its performance profile differs significantly from XFS.

Key Mount Options

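| Option | Description | Impact |
| --- | --- | --- |
| noatime | Disable access time updates | Same benefit as XFS; reduces CoW overhead |
| compress=zstd | Enable transparent compression | Reduces disk I/O; zstd provides best speed/size ratio |
| autodefrag | Auto-defragment files | Helps small files with random writes; can increase write amplification and is a poor fit for large VM images and databases |
| space_cache=v2 | Free space cache format | v2 (free space tree) is faster and more scalable than v1; default for new filesystems since btrfs-progs 5.15 |
| ssd | SSD optimization mode | Enables SSD-specific I/O patterns (detected automatically) |
| nodatacow | Disable CoW for newly created files | Applies to the whole filesystem and also disables checksums and compression for those files; use chattr +C for per-directory control. Critical for VM images, databases, and torrent downloads |
| discard=async | Async TRIM for SSDs | Better performance than sync discard |
| commit=120 | Transaction commit interval in seconds (default 30) | Longer intervals reduce metadata writes but widen the window of data lost on a crash |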

General purpose with compression:

# /etc/fstab
/dev/sda1 / btrfs defaults,noatime,compress=zstd,space_cache=v2,ssd 0 0

Database/VM storage (disable CoW):

# Create a subvolume for VM images
btrfs subvolume create /mnt/data/vms

# Set the no-COW attribute so new files in it are created without CoW
chattr +C /mnt/data/vms

# Mount the subvolume; nodatacow cannot be set per subvolume, so chattr +C does the work
mount -o subvol=vms /dev/sda1 /var/lib/libvirt/images

Performance comparison with and without compression:

# Without compression
dd if=/dev/zero of=/mnt/btrfs/testfile bs=1M count=1000 conv=fdatasync
# ~450 MB/s write

# With compress=zstd
# ~800 MB/s write (compressible data)

Disabling CoW for Specific Directories

CoW is the biggest Btrfs performance bottleneck for random-write workloads:

# For an existing directory
chattr +C /var/lib/mysql
# New files inherit +C; files that already contain data stay CoW

# Verify (-d shows the directory itself instead of its contents)
lsattr -d /var/lib/mysql
# Output: ---------------C------ /var/lib/mysql

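Important: chattr +C only takes effect on new or empty files. Setting it on a directory makes files created there afterwards inherit the attribute, while existing files with data retain CoW. Create the directory, set +C, then populate it by copying the data in; a rename within the same filesystem keeps the old files and their CoW behavior.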
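The create, set +C, then populate order can be wrapped in a small guard so that the attribute is never applied to a directory that already holds data. This is a sketch under stated assumptions: make_nocow_dir is a hypothetical helper name, and chattr +C will only succeed on a filesystem that supports the attribute, such as Btrfs.

```shell
# make_nocow_dir DIR: create DIR if needed and set +C on it, refusing when
# DIR already contains entries, because those files would keep CoW.
# Illustrative helper; requires a filesystem that supports chattr +C.
make_nocow_dir() {
    dir="$1"
    mkdir -p "$dir" || return 1
    # Any entry at all (including dotfiles) means the directory is not empty.
    if [ -n "$(ls -A "$dir")" ]; then
        echo "refusing: $dir is not empty; existing files would stay CoW" >&2
        return 1
    fi
    chattr +C "$dir" || return 1
    lsattr -d "$dir"
}

# Usage (as root, on Btrfs): make_nocow_dir /var/lib/mysql
```

Refusing on a non-empty directory is a deliberate choice here: it forces the data to be copied in afterwards, which is the only way existing files pick up the attribute.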

ZFS Dataset Properties

ZFS treats each dataset as an independently tunable filesystem with inheritable properties.

Key Dataset Properties

| Property | Default | Recommended | Impact |
| --- | --- | --- | --- |
| recordsize | 128K | Match workload I/O size | 8K for PostgreSQL, 16K for MySQL/InnoDB, 1M for media archives |
| compression | off (on in OpenZFS 2.2 and later) | lz4 or zstd | lz4 is fast enough to improve throughput on compressible data |
| atime | on | off | Reduces metadata writes on reads |
| sync | standard | disabled (for non-critical data) | Eliminates ZIL writes; use with caution |
| primarycache | all | metadata (for large-file workloads) | Caches only metadata in ARC; data is read from disk |
| secondarycache | all | metadata (for SSD L2ARC) | Prevents L2ARC pollution with streaming data |
| logbias | latency | throughput (for large writes) | Uses indirect sync for better throughput |
| redundant_metadata | all | most | Reduces metadata duplication |

PostgreSQL database:

zfs create -o recordsize=8K \
           -o compression=lz4 \
           -o atime=off \
           -o logbias=latency \
           -o primarycache=metadata \
           tank/postgres

# PostgreSQL uses 8KB pages, so match recordsize
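The reason for matching recordsize to the page size can be shown with simple arithmetic. The sketch below assumes the worst case, where every random page access touches one full record and nothing is served from cache; record_amplification is an illustrative name, and real workloads will usually land somewhere below this bound.

```shell
# record_amplification RECORD_KIB IO_KIB: how many times more data ZFS
# handles than the application requested, assuming each IO_KIB-sized random
# access touches one full RECORD_KIB record and nothing is cached.
record_amplification() {
    awk -v rec="$1" -v io="$2" 'BEGIN {
        if (io <= 0) { print "n/a"; exit 1 }
        ratio = rec / io
        if (ratio < 1) ratio = 1      # I/O spanning whole records is not amplified
        printf "%.0fx\n", ratio
    }'
}

record_amplification 128 8    # default recordsize, 8KB pages: prints 16x
record_amplification 8 8      # matched recordsize: prints 1x
```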

Media/archive storage:

zfs create -o recordsize=1M \
           -o compression=zstd \
           -o atime=off \
           -o primarycache=metadata \
           -o redundant_metadata=most \
           tank/media

VM image storage:

# volblocksize applies only to zvols (zfs create -V), so it is omitted for this filesystem dataset
zfs create -o recordsize=64K \
           -o compression=lz4 \
           -o atime=off \
           tank/vms

Disabling sync for build/test directories:

zfs create -o sync=disabled -o compression=lz4 tank/build
# WARNING: a crash or power loss can discard the last few seconds of acknowledged writes

Measuring ARC Performance

# Check ARC hit ratio (should be >95% for metadata-heavy workloads)
grep -E '^(c |hits|misses)' /proc/spl/kstat/zfs/arcstats
# hits: 12345678
# misses: 98765
# HIT RATIO: 99.2%

# Monitor ARC size
arc_summary | head -20
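The hit ratio shown in the comments above is hits / (hits + misses). A minimal sketch for computing it directly from the kstat file follows; arc_hit_ratio is an illustrative name, and the code assumes the usual arcstats layout of one "name type value" row per counter.

```shell
# arc_hit_ratio [FILE]: print the ARC hit ratio in percent from a kstat file
# whose rows are "name type value". FILE defaults to the arcstats path used
# above. Illustrative helper only.
arc_hit_ratio() {
    stats_file="${1:-/proc/spl/kstat/zfs/arcstats}"
    awk '$1 == "hits"   { hits = $3 }
         $1 == "misses" { misses = $3 }
         END {
             total = hits + misses
             if (total == 0) { print "n/a"; exit 1 }
             printf "%.1f\n", hits / total * 100
         }' "$stats_file"
}

# Usage: arc_hit_ratio
```

These counters are cumulative since the ZFS module was loaded, so the result is a long-run average. To judge a tuning change, sample the counters before and after a test run and compare the differences.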

Cross-Filesystem Benchmarking with fio

Here’s a comprehensive test suite to compare filesystem performance:

# Install fio
sudo apt install fio  # Debian/Ubuntu
sudo dnf install fio  # RHEL/CentOS

# Random read (simulates database queries)
fio --name=randread --filename=/mnt/test/fio.dat \
    --ioengine=libaio --rw=randread --bs=4k --size=2G \
    --numjobs=4 --runtime=60 --time_based --group_reporting

# Random write (simulates database writes)
fio --name=randwrite --filename=/mnt/test/fio.dat \
    --ioengine=libaio --rw=randwrite --bs=4k --size=2G \
    --numjobs=4 --runtime=60 --time_based --group_reporting

# Sequential read (simulates backup restoration)
fio --name=seqread --filename=/mnt/test/fio.dat \
    --ioengine=libaio --rw=read --bs=1M --size=4G \
    --numjobs=1 --runtime=60 --time_based --group_reporting

# Sequential write (simulates log ingestion)
fio --name=seqwrite --filename=/mnt/test/fio.dat \
    --ioengine=libaio --rw=write --bs=1M --size=4G \
    --numjobs=1 --runtime=60 --time_based --group_reporting

Run identical tests on each filesystem to establish a performance baseline before and after tuning.
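One way to keep the tests identical across filesystems is to generate the commands from a single definition instead of editing them by hand. The sketch below only prints the commands; fio_cmd and fio_matrix are illustrative names, and /mnt/xfs, /mnt/btrfs, and /mnt/zfs are placeholder mount points for your own test filesystems.

```shell
# fio_cmd DIR NAME RW BS SIZE NUMJOBS: print one fio command line using the
# same fixed options as the suite above.
fio_cmd() {
    echo "fio --name=$2 --filename=$1/fio.dat --ioengine=libaio --rw=$3 --bs=$4 --size=$5 --numjobs=$6 --runtime=60 --time_based --group_reporting"
}

# fio_matrix DIR...: print the four tests for every directory given.
fio_matrix() {
    for dir in "$@"; do
        fio_cmd "$dir" randread  randread  4k 2G 4
        fio_cmd "$dir" randwrite randwrite 4k 2G 4
        fio_cmd "$dir" seqread   read      1M 4G 1
        fio_cmd "$dir" seqwrite  write     1M 4G 1
    done
}

# Placeholder mount points; review the output, then pipe it to sh to run it.
fio_matrix /mnt/xfs /mnt/btrfs /mnt/zfs
```

Printing first and executing second makes it easy to save the exact command list next to the results, which helps when you repeat the comparison after a tuning change.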

Tuning Checklist

  1. Always set noatime — single biggest win, zero data integrity risk
  2. Match recordsize to workload I/O — ZFS 128K default is wrong for most databases
  3. Enable compression on ZFS/Btrfs — lz4/zstd overhead is negligible; often a net throughput win
  4. Disable CoW for Btrfs VM/database directories with chattr +C
  5. Set nobarrier on XFS only with battery-backed RAID controllers, and only on kernels older than 4.19 (later kernels removed the option); unsafe on consumer SSDs
  6. Use space_cache=v2 on Btrfs — significantly faster than v1 for large filesystems
  7. Profile before tuning — run iostat -x 1 during peak load to identify your actual I/O pattern
  8. Document your changes — mount options aren’t version-controlled; add comments in /etc/fstab
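For the profiling step, the kernel's cumulative disk counters give a quick estimate of the average request size, which is the number to compare against recordsize or allocsize. This is a rough sketch: avg_io_size is an illustrative name, the code assumes the standard /proc/diskstats field layout, and because the counters accumulate since boot it reports a long-run average rather than peak-load behavior (iostat -x 1 remains the better tool for that).

```shell
# avg_io_size DEVICE [FILE]: average read and write request size in KiB for
# DEVICE, from counters in the /proc/diskstats layout (field 4 = reads
# completed, 6 = sectors read, 8 = writes completed, 10 = sectors written;
# a sector is 512 bytes). Returns non-zero if DEVICE is not listed.
avg_io_size() {
    awk -v dev="$1" '$3 == dev {
        r = ($4 > 0) ? $6 / 2 / $4 : 0     # sectors -> KiB, per read
        w = ($8 > 0) ? $10 / 2 / $8 : 0    # sectors -> KiB, per write
        printf "read %.1f KiB, write %.1f KiB\n", r, w
        found = 1
    }
    END { exit !found }' "${2:-/proc/diskstats}"
}

# Usage: avg_io_size sdb
```

An average near 8 KiB or 16 KiB suggests a database-style random workload, while averages in the hundreds of KiB point to streaming I/O.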

FAQ

Does noatime break any applications?

Very few. The main affected applications are mutt (checks mailbox access time for new mail detection) and some backup tools that use atime to determine which files changed. Modern alternatives exist for both: mutt can use maildir flags, and backup tools should use filesystem snapshots or find -mtime instead. Most distributions now default to relatime which is a good compromise.

Why is my Btrfs database performance so poor?

Almost certainly Copy-on-Write fragmentation. Database files experience random in-place writes, which CoW filesystems handle by writing new blocks instead of overwriting, causing severe fragmentation. The fix: chattr +C on the database directory before creating data files. The nodatacow mount option has the same effect but applies to the whole filesystem rather than to a single subvolume.

Is ZFS compression worth the CPU cost?

For almost all modern servers, yes. lz4 compression typically adds <5% CPU overhead while reducing disk I/O by 30-60% on compressible data (logs, text, source code). The reduced I/O often makes compression a net throughput win — less data to read from disk means faster overall performance. Only skip compression for incompressible data like pre-compressed video or encrypted archives.

How do I know which filesystem is best for my workload?

Run fio benchmarks (see above) on test partitions with your expected I/O pattern. General rules: PostgreSQL/MySQL → XFS or ZFS with small recordsize; VM storage → XFS or ZFS with volblocksize matching; media server → ZFS with large recordsize and compression; general purpose desktop → Btrfs with compression for snapshot flexibility.

Can I change mount options without unmounting?

Some options can be changed with a remount: mount -o remount,noatime /mnt/point. Others only take effect when the filesystem is mounted and cannot be switched on a live read-write mount; moving Btrfs to space_cache=v2 is one example. ZFS dataset properties such as recordsize are not mount options at all: they change instantly with zfs set and affect only new writes. Test remount changes in a non-production environment first.
