← Back to posts
backup self-hosted data-integrity · · 12 min read

Self-Hosted Backup Verification & Integrity Testing Guide 2026

Complete guide to verifying backup integrity for self-hosted systems. Compare Borgmatic, Restic, Kopia, and dedicated tools for automated restore testing, checksum validation, and disaster recovery readiness.

OS
Editorial Team

A backup that hasn’t been verified is not a backup — it’s a wish. Every sysadmin has heard the horror story: months of automated backup jobs running silently, only to discover on recovery day that every single archive is corrupt, incomplete, or missing critical files. The problem isn’t that self-hosted backup tools are unreliable — Borg, Restic, and Kopia all use strong authenticated encryption. The problem is nobody tests until it’s too late.

This guide covers the tools and strategies that turn “hope your backups work” into a verifiable, automated process. We’ll look at Borgmatic, Restic’s built-in verification, Kopia’s consistency checks, and dedicated restore testing frameworks that catch problems before they become disasters.

Why Backup Verification Matters

Self-hosted infrastructure faces unique risks that cloud providers abstract away. Bit rot on spinning disks, network interruptions during S3 uploads, filesystem snapshots that capture partial writes, and storage backends that silently return I/O errors — all of these can produce backup files that look complete but contain corrupted or missing data.

The three pillars of backup verification are:

  1. Integrity checking — confirming that stored data hasn’t been corrupted since backup time. Every serious backup tool uses authenticated encryption (AEAD) or cryptographic checksums to detect tampering and bit rot.
  2. Consistency validation — ensuring that the backup represents a coherent state of the source system. A database dumped mid-transaction, or a filesystem captured while files were actively being written, produces internally inconsistent backups.
  3. Restore testing — the only definitive proof that a backup works is actually recovering data from it. Automated restore testing extracts a sample of files, verifies checksums, and reports any discrepancies.

Without all three, you’re flying blind. A backup job that reports “success” based solely on exit code tells you nothing about whether the data inside is actually usable.

Borgmatic: Automated Backup Orchestration with Built-In Verification

Borgmatic (2,224 stars, last updated April 2026) is a configuration-driven wrapper around BorgBackup (13,222 stars) that adds automated scheduling, retention policies, and — critically — built-in verification hooks.

How Borgmatic Verifies Backups

Borgmatic runs three verification layers automatically:

  • borg check — validates repository metadata, archive index consistency, and segment file integrity using CRC32 checksums and HMAC authentication tags
  • Extract testing — periodically extracts a random subset of archived files to a temporary directory and compares them against expected checksums
  • Hook-based validation — runs custom scripts (database dumps, application health checks) before and after backup creation
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
# /etc/borgmatic/config.yaml
location:
  source_directories:
    - /home
    - /var/www
    - /etc
  repositories:
    - user@backup-server:/backups/server.borg

storage:
  encryption_passphrase: "your-strong-passphrase"
  compression: zstd,6

retention:
  keep_daily: 7
  keep_weekly: 4
  keep_monthly: 6

consistency:
  # Frequency of each check type
  checks:
    - name: repository
      frequency: 2 weeks
    - name: archives
      frequency: 1 month
    - name: extract
      frequency: 3 months
      extraction_prefix: "borgmatic-extract-test-"

The consistency section is where Borgmatic shines. Unlike running borg check manually, Borgmatic schedules different check types at different frequencies — lightweight repository checks run often, while expensive extract tests run quarterly to avoid impacting production storage.

docker Deployment

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
version: "3.8"
services:
  borgmatic:
    image: ghcr.io/borgmatic-collective/borgmatic:latest
    volumes:
      - /etc/borgmatic:/etc/borgmatic.d
      - /root/.config/borgmatic:/root/.config/borgmatic
      - /root/.cache/borgmatic:/root/.cache/borgmatic
      - /data:/data:ro
      - /backup-repo:/backup-repo
    environment:
      - BORG_PASSPHRASE=your-backup-passphrase
      - BORG_UNKNOWN_UNENCRYPTED_REPO_ACCESS_IS_OK=yes
    command: ["borgmatic", "create", "--stats", "--verbosity", "1"]
    restart: "no"

Schedule the container via cron or systemd timer:

1
2
3
4
5
6
7
8
9
# /etc/cron.d/borgmatic
0 2 * * * root docker run --rm \
  -v /etc/borgmatic:/etc/borgmatic.d \
  -v /root/.config/borgmatic:/root/.config/borgmatic \
  -v /data:/data:ro \
  -v /backup-repo:/backup-repo \
  -e BORG_PASSPHRASE=your-passphrase \
  ghcr.io/borgmatic-collective/borgmatic:latest \
  borgmatic create --stats

Restic: Built-In Integrity with Cryptographic Proofs

Restic (33,165 stars, last updated April 2026) is written in Go and supports a wide range of backends: local filesystem, SFTP, REST server, S3-compatible storage, Backblaze B2, Azure Blob, and Google Cloud Storage.

Restic’s Verification Commands

Restic uses authenticated encryption (XSalsa20-Poly1305) for all stored data. Every blob is encrypted with a unique key and includes a MAC for tamper detection.

Full repository check:

1
2
3
4
5
# Verify all data integrity — reads every blob
restic -r s3:s3.amazonaws.com/my-backup-bucket check

# Check only repository structure (faster)
restic -r /backup/restic-repo check --read-data-subset=10%

Specific snapshot verification:

1
2
3
4
5
6
7
# Check a single snapshot for consistency
restic -r /backup/restic-repo check --with-cache

# Restore a snapshot to verify recoverability
restic -r s3:s3.amazonaws.com/my-backup-bucket restore latest \
  --target /tmp/restore-test \
  --include "/var/www/config"

The --read-data-subset flag is critical for large repositories. A 10TB repository check might take hours with full read, but --read-data-subset=10% validates a random 10% sample in a fraction of the time — catching most corruption issues without the full I/O cost.

Automated Restore Testing with Restic

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
#!/bin/bash
# /usr/local/bin/restic-verify.sh
set -euo pipefail

REPO="s3:s3.amazonaws.com/my-backup-bucket"
RESTORE_DIR="/tmp/restic-verify-$(date +%s)"
LOGFILE="/var/log/restic-verify.log"

cleanup() {
    rm -rf "$RESTORE_DIR"
}
trap cleanup EXIT

echo "$(date): Starting restore verification" >> "$LOGFILE"

# Restore latest snapshot to temp directory
restic -r "$REPO" restore latest --target "$RESTORE_DIR" 2>&1 >> "$LOGFILE"

# Verify critical files exist and are non-empty
[nginx](https://nginx.org/)CAL_FILES=(
    "/etc/nginx/nginx.conf"
    "/var/www/config/database.yml"
    "/etc/ssl/certs/server.crt"
)

FAILED=0
for file in "${CRITICAL_FILES[@]}"; do
    full_path="$RESTORE_DIR$file"
    if [ ! -s "$full_path" ]; then
        echo "FAIL: $file is missing or empty" >> "$LOGFILE"
        FAILED=$((FAILED + 1))
    else
        echo "OK: $file ($(wc -c < "$full_path") bytes)" >> "$LOGFILE"
    fi
done

if [ "$FAILED" -gt 0 ]; then
    echo "$(date): VERIFICATION FAILED ($FAILED files)" >> "$LOGFILE"
    exit 1
fi

echo "$(date): All checks passed" >> "$LOGFILE"

Docker Deployment

Restic’s official image includes the binary but requires you to mount configuration:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
version: "3.8"
services:
  restic-backup:
    image: restic/restic:latest
    volumes:
      - /data:/data:ro
      - /etc/restic:/etc/restic
      - /tmp/restic-cache:/cache
    environment:
      - RESTIC_REPOSITORY=s3:s3.amazonaws.com/my-backup-bucket
      - RESTIC_PASSWORD=your-restic-password
      - AWS_ACCESS_KEY_ID=your-access-key
      - AWS_SECRET_ACCESS_KEY=your-secret-key
      - RESTIC_CACHE_DIR=/cache
    command: ["backup", "/data", "--verbose"]
    restart: "no"

Kopia: Fast Verification with Client-Side Deduplication

Kopia (13,038 stars, last updated April 2026) is a cross-platform backup tool written in Go that combines deduplication, compression, and encryption with a unique content-addressable storage model. It includes both CLI and a web-based GUI.

Kopia’s Verification Approach

Kopia stores data in content-addressed blobs with BLAKE2b-256 hashing. Every piece of data is identified by its hash, making it impossible to serve corrupted content — the hash simply won’t match.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
# Initialize repository on S3
kopia repository create s3 \
  --bucket=my-backup-bucket \
  --access-key=AKIA... \
  --secret-access-key=...

# Full integrity check
kopia snapshot verify latest

# Verify specific source path
kopia snapshot verify latest --verify-files-percent=25

# Check repository consistency
kopia repository validate

The --verify-files-percent flag works similarly to Restic’s --read-data-subset but operates at the file level — it actually extracts and checksums 25% of files in the snapshot, providing stronger guarantees than a metadata-only check.

Automated Kopia Verification

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
#!/bin/bash
# /usr/local/bin/kopia-verify.sh
set -euo pipefail

# Connect to repository
kopia repository connect filesystem --path /backup/kopia-repo

# Run snapshot verification with 10% file sampling
OUTPUT=$(kopia snapshot verify --verify-files-percent=10 2>&1)
EXIT_CODE=$?

if [ $EXIT_CODE -ne 0 ]; then
    echo "Kopia verification failed:"
    echo "$OUTPUT"
    exit 1
fi

echo "Kopia verification passed. Sample: $(echo "$OUTPUT" | grep -o '[0-9]* files' | head -1)"

Docker Deployment with GUI

Kopia’s unique advantage is its built-in web UI, which is useful for monitoring verification status:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
version: "3.8"
services:
  kopia:
    image: ghcr.io/kopia/kopia:latest
    ports:
      - "51515:51515"
    volumes:
      - /data:/data:ro
      - /backup/kopia-repo:/repository
      - kopia-config:/app/config
    environment:
      - KOPIA_PASSWORD=your-kopia-password
    command: ["server", "start", "--address=0.0.0.0:51515",
              "--server-username=admin",
              "--server-password=dashboard-password",
              "--config-file=/app/config/repository.config"]
    restart: unless-stopped

volumes:
  kopia-config:

Comparison: Backup Verification Tools

FeatureBorgmatic + BorgResticKopia
EncryptionAES-256-CTR-HMACXSalsa20-Poly1305AES-256-GCM
Hash algorithmHMAC-SHA256Poly1305 MACBLAKE2b-256
Metadata checkborg checkrestic checkkopia repository validate
Full data read checkborg check --repository-only offrestic check reads all blobskopia snapshot verify
Partial sampling❌ Not built-in--read-data-subset=10%--verify-files-percent=25
Extract testing✅ Borgmatic extract testManual restore to temp dirManual via snapshot restore
Automated scheduling✅ YAML configRequires cron/systemdBuilt-in scheduler + cron
Backend supportSSH, local, rcloneS3, SFTP, REST, B2, Azure, GCSS3, B2, GCS, Azure, SFTP, WebDAV
Web UI❌ CLI only❌ CLI only✅ Built-in web GUI
Retention policies✅ Flexible (daily/weekly/monthly)forget --keep-* flags✅ Policies with keep-latest, keep-hourly
Compressionlz4, zstd, lzmanone (by design)zstd, s2, deflate
LanguagePythonGoGo
Stars2,224 (wrapper) / 13,222 (Borg)33,16513,038

Additional Verification Tools and Strategies

Beyond the three main backup engines, several supplementary tools strengthen your verification posture:

Vorta — Desktop GUI for Borg

Vorta provides a desktop GUI for Borg with integrated scheduling and log monitoring. While it doesn’t add new verification methods beyond borg check, it makes monitoring easier for non-technical users and surfaces errors in a GUI rather than requiring log parsing.

1
2
3
4
# Install on desktop Linux
pip install vorta
# Or via Flatpak
flatpak install flathub com.borgbase.Vorta

Restic Browser — Interactive Repository Explorer

restic-browser is an Electron-based GUI for browsing Restic repositories. It lets you inspect snapshots, verify individual files, and estimate recovery times — useful for pre-disaster-recovery planning.

Database-Specific Verification

For database backups, generic file integrity checks are insufficient. A .sql dump can be byte-perfect but contain logically inconsistent data. Use database-specific tools:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
# PostgreSQL: verify logical dump integrity
pg_restore --list /backup/db-dump.sql.dump > /dev/null 2>&1
echo "Exit code: $? — 0 means dump structure is valid"

# MySQL: check dump syntax
mysql --no-defaults --database=test_restore < /backup/db-dump.sql
echo "Restored to test database — verify with SELECT COUNT(*)"

# PostgreSQL: use pgBackRest for physical backup verification
# See our detailed PostgreSQL backup guide: ../pgbackrest-vs-barman-vs-wal-g-self-hosted-postgresql-backup-guide/

Storage Backend Health Monitoring

Your backup is only as reliable as the storage it sits on. Monitor the underlying storage:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
# Check filesystem health (ext4)
sudo tune2fs -l /dev/sdb1 | grep -i "filesystem state"

# Check ZFS pool status
zpool status -x

# Monitor disk SMART data
sudo smartctl -a /dev/sdb | grep -E "Reallocated|Pendin[minio](https://min.io/)orrectable"

# For S3/MinIO: use mc admin info
mc admin info myminio

See our guides on MinIO S3 storage and NAS solutions for storage infrastructure that supports reliable backup targets.

Choosing the Right Verification Strategy

The best approach depends on your infrastructure scale and tolerance for false negatives:

Small setups (1-3 servers, < 500GB): Restic with weekly check --read-data-subset=10% and monthly restore-to-temp testing. The low overhead makes frequent verification practical, and the S3 backend support means offsite copies are easy.

Medium setups (5-20 servers, 1-10TB): Borgmatic with automated consistency checks. The YAML-driven scheduling handles retention and verification at scale, and Borg’s deduplication keeps storage costs down for large datasets.

Large setups (20+ servers, 10TB+): Kopia with its content-addressable model and built-in scheduler. The web UI provides visibility across multiple backup sources, and the --verify-files-percent flag offers tunable verification depth without overwhelming I/O.

Regardless of tool, the golden rule remains: an untested backup is not a backup. Schedule verification as rigorously as you schedule the backup itself, and set up alerting so you hear about failures during business hours — not at 3 AM on a Sunday when you’re trying to restore production.

FAQ

How often should I verify my backups?

For production systems, run lightweight integrity checks (metadata-only) weekly and full data read checks monthly. For critical databases or compliance-regulated data, add weekly restore-to-temp testing. The verification frequency should be proportional to how much data you’d lose if the backup turned out to be corrupt — daily verification for systems with RPO under 1 hour, weekly for less critical workloads.

What’s the difference between integrity check and restore testing?

An integrity check confirms that stored data hasn’t been corrupted — it validates checksums, MAC tags, and repository structure. Restore testing goes further by actually extracting data to a temporary location and verifying that the recovered files are usable. Integrity checks catch bit rot and storage corruption; restore testing catches logical errors like incomplete database dumps, permission issues, and missing dependencies.

Can backup verification impact production performance?

Full data read checks do generate I/O and network traffic, especially for large repositories. Use sampling flags (--read-data-subset, --verify-files-percent) to reduce impact. Schedule heavy verification during off-peak hours. Borgmatic’s approach of running different check types at different frequencies — lightweight checks often, extract tests rarely — is designed specifically to balance confidence with performance.

Does encryption affect verification speed?

Yes, but the impact is usually small. Restic’s XSalsa20-Poly1305 and Kopia’s AES-256-GCM both have hardware acceleration on modern CPUs. Borg’s AES-256-CTR-HMAC is slightly slower but still achieves hundreds of MB/s on typical server hardware. The bottleneck is usually disk I/O or network bandwidth to the storage backend, not cryptographic overhead.

What happens if verification detects corruption?

Most backup tools will report which blobs or segments are corrupted but cannot repair them — the data is already lost at that point. The correct response is: (1) immediately create a fresh backup from the live system, (2) investigate the root cause (failing disk, network issue, storage backend problem), (3) if you have multiple backup targets, cross-reference to see if the corruption is isolated to one backend, and (4) for critical data, restore from the most recent known-good snapshot while the fresh backup runs.

Is it safe to run backup verification on production servers?

Yes, if you use sampling. Running restic check with --read-data-subset=10% on a production server reads only a fraction of the backup data and typically completes in minutes. Full extract tests should run on a separate host that has read-only access to the backup repository, not on the production server. For Kopia, the web UI server can run on a monitoring host while backups execute from production servers.

Advertise here