You provisioned a new Proxmox Backup Server, the install ran clean, and now the prompt is asking what kind of datastore to point it at. Pick wrong here and you spend the next six months fighting slow restores, blown capacity forecasts, or worse, a corrupted datastore with no checksums to tell you when the rot started.
The choice between Directory, LVM-Thin, and ZFS is the single biggest decision in a PBS deployment after hardware sizing. This post breaks down what each backend gives you, where it falls short, and which one belongs under your production workload.
Key Takeaways
- Directory works on any filesystem but gives you no checksums, no native deduplication awareness, and no snapshot recovery.
- LVM-Thin delivers block-level speed and thin provisioning, but a corrupted thin pool can take the whole datastore with it.
- ZFS is the only backend with end-to-end checksums, scrubs, compression, and snapshot rollback. It is the default for multi-client production.
- ZFS deduplication is almost never worth enabling. PBS already deduplicates above the filesystem and the RAM cost is brutal.
- Rule of thumb: production MSP or hosting use cases want ZFS. Existing LVM shops can start on LVM-Thin and migrate. Labs and dev get Directory.
Why the Backend Choice Matters
Proxmox Backup Server stores everything as content-addressed chunks. Dedup and incremental performance both live above the filesystem layer, in the PBS daemon. What the underlying storage backend gives you is everything else: integrity, recovery, write amplification, and how badly your day goes when something fails.
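To make "content-addressed" concrete, here is a minimal sketch of how a chunk's SHA-256 digest can be turned into a file path. It assumes a `.chunks` directory with subdirectories named after the first four hex digits of the digest. The helper name `chunk_path` is made up for illustration, and real chunk files are encoded (and usually compressed) blobs rather than the raw bytes hashed here.

```shell
#!/bin/sh
# Illustration only: derive a content-addressed path from data on stdin.
# chunk_path is a hypothetical helper, not a PBS command.
chunk_path() {
    # sha256sum prints "<hex digest>  -" for stdin; keep the first field.
    digest=$(sha256sum | cut -d' ' -f1)
    # Assumed layout: .chunks/<first 4 hex digits>/<full digest>
    prefix=$(printf '%s' "$digest" | cut -c1-4)
    printf '.chunks/%s/%s\n' "$prefix" "$digest"
}

# Identical content always maps to the same path, which is why
# deduplication does not depend on the filesystem underneath.
printf 'same data' | chunk_path
printf 'same data' | chunk_path
```

Both calls print the same path, so a second copy of the same data never needs a second file.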
Three things hinge on the backend choice.
Integrity. PBS verifies chunks by re-reading them and comparing hashes. That catches bit rot eventually, but on a 50 TB datastore a full verify takes hours. ZFS checksums every block on every read, for free. Directory and LVM-Thin trust the disk.
Recovery posture. When a sector goes bad, ZFS tells you which file is affected and which chunk. ext4 over LVM-Thin gives you an EIO and a guess. Directory on top of XFS depends entirely on what the filesystem decides to log.
Capacity behavior under pressure. A full ZFS pool degrades predictably. A full LVM-Thin pool can corrupt every volume in it. A full directory backend just fails new writes, but garbage collection slows to a crawl long before that.
Get this layer right and the rest of PBS administration becomes routine. Get it wrong and you are explaining to a client why their restore is unreadable.
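To put a rough number on the verify cost mentioned under Integrity, the sketch below estimates how long one full read pass over a datastore takes. The helper `full_read_hours` is hypothetical, the read rate is whatever your hardware sustains, and the math uses decimal units (1 TB = 1,000,000 MB).

```shell
#!/bin/sh
# Rough duration of one full read pass (a PBS verify or a ZFS scrub).
# Arguments: datastore size in TB, sustained read rate in MB/s.
# Both inputs are assumptions you supply; this is not a benchmark.
full_read_hours() {
    size_tb=$1
    rate_mbs=$2
    seconds=$(( size_tb * 1000000 / rate_mbs ))
    # Integer division: the result is whole hours, rounded down.
    echo $(( seconds / 3600 ))
}

full_read_hours 50 1000   # 50 TB at 1 GB/s  -> 13
full_read_hours 50 500    # 50 TB at 500 MB/s -> 27
```

Even at a sustained 1 GB/s, a 50 TB pass ties up the disks for more than half a day.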
The Three Backends: At a Glance
Directory is plain filesystem storage. You hand PBS a path on an existing filesystem (ext4, XFS, anything mountable), and it writes chunks as files into a directory tree. Zero prerequisites. Whatever your filesystem does, PBS inherits.
LVM-Thin sits on a thin-provisioned logical volume. You carve a thin pool out of LVM, create a thin LV, format it with a filesystem, and mount that as the datastore. You get thin provisioning, fast snapshots at the LV level, and decent sequential write performance. You also inherit LVM's failure modes.
ZFS combines volume management and filesystem in one layer. You build a zpool from raw devices, create a dataset, and point PBS at the mount point. Checksums, compression, scrubs, snapshots, send/recv, and copy-on-write are all native.
Feature Matrix
| Capability | Directory | LVM-Thin | ZFS |
|---|---|---|---|
| PBS chunk dedup support | Yes | Yes | Yes |
| Filesystem-level dedup | No | No | Optional (avoid) |
| Transparent compression | FS-dependent | FS-dependent | Native LZ4/ZSTD |
| End-to-end checksums | No | No | Yes |
| Snapshot capability | FS-dependent | LV-level | Native, instant |
| Raw device required | No | Yes | Yes |
| Recovery complexity | Medium | High | Low |
The big distinctions are checksums and recovery complexity. Those are the rows that matter when something breaks.
Directory Backend
Directory is the path of least resistance. If you have a mounted filesystem with free space, you have a viable datastore.
Pros. No raw devices required. Works on any filesystem your kernel can mount. You can put a datastore on an NFS share, an SMB mount, a USB drive, a local ext4 partition, anything. Setup is one command. If you need to move the datastore, you copy the directory tree.
Cons. No native checksums. PBS's chunk verification is the only integrity check, and it's expensive. Reliability depends entirely on the underlying filesystem. On ext4 with a flaky disk, silent corruption can spread for months before a verify catches it. No snapshot capability at the storage layer, so atomic rollback after a bad sync job is impossible without filesystem-level help. Garbage collection tends to be slower than on block-backed datastores because every chunk is a separate file and each one costs an inode operation.
When to use. Development environments. Test labs. A single client with low backup churn and a short retention window. Any scenario where "the datastore is gone" is annoying rather than catastrophic.
# Assumes /mnt/backups is already mounted with adequate free space
mkdir -p /mnt/backups/pbs-datastore
chown backup:backup /mnt/backups/pbs-datastore
# The datastore path is passed as the second positional argument
proxmox-backup-manager datastore create main /mnt/backups/pbs-datastore \
--comment "Directory-backed datastore on ext4"
Filesystem matters
A directory datastore is only as reliable as the filesystem under it. ext4 on a single disk gives you no protection against bit rot. If you go this route, at minimum run it on XFS or ext4 on top of mdraid with regular consistency checks.
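If the directory datastore sits on mdraid, the "regular consistency checks" can be driven through the kernel's md sysfs interface. The following is a minimal sketch under the assumption that the array is `md0`. The function names are made up for illustration, and the check itself needs root and runs in the background for a long time on large arrays.

```shell
#!/bin/sh
# Sketch: trigger an md consistency check and read the mismatch counter.
# md0 is an assumed array name; override MD_SYSFS for a different array.
MD_SYSFS=${MD_SYSFS:-/sys/block/md0/md}

start_md_check() {
    # Writing "check" asks the kernel to read all members and compare them.
    echo check > "$MD_SYSFS/sync_action"
}

md_report() {
    # mismatch_cnt is only meaningful once the check has finished.
    count=$(cat "$MD_SYSFS/mismatch_cnt")
    if [ "$count" -eq 0 ]; then
        echo "OK: no mismatches"
    else
        echo "WARN: mismatch count $count"
    fi
}

# Usage (as root):
#   start_md_check
#   ... wait until sync_action reads "idle" again ...
#   md_report
```

A non-zero count only tells you the members disagreed somewhere. Unlike a ZFS scrub, it cannot tell you which copy was correct.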
LVM-Thin Backend
LVM-Thin is the answer when you already have an LVM setup and need to slot PBS into it without rebuilding the storage layer.
Pros. Block-level write performance is solid for sequential workloads, which is what PBS chunk writes look like. Thin provisioning lets you overcommit the underlying pool, which is useful in mixed environments. LV-level snapshots are fast to take. Integration with Proxmox VE is well-trodden ground, so the operational tooling is familiar.
Cons. No built-in deduplication at the storage layer (PBS still deduplicates chunks, but the filesystem above LVM-Thin sees them as opaque files). No checksums. The real pain is recovery: if the thin pool's metadata corrupts, every thin LV in that pool can become unrecoverable simultaneously. There is no graceful degradation. You either run thin_check and thin_repair and pray, or you restore from a sync target. Filling the thin pool to 100 percent can put the entire pool into read-only mode and risks data loss.
When to use. You already run LVM-Thin elsewhere, you need to get PBS online today, and your retention window is short enough that a full datastore rebuild from a sync target is operationally acceptable. Treat it as a stepping stone toward ZFS rather than a final answer.
# Create a volume group on raw devices
pvcreate /dev/sdb /dev/sdc
vgcreate pbs-vg /dev/sdb /dev/sdc
# Create the thin pool, leaving 5% for metadata growth
lvcreate --type thin-pool -l 95%FREE -n pbs-pool pbs-vg
# Create a thin LV and format it with XFS
lvcreate -V 8T -T pbs-vg/pbs-pool -n pbs-data
mkfs.xfs /dev/pbs-vg/pbs-data
# Mount and hand to PBS
mkdir -p /mnt/pbs-data
mount /dev/pbs-vg/pbs-data /mnt/pbs-data
echo "/dev/pbs-vg/pbs-data /mnt/pbs-data xfs defaults 0 2" >> /etc/fstab
# The datastore path is passed as the second positional argument
proxmox-backup-manager datastore create main /mnt/pbs-data \
--comment "LVM-Thin datastore"
Monitor thin pool usage
LVM-Thin does not protect you from overcommit. If actual usage hits the underlying pool size, every thin LV in the pool can go read-only at once. Alert on pool usage at 80 percent and never run a thin pool past 90 percent without an expansion plan.
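One way to act on those thresholds is a small check that a cron job or monitoring agent can call. This is a sketch, not a finished monitoring plugin: the function names are hypothetical, the pool name `pbs-vg/pbs-pool` comes from the example above, and it assumes `lvs` prints percentages with a dot as the decimal separator.

```shell
#!/bin/sh
# Classify a usage percentage against the 80 / 90 percent thresholds.
# Accepts values as printed by lvs, for example "83.41".
classify_pool_usage() {
    pct=${1%%.*}    # drop the fractional part for integer comparison
    if [ "$pct" -ge 90 ]; then
        echo CRITICAL
    elif [ "$pct" -ge 80 ]; then
        echo WARNING
    else
        echo OK
    fi
}

# Read data and metadata usage of the thin pool and classify both.
# Metadata matters as much as data: a full metadata LV also stalls the pool.
check_thin_pool() {
    pool=${1:-pbs-vg/pbs-pool}
    lvs --noheadings -o data_percent,metadata_percent "$pool" |
    while read -r data meta; do
        echo "data: $(classify_pool_usage "$data") ($data%)"
        echo "metadata: $(classify_pool_usage "$meta") ($meta%)"
    done
}

# Usage (as root): check_thin_pool pbs-vg/pbs-pool
```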
ZFS Backend
ZFS is what the datastore setup guide recommends, and it is what every production deployment of any size should run on. The reasons are not subtle.
Pros. Block-level checksums on every read mean silent corruption is detected immediately, not months later. Native LZ4 or ZSTD compression typically saves another 15 to 30 percent on top of PBS's own ZSTD compression on chunks (text-heavy workloads see more). Scrubs catch bit rot before it propagates. Native snapshots are atomic and instant. ZFS send/recv is the cleanest path for offsite replication outside of PBS's own sync jobs. If a disk fails in a mirror or RAIDZ vdev, the pool keeps running and tells you exactly what to replace.
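Recommended tunables. PBS chunks target 4 MiB by default, so the right ZFS recordsize is 1M or higher to avoid write amplification. LZ4 compression has near-zero CPU cost and pays for itself. Implicit atime updates on every read are wasted IOPS for a backup datastore. PBS garbage collection sets access times explicitly, which should keep working with atime=off, but confirm that a garbage collection run completes cleanly after changing the property.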
# Create a mirrored pool on two NVMe devices (or RAIDZ2 for spinning disks)
zpool create -o ashift=12 pbs-pool mirror /dev/nvme0n1 /dev/nvme1n1
# Create the dataset with PBS-friendly properties
zfs create \
-o recordsize=1M \
-o compression=lz4 \
-o atime=off \
-o xattr=sa \
-o dnodesize=auto \
pbs-pool/datastore
# Hand the mount point to PBS
# The datastore path is passed as the second positional argument
proxmox-backup-manager datastore create main /pbs-pool/datastore \
--comment "ZFS-backed datastore"
# Schedule a weekly scrub (Sundays at 03:00)
echo "0 3 * * 0 root /usr/sbin/zpool scrub pbs-pool" > /etc/cron.d/zfs-scrub
Do not enable ZFS deduplication
ZFS dedup needs roughly 5 GB of RAM per TB of stored data and adds significant write latency. PBS already deduplicates chunks at the application layer, so ZFS dedup is duplicating work for almost no gain. Leave it off unless you have explicitly sized RAM for it and verified the workload benefits.
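For a sense of scale, the rule of thumb above turns into a one-line estimate. This is a back-of-envelope sketch only: `dedup_ram_gb` is a made-up helper, and real dedup table size depends on block size and how unique the data is.

```shell
#!/bin/sh
# Estimate RAM for ZFS dedup from the ~5 GB per TB rule of thumb.
# Arguments: stored data in TB, optional GB-per-TB factor (default 5).
dedup_ram_gb() {
    stored_tb=$1
    gb_per_tb=${2:-5}
    echo $(( stored_tb * gb_per_tb ))
}

dedup_ram_gb 50    # a 50 TB datastore -> 250 (GB of RAM)

# To confirm dedup is off on the dataset from the example above:
#   zfs get -H -o value dedup pbs-pool/datastore
```

A quarter terabyte of RAM for a 50 TB datastore is the kind of number that settles the question for most deployments.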
For spinning disks, swap the mirror keyword for raidz2 and use at least 6 disks. For all-flash, mirrored vdevs scale better than RAIDZ. The PBS performance tuning post goes deeper on vdev geometry and ARC sizing.
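As a sketch of that RAIDZ2 variant, the helper below builds the `zpool create` command for review instead of running it, and refuses to do so with fewer than six disks. The function name and the `/dev/sdb` to `/dev/sdg` device names are placeholders; on real hardware you would normally use stable `/dev/disk/by-id/` paths.

```shell
#!/bin/sh
# Print (not run) a zpool create command for a RAIDZ2 pool.
# Arguments: pool name, then the member disks.
raidz2_create_cmd() {
    pool=$1
    shift
    if [ "$#" -lt 6 ]; then
        echo "need at least 6 disks for raidz2, got $#" >&2
        return 1
    fi
    # Same ashift as the mirror example; only the vdev type changes.
    echo "zpool create -o ashift=12 $pool raidz2 $*"
}

raidz2_create_cmd pbs-pool /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg
```

Printing the command first gives you a chance to double-check the device list before anything is written to disk.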
Performance and Capacity in Practice
The headline question is which backend is fastest. The honest answer is that backend choice rarely dominates throughput. Network and disk hardware do. What backend choice changes is the floor under bad conditions and the ceiling on operational confidence.
Relative Performance and Resource Profile
| Metric | Directory | LVM-Thin | ZFS |
|---|---|---|---|
| Sequential write throughput | Baseline | +5-15% | +10-25% (with LZ4) |
| Effective data reduction (typical) | 2-4x (PBS only) | 2-4x (PBS only) | 2.5-5x (PBS + ZFS LZ4) |
| CPU overhead | Low | Low | Medium |
| RAM overhead | Low | Low-Medium | Medium-High (ARC) |
| Restore reliability rating | Filesystem-dependent | Good (if pool healthy) | Excellent |
| Garbage collection speed | Slow (inode-bound) | Medium | Fast |
A few observations from running these in production.
ZFS reads benefit from ARC caching, which makes verification jobs noticeably faster after the first run. Directory and LVM-Thin lean on the page cache, which is fine but less tuned for the access patterns PBS produces.
LZ4 compression on ZFS typically adds 5 to 10 percent to write throughput, not subtracts. Less data hitting the disk wins against the CPU cost. ZSTD compresses better but costs more CPU. LZ4 is the right default for PBS unless you are storage-bound and CPU-rich.
Restore reliability is where ZFS pulls clearly ahead. A scrub catches silent corruption before a restore needs it. On Directory or LVM-Thin you find out about corruption when the restore fails.
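A scrub only helps if someone looks at the result. The sketch below reads the output of `zpool status -x`, which reports only pools with problems, and reduces it to a single word for alerting. The helper name `pool_health` is made up, and the pool name `pbs-pool` is taken from the example earlier in this post.

```shell
#!/bin/sh
# Reduce `zpool status -x` output (read from stdin) to one word.
# "healthy" means ZFS has nothing to report; anything else needs a look.
pool_health() {
    if grep -Eq "all pools are healthy|is healthy"; then
        echo healthy
    else
        echo attention
    fi
}

# Usage:
#   zpool status -x pbs-pool | pool_health
# Example with canned output:
echo "all pools are healthy" | pool_health
```

When the result is "attention", the full `zpool status -v` output lists the affected devices and any files with unrecoverable errors.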
Which One to Pick
The decision is rarely about benchmarks. It is about what you can afford to lose and how fast you need to be back online.
Multi-client MSP or hosting production: ZFS. Checksums and scrubs are the difference between "we caught it and the sync target is clean" and "the client is asking why their restore is half-corrupted." If you charge SLAs against this storage, run ZFS. The capacity planning guide assumes ZFS for sizing math.
Existing LVM shop that needs PBS today: LVM-Thin. Stand it up, get backups flowing, monitor pool usage aggressively, and plan a migration window to ZFS within a quarter. Treat LVM-Thin as a bridge.
Lab, dev, single-machine homelab with short retention: Directory. No prerequisites, no rebuild required if you change your mind. Run it on top of mdraid or a single SSD and accept that you are trusting the filesystem.
Anything touching production data outside those carve-outs: ZFS. The operational gap between ZFS and the alternatives only widens at scale. The decision gets easier the more data you have under management. Pair it with security hardening and the deployment stops being a liability.
Wrapping Up
PBS is opinionated about chunks and indifferent about the storage underneath. That makes the backend choice a pure operations decision, not a feature decision. Directory works. LVM-Thin works faster. ZFS works and tells you when something is wrong before it becomes a restore failure. Pick the backend that matches the consequences of the data you are storing.


