
PBS Encryption Internals: Server-Blind Backups

Two backup products can both advertise "AES-256" and mean entirely different things. One stores your data encrypted on disk and decrypts it back into server memory whenever a job runs. The other never sees your data at all. Proxmox Backup Server does the second thing, and once you understand how, the operational consequences (subpoenas, insiders, storage breaches) stop feeling like marketing slogans.

Key Takeaways
  • PBS encryption happens on the client, before chunks ever leave the host.
  • Each chunk is encrypted with AES-256-GCM. The server only sees ciphertext, auth tag, and nonce.
  • The keyfile lives on the client (for proxmox-backup-client the default is ~/.config/proxmox-backup/encryption-key.json). Lose it and the backups are unrecoverable.
  • Chunk IDs are HMAC-SHA-256 over plaintext using a keyring-derived key, so dedup still works inside one keyring.
  • An optional RSA master key gives you an escrow path without spreading the daily key across machines.
  • On any modern x86 CPU with AES-NI, encryption overhead is typically within a few percent. The network is almost always the bottleneck first.

"Encrypted at Rest" Is Not What You Think

There are two encryption models in storage, and conflating them causes most of the confusion in backup conversations.

The first is server-side encryption at rest. The server holds the keys. When data arrives, it gets encrypted before being written to disk. When data is read, the server decrypts it. Anyone with operational access to that server has, by definition, access to the plaintext: backup software, monitoring agents, anyone with sudo, and (when legal process applies) the operator of that machine. Disk theft is mitigated. A subpoena to the provider, a rogue insider, and a hostile process running on the box are not.

The second is client-side encryption. Plaintext is encrypted on the source host, before it crosses any network boundary. The server stores opaque ciphertext. There is no decryption path on the server side because there is no key on the server side. Disk theft is mitigated. So is the subpoena scenario. So is the insider with disk access.

Proxmox Backup Server does the second model. The custom protocol covered in part 1 of this series was designed around this assumption: the server negotiates which chunks it already has and accepts new ones blindly. It does not need to read them, ever.

The marketing term and the technical term

"Zero-knowledge hosting" is the phrase the marketing pages reach for. The technically accurate name is "client-side authenticated encryption with server-blind storage". They mean the same thing in this context.

The PBS Encryption Pipeline

Walk through what happens to a single chunk from the moment the backup client reads it off the source disk.

  1. The client reads bytes from the source (VM disk, file archive, whatever the source plugin produces).
  2. The content-defined chunker (covered in part 2) splits the stream into variable-sized chunks averaging around 4 MiB. This applies to file archives; VM disk images are cut into fixed-size 4 MiB chunks instead.
  3. For each chunk, the client computes an HMAC-SHA-256 over the plaintext using a key derived from the active keyring. The 32-byte output becomes that chunk's identity.
  4. The client encrypts the chunk with AES-256-GCM using a data key from the keyring. GCM produces ciphertext plus a 16-byte authentication tag. A 12-byte nonce is generated for this chunk. GCM's security collapses if a nonce ever repeats under the same key. PBS uses a fresh random 96-bit nonce per chunk, which gives a collision probability negligible at any realistic backup volume.
  5. The client uploads the bundle (nonce, ciphertext, auth tag) to the server, keyed by the HMAC-derived chunk ID.
  6. The server stores the bundle in .chunks/ and updates its index. It does not, and cannot, look inside.

What the server keeps on disk is an opaque blob plus the GCM auth tag and nonce. Tampering with even one byte invalidates the tag, so any modification is detectable when the client later decrypts.

rust
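// `server` is passed in explicitly; Keyring, Server, Bundle and ChunkRef are placeholder types.
fn store_chunk(plaintext: &[u8], keyring: &Keyring, server: &mut Server) -> ChunkRef {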
    // 1. Chunk identity is HMAC over plaintext, not ciphertext.
    //    Same plaintext + same keyring -> same chunk_id -> dedup.
    let chunk_id = hmac_sha256(&keyring.id_key, plaintext);

    // 2. Fresh nonce per chunk. AES-256-GCM auth tag covers the ciphertext.
    let nonce = random_nonce_12();
    let (ciphertext, auth_tag) = aes_256_gcm_encrypt(
        &keyring.data_key,
        &nonce,
        plaintext,
    );

    // 3. Server receives the bundle. It never sees plaintext.
    server.put(chunk_id, Bundle { nonce, ciphertext, auth_tag });

    ChunkRef { id: chunk_id, len: plaintext.len() }
}
Per-chunk pipeline (simplified pseudo-code)

No additional authenticated data is mixed in at this layer; the encrypted blob plus the chunk's content-derived ID is what gets stored.
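
The "negligible collision probability" claim in step 4 can be put into numbers with the standard birthday-bound approximation. The sketch below is illustrative only: the function names are made up for this post, the 96-bit width is taken at face value from the description above, and the nonce width is a parameter so the same arithmetic applies if your PBS version's blob format uses a different IV size.

```rust
/// Birthday-bound approximation for the chance that at least two of
/// `chunks` independently random nonces of `nonce_bits` bits collide:
/// p ≈ n^2 / 2^(bits + 1). Only meaningful while the result is far below 1.
fn nonce_collision_probability(chunks: f64, nonce_bits: i32) -> f64 {
    chunks * chunks / 2f64.powi(nonce_bits + 1)
}

/// Number of chunks produced by `bytes` of data at an assumed
/// average chunk size (4 MiB in the pipeline described above).
fn chunk_count(bytes: f64, avg_chunk_bytes: f64) -> f64 {
    (bytes / avg_chunk_bytes).ceil()
}
```

Under these assumptions, 1 PiB of chunk data at a 4 MiB average is 2^28 chunks, and `nonce_collision_probability(chunk_count(2f64.powi(50), 2f64.powi(22)), 96)` comes out at 2^-41, roughly 4.5e-13. For comparison, NIST SP 800-38D caps randomly generated GCM IVs at 2^32 encryptions per key, which at 4 MiB per chunk corresponds to about 16 PiB under a single key.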

Why HMAC-Then-Encrypt for the Chunk ID

The chunk ID design is subtle and worth a paragraph of its own, because the obvious approach is wrong.

Naive design: hash the ciphertext, use that as the chunk ID. This breaks deduplication entirely. AES-GCM uses a fresh nonce per encryption, so the same plaintext encrypted twice produces different ciphertext, which hashes to different IDs. Every backup would upload everything again.

PBS approach: HMAC the plaintext using a key derived from the keyring. The same plaintext under the same keyring always produces the same chunk ID, so the content-addressed storage layer covered in part 3 still works. Two different keyrings produce different HMAC keys, and therefore different chunk IDs for the same plaintext. So clients with different keyrings cannot deduplicate against each other.

That last point is deliberate. Cross-tenant deduplication with encryption is a side channel: an attacker who suspects a victim has a particular file can upload that file themselves and watch whether the server short-circuits the upload. The HMAC-keyed design closes that channel, because the attacker's keyring yields a different chunk ID for the same plaintext.
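
The dedup behaviour described above can be modelled in a few lines. This is a toy sketch, not PBS code: `chunk_id` and `ChunkStore` are hypothetical names, and Rust's `DefaultHasher` is used purely as a non-cryptographic stand-in for the keyed HMAC-SHA-256 so that the example needs nothing beyond the standard library.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::Hasher;

/// Toy keyed chunk ID. ASSUMPTION: DefaultHasher is only a stand-in for
/// HMAC-SHA-256; it is not cryptographically secure and yields 64 bits.
fn chunk_id(id_key: &[u8], plaintext: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    h.write(id_key);
    h.write(plaintext);
    h.finish()
}

/// Hypothetical server-side store. It only ever sees opaque IDs.
#[derive(Default)]
struct ChunkStore {
    known: HashSet<u64>,
}

impl ChunkStore {
    /// Returns true if the chunk is new and must be uploaded,
    /// false if the server already holds it (deduplicated).
    fn put(&mut self, id: u64) -> bool {
        self.known.insert(id)
    }
}
```

Storing the same plaintext twice under one key makes the second `put` return false (deduplicated), while the same plaintext under a second key is treated as a brand-new chunk. That second outcome is what denies an attacker the "did the upload get skipped?" signal.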

The Keyfile Is the Backup

The keyfile is a small JSON document containing the key material and metadata. For proxmox-backup-client the default location is ~/.config/proxmox-backup/encryption-key.json on the client. It is optionally wrapped with a passphrase through a key-derivation function (scrypt by default, with PBKDF2 available as an alternative).

The crucial property: the keyfile is the data. Anyone with that file (and the passphrase, if set) can decrypt every backup ever made with it. Conversely, without it, the backups are random bytes. There is no Proxmox-side recovery path for keyring-only deployments. (The optional master-key model below provides a self-managed escrow path, but it is yours, not ours.) There is no support ticket. There is no key escrow at remote-backups.com or at Proxmox GmbH. We deliberately do not have a way to recover your data, because if we did, so would anyone who served us a subpoena.

The single most important operational task

Backing up the keyfile is the single most important operational task in any Proxmox Backup Server deployment that uses encryption. Losing the keyfile means total, permanent, unrecoverable data loss. No exceptions, no recovery, no workaround.

How to Actually Back Up the Keyfile

Standard backup advice does not apply here, because the keyfile must survive the loss of the systems it protects. A copy that exists only on the host whose backups depend on it disappears together with that host.

A workable storage pattern looks like this:

  • Print the keyfile as a paper QR code with proxmox-backup-client key paperkey (see proxmox-backup-client key help for the current flag set). Store at least two paper copies in geographically separated safes.
  • Keep a password-protected digital copy on an offline encrypted USB drive in a third location. Update it only when you rotate the key.
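  • If the organisation has a password manager, secrets vault, or HSM-backed key service (1Password, Bitwarden, Vaultwarden, AWS KMS, on-prem HSM), store an encrypted copy there too.
  • Do not keep the only copy of the keyfile on the machine that runs the backup client. The client needs its working copy, but the recovery copies must live elsewhere.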
  • Do not email it to yourself. Do not paste it into chat. Do not commit it to a private repo.

Conceptually the recovery flow has two halves: generate the paperkey from your active keyfile, then prove you can reconstruct the keyfile from the paper before you trust it. Use proxmox-backup-client key help to confirm the exact subcommands and flags for your installed version, since these have shifted across releases. The operational point is unchanged: the paperkey IS your recovery path. If you cannot rebuild the keyfile from it on a scratch host, you do not have a backup of your key.

bash
proxmox-backup-client key help
Check the available key subcommands on your installation

The Master Key: Escrow Without Sharing the Daily Key

The keyfile-only model has an operational sharp edge: every backup host needs the keyfile, and the keyfile is also your recovery key. Spread it across twenty hosts and you have twenty places it can leak.

The master key model gives you a way out. You generate an RSA keypair. The public half goes onto every backup client. The private half goes into long-term cold storage (offline vault, HSM, paper in a safe deposit box). When the client generates a data key for a backup, it also wraps a copy of that data key with the master public key and stores the wrapped copy alongside the ciphertext on the server.

Day to day, restores use the regular keyfile on the client. The master private key never touches a connected machine. If the daily keyfile is lost or the client is destroyed, you retrieve the master private key from cold storage, unwrap a data key from the wrapped copy on the server, and decrypt. The master private key is your last-resort recovery secret, and it has never been online.

Keyring-only vs keyring with master key escrow
| Property | Keyring only | Keyring + master key |
|---|---|---|
| Recovery if client keyfile is lost | Impossible | Possible via master private key |
| Recovery if a client is fully compromised | Backups from that client are recoverable only via that keyfile | Rotate the daily key, master key remains safe offline |
| Operational complexity | Low | Medium (cold storage discipline) |
| Blast radius of one leaked keyfile | All backups under that keyring | Limited to one daily key window |
| Useful for compliance evidence | Yes, simpler story | Yes, stronger separation-of-duties story |

The trade-off is real. The master key model adds an extra secret to protect, and the cold-storage process becomes part of your DR plan. For homelabs, keyring-only with two paper copies of the keyfile is usually enough. For multi-host fleets and anything touching SOC 2 or ISO 27001 controls, the master key model pays for itself the first time a host gets reimaged.

What the Server Actually Stores

Look at .chunks/ on an unencrypted datastore and you will find zstd-compressed plaintext. Decompress a chunk, run file on the result, and you might recognise it as a tar fragment or a disk image fragment. Run strings and you may pull readable text.

Look at .chunks/ on an encrypted datastore and you find AES-256-GCM ciphertext with a nonce and auth tag prefix. file reports "data". strings produces nothing. There is no structural information left, because GCM encrypts in CTR mode and authenticates with GHASH, so the ciphertext is statistically indistinguishable from random to anyone without the key.
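
One rough way to sanity-check the "indistinguishable from random" claim on sampled chunk files is to measure byte entropy. The function below is a generic Shannon-entropy sketch, not a PBS tool, and `shannon_entropy` is a name chosen for this post. Treat the result as supporting evidence only: well-compressed unencrypted data also scores close to 8 bits per byte, and small samples read lower than the true value.

```rust
/// Shannon entropy of a byte buffer in bits per byte (0.0 to 8.0).
/// Ciphertext should sit very close to 8.0 on a reasonably large sample.
fn shannon_entropy(data: &[u8]) -> f64 {
    if data.is_empty() {
        return 0.0;
    }
    // Histogram of byte values.
    let mut counts = [0u64; 256];
    for &b in data {
        counts[b as usize] += 1;
    }
    let len = data.len() as f64;
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / len;
            -p * p.log2()
        })
        .sum()
}
```

A buffer of identical bytes scores 0.0, a buffer split evenly between two byte values scores 1.0, and a buffer in which all 256 byte values appear equally often scores 8.0.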

Server-blind is not metadata-blind

What the server still sees: chunk counts, chunk sizes, upload timing, and which chunks repeat across snapshots. An attacker with full datastore access can infer change rates and dataset structure from this metadata, even with strong encryption. Server-blind means content-blind, not metadata-blind.
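
To make the metadata point concrete, here is a minimal sketch of what an observer with datastore access could compute from two snapshot indexes without any key. The function name and the idea of passing the indexes as plain slices of 32-byte IDs are assumptions for illustration; the real index file formats are not parsed here.

```rust
use std::collections::HashSet;

/// Fraction of chunks referenced by `current` that were not referenced by
/// `previous`. Inputs are the opaque chunk IDs listed in two snapshots.
fn change_rate(previous: &[[u8; 32]], current: &[[u8; 32]]) -> f64 {
    if current.is_empty() {
        return 0.0;
    }
    let seen: HashSet<&[u8; 32]> = previous.iter().collect();
    let new_chunks = current.iter().filter(|id| !seen.contains(*id)).count();
    new_chunks as f64 / current.len() as f64
}
```

If Tuesday's snapshot shares three of its four chunk IDs with Monday's, the observer learns that about a quarter of the dataset changed overnight, and none of it required decrypting anything.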

Verify jobs still run, but their meaning shifts. A verify job recomputes a checksum over each chunk's on-disk content and compares it with the stored value. That detects bit rot, silent corruption, and tampering. It does not, and cannot, prove that your current keyring still successfully decrypts the chunk back to its original plaintext. The server has no plaintext to compare against.

Verify is not a restore test

A green verify result on encrypted chunks proves the chunks have not been corrupted on disk. It does not prove your keyfile still works, that your backup client still has access to it, or that a restore would succeed end-to-end. Pair verify jobs with periodic restore drills as described in our restore testing and DR drills post. With encryption, verify alone is not enough.

Performance: How Much Does Encryption Cost?

The honest answer is "almost nothing on any modern x86 CPU, and possibly a lot on small ARM boards".

In our testing on Xeon and EPYC systems pushing to a 10 GbE PBS target, encrypted backups typically run within a few percent of unencrypted throughput (we have seen roughly 3-5% on representative workloads). AES-NI (Intel's hardware AES instruction set, mirrored by AMD) makes AES-256-GCM essentially free at the speeds a backup client typically pushes. Almost every x86 CPU shipped in the last decade has AES-NI.

On older or smaller chips, the picture changes. ARM cores without the optional crypto extensions (some Raspberry Pi generations, some low-end NAS boards) can show double-digit percentage overhead for AES-GCM. That can matter if you are pushing a large initial seed. Once you are doing incremental backups of changed chunks only, even a slow CPU keeps up with most home internet uplinks.

Check whether your client CPU has hardware AES acceleration:

bash
# x86: look for 'aes' as a whole word on the flags line
# (a bare substring match would also hit 'vaes' or the ARM Features line)
grep -m1 '^flags' /proc/cpuinfo | grep -qw aes && echo "AES-NI present"

# ARM: look for 'aes' in 'Features' (requires ARMv8 crypto extensions)
grep -m1 '^Features' /proc/cpuinfo | grep -qw aes && echo "ARMv8 AES present"

# OpenSSL benchmark, useful real-world reference
openssl speed -evp aes-256-gcm
Verify AES-NI on the backup client

For most deployments the real ceiling is the network, not the cipher. A 1 Gbit uplink saturates at around 110 MB/s. Modern CPUs encrypt AES-256-GCM at multiple GB/s per core. The cipher is not the bottleneck.
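
The arithmetic behind that statement is short enough to write down. The helpers below are a back-of-the-envelope sketch with made-up names; the 0.9 protocol efficiency and the 2000 MB/s cipher rate used in the example afterwards are assumptions, so substitute the figure from your own openssl speed run.

```rust
/// Convert a link speed in Gbit/s to payload MB/s (decimal megabytes),
/// given an assumed protocol efficiency between 0.0 and 1.0.
fn link_mb_per_s(gbit_per_s: f64, efficiency: f64) -> f64 {
    gbit_per_s * 1000.0 / 8.0 * efficiency
}

/// Effective backup throughput is capped by the slower of the two stages.
fn effective_mb_per_s(link: f64, cipher: f64) -> f64 {
    link.min(cipher)
}

/// Seconds needed to push `gigabytes` of new chunks at `mb_per_s`.
fn transfer_seconds(gigabytes: f64, mb_per_s: f64) -> f64 {
    gigabytes * 1000.0 / mb_per_s
}
```

With `link_mb_per_s(1.0, 0.9)` the link delivers about 112.5 MB/s, so against an assumed 2000 MB/s cipher rate the link is the cap, and a 500 GB initial seed takes roughly 4,400 seconds (about 74 minutes) whether or not encryption is enabled.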

How This Plays Out in Real Incidents

The point of all of this becomes concrete in three scenarios.

The subpoena. A hosting provider receives a legal order to produce a customer's data. With server-side at-rest encryption, the provider holds the keys and produces plaintext. With Proxmox Backup Server's client-side encryption, the provider produces ciphertext and metadata; the keys never existed on provider infrastructure, so there is nothing further to compel.

The insider threat. A datacenter employee with disk access copies a set of backup chunks to a USB drive. With at-rest encryption, the keys live on the same systems they have access to. With client-side encryption, they have a pile of ciphertext that is useless without a keyfile that never touched the provider's infrastructure.

The storage breach. An attacker exfiltrates raw disks or a snapshot of the datastore. At-rest encryption: they also need to compromise the key management system. Client-side encryption: ciphertext is the only thing on the disk, and the keyring is somewhere else entirely.

The trade-off is the one already mentioned: the customer is solely responsible for the keyfile. There is no password reset. There is no "I lost my key, can you help" path. That responsibility is the whole point. If a provider could help, so could anyone who compelled the provider. For deeper coverage of the threat model and operator workflow, see the dedicated client-side encryption post, and pair it with the ransomware protection patterns write-up if attacker resilience is on your radar.

Wrapping Up the Series

This is the fourth and final post in the under-the-hood series. Step back and the pieces fit together cleanly. The protocol from part 1 exists to make chunk negotiation cheap, so the client and server can agree on what is missing in a single round of Known and Upload messages. The content-defined chunker from part 2 produces stable boundaries that survive insertions and deletions, which is what makes deduplication actually save space rather than just look like it does on a brochure. The content-addressed chunk store from part 3 lets the server identify duplicates without parsing anything, which is exactly the property you need when the server is not allowed to read the data. And the encryption layer covered here turns the server into a blind sink: it can verify, count, and compact, but it cannot read.

Each design choice supports the others. Server-blind storage is only practical because dedup works on encrypted chunks. Dedup on encrypted chunks is only safe because chunk IDs are HMACs over plaintext rather than hashes over ciphertext. The HMAC scheme is only meaningful because chunking is content-defined. And content-defined chunking is only useful because the protocol exposes per-chunk existence checks. Pull one and the rest stops making sense.

That coherence is why Proxmox Backup Server feels different to operate than a generic backup tool with an encryption checkbox. The encryption was not bolted on. It was the constraint the whole system was designed around.

Looking for a Proxmox Backup Server target that is genuinely server-blind?

remote-backups.com runs EU-hosted PBS targets with isolated credentials and zero key access on our side. Your keyfile stays with you. Always.


Frequently Asked Questions

Can I change the encryption key for existing backups?

Not for existing backups. Chunk IDs are derived from the plaintext using the current keyring, so a new keyring produces new chunk IDs and a new datastore lineage. You can start a fresh backup chain under a new key, keep the old chain readable with the old keyfile, and retire the old chain on its own schedule. There is no in-place re-keying because that would require the server to decrypt and re-encrypt, which is exactly what the design prevents.

What happens if I forget the keyfile passphrase?

The same thing as losing the keyfile. The passphrase wraps the key material through a key-derivation function, and there is no recovery hatch. If you used the master key escrow model, the master private key still decrypts everything. If you did not, the data is gone. This is also why we recommend storing the unwrapped keyfile (or its paperkey) in a safe alongside any password manager copy.

Does encryption hurt deduplication?

Not within a single keyring. Same plaintext, same keyring, same chunk ID. The ratios from part 3 of this series apply unchanged. Across different keyrings, dedup does not happen, which is the intended behaviour. Plan capacity per keyring, not per datastore.

Can I demonstrate to an auditor that the server cannot read my backups?

Yes, with a combination of evidence. Show the auditor that no encryption key is configured on the server side, that chunk files on disk are random when sampled, that verify jobs operate only on stored hashes, and that the keyfile lives exclusively on the client. The master key escrow model strengthens this story by adding a separation-of-duties artefact (private key offline, public key on clients only).

Is the master key required?

Optional. Many deployments run keyring-only with paper backups of the keyfile and never miss the master key. The master key model becomes worth the operational overhead when you have multiple clients, audit requirements, or a higher-than-usual risk of a client being compromised or reimaged without warning.
Bennet Gallein

remote-backups.com operator

Infrastructure enthusiast and founder of remote-backups.com. I build and operate reliable backup infrastructure powered by Proxmox Backup Server, so you can focus on what matters most: your data staying safe.