
PBS Pruning and Garbage Collection: Keep Your Datastore Lean

  • February 17, 2026
  • 11 min read
Bennet Gallein
remote-backups.com operator

Proxmox Backup Server datastores grow. Every backup job creates new snapshots, and every snapshot references chunks on disk. Without active maintenance, your datastore accumulates snapshots you no longer need and chunks that no snapshot references anymore. Storage fills up, performance degrades, and if you're paying for offsite or cloud-backed storage, costs climb.

Key Takeaways
  • Pruning removes snapshot manifests based on retention rules; garbage collection reclaims the actual disk space
  • Always run in order: prune → backup → verify → garbage collect
  • GC has a 24-hour safety window — never run it while backup or sync jobs are active
  • Weekly GC is sufficient for most deployments; daily GC adds overhead with minimal benefit
  • Use the PBS prune simulator to preview which snapshots would be kept/removed before applying changes

PBS gives you two tools to manage this: pruning removes old snapshots based on retention rules, and garbage collection reclaims disk space by deleting orphaned chunks. They solve different problems and must run in the right order. This post explains both mechanisms, walks through practical retention policies, and lays out a scheduling strategy that keeps your datastore lean without risking data loss.

Pruning Explained

Pruning decides which snapshots to keep and which to delete. It operates on snapshot metadata only. When a snapshot gets pruned, PBS removes its manifest (the index of chunk references), but it does not immediately free disk space. The chunks that snapshot referenced may still be shared with other snapshots through deduplication. Disk space recovery happens later, during garbage collection.

Retention Rules

PBS offers six retention parameters. Each one defines how many snapshots to keep for a given time bucket:

  • keep-last N: Keep the N most recent snapshots regardless of timing.
  • keep-hourly N: Keep the most recent snapshot from each of the last N hours (relevant only if you back up more than once per day).
  • keep-daily N: Keep the most recent snapshot from each of the last N calendar days.
  • keep-weekly N: Keep the most recent snapshot from each of the last N calendar weeks.
  • keep-monthly N: Keep the most recent snapshot from each of the last N calendar months.
  • keep-yearly N: Keep the most recent snapshot from each of the last N calendar years.
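To make the bucket logic concrete, here is a minimal standalone sketch of the keep-daily rule. This is an illustration, not PBS internals: given snapshot timestamps sorted newest first, it keeps the most recent snapshot from each of the last N distinct calendar days.

```shell
#!/bin/bash
# Illustration of keep-daily (not PBS code): from a newest-first list of
# ISO timestamps, keep the newest snapshot of each of the last N days.
keep_daily() {
  awk -v n="$1" '{
    day = substr($0, 1, 10)                      # YYYY-MM-DD prefix
    if (!(day in seen) && kept < n) { seen[day] = 1; kept++; print }
  }'
}

printf '%s\n' \
  "2026-02-17T22:00:00Z" \
  "2026-02-17T10:00:00Z" \
  "2026-02-16T22:00:00Z" \
  "2026-02-15T22:00:00Z" | keep_daily 2
# prints the newest snapshot from each of the two most recent days:
#   2026-02-17T22:00:00Z
#   2026-02-16T22:00:00Z
```

Note that the two snapshots from 2026-02-17 occupy a single daily slot: only the most recent one per day counts.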

Evaluation Order

PBS evaluates these rules in a fixed order: keep-last, then keep-hourly, keep-daily, keep-weekly, keep-monthly, and keep-yearly. A snapshot that satisfies an earlier rule is excluded from evaluation by later rules. This means overlap is possible and intentional: the same snapshot can count toward keep-last and keep-daily simultaneously.

If you set keep-last 3 and keep-daily 7, and your three most recent snapshots happen to fall on distinct days, those same snapshots also satisfy three of your seven daily slots. The practical result is fewer total snapshots than you might expect from adding the numbers together.

Retention Policy Examples

Here are three common retention configurations and their approximate snapshot counts, assuming one backup per day:

Retention Policy Comparison
(Homelab: small setup, short history · Production: business-critical VMs · Compliance: long-term archival needs)

Parameter                     Homelab   Production   Compliance
keep-last                     3         14           7
keep-daily                    7         14           30
keep-weekly                   4         8            12
keep-monthly                  2         12           24
keep-yearly                   -         -            3
Approx. snapshots retained    ~14       ~30          ~55
Approximate counts assume one backup per day and account for overlap between rules. Actual counts vary depending on backup frequency and timing.

Homelab example: keep-last 3, keep-daily 7, keep-weekly 4, keep-monthly 2 covers a week of daily granularity plus two months of coarser history. This is enough to recover from most accidental deletions and gives you a reasonable window to detect problems.

Production example: keep-last 14, keep-daily 14, keep-weekly 8, keep-monthly 12 gives two weeks of daily rollback, two months of weekly checkpoints, and a full year of monthly snapshots. This balances storage cost against the longer detection windows that business environments require.

Namespace Scoping

In multi-tenant or organized setups, prune jobs can target specific namespaces using the --ns flag. This lets you apply different retention policies to different groups of VMs within the same datastore. A production namespace might keep 12 monthly snapshots while a development namespace keeps only 2.

Garbage Collection Explained

Pruning removes snapshot manifests but leaves chunks on disk. Many of those chunks are still referenced by other snapshots through deduplication. Garbage collection (GC) identifies which chunks are truly orphaned and deletes them. This is the step that actually frees disk space.

Two Phases

GC runs in two distinct phases:

Mark phase (read-only): PBS scans every snapshot manifest in the datastore and marks each referenced chunk. This phase is safe; it only reads data. It builds a complete picture of which chunks are still needed.

Sweep phase (deletes): PBS walks the chunk store and deletes every chunk that was not marked in the previous phase. This is the phase that frees disk space and the one that requires exclusive access.

Never run GC while backups or sync jobs are active

If a backup job is writing new chunks during the sweep phase, GC might delete chunks that the in-progress backup has not yet registered in its manifest. This can corrupt the new backup. Always schedule GC during a window when no backup or sync jobs are running.

The atime Safety Window

PBS uses a safety mechanism based on chunk access times. Chunks created or accessed within the last 24 hours and 5 minutes are never deleted during sweep, even if they appear unreferenced. This protects against race conditions where a backup job has written chunks but has not yet finalized its manifest.

In practice, this means you should allow at least 24 hours between your last backup or prune operation and a GC run. The safety window handles edge cases, but scheduling with this buffer removes any ambiguity.
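In scripted environments you can add an explicit guard on top of the safety window. The sketch below is an assumption-laden example: the keywords matched against the task listing depend on your PBS version's `task list` output format, so adapt the pattern before relying on it; `garbage-collection start` is the manual GC trigger.

```shell
#!/bin/bash
# Sketch: skip GC if the task listing mentions active backup or sync
# workers. The matched keywords are an assumption; check them against
# the real `proxmox-backup-manager task list` output on your version.
tasks_running() {
  grep -Eq 'backup|sync' <<<"$1"
}

# Intended usage (requires a PBS host):
#   active="$(proxmox-backup-manager task list)"
#   if tasks_running "$active"; then
#     echo "active tasks found, skipping GC"
#   else
#     proxmox-backup-manager garbage-collection start main-datastore
#   fi

# Demonstration with sample listings:
tasks_running "backup worker vm/100 running" && echo "would skip GC"
tasks_running "no active tasks" || echo "safe to start GC"
```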

Performance Characteristics

GC duration scales with the number of chunks in the datastore, not with the total data size. A 10TB datastore with high deduplication (fewer unique chunks) runs GC faster than a 2TB datastore with minimal deduplication (many unique chunks). Both phases are I/O-bound: the mark phase reads every manifest, and the sweep phase deletes individual chunk files. On large datastores with millions of chunks, GC can take hours. This is normal and expected.

Scheduling Best Practices

The order of operations matters. The recommended sequence is:

  1. Prune old snapshots
  2. Backup to create new snapshots
  3. Verify backup integrity
  4. Garbage collect to reclaim space

Pruning first ensures that old snapshots are marked for removal before new backups add more data. Verification runs after backups complete to confirm the new data is intact. GC runs last, well after all write operations finish.
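The four steps can be sketched as a single maintenance script. Treat this as an outline under assumptions, not a drop-in implementation: the repository, backup group, and datastore names are placeholders, and the prune and verify steps would normally be scheduled jobs rather than inline commands.

```shell
#!/bin/bash
# Sketch of the prune -> backup -> verify -> GC sequence.
# Repository, group, and datastore names below are placeholders.
set -euo pipefail
REPO="root@pam@pbs.example.com:main-datastore"

# 1. Prune old snapshots according to the retention policy
proxmox-backup-client prune vm/100 --repository "$REPO" \
  --keep-last 3 --keep-daily 7 --keep-weekly 4

# 2. Create the new backup
proxmox-backup-client backup root.pxar:/ --repository "$REPO"

# 3. Verify datastore integrity (run on the PBS host)
proxmox-backup-manager verify main-datastore

# 4. Reclaim space (run on the PBS host, well after all writes finish)
proxmox-backup-manager garbage-collection start main-datastore
```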

Here is a practical weekly schedule:

Recommended Maintenance Schedule

Task                  Schedule    Frequency
Prune                 21:30       Daily
Backup                22:00       Daily
Verify                Sun 01:00   Weekly
Garbage Collection    Sun 03:00   Weekly

Why not daily GC? Garbage collection is CPU and I/O intensive, and running it daily provides minimal benefit. Orphaned chunks from daily pruning accumulate slowly. A weekly GC run reclaims the same total space with less operational overhead. The default GC schedule in PBS is already weekly, and for most deployments, this is the right cadence.

Multi-tenant considerations: If your datastore uses namespaces for different clients or environments, prune jobs should be scoped per namespace with appropriate retention for each. GC operates at the datastore level and handles all namespaces in a single run.

CLI and Automation

The PBS web UI handles scheduling well, but CLI automation gives you more control for scripted environments and infrastructure-as-code workflows.

Create a Prune Job

```bash
proxmox-backup-manager prune-job create daily-prune \
  --store main-datastore \
  --schedule "21:30" \
  --keep-last 14 \
  --keep-daily 14 \
  --keep-weekly 8 \
  --keep-monthly 12
```
Create daily prune job (a bare time in the calendar-event syntax runs every day)

For namespace-scoped pruning, add the --ns flag:

```bash
proxmox-backup-manager prune-job create dev-prune \
  --store main-datastore \
  --ns development \
  --schedule "21:30" \
  --keep-last 3 \
  --keep-daily 7 \
  --keep-weekly 4
```
Namespace-scoped prune job

Set the GC Schedule

Garbage collection is configured as a property of the datastore rather than as a standalone job:

```bash
proxmox-backup-manager datastore update main-datastore \
  --gc-schedule "sun 03:00"
```
Set weekly GC schedule
Use the PBS prune simulator to preview retention before applying

The PBS web UI includes a prune simulator under each datastore's content view. Select a backup group and click "Prune" with the "simulate" option enabled. PBS shows you exactly which snapshots would be kept and which would be removed under your proposed retention settings, without deleting anything. Use this before committing to a new retention policy.
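The same preview is available from the CLI via the prune command's dry-run flag, which lists the keep/remove decisions without deleting anything. The backup group and repository below are placeholders:

```shell
# Preview what a proposed retention policy would prune, without
# removing anything. Group and repository names are placeholders.
proxmox-backup-client prune vm/100 \
  --repository root@pam@pbs.example.com:main-datastore \
  --keep-last 3 --keep-daily 7 --keep-weekly 4 \
  --dry-run
```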

Scripting with Failure Notifications

For environments that need alerting beyond what PBS provides natively, wrap prune and GC commands in a script that catches failures:

```bash
#!/bin/bash
proxmox-backup-manager prune-job run daily-prune 2>&1 || {
  echo "Prune job failed on $(hostname) at $(date)" | \
    curl -X POST -d @- https://your-webhook-url
  exit 1
}
```
prune-with-alerts.sh

Pair this with the notification integrations described in our monitoring and alerting guide.

Monitoring Prune and GC

Running prune and GC jobs is only useful if you verify they complete successfully.

Where to Check Status

PBS web UI: Each datastore shows a task log with the result of every prune and GC run. Look for the green checkmark. Failed jobs appear with error details.

Syslog: PBS logs all job activity to syslog. Filter for proxmox-backup-proxy or proxmox-backup-manager entries to see prune and GC results.
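On the PBS host itself, journalctl is usually quicker than grepping raw syslog. A sketch, assuming the standard PBS service unit names:

```shell
# Show prune- and GC-related log lines from the PBS daemons
# for the last 24 hours.
journalctl -u proxmox-backup-proxy -u proxmox-backup \
  --since "24 hours ago" | grep -iE 'prune|garbage'
```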

Key Metrics to Watch

Chunk count trend: Monitor the total number of chunks in your datastore over time. A steadily increasing chunk count despite regular GC may indicate that your retention policy is keeping too many snapshots or that deduplication ratios are declining (often caused by changed workload patterns).

Space reclaimed per GC: Each GC run reports how much space it freed. If this number drops to near zero week after week, your prune settings may not be removing enough snapshots to create orphaned chunks. If it spikes, a large batch of old snapshots was just pruned.

Job duration: GC runs that grow progressively longer signal a growing chunk store. This is expected as your datastore fills, but unusually long runs may indicate filesystem performance issues.

Set up alerts for failed prune and GC jobs. A failed prune means snapshots accumulate beyond your retention intent. A failed GC means disk space is not being reclaimed. Both lead to a full datastore if left unchecked. We cover notification setup in detail in our monitoring and alerting post.

Conclusion

Pruning and garbage collection are the two maintenance operations that keep a PBS datastore under control. Pruning defines what to keep through retention rules. Garbage collection reclaims the space left behind. They must run in the right order, at the right frequency, and with monitoring to catch failures.

For most deployments: prune daily, garbage collect weekly, verify weekly, and monitor all three. The PBS defaults are sensible starting points. Adjust retention based on your recovery requirements and storage budget, and use the prune simulator to validate changes before applying them.

If you're backing up to an offsite PBS datastore, these operations matter even more. Every orphaned chunk you keep is storage you're paying for. At remote-backups.com, we handle GC scheduling on our managed PBS infrastructure so your offsite datastore stays lean without manual intervention. Combined with proper restore testing and monitoring, you get a backup pipeline that maintains itself.

Ready to offload PBS maintenance? Check out our managed Proxmox Backup Server hosting.