
PBS Backup Scheduling for Multi-Client

You have 40 clients. They all want nightly backups. You have an 8-hour overnight window and a single Proxmox Backup Server cluster. Do the math. If every client kicks off at midnight, your disk I/O saturates, the network chokes, and half the jobs fail or run into the morning. Garbage collection never gets a chance to finish. Verification sits in a queue. Monday morning starts with a wall of alerts.

Scheduling PBS backups across dozens of clients is an engineering problem, not a checkbox exercise. This post covers how to build a schedule that actually works.

Key Takeaways
  • Stagger backup start times across your window instead of running everything at once
  • Schedule garbage collection per-datastore AFTER that datastore's backups complete
  • Small, fast clients go first; large clients get dedicated late-window slots
  • Always use UTC for cron schedules to avoid DST surprises
  • Track actual backup durations weekly and adjust the schedule based on real data
  • Build retry windows into the schedule so a failed job doesn't cascade into the next day

The Scheduling Problem

When you manage a single Proxmox Backup Server instance, one or two backup jobs at midnight works fine. Scale that to 20, 40, or 60 clients and everything changes. Concurrent backup jobs compete for the same resources: disk I/O, network bandwidth, CPU for chunk indexing, and memory for deduplication lookups.

PBS handles concurrent writes to the same datastore, but performance degrades as parallelism increases. If you run a datastore-per-client architecture, each job writes to its own datastore. That helps with lock contention but doesn't solve the I/O bottleneck on the same physical storage.

The goal is simple: every client backed up, verified, with garbage collection completed and room for retries, all before business hours.

Understanding PBS Resource Constraints

Before designing a schedule, you need to know what's fighting for resources.

Disk I/O

PBS is write-heavy during backup ingestion. Each backup job streams chunks to disk, deduplicates against the chunk index, and writes new chunks plus the snapshot manifest. During garbage collection, the pattern flips to read-heavy as PBS scans all chunk references. Running both simultaneously is the fastest way to tank throughput.

Network Bandwidth

Every client pushes data over the network. A 1 Gbps link tops out at roughly 450 GB/hour (125 MB/s) of raw throughput. If you have 10 clients each sending 50 GB concurrently, you need more than that full hour just for raw transfer. Factor in encryption overhead and protocol framing, and the real number is lower.
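A quick back-of-the-envelope check tells you whether a window is even feasible. This is a sketch; the link speed, efficiency factor, and per-client change sizes are illustrative assumptions:

```python
# Rough feasibility check for a backup window.
# All figures are illustrative: adjust link speed and client sizes to taste.

LINK_GBPS = 1.0       # uplink capacity
EFFICIENCY = 0.8      # assumed loss to protocol framing, TLS, retransmits

def window_hours_needed(client_gb, link_gbps=LINK_GBPS, efficiency=EFFICIENCY):
    """Hours of pure transfer time to move all clients' changed data."""
    total_gb = sum(client_gb)
    gb_per_hour = link_gbps / 8 * 3600 * efficiency   # Gbps -> GB/hour
    return total_gb / gb_per_hour

# 10 clients, 50 GB of changed data each
hours = window_hours_needed([50] * 10)
print(f"{hours:.1f} h of raw transfer")   # ~1.4 h on a 1 Gbps link
```

If the answer is already a large fraction of your window before deduplication and retries are considered, the window is too small or the link is.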

For offsite replication to remote targets, bandwidth constraints are even tighter. Sync jobs that replicate to a remote Proxmox Backup Server instance share the same uplink as incoming backups.

Garbage Collection

GC runs per-datastore, not globally. While GC is running on a datastore, new backup jobs targeting that datastore will queue or fail depending on timing. GC on a 2 TB datastore with high churn can take 30+ minutes. Multiply that by 40 datastores and you have a real scheduling constraint.

Never Run GC During Active Backups

Garbage collection and backup jobs targeting the same datastore do not mix. GC can block writes and cause backup failures. Always schedule GC after the last backup job for that datastore completes.

Verification and Memory

Verification jobs (proxmox-backup-manager verify) check chunk integrity on disk. They're read-heavy and can run alongside backups to other datastores, but they add I/O load. Memory usage scales with chunk count because PBS needs to hold chunk index structures in RAM during operations. A datastore with millions of chunks can consume several GB.
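To get a feel for the numbers, you can estimate chunk count from datastore size. PBS uses 4 MiB fixed-size chunks for VM images (file-level backups use variable chunking that averages near that); the per-entry memory figure below is an illustrative assumption, not a PBS constant:

```python
# Rough chunk-count and index-memory estimator.
# BYTES_PER_INDEX_ENTRY is an assumed in-memory cost, not a documented value.

CHUNK_MIB = 4                    # PBS fixed-size chunk for VM images
BYTES_PER_INDEX_ENTRY = 128      # assumption for illustration only

def estimate_chunks(datastore_tib):
    return int(datastore_tib * 1024 * 1024 / CHUNK_MIB)

def index_memory_mib(datastore_tib):
    return estimate_chunks(datastore_tib) * BYTES_PER_INDEX_ENTRY / 1024 / 1024

print(estimate_chunks(2))                         # 2 TiB -> 524288 chunks
print(f"{index_memory_mib(2):.0f} MiB of index")  # ~64 MiB under these assumptions
```

Deduplication means real chunk counts are often lower than this ceiling, but the scaling direction holds: bigger datastores, more chunks, more RAM.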

Scheduling Strategies

Staggered Start Times

The simplest and most effective strategy: don't start everything at once. Group clients by estimated backup size and spread start times across your backup window.

Small clients (under 20 GB changed data) finish in 10-30 minutes. Large clients (100+ GB) might need 2-3 hours. Schedule small, fast jobs early in the window. They finish quickly, freeing up I/O for the bigger jobs that follow.

Group by Duration, Not Data Size

A client with 500 GB total but only 2 GB daily change is a fast backup. A client with 50 GB total but 40 GB daily change is a slow one. Group by expected duration, not total dataset size.
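The point is easy to encode. A sketch that sorts clients by expected duration, where the per-job throughput figure is an assumption you'd replace with your own measurements:

```python
# Sort clients by expected duration, not total size.
# THROUGHPUT_GB_PER_MIN is an assumed per-job ingest rate; measure your own.

THROUGHPUT_GB_PER_MIN = 1.5

clients = {
    "client-a": {"total_gb": 500, "daily_change_gb": 2},
    "client-b": {"total_gb": 50,  "daily_change_gb": 40},
}

def expected_minutes(c):
    # Duration is driven by changed data, not by total dataset size.
    return c["daily_change_gb"] / THROUGHPUT_GB_PER_MIN

order = sorted(clients, key=lambda name: expected_minutes(clients[name]))
print(order)   # ['client-a', 'client-b'] -- the 500 GB client backs up faster
```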

Here's what a staggered schedule looks like for 20 clients across an 8-hour window (22:00 to 06:00 UTC):

20-Client Staggered Schedule

Client      Data Size   Start (UTC)   Est. Duration   GC Schedule
client-01   5 GB        22:00         10 min          Sat 07:00
client-02   8 GB        22:00         15 min          Sat 07:00
client-03   12 GB       22:15         15 min          Sat 07:15
client-04   15 GB       22:15         20 min          Sat 07:15
client-05   18 GB       22:30         20 min          Sat 07:30
client-06   22 GB       22:30         25 min          Sat 07:30
client-07   30 GB       23:00         30 min          Sat 08:00
client-08   35 GB       23:00         35 min          Sat 08:00
client-09   40 GB       23:30         40 min          Sat 08:30
client-10   45 GB       23:30         40 min          Sat 08:30
client-11   50 GB       00:00         45 min          Sun 07:00
client-12   60 GB       00:00         50 min          Sun 07:00
client-13   75 GB       00:45         60 min          Sun 07:30
client-14   80 GB       01:00         65 min          Sun 07:30
client-15   100 GB      01:30         90 min          Sun 08:00
client-16   120 GB      02:00         100 min         Sun 08:00
client-17   150 GB      02:30         120 min         Sun 08:30
client-18   200 GB      03:00         150 min         Sun 09:00
client-19   300 GB      03:00         180 min         Sun 09:00
client-20   500 GB      03:00         210 min         Sun 10:00

Notice the pattern: two small clients start together every 15 minutes early in the window. As job sizes grow, start times spread out more. GC runs on weekends when there's no I/O contention.
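A schedule like this can be generated rather than maintained by hand. A minimal sketch, assuming the interval widths (pairs of short jobs every 15 minutes, wider gaps for large ones) rather than deriving them:

```python
# Assign staggered start times from a duration-sorted client list.
# Interval widths and the 45-minute pairing threshold are assumptions.
from datetime import datetime, timedelta

def stagger(durations_min, window_start="22:00", pair_until_min=45):
    """durations_min: expected job durations in minutes."""
    t = datetime.strptime(window_start, "%H:%M")
    slots, in_pair = [], 0
    for d in sorted(durations_min):
        slots.append(t.strftime("%H:%M"))
        if d <= pair_until_min and in_pair == 0:
            in_pair = 1                      # second short job shares the slot
        else:
            in_pair = 0
            # gap scales with job size: 15 min for small jobs, 30 for large
            t += timedelta(minutes=15 if d <= pair_until_min else 30)
    return slots

print(stagger([10, 15, 15, 20, 60, 90]))
# ['22:00', '22:00', '22:15', '22:15', '22:30', '23:00']
```

Regenerating the schedule from fresh duration data makes the quarterly review a script run instead of a spreadsheet session.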

Bandwidth-Aware Scheduling

Calculate your available bandwidth and work backwards. If you have a 1 Gbps link and 4 concurrent backup streams, each client gets roughly 250 Mbps of effective throughput. That might be fine for local LAN backups, but for remote targets it changes everything.

PBS supports the --rate-limit flag to cap bandwidth per backup job (specified in bytes per second). Use it to prevent a single large client from starving others.

bash
proxmox-backup-client backup \
    vm-100-disk-0.img:/dev/vg/vm-100-disk-0 \
    --repository remote-pbs:client-05 \
    --rate-limit 52428800
Bandwidth-limited backup job

That caps the job at 50 MB/s (roughly 400 Mbps), leaving room for other concurrent streams. For offsite targets on remote-backups.com, edge locations reduce latency and improve throughput for geographically distributed clients.
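Working out the byte value for a target rate, and deriving a fair per-stream cap from your link, is simple arithmetic. A sketch; the link speed, stream count, and headroom factor are illustrative:

```python
# Convert a per-job cap in MB/s to the bytes-per-second value the
# rate-limit flag expects, and split a link fairly among N streams.

def mbs_to_bytes(mb_per_s):
    return mb_per_s * 1024 * 1024

def per_stream_cap(link_gbps, streams, headroom=0.8):
    """Bytes/s per stream, leaving headroom for other traffic."""
    link_bytes = link_gbps * 1e9 / 8
    return int(link_bytes * headroom / streams)

print(mbs_to_bytes(50))          # 52428800 -- the value used above
print(per_stream_cap(1, 4))      # 25000000 bytes/s (~25 MB/s) per stream
```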

Avoiding GC Conflicts

Garbage collection is the most common scheduling conflict in multi-client PBS deployments. The rules are straightforward:

  1. GC runs after all backups to that datastore are complete
  2. Stagger GC start times across datastores (don't fire 40 GC jobs at 06:00)
  3. Run GC on weekends for large datastores where the process takes 30+ minutes
  4. For small datastores (under 500 GB), weeknight GC at a staggered off-peak time works fine
bash
proxmox-backup-manager garbage-collection start client-05-datastore
Trigger GC for a specific datastore

A practical approach: batch your datastores into GC groups. Group A runs Saturday morning, Group B Sunday morning. Within each group, stagger starts by 15-30 minutes so I/O load is distributed.
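The group assignment is mechanical enough to script. A sketch that alternates datastores between Saturday and Sunday groups with a 30-minute stagger inside each group (both choices are assumptions, matching the batching described above):

```python
# Batch datastores into weekend GC groups with staggered starts.
# Alternating A/B assignment and the 30-minute step are assumptions.
from datetime import datetime, timedelta

def gc_schedule(datastores, start="07:00", step_min=30):
    base = datetime.strptime(start, "%H:%M")
    plan = {}
    for i, ds in enumerate(datastores):
        day = "Sat" if i % 2 == 0 else "Sun"   # alternate between groups
        offset = (i // 2) * step_min           # stagger within each group
        plan[ds] = f"{day} {(base + timedelta(minutes=offset)):%H:%M}"
    return plan

print(gc_schedule(["ds-01", "ds-02", "ds-03", "ds-04"]))
# {'ds-01': 'Sat 07:00', 'ds-02': 'Sun 07:00', 'ds-03': 'Sat 07:30', 'ds-04': 'Sun 07:30'}
```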

Timezone Handling

If you manage clients across timezones, you get distributed load for free. A client in UTC+1 wanting a "midnight backup" starts at 23:00 UTC. A client in UTC-5 wanting the same starts at 05:00 UTC. That spreads your window naturally.

Always Schedule in UTC

Store all cron schedules in UTC and document the local-time equivalent for each client. This prevents confusion during daylight saving transitions and makes automation consistent.
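The DST hazard is easy to demonstrate: the same "local midnight" maps to different UTC times in winter and summer. A sketch using Python's standard zoneinfo module (requires Python 3.9+ with a system timezone database; Europe/Berlin is just an example zone):

```python
# The same local midnight is a different UTC time across a DST transition.
from datetime import datetime
from zoneinfo import ZoneInfo

berlin = ZoneInfo("Europe/Berlin")

winter = datetime(2024, 1, 15, 0, 0, tzinfo=berlin)   # CET, UTC+1
summer = datetime(2024, 7, 15, 0, 0, tzinfo=berlin)   # CEST, UTC+2

print(winter.astimezone(ZoneInfo("UTC")).strftime("%H:%M"))  # 23:00
print(summer.astimezone(ZoneInfo("UTC")).strftime("%H:%M"))  # 22:00
```

A cron entry pinned in local time silently shifts by an hour twice a year; pinned in UTC, it never moves.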

PVE Integration

Most MSP environments run PVE with vzdump targeting a PBS storage backend. Backup scheduling happens in PVE's job configuration, not on the PBS side.

bash
vzdump: backup-client-05
    enabled 1
    storage pbs-client-05
    schedule *-*-* 23:30:00
    mailnotification failure
    mode snapshot
    compress zstd
    notes-template {{guestname}}
/etc/pve/jobs.cfg — vzdump schedule targeting PBS

The schedule field uses systemd calendar format. *-*-* 23:30:00 means "every day at 23:30." For specific days, use patterns like Mon..Fri *-*-* 23:30:00 to skip weekends.

When using PBS namespaces (available in PBS 2.x), specify the target namespace in the PVE storage configuration. Each client's PVE cluster should write to its own isolated namespace or datastore to maintain tenant separation.

The critical coordination point: PVE backup jobs and PBS garbage collection must not overlap on the same datastore. If PVE fires a vzdump at 23:30 and GC is still running from a previous cycle, the backup will fail.

Monitoring Your Schedule

A schedule is only as good as its tracking. You need visibility into actual backup durations, not just whether jobs succeeded.

Track these metrics over time:

  • Actual vs estimated duration for each client, reviewed weekly
  • Jobs that consistently overrun their allocated window
  • GC duration per datastore, which grows as the datastore grows
  • Failed jobs and retry counts per night

PBS exposes task logs and metrics through its API. Feed these into Prometheus and Grafana for dashboards that show schedule health at a glance. When a client's backup takes 90 minutes instead of the usual 45, you want to know before it cascades into the next slot.
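The overrun check itself is a few lines once you have duration history. A sketch; the task records below are illustrative stand-ins for what you'd pull from the PBS task log API, and the 1.5x drift factor is an assumed threshold:

```python
# Flag clients whose recent backup duration has drifted past the estimate.

def overruns(tasks, estimates, factor=1.5):
    """tasks: {client: [durations in minutes, newest last]}."""
    flagged = {}
    for client, history in tasks.items():
        recent = sum(history[-3:]) / len(history[-3:])   # mean of last 3 runs
        if recent > estimates[client] * factor:
            flagged[client] = round(recent)
    return flagged

# Illustrative history: client-05's runs jumped from ~20 to ~45 minutes.
tasks = {"client-05": [20, 22, 44, 46, 45], "client-09": [40, 41, 39]}
estimates = {"client-05": 20, "client-09": 40}
print(overruns(tasks, estimates))   # {'client-05': 45}
```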

Adjust Quarterly

Review and adjust your schedule at least every quarter. Client data grows, new clients onboard, and what worked in January may cause overlap by April.

When Schedules Fail

Backup jobs fail. Disks fill up, networks hiccup, VMs hang on snapshot creation. Your schedule needs a plan for this.

Retry Strategy

Build a 60-90 minute retry window at the end of your backup window. If a job fails at 23:30, an automatic retry at 05:00 still finishes before business hours. Don't retry immediately. The condition that caused the failure (I/O saturation, network issue) might still be present. Wait for load to drop.

bash
# Primary backup at 23:30 UTC
30 23 * * * /usr/local/bin/backup-client-05.sh

# Retry window at 05:00 UTC (only runs if primary failed)
0  5  * * * /usr/local/bin/backup-client-05.sh --retry-if-missed
Cron entry with retry window

Alerting

Set up alerts on missed backups with clear thresholds. A single missed nightly backup is a warning. Two consecutive misses is critical. By the third morning without a successful backup, someone should be investigating.

Don't alert on every transient failure. Alert on patterns: a client that fails every Tuesday, a datastore where GC consistently runs past the backup window, a job whose duration has doubled over two weeks.
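The consecutive-miss thresholds above reduce to a tiny classifier, which keeps the policy explicit and testable:

```python
# Map consecutive missed backups to alert severity:
# one miss warns, two or more are critical.

def severity(consecutive_misses):
    if consecutive_misses == 0:
        return "ok"
    if consecutive_misses == 1:
        return "warning"
    return "critical"

print([severity(n) for n in range(4)])
# ['ok', 'warning', 'critical', 'critical']
```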

Common Mistakes

Scheduling Mistakes vs Best Practices
Common Mistakes
  • Start all backups at midnight
  • Run GC on a fixed daily schedule regardless of backup timing
  • Estimate backup windows once and never revisit
  • No retry window in the schedule
  • Schedule in local time per client
Best Practices
  • Stagger start times based on expected duration
  • Schedule GC per-datastore after backups complete
  • Track actual durations and adjust quarterly
  • Reserve 60-90 minutes at end of window for retries
  • Use UTC everywhere, document local equivalents

Wrapping Up

Scheduling PBS backups for dozens of clients is not a one-time task. It requires understanding your resource constraints, staggering jobs based on real duration data, keeping garbage collection out of backup windows, and building in room for failures. Start with the staggered approach, monitor actual performance weekly, and adjust quarterly. The payoff is mornings that start with green dashboards instead of a flood of alerts.

Need managed PBS with built-in scheduling?

remote-backups.com handles scheduling, garbage collection, monitoring, and geo-replication for multi-client Proxmox Backup Server environments.

View Plans

Frequently Asked Questions

Can garbage collection on one datastore block backups to another?

Yes. GC runs per-datastore, not globally. GC on datastore A does not affect backup writes to datastore B. The conflict only occurs when GC and backups target the same datastore simultaneously.

How many concurrent backup jobs can one PBS instance handle?

There's no hard-coded limit. The practical ceiling depends on disk I/O, memory, and network bandwidth. Most deployments run 4-8 concurrent backup streams comfortably on a server with NVMe storage and 64 GB RAM. Beyond that, test and monitor.

Should scheduling live on the PVE side or the PBS side?

Use PVE scheduling (vzdump jobs) for VM and container backups, since PVE manages the snapshot lifecycle. Use PBS-side scheduling for sync jobs, GC, pruning, and verification. The two systems coordinate through the PBS storage backend.

What about clients that need more than one backup per day?

Add additional time slots for those clients, but treat each slot as a separate scheduling entry. A client backing up at 12:00 and 00:00 UTC occupies two slots in your schedule, and both need staggering relative to other jobs in that time range.
Bennet Gallein

remote-backups.com operator

Infrastructure enthusiast and founder of remote-backups.com. I build and operate reliable backup infrastructure powered by Proxmox Backup Server, so you can focus on what matters most: your data staying safe.