Traditionally, the UCC's backup system has left much to be desired, such as "existing" or "running on reliable hardware", let alone living up to the Rule of Three.
Backups at present run on Mollitz, which is housed in another location (off-site backups!). Contact <[email protected]> if you need to know where it is living.
Backups are run using rdiff-backup, a disk-based incremental backup system. These are managed by rdiff-manager, a Python wrapper written by DavidAdam.
Adding new machines
On the target system (the machine you want to back up):
- Make sure the UCC backup key is installed (e.g. with authroot)
- Install the rdiff-backup packages
On the backup server (mollitz):
- Copy /backups/conf/example-copy-me to /backups/conf/<HOSTNAME>.conf
- Edit /backups/conf/<HOSTNAME>.conf as required - the syntax is documented in rdiff-backup(1) under FILE SELECTION
- Add the SSH host key using su -c 'ssh-keyscan HOSTNAME >> ~/.ssh/known_hosts' backups
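The server-side steps above can be sketched as a small helper that only prints the commands for a given host, so they can be reviewed before being run as root on mollitz. The function name add_backup_host_cmds is hypothetical and not part of rdiff-manager; the paths and the ssh-keyscan line are taken from the steps above.

```shell
# Hypothetical helper (not part of rdiff-manager): print, but do not run,
# the commands for registering HOSTNAME on the backup server.
add_backup_host_cmds() (
    host="$1"
    if [ -z "$host" ]; then
        echo "usage: add_backup_host_cmds HOSTNAME" >&2
        exit 1      # exits only the function's subshell
    fi
    conf="/backups/conf/$host.conf"
    printf 'cp /backups/conf/example-copy-me %s\n' "$conf"
    printf '# now edit %s (see FILE SELECTION in rdiff-backup(1))\n' "$conf"
    printf "su -c 'ssh-keyscan %s >> ~/.ssh/known_hosts' backups\n" "$host"
)
```

For example, add_backup_host_cmds merlo prints the copy, edit and ssh-keyscan lines for merlo.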
Now wait until the nightly backup run. The output is:
- sent by email to hostmasters
- successful backups leave a log in /backups/<HOSTNAME>/rdiff-backup-data/backup.log and /backups/<HOSTNAME>/rdiff-backup-data/session_statistics.<TIMESTAMP>.data
- partially successful backups leave a log in /backups/<HOSTNAME>/rdiff-backup-data/error_log.<TIMESTAMP>.data.gz
In some cases, e.g. if a particular file changes during every backup run, a successful backup or update may never be possible.
- TODO: this could be mitigated by backing up a stable snapshot instead of the live filesystem
Checking backup status
To list all the backups available for a particular host, or to see when it was last successful, on the backup server (Mollitz) run:
rdiff-backup --list-increments /backups/<HOSTNAME>
To list how much data is taken for each incremental backup (which is much slower), on the backup server (Mollitz) run:
rdiff-backup --list-increment-sizes /backups/<HOSTNAME>
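To check every host in one pass rather than one at a time, a loop along these lines may help. It is a sketch that assumes each backed-up host has a /backups/conf/<HOSTNAME>.conf file, as described under Adding new machines, and that the last line printed by --list-increments describes the current mirror. Both function names are hypothetical.

```shell
# Print the name of every host that has a .conf file in CONF_DIR
# (default /backups/conf). example-copy-me has no .conf suffix, so it is skipped.
list_backup_hosts() (
    conf_dir="${1:-/backups/conf}"
    for f in "$conf_dir"/*.conf; do
        [ -e "$f" ] || continue      # the glob matched nothing
        basename "$f" .conf
    done
)

# Print the last line of the increment list for every configured host.
# Needs rdiff-backup and read access to /backups, so run it on Mollitz.
show_latest_backups() (
    for host in $(list_backup_hosts "$1"); do
        printf '%s: ' "$host"
        rdiff-backup --list-increments "/backups/$host" | tail -n 1
    done
)
```

Running show_latest_backups with no arguments on Mollitz should give one line per host.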
Restoring a backup
To restore files from backup, on the backup server (Mollitz):
- Run rdiff-backup --list-increments /backups/<HOSTNAME> and choose a backup to restore from
- Copy the timestamp from the increment list, e.g. 2022-02-22T02:00:03+08:00
- Decide where you are going to restore the files: locally (i.e. to Mollitz), where you can inspect them, or back to the remote host
Restoring files locally
- Run mkdir /backups/tmp/<HOSTNAME>
- Run rdiff-backup -r <TIMESTAMP> /backups/<HOSTNAME>/path/to/file-or-directory/you/want/to/restore /backups/tmp/<HOSTNAME> (-r is the short form of --restore-as-of)
For example, rdiff-backup -r 2022-02-22T02:00:03+08:00 /backups/merlo/etc /backups/tmp/merlo restores the contents of merlo's /etc/ directory as of 22nd February to /backups/tmp/merlo.
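The local restore can be wrapped in a helper that builds the two commands from a host name, a timestamp and a path. The sketch below only prints them, so the paths can be checked before anything is run. restore_local_cmd is a hypothetical name; the /backups layout is the one described above.

```shell
# Hypothetical helper: print the commands for a local restore on Mollitz.
# Usage: restore_local_cmd HOSTNAME TIMESTAMP [PATH]
restore_local_cmd() (
    host="$1"
    when="$2"
    path="${3#/}"       # drop a leading slash so the paths join cleanly
    if [ -z "$host" ] || [ -z "$when" ]; then
        echo "usage: restore_local_cmd HOSTNAME TIMESTAMP [PATH]" >&2
        exit 1          # exits only the function's subshell
    fi
    dest="/backups/tmp/$host"
    printf 'mkdir %s\n' "$dest"
    printf 'rdiff-backup -r %s %s %s\n' "$when" "/backups/$host/$path" "$dest"
)
```

Calling restore_local_cmd merlo 2022-02-22T02:00:03+08:00 /etc prints the mkdir line followed by the same rdiff-backup command as in the example above.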
Restoring files remotely
Be careful - this is easy to mess up, particularly if you are trying to restore to the original path. Note the double-colons in the restore path!
- Run rdiff-backup -r <TIMESTAMP> /backups/<HOSTNAME>/path/to/file-or-directory/you/want/to/restore root@<HOSTNAME>::/path/to/file-or-directory/you/want/to/restore (-r is the short form of --restore-as-of)
For example, rdiff-backup -r 2022-02-22T02:00:03+08:00 /backups/merlo/etc merlo::/restored/etc restores the contents of merlo's /etc/ directory as of 22nd February to /restored/etc on Merlo.
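Because the double colon is what marks the destination as remote, and a destination without it is treated as a local path on Mollitz, a guard like the following can catch a mistyped destination before anything is restored. is_remote_dest is a hypothetical helper, and the pattern is an assumption about what a sensible remote destination looks like (a host part, a double colon, then an absolute path).

```shell
# Return 0 only if DEST looks like host::/absolute/path
# (optionally user@host::/absolute/path).
is_remote_dest() {
    case "$1" in
        ?*::/*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Example use, with src, when and dest set by the caller:
#   is_remote_dest "$dest" && rdiff-backup -r "$when" "$src" "$dest"
```

The check is deliberately strict: merlo:/restored/etc (single colon) and /restored/etc (no host) are both rejected.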
Improvements to rdiff-manager
rdiff-manager is pretty simple, but there is plenty of room for improvement. Check the TODO file in the distribution for ideas.
Latest update on the backup system
[ROY] 20240930, 20241111
This section describes a temporary workaround for the rdiff-backup version mismatch that prevents some machines from being backed up correctly.
molmol space dataset
A cron job on molmol calls /root/zfs-send-script.sh every day at 01:00. The script first takes a snapshot of the Space dataset, then sends the increment to dell-ph1 using zfs send over SSH. A similar cron job covers dell-ph1->dell-ph2 at 05:00 every Saturday.
A sample script can be found under wheel/docs.
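For readers without access to wheel/docs, the general shape of a snapshot-then-incremental-send job is sketched below. This is not the contents of /root/zfs-send-script.sh. The function name, the dataset names (tank/space, dpool/space) and the date-based snapshot names are all assumptions chosen for illustration. The helper only prints the commands.

```shell
# Hypothetical sketch: print the commands for one incremental replication step.
# Usage: zfs_incremental_send_cmds DATASET PREV_SNAP NEW_SNAP REMOTE_HOST TARGET_DATASET
zfs_incremental_send_cmds() (
    if [ "$#" -ne 5 ]; then
        echo "usage: zfs_incremental_send_cmds DATASET PREV NEW REMOTE TARGET" >&2
        exit 1      # exits only the function's subshell
    fi
    dataset="$1"; prev="$2"; new="$3"; remote="$4"; target="$5"
    # 1. take the new snapshot on the sending side
    printf 'zfs snapshot %s@%s\n' "$dataset" "$new"
    # 2. send only the changes between PREV and NEW, and receive them remotely
    printf 'zfs send -i %s@%s %s@%s | ssh %s zfs receive %s\n' \
        "$dataset" "$prev" "$dataset" "$new" "$remote" "$target"
)
```

For example, zfs_incremental_send_cmds tank/space 2024-09-29 2024-09-30 dell-ph1 dpool/space prints a zfs snapshot line and a zfs send -i pipeline. The previous snapshot must already exist on both sides for an incremental send to be accepted.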
good ol' rdiff-backup style backup
This is also a cron job, on dell-ph1. It runs every .sh script under /etc/rsync-conf/ every day at 02:00, then takes a ZFS snapshot of the whole dpool pool (via /etc/rsync-conf/zfs-snapshot) every day at 04:00 to create the differential entries.
- The include/exclude files are the .conf files under /etc/rsync-conf/
- To add a new host, check /etc/rsync-conf/setuphost, then add a .conf and a .sh file for the host
- The rsync script is adopted unchanged from https://www.ucc.gu.uwa.edu.au/~dagobah/things/secure-backups.html and is stored on the target host
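The nightly "run every .sh script" step could look roughly like the loop below. This is a sketch of the idea rather than the actual cron entry on dell-ph1. The function name run_backup_scripts is hypothetical, and carrying on after a failed host is an assumed design choice so that one broken host does not block the others.

```shell
# Hypothetical sketch: run each per-host script in CONF_DIR
# (default /etc/rsync-conf), continue past failures, and return
# non-zero if any script failed.
run_backup_scripts() (
    conf_dir="${1:-/etc/rsync-conf}"
    status=0
    for script in "$conf_dir"/*.sh; do
        [ -f "$script" ] || continue     # the glob matched nothing
        if ! sh "$script"; then
            echo "FAILED: $script" >&2
            status=1
        fi
    done
    exit "$status"      # becomes the function's return status
)
```

Failures are reported on stderr, which cron would normally mail to the job owner.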
data retention
Snapshots older than six months are destroyed on all three machines by cron jobs that run on the 1st of every month.
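A retention job of this kind can be sketched as a filter over the output of zfs list. The version below is an illustration only and not the script the cron jobs actually run. Both function names are hypothetical, and "six months" is approximated as 183 days. It prints zfs destroy lines without running them.

```shell
# Read "name<TAB>creation-epoch" lines on stdin and print the names of
# snapshots created before CUTOFF (seconds since the epoch).
snapshots_older_than() {
    awk -F '\t' -v cutoff="$1" '($2 + 0) < (cutoff + 0) { print $1 }'
}

# Dry run: print a "zfs destroy" line for each snapshot older than
# about six months. Needs the zfs command, so run it on a ZFS host.
prune_dry_run() {
    cutoff=$(( $(date +%s) - 183 * 24 * 3600 ))   # 183 days: an assumed reading of "six months"
    zfs list -H -p -t snapshot -o name,creation |
        snapshots_older_than "$cutoff" |
        while read -r snap; do
            printf 'zfs destroy %s\n' "$snap"
        done
}
```

The -H and -p options make zfs list print tab-separated fields with the creation time as a plain number, which is what the filter expects.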
Offsite
dell-ph1=>dell-ph2 full dataset sync via a cron job: /root/zfs-send-offsite.sh runs every Sunday at 04:15. It works in basically the same way as the molmol to dell-ph1 job.
- intermediate differences are folded into the latest snapshot
PVE VMs
All machines are backed up to wobbegong via PBS (Proxmox Backup Server), scheduled for every Saturday at 00:00.