
Guardian Backup System

Revision as of 12:50, 22 November 2010 by Christopher Reffett (talk | contribs)

The Guardian Backup System (named after the zone/zpool for lack of a better name) is a set of scripts currently used to back up systems to agni.


The backup system consists of three main parts: a script and excludes file located on the host to be backed up, an intermediary zone, and a backup host which interfaces with the backup storage and also runs the zone.

Backup Script

A script and excludes file are used on the host to be backed up. The script backs up the entire system (minus anything in the excludes file) to an intermediate zone (currently Guardian) using rsync and the host's keytab. The script also takes an argument that makes it dump databases from a local MySQL server before rsyncing, in order to preserve consistency. In this case, a databases file is also needed; it lists the databases to be backed up along with their engine type (MyISAM and InnoDB are currently supported). The backup script is executed by an entry in the root crontab. Hosts are currently scheduled at 5-minute intervals, with the slots on the hour and half-hour left empty to give any long-running backups extra time to finish.
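As an illustration of the rsync step described above, here is a minimal sketch that only builds the command line and does not run it. The function name, the excludes file path, the destination directory on the zone, and the exact set of rsync flags are assumptions made for the example; the real script may differ on all of these.

```python
def build_rsync_command(host,
                        zone="guardian",
                        excludes_file="/root/scripts/backup-excludes",  # hypothetical path
                        dest_root="/backups"):                          # hypothetical path
    """Return an rsync argument list that copies / to the intermediate zone."""
    return [
        "rsync",
        "-aH",                                # archive mode, preserve hard links
        "--numeric-ids",                      # keep uid/gid numbers as they are
        "--delete",                           # drop files that vanished on the host
        "--exclude-from=" + excludes_file,    # skip everything listed in the excludes file
        "-e", "ssh",                          # transport; GSSAPI auth comes from the ssh config
        "/",
        "root@{}:{}/{}/".format(zone, dest_root, host),
    ]


if __name__ == "__main__":
    print(" ".join(build_rsync_command("www-new")))
```

Returning a list instead of a shell string means the command can be handed to `subprocess.run` without any quoting concerns.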

A variant of this script is used to back up shared storage directories independently of their hosts (for example, nfs-mail).


Guardian is the current intermediate zone: all of the systems back up to it, and the backup host copies the backups from it to more permanent storage. Every host to be backed up needs its host principal in ~root/.k5login on the zone so that it can rsync without a password using GSSAPI. There are two reasons for using an intermediate zone. 1) Speed: the zone's storage is tuned primarily for speed so that backups have a minimal impact on the host, while the permanent storage is tuned for maximum space savings. 2) Security: if a host is compromised, the attacker gains only root on Guardian, and previously made backups are much harder to compromise.

Backup Host

Agni is currently used, with internal SATA drives, as the backup host for long-term storage. The ZFS filesystem here is tuned for storage space, with both compression and dedup enabled. Every morning at 0605, Agni rsyncs a copy of each host's latest backup from Guardian to permanent storage and then snapshots the backup. The migration script also checks that each host actually ran its backup, using a per-host check-in file which contains the UNIX timestamp of when the host finished running its backup script. When it is done, it sends a summary email listing which hosts did and did not back up successfully, along with how long the migration took.
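The check-in test can be sketched as follows. This is not the actual migration script: the function name and the dictionary input are assumptions for the example, and the 12-hour limit is the one stated elsewhere on this page. Each value stands for the raw contents of one host's check-in file, or `None` if the file is missing.

```python
import time

MAX_AGE_SECONDS = 12 * 60 * 60  # check-in window used by the migration script


def classify_hosts(checkins, now=None, max_age=MAX_AGE_SECONDS):
    """Split hosts into (ok, failed) based on their check-in timestamps.

    checkins maps hostname -> contents of the check-in file (a string
    holding a UNIX timestamp), or None when no check-in file was found.
    """
    if now is None:
        now = time.time()
    ok, failed = [], []
    for host, raw in sorted(checkins.items()):
        try:
            finished = int(raw.strip())
        except (AttributeError, ValueError):
            # Missing or unparseable check-in file counts as a failure.
            failed.append(host)
            continue
        age = now - finished
        if 0 <= age <= max_age:
            ok.append(host)
        else:
            failed.append(host)
    return ok, failed
```

The two returned lists correspond to the "did" and "did not" sections of the summary email.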

Adding a host

To set up a host for backups, copy the backup script and backup excludes file to the host and edit them appropriately. For a host with a MySQL server, you will also need a backup-databases file as well as a .my.cnf file for root with appropriate credentials for each database to be backed up.

On Guardian, add the host's principal to /root/.k5login.

On agni, create a ZFS filesystem for the host and add its FQDN to /root/scripts/backup-hosts so that the migration script knows about it.

Now run the backup script to make sure everything works properly. You can check your excludes file by tailing the script's log file (default is /root/scripts/backup.log). After everything is working, add the backup script to root's crontab on the host in the next available timeslot.
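Finding the next available timeslot follows the scheduling rule used on this page: 5-minute slots after midnight, with the hour and half-hour left empty. The sketch below is a hypothetical helper, not part of the backup scripts; the function name and the 0530 cut-off (taken from the scheduling window described on this page) are assumptions.

```python
def next_free_slot(taken, start_minutes=5, end_minutes=5 * 60 + 30, step=5):
    """Return the first free 'HHMM' slot after midnight, or None if all are used.

    taken is an iterable of 'HHMM' strings already in use. Slots on the
    hour and half-hour are never handed out.
    """
    used = set(taken)
    for minutes in range(start_minutes, end_minutes, step):
        if minutes % 30 == 0:
            continue  # leave :00 and :30 free for long-running backups
        slot = "{:02d}{:02d}".format(minutes // 60, minutes % 60)
        if slot not in used:
            return slot
    return None
```

For example, with 0005 through 0025 already taken, the helper skips 0030 and returns 0035.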


At least on Gentoo Heimdal hosts, it appears that you need to disable GSSAPIDelegateCredentials for passwordless ssh to Guardian to work. Add the following to /root/.ssh/config:

Host guardian
GSSAPIDelegateCredentials no

A backup must check in no more than 12 hours before the migration script runs, or the migration script will flag it as failed. Since the migration script runs shortly after 6AM, backups should be scheduled sometime between 6PM and 5:30AM (currently our host backups start just after midnight and finish by about 2:30).
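The window arithmetic can be checked with a small sketch. It assumes the 0605 migration time given in the Backup Host section and treats times as local 24-hour 'HHMM' strings; the function name is hypothetical and the date used is arbitrary.

```python
from datetime import datetime, timedelta


def checks_in_on_time(finish_hhmm, migration_hhmm="0605", window_hours=12):
    """True if a backup finishing at finish_hhmm falls inside the check-in window."""
    day = datetime(2010, 11, 22)  # arbitrary date, only the time of day matters
    migration = day.replace(hour=int(migration_hhmm[:2]),
                            minute=int(migration_hhmm[2:]))
    finish = day.replace(hour=int(finish_hhmm[:2]),
                         minute=int(finish_hhmm[2:]))
    if finish > migration:
        # A finish time later in the day belongs to the previous evening.
        finish -= timedelta(days=1)
    return migration - finish <= timedelta(hours=window_hours)
```

Under these assumptions a backup finishing at 0230 or 2200 is accepted, while one finishing at noon the previous day is flagged.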

Current Backup Schedule

The following is the current schedule of host backups. Hosts are listed primarily in the order in which they were added to the system. All hosts run the standard backup script unless noted.

  • 2200: nfs-mail (Directory Backup from Casey)
  • 0005: www-new
  • 0010: bugs
  • 0015: license
  • 0020: monitor
  • 0025: ns1
  • 0035: Bottom
  • 0040: ns2
  • 0045: Galapagos
  • 0050: smith
  • 0055: Rockhopper
  • 0105: openvpn
  • 0110: Snares
  • 0115: casey
  • 0120: Antipodes
  • 0125: iodine (runs MySQL)
  • 0135: Weather
  • 0140: Magellanic
  • 0145: Fiordland
  • 0150: ublog (runs MySQL)
  • 0155: openldap1 (runs LDAP)
  • 0205: openldap2 (runs LDAP)
  • 0210: lists
  • 0215: mysql1 (runs MySQL)
  • 0220: royal
  • 0225: king
  • 0235: haimageserver
  • 0240: robustus