Warning Livedoc is no longer being updated and will be deprecated shortly. Please refer to https://documentation.tjhsst.edu.

Guardian Backup System

Revision as of 00:42, 6 April 2010

The Guardian Backup System (named after the zone/zpool for lack of a better name) is a set of scripts currently used to back up systems to Agni.

Layout

The backup system consists of three main parts: a script and excludes file located on the host to be backed up, an intermediary zone, and a backup host that interfaces with the backup storage and also runs the zone.

Backup Script

A script and excludes file are used on the host to be backed up. The script backs up the entire system (minus anything in the excludes file) to an intermediate zone (currently Guardian) using rsync and the host's keytab. A variation of this script is used on hosts with a local MySQL server; it dumps the databases before rsyncing in order to preserve consistency. In this case, a databases file is also needed, which lists the databases to be backed up along with their engine type (MyISAM and InnoDB are currently supported). The backup script is executed by an entry in the root crontab. Host start times are currently staggered at 5-minute intervals, with the hour and half-hour slots left empty to give any long-running backups some extra time to finish.
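As an illustration of the rsync step described above, here is a minimal sketch that assembles the command line. It is not the actual backup script: the function name, the excludes file path, and the per-host destination directory on the zone are assumptions made for the example. Only the `-aq` flag and the use of an excludes file come from this page.

```python
from typing import List


def build_rsync_command(excludes_file: str, zone: str, fqdn: str,
                        verbose: bool = False) -> List[str]:
    """Return an rsync argument list that copies / to the intermediate zone."""
    # -a preserves permissions, owners, and times; -q keeps cron output quiet.
    flags = "-av" if verbose else "-aq"
    # Hypothetical layout: one directory per host under /backup on the zone.
    destination = f"root@{zone}:/backup/{fqdn}/"
    return ["rsync", flags, f"--exclude-from={excludes_file}", "/", destination]


if __name__ == "__main__":
    # Example values only; the real paths and hostnames may differ.
    print(" ".join(build_rsync_command("/root/scripts/backup-excludes",
                                       "guardian", "host.example.org")))
```

Building the command as a list rather than a single string avoids shell quoting problems if a path ever contains spaces.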

Zone

Guardian is the current intermediate zone: all of the systems back up to it, and the backup host copies the backups from it to more permanent storage. Every host that is to be backed up needs its host principal in ~root/.k5login on the zone so that it can rsync without a password using GSSAPI. There are two reasons for using an intermediate zone. 1) Speed: the zone's storage is tuned primarily for speed so that backups have a minimal impact on the host, while the permanent storage is tuned for maximum space savings. 2) Security: if a host is compromised, the attacker gains only root on Guardian, and previously made backups are much harder to compromise.

Backup Host

Agni, with internal SATA drives, is currently used as the backup host for long-term storage. The ZFS filesystem here is tuned for storage space, with both compression and dedup enabled. Every morning at 0605, Agni rsyncs a copy of each host's latest backup from Guardian to permanent storage and then snapshots the backup. The migration script also checks that each host actually ran its backup, using a per-host checkin file that contains the date on which the host finished running its backup script. When it is done, it sends a summary email listing which hosts did and did not back up successfully, along with how long the migration took.
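The checkin test can be sketched as follows. This is an illustration, not the migration script itself: the function name is hypothetical, and the checkin file is assumed to hold a single ISO date (YYYY-MM-DD), which this page does not specify.

```python
from datetime import date
from typing import Dict, List, Tuple


def split_by_checkin(checkins: Dict[str, str],
                     today: date) -> Tuple[List[str], List[str]]:
    """Partition hosts into (checked in today, missing or stale checkin)."""
    ok: List[str] = []
    failed: List[str] = []
    for host, text in sorted(checkins.items()):
        try:
            # Assumed format: one ISO date per checkin file.
            finished = date.fromisoformat(text.strip())
        except ValueError:
            finished = None  # empty or unreadable checkin counts as a failure
        (ok if finished == today else failed).append(host)
    return ok, failed


if __name__ == "__main__":
    sample = {"iodine": "2010-04-06\n", "lists": "2010-04-05\n", "mysql1": ""}
    succeeded, missing = split_by_checkin(sample, date(2010, 4, 6))
    print("backed up:", succeeded)
    print("not backed up:", missing)
```

A host whose checkin carries yesterday's date lands in the second list, which is the same condition the summary email reports as a failed backup.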

Adding a host

To set up a host for backup, copy the backup script and backup excludes file to the host and edit them appropriately. For a host with a MySQL server, you will also need a backup-databases file and a .my.cnf file for root with appropriate credentials for each database to be backed up.
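The format of the backup-databases file is not documented on this page. The sketch below assumes one "name engine" pair per line, with # comments allowed, and shows one plausible way the engine type could select a consistency option for mysqldump. Both function names are hypothetical. The options `--single-transaction` and `--lock-tables` are standard mysqldump options, but whether the real script uses them is an assumption.

```python
from typing import List, Tuple


def parse_databases(text: str) -> List[Tuple[str, str]]:
    """Parse assumed 'name engine' lines, skipping blanks and # comments."""
    entries: List[Tuple[str, str]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, engine = line.split()
        entries.append((name, engine.lower()))
    return entries


def dump_command(name: str, engine: str) -> List[str]:
    """Build a mysqldump argument list suited to the table engine."""
    if engine == "innodb":
        # Consistent snapshot without blocking writers.
        consistency = "--single-transaction"
    elif engine == "myisam":
        # MyISAM has no transactions, so lock tables for the dump.
        consistency = "--lock-tables"
    else:
        raise ValueError(f"unsupported engine: {engine}")
    # Credentials are left to root's ~/.my.cnf, which mysqldump reads itself.
    return ["mysqldump", consistency, name]


if __name__ == "__main__":
    for db, eng in parse_databases("# sample entries\nwiki InnoDB\nforum MyISAM\n"):
        print(" ".join(dump_command(db, eng)))
```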

On Guardian, add the host's principal to /root/.k5login.

On Agni, create a ZFS filesystem for the host and add its FQDN to /root/scripts/backup-hosts so that the migration script knows about it.

Now run the backup script to make sure everything works properly. You can check your excludes file by changing rsync -aq to rsync -av in the backup script and then tailing the script's log file (default is /root/scripts/backup.log). After everything is working, add the backup script to the host's root crontab in the next available timeslot.
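Picking the next available timeslot can be sketched like this, assuming slots run every 5 minutes with the hour and half-hour left empty, and that a new host simply takes the first slot after the latest one in use. The function names and the script path in the crontab line are hypothetical.

```python
from typing import Iterable, List


def day_slots() -> List[str]:
    """All HHMM slots in a day: every 5 minutes except :00 and :30."""
    return [f"{hour:02d}{minute:02d}"
            for hour in range(24)
            for minute in range(0, 60, 5)
            if minute not in (0, 30)]


def next_free_slot(taken: Iterable[str]) -> str:
    """Return the first slot after the latest slot already in use."""
    last = max(taken)  # zero-padded HHMM strings sort chronologically
    for slot in day_slots():
        if slot > last:
            return slot
    raise ValueError("no slot left before midnight")


def cron_entry(slot: str, script: str) -> str:
    """Format a root crontab line that runs the script daily at the slot."""
    hour, minute = int(slot[:2]), int(slot[2:])
    return f"{minute} {hour} * * * {script}"


if __name__ == "__main__":
    slot = next_free_slot(["0205", "0210", "0215"])  # sample schedule values
    print(cron_entry(slot, "/root/scripts/backup"))  # hypothetical script path
```

Note that `next_free_slot` expects at least one slot to be taken already; with an empty schedule, choose the first slot by hand.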

Notes/Bugs

At least on Gentoo Heimdal hosts, it appears that you need to disable GSSAPIDelegateCredentials in order to ssh to Guardian without a password. Add the following to /root/.ssh/config:

Host guardian
GSSAPIDelegateCredentials no

The backup migration script currently relies on hosts backing up on the same day that it migrates the backups. Therefore all backups must start after midnight, or the script will always report that those hosts did not back up successfully (technically, the backup just has to put a date after midnight in the checkin file).

It is possible for the host's backup script to create the checkin file even if the rsync failed for some reason. Therefore it is a good idea to make sure that cron will email any errors to a valid email address so that partially failed backups can be dealt with.
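One way to narrow this gap, shown here as a sketch rather than a description of the existing script, is to write the checkin file only when the transfer command exits with status 0. The function name and the checkin file's date format (ISO, YYYY-MM-DD) are assumptions.

```python
import subprocess
import sys
from datetime import date
from pathlib import Path
from typing import List


def run_and_check_in(command: List[str], checkin_file: Path) -> bool:
    """Run the backup command; write today's date only if it succeeded."""
    result = subprocess.run(command)
    if result.returncode != 0:
        # Anything printed here ends up in cron's mail for the job.
        print(f"backup command failed with exit status {result.returncode}",
              file=sys.stderr)
        return False
    checkin_file.write_text(date.today().isoformat() + "\n")
    return True


if __name__ == "__main__":
    # Stand-in command for demonstration; the real call would be rsync.
    succeeded = run_and_check_in([sys.executable, "-c", "pass"],
                                 Path("checkin-demo.txt"))
    print("checked in" if succeeded else "no checkin written")
```

With this shape, a failed rsync leaves the previous day's checkin in place, so the migration summary reports the host as not backed up instead of hiding the failure.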

Current Backup Schedule

The following is the current schedule of host backups. These are set up primarily in the order in which hosts were added to the system. All hosts run the standard backup script unless noted.