Warning: Livedoc is no longer being updated and will be deprecated shortly. Please refer to https://documentation.tjhsst.edu.


Revision as of 12:55, 7 June 2013 by Andrew Hamilton (talk | contribs) (add detailed information on cluster software)

The CSL SAN (Storage Area Network) is a redundant cluster providing iSCSI and NFS storage to other servers. Its primary purpose is to provide iSCSI storage for VM hard drives.


Snares and Bottom each have an LSI PCI-E dual-port SAS-2 HBA. They are connected via copper SFF-8088 cables to Apocalypse such that each server has access to all of the drives in Apocalypse. Any single server, HBA, or backplane component within Apocalypse can fail without a loss of functionality.

In addition, Rockhopper is included in the cluster as a standby member. It provides a third cluster member to break ties and prevent other messy situations. Because it has no connection to a storage array, it is neither able nor configured to run any SAN services.
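The tie-breaking role of a third member can be illustrated with a small majority calculation. This is only a sketch: it assumes one vote per node and a simple strict-majority rule, neither of which is stated on this page, and the function names are made up for the example.

```python
def quorum_threshold(total_votes: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    return total_votes // 2 + 1


def has_quorum(votes_present: int, total_votes: int) -> bool:
    """True when the surviving members still hold a strict majority."""
    return votes_present >= quorum_threshold(total_votes)


# Assumption: every node carries exactly one vote.
# Two-node cluster (Snares and Bottom only): one failure leaves 1 of 2 votes.
print(has_quorum(1, 2))  # False

# Three-node cluster (with Rockhopper): one failure leaves 2 of 3 votes.
print(has_quorum(2, 3))  # True
```

With two members a single failure leaves no majority, while with three members the survivors can still agree on who owns the SAN services.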


Several pieces of software are used on top of the SAN hardware to provide data management and redundancy, high availability, and iSCSI and NFS access.


ZFS, via the ZFS on Linux project, is used as the filesystem on the SAN. Currently there is a single zpool, called apocalypse, with 10 drives in a RAID-Z2 and an eleventh drive as an online spare. Any two drives in the pool can fail without any loss of availability or data, and after a single drive failure the system automatically starts rebuilding onto the spare disk.
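As a rough illustration of what this layout means for capacity, the sketch below counts data drives in a RAID-Z2 group. The drive size is a hypothetical value (this page does not state it), and the estimate ignores ZFS metadata, padding, and compression, so it is an upper bound at best.

```python
def raidz_data_drives(drives_in_vdev: int, parity: int) -> int:
    """Number of drives' worth of space left for data after parity."""
    if drives_in_vdev <= parity:
        raise ValueError("a RAID-Z group needs more drives than parity devices")
    return drives_in_vdev - parity


def approx_usable_tb(drives_in_vdev: int, parity: int, drive_size_tb: float) -> float:
    """Rough usable capacity; ignores metadata and allocation overhead."""
    return raidz_data_drives(drives_in_vdev, parity) * drive_size_tb


# Hypothetical drive size, chosen only for illustration.
DRIVE_SIZE_TB = 2.0

# RAID-Z2 uses two drives' worth of parity; the online spare holds no data.
print(raidz_data_drives(10, 2))                # 8
print(approx_usable_tb(10, 2, DRIVE_SIZE_TB))  # 16.0
```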

Using ZFS as our base filesystem provides a number of benefits, including transparent data checksumming and compression. ZFS is also capable of transparent deduplication. However, we do not currently have this feature enabled because of the amount of memory it requires and because it is not well tested in ZFS on Linux.


Corosync and Pacemaker are used to provide high availability failover of SAN services in the event of a hardware or software failure.

Corosync provides messaging and cluster engine services between cluster nodes. It handles the establishment of the cluster and the management of membership and quorum within the cluster.

Pacemaker runs on top of Corosync and manages cluster resources using Resource Agents. Pacemaker handles starting and stopping cluster resources on each node as well as failing resources over between nodes.

Resource Agents

Resource agents are small scripts used by Pacemaker that define how to start, stop, and monitor cluster resources. A large number are included by default with Pacemaker and new ones can be written following a specification provided by Pacemaker.

Currently we use the following resource agents in the cluster:

  • ocf:tjhsst:ZPool (custom written)
  • ocf:heartbeat:IPaddr2
  • ocf:heartbeat:IPv6addr
  • ocf:heartbeat:iSCSITarget
  • ocf:heartbeat:iSCSILogicalUnit
  • stonith:external/riloe
  • stonith:meatware
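To make the start/stop/monitor contract concrete, here is a minimal sketch of what a zpool-managing agent could look like. It is not the actual ocf:tjhsst:ZPool agent, whose source is not shown on this page. The parameter name `pool` and the helper names are assumptions, and a real OCF agent must also implement the `meta-data` action, which is omitted here. The exit codes are the standard OCF values.

```python
import os
import subprocess

# Standard OCF exit codes.
OCF_SUCCESS = 0
OCF_ERR_GENERIC = 1
OCF_ERR_UNIMPLEMENTED = 3
OCF_NOT_RUNNING = 7


def run_command(cmd):
    """Run a command quietly and return its exit status."""
    return subprocess.call(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)


def handle(action, pool, runner=run_command):
    """Map an OCF action onto zpool commands and return an OCF exit code."""
    if action == "monitor":
        # 'zpool list' fails when the pool is not imported on this node.
        ok = runner(["zpool", "list", "-H", "-o", "name", pool]) == 0
        return OCF_SUCCESS if ok else OCF_NOT_RUNNING
    if action == "start":
        # Starting an already-running resource must succeed.
        if handle("monitor", pool, runner) == OCF_SUCCESS:
            return OCF_SUCCESS
        ok = runner(["zpool", "import", pool]) == 0
        return OCF_SUCCESS if ok else OCF_ERR_GENERIC
    if action == "stop":
        # Stopping an already-stopped resource must succeed.
        if handle("monitor", pool, runner) == OCF_NOT_RUNNING:
            return OCF_SUCCESS
        ok = runner(["zpool", "export", pool]) == 0
        return OCF_SUCCESS if ok else OCF_ERR_GENERIC
    return OCF_ERR_UNIMPLEMENTED


def main(argv, environ=os.environ):
    """Entry point: the action is the first argument, parameters arrive as OCF_RESKEY_* variables."""
    action = argv[1] if len(argv) > 1 else ""
    pool = environ.get("OCF_RESKEY_pool", "apocalypse")  # 'pool' is a hypothetical parameter name
    return handle(action, pool)
```

A real agent script would end with `sys.exit(main(sys.argv))`. The `runner` argument exists only so the logic can be exercised without touching an actual pool.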


STONITH (Shoot The Other Node In The Head) provides a means for the cluster to ensure the complete removal of a malfunctioning node from the cluster prior to taking over its resources. This is particularly important when managing non-clustered filesystems which will suffer corruption if they are simultaneously activated on multiple cluster nodes.

STONITH provides resource agents that can kill our cluster nodes using the built-in iLO remote management, as well as an agent that asks the systems administrator to kill a node manually if iLO fails.
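The ordering described above (fence first, take over second, fall back to a human if iLO fails) can be sketched as follows. Pacemaker performs this logic internally; the functions and the stubbed fencing methods below are hypothetical and exist only to show the control flow.

```python
from typing import Callable, Sequence


def fence_node(node: str, methods: Sequence[Callable[[str], bool]]) -> bool:
    """Try each fencing method in order; succeed as soon as one confirms the node is down."""
    for method in methods:
        if method(node):
            return True
    return False


def take_over(node: str,
              methods: Sequence[Callable[[str], bool]],
              start_resources: Callable[[], None]) -> bool:
    """Start the failed node's resources only after fencing is confirmed."""
    if not fence_node(node, methods):
        # Without confirmed fencing, importing the pool could corrupt it.
        return False
    start_resources()
    return True


# Stubbed example: remote management is unreachable, the operator confirms manually.
started = []
ilo_unreachable = lambda node: False
operator_confirms = lambda node: True

print(take_over("snares", [ilo_unreachable, operator_confirms],
                lambda: started.append("apocalypse")))  # True
print(started)  # ['apocalypse']
```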

On Gentoo, STONITH is a part of the cluster-glue package.


We use the LIO Unified Target integrated into the Linux kernel to supply iSCSI targets for VM storage. The LIO LUNs are backed by ZFS block devices (ZVOLs). LIO was chosen over other Linux iSCSI targets because it is integrated into the Linux kernel and is actively developed.

Because all LIO targets and LUNs are managed through Pacemaker, it is not necessary to install the management utility (targetcli) on the cluster nodes.
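Since each LUN is backed by a ZVOL, the block device handed to the iSCSI logical unit is the ZVOL's device node, which ZFS on Linux exposes under /dev/zvol/. The helper below is a small sketch of that mapping. The volume name is invented for the example and does not reflect the naming scheme actually used on the SAN.

```python
def zvol_device_path(pool: str, volume: str) -> str:
    """Return the device node that ZFS on Linux creates for a ZVOL."""
    if not pool or not volume or volume.startswith("/"):
        raise ValueError("pool and volume must be non-empty relative names")
    return f"/dev/zvol/{pool}/{volume}"


# Hypothetical volume name, for illustration only.
print(zvol_device_path("apocalypse", "vm-example-disk0"))
# /dev/zvol/apocalypse/vm-example-disk0
```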


We use the NFSv4 server integrated into the Linux kernel to provide NFS exports for VM support and mail storage.