Automated ZPool Scrubber


Project Overview

This is a system administration program written in Go that automates zpool scrubbing. Zpools are part of a ZFS file system, which the next section explains using a snippet taken from the Zeta Systems website. The program reads the date on which each zpool was last scrubbed and runs the scrub command on the zpool that has gone the longest without one. It can also be given a list of zpools to skip. This project was used to scrub one zpool per week automatically by running the program as a cron job on a Linux server.

What is ZFS? (Explained by Zeta Systems)

ZFS is a file system and logical volume manager that is robust, scalable, and easy to administer. It provides greater space for files, hugely improved administration, and greatly improved data integrity. ZFS uses a 128-bit addressing scheme and can store up to 275 billion TB per storage pool.

Storage Pools

ZFS does away with the concept of disk volumes, partitions, and disk provisioning by adopting pooled storage, where all available hard drives in a system are essentially joined together. The combined bandwidth of the pooled devices is available to ZFS, which effectively maximizes storage space, speed, and availability.

ZFS takes available storage drives and pools them together as a single resource, called a “zpool”. This can be optimized for capacity, I/O performance, or redundancy, using striping, mirroring, or some form of RAID. If more storage is needed, then more drives can simply be added to the zpool. ZFS sees the new capacity and starts using it automatically, balancing I/O and maximizing throughput.

Instead of pre-allocating metadata like other file systems, ZFS utilizes dynamically allocated metadata as needed, with no initial space required at the initialization and no limit on the files or directories supported by the file system.

File systems are no longer constrained to individual devices, allowing them to share disk space with all file systems in the pool. You no longer need to predetermine the size of a file system, as file systems grow automatically within the disk space allocated to the storage pool. When new storage is added, all file systems within the pool can immediately use the additional disk space without additional work.

Scrubbing

ZFS can be scheduled to perform a “scrub” on all the data in a storage pool, checking each piece of data against its corresponding checksum to verify its integrity, detect any silent data corruption, and correct any errors where possible.

When the data is stored in a redundant fashion (in a mirrored or RAID-type array), it can be self-healed automatically and without any administrator intervention. Since data corruption is logged, ZFS can bring to light defects in memory modules (or other hardware) that cause data to be stored on hard drives incorrectly.

Scrubbing is given low I/O priority so that it has minimal effect on system performance and can operate while the storage pool is in use.

Learning Outcomes

This is my very first project using the Go programming language, which was chosen for efficiency. I learned how to write and compile Go code, how to deploy it as a static binary, and how to use the GOPATH environment. Furthermore, I learned more about ZFS file systems and how to administer them on a large scale (the CU Boulder Data Center).

Project Repo: https://github.com/Novota15/Automated-Zpool-Scrub
