ZFS Notes
=========

These are some notes about working with OpenZFS (mainly on MacOSX, but they
should apply generally to any system using OpenZFS).

These notes use the standard Bourne shell convention of '$' for the shell
prompt for a non-root / unprivileged user and assume that this user uses the
'sudo' command to execute commands with the necessary privileges. When working
as a privileged user (e.g. root), sudo may be omitted. The prompt is not part
of the commands listed and may be different (for example, csh/tcsh
historically used '%' as the prompt).

These notes refer to a conventional spinning disk as a hard drive and a drive
based on non-volatile storage (such as flash) as a solid state drive or ssd.

-------------------
High Level Concepts
-------------------

There are two important high level concepts when working with ZFS. The first
is the pool (or zpool), which is a group of one or more disks. The second is
the dataset, which lives within a pool and behaves like a conventional
filesystem.
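For example, once the pool 'Mercury' and the volume 'Mercury/crypto' from the
examples below exist, the difference between the two concepts can be seen by
listing each (a minimal sketch; the names are simply the examples used
throughout these notes):

$ zpool list
$ zfs list -r Mercury

The first command lists pools (whole groups of disks, with their overall
capacity and health), while the second lists the datasets within the pool
'Mercury', each of which has its own properties such as compression, record
size, and encryption.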
----------------
Creating a zpool
----------------

To work with ZFS, a pool of one or more disks (called a 'zpool') first needs
to be created.

1. For a single hard disk:

   $ sudo zpool create -f -o ashift=12 \
       -o feature@async_destroy=enabled \
       -o feature@empty_bpobj=enabled \
       -o feature@lz4_compress=enabled \
       -o feature@spacemap_v2=enabled \
       -O casesensitivity=insensitive \
       -O normalization=formD \
       -O compression=lz4 \
       [name] [device]

   where [name] is the name of the pool, and [device] is the operating system
   device identifier for the hard disk.

   For example, if the hard drive is /dev/disk5 (as shown by '$ diskutil list'
   on MacOSX), then the following command will create a pool named 'Mercury'
   on that hard drive:

   $ sudo zpool create -f -o ashift=12 \
       -o feature@async_destroy=enabled \
       -o feature@empty_bpobj=enabled \
       -o feature@lz4_compress=enabled \
       -o feature@spacemap_v2=enabled \
       -O casesensitivity=insensitive \
       -O normalization=formD \
       -O compression=lz4 \
       Mercury disk5

   The '-O casesensitivity=insensitive' and '-O normalization=formD' options
   are optional, but helpful on MacOSX because they make the pool act more
   like a standard Mac disk, see:

   https://openzfsonosx.org/wiki/Zpool#Feature_flags

   Note that this command also enables compression for the zpool, which I
   prefer. For more information about ZFS compression, see:

   https://klarasystems.com/articles/openzfs1-understanding-transparent-compression/

   To enable encryption for the entire pool, add '-O encryption=aes-256-ccm
   -O keyformat=passphrase' to the list of options (a combined example is
   sketched after this list), see:

   https://openzfsonosx.org/forum/viewtopic.php?f=11&t=3713#p11803

2. To combine multiple hard drives and make them seem like one larger drive:

   $ sudo zpool create -f -o ashift=12 \
       -o feature@async_destroy=enabled \
       -o feature@empty_bpobj=enabled \
       -o feature@lz4_compress=enabled \
       -o feature@spacemap_v2=enabled \
       -O casesensitivity=insensitive \
       -O normalization=formD \
       -O compression=lz4 \
       [name] [device1] ... [deviceN]

   where [name] is the name of the pool, and [device1] through [deviceN] are
   the operating system device identifiers for the hard disks being combined.
   This is the equivalent of a RAID 0 volume and offers no protection against
   drive failure, see:

   https://en.m.wikipedia.org/wiki/Standard_RAID_levels

   I do not generally use this.

3. To mirror the contents of one drive to multiple other drives:

   $ sudo zpool create -f -o ashift=12 \
       -o feature@async_destroy=enabled \
       -o feature@empty_bpobj=enabled \
       -o feature@lz4_compress=enabled \
       -o feature@spacemap_v2=enabled \
       -O casesensitivity=insensitive \
       -O normalization=formD \
       -O compression=lz4 \
       [name] mirror [device1] ... [deviceN]

   where [name] is the name of the pool, and [device1] through [deviceN] are
   the operating system device identifiers for the hard disks being mirrored.
   The 'mirror' keyword between [name] and [device1] through [deviceN] tells
   zfs to mirror the contents of the first disk onto all the other disks. This
   is the equivalent of a RAID 1 volume and preserves data as long as at least
   one drive is operational, see:

   https://en.m.wikipedia.org/wiki/Standard_RAID_levels

4. When the disks in the zpool are ssds, use '-o ashift=13' instead of
   '-o ashift=12', as above. An ashift value of 13 may be better for ssds to
   avoid performance degradation over time, see, e.g.:

   https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Workload%20Tuning.html#alignment-shift-ashift
   https://old.reddit.com/r/zfs/comments/6duz3j/zfs_on_a_purely_ssd_setup/

   In addition, for ssds, TRIM support should be enabled:

   $ sudo zpool set autotrim=on [name]

   where [name] is the name of the pool. TRIM is an ssd-specific feature that
   helps limit performance degradation of ssds, see:

   https://en.m.wikipedia.org/wiki/Trim_(computing)

   For SMR hard drives, TRIM support should be disabled:

   $ sudo zpool set autotrim=off [name]

   See:

   https://vermaden.wordpress.com/2022/05/08/zfs-on-smr-drives/
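Putting several of the options above together, the following sketch shows how
a pool on a single ssd might be created with an ashift of 13, pool-wide
encryption, and TRIM enabled. The pool name 'Venus' and the device 'disk6' are
made up for illustration; substitute the actual pool name and device
identifier:

$ sudo zpool create -f -o ashift=13 \
    -o feature@async_destroy=enabled \
    -o feature@empty_bpobj=enabled \
    -o feature@lz4_compress=enabled \
    -o feature@spacemap_v2=enabled \
    -O casesensitivity=insensitive \
    -O normalization=formD \
    -O compression=lz4 \
    -O encryption=aes-256-ccm \
    -O keyformat=passphrase \
    Venus disk6
$ sudo zpool set autotrim=on Venus

The first command will prompt for a passphrase, because
'-O keyformat=passphrase' is used with the default key location (prompt).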
----------------------------
Mounting / Importing a zpool
----------------------------

Once created, a zpool can be mounted as follows:

$ sudo zpool import [name]

For example, the following command imports the zpool named 'Mercury' created
above:

$ sudo zpool import Mercury

By default, a zpool is mounted in read-write mode, so that data can be read
from and written to the disk(s) in the zpool. To mount a zpool as read-only,
so that data can only be read from it and no data can be written to it (for
example to avoid data corruption), the following command can be used:

$ sudo zpool import -o readonly=on [name]

For example, the following command imports the zpool named 'Mercury' created
above in read-only mode:

$ sudo zpool import -o readonly=on Mercury

-------------------------------
Disabling Spotlight for a zpool
-------------------------------

On MacOSX, spotlight is a built-in desktop search feature that I do not like
enabled on zpools. It can be disabled for a zpool as follows:

$ sudo mdutil -i off /Volumes/[name]
$ sudo mkdir -p /Volumes/[name]/.fseventsd
$ sudo touch /Volumes/[name]/.fseventsd/no_log
$ sudo touch /Volumes/[name]/.metadata_never_index

where [name] is the zpool's name.

Sources:

https://openzfsonosx.org/wiki/Stopping_Spotlight_etc._from_changing_ZFS_without_permission
https://apple.stackexchange.com/questions/6707/

----------------------------------
Listing Available / Mounted zpools
----------------------------------

Currently available / mounted zpools can be listed as follows:

$ zpool list

-----------------------------------------
Unmounting / Ejecting / Exporting a zpool
-----------------------------------------

If a zpool is mounted, it can be unmounted / ejected as follows:

$ sudo zpool export [name]

All mounted zpools can be ejected as follows:

$ sudo zpool export -a

----------------
Renaming a zpool
----------------

Once a pool has been created, it can be renamed by first exporting the pool
(if it is currently mounted), and then reimporting it with its new name:

$ sudo zpool export [old name]
$ sudo zpool import [old name] [new name]

where [old name] is the zpool's current name, and [new name] is the zpool's
new desired name.

For example, the following commands export the zpool Mercury and rename it to
Mercury7:

$ sudo zpool export Mercury
$ sudo zpool import Mercury Mercury7

Source: https://forums.freebsd.org/threads/renaming-zfs-pool-via-zpool-import.65498/

-----------------
Scrubbing a zpool
-----------------

The data integrity on a zpool can be confirmed by 'scrubbing' it.

1. To start a scrub:

   $ sudo zpool scrub [name]

2. To monitor a scrub that is in progress:

   $ zpool status [secs]

   where [secs] is the refresh interval in seconds

3. To stop / cancel a scrub that is in progress:

   $ sudo zpool scrub -s [name]

4. To pause a scrub that is in progress:

   $ sudo zpool scrub -p [name]

5. To resume a paused scrub (this is the same command used to start a new
   scrub):

   $ sudo zpool scrub [name]

In each of the above commands, [name] is the zpool's name.

Source: https://openzfs.github.io/openzfs-docs/man/8/zpool-scrub.8.html

-------------------------------------------------
Creating an Encrypted Dataset / Volume on a zpool
-------------------------------------------------

An encrypted volume may be created on a zpool as follows:

$ sudo zfs create -o com.apple.browse=off \
    -o com.apple.mimic_hfs=on \
    -o encryption=on \
    -o keylocation=prompt \
    -o keyformat=passphrase \
    [pool]/[vol]

where [pool] is the zpool's name and [vol] is the name of the encrypted
volume. This command will prompt for the password for the encrypted volume.

For example, the following command creates an encrypted volume named 'crypto'
on the zpool Mercury, which was created above:

$ sudo zfs create -o com.apple.browse=off \
    -o com.apple.mimic_hfs=on \
    -o encryption=on \
    -o keylocation=prompt \
    -o keyformat=passphrase \
    Mercury/crypto

The '-o com.apple.browse=off' and '-o com.apple.mimic_hfs=on' options are
useful for making the encrypted volume act like a traditional MacOS / MacOSX
drive but may be omitted if that compatibility is not needed.

Source: https://blog.heckel.io/2017/01/08/zfs-encryption-openzfs-zfs-on-linux/
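After the pool holding an encrypted volume has been exported and re-imported
(or after a reboot), the volume's key has to be loaded before the volume can
be mounted again. A minimal sketch, using the 'Mercury/crypto' volume created
above and assuming the default prompt-based key handling shown there:

$ sudo zfs load-key Mercury/crypto
$ sudo zfs mount Mercury/crypto

The load-key command prompts for the passphrase chosen when the volume was
created. To lock the volume again:

$ sudo zfs unmount Mercury/crypto
$ sudo zfs unload-key Mercury/crypto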
------------------------------------
Enable sha512 Checksums for a Volume
------------------------------------

To enable SHA512 checksums for data on a volume / filesystem:

$ sudo zfs set checksum=sha512 [pool]/[vol]

where [pool] is the zpool's name and [vol] is the name of the volume /
filesystem on that zpool.

SHA512 checksums are about 50% faster than sha256 on 64-bit systems, see:

https://www.freebsd.org/cgi/man.cgi?query=zpool-features&sektion=7

Alternatively, 'checksum=skein', which is about 80% faster than sha256 on
64-bit systems, or 'checksum=edonr', which is about 350% faster than sha256,
could be used. I do not use 'checksum=edonr' because FreeBSD does not
currently support it. I may transition to 'checksum=skein' in the future since
it may be slightly more secure than sha512, see:

https://www.freebsd.org/cgi/man.cgi?query=zpool-features&sektion=7
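For example, the following sketch switches the 'crypto' volume on the Mercury
zpool (created above) to skein and then confirms the setting. This assumes the
pool has the skein feature available (it is part of modern OpenZFS releases),
and note that a changed checksum property only applies to data written after
the change:

$ sudo zfs set checksum=skein Mercury/crypto
$ zfs get checksum Mercury/crypto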
------------------------------------------------
Increasing / Decreasing Record Size for a Volume
------------------------------------------------

For volumes where large files are going to be stored, increasing the record
size from the default of 128K may provide some performance benefits. The
record size can be set as follows:

$ sudo zfs set recordsize=[size] [pool]/[vol]

where:

[size] is the desired record size
[pool] is the zpool's name
[vol]  is the name of the volume / filesystem

For example, the following command sets the record size for the 'crypto'
encrypted volume on the Mercury zpool created above to 1 megabyte (1M):

$ sudo zfs set recordsize=1M Mercury/crypto

If very small files are going to be stored on a volume, decreasing the record
size may provide some performance benefits. For example, the following command
would reduce the record size to 64 kilobytes (64K):

$ sudo zfs set recordsize=64K Mercury/crypto

A few rules of thumb for setting the record size are:

1024K for general-purpose file sharing/storage
64K   for KVM virtual machines using Qcow2 file-based storage
16K   for MySQL InnoDB
8K    for PostgreSQL

Sources:

https://blog.programster.org/zfs-record-size
https://klarasystems.com/articles/tuning-recordsize-in-openzfs/

-------------------------------
Mounting All Volumes on a zpool
-------------------------------

The following command can be used to mount all the volumes on a zpool:

$ zfs list -rH -o name [pool] | xargs -L 1 sudo zfs mount

where [pool] is the name of the zpool.

Source: https://serverfault.com/questions/450818/recursively-mounting-zfs-filesystems

-----------------------------------------
Destroying a Volume/Filesystem on a zpool
-----------------------------------------

To destroy a volume / filesystem on a zpool:

$ sudo zfs destroy [pool]/[vol]

where:

[pool] is the zpool's name
[vol]  is the name of the volume / filesystem that should be destroyed

WARNING: BE CAREFUL - THERE IS NO PROMPT FOR CONFIRMATION

------------------
Creating Snapshots
------------------

To create a snapshot:

$ zfs snapshot [pool]/[vol]@[snap]

where:

[pool] is the zpool's name
[vol]  is the name of the volume / filesystem for which a snapshot should be
       created
[snap] is the name for the snapshot (ex. 2021-11-05-01)

------------------
Renaming Snapshots
------------------

To rename a snapshot:

$ zfs rename [pool]/[vol]@[old_name] [new_name]

where:

[pool]     is the zpool's name
[vol]      is the name of the volume / filesystem on which the snapshot that
           is being renamed is located
[old_name] is the current name for the snapshot (ex. 2021-11-05-01)
[new_name] is the new name for the snapshot

--------------------
Destroying Snapshots
--------------------

To destroy a snapshot:

$ zfs destroy [pool]/[vol]@[snap]

where:

[pool] is the zpool's name
[vol]  is the name of the volume / filesystem on which the snapshot is located
[snap] is the name of the snapshot to destroy (ex. 2021-11-05-01)

WARNING: BE CAREFUL - THERE IS NO PROMPT FOR CONFIRMATION

Sources:

https://www.freebsd.org/doc/handbook/zfs-zfs.html
https://serverfault.com/questions/192927/

-----------------
Listing Snapshots
-----------------

1. To list all snapshots:

   $ zfs list -t snapshot

   or

   $ zfs list -t all

2. To list the latest snapshot on each volume (this example filters for
   volumes whose names contain 'enc/'; adjust the awk pattern to match the
   volumes of interest):

   $ zfs list | awk '/enc\// { print $1; }' | \
       while read VOL ; do \
         zfs list -H -t snapshot "$VOL" | \
         sort -rn | head -1 | awk '{ print $1; }'; \
       done

---------------------
Replicating Snapshots
---------------------

1. The first time a snapshot is being replicated, the entire snapshot must be
   sent, as follows:

   $ zfs send -ec -v [src_pool]/[src_vol]@[snap] | \
       zfs receive -s -v [dest_pool]/[dest_vol]

   where:

   [src_pool]  is the name of the zpool where the snapshot is located
   [src_vol]   is the name of the filesystem where the snapshot is located
   [snap]      is the snapshot's name (ex. 2021-11-05-01)
   [dest_pool] is the name of the zpool to which the snapshot will be
               replicated
   [dest_vol]  is the name of the filesystem to which the snapshot will be
               replicated

2. Once a snapshot has been replicated, incremental differences can be sent
   as follows:

   $ zfs send -ec -v -i [src_pool]/[src_vol]@[snap_old] \
       [src_pool]/[src_vol]@[snap_new] | \
       zfs receive -s -v -F [dest_pool]/[dest_vol]

   where:

   [src_pool]  is the name of the zpool where the snapshot is located
   [src_vol]   is the name of the filesystem where the snapshot is located
   [snap_old]  is the starting snapshot's name (ex. 2021-11-05-01); this
               snapshot must already exist on [dest_pool]/[dest_vol]
   [snap_new]  is the ending snapshot's name (ex. 2021-12-05-01)
   [dest_pool] is the name of the zpool to which the snapshot will be
               replicated
   [dest_vol]  is the name of the filesystem to which the snapshot will be
               replicated

3. If replication fails before completing, it can be resumed as follows:

   $ TOKEN="`zfs get -H receive_resume_token [dest_pool]/[dest_vol] | \
       awk '{ print $3;}'`"
   $ zfs send -ec -v -t "$TOKEN" | zfs receive -s -v [dest_pool]/[dest_vol]

   where:

   [dest_pool] is the name of the zpool to which the snapshot will be
               replicated
   [dest_vol]  is the name of the filesystem to which the snapshot will be
               replicated

   In some cases, it may be necessary to abort the incomplete replicated
   snapshot instead, which can be done as follows:

   $ zfs receive -A [dest_pool]/[dest_vol]

Sources:

https://www.freebsd.org/doc/handbook/zfs-zfs.html
https://forums.freebsd.org/threads/incremental-zfs-backup-errors.44530/
https://unix.stackexchange.com/questions/343675/

-------------
Helpful Links
-------------

ZFS For Dummies - https://ikrima.dev/dev-notes/homelab/zfs-for-dummies/

-------
History
-------

09/05/2023 - Add link to ZFS For Dummies
09/04/2023 - Convert to html; add notes about enabling encryption when creating a zpool
05/09/2022 - Add tips for SMR drives
04/22/2022 - Initial Version