If this is your first time seeing this series, I recommend starting at the beginning.
Building a New NAS
You can find all the other posts below; this is the 10th post in the series.
Why I chose ZFS
As mentioned before, there are a few popular options for the operating system and RAID software for a DIY NAS. The most popular choices are FreeNAS and Unraid. I went a little outside the norm and chose Ubuntu running ZFS.
ZFS is a tight combination of RAID, a volume manager, and a file system.
There are a few reasons for this; the primary one is that I really wanted to use ZFS but didn't like FreeNAS. FreeNAS is fantastic if you want to use ZFS, but the other features it includes are not implemented as well. If you are looking for a NAS-only solution, FreeNAS supports ZFS well and is used by many enterprises. If you want virtual machines and more than just storage, it is less appealing. Most importantly, its virtual machine support is based on bhyve, which isn't very popular, and I couldn't do much with it.
What ZFS offers
- Copy-on-write support
- Snapshots
- Replication
- Efficient Compression
- Caching
- Bit Rot Protection
The one thing that makes ZFS so good is how well it protects your data from loss. When using RAID, disk failure is not your only concern; bit rot is another major issue you need to worry about.
"Data degradation is the gradual corruption of computer data due to an accumulation of non-critical failures in a data storage device. The phenomenon is also known as data decay, data rot or bit rot."
-Wikipedia
ZFS uses copy-on-write, which means any new data is written to new blocks rather than overwriting existing ones. If your system goes down in the middle of a write, the damage is minimized: ZFS keeps track of all data written and can handle recovery automatically.
All data written to ZFS volumes is checksummed, and the checksum is verified when the data is read back. This allows ZFS to automatically detect errors and, with redundancy, correct them.
ZFS can also create snapshots of your data and let you instantly recover files from a point in time, or entire volumes if need be. This is one of the major benefits of a copy-on-write file system.
ZFS RAID Levels
ZFS supports all the RAID levels offered by most RAID solutions, and then some.
RAID 0 / Striping
RAID 0 splits data evenly between multiple disks to increase read/write performance at the expense of data protection. RAID 0 is not used often by itself because any single drive failure causes data loss across all your disks. It is frequently used in combination with other RAID levels, as you will soon see.
RAID 0 cannot survive any disk failures.
RAID 1 / Mirroring
RAID 1 is when two disks act as complete copies of one another. You lose some write performance, but read performance can be nearly doubled by reading from both disks at the same time. Mirroring, or RAID 1, has the highest storage cost.
ZFS takes this one step further and allows n-way mirroring: three or more disks all storing the same copy of the data. This can dramatically improve read performance.
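As a quick sketch of what this looks like in practice (the pool name "tank" and the device paths are placeholders, not the disks from this build):

```shell
# Two-way mirror: a standard RAID 1 pair.
# Device paths are examples; /dev/disk/by-id/ names are safer in practice.
zpool create tank mirror /dev/sdb /dev/sdc

# Three-way (n-way) mirror: every disk holds a full copy of the data,
# so reads can be served from any of the three.
zpool create tank mirror /dev/sdb /dev/sdc /dev/sdd
```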
RAID 1 can handle one disk failure and remain functional; with an n-way mirror, you can lose up to n - 1 disks.
RAID 5 / Raidz
RAID 5 is a very common RAID level because of its low cost: only one additional disk is required to protect any number of other disks, although it is typically recommended not to go beyond five drives in a RAID 5 array. RAID 5 has excellent read performance but suffers on writes because it has to calculate and write parity.
RAID 5 can survive a single disk failure.
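For reference, a raidz pool is created much like a mirror (pool name and device paths below are placeholders):

```shell
# Four disks in a raidz vdev: roughly three disks of usable space,
# with one disk's worth of parity protecting them.
zpool create tank raidz /dev/sdb /dev/sdc /dev/sdd /dev/sde
```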
RAID 6 / Raidz2
RAID 6 is exactly like RAID 5 but dedicates an additional disk's worth of capacity to parity. With RAID 6 you might have 3-7 (or more) data disks with two additional drives storing parity data. The parity performance penalty for RAID 6 is even higher than for RAID 5.
One of the biggest advantages of RAID 6 is still having data protection while a drive has failed. This matters because RAID 5 needs to rebuild the missing disk, which puts a lot of strain on the remaining disks, and it is not uncommon for another drive to fail during this process. With RAID 5 you would experience complete data loss in that situation; with RAID 6 you would still be protected. Drive failures are also often not discovered or fixed immediately, and RAID 6 gives you more time before complete failure.
With RAID 6 you can lose up to two disks without losing data.
RAID 7 / Raidz3
RAID 7 is the next step past RAID 6, with three drives' worth of capacity dedicated to data protection. Write performance is considerably worse than RAID 5 because three sets of parity must be calculated and written. RAID 7 isn't as common due to the large performance penalty; most larger deployments use a hybrid RAID layout instead.
RAID 7 can lose up to three disks without any data loss.
RAID 10 / Striped Mirrors
RAID 10 is when you stripe (RAID 0) across two or more RAID 1 mirrors. This lets you gain more performance and space than a single mirrored pair of disks is capable of. RAID 10 has excellent random write performance compared to the other options while still offering good data protection.
RAID 10 can survive one or more disk failures before losing data, depending on which disks fail. If the failed disks are not in the same mirror set, you will not experience data loss. For example, with two mirrors (RAID 1) and a RAID 0 stripe between them, you can lose one disk from either mirror, or one from each, with no data loss. If both disks in the same mirror fail at once, the stripe between the mirrors is broken and you lose all your data.
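A striped-mirror pool is just a zpool create with more than one mirror vdev (pool name and device paths are placeholders):

```shell
# Two mirrored pairs; ZFS stripes writes across both vdevs.
# Losing one disk per mirror is survivable; losing both disks
# of the same mirror loses the pool.
zpool create tank \
  mirror /dev/sdb /dev/sdc \
  mirror /dev/sdd /dev/sde
```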
RAID 50 / Striped Raidz
With ZFS, the same idea extends RAID 50 to RAID 60 or 70. RAID 50 combines the storage efficiency of raidz (RAID 5) with some of the performance of striping (RAID 0).
Depending on your raidz level, you can lose up to one disk per raidz vdev (two for raidz2, three for raidz3) without losing data.
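The same pattern applies here: list multiple raidz vdevs and ZFS stripes across them (a sketch with placeholder names):

```shell
# RAID 50 style: two raidz vdevs striped together.
# Each vdev tolerates one failed disk; two failures in the
# same vdev would lose the pool.
zpool create tank \
  raidz /dev/sdb /dev/sdc /dev/sdd /dev/sde \
  raidz /dev/sdf /dev/sdg /dev/sdh /dev/sdi
```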
Compression
ZFS has fantastic compression support. Traditionally I would recommend against compression on a file system, but the way ZFS does it can actually improve performance. By compressing data you trade some CPU time for a reduction in the amount of data written to disk, and since CPUs are orders of magnitude faster than spinning disks, this tradeoff works in our favor. ZFS takes it one step further and stops compressing when it isn't getting a good ratio, which typically happens with data that is already compressed or simply doesn't compress well. This prevents wasting CPU cycles on it.
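Turning compression on is a one-liner; lz4 is the usual choice and bails out early on incompressible data, which is exactly the behavior described above (the dataset name is a placeholder):

```shell
# Enable LZ4 compression on a dataset; newly written data is compressed.
zfs set compression=lz4 tank/data

# Check how well the data is actually compressing.
zfs get compressratio tank/data
```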
Caching
ZFS can use an SSD or NVMe drive as a caching device for an array. This can provide a dramatic performance improvement for your read or write activity.
ZFS has two levels of caching:
- ZIL (Write)
- ARC and L2ARC (Read)
When ZFS receives a synchronous write, it first records the data in the ZFS Intent Log (ZIL) and later flushes it out as evenly as possible across your pool. Without a dedicated log device, the ZIL lives on the pool itself, which means the data is effectively written to the pool twice in most cases.
ARC (in RAM) and L2ARC (on a cache device) act as read caches for data that may be accessed again in the near future.
Putting the ZIL on a fast dedicated device in front of slower raidz/raidz2/raidz3 pools can dramatically improve write performance. ARC/L2ARC can be extremely helpful for database applications where a lot of data is accessed frequently.
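Adding cache devices to an existing pool looks roughly like this (device names are placeholders; both would typically be SSDs or NVMe drives):

```shell
# Dedicated log device (SLOG) to absorb synchronous writes
# that would otherwise hit the slow pool twice.
zpool add tank log /dev/nvme0n1

# L2ARC read cache device for frequently accessed data.
zpool add tank cache /dev/nvme1n1
```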
Data scrubbing
Data scrubbing is an on-demand job that walks through your data looking for corruption and automatically fixes anything it finds. It is typically scheduled to run weekly or monthly depending on your environment.
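A scrub is kicked off manually (or from a cron job), and its progress shows up in the pool status; "tank" is a placeholder pool name:

```shell
# Start a scrub of every block in the pool.
zpool scrub tank

# Check progress and any errors found or repaired.
zpool status tank
```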
Snapshots
One of the awesome benefits of copy-on-write file systems is snapshots. How does a full backup taken every 15 minutes, completing instantly with zero impact on the performance of your system, sound? Pretty freaking rad, right?
Because new data does not overwrite old blocks, ZFS can keep track of which blocks belong to each snapshot and simply hand you the blocks assigned to that snapshot.
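In practice, snapshot management is only a couple of commands (dataset and snapshot names below are placeholders):

```shell
# Take an instant point-in-time snapshot of a dataset.
zfs snapshot tank/data@before-upgrade

# List snapshots, then roll the whole dataset back if needed.
zfs list -t snapshot
zfs rollback tank/data@before-upgrade
```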
Write hole prevention
When writing to RAID 5/6/7 arrays, a power failure can hit after the data is written but before the parity is updated. This leaves your data and parity inconsistent, a vulnerability called the write hole: when you later attempt to recover a failed array, the recovery fails because the parity data is not intact. Because ZFS never updates data in existing blocks and instead writes to new, unused blocks, it can automatically detect and recover from this situation.
How resilient is ZFS?
NAS 2019 Build Series
- Building a New NAS
- NAS Build 2019 Step 1 - Ordering Parts
- NAS Build 2019 Step 2 - First parts delivery
- NAS Build 2019 Step 3 - Fan Upgrades
- NAS Build 2019 Step 4 - Power Supply Installation
- NAS Build 2019 Step 5 - Final Parts Delivery
- NAS Build 2019 Step 6 - CPU Installation
- NAS Build 2019 Step 7 - Hard Disks!
- NAS Build 2019 Step 8 - Firmware Updates
- NAS Build 2019 Step 9 - OS Install & Initial Testing