How to Calculate the Odds of Physical Attack Data Loss for a ZFS Array
Note: This method should be generalizable to other array and filesystem types. The general method is:
1. Calculate the number of drive destruction combinations that result in data loss for a given number of destroyed drives
2. Calculate the number of drive destruction combinations that do not result in data loss for the same number of destroyed drives
3. Sum the above
4. Divide the result of step 1 by the result of step 3
Consider a ZFS array of identical vdevs with a given redundancy level, r. Assume a physical attacker with no knowledge of the array's configuration destroys r + 1 drives, the minimum number of destroyed drives necessary to result in data loss. What is the probability P that said destruction actually results in data loss?
Because the drives are being deliberately and randomly destroyed, this calculation is completely independent of drive specs and reliability data. For example, a drive's AFR has no effect on whether it is destroyed when tossed into a shredder.
For a thorough discussion on array reliability based on drive specs and reliability, see High Availability and Disaster Recovery, Concepts, Design, Implementation by Klaus Schmidt.
ZFS arrays stripe data across vdevs at the top level, with no parity. This means the loss of a single vdev in a ZFS array results in data loss.
It is possible for a ZFS array to lose more than r + 1 drives without suffering data loss. Consider, for example, an array containing 2 x 4 HDD RAIDZ2 vdevs. The redundancy is given by the "2", meaning that each vdev can lose 2 HDDs without suffering data loss. If either vdev by itself loses 3 or more HDDs, though, the array suffers data loss. Ergo r + 1 in this case is 2 + 1 = 3.
However, what if both vdevs lose 2 HDDs each, for a total of 4 HDDs? Because neither vdev has exceeded its redundancy, neither would suffer data loss.
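The 2 x 4 HDD RAIDZ2 example can be checked by brute force. The following is a minimal Python sketch, not part of the original method. The function name `count_loss_sets` and its parameters are hypothetical, and it assumes every set of destroyed drives is equally likely.

```python
from itertools import combinations


def count_loss_sets(vdevs, drives_per_vdev, redundancy, destroyed):
    """Return (loss_sets, total_sets) over all sets of `destroyed` drives."""
    # Label every drive with the index of the vdev it belongs to.
    vdev_of = [v for v in range(vdevs) for _ in range(drives_per_vdev)]
    loss_sets = 0
    total_sets = 0
    for combo in combinations(range(len(vdev_of)), destroyed):
        total_sets += 1
        # Count how many destroyed drives fall in each vdev.
        per_vdev = [0] * vdevs
        for drive in combo:
            per_vdev[vdev_of[drive]] += 1
        # The array loses data if any single vdev exceeds its redundancy.
        if any(count > redundancy for count in per_vdev):
            loss_sets += 1
    return loss_sets, total_sets


# 2 x 4 HDD RAIDZ2 with 3 and then 4 destroyed drives.
print(count_loss_sets(2, 4, 2, 3))
print(count_loss_sets(2, 4, 2, 4))
```

With 4 destroyed drives, the sets that do not cause data loss are exactly those that take 2 drives from each vdev.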
Clearly, the aforesaid is a probability problem. What might not be immediately obvious, though, is that it's also a combinations problem. For a quick primer on this, see the Combinations heading here.
The key equation to keep in mind here is the one that gives the number of unique combinations (read: order doesn't matter, no repetition) in which r items can be chosen from a larger set of n items:
Eq. 1: n!/(r!(n - r)!)
The second equation to keep in mind is the one that gives the number of permutations in which r items can be chosen from a larger set of n items:
Eq. 2: n!/(n - r)!
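As a quick illustration, both equations can be evaluated in Python, where the standard library functions `math.comb` and `math.perm` (Python 3.8 and later) compute the same quantities. The values n = 8 and r = 3 are only example inputs.

```python
import math

n, r = 8, 3  # example values only

# Eq. 1: combinations, order does not matter.
eq1 = math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

# Eq. 2: permutations, order matters.
eq2 = math.factorial(n) // math.factorial(n - r)

# The standard library provides both directly.
assert eq1 == math.comb(n, r)
assert eq2 == math.perm(n, r)

print(eq1, eq2)
```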
Two array types are considered: those containing only RAIDZr vdevs and those containing only mirror vdevs.
The following variables are defined:
- F, the minimum number of destroyed drives necessary for data loss
  - F = 2 for all ZFS arrays containing only mirrors
  - F = r + 1 for ZFS arrays containing only RAIDZr vdevs
- N, the total number of drives the array has before any drive destruction
- V, the total number of vdevs
- D, the number of drives per vdev = N/V
  - D = 2 for all ZFS arrays containing only mirrors
- L, the total number of combinations of F destroyed drives that result in data loss
- I, the total number of combinations of F destroyed drives that do not result in data loss
- C, the total number of combinations of F destroyed drives = L + I
- P, the probability of data loss = L/C
Data loss occurs whenever F drives from any vdev are destroyed. Combinatorially, this is the same as picking any F drives from a vdev (3 drives for RAIDZ2). The number of such combinations per vdev is therefore:
D!/(F!(D - F)!)
However, because this can be done for each vdev and only needs to happen to 1 vdev for data loss to occur, the above expression must be multiplied by V, such that:
L = V(D!/(F!(D - F)!))
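The expression for L translates directly into code. This is a small sketch with a hypothetical function name, `raidz_loss_combinations`, using the variables V, D, and F as defined earlier.

```python
from math import comb


def raidz_loss_combinations(V, D, F):
    """L = V * (D choose F): all F destroyed drives land in a single vdev."""
    return V * comb(D, F)


# 2 x 4 HDD RAIDZ2: V = 2, D = 4, F = r + 1 = 3.
print(raidz_loss_combinations(2, 4, 3))
```

For the 2 x 4 HDD RAIDZ2 array this gives L = 2 x 4 = 8.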
Data loss does not occur when fewer than F drives are destroyed per vdev.
Assume 1 drive is picked from one vdev and the remaining F - 1 = r drives are picked from a second vdev. From the D drives of that second vdev, pick r drives:
D!/(r!(D - r)!)
Because that combination occurs for each of the D drives that could be picked from the first vdev, multiply by D:
D(D!/(r!(D - r)!))
Because the above combination occurs for every permutation of 2 vdevs, multiply by that factor:
(V!/(V - 2)!)D(D!/(r!(D - r)!))
For V ≥ 3, 1 drive can also be destroyed in each of F = 3 different vdevs (the RAIDZ2 case). The number of tuples consisting of 1 drive from each of those vdevs is:
D^F
This can be done for any 3 vdevs in the array, so multiply by that combination:
(V!/(3!(V - 3)!))D^F
Putting all of the above together:
I = (V!/(V - 2)!)D(D!/(r!(D - r)!)) + (V!/(3!(V - 3)!))D^F for V ≥ 3
and
I = (V!/(V - 2)!)D(D!/(r!(D - r)!)) for V < 3
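A useful sanity check is that L + I should equal the total number of ways to pick F drives from all N drives. The sketch below does this for RAIDZ2 only (r = 2, F = 3), which is the case the 3-vdev term above is written for; other redundancy levels are not checked here. The function name `raidz2_counts` is hypothetical.

```python
from fractions import Fraction
from math import comb, perm


def raidz2_counts(V, D):
    """Return (L, I, C) for V RAIDZ2 vdevs of D drives, 3 drives destroyed."""
    r, F = 2, 3
    # Data loss: all F destroyed drives fall in one vdev.
    L = V * comb(D, F)
    # No data loss: 1 drive from one vdev and r drives from another.
    # perm(V, 2) = V!/(V - 2)! counts ordered pairs of vdevs.
    I = perm(V, 2) * D * comb(D, r)
    if V >= 3:
        # No data loss: 1 drive from each of 3 different vdevs.
        I += comb(V, 3) * D**F
    return L, I, L + I


for V, D in [(2, 4), (3, 4), (4, 6)]:
    L, I, C = raidz2_counts(V, D)
    # C must match the number of ways to choose 3 drives from N = V * D.
    assert C == comb(V * D, 3)
    print(V, D, L, I, C, Fraction(L, C))
```

For the 2 x 4 HDD RAIDZ2 example this gives L = 8, I = 48, and C = 56, so P = 8/56 = 1/7.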
Data loss occurs whenever F drives from any vdev are destroyed. For mirrors, F = 2.
Because only 1 vdev needs to be destroyed:
L = V!/(V - 1)!
Data loss will not occur if 2 vdevs each have 1 drive destroyed. This is equivalent to picking 1 drive from 1 vdev, and picking drives one at a time, in turn, from the remaining vdevs:
(V - 1)!/(V - 2)!
Since this is possible for all drives in the array, multiply by N:
I = N(V - 1)!/(V - 2)!
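The mirror formulas for 2 destroyed drives can be checked the same way: L + I should equal the number of ways to pick 2 drives from N = 2V. This is a minimal sketch with a hypothetical function name, `mirror_two_destroyed`.

```python
from fractions import Fraction
from math import comb


def mirror_two_destroyed(V):
    """Return (L, I, C) for V two-way mirrors with 2 drives destroyed."""
    N = 2 * V
    L = V            # V!/(V - 1)! = V: both drives of one mirror
    I = N * (V - 1)  # N(V - 1)!/(V - 2)! = N(V - 1)
    return L, I, L + I


for V in [2, 3, 4, 10]:
    L, I, C = mirror_two_destroyed(V)
    # C must match the number of ways to choose 2 drives from N.
    assert C == comb(2 * V, 2)
    print(V, L, I, C, Fraction(L, C))
```

Since C = V(2V - 1), the probability simplifies to P = 1/(2V - 1); for 4 mirrors that is 1/7.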
While the previous computation is interesting, it's limited in its potential for comparison as RAIDZ2 vdevs are invulnerable to data loss from the destruction of 2 drives. Therefore, the additional case of 3 drives being destroyed is considered.
Data loss will occur if 1 vdev has both its drives destroyed and 1 other vdev has only 1 drive destroyed. There are 4 possible states of this per pair of vdevs, and so:
L = 4(V!/(2!(V - 2)!)) = 2(V!/(V - 2)!)
Data loss will not occur if 3 different vdevs each have only 1 drive destroyed. The number of tuples consisting of 1 drive from each of those vdevs is:
D^3 = 2^3 = 8
since D is always 2 for mirrors and 3 drives are destroyed.
This is true for every combination of 3 vdevs selected for the array, and so:
I = 8(V!/(3!(V - 3)!))
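As with the other cases, L + I can be compared against the number of ways to pick 3 drives from N = 2V. The sketch below uses a hypothetical function name, `mirror_three_destroyed`, and assumes two-way mirrors as in the text.

```python
from fractions import Fraction
from math import comb, perm


def mirror_three_destroyed(V):
    """Return (L, I, C) for V two-way mirrors with 3 drives destroyed."""
    # Data loss: one mirror fully destroyed plus 1 drive of another mirror.
    L = 2 * perm(V, 2)   # 2(V!/(V - 2)!)
    # No data loss: 1 drive from each of 3 different mirrors.
    I = 8 * comb(V, 3)   # comb returns 0 when V < 3
    return L, I, L + I


for V in [2, 3, 4, 10]:
    L, I, C = mirror_three_destroyed(V)
    # C must match the number of ways to choose 3 drives from N.
    assert C == comb(2 * V, 3)
    print(V, L, I, C, Fraction(L, C))
```

For 4 mirrors (8 drives), P = 24/56 = 3/7. The RAIDZ formulas above give P = 8/56 = 1/7 for the same 8 drives arranged as 2 x 4 HDD RAIDZ2.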