Modern storage systems combine extreme data density with high capacity. Fast data processing requires high data rates and frequencies, which leads to the following problems:
- Errors when processing data in RAM (bit flips)
- Errors in the chain storage driver -> storage controller -> cabling -> backplane -> disks
- Errors on RAID 5/6 systems (for example a corrupted filesystem after a power loss during a write, see the write hole problem)
- Errors that are not detected by disk-internal checksums
- Silent data errors (bit rot). Data is corrupted at a statistical rate due to magnetic fields, radioactivity or simply by chance. A study at CERN several years ago, with disks whose data density was much lower than today's, found an undetected error rate of about 3 x 10^-15, which means that you must expect about 3 errors per TB that are not detected and not caused by a hardware problem. This is a serious problem for critical data as well as for long-term storage. See http://en.wikipedia.org/wiki/Data_corruption
- Data modification by sabotage, a virus or human error, which matters wherever revision-safe storage is required
The Solution: ZFS
While you can control RAM problems quite easily with ECC RAM, all other problems require integrated and advanced concepts like ZFS.
- End-to-end checksums. This means that real data and metadata are verified by checksums controlled by the OS storage subsystem (and not on a single data block level by the disk or RAID controller). This is the only way to discover all sorts of errors during processing regardless of their cause. It also allows self-healing of errors on every read or during a scrub with the help of checksums and redundancy.
- Copy-on-write. This modern storage concept means that a data modification or update either completes successfully or does not modify any data (always consistent filesystem, no fsck needed). If you prevent the reuse of old data blocks, you get read-only snapshots without delay, without a copy and without initial space consumption. This allows versioning and revision-safe read-only storage without extra effort.
- Integrated software RAID with volume management, which gives ZFS full control of what is really on disk, independent of any controller or cache, optionally with secure sync-write behaviour.
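The idea of end-to-end checksums with self-healing can be illustrated with a small sketch. This is not ZFS code, only a toy model under stated assumptions: the class MirroredBlock and its fields are hypothetical, the mirror has two copies held in memory, and SHA-256 is used simply because it is available in the Python standard library. The one point taken from ZFS is that the checksum is kept apart from the data it protects, so a damaged copy cannot vouch for itself.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Checksum computed by the storage layer, not by the disk."""
    return hashlib.sha256(data).hexdigest()


class MirroredBlock:
    """Hypothetical two-way mirror of one data block (illustration only)."""

    def __init__(self, data: bytes):
        # Two redundant copies, as on a two-disk mirror.
        self.copies = [bytearray(data), bytearray(data)]
        # The expected checksum is stored separately from the copies.
        self.expected = checksum(data)

    def read(self) -> bytes:
        """Return verified data and repair any copy that fails verification."""
        for good in self.copies:
            if checksum(bytes(good)) == self.expected:
                # Self-healing: overwrite every bad copy with the good one.
                for i, other in enumerate(self.copies):
                    if checksum(bytes(other)) != self.expected:
                        self.copies[i] = bytearray(good)
                return bytes(good)
        # No copy matches: the error is detected but cannot be repaired.
        raise IOError("all copies failed checksum verification")


block = MirroredBlock(b"important data")
block.copies[0][0] ^= 0x01      # simulate a silent bit flip on one disk
data = block.read()             # the read detects and repairs the bad copy
```

After the read, both copies hold the original bytes again. A scrub would do the same for every block instead of waiting for a read.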
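Copy-on-write and snapshots can be sketched in the same spirit. The class CowStore below is hypothetical and heavily simplified: blocks live in a dictionary, a write always allocates a new block and switches the pointer only afterwards, and a snapshot is just a frozen copy of the pointer table. Real ZFS works on a block tree and only needs to keep the old root, so treat this as a model of the principle and not of the implementation.

```python
class CowStore:
    """Hypothetical copy-on-write store (illustration only)."""

    def __init__(self):
        self.blocks = {}       # block id -> immutable data
        self.next_id = 0
        self.live = {}         # file name -> current block id
        self.snapshots = {}    # snapshot name -> frozen pointer table

    def write(self, name: str, data: bytes) -> None:
        # Never overwrite in place: write the new data to a fresh block.
        block_id = self.next_id
        self.next_id += 1
        self.blocks[block_id] = bytes(data)
        # Switch the pointer last. If the write were interrupted before
        # this line, the old data would still be fully intact.
        self.live[name] = block_id

    def snapshot(self, snap_name: str) -> None:
        # Only pointers are copied, no data blocks. Old blocks stay
        # referenced, so they are not reused.
        self.snapshots[snap_name] = dict(self.live)

    def read(self, name: str, snapshot: str = None) -> bytes:
        table = self.live if snapshot is None else self.snapshots[snapshot]
        return self.blocks[table[name]]


store = CowStore()
store.write("report.txt", b"version 1")
store.snapshot("monday")
store.write("report.txt", b"version 2")

current = store.read("report.txt")
old = store.read("report.txt", snapshot="monday")
```

Here current holds the new version while old still returns the data as it was when the snapshot was taken, without any data having been copied at snapshot time.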