Home
napp-it Free:
  • ready-to-use, comfortable ZFS storage appliance for iSCSI/FC, NFS and SMB
  • Active Directory support with snaps as Windows Previous Versions
  • user-friendly web GUI that includes all functions for a sophisticated NAS or SAN appliance
  • commercial use allowed
  • no capacity limit
  • free download for end users


napp-it Pro:
  • Individual support and consulting
  • increased GUI performance / background agents
  • bugfixes, updates and access to bugfix releases
  • extensions like comfortable ACL handling, disk and realtime monitoring or remote replication
  • appliance diskmap, security and tuning (Pro complete)
  • optional redistribution, bundling or setup on customer demand
Please request a quotation.
Details: Featuresheet.pdf



Tuning/ best use:


In general:

  • Use mainstream hardware like Intel server chipsets and NICs, SuperMicro boards or LSI HBAs in IT mode
  • Use as much RAM as possible (nearly all free RAM is used for read caching)
    to serve most reads from RAM instead of slow disks
  • Add an SSD as additional read cache (L2Arc), but only if you cannot add more RAM
    Check ARC usage: home >> System >> Statistics >> ARC
    If you want to also cache sequential data, set zfs:l2arc_noprefetch=0 (see the L2Arc sketch after this list)
    see https://storagetuning.wordpress.com/2011/12/01/zfs-tuning-for-ssds/
  • Do not fill up SSDs, as performance degrades. Use reservations, or on SSDs that are not new do a "secure erase"
    and then overprovision them with a Host Protected Area (see the overprovisioning sketch after this list), read http://www.anandtech.com/show/6489/playing-with-op
    tools: http://www.thomas-krenn.com/en/wiki/SSD Over-provisioning using hdparm
    or http://www.hdat2.com/files/cookbook v11.pdf
    Google: Enhancing the Write Performance of SSDs
  • Disable sync write on filesystems.
    If you need sync write, or wish to disable the write back cache (LU) for data-security reasons:

    add a dedicated Slog as ZIL device with low latency; prefer DRAM based devices like
    a ZeusRAM, or a fast (best SLC) SSD with a supercap; a small partition of a large SSD is sufficient.
    Examples: ZeusRAM SAS SSD (DRAM based, fastest of all),
    Intel S3700 100/200GB with a supercap included (about 60% of the performance of a ZeusRAM)
    see benchmarks about sync write performance at http://napp-it.org/doc/manuals/benchmarks.pdf

    Enterprise class powerloss protection that guarantees committed writes to be on disk is a must for an Slog.
    See the sync/Slog sketch after this list.
  • Add a ZIL accelerator when secure sync write is required; check ZIL usage: home >> System >> Statistics >> ZIL
    constantin.glez.de/blog/2010/07/solaris-zfs-synchronous-writes-and-zil-explained
    read about speed degradation on SSDs: www.ddrdrive.com/zil accelerator.pdf
    read about the basics of a ZIL accelerator: http://www.open-zfs.org/w/images/9/98/DDRdrive zil rw revelation.pdf

    some benchmarks about the quality of SSDs as a journal device:
    http://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/

    Effect of SSD overprovisioning and write saturation on steady load:
    http://www.tomshardware.com/reviews/sandisk-x210-ssd-review,3648-6.html
  • Disable the ZFS property atime (logging of the last access time); see the atime sketch after this list
  • Use as many vdevs as possible if you need I/O; I/O performance scales with the number of vdevs,
    so use 2- or 3-way mirrored vdevs if you need the best I/O values, e.g. for ESXi datastores,
    or multiple Raid-Z2/Z3 vdevs for a fileserver (similar to Raid-60+)
  • Use as many disks as possible if you need sequential performance, which scales with the number of disks,
    but avoid too large vdevs (max about 10 disks in Z2 or 16 in Z3, due to resilver time).
    Try to combine this with the number of vdevs.
    Good example for a 24 disk case: use 2 x 10 disk Raid-Z2 + hotspare + optional Slog + optional L2Arc (see the pool layout sketch after this list).
    If you need more space: replace one vdev with larger disks (disk by disk, with a resilver after each),
    optionally later replace the other vdev. Ignore that your pool is unbalanced in the meantime.
  • Prefer ashift=12 vdevs even with older 512B disks, or you will have problems replacing them later with newer 4k disks.
    To do so, modify sd.conf (see the sd.conf sketch in the Disk driver section below) or create an ashift=12 vdev with newer disks and then replace those disks with the older 512B ones.
    If one disk in a new vdev is 4k, the whole vdev is created with ashift=12 as well; try this with a test pool first (see the test pool sketch after this list).
  • Prefer enterprise-class SSD-only pools for best performance, or add a really fast, write-optimized ZIL (Slog) with a supercap when using desktop SSDs. For professional usage you can combine a pool built from Intel's S3500 with an Slog on a faster S3700 (100 or 200GB).
    read: http://www.anandtech.com/show/6433/intel-ssd-dc-s3700-200gb-review
    Prefer SSDs with a large spare area like the S3700, or do not use the whole SSD, i.e. do not fill an SSD pool above say 70%; read http://www.anandtech.com/show/6489/playing-with-op
  • For a fast pool stay below 80% pool fillrate; for high performance pools stay below 50% (set pool reservations to enforce this, see the reservation sketch after this list).
    Throughput is a function of the pool fillrate. This (fragmentation) is the price that you pay for a copy-on-write filesystem
    that gives you an always consistent filesystem with snapshots and online scrubbing.
    http://blog.delphix.com/uday/2013/02/19/78/
    http://www.trivadis.com/uploads/tx_cabagdownloadarea/kopfschmerzen_mit_zfs_abgabe_tvd_whitepaper.pdf
  • Use a large data/backup/media pool and add a fast SSD-only pool if you need high performance (e.g. an ESXi datastore).
    Prefer fast enterprise SSDs like the Intel S3700 for heavy write usage or the S3500 for mostly-read usage.
    If you use consumer SSDs, use a reservation or extra overprovisioning, e.g. create an HPA (host protected area) of 10-20%.
  • Avoid dedup in nearly every case (use it only when absolutely needed, on smaller dedicated pools, not on large general-use pools)
  • If you like compression, use LZ4 if available (see the compression sketch after this list).
    Enabling compression can increase or lower performance, depending on the compressor, CPU and data.
  • Tune the ZFS recordsize for special workloads like databases or VM storage (see the recordsize sketch after this list)
    https://blogs.oracle.com/roch/entry/tuning zfs recordsize
  • Use expanders with SAS disks; prefer multiple HBA controllers when using Sata disks (even better/faster with SAS disks)
  • You may tweak the max_pending setting in /etc/system. Lower values may improve latency, larger values throughput.
    The default is mostly 10 (good for fast Sata/SAS disks); set it lower for slow disks and higher (30-50) for fast SSDs (see the queue depth sketch after this list).
    http://www.c0t0d0s0.org/archives/7370-A-little-change-of-queues.html
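
L2Arc sketch: a minimal example of checking the ARC and adding a read cache device; the pool name "tank" and the disk c2t1d0 are placeholders, not recommendations:

    # check ARC/L2Arc statistics from the console (cf. menu System >> Statistics >> ARC)
    kstat -p zfs:0:arcstats | egrep 'arcstats:size|c_max|l2_size|l2_hits|l2_misses'
    # add an SSD as L2Arc cache device
    zpool add tank cache c2t1d0
    # optional, to also cache sequential/prefetched data: add to /etc/system and reboot
    set zfs:l2arc_noprefetch=0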
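
Overprovisioning sketch: an example of setting a Host Protected Area with hdparm on a Linux helper system, in the spirit of the links above; the device /dev/sdX and the sector count are placeholders, check the linked guides before use:

    # show native max sectors and current max sectors
    hdparm -N /dev/sdX
    # set a HPA so that roughly 10-20% of the SSD stays unused (p = permanent; power cycle afterwards)
    hdparm -Np200000000 --yes-i-know-what-i-am-doing /dev/sdX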
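
Sync/Slog sketch: the relevant ZFS commands; pool "tank", filesystem "tank/vm" and disk c3t0d0 are placeholders:

    # disable sync write on a filesystem (fast, but the last seconds of writes can be lost on a crash)
    zfs set sync=disabled tank/vm
    # or keep sync write and add a dedicated, powerloss protected Slog device
    zpool add tank log c3t0d0
    # force sync write, e.g. when the write back cache of an iSCSI LU is disabled
    zfs set sync=always tank/vm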
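
Atime sketch ("tank" is a placeholder; the property is inherited by child filesystems):

    zfs set atime=off tank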
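
Pool layout sketch for the 24 disk example above; all disk names are placeholders:

    # two 10 disk Raid-Z2 vdevs plus a hotspare
    zpool create tank \
      raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0 c1t8d0 c1t9d0 \
      raidz2 c1t10d0 c1t11d0 c1t12d0 c1t13d0 c1t14d0 c1t15d0 c1t16d0 c1t17d0 c1t18d0 c1t19d0 \
      spare c1t20d0
    # I/O oriented alternative, e.g. for an ESXi datastore: striped mirrors
    zpool create fastpool mirror c2t0d0 c2t1d0 mirror c2t2d0 c2t3d0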
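
Test pool sketch for the ashift=12 replace trick; disk names are placeholders and c4t0d0 is assumed to be a 4k disk:

    # create a vdev from a 4k disk, so it gets ashift=12
    zpool create testpool c4t0d0
    # verify the ashift of the vdev
    zdb -C testpool | grep ashift
    # replace the 4k disk with an older 512B disk; the vdev keeps ashift=12
    zpool replace testpool c4t0d0 c4t1d0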
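
Reservation sketch: keep part of the pool unused by reserving it in an empty filesystem; "tank" and the 2T value are placeholders, size the reservation to your pool:

    zfs create tank/reserved
    zfs set reservation=2T tank/reserved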
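
Compression sketch ("tank" is a placeholder; only newly written data gets compressed):

    zfs set compression=lz4 tank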
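
Recordsize sketch ("tank/db" and the value are placeholders, e.g. for a database or VM filesystem; the default recordsize is 128K):

    zfs set recordsize=16K tank/db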
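
Queue depth sketch: an /etc/system entry for older illumos/Solaris releases; the tunable may be named differently or absent on your release, the value is only an example for an SSD pool, and a reboot is required:

    set zfs:zfs_vdev_max_pending=32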


Hardware:

  • Prefer hardware similar to ZFS boxes that are sold together with NexentaStor/OI/OmniOS
    (in most cases Supermicro and Intel Xeon based)
  • Use LSI HBA controllers, or controllers that come with raidless IT mode like the 9207,
    or that can be crossflashed to LSI IT mode (e.g. IBM M1015)
  • Use ECC RAM (you should not reduce the security level of the filesystem by using unreliable RAM)
  • Use Intel NICs (1 GbE; prefer 10 GbE like the Intel X-540)
  • Onboard Sata/AHCI is hotplug capable but this is disabled per default.
    To enable it, add the following line to /etc/system (OmniOS):

    set sata:sata_auto_online=1



Network:



Napp-In-One (Virtual Storage server)


Comstar and blocksize on filesystems


Disk Driver setting (Multipath etc)

  • Check the driver used for your disks in menu disk-details-prtconfig
  • Check the disk driver config files (napp-it menu disk-details)
  • Force 4k alignment in sd.conf (disk-details); see the sd.conf sketch below
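
A minimal sd.conf sketch; the vendor/product string is a placeholder for your own disk model (the vendor id field is 8 characters, padded with blanks):

    # /kernel/drv/sd.conf: report a 4k physical blocksize for a given disk model
    sd-config-list = "ATA     SAMSUNG SSD 850", "physical-block-size:4096";
    # reload the sd configuration (or reboot)
    update_drv -vf sd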


Timeout settings (time to wait for faulted disks)


Special configurations


Dedup and L2Arc

  • There are some rules of thumb that can help.
    Realtime dedup needs to process the dedup table on every read or write. The dedup table is held in RAM or, without enough RAM, on disk. In that case performance can become horrible, e.g. a snap delete that can last for days.

    In a worst case scenario you may need up to 5 GB of RAM for every TB of data. Besides this RAM for dedup, you want RAM as read cache for performance. There is a rule of thumb for this as well: for good performance use about 1 GB RAM per TB of data. Some workloads can do with less, others should have more RAM. For the OS itself, add about 2 GB RAM.

    Some examples:
    If you have a 5 TB pool, you need 5 x 5 + 5 + 2 = 32 GB RAM.
    Not a problem regarding current RAM prices.
    This is ok if you have a high dedup rate, say 5-10, and very expensive and fast storage.
    But with a high performance pool you probably want more read cache than only 5 GB out of 32 GB.

    If you have a 20 TB pool, you need 20 x 5 + 20 + 2 = 122 GB RAM, i.e. a 128 GB configuration.
    Up from here RAM becomes very expensive, and without a very high dedup rate (not very probable)
    the RAM for dedup costs more than the disks that you save.

    AND
    in most cases you want the RAM for performance, but now you are using it for capacity savings.
    If you could use the whole RAM as read cache, this would be a much faster setup.

    A fast L2Arc like an NVMe can help a little, but count about 5% of the L2Arc size as additional RAM needed to manage the L2Arc entries.
    Another option, without any extra RAM need, is LZ4 compression. To check whether dedup would pay off at all, see the estimation sketch below.
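
A dedup estimation sketch; the pool name "tank" is a placeholder, and note that zdb -S reads the whole pool, so it can run for a long time and needs a lot of RAM on large pools:

    # simulate dedup on an existing pool: prints a DDT histogram and the expected dedup ratio
    zdb -S tank
    # on a pool that already uses dedup: show the current ratio and DDT statistics
    zpool list -o name,size,alloc,dedupratio tank
    zdb -DD tank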


Avoid:

  • Realtek NICs in server or client; if you must use them, update to the newest driver releases
  • Copy tools on Windows like Teracopy


ZFS Tuning options (may not apply to your OS release):

napp-it 27.12.2023