<feed xmlns='http://www.w3.org/2005/Atom'>
<title>src/include/sys/vdev_impl.h, branch zfs-0.7.0-rc2</title>
<subtitle>FreeBSD source tree</subtitle>
<id>https://cgit-dev.freebsd.org/src/atom?h=zfs-0.7.0-rc2</id>
<link rel='self' href='https://cgit-dev.freebsd.org/src/atom?h=zfs-0.7.0-rc2'/>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/'/>
<updated>2016-10-24T17:45:59Z</updated>
<entry>
<title>Turn on/off enclosure slot fault LED even when disk isn't present</title>
<updated>2016-10-24T17:45:59Z</updated>
<author>
<name>Tony Hutter</name>
<email>hutter2@llnl.gov</email>
</author>
<published>2016-10-24T17:45:59Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=1bbd8770490f0e5b8c575865ab70f6853bca2a2a'/>
<id>urn:sha1:1bbd8770490f0e5b8c575865ab70f6853bca2a2a</id>
<content type='text'>
Previously when a drive faulted, the statechange-led.sh script would look up
the drive's LED sysfs entry in /sys/block/sd*/device/enclosure_device and
turn it on.  During testing we noticed that if you pulled out a drive, or if
the drive was so badly broken that it no longer appeared to Linux, the
/sys/block/sd* path would be removed and the script could not look up the
LED entry.

To fix this, the patch looks up the disk's more persistent
"/sys/class/enclosure/X:X:X:X/Slot N" LED sysfs path at pool import.  It then
passes that path to the statechange-led script to use, rather than having the
script look it up on the fly.  This allows the script to turn on/off the slot
LEDs even when the drive is missing.
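
As a usage sketch (the enclosure address and slot name below are illustrative,
not taken from this commit), the persistent path can drive the fault LED
directly via the Linux SES sysfs interface:

    # Turn the slot fault LED on, then off again (path varies by system):
    echo 1 | tee '/sys/class/enclosure/0:0:0:0/Slot 1/fault'
    echo 0 | tee '/sys/class/enclosure/0:0:0:0/Slot 1/fault'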

Closes #5309 
Closes #2375 </content>
</entry>
<entry>
<title>Multipath autoreplace, control enclosure LEDs, event rate limiting</title>
<updated>2016-10-19T19:55:59Z</updated>
<author>
<name>Tony Hutter</name>
<email>hutter2@llnl.gov</email>
</author>
<published>2016-10-19T19:55:59Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=6078881aa18a45ea065a887e2a8606279cdc0329'/>
<id>urn:sha1:6078881aa18a45ea065a887e2a8606279cdc0329</id>
<content type='text'>
1. Enable multipath autoreplace support for FMA.

This extends FMA autoreplace to work with multipath disks.  This
requires libdevmapper to be installed at build time.

2. Turn on/off fault LEDs when VDEVs become degraded/faulted/online

Set ZED_USE_ENCLOSURE_LEDS=1 in zed.rc to have ZED turn the enclosure LED
on or off for a drive when it becomes FAULTED/DEGRADED.  Your enclosure must
be supported by the Linux SES driver for this to work.  The enclosure LED
scripts work for multipath devices as well, and they clear the LED when the
fault is cleared.
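
For example, a minimal zed.rc sketch (the path below is the conventional
location and may differ on your install):

    # /etc/zfs/zed.d/zed.rc
    ZED_USE_ENCLOSURE_LEDS=1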

3. Rate limit ZIO delay and checksum events so as not to flood ZED

ZIO delay and checksum events are rate-limited to 5/sec in the zfs module.

Reviewed-by: Richard Laager &lt;rlaager@wiktel.com&gt;
Reviewed-by: Don Brady &lt;don.brady@intel.com&gt;
Reviewed-by: Brian Behlendorf &lt;behlendorf1@llnl.gov&gt;
Signed-off-by: Tony Hutter &lt;hutter2@llnl.gov&gt;
Closes #2449 
Closes #3017 
Closes #5159 </content>
</entry>
<entry>
<title>OpenZFS 7090 - zfs should throttle allocations</title>
<updated>2016-10-14T00:59:18Z</updated>
<author>
<name>Don Brady</name>
<email>don.brady@intel.com</email>
</author>
<published>2016-10-14T00:59:18Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=3dfb57a35e8cbaa7c424611235d669f3c575ada1'/>
<id>urn:sha1:3dfb57a35e8cbaa7c424611235d669f3c575ada1</id>
<content type='text'>
OpenZFS 7090 - zfs should throttle allocations

Authored by: George Wilson &lt;george.wilson@delphix.com&gt;
Reviewed by: Alex Reece &lt;alex@delphix.com&gt;
Reviewed by: Christopher Siden &lt;christopher.siden@delphix.com&gt;
Reviewed by: Dan Kimmel &lt;dan.kimmel@delphix.com&gt;
Reviewed by: Matthew Ahrens &lt;mahrens@delphix.com&gt;
Reviewed by: Paul Dagnelie &lt;paul.dagnelie@delphix.com&gt;
Reviewed by: Prakash Surya &lt;prakash.surya@delphix.com&gt;
Reviewed by: Sebastien Roy &lt;sebastien.roy@delphix.com&gt;
Approved by: Matthew Ahrens &lt;mahrens@delphix.com&gt;
Ported-by: Don Brady &lt;don.brady@intel.com&gt;
Reviewed-by: Brian Behlendorf &lt;behlendorf1@llnl.gov&gt;

When write I/Os are issued, they are issued in block order but the ZIO
pipeline will drive them asynchronously through the allocation stage
which can result in blocks being allocated out-of-order. It would be
nice to preserve as much of the logical order as possible.

In addition, the allocations are equally scattered across all top-level
VDEVs but not all top-level VDEVs are created equally. The pipeline
should be able to detect devices that are more capable of handling
allocations and should allocate more blocks to those devices. This
allows for dynamic allocation distribution when devices are imbalanced
as fuller devices will tend to be slower than empty devices.

The change includes a new pool-wide allocation queue which throttles and
orders allocations in the ZIO pipeline. The queue is ordered by issued time
and offset and provides an initial amount of allocation work to each
top-level vdev. The allocation logic utilizes a reservation system to reserve
allocations that will be performed by the allocator. Once an allocation is
successfully completed it's scheduled on a given top-level vdev. Each
top-level vdev maintains a maximum number of allocations that it can handle
(mg_alloc_queue_depth). The pool-wide reserved allocations (top-levels *
mg_alloc_queue_depth) are distributed across the top-level vdevs' metaslab
groups, round-robining across all eligible metaslab groups to distribute the
work. As top-levels complete their work, they receive additional work from
the pool-wide allocation queue until the allocation queue is emptied.
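
As a usage note, this port exposes the throttle through module parameters
(the names below are taken from this port's documentation updates noted in
the porting notes, but verify them against zfs-module-parameters(5) for your
release):

    # Enable/disable the DVA throttle and tune the per-vdev queue depth:
    echo 1 | tee /sys/module/zfs/parameters/zio_dva_throttle_enabled
    echo 1000 | tee /sys/module/zfs/parameters/zfs_vdev_queue_depth_pct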

OpenZFS-issue: https://www.illumos.org/issues/7090
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/4756c3d7
Closes #5258 

Porting Notes:
- Maintained minimal stack in zio_done
- Preserved Linux-specific I/O sizes in zio_write_compress
- Added module params and documentation
- Updated to use optimized AVL cmp macros
</entry>
<entry>
<title>Add -lhHpw options to "zpool iostat" for avg latency, histograms, &amp; queues</title>
<updated>2016-05-12T19:36:32Z</updated>
<author>
<name>Tony Hutter</name>
<email>hutter2@llnl.gov</email>
</author>
<published>2016-02-29T18:05:23Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=193a37cb2430960ce759daf12ce5cc804818aba1'/>
<id>urn:sha1:193a37cb2430960ce759daf12ce5cc804818aba1</id>
<content type='text'>
Update the zfs module to collect statistics on average latencies and queue
sizes, and to keep an internal histogram of all I/O latencies.  Along with
this, update "zpool iostat" with some new options to print out the stats:

-l: Include average I/O latency stats:

 total_wait     disk_wait    syncq_wait    asyncq_wait  scrub
 read  write   read  write   read  write   read  write   wait
-----  -----  -----  -----  -----  -----  -----  -----  -----
    -   41ms      -    2ms      -   46ms      -    4ms      -
    -    5ms      -    1ms      -    1us      -    4ms      -
    -    5ms      -    1ms      -    1us      -    4ms      -
    -      -      -      -      -      -      -      -      -
    -   49ms      -    2ms      -   47ms      -      -      -
    -      -      -      -      -      -      -      -      -
    -    2ms      -    1ms      -      -      -    1ms      -
-----  -----  -----  -----  -----  -----  -----  -----  -----
  1ms    1ms    1ms  413us   16us   25us      -    5ms      -
  1ms    1ms    1ms  413us   16us   25us      -    5ms      -
  2ms    1ms    2ms  412us   26us   25us      -    5ms      -
    -    1ms      -  413us      -   25us      -    5ms      -
    -    1ms      -  460us      -   29us      -    5ms      -
196us    1ms  196us  370us    7us   23us      -    5ms      -
-----  -----  -----  -----  -----  -----  -----  -----  -----

-w: Print out latency histograms:

sdb           total           disk         sync_queue      async_queue
latency    read   write    read   write    read   write    read   write   scrub
-------  ------  ------  ------  ------  ------  ------  ------  ------  ------
1ns           0       0       0       0       0       0       0       0       0
...
33us          0       0       0       0       0       0       0       0       0
66us          0       0     107    2486       2     788      12      12       0
131us         2     797     359    4499      10     558     184     184       6
262us        22     801     264    1563      10     286     287     287      24
524us        87     575      71   52086      15    1063     136     136      92
1ms         152    1190       5   41292       4    1693     252     252     141
2ms         245    2018       0   50007       0    2322     371     371     220
4ms         189    7455      22  162957       0    3912    6726    6726     199
8ms         108    9461       0  102320       0    5775    2526    2526      86
17ms         23   11287       0   37142       0    8043    1813    1813      19
34ms          0   14725       0   24015       0   11732    3071    3071       0
67ms          0   23597       0    7914       0   18113    5025    5025       0
134ms         0   33798       0     254       0   25755    7326    7326       0
268ms         0   51780       0      12       0   41593   10002   10002       0
537ms         0   77808       0       0       0   64255   13120   13120       0
1s            0  105281       0       0       0   83805   20841   20841       0
2s            0   88248       0       0       0   73772   14006   14006       0
4s            0   47266       0       0       0   29783   17176   17176       0
9s            0   10460       0       0       0    4130    6295    6295       0
17s           0       0       0       0       0       0       0       0       0
34s           0       0       0       0       0       0       0       0       0
69s           0       0       0       0       0       0       0       0       0
137s          0       0       0       0       0       0       0       0       0
-------------------------------------------------------------------------------

-h: Help

-H: Scripted mode. Do not display headers, and separate fields by a single
    tab instead of arbitrary space.

-q: Include current number of entries in sync &amp; async read/write queues,
    and scrub queue:

 syncq_read    syncq_write   asyncq_read  asyncq_write   scrubq_read
 pend  activ   pend  activ   pend  activ   pend  activ   pend  activ
-----  -----  -----  -----  -----  -----  -----  -----  -----  -----
    0      0      0      0     78     29      0      0      0      0
    0      0      0      0     78     29      0      0      0      0
    0      0      0      0      0      0      0      0      0      0
    -      -      -      -      -      -      -      -      -      -
    0      0      0      0      0      0      0      0      0      0
    -      -      -      -      -      -      -      -      -      -
    0      0      0      0      0      0      0      0      0      0
-----  -----  -----  -----  -----  -----  -----  -----  -----  -----
    0      0    227    394      0     19      0      0      0      0
    0      0    227    394      0     19      0      0      0      0
    0      0    108     98      0     19      0      0      0      0
    0      0     19     98      0      0      0      0      0      0
    0      0     78     98      0      0      0      0      0      0
    0      0     19     88      0      0      0      0      0      0
-----  -----  -----  -----  -----  -----  -----  -----  -----  -----

-p: Display numbers in parseable (exact) values.

Also, update iostat syntax to allow the user to select specific vdevs
to show statistics for.  The three options for choosing pools/vdevs are:

Display a list of pools:
    zpool iostat ... [pool ...]

Display a list of vdevs from a specific pool:
    zpool iostat ... [pool vdev ...]

Display a list of vdevs from any pools:
    zpool iostat ... [vdev ...]

Lastly, allow the zpool command's "interval" value to be floating point:
    zpool iostat -v 0.5
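
These options can be combined; for example (the pool name "tank" is
illustrative), per-second latency and queue statistics for one pool:

    zpool iostat -lq tank 1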

Signed-off-by: Tony Hutter &lt;hutter2@llnl.gov&gt;
Signed-off-by: Brian Behlendorf &lt;behlendorf1@llnl.gov&gt;
Closes #4433
</content>
</entry>
<entry>
<title>OpenZFS 6736 - ZFS per-vdev ZAPs</title>
<updated>2016-05-02T21:27:45Z</updated>
<author>
<name>Joe Stein</name>
<email>joe.stein@delphix.com</email>
</author>
<published>2016-04-11T20:16:57Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=e0ab3ab553e36595344d9cbdc240d380ad203b60'/>
<id>urn:sha1:e0ab3ab553e36595344d9cbdc240d380ad203b60</id>
<content type='text'>
6736 ZFS per-vdev ZAPs
Reviewed by: Matthew Ahrens &lt;mahrens@delphix.com&gt;
Reviewed by: John Kennedy &lt;john.kennedy@delphix.com&gt;
Reviewed by: George Wilson &lt;george.wilson@delphix.com&gt;
Reviewed by: Don Brady &lt;don.brady@intel.com&gt;
Reviewed by: Dan McDonald &lt;danmcd@omniti.com&gt;

References:
  https://www.illumos.org/issues/6736
  https://github.com/openzfs/openzfs/commit/215198a

Ported-by: Don Brady &lt;don.brady@intel.com&gt;
Signed-off-by: Brian Behlendorf &lt;behlendorf1@llnl.gov&gt;
Closes #4515
</content>
</entry>
<entry>
<title>FreeBSD r256956: Improve ZFS N-way mirror read performance by using load and locality information.</title>
<updated>2016-02-26T19:24:35Z</updated>
<author>
<name>smh</name>
<email>smh@FreeBSD.org</email>
</author>
<published>2016-02-13T01:47:22Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=9f500936c82137ef3a57c53013894f622dcec14e'/>
<id>urn:sha1:9f500936c82137ef3a57c53013894f622dcec14e</id>
<content type='text'>
The existing algorithm selects a preferred leaf vdev based on offset of the zio
request modulo the number of members in the mirror. It assumes the devices are
of equal performance and that spreading the requests randomly over both drives
will be sufficient to saturate them. In practice this results in the leaf vdevs
being underutilized.

The new algorithm takes into account the following additional factors:
* Load of the vdevs (number of outstanding I/O requests)
* The locality of the last queued I/O vs. the new I/O request.

Within the locality calculation, additional knowledge about the underlying vdev
is considered, such as whether the device backing the vdev is rotating media.

This results in performance increases across the board as well as significant
increases for predominantly streaming loads and for configurations which don't
have evenly performing devices.

The following are results from a setup with a 3-way mirror of 2 x HDs and
1 x SSD, from a basic test running multiple parallel dd's.

With pre-fetch disabled (vfs.zfs.prefetch_disable=1):

== Stripe Balanced (default) ==
Read 15360MB using bs: 1048576, readers: 3, took 161 seconds @ 95 MB/s
== Load Balanced (zfslinux) ==
Read 15360MB using bs: 1048576, readers: 3, took 297 seconds @ 51 MB/s
== Load Balanced (locality freebsd) ==
Read 15360MB using bs: 1048576, readers: 3, took 54 seconds @ 284 MB/s

With pre-fetch enabled (vfs.zfs.prefetch_disable=0):

== Stripe Balanced (default) ==
Read 15360MB using bs: 1048576, readers: 3, took 91 seconds @ 168 MB/s
== Load Balanced (zfslinux) ==
Read 15360MB using bs: 1048576, readers: 3, took 108 seconds @ 142 MB/s
== Load Balanced (locality freebsd) ==
Read 15360MB using bs: 1048576, readers: 3, took 48 seconds @ 320 MB/s

In addition to the performance changes the code was also restructured, with
the help of Justin Gibbs, to provide a more logical flow which also ensures
vdevs loads are only calculated from the set of valid candidates.

The following additional sysctls were added to allow the administrator
to tune the behaviour of the load algorithm:
* vfs.zfs.vdev.mirror.rotating_inc
* vfs.zfs.vdev.mirror.rotating_seek_inc
* vfs.zfs.vdev.mirror.rotating_seek_offset
* vfs.zfs.vdev.mirror.non_rotating_inc
* vfs.zfs.vdev.mirror.non_rotating_seek_inc

These changes were based on work started by the zfsonlinux developers:
https://github.com/zfsonlinux/zfs/pull/1487

Reviewed by:	gibbs, mav, will
MFC after:	2 weeks
Sponsored by:	Multiplay

References:
  https://github.com/freebsd/freebsd@5c7a6f5d
  https://github.com/freebsd/freebsd@31b7f68d
  https://github.com/freebsd/freebsd@e186f564

Performance Testing:
  https://github.com/zfsonlinux/zfs/pull/4334#issuecomment-189057141

Porting notes:
- The tunables were adjusted to have ZoL-style names.
- The code was modified to use ZoL's vd_nonrot.
- Fixes were done to make cstyle.pl happy
- Merge conflicts were handled manually
- freebsd/freebsd@e186f564bc946f82c76e0b34c2f0370ed9aea022 by my
  colleague Andriy Gapon has been included. It applied perfectly, but
  added a cstyle regression.
- This replaces 556011dbec2d10579819078559a77630fc559112 entirely.
- A typo "IO'a" has been corrected to say "IO's"
- Descriptions of new tunables were added to man/man5/zfs-module-parameters.5.
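
As a quick check, the renamed tunables can be inspected at runtime (ZoL-style
names assumed from the renaming noted above; verify against the man page):

    grep . /sys/module/zfs/parameters/zfs_vdev_mirror_*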

Ported-by: Richard Yao &lt;ryao@gentoo.org&gt;
Signed-off-by: Brian Behlendorf &lt;behlendorf1@llnl.gov&gt;
Closes #4334
</content>
</entry>
<entry>
<title>Disable LBA weighting on files and SSDs</title>
<updated>2015-09-01T22:22:07Z</updated>
<author>
<name>Richard Yao</name>
<email>ryao@gentoo.org</email>
</author>
<published>2015-08-29T16:01:07Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=fb40095f5f0853946f8150481ca22602d1334dfe'/>
<id>urn:sha1:fb40095f5f0853946f8150481ca22602d1334dfe</id>
<content type='text'>
The LBA weighting makes sense on rotational media, where the outer tracks
have twice the bandwidth of the inner tracks. However, it is detrimental
on nonrotational media such as solid state disks, where the only effect
is to make metaslabs enter the best-fit allocation behavior sooner, which
hurts performance. It also makes no sense on files, where the underlying
filesystem can arrange things however it wants.
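
If your build exposes it, the weighting can be checked at runtime (the
parameter name below is an assumption on my part; consult
zfs-module-parameters(5) for your release):

    cat /sys/module/zfs/parameters/metaslab_lba_weighting_enabled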

Signed-off-by: Richard Yao &lt;ryao@gentoo.org&gt;
Signed-off-by: Brian Behlendorf &lt;behlendorf1@llnl.gov&gt;
Closes #3712
</content>
</entry>
<entry>
<title>Illumos 5818 - zfs {ref}compressratio is incorrect with 4k sector size</title>
<updated>2015-06-10T23:24:01Z</updated>
<author>
<name>Matthew Ahrens</name>
<email>mahrens@delphix.com</email>
</author>
<published>2015-05-20T04:14:01Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=c3520e7f1f567bd4e6a28eff4867c70850e8a854'/>
<id>urn:sha1:c3520e7f1f567bd4e6a28eff4867c70850e8a854</id>
<content type='text'>
5818 zfs {ref}compressratio is incorrect with 4k sector size
Reviewed by: Alex Reece &lt;alex@delphix.com&gt;
Reviewed by: George Wilson &lt;george@delphix.com&gt;
Reviewed by: Richard Elling &lt;richard.elling@richardelling.com&gt;
Reviewed by: Steven Hartland &lt;killing@multiplay.co.uk&gt;
Approved by: Albert Lee &lt;trisk@omniti.com&gt;

References:
  https://www.illumos.org/issues/5818
  https://github.com/illumos/illumos-gate/commit/81cd5c5

Ported-by: Don Brady &lt;don.brady@intel.com&gt;
Signed-off-by: Brian Behlendorf &lt;behlendorf1@llnl.gov&gt;
Closes #3432
</content>
</entry>
<entry>
<title>Illumos #5244 - zio pipeline callers should explicitly invoke next stage</title>
<updated>2015-04-30T22:07:47Z</updated>
<author>
<name>George Wilson</name>
<email>george.wilson@delphix.com</email>
</author>
<published>2014-10-20T22:07:45Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=98b254188a730553361adfabca9f658421be2b82'/>
<id>urn:sha1:98b254188a730553361adfabca9f658421be2b82</id>
<content type='text'>
5244 zio pipeline callers should explicitly invoke next stage
Reviewed by: Adam Leventhal &lt;ahl@delphix.com&gt;
Reviewed by: Alex Reece &lt;alex.reece@delphix.com&gt;
Reviewed by: Christopher Siden &lt;christopher.siden@delphix.com&gt;
Reviewed by: Matthew Ahrens &lt;mahrens@delphix.com&gt;
Reviewed by: Richard Elling &lt;richard.elling@gmail.com&gt;
Reviewed by: Dan McDonald &lt;danmcd@omniti.com&gt;
Reviewed by: Steven Hartland &lt;killing@multiplay.co.uk&gt;
Approved by: Gordon Ross &lt;gwr@nexenta.com&gt;

References:
  https://www.illumos.org/issues/5244
  https://github.com/illumos/illumos-gate/commit/738f37b

Porting Notes:

1. The unported "2932 support crash dumps to raidz, etc. pools"
   caused a merge conflict due to a copyright difference in
   module/zfs/vdev_raidz.c.
2. The unported "4128 disks in zpools never go away when pulled"
   and additional Linux-specific changes caused merge conflicts in
   module/zfs/vdev_disk.c.

Ported-by: Richard Yao &lt;richard.yao@clusterhq.com&gt;
Signed-off-by: Brian Behlendorf &lt;behlendorf1@llnl.gov&gt;
Closes #2828
</content>
</entry>
<entry>
<title>5313 Allow I/Os to be aggregated across ZIO priority classes</title>
<updated>2015-04-24T22:16:56Z</updated>
<author>
<name>Justin T. Gibbs</name>
<email>justing@spectralogic.com</email>
</author>
<published>2015-04-11T18:51:06Z</published>
<link rel='alternate' type='text/html' href='https://cgit-dev.freebsd.org/src/commit/?id=ec8501ee1274205f277a7287c3de8119d361afaf'/>
<id>urn:sha1:ec8501ee1274205f277a7287c3de8119d361afaf</id>
<content type='text'>
Reviewed by: Andriy Gapon &lt;avg@FreeBSD.org&gt;
Reviewed by: Will Andrews &lt;willa@SpectraLogic.com&gt;
Reviewed by: Matt Ahrens &lt;mahrens@delphix.com&gt;
Reviewed by: George Wilson &lt;george@delphix.com&gt;
Approved by: Robert Mustacchi &lt;rm@joyent.com&gt;

References:
  https://www.illumos.org/issues/5313
  https://github.com/illumos/illumos-gate/commit/fe319232

Ported-by: DHE &lt;git@dehacked.net&gt;
Signed-off-by: Brian Behlendorf &lt;behlendorf1@llnl.gov&gt;
Closes #3280
</content>
</entry>
</feed>
