[CFT] Improved ZFS metaslab code (faster write speed)

From: Martin Matuska <mm_at_FreeBSD.org>
Date: Sun, 22 Aug 2010 17:15:01 +0200
Dear FreeBSD community,

Many of our [2] (and Solaris [3]) users today are complaining about slow
ZFS writes. One cause of these slow writes is the choice of allocation
method used for new blocks [3] [4]. Another issue is a write slowdown
during TXG sync.

Solaris 10 (and OpenSolaris up to November 2009) behaves as follows:

- pool has more than 30% free space: use first fit method [1]
- pool has less than 30% free space: use best fit method [1]

This causes a major write slowdown as soon as free space drops below
30%. On large pools, 30% may be terabytes of free space.
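
To make the threshold rule above concrete, here is a minimal sketch in
Python. It is a toy model, not the ZFS metaslab code: the function names,
the (offset, length) segment tuples and the handling of exactly 30% free
space are assumptions made for illustration only. In this model, on a
hypothetical 10 TiB pool the switch to best fit happens while about
3 TiB are still free.

```python
# Toy model of the allocation rule described above.
# All names here are hypothetical; this is not the ZFS implementation.

FREE_THRESHOLD = 0.30  # the 30% free-space cutoff from the text


def first_fit(free_segments, size):
    """Return the lowest-offset free segment that can hold `size`, or None."""
    for offset, length in sorted(free_segments):
        if length >= size:
            return (offset, length)
    return None


def best_fit(free_segments, size):
    """Return the smallest free segment that can hold `size`, or None."""
    candidates = [seg for seg in free_segments if seg[1] >= size]
    if not candidates:
        return None
    # Smallest length wins; ties are broken by the lower offset.
    return min(candidates, key=lambda seg: (seg[1], seg[0]))


def pick_allocator(free_bytes, total_bytes):
    """Choose the allocation method from the pool's free-space fraction.

    Assumption: exactly 30% free is treated like "less than 30%",
    since the text does not say what happens at the boundary.
    """
    if free_bytes / total_bytes > FREE_THRESHOLD:
        return first_fit
    return best_fit


def allocate(free_segments, size, free_bytes, total_bytes):
    """Pick a free segment for a block of `size` using the rule above."""
    allocator = pick_allocator(free_bytes, total_bytes)
    return allocator(free_segments, size)
```

In this sketch first fit can stop at the first segment that is large
enough, while best fit has to look at every candidate before it can
choose, which hints at why the second method is the more expensive one.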

OpenSolaris changed this behavior in November 2009, and the Oracle
Storage Appliances also included the new code in Q1/2010 [1].

The source [1] states that with this change they achieved a speedup
of: "50% Improved OLTP Performance, 70% Reduced Variability, 200%
Improvement on MS Exchange"

I would like to issue a Call For Testing for the following 9-CURRENT patch:
http://people.freebsd.org/~mm/patches/zfs/zfs_metaslab.patch

To apply the patch against 8-STABLE, you need to apply the v15 update first:
http://people.freebsd.org/~mm/patches/zfs/v15/stable-8-v15.patch

The patch includes the following OpenSolaris onnv revisions:
10921 (partial), 11146, 11728, 12047

And covers the following Bug IDs:
6826241 Sync write IOPS drops dramatically during TXG sync
6869229 zfs should switch to shiny new metaslabs more frequently
6917066 zfs block picking can be improved
6918420 zdb -m has issues printing metaslab statistics

References:
[1] http://blogs.sun.com/roch/entry/doubling_exchange_performance
[2] http://forums.freebsd.org/showthread.php?t=8270
[3] http://blogs.everycity.co.uk/alasdair/2010/07/zfs-runs-really-slowly-when-free-disk-usage-goes-above-80/
[4] http://blogs.sun.com/bonwick/entry/zfs_block_allocation
[5] http://blogs.sun.com/bonwick/entry/space_maps
Received on Sun Aug 22 2010 - 13:15:04 UTC
