CFT: TRIM Consolidation on UFS/FFS filesystems

From: Kirk McKusick <mckusick_at_mckusick.com>
Date: Mon, 20 Aug 2018 12:40:56 -0700
I have recently added TRIM consolidation support for the UFS/FFS
filesystem. This feature consolidates large numbers of TRIM commands
into a much smaller number of commands covering larger blocks of
disk space. It is best described by the commit message:

  Author: mckusick
  Date: Sun Aug 19 16:56:42 2018
  New Revision: 338056
  URL: https://svnweb.freebsd.org/changeset/base/338056

  Log:
    Add consolidation of TRIM / BIO_DELETE commands to the UFS/FFS filesystem.
    
    When deleting files on filesystems that are stored on flash-memory
    (solid-state) disk drives, the filesystem notifies the underlying
    disk of the blocks that it is no longer using. The notification
    allows the drive to avoid saving these blocks when it needs to
    flash (zero out) one of its flash pages. These notifications of
    no-longer-being-used blocks are referred to as TRIM notifications.
    In FreeBSD these TRIM notifications are sent from the filesystem
    to the drive using the BIO_DELETE command.
    
    Until now, the filesystem would send a separate message to the drive
    for each block of the file that was deleted. Each gigabyte of file
    size resulted in over 3000 TRIM messages being sent to the drive.
    This burst of messages can overwhelm the drive's task queue, causing
    multi-second delays for read and write requests.
    
    This implementation collects runs of contiguous blocks in the file
    and then consolidates them into a single BIO_DELETE command to the
    drive. The BIO_DELETE command describes the run of blocks as a
    single large block being deleted. Each gigabyte of file size can
    result in as few as two BIO_DELETE commands and typically results
    in fewer than ten. Though these larger BIO_DELETE commands take
    longer to run, they do not clog the drive task queue, so read and
    write commands can intersperse effectively with them.
    
    Though this new feature has been thoroughly reviewed and tested, it
    is being added disabled by default so as to minimize the possibility
    of disrupting the upcoming 12.0 release. It can be enabled by running
    ``sysctl vfs.ffs.dotrimcons=1''. Users are encouraged to test it.
    If no problems arise, we will consider requesting that it be enabled
    by default for 12.0.
    
    Reviewed by:  kib
    Tested by:    Peter Holm
    Sponsored by: Netflix
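
The run-collection idea in the commit message can be illustrated with a
small sketch. This is not the kernel code: it is a toy Python model that
assumes each freed extent is a (start, length) pair in arbitrary block
units and that extents do not overlap. The function name consolidate_runs
is hypothetical.

```python
def consolidate_runs(extents):
    """Merge contiguous (start, length) extents into larger runs.

    Illustrative only: assumes non-overlapping extents, and sorts them
    first so that adjacent extents end up next to each other.
    """
    runs = []
    for start, length in sorted(extents):
        if runs and runs[-1][0] + runs[-1][1] == start:
            # This extent begins exactly where the previous run ends,
            # so grow the previous run instead of starting a new one.
            prev_start, prev_length = runs[-1]
            runs[-1] = (prev_start, prev_length + length)
        else:
            runs.append((start, length))
    return runs


# Four freed extents, three of them contiguous: two delete requests
# would be issued instead of four.
print(consolidate_runs([(0, 8), (8, 8), (16, 8), (100, 8)]))
# [(0, 24), (100, 8)]
```

Each tuple in the result stands for what would be one BIO_DELETE in this
model, so the fewer and longer the runs, the fewer requests are queued.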

This support is off by default, but I am hoping to get enough testing
to show that it (a) works and (b) is helpful, so that it will be
reasonable to have it turned on by default in 12.0. The cutoff for
turning it on by default in 12.0 is September 19th, so I am requesting
your testing feedback in the near term. Please let me know whether you
have managed to use it successfully (or not) and whether it made any
performance difference (good or bad).

To enable TRIM consolidation, either use `sysctl vfs.ffs.dotrimcons=1'
or just set the `dotrimcons' variable in sys/ufs/ffs/ffs_alloc.c to 1.

Setting the above sysctl is all you need to test TRIM consolidation.
However, if you want to collect statistics on how effectively TRIM
consolidation is working, the attached diff makes them easy to get.
Apply it, then compile your kernel and the mount command. Note that if
you do not do a buildworld, you will need to copy /sys/sys/mount.h to
/usr/include/sys/mount.h to get the patched mount command to compile.
Then run `mount -v' (or `mount -v | grep /mnt' to get just the
statistics for /mnt).

Removing a 30MB file without TRIM consolidation:
/dev/md0 on /mnt (ufs, local, writes: sync 10 async 482, reads: sync 7 async 0, fsid d43f795b6a7d34fb, TRIM: total 952 total blocks 7616)

Removing the same file with TRIM consolidation:
/dev/md0 on /mnt (ufs, local, writes: sync 10 async 482, reads: sync 7 async 0, fsid d43f795b6a7d34fb, TRIM: total 3 total blocks 7616)
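
To compare runs like the two above, the TRIM counters can be pulled out
of the `mount -v' line and turned into an average number of blocks per
TRIM. The following is a minimal Python sketch; the regular expression
is inferred only from the two sample lines shown here, so the patched
mount command may print other variations that it does not handle. The
helper names trim_stats and blocks_per_trim are hypothetical.

```python
import re

# Pattern inferred from the sample output above (an assumption, not a
# documented format).
TRIM_RE = re.compile(r"TRIM: total (\d+) total blocks (\d+)")


def trim_stats(mount_line):
    """Return (trim_commands, trimmed_blocks), or None if absent."""
    match = TRIM_RE.search(mount_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def blocks_per_trim(mount_line):
    """Average blocks covered by each TRIM, or None if unavailable."""
    stats = trim_stats(mount_line)
    if stats is None or stats[0] == 0:
        return None
    commands, blocks = stats
    return blocks / commands


without_cons = ("/dev/md0 on /mnt (ufs, local, writes: sync 10 async 482, "
                "reads: sync 7 async 0, fsid d43f795b6a7d34fb, "
                "TRIM: total 952 total blocks 7616)")
with_cons = ("/dev/md0 on /mnt (ufs, local, writes: sync 10 async 482, "
             "reads: sync 7 async 0, fsid d43f795b6a7d34fb, "
             "TRIM: total 3 total blocks 7616)")

print(blocks_per_trim(without_cons))      # 8.0
print(round(blocks_per_trim(with_cons)))  # 2539
```

In this sample the same 7616 blocks are covered by 3 commands instead of
952, so each TRIM covers roughly 317 times as many blocks.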

The diff also tracks pending blocks and pending files. These numbers are
only printed when they are non-zero. Here is an example running with soft
updates right after a file has been rm'ed, but before its blocks have
been released:
/dev/md0 on /mnt (ufs, local, soft-updates, writes: sync 2 async 251, reads: sync 5 async 0, fsid 303f795b1be0c459, pending blocks 7616, pending files 1)
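
Because these counters are omitted when they are zero, a script that
polls `mount -v' has to treat a missing field as zero rather than as an
error. A minimal Python sketch of that, assuming the field wording shown
in the sample line above (the helper name pending_counts is
hypothetical):

```python
import re


def pending_counts(mount_line):
    """Return (pending_blocks, pending_files) from a mount -v line.

    A counter that is not printed is reported as 0, since the fields
    only appear when they are non-zero. The field names are taken from
    the sample output and are an assumption about the general format.
    """
    blocks = re.search(r"pending blocks (\d+)", mount_line)
    files = re.search(r"pending files (\d+)", mount_line)
    return (int(blocks.group(1)) if blocks else 0,
            int(files.group(1)) if files else 0)


line = ("/dev/md0 on /mnt (ufs, local, soft-updates, writes: sync 2 "
        "async 251, reads: sync 5 async 0, fsid 303f795b1be0c459, "
        "pending blocks 7616, pending files 1)")
print(pending_counts(line))  # (7616, 1)
```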

Finally, it tracks inflight BIO_DELETEs and the total blocks represented
by those inflight BIO_DELETEs. These numbers are also only printed when
they are non-zero. These statistics let you see how large a backlog of
BIO_DELETEs has built up at/in the disk drive and track how quickly it
drains.

	Kirk McKusick
Received on Mon Aug 20 2018 - 17:35:39 UTC
