Re: Help ZFS FreeBSD 8.0 RC2 Write performance issue

From: Dan Nelson <dnelson_at_allantgroup.com>
Date: Wed, 11 Nov 2009 17:26:28 -0600
In the last episode (Nov 11), Sam Fourman Jr. said:
> On Wed, Nov 11, 2009 at 2:49 PM, Dan Nelson <dnelson_at_allantgroup.com> wrote:
> > In the last episode (Nov 11), Ivan Voras said:
> >> Sam Fourman Jr. wrote:
> >> > I am running FreeBSD 8.0-RC2 and I don't understand why my ZFS/NFS is
> >> > acting weird on writes.   I get ~150 Mbit writes; I don't know if this
> >> > is good or not, but it pauses for a few seconds every once in a while.
> >>
> >> You didn't give any "iostat" statistics - I suspect that if you
> >> correlate ifstat and iostat output that you will see that network
> >> "pauses" happen during spikes in IO.  You should check for this and
> >> post your results.
> >
> > Yes, iostat would be useful here.  "iostat -zxC 2" will give you
> > per-disk stats plus CPU usage every 2 seconds (CPU may be a factor if
> > you have compression enabled).
> >
> > On a Solaris box I admin, setting zfs_write_limit_override helped
> > stuttering while doing heavy writes.   It's not exported on FreeBSD, but
> > it should be easy to add it as a RW sysctl; it lives in dsl_pool.c and
> > can be tweaked at runtime.   Start big and tune it down so each write
> > burst takes under a second; it looks like you're writing solid for
> > around 6-8 seconds now.   The number will vary depending on your disk
> > speed and how much ARC you have.
> 
> Here are some iostats for you.  I do not believe I have compression
> enabled; am I mistaken?  Isn't SATA2 300 MB/s?  And I am only doing
> ~6 MB/s per disk?  I built this machine with 4 GB of memory because I
> thought ZFS would like it.  Now maybe a re(4) interface isn't the best
> choice; if that is the problem here I can change it.  We spent ~$800 on
> a hardware RAID card thinking that it would help performance.
> 
> Why is it that with sftp we do not see the pauses in Network transfer?

Your service times below are worrying (both for the NFS and sftp cases);
anything above 50ms for a sustained period is usually a problem.  You
mention hardware RAID, but I see 6 da# devices; are these just hooked up in
passthrough mode, or is each da device backed by multiple SATA disks?  What
kind of write caching do you have enabled on the RAID?  Can you disable it? 
It sort of looks like ZFS is bursting more data to disk than the RAID card
has RAM for, and it's spending multiple seconds just trying to recover. 
It's also odd that you're seeing queue depths up to 70 on those disks when
zfs by default should only do 35 (sysctl vfs.zfs.vdev.max_pending).
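If you want to watch for that condition live, a quick filter over iostat's
columns will do it.  This is just a sketch; it assumes the column order your
output shows below (device r/s w/s kr/s kw/s wait svc_t %b):

```shell
# Sketch: flag any da device whose average service time (svc_t, ms)
# is above 50ms, and print its queue depth (wait column) alongside.
iostat -zx 2 | awk '$1 ~ /^da/ && $7 + 0 > 50 { print $1, "svc_t=" $7 "ms", "queue=" $6 }'
```

Run that during one of the stalls and see which drives light up.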

NFS has data consistency guarantees that require it to flush to disk more
often than your sftp connection, so that may explain why the disk behaviour
is different.

> # iostat -zxC 1 (with NFS and ZFS)
[...]
>                         extended device statistics             cpu
> device     r/s   w/s    kr/s    kw/s wait svc_t  %b  us ni sy in id
> da0        0.0 104.8     0.0  6057.3   67 624.9 100   0  0  1  1 98
> da1        0.0 104.8     0.0  6057.3   63 531.2 100
> da2        0.0 104.8     0.0  6057.3   68 625.4 100
> da3        0.0 104.8     0.0  6032.3   63 551.6 100
> da4        0.0 104.8     0.0  6057.3   68 625.7 100
> da5        0.0 105.8     0.0  6121.1   61 510.1 101
>                         extended device statistics             cpu
> device     r/s   w/s    kr/s    kw/s wait svc_t  %b  us ni sy in id
> da0        0.0 106.8     0.0  6160.2   68 616.8 100   0  0  2  1 97
> da1        0.0 106.8     0.0  6160.2   62 522.8 100
> da2        0.0 106.8     0.0  6160.2   67 617.1 100
> da3        0.0 106.8     0.0  6185.1   62 522.8 100
> da4        0.0 106.8     0.0  6160.2   67 617.4 100
> da5        0.0 106.8     0.0  6160.2   62 503.4 100
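
As a quick sanity check (not a benchmark), summing the kw/s column from your
first sample gives the aggregate rate the pool is pushing to disk:

```shell
# kw/s values for da0..da5 from the first iostat sample above.
printf '6057.3 6057.3 6057.3 6032.3 6057.3 6121.1\n' |
awk '{ for (i = 1; i <= NF; i++) kb += $i;
       printf "%.1f MB/s (~%.0f Mbit/s)\n", kb / 1024, kb * 8 / 1000 }'
```

That's roughly double the ~150 Mbit you see on the wire, which is in the
ballpark you'd expect once redundancy overhead is added; the exact ratio
depends on your vdev layout.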

-- 
	Dan Nelson
	dnelson_at_allantgroup.com
Received on Wed Nov 11 2009 - 22:26:29 UTC