Re: Disk performance under CURRENT

From: Scott Long <scottl@freebsd.org>
Date: Fri, 21 May 2004 20:09:35 -0600
Kris Kennaway wrote:
> On Fri, May 21, 2004 at 03:47:29PM -0700, Kevin Oberman wrote:
> 
>>I just ran a test of disk write performance under V4 (STABLE) and V5
>>(CURRENT) and was surprised at the difference.
>>
>>The test was simple and not at all rigorous. Just a dd bs=256k
>>if=/dev/zero of=/dev/ad2. This is about the simplest way of dealing with
>>a disk. No file system or anything else. Just raw data to the device.
>>
>>Under STABLE, I get an average of 25 MB/sec to the disk. Under CURRENT,
>>it drops to 15 MB/sec. I did this because I had noted that it was now
>>taking over an hour to back up my system disk (40 GB) when it was only
>>taking 40 minutes when I was running V4.6. The STABLE system was built
>>yesterday, the CURRENT system last Sunday.
>>
>>Any idea why this is so much slower? It looks to me like it must be in
>>either GEOM or the disk driver.
> 
> 
> Yes, disk performance sucks on 5.x; this is something phk is planning
> to work on.
> 
> Kris
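
For anyone who wants to reproduce the quoted numbers without dd, the
test boils down to roughly the following C program.  This is a minimal
sketch: the block count is my choice, and, like the original dd
command, it destroys whatever is on the target disk.

/*
 * Minimal sketch of the quoted test: stream 256k writes of zeroes
 * at the raw device and report throughput.  WARNING: just like the
 * dd command above, this destroys whatever is on the target disk.
 * The block count (roughly 1 GB total) is an arbitrary choice.
 */
#include <sys/time.h>
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define BLKSIZE (256 * 1024)
#define NBLOCKS 4096            /* ~1 GB total */

int
main(void)
{
    static char buf[BLKSIZE];   /* static storage is zero-filled */
    struct timeval t0, t1;
    double secs;
    int fd, i;

    fd = open("/dev/ad2", O_WRONLY);
    if (fd < 0)
        err(1, "open /dev/ad2");
    gettimeofday(&t0, NULL);
    for (i = 0; i < NBLOCKS; i++)
        if (write(fd, buf, BLKSIZE) != BLKSIZE)
            err(1, "write");
    gettimeofday(&t1, NULL);
    secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
    printf("%.1f MB/sec\n", NBLOCKS * (BLKSIZE / 1e6) / secs);
    close(fd);
    return (0);
}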

I'm not really sure what the smoking gun is here compared to 4.x.  Yes,
in order to do any sort of disk I/O you have to go through the VFS layer
which means that you are going to run into Giant, but once you leave
there and go into the block layer and below you shouldn't need Giant
anymore (assuming that you are using the ATA hardware and driver).  Is
the problem something related to the extra contexts in GEOM, or is it
an inefficiency with ATA locking, or is it that ithreads can't always
preempt?  I've experimented with taking out the g_up context, but it
doesn't appear to make a measurable difference.  Maybe g_down is
causing a bad effect, though I can't imagine why.  On high-end SMP
hardware, the AAC driver performs considerably better than it does on
4.x in both sequential and random I/O when multiple processes and/or
threads are involved.  For single-threaded I/O on lower-end UP
hardware, there is little difference in my testing between 4.x and
5.x, though 4.x still tends to win.

PHK's plans to short-circuit direct device I/O away from VFS will be
interesting and will almost certainly have a performance benefit, but
that still won't address the common case of doing I/O through a 
filesystem.  We really need to sit down and instrument the I/O path
and figure out what the bottlenecks are on both UP and SMP.  Maybe it
is 100% the fault of VFS and Giant, but we need real measurements to
find out.
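
As a crude first cut from userland (in-kernel timestamps, e.g. on each
bio as it moves through GEOM, would be the real answer), something like
the sketch below at least shows whether the lost throughput is uniform
per-request overhead or occasional long stalls.  Device path and sizes
are illustrative, and like the dd test it writes to the disk.

/*
 * Crude userland instrumentation: time each individual write(2) to
 * the raw device and report the latency spread.  Uniformly slower
 * writes point at per-request overhead (locking, context switches);
 * a wide min/max gap points at occasional stalls.  WARNING: this
 * writes to the device, just like the quoted dd test.
 */
#include <sys/time.h>
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define BLKSIZE (256 * 1024)
#define NBLOCKS 1024

static double
now(void)
{
    struct timeval tv;

    gettimeofday(&tv, NULL);
    return (tv.tv_sec + tv.tv_usec / 1e6);
}

int
main(void)
{
    static char buf[BLKSIZE];
    double t0, dt, total = 0.0, min = 1e9, max = 0.0;
    int fd, i;

    fd = open("/dev/ad2", O_WRONLY);
    if (fd < 0)
        err(1, "open /dev/ad2");
    for (i = 0; i < NBLOCKS; i++) {
        t0 = now();
        if (write(fd, buf, BLKSIZE) != BLKSIZE)
            err(1, "write");
        dt = now() - t0;
        total += dt;
        if (dt < min)
            min = dt;
        if (dt > max)
            max = dt;
    }
    printf("per-write latency: min %.2f ms  avg %.2f ms  max %.2f ms\n",
        min * 1e3, total / NBLOCKS * 1e3, max * 1e3);
    close(fd);
    return (0);
}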

Scott