Re: Serious performance issues, broken initialization, and a likely fix

From: Alexander Kabaev <kan_at_freebsd.org>
Date: Tue, 9 Aug 2005 02:48:13 +0000

On Mon, 08 Aug 2005 19:43:29 -0600
Scott Long <scottl_at_samsco.org> wrote:

> Ade Lovett wrote:
> > Or perhaps it should be just "Here be dragons"...
> > 
> > Whilst attempting to nail down some serious performance issues
> > (compared with 4.x) in preparation for a 6.x rollout here, we've
> > come across something of a fundamental bug.
> > 
> > In this particular environment (a Usenet transit server, so very
> > high network and disk I/O) we observed that processes were spending
> > a considerable amount of time in state 'wswbuf', traced back to
> > getpbuf() in vm/vm_pager.c
> > 
> > To cut a long story short, the order in which nswbuf is being
> > initialized is completely, totally, and utterly wrong -- this was
> > introduced by revision 1.132 of vm/vnode_pager.c just over 4 years
> > ago.
> > 
> > In vnode_pager.c we find:
> > 
> > static void
> > vnode_pager_init(void)
> > {
> > 	vnode_pbuf_freecnt = nswbuf / 2 + 1;
> > }
> > 
> > Unfortunately, nswbuf hasn't been assigned yet (it just happens to
> > be zero in all cases), and thus the kernel believes that there is
> > only ever *one* swap buffer available.
> > 
> > kern_vfs_bio_buffer_alloc() in kern/vfs_bio.c, which actually does
> > the calculation and assignment, is called rather further on in the
> > process, by which time the damage has been done.
> > 
> > The net result is that *any* calls involving getpbuf() will be
> > unconditionally serialized, completely destroying any kind of
> > concurrency (and performance).
> > 
> > Given the memory footprint of our machines, we've hacked in a
> > simple:
> > 
> > 	nswbuf = 0x100;
> > 
> > into vnode_pager_init(), since the calculation ends up giving us the
> > maximum number anyway.  There are a number of possible 'correct'
> > fixes in terms of re-ordering the startup sequence.
> > 
> > With the aforementioned hack, we're now seeing considerably better
> > machine operation, certainly as good as similar 4.10-STABLE boxes.
> > 
> > As per $SUBJECT, this affects all of RELENG_5, RELENG_6, and HEAD,
> > and should, IMO, be considered an absolutely required fix for
> > 6.0-RELEASE.
> > 
> > -aDe
> > 
> 
> My vote is to revert rev 1.132 and replace the XXX comment with a more
> detailed explanation of the perils involved.  Do you have any kind of
> easy-to-run regression test that could be used to quantify this
> problem and guard against it in the future?  Thanks very very much
> for looking into it and providing such a good explanation.
> 
> Scott
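
To make the ordering problem Ade describes concrete, here is a minimal
standalone sketch of the same mistake. Everything in it (pager_init,
buffer_alloc, the value 256) is an illustrative stand-in, not the actual
kernel code or init sequence; it only reproduces the
consumer-before-producer pattern:

#include <stdio.h>

static int nswbuf;		/* still 0 when the pager init runs */
static int vnode_pbuf_freecnt;

static void
pager_init(void)		/* stand-in for vnode_pager_init() */
{
	/* 0 / 2 + 1 == 1: only one pbuf is ever handed out */
	vnode_pbuf_freecnt = nswbuf / 2 + 1;
}

static void
buffer_alloc(void)		/* stand-in for kern_vfs_bio_buffer_alloc() */
{
	nswbuf = 256;		/* the value the calculation should have seen */
}

int
main(void)
{
	pager_init();		/* consumer runs before the producer */
	buffer_alloc();
	printf("vnode_pbuf_freecnt = %d (expected %d)\n",
	    vnode_pbuf_freecnt, 256 / 2 + 1);
	return (0);
}

Run as written it prints 1 instead of 129, which is exactly the
degenerate limit that serializes every getpbuf() caller.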

I experimented with calling vm_pager_init at vm_pager_bufferinit time
instead of calling it as the last thing in vm_mem_init, and my test
system runs with no (visible) ill effects. I wonder if we can collapse
vm_pager_init and vm_pager_bufferinit into a single function and get
rid of the pager initialization at SI_SUB_VM time. I guess now would
be a good time to ask our VM know-how holders.
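
Roughly the shape I have in mind, written as a tiny userland model
rather than a patch. The struct and field names mirror vm_pager.c, but
everything below is illustrative, not code from the tree: a single,
late pager init pass that walks the pager table only after nswbuf has
its final value.

#include <stddef.h>
#include <stdio.h>

static int nswbuf;
static int vnode_pbuf_freecnt;

struct pagerops {
	void (*pgo_init)(void);
};

static void
vnode_pager_init(void)
{
	vnode_pbuf_freecnt = nswbuf / 2 + 1;	/* sees the real nswbuf now */
}

static struct pagerops vnodepagerops = { vnode_pager_init };
static struct pagerops *pagertab[] = { &vnodepagerops };

static void
pager_bufferinit(void)		/* single, late pager init pass */
{
	size_t i;

	for (i = 0; i < sizeof(pagertab) / sizeof(pagertab[0]); i++)
		if (pagertab[i]->pgo_init != NULL)
			pagertab[i]->pgo_init();
}

int
main(void)
{
	nswbuf = 256;		/* stands in for kern_vfs_bio_buffer_alloc() */
	pager_bufferinit();	/* pgo_init runs only after nswbuf is set */
	printf("vnode_pbuf_freecnt = %d\n", vnode_pbuf_freecnt);
	return (0);
}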

I do support reverting rev 1.132 of vnode_pager.c in RELENG_5 and
RELENG_6 as the more conservative and safer choice, though.
-- 
Alexander Kabaev