Re: Poor NFS server performance in 6.0 with SMP and mpsafenet=1

From: Gavin Atkinson <gavin.atkinson_at_ury.york.ac.uk>
Date: Tue, 13 Dec 2005 13:50:17 +0000

On Thu, 2005-11-10 at 11:45 +0000, Gavin Atkinson wrote:
> On Wed, 2 Nov 2005, Robert Watson wrote:
> > On Wed, 2 Nov 2005, Gavin Atkinson wrote:
> >> On Wed, 2005-11-02 at 16:23 +0100, Bernd Walter wrote:
> >>> On Wed, Nov 02, 2005 at 02:58:36PM +0000, Gavin Atkinson wrote:
> >>>> I'm seeing incredibly poor performance when serving files from an SMP
> >>>> FreeBSD 6.0RC1 server to a Solaris 10 client.  I've done some
> >>>> experimenting and have discovered that either removing SMP from the
> >>>> kernel, or setting debug.mpsafenet=0 in loader.conf massively improves
> >>>> the speed.  Switching preemption off seems to also help.
> >>> Which scheduler?
> >> 
> >> 4BSD.  As I say, I'm running 6.0-RC1 with the standard GENERIC kernel, apart
> >> from the options I have listed as being changed above.  Polling is
> >> therefore also not enabled.
> >
> > This does sound like a scheduling problem.  I realize it's time-consuming,
> > but would it be possible to have you run each of the above test cases twice
> > more (or maybe even once) to confirm that in each case, the result is
> > reproducible?  I've recently been looking at a scheduling problem relating
> > to PREEMPTION and the netisr for loopback traffic, which is basically a result
> > of poorly timed context switching ending up being a worst-case scenario.  I
> > suspect something similar is likely here.  Have you tried varying the number
> > of nfsd worker threads on the server to see how that changes matters?
> 
> No problem.  Sorry it's taken so long to get back to you; it's been a
> hectic week :( Anyway, the trend is consistently reproducible, although
> the results themselves can vary between runs in the SMP/mpsafenet cases by
> as much as 20%. Here are the averages of three reruns, which I've also
> done for ULE:
> 
>                                      4BSD    ULE
> No SMP, mpsafenet=1                  78.7   62.7
> No SMP, mpsafenet=0                  71.1   76.0
> No SMP, mpsafenet=1, no PREEMPTION   54.7   55.5
> No SMP, mpsafenet=0, no PREEMPTION   73.6   77.6
>    SMP, mpsafenet=1                 346.5  309.5
>    SMP, mpsafenet=0                  56.9   88.4
>    SMP, mpsafenet=1, no PREEMPTION  320.2  136.6
>    SMP, mpsafenet=0, no PREEMPTION   57.0   77.9
> 
> The above are results for 4 nfsd servers (nfsd -n 4).  It turns out that
> you were correct in thinking that the number of nfsd processes would make
> a difference; here are some timings for the GENERIC+SMP kernel (i.e. with
> PREEMPTION/4BSD, the slowest one above), with varying numbers of
> processes:
> 
> nfsd processes      1      2      4      8     12     16
> timing           52.8   59.2  319.3  356.1  377.3  388.1
> 
> As before, all tests were done with a freshly rebooted server and with a
> single "dry run" transfer to warm the VM cache up.  The file transferred
> each time is 512 MB worth of /dev/random output.  I'm actually quite
> surprised at how much difference reducing the number of threads made.
> 
> Does all of this information help track down the cause of the problem? 
> I'm happy to time more transfers with different configs if you want to 
> explore other avenues.
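
To put a number on the pattern in the configuration table quoted above,
a small script can compute how much slower each mpsafenet=1 case is than
its mpsafenet=0 counterpart.  This is only an illustrative sketch: the
figures are the averages from the quoted table, but the dictionary
layout and the function name mpsafenet_ratio are made up here and are
not part of any FreeBSD tool.

```python
# Average timings copied from the quoted table (units as reported there).
# Key: (smp, mpsafenet, preemption) -> (4BSD timing, ULE timing)
timings = {
    (False, 1, True):  (78.7, 62.7),
    (False, 0, True):  (71.1, 76.0),
    (False, 1, False): (54.7, 55.5),
    (False, 0, False): (73.6, 77.6),
    (True,  1, True):  (346.5, 309.5),
    (True,  0, True):  (56.9, 88.4),
    (True,  1, False): (320.2, 136.6),
    (True,  0, False): (57.0, 77.9),
}

SCHEDULERS = {"4BSD": 0, "ULE": 1}


def mpsafenet_ratio(smp, preemption, sched):
    """Timing with mpsafenet=1 divided by timing with mpsafenet=0.

    Hypothetical helper: a value above 1 means mpsafenet=1 was slower.
    """
    col = SCHEDULERS[sched]
    with_mpsafe = timings[(smp, 1, preemption)][col]
    without_mpsafe = timings[(smp, 0, preemption)][col]
    return with_mpsafe / without_mpsafe


if __name__ == "__main__":
    for smp in (False, True):
        for preemption in (True, False):
            for sched in SCHEDULERS:
                ratio = mpsafenet_ratio(smp, preemption, sched)
                label = "{}, {}, {}".format(
                    "SMP" if smp else "No SMP",
                    "PREEMPTION" if preemption else "no PREEMPTION",
                    sched,
                )
                print("{:<32} mpsafenet=1 / mpsafenet=0 = {:.2f}".format(label, ratio))
```

Because the result is a ratio of two timings from the same table, it
does not depend on which unit the timings were recorded in.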

Any further thoughts on this?  The machines will be going live and into
colo within the week, so I'll lose the chance to test anything further
on them.
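
For anyone picking this up later, the nfsd process-count timings quoted
above can be summarised by looking at the ratio between neighbouring
measurements.  The sketch below uses only the six figures already
posted; the names nfsd_timings, step_ratios and largest_step are
hypothetical and exist purely for this illustration.

```python
# Timings for the GENERIC+SMP kernel, keyed by number of nfsd processes
# (values copied from the quoted message, units as reported there).
nfsd_timings = {1: 52.8, 2: 59.2, 4: 319.3, 8: 356.1, 12: 377.3, 16: 388.1}


def step_ratios(timings):
    """Return (count_a, count_b, ratio) for each adjacent pair of counts."""
    counts = sorted(timings)
    return [(a, b, timings[b] / timings[a]) for a, b in zip(counts, counts[1:])]


def largest_step(timings):
    """Return the adjacent pair of process counts with the biggest slowdown."""
    return max(step_ratios(timings), key=lambda step: step[2])


if __name__ == "__main__":
    for a, b, ratio in step_ratios(nfsd_timings):
        print("{:>2} -> {:>2} processes: x{:.2f}".format(a, b, ratio))
    a, b, ratio = largest_step(nfsd_timings)
    print("largest jump: {} -> {} processes (x{:.2f})".format(a, b, ratio))
```

With the posted numbers the biggest jump falls between 2 and 4
processes, and every other step is small by comparison.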

Gavin