Re: machdep.cpu_idle_hlt and SMP perf?

From: John Baldwin <jhb_at_freebsd.org>
Date: Thu, 9 Feb 2006 11:13:30 -0500

On Wednesday 08 February 2006 12:17, Andrew Gallatin wrote:
> John Baldwin writes:
>  > On Tuesday 07 February 2006 17:46, Andrew Gallatin wrote:
>  > > John Baldwin writes:
>  > >  > On Tuesday 07 February 2006 17:15, Andrew Gallatin wrote:
>  > >  > > John Baldwin writes:
>  > >  > >  > On Monday 06 February 2006 17:37, Andrew Gallatin wrote:
>  > >  > >  > > John Baldwin writes:
>  > >  > >  > >  > On Monday 06 February 2006 14:46, Andrew Gallatin wrote:
>  > >  > >  > >  > > Andre Oppermann writes:
>  > >  > >  > >  > >  > Andrew Gallatin wrote:
>  > >  > >  > >  > >  > > Why does machdep.cpu_idle_hlt=1 drop my 10GbE
>  > >  > >  > >  > >  > > network rx performance by a considerable amount
>  > >  > >  > >  > >  > > (7.5Gb/s -> 5.5Gb/s)?
>  > >  > >  > >  >
>  > >  > >  > >  > You may be seeing problems because it might simply take a
>  > >  > >  > >  > while for the CPU to wake up from HLT when an interrupt
>  > >  > >  > >  > comes in.  The 4BSD scheduler tries to send IPIs to wake up
>  > >  > >  > >  > any sleeping CPUs when it schedules a new thread, but
>  > >  > >  > >  > that would add higher latency for ithreads than just
>  > >  > >  > >  > preempting directly to the ithread.  Oh, you have to turn
>  > >  > >  > >  > that on, it's off by default
>  > >  > >  > >  > (kern.sched.ipiwakeup.enabled=1).
>  > >  > >  > >
>  > >  > >  > > Hmm..  It seems to be on by default.  Unfortunately, it does
>  > >  > >  > > not seem to help.
>  > >  > >  >
>  > >  > >  > I'm not sure.
>  > >  > >
>  > >  > > One thing which really helps is disabling preemption.  If I do
>  > >  > > that, I get 7.7Gb/sec with machdep.cpu_idle_hlt=1.  This is
>  > >  > > slightly better than machdep.cpu_idle_hlt=0 and no PREEMPTION.
>  > >  > >
>  > >  > > BTW, net.isr.direct=1 in all testing.
>  > >  >
>  > >  > Do you have very little userland activity in this test?
>  > >
>  > > Essentially none.  netserver just sits in a loop, reading from the
>  > > socket and throwing the data away.
>  >
>  > If you disable preemption then in effect you are letting the idle CPUs
>  > pick up the ithread and not disturbing what is running on the non-idle
>  > CPU. sched_4bsd is supposed to be triggering the same behavior, except
>  > that it has to send an IPI to awaken the idle CPUs.  When you have
>  > idle_hlt=0, there are no idle CPUs, so 4bsd thinks they are all busy and
>  > preempts.  When you disable preemption, it just leaves the ithread on
>  > the runqueue until one of the idle CPUs notices the new thread in its
>  > idle loop and runs it.  When you have idle_hlt=1, then 4bsd doesn't
>  > preempt but sends an IPI.  It doesn't even try to preempt unless it
>  > thinks all CPUs are busy.
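The decision described above can be sketched as a small standalone model.  This is not the sched_4bsd code: the struct, the enum, and the function name are hypothetical, and it assumes (as the paragraph states) that only halted CPUs are visible to the scheduler as idle.  It also ignores kern.sched.ipiwakeup.enabled.

```c
#include <stdbool.h>

/* Hypothetical outcomes; illustrative names, not kernel symbols. */
enum wake_action {
	WAKE_IPI_IDLE_CPU,	/* a halted idle CPU exists: send it an IPI */
	WAKE_PREEMPT,		/* every CPU looks busy: preempt the current thread */
	WAKE_LEAVE_ON_RUNQ	/* no preemption: wait for an idle loop to pick it up */
};

/* Hypothetical model of the knobs discussed in this thread. */
struct sched_model {
	bool idle_hlt;		/* models machdep.cpu_idle_hlt */
	bool preemption;	/* models the PREEMPTION kernel option */
	int  idle_cpus;		/* CPUs with nothing to run */
};

/* Decide what happens when an ithread becomes runnable. */
static enum wake_action
schedule_ithread(const struct sched_model *m)
{
	/*
	 * Assumption taken from the description: with idle_hlt=0 the
	 * idle CPUs spin instead of halting, so the scheduler counts
	 * them as busy.
	 */
	int visible_idle = m->idle_hlt ? m->idle_cpus : 0;

	if (visible_idle > 0)
		return (WAKE_IPI_IDLE_CPU);
	if (m->preemption)
		return (WAKE_PREEMPT);
	return (WAKE_LEAVE_ON_RUNQ);
}
```

In this model, idle_hlt=1 with an idle CPU always takes the IPI path regardless of PREEMPTION, which is the path whose wakeup latency is in question here.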
>
> I wish we had a lightweight way to watch all this stuff.  I can't
> wait for dtrace.

You can try using KTR with KTR_SCHED and then using schedgraph.py to look at 
what happens.  I'm not sure how lightweight that might be if you just have 
KTR on and no other debug stuff.
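
A sketch of the kernel configuration this would need, following the usage notes in the header of schedgraph.py.  The KTR_ENTRIES value is only an illustrative size, and the right number for a 10GbE workload is an assumption to be tuned:

```
options 	KTR
options 	KTR_ENTRIES=262144
options 	KTR_COMPILE=(KTR_SCHED)
options 	KTR_MASK=(KTR_SCHED)
```

After booting that kernel and starting the test, the usual sequence is to stop tracing while the workload is still running with 'sysctl debug.ktr.mask=0', dump the buffer with 'ktrdump -ct > ktr.out', and then run 'python schedgraph.py ktr.out'.  The output file name is arbitrary.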

> FWIW, if I use SCHED_ULE, performance sucks regardless of idle_hlt.

Hmmmm.

>  > One thing disabling PREEMPTION does is that it enables some explicit
>  > FULL_PREEMPTION-like behavior in _mtx_unlock_sleep().  You might want to
>  > try #if 0'ing that code out to see if that is why having PREEMPTION off
>  > makes a difference.  (Ironically, having PREEMPTION on means
>  > _mtx_unlock_sleep() will preempt less often.)
>
> Removing that code did not seem to matter.  I still get good
> performance with SCHED_4BSD, PREEMPTION disabled, idle_hlt=1, and that
> code removed.

Ok.  Hmmmmm.

-- 
John Baldwin <jhb_at_FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
Received on Thu Feb 09 2006 - 15:13:59 UTC