On Wednesday 08 February 2006 12:17, Andrew Gallatin wrote:
> John Baldwin writes:
> > On Tuesday 07 February 2006 17:46, Andrew Gallatin wrote:
> > > John Baldwin writes:
> > > > On Tuesday 07 February 2006 17:15, Andrew Gallatin wrote:
> > > > > John Baldwin writes:
> > > > > > On Monday 06 February 2006 17:37, Andrew Gallatin wrote:
> > > > > > > John Baldwin writes:
> > > > > > > > On Monday 06 February 2006 14:46, Andrew Gallatin wrote:
> > > > > > > > > Andre Oppermann writes:
> > > > > > > > > > Andrew Gallatin wrote:
> > > > > > > > > > > Why does machdep.cpu_idle_hlt=1 drop my 10GbE
> > > > > > > > > > > network rx performance by a considerable amount
> > > > > > > > > > > (7.5Gbs -> 5.5Gbs)?
> > > > > > > >
> > > > > > > > You may be seeing problems because it might simply take a
> > > > > > > > while for the CPU to wake up from HLT when an interrupt
> > > > > > > > comes in. The 4BSD scheduler tries to do IPIs to wakeup
> > > > > > > > any sleeping CPUs when it schedules a new thread, but
> > > > > > > > that would add higher latency for ithreads than just
> > > > > > > > preempting directly to the ithread. Oh, you have to turn
> > > > > > > > that on, it's off by default
> > > > > > > > (kern.sched.ipiwakeup.enabled=1).
> > > > > > >
> > > > > > > Hmm.. It seems to be on by default. Unfortunately, it does
> > > > > > > not seem to help.
> > > > > >
> > > > > > I'm not sure.
> > > > >
> > > > > One thing which really helps is disabling preemption. If I do
> > > > > that, I get 7.7Gb/sec with machdep.cpu_idle_hlt=1. This is
> > > > > slightly better than machdep.cpu_idle_hlt=0 and no PREEMPTION.
> > > > >
> > > > > BTW, net.isr.direct=1 in all testing.
> > > >
> > > > Do you have very little userland activity in this test?
> > >
> > > Essentially none. netserver just sits in a loop, reading from the
> > > socket and throwing the data away.
> >
> > If you disable preemption then in effect you are letting the idle CPUs
> > pick up the ithread and not disturbing what is running on the non-idle
> > CPU. sched_4bsd is supposed to be triggering the same behavior, except
> > that it has to send an IPI to awaken the idle CPUs. When you have
> > idle_hlt=0, there are no idle CPUs, so 4bsd thinks they are all busy and
> > preempts. When you disable preemption, it just leaves the ithread on
> > the runqueue until one of the idle CPUs notices the new thread in its
> > idle loop and runs it. When you have idle_hlt=1, then 4bsd doesn't
> > preempt but sends an IPI. It doesn't even try to preempt unless it
> > thinks all CPUs are busy.
>
> I wish we had a lightweight way to watch all this stuff. I can't
> wait for dtrace.

You can try using KTR with KTR_SCHED and then using schedgraph.py to look at
what happens. I'm not sure how lightweight that might be if you just have KTR
on and no other debug stuff.

> FWIW, if I use SCHED_ULE, performance sucks regardless of idle_hlt.

Hmmmm.

> > One thing disabling PREEMPTION does is that it enables some explicit
> > FULL_PREEMPTION-like behavior in _mtx_unlock_sleep(). You might want to
> > try #if 0'ing that code out to see if that is why having PREEMPTION off
> > makes a difference. (Ironically, having PREEMPTION on means
> > _mtx_unlock_sleep() will preempt less often.)
>
> Removing that code did not seem to matter. I still get good
> performance with SCHED_4BSD, PREEMPTION disabled, idle_hlt=1, and that
> code removed.

Ok. Hmmmmm.

-- 
John Baldwin <jhb_at_FreeBSD.org> <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve" = http://www.FreeBSD.org

Received on Thu Feb 09 2006 - 15:13:59 UTC
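
For readers following the thread, the three cases described above (idle_hlt=0 with PREEMPTION, idle_hlt=0 without PREEMPTION, and idle_hlt=1) can be written down as a small decision table. This is only a toy model of the behavior as described in this message, not the sched_4bsd code; the names Config and schedule_ithread and the returned strings are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Config:
    """Hypothetical stand-ins for the knobs discussed in the thread."""
    idle_hlt: bool    # models machdep.cpu_idle_hlt
    preemption: bool  # models a kernel built with PREEMPTION


def schedule_ithread(cfg: Config, idle_cpus: int) -> str:
    """Return what the modeled scheduler does when an ithread becomes runnable.

    Assumption (from the description in the thread, not from the kernel
    source): with idle_hlt=0 the idle CPUs spin instead of halting, so the
    scheduler does not count them as idle.
    """
    visible_idle = idle_cpus if cfg.idle_hlt else 0
    if visible_idle > 0:
        # A halted CPU is available: wake it with an IPI, do not preempt.
        return "ipi"
    if cfg.preemption:
        # Every CPU looks busy: preempt the current thread.
        return "preempt"
    # No preemption: leave the ithread on the runqueue until some CPU
    # (for example one spinning in its idle loop) picks it up.
    return "runqueue"


if __name__ == "__main__":
    for hlt in (False, True):
        for pre in (False, True):
            cfg = Config(idle_hlt=hlt, preemption=pre)
            print(cfg, "->", schedule_ithread(cfg, idle_cpus=1))
```

The model says nothing about latency; it only makes explicit which path each configuration takes, which is the part that KTR_SCHED traces would confirm or refute.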