Re: cvs commit: src/sys/kern sched_ule.c

From: Kris Kennaway <kris_at_FreeBSD.org>
Date: Tue, 04 Mar 2008 11:03:52 +0100

Jeff Roberson wrote:
> On Tue, 4 Mar 2008, Garrett Wollman wrote:
> 
>> <<On Sat, 1 Mar 2008 22:29:50 -1000 (HST), Jeff Roberson 
>> <jroberson_at_chesapeake.net> said:
>>
>>> Kris has done some excellent benchmarking as usual.  Here you can see
>>> the improvement in postgres depending on various scheduler debug
>>> settings:
>>
>>> http://people.freebsd.org/~kris/scaling/pgsql-16cpu.png
>>
>> Can you comment on the area under the knee in the 8-cpu topologies?  It
>> seems surprising that 16 cores performs worse than 8 cores in this
>> regime.
> 
> Depending on the flags you can see different scaling properties of 
> different cpu selection algorithms.  That's what the userret=x, 
> tryself=y parameters are changing.  Certain parameters can cause less 
> concurrency which works better when workloads are heavily contended.
> 
> See the light blue line, tryself=0, userret=0.  This scales up more 
> poorly because there is less concurrency when there is no lock 
> contention but behaves better when there is contention because we're 
> less likely to distribute load that would preempt a lock holder.
> 
> The default settings scale the best when there is little or no 
> contention. That's userret=1, tryself=1.  There are other parameters 
> that are important but these were the ones we were most recently 
> experimenting with.  This drops off harshly when there is significant 
> contention because most of the threads end up blocked against the same 
> lock and may be preempted then rely on priority propagation to kick in.
> 
> The default settings should encourage further refinements to subsystem 
> locking to yield the best performance.

I didn't run the 8-core configuration with the ULE topology patch, so 
the kink at 5 threads is probably due in part to poor scheduling.  This 
system is very sensitive to scheduling decisions, as you can see from 
the previous CVS curve.

I think there is also something else going on at high loads (>15) on 
this test, so it should be viewed as a WIP.  Specifically, contention 
doesn't seem to be high enough to account for a 30% performance drop, and 
I see similar drops on other 8-core tests where contention is eliminated.

What you should focus on is the large difference between the green curve, 
showing previous CVS performance, and the brown curve, showing current 
default performance.

Kris