Re: ULE 2.0

From: Jeff Roberson <jroberson_at_chesapeake.net> Date: Thu, 4 Jan 2007 23:08:58 -0800 (PST) · This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:04 UTC

On Fri, 5 Jan 2007, David Xu wrote:

> Jeff Roberson wrote:
>
>
>> I just looked it up.  The solaris algorithm prevents direct preemption if 
>> the priority is not in the kernel or realtime priority range.  Otherwise it 
>> simply sets NEEDRESCHED.  When assigning a thread to a cpu it checks to see 
>> if the last cpu it executed on is running a higher priority thread (lower 
>> for solaris where higher is better) and if so puts it there. Otherwise it 
>> finds the cpu which is executing the lowest priority thread and assigns it 
>> there.
>> 
>> This basically balances the cpus by priority rather than by load. Balancing 
>> by priority indirectly balances by load because threads which have slept 
>> for a long time will eventually have a higher priority.  This is pretty 
>> smart, although it ends up inspecting per-cpu state of remote cpus on 
>> almost every wakeup.  I would expect this to perform badly but perhaps the 
>> improvement in user-space speed outweighs the kernel time.
>> 
> The lookup does not always happen, there is a per-cpu rechoose value, it
> might be some ticks, a thread slept long enough (exceed the interval
> ticks) will end up looking for a cpu which has lower priority can run
> it, if it can not find one, it will stay at a cpu which has lowest priority 
> but closest to the thread's last cpu, this is done by looking
> a cpu from leaf to root in a cpu topology tree. if the resumed thread
> is still within the interval, it will stay at its last cpu.
> The alogrithm bets that after some ticks, a sleeping thread's cache is 
> trashed by other threads, so it should look for a cpu can run it
> immediately, but still closest to its last cpu. if a cpu is selected,
> it will look for its next cpu which is neighbour in a circle link list,
> check its runqueue length at the priority, and may use the next cpu
> rather than the cpu returned by above algorithm, the circle link list
> might link cpus within same locality together, e.g dual-core cpus
> might be a candidate. there is another path which makes a chip balance, e.g 
> it tries to balance two chips which are multiple-cores, and make them have 
> same load ratio. I am not Solaris expert, so above may not be
> very accurate. :-)

Thanks for the description.  I'm going to look into this over the next few 
weeks.

>
>> I'll investigate this furher.  One slight problem with this approach for 
>> ULE is that sitting on the run-queue for a long period of time improves 
>> your position on the run-queue but not directly your priority.  I think I 
>> can solve this though by resetting the priority as soon as you run which 
>> will take into account the ticks that you were waiting.
>
> Another problem in linux and ULE scheduler is they are less history
> friendly within past load, it only looks the thread itself's run time
> and sleep time, and don't know the system load, and calculate its priority 
> based on this, under heavy load, it could be very inaccuracy,
> maybe you still need some anti-unfairness code, for example a thread
> stayed on a runqueue too long, ULE boosts its priority a bit, though I
> don't know how to do it in cheapest way.

This has changed significantly with the ULE 2.0 commit.  I no longer have 
the split run queue.  ULE now boosts the priority as the thread sits on 
the run queue via a circular queue mechanism.

>
>> Anyway, thanks for the pointer Xu.  I may hack this up and compare it to 
>> ULE's current balancing.
>
> Yesterday, I have tested super-smack benchmark on 2-cpu machine, ULE
> decreased performance about 40%, this might be a regression though.

I have found significant new bugs that were introduced mostly by changes 
in the threading system that ULE did not keep up with.  It was, for a 
time, slightly faster than 4BSD on supersmack on several of my machines. 
I know it can be improved.  I'm going to look into this sun algorithm in 
the next few weeks and see how it works out with ULE.

Thanks,
Jeff

>
>> 
>> Cheers,
>> Jeff
>> 
>
> Regards,
> David Xu
>