Re: [PATCH] microoptimize locking primitives by introducing randomized delay between atomic ops

From: Mateusz Guzik <mjguzik_at_gmail.com>
Date: Sun, 10 Jul 2016 14:44:28 +0200
On Sun, Jul 10, 2016 at 03:22:47PM +0300, Konstantin Belousov wrote:
> On Sun, Jul 10, 2016 at 01:13:26PM +0200, Mateusz Guzik wrote:
> > If the lock is contended, primitives like __mtx_lock_sleep will spin
> > checking if the owner is running or the lock was freed. The problem is
> > that once it is discovered that the lock is free, multiple CPUs are
> > likely to try to do the atomic op which will make it more costly for
> > everyone and throughput suffers.
> > 
> > The standard thing to do is to have some sort of a randomized delay so
> > that this kind of behaviour is reduced.
> > 
> > As such, below is a trivial hack which takes cpu_ticks() into account
> > and performs % 2048, which in my testing gives reasonably good results.
> > 
> > Please note there is definitely way more room for improvement in general.
> > 
> > In terms of results, there was no statistically significant change in
> > -j 40 buildworld nor buildkernel.
> > 
> > However, a 40-way find on a ports tree placed on tmpfs yielded the following:
> 
> I am curious why you added the randomizer to the sx adaptive loop but not
> to the lockmgr loop and, probably most importantly, to the spinlocks
> (unless I misread the patch).

This is a simple first step where I modified loops which I suspect
benefit the most from the trivial change.

lockmgr and other places do have loops with a configurable delay, but the
delay is the same for everyone. So they are already somewhat randomized by
their time of arrival in the primitive.

On the other hand, the loops modified in the patch all check for the
availability of the lock without a constant delay and thus are
significantly more susceptible to trying to grab it at the same time.

That said, I do intend to take care of the "static" loops as well later.
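
For illustration only, the randomized-delay idea can be sketched in
userspace C11 roughly as follows. This is not the code from the patch:
fake_cpu_ticks(), spin_delay_count(), spin_acquire() and spin_release() are
hypothetical names, the clock read merely stands in for cpu_ticks(), and
where exactly the delay sits relative to the atomic op is an assumption.
Only the "% 2048" comes from the description above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

#define SPIN_DELAY_MAX	2048	/* modulus mentioned in the patch description */

/* Hypothetical stand-in for the kernel's cpu_ticks(). */
static uint64_t
fake_cpu_ticks(void)
{
	struct timespec ts;

	timespec_get(&ts, TIME_UTC);
	return ((uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec);
}

/* Number of busy iterations to burn before trying the atomic op. */
static unsigned
spin_delay_count(uint64_t ticks)
{
	return ((unsigned)(ticks % SPIN_DELAY_MAX));
}

/* Lock word: 0 means free, anything else identifies the owner. */
static void
spin_acquire(atomic_uintptr_t *lock, uintptr_t self)
{
	uintptr_t expected;
	volatile unsigned i;
	unsigned n;

	assert(self != 0);
	for (;;) {
		/* Read-only wait while somebody holds the lock. */
		while (atomic_load_explicit(lock, memory_order_relaxed) != 0)
			;
		/*
		 * The lock looks free. Wait a pseudo-random amount so that
		 * the waiters do not all issue the atomic op at once.
		 */
		n = spin_delay_count(fake_cpu_ticks());
		for (i = 0; i < n; i++)
			;
		expected = 0;
		if (atomic_compare_exchange_strong_explicit(lock, &expected,
		    self, memory_order_acquire, memory_order_relaxed))
			return;
	}
}

static void
spin_release(atomic_uintptr_t *lock)
{
	atomic_store_explicit(lock, 0, memory_order_release);
}
```

A "static" loop in this sketch would use the same fixed n for every waiter,
so any spread between CPUs would come only from when each one arrived.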

Meanwhile, running:
time (find . -depth 2 -type d | xargs -n 1 -P 40 -I DIR make -C DIR
build-depends-list > /dev/null 2> /dev/null)

drops from ~4:00 minutes to ~2:42 with the patch applied.

-- 
Mateusz Guzik <mjguzik gmail.com>
Received on Sun Jul 10 2016 - 10:44:33 UTC
