On Mon, Aug 01, 2016 at 11:37:50AM -0700, John Baldwin wrote:
> On Sunday, July 31, 2016 02:41:13 PM Mateusz Guzik wrote:
> > On Sun, Jul 31, 2016 at 01:49:28PM +0300, Konstantin Belousov wrote:
> > [snip]
> >
> > After an irc discussion, the following was produced (also available at:
> > https://people.freebsd.org/~mjg/lock_backoff_complete4.diff):
> >
> > Differences:
> > - uint64_t usage was converted to u_int (also see r303584)
> > - currently unused features (cap limit and return value) were removed
> > - lock_delay args got packed into a dedicated structure
>
> lock_delay_enabled declaration seems to be stale?
>

Oops, thanks.

> I would maybe just provide a "standard" lock_delay_init function that the
> sysinit's use rather than duplicating the same exact code 3 times. I'm
> not sure we really want to use different tunables for different lock types
> anyway. (Alternatively we could even just have a single 'config' variable
> that is a global. We can always revisit this in the future if we find that
> we need that granularity, but it would remove an extra pointer indirection
> if you just had a single 'lock_delay_config' that was exported as a global
> for now and initialized in a single SYSINIT.)
>

The per-lock type config is partially an artifact of the real version of
the patch, which has different configs per state of the lock; see the
loops with rowner_loops in the current implementation of rw and sx locks,
which is where it mattered. It was cut off from this patch for simplicity
(90% of the benefit for 10% of the work). That said, fine-tuned it does
matter for "mere" spinning as well, but here I put very low values on
purpose.

Putting them all in one config makes for a small compatibility issue,
where debug.lock.delay_* sysctls would disappear later. So I would prefer
to just keep this as I don't think it matters much.

I have further optimisations to primitives not related to spinning. They
boil down to the fact that KDTRACE_HOOKS-enabled kernels contain an
unconditional function call to lockstat_nsecs even with the lock held.

> I think the idea is fine. I'm less worried about the overhead of the
> divide as you are only doing it when you are contesting (so you are already
> sort of hosed anyway). Long delays in checking the lock cookie can be
> bad (see my local APIC snafu which only polled once per microsecond). I
> don't really think a divide is going to be that long?
>

This should be perfectly fine. One could argue the time wasted should be
wasted efficiently, i.e. the more cpu_spinwait, the better, at least on
amd64.

-- 
Mateusz Guzik <mjguzik gmail.com>
Received on Mon Aug 01 2016 - 18:08:49 UTC
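
The design discussed in this message (delay arguments packed into a dedicated structure that points at a per-lock-type config) can be illustrated with a minimal userland sketch. This is not the code from the linked diff: the field names, the linear growth policy, and every identifier ending in `_sketch` or `_stub` are assumptions made for illustration, and the divide mentioned in the thread is left out.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical tunables for one lock type; not the kernel definition. */
struct lock_delay_config {
	unsigned int initial;	/* spins performed on the first failed attempt */
	unsigned int step;	/* growth added after each failed attempt */
	unsigned int max;	/* upper bound on spins per attempt */
};

/* Per-acquisition state. The config pointer is the extra indirection
 * that a single global config would avoid. */
struct lock_delay_arg {
	const struct lock_delay_config *config;
	unsigned int delay;	/* spins to perform on the next call */
	unsigned int spin_cnt;	/* total spins performed so far */
};

/* Stand-in for cpu_spinwait(): "pause" on x86, a compiler fence elsewhere. */
static inline void
cpu_spinwait_stub(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__asm__ __volatile__("pause" ::: "memory");
#else
	atomic_signal_fence(memory_order_seq_cst);
#endif
}

static void
lock_delay_arg_init_sketch(struct lock_delay_arg *la,
    const struct lock_delay_config *lc)
{
	assert(lc->initial <= lc->max);
	la->config = lc;
	la->delay = lc->initial;
	la->spin_cnt = 0;
}

/* Spin for the current delay, then grow the delay linearly up to max. */
static void
lock_delay_sketch(struct lock_delay_arg *la)
{
	const struct lock_delay_config *lc = la->config;
	unsigned int i;

	for (i = 0; i < la->delay; i++)
		cpu_spinwait_stub();
	la->spin_cnt += la->delay;

	la->delay += lc->step;
	if (la->delay > lc->max)
		la->delay = lc->max;
}

/* Toy test-and-set lock that backs off between attempts.
 * Returns the total number of spins spent waiting. */
static unsigned int
spin_acquire_sketch(atomic_flag *lock, const struct lock_delay_config *lc)
{
	struct lock_delay_arg la;

	lock_delay_arg_init_sketch(&la, lc);
	while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
		lock_delay_sketch(&la);
	return (la.spin_cnt);
}
```

With one config per lock type, each lock implementation would pass its own `struct lock_delay_config`; with the single global suggested in the quoted text, the `config` pointer could be dropped and the values read directly.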