On Wed, Sep 09, 2009 at 03:01:37AM +0200, Luigi Rizzo wrote: > RELENG_8/amd64 (can not try on i386) has the following problem: > > callout_reset(..., t, ...) > > processes the callout after t+1 ticks instead of the t ticks > that one sees on RELENG_7 and earlier. > > I found it by pure chance, noticing that dummynet on RELENG_8 > has a jitter that is two ticks instead of one tick. > Other systems with rely on frequent callouts might see > problems as well. > > An indirect way to see the problem is the following: > > kldload dummynet > > sysctl net.inet.ip.dummynet.tick_adjustment; \ > sleep 1; sysctl net.inet.ip.dummynet.tick_adjustment > > on a working system, the variable should remain mostly unchanged; > on a non-working system you see it growing at a rate HZ/2 > > I suspect the bug is introduced by the change in kern_timeout.c rev. 1.114 > > http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/kern/kern_timeout.c.diff?r1=1.113;r2=1.114 > > which changes softclock() to stop before the current 'ticks' > so processes everything one tick late. > > I understand the race described in the cvs log, but this does > not seem a proper fix -- it violates POLA by changing the semantics > of callout_reset(), and does not really fix the race, but just adds > an extra uncertainty of 1 tick on when a given callout will be run > > If the problem is a race between hardclock() which updates 'ticks', > and the various hardclock_cpu() instances which might arrive early, > I would suggest two alternative options: > > 1. create a per-cpu 'ticks' (say a field cc_ticks in struct callout_cpu), > increment it at the start of hardclock_cpu, and use cc->ticks > instead of 'ticks' in the various callout functions involved > with manipulation of the callwheel > callout_tick(), softclock(), callout_reset_on() > > 2. start softclock() at cc->cc_softticks -1, i.e. > > ... > CC_LOCK(cc) > - while (cc->cc_softticks != ticks) { > + while (cc->cc_softticks-1 != ticks) { > ... #2 also need this change in callout_tick() mtx_lock_spin_flags(&cc->cc_lock, MTX_QUIET); - for (; (cc->cc_softticks - ticks) < 0; cc->cc_softticks++) { + for (; (cc->cc_softticks - ticks) <= 0; cc->cc_softticks++) { bucket = cc->cc_softticks & callwheelmask; Just tested it, it seems to fix the problem. cheers luigiReceived on Tue Sep 08 2009 - 23:10:10 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:55 UTC