Re: Expensive timeout(9) function ?

From: John Baldwin <jhb_at_FreeBSD.org>
Date: Tue, 06 Jan 2004 10:21:23 -0500 (EST)

On 05-Jan-2004 Bruce Evans wrote:
> On Sun, 4 Jan 2004, Bjoern A. Zeeb wrote:
> 
>> what reports do you expect with the
>>
>>      "Expensive timeout(9) function"
>>
>> message ?
> 
> What you reported (function names and timeout time) is interesting.
> 
>> Why do we see it ?
> 
> Kernel bugs :-).
> 
>> Expensive timeout(9) function: 0xc04885a0(0) 1.024846430 s   [1]
>> Expensive timeout(9) function: 0xc04885a0(0) 1.024846430 s   [1]
>> Expensive timeout(9) function: 0xc04b3940(0) 0.008629758 s   [2]
>> Expensive timeout(9) function: 0xc04b39a0(0) 0.004333781 s   [2]
>> Expensive timeout(9) function: 0xc04f71f0(0) 0.027004551 s   [3]
>> Expensive timeout(9) function: 0xc04f71f0(0) 0.027004551 s   [3]
>> Expensive timeout(9) function: 0xc04f71f0(0) 0.027004551 s   [3]
>>
>> [1] sys/kern/kern_synch.c:loadav()
>> [2] sys/kern/uipc_domain.c:pfslowtimo()
>> [3] sys/netinet/ip_fw2.c:ipfw_tick()
> 
> [1] is easiest to understand.  loadav() is obviously broken since it uses
> sleep locks.  Apparently it sometimes sleeps for more than 1 second
> altogether!  There is a check for sleeping in timeouts under DIAGNOSTIC.
> I would expect complaints from this too if you just used DIAGNOSTIC
> to get the above.
> 
> [3] ipfw_tick() is obviously broken in the same way.  This is from
> blind conversion of splimp() to a sleep lock.  Mutexes work quite
> differently from spl's.  A quick fix for timeout routines that only
> lock things once might be to use mtx_trylock() and not do anything in
> the timeout routine (except re-arm the timeout, perhaps with a smaller
> interval) if the mutex cannot be acquired immediately.  This depends
> on the exact timing of timeout routines not being critical (not that
> we have exact timing -- the above shows all timeouts being delayed by
> a factor of at least 100 (1 second instead of 1/100 seconds)).  This
> should work especially well in loadav() -- loadav() intentionally adds
> jitter to the interval.  This might have worked in schedcpu() too
> (schedcpu() was converted to a thread).
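
The mtx_trylock() suggestion above can be sketched roughly as follows.
This is a userland analog only, not kernel code: pthread_mutex_trylock()
stands in for mtx_trylock(), and the struct, the function name and both
interval constants are hypothetical names made up for illustration.  The
handler never blocks; if the lock is busy it skips the work and asks to
be re-armed with a shorter interval.

```c
/* Userland analog of the trylock idea; NOT actual FreeBSD kernel code. */
#include <pthread.h>
#include <stdbool.h>

#define NORMAL_INTERVAL_TICKS 100 /* hypothetical regular period */
#define RETRY_INTERVAL_TICKS    1 /* hypothetical shorter retry period */

struct tick_state {
	pthread_mutex_t lock;  /* stands in for the kernel mutex */
	int work_done;         /* how many times the real work ran */
	int next_interval;     /* stands in for re-arming the timeout */
};

/*
 * Timeout handler that never sleeps on the lock.
 * Returns true if the periodic work was done on this call.
 */
static bool
tick_handler(struct tick_state *st)
{
	if (pthread_mutex_trylock(&st->lock) != 0) {
		/* Lock is busy: do nothing now, retry sooner. */
		st->next_interval = RETRY_INTERVAL_TICKS;
		return false;
	}
	st->work_done++;                 /* the actual periodic work */
	pthread_mutex_unlock(&st->lock);
	st->next_interval = NORMAL_INTERVAL_TICKS;
	return true;
}
```

As noted in the quoted text, this only makes sense when the exact timing
of the routine is not critical, since a busy lock delays the work by at
least one retry interval.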

Ugh, loadav() needs to move to a thread, too, then.  Perhaps loadav()
and schedcpu() can share a thread by having the schedcpu thread just
run loadav() occasionally.
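
A minimal sketch of the shared-thread idea, again as plain userland C
rather than kernel code.  The stub functions, the counters struct and
the LOADAV_EVERY ratio are all hypothetical; the point is only that one
loop runs the schedcpu work on every pass and the loadav work on every
Nth pass, in a context where sleeping on a lock would be allowed.

```c
/* Illustrative sketch only; names and the ratio are assumptions. */
#define LOADAV_EVERY 5 /* hypothetical: run loadav work every 5th pass */

struct pass_counters {
	int schedcpu_runs;
	int loadav_runs;
};

/* Stand-ins for the real schedcpu() and loadav() work. */
static void schedcpu_stub(struct pass_counters *c) { c->schedcpu_runs++; }
static void loadav_stub(struct pass_counters *c)   { c->loadav_runs++; }

/* One pass of the shared thread's loop body; iteration counts from 0. */
static void
shared_thread_pass(struct pass_counters *c, unsigned iteration)
{
	schedcpu_stub(c);
	if (iteration % LOADAV_EVERY == 0)
		loadav_stub(c);
}

/* Run a fixed number of passes so the behaviour can be checked. */
static void
run_shared_thread(struct pass_counters *c, unsigned passes)
{
	for (unsigned i = 0; i < passes; i++) {
		shared_thread_pass(c, i);
		/* A real thread would sleep here until the next period. */
	}
}
```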

-- 

John Baldwin <jhb_at_FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
Received on Tue Jan 06 2004 - 06:21:26 UTC
