Re: 7.0 CURRENT, need help with panic: Trying sleep, but thread marked as sleeping prohibited

From: Robert Watson <rwatson_at_FreeBSD.org>
Date: Wed, 17 Oct 2007 11:40:47 +0100 (BST)
On Tue, 16 Oct 2007, Victor M. Blood wrote:

>> Do you have the debugging options (INVARIANTS and INVARIANT_SUPPORT)
>> enabled?  They are now disabled in RELENG_7.  If not you will just get a
>> deadlock when you are unlucky :)
> Yes, all debug options are still at their defaults.
> = KERN
> ...
> # Debugging for use in -current
> options         KDB                     # Enable kernel debugger support.
> options         DDB                     # Support DDB.
> options         GDB                     # Support remote GDB.
> options         INVARIANTS              # Enable calls of extra sanity checking
> options         INVARIANT_SUPPORT       # Extra sanity checks of internal structures, required by INVARIANTS
> options         WITNESS                 # Enable checks to detect deadlocks and cycles
> options         WITNESS_SKIPSPIN        # Don't run witness on spinlocks for speed
> ...
> = END KERN
>
> I have updated ipfilter to 4.1.27. Until then, with sources cvsup'ed with
> tag=RELENG_7 as of 10.10.2007, the system panicked in _sx_sleep() and
> _rw_sleep() with ipf 4.1.23, after some tests :)) (ping, smbfs, telnet,
> etc.). The ipfilter update finished at 17:00 GMT+3; now:
> # uptime
> 23:51 up 7:12, 2 users, load averages: 1,32 1,23 1,09
>
> Now I have updated the CVS tree and built world for RELENG_7. With ipfilter
> 4.1.23 the system stayed alive for only 1-2 minutes under network load, so I
> was compelled to disable ipfilter (ipf -D) to work with the network. So far
> there have been no failures; everything works normally.

The bug in ipfilter is that it uses a sleepable lock class in an interrupt or
software interrupt thread.  This can lead to deadlock; although that is
relatively unlikely, invariants testing reports it as a fatal condition.  The
panic won't turn up without invariants enabled, and in practice the deadlock
is quite unlikely, but it reflects a violation of the assumptions under which
kernel synchronization is designed to work.  Switching to a non-sleepable lock
class doesn't provide an instant solution, because the non-sleepable lock
would then be held over a potentially sleepable path for managing the firewall
from user space (a copyin/copyout may take a page fault that sleeps waiting on
disk I/O or, worse, on network I/O from network-backed swap, which could lead
directly to the deadlock).  Chances are this is relatively easy to fix, but
someone needs to do it -- ideally someone very familiar with ipfilter. :-)
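
To make the two-sided constraint above concrete, here is a minimal userspace
sketch in C.  It is a toy model only: every name in it (toy_thread, toy_lock,
toy_copyin, packet_path, ioctl_path) is hypothetical and is not a FreeBSD
kernel or ipfilter API.  It assumes just two rules: a context that may not
sleep must not take a sleepable lock, and nothing that may sleep (such as a
copyin that page-faults) may run while a non-sleepable lock is held.

```c
#include <assert.h>

/* Toy model; none of these names exist in the FreeBSD kernel. */
enum lock_class { LOCK_NONSLEEPABLE, LOCK_SLEEPABLE };
enum toy_result { TOY_OK, TOY_VIOLATION };

struct toy_thread {
    int no_sleeping;        /* > 0: e.g. an interrupt or software interrupt thread */
    int nonsleepable_held;  /* number of non-sleepable locks currently held */
};

/* Models the sanity check: is this thread allowed to sleep right now? */
static enum toy_result toy_may_sleep(const struct toy_thread *td)
{
    if (td->no_sleeping > 0 || td->nonsleepable_held > 0)
        return TOY_VIOLATION;
    return TOY_OK;
}

/* Acquiring a sleepable lock may sleep on contention, so it is checked. */
static enum toy_result toy_lock(struct toy_thread *td, enum lock_class cls)
{
    if (cls == LOCK_SLEEPABLE)
        return toy_may_sleep(td);
    td->nonsleepable_held++;
    return TOY_OK;
}

static void toy_unlock(struct toy_thread *td, enum lock_class cls)
{
    if (cls == LOCK_NONSLEEPABLE) {
        assert(td->nonsleepable_held > 0);
        td->nonsleepable_held--;
    }
}

/* Models copyin/copyout: it may page-fault and sleep on I/O. */
static enum toy_result toy_copyin(const struct toy_thread *td)
{
    return toy_may_sleep(td);
}

/* Packet processing in a thread that is prohibited from sleeping. */
static enum toy_result packet_path(enum lock_class cls)
{
    struct toy_thread td = { 1, 0 };
    enum toy_result r = toy_lock(&td, cls);
    if (r == TOY_OK)
        toy_unlock(&td, cls);
    return r;
}

/* Firewall management from user space: copy data while holding the lock. */
static enum toy_result ioctl_path(enum lock_class cls)
{
    struct toy_thread td = { 0, 0 };
    enum toy_result r = toy_lock(&td, cls);
    if (r != TOY_OK)
        return r;
    r = toy_copyin(&td);
    toy_unlock(&td, cls);
    return r;
}
```

In this model, packet_path() reports a violation with LOCK_SLEEPABLE and
ioctl_path() reports one with LOCK_NONSLEEPABLE, so neither lock class
satisfies both paths as long as one lock covers both the packet path and a
copy to or from user space.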

In practice, I wouldn't expect the deadlock to occur much, if at all, FWIW, so
users with common configurations shouldn't run into a problem, and with
invariants disabled you may well be fine.

Robert N M Watson
Computer Laboratory
University of Cambridge
Received on Wed Oct 17 2007 - 08:40:48 UTC
