Re: panic on 5.2 BETA: blockable sleep lock

From: Don Lewis <truckman_at_FreeBSD.org>
Date: Tue, 25 Nov 2003 23:33:31 -0800 (PST)
On 26 Nov, Stefan Ehmann wrote:
> I got the following panic twice when starting xawtv using 5.2 BETA (CVS
> from Oct 23)
> panic: blockable sleep lock (sleep mutex) sellck
> _at_/usr/src/sys/kern/sys_generic.c:1145
> 
> I don't think it is directly related to bktr since the last commit there
> was ~3 months ago. Here is the dmesg output for completeness though.
> bktr0: <BrookTree 878> mem 0xdf103000-0xdf103fff irq 10 at device 12.0
> on pci0
> bktr0: Card has no configuration EEPROM. Cannot determine card make.
> bktr0: IMS TV Turbo, Philips FR1236 NTSC FM tuner.
> 
> 
> Here is a backtrace:
> 
> GNU gdb 5.2.1 (FreeBSD)
> Copyright 2002 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you
> are
> welcome to change it and/or distribute copies of it under certain
> conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for
> details.
> This GDB was configured as "i386-undermydesk-freebsd"...
> panic: blockable sleep lock (sleep mutex) sellck _at_
> /usr/src/sys/kern/sys_generic.c:1145
> panic messages:
> --
> panic: blockable sleep lock (sleep mutex) sellck _at_
> /usr/src/sys/kern/sys_generic.c:1145
> 
> syncing disks, buffers remaining... 3023 3023 panic: mi_switch: switch
> in a critical section

> #5  0xc05153b7 in panic () at /usr/src/sys/kern/kern_shutdown.c:550
> #6  0xc053c010 in witness_lock (lock=0xc076cf40, flags=8, 
>     file=0xc06d2139 "/usr/src/sys/kern/sys_generic.c", line=1145)
>     at /usr/src/sys/kern/subr_witness.c:609
> #7  0xc050b57a in _mtx_lock_flags (m=0xc3a56dc0, opts=0, 
>     file=0xc06ff79c "\037#nÀ\t", line=-1065955520)
>     at /usr/src/sys/kern/kern_mutex.c:221
> #8  0xc0540436 in selrecord (selector=0xc076cf40, sip=0xc3a56dc0)
>     at /usr/src/sys/kern/sys_generic.c:1145
> #9  0xc08509f1 in bktr_poll () from /boot/kernel/bktr.ko


The bktr_poll() implementation invokes the macros DISABLE_INTR() and
ENABLE_INTR() as a wrapper around a block of code that calls
selrecord().

        DISABLE_INTR(s);
 
        if (events & (POLLIN | POLLRDNORM)) {
   
                switch ( FUNCTION( minor(dev) ) ) {
                case VBI_DEV:
                        if(bktr->vbisize == 0)
                                selrecord(td, &bktr->vbi_select);
                        else
                                revents |= events & (POLLIN | POLLRDNORM);
                        break;
                }
        }

        ENABLE_INTR(s);

The implementation of DISABLE_INTR() and ENABLE_INTR() is defined in
bktr_os.h as:

#if defined(__FreeBSD__)
#if (__FreeBSD_version >=500000)
#define DECLARE_INTR_MASK(s)    /* no need to declare 's' */
#define DISABLE_INTR(s)         critical_enter()
#define ENABLE_INTR(s)          critical_exit()
#else


The man page for critical_enter() and friends says:

DESCRIPTION
     These functions are used to prevent preemption in a critical region of
     code.  All that is guaranteed is that the thread currently executing on a
     CPU will not be preempted.  Specifically, a thread in a critical region
     will not migrate to another CPU while it is in a critical region.  The
     current CPU may still trigger faults and exceptions during a critical
     section; however, these faults are usually fatal.

     The cpu_critical_enter() and cpu_critical_exit() functions provide the
     machine dependent disabling of preemption, normally by disabling inter-
     rupts on the local CPU.

     The critical_enter() and critical_exit() functions provide a machine
     independent wrapper around the machine dependent API.  This wrapper cur-
     rently saves state regarding nested critical sections.  Nearly all code
     should use these versions of the API.

     Note that these functions are not required to provide any inter-CPU syn-
     chronization, data protection, or memory ordering guarantees and thus
     should not be used to protect shared data structures.

     These functions should be used with care as an infinite loop within a
     critical region will deadlock the CPU.  Also, they should not be inter-
     locked with operations on mutexes, sx locks, semaphores, or other syn-
     chronization primitives.


The problem is that selrecord() wants to lock a MTX_DEF mutex, which can
cause a context switch if the mutex is already locked by another thread.
This is contrary to what bktr_poll() wants to accomplish by calling
critical_enter().

I'm guessing that bktr_poll() wants to prevent bktr->vbisize from being
updated by an interrupt while the above mentioned block of code is
executing, possibly to prevent a select wakeup from being missed.  If
this is correct, the proper fix would probably be for a mutex to be
added to the bktr structure, and the DISABLE_INTR() and ENABLE_INTR()
calls above to lock and unlock this mutex.  Other places in the code
that manipulate bktr->vbisize should grab the mutex before doing so, and
there are places in the code that are currently calling tsleep() that
should probably be converted to use msleep() with this new mutex.  One
example is this code fragment from vbi_read():

        while(bktr->vbisize == 0) {
                if (ioflag & IO_NDELAY) {
                        return EWOULDBLOCK;
                }
   
                bktr->vbi_read_blocked = TRUE;
                if ((status = tsleep(VBI_SLEEP, VBIPRI, "vbi", 0))) {
                        return status;
                }
        }
Received on Tue Nov 25 2003 - 22:33:40 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:37:31 UTC