Re: 5.2-BETA panic: page fault

From: Don Lewis <truckman_at_FreeBSD.org>
Date: Sun, 30 Nov 2003 14:42:16 -0800 (PST)

On 30 Nov, Stefan Ehmann wrote:
> On Sun, 2003-11-30 at 11:13, Stefan Ehmann wrote:
>> This happens to me several times a day (cvsup from yesterday didn't
>> change anything). The panic message is always the same, the backtrace is
>> different though (but always seems to be file system related in some
>> way)
>> 
>> Here's one from today:
> 
> As per request I made a (hopefully more useful) backtrace with a patched
> gdb version:
> 
> (kgdb) bt

> #12 0xc050f8f8 in panic () at /usr/src/sys/kern/kern_shutdown.c:550
> #13 0xc068248c in trap_fatal (frame=0xd7f2ea48, eva=0)
>     at /usr/src/sys/i386/i386/trap.c:821
> #14 0xc0682152 in trap_pfault (frame=0xd7f2ea48, usermode=0, eva=0)
>     at /usr/src/sys/i386/i386/trap.c:735
> #15 0xc0681d63 in trap (frame=
>       {tf_fs = -672006120, tf_es = -672006128, tf_ds = -1068105712,
> tf_edi = -1066664931, tf_esi = 228, tf_ebp = -671946076, tf_isp =
> -671946124, tf_ebx = 0, tf_edx = 16777217, tf_ecx = -1011687424, tf_eax
> = -1011660928, tf_trapno = 12, tf_err = 0, tf_eip = -1068475565, tf_cs =
> 8, tf_eflags = 66178, tf_esp = 2, tf_ss = -1011660928}) at
> /usr/src/sys/i386/i386/trap.c:420
> #16 0xc06743d8 in calltrap () at {standard input}:94
> #17 0xc0505b53 in _mtx_lock_flags (m=0x0, opts=0, 
>     file=0xc06bfc1d "/usr/src/sys/kern/kern_lock.c", line=228)
>     at /usr/src/sys/kern/kern_mutex.c:214
> #18 0xc0502b54 in lockmgr (lkp=0xc3b2e028, flags=0, interlkp=0xe4, 
>     td=0xc06bfc1d) at /usr/src/sys/kern/kern_lock.c:228
> #19 0xc0566d87 in vfs_busy (mp=0x0, flags=0, interlkp=0x0, td=0x0)
>     at /usr/src/sys/kern/vfs_subr.c:527
> #20 0xc056374c in lookup (ndp=0xd7f2ec00) at
> /usr/src/sys/kern/vfs_lookup.c:559

It seems pretty clear that the panic is caused by passing a null pointer
to mtx_lock(): trap_pfault() was called with eva=0 and _mtx_lock_flags()
with m=0x0.  If you have INVARIANTS defined, the first thing that
_mtx_lock_flags() does is dereference m->mtx_object, which is at the
beginning of struct mtx.
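
As a quick cross-check of the reading above, the trap frame fields that gdb
prints as signed decimals can be converted back to addresses.  The sketch
below is only an illustration: the values are copied from frame #15, the
helper names are made up for this example, and the error-code decoding
assumes the standard x86 page-fault bit layout (bit 0 = page present,
bit 1 = write, bit 2 = user mode).

```python
# Values copied from the trap frame shown in frame #15 of the backtrace.
TRAP_FRAME = {
    "tf_eip": -1068475565,
    "tf_trapno": 12,   # trap 12 is the page fault trap on FreeBSD/i386
    "tf_err": 0,
}


def as_u32(value):
    """Reinterpret a signed 32-bit integer (as gdb printed it) as unsigned."""
    return value & 0xFFFFFFFF


def decode_pf_error(err):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "page_present": bool(err & 0x1),
        "write_access": bool(err & 0x2),
        "user_mode": bool(err & 0x4),
    }


eip_hex = hex(as_u32(TRAP_FRAME["tf_eip"]))
fault = decode_pf_error(TRAP_FRAME["tf_err"])

print("faulting eip:", eip_hex)
print("fault details:", fault)
```

The converted tf_eip is the address gdb reports for frame #17
(_mtx_lock_flags), and an error code of 0 decodes to a kernel-mode read of a
page that is not present, which fits a read through a NULL pointer.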

There appears to be some stack spammage happening, and it is pretty much
consistent between this stack trace and the previous one displayed by
the unpatched version of gdb.  Notice how all the arguments to
vfs_busy() are NULL/0, yet vfs_busy() passes td and interlkp directly to
lockmgr(), which shows non-NULL td and interlkp arguments, and the
interlkp value (0xe4) looks seriously bogus.  Looking at the lockmgr() call
in vfs_busy(), I don't see how the flags argument to lockmgr() could be
0.  If the mp argument to vfs_busy() were really NULL, vfs_busy() would
have panicked before calling lockmgr().
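
One way to look at the bogus-looking lockmgr() arguments is to compare them
numerically with the neighbouring frame and with the saved registers.  The
sketch below only restates numbers already present in the backtrace; the
variable names are invented for this example.  It does not show whether the
stack was really overwritten or whether gdb is mis-decoding the frame.

```python
# Arguments exactly as gdb printed them in frames #17 and #18.
mtx_lock_args = {"m": 0x0, "opts": 0, "file": 0xC06BFC1D, "line": 228}
lockmgr_args = {"lkp": 0xC3B2E028, "flags": 0,
                "interlkp": 0xE4, "td": 0xC06BFC1D}

# Saved registers from the trap frame in frame #15 (signed, as printed).
tf_edi = -1066664931
tf_esi = 228


def as_u32(value):
    """Reinterpret a signed 32-bit integer as unsigned."""
    return value & 0xFFFFFFFF


# Compare lockmgr()'s displayed arguments with _mtx_lock_flags()'s.
interlkp_is_line = lockmgr_args["interlkp"] == mtx_lock_args["line"]
td_is_file = lockmgr_args["td"] == mtx_lock_args["file"]

# Compare the saved registers with the same two values.
edi_is_file = as_u32(tf_edi) == mtx_lock_args["file"]
esi_is_line = tf_esi == mtx_lock_args["line"]

print(interlkp_is_line, td_is_file, edi_is_file, esi_is_line)
```

All four comparisons hold: the interlkp and td values shown for lockmgr() are
the line number (228) and the file-name pointer passed to _mtx_lock_flags(),
and the same two values sit in %esi and %edi at the time of the fault.  That
is at least consistent with the displayed lockmgr() arguments not being the
values vfs_busy() actually passed.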

I wonder if an interrupt handler is stomping on the stack ...