Re: It's happening again (panic early in boot)

From: Ian FREISLICH <if_at_hetzner.co.za>
Date: Sat, 05 Jun 2004 23:44:09 +0200
John Baldwin wrote:
> On Friday 04 June 2004 11:14 am, Ian FREISLICH wrote:
> > John Baldwin wrote:
> > > On Friday 04 June 2004 06:45 am, Ian FREISLICH wrote:
> > > > Hi
> > > >
> > > > Every month or so after it started working I get this panic.
> > > > The panic then goes away after a month or two, with no
> > > > explanation.  During the existence of the panic I try new kernel
> > > > source once a day.
> > > >
> > > > This is an SMP machine.  Using the same source UP kernels work
> > > > fine, SMP kernels don't.  The last SMP kernel that worked is
> > > > circa May 17.
> > >
> > > grr, I still don't know why this happens.  One thing, though, is
> > > that if we can fix the nested panic we might be able to work on
> > > the first one.
> >
> > If you want access to the box in question, I can arrange that.
> >
> > > > Booting [/boot/kernel/kernel]...
> > > > /boot/kernel/acpi.ko text=0x3a0e4 data=0x19e4+0x11ac
> > > > syms=[0x4+0x6860+0x4+0x8a87 ]
> > > > Copyright (c) 1992-2004 The FreeBSD Project.
> > > > Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993,
> > > > 1994 The Regents of the University of California. All rights reserved.
> > > > FreeBSD 5.2-CURRENT #15: Fri Jun  4 10:23:23 SAST 2004
> > > >     ianf_at_brane-dead.freislich.nom.za:/usr/src/sys/i386/compile/BRANE-DEAD
> > > > Preloaded elf kernel "/boot/kernel/kernel" at 0xc0728000.
> > > > Preloaded elf module "/boot/kernel/acpi.ko" at 0xc0728244.
> > > > Timecounter "i8254" frequency 1193182 Hz quality 0
> > > > CPU: Pentium II/Pentium II Xeon/Celeron (267.27-MHz 686-class CPU)
> > > >   Origin = "GenuineIntel"  Id = 0x634  Stepping = 4
> > > >   Features=0x80fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,MMX>
> > > > real memory  = 201261056 (191 MB)
> > > > avail memory = 191311872 (182 MB)
> > > > MPTable: <OEM00000 PROD00000000>
> > > > kernel trap 12 with interrupts disabled
> > > >
> > > >
> > > > Fatal trap 12: page fault while in kernel mode
> > > > cpuid = 0; apic id = 00
> > > > fault virtual address   = 0x1c
> > > > fault code              = supervisor write, page not present
> > > > instruction pointer     = 0x8:0xc058d98e
> > >
> > > Can you do a gdb -k on kernel.debug and do 'l *' on this address?  That
> > > might let us fix the panic in vm_fault().
> >
> > Is this what you're after?
> >
> > (kgdb) l * 0xc058d98e
> > 0xc058d98e is in vm_fault (machine/atomic.h:154).
> > 149     static __inline int
> > 150     atomic_cmpset_int(volatile u_int *dst, u_int exp, u_int src)
> > 151     {
> > 152             int res = exp;
> > 153
> > 154             __asm __volatile (
> > 155             "       " __XSTRING(MPLOCKED) " "
> > 156             "       cmpxchgl %1,%2 ;        "
> > 157             "       setz    %%al ;          "
> > 158             "       movzbl  %%al,%0 ;       "
> >
> > Ian
>
> Hmm, darn inlines. :) Can you compile the kernel with either
> INVARIANTS or MUTEX_NOINLINE so that mutex ops aren't inlined,
> reproduce the panic and then do the same lookup using the new faulting
> IP?

(kgdb) l * 0xc04b9828
0xc04b9828 is in _mtx_lock_flags (../../../kern/kern_mutex.c:247).
242     void
243     _mtx_lock_flags(struct mtx *m, int opts, const char *file, int line)
244     {
245
246             MPASS(curthread != NULL);
247             KASSERT(m->mtx_object.lo_class == &lock_class_mtx_sleep,
248                 ("mtx_lock() of spin mutex %s @ %s:%d", m->mtx_object.lo_name,
249                 file, line));
250             WITNESS_CHECKORDER(&m->mtx_object, opts | LOP_NEWORDER | LOP_EXCLUSIVE,
251                 file, line);
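
With the mutex operations no longer inlined, the faulting address now
resolves to _mtx_lock_flags() instead of the inlined atomic_cmpset_int().
For readers unfamiliar with that primitive, here is a minimal userland
sketch of compare-and-set semantics using C11 atomics.  It is not the
kernel's implementation: cmpset_int() and try_acquire() are hypothetical
names, and the "0 means unowned" lock word is an assumption made for
illustration.  The point is that the operation writes through dst, so a
bad lock pointer would be consistent with the "supervisor write, page
not present" fault at a small address like 0x1c.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Hypothetical stand-in for atomic_cmpset_int(): store src into *dst
 * only if *dst still equals expect.  Returns 1 on success, 0 otherwise.
 */
static int
cmpset_int(atomic_uint *dst, unsigned int expect, unsigned int src)
{
	return atomic_compare_exchange_strong(dst, &expect, src);
}

/*
 * Hypothetical lock word: 0 means unowned, anything else is an owner id.
 * Acquisition succeeds only if the word was 0 at the time of the swap.
 */
static int
try_acquire(atomic_uint *lock_word, unsigned int owner)
{
	assert(owner != 0);	/* 0 is reserved for "unowned" */
	return cmpset_int(lock_word, 0, owner);
}
```

Calling try_acquire() twice on the same word with different owner ids
succeeds the first time and fails the second, leaving the first owner
in place.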


Interestingly, with INVARIANTS the panic is exactly the same except
for this (new) text at the end of the multiple panic:

panic: page fault
at line 815 in file ../../../i386/i386/trap.ccpuid = 0; 
Uptime: 1s
panic: _mtx_lock_sleep: recursed on non-recursive mutex system map @ ../../../vm/vm_map.c:2876

at line 437 in file ../../../kern/kern_mutex.ccpuid = 0; 
Uptime: 1s
panic: _mtx_lock_sleep: recursed on non-rep
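
The nested panic above complains that a thread tried to take a
non-recursive mutex it already holds.  As a rough userland analogue
(not the kernel's mtx(9) code), a POSIX error-checking mutex reports the
same condition by returning EDEADLK on the second lock attempt.  The
function name relock_result() is made up for this sketch; build with
-pthread.

```c
#define _XOPEN_SOURCE 700

#include <assert.h>
#include <errno.h>
#include <pthread.h>

/*
 * Lock an error-checking (non-recursive) mutex twice from the same
 * thread and return the result of the second attempt.
 */
static int
relock_result(void)
{
	pthread_mutexattr_t attr;
	pthread_mutex_t m;
	int rc, second;

	rc = pthread_mutexattr_init(&attr);
	assert(rc == 0);
	rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	assert(rc == 0);
	rc = pthread_mutex_init(&m, &attr);
	assert(rc == 0);
	pthread_mutexattr_destroy(&attr);

	rc = pthread_mutex_lock(&m);		/* first acquisition succeeds */
	assert(rc == 0);
	second = pthread_mutex_lock(&m);	/* same owner: refused, no deadlock */

	pthread_mutex_unlock(&m);
	pthread_mutex_destroy(&m);
	return (second);
}
```

Where the userland mutex hands back an error code, the kernel with
INVARIANTS panics instead, which is why the message repeats once the
panic path itself needs the lock.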

Ian

--
Ian Freislich
Received on Mon Jun 07 2004 - 09:00:19 UTC