RE: LOR: sched lock vs. sio + panic in sched_choose() [ULE + SMP panic]

From: John Baldwin <jhb_at_FreeBSD.org>
Date: Fri, 06 Jun 2003 12:39:46 -0400 (EDT)

On 06-Jun-2003 David P. Reese Jr. wrote:
> I've been getting a lot of these for the last two weeks on my SMP box.
> This panic is on -CURRENT from earlier today.  Scheduler is ULE.
> 
> lock order reversal
>  1st 0xc047f820 sched lock (sched lock) @ /usr/src/sys/kern/kern_intr.c:548
>  2nd 0xc04b83c0 sio (sio) @ /usr/src/sys/dev/sio/sio.c:3242

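This LOR is a side effect of the panic, not a separate problem.  You are
using a serial console, so the panic message is printed through sio while
sched lock is still held.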

> Stack backtrace:
> backtrace(c0400378,c04b83c0,c0463120,c0463120,c041266b) at backtrace+0x17
> witness_lock(c04b83c0,8,c041266b,caa,c11efc00) at witness_lock+0x697
> _mtx_lock_spin_flags(c04b83c0,0,c041266b,caa,0) at _mtx_lock_spin_flags+0xd1
> siocnputc(c0463280,d,5,d1d62b68,0) at siocnputc+0x81
> cnputc(a,ffffffff,1,c0415c53,c) at cnputc+0x56
> putchar(a,d1d62b68,d1d62ab4,c0491d40,0) at putchar+0xcd
> kvprintf(c0415c52,c025eba0,d1d62b68,a,d1d62b88) at kvprintf+0x7d
> printf(c0415c52,c,c0415a4d,c03fe55b,c0489b20) at printf+0x57

This is the real panic below:

> trap_fatal(d1d62c14,38,d1d62bf0,c0236c9d,38) at trap_fatal+0x76
> trap(d1d60018,c0240010,c0470010,c11dcbe0,c0482280) at trap+0x123
> calltrap() at calltrap+0x5
> --- trap 0xc, eip = 0xc0253ec7, esp = 0xd1d62c54, ebp = 0xd1d62c68 ---
> sched_choose(c11dee40,c03fe7a6,28c,0,c11db668) at sched_choose+0x77
> choosethread(c11dcbe0,2,c03fdb89,1dc,b6e81bd0) at choosethread+0x36
> mi_switch(c047f820,0,c03fb1fd,224,c11db5ac) at mi_switch+0x200
> ithread_loop(c11da180,d1d62d48,c03fb0ae,30c,55ff44fd) at ithread_loop+0x256
> fork_exit(c022caf0,c11da180,d1d62d48) at fork_exit+0xc0
> fork_trampoline() at fork_trampoline+0x1a
> --- trap 0x1, eip = 0, esp = 0xd1d62d7c, ebp = 0 ---
> 
> 
> Fatal trap 12: page fault while in kernel mode
> cpuid = 1; lapic.id = 01000000
> fault virtual address   = 0x38
> fault code              = supervisor read, page not present
> instruction pointer     = 0x8:0xc0253ec7
> stack pointer           = 0x10:0xd1d62c54
> frame pointer           = 0x10:0xd1d62c68
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 14 (swi7: tty:sio clock)
> kernel: type 12 trap, code=0
> Stopped at      sched_choose+0x77:      movl    0x38(%eax),%eax

This is a ULE + SMP panic that Jeff has been trying to track down.  It
appears to be a NULL pointer dereference of some sort.
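
The trap output supports the NULL pointer reading: the faulting instruction
is "movl 0x38(%eax),%eax" and the fault virtual address is 0x38, which is
what you get when %eax is 0 and the code reads a member 0x38 bytes into a
structure.  A minimal sketch of that arithmetic follows.  The structure and
its member names are hypothetical stand-ins, not the real ULE data
structures, which this thread has not identified yet; only the 0x38 offset
comes from the trap output.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout, NOT the real ULE structure.  The padding exists
 * only to place a member at the 0x38 offset seen in the faulting
 * instruction.
 */
struct example_obj {
	char	pad[0x38];	/* stand-in for whatever precedes the member */
	void	*field;		/* member the faulting load would read */
};

/*
 * Return the address a load of base->field would touch, computed with
 * plain integer arithmetic so that nothing is actually dereferenced.
 */
static uintptr_t
field_address(uintptr_t base)
{
	return (base + offsetof(struct example_obj, field));
}
```

With a base of 0, field_address() returns 0x38, matching the reported fault
address.  Which pointer is actually NULL in sched_choose() still has to be
determined from the source line.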

> I recall most if not all of these panics occurring when swi7: tty:sio clock
> is the current process.  These are not completely repeatable, but if I
> simply reboot a couple of times, I can get the panic to occur while the
> rc scripts are being run.

Can you run 'l *sched_choose+0x77' in gdb on kernel.debug to get the
source line corresponding to this panic?

-- 

John Baldwin <jhb_at_FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/
Received on Fri Jun 06 2003 - 07:39:49 UTC