Re: new panic in cpu_reset() with WITNESS

From: Andriy Gapon <avg_at_FreeBSD.org>
Date: Mon, 23 Jan 2012 10:21:57 +0200
on 22/01/2012 18:35 Gleb Smirnoff said the following:
> On Sat, Jan 21, 2012 at 03:38:59PM +0200, Andriy Gapon wrote:
> A> on 20/01/2012 17:41 Gleb Smirnoff said the following:
> A> > 
> A> > On Tue, Jan 17, 2012 at 03:02:42PM +0400, Gleb Smirnoff wrote:
> A> > T>   New panic has been introduced somewhere between
> A> > T> r229851 and r229932, that happens on shutdown if
> A> > T> kernel has WITNESS and doesn't have WITNESS_SKIPSPIN.
> A> > 
> A> > I've run through binary search and panic was introduced
> A> > by r229854.
> A> > 
> A> 
> A> Which means that it was introduced by one of the earlier commits (probably
> A> mine).  Thank you for this investigative work!
> A> Although, to be frank and honest, I still don't see how the problem could be
> A> a non-issue before.
> 
> I can confirm that I can reboot succesfully with a new kernel
> and kern.stop_scheduler_on_panic=0.

This is very puzzling.  You observation means that stop_scheduler_on_panic has
an effect on the code outside the panic path.  I reviewed the
SCHEDULER_STOPPED() changes and I still can not see how that could be possible
(modulo compiler bugs).

Unfortunately, I also can not reproduce the problem here - without
WITNESS_SKIPSPIN I am running into a panic[*] during boot.
BTW, do you get any other LOR reports _before_ you do a reboot?

Can you try to change printfs in witness to db_printfs?  Perhaps this will allow
to get the details of the LOR in uart_cnputc.  Maybe that will reveal some
important additional details.

Finally, can you try the patch from my other post?

[*]
The panic:
...
Root mount waiting for: usbus0
panic: mtx_lock_spin: recursed on non-recursive mutex cnputs_mtx _at_
/usr/src/sys/kern/kern_cons.c:500

cpuid = 0
curthread: 0xfffffe000230b900
stack: 0xffffff802cda3000 - 0xffffff802cda7000
stack pointer: 0xffffff802cda5c50
KDB: stack backtrace:
db_trace_self_wrapper() at 0xffffffff802bf41a = db_trace_self_wrapper+0x2a
kdb_backtrace() at 0xffffffff8053382a = kdb_backtrace+0x3a
panic() at 0xffffffff804f8690 = panic+0x260
_mtx_lock_spin_flags() at 0xffffffff804e5eff = _mtx_lock_spin_flags+0xef
cnputs() at 0xffffffff804b0ea6 = cnputs+0x36
putchar() at 0xffffffff8053887f = putchar+0x24f
kvprintf() at 0xffffffff80536b1b = kvprintf+0x9b
vprintf() at 0xffffffff80538225 = vprintf+0x85
printf() at 0xffffffff805382f7 = printf+0x67
witness_checkorder() at 0xffffffff8054c361 = witness_checkorder+0x821
_mtx_lock_spin_flags() at 0xffffffff804e5f1d = _mtx_lock_spin_flags+0x10d
uart_cnputc() at 0xffffffff803c8e6c = uart_cnputc+0x3c
cnputc() at 0xffffffff804b0acd = cnputc+0x6d
cnputs() at 0xffffffff804b0ebb = cnputs+0x4b
putchar() at 0xffffffff8053887f = putchar+0x24f
kvprintf() at 0xffffffff80536b1b = kvprintf+0x9b
vprintf() at 0xffffffff80538225 = vprintf+0x85
printf() at 0xffffffff805382f7 = printf+0x67
witness_checkorder() at 0xffffffff8054c361 = witness_checkorder+0x821
_mtx_lock_spin_flags() at 0xffffffff804e5f1d = _mtx_lock_spin_flags+0x10d
cpu_new_callout() at 0xffffffff806fadbb = cpu_new_callout+0x9b
callout_cc_add() at 0xffffffff8050e379 = callout_cc_add+0x129
callout_reset_on() at 0xffffffff8050efcc = callout_reset_on+0x24c
sleepq_set_timeout() at 0xffffffff8053ff56 = sleepq_set_timeout+0x106
_sleep() at 0xffffffff805036f2 = _sleep+0x342
pause() at 0xffffffff80503920 = pause+0x90
usb_pause_mtx() at 0xffffffff803ee841 = usb_pause_mtx+0x71
usbd_req_set_address() at 0xffffffff803e9f1c = usbd_req_set_address+0xac
usb_alloc_device() at 0xffffffff803def89 = usb_alloc_device+0x3d9
uhub_explore() at 0xffffffff803e5b63 = uhub_explore+0x3a3
usb_bus_explore() at 0xffffffff803d4187 = usb_bus_explore+0xd7
usb_process() at 0xffffffff803e8bc1 = usb_process+0xa1
fork_exit() at 0xffffffff804c801a = fork_exit+0x1aa
fork_trampoline() at 0xffffffff806bf3ee = fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff802cda6d00, rbp = 0 ---
KDB: enter: panic
[ thread pid 15 tid 100031 ]
Stopped at      0xffffffff8053348d = kdb_enter+0x3d:    movq    $0,0x11212d0(%rip)
db> show locks
exclusive sx USB config SX lock (USB config SX lock) r = 0 (0xfffffe0005040070)
locked _at_ /usr/src/sys/dev/usb/usb_device.c:2643
exclusive spin mutex callout (callout) r = 0 (0xffffffff8160b3f8) locked _at_
/usr/src/sys/kern/kern_timeout.c:409
exclusive spin mutex sleepq chain (sleepq chain) r = 0 (0xffffffff81655668)
locked _at_ /usr/src/sys/kern/subr_sleepqueue.c:237


-- 
Andriy Gapon
Received on Mon Jan 23 2012 - 07:22:01 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:23 UTC