Re: firewire panic

From: Mark Millard <markmi_at_dsl-only.net>
Date: Sat, 12 Nov 2016 01:13:10 -0800
Andriy Gapon avg at FreeBSD.org wrote on Fri Nov 11 12:26:59 UTC 2016 :

> Does anyone still use firewire or hack on code?
> I've recently tried to connect an external firewire HDD enclosure and got this:
> 
> Unread portion of the kernel message buffer:
> lock order reversal:
>  1st 0xfffff8002b0f2f48 sbp (sbp) _at_ /usr/src/sys/kern/kern_mutex.c:158
>  2nd 0xfffff8003f86f460 CAM device lock (CAM device lock) _at_
> /usr/src/sys/cam/scsi/scsi_xpt.c:2323
> stack backtrace:
> #0 0xffffffff8068d220 at witness_debugger+0x70
> #1 0xffffffff8068cd81 at witness_checkorder+0x7a1
> #2 0xffffffff8061bab8 at __mtx_lock_flags+0x98
> #3 0xffffffff802b663d at scsi_scan_lun+0x11d
> #4 0xffffffff802b51f7 at scsi_action+0x67
> #5 0xffffffff802a756a at xpt_action+0x1a
> #6 0xffffffff8047459e at sbp_cam_scan_target+0xce
> #7 0xffffffff8064f856 at softclock_call_cc+0x2d6
> #8 0xffffffff8064fbf7 at softclock+0x47
> #9 0xffffffff80602190 at intr_event_execute_handlers+0xe0
> #10 0xffffffff806029ec at ithread_execute_handlers+0x2c
> #11 0xffffffff8060285b at ithread_loop+0x5b
> #12 0xffffffff805ff72f at fork_exit+0xdf
> #13 0xffffffff8082483e at fork_trampoline+0xe
> lock order reversal:
> panic: mutex sbp not owned at /usr/src/sys/dev/firewire/sbp.c:967
> cpuid = 2
> curthread: 0xfffff8000ada5000
> stack: 0xfffffe0504ded000 - 0xfffffe0504df1000
> stack pointer: 0xfffffe0504df0a00
> KDB: stack backtrace:
> db_trace_self_wrapper() at 0xffffffff80420bbb = db_trace_self_wrapper+0x2b/frame
> 0xfffffe0504df0930
> kdb_backtrace() at 0xffffffff80670359 = kdb_backtrace+0x39/frame 0xfffffe0504df09e0
> vpanic() at 0xffffffff8063986c = vpanic+0x14c/frame 0xfffffe0504df0a20
> panic() at 0xffffffff806395b3 = panic+0x43/frame 0xfffffe0504df0a80
> __mtx_assert() at 0xffffffff8061c40d = __mtx_assert+0xed/frame 0xfffffe0504df0ac0
> sbp_cam_scan_lun() at 0xffffffff80474667 = sbp_cam_scan_lun+0x37/frame
> 0xfffffe0504df0af0
> xpt_done_process() at 0xffffffff802aacfa = xpt_done_process+0x2da/frame
> 0xfffffe0504df0b30
> xpt_done_td() at 0xffffffff802ac2e5 = xpt_done_td+0xd5/frame 0xfffffe0504df0b80
> fork_exit() at 0xffffffff805ff72f = fork_exit+0xdf/frame 0xfffffe0504df0bf0
> fork_trampoline() at 0xffffffff8082483e = fork_trampoline+0xe/frame
> 0xfffffe0504df0bf0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> -- 
> Andriy Gapon

I normally do not run the debug kernel configuration, even for head. But I tried on an old PowerMac where I use FireWire to temporarily mount a drive at times.

I can confirm such things on turning on a FireWire drive with an SSD in it. An example is:

Context:
> # uname -apKU
> FreeBSD FBSDG4S 12.0-CURRENT FreeBSD 12.0-CURRENT #1 r308247M: Fri Nov 11 23:48:53 PST 2016     markmi_at_FreeBSDx64:/usr/obj/powerpcvtscdbg_clang_gcc421_kernel/powerpc.powerpc/usr/src/sys/GENERICvtsc-DBG  powerpc powerpc 1200014 1200014


Got:
> Nov 12 00:23:47 FBSDG4S kernel: lock order reversal:
> Nov 12 00:23:47 FBSDG4S kernel: 1st 0x5d24d10 sbp (sbp) _at_ /usr/src/sys/kern/kern_mutex.c:220
> Nov 12 00:23:47 FBSDG4S kernel: 2nd 0xd31224a4 firewire (firewire) _at_ /usr/src/sys/dev/firewire/firewire.c:302
> Nov 12 00:23:47 FBSDG4S kernel: stack backtrace:
> Nov 12 00:23:47 FBSDG4S kernel: #0 0x51e43c at witness_checkorder+0x96c
> Nov 12 00:23:47 FBSDG4S kernel: #1 0x47e7b4 at __mtx_lock_flags+0xfc
> Nov 12 00:23:47 FBSDG4S kernel: #2 0x1db6a8 at fw_asyreq+0x334
> Nov 12 00:23:47 FBSDG4S kernel: #3 0x1eb318 at sbp_mgm_orb+0x320
> Nov 12 00:23:47 FBSDG4S kernel: #4 0x1eb58c at sbp_login_callout+0x58
> Nov 12 00:23:47 FBSDG4S kernel: #5 0x4bda30 at softclock_call_cc+0x1d0
> Nov 12 00:23:47 FBSDG4S kernel: #6 0x4bdddc at softclock+0xbc
> Nov 12 00:23:47 FBSDG4S kernel: #7 0x45df1c at intr_event_execute_handlers+0x180
> Nov 12 00:23:47 FBSDG4S kernel: #8 0x45fa70 at ithread_loop+0x10c
> Nov 12 00:23:47 FBSDG4S kernel: #9 0x45a530 at fork_exit+0xc0
> Nov 12 00:23:47 FBSDG4S kernel: #10 0x83598c at fork_trampoline+0xc

There was more on the console after this that did not make to the messages file.
As I remember part of it matched your scsi_xpt.c:2323 reference above.



I also got (earlier in the boot):

> Nov 12 00:22:26 FBSDG4S kernel: Trying to mount root from ufs:/dev/ufs/FBSDG4Srootfs [rw,noatime]...
> Nov 12 00:22:26 FBSDG4S kernel: Sleeping on "smu" with the following non-sleepable locks held:
> Nov 12 00:22:26 FBSDG4S kernel: exclusive sleep mutex tod2rl (tod2rl) r = 0 (0xea0d50) locked _at_ /usr/src/sys/kern/subr_rtc.c:135
> Nov 12 00:22:26 FBSDG4S kernel: stack backtrace:
> Nov 12 00:22:26 FBSDG4S kernel: #0 0x51da58 at witness_warn+0x2c0
> Nov 12 00:22:26 FBSDG4S kernel: #1 0x4acaa0 at _sleep+0xc8
> Nov 12 00:22:26 FBSDG4S kernel: #2 0x81d594 at smu_run_cmd+0x160
> Nov 12 00:22:26 FBSDG4S kernel: #3 0x81e610 at smu_gettime+0x60
> Nov 12 00:22:26 FBSDG4S kernel: #4 0x5021a8 at inittodr+0xa8
> Nov 12 00:22:26 FBSDG4S kernel: #5 0x584e7c at vfs_mountroot+0x12e8
> Nov 12 00:22:26 FBSDG4S kernel: #6 0x4305e8 at start_init+0x90
> Nov 12 00:22:26 FBSDG4S kernel: #7 0x45a530 at fork_exit+0xc0
> Nov 12 00:22:26 FBSDG4S kernel: #8 0x83598c at fork_trampoline+0xc

And:

> Nov 12 00:22:26 FBSDG4S kernel: lock order reversal: (Giant after non-sleepable)
> Nov 12 00:22:26 FBSDG4S kernel: 1st 0xea0d50 tod2rl (tod2rl) _at_ /usr/src/sys/kern/subr_rtc.c:135
> Nov 12 00:22:26 FBSDG4S kernel: 2nd 0xdfdf74 Giant (Giant) _at_ /usr/src/sys/kern/kern_synch.c:230

(Traceback jumbled up with other text.)

There is another lor with the initial text all jumbled up with other text, not just the traceback.


It took some time to get these experiments done because I initially got an immediate crash after:

boot kernel.debug

(which should have found and used /boot/kernel.debug/kernel ). Using /boot/loader.conf to pick the kernel crashed before even allowing access to the loader prompt.

So I tried to figure out what was wrong with my debug kernel build and all my alternate kernel tests failed the same way (debug or not).

Turned out nothing was wrong with the kernel builds: renaming the original kernel.debug to kerdbg so the path would be:

/boot/kerdbg/kernel

and then using:

boot kerdbg

worked fine.

Apparently kernel.debug is too many characters for how things work on the PowerMacs and there is no protection/notice for such an issue.


===
Mark Millard
markmi at dsl-only.net
Received on Sat Nov 12 2016 - 08:13:19 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:08 UTC