12-Current panics on boot (didn't a week ago.)

From: Andrew Reilly <areilly_at_bigpond.net.au>
Date: Sat, 24 Mar 2018 14:56:53 +1100
Hi all,

For reasons that still escape me, I haven't been able to get a kernel dump to debug, sorry.

Just thought that I'd generate a fairly low-quality report, to see if anyone has some ideas.

The last kernel that I have that booted OK (and I'm now running) is:
FreeBSD Zen.ac-r.nu 12.0-CURRENT FreeBSD 12.0-CURRENT #1 r331064M: Sat Mar 17 07:54:51 AEDT 2018     root_at_Zen:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64

The machine is a:
CPU: AMD Ryzen 7 1700 Eight-Core Processor           (2994.46-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x800f11  Family=0x17  Model=0x1  Stepping=1
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>

Kernels built from head as of a couple of hours ago get through launching the other CPUs and then stop somewhere in random, apparently:

SMP: AP CPU #2 Launched!
Timecounter "TSC-low" frequency 1497223020 Hz quality 1000
random: entpanic: mtx_lock() of spin mutex (null) _at_ /usr/src/sys/kern/subr_bus.c:617
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00004507a0
vpanic() at vpanic+0x18d/frame 0xfffffe0000450800
doadump () at doadump/frame 0xfffffe0000450880
__mtx_lock_flags() at __mtx_lock_flags+0x163/frame 0xfffffe00004508d0
devctl_queue_data_f() at devctl_queue_data_f+0x6a/frame 0xfffffe0000450900
g_dev_taste() at g_dev_taste+0x370/frame 0xfffffe0000450a10
g_new_provider_event() at g_new_provider_event+0xfa/frame 0xfffffe0000450a30
g_run_events() at g_run_events+0x151/frame 0xfffffe0000450a70
fork_exit() at fork_exit+0x84/frame 0xfffffe0000450ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0000450ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 14 tid 100052 ]
Stopped at kdb_enter+0x3b: movq    $0,kdb_why
db> dump
Cannot dump: no dump device specified.
db> 

Now dumping worked fine the last time the kernel panicked: I have dumpdev=AUTO in rc.conf and I have swap on nvd0p3 (first) and /dev/zvol/root/swap
(second, larger than the first.)
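
For reference, a minimal sketch of the dump configuration described above. Only the dumpdev=AUTO setting and the nvd0p3 swap partition come from this report; the dumpdir value and the loader.conf line are assumptions based on my reading of rc.conf(5) and dumpon(8), and I have not tested the loader variant on this machine. Since the panic happens before rc runs (and possibly before the disk is even tasted), the rc.conf setting may simply not have taken effect yet, and the early dump device may not help either.

```sh
# /etc/rc.conf (what I have now): let rc pick the first suitable
# swap device as the dump device once userland starts.
dumpdev="AUTO"
# Assumed default location where savecore(8) writes recovered dumps.
dumpdir="/var/crash"

# /boot/loader.conf (assumption, untested here): dumpon(8) describes a
# loader variable of the same name for panics that occur before
# userland starts. It would go in loader.conf rather than rc.conf:
# dumpdev="/dev/nvd0p3"
```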

Root on nvd0p2 is ZFS, and there's a four-drive raidZ with user directories and what-not on it, and another ZFS on an external USB drive that I use
for backups, unmounted.

In the new kernels, we clearly aren't even getting as far as finding the hubs and controllers, let alone the drives.

I've attached dmesg.boot from the last boot from last week's good kernel.  (While briefly in yoyo mode I turned the SMT back on, so now there are 16 cores
instead of the eight mentioned in the crash dump.  Didn't help, but I haven't turned it back off yet.)

Cheers,

Andrew


Received on Sat Mar 24 2018 - 04:59:21 UTC