Re: 12-Current panics on boot (didn't a week ago.)

From: Warner Losh <imp_at_bsdimp.com>
Date: Sat, 24 Mar 2018 08:11:13 -0600

That lock has been there for a long, long time (like 5 or 6 major
releases)... It's surprising that it's causing issues now.

Can you bisect versions to find when this starts happening?
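
A rough sketch of what that bisection could look like, in case it helps.
It assumes the breakage stays broken once introduced, and that you supply
is_bad() yourself (a hypothetical callback: build and boot the kernel at
that revision, return True if it panics). The function name and the
revision numbers in the comment are illustrative only, apart from r331064,
which is the last good kernel from your report.

```python
def bisect_first_bad(last_good, first_known_bad, is_bad):
    """Return the earliest revision in (last_good, first_known_bad] that fails.

    Assumes last_good boots, first_known_bad panics, and every revision
    after the culprit also panics. is_bad(rev) is supplied by the caller.
    """
    if first_known_bad <= last_good:
        raise ValueError("first_known_bad must be newer than last_good")
    good, bad = last_good, first_known_bad
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(mid):
            bad = mid   # culprit is at mid or earlier
        else:
            good = mid  # culprit is after mid
    return bad


# Illustrative use with made-up numbers for the bad end of the range:
# bisect_first_bad(331064, 331300, lambda rev: rev >= 331200)
```

Each round halves the range, so the number of kernels to build is roughly
log2 of the distance between the good and bad revisions.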

Warner

On Fri, Mar 23, 2018 at 9:56 PM, Andrew Reilly <areilly_at_bigpond.net.au>
wrote:

> Hi all,
>
> For reasons that still escape me, I haven't been able to get a kernel dump
> to debug, sorry.
>
> Just thought that I'd generate a fairly low-quality report, to see if
> anyone has some ideas.
>
> The last kernel that I have that booted OK (and I'm now running) is:
> FreeBSD Zen.ac-r.nu 12.0-CURRENT FreeBSD 12.0-CURRENT #1 r331064M: Sat
> Mar 17 07:54:51 AEDT 2018     root@Zen:/usr/obj/usr/src/amd64.amd64/sys/GENERIC
> amd64
>
> The machine is a:
> CPU: AMD Ryzen 7 1700 Eight-Core Processor           (2994.46-MHz K8-class CPU)
>   Origin="AuthenticAMD"  Id=0x800f11  Family=0x17  Model=0x1  Stepping=1
>   Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
>
> Kernels built from head as of a couple of hours ago get through launching
> the other CPUs and then stop somewhere in random, apparently:
>
> SMP: AP CPU #2 Launched!
> Timecounter "TSC-low" frequency 1497223020 Hz quality 1000
> random: entpanic: mtx_lock() of spin mutex (null) @ /usr/src/sys/kern/subr_bus.c:617
> cpuid = 0
> time = 1
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00004507a0
> vpanic() at vpanic+0x18d/frame 0xfffffe0000450800
> doadump () at doadump/frame 0xfffffe0000450880
> __mtx_lock_flags() at __mtx_lock_flags+0x163/frame 0xfffffe00004508d0
> devctl_queue_data_f() at devctl_queue_data_f+0x6a/frame 0xfffffe0000450900
> g_dev_taste() at g_dev_taste+0x370/frame 0xfffffe0000450a10
> g_new_provider_event() at g_new_provider_event+0xfa/frame 0xfffffe0000450a30
> g_run_events() at g_run_events+0x151/frame 0xfffffe0000450a70
> fork_exit() at fork_exit+0x84/frame 0xfffffe0000450ab0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0000450ab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> KDB: enter: panic
> [ thread pid 14 tid 100052 ]
> Stopped at kdb_enter+0x3b: movq    $0,kdb_why
> db> dump
> Cannot dump: no dump device specified.
> db>
>
> Now dumping worked fine the last time the kernel panicked: I have
> dumpdev=AUTO in rc.conf and I have swap on nvd0p3 (first) and
> /dev/zvol/root/swap (second, larger than the first).
>
> Root on the nvd0p2 is ZFS, and there's a four-drive raidZ with user
> directories and what-not on it, and another ZFS on an external USB drive
> that I use for backups, unmounted.
>
> In the new kernels, we clearly aren't even getting as far as finding the
> hubs and controllers, let alone the drives.
>
> I've attached dmesg.boot from the last boot from last week's good kernel.
> (While briefly in yoyo mode I turned the SMT back on, so now there are 16
> cores instead of the eight mentioned in the crash dump.  Didn't help, but
> I haven't turned it back off yet.)
>
> Cheers,
>
> Andrew