Re: 12-Current panics on boot (didn't a week ago.)

From: Joe Maloney <jmaloney_at_ixsystems.com>
Date: Sat, 31 Mar 2018 12:13:07 -0400
The drm-next-kmod and drm-stable-kmod modules both panic for me.  I will
attach logs when I can.

On Friday, March 30, 2018, Andrew Reilly <areilly_at_bigpond.net.au> wrote:

> Hi Jonathan, all,
>
> I've just compiled and booted a kernel derived from current-GENERIC
> but with nooptions TCP_BLACKBOX, and much to my surprise it boots.
> Possible link to network-related activities is that the next line
> of boot output that was not being displayed during the crash is:
>
> [ath_hal] loaded
>
> That's vaguely network-shaped: could it be an issue?
>
> Please let me know if there's anything else that I could test or
> poke, in order to find the real culprit.
>
> My make.conf says:
>
> KERNCONF=ZEN
> WRKDIRPREFIX=/usr/obj/ports
> MALLOC_PRODUCTION=yes
>
> My /usr/src/sys/amd64/conf/ZEN says:
>
> include GENERIC
> nooptions TCP_BLACKBOX
>
> Uname -a says:
> FreeBSD Zen.ac-r.nu 12.0-CURRENT FreeBSD 12.0-CURRENT #0 r331768M: Sat Mar 31 10:47:52 AEDT 2018     root_at_Zen:/usr/obj/usr/src/amd64.amd64/sys/ZEN amd64
>
> Cheers,
>
> Andrew
>
>
> Here's the top part of the new dmesg.boot, FYI:
> Copyright (c) 1992-2018 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>         The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 12.0-CURRENT #0 r331768M: Sat Mar 31 10:47:52 AEDT 2018
>     root_at_Zen:/usr/obj/usr/src/amd64.amd64/sys/ZEN amd64
> FreeBSD clang version 6.0.0 (tags/RELEASE_600/final 326565) (based on LLVM 6.0.0)
> WARNING: WITNESS option enabled, expect reduced performance.
> VT(vga): resolution 640x480
> CPU: AMD Ryzen 7 1700 Eight-Core Processor           (2994.45-MHz K8-class CPU)
>   Origin="AuthenticAMD"  Id=0x800f11  Family=0x17  Model=0x1  Stepping=1
>   Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
>   Features2=0x7ed8320b<SSE3,PCLMULQDQ,MON,SSSE3,FMA,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
>   AMD Features=0x2e500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM>
>   AMD Features2=0x35c233ff<LAHF,CMP,SVM,ExtAPIC,CR8,ABM,SSE4A,MAS,Prefetch,OSVW,SKINIT,WDT,TCE,Topology,PCXC,PNXC,DBE,PL2I,MWAITX>
>   Structured Extended Features=0x209c01a9<FSGSBASE,BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA>
>   XSAVE Features=0xf<XSAVEOPT,XSAVEC,XINUSE,XSAVES>
>   AMD Extended Feature Extensions ID EBX=0x7<CLZERO,IRPerf,XSaveErPtr>
>   SVM: (disabled in BIOS) NP,NRIP,VClean,AFlush,DAssist,NAsids=32768
>   TSC: P-state invariant, performance statistics
> real memory  = 34359738368 (32768 MB)
> avail memory = 33271214080 (31729 MB)
> Event timer "LAPIC" quality 600
> ACPI APIC Table: <ALASKA A M I >
> FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
> FreeBSD/SMP: 1 package(s) x 2 cache groups x 4 core(s)
> random: unblocking device.
> Firmware Warning (ACPI): Optional FADT field Pm2ControlBlock has valid Length but zero Address: 0x0000000000000000/0x1 (20180313/tbfadt-796)
> ioapic0 <Version 2.1> irqs 0-23 on motherboard
> ioapic1 <Version 2.1> irqs 24-55 on motherboard
> SMP: AP CPU #7 Launched!
> SMP: AP CPU #3 Launched!
> SMP: AP CPU #2 Launched!
> SMP: AP CPU #6 Launched!
> SMP: AP CPU #5 Launched!
> SMP: AP CPU #4 Launched!
> SMP: AP CPU #1 Launched!
> Timecounter "TSC-low" frequency 1497224985 Hz quality 1000
> random: entropy device external interface
> [ath_hal] loaded
> module_register_init: MOD_LOAD (vesa, 0xffffffff8109f600, 0) error 19
> random: registering fast source Intel Secure Key RNG
> random: fast provider: "Intel Secure Key RNG"
> kbd1 at kbdmux0
> netmap: loaded module
> nexus0
> vtvga0: <VT VGA driver> on motherboard
> cryptosoft0: <software crypto> on motherboard
> aesni0: <AES-CBC,AES-XTS,AES-GCM,AES-ICM,SHA1,SHA256> on motherboard
> acpi0: <ALASKA A M I > on motherboard
> acpi0: Power Button (fixed)
> cpu0: <ACPI CPU> on acpi0
> cpu1: <ACPI CPU> on acpi0
> cpu2: <ACPI CPU> on acpi0
> cpu3: <ACPI CPU> on acpi0
> cpu4: <ACPI CPU> on acpi0
> cpu5: <ACPI CPU> on acpi0
> cpu6: <ACPI CPU> on acpi0
> cpu7: <ACPI CPU> on acpi0
> attimer0: <AT timer> port 0x40-0x43 irq 0 on acpi0
> Timecounter "i8254" frequency 1193182 Hz quality 0
> Event timer "i8254" frequency 1193182 Hz quality 100
> atrtc0: <AT realtime clock> port 0x70-0x71 on acpi0
> atrtc0: registered as a time-of-day clock, resolution 1.000000s
> Event timer "RTC" frequency 32768 Hz quality 0
> hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff irq 0,8 on acpi0
> Timecounter "HPET" frequency 14318180 Hz quality 950
> Event timer "HPET" frequency 14318180 Hz quality 350
> Event timer "HPET1" frequency 14318180 Hz quality 350
> Event timer "HPET2" frequency 14318180 Hz quality 350
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
> acpi_timer0: <32-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> pci0: <ACPI PCI bus> on pcib0
> amdsmn0: <AMD Family 17h System Management Network> on hostb0
> amdtemp0: <AMD CPU On-Die Thermal Sensors> on hostb0
>
>
> On Sun, Mar 25, 2018 at 04:35:31AM +0000, Jonathan Looney wrote:
> > For now, you can update through r331485 and then take TCP_BLACKBOX out of
> > your kernel config file. That won’t really “fix” anything, but should at
> > least get you a booting system (assuming the new code from r331347 is
> > really triggering a problem).
> >
> >
> > I’ll take another look to see if I missed something in the commit. But, at
> > the moment, I’m hard-pressed to see how r331347 would cause the problem you
> > describe.
> >
> >
> > Jonathan
> >
> > On Sat, Mar 24, 2018 at 9:17 PM Andrew Reilly <areilly_at_bigpond.net.au> wrote:
> >
> > > OK, I've completed the search: r331346 works, r331347 panics
> > > somewhere in the initialization of random.
> > >
> > > In the 331347 change (Add the "TCP Blackbox Recorder") I can't see
> > > anything obvious to tweak, unfortunately.  It's a fair chunk of new
> > > code but it's all network-stack related, and my kernel is panicking
> > > long before any network activity happens.
> > >
> > > Any suggestions?
> > >
> > > Cheers,
> > >
> > > Andrew
> > >
> > > On Sat, Mar 24, 2018 at 05:23:18PM -0600, Warner Losh wrote:
> > > > Thanks Andrew... I can't recreate this on my VM nor my real hardware.
> > > >
> > > > Warner
> > > >
> > > > On Sat, Mar 24, 2018 at 5:22 PM, Andrew Reilly <areilly_at_bigpond.net.au> wrote:
> > > >
> > > > > So, r331464 crashes in the same place, on my system.  r331064 still boots
> > > > > OK.  I'll keep searching.
> > > > >
> > > > > One week ago there was a change to randomdev to poll for signals every so
> > > > > often, as a defence against very large reads.  That wouldn't have
> > > > > introduced a race somewhere, or left things in an unexpected state,
> > > > > perhaps?  That change (r331070) by cem_at_ is just a few revisions after
> > > > > the one that is working for me.  I'll start looking there...
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Andrew
> > > > >
> > > > > On Sun, Mar 25, 2018 at 07:49:17AM +1100, Andrew Reilly wrote:
> > > > > > Hi Warner,
> > > > > >
> > > > > > > The breakage was in 331470, and at least one version earlier, that I
> > > > > > > updated past when it panicked.
> > > > > > >
> > > > > > > I'm guessing that kdb's inability to dump would be down to it not having
> > > > > > > found any disk devices yet, right?  So yes, bisecting to narrow down the
> > > > > > > issue is probably the best bet.  I'll try your r331464: if that works that
> > > > > > > leaves only four or five revisions.  Of course the breakage could be
> > > > > > > hardware specific.
> > > > > >
> > > > > > Cheers,
> > > > > > --
> > > > > > Andrew
> > > > >
> > >
> > _______________________________________________
> > freebsd-current_at_freebsd.org mailing list
> > https://lists.freebsd.org/mailman/listinfo/freebsd-current
> > To unsubscribe, send any mail to "freebsd-current-unsubscribe_at_freebsd.org"
>
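
For reference, the revision bisection described in the quoted thread
(r331064 boots, r331464 panics, eventually narrowed to r331346 good and
r331347 bad) can be sketched in Python.  This is only an illustration:
first_bad_revision and boots are hypothetical names, and boots stands in
for the manual step of building, installing and booting a kernel at a
given revision.  It also assumes that every revision after the first bad
one fails too.

```python
def first_bad_revision(good, bad, boots):
    """Return the first revision in (good, bad] for which boots(rev) is False.

    Assumes `good` boots, `bad` does not, and that the failure persists
    in every revision after the first bad one.
    """
    while bad - good > 1:
        mid = (good + bad) // 2
        if boots(mid):
            good = mid   # mid still boots: the breakage is later
        else:
            bad = mid    # mid panics: the breakage is at mid or earlier
    return bad


# Hypothetical stand-in for the manual build-and-boot test.  The threshold
# is the result reported in the thread, used here only to exercise the search.
def boots(rev):
    return rev < 331347


print(first_bad_revision(331064, 331464, boots))
```

Each iteration halves the remaining range, so the 400 revisions between
the two starting points need at most nine kernel builds.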


-- 
Joe Maloney
QA Manager / iXsystems
Enterprise Storage & Servers Driven By Open Source
Received on Sat Mar 31 2018 - 14:13:09 UTC
