Since last week (today) current on my Ryzen box is unstable

From: Andrew Reilly <areilly_at_bigpond.net.au>
Date: Sat, 17 Feb 2018 23:16:21 +1100
Hi,

I do a weekly build to track changes, and have been on 12-current since I gave my fileserver this new Ryzen motherboard a few months ago.  I switched to current because there was some badness in 11-stable that I attributed to new-processor twitchiness (it wouldn't reboot, and the temperature sensors weren't working).  A month or so of 12- has been lovely, for the most part.

Since today's rebuild, uptimes have usually been under an hour.  The box will stay up in single-user mode long enough to rebuild world/kernel, but in multi-user mode it panics at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:1592

The backtrace shows that it gets to this panic from a sendfile() syscall.  The line above is in the middle of a big edit that's part of svn revision 329363.  The tripping assertion seems to suggest that m->valid != 0, for whatever that's worth.
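
In case it helps anyone trying to reproduce this, here is a minimal sketch that pushes a file through sendfile() over a local socket pair.  It is not taken from the backtrace: the function name sendfile_roundtrip is made up for illustration, and whether running it against a file on the ZFS pool actually trips the same assertion is an untested assumption.

```python
import os
import socket
import threading


def sendfile_roundtrip(path):
    """Send the file at `path` through sendfile(2) over a local socket pair
    and return the bytes that arrive on the receiving end.

    Hypothetical helper for exercising the sendfile path only.
    """
    tx, rx = socket.socketpair()
    received = bytearray()

    def drain():
        # Read until the sending side is closed, so sendfile never stalls
        # on a full socket buffer.
        while True:
            chunk = rx.recv(65536)
            if not chunk:
                break
            received.extend(chunk)

    reader = threading.Thread(target=drain)
    reader.start()
    try:
        with open(path, "rb") as f:
            size = os.fstat(f.fileno()).st_size
            offset = 0
            while offset < size:
                sent = os.sendfile(tx.fileno(), f.fileno(), offset, size - offset)
                if sent == 0:
                    break
                offset += sent
    finally:
        tx.close()
        reader.join()
        rx.close()
    return bytes(received)
```

Calling sendfile_roundtrip() on a reasonably large file that lives on the affected pool, and comparing the result with a plain read of the same file, would at least show whether the sendfile path alone is enough to bring the box down.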

Anything that I should be trying?

On a side note, the new devmatch workings are giving me 43 boot warnings about "Malformed NOMATCH string: ''?''", and setting devmatch_enable="NO" in /etc/rc.conf doesn't seem to help.  The new matching is also very keen to load cc_vegas.ko, repeatedly.  Here's the output of devmatch -v, in case that helps:

$ devmatch -v
Searching  acpi bus at handle=\_PR_.P008 for pnpinfo _HID=none _UID=0
Searching  acpi bus at handle=\_PR_.P009 for pnpinfo _HID=none _UID=0
Searching  acpi bus at handle=\_PR_.P00A for pnpinfo _HID=none _UID=0
Searching  acpi bus at handle=\_PR_.P00B for pnpinfo _HID=none _UID=0
Searching  acpi bus at handle=\_PR_.P00C for pnpinfo _HID=none _UID=0
Searching  acpi bus at handle=\_PR_.P00D for pnpinfo _HID=none _UID=0
Searching  acpi bus at handle=\_PR_.P00E for pnpinfo _HID=none _UID=0
Searching  acpi bus at handle=\_PR_.P00F for pnpinfo _HID=none _UID=0
Searching  pci bus at slot=0 function=2 dbsf=pci0:0:0:2 handle=\_SB_.PCI0.IOMA for pnpinfo vendor=0x1022 device=0x1451 subvendor=0x1022 subdevice=0x1451 class=0x080600
Searching  pci bus at slot=0 function=0 dbsf=pci0:9:0:0 for pnpinfo vendor=0x8086 device=0x24fb subvendor=0x8086 subdevice=0x2110 class=0x028000
Searching  pci bus at slot=0 function=1 dbsf=pci0:11:0:1 for pnpinfo vendor=0x1002 device=0xaab0 subvendor=0x174b subdevice=0xaab0 class=0x040300
Searching  pci bus at slot=0 function=0 dbsf=pci0:17:0:0 for pnpinfo vendor=0x1022 device=0x145a subvendor=0x1022 subdevice=0x145a class=0x130000
Searching  pci bus at slot=0 function=2 dbsf=pci0:17:0:2 handle=\_SB_.PCI0.GP17.APSP for pnpinfo vendor=0x1022 device=0x1456 subvendor=0x1022 subdevice=0x1456 class=0x108000
cc_vegas.ko
Searching  pci bus at slot=0 function=0 dbsf=pci0:18:0:0 for pnpinfo vendor=0x1022 device=0x1455 subvendor=0x1022 subdevice=0x1455 class=0x130000
Searching  acpi bus at handle=\_SB_.PCI0.SBRG.PIC_ for pnpinfo _HID=PNP0000 _UID=0
Searching  acpi bus at handle=\_SB_.PCI0.SBRG.SPKR for pnpinfo _HID=PNP0800 _UID=0
Searching  acpi bus at handle=\_SB_.GPIO for pnpinfo _HID=AMDI0030 _UID=0
Searching  acpi bus at handle=\_SB_.PTIO for pnpinfo _HID=AMDIF030 _UID=0
Searching  acpi bus at handle=\AOD_ for pnpinfo _HID=PNP0C14 _UID=0

I can't tell if this is related to the zfs problem or not.  As far as I'm aware, cc_vegas.ko was not loaded into the kernel before today.
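
To see which device each module suggestion belongs to, the devmatch -v output can be grouped with a short script.  This is only a sketch: it assumes that a bare line such as cc_vegas.ko refers to the "Searching" line immediately above it, which matches the listing here but is not something I have checked against the devmatch source.  The name modules_by_device is made up for illustration.

```python
import re
from collections import defaultdict

# Matches lines like:
#   Searching  pci bus at slot=0 ... for pnpinfo vendor=0x1022 ...
SEARCH_RE = re.compile(
    r"^Searching\s+(?P<bus>\S+) bus at (?P<loc>.*?) for pnpinfo (?P<pnp>.*)$"
)


def modules_by_device(text):
    """Map each suggested module name to the (bus, location, pnpinfo)
    tuples of the "Searching" lines it directly followed."""
    matches = defaultdict(list)
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = SEARCH_RE.match(line)
        if m:
            current = (m.group("bus"), m.group("loc"), m.group("pnp"))
        elif current is not None:
            # Assumption: any non-"Searching" line is a module suggestion
            # for the most recent device.
            matches[line].append(current)
    return dict(matches)
```

Fed the listing above, this attributes cc_vegas.ko to the pci device at dbsf=pci0:17:0:2 (vendor=0x1022 device=0x1456), which at least narrows down where the odd match is coming from.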

FWIW uname -a says:
FreeBSD Zen.ac-r.nu 12.0-CURRENT FreeBSD 12.0-CURRENT #6 r329450: Sat Feb 17 22:36:19 AEDT 2018     root_at_:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64

I'll attach the dmesg.boot from the boot that I had to do while composing this message...

Cheers,

Andrew



Received on Sat Feb 17 2018 - 11:16:38 UTC
