On Wed, Sep 19, 2018 at 05:02:11PM -0400, Mark Johnston wrote:
> On Wed, Sep 19, 2018 at 01:01:52PM -0700, Steve Kargl wrote:
> > I have the kernel and core file if more information is needed.
> >
> > % cat info.2
> > Dump header from device: /dev/ada0p3
> >   Architecture: amd64
> >   Architecture Version: 2
> >   Dump Length: 2348281856
> >   Blocksize: 512
> >   Compression: none
> >   Dumptime: Wed Sep 19 12:29:59 2018
> >   Hostname: troutmask.apl.washington.edu
> >   Magic: FreeBSD Kernel Dump
> >   Version String: FreeBSD 12.0-ALPHA4 #0 r338505: Thu Sep 6 13:45:34 PDT 2018
> >     kargl_at_troutmask.apl.washington.edu:/usr/obj/usr/src/amd64.amd64/sys/SPEW
> >   Panic String: page fault
> >   Dump Parity: 2676008548
> >   Bounds: 2
> >   Dump Status: good
> >
> > % more core.txt.2
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 1; apic id = 11
> > fault virtual address = 0xffffb8000719a428
>
> This seems to be the result of a bit-flip. cred is 0xffffb8000719a400,
> which is almost but not quite in the direct map. In particular we have:
>
> (kgdb) frame 10
> #10 0xffffffff8083e07d in vm_object_destroy (object=<optimized out>) at /usr/src/sys/vm/vm_object.c:703
> 703 swap_release_by_cred(object->charge, object->cred);
> (kgdb) p object
> $8 = <optimized out>
> (kgdb) p *(vm_object_t)$r13
> $9 = {
> ...
>   cred = 0xffffb8000719a400,
>   charge = 28672,
>   umtx_data = 0x0
> }
> (kgdb) p *(struct ucred *)0xfffff8000719a400
> $10 = {
>   cr_ref = 5737,
>   cr_uid = 1001,
>   cr_ruid = 1001,
>   cr_svuid = 1001,
>   cr_ngroups = 7,
>   cr_rgid = 1001,
>   cr_svgid = 1001,
>   cr_uidinfo = 0xfffff80007285500,
>   cr_ruidinfo = 0xfffff80007285500,
>   cr_prison = 0xffffffff80a9de10 <prison0>,
> ... <more sane-looking ucred fields>
>
> That is, flipping one of the bits in the fault address leads me to a
> valid ucred. This could in principle be the result of a software bug,
> but I'd be more inclined to suspect the hardware.

Mark,

Thanks for looking into the problem. This system has been running
for probably 2 years or so without issues. I guess it's time to
pull out memtest86+ (or similar) to see if hardware is starting
to fail.

-- 
Steve

Received on Wed Sep 19 2018 - 19:11:59 UTC
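
For anyone following along, the single-bit claim in the quoted analysis can be checked with a little arithmetic. What follows is only a sketch in Python, using nothing but the three addresses that appear in the kgdb session and core.txt above. The helper name `differing_bits` is made up for illustration and is not part of any FreeBSD tool.

```python
# Addresses copied verbatim from the quoted kgdb session and core.txt.
BAD_CRED = 0xffffb8000719a400    # object->cred as stored in the vm_object
GOOD_CRED = 0xfffff8000719a400   # address at which kgdb shows a sane ucred
FAULT_ADDR = 0xffffb8000719a428  # "fault virtual address" from the trap


def differing_bits(a, b):
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


bits = differing_bits(BAD_CRED, GOOD_CRED)
print(bits)  # prints [46]

# The faulting access lands a small offset into the bad pointer, which is
# consistent with dereferencing a field of the structure it points to.
print(hex(FAULT_ADDR - BAD_CRED))  # prints 0x28
```

The two pointers differ in exactly one position, bit 46, which matches the description of a single flipped bit. The script says nothing about whether the cause is hardware or software; it only confirms the arithmetic.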