RE: strange kernel crash

From: Andrew Duane <aduane_at_juniper.net>
Date: Wed, 11 Nov 2015 12:46:05 +0000
> -----Original Message-----
> From: owner-freebsd-hackers_at_freebsd.org [mailto:owner-freebsd-hackers_at_freebsd.org] On Behalf Of Andriy Gapon
> Sent: Wednesday, November 11, 2015 3:02 AM
> To: John Baldwin <jhb_at_FreeBSD.org>
> Cc: Hans Petter Selasky <hps_at_selasky.org>; FreeBSD Hackers <freebsd-hackers_at_FreeBSD.org>; freebsd-current_at_FreeBSD.org
> Subject: Re: strange kernel crash
> 
> On 10/11/2015 20:42, John Baldwin wrote:
> > On Tuesday, November 10, 2015 10:48:08 AM Andriy Gapon wrote:
> >> On 09/11/2015 22:16, John Baldwin wrote:
> >>> On Friday, November 06, 2015 07:02:59 PM Hans Petter Selasky wrote:
> >>>> On 11/06/15 12:20, Andriy Gapon wrote:
> >>>>> Now the strange part:
> >>>>>
> >>>>>     0xffffffff80619a18 <+744>:   jne    0xffffffff80619a61 <__mtx_lock_flags+817>
> >>>>>     0xffffffff80619a1a <+746>:   mov    %rbx,(%rsp)
> >>>>> => 0xffffffff80619a1e <+750>:   movq   $0x0,0x18(%rsp)
> >>>>>     0xffffffff80619a27 <+759>:   movq   $0x0,0x10(%rsp)
> >>>>>     0xffffffff80619a30 <+768>:   movq   $0x0,0x8(%rsp)
> >>>>
> >>>> Were these instructions dumped from RAM or from the kernel ELF file?
> >>>
> >>> Probably not from RAM.  You can use 'info files' in gdb to see what
> >>> is handling the address range in question (core vs executable).  x/i
> >>> in ddb would have been the "real" truth.
> >>
> >> Yes, according to the output of files it looks like gdb would read
> >> that data from the text section of the kernel file.
> >>
> >> How about libkvm?  Would kvm_read read data from the core file?
> >
> > kvm_read should only access the vmcore, yes.
> >
> >> I've written the following small program (cut down dmesg.c, actually):
> >> https://people.freebsd.org/~avg/vmcore_read.c
> >>
> >> (kgdb) disassemble /r
> >> => 0xffffffff80619a1e <+750>:   48 c7 44 24 18 00 00 00 00      movq   $0x0,0x18(%rsp)
> >>
> >> $ vmcore_read -N /boot/kernel.29/kernel -M /var/crash/vmcore.29
> >> 0xffffffff80619a1e 9
> >> 48 c7 44 24 18 00 00 00 00
> >>
> >> Seems like the code is intact.
> >>
> >> P.S.
> >> 1. To correct something I said earlier, the fault is #UD, not #GP.
> >> 2. The only "suspicious" activity at the time of the crash was the execution of a bhyve VM.
> >
> > Was the crash in the guest or the host?  #UD seems even more bizarre.
> 
> It was the host.  This is bizarre indeed.  I can think of only two possibilities:
>   - new CPU erratum
>   - corrupted data somehow getting into the instruction cache, but the correct data being read during the crash dump (i.e. flaky memory)

Or perhaps a missing memory sync operation somewhere....
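
For anyone repeating the check quoted above, the comparison step could look roughly like the sketch below. It assumes the nine bytes at the faulting address have already been read out of the vmcore (for example with kvm_read, as in vmcore_read.c) and only compares them with the encoding kgdb printed from the kernel file. The names expected_insn, first_mismatch and format_hex are made up for illustration and are not taken from vmcore_read.c.

```c
#include <stddef.h>
#include <stdio.h>

/* Encoding of "movq $0x0,0x18(%rsp)" as printed by kgdb's "disassemble /r". */
static const unsigned char expected_insn[] = {
	0x48, 0xc7, 0x44, 0x24, 0x18, 0x00, 0x00, 0x00, 0x00
};

/*
 * Compare bytes obtained from the core with the expected encoding.
 * Returns the offset of the first differing byte, or -1 if all len
 * bytes match.  (Hypothetical helper, for illustration only.)
 */
static long
first_mismatch(const unsigned char *core, const unsigned char *expected,
    size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (core[i] != expected[i])
			return ((long)i);
	return (-1);
}

/*
 * Format len bytes as space-separated two-digit hex into out.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if out is too small.  (Hypothetical helper.)
 */
static int
format_hex(const unsigned char *buf, size_t len, char *out, size_t outsz)
{
	size_t pos = 0;

	if (outsz == 0)
		return (-1);
	out[0] = '\0';
	for (size_t i = 0; i < len; i++) {
		int n = snprintf(out + pos, outsz - pos,
		    i != 0 ? " %02x" : "%02x", buf[i]);
		if (n < 0 || (size_t)n >= outsz - pos)
			return (-1);
		pos += (size_t)n;
	}
	return ((int)pos);
}
```

A match here only shows that the memory image in the dump agrees with the kernel file.  It says nothing about what the CPU actually fetched at the time of the fault, which is why the possibilities above are still open.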

> 
> --
> Andriy Gapon
Received on Wed Nov 11 2015 - 12:01:54 UTC
