Re: MCA unrecoverable machine check exception on APU2

From: Dimitry Andric <dim_at_FreeBSD.org>
Date: Sun, 1 May 2016 16:36:09 +0200
On 01 May 2016, at 14:35, Shawn Webb <shawn.webb_at_hardenedbsd.org> wrote:
> 
> I'm getting this panic. I'm not sure if it's a problem with the OS or
> the hardware as this is a bit too low-level for me. Here's the dmesg
> output along with the stack trace:
> 
> MCA: Bank 0, Status 0xb400000000000135
> MCA: Global Cap 0x0000000000000106, Status 0x0000000000000004
> MCA: Vendor "AuthenticAMD", ID 0x730f01, APIC ID 3
> MCA: CPU 3 UNCOR DCACHE L1 DRD error
> MCA: Address 0x44e9830
> panic: Unrecoverable machine check exception
> cpuid = 3
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00f53cb390
> vpanic() at vpanic+0x182/frame 0xfffffe00f53cb410
> panic() at panic+0x43/frame 0xfffffe00f53cb470
> mca_intr() at mca_intr+0x6b/frame 0xfffffe00f53cb490
> trap() at trap+0xa8/frame 0xfffffe00f53cb6a0
> calltrap() at calltrap+0x8/frame 0xfffffe00f53cb6a0
> --- trap 0x1c, rip = 0xffffffff80b6ffe1, rsp = 0xfffffe00f53cb770, rbp = 0xfffffe00f53cb7f0 ---
> callout_process() at callout_process+0x131/frame 0xfffffe00f53cb7f0
> handleevents() at handleevents+0x16d/frame 0xfffffe00f53cb830
> timercb() at timercb+0x227/frame 0xfffffe00f53cb880
> lapic_handle_timer() at lapic_handle_timer+0x9f/frame 0xfffffe00f53cb8c0
> Xtimerint() at Xtimerint+0x8c/frame 0xfffffe00f53cb8c0
> --- interrupt, rip = 0xffffffff80b85a39, rsp = 0xfffffe00f53cb990, rbp = 0xfffffe00f53cba70 ---
> sched_idletd() at sched_idletd+0x439/frame 0xfffffe00f53cba70
> fork_exit() at fork_exit+0x84/frame 0xfffffe00f53cbab0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe00f53cbab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> KDB: enter: panic
> [ thread pid 11 tid 100006 ]
> Stopped at      kdb_enter+0x3b: movq    $0,kdb_why
> db>
> 
> This is running HardenedBSD 11-CURRENT/amd64 on the PC-Engines APU2. Is
> this worth filing a bug report upstream with FreeBSD or is this a
> hardware bug?

Hi Shawn,

This particular part seems to indicate a hardware problem:

MCA: Vendor "AuthenticAMD", ID 0x730f01, APIC ID 3
MCA: CPU 3 UNCOR DCACHE L1 DRD error
MCA: Address 0x44e9830
panic: Unrecoverable machine check exception

I'm not an MCE expert, but it looks like core 3 hit an uncorrectable L1
data cache error on a data read (the "DRD" in the message).
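
For anyone who wants to check that reading against the raw register, the
"Bank 0, Status" value from the dmesg output can be decoded by hand. Below
is a rough Python sketch, assuming the architectural x86 MCi_STATUS layout
(flag bits 63..57, error code in the low 16 bits) and the "memory
hierarchy" error code format. The function name and the label strings are
my own, picked to resemble the dmesg wording; this is not the kernel's code.

```python
# Flag bits in the upper part of an x86 MCi_STATUS register.
STATUS_FLAGS = [
    (63, "VAL"),    # register holds a valid error
    (62, "OVER"),   # an earlier error was overwritten
    (61, "UC"),     # error was not corrected by hardware
    (60, "EN"),     # reporting was enabled for this error
    (59, "MISCV"),  # MCi_MISC holds valid data
    (58, "ADDRV"),  # MCi_ADDR holds a valid address
    (57, "PCC"),    # processor context may be corrupt
]

# Fields of a "memory hierarchy" error code: 000F 0001 RRRR TTLL.
# Label strings are illustrative, chosen to resemble the dmesg output.
REQUEST = {0: "ERR", 1: "RD", 2: "WR", 3: "DRD", 4: "DWR",
           5: "IRD", 6: "PREFETCH", 7: "EVICT", 8: "SNOOP"}
CACHE_TYPE = {0: "ICACHE", 1: "DCACHE", 2: "GCACHE"}
LEVEL = {0: "L0", 1: "L1", 2: "L2", 3: "LG"}


def decode_status(status):
    """Return (list of set flag names, cache error summary or None)."""
    flags = [name for bit, name in STATUS_FLAGS if (status >> bit) & 1]
    code = status & 0xFFFF
    summary = None
    # Mask out bit 12 (the F bit) before matching the 0000 0001 prefix.
    if (code & 0xEF00) == 0x0100:
        request = REQUEST.get((code >> 4) & 0xF, "?")
        cache = CACHE_TYPE.get((code >> 2) & 0x3, "?")
        level = LEVEL[code & 0x3]
        summary = f"{cache} {level} {request}"
    return flags, summary


flags, summary = decode_status(0xB400000000000135)
print(flags, summary)
```

With the status from this panic, the UC flag comes out set and the low 16
bits (0x0135) decode to a data read from the L1 data cache, which lines up
with the "UNCOR DCACHE L1 DRD" line above.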

-Dimitry


Received on Sun May 01 2016 - 12:36:26 UTC
