Re: Interpreting MCA error output

From: Steve Kargl <sgk_at_troutmask.apl.washington.edu>
Date: Sat, 16 Jul 2011 11:38:17 -0700
On Sat, Jul 16, 2011 at 02:25:20PM -0400, Kim Culhan wrote:
> Noticed the following console message while running make world with
> 9.0-CURRENT on 7-16-11
> 
> Jul 16 11:15:20 delta kernel: MCA: Vendor "GenuineIntel", ID 0x106a5, APIC ID 16
> Jul 16 11:15:20 delta kernel: MCA: CPU 8 COR (1) RD channel ?? memory error
> Jul 16 11:15:20 delta kernel: MCA: Address 0x28f261f80
> Jul 16 11:15:20 delta kernel: MCA: Misc 0x1834958000001385
> Jul 16 12:15:20 delta kernel: MCA: Bank 8, Status 0x8c0000400001009f
> Jul 16 12:15:20 delta kernel: MCA: Global Cap 0x0000000000001c09, Status 0x0000000000000000
> Jul 16 12:15:20 delta kernel: MCA: Vendor "GenuineIntel", ID 0x106a5, APIC ID 16
> Jul 16 12:15:20 delta kernel: MCA: CPU 8 COR (1) RD channel ?? memory error
> Jul 16 12:15:20 delta kernel: MCA: Address 0x28e019f80
> Jul 16 12:15:20 delta kernel: MCA: Misc 0x1834958000000588

Copying the above into a file named zxc and feeding it to mcelog, I see

troutmask:kargl[212] ./mcelog --ascii < zxc
mcelog: Cannot open /dev/mem for DMI decoding: Permission denied
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 8 BANK 8 
MISC 1834958000000588 ADDR 28e019f80 
MCG status:
MCi status:
MCi_MISC register valid
MCi_ADDR register valid
MCA: MEMORY CONTROLLER RD_CHANNELunspecified_ERR
Transaction: Memory read error
Memory read ECC error
Memory corrected error count (CORE_ERR_CNT): 1
Memory transaction Tracker ID (RTId): 88
Memory DIMM ID of error: 0
Memory channel ID of error: 0
Memory ECC syndrome: 18349580
STATUS 8c0000400001009f MCGSTATUS 0
MCGCAP 1c09 APICID 10 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 26
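
The last two lines of the mcelog output can be cross-checked against the kernel message by hand. The following is a minimal Python sketch, assuming the standard Intel CPUID signature layout (stepping in bits 3:0, model in 7:4, family in 11:8, extended model in 19:16, extended family in 27:20). The function name is made up for illustration and is not part of mcelog or FreeBSD.

```python
def decode_cpuid_signature(sig):
    """Split a CPUID leaf 1 EAX signature into (family, model, stepping)."""
    stepping = sig & 0xF
    base_model = (sig >> 4) & 0xF
    base_family = (sig >> 8) & 0xF
    ext_model = (sig >> 16) & 0xF
    ext_family = (sig >> 20) & 0xFF

    # The extended family is only added when the base family is 0xF.
    family = base_family + ext_family if base_family == 0xF else base_family
    # The extended model is only used for base family 0x6 or 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return family, model, stepping


# "ID 0x106a5" from the kernel message.
print(decode_cpuid_signature(0x106A5))  # (6, 26, 5)

# The kernel prints the APIC ID in decimal (16); the same value in hex is 10,
# which is what appears after APICID in the mcelog output.
print(format(16, "x"))  # 10
```

So ID 0x106a5 is family 6, model 26, stepping 5, which agrees with the "Family 6 Model 26" line, and APIC ID 16 and APICID 10 are the same number in decimal and hex.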

Looks like your DIMM 0 had a memory read error that was corrected by ECC.
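
The "corrected" part of that reading can be checked directly from the Status value in the kernel message. Below is a minimal Python sketch, assuming the architectural IA32_MCi_STATUS layout from the Intel SDM (VAL in bit 63, UC in bit 61, MISCV in bit 59, ADDRV in bit 58, corrected error count in bits 52:38, MCA error code in bits 15:0). The function and dictionary names are hypothetical, and the DIMM and channel IDs come from the model-specific Misc register, which this sketch does not decode.

```python
# Memory transaction types for memory controller error codes of the
# form 0000 0000 1MMM CCCC (MMM in bits 6:4, channel in bits 3:0).
MEM_OPS = {
    0b000: "generic undefined request",
    0b001: "memory read",
    0b010: "memory write",
    0b011: "address/command",
    0b100: "memory scrubbing",
}


def decode_mci_status(status):
    """Decode the architectural fields of an IA32_MCi_STATUS value."""
    code = status & 0xFFFF
    fields = {
        "valid": bool((status >> 63) & 1),
        "overflow": bool((status >> 62) & 1),
        "uncorrected": bool((status >> 61) & 1),
        "misc_valid": bool((status >> 59) & 1),
        "addr_valid": bool((status >> 58) & 1),
        "corrected_count": (status >> 38) & 0x7FFF,
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_code": code,
    }
    # Memory controller errors have bit 7 set and bits 15:8 clear.
    if code & 0xFF80 == 0x0080:
        fields["mem_op"] = MEM_OPS.get((code >> 4) & 0x7, "reserved")
        channel = code & 0xF
        # A channel field of 1111 means "channel not specified".
        fields["channel"] = None if channel == 0xF else channel
    return fields


# "Status 0x8c0000400001009f" from the Bank 8 line.
for name, value in decode_mci_status(0x8C0000400001009F).items():
    print(f"{name}: {value}")
```

For this value the UC bit is clear and the corrected count is 1, which matches "COR (1)" in the kernel message. The error code 0x009f decodes to a memory read with an unspecified channel, matching "RD channel ??" and mcelog's RD_CHANNELunspecified_ERR.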
Received on Sat Jul 16 2011 - 16:38:17 UTC