Re: ECC memory driver in FreeBSD 10?

From: O. Hartmann <ohartman_at_zedat.fu-berlin.de>
Date: Mon, 09 Apr 2012 12:04:27 +0200
On 04/08/12 14:53, Miroslav Lachman wrote:
> Nikolay Denev wrote:
>> On Apr 6, 2012, at 2:48 PM, O. Hartmann wrote:
>>
>>> I'm looking for a way to force FreeBSD 10 to maintain/watch ECC errors
>>> reported by UEFI (or BIOS).
>>> Since ECC is said to be essential for server systems both in business
>>> and science and I do not question this, I was wondering if I can not
>>> report ECC errors via a watchdog or UEFI (ACPI?) report to syslog
>>> facility on FreeBSD.
>>> FreeBSD is supposed to be a server operating system, as far as I know,
>>> so I believe there must be something which hasn't revealed itself
>>> to me yet.
> 
>>
>> If the hardware supports it, such errors should be logged as MCEs
>> (Machine Check Exceptions).
>> I can say for sure it works pretty well with Dell servers, as I had 
>> one with failing RAM module, and
>> it reported the corrected ECC errors in dmesg.
> 
> Memory ECC errors are logged to messages and you can decode them with
> sysutils/mcelog. I did it in the past on one of our Sun Fire X2100 M2
> with FreeBSD 8.x.
> 
> Miroslav Lachman

Seems that I have been blessed with non-faulty memory over the past
three or four years. Last time I saw errors was around 2000. All of our
24/7 servers do have ECC RAM.

So your replies all imply that if I log the system's messages via syslog
properly (as we do remotely on a centralized server), ECC errors should
be reported by the FreeBSD kernel in a canonical way, as the UEFI/BIOS
reports them?
Without special drivers/tools, scripts which scan for those errors
should report occurrences?
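
For illustration, a scan of that kind might look like the Python sketch
below. It is only a sketch under assumptions: it assumes the kernel's
machine-check records carry the token "MCA:" in the logged line and that
the log lives in /var/log/messages; both should be checked against real
output on the machine in question. The function names are made up for
this example.

```python
import re

# Assumption: machine-check records in the log contain the token "MCA:".
# Adjust the pattern to whatever your kernel actually writes.
MCA_PATTERN = re.compile(r"\bMCA:")


def find_mca_lines(lines):
    """Return the lines that look like machine-check records."""
    return [line.rstrip("\n") for line in lines if MCA_PATTERN.search(line)]


def scan_log(path="/var/log/messages"):
    """Read a syslog file and return the matching lines (hypothetical helper)."""
    # errors="replace" keeps the scan going if the log holds stray bytes.
    with open(path, errors="replace") as log:
        return find_mca_lines(log)
```

Run from cron on the central log host, printing whatever scan_log()
returns would be enough to get a mail when something turns up. Decoding
the records is a separate step, e.g. with sysutils/mcelog as mentioned
above.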

That my (FreeBSD) boxes haven't shown errors of that kind - Linux
boxes of a colleague did once! - doesn't imply missing capabilities.
This is nice to hear/read.

Thanks a lot,

Oliver


Received on Mon Apr 09 2012 - 08:04:29 UTC
