Hi,

I have a simple patch for amd64 that uses the Machine Check Architecture/Exceptions on most recent x86 CPUs to detect memory errors:

http://people.freebsd.org/~ssouhlal/testing/mce-20070621.diff

It will report uncorrected and corrected errors (the latter, only if sysctl machdep.mce.log_corrected=1). You can ask the kernel to panic if it gets an uncorrected error by setting machdep.mce.panic_on_uc=1. All this can be disabled by setting the machdep.mce.enable tunable to 0.

I'm still not sure if I want this enabled by default, as I don't have any Intel machines to test this on, but I have tested it on Opteron (both corrected and uncorrected errors).

I would appreciate it if someone would try this, especially if you have Intel machines with bad RAM.

Comments are welcome.

-- Suleiman

Received on Mon Jun 25 2007 - 23:55:16 UTC
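
For anyone trying the patch, the three knobs named in the message could be set roughly as sketched below. This is only an illustration. It assumes the patch is applied, that machdep.mce.enable is a boot-time tunable set from /boot/loader.conf (the message calls it a tunable), and that machdep.mce.log_corrected and machdep.mce.panic_on_uc can be written at runtime with sysctl(8). The message does not say whether panic_on_uc is runtime-writable, so treat that part as an assumption.

```
# /boot/loader.conf
# Boot-time tunable: 0 disables the machine check handling added by the patch.
machdep.mce.enable="0"

# At runtime, as root (assumes both are writable sysctls):
# also log corrected errors, not only uncorrected ones
sysctl machdep.mce.log_corrected=1
# panic the kernel when an uncorrected error is reported
sysctl machdep.mce.panic_on_uc=1
```

To keep the runtime settings across reboots, the same name=value pairs (without the leading "sysctl") would normally go in /etc/sysctl.conf.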