On Jun 25, 2007, at 11:55 PM, Ed Schouten wrote:

> * Suleiman Souhlal <ssouhlal_at_FreeBSD.org> wrote:
>> Hi,
>>
>> I have a simple patch for amd64 that uses the Machine Check
>> Architecture/Exceptions on most recent x86 CPUs to detect memory
>> errors:
>>
>> http://people.freebsd.org/~ssouhlal/testing/mce-20070621.diff
>>
>> It will report uncorrected and corrected errors (the latter, only if sysctl
>> machdep.mce.log_corrected=1).
>> You can ask the kernel to panic if it gets an uncorrected error by setting
>> machdep.mce.panic_on_uc=1.
>> All this can be disabled by setting the machdep.mce.enable tunable to 0. I'm
>> still not sure if I want this enabled by default, as I don't have any Intel
>> machines to test this on, but I have tested it on Opteron (both corrected
>> and uncorrected errors).
>>
>> I would appreciate it if someone would try this, especially if you have
>> Intel machines with bad RAM.
>>
>> Comments are welcome.
>
> | /*
> |  * Uncorrected MCEs will generate a #MC, while corrected
> |  * don't, so we have to periodically poll for them.
> |  */
>
> What about adding an option to only print uncorrected MCE's? That's the
> most interesting data and we can get that without using a kthread, right?

sysctl machdep.mce.log_corrected=0 machdep.mce.poll_delay=0

will stop reporting the corrected errors and will stop the kthread (but
won't actually kill it (I guess I'll fix that before I commit the patch)).

Thanks,
-- Suleiman

Received on Tue Jun 26 2007 - 05:10:43 UTC
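The quoted comment says corrected errors raise no #MC and therefore have to be found by polling. A minimal userland sketch of that idea follows. It is not taken from the patch: the names mce_classify and mce_poll, the array of simulated bank values, and the log_corrected parameter (loosely modelled on the machdep.mce.log_corrected sysctl) are all assumptions made for illustration. Only the VAL (bit 63) and UC (bit 61) positions in an IA32_MCi_STATUS register are architectural.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Architectural bits of an IA32_MCi_STATUS register. */
#define MC_STATUS_VAL	(1ULL << 63)	/* register holds a valid error */
#define MC_STATUS_UC	(1ULL << 61)	/* error was not corrected */

enum mce_kind { MCE_NONE, MCE_CORRECTED, MCE_UNCORRECTED };

/* Hypothetical helper: classify one bank status value. */
static enum mce_kind
mce_classify(uint64_t status)
{
	if ((status & MC_STATUS_VAL) == 0)
		return (MCE_NONE);
	return ((status & MC_STATUS_UC) ? MCE_UNCORRECTED : MCE_CORRECTED);
}

/*
 * Hypothetical poll pass over simulated bank status values.
 * Uncorrected errors are skipped here because they would be
 * delivered through #MC instead. Returns the number of corrected
 * errors seen; prints them only when log_corrected is non-zero.
 */
static int
mce_poll(const uint64_t *banks, int nbanks, int log_corrected)
{
	int i, found = 0;

	assert(banks != NULL && nbanks >= 0);
	for (i = 0; i < nbanks; i++) {
		if (mce_classify(banks[i]) != MCE_CORRECTED)
			continue;
		found++;
		if (log_corrected)
			printf("MCE: corrected error in bank %d, "
			    "status 0x%016llx\n", i,
			    (unsigned long long)banks[i]);
	}
	return (found);
}
```

A real kernel implementation would read the status registers with rdmsr and clear them after logging; the sketch only shows the classification and the effect of a log_corrected style switch on what gets printed.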