Re: Has anyone else seen any form of in memory or on disk corruption?

From: Bakul Shah <bakul_at_bitblocks.com>
Date: Sat, 05 Jul 2008 10:18:36 -0700
On Sat, 05 Jul 2008 11:59:34 EDT gnn_at_freebsd.org  wrote:
> At Fri, 04 Jul 2008 12:10:43 -0700,
> Bakul Shah wrote:
> > 
> > On Fri, 04 Jul 2008 12:58:07 EDT gnn_at_freebsd.org  wrote:
> > > I have hundreds of these files to run this over, and a full check
> > > takes about 3 hours, but I usually see some form of corruption within
> > > the first 20 minutes.
> > > 	...
> > > 4) Corruption is seen only after a reboot; if the machines continue to
> > > run, corruption is never seen again until another reboot.
> > 
> > This sounds like a hardware problem....  Maybe heat-related or due
> > to a marginal power supply?  Try using a beefier supply on one of
> > the systems or removing something to reduce load.  Or increase the
> > load by making disks do lots of seeking while you are running unzip
> > and running other things at the same time.  To isolate heat related
> > problems we used to blow cold air (or hot air) on suspected
> > components and see if the problem goes away or gets worse.
> 
> These machines are in a brand new data center with more than adequate
> cooling and have very beefy power.  

The fact that you see corruption during just the first 20 minutes,
and that you have this problem with a bunch of machines with
the same kind of mobo, made me think of a h/w problem.  A
marginal design may work fine within some temperature range
but not the one of interest to you (when the scratches on
your head get really deep, this is something to look into).
Received on Sat Jul 05 2008 - 15:18:37 UTC
