Re: data corruption with DISABLE_PSE+DISABLE_PG_G: unrelated

From: Terry Lambert <tlambert2_at_mindspring.com>
Date: Sat, 10 May 2003 12:19:51 -0700

Heiko Schaefer wrote:
> i'm sorry, my mail was probably a bit confusing.
> since it has been pointed out to me, i am running -current kernels with
> 
> options               DISABLE_PSE
> options               DISABLE_PG_G
> 
> enabled.
> 
> what i am asking myself:
> is there any chance that i still get any data corruption because of the
> issues that you write about in some configuration ?!

No.  Not with those flags set.  If you are getting data
corruption with the flags set, then you have some other
problem, most likely hardware.


> because with the 512mb (ddr) ram (which might or might not be defective) i
> get data corruption, while with another 256mb (sdr) ram, i apparently
> don't.

At Whistle, we had a number of issues with matched SIMMs;
the SIMMs from one manufacturer were not as good as another,
supposedly identically rated, part.

There's also the possibility that the RAM speed is too
slow for your FSB speed.  You should be careful here,
because not all motherboards are able to detect this
mismatch.

Another less common problem with RAM is that the chipset
you are using can delay refresh cycles under interrupt
load, if it is programmed incorrectly.  We saw this in the
InterJet II with the Cyrix MediaGX chipset, which has a
nasty DMA transfer bug, unless it's programmed *just so*
by the BIOS.

Doug Ambrisko can tell you more about the RAM problem, and
Julian Elischer is the guy who rewrote the BIOS settings to
program the Cyrix chipset correctly for us.  Both of them
are on these mailing lists, but probably not reading this
thread.

You may also want to try a different or bigger power
supply; I've seen undervoltage and underrated power supplies
cause this problem, as well, when they functioned fine with
less RAM.

It could also be cooling, inside the case.  More RAM = more
heat.

Finally, I've personally seen problems when using SIMMs
with gold connectors instead of tin; yeah, that's
counterintuitive.

This is "looking for zebras" territory, at this point,
though, so take my speculations with a grain of salt.


> so far i had the impression that my test (copying >30gb of checksummed
> data between disks) shows these problems rather reliably.

It's something other than the CPU bug; sorry.

Please ignore the rest of the CPU related information to
which you responded; more memory, etc., won't help you out
of this one...
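
If you want to narrow the hardware problem down, here is a minimal
sketch of a checksum test along the lines of the one you describe.
It is only an illustration: the function names are hypothetical,
and SHA-256 and the 1 MB chunk size are my assumptions, not details
of your actual test.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def copy_matches(src, dst):
    """True if the source file and its copy have identical digests."""
    return file_digest(src) == file_digest(dst)


def repeated_digests(path, runs=3):
    """Digest the same unchanged file several times and return all results."""
    return [file_digest(path) for _ in range(runs)]
```

If repeated_digests() on a file nobody is writing to returns
differing values, the fault is probably somewhere on the read path
(RAM, bus, controller, cabling) rather than in the copy itself.
Keep in mind that reads served from the buffer cache may never
touch the disk, so use files larger than RAM if you want to
exercise the disks as well.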

-- Terry
Received on Sat May 10 2003 - 10:21:22 UTC