Re: data corruption with current (maybe sis chipset related?)

From: Heiko Schaefer <hschaefer_at_fto.de>
Date: Thu, 8 May 2003 12:36:41 +0200 (CEST)

Hi Poul,

> >> >more numbers: i typically see 1-2 blocks of corrupted data (32kb is the
> >> >size of corrupted data i usually see) on that 60gb disk.
> >>
> >> I just saw this as well in the stdout+stderr from a make universe :-(
> >
> >erm... this sounds exciting to me, but i have no clue what you actually
> >mean. could you enlighten me ? :)
> >
> >as i am currently spending heavy thought on investing more money in
> >hardware to solve my problem, any hints regarding software-reasons would
> >be most appreciated.
>
> I have no clue to software reasons yet, but the fact that I saw something
> similar means that you should not rush out to get new hardware just yet.

sorry to be pushy, but have you found anything - or been able to reproduce
anything since then ? i'm still confused how exactly you determined that
some data on your disk was corrupt.

i'm quite anxious to get this issue solved practically for me, and i'm by
now desperate enough to give linux+loopaes some half-serious thought.
however, i would of course prefer to solve this on the freebsd side once
and for all - assuming it's not only broken hardware on my end.

if there is anything i can do - testingwise - i'd be happy to. if there's
a reasonable chance that no one will find the cause anytime soon (again
assuming it's a software issue), i'd also be willing to buy hardware that
should rule out any hardware causes plausibly in order to pin it down.

more fiddling with copying stuff between my two disks didn't give me any
new insight, but the statistical properties of the data corruption puzzle
me. i almost always get 1-2 broken blocks per disk - almost never no
broken blocks, and i've never seen a larger number of them either.

parallel copying (which i initially did) seems not to be necessary to
reproduce the problem for me. one "cp -Rp" on an otherwise empty box leads
to the same corruption.
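
for anyone who wants to check their own box, a minimal sketch of how the
corrupted blocks could be located after such a copy: walk the source tree,
compare every file with its copy in fixed-size blocks, and list the block
numbers that differ. this is an illustration only, not the procedure
actually used here; the names differing_blocks and compare_trees are
made up for the example, and the 32kb block size is an assumption taken
from the corruption size reported above.

```python
import os

# assumption: block size matches the 32kb corruption size reported above
BLOCK_SIZE = 32 * 1024


def differing_blocks(path_a, path_b, block_size=BLOCK_SIZE):
    """Return the indices of block_size-sized blocks that differ."""
    bad = []
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        index = 0
        while True:
            a = fa.read(block_size)
            b = fb.read(block_size)
            if not a and not b:
                break  # both files exhausted
            if a != b:
                bad.append(index)  # also catches a length mismatch
            index += 1
    return bad


def compare_trees(src_root, dst_root, block_size=BLOCK_SIZE):
    """Compare each regular file under src_root with its copy in dst_root.

    Returns a dict mapping relative path -> list of differing block
    indices, or None if the file is missing from the copy. Files that
    are identical are left out of the result.
    """
    report = {}
    for dirpath, _dirnames, filenames in os.walk(src_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            if os.path.islink(src) or not os.path.isfile(src):
                continue  # skip symlinks and special files
            rel = os.path.relpath(src, src_root)
            dst = os.path.join(dst_root, rel)
            if not os.path.isfile(dst):
                report[rel] = None
                continue
            bad = differing_blocks(src, dst, block_size)
            if bad:
                report[rel] = bad
    return report
```

calling compare_trees with the source and destination directories of the
"cp -Rp" run would give a per-file list of differing blocks. note that a
second read of the same data may be served from the buffer cache, so a
comparison right after the copy does not necessarily show what ended up
on the disk itself.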

the thought that the cvs code might be so broken that this happens on a
healthy machine troubles me greatly... judging by my statistics, it might
be something that could just possibly go unnoticed for some time, even if
it would be happening on every -current box - i imagine.

regards,

Heiko
Received on Thu May 08 2003 - 01:36:57 UTC
