Hi List,

i already brought up my issue with data corruption when i suspected that gbde might be the cause of it. it turns out gbde was not guilty. then just now i thought that M. Warner Losh's mail (subject 'Precaution') could explain what's going wrong, but i still have the problem with a new world and kernel as of today.

i can reproduce the data corruption by doing the following (i have two disks in the box, one is 30GB, the other is 60GB): the 30GB disk is already filled with data, which i then copy (in parallel) into two directories on the 60GB disk. the result is that i (should) have two copies on the 60GB disk of the data that is on the 30GB disk. then i compute the checksums of the duplicated files - and more often than not a few files are corrupted in the copied version.

more numbers: i typically see 1-2 blocks of corrupted data (32KB is the size of corrupted data i usually see) on that 60GB disk. usually the corruptions are aligned within the file, at least to a multiple of 512 bytes. often, the corrupted data consists of lots of 0-bytes, but i also see data that looks random in other places of the corrupted segments of the files.

it seems that not only the content of files gets corrupted; i also sometimes see errors when i fsck that partition (for example: once i saw a file whose size in "ls -l" clearly didn't match its actual content, as seen by "wc").

by now i have ruled out a number of possible reasons:

- i am only using local disks (no networking as i did initially)
- first i used a 512MB ddr memory module, now i use a 256MB sdr one, which should (i believe) have different enough properties to rule out the original memory as the cause of the problem

as i see it, the issue can be hardware-related (mainboard/cpu seem to be the only remaining possibilities) or software-related (maybe the driver for the chipset, in particular the harddrive controller, is suboptimal?), or maybe some other freebsd code that moves data around makes occasional mistakes.
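
The checksum comparison and the "where exactly is the damage" step described above could be scripted roughly as follows. This is only a sketch under assumptions: the post does not say which checksum tool was used, so SHA-256 is an arbitrary choice here, and the function names (`file_digest`, `mismatched_files`, `corrupted_ranges`) are made up for illustration. The copy itself is left to whatever tool is already in use.

```python
import hashlib
import os

SECTOR = 512  # the post reports corruption aligned to multiples of 512 bytes


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def mismatched_files(src_root, copy_root):
    """Yield relative paths whose copy is missing or differs from the source."""
    for dirpath, _dirs, files in os.walk(src_root):
        for name in files:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, src_root)
            dst = os.path.join(copy_root, rel)
            if not os.path.isfile(dst) or file_digest(src) != file_digest(dst):
                yield rel


def corrupted_ranges(src, dst, sector=SECTOR):
    """Return (offset, length, all_zero) for each run of differing sectors.

    all_zero is True when every differing sector of the run is present in
    dst and consists only of 0-bytes.
    """
    ranges = []
    start = None      # offset where the current run of bad sectors began
    zero = True       # whether the current run is entirely zero-filled
    offset = 0
    with open(src, "rb") as a, open(dst, "rb") as b:
        while True:
            sa, sb = a.read(sector), b.read(sector)
            if not sa and not sb:
                break
            if sa != sb:
                if start is None:
                    start, zero = offset, True
                zero = zero and bool(sb) and not any(sb)
            elif start is not None:
                ranges.append((start, offset - start, zero))
                start = None
            offset += max(len(sa), len(sb))
    if start is not None:
        ranges.append((start, offset - start, zero))
    return ranges
```

Running `mismatched_files("/mnt/src", "/mnt/dst/copy1")` (paths hypothetical) and then `corrupted_ranges` on each reported pair would list offset, length and whether the damaged run is all zeros, which makes it easier to see whether the damage always comes in 32KB, sector-aligned chunks.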

the board is an elitegroup k7-s5a lan (with sis 735 chipset), the cpu an amd xp 1800+ (i specifically bought that hardware very recently to run a gbde-based nfs server on it).

does anyone know of any (freebsd-current) issues that might be causing this - or have any idea how i can further rule out anything of this kind?

my best idea at this point is to go out and buy a p4 board and cpu, and i don't really like that. it seems almost futile to go and get another board/cpu of the same type before i have a good idea what is actually wrong :(

regards, thanks for any thoughts,
Heiko

-- 
Free Software. Why put up with inferior code and antisocial corporations?
http://www.gnu.org/philosophy/why-free.html

Received on Tue May 06 2003 - 05:41:41 UTC