5.2 SMP data corruption problems...

From: Jaye Mathisen <mrcpu_at_internetcds.com>
Date: Tue, 20 Jan 2004 09:10:30 -0800
5.2-current as of 1/15.  Mobo is a Tyan HESL-T, BIOS rev 1.04, dual 1GHz P3s.
Two 3ware controllers, latest BIOS, 16 drives.

Was seeing data corruption on large copies to the 3ware drives, via FTP/Samba
or even just tar from disk to disk.  Small files never seemed to get
corrupted (I md5-checksummed everything regularly), but files over 4GB seemed to
always get corrupted somewhere, although not at the same spots.
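
For anyone wanting to run the same kind of check, here is a minimal sketch, assuming Python is available on the box. The post only says that md5 checksums were used, not which tool, so the function names (md5_of, differing_offsets) and the 1 MiB chunk size are hypothetical choices for illustration. The first function hashes a file in chunks so that multi-gigabyte files never have to fit in memory; the second reports which blocks differ between a source and its copy, which helps show whether the corruption moves around between runs.

```python
import hashlib

CHUNK = 1024 * 1024  # 1 MiB per read; an arbitrary choice for this sketch


def md5_of(path):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b""
        for block in iter(lambda: f.read(CHUNK), b""):
            digest.update(block)
    return digest.hexdigest()


def differing_offsets(src, dst, block_size=CHUNK):
    """Return the starting byte offsets of blocks where `src` and `dst` differ."""
    offsets = []
    with open(src, "rb") as a, open(dst, "rb") as b:
        offset = 0
        while True:
            block_a = a.read(block_size)
            block_b = b.read(block_size)
            if not block_a and not block_b:
                break  # both files exhausted
            if block_a != block_b:
                offsets.append(offset)
            offset += block_size
    return offsets
```

Comparing md5_of(source) with md5_of(copy) after each large transfer is enough to detect the problem; differing_offsets only matters once a mismatch has been found and you want to know where it is.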

Eventually the box panicked with a lock order reversal, and would not let
me fsck the large partition (900GB); it would keep panicking in pass 2
with another lock order reversal.

I supped to current as of 1/19, tried again, same thing, file corruption,
lots of panics.


Finally, in the midst of just messing with stuff, I built a new kernel
without the SMP/APIC stuff, and it's working fine.

Disk-to-disk copies are fine, no corruption, nothing during uploads, no
panics.  And I can fsck the partition that I couldn't before, and it works fine.

I do not have the kernel dump info; the debugging was being done remotely over
the phone, and there was no way I was going to transcribe it that way.

Anyway, just a heads up for those who may have ServerWorks chipsets and 5.2: there's
possibly something wrong.  The corruption is silent; if I hadn't checked, there'd be
no way to know.
Received on Tue Jan 20 2004 - 08:10:45 UTC
