mpt errors - UNIT ATTENTION asc:29,0

From: Artem Belevich <artemb_at_gmail.com>
Date: Fri, 7 Aug 2009 11:06:03 -0700
Hi,

I'm running 8.0-BETA2 on an Asus p5BV/SAS with a built-in LSI1068
controller with 8 SATA ports. Six of the ports are hooked up to 1TB WD
Green drives. The drives are used as a single raidz2 ZFS pool:

	NAME        STATE     READ WRITE CKSUM
	z2          ONLINE       0     0     0
	  raidz2    ONLINE       0     0     0
	    da1     ONLINE       0     0     0
	    da0     ONLINE       0     0     0
	    da2     ONLINE       0     0     0
	    da3     ONLINE       0     0     0
	    da4     ONLINE       0     0     0
	    da5     ONLINE       0     0     0

I'm running a simple stress test that copies a 10GB file until it fills
the volume and then runs "zpool scrub" on it.

dd if=/dev/urandom of=/z2/f.0 bs=1m count=10240
for f in {1..350}; do echo $f; cp /z2/f.$((f-1)) /z2/f.$f; done
zpool scrub z2

What concerns me is that I'm periodically getting error messages from
the mpt driver. They usually start a few hours after the start of the
script, and by the end of it they are happening every few minutes,
seemingly at random, on all six drives.

Aug  7 10:25:32 buz kernel: mpt0: mpt_cam_event: 0x16
Aug  7 10:25:32 buz kernel: mpt0: mpt_cam_event: 0x16
Aug  7 10:25:32 buz kernel: (da4:mpt0:0:4:0): READ(10). CDB: 28 0 46 32 97 c0 0 0 80 0
Aug  7 10:25:32 buz kernel: (da4:mpt0:0:4:0): CAM Status: SCSI Status Error
Aug  7 10:25:32 buz kernel: (da4:mpt0:0:4:0): SCSI Status: Check Condition
Aug  7 10:25:32 buz kernel: (da4:mpt0:0:4:0): UNIT ATTENTION asc:29,0
Aug  7 10:25:32 buz kernel: (da4:mpt0:0:4:0): Power on, reset, or bus device reset occurred
Aug  7 10:25:32 buz kernel: (da4:mpt0:0:4:0): Retrying Command (per Sense Data)
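
To check whether these resets really are spread evenly across the
drives, the UNIT ATTENTION lines can be tallied per device. This is only
a minimal sketch: it assumes the kernel messages are saved in a
syslog-style file in the format shown above, and count_unit_attention is
a hypothetical helper name, not an existing tool.

```shell
# Hypothetical helper: count "UNIT ATTENTION" messages per da device.
# $1 is the path to a syslog-style file (the path is an assumption;
# it depends on the local syslog configuration).
count_unit_attention() {
    # Pull the device name (e.g. da4) out of "(da4:mpt0:0:4:0): UNIT ATTENTION ...",
    # then count how many times each device appears.
    sed -n 's/.*(\(da[0-9][0-9]*\):mpt[0-9][0-9]*:[^)]*): UNIT ATTENTION.*/\1/p' "$1" |
        sort | uniq -c | awk '{print $2, $1}'
}
```

Calling it as "count_unit_attention /var/log/messages" (assuming that is
where the kernel messages are logged) prints one "device count" pair per
line. Roughly equal counts on all six drives would point away from a
single bad disk or cable.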

The ZFS scrub has not reported any issues so far: no checksum or
read/write errors. WD's hard drive diagnostic tools didn't find any
issues with the drives either.

Could somebody shed some light on why such errors would happen? Is it
some sort of hardware issue? A driver bug? A compatibility issue
between the controller and the drives? A system configuration issue
(some sysctl/tunable that needs tweaking, perhaps)?

I'd appreciate any hints on what could be going on and what should be
done about it.

Thanks,
--Artem
Received on Fri Aug 07 2009 - 16:39:11 UTC
