AHCI timeout when using ZFS + AIO + NCQ

From: Vladislav Prodan <universite_at_ukr.net>
Date: Thu, 24 Jan 2013 14:19:38 +0200
('binary' encoding is not supported, stored as-is) I have the server: FreeBSD 9.1-PRERELEASE #0: Wed Jul 25 01:40:56 EEST 2012 Jan 24 12:53:01 vesuvius kernel: atapci0: <JMicron ATA controller> port 0xc040-0xc047,0xc030-0xc033,0xc020-0xc027,0xc010-0xc013,0xc000-0xc00f mem 0xfe210000-0xfe2101ff irq 51 at device 0.0 on pci3 ... Jan 24 12:53:01 vesuvius kernel: ahci0: <ATI IXP700 AHCI SATA controller> port 0xf040-0xf047,0xf030-0xf033,0xf020-0xf027,0xf010-0xf013,0xf000-0xf00f mem 0xfe307000-0xfe3073ff irq 19 at device 17.0 on pci0 Jan 24 12:53:01 vesuvius kernel: ahci0: AHCI v1.20 with 6 6Gbps ports, Port Multiplier supported ... Jan 24 12:53:01 vesuvius kernel: ada2 at ahcich2 bus 0 scbus4 target 0 lun 0 Jan 24 12:53:01 vesuvius kernel: ada2: <ST3000DM001-9YN166 CC4C> ATA-8 SATA 3.x device Jan 24 12:53:01 vesuvius kernel: ada2: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes) Jan 24 12:53:01 vesuvius kernel: ada2: Command Queueing enabled Jan 24 12:53:01 vesuvius kernel: ada2: 2861588MB (5860533168 512 byte sectors: 16H 63S/T 16383C) Jan 24 12:53:01 vesuvius kernel: ada2: Previously was known as ad12 ... I use 4 HDD in RAID10 via ZFS. With a very irregular intervals fall off HDD drives. As a result, the server stops. Jan 24 06:48:06 vesuvius kernel: ahcich2: Timeout on slot 6 port 0 Jan 24 06:48:06 vesuvius kernel: ahcich2: is 00000000 cs 00000000 ss 000000c0 rs 000000c0 tfd 40 serr 00000000 cmd 0000e817 Jan 24 06:48:06 vesuvius kernel: (ada2:ahcich2:0:0:0): READ_FPDMA_QUEUED. ACB: 60 00 4c 4e 1e 40 68 00 00 01 00 00 Jan 24 06:48:06 vesuvius kernel: (ada2:ahcich2:0:0:0): CAM status: Command timeout Jan 24 06:48:06 vesuvius kernel: (ada2:ahcich2:0:0:0): Retrying command Jan 24 06:51:11 vesuvius kernel: ahcich2: AHCI reset: device not ready after 31000ms (tfd = 00000080) Jan 24 06:51:11 vesuvius kernel: ahcich2: Timeout on slot 8 port 0 Jan 24 06:51:11 vesuvius kernel: ahcich2: is 00000000 cs 00000100 ss 00000000 rs 00000100 tfd 00 serr 00000000 cmd 0000e817 Jan 24 06:51:11 vesuvius kernel: (aprobe0:ahcich2:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00 Jan 24 06:51:11 vesuvius kernel: (aprobe0:ahcich2:0:0:0): CAM status: Command timeout Jan 24 06:51:11 vesuvius kernel: (aprobe0:ahcich2:0:0:0): Error 5, Retry was blocked Jan 24 06:51:11 vesuvius kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 4227133, size: 8192 Jan 24 06:51:11 vesuvius kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 4227133, size: 8192 Jan 24 06:51:11 vesuvius kernel: ahcich2: AHCI reset: device not ready after 31000ms (tfd = 00000080) Jan 24 06:51:11 vesuvius kernel: swap_pager: indefinite wait buffer: bufobj: 0, blkno: 4227133, size: 8192 Jan 24 06:51:11 vesuvius kernel: ahcich2: Timeout on slot 8 port 0 Jan 24 06:51:11 vesuvius kernel: ahcich2: is 00000000 cs 00000100 ss 00000000 rs 00000100 tfd 00 serr 00000000 cmd 0000e817 Jan 24 06:51:11 vesuvius kernel: (aprobe0:ahcich2:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00 Jan 24 06:51:11 vesuvius kernel: (aprobe0:ahcich2:0:0:0): CAM status: Command timeout Jan 24 06:51:11 vesuvius kernel: (aprobe0:ahcich2:0:0:0): Error 5, Retry was blocked Jan 24 06:51:11 vesuvius kernel: swap_pager: I/O error - pagein failed; blkno 4227133,size 8192, error 6 Jan 24 06:51:11 vesuvius kernel: (ada2:(pass2:vm_fault: pager read error, pid 1943 (named) Jan 24 06:51:11 vesuvius kernel: ahcich2:0:ahcich2:0:0:0:0): lost device Jan 24 06:51:11 vesuvius kernel: 0): passdevgonecb: devfs entry is gone Jan 24 06:51:11 vesuvius kernel: pid 1943 (named), uid 53: exited on signal 11 ... Helps only restart by pressing Power. Judging by the state of SMART, HDD have no problems. SATA data cable changed. I found a similar problem: http://lists.freebsd.org/pipermail/freebsd-stable/2010-February/055374.html PR: amd64/165547: NVIDIA MCP67 AHCI SATA controller timeout -- Vladislav V. Prodan System & Network Administrator http://support.od.ua +380 67 4584408, +380 99 4060508 VVP88-RIPE Received on Thu Jan 24 2013 - 11:40:36 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:34 UTC