On Sun, Sep 13, 2009 at 11:17:41PM +0300, Alexander Motin wrote:
> Kris Kennaway wrote:
> > I am getting timeouts on 8.0b4/HEAD when I do a lot of ZFS I/O to a pool
> > on ad4:
> >
> > atapci0: <VIA 6420 SATA150 controller> port
> > 0xc800-0xc807,0xc400-0xc403,0xc000-0xc007,0xb800-0xb803,0xb400-0xb40f,0xb000-0xb0ff
> > irq 20 at device 15.0 on pci0
> > ata2: <ATA channel 0> on atapci0
> > ata3: <ATA channel 1> on atapci0
> > ata0: <ATA channel 0> on atapci1
> > ata1: <ATA channel 1> on atapci1
> >
> > ad4: 476940MB <WDC WD5000AAKS-00TMA0 12.01C01> at ata2-master SATA150
> > ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout -
> > completing request directly
> > ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout -
> > completing request directly
> > ad4: WARNING - SETFEATURES ENABLE RCACHE taskqueue timeout - completing
> > request directly
> > ad4: WARNING - SETFEATURES ENABLE WCACHE taskqueue timeout - completing
> > request directly
> > ad4: WARNING - SET_MULTI taskqueue timeout - completing request directly
> > ad4: TIMEOUT - WRITE_DMA48 retrying (1 retry left) LBA=344052040
> > ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout -
> > completing request directly
> > ad4: WARNING - SETFEATURES SET TRANSFER MODE taskqueue timeout -
> > completing request directly
> >
> > It becomes stuck in a loop displaying the above and is unable to
> > complete further I/O operations. I wonder if it is just batching up a
> > lot of I/O and then timing out because it is busy, and then not
> > recovering from this state?
> >
> > Any ideas what could be wrong?
>
> There are two different kinds of timeouts we can see:
> - The first one, "ad4: WARNING - ...", is just a queue waiting timeout.
>   It is not the cause of the problem but a consequence of it, and I
>   doubt that it is reasonable to do it at all.
> - The second one, "TIMEOUT - WRITE_DMA48 ...", is a real command
>   execution timeout. I don't know whether this is the result of some
>   improper error recovery, or whether your drive has indeed lost the
>   required servo information near LBA=344052040 and is taking too long
>   to find it. You can try to read that sector and nearby ones with dd.

Could this be related to BIO_FLUSH requests?

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd_at_FreeBSD.org                        http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
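
For reference, the dd check suggested in the quoted message could be sketched as below. This is only an illustration, not something posted in the thread: the helper name dd_read_command, the 512-byte sector size, and the window of 1024 sectors either side of the LBA are all assumptions. The device node /dev/ad4 and the LBA come from the log above. The script only builds the command line; it does not touch the disk.

```python
import shlex

# Assumption: the drive exposes 512-byte logical sectors.
SECTOR_SIZE = 512


def dd_read_command(device, lba, radius=1024, sector_size=SECTOR_SIZE):
    """Build a dd command that reads `radius` sectors either side of `lba`.

    The data is discarded (of=/dev/null); the point is only to see whether
    the reads complete or stall. `radius` is an arbitrary, hypothetical
    window size.
    """
    first = max(lba - radius, 0)          # do not seek before sector 0
    count = lba + radius + 1 - first      # inclusive range of sectors
    args = [
        "dd",
        f"if={device}",
        "of=/dev/null",
        f"bs={sector_size}",              # one sector per block
        f"skip={first}",                  # skip is counted in input blocks
        f"count={count}",
    ]
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    # LBA taken from the "TIMEOUT - WRITE_DMA48" line in the log.
    print(dd_read_command("/dev/ad4", 344052040))
```

If the resulting command stalls or reports read errors around that offset, that would point toward the drive itself rather than the error recovery path; if it reads cleanly, the queue or flush handling becomes the more likely suspect.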