On Sat, Jul 25, 2009 at 10:19:10PM +0300, Alexander Motin wrote:
> Juergen Lock wrote:
> > On Mon, Jul 06, 2009 at 11:16:46PM +0200, Juergen Lock wrote:
> >> I tried this on the box with that optical drive that head no
> >> longer likes (fails to be probed and generates an irq storm, see
> >> http://docs.freebsd.org/cgi/mid.cgi?20090628101656.GA38983
> >> ), and with ahci.ko loaded by loader.conf I got timeouts followed by
> >> a panic:
> >> http://people.freebsd.org/~nox/cam-ata.20090704-panic1.jpg
> >> http://people.freebsd.org/~nox/cam-ata.20090704-panic2.jpg
> >> [...]
> >
> > Ok I managed to dig myself out of this mess by connecting the problem
> > drive to a jmicron pcie card that fell into my hands yesterday; I updated
> > the test install to head from today and started reinstalling ports (bc of
> > the shlib bumps) and testing the new hplip port on head (seems to work
> > no worse than on 7), when suddenly ahci got problems: it printed endless
> > retrying messages with the box' disk access led on solid, causing processes
> > to get stuck. I was still able to switch to a console and enter ddb,
> > but dumping (call doadump) failed and I didn't know what to look for
> > otherwise, so I'm afraid I can't give more info about this hang. :(
> > Anyway, could this be caused by ncq? I have disabled ahci.ko for now,
> > we'll see if this `fixes' it...
>
> Difficult to say without seeing those messages. NCQ errors actually may
> lead to series (up to 32) of retries, as if there were several running
> commands when error happened, all other commands are aborted and retried
> after error recovery process completes.

Ah so the recovery could take several minutes? Maybe I didn't wait
long enough then...

> I haven't experimented with
> really broken drives, but artificially generated NCQ errors were handled
> properly on my tests.
>
OK I guess I should take a photo next time it happens...
Btw, can the max # of `tags' be lowered with ncq too in case a drive
can't handle too many? I think it's `camcontrol tags' for scsi...

> > Here is the dmesg with ahci and the jmicron:
> >
> > atapci0: <JMicron JMB363 SATA300 controller> port 0xbf00-0xbf07,0xbe00-0xbe03,0xbd00-0xbd07,0xbc00-0xbc03,0xbb00-0xbb0f mem 0xfd8fe000-0xfd8fffff irq 17 at device 0.0 on pci2
> > atapci0: Reserved 0x10 bytes for rid 0x20 type 4 at 0xbb00
> > ioapic0: routing intpin 17 (PCI IRQ 17) to lapic 0 vector 49
> > atapci0: [MPSAFE]
> > atapci0: [ITHREAD]
> > atapci0: Reserved 0x2000 bytes for rid 0x24 type 3 at 0xfd8fe000
> > atapci0: AHCI called from vendor specific driver
> > atapci0: AHCI v1.00 controller with 2 3Gbps ports, PM supported
> > atapci0: Caps: 64bit NCQ ALP AL CLO 3Gbps PM PMD SSC PSC 32cmd 2ports
> > ata2: <ATA channel 0> on atapci0
> > ata2: AHCI reset...
> > ata2: hardware reset ...
> > ata2: SATA connect timeout status=00000000
> > ata2: AHCI reset done: phy reset found no device
> > ata2: [MPSAFE]
> > ata2: [ITHREAD]
> > ata3: <ATA channel 1> on atapci0
> > ata3: AHCI reset...
> > ata3: hardware reset ...
> > ata3: SATA connect time=0ms status=00000113
> > ata3: ready wait time=11ms
> > ata3: software reset port 15...
> > ata3: ready wait time=0ms
> > ata3: SIGNATURE: eb140101
> > ata3: AHCI reset done: devices=00010000
> > ata3: [MPSAFE]
> > ata3: [ITHREAD]
> > ata4: <ATA channel 2> on atapci0
> > atapci0: Reserved 0x8 bytes for rid 0x10 type 4 at 0xbf00
> > atapci0: Reserved 0x4 bytes for rid 0x14 type 4 at 0xbe00
> > ata4: reset tp1 mask=03 ostat0=60 ostat1=70
> > ata4: stat0=0x20 err=0x20 lsb=0x20 msb=0x20
> > ata4: stat1=0x30 err=0x30 lsb=0x30 msb=0x30
> > ata4: reset tp2 stat0=20 stat1=30 devices=0x0
> > ata4: [MPSAFE]
> > ata4: [ITHREAD]
>
> As I can see here, your JMicron configured for combined mode, not for
> plain AHCI, so it was handled by ata(4), not by ahci(4).

Ah that can be configured?
Anyway there's only an optical drive on it atm so it's probably
not _that_ important. :)

Thanx,
Juergen

Received on Sat Jul 25 2009 - 18:13:31 UTC