On Sat, 28 Feb 2009 14:48:52 -0500
Elliot Schlegelmilch <elliot+list_at_schlegelmilch.org> wrote:

> On Thu, Feb 26, 2009 at 12:22:12PM +0100, Gary Jennejohn wrote:
> > On Wed, 25 Feb 2009 21:56:38 +0200
> > Alexander Motin <mav_at_FreeBSD.org> wrote:
> >
> > > Gary Jennejohn wrote:
> > > > I've been having lots of problems with SATA drives attached to higher
> > > > port numbers, namely ata5 and ata6.
> > > >
> > > > I was installing Linux under qemu today and it had been running for
> > > > several hours and had installed multi-gigabytes of data when qemu
> > > > just stopped.
> > > >
> > > > I noticed that all I/O to the disk had ceased.
> > > >
> > > > Doing "atacontrol reinit" on the port (ata5) resulted in a message
> > > > that the device was not configured, which was patently false since
> > > > qemu had just been merrily writing to it.
> > > >
> > > > This with a kernel made from sources updated today at about 2 PM (GMT+1).
> > > >
> > > > I've also seen problems with a disk attached to ata6. It just sort
> > > > of disappears after a while.
> > > >
> > > > Disks attached to ata2, ata3 and ata4 don't exhibit any problems.
> > >
> > > You have told much and same time gave nothing that can be used.
> > >
> >
> > I was only interested in whether others have seen this problem. I was
> > not looking for a solution.
> >
> > > What controller do you have? What drives on what channels? Is there any
> > > kernel messages about the problem? Have you tried to enable verbose
> > > messages to get additional details?
> > >
> >
> > atapci0@pci0:0:17:0: class=0x010601 card=0xb0021458 chip=0x43911002 rev=0x00 hdr=0x00
> >     vendor = 'ATI Technologies Inc'
> >     class = mass storage
> >     subclass = SATA
> >
> > There were no kernel messages at all, the drive simply hung.
> >
> > I'll do a verbose boot and try to reproduce the disk hang later.
> >
> > > Reinit could return ENXIO if it already was in progress. Disappearing
> > > drives are also can be related to that reinit. Can't it be just a real
> > > hardware problem?
> > >
> >
> > I should have mentioned that the error returned was about some IOCTL.
> > Can't remember which one right now, but the error message did include
> > that the device was not configured.
> >
> > I've also noticed several times in the past when the problem occurred
> > that the BIOS could not enumerate the AHCI disks anymore. I had to
> > do a POR. Seems that the controller was completely hosed such that
> > a simple reset didn't reinitialize it sufficiently for it to work.
> >
> > This morning I booted the box and started a cvsup. My repository is
> > on a ZFS mirror with the disks on ata3 and ata4. The system hung after
> > the data from the server were received, although all the data were
> > successfully written to the disks.
> >
> > I couldn't do anything at all - it looked like the root disk was not
> > responding and the disk light was on solid red. I had to do a hard
> > reset.
> >
> > This is the first time I've seen a problem with this port. The root
> > disk is on ata2.
> >
> > I rebooted and turned off MSI. I'll monitor the situation to see
> > whether that helps.
>
> I don't mean to hijack your thread, but I've had problems with one of
> my SATA disks falling off the bus. I could usually retrieve it with
> an atacontrol detach / retach. However, with a recent kernel all I'm
> getting is this:
>
> ata2: <ATA channel 0> on atapci1
> ata2: AHCI reset...: 2
> ata2: SATA connect time=0ms
> ata2: ready wait time=0ms52 (12272 MB)
> ata2: software reset port 15...
> ata2: ahci_issue_cmd timeout: 100 of 100ms, status=00000001
> ata2: software reset set timeout
> ata2: software reset port 0...
> ata2: ahci_issue_cmd timeout: 100 of 100ms, status=00000001
> ata2: software reset set timeout
> ata2: SIGNATURE: ffffffff
> ata2: Unknown signature, assuming disk device
> ata2: AHCI reset done: devices=00000001
> ata2: [MPSAFE]
> ata2: [ITHREAD]
>
> One for each channel, up to ata7.
>

This is what I see when e.g. ata6 is hosed. Interesting to see that not
just ATI (780G) has problems.

I tried a detach/retach once, but interesting things happened because the
disk was mounted and I was (stupidly) cd'd to it.

I tried mounting the disk sync today, which may have been helpful. Hard
to say. I was able to do an online update of the openSUSE which I have
running out of a qemu image on the affected disk and it succeeded.

> atapci0@pci0:0:31:1: class=0x01018a card=0x948115d9 chip=0x269e8086 rev=0x09 hdr=0x00
>     vendor = 'Intel Corporation'
>     device = '631xESB/632xESB/3100 Ultra ATA Storage Controller'
>     class = mass storage
>     subclass = ATA
>
> The last known kernel which works was Dec 17, but trying to rebuild a
> kernel from that date doesn't see the SATA disks either (as the kernel
> which sees the disks zfs doesn't work.) Or perhaps I'm csup'ing
> incorrectly. I'm still trying to back up far enough so it will work.

---
Gary Jennejohn
Received on Sat Feb 28 2009 - 20:05:49 UTC
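For anyone comparing the two controllers quoted in this thread, the class= and chip= values in the pciconf listings can be split into their parts mechanically. What follows is only a minimal sketch, assuming the usual PCI layout (class = base class, subclass and programming interface, one byte each; chip = device ID in the high 16 bits, vendor ID in the low 16 bits). The function name decode_pciconf_fields and the two constants are made up for illustration and are not part of any FreeBSD tool.

```python
import re

# Hex fields copied from the two pciconf listings quoted in the thread.
ATI_FIELDS = "class=0x010601 card=0xb0021458 chip=0x43911002 rev=0x00 hdr=0x00"
INTEL_FIELDS = "class=0x01018a card=0x948115d9 chip=0x269e8086 rev=0x09 hdr=0x00"


def decode_pciconf_fields(line):
    """Split the class= and chip= values of one pciconf line into sub-fields.

    Assumed layout (hypothetical helper, standard PCI convention):
      class = base class (8 bits) | subclass (8 bits) | prog. interface (8 bits)
      chip  = device ID (high 16 bits) | vendor ID (low 16 bits)
    """
    # Collect every name=0x... pair on the line into a dictionary.
    fields = dict(re.findall(r"(\w+)=0x([0-9a-fA-F]+)", line))
    class_code = int(fields["class"], 16)
    chip = int(fields["chip"], 16)
    return {
        "base_class": (class_code >> 16) & 0xFF,
        "subclass": (class_code >> 8) & 0xFF,
        "prog_if": class_code & 0xFF,
        "device_id": (chip >> 16) & 0xFFFF,
        "vendor_id": chip & 0xFFFF,
    }


if __name__ == "__main__":
    for label, line in (("ATI", ATI_FIELDS), ("Intel", INTEL_FIELDS)):
        decoded = decode_pciconf_fields(line)
        print(label, {name: hex(value) for name, value in decoded.items()})
```

Under that assumption, the ATI entry splits into base class 0x01, subclass 0x06, programming interface 0x01 (mass storage, SATA, AHCI), and the Intel entry into 0x01, 0x01, 0x8a (mass storage, IDE/ATA). The vendor IDs come out as 0x1002 and 0x8086, which agrees with the vendor strings in the listings. The sketch only reads the numbers and says nothing about why the drives hang.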