On Tuesday 12 April 2011 23:39:30 Garrett Cooper wrote:
> On Tue, Apr 12, 2011 at 2:08 PM, Alexander Motin <mav_at_freebsd.org> wrote:
> > YongHyeon PYUN wrote:
> >> On Tue, Apr 12, 2011 at 11:12:55PM +0300, Alexander Motin wrote:
> >>> David Naylor wrote:
> >>>> On Tuesday 12 April 2011 08:17:51 Alexander Motin wrote:
> >>>>> David Naylor wrote:
> >>>>>> I am running -current and since a few days ago (at least 2011/04/11)
> >>>>>> I am unable to boot.
> >>>>>>
> >>>>>> The boot process stops when it looks to find a bootable device. The
> >>>>>> prompt (when pressing '?') does not display any device and yielding
> >>>>>> one second (or more) to the kernel (by pressing '.') does not
> >>>>>> improve the situation.
> >>>>>>
> >>>>>> A known working date is 2011/02/20.
> >>>>>>
> >>>>>> I am running amd64 on a nVidia MCP51 chipset.
> >>>>>
> >>>>> MCP51... again...
> >>>>>
> >>>>>> I am willing to help any way I can.
> >>>>>
> >>>>> You could start from capturing and showing verbose dmesg. Full or at
> >>>>> least in parts related to disks.
> >>>>
> >>>> I captured the dmesg output for both the old (working) kernel and the
> >>>> new (bad) kernel. See attached for the difference between the two.
> >>>> If you need the full dmesg please let me know.
> >>>>
> >>>> One thing I found is that the old kernel would not boot if I simply
> >>>> rebooted from the bad kernel. I had to do a hard power off before
> >>>> the old kernel would work again. Is some device state surviving
> >>>> between reboots?
> >>>
> >>> +ata2: reiniting channel ..
> >>> +ata2: SATA connect time=0ms status=00000113
> >>> +ata2: reset tp1 mask=01 ostat0=58 ostat1=00
> >>> +ata2: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
> >>> +ata2: reset tp2 stat0=50 stat1=00 devices=0x1
> >>> +ata2: reinit done ..
> >>> +unknown: FAILURE - ATA_IDENTIFY timed out LBA=0
> >>>
> >>> As soon as all devices detected but not responding to commands, I would
> >>> suppose that there is something wrong with ATA interrupts. There is a
> >>> long chain of interrupt problems in this chipset. I have already tried
> >>> to debug one case where ATA wasn't generating interrupts at all.
> >>> Unfortunately, without success -- requests were executing, but not
> >>> generating interrupts, it wasn't looked like ATA driver problem.
> >>>
> >>> What's about possible candidate to revision triggering your problem, I
> >>> would look on this message:
> >>> +pcib0: Enabling MSI window for HyperTransport slave at pci0:0:9:0
> >>>
> >>> At least it is recent (SVN revs 219737,219740 on 2011-03-18 by jhb) and
> >>> it is interrupt related.
> >>
> >> Does the driver disable MSI for MCP51?
> >
> > ata(4) doesn't uses MSI by default and I doubt this controller supports
> > them any way. But if I am not mixing something, there were very strange
> > situations with MSI on that chipset, when enabling them one one device
> > caused interrupt problems on another.
> >
> >> I think jhb's patch fixed one MSI issue of all MCP chipset.
> >
> > I am not telling it is wrong. It could just trigger something.
>
> Could the OP try disabling MSI[X] to see whether or not the issue
> still occurs then?
> -Garrett

I added:
hw.pci.enable_msi=0
hw.pci.enable_msix=0
to loader.conf but the problem persisted.

@mav: I will revert r219737 and r219740 and try again but this will be in +10 hours...

Thanks