Re: [regression] unable to boot: no GEOM devices found.

From: Alexander Motin <mav_at_FreeBSD.org>
Date: Wed, 13 Apr 2011 00:08:58 +0300

YongHyeon PYUN wrote:
> On Tue, Apr 12, 2011 at 11:12:55PM +0300, Alexander Motin wrote:
>> David Naylor wrote:
>>> On Tuesday 12 April 2011 08:17:51 Alexander Motin wrote:
>>>> David Naylor wrote:
>>>>> I am running -current and since a few days ago (at least 2011/04/11) I am
>>>>> unable to boot.
>>>>>
>>>>> The boot process stops when it looks to find a bootable device.  The
>>>>> prompt (when pressing '?') does not display any device and yielding one
>>>>> second (or more) to the kernel (by pressing '.') does not improve the
>>>>> situation.
>>>>>
>>>>> A known working date is 2011/02/20.
>>>>>
>>>>> I am running amd64 on a nVidia MCP51 chipset.
>>>> MCP51... again...
>>>>
>>>>> I am willing to help any way I can.
>>>> You could start from capturing and showing verbose dmesg. Full or at
>>>> least in parts related to disks.
>>> I captured the dmesg output for both the old (working) kernel and the new 
>>> (bad) kernel.  See attached for the difference between the two.  If you need 
>>> the full dmesg please let me know.  
>>>
>>> One thing I found is that the old kernel would not boot if I simply rebooted 
>>> from the bad kernel.  I had to do a hard power off before the old kernel would 
>>> work again.  Is some device state surviving between reboots?  
>> +ata2: reiniting channel ..
>> +ata2: SATA connect time=0ms status=00000113
>> +ata2: reset tp1 mask=01 ostat0=58 ostat1=00
>> +ata2: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
>> +ata2: reset tp2 stat0=50 stat1=00 devices=0x1
>> +ata2: reinit done ..
>> +unknown: FAILURE - ATA_IDENTIFY timed out LBA=0
>>
>> Since all devices are detected but do not respond to commands, I would
>> suppose that there is something wrong with ATA interrupts. There is a
>> long history of interrupt problems with this chipset. I have already tried
>> to debug one case where ATA wasn't generating interrupts at all.
>> Unfortunately, without success -- requests were executing, but not
>> generating interrupts; it didn't look like an ATA driver problem.
>>
>> As for a possible candidate for the revision that triggered your problem, I
>> would look at this message:
>> +pcib0: Enabling MSI window for HyperTransport slave at pci0:0:9:0
>>
>> At least it is recent (SVN revs 219737,219740 on 2011-03-18 by jhb) and
>> it is interrupt related.
> 
> Does the driver disable MSI for MCP51?

ata(4) doesn't use MSI by default, and I doubt this controller supports
it anyway. But if I am not mixing something up, there were very strange
situations with MSI on that chipset, where enabling it on one device
caused interrupt problems on another.

> I think jhb's patch fixed one MSI issue for all MCP chipsets.

I am not saying it is wrong. It could just trigger something.
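
For reference, the register values in the ata2 lines quoted above can be
decoded by hand. The following is only a minimal sketch, assuming the
standard SATA SStatus field layout (DET, SPD, IPM) and the common ATA
status register bits; the function names are hypothetical and not part of
any FreeBSD tool, and only a subset of status bits is listed.

```python
# Illustrative decoder for "status=00000113" and "stat0=0x50" from the log.
# Assumes the standard SATA SStatus and ATA status register layouts.
# Function names are made up for this example.

def decode_sstatus(value):
    """Split a SATA SStatus value into its DET, SPD and IPM fields."""
    return {
        "det": value & 0xF,         # 3 = device present, PHY link established
        "spd": (value >> 4) & 0xF,  # 1 = Gen1 link speed (1.5 Gbit/s)
        "ipm": (value >> 8) & 0xF,  # 1 = interface in active power state
    }

# Subset of ATA status register bits, highest bit first.
ATA_STATUS_BITS = {
    0x80: "BSY",   # device busy
    0x40: "DRDY",  # device ready
    0x20: "DF",    # device fault
    0x10: "DSC",   # seek complete
    0x08: "DRQ",   # data request
    0x01: "ERR",   # error
}

def decode_ata_status(value):
    """Return the names of the status bits set in an ATA status byte."""
    return [name for bit, name in ATA_STATUS_BITS.items() if value & bit]

if __name__ == "__main__":
    print(decode_sstatus(0x00000113))
    print(decode_ata_status(0x50))
```

Read this way, the link is up and the device reports ready and not busy,
which matches the observation that the device is detected even though
ATA_IDENTIFY times out.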

-- 
Alexander Motin
Received on Tue Apr 12 2011 - 19:09:14 UTC
