Re: [regression] unable to boot: no GEOM devices found.

From: John Baldwin <jhb_at_freebsd.org>
Date: Wed, 28 Mar 2012 14:37:35 -0400

On Monday, May 09, 2011 2:24:37 pm David Naylor wrote:
> On Friday 15 April 2011 23:29:55 David Naylor wrote:
> > On Friday 15 April 2011 18:28:06 John Baldwin wrote:
> > > On Wednesday, April 13, 2011 1:07:06 pm David Naylor wrote:
> > > > On Tuesday 12 April 2011 22:12:55 Alexander Motin wrote:
> > > > > David Naylor wrote:
> > > > > > On Tuesday 12 April 2011 08:17:51 Alexander Motin wrote:
> > > > > >> David Naylor wrote:
> > > > > >>> I am running -current and since a few days ago (at least
> > > > > >>> 2011/04/11) I am unable to boot.
> > > > > >>> 
> > > > > >>> The boot process stops when it looks to find a bootable device.
> > > > > > >>> The prompt (when pressing '?') does not display any device and
> > > > > > >>> yielding one second (or more) to the kernel (by pressing '.')
> > > > > > >>> does not improve the situation.
> > > > > >>> 
> > > > > >>> A known working date is 2011/02/20.
> > > > > >>> 
> > > > > >>> I am running amd64 on a nVidia MCP51 chipset.
> > > > > >> 
> > > > > >> MCP51... again...
> > > > > 
> > > > > +ata2: reiniting channel ..
> > > > > +ata2: SATA connect time=0ms status=00000113
> > > > > +ata2: reset tp1 mask=01 ostat0=58 ostat1=00
> > > > > +ata2: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
> > > > > +ata2: reset tp2 stat0=50 stat1=00 devices=0x1
> > > > > +ata2: reinit done ..
> > > > > +unknown: FAILURE - ATA_IDENTIFY timed out LBA=0
> > > > > 
> > > > > Since all devices are detected but do not respond to commands, I
> > > > > would suppose that there is something wrong with ATA interrupts.
> > > > > There is a long chain of interrupt problems in this chipset. I have
> > > > > already tried to debug one case where ATA wasn't generating
> > > > > interrupts at all, unfortunately without success: requests were
> > > > > executing but not generating interrupts, and it didn't look like an
> > > > > ATA driver problem.
> > > > > 
> > > > > As for a possible candidate revision triggering your problem,
> > > > > I would look at this message:
> > > > > +pcib0: Enabling MSI window for HyperTransport slave at pci0:0:9:0
> > > > > 
> > > > > At least it is recent (SVN revs 219737,219740 on 2011-03-18 by jhb)
> > > > > and it is interrupt related.
> > > > 
> > > > I reverted those two revs and everything works again.
> > > 
> > > Hmm, can you provide a full boot verbose dmesg?  Alternatively, can you
> > > see if the device at pci0:0:9:0 is a PCI-PCI bridge?
> > 
> > I can provide a verbose dmesg if the following is not enough:
> > 
> > none17@pci0:0:9:0:      class=0x050000 card=0x50011458 chip=0x027010de
> > rev=0xa2 hdr=0x00
> >     vendor     = 'NVIDIA Corporation'
> >     device     = 'MCP51 Host Bridge'
> >     class      = memory
> >     subclass   = RAM
> > 
> > I see two PCI-PCI bridges at pci0:0:3:0 and pci0:0:16:0.  I've attached the
> > full `pciconf -lv` output.
> 
> FYI, this issue is still present on current (~24 hours old).  Reverting the  
> above mentioned revisions still fixes the problem.
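
The pciconf line quoted above already answers the bridge question: class=0x050000
decodes to base class 0x05 (memory), subclass 0x00 (RAM), and hdr=0x00 is a
type 0 header, whereas a PCI-PCI bridge reports base class 0x06, subclass 0x04
and a type 1 header.  A minimal sketch of that decoding, assuming the standard
PCI class-code layout (the function names are hypothetical and only
illustrative):

```python
# Illustrative helpers only; the names are hypothetical and not part of any tool.

def decode_pci_class(class_code: int) -> tuple[int, int, int]:
    """Split a 24-bit class value (as printed by pciconf -l) into
    (base class, subclass, programming interface)."""
    base = (class_code >> 16) & 0xFF
    sub = (class_code >> 8) & 0xFF
    progif = class_code & 0xFF
    return base, sub, progif


def is_pci_pci_bridge(class_code: int, hdr_type: int) -> bool:
    """True for base class 0x06 (bridge), subclass 0x04 (PCI-PCI) with a
    type 1 header.  Bit 7 of the header type is the multi-function flag,
    so it is masked off before comparing."""
    base, sub, _ = decode_pci_class(class_code)
    return base == 0x06 and sub == 0x04 and (hdr_type & 0x7F) == 0x01


if __name__ == "__main__":
    # Values taken from the pci0:0:9:0 entry quoted above.
    print(decode_pci_class(0x050000))
    print(is_pci_pci_bridge(0x050000, 0x00))
```

With the quoted values the check returns False, which matches the device
description of a host bridge rather than a PCI-PCI bridge.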

I finally had an idea about a way to solve this (at least when using ACPI) that
doesn't involve a whole bunch of quirks, etc.  Please try
http://www.FreeBSD.org/~jhb/patches/hostb_htmsi.patch

-- 
John Baldwin
Received on Wed Mar 28 2012 - 16:43:08 UTC