Re: 6.0-CURRENT SNAP004 hangs on amr

From: John Baldwin <jhb_at_FreeBSD.org>
Date: Mon, 11 Jul 2005 13:27:44 -0400

On Friday 08 July 2005 11:47 am, Mike Tancsa wrote:
> At 10:06 AM 08/07/2005, John Baldwin wrote:
> >On Thursday 07 July 2005 10:09 pm, Mike Tancsa wrote:
> > > At 04:58 PM 07/07/2005, John Baldwin wrote:
> > > >Crud, it's off in the weeds. :(  Can you do a boot -v and get the
> > > > lines after 'pcib3:'?
> > >
> > > Here you go
> >
> >Ok, I see why it is badly confused in the non-APIC and non-ACPI case
> > though I don't know why it is panicking.  (FWIW, it is trying to route all
> > interrupts to IRQ 14 because your $PIR is all busted *sigh*).  BIOS
> > writers suck.
>
> Unfortunately, this is the latest version of the BIOS from Dell.

Windows uses ACPI, so I bet the $PIR stuff isn't QA'd anymore.

> >Anyway, I still need a simple matrix of what works and what doesn't work
> >first (if I got one earlier I lost it):
> >
> >ACPI/APIC - amr0 gets no interrupts, hangs after boot
>
> Yes, it hangs either with amr or perhaps ata.  Yesterday I was trying just
> a netboot and it seemed to work if I pulled the card and did not have the
> ata code in the kernel, although I had not set up the fstab to properly
> work, but the fact that it was complaining about mounting root implies it
> got farther along.  If you feel this is worth checking out, I could pull
> the amr card again, and try to properly netboot a kernel and mount root
> via NFS on 6.x.
>
> On RELENG_5 it sometimes works if I disable ata in the kernel.   I attached
> the boot -v output from RELENG_5.  6.x hangs (also attached).

Hummm, so does it work with an ata(4) disk if you pull the amr card or disable 
the amr driver in your kernel?

> >ACPI/no-APIC - amr0 gets no interrupts, hangs after boot?
>
> Panic.  This case should be in the last email I sent.
> OK set hint.apic.0.disabled=1
> OK load acpi.ko
>
> kernel trap 12 with interrupts disabled
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address   = 0xba9f
> fault code              = supervisor read, page not present
> instruction pointer     = 0x8:0xc00fd141
> stack pointer           = 0x10:0xc0c2094c
> frame pointer           = 0x10:0xc0c209b8
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                          = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = resume, IOPL = 0
> current process         = 0 (swapper)
> [thread pid 0 tid 0 ]
> Stopped at      0xc00fd141:     cmpb    %cs:0xba9f,%bh
> db> trace
> Tracing pid 0 tid 0 td 0xc07d1c60
> kernbase(e0b,c07029d1,c00fc860,c00fc86b,c0c209f8) at 0xc00fd141
> db>

Hmm, so no-APIC always gets this BIOS panic, in both the ACPI and non-ACPI 
cases?  And is that the same on 5.x?

> >no-ACPI/APIC - ???
>
> RELENG_5, all is happy. 6.x hang.

Ok.  Very odd in that the code there is almost identical.  There is one diff 
you can try reverting (use patch -R) and see if it changes anything:

Index: mptable.c
===================================================================
RCS file: /usr/cvs/src/sys/i386/i386/mptable.c,v
retrieving revision 1.235.2.4
retrieving revision 1.241
diff -u -r1.235.2.4 -r1.241
--- mptable.c   25 Mar 2005 21:10:07 -0000      1.235.2.4
+++ mptable.c   14 Apr 2005 17:59:58 -0000      1.241
@@ -353,7 +353,6 @@
                busses[i].bus_type = NOBUS;

        /* Second, we run through adding I/O APIC's and busses. */
-       ioapic_enable_mixed_mode();
        mptable_parse_apics_and_busses();

        /* Third, we run through the table tweaking interrupt sources. */

There are also some changes to the amr(4) driver (minor though) in 6.x that 
you could try reverting perhaps.  ata(4) has had a lot of changes in 6.x.
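
For anyone unsure what reverse-applying a diff does, here is a minimal sketch on throwaway files. The names old.c, new.c, change.diff and target.c are made up for the demo and have nothing to do with the real tree; the only point is that patch -R takes a file in the "after" state and puts back what the diff removed.

```shell
#!/bin/sh
# Scratch demo of reverse-applying a unified diff with patch -R.
# All file names here are hypothetical; nothing touches a real source tree.
set -e
work=$(mktemp -d)
cd "$work"

# "old" still has the call, "new" has it removed (like the mptable.c change).
printf 'first();\nioapic_enable_mixed_mode();\nsecond();\n' > old.c
printf 'first();\nsecond();\n' > new.c

# diff exits 1 when the files differ, so do not let set -e abort here.
diff -u old.c new.c > change.diff || true

# Start from the "new" state and reverse the diff to get the "old" state back.
cp new.c target.c
patch -R target.c < change.diff

# cmp is silent and exits 0 when the files are identical.
cmp old.c target.c && echo "reverted"
```

Against the real sources the idea is the same: save the mptable.c diff above to a file, change to the directory that holds mptable.c in your 6.x tree, and run patch -R with the saved diff on stdin before rebuilding the kernel. Where that tree lives on your machine is for you to fill in.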

-- 
John Baldwin <jhb_at_FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org
Received on Mon Jul 11 2005 - 16:05:21 UTC