Re: Compaq ProLiant 1600 server freezes when detecting keyboard controller

From: Tillman Hodgson <tillman_at_seekingfire.com>
Date: Sun, 20 Nov 2005 09:32:58 -0600

On Sun, Nov 20, 2005 Craig wrote:
> > I've seen this freeze on a couple of 1600s with 6.0-RELEASE.  Does this
> > system have multiple processors?  If not, try changing the "OS Type"
> > setting in the BIOS to "Other".  That fixed it for me.

It is indeed a single processor system. I used the system utilities to
change the system type to "Other", with no change in behaviour.

I watched the boot closely this time and noticed that it complains about
too many IRQ 0s and aborts the keyboard controller. I don't have the
exact wording for this -- again, I don't have the serial console set up.

I'll try the UnixWare setting next, though it's not an SMP kernel. I had
been using the Linux setting, for no particular reason.

On Sun, Nov 20, 2005 at 04:25:36AM +0000, Bill Paul wrote:
> What you can also try, if the BIOS doesn't support this option, is
> to break to the OK prompt in the boot loader and type:
> 
> OK set hw.pci.enable_io_modes="0"

I tried this next. With this set, it makes it past the keyboard
controller, sc0, sio0, sio1 and vga0. Then the following line appears:

RTC BIOS diagnostic error 29<config_unit,fixed_disk>

Timecounter lines then appear, then IPsec, ipfw2, the delay for SCSI
devices to settle, then:

sym0: unable to abort current chip operation.
sym0: suspicious SCSI data while resetting the BUS.
sym0: dp1, d15-8,dp0,d7-0,rst,req,ack,bsy,sel,atn,msg,c/d,i/o = 0x0, expecting 0x100
sym0: unable to abort current chip operation.
sym1: suspicious SCSI data while resetting the BUS.
sym1: dp1, d15-8,dp0,d7-0,rst,req,ack,bsy,sel,atn,msg,c/d,i/o = 0x0, expecting 0x100

And there it hangs. The Aug 20 kernel doesn't show any SCSI bus errors.

> When the kernel wedges here, it's likely because of a bad interaction
> between the PCI code and the vm86 code. The vm86 code (which lets
> you run 16 bit BIOS code in an emulated environment using a special
> feature of the Pentium) uses physical page 0 to contain the instructions
> that run when making a vm86 bios call. What can happen sometimes
> is that the PCI BIOS leaves one of the PCI devices unconfigured,
> in which case its base address register is set to 0. Our PCI code
> then comes along and enables all of the devices but doesn't necessarily
> update the base address registers on some of them, which has the effect
> of mapping one of the PCI devices at physical address 0.
> 
> This problem remains more or less hidden until the keyboard driver
> code goes to make a vm86 bioscall to access the keyboard. The CPU
> is switched to vm86 mode and tries to jump to the code at page 0,
> but code execution doesn't work because a PCI device has been
> mapped here by mistake. The result is the CPU locks up hard.
> 
> Setting hw.pci.enable_io_modes to 0 prevents the PCI code from
> unconditionally enabling I/O mode and memory mapped mode of all
> devices.

Oh, interesting background, thanks!
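
For anyone reading this later, here is a rough toy model of the failure
mode Bill describes. It is only a sketch, not FreeBSD's PCI code: the
device names, the BAR values and the devices_at_page_zero() helper are
all made up for illustration, and the only thing taken from the
explanation above is that an unconfigured device has a base address
register of 0 and that the vm86 code lives in physical page 0.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # size of physical page 0, which the vm86 code uses


@dataclass
class PciDevice:
    name: str            # hypothetical label, not a real probe result
    bar: int             # base address register; 0 = left unconfigured by the BIOS
    bios_enabled: bool   # whether the BIOS already enabled decoding for the device


def devices_at_page_zero(devices, enable_io_modes):
    """Return the names of devices that would decode addresses inside page 0."""
    hits = []
    for dev in devices:
        # With enable_io_modes on, every device is enabled, configured or not.
        decoding = dev.bios_enabled or enable_io_modes
        if decoding and dev.bar < PAGE_SIZE:
            hits.append(dev.name)
    return hits


def sample_devices():
    # Assumed example data: one device the BIOS set up, one it skipped.
    return [
        PciDevice("configured-dev", bar=0xF0000000, bios_enabled=True),
        PciDevice("unconfigured-dev", bar=0x0, bios_enabled=False),
    ]


if __name__ == "__main__":
    for mode in (True, False):
        print(mode, devices_at_page_zero(sample_devices(), mode))
```

In this model the unconfigured device only lands on page 0 when
everything is enabled unconditionally, which matches why turning the
tunable off gets past the keyboard controller. The same setting can
also go in /boot/loader.conf as hw.pci.enable_io_modes="0" so that it
does not have to be typed at the OK prompt on every boot.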

-T


-- 
"If 'everybody knows' such-and-such, then it ain't so, by at least ten
 thousand to one."
    -- Robert Heinlein
Received on Sun Nov 20 2005 - 14:33:02 UTC