Re: Safe-mode on amd64 broken

From: Alexander Motin <mav_at_FreeBSD.org>
Date: Thu, 30 Sep 2010 09:52:39 +0300
David Naylor wrote:
> On Thursday 30 September 2010 07:23:34 Alexander Motin wrote:
>> David Naylor wrote:
>>> On Wednesday 29 September 2010 18:25:13 Alexander Motin wrote:
>>>> David Naylor wrote:
>>>>> On Wednesday 29 September 2010 16:19:08 Andriy Gapon wrote:
>>>>>> What do you try to actually achieve?
>>>>> I was trying to boot a system and it was panicking due to stray
>>>>> interrupts. It turned out to be caused by HPET.  I found
>>>>> `hint.hpet.0.clock=0' which fixed the problem.
>>>>>
>>>>> This means HPET does not work on any of my machines.  The other one's
>>>>> symptoms are hda losing interrupts after a period of up-time.
>>>> What chipset do you use? Nvidia MCP5x? Could you send me your verbose
>>>> dmesg?
>>> Yes, the one is a MCP51, the other is a ICH8M.
>>>
>>> The desktop is a Gigabyte N650SLI-DS4L.  Its symptom is hda losing
>>> interrupts after a period of time.
>> There are too many reports about different lost interrupt problems on
>> different controllers of MCP5x. I don't know the reason. The attached patch
>> should disable the use of regular HPET interrupts on NVidia chipsets. I hope
>> it will work as a workaround. Maybe it is too aggressive, but better to
>> be safe than sorry. I assume that legacy_route mode may still work fine
>> there. It would be nice to test it.
> 
> I assume you mean hint.hpet.0.legacy_route=1?  I'll give that a try later 
> today on both machines.  

Make sure that both attimer and atrtc are disabled, as mentioned in hpet(4).

> Is your patch the same as hint.hpet.0.clock=0?  

By default, effectively yes. But it still allows legacy_route to be
configured, which is, for example, the default on Linux.

>>> The laptop is a Acer 2920.  Its symptom for a GENERIC is a panic saying
>>> stray interrupt (irq7), with a custom kernel booting stalls.
>> This is strange, as my Acer with the same ICH8M works fine in all
>> possible modes. Also IMHO stray interrupts are not a reason to panic.
>> Could you show what it looks like?
> 
> See http://markmail.org/message/smxnofrdmmkxyvnd for my previous email that 
> includes the backtrace from that panic.  When I booted in i386 safe mode the 
> kernel reported stray interrupts on irq7.  vmstat -i shows irq7 as "stray 
> irq7".  

I am not sure "stray irq7" is related here. More suspicious is the
probable sharing of irq20 between HPET and uhci0, and the fact that the
system panicked while uhci0 was registering its interrupt handler. I
can't be sure which IRQ HPET used there, as it was disabled in the only
dmesg available, but since HPET registers early, I think it grabbed the
first one possible: irq20. On my system HPET also uses irq20, but
uhci0 lives on irq16, so irq20 is not shared.

To collect more data you may try to hint the HPET driver to avoid irq20 by
setting hint.hpet.0.allowed_irqs=0x00e00000 or other values. I've tried
the same recipe to create sharing on my system, but still found no problem.
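
For reference, a minimal sketch of how such a mask can be read, assuming
that bit N of allowed_irqs corresponds to IRQ N (a 32-bit mask). The
function names are hypothetical and only illustrate the arithmetic; they
are not part of any driver or tool.

```python
def irqs_in_mask(mask):
    """Return the IRQ numbers whose bits are set in a 32-bit mask."""
    return [irq for irq in range(32) if mask & (1 << irq)]


def mask_without(mask, irq):
    """Return the mask with the bit for the given IRQ cleared."""
    return mask & ~(1 << irq) & 0xFFFFFFFF


# The value suggested above leaves irq21-23 and excludes irq20.
print(irqs_in_mask(0x00e00000))            # [21, 22, 23]

# Starting from a mask covering irq20-23, dropping irq20 gives that value.
print(hex(mask_without(0x00f00000, 20)))   # 0xe00000
```

Under that assumption, other values can be built the same way to steer
HPET toward or away from a particular IRQ when testing for sharing.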

-- 
Alexander Motin
Received on Thu Sep 30 2010 - 04:52:59 UTC
