Re: One-shot-oriented event timers management

From: Brandon Gooch <jamesbrandongooch_at_gmail.com>
Date: Wed, 1 Sep 2010 12:53:04 -0500
On Wed, Sep 1, 2010 at 5:44 AM, Alexander Motin <mav_at_freebsd.org> wrote:
> Alexander Motin wrote:
>> Gary Jennejohn wrote:
>>> On Mon, 30 Aug 2010 12:11:48 +0200
>>> OK, this is purely anecdotal, but I'll report it anyway.
>>>
>>> I was running pretty much all day with the patched kernel and things
>>> seemed to be working quite well.
>>>
>>> Then, after about 7 hours, everything just stopped.
>>>
>>> I had gkrellm running and noticed that it updated only when I moved the
>>> mouse.
>>>
>>> This behavior leads me to suspect that the timer interrupts had stopped
>>> working and the mouse interrupts were causing processes to get scheduled.
>>>
>>> Unfortunately, I wasn't able to get a dump and had to hit reset to
>>> recover.
>>>
>>> As I wrote above, this is only anecdotal, but I've never seen anything
>>> like this before applying the patches.
>>
>> One-shot timers have one weak point: if a timer interrupt gets lost for
>> some reason, there will be nobody to reload the timer. Such cases will
>> probably require special attention. The same funny situation with a
>> mouse-driven scheduler also happens if the LAPIC timer dies when a
>> pre-Core-iX CPU goes to the C3 state.
>
> I have reproduced the problem locally. It happens more often when ticks
> are not stopped on idle, like in your original case (or if explicitly
> enabled by kern.eventtimer.idletick sysctl).
>
> I've made some changes to the HPET driver which, I hope, should fix
> the interrupt losses there.
>
> Updated patch: http://people.freebsd.org/~mav/timers_oneshot6.patch
>
> The patch also includes some optimizations to reduce lock contention.
>
> Thanks for testing.

This latest patch causes an interrupt storm with the HPET timer on my
system. The machine took about 8 minutes to boot and bring me to a
login prompt. System interactivity (i.e. keyboard input and console
output) was fine, but when I checked the output of
`systat -vmstat 1`, I saw that the interrupt rate on each HPET entry
was over 120k!

Can I provide any useful detail? Of course, test patches are always welcome :)

-Brandon
Received on Wed Sep 01 2010 - 15:53:06 UTC
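
The weak point of one-shot timers described in the quoted text above (a
lost interrupt leaves nobody to reload the timer) can be illustrated with
a toy simulation. This is only a sketch under stated assumptions, not the
FreeBSD eventtimer code: the function name `simulate`, its parameters, and
the idea of dropping exactly one interrupt are all hypothetical.

```python
def simulate(mode, total_ticks, lost_at):
    """Return the tick numbers whose interrupt handler actually ran.

    mode:    "periodic" (the hardware re-fires on its own) or
             "oneshot" (the handler must re-arm the timer each time).
    lost_at: tick number whose interrupt is dropped (hypothetical fault).
    """
    handled = []
    armed = True  # the timer is programmed for the first tick
    for tick in range(1, total_ticks + 1):
        if not armed:
            break  # nothing is programmed to fire any more
        delivered = tick != lost_at
        if mode == "oneshot":
            armed = False  # a one-shot timer disarms itself when it fires
        if delivered:
            handled.append(tick)
            if mode == "oneshot":
                armed = True  # only the handler reloads the timer
    return handled


periodic = simulate("periodic", 6, lost_at=3)
oneshot = simulate("oneshot", 6, lost_at=3)
print("periodic:", periodic)
print("oneshot: ", oneshot)
```

In this model the periodic timer only misses the dropped tick, while the
one-shot timer goes silent after it. Nothing runs again until some other
event re-arms the timer, which would be consistent with the report that
updates happened only when the mouse was moved.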
