Re: [RFC/RFT] calloutng

From: Ian Lepore <ian_at_FreeBSD.org> Date: Thu, 17 Jan 2013 07:13:47 -0700

On Mon, 2013-01-14 at 11:38 +1100, Bruce Evans wrote:
> On Sun, 13 Jan 2013, Alexander Motin wrote:
> 
> > On 13.01.2013 20:09, Marius Strobl wrote:
> >> On Tue, Jan 08, 2013 at 12:46:57PM +0200, Alexander Motin wrote:
[...]
> >
> > In existing code in HEAD and 9, timecounters are never called with a spin
> > mutex held.  I intentionally tried to avoid that in the existing eventtimers
> > code.
> 
> Er, timecounters are called with a spin mutex held in existing code:
> though it is dangerous to do so, timecounters are called from fast
> interrupt handlers for very timekeeping-critical purposes:
> - to implement the TIOCTIMESTAMP ioctl (except this is broken in
>    -current).  This was a primitive version of pps timestamping.
> - for pps timestamping.  The interrupt handler (which should be a fast
>    interrupt handler to minimize latency) calls pps_capture() which
>    calls tc_get_timecount() and does other "lock-free" accesses to the
>    timecounter state.  This still works in -current (at least there is
>    still code for it).
> 

Unfortunately, calling pps_capture() in the primary interrupt context is
no longer an option with the stock pps driver.  Ever since the ppbus
rewrite, all ppbus children must use threaded handlers.  I tried to fix
that a couple of different ways, and both ended up with crazy-complex code
scattered around the ppbus family just to support the rarely-used pps
capture.  It would have been easier to do if filter and threaded
interrupt handlers had the same function signature.

I ended up writing a separate driver that can be used instead of ppc +
ppbus + pps, since anyone who cares about precise pps capture is
unlikely to be sharing the port with a printer or plip device or some
such.

>    OTOH, all drivers that call pps_capture() from their interrupt handler
>    then immediately call pps_event().  This has always been very broken,
>    and became even more broken with SMPng.  pps_event() does many more
>    timecounter and pps accesses whose locking is unclear at best, and
>    in some configurations it calls hardpps(), which is only locked by
>    Giant, despite comments in kern_ntptime.c still saying that it (and
>    many other functions in kern_ntptime.c) must be called at splclock()
>    or higher.  splclock() is of course now null, but the locking
>    requirements in kern_ntptime.c haven't changed much.  kern_ntptime.c
>    always needed to be locked by the equivalent of a spin mutex, which
>    is stronger locking than was given by splclock().  pps_event() would
>    have to acquire the spin mutex before calling hardpps(), although
>    this is bad for fast interrupt handlers.  The correct implementation
>    is probably to only do the capture part from fast interrupt handlers.
> 

In my rewritten dedicated pps driver I call pps_capture() from the
filter handler and pps_event() from the threaded handler.  I never found
any good documentation on the low-level details of this stuff, and there
isn't enough good example code to work from.  My hazy memory is that I
ended up studying the pps_capture() and pps_event() code enough to infer
that their design intent seems to be to allow you to capture with no
locking and do the event processing later in some sort of deferred or
threaded context.
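
To make the split concrete, here is a rough userspace model of the
two-stage pattern described above.  It is only a sketch: struct
sketch_pps, fake_counter, and the sketch_* functions are hypothetical
stand-ins, not the kernel's struct pps_state, pps_capture(), or
pps_event().  The point it illustrates is that stage 1 only latches a
raw counter value without taking a lock, and stage 2 does the heavier
processing later using the latched value.

```c
#include <stdint.h>

/* Hypothetical stand-in for per-device pps state (not struct pps_state). */
struct sketch_pps {
	uint32_t capcount;	/* raw counter value latched at the edge */
	int	 captured;	/* set by stage 1, cleared by stage 2 */
	uint32_t last_count;	/* counter value of the last processed edge */
	uint64_t events;	/* number of edges fully processed */
};

/* Hypothetical free-running counter standing in for timecounter hardware. */
static uint32_t fake_counter;

/*
 * Stage 1: what a filter (primary context) handler would do.
 * No locks, no sleeping; just latch the counter and mark it pending.
 */
static void
sketch_capture(struct sketch_pps *p)
{
	p->capcount = fake_counter;
	p->captured = 1;
}

/*
 * Stage 2: what a threaded handler would do some time later.
 * It works from the latched value, so the delay between the two
 * stages does not affect the recorded count.
 * Returns 1 if an edge was processed, 0 if nothing was pending.
 */
static int
sketch_event(struct sketch_pps *p)
{
	if (!p->captured)
		return (0);
	p->last_count = p->capcount;
	p->events++;
	p->captured = 0;
	return (1);
}
```

If fake_counter advances between the sketch_capture() and
sketch_event() calls, last_count still holds the value from the
capture, which is the property that matters for timestamp precision.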

> > Callout code, at the same time, can be called in any environment with any
> > locks held.  And new callout code may need to know the precise current time
> > in any of those conditions.  An attempt to use an IPI and wait there can be
> > fatal.
> 
> Callout code can't be called from such a general "any" environment as
> timecounter code.  Not from a fast interrupt handler.  Not from an NMI
> or IPI handler.  I hope.  But timecounter code has a good chance of
> working even for the last 2 environments, due to its design requirement
> of working in the first.
> 
> The spinlock in the i8254 timecounter certainly breaks some cases.
> For example, suppose the lock is held for a timecounter read from
> normal context.  It masks hardware interrupts on the current CPU (except
> in my version).  It doesn't mask NMIs or other traps.  So if the NMI
> or other trap handler does a timecounter hardware call, there is
> deadlock in at least the !SMP case.  In my version, it blocks normal
> interrupts later if they occur, but doesn't block fast interrupts, so
> the pps_capture() call would deadlock if it occurs, like a timecounter
> call from an NMI.  I avoid this by not using pps in any fast interrupt
> handler, and by only using the i8254 timecounter for testing.  I do
> use pps in a (nonstandard) x86 RTC clock interrupt handler.  My clock
> interrupt handlers are all non-fast to avoid this and other locking
> problems.

Hrm, now you've got me a bit worried about capturing in the primary
context.  Not that I have much option: on a 300MHz Geode and similar
wimpy embedded processors there's enough latency on a threaded handler
that the pps signal can be de-asserted by the time the handler runs
(precision timing gear often outputs a very narrow pps pulse; 1-10us
isn't uncommon).
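
For a rough sense of scale, the cycle budget implied by those numbers
can be worked out directly.  This is plain arithmetic, not a
measurement, and sketch_cycles_in_pulse() is a hypothetical helper: a
clock rate in MHz is the number of cycles per microsecond, so the
pulse lasts cpu_mhz * pulse_us cycles.

```c
#include <stdint.h>

/*
 * Number of CPU cycles that elapse while a pulse of the given width
 * is asserted.  MHz is cycles per microsecond, so the product is
 * already in cycles.
 */
static uint64_t
sketch_cycles_in_pulse(uint64_t cpu_mhz, uint64_t pulse_us)
{
	return (cpu_mhz * pulse_us);
}
```

At 300MHz that works out to 300 cycles for a 1us pulse and 3000
cycles for a 10us pulse, which is the window the handler has before
the signal goes away.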

I know I don't have to worry about NMIs on the systems in question, but
I'm not so sure about "other trap handler".  
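
The hazard Bruce describes can be modelled in userspace, as a sketch
under stated assumptions.  The names below (sketch_lock,
sketch_trap_read(), sketch_interrupted_read()) are hypothetical and
this is not the i8254 code.  A try-lock is used in place of a real
spin so the model can report the problem instead of hanging: wherever
the try-lock fails in the "trap" path, a real spin mutex on a single
CPU would spin forever, because the holder it is waiting for is the
very code it interrupted.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock guarding the counter hardware. */
static atomic_flag sketch_lock = ATOMIC_FLAG_INIT;

/* Returns true if the lock was free and is now held by the caller. */
static bool
sketch_try_lock(void)
{
	return (!atomic_flag_test_and_set(&sketch_lock));
}

static void
sketch_unlock(void)
{
	atomic_flag_clear(&sketch_lock);
}

/*
 * Counter read as done from an NMI or other trap handler.
 * Returns false where a real spin lock would deadlock.
 */
static bool
sketch_trap_read(unsigned *out)
{
	if (!sketch_try_lock())
		return (false);
	*out = 1234;		/* placeholder for the hardware read */
	sketch_unlock();
	return (true);
}

/*
 * Counter read from normal context that gets "interrupted" by the
 * trap handler while the lock is held.  Returns whether the nested
 * trap-level read could make progress.
 */
static bool
sketch_interrupted_read(void)
{
	unsigned value = 0;
	bool trap_ok;

	if (!sketch_try_lock())
		return (false);
	trap_ok = sketch_trap_read(&value);	/* trap arrives here */
	sketch_unlock();
	return (trap_ok);
}
```

In this model sketch_interrupted_read() reports that the nested read
cannot proceed, while sketch_trap_read() called on its own succeeds,
so the problem only shows up when the trap lands inside the locked
region.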

> [...]
> 
> Bruce

-- Ian