Re: nanosleep returning early

From: Bruce Evans <bde_at_zeta.org.au>
Date: Fri, 23 Jul 2004 02:25:16 +1000 (EST)

On Fri, 23 Jul 2004, Bruce Evans wrote:

> On Thu, 22 Jul 2004, John Birrell wrote:
>
> > On Wed, Jul 21, 2004 at 11:01:20PM +1000, Bruce Evans wrote:
> > > ...
> > > The most obvious bug is that nanosleep() uses the low-accuracy interface
> > > getnanouptime().  I can't see why the problem is more obvious with
> > > large HZ or why it affects short sleeps.  From kern_time.c 1.170:
> > > ...
> > > % 	getnanouptime(&ts);
> > >
> > > This may lag the actual (up)time by 1/HZ seconds.
>
> [actually tc_tick/HZ seconds]

> > So, does increasing HZ expose the lower accuracy of getnanouptime() and is
> > that what I'm seeing?
>
> I still don't know the reason.  Unfortunately, I deleted your original
> mail so I can't run the test program in it easily.

Now I think I know the reason.  The interval between clock interrupts
is supposed to be 1/HZ seconds = `tick' microseconds, but it cannot
be set nearly that precisely, and the imprecision is roughly
proportional to HZ.  The i8254 counter has a default nominal frequency
of 1193182 Hz.  Suppose that this is perfectly accurate.  Then to
implement clock interrupts at HZ hz, we want to program the i8254's
maximum count to 1193182/HZ in infinite precision, but counts must be
integers so we must round.  The loss of precision is quite large for
HZ = 1000: 1193182 / 1000.0 = 1193.182; rounding this (to nearest)
gives 1193 and an error of 182 in 1193182 = 152 ppm.  Also, the extra
tick added by tvtohz() is only 1000 us long, so it only has a chance
of about 152/1000 to compensate for the rounding error.  Finally, the
explicit check that the interval has elapsed cannot compensate for
errors larger than tc_tick/HZ because getnanouptime() is fuzzy.

Rounding 1193.182 to nearest happens to round down; thus clock ticks
are shorter than `tick' microseconds, tvtohz()'s value is too small,
and nanosleep() may return too early.  The loop limits the error to
about 1 tick in this case.  The i8254 frequency may be calibrated or
set using sysctl to a more (or less) accurate value.  Then the rounding
may go the other way so that tvtohz()'s value is too large and nanosleep()
may return too late.  The loop cannot limit the error in this case.
The absolute error may be large for long sleeps.  E.g., 152 ppm over
1 day is 13 seconds.

tvtohz()'s value may also be too large because the i8254 frequency
is not known accurately.  Its nominal value is wrong by 10-100 Hz
on my systems.  I minimize errors from this by calibrating all
timecounters using a common clock.

Bruce
Received on Thu Jul 22 2004 - 14:25:21 UTC
