Re: One-shot-oriented event timers management

From: Brandon Gooch <jamesbrandongooch_at_gmail.com>
Date: Mon, 30 Aug 2010 00:24:58 -0500

2010/8/29 Alexander Motin <mav_at_freebsd.org>:
> Hi.
>
> I would like to present my new work on the timer management code.
>
> In my previous work I mostly focused on reimplementing existing
> functionality in a better way. The result was not bad, but after
> looking at the prospects of using event timers in one-shot (aperiodic)
> mode I realized that the complexity of that code made it hardly
> possible. So I had to cut it down significantly and rewrite it with a
> new approach, oriented primarily toward using timers in one-shot mode.
> Since some systems have only periodic timers, I have kept that
> functionality, though it is now slightly limited.
>
> The new management code implements two modes of operation: one-shot and
> periodic. The mode used depends on hardware capabilities and can be
> controlled.
>
> In one-shot mode the hardware timer is programmed to generate a single
> interrupt precisely at the time of the next wanted event. This is done
> by comparing the current binuptime with the next scheduled times of
> system events (hard-/stat-/profclock). This approach has several
> benefits: event timer precision is now irrelevant for system
> timekeeping; hard- and statclocks are not aliased even though only one
> timer is used for them; and, most important, it lets us define exactly
> which events we want to handle and when, without strict dependence on
> fixed hz, stathz and profhz periods. Sure, our callout system still
> depends heavily on the hz value, but now at least we can skip interrupts
> when we have no callouts to handle at the time. Later we can go further.
>
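To check my reading of the one-shot logic, here is a rough standalone sketch. It is my own illustration, not code from the patch: the struct, the function names and the use of plain nanoseconds instead of struct bintime are all assumptions.

```c
#include <stdint.h>

/* Hypothetical per-CPU schedule; times are nanoseconds since boot. */
struct next_events {
	uint64_t hard;	/* next hardclock() */
	uint64_t stat;	/* next statclock() */
	uint64_t prof;	/* next profclock(), UINT64_MAX if profiling is off */
};

/* Earliest of the three scheduled event times. */
uint64_t
next_event(const struct next_events *ne)
{
	uint64_t t = ne->hard;

	if (ne->stat < t)
		t = ne->stat;
	if (ne->prof < t)
		t = ne->prof;
	return (t);
}

/*
 * Delay to program into a one-shot timer. An event that is already
 * due yields 0, meaning "fire as soon as possible".
 */
uint64_t
oneshot_delay(const struct next_events *ne, uint64_t now)
{
	uint64_t t = next_event(ne);

	return (t > now ? t - now : 0);
}
```

The point is that the timer's own period never enters the calculation; only the current uptime and the scheduled event times do.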
> Periodic mode now also uses similar principles for scheduling events.
> But a timer running in periodic mode is unable to handle arbitrary
> events, and event timers may not be synchronized to the system
> timecounter and may drift from it, causing jitter effects. So I've used
> the timer events themselves as the time source for scheduling. As a
> result, the periodic timer runs at a fixed frequency that is a multiple
> of the hz rate, while statclock and profclock are generated by dividing
> it accordingly. (If somebody told me that hardclock jitter is not really
> a big problem, I would happily rip that artificial timekeeping out of
> there to simplify the code.) Unfortunately this approach makes it
> impossible to use two event timers to completely separate hard- and
> statclocks any more, but as I have said, this mode is required only for
> the limited set of systems without one-shot capable timers. Judging by
> my recent experience with different platforms, that is not a big
> fraction.
>
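If I understand the periodic mode correctly, the derived clocks amount to counting timer interrupts. A minimal sketch of that idea follows; the struct and function are hypothetical, and it assumes the timer frequency is an exact multiple of each derived rate, which the patch may handle differently.

```c
#include <stdint.h>

/* Hypothetical divider: fire once every 'div' timer interrupts. */
struct divider {
	uint32_t div;	/* timer frequency / event rate, assumed exact */
	uint32_t cnt;	/* interrupts seen since the last firing */
};

/* Returns 1 when the derived event is due on this interrupt, else 0. */
int
divider_tick(struct divider *d)
{
	if (++d->cnt < d->div)
		return (0);
	d->cnt = 0;
	return (1);
}
```

With an illustrative 2000 Hz timer, a divider of 2 would give a 1000 Hz hardclock and a divider of 16 a 125 Hz statclock, both driven by the timer's own interrupts rather than by the timecounter.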
> The management code still handles both per-CPU and global timers.
> Per-CPU timer usage is obvious. A global timer is programmed to handle
> the needs of all CPUs. In periodic mode the global timer generates
> periodic interrupts to one CPU, and the management code then
> redistributes them via IPI to the CPUs that really need them. In
> one-shot mode the timer is always programmed to handle the first
> scheduled event throughout the system. When that interrupt arrives, it
> is likewise redistributed via IPI to the CPUs that want it.
>
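For the global timer in one-shot mode, I picture something like the sketch below. It is only an illustration under my own assumptions: NCPU, the array of per-CPU event times and both function names are made up, and the real code presumably sends IPIs where this just builds a mask.

```c
#include <stdint.h>

#define NCPU 4	/* illustrative CPU count */

/* Earliest pending event across all CPUs; the global timer is armed for it. */
uint64_t
global_next(const uint64_t next[NCPU])
{
	uint64_t t = next[0];

	for (int i = 1; i < NCPU; i++)
		if (next[i] < t)
			t = next[i];
	return (t);
}

/* Bitmask of CPUs whose event is due at 'now'; these would get an IPI. */
uint32_t
cpus_due(const uint64_t next[NCPU], uint64_t now)
{
	uint32_t mask = 0;

	for (int i = 0; i < NCPU; i++)
		if (next[i] <= now)
			mask |= 1u << i;
	return (mask);
}
```

CPUs whose next event lies in the future are left out of the mask, so they are not interrupted.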
> To demonstrate the features such flexibility makes possible, I have
> incorporated the idea and some parts of the dynamic ticks patches of
> Tsuyoshi Ozawa. Now, when a CPU goes down into the C2/C3 ACPI sleep
> state, that CPU stops scheduling hard-/stat-/profclock events until
> the next registered callout event. If the CPU is woken before that time
> by some unrelated interrupt, the missed ticks are called artificially
> (this is needed for now to keep system stats realistic). Once the system
> is up to date, the interrupt is handled. For now this is implemented
> only for ACPI systems with C2/C3 state support, because ACPI resumes the
> CPU with interrupts disabled, which allows catching up on missed time
> before the interrupt handler or some other process (in case of an
> unexpected task switch) may need it. As far as I can see, Linux does
> similar things at the beginning of every interrupt handler.
>
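The "missed ticks are called artificially" part could look roughly like this. Again it is a sketch of my understanding, not the patch: catch_up() is a hypothetical name, and nanosecond times stand in for struct bintime.

```c
#include <stdint.h>

/*
 * Replay the hardclock ticks that fell due while the CPU slept.
 * '*next' is the time of the next unhandled tick and 'period' is 1/hz,
 * both in nanoseconds; 'period' must be nonzero.
 * Returns the number of ticks replayed and leaves '*next' after 'now'.
 */
unsigned
catch_up(uint64_t *next, uint64_t period, uint64_t now)
{
	unsigned n = 0;

	while (*next <= now) {
		/* A real handler would call hardclock() here. */
		*next += period;
		n++;
	}
	return (n);
}
```

Running this with interrupts still disabled, as described for the ACPI wake-up path, would bring the tick accounting up to date before the waking interrupt is handled.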
> I have actively tested this code for a few days on my amd64 Core2Duo
> laptop and i386 Core-i5 desktop system. With C2/C3 states enabled the
> systems experience only about 100-150 interrupts per second, with HZ
> set to 1000. These events are mostly caused by several event-greedy
> processes in our tree. I have traced and hacked several of the most
> aggressive ones in this patch:
> http://people.freebsd.org/~mav/tm6292_idle.patch .
> That allowed me to get down to as low as 50 interrupts per second,
> including IPIs! Here is the output of `systat -vm 1` from my test
> system: http://people.freebsd.org/~mav/systat_w_oneshot.txt . Obviously,
> with additional tuning the results can be improved even more.
>
> My latest patch against 9-CURRENT can be found here:
> http://people.freebsd.org/~mav/timers_oneshot4.patch
>
> Comments, ideas, suggestions -- welcome!
>
> Thanks to all who read this. ;)

Totally awesome work, mav@!

One thing I see:

Where is frame pointing to? In the excerpt below it is declared but
never assigned before being passed to cyclic_clock_func, so it looks
like an uninitialized pointer:

+static int
+handleevents(struct bintime *now, int fake)
 {
+	struct trapframe *frame;
+	struct pcpu_state *state;
+	uintfptr_t pc;
+	int usermode;
+	int done;

-	if (doconfigtimer(0))
-		return (FILTER_HANDLED);
-	return (hardclockhandler(frame));
+	done = 0;
+#ifdef KDTRACE_HOOKS
+	/*
+	 * If the DTrace hooks are configured and a callback function
+	 * has been registered, then call it to process the high speed
+	 * timers.
+	 */
+	if (cyclic_clock_func[curcpu] != NULL)
+		(*cyclic_clock_func[curcpu])(frame);
+#endif
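
To make the concern concrete, here is a small standalone sketch of what I would expect: the pointer gets a defined value on every path before the hook sees it. Everything in it is a stand-in of mine (the trimmed struct trapframe, clock_hook_t, sample_hook, the intr_frame parameter); I do not know where the patch intends the frame to come from.

```c
#include <stddef.h>

/* Stand-in for the real machine-dependent structure. */
struct trapframe {
	int tf_usermode;
};

/* Hypothetical hook type, analogous to cyclic_clock_func[curcpu]. */
typedef void (*clock_hook_t)(struct trapframe *);

/* Example hook: records whether the interrupted code ran in user mode. */
int last_usermode = -1;

void
sample_hook(struct trapframe *frame)
{
	last_usermode = frame->tf_usermode;
}

/*
 * 'intr_frame' stands for wherever the interrupted context was saved;
 * 'fake' marks a call made outside a real timer interrupt.
 * Returns 1 if the hook was called, 0 otherwise.
 */
int
handleevents_sketch(struct trapframe *intr_frame, int fake, clock_hook_t hook)
{
	struct trapframe *frame;

	/* Assign 'frame' on every path before anything reads it. */
	frame = fake ? NULL : intr_frame;

	if (hook != NULL && frame != NULL) {
		(*hook)(frame);
		return (1);
	}
	return (0);
}
```

Whether the fake case should skip the hook or pass it something else is a design question for the patch; the sketch only shows the pointer never being read uninitialized.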

Also, for those of us testing, should we "reset" our timer settings
back to defaults and work from there[1] (meaning, should we be futzing
around with timer event sources, kern.hz, etc...)?

Thanks again for tackling these tough but important issues. I'm
very much looking forward to testing this out!

-Brandon

[1] http://wiki.freebsd.org/TuningPowerConsumption