Re: Strange issue after early AP startup

From: Hans Petter Selasky <hps_at_selasky.org>
Date: Tue, 17 Jan 2017 20:31:28 +0100
Hi,

On 01/17/17 20:00, John Baldwin wrote:
>>
>> Does this matter for the first tick? How often is configtimer() called?
>
> As I said, it is called at runtime when profclock is started / stopped, not
> just at boot.  Those changes at runtime probably have existing callouts
> active and your change will not process any callouts until the next hardclock
> tick fires (but only because you are setting nextcallopt to the bogus
> 'next' value).



>>  > (One odd thing is that even in your case the first call to
>> handleevents(),
>>  > the 'now => state->nextcallout' check in handleevents() should be true
>>  > which resets both nextcall and nextcallopt and invokes
>> callout_process().)
>>
>> Let me take you through the failure path, by code inspection:
>
> I would really appreciate it if you could add traces to find out what actually
> happens rather than what seems to happen by looking at the code. :-/

The problem is that once you add some prints, the problem goes away. 
Maybe I should try to set hz to 100 or 25 ???

>
> 0) cpu_initclocks_bsp() is called and init's nextcall and nexcallopt to SBT_MAX
>    similar to your change.  If no callout is scheduled before configtimer()
>    then they remain set to SBT_MAX.  Your current patch happens to trigger a
>    (bogus) call to callout_process() on the first hardclock() because it
>    sets nextcallopt to 'next' even though no callout is actually scheduled to
>    fire at time 'next'.
>
>> 1) configtimer() is called and we init nextcall and nextcallopt:
>>
>>  >                 next = now + timerperiod;
>> ...
>>  >                       state->nextcall = next;
>>  >                       state->nextcallopt = next;
>
> These both say "the next callout() should fire at 'next' which is the time of
> the next hardclock()", even though there may be no callouts scheduled (in which
> case both of these fields should be set to SBT_MAX from the call to
> cpu_initclocks_bsp(), or there may be callouts scheduled in which case 'nextcall'
> and 'nextcallopt' will reflect the time that those callouts are already
> scheduled for and this overwrites that).

I see there are some callouts scheduled by SYSINITs, before the first 
configtimer(), like NFS_TIMERINIT in nfs_init(). These are setup using 
"dummy_timecounter" which means any nextcall values before the first 
configtimer should be discarded.

>
>> 2) Any callout_reset() calls cpu_new_callout():
>>
>>  >          */
>>  >         state->nextcallopt = bt_opt;
>>  >         if (bt >= state->nextcall)
>> We follow this path, because "bt" is surely based on sbinuptime() and is
>> greater or equal to state->nextcall. Note that state->nextcallopt is
>> updated to bt_opt, which is in the future.
>
> Note, my patch should _leave_ nextcall at SBT_MAX (from cpu_initclocks_bsp())
> unless there was already an earlier call to callout_reset().

Yes, there are calls to callout_reset(). See for example NFS_TIMERINIT, 
like mentioned above.

 > IOW, it
> should be a NOP for the purposes of this branch compared with your change.
> (You could add a warning to print out if 'nextcall' != SBT_MAX during boot
> and see if it fires for example.)
>
>>  >                 goto done;
>>  >         state->nextcall = bt;
>>
>> 3) getnextcpuevent(0) is called by the fast timercb() to setup the next
>> event:
>>
>>  >         state = DPCPU_PTR(timerstate);
>>  >         /* Handle hardclock() events, skipping some if CPU is idle. */
>>  >         event = state->nexthard;
>> ...
>>  >         /* Handle callout events. */
>>  >         if (event > state->nextcall)
>> We then go looping into this path, because state->nextcall is still
>> equal to "next" as in step 1) which is now in the past, until "now >=
>> state->nextcallopt" inside handleevents(), which clears this condition.
>>  >                 event = state->nextcall;
>> ...
>>  >         return (event);
>
> I'm curious if there is a callout_reset() that has set 'nextcall' to a time
> that is effectively before 'now'.
>
> Maybe add a printf like this:
>
> Index: kern_clocksource.c
> ===================================================================
> --- kern_clocksource.c	(revision 312301)
> +++ kern_clocksource.c	(working copy)
> _at__at_ -498,12 +498,18 _at__at_ configtimer(int start)
>  		CPU_FOREACH(cpu) {
>  			state = DPCPU_ID_PTR(cpu, timerstate);
>  			state->now = now;
> +			printf("%s: CPU %d: now %jd nextcall %jd nextcallopt %jd next %jd\n", __func__, cpu, state->nextcall, state->nextcall, next);
>  #ifndef EARLY_AP_STARTUP
>  			if (!smp_started && cpu != CPU_FIRST())
>  				state->nextevent = SBT_MAX;
>  			else
>  #endif
>
> In particular what I am worried about with your patch is that for
> post-boot calls to configtimer() you will delay any previously-scheduled
> callouts until the next hardclock.
>

I understand. Would a solution be to refactor callout_process(), to 
accept the PCPU_GET(CPUID) as an argument and be executed for all CPUs 
by configtimer(), instead of trying to guess state->nextcall and 
state->nextcallopt in configtimer() ?

--HPS
Received on Tue Jan 17 2017 - 18:32:04 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:09 UTC