On Tuesday, January 17, 2017 07:04:15 PM Hans Petter Selasky wrote: > On 01/17/17 16:50, John Baldwin wrote: > > On Monday, January 16, 2017 10:10:16 PM Hans Petter Selasky wrote: > >> On 01/16/17 20:31, John Baldwin wrote: > >>> On Monday, January 16, 2017 04:51:42 PM Hans Petter Selasky wrote: > >>>> Hi, > >>>> > >>>> When booting I observe an additional 30-second delay after this print: > >>>> > >>>>> Timecounters tick every 1.000 msec > >>>> > >>>> ~30 second delay and boot continues like normal. > >>>> > >>>> Checking "vmstat -i" reveals that some timers have been running loose. > >>>> > >>>>> cpu0:timer 44300 442 > >>>>> cpu1:timer 40561 404 > >>>>> cpu3:timer 48462822 483058 > >>>>> cpu2:timer 48477898 483209 > >>>> > >>>> Trying to add delays and/or prints around the Timecounters printout > >>>> makes the issue go away. Any ideas for debugging? > >>> > >>> I have generally used KTR tracing to trace what is happening during > >>> boot to debug EARLY_AP_STARTUP issues. > >>> > >> > >> Hi John, > >> > >> What happens is that getnextcpuevent(0) keeps on returning > >> "state->nextcall" which is in the past for CPU #2 and #3 on my box. > >> > >> In "cpu_new_callout()" there is a check if "bt >= state->nextcall", > >> which I suspect is true, so "state->nextcall" never gets set to real > >> minimum sbintime. > >> > >> The attached patch fixes the problem for me, but I'm not 100% sure > if it > >> is correct. > > > > Hi, > > > I think we want to be honoring any currently scheduled callouts. > > The problem here is that we might be changing the clocksource, then > sbinuptime() will change too, so I think the value should be reset by > configtimer() and then corrected at the next call to callout_process(). > > > You could > > do that by setting it to 'cc_firstevent' of the associated CPU, but in > > practice 'state->nextcall' should already be set to that (it is > initalized > > to SBT_MAX in cpu_initclocks_bsp() and is then only set to other > values due > > to cpu_new_callout()). Keep in mind that configtimer() is not just > called > > from boot, but is also invoked when starting/stopping the profiling > timer. > > > > > However, when setting 'nextevent' (which is used to schedule the next > timer > > interrupt), we should be honoring the existing 'nextcall' if it is sooner > > than the next hardclock. > > Does this matter for the first tick? How often is configtimer() called? As I said, it is called at runtime when profclock is started / stopped, not just at boot. Those changes at runtime probably have existing callouts active and your change will not process any callouts until the next hardclock tick fires (but only because you are setting nextcallopt to the bogus 'next' value). > > (One odd thing is that even in your case the first call to > handleevents(), > > the 'now => state->nextcallout' check in handleevents() should be true > > which resets both nextcall and nextcallopt and invokes > callout_process().) > > Let me take you through the failure path, by code inspection: I would really appreciate it if you could add traces to find out what actually happens rather than what seems to happen by looking at the code. :-/ 0) cpu_initclocks_bsp() is called and init's nextcall and nexcallopt to SBT_MAX similar to your change. If no callout is scheduled before configtimer() then they remain set to SBT_MAX. Your current patch happens to trigger a (bogus) call to callout_process() on the first hardclock() because it sets nextcallopt to 'next' even though no callout is actually scheduled to fire at time 'next'. > 1) configtimer() is called and we init nextcall and nextcallopt: > > > next = now + timerperiod; > ... > > state->nextcall = next; > > state->nextcallopt = next; These both say "the next callout() should fire at 'next' which is the time of the next hardclock()", even though there may be no callouts scheduled (in which case both of these fields should be set to SBT_MAX from the call to cpu_initclocks_bsp(), or there may be callouts scheduled in which case 'nextcall' and 'nextcallopt' will reflect the time that those callouts are already scheduled for and this overwrites that). > 2) Any callout_reset() calls cpu_new_callout(): > > > */ > > state->nextcallopt = bt_opt; > > if (bt >= state->nextcall) > We follow this path, because "bt" is surely based on sbinuptime() and is > greater or equal to state->nextcall. Note that state->nextcallopt is > updated to bt_opt, which is in the future. Note, my patch should _leave_ nextcall at SBT_MAX (from cpu_initclocks_bsp()) unless there was already an earlier call to callout_reset(). IOW, it should be a NOP for the purposes of this branch compared with your change. (You could add a warning to print out if 'nextcall' != SBT_MAX during boot and see if it fires for example.) > > goto done; > > state->nextcall = bt; > > 3) getnextcpuevent(0) is called by the fast timercb() to setup the next > event: > > > state = DPCPU_PTR(timerstate); > > /* Handle hardclock() events, skipping some if CPU is idle. */ > > event = state->nexthard; > ... > > /* Handle callout events. */ > > if (event > state->nextcall) > We then go looping into this path, because state->nextcall is still > equal to "next" as in step 1) which is now in the past, until "now >= > state->nextcallopt" inside handleevents(), which clears this condition. > > event = state->nextcall; > ... > > return (event); I'm curious if there is a callout_reset() that has set 'nextcall' to a time that is effectively before 'now'. Maybe add a printf like this: Index: kern_clocksource.c =================================================================== --- kern_clocksource.c (revision 312301) +++ kern_clocksource.c (working copy) _at__at_ -498,12 +498,18 _at__at_ configtimer(int start) CPU_FOREACH(cpu) { state = DPCPU_ID_PTR(cpu, timerstate); state->now = now; + printf("%s: CPU %d: now %jd nextcall %jd nextcallopt %jd next %jd\n", __func__, cpu, state->nextcall, state->nextcall, next); #ifndef EARLY_AP_STARTUP if (!smp_started && cpu != CPU_FIRST()) state->nextevent = SBT_MAX; else #endif In particular what I am worried about with your patch is that for post-boot calls to configtimer() you will delay any previously-scheduled callouts until the next hardclock. -- John BaldwinReceived on Tue Jan 17 2017 - 18:10:22 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:09 UTC