On Monday, January 16, 2017 10:10:16 PM Hans Petter Selasky wrote: > On 01/16/17 20:31, John Baldwin wrote: > > On Monday, January 16, 2017 04:51:42 PM Hans Petter Selasky wrote: > >> Hi, > >> > >> When booting I observe an additional 30-second delay after this print: > >> > >>> Timecounters tick every 1.000 msec > >> > >> ~30 second delay and boot continues like normal. > >> > >> Checking "vmstat -i" reveals that some timers have been running loose. > >> > >>> cpu0:timer 44300 442 > >>> cpu1:timer 40561 404 > >>> cpu3:timer 48462822 483058 > >>> cpu2:timer 48477898 483209 > >> > >> Trying to add delays and/or prints around the Timecounters printout > >> makes the issue go away. Any ideas for debugging? > > > > I have generally used KTR tracing to trace what is happening during > > boot to debug EARLY_AP_STARTUP issues. > > > > Hi John, > > What happens is that getnextcpuevent(0) keeps on returning > "state->nextcall" which is in the past for CPU #2 and #3 on my box. > > In "cpu_new_callout()" there is a check if "bt >= state->nextcall", > which I suspect is true, so "state->nextcall" never gets set to real > minimum sbintime. > > The attached patch fixes the problem for me, but I'm not 100% sure if it > is correct. I think we want to be honoring any currently scheduled callouts. You could do that by setting it to 'cc_firstevent' of the associated CPU, but in practice 'state->nextcall' should already be set to that (it is initalized to SBT_MAX in cpu_initclocks_bsp() and is then only set to other values due to cpu_new_callout()). Keep in mind that configtimer() is not just called from boot, but is also invoked when starting/stopping the profiling timer. However, when setting 'nextevent' (which is used to schedule the next timer interrupt), we should be honoring the existing 'nextcall' if it is sooner than the next hardclock. (One odd thing is that even in your case the first call to handleevents(), the 'now => state->nextcallout' check in handleevents() should be true which resets both nextcall and nextcallopt and invokes callout_process().) Here is a suggestion that attempts what I described in the first paragraph. If you still get hangs it would be good to break into DDB and capture the output of 'show clocksource'. Index: kern_clocksource.c =================================================================== --- kern_clocksource.c (revision 312301) +++ kern_clocksource.c (working copy) _at__at_ -503,7 +503,12 _at__at_ configtimer(int start) state->nextevent = SBT_MAX; else #endif + if (next < state->nextcall) state->nextevent = next; + else if (state->nextcall < now) + state->nextevent = now; + else + state->nextevent = state->nextcall; if (periodic) state->nexttick = next; else _at__at_ -511,8 +516,6 _at__at_ configtimer(int start) state->nexthard = next; state->nextstat = next; state->nextprof = next; - state->nextcall = next; - state->nextcallopt = next; hardclock_sync(cpu); } busy = 0; -- John BaldwinReceived on Tue Jan 17 2017 - 16:32:41 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:09 UTC