Re: EARLY_AP_STARTUP hangs during boot

From: John Baldwin <jhb_at_freebsd.org> Date: Thu, 28 Jul 2016 08:09:23 -0700 · This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:07 UTC

On Thursday, July 28, 2016 12:31:31 AM Gary Jennejohn wrote:
> On Wed, 27 Jul 2016 14:43:36 -0700
> John Baldwin <jhb_at_freebsd.org> wrote:
> 
> > On Tuesday, June 07, 2016 12:06:54 PM Gary Jennejohn wrote:
> > > On Tue, 31 May 2016 13:10:06 -0700
> > > John Baldwin <jhb_at_freebsd.org> wrote:
> > >   
> > > > On Saturday, May 28, 2016 02:11:41 PM Gary Jennejohn wrote:  
> > > > > On Fri, 27 May 2016 09:50:05 +0200
> > > > > Gary Jennejohn <gljennjohn_at_gmail.com> wrote:
> > > > >     
> > > > > > On Thu, 26 May 2016 16:54:35 -0700
> > > > > > John Baldwin <jhb_at_freebsd.org> wrote:
> > > > > >     
> > > > > > > On Tuesday, May 17, 2016 06:47:41 PM Gary Jennejohn wrote:      
> > > > > > > > On Mon, 16 May 2016 10:54:19 -0700
> > > > > > > > John Baldwin <jhb_at_freebsd.org> wrote:
> > > > > > > >         
> > > > > > > > > On Monday, May 16, 2016 12:22:42 PM Gary Jennejohn wrote:        
> > > > > > > > > > I tried out EARLY_AP_STARTUP, but the kernel hangs and I can't
> > > > > > > > > > break into DDB.
> > > > > > > > > > 
> > > > > > > > > > I did a verbose boot and the last lines I see are related to routing
> > > > > > > > > > MSI-X to various local APIC vectors.  I copied the last few lines and
> > > > > > > > > > they look like this:
> > > > > > > > > > 
> > > > > > > > > > msi: routing MSI-X IRQ 256 to local APIC 2 vector 48
> > > > > > > > > > msi: routing MSI-X IRQ 257 to local APIC 3 vector 48
> > > > > > > > > > msi: routing MSI-X IRQ 258 to local APIC 4 vector 48
> > > > > > > > > > msi: routing MSI-X IRQ 256 to local APIC 0 vector 49        
> > > > > > > >          ^^^^^^^ Assigning        
> > > > > > > > > > 
> > > > > > > > > > I tried disabling msi and msix in /boot/loader.conf, but the settings
> > > > > > > > > > were ignored (probabaly too early).          
> > > > > > > > > 
> > > > > > > > > No, those settings are not too early.  However, the routing to different
> > > > > > > > > CPUs now happens earlier than it used to.  What is the line before the
> > > > > > > > > MSI lines?  You can take a picture with your phone/camera if that's simplest.
> > > > > > > > >         
> > > > > > > > 
> > > > > > > > Here a few lines before the MSI routing happens:
> > > > > > > > 
> > > > > > > > hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff irq 0,8 on acpi0
> > > > > > > > hpet0: vendor 0x4353, rev 0x1, 14318180 Hz, 3 timers, legacy route
> > > > > > > > hpet0: t0 : irqs 0x00c0ff (0), MSI, periodic
> > > > > > > > hpet0: t1 : irqs 0x00c0ff (0), MSI, periodic
> > > > > > > > hpet0: t2 : irqs 0x00c0ff (0), MSI, periodic
> > > > > > > > Timecounter "HPET" frequency 14318180 Hz quality 950        
> > > > > > > 
> > > > > > > The assigning message means it is in the loop using
> > > > > > > bus_bind_intr() to setup per-CPU timers.  Can you please try
> > > > > > > setting 'hint.hpet.0.per_cpu=0' at the loader prompt to see if
> > > > > > > disabling the use of per-CPU timers allows you to boot?
> > > > > > >       
> > > > > > 
> > > > > > Something has changed since the last time I generated a kernel with
> > > > > > this option.
> > > > > > 
> > > > > > Now I get a NULL-pointer dereference in the kernel, doesn't matter
> > > > > > whether I set the hint or not.
> > > > > >     
> > > > > 
> > > > > OK, now that the startup has been fixed, I tried setting the hint at
> > > > > the loader prompt, but the kenel hangs in exactly the same place as
> > > > > before.  I actually booted twice to make certain I hadn't made a
> > > > > typo when setting the hint.    
> > > > 
> > > > Humm, it shouldn't be calling bus_bind_intr() if the hint is set.  Actually,
> > > > I guess it just binds them all to first CPU if per-CPU timers aren't set.
> > > > Can you add debug printfs to hpet_attach() in sys/dev/acpica/acpi_hpet.c to
> > > > narrow down which line in that function it hangs after?
> > > > 
> > > > Another option to try is to add the following to your kernel config:
> > > > 
> > > > options  	KTR
> > > > options  	KTR_COMPILE=KTR_PROC
> > > > options 		KTR_MASK=KTR_PROC
> > > > options  	KTR_VERBOSE=1
> > > > 
> > > > this will spew a lot of crap to the screen, but if it stops spewing when it
> > > > hangs then it might be tell us where the system is hung.  If you have any way
> > > > to configure a serial console then this would also be useful even if it spews
> > > > constantly when it is hung (assuming you could log the output of the serial
> > > > console).
> > > >   
> > > 
> > > I used the KTR options.
> > > 
> > > After the Timecounter "HPET" frequency 14318180 Hz quality 950 I see
> > > 
> > > cpu0 mi_switch: old thread 10000 (swapper)
> > > cpu0 mi_switch: new thread 10022 (if_config_tqg_0)
> > > cpu0 sleep_broadcast(0x80002f9a600, 0)
> > > cpu0 msleep_spin: old thread 100022
> > > cpu0 mi_switch: old thread 10022
> > > cpu0 mi_switch: new thread 10016 (if_io_tqg_0)
> > > cpu0 sleep_broadcast(0x80002f9a780, 0)
> > > cpu0 msleep_spin: old thread 10016
> > > cpu0 mi_switch: old thread 10016
> > > cpu0 fork_exit: new thread 0x80004239510 (td_sched 0x8000042399d8, pid
> > > 10, idle: cpu0)
> > > 
> > > And that's all that came out, really not very much at all.  
> > 
> > Ok, that seems odd.
> > 
> > Can you apply this patch and run with the KTR output still:
> > 
> > Index: sched_ule.c
> > ===================================================================
> > --- sched_ule.c	(revision 303397)
> > +++ sched_ule.c	(working copy)
> > _at__at_ -1904,6 +1904,13 _at__at_ sched_switch(struct thread *td, struct thread *new
> >  	td->td_owepreempt = 0;
> >  	if (!TD_IS_IDLETHREAD(td))
> >  		tdq->tdq_switchcnt++;
> > +
> > +	if (td == &thread0 && cold) {
> > +		printf("thread0 switching out (ts_cpu %d):\n", ts->ts_cpu);
> > +		kdb_backtrace();
> > +		MPASS(TD_IS_RUNNING(td));
> > +	}
> > +	
> >  	/*
> >  	 * The lock pointer in an idle thread should never change.  Reset it
> >  	 * to CAN_RUN as well.
> > _at__at_ -1920,6 +1927,9 _at__at_ sched_switch(struct thread *td, struct thread *new
> >  		if (THREAD_CAN_MIGRATE(td) && !THREAD_CAN_SCHED(td, ts->ts_cpu))
> >  			ts->ts_cpu = sched_pickcpu(td, 0);
> >  #endif
> > +		if (td == &thread0 && cold)
> > +			printf("thread0 ts_cpu %d, cpuid %d\n", ts->ts_cpu,
> > +			    cpuid);
> >  		if (ts->ts_cpu == cpuid)
> >  			tdq_runq_add(tdq, td, srqflag);
> >  		else {
> > 
> > 
> 
> Well, now I know that ULE is a prerequiste for EARLY_AP_STARTUP!  I
> wasn't aware of that.  I prefer BSD and that's the scheduler I did
> the first tests with.
> 
> But with the ULE scheduler the system comes up all the way.
> 
> It would be nice if the BSD scheduler could also be modified to
> work with EARLY_AP_STARTUP.

No, ULE is not a prerequisite.  However, I will do some testing with 4BSD
to see what I need to fix.  Thanks!  I suspect the early sched_bind that
happens when migrating x86 interrupts due to bus_bind_intr() is causing
the issue.

-- 
John Baldwin