Re: make buildkernel hang with SCHED_ULE

From: Jeff Roberson <jroberson_at_chesapeake.net>
Date: Thu, 14 Aug 2003 22:54:20 -0400 (EDT)

On Thu, 14 Aug 2003, Adam Migus wrote:

> Andrew Gallatin wrote:
>
> >Adam Migus writes:
> > > Folks,
> > > While doing some performance analysis (doing make -j5 buildkernel)
> > > on a set of 14 kernels I've hit one using the SCHED_ULE scheduler
> > > that hangs.   It happens every time but not necessarily in the same
> > > place in the make.
> > >
> >
> ><...>
> >
> > > The hardware is a dual Xeon box.  The kernel is SMP w/ SCHED_ULE
> > > instead of SCHED_4BSD, the options required for diskless and the
> > > following two options:
> >
> >You have machdep.hlt_logical_cpus: 1 in your sysctl output.  [BTW,
> >lots of people read this mail via the web archives at
> >http://docs.freebsd.org/cgi/getmsg.cgi?fetch=1073654+0+current/freebsd-current,
> >where it's impossible to view MIME; it would be MUCH better for us if
> >you appended things like stack traces and sysctl output rather than
> >scrambling them for no reason]
> >
> >SCHED_ULE is incompatible with halting logical CPUs.  Somehow it
> >doesn't know the core isn't running, so it schedules a job there
> >which never runs, and then it gets confused.  When I boot a 1 CPU P4
> >with an SMP kernel and machdep.hlt_logical_cpus=1, it hangs before
> >making it to multiuser mode.
> >
> >Try setting machdep.hlt_logical_cpus=0 (via sysctl now, and in
> >/boot/loader.conf so it doesn't happen again).
> >
> >
> >Drew
> >
> >
>
> Andrew,
> WRT the MIME thing.  My apologies.  It never occurred to me as everyone I
> know personally uses a "real" mail reader.  I'd attached them simply to
> keep the scrolling down and allow order-independent viewing.  Thanks for
> the tip.  I'll just read them in as plain text in the future.
>
> WRT the sysctl value.  Thanks for the tip.  Is this to be considered a
> bug in SCHED_ULE?  If the default is hlt_logical_cpus=1 I would think
> the scheduler should be able to handle it or deal with it
> appropriately.  Perhaps ignoring the value, setting it to 0 internally
> or even just putting a warning message on boot?  After all, not everyone
> RTFM's.  :-)
>

The MD code does not currently export the status of the CPUs in any
reliable way.  ULE attempts to recognize halted CPUs, but other issues
currently prevent it from doing so.  I think John Baldwin might be solving
this for x86.  If not, I can take a stab at it again.
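
In the meantime, the workaround Drew describes above (turn the sysctl off
now, and put it in /boot/loader.conf so it sticks) could be scripted
roughly as follows.  This is only a sketch: set_tunable is a made-up
helper name, not an existing FreeBSD tool, and it assumes a plain
name="value" style loader.conf.  The two commands that touch the live
system are left commented out.

```sh
#!/bin/sh
# Sketch only: set_tunable is a hypothetical helper, not a FreeBSD tool.
# It appends name="value" to a loader.conf-style file unless that exact
# line is already present, so running it twice does not duplicate it.
set_tunable() {
    file=$1
    name=$2
    value=$3
    line="${name}=\"${value}\""
    # -x: match the whole line, -F: fixed string, -q: no output
    grep -qxF -e "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Intended use, as root on the affected box (commented out on purpose):
#   sysctl machdep.hlt_logical_cpus=0
#   set_tunable /boot/loader.conf machdep.hlt_logical_cpus 0
```

The helper does not rewrite an existing entry with a different value; it
only appends, so an older conflicting line would need to be removed by
hand.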

> Thanks again,
>
> --
> Adam - Migus Dot Org (http://www.migus.org)
>
Received on Thu Aug 14 2003 - 17:55:42 UTC