Re: More ULE bugs fixed.

From: Jeff Roberson <jroberson_at_chesapeake.net>
Date: Wed, 29 Oct 2003 12:33:38 -0500 (EST)

On Thu, 30 Oct 2003, Bruce Evans wrote:

> > Test for scheduling buildworlds:
> >
> > 	cd /usr/src/usr.bin
> > 	for i in obj depend all
> > 	do
> > 		MAKEOBJDIRPREFIX=/somewhere/obj time make -s -j16 $i
> > 	done >/tmp/zqz 2>&1
> >
> > (Run this with an empty /somewhere/obj.  The all stage doesn't quite
> > finish.)  On an ABIT BP6 system with a 400MHz and a 366MHz CPU, with
> > /usr (including /usr/src) nfs-mounted (with 100 Mbps ethernet and a
> > reasonably fast server) and /somewhere/obj ufs1-mounted (on a fairly
> > slow disk; no soft-updates), this gives the following times:
> >
> > SCHED_ULE-yesterday, with not so careful setup:
> >        40.37 real         8.26 user         6.26 sys
> >       278.90 real        59.35 user        41.32 sys
> >       341.82 real       307.38 user        69.01 sys
> > SCHED_ULE-today, run immediately after booting:
> >        41.51 real         7.97 user         6.42 sys
> >       306.64 real        59.66 user        40.68 sys
> >       346.48 real       305.54 user        69.97 sys
> > SCHED_4BSD-yesterday, with not so careful setup:
> >       [same as today except the depend step was 10 seconds slower (real)]
> > SCHED_4BSD-today, run immediately after booting:
> >        18.89 real         8.01 user         6.66 sys
> >       128.17 real        58.33 user        43.61 sys
> >       291.59 real       308.48 user        72.33 sys
> > SCHED_4BSD-yesterday, with a UP kernel (running on the 366 MHz CPU) with
> >     many local changes and not so careful setup:
> >        17.39 real         8.28 user         5.49 sys
> >       130.51 real        60.97 user        34.63 sys
> >       390.68 real       310.78 user        60.55 sys
> >
> > Summary: SCHED_ULE was more than twice as slow as SCHED_4BSD for the
> > obj and depend stages.  These stages have little parallelism.  SCHED_ULE
> > was only 19% slower for the all stage.  ...
>
> I reran this with -current (sched_ule.c 1.68, etc.).  Result: no
> significant change.  However, with a UP kernel there was no significant
> difference between the times for SCHED_ULE and SCHED_4BSD.

There was a significant difference on UP until last week.  I'm working on
SMP now.  I have some patches but they aren't quite ready yet.

>
> > Test 5 for fair scheduling related to niceness:
> >
> > 	for i in -20 -16 -12 -8 -4 0 4 8 12 16 20
> > 	do
> > 		nice -$i sh -c "while :; do echo -n;done" &
> > 	done
> > 	time top -o cpu
> >
> > With SCHED_ULE, this now hangs the system, but it worked yesterday.  Today
> > it doesn't get as far as running top and it stops the nfs server responding.
> > To unhang the system and see what the above does, run a shell at rtprio 0
> > and start top before the above, and use top to kill processes (I normally
> > use "killall sh" to kill all the shells generated by tests 1-5, but killall
> > doesn't work if it is on nfs when the nfs server is not responding).
>
> This shows problems much more clearly with UP kernels.  It gives the
> nice -20 and -16 processes approx. 55% and 50% of the CPU, respectively
> (the total is significantly more than 100%), and it gives approx.  0%
> of the CPU to the other sh processes (perhaps exactly 0).  It also
> apparently gives 0% of the CPU to some important nfs process (I
> couldn't see exactly which) so the nfs server stops responding.
> SCHED_4BSD errs in the opposite direction by giving too many cycles to
> highly niced processes so it is naturally immune to this problem.  With
> SMP, SCHED_ULE lets many more processes run.

I seem to have broken something related to nice.  I only tested
interactivity and performance after my last round of changes.  I have a
standard test that I do that is similar to the one that you have posted
here.  I used it to gather results for my paper
(http://www.chesapeake.net/~jroberson/ULE.pdf).  There you can see what
the intended nice curve is like.  Oddly enough, I ran your test again on
my laptop and I did not see 55% of the cpu going to nice -20.  It was
spread proportionally from -20 to 0 with positive nice values not receiving
cpu time, as intended.  It did not, however, let interactive processes
proceed.  This is certainly a bug and it sounds like there may be others
which lead to the problems that you're having.

>
> The nfs server also sometimes stops responding with only non-negatively
> niced processes (0 through 20 in the above), but it takes longer.
>
> The nfs server restarts if enough of the hog processes are killed.
> Apparently nfs has some critical process running at only user priority
> and nice 0 and even non-negatively niced processes are enough to prevent
> it from running.

This shouldn't be the case; it sounds like my interactivity boost is
somewhat broken.

>
> Top output with loops like the above shows many anomalies in PRI, TIME,
> WCPU and CPU, but no worse than the ones with SCHED_4BSD.  PRI tends to
> stick at 139 (the max) with SCHED_ULE.  With SCHED_4BSD, this indicates
> that the scheduler has entered an unfair scheduling region.  I don't
> know how to interpret it for SCHED_ULE (at first I thought 139 was a
> dummy value).

Priority has a different meaning in ULE and WCPU shouldn't differ from CPU
at the moment.  I'm confused about the results of your nice test, but it
shouldn't take me long to fix it.  I'm probably going to do SMP
performance first though.

Cheers,
Jeff

>
> Bruce