Re: More ULE bugs fixed.

From: Jeff Roberson <jroberson_at_chesapeake.net>
Date: Fri, 17 Oct 2003 02:28:21 -0400 (EDT)

On Fri, 17 Oct 2003, Bruce Evans wrote:

> How would one test if it was an improvement on the 4BSD scheduler?  It
> is not even competitive in my simple tests.

[script results deleted]

>
> Summary: SCHED_ULE was more than twice as slow as SCHED_4BSD for the
> obj and depend stages.  These stages have little parallelism.  SCHED_ULE
> was only 19% slower for the all stage.  It apparently misses many
> opportunities to actually run useful processes.  This may be related
> to /usr being nfs mounted.  There is lots of idling waiting for nfs
> even in the SCHED_4BSD case.  The system times are smaller for SCHED_ULE,
> but this might not be significant.  E.g., zeroing pages can account
> for several percent of the system time in buildworld, but on unbalanced
> systems that have too much idle time most page zeroing gets done in idle
> time and doesn't show up in the system time.

At one point ULE was at least as fast as 4BSD and in most cases faster.
This is a regression.  I'll sort it out soon.

>
> Test 1 for fair scheduling related to niceness:
>
> 	for i in 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
> 	do
> 		nice -$i sh -c "while :; do echo -n;done" &
> 	done
> 	top -o time
>
> [Output deleted].  This shows only a vague correlation between niceness
> and runtime for SCHED_ULE.  However, top -o cpu shows a strong correlation
> between %CPU and niceness.  Apparently, %CPU is very inaccurate and/or
> not enough history is kept for long-term scheduling to be fair.
>
> Test 5 for fair scheduling related to niceness:
>
> 	for i in -20 -16 -12 -8 -4 0 4 8 12 16 20
> 	do
> 		nice -$i sh -c "while :; do echo -n;done" &
> 	done
> 	time top -o cpu
>
> With SCHED_ULE, this now hangs the system, but it worked yesterday.  Today
> it doesn't get as far as running top and it stops the nfs server responding.
> To unhang the system and see what the above does, run a shell at rtprio 0
> and start top before the above, and use top to kill processes (I normally
> use "killall sh" to kill all the shells generated by tests 1-5, but killall
> doesn't work if it is on nfs when the nfs server is not responding).

  661 root     112  -20   900K   608K RUN      0:24 27.80% 27.64% sh
  662 root     114  -16   900K   608K RUN      0:19 12.43% 12.35% sh
  663 root     114  -12   900K   608K RUN      0:15 10.66% 10.60% sh
  664 root     114   -8   900K   608K RUN      0:11  9.38%  9.33% sh
  665 root     115   -4   900K   608K RUN      0:10  7.91%  7.86% sh
  666 root     115    0   900K   608K RUN      0:07  6.83%  6.79% sh
  667 root     115    4   900K   608K RUN      0:06  5.01%  4.98% sh
  668 root     115    8   900K   608K RUN      0:04  3.83%  3.81% sh
  669 root     115   12   900K   608K RUN      0:02  2.21%  2.20% sh
  670 root     115   16   900K   608K RUN      0:01  0.93%  0.93% sh
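
Output like the above can also be checked mechanically instead of by eye.
The following is only a minimal sketch: it assumes the 11-column top
layout shown here (NICE in field 4, CPU in field 10), and the function
name check_nice_order is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads top(1)-style process lines on stdin, in the
# column layout PID USER PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND,
# and reports whether %CPU never increases as niceness increases.
check_nice_order() {
    # Order the lines by the NICE column (field 4), lowest first.
    sort -k4,4n | awk '
        # Strip the trailing "%" from the CPU column and force a number.
        { cpu = $10; sub(/%/, "", cpu); cpu += 0 }
        # A nicer process getting more CPU than the previous one is a violation.
        NR > 1 && cpu > prev { bad = 1 }
        { prev = cpu }
        END { print (bad ? "not monotonic" : "monotonic") }
    '
}
```

Feeding it the process lines saved from top prints "monotonic" when
each nicer process gets no more CPU than the one before it.  It says
nothing about whether the shares are fair, only that they are ordered.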

I think you cvsup'd at a bad time.  Late last night I fixed a bug that
would have caused the system to lock up in this case.  On my system it
freezes for a few seconds and then recovers.  I can stop that by turning
down the interactivity threshold.

Thanks,
Jeff

>
> Bruce