Re: More ULE bugs fixed.

From: Bruce Evans <bde_at_zeta.org.au>
Date: Tue, 4 Nov 2003 00:33:48 +1100 (EST)

On Sun, 2 Nov 2003, Jeff Roberson wrote:

> On Sat, 1 Nov 2003, Bruce Evans wrote:

> > My simple make benchmark now takes infinitely longer with ULE under SMP,
> > since make -j 16 with ULE under SMP now hangs nfs after about a minute.
> > 4BSD works better.  However, some networking bugs have developed in the
> > last few days.  One of their manifestations is that SMP kernels always
> > panic in sbdrop() on shutdown.

This was fixed by setting debug.mpsafenet to 0 (fxp is apparently not MPSAFE
yet).

The last run with sched_ule.c 1.75 shows little difference between ULE
and 4BSD:

% *** zqz.4bsd.1	Wed Oct 29 22:03:29 2003
% --- zqz.ule.3	Sun Nov  2 22:58:53 2003
% ***************
% *** 4 ****
% --- 5,6 ----
% + ===> atm
% + ===> atm/sscop

The tree compiled under 4BSD is 4 days older, so the ULE run builds these
two extra directories.

% ***************
% *** 227 ****
% !        18.49 real         8.26 user         6.38 sys
% --- 229 ----
% !        18.44 real         8.00 user         6.43 sys

Differences for "make obj" (all of this is in the usr.bin tree).

% ***************
% *** 229,233 ****
% !        265  average shared memory size
% !        116  average unshared data size
% !        125  average unshared stack size
% !      23222  page reclaims
% !         26  page faults
% --- 231,235 ----
% !        274  average shared memory size
% !        118  average unshared data size
% !        128  average unshared stack size
% !      22760  page reclaims
% !         25  page faults
% ***************
% *** 236,241 ****
% !        918  block output operations
% !       9893  messages sent
% !       9893  messages received
% !        230  signals received
% !      13034  voluntary context switches
% !       1216  involuntary context switches
% --- 238,243 ----
% !        926  block output operations
% !       9973  messages sent
% !       9973  messages received
% !        232  signals received
% !      17432  voluntary context switches
% !       1583  involuntary context switches

Tiny differences in time -l output for obj stage, except ULE does more
context switches.

The signals are mostly SIGCHLD (needed to fix make(1)).

% ***************
% *** 245 ****
% --- 248,249 ----
% + ===> atm
% + ===> atm/sscop
% ***************
% *** 506 ****
% !       126.67 real        57.42 user        43.83 sys
% --- 510 ----
% !       124.43 real        58.07 user        42.17 sys
% ***************
% *** 508,512 ****
% !       1973  average shared memory size
% !        803  average unshared data size
% !        128  average unshared stack size
% !     203770  page reclaims
% !       1459  page faults
% --- 512,516 ----
% !       1920  average shared memory size
% !        784  average unshared data size
% !        127  average unshared stack size
% !     203124  page reclaims
% !       1464  page faults
% ***************
% *** 514,520 ****
% !        165  block input operations
% !       1463  block output operations
% !      83118  messages sent
% !      83117  messages received
% !        265  signals received
% !     100319  voluntary context switches
% !       8113  involuntary context switches
% --- 518,524 ----
% !        167  block input operations
% !       1469  block output operations
% !      83234  messages sent
% !      83236  messages received
% !        267  signals received
% !     125750  voluntary context switches
% !      17825  involuntary context switches

Similarly for depend stage.

% ***************
% *** 524 ****
% --- 529,530 ----
% + ===> atm
% + ===> atm/sscop
% ***************
% *** 701 ****
% !       291.30 real       307.00 user        73.77 sys
% --- 707 ----
% !       290.28 real       308.16 user        74.05 sys
% ***************
% *** 703,707 ****
% !       2073  average shared memory size
% !       2076  average unshared data size
% !        127  average unshared stack size
% !     624020  page reclaims
% !        156  page faults
% --- 709,713 ----
% !       2084  average shared memory size
% !       2056  average unshared data size
% !        128  average unshared stack size
% !     626651  page reclaims
% !        154  page faults
% ***************
% *** 709,715 ****
% !         72  block input operations
% !       2122  block output operations
% !      45315  messages sent
% !      45317  messages received
% !        691  signals received
% !     195785  voluntary context switches
% !      58130  involuntary context switches
% --- 715,721 ----
% !         83  block input operations
% !       2133  block output operations
% !      45532  messages sent
% !      45524  messages received
% !        759  signals received
% !     228998  voluntary context switches
% !     128078  involuntary context switches

Similarly for the "all" stage.  The benchmark was not run carefully enough
for the 1-second differences in the times to be significant.
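
To put the context-switch differences in proportion, the counts quoted
above can be reduced to relative changes.  The following is a small Python
sketch and was not part of the original benchmark; the table layout and the
rel_change() helper are names made up for illustration, and the numbers are
copied from the time -l output above.

```python
# Figures copied from the time -l diffs quoted above.
# The layout and names are illustrative, not from the original benchmark.
SWITCHES = {
    # stage: {kind: (4BSD, ULE)}
    "obj": {"voluntary": (13034, 17432), "involuntary": (1216, 1583)},
    "depend": {"voluntary": (100319, 125750), "involuntary": (8113, 17825)},
    "all": {"voluntary": (195785, 228998), "involuntary": (58130, 128078)},
}

# Wall-clock ("real") seconds per stage as (4BSD, ULE).
REAL_SECONDS = {
    "obj": (18.49, 18.44),
    "depend": (126.67, 124.43),
    "all": (291.30, 290.28),
}


def rel_change(old, new):
    """Return the change from old to new as a percentage of old."""
    return 100.0 * (new - old) / old


if __name__ == "__main__":
    for stage, kinds in SWITCHES.items():
        for kind, (bsd, ule) in kinds.items():
            print(f"{stage:7s} {kind:12s} {rel_change(bsd, ule):+7.1f}%")
        bsd_t, ule_t = REAL_SECONDS[stage]
        print(f"{stage:7s} {'real time':12s} {rel_change(bsd_t, ule_t):+7.1f}%")
```

On these figures the involuntary switches more than double in the depend
and all stages, while the real time changes by less than 2% in every stage.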

> You commented on the nice cutoff before.  What do you believe the correct
> behavior is?  In ULE I went to great lengths to be certain that I emulated
> the old behavior of denying nice +20 processes cpu time when anything nice
> 0 or above was running.  As a result of that, nice -20 processes inhibit
> any processes with a nice below zero from receiving cpu time.  Prior to a
> commit earlier today, nice -20 would stop nice 0 processes that were
> non-interactive.  I've changed that though so nice 0 will always be able
> to run, just with a small slice.  Based on your earlier comments, you
> don't believe that this behavior is correct, why, and what would you like
> to see?

Only RELENG_4 has that "old" behaviour.

I think the existence of rtprio and a non-broken idprio makes infinite
deprioritization using niceness unnecessary.  (idprio is still broken
(not available to users) in -current, but it doesn't need to be if
priority propagation is working as it should be.)  It's safer and fairer
for all niced processes to not completely prevent each other being
scheduled, and use the special scheduling classes for cases where this
is not wanted.  I'd mainly like the slices for nice -20 vs nice --20
processes to be very small and/or infrequent.
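
The difference between the two policies discussed here can be sketched in
a few lines of Python.  This is a toy model only, not code from ULE or
4BSD: the function names, the 20-step window, and the slice constants are
all assumptions chosen for illustration.  slice_with_cutoff() gives no
slice to a process whose niceness is too far above the least nice runnable
process, while slice_with_floor() always hands out at least a minimal
slice.

```python
# Toy model of two niceness policies.  All names and constants are
# hypothetical; nothing here is taken from sched_ule.c or sched_4bsd.c.
SLICE_MIN = 1     # smallest slice, in ticks (assumed)
SLICE_MAX = 14    # largest slice, in ticks (assumed)
NICE_WINDOW = 20  # niceness distance at which the cutoff policy gives up (assumed)


def _scaled(distance):
    """Shrink the slice linearly from SLICE_MAX to SLICE_MIN as distance grows."""
    clamped = max(0, min(distance, NICE_WINDOW))
    span = SLICE_MAX - SLICE_MIN
    return SLICE_MAX - (span * clamped) // NICE_WINDOW


def slice_with_cutoff(nice, least_nice):
    """A process NICE_WINDOW or more above the least nice runnable one gets nothing."""
    distance = nice - least_nice
    if distance >= NICE_WINDOW:
        return 0
    return _scaled(distance)


def slice_with_floor(nice, least_nice):
    """Every process gets at least SLICE_MIN, however large the distance."""
    return max(SLICE_MIN, _scaled(nice - least_nice))


if __name__ == "__main__":
    for nice in (-20, 0, 20):
        print(nice, slice_with_cutoff(nice, -20), slice_with_floor(nice, -20))
```

With these assumed constants, a process at niceness +20 competing with one
at niceness -20 gets no ticks under the cutoff policy and one tick under
the floor policy.  Making such slices infrequent as well as small would
need extra bookkeeping that this sketch leaves out.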

Bruce