Re: nice handling in ULE (was: Re: SCHEDULE and high load situations)

From: Don Lewis <truckman_at_FreeBSD.org>
Date: Fri, 13 Aug 2004 03:02:36 -0700 (PDT)
On 12 Aug, Don Lewis wrote:

> I did some experimentation, and the problem I'm seeing appears to just
> be related to how nice values are handled by ULE.  I'm running two
> copies of the following program, one at nice +15, and the other not
> niced:
> 
> hairball:~ 102>cat sponge.c
> int
> main(int argc, char **argv)
> {
>         while (1)
>                 ;
> }
> 
> The niced process was started second, but it has accumulated more CPU
> time and is getting a larger percentage of the CPU time according to
> top.
> 
> last pid:   662;  load averages:  2.00,  1.95,  1.45    up 0+00:22:35  15:14:27
> 31 processes:  3 running, 28 sleeping
> CPU states: 45.3% user, 53.1% nice,  1.2% system,  0.4% interrupt,  0.0% idle
> Mem: 22M Active, 19M Inact, 44M Wired, 28K Cache, 28M Buf, 408M Free
> Swap: 1024M Total, 1024M Free
> Seconds to delay: 
>   PID USERNAME PRI NICE   SIZE    RES STATE    TIME   WCPU    CPU COMMAND
>   599 dl       139   15  1180K   448K RUN      8:34 53.91% 53.91% sponge
>   598 dl       139    0  1180K   448K RUN      7:22 42.97% 42.97% sponge
>   587 dl        76    0  2288K  1580K RUN      0:03  0.00%  0.00% top
>   462 root      76    0 56656K 46200K select   0:02  0.00%  0.00% Xorg
>   519 gdm       76    0 11252K  8564K select   0:01  0.00%  0.00% gdmlogin
>   579 dl        76    0  6088K  2968K select   0:00  0.00%  0.00% sshd
> 
> 
> 
> I thought it might have something to do with grouping by niceness, which
> would group the un-niced process with a bunch of other processes that
> wake up every now and then for a little bit of CPU time, so I tried the
> experiment again with nice +1 and nice +15.  This gave a rather
> interesting result.  Top reports the nice +15 process as getting a
> higher %CPU, but the nice +1 process has slowly accumulated a bit more
> total CPU time.  The difference in total CPU time was initially seven
> seconds or less.
> 
> last pid:   745;  load averages:  2.00,  1.99,  1.84    up 0+00:43:30  15:35:22
> 31 processes:  3 running, 28 sleeping
> CPU states:  0.0% user, 99.6% nice,  0.4% system,  0.0% interrupt,  0.0% idle
> Mem: 22M Active, 19M Inact, 44M Wired, 28K Cache, 28M Buf, 408M Free
> Swap: 1024M Total, 1024M Free
> Seconds to delay: 
>   PID USERNAME PRI NICE   SIZE    RES STATE    TIME   WCPU    CPU COMMAND
>   675 dl       139   15  1180K   448K RUN      9:48 52.34% 52.34% sponge
>   674 dl       139    1  1180K   448K RUN     10:03 44.53% 44.53% sponge
>   587 dl        76    0  2288K  1580K RUN      0:06  0.00%  0.00% top
>   462 root      76    0 56656K 46200K select   0:03  0.00%  0.00% Xorg
>   519 gdm       76    0 11252K  8564K select   0:02  0.00%  0.00% gdmlogin
>   579 dl        76    0  6088K  2968K select   0:00  0.00%  0.00% sshd


I compiled a kernel with the KTR stuff and ran this last experiment
again.  It looks like the two niced processes get the appropriate slice
values assigned by ULE, and they both have the same priority.  Where
things seem to be going wrong is that the two processes are being run in
a round-robin fashion, alternating execution once every tick or two.  The
less-nice process gets preempted multiple times by the more-nice process
before the less-nice process has exhausted its slice.
Received on Fri Aug 13 2004 - 08:02:47 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:38:06 UTC