Re: Change top's notion of idle processes / threads

From: John Baldwin <jhb_at_freebsd.org>
Date: Tue, 27 May 2014 16:37:23 -0400

On Tuesday, May 27, 2014 10:43:56 am Ed Maste wrote:
> On 26 May 2014 11:51, Ed Maste <emaste_at_freebsd.org> wrote:
> >
> > The change in the patch is good, the new behaviour is much more
> > usable.  Note that we don't currently define "idle" in top(8); for
> > this change maybe we should just state that non-idle processes may
> > report 0% CPU due to rounding.
> 
> That said, I've discovered an issue with the change after using it a
> bit more, when using -I on the command line.  (Previously I only tried
> it by pressing I in interactive mode.)  With the change top -I lists
> all processes at first (which is a little annoying), but it renders -I
> ineffective when used with -b (batch mode).
> 
> What do you think about this additional change, so that we use the
> previous 0% idleness test for the first iteration of the list:
> 
>  	if (oldp == NULL)
> -		return (pp->ki_runtime != 0);
> +		return (pp->ki_pctcpu != 0);

Not a bad idea.  I have another change that also reworks how ki_pctcpu
is calculated.  Namely, if this is a subsequent update and we have the
kinfo_proc from the previous update (or if the process/thread in question
is a newborn since the previous update), this patch uses
delta(ki_runtime) / delta(CLOCK_UPTIME) to calculate a true %CPU across the
interval.  Unlike the scheduler-calculated %CPU, which is smoothed by a long
decay function, this makes the %CPU column in top reflect changes in
behavior much more quickly and gives a more accurate view of mostly-idle
processes and threads.  However, in actual use it has some jitter, because I
can't atomically grab a CLOCK_UPTIME value together with the kern.proc
sysctl info.  As a result, I see occasional rounding errors where a
'while (1)' loop oscillates between, for example, 98% and 102% CPU usage.
Still, if you start a while (1) loop, it jumps to the top of the list much
quicker (and falls off as soon as you pause it).  The initial fetch (and
batch mode) still uses the scheduler-calculated value.
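
As a rough illustration of the interval calculation described above (a
sketch, not the code from the patch): the helper name interval_pctcpu and
its parameters are hypothetical, and both the ki_runtime values and the
CLOCK_UPTIME readings are assumed to have been converted to microseconds
already.

```c
#include <stdint.h>

/*
 * Hypothetical helper: %CPU over one refresh interval, computed as
 * delta(runtime) / delta(uptime).  All four arguments are assumed to be
 * in microseconds.  Returns 0.0 if the samples are unusable (no time has
 * elapsed, or a counter went backwards).
 */
static double
interval_pctcpu(uint64_t old_runtime_us, uint64_t new_runtime_us,
    uint64_t old_uptime_us, uint64_t new_uptime_us)
{
	if (new_uptime_us <= old_uptime_us ||
	    new_runtime_us < old_runtime_us)
		return (0.0);
	return (100.0 * (double)(new_runtime_us - old_runtime_us) /
	    (double)(new_uptime_us - old_uptime_us));
}
```

Because the two readings are not taken atomically, a runtime delta of
2,040,000 us measured against an uptime delta of 2,000,000 us comes out as
102%, which is the kind of overshoot mentioned above.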

I also adopted phk@'s suggestion of counting changes in ru_nvcsw/ru_nivcsw
as evidence of running.
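
Combined with the first-iteration fallback quoted above, the idleness test
might look roughly like this.  This is a sketch only: struct proc_sample and
proc_was_active are hypothetical stand-ins for the kinfo_proc fields top
reads (ki_runtime, ki_pctcpu, and the ru_nvcsw/ru_nivcsw counters), not the
code from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-process sample; field names mirror the ones discussed. */
struct proc_sample {
	uint64_t runtime_us;	/* stands in for ki_runtime */
	unsigned int pctcpu;	/* stands in for the scheduler's ki_pctcpu */
	long nvcsw;		/* voluntary context switches (ru_nvcsw) */
	long nivcsw;		/* involuntary context switches (ru_nivcsw) */
};

/*
 * Returns true if the process should be treated as non-idle.  With no
 * previous sample (first iteration, batch mode) fall back to the
 * scheduler's estimate; otherwise any change in runtime or in either
 * context-switch counter counts as evidence of running.
 */
static bool
proc_was_active(const struct proc_sample *oldp, const struct proc_sample *pp)
{
	if (oldp == NULL)
		return (pp->pctcpu != 0);
	return (pp->runtime_us != oldp->runtime_us ||
	    pp->nvcsw != oldp->nvcsw ||
	    pp->nivcsw != oldp->nivcsw);
}
```

A process that woke up and went back to sleep without accumulating any
measurable runtime would still be listed, because its voluntary
context-switch count changed between the two samples.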

http://people.freebsd.org/~jhb/patches/top_pctcpu.patch

-- 
John Baldwin
Received on Tue May 27 2014 - 18:37:33 UTC
