Re: ULE scheduling oddity

From: Steve Kargl <sgk_at_troutmask.apl.washington.edu>
Date: Tue, 15 Jul 2008 11:35:09 -0700

On Tue, Jul 15, 2008 at 01:11:05PM -0500, Stephen Montgomery-Smith wrote:
> Steve Kargl wrote:
> >
> >  PID USERNAME    THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
> > 3836 kargl         1 118    0   577M   572M CPU7   7   6:37 100.00% kzk90
> > 3839 kargl         1 118    0   577M   572M CPU2   2   6:36 100.00% kzk90
> > 3849 kargl         1 118    0   577M   572M CPU3   3   6:33 100.00% kzk90
> > 3852 kargl         1 118    0   577M   572M CPU0   0   6:25 100.00% kzk90
> > 3864 kargl         1 118    0   577M   572M RUN    1   6:24 100.00% kzk90
> > 3858 kargl         1 112    0   577M   572M RUN    5   4:10 78.47% kzk90
> > 3855 kargl         1 110    0   577M   572M CPU5   5   4:29 67.97% kzk90
> > 3842 kargl         1 110    0   577M   572M CPU4   4   4:24 66.70% kzk90
> > 3846 kargl         1 107    0   577M   572M RUN    6   3:22 53.96% kzk90
> > 3861 kargl         1 107    0   577M   572M CPU6   6   3:15 53.37% kzk90
> 
> My personal experience is that WCPU is not that accurate a measure of 
> what is really going on.  It is some kind of weighted CPU time, and 
> according to the man page you have to wait for up to a minute to get an 
> accurate sense.

WCPU may indeed be misleading, but there appears to be a problem
with migrating a process to an otherwise idle CPU.  If I kill
the process on CPU0 and one of the processes on CPU6, I then see:

last pid: 65293;  load averages:  8.00,  8.33,  8.91  up 19+21:43:26  11:14:21
39 processes:  9 running, 30 sleeping
CPU: 87.5% user,  0.0% nice,  0.0% system,  0.0% interrupt, 12.5% idle
Mem: 4569M Active, 64M Inact, 163M Wired, 304K Cache, 202M Buf, 26G Free
Swap: 4096M Total, 4096M Free

  PID USERNAME    THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
65035 kargl         1 118    0   577M   572M CPU7   7  62:15 100.00% kzk90
65038 kargl         1 118    0   577M   572M CPU3   3  62:11 100.00% kzk90
65023 kargl         1 118    0   577M   572M CPU1   1  58:44 100.00% kzk90
65032 kargl         1 118    0   577M   572M CPU6   6  55:36 100.00% kzk90
65026 kargl         1 118    0   577M   572M CPU2   2  53:32 100.00% kzk90
65029 kargl         1 112    0   577M   572M CPU5   5  42:16 73.29% kzk90
65041 kargl         1 110    0   577M   572M RUN    5  41:37 66.80% kzk90
65020 kargl         1 110    0   577M   572M CPU4   4  43:45 64.36% kzk90

The 3 processes with less than 100% WCPU bounce between CPU4 and CPU5.
Nothing is ever scheduled for CPU0.
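
As a rough cross-check of the listing above, here is a minimal Python
sketch that adds up the WCPU column.  It is not part of the original
measurements.  The CPU count of 8 is an assumption inferred from the
CPU0..CPU7 labels, and the function name summarize is made up for
illustration.

```python
# WCPU values (percent) copied from the second top listing above, keyed by PID.
wcpu = {
    65035: 100.00, 65038: 100.00, 65023: 100.00, 65032: 100.00,
    65026: 100.00, 65029: 73.29, 65041: 66.80, 65020: 64.36,
}
NCPU = 8  # assumption: 8 CPUs, inferred from the CPU0..CPU7 labels


def summarize(wcpu, ncpu):
    """Summarize how much CPU the processes below 100% WCPU receive."""
    starved = {pid: w for pid, w in wcpu.items() if w < 100.0}
    total = sum(wcpu.values())
    return {
        # PIDs that are not getting a full CPU
        "starved_pids": sorted(starved),
        # combined WCPU of those PIDs, in percent of one CPU
        "starved_total": round(sum(starved.values()), 2),
        # even split if these PIDs share exactly two CPUs
        "even_share_of_two": round(200.0 / len(starved), 2) if starved else None,
        # overall idle percentage implied by the WCPU column
        "idle_pct": round(100.0 * (1.0 - total / (100.0 * ncpu)), 1),
    }


if __name__ == "__main__":
    print(summarize(wcpu, NCPU))
```

The three processes below 100% sum to roughly 204% WCPU, which is
consistent with three processes sharing two CPUs.  The 12.5% idle that
top reports equals one CPU out of eight.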

> What I tend to do is to look at the TIME's, and see how fast they tick.
> 
> Also, you can run the programs thus:
> 
> time ./kargl
> 
> and the times produced at the end tend to be a rather good measure of 
> actual percentage cpu time.  Although I can see that in your situation 
> this might be tricky to use.

I'd expect the output from time to be nearly identical for
each process, since each is running with exactly the same
input parameters.
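
For identical work, the user and system CPU times should come out
close to each other, while the elapsed (real) time grows for a process
that sits on a run queue.  A minimal sketch of the comparison time(1)
makes, in Python on a Unix-like system, follows.  The helper name
run_timed and the stand-in workload are hypothetical and are not taken
from the original runs.

```python
import resource
import subprocess
import sys
import time


def run_timed(argv):
    """Run argv to completion and return real/user/sys seconds, like time(1)."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    subprocess.run(argv, check=True)
    real = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    # Difference of the cumulative child usage covers only this child.
    user = after.ru_utime - before.ru_utime
    sys_ = after.ru_stime - before.ru_stime
    return {
        "real": real,
        "user": user,
        "sys": sys_,
        # CPU time as a percentage of elapsed time
        "cpu_pct": 100.0 * (user + sys_) / real if real > 0 else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical stand-in workload; substitute the real command here.
    print(run_timed([sys.executable, "-c", "sum(range(10**6))"]))
```

A process that is starved of CPU would be expected to show a cpu_pct
well below 100 with about the same user time as the others.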

> There is also a -C option with top that gives "raw CPU" time.  I have 
> never tried it, so I cannot speak to how good it really is.

-C doesn't appear to give anything different.

-- 
Steve