Re: ULE and current.

From: Andy Farkas <andyf_at_speednet.com.au>
Date: Fri, 12 Dec 2003 00:03:06 +1000 (EST)

Bruce Evans wrote:
> On Thu, 11 Dec 2003, Jeff Roberson wrote:
> > On Thu, 11 Dec 2003, Andy Farkas wrote:
> > > Jeff Roberson wrote:
> > ...
> > > And at this point I would expect something like:
> > >
> > >  sh #0 using 66.3%,
> > >  sh #1 using 66.3%,
> > >  sh #2 using 66.3%,
> > >  idle: cpu0 to be 0%,
> > >  idle: cpu1 to be 0%.
> >
> > This is actually very difficult to get exactly right.  Since all processes
> > want to run all the time, you have to force alternating pairs to share the
> > second cpu.  Otherwise they won't run for an even amount of time.
>
> Perhaps add some randomness.  Busting the caches every second or so
> shouldn't make much difference.  It happens anyway if there are more
> processes.
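
To make the "alternating pairs" idea concrete, here is a minimal sketch,
assuming three always-runnable processes, two CPUs and equal time slices.
The function name and the plain round-robin rotation are made up for
illustration; this is not how ULE actually picks threads. With an even
rotation each process ends up with about two thirds of a CPU.

```python
from collections import Counter
from itertools import cycle, islice


def simulate_rotation(nprocs=3, ncpus=2, slices=3000):
    """Return each process's CPU share, in percent of one CPU.

    Hypothetical model: in every time slice the next `ncpus` processes
    in round-robin order get to run, so the pair sharing a CPU keeps
    alternating. Assumes ncpus <= nprocs.
    """
    ran = Counter()
    order = cycle(range(nprocs))
    for _ in range(slices):
        # Take exactly `ncpus` process ids for this slice.
        for pid in islice(order, ncpus):
            ran[pid] += 1
    return {pid: 100.0 * ran[pid] / slices for pid in range(nprocs)}


shares = simulate_rotation()
for pid, pct in shares.items():
    print(f"sh #{pid}: {pct:.1f}%")
```

Without the rotation (say, two processes pinned to one CPU and the third
owning the other) the same model would give 50/50/100 instead.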

Ah, a fundamental misconception of how SMP works on my part. I'll
leave it up to you guys to figure out how best to do it.

I also have a quad cpu box to test with, so beware :)

...
> > The vm has an idle thread that zeros pages.  This is the third thread.
> >
> > >                     /0   /10  /20  /30  /40  /50  /60  /70  /80  /90  /100
> > > root     idle: cpu0 XXXXXXXXXXXXXXXX
> > > root     idle: cpu1 XXXXXXXXXXXXXXXX
> > >              <idle> XXXXXXXXXXXXXXXX
>
> No, <idle> is just cp_time[CP_IDLE] scaled incorrectly.  It is bogus now that
> we have actual idle processes.  The scaling for the idle processes seems to
> be almost correct (it is apparently scaled by the number of CPUs), but the
> scaling or the value for <idle> is apparently off by a factor of the number
> of CPUs.
...
> > > So, where *I* get confused is that top(1) thinks that the system can be up
> > > to 200% idle, whereas systat(1) thinks there are 3 threads each consuming
> > > a third of 100% idleness... who is right?
> >
> > Both, they just display different statistics. ;-)
>
> Neither; they have different bugs :-).  top actually seems to be
> bug-free here, except it intentionally displays percentages that add
> up to a multiple of 100%.  This seems to be best.  You just have to
> get used to the percentages in the CPU stat line being scaled and the
> others not being scaled.

So the almost-bug in top(1) is that some CPU percentages are scaled and
some are not scaled?
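
A minimal sketch of the two conventions as I understand them from the
description above, assuming a 2-CPU box that is completely idle over the
sampling interval. The function names and tick counts are invented for the
example; this is not top(1)'s source.

```python
def cpu_stat_line_idle(idle_ticks_per_cpu, total_ticks_per_cpu):
    """Scaled: idle time summed over all CPUs, divided by all ticks.

    The CPU states on this line always add up to 100%.
    """
    return 100.0 * sum(idle_ticks_per_cpu) / sum(total_ticks_per_cpu)


def per_process_pct(proc_ticks, ticks_one_cpu):
    """Unscaled: a process's ticks relative to a single CPU.

    Summed over all processes this can reach ncpus * 100%.
    """
    return 100.0 * proc_ticks / ticks_one_cpu


# Hypothetical sample: 2 CPUs, 100 ticks each, both fully idle.
idle = [100, 100]
total = [100, 100]

stat_line = cpu_stat_line_idle(idle, total)
per_proc = [per_process_pct(t, 100) for t in idle]

print(f"CPU states: {stat_line:.1f}% idle")
print(f"idle: cpu0 {per_proc[0]:.1f}%  idle: cpu1 {per_proc[1]:.1f}%")
```

So the stat line tops out at 100% while the per-process column can add up
to 200% on this box.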

> I now understand the case of an idle system:
>
>                     /0   /10  /20  /30  /40  /50  /60  /70  /80  /90  /100
> root     idle: cpu0 XXXXXXXXXXXXXXXX
> root     idle: cpu1 XXXXXXXXXXXXXXXX
>              <idle> XXXXXXXXXXXXXXXX
>
> This should show 50% for each "idle: cpuN" process.

Yes.

> Instead, it tries
> to show 33.3% for each idle process including the pseudo one, but has
> some rounding errors that make it display 30%.  The factor of 3 to get
> 33.3% instead of 2 to get 50% for the real idle processes is from
> bogusly counting the pseudo-idle process.  The factor of 3 to get 33.3%
> instead of 1 to get 100% for the pseudo-idle process is from bogusly
> counting the real idle processes.
>
> None of these bugs except the percentages being slightly too high are
> scheduler-dependent.
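
The arithmetic Bruce describes can be written down as a small sketch. It
assumes an idle 2-CPU system and models only the divisors he mentions; the
function names are hypothetical and none of this is taken from systat(1).

```python
NCPUS = 2
REAL_IDLE = ["idle: cpu0", "idle: cpu1"]
PSEUDO_IDLE = "<idle>"


def displayed_shares():
    """What is shown: every entry divided by the number of entries (3)."""
    entries = REAL_IDLE + [PSEUDO_IDLE]
    return {name: 100.0 / len(entries) for name in entries}


def expected_shares():
    """What should be shown: real idle processes split 100% between
    the CPUs; the pseudo entry is not diluted by the real ones."""
    shares = {name: 100.0 / NCPUS for name in REAL_IDLE}
    shares[PSEUDO_IDLE] = 100.0
    return shares


shown = displayed_shares()
wanted = expected_shares()
for name in REAL_IDLE + [PSEUDO_IDLE]:
    print(f"{name:>10}: shown {shown[name]:.1f}%, expected {wanted[name]:.1f}%")
```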

PS: You mentioned "jitter". That's why I 'sleep 120' in the above tests.
It tends to take about that long for top(1) to settle down. Why is that
so?

>
> Bruce

--

 :{ andyf_at_speednet.com.au

        Andy Farkas
    System Administrator
   Speednet Communications
 http://www.speednet.com.au/