4BSD process starvation during I/O

From: Kris Kennaway <kris_at_obsecurity.org>
Date: Wed, 23 Nov 2005 15:18:37 -0500
I have noticed that when multiple identical processes (e.g. gtar or
dd) are run on 4BSD on a machine with N CPUs, N of the processes get
a higher CPU share than all the others.  As a result, these N
processes finish first, then another N, and so on.

This is true under both 4.11 and 6.0 (so in that sense it's not so
surprising), but the effect is much more pronounced on 6.0, and that
part may be possible to fix.
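
One way to observe this is to start several copies of a command at
roughly the same time and record when each one exits.  The following
is only a minimal sketch in Python, not the method used for the
numbers in this message; the function names and the placeholder
command are hypothetical, and the real gtar or dd command line would
have to be substituted.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def run_and_stamp(cmd):
    """Run cmd to completion and return the wall-clock second at which it exited."""
    subprocess.run(cmd, check=False)
    return int(time.time())


def exit_times(cmd, copies):
    """Start `copies` instances of cmd at about the same time.

    Each instance is driven by its own thread, so the timestamps reflect
    when the individual child processes finished.
    """
    with ThreadPoolExecutor(max_workers=copies) as pool:
        stamps = list(pool.map(run_and_stamp, [cmd] * copies))
    return sorted(stamps)


if __name__ == "__main__":
    # Placeholder workload (assumption): replace with the real gtar or dd command.
    placeholder = [sys.executable, "-c", "pass"]
    for stamp in exit_times(placeholder, copies=6):
        print(stamp)
```

With a real I/O-bound workload, the printed timestamps can be compared
directly with the lists below.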

Here are the exit times for 6 identical gtar processes (using the
same 4.11 gtar binary in both cases) started together on a 2-CPU
machine:

6.0:

1132776233
1132776235
1132776264
1132776265
1132776279
1132776279
      238.86 real        10.87 user       166.00 sys

You can see they finish in pairs, and there's a spread of 46 seconds
from first to last.

On 4.11:

1132775426
1132775429
1132775431
1132775432
1132775448
1132775449
      275.56 real         0.43 user       336.26 sys

They also finish in pairs, but the spread is half, at 23 seconds.
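
The spread figures can be rechecked from the exit times listed above.
This is just a small sketch of the arithmetic; the variable and
function names are made up for illustration.

```python
# Exit times (seconds since the epoch) copied from the two runs above.
EXITS_60 = [1132776233, 1132776235, 1132776264,
            1132776265, 1132776279, 1132776279]
EXITS_411 = [1132775426, 1132775429, 1132775431,
             1132775432, 1132775448, 1132775449]


def spread(exits):
    """Seconds between the first and the last process to exit."""
    return max(exits) - min(exits)


def gaps(exits):
    """Seconds between consecutive exits, in finishing order."""
    ordered = sorted(exits)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


for name, exits in (("6.0", EXITS_60), ("4.11", EXITS_411)):
    print(name, "spread:", spread(exits), "gaps:", gaps(exits))
```

For the 6.0 run the gaps come out as 2, 29, 1, 14 and 0 seconds: small
within each pair and large between pairs.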

This seems to be correlated with the rate at which the processes
perform I/O.  On a quad-CPU amd64 machine running 6.0, when I run
multiple dd processes at different offsets in an md device, I get:

268435456 bytes transferred in 1.734285 secs (154781618 bytes/sec)
268435456 bytes transferred in 1.737857 secs (154463501 bytes/sec)
268435456 bytes transferred in 1.751760 secs (153237575 bytes/sec)
268435456 bytes transferred in 3.263460 secs (82254865 bytes/sec)
268435456 bytes transferred in 3.295294 secs (81460244 bytes/sec)
268435456 bytes transferred in 3.349770 secs (80135487 bytes/sec)
268435456 bytes transferred in 4.716637 secs (56912467 bytes/sec)
268435456 bytes transferred in 4.850927 secs (55336941 bytes/sec)
268435456 bytes transferred in 4.953528 secs (54190760 bytes/sec)

They finish in groups of 3 here since the 4th CPU is being used to
drive the md worker thread (which takes up most of the CPU).  In this
case the first 3 dd processes get essentially 100% of the CPU, and the
rest get close to 0% until those 3 processes finish.
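
The grouping can be made explicit by pulling the elapsed times out of
the dd output above and averaging them three at a time.  This is a
rough sketch only; the regular expression assumes the dd summary
format shown above, and the names are illustrative.

```python
import re

# dd summary lines copied from the run above.
DD_OUTPUT = """\
268435456 bytes transferred in 1.734285 secs (154781618 bytes/sec)
268435456 bytes transferred in 1.737857 secs (154463501 bytes/sec)
268435456 bytes transferred in 1.751760 secs (153237575 bytes/sec)
268435456 bytes transferred in 3.263460 secs (82254865 bytes/sec)
268435456 bytes transferred in 3.295294 secs (81460244 bytes/sec)
268435456 bytes transferred in 3.349770 secs (80135487 bytes/sec)
268435456 bytes transferred in 4.716637 secs (56912467 bytes/sec)
268435456 bytes transferred in 4.850927 secs (55336941 bytes/sec)
268435456 bytes transferred in 4.953528 secs (54190760 bytes/sec)
"""

SUMMARY = re.compile(r"(\d+) bytes transferred in ([\d.]+) secs")


def elapsed_seconds(text):
    """Return the elapsed time of every dd summary line, sorted ascending."""
    return sorted(float(m.group(2)) for m in SUMMARY.finditer(text))


def group_means(times, size):
    """Average the sorted times in consecutive groups of `size`."""
    chunks = [times[i:i + size] for i in range(0, len(times), size)]
    return [sum(chunk) / len(chunk) for chunk in chunks]


for mean in group_means(elapsed_seconds(DD_OUTPUT), size=3):
    print(f"{mean:.2f} secs")
```

The three group means are about 1.74, 3.30 and 4.84 seconds, so each
group finishes roughly 1.5 seconds after the previous one, which is
consistent with only 3 dd processes making progress at a time.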

Perhaps this can be tweaked.

Kris

P.S. Please, no responses about how maybe someone could write a new
scheduler that doesn't have this property.



Received on Wed Nov 23 2005 - 19:18:39 UTC
