Re: Heavy I/O blocks FreeBSD box for several seconds

From: Steve Kargl <sgk_at_troutmask.apl.washington.edu>
Date: Wed, 6 Jul 2011 11:00:01 -0700

On Wed, Jul 06, 2011 at 05:05:41PM +0000, Poul-Henning Kamp wrote:
> In message <20110706170132.GA68775_at_troutmask.apl.washington.edu>, Steve Kargl
> writes:
> 
> >I periodically ran the same type test in the 2008 post over the
> >last three years.  Nothing has changed.  I even set up an account
> >on one node in my cluster for jeffr to use.  He was too busy to
> >investigate at that time.
> 
> Isn't this just the lemming-syncer hurling every dirty block over
> the cliff at the same time ?

I don't know the answer.  Of course, having no experience in
process scheduling, I don't understand the question either ;-)

AFAICT, it is a cpu affinity issue.  If I launch n+1 MPI images
on a system with n cpus/cores, then 2 (and sometimes 3) images
are stuck on a cpu and those 2 (or 3) images ping-pong on that
cpu.  I recall trying to use renice(8) to force some load 
balancing, but vaguely remember that it did not help.
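
To make the n+1-on-n symptom concrete, here is a toy sketch in Python.
It is not a model of the FreeBSD scheduler; the tick-based model and the
function names are assumptions made purely for illustration.  It compares
images that stay stuck on one cpu with images that are rotated across all
cpus, and reports when each image finishes.

```python
def sticky_finish_times(n_cpus, n_tasks, work):
    """Toy model: task t is stuck on cpu (t % n_cpus) and never migrates.

    Each cpu runs one of its tasks per tick, round-robin.
    Returns the tick at which each task completes `work` units.
    """
    remaining = [work] * n_tasks
    finish = [None] * n_tasks
    # Per-cpu run queues; with n+1 tasks, cpu 0 gets two of them.
    queues = [[t for t in range(n_tasks) if t % n_cpus == c]
              for c in range(n_cpus)]
    tick = 0
    while any(queues):
        tick += 1
        for q in queues:
            if not q:
                continue
            t = q.pop(0)
            remaining[t] -= 1
            if remaining[t] == 0:
                finish[t] = tick
            else:
                q.append(t)  # back of this cpu's queue (ping-pong)
    return finish


def balanced_finish_times(n_cpus, n_tasks, work):
    """Toy model: one global queue; each tick the first n_cpus tasks run,
    so a different task sits out each tick."""
    remaining = [work] * n_tasks
    finish = [None] * n_tasks
    runnable = list(range(n_tasks))
    tick = 0
    while runnable:
        tick += 1
        running, runnable = runnable[:n_cpus], runnable[n_cpus:]
        for t in running:
            remaining[t] -= 1
            if remaining[t] == 0:
                finish[t] = tick
            else:
                runnable.append(t)
    return finish


if __name__ == "__main__":
    # 5 images on 4 cpus, 10 units of work each (hypothetical numbers).
    print("sticky:  ", sticky_finish_times(4, 5, 10))
    print("balanced:", balanced_finish_times(4, 5, 10))
```

In this toy model the two images sharing a cpu take roughly twice as long
as the others, whereas rotating the images spreads the slowdown evenly.
If the images synchronize with each other, the slowest one would set the
pace for the whole job.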

> To find out:  Run gstat and keep an eye on the leftmost column
> 
> The road map for fixing that has been known for years...

I'll keep this in mind the next time I upgrade the cluster.
It's currently running a Feb 10th vintage kernel, and is
under fairly heavy use at the moment.

-- 
Steve