On 06/07/2011 19:05, Poul-Henning Kamp wrote:
> In message <20110706170132.GA68775_at_troutmask.apl.washington.edu>, Steve Kargl writes:
>
>> I periodically ran the same type test in the 2008 post over the
>> last three years. Nothing has changed. I even set up an account
>> on one node in my cluster for jeffr to use. He was too busy to
>> investigate at that time.
>
> Isn't this just the lemming-syncer hurling every dirty block over
> the cliff at the same time ?

Occasionally there have been reports of "something" (tm) which causes
CPU-bound processes to stall / starve when heavy file system IO is
present. I think I have also noticed this occasionally, but it was
never serious enough to pursue: only X11 lagging. The problem is that
all of this is sporadic and thus anecdotal.

AFAIK, the "lemming-syncer" behaviour shouldn't stall anything if it's
the only thing which is "wrong", right?

I know of one issue which might seemingly stall all IO: since there is
only one IO queue, if it is filled with requests which take a long
time, all other IO is blocked. As an example, doing simultaneous writes
to a slow USB flash stick and to a hard drive will soon result in the
queue being filled with slow USB requests, which will by the nature of
the queue "push out" the fast disk requests, making the drive look very
slow (this is most noticeable with a large hirunningspace). But this
doesn't seem to correlate directly with the OP's problem.

Maybe this particular problem can be tested by having two drives: one
to provoke this kind of stalling, and one to test whether any IO can be
done on it while the stall happens on the first one.

Received on Thu Jul 07 2011 - 10:21:16 UTC
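
To illustrate the single-queue effect described above, here is a toy
simulation. It is only a sketch under stated assumptions and not a model
of the actual FreeBSD buffer code: the function name simulate(), the
fixed number of in-flight slots (standing in for hirunningspace), the
strict alternation between the two writers and the service times are
all made up for illustration.

```python
def simulate(slots, usb_ms, disk_ms, duration_ms, usb_active=True):
    """Count disk requests completed in duration_ms (1 ms time steps).

    Assumptions (hypothetical, for illustration only):
    - both writers share one pool of `slots` in-flight requests;
    - free slots are handed out alternately to the USB and disk writers;
    - each device services its own requests one at a time.
    """
    usb_q = disk_q = 0          # in-flight requests per device
    usb_left = disk_left = 0    # ms left on the request being serviced
    disk_done = 0
    turn = 0                    # 0: USB writer is next, 1: disk writer

    for _ in range(duration_ms):
        # Writers refill every free slot in the shared pool.
        while usb_q + disk_q < slots:
            if usb_active and turn == 0:
                usb_q += 1
            else:
                disk_q += 1
            turn ^= 1

        # The USB stick works on its current request for 1 ms.
        if usb_q:
            if usb_left == 0:
                usb_left = usb_ms
            usb_left -= 1
            if usb_left == 0:
                usb_q -= 1

        # The hard drive does the same, but its requests are short.
        if disk_q:
            if disk_left == 0:
                disk_left = disk_ms
            disk_left -= 1
            if disk_left == 0:
                disk_q -= 1
                disk_done += 1

    return disk_done


if __name__ == "__main__":
    alone = simulate(16, usb_ms=50, disk_ms=1, duration_ms=10000,
                     usb_active=False)
    shared = simulate(16, usb_ms=50, disk_ms=1, duration_ms=10000)
    print("disk requests completed, disk alone:  ", alone)
    print("disk requests completed, USB writing: ", shared)
```

In this model the slow requests gradually occupy almost every slot, so
the disk only gets a slot when a USB request finishes, and its
throughput drops to roughly the USB completion rate. Whether the real
system behaves this way is exactly what the two-drive test would have
to show.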