Re: Apparent strange disk behaviour in 6.0

From: Scott Long <scottl_at_samsco.org>
Date: Thu, 28 Jul 2005 15:36:45 -0600
Julian Elischer wrote:
> 
> 
> Julian Elischer wrote:
> 
>>
>>
>> I've been playing around with some raid arrays.
>> I've noticed some odd things.
>>
>>
> [stuff]
> 
> Ok I've done some researching..
> 
> it APPEARS that the system is swapping out running programs in order to 
> store more write data!
> 
> experiment:
> boot to single user mode.
> type:
> mount {big partition}
> dd if=/dev/zero bs=128K of=/{big partition}/bigfile count=1000000
> 
> notice that after a short while your dd is killed because the system is 
> out of swapspace.
> (it doesn't have any)
> Why the F*ck does it need swapspace? There are exactly 2 processes 
> running in userspace
> and one of them is in wait4(). dd shows a resident size of about 170KB
> leaving about a GIGABYTE of unused RAM.
> 
> The system should make dd wait rather than trying to swap its pages out..
> 
> 
> if you then do
> swapon (your swap device)
> and repeat the command in the background,
> vmstat 1 will show you pages being faulted in and out...
> no WONDER IO goes to hell in a handbasket..
> 
> Outgoing IO should never be able to force running programs out!
> It should start re-using old pages from the same file!

I've seen this too.  It's especially easy to trigger if you do a large
buildworld, especially using -j.

> 
> 4.11 gives a consistent 65MB/sec with this array, for as long as I run 
> it..
> 6.0 gives me 65MB/sec for 15 seconds and then it drops to 20MB/sec and then 
> 10MB/sec
> and the swap disk bursts into life.
> 
> the array goes from all the lights solidly on, to bursts of activity 
> with large gaps in between them.
> 
> 

I think that it's time for F/S, disk, and VM guys to sit down in a room
and start figuring out how to rein things back in.  We've basically had
little real-world checks on this kind of stuff for 5 years (i.e. since
5-CURRENT), and it's quite possible that things have gotten massively
un-tuned, non-obvious but critical backpressure codepaths have been
garbage collected, etc.
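
As a starting point for that kind of real-world check, here is a rough
sketch (not anything from the tree) that does the same sequential write as
the dd test quoted above and records throughput per interval, so a drop
like 65MB/sec to 10MB/sec shows up as numbers rather than as blinking
lights. The function name, its parameters, and the choice of Python are my
own assumptions for illustration; the 128K block size is taken from the
quoted dd command.

```python
import time


def write_throughput(path, block_size=128 * 1024, count=1000, interval=1.0):
    """Write `count` zero-filled blocks to `path`, like dd if=/dev/zero.

    Returns a list of MB/sec samples, roughly one per `interval` seconds,
    plus a final sample for whatever was written after the last full
    interval. Hypothetical helper, for illustration only.
    """
    block = bytes(block_size)          # one block of zeros
    samples = []
    written = 0                        # bytes written in the current interval
    start = time.monotonic()
    with open(path, "wb") as f:
        for _ in range(count):
            f.write(block)
            written += block_size
            now = time.monotonic()
            if now - start >= interval:
                samples.append(written / (now - start) / 1e6)
                written = 0
                start = now
    if written:
        # Guard against a zero elapsed time on coarse clocks.
        elapsed = max(time.monotonic() - start, 1e-9)
        samples.append(written / elapsed / 1e6)
    return samples
```

Something like write_throughput("/mnt/big/bigfile", count=1000000), with the
path being a placeholder for a file on the array under test, would give one
figure per second to compare between 4.11 and 6.0. Like dd, it does not
fsync, so the early samples mostly measure how fast the buffer cache fills.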

Scott
Received on Thu Jul 28 2005 - 19:37:09 UTC
