Greetings,

On Wed, Mar 28, 2007 at 11:38:44AM +0200, Ulrich Spoerlein wrote:
> I observe a strange effect, when using the following setup: Three
> FreeBSD 6.2[1] machines on Gigabit Ethernet using em(4) interfaces.
>
> HostC is the NFS server, HostB has /net/share mounted from HostC. I
> will use HostA and HostB to demonstrate the issue. Picture this:
>
> hostA # scp 500MB hostB:/net/share/
>
> Iff the file "500MB" does not yet exist on the NFS share, I can see X
> MB/s going out of HostA, X MB/s coming in on HostB, X MB/s going out
> on HostB again and finally X MB/s coming in on HostC.
>
> If I run the scp again, I can see X MB/s going out from HostA, 2*X
> MB/s coming in on HostB and X MB/s out plus X MB/s in on HostC.
> What's happening is that HostB issues one NFS READ call for every
> WRITE call. The traffic flows like this:
>
>    ----->    ----->
>  A        B        C
>            <-----
>
> If I rm(1) the file on the NFS share, then the first scp(1) will not
> show this behaviour. It is only when overwriting files that this
> happens.
>
> The real weirdness comes into play when I simply cp(1) from HostB
> itself like this:
>
> hostB # cp 500MB /net/share/
>
> I can do this over and over again and _never_ get any noteworthy
> amount of NFS READ calls, only WRITE. The network traffic is also as
> you would expect.
>
> Then I tested using ssh(1) instead of scp(1), like this:
>
> hostA # cat 500MB | ssh hostB "cat >/net/share/500MB"
>
> This works, too. Probably because sh(1) is truncating the file?
>
> So, can someone please explain to me what is happening and if/how it
> can be avoided?

My first guess is that scp and Samba use too small an I/O block size.

Forget NFS and simply imagine that an application issues writes in
128-byte blocks while the disc block size is 512 bytes. If the OS is
simple, like MS-DOS :-), then it will read the whole disc block each
time and replace just 128 bytes in it on every write the application
makes. If the OS is a bit more sophisticated, say FreeBSD ;-), it will
use a buffer cache to alleviate the disc churn. However, it will still
have to read each disc block once, on the first small write to it,
because it has no way to know that the application is about to
overwrite the whole of the disc block. So each disc block is read once
and written once; the read happens at all only because of the poor
choice of write block size.

Of course, my scenario implies that the file already contains data and
the writes go over it, not beyond the end of file. That is why you see
the reads only when overwriting: a freshly created or truncated file
has no old blocks to read back, which also answers your guess about
"cat >file" and sh(1). Something similar (but maybe a bit more
complex) should be going on in your case.
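If you want to watch the effect without scp in the picture, below is a
minimal sketch (mine, not tested against your exact setup) that
overwrites a file in deliberately tiny 128-byte chunks. The file name
and the optional "trunc" argument are made up for illustration. Run it
twice against a file on the NFS mount while watching nfsstat(1) on the
client: the second run should show READ RPCs alongside the WRITEs;
pass "trunc", which adds O_TRUNC to the open, and the READs should
vanish, just like in your "cat >/net/share/500MB" case.

/*
 * overwrite.c -- rewrite a file in chunks much smaller than the
 * underlying block size, to provoke read-modify-write on a file
 * that already contains data.  Build: cc -o overwrite overwrite.c
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CHUNK	128		/* deliberately tiny write size */
#define NCHUNKS	(4 * 1024)	/* 128 * 4096 = 512 KB total */

int
main(int argc, char **argv)
{
	char buf[CHUNK];
	int fd, flags, i;

	if (argc < 2) {
		fprintf(stderr, "usage: %s file [trunc]\n", argv[0]);
		exit(1);
	}
	/*
	 * Without O_TRUNC the old blocks stay valid, so each partial
	 * write forces the client to read the block first.  With
	 * O_TRUNC the old data is thrown away up front and there is
	 * nothing left to read back.
	 */
	flags = O_WRONLY | O_CREAT;
	if (argc > 2 && strcmp(argv[2], "trunc") == 0)
		flags |= O_TRUNC;
	if ((fd = open(argv[1], flags, 0644)) == -1) {
		perror("open");
		exit(1);
	}
	memset(buf, 'x', sizeof(buf));
	for (i = 0; i < NCHUNKS; i++)
		if (write(fd, buf, sizeof(buf)) != sizeof(buf)) {
			perror("write");
			exit(1);
		}
	close(fd);
	return (0);
}

The 128/512 numbers are only for the thought experiment; in the NFS
case the relevant unit is the client's buffer/NFS block size
(typically several KB), and the application's writes merely have to be
smaller than, or unaligned to, that.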
-- 
Yar