On Friday, 19 November 2004 13:56, Robert Watson wrote:
> On Fri, 19 Nov 2004, Emanuel Strobl wrote:
> > On Thursday, 18 November 2004 13:27, Robert Watson wrote:
> > > On Wed, 17 Nov 2004, Emanuel Strobl wrote:
> > > > I really love 5.3 in many ways but here're some unbelievable transfer
[...]
> Well, the claim that if_em doesn't benefit from polling is inaccurate in
> the general case, but quite accurate in the specific case. In a box with
> multiple NICs, using polling can make quite a big difference, not just by
> mitigating interrupt load, but also by helping to prioritize and manage
> the load, preventing livelock. As I indicated in my earlier e-mail,

I understand, thanks for the explanation.

> It looks like the netperf TCP test is getting just under 27MB/s, or
> 214Mb/s. That does seem on the low side for the PCI bus, but it's also

Not sure if I understand that sentence correctly: does it mean the "slow"
400MHz PII is causing this limit? (low side for the PCI bus?)

> instructive to look at the netperf UDP_STREAM results, which indicate that
> the box believes it is transmitting 417Mb/s but only 67Mb/s are being
> received or processed fast enough by netserver on the remote box. This
> means you've achieved a send rate to the card of about 54Mb/s. Note that
> you can actually do the math on cycles/packet or cycles/byte here -- with
> TCP_STREAM, it looks like some combination of recipient CPU and latency
> overhead is the limiting factor, with netserver running at 94% busy.

Hmm, I can't quite piece a picture together from this.

> Could you try using geom gate to export a malloc-backed md device, and see
> what performance you see there? This would eliminate the storage round

It's a pleasure:

test2:~#15: dd if=/dev/zero of=/mdgate/testfile bs=16k count=6000
6000+0 records in
6000+0 records out
98304000 bytes transferred in 5.944915 secs (16535812 bytes/sec)

test2:~#17: dd if=/mdgate/testfile of=/dev/null bs=16k
6000+0 records in
6000+0 records out
98304000 bytes transferred in 5.664384 secs (17354755 bytes/sec)

This time there is no difference between the disk and the memory filesystem,
but on another machine with an ICH2 chipset and a 3ware controller (my
current production system, which I am trying to replace with this project)
there was a big difference. Attached is the corresponding message.

Thanks,

-Harry

> trip and guarantee the source is in memory, eliminating some possible
> sources of synchronous operation (which would increase latency, reducing
> throughput). Looking at CPU consumption here would also be helpful, as it
> would allow us to reason about where the CPU is going.
>
> > I was aware of that, and since I lack a GbE switch anyway I decided
> > to use a simple cable ;)
>
> Yes, this is my favorite configuration :-).
>
> > > (5) Next, I'd measure CPU consumption on the end box -- in particular,
> > > use top -S and systat -vmstat 1 to compare the idle condition of the
> > > system and the system under load.
> >
> > I additionally added these values to the netperf results.
>
> Thanks for your very complete and careful testing and reporting :-).
>
> Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
> robert_at_fledge.watson.org    Principal Research Scientist, McAfee Research
>
> _______________________________________________
> freebsd-current_at_freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe_at_freebsd.org"
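For anyone wanting to repeat the geom gate test above, a minimal sketch of
one way the malloc-backed md export could be set up on 5.3 follows. The
exact topology used in this thread isn't spelled out, so the addresses, the
md unit and size, and the mount point below are assumptions for
illustration only:

    # on the box donating its memory (assumed address 192.168.1.1)
    mdconfig -a -t malloc -s 128m -u 10        # memory-backed disk appears as /dev/md10
    echo "192.168.1.2 RW /dev/md10" > /etc/gg.exports
    ggated                                     # serves the devices listed in /etc/gg.exports

    # on the box running the dd tests (assumed address 192.168.1.2)
    ggatec create -o rw -u 0 192.168.1.1 /dev/md10   # remote md shows up as /dev/ggate0
    newfs /dev/ggate0
    mkdir -p /mdgate
    mount /dev/ggate0 /mdgate

Set up this way, the dd runs above go through ggate0 to the other box's
memory, so the figures reflect the network and ggate path rather than any
local disk.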
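Likewise, a hedged sketch of the kind of netperf runs and CPU observation
being discussed; the peer address, test length and UDP message size are
assumptions, since the exact options used in the thread aren't quoted here:

    # on the receiving box
    netserver                   # netperf's companion daemon (listens on port 12865)
    top -S                      # in a second terminal: system processes and idle time
    systat -vmstat 1            # or: interrupts, context switches and CPU split per second

    # on the sending box
    netperf -H 192.168.1.2 -t TCP_STREAM -l 30
    netperf -H 192.168.1.2 -t UDP_STREAM -l 30 -- -m 8192

As a back-of-the-envelope version of the cycles-per-byte arithmetic Robert
mentions: if the 400MHz PII is the box running netserver at 94% busy while
sinking just under 27MB/s, that works out to roughly 400e6 * 0.94 / 27e6,
i.e. about 14 CPU cycles per byte received.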
attached mail follows:
On Tuesday, 2 November 2004 19:56, Doug White wrote:
> On Tue, 2 Nov 2004, Robert Watson wrote:
> > On Tue, 2 Nov 2004, Emanuel Strobl wrote:
> > > It's an IDE RAID controller (3ware 7506-4, a real one) and the file is
> > > indeed huge, but not abnormally so. I have a hard-disk video recorder,
> > > so I have lots of 700MB files. Also, if I copy my photo collection from
> > > the server it takes 5 minutes, but copying _to_ the server takes almost
> > > 15 minutes, and the average file size is 5MB. Fast Ethernet isn't
> > > really suitable for my needs, but at least 10MB/s should be reached. I
> > > can't imagine I'll get better speeds when I upgrade to GbE (which the
> > > important boxes already are, just not the switch), because NFS in its
> > > current state isn't able to saturate a 100baseTX line, at least in one
> > > direction. That's the really astonishing thing for me: why does reading
> > > saturate 100baseTX while writing reaches only a third of that?
> >
> > Have you tried using tcpdump/ethereal to see if there's any significant
> > packet loss (for good reasons or not) going on? Lots of RPC retransmits
> > would certainly explain the lower performance, and if that's not it, it
> > would be good to rule out. The traces might also provide some insight
> > into the specific I/O operations, letting you see what block sizes are in
> > use, etc. I've found that dumping to a file with tcpdump and reading
> > with ethereal is a really good way to get a picture of what's going on
> > with NFS: ethereal does a very nice job decoding the RPCs, as well as
> > figuring out what packets are related to each other, etc.
>
> It'd also be nice to know the mount options (NFS block sizes in
> particular).

I haven't done intensive wire-dumps yet, but I have found some oddities. My
main problem seems to be the 3ware controller in combination with NFS.

If I create a malloc-backed md0, I can push more than 9MB/s to it with UDP
and more than 10MB/s with TCP (both without modifying the r/w sizes). I can
also copy a 100MB file from twed0s1d to twed0s1e (so from and to the same
RAID5 array, which is the worst case) at 15MB/s, so the array can't be the
bottleneck. Only when I push to the RAID5 array via NFS do I get just
4MB/s, no matter whether I use UDP, TCP or nonstandard r/w sizes.

The next thing I found is that if I tune -w to anything higher than the
standard 8192, the average transfer rate for one big file degrades with UDP
but increases with TCP (as I would expect). UDP transfer seems to hiccup
with -w tuned: transfer rates peak at 8MB/s, but the next second they stay
at 0-2MB/s (watched with systat -vm 1), whereas with TCP everything runs
smoothly, regardless of the -w value.

Now back to my real problem: can you imagine NFS and twe blocking each
other, or something like that? Why do I get such bad transfer rates when
both parts are in use, while every single part on its own seems to work
fine?

Thanks for any help,

-Harry
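Since the wire-dumps suggested above haven't been done yet, a minimal
sketch of that capture workflow (interface name, server address and capture
file name are assumptions):

    # on the NFS client, while reproducing one slow write
    tcpdump -i fxp0 -s 0 -w nfs-write.pcap host 192.168.1.1
    # stop with ^C once the copy finishes, then look at the decoded RPCs
    ethereal nfs-write.pcap

Retransmitted write RPCs (requests reusing the same XID) and the block
sizes actually negotiated on the wire should both be visible in ethereal's
NFS decode, which is the information asked for above.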
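And, since the mount options were asked about, a sketch of making the NFS
block sizes explicit on the client; the server path, mount point and sizes
are assumptions:

    # NFS over UDP with an enlarged write size
    mount_nfs -r 8192 -w 16384 192.168.1.1:/raid /mnt/nfs
    # the same mount over TCP
    mount_nfs -T -r 8192 -w 16384 192.168.1.1:/raid /mnt/nfs

One plausible, though unconfirmed, factor in the UDP hiccups described
above: with -w larger than the Ethernet MTU, each UDP write RPC is carried
in several IP fragments, and losing any one fragment discards the whole RPC
and forces a timeout-driven retransmit, whereas TCP recovers from the same
loss far more gracefully.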