:> ii) Now I tried a fair raw throughput comparison between Linux and
:> FreeBSD.  This time I always read the same whole (Linux) partition
:> (~4GB) so that the results should be comparable.  I always used the
:> native dd (FreeBSD and Linux).  A measurement with Linux dd under
:> emulation in FreeBSD still gave the same result:
:>
:> dd if=/dev/ad0s9 bs=nnn of=/dev/null
:> FreeBSD:
:> nnn=4k: 4301789184 bytes transferred in 170.031898 secs (25299895
:...
:>
:> I notice that the rates are very similar if bs >= 16k.  Under FreeBSD
:> the raw throughput rate depends on the block size.  The read rate under
:> Linux is independent of the block size.  Is there a special reason for
:> that?
:...
:
:Last I heard Linux did not have a raw device.. this means that it will
:always be cached, and will do various read-ahead optimizations, while
:FreeBSD does not have a buffered/cooked device anymore.. w/o the cooked
:device FreeBSD has to suffer the latency of the command to the drive...
:
:--
:  John-Mark Gurney                Voice: +1 415 225 5579

Don't guess, experiment!  I'm sure Linux has a device monitor program;
if not iostat, then something else.  It should be fairly easy to
determine whether it is buffering the data.

The numbers alone don't tell the story.  You can glean a lot of
information by running a script like this:

#!/bin/csh
#
time dd if=/dev/ad0 bs=512 of=/dev/null count=131072
time dd if=/dev/ad0 bs=1k of=/dev/null count=65536
time dd if=/dev/ad0 bs=2k of=/dev/null count=32768
time dd if=/dev/ad0 bs=4k of=/dev/null count=16384
time dd if=/dev/ad0 bs=8k of=/dev/null count=8192
time dd if=/dev/ad0 bs=16k of=/dev/null count=4096
time dd if=/dev/ad0 bs=32k of=/dev/null count=2048
time dd if=/dev/ad0 bs=64k of=/dev/null count=1024
time dd if=/dev/ad0 bs=128k of=/dev/null count=512
time dd if=/dev/ad0 bs=256k of=/dev/null count=256
time dd if=/dev/ad0 bs=512k of=/dev/null count=128

In particular, you can see both the transfer rate and the user and
supervisor overheads involved in shepherding the transfer.  At some
point the transfers stop saturating the cpu and the transfer rate maxes
out, but even after that happens you can see that the larger block
sizes require less supervisor overhead.  If you don't see this sort of
marked reduction in supervisor overhead with Linux, then Linux is
probably buffering the I/O.
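As a quick cross-check on the Linux side, something along these lines
should show whether the device reads are being cached.  This is only a
sketch: /dev/hda9 is a placeholder for the actual Linux partition
device, and it assumes csh (tcsh) is installed there.

#!/bin/csh
#
# Sketch only: read the same 64MB region of the device twice.  If
# Linux is caching the device, the second pass finishes far faster
# than the first, and a 'vmstat 1' running in another window shows
# little or no block-in (bi) activity during it.
time dd if=/dev/hda9 bs=4k of=/dev/null count=16384
time dd if=/dev/hda9 bs=4k of=/dev/null count=16384

(If the GNU dd there is new enough to accept iflag=direct, that will
bypass the page cache outright and make the comparison with the FreeBSD
raw device more direct.)  For reference, here is the output of the csh
sweep above on the FreeBSD side: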
                                            TRANSFER RATE vvvvvvvv
67108864 bytes transferred in 8.996482 secs (7459456 bytes/sec)
0.007u 3.412s 0:08.99 37.9%     13+29k 0+0io 0pf+0w
^^^^^  ^^^^^          ^^^^^
USER   SUPERVISOR     CPU PERCENTAGE (note: only 37%, so raw device
                      transaction overheads are likely limiting
                      throughput)

67108864 bytes transferred in 4.563335 secs (14706101 bytes/sec)
0.000u 1.608s 0:04.56 35.0%     10+22k 0+0io 0pf+0w
67108864 bytes transferred in 2.430306 secs (27613340 bytes/sec)
0.000u 0.773s 0:02.43 31.6%     10+23k 0+0io 0pf+0w
67108864 bytes transferred in 1.538318 secs (43624834 bytes/sec)
0.000u 0.312s 0:01.53 20.2%     15+35k 0+0io 0pf+0w
67108864 bytes transferred in 1.264007 secs (53092158 bytes/sec)
0.015u 0.132s 0:01.26 11.1%     24+58k 0+0io 0pf+0w
                      ^^^^^^^^ transfer rate maxes out, but cpu time
                               continues to decrease vvvvv
67108864 bytes transferred in 1.174806 secs (57123364 bytes/sec)
0.000u 0.093s 0:01.17 7.6%      22+60k 0+0io 0pf+0w
67108864 bytes transferred in 1.208589 secs (55526618 bytes/sec)
0.000u 0.046s 0:01.20 3.3%      0+0k 0+0io 0pf+0w
67108864 bytes transferred in 1.241938 secs (54035605 bytes/sec)
0.000u 0.007s 0:01.24 0.0%      0+0k 0+0io 0pf+0w
67108864 bytes transferred in 1.208579 secs (55527089 bytes/sec)
0.007u 0.015s 0:01.20 0.8%      0+0k 0+0io 0pf+0w
67108864 bytes transferred in 1.183573 secs (56700232 bytes/sec)
0.000u 0.007s 0:01.18 0.0%      0+0k 0+0io 0pf+0w
67108864 bytes transferred in 1.200243 secs (55912731 bytes/sec)
0.000u 0.015s 0:01.20 0.8%      0+0k 0+0io 0pf+0w

Doing an 'iostat ad0 1' (in my case) at the same time in another
window, on *BSD anyway, tells you what the controller is actually being
asked to do.  In this case it is obvious that the controller is being
told to make tiny transfers and is able to do 14000+ transactions/sec,
and that even with the next step up the controller is still only able
to do 14000+ transactions/sec, which indicates that the system has hit
the controller's transaction rate limit.

      tty             ad0              cpu
 tin tout   KB/t   tps   MB/s  us ni sy in id
   0    0   0.50 14629   7.14   0  0 39  0 61
                 ^^^^^ PHYSICAL TRANSACTIONS PER SECOND VVVVVV
   0    0   0.50 14626   7.14   0  0 40  0 60
   0    0   0.50 14627   7.14   0  0 39  0 61
   0    0   0.50 14634   7.15   0  0 42  0 58
   0    0   0.50 14638   7.15   0  0 41  0 59
   0    0   0.50 14491   7.08   0  0 40  0 60
   0    0   0.50 14629   7.14   0  0 37  0 63
   0    0   0.50 14630   7.14   0  0 39  0 61
   0    0   0.73 14393  10.26   0  0 27  0 73
   0    0   1.00 14389  14.05   0  0 38  0 62   <<< note, same max tps
   0    0   1.00 14386  14.05   0  0 35  0 65
   0    0   1.00 14391  14.05   0  0 30  0 70
   0    0   1.00 13915  13.59   0  0 25  0 75
   0    0   1.86 13384  24.35   0  0 28  0 72
   0    0   2.00 13468  26.30   0  0 23  0 77   <<< nearly same max tps
      tty             ad0              cpu
 tin tout   KB/t   tps   MB/s  us ni sy in id
   0    0   2.74 12245  32.71   0  0 34  0 66
   0    0   4.00 10840  42.34   0  0 27  0 73   <<< now the tps starts to
                                                    drop with the larger
                                                    block size and the
                                                    transfer is no longer
                                                    limited by the
                                                    controller or system
   0    0   7.41  7061  51.11   0  0 16  0 84
   0    0  12.15  4500  53.37   0  0 10  0 90
   0    0  20.96  2562  52.45   0  0  7  0 93
   0    0  36.15  1461  51.57   0  0  2  0 98
   0    0  64.69   837  52.87   2  0  1  0 98
   0    0 124.72   439  53.46   0  0  2  0 98
   0    0 127.41   429  53.38   0  0  2  0 98
   0    0 127.69   406  50.62   0  0  1  0 99
   0    0 128.00   268  33.50   0  0  1  0 99

Generally speaking, since the hard drive itself will cache data off the
platter, reduced I/O bandwidth at smaller block sizes will almost
always mean that either a transaction rate limit is being hit or the
cpu is being overburdened.
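If juggling two windows is awkward, the sampling can be folded into the
script itself.  A sketch only -- the 120-sample cap and the log path
are arbitrary, and the trailing count argument assumes the stock
FreeBSD iostat:

#!/bin/csh
#
# Sketch: log one-second iostat samples in the background while the
# sweep runs, then match them up with the block sizes afterwards.
# MB/s is just KB/t times tps, so a tps column that stays pinned near
# its maximum across several block sizes is the signature of a
# transaction rate limit.
iostat ad0 1 120 >& /tmp/iostat.log &
time dd if=/dev/ad0 bs=512 of=/dev/null count=131072
time dd if=/dev/ad0 bs=64k of=/dev/null count=1024
# wait returns once iostat has written its 120 samples
wait

For example, at the 512-byte step the log shows 0.50 KB/t at roughly
14600 tps, which multiplies out to the observed ~7.14 MB/s.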
Buffered access to a raw device is not necessarily a good thing.  In
fact, most of the time you don't want it, because you already have a
caching layer on top of your raw accesses (e.g. the filesystem buffer
cache / VM cache; or, in the case of a database, the database has its
own cache, and buffered access would interfere with it).

                                        -Matt
                                        Matthew Dillon
                                        <dillon_at_backplane.com>

Received on Mon Sep 27 2004 - 02:24:24 UTC