vfs.nfsrv.async broken ? Or pilot error ?

From: Mike Tancsa <mike_at_sentex.net>
Date: Fri, 28 Jan 2005 17:08:17 -0500
I am trying to run some nfs benchmarks and wanted to see how 
vfs.nfsrv.async makes a difference on performance for FreeBSD vs Linux.

 From the client (a RELENG_5 box) if I do a simple

time dd if=/dev/zero of=/mnt/testfile3 bs=16k count=1638

The write is very quick

[ns4-new]# time dd if=/dev/zero of=/mnt/testfile3 bs=16k count=1638
1638+0 records in
1638+0 records out
26836992 bytes transferred in 0.387338 secs (69285733 bytes/sec)
0.007u 0.079s 0:00.54 12.9%     34+266k 0+3276io 0pf+0w
[ns4-new]#



But if I increase the size by 10 times, the process takes more than 100 
times the amount of time to complete, and sometimes it doesnt complete at all


[ns4-new]# time dd if=/dev/zero of=/mnt/testfile3 bs=16k count=16384

16384+0 records in
16384+0 records out
268435456 bytes transferred in 1084.463674 secs (247528 bytes/sec)
0.016u 0.509s 18:04.64 0.0%     25+219k 0+32768io 0pf+0w
[ns4-new]#


On the server side I have

[nfs]# sysctl -a vfs.nfsrv
vfs.nfsrv.nfs_privport: 0
vfs.nfsrv.async: 1
vfs.nfsrv.commit_blks: 54284
vfs.nfsrv.commit_miss: 45781
vfs.nfsrv.realign_test: 184130
vfs.nfsrv.realign_count: 0
vfs.nfsrv.gatherdelay: 10000
vfs.nfsrv.gatherdelay_v3: 0
[nfs]#

The state shows ufs when its stalled out

   PID USERNAME PRI NICE   SIZE    RES STATE    TIME   WCPU    CPU COMMAND
   653 root      -4    0  1208K   792K ufs      0:01  0.00%  0.00% nfsd
   651 root      -4    0  1208K   792K ufs      0:00  0.00%  0.00% nfsd


[nfs]# ps -auxw | grep nfsd | awk '{print "ps -auxwwwjp "$2}' | sh | grep ^root
USER   PID %CPU %MEM   VSZ  RSS  TT  STAT STARTED      TIME 
COMMAND           PPID  PGID   SID JOBC
root   649  0.0  0.0  1336 1032  ??  Ds    4:00PM   0:00.01 nfsd: master 
(nf     1   649   649    0
root   650  0.0  0.0  1208  792  ??  D     4:00PM   0:00.00 nfsd: server 
(nf   649   649   649    0
root   651  0.0  0.0  1208  792  ??  D     4:00PM   0:00.47 nfsd: server 
(nf   649   649   649    0
root   652  0.0  0.0  1208  792  ??  D     4:00PM   0:00.01 nfsd: server 
(nf   649   649   649    0
root   653  0.0  0.0  1208  792  ??  D     4:00PM   0:00.81 nfsd: server 
(nf   649   649   649    0
root   654  0.0  0.0  1208  792  ??  D     4:00PM   0:00.25 nfsd: server 
(nf   649   649   649    0
root   655  0.0  0.0  1208  792  ??  D     4:00PM   0:00.00 nfsd: server 
(nf   649   649   649    0
root   656  0.0  0.0  1208  792  ??  D     4:00PM   0:00.00 nfsd: server 
(nf   649   649   649    0
root   657  0.0  0.0  1208  792  ??  D     4:00PM   0:00.00 nfsd: server 
(nf   649   649   649    0
root   658  0.0  0.0  1208  792  ??  D     4:00PM   0:00.00 nfsd: server 
(nf   649   649   649    0
root   659  0.0  0.0  1208  792  ??  D     4:00PM   0:00.00 nfsd: server 
(nf   649   649   649    0
root   660  0.0  0.0  1208  792  ??  D     4:00PM   0:00.00 nfsd: server 
(nf   649   649   649    0
root   661  0.0  0.0  1208  792  ??  D     4:00PM   0:00.00 nfsd: server 
(nf   649   649   649    0
root   662  0.0  0.0  1208  792  ??  D     4:00PM   0:00.00 nfsd: server 
(nf   649   649   649    0
root   663  0.0  0.0  1208  792  ??  D     4:00PM   0:00.00 nfsd: server 
(nf   649   649   649    0
root   664  0.0  0.0  1208  792  ??  D     4:00PM   0:00.00 nfsd: server 
(nf   649   649   649    0
[nfs]#

Does something else need to be set tweaked for it to work ?
I have tried both with the default options for mounting via nfs
e.g.
mount_nfs 10.1.1.1:/mnt /mnt
and
mount_nfs -U -2 10.1.1.1:/mnt /mnt
with no difference.

It seems to get to a point where the network traffic actually stalls out 
(i.e. tcpdump doesnt show anything), however there is still a lot of disk 
IO going on that really should not be for the size of the file.

When using async=0, it works as expected consistently.

[ns4-new]# time dd if=/dev/zero of=/mnt/testfile3 bs=16k count=16384
16384+0 records in
16384+0 records out
268435456 bytes transferred in 30.574544 secs (8779704 bytes/sec)
0.006u 0.480s 0:31.62 1.5%      32+279k 0+32768io 0pf+0w
[ns4-new]#


[nfs]# cat /var/run/dmesg.boot
Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
         The Regents of the University of California. All rights reserved.
FreeBSD 5.3-STABLE #0: Fri Jan 28 12:47:54 EST 2005
     mdtancsa_at_5-3nfs.sentex.ca:/usr/obj/usr/src/sys/nfs
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) 4 CPU 3.00GHz (2992.51-MHz 686-class CPU)
   Origin = "GenuineIntel"  Id = 0xf41  Stepping = 1
   Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
real memory  = 2147418112 (2047 MB)
avail memory = 2096046080 (1998 MB)
ACPI APIC Table: <AOpen  AWRDACPI>
ioapic0 <Version 2.0> irqs 0-23 on motherboard
npx0: [FAST]
npx0: <math processor> on motherboard
npx0: INT 16 interface
acpi0: <AOpen AWRDACPI> on motherboard
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0
cpu0: <ACPI CPU> on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
agp0: <Intel 82865 host to AGP bridge> mem 0xfd000000-0xfd3fffff at device 
0.0 on pci0
pcib1: <PCI-PCI bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
pci1: <display, VGA> at device 0.0 (no driver attached)
pcib2: <ACPI PCI-PCI bridge> at device 3.0 on pci0
pci2: <ACPI PCI bus> on pcib2
em0: <Intel(R) PRO/1000 Network Connection, Version - 1.7.35> port 
0xc000-0xc01f mem 0xfd400000-0xfd41ffff irq 17 at device 1.0 on pci2
em0: Ethernet address: 00:01:80:53:ff:04
em0:  Speed:N/A  Duplex:N/A
pcib3: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci3: <ACPI PCI bus> on pcib3
em1: <Intel(R) PRO/1000 Network Connection, Version - 1.7.35> port 
0xd000-0xd03f mem 0xfc800000-0xfc81ffff,0xfc820000-0xfc83ffff irq 16 at 
device 4.0 on pci3
em1: Ethernet address: 00:0e:0c:5d:f7:c0
em1:  Speed:N/A  Duplex:N/A
twe0: <3ware Storage Controller. Driver version 1.50.01.002> port 
0xd100-0xd10f mem 0xfc000000-0xfc7fffff irq 17 at device 5.0 on pci3
twe0: [GIANT-LOCKED]
twe0: 4 ports, Firmware FE7S 1.05.00.056, BIOS BE7X 1.08.00.046
fwohci0: <Lucent FW322/323> mem 0xfc860000-0xfc860fff irq 22 at device 13.0 
on pci3
fwohci0: OHCI version 1.0 (ROM=1)
fwohci0: No. of Isochronous channels is 8.
fwohci0: EUI64 00:01:80:13:94:53:ff:05
fwohci0: Phy 1394a available S400, 3 ports.
fwohci0: Link S400, max_rec 2048 bytes.
firewire0: <IEEE1394(FireWire) bus> on fwohci0
sbp0: <SBP-2/SCSI over FireWire> on firewire0
fwohci0: Initiate bus reset
fwohci0: node_id=0xc800ffc0, gen=1, CYCLEMASTER mode
firewire0: 1 nodes, maxhop <= 0, cable IRM = 0 (me)
firewire0: bus manager 0 (me)
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel ICH5 UDMA100 controller> port 
0xf000-0xf00f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 31.1 on pci0
ata0: channel #0 on atapci0
ata1: channel #1 on atapci0
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
acpi_tz0: <Thermal Zone> on acpi0
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A, console
sio1: <16550A-compatible COM port> port 0x2f8-0x2ff irq 3 on acpi0
sio1: type 16550A
atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: model IntelliMouse Explorer, device ID 4
pmtimer0 on isa0
orm0: <ISA Option ROMs> at iomem 
0xcd000-0xcdfff,0xcc000-0xccfff,0xc0000-0xca7ff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x100>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
ppc0: parallel port not found.
Timecounter "TSC" frequency 2992514010 Hz quality 800
Timecounters tick every 10.000 msec
ipfw2 initialized, divert enabled, rule-based forwarding enabled, default 
to accept, logging limited to 3100 packets/entry by default
ad0: 38166MB <ST340014A/3.06> [77545/16/63] at ata0-master UDMA100
acd0: DVDROM <HL-DT-STDVD-ROM GDR8163B/0L15> at ata1-master UDMA33
twed0: <Unit 0, RAID5, Normal> on twe0
twed0: 715422MB (1465185024 sectors)
em0: Link is up 100 Mbps Full Duplex
Mounting root from ufs:/dev/ad0s1a
em0: Link is up 100 Mbps Full Duplex
em1: Link is up 1000 Mbps Full Duplex
[nfs]#


         ---Mike




--------------------------------------------------------------------
Mike Tancsa,                                      tel +1 519 651 3400
Sentex Communications,                            mike_at_sentex.net
Providing Internet since 1994                    www.sentex.net
Cambridge, Ontario Canada                         www.sentex.net/mike
Received on Fri Jan 28 2005 - 21:07:27 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:38:27 UTC