On 01.01.2011 16:45, Kostik Belousov wrote:
> On Sat, Jan 01, 2011 at 04:37:44PM +0100, Beat Götzi wrote:
>> On 01.01.2011 16:10, Alexander Best wrote:
>>> On Sat Jan 1 11, Beat Götzi wrote:
>>>> Hi,
>>>>
>>>> For the past couple of days, commands like ls(1) have been very slow
>>>> on one of my tinderboxes. Checking with ktrace/kdump I see that each
>>>> lstat syscall takes about one second:
>>>>
>>>> 70559 ls 0.004644 CALL lstat(0x284472f8,0x28447298)
>>>> 70559 ls 0.004651 NAMI "Mk"
>>>> 70559 ls 0.004664 STRU struct stat {...}
>>>> 70559 ls 0.004672 RET lstat 0
>>>> 70559 ls 0.004688 CALL lstat(0x2844a1b8,0x2844a158)
>>>> 70559 ls 0.004696 NAMI "README,v"
>>>> 70559 ls 1.004245 STRU struct stat {...}
>>>> 70559 ls 1.004263 RET lstat 0
>>>> 70559 ls 1.004277 CALL lstat(0x2844a2b8,0x2844a258)
>>>> 70559 ls 1.004286 NAMI ".cvsignore,v"
>>>> 70559 ls 2.004282 STRU struct stat {...}
>>>> 70559 ls 2.004302 RET lstat 0
>>>> 70559 ls 2.004316 CALL lstat(0x2844a3b8,0x2844a358)
>>>> 70559 ls 2.004325 NAMI "CHANGES,v"
>>>> 70559 ls 3.004275 STRU struct stat {...}
>>>> 70559 ls 3.004296 RET lstat 0
>>>> 70559 ls 3.004310 CALL lstat(0x2844a4b8,0x2844a458)
>>>> 70559 ls 3.004318 NAMI "COPYRIGHT,v"
>>>> 70559 ls 4.004300 STRU struct stat {...}
>>>> 70559 ls 4.004318 RET lstat 0
>>>>
>>>> When I immediately re-execute the command in the same directory it
>>>> looks normal:
>>>>
>>>> 78007 ls 0.004665 CALL lstat(0x284472f8,0x28447298)
>>>> 78007 ls 0.004673 NAMI "Mk"
>>>> 78007 ls 0.004686 STRU struct stat {...}
>>>> 78007 ls 0.004693 RET lstat 0
>>>> 78007 ls 0.004708 CALL lstat(0x2844a1b8,0x2844a158)
>>>> 78007 ls 0.004715 NAMI "README,v"
>>>> 78007 ls 0.004728 STRU struct stat {...}
>>>> 78007 ls 0.004735 RET lstat 0
>>>> 78007 ls 0.004742 CALL lstat(0x2844a2b8,0x2844a258)
>>>> 78007 ls 0.004749 NAMI ".cvsignore,v"
>>>> 78007 ls 0.004761 STRU struct stat {...}
>>>> 78007 ls 0.004768 RET lstat 0
>>>> 78007 ls 0.004775 CALL lstat(0x2844a3b8,0x2844a358)
>>>> 78007 ls 0.004782 NAMI "CHANGES,v"
>>>> 78007 ls 0.004795 STRU struct stat {...}
>>>> 78007 ls 0.004802 RET lstat 0
>>>> 78007 ls 0.004809 CALL lstat(0x2844a4b8,0x2844a458)
>>>> 78007 ls 0.004817 NAMI "COPYRIGHT,v"
>>>> 78007 ls 0.004828 STRU struct stat {...}
>>>> 78007 ls 0.004835 RET lstat 0
>>>>
>>>> The system is an HP DL360 G3 with a Compaq Smart Array 5i controller.
>>>> There is no evidence of a RAID controller or hard disk problem in the
>>>> system logs. Writing to a file and reading from a file also look
>>>> normal. The running kernel is a GENERIC kernel from July without
>>>> debugging options:
>>>>
>>>> # uname -a
>>>> FreeBSD tinderbox.chruetertee.ch 9.0-CURRENT FreeBSD 9.0-CURRENT #5
>>>> r209980: Tue Jul 13 11:25:50 CEST 2010
>>>> root@tinderbox.chruetertee.ch:/usr/obj/usr/home/beat/dev/src/head/sys/BEASTIE
>>>> i386
>>>
>>> no problems here.
>>
>> I hadn't had this problem for the first ~166 days since I last rebooted
>> the box. Then the problem suddenly occurred in all directories on all
>> partitions. The box has been pretty much under load the whole time, as
>> it builds packages almost 7x24. I forgot to say that the filesystem
>> used is UFS.
>>
>> # diskinfo -ct da0
>> da0
>>     512             # sectorsize
>>     299992412160    # mediasize in bytes (279G)
>>     585922680       # mediasize in sectors
>>     0               # stripesize
>>     0               # stripeoffset
>>     36472           # Cylinders according to firmware.
>>     255             # Heads according to firmware.
>>     63              # Sectors according to firmware.
>>                     # Disk ident.
>>
>> I/O command overhead:
>>     time to read 10MB block      0.333161 sec = 0.016 msec/sector
>>     time to read 20480 sectors   3.392359 sec = 0.166 msec/sector
>>     calculated command overhead               = 0.149 msec/sector
>>
>> Seek times:
>>     Full stroke:      250 iter in  1.202862 sec = 4.811 msec
>>     Half stroke:      250 iter in  1.120656 sec = 4.483 msec
>>     Quarter stroke:   500 iter in  2.109077 sec = 4.218 msec
>>     Short forward:    400 iter in  1.892342 sec = 4.731 msec
>>     Short backward:   400 iter in  1.399378 sec = 3.498 msec
>>     Seq outer:       2048 iter in  0.399352 sec = 0.195 msec
>>     Seq inner:       2048 iter in  0.385460 sec = 0.188 msec
>> Transfer rates:
>>     outside:       102400 kbytes in  2.325967 sec = 44025 kbytes/sec
>>     middle:        102400 kbytes in  2.157681 sec = 47458 kbytes/sec
>>     inside:        102400 kbytes in  2.717089 sec = 37687 kbytes/sec
>>
>> I also observed that the cpu output of vmstat is a little bit confusing
>> on this system:
>>
>> # vmstat
>>  procs      memory      page                   disk   faults      cpu
>>  r b w     avm    fre  flt  re  pi  po   fr  sr da0   in   sy   cs us sy id
>>  0 0 0   1663M  1120M  117   0   1   0  120  21   0  172   27  245 -29 -10 139
>>
> Check the output of sysctl kern.maxvnodes and vfs.numvnodes. I suspect
> they are quite close or equal. If yes, consider increasing maxvnodes.
> Another workaround, if you have a huge nested directory hierarchy, is
> to set vfs.vlru_allow_cache_src to 1.

Thanks for the hint. kern.maxvnodes and vfs.numvnodes were equal:

# sysctl kern.maxvnodes vfs.numvnodes
kern.maxvnodes: 100000
vfs.numvnodes: 100765

I've increased kern.maxvnodes and the problem was gone until vfs.numvnodes
reached the value of kern.maxvnodes again:

# sysctl kern.maxvnodes vfs.numvnodes
kern.maxvnodes: 150000
vfs.numvnodes: 150109

As the directory structure is quite large on this server, I've now set
vfs.vlru_allow_cache_src to one.

> You did not specify how much memory your machine has, but I assume it
> is > 1GB. Anyway, increasing maxvnodes on i386 should be done very
> cautiously, since it is easy to exhaust KVA.
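[Editor's note: the check and workaround discussed above can be scripted roughly as follows. The sysctl names and values are from the thread; the +50000 increment and the fallback values (used where these sysctls don't exist, so the logic can be followed off-FreeBSD) are illustrative assumptions, not recommendations from the posters.]

```shell
#!/bin/sh
# Compare the current vnode count against the limit, as suggested above.
# Fallbacks are the numbers reported in the thread.
maxvnodes=$(sysctl -n kern.maxvnodes 2>/dev/null || echo 100000)
numvnodes=$(sysctl -n vfs.numvnodes 2>/dev/null || echo 100765)
echo "vfs.numvnodes ${numvnodes} of kern.maxvnodes ${maxvnodes}"

if [ "${numvnodes}" -ge "${maxvnodes}" ]; then
    # Print, rather than apply, the tuning from the thread (applying
    # needs root). The increment size is an arbitrary example.
    echo "vnode table is full; candidate workarounds:"
    echo "  sysctl kern.maxvnodes=$((maxvnodes + 50000))"
    echo "  sysctl vfs.vlru_allow_cache_src=1"
fi
```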
The server has 4GB of RAM. Is it possible to check how much I could
increase kern.maxvnodes without exhausting KVA?

Thanks a lot!
Beat

Received on Sat Jan 01 2011 - 15:00:54 UTC
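[Editor's note: the question was left open in the thread, but a back-of-the-envelope estimate can be sketched. Both numbers below are assumptions, not measurements: roughly 1 KB of kernel memory per cached vnode (struct vnode plus inode and namecache entries), and the i386 default of a 1 GB kernel address space (KVA_PAGES=256), of which at most a quarter is budgeted for the vnode cache.]

```shell
#!/bin/sh
# Rough upper bound on kern.maxvnodes for an i386 kernel.
kva_bytes=$((1024 * 1024 * 1024))   # 1 GB KVA: i386 default (assumed)
per_vnode=1024                      # ~1 KB per cached vnode (assumed)
budget=$((kva_bytes / 4))           # leave 3/4 of KVA for everything else
echo "rough kern.maxvnodes upper bound: $((budget / per_vnode))"
```

On the box itself, `vmstat -z | grep -i vnode` would show the actual allocation sizes and counts for the vnode-related UMA zones, and `sysctl vm.kmem_size` the kernel memory limit, which would make the estimate less hand-wavy.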