Re: Suddenly slow lstat syscalls on CURRENT from July

From: Beat Gätzi <beat_at_chruetertee.ch>
Date: Sat, 01 Jan 2011 16:37:44 +0100
On 01.01.2011 16:10, Alexander Best wrote:
> On Sat Jan  1 11, Beat Gätzi wrote:
>> Hi,
>>
>> Since a couple of days commands like ls(1) are very slow on one of my
>> tinderboxes. Checking with ktrace/kdump I see that the lstat syscall
>> takes about one second:
>>
>>  70559 ls       0.004644 CALL  lstat(0x284472f8,0x28447298)
>>  70559 ls       0.004651 NAMI  "Mk"
>>  70559 ls       0.004664 STRU  struct stat {... }
>>  70559 ls       0.004672 RET   lstat 0
>>  70559 ls       0.004688 CALL  lstat(0x2844a1b8,0x2844a158)
>>  70559 ls       0.004696 NAMI  "README,v"
>>  70559 ls       1.004245 STRU  struct stat {...}
>>  70559 ls       1.004263 RET   lstat 0
>>  70559 ls       1.004277 CALL  lstat(0x2844a2b8,0x2844a258)
>>  70559 ls       1.004286 NAMI  ".cvsignore,v"
>>  70559 ls       2.004282 STRU  struct stat {...}
>>  70559 ls       2.004302 RET   lstat 0
>>  70559 ls       2.004316 CALL  lstat(0x2844a3b8,0x2844a358)
>>  70559 ls       2.004325 NAMI  "CHANGES,v"
>>  70559 ls       3.004275 STRU  struct stat {...}
>>  70559 ls       3.004296 RET   lstat 0
>>  70559 ls       3.004310 CALL  lstat(0x2844a4b8,0x2844a458)
>>  70559 ls       3.004318 NAMI  "COPYRIGHT,v"
>>  70559 ls       4.004300 STRU  struct stat {...}
>>  70559 ls       4.004318 RET   lstat 0
>>
>> When I immediately re-execute the command in the same directory it looks
>> normal:
>>
>>  78007 ls       0.004665 CALL  lstat(0x284472f8,0x28447298)
>>  78007 ls       0.004673 NAMI  "Mk"
>>  78007 ls       0.004686 STRU  struct stat {...}
>>  78007 ls       0.004693 RET   lstat 0
>>  78007 ls       0.004708 CALL  lstat(0x2844a1b8,0x2844a158)
>>  78007 ls       0.004715 NAMI  "README,v"
>>  78007 ls       0.004728 STRU  struct stat {...}
>>  78007 ls       0.004735 RET   lstat 0
>>  78007 ls       0.004742 CALL  lstat(0x2844a2b8,0x2844a258)
>>  78007 ls       0.004749 NAMI  ".cvsignore,v"
>>  78007 ls       0.004761 STRU  struct stat {...}
>>  78007 ls       0.004768 RET   lstat 0
>>  78007 ls       0.004775 CALL  lstat(0x2844a3b8,0x2844a358)
>>  78007 ls       0.004782 NAMI  "CHANGES,v"
>>  78007 ls       0.004795 STRU  struct stat {...}
>>  78007 ls       0.004802 RET   lstat 0
>>  78007 ls       0.004809 CALL  lstat(0x2844a4b8,0x2844a458)
>>  78007 ls       0.004817 NAMI  "COPYRIGHT,v"
>>  78007 ls       0.004828 STRU  struct stat {...}
>>  78007 ls       0.004835 RET   lstat 0
>>
>> System is a HP DL360 G3 with a Compaq Smart Array 5i controller. There
>> is no evidence of a RAID controller or hard disk problem in the system
>> logs. Also writing to a file and reading from a file looks normal.
>> The running kernel is a GENERIC kernel from July without debugging options:
>>
>> # uname -a
>> FreeBSD tinderbox.chruetertee.ch 9.0-CURRENT FreeBSD 9.0-CURRENT #5
>> r209980: Tue Jul 13 11:25:50 CEST 2010
>> root_at_tinderbox.chruetertee.ch:/usr/obj/usr/home/beat/dev/src/head/sys/BEASTIE
>>  i386
> 
> no problems here.

I hadn't seen this problem during the first ~166 days since I last
rebooted the box. Then it suddenly appeared in all directories on all
partitions. The box was under load pretty much the whole time, as it
builds packages almost 24/7. I forgot to mention that the filesystem
is UFS.
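
The per-call latencies that ktrace/kdump exposed can also be watched
without tracing. A minimal sketch (my own illustration, not part of the
tinderbox setup; `time_lstats` is a made-up helper name, and it works on
any POSIX-ish system):

```python
import os
import time

def time_lstats(path="."):
    """Time each lstat() in a directory, mirroring the per-call
    latencies visible in the ktrace/kdump output above.
    Hypothetical helper, not part of the original report."""
    timings = []
    for name in os.listdir(path):
        t0 = time.monotonic()
        os.lstat(os.path.join(path, name))
        elapsed = time.monotonic() - t0
        timings.append((name, elapsed))
    return timings

for name, elapsed in time_lstats("."):
    print(f"{name:30s} {elapsed * 1000:8.3f} ms")
```

On a healthy cache every entry should come back in microseconds; entries
near one second would match the slow trace above.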

# diskinfo -ct da0
da0
        512             # sectorsize
        299992412160    # mediasize in bytes (279G)
        585922680       # mediasize in sectors
        0               # stripesize
        0               # stripeoffset
        36472           # Cylinders according to firmware.
        255             # Heads according to firmware.
        63              # Sectors according to firmware.
                        # Disk ident.

I/O command overhead:
        time to read 10MB block     0.333161 sec  =    0.016 msec/sector
        time to read 20480 sectors  3.392359 sec  =    0.166 msec/sector
        calculated command overhead               =    0.149 msec/sector

Seek times:
        Full stroke:      250 iter in   1.202862 sec =    4.811 msec
        Half stroke:      250 iter in   1.120656 sec =    4.483 msec
        Quarter stroke:   500 iter in   2.109077 sec =    4.218 msec
        Short forward:    400 iter in   1.892342 sec =    4.731 msec
        Short backward:   400 iter in   1.399378 sec =    3.498 msec
        Seq outer:       2048 iter in   0.399352 sec =    0.195 msec
        Seq inner:       2048 iter in   0.385460 sec =    0.188 msec
Transfer rates:
    outside:       102400 kbytes in   2.325967 sec =    44025 kbytes/sec
    middle:        102400 kbytes in   2.157681 sec =    47458 kbytes/sec
    inside:        102400 kbytes in   2.717089 sec =    37687 kbytes/sec
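
The seek and transfer numbers look healthy, and the "calculated command
overhead" line is simply the per-sector difference between the two timed
reads, which checks out when re-derived from the figures above:

```python
# Re-derive diskinfo's "calculated command overhead" from the two
# timed reads above: one contiguous 10 MB read vs. 20480 individual
# sector-sized reads.
sectors = 20480
block_read = 0.333161    # seconds for the 10 MB block read
single_reads = 3.392359  # seconds for 20480 sector-sized reads

per_sector_block = block_read / sectors * 1000     # msec/sector
per_sector_single = single_reads / sectors * 1000  # msec/sector
overhead = per_sector_single - per_sector_block

print(f"{per_sector_block:.3f} msec/sector (block read)")
print(f"{per_sector_single:.3f} msec/sector (single-sector reads)")
print(f"{overhead:.3f} msec/sector command overhead")
# Matches the diskinfo output: 0.016, 0.166, 0.149
```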

I also noticed that the cpu columns of the vmstat output look bogus on
this system (negative us/sy, id above 100):

# vmstat
 procs  memory     page                   disk   faults         cpu
 r b w  avm  fre   flt  re  pi  po    fr  sr da0   in   sy   cs us sy id
 0 0 0  1663M 1120M 117 0   1   0   120  21   0  172  27 245 -29 -10 139
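
One plausible explanation for those values: vmstat derives the cpu split
from deltas of cumulative tick counters (kern.cp_time), so if a counter
wraps or a snapshot is inconsistent, a delta goes negative and the
percentages no longer make sense. A sketch of that arithmetic (a
simplified three-bucket version of my own, not vmstat's actual code):

```python
def cpu_percent(prev, cur):
    """vmstat-style cpu split from two cumulative tick snapshots
    (user, system, idle).  Simplified sketch of the arithmetic only;
    the real vmstat reads kern.cp_time via sysctl."""
    deltas = [c - p for p, c in zip(prev, cur)]
    total = sum(deltas)
    return [round(100 * d / total) for d in deltas]

# Consistent snapshots give a sane split:
print(cpu_percent((100, 50, 850), (150, 75, 1275)))  # → [10, 5, 85]

# A wrapped or inconsistently sampled counter (one bucket appears to
# run backwards) yields negative us/sy and id > 100, much like the
# -29/-10/139 line above:
print(cpu_percent((500, 200, 300), (400, 165, 650)))
```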

Thanks,
Beat
Received on Sat Jan 01 2011 - 14:37:42 UTC