5.2 "ls -l /" hangs; ^C and kill -9 no help; NFS? 5.1 is OK

From: Chris Shenton <chris_at_shenton.org>
Date: Thu, 04 Mar 2004 10:10:46 -0500
Running -CURRENT with a new kernel built last week:

  FreeBSD PECTOPAH.shenton.org 5.2-CURRENT FreeBSD 5.2-CURRENT #18: Thu Feb 26 09:24:28 EST 2004     chris_at_PECTOPAH.shenton.org:/usr/obj/usr/src/sys/PECTOPAH  i386

When Emacs runs "ls" on /, or I do so from a shell, it hangs. Neither ^C nor
^Z helps, and "kill -9" as root has no effect on the hung processes:

chris 26189  0.0  0.1 1484  884  ??  Ds Thu11AM 0:00.02 /bin/ls -al -- /netapphome/home/irene/.
root  28121  0.0  0.1 1480  804  p5- T  Thu11AM 0:00.01 ls /netapphome/home/irene
chris 28141  0.0  0.1 1484  864  ??  Ds Thu11AM 0:00.02 /bin/ls -al -- /netapphome/.
chris 29120  0.0  0.1 1480  824  p1- D  Thu12PM 0:00.01 ls -l /netapphome
root  90027  0.0  0.1 1204  672  ??  I  Fri03AM 0:00.01 xargs -0 -n 20 ls -liTd
root   7343  0.0  0.1 1204  672  ??  I  Sat03AM 0:00.01 xargs -0 -n 20 ls -liTd
root  15820  0.0  0.1 1204  672  ??  I  Sun03AM 0:00.01 xargs -0 -n 20 ls -liTd
root  24685  0.0  0.1 1204  688  ??  I  Mon03AM 0:00.01 xargs -0 -n 20 ls -liTd
root  35781  0.0  0.1 1204  688  ??  I  Tue03AM 0:00.01 xargs -0 -n 20 ls -liTd
root  47613  0.0  0.1 1204  688  ??  I  Wed03AM 0:00.01 xargs -0 -n 20 ls -liTd
root  59320  0.0  0.1 1204  688  ??  I   3:01AM 0:00.01 xargs -0 -n 20 ls -liTd
chris 61576  0.0  0.1 1492  876  ??  Ds  9:20AM 0:00.03 /bin/ls -al -- /.
chris 61585  0.0  0.1 1492  876  ??  Ds  9:21AM 0:00.05 /bin/ls -al -- /.
chris 61592  0.0  0.1 1492  876  ??  Ds  9:21AM 0:00.03 /bin/ls -al -- /.
chris 61638  0.0  0.1 1492  876  ??  Ds  9:24AM 0:00.02 /bin/ls -al -- /.
chris 69510  0.0  0.1 1492  912  ??  Ds  9:35AM 0:00.02 /bin/ls -al -- /.
chris 69534  0.0  0.1 1492  912  p0- D   9:37AM 0:00.01 /bin/ls -al -- /.
chris 69584  0.0  0.1 1492  912  p7- D   9:40AM 0:00.01 /bin/ls -al -- /.
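
In the listing above, the processes that ignore "kill -9" are the ones whose STAT column starts with "D" (disk or other uninterruptible wait). A minimal sketch, assuming the 11-column "ps aux" layout shown above, of picking those rows out of a longer listing. The function name uninterruptible is made up for illustration and is not part of any FreeBSD tool:

```python
def uninterruptible(ps_aux_lines):
    """Return (pid, stat, command) for rows whose STAT starts with 'D'.

    Assumes "ps aux"-style columns:
    USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND
    """
    found = []
    for line in ps_aux_lines:
        # Split into at most 11 fields so the command keeps its spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header line or something that is not a process row
        stat = fields[7]
        if stat.startswith("D"):
            found.append((int(fields[1]), stat, fields[10]))
    return found
```

Fed the rows above, this would keep the "Ds" and "D" entries (the ls processes) and skip the "T" and "I" ones, such as the stopped root ls and the idle xargs jobs.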

A truss suggests it might be hanging on the NFS mount I have on an old
NetApp.  It does NOT hang with a simple "ls /netapphome" but does if I
use "ls -l /netapphome".  I had hooked up the NetApp last week but had
some problems, so I left the mounts in place.  It looks like some of the
"ls" commands I ran on those mounts last week are still stuck (Thu11AM
above must be from last week because it's 10am Thursday now).

PECTOPAH# truss ls -l /
[...]
getdirentries(0x4,0x8052000,0x1000,0x804f054)    = 1024 (0x400)
lstat("dev",0x8051248)                           = 0 (0x0)
lstat("home",0x8051348)                          = 0 (0x0)
lstat("tmp",0x8051448)                           = 0 (0x0)
lstat("usr",0x8051548)                           = 0 (0x0)
lstat("var",0x8051648)                           = 0 (0x0)
lstat("stand",0x8051748)                         = 0 (0x0)
lstat("etc",0x8051848)                           = 0 (0x0)
lstat("cdrom",0x8051948)                         = 0 (0x0)
lstat("bin",0x8051a48)                           = 0 (0x0)
lstat("boot",0x8051b48)                          = 0 (0x0)
lstat("mnt",0x8051c48)                           = 0 (0x0)
lstat("proc",0x8051d48)                          = 0 (0x0)
lstat("root",0x8051e48)                          = 0 (0x0)
lstat("sbin",0x8051f48)                          = 0 (0x0)
break(0x8054000)                                 = 0 (0x0)
lstat("sys",0x8053048)                           = 0 (0x0)
lstat(".cshrc",0x8053148)                        = 0 (0x0)
lstat(".profile",0x8053248)                      = 0 (0x0)
lstat("COPYRIGHT",0x8053348)                     = 0 (0x0)
lstat("compat",0x8053448)                        = 0 (0x0)
lstat("home.TOO_SMALL",0x8053548)                = 0 (0x0)
lstat("more",0x8053648)                          = 0 (0x0)
lstat("conf",0x8053748)                          = 0 (0x0)
lstat("rescue",0x8053848)                        = 0 (0x0)
lstat("entropy",0x8053948)                       = 0 (0x0)
lstat("lib",0x8053a48)                           = 0 (0x0)
lstat("libexec",0x8053b48)                       = 0 (0x0)
lstat("dist",0x8053c48)                          = 0 (0x0)
lstat("netapproot",0x8053d48)                    = 0 (0x0)
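
The trace above stops after the lstat of "netapproot", which suggests that a later lstat never returns. One rough way to find which entry blocks, without wedging yet another shell, is to run each lstat in a throwaway child process and stop waiting after a timeout. The sketch below is only an illustration and assumes Python is installed on the box. probe_entries and its timeout parameter are hypothetical names, and a child stuck in disk wait will most likely stay behind, just like the ls processes listed earlier:

```python
import os
import subprocess
import sys


def probe_entries(directory, timeout=5.0):
    """Return (returned, stuck): names whose lstat came back in time, and
    names whose lstat was still blocked when the timeout expired."""
    returned, stuck = [], []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Run the lstat in a child so a hung call blocks the child, not us.
        child = subprocess.Popen(
            [sys.executable, "-c",
             "import os, sys; os.lstat(sys.argv[1])", path],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        try:
            child.wait(timeout=timeout)
            returned.append(name)
        except subprocess.TimeoutExpired:
            # Deliberately no wait() after kill(): a child in
            # uninterruptible wait would make this script hang too.
            child.kill()
            stuck.append(name)
    return returned, stuck
```

Calling probe_entries("/") would list the entries of / in the second element if their lstat did not return within the timeout. Note that "returned" only means the call came back, not that it succeeded.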

Oddly, another box built from 5.1-CURRENT sources mounts the same
NFS filesystem and does not have this problem:

chris_at_Thanatos<103> uname -a
FreeBSD Thanatos.shenton.org 5.1-CURRENT FreeBSD 5.1-CURRENT #17: Wed Oct 22 18:19:18 EDT 2003     root_at_PECTOPAH.shenton.org:/usr/obj/usr/src/sys/PECTOPAH  i386
chris_at_Thanatos<104> ls -l /netapproot
total 28
d---------  18 root  wheel  20480 Mar  4 09:21 etc
drwxrwxrwt  33 root  wheel   8192 Feb 26 15:01 home

I can't umount the shares on the box that hangs because they're busy.
I can't reboot the troubled box right now because I've got a bunch of
network connections going that need to stay.

Any suggestions to help diagnose this? Should I do a send-pr on it?

Thanks.
Received on Thu Mar 04 2004 - 06:10:48 UTC
