On Wed, Dec 10, 2008 at 07:06:20PM +0200, Kostik Belousov wrote: > ... > > What concerns me is that even if the attempted unmount gets EBUSY, the > > user-level process descending the directory hierarchy is getting ENOENT > > trying to issue fstatfs() against an open file descriptor. > > > > I'm having trouble figuring out any way that makes any sense. > > Basically, the problem is that NFS uses shared lookup, and this allows > for the bug where several negative namecache entries are created for > non-existent node. Then this node gets created, removing only the first > negative namecache entry. For some reasons, vnode is reclaimed; amd' > tasting of unmount is a good reason for vnode to be reclaimed. > > Now, you have existing path and a negative cache entry. This was > reported by Peter Holm first, I listed relevant revisions that > should fix this in previous mail. Well, I messed up the machine I had been using for testing, and needed to wait for IT to do something to it since I don't have physical or console access to it. So after I happened to demonstrate the effect using my desktop -- which had been running RELENG_7_1, sources updated as of around 0400 hrs. US/Pacific -- I decided to go ahead and update the desktop to RELENG_7_1 as of this morning (which had the commit to sys/kern/vfs_cache.c), then test. It still failed, apparently in the same way; details below. First, here's a list of the files that were changed: U lib/libarchive/archive_read_support_format_iso9660.c U lib/libarchive/archive_string.c U lib/libarchive/archive_string.h U lib/libc/gen/times.3 U lib/libc/i386/sys/pipe.S U lib/libc/i386/sys/reboot.S U lib/libc/i386/sys/setlogin.S U lib/libutil/Makefile U lib/libutil/kinfo_getfile.c U lib/libutil/kinfo_getvmmap.c U lib/libutil/libutil.h U share/man/man4/bce.4 U share/man/man5/Makefile U share/man/man5/fstab.5 U share/man/man5/nullfs.5 U sys/amd64/Makefile U sys/boot/forth/loader.conf.5 U sys/dev/ale/if_ale.c U sys/dev/bce/if_bce.c U sys/dev/cxgb/cxgb_main.c U sys/dev/cxgb/common/cxgb_ael1002.c U sys/dev/cxgb/common/cxgb_t3_hw.c U sys/dev/cxgb/common/cxgb_xgmac.c U sys/dev/re/if_re.c U sys/fs/nullfs/null_vnops.c U sys/kern/Make.tags.inc U sys/kern/kern_descrip.c U sys/kern/kern_proc.c U sys/kern/vfs_cache.c U sys/netinet/in_pcb.h U sys/pci/if_rlreg.h U sys/sys/sysctl.h U sys/sys/user.h U sys/ufs/ufs/ufs_quota.c U usr.bin/procstat/Makefile U usr.bin/procstat/procstat_files.c U usr.bin/procstat/procstat_vm.c U usr.bin/tar/util.c U usr.bin/tar/test/Makefile U usr.bin/tar/test/test_strip_components.c U usr.bin/tar/test/test_symlink_dir.c U usr.bin/xargs/xargs.1 U usr.sbin/mtree/mtree.c We see that sys/kern/vfs_cache.c is, indeed, among them. And: dwolf-bsd(7.1-P)[5] grep '\$FreeBSD' /sys/kern/vfs_cache.c __FBSDID("$FreeBSD: src/sys/kern/vfs_cache.c,v 1.114.2.3 2008/12/09 16:20:58 kib Exp $"); dwolf-bsd(7.1-P)[6] That should correspond to the desired version of the file. Here we see an excerpt from the ktrace output for the amd(8) process and its children; this is a point when amd(8) is trying an unmount() to see if it can get away with it: 977 amd 1229033597.269612 CALL gettimeofday(0x807ad48,0) 977 amd 1229033597.269620 RET gettimeofday 0 977 amd 1229033597.269630 CALL sigprocmask(SIG_BLOCK,0xbfbfeaec,0xbfbfeadc) 977 amd 1229033597.269637 RET sigprocmask 0 977 amd 1229033597.269645 CALL fork 977 amd 1229033597.273810 RET fork 1712/0x6b0 1712 amd 1229033597.273811 RET fork 0 977 amd 1229033597.273836 CALL sigprocmask(SIG_SETMASK,0xbfbfeadc,0) 1712 amd 1229033597.273845 CALL getpid 977 amd 1229033597.273850 RET sigprocmask 0 1712 amd 1229033597.273855 RET getpid 1712/0x6b0 977 amd 1229033597.273864 CALL gettimeofday(0x807ad48,0) 977 amd 1229033597.273874 RET gettimeofday 0 1712 amd 1229033597.273878 CALL unmount(0x2832c610,<invalid>0) ... 1712 amd 1229033597.352643 RET unmount -1 errno 16 Device busy 1712 amd 1229033597.352695 CALL sigprocmask(SIG_BLOCK,0x28097c00,0xbfbfea0c) 1712 amd 1229033597.352728 RET sigprocmask 0 1712 amd 1229033597.352751 CALL sigprocmask(SIG_SETMASK,0x28097c10,0) 1712 amd 1229033597.352769 RET sigprocmask 0 1712 amd 1229033597.352781 CALL sigprocmask(SIG_BLOCK,0x28097c00,0xbfbfe9dc) 1712 amd 1229033597.352790 RET sigprocmask 0 1712 amd 1229033597.352801 CALL sigprocmask(SIG_SETMASK,0x28097c10,0) 1712 amd 1229033597.352805 RET sigprocmask 0 1712 amd 1229033597.352815 CALL exit(0x10) 977 amd 1229033597.353085 RET select -1 errno 4 Interrupted system call 977 amd 1229033597.353093 PSIG SIGCHLD caught handler=0x805de50 mask=0x0 code=0x0 977 amd 1229033597.353103 CALL wait4(0xffffffff,0xbfbfe83c,WNOHANG,0) 977 amd 1229033597.353116 RET wait4 1712/0x6b0 977 amd 1229033597.353122 CALL wait4(0xffffffff,0xbfbfe83c,WNOHANG,0) 977 amd 1229033597.353127 RET wait4 -1 errno 10 No child processes So amd(8) master process (pid 977) jorks off a child (pid 1712) to try an umount(), which it initiates at 1229033597.273878. At 1229033597.352643 the child gets control back, as well as an EBUSY, which I would expect to mean that the attempt failed. The child exits at 1229033597.352815 with a status code of 16. Armed with that, we look at a ktrace excerpt from "rm -fr": 1660 rm 1229033597.283277 CALL rmdir(0x2822b388) 1660 rm 1229033597.283283 NAMI "stvef-paks" 1660 rm 1229033597.285599 RET rmdir 0 1660 rm 1229033597.285620 CALL stat(0x2822b3e8,0xbfbfe8dc) 1660 rm 1229033597.285626 NAMI "stvef-server" 1660 rm 1229033597.286071 STRU struct stat {dev=83951372, ino=20124614, mode=drwxr-xr-x , nlink=3, uid=9874, gid=929, rdev=0, atime=1228844788, stime=1227555769, ctime=1228845828.326650000, birthtime=0, size=4096, blksize=4096, blocks=8, flags=0x0 } 1660 rm 1229033597.286078 RET stat 0 1660 rm 1229033597.286084 CALL open(0x2822b3e8,O_NONBLOCK,<unused>0x1) 1660 rm 1229033597.286091 NAMI "stvef-server" 1660 rm 1229033597.287145 RET open 4 1660 rm 1229033597.287154 CALL fstat(0x4,0xbfbfe8dc) 1660 rm 1229033597.287161 STRU struct stat {dev=83951372, ino=20124614, mode=drwxr-xr-x , nlink=3, uid=9874, gid=929, rdev=0, atime=1228844788, stime=1227555769, ctime=1228845828.326650000, birthtime=0, size=4096, blksize=4096, blocks=8, flags=0x0 } 1660 rm 1229033597.287166 RET fstat 0 1660 rm 1229033597.287171 CALL fcntl(0x4,F_SETFD,FD_CLOEXEC) 1660 rm 1229033597.287177 RET fcntl 0 1660 rm 1229033597.287187 CALL fstatfs(0x4,0xbfbfe704) 1660 rm 1229033597.287195 RET fstatfs -1 errno 2 No such file or directory 1660 rm 1229033597.287202 CALL close(0x4) 1660 rm 1229033597.287211 RET close 0 [Sorry for the long lines....] Here we see that the "rm" process (pid 1660) removed a directory named stvef-paks sucessfully in the interval between 1229033597.283277 (when the request was made) and 1229033597.285599 (when the 0 return occurred). The "rm" process proceeds to process a directory named stvef-server: * At 1229033597.285620 it issues a stat(); the successful return is at 1229033597.286078. * At 1229033597.286084 it issues an open(); the successful return is at 1229033597.287145. The FD is 4. * At 1229033597.287154 it issues an fstat() against FD 4; the successful return is at 1229033597.287166. * At 1229033597.287171 it issues an fcntl() against FD 4; the successful return is at 1229033597.287177. * At 1229033597.287187 it issues an fstatfs() against FD 4; the unsuccessful return is at 1229033597.287195, claiming ENOENT. Say WHAT??!? I expect to be able to test a bit more promptly now. Peace, david -- David H. Wolfskill david_at_catwhisker.org Depriving a girl or boy of an opportunity for education is evil. See http://www.catwhisker.org/~david/publickey.gpg for my public key.
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:38 UTC