vnode locking bugs in -current NFS server code

From: Don Lewis <truckman_at_FreeBSD.org>
Date: Fri, 25 Apr 2003 01:44:38 -0700 (PDT)
I just exercised the NFS server code on my -current box for the first
time and stumbled across some vnode locking bugs that were caught by the
DEBUG_VFS_LOCKS configuration option.  I found problems in
nfsrv_lookup() and nfsrv_create().

There are a number of places in the NFS server code that call VOP_GETATTR()
on the vnode returned through the retdirp parameter to nfs_namei().
VOP_GETATTR() wants the vnode to be locked, but nfs_namei() does not
explicitly lock this vnode. This is the directory used by nfs_namei() as
the starting point of its lookup. For normal NFS lookups, I believe it
will be the same as the parent directory of the filesystem object being
looked up, because normal NFS lookups only process one pathname
component at a time and the server doesn't follow symlinks.  This is not
true of WebNFS.

The vnode may end up being locked if the caller passed the LOCKPARENT
flag and retdirp ends up pointing to the parent vnode returned by
nfs_namei(), or possibly if nfs_namei() follows a symlink in the WebNFS
case and retdirp and the leaf object are the same vnode.  Because of
this, it is not safe for the code that calls nfs_namei() to just call
vn_lock() before calling VOP_GETATTR().  It is also unsafe because
another process could be attempting to lock vnodes in a different order
at the same time, causing a deadlock.
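
To make the first hazard concrete, here is a minimal user-space sketch, not
kernel code. Everything named toy_* (and startdir_lock_would_block()) is
hypothetical and only models a non-recursive exclusive lock; it assumes the
lookup returns the leaf locked and leaves the parent locked when the
LOCKPARENT-style flag is set. It does not model the lock-order deadlock
between two processes.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a vnode with a non-recursive exclusive lock. */
struct toy_vnode {
    bool locked;
};

/* Returns false where a real non-recursive lock would block forever. */
static bool toy_lock(struct toy_vnode *vp)
{
    if (vp->locked)
        return false;
    vp->locked = true;
    return true;
}

static void toy_unlock(struct toy_vnode *vp)
{
    assert(vp->locked);
    vp->locked = false;
}

/* Hypothetical lookup result; startdir plays the role of retdirp. */
struct toy_lookup {
    struct toy_vnode *startdir;
    struct toy_vnode *parent;
    struct toy_vnode *leaf;
};

/* Assumed behaviour: leaf comes back locked, parent only if lockparent. */
static void toy_namei(struct toy_lookup *lk, bool lockparent)
{
    bool ok = toy_lock(lk->leaf);
    if (ok && lockparent && lk->parent != lk->leaf)
        ok = toy_lock(lk->parent);
    assert(ok); /* the toy scenarios always start fully unlocked */
}

/*
 * Runs one lookup and reports whether unconditionally locking the starting
 * directory would block on a lock this same caller already holds.
 */
bool startdir_lock_would_block(struct toy_lookup *lk, bool lockparent)
{
    toy_namei(lk, lockparent);
    bool blocked = !toy_lock(lk->startdir);
    if (!blocked)
        toy_unlock(lk->startdir);

    /* Release whatever the lookup left locked. */
    toy_unlock(lk->leaf);
    if (lockparent && lk->parent != lk->leaf)
        toy_unlock(lk->parent);
    return blocked;
}
```

In this model the unconditional lock blocks when the starting directory is
the locked parent or the leaf itself, and succeeds otherwise, so the caller
cannot know in advance which case it is in.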

It appears that it may be possible to rearrange the code to defer the
call to VOP_GETATTR() until after the other vnodes have been unlocked,
when it would be safe to just unconditionally lock the starting
directory vnode. This code is a maze of twisty little passages, and a
proper fix would require more time than I can devote to it at the
present time.
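
The deferred ordering described above could look roughly like the following
user-space sketch. It is not the NFS server code: the toy_* names are
hypothetical, the lock is a plain flag rather than a real vnode lock, and
toy_getattr() merely stands in for a VOP_GETATTR()-style call that insists
on the lock being held.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a vnode: a lock flag plus one attribute. */
struct toy_vnode {
    bool locked;
    int  size;
};

static bool toy_lock(struct toy_vnode *vp)
{
    if (vp->locked)
        return false; /* a real non-recursive lock would block here */
    vp->locked = true;
    return true;
}

static void toy_unlock(struct toy_vnode *vp)
{
    assert(vp->locked);
    vp->locked = false;
}

/* Stand-in for an attribute fetch that requires the lock to be held. */
static int toy_getattr(const struct toy_vnode *vp, int *size)
{
    if (!vp->locked)
        return -1;
    *size = vp->size;
    return 0;
}

/*
 * Deferred variant: first drop every lock the lookup left behind, then
 * lock the starting directory unconditionally and fetch its attributes.
 */
int toy_deferred_getattr(struct toy_vnode *startdir, struct toy_vnode *parent,
                         struct toy_vnode *leaf, int *size)
{
    /* Step 1: release the lookup's locks (parent may equal leaf). */
    if (leaf->locked)
        toy_unlock(leaf);
    if (parent != leaf && parent->locked)
        toy_unlock(parent);

    /* Step 2: nothing is held any more, so this lock cannot self-block. */
    if (!toy_lock(startdir))
        return -1;
    int error = toy_getattr(startdir, size);
    toy_unlock(startdir);
    return error;
}
```

Because no other vnode lock is held when the starting directory is locked,
this ordering also avoids taking two locks at once in the sketch, which is
what makes the unconditional lock safe here.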

If someone is feeling bored ...
Received on Thu Apr 24 2003 - 23:44:46 UTC
