Re: locking vnode pointer fvp in nfs_rename()

From: Bruce Evans <bde_at_zeta.org.au>
Date: Thu, 24 Apr 2003 23:54:27 +1000 (EST)
On Thu, 24 Apr 2003, Don Lewis wrote:

> VOP_FSYNC() wants its vnode argument to be locked by the caller, but
> nfs_rename() doesn't do that.  I've been running something similar to
> the patch below for months in a kernel built with the DEBUG_VFS_LOCKS
> option enabled.  The most recent change I made was to modify my original
> patch to more closely resemble the way this is done in ufs_rename().

This seems to be basically right.

> One concern that I have about all the leaf file system implementations
> is the possibility of spurious failures due to failure to obtain the
> lock. I'm pretty sure that it isn't wise to add the LK_RETRY flag
> because the possibility of a deadlock.  I think the proper way to handle
> this problem is to drop all the locks, wait for the fvp lock to become
> available, and then retry from the beginning. Unfortunately this would
> be messy in the current implementation.

So messy (even in ufs) that it has never been done.  I don't see how
ufs_checkpath()+relookup() can work in all cases, since the tree may
be arbitrarily rearranged while things are unlocked, and it doesn't
go back to the beginning.

> I think it would make a lot of sense to move the preliminary code that
> is common to each of the leaf file system implementations, including
> locking fvp, back into kern_rename().  This would make the
> implementation of the lock retry code a lot easier and would clean up a
> lot of cut-and-paste code in the leaf file systems.  Unfortunately it's
> probably too late to do this for 5.x.

Was it ever in kern_rename() (or rename()?).  This doesn't seem to save
much (just one vn_lock() call).  More below.

> Any reason the following change to nfs_rename() shouldn't be committed?

It has some style bugs :-).

> Index: sys/nfsclient/nfs_vnops.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/nfsclient/nfs_vnops.c,v
> retrieving revision 1.203
> diff -u -r1.203 nfs_vnops.c
> --- sys/nfsclient/nfs_vnops.c	23 Apr 2003 02:58:26 -0000	1.203
> +++ sys/nfsclient/nfs_vnops.c	24 Apr 2003 03:06:21 -0000
> _at__at_ -1521,6 +1521,13 _at__at_
>  		goto out;
>  	}
>
> +	if (fvp == tvp) {
> +		printf("nfs_rename: fvp == tvp (can't happen)\n");
> +		error = 0;
> +		goto out;
> +	}

Maybe the lock was delayed because of the (deleted) complications for the
(fvp == tvp) case.  Exclusive locking would have to be avoided in this
case just to avoid deadlock.

BTW, there was an inconclusive thread in austin-group-l_at_opengroup.org
about rename("foo", "hard-link-to-foo").  I said that I had "fixed"
it to match the letter of the POSIX spec in Linux and FreeBSD.  The
ufs version of the above codeis part of the fix.  NetBSD still has the
old behaviour.  I only saw 1 reply other than mine and 1 from the
originator that I haven't got around to responding to.  It agreed with
the originator that the spec is probably wrong.

There are minor technical advantages for rename() not doing a sort of
unlink(): we don't have to have complications to handle it or have to
worry about races in it :-).  There are nice race possibilites even
for rename("foo", "hard-link-to-foo").  This is not quite the same as
unlink("foo") (after stat()'ing the files and finding that they are
links to each other), since rename() is supposed to be atomic so it
is not permitted to remove the first link if the other is removed
concurrently, while of course stat() + unlink() has inherent races.
However, the old code in ufs_rename() (see RELENG_3) essentially
duplicats the userland race (it starts by completely unlocking the
destination).

> +	if ((error = vn_lock(fvp, LK_EXCLUSIVE, fcnp->cn_thread)) != 0)
> +		goto out;

Need a blank line here.

>  	/*
>  	 * We have to flush B_DELWRI data prior to renaming
>  	 * the file.  If we don't, the delayed-write buffers
> _at__at_ -1529,8 +1536,8 _at__at_
>  	 * ( as far as I can tell ) it flushes dirty buffers more
>  	 * often.
>  	 */
> -

Correct to remove this bkank line.

>  	VOP_FSYNC(fvp, fcnp->cn_cred, MNT_WAIT, fcnp->cn_thread);
> +	VOP_UNLOCK(fvp, 0, fcnp->cn_thread);
>  	if (tvp)
>  	    VOP_FSYNC(tvp, tcnp->cn_cred, MNT_WAIT, tcnp->cn_thread);
>

Bruce
Received on Thu Apr 24 2003 - 04:54:32 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:37:04 UTC