Re: NFSv4 performance degradation with 12.0-CURRENT client

From: Konstantin Belousov <kostikbel@gmail.com>
Date: Fri, 25 Nov 2016 10:41:06 +0200

On Thu, Nov 24, 2016 at 10:45:51PM +0000, Rick Macklem wrote:
> asomers@gmail.com wrote:
> >OpenOwner     Opens LockOwner     Locks    Delegs  LocalOwn LocalOpen LocalLOwn
> >     5638    141453         0         0         0         0         0         0
> Ok, I think this shows us the problem. 141453 opens is a lot and the client would have
> to check these every time another open is done (there goes all that CPU ;-).
> 
> Now, why has this occurred?
> Well, the NFSv4 client can't close NFSv4 Opens on a vnode until that vnode's
> v_usecount goes to 0. This is because mmap'd files might do I/O after the file
> descriptor is closed.
> Now, hopefully Kostik will know something about nullfs and can help with this.
> My guess is that nullfs ends up acquiring a refcnt on the NFS vnode so the
> v_usecount doesn't go to 0 and, therefore, the client never closes the NFSv4 Opens.
> Kostik, do you know if this is the case and whether or not it can be changed?
You are absolutely right. A nullfs vnode keeps a reference to the lower
vnode below it, i.e. to the NFS vnode in this case. If the cache option
is specified for the nullfs mount (the default), nullfs vnodes are
cached normally to avoid the cost of creating and destroying a nullfs
vnode on each operation, and the related cost of taking exclusive locks
on the lower vnode.
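
In outline (a simplified sketch with illustrative names, not the exact
null_nodeget() code): the nullfs node takes its own use reference on
the lower vnode when it is created and only drops it when the nullfs
vnode is reclaimed, so a cached nullfs vnode pins the NFS vnode's
v_usecount above zero.

	/* Each nullfs node records and references its lower vnode. */
	struct null_node {
		struct vnode	*null_vnode;	/* the nullfs vnode */
		struct vnode	*null_lowervp;	/* referenced lower (NFS) vnode */
	};

	/* At node creation: */
	vref(lowervp);			/* bump lower vnode v_usecount */
	xp->null_lowervp = lowervp;

	/* Only at reclaim: */
	vrele(xp->null_lowervp);	/* at last let v_usecount drop */

With the cache option set, reclamation of the nullfs vnode may not
happen for a long time, so VOP_INACTIVE() on the NFS vnode, and with it
the close of the accumulated NFSv4 Opens, is delayed accordingly.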

An answer to my question in the previous mail, to try with the nocache
option, would confirm this. Initially I suspected that v_hash is
calculated differently for NFSv3 and v4 mounts, but opens accumulating
until the use reference is dropped would explain things as well.

Assuming your diagnosis is correct, are you in fact stating that the
current VFS KPI is flawed? It sounds as if another callback or counter
needs to exist to track the number of mapping references to the vnode's
vm object, in addition to VOP_OPEN/VOP_CLOSE.

Currently a rough estimate of the number of mappings, which is
sometimes slightly wrong, can be obtained by the expression
	vp->v_object->ref_count - vp->v_object->shadow_count
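
For example (a sketch only; vnode_mapping_estimate() is an illustrative
name, not an existing kernel function):

	#include <sys/param.h>
	#include <sys/vnode.h>
	#include <vm/vm.h>
	#include <vm/vm_object.h>

	/*
	 * Approximate the number of mapping references to the vnode's
	 * VM object.  ref_count counts mappings as well as shadow
	 * objects, shadow_count counts only the shadow objects, so the
	 * difference estimates the mappings; the fields are read
	 * without the object lock, so the result can be slightly off.
	 */
	static int
	vnode_mapping_estimate(struct vnode *vp)
	{
		vm_object_t obj;

		obj = vp->v_object;
		if (obj == NULL)
			return (0);
		return (obj->ref_count - obj->shadow_count);
	}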


> >LocalLock
> >        0
> >Rpc Info:
> >TimedOut   Invalid X Replies   Retries  Requests
> >        0         0         0         0       662
> >Cache Info:
> >Attr Hits    Misses Lkup Hits    Misses BioR Hits    Misses BioW Hits    Misses
> >     1275        58       837       121         0         0         0         0
> >BioRLHits    Misses BioD Hits    Misses DirE Hits    Misses
> >        1         0         6         0         1         0
> >
> [more stuff snipped]
> >What role could nullfs be playing?
> As noted above, my hunch is that it is acquiring a refcnt on the NFS client vnode such
> that the v_usecount doesn't go to zero (at least for a long time), and without
> a VOP_INACTIVE() on the NFSv4 vnode, the NFSv4 Opens don't get closed and
> accumulate.
> (If that isn't correct, it is somehow interfering with the client Closing the NFSv4 Opens
>  in some other way.)

The following patch should automatically unset the cache option for
nullfs mounts over an NFSv4 filesystem.

diff --git a/sys/fs/nfsclient/nfs_clvfsops.c b/sys/fs/nfsclient/nfs_clvfsops.c
index 524a372..a7e9fe3 100644
--- a/sys/fs/nfsclient/nfs_clvfsops.c
+++ b/sys/fs/nfsclient/nfs_clvfsops.c
@@ -1320,6 +1320,8 @@ out:
 		MNT_ILOCK(mp);
 		mp->mnt_kern_flag |= MNTK_LOOKUP_SHARED | MNTK_NO_IOPF |
 		    MNTK_USES_BCACHE;
+		if ((VFSTONFS(mp)->nm_flag & NFSMNT_NFSV4) != 0)
+			mp->mnt_kern_flag |= MNTK_NULL_NOCACHE;
 		MNT_IUNLOCK(mp);
 	}
 	return (error);
diff --git a/sys/fs/nullfs/null_vfsops.c b/sys/fs/nullfs/null_vfsops.c
index 49bae28..de05e8b 100644
--- a/sys/fs/nullfs/null_vfsops.c
+++ b/sys/fs/nullfs/null_vfsops.c
@@ -188,7 +188,8 @@ nullfs_mount(struct mount *mp)
 	}
 
 	xmp->nullm_flags |= NULLM_CACHE;
-	if (vfs_getopt(mp->mnt_optnew, "nocache", NULL, NULL) == 0)
+	if (vfs_getopt(mp->mnt_optnew, "nocache", NULL, NULL) == 0 ||
+	    (xmp->nullm_vfs->mnt_kern_flag & MNTK_NULL_NOCACHE) != 0)
 		xmp->nullm_flags &= ~NULLM_CACHE;
 
 	MNT_ILOCK(mp);
diff --git a/sys/sys/mount.h b/sys/sys/mount.h
index 94cabb6..b6f9fec 100644
--- a/sys/sys/mount.h
+++ b/sys/sys/mount.h
@@ -370,7 +370,8 @@ void          __mnt_vnode_markerfree_active(struct vnode **mvp, struct mount *);
 #define	MNTK_SUSPEND	0x08000000	/* request write suspension */
 #define	MNTK_SUSPEND2	0x04000000	/* block secondary writes */
 #define	MNTK_SUSPENDED	0x10000000	/* write operations are suspended */
-#define	MNTK_UNUSED1	0x20000000
+#define	MNTK_NULL_NOCACHE	0x20000000 /* auto disable cache for nullfs
+					      mounts over this fs */
 #define MNTK_LOOKUP_SHARED	0x40000000 /* FS supports shared lock lookups */
 #define	MNTK_NOKNOTE	0x80000000	/* Don't send KNOTEs from VOP hooks */
 
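Until a patch like this is in the tree, the same effect can be had by
mounting the nullfs layer with the nocache option explicitly, e.g.
(paths here are illustrative):

	mount -t nullfs -o nocache /mnt/nfs4 /mnt/null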