Re: NFSv4 performance degradation with 12.0-CURRENT client

From: Rick Macklem <rmacklem_at_uoguelph.ca>
Date: Thu, 24 Nov 2016 22:45:51 +0000
asomers_at_gmail.com wrote:
[stuff snipped]
>I've reproduced the issue on stock FreeBSD 12, and I've also learned
>that nullfs is a required factor.  Doing the buildworld directly on
>the NFS mount doesn't cause any slowdown, but doing a buildworld on
>the nullfs copy of the NFS mount does.  The slowdown affects the base
>NFS mount as well as the nullfs copy.  Here is the nfsstat output for
>both server and client during "ls -al" on the client:
>
>nfsstat -e -s -z
If you do this again, avoid using "-z" and I think you'll see the Opens (below Server:)
going up and up...
>
>Server Info:
>  Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
>      800         0       121         0         0         2         0         0
>   Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
>        0         0         0         0         0         0         0         8
>    Mknod    Fsstat    Fsinfo  PathConf    Commit   LookupP   SetClId SetClIdCf
>       0         0         0         0         1         3         0         0
>     Open  OpenAttr OpenDwnGr  OpenCfrm DelePurge   DeleRet     GetFH      Lock
>        0         0         0         0         0         0       123         0
>    LockT     LockU     Close    Verify   NVerify     PutFH  PutPubFH PutRootFH
>        0         0         0         0         0       674         0         0
>    Renew RestoreFH    SaveFH   Secinfo RelLckOwn  V4Create
>        0         0         0         0         0         0
>Server:
>Retfailed    Faults   Clients
>        0         0         0
>OpenOwner     Opens LockOwner     Locks    Delegs
>        0         0         0         0         0
Oops, I think this is an nfsstat bug. I don't normally use "-z", so I didn't notice
that it clears these counts. It probably should not, since they are counts of "how
many of these are currently allocated".
I'll check this. (Not relevant to this issue, but it needs fixing. ;-)
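
To watch these "currently allocated" counts without "-z", one option is to save the output of "nfsstat -e -c" (or "-e -s") at two points in time and compare the Opens column. The following is only a rough sketch: the function names are made up for illustration, and the parser assumes the two-line header/value layout shown in the output quoted in this message.

```python
def parse_state_counts(nfsstat_text):
    """Return the OpenOwner/Opens/... counts from saved `nfsstat -e` output.

    Assumes the header line starting with "OpenOwner Opens" is followed
    directly by a line of integers, as in the output quoted in this thread.
    Leading ">" quote markers are tolerated.
    """
    lines = [ln.lstrip(">").split() for ln in nfsstat_text.splitlines()]
    for header, values in zip(lines, lines[1:]):
        if header[:2] == ["OpenOwner", "Opens"]:
            return dict(zip(header, (int(v) for v in values)))
    return {}


def opens_growth(before_text, after_text):
    """Difference in the Opens count between two saved samples."""
    before = parse_state_counts(before_text).get("Opens", 0)
    after = parse_state_counts(after_text).get("Opens", 0)
    return after - before


# The client-side lines quoted in this thread, used as sample input.
sample = """\
>OpenOwner     Opens LockOwner     Locks    Delegs  LocalOwn LocalOpen LocalLOwn
>     5638    141453         0         0         0         0         0         0
"""
counts = parse_state_counts(sample)
```

If the Opens value keeps climbing between samples taken without "-z", the Opens are accumulating rather than being closed.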
>Server Cache Stats:
>   Inprog      Idem  Non-idem    Misses CacheSize   TCPPeak
>        0         0         0       674     16738     16738
>
>nfsstat -e -c -z
>Client Info:
>Rpc Counts:
> Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
>       60         0       119         0         0         0         0         0
>   Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
>        0         0         0         0         0         0         0         3
>    Mknod    Fsstat    Fsinfo  PathConf    Commit   SetClId SetClIdCf      Lock
>        0         0         0         0         0         0         0         0
>    LockT     LockU      Open   OpenCfr
>        0         0         0         0
>OpenOwner     Opens LockOwner     Locks    Delegs  LocalOwn LocalOpen LocalLOwn
>     5638    141453         0         0         0         0         0         0
Ok, I think this shows us the problem. 141453 opens is a lot and the client would have
to check these every time another open is done (there goes all that CPU;-).
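
A toy model of why that gets expensive. This is not the FreeBSD client code; it simply assumes the opens sit in a list that is scanned on every new open, which is one plausible reading of "check these every time". The class and function names are hypothetical.

```python
class ToyOpenTable:
    """Toy stand-in for a client's table of outstanding opens."""

    def __init__(self):
        self.opens = []        # opens that were never closed
        self.comparisons = 0   # work done while scanning

    def open(self, file_handle):
        # Assumed behaviour: scan every existing open looking for a match.
        for fh in self.opens:
            self.comparisons += 1
            if fh == file_handle:
                return
        self.opens.append(file_handle)


def comparisons_for(n_files):
    """Total comparisons when n_files distinct files are opened, none closed."""
    table = ToyOpenTable()
    for fh in range(n_files):
        table.open(fh)
    return table.comparisons
```

Under that assumption the total work grows roughly with the square of the number of accumulated opens: doubling the number of distinct files from 1000 to 2000 about quadruples the comparisons.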

Now, why has this occurred?
Well, the NFSv4 client can't close NFSv4 Opens on a vnode until that vnode's
v_usecount goes to 0. This is because mmap'd files might do I/O after the file
descriptor is closed.
Now, hopefully Kostik will know something about nullfs and can help with this.
My guess is that nullfs ends up acquiring a refcnt on the NFS vnode so the
v_usecount doesn't go to 0 and, therefore, the client never closes the NFSv4 Opens.
Kostik, do you know if this is the case and whether or not it can be changed?
>LocalLock
>        0
>Rpc Info:
>TimedOut   Invalid X Replies   Retries  Requests
>        0         0         0         0       662
>Cache Info:
>Attr Hits    Misses Lkup Hits    Misses BioR Hits    Misses BioW Hits    Misses
>     1275        58       837       121         0         0         0         0
>BioRLHits    Misses BioD Hits    Misses DirE Hits    Misses
>        1         0         6         0         1         0
>
[more stuff snipped]
>What role could nullfs be playing?
As noted above, my hunch is that nullfs is acquiring a refcnt on the NFS client vnode
such that the v_usecount doesn't go to zero (at least for a long time) and without
a VOP_INACTIVE() on the NFSv4 vnode, the NFSv4 Opens don't get closed and
accumulate.
(If that isn't correct, it is somehow interfering with the client Closing the NFSv4 Opens
 in some other way.)
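
The hunch above can be illustrated with a small reference-counting model. It is only a sketch of the suspected mechanism, not kernel code: ToyVnode, inactive() and the upper_layer_holds_ref flag are invented for illustration, and whether nullfs really holds such a reference is exactly the open question.

```python
class ToyVnode:
    """Toy vnode: opens are only closed when usecount drops to 0."""

    def __init__(self):
        self.usecount = 0
        self.nfsv4_opens = 0

    def ref(self):
        self.usecount += 1

    def rele(self):
        self.usecount -= 1
        if self.usecount == 0:
            self.inactive()

    def inactive(self):
        # Stand-in for VOP_INACTIVE(): the point where Opens get closed.
        self.nfsv4_opens = 0


def outstanding_opens(n_files, upper_layer_holds_ref):
    """Open and close n_files files once each; return Opens left over."""
    vnodes = [ToyVnode() for _ in range(n_files)]
    for vn in vnodes:
        if upper_layer_holds_ref:
            vn.ref()           # hypothetical extra reference from a stacked layer
        vn.ref()               # application opens the file
        vn.nfsv4_opens += 1    # an NFSv4 Open is created
        vn.rele()              # application closes the file
    return sum(vn.nfsv4_opens for vn in vnodes)
```

In this model, without the extra reference every Open is closed as soon as the file is released, while with it one Open is left behind per file, which would match Opens piling up during a buildworld.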

rick
Received on Thu Nov 24 2016 - 21:45:56 UTC
