asomers_at_gmail.com wrote:
[stuff snipped]
>I've reproduced the issue on stock FreeBSD 12, and I've also learned
>that nullfs is a required factor.  Doing the buildworld directly on
>the NFS mount doesn't cause any slowdown, but doing a buildworld on
>the nullfs copy of the NFS mount does.  The slowdown affects the base
>NFS mount as well as the nullfs copy.  Here is the nfsstat output for
>both server and client during "ls -al" on the client:
>
>nfsstat -e -s -z
If you do this again, avoid using "-z" and I think you'll see the Opens
(below Server:) going up and up...
>
>Server Info:
>   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
>       800         0       121         0         0         2         0         0
>    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
>         0         0         0         0         0         0         0         8
>     Mknod    Fsstat    Fsinfo  PathConf    Commit   LookupP   SetClId SetClIdCf
>         0         0         0         0         1         3         0         0
>      Open  OpenAttr OpenDwnGr  OpenCfrm DelePurge   DeleRet     GetFH      Lock
>         0         0         0         0         0         0       123         0
>     LockT     LockU     Close    Verify   NVerify     PutFH  PutPubFH PutRootFH
>         0         0         0         0         0       674         0         0
>     Renew RestoreFH    SaveFH   Secinfo RelLckOwn  V4Create
>         0         0         0         0         0         0
>Server:
> Retfailed    Faults   Clients
>         0         0         0
> OpenOwner     Opens LockOwner     Locks    Delegs
>         0         0         0         0         0
Oops, I think this is an nfsstat bug. I don't normally use "-z", so I
didn't notice that it clears these counts. It probably should not, since
they are "how many of these are currently allocated".
I'll check this. (Not relevant to this issue, but needs fixin. ;-)
>Server Cache Stats:
>    Inprog      Idem  Non-idem    Misses CacheSize   TCPPeak
>         0         0         0       674     16738     16738
>
>nfsstat -e -c -z
>Client Info:
>Rpc Counts:
>   Getattr   Setattr    Lookup  Readlink      Read     Write    Create    Remove
>        60         0       119         0         0         0         0         0
>    Rename      Link   Symlink     Mkdir     Rmdir   Readdir  RdirPlus    Access
>         0         0         0         0         0         0         0         3
>     Mknod    Fsstat    Fsinfo  PathConf    Commit   SetClId SetClIdCf      Lock
>         0         0         0         0         0         0         0         0
>     LockT     LockU      Open   OpenCfr
>         0         0         0         0
> OpenOwner     Opens LockOwner     Locks    Delegs  LocalOwn LocalOpen LocalLOwn
>      5638    141453         0         0         0         0         0         0
Ok, I think this shows us the problem.
141453 opens is a lot, and the client would have to check these every
time another open is done (there goes all that CPU ;-).

Now, why has this occurred?
Well, the NFSv4 client can't close NFSv4 Opens on a vnode until that
vnode's v_usecount goes to 0. This is because mmap'd files might do I/O
after the file descriptor is closed.
Now, hopefully Kostik will know something about nullfs and can help with
this.
My guess is that nullfs ends up acquiring a refcnt on the NFS vnode, so the
v_usecount doesn't go to 0 and, therefore, the client never closes the
NFSv4 Opens.
Kostik, do you know if this is the case and whether or not it can be
changed?
> LocalLock
>         0
>Rpc Info:
>  TimedOut   Invalid X Replies   Retries  Requests
>         0         0         0         0       662
>Cache Info:
> Attr Hits    Misses Lkup Hits    Misses BioR Hits    Misses BioW Hits    Misses
>      1275        58       837       121         0         0         0         0
> BioRLHits    Misses BioD Hits    Misses DirE Hits    Misses
>         1         0         6         0         1         0
>
[more stuff snipped]
>What role could nullfs be playing?
As noted above, my hunch is that it is acquiring a refcnt on the NFS client
vnode such that the v_usecount doesn't go to zero (at least for a long
time), and without a VOP_INACTIVE() on the NFSv4 vnode, the NFSv4 Opens
don't get closed and accumulate.
(If that isn't correct, it is somehow interfering with the client closing
the NFSv4 Opens in some other way.)

rick

Received on Thu Nov 24 2016 - 21:45:56 UTC
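
To follow the suggestion above (rerun nfsstat without "-z" and watch whether Opens keeps climbing), a small parser along these lines might help. It is only a sketch: the function name open_counts is hypothetical, and it assumes the output layout matches the "OpenOwner Opens ..." header and value rows quoted in this message, which may differ on other FreeBSD versions.

```python
def open_counts(nfsstat_text):
    """Return {column: value} for the row whose header begins with OpenOwner.

    Assumes the layout quoted in this message: a header line of
    whitespace-separated column names followed by a line of integers.
    Leading '>' quote markers and spaces are stripped, so quoted mail
    text can be fed in as well.
    """
    lines = [ln.lstrip("> ").rstrip() for ln in nfsstat_text.splitlines()]
    # Walk adjacent (header, values) line pairs.
    for header, values in zip(lines, lines[1:]):
        names = header.split()
        if names and names[0] == "OpenOwner":
            return dict(zip(names, (int(v) for v in values.split())))
    return {}  # no such row found


# Client-side sample copied from the output quoted in this message.
SAMPLE = """\
 OpenOwner     Opens LockOwner     Locks    Delegs  LocalOwn LocalOpen LocalLOwn
      5638    141453         0         0         0         0         0         0
"""

if __name__ == "__main__":
    print(open_counts(SAMPLE)["Opens"])
```

Feeding it the text of "nfsstat -e -c" captured every so often during the buildworld, and comparing the Opens value between samples, should show whether the count only ever grows, as the hunch above predicts.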