Without boring you with too much detail, I have been doing development/testing of pNFS stuff (mostly server side) on a 1 year old kernel (Apr. 12, 2016). When I recently carried the code across to a recent kernel, everything seemed to work, but performance was much slower.

After some fiddling around, the slowdown appears to be on the NFS client side, yet nothing in the NFS client code seemed to be causing it. (RPC counts were almost exactly the same, for example. I tried reverting r316532 and disabling vfs.nfs.use_buf_pager. Neither made a significant difference.)

I made most of the performance degradation go away by disabling SMP on the client.

Here are some elapsed times for kernel builds with everything the same except for which kernel was booted and whether SMP was enabled or disabled (amd64 client machine):

1 year old kernel, SMP enabled  - 100 minutes
recent kernel, SMP disabled     - 113 minutes
recent kernel, SMP enabled      - 148 minutes

(The builds were all of the same kernel sources. When I say "1 year old" vs "recent" I am referring to which kernel was booted for the test run.)

All I can think of is that some change in the last year has resulted in an increase in something like interrupt latency or context switch latency, and that this has caused the slowdown.

Does anyone have an idea what this might be caused by, or any tunables to fool with beyond disabling SMP (which I suspect won't be a popular answer to "how to fix slow NFS" ;-)?

I haven't yet tried fiddling with interrupt moderation on the net interface, but the tests all used the same settings.

rick

Received on Wed May 24 2017 - 18:40:03 UTC
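
For reference, the relative slowdowns implied by the three build times in the message can be worked out with a short script. This is only a sketch in Python and was not part of the original tests; the names BUILD_MINUTES, percent_slower and slowdown_table are made up for illustration, and the only data used are the three elapsed times quoted in the message.

```python
# Elapsed kernel-build times quoted in the message, in minutes.
BUILD_MINUTES = {
    "1 year old kernel, SMP enabled": 100,
    "recent kernel, SMP disabled": 113,
    "recent kernel, SMP enabled": 148,
}


def percent_slower(minutes, baseline_minutes):
    """Return how much longer `minutes` is than `baseline_minutes`, in percent."""
    return (minutes - baseline_minutes) / baseline_minutes * 100.0


def slowdown_table(times, baseline_key):
    """Map each configuration to its slowdown relative to `baseline_key`."""
    baseline = times[baseline_key]
    return {
        name: round(percent_slower(minutes, baseline), 1)
        for name, minutes in times.items()
    }


if __name__ == "__main__":
    baseline_key = "1 year old kernel, SMP enabled"
    for name, pct in slowdown_table(BUILD_MINUTES, baseline_key).items():
        print(f"{name}: {BUILD_MINUTES[name]} min ({pct:+.1f}% vs. baseline)")
```

Relative to the 1 year old kernel, the recent kernel comes out 13% slower with SMP disabled and 48% slower with SMP enabled. Calling percent_slower(148, 113) compares the two recent-kernel runs directly and gives roughly 31%, which is the part of the slowdown that goes away when SMP is disabled.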