On 4/3/21, Poul-Henning Kamp <phk_at_phk.freebsd.dk> wrote:
> --------
> Mateusz Guzik writes:
>
>> It is high because of this:
>> msleep(&vnlruproc_sig, &vnode_list_mtx, PVFS, "vlruwk",
>> hz);
>>
>> i.e. it literally sleeps for 1 second.
>
> Before the line looked like that, it slept on "lbolt" aka "lightning
> bolt" which was woken once a second.
>
> The calculations which come up with those "constants" have always
> been utterly bogus math, not quite "square-root of shoe-size
> times sun-angle in Patagonia", but close.
>
> The original heuristic came from university environments with tons of
> students doing assignments and nethack behind VT102 terminals, on
> filesystems where files only seldom grew past 100KB, so it made sense
> to scale number of vnodes to how much RAM was in the system, because
> that also scaled the size of the buffer-cache.
>
> With a merged VM buffer-cache, whatever validity that heuristic had
> was lost, and we tweaked the bogomath in various ways until it
> seemed to mostly work, trusting the users for which it did not, to
> tweak things themselves.
>
> Please don't tweak the Finagle Constants again.
>
> Rip all that crap out and come up with something fundamentally better.
>

Some level of pacing is probably useful to control total memory use --
there can be A LOT of memory tied up in the mere fact that a vnode is
fully cached.

imo the thing to do is to come up with some watermarks, to be revisited
every 1-2 years, and to change the behavior when they get exceeded: try
to whack some stuff, but in the face of trouble just go ahead and alloc
without the 1 second sleep. Should the load spike sort itself out, vnlru
will slowly get things back down to the watermark. If the watermark is
too low, maybe it can autotune.

Bottom line is that even with the current idea of limiting the preferred
total vnode count, the corner case behavior can be drastically better:
suffering SOME perf loss from recycling vnodes, but not sleeping for a
second for every single one.
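
To make the watermark idea concrete, here is a minimal userspace sketch.
It is not FreeBSD kernel code: struct vn_pool, vn_try_recycle(),
vn_alloc() and vn_background_trim() are all hypothetical names made up
for illustration, and plain counters stand in for real vnodes. It only
shows the shape of the policy: past the watermark, try to recycle, and
if that fails allocate anyway instead of sleeping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping; counters stand in for real vnodes. */
struct vn_pool {
	size_t in_use;      /* vnodes currently allocated */
	size_t reclaimable; /* subset of in_use that is cheap to recycle */
	size_t watermark;   /* preferred ceiling on in_use, not a hard limit */
	size_t recycled;    /* stat: vnodes recycled so far */
	size_t overshoots;  /* stat: allocations made past the watermark */
};

/* Recycle one vnode if any is reclaimable. Never blocks. */
static bool
vn_try_recycle(struct vn_pool *p)
{
	if (p->reclaimable == 0)
		return (false);
	assert(p->in_use > 0);
	p->reclaimable--;
	p->in_use--;
	p->recycled++;
	return (true);
}

/*
 * Allocation path. At or above the watermark it attempts one recycle;
 * if that fails it goes past the watermark rather than sleeping.
 */
static void
vn_alloc(struct vn_pool *p)
{
	if (p->in_use >= p->watermark && !vn_try_recycle(p))
		p->overshoots++;
	p->in_use++;
}

/*
 * Background pass, in the spirit of vnlru: recycle at most 'batch'
 * vnodes per call, stopping once in_use is back at the watermark.
 * Returns how many were recycled.
 */
static size_t
vn_background_trim(struct vn_pool *p, size_t batch)
{
	size_t done = 0;

	while (done < batch && p->in_use > p->watermark &&
	    vn_try_recycle(p))
		done++;
	return (done);
}
```

With a watermark of 4 and nothing reclaimable, six calls to vn_alloc()
leave in_use at 6 with two overshoots recorded. Once some vnodes become
reclaimable, repeated vn_background_trim() calls bring in_use back down
to 4. The overshoots counter is the kind of signal an autotuning scheme
could watch, but how to act on it is left open here.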

I think the notion of 'struct vnode' being a separately allocated object
is not very useful and it comes with complexity (and happens to suffer
from several bugs).

That said, the easiest and safest thing to do in the meantime is to bump
the limit. Perhaps the sleep can be whacked as it is which would largely
sort it out.

-- 
Mateusz Guzik <mjguzik gmail.com>

Received on Sun Apr 04 2021 - 09:51:39 UTC