On Sat, Jan 01, 2011 at 05:59:10PM +0100, Beat Gätzi wrote:
> On 01.01.2011 17:46, Kostik Belousov wrote:
> > On Sat, Jan 01, 2011 at 05:42:58PM +0100, Beat Gätzi wrote:
> >> On 01.01.2011 17:12, Kostik Belousov wrote:
> >>> On Sat, Jan 01, 2011 at 05:00:56PM +0100, Beat Gätzi wrote:
> >>>> On 01.01.2011 16:45, Kostik Belousov wrote:
> >>>>> Check the output of sysctl kern.maxvnodes and vfs.numvnodes. I suspect
> >>>>> they are quite close or equal. If yes, consider increasing maxvnodes.
> >>>>> Another workaround, if you have a huge nested directory hierarchy, is
> >>>>> to set vfs.vlru_allow_cache_src to 1.
> >>>>
> >>>> Thanks for the hint. kern.maxvnodes and vfs.numvnodes were equal:
> >>>> # sysctl kern.maxvnodes vfs.numvnodes
> >>>> kern.maxvnodes: 100000
> >>>> vfs.numvnodes: 100765
> >>>>
> >>>> I've increased kern.maxvnodes and the problem was gone until
> >>>> vfs.numvnodes reached the value of kern.maxvnodes again:
> >>>> # sysctl kern.maxvnodes vfs.numvnodes
> >>>> kern.maxvnodes: 150000
> >>>> vfs.numvnodes: 150109
> >>> The processes should be stuck in the "vlruwk" state; that can be
> >>> checked with ps or '^T' on the terminal.
> >>
> >> Yes, there are various processes in the "vlruwk" state.
> >>
> >>>> As the directory structure is quite huge on this server I've set
> >>>> vfs.vlru_allow_cache_src to one now.
> >>> Did it help?
> >>
> >> No, it doesn't look like setting vfs.vlru_allow_cache_src helped. The
> >> problem was gone when I increased kern.maxvnodes, until vfs.numvnodes
> >> reached that level again. I've stopped all running daemons but
> >> numvnodes doesn't decrease.
> > Stopping the daemons does not decrease the count of cached vnodes.
> > What you can do is call unmount on the filesystems. Presumably the
> > filesystems are busy and the unmount will fail, but it will force the
> > freeing of the vnodes that are not in use by any process.
> That freed around 1500 vnodes. At the moment vfs.numvnodes doesn't
> increase rapidly and the server is usable. I will keep an eye on it to
> see if I run into the same problem again.

That is too small a number of vnodes to be freed for a typical system,
and it feels like a real vnode leak. It would be helpful if you tried to
identify the load that causes the situation to occur. You are on UFS,
right?
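
For reference, a minimal sketch of how this could be watched while trying
to reproduce the load. The polling loop and the 60-second interval are
purely illustrative and not from this thread; the sysctl OIDs and the
"vlruwk" wait channel are the ones discussed above:

    #!/bin/sh
    # Poll the vnode counters and list processes blocked on the "vlruwk"
    # wait channel (it shows up in the MWCHAN column of ps -axl on FreeBSD).
    while true; do
        sysctl kern.maxvnodes vfs.numvnodes
        ps -axl | grep '[v]lruwk'
        sleep 60
    done

If vfs.numvnodes keeps climbing toward kern.maxvnodes while processes
start to pile up in "vlruwk", that matches the behaviour described above
and helps to correlate the growth with whatever workload is running at
the time.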