Re: stable/13, vm page counts do not add up

From: Mark Johnston <markj@freebsd.org>
Date: Tue, 13 Apr 2021 17:18:42 -0400
On Tue, Apr 13, 2021 at 05:01:49PM +0300, Andriy Gapon wrote:
> On 07/04/2021 23:56, Mark Johnston wrote:
> > I don't know what might be causing it then.  It could be a page leak.
> > The kernel allocates wired pages without adjusting the v_wire_count
> > counter in some cases, but the ones I know about happen at boot and
> > should not account for such a large disparity.  I do not see it on a few
> > systems that I have access to.
> 
> Mark or anyone,
> 
> do you have a suggestion on how to approach hunting for the potential page leak?
> It's been a long while since I worked with that code and it changed a lot.
> 
> Here is some additional info.
> I had approximately 2 million unaccounted pages.
> I rebooted the system and that number became 20 thousand which is more
> reasonable and could be explained by those boot-time allocations that you mentioned.
> After 30 hours of uptime the number became 60 thousand.
> 
> I monitored the number and so far I could not correlate it with any activity.
> 
> P.S.
> I have not been running any virtual machines.
> I do use nvidia graphics driver.

My guess is that something is allocating pages without VM_ALLOC_WIRED and
either they're managed and something is failing to place them in the page
queues, or they're unmanaged and should likely be counted as wired.

It is also possible that something is allocating wired, unmanaged
pages and unwiring them without freeing them.  For managed pages,
vm_page_unwire() ensures they get placed in a queue.
vm_page_unwire_noq() does not, but it is typically only used with
unmanaged pages. 
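
And a similarly made-up sketch of the second pattern:

/*
 * Hypothetical example only: an unmanaged page is allocated wired and
 * later has its wiring dropped with vm_page_unwire_noq() but is never
 * freed, so it disappears from v_wire_count without showing up
 * anywhere else.
 */
#include <sys/param.h>
#include <vm/vm.h>
#include <vm/vm_page.h>

static vm_page_t
buggy_get_page(void)
{
	return (vm_page_alloc(NULL, 0, VM_ALLOC_NOOBJ | VM_ALLOC_WIRED));
}

static void
buggy_put_page(vm_page_t m)
{
	/*
	 * Bug: the last wiring is dropped but the page is never freed.
	 * The usual teardown would be vm_page_unwire_noq(m) followed by
	 * vm_page_free(m).
	 */
	(void)vm_page_unwire_noq(m);
}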

The nvidia drivers do not appear to call any vm_page_* functions, at
least based on the kld symbol tables.

So you might try using DTrace to collect stacks for these allocation and
unwire functions, leave it running for a while, and compare the stack
counts with the number of pages leaked while the script is running.
Something like:

/*
 * Count stacks for page allocations that do not request VM_ALLOC_WIRED
 * (0x20) and for unwire calls that do not put the page on a queue.
 */
fbt::vm_page_alloc_domain_after:entry
/(args[3] & 0x20) == 0/
{
	@alloc[stack()] = count();
}

fbt::vm_page_alloc_contig_domain:entry
/(args[3] & 0x20) == 0/
{
	@alloc[stack()] = count();
}

fbt::vm_page_unwire_noq:entry
{
	@unwire[stack()] = count();
}

/* 0x4 is VPO_UNMANAGED, so this catches unwiring of unmanaged pages. */
fbt::vm_page_unwire:entry
/args[0]->oflags & 0x4/
{
	@unwire[stack()] = count();
}
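
If it helps, the script could be saved to a file (say, pageleak.d -- the
name is arbitrary) and run with:

	dtrace -s pageleak.d

Interrupting it with ctrl-C prints the aggregated stacks.  Over the same
interval the size of the leak can be estimated from the vm.stats.vm
sysctls, e.g., v_page_count minus the sum of v_free_count, v_wire_count,
v_active_count, v_inactive_count and v_laundry_count.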

It might be that the count of leaked pages does not relate directly to
the counts collected by the script, e.g., because there is some race
that results in a leak.  But we can try to rule out some easier cases
first.

I tried to look for possible causes of the KTLS page leak mentioned
elsewhere in this thread but could not see any obvious problems.  Does your
affected system use sendfile() at all?  I also wonder whether you see much
mbuf usage on the system.
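For the latter, netstat -m gives a quick summary of mbuf and mbuf cluster
usage, and vmstat -z shows the individual UMA zones if anything there
looks out of line.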