Re: panic: kmem_malloc(131072): kmem_map too small (AMD64)

From: Chuck Swiger <cswiger_at_mac.com>
Date: Wed, 26 Sep 2007 11:21:27 -0700
On Sep 26, 2007, at 3:02 AM, Darren Reed wrote:
>> Yes, Solaris does something architecturally different because it  
>> is apparently acceptable for zfs to use gigabytes of memory by  
>> default.
>
> Well, if you were designing a file system for servers, is there any
> reason that you wouldn't try to use all of the RAM available?
>
> A similar thought process goes into having a unified buffer cache that
> uses all the free RAM that it can (on a 1.5GB NetBSD box, 1.4GB
> is file cache.)

This is a fine example.  One of the nice notions of a "unified buffer
cache" is that data should be stored only once in physical RAM, with
VM objects (VMOs) providing any additional references (perhaps mapped
copy-on-write), rather than double-buffering data between a process's
address space and dedicated kernel disk I/O buffers before it can be
read or written out.
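
Roughly, you can see the difference from userland: read(2) copies the
cached pages into a private buffer (so the bytes exist twice), while
mmap(2) just maps the pages the buffer cache already holds.  A minimal
sketch of my own, not from any of the code under discussion:

    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int
    main(int argc, char **argv)
    {
            struct stat st;
            char *p;
            int fd;

            if (argc < 2 || (fd = open(argv[1], O_RDONLY)) == -1)
                    return (1);
            if (fstat(fd, &st) == -1 || st.st_size == 0)
                    return (1);

            /* One copy of the data: the mapping references the cached pages. */
            p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
            if (p == MAP_FAILED)
                    return (1);
            printf("first byte: 0x%02x\n", (unsigned char)p[0]);

            munmap(p, (size_t)st.st_size);
            close(fd);
            return (0);
    }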

A key factor to note is that the buffer cache can be paged out as
needed (pretty much by definition), but historically, kernel memory
was "wired down" to prevent critical things like the VM subsystem or
disk drivers from being paged out.  Wired-down memory is (or perhaps
was) a scarce resource, which is why the memory disk [md(4)]
implementation, for one, recommends swap-based backing rather than
kernel malloc(9)-based backing.
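
In practice the md(4) advice boils down to something like this (from
memory; double-check mdconfig(8) for the exact flags):

    # swap-backed: the device's pages can be pushed to swap under pressure
    mdconfig -a -t swap -s 256m -u 10
    newfs /dev/md10

    # malloc-backed: every block occupies wired kernel memory for the
    # lifetime of the device
    mdconfig -a -t malloc -s 256m -u 11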

I'm not certain whether FreeBSD's kernel memory allocators [malloc(9),
zone(9)] even support the notion of allocating pageable memory rather
than memory taken from the fixed KVA region.  The manpage implies that
calling kernel malloc with M_WAITOK will always return a valid pointer
and never NULL, but I'm not convinced that will hold if you try
allocating something larger than the KVA region and/or the amount of
physical RAM available in the system.
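
What I have in mind is something like the following kernel-side sketch
(untested, with made-up names): for a large, optional allocation you
would ask with M_NOWAIT and cope with failure, rather than trusting
M_WAITOK to sleep until enough kmem_map shows up:

    #include <sys/param.h>
    #include <sys/kernel.h>
    #include <sys/malloc.h>
    #include <sys/systm.h>

    MALLOC_DEFINE(M_EXAMPLEBUF, "examplebuf", "hypothetical large staging buffer");

    static void *
    example_alloc_big(size_t len)
    {
            void *p;

            /*
             * M_NOWAIT may return NULL under KVA or physical-memory
             * pressure; M_WAITOK is documented never to return NULL,
             * but it cannot conjure up more kmem_map than exists.
             */
            p = malloc(len, M_EXAMPLEBUF, M_NOWAIT | M_ZERO);
            if (p == NULL)
                    printf("example: could not allocate %zu bytes\n", len);
            return (p);
    }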

> Even if I'm running a desktop workstation, if I'm not there, there's
> no reason that ZFS shouldn't be able to encourage the OS to swap
> out all of the applications (well as much as can be) if it so desires.
>
> The problem comes in deciding how strong ZFS's hold should be
> and how to apply pressure from other parts of the system that
> want to use the RAM as well.

Obviously, it does no good to page out your buffer cache to swap-- so
if the system is under enough memory pressure to want that memory for
other tasks, then the right thing to do is to shrink the buffer cache
somewhat, so as to minimize the global page-fault rate.
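
FreeBSD already has a hook for exactly that kind of pressure: the
vm_lowmem eventhandler, which the pageout code fires when free memory
runs short.  A rough sketch (untested; only the event name is real,
the rest is made up) of how a cache could be taught to give memory
back:

    #include <sys/param.h>
    #include <sys/kernel.h>
    #include <sys/eventhandler.h>

    static void
    example_cache_shrink(void *arg __unused, int flags __unused)
    {
            /* Evict some fraction of the cache so the pagedaemon can win. */
    }

    static void
    example_cache_init(void)
    {
            /* Ask the VM system to call us back under memory pressure. */
            EVENTHANDLER_REGISTER(vm_lowmem, example_cache_shrink, NULL,
                EVENTHANDLER_PRI_FIRST);
    }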

> Now given that we have a ZFS tuning guide, surely the question
> we need to ask ourselves is why can't we take the recommendations
> from that and make up some code to implement the trends discussed?
>
> And how do we measure how much memory ZFS is using?

"vmstat -m", or maybe there are some sysctls being exposed by ZFS  
with that info?
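
Something along these lines, perhaps (from memory; the exact sysctl
names may vary with the ZFS version you have):

    vmstat -m | grep -i solaris
    sysctl kstat.zfs.misc.arcstats.size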

Regards,
-- 
-Chuck
Received on Wed Sep 26 2007 - 16:21:29 UTC
