Re: ZFS committed to the FreeBSD base.

From: Kris Kennaway <kris_at_obsecurity.org>
Date: Wed, 23 May 2007 05:32:31 -0400
On Wed, May 23, 2007 at 08:55:32AM +0000, Darren Reed wrote:
> On Tue, May 22, 2007 at 10:01:17AM +0100, Vince wrote:
> ...
> > I may reinstall at a later date as this is still very much a box to play
> > with, but I gather there is no great gain from going 64 bit other than
> > not having to play with PAE if you've got lots of RAM.
> 
> It's not RAM that ZFS really likes but your KVA (Kernel Virtual Address)
> space.  With a 32bit kernel you are more likely to experience problems
> with KVA shortage than you are RAM shortage when using ZFS.

Currently in FreeBSD there are two issues:

1) Space in the kmem_map, which is memory shared between various
kernel subsystems and bounded to be smaller than the amount of address
space dedicated to the kernel (KVA).  By default the size of kmem_map
depends on the amount of RAM in the system, but it can be overridden
at compile or boot time.

2) The size of the KVA itself, the "container" that all of the
addressable data in the kernel must fit into.

The first is actually the major issue for most people.  This is partly
because the mechanisms that are supposed to provide backpressure to
reduce memory usage when space is becoming tight need some further
adaptation and extension to work with ZFS, and partly because ZFS
typically wants to allocate hundreds of megabytes out of the kmem_map,
which is just not sized with this expectation in mind (that space
would usually be "wasted" in the pre-ZFS world).

Problem 2) only becomes an issue when you try to increase the size of
kmem_map to deal with problem 1), and you find that when you make it
too large you exceed the size of the "container", i.e. the total
amount of KVA.  In this situation you could just increase KVA, but of
course there are tradeoffs (e.g. less address space available to user
processes).
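
As a rough illustration of the two knobs described above, the fragment
below shows the kind of settings involved on a 32-bit (i386) machine.
The tunable and option names (vm.kmem_size, vm.kmem_size_max,
KVA_PAGES) are the standard FreeBSD ones, but the values are only
assumptions picked for the example, not recommendations; suitable
numbers depend on the RAM in the box and on how much address space
userland can give up.

```
# /boot/loader.conf: enlarge kmem_map at boot time (problem 1).
# 512M is an illustrative value only.
vm.kmem_size="512M"
vm.kmem_size_max="512M"

# i386 kernel config file: enlarge the KVA "container" (problem 2).
# Without PAE each KVA page covers 4MB, so 512 gives 2GB of KVA
# instead of the default 1GB, leaving 2GB for each user process.
options 	KVA_PAGES=512
```

The kernel option needs a kernel rebuild; the loader tunables take
effect on the next boot.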

Plain ZFS (i.e. as it exists in Solaris) actually has very different
ideas about the memory it should be allowed to use.  These ideas were
completely inappropriate on FreeBSD and were mostly corrected here:

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/contrib/opensolaris/uts/common/fs/zfs/arc.c.diff?r1=1.5;r2=1.6

 	/* Start out with 1/8 of all memory */
-	arc_c = physmem * PAGESIZE / 8;
+	arc_c = kmem_size() / 8;

-	/* set min cache to 1/32 of all memory, or 64MB, whichever is more */
-	arc_c_min = MAX(arc_c / 4, 64<<20);
-	/* set max to 3/4 of all memory, or all but 1GB, whichever is more */
+	/* set min cache to 1/32 of all memory, or 16MB, whichever is more */
+	arc_c_min = MAX(arc_c / 4, 64<<18);
+	/* set max to 1/2 of all memory, or all but 1GB, whichever is more */

i.e. plain ZFS wants to use 3/4 of the *physical* RAM in the system
(or all but 1GB, whichever is more).  So if you have 16GB in your
system then ZFS will try to use up to 15GB of it for caching, leaving
only 1GB for everything else (kernel + userland).
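
To make the arithmetic concrete, here is a minimal C sketch of the
sizing rules as the comments in the diff describe them.  It is not the
arc.c code: the helper names (arc_min, arc_max) and the fraction
parameters are hypothetical, and the maximum rule is reconstructed
from the comments only, since the diff excerpt does not show the line
that assigns it.

```c
#include <assert.h>
#include <stdint.h>

#define MiB (UINT64_C(1) << 20)
#define GiB (UINT64_C(1) << 30)

static uint64_t max_u64(uint64_t a, uint64_t b) { return a > b ? a : b; }

/*
 * Minimum cache size: 1/32 of the base memory (arc_c is base/8, the
 * minimum is arc_c/4), but never less than the given floor.
 * Hypothetical helper, modelled on the arc_c_min lines in the diff.
 */
static uint64_t arc_min(uint64_t base, uint64_t floor)
{
    uint64_t arc_c = base / 8;
    return max_u64(arc_c / 4, floor);
}

/*
 * Maximum cache size: num/den of the base memory, or all but 1GB,
 * whichever is more.  Reconstructed from the comments in the diff,
 * so treat it as an assumption about the real code.
 */
static uint64_t arc_max(uint64_t base, unsigned num, unsigned den)
{
    assert(den != 0);
    uint64_t fraction = base / den * num;
    uint64_t all_but_1g = base > GiB ? base - GiB : 0;
    return max_u64(fraction, all_but_1g);
}
```

With the old rules based on physical memory, arc_max(16 * GiB, 3, 4)
gives 15GB and arc_min(16 * GiB, 64 * MiB) gives 512MB.  With the new
rules based on kmem_size(), and assuming a 512MB kmem_map purely for
illustration, arc_max(512 * MiB, 1, 2) gives 256MB and
arc_min(512 * MiB, 64 << 18) gives 16MB.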

I would actually be interested to know how Solaris gets away with
this.  It sounds like there must be less of a distinction between
memory allocated to the kernel and to userland, and the ability for
memory to flow between these two with some form of backpressure when
userland wants memory that is currently gobbled up by Solaris ZFS.

This kind of system probably makes good sense (although maybe there
are trade-offs), but anyway it's not how FreeBSD does it.

Kris

Received on Wed May 23 2007 - 07:32:39 UTC