Re: BETA3 crash (zfs related ?)

From: Javier Martín Rueda <jmrueda_at_diatel.upm.es>
Date: Fri, 30 Nov 2007 18:47:10 +0100
Alexandre Biancalana wrote:
> Hi list,
>
>   My backup server is running 7-BETA3 from 3 days ago. It is a single
> processor Core2 Duo with 2GB RAM, running an AMD64 SMP kernel with ZFS.
>
>   Last night the machine rebooted after a crash. Below are my dmesg
> and some messages that I got from /var/log/messages.
>
>   Let me know if you need some other information.
>   
Although this is a guess, I'm fairly confident that you are hitting
the same problem I had. My impression is that, right now, anyone who
uses ZFS intensively with a current kernel (7-BETA) is bound to run
into it. I'll explain:

I set up a BETA2 system, enabled ZFS, created a pool and a few
filesystems, and everything seemed OK. However, once I started using
the filesystem heavily, it never took more than a few minutes for a
panic to show up. The message was "kmem_malloc(XXX): kmem_map too small:
YYY total allocated". I updated the sources to BETA3, but the problem
was the same. This panic means that some kernel subsystem has tried to
use more memory than the kernel memory map allows. When that happens,
the kernel simply panics.

After investigating a little, it seems that the ARC (the ZFS cache)
was using too much memory. When a FreeBSD system boots, the kernel
uses certain formulas to cap how much memory it will use for itself
(sysctl vm.kmem_size, vm.kmem_size_max). For instance, on my 4 GiB
system, the kernel set the limit at 400 MiB. The ARC in turn sets a
limit on how much of that memory it will use (80% I think, sysctl
vfs.zfs.arc_max). On my system, that was 320 MiB. The problem is that,
for some reason, the ARC actually uses more memory than its limit, and
if it goes beyond the global kernel limit, that's when you get the
panic. According to some messages I read a few days ago, there was a
recent change in how the kernel's memory usage is computed, and this
ZFS panic has only been possible since then. I don't really know
whether the culprit is the ARC for exceeding its limit, the memory
accounting for overestimating the ARC's usage, or some other problem.
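
As a rough illustration of the arithmetic above, here is a minimal
Python sketch. The 80% ratio is only the figure recalled in the text
("80% I think"), not something checked against the ZFS source, and the
function name is hypothetical.

```python
MIB = 1024 * 1024

def default_arc_max(kmem_size_bytes, ratio_percent=80):
    """Estimate the ARC limit as a percentage of the kernel memory limit.

    ratio_percent=80 is an assumption taken from the text, not a value
    verified against the ZFS code.
    """
    return kmem_size_bytes * ratio_percent // 100

kmem_size = 400 * MIB              # kernel limit reported on the 4 GiB system
arc_max = default_arc_max(kmem_size)
print(arc_max // MIB)              # 320
```

With those inputs the result matches the 320 MiB ARC limit quoted
above, which leaves only 80 MiB for every other kernel allocation.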

I solved the problem by telling the kernel to raise its global limit
well above the ARC limit. That way, even if the ARC uses more memory
than it should, it will not hit the global limit. In my tests, with the
ARC maximum at 320 MiB, usage occasionally climbed to about 650 MiB
(vmstat -m | fgrep solaris). As my system has 4 GiB, which is much more
than I really need for my processes, I decided to add a good safety
margin and set the global kernel limit to 1.2 GiB and the ARC limit to
320 MiB. You can tune these more tightly if you can't spare that much
memory. I haven't done any benchmarking, as I have plenty of memory
anyway. Since making this change I haven't had a single panic.
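
The safety-margin reasoning can be sketched in a few lines of Python.
This is only an illustration: the helper name is hypothetical, the
650 MiB peak is the rough figure observed above, and 1280 MiB is the
vm.kmem_size value from the loader.conf settings in this message.

```python
MIB = 1024 * 1024

def kmem_headroom(kmem_size, observed_peak):
    """Return the bytes left between the observed peak and the kernel limit."""
    return kmem_size - observed_peak

# Rough peak seen for the "solaris" malloc type with arc_max at 320 MiB.
observed_peak = 650 * MIB

for label, kmem_size in [("default limit", 400 * MIB),
                         ("raised limit", 1280 * MIB)]:
    margin = kmem_headroom(kmem_size, observed_peak)
    status = "ok" if margin > 0 else "would exceed kmem_map"
    print(f"{label}: margin {margin // MIB} MiB ({status})")
```

Keep in mind that other kernel subsystems allocate from the same limit,
so the real margin is smaller than this figure suggests.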

This is my setup in /boot/loader.conf:

vm.kmem_size=1342177280
vm.kmem_size_max=1610612736
vfs.zfs.arc_max=314572800
vfs.zfs.arc_min=16777216
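
Since loader.conf takes these values in bytes, a small Python sketch
like the following may help when converting to and from MiB. The helper
names are hypothetical; the numbers are the ones listed above.

```python
MIB = 1024 * 1024

def to_mib(value_bytes):
    """Convert a byte count to whole MiB."""
    return value_bytes // MIB

def from_mib(value_mib):
    """Convert MiB to the byte count expected in loader.conf."""
    return value_mib * MIB

loader_conf = {
    "vm.kmem_size": 1342177280,
    "vm.kmem_size_max": 1610612736,
    "vfs.zfs.arc_max": 314572800,
    "vfs.zfs.arc_min": 16777216,
}

for name, value in loader_conf.items():
    print(f"{name} = {to_mib(value)} MiB")
```

This prints 1280, 1536, 300 and 16 MiB respectively. Note that the
arc_max value works out to 300 MiB, slightly below the 320 MiB
mentioned in the text; either figure is well under the kernel limit.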

I warn you that if you set those limits incorrectly the kernel may panic 
as soon as it starts booting (and you won't be able to invoke an editor 
to correct the file, of course). So, either have a FreeSBIE CD handy in 
case you have to boot from it to edit the file, or make sure you know 
how to use Option 6 of the FreeBSD boot menu (Escape to loader prompt) 
to change the parameters before booting.
Received on Fri Nov 30 2007 - 19:27:34 UTC