Re: ZFS: unlimited arc cache growth?

From: Marius Nünnerich <marius_at_nuenneri.ch>
Date: Sat, 18 Apr 2009 14:58:36 +0200
On Sat, Apr 18, 2009 at 09:48, Alexander Leidinger
<Alexander_at_leidinger.net> wrote:
> On Fri, 17 Apr 2009 10:04:15 -0400 Ben Kelly <ben_at_wanderview.com> wrote:
>
>
>> On Apr 17, 2009, at 8:50 AM, Alexander Leidinger wrote:
>> > to fs_at_, please CC me, as I'm not subscribed.
>> >
> >> > For a while I monitored (by hand) the sysctls
>> > kstat.zfs.misc.arcstats.size and kstat.zfs.misc.arcstats.hdr_size.
>> > Both grow way higher (at some point I've seen more than 500M) than
>> > what I have configured in vfs.zfs.arc_max (40M).
>> >
>> > After a while FS operations (e.g. pkgdb -F with about 900
>> > packages... my specific workload is the fixup of gnome packages
>> > after the removal of the obsolete libusb port) get very slow (in
>> > my specific example I let the pkgdb run several times over night
>> > and it still is not finished).
>> >
> >> > The big problem with this is that at some point in time the
>> > machine reboots (panic, page fault, page not present, during a
>> > fork1). I have the impression (beware, I have a watchdog
>> > configured, as I don't know if a triggered WD would cause the same
>> > panic, the following is just a guess) that I run out of memory of
>> > some kind (I have 1G RAM, i386, max kmem size 700M). I restarted
>> > pkgdb several times after a reboot, and it continues to process the
> >> > libusb removal, but hey, this is annoying.
>> >
>> > Does someone see something similar to what I describe (mainly the
>> > growth of the arc cache way beyond what is configured)? Anyone
>> > with some ideas what to try?
>>
>> Can you provide the rest of the arcstats from sysctl?  Also, does
>> your arc_reclaim_thread process get any cycles when this problem
>> occurs? What happens if you kill the pkgdb -F manually before it
>> completes? Does the arc cache size come back down or is it stuck at
>> the abnormally high level?
>
> I haven't tried killing pkgdb and looking at the stats, but on the idle
> machine (reboot after the panic and 5h of no use by me... the machine
> fetches my mails, has a webmail + mysql + imap interface and is a
> fileserver) the size is double my max value. Again there's no real
> load at this time, just fetching my mails (most traffic from the
> FreeBSD lists) and a little bit of SpamAssassin filtering of them. When
> I logged in this morning the machine was rebooted about 5h ago by a
> panic and no FS traffic was going on (100% idle).
>
> Currently the arc_reclaim_thread has 0:12 of accumulated CPU time,
> the wcpu is at 0%, but it is in the running state. The machine is
> about 80% idle.
>

[snip]

How about adding a few DTrace probes into arc_reclaim_thread and seeing
what it does?
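
In the meantime, the by-hand monitoring of the sysctls quoted above could
be scripted, so that there is a record of how far the ARC overshoots its
limit before the next panic. The following is only a rough sketch, not a
tested tool: it assumes sysctl prints plain "name: value" lines with
integer byte counts, and the helper names (parse_sysctl, arc_overshoot,
read_sysctls) are made up for illustration.

```python
import subprocess

# Sysctl names taken from the thread; the helpers below are hypothetical.
NAMES = [
    "kstat.zfs.misc.arcstats.size",
    "kstat.zfs.misc.arcstats.hdr_size",
    "vfs.zfs.arc_max",
]


def parse_sysctl(text):
    """Parse 'name: value' lines into a dict of ints, skipping other lines."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'name: value' line
        try:
            values[name.strip()] = int(value.strip())
        except ValueError:
            continue  # non-numeric value, ignore it
    return values


def arc_overshoot(values):
    """Return the number of bytes by which the ARC size exceeds arc_max."""
    size = values["kstat.zfs.misc.arcstats.size"]
    limit = values["vfs.zfs.arc_max"]
    return max(0, size - limit)


def read_sysctls(names=NAMES):
    """Run sysctl once for the given names and return the parsed values."""
    result = subprocess.run(
        ["sysctl", *names], capture_output=True, text=True, check=True
    )
    return parse_sysctl(result.stdout)
```

Calling arc_overshoot(read_sysctls()) from a loop or from cron every few
seconds and logging the result with a timestamp should show whether the
size ever comes back down while the machine is idle.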
Received on Sat Apr 18 2009 - 10:58:38 UTC
