Re: ZFS: unlimited arc cache growth?

From: Alexander Leidinger <Alexander_at_Leidinger.net>
Date: Sat, 18 Apr 2009 09:48:21 +0200
On Fri, 17 Apr 2009 10:04:15 -0400 Ben Kelly <ben_at_wanderview.com> wrote:


> On Apr 17, 2009, at 8:50 AM, Alexander Leidinger wrote:
> > to fs_at_, please CC me, as I'm not subscribed.
> >
> > For a while I monitored (by hand) the sysctls
> > kstat.zfs.misc.arcstats.size and kstat.zfs.misc.arcstats.hdr_size.
> > Both grow way higher (at some point I've seen more than 500M) than
> > what I have configured in vfs.zfs.arc_max (40M).
> >
> > After a while FS operations (e.g. pkgdb -F with about 900  
> > packages... my specific workload is the fixup of gnome packages  
> > after the removal of the obsolete libusb port) get very slow (in
> > my specific example I let the pkgdb run several times over night
> > and it still is not finished).
> >
> > The big problem with this is that at some point in time the
> > machine reboots (panic, page fault, page not present, during a
> > fork1). I have the impression (beware, I have a watchdog
> > configured, as I don't know if a triggered WD would cause the same
> > panic, the following is just a guess) that I run out of memory of
> > some kind (I have 1G RAM, i386, max kmem size 700M). I restarted
> > pkgdb several times after a reboot, and it continues to process the
> > libusb removal, but hey, this is annoying.
> >
> > Does someone see something similar to what I describe (mainly the  
> > growth of the arc cache way beyond what is configured)? Anyone
> > with some ideas what to try?
> 
> Can you provide the rest of the arcstats from sysctl?  Also, does
> your arc_reclaim_thread process get any cycles when this problem
> occurs? What happens if you kill the pkgdb -F manually before it
> completes? Does the arc cache size come back down or is it stuck at
> the abnormally high level?

I haven't tried killing pkgdb and looking at the stats, but on the idle
machine (rebooted after the panic, then 5h without use by me; the machine
fetches my mail, runs a webmail + mysql + imap interface and is a
fileserver) the size is double my configured max value. Again, there's no
real load at this time, just fetching my mail (most traffic from the
FreeBSD lists) and a little SpamAssassin filtering of it. When I logged
in this morning, the machine had been rebooted about 5h earlier by a
panic and no FS traffic was going on (100% idle).

Currently the arc_reclaim_thread has 0:12 of accumulated CPU time and
its WCPU is at 0%, but it is in the running state. The machine is
about 80% idle.

Here are all ZFS sysctls as of now (pkgdb was started 5 min ago):
---snip---
# sysctl -a | grep zfs
vfs.zfs.arc_meta_limit: 10485760
vfs.zfs.arc_meta_used: 130211600
vfs.zfs.mdcomp_disable: 0
vfs.zfs.arc_min: 22937600
vfs.zfs.arc_max: 41943040
vfs.zfs.zfetch.array_rd_sz: 1048576
vfs.zfs.zfetch.block_cap: 256
vfs.zfs.zfetch.min_sec_reap: 2
vfs.zfs.zfetch.max_streams: 8
vfs.zfs.prefetch_disable: 1
vfs.zfs.recover: 0
vfs.zfs.txg.synctime: 5
vfs.zfs.txg.timeout: 30
vfs.zfs.scrub_limit: 10
vfs.zfs.vdev.cache.bshift: 13
vfs.zfs.vdev.cache.size: 5242880
vfs.zfs.vdev.cache.max: 16384
vfs.zfs.vdev.aggregation_limit: 131072
vfs.zfs.vdev.ramp_rate: 2
vfs.zfs.vdev.time_shift: 6
vfs.zfs.vdev.min_pending: 4
vfs.zfs.vdev.max_pending: 6
vfs.zfs.cache_flush_disable: 0
vfs.zfs.zil_disable: 0
vfs.zfs.version.zpl: 3
vfs.zfs.version.vdev_boot: 1
vfs.zfs.version.spa: 13
vfs.zfs.version.dmu_backup_stream: 1
vfs.zfs.version.dmu_backup_header: 2
vfs.zfs.version.acl: 1
vfs.zfs.debug: 0
vfs.zfs.super_owner: 0
kstat.zfs.misc.arcstats.hits: 2483157
kstat.zfs.misc.arcstats.misses: 604115
kstat.zfs.misc.arcstats.demand_data_hits: 187200
kstat.zfs.misc.arcstats.demand_data_misses: 78685
kstat.zfs.misc.arcstats.demand_metadata_hits: 2295957
kstat.zfs.misc.arcstats.demand_metadata_misses: 525430
kstat.zfs.misc.arcstats.prefetch_data_hits: 0
kstat.zfs.misc.arcstats.prefetch_data_misses: 0
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 0
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 0
kstat.zfs.misc.arcstats.mru_hits: 1621026
kstat.zfs.misc.arcstats.mru_ghost_hits: 32102
kstat.zfs.misc.arcstats.mfu_hits: 862131
kstat.zfs.misc.arcstats.mfu_ghost_hits: 18804
kstat.zfs.misc.arcstats.deleted: 550853
kstat.zfs.misc.arcstats.recycle_miss: 287993
kstat.zfs.misc.arcstats.mutex_miss: 2
kstat.zfs.misc.arcstats.evict_skip: 654418
kstat.zfs.misc.arcstats.hash_elements: 5363
kstat.zfs.misc.arcstats.hash_elements_max: 8569
kstat.zfs.misc.arcstats.hash_collisions: 133396
kstat.zfs.misc.arcstats.hash_chains: 739
kstat.zfs.misc.arcstats.hash_chain_max: 5
kstat.zfs.misc.arcstats.p: 41943040
kstat.zfs.misc.arcstats.c: 41943040
kstat.zfs.misc.arcstats.c_min: 22937600
kstat.zfs.misc.arcstats.c_max: 41943040
kstat.zfs.misc.arcstats.size: 130467088
kstat.zfs.misc.arcstats.hdr_size: 730456
kstat.zfs.misc.arcstats.l2_hits: 0
kstat.zfs.misc.arcstats.l2_misses: 0
kstat.zfs.misc.arcstats.l2_feeds: 0
kstat.zfs.misc.arcstats.l2_rw_clash: 0
kstat.zfs.misc.arcstats.l2_writes_sent: 0
kstat.zfs.misc.arcstats.l2_writes_done: 0
kstat.zfs.misc.arcstats.l2_writes_error: 0
kstat.zfs.misc.arcstats.l2_writes_hdr_miss: 0
kstat.zfs.misc.arcstats.l2_evict_lock_retry: 0
kstat.zfs.misc.arcstats.l2_evict_reading: 0
kstat.zfs.misc.arcstats.l2_free_on_write: 0
kstat.zfs.misc.arcstats.l2_abort_lowmem: 0
kstat.zfs.misc.arcstats.l2_cksum_bad: 0
kstat.zfs.misc.arcstats.l2_io_error: 0
kstat.zfs.misc.arcstats.l2_size: 0
kstat.zfs.misc.arcstats.l2_hdr_size: 0
kstat.zfs.misc.arcstats.memory_throttle_count: 0
kstat.zfs.misc.vdev_cache_stats.delegations: 2728
kstat.zfs.misc.vdev_cache_stats.hits: 297326
kstat.zfs.misc.vdev_cache_stats.misses: 368918
---snip---
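
To put the numbers above in relation to the configured limits, here is a
minimal Python sketch. It is only an illustration: the function names
(parse_sysctl, overshoot) and the report keys are made up for this
example, and the sample text is a handful of lines copied from the dump
above rather than live sysctl output.

```python
# Illustrative sketch only: the helper names are hypothetical, and SAMPLE
# holds a few lines copied verbatim from the sysctl dump in this mail.
SAMPLE = """\
vfs.zfs.arc_meta_limit: 10485760
vfs.zfs.arc_meta_used: 130211600
vfs.zfs.arc_max: 41943040
kstat.zfs.misc.arcstats.c_max: 41943040
kstat.zfs.misc.arcstats.size: 130467088
kstat.zfs.misc.arcstats.hdr_size: 730456
"""


def parse_sysctl(text):
    """Parse 'name: value' lines into a dict, keeping only integer values."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.partition(": ")
        value = value.strip()
        if sep and value.lstrip("-").isdigit():
            stats[name.strip()] = int(value)
    return stats


def overshoot(stats):
    """Relate the current ARC usage to the configured limits."""
    size = stats["kstat.zfs.misc.arcstats.size"]
    c_max = stats["kstat.zfs.misc.arcstats.c_max"]
    meta_used = stats["vfs.zfs.arc_meta_used"]
    meta_limit = stats["vfs.zfs.arc_meta_limit"]
    return {
        "size_over_max": size / c_max,             # ARC size vs. c_max
        "meta_over_limit": meta_used / meta_limit,  # metadata vs. its limit
        "meta_share_of_size": meta_used / size,     # metadata share of ARC
    }


stats = parse_sysctl(SAMPLE)
report = overshoot(stats)
for key, value in report.items():
    print(f"{key}: {value:.2f}")
```

With the values from this dump, the ARC size comes out at roughly three
times c_max, and arc_meta_used at roughly twelve times arc_meta_limit, so
nearly everything held in the ARC at this point is accounted as metadata.
That is just what the numbers say, not a diagnosis of the cause.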

Bye,
Alexander.
Received on Sat Apr 18 2009 - 05:48:31 UTC
