Re: ZFS leaking vnodes (sort of)

From: Pawel Jakub Dawidek <pjd_at_FreeBSD.org>
Date: Mon, 9 Jul 2007 02:09:18 +0200
On Sat, Jul 07, 2007 at 02:26:17PM +0100, Doug Rabson wrote:
> I've been testing ZFS recently and I noticed some performance issues 
> while doing large-scale port builds on a ZFS mounted /usr/ports tree. 
> Eventually I realised that virtually nothing ever ended up on the vnode 
> free list. This meant that when the system reached its maximum vnode 
> limit, it had to resort to reclaiming vnodes from the various 
> filesystems' active vnode lists (via vlrureclaim). Since those lists 
> are not sorted in LRU order, this led to pessimal cache performance 
> after the system got into that state.
> 
> I looked a bit closer at the ZFS code and poked around with DDB and I 
> think the problem was caused by a couple of extraneous calls to vhold 
> when creating a new ZFS vnode. On FreeBSD, getnewvnode returns a vnode 
> which is already held (not on the free list) so there is no need to 
> call vhold again.

Whoa! Nice catch... The patch works here - I did some pretty heavy
tests, so please commit it ASAP.

I also wonder if this can help with some of those 'kmem_map too small'
panics. I was observing that ARC cannot reclaim memory and this may be
because all vnodes, and thus their associated data, are being held.

To ZFS users having problems with performance and/or stability of ZFS:
Can you test the patch and see if it helps?

> This patch appears to fix the problem (only very lightly tested):
> 
> Index: zfs_vnops.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c,v
> retrieving revision 1.22
> diff -u -r1.22 zfs_vnops.c
> --- zfs_vnops.c	28 May 2007 02:37:43 -0000	1.22
> +++ zfs_vnops.c	7 Jul 2007 13:01:41 -0000
> @@ -3493,7 +3493,7 @@
>  		rele = 0;
>  	vp->v_data = NULL;
>  	ASSERT(vp->v_holdcnt > 1);
> -	vdropl(vp);
> +	VI_UNLOCK(vp);
>  	if (!zp->z_unlinked && rele)
>  		VFS_RELE(zfsvfs->z_vfs);
>  	return (0);
> Index: zfs_znode.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c,v
> retrieving revision 1.8
> diff -u -r1.8 zfs_znode.c
> --- zfs_znode.c	6 May 2007 19:05:37 -0000	1.8
> +++ zfs_znode.c	7 Jul 2007 13:17:32 -0000
> @@ -115,7 +115,6 @@
>  		ASSERT(error == 0);
>  		zp->z_vnode = vp;
>  		vp->v_data = (caddr_t)zp;
> -		vhold(vp);
>  		vp->v_vnlock->lk_flags |= LK_CANRECURSE;
>  		vp->v_vnlock->lk_flags &= ~LK_NOSHARE;
>  	} else {
> @@ -601,7 +600,6 @@
>  			ASSERT(err == 0);
>  			vp = ZTOV(zp);
>  			vp->v_data = (caddr_t)zp;
> -			vhold(vp);
>  			vp->v_vnlock->lk_flags |= LK_CANRECURSE;
>  			vp->v_vnlock->lk_flags &= ~LK_NOSHARE;
>  			vp->v_type = IFTOVT((mode_t)zp->z_phys->zp_mode);

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd_at_FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

Received on Sun Jul 08 2007 - 22:27:31 UTC