Re: zfs: Fatal trap 12: page fault while in kernel mode

From: Thomas Backman <serenity_at_exscape.org>
Date: Thu, 30 Jul 2009 09:04:38 +0200
On Jul 29, 2009, at 23:18, Pawel Jakub Dawidek wrote:

> On Wed, Jul 29, 2009 at 10:15:06PM +0200, Thomas Backman wrote:
>> On Jul 29, 2009, at 19:18, Andriy Gapon wrote:
>>
>>>
>>> Thanks a lot again!
>>>
>>> Could you please try the following change?
>>> In sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c, in
>>> function zfs_inactive(), insert the following line:
>>> 	vrecycle(vp, curthread);
>>> before the following line:
>>> 	zfs_znode_free(zp);
>>>
>>> This is in "if (zp->z_dbuf == NULL)" branch.
>>>
>>> I hope that this should work in concert with the patch that Pawel
>>> has posted.
>>>
>>> P.S.
>>> Also Pawel has told me that adding 'CFLAGS+=-DDEBUG=1' to
>>> sys/modules/zfs/Makefile should enable additional debugging checks
>>> (ASSERTs) in ZFS code.
>>>
>>> --  
>>> Andriy Gapon
>> Better backtraces:
>>
>> Without your vrecycle() addition, and with the -DDEBUG=1 one (note to
>> self: core.txt.32):
>>
>> Unread portion of the kernel message buffer:
>> panic: solaris assert: ((zp)->z_vnode) == ((void *)0), file: /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c, line: 1043
>
> Modify zfs_inactive() 'zp->z_dbuf == NULL' case to look like this:
>
> 	if (zp->z_dbuf == NULL) {
> 		/*
> 		 * The fs has been unmounted, or we did a
> 		 * suspend/resume and this file no longer exists.
> 		 */
> 		VI_LOCK(vp);
> 		vp->v_count = 0; /* count arrives as 1 */
> 		vp->v_data = NULL;
> 		VI_UNLOCK(vp);
> 		rw_exit(&zfsvfs->z_teardown_inactive_lock);
> 		ZTOV(zp) = NULL;
> 		vrecycle(vp, curthread);
> 		zfs_znode_free(zp);
> 		return;
> 	}
New code, new panic. :(
Same place as before, on exporting.

panic: solaris assert: zp != ((void *)0), file: /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c, line: 4357
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
panic() at panic+0x182
zfs_freebsd_reclaim() at zfs_freebsd_reclaim+0x1f2
VOP_RECLAIM_APV() at VOP_RECLAIM_APV+0x4a
vgonel() at vgonel+0x12e
vrecycle() at vrecycle+0x7d
zfs_inactive() at zfs_inactive+0x1aa
zfs_freebsd_inactive() at zfs_freebsd_inactive+0x1a
VOP_INACTIVE_APV() at VOP_INACTIVE_APV+0x4a
vinactive() at vinactive+0x6a
vput() at vput+0x1c6
dounmount() at dounmount+0x4af
unmount() at unmount+0x3c8
syscall() at syscall+0x28f
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (22, FreeBSD ELF64, unmount), rip = 0x80104e9ec, rsp = 0x7fffffffaa98, rbp = 0x801223300 ---
KDB: enter: panic
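
One possible reading of this trace, sketched as a user-space model rather than kernel code: the new zfs_inactive() block sets vp->v_data to NULL and then calls vrecycle(), which (per the backtrace) reaches zfs_freebsd_reclaim() through vgonel() and VOP_RECLAIM. That function starts with "znode_t *zp = VTOZ(vp); ASSERT(zp != NULL);" (see the diff below), so if VTOZ() reads v_data, the assert would fire. The struct layouts, the VTOZ() definition and all function names below are simplified assumptions for illustration, not the real kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for the kernel types. */
struct znode { int z_id; };
struct vnode { void *v_data; };

/* Assumption: VTOZ() simply reads the vnode's private data pointer. */
#define VTOZ(vp) ((struct znode *)(vp)->v_data)

/*
 * Models the check at the top of zfs_freebsd_reclaim().
 * Returns 1 if ASSERT(zp != NULL) would hold, 0 if it would fire.
 */
static int
model_reclaim_assert_holds(struct vnode *vp)
{
	struct znode *zp = VTOZ(vp);

	return (zp != NULL);
}

/* Models vrecycle() -> vgonel() -> VOP_RECLAIM() from the backtrace. */
static int
model_vrecycle(struct vnode *vp)
{
	return (model_reclaim_assert_holds(vp));
}

/* Call order used in the modified zfs_inactive() block. */
static int
inactive_clear_then_recycle(struct vnode *vp)
{
	vp->v_data = NULL;		/* cleared first ... */
	return (model_vrecycle(vp));	/* ... so reclaim sees zp == NULL */
}

/* Contrast case only: v_data still set when reclaim runs. */
static int
inactive_recycle_only(struct vnode *vp)
{
	return (model_vrecycle(vp));
}
```

Calling inactive_clear_then_recycle() on a vnode whose v_data points at a znode returns 0 (the modeled assert fires), while inactive_recycle_only() returns 1. The second function is there only for contrast, not as a proposed fix; the model ignores locking and the znode's lifetime entirely.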

[lockedvnods]
0xffffff000bf8f3b0: tag zfs, type VDIR
     usecount 0, writecount 0, refcount 1 mountedhere 0
     flags (VI_DOOMED|VI_DOINGINACT)
     lock type zfs: EXCL by thread 0xffffff00450b0390 (pid 1400)
panic: from debugger
cpuid = 0
Uptime: 1m34s
Physical memory: 2030 MB
Dumping 1407 MB: ...

#11 0xffffffff8033a9cb in panic (fmt=Variable "fmt" is not available.
)
     at /usr/src/sys/kern/kern_shutdown.c:558
#12 0xffffffff80b110c2 in zfs_freebsd_reclaim () from /boot/kernel/zfs.ko
#13 0xffffffff805c5c2a in VOP_RECLAIM_APV (vop=0x0, a=0xffffff803e9578f0)
     at vnode_if.c:1926
#14 0xffffffff803c839e in vgonel (vp=0xffffff000bf8f3b0) at vnode_if.h:830
#15 0xffffffff803ca7ad in vrecycle (vp=0xffffff000bf8f3b0, td=Variable "td" is not available.
)
     at /usr/src/sys/kern/vfs_subr.c:2504
#16 0xffffffff80b109ea in zfs_inactive () from /boot/kernel/zfs.ko
#17 0xffffffff80b88220 in ?? ()
#18 0xffffff803e9579f0 in ?? ()
#19 0xffffff00450b0390 in ?? ()
#20 0x0000000000000000 in ?? ()
#21 0xffffff803e957a40 in ?? ()
#22 0xffffff803e9579c0 in ?? ()
#23 0xffffffff80b10a9a in zfs_freebsd_inactive () from /boot/kernel/zfs.ko
#24 0xffffffff805c5b5a in VOP_INACTIVE_APV (vop=0xffffff000bf8f470, a=0xffffff0045146a48)
     at vnode_if.c:1863
#25 0xffffffff803c6aaa in vinactive (vp=0xffffff000bf8f3b0, td=0xffffff000bf8f3b0)
     at vnode_if.h:807
#26 0xffffffff803cbf26 in vput (vp=0xffffff000bf8f3b0)
     at /usr/src/sys/kern/vfs_subr.c:2257
#27 0xffffffff803c57ef in dounmount (mp=0xffffff0002d0e8d0, flags=0, td=Variable "td" is not available.
)
#28 0xffffffff803c5df8 in unmount (td=0xffffff00450b0390, uap=0xffffff803e957bf0)
     at /usr/src/sys/kern/vfs_mount.c:1174
#29 0xffffffff805980bf in syscall (frame=0xffffff803e957c80)
     at /usr/src/sys/amd64/amd64/trap.c:984
#30 0xffffffff8057e2c1 in Xfast_syscall ()
     at /usr/src/sys/amd64/amd64/exception.S:373
#31 0x000000080104e9ec in ?? ()
Previous frame inner to this frame (corrupt stack?)

BTW, here's my svn diff output (in /usr/src; one irrelevant patch not  
shown; I used your previous zfs_vnops.2.c patch and then replaced the  
if block as above):

Index: sys/modules/zfs/Makefile
===================================================================
--- sys/modules/zfs/Makefile    (revision 195910)
+++ sys/modules/zfs/Makefile    (working copy)
@@ -97,3 +97,4 @@
  CWARNFLAGS+=-Wno-inline
  CWARNFLAGS+=-Wno-switch
  CWARNFLAGS+=-Wno-pointer-arith
+CFLAGS+=-DDEBUG=1
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
===================================================================
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c    (revision 195910)
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c    (working copy)
@@ -3709,11 +3709,13 @@
                  * The fs has been unmounted, or we did a
                  * suspend/resume and this file no longer exists.
                  */
-               mutex_enter(&zp->z_lock);
                 VI_LOCK(vp);
                 vp->v_count = 0; /* count arrives as 1 */
-               mutex_exit(&zp->z_lock);
+               vp->v_data = NULL;
+               VI_UNLOCK(vp);
                 rw_exit(&zfsvfs->z_teardown_inactive_lock);
+               ZTOV(zp) = NULL;
+               vrecycle(vp, curthread);
                 zfs_znode_free(zp);
                 return;
         }
@@ -4351,7 +4353,6 @@
  {
         vnode_t *vp = ap->a_vp;
         znode_t *zp = VTOZ(vp);
-       zfsvfs_t *zfsvfs;

         ASSERT(zp != NULL);

@@ -4361,13 +4362,18 @@
         vnode_destroy_vobject(vp);

         mutex_enter(&zp->z_lock);
-       ASSERT(zp->z_phys);
+       ASSERT(zp->z_phys != NULL);
         ZTOV(zp) = NULL;
-       if (!zp->z_unlinked) {
+       mutex_exit(&zp->z_lock);
+
+       if (zp->z_unlinked)
+               ;       /* Do nothing. */
+       else if (zp->z_dbuf == NULL)
+               zfs_znode_free(zp);
+       else /* if (!zp->z_unlinked && zp->z_dbuf != NULL) */ {
+               zfsvfs_t *zfsvfs = zp->z_zfsvfs;
                 int locked;

-               zfsvfs = zp->z_zfsvfs;
-               mutex_exit(&zp->z_lock);
                 locked = MUTEX_HELD(ZFS_OBJ_MUTEX(zfsvfs, zp->z_id)) ? 2 :
                     ZFS_OBJ_HOLD_TRYENTER(zfsvfs, zp->z_id);
                 if (locked == 0) {
@@ -4383,8 +4389,6 @@
                                 ZFS_OBJ_HOLD_EXIT(zfsvfs, zp->z_id);
                         zfs_znode_free(zp);
                 }
-       } else {
-               mutex_exit(&zp->z_lock);
         }
         VI_LOCK(vp);
         vp->v_data = NULL;


Should I revert to the svn state and then change the if clause as  
above, or is this correct?

Regards,
Thomas
Received on Thu Jul 30 2009 - 05:05:01 UTC
