On Jul 30, 2009, at 18:41, Thomas Backman wrote:

> On Jul 30, 2009, at 17:40, Andriy Gapon wrote:
>> on 30/07/2009 18:25 Thomas Backman said the following:
>>> PS. I'll test Pawel's patch sometime after dinner. ;)
>>
>> I believe that you should get a perfect result with it.
>>
>> -- Andriy Gapon
>
> If I dare say it, you were right! I've been testing for about half
> an hour or so (probably a bit more) now.
> Still using DEBUG_VFS_LOCKS, and I've tried the test case several
> times, ran an initial backup (i.e. destroy target pool and send|recv
> the entire pool) and a few incrementals. Rebooted, tried it again.
> No panic, no problems! :)
> Let's hope it stays this way.
>
> So, in short: with that patch (copied here just in case: http://exscape.org/temp/zfs_vnops.working.patch
> ) and the libzfs patch linked previously, zfs send/recv appears to
> work just fine. I have yet to try it with clone/promote and such,
> but since those gave the same panic that this patch solved, I'm
> hoping there will be no problems with them anymore.

Arrrgh! I guess I spoke too soon after all... a new panic yet again. :(
*sigh* It feels as if this will never become stable. (Maybe that's
because I've spent all day, and most of yesterday too, on this. ;)

Steps and panic info below. (Prior to this, I tried a simple zfs promote
on one of my clones, and then reverted it by promoting the other FS
again, with no problems running the backup script.)
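For reference, the promote/revert round-trip mentioned above can be sketched roughly as follows. The dataset names (tank/fs, tank/clone) are placeholders, not from this thread, and with DRY_RUN=1 the zfs commands are only printed, so the sketch runs anywhere:

```shell
#!/bin/sh
# Hypothetical sketch of a zfs promote followed by a revert.
# tank/clone is assumed to be a clone of a snapshot of tank/fs.
DRY_RUN=${DRY_RUN:-1}

run() {
    # Print the command instead of executing it when DRY_RUN=1.
    if [ "$DRY_RUN" -eq 1 ]; then echo "$@"; else "$@"; fi
}

# Make the clone the new origin (it takes over the shared snapshots)...
run zfs promote tank/clone
# ...then revert by promoting the original filesystem back again.
run zfs promote tank/fs
```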
[root@chaos ~]# zfs destroy -r tank/testfs
[root@chaos ~]# bash backup.sh backup

(All output is from zfs, on zfs send -R -I old tank@new | zfs recv -Fvd slave.)

attempting destroy slave/testfs@backup-20090730-2009
success
attempting destroy slave/testfs@backup-20090730-1823
success
attempting destroy slave/testfs@backup-20090730-1801
success
attempting destroy slave/testfs@backup-20090730-2011
success
attempting destroy slave/testfs@backup-20090730-1827
success
attempting destroy slave/testfs
success
receiving incremental stream of tank@backup-20090730-2012 into slave@backup-20090730-2012
received 312B stream in 1 seconds (312B/sec)
receiving incremental stream of tank/tmp@backup-20090730-2012 into slave/tmp@backup-20090730-2012
received 312B stream in 1 seconds (312B/sec)
receiving incremental stream of tank/var@backup-20090730-2012 into slave/var@backup-20090730-2012
received 32.6KB stream in 1 seconds (32.6KB/sec)
receiving incremental stream of tank/var/log@backup-20090730-2012 into slave/var/log@backup-20090730-2012
received 298KB stream in 1 seconds (298KB/sec)
receiving incremental stream of tank/var/crash@backup-20090730-2012 into slave/var/crash@backup-20090730-2012
received 312B stream in 1 seconds (312B/sec)
receiving incremental stream of tank/root@backup-20090730-2012 into slave/root@backup-20090730-2012
[... panic here ...]
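The core step of backup.sh presumably looks something like the sketch below, reconstructed from the command shown in the output; the script itself was not posted, so this is a guess, not the original. The helper builds the pipeline as a string so it can be inspected (or eval'd on a real system where both pools exist):

```shell
#!/bin/sh
# Hypothetical reconstruction of backup.sh's send/recv step.
zfs_incr_pipeline() {
    src_pool=$1 dst_pool=$2 old_snap=$3 new_snap=$4
    # -R: replicated stream (descendant datasets, snapshots, properties)
    # -I: include all intermediate snapshots between old_snap and new_snap
    # -F: roll back / destroy target snapshots that no longer exist on the
    #     sender (hence the "attempting destroy ... success" lines above)
    # -d: use the sent dataset names below the target pool
    echo "zfs send -R -I @${old_snap} ${src_pool}@${new_snap} | zfs recv -Fvd ${dst_pool}"
}

# Snapshot names taken from the log above:
zfs_incr_pipeline tank slave backup-20090730-2011 backup-20090730-2012
# prints: zfs send -R -I @backup-20090730-2011 tank@backup-20090730-2012 | zfs recv -Fvd slave
```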
Unread portion of the kernel message buffer:
panic: solaris assert: ((zp)->z_vnode)->v_usecount > 0, file: /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c, line: 920
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
panic() at panic+0x182
zfsvfs_teardown() at zfsvfs_teardown+0x24d
zfs_suspend_fs() at zfs_suspend_fs+0x2b
zfs_ioc_recv() at zfs_ioc_recv+0x28b
zfsdev_ioctl() at zfsdev_ioctl+0x8a
devfs_ioctl_f() at devfs_ioctl_f+0x77
kern_ioctl() at kern_ioctl+0xf6
ioctl() at ioctl+0xfd
syscall() at syscall+0x28f
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0x800fe5f7c, rsp = 0x7fffffff8ef8, rbp = 0x7fffffff9c30 ---
KDB: enter: panic
panic: from debugger

#9  0xffffffff8057eda7 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224
#10 0xffffffff8036c8ad in kdb_enter (why=0xffffffff80609c44 "panic", msg=0xa <Address 0xa out of bounds>) at cpufunc.h:63
#11 0xffffffff8033abcb in panic (fmt=Variable "fmt" is not available.) at /usr/src/sys/kern/kern_shutdown.c:558
#12 0xffffffff80b0ec5d in zfsvfs_teardown () from /boot/kernel/zfs.ko
#13 0x0000000000100000 in ?? ()
#14 0xffffff001bff0250 in ?? ()
#15 0xffffff001bff0000 in ?? ()
#16 0xffffff0008004000 in ?? ()
#17 0xffffff803e9747a0 in ?? ()
#18 0xffffff803e9747d0 in ?? ()
#19 0xffffff803e974770 in ?? ()
#20 0xffffff803e974740 in ?? ()
#21 0xffffffff80b0ecab in zfs_suspend_fs () from /boot/kernel/zfs.ko
Previous frame inner to this frame (corrupt stack?)

Unfortunately, I'm not sure I can reproduce this reliably, since it
worked a bunch of times both before and after my previous mail. Oh, and
I'm still using -DDEBUG=1 and DEBUG_VFS_LOCKS... If this isn't a new
panic caused by the changes, perhaps it was triggered now and never
before because of -DDEBUG?

Regards,
Thomas

Received on Thu Jul 30 2009 - 16:29:38 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:53 UTC