Re: zfs recv panic

From: Andriy Gapon <avg_at_FreeBSD.org>
Date: Tue, 16 May 2017 13:11:32 +0300

On 10/05/2017 12:37, Kristof Provost wrote:
> Hi,
> 
> I have a reproducible panic on CURRENT (r318136) doing
> (jupiter) # zfs send -R -v zroot/var@before-kernel-2017-04-26 | nc dual 1234
> (dual) # nc -l 1234 | zfs recv -v -F tank/jupiter/var
> 
> For clarity, the receiving machine is CURRENT r318136, the sending machine is
> running a somewhat older CURRENT version.
> 
> The receiving machine panics a few seconds in:
> 
> receiving full stream of zroot/var@before-kernel-2017-04-03 into
> tank/jupiter/var@before-kernel-2017-04-03
> panic: solaris assert: dbuf_is_metadata(db) == arc_is_metadata(buf) (0x0 ==
> 0x1), file: /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c,
> line: 2007

Kristof,

could you please try to revert commits related to the compressed send and see if
that helps?  I assume that the sending machine does not have (does not use) the
feature while the target machine is capable of the feature.

The commits are: r317648 and r317414.  Not that I really suspect that change,
but just to eliminate the possibility.
Thank you.
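
For reference, a minimal sketch of how the two commits could be backed out on
the receiving machine. It assumes /usr/src is a Subversion working copy and
that the kernel config is GENERIC; neither is stated in this thread, so adjust
SRC_DIR and KERNCONF as needed. The script only prints the commands so they
can be reviewed before being run by hand.

```shell
#!/bin/sh
# Sketch only: print the commands to reverse-merge the compressed-send
# commits and rebuild the kernel.
# Assumptions (not from the thread): SRC_DIR is a Subversion checkout and
# KERNCONF names the kernel configuration in use.
SRC_DIR="${SRC_DIR:-/usr/src}"
KERNCONF="${KERNCONF:-GENERIC}"
REVS="317648 317414"   # revisions named above, newest first

# Build one "-c -REV" per revision; a negative revision asks
# "svn merge" to undo that change in the working copy.
revert_args() {
    args=""
    for r in $REVS; do
        args="${args:+$args }-c -$r"
    done
    printf '%s\n' "$args"
}

# Print the plan instead of executing it.
plan() {
    printf 'cd %s\n' "$SRC_DIR"
    printf 'svn merge %s .\n' "$(revert_args)"
    printf 'make buildkernel KERNCONF=%s\n' "$KERNCONF"
    printf 'make installkernel KERNCONF=%s\n' "$KERNCONF"
}

plan
```

After installing the rebuilt kernel and rebooting, the same zfs send / zfs recv
pair from the original report can be rerun to see whether the assertion still
fires.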

> cpuid = 0
> time = 1494408122
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe0120cad930
> vpanic() at vpanic+0x19c/frame 0xfffffe0120cad9b0
> panic() at panic+0x43/frame 0xfffffe0120cada10
> assfail3() at assfail3+0x2c/frame 0xfffffe0120cada30
> dbuf_assign_arcbuf() at dbuf_assign_arcbuf+0xf2/frame 0xfffffe0120cada80
> dmu_assign_arcbuf() at dmu_assign_arcbuf+0x170/frame 0xfffffe0120cadad0
> receive_writer_thread() at receive_writer_thread+0x6ac/frame 0xfffffe0120cadb70
> fork_exit() at fork_exit+0x84/frame 0xfffffe0120cadbb0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0120cadbb0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> KDB: enter: panic
> [ thread pid 7 tid 100672 ]
> Stopped at      kdb_enter+0x3b: movq    $0,kdb_why
> db>
> 
> 
> kgdb backtrace:
> #0  doadump (textdump=0) at pcpu.h:232
> #1  0xffffffff803a208b in db_dump (dummy=<value optimized out>, dummy2=<value
> optimized out>, dummy3=<value optimized out>, dummy4=<value optimized out>) at
> /usr/src/sys/ddb/db_command.c:546
> #2  0xffffffff803a1e7f in db_command (cmd_table=<value optimized out>) at
> /usr/src/sys/ddb/db_command.c:453
> #3  0xffffffff803a1bb4 in db_command_loop () at /usr/src/sys/ddb/db_command.c:506
> #4  0xffffffff803a4c7f in db_trap (type=<value optimized out>, code=<value
> optimized out>) at /usr/src/sys/ddb/db_main.c:248
> #5  0xffffffff80a93cb3 in kdb_trap (type=3, code=-61456, tf=<value optimized
> out>) at /usr/src/sys/kern/subr_kdb.c:654
> #6  0xffffffff80ed3de6 in trap (frame=0xfffffe0120cad860) at
> /usr/src/sys/amd64/amd64/trap.c:537
> #7  0xffffffff80eb62f1 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:236
> #8  0xffffffff80a933eb in kdb_enter (why=0xffffffff8143d8f5 "panic", msg=<value
> optimized out>) at cpufunc.h:63
> #9  0xffffffff80a51cf9 in vpanic (fmt=<value optimized out>,
> ap=0xfffffe0120cad9f0) at /usr/src/sys/kern/kern_shutdown.c:772
> #10 0xffffffff80a51d63 in panic (fmt=<value optimized out>) at
> /usr/src/sys/kern/kern_shutdown.c:710
> #11 0xffffffff8262b26c in assfail3 (a=<value optimized out>, lv=<value optimized
> out>, op=<value optimized out>, rv=<value optimized out>, f=<value optimized
> out>, l=<value optimized out>)
>     at /usr/src/sys/cddl/compat/opensolaris/kern/opensolaris_cmn_err.c:91
> #12 0xffffffff822ad892 in dbuf_assign_arcbuf (db=0xfffff8008f23e560,
> buf=0xfffff8008f09fcc0, tx=0xfffff8008a8d5200) at
> /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:2007
> #13 0xffffffff822b87f0 in dmu_assign_arcbuf (handle=<value optimized out>,
> offset=0, buf=0xfffff8008f09fcc0, tx=0xfffff8008a8d5200) at
> /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:1542
> #14 0xffffffff822bf7fc in receive_writer_thread (arg=0xfffffe0120a1d168) at
> /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_send.c:2284
> #15 0xffffffff80a13704 in fork_exit (callout=0xffffffff822bf150
> <receive_writer_thread>, arg=0xfffffe0120a1d168, frame=0xfffffe0120cadbc0) at
> /usr/src/sys/kern/kern_fork.c:1038
> #16 0xffffffff80eb682e in fork_trampoline () at
> /usr/src/sys/amd64/amd64/exception.S:611
> #17 0x0000000000000000 in ?? ()
> 
> Let me know if there’s any other information I can provide, or things I can test.
> Fortunately the target machine is not a production machine, so I can panic it as
> often as required.

-- 
Andriy Gapon
Received on Tue May 16 2017 - 08:13:02 UTC