Re: Panic in fusefs(5) on 13.0-CURRENT r359773

From: Konstantin Belousov <kostikbel_at_gmail.com> Date: Mon, 27 Apr 2020 13:51:24 +0300 · This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:23 UTC

On Mon, Apr 27, 2020 at 11:39:41AM +0200, Mateusz Piotrowski wrote:
> Hi everyone!
> 
> I'm experiencing panics every day. Apparently, they are caused by a bug in
> our FUSE implementation.
> 
> The panic occurs when I'm editing files in a folder mounted via SSHFS from a
> bhyve VM running Ubuntu 18.04.
> 
> I'm sending some parts of the core.txt file. Let me know if I can provide
> any additional information.
> 
> Cheers,
> 
> Mateusz
> 
> -------
> 
> t480 dumped core - see /var/crash/vmcore.3
> 
> Mon Apr 27 11:30:19 CEST 2020
> 
> FreeBSD t480 13.0-CURRENT FreeBSD 13.0-CURRENT #4 r359773: Sat Apr 11
> 05:32:31 CEST 2020 root_at_t480:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64
> 
> panic: Assertion bp->b_flags & B_VMIO failed at
> /usr/src/sys/fs/fuse/fuse_node.c:433
> 
> GNU gdb (GDB) 9.1 [GDB v9.1 for FreeBSD]
> Copyright (C) 2020 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later
> <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.
> Type "show copying" and "show warranty" for details.
> This GDB was configured as "x86_64-portbld-freebsd13.0".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>.
> Find the GDB manual and other documentation resources online at:
>     <http://www.gnu.org/software/gdb/documentation/>.
> 
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from /boot/kernel/kernel...
> Reading symbols from /usr/lib/debug//boot/kernel/kernel.debug...
> 
> Unread portion of the kernel message buffer:
> WARNING !drm_modeset_is_locked(&crtc->mutex) failed at /usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/drm_atomic_helper.c:577
> WARNING !drm_modeset_is_locked(&crtc->mutex) failed at /usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/drm_atomic_helper.c:577
> WARNING !drm_modeset_is_locked(&crtc->mutex) failed at /usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/drm_atomic_helper.c:577
> WARNING !drm_modeset_is_locked(&dev->mode_config.connection_mutex) failed at /usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/drm_atomic_helper.c:622
> WARNING !drm_modeset_is_locked(&plane->mutex) failed at /usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/drm_atomic_helper.c:821
> WARNING !drm_modeset_is_locked(&plane->mutex) failed at /usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/drm_atomic_helper.c:821
> WARNING !drm_modeset_is_locked(&plane->mutex) failed at /usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/drm_atomic_helper.c:821
> WARNING !drm_modeset_is_locked(&plane->mutex) failed at /usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/drm_atomic_helper.c:821
> WARNING !drm_modeset_is_locked(&plane->mutex) failed at /usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/drm_atomic_helper.c:821
> WARNING !drm_modeset_is_locked(&plane->mutex) failed at /usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/drm_atomic_helper.c:821
> WARNING !drm_modeset_is_locked(&plane->mutex) failed at /usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/drm_atomic_helper.c:821
> WARNING !drm_modeset_is_locked(&plane->mutex) failed at /usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/drm_atomic_helper.c:821
> WARNING !drm_modeset_is_locked(&plane->mutex) failed at /usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/drm_atomic_helper.c:821
> <4>WARN_ON(!mutex_is_locked(&dev->struct_mutex))
> 
> <4>WARN_ON(!mutex_is_locked(&fbc->lock))
> 
> <4>WARN_ON(!mutex_is_locked(&fbc->lock))WARN_ON(!mutex_is_locked(&fbc->lock))WARN_ON(!mutex_is_locked(&fbc->lock))
> panic: Assertion bp->b_flags & B_VMIO failed at
> /usr/src/sys/fs/fuse/fuse_node.c:433
> cpuid = 5
> time = 1587979693
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> 0xfffffe00cc0ef330
> vpanic() at vpanic+0x182/frame 0xfffffe00cc0ef380
> panic() at panic+0x43/frame 0xfffffe00cc0ef3e0
> fuse_vnode_setsize() at fuse_vnode_setsize+0x135/frame 0xfffffe00cc0ef430
> fuse_internal_cache_attrs() at fuse_internal_cache_attrs+0xc4/frame
> 0xfffffe00cc0ef490
> fuse_vnop_lookup() at fuse_vnop_lookup+0x6bf/frame 0xfffffe00cc0ef620
> lookup() at lookup+0x5e1/frame 0xfffffe00cc0ef6c0
> namei() at namei+0x524/frame 0xfffffe00cc0ef7b0
> kern_statat() at kern_statat+0x7f/frame 0xfffffe00cc0ef8d0
> sys_fstatat() at sys_fstatat+0x2f/frame 0xfffffe00cc0ef9d0
> amd64_syscall() at amd64_syscall+0x140/frame 0xfffffe00cc0efaf0
> fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe00cc0efaf0
> --- syscall (552, FreeBSD ELF64, sys_fstatat), rip = 0x8006c2aaa, rsp =
> 0x7fffffffda78, rbp = 0x7fffffffdb20 ---
> KDB: enter: panic
> 
> __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
> 55        __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct pcpu,
> (kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
> #1  doadump (textdump=0) at /usr/src/sys/kern/kern_shutdown.c:394
> #2  0xffffffff8049a43a in db_dump (dummy=<optimized out>,
>     dummy2=<unavailable>, dummy3=<unavailable>, dummy4=<unavailable>)
>     at /usr/src/sys/ddb/db_command.c:575
> #3  0xffffffff8049a1fc in db_command (last_cmdp=<optimized out>,
>     cmd_table=<optimized out>, dopager=1) at
> /usr/src/sys/ddb/db_command.c:482
> #4  0xffffffff80499f6d in db_command_loop ()
>     at /usr/src/sys/ddb/db_command.c:535
> #5  0xffffffff8049d168 in db_trap (type=<optimized out>, code=<optimized
> out>)
>     at /usr/src/sys/ddb/db_main.c:253
> #6  0xffffffff80c06f24 in kdb_trap (type=3, code=0, tf=<optimized out>)
>     at /usr/src/sys/kern/subr_kdb.c:699
> #7  0xffffffff8105bc68 in trap (frame=0xfffffe00cc0ef260)
>     at /usr/src/sys/amd64/amd64/trap.c:578
> #8  <signal handler called>
> #9  kdb_enter (why=0xffffffff811ea4f2 "panic", msg=<optimized out>)
>     at /usr/src/sys/kern/subr_kdb.c:486
> #10 0xffffffff80bbc9fe in vpanic (fmt=<optimized out>, ap=<optimized out>)
>     at /usr/src/sys/kern/kern_shutdown.c:902
> #11 0xffffffff80bbc793 in panic (
>     fmt=0xffffffff81c8db28 <cnputs_mtx> "*\350\032\201\377\377\377\377")
>     at /usr/src/sys/kern/kern_shutdown.c:839
> #12 0xffffffff83f0ba05 in fuse_vnode_setsize (vp=0xfffff801e94515b8,
>     newsize=13793) at /usr/src/sys/fs/fuse/fuse_node.c:433
> #13 0xffffffff83f15a74 in fuse_internal_cache_attrs (vp=0xfffff801e94515b8,
>     attr=0xfffffe012ba5b028, attr_valid=1, attr_valid_nsec=0, vap=0x0)
>     at /usr/src/sys/fs/fuse/fuse_internal.c:267
> #14 0xffffffff83f1346e in fuse_vnop_lookup (ap=<optimized out>)
>     at /usr/src/sys/fs/fuse/fuse_vnops.c:1187
> #15 0xffffffff80c873b1 in VOP_LOOKUP (dvp=0xfffff801e9451988,
>     vpp=0xfffffe00cc0ef820, cnp=0xfffffe00cc0ef850) at ./vnode_if.h:54
> #16 lookup (ndp=0xfffffe00cc0ef7c0) at /usr/src/sys/kern/vfs_lookup.c:951
> #17 0xffffffff80c868e4 in namei (ndp=0xfffffe00cc0ef7c0)
>     at /usr/src/sys/kern/vfs_lookup.c:512
> #18 0xffffffff80ca210f in kern_statat (td=0xfffffe00cc61f800,
>     flag=<optimized out>, fd=<optimized out>,
>     path=0x800c9ff50 <error: Cannot access memory at address 0x800c9ff50>,
>     pathseg=UIO_USERSPACE, sbp=0xfffffe00cc0ef8e8, hook=0x0)
>     at /usr/src/sys/kern/vfs_syscalls.c:2340
> #19 0xffffffff80ca28cf in sys_fstatat (td=0xffffffff81c8db28 <cnputs_mtx>,
>     uap=0xfffffe00cc61fbd8) at /usr/src/sys/kern/vfs_syscalls.c:2317
> #20 0xffffffff8105caa0 in syscallenter (td=<optimized out>)
>     at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:162
> #21 amd64_syscall (td=0xfffffe00cc61f800, traced=0)
>     at /usr/src/sys/amd64/amd64/trap.c:1161
> #22 <signal handler called>
> #23 0x00000008006c2aaa in ?? ()
> Backtrace stopped: Cannot access memory at address 0x7fffffffda78
> (kgdb)

I believe that in your case, the file is stat(2)-ed without being opened
before, and might be because the stat(2) is called twice, while file size
is changed, fusefs wants to resize.

For such vnodes, v_object is not created yet, so getblk() sets B_CACHE
because both B_VMIO and B_INVAL are clear.  So it might be worth just do
nothing if v_object is NULL even if newsize < oldsize.

What I do not understand is why vfs_bio_clrbuf() is correct there at all.
It has block granularity, so either we clear too much or too little.
And the data after EOF in the last buffer is not going to be used anyway.
At worst, it might cause userspace which mapped the backing pages to still
see the data after EOF, and if this is a problem, then manual bzero() is
needed anyway.