Re: Possible deadlock on IO / page fault

From: Konstantin Belousov <kostikbel_at_gmail.com> Date: Tue, 29 Sep 2020 16:20:26 +0300 · This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:25 UTC

On Tue, Sep 29, 2020 at 02:59:43PM +0300, Michael Zhilin wrote:
> Hi,
> 
> I'm using FreeBSD 13-CURRENT (pre-ZoF, r359724) on my laptop with installed
> Gnome. Sometimes
> (once a week/month) gnome hangs and the system may be still responsible
> (may be not).
> This week it happened again and I've gathered information via ddb/textdump
> and rebooted laptop.
> 
> gnome-shell is trying to get exclusive lock on some directory according to
> information
> from "show alllocks" and "bt":
> 
>     Process 2355 (gnome-shell) thread 0xfffffe00b03d4700 (100483)
>     exclusive lockmgr zfs (zfs) r = 0 (0xfffff802112fe808) locked _at_
> /usr/src/sys/kern/vfs_subr.c:2930
> 
>     Tracing command gnome-shell pid 2355 tid 100483 td 0xfffffe00b03d4700
>     sched_switch() at sched_switch+0x5b2/frame 0xfffffe00aeff7220
>     mi_switch() at mi_switch+0x155/frame 0xfffffe00aeff7240
>     sleepq_switch() at sleepq_switch+0x11a/frame 0xfffffe00aeff7280
>     sleeplk() at sleeplk+0x106/frame 0xfffffe00aeff72e0
>     lockmgr_xlock_hard() at lockmgr_xlock_hard+0x21f/frame
> 0xfffffe00aeff7380
>     _vn_lock() at _vn_lock+0x54/frame 0xfffffe00aeff73e0
>     zfs_lookup() at zfs_lookup+0x5f7/frame 0xfffffe00aeff74c0
>     zfs_freebsd_cachedlookup() at zfs_freebsd_cachedlookup+0x8e/frame
> 0xfffffe00aeff7600
>     vfs_cache_lookup() at vfs_cache_lookup+0xa8/frame 0xfffffe00aeff7650
>     lookup() at lookup+0x5e1/frame 0xfffffe00aeff76f0
>     namei() at namei+0x524/frame 0xfffffe00aeff77e0
>     vn_open_cred() at vn_open_cred+0x10b/frame 0xfffffe00aeff7930
>     kern_openat() at kern_openat+0x1fa/frame 0xfffffe00aeff7aa0
>     filemon_wrapper_openat() at filemon_wrapper_openat+0x15/frame
> 0xfffffe00aeff7ad0
>     amd64_syscall() at amd64_syscall+0x140/frame 0xfffffe00aeff7bf0
>     fast_syscall_common() at fast_syscall_common+0x101/frame
> 0xfffffe00aeff7bf0
>     --- syscall (499, FreeBSD ELF64, filemon_wrapper_openat), rip =
> 0x80148dbca, rsp = 0x7fffffffe248, rbp = 0x7fffffffe2c0 ---
> 
> Line vfs_subr.c:2930 mentioned by lockmgr is part of vget_finish. I've
> found blocked stack
> with shared lock request:
> 
>     Tracing command gsd-color pid 2422 tid 100784 td 0xfffffe00b08e0700
>     sched_switch() at sched_switch+0x5b2/frame 0xfffffe00b07084f0
>     mi_switch() at mi_switch+0x155/frame 0xfffffe00b0708510
>     sleepq_switch() at sleepq_switch+0x11a/frame 0xfffffe00b0708550
>     sleeplk() at sleeplk+0x106/frame 0xfffffe00b07085b0
>     lockmgr_slock_hard() at lockmgr_slock_hard+0x1ce/frame
> 0xfffffe00b0708640
>     _vn_lock() at _vn_lock+0x54/frame 0xfffffe00b07086a0
>     vget_finish() at vget_finish+0x42/frame 0xfffffe00b07086d0
>     cache_lookup() at cache_lookup+0x57c/frame 0xfffffe00b0708790
>     vfs_cache_lookup() at vfs_cache_lookup+0x7d/frame 0xfffffe00b07087e0
>     lookup() at lookup+0x5e1/frame 0xfffffe00b0708880
>     namei() at namei+0x524/frame 0xfffffe00b0708970
>     kern_accessat() at kern_accessat+0x106/frame 0xfffffe00b0708ad0
>     amd64_syscall() at amd64_syscall+0x140/frame 0xfffffe00b0708bf0
>     fast_syscall_common() at fast_syscall_common+0x101/frame
> 0xfffffe00b0708bf0
>     --- syscall (33, FreeBSD ELF64, sys_access), rip = 0x800fd0eba, rsp =
> 0x7fffffffe268, rbp = 0x7fffffffe380 ---
> 
> All other locked threads look more interesting. Thread 100747 is trying to
> write
> into file and waiting for page, but thread 101436 is trying to handle page
> fault
> and waiting for a shared lock of locked_range due to present writer. I
> suppose it's
> deadlock caused all troubles.
> 
>     Tracing command dconf-service pid 2384 tid 100747 td 0xfffffe00b08cc000
>     sched_switch() at sched_switch+0x5b2/frame 0xfffffe00b0d762e0
>     mi_switch() at mi_switch+0x155/frame 0xfffffe00b0d76300
>     sleepq_switch() at sleepq_switch+0x11a/frame 0xfffffe00b0d76340
>     _vm_page_busy_sleep() at _vm_page_busy_sleep+0x110/frame
> 0xfffffe00b0d76390
>     vm_page_acquire_unlocked() at vm_page_acquire_unlocked+0x177/frame
> 0xfffffe00b0d763f0
>     vm_page_grab_valid_unlocked() at vm_page_grab_valid_unlocked+0x51/frame
> 0xfffffe00b0d76430
>     zfs_freebsd_write() at zfs_freebsd_write+0x9b6/frame 0xfffffe00b0d76640
>     VOP_WRITE_APV() at VOP_WRITE_APV+0xa7/frame 0xfffffe00b0d76750
>     vn_write() at vn_write+0x2a4/frame 0xfffffe00b0d767d0
>     vn_io_fault_doio() at vn_io_fault_doio+0x43/frame 0xfffffe00b0d76830
>     vn_io_fault1() at vn_io_fault1+0x16c/frame 0xfffffe00b0d76980
>     vn_io_fault() at vn_io_fault+0x182/frame 0xfffffe00b0d769f0
>     dofilewrite() at dofilewrite+0x81/frame 0xfffffe00b0d76a40
>     kern_pwritev() at kern_pwritev+0x62/frame 0xfffffe00b0d76a80
>     sys_pwrite() at sys_pwrite+0x8a/frame 0xfffffe00b0d76ad0
>     amd64_syscall() at amd64_syscall+0x140/frame 0xfffffe00b0d76bf0
>     fast_syscall_common() at fast_syscall_common+0x101/frame
> 0xfffffe00b0d76bf0
>     --- syscall (476, FreeBSD ELF64, sys_pwrite), rip = 0x8007428ca, rsp =
> 0x7fffffffdf08, rbp = 0x7fffffffdf20 ---
> 
>     Tracing command evolution pid 4536 tid 101436 td 0xfffffe00bf484c00
>     sched_switch() at sched_switch+0x5b2/frame 0xfffffe00bfd446e0
>     mi_switch() at mi_switch+0x155/frame 0xfffffe00bfd44700
>     sleepq_switch() at sleepq_switch+0x11a/frame 0xfffffe00bfd44740
>     _cv_wait() at _cv_wait+0x15a/frame 0xfffffe00bfd447a0
>     rangelock_enter() at rangelock_enter+0x306/frame 0xfffffe00bfd447f0
This call to rangelock_enter() looks suspicious.  This is a call to ZFS
own rangelocks, not our rangelocks.  Still, if write took rangelock on the
same range, we get a deadlock due to LoR between rangelock and page busy.

>     zfs_freebsd_getpages() at zfs_freebsd_getpages+0x14f/frame
> 0xfffffe00bfd448a0
>     vnode_pager_getpages() at vnode_pager_getpages+0x37/frame
> 0xfffffe00bfd448e0
>     vm_pager_get_pages() at vm_pager_get_pages+0x4f/frame 0xfffffe00bfd44930
>     vm_fault() at vm_fault+0x780/frame 0xfffffe00bfd44a40
>     vm_fault_trap() at vm_fault_trap+0x6e/frame 0xfffffe00bfd44a80
>     trap_pfault() at trap_pfault+0x1ee/frame 0xfffffe00bfd44ae0
>     trap() at trap+0x44c/frame 0xfffffe00bfd44bf0
>     calltrap() at calltrap+0x8/frame 0xfffffe00bfd44bf0
>     --- trap 0xc, rip = 0x80a55de3f, rsp = 0x7fffffffcc60, rbp =
> 0x7fffffffcc60 ---
> 
> If it happens again, I will try to gather more information about locks,
> corresponding
> files and directories. For a while I wonder how to reproduce this case. For
> instance,
> one program has mmapped huge file and reads it infinitely, another program
> write random
> data in the middle of the file. But it's unclear how to trigger page fault
> manually.
> As I understand, kernel provides "vm_page_set_invalid" to mark it invalid.
> Is there
> any way to invalidate a page of backed object from userland?
> 
> I will appreciate any advice on how to troubleshoot this issue.
> 
> The full output of ddb is here: https://people.freebsd.org/~mizhka/ddb.txt
> Thanks!
> _______________________________________________
> freebsd-current_at_freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe_at_freebsd.org"