Trivial repro:

dd if=/dev/zero of=blah & tail -F blah
^C
load: 0.21  cmd: tail 72381 [prev->lr_read_cv] 2.17r 0.00u 0.01s 0% 2600k
#0 0xffffffff80bce615 at mi_switch+0x155
#1 0xffffffff80c1cfea at sleepq_switch+0x11a
#2 0xffffffff80b57f0a at _cv_wait+0x15a
#3 0xffffffff829ddab6 at rangelock_enter+0x306
#4 0xffffffff829ecd3f at zfs_freebsd_getpages+0x14f
#5 0xffffffff810e3ab9 at VOP_GETPAGES_APV+0x59
#6 0xffffffff80f349e7 at vnode_pager_getpages+0x37
#7 0xffffffff80f2a93f at vm_pager_get_pages+0x4f
#8 0xffffffff80f054b0 at vm_fault+0x780
#9 0xffffffff80f04bde at vm_fault_trap+0x6e
#10 0xffffffff8106544e at trap_pfault+0x1ee
#11 0xffffffff81064a9c at trap+0x44c
#12 0xffffffff8103a978 at calltrap+0x8

On 5/12/2020 3:13 PM, Bryan Drewery wrote:
>> panic: deadlres_td_sleep_q: possible deadlock detected for 0xfffffe25eefa2e00 (find), blocked for 1802392 ticks
>
> The 2 stuck processes from procstat -kk before the panic:
>> 72559 101698 tail - mi_switch+0x155 sleepq_switch+0x11a _cv_wait+0x15a rangelock_enter+0x306 zfs_freebsd_getpages+0x14f VOP_GETPAGES_APV+0x59 vnode_pager_getpages+0x37 vm_pager_get_pages+0x4f vm_fault+0x780 vm_fault_trap+0x6e trap_pfault+0x1ee
>> 72985 107378 find - mi_switch+0x155 sleepq_switch+0x11a sleeplk+0x106 lockmgr_slock_hard+0x1f5 VOP_LOCK1_APV+0x40 _vn_lock+0x54 lookup+0xdd namei+0x524 vn_open_cred+0x32b kern_openat+0x1fa filemon_wrapper_openat+0x15 amd64_syscall+0x73d
>
> The only find running was thread 107378.
>
> I couldn't record much from ddb, but I did get the locked vnodes.
>> >> db> show lockedvnods >> Locked vnodes >> vnode 0xfffff804de66e500: type VDIR >> usecount 3, writecount 0, refcount 2 mountedhere 0 >> flags () >> v_object 0xfffff809459cb420 ref 0 pages 0 cleanbuf 0 dirtybuf 0 >> lock type zfs: SHARED (count 1) >> #0 0xffffffff80b94a0f at lockmgr_slock+0xdf >> #1 0xffffffff810e2a40 at VOP_LOCK1_APV+0x40 >> #2 0xffffffff80cb14f4 at _vn_lock+0x54 >> #3 0xffffffff80c9b3ec at vget_finish+0x6c >> #4 0xffffffff80c8051c at cache_lookup+0x57c >> #5 0xffffffff80c84dad at vfs_cache_lookup+0x7d >> #6 0xffffffff810df996 at VOP_LOOKUP_APV+0x56 >> #7 0xffffffff80c8ee61 at lookup+0x601 >> #8 0xffffffff80c8e374 at namei+0x524 >> #9 0xffffffff80caa83f at kern_statat+0x7f >> #10 0xffffffff80caafff at sys_fstatat+0x2f >> #11 0xffffffff81065c40 at amd64_syscall+0x140 >> #12 0xffffffff8103b2a0 at fast_syscall_common+0x101 >> vnode 0xfffff808a08f0a00: type VDIR >> usecount 6, writecount 0, refcount 2 mountedhere 0 >> flags () >> v_object 0xfffff801eb930000 ref 0 pages 0 cleanbuf 0 dirtybuf 0 >> lock type zfs: EXCL by thread 0xfffffe24aadb6100 (pid 72267, gmake, tid 104356) >> with shared waiters pending >> #0 0xffffffff80b94a0f at lockmgr_slock+0xdf >> #1 0xffffffff810e2a40 at VOP_LOCK1_APV+0x40 >> #2 0xffffffff80cb14f4 at _vn_lock+0x54 >> #3 0xffffffff80c8e93d at lookup+0xdd >> #4 0xffffffff80c8e374 at namei+0x524 >> #5 0xffffffff80ca9e69 at kern_funlinkat+0xa9 >> #6 0xffffffff80ca9db8 at sys_unlink+0x28 >> #7 0xffffffff82780586 at filemon_wrapper_unlink+0x16 >> #8 0xffffffff8106623d at amd64_syscall+0x73d >> #9 0xffffffff8103b2a0 at fast_syscall_common+0x101 >> >> vnode 0xfffff80571f29500: type VREG >> usecount 6, writecount 1, refcount 2 >> flags () >> v_object 0xfffff806cb637c60 ref 2 pages 1 cleanbuf 0 dirtybuf 0 >> lock type zfs: SHARED (count 2) >> with exclusive waiters pending >> #0 0xffffffff80b94a0f at lockmgr_slock+0xdf >> #1 0xffffffff810e2a40 at VOP_LOCK1_APV+0x40 >> #2 0xffffffff80cb14f4 at _vn_lock+0x54 >> #3 0xffffffff8243af40 at 
zfs_lookup+0x610 >> #4 0xffffffff8243b61e at zfs_freebsd_cachedlookup+0x8e >> #5 0xffffffff810dfb46 at VOP_CACHEDLOOKUP_APV+0x56 >> #6 0xffffffff80c84dd8 at vfs_cache_lookup+0xa8 >> #7 0xffffffff810df996 at VOP_LOOKUP_APV+0x56 >> #8 0xffffffff80c8ee61 at lookup+0x601 >> #9 0xffffffff80c8e374 at namei+0x524 >> #10 0xffffffff80caa83f at kern_statat+0x7f >> #11 0xffffffff80caafff at sys_fstatat+0x2f >> #12 0xffffffff8106623d at amd64_syscall+0x73d >> #13 0xffffffff8103b2a0 at fast_syscall_common+0x101 > > It's nice how recent threads are at the top in gdb... >> (kgdb) info threads >> Id Target Id Frame >> 1 Thread 107952 (PID=79390: zfs) sched_switch (td=0xfffffe26ebb36000, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2147 >> 2 Thread 102764 (PID=73218: zfs) sched_switch (td=0xfffffe2490a12300, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2147 >> 3 Thread 107378 (PID=72985: find) sched_switch (td=0xfffffe25eefa2e00, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2147 >> 4 Thread 103940 (PID=72980: rm) sched_switch (td=0xfffffe2451932500, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2147 >> 5 Thread 101698 (PID=72559: tail) sched_switch (td=0xfffffe255eac0000, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2147 >> 6 Thread 103660 (PID=72280: timestamp) sched_switch (td=0xfffffe25f948aa00, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2147 >> 7 Thread 101249 (PID=72280: timestamp/prefix_stdout) sched_switch (td=0xfffffe264412a100, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2147 >> 8 Thread 101255 (PID=72280: timestamp/prefix_stderr) sched_switch (td=0xfffffe25c8e9bc00, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2147 >> 9 Thread 104356 (PID=72267: gmake) sched_switch (td=0xfffffe24aadb6100, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2147 >> 10 Thread 108476 (PID=66957: vim) sched_switch (td=0xfffffe26c8601500, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2147 > 
> The 2 threads holding shared lock on vnode 0xfffff80571f29500: > > The tail thread (101698) is waiting for a zfs rangelock getting pages > for vnode 0xfffff80571f29500 > >> (kgdb) thread 5 >> [Switching to thread 5 (Thread 101698)] >> #0 sched_switch (td=0xfffffe255eac0000, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2147 >> 2147 cpuid = td->td_oncpu = PCPU_GET(cpuid); >> (kgdb) backtrace >> #0 sched_switch (td=0xfffffe255eac0000, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2147 >> #1 0xffffffff80bce615 in mi_switch (flags=260) at /usr/src/sys/kern/kern_synch.c:542 >> #2 0xffffffff80c1cfea in sleepq_switch (wchan=0xfffff810fb57dd48, pri=0) at /usr/src/sys/kern/subr_sleepqueue.c:625 >> #3 0xffffffff80b57f0a in _cv_wait (cvp=0xfffff810fb57dd48, lock=0xfffff80049a99040) at /usr/src/sys/kern/kern_condvar.c:146 >> #4 0xffffffff82434ab6 in rangelock_enter_reader (rl=0xfffff80049a99018, new=0xfffff8022cadb100) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_rlock.c:429 >> #5 rangelock_enter (rl=0xfffff80049a99018, off=<optimized out>, len=<optimized out>, type=<optimized out>) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_rlock.c:477 >> #6 0xffffffff82443d3f in zfs_getpages (vp=<optimized out>, ma=0xfffffe259f204b18, count=<optimized out>, rbehind=0xfffffe259f204ac4, rahead=0xfffffe259f204ad0) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4500 >> #7 zfs_freebsd_getpages (ap=<optimized out>) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4567 >> #8 0xffffffff810e3ab9 in VOP_GETPAGES_APV (vop=0xffffffff8250a1e0 <zfs_vnodeops>, a=0xfffffe259f2049f0) at vnode_if.c:2644 >> #9 0xffffffff80f349e7 in VOP_GETPAGES (vp=<unavailable>, m=<optimized out>, count=<unavailable>, rbehind=<unavailable>, rahead=<unavailable>) at ./vnode_if.h:1171 >> #10 vnode_pager_getpages (object=<optimized out>, m=<optimized out>, count=<unavailable>, rbehind=<unavailable>, rahead=<unavailable>) at 
/usr/src/sys/vm/vnode_pager.c:743 >> #11 0xffffffff80f2a93f in vm_pager_get_pages (object=0xfffff806cb637c60, m=0xfffffe259f204b18, count=1, rbehind=<unavailable>, rahead=<unavailable>) at /usr/src/sys/vm/vm_pager.c:305 >> #12 0xffffffff80f054b0 in vm_fault_getpages (fs=<optimized out>, nera=0, behindp=<optimized out>, aheadp=<optimized out>) at /usr/src/sys/vm/vm_fault.c:1163 >> #13 vm_fault (map=<optimized out>, vaddr=<optimized out>, fault_type=<optimized out>, fault_flags=<optimized out>, m_hold=<optimized out>) at /usr/src/sys/vm/vm_fault.c:1394 >> #14 0xffffffff80f04bde in vm_fault_trap (map=0xfffffe25653949e8, vaddr=<optimized out>, fault_type=<optimized out>, fault_flags=0, signo=0xfffffe259f204d04, ucode=0xfffffe259f204d00) at /usr/src/sys/vm/vm_fault.c:589 >> #15 0xffffffff8106544e in trap_pfault (frame=0xfffffe259f204d40, usermode=<optimized out>, signo=<optimized out>, ucode=<unavailable>) at /usr/src/sys/amd64/amd64/trap.c:821 >> #16 0xffffffff81064a9c in trap (frame=0xfffffe259f204d40) at /usr/src/sys/amd64/amd64/trap.c:340 >> #17 <signal handler called> >> #18 0x00000000002034fc in ?? 
() >> (kgdb) frame 11 >> #11 0xffffffff80f2a93f in vm_pager_get_pages (object=0xfffff806cb637c60, m=0xfffffe259f204b18, count=1, rbehind=<unavailable>, rahead=<unavailable>) at /usr/src/sys/vm/vm_pager.c:305 >> 305 r = (*pagertab[object->type]->pgo_getpages)(object, m, count, rbehind, >> (kgdb) p *object >> $10 = {lock = {lock_object = {lo_name = 0xffffffff8114fa30 "vm object", lo_flags = 627245056, lo_data = 0, lo_witness = 0x0}, rw_lock = 1}, object_list = {tqe_next = 0xfffff806cb637d68, tqe_prev = 0xfffff806cb637b78}, shadow_head = {lh_first = 0x0}, shadow_list = {le_next = 0xffffffffffffffff, >> le_prev = 0xffffffffffffffff}, memq = {tqh_first = 0xfffffe001cbca850, tqh_last = 0xfffffe001cbca860}, rtree = {rt_root = 18446741875168421969}, size = 1099, domain = {dr_policy = 0x0, dr_iter = 0}, generation = 1, cleangeneration = 1, ref_count = 2, shadow_count = 0, memattr = 6 '\006', type = 2 '\002', >> flags = 4096, pg_color = 0, paging_in_progress = {__count = 2}, busy = {__count = 0}, resident_page_count = 1, backing_object = 0x0, backing_object_offset = 0, pager_object_list = {tqe_next = 0x0, tqe_prev = 0x0}, rvq = {lh_first = 0x0}, handle = 0xfffff80571f29500, un_pager = {vnp = {vnp_size = 4499568, >> writemappings = 0}, devp = {devp_pglist = {tqh_first = 0x44a870, tqh_last = 0x0}, ops = 0x0, dev = 0x0}, sgp = {sgp_pglist = {tqh_first = 0x44a870, tqh_last = 0x0}}, swp = {swp_tmpfs = 0x44a870, swp_blks = {pt_root = 0}, writemappings = 0}}, cred = 0x0, charge = 0, umtx_data = 0x0} >> (kgdb) p object->handle >> $11 = (void *) 0xfffff80571f29500 > >> (kgdb) p *(struct vnode *) 0xfffff80571f29500 >> $18 = {v_type = VREG, v_irflag = 0, v_op = 0xffffffff8250a1e0 <zfs_vnodeops>, v_data = 0xfffff80049a99000, v_mount = 0xfffffe247e5f3700, v_nmntvnodes = {tqe_next = 0xfffff8086eb38a00, tqe_prev = 0xfffff80461c2d7a0}, {v_mountedhere = 0x0, v_unpcb = 0x0, v_rdev = 0x0, v_fifoinfo = 0x0}, v_hashlist = {le_next = 0x0, >> le_prev = 0x0}, v_cache_src = {lh_first = 0x0}, 
v_cache_dst = {tqh_first = 0x0, tqh_last = 0xfffff80571f29550}, v_cache_dd = 0x0, v_lock = {lock_object = {lo_name = 0xffffffff82486a37 "zfs", lo_flags = 117112832, lo_data = 0, lo_witness = 0x0}, lk_lock = 37, lk_exslpfail = 0, lk_timo = 51, lk_pri = 96, >> lk_stack = {depth = 14, pcs = {18446744071574211087, 18446744071579773504, 18446744071575377140, 18446744071600058176, 18446744071600059934, 18446744071579761478, 18446744071575195096, 18446744071579761046, 18446744071575236193, 18446744071575233396, 18446744071575349311, 18446744071575351295, >> 18446744071579263549, 18446744071579087520, 0, 0, 0, 0}}}, v_interlock = {lock_object = {lo_name = 0xffffffff8123c142 "vnode interlock", lo_flags = 16973824, lo_data = 0, lo_witness = 0xfffff8123fd73600}, mtx_lock = 0}, v_vnlock = 0xfffff80571f29568, v_vnodelist = {tqe_next = 0xfffff8064bd0dc80, >> tqe_prev = 0xfffff80e250788d8}, v_lazylist = {tqe_next = 0x0, tqe_prev = 0x0}, v_bufobj = {bo_lock = {lock_object = {lo_name = 0xffffffff811fb7ab "bufobj interlock", lo_flags = 86179840, lo_data = 0, lo_witness = 0xfffff8123fd7dd80}, rw_lock = 1}, bo_ops = 0xffffffff8191ead0 <buf_ops_bio>, >> bo_object = 0xfffff806cb637c60, bo_synclist = {le_next = 0x0, le_prev = 0x0}, bo_private = 0xfffff80571f29500, bo_clean = {bv_hd = {tqh_first = 0x0, tqh_last = 0xfffff80571f296c0}, bv_root = {pt_root = 0}, bv_cnt = 0}, bo_dirty = {bv_hd = {tqh_first = 0x0, tqh_last = 0xfffff80571f296e0}, bv_root = {pt_root = 0}, >> bv_cnt = 0}, bo_numoutput = 0, bo_flag = 0, bo_domain = 5, bo_bsize = 131072}, v_pollinfo = 0x0, v_label = 0x0, v_lockf = 0x0, v_rl = {rl_waiters = {tqh_first = 0xfffff80f2cc12708, tqh_last = 0xfffff80f2cc12708}, rl_currdep = 0x0}, v_cstart = 0, v_lasta = 0, v_lastw = 0, v_clen = 0, v_holdcnt = 2, >> v_usecount = 6, v_iflag = 0, v_vflag = 0, v_mflag = 0, v_dbatchcpu = -1, v_writecount = 1, v_hash = 45676874} > > Is that thread busying the vm object? 
>
>
> Thread 101255 (timestamp/prefix_stderr) is also acting on vnode
> 0xfffff80571f29500, the same vnode the tail thread (101698) was.
>
>> (kgdb) thread
>> [Current thread is 8 (Thread 101255)]
>> (kgdb) backtrace
>> #0 sched_switch (td=0xfffffe25c8e9bc00, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2147
>> #1 0xffffffff80bce615 in mi_switch (flags=260) at /usr/src/sys/kern/kern_synch.c:542
>> #2 0xffffffff80c1cfea in sleepq_switch (wchan=0xfffffe001cbca850, pri=84) at /usr/src/sys/kern/subr_sleepqueue.c:625
>> #3 0xffffffff80f1de50 in _vm_page_busy_sleep (obj=<optimized out>, m=0xfffffe001cbca850, pindex=<optimized out>, wmesg=<optimized out>, allocflags=21504, locked=false) at /usr/src/sys/vm/vm_page.c:1094
>> #4 0xffffffff80f241f7 in vm_page_grab_sleep (object=0xfffff806cb637c60, m=<optimized out>, pindex=<optimized out>, wmesg=<optimized out>, allocflags=21504, locked=<error reading variable: Cannot access memory at address 0x0>) at /usr/src/sys/vm/vm_page.c:4326
>> #5 vm_page_acquire_unlocked (object=0xfffff806cb637c60, pindex=1098, prev=<optimized out>, mp=0xfffffe2717fc6730, allocflags=21504) at /usr/src/sys/vm/vm_page.c:4469
>> #6 0xffffffff80f24c61 in vm_page_grab_valid_unlocked (mp=0xfffffe2717fc6730, object=0xfffff806cb637c60, pindex=1098, allocflags=21504) at /usr/src/sys/vm/vm_page.c:4645
>> #7 0xffffffff82440246 in page_busy (vp=0xfffff80571f29500, start=4497408, off=<optimized out>, nbytes=<optimized out>) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:414
>> #8 update_pages (vp=0xfffff80571f29500, start=4497408, len=32, os=0xfffff8096a277400, oid=2209520, segflg=<optimized out>, tx=<optimized out>) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:482
>> #9 zfs_write (vp=<optimized out>, uio=<optimized out>, ioflag=0, cr=<optimized out>, ct=<optimized out>) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1071
>> #10 zfs_freebsd_write (ap=<optimized out>) at
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4838 >> #11 0xffffffff810e0eaf in VOP_WRITE_APV (vop=0xffffffff8250a1e0 <zfs_vnodeops>, a=0xfffffe2717fc68c8) at vnode_if.c:925 >> #12 0xffffffff80cb574c in VOP_WRITE (vp=0xfffff80571f29500, uio=0xfffffe2717fc6bb0, ioflag=8323073, cred=<optimized out>) at ./vnode_if.h:413 >> #13 vn_write (fp=0xfffff8048195e8c0, uio=<optimized out>, active_cred=<optimized out>, flags=<optimized out>, td=<optimized out>) at /usr/src/sys/kern/vfs_vnops.c:894 >> #14 0xffffffff80cb50c3 in vn_io_fault_doio (args=0xfffffe2717fc6af0, uio=0xfffffe2717fc6bb0, td=0xfffffe25c8e9bc00) at /usr/src/sys/kern/vfs_vnops.c:959 >> #15 0xffffffff80cb1c8c in vn_io_fault1 (vp=<optimized out>, uio=0xfffffe2717fc6bb0, args=0xfffffe2717fc6af0, td=0xfffffe25c8e9bc00) at /usr/src/sys/kern/vfs_vnops.c:1077 >> #16 0xffffffff80cafa32 in vn_io_fault (fp=0xfffff8048195e8c0, uio=0xfffffe2717fc6bb0, active_cred=0xfffff80f2cc12708, flags=0, td=<optimized out>) at /usr/src/sys/kern/vfs_vnops.c:1181 >> #17 0xffffffff80c34331 in fo_write (fp=0xfffff8048195e8c0, uio=0xfffffe2717fc6bb0, active_cred=<unavailable>, flags=<unavailable>, td=0xfffffe25c8e9bc00) at /usr/src/sys/sys/file.h:326 >> #18 dofilewrite (td=0xfffffe25c8e9bc00, fd=2, fp=0xfffff8048195e8c0, auio=0xfffffe2717fc6bb0, offset=<optimized out>, flags=<optimized out>) at /usr/src/sys/kern/sys_generic.c:564 >> #19 0xffffffff80c33eb0 in kern_writev (td=0xfffffe25c8e9bc00, fd=2, auio=<optimized out>) at /usr/src/sys/kern/sys_generic.c:491 >> #20 sys_write (td=0xfffffe25c8e9bc00, uap=<optimized out>) at /usr/src/sys/kern/sys_generic.c:406 >> #21 0xffffffff8106623d in syscallenter (td=<optimized out>) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:150 >> #22 amd64_syscall (td=0xfffffe25c8e9bc00, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1161 >> #23 <signal handler called> >> #24 0x000000080043d53a in ?? () > > Maybe r358443 is related? 
> > >> (kgdb) frame 4 >> #4 0xffffffff80f241f7 in vm_page_grab_sleep (object=0xfffff806cb637c60, m=<optimized out>, pindex=<optimized out>, wmesg=<optimized out>, allocflags=21504, locked=<error reading variable: Cannot access memory at address 0x0>) at /usr/src/sys/vm/vm_page.c:4326 >> 4326 if (_vm_page_busy_sleep(object, m, m->pindex, wmesg, allocflags, >> (kgdb) p *object >> $8 = {lock = {lock_object = {lo_name = 0xffffffff8114fa30 "vm object", lo_flags = 627245056, lo_data = 0, lo_witness = 0x0}, rw_lock = 1}, object_list = {tqe_next = 0xfffff806cb637d68, tqe_prev = 0xfffff806cb637b78}, shadow_head = {lh_first = 0x0}, shadow_list = {le_next = 0xffffffffffffffff, >> le_prev = 0xffffffffffffffff}, memq = {tqh_first = 0xfffffe001cbca850, tqh_last = 0xfffffe001cbca860}, rtree = {rt_root = 18446741875168421969}, size = 1099, domain = {dr_policy = 0x0, dr_iter = 0}, generation = 1, cleangeneration = 1, ref_count = 2, shadow_count = 0, memattr = 6 '\006', type = 2 '\002', >> flags = 4096, pg_color = 0, paging_in_progress = {__count = 2}, busy = {__count = 0}, resident_page_count = 1, backing_object = 0x0, backing_object_offset = 0, pager_object_list = {tqe_next = 0x0, tqe_prev = 0x0}, rvq = {lh_first = 0x0}, handle = 0xfffff80571f29500, un_pager = {vnp = {vnp_size = 4499568, >> writemappings = 0}, devp = {devp_pglist = {tqh_first = 0x44a870, tqh_last = 0x0}, ops = 0x0, dev = 0x0}, sgp = {sgp_pglist = {tqh_first = 0x44a870, tqh_last = 0x0}}, swp = {swp_tmpfs = 0x44a870, swp_blks = {pt_root = 0}, writemappings = 0}}, cred = 0x0, charge = 0, umtx_data = 0x0} >> (kgdb) frame 5 >> #5 vm_page_acquire_unlocked (object=0xfffff806cb637c60, pindex=1098, prev=<optimized out>, mp=0xfffffe2717fc6730, allocflags=21504) at /usr/src/sys/vm/vm_page.c:4469 >> 4469 if (!vm_page_grab_sleep(object, m, pindex, "pgnslp", >> (kgdb) p *m >> $9 = {plinks = {q = {tqe_next = 0xffffffffffffffff, tqe_prev = 0xffffffffffffffff}, s = {ss = {sle_next = 0xffffffffffffffff}}, memguard = {p = 
18446744073709551615, v = 18446744073709551615}, uma = {slab = 0xffffffffffffffff, zone = 0xffffffffffffffff}}, listq = {tqe_next = 0x0, tqe_prev = 0xfffff806cb637ca8},
>> object = 0xfffff806cb637c60, pindex = 1098, phys_addr = 18988408832, md = {pv_list = {tqh_first = 0x0, tqh_last = 0xfffffe001cbca888}, pv_gen = 44682, pat_mode = 6}, ref_count = 2147483648, busy_lock = 1588330502, a = {{flags = 0, queue = 255 '\377', act_count = 0 '\000'}, _bits = 16711680}, order = 13 '\r',
>> pool = 0 '\000', flags = 1 '\001', oflags = 0 '\000', psind = 0 '\000', segind = 6 '\006', valid = 0 '\000', dirty = 0 '\000'}
>
> I'm pretty sure this thread is holding the rangelock from zfs_write() that
> tail is waiting on. So what exactly is this thread (101255) waiting on? I'm
> not sure how to track down what is using vm object 0xfffff806cb637c60. If
> the tail thread busied the page, then the two threads are waiting on each
> other. If that's true, then r358443, which removed the write lock on the
> object in update_pages(), could be the problem.
>
> I'm not sure the rest is interesting. I think the remaining threads are
> just waiting on the locked vnode, but I include them here in case I missed
> something.
> > thread 101249 (timestamp/prefix_stdout) is also acting on vnode > 0xfffff80571f29500 > >> (kgdb) thread 7 >> [Switching to thread 7 (Thread 101249)] >> #0 sched_switch (td=0xfffffe264412a100, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2147 >> 2147 cpuid = td->td_oncpu = PCPU_GET(cpuid); >> (kgdb) backtrace >> #0 sched_switch (td=0xfffffe264412a100, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2147 >> #1 0xffffffff80bce615 in mi_switch (flags=260) at /usr/src/sys/kern/kern_synch.c:542 >> #2 0xffffffff80c1cfea in sleepq_switch (wchan=0xfffff8048195e8e2, pri=119) at /usr/src/sys/kern/subr_sleepqueue.c:625 >> #3 0xffffffff80bcdb6d in _sleep (ident=0xfffff8048195e8e2, lock=<optimized out>, priority=119, wmesg=0xffffffff8123c694 "vofflock", sbt=<optimized out>, pr=0, flags=256) at /usr/src/sys/kern/kern_synch.c:221 >> #4 0xffffffff80cb203a in foffset_lock (fp=0xfffff8048195e8c0, flags=<optimized out>) at /usr/src/sys/kern/vfs_vnops.c:700 >> #5 0xffffffff80caf909 in foffset_lock_uio (fp=<optimized out>, uio=<optimized out>, flags=<optimized out>) at /usr/src/sys/kern/vfs_vnops.c:748 >> #6 vn_io_fault (fp=0xfffff8048195e8c0, uio=0xfffffe2719d9cbb0, active_cred=0xfffff80786ecad00, flags=0, td=0xfffffe264412a100) at /usr/src/sys/kern/vfs_vnops.c:1163 >> #7 0xffffffff80c34331 in fo_write (fp=0xfffff8048195e8c0, uio=0xfffffe2719d9cbb0, active_cred=<unavailable>, flags=<unavailable>, td=0xfffffe264412a100) at /usr/src/sys/sys/file.h:326 >> #8 dofilewrite (td=0xfffffe264412a100, fd=1, fp=0xfffff8048195e8c0, auio=0xfffffe2719d9cbb0, offset=<optimized out>, flags=<optimized out>) at /usr/src/sys/kern/sys_generic.c:564 >> #9 0xffffffff80c33eb0 in kern_writev (td=0xfffffe264412a100, fd=1, auio=<optimized out>) at /usr/src/sys/kern/sys_generic.c:491 >> #10 sys_write (td=0xfffffe264412a100, uap=<optimized out>) at /usr/src/sys/kern/sys_generic.c:406 >> #11 0xffffffff8106623d in syscallenter (td=<optimized out>) at 
/usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:150 >> #12 amd64_syscall (td=0xfffffe264412a100, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1161 >> #13 <signal handler called> >> #14 0x000000080043d53a in ?? () >> Backtrace stopped: Cannot access memory at address 0x7fffdfffddd8 >> (kgdb) frame 6 >> #6 vn_io_fault (fp=0xfffff8048195e8c0, uio=0xfffffe2719d9cbb0, active_cred=0xfffff80786ecad00, flags=0, td=0xfffffe264412a100) at /usr/src/sys/kern/vfs_vnops.c:1163 >> 1163 foffset_lock_uio(fp, uio, flags); >> (kgdb) p *fp >> $22 = {f_data = 0xfffff80571f29500, f_ops = 0xffffffff81923a10 <vnops>, f_cred = 0xfffff80786ecad00, f_vnode = 0xfffff80571f29500, f_type = 1, f_vnread_flags = 3, f_flag = 2, f_count = 4, {f_seqcount = 127, f_pipegen = 127}, f_nextoff = 4499536, f_vnun = {fvn_cdevpriv = 0x0, fvn_advice = 0x0}, f_offset = 4499536, >> f_label = 0x0} > > thread 104356 (gmake) is just waiting on the lock for vnode > 0xfffff80571f29500 > It also is holding exclusive lock on directory vnode 0xfffff808a08f0a00 > >> (kgdb) thread 9 >> [Switching to thread 9 (Thread 104356)] >> #0 sched_switch (td=0xfffffe24aadb6100, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2147 >> 2147 cpuid = td->td_oncpu = PCPU_GET(cpuid); >> (kgdb) backtrace >> #0 sched_switch (td=0xfffffe24aadb6100, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2147 >> #1 0xffffffff80bce615 in mi_switch (flags=260) at /usr/src/sys/kern/kern_synch.c:542 >> #2 0xffffffff80c1cfea in sleepq_switch (wchan=0xfffff80571f29568, pri=96) at /usr/src/sys/kern/subr_sleepqueue.c:625 >> #3 0xffffffff80b954f6 in sleeplk (lk=0xfffff80571f29568, flags=532480, ilk=<optimized out>, wmesg=<optimized out>, pri=<optimized out>, timo=51, queue=0) at /usr/src/sys/kern/kern_lock.c:295 >> #4 0xffffffff80b93a1e in lockmgr_xlock_hard (lk=0xfffff80571f29568, flags=<unavailable>, ilk=0x0, file=<optimized out>, line=1432, lwa=0xfffff80571f29568) at /usr/src/sys/kern/kern_lock.c:841 >> #5 0xffffffff810e2a40 in 
VOP_LOCK1_APV (vop=0xffffffff8250a1e0 <zfs_vnodeops>, a=0xfffffe271833f4d8) at vnode_if.c:1989 >> #6 0xffffffff80cb14f4 in VOP_LOCK1 (vp=0xfffff80571f29500, flags=532480, file=0xffffffff82472ac9 "/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c", line=1432) at ./vnode_if.h:879 >> #7 _vn_lock (vp=0xfffff80571f29500, flags=532480, file=0xffffffff82472ac9 "/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c", line=1432) at /usr/src/sys/kern/vfs_vnops.c:1613 >> #8 0xffffffff8243af40 in zfs_lookup_lock (dvp=0xfffff808a08f0a00, vp=0xfffff80571f29500, name=0xfffffe271833f630 "copool-basic.sh.log", lkflags=532480) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1432 >> #9 zfs_lookup (dvp=0xfffff808a08f0a00, nm=0xfffffe271833f630 "copool-basic.sh.log", vpp=<optimized out>, cnp=0xfffffe271833faf0, nameiop=2, cr=<optimized out>, td=<optimized out>, flags=0, cached=1) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:1606 >> #10 0xffffffff8243b61e in zfs_freebsd_lookup (ap=0xfffffe271833f780, cached=<error reading variable: Cannot access memory at address 0x1>) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4900 >> #11 zfs_freebsd_cachedlookup (ap=0xfffffe271833f780) at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4908 >> #12 0xffffffff810dfb46 in VOP_CACHEDLOOKUP_APV (vop=0xffffffff8250a1e0 <zfs_vnodeops>, a=0xfffffe271833f780) at vnode_if.c:180 >> #13 0xffffffff80c84dd8 in VOP_CACHEDLOOKUP (dvp=0xfffff808a08f0a00, vpp=0xfffffe271833fac0, cnp=0xfffffe271833faf0) at ./vnode_if.h:80 >> #14 vfs_cache_lookup (ap=<optimized out>) at /usr/src/sys/kern/vfs_cache.c:2149 >> #15 0xffffffff810df996 in VOP_LOOKUP_APV (vop=0xffffffff8250a1e0 <zfs_vnodeops>, a=0xfffffe271833f820) at vnode_if.c:117 >> #16 0xffffffff80c8ee61 in VOP_LOOKUP (dvp=0xfffff808a08f0a00, vpp=0xfffffe271833fac0, cnp=0xfffffe271833faf0) at ./vnode_if.h:54 >> #17 lookup (ndp=0xfffffe271833fa60) 
at /usr/src/sys/kern/vfs_lookup.c:951 >> #18 0xffffffff80c8e374 in namei (ndp=0xfffffe271833fa60) at /usr/src/sys/kern/vfs_lookup.c:512 >> #19 0xffffffff80ca9e69 in kern_funlinkat (td=0xfffffe24aadb6100, dfd=-100, path=0x800a3982e <error: Cannot access memory at address 0x800a3982e>, fd=<optimized out>, pathseg=UIO_USERSPACE, flag=<optimized out>, oldinum=0) at /usr/src/sys/kern/vfs_syscalls.c:1819 >> #20 0xffffffff80ca9db8 in sys_unlink (td=<unavailable>, uap=<optimized out>) at /usr/src/sys/kern/vfs_syscalls.c:1747 >> #21 0xffffffff82780586 in filemon_wrapper_unlink (td=<unavailable>, uap=0xfffffe24aadb64d8) at /usr/src/sys/dev/filemon/filemon_wrapper.c:350 >> #22 0xffffffff8106623d in syscallenter (td=<optimized out>) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:150 >> #23 amd64_syscall (td=0xfffffe24aadb6100, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1161 > > thread 108476 (vim) is waiting to lock the directory vnode > 0xfffff808a08f0a00 > >> (kgdb) thread 10 >> [Switching to thread 10 (Thread 108476)] >> #0 sched_switch (td=0xfffffe26c8601500, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2147 >> 2147 cpuid = td->td_oncpu = PCPU_GET(cpuid); >> (kgdb) backtrace >> #0 sched_switch (td=0xfffffe26c8601500, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2147 >> #1 0xffffffff80bce615 in mi_switch (flags=260) at /usr/src/sys/kern/kern_synch.c:542 >> #2 0xffffffff80c1cfea in sleepq_switch (wchan=0xfffff808a08f0a68, pri=96) at /usr/src/sys/kern/subr_sleepqueue.c:625 >> #3 0xffffffff80b954f6 in sleeplk (lk=0xfffff808a08f0a68, flags=2105344, ilk=<optimized out>, wmesg=<optimized out>, pri=<optimized out>, timo=51, queue=1) at /usr/src/sys/kern/kern_lock.c:295 >> #4 0xffffffff80b93525 in lockmgr_slock_hard (lk=0xfffff808a08f0a68, flags=2105344, ilk=<optimized out>, file=0xffffffff811fb967 "/usr/src/sys/kern/vfs_subr.c", line=2930, lwa=<optimized out>) at /usr/src/sys/kern/kern_lock.c:649 >> #5 0xffffffff810e2a40 in VOP_LOCK1_APV 
(vop=0xffffffff8250a1e0 <zfs_vnodeops>, a=0xfffffe271d46d6b8) at vnode_if.c:1989 >> #6 0xffffffff80cb14f4 in VOP_LOCK1 (vp=0xfffff808a08f0a00, flags=2105344, file=0xffffffff811fb967 "/usr/src/sys/kern/vfs_subr.c", line=2930) at ./vnode_if.h:879 >> #7 _vn_lock (vp=0xfffff808a08f0a00, flags=2105344, file=0xffffffff811fb967 "/usr/src/sys/kern/vfs_subr.c", line=2930) at /usr/src/sys/kern/vfs_vnops.c:1613 >> #8 0xffffffff80c9b3ec in vget_finish (vp=0xfffff808a08f0a00, flags=2105344, vs=VGET_USECOUNT) at /usr/src/sys/kern/vfs_subr.c:2930 >> #9 0xffffffff80c8051c in cache_lookup (dvp=<optimized out>, vpp=<optimized out>, cnp=<optimized out>, tsp=<optimized out>, ticksp=<optimized out>) at /usr/src/sys/kern/vfs_cache.c:1407 >> #10 0xffffffff80c84dad in vfs_cache_lookup (ap=<optimized out>) at /usr/src/sys/kern/vfs_cache.c:2147 >> #11 0xffffffff810df996 in VOP_LOOKUP_APV (vop=0xffffffff8250a1e0 <zfs_vnodeops>, a=0xfffffe271d46d8a0) at vnode_if.c:117 >> #12 0xffffffff80c8ee61 in VOP_LOOKUP (dvp=0xfffff804de66e500, vpp=0xfffffe271d46da60, cnp=0xfffffe271d46da90) at ./vnode_if.h:54 >> #13 lookup (ndp=0xfffffe271d46da00) at /usr/src/sys/kern/vfs_lookup.c:951 >> #14 0xffffffff80c8e374 in namei (ndp=0xfffffe271d46da00) at /usr/src/sys/kern/vfs_lookup.c:512 >> #15 0xffffffff80caa83f in kern_statat (td=0xfffffe26c8601500, flag=<optimized out>, fd=<optimized out>, path=0x8049d12c0 <error: Cannot access memory at address 0x8049d12c0>, pathseg=UIO_USERSPACE, sbp=0xfffffe271d46db28, hook=0x0) at /usr/src/sys/kern/vfs_syscalls.c:2340 >> #16 0xffffffff80caafff in sys_fstatat (td=<unavailable>, uap=0xfffffe26c86018d8) at /usr/src/sys/kern/vfs_syscalls.c:2317 >> #17 0xffffffff81065c40 in syscallenter (td=<optimized out>) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:162 >> #18 amd64_syscall (td=0xfffffe26c8601500, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1161 >> #19 <signal handler called> >> #20 0x00000008020ba75a in ?? 
() > > Lastly the find thread (107378) is waiting to lock the same directory > vnode 0xfffff808a08f0a00 > >> (kgdb) thread 3 >> [Switching to thread 3 (Thread 107378)] >> #0 sched_switch (td=0xfffffe25eefa2e00, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2147 >> 2147 cpuid = td->td_oncpu = PCPU_GET(cpuid); >> (kgdb) backtrace >> #0 sched_switch (td=0xfffffe25eefa2e00, flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2147 >> #1 0xffffffff80bce615 in mi_switch (flags=260) at /usr/src/sys/kern/kern_synch.c:542 >> #2 0xffffffff80c1cfea in sleepq_switch (wchan=0xfffff808a08f0a68, pri=96) at /usr/src/sys/kern/subr_sleepqueue.c:625 >> #3 0xffffffff80b954f6 in sleeplk (lk=0xfffff808a08f0a68, flags=2106368, ilk=<optimized out>, wmesg=<optimized out>, pri=<optimized out>, timo=51, queue=1) at /usr/src/sys/kern/kern_lock.c:295 >> #4 0xffffffff80b93525 in lockmgr_slock_hard (lk=0xfffff808a08f0a68, flags=2106368, ilk=<optimized out>, file=0xffffffff811f0ff4 "/usr/src/sys/kern/vfs_lookup.c", line=737, lwa=<optimized out>) at /usr/src/sys/kern/kern_lock.c:649 >> #5 0xffffffff810e2a40 in VOP_LOCK1_APV (vop=0xffffffff8250a1e0 <zfs_vnodeops>, a=0xfffffe271bee5748) at vnode_if.c:1989 >> #6 0xffffffff80cb14f4 in VOP_LOCK1 (vp=0xfffff808a08f0a00, flags=2106368, file=0xffffffff811f0ff4 "/usr/src/sys/kern/vfs_lookup.c", line=737) at ./vnode_if.h:879 >> #7 _vn_lock (vp=0xfffff808a08f0a00, flags=2106368, file=0xffffffff811f0ff4 "/usr/src/sys/kern/vfs_lookup.c", line=737) at /usr/src/sys/kern/vfs_vnops.c:1613 >> #8 0xffffffff80c8e93d in lookup (ndp=0xfffffe271bee5a88) at /usr/src/sys/kern/vfs_lookup.c:735 >> #9 0xffffffff80c8e374 in namei (ndp=0xfffffe271bee5a88) at /usr/src/sys/kern/vfs_lookup.c:512 >> #10 0xffffffff80cb0bdb in vn_open_cred (ndp=0xfffffe271bee5a88, flagp=0xfffffe271bee5bb4, cmode=0, vn_open_flags=0, cred=0xfffff80786ecad00, fp=0xfffff802a8627690) at /usr/src/sys/kern/vfs_vnops.c:288 >> #11 0xffffffff80ca8a8a in kern_openat (td=0xfffffe25eefa2e00, 
fd=<optimized out>, path=<optimized out>, pathseg=<optimized out>, flags=1048577, mode=<optimized out>) at /usr/src/sys/kern/vfs_syscalls.c:1083 >> #12 0xffffffff82780415 in filemon_wrapper_openat (td=0xfffffe25eefa2e00, uap=0xfffffe25eefa31d8) at /usr/src/sys/dev/filemon/filemon_wrapper.c:232 >> #13 0xffffffff8106623d in syscallenter (td=<optimized out>) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:150 >> #14 amd64_syscall (td=0xfffffe25eefa2e00, traced=0) at /usr/src/sys/amd64/amd64/trap.c:1161 > > -- Regards, Bryan Drewery
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:24 UTC