Re: mtx_lock() of destroyed mutex on NFS

From: Rick Macklem <rmacklem_at_uoguelph.ca>
Date: Wed, 19 Oct 2011 12:00:49 -0400 (EDT)

Bjoern A. Zeeb wrote:
> Hi,
> 
> as a result of a make buildkernel && make installkernel && reboot all
> on NFS I got this with a HEAD SVN source at r226465. I cannot dump
> unfortunately and it seems I just killed the obj tree for this kernel
> though it should be very close.
> 
> Oct 18 10:03:22 lion3 reboot: rebooted by test
> Oct 18 10:03:22 panic: mtx_lock() of destroyed mutex @
> /zoo/bz/HEAD.svn/sys/kern/uipc_socket.c:1022
> cpuid = 2
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> kdb_backtrace() at kdb_backtrace+0x37
> panic() at panic+0x187
> _mtx_lock_flags() at _mtx_lock_flags+0x130
> sosend_dgram() at sosend_dgram+0xbb
> sosend() at sosend+0x82
> clnt_dg_call() at clnt_dg_call+0xb81
> clnt_call_private() at clnt_call_private+0xe8
> nlm_get_rpc() at nlm_get_rpc+0x187
> nlm_host_get_rpc() at nlm_host_get_rpc+0x130
> nlm_clearlock() at nlm_clearlock+0x10a
> nlm_advlock_internal() at nlm_advlock_internal+0x64f
> nlm_advlock() at nlm_advlock+0x2a
> nfs_advlock() at nfs_advlock+0x122
> VOP_ADVLOCK_APV() at VOP_ADVLOCK_APV+0xb7
> vn_closefile() at vn_closefile+0xe8
> _fdrop() at _fdrop+0x23
> closef() at closef+0x5c
> fdfree() at fdfree+0x1b4
> exit1() at exit1+0x31a
> sigexit() at sigexit+0x8f
> cursig() at cursig
> ast() at ast+0x1a9
> doreti_ast() at doreti_ast+0x1f
> KDB: enter: panic
> [ thread pid 1652 tid 100106 ]
> Stopped at kdb_enter+0x3b: movq $0,0x80feb2(%rip)
> db> show reg
> cs 0x20
> ds 0x3b
> es 0x3b003b
> fs 0x1b0013
> gs 0x1b
> ss 0x28
> rax 0x12
> rcx 0xfffffe001a001000
> rdx 0
> rbx 0xffffffff80a2bfa8 __func__.3464+0x111
> rsp 0xffffff85cc173780
> rbp 0xffffff85cc1737a0
> rsi 0x80
> rdi 0xffffff85cc173600
> r8 0xffffffff80a2a498 __func__.6043+0x328
> r9 0xffffff85cc1736b0
> r10 0xfffffe001a001000
> r11 0x1
> r12 0x1
> r13 0xfffffe001a001000
> r14 0x3fe
> r15 0xffffffff80a39948 __func__.7715+0x2d3
> rip 0xffffffff80646c2b kdb_enter+0x3b
> rflags 0x282
> kdb_enter+0x3b: movq $0,0x80feb2(%rip)
> 
This looks like it was caused by a premature soclose(), which in turn
implies that clnt_dg_destroy() called it too early. The only race I
can see is that the socket buffer lock is held while checking whether
so_upcall is set (the result decides whether a new cs_XXX structure is
needed), but the lock isn't held when the code decides to throw that
structure away and close the socket.

You could try the attached patch, which I've tested minimally.
(I think it fixes this race.)
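
To illustrate the kind of race described above, here is a minimal
userland sketch. It is not the attached patch and not the kernel code:
struct shared_sock, client_attach() and client_detach() are made-up
names, and a pthread mutex stands in for the socket buffer lock. The
only point is that the "is shared state already attached?" check and
the "drop the last reference and close" decision are made under the
same lock, so an attaching client can never see state that is about to
be torn down.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a socket shared by several RPC clients. */
struct shared_sock {
    pthread_mutex_t lock;  /* plays the role of the socket buffer lock */
    int *state;            /* shared per-socket state, NULL if not attached */
    int refs;              /* number of clients using the state */
    bool closed;           /* set once the last client has closed the socket */
};

static void shared_sock_init(struct shared_sock *s)
{
    pthread_mutex_init(&s->lock, NULL);
    s->state = NULL;
    s->refs = 0;
    s->closed = false;
}

/*
 * Attach a client. The check for existing state and the reference
 * count bump happen under one hold of the lock. Returns false if the
 * socket was already closed or the allocation failed.
 */
static bool client_attach(struct shared_sock *s)
{
    bool ok = false;

    pthread_mutex_lock(&s->lock);
    if (!s->closed) {
        if (s->state == NULL)
            s->state = calloc(1, sizeof(*s->state));
        if (s->state != NULL) {
            s->refs++;
            ok = true;
        }
    }
    pthread_mutex_unlock(&s->lock);
    return ok;
}

/*
 * Detach a client. The decision to discard the state and close the
 * socket is made under the same lock that client_attach() uses for
 * its check. Returns true if this call dropped the last reference.
 */
static bool client_detach(struct shared_sock *s)
{
    bool last = false;

    pthread_mutex_lock(&s->lock);
    assert(s->refs > 0);   /* detaching without attaching is a caller bug */
    if (--s->refs == 0) {
        free(s->state);
        s->state = NULL;
        s->closed = true;  /* late attachers now fail instead of racing */
        last = true;
    }
    pthread_mutex_unlock(&s->lock);
    return last;
}
```

If the teardown decision in client_detach() were made without holding
the lock, a concurrent client_attach() could observe state as still
attached, take a reference, and then use a socket that the other
thread closes underneath it.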

rick
> --
> Bjoern A. Zeeb You have to have visions!
> Stop bit received. Insert coin for new address family.
> 

Received on Wed Oct 19 2011 - 14:00:50 UTC
