Re: panic: mutex nfsd_mtx not owned at nfs_srvsock.c:148

From: Robert Watson <rwatson_at_FreeBSD.org> Date: Tue, 15 Aug 2006 19:03:42 +0100 (BST)

On Tue, 15 Aug 2006, Peter Holm wrote:

> On Tue, Aug 15, 2006 at 05:01:20PM +0100, Robert Watson wrote:
>>
>> On Tue, 15 Aug 2006, Peter Holm wrote:
>>
>>> While stress testing GENERIC HEAD from Aug 12 12:55 UTC I got this
>>> panic:
>>>
>>> panic: mutex nfsd_mtx not owned at
>>> ../../../nfsserver/nfs_srvsock.c:148
>>> cpuid = 2
>>> KDB: enter: panic
>>> [thread pid 761 tid 100096 ]
>>> Stopped at      kdb_enter+0x2b: nop
>>> db> where
>>> Tracing pid 761 tid 100096 td 0xc4041a20
>>> kdb_enter(c091cda8) at kdb_enter+0x2b
>>> panic(c091c0b7,c09210c9,c093241d,94,0,...) at panic+0x14b
>>> _mtx_assert(c0a64ec0,1,c093241d,94,c07ec53c,...) at _mtx_assert+0x66
>>> nfs_rephead(0,c52a0600,48,e662e964,e662e968,...) at nfs_rephead+0x25
>>> nfsrv_symlink(c52a0600,c4071e00,c4041a20,e662ec40) at
>>> nfsrv_symlink+0x3b7
>>> nfssvc_nfsd(c4041a20) at nfssvc_nfsd+0x409
>>> nfssvc(c4041a20,e662ed04) at nfssvc+0x18c
>>> syscall(3b,3b,3b,1,0,...) at syscall+0x256
>>>
>>> More details at http://people.freebsd.org/~pho/stress/log/cons204.html
>>
>> Could you use gdb to generate frame debugging information for the frame 
>> above nfs_rephead() (nfsrv_symlink()) also, please?  I'm a bit puzzled as 
>> to how things got into this state, as under normal circumstances, 
>> nfsm_reply() is the source of the nfs_rephead() call, and the NFS mutex is 
>> acquired the line before the call to nfsm_reply().
>
> cons204.html has been updated with info from frame 12.

Ah, all makes sense now.  I didn't realize that nfsm_srvpathsiz() was a route 
to nfsm_reply().  I'll investigate how best to fix this this evening. Similar 
bugs may exist elsewhere in the NFS server, and will presumably turn up during 
mbuf starvation.
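
For readers of the archive, here is a rough userspace sketch of the kind of
bug described above: a reply routine that requires the server mutex to be
held, and a handler whose early error path reaches it without taking the
lock. This is not the actual nfsserver code. Every name in it (toy_mtx,
srv_mtx, build_reply, handle_buggy, handle_fixed, parse_ok) is hypothetical,
and the assumption that the failing step is a request-parsing macro that
replies on its own error path is inferred only from the discussion in this
thread. A counter stands in for the panic so the effect can be observed.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a kernel mutex: it only tracks ownership. */
struct toy_mtx {
	bool owned;
};

static void
toy_lock(struct toy_mtx *m)
{
	assert(!m->owned);
	m->owned = true;
}

static void
toy_unlock(struct toy_mtx *m)
{
	assert(m->owned);
	m->owned = false;
}

static struct toy_mtx srv_mtx;		/* hypothetical; plays the role of nfsd_mtx */
static int ownership_violations;	/* counts what would be panics in the kernel */

/* Plays the role of the reply-header routine: the caller must hold srv_mtx. */
static void
build_reply(int error)
{
	(void)error;
	if (!srv_mtx.owned)
		ownership_violations++;	/* a kernel ownership assertion would panic here */
}

/* Buggy shape: the parse-failure path replies without taking the lock. */
static int
handle_buggy(bool parse_ok)
{
	if (!parse_ok) {
		build_reply(-1);	/* lock not held on this path */
		return (-1);
	}
	toy_lock(&srv_mtx);
	build_reply(0);
	toy_unlock(&srv_mtx);
	return (0);
}

/* One possible fix: every path that replies takes the lock first. */
static int
handle_fixed(bool parse_ok)
{
	int error = parse_ok ? 0 : -1;

	toy_lock(&srv_mtx);
	build_reply(error);
	toy_unlock(&srv_mtx);
	return (error);
}
```

The normal path of handle_buggy() never trips the check, which would be
consistent with a bug that only shows up when parsing fails, for instance
under resource pressure. Whether the real fix should take the lock on the
error path or restructure the macro is a separate question that this sketch
does not try to answer.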

Thanks!

Robert N M Watson
Computer Laboratory
University of Cambridge