On Tue, 15 Aug 2006, Peter Holm wrote:

> On Tue, Aug 15, 2006 at 05:01:20PM +0100, Robert Watson wrote:
>>
>> On Tue, 15 Aug 2006, Peter Holm wrote:
>>
>>> While stress testing GENERIC HEAD from Aug 12 12:55 UTC I got this
>>> panic:
>>>
>>> panic: mutex nfsd_mtx not owned at
>>> ../../../nfsserver/nfs_srvsock.c:148
>>> cpuid = 2
>>> KDB: enter: panic
>>> [thread pid 761 tid 100096 ]
>>> Stopped at kdb_enter+0x2b: nop
>>> db> where
>>> Tracing pid 761 tid 100096 td 0xc4041a20
>>> kdb_enter(c091cda8) at kdb_enter+0x2b
>>> panic(c091c0b7,c09210c9,c093241d,94,0,...) at panic+0x14b
>>> _mtx_assert(c0a64ec0,1,c093241d,94,c07ec53c,...) at _mtx_assert+0x66
>>> nfs_rephead(0,c52a0600,48,e662e964,e662e968,...) at nfs_rephead+0x25
>>> nfsrv_symlink(c52a0600,c4071e00,c4041a20,e662ec40) at
>>> nfsrv_symlink+0x3b7
>>> nfssvc_nfsd(c4041a20) at nfssvc_nfsd+0x409
>>> nfssvc(c4041a20,e662ed04) at nfssvc+0x18c
>>> syscall(3b,3b,3b,1,0,...) at syscall+0x256
>>>
>>> More details at http://people.freebsd.org/~pho/stress/log/cons204.html
>>
>> Could you use gdb to generate frame debugging information for the frame
>> above nfs_rephead() (nfsrv_symlink()) also, please? I'm a bit puzzled as
>> to how things got into this state, as under normal circumstances,
>> nfsm_reply() is the source of the nfs_rephead() call, and the NFS mutex is
>> acquired the line before the call to nfsm_reply().
>
> cons204.html has been updated with info from frame 12.

Ah, all makes sense now. I didn't realize that nfsm_srvpathsiz() was a route
to nfsm_reply(). I'll investigate how best to fix this this evening. Similar
bugs may exist elsewhere in the NFS server, and will presumably turn up during
mbuf starvation.

Thanks!

Robert N M Watson
Computer Laboratory
University of Cambridge

Received on Tue Aug 15 2006 - 16:04:55 UTC
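
For readers following the thread, the class of bug described above can be sketched in user space. This is only an illustration under stated assumptions, not the FreeBSD code: the names demo_mtx, demo_rephead(), normal_path(), unlocked_error_path() and locked_error_path() are hypothetical, the length check is an assumed stand-in for whatever early-exit route reaches the reply code, and the "assertion" returns -1 where the kernel would panic. It shows a reply-header routine that requires the lock, a normal path that takes the lock first, and an early-exit path that reaches the same routine without it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a kernel mutex that records its owner. */
struct owned_mutex {
    pthread_mutex_t lock;
    pthread_t owner;   /* meaningful only while held is true */
    bool held;
};

static struct owned_mutex demo_mtx = { .lock = PTHREAD_MUTEX_INITIALIZER };

static void demo_lock(struct owned_mutex *m)
{
    pthread_mutex_lock(&m->lock);
    m->owner = pthread_self();
    m->held = true;
}

static void demo_unlock(struct owned_mutex *m)
{
    /* Only the owning thread may release the lock. */
    assert(m->held && pthread_equal(m->owner, pthread_self()));
    m->held = false;
    pthread_mutex_unlock(&m->lock);
}

/* Rough analogue of an "owned" check. Reading the fields without the lock
 * is acceptable only because this sketch is single-threaded. */
static bool demo_owned(const struct owned_mutex *m)
{
    return m->held && pthread_equal(m->owner, pthread_self());
}

/* Stand-in for a reply-header builder that requires the lock.
 * Returns 0 on success, -1 where an asserting kernel would panic. */
static int demo_rephead(void)
{
    if (!demo_owned(&demo_mtx))
        return -1;
    return 0;
}

/* Normal path: the lock is taken immediately before building the reply. */
static int normal_path(void)
{
    demo_lock(&demo_mtx);
    int rc = demo_rephead();
    demo_unlock(&demo_mtx);
    return rc;
}

/* Early-exit path (assumed shape): a validation step replies with an
 * error directly and so reaches demo_rephead() without the lock. */
static int unlocked_error_path(size_t len, size_t maxlen)
{
    if (len > maxlen)
        return demo_rephead();   /* lock not held here */
    return normal_path();
}

/* One possible repair: take the lock on the early-exit path as well. */
static int locked_error_path(size_t len, size_t maxlen)
{
    if (len > maxlen) {
        demo_lock(&demo_mtx);
        int rc = demo_rephead();
        demo_unlock(&demo_mtx);
        return rc;
    }
    return normal_path();
}
```

Calling unlocked_error_path() with a length above the limit trips the ownership check, while the same call with a valid length goes through normal_path() and succeeds. That is why such a bug can stay hidden until an unusual error condition occurs. Whether the real fix belongs in the caller or in the helper is a separate design question that this sketch does not settle.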