Re: RELENG_5 panic [nic: _mtx_lock_sleep: recursed on non-recursive mutex nfsd_mtx]

From: Robert Watson <rwatson_at_freebsd.org>
Date: Mon, 25 Oct 2004 09:13:27 +0100 (BST)
On Mon, 25 Oct 2004, Wilkinson, Alex wrote:

> The panic doesn't occur any longer since I upgraded world/kernel.
> Therefore what you have written below would be pointless. 
> 
> Are you suspecting that the bug is still there but not being triggered?
> 
> I can downgrade world/kernel for testing if that is possible and needed.

I believe I copied you on a follow-up message, which might be a few more
messages into your message queue, in which I said something on the order
of "Ah, I found the bug -- did you recently rearrange the disks?".  I
committed a patch that I believe should fix the problem (and a small class
of related problems).  The problem you were experiencing should have gone
away after the last NFS client was rebooted or unmounted the NFS file
system following the disk rearrangement, and then would not have recurred
unless the disks were further rearranged.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert_at_fledge.watson.org      Principal Research Scientist, McAfee Research
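
For readers unfamiliar with the panic string in this thread: "recursed on
non-recursive mutex" means a thread tried to acquire a mutex it already
held, and that mutex was not created as recursive.  The sketch below is a
userland analogy using POSIX threads, not the kernel code in nfs_serv.c.
nfsd_mtx is an mtx(9) kernel mutex, and the helper name
relock_same_thread is hypothetical.  An error-checking pthread mutex
reports the second lock attempt with EDEADLK, where the kernel's mutex
code panics instead.

```c
#define _XOPEN_SOURCE 700	/* exposes PTHREAD_MUTEX_ERRORCHECK on some libcs */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/*
 * Userland analogy only (assumption): this is NOT how nfsd_mtx is
 * implemented.  Hypothetical helper that locks a mutex of the given
 * type, then tries to lock it again from the same thread, and returns
 * the result of that second pthread_mutex_lock() call.
 */
int
relock_same_thread(int type)
{
	pthread_mutexattr_t attr;
	pthread_mutex_t m;
	int first, second;

	pthread_mutexattr_init(&attr);
	pthread_mutexattr_settype(&attr, type);
	pthread_mutex_init(&m, &attr);
	pthread_mutexattr_destroy(&attr);

	/* First acquisition: the mutex is free, so this must succeed. */
	first = pthread_mutex_lock(&m);
	assert(first == 0);
	(void)first;

	/* Second acquisition by the owner: this is the "recursion". */
	second = pthread_mutex_lock(&m);

	/* Unlock once per successful lock before destroying the mutex. */
	if (second == 0)
		pthread_mutex_unlock(&m);
	pthread_mutex_unlock(&m);
	pthread_mutex_destroy(&m);

	return (second);
}
```

Calling relock_same_thread(PTHREAD_MUTEX_ERRORCHECK) is expected to return
EDEADLK, while relock_same_thread(PTHREAD_MUTEX_RECURSIVE) returns 0
because a recursive mutex lets its owner lock it again.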


> 
>  - aW
> 
> 
> 	On Mon, Oct 18, 2004 at 05:22:01AM -0400, Robert Watson wrote:
> 
> 	Hmm.  There was a fix to nfs_serv.c on 2004/08/25 that corrected
> 	an incorrect lock pair, but I don't know of any related bugs corrected
> 	since then.  If it's possible to get back to a point where this can be
> 	reproduced, configuring your kernel to run with KTR and to use KTR_LOCK
> 	would be quite helpful.  You can find a bit of information on using KTR
> 	here:
> 	
> 	    http://www.watson.org/~robert/freebsd/netperf/ktr/
> 	
> 	Basically, once it panics, you can use the KTR-related commands in DDB to
> 	inspect the KTR buffer, and I believe also extract the KTR log from a core
> 	dump.  Using the set of KTR_COMPILE flags I have on that web page would be
> 	helpful.  Basically, this generates a trace of locking, context switch,
> 	interrupt, callout, memory allocation, and system call events.  The
> 	problem likely occurred pretty immediately prior to the panic.
> 	
> 	If this is timing-related, it could be KTR disrupts the timing
> 	sufficiently to prevent it.  More likely, it went away because the work
> 	load on the NFS server changed, or the directory layout changed, or some
> 	other factor that caused the passage through the NFS code to change.
> 	
> 	Thanks!
> 	
> 	Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
> 	robert_at_fledge.watson.org      Principal Research Scientist, McAfee Research
> 	
> 	
> 	> 
> 	>  - aW
> 	> 
> 	> 
> 	> 
> 	> 	On Sun, Oct 17, 2004 at 10:19:34AM -0400, Robert Watson wrote:
> 	> 	
> 	> 	On Fri, 15 Oct 2004, Wilkinson, Alex wrote:
> 	> 	
> 	> 	> Hi all,
> 	> 	> 
> 	> 	> I currently get a panic with "nfs_server_enable=YES" in /etc/rc.conf.
> 	> 	> 
> 	> 	> OS: FreeBSD 5.3-BETA4 #2: Tue Sep 14 13:55:30 UTC 2004
> 	> 	> 
> 	> 	> Backtrace
> 	> 	> ---------
> 	> 	> 
> 	> 	> panic: _mtx_lock_sleep: recursed on non-recursive mutex nfsd_mtx @ /usr/src/sys/nfsserver/nfs_serv.c:1947
> 	> 	
> 	> 	Is the NFS server code compiled into your kernel, or is it getting loaded
> 	> 	as a module?  Do you have any other NFS-related entries in /etc/rc.conf?
> 	> 	Could you show the output of "show locks" and "show witness" with witness
> 	> 	compiled into the kernel?
> 	> 	
> 	> 	Thanks!
> 	> 	
> 	> 	Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
> 	> 	robert_at_fledge.watson.org      Principal Research Scientist, McAfee Research
> 	> 	
> 	> 	
> 	> 	> cpuid = 0
> 	> 	> KDB: enter: panic
> 	> 	> [thread 100074]
> 	> 	> Stopped at      kdb_enter+0x32: leave
> 	> 	> db> tr
> 	> 	> kdb_enter(c068ba66,0,c068aed8,dd2ed90c,c1b6b340) at kdb_enter+0x32
> 	> 	> panic(c068aed8,c069885f,c0698617,79b,c0698617) at panic+0x1b0
> 	> 	> _mtx_lock_sleep(c071c940,c1b6b340,0,c0698617,79b) at _mtx_lock_sleep+0x16e
> 	> 	> _mtx_lock_flags(c071c940,0,c0698617,79b,0) at _mtx_lock_flags+0xb0
> 	> 	> nfsrv_create(c1eb6800,c1b98800,c1b6b340,dd2edc8c,0) at nfsrv_create+0x8c4
> 	> 	> nfssvc(c1b6b340,dd2edd14,8,0,2) at nfssvc+0x6ea
> 	> 	> syscall(2f,2f,2f,0,0) at syscall+0x13b
> 	> 	> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> 	> 	> --- syscall (155, FreeBSD ELF32, nfssvc), eip = 0x280c60bf, esp = 0xbfbfeb1c, ebp = 0xbfbfeb38 ---
> 	> 	> 
> 	> 	> 
> 	> 	>  - aW
> 	> 	> _______________________________________________
> 	> 	> freebsd-current_at_freebsd.org mailing list
> 	> 	> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> 	> 	> To unsubscribe, send any mail to "freebsd-current-unsubscribe_at_freebsd.org"
> 	> 	> 
> 	> 	
> 	
Received on Mon Oct 25 2004 - 06:13:57 UTC
