Re: panic: nfssvc_nfsd(): debug.mpsafenet=1 && Giant

From: Robert Watson <rwatson_at_freebsd.org>
Date: Tue, 24 Aug 2004 23:18:55 -0400 (EDT)
On Tue, 24 Aug 2004, Christian Brueffer wrote:

> I'm getting the following panic on 6-CURRENT as well as 5.3-BETA1 with
> sources from today.  It's easily reproducible on both, with
> debug.mpsafenet=1. 

Hmm.  Looks like something has acquired and failed to release Giant in the
NFS server.  It could be we're leaking Giant in an error case that didn't
turn up in previous testing, but for some reason is more common in your
environment.  Unfortunately, as Giant can be acquired recursively, the
"last acquired" information presented by WITNESS isn't useful to us.
There are a couple of ways we could approach debugging this.  I think the
easiest might be the following:

- Recompile your kernel with the following options:

  options KTR
  options KTR_COMPILE=(KTR_LOCK|KTR_PROC)
  options KTR_ENTRIES=16384

- At run-time, before triggering the crash, use sysctl to enable tracing
  on all CPUs and for every compiled-in event class:

  sysctl debug.ktr.cpumask=0xff
  sysctl debug.ktr.mask=`sysctl -n debug.ktr.compile`

- When the crash occurs, use "show ktr" to list recent lock operations.
  Using your serial console, copy and paste a few pages of locking
  operations into an e-mail, probably until you hit the next context
  switch (mi_switch: new thread).  You'll notice that the entries are
  sorted by event id, which generally results in reverse-chronological
  order with most recent event earliest.  We'd like to look at all
  acquisitions of Giant and releases of Giant since that context switch.
  There should be at least one more acquire than release in the event
  stream, and we'd like to figure out which acquire is the one not
  associated with a matching unlock.
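
If the pasted trace is long, the pairing of acquires and releases described above can be done mechanically. What follows is only a rough sketch, not a tested tool. It assumes that each KTR_LOCK entry for Giant contains the text "LOCK (sleep mutex) Giant" or "UNLOCK (sleep mutex) Giant", and that everything pasted since the last context switch belongs to a single thread; neither assumption is confirmed in this thread, and the function name is made up for illustration.

```python
import re

# Assumed entry shape (hypothetical, adjust to the real "show ktr" output):
#   "<index> LOCK (sleep mutex) Giant r = 0 at file.c:123"
#   "<index> UNLOCK (sleep mutex) Giant r = 0 at file.c:456"
# The leading \b keeps "LOCK" from matching inside "UNLOCK".
GIANT_EVENT = re.compile(r"\b(LOCK|UNLOCK) \(sleep mutex\) Giant\b")


def unmatched_giant_acquires(ktr_lines):
    """Return the Giant acquire entries that have no later release.

    ktr_lines is expected in "show ktr" order (most recent event first),
    so it is reversed here to replay the events oldest-first.
    """
    held = []  # stack of acquire entries not yet released
    for line in reversed(list(ktr_lines)):
        match = GIANT_EVENT.search(line)
        if match is None:
            continue  # not a Giant lock/unlock entry
        if match.group(1) == "LOCK":
            held.append(line.strip())
        elif held:
            # Giant recurses, so a release pairs with the most recent
            # acquire that is still outstanding.
            held.pop()
    return held
```

Feeding it the pasted lines in the order "show ktr" printed them should leave only the acquire entries that were never released, and the file and line on those entries are the ones worth reading first.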

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert_at_fledge.watson.org      Principal Research Scientist, McAfee Research

> 
> Crashdumps and debugging kernels available for both.
> 
> panic: nfssvc_nfsd(): debug.mpsafenet=1 && Giant
> cpuid = 0
> KDB: enter: panic
> [thread 100102]
> Stopped at      kdb_enter+0x2b: nop
> db> tr
> kdb_enter(c06e619b) at kdb_enter+0x2b
> panic(c06fbcc1,c0515fd4,0,2,2) at panic+0x125
> nfssvc_nfsd(c1c37420,c06e5488,121,c202b22c,c202b1c0) at nfssvc_nfsd+0x77e
> nfssvc(c1c37420,d8908d14,2,1,292) at nfssvc+0x1ac
> syscall(2f,2f,2f,bfbfecd8,0) at syscall+0x22b
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (155, FreeBSD ELF32, nfssvc), eip = 0x280d02ef, esp = 0xbfbfe92c, ebp = 0xbfbfe948 ---
> db> show locks
> exclusive sleep mutex nfsd_mtx r = 0 (0xc078dee0) locked @ /usr/home/build/src/sys/nfsserver/nfs_syscalls.c:510
> exclusive sleep mutex Giant r = 1 (0xc074c100) locked @ /usr/home/build/src/sys/nfsserver/nfs_serv.c:2693
> db>
> 
> 
> - Christian
> 
> -- 
> Christian Brueffer	chris_at_unixpages.org	brueffer_at_FreeBSD.org
> GPG Key:	 http://people.freebsd.org/~brueffer/brueffer.key.asc
> GPG Fingerprint: A5C8 2099 19FF AACA F41B  B29B 6C76 178C A0ED 982D
> 
Received on Wed Aug 25 2004 - 01:21:10 UTC
