Re: panic: mutex Giant owned at nfs_syscalls.c:556

From: Kostik Belousov <kostikbel_at_gmail.com>
Date: Tue, 4 Mar 2008 12:04:12 +0200
On Mon, Mar 03, 2008 at 11:16:23PM +0300, pluknet wrote:
> On 03/03/2008, Kostik Belousov <kostikbel_at_gmail.com> wrote:
> > On Mon, Mar 03, 2008 at 09:27:15PM +0300, pluknet wrote:
> >  > On 03/03/2008, Kostik Belousov <kostikbel_at_gmail.com> wrote:
> >  > [snip]
> >  > >  To summarize, I need both the tcpdump and kernel/witness messages from
> >  > >  the panic.
> >  > >
> >  >
> >  > I'm sorry. Here it is.
> >  > http://pluknet.nm.ru/dev/tcpdump-nfsserver-full.raw
> >  >
> >  > The messages (same as unread msgbuf in initial posting, hand-scribed):
> >  > panic: mutex Giant owned at
> >  > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_syscalls.c:556
> >  > KDB: enter: panic
> >  > [thread pid 601 tid 100055 ]
> >  > Stopped at kdb_enter+0x3a: movl $0,kdb_why
> >  > db> show locks
> >  > exclusive sleep mutex nfsd_mtx r = 0 (0xc2e0af40) locked @
> >  > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_syscalls.c:501
> >  > exclusive sleep mutex Giant r = 0 (0xc07e6410) locked @
> >  > /usr/src/sys/kern/vfs_lookup.c:663
> >  >
> >  > >  Nevertheless, the patch below might help with the panic during
> >  > >  the unlinking (not tested).
> >  > >
> >  > >  diff --git a/sys/nfsserver/nfs_serv.c b/sys/nfsserver/nfs_serv.c
> >  > >  index 446651d..87e1aaa 100644
> >  > >  --- a/sys/nfsserver/nfs_serv.c
> >  > >  +++ b/sys/nfsserver/nfs_serv.c
> >  > >  @@ -2146,7 +2146,7 @@ nfsrv_remove(struct nfsrv_descript *nfsd, struct nfssvc_sock *slp,
> >  > >         nfsfh_t nfh;
> >  > >         fhandle_t *fhp;
> >  > >         struct mount *mp = NULL;
> >  > >  -       int vfslocked;
> >  > >  +       int vfslocked, vfslocked1;
> >  > >
> >  > >         nfsdbprintf(("%s %d\n", __FILE__, __LINE__));
> >  > >         ndclear(&nd);
> >  > >  @@ -2168,7 +2168,11 @@ nfsrv_remove(struct nfsrv_descript *nfsd, struct nfssvc_sock *slp,
> >  > >         nd.ni_cnd.cn_flags = LOCKPARENT | LOCKLEAF | MPSAFE;
> >  > >         error = nfs_namei(&nd, fhp, len, slp, nam, &md, &dpos,
> >  > >                 &dirp, v3,  &dirfor, &dirfor_ret, td, FALSE);
> >  > >  -       vfslocked = NDHASGIANT(&nd);
> >  > >  +       vfslocked1 = NDHASGIANT(&nd);
> >  > >  +       if (vfslocked && vfslocked1)
> >  > >  +               VFS_UNLOCK_GIANT(vfslocked1);
> >  > >  +       if (vfslocked || vfslocked1)
> >  > >  +               vfslocked = 1;
> >  > >         if (dirp && !v3) {
> >  > >                 vrele(dirp);
> >  > >                 dirp = NULL;
> >  > >
> >  > >
> >  >
> >  > Now the last lock triplex looks like:
> >  > vfslocked lock in
> >  > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_serv.c, 2161
> >  > vfslocked lock in
> >  > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 1106
> >  > vfslocked lock in
> >  > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 673
> >  > vfslocked unlock in
> >  > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c, 916
> >  > vfslocked1 unlock in
> >  > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_serv.c, 2173
> >  > ^^^
> >  > vfslocked unlock in
> >  > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_serv.c, 2238
> >  >
> >  > And no panic. Thanks.
> >
> >
> > Could you please clarify? As I read your mail, the patch fixed at least
> >  one of your panics. Are there any other situations where the nfs server
> >  over a non-MPSAFE fs panics for you? It is possible that what you reported
> >  before actually contains several different causes of the Giant leak.
> 
> Of course.
> The other situation occurs while running /etc/rc.d/nfsd stop:
> > System call nfssvc returning with the following locks held:
> > exclusive sleep mutex Giant r = 2 (0xc07e6410) locked @
> > /usr/src/sys/modules/nfsserver/../../nfsserver/nfs_srvsubs.c:1106
> > panic: witness_warn
> 
> I got no panic with this patch:
I lost you again. Which patch are you referring to there? Is that the
patch I sent you, or some _other_ patch?
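
For reference, the accounting that the nfsrv_remove() patch quoted above
is meant to do can be modelled in user space roughly as below. This is
only a sketch, not kernel code: giant_depth, model_lock(), model_unlock()
and both model_remove*() functions are made-up names, and it assumes that
Giant may already be held once before nfs_namei() is called and that the
function does a single unlock on exit.

```c
#include <assert.h>

/* Hypothetical stand-in for the recursion count of a recursive lock. */
static int giant_depth;

static void model_lock(void)   { giant_depth++; }
static void model_unlock(void) { assert(giant_depth > 0); giant_depth--; }

/*
 * Unpatched shape: the flag from the lookup overwrites the earlier one,
 * so one acquisition is forgotten whenever Giant was already held.
 * Returns the recursion depth left after the function's single unlock.
 */
static int model_remove_unpatched(int first_held, int namei_held)
{
	int vfslocked = first_held;

	giant_depth = 0;
	if (first_held)
		model_lock();		/* taken before the lookup */
	if (namei_held)
		model_lock();		/* taken again by the lookup */
	vfslocked = namei_held;		/* earlier state is lost here */

	if (vfslocked)
		model_unlock();		/* single unlock on exit */
	return (giant_depth);
}

/*
 * Patched shape: keep both flags, drop the extra recursion at once,
 * then fold them into one flag so the single exit unlock balances.
 */
static int model_remove_patched(int first_held, int namei_held)
{
	int vfslocked = first_held, vfslocked1 = namei_held;

	giant_depth = 0;
	if (vfslocked)
		model_lock();
	if (vfslocked1)
		model_lock();

	if (vfslocked && vfslocked1)
		model_unlock();		/* undo the second acquisition */
	if (vfslocked || vfslocked1)
		vfslocked = 1;

	if (vfslocked)
		model_unlock();		/* single unlock on exit */
	return (giant_depth);
}
```

In this model model_remove_patched() returns 0 for every combination of
the two flags, while model_remove_unpatched(1, 1) leaves a depth of 1,
which is the kind of leftover recursion witness complains about.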

> # /etc/rc.d/nfsd stop
> Stopping nfsd.
> kill: 1737: No such process
> kill: 1738: No such process
> kill: 1739: No such process
> kill: 1740: No such process
> #
> 
> wbr,
> pluknet

Received on Tue Mar 04 2008 - 09:04:29 UTC
