Re: pointyhat panic

From: John Baldwin <jhb_at_freebsd.org> Date: Tue, 16 Jun 2009 08:12:48 -0400

On Monday 15 June 2009 8:58:11 pm Adam McDougall wrote:
> On Mon, Jun 15, 2009 at 10:02:16PM +0100, Kris Kennaway wrote:
> 
>   Pav Lucistnik wrote:
>   > panic: mtx_lock() of destroyed mutex @ /usr/src/sys/rpc/clnt_vc.c:953
>   > cpuid = 2
>   > KDB: enter: panic
>   > [thread pid 0 tid 100029 ]
>   > Stopped at      kdb_enter+0x3d: movq    $0,0x3f5fb8(%rip)
>   > db> bt
>   > Tracing pid 0 tid 100029 td 0xffffff00018e1000
>   > kdb_enter() at kdb_enter+0x3d
>   > panic() at panic+0x17b
>   > _mtx_lock_flags() at _mtx_lock_flags+0xc5
>   > clnt_vc_soupcall() at clnt_vc_soupcall+0x273
>   > sowakeup() at sowakeup+0xf8
>   > tcp_do_segment() at tcp_do_segment+0x23c9
>   > tcp_input() at tcp_input+0x9ec
>   > ip_input() at ip_input+0xbc
>   > ether_demux() at ether_demux+0x1ed
>   > ether_input() at ether_input+0x171
>   > em_rxeof() at em_rxeof+0x201
>   > em_handle_rxtx() at em_handle_rxtx+0x4b
>   > taskqueue_run() at taskqueue_run+0x96
>   > taskqueue_thread_loop() at taskqueue_thread_loop+0x3f
>   > fork_exit() at fork_exit+0x12a
>   > fork_trampoline() at fork_trampoline+0xe
>   > --- trap 0, rip = 0, rsp = 0xffffffff240a6d40, rbp = 0 ---
>   > 
>   > The box is in kdb on serial console for now. May 9 -CURRENT, I think.
>   > 
>   
>   This happened again.  The trigger was this (^C of a find on a busy 
>   netapp volume with a lot of other concurrent nfs traffic to the same 
>   mountpoint):
>   
>   pointyhat# find . -name \*.bz2 -mmin -10
>   ^Cnfs server dumpster:/vol/vol4/pointyhat: not responding
>   nfs server dumpster:/vol/vol4/pointyhat: not responding
>   nfs server dumpster:/vol/vol4/pointyhat: not responding
>   nfs server dumpster:/vol/vol4/pointyhat: not responding
>   nfs server dumpster:/vol/vol4/pointyhat: not responding
>   nfs server dumpster:/vol/vol4/pointyhat: not responding
>   nfs server dumpster:/vol/vol4/pointyhat: not responding
>   nfs server dumpster:/vol/vol4/pointyhat: not responding
>   nfs server dumpster:/vol/vol4/pointyhat: not responding
>   nfs server dumpster:/vol/vol4/pointyhat: not responding
>   nfs server dumpster:/vol/vol4/pointyhat: not responding
>   nfs server dumpster:/vol/vol4/pointyhat: not responding
>   load: 4.54  cmd: find 93357 [rpccon] 11.19u 111.62s 0% 4848k
>   
>   About 5-10 minutes later the machine panicked.  I'll try updating to a 
>   newer -CURRENT.
>   
>   Kris
> 
> This sounds like almost exactly the symptoms I noticed on
> a -current machine a few months ago.  I was doing a du on an
> NFS mount, decided to ctrl-c it, got "not responding" messages for a
> while, and a few minutes later the system panicked.  I hadn't
> had a chance to report it yet, but I did find a workaround:
> it is stable if I remove "intr" from the NFS mount options.
> Hope this helps a little.

These should be fixed in the latest HEAD.  It would be good to 
re-enable "intr" and test it before 8.0 is released.

-- 
John Baldwin