Re: Deadlocks / hangs in ZFS

From: Slawa Olhovchenkov <slw_at_zxy.spb.ru>
Date: Sun, 3 Jun 2018 22:28:14 +0300
On Sun, Jun 03, 2018 at 09:14:50PM +0200, Alexander Leidinger wrote:

> Quoting Alexander Leidinger <Alexander_at_leidinger.net> (from Mon, 28  
> May 2018 09:02:01 +0200):
> 
> > Quoting Slawa Olhovchenkov <slw_at_zxy.spb.ru> (from Mon, 28 May 2018  
> > 01:06:12 +0300):
> >
> >> On Sun, May 27, 2018 at 09:41:59PM +0200, Kirill Ponomarev wrote:
> >>
> >>> On 05/22, Slawa Olhovchenkov wrote:
> >>>> > It has been a while since I tried Karl's patch the last time, and I
> >>>> > stopped because it didn't apply to -current anymore at some point.
> >>>> > Will what is provided right now in the patch work on -current?
> >>>>
> >>>> I think so, yes, after s/vm_cnt.v_free_count/vm_free_count()/g.
> >>>> I don't know how to keep two distinct patches (for stable and
> >>>> current) in one review.
> >>>
> >>> I'm experiencing these issues sporadically as well. Would you mind
> >>> publishing this patch for fresh current?
> >>
> >> A week ago I adapted and published the patch for fresh current and
> >> stable. Does it need adapting again?
> >
> > I applied the patch in the review yesterday to rev 333966, it  
> > applied OK (with some fuzz). I will try to reproduce my issue with  
> > the patch.
> 
> The behavior changed (or the system was long enough in this state  
> without me noticing it). I have a panic now:
> panic: deadlkres: possible deadlock detected for 0xfffff803766db580,  
> blocked for 1803003 ticks

Hmm, maybe first determine which function is blocked:

addr2line -ie /boot/kernel/kernel 0xfffff803766db580

or

kgdb
x/10i 0xfffff803766db580

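As a rough sanity check on the "blocked for 1803003 ticks" figure, the
tick count can be converted to wall-clock time. This is only a sketch:
the hz value of 1000 is an assumption, not something shown in the
textdump, so check `sysctl kern.hz` on the affected machine first.

```shell
# Convert the tick count from the panic message to seconds and minutes.
# ASSUMPTION: hz=1000 ticks per second; verify with `sysctl kern.hz`.
ticks=1803003
hz=1000
seconds=$((ticks / hz))      # integer division, fraction is dropped
minutes=$((seconds / 60))
echo "${seconds} s (~${minutes} min)"
```

Under that assumption the thread had been blocked for roughly half an
hour when deadlkres fired.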

> I only have the textdump. Is anyone up to debugging this? If so, I will
> switch to normal dumps; just tell me what to check for.
> 
> db:0:kdb.enter.panic>  run lockinfo
> db:1:lockinfo> show locks
> No such command; use "help" to list available commands
> db:1:lockinfo>  show alllocks
> No such command; use "help" to list available commands
> db:1:lockinfo>  show lockedvnods
> Locked vnodes
> db:0:kdb.enter.panic>  show pcpu
> cpuid        = 6
> dynamic pcpu = 0xfffffe008f03e840
> curthread    = 0xfffff80370c82000: pid 0 tid 100218 "deadlkres"
> curpcb       = 0xfffffe0116472cc0
> fpcurthread  = none
> idlethread   = 0xfffff803700b9580: tid 100008 "idle: cpu6"
> curpmap      = 0xffffffff80d28448
> tssp         = 0xffffffff80d96d90
> commontssp   = 0xffffffff80d96d90
> rsp0         = 0xfffffe0116472cc0
> gs32p        = 0xffffffff80d9d9c8
> ldt          = 0xffffffff80d9da08
> tss          = 0xffffffff80d9d9f8
> db:0:kdb.enter.panic>  bt
> Tracing pid 0 tid 100218 td 0xfffff80370c82000
> kdb_enter() at kdb_enter+0x3b/frame 0xfffffe0116472aa0
> vpanic() at vpanic+0x1c0/frame 0xfffffe0116472b00
> panic() at panic+0x43/frame 0xfffffe0116472b60
> deadlkres() at deadlkres+0x3a6/frame 0xfffffe0116472bb0
> fork_exit() at fork_exit+0x84/frame 0xfffffe0116472bf0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0116472bf0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> 
> 
> Bye,
> Alexander.
> 
> -- 
> http://www.Leidinger.net Alexander_at_Leidinger.net: PGP 0x8F31830F9F2772BF
> http://www.FreeBSD.org    netchild_at_FreeBSD.org  : PGP 0x8F31830F9F2772BF
Received on Sun Jun 03 2018 - 17:28:29 UTC
