Re: ZFS-related panic: "possible" spa->spa_errlog_lock deadlock

From: Fabian Keil <freebsd-listen_at_fabiankeil.de>
Date: Wed, 28 Oct 2015 13:58:21 +0100

Xin Li <delphij_at_delphij.net> wrote:

> On 9/7/14 11:23 PM, Fabian Keil wrote:
> > Xin Li <delphij_at_delphij.net> wrote:
> >   
> >> On 9/7/14 9:02 PM, Fabian Keil wrote:  
> >>> Using a kernel built from FreeBSD 11.0-CURRENT r271182 I got
> >>> the following panic yesterday:
> >>> 
> >>> [...] Unread portion of the kernel message buffer: [6880]
> >>> panic: deadlkres: possible deadlock detected for
> >>> 0xfffff80015289490, blocked for 1800503 ticks  
> >> 
> >> Any chance to get all backtraces (e.g. thread apply all bt full
> >> 16)? I think a different thread that held the lock have been
> >> blocked, probably related to your disconnected vdev.  
> > 
> > Output of "thread apply all bt full 16" is available at: 
> > http://www.fabiankeil.de/tmp/freebsd/kgdb-output-spa_errlog_lock-deadlock.txt
> >
> >  A lot of the backtraces prematurely end with "Cannot access memory
> > at address", therefore I also added "thread apply all bt" output.
> > 
> > Apparently there are at least two additional threads blocking below
> > spa_get_stats():
[...]
> Yes, thread 1182 owned the lock and is waiting for the zio be done.
> Other threads that wanted the lock would have to wait.
> 
> I don't have much clue why the system entered this state, however, as
> the operations should have errored out (the GELI device is gone on
> 21:44:56 based on your log, which suggests all references were closed)
> instead of waiting.

Thanks for the responses.

I finally found the time to analyse the problem. It seems to be
that spa_sync() needs at least one writeable vdev to complete,
but while it waits it holds the lock(s) required to remove or
bring back vdevs.

Letting spa_sync() drop the lock and wait for at least one vdev
to become writeable again seems to make the problem unreproducible
for me, but it probably merely shrinks the race window and is thus
not a complete solution.
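
The general "drop the lock while waiting" pattern can be sketched in
user space with POSIX threads. This is only an analogy and not the
code from the diff: struct pool, config_lock, vdev_changed,
pool_sync(), reattach_vdev() and run_demo() are hypothetical names
made up for illustration, and the real locking in spa_sync() is more
involved.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a pool; none of these are ZFS names. */
struct pool {
	pthread_mutex_t config_lock;    /* needed to add or remove vdevs */
	pthread_cond_t  vdev_changed;   /* signalled when vdev state changes */
	int             writable_vdevs; /* protected by config_lock */
	bool            synced;         /* protected by config_lock */
};

/*
 * Sync needs at least one writable vdev. Instead of holding
 * config_lock while it waits, it sleeps on the condition variable;
 * pthread_cond_wait() releases the mutex while sleeping and
 * reacquires it before returning.
 */
static void
pool_sync(struct pool *p)
{
	pthread_mutex_lock(&p->config_lock);
	while (p->writable_vdevs == 0)
		pthread_cond_wait(&p->vdev_changed, &p->config_lock);
	p->synced = true;
	pthread_mutex_unlock(&p->config_lock);
}

/* Administrative path: brings a vdev back, which needs config_lock. */
static void *
reattach_vdev(void *arg)
{
	struct pool *p = arg;

	pthread_mutex_lock(&p->config_lock);
	p->writable_vdevs++;
	pthread_cond_broadcast(&p->vdev_changed);
	pthread_mutex_unlock(&p->config_lock);
	return (NULL);
}

/* Runs sync and reattach concurrently; returns true if sync finished. */
bool
run_demo(void)
{
	struct pool p = { .writable_vdevs = 0, .synced = false };
	pthread_t admin;
	bool ok;
	int rc;

	pthread_mutex_init(&p.config_lock, NULL);
	pthread_cond_init(&p.vdev_changed, NULL);

	rc = pthread_create(&admin, NULL, reattach_vdev, &p);
	assert(rc == 0);
	pool_sync(&p);          /* completes once a vdev is writable */
	rc = pthread_join(admin, NULL);
	assert(rc == 0);

	ok = p.synced && p.writable_vdevs == 1;
	pthread_cond_destroy(&p.vdev_changed);
	pthread_mutex_destroy(&p.config_lock);
	return (ok);
}
```

If pool_sync() instead kept config_lock held while polling
writable_vdevs, reattach_vdev() could never take the lock and both
threads would block forever, which is the shape of the hang
described above.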

For details see:
https://www.fabiankeil.de/sourcecode/electrobsd/ZFS-Optionally-let-spa_sync-wait-for-writable-vdev.diff
(Experimental, only lightly tested)

Fabian

Received on Wed Oct 28 2015 - 12:12:27 UTC
