Re: Deadlock between GEOM and devfs device destroy and process exit.

From: Pawel Jakub Dawidek <pjd_at_FreeBSD.org>
Date: Sat, 30 Jan 2010 12:44:51 +0100

On Sat, Jan 30, 2010 at 12:27:49PM +0100, Pawel Jakub Dawidek wrote:
> On Sat, Jan 30, 2010 at 12:58:26AM +0200, Alexander Motin wrote:
> > Hi.
> > 
> > Experimenting with SATA hot-plug, I've found a quite repeatable deadlock
> > case. The problem is observed when several SATA devices, opened via devfs,
> > disappear at exactly the same time. In my case, that is when unplugging a
> > SATA Port Multiplier with several disks behind it. All I have to do is run
> > several `dd if=/dev/adaX of=/dev/null bs=1m &` commands and unplug the
> > multiplier. That causes predictable I/O errors and device destruction.
> > But with high probability several dd processes get stuck in the kernel.
> [...]
> 
> I observed the same thing yesterday while stress-testing HAST:
> 
>  3659  2504  3659     0  DE+     GEOM top 0x8079a348 dd
>  3658  2102  2102     0  DE+     GEOM top 0x8079a348 hastd
>     2     0     0     0  DL      devdrn   0x85b1bc68 [g_event]
> 
> Both dd(1) and hastd(8) wait for the GEOM topology lock in the exit path,
> which is already held by the g_event thread.

Let me add how I understand what's going on:

GEOM calls destroy_dev() while holding the topology lock.

destroy_dev() wants to destroy the device, but can't, because there are
threads that still have it open.

The threads can't close it, because to close it they need the topology
lock.

The deadlock is quite obvious, IMHO.
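
To make the cycle concrete, here is a small user-space C model (not kernel
code). It treats the three threads from the ps output above as nodes in a
wait-for graph and checks whether any of them ends up waiting for itself.
The names wait_graph, build_graph() and is_deadlocked() are invented for
this illustration, and the model assumes that both dd and hastd still hold
the device open.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical actors, named after the threads in the ps output. */
enum { G_EVENT, DD, HASTD, NACTORS };

/* g[a][b] is true when actor a cannot proceed until actor b does. */
typedef bool wait_graph[NACTORS][NACTORS];

/* Depth-first search: can `target` be reached from `from`? */
static bool
reaches(wait_graph g, int from, int target, bool seen[NACTORS])
{
	assert(from >= 0 && from < NACTORS);
	if (seen[from])
		return false;
	seen[from] = true;
	for (int next = 0; next < NACTORS; next++) {
		if (!g[from][next])
			continue;
		if (next == target || reaches(g, next, target, seen))
			return true;
	}
	return false;
}

/* An actor is deadlocked if it transitively waits for itself. */
static bool
is_deadlocked(wait_graph g, int actor)
{
	bool seen[NACTORS] = { false };

	return reaches(g, actor, actor, seen);
}

/*
 * Build the wait-for graph. `topology_held` says whether the event
 * thread keeps the topology lock while it waits in destroy_dev().
 */
static void
build_graph(wait_graph g, bool topology_held)
{
	memset(g, 0, sizeof(wait_graph));
	/* destroy_dev() waits for every thread that has the device open. */
	g[G_EVENT][DD] = true;
	g[G_EVENT][HASTD] = true;
	/* Closing the device needs the topology lock held by g_event. */
	if (topology_held) {
		g[DD][G_EVENT] = true;
		g[HASTD][G_EVENT] = true;
	}
}
```

With topology_held set to true, every actor sits on a cycle. With it set
to false, the edges from dd and hastd back to g_event disappear and no
cycle remains.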

I believe the problem could be solved by dropping the topology lock in
g_dev_orphan() when calling destroy_dev(dev), but it is hard to say if
it is safe to drop the topology lock there. Maybe Poul-Henning could
take a look.
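
As a rough sketch of that idea, the following user-space program uses
pthreads as stand-ins for the kernel primitives. The names model_orphan(),
model_destroy_dev(), closer() and run_model() are hypothetical, and this is
not the real g_dev_orphan() code. It only shows that the waiters can finish
once the lock is dropped around the destroy. It says nothing about whether
dropping the lock at that point is safe for the rest of GEOM, which is the
open question.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_CLOSERS 8

/* User-space stand-ins; none of these are the kernel's own objects. */
static pthread_mutex_t topology = PTHREAD_MUTEX_INITIALIZER; /* models the topology lock */
static pthread_mutex_t dev_mtx = PTHREAD_MUTEX_INITIALIZER;  /* protects open_count */
static pthread_cond_t dev_cv = PTHREAD_COND_INITIALIZER;
static int open_count;

/* Models a process closing the device on exit: it needs the topology lock. */
static void *
closer(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&topology);
	pthread_mutex_lock(&dev_mtx);
	open_count--;
	pthread_cond_broadcast(&dev_cv);
	pthread_mutex_unlock(&dev_mtx);
	pthread_mutex_unlock(&topology);
	return NULL;
}

/* Models destroy_dev(): sleeps until nobody has the device open. */
static void
model_destroy_dev(void)
{
	pthread_mutex_lock(&dev_mtx);
	while (open_count > 0)
		pthread_cond_wait(&dev_cv, &dev_mtx);
	pthread_mutex_unlock(&dev_mtx);
}

/* Entered with the topology lock held; drops it around the destroy. */
static void
model_orphan(void)
{
	pthread_mutex_unlock(&topology);
	model_destroy_dev();
	pthread_mutex_lock(&topology);
}

/* Runs the scenario with `nclosers` openers; returns the final open count. */
static int
run_model(int nclosers)
{
	pthread_t tids[MAX_CLOSERS];
	int i, rc;

	assert(nclosers >= 0 && nclosers <= MAX_CLOSERS);
	open_count = nclosers;
	pthread_mutex_lock(&topology);	/* the event thread takes the lock */
	for (i = 0; i < nclosers; i++) {
		rc = pthread_create(&tids[i], NULL, closer, NULL);
		assert(rc == 0);
	}
	model_orphan();
	pthread_mutex_unlock(&topology);
	for (i = 0; i < nclosers; i++) {
		rc = pthread_join(tids[i], NULL);
		assert(rc == 0);
	}
	return open_count;
}
```

Calling run_model(2) returns 0 once both closers have run. If the
unlock/lock pair in model_orphan() were removed, the same call would hang,
which matches the behaviour reported above.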

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd_at_FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

Received on Sat Jan 30 2010 - 10:45:01 UTC
