ATA driver races with interrupts

From: Ville-Pertti Keinonen <will+freebsd-current_at_will.iki.fi>
Date: Mon, 02 Aug 2004 19:15:09 +0300
I've finally had time to track down a long-standing, easily repeatable 
(on my system, anyhow) lockup.

When accessing two SATA disks on different channels of the same 
controller simultaneously (both channels sharing the same interrupt), an 
interrupt on one channel could retire the request on the other channel 
if the interrupt occurred during ata_generic_transaction (after 
ch->running is set, before the actual transaction begins), leaving the 
device stuck (with ch->dma->flags set as ATA_DMA_ACTIVE but ch->running 
== NULL), often with a message of "already active DMA on this device".

Attached is a patch against -current as of a couple of days ago (before 
the #if 0 of PREEMPTION).  I'm not sure whether this is the right way to 
fix this (e.g. disabling interrupts altogether might be safer), but it 
seems to work for me.



Index: ata-all.h
===================================================================
RCS file: /data/freebsd/src/sys/dev/ata/ata-all.h,v
retrieving revision 1.79
diff -u -r1.79 ata-all.h
--- ata-all.h	30 Apr 2004 16:21:34 -0000	1.79
+++ ata-all.h	2 Aug 2004 15:54:50 -0000
_at__at_ -339,6 +339,7 _at__at_
 #define		ATA_48BIT_ACTIVE	0x10
 #define		ATA_IMMEDIATE_MODE	0x20
 #define		ATA_HWGONE		0x40
+#define		ATA_EXPECT_INTR		0x80
 
     struct ata_device		device[2];	/* devices on this channel */
 #define		MASTER			0x00
Index: ata-lowlevel.c
===================================================================
RCS file: /data/freebsd/src/sys/dev/ata/ata-lowlevel.c,v
retrieving revision 1.40
diff -u -r1.40 ata-lowlevel.c
--- ata-lowlevel.c	24 Jul 2004 19:03:28 -0000	1.40
+++ ata-lowlevel.c	2 Aug 2004 16:11:23 -0000
_at__at_ -81,6 +81,7 _at__at_
     }
 
     /* record the request as running */
+    ch->flags &= ~ATA_EXPECT_INTR;
     ch->running = request;
 
     ATA_DEBUG_RQ(request, "transaction");
_at__at_ -140,6 +141,7 _at__at_
 	}
 	
 	/* return and wait for interrupt */
+	ch->flags |= ATA_EXPECT_INTR;
 	return ATA_OP_CONTINUES;
 
     /* ATA DMA data transfer commands */
_at__at_ -169,6 +171,7 _at__at_
 	}
 
 	/* return and wait for interrupt */
+	ch->flags |= ATA_EXPECT_INTR;
 	return ATA_OP_CONTINUES;
 
     /* ATAPI PIO commands */
_at__at_ -225,6 +228,7 _at__at_
 			   ATA_PROTO_ATAPI_12 ? 6 : 8);
 
 	/* return and wait for interrupt */
+	ch->flags |= ATA_EXPECT_INTR;
 	return ATA_OP_CONTINUES;
 
     case ATA_R_ATAPI|ATA_R_DMA:
_at__at_ -290,6 +294,7 _at__at_
 	}
 
 	/* return and wait for interrupt */
+	ch->flags |= ATA_EXPECT_INTR;
 	return ATA_OP_CONTINUES;
     }
 
_at__at_ -308,7 +313,7 _at__at_
     int length;
 
     /* ignore this interrupt if there is no running request */
-    if (!request) 
+    if (!request || !(ch->flags & ATA_EXPECT_INTR))
 	return;
 
     ATA_DEBUG_RQ(request, "interrupt");
Received on Mon Aug 02 2004 - 14:15:24 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:38:04 UTC