RE: Panics after AHCI timeouts

From: Pegasus Mc Cleaft <ken_at_mthelicon.com>
Date: Fri, 28 Oct 2011 00:19:26 +0100
>> If it's only one process, the machine (usually) doesn't hang, even 
>> when that process is copying big files back and forth for a long 
>> period of time (it's a backup process). But interleave that process 
>> with another one accessing the same disk, and poof!, almost 
>> immediately AHCI timeouts occur. Very strange... Maybe a race
>> condition of some sort after all?
>> 
>
>No, I cannot say there is any specific correlation to IO load of the
>machine, timeouts I saw happen randomly and seem almost always happen as
>system uptime crosses two weeks boundary. I am suspecting Samsung firmware
>at this point.

Now that's interesting, as I use a mixture of Samsung, WD, and Seagate. And
I do believe the Samsungs tend to do this more. I see AHCI timeouts from
time to time on my machine (10-Current AMD64), but normally only when I am
doing something like a scrub. The machine has never panicked as a result of
this; it normally just FAULTS the drive in the pool and keeps on going. At
that point, doing a camcontrol rescan all does not bring the drive back into
existence (it will normally just hang on that bus for 15-20 seconds and then
carry on without identifying a drive). I have to pull the drive, let it spin
down and then reinsert it. Once it's reinserted, the drive comes back on the
bus and I can online it again.

The weird thing is this: for me, it only ever seems to happen when I am
writing to the pool/disk. Pure reads don't seem to bother it.

I don't really know at this point if the SATA ports have gone wonky on the
motherboard, or if the processor on the HD has crashed. I almost tend to
believe it's the drive, because camcontrol stops on that port almost as if it
knows there is a link there, but can't talk to it.

Peg
Received on Thu Oct 27 2011 - 21:19:52 UTC
