Re: Panics after AHCI timeouts

From: Alexander Kabaev <kabaev_at_gmail.com>
Date: Tue, 25 Oct 2011 20:27:55 -0400

On Tue, 18 Oct 2011 14:28:07 +0100
"Steven Hartland" <killing_at_multiplay.co.uk> wrote:

> 
> ----- Original Message ----- 
> From: "Alexey Shuvaev" <shuvaev_at_physik.uni-wuerzburg.de>
> To: <freebsd-current_at_freebsd.org>
> Sent: Tuesday, October 18, 2011 2:13 PM
> Subject: Re: Panics after AHCI timeouts
> 
> 
> > On Tue, Oct 18, 2011 at 06:19:19AM +0800, Adrian Chadd wrote:
> > Done, kern/161768.
> > 
> > Question to the list: does anybody see successful recovery from AHCI
> > timeout on a recent CURRENT? Recent means June 2011 or newer, so 9.0
> > branch counts also. That is, there are some kernel messages like
> > this:
> > 
> > ahcich0: Timeout on slot 29 port 0
> > ahcich0: is 00000000 cs 00000000 ss ffffffff rs ffffffff tfd 40
> > serr 00000000 cmd 0000fc17
> > 
> > but then AHCI recovers and the system does not panic?
> 
> Not a recent CURRENT, but on 8.2-RELEASE we have seen recovery on
> secondary SSD drives without a panic. It does generally drop the
> disk, though, and the disk needs a power off and power on to recover
> properly, although we believe that's a firmware bug on the SSDs.
> 
>     Regards
>     Steve
> 
I do see timeouts on one of my Samsung ST3750330A disks and they
definitely do not cause any panics. The weird part in my case is that
the disk then immediately reappears as online, and the mirror zpool can
be rebuilt by just onlining the disk with the
'zpool online <pool> <disk>' command.

It seems to happen once the system has accumulated some uptime. After a
reboot, it keeps running for a week or two with no issues, but then the
timeouts start to happen more or less reliably every 24 hours.
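
One way to check how often the timeouts recur, and on which channel, is
to pull the timeout lines out of the kernel log. The following is only a
minimal sketch, assuming the message format quoted earlier in this
thread ("ahcichN: Timeout on slot S port P"). The function names and the
sample lines are hypothetical and exist only for illustration.

```python
import re
from collections import Counter

# Matches the timeout line format quoted earlier in this thread, e.g.
# "ahcich0: Timeout on slot 29 port 0". search() is used instead of
# match() so that any prefix in front of the message is tolerated.
TIMEOUT_RE = re.compile(r"(ahcich\d+): Timeout on slot (\d+) port (\d+)")


def find_timeouts(lines):
    """Return a (channel, slot, port) tuple for every timeout line."""
    hits = []
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            hits.append((m.group(1), int(m.group(2)), int(m.group(3))))
    return hits


def count_by_channel(lines):
    """Count timeout events per AHCI channel name."""
    return Counter(channel for channel, _slot, _port in find_timeouts(lines))


if __name__ == "__main__":
    # Hypothetical sample input; in practice the lines would come from
    # the saved kernel message log.
    sample = [
        "ahcich0: Timeout on slot 29 port 0",
        "ahcich0: is 00000000 cs 00000000 ss ffffffff rs ffffffff tfd 40",
        "ahcich2: Timeout on slot 3 port 0",
    ]
    for channel, count in sorted(count_by_channel(sample).items()):
        print(channel, count)
```

Only the "Timeout on slot" line is matched; the register dump that
follows it is ignored, so each timeout event is counted once.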
 
-- 
Alexander Kabaev

Received on Tue Oct 25 2011 - 22:58:14 UTC
