Re: aac0: COMMAND 0xffffffffxxxxxxxx TIMEOUT AFTER xx SECONDS

From: Chris Hedley <cbh-freebsd-current_at_groups.chrishedley.com>
Date: Tue, 3 Oct 2006 09:02:33 +0100 (BST)

On Tue, 3 Oct 2006, Ian FREISLICH wrote:
>> I've been having a look at some reviews, but unfortunately few of them
>> make it clear whether or not the hard drives' cache is set to write back
>> or write through.  Needless to say, I'm not desperately enthusiastic about
>> combining a RAID controller with write back caching, but I suspect that a
>> lot of controllers are heavily dependent on it being enabled to attain
>> their performance: it seems that my 2410SA's rather dismal 3-6 & 30-40MB/s
>> respective RAID5 write & read speeds would increase dramatically were I to
>> use write-back, but I'm not going there...  I guess my point is that I
>> really don't want to find myself with another dog if I buy something with
>> apparently superior performance if it's completely reliant on on-disc
>> write back caching being enabled.
>
> I'm guessing this is because this controller has no read cache and
> no battery for its write cache:
>
> AAC0> conta sho cache 0
> Executing: container show cache 0
>
> Global Container Read Cache Size  : 0
> Global Container Write Cache Size : 16203776
>
> Read Cache Setting        : ENABLE
> Write Cache Setting       : ENABLE WHEN PROTECTED
> Write Cache Status        : Inactive, battery not present
>
> If you're happy using the controller's write cache without battery
> backup, you can turn it on quite easily:
>
> container set cache /unprotected 0

Thanks for the reply, Ian.

I gave your suggestion a try just to see what difference it made, but I
saw only marginal (if any) improvement in the transfer rates, which I
thought was rather strange: reads are still in the order of 30-40 MB/s
and writes still under 8 MB/s, to both RAID5 and RAID10 containers.
That's rather disappointing performance-wise.
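
For anyone wanting to compare figures, a minimal sketch of one way to time
a sequential write is below. It is not the method used for the numbers
above; the function name, the 64 MB test size and the 64 KB block size are
assumptions for illustration, and a test file that small may be absorbed by
a cache, so a larger size gives a more honest figure.

```python
import os
import sys
import tempfile
import time


def write_rate_mb_s(directory, total_mb=64, block_kb=64):
    """Time a sequential write of total_mb megabytes into directory.

    Hypothetical helper: writes zero-filled blocks to a temporary file,
    forces them to disc with fsync, and returns the average rate in MB/s.
    """
    block = b"\0" * (block_kb * 1024)
    blocks = total_mb * 1024 // block_kb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(blocks):
                f.write(block)
            f.flush()
            os.fsync(f.fileno())  # include the time taken to reach the disc
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the test file
    return total_mb / elapsed


if __name__ == "__main__":
    # Directory to test; defaults to the system temp dir if none is given.
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.gettempdir()
    print(f"{target}: {write_rate_mb_s(target):.1f} MB/s")
```

Passing a directory on the container under test (for example /mnt) as the
first argument, and raising total_mb well beyond the controller's cache
size, should give a figure comparable to the ones quoted here.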

> What do you mean by "anything more than trivial read or write accesses"?

Even just a "find /mnt -print | cpio -ov > /dev/null" can cause frequent
hiccups, with the whole system stalling at random for 30-60 seconds at a
time.  The real problems start when I run something like Hercules, which
can issue lots of small, randomish accesses: the timeout warnings (not
just about aac, but fxp and various other things) start scrolling up the
console, and the system's had it unless I'm already logged in and know
Hercules' PID so I can kill it.  I've since moved the Hercules directory
to a SCSI disc, where the problems don't occur, but apart from it now
being in the "wrong" place, I don't really like that sort of work-around,
and I still get the same problem with other applications often enough
that it's a bit of a hazard.

> When I tested these adaptors (ok it was still a simple test) I set up
> the container with a hot spare ufs+softupdate.  I then started 5
> or 6 parallel tar -xvf of the FreeBSD CVS repo.   After a few minutes
> I pulled the SATA cable from one of the drives.  The tar -xvf
> didn't even blink, the buzzer sounded, the "failed" disk was replaced
> automatically and the controller started rebuilding.  It always
> worked.  No matter how busy I tried to make the disks.

I can't complain about how well it rebuilds, I just wish it didn't take so 
long!  Mine's done several due to a bad connection to one of its discs and 
has recovered every time; the only niggle is that it takes between 6 and 
12 hours to recover the disc on a config with 4x250GB units containing two 
small RAID-10 containers and one large RAID-5.  I'm not sure if that's 
really what I should be expecting; I've heard of other controllers
rebuilding the likes of Raptors in about half an hour, so I'd've thought I might
expect perhaps no more than two or three hours, albeit a rather simplistic 
assumption.
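
As a rough check on that expectation, the average rate implied by a rebuild
time can be worked out as below. This sketch assumes the rebuild rewrites
the whole of one 250GB disc and uses decimal units (1 GB = 1000 MB); the
function name is made up for illustration.

```python
def rebuild_rate_mb_s(disc_gb, hours):
    """Average rate (MB/s) needed to rewrite one disc of disc_gb in hours.

    Assumes the whole disc is rewritten and 1 GB = 1000 MB.
    """
    return disc_gb * 1000 / (hours * 3600)


# Observed rebuild times (6 and 12 hours) and the hoped-for ones (3 and 2).
for hours in (6, 12, 3, 2):
    print(f"{hours:>2} h: {rebuild_rate_mb_s(250, hours):.1f} MB/s")

# Output:
#  6 h: 11.6 MB/s
# 12 h: 5.8 MB/s
#  3 h: 23.1 MB/s
#  2 h: 34.7 MB/s
```

On those assumptions, a two to three hour rebuild would need roughly
23-35 MB/s sustained, while the 6-12 hours seen here works out at roughly
6-12 MB/s.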

Chris.
Received on Tue Oct 03 2006 - 06:02:45 UTC