Re: Functional RAID controller?

From: Barrett Lyon <blyon_at_blyon.com>
Date: Tue, 8 May 2007 15:23:30 -0700
> If you have "a good idea what's wrong with the twa driver", would you
> mind sharing a stack trace or other information?  So far I have only
> been told that "system hangs when I do heavy I/O".  This is _not_
> reproducible here.
> Have you run memtest86 on the machine?  Have you run a PCI analyzer on
> your machine to see who is on the PCI bus before/during the hang?

We have tried everything, including offering to bring the crashing
machines to AMCC's offices, which are down the street.  I have not
been doing the technical debugging myself, but a few members of AMCC's
staff have been trying to help.  We've been running memtest, etc.
When a machine hangs there are no debugging options; it is completely
frozen, with no details pointing to why.  It's not clear from that
condition whether the problem is an unacknowledged interrupt or a
mutex deadlock of some sort.  We are assuming that in this case the
driver is trying to do work on the assumption that the interrupt is
valid, and is either getting stuck or returning early before the
interrupt is acknowledged, causing it to trigger over and over.
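
The early-return case can be sketched as a small simulation.  This is
only an illustration of the suspected mechanism, not code from the twa
driver: the names `handler` and `count_dispatches` and the
level-triggered interrupt line they model are all assumptions made for
the example.

```python
def handler(ack_before_early_return: bool) -> bool:
    """Hypothetical interrupt handler.

    Returns True if the (simulated) level-triggered interrupt line is
    still asserted after the handler has run.
    """
    if ack_before_early_return:
        # Acknowledge first, so the device drops the line even if the
        # handler then decides there is nothing to do.
        return False
    # Suspected failure mode: the handler bails out early without
    # acknowledging, so the line stays asserted.
    return True


def count_dispatches(ack_before_early_return: bool,
                     max_dispatches: int = 1000) -> int:
    """Count how often the handler is dispatched for one interrupt.

    max_dispatches is a safety cap for the simulation; a real machine
    has no such cap and would simply appear frozen.
    """
    line_asserted = True
    dispatches = 0
    while line_asserted and dispatches < max_dispatches:
        dispatches += 1
        line_asserted = handler(ack_before_early_return)
    return dispatches


if __name__ == "__main__":
    print("ack first:   ", count_dispatches(True))
    print("early return:", count_dispatches(False))
```

With the acknowledgement done first the handler runs once; without it
the handler is re-entered until the simulation's cap is reached, which
is the kind of storm that would leave a machine looking frozen.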

If you want to see it reproduced, we are more than happy to provide
you with two machines that both show this condition.

> You claim the hang doesn't happen on the 6.2 series twa driver,
> the driver changes between the 6.x and 7.x twa driver are _very_
> minimal, some simple time keeping changes, and some XPT_* path
> inquiry handling changes.

Under 6.x the systems as built are completely stable.

> I am really surprised that you are trying to design servers around the
> FreeBSD un-stable kernel.

There are other reasons for this which I don't want to discuss here,
but the other components we are using work very well under 7.0, and
the performance gains are large enough to make a development kernel
worth using.  The 10GbE drivers like mxge are getting a lot of
development work in HEAD, and as a result 6.x is being left behind
for some of the work we are doing.  At the very least, I want to make
sure I deploy hardware that will function beyond 6.x.


-Barrett
Received on Tue May 08 2007 - 20:24:10 UTC
