Re: aac(4) resource FIB starvation on BUS scan revisited

From: Alexander Sack <pisymbol_at_gmail.com>
Date: Mon, 7 Dec 2009 23:00:14 -0500
On Mon, Dec 7, 2009 at 8:14 PM, Scott Long <scottl_at_samsco.org> wrote:
> On Dec 7, 2009, at 6:05 PM, Jung-uk Kim wrote:
>>
>> On Monday 07 December 2009 07:47 pm, Scott Long wrote:
>>>
>>> On Dec 7, 2009, at 5:31 PM, Jung-uk Kim wrote:
>>>>
>>>> On Monday 07 December 2009 05:30 pm, Alexander Sack wrote:
>>>>>
>>>>> On Mon, Dec 7, 2009 at 4:42 PM, Alexander Sack
>>>>> <pisymbol_at_gmail.com> wrote:
>>>>>>
>>>>>> Folks:
>>>>>>
>>>>>> I posted a similar thread on freebsd-scsi only to realize that
>>>>>> scottl had fixed my first issue during some MP CAM cleanup with
>>>>>> respect to a race during resource allocation issues on a later
>>>>>> version of the driver we are using (I believe we did the same
>>>>>> thing to resolve a lock issue on bootup).
>>>>>>
>>>>>> However on my RELENG_8 box with (2) Adaptec 5085s connected to
>>>>>> some JBODs (9TB each) I still have a FIB starvation issue
>>>>>> during the LUN scan:
>>>>>>
>>>>>> The number of FIBs allocated to this card is 512 (older cards
>>>>>> are 256).  The max_target per bus is 287.  On a six channel
>>>>>> controller with a BUS scan done in parallel I see a lot of
>>>>>> this:
>>>>>>
>>>>>> ...
>>>>>> (probe501:aacp1:0:214:0): Request Requeued
>>>>>> (probe501:aacp1:0:214:0): Retrying Command
>>>>>> (probe520:aacp1:0:233:0): Request Requeued
>>>>>> (probe520:aacp1:0:233:0): Retrying Command
>>>>>> (probe528:aacp1:0:241:0): Request Requeued
>>>>>> (probe528:aacp1:0:241:0): Retrying Command
>>>>>> (probe540:aacp1:0:253:0): Request Requeued
>>>>>> (probe540:aacp1:0:253:0): Retrying Command
>>>>>> (probe541:aacp1:0:254:0): Request Requeued
>>>>>> (probe541:aacp1:0:254:0): Retrying Command
>>>>>> ....
>>>>>>
>>>>>> I think the driver is much happier with the following attached
>>>>>> patch (with dmesg).
>>>>>
>>>>> Patch again but this time not base-64 encoded:
>>>>
>>>> [SNIP!]
>>>>
>>>> I want to be a little conservative here, i.e., pre-allocate
>>>> half of max_fibs.  Will the attached patch work for you?
>>>
>>> The FIB allocation scheme was written when it was common for
>>> machines to only have 64MB of RAM and proportionally less KVA, so
>>> 256KB or 512KB was a lot of RAM to wire down.  Those days have
>>> probably passed.
>>
>> So, what would you do if you were hypothetically rewriting it today? :-)
>>
>
> Most hardware has mechanisms for probing its command queue depth.  What I
> typically do these days is allocate a minimum number of commands so that
> this probing can be done, then do a single slab allocation based on the
> results.  AAC doesn't have this capability, but the 256/512 size is pretty
> well understood.  The page-by-page allocation of aac works, but adds extra
> bookkeeping and complication to the driver.
>

Right Scott, that is what JK and I discussed this evening.  I figured
the 128 macro was just historical cruft and your email confirms it.
So are we ALL okay with the original patch as it stands for now?  JK, I
am fine with the divide-by-2 change, but I think raising it to 256 is
really the way to go at this point!  :D
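
For reference, the numbers in this thread can be put side by side with
a rough sketch.  This is only arithmetic, not driver code: every
sketch_* name below is made up for illustration, and it assumes the 128
macro is the number of FIBs preallocated up front and that every target
on every bus is probed at the same time.

```c
#include <assert.h>

/* Hypothetical constants; the values are the ones quoted in this thread. */
#define SKETCH_MAX_FIBS_NEW     512 /* FIBs on this card */
#define SKETCH_MAX_FIBS_OLD     256 /* FIBs on older cards */
#define SKETCH_TARGETS_PER_BUS  287 /* max_target per bus */
#define SKETCH_NUM_BUSES        6   /* six channel controller */
#define SKETCH_OLD_PREALLOC     128 /* the historical macro value (assumed meaning) */

/* Probes in flight if every target on every bus is scanned in parallel. */
static unsigned
sketch_parallel_probes(unsigned buses, unsigned targets_per_bus)
{
	return (buses * targets_per_bus);
}

/* The "divide by 2" proposal: preallocate half of max_fibs. */
static unsigned
sketch_prealloc_half(unsigned max_fibs)
{
	assert(max_fibs > 0);
	return (max_fibs / 2);
}

/* Probes that cannot get a FIB right away and would have to be requeued. */
static unsigned
sketch_shortfall(unsigned probes, unsigned fibs)
{
	return (probes > fibs ? probes - fibs : 0);
}
```

Under those assumptions, half of max_fibs gives 256 on the 512-FIB
cards but only 128 on the older 256-FIB cards, which is the practical
difference between the divide-by-2 change and a flat 256.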

-aps
Received on Tue Dec 08 2009 - 03:00:16 UTC
