Folks:

I posted a similar thread on freebsd-scsi, only to realize that scottl had already fixed my first issue, a race during resource allocation, as part of some MP CAM cleanup in a later version of the driver than the one we are using. (I believe we did the same thing to resolve a lock issue on bootup.)

However, on my RELENG_8 box with two Adaptec 5085s connected to some JBODs (9TB each), I still have a FIB starvation issue during the LUN scan. The number of FIBs allocated to this card is 512 (older cards get 256). The max_target per bus is 287. On a six-channel controller with the bus scan done in parallel I see a lot of this:

    ...
    (probe501:aacp1:0:214:0): Request Requeued
    (probe501:aacp1:0:214:0): Retrying Command
    (probe520:aacp1:0:233:0): Request Requeued
    (probe520:aacp1:0:233:0): Retrying Command
    (probe528:aacp1:0:241:0): Request Requeued
    (probe528:aacp1:0:241:0): Retrying Command
    (probe540:aacp1:0:253:0): Request Requeued
    (probe540:aacp1:0:253:0): Retrying Command
    (probe541:aacp1:0:254:0): Request Requeued
    (probe541:aacp1:0:254:0): Retrying Command
    ....

I think the driver is much happier with the attached patch (dmesg included). The CAM probeXXX process is now much, much faster, with ZERO retries.

Is there anything bad about adding PIM_SYNCSCAN to hba_misc? What's the downside? At a minimum, it ensures you don't run out of FIBs during a scan.

The patch also bumps the number of FIBs to the maximum, since I think it's good to have that pool preallocated, and it's not that much memory on modern systems (this also helps if you have a controller that supports 512). FIBs are 2k, so that's 2 per page, and the count is either 256 or 512, i.e. a pool of at most 1MB. Perhaps that is not really necessary, but again, why not? (If I get shot down, so be it!)

Anybody? Is this PR worthy?

-aps
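For anyone who wants to check the numbers above, here is a rough back-of-the-envelope sketch in Python. It is not driver code. The FIB size, FIB counts, channel count and max_target come from the post; the 4 KiB page size and the idea that every in-flight probe holds exactly one FIB are assumptions made only for illustration, and the function names are hypothetical.

```python
FIB_SIZE = 2 * 1024   # bytes per FIB, from the post ("FIBs are 2k")
PAGE_SIZE = 4096      # assumption: 4 KiB pages, consistent with "2 per page"


def fib_pool_bytes(num_fibs):
    """Memory needed to preallocate num_fibs FIBs."""
    return num_fibs * FIB_SIZE


def fib_pool_pages(num_fibs):
    """Pages needed for the pool, rounding up to whole pages."""
    fibs_per_page = PAGE_SIZE // FIB_SIZE
    return -(-num_fibs // fibs_per_page)  # ceiling division


def worst_case_probes(channels, max_target):
    """Upper bound on simultaneous probes if every target on every
    channel is probed at once (assumption: one FIB per probe)."""
    return channels * max_target


for fibs in (256, 512):
    print(fibs, "FIBs:", fib_pool_bytes(fibs) // 1024, "KiB in",
          fib_pool_pages(fibs), "pages")

demand = worst_case_probes(channels=6, max_target=287)
print("worst-case parallel probes:", demand, "vs 512 FIBs")
```

Under those assumptions, a fully parallel scan could ask for up to 6 * 287 = 1722 FIBs at once against a pool of 512, which would be consistent with the requeue/retry messages, while the full 512-FIB pool costs only 1 MiB.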