Re: aac(4) resource FIB starvation on BUS scan revisited

From: Alexander Sack <pisymbol_at_gmail.com>
Date: Mon, 7 Dec 2009 17:30:11 -0500
On Mon, Dec 7, 2009 at 4:42 PM, Alexander Sack <pisymbol_at_gmail.com> wrote:
>
> Folks:
>
> I posted a similar thread on freebsd-scsi, only to realize that scottl had already fixed my first issue, a race during resource allocation, as part of some MP CAM cleanup in a later version of the driver we are using (I believe we did the same thing to resolve a lock issue on bootup).
>
> However, on my RELENG_8 box with two Adaptec 5085s connected to some JBODs (9TB each), I still have a FIB starvation issue during the LUN scan.
>
> The number of FIBs allocated to this card is 512 (older cards get 256).  The max_target per bus is 287.  On a six-channel controller with the bus scan done in parallel, I see a lot of this:
>
> ...
> (probe501:aacp1:0:214:0): Request Requeued
> (probe501:aacp1:0:214:0): Retrying Command
> (probe520:aacp1:0:233:0): Request Requeued
> (probe520:aacp1:0:233:0): Retrying Command
> (probe528:aacp1:0:241:0): Request Requeued
> (probe528:aacp1:0:241:0): Retrying Command
> (probe540:aacp1:0:253:0): Request Requeued
> (probe540:aacp1:0:253:0): Retrying Command
> (probe541:aacp1:0:254:0): Request Requeued
> (probe541:aacp1:0:254:0): Retrying Command
> ....
>
> I think the driver is much happier with the following attached patch (with dmesg).

Patch again, but this time not base64-encoded:

Index: aac.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/aac/aac.c,v
retrieving revision 1.143.2.4
diff -u -r1.143.2.4 aac.c
--- aac.c	5 Nov 2009 18:34:01 -0000	1.143.2.4
+++ aac.c	7 Dec 2009 21:23:43 -0000
@@ -604,7 +604,7 @@
 	TAILQ_INIT(&sc->aac_fibmap_tqh);
 	sc->aac_commands = malloc(sc->aac_max_fibs * sizeof(struct aac_command),
 				  M_AACBUF, M_WAITOK|M_ZERO);
-	while (sc->total_fibs < AAC_PREALLOCATE_FIBS) {
+	while (sc->total_fibs < sc->aac_max_fibs) {
 		if (aac_alloc_commands(sc) != 0)
 			break;
 	}
Index: aac_cam.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/aac/aac_cam.c,v
retrieving revision 1.31.2.2
diff -u -r1.31.2.2 aac_cam.c
--- aac_cam.c	5 Nov 2009 18:34:01 -0000	1.31.2.2
+++ aac_cam.c	7 Dec 2009 21:23:43 -0000
@@ -261,7 +261,7 @@
 		cpi->target_sprt = 0;

 		/* Resetting via the passthrough causes problems. */
-		cpi->hba_misc = PIM_NOBUSRESET;
+		cpi->hba_misc = PIM_NOBUSRESET | PIM_SEQSCAN;
 		cpi->hba_eng_cnt = 0;
 		cpi->max_target = camsc->inf->TargetsPerBus;
 		cpi->max_lun = 8;	/* Per the controller spec */
Index: aacvar.h
===================================================================
RCS file: /home/ncvs/src/sys/dev/aac/aacvar.h,v
retrieving revision 1.52.2.2
diff -u -r1.52.2.2 aacvar.h
--- aacvar.h	2 Nov 2009 16:54:23 -0000	1.52.2.2
+++ aacvar.h	7 Dec 2009 21:23:44 -0000
@@ -57,13 +57,6 @@
 #define AAC_ADAPTER_FIBS	8

 /*
- * FIBs are allocated in page-size chunks and can grow up to the 512
- * limit imposed by the hardware.
- */
-#define AAC_PREALLOCATE_FIBS	128
-#define AAC_NUM_MGT_FIB		8
-
-/*
  * The controller reports status events in AIFs.  We hang on to a number of
  * these in order to pass them out to user-space management tools.
  */


And dmesg:

Copyright (c) 1992-2009 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
	The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.0-STABLE #2: Sun Dec  6 21:19:10 EST 2009
    root@watchmen.localdomain:/usr/home/asack/Development/freebsd/RELENG_8/src/sys/amd64/compile/GENERIC-DDB amd64
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(R) CPU           E5410  @ 2.33GHz (2327.52-MHz K8-class CPU)
  Origin = "GenuineIntel"  Id = 0x1067a  Stepping = 10
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x40ce3bd<SSE3,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,DCA,SSE4.1,XSAVE>
  AMD Features=0x20100800<SYSCALL,NX,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant
real memory  = 17179869184 (16384 MB)
avail memory = 16526032896 (15760 MB)
ACPI APIC Table: <INTEL  S5000PAL>
FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
FreeBSD/SMP: 1 package(s) x 8 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
 cpu2 (AP): APIC ID:  2
 cpu3 (AP): APIC ID:  3
 cpu4 (AP): APIC ID:  4
 cpu5 (AP): APIC ID:  5
 cpu6 (AP): APIC ID:  6
 cpu7 (AP): APIC ID:  7
ioapic0 <Version 2.0> irqs 0-23 on motherboard
ioapic1 <Version 2.0> irqs 24-47 on motherboard
lapic0: Forcing LINT1 to edge trigger
kbd1 at kbdmux0
acpi0: <INTEL S5000PAL> on motherboard
acpi0: [ITHREAD]
ACPI Error: Package List length (6) larger than NumElements count (2), truncated 20090521 dsobject-590
ACPI Error: Package List length (6) larger than NumElements count (2), truncated 20090521 dsobject-590
ACPI Error: Package List length (6) larger than NumElements count (2), truncated 20090521 dsobject-590
ACPI Error: Package List length (6) larger than NumElements count (2), truncated 20090521 dsobject-590
ACPI Error: Package List length (6) larger than NumElements count (2), truncated 20090521 dsobject-590
ACPI Error: Package List length (6) larger than NumElements count (2), truncated 20090521 dsobject-590
ACPI Error: Package List length (6) larger than NumElements count (2), truncated 20090521 dsobject-590
ACPI Error: Package List length (6) larger than NumElements count (2), truncated 20090521 dsobject-590
acpi0: Power Button (fixed)
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 900
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xca2,0xca3,0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
pcib1: <ACPI PCI-PCI bridge> at device 2.0 on pci0
pci1: <ACPI PCI bus> on pcib1
pcib2: <ACPI PCI-PCI bridge> irq 16 at device 0.0 on pci1
pci2: <ACPI PCI bus> on pcib2
pcib3: <ACPI PCI-PCI bridge> irq 16 at device 0.0 on pci2
pci3: <ACPI PCI bus> on pcib3
pcib4: <ACPI PCI-PCI bridge> at device 0.0 on pci3
pci4: <ACPI PCI bus> on pcib4
mfi0: <LSI MegaSAS 1064R> mem 0xb9000000-0xb900ffff,0xb8900000-0xb891ffff irq 18 at device 14.0 on pci4
mfi0: Megaraid SAS driver Ver 3.00
mfi0: 1804 (313511129s/0x0020/info) - Shutdown command received from host
mfi0: 1805 (boot + 0s/0x0020/info) - Firmware initialization started (PCI ID 0411/1000/3501/8086)
mfi0: 1806 (boot + 0s/0x0020/info) - Firmware version 1.12.230-0598
mfi0: 1807 (boot + 0s/0x0020/info) - Firmware initialization started (PCI ID 0411/1000/3501/8086)
mfi0: 1808 (boot + 0s/0x0020/info) - Firmware version 1.12.230-0598
mfi0: 1809 (boot + 71s/0x0008/info) - Battery temperature is normal
mfi0: 1810 (boot + 71s/0x0008/info) - Battery Present
mfi0: 1811 (boot + 71s/0x0020/info) - Board Revision
mfi0: 1812 (boot + 100s/0x0004/info) - Enclosure (SES) discovered on PD 0c(c None/p1)
mfi0: 1813 (boot + 100s/0x0002/info) - Inserted: Encl PD 0c
mfi0: 1814 (boot + 100s/0x0002/info) - Inserted: PD 0c(c None/p1) Info: enclPd=0c, scsiType=d, portMap=09, sasAddr=500150796b8c0000,0000000000000000
mfi0: 1815 (boot + 100s/0x0002/info) - Inserted: PD 0a(e0x0c/s0)
mfi0: 1816 (boot + 100s/0x0002/info) - Inserted: PD 0a(e0x0c/s0) Info: enclPd=0c, scsiType=0, portMap=00, sasAddr=71903a26a4948e89,0000000000000000
mfi0: 1817 (boot + 100s/0x0002/info) - Inserted: PD 0b(e0x0c/s1)
mfi0: 1818 (boot + 100s/0x0002/info) - Inserted: PD 0b(e0x0c/s1) Info: enclPd=0c, scsiType=0, portMap=01, sasAddr=71903a27a68d958a,0000000000000000
mfi0: [ITHREAD]
pcib5: <PCI-PCI bridge> at device 0.2 on pci3
pci5: <PCI bus> on pcib5
pcib6: <ACPI PCI-PCI bridge> irq 16 at device 1.0 on pci2
pci6: <ACPI PCI bus> on pcib6
pcib7: <ACPI PCI-PCI bridge> irq 16 at device 2.0 on pci2
pci7: <ACPI PCI bus> on pcib7
em0: <Intel(R) PRO/1000 Network Connection 6.9.14> port 0x2020-0x203f mem 0xb8820000-0xb883ffff,0xb8400000-0xb87fffff irq 18 at device 0.0 on pci7
em0: Using MSI interrupt
em0: [FILTER]
em0: Ethernet address: 00:15:17:96:b8:c0
em1: <Intel(R) PRO/1000 Network Connection 6.9.14> port 0x2000-0x201f mem 0xb8800000-0xb881ffff,0xb8000000-0xb83fffff irq 19 at device 0.1 on pci7
em1: Using MSI interrupt
em1: [FILTER]
em1: Ethernet address: 00:15:17:96:b8:c1
pcib8: <ACPI PCI-PCI bridge> at device 0.3 on pci1
pci8: <ACPI PCI bus> on pcib8
pcib9: <PCI-PCI bridge> at device 3.0 on pci0
pci9: <PCI bus> on pcib9
pcib10: <ACPI PCI-PCI bridge> at device 4.0 on pci0
pci10: <ACPI PCI bus> on pcib10
aac0: <Adaptec RAID 5085> mem 0xb8e00000-0xb8ffffff irq 16 at device 0.0 on pci10
aac0: Enabling 64-bit address support
aac0: Enable Raw I/O
aac0: Enable 64-bit array
aac0: New comm. interface enabled
aac0: [ITHREAD]
aac0: Adaptec 5085, aac driver 2.0.0-1
aacp0: <SCSI Passthrough Bus> on aac0
aacp1: <SCSI Passthrough Bus> on aac0
aacp2: <SCSI Passthrough Bus> on aac0
pcib11: <ACPI PCI-PCI bridge> at device 5.0 on pci0
pci11: <ACPI PCI bus> on pcib11
aac1: <Adaptec RAID 5085> mem 0xb8c00000-0xb8dfffff irq 18 at device 0.0 on pci11
aac1: Enabling 64-bit address support
aac1: Enable Raw I/O
aac1: Enable 64-bit array
aac1: New comm. interface enabled
aac1: [ITHREAD]
aac1: Adaptec 5085, aac driver 2.0.0-1
aacp3: <SCSI Passthrough Bus> on aac1
aacp4: <SCSI Passthrough Bus> on aac1
aacp5: <SCSI Passthrough Bus> on aac1
pcib12: <ACPI PCI-PCI bridge> at device 6.0 on pci0
pci12: <ACPI PCI bus> on pcib12
pci12: <network> at device 0.0 (no driver attached)
pcib13: <ACPI PCI-PCI bridge> at device 7.0 on pci0
pci13: <ACPI PCI bus> on pcib13
pci13: <network> at device 0.0 (no driver attached)
pci0: <base peripheral> at device 8.0 (no driver attached)
pcib14: <ACPI PCI-PCI bridge> at device 30.0 on pci0
pci14: <ACPI PCI bus> on pcib14
vgapci0: <VGA-compatible display> port 0x1000-0x10ff mem 0xb0000000-0xb7ffffff,0xb9100000-0xb910ffff irq 17 at device 12.0 on pci14
isab0: <PCI-ISA bridge> at device 31.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel 63XXESB2 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x3040-0x304f irq 20 at device 31.1 on pci0
ata0: <ATA channel 0> on atapci0
ata0: [ITHREAD]
atapci1: <Intel 63XXESB2 SATA300 controller> port 0x3058-0x305f,0x3074-0x3077,0x3050-0x3057,0x3070-0x3073,0x3020-0x303f mem 0xb9400000-0xb94003ff irq 20 at device 31.2 on pci0
atapci1: [ITHREAD]
atapci1: AHCI called from vendor specific driver
atapci1: AHCI v1.10 controller with 6 3Gbps ports, PM supported
ata2: <ATA channel 0> on atapci1
ata2: [ITHREAD]
ata3: <ATA channel 1> on atapci1
ata3: [ITHREAD]
ata4: <ATA channel 2> on atapci1
ata4: [ITHREAD]
ata5: <ATA channel 3> on atapci1
ata5: [ITHREAD]
ata6: <ATA channel 4> on atapci1
ata6: [ITHREAD]
ata7: <ATA channel 5> on atapci1
ata7: [ITHREAD]
pci0: <serial bus, SMBus> at device 31.3 (no driver attached)
atrtc0: <AT realtime clock> port 0x70-0x71,0x74-0x77 irq 8 on acpi0
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart0: [FILTER]
uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
uart1: [FILTER]
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
psm0: <PS/2 Mouse> irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: [ITHREAD]
psm0: model IntelliMouse, device ID 3
cpu0: <ACPI CPU> on acpi0
est0: <Enhanced SpeedStep Frequency Control> on cpu0
p4tcc0: <CPU Frequency Thermal Control> on cpu0
cpu1: <ACPI CPU> on acpi0
est1: <Enhanced SpeedStep Frequency Control> on cpu1
p4tcc1: <CPU Frequency Thermal Control> on cpu1
cpu2: <ACPI CPU> on acpi0
est2: <Enhanced SpeedStep Frequency Control> on cpu2
p4tcc2: <CPU Frequency Thermal Control> on cpu2
cpu3: <ACPI CPU> on acpi0
est3: <Enhanced SpeedStep Frequency Control> on cpu3
p4tcc3: <CPU Frequency Thermal Control> on cpu3
cpu4: <ACPI CPU> on acpi0
est4: <Enhanced SpeedStep Frequency Control> on cpu4
p4tcc4: <CPU Frequency Thermal Control> on cpu4
cpu5: <ACPI CPU> on acpi0
est5: <Enhanced SpeedStep Frequency Control> on cpu5
p4tcc5: <CPU Frequency Thermal Control> on cpu5
cpu6: <ACPI CPU> on acpi0
est6: <Enhanced SpeedStep Frequency Control> on cpu6
p4tcc6: <CPU Frequency Thermal Control> on cpu6
cpu7: <ACPI CPU> on acpi0
est7: <Enhanced SpeedStep Frequency Control> on cpu7
p4tcc7: <CPU Frequency Thermal Control> on cpu7
orm0: <ISA Option ROMs> at iomem 0xc0000-0xc8fff,0xc9000-0xcf7ff on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
ppc0: cannot reserve I/O port range
Timecounters tick every 1.000 msec
acd0: CDROM <CD-224E-R/1.CA> at ata0-slave UDMA33
mfi0: 1819 (313511305s/0x0020/info) - Time established as 12/07/09 14:28:25; (102 seconds since power on)
mfid0: <MFI Logical Disk> on mfi0
mfid0: 238418MB (488280064 sectors) RAID volume '' is optimal
aacd0: <RAID 5> on aac1
aacd0: 9533430MB (19524464640 sectors)
aacd1: <RAID 5> on aac1
aacd1: 9533430MB (19524464640 sectors)
ses0 at aacp5 bus 0 scbus5 target 0 lun 0
ses0: <Newisys SA2120 T033> Fixed Enclosure Services SCSI-5 device
ses0: 3.300MB/s transfers
ses0: SCSI-3 SES Device
ses1 at aacp5 bus 0 scbus5 target 1 lun 0
ses1: <Newisys SA2120 T033> Fixed Enclosure Services SCSI-5 device
ses1: 3.300MB/s transfers
ses1: SCSI-3 SES Device
lapic3: Forcing LINT1 to edge trigger
SMP: AP CPU #3 Launched!
lapic1: Forcing LINT1 to edge trigger
SMP: AP CPU #1 Launched!
lapic2: Forcing LINT1 to edge trigger
SMP: AP CPU #2 Launched!
lapic4: Forcing LINT1 to edge trigger
SMP: AP CPU #4 Launched!
lapic7: Forcing LINT1 to edge trigger
SMP: AP CPU #7 Launched!
lapic5: Forcing LINT1 to edge trigger
SMP: AP CPU #5 Launched!
lapic6: Forcing LINT1 to edge trigger
SMP: AP CPU #6 Launched!
Trying to mount root from ufs:/dev/mfid0s1a
em0: link state changed to UP

etc.

Thanks!

-aps
Received on Mon Dec 07 2009 - 21:30:14 UTC
