RE: HPT372 bug summary [was: RE: escalation stage 2]

From: Harald Schmalzbauer <h_at_schmalzbauer.de>
Date: Fri, 18 Jul 2003 07:20:33 +0200
Harald Schmalzbauer wrote:
> Ok, like I thought, the disk was not defective. There seems to be a bug
> in ata regarding the HPT372.
>
> First: With BIOS version 2.342 the secondary master disk ID is incorrectly
> detected (something like "X X X X X X X X X X X X X X X" instead of
> "IC25N030ATCS04-0").

Please forget that. It happened because, for convenience, I had connected the
80-conductor ATA cables the wrong way round, so the black connector was at the
controller and the blue one at the drive.
I can't imagine that this makes any technical difference (as long as no
slave drive is connected and there's no open end).
But it seems the individual connectors are electrically coded (again, I can't
imagine how?!?).

I tested the following BIOS versions, all with the same result: the
machine panics if one drive fails, and there is no way to rebuild the
failed array (under FreeBSD):
2.34 (original Dawicontrol)
2.341 (372N2341.p5e from Highpoint)
2.343 (3XXV2343.p4e from Highpoint)
2.2 (from Highpoint)

The rest can be considered confirmed.

>
> I downgraded the BIOS to 2.2.
>
> Now I did the following test:
> 1. Created a RAID1 with the controller's BIOS (two Hitachi 2.5" notebook
> drives).
> 2. Installed DOS.
> 3. While DOS was running, I unplugged the (5V only) power supply from one
> disk.
> 4. After powering off, I reconnected the power supply to the disk.
> 5. After switching on, the controller's BIOS told me that the array has to
> be rebuilt.
>
> So far it seems hardware is fine and working as designed.
>
> Now I installed FreeBSD 5.1 on the controller-generated RAID1 ar0 (its name
> in the BIOS reads "RAID1_1"; I don't know what exactly this name
> reflects).
> When I unplug one drive the same way as before (or even do an "atacontrol
> detach 3", the secondary channel of the controller), FreeBSD warns me that
> ar0 is degraded. In the "atacontrol list" output the disk on channel 3 (ad6)
> has vanished.
> After some time, the machine panics with the dump I already supplied
> further down in this message (at least last time I didn't really unplug the
> power, but instead issued an "atacontrol detach 3").
>
> Now when the machine is powered up again after the disk connections have
> been corrected, the BIOS doesn't let me rebuild the array, but gives me the
> option to select a replacement disk and rebuild. But this doesn't work; the
> error is that there are not enough spare disks. In the status screen I can
> see the array named "RAID1_1", which was created via the controller's BIOS.
> When I choose "continue to boot" I can see another array named "FreeBSD",
> which I never created.
> When I continue booting again, the kernel boots and then the machine panics.
> I have to delete the array.
> After deleting the mirror, FreeBSD boots correctly with a degraded ar0, but
> I have no chance to rebuild the array. "atacontrol addspare ar0 ad4" gives
> an error like (I can't remember exactly) "sioctl (ATASPAREADD) not
> configured".
> Also, no detach/reinit/attach helps.
>
> I also think the RAID configuration is stored on the disks, since when I
> create a non-DOS-compatible slice (starting at 0, not 63) the RAID
> configuration vanishes.
>
> Now I assume that there are two different RAID configurations, one stored
> on disk by the controller's BIOS and another one which FreeBSD stores
> elsewhere (e.g. with the sil0680 I can create slices starting at 0 without
> problems).
> When one drive fails, both configurations are marked degraded, but in
> different ways (because there is one array named "RAID1_1" and a second
> one named "FreeBSD").
> And that's why FreeBSD panics until I delete the mirror relationship.
>
> This has nothing to do with the initial crash caused by "sysinstall" or
> "sysctl -a", but it is also ugly since the controller doesn't do its job
> correctly under FreeBSD.
>
> So I hope Soren can have a look at it or at least correct me if I'm wrong.
>
> Since this is my most important server, I can't help you for the next few
> weeks. On Sunday I'll buy a SIL0680-based controller because I did the same
> test with it and it's working.
> I'm currently setting up FreeBSD and building a kernel with DDB.
>
> Please let me know what I can do; I'm no programmer. I only know that
> something like a backtrace is usually useful. But I don't know what a
> backtrace is, so if you need information from me please tell me exactly
> what to do.
>
> Best regards,
>
> -Harry
>
> >
> >
> > Now after resetting the machine, which was hung by "sysinstall", it claims
> > that ad4 (one of two mirrored 30GB 2.5" disks) is absent (see dmesg
> > below).
> > The controller warns me that one drive is bad (which in fact it
> > definitely is not) and allows me to select "continue boot".
> > That's what I do, and after kernel probing the machine reboots with the
> > following error (well, it takes some time to type it off my monochrome
> > screen):
> >
> > Fatal trap 12: page fault while in kernel mode
> > fault virtual address   = 0x10
> > fault code              = supervisor read, page not present
> > instruction pointer     = 0x8:0xc014a0a6
> > stack pointer           = 0x10:0xcce65bd8
> > frame pointer           = 0x10:0xcce65c58
> > code segment            = base 0x0, limit 0xfffff, type 0x1b
> >                         = DPL 0, pres 1, def32 1, gran 1
> > processor eflags        = interrupt enabled, resume, IOPL = 0
> > current process         = 4 (g_down)
> > trap number             = 12
> > panic: page fault
> >
> > Then it reboots!
> >
> > Now please give me a hint what to do. This is my brand new fileserver,
> > which holds all the important data collected over the last decade, and
> > since it's brand new I haven't set up any backup yet.
> > When testing the hardware (unplugging one drive while the machine was
> > running) I had the same error, but I thought that would never happen under
> > normal circumstances.
> >
> > If sysinstall breaks a RAID1 server, 5.1-RELEASE should be immediately
> > replaced by a corrected version!!!!!
> >
> > (The controller is a Dawicontrol DC-100 with HPT372 chipset and 2.343
> > BIOS; the original 2.34 BIOS didn't work at all with FreeBSD (while it did
> > with Windows 98).)
> > The machine is the ######## VIA Fileserver ######## listed in the dmesg
> > below.
> >
> > Best regards,
> >
> > -Harry
> >
> > P.S.: Now the following messages show not only ad6 but also ad4 (and that
> > has always been the reason for the panic during my tests!). Note the four
> > ad4 lines and only two ad6 lines:
> > Opened disk ad4 -> 1
> > Opened disk ad4 -> 1
> > Opened disk ad4 -> 1
> > Opened disk ad4 -> 1
> > Opened disk ad6 -> 1
> > Opened disk ad6 -> 1
> >
> > > Dear all,
> > >
> > > I have been experimenting with 5.1-REL for some weeks, and during that
> > > time I had some mysterious hangs which I didn't take seriously because I
> > > had modified /usr/src/sys/cam/scsi/scsi_da.c to support my CF card
> > > reader.
> > > But now I saw exactly the same problem on my brand new fileserver (whose
> > > hardware is extremely different).
> > >
> > > The machine freezes for about one minute and then reboots itself without
> > > any error message.
> > > It happens when I run "/stand/sysinstall" or "sysctl -a".
> > >
> > > This is VERY ugly because when my fileserver died, my workstation died as
> > > well (home was NFS-mounted).
> > >
> > > I'm no developer, but if someone tells me what to do I'll help solve that
> > > BUG.
> > >
> > > Here is some info about my two machines (which are both running
> > > 5.1-RELEASE and showed the same bug):
> > >
> > > ######## VIA Fileserver ########
> > >
> > > FreeBSD 5.1-RELEASE #2: Fri Jul  4 14:02:06 CEST 2003
> > >     root_at_tek.flintsbach.schmalzbauer.de:/usr/obj/usr/src/sys/EPIA
> > > Preloaded elf kernel "/boot/kernel/kernel" at 0xc04c0000.
> > > Preloaded elf module "/boot/kernel/acpi.ko" at 0xc04c01f4.
> > > Timecounter "i8254"  frequency 1192944 Hz
> > > Timecounter "TSC"  frequency 800032401 Hz
> > > CPU: VIA C3 Samuel 2 (800.03-MHz 686-class CPU)
> > >   Origin = "CentaurHauls"  Id = 0x67a  Stepping = 10
> > >   Features=0x803035<FPU,DE,TSC,MSR,MTRR,PGE,MMX>
> > > real memory  = 266272768 (253 MB)
> > > avail memory = 253374464 (241 MB)
> > > VESA: v2.0, 2048k memory, flags:0x0, mode table:0xc00c8ac8 (c0008ac8)
> > > VESA: Copyright 1998 TRIDENT MICROSYSTEMS INC.
> > > npx0: <math processor> on motherboard
> > > npx0: INT 16 interface
> > > acpi0: <VIA601 AWRDACPI> on motherboard
> > > pcibios: BIOS version 2.10
> > > Using $PIR table, 5 entries at 0xc00fdc70
> > > acpi0: power button is handled as a fixed feature programming model.
> > > Timecounter "ACPI-safe"  frequency 3579545 Hz
> > > acpi_timer0: <24-bit timer at 3.579545MHz> port 0x4008-0x400b on acpi0
> > > acpi_cpu0: <CPU> port 0x530-0x537 on acpi0
> > > acpi_tz0: <thermal zone> port 0x530-0x537 on acpi0
> > > acpi_button0: <Power Button> on acpi0
> > > pcib0: <ACPI Host-PCI bridge> port 0x6000-0x607f,0x5000-0x500f,0x4080-0x40ff,0x4000-0x407f,0xcf8-0xcff on acpi0
> > > pci0: <ACPI PCI bus> on pcib0
> > > agp0: <VIA Generic host to PCI bridge> mem 0xd0000000-0xd3ffffff at device 0.0 on pci0
> > > pcib1: <PCI-PCI bridge> at device 1.0 on pci0
> > > pci1: <PCI bus> on pcib1
> > > pci1: <display, VGA> at device 0.0 (no driver attached)
> > > isab0: <PCI-ISA bridge> at device 17.0 on pci0
> > > isa0: <ISA bus> on isab0
> > > atapci0: <VIA 8231 UDMA100 controller> port 0xc000-0xc00f at device 17.1 on pci0
> > > ata0: at 0x1f0 irq 14 on atapci0
> > > ata1: at 0x170 irq 15 on atapci0
> > > uhci0: <VIA 83C572 USB controller> port 0xc400-0xc41f irq 5 at device 17.2 on pci0
> > > usb0: <VIA 83C572 USB controller> on uhci0
> > > usb0: USB revision 1.0
> > > uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> > > uhub0: 2 ports with 2 removable, self powered
> > > uhci1: <VIA 83C572 USB controller> port 0xc800-0xc81f irq 5 at device 17.3 on pci0
> > > usb1: <VIA 83C572 USB controller> on uhci1
> > > usb1: USB revision 1.0
> > > uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> > > uhub1: 2 ports with 2 removable, self powered
> > > pci0: <bridge, PCI-unknown> at device 17.4 (no driver attached)
> > > pcm0: <VIA VT82C686A> port 0xd400-0xd403,0xd000-0xd003,0xcc00-0xccff irq 12 at device 17.5 on pci0
> > > pcm0: <VIA Technologies VIA1612A AC97 Codec>
> > > vr0: <VIA VT6102 Rhine II 10/100BaseTX> port 0xd800-0xd8ff mem 0xd8000000-0xd80000ff irq 10 at device 18.0 on pci0
> > > vr0: Ethernet address: 00:40:63:c2:9d:af
> > > miibus0: <MII bus> on vr0
> > > ukphy0: <Generic IEEE 802.3u media interface> on miibus0
> > > ukphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> > > atapci1: <HighPoint HPT372 UDMA133 controller> port 0xec00-0xecff,0xe800-0xe803,0xe400-0xe407,0xe000-0xe003,0xdc00-0xdc07 irq 11 at device 20.0 on pci0
> > > ata2: at 0xdc00 on atapci1
> > > ata3: at 0xe400 on atapci1
> > > sio0 port 0x3f8-0x3ff irq 4 on acpi0
> > > sio0: type 16550A
> > > orm0: <Option ROMs> at iomem 0xd0000-0xd9fff,0xcc000-0xcffff,0xc0000-0xcbfff on isa0
> > > pmtimer0 on isa0
> > > atkbdc0: <Keyboard controller (i8042)> at port 0x64,0x60 on isa0
> > > sc0: <System console> at flags 0x100 on isa0
> > > sc0: VGA <12 virtual consoles, flags=0x300>
> > > sio1: configured irq 3 not in bitmap of probed irqs 0
> > > sio1: port may not be enabled
> > > vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> > > Timecounters tick every 1.000 msec
> > > acpi_cpu: throttling enabled, 2 steps (100% to 50.0%), currently 100.0%
> > > ad6: 28615MB <IC25N030ATCS04-0> [58140/16/63] at ata3-master UDMA100
> > > acd0: MODE_SENSE_BIG trying to write on read buffer
> > > acd0: MODE_SENSE_BIG - NO SENSE asc=0x00 ascq=0x00 error=0x04
> > > acd0: CDROM <CD-224E> at ata1-slave PIO4
> > > ar0: WARNING - mirror lost
> > > ar0: 28615MB <ATA RAID1 array> [3647/255/63] status: DEGRADED subdisks:
> > >  disk0 READY on ad6 at ata3-master
> > > Opened disk ad6 -> 1
> > > Opened disk ad6 -> 1
> > > Opened disk ad6 -> 1
> > > Opened disk ad6 -> 1
> > > Mounting root from ufs:/dev/ar0s1a
> > > WARNING: / was not properly dismounted
> > > WARNING: /tmp was not properly dismounted
> > > WARNING: /usr was not properly dismounted
> > > WARNING: /var was not properly dismounted
> > >
> > >
> > > ######### Intel Workstation ##################
> > >
> > > FreeBSD 5.1-RELEASE #2: Tue Jul  8 01:08:36 CEST 2003
> > >     harry_at_cale.flintsbach.schmalzbauer.de:/usr/obj/usr/src/sys/CALE
> > > Preloaded elf kernel "/boot/kernel/kernel" at 0xc04c7000.
> > > Preloaded elf module "/boot/kernel/acpi.ko" at 0xc04c721c.
> > > Timecounter "i8254"  frequency 1193267 Hz
> > > Timecounter "TSC"  frequency 737075599 Hz
> > > CPU: Intel Pentium III (737.08-MHz 686-class CPU)
> > >   Origin = "GenuineIntel"  Id = 0x683  Stepping = 3
> > >   Features=0x383f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
> > > real memory  = 267300864 (254 MB)
> > > avail memory = 254337024 (242 MB)
> > > Pentium Pro MTRR support enabled
> > > VESA: v3.0, 1024k memory, flags:0x1, mode table:0xc03f8922 (1000022)
> > > VESA: Intel(R) 810, Intel(R) 815 Chipset Video BIOS
> > > npx0: <math processor> on motherboard
> > > npx0: INT 16 interface
> > > acpi0: <ASUS   CUSL2   > on motherboard
> > > pcibios: BIOS version 2.10
> > > Using $PIR table, 10 entries at 0xc00f1360
> > > acpi0: power button is handled as a fixed feature programming model.
> > > Timecounter "ACPI-safe"  frequency 3579545 Hz
> > > acpi_timer0: <24-bit timer at 3.579545MHz> port 0xe408-0xe40b on acpi0
> > > acpi_cpu0: <CPU> port 0x530-0x537 on acpi0
> > > acpi_button0: <Power Button> on acpi0
> > > pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> > > pci0: <ACPI PCI bus> on pcib0
> > > agp0: <Intel 82815 (i815 GMCH) SVGA controller> mem 0xf7000000-0xf707ffff,0xf8000000-0xfbffffff irq 11 at device 2.0 on pci0
> > > pcib1: <ACPI PCI-PCI bridge> at device 30.0 on pci0
> > > pci1: <ACPI PCI bus> on pcib1
> > > fxp0: <Intel 82557/8/9 EtherExpress Pro/100(B) Ethernet> port 0xd800-0xd83f mem 0xf6000000-0xf601ffff,0xf6800000-0xf6800fff irq 9 at device 10.0 on pci1
> > > fxp0: Ethernet address 00:02:b3:89:e5:55
> > > miibus0: <MII bus> on fxp0
> > > inphy0: <i82555 10/100 media interface> on miibus0
> > > inphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
> > > pcm0: <Creative CT5880-C> port 0xd400-0xd43f at device 13.0 on pci1
> > > pcm0: <SigmaTel STAC9708/11 AC97 Codec>
> > > pcib1: slot 13 INTA is routed to irq 10
> > > isab0: <PCI-ISA bridge> at device 31.0 on pci0
> > > isa0: <ISA bus> on isab0
> > > atapci0: <Intel ICH2 UDMA100 controller> port 0xb800-0xb80f at device 31.1 on pci0
> > > ata0: at 0x1f0 irq 14 on atapci0
> > > ata1: at 0x170 irq 15 on atapci0
> > > uhci0: <Intel 82801BA/BAM (ICH2) USB controller USB-A> port 0xb400-0xb41f irq 7 at device 31.2 on pci0
> > > usb0: <Intel 82801BA/BAM (ICH2) USB controller USB-A> on uhci0
> > > usb0: USB revision 1.0
> > > uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> > > uhub0: 2 ports with 2 removable, self powered
> > > ums0: Microsoft Microsoft IntelliMouse® Explorer, rev 1.10/1.14, addr 2, iclass 3/1
> > > ums0: 5 buttons and Z dir.
> > > ichsmb0: <Intel 82801BA (ICH2) SMBus controller> port 0xe800-0xe80f irq 10 at device 31.3 on pci0
> > > smbus0: <System Management Bus> on ichsmb0
> > > smb0: <SMBus generic I/O> on smbus0
> > > uhci1: <Intel 82801BA/BAM (ICH2) USB controller USB-B> port 0xb000-0xb01f irq 9 at device 31.4 on pci0
> > > usb1: <Intel 82801BA/BAM (ICH2) USB controller USB-B> on uhci1
> > > usb1: USB revision 1.0
> > > uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
> > > uhub1: 2 ports with 2 removable, self powered
> > > uhub2: ALCOR Generic USB Hub, class 9/0, rev 1.10/1.00, addr 2
> > > uhub2: 4 ports with 4 removable, self powered
> > > sio0 port 0x3f8-0x3ff irq 4 on acpi0
> > > sio0: type 16550A
> > > sio1 port 0x2f8-0x2ff irq 3 on acpi0
> > > sio1: type 16550A
> > > atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
> > > atkbd0: <AT Keyboard> flags 0x1 irq 1 on atkbdc0
> > > orm0: <Option ROMs> at iomem 0xd0000-0xd17ff,0xcc000-0xcffff,0xc0000-0xcbfff on isa0
> > > pmtimer0 on isa0
> > > sc0: <System console> at flags 0x100 on isa0
> > > sc0: VGA <12 virtual consoles, flags=0x300>
> > > vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> > > Timecounters tick every 1.000 msec
> > > acpi_cpu: throttling enabled, 8 steps (100% to 12.5%), currently 100.0%
> > > ad0: 39083MB <Maxtor 4D040H2> [79408/16/63] at ata0-master UDMA100
> > > acd0: CDROM <SONY CDU4811> at ata1-master PIO4
> > > Mounting root from ufs:/dev/ad0s1a
> > >
> > > _______________________________________________
> > > freebsd-current_at_freebsd.org mailing list
> > > http://lists.freebsd.org/mailman/listinfo/freebsd-current
> > > To unsubscribe, send any mail to "freebsd-current-unsubscribe_at_freebsd.org"
> > >
> >
> >
>
Received on Thu Jul 17 2003 - 20:20:56 UTC
