Re: ataraid + geom_stripe problems

From: Søren Schmidt <sos_at_DeepCore.dk>
Date: Mon, 30 Aug 2004 17:57:33 +0200

Daniel Eriksson wrote:
> A few days ago I decided to try to switch from gvinum in RAID-0 mode to
> geom_stripe on one of my arrays (4 x 36GB SCSI). Unfortunately I never
> managed to get it to work since the machine protested loudly and crashed all
> of my ataraid arrays whenever geom_stripe tried to start up its array. This
> was on a 6-CURRENT system compiled from sources dated
> 2004.08.26.(something).
> 
> Has anyone tried to use both ataraid and geom_stripe on the same machine?
> 
> I also use gvinum on this machine, but it is not loaded during boot so it
> should not affect this.
> 
> Attached is a dmesg from the machine (but with a slightly newer kernel, no
> other changes were made though other than to remove geom_stripe). It should
> provide info on what hardware is used.
> 
> Here's how it looked on the console when I tried it 3 days ago. The ataraid
> discs all hang off of two HighPoint RocketRAID 454 cards. Once all the
> ataraid arrays had been crashed I could delete and re-create them without
> any problems. I didn't dare to try to access them however (live data on the
> filesystems).
> 
> ar0: 476950MB <ATA RAID0 array> [60802/255/63] status: READY subdisks:
>  disk0 READY on ad4 at ata2-master
>  disk1 READY on ad5 at ata2-slave
> ar1: 478744MB <ATA RAID0 array> [61031/255/63] status: READY subdisks:
>  disk0 READY on ad6 at ata3-master
>  disk1 READY on ad7 at ata3-slave
> ar2: 388962MB <ATA RAID0 array> [49585/255/63] status: READY subdisks:
>  disk0 READY on ad9 at ata4-slave
>  disk1 READY on ad8 at ata4-master
> ar3: 228946MB <ATA RAID0 array> [29186/255/63] status: READY subdisks:
>  disk0 READY on ad23 at ata11-slave
>  disk1 READY on ad24 at ata12-master
> Waiting 5 seconds for SCSI devices to settle
> sa0 at ahc0 bus 0 target 5 lun 0
> sa0: <Seagate STT20000N 6A51> Removable Sequential Access SCSI-2 device 
> sa0: 10.000MB/s transfers (10.000MHz, offset 15)
> da0 at ahc0 bus 0 target 10 lun 0
> da0: <IBM DDYS-T36950N S93E> Fixed Direct Access SCSI-3 device 
> da0: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
> da0: 35003MB (71687340 512 byte sectors: 255H 63S/T 4462C)
> da1 at ahc0 bus 0 target 11 lun 0
> da1: <IBM IC35L036UWD210-0 S5CQ> Fixed Direct Access SCSI-3 device 
> da1: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
> da1: 35003MB (71687340 512 byte sectors: 255H 63S/T 4462C)
> da2 at ahc0 bus 0 target 12 lun 0
> da2: <IBM DDYS-T36950N S93E> Fixed Direct Access SCSI-3 device 
> da2: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
> da2: 35003MB (71687340 512 byte sectors: 255H 63S/T 4462C)
> da3 at ahc0 bus 0 target 13 lun 0
> da3: <IBM DDYS-T36950N S93E> Fixed Direct Access SCSI-3 device 
> da3: 160.000MB/s transfers (80.000MHz, offset 63, 16bit), Tagged Queueing Enabled
> da3: 35003MB (71687340 512 byte sectors: 255H 63S/T 4462C)
> Mounting root from ufs:/dev/ad0s1a
> Enter full pathname of shell or RETURN for /bin/sh: 
> # kldload geom_stripe
> # GEOM_STRIPE: Device testraid created (id=3167252550).
> GEOM_STRIPE: Disk da3 attached to racingraid.
> GEOM_STRIPE: Disk da2 attached to racingraid.
> GEOM_STRIPE: Disk da1 attached to racingraid.
> GEOM_STRIPE: Disk da0 attached to racingraid.
> GEOM_STRIPE: Device testraid activated.
> Interrupt storm detected on "irq16: atapci0+++"; throttling interrupt source

Hmm, looks like you're triggering the interrupt throttling code. That will
lead to catastrophic failure, as it tosses out interrupts, causing this:

> ad24: TIMEOUT - READ_DMA retrying (2 retries left) LBA=234441657
> ad24: TIMEOUT - READ_DMA retrying (1 retry left) LBA=234441657
> ad24: FAILURE - READ_DMA timed out
> ar3: ERROR - array broken
> ad8: TIMEOUT - READ_DMA retrying (2 retries left) LBA=398297097
> ad8: TIMEOUT - READ_DMA retrying (1 retry left) LBA=398297097
> ad8: FAILURE - READ_DMA timed out
> ar2: ERROR - array broken
> ad7: TIMEOUT - READ_DMA retrying (2 retries left) LBA=490234761
> ad7: TIMEOUT - READ_DMA retrying (1 retry left) LBA=490234761
> ad7: FAILURE - READ_DMA timed out
> ar1: ERROR - array broken
> ad5: TIMEOUT - READ_DMA retrying (2 retries left) LBA=488397177
> ad5: TIMEOUT - READ_DMA retrying (1 retry left) LBA=488397177
> ad5: FAILURE - READ_DMA timed out
> ar0: ERROR - array broken
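
To make the pattern in the log above easier to see, here is a minimal Python
sketch (an illustration only, not part of the ATA driver or any FreeBSD tool;
the helper name broken_arrays is hypothetical) that pairs each disk whose
READ_DMA finally timed out with the array reported broken right after it. It
assumes the console lines appear in the order quoted above.

```python
import re

# Console lines copied from the quoted log above (quote markers removed).
LOG = """\
ad24: TIMEOUT - READ_DMA retrying (2 retries left) LBA=234441657
ad24: TIMEOUT - READ_DMA retrying (1 retry left) LBA=234441657
ad24: FAILURE - READ_DMA timed out
ar3: ERROR - array broken
ad8: TIMEOUT - READ_DMA retrying (2 retries left) LBA=398297097
ad8: TIMEOUT - READ_DMA retrying (1 retry left) LBA=398297097
ad8: FAILURE - READ_DMA timed out
ar2: ERROR - array broken
ad7: TIMEOUT - READ_DMA retrying (2 retries left) LBA=490234761
ad7: TIMEOUT - READ_DMA retrying (1 retry left) LBA=490234761
ad7: FAILURE - READ_DMA timed out
ar1: ERROR - array broken
ad5: TIMEOUT - READ_DMA retrying (2 retries left) LBA=488397177
ad5: TIMEOUT - READ_DMA retrying (1 retry left) LBA=488397177
ad5: FAILURE - READ_DMA timed out
ar0: ERROR - array broken
"""

FAILURE = re.compile(r"^(ad\d+): FAILURE - READ_DMA timed out")
BROKEN = re.compile(r"^(ar\d+): ERROR - array broken")


def broken_arrays(log):
    """Hypothetical helper: map each broken array to the disk that failed just before it."""
    result = {}
    last_failed = None
    for raw in log.splitlines():
        # Tolerate mail quote markers ("> ") in front of each line.
        line = raw.lstrip("> ").strip()
        failed = FAILURE.match(line)
        if failed:
            last_failed = failed.group(1)
            continue
        broken = BROKEN.match(line)
        if broken and last_failed is not None:
            result[broken.group(1)] = last_failed
            last_failed = None
    return result


for array, disk in broken_arrays(LOG).items():
    print(f"{array} broke after {disk} timed out")
```

In this log every array is marked broken as soon as one member disk exhausts
its retries, which is consistent with lost interrupts rather than four
unrelated disk failures.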

Anyhow, you would want up-to-date -current ATA sources, as quite a few
problems have been solved.

-Søren
Received on Mon Aug 30 2004 - 13:58:08 UTC