Re: Panic during install on Sparc64 - Only with large HDD

From: Søren Schmidt <sos_at_FreeBSD.org>
Date: Sun, 14 Aug 2005 21:41:38 +0200
On 14/08/2005, at 20:16, Chris Gilbert wrote:

> Also, it seems that setting hw.ata.ata_dma=0 (forcing it into PIO mode)
> fixes the issue.
>
> # sysctl -a hw.ata.ata_dma
> hw.ata.ata_dma: 0
>
> # dd count=1 obs=1024 seek=93321656 if=/dev/urandom of=/dev/ad0g
> 1+0 records in
> 0+1 records out
> 512 bytes transferred in 0.001390 secs (368351 bytes/sec)
>
> Also, it seems there is a bug submitted on this, and a posting to the
> freebsd-sparc64 mailing list.
>
> http://lists.freebsd.org/pipermail/freebsd-sparc64/2005-June/003212.html
>
> Will continue looking into the chipset docs and FreeBSD driver... but
> thought I should point this out.

Actually the problem is in the Acer chip: it can't handle 48-bit
addressing in DMA mode unless the version is above 0xc4, IIRC.

Either you should use disks smaller than 137GB, or you need to
engage PIO mode.
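
For reference, the 137GB figure follows from the 28-bit sector address used before 48-bit addressing. The sketch below only redoes that arithmetic, assuming 512-byte sectors (the block size mentioned in the quoted report); the variable names are illustrative and not taken from the driver.

```python
# Arithmetic behind the "137GB" limit, assuming 512-byte sectors.
SECTOR_BYTES = 512   # assumption: sector size, as mentioned in the quoted report
LBA28_BITS = 28      # width of the sector address before 48-bit addressing

lba28_sectors = 1 << LBA28_BITS              # number of sectors reachable with 28 bits
lba28_bytes = lba28_sectors * SECTOR_BYTES   # capacity reachable with 28 bits

# LBA of the first WRITE_DMA timeout in the quoted log.
first_failing_lba = 268435456

print(lba28_sectors)                       # 268435456
print(lba28_bytes)                         # 137438953472, i.e. about 137GB
print(first_failing_lba == lba28_sectors)  # True
```

So the first failing LBA in the quoted log is exactly the first sector number that no longer fits in 28 bits.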

A workaround in ATA could be to use PIO mode when crossing the
boundary, but there is no framework for quirks like that present yet.
It could be done pretty easily though, so give me a few days (I'm busy
as usual).
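
The idea of the workaround can be sketched roughly as follows. This is only an illustration of the decision, not driver code: the function name, its parameters, and the `dma_48bit_ok` flag are hypothetical, and it ignores other details of real requests such as per-command sector count limits.

```python
# Hypothetical sketch of "use PIO when crossing the boundary".
LBA28_LIMIT = 1 << 28   # first sector number that no longer fits in 28 bits


def pick_transfer_mode(lba, sector_count, dma_48bit_ok):
    """Return "DMA" or "PIO" for one request (illustrative helper only).

    lba           first sector of the request
    sector_count  number of sectors in the request
    dma_48bit_ok  True if the controller can do 48-bit addressing in DMA mode
    """
    # The request touches sectors lba .. lba + sector_count - 1, so it needs
    # 48-bit addressing as soon as its last sector reaches LBA28_LIMIT.
    needs_48bit = lba + sector_count > LBA28_LIMIT
    if needs_48bit and not dma_48bit_ok:
        return "PIO"
    return "DMA"


print(pick_transfer_mode(268435454, 1, False))  # DMA, still below the boundary
print(pick_transfer_mode(268435456, 1, False))  # PIO, at the boundary
```

With a check like this, requests below the boundary would keep using DMA and only the ones that need 48-bit addressing would fall back to PIO.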

-Søren

>
> -- 
> Thanks,
> Chris (Lance) Gilbert
> Ph: +45 33 73 29 31 (UTC +0100)
>
> On Saturday 13 August 2005 23:21, Chris Gilbert wrote:
>
>> Well, I've continued looking into this problem as I really _really_
>> want to see it fixed for 6.0-RELEASE.
>>
>> I did some general device stress-testing to make sure that it was
>> directly triggerable and reproducible, and was not just an intermittent
>> failure.
>>
>> I have successfully created, and installed FreeBSD on (without any
>> errors):
>>
>> /dev/ad0a
>> /dev/ad0b
>> /dev/ad0c
>> /dev/ad0d
>> /dev/ad0e
>> /dev/ad0f
>>
>> Even though the newfs on it failed, creating the slice itself worked
>> for my large partition (/dev/ad0g).
>>
>> Therefore, I can dd data to it, but I can't write a UFS filesystem to
>> it in order to mount.
>>
>> I then went about writing data to this filesystem for long periods of
>> time to try and hit the problem:
>>
>> # time dd if=/dev/urandom of=/dev/ad0g
>> 143337401+0 records in
>> 143337401+0 records out
>> 73388749312 bytes transferred in 89392.318911 secs (820974 bytes/sec)
>> 614.444u 41826.640s 24:49:52.35 47.4%   244+1708k 0+0io 0pf+0w
>>
>> After this ran without a single error for about 20 hours, I stopped it
>> and started trying to hit the block that triggered the issue manually.
>>
>> After a few hours of "doubling and halving" I finally managed to find
>> the block:
>>
>> # dd count=1 obs=1024 seek=93321655 if=/dev/urandom of=/dev/ad0g
>> 1+0 records in
>> 0+1 records out
>> 512 bytes transferred in 0.001470 secs (348278 bytes/sec)
>>
>> This one was successful... but the very next one:
>>
>> # dd count=1 obs=1024 seek=93321656 if=/dev/urandom of=/dev/ad0g
>> ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435456
>> ad0: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=268435456
>> ad0: FAILURE - WRITE_DMA timed out LBA=268435456
>> dd: /dev/ad0g: Input/output error
>> 1+0 records in
>> 0+0 records out
>> 0 bytes transferred in 16.453833 secs (0 bytes/sec)
>>
>> And incrementing this by one block shows:
>>
>> # dd count=1 obs=1024 seek=93321657 if=/dev/urandom of=/dev/ad0g
>> ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435458
>> ad0: TIMEOUT - WRITE_DMA retrying (0 retries left) LBA=268435458
>> ad0: FAILURE - WRITE_DMA timed out LBA=268435458
>> dd: /dev/ad0g: Input/output error
>> 1+0 records in
>> 0+0 records out
>> 0 bytes transferred in 16.452722 secs (0 bytes/sec)
>>
>> This makes perfect sense because my block size is specified as 1024 on
>> the dd command, and the default block size is 512. Therefore,
>> incrementing it by a single 1024-byte block lands 2 blocks further
>> along in the LBA.
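
The seek-to-LBA relationship described above can be checked with a few lines of arithmetic. This is a sketch using only the numbers from the dd commands and the log; the partition start it prints is inferred from those numbers, not something stated in the thread, and the names are illustrative.

```python
# Check the dd seek arithmetic against the LBAs reported in the log.
SECTOR_BYTES = 512   # default block size mentioned in the report
OBS = 1024           # obs= value used on the dd command line


def seek_to_sectors(seek):
    """Offset into the partition, in 512-byte sectors, for a given dd seek."""
    return seek * OBS // SECTOR_BYTES


last_ok_seek, failing_seek, next_seek = 93321655, 93321656, 93321657
failing_lba, next_lba = 268435456, 268435458   # from the WRITE_DMA messages

# One extra 1024-byte output block moves the write two sectors further on.
print(seek_to_sectors(next_seek) - seek_to_sectors(failing_seek))  # 2
print(next_lba - failing_lba)                                      # 2

# Inferred (not stated in the thread): where the partition starts on disk.
partition_start = failing_lba - seek_to_sectors(failing_seek)
print(partition_start)                                             # 81792144

# The last successful write therefore landed just below 2**28.
last_ok_lba = partition_start + seek_to_sectors(last_ok_seek)
print(last_ok_lba, last_ok_lba < 2**28)                            # 268435454 True
```
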
>>
>> ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435456
>> (then...)
>> ad0: TIMEOUT - WRITE_DMA retrying (1 retry left) LBA=268435458
>>
>> Bingo! We've finally found the wall!
>>
>> I'm going to look further into the IDE chipset (atapci0: <AcerLabs M5229
>> UDMA66 controller>) tonight. Both for its whitepapers (to see if it has
>> some sort of quirk or limitation around this area) and its FreeBSD
>> driver, to see if something funky is going on.
>>
>> As I said before, if anyone is interested in helping me resolve this I
>> would appreciate it greatly. This is a bug which has haunted me and
>> several others since FreeBSD 5.2-RC2 and it needs to be fixed.
>>

- Søren
Received on Sun Aug 14 2005 - 17:41:44 UTC
