Re: ATA? related trouble with r300299

From: Kenneth D. Merry <ken_at_FreeBSD.ORG>
Date: Tue, 24 May 2016 16:18:38 -0400
On Tue, May 24, 2016 at 21:59:53 +0700, Alex V. Petrov wrote:
> 24.05.16 20:21, Kenneth D. Merry wrote:
> > On Tue, May 24, 2016 at 08:04:21 +0300, Oleg V. Nauman wrote:
> >> On Monday 23 May 2016 19:08:16 Kenneth D. Merry wrote:
> >>> On Tue, May 24, 2016 at 00:59:34 +0300, Oleg V. Nauman wrote:
> >>>> On Monday 23 May 2016 17:30:45 you wrote:
> >>>>> On Tue, May 24, 2016 at 00:15:25 +0300, Oleg V. Nauman wrote:
> >>>>>> On Monday 23 May 2016 17:11:34 Kenneth D. Merry wrote:
> >>>>>>> On Tue, May 24, 2016 at 00:05:49 +0300, Oleg V. Nauman wrote:
> >>>>>>>> On Monday 23 May 2016 16:53:55 Kenneth D. Merry wrote:
> >>>>>>>>> On Mon, May 23, 2016 at 23:21:32 +0300, Oleg V. Nauman wrote:
> >>>>>>>>>> On Monday 23 May 2016 15:25:39 Kenneth D. Merry wrote:
> >>>>>>>>>>> On Sat, May 21, 2016 at 09:30:35 +0300, Oleg V. Nauman wrote:
> >>>>>>>>>>>> I have run into a problem: fresh CURRENT stopped booting on my old
> >>>>>>>>>>>> desktop after the update to r300299.
> >>>>>>>>>>>> Verbose boot shows an endless cycle of
> >>>>>>>>>>>>
> >>>>>>>>>>>> ata2: SATA reset: ports status=0x05
> >>>>>>>>>>>> ata2: reset tp1 mask=03 ostat0=50 ostat1=50
> >>>>>>>>>>>> ata2: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
> >>>>>>>>>>>> ata2: stat1=0x50 err=0x01 lsb=0x00 msb=0x00
> >>>>>>>>>>>> ata2: reset tp2 stat0=50 stat1=50 devices=0x3
> >>>>>>>>>>>> messages logged to console.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Below is the relevant portion of ATA controller/devices
> >>>>>>>>>>>> probed/attached
> >>>>>>>>>>>> during the boot:
> >>>>>>>>>>>>
> >>>>>>>>>>>> atapci0: <Intel ICH7 UDMA100 controller> port
> >>>>>>>>>>>> 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 31.1 on pci0
> >>>>>>>>>>>> ata0: <ATA channel> at channel 0 on atapci0
> >>>>>>>>>>>> atapci1: <Intel ICH7 SATA300 controller> port
> >>>>>>>>>>>> 0xd080-0xd087,0xd000-0xd003,0xcc00-0xcc07,0xc880-0xc883,0xc800-0xc80f
> >>>>>>>>>>>> irq 19 at device 31.2 on pci0
> >>>>>>>>>>>> ata2: <ATA channel> at channel 0 on atapci1
> >>>>>>>>>>>> ata3: <ATA channel> at channel 1 on atapci1
> >>>>>>>>>>>> ada0 at ata2 bus 0 scbus1 target 0 lun 0
> >>>>>>>>>>>> ada0: <SAMSUNG HD200HJ KF100-06> ATA-7 SATA 2.x device
> >>>>>>>>>>>> ada1 at ata2 bus 0 scbus1 target 1 lun 0
> >>>>>>>>>>>> ada1: <ST500DM002-1BC142 JC4B> ATA8-ACS SATA 3.x device
> >>>>>>>>>>>> cd0 at ata0 bus 0 scbus0 target 0 lun 0
> >>>>>>>>>>>> cd0: <_NEC DVD_RW ND-3570A 1.11> Removable CD-ROM SCSI device
> >>>>>>>>>>>
> >>>>>>>>>>> I'm not entirely sure what is causing the problem with your system,
> >>>>>>>>>>> but hopefully we can narrow it down a bit.
> >>>>>>>>>>>
> >>>>>>>>>>> There is a bug that came in with my SMR changes in revision 300207
> >>>>>>>>>>> that broke the quirk functionality in the ada(4) driver.  I don't
> >>>>>>>>>>> think that is the problem you're seeing, though.
> >>>>>>>>>>>
> >>>>>>>>>>> Can you try out this patch:
> >>>>>>>>>>>
> >>>>>>>>>>> https://people.freebsd.org/~ken/cam_smr_ada_patch.20160523.1.txt
> >>>>>>>>>>>
> >>>>>>>>>>> In /boot/loader.conf, put the following:
> >>>>>>>>>>>
> >>>>>>>>>>> kern.cam.ada.0.quirks="0x04"
> >>>>>>>>>>> kern.cam.ada.1.quirks="0x04"
> >>>>>>>>>>>
> >>>>>>>>>>> If you're able to boot with those quirk entries in the loader.conf,
> >>>>>>>>>>> try taking one of them out, and reboot.  If that works, try taking
> >>>>>>>>>>> the other one out and reboot.
> >>>>>>>>>>>
> >>>>>>>>>>> What I'm trying to figure out here is where the problem lies:
> >>>>>>>>>>>
> >>>>>>>>>>> 1. The bug with the ada(4) driver (in where it loaded the quirks).
> >>>>>>>>>>> 2. The extra probe steps in the ada(4) driver might be causing a
> >>>>>>>>>>>    problem with ada0 (Samsung drive).
> >>>>>>>>>>> 3. The extra probe steps in the ada(4) driver might be causing a
> >>>>>>>>>>>    problem with ada1 (Seagate drive).
> >>>>>>>>>>> 4. Something else.
> >>>>>>>>>>>
> >>>>>>>>>>> So, if you can try the patch and try to eliminate a few possibilities,
> >>>>>>>>>>> we may be able to narrow it down.
> >>>>>>>>>>  
> >>>>>>>>>> I was able to boot after applying the patch;
> >>>>>>>>>>
> >>>>>>>>>> kern.cam.ada.0.quirks="0x04"
> >>>>>>>>>> was the quirk in effect. It is the quirk for my Samsung HD200HJ
> >>>>>>>>>> KF100-06 hard drive.
> >>>>>>>>>
> >>>>>>>>> Okay.  Just so we can narrow it down a little more, can you try this:
> >>>>>>>>>
> >>>>>>>>> First, let's try getting an ATA Log directory using the PIO version of
> >>>>>>>>> the command:
> >>>>>>>>>
> >>>>>>>>> camcontrol cmd ada0 -v -a "2f 0 0 0 0 0 0 0 0 0 1 0" -i 512 - | hd
> >>>>>>>>>
> >>>>>>>>> If that works (you should get hexdump output), try the DMA version of
> >>>>>>>>> the command:
> >>>>>>>>>
> >>>>>>>>> camcontrol cmd ada0 -v -d -a "47 0 0 0 0 0 0 0 0 0 1 0" -i 512 - | hd
> >>>>>>>>
> >>>>>>>> "Expecting a character pointer argument." error for both commands.
> >>>>>>>
> >>>>>>> Did the double quotes make it onto the command line?  Both of those work
> >>>>>>> for me...
> >>>>>>  
> >>>>>>  Something went wrong on my side, sorry.
> >>>>>>
> >>>>>> Below is the output of commands:
> >>>>>>
> >>>>>> root_at_desktop:~ # camcontrol cmd ada0 -v -a "2f 0 0 0 0 0 0 0 0 0 1 0" -i 512 - | hd
> >>>>>> camcontrol: error sending command
> >>>>>> (pass1:ata2:0:0:0): READ_LOG_EXT. ACB: 2f 00 00 00 00 00 00 00 00 00 01 00
> >>>>>> (pass1:ata2:0:0:0): CAM status: ATA Status Error
> >>>>>> (pass1:ata2:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 04 (ABRT )
> >>>>>> (pass1:ata2:0:0:0): RES: 51 04 00 00 00 00 00 00 00 01 00
> >>>>>> root_at_desktop:~ # camcontrol cmd ada0 -v -d -a "47 0 0 0 0 0 0 0 0 0 1 0" -i 512 - | hd
> >>>>>> camcontrol: error sending command
> >>>>>> (pass1:ata2:0:0:0): READ_LOG_DMA_EXT. ACB: 47 00 00 00 00 00 00 00 00 00 01 00
> >>>>>> (pass1:ata2:0:0:0): CAM status: ATA Status Error
> >>>>>> (pass1:ata2:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 04 (ABRT )
> >>>>>> (pass1:ata2:0:0:0): RES: 51 04 00 00 00 00 00 00 00 01 00
> >>>>>
> >>>>> Okay, at least it consistently fails with both the PIO and DMA versions.
> >>>>> Looks like the drive claims to support READ LOG, but doesn't actually
> >>>>> support it.
> >>>>>
> >>>>> Can you revert the previous patch, take the quirk out of loader.conf,
> >>>>> and try this patch?
> >>>>>
> >>>>> https://people.freebsd.org/~ken/cam_smr_ada_patch.20160523.2.txt
> >>>>>
> >>>>> It adds the model number for your drive into the ada(4) driver as a
> >>>>> quirk.
> >>>>  
> >>>>  Unfortunately it does not work, but the system still boots with the
> >>>>  quirk added back to loader.conf.
> >>>
> >>> Okay, try this one.  I put a question mark in place of the space, perhaps
> >>> that will match it.
> >>>
> >>> https://people.freebsd.org/~ken/cam_smr_ada_patch.20160523.3.txt
> >>
> >>  Still no luck, but it works with the quirk in loader.conf.
> >> Below is the drive identification from 'smartctl' output:
> >>
> >> === START OF INFORMATION SECTION ===
> >> Model Family:     SAMSUNG SpinPoint S250
> >> Device Model:     SAMSUNG HD200HJ
> >> Serial Number:    S16KJ1CQ500218
> >> LU WWN Device Id: 5 0000f0 01b500218
> >> Firmware Version: KF100-06
> > 
> > Hmm.  Turns out a question mark won't match a space, so the previous patch
> > wouldn't work.  Can you send the output of:
> > 
> > camcontrol identify ada0 -v
> > 
> > That will include a raw identify data dump.  Hopefully I can figure out
> > what is going on from that.
> > 
> > Thanks,
> > 
> > Ken
> > 
> 
> My old AMD (nForce4-Ultra) machine has the same problem (it does not boot on
> new revisions).
> 
> camcontrol: sending ATA ATA_IDENTIFY with timeout of 30000 msecs
> pass0: Raw identify data:
>    0: 0040 3fff c837 0010 8856 022a 003f 0000
>    8: 0000 0000 5330 4d55 4a31 5050 3530 3936
>   16: 3137 2020 2020 2020 0003 8000 0004 4352
>   24: 3130 302d 3130 5341 4d53 554e 4720 4844
>   32: 3530 314c 4a20 2020 2020 2020 2020 2020
>   40: 2020 2020 2020 2020 2020 2020 2020 8010
>   48: 0000 2f00 4000 0200 0200 0007 3fff 0010
>   56: 003f fc10 00fb 0110 ffff 0fff 0000 0007
>   64: 0003 0078 0078 0078 0078 0000 0000 0000
>   72: 0000 0000 0000 001f 0706 0000 004c 0040
>   80: 01f8 0052 746b 7f01 4123 7469 bc01 4123
>   88: 20ff 0054 0054 0000 fffe 0000 fe00 0000
>   96: 0000 0000 0000 0000 6030 3a38 0000 0000
>  104: 0000 0000 0000 0000 5000 0f00 1b50 9617
>  112: 0000 0000 0000 0000 0000 0000 0000 4010
>  120: 4010 0000 0000 0000 0000 0000 0000 0000
>  128: 0021 0000 0000 0000 0000 0000 0000 0000
>  136: 0000 0000 0000 0000 ffff 0400 0e00 0003
>  144: 0000 9a00 0300 2400 6420 3231 0000 0000
>  152: 0000 0000 0000 0000 0000 0000 0000 0000
>  160: 0000 0000 0000 0000 0000 0000 0000 0000
>  168: 0000 0000 0000 0000 0000 0000 0000 0000
>  176: 0000 0000 0000 0000 0000 0000 0000 0000
>  184: 0000 0000 0000 0000 0000 0000 0000 0000
>  192: 0000 0000 0000 0000 0000 0000 0000 0000
>  200: 0000 0000 0000 0000 0000 0000 003f 0000
>  208: 0000 0000 0000 0000 0000 0000 0000 0000
>  216: 0000 0000 0000 0000 0000 0000 100f 0000
>  224: 0000 0000 0000 0000 0000 0000 0000 0000
>  232: 0000 0000 0001 0400 0000 0000 0000 0000
>  240: 0000 0000 0000 0000 0000 0000 0000 0000
>  248: 0000 0000 0000 0000 0000 0000 0000 75a5
> 
> camcontrol: sending ATA READ_NATIVE_MAX_ADDRESS48 with timeout of 1000 msecs
> pass0: Raw native max data:
>    0: 5000 2f00 3860 3a3a 0000 0000
> error = 0x00, sector_count = 0x0000, device = 0x3a, status = 0x50
> pass0: <SAMSUNG HD501LJ CR100-10> ATA8-ACS SATA 2.x device
> pass0: 300.000MB/s transfers (SATA 2.x, UDMA5, PIO 8192bytes)
> 
> protocol              ATA/ATAPI-8 SATA 2.x
> device model          SAMSUNG HD501LJ
> firmware revision     CR100-10
> serial number         S0MUJ1PP509617
> WWN                   50000f001b509617
> cylinders             16383
> heads                 16
> sectors/track         63
> sector size           logical 512, physical 512, offset 0
> LBA supported         268435455 sectors
> LBA48 supported       976773168 sectors
> PIO supported         PIO4
> DMA supported         WDMA2 UDMA6
> 
> Feature                      Support  Enabled   Value           Vendor
> read ahead                     yes	yes
> write cache                    yes	yes
> flush cache                    yes	yes
> overlap                        no
> Tagged Command Queuing (TCQ)   no	no
> Native Command Queuing (NCQ)   yes		32 tags
> NCQ Queue Management           no
> NCQ Streaming                  no
> Receive & Send FPDMA Queued    no
> SMART                          yes	yes
> microcode download             yes	yes
> security                       yes	no
> power management               yes	yes
> advanced power management      no	no
> automatic acoustic management  yes	no	0/0x00	254/0xFE
> media status notification      no	no
> power-up in Standby            no	no
> write-read-verify              no	no
> unload                         no	no
> general purpose logging        yes	yes
> free-fall                      no	no
> Data Set Management (DSM/TRIM) no
> Host Protected Area (HPA)      yes      no      976773168/976773168
> HPA - Security                 no
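
As an aside, the model, firmware, and serial strings shown in the parsed
output above can be read straight out of the raw identify dump. The ATA
identify data stores them in words 10-19 (serial), 23-26 (firmware), and
27-46 (model), two ASCII characters per 16-bit word with the high byte first,
padded with spaces. A minimal sketch in Python, using rows 8 through 40 of the
quoted dump; the helper names parse_dump and ata_string are made up for
illustration and are not part of camcontrol or of any patch in this thread:

```python
# Rows 8..40 of the raw identify dump quoted above (HD501LJ drive).
DUMP = """
   8: 0000 0000 5330 4d55 4a31 5050 3530 3936
  16: 3137 2020 2020 2020 0003 8000 0004 4352
  24: 3130 302d 3130 5341 4d53 554e 4720 4844
  32: 3530 314c 4a20 2020 2020 2020 2020 2020
  40: 2020 2020 2020 2020 2020 2020 2020 8010
"""


def parse_dump(text):
    """Map word index -> 16-bit value from 'offset: w0 w1 ...' lines."""
    words = {}
    for line in text.strip().splitlines():
        offset, _, rest = line.partition(":")
        for i, token in enumerate(rest.split()):
            words[int(offset) + i] = int(token, 16)
    return words


def ata_string(words, first, last):
    """Decode an identify string field covering words first..last inclusive.

    Each word carries two ASCII characters, high byte first; trailing
    space padding is stripped.
    """
    raw = b"".join(words[i].to_bytes(2, "big") for i in range(first, last + 1))
    return raw.decode("ascii", errors="replace").strip()


words = parse_dump(DUMP)
serial = ata_string(words, 10, 19)
firmware = ata_string(words, 23, 26)
model = ata_string(words, 27, 46)
print(model, firmware, serial)
```

The decoded model and firmware are the two pieces that appear between the
angle brackets in the "pass0: <...>" line of the camcontrol output.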

Can you try this patch and see whether it works for you?

https://people.freebsd.org/~ken/cam_smr_ada_patch.20160524.2.txt

Thanks,

Ken
-- 
Kenneth Merry
ken_at_FreeBSD.ORG
Received on Tue May 24 2016 - 18:18:42 UTC
