Re: ATA? related trouble with r300299

From: Kenneth D. Merry <ken_at_FreeBSD.ORG>
Date: Tue, 24 May 2016 10:02:09 -0400
On Tue, May 24, 2016 at 16:38:40 +0300, Oleg V. Nauman wrote:
> On Tuesday 24 May 2016 09:21:17 Kenneth D. Merry wrote:
> > On Tue, May 24, 2016 at 08:04:21 +0300, Oleg V. Nauman wrote:
> > > On Monday 23 May 2016 19:08:16 Kenneth D. Merry wrote:
> > > > On Tue, May 24, 2016 at 00:59:34 +0300, Oleg V. Nauman wrote:
> > > > > On Monday 23 May 2016 17:30:45 you wrote:
> > > > > > On Tue, May 24, 2016 at 00:15:25 +0300, Oleg V. Nauman wrote:
> > > > > > > On Monday 23 May 2016 17:11:34 Kenneth D. Merry wrote:
> > > > > > > > On Tue, May 24, 2016 at 00:05:49 +0300, Oleg V. Nauman wrote:
> > > > > > > > > On Monday 23 May 2016 16:53:55 Kenneth D. Merry wrote:
> > > > > > > > > > On Mon, May 23, 2016 at 23:21:32 +0300, Oleg V. Nauman wrote:
> > > > > > > > > > > On Monday 23 May 2016 15:25:39 Kenneth D. Merry wrote:
> > > > > > > > > > > > On Sat, May 21, 2016 at 09:30:35 +0300, Oleg V. Nauman wrote:
> > > > > > > > > > > > > I have hit an issue where fresh CURRENT stopped booting on my old
> > > > > > > > > > > > > desktop after updating to r300299.
> > > > > > > > > > > > > Verbose boot shows the endless cycle of
> > > > > > > > > > > > > 
> > > > > > > > > > > > > ata2: SATA reset: ports status=0x05
> > > > > > > > > > > > > ata2: reset tp1 mask=03 ostat0=50 ostat1=50
> > > > > > > > > > > > > ata2: stat0=0x50 err=0x01 lsb=0x00 msb=0x00
> > > > > > > > > > > > > ata2: stat1=0x50 err=0x01 lsb=0x00 msb=0x00
> > > > > > > > > > > > > ata2: reset tp2 stat0=50 stat1=50 devices=0x3
> > > > > > > > > > > > > messages logged to console.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Below is the relevant portion of ATA controller/devices
> > > > > > > > > > > > > probed/attached during the boot:
> > > > > > > > > > > > > 
> > > > > > > > > > > > > atapci0: <Intel ICH7 UDMA100 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 31.1 on pci0
> > > > > > > > > > > > > ata0: <ATA channel> at channel 0 on atapci0
> > > > > > > > > > > > > atapci1: <Intel ICH7 SATA300 controller> port 0xd080-0xd087,0xd000-0xd003,0xcc00-0xcc07,0xc880-0xc883,0xc800-0xc80f irq 19 at device 31.2 on pci0
> > > > > > > > > > > > > ata2: <ATA channel> at channel 0 on atapci1
> > > > > > > > > > > > > ata3: <ATA channel> at channel 1 on atapci1
> > > > > > > > > > > > > ada0 at ata2 bus 0 scbus1 target 0 lun 0
> > > > > > > > > > > > > ada0: <SAMSUNG HD200HJ KF100-06> ATA-7 SATA 2.x device
> > > > > > > > > > > > > ada1 at ata2 bus 0 scbus1 target 1 lun 0
> > > > > > > > > > > > > ada1: <ST500DM002-1BC142 JC4B> ATA8-ACS SATA 3.x device
> > > > > > > > > > > > > cd0 at ata0 bus 0 scbus0 target 0 lun 0
> > > > > > > > > > > > > cd0: <_NEC DVD_RW ND-3570A 1.11> Removable CD-ROM SCSI device
> > > > > > > > > > > > 
> > > > > > > > > > > > I'm not entirely sure what is causing the problem with your system,
> > > > > > > > > > > > but hopefully we can narrow it down a bit.
> > > > > > > > > > > > 
> > > > > > > > > > > > There is a bug that came in with my SMR changes in revision 300207
> > > > > > > > > > > > that broke the quirk functionality in the ada(4) driver.  I don't
> > > > > > > > > > > > think that is the problem you're seeing, though.
> > > > > > > > > > > > 
> > > > > > > > > > > > Can you try out this patch:
> > > > > > > > > > > > 
> > > > > > > > > > > > https://people.freebsd.org/~ken/cam_smr_ada_patch.20160523.1.txt
> > > > > > > > > > > > 
> > > > > > > > > > > > In /boot/loader.conf, put the following:
> > > > > > > > > > > > 
> > > > > > > > > > > > kern.cam.ada.0.quirks="0x04"
> > > > > > > > > > > > kern.cam.ada.1.quirks="0x04"
> > > > > > > > > > > > 
> > > > > > > > > > > > If you're able to boot with those quirk entries in the loader.conf,
> > > > > > > > > > > > try taking one of them out, and reboot.  If that works, try taking
> > > > > > > > > > > > the other one out and reboot.
> > > > > > > > > > > > 
> > > > > > > > > > > > What I'm trying to figure out here is where the problem lies:
> > > > > > > > > > > > 
> > > > > > > > > > > > 1. The bug with the ada(4) driver (in the code where it loaded the
> > > > > > > > > > > >    quirks).
> > > > > > > > > > > > 2. The extra probe steps in the ada(4) driver might be causing a
> > > > > > > > > > > >    problem with ada0 (Samsung drive).
> > > > > > > > > > > > 3. The extra probe steps in the ada(4) driver might be causing a
> > > > > > > > > > > >    problem with ada1 (Seagate drive).
> > > > > > > > > > > > 4. Something else.
> > > > > > > > > > > > 
> > > > > > > > > > > > So, if you can try the patch and try to eliminate a few
> > > > > > > > > > > > possibilities, we may be able to narrow it down.
> > > > > > > > > > >  
> > > > > > > > > > > I was able to boot after applying the patch;
> > > > > > > > > > > 
> > > > > > > > > > > kern.cam.ada.0.quirks="0x04"
> > > > > > > > > > > was the quirk in effect. It is the quirk for my Samsung HD200HJ
> > > > > > > > > > > KF100-06 hard drive.
> > > > > > > > > > 
> > > > > > > > > > Okay.  Just so we can narrow it down a little more, can you try this:
> > > > > > > > > > 
> > > > > > > > > > First, let's try getting an ATA Log directory using the PIO version of
> > > > > > > > > > the command:
> > > > > > > > > > 
> > > > > > > > > > camcontrol cmd ada0 -v -a "2f 0 0 0 0 0 0 0 0 0 1 0" -i 512 - |hd
> > > > > > > > > > 
> > > > > > > > > > If that works (you should get hexdump output), try the DMA version of
> > > > > > > > > > the command:
> > > > > > > > > > 
> > > > > > > > > > camcontrol cmd ada0 -v -d -a "47 0 0 0 0 0 0 0 0 0 1 0" -i 512 - |hd
> > > > > > > > > 
> > > > > > > > > "Expecting a character pointer argument." error for both commands.
> > > > > > > > 
> > > > > > > > Did the double quotes make it onto the command line?  Both of those
> > > > > > > > work for me...
> > > > > > >  
> > > > > > >  Something went wrong from my side, sorry.
> > > > > > > 
> > > > > > > Below is the output of commands:
> > > > > > > 
> > > > > > > root_at_desktop:~ # camcontrol cmd ada0 -v -a "2f 0 0 0 0 0 0 0 0 0 1 0" -i 512 - |hd
> > > > > > > camcontrol: error sending command
> > > > > > > (pass1:ata2:0:0:0): READ_LOG_EXT. ACB: 2f 00 00 00 00 00 00 00 00 00 01 00
> > > > > > > (pass1:ata2:0:0:0): CAM status: ATA Status Error
> > > > > > > (pass1:ata2:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 04 (ABRT )
> > > > > > > (pass1:ata2:0:0:0): RES: 51 04 00 00 00 00 00 00 00 01 00
> > > > > > > root_at_desktop:~ # camcontrol cmd ada0 -v -d -a "47 0 0 0 0 0 0 0 0 0 1 0" -i 512 - |hd
> > > > > > > camcontrol: error sending command
> > > > > > > (pass1:ata2:0:0:0): READ_LOG_DMA_EXT. ACB: 47 00 00 00 00 00 00 00 00 00 01 00
> > > > > > > (pass1:ata2:0:0:0): CAM status: ATA Status Error
> > > > > > > (pass1:ata2:0:0:0): ATA status: 51 (DRDY SERV ERR), error: 04 (ABRT )
> > > > > > > (pass1:ata2:0:0:0): RES: 51 04 00 00 00 00 00 00 00 01 00
> > > > > > 
> > > > > > Okay, at least it consistently fails with both the PIO and DMA versions.
> > > > > > Looks like the drive claims to support READ LOG, but doesn't actually
> > > > > > support it.
> > > > > > 
> > > > > > Can you revert the previous patch, take the quirk out of loader.conf, and
> > > > > > try this patch?
> > > > > > 
> > > > > > https://people.freebsd.org/~ken/cam_smr_ada_patch.20160523.2.txt
> > > > > > 
> > > > > > It adds the model number for your drive into the ada(4) driver as a quirk.
> > > > >  
> > > > > Unfortunately it is not working; but it does allow booting with the
> > > > > quirk added back to loader.conf
> > > > 
> > > > Okay, try this one.  I put a question mark in place of the space, perhaps
> > > > that will match it.
> > > > 
> > > > https://people.freebsd.org/~ken/cam_smr_ada_patch.20160523.3.txt
> > >  
> > >  Still no luck, but it works with quirk in the loader.conf
> > > 
> > > Below is the drive identification from 'smartctl' output:
> > > 
> > > === START OF INFORMATION SECTION ===
> > > Model Family:     SAMSUNG SpinPoint S250
> > > Device Model:     SAMSUNG HD200HJ
> > > Serial Number:    S16KJ1CQ500218
> > > LU WWN Device Id: 5 0000f0 01b500218
> > > Firmware Version: KF100-06
> > 
> > Hmm.  Turns out a question mark won't match a space, so the previous patch
> > wouldn't work.  Can you send the output of:
> > 
> > camcontrol identify ada0 -v
> > 
> > That will include a raw identify data dump.  Hopefully I can figure out
> > what is going on from that.
> 
> root_at_desktop:~ # camcontrol identify ada0 -v
> camcontrol: sending ATA ATA_IDENTIFY with timeout of 30000 msecs
> pass1: Raw identify data:
>    0: 0040 3fff c837 0010 8856 022a 003f 0000 
>    8: 0000 0000 5331 364b 4a31 4351 3530 3032 
>   16: 3138 2020 2020 2020 0003 4000 0004 4b46 
>   24: 3130 302d 3036 5341 4d53 554e 4720 4844 
>   32: 3230 3048 4a20 2020 2020 2020 2020 2020 
>   40: 2020 2020 2020 2020 2020 2020 2020 8010 
>   48: 0000 2f00 4000 0200 0200 0007 3fff 0010 
>   56: 003f fc10 00fb 0110 ffff 0fff 0000 0007 
>   64: 0003 0078 0078 0078 0078 0000 0000 0000 
>   72: 0000 0000 0000 001f 0706 0000 004c 0040 
>   80: 00f8 0052 746b 7f09 4123 7469 bc01 4123 
>   88: 20ff 0019 0019 0000 fffe 0000 fe00 0000 
>   96: 0000 0000 0000 0000 f1b0 1749 0000 0000 
>  104: 0000 0000 0000 0000 5000 0f00 1b50 0218 
>  112: 0000 0000 0000 0000 0000 0000 0000 401c 
>  120: 401c 0000 0000 0000 0000 0000 0000 0000 
>  128: 0029 0000 0000 0000 0000 0000 0000 0000 
>  136: 0000 0000 0000 0000 ffff 0400 4e00 0003 
>  144: 0000 9a00 0300 2400 7920 3438 0000 0000 
>  152: 0000 0000 0000 0000 0000 0000 0000 0000 
>  160: 0000 0000 0000 0000 0000 0000 0000 0000 
>  168: 0000 0000 0000 0000 0000 0000 0000 0000 
>  176: 0000 0000 0000 0000 0000 0000 0000 0000 
>  184: 0000 0000 0000 0000 0000 0000 0000 0000 
>  192: 0000 0000 0000 0000 0000 0000 0000 0000 
>  200: 0000 0000 0000 0000 0000 0000 003f 0000 
>  208: 0000 0000 0000 0000 0000 0000 0000 0000 
>  216: 0000 0000 0000 0000 0000 0000 0000 0000 
>  224: 0000 0000 0000 0000 0000 0000 0000 0000 
>  232: 0000 0000 0001 0400 0000 0000 0000 0000 
>  240: 0000 0000 0000 0000 0000 0000 0000 0000 
>  248: 0000 0000 0000 0000 0000 0000 0000 98a5 
> 
> camcontrol: sending ATA READ_NATIVE_MAX_ADDRESS48 with timeout of 1000 msecs
> pass1: Raw native max data:
>    0: 5000 af00 49f1 1717 0000 0000 
> error = 0x00, sector_count = 0x0000, device = 0x17, status = 0x50
> pass1: <SAMSUNG HD200HJ KF100-06> ATA-7 SATA 2.x device
> pass1: 150.000MB/s transfers (SATA, UDMA5, PIO 8192bytes)
> 
> protocol              ATA/ATAPI-7 SATA 2.x
> device model          SAMSUNG HD200HJ
> firmware revision     KF100-06
> serial number         S16KJ1CQ500218
> WWN                   50000f001b500218
> cylinders             16383
> heads                 16
> sectors/track         63
> sector size           logical 512, physical 512, offset 0
> LBA supported         268435455 sectors
> LBA48 supported       390721968 sectors
> PIO supported         PIO4
> DMA supported         WDMA2 UDMA6
> 
> Feature                      Support  Enabled   Value           Vendor
> read ahead                     yes      yes
> write cache                    yes      yes
> flush cache                    yes      yes
> overlap                        no
> Tagged Command Queuing (TCQ)   no       no
> Native Command Queuing (NCQ)   yes              32 tags
> NCQ Queue Management           no
> NCQ Streaming                  no
> Receive & Send FPDMA Queued    no
> SMART                          yes      yes
> microcode download             yes      yes
> security                       yes      no
> power management               yes      yes
> advanced power management      yes      no      0/0x00
> automatic acoustic management  yes      no      0/0x00  254/0xFE
> media status notification      no       no
> power-up in Standby            no       no
> write-read-verify              no       no
> unload                         no       no
> general purpose logging        yes      yes
> free-fall                      no       no
> Data Set Management (DSM/TRIM) no
> Host Protected Area (HPA)      yes      no      390721968/390721968
> HPA - Security                 no
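
For anyone who wants to cross-check the raw identify dump quoted above against
the decoded fields, here is a minimal Python sketch. It assumes the standard
ATA IDENTIFY DEVICE layout (serial number in words 10-19, firmware revision in
words 23-26, model number in words 27-46, two ASCII characters per word with
the first character in the high byte). The helper names decode_ata_string and
field are made up for illustration and are not part of camcontrol.

```python
# Words 8..47 copied from the "Raw identify data" dump quoted above.
DUMP_WORDS_8_TO_47 = """
0000 0000 5331 364b 4a31 4351 3530 3032
3138 2020 2020 2020 0003 4000 0004 4b46
3130 302d 3036 5341 4d53 554e 4720 4844
3230 3048 4a20 2020 2020 2020 2020 2020
2020 2020 2020 2020 2020 2020 2020 8010
""".split()

FIRST_WORD_IN_DUMP = 8  # index of the first word in the excerpt above


def decode_ata_string(words):
    """Turn hex-printed 16-bit words into ASCII, dropping the space padding."""
    raw = bytes.fromhex("".join(words))  # high byte of each word comes first
    return raw.decode("ascii").rstrip(" ")


def field(first_word, last_word):
    """Decode the string stored in identify words first_word..last_word."""
    start = first_word - FIRST_WORD_IN_DUMP
    end = last_word - FIRST_WORD_IN_DUMP + 1
    return decode_ata_string(DUMP_WORDS_8_TO_47[start:end])


serial = field(10, 19)
firmware = field(23, 26)
model = field(27, 46)

print("serial:  ", repr(serial))
print("firmware:", repr(firmware))
print("model:   ", repr(model))
```

The decoded model string contains a single space between "SAMSUNG" and
"HD200HJ", followed only by space padding, which agrees with the
"device model" line in the output above.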

Okay, there are no surprises in the identify data, so here's another patch
that will hopefully shed some more light on where the quirks are getting
messed up.

https://people.freebsd.org/~ken/cam_smr_ada_patch.20160524.1.txt
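
To illustrate the earlier remark that a question mark won't match a space,
here is a small Python model of that matching rule. It is only a sketch built
from what is said in this thread: the function name quirk_pattern_matches, the
handling of trailing padding, and the handling of a pattern that is longer
than the value are assumptions, and the kernel's real matching code may
behave differently.

```python
def quirk_pattern_matches(value: str, pattern: str) -> bool:
    """Hypothetical model of quirk string matching.

    Assumed rules: '?' in the pattern matches any single character except
    a space, every other pattern character must match exactly, and any
    leftover characters in the value must be space padding.
    """
    for i, p in enumerate(pattern):
        if i >= len(value):
            # Assumption: a pattern longer than the value does not match.
            return False
        c = value[i]
        if p == "?":
            if c == " ":
                return False  # the case discussed in this thread
        elif p != c:
            return False
    # Whatever remains of the value may only be space padding.
    return value[len(pattern):].strip(" ") == ""


model = "SAMSUNG HD200HJ"

print(quirk_pattern_matches(model, "SAMSUNG?HD200HJ"))        # '?' against the space
print(quirk_pattern_matches(model, "SAMSUNG HD200HJ"))        # literal space
print(quirk_pattern_matches(model + "     ", "SAMSUNG HD200HJ"))  # padded value
```

Under these assumed rules the pattern with the question mark is rejected,
while the pattern with a literal space is accepted, with or without trailing
padding in the identify string.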

Thanks,

Ken
-- 
Kenneth Merry
ken_at_FreeBSD.ORG
Received on Tue May 24 2016 - 12:02:12 UTC
