Re: Any objections/comments on axing out old ATA stack?

From: Jeremy Chadwick <jdc_at_koitsu.org>
Date: Wed, 3 Apr 2013 18:05:26 -0700
On Thu, Apr 04, 2013 at 02:19:16AM +0200, Matthias Andree wrote:
> On 04.04.2013 01:38, Jeremy Chadwick wrote:
> 
> ...
> 
> > While skimming Linux libata code and commits in the past, the only
> > glaringly obvious bug/issue I see is with SB600/SB700 chipsets (the
> > hardware revision apparently matters) and port multiplier (PMP) support
> > and soft resets.
> > 
> > Are you using a port multiplier?  I doubt it, but I have to ask.
> 
> I am not using a PMP as far as I know (unless one is buried on my Asus
> M4A78T-E main board). It would seem the drives are directly attached to
> the south bridge's SATA ports.

Then the answer is no, you're not using a port multiplier.  Details:

http://www.serialata.org/technology/port_multipliers.asp
http://en.wikipedia.org/wiki/Port_multiplier

> >> Why only my Samsung HDD drive triggers this but not the WD drive, I do
> >> not know yet.
> > 
> > Please provide "gpart show -p ada1" output, both here and in the PR,
> > if you could.
> 
> =>        63  1953525105    ada1  MBR  (931G)
>           63   209714337  ada1s1  freebsd  [active]  (100G)
>    209714400         800          - free -  (400k)
>    209715200    71680000  ada1s2  ntfs  (34G)
>    281395200       15405          - free -  (7.5M)
>    281410605   488263545  ada1s3  linux-data  (232G)
>    769674150  1183851018          - free -  (564G)

This is what I was worried about.  Referring to your "camcontrol
identify" output:

> device model SAMSUNG HD103SI
> sector size logical 512, physical 512, offset 0

Hear me out entirely on this one.

My theory is that your hard disk actually uses 4096-byte sectors but is
too old to report logical and physical sector size separately in its ATA
IDENTIFY data.  In other words, only the logical size is provided, so
logical=physical in the eyes of all software; smartctl will show you the
exact same thing.
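
To make the IDENTIFY point concrete, here is a minimal Python sketch of
how software can derive the two sizes from IDENTIFY DEVICE data.  The
bit layout (word 106, words 117-118) is as I understand it from
ATA8-ACS, and the function name sector_sizes is made up for
illustration; it is not how any particular driver does it.

```python
def sector_sizes(word106, words117_118=0):
    """Return (logical, physical) sector size in bytes.

    word106      -- IDENTIFY DEVICE word 106 (assumed ATA8-ACS layout)
    words117_118 -- logical sector size in 16-bit words, if reported
    """
    logical = 512
    physical = 512
    # Word 106 is only meaningful when bit 15 is 0 and bit 14 is 1.
    # A drive that predates this field leaves it invalid, and the
    # host falls back to 512/512.
    if (word106 & 0xC000) != 0x4000:
        return logical, physical
    if word106 & (1 << 12):
        # Logical sector is longer than 256 words; size is in words.
        logical = words117_118 * 2
    physical = logical
    if word106 & (1 << 13):
        # Bits 3:0 hold log2(logical sectors per physical sector).
        physical = logical << (word106 & 0x000F)
    return logical, physical


print(sector_sizes(0x0000))  # field not reported
print(sector_sizes(0x6003))  # 8 logical sectors per physical sector
```

A drive that leaves word 106 empty is indistinguishable, from the
host's point of view, from a true 512-byte drive.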

There are drives like this in the wild, both SSDs as well as MHDDs.
For example, the Intel 320-series SSD behaves this way too (providing
only logical size).

Do not let the capacity/size of the drive be the deciding factor; your
drive is 1TB, but I also have many 1TB MHDDs that use 4096-byte sectors.

Seagate/Samsung's specification** for the HD103SI states, and I quote:
"Byte per Sensor: 512 bytes".  Yes, it says "Sensor".  Whether or not
this documentation is accurate is unknown, and when vendors have typos
in their own specification docs, I cannot rule out the possibility of
the information being wrong.  So I'm unsure if this drive uses 512-byte
sectors or 4096-byte sectors.

That said: in your "gpart show -p ada1" output, your FreeBSD and Linux
partitions are not aligned to 4096-byte boundaries (the NTFS partition,
starting at sector 209715200, happens to be).  Ideally you'd want all of
these aligned to 1MByte or 2MByte boundaries in case you ever move to an
SSD.  You're also using the MBR scheme, which does not tend to play well
with alignment.
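
For anyone who wants to check the alignment arithmetic themselves, a
small Python sketch follows.  It takes the start sectors from the gpart
output quoted above and assumes 512-byte logical sectors, as reported
by camcontrol; the helper name is_aligned is hypothetical.

```python
LOGICAL_SECTOR = 512  # bytes, per the camcontrol identify output

# Start sectors copied from the quoted "gpart show -p ada1" output.
partitions = {
    "ada1s1": 63,
    "ada1s2": 209715200,
    "ada1s3": 281410605,
}


def is_aligned(start_sector, boundary_bytes, sector_bytes=LOGICAL_SECTOR):
    """True if the partition's byte offset is a multiple of boundary_bytes."""
    return (start_sector * sector_bytes) % boundary_bytes == 0


for name, start in partitions.items():
    print(name,
          "4K-aligned:", is_aligned(start, 4096),
          "1M-aligned:", is_aligned(start, 1024 * 1024))
```

With 512-byte logical sectors, a start sector is 4096-byte aligned
exactly when it is divisible by 8.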

Comparatively, your WD5002ABYS drive **does** use 512-byte sectors (I
know this for a fact).

The problem here is that I cannot guarantee you that alignment is
the cause.  The performance impact of writes to non-aligned partitions
is quite high, and NCQ just exacerbates this.  I would love to tell you
"switch to GPT and follow Warren Block's document***", but if your NTFS
partition holds a Windows installation older than Windows 7, GPT is not
supported.

One piece of evidence against my theory: if the Windows and/or Linux
partitions are something you boot into and use often, I would imagine
NCQ is used in both of those environments, and they would suffer from
the same issue.  Although Windows tends to hide all sorts of transient
errors from the user (sigh), Linux tends to be like FreeBSD with regards
to such issues (on the console anyway; you wouldn't normally see such
messages inside of X).

If you have the time and want to put forth the effort, I would recommend
backing up all your data on ada1, zeroing the first and last 1MByte of
the drive, and then following Warren Block's guide.  I'd just recommend
doing this:

gpart create -s gpt ada1
gpart add -t freebsd-ufs -b 2m ada1
newfs -U -j /dev/ada1p1   (or remove -j if you don't want to use SUJ)

I picked an alignment value of 2MBytes since it's both 4K-aligned and is
generally safe for things like newer SSDs that have larger NAND erase
block sizes.  (I am not going to get into a discussion about that here,
so please stay focused.  :-) )
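
As a quick sanity check of the 2MByte figure, the sketch below works
out which sector "-b 2m" lands on and whether that offset is a multiple
of some common block sizes.  The list of sizes is purely illustrative
(not taken from any vendor's datasheet), 512-byte logical sectors are
assumed, and first_sector is a made-up helper name.

```python
SECTOR = 512                   # assumed logical sector size in bytes
START_BYTES = 2 * 1024 * 1024  # the "-b 2m" offset used above

# Illustrative sizes only: a 4K physical sector plus a few
# power-of-two block sizes up to 2 MiB.
CANDIDATE_SIZES = [4096, 128 * 1024, 512 * 1024,
                   1024 * 1024, 2 * 1024 * 1024]


def first_sector(offset_bytes, sector_bytes=SECTOR):
    """Sector number at which a partition with this byte offset starts."""
    return offset_bytes // sector_bytes


print("start sector:", first_sector(START_BYTES))
for size in CANDIDATE_SIZES:
    print(size, "bytes:", START_BYTES % size == 0)
```

Any power-of-two block size up to 2MBytes divides the offset evenly,
which is the whole reason for picking it over something like sector 63.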

If the problem is gone after that (it should be easy to induce by
writing tons and tons of data to the drive), then we can safely say that
the drive uses 4096-byte sectors and that it needs to be added to the
quirks list in ata_da.c.

If the problem remains after that, then further investigation is needed,
and we can safely rule out alignment.  Welcome to all the pain/effort
one has to go through when troubleshooting things like this.  :-)

Another thing: in your PR you state:

> - I am running with kern.cam.ada.default_timeout=5 which makes the
> computer recover faster

I can definitely imagine cases where a drive using NCQ but doing writes
to a non-aligned partition could take longer than 5 seconds to respond
to an ATA CDB (this is different than a SATA or AHCI layer timeout).  I am
not telling you "change this back to 30", but it might not be helping
your situation at all given my above theory.

Finally: could you please provide output from "smartctl -x /dev/ada1"?
I would like to rule out any possibility of your drive having some other
kind of issue that might cause it to go catatonic.  Thanks.


** -- http://www.seagate.com/files/www-content/support-content/documentation/samsung/tech-specs/eco_greenf2.pdf

*** -- http://www.wonkity.com/~wblock/docs/html/ssd.html

-- 
| Jeremy Chadwick                                   jdc_at_koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |
Received on Wed Apr 03 2013 - 23:05:27 UTC
