Re: AHCI/ADA regression?

From: Kenneth D. Merry <ken_at_FreeBSD.ORG>
Date: Tue, 24 May 2016 15:10:41 -0400
On Tue, May 24, 2016 at 20:00:51 +0200, Gary Jennejohn wrote:
> On Tue, 24 May 2016 10:41:25 -0400
> "Kenneth D. Merry" <ken_at_FreeBSD.ORG> wrote:
> 
> > > The question in my mind is - why are "empty" multiplier ports being
> > > probed with the new code but not with the old code?  
> > 
> > If the HBA says that it supports port multipliers, the kernel should always
> > look for them.  It probes the port multiplier first, before moving on to
> > look for regular targets.
> > 
> > So, from that standpoint, it should not be any different.  It sounds like
> > we're either getting further in the port multiplier probe process, or there
> > is something different about the way things are behaving.
> > 
> > If you can determine which commands are timing out, that may give us an
> > idea about where it is in the probe process.
> > 
> > Here is one way we may be able to track things down...  Build a kernel with
> > these options:
> > 
> > options	CAMDEBUG
> > options CAM_DEBUG_FLAGS=CAM_DEBUG_PROBE
> > 
> > If you build a kernel before and after the change with those options, it
> > will hopefully allow us to compare the probe sequence and get a clue about
> > where to look for the problem.
> > 
> 
> OK, both the old and new kernel versions do an extremely fast intial
> probe with these results (note: obtained with grep over dmesg.boot):
> 
> (aprobe0:ahcich0:0:15:0): Probe started
> (aprobe0:ahcich0:0:15:0): Probe PROBE_INVALID to PROBE_RESET
> (aprobe1:ahcich1:0:15:0): Probe started
> (aprobe1:ahcich1:0:15:0): Probe PROBE_INVALID to PROBE_RESET
> (aprobe2:ahcich2:0:15:0): Probe started
> (aprobe2:ahcich2:0:15:0): Probe PROBE_INVALID to PROBE_RESET
> (aprobe3:ahcich3:0:15:0): Probe started
> (aprobe3:ahcich3:0:15:0): Probe PROBE_INVALID to PROBE_RESET
> (aprobe4:ahcich4:0:15:0): Probe started
> (aprobe4:ahcich4:0:15:0): Probe PROBE_INVALID to PROBE_RESET
> (aprobe5:ahcich5:0:15:0): Probe started
> (aprobe5:ahcich5:0:15:0): Probe PROBE_INVALID to PROBE_RESET
> (aprobe4:ahcich4:0:15:0): Probe PROBE_RESET to PROBE_RESET
> (aprobe4:ahcich4:0:15:0): Probe PROBE_RESET to PROBE_INVALID
> (aprobe4:ahcich4:0:15:0): Probe completed
> (aprobe4:ahcich4:0:0:0): Probe started
> (aprobe4:ahcich4:0:0:0): Probe PROBE_INVALID to PROBE_RESET
> (aprobe4:ahcich4:0:0:0): Probe PROBE_RESET to PROBE_INVALID
> (aprobe4:ahcich4:0:0:0): Probe completed
> (aprobe0:ahcich0:0:15:0): Probe PROBE_RESET to PROBE_RESET
> (aprobe2:ahcich2:0:15:0): Probe PROBE_RESET to PROBE_RESET
> (aprobe0:ahcich0:0:15:0): SIGNATURE: 0000
> (aprobe0:ahcich0:0:15:0): Probe PROBE_RESET to PROBE_INVALID
> (aprobe0:ahcich0:0:15:0): Probe completed
> (aprobe2:ahcich2:0:15:0): SIGNATURE: 0000
> (aprobe2:ahcich2:0:15:0): Probe PROBE_RESET to PROBE_INVALID
> (aprobe2:ahcich2:0:15:0): Probe completed
> (aprobe0:ahcich0:0:0:0): Probe started
> (aprobe0:ahcich0:0:0:0): Probe PROBE_INVALID to PROBE_RESET
> (aprobe2:ahcich2:0:0:0): Probe started
> (aprobe2:ahcich2:0:0:0): Probe PROBE_INVALID to PROBE_RESET
> (aprobe3:ahcich3:0:15:0): Probe PROBE_RESET to PROBE_RESET
> (aprobe5:ahcich5:0:15:0): Probe PROBE_RESET to PROBE_RESET
> (aprobe0:ahcich0:0:0:0): SIGNATURE: 0000
> (aprobe0:ahcich0:0:0:0): Probe PROBE_RESET to PROBE_IDENTIFY
> (aprobe2:ahcich2:0:0:0): SIGNATURE: 0000
> (aprobe2:ahcich2:0:0:0): Probe PROBE_RESET to PROBE_IDENTIFY
> (aprobe1:ahcich1:0:15:0): Probe PROBE_RESET to PROBE_RESET
> (aprobe3:ahcich3:0:15:0): SIGNATURE: 0000
> (aprobe3:ahcich3:0:15:0): Probe PROBE_RESET to PROBE_INVALID
> (aprobe3:ahcich3:0:15:0): Probe completed
> (aprobe5:ahcich5:0:15:0): SIGNATURE: 0000
> (aprobe5:ahcich5:0:15:0): Probe PROBE_RESET to PROBE_INVALID
> (aprobe5:ahcich5:0:15:0): Probe completed
> (aprobe1:ahcich1:0:15:0): SIGNATURE: eb14
> (aprobe1:ahcich1:0:15:0): Probe PROBE_RESET to PROBE_INVALID
> (aprobe1:ahcich1:0:15:0): Probe completed
> (aprobe1:ahcich3:0:0:0): Probe started
> (aprobe1:ahcich3:0:0:0): Probe PROBE_INVALID to PROBE_RESET
> (aprobe3:ahcich5:0:0:0): Probe started
> (aprobe3:ahcich5:0:0:0): Probe PROBE_INVALID to PROBE_RESET
> (aprobe4:ahcich1:0:0:0): Probe started
> (aprobe4:ahcich1:0:0:0): Probe PROBE_INVALID to PROBE_RESET
> (aprobe1:ahcich3:0:0:0): SIGNATURE: 0000
> (aprobe1:ahcich3:0:0:0): Probe PROBE_RESET to PROBE_IDENTIFY
> (aprobe3:ahcich5:0:0:0): SIGNATURE: 0000
> (aprobe3:ahcich5:0:0:0): Probe PROBE_RESET to PROBE_IDENTIFY
> (aprobe4:ahcich1:0:0:0): SIGNATURE: eb14
> (aprobe4:ahcich1:0:0:0): Probe PROBE_RESET to PROBE_IDENTIFY
> (aprobe0:ahcich0:0:0:0): Probe PROBE_IDENTIFY to PROBE_SETMODE
> (aprobe1:ahcich3:0:0:0): Probe PROBE_IDENTIFY to PROBE_SETMODE
> (aprobe3:ahcich5:0:0:0): Probe PROBE_IDENTIFY to PROBE_SETMODE
> (aprobe0:ahcich0:0:0:0): Probe PROBE_SETMODE to PROBE_SET_MULTI
> (aprobe1:ahcich3:0:0:0): Probe PROBE_SETMODE to PROBE_SETDMAAA
> (aprobe3:ahcich5:0:0:0): Probe PROBE_SETMODE to PROBE_SETDMAAA
> (aprobe0:ahcich0:0:0:0): Probe PROBE_SET_MULTI to PROBE_DONE
> (aprobe0:ahcich0:0:0:0): Probe completed
> (aprobe1:ahcich3:0:0:0): Probe PROBE_SETDMAAA to PROBE_SET_MULTI
> (aprobe4:ahcich1:0:0:0): Probe PROBE_IDENTIFY to PROBE_SETMODE
> (aprobe3:ahcich5:0:0:0): Probe PROBE_SETDMAAA to PROBE_SET_MULTI
> (aprobe1:ahcich3:0:0:0): Probe PROBE_SET_MULTI to PROBE_DONE
> (aprobe1:ahcich3:0:0:0): Probe completed
> (aprobe3:ahcich5:0:0:0): Probe PROBE_SET_MULTI to PROBE_DONE
> (aprobe3:ahcich5:0:0:0): Probe completed
> (aprobe2:ahcich2:0:0:0): Probe PROBE_IDENTIFY to PROBE_SETMODE
> (aprobe4:ahcich1:0:0:0): Probe PROBE_SETMODE to PROBE_INQUIRY
> (aprobe2:ahcich2:0:0:0): Probe PROBE_SETMODE to PROBE_SETDMAAA
> (aprobe2:ahcich2:0:0:0): Probe PROBE_SETDMAAA to PROBE_SET_MULTI
> (aprobe2:ahcich2:0:0:0): Probe PROBE_SET_MULTI to PROBE_DONE
> (aprobe2:ahcich2:0:0:0): Probe completed
> (aprobe4:ahcich1:0:0:0): Probe PROBE_INQUIRY to PROBE_FULL_INQUIRY
> (aprobe4:ahcich1:0:0:0): Probe PROBE_FULL_INQUIRY to PROBE_DONE
> (aprobe4:ahcich1:0:0:0): Probe completed
> 
> it looks like the code is basically checking to see how many ports are
> on a multiplier since it looks at port 0 and port 15.

Okay.

> [I don't have a dmesg.boot for the "broken" kernel, but I saw the
> probe messages going shooting past and it was obviuos that they
> were the same]
> 
> The old code stops there.
> 
> The new code starts probing for what appear to be random ports on a
> multiplier, like 0, 7, 20, 17, etc.  These always result in timeouts.

I'm guessing you would have seen "SIGNATURE: 9669" in the logs with the new
code in that case.  That is the only way we should have gotten into probing
a port multiplier.

> These probes always start with "Probe PROBE_INVALID to PROBE_IDENTIFY"
> Why the new code doesn't accept the PROBE_INVALID and skip over it
> I don't understand.

PROBE_INVALID is a state in the ATA probe state machine.  It is where all
of the probes start, and where probes end if they have "failed" in the
sense that we're not going to continue.

The probe messages are intermixed, so you'd have to match them up by the
probe peripheral driver instance number (aprobe2, aprobe4, etc.) to see the
sequence for one particular device.

> After I noticed that the same ports had been probed several times on
> achich2 with no apparent end in sight I rebooted.  This behavior is
> extremely poor.

Can you send full dmesg output from the working kernel?

It looks like you have some ATAPI devcies on your machine (signature eb14).
They would likely be attaching to the da(4) driver if they are disks, and
that is a different code path.

Thanks,

Ken
-- 
Kenneth Merry
ken_at_FreeBSD.ORG
Received on Tue May 24 2016 - 17:10:50 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:05 UTC