Re: [RFC] ahci(4) patch

From: Alexander Motin <mav_at_FreeBSD.org>
Date: Thu, 17 Nov 2011 19:02:32 +0200
On 11/17/11 18:35, Maksim Yevmenkin wrote:
> On Thu, Nov 17, 2011 at 8:09 AM, Alexander Motin <mav_at_freebsd.org> wrote:
>> On 11/17/11 01:08, Maksim Yevmenkin wrote:
>>> On Wed, Nov 16, 2011 at 2:59 PM, Alexander Motin <mav_at_freebsd.org> wrote:
>>>> On 17.11.2011 00:44, Maksim Yevmenkin wrote:
>>>>>
>>>>> On Wed, Nov 16, 2011 at 2:35 PM, Alexander Motin<mav_at_freebsd.org>  wrote:
>>>>>>
>>>>>> On 16.11.2011 23:59, Maksim Yevmenkin wrote:
>>>>>>>
>>>>>>> would anyone object to the following ahci(4) patch?
>>>>>>>
>>>>>>> ==
>>>>>>>
>>>>>>> --- ahci.c.orig 2011-11-16 21:35:26.000000000 +0000
>>>>>>> +++ ahci.c      2011-11-16 21:35:41.000000000 +0000
>>>>>>> _at__at_ -500,7 +500,7 _at__at_
>>>>>>>        for (unit = 0; unit<    ctlr->channels; unit++) {
>>>>>>>                if ((ctlr->ichannels&    (1<<    unit)) == 0)
>>>>>>>                        continue;
>>>>>>> -               child = device_add_child(dev, "ahcich", -1);
>>>>>>> +               child = device_add_child(dev, "ahcich", unit);
>>>>>>>                if (child == NULL)
>>>>>>>                        device_printf(dev, "failed to add channel
>>>>>>> device\n");
>>>>>>>                else
>>>>>>>
>>>>>>> ==
>>>>>>>
>>>>>>> the idea is to have "static" numbering for ada(4) disks.
>>>>>>
>>>>>> I do. The only way I see this useful is if you have BIOS configured for
>>>>>> non-hot-swappable disks, in which case you have some AHCI channels
>>>>>> disabled,
>>>>>> but want to keep numbers of the rest. While I don't like this mode in
>>>>>> general, especially when it can't be disabled, that patch could be useful
>>>>>> in
>>>>>> these cases. But in other cases, when you have several AHCI controllers,
>>>>>> it
>>>>>> just wont not work. You will receive error on attempt to create second
>>>>>> ahcich0.
>>>>>
>>>>> shouldn't achcichX be destroyed when disk is detached/removed/etc.?
>>>>
>>>> List of implemented AHCI channels to expose as ahcichX set by BIOS in
>>>> vendor-specific way, but only during boot and not by many BIOSes. Destroying
>>>> them on disk detach theoretically possible, but IMHO not right, as bus
>>>> doesn't disappear on disk disconnect.
>>>>
>>>>> the particular problem i'm trying to address is disk re-numbering when
>>>>> one of the disks fails/removed/etc. i'm trying to use hints to wire
>>>>> disks to controllers/busses. it works perfectly fine with da(4) disks
>>>>> (even hot swappable ones) , but i can not make it to work with ada(4)
>>>>> disks. i'm perfectly fine to hid this under some sort of option
>>>>> (something similar to ATA_STATIC_ID, AHCI_STATIC_ID for example)
>>>>
>>>> Wiring works for adaX also, unless your BIOS is so "intelligent" to report
>>>> unconnected ports and not impliemented.. You should just wire CAM buses to
>>>> ahcichX, not to ahciX. It could be done in other way changing just by one
>>>> line around xpt_bus_register(), but now it is done so.
>>>
>>> ok. then i must be missing something, here is what i have in device.hints
>>>
>>> hint.scbus.0.at="umass-sim0"
>>> hint.scbus.1.at="ahcich0"
>>> hint.scbus.2.at="ahcich1"
>>> hint.scbus.3.at="ahcich2"
>>> hint.scbus.4.at="ahcich3"
>>> hint.scbus.5.at="ahcich4"
>>> hint.scbus.6.at="ahcich5"
>>>
>>> hint.da.0.at="scbus0"
>>> hint.ada.0.at="scbus1"
>>> hint.ada.1.at="scbus2"
>>> hint.ada.2.at="scbus3"
>>> hint.ada.3.at="scbus4"
>>> hint.ada.4.at="scbus5"
>>> hint.ada.5.at="scbus6"
>>>
>>> this is for 6-port ahci(4) compatible controller (intel) on the
>>> motherboard. no matter which achi(4) ports are connected, resulted
>>> adaX devices are always sequential starting from ada0. so, the
>>> question is: what am i doing wrong here?
>>
>> Just put your lines into my loader.conf and got this:
>>
>> %camcontrol devlist -v
>> scbus1 on ahcich0 bus 0:
>> <INTEL SSDSA2M080G2GC 2CV102M3>    at scbus1 target 0 lun 0 (pass0,ada0)
>> <>                                 at scbus1 target -1 lun -1 ()
>> scbus2 on ahcich1 bus 0:
>> <>                                 at scbus2 target -1 lun -1 ()
>> scbus3 on ahcich2 bus 0:
>> <INTEL SSDSA2M080G2GC 2CV102M3>    at scbus3 target 0 lun 0 (pass1,ada2)
>> <>                                 at scbus3 target -1 lun -1 ()
>> scbus4 on ahcich3 bus 0:
>> <>                                 at scbus4 target -1 lun -1 ()
>> scbus5 on ahcich4 bus 0:
>> <>                                 at scbus5 target -1 lun -1 ()
>> scbus6 on ahcich5 bus 0:
>> <>                                 at scbus6 target -1 lun -1 ()
>> ...
>>
>> I see no problem. What am I doing wrong?
>>
>> Let me repeat question not asked directly: Does your BIOS reports unused
>> AHCI channels as implemented? Do you always have all 6 ahcich devices or
>> not?
> 
> i not exactly sure. below is the relevant dmesg. as far as i can tell,
> channels are correct, however, ahcich numbering is sequential and that
> is why hints dont work.
> 
> Nov 17 09:22:51 PREPROD-red1 kernel: ahci0: <Intel Cougar Point AHCI
> SATA controller> port
> 0xf070-0xf077,0xf060-0xf063,0xf050-0xf057,0xf040-0xf043,0xf000-0xf01f
> mem 0xfbb21000-0xfbb217ff irq 19 at device 31.2 on pci0
> Nov 17 09:22:51 PREPROD-red1 kernel: ahci0: attempting to allocate 1
> MSI vectors (1 supported)
> Nov 17 09:22:51 PREPROD-red1 kernel: ahci0: using IRQ 270 for MSI
> Nov 17 09:22:51 PREPROD-red1 kernel: ahci0: AHCI v1.30 with 6 6Gbps
> ports, Port Multiplier not supported
> Nov 17 09:22:51 PREPROD-red1 kernel: ahci0: Caps: 64bit NCQ SNTF AL
> CLO 6Gbps PMD SSC PSC 32cmd EM 6ports
> Nov 17 09:22:51 PREPROD-red1 kernel: ahci0: Caps2: APST
> Nov 17 09:22:51 PREPROD-red1 kernel: ahci0: EM Caps: ALHD XMT SMB LED
> Nov 17 09:22:51 PREPROD-red1 kernel: ahcich0: <AHCI channel> at
> channel 0 on ahci0
> Nov 17 09:22:51 PREPROD-red1 kernel: ahcich0: Caps:
> Nov 17 09:22:51 PREPROD-red1 kernel: ahcich1: <AHCI channel> at
> channel 2 on ahci0
> Nov 17 09:22:51 PREPROD-red1 kernel: ahcich1: Caps:
> Nov 17 09:22:51 PREPROD-red1 kernel: ahcich2: <AHCI channel> at
> channel 5 on ahci0
> Nov 17 09:22:51 PREPROD-red1 kernel: ahcich2: Caps:

Yes, that's it. You should have six ahcichX devices, but have only three
- only for connected devices. Some vendors (Supermicro was first I've
heard about) last time randomly started to make BIOS mark unused AHCI
channels as unimplemented. Sometimes it can be configured in BIOS setup
(it may be called like "hot-swappable"), sometimes not. I believe that
is wrong AHCI spec interpretation.

If it is not configurable in BIOS, we could add respective hint to the
ahci(4) to allow user ignore these fake "implemented" flags to always
present all possible ports, but it is direct AHCI spec violation.

Other possible way is to leave ahcichX device as-is, but instead
register ports in CAM in different fashion -- as buses of the same
controller. Patch may look like this:

--- ahci.c      (revision 227580)
+++ ahci.c      (working copy)
_at__at_ -985,8 +985,8 _at__at_
                goto err1;
        }
        /* Construct SIM entry */
-       ch->sim = cam_sim_alloc(ahciaction, ahcipoll, "ahcich", ch,
-           device_get_unit(dev), &ch->mtx,
+       ch->sim = cam_sim_alloc(ahciaction, ahcipoll, "ahci", ch,
+           device_get_unit(device_get_parent(dev)), &ch->mtx,
            min(2, ch->numslots),
            (ch->caps & AHCI_CAP_SNCQ) ? ch->numslots : 0,
            devq);
_at__at_ -996,7 +996,7 _at__at_
                error = ENOMEM;
                goto err1;
        }
-       if (xpt_bus_register(ch->sim, dev, 0) != CAM_SUCCESS) {
+       if (xpt_bus_register(ch->sim, dev, ch->unit) != CAM_SUCCESS) {
                device_printf(dev, "unable to register xpt bus\n");
                error = ENXIO;
                goto err2;

Then output looks that way:
%camcontrol devlist -v
scbus1 on ahci0 bus 0:
<INTEL SSDSA2M080G2GC 2CV102M3>    at scbus1 target 0 lun 0 (pass0,ada0)
<>                                 at scbus1 target -1 lun -1 ()
scbus2 on ahci0 bus 1:
<>                                 at scbus2 target -1 lun -1 ()
scbus3 on ahci0 bus 2:
<INTEL SSDSA2M080G2GC 2CV102M3>    at scbus3 target 0 lun 0 (pass1,ada2)
<>                                 at scbus3 target -1 lun -1 ()
scbus4 on ahci0 bus 3:
<>                                 at scbus4 target -1 lun -1 ()
scbus5 on ahci0 bus 4:
<>                                 at scbus5 target -1 lun -1 ()
scbus6 on ahci0 bus 5:
<>                                 at scbus6 target -1 lun -1 ()

But in that case two problems remain:
 - per-port device hints still won't be usable, while changing them will
break POLA;
 - it will also break POLA for users who are using wiring now.

-- 
Alexander Motin
Received on Thu Nov 17 2011 - 16:02:43 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:20 UTC