Re: PCIe hotplug

From: Gary Palmer <gpalmer_at_freebsd.org>
Date: Tue, 24 Jul 2012 13:51:08 -0400
On Mon, Jul 23, 2012 at 12:45:14AM -0700, Julian Elischer wrote:
> On 7/22/12 9:11 PM, Warner Losh wrote:
> >On Jul 22, 2012, at 9:12 PM, Alexander Kabaev wrote:
> >
> >>On Sun, 22 Jul 2012 20:22:33 -0600
> >>Scott Long <scottl_at_samsco.org> wrote:
> >>
> >>>On Jul 20, 2012, at 8:04 PM, Julian Elischer wrote:
> >>>
> >>>>Is anyone looking at PCIe hotplug support?
> >>>>
> >>>>I'm especially interested if anyone has a strategy for device
> >>>>re-insertion and reassociating the reinserted device with its old
> >>>>device_t so that it gets the same unit number.. (assumes access to
> >>>>a serial number or similar) Even if it is put back into a different
> >>>>slot.
> >>>>
> >>>Would the PCI system be responsible for figuring out this serial
> >>>number?  I don't think that it can, but it's a question to answer, I
> >>>guess.  If it can't then it's up to the driver to generate a unique
> >>>cookie that would be stored by the PCI subsystem.  This cookie would
> >>>have to be based off of data that can be retrieved from the PCI
> >>>config space and/or VPD space, since anything more would require
> >>>resource allocation, which is only allowed in the DEV_ATTACH phase,
> >>>and once you've hit that phase you've already pretty much sealed the
> >>>deal on unit number assignment.
> >>>
> >>>So what would probably happen is that the PCI layer provides a ring
> >>>buffer of cookie storage and a set of accessors for the drivers.  The
> >>>cookies would map to a key-value pair with the device unit name and
> >>>number.  During probe, a driver can look at PCI config space and
> >>>generate a cookie.  That cookie can then be communicated up to the
> >>>PCI layer for storage.  Maybe the driver calls a match routine that
> >>>returns a unit number on match and a store on failure, then the
> >>>driver calls a set_unit_number accessor.  Only the driver that wins
> >>>the bid would win the unit number reassignment or cookie storage.  Or
> >>>maybe the driver passes the cookie up as part of its return code, and
> >>>the match and unit assignment happens automatically.  Drivers that
> >>>don't want to participate in this simply wouldn't, and everything
> >>>would continue to operate the same way.  The two sticky parts are
> >>>rogue/buggy drivers that abuse the api and cause a flood of cookies
> >>>to be generated, and questions on when a unit number is eligible for
> >>>reuse.  For the first one, a ring buffer of cookies would solve the
> >>>immediate problem, but you might still have some risk of drivers
> >>>selectively wrapping the buffer for whatever accidental or evil
> >>>purpose.  For the second problem, maybe a unit number stays
> >>>persistent only if the PCIe hot remove mechanism requests it, and
> >>>then only until the ring-buffer wraps.
> >>>
> >>>Scott
> >>>
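
To make the cookie idea above concrete, here is a rough user-space
sketch in C of a fixed-size ring of cookie -> (driver name, unit)
entries with a match routine and a store routine. Nothing here is an
existing FreeBSD interface: struct cookie_ring, cookie_match(),
cookie_store() and the slot count are all hypothetical names picked
for illustration, and a real version would live in the PCI layer with
proper locking.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define COOKIE_SLOTS 4   /* hypothetical; tiny so that wrapping is easy to see */
#define DRVNAME_MAX  16

struct unit_cookie {
	uint64_t cookie;             /* driver-generated from config space / VPD */
	char     name[DRVNAME_MAX];  /* driver name, e.g. "foo" */
	int      unit;               /* unit number to hand back on re-insertion */
	int      valid;
};

struct cookie_ring {
	struct unit_cookie slot[COOKIE_SLOTS];
	unsigned next;               /* next slot to overwrite when storing */
};

/* Return the remembered unit for (name, cookie), or -1 if not remembered. */
static int
cookie_match(const struct cookie_ring *r, const char *name, uint64_t cookie)
{
	assert(r != NULL && name != NULL);
	for (unsigned i = 0; i < COOKIE_SLOTS; i++) {
		const struct unit_cookie *c = &r->slot[i];
		if (c->valid && c->cookie == cookie && strcmp(c->name, name) == 0)
			return (c->unit);
	}
	return (-1);
}

/* Remember (name, cookie) -> unit; the oldest slot is reused on wrap. */
static void
cookie_store(struct cookie_ring *r, const char *name, uint64_t cookie, int unit)
{
	struct unit_cookie *c = NULL;

	assert(r != NULL && name != NULL);
	/* Update in place if this key is already present. */
	for (unsigned i = 0; i < COOKIE_SLOTS; i++) {
		struct unit_cookie *e = &r->slot[i];
		if (e->valid && e->cookie == cookie && strcmp(e->name, name) == 0) {
			c = e;
			break;
		}
	}
	if (c == NULL) {
		c = &r->slot[r->next];
		r->next = (r->next + 1) % COOKIE_SLOTS;
	}
	c->cookie = cookie;
	strncpy(c->name, name, DRVNAME_MAX - 1);
	c->name[DRVNAME_MAX - 1] = '\0';
	c->unit = unit;
	c->valid = 1;
}
```

A zero-filled struct cookie_ring is an empty ring. The sketch also
shows the weakness mentioned above: any driver that stores
COOKIE_SLOTS new cookies evicts everyone else's entries.
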
> >>I do not think the whole problem as depicted by Julian is even worth
> >>solving. Why keep any data for a device that might _never_ come
> >>back? What if the device hierarchy just starts from the PCI-e and
> >>extends upwards and user still holds on to some vestiges of a previous
> >>device chain (say, by keeping a character control device sharing the
> >>same unit number open, common practice)? Reusing unit number is much
> >>trickier then, and might not be even possible. So, before one jumps
> >>into 'how', can we agree on 'why' first? When device goes away, it is
> >>not just this device's device_t that is disappearing, it is a whole
> >>tree rooted at that device. I see no point in trying to reconstruct
> >>that.
> >There's a reason that PC Card and CardBus never supported this at all.  
> >The assumption was that reconnecting devices is so cheap that it isn't 
> >worth the bother.  This is true for all but some specialized devices 
> >today: network information is easy to reconstruct, storage drives are easy 
> >to reconfigure (since we already fail all in-flight transactions when the 
> >device goes away), etc.  I can see some advantage to having storage cope, 
> >but there are already geom classes that can help people cope when
> >drives go away.
> >
> >>PCI-e hotplug proper is very much orthogonal to the question of unit
> >>numbering and IS worth supporting.
> >Yes.  totally agreed.
> 
> I'm not saying that it's vitally important but was wondering if people 
> had a strategy for it..
> i.e. is it a question worth worrying about?
> 
> In a separate forum Warner and I (yeah I know I'm answering Warner, 
> but I'm addressing the others) discussed the feasibility of surviving 
> an "oops pulled the wrong card" event with regards to a particular 
> flash memory card. I was just carrying that forwards as a thought 
> experiment (There is actually a strategy that sounds feasible).
> 
> The problem of getting a serial number out of the BAR space during 
> probe is also possibly solvable in our case but the question of how 
> long to remember a device is legitimate, and my answer would be that
> 1/ a particular driver would be able to specify whether it could 
> handle this, and
> 2/ it might be limited to some pragmatic number such as 16 or 32, or a 
> time limit.

Why not extend the geom_label idea further?  If there is a serial
number, can that be exposed via /dev somehow so that the problem is
moved out of the kernel space?  That way devd could say "this serial
number gets symlinked to this disk node" (for example).  
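
As a rough illustration of the userland half, the helper below (C,
plain POSIX calls) creates or replaces a symlink named after a serial
number and pointing at a device node. It is only a sketch under
assumptions: link_serial() is a made-up name, how the serial number
reaches userland is left open, and the link directory and device names
are placeholders. A devd action could run something like this once the
serial number is known.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Create (or replace) linkdir/<serial> as a symlink to devnode.
 * Returns 0 on success, -1 on error with errno set.
 * Hypothetical helper, not an existing FreeBSD interface.
 */
static int
link_serial(const char *linkdir, const char *serial, const char *devnode)
{
	char path[1024];
	int n;

	/* Refuse empty serials and ones that could escape linkdir. */
	if (serial[0] == '\0' || strchr(serial, '/') != NULL) {
		errno = EINVAL;
		return (-1);
	}
	n = snprintf(path, sizeof(path), "%s/%s", linkdir, serial);
	if (n < 0 || (size_t)n >= sizeof(path)) {
		errno = ENAMETOOLONG;
		return (-1);
	}
	/* Drop a stale link left from an earlier insertion, if any. */
	if (unlink(path) == -1 && errno != ENOENT)
		return (-1);
	return (symlink(devnode, path));
}
```

With this, the stable name follows the serial number rather than the
unit number, so the kernel does not need to remember anything about a
card that has been pulled.
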

Gary
Received on Tue Jul 24 2012 - 15:51:18 UTC