Re: device_attach(9) and driver initialization

From: Warner Losh <imp_at_bsdimp.com> Date: Mon, 9 Apr 2012 09:58:14 -0600 · This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:25 UTC

On Apr 9, 2012, at 9:42 AM, Ian Lepore wrote:

> On Mon, 2012-04-09 at 17:59 +0300, Konstantin Belousov wrote:
>> On Mon, Apr 09, 2012 at 08:41:15AM -0600, Ian Lepore wrote:
>>> On Sun, 2012-04-08 at 06:58 +0300, Konstantin Belousov wrote:
>>>> On Sat, Apr 07, 2012 at 09:10:55PM -0600, Warner Losh wrote:
>>>>> 
>>>>> On Apr 7, 2012, at 8:57 AM, Konstantin Belousov wrote:
>>>>> 
>>>>>> On Sat, Apr 07, 2012 at 08:46:41AM -0600, Ian Lepore wrote:
>>>>>>> On Sat, 2012-04-07 at 15:50 +0300, Konstantin Belousov wrote:
>>>>>>>> Hello,
>>>>>>>> there seems to be a problem with device attach sequence offered by newbus.
>>>>>>>> Basically, when device attach method is executing, device is not fully
>>>>>>>> initialized yet. Also the device state in the newbus part of the world
>>>>>>>> is DS_ALIVE. There is definitely no shattering news in the statements,
>>>>>>>> but drivers that e.g. create devfs node to communicate with consumers
>>>>>>>> are prone to a race.
>>>>>>>> 
>>>>>>>> If /dev node is created inside device attach method, then usermode
>>>>>>>> can start calling cdevsw methods before device fully initialized itself.
>>>>>>>> Even more, if device tries to use newbus helpers in cdevsw methods,
>>>>>>>> like device_busy(9), then panic occurs "called for unatteched device".
>>>>>>>> I get reports from users about this issues, to it is not something
>>>>>>>> that only could happen.
>>>>>>>> 
>>>>>>>> I propose to add DEVICE_AFTER_ATTACH() driver method, to be called
>>>>>>>> from newbus right after device attach finished and newbus considers
>>>>>>>> the device fully initialized. Driver then could create devfs node
>>>>>>>> in the after_attach method instead of attach. Please see the patch below.
>>>>>>>> 
>>>>>>>> diff --git a/sys/kern/device_if.m b/sys/kern/device_if.m
>>>>>>>> index eb720eb..9db74e2 100644
>>>>>>>> --- a/sys/kern/device_if.m
>>>>>>>> +++ b/sys/kern/device_if.m
>>>>>>>> _at__at_ -43,6 +43,10 _at__at_ INTERFACE device;
>>>>>>>> # Default implementations of some methods.
>>>>>>>> #
>>>>>>>> CODE {
>>>>>>>> +	static void null_after_attach(device_t dev)
>>>>>>>> +	{
>>>>>>>> +	}
>>>>>>>> +
>>>>>>>> 	static int null_shutdown(device_t dev)
>>>>>>>> 	{
>>>>>>>> 	    return 0;
>>>>>>>> _at__at_ -199,6 +203,21 _at__at_ METHOD int attach {
>>>>>>>> };
>>>>>>>> 
>>>>>>>> /**
>>>>>>>> + * _at_brief Notify the driver that device is in attached state
>>>>>>>> + *
>>>>>>>> + * Called after driver is successfully attached to the device and
>>>>>>>> + * corresponding device_t is fully operational. Driver now may expose
>>>>>>>> + * the device to the consumers, e.g. create devfs nodes.
>>>>>>>> + *
>>>>>>>> + * _at_param dev		the device to probe
>>>>>>>> + *
>>>>>>>> + * _at_see DEVICE_ATTACH()
>>>>>>>> + */
>>>>>>>> +METHOD void after_attach {
>>>>>>>> +	device_t dev;
>>>>>>>> +} DEFAULT null_after_attach;
>>>>>>>> +
>>>>>>>> +/**
>>>>>>>> * _at_brief Detach a driver from a device.
>>>>>>>> *
>>>>>>>> * This can be called if the user is replacing the
>>>>>>>> diff --git a/sys/kern/subr_bus.c b/sys/kern/subr_bus.c
>>>>>>>> index d485b9f..6d849cb 100644
>>>>>>>> --- a/sys/kern/subr_bus.c
>>>>>>>> +++ b/sys/kern/subr_bus.c
>>>>>>>> _at__at_ -2743,6 +2743,7 _at__at_ device_attach(device_t dev)
>>>>>>>> 	dev->state = DS_ATTACHED;
>>>>>>>> 	dev->flags &= ~DF_DONENOMATCH;
>>>>>>>> 	devadded(dev);
>>>>>>>> +	DEVICE_AFTER_ATTACH(dev);
>>>>>>>> 	return (0);
>>>>>>>> }
>>>>>>>> 
>>>>>>> 
>>>>>>> Does device_get_softc() work before attach is completed?  (I don't have
>>>>>>> time to go look in the code right now).  If so, then a mutex initialized
>>>>>>> and acquired early in the driver's attach routine, and also acquired in
>>>>>>> the driver's cdev implementation routines before using any newbus
>>>>>>> functions other than device_get_softc(), would solve the problem without
>>>>>>> a driver api change that would make it harder to backport/MFC driver
>>>>>>> changes.
>>>>>> No, 'a mutex' does not solve anything. It only adds enourmous burden
>>>>>> on the driver developers, because you cannot sleep under mutex. Changing
>>>>>> the mutex to the sleepable lock also does not byy you much, since
>>>>>> you need to somehow solve the issues with some cdevsw call waking up
>>>>>> thread sleeping into another cdevsw call, just for example.
>>>>>> 
>>>>>> Singlethreading a driver due to this race is just silly.
>>>>>> 
>>>>>> And, what do you mean by 'making it harder to MFC' ? How ?
>>>>> 
>>>>> driver_attach()
>>>>> {
>>>>> 	...
>>>>> 	softc->flags = 0; // redundant, since softc is initialized to 0.
>>>>> 	softc->cdev = device_create...();
>>>>> 	...
>>>>> 	softc->flags |= READY;
>>>>> }
>>>>> 
>>>>> driver_open(...)
>>>>> {
>>>>> 	if (!(softc->flags & READY))
>>>>> 		return ENXIO;
>>>>> 	...
>>>>> }
>>>>> 
>>>>> What's the big burden here?
>>>> The burden is that your proposal does not work. As I described above,
>>>> device_busy() calls from cdevsw method panic if open() is called before
>>>> DS_ATTACHED is set by newbus. And, DS_ATTACHED is only set after device
>>>> attach() method returned, so no workarounds from attach() could solve
>>>> this.
>>> 
>>> One thing that keeps floating to the front of my brain is that all the
>>> proposals so far (including my not-well-thought-out mutex suggestion)
>>> requires changing every existing driver to get the new safe behavior.
>>> 
>>> Hmmm.  Looking at the code, not very many drivers call device_busy().
>>> Why is that?  
>>> 
>>> I agree that calling device_create() should be deferred until the driver
>> You mean make_dev(9), I assume.
>> 
> 
> Yes, of course, sorry (I was looking at a call to a local convenience
> routine in a private driver).
> 
>>> is ready to handle requests.  That's only part of the fix if the newbus
>>> support routines are still going to have a window where they can panic
>>> because the internal state variables haven't yet transitioned to the
>>> correct state.  
>>> 
>>> Also, the implementation of device_busy() looks to be unsafe unless it's
>>> being implicitly protected by some locking in a call chain that isn't
>>> jumping out at me with simple grepping of the code.  For example,
>>> concurrent callers in a device's open() and close() methods for a driver
>>> that calls busy/unbusy from cdev open/close could leave the parent
>>> device's busy count in an indeterminate state.  Could fixing this by
>>> enforcing single threading through busy/unbusy provide an opportunity to
>>> fix the original problem by having the attach code acquire the same
>>> lock, so that an early call to a cdev method that invokes device_busy()
>>> ends up sleeping until the attach routine returns?  That way you don't
>>> single-thread the whole cdev open/close handling, just the part that's
>>> currently causing a problem.
>> I think device_busy() calls require Giant, the same as the whole newbus.
>> The absence of GIANT_REQUIRED assertions in device_busy() and device_unbusy()
>> looks like overlook.
> 
> Hmmm.  Shouldn't device_attach() then require Giant as well?  I see that
> device_detach() and device_probe_and_attach() contains GIANT_REQUIRED
> but device_attach() does not.  

They both require Giant because the device tree is only locked via Giant.  Pretty much all newbus functions require Giant to be held when you call them.  This is poorly enforced in the code, however.  And also poorly documented.  The reasons this wasn't well documented was the promise of newbus locking that never made it into the tree, but also blocked efforts to improve the documentation here, as well as adding additional needs giant stuff.

> Of the few devices calling device_busy(), some use D_NEEDGIANT and some
> do not (the do-nots include drm, fdc, rp -- I stopped looking at this
> point).  Only the fdc code specifically calls mtx_lock(&Giant), but it
> appears that it does not do so around the device_busy/unbusy() calls.
> 
> If you're supposed to hold Giant during calls to device_busy/unbusy, and
> assuming it should be held by whomever calls device_attach(), wouldn't
> that fix the panic?  I don't know if that's a good assumption, because
> I'm not seeing anything in the manpages or architecture handbook about
> holding Giant during newbus calls

Since we're recursively traveling up the tree, we need some topology lock in order to ensure tree consistency.  Giant is that lock, for better or worse.  At one point, and I'm not sure if this is still true, the rule was either Giant being held, or interrupts haven't been enabled yet.  Now that we enable interrupts earlier, I think this has become 'Giant has to be held'.

That's one reason I rarely use device_busy and prefer to use other methods of blocking unload and early entrance to cdevsw in drivers I've written that I care about these cases.

Warner