Re: panic: mutex Giant not owned at /usr/src/sys/kern/tty_ttydisc.c:1127

From: Hans Petter Selasky <hselasky_at_c2i.net> Date: Fri, 23 Jan 2009 23:37:04 +0100 · This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:40 UTC

Hi Maksim,

On Friday 23 January 2009, Maksim Yevmenkin wrote:

>
> i've added taskqueue_drain() to detach() in my "stall" diff that i
> posted few hours back. is that addresses some of your concerns? 

Yes, at detach you always need to drain by principle.

> also 
> keep in mind that while netgraph node is created and destroyed
> primarily by the driver in response to hardware arrival and departure,
> nothing protects it from the netgraph side. in other words there is
> another way to manipulate netgraph node using netgraph api. for
> example user can request shutdown of a netgraph node, etc. so all of
> this has to be taking into consideration.

Ok, I've fixed this: Destroy is now non-blocking USB-wise. See:

http://perforce.freebsd.org/chv.cgi?CH=156586

>
> in fact, if i understand you correctly, with addition of
> taskqueue_drain() to detach(), the later becomes completely
> synchronous. that is
>
> 1) NG_NODE_REF()
> 2) ng_rmnode_self (i.e. mark node as possibly dead, detach all the
> hooks, etc. etc.)
> 3) taskqueue_drain() (i.e. sleep until task has finished)
> 4) usb2_transfer_unsetup() (which does "usb2_transfer_drain() and sleeps)
> 5) NG_NODE_UNREF()
> 6) free everything up
>
> in theory, items (1) and (5) above just an additional protection
> because i was not sure about inner workings of usb2. i think, when we
> sort everything out they can be removed.

I have a question. What is the following doing at the middle of the 
ubt_detach():

	if (node != NULL)
		NG_NODE_UNREF(node);

If you say that:

		ng_rmnode_self(node);

Cleans up the last reference?

> >> >> > 2) You drain all USB transfers: "usb2_transfer_drain()" called
> >> >> > unlocked.
> >> >>
> >> >> yes, usb2_transfer_unsetup() does that, does it not?
> >> >
> >> > That is correct, but sometimes you need to stop the transfers at an
> >> > earlier point. It depends on your design.
>
> right
>
> >> when detach() is called its already out of driver's hands. all it can
> >> do now is to call unsetup(), or am i missing something here?
> >
> > The correct thing is to do a usb2_transfer_drain() which will wait until
> > the callback has been called back, if a transfer is pending.
>
> i do not understand. if usb2_transfer_unsetup() calls
> usb2_transfer_drain() and sleeps, then what difference does it make in
> detach()?

You are right that USB-wise it does not matter if you call 
usb2_transfer_drain() separately or if you call usb2_transfer_unsetup(). I'm 
just thinking with regard to the state inside bluetooth, because there are 
two contexts running in parallell. I.E. Callbacks might be executed in 
paralell to bluetooth tearing down.

> i just followed original code that did it this way, i.e. 
> drain and unsetup all transfers in one call.

Yes, that's fine.

>
> > In USB2 a "usb2_start_hardware()" call is always paired with a USB
> > transfer callback, no matter what you do! This way of design has been
> > chosen so that you can do refcounts inside the callback!
>
> ok, and that is exactly what code does. the only thing is that
> refcounts are on netgraph node and not on softc structure. again the
> reason for that is because netgraph node can be accessed from both
> usb2 driver and netgraph subsystem directly.

Isn't it enough to increment the refcount at attach and decrement it at 
detach ????

> > > I see in your code that you try to do things Asynchronously. This
> > >
> >> > complicates stuff quite a lot. In my patches I've tried to do some
> >> > things synchronously.
> >>
> >> no, No, NO (sorry for shouting). we _CAN_NOT_ sleep in netgraph.
> >> period. you _CAN_NOT_ grab usb interface mutexes unless you can
> >> absolutely guarantee that they will not sleep for any significant
> >> amount of time.
> >
> > usb2_transfer_start() and usb2_transfer_stop() are non-blocking and will
> > never sleep. The exception is "usb2_transfer_drain()" which can sleep for
> > a few milliseconds and is only called when the hook is disconnected. I do
> > not see that as a problem versus having "X" more checks in the code.
>
> well, when i looked at the code, i saw that both functions do
> USB_BUS_LOCK() and USB_BUS_UNLOCK(). 

Well, this is the lock of the USB IRQ handler. One per USB controller. There 
are no more locks than this that will be locked. The IRQ handler is called 
from a high priority context and will not block very long. Same with USB 
callbacks.

> i do not know enough about those 
> locks and usb2 in general, so i decided to err on the side of caution
> and move all the operations that could potentially sleep into the
> taskqueue context. this actually completely decouples netgraph from
> usb2, and that is a "good thing (tm)" imo :) a little bit of
> complexity is totally justified here imo.

Ok, I see.

>
> also, now that you mentioned it, i should call usb2_transfer_drain()
> in ubt_task() instead of usb2_transfer_stop() to make transfer stop
> completely synchronous as well.

When I think about it, usb2_transfer_stop() is enough. See my latest Patch in 
P4.

> please let me reiterate, sleeping in netgraph is NOT allowed. even if
> its for a few milliseconds. please accept it as a fact. think of all
> netgraph methods as fast interrupt handlers. it is very typical for
> fast interrupt handlers to do minimal amount of work and schedule swi
> to do more heavy processing. that is exactly what the code does here.

Fixed!

>
> there is no NG_NODE_REF in attach(). when we create node is comes with
> one reference and that reference is removed by ng_rmnode_self().
> please read my comments about detach() above. hopefully it will make
> sense now.

Yes, I'm starting to understand it.

>
> i do not really have strong opinion about it. i just thought there
> would be less contention between isoc and bulk/intr/ctrl transfers
> when they use different locks. probably does not matter, since
> everything is going over the same bus. i'm fine with this change utill
> someone has credible evidence otherwise.

Ok.

--HPS