Re: panic: mutex Giant not owned at /usr/src/sys/kern/tty_ttydisc.c:1127

From: Maksim Yevmenkin <maksim.yevmenkin_at_gmail.com>
Date: Fri, 23 Jan 2009 12:01:44 -0800
On Fri, Jan 23, 2009 at 11:29 AM, Hans Petter Selasky <hselasky_at_c2i.net> wrote:
> Hi Maksim,
>
> First of all I've made some patches which try to tidy up the USB blutooth
> driver. Please see:
>
> http://perforce.freebsd.org/chv.cgi?CH=156577
> http://perforce.freebsd.org/chv.cgi?CH=156579

thanks! unfortunately, i have few problems with those patches. please
read below.

>> that should not be needed (in theory) but i will add it.
>
> Yes! We are multithreaded!

please tell me that you read the blob about locking strategy at the
top of ng_ubt2.c file. when ubt_task() is pending, it holds a
reference to the netgraph node. that means the pointer is still
pointing to a valid structure, but the node is marked as "dead" and
NG_NODE[_NOT]_VALID() macros can be use to check for that. so, even if
task is still pending while the rest of ng_ubt2 stuff is gone, it does
not matter because task holds netgraph node. so the next time task is
executed it will see that the node is "dead", release the node and
simply return doing nothing.

>> > 2) You drain all USB transfers: "usb2_transfer_drain()" called unlocked.
>>
>> yes, usb2_transfer_unsetup() does that, does it not?
>
> That is correct, but sometimes you need to stop the transfers at an earlier
> point. It depends on your design.

when detach() is called its already out of driver's hands. all it can
do now is to call unsetup(), or am i missing something here?

> I see in your code that you try to do things Asynchronously. This complicates
> stuff quite a lot. In my patches I've tried to do some things synchronously.

no, No, NO (sorry for shouting). we _CAN_NOT_ sleep in netgraph.
period. you _CAN_NOT_ grab usb interface mutexes unless you can
absolutely guarantee that they will not sleep for any significant
amount of time. netgraph is essentially single threaded and by
sleeping you will stall entire netgraph subsystem and not just current
thread.

so, objection #1 (the biggest one) is: please leave sc_mbufq_mtx and
ubt_task() glue in place. it is needed as far as i can tell.

>> i'm also deeply confused about error handling in transfers. in
>> particular, any error that is not USB_ERR_CANCELLED makes code to
>> assume that it was a stall and queue clear stall transfer. that is
>> clearly not the case when hardware is yanked while transfer is active.
>> in this case transfer callback seems to be called with USB_ERR_IOERROR
>> (or something like that). so, shouldn't we be safe and only queue
>> stall transfer if callback was in fact called with USB_ERR_STALLED?
>
> 1) Any non-cancelled error goes through the standard clear-stall procedure.
> The clear stall transfers are niced for 50ms interval, see the "interval"
> field in the "usb2_config" structure! We do the clear stall, because it puts
> some delay between the error and the re-start of the transfers. We don't want
> to end up in a case where the transfer simply stops because of a silly CRC
> error. Or that we go into fast race with the hardware on errors.

ok, that makes sense. except the case when hardware is gone and
transfer callback is called with IOERROR (or something like that). i
guess the question is can callback be called with IOERROR for any
reason other than hardware departure?

so, back to your patches.


objection #2 (somewhat major). please do NOT remove NG_NODE_REF() call
in detach() before calling ng_rmnode_self() (unless you remove
NG_NODE_UNREF() below as well). again the reason here is to tell
netgraph node that we are dying, but make sure we keep node_p pointer
pointing to a valid structure as long as possible. to summarize,

NG_NODE_REF(node) <- grab our reference to ensure node pointer remains valid
ng_rmnode_self(node) <- tell node we are dying, mark node as "dead"
but NOT free node structure
/* do other stuff */
NG_NODE_UNREF(node) <- drop our reference and possibly free node pointer


more of a question than objection. why you insist on having single
mutex for both interfaces? isoc transfers callbacks will be called
much more often that control, bulk and interrupt callbacks, so
wouldn't it make sense to use different lock for isoc transfers?


changes in attach() and usb2_config structures are fine (i guess). but
please keep asynchronous design in place. it is required.

thanks,
max
Received on Fri Jan 23 2009 - 19:01:44 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:40 UTC