Re: panic: mutex Giant not owned at /usr/src/sys/kern/tty_ttydisc.c:1127

From: Maksim Yevmenkin <maksim.yevmenkin_at_gmail.com>
Date: Sat, 24 Jan 2009 01:38:03 -0800
Hans Petter,

>> i'm sorry, did i mention there is no sleeping in netgraph? :-)
>>
>> again, sorry, but this is not going to work. you still doing
>> UBT_LOCK()/UBT_UNLOCK() in netgraph method. that is you are trying to
>> grab the same lock that is used for locking interfaces.
>
> Yes, you have to do that. Else you end up with a state nightmare with the
> taskqueue where you don't know in the end if you should start or stop the USB
> interface!

exactly!

>> more importantly, like i said before, usb2_transfer_stop() does
>> USB_BUS_LOCK()/USB_BUS_UNLOCK(). is there a guarantee that another usb
>> device will not do something synchronously over the same usb bus?
>
> Every time something is done synchronously the USB_BUS_LOCK() is dropped. This
> mutex is only used when filling out DMA structures and when touching USB
> hardware registers. I cannot see that this will take any time at all. We are
> talking about microseconds of congestion time! USB_BUS_LOCK() is a mutex and
> not a SX lock! A mutex cannot sleep in the same way an SX lock can.

from what i can see you are _NOT_ using _SPIN_ mutexes (aka spin
locks). you are using regular mutexes. let me quote locking(9) man
page

"
Mutexes
     Basically (regular) mutexes will deschedule the thread if the mutex can
     not be acquired.  A non-spin mutex can be considered to be equivalent to
     getting a write lock on an rw_lock (see below),
"

so, if thread can not get mutex it will be descheduled. this
absolutely can not happen in netgraph!

>> > I have a question. What is the following doing at the middle of the
>> > ubt_detach():
>> >
>> >        if (node != NULL)
>> >                NG_NODE_UNREF(node);
>> >
>> > If you say that:
>> >
>> >                ng_rmnode_self(node);
>> >
>> > Cleans up the last reference?
>>
>> the complete code is
>>
>> node_p  node = sc->sc_node;
>>
>> if (node != NULL) {
>>    sc->sc_node = NULL;  <-- clear sc_node
>>
>>    NG_NODE_SET_PRIVATE(node, NULL);
>>    NG_NODE_REALLY_DIE(node);
>>
>>    NG_NODE_REF(node); <--- grab +1 reference
>>    ng_rmnode_self(node); <--- mark node as "dead", but not ensure its
>> not free()d
>> }
>>
>> /* bla, bla */
>>
>> if (node != NULL)
>>    NG_NODE_UNREF(node); <--- drop 1 reference and possibly free() node
>
> Yes, but you are already dropping an extra reference in ubt_shutdown(). What
> about that?

shutdown method is called as part of ng_rmnode_self() and drop the
reference that node was born with. the extra reference before
ng_rmnode_self() is to ensure that node pointer is still valid after
ng_rmnode_self() returns. otherwise there is a change that node
pointer becomes invalid while after ng_rmnode_self() calls shutdown
method.

[...]

> I think you misunderstand. I'm using mutexes. They will not block for very
> long! There are no DELAY()'s with mutexes held! When another USB device is
> attached / detached it is:
>
> 1) Done from another thread
> 2) The USB locks in the critical path are dropped when waiting for wakeup or
> doing so called synchronous stuff.
>>
>> it would be if we could sleep in netgraph, and we can not.
>
> Do mutexes sleep? No? So my code should be fine?

yes, regular mutexes sleep.

>> in theory,
>> we only need one extra reference which would tell us if there anything
>> in usb2 that still could access node pointer. in practice, instead of
>> checking every transfer and making sure its no pending before dropping
>> that extra reference i just add multiple references for each usb2
>> transfer and ubt_task (i.e. things that could access node pointer).
>
> I'm just thinking that this is starting to get complicated, and please
> understand I've spent much time on detach issues, and there is alot of
> builtin funcionality inside USB which will handle start/stop/re-start issues
> automagically. I see it like duplicate code when you check for USB transfer
> re-start in netgraph ???

first of all, i do not think crashes are caused by detach(). in fact,
detach() is clean. i've tested it and it worked for me. i tried to
start/stop device while doing flood l2ping. i also tried to yank the
device while doing flood l2ping. it works correctly. i think, the
issue is related to stalled transfers. there is still something wrong
with the code path that deals with stalled transfers. stalls do not
happen on my test box, so i can not test it. also there is NO code
duplication. asynchronous path is required to decouple netgraph from
usb2.

>> right, and what if some other usb device attached to the same usb
>> controller is doing something synchronously? do you see that you could
>> potentially block netgraph for arbitrary time and that is really out
>> of ng_ubt2 driver control?
>
> Again, USB_BUS_LOCK() is a mutex, not an SX lock. All code using this lock is
> called from High Priority threads! See explanation further up. And will not
> block very long at all.

regular mutexes can sleep. we are not allowed to sleep in netgraph.
therefor we must transition out of netgraph context before calling
into any code that tries to grab regular mutex. the async design is
there not because i want to make things complicated. its there because
it is needed.

thanks,
max
Received on Sat Jan 24 2009 - 08:38:03 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:41 UTC