Re: panic: mutex Giant not owned at /usr/src/sys/kern/tty_ttydisc.c:1127

From: Julian Elischer <julian_at_elischer.org>
Date: Sat, 24 Jan 2009 15:06:48 -0800
Maksim Yevmenkin wrote:
> Hans Petter,
> 
>>> i'm sorry, did i mention there is no sleeping in netgraph? :-)
>>>
>>> again, sorry, but this is not going to work. you still doing
>>> UBT_LOCK()/UBT_UNLOCK() in netgraph method. that is you are trying to
>>> grab the same lock that is used for locking interfaces.
>> Yes, you have to do that. Else you end up with a state nightmare with the
>> taskqueue where you don't know in the end if you should start or stop the USB
>> interface!
> 
> exactly!
> 
>>> more importantly, like i said before, usb2_transfer_stop() does
>>> USB_BUS_LOCK()/USB_BUS_UNLOCK(). is there a guarantee that another usb
>>> device will not do something synchronously over the same usb bus?
>> Every time something is done synchronously the USB_BUS_LOCK() is dropped. This
>> mutex is only used when filling out DMA structures and when touching USB
>> hardware registers. I cannot see that this will take any time at all. We are
>> talking about microseconds of congestion time! USB_BUS_LOCK() is a mutex and
>> not a SX lock! A mutex cannot sleep in the same way an SX lock can.
> 
> from what i can see you are _NOT_ using _SPIN_ mutexes (aka spin
> locks). you are using regular mutexes. let me quote locking(9) man
> page
> 
> "
> Mutexes
>      Basically (regular) mutexes will deschedule the thread if the mutex can
>      not be acquired.  A non-spin mutex can be considered to be equivalent to
>      getting a write lock on an rw_lock (see below),
> "
> 
> so, if thread can not get mutex it will be descheduled. this
> absolutely can not happen in netgraph!

Having said htat in the past I think that things have progressed to 
the point where this needs clarification.
Mutexes are now ok as long as the transitive set doesn't include 
anything that takes too long.

this has probably been true for a year or so.

so if Hans Petter can guarantee that the lock is 'short' then probly 
the adaptive mutex code will spin for a few instructions and continue on.

so I think we should allow this usage now.

it was true earlier but as I said, I think things have progressed to 
the state where this is PROBABLY ok.




> 
>>>> I have a question. What is the following doing at the middle of the
>>>> ubt_detach():
>>>>
>>>>        if (node != NULL)
>>>>                NG_NODE_UNREF(node);
>>>>
>>>> If you say that:
>>>>
>>>>                ng_rmnode_self(node);
>>>>
>>>> Cleans up the last reference?
>>> the complete code is
>>>
>>> node_p  node = sc->sc_node;
>>>
>>> if (node != NULL) {
>>>    sc->sc_node = NULL;  <-- clear sc_node
>>>
>>>    NG_NODE_SET_PRIVATE(node, NULL);
>>>    NG_NODE_REALLY_DIE(node);
>>>
>>>    NG_NODE_REF(node); <--- grab +1 reference
>>>    ng_rmnode_self(node); <--- mark node as "dead", but not ensure its
>>> not free()d
>>> }
>>>
>>> /* bla, bla */
>>>
>>> if (node != NULL)
>>>    NG_NODE_UNREF(node); <--- drop 1 reference and possibly free() node
>> Yes, but you are already dropping an extra reference in ubt_shutdown(). What
>> about that?
> 
> shutdown method is called as part of ng_rmnode_self() and drop the
> reference that node was born with. the extra reference before
> ng_rmnode_self() is to ensure that node pointer is still valid after
> ng_rmnode_self() returns. otherwise there is a change that node
> pointer becomes invalid while after ng_rmnode_self() calls shutdown
> method.
> 
> [...]
> 
>> I think you misunderstand. I'm using mutexes. They will not block for very
>> long! There are no DELAY()'s with mutexes held! When another USB device is
>> attached / detached it is:
>>
>> 1) Done from another thread
>> 2) The USB locks in the critical path are dropped when waiting for wakeup or
>> doing so called synchronous stuff.
>>> it would be if we could sleep in netgraph, and we can not.
>> Do mutexes sleep? No? So my code should be fine?
> 
> yes, regular mutexes sleep.
> 
>>> in theory,
>>> we only need one extra reference which would tell us if there anything
>>> in usb2 that still could access node pointer. in practice, instead of
>>> checking every transfer and making sure its no pending before dropping
>>> that extra reference i just add multiple references for each usb2
>>> transfer and ubt_task (i.e. things that could access node pointer).
>> I'm just thinking that this is starting to get complicated, and please
>> understand I've spent much time on detach issues, and there is alot of
>> builtin funcionality inside USB which will handle start/stop/re-start issues
>> automagically. I see it like duplicate code when you check for USB transfer
>> re-start in netgraph ???
> 
> first of all, i do not think crashes are caused by detach(). in fact,
> detach() is clean. i've tested it and it worked for me. i tried to
> start/stop device while doing flood l2ping. i also tried to yank the
> device while doing flood l2ping. it works correctly. i think, the
> issue is related to stalled transfers. there is still something wrong
> with the code path that deals with stalled transfers. stalls do not
> happen on my test box, so i can not test it. also there is NO code
> duplication. asynchronous path is required to decouple netgraph from
> usb2.
> 
>>> right, and what if some other usb device attached to the same usb
>>> controller is doing something synchronously? do you see that you could
>>> potentially block netgraph for arbitrary time and that is really out
>>> of ng_ubt2 driver control?
>> Again, USB_BUS_LOCK() is a mutex, not an SX lock. All code using this lock is
>> called from High Priority threads! See explanation further up. And will not
>> block very long at all.
> 
> regular mutexes can sleep. we are not allowed to sleep in netgraph.
> therefor we must transition out of netgraph context before calling
> into any code that tries to grab regular mutex. the async design is
> there not because i want to make things complicated. its there because
> it is needed.
> 
> thanks,
> max
> _______________________________________________
> freebsd-current_at_freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe_at_freebsd.org"
Received on Sat Jan 24 2009 - 22:06:49 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:41 UTC