Re: a few usb issues related to edge cases

From: Andriy Gapon <avg_at_FreeBSD.org>
Date: Tue, 20 Dec 2011 14:25:20 +0200
Now the juicy stuff :)

on 19/12/2011 16:30 Hans Petter Selasky said the following:
>> 3.  Looking at usbd_transfer_poll I see that it touches a lot of locks,
>> including taking the bus lock.  As we've discussed before, this is not safe
>> in a particular context where the polling is supposed to be used - in the
>> kdb/ddb context.  If the lock is already taken by another thread, then
>> instead of being able to use a USB keyboard a user would get even less
>> debug-able crash.  Also, it seems that usbd_transfer_poll calls into the
>> usual state machine with various callbacks and dynamically made decisions
>> about whether to execute some actions directly or defer their execution to
>> a different thread. 
> 
> This is an optimisation. If the current thread can do the job without a LOR, 
> then we do it right away. Else we let another thread do it. It is possible to 
> have a more simple model, but then you will also get more task switches.

I think that you are speaking here about the code behavior in the general context.
And I want to limit this part of discussion to the contexts where
usbd_transfer_poll is actually used - kdb and panic.  In those contexts there is
one and only one thread that must do all the work.  Other threads are stopped
and "frozen" in the middle of whatever they were doing before kdb was entered or
panic called (provided SCHEDULER_STOPPED() == true).

>> That code also touches locks in various places.  I
>> think that it would be more preferable to have a method that does the job
>> in a more straight-forward way, without touching any locks, ignoring the
>> usual code paths and assuming that no other treads are running in
>> parallel.  Ditto for the method to submit a request.
> 
> The current USB code can be run fine without real locks, if you do a few 
> tricks. I have a single-threaded BSD-kernel replacement for this which works 
> like a charm for non-FreeBSD projects.

That's very cool.  I believe that various implementations of USB stack for BIOS
and similar are also non-threaded.

> I'm going to paste a few lines FYI:
> 
> Why not extend "struct mtx" to have two fields which are only used in case of 
> system polling (no scheduler running):
> 
> struct mtx {
>   xxx;
>   int owned_polling = 0;
>   struct mtx *parent_polling;
> };
> 
> void
> mtx_init(struct mtx *mtx, const char *name, const char *type, int opt)
> {
>         mtx->owned = 0;
>         mtx->parent = mtx;
> }
> 
> void
> mtx_lock(struct mtx *mtx)
> {
>         mtx = mtx->parent;
>         mtx->owned++;
> }
> 
> void
> mtx_unlock(struct mtx *mtx)
> {
>         mtx = mtx->parent;
>         mtx->owned--;
> }
> 
> int
> mtx_owned(struct mtx *mtx)
> {
>         mtx = mtx->parent;
>         return (mtx->owned != 0);
> }
> 
> void
> mtx_destroy(struct mtx *mtx)
> {
>         /* NOP */
> }
> 
> Maybe mtx_init, mtx_lock, mtx_unlock mtx_owned, mtx_destroy, etc, could be 
> function pointers, which are swapped at panic.

I am not sure if I got your idea right based on the pseudo-code above.
Right now in the head we already skip all locks when SCHEDULER_STOPPED() is
true.  But, but, that doesn't cover all polling cases as the automatic lock
skipping is _not_ done for kdb.  There are various reasons for that.  That's why
the kdb_active flag should be checked by the code that can be executed in the
kdb context when dealing with locking or shared resources in general.


> USB is SMP! 

Right and that's good, but there are cases where SMP is effectively stopped.
Those are panic and kdb.

> To run SMP code from a single thread, you need to create a 
> hiherachy of the threads:
> 
> 1) Callbacks (Giant)
> 2) Callbacks (non-Giant)
> 3) Control EP (non-Giant)
> 4) Explore thread (non-Giant)
> 
> When the explore thread is busy, we look for work in the level above and so 
> on. The USB stack implements this principle, which is maybe not documented 
> anywhere btw. If you want more than code, you can hire me to do that.
> 
> The mtx-code above I believe is far less work than to make new code which 
> handles the polling case only.
> 
> The reason for the parent mutex field, is to allow easy grouping of mutexes, 
> so that USB easily can figure out what can run or not.

This all is really above my level of knowledge.

Basically my current intention is the following: make ukbd/usb code work in
panic+SCHEDULER_STOPPED case in the same way (or not worse, at least) as it
currently works for the kdb case.  I don't have enough knowledge (and time) to
go beyond that.
I just wanted to draw your attention to the fact that obtaining any locks in the
kdb context (or USB polling code in general, even) is not a good idea.
Chances of getting into trouble on those locks are probably quite moderate or
even low, but they do exist.  I am not sure if you are getting any bug reports
about such troubles :-)  Regular users probably do not use kdb too often and a
panic for them is just a "crash", so they likely do not expect anything
usable/debuggable after that :-)

P.S.  Since most (but not all) locking operations in the USB code are already
wrapped under USB-specific macros, then probably it could make sense to
implement your suggestion locally to the USB code.  Just need to take some care
to cover the cases that are not wrapped yet (direct calls to mtx_lock and
similar) and to handle cases where important decisions are made based on
mtx_owned return value (especially in loops).

For example, below are some macros that I want to use in the ukbd code:
#define KBD_LOCK()                              \
        do {                                    \
                if (!kdb_active)                \
                        mtx_lock(&Giant);       \
        } while (0)
#define KBD_UNLOCK()                            \
        do {                                    \
                if (!kdb_active)                \
                        mtx_unlock(&Giant);     \
        } while (0)

#ifdef INVARIANTS
#define KBD_LOCK_ASSERT()                                       \
        do {                                                    \
                if (!cold && !kdb_active && panicstr == NULL)   \
                        mtx_assert(&Giant, MA_OWNED);           \
        } while (0)
#else
#define KBD_LOCK_ASSERT()       (void)0
#endif

P.P.S. Did you have a chance to look at http://wiki.freebsd.org/AvgSyscons yet?

Thank you!
-- 
Andriy Gapon
Received on Tue Dec 20 2011 - 11:25:29 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:22 UTC