Re: a few usb issues related to edge cases

From: Hans Petter Selasky <hselasky_at_c2i.net>
Date: Mon, 19 Dec 2011 15:30:40 +0100
On Monday 19 December 2011 13:16:17 Andriy Gapon wrote:
> Hans Petter,
> 
> I think that I see some issues in the USB code that could cause problems in
> some edge cases.
> From easiest to hardest:
> 

Hi,

> 1.  I think that currently there is a LOR in usb_bus_shutdown.  I think
> that the following patch should fix it:
> ===========================================================================
> --- a/sys/dev/usb/controller/usb_controller.c
> +++ b/sys/dev/usb/controller/usb_controller.c
> _at__at_ -479,6 +481,7 _at__at_ usb_bus_shutdown(struct usb_proc_msg *pm)
> 
>  	bus_generic_shutdown(bus->bdev);
> 
> +	USB_BUS_UNLOCK(bus);
>  	usbd_enum_lock(udev);
> 
>  	err = usbd_set_config_index(udev, USB_UNCONFIG_INDEX);
> _at__at_ -497,6 +500,7 _at__at_ usb_bus_shutdown(struct usb_proc_msg *pm)
>  		(bus->methods->set_hw_power_sleep) (bus, USB_HW_POWER_SHUTDOWN);
> 
>  	usbd_enum_unlock(udev);
> +	USB_BUS_LOCK(bus);
>  }
> 

You are right! I believe my kernel tests were run without WITNESS.

> 2.  Somewhat related to the above.  I think that because the USB subsystem
> implements the shutdown method and detaches all its drivers, then the ukbd
> driver won't be able to properly handle the 'shutdown -h' case where the
> kernel asks to "press any key to reboot" at the end.  Depending on which
> thread wins the race (the one that executes the mainline shutdown code or
> the USB explore thread that detaches USB devices) there will either an
> immediate reboot or a later crash when any key is pressed.
> This is not critical, but OTOH perhaps the USB subsystem doesn't have to do
> the shutdown.  As far as I can see a lot of the drivers just do nothing
> for the shutdown, for better or for worth.
> 
> A side note: perhaps it would be a good idea to pass the 'how' value as an
> additional parameter to device_shutdown.

The shutdown of USB is done to give USB devices at last chance to turn off or 
reduce their current consumption.

In the old code the Host controller itself would be disabled, so keyboard 
wouldn't have worked I believe like you suggest.

BTW: Shutdown should be executed after any "Press any key to reboot." and 
shutdown should be given time to complete, hence for USB this needs to happen 
in sync with the rest of the USB system.

> 3.  Looking at usbd_transfer_poll I see that it touches a lot of locks,
> including taking the bus lock.  As we've discussed before, this is not safe
> in a particular context where the polling is supposed to be used - in the
> kdb/ddb context.  If the lock is already taken by another thread, then
> instead of being able to use a USB keyboard a user would get even less
> debug-able crash.  Also, it seems that usbd_transfer_poll calls into the
> usual state machine with various callbacks and dynamically made decisions
> about whether to execute some actions directly or defer their execution to
> a different thread. 

This is an optimisation. If the current thread can do the job without a LOR, 
then we do it right away. Else we let another thread do it. It is possible to 
have a more simple model, but then you will also get more task switches.

> That code also touches locks in various places.  I
> think that it would be more preferable to have a method that does the job
> in a more straight-forward way, without touching any locks, ignoring the
> usual code paths and assuming that no other treads are running in
> parallel.  Ditto for the method to submit a request.

The current USB code can be run fine without real locks, if you do a few 
tricks. I have a single-threaded BSD-kernel replacement for this which works 
like a charm for non-FreeBSD projects. I'm going to paste a few lines FYI:

Why not extend "struct mtx" to have two fields which are only used in case of 
system polling (no scheduler running):

struct mtx {
  xxx;
  int owned_polling = 0;
  struct mtx *parent_polling;
};

void
mtx_init(struct mtx *mtx, const char *name, const char *type, int opt)
{
        mtx->owned = 0;
        mtx->parent = mtx;
}

void
mtx_lock(struct mtx *mtx)
{
        mtx = mtx->parent;
        mtx->owned++;
}

void
mtx_unlock(struct mtx *mtx)
{
        mtx = mtx->parent;
        mtx->owned--;
}

int
mtx_owned(struct mtx *mtx)
{
        mtx = mtx->parent;
        return (mtx->owned != 0);
}

void
mtx_destroy(struct mtx *mtx)
{
        /* NOP */
}

Maybe mtx_init, mtx_lock, mtx_unlock mtx_owned, mtx_destroy, etc, could be 
function pointers, which are swapped at panic.

USB is SMP! To run SMP code from a single thread, you need to create a 
hiherachy of the threads:

1) Callbacks (Giant)
2) Callbacks (non-Giant)
3) Control EP (non-Giant)
4) Explore thread (non-Giant)

When the explore thread is busy, we look for work in the level above and so 
on. The USB stack implements this principle, which is maybe not documented 
anywhere btw. If you want more than code, you can hire me to do that.

The mtx-code above I believe is far less work than to make new code which 
handles the polling case only.

The reason for the parent mutex field, is to allow easy grouping of mutexes, 
so that USB easily can figure out what can run or not.

> 
> As a side note: we probably need a flag to mark certain things such as e.g.
> the ukbd driver as "non recoverable", meaning that once those are used in
> the kdb context then there is no safe way to go back to normal system
> operation.

I think you need to do shutdown _after_ the "Press any key to reboot". A flag 
won't help.

--HPS
Received on Mon Dec 19 2011 - 13:33:11 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:22 UTC