Re: panic after removing usb flash drive

From: Kyle Brooks <captinsmock_at_columbus.rr.com>
Date: Thu, 01 Sep 2005 00:09:26 +0000
On Wed, 2005-08-31 at 17:02 -0600, Scott Long wrote:
> Bernd Walter wrote:
> > On Wed, Aug 31, 2005 at 12:05:17PM -0600, Scott Long wrote:
> > 
> >>Bernd Walter wrote:
> >>
> >>>On Wed, Aug 31, 2005 at 09:38:20AM -0600, Scott Long wrote:
> >>>
> >>>>Ben Kaduk wrote:
> >>>>
> >>>>>On 8/31/05, Kyle Brooks <captinsmock_at_columbus.rr.com> wrote:
> >>>
> >>>This would really be a step backward.
> >>>Originally we had LUN creation/deletion on shared SIM and lots of
> >>>different problems.
> >>>SIM deletion should really be fixed - not only for umass, but generally
> >>>as we live in a world with removable cards.
> >>
> >>Bugs in the umass detach code are immediately responsible for the
> >>problem, but you are correct that CAM in general doesn't like SIMs
> >>going away.  DFly worked on this a while back, but I don't recall
> >>whether the work there was to add more sanity checks in the data
> >>path (which I don't want to do), or if it was the correct approach of
> >>flushing and quiescing the data/queuing, topology, and error recovery
> >>paths.
> > 
> > 
> > What bugs are you referring to?
> 
> The CAM detach code should be returning EBUSY errors since the 'da'
> periph has open references to it.  I'm pretty sure that these errors
> are being ignored by umass.c, leading to the SIM going away unexpectedly
> to CAM.
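
A minimal sketch of the failure mode described above, assuming a toy model rather than the real CAM or umass code. Every name here (toy_sim, toy_sim_deregister, toy_umass_detach) is hypothetical. The point is only that the detach path has to look at the EBUSY return and stop, instead of tearing the SIM down anyway.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a SIM with one attached peripheral. */
struct toy_sim {
    int  periph_refs;  /* open references held by the 'da' periph */
    bool registered;   /* still known to the toy CAM layer */
};

/* Toy CAM side: refuse to deregister while references remain. */
static int toy_sim_deregister(struct toy_sim *sim)
{
    assert(sim != NULL);
    if (sim->periph_refs > 0)
        return EBUSY;
    sim->registered = false;
    return 0;
}

/* Toy driver side: propagate the error instead of ignoring it. */
static int toy_umass_detach(struct toy_sim *sim)
{
    int error = toy_sim_deregister(sim);

    if (error != 0)
        return error;  /* SIM is still in use; the caller must retry later */

    /* Only at this point would it be safe to free driver resources. */
    return 0;
}
```

With one open reference, toy_umass_detach() returns EBUSY and the SIM stays registered. Once the reference count drops to zero, the same call succeeds.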
> 
> > Sample code on how to correctly detach a sim is sparse.
> > The umass code doesn't know that devices on a scbus are in use, or
> > should it?
> 
> Well, it doesn't need to right now because you only have one target
> per SIM.  It should though; most normal SCSI drivers keep a table of
> known devices in order to remember negotiated transfer settings and
> handle reconnections.
> 
> > I think with a single sim style the target remains a ghost device
> > as long as it is referenced, right?
> 
> More or less, yes.  Periph drivers are refcounted and understand the
> idea of going away when a selection fails (i.e. the device got pulled
> from the bus).
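
The refcounting idea above can be sketched roughly as follows. This is a toy model with hypothetical names (toy_periph and friends), not the CAM periph API. It assumes a periph that has lost its device is only marked invalid, and is actually freed when the last reference is released.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted peripheral. */
struct toy_periph {
    int  refs;     /* outstanding references, e.g. an open /dev node */
    bool invalid;  /* device was pulled; no new references allowed */
    bool freed;    /* stands in for actually releasing the memory */
};

/* Take a reference; fails once the device is gone. */
static int toy_periph_acquire(struct toy_periph *p)
{
    assert(p != NULL);
    if (p->invalid)
        return ENXIO;
    p->refs++;
    return 0;
}

/* Called when a selection fails: the periph becomes a ghost. */
static void toy_periph_invalidate(struct toy_periph *p)
{
    p->invalid = true;
    if (p->refs == 0)
        p->freed = true;
}

/* Drop a reference; the last release of an invalid periph frees it. */
static void toy_periph_release(struct toy_periph *p)
{
    assert(p->refs > 0);
    p->refs--;
    if (p->refs == 0 && p->invalid)
        p->freed = true;
}
```

In this model the ghost device lingers exactly as long as something still holds a reference to it.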
> 
> > 
> > In the USB-world we have the problem that a blocking detach blocks
> > other port attachment/detachments on the same bus as well.
> > This is because we currently have a single thread per bus processing
> > these events.
> > We already see this problem with ucom devices where an open tty on a
> > detached device blocks.
> > But it is better than the panic that we have right now.
> > In the tty case we have a timeout, don't know what we can do with CAM.
> > 
> 
> In the sim-per-target model, you need to completely drain the simq and
> devq for the SIM before allowing detach to complete.  This means 
> freezing the simq then waiting until the camisr can run and process any
> pending CCBs on the completion queue.  The camisr is an SWI, so you'll
> need to sleep so that it can run.
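
A rough sketch of the drain sequence just described, as a single-threaded toy model. All names are hypothetical and none of this is the real CAM code. In particular, the loop calls toy_isr_run() directly where a kernel implementation would sleep so that the software interrupt gets a chance to run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical queue state for one SIM. */
struct toy_simq {
    bool frozen;     /* when true, no new CCBs are accepted */
    int  in_flight;  /* CCBs handed to the device */
    int  completed;  /* CCBs on the completion queue, not yet processed */
};

/* Queue a CCB; refused once the queue is frozen. */
static bool toy_simq_submit(struct toy_simq *q)
{
    assert(q != NULL);
    if (q->frozen)
        return false;
    q->in_flight++;
    return true;
}

/* The device is gone: everything in flight completes (with an error). */
static void toy_hw_complete_all(struct toy_simq *q)
{
    q->completed += q->in_flight;
    q->in_flight = 0;
}

/* Stand-in for the camisr SWI: processes one completion per run. */
static void toy_isr_run(struct toy_simq *q)
{
    if (q->completed > 0)
        q->completed--;
}

/* Detach-side drain: freeze first, then wait until nothing is pending.
 * Returns how many times it had to yield to the ISR. */
static int toy_simq_drain(struct toy_simq *q)
{
    int yields = 0;

    q->frozen = true;
    toy_hw_complete_all(q);
    while (q->completed > 0) {
        toy_isr_run(q);  /* a kernel would sleep here instead */
        yields++;
    }
    return yields;
}
```

Freezing before draining is what keeps the loop bounded: once the queue is frozen, no new CCBs can arrive while detach is waiting.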
> 
> > 
> >>>Technically a shared sim using targets could be made to work for
> >>>umass as it's defined today, but it won't work for USB to SCSI
> >>>converters - that we don't support one of these adapters today doesn't
> >>>solve the problem.
> >>
> >>This is a completely different situation.  A USB-SCSI adapter would
> >>provide its own SCSI bus that is separate from the USB bus with its
> >>own queueing resources and own error recovery mechanisms.
> > 
> > 
> > Many umass devices are ide converters - even some flash drives.
> > 
> > 
> >>>It is an academic standpoint defining where in the USB/umass
> >>>infrastructure the SIM is located, but I personally always saw it
> >>>inside the USB-device and not on the USB.
> >>>USB is just a transport medium and not a SIM in the same way as PCI is
> >>>just a transport medium for a classical SCSI-Interface.
> >>>Yes - umass creates a SIM, bus and target, because that is what a user
> >>>really attaches/detaches.
> >>>
> >>
> >>It is muddy, but for a mass storage class device, you are using the
> >>USB bus as the transport medium and you are using the USB controller
> >>as the transport initiator.  Command queueing and resource arbitration
> >>happens in the USB controller and driver, not in the umass device or
> >>driver.  Same for error recovery.  The USB controller is essentially
> >>acting as a SCSI controller, just with a USB bus instead of a SPI bus.
> >>The whole point of CAM is to assist with queueing and arbitrating bus
> >>resources.  There is no way that the SIM-per-device approach can provide
> >>this information.
> > 
> > 
> > I can follow you to some degree.
> > With the single-sim design we have had the following problems:
> > - probing of LUNs failed on attach and required a manual rescan.
> >   undoubtedly fixable by someone with good CAM knowledge.
> 
> I'm not parsing this.  Are you referring to the need to do a rescan
> after plugging in a device?  This is easy to solve.  When a umass
> target is attached, you either send an AC_FOUND_DEVICE async event to
> announce the target, or you request a bus rescan from within the driver.
> I think that the ciss driver has an example of this.
> 
> > - CAM needs a max target value, but how many target do we really have?
> >   Each USB has up to 127 devices (practically 100 usable as we need
> >   some hubs)
> 
> The max target value is really only important for bus rescans.  The SIM 
> can just track what targets it currently knows about and reject CCBs
> for ones that don't exist (it somewhat does this now, though with only
> one target per SIM it's kinda silly).  Setting the max target value to
> 127 and rejecting targets that don't exist won't slow down a bus rescan
> much at all.
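
The shared-SIM bookkeeping described above might look something like this. It is a toy sketch with hypothetical names, not the umass or CAM code. It assumes a fixed table indexed by target ID, and uses ENXIO as a stand-in for whatever "no such device" status a real SIM would put in the CCB.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_TARGET 127  /* highest target ID the toy SIM advertises */

/* Hypothetical shared SIM that remembers which targets exist. */
struct toy_shared_sim {
    bool present[TOY_MAX_TARGET + 1];
};

static bool toy_target_valid(int target)
{
    return target >= 0 && target <= TOY_MAX_TARGET;
}

/* A umass device was plugged in and assigned this target ID. */
static int toy_target_attach(struct toy_shared_sim *sim, int target)
{
    assert(sim != NULL);
    if (!toy_target_valid(target))
        return EINVAL;
    sim->present[target] = true;
    return 0;
}

/* The device was pulled. */
static void toy_target_detach(struct toy_shared_sim *sim, int target)
{
    if (toy_target_valid(target))
        sim->present[target] = false;
}

/* Action entry point: reject requests for unknown targets right away. */
static int toy_sim_action(const struct toy_shared_sim *sim, int target)
{
    if (!toy_target_valid(target) || !sim->present[target])
        return ENXIO;
    return 0;  /* a real SIM would start the transfer here */
}

/* Bus rescan: probe every ID; absent targets fail immediately. */
static int toy_bus_rescan(const struct toy_shared_sim *sim)
{
    int found = 0;

    for (int t = 0; t <= TOY_MAX_TARGET; t++)
        if (toy_sim_action(sim, t) == 0)
            found++;
    return found;
}
```

Because the rejection is a table lookup rather than a timeout on the wire, scanning all 128 IDs costs very little in this model.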
> 
> >   Each device can have multiple functions, which means multiple umass
> >   instances.
> 
> I have a umass device that is a CF+SD card reader.  It shows up in CAM
> as a single target with 2 LUNs.  Is this the kind of thing that you are
> talking about?  If so, then there is no reason not to continue to use
> the model of a single target with multiple LUNs.
> 
> >   Previously we had a small hardcoded number, too big numbers slow
> >   down bus rescans, too small restrict the number of possible devices.
> >   We should have a dynamic way.
> > Don't remember if there were others.
> > 
> > From the technical standpoint - no matter what we do, there are
> > problems to solve.
> > 
> > 
> >>Your analogy of a PCI bus is correct for the USB-SCSI adapters, where 
> >>the adapter is doing a full conversion and bridge from one bus type to
> >>another.  It's not true for a umass device where it's merely using the
> >>USB bus as a SCSI transport.
> > 
> > 
> > So what is an USB-IDE converter?
> 
> I assumed that you were talking about devices that bridge from USB to
> IDE/SCSI and did not conform to the umass standard.  I have a USB2-IDE
> converter in an external enclosure that speaks umass and is probably
> closer to what you are talking about here.  But again, umass is really
> about using the USB bus to transport SCSI/ATA protocol, not about
> providing full access to a SCSI/IDE bus via a USB tunnel.  That is a
> significant difference, IMHO.  The USB controller is acting in a direct
> role as the SCSI/ATA initiator, vs. just tunnelling to a smart
> initiator.
> 
> > OK - that won't help for a practical solution.
> > In the practical way it sounded easier to go the multiple sim way.
> > sim detach needs to be fixed either way.
> 
> Yes.  It somewhat works now as long as the system is completely idle.
> It breaks down horribly if I/O or error recovery is in progress and
> a periph driver is left with CCBs in flight and/or a dangling
> reference to a SIM.  The only way to deal with this is to allow
> blocking while CAM drains itself.
> 
> > Are there any other technical reasons for doing single sim?
> > You've mentioned resource arbitration and error recovery.
> > Is there anything that CAM can do for us that it won't with multiple
> > sim?
> > 
> 
> It means that you'll be able to detach umass targets without doing the
> complicated dance of sleeping for CAM to drain itself.  It also will
> mean that it's less fragile to edge cases that are hard to identify and
> deal with.  Fixing CAM detach so that this works reliably is definitely
> something that must be done, but you can't avoid sleeping in order for
> it to work.
> 
> Scott

I don't really know the mechanics of CAM, but just for the record this
flash drive works in 5.4, but CAM in current seems to give up after
about 10 minutes with this:

umass0: LEXAR MEDIA JUMPDRIVE2, rev 2.00/1.25, addr 2
(da0:umass-sim0:0:0:0): got CAM status 0x4
(da0:umass-sim0:0:0:0): fatal error, failed to attach to device
(da0:umass-sim0:0:0:0): lost device
(da0:umass-sim0:0:0:0): removing device entry
Opened disk da0 -> 5

Sorry I missed this before; I didn't keep it in that long. If it helps,
attached is a diff file for CAM from 5.4 to current.

Received on Wed Aug 31 2005 - 22:09:36 UTC
