Re: panic after removing usb flash drive

From: Kyle Brooks <captinsmock_at_columbus.rr.com> Date: Wed, 31 Aug 2005 19:35:38 +0000

On Wed, 2005-08-31 at 19:22 +0000, Ben Kaduk wrote:
> On 8/31/05, Scott Long <scottl_at_samsco.org> wrote:
> > 
> > Ben Kaduk wrote:
> > > On 8/31/05, Kyle Brooks <captinsmock_at_columbus.rr.com> wrote:
> > >
> > >>umass0: LEXAR MEDIA JUMPDRIVE2, rev 2.00/1.25, addr 2
> > >>umass0: at uhub4 port 6 (addr 2) disconnected
> > >>panic: vm_fault: fault on nofault entry, addr: deadc000
> > >>
> > >>kernel:
> > >>
> > >>FreeBSD 7.0-CURRENT #2: Mon Aug 29 00:39:21 UTC 2005
> > >>
> > >>problem:
> > >>
> > >>kernel panics when usb flash drive is removed
> > >>
> > >>backtrace:
> > >>
> > >>#0 doadump () at pcpu.h:165
> > >>#1 0xc068610e in boot (howto=260)
> > >>at /usr/src/sys/kern/kern_shutdown.c:397
> > >>#2 0xc0685b92 in panic (
> > >>fmt=0xc090e46c "vm_fault: fault on nofault entry, addr: %lx")
> > >>at /usr/src/sys/kern/kern_shutdown.c:553
> > >>#3 0xc0812de1 in vm_fault (map=0xc1060000, vaddr=3735928832,
> > >>fault_type=2 '\002', fault_flags=0)
> > >>at /usr/src/sys/vm/vm_fault.c:884
> > >>#4 0xc0888807 in trap_pfault (frame=0xe6a06bf0, usermode=0,
> > >>eva=3735929110)
> > >>at /usr/src/sys/i386/i386/trap.c:741
> > >>#5 0xc0888d04 in trap (frame=
> > >>{tf_fs = 8, tf_es = -1063649240, tf_ds = 40, tf_edi = -993875968,
> > >>tf_esi = -1014223872, tf_ebp = -425694000, tf_isp = -425694180, tf_ebx =
> > >>-1063640044, tf_edx = -993875900, tf_ecx = 0, tf_eax = -559038242,
> > >>tf_trapno = 12, tf_err = 2, tf_eip = -1069194040, tf_cs = 32, tf_eflags
> > >>= 66050, tf_esp = -1063640032, tf_ss = 0})
> > >>at /usr/src/sys/i386/i386/trap.c:442
> > >>#6 0xc08745ba in calltrap () at /usr/src/sys/i386/i386/exception.s:139
> > >>#7 0x00000008 in ?? ()
> > >>#8 0xc09a0028 in atdma_acpi_driver_mod ()
> > >>#9 0x00000028 in ?? ()
> > >>#10 0xc4c2a800 in ?? ()
> > >>#11 0xc38c2c00 in ?? ()
> > >>#12 0xe6a06cd0 in ?? ()
> > >>#13 0xe6a06c1c in ?? ()
> > >>#14 0xc09a2414 in xsoftc ()
> > >>#15 0xc4c2a844 in ?? ()
> > >>#16 0x00000000 in ?? ()
> > >>#17 0xdeadc0de in ?? ()
> > >>#18 0x0000000c in ?? ()
> > >>#19 0x00000002 in ?? ()
> > >>#20 0xc04564c8 in camisr (V_queue=0xc09a2414)
> > >>at /usr/src/sys/cam/cam_xpt.c:7066
> > >>#21 0xc066f84e in ithread_loop (arg=0xc356fa80)
> > >>at /usr/src/sys/kern/kern_intr.c:545
> > >>#22 0xc066e808 in fork_exit (callout=0xc066f665 <ithread_loop>, arg=0x0,
> > >>frame=0x0) at /usr/src/sys/kern/kern_fork.c:789
> > >>#23 0xc087461c in fork_trampoline ()
> > >>at /usr/src/sys/i386/i386/exception.s:208
> > >>
> > >
> > > This is the expected behaviour
> > 
> > Panics are not acceptable or expected behaviour in any situation, btw.
> > 
> > > if you didn't unmount the filesystem on the
> > > thumbdrive before removing it. There was some discussion on this a while ago
> > > (but I don't seem to be able to find the exact posts), but the general idea
> > > is that the kernel has no idea in what state the actual physical medium
> > > (disc) is/was in after being pulled, and may have some stale buffers holding
> > > data that got written to disk. It doesn't know what to do with this data, or
> > > how to treat requests to that device, so it panics.
> > >
> > 
> > I probably missed the earlier discussion that you are referring to, but
> > what you are saying here actually isn't true. There are a number of
> > problems:
> 
> 
> Sorry to be giving out bad information -- I really should have found the old 
> discussions I remember before posting. 
> 
> > 1) When the thumbdrive gets pulled, the umass driver gets told to
> > detach. It tries to detach itself from CAM, but things don't get torn
> > down correctly because there is an open reference to the target in CAM
> > (because there is a mounted filesystem on the device). umass trundles
> > along anyways and goes away, leaving lots of dangling pointers in CAM
> > that blow up on the next attempted I/O access.
> > 
> > Part of the problem here is that the umass driver is architected wrong.
> > It creates a SIM, bus, and target instance for every umass device that
> > gets inserted. When the device gets pulled, it tries to tear down
> > each of those instances all at once. CAM simply wasn't designed for
> > this. It was designed for the SIMs and buses to be long-lived objects
> > where only the targets (and luns) come and go. Making umass fit this
> > model would involve turning it into two logical drivers. One would be
> > a SIM that would attach to the root hub instance of each USB controller
> > and would treat the USB bus as a CAM bus. The other would be a target
> > driver that gets created and destroyed on a per-device basis as those
> > devices come and go. When a umass device gets plugged in, the USB
> > framework would tell the appropriate SIM to create a target instance.
> > When the device gets pulled, the framework would tell the SIM to detach
> > and destroy the target. No dangling pointers would be left behind by
> > the SIM going away. I have some prototype work in progress on this.
> > 
> > 2) Some filesystems, UFS in particular, assume that an I/O will never
> > fail. Instead of checking the error status of the buf on completion,
> > they just continue on and assume that everything is fine. If the
> > VM is trying to page in a vnode, for example, it'll think that
> > the operation succeeded, and then really bad things will happen. I'm
> > not sure if the same problem exists in MSDOSFS because I don't have
> > any DOS filesystems except on USB, and the problem with umass stands
> > in the way of further testing. In lieu of fixing umass, I might have to
> > create a synthetic md device to hold a msdos filesystem so that I can
> > test how it behaves.
> > 
> > 3) It's unknown if the VM system knows how to rationally deal with
> > failed I/O or how to propagate that kind of failure to the rest of the
> > kernel and/or applications. What happens if you mmap a file, and then
> > the device holding the file goes away? How do you let the application
> > know that its mmap is now invalid? Send it a Sig11, maybe? How should
> > the vnode pager deal with failure? There are lots of interesting
> > problems here.
> > 
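Points 1 and 2 above can be modelled with a toy sketch. This is plain Python, not kernel code, and every name in it (Bus, Target, hold, release, detach) is made up for illustration; none of it is the real CAM or umass API. It assumes a long-lived bus object whose targets come and go, defers teardown of a pulled target until its last reference is dropped, and has the caller check the I/O result rather than assume success.

```python
import errno

class Target:
    """Hypothetical per-device target; not a real CAM structure."""
    def __init__(self, name):
        self.name = name
        self.refs = 0       # open references, e.g. one per mounted filesystem
        self.gone = False   # set once the physical device has been pulled

    def io(self):
        """Return 0 on success, or an errno value once the device is gone."""
        return errno.ENXIO if self.gone else 0

class Bus:
    """Hypothetical long-lived bus: it outlives any single device."""
    def __init__(self):
        self.targets = {}
        self.freed = []     # names of targets whose teardown has completed

    def attach(self, name):
        target = Target(name)
        self.targets[name] = target
        return target

    def hold(self, target):
        target.refs += 1

    def release(self, target):
        target.refs -= 1
        if target.refs == 0 and target.gone:
            self.freed.append(target.name)   # last reference dropped

    def detach(self, name):
        target = self.targets.pop(name)
        target.gone = True
        if target.refs == 0:
            self.freed.append(target.name)   # nobody holds it, tear down now

bus = Bus()
da0 = bus.attach("da0")
bus.hold(da0)               # filesystem mounted
bus.detach("da0")           # stick pulled while still mounted
error = da0.io()            # the caller checks the result (point 2)
if error:
    print("I/O failed:", errno.errorcode[error])
bus.release(da0)            # unmount drops the last reference
print("torn down:", bus.freed)
```

The design choice being illustrated is only that the bus never goes away with the device, so a late I/O gets an error back instead of following a pointer into freed memory.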
> > In any case, the panic posted in the grandparent message implicates CAM
> > and umass, which is what I would expect. There may be more layers of
> > problems underneath it.
> > 
> > Scott
> > 
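The raw numbers in the backtrace seem consistent with a stale pointer. The sketch below only redoes the arithmetic on values copied from frames #3 to #5. The page size of 4096 is the usual i386 value, and reading 0xdeadc0de as the fill pattern that debugging kernels write over freed memory is my assumption, not something the trace itself states.

```python
def u32(value):
    """Reinterpret a signed 32-bit value (as gdb prints it) as unsigned."""
    return value & 0xFFFFFFFF

PAGE_SIZE = 4096               # assumed i386 page size

tf_eax = u32(-559038242)       # tf_eax from the trap frame in frame #5
eva = 3735929110               # faulting address, frame #4
vaddr = 3735928832             # address passed to vm_fault, frame #3

print(hex(tf_eax))                          # 0xdeadc0de
print(hex(eva))                             # 0xdeadc116
print(hex(eva - tf_eax))                    # 0x38
print(eva & ~(PAGE_SIZE - 1) == vaddr)      # True
```

So the faulting address is 0x38 bytes past 0xdeadc0de, and rounding it down to a page boundary gives the deadc000 in the panic message. That would fit an access to a field of a structure reached through a pointer that had already been overwritten.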
> 
> Thanks for the in-depth explanation. I will search the archives tonight to 
> find the old discussion and see where I was misreading things.
> 
> Ben Kaduk

I did not notice this before, but after leaving the device plugged in
for about 10 minutes, the following kernel messages appear:

umass0: LEXAR MEDIA JUMPDRIVE2, rev 2.00/1.25, addr 2
(da0:umass-sim0:0:0:0): got CAM status 0x4
(da0:umass-sim0:0:0:0): fatal error, failed to attach to device
(da0:umass-sim0:0:0:0): lost device

The device file never shows up in the /dev directory.
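For reference, here is a small lookup for the "CAM status 0x4" in those messages. The names and values are the low cam_status codes as I recall them from sys/cam/cam.h, and the 0x3f mask is likewise from memory, so treat the table as an assumption and check it against your own source tree.

```python
# Low cam_status values as recalled from sys/cam/cam.h (an assumption;
# verify against the header in your tree).
CAM_STATUS_MASK = 0x3F
CAM_STATUS_NAMES = {
    0x00: "CAM_REQ_INPROG",
    0x01: "CAM_REQ_CMP",
    0x02: "CAM_REQ_ABORTED",
    0x03: "CAM_UA_ABORT",
    0x04: "CAM_REQ_CMP_ERR",
    0x05: "CAM_BUSY",
    0x06: "CAM_REQ_INVALID",
    0x07: "CAM_PATH_INVALID",
    0x08: "CAM_DEV_NOT_THERE",
}

def decode_cam_status(status):
    """Return the symbolic name for the status field, or 'unknown'."""
    return CAM_STATUS_NAMES.get(status & CAM_STATUS_MASK, "unknown")

print(decode_cam_status(0x4))
```

If the table is right, 0x4 is the generic "request completed with an error" code, which says that the probe failed but not why.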