Re: panic after removing usb flash drive

From: Ben Kaduk <minimarmot_at_gmail.com>
Date: Wed, 31 Aug 2005 19:22:31 +0000

On 8/31/05, Scott Long <scottl_at_samsco.org> wrote:
> 
> Ben Kaduk wrote:
> > On 8/31/05, Kyle Brooks <captinsmock_at_columbus.rr.com> wrote:
> >
> >>umass0: LEXAR MEDIA JUMPDRIVE2, rev 2.00/1.25, addr 2
> >>umass0: at uhub4 port 6 (addr 2) disconnected
> >>panic: vm_fault: fault on nofault entry, addr: deadc000
> >>
> >>kernel:
> >>
> >>FreeBSD 7.0-CURRENT #2: Mon Aug 29 00:39:21 UTC 2005
> >>
> >>problem:
> >>
> >>kernel panics when usb flash drive is removed
> >>
> >>backtrace:
> >>
> >>#0 doadump () at pcpu.h:165
> >>#1 0xc068610e in boot (howto=260)
> >>at /usr/src/sys/kern/kern_shutdown.c:397
> >>#2 0xc0685b92 in panic (
> >>fmt=0xc090e46c "vm_fault: fault on nofault entry, addr: %lx")
> >>at /usr/src/sys/kern/kern_shutdown.c:553
> >>#3 0xc0812de1 in vm_fault (map=0xc1060000, vaddr=3735928832,
> >>fault_type=2 '\002', fault_flags=0)
> >>at /usr/src/sys/vm/vm_fault.c:884
> >>#4 0xc0888807 in trap_pfault (frame=0xe6a06bf0, usermode=0,
> >>eva=3735929110)
> >>at /usr/src/sys/i386/i386/trap.c:741
> >>#5 0xc0888d04 in trap (frame=
> >>{tf_fs = 8, tf_es = -1063649240, tf_ds = 40, tf_edi = -993875968,
> >>tf_esi = -1014223872, tf_ebp = -425694000, tf_isp = -425694180, tf_ebx =
> >>-1063640044, tf_edx = -993875900, tf_ecx = 0, tf_eax = -559038242,
> >>tf_trapno = 12, tf_err = 2, tf_eip = -1069194040, tf_cs = 32, tf_eflags
> >>= 66050, tf_esp = -1063640032, tf_ss = 0})
> >>at /usr/src/sys/i386/i386/trap.c:442
> >>#6 0xc08745ba in calltrap () at /usr/src/sys/i386/i386/exception.s:139
> >>#7 0x00000008 in ?? ()
> >>#8 0xc09a0028 in atdma_acpi_driver_mod ()
> >>#9 0x00000028 in ?? ()
> >>#10 0xc4c2a800 in ?? ()
> >>#11 0xc38c2c00 in ?? ()
> >>#12 0xe6a06cd0 in ?? ()
> >>#13 0xe6a06c1c in ?? ()
> >>#14 0xc09a2414 in xsoftc ()
> >>#15 0xc4c2a844 in ?? ()
> >>#16 0x00000000 in ?? ()
> >>#17 0xdeadc0de in ?? ()
> >>#18 0x0000000c in ?? ()
> >>#19 0x00000002 in ?? ()
> >>#20 0xc04564c8 in camisr (V_queue=0xc09a2414)
> >>at /usr/src/sys/cam/cam_xpt.c:7066
> >>#21 0xc066f84e in ithread_loop (arg=0xc356fa80)
> >>at /usr/src/sys/kern/kern_intr.c:545
> >>#22 0xc066e808 in fork_exit (callout=0xc066f665 <ithread_loop>, arg=0x0,
> >>frame=0x0) at /usr/src/sys/kern/kern_fork.c:789
> >>#23 0xc087461c in fork_trampoline ()
> >>at /usr/src/sys/i386/i386/exception.s:208
> >>
> >
> > This is the expected behaviour
> 
> Panics are not acceptable or expected behaviour in any situation, btw.
> 
> > if you didn't unmount the filesystem on the
> > thumbdrive before removing it. There was some discussion on this a while ago
> > (but I don't seem to be able to find the exact posts), but the general idea
> > is that the kernel has no idea in what state the actual physical medium
> > (disc) is/was in after being pulled, and may have some stale buffers holding
> > data that got written to disk. It doesn't know what to do with this data, or
> > how to treat requests to that device, so it panics.
> >
> 
> I probably missed the earlier discussion that you are referring to, but
> what you are saying here actually isn't true. There are a number of
> problems:


Sorry to be giving out bad information -- I really should have found the old 
discussions I remember before posting. 

> 1) When the thumbdrive gets pulled, the umass driver gets told to
> detach. It tries to detach itself from CAM, but things don't get torn
> down correctly because there is an open reference to the target in CAM
> (because there is a mounted filesystem on the device). umass trundles
> along anyways and goes away, leaving lots of dangling pointers in CAM
> that blow up on the next attempted I/O access.
> 
> Part of the problem here is that the umass driver is architected wrong.
> It creates a SIM, bus, and target instance for every umass device that
> gets inserted. When the device gets pulled, it tries to tear down
> each of those instances all at once. CAM simply wasn't designed for
> this. It was designed for the SIMs and buses to be long-lived objects
> where only the targets (and luns) come and go. Making umass fit this
> model would involve turning it into two logical drivers. One would be
> a SIM that would attach to the root hub instance of each USB controller
> and would treat the USB bus as a CAM bus. The other would be a target
> driver that gets created and destroyed on a per-device basis as those
> devices come and go. When a umass device gets plugged in, the USB
> framework would tell the appropriate SIM to create a target instance.
> When the device gets pulled, the framework would tell the SIM to detach
> and destroy the target. No dangling pointers would be left behind by
> the SIM going away. I have some prototype work in progress on this.
> 
> 2) Some filesystems, UFS in particular, assume that an I/O will never
> fail. Instead of checking the error status of the buf on completion,
> they just continue on and assume that everything is fine. If the
> VM is trying to page in a vnode, for example, it'll think that
> the operation succeeded, and then really bad things will happen. I'm
> not sure if the same problem exists in MSDOSFS because I don't have
> any DOS filesystems except on USB, and the problem with umass stands
> in the way of further testing. In lieu of fixing umass, I might have to
> create a synthetic md device to hold a msdos filesystem so that I can
> test how it behaves.
> 
> 3) It's unknown if the VM system knows how to rationally deal with
> failed I/O or how to propagate that kind of failure to the rest of the
> kernel and/or applications. What happens if you mmap a file, and then
> the device holding the file goes away? How do you let the application
> know that its mmap is now invalid? Send it a Sig11, maybe? How should
> the vnode pager deal with failure? There are lots of interesting
> problems here.
> 
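The layering proposed in point 1 can be illustrated with a toy model. This is not CAM or umass code: the class and method names (Sim, Target, attach_target, detach_target, submit_io) are made up for illustration, and Python is used only to keep the sketch short. It shows a long-lived SIM that owns the table of targets, so that I/O issued after a device is pulled gets an error back instead of following a pointer into freed state:

```python
import errno


class Target:
    """Hypothetical per-device object, created and destroyed on hotplug."""

    def __init__(self, name):
        self.name = name

    def do_io(self, request):
        return f"{self.name}: completed {request}"


class Sim:
    """Hypothetical long-lived object, one per USB controller.

    It outlives every device on the bus, so lookups by target id always
    go through a structure that is still valid.
    """

    def __init__(self):
        self.targets = {}

    def attach_target(self, target_id, name):
        self.targets[target_id] = Target(name)

    def detach_target(self, target_id):
        # Only the target goes away; the SIM and its table stay put.
        self.targets.pop(target_id, None)

    def submit_io(self, target_id, request):
        target = self.targets.get(target_id)
        if target is None:
            # Device is gone: report an error rather than touch stale state.
            return (errno.ENXIO, None)
        return (0, target.do_io(request))


sim = Sim()
sim.attach_target(2, "umass0")
before = sim.submit_io(2, "read block 0")
sim.detach_target(2)           # drive pulled while still mounted
after = sim.submit_io(2, "read block 1")
print(before)
print(after)
```

Whether the layers above the SIM then handle that error sensibly is the separate question raised in points 2 and 3.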
> In any case, the panic posted in the grandparent message implicates CAM
> and umass, which is what I would expect. There may be more layers of
> problems underneath it.
> 
> Scott
> 
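For anyone following the trace, the raw numbers in frames #3 to #5 can be decoded to see why it looks like a stale pointer. A minimal sketch, in Python purely for the arithmetic. It assumes 32-bit i386 registers and 4 KB pages, and it assumes the kernel was built with the debugging option that fills freed memory with the 0xdeadc0de pattern (the thread does not state the kernel config):

```python
# Values copied from the quoted backtrace (frames #3, #4 and #5).
TF_EAX = -559038242      # tf_eax, printed by gdb as a signed 32-bit int
EVA = 3735929110         # eva in trap_pfault(), the faulting address
VADDR = 3735928832       # vaddr in vm_fault()

PAGE_SIZE = 4096         # assumption: 4 KB pages on i386


def u32(value):
    """Reinterpret a signed 32-bit value as unsigned."""
    return value & 0xFFFFFFFF


eax = u32(TF_EAX)                  # register value as an address
offset = EVA - eax                 # distance of the fault from eax
page = EVA & ~(PAGE_SIZE - 1)      # faulting address rounded down to a page

print(f"eax  = {eax:#x}")
print(f"eva  = {EVA:#x} (eax + {offset:#x})")
print(f"page = {page:#x}")
```

This gives eax = 0xdeadc0de, eva = 0xdeadc116 (eax + 0x38) and page = 0xdeadc000, which matches both vaddr in frame #3 and the address in the panic message. With tf_err = 2 (a write to a page that is not present), that is consistent with code reached from camisr() writing through a pointer read out of memory that had already been freed.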

Thanks for the in-depth explanation. I will search the archives tonight to 
find the old discussion and see where I was misreading things.

Ben Kaduk
Received on Wed Aug 31 2005 - 17:22:34 UTC
