Re: Instant panic CAM or USB subsystem

From: Steve Kargl <sgk_at_troutmask.apl.washington.edu>
Date: Mon, 3 Feb 2014 08:32:19 -0800

On Tue, Jan 28, 2014 at 12:32:21PM -0500, John Baldwin wrote:
> On Saturday, January 25, 2014 12:21:06 pm Steve Kargl wrote:
> > If I plug my Samsung Intensity II cellphone into a usb port,
> > I get an instant panic.  This is 100% reproducible.  I have
> > the core and kernel for further debugging.  Dmesg.boot follows
> > my sig.
> > 
> > % kgdb /boot/kernel/kernel /vmcore.0
> > 
> > Unread portion of the kernel message buffer:
> > cd1 at umass-sim1 bus 1 scbus4 target 0 lun 0
> > cd1: <SAMSUNG CD-ROM 1.00> Removable CD-ROM SCSI-2 device 
> > cd1: Serial Number 000000000002
> > cd1: 1.000MB/s transfers
> > cd1: cd present [3840000 x 512 byte records]
> > cd1: quirks=0x10<10_BYTE_ONLY>
> > panic: mutex CAM device lock not owned at /usr/src/sys/cam/cam_periph.c:301
> > cpuid = 0
> > KDB: enter: panic
> 
> scsi_at_ might work better for this.  It looks like when cdasync() calls 
> cam_periph_alloc() it doesn't have its associated xpt_path locked.  All the 
> other async xpt callbacks I looked at don't lock the xpt path either.  It 
> seems they expect it to be locked by the caller when they are invoked.  It 
> seems xpt_async_process_dev() doesn't always lock xpt_lock, but sometimes
> locks the device instead:
> 
> 	/*
> 	 * If async for specific device is to be delivered to
> 	 * the wildcard client, take the specific device lock.
> 	 * XXX: We may need a way for client to specify it.
> 	 */
> 	if ((device->lun_id == CAM_LUN_WILDCARD &&
> 	     path->device->lun_id != CAM_LUN_WILDCARD) ||
> 	    (device->target->target_id == CAM_TARGET_WILDCARD &&
> 	     path->target->target_id != CAM_TARGET_WILDCARD) ||
> 	    (device->target->bus->path_id == CAM_BUS_WILDCARD &&
> 	     path->target->bus->path_id != CAM_BUS_WILDCARD)) {
> 		mtx_unlock(&device->device_mtx);
> 		xpt_path_lock(path);
> 		relock = 1;
> 	} else
> 		relock = 0;
> 
> 	(*(device->target->bus->xport->async))(async_code,
> 	    device->target->bus, device->target, device, async_arg);
> 	xpt_async_bcast(&device->asyncs, async_code, path, async_arg);
> 
> 	if (relock) {
> 		xpt_path_unlock(path);
> 		mtx_lock(&device->device_mtx);
> 	}
> 
> Maybe try going up to this frame (16) in your dump and do
> 'p *device->target'?  However, someone with more CAM knowledge needs to look 
> at this to see what is actually broken.
> 

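I finally have time to look at this again.  Here's the kgdb output for
frame 16, as you suggested.  'device' is optimized out there, so I went
up to frame 17 and printed the target and bus from xpt_periph->path.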


Script started on Mon Feb  3 08:16:32 2014
% kgdb /dsk1/obj/usr/src/sys/MOBILE/kernel.debug vmcore.0

Unread portion of the kernel message buffer:
panic: mutex CAM device lock not owned at /usr/src/sys/cam/cam_periph.c:301
cpuid = 1
KDB: enter: panic

#16 0xc047d6a5 in xpt_async_process_dev (device=<value optimized out>, 
    arg=0xc70aa800) at /usr/src/sys/cam/cam_xpt.c:4208
#17 0xc047b346 in xpt_async_process (periph=0x0, ccb=0xc70aa800)
    at /usr/src/sys/cam/cam_xpt.c:4173
#18 0xc047bd15 in xpt_done_process (ccb_h=0xc70aa800)
    at /usr/src/sys/cam/cam_xpt.c:5249
#19 0xc047ef14 in xpt_done_td (arg=<value optimized out>)
    at /usr/src/sys/cam/cam_xpt.c:5276
#20 0xc0723daf in fork_exit (callout=0xc047edb0 <xpt_done_td>)
    at /usr/src/sys/kern/kern_fork.c:977
#21 0xc09fb3e4 in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:278
Current language:  auto; currently minimal
(kgdb) frame 16
#16 0xc047d6a5 in xpt_async_process_dev (device=<value optimized out>, 
    arg=0xc70aa800) at /usr/src/sys/cam/cam_xpt.c:4208
4208				cur_entry->callback(cur_entry->callback_arg,
(kgdb) p *device
Cannot access memory at address 0x0
(kgdb) up 1
#17 0xc047b346 in xpt_async_process (periph=0x0, ccb=0xc70aa800)
    at /usr/src/sys/cam/cam_xpt.c:4173
4173			xpt_async_process_dev(xpt_periph->path->device, ccb);
(kgdb) p *xpt_periph->path->device->target
$2 = {ed_entries = {tqh_first = 0xc6f4b800, tqh_last = 0xc6f4b80c}, links = {
    tqe_next = 0x0, tqe_prev = 0xc6eaaa00}, bus = 0xc6eaaa00, 
  target_id = 4294967295, refcount = 2, generation = 1, last_reset = {
    tv_sec = 0, tv_usec = 0}, rpl_size = 0, luns = 0x0, luns_mtx = {
    lock_object = {lo_name = 0xc0a3f9bc "CAM LUNs lock", lo_flags = 16973824, 
      lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}}
(kgdb) p *xpt_periph->path->device->target->bus
$3 = {et_entries = {tqh_first = 0xc6eaa980, tqh_last = 0xc6eaa988}, links = {
    tqe_next = 0x0, tqe_prev = 0xc7690008}, path_id = 4294967295, 
  sim = 0xc6eaaa80, last_reset = {tv_sec = 0, tv_usec = 0}, flags = 0, 
  refcount = 3, generation = 3, parent_dev = 0x0, xport = 0xc0b2f568, 
  eb_mtx = {lock_object = {lo_name = 0xc0a3f85a "CAM bus lock", 
      lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}}
(kgdb) quit
% exit
exit

Script done on Mon Feb  3 08:20:44 2014
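
A note on reading the dump: target_id and path_id both print as
4294967295, which is what an all-ones 32-bit id looks like.  That would
make this device the wildcard entry that the quoted test checks for.
Below is a minimal sketch of the target and bus clauses of that test.
The typedefs and wildcard macros are assumed to mirror sys/cam/cam.h
(check your own tree), needs_path_lock() is a hypothetical helper and
not a kernel function, and the LUN clause is left out because the dump
does not show device->lun_id.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror sys/cam/cam.h; verify against your source tree. */
typedef uint32_t path_id_t;
typedef uint32_t target_id_t;
#define CAM_BUS_WILDCARD	((path_id_t)~0U)
#define CAM_TARGET_WILDCARD	((target_id_t)~0U)

/* The values kgdb printed are the all-ones wildcard ids. */
static_assert(CAM_TARGET_WILDCARD == 4294967295U, "target wildcard");
static_assert(CAM_BUS_WILDCARD == 4294967295U, "bus wildcard");

/*
 * Hypothetical helper: true when the async is headed for a wildcard
 * client (dev_*) but the path (path_*) names a specific target or bus.
 * In the quoted code this is the case where the path lock is taken.
 */
static bool
needs_path_lock(target_id_t dev_target, target_id_t path_target,
    path_id_t dev_bus, path_id_t path_bus)
{
	return ((dev_target == CAM_TARGET_WILDCARD &&
	    path_target != CAM_TARGET_WILDCARD) ||
	    (dev_bus == CAM_BUS_WILDCARD &&
	    path_bus != CAM_BUS_WILDCARD));
}
```

With the dumped wildcard ids on the device side and a specific path
(say target 0 on bus 4, taken from the console message as illustrative
values only, since the path itself is not in the dump), the helper
returns true.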

-- 
Steve
Received on Mon Feb 03 2014 - 15:32:30 UTC