Re: if_bridge crash

From: Doug Rabson <dfr_at_rabson.org>
Date: Sun, 22 Jul 2007 08:58:12 +0100
On Saturday 21 July 2007, Andrew Thompson wrote:
> On Sun, Jul 22, 2007 at 09:07:59AM +1200, Andrew Thompson wrote:
> > On Sat, Jul 21, 2007 at 08:38:59PM +0200, Attilio Rao wrote:
> > > Doug Rabson wrote:
> > > >I've been using if_bridge and if_tap to join various qemu
> > > > virtual machines onto my local network. I use this script to
> > > > set up the bridge:
> > > >
> > > >	ifconfig bridge0 create
> > > >	ifconfig tap0 create
> > > >	ifconfig bridge0 addm vr0 addm tap0 up
> > > >
> > > >I had forgotten what stupid mac address qemu had made up for its
> > > >interface and I needed to adjust my dhcpd config so I typed
> > > > 'ifconfig bridge addr' to list the addresses on the bridge and
> > > > got an instant panic. Qemu was not running at this point. The
> > > > kernel address where it crashed was good - it was the userland
> > > > address which faulted.
> > > >
> > > >The crash was in generic_copyout+0x36 called from
> > > > bridge_ioctl+0x1ae. I took a look at the code and as far as I
> > > > can make out, trap() got a bit confused and managed to ignore
> > > > > the pcb_onfault marker left by copyout. It's hard to tell
> > > > exactly what happened since the damn compiler has optimised the
> > > > crap out of the code there.
> > > >
> > > >As far as I can see, the bridge code is calling copyout with a
> > > > mutex held. Is that allowed? It doesn't sound like it should be
> > > > allowed but I'm not quite up-to-date on that aspect of the
> > > > current kernel api.
> > >
> > > > Since a copyout() can generate a page fault (which can put the
> > > > thread to sleep), it is not allowed to hold either a blockable
> > > > lock (mutex, rwlock) or a spinlock across a copyout.
> >
> > Please test this patch.
>
> One more time with the file attached.

I still get a panic but I managed to get more information this time. The 
original panic was a WITNESS complaint (not sure why that put it into 
the debugger rather than just logging the LOR).

Here is what happens with your patch. I continued from the first call to 
ddb and that generated the subsequent LOR report.

Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex if_bridge r = 0 (0xcc3be00c) locked @ /usr/src/sys/modules/if_bridge/../../net/if_bridge.c:715
KDB: stack backtrace:
db_trace_self_wrapper(c082abf1,f6b9c9c0,c05f25fd,c082afb0,f6b9c9d4,...) at db_trace_self_wrapper+0x26
kdb_backtrace(c082afb0,f6b9c9d4,4,1,0,...) at kdb_backtrace+0x29
witness_warn(5,0,c084f993,c08bd4f4,f6b9c9f4,...) at witness_warn+0x1cd
trap(f6b9ca60) at trap+0x165
calltrap() at calltrap+0x6
--- trap 0xc, eip = 0xc07d760e, esp = 0xf6b9caa0, ebp = 0xf6b9caec ---
generic_copyout(cc3be000,f6b9cb08,cc40ae17,2cb,cc3be00c,...) at generic_copyout+0x36
bridge_ioctl(c57cf400,c01c697b,c6c6a860,c0877704,c0834167,...) at bridge_ioctl+0x1c8
in_control(c882dc60,c01c697b,c6c6a860,c57cf400,cc378e00,...) at in_control+0xda4
ifioctl(c882dc60,c01c697b,c6c6a860,cc378e00,cc378e00,...) at ifioctl+0x323
soo_ioctl(ca21fbd0,c01c697b,c6c6a860,ca3ac400,cc378e00,...) at soo_ioctl+0x3c7
kern_ioctl(cc378e00,3,c01c697b,c6c6a860,1000000,...) at kern_ioctl+0x243
ioctl(cc378e00,f6b9ccfc,c,c082d957,c0871950,...) at ioctl+0x134
syscall(f6b9cd38) at syscall+0x2b3
Xint0x80_syscall() at Xint0x80_syscall+0x20
--- syscall (54, FreeBSD ELF32, ioctl), eip = 0x2816733f, esp = 0xbfbfe27c, ebp = 0xbfbfe2c8 ---


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address	= 0x2820a000
fault code		= supervisor write, page not present
instruction pointer	= 0x20:0xc07d760e
stack pointer	        = 0x28:0xf6b9caa0
frame pointer	        = 0x28:0xf6b9caec
code segment		= base 0x0, limit 0xfffff, type 0x1b
			= DPL 0, pres 1, def32 1, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 1512 (ifconfig)
lock order reversal: (sleepable after non-sleepable)
 1st 0xcc3be00c if_bridge (if_bridge) @ /usr/src/sys/modules/if_bridge/../../net/if_bridge.c:715
 2nd 0xc87953e4 user map (user map) @ vm/vm_map.c:3075
KDB: stack backtrace:
db_trace_self_wrapper(c082abf1,f6b9c7cc,c05f347e,c082d065,c87953e4,...) at db_trace_self_wrapper+0x26
kdb_backtrace(c082d065,c87953e4,c0845907,c0845907,c08458a9,...) at kdb_backtrace+0x29
witness_checkorder(c87953e4,9,c08458a0,c03,f6b9c814,...) at witness_checkorder+0x6de
_sx_xlock(c87953e4,0,c08458a0,c03,f6b9c834,...) at _sx_xlock+0x7d
_vm_map_lock_read(c87953a0,c08458a0,c03,0,c05f1032,...) at _vm_map_lock_read+0x50
vm_map_lookup(f6b9c91c,2820a000,2,f6b9c920,f6b9c910,...) at vm_map_lookup+0x38
vm_fault(c87953a0,2820a000,2,8,2820a000,...) at vm_fault+0x83
trap_pfault(5,0,c084f993,c08bd4f4,f6b9c9f4,...) at trap_pfault+0xf9
trap(f6b9ca60) at trap+0x412
calltrap() at calltrap+0x6
--- trap 0xc, eip = 0xc07d760e, esp = 0xf6b9caa0, ebp = 0xf6b9caec ---
generic_copyout(cc3be000,f6b9cb08,cc40ae17,2cb,cc3be00c,...) at generic_copyout+0x36
bridge_ioctl(c57cf400,c01c697b,c6c6a860,c0877704,c0834167,...) at bridge_ioctl+0x1c8
in_control(c882dc60,c01c697b,c6c6a860,c57cf400,cc378e00,...) at in_control+0xda4
ifioctl(c882dc60,c01c697b,c6c6a860,cc378e00,cc378e00,...) at ifioctl+0x323
soo_ioctl(ca21fbd0,c01c697b,c6c6a860,ca3ac400,cc378e00,...) at soo_ioctl+0x3c7
kern_ioctl(cc378e00,3,c01c697b,c6c6a860,1000000,...) at kern_ioctl+0x243
ioctl(cc378e00,f6b9ccfc,c,c082d957,c0871950,...) at ioctl+0x134
syscall(f6b9cd38) at syscall+0x2b3
Xint0x80_syscall() at Xint0x80_syscall+0x20
--- syscall (54, FreeBSD ELF32, ioctl), eip = 0x2816733f, esp = 0xbfbfe27c, ebp = 0xbfbfe2c8 ---
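
Both reports point at the same thing: copyout runs while the if_bridge
mutex is held. One common way around that is to snapshot the data into a
kernel buffer under the lock, drop the lock, and only then copy out. The
following is a rough userland sketch of that ordering, not the actual
if_bridge code or the patch under test. Everything in it (struct
fake_bridge, fb_lock, fb_unlock, fake_copyout, fb_list_addrs) is made up
for illustration. The lock is a plain flag, and fake_copyout is a memcpy
that refuses to run with the lock held, roughly standing in for the
WITNESS check above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define FB_ADDR_LEN  6   /* bytes per MAC address */
#define FB_MAX_ADDRS 8   /* arbitrary table size for the sketch */

/* Hypothetical stand-in for a bridge softc; not the real bridge_softc. */
struct fake_bridge {
	int           lock_held;  /* toy lock: 1 while "locked" */
	size_t        naddrs;
	unsigned char addrs[FB_MAX_ADDRS][FB_ADDR_LEN];
};

void
fb_lock(struct fake_bridge *b)
{
	assert(!b->lock_held);    /* toy lock is not recursive */
	b->lock_held = 1;
}

void
fb_unlock(struct fake_bridge *b)
{
	assert(b->lock_held);
	b->lock_held = 0;
}

/*
 * Stand-in for copyout(9). The real one may fault and sleep, so this
 * one simply fails if the caller still holds the lock.
 */
int
fake_copyout(const struct fake_bridge *b, const void *kaddr, void *uaddr,
    size_t len)
{
	if (b->lock_held)
		return (-1);
	memcpy(uaddr, kaddr, len);
	return (0);
}

/* Snapshot under the lock, unlock, then copy out. */
int
fb_list_addrs(struct fake_bridge *b, void *ubuf, size_t ulen, size_t *copied)
{
	unsigned char *tmp;
	size_t len;
	int error;

	/* Allocate before locking; a sleeping allocation has the same problem. */
	tmp = malloc(sizeof(b->addrs));
	if (tmp == NULL)
		return (-1);

	fb_lock(b);
	len = b->naddrs * FB_ADDR_LEN;
	if (len > ulen)
		len = ulen - (ulen % FB_ADDR_LEN);  /* whole entries only */
	memcpy(tmp, b->addrs, len);
	fb_unlock(b);

	/* The lock is dropped, so a fault here would be harmless. */
	error = fake_copyout(b, tmp, ubuf, len);
	free(tmp);
	if (error == 0)
		*copied = len;
	return (error);
}
```

The cost is an extra allocation and copy per ioctl, and the caller sees a
snapshot that may already be stale by the time it arrives. For a listing
ioctl that seems an acceptable trade.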
Received on Sun Jul 22 2007 - 05:59:07 UTC