Re: lagg0 <em0,iwn0> and tcpdump problem

From: Sam Leffler <sam_at_errno.com>
Date: Wed, 22 Jul 2009 17:29:57 -0700
Giorgos Keramidas wrote:
> When I run tcpdump on lagg0 (with em0 and iwn0 as laggports), tcpdump
> seems to work fine, but typing ^C kills the wireless interface too.
> 
> My /var/log/messages shows at the time:
> 
> Jul 22 17:59:29 kobe kernel: --- syscall (6, FreeBSD ELF32, close), eip = 0x28393313, esp = 0xbfbfe78c, ebp = 0xbfbfe7a8 ---
> Jul 22 17:59:29 kobe kernel: taskqueue_drain with the following non-sleepable locks held:
> Jul 22 17:59:29 kobe kernel: exclusive rw if_lagg rwlock (if_lagg rwlock) r = 0 (0xcb651704) locked @ /usr/src/sys/modules/if_lagg/../../net/if_lagg.c:953
> Jul 22 17:59:29 kobe kernel: exclusive sleep mutex bpf global lock (bpf global lock) r = 0 (0xc0bc1e90) locked @ /usr/src/sys/net/bpf.c:605
> Jul 22 17:59:29 kobe kernel: KDB: stack backtrace:
> Jul 22 17:59:29 kobe kernel: db_trace_self_wrapper(c09a567c,fba428ec,c06be155,c09b0bcd,25d,...) at db_trace_self_wrapper+0x26
> Jul 22 17:59:29 kobe kernel: kdb_backtrace(c09b0bcd,25d,ffffffff,c0b90604,fba42924,...) at kdb_backtrace+0x29
> Jul 22 17:59:29 kobe kernel: _witness_debugger(c09a7af7,fba42938,4,1,0,...) at _witness_debugger+0x25
> Jul 22 17:59:29 kobe kernel: witness_warn(5,0,c0961ec8,137,c673c85c,...) at witness_warn+0x1fd
> Jul 22 17:59:29 kobe kernel: taskqueue_drain(c673c840,c678c0b8,d5821000,fba42994,c075a5fc,...) at taskqueue_drain+0xa9
> Jul 22 17:59:29 kobe kernel: ieee80211_waitfor_parent(c678c000,0,c09b6c55,caf,c678c000,...) at ieee80211_waitfor_parent+0x7b
> Jul 22 17:59:29 kobe kernel: ieee80211_ioctl(d2633400,80206910,fba429b4,c6515748,8903,...) at ieee80211_ioctl+0x1ac
> Jul 22 17:59:29 kobe kernel: if_setflag(d2633438,0,fba42a18,c06bdf9c,100,...) at if_setflag+0x10a
> Jul 22 17:59:29 kobe kernel: ifpromisc(d2633400,0,c7e96a43,431,1,...) at ifpromisc+0x33
> Jul 22 17:59:29 kobe kernel: lagg_setflags(cb651704,c7e96a43,3b9,c09b09ad,c650e380,...) at lagg_setflags+0x84
> Jul 22 17:59:29 kobe kernel: lagg_ioctl(c7057800,80206910,fba42aec,fba42b1c,8903,...) at lagg_ioctl+0x50c
> Jul 22 17:59:29 kobe kernel: if_setflag(c7057838,0,c09a0c3d,df,0,...) at if_setflag+0x10a
> Jul 22 17:59:29 kobe kernel: ifpromisc(c7057800,0,c09b0bcd,236,c6551a4c,...) at ifpromisc+0x33
> Jul 22 17:59:29 kobe kernel: bpf_detachd(c0bc1e90,0,c09b0bcd,25d,d93e57a0,...) at bpf_detachd+0x249
> Jul 22 17:59:29 kobe kernel: bpf_dtor(ca46a100,0,c099768a,9e,c7789460,...) at bpf_dtor+0xb0
> Jul 22 17:59:29 kobe kernel: devfs_destroy_cdevpriv(d93e57a0,0,c099768a,a8,fba42be4,...) at devfs_destroy_cdevpriv+0xac
> Jul 22 17:59:29 kobe kernel: devfs_fpdrop(c7789460,cd581b40,3,0,c7789460,...) at devfs_fpdrop+0x68
> Jul 22 17:59:29 kobe kernel: _fdrop(c7789460,cd581b40,fba42c18,c06bdf9c,0,cd581be4,c0b90600,c0a0afa0,c099cf22,cc5efa2c,45b,c099cf22,fba42c40,c0684440,cc5efa2c,8,c099cf22,45b) at _fdrop+0x53
> Jul 22 17:59:29 kobe kernel: closef(c7789460,cd581b40,45b,440,cc5efa2c,...) at closef+0x290
> Jul 22 17:59:29 kobe kernel: kern_close(cd581b40,3,fba42d2c,c0932863,cd581b40,...) at kern_close+0x117
> Jul 22 17:59:29 kobe kernel: close(cd581b40,fba42cf8,4,c099ed18,c0a01b68,...) at close+0x1a
> Jul 22 17:59:29 kobe kernel: syscall(fba42d38) at syscall+0x2a3
> Jul 22 17:59:29 kobe kernel: Xint0x80_syscall() at Xint0x80_syscall+0x20
> Jul 22 17:59:29 kobe kernel: --- syscall (6, FreeBSD ELF32, close), eip = 0x28393313, esp = 0xbfbfe78c, ebp = 0xbfbfe7a8 ---
> 

This is a known issue; bpf is holding a mutex over calls to the driver 
that may block (in this case the taskqueue_drain calls in net80211). 
It's unlikely to be resolved for 8.0 (too risky).
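
To illustrate the pattern (a userland sketch only, not the bpf or
net80211 code, and every name in it is made up): one common way to avoid
calling something that may sleep with a non-sleepable lock held is to
decide what needs doing under the lock, drop it, and only then make the
slow call.  A plain flag stands in for the mutex so the ordering is easy
to check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; none of these names exist in bpf or net80211. */
struct fake_if {
	bool	lock_held;	/* stands in for a non-sleepable mutex */
	int	promisc_refs;	/* listeners that asked for promiscuous mode */
	int	slow_calls;	/* how many times the blocking path ran */
};

/* Stands in for a driver call that may sleep, e.g. draining a task queue. */
static void
slow_driver_update(struct fake_if *ifp)
{
	/* Sleeping with the lock held is what witness warned about above. */
	assert(!ifp->lock_held);
	ifp->slow_calls++;
}

/* Drop one listener; only the last one triggers the slow driver call. */
static void
detach_listener(struct fake_if *ifp)
{
	bool need_update;

	ifp->lock_held = true;			/* lock */
	assert(ifp->promisc_refs > 0);
	need_update = (--ifp->promisc_refs == 0);
	ifp->lock_held = false;			/* unlock before sleeping */

	if (need_update)
		slow_driver_update(ifp);
}
```

Calling detach_listener() twice on an interface with two listeners runs
the slow path exactly once, and never with the flag set.  The cost of
this shape is that state can change between the unlock and the slow
call, which is presumably part of why a real fix is not trivial.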

> Then typing ^C stops tcpdump but the log shows:
> 
> Jul 22 17:59:29 kobe kernel: wlan0: promiscuous mode disabled
> Jul 22 17:59:29 kobe kernel: em0: promiscuous mode disabled
> Jul 22 17:59:29 kobe kernel: iwn0: error, INTR=82000000<SW_ERROR,RX_INTR> STATUS=0x40010000
> Jul 22 17:59:29 kobe kernel: lagg0: promiscuous mode disabled
> Jul 22 17:59:30 kobe kernel: iwn0: iwn_transfer_firmware: timeout waiting for first alive notice, error 35
> Jul 22 17:59:30 kobe kernel: iwn0: iwn_init_locked: could not load firmware, error 35
> Jul 22 17:59:30 kobe kernel: wlan0: link state changed to DOWN
> Jul 22 17:59:30 kobe kernel: lagg0: link state changed to DOWN
> 
> At this point wlan0 is without carrier, and stays that way until I
> unplumb wlan0 and lagg0 and re-create them.
> 
> It seems that at this part of if_lagg.c we are locking the lagg softc,
> but then we call lagg_setflags() -> lagg_setflag():
> 
>     953                 LAGG_WLOCK(sc);
>     954                 SLIST_FOREACH(lp, &sc->sc_ports, lp_entries) {
>     955                         lagg_setflags(lp, 1);
>     956                 }
>     957                 LAGG_WUNLOCK(sc);
> 
> but this vectors into the wlan code near if_lagg.c:1088.  Does it
> make sense to drop the exclusive lagg lock around the port flag
> changing code, or would this introduce a silly race?
> 
> %%%
> --- a/sys/net/if_lagg.c Wed Jul 15 15:29:17 2009 +0300
> +++ b/sys/net/if_lagg.c Wed Jul 22 18:10:29 2009 +0300
> @@ -1085,7 +1085,9 @@
>          * in accord with actual ports flags.
>          */
>         if (status != (lp->lp_ifflags & flag)) {
> +               LAGG_WUNLOCK(sc);
>                 error = (*func)(ifp, status);
> +               LAGG_WLOCK(sc);
>                 if (error)
>                         return (error);
>                 lp->lp_ifflags &= ~flag;
> %%%

Sounds like iwn isn't reacting well to the calls coming in from lagg.
Turning on state debugging with wlandebug (e.g. "wlandebug -i wlan0
+state") should provide some insight.  I've used lagg+iwn+em on a T61p
with no obvious issues, but I never tried running tcpdump on the lagg
port.
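
On the question of dropping the lagg lock around the flag change: the
general worry with that pattern is that the port list can change while
the lock is not held, so the walk may continue from a stale pointer.  A
rough userland model of one way to notice that, using a generation
counter, is below.  It is not the real if_lagg code; the struct names,
the gen field and the callbacks are all invented for illustration, and a
bool stands in for the rwlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical model; these are not the real lagg structures. */
struct port {
	int			flags;
	SLIST_ENTRY(port)	entries;
};

struct softc {
	bool			locked;	/* stands in for the lagg rwlock */
	unsigned		gen;	/* bumped on every port list change */
	SLIST_HEAD(, port)	ports;
};

typedef void (*port_cb)(struct softc *, struct port *);

/* Models another thread detaching a port while the walker sleeps. */
static void
remove_port(struct softc *sc, struct port *p)
{
	assert(!sc->locked);	/* only possible once the walker dropped it */
	sc->locked = true;
	SLIST_REMOVE(&sc->ports, p, port, entries);
	sc->gen++;
	sc->locked = false;
}

/* A slow call during which nothing else touches the list. */
static void
quiet_call(struct softc *sc, struct port *p)
{
	(void)sc;
	(void)p;
}

/* A slow call during which the current port gets detached. */
static void
detaching_call(struct softc *sc, struct port *p)
{
	remove_port(sc, p);
}

/*
 * Set a flag on every port, dropping the lock around the callback.
 * Returns false if the list changed while the lock was dropped; the
 * walk is then abandoned instead of following a possibly stale pointer.
 */
static bool
set_flags_dropping_lock(struct softc *sc, port_cb cb)
{
	struct port *p;
	unsigned gen;

	sc->locked = true;
	SLIST_FOREACH(p, &sc->ports, entries) {
		gen = sc->gen;
		sc->locked = false;	/* like LAGG_WUNLOCK(sc) */
		cb(sc, p);		/* may sleep; list can change here */
		sc->locked = true;	/* like LAGG_WLOCK(sc) */
		if (gen != sc->gen) {
			sc->locked = false;
			return (false);
		}
		p->flags |= 1;
	}
	sc->locked = false;
	return (true);
}
```

A walk with quiet_call flags every port and returns true; with
detaching_call it stops at the first port and returns false.  Whether
giving up, restarting, or holding a reference on the port is the right
response in lagg is a separate question this sketch does not try to
answer.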

	Sam
Received on Wed Jul 22 2009 - 22:29:58 UTC
