Re: lagg + wlan0 boot timing (EBUSY)

From: Sam Leffler <sam_at_freebsd.org>
Date: Thu, 24 Sep 2009 20:16:15 -0700

David Horn wrote:
> Tracking 8/stable branch on this particular machine (although I do
> have access to -current for testing as needed)  uname -a:
> 
> FreeBSD lagg 8.0-RC1 FreeBSD 8.0-RC1 #11 r197417: Wed Sep 23 01:05:15
> EDT 2009     root_at_lagg:/usr/obj/usr/src/sys/GENERIC  amd64
> 
> I have been trying to track down a problem where my lagg connection
> sometimes fails to enable wlan0 as the fallback port on boot.  It
> works properly about 60% of the time.  The rest of the time, it fails
> with "SIOCSLAGGPORT: Device busy".
> 
> Here are the relevant rc.conf entries:
> 
> ifconfig_bfe0="up"
> wlans_iwn0="wlan0"
> ifconfig_wlan0="WPA"
> ifconfig_iwn0="ether 00:1c:23:98:2c:5d"
> cloned_interfaces="lagg0"
> ipv6_network_interfaces="lagg0"
> ifconfig_lagg0="laggproto failover laggport bfe0 laggport wlan0 DHCP"
> ipv6_enable="YES"
> 
> So, I turned on some logging of all ifconfig commands with timestamps
> and stdout/stderr/returncode, and noticed this:
> 
> Wed Sep 23 01:39:56 EDT 2009 ifconfig: lagg0 create ;
> ;; Wed Sep 23 01:39:56 EDT 2009 lagg0 rc='0' end.
> Wed Sep 23 01:39:56 EDT 2009 ifconfig: -l ;
> iwn0 bfe0 fwe0 fwip0 lo0 lagg0
> ;; Wed Sep 23 01:39:56 EDT 2009 -l rc='0' end.
> Wed Sep 23 01:39:56 EDT 2009 ifconfig: -l ;
> iwn0 bfe0 fwe0 fwip0 lo0 lagg0
> ;; Wed Sep 23 01:39:56 EDT 2009 -l rc='0' end.
> Wed Sep 23 01:39:56 EDT 2009 ifconfig: lo0 inet 127.0.0.1 ;
> ;; Wed Sep 23 01:39:56 EDT 2009 lo0 rc='0' end.
> Wed Sep 23 01:39:56 EDT 2009 ifconfig: lo0 up ;
> ;; Wed Sep 23 01:39:56 EDT 2009 lo0 rc='0' end.
> Wed Sep 23 01:39:56 EDT 2009 ifconfig: iwn0 ether 00:1c:23:98:2c:5d ;
> ;; Wed Sep 23 01:39:56 EDT 2009 iwn0 rc='0' end.
> Wed Sep 23 01:39:56 EDT 2009 ifconfig: iwn0 up ;
> ;; Wed Sep 23 01:39:56 EDT 2009 iwn0 rc='0' end.
> Wed Sep 23 01:39:56 EDT 2009 ifconfig: wlan0 create wlandev iwn0 ;
> ;; Wed Sep 23 01:39:56 EDT 2009 wlan0 rc='0' end.
> Wed Sep 23 01:39:56 EDT 2009 ifconfig: wlan0 ;
> wlan0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
>         ether 00:1c:23:98:2c:5d
>         media: IEEE 802.11 Wireless Ethernet autoselect (autoselect)
>         status: no carrier
>         ssid "" channel 1 (2412 Mhz 11b)
>         country US authmode OPEN privacy OFF txpower 14 bmiss 10 scanvalid 60
>         wme bintval 0
> ;; Wed Sep 23 01:39:56 EDT 2009 wlan0 rc='0' end.
> Wed Sep 23 01:39:57 EDT 2009 ifconfig: lagg0 laggproto failover
> laggport bfe0 laggport wlan0 ;
> ifconfig.real: SIOCSLAGGPORT: Device busy
> ;; Wed Sep 23 01:39:57 EDT 2009 lagg0 rc='1' end.
> 
> So, I started looking at the /sys/net/if_lagg.c source, and found the
> EBUSY response cases:
> 
> This one
> 
> /* New lagg port has to be in an idle state */
>         if (ifp->if_drv_flags & IFF_DRV_OACTIVE)
>                 return (EBUSY);
> 
> seems to be the culprit, but unfortunately I'm not familiar enough
> with the code to take this much further.  I did build a kernel without
> this check, and everything seems to work, but this is obviously
> not a real fix for the problem.  My guess is that the timing issue is
> wpa_supplicant talking to wlan0 (trying to scan/associate/auth)
> while lagg is trying to add wlan0 to its port list.
> 
> I confirmed this behavior as follows:
> 
> ifconfig wlan0 destroy
> ifconfig lagg0 destroy
> ifconfig lagg0 create
> ifconfig wlan0 create wlandev iwn0  & ; ifconfig lagg0 laggproto
> failover laggport bfe0 laggport wlan0
> results in:
> ifconfig: SIOCSLAGGPORT: Device busy
> 
> Does someone more clueful than me know of a correct way to fix this
> contention issue?
> Want me to file a PR for tracking purposes?
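
The timestamped trace quoted above could be captured with a small wrapper along these lines. This is a minimal sketch, not the script actually used: the function name log_run, the log path, and the IFCONFIG_LOG variable are all assumptions, and the record layout only approximates the one in the quoted log.

```shell
#!/bin/sh
# Hypothetical trace helper. The log location below is an assumption.
LOG="${IFCONFIG_LOG:-/var/log/ifconfig-trace.log}"

# log_run CMD [ARGS...]
# Runs CMD with ARGS, appends a timestamped record (arguments, combined
# stdout/stderr, exit status) to $LOG, replays the captured output on
# stdout, and returns CMD's exit status.
log_run() {
    lr_cmd=$1
    shift
    lr_out=$("$lr_cmd" "$@" 2>&1)
    lr_rc=$?
    {
        printf '%s %s: %s ;\n' "$(date)" "${lr_cmd##*/}" "$*"
        if [ -n "$lr_out" ]; then
            printf '%s\n' "$lr_out"
        fi
        printf ";; %s %s rc='%s' end.\n" "$(date)" "${1:-}" "$lr_rc"
    } >> "$LOG"
    if [ -n "$lr_out" ]; then
        # stderr was merged into stdout above, so it is replayed on stdout.
        printf '%s\n' "$lr_out"
    fi
    return "$lr_rc"
}
```

For example, `log_run /sbin/ifconfig.real lagg0 create` would append one record to the log. The name ifconfig.real comes from the error line in the trace, which suggests the real binary had been renamed behind a wrapper, but the path used here is a guess.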

OACTIVE is set on wlan0 if packets come down the tx path before the
ifnet reaches RUN state.  This is done to block traffic and should have
no effect other than causing packets to be queued in the send queue.  It
probably happens when IPv6 is enabled because NDP kicks in on a link state
change (though that should happen only after reaching RUN state).  I've
no idea why lagg treats OACTIVE the way it does; I'd need to read the code.
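
If OACTIVE is only set until wlan0 reaches RUN state, a possible stopgap is to retry adding the port until the EBUSY goes away. This is a sketch under that assumption, not a fix proposed in this thread: the helper name retry and the attempt count and delay in the usage comment are hypothetical, and it does nothing about the underlying check in if_lagg.c.

```shell
#!/bin/sh
# retry TRIES DELAY CMD [ARGS...]
# Runs CMD until it succeeds, at most TRIES times, sleeping DELAY seconds
# between attempts. Returns 0 on success, otherwise CMD's last exit status.
retry() {
    rt_tries=$1
    rt_delay=$2
    shift 2
    rt_attempt=1
    while true; do
        if "$@"; then
            return 0
        else
            rt_rc=$?
        fi
        if [ "$rt_attempt" -ge "$rt_tries" ]; then
            return "$rt_rc"
        fi
        rt_attempt=$((rt_attempt + 1))
        sleep "$rt_delay"
    done
}

# Hypothetical usage (values are guesses, not tested on the affected box):
#   retry 10 1 ifconfig lagg0 laggport wlan0
```

Whether this ever succeeds depends on wlan0 actually reaching RUN state, which with WPA requires an association, so a bounded number of attempts is safer than looping forever.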

	Sam
Received on Fri Sep 25 2009 - 01:16:17 UTC