Re: Connection problems with wme enabled

From: Sam Leffler <sam_at_freebsd.org>
Date: Thu, 01 May 2008 13:25:44 -0700

Fabian Keil wrote:
> Sam Leffler <sam_at_freebsd.org> wrote:
>
>   
>> Fabian Keil wrote:
>>     
>>> Sam Leffler <sam_at_freebsd.org> wrote:
>>>       
>
>   
>>> dmesg doesn't show any relevant messages,
>>> even when booted in verbose mode.
>>>
>>> The ifconfig output looks normal (to me) as well:
>>>
>>> fk_at_TP51 ~ $sudo ifconfig -v wlan0
>>> wlan0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
>>>         ether 00:0e:...
>>>         inet 192.168.0.49 netmask 0xffffff00 broadcast 192.168.0.255
>>>         media: IEEE 802.11 Wireless Ethernet OFDM/54Mbps mode 11g
>>>         status: associated
>>>         ssid ... channel 7 (2442 Mhz 11g) bssid 00:14:...
>>>         regdomain DEBUG country DE anywhere -ecm authmode OPEN -wps -tsn
>>>         privacy ON deftxkey 1
>>>         wepkey 1:104-bit powersavemode OFF powersavesleep 100 txpower 30
>>>         txpowmax 50.0 -dotd rtsthreshold 2346 fragthreshold 2346 bmiss 24
>>>         11b    ucast NONE    mgmt  1 Mb/s mcast  1 Mb/s maxretry 6
>>>         11g    ucast NONE    mgmt  1 Mb/s mcast  1 Mb/s maxretry 6
>>>         11na   ucast NONE    mgmt  0 MCS  mcast  0 MCS  maxretry 6
>>>         11ng   ucast NONE    mgmt  0 MCS  mcast  0 MCS  maxretry 6
>>>         scanvalid 60 -bgscan bgscanintvl 300 bgscanidle 250
>>>         roam:11b    rssi    7dBm rate  1 Mb/s
>>>         roam:11g    rssi    7dBm rate  5 Mb/s -pureg protmode CTS -ht
>>>         -htcompat -ampdu ampdulimit 8k ampdudensity - -amsdu -shortgi
>>>         htprotmode RTSCTS -puren wme -burst -ff -dturbo -dwds roaming AUTO
>>>         bintval 100
>>>         AC_BE cwmin  4 cwmax 10 aifs  3 txopLimit   0 -acm ack
>>>               cwmin  4 cwmax 10 aifs  3 txopLimit   0 -acm
>>>         AC_BK cwmin  4 cwmax 10 aifs  7 txopLimit   0 -acm ack
>>>               cwmin  4 cwmax 10 aifs  7 txopLimit   0 -acm
>>>         AC_VI cwmin  3 cwmax  4 aifs  2 txopLimit  94 -acm ack
>>>               cwmin  3 cwmax  4 aifs  2 txopLimit  94 -acm
>>>         AC_VO cwmin  2 cwmax  3 aifs  2 txopLimit  47 -acm ack
>>>               cwmin  2 cwmax  3 aifs  2 txopLimit  47 -acm
>>>         groups: wlan 
>>>
>>> While it shows association, open connections stall and
>>> I can't create new ones until reviving the device with
>>> ifconfig wlan0 down up.
>>>
>>> Under load (100K download rate) and with wme enabled
>>> the problem occurs after less than 5 seconds; if there's
>>> less load, it'll work a bit longer.
>>>
>>> wlanstats while the device is unresponsive:
>>>
>>> fk_at_TP51 ~ $wlanstats 
>>> 1        rx from wrong bssid
>>> 4756     rx discard 'cuz dup
>>> 33       rx discard 'cuz mcast echo
>>> 6        rx discard mgt frames
>>> 471      rx beacon frames
>>> 6        rx element unknown
>>> 390      rx frame chan mismatch
>>> 8        rx disassociation
>>> 8        beacon miss events handled
>>> 23       rx discard 'cuz port unauthorized
>>> 25       active scans started
>>> 123844   wep crypto done in s/w
>>> 934      rx management frames
>>> 24       tx failed 'cuz vap not in RUN state
>>> 165      total data frames received
>>> 160      unicast data frames received
>>> 5        multicast data frames received
>>> 355      total data frames transmit
>>> 355      unicast data frames sent
>>> 54M      current transmit rate
>>> 42       current rssi
>>> 42       current signal (dBm)
>>>
>>>   
>>>       
>> "8 beacon miss events handled"--so the firmware said you lost signal.
>>
>>     
>>> While the number of "chan mismatch" seems high,
>>> I get the impression that it only increases while
>>> the device is going down and up. It doesn't seem
>>> to increase while the device is working or hanging.
>>>
>>> wlanstats a bit later with wme disabled and wlan0 working:
>>>
>>> fk_at_TP51 ~ $wlanstats 
>>> 1        rx from wrong bssid
>>> 4891     rx discard 'cuz dup
>>> 33       rx discard 'cuz mcast echo
>>> 6        rx discard mgt frames
>>> 519      rx beacon frames
>>> 6        rx element unknown
>>> 453      rx frame chan mismatch
>>> 8        rx disassociation
>>> 8        beacon miss events handled
>>> 23       rx discard 'cuz port unauthorized
>>> 27       active scans started
>>> 130514   wep crypto done in s/w
>>> 1048     rx management frames
>>> 25       tx failed 'cuz vap not in RUN state
>>> 3318     total data frames received
>>> 3318     unicast data frames received
>>> 2829     total data frames transmit
>>> 2829     unicast data frames sent
>>> 36M      current transmit rate
>>> 42       current rssi
>>> 42       current signal (dBm)
>>>   
>>>       
>> wlanstats 1 gives you a rolling display every second; that's usually 
>> more helpful in understanding what's happening.  Unfortunately there are 
>> more stats than can fit on a rolling display so sometimes the one(s) you 
>> want aren't shown.  There is a column fmt mechanism a la ps to control 
>> output but it's not well developed (someone please take and improve).  
>> Also some stats are maintained by drivers and not yet counted in the 
>> net80211 layer (again, folks are welcome to help).
>>     
>
> While working:
>
> fk_at_TP51 ~ $wlanstats 1
>    input 2short rx_ucast bvers wrbss rxdup mecho wrdir
>       14     0       14     0     1 29861    54     0
>       19     0       19     0     0     0     0     0
>        7     0        7     0     0     0     0     0
>       15     0       15     0     0     0     0     0
>       14     0       14     0     0     0     0     0
>        4     0        4     0     0     0     0     0
>        1     0        1     0     0     0     0     0
>        1     0        1     0     0     0     0     0
>        1     0        1     0     0     0     0     0
>        3     0        3     0     0     0     0     0
>        2     0        2     0     0     0     0     0
>        2     0        2     0     0     0     0     0
>        2     0        2     0     0     0     0     0
>        3     0        3     0     0     0     0     0
>        2     0        2     0     0     0     0     0
>        0     0        0     0     0     0     0     0
>        3     0        3     0     0     0     0     0
>        2     0        2     0     0     0     0     0
>        0     0        0     0     0     0     0     0
>        1     0        1     0     0     0     0     0
>        1     0        1     0     0     0     0     0
> ^C
>
> While "hanging" ...
>
> fk_at_TP51 ~ $wlanstats 1
>    input 2short rx_ucast bvers wrbss rxdup mecho wrdir
>      882     0      831     0     1 29859    50     0
>        1     0        0     0     0     0     0     0
>        0     0        0     0     0     0     0     0
>        1     0        0     0     0     0     0     0
>        0     0        0     0     0     0     0     0
>        1     0        0     0     0     0     0     0
>        1     0        0     0     0     0     0     0
>        0     0        0     0     0     0     0     0
>        0     0        0     0     0     0     0     0
>        1     0        0     0     0     0     0     0
>        1     0        0     0     0     0     0     0
>        1     0        0     0     0     0     0     0
>        0     0        0     0     0     0     0     0
>        1     0        0     0     0     0     0     0
>        1     0        0     0     0     0     0     0
>        1     0        0     0     0     0     0     0
>        0     0        0     0     0     0     0     0
>        0     0        0     0     0     0     0     0
>        0     0        0     0     0     0     0     0
>        0     0        0     0     0     0     0     0
> ^C
>
>   

Your code is out of date; I just imported some fixes yesterday :)

>>> It's interesting that with wme enabled the hangs
>>> usually occur with the transmit rate at 54, while
>>> it's usually a lot lower with wme disabled and the
>>> device working.
>>>       
>
>   
>> iwi does tx rate control in the firmware so unlikely to be related.  The 
>> more likely issue is the beacon miss handling.  The driver should 
>> recover and reconnect but it sounds like something didn't work.  As a 
>> workaround you can try upping the bmiss count to see if this is a 
>> problem w/ the radio going deaf for a period of time--something I've 
>> seen on older Intel parts.
>>     
>
> Increasing bmiss to 250 (or decreasing it to 10)
> doesn't seem to affect the problem.
>   

Well if your beacon interval is 100 TU then the default setting of 24 
means you didn't see a beacon frame in 2400 TU (~2.4 seconds) which is a 
really long time even if the channel is way busy.  The firmware handles 
this notification so it could be a firmware issue; if I were 
investigating I'd sniff packets to see.
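
For reference, the timeout arithmetic above can be sketched in a few lines
of Python. This is only an illustration, not driver or firmware code: the
function name bmiss_timeout_seconds is made up here, and it assumes the
standard 802.11 definition of 1 TU = 1024 microseconds, so 2400 TU comes
out slightly above the rounded 2.4 seconds. The counts 10 and 250 are the
values tried earlier in this thread.

```python
# Hypothetical helper for illustration; not part of net80211 or iwi.
# Assumes 1 TU (time unit) = 1024 microseconds, as defined by IEEE 802.11.
TU_MICROSECONDS = 1024


def bmiss_timeout_seconds(beacon_interval_tu, bmiss_count):
    """Time without a received beacon before the miss threshold is hit."""
    return beacon_interval_tu * bmiss_count * TU_MICROSECONDS / 1_000_000


# Beacon interval of 100 TU (the bintval shown by ifconfig) with the
# default bmiss of 24 and the two other values tried in this thread.
for count in (10, 24, 250):
    print(count, round(bmiss_timeout_seconds(100, count), 3))
```

Under these assumptions, changing bmiss only scales the window in which no
beacon is seen; it does not say anything about why beacons are missed.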

I've tested bmiss handling before (yesterday even) and it worked for me 
w/ and w/o wme enabled so not sure what to say.  What I have noticed is 
the firmware sometimes delivers a slew of beacon miss notifications 
immediately after associating to an ap.  I have some ideas why this 
might occur but Intel wouldn't answer when asked.  However if you're 
seeing bmiss after lots of traffic has passed then it's unclear what's 
happening.

I tested mostly with a 2915 card fwiw.

>   
>>> There are several access points in my neighbourhood,
>>> mine doesn't always have the strongest signal:
>>>
>>> fk_at_TP51 ~ $ifconfig wlan0 scan
>>> SSID            BSSID              CHAN RATE   S:N     INT CAPS
>>> ...             00:18:...           11   54M  21:0    100 EPS 
>>> my ap           00:14:...            7   54M  21:0    100 EPS  WME
>>> ...             00:15:...            6   54M  14:0    100 EPB  WPA
>>> ...             00:04:...            6   54M  19:0    100 EP   WPA WME
>>>
>>> I can't reproduce the problem with ath0.
>>>
>>> I'll be glad to provide further information, just tell me what you need.
>>>       
>
>   
>> See above.  I ran tests yesterday w/ wme enabled in my environment but 
>> the signal was stronger so not an equivalent test.  What you need to do 
>> is get a log that captures the event of losing connectivity.  This must 
>> include net80211 logging and probably needs to include some level of 
>> driver debugging as the problem is in the driver.  Try:
>>
>> wlandebug state+scan+auth+assoc
>>     
>
> fk_at_TP51 ~ $sudo wlandebug state+scan+auth+assoc
> wlandebug: sysctl-get(net.wlan.0.debug): No such file or directory
>
> fk_at_TP51 ~ $sysctl net.wlan     
> net.wlan.addba_maxtries: 3
> net.wlan.addba_backoff: 10000
> net.wlan.addba_timeout: 250
> net.wlan.cac_timeout: 60
> net.wlan.nol_timeout: 1800
> net.wlan.recv_bar: 1
> net.wlan.0.%parent: iwi0
> net.wlan.0.driver_caps: 92307968
> net.wlan.0.bmiss_max: 200 (increased by me, without noticeable effect)
> net.wlan.0.inact_run: 300
> net.wlan.0.inact_probe: 30
> net.wlan.0.inact_auth: 180
> net.wlan.0.inact_init: 30
>
>   
>> sysctl debug.iwi=5
>>     
>
> I'm not sure how useful it is without net80211 logging,
> but I uploaded 160K of iwi0 messages at:
>
> http://www.fabiankeil.de/tmp/freebsd/iwi0-messages.txt
>
> During the "hangs" the device seems to be
> sending more often than it receives.
>
> Fabian
>   
Looks like I failed to include IEEE80211_DEBUG in the default kernel 
configs; you'll need that to get wlan debug msgs.  I'll try to look at 
your log later.

    Sam

Received on Thu May 01 2008 - 18:25:49 UTC
