Re: hostap recently broken

From: Michal Mertl <mime_at_traveller.cz>
Date: Wed, 27 Jul 2005 09:53:57 +0200
Sam Leffler wrote:
> Michal Mertl wrote:
> > I think I found what change causes the problem I experience. See below. 
> > 
> > Michal Mertl wrote:
> > 
> >>I'm sorry I forgot to answer one of Sam's questions.
> >>
> >>Michal Mertl wrote:
> >>
> >>>Sam Leffler wrote on Tue, 26 Jul 2005 at 09:29 -0700:
> >>>
> >>>>Michal Mertl wrote:
> >>>>
> >>>>>Sam Leffler wrote:
> >>>>>
> >>>>>
> >>>>>>Michal Mertl wrote:
> >>>>>>
> >>>>>>
> >>>>>>>Hello,
> >>>>>>>
> >>>>>>>I've just found out that something very recently broke hostap on FreeBSD
> >>>>>>>CURRENT. The client associates and gets the MAC address of the AP. When
> >>>>>>>I run tcpdump on the AP I see the pings from the client getting in but
> >>>>>>>the AP doesn't reply. The ARP protocol works but nothing else does.
> >>>>>>>
> >>>>>>>Source checked on 2005-07-22 16:00 UTC works fine.
> >>>>>>>
> >>>>>>>The AP card is atheros but just reverting the last changes to the driver
> >>>>>>>doesn't help.
> >>>>>>
> >>>>>>I just tried with CURRENT (from last night).  5212 card setup with TKIP 
> >>>>>>for PTK and GTK.  ap operating in 11g.  Powerbook running Tiger 
> >>>>>>associated and operated fine.  29Mb/s for upstream tcp netperf (sta and 
> >>>>>>ap in close proximity--rssi 41).
> >>>>>>
> >>>>>>I appreciate you testing stuff but please try to diagnose your problems 
> >>>>>>a bit harder and then provide more useful info like the h/w revs and the 
> >>>>>>exact steps you use to setup a non-working system.
> >>>>>
> >>>>>
> >>>>>Sorry, I had the exact same HW setup as before which I described in my
> >>>>>email about the problem with bridging. 
> >>>>>
> >>>>>I've got several Atheros 5212 cards (mac 5.9 phy 4.3 radio 3.6) and also
> >>>>>IPW notebook all running CURRENT, the notebook and the client several
> >>>>>days old (from before 2005-07-22 16:00 UTC).
> >>>>>
> >>>>>The most basic setup - 'ifconfig ath0 192.168.0.1 mediaopt hostap ssid
> >>>>>aaa' on the AP and 'ifconfig ath0 192.168.0.2 ssid aaa' on the client -
> >>>>>worked like a charm before that date and not after. With the newer kernel
> >>>>>on the AP the cards associate and, as I've just found, the stations on
> >>>>>the AP can communicate with each other. Ping to the AP doesn't work even
> >>>>>when I get the MAC address of the AP via ARP. Adhoc connection works.
> >>>>
> >>>>I am unclear still on what happens.  I believe you are saying:
> >>>>
> >>>>ping 192.168.0.1
> >>>>
> >>>>from the station to the ap fails.  If so what does 80211stats show on 
> >>>>the ap when this happens (do relevant error stats go up)?  If you do
> >>>
> >>> ./80211stats -a
> >>>00:0b:6b:35:dc:d4:
> >>>        rx_mgmt 1
> >>>        tx_data 107 tx_bytes 9788
> >>>
> >>>00:0b:6b:35:dc:f0:
> >>>        rx_data 107 rx_mgmt 1 rx_bytes 10430
> >>>        tx_data 6 tx_mgmt 2 tx_bytes 36
> >>>        tx_assoc 1 tx_auth 1
> >>>
> >>>
> >>>./athstats
> >>>8 tx management frames
> >>>3 tx frames discarded prior to association
> >>>93 tx failed 'cuz too many retries
> >>>930 long on-chip tx retries
> >>>1 tx frames with no ack marked
> >>>8148 beacons transmitted
> >>>27 periodic calibrations
> >>>834 rate control checks
> >>>rssi of last ack: 48
> >>>avg recv rssi: 49
> >>>1 switched default/rx antenna
> >>>Antenna profile:
> >>>[1] tx        8 rx       97
> >>>[2] tx        1 rx        0
> >>>
> >>>
> >>>These were taken shortly after a reboot, after several minutes of
> >>>inactivity, with ping now running for 150 sec.
> >>>
> >>>After some 20 secs:
> >>>
> >>>./athstats
> >>>8 tx management frames
> >>>3 tx frames discarded prior to association
> >>>181 tx failed 'cuz too many retries
> >>>1810 long on-chip tx retries
> >>>1 tx frames with no ack marked
> >>>9021 beacons transmitted
> >>>30 periodic calibrations
> >>>923 rate control checks
> >>>rssi of last ack: 48
> >>>avg recv rssi: 44
> >>>1 switched default/rx antenna
> >>>Antenna profile:
> >>>[1] tx        8 rx      185
> >>>[2] tx        1 rx        0
> >>>
> >>>./80211stats -a
> >>>00:0b:6b:35:dc:d4:
> >>>        rx_mgmt 1
> >>>        tx_data 183 tx_bytes 16780
> >>>
> >>>00:0b:6b:35:dc:f0:
> >>>        rx_data 183 rx_mgmt 1 rx_bytes 17878
> >>>        tx_data 6 tx_mgmt 2 tx_bytes 36
> >>>        tx_assoc 1 tx_auth 1
> >>>
> >>>
> >>>
> >>>>80211debug +input
> >>>
> >>>
> >>>>on the ap do you get any log msgs about discarded frames?
> >>>
> >>>Nothing is displayed.
> >>>
> >>>
> >>>>You also seem to say the sta resolves the ip w/ arp.  Is the same true 
> >>>>for the ap (i.e. that it resolves the ip address of the sta)?  I'm 
> >>>>assuming you are NOT running firewall rules, do not have crypto set up, and 
> >>>>have not fiddled with parameters like apbridge (you didn't provide 
> >>>>ifconfig output for each side).
> >>
> >>I forgot to answer the question about ARP:
> >>
> >>The STA gets the MAC address of the AP via ARP but the AP most often
> >>doesn't. The AP gets it only when neither it nor the STA has the ARP
> >>record and the STA initiates the ping. When I delete the ARP entry on the
> >>AP afterwards, it won't recreate it no matter which direction I ping.
> >>
> >>When doing tcpdump on the STA I see the arp who-has coming in and reply
> >>coming out. When I configure a static ARP entry on the AP I still can't
> >>communicate. When I ping from AP to STA I see both echo and echo-reply
> >>in tcpdump on the STA but the reply doesn't make it to the AP or
> >>something.
> >>
> >>I see the echo replies even in tcpdump on the AP:
> >>
> >>21:07:31.589408 44us DA:00:0b:6b:35:dc:f0 BSSID:00:0b:6b:35:dc:d4
> >>SA:00:0b:6b:35:dc:d4 LLC, dsap SNAP (0xaa), ssap SNAP (0xaa), cmd 0x03:
> >>oui Ethernet (0x000000), ethertype IPv4 (0x0800): (tos 0x0, ttl  64, id
> >>15394, offset 0, flags [none], proto: ICMP (1), length: 84) 192.168.0.1 >
> >>192.168.0.2: ICMP echo request, id 65028, seq 0, length 64
> >>
> >>21:07:31.589801 44us BSSID:00:0b:6b:35:dc:d4 SA:00:0b:6b:35:dc:f0
> >>DA:00:0b:6b:35:dc:d4 LLC, dsap SNAP (0xaa), ssap SNAP (0xaa), cmd 0x03:
> >>oui Ethernet (0x000000), ethertype IPv4 (0x0800): (tos 0x0, ttl  64, id
> >>1528, offset 0, flags [none], proto: ICMP (1), length: 84) 192.168.0.2 >
> >>192.168.0.1: ICMP echo reply, id 65028, seq 0, length 64
> >>
> >>21:07:31.589813 60us DA:00:0b:6b:35:dc:d4 BSSID:00:0b:6b:35:dc:d4
> >>SA:00:0b:6b:35:dc:f0 LLC, dsap SNAP (0xaa), ssap SNAP (0xaa), cmd 0x03:
> >>oui Ethernet (0x000000), ethertype IPv4 (0x0800): (tos 0x0, ttl  64, id
> >>1528, offset 0, flags [none], proto: ICMP (1), length: 84) 192.168.0.2 >
> >>192.168.0.1: ICMP echo reply, id 65028, seq 0, length 64
> >>
> >>
> >>From reading this I got puzzled - why are there multiple packets with
> >>the reply? When I disable the apbridge with 'ifconfig ath0 -apbridge'
> >>everything works!
> >>
> >>I hope this helps.
> >>
> > 
> > 
> > It helped me I guess :-).
> > 
> > Rev. 1.67 of src/sys/net80211/ieee80211_input.c moved several lines of
> > code almost verbatim from the body of ieee80211_input() to a new
> > function. The only difference I see is a change of one check.
> > 
> > The old code "if (ni1->ni_associd != 0) {" was replaced by "if
> > (ieee80211_node_is_authorized(ni1)) {".
> > 
> > The called function is this:
> > 
> > ieee80211_node_is_authorized(const struct ieee80211_node *ni)
> > {
> >         return (ni->ni_flags & IEEE80211_NODE_AUTH);
> > }
> > 
> > The code in question is only called when the interface is in apbridge
> > mode and that's why I was able to locate the problem rather easily. The
> > state of apbridge setting is only checked at one place.
> > 
> > I don't know what the correct fix is, or whether the old code should
> > simply be restored here.
> > 
> > Definitely changing the line back to pre 1.67 contents fixes the problem
> > for me.
> 	...
> 
> The change to validate the station is authorized is correct; this was a 
> longstanding bugfix I'd been meaning to pull into cvs.  The issue was 
> that you cannot bridge directly to the bss node as traffic to it must 
> take the normal input path.  I've committed a change that I believe 
> corrects the problem.  Thank you.
> 
> 	Sam

Thank you so much.

I haven't tested the change (rev. 1.76 ieee80211_input.c) yet but I
think the fix looks correct. I'll inform you if it doesn't work for me.

Michal
Received on Wed Jul 27 2005 - 05:54:11 UTC
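
For readers following the thread, here is a minimal sketch in C of the three variants of the apbridge check discussed above. It is a toy model, not the net80211 source: struct toy_node, TOY_NODE_AUTH and the bridge_* helpers are hypothetical names. The assumption that the AP's own bss node has no association id but does carry the authorized flag is inferred from the symptoms and is not confirmed in the thread. The fix actually committed in rev. 1.76 may look different.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins only; these are NOT the real net80211 definitions. */
#define TOY_NODE_AUTH 0x0001u      /* models IEEE80211_NODE_AUTH */

struct toy_node {
    uint16_t associd;              /* models ni_associd; 0 = no association id */
    uint32_t flags;                /* models ni_flags */
};

/* Pre-1.67 style check quoted in the thread: ni1->ni_associd != 0. */
static bool bridge_old(const struct toy_node *dst)
{
    return dst != NULL && dst->associd != 0;
}

/* 1.67 style check: bridge when the destination node is authorized. */
static bool bridge_authorized(const struct toy_node *dst)
{
    return dst != NULL && (dst->flags & TOY_NODE_AUTH) != 0;
}

/*
 * Assumed shape of the fix Sam describes: keep the authorization check,
 * but never bridge to the bss node, so frames addressed to the AP itself
 * take the normal input path.
 */
static bool bridge_fixed(const struct toy_node *dst,
                         const struct toy_node *bss)
{
    return dst != NULL && dst != bss &&
           (dst->flags & TOY_NODE_AUTH) != 0;
}
```

Under the stated assumption, a bss node with associd 0 and the authorized flag set is skipped by bridge_old(), accepted by bridge_authorized() (so a frame for the AP would be sent back out instead of delivered locally), and skipped again by bridge_fixed(), while an associated, authorized station is bridged in all three cases.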
