Re: if_ral regression

From: Sepherosa Ziehau <sepherosa_at_gmail.com>
Date: Tue, 1 Jan 2008 14:27:47 +0800
On Dec 29, 2007 8:33 PM, Dag-Erling Smørgrav <des_at_des.no> wrote:
> I upgraded my router cum firewall cum access point (soekris net4801 with
> a cheap third-party ralink-based wlan adapter) from RELENG_6 to HEAD and
> noticed what seems to be a regression in if_ral.  After a certain amount
> of use (i.e. actually having a client connected to it and transferring
> data), the connection falters, and eventually the client can no longer
> even see the access point in a scan.  Restarting the interface on
> the router (/etc/rc.d/netif restart ral0) fixes it.  I now have a cron
> job that does this every five minutes.  I still get occasional outages,
> but all I have to do is wait a few minutes for the cron job to kick in.
>
> Outages are clearly related to traffic; a sure-fire way to trigger one
> is to start a backup job on my laptop (rsync to my file server).  I will
> lose the wlan connection repeatedly until I either stop trying or run
> the script with a bandwidth limit.
>
> des_at_soe ~% uname -a
> FreeBSD soe.des.no 8.0-CURRENT FreeBSD 8.0-CURRENT #0: Sat Dec 15 20:46:29 UTC 2007     des_at_pwd.des.no:/usr/obj/usr/src/sys/soe  i386
> des_at_soe ~% kldstat -v
> Id Refs Address    Size     Name
>  1   18 0xc0400000 33fdfc   kernel (/boot/soe/kernel)
>  2    1 0xc0740000 7690     if_sis.ko (/boot/soe/if_sis.ko)
>  3    2 0xc0748000 1dbe0    miibus.ko (/boot/soe/miibus.ko)
>  4    1 0xc0766000 18e28    if_ral.ko (/boot/soe/if_ral.ko)
>  5    4 0xc077f000 2a95c    wlan.ko (/boot/soe/wlan.ko)
>  6    1 0xc07aa000 2cb0     wlan_acl.ko (/boot/soe/wlan_acl.ko)
>  7    1 0xc07ad000 1924     wlan_scan_ap.ko (/boot/soe/wlan_scan_ap.ko)
>  8    1 0xc107f000 6000     geom_md.ko (/boot/soe/geom_md.ko)
>  9    1 0xc10f9000 2000     pflog.ko (/boot/soe/pflog.ko)
> 10    1 0xc10fb000 2f000    pf.ko (/boot/soe/pf.ko)
> 11    4 0xc118d000 a000     netgraph.ko (/boot/soe/netgraph.ko)
> 12    1 0xc119c000 3000     ng_ether.ko (/boot/soe/ng_ether.ko)
> 13    1 0xc11a8000 5000     ng_pppoe.ko (/boot/soe/ng_pppoe.ko)
> 14    1 0xc11ad000 4000     ng_socket.ko (/boot/soe/ng_socket.ko)
> des_at_soe ~% grep ral0 /var/run/dmesg.boot
> ral0: <Ralink Technology RT2560> mem 0xa0004000-0xa0005fff irq 11 at device 10.0 on pci0

I don't know whether the following changes will fix your problem:

1)
rt2560.c: rt2560_setup_tx_desc()
Set the RT2560_{TX,TX_CIPHER}_BUSY desc flag at the end of this function
instead of at the beginning.  The original way _may_ confuse the
hardware encryption/tx engine.
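
To illustrate 1), here is a minimal standalone sketch of the ordering I
mean.  It is not the driver code: the ex_* names, the flag values and the
descriptor layout are all made up for the example, and a real driver
would also need the usual DMA sync before the hardware looks at the
descriptor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical flag values and descriptor layout, for illustration only;
 * the real definitions live in the driver's headers.
 */
#define EX_TX_BUSY		(1u << 0)
#define EX_TX_CIPHER_BUSY	(1u << 1)

struct ex_tx_desc {
	volatile uint32_t flags;	/* ownership bits polled by hardware */
	uint32_t physaddr;
	uint16_t len;
	uint8_t rate;
};

static void
ex_setup_tx_desc(struct ex_tx_desc *desc, uint32_t physaddr, uint16_t len,
    uint8_t rate, int encrypt)
{
	assert(desc != NULL);

	/* Fill every field while the host still owns the descriptor. */
	desc->physaddr = physaddr;
	desc->len = len;
	desc->rate = rate;

	/* Hand the descriptor to the hardware only as the final step. */
	desc->flags = encrypt ? EX_TX_CIPHER_BUSY : EX_TX_BUSY;
}
```

The point is only that the BUSY bit is the last store, so the engine
never sees a descriptor that is marked busy but half filled.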

2)
Also, rt2560_bbp_read() is not correct; it should look like the following:
static uint8_t
rt2560_bbp_read(struct rt2560_softc *sc, uint8_t reg)
{
	uint32_t val;
	int ntries;

	for (ntries = 0; ntries < 100; ntries++) {
		if (!(RAL_READ(sc, RT2560_BBPCSR) & RT2560_BBP_BUSY))
			break;
		DELAY(1);
	}
	if (ntries == 100) {
		device_printf(sc->sc_dev, "could not read from BBP\n");
		return 0;
	}

	val = RT2560_BBP_BUSY | reg << 8;
	RAL_WRITE(sc, RT2560_BBPCSR, val);

	for (ntries = 0; ntries < 100; ntries++) {
		val = RAL_READ(sc, RT2560_BBPCSR);
		if (!(val & RT2560_BBP_BUSY))
			return val & 0xff;
		DELAY(1);
	}

	device_printf(sc->sc_dev, "could not read from BBP\n");
	return 0;
}

3)
After the above fix,
rt2560_set_txantenna() and rt2560_set_rxantenna() should be called
after rt2560_bbp_init(), since these two functions touch the BBP.  NOTE:
without the above fix, you may burn your card.
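
A toy model of why the call order in 3) can matter.  This is only a
sketch under my own assumptions: the register index, the mask, the
default value and all ex_* names are invented, and it assumes that the
init routine loads defaults into the same register the antenna setting
lives in, which I have not stated above as a fact about the hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_BBP_NREGS	128
#define EX_BBP_TX	2	/* hypothetical register index */
#define EX_BBP_ANTMASK	0x03	/* hypothetical antenna-select bits */

struct ex_bbp {
	uint8_t regs[EX_BBP_NREGS];	/* simulated BBP register file */
};

/* Load made-up default values, overwriting whatever was there. */
static void
ex_bbp_init(struct ex_bbp *bbp)
{
	size_t i;

	assert(bbp != NULL);
	for (i = 0; i < EX_BBP_NREGS; i++)
		bbp->regs[i] = 0;
	bbp->regs[EX_BBP_TX] = 0x40;
}

/* Read-modify-write: keep the other bits, replace the antenna bits. */
static void
ex_set_txantenna(struct ex_bbp *bbp, uint8_t antenna)
{
	uint8_t tx;

	assert(bbp != NULL);
	tx = bbp->regs[EX_BBP_TX] & (uint8_t)~EX_BBP_ANTMASK;
	bbp->regs[EX_BBP_TX] = tx | (antenna & EX_BBP_ANTMASK);
}
```

In this model, calling ex_set_txantenna() after ex_bbp_init() keeps both
the defaults and the antenna selection; calling it before means the
later init wipes the selection out again.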

Even with these in place in dfly, I still see a strange TX performance
regression in sta mode (a drop from 20Mb/s to 3Mb/s under very good
conditions) on certain hardware after 20~30 seconds of TCP_STREAM
netperf testing.  I haven't had enough time to dig into it.  However,
all of the tested hardware stayed connected during testing (I usually
run the netperf stream test for 12 hours or more).

Best Regards,
sephe

-- 
Live Free or Die
Received on Tue Jan 01 2008 - 05:54:39 UTC
