Re: bge driver autoneg failure and system-wide stalls

From: Bill Paul <wpaul_at_FreeBSD.ORG>
Date: Mon, 28 Nov 2005 09:17:58 +0000 (GMT)
> On Fri, Nov 25, 2005 at 04:22:28PM +0300, Gleb Smirnoff wrote:
> > On Fri, Nov 25, 2005 at 01:20:41PM +1100, Emil Mikulic wrote:
> > E> The other problem is that bge will never negotiate a working link speed.
> > E> ifconfig will always return "status: no carrier"
> > E> 
> > E> If I force the media to 10baseT/UTP or 100baseTX (either mediaopt
> > E> full-duplex or not), it will issue a couple more MII_TICKs then stop,
> > E> ifconfig will return "status: active", there will be no more stalls,
> > E> and, most importantly, the network connection will actually work.
> > 
> > Please try out the attached patch.
> 
> No effect.

In your original e-mail, you write:

> I have a network port with bad wiring in the walls - a cable tester
> shows only wires 1,2,3 and 6 are actually connected.

Actually, this is not 'bad' wiring. It's correct for 10/100 ethernet
as long as a) the cabling is actually cat5, and not moldy old cat3
or something, and b) the four wires are actually connected in the right
sequence. Pins 1 and 2 form one pair, and pins 3 and 6 form the second
pair. A typical installation may have the orange/orange+white pair
on pins 1 and 2, and the green/green+white pair on 3 and 6. And both
sides must match. If it's not done this way, then while you may have
a DC path between all 4 pins on each side, you won't be getting the
proper noise cancellation effect of twisted pair cabling. This can
cause signal distortion, dropped packets, and possibly botched autoneg.
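
As a rough illustration of the check described above, here is a small
sketch in C. It assumes you have written down which wire lands on pins
1, 2, 3 and 6 at each end; the types and function names are made up for
this example and do not come from any driver or tool.

```c
#include <stdbool.h>

/* Hypothetical description of one conductor: which twisted pair it
 * belongs to, and whether it is the striped (white-marked) wire or the
 * solid wire of that pair. */
enum pair_color { ORANGE, GREEN, BLUE, BROWN };

struct wire {
    enum pair_color pair;
    bool striped;
};

/* The four conductors that matter for 10/100, as seen at one jack. */
struct jack {
    struct wire pin1, pin2, pin3, pin6;
};

static bool same_wire(struct wire a, struct wire b)
{
    return a.pair == b.pair && a.striped == b.striped;
}

/* True if pins 1/2 are the two wires of one twisted pair and pins 3/6
 * are the two wires of a different twisted pair. */
static bool pairs_ok(const struct jack *j)
{
    return j->pin1.pair == j->pin2.pair &&
           j->pin3.pair == j->pin6.pair &&
           j->pin1.pair != j->pin3.pair &&
           j->pin1.striped != j->pin2.striped &&
           j->pin3.striped != j->pin6.striped;
}

/* True if both ends are paired correctly and wired identically. */
static bool run_ok(const struct jack *a, const struct jack *b)
{
    return pairs_ok(a) && pairs_ok(b) &&
           same_wire(a->pin1, b->pin1) && same_wire(a->pin2, b->pin2) &&
           same_wire(a->pin3, b->pin3) && same_wire(a->pin6, b->pin6);
}
```

A split pair (say, pins 1 and 3 taken from one pair and pins 2 and 6
from another) passes a plain continuity test but fails pairs_ok().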

You didn't say whether you checked for this, though, so we can only speculate
about whether this is really the problem. If the pairs are wrong, then that could
be why autoneg is failing. It's also the least of your worries, since
even if you could convince the software to establish a link, you
might end up with rotten performance.

A couple things you neglected to mention (and which Gleb failed to
ask you about):

- Exactly what kind of switch is on the other end of this wiring?
- Is the port that corresponds to this wall jack a gigabit ethernet
  port, or just 10/100?

If it is a gigE port, then you're being silly. 4 pairs are required
for gigE. Period. The NWAY autonegotiation exchange can take place
over just 2 pairs, but the gigE signalling scheme requires all 4
pairs to be present in order to establish a link. If there's just
two pairs connected, both sides can announce that they support
gigabit speeds, and both sides will try configuring themselves
for gigE operation, but no link will ever be established.
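
To make the failure mode concrete, here is a simplified model in C. The
ability bits and function names are invented for illustration (they are
not the real MII register layout), duplex is ignored, and the only rule
modelled is "pick the fastest speed both sides advertise".

```c
#include <stdbool.h>

/* Hypothetical ability bits, for illustration only. */
enum {
    ABILITY_10   = 1 << 0,
    ABILITY_100  = 1 << 1,
    ABILITY_1000 = 1 << 2
};

/* Fastest speed (in Mbps) that both sides advertise, or 0 if they
 * have nothing in common. */
static int resolve_speed(unsigned local, unsigned partner)
{
    unsigned common = local & partner;

    if (common & ABILITY_1000)
        return 1000;
    if (common & ABILITY_100)
        return 100;
    if (common & ABILITY_10)
        return 10;
    return 0;
}

/* Wire pairs needed to bring a link up at a given speed:
 * four for gigE, two for 10/100. */
static int pairs_needed(int speed)
{
    return speed == 1000 ? 4 : 2;
}

/* The negotiation itself does not look at the pair count; only the
 * link attempt afterwards depends on it. */
static bool link_comes_up(unsigned local, unsigned partner, int pairs_connected)
{
    int speed = resolve_speed(local, partner);

    return speed != 0 && pairs_connected >= pairs_needed(speed);
}
```

With both sides advertising all three speeds and only two pairs
connected, resolve_speed() returns 1000 and link_comes_up() is false.
Masking ABILITY_1000 out of either side's advertisement makes the same
two-pair run resolve to 100 and link.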

If you manually override the autonegotiation in this case, you should
do "ifconfig bge0 media 100baseTX" only. Do not specify full duplex;
that won't work. When you manually select the mode, autoneg will be
turned off, and the other side will rely on parallel detection to
select the appropriate link speed, but it won't be able to sense if
the link partner is in full or half duplex mode, so it will default
to half. If you manually specify full, this will create a duplex
mismatch, and you'll get rotten throughput.
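
The duplex reasoning above can be sketched as a small model in C. The
struct and function names are hypothetical, and the model covers only
the rule described here: a side running autoneg that hears no autoneg
from its partner falls back to half duplex.

```c
#include <stdbool.h>

enum duplex { HALF, FULL };

/* Hypothetical view of one end of the link. */
struct port {
    bool autoneg;        /* true if this end runs autonegotiation */
    enum duplex forced;  /* duplex used when autoneg is off */
    bool can_full;       /* advertises full duplex when autoneg is on */
};

/* Duplex that 'self' ends up using, given its link partner 'peer'. */
static enum duplex effective_duplex(const struct port *self,
                                    const struct port *peer)
{
    if (!self->autoneg)
        return self->forced;   /* manual setting is used as-is */
    if (!peer->autoneg)
        return HALF;           /* parallel detection: speed only, assume half */
    return (self->can_full && peer->can_full) ? FULL : HALF;
}

/* True if the two ends disagree about duplex. */
static bool duplex_mismatch(const struct port *a, const struct port *b)
{
    return effective_duplex(a, b) != effective_duplex(b, a);
}
```

A NIC forced to full duplex against an autonegotiating switch port
gives a mismatch in this model; the same NIC forced to half duplex does
not.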

If the switch port is 10/100 and not gigE, then autoneg should be working
properly, and I don't know why it isn't.

As an aside, I really don't understand the purpose of the brgphy_loop()
function. (I didn't write it.) It looks like it tries to put the PHY
into loopback mode, and then waits for the PHY to report that there's
a good link. I'm not really sure of the point here. I mean, you can
do that, but I don't understand why. Also, the DELAY(10) here can
probably be replaced with a tsleep() or something, which will allow
the CPU to do other work while waiting for the PHY instead of hard
busywaiting and blocking up the whole system (allowing a reschedule
here should not hurt).

> It still can't autoselect a working media, and I still get loops in
> miibus.

[...]

> Anything else I can try?
> 
> --Emil

If the switch port really is 10/100, then maybe, just maybe, you can
try increasing the autoneg timeout. In brgphy_service(), you'll see
this:

                /*
                 * Only retry autonegotiation every 5 seconds.
                 */
                if (++sc->mii_ticks <= 5)
                        break;

Change the 5 to a 10 and see if that helps. But if you really are
trying to autoneg a link with a gigE switch port, this won't make
any difference. If the switch is managed, and you have the password
to it, you can try programming it to only announce 10/100 support on that
port until such time as you can recable the place for gigE. Alternatively,
you can attempt to steal two pairs from a neighboring cable that leads
to the same jack.
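
For a feel of what changing the 5 to a 10 does, here is a standalone
simulation of the counter test quoted from brgphy_service(). It assumes
the tick arrives once per second, that the counter is reset to zero each
time autonegotiation is restarted, and that no link ever comes up; the
function name is made up for this sketch.

```c
/* Count how many times autonegotiation would be restarted over 'ticks'
 * consecutive ticks with no link, for a given retry threshold. */
static int autoneg_restarts(int threshold, int ticks)
{
    int mii_ticks = 0;
    int restarts = 0;

    for (int t = 0; t < ticks; t++) {
        if (++mii_ticks <= threshold)
            continue;      /* still waiting (the "break" in the driver) */
        mii_ticks = 0;     /* assumption: counter is reset on each retry */
        restarts++;
    }
    return restarts;
}
```

Over 60 ticks, a threshold of 5 restarts autoneg on every 6th tick, and
a threshold of 10 on every 11th, so each attempt gets roughly twice as
long to complete before it is abandoned.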

-Bill

--
=============================================================================
-Bill Paul            (510) 749-2329 | Senior Engineer, Master of Unix-Fu
                 wpaul_at_windriver.com | Wind River Systems
=============================================================================
              <adamw> you're just BEGGING to face the moose
=============================================================================
Received on Mon Nov 28 2005 - 08:17:58 UTC
