Re: Constant stream of errors on msk0

From: Joe Marcus Clarke <marcus_at_FreeBSD.org>
Date: Mon, 19 Feb 2007 13:57:07 -0500
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Pyun YongHyeon wrote:
> On Wed, Feb 14, 2007 at 01:32:57PM -0500, Joe Marcus Clarke wrote:
>  > -----BEGIN PGP SIGNED MESSAGE-----
>  > Hash: SHA1
>  > 
>  > Pyun YongHyeon wrote:
>  > > On Tue, Feb 13, 2007 at 12:34:57AM -0500, Joe Marcus Clarke wrote:
>  > >  > On Tue, 2007-02-13 at 13:47 +0900, Pyun YongHyeon wrote:
>  > >  > > On Mon, Feb 12, 2007 at 07:38:03PM -0500, Joe Marcus Clarke wrote:
>  > >  > >  > On Tue, 2007-02-13 at 09:09 +0900, Pyun YongHyeon wrote:
>  > >  > >  > > On Mon, Feb 12, 2007 at 02:08:49PM -0500, Joe Marcus Clarke wrote:
>  > >  > >  > >  > -----BEGIN PGP SIGNED MESSAGE-----
>  > >  > >  > >  > Hash: SHA1
>  > >  > >  > >  > 
>  > >  > >  > >  > I recently upgraded my MacBook Pro from -STABLE to -CURRENT.  I used to
>  > >  > >  > >  > be using the Marvell myk driver for my wired ethernet.  This driver
>  > >  > >  > >  > worked fine.  I'm now using the built-in msk driver, but this driver
>  > >  > >  > >  > causes the interface to report a constant stream of input errors.  There
>  > >  > >  > > 
>  > >  > >  > > Would you explain these input errors?
>  > >  > >  > 
>  > >  > >  > netstat -i reports steadily increasing input errors (Ierrs) every time
>  > >  > >  > packets arrive on the machine.
>  > >  > >  > 
>  > >  > > 
>  > >  > > It looks like link speed/duplex mismatch.
>  > >  > > How about manual configuration?
>  > >  > > (e.g. ifconfig msk0 media 1000baseTX mediaopt full-duplex)
>  > >  > 
>  > >  > That was the first thing I thought of.  I tried all settings from
>  > >  > 100BaseTX half/full to 1000BaseTX half/full to auto.  The same problem
>  > >  > was always observed.  Additionally, I turned off TSO to see if that made
>  > >  > any difference, and it did not.
>  > >  > 
>  > > 
>  > > Ok, let's see what's happening on your NIC.
>  > > Try the attached patch and let me know the output.
>  > 
>  > Okay.  On an otherwise idle machine, I started a regular 64-byte ping
>  > (to rule out TCP), and I was getting regular packet loss (50 responses
>  > were seen for 89 packets or 43.8% packet loss).  Here is the debug
>  > output for that time:
>  > 
>  > Feb 14 13:22:05 gyros kernel: 0x004a2100 : 70 : 74
>  > Feb 14 13:22:05 gyros kernel: 0x00622100 : 94 : 98
>  > Feb 14 13:22:06 gyros kernel: 0x004a2100 : 70 : 74
>  > Feb 14 13:22:07 gyros kernel: 0x00622100 : 94 : 98
>  > Feb 14 13:22:08 gyros kernel: 0x004a2100 : 70 : 74
>  > Feb 14 13:22:10 gyros kernel: 0x024e2300 : 586 : 590
>  > Feb 14 13:22:13 gyros kernel: 0x00b12300 : 173 : 177
>  > Feb 14 13:22:16 gyros kernel: 0x00622100 : 94 : 98
>  > Feb 14 13:22:20 gyros kernel: 0x004a2100 : 70 : 74
>  > Feb 14 13:22:20 gyros kernel: 0x00622100 : 94 : 98
>  > Feb 14 13:22:21 gyros kernel: 0x004a2100 : 70 : 74
>  > Feb 14 13:22:22 gyros kernel: 0x00622100 : 94 : 98
>  > Feb 14 13:22:25 gyros kernel: 0x00622100 : 94 : 98
>  > Feb 14 13:22:26 gyros kernel: 0x004a2100 : 70 : 74
>  > Feb 14 13:22:27 gyros kernel: 0x003c2300 : 56 : 60
>  > Feb 14 13:22:28 gyros kernel: 0x00622100 : 94 : 98
>  > Feb 14 13:22:28 gyros kernel: 0x003c2300 : 56 : 60
>  > Feb 14 13:22:30 gyros kernel: 0x00f32300 : 239 : 243
>  > Feb 14 13:22:30 gyros kernel: 0x003c2100 : 56 : 60
>  > Feb 14 13:22:31 gyros kernel: 0x005c2300 : 88 : 92
>  > Feb 14 13:22:32 gyros kernel: 0x004a2100 : 70 : 74
>  > Feb 14 13:22:33 gyros kernel: 0x003c2300 : 56 : 60
>  > Feb 14 13:22:34 gyros kernel: 0x00622100 : 94 : 98
>  > Feb 14 13:22:35 gyros kernel: 0x004a2100 : 70 : 74
>  > Feb 14 13:22:36 gyros kernel: 0x00622100 : 94 : 98
>  > Feb 14 13:22:46 gyros last message repeated 6 times
>  > Feb 14 13:22:47 gyros kernel: 0x003c2300 : 56 : 60
>  > Feb 14 13:22:49 gyros kernel: 0x00622100 : 94 : 98
>  > Feb 14 13:22:50 gyros kernel: 0x004a2100 : 70 : 74
>  > Feb 14 13:22:51 gyros kernel: 0x00622100 : 94 : 98
>  > Feb 14 13:22:55 gyros last message repeated 3 times
>  > Feb 14 13:22:56 gyros kernel: 0x005a2300 : 86 : 90
>  > Feb 14 13:22:58 gyros kernel: 0x00622100 : 94 : 98
>  > Feb 14 13:23:00 gyros kernel: 0x006e2300 : 106 : 110
>  > Feb 14 13:23:00 gyros kernel: 0x00c12300 : 189 : 193
>  > Feb 14 13:23:02 gyros kernel: 0x004a2100 : 70 : 74
>  > Feb 14 13:23:03 gyros kernel: 0x00622100 : 94 : 98
>  > Feb 14 13:23:03 gyros kernel: 0x01602300 : 348 : 352
>  > Feb 14 13:23:05 gyros kernel: 0x004a2100 : 70 : 74
>  > Feb 14 13:23:06 gyros kernel: 0x01002300 : 252 : 256
>  > Feb 14 13:23:07 gyros kernel: 0x004a2100 : 70 : 74
>  > Feb 14 13:23:08 gyros kernel: 0x003c2100 : 56 : 60
>  > Feb 14 13:23:09 gyros kernel: 0x003c2100 : 56 : 60
>  > Feb 14 13:23:09 gyros kernel: 0x003c2300 : 56 : 60
>  > Feb 14 13:23:10 gyros kernel: 0x00622100 : 94 : 98
>  > Feb 14 13:23:11 gyros kernel: 0x004a2100 : 70 : 74
>  > Feb 14 13:23:13 gyros kernel: 0x00c12300 : 189 : 193
>  > Feb 14 13:23:14 gyros kernel: 0x003e2100 : 58 : 62
>  > Feb 14 13:23:14 gyros kernel: 0x006e2300 : 106 : 110
>  > 
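The debug lines above can be read mechanically. The sketch below is only an illustration: it assumes the upper 16 bits of the hex word hold a frame length and that bit 13 (0x2000) marks a VLAN-tagged frame. Both assumptions are inferred from the log and from the reading given later in the thread, not from the patch itself, which is not shown. The function name `decode` and the dictionary keys are made up for this example.

```python
import re

# Matches lines such as "... kernel: 0x004a2100 : 70 : 74".
LINE_RE = re.compile(r"kernel: (0x[0-9a-fA-F]{8}) : (\d+) : (\d+)$")

# Assumption: bit 13 of the status word flags a VLAN-tagged frame.
VLAN_BIT = 0x2000


def decode(line):
    """Split one debug line into its fields, or return None if it is
    not a debug line (for example "last message repeated 6 times")."""
    m = LINE_RE.search(line.strip())
    if m is None:
        return None
    status = int(m.group(1), 16)
    first_len = int(m.group(2))
    second_len = int(m.group(3))
    return {
        "status_len": status >> 16,          # assumed length field
        "vlan": bool(status & VLAN_BIT),     # assumed VLAN flag
        "first_len": first_len,
        "second_len": second_len,
        "delta": second_len - first_len,
    }


if __name__ == "__main__":
    samples = [
        "Feb 14 13:22:05 gyros kernel: 0x004a2100 : 70 : 74",
        "Feb 14 13:22:10 gyros kernel: 0x024e2300 : 586 : 590",
        "Feb 14 13:22:46 gyros last message repeated 6 times",
    ]
    for s in samples:
        print(decode(s))
```

On the sample lines, the assumed length field equals the second printed length, and the two printed lengths differ by 4 bytes, which is the size of an 802.1Q VLAN tag.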
>  > Additionally, while I wasn't seeing a large number of interrupts, I
>  > tried disabling MSI/MSI-X just as a test, and that had no effect.
>  > Netstat-wise, I have currently received 5645 packets with 847 input
>  > errors.  TCP-wise, I have received 3447 TCP packets with 334
>  > out-of-order packets.
>  > 
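For scale, the figures quoted above work out as follows. This is a minimal sketch; treating input errors and out-of-order segments as simple fractions of the received counts is an assumption about how the counters relate, and the helper name `percent` is made up for this example.

```python
def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


# Ping test: 89 requests sent, 50 replies seen.
ping_loss = percent(89 - 50, 89)

# netstat: 847 input errors against 5645 received packets.
input_error_rate = percent(847, 5645)

# TCP: 334 out-of-order packets among 3447 received TCP packets.
out_of_order_rate = percent(334, 3447)

if __name__ == "__main__":
    print(f"ping loss:         {ping_loss:.1f}%")
    print(f"input error rate:  {input_error_rate:.1f}%")
    print(f"out-of-order rate: {out_of_order_rate:.1f}%")
```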
> 
> Hmm, this is very strange to me.
> If these were normal ICMP echo request packets, each frame should be
> 98 bytes long on the receiving side (14 (Ethernet hdr) + 20 (IP hdr) +
> 8 (ICMP hdr) + 56 (ICMP data)). However, the output shows packet
> lengths ranging from 56 to 590 bytes. In addition, msk(4) reported a
> difference between the packet length seen by the MAC and the one seen
> by the host, and all of those packets were VLAN tagged. Do you use
> VLANs in your environment?
> 
> Assuming you sent only non-VLAN-tagged ICMP echo request packets,
> this would be the result of a speed/duplex mismatch.
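
The frame-length arithmetic in the quoted paragraph can be checked with a short sketch. The header sizes are the ones given in the message; the 4-byte 802.1Q tag size and the function name `echo_frame_len` are additions for illustration. The frame check sequence is not counted, matching the 98-byte figure in the message.

```python
ETHER_HDR = 14   # Ethernet header
IP_HDR = 20      # IPv4 header without options
ICMP_HDR = 8     # ICMP echo header
ICMP_DATA = 56   # default ping payload; 56 + 8 is the "64-byte ping"
VLAN_TAG = 4     # 802.1Q tag, present only on tagged frames


def echo_frame_len(payload=ICMP_DATA, vlan=False):
    """Length of an ICMP echo frame as seen by the receiver, without FCS."""
    tag = VLAN_TAG if vlan else 0
    return ETHER_HDR + tag + IP_HDR + ICMP_HDR + payload


if __name__ == "__main__":
    print(echo_frame_len())            # untagged echo request
    print(echo_frame_len(vlan=True))   # same frame with an 802.1Q tag
```

An untagged echo request comes to 98 bytes and a tagged one to 102, so lengths such as 56, 243, or 590 in the debug output cannot all be explained by the ping traffic alone.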

I moved to a switch over which I had control and set the port to auto.
msk0 then configured itself to 100BaseTX/full, and the input errors stopped.

The current switch is an old Cisco 2924XL (10/100 port).  The previous
switch was most likely a Cisco 6500 with a 10/100/1000 port (set to
auto).  So the msk interface must not have liked what the 6500 was doing
negotiation-wise on its gigabit port.  Either that, or we have a bug in
the version of code running on that switch.

Joe

- --
Joe Marcus Clarke
FreeBSD GNOME Team	::	gnome_at_FreeBSD.org
FreeNode / #freebsd-gnome
http://www.FreeBSD.org/gnome
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFF2fMDb2iPiv4Uz4cRAlhMAJ4pvjs8PCC6rI560SEkH9/oOcecJgCdGVH+
RePy+hVanwaPaj6hzBqZtik=
=ObzO
-----END PGP SIGNATURE-----
Received on Mon Feb 19 2007 - 17:56:57 UTC
