Re: Constant stream of errors on msk0

From: Pyun YongHyeon <pyunyh_at_gmail.com>
Date: Tue, 20 Feb 2007 10:24:43 +0900
On Mon, Feb 19, 2007 at 01:57:07PM -0500, Joe Marcus Clarke wrote:
 > -----BEGIN PGP SIGNED MESSAGE-----
 > Hash: SHA1
 > 
 > Pyun YongHyeon wrote:
 > > On Wed, Feb 14, 2007 at 01:32:57PM -0500, Joe Marcus Clarke wrote:
 > >  > -----BEGIN PGP SIGNED MESSAGE-----
 > >  > Hash: SHA1
 > >  > 
 > >  > Pyun YongHyeon wrote:
 > >  > > On Tue, Feb 13, 2007 at 12:34:57AM -0500, Joe Marcus Clarke wrote:
 > >  > >  > On Tue, 2007-02-13 at 13:47 +0900, Pyun YongHyeon wrote:
 > >  > >  > > On Mon, Feb 12, 2007 at 07:38:03PM -0500, Joe Marcus Clarke wrote:
 > >  > >  > >  > On Tue, 2007-02-13 at 09:09 +0900, Pyun YongHyeon wrote:
 > >  > >  > >  > > On Mon, Feb 12, 2007 at 02:08:49PM -0500, Joe Marcus Clarke wrote:
 > >  > >  > >  > >  > -----BEGIN PGP SIGNED MESSAGE-----
 > >  > >  > >  > >  > Hash: SHA1
 > >  > >  > >  > >  > 
 > >  > >  > >  > >  > I recently upgraded my MacBook Pro from -STABLE to -CURRENT.  I used to
 > >  > >  > >  > >  > be using the Marvell myk driver for my wired ethernet.  This driver
 > >  > >  > >  > >  > worked fine.  I'm now using the built-in msk driver, but this driver
 > >  > >  > >  > >  > causes the interface to report a constant stream of input errors.  There
 > >  > >  > >  > > 
 > >  > >  > >  > >  > > Would you explain these input errors?
 > >  > >  > >  > 
 > >  > >  > >  > netstat -i reports steadily increasing input errors (Ierrs) every time
 > >  > >  > >  > packets arrive on the machine.
 > >  > >  > >  > 
 > >  > >  > > 
 > >  > >  > > It looks like link speed/duplex mismatch.
 > >  > >  > > How about manual configuration?
 > >  > >  > > (e.g. ifconfig msk0 media 1000baseTX mediaopt full-duplex)
 > >  > >  > 
 > >  > >  > That was the first thing I thought of.  I tried all settings from
 > >  > >  > 100BaseTX half/full to 1000BaseTX half/full to auto.  The same problem
 > >  > >  > was always observed.  Additionally, I turned off TSO to see if that made
 > >  > >  > any difference, and it did not.
 > >  > >  > 
 > >  > > 
 > >  > > Ok, let's see what's happening on your NIC.
 > >  > > Try the attached patch and let me know the output.
 > >  > 
 > >  > Okay.  On an otherwise idle machine, I started a regular 64-byte ping
 > >  > (to rule out TCP), and I was getting regular packet loss (50 responses
 > >  > were seen for 89 packets or 43.8% packet loss).  Here is the debug
 > >  > output for that time:
 > >  > 
 > >  > Feb 14 13:22:05 gyros kernel: 0x004a2100 : 70 : 74
 > >  > Feb 14 13:22:05 gyros kernel: 0x00622100 : 94 : 98
 > >  > Feb 14 13:22:06 gyros kernel: 0x004a2100 : 70 : 74
 > >  > Feb 14 13:22:07 gyros kernel: 0x00622100 : 94 : 98
 > >  > Feb 14 13:22:08 gyros kernel: 0x004a2100 : 70 : 74
 > >  > Feb 14 13:22:10 gyros kernel: 0x024e2300 : 586 : 590
 > >  > Feb 14 13:22:13 gyros kernel: 0x00b12300 : 173 : 177
 > >  > Feb 14 13:22:16 gyros kernel: 0x00622100 : 94 : 98
 > >  > Feb 14 13:22:20 gyros kernel: 0x004a2100 : 70 : 74
 > >  > Feb 14 13:22:20 gyros kernel: 0x00622100 : 94 : 98
 > >  > Feb 14 13:22:21 gyros kernel: 0x004a2100 : 70 : 74
 > >  > Feb 14 13:22:22 gyros kernel: 0x00622100 : 94 : 98
 > >  > Feb 14 13:22:25 gyros kernel: 0x00622100 : 94 : 98
 > >  > Feb 14 13:22:26 gyros kernel: 0x004a2100 : 70 : 74
 > >  > Feb 14 13:22:27 gyros kernel: 0x003c2300 : 56 : 60
 > >  > Feb 14 13:22:28 gyros kernel: 0x00622100 : 94 : 98
 > >  > Feb 14 13:22:28 gyros kernel: 0x003c2300 : 56 : 60
 > >  > Feb 14 13:22:30 gyros kernel: 0x00f32300 : 239 : 243
 > >  > Feb 14 13:22:30 gyros kernel: 0x003c2100 : 56 : 60
 > >  > Feb 14 13:22:31 gyros kernel: 0x005c2300 : 88 : 92
 > >  > Feb 14 13:22:32 gyros kernel: 0x004a2100 : 70 : 74
 > >  > Feb 14 13:22:33 gyros kernel: 0x003c2300 : 56 : 60
 > >  > Feb 14 13:22:34 gyros kernel: 0x00622100 : 94 : 98
 > >  > Feb 14 13:22:35 gyros kernel: 0x004a2100 : 70 : 74
 > >  > Feb 14 13:22:36 gyros kernel: 0x00622100 : 94 : 98
 > >  > Feb 14 13:22:46 gyros last message repeated 6 times
 > >  > Feb 14 13:22:47 gyros kernel: 0x003c2300 : 56 : 60
 > >  > Feb 14 13:22:49 gyros kernel: 0x00622100 : 94 : 98
 > >  > Feb 14 13:22:50 gyros kernel: 0x004a2100 : 70 : 74
 > >  > Feb 14 13:22:51 gyros kernel: 0x00622100 : 94 : 98
 > >  > Feb 14 13:22:55 gyros last message repeated 3 times
 > >  > Feb 14 13:22:56 gyros kernel: 0x005a2300 : 86 : 90
 > >  > Feb 14 13:22:58 gyros kernel: 0x00622100 : 94 : 98
 > >  > Feb 14 13:23:00 gyros kernel: 0x006e2300 : 106 : 110
 > >  > Feb 14 13:23:00 gyros kernel: 0x00c12300 : 189 : 193
 > >  > Feb 14 13:23:02 gyros kernel: 0x004a2100 : 70 : 74
 > >  > Feb 14 13:23:03 gyros kernel: 0x00622100 : 94 : 98
 > >  > Feb 14 13:23:03 gyros kernel: 0x01602300 : 348 : 352
 > >  > Feb 14 13:23:05 gyros kernel: 0x004a2100 : 70 : 74
 > >  > Feb 14 13:23:06 gyros kernel: 0x01002300 : 252 : 256
 > >  > Feb 14 13:23:07 gyros kernel: 0x004a2100 : 70 : 74
 > >  > Feb 14 13:23:08 gyros kernel: 0x003c2100 : 56 : 60
 > >  > Feb 14 13:23:09 gyros kernel: 0x003c2100 : 56 : 60
 > >  > Feb 14 13:23:09 gyros kernel: 0x003c2300 : 56 : 60
 > >  > Feb 14 13:23:10 gyros kernel: 0x00622100 : 94 : 98
 > >  > Feb 14 13:23:11 gyros kernel: 0x004a2100 : 70 : 74
 > >  > Feb 14 13:23:13 gyros kernel: 0x00c12300 : 189 : 193
 > >  > Feb 14 13:23:14 gyros kernel: 0x003e2100 : 58 : 62
 > >  > Feb 14 13:23:14 gyros kernel: 0x006e2300 : 106 : 110
 > >  > 
 > >  > Additionally, while I wasn't seeing a large number of interrupts, I
 > >  > tried disabling MSI/MSI-X just as a test, and that had no effect.
 > >  > Netstat-wise, I have currently received 5645 packets with 847 input
 > >  > errors.  TCP-wise, I have received 3447 TCP packets with 334
 > >  > out-of-order packets.
 > >  > 
 > > 
 > > Hmm, this is very strange to me.
 > > If the packet is normal ICMP echo request packet its packet length
 > > on receiver side should be 98 bytes(14(ethernet hdr) + 20(IP hdr) +
 > > 8(icmp hdr) + 56(icmp data)). However the output shows various
 > > packet length ranging from 56 to 590. In addition msk(4) showed the
 > > received packet length differences between MAC and host and all those
 > > packets were VLAN tagged packet. Do you use VLAN on your environments?
 > > 
 > > Assuming you've just sent non-VLAN tagged ICMP echo request packet
 > > it would be the result of speed/duplex mismatch.
 > 
 > I moved to a switch over which I had control, and set the port to auto,
 > msk0 configured itself to 100BaseTX/full, and the input errors stopped.
 > 
 > The current switch is an old Cisco 2924XL (10/100 port).  The previous
 > switch was most likely a Cisco 6500 with a 10/100/1000 port (set to
 > auto).  So the msk interface must not have liked what the 6500 was doing
 > negotiation wise on its gigabit port.  Either that, or we have a bug in
 > the version of code running on that switch.
 > 

Do other GigE NICs work without issues on the Cisco 6500?

msk(4) may have an autonegotiation bug, but I've never seen
negotiation mismatches with it. While writing the driver I checked
speed/link negotiation against a gigabit switch and a directly
connected GigE NIC and found nothing unusual. Of course, that does
not necessarily mean msk(4) handles link negotiation perfectly.

Since you've said myk(4) works well on the Cisco 6500, I have to
diagnose the issue. When the link is set manually in a gigabit
environment, normally one side should be the master and the other
the slave, so how about forcing the master bit on the msk(4) side?

#ifconfig msk0 media 1000baseTX mediaopt full-duplex link0
                                                     ^^^^^
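
For reference, the debug lines quoted above can be checked
mechanically. The following is a minimal Python sketch, not part of
the patch. It assumes the upper 16 bits of the first field carry a
frame length and that bit 13 of the lower half is the VLAN-tagged
flag. The bit-13 reading is an assumption inferred from the samples
(it is set in every one of them), not something taken from the driver
source. The names decode, decode_all, VLAN_BIT and so on are made up
for this illustration.

```python
import re

# A few lines copied verbatim from the debug output quoted above.
SAMPLE = """\
Feb 14 13:22:05 gyros kernel: 0x004a2100 : 70 : 74
Feb 14 13:22:05 gyros kernel: 0x00622100 : 94 : 98
Feb 14 13:22:10 gyros kernel: 0x024e2300 : 586 : 590
Feb 14 13:22:27 gyros kernel: 0x003c2300 : 56 : 60
"""

LINE_RE = re.compile(r"kernel: (0x[0-9a-fA-F]{8}) : (\d+) : (\d+)")

# ASSUMPTION: bit 13 of the status word marks a VLAN-tagged frame.
VLAN_BIT = 1 << 13
# An 802.1Q VLAN tag adds 4 bytes to an Ethernet frame.
VLAN_TAG_LEN = 4
# Expected size of a default ICMP echo request, as given in the thread:
# Ethernet header + IP header + ICMP header + ICMP data.
ICMP_ECHO_FRAME_LEN = 14 + 20 + 8 + 56


def decode(line):
    """Split one debug line into its fields, or return None if it
    does not look like a debug line."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    status = int(m.group(1), 16)
    return {
        "status_len": status >> 16,      # upper 16 bits of the status word
        "flags": status & 0xFFFF,        # lower 16 bits of the status word
        "vlan": bool(status & VLAN_BIT),
        "col2": int(m.group(2)),         # first printed length
        "col3": int(m.group(3)),         # second printed length
    }


def decode_all(text):
    """Decode every matching line in a block of log text."""
    return [d for d in map(decode, text.splitlines()) if d is not None]


if __name__ == "__main__":
    for d in decode_all(SAMPLE):
        print(
            f"len(status)={d['status_len']:4d} "
            f"flags=0x{d['flags']:04x} vlan={d['vlan']} "
            f"printed={d['col2']}/{d['col3']} "
            f"diff={d['col3'] - d['col2']}"
        )
```

On these sample lines the upper half of the status word equals the
last column, and the two printed lengths differ by 4 bytes, the size
of an 802.1Q tag, which fits the VLAN observation above. Only the
94/98 entries match the 98 bytes expected for a default ICMP echo
request.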

-- 
Regards,
Pyun YongHyeon
Received on Tue Feb 20 2007 - 00:26:58 UTC
