Re: Which GigE NIC for reliable use?

From: Steve Kargl <sgk_at_troutmask.apl.washington.edu>
Date: Thu, 21 Jun 2007 10:01:14 -0700
On Thu, Jun 21, 2007 at 06:47:37PM +0200, Sameh Ghane wrote:
> On Thu, Jun 21, 2007 at 09:07:43AM -0700, Steve Kargl wrote:
> > 
> > Jun 20 23:22:33 node10 kernel: TCP: [10.208.78.111]:54801 to 
> > [10.208.78.111]:49376 tcpflags 0x10<ACK>; syncache_expand: Segment failed
> > SYNCOOKIE authentication, segment rejected (probably spoofed)
> 
> How does a local communication get affected by your NIC's behavior !?

It is an application that uses the Message Passing Interface.  There
are 4 processes running on node16 and 4 processes on node10.  All
processes are communicating with each other.  When the link goes
down/up, the processes stop talking.  The processes on node10 are
trying to send/receive data from the now non-existent processes on
node16.  I'm assuming that communication between the processes on
node10 gets out of sync and the above message appears. 

> 
> You seem to use Jumbo frames, maybe the link loss is switch related ?

Same problem with jumbo frames or good old mtu 1500 frames.

> 
> > So, I plan to replace all of the bge devices with a reliable,
> > robust GigE NIC.  Anyone have a suggestion for such a card?
> 
> I would go for em(4) because the driver works really fine, for
> quite some time.

How does em(4) compare to msk(4)?

> Polling support is really good, and helps reduce interrupts.

Tried that.  Too much latency.  Too many dropped packets.
The execution time of the app is doubled, if not tripled.

Thanks for the info.  I'll investigate the em(4).

-- 
Steve
Received on Thu Jun 21 2007 - 15:02:46 UTC
