Re: SYNCOOKIE authentication problems

From: Steve Kargl <sgk_at_troutmask.apl.washington.edu>
Date: Fri, 29 Jun 2007 15:47:25 -0700
On Fri, Jun 29, 2007 at 10:27:06PM +0100, David Malone wrote:
> > Jun 29 09:21:58 node11 kernel: TCP: [192.168.0.12]:54528 to [192.168.0.11]:526
> 
> OK - I can see the packets corresponding to this error by doing something
> like:
> 
> % tcpdump -S -r synfinrstdata -n port 62391 and port 60621

(output elided).

> The start of this looks like a perfectly normal TCP connection -
> it opens normally, transfers about 12 bytes in one direction and
> then closes. Strangely, 192.168.0.11 then sends two FIN packets,
> followed by a reset. The error message produced by the kernel should
> have produced a reset in response, but I'm not sure I can see quite
> enough to see what happened.
> 
> We could try to get all of the packets in the connection by doing:
> 
> 	tcpdump -i whatever_interface -w /tmp/fulldump -s 80

I'm doing this now.  It seems that putting bge0 in promiscuous mode
has provided some stability.  fulldump is currently at 2.4 GB.

> > poll({4/POLLIN 5/POLLIN 6/POLLIN 7/POLLIN 9/POLLIN 10/POLLIN 11/POLLIN 13/POLL
> 
> It looks like MPI is looking only for file descriptors to become
> ready for reading. I'd guess one of the file descriptors is in an
> error state, but MPI isn't checking for that, so it is spinning.
> 

I have both the OpenMPI and MPICH2 implementations.  Neither handles a
disappearing process in an elegant manner.  They simply assume that the
network is robust and 100% reliable.
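
For illustration only, here is a minimal sketch of the check that seems to
be missing from the poll() loop discussed above.  It is not taken from
OpenMPI or MPICH2; the names classify_events and poll_once are hypothetical.
poll() can report POLLERR, POLLHUP and POLLNVAL in revents even when only
POLLIN was requested, so a loop that tests nothing but POLLIN may never
notice a dead descriptor and will keep spinning.

```python
import select
import socket

# Events poll() may report even though only POLLIN was requested.
ERROR_EVENTS = select.POLLERR | select.POLLHUP | select.POLLNVAL


def classify_events(revents):
    """Return 'error', 'readable', or 'idle' for a poll() revents bitmask."""
    # Check the error bits first so a broken descriptor is never
    # mistaken for an ordinary readable one.
    if revents & ERROR_EVENTS:
        return "error"
    if revents & select.POLLIN:
        return "readable"
    return "idle"


def poll_once(socks, timeout_ms=100):
    """Poll each socket for POLLIN once; map file descriptor -> classification."""
    poller = select.poll()
    for s in socks:
        poller.register(s.fileno(), select.POLLIN)
    # poll() returns only descriptors with a non-zero revents mask.
    return {fd: classify_events(ev) for fd, ev in poller.poll(timeout_ms)}


if __name__ == "__main__":
    a, b = socket.socketpair()
    a.send(b"x")
    print(poll_once([b]))  # the fd of b should be reported as readable
    a.close()
    b.close()
```

A real event loop would probably drain any pending data before dropping a
descriptor classified as "error", since a hangup can arrive together with
unread bytes.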

-- 
Steve
Received on Fri Jun 29 2007 - 20:48:10 UTC
