Re: Sockets stuck in SYN_RCVD (re(4), RELENG_7, i386)

From: Oliver Fromme <olli_at_lurza.secnetix.de>
Date: Wed, 21 Nov 2007 10:56:18 +0100 (CET)

Pyun YongHyeon wrote:
 > On Tue, Nov 20, 2007 at 04:19:18PM +0100, Oliver Fromme wrote:
 > > Some additional information.
 > > 
 > > Today I have run the re(4) interface at 100 Mbps for a few
 > > hours.  The count did still increase, so it's not a GigE-
 > > only problem.
 > > 
 > > Then I disabled RXCSUM,TXCSUM on the interface.  Again, the
 > > counter still increased.  So hardware checksumming isn't
 > > the cause of the problem either.
 > > 
 > > Anything else I could try?
 > 
 > re(4) is not smart enough to analyze packet payload. The hardware
 > also doesn't have a feature like TCP header split, so I think re(4)
 > wouldn't influence TCP traffic by itself.

I see.  So it does not seem to be a bug in re(4).

My first suspects were the IPFW rules.  But they're quite
simple (only 20 rules) and I'm sure they're correct.
Apart from that, if a faulty rule were blocking SYN+ACK
packets or similar, then no TCP connections would work
at all.  And even in that case, the default timeout for
SYN_RCVD is very short (45 seconds I think), certainly
not several days.

So my current suspect is a bug in the syncache code.
That bug is probably triggered by something exceptional,
because I don't see the problem on any other machine,
not even on the one which is almost identical in hardware
and OS.

I would like to ask everybody to have a look at the
output from "sysctl net.inet.tcp.syncache.count".
Does anybody else have a non-zero value that slowly
increases?  If so, it would be interesting to find out
if there are any similarities with my machine.
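
For anyone who wants to watch the counter over time rather than
check it once, here is a minimal sketch in Python.  It assumes
that sysctl(8) is in the PATH and that "sysctl -n" prints the bare
value.  The function names, the one-hour interval and the number
of rounds are my own choices for illustration, nothing official.

```python
import subprocess
import time

SYSCTL_NAME = "net.inet.tcp.syncache.count"


def parse_sysctl_value(text):
    """Accept 'name: value' as well as the bare value from 'sysctl -n'."""
    return int(text.strip().rsplit(":", 1)[-1])


def read_syncache_count():
    """Run sysctl(8) once and return the current counter as an int."""
    result = subprocess.run(
        ["sysctl", "-n", SYSCTL_NAME],
        capture_output=True, text=True, check=True,
    )
    return parse_sysctl_value(result.stdout)


def monitor(interval_s=3600, rounds=24):
    """Print a timestamp, the counter and the change since the last sample."""
    previous = None
    for i in range(rounds):
        count = read_syncache_count()
        delta = "" if previous is None else f"({count - previous:+d})"
        print(time.strftime("%Y-%m-%d %H:%M:%S"), count, delta)
        previous = count
        if i < rounds - 1:
            time.sleep(interval_s)


# Example (hourly samples for one day):
# monitor(interval_s=3600, rounds=24)
```

On a healthy machine the counter should hover around zero; a
value that only ever goes up is what I am seeing here.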

 > Your dmesg indicates that you're using a slightly old rgephy(4) on 7.0.
 > I touched rgephy(4) to support a newer PHY and fixed several bugs. If
 > a speed/duplex mismatch was the cause of the issue, you would see lots
 > of input errors in the output of "netstat -ndi". If so, try the
 > latest rgephy(4).

I don't think that's the cause.  I tried with and without
auto-select, forcing the interface to 100 and GigE, and
all of that did not affect the behaviour at all.  The
error counters are all zero:

Name  Mtu Network  Address    Ipkts Ierrs    Opkts Oerrs  Coll Drop
re0  1500 <Link#1> [...]   28363007     0 25430349     0     0    0

 > > > net.inet.tcp.syncache.count: 702
 > > 
 > > It's now at 731.

And now at 832.  So it grows by more than 100 entries per
day.
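
As a rough cross-check of that rate, here is a small Python
sketch.  It ASSUMES that the value 731 was read around the time
of my previous mail (Tue, 20 Nov 2007 16:19 +0100) and 832 around
the time of this one; the real sampling times were not recorded,
so treat the result as an estimate only.  The function name is
made up for illustration.

```python
from datetime import datetime, timedelta, timezone

CET = timezone(timedelta(hours=1))


def entries_per_day(t0, count0, t1, count1):
    """Average counter growth per 24 hours between two samples."""
    elapsed = (t1 - t0).total_seconds()
    if elapsed <= 0:
        raise ValueError("second sample must be later than the first")
    return (count1 - count0) * 86400 / elapsed


# Assumed sampling times: taken from the mail headers, not measured.
t_731 = datetime(2007, 11, 20, 16, 19, 18, tzinfo=CET)
t_832 = datetime(2007, 11, 21, 10, 56, 18, tzinfo=CET)

print(round(entries_per_day(t_731, 731, t_832, 832), 1))
```

Under those assumptions the growth comes out at about 130
entries per day.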

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD services, products and more:  http://www.secnetix.de/bsd

 > Can the denizens of this group enlighten me about what the
 > advantages of Python are, versus Perl ?
"python" is more likely to pass unharmed through your spelling
checker than "perl".
        -- An unknown poster and Fredrik Lundh
Received on Wed Nov 21 2007 - 08:56:28 UTC
