Re: FreeBSD 8.0 - network stack crashes?

From: Eirik Øverby <ltning_at_anduin.net>
Date: Mon, 30 Nov 2009 08:20:57 +0100
On 30. nov. 2009, at 01.52, Pyun YongHyeon wrote:

> On Mon, Nov 30, 2009 at 12:21:16AM +0100, Eirik Øverby wrote:
>> On 29. nov. 2009, at 15.29, Robert Watson wrote:
>> 
>>> On Sun, 29 Nov 2009, Eirik Øverby wrote:
>>> 
>>>> I just did that (-rxcsum -txcsum -tso), but the numbers still keep rising. I'll wait and see if it goes down again, then reboot with those values to see how it behaves. But right away it doesn't look too good ..
>>> 
>>> It would be interesting to know if any of the counters in the output of netstat -s grow linearly with the allocation count in netstat -m.  Oftentimes leaks are associated with edge cases in the stack (typically because if they are in common cases the bug is detected really quickly!) -- usually error handling, where in some error case the unwinding fails to free an mbuf that it should free.  These are notoriously hard to track down, unfortunately, but the stats output (especially where delta alloc is linear to delta stat) may inform the situation some more.
>> 
>> From what I can tell, all that goes up with mbuf usage is traffic/packet counts. I can't say I see anything fishy in there.
>> 
> 
> If the system has exhausted all available mbufs, it still should
> not crash the box. Use the -d option of netstat(1) to see whether
> the packet drop counter still goes up when you know the system
> can't receive any frames. AFAIK em(4) was carefully written to
> recover from Rx resource shortage such that it just drops incoming
> frames when it can't get a new mbuf. This may result in dropping
> incoming connection requests, but it means it still tries to
> recover from the resource exhaustion.
> It's not clear where the mbuf leak comes from, though.

The box does not crash; connecting to the console (via IP-KVM) shows the box is just fine, except that no networking works. I can raise the kern.ipc.nmbclusters value from the command line, and after a few seconds things start moving again.

The em(4) debug output shows that it fails to allocate mbuf clusters.
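
A rough sketch of the comparison Robert suggests above (does any netstat -s counter grow in step with the cluster count from netstat -m?) could look like the following. It is only an illustration, not anything from the thread: the function names are made up, the regex assumes the cluster line looks like "1024/2048/3072/25600 mbuf clusters in use (current/cache/total/max)", and the counter samples are assumed to have been collected by hand or by a separate script at the same moments as the cluster samples.

```python
import re
from math import sqrt

# ASSUMPTION: the relevant netstat -m line has this shape:
#   "1024/2048/3072/25600 mbuf clusters in use (current/cache/total/max)"
CLUSTER_RE = re.compile(
    r"^[ \t]*(\d+)/(\d+)/(\d+)/(\d+) mbuf clusters in use", re.M
)


def parse_clusters(netstat_m_text):
    """Return (current, cache, total, max) from netstat -m text, or None."""
    m = CLUSTER_RE.search(netstat_m_text)
    return tuple(int(g) for g in m.groups()) if m else None


def deltas(values):
    """Differences between consecutive samples."""
    return [b - a for a, b in zip(values, values[1:])]


def pearson(xs, ys):
    """Pearson correlation; 0.0 if either series is constant or too short."""
    n = len(xs)
    if n < 2:
        return 0.0
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def rank_counters(cluster_samples, counter_samples):
    """Rank counters by how closely their per-interval growth tracks
    the per-interval growth in clusters in use.

    cluster_samples: list of "current" cluster counts, one per sample.
    counter_samples: dict of counter name -> list of values, same length.
    """
    dc = deltas(cluster_samples)
    scores = {
        name: pearson(dc, deltas(vals))
        for name, vals in counter_samples.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

A counter scoring close to 1.0 would be worth a closer look; plain traffic counters are likely to score high as well, so a high score is a hint rather than proof of where the leak is.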


>> From the last few samples in
>> http://anduin.net/~ltning/netstat.log
> 
> 404

Uh? Unpossible :)
The file is there, and I can view it here ...


>> you can see the host stops receiving any packets, but does a few retransmits before the session where this script ran timed out.
>> 
> 
> By chance do you use pf/ipfw/ipf?

No... Unfortunately ;)

/Eirik
Received on Mon Nov 30 2009 - 06:21:03 UTC
