sbflush() panics, new socket-related DDB commands (was: cvs commit: src/sys/conf files src/sys/kern uipc_debug.c (fwd))

From: Robert Watson <rwatson_at_FreeBSD.org>
Date: Thu, 15 Feb 2007 10:12:59 +0000 (GMT)

Dear All,

Of late we've seen an increase in the number of reports of panics in 
sbflush().  These occur due to an invariant check when closing sockets, which 
in effect confirms that the amount of actual data in a socket buffer matches 
the socket buffer's cached length for that data.  Typically, these panics are 
a symptom of mbuf chain/mbuf packet queue corruption, and in the past they 
have been symptomatic primarily of device driver bugs (e.g., the device driver 
touches mbuf fields after passing the mbuf into the stack).  However, 
debugging these problems is very difficult, as sbflush() is called only when 
the socket is closed, and the corruption frequently occurs much earlier.
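
As a rough illustration of the kind of invariant involved, here is a minimal 
userland sketch.  It is not the kernel's code: struct mbuf_sketch and struct 
sockbuf_sketch are hypothetical, cut-down stand-ins for struct mbuf and struct 
sockbuf, and the two helper functions are invented for this example.  Only the 
field names (m_next, m_nextpkt, m_len, sb_mb, sb_cc) follow the kernel's.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct mbuf. */
struct mbuf_sketch {
	struct mbuf_sketch *m_next;	/* next mbuf in this packet's chain */
	struct mbuf_sketch *m_nextpkt;	/* next packet in the queue */
	unsigned int m_len;		/* bytes of data held by this mbuf */
};

/* Hypothetical, simplified stand-in for the kernel's struct sockbuf. */
struct sockbuf_sketch {
	struct mbuf_sketch *sb_mb;	/* head of the packet queue */
	unsigned int sb_cc;		/* cached count of bytes in the buffer */
};

/* Walk every packet and every mbuf in it, summing the data actually present. */
static unsigned int
sb_count_bytes(const struct sockbuf_sketch *sb)
{
	const struct mbuf_sketch *pkt, *m;
	unsigned int total = 0;

	for (pkt = sb->sb_mb; pkt != NULL; pkt = pkt->m_nextpkt)
		for (m = pkt; m != NULL; m = m->m_next)
			total += m->m_len;
	return (total);
}

/* Return 1 if the cached length agrees with the mbuf chains, 0 otherwise. */
static int
sb_consistent(const struct sockbuf_sketch *sb)
{
	return (sb_count_bytes(sb) == sb->sb_cc);
}
```

In this model, a driver that changes m_len on an mbuf it has already handed to 
the stack leaves sb_cc stale, and nothing notices until a check like 
sb_consistent() is next run, which for sbflush() means socket close.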

If you are experiencing these panics, please compile your kernel with "options 
SOCKBUF_DEBUG", which adds more frequent invariant checks of socket buffer 
consistency.  These checks may have a substantial performance impact, but 
unless we can catch the problems closer to when they occur, the chances of 
tracking down the source are low.  We could be looking at one or more bugs in 
some combination of the socket, protocol, and network interface code leading 
to the recent increase.  Perhaps socket buffer auto-sizing has increased the 
chances of it occurring, or perhaps we just have increased parallelism in 
test hardware.
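
For reference, a sketch of the relevant kernel configuration lines.  
SOCKBUF_DEBUG is the option named above; KDB and DDB are the standard options 
for the in-kernel debugger and are included on the assumption that your 
configuration file does not already have them.

```
# Extra socket buffer consistency checks (may be slow).
options 	SOCKBUF_DEBUG

# In-kernel debugger, assumed not to be enabled already.
options 	KDB
options 	DDB
```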

I've added some new commands to DDB to make it easier to understand the state 
of the system following a panic.  While the same information can be extracted 
with kgdb, core dumps are not reliably available in all environments, and 
certain other information (such as lock information) is most easily obtained 
using DDB.  The commands are:

show socket <addr>
show sockbuf <addr>
show protosw <addr>
show domain <addr>

Only the first two of these are likely to be used directly, and they invoke 
protosw and domain printing as required.  It would be helpful if people 
debugging network panics involving sockets could use the above (and especially 
show socket) on socket arguments in stack traces.  For example, in an 
sbflush() panic, the first argument to soclose() in the stack trace is the 
socket pointer, so it is ideal to pass to "show socket" :-).

FYI, in general, the first argument to sofoo() functions is the socket 
pointer.  Likewise, the first argument to sbfoo() functions is the socket 
buffer pointer.  This is not the case for soo_foo() functions, which take a 
file descriptor, or filt_sofoo() functions, which accept a knote.  In general, 
if you print the socket contents with "show socket", then it isn't necessary 
to print socket buffers separately; "show socket" prints not just the contents 
of the socket buffers, but also the pointers to them so that those pointers 
can be compared with other arguments in the stack.

Robert N M Watson
Computer Laboratory
University of Cambridge

---------- Forwarded message ----------
Date: Thu, 15 Feb 2007 01:28:23 +0000 (UTC)
From: Robert Watson <rwatson_at_FreeBSD.org>
To: src-committers_at_FreeBSD.org, cvs-src_at_FreeBSD.org, cvs-all_at_FreeBSD.org
Subject: cvs commit: src/sys/conf files src/sys/kern uipc_debug.c

rwatson     2007-02-15 01:28:22 UTC

   FreeBSD src repository

   Modified files:
     sys/conf             files
   Added files:
     sys/kern             uipc_debug.c
   Log:
   Teach DDB how to print sockets, socket buffers, protosw's, and domain
   structures given pointers to them.

   Revision  Changes    Path
   1.1177    +1 -0      src/sys/conf/files
   1.1       +530 -0    src/sys/kern/uipc_debug.c (new)