Dear All,

Of late we've seen an increase in the number of reports of panics in
sbflush(). These occur due to an invariant check when closing sockets,
which in effect confirms that the amount of actual data in a socket buffer
matches the socket buffer's cached length for that data. Typically, these
panics are a symptom of mbuf chain/mbuf packet queue corruption, and in the
past have been symptomatic primarily of device driver bugs (e.g., the
device driver touches mbuf fields after passing the mbuf into the stack).
However, debugging these problems is very difficult, as sbflush() is called
only when the socket is closed, and the corruption frequently occurs much
earlier.

If you are experiencing these panics, please compile your kernel with
"options SOCKBUF_DEBUG", which adds more frequent invariant checks of
socket buffer consistency. These checks may have a substantial performance
impact, but unless we can catch the problems closer to when they occur, the
chances of tracking down the source are low.

We could be looking at one or more bugs in some combination of the socket,
protocol, and network interface code leading to the recent increase.
Perhaps socket buffer auto-sizing has increased the chances of it
occurring, or perhaps we just have increased parallelism in test hardware.

I've added some new commands to DDB to make it easier to understand the
state of the system following a panic. While the same information can be
extracted with kgdb, core dumps are not reliably available in all
environments, and certain other information (such as lock information) is
most easily obtained using DDB. The commands are:

    show socket <addr>
    show sockbuf <addr>
    show protosw <addr>
    show domain <addr>

Only the first two of these are likely to be used directly, and they invoke
protosw and domain printing as required.

It would be helpful if people debugging network panics involving sockets
could use the above (and especially show socket) on socket arguments in
stack traces.
For example, in an sbflush() panic, the first argument to soclose() in the
stack trace is the socket pointer, so it is ideal to pass to "show socket"
:-).

FYI, in general, the first argument to sofoo() functions is the socket
pointer. Likewise, the first argument to sbfoo() functions is the socket
buffer pointer. This is not the case for soo_foo() functions, which take a
file descriptor, or filt_sofoo() functions, which accept a knote. In
general, if you print the socket contents with "show socket", then it isn't
necessary to print socket buffers separately; "show socket" prints not just
the contents of the socket buffers, but also the pointers to them, so that
those pointers can be compared with other arguments in the stack.

Robert N M Watson
Computer Laboratory
University of Cambridge

---------- Forwarded message ----------
Date: Thu, 15 Feb 2007 01:28:23 +0000 (UTC)
From: Robert Watson <rwatson_at_FreeBSD.org>
To: src-committers_at_FreeBSD.org, cvs-src_at_FreeBSD.org,
    cvs-all_at_FreeBSD.org
Subject: cvs commit: src/sys/conf files src/sys/kern uipc_debug.c

rwatson     2007-02-15 01:28:22 UTC

  FreeBSD src repository

  Modified files:
    sys/conf             files
  Added files:
    sys/kern             uipc_debug.c
  Log:
  Teach DDB how to print sockets, socket buffers, protosw's, and domain
  structures given pointers to them.

  Revision  Changes    Path
  1.1177    +1 -0      src/sys/conf/files
  1.1       +530 -0    src/sys/kern/uipc_debug.c (new)

Received on Thu Feb 15 2007 - 09:13:00 UTC
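
For reference, the invariant described at the top of this message (the
actual data in a socket buffer must match the buffer's cached length) can
be sketched in user-space C. This is only an illustration under stated
assumptions: the structures below are cut-down stand-ins that borrow the
kernel's field names (m_next, m_nextpkt, m_len, sb_mb, sb_cc), and the
function names sb_actual_len(), sb_consistent() and
demo_corruption_detected() are hypothetical, not kernel APIs. The real
kernel checks also cover mbuf storage accounting, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; NOT the real definitions. */
struct mbuf {
	struct mbuf *m_next;    /* next mbuf in this packet's chain */
	struct mbuf *m_nextpkt; /* first mbuf of the next packet in the queue */
	unsigned int m_len;     /* bytes of data held in this mbuf */
};

struct sockbuf {
	struct mbuf *sb_mb;     /* head of the packet queue */
	unsigned int sb_cc;     /* cached count of data bytes in the buffer */
};

/* Walk every packet, and every mbuf within it, summing the data lengths. */
unsigned int
sb_actual_len(const struct sockbuf *sb)
{
	const struct mbuf *pkt, *m;
	unsigned int len = 0;

	assert(sb != NULL);
	for (pkt = sb->sb_mb; pkt != NULL; pkt = pkt->m_nextpkt)
		for (m = pkt; m != NULL; m = m->m_next)
			len += m->m_len;
	return (len);
}

/* Return 1 if the cached length matches the data actually queued. */
int
sb_consistent(const struct sockbuf *sb)
{
	return (sb_actual_len(sb) == sb->sb_cc);
}

/*
 * Hypothetical demonstration: build a one-packet, two-mbuf buffer, then
 * change m_len behind the buffer's back, as a buggy driver might after
 * handing the mbuf to the stack. Returns 1 if the check notices.
 */
int
demo_corruption_detected(void)
{
	struct mbuf m2 = { NULL, NULL, 50 };
	struct mbuf m1 = { &m2, NULL, 100 };
	struct sockbuf sb = { &m1, 150 };

	if (!sb_consistent(&sb))
		return (0);     /* should start out consistent */
	m2.m_len = 10;          /* corruption: sb_cc is now stale */
	return (!sb_consistent(&sb));
}
```

The sketch also shows why a check made only at close time is hard to debug:
nothing notices the stale sb_cc until someone walks the chain, which is the
gap that more frequent consistency checks are meant to narrow.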