Re: [PATCH] Netdump for review and testing -- preliminary version

From: Robert Watson <rwatson_at_FreeBSD.org>
Date: Sat, 9 Oct 2010 02:15:39 +0100 (BST)
On Fri, 8 Oct 2010, Attilio Rao wrote:

>> GENERAL FRAMEWORK ARCHITECTURE
>>
>> Netdump is composed, right now, by an userland "server" and a kernel 
>> "client". The former is run on the target machine (where the dump will 
>> phisically happen) and it is responsible for receiving  the packets 
>> containing coredumps frame and for correctly writing them on-disk. The 
>> latter is part of the kernel installed on the source machine (where the 
>> dump is initiated) and is responsible for building correctly UDP packets 
>> containing the coredump frames, pushing through the network interface and 
>> routing them appropriately.

Hi Attilio:

Network dumps would be a great addition to the FreeBSD debugging suite!  A few 
casual comments and questions, I need to spend some more time pondering the 
implications of the current netdump design later in the week.

(1) Did you consider using tftp as the network dump protocol, rather than a 
custom protocol?  It's also a simple UDP-based, ACKed file transfer protocol, 
with the advantage that it's widely supported by exiting tftp daemons, packet 
sniffers, security devices, etc.  This wouldn't require using a stock tftpd, 
although that would certainly be a benefit as well.  The post-processing of 
crashdumps that seems to happen in the netdump server could, presumably, 
happen "offline" as easily...?

(2) Have you thought about adding a checksum to the dump format, since packets 
are going over the wire on UDP, etc?  Even an MD5 sum wouldn't be too hard to 
arrange: all the code is in the kernel already, requires relatively little 
state storage, and is designed for streamed data.

(3) As the bounds checking/etc behavior in dumping grows more complex, it 
seems a shame to replicate it in architecture-specific code.  Could we pull 
the mediaoffset/mediasize checking into common code?  Also, a reserved 
size/offset of "0" meaning "no limit" sounds slightly worrying; it might be 
slightly more conservative to add a flags field to dumperinfo and have a flag 
indicating "size is no object" for netdump, rather than overloading the 
existing fields.

Some specific patch comments:

+ * XXX: This should be split into machdep and non-machdep parts

What MD parts are in the file?

The sysctl parts of the patch have a number of issues:

- Please use sysctl_handle_string rather than manual calls and string bounds
   checking with SYSCTL_IN/SYSCTL_OUT.  In general, type-specific sysctl
   handlers are named sysctl_handle_<type>, which I recommend here as well.

- Please use stack-local, fixed-length string buffers as the
   source/destination of copyin/copyout on ifnet names, rather than the fields
   in the structure itself.  We support interface renaming, so the field can
   change, and sysctl has the potential to block for extended periods
   mid-copyin/copyout.

- Because we support ifnet renaming, "none" is probably not a good reserved
   name.  Could you use the empty string or something similar, or just a flag
   to indicate that netdump is disabled?

- "ifp" rather than "ifn" and "nic" is the preferred variable name for ifnet
   pointers throughout the kernel.

- ifnets can, and are, removeable at runtime.  We have a refcount, but that's
   not really what you want.  I suggest simply looking up the ifnet by name
   when it's needed during a dump or for sysctl output, rather than maintaining
   a long-term pointer, which can become stale.  Especially with the
   auto-detect code.

- Since sysctl_nic is only used once, I'd prefer it were inlined into a
   use-specific handler function, avoiding the arg1/arg2 complications that
   make this code a bit harder to read.  sysctl_ip is useful because it's used
   more than once, though.  However, it also very much wants you to call
   sysctl_handle_string!

+sysctl_force_crash(SYSCTL_HANDLER_ARGS)

Does this belong in the netdump code?  We already have some of these options 
in debug.kdb.*, but perhaps others should be added there as well.

+		/*
+		 * get and fill a header mbuf, then chain data as an extended
+		 * mbuf.
+		 */
+		MGETHDR(m, M_DONTWAIT, MT_DATA);

The idea of calling into the mbuf allocator in this context is just freaky, 
and may have some truly awful side effects.  I suppose this is the cost of 
trying to combine code paths in the network device driver rather than have an 
independent path in the netdump case, but it's quite unfortunate and will 
significantly reduce the robustness of netdumps in the face of, for example, 
mbuf starvation.

+	if (ntohs(ah->ar_hrd) != ARPHRD_ETHER &&
+	    ntohs(ah->ar_hrd) != ARPHRD_IEEE802 &&
+	    ntohs(ah->ar_hrd) != ARPHRD_ARCNET &&
+	    ntohs(ah->ar_hrd) != ARPHRD_IEEE1394) {
+		NETDDEBUG("nd_handle_arp: unknown hardware address fmt "
+		    "0x%2D)\n", (unsigned char *)&ah->ar_hrd, "");
+		return;
+	}

Are you sure you don't want to just check for ETHER here?

+	/* XXX: Probably should check if we're the recipient MAC address */
+	/* Done ethernet processing. Strip off the ethernet header */

Yes, quite probably.  What if you panic in promiscuous mode?

+	/*
+	 * The first write (at offset 0) is the kernel dump header.  Flag it
+	 * for the server to treat specially.  XXX: This doesn't strip out the
+	 * footer KDH, although it shouldn't hurt anything.
+	 */

The footer allows us to confirm that the tail end of a dump matches the 
beginning; while probably not strictly necessary in this scenario, it's not a 
bad idea given the lack of a checksum.

+	/* Default the nic to the first available interface */
+	if ((ifn = TAILQ_FIRST(&ifnet)) != NULL) do {
+		if ((ifn->if_flags & IFF_UP) == 0)
+			continue;

More locking needed.

I'm not sure I like the idea of defaulting to net dumping out the first 
interface -- in the future, we'll want to compile netdump into the kernel in 
GENERIC so that it's always available.  But we won't want to default to doing 
the above for a number of reasons.  This means the defaults need to be 
sensible and quite conservative (i.e., disabled and non-autoconfigured).

+	/* Default the client to the first IP on nd_nic */
+	if ((ifa = TAILQ_FIRST(&nd_nic->if_addrhead)) != NULL) do {
+		if (ifa->ifa_addr->sa_family != AF_INET) {
+			continue;
+		}
+		nd_client = ((struct sockaddr_in *)ifa->ifa_addr)->sin_addr;
+	} while ((ifa = TAILQ_NEXT(ifa, ifa_link)) != NULL &&
+			nd_client.s_addr == INADDR_ANY);

More locking needed here too.  I'm guess this code is forward-ported from an 
older FreeBSD version without locking here, so you will want to check for 
other changes in the replicated code paths, such as input / arp / etc 
handling.

And is this really a good idea?  Addresses change, and I think you aren't 
registering a callback to pick up the change in the event you're using an 
auto-configured address.  In some ways, in the same way you might want to look 
up the ifnet to use at dump-time, you might simultaneously want to 
auto-configure an address then, if auto-configuration is selected (i.e., 
netdump is enabled and no IP is configured).

-	void	*if_pspare[7];
+	struct	netdump_methods *if_ndumpfuncs; /* netdump virtual methods */
+	void	*if_pspare[6];
  	int	if_ispare[4];

In HEAD, at least, you should add your field and not use the spare.  The spare 
should only be used on a merge to a KBI-stable branch (such as 8.x).

I need to ponder your changes to ifnet and to the drivers some more; I 
recognize the benefits of maximal code alignment, but I worry a lot about 
these code paths having side effects (not least, calls into memory allocators, 
which in turn can enter VM, etc).  I wonder if, given that, we need to teach 
the mbuf allocator to be much more conservative if netdumping is active...

There are style bugs throughout.

Robert
Received on Fri Oct 08 2010 - 23:15:40 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:08 UTC