Re: Much improved sosend_*() functions

From: Andrew Gallatin <gallatin_at_cs.duke.edu>
Date: Fri, 29 Sep 2006 18:45:23 -0400 (EDT)

Andre Oppermann writes:
 > Andrew Gallatin wrote:
 > > Andre,
 > > 
 > > I meant to ask: Did you try 16KB jumbos?  Did they perform
 > > any better than page-sized jumbos?
 > 
 > No, I didn't try 16K jumbos.  The problem with anything larger than
 > page size is that it may look contiguous in kernel memory but isn't
 > in physical memory.  Thus you need the same number of descriptors
 > for the network card as with page sized (4K) clusters.

But it would allow you to do one copyin rather than four.  I
don't know how much this would help, but it might be worth
looking at.
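
To put rough numbers on that trade-off, here is a minimal sketch of the
arithmetic, assuming 4K pages and a 16K payload.  The helper names are
made up for illustration and are not FreeBSD kernel interfaces.  The
descriptor count assumes the worst case Andre describes, where no two
pages of a cluster are physically adjacent.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of copyin() calls needed to move 'len'
 * bytes from userland when each cluster holds 'cluster_size' bytes,
 * assuming one copy per cluster.
 */
size_t
copies_needed(size_t len, size_t cluster_size)
{
	assert(cluster_size > 0);
	return (len + cluster_size - 1) / cluster_size;
}

/*
 * Hypothetical helper: worst-case number of NIC descriptors for 'len'
 * bytes, assuming the data starts page-aligned and every page is
 * physically discontiguous, so each page needs its own descriptor.
 */
size_t
descriptors_needed(size_t len, size_t page_size)
{
	assert(page_size > 0);
	return (len + page_size - 1) / page_size;
}
```

For 16384 bytes, copies_needed() drops from 4 with 4K clusters to 1
with a 16K cluster, while descriptors_needed() stays at 4 either way,
since it depends only on the page size.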

 > > Also, if we're going to change how mbufs work, let's add something
 > > like Linux's skb_frag_t frags[MAX_SKB_FRAGS].  In FreeBSD parlance,
 > > this embeds something like an array of sf_bufs pointers in mbuf.  The
 > > big difference to a chain of M_EXT mbufs is that you need to allocate
 > > only one mbuf wrapper, rather than one for each item in the list.
 > > Also, the reference is kept in the page (or sf_buf) itself, and the
 > > data offset is kept in the sk_buff (or mbuf).
 > 
 > We are not going to change how mbufs work.
 > 
 > > This allows us to do cool things like allocate a single page, and use
 > > both halves of it for 2 separate 1500 byte frames.  This allows us to
 > > achieve *amazing* results in combination with LRO, because it allows
 > > us to do, on average, many fewer allocations per byte.  Especially in
 > > combination with Linux's "high order" page allocations.  Using order-2
 > > allocations and LRO, I've actually seen 10GbE line rate receives on a
 > > wimpy 2.0GHz Athlon64.  
 > 
 > I have just started tackling the receive path.  Let's see what comes out
 > of it first before we jump to conclusions.

It could be that mbufs are cheaper to get than skbs and pages on Linux,
but I doubt it.  FWIW, Linux has an skb chaining mechanism
(frag_list).  My first LRO experiment was based on allocating "normal"
skbs and chaining them.  That maxed out at around 5.2Gb/s (on the same
hardware I see line rate on).
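
For anyone who has not looked at the fragment-array layout, here is a
rough sketch of the idea and of why it needs fewer wrapper allocations
than a chain.  Everything prefixed with sketch_ is hypothetical: this
is neither Linux's actual sk_buff layout nor a proposed mbuf change,
and SKETCH_MAX_FRAGS is an arbitrary stand-in for MAX_SKB_FRAGS.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_FRAGS 8	/* arbitrary stand-in for MAX_SKB_FRAGS */

/* One fragment: the reference lives with the page, the offset here. */
struct sketch_frag {
	void	*page;		/* backing page (or sf_buf) */
	size_t	 offset;	/* start of this fragment within the page */
	size_t	 len;		/* bytes of data in this fragment */
};

/* One wrapper describing up to SKETCH_MAX_FRAGS fragments. */
struct sketch_pkt {
	size_t			nfrags;
	struct sketch_frag	frags[SKETCH_MAX_FRAGS];
};

/* Append a fragment; returns 0 on success, -1 if the array is full. */
int
sketch_add_frag(struct sketch_pkt *p, void *page, size_t offset, size_t len)
{
	assert(p != NULL);
	if (p->nfrags >= SKETCH_MAX_FRAGS)
		return -1;
	p->frags[p->nfrags].page = page;
	p->frags[p->nfrags].offset = offset;
	p->frags[p->nfrags].len = len;
	p->nfrags++;
	return 0;
}

/* Wrapper allocations when every fragment gets its own chained wrapper. */
size_t
sketch_wrappers_chained(size_t nfrags)
{
	return nfrags;
}

/* Wrapper allocations when one wrapper holds up to SKETCH_MAX_FRAGS. */
size_t
sketch_wrappers_frag_array(size_t nfrags)
{
	return (nfrags + SKETCH_MAX_FRAGS - 1) / SKETCH_MAX_FRAGS;
}
```

Two 1500 byte frames sharing one 4K page would be two sketch_add_frag()
calls on the same page pointer with offsets 0 and 2048.  With eight
fragments, chaining costs eight wrappers and the array costs one.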

Drew