Andrew Gallatin wrote:
> Andre Oppermann writes:
> > Andrew Gallatin wrote:
> > > Andre,
> > >
> > > I meant to ask: Did you try 16KB jumbos? Did they perform
> > > any better than page-sized jumbos?
> >
> > No, I didn't try 16K jumbos. The problem with anything larger than
> > page size is that it may look contiguous in kernel memory but isn't
> > in physical memory. Thus you need the same number of descriptors
> > for the network card as with page sized (4K) clusters.
>
> But it would allow you to do one copyin, rather than 4. I
> don't know how much this would help, but it might be worth
> looking at.

It helped the SCTP code quite a bit when I optimized it to use this...
I can't remember how much of a boost it got (I started using all the
frames at one time), and of course it only helps when the msg size
being sent is > 9k... but it was a help, at least on the copy-in side
for sending down.

>
> > > Also, if we're going to change how mbufs work, let's add something
> > > like Linux's skb_frag_t frags[MAX_SKB_FRAGS]; In FreeBSD parlance,
> > > this embeds something like an array of sf_buf pointers in the mbuf.
> > > The big difference to a chain of M_EXT mbufs is that you need to
> > > allocate only one mbuf wrapper, rather than one for each item in the
> > > list. Also, the reference is kept in the page (or sf_buf) itself, and
> > > the data offset is kept in the skb (or mbuf).
> >
> > We are not going to change how mbufs work.
> >
> > > This allows us to do cool things like allocate a single page, and use
> > > both halves of it for 2 separate 1500-byte frames. This allows us to
> > > achieve *amazing* results in combination with LRO, because it allows
> > > us to do, on average, many fewer allocations per byte, especially in
> > > combination with Linux's "high order" page allocations. Using order-2
> > > allocations and LRO, I've actually seen 10GbE line rate receives on a
> > > wimpy 2.0GHz Athlon64.
> >
> > I have just started tackling the receive path. Let's see what comes out
> > of it first before we jump to conclusions.
>
> It could be that mbufs are cheaper to get than skbs and pages on Linux,
> but I doubt it. FWIW, Linux has an skb chaining mechanism
> (frag_list). My first LRO experiment was based on allocating "normal"
> skbs and chaining them. That maxed out at around 5.2Gb/s (on the same
> hardware I see line rate on).

This would be a drastic set of changes... quite a bit more than simply
adding a few more cluster sizes and getting rid of the data area inside
the mbuf, which would shrink the mbuf size considerably (of course, one
would then always need a data EXT).

R

>
> Drew
>
--
Randall Stewart
NSSTG - Cisco Systems Inc.
803-345-0369 <or> 815-342-5222 (cell)
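
[Editor's sketch] For concreteness, the "array of page fragments in one
buffer header" layout Gallatin describes above can be illustrated roughly
as below. This is only a user-space mock-up under assumed names (pkt_frag,
frag_pkt, frag_append, MAX_PKT_FRAGS are all hypothetical), not the real
sk_buff or mbuf definitions; the actual kernel structures also carry page
reference counts, DMA state, and so on. The point it demonstrates is the
one made in the mail: a single wrapper covers many page pieces, and two
1500-byte frames can share one 4 KB page.

/*
 * Illustrative sketch (not real kernel code): one packet wrapper holds
 * up to MAX_PKT_FRAGS page fragments, each described by (page, offset,
 * length).  One header allocation covers the whole packet no matter how
 * many pages back it.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SZ		4096
#define MAX_PKT_FRAGS	16	/* analogous in spirit to MAX_SKB_FRAGS */

struct pkt_frag {		/* hypothetical, loosely like skb_frag_t */
	void	*page;		/* backing page (would be an sf_buf/page ref) */
	size_t	 off;		/* data offset within that page */
	size_t	 len;		/* bytes of data in this fragment */
};

struct frag_pkt {		/* hypothetical one-header packet wrapper */
	size_t		 nfrags;
	size_t		 totlen;
	struct pkt_frag	 frags[MAX_PKT_FRAGS];
};

/* Append a piece of a page to the packet; returns 0 on success. */
static int
frag_append(struct frag_pkt *p, void *page, size_t off, size_t len)
{
	if (p->nfrags >= MAX_PKT_FRAGS || off + len > PAGE_SZ)
		return (-1);
	p->frags[p->nfrags].page = page;
	p->frags[p->nfrags].off = off;
	p->frags[p->nfrags].len = len;
	p->nfrags++;
	p->totlen += len;
	return (0);
}

int
main(void)
{
	/* One page, carved into two 1500-byte frames. */
	void *page = malloc(PAGE_SZ);
	struct frag_pkt a = { 0 }, b = { 0 };

	if (page == NULL)
		return (1);
	memset(page, 0, PAGE_SZ);

	frag_append(&a, page, 0, 1500);		/* first frame: bytes 0..1499 */
	frag_append(&b, page, 1500, 1500);	/* second frame: bytes 1500..2999 */

	printf("pkt a: %zu frag(s), %zu bytes\n", a.nfrags, a.totlen);
	printf("pkt b: %zu frag(s), %zu bytes\n", b.nfrags, b.totlen);

	free(page);
	return (0);
}

In the real kernel case the per-page reference would live with the page
(or sf_buf) and only the offset/length would live in the header, which is
what makes appending fragments cheap enough for an LRO receive path.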