Re: RFC: FreeBSD I/OAT driver

From: Prafulla Deuskar <pdeuskar_at_freebsd.org> Date: Wed, 30 Aug 2006 20:41:28 +0000

Andrew Gallatin [gallatin_at_cs.duke.edu] wrote:
> 
> Jack Vogel writes:
>  > We are making our development driver for the I/OAT engine available for
>  > download,  experimentation, and comment at:
>  > 
>  >         http://sourceforge.net/project/showfiles.php?group_id=42302&package_id=202220
>  > 
>  > This includes a core driver for the dma hardware and a set of stack changes
>  > to allow use of the engine on the receive side of the stack.
>  > 
>  > There are certainly rough edges and limitations in this code, but we have run
>  > it internally and seen some great results.
>  > 
>  > I would like to see this get into CURRENT, so anything Prafulla and I can do
>  > to help or answer questions, send us email.
> 
> Excellent!  Can you share some of these results?  I would love to try
> it, but I don't have FreeBSD on any machine with I/OAT hardware.
>

Dual core Woodcrest
Bensley Chipset
Netperf Receive Test - 64k IO size
6.1-RELEASE SMP Kernel
MTU 1500 bytes

Ports  Throughput (Mbps)    CPU utilization (%)
       Native     I/OAT     Native     I/OAT
  1       943       943         14        11
  2      1886      1886         46        22
  4      1945      2531         84        54

It scales fairly linearly as the number of ports increases.
We haven't run tests with more than 4 ports on FreeBSD, though.
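
One way to read the table is throughput per percent of CPU. The sketch
below only restates the numbers above; the struct and function names are
made up for illustration and are not part of the driver. For the 4-port
row, the I/OAT case delivers roughly twice the Mbps per CPU percent of
the native case.

```c
#include <stddef.h>

/* One row of the netperf table above (values copied verbatim). */
struct netperf_row {
	int	ports;
	double	native_mbps;
	double	ioat_mbps;
	double	native_cpu;	/* CPU utilization, percent */
	double	ioat_cpu;	/* CPU utilization, percent */
};

static const struct netperf_row rows[] = {
	{ 1,  943.0,  943.0, 14.0, 11.0 },
	{ 2, 1886.0, 1886.0, 46.0, 22.0 },
	{ 4, 1945.0, 2531.0, 84.0, 54.0 },
};

/* Throughput delivered per percent of CPU consumed. */
static double
mbps_per_cpu_pct(double mbps, double cpu_pct)
{
	return (mbps / cpu_pct);
}

/* Ratio of I/OAT efficiency to native efficiency for one row. */
static double
ioat_efficiency_gain(const struct netperf_row *r)
{
	return (mbps_per_cpu_pct(r->ioat_mbps, r->ioat_cpu) /
	    mbps_per_cpu_pct(r->native_mbps, r->native_cpu));
}
```

Calling ioat_efficiency_gain(&rows[i]) for each row gives the relative
gain at 1, 2 and 4 ports.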

> I've taken a very quick look at it.  Maybe I'm just being dense,
> but I don't like the name "dma_" being in the global namespace.
> Maybe things (like dma_*_list) should be called at least
> dmaengine_*_list, etc.

Yeah, ioatdma_ would probably be clearer.

> 
> There are some style(9) defects which I'm sure others who are more
> proficient at style(9) than I am will point out (// comments, function
> names not starting in column 0, etc).
> 
> How deep would you expect so->dma_wait_queue to get?  Would it make
> sense to keep a pointer to the last item so that insertion is O(1),
> rather than O(N)?

The depth depends on the application IO size and the MSS.
So you are right, it would be useful to keep a pointer to the last item.
With a 1500 byte MTU the queue can get quite deep at larger IO sizes.
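
A minimal sketch of the tail-pointer idea, written as standalone C and
not taken from the driver. The names dma_wait_entry, dma_wait_queue and
the dwq_* functions are hypothetical. The depth estimate assumes one
queue entry per received segment, which is my reading of the above and
not something the code guarantees; with a 1460 byte MSS (1500 byte MTU
minus 40 bytes of IPv4 and TCP headers, no options) a 64k read comes to
about 45 segments. In the kernel, the STAILQ macros from queue(3) already
give O(1) tail insertion and would probably be the natural choice.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an entry on so->dma_wait_queue. */
struct dma_wait_entry {
	struct dma_wait_entry	*next;
	int			 id;	/* placeholder payload */
};

/* Queue that tracks both ends, so an append never walks the list. */
struct dma_wait_queue {
	struct dma_wait_entry	*head;
	struct dma_wait_entry	*tail;
	size_t			 depth;
};

static void
dwq_init(struct dma_wait_queue *q)
{
	q->head = NULL;
	q->tail = NULL;
	q->depth = 0;
}

/* O(1) append: link after the current tail, or become the head. */
static void
dwq_insert_tail(struct dma_wait_queue *q, struct dma_wait_entry *e)
{
	assert(q != NULL && e != NULL);
	e->next = NULL;
	if (q->tail != NULL)
		q->tail->next = e;
	else
		q->head = e;
	q->tail = e;
	q->depth++;
}

/* O(1) removal from the front; returns NULL when the queue is empty. */
static struct dma_wait_entry *
dwq_remove_head(struct dma_wait_queue *q)
{
	struct dma_wait_entry *e = q->head;

	if (e == NULL)
		return (NULL);
	q->head = e->next;
	if (q->head == NULL)
		q->tail = NULL;	/* queue drained, reset the tail too */
	q->depth--;
	e->next = NULL;
	return (e);
}

/* Rough depth estimate: segments needed to carry one application read. */
static size_t
dwq_estimated_depth(size_t io_size, size_t mss)
{
	assert(mss > 0);
	return ((io_size + mss - 1) / mss);
}
```

The only extra cost over a head-only list is one pointer per socket and
resetting the tail when the queue drains.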

> 
> Would it be possible to have a sysctl tunable threshold, below which
> the system does a normal uiomove?  A normal copyout() will certainly
> be faster at some point..
>

Yes - it doesn't make sense to use the DMA engine for small packets.
Also you get more benefit if you overlap IO with computation.
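
Roughly what such a threshold could look like, as a userland sketch and
not the driver's code. The variable ioat_copy_threshold, its 4096 byte
default, and the function names are all assumptions for illustration; the
right cutoff would have to be measured. In the kernel the variable would
presumably be exported through sysctl(9), and the CPU branch would be the
normal uiomove() path; memcpy() stands in for both copy paths here so the
example compiles on its own.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical tunable: copies shorter than this many bytes skip the
 * DMA engine.  The default is a guess, not a measured value.
 */
static size_t ioat_copy_threshold = 4096;

enum copy_path { COPY_CPU, COPY_DMA };

/* Pick the copy path for a request of len bytes. */
static enum copy_path
choose_copy_path(size_t len)
{
	return (len < ioat_copy_threshold ? COPY_CPU : COPY_DMA);
}

/*
 * Placeholder for handing a copy to the DMA engine.  A real driver
 * would queue a descriptor and complete asynchronously; this stub
 * copies synchronously so the sketch is testable.
 */
static void
dma_copy_stub(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
}

/* Copy len bytes and report which path was taken. */
static enum copy_path
copy_with_threshold(void *dst, const void *src, size_t len)
{
	enum copy_path path = choose_copy_path(len);

	if (path == COPY_CPU)
		memcpy(dst, src, len);	/* stands in for uiomove() */
	else
		dma_copy_stub(dst, src, len);
	return (path);
}
```

Setting the threshold to 0 sends every copy to the engine, which makes
it easy to compare both paths while tuning.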

> Thanks for the great work!

Thank you for your help earlier.