Re: memory barriers in bus_dmamap_sync() ?

From: John Baldwin <jhb_at_freebsd.org>
Date: Wed, 11 Jan 2012 10:05:28 -0500
On Tuesday, January 10, 2012 5:41:00 pm Luigi Rizzo wrote:
> On Tue, Jan 10, 2012 at 01:52:49PM -0800, Adrian Chadd wrote:
> > On 10 January 2012 13:37, Luigi Rizzo <rizzo_at_iet.unipi.it> wrote:
> > > I was glancing through manpages and implementations of bus_dma(9)
> > > and i am a bit unclear on what this API (in particular, bus_dmamap_sync() )
> > > does in terms of memory barriers.
> > >
> > > I see that the x86/amd64 and ia64 code only does the bounce buffers.

That is because x86 in general does not need memory barriers here.  Other
platforms do need them (alpha had them in bus_dmamap_sync()).

> > > The mips seems to do some coherency-related calls.
> > >
> > > How do we guarantee, say, that a recently built packet is
> > > written to memory before issuing the tx command to the NIC ?
> > 
> > The drivers should be good examples of doing the right thing. You just
> > do pre-map and post-map calls as appropriate.
> > 
> > Some devices don't bother with this on register accesses and this is a
> > bug. (eg, ath/ath_hal.) Others (eg iwn) do explicit flushes where
> > needed.
> 
> so you are saying that drivers are correct unless they are buggy :)

For bus_dma, just use bus_dmamap_sync() and you will be fine.
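
To make the ordering concrete, here is a minimal userland sketch of a transmit
path. Everything in it is a hypothetical stand-in (struct tx_ring, trace_step(),
mock_sync_prewrite(), mock_sync_postwrite()); it only records the order of the
steps and is not the real bus_dma(9) API. In a driver the two mock calls would
be bus_dmamap_sync() with BUS_DMASYNC_PREWRITE and BUS_DMASYNC_POSTWRITE.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; a real driver uses bus_dma(9) tags, maps and flags. */
enum step { STEP_FILL, STEP_SYNC_PREWRITE, STEP_DOORBELL, STEP_SYNC_POSTWRITE };

struct tx_ring {
	uint8_t		buf[64];	/* mock DMA-able packet buffer */
	enum step	trace[8];	/* order in which the steps ran */
	size_t		ntrace;
};

/* Append one step to the trace, dropping it if the trace is full. */
void
trace_step(struct tx_ring *r, enum step s)
{
	if (r->ntrace < sizeof(r->trace) / sizeof(r->trace[0]))
		r->trace[r->ntrace++] = s;
}

/* Kernel equivalent: bus_dmamap_sync(tag, map, BUS_DMASYNC_PREWRITE). */
void
mock_sync_prewrite(struct tx_ring *r)
{
	trace_step(r, STEP_SYNC_PREWRITE);
}

/* Kernel equivalent: bus_dmamap_sync(tag, map, BUS_DMASYNC_POSTWRITE). */
void
mock_sync_postwrite(struct tx_ring *r)
{
	trace_step(r, STEP_SYNC_POSTWRITE);
}

/* CPU builds the packet, syncs, and only then tells the device to go. */
void
tx_enqueue(struct tx_ring *r, const void *pkt, size_t len)
{
	if (len > sizeof(r->buf))
		len = sizeof(r->buf);
	memcpy(r->buf, pkt, len);
	trace_step(r, STEP_FILL);
	mock_sync_prewrite(r);
	/* A real driver would write the NIC's TX register via bus_space here. */
	trace_step(r, STEP_DOORBELL);
}

/* Called once the device reports that it has finished with the buffer. */
void
tx_complete(struct tx_ring *r)
{
	mock_sync_postwrite(r);
}
```

The point of the sketch is only the sequence: fill, PREWRITE sync, doorbell,
and a POSTWRITE sync once the device is done with the buffer.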

> Anyways... i see that some drivers use wmb() and rmb() and redefine
> their own version, usually based on lfence/sfence even on i386
> 
> 	#define rmb()	__asm volatile("lfence" ::: "memory")
> 	#define wmb()	__asm volatile("sfence" ::: "memory")
> 
> whereas the standard definitions are slightly different, e.g.
> sys/i386/include/atomic.h:
> 
>     #define      rmb()   __asm __volatile("lock; addl $0,(%%esp)" : : : "memory")
>     #define      wmb()   __asm __volatile("lock; addl $0,(%%esp)" : : : "memory")
> 
> and our bus_space API in sys/x86/include/bus.h is a bit unclear to
> me (other than the fact that having 4 unused arguments doesn't really
> encourage its use...)

We could use lfence/sfence on amd64, but on i386 not all processors support
them (sfence requires SSE and lfence requires SSE2).  The broken drivers that
hand-roll these barriers don't work on early i386 CPUs.  Also, I personally
don't like using membars like rmb() and wmb() by hand.  If you are operating
on normal memory I think atomic_load_acq() and atomic_store_rel() are better.
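
As a rough illustration of the acquire/release style, here is a sketch using
C11 <stdatomic.h> as a userland analogue. It is not FreeBSD's atomic(9)
interface, and struct slot, slot_publish() and slot_consume() are hypothetical
names. The ordering is attached to the flag accesses themselves rather than to
free-standing rmb()/wmb() calls.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared slot: a payload plus a "ready" flag. */
struct slot {
	uint32_t	payload;	/* plain data, written before publishing */
	atomic_bool	ready;		/* publication flag */
};

/* Producer: write the payload, then publish it with release semantics. */
void
slot_publish(struct slot *s, uint32_t value)
{
	s->payload = value;
	/* Release: the payload store cannot be reordered after this store. */
	atomic_store_explicit(&s->ready, true, memory_order_release);
}

/* Consumer: returns true and fills *out only if the slot was published. */
bool
slot_consume(struct slot *s, uint32_t *out)
{
	/* Acquire: the payload load cannot be reordered before this load. */
	if (!atomic_load_explicit(&s->ready, memory_order_acquire))
		return (false);
	*out = s->payload;
	return (true);
}
```

A consumer that sees ready set is guaranteed to see the payload written before
it, with no separate barrier calls to get wrong.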

>     static __inline void
>     bus_space_barrier(bus_space_tag_t tag __unused, bus_space_handle_t bsh __unused,
> 		      bus_size_t offset __unused, bus_size_t len __unused, int flags)
>     {
>     #ifdef __GNUCLIKE_ASM
> 	    if (flags & BUS_SPACE_BARRIER_READ)
>     #ifdef __amd64__
> 		    __asm __volatile("lock; addl $0,0(%%rsp)" : : : "memory");
>     #else
> 		    __asm __volatile("lock; addl $0,0(%%esp)" : : : "memory");
>     #endif
> 	    else
> 		    __asm __volatile("" : : : "memory");
>     #endif
>     }

This is only for use with something accessed via bus_space(9).  Often these
are not needed, however.  For example, on x86 all bus_space memory is mapped
uncached, so only a compiler barrier is needed, not a hardware barrier.
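
A compiler-only barrier can be sketched as below, using GCC/Clang inline asm
syntax. struct fake_dev, fake_dev_kick() and the compiler_barrier() macro are
hypothetical names for illustration; the volatile member merely stands in for
an uncached device register and no real hardware is involved.

```c
#include <stdint.h>

/* Compiler-only barrier (GCC/Clang syntax): emits no instruction. */
#define	compiler_barrier()	__asm__ __volatile__("" ::: "memory")

/* Hypothetical device: a plain command buffer plus a mock register. */
struct fake_dev {
	uint32_t		cmd_buf[4];	/* ordinary memory */
	volatile uint32_t	doorbell;	/* stands in for an uncached register */
};

/* Store the command, then ring the doorbell, in that order. */
void
fake_dev_kick(struct fake_dev *dev, uint32_t cmd)
{
	dev->cmd_buf[0] = cmd;
	/* Keep the compiler from moving the plain store past the register write. */
	compiler_barrier();
	dev->doorbell = 1;
}
```

The barrier constrains only the compiler; it relies on the hardware already
performing the accesses in program order, as described above for x86.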

-- 
John Baldwin
Received on Wed Jan 11 2012 - 14:14:06 UTC
