On Mon, 17 Mar 2008, Julian Elischer wrote:

>> Per previous posts, interested parties can find the slides on the design
>> from the BSDCan 2007 developer summit here:
>>
>> http://www.watson.org/~robert/freebsd/2007bsdcan/20070517-devsummit-zerocopybpf.pdf
>
> with the video of the talk at:
>
> http://www.freebsd.org/~julian/BSDCan-2007/rwatson_bpf.mov

The primary design change since that time is that we've eliminated the
ioctl-driven monitoring and ACKing of shared memory buffers from userspace.
All shared memory consumers must use the shared memory ACK model, and our
libpcap changes do that. This removes redundancy (and complexity) from the
set of ioctls we've added. I've attached the (new) text from bpf.4 below,
which I think captures the changes best.

Robert N M Watson
Computer Laboratory
University of Cambridge

BUFFER MODES
     bpf devices deliver packet data to the application via memory buffers
     provided by the application. The buffer mode is set using the
     BIOCSETBUFMODE ioctl, and read using the BIOCGETBUFMODE ioctl.

   Buffered read mode
     By default, bpf devices operate in the BPF_BUFMODE_BUFFER mode, in
     which packet data is copied explicitly from the kernel to user memory
     using the read(2) system call. The user process will declare a fixed
     buffer size that will be used both for sizing internal buffers and for
     all read(2) operations on the file. This size is queried using the
     BIOCGBLEN ioctl, and is set using the BIOCSBLEN ioctl. Note that an
     individual packet larger than the buffer size is necessarily truncated.

   Zero-copy buffer mode
     bpf devices may also operate in the BPF_BUFMODE_ZEROCOPY mode, in which
     packet data is written directly into user memory buffers by the kernel,
     avoiding both system call and copying overhead. Buffers are of fixed
     (and equal) size, page-aligned, and an even multiple of the page size.
     The maximum zero-copy buffer size is returned by the BIOCGETZMAX ioctl.
     Note that an individual packet larger than the buffer size is
     necessarily truncated.

     The user process registers two memory buffers using the BIOCSETZBUF
     ioctl, which accepts a struct bpf_zbuf pointer as an argument:

     struct bpf_zbuf {
             void *bz_bufa;
             void *bz_bufb;
             size_t bz_buflen;
     };

     bz_bufa is a pointer to the userspace address of the first buffer that
     will be filled, and bz_bufb is a pointer to the second buffer. bpf will
     then cycle between the two buffers starting with bz_bufa.

     Each buffer begins with a fixed-length header to hold synchronization
     and data length information for the buffer:

     struct bpf_zbuf_header {
             volatile u_int  bzh_kernel_gen; /* Kernel generation number. */
             volatile u_int  bzh_kernel_len; /* Length of data in the buffer. */
             volatile u_int  bzh_user_gen;   /* User generation number. */
             /* ...padding for future use... */
     };

     The header structure of each buffer, including all padding, should be
     zeroed before it is passed to the ioctl. Remaining space in the buffer
     will be used by the kernel to store packet data, laid out in the same
     format as with buffered read mode.

     The kernel and the user process follow a simple acknowledgement
     protocol via the buffer header to synchronize access to the buffer:
     when the header generation numbers, bzh_kernel_gen and bzh_user_gen,
     hold the same value, the kernel owns the buffer, and when they differ,
     userspace owns the buffer.

     While the kernel owns the buffer, the contents are unstable and may
     change asynchronously; while the user process owns the buffer, its
     contents are stable and will not be changed until the buffer has been
     acknowledged.

     Initializing the buffer headers to all 0's before registering the
     buffer has the effect of assigning initial ownership of both buffers to
     the kernel. The kernel signals that a buffer has been assigned to
     userspace by modifying bzh_kernel_gen, and userspace acknowledges the
     buffer and returns it to the kernel by setting the value of
     bzh_user_gen to the value of bzh_kernel_gen.
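
     The ownership rule above can be tried out without a bpf device. The
     following is only a sketch under stated assumptions, not the kernel's
     implementation: struct sim_zbuf_header and the sim_* functions are
     hypothetical names that mirror the header fields, the kernel side is
     simulated in the same process, portable C11 atomics stand in for the
     FreeBSD atomic primitives, and the kernel is assumed to modify
     bzh_kernel_gen by incrementing it (the text only says that it is
     modified).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct bpf_zbuf_header, using C11 atomics. */
struct sim_zbuf_header {
	atomic_uint bzh_kernel_gen;	/* Kernel generation number. */
	atomic_uint bzh_kernel_len;	/* Length of data in the buffer. */
	atomic_uint bzh_user_gen;	/* User generation number. */
};

/* Userspace owns the buffer exactly when the generation numbers differ. */
static bool
sim_user_owns(struct sim_zbuf_header *h)
{
	return (atomic_load_explicit(&h->bzh_user_gen, memory_order_relaxed) !=
	    atomic_load_explicit(&h->bzh_kernel_gen, memory_order_acquire));
}

/*
 * Simulated kernel side: record the data length, then hand the buffer to
 * userspace.  Incrementing the generation number is an assumption.
 */
static void
sim_kernel_assign(struct sim_zbuf_header *h, unsigned int len)
{
	assert(!sim_user_owns(h));	/* The kernel must own the buffer. */
	atomic_store_explicit(&h->bzh_kernel_len, len, memory_order_relaxed);
	atomic_fetch_add_explicit(&h->bzh_kernel_gen, 1, memory_order_release);
}

/* User side: return the buffer by copying the kernel generation number. */
static void
sim_user_acknowledge(struct sim_zbuf_header *h)
{
	unsigned int gen;

	gen = atomic_load_explicit(&h->bzh_kernel_gen, memory_order_relaxed);
	atomic_store_explicit(&h->bzh_user_gen, gen, memory_order_release);
}
```

     Starting from a zeroed header, sim_user_owns() reports false (the
     kernel owns the buffer); after sim_kernel_assign(&h, 128) it reports
     true, and after sim_user_acknowledge(&h) it reports false again.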

     In order to avoid caching and memory re-ordering effects, the user
     process must use atomic operations and memory barriers when checking
     for and acknowledging buffers:

     #include <machine/atomic.h>

     /*
      * Return ownership of a buffer to the kernel for reuse.
      */
     static void
     buffer_acknowledge(struct bpf_zbuf_header *bzh)
     {

             atomic_store_rel_int(&bzh->bzh_user_gen, bzh->bzh_kernel_gen);
     }

     /*
      * Check whether a buffer has been assigned to userspace by the kernel.
      * Return true if userspace owns the buffer, and false otherwise.
      */
     static int
     buffer_check(struct bpf_zbuf_header *bzh)
     {

             return (bzh->bzh_user_gen !=
                 atomic_load_acq_int(&bzh->bzh_kernel_gen));
     }

     The user process may force the assignment of the next buffer, if any
     data is pending, to userspace using the BIOCROTZBUF ioctl. This allows
     the user process to retrieve data in a partially filled buffer before
     the buffer is full, such as following a timeout; the process must check
     for buffer ownership using the header generation numbers, as the buffer
     will not be assigned if no data was present.

     As in the buffered read mode, kqueue(2), poll(2), and select(2) may be
     used to sleep awaiting the availability of a completed buffer. They
     will return a readable file descriptor when ownership of the next
     buffer is assigned to user space.

     In the current implementation, the kernel will assign ownership of at
     most one buffer at a time to the user process. The user process must
     acknowledge the current buffer in order to be notified that the next
     buffer is ready for processing. Programs should not rely on this as an
     invariant, as it may change in future versions.

Received on Mon Mar 17 2008 - 17:45:53 UTC