Re: crash when bpf is used heavily

From: Sergey Lyubka <devnull_at_uptsoft.com>
Date: Sat, 29 May 2004 16:03:45 +0300
> It looks like the BPF code is written to handle the case where allocation
> fails, but that it passes flags to the memory allocator that prevent the
> memory allocator from returning a failure.  Specifically,
> src/sys/net/bpf.c:bpf_allocbufs() passes M_WAITOK into malloc().  Try
> changing that flag (in both instances) to M_NOWAIT.  This will still
> permit BPF to consume large quantities of memory if the maximum buffer
> size is set so large, but it will cause BPF itself not to cause a direct
> panic if address space is exhausted.  I'm a little surprised M_NOWAIT
> isn't already the setting there, actually. 

I put M_NOWAIT into bpf_allocbufs().
Now I get a page fault:
panic: kmem_malloc(4098) too small 
Fatal trap 12: page fault while in kernel mode
cpuid = 0, apic id = 00
fault virtual address = 0x24
fault code = supervisor read, page not present
..... skipped
current process = 4 (g_down)

> The system will still be running in a low address space scenario which
> might cause other parts of the system to bump into the failure, however.
> Unfortunately, balancing multiple consumers of address space is a "hard
> problem".  With the mbuf allocator, we make use of a separate address
> space map of bounded size to prevent the total address space consumed by
> packet buffers exceeding a certain size.  It might be interesting to
> experiment with allocating BPF space from the same map, as it would change
> the trade-off from "panic if there's no room"  to "stall the network stack
> if there's no room".  The other common solution is to use smaller buffers,
> making the trade-off become "If the packets come too fast, we drop them".
> I realize that is the problem you're trying to solve... :-)  On systems
> I've worked with that need to do processing of many high speed packet
> streams, we've generally tried to combine all the processing into modules
> in a single process, as this has a number of benefits:
> 
> (1) It avoids storing the same packet many times in many buffers for
>     different consumers.
> 
> (2) It reduces the over-all memory overhead of buffering in the kernel.
> 
> (3) It reduces the number of memory copies by avoiding copying the same
>     packet many times (in particular, between user and kernel space)
> 
> (4) It avoids performing additional context switches during high speed
>     tracing, which can substantially impact available CPU resources for
>     packet copying and monitoring.

100% agree. I follow the same approach, and have just a few BPF apps on the box.

At the moment I am doing penetration tests, trying to overload the box.
I have found that it does not survive if many BPF apps are running, and that is
my concern: operators may log in to the box and simultaneously run
tcpdumps just to quickly check something, and that may cause the box to crash.

I have a question. (Maybe more a *-net question than *-current.)
Would it be more efficient to have user-mappable memory
chunks that represent a device driver's input and output queues? libpcap
could then map those chunks, keeping the buffers in userspace.
Does that make sense? How difficult would it be to write a module that
implements such a mapping, and is it possible to hook it into existing drivers
without changing their code?

Thanks for an answer,
sergey
Received on Sat May 29 2004 - 04:04:01 UTC
