Re: less aggressive contigmalloc ?

From: Alan Cox <alc_at_rice.edu>
Date: Thu, 23 Aug 2012 12:08:40 -0500
On 08/23/2012 11:31, Luigi Rizzo wrote:
> On Thu, Aug 23, 2012 at 10:48:27AM -0500, Alan Cox wrote:
>> On Wed, Aug 22, 2012 at 7:01 AM, Luigi Rizzo<rizzo_at_iet.unipi.it>  wrote:
>>
>>> I am trying to make netmap adapt the amount of memory it allocates
>>> to what is available. At its core, it uses contigmalloc() with
>>> small chunks (even down to 1 page) to fetch memory.
>>>
>>> The problem is that, before failing, contigmalloc() tries to
>>> swap out some processes, effectively killing them because I have
>>> no swap configured in my PicoBSD image. This happens with both
>>> M_WAITOK and M_NOWAIT; the difference is only in the number of
>>> retries it does -- see e.g.
>>>
>>>          sys/vm/vm_kern.c :: kmem_alloc_contig()
>>>
>>> where it retries once for M_NOWAIT and 3 times for M_WAITOK.
>>>
>>> I wonder if there is a way to make contigmalloc() less aggressive,
>>> so that it fails without killing those innocent processes?
>>>
>>>
>> Have you actually observed processes being killed with M_NOWAIT?
>>
>> The difference between M_NOWAIT and M_WAITOK is more than just the number
>> of retries.  Each successive iteration is more aggressive in its attempt to
>> recover pages.  On the first iteration, no pages should be written to
>> swap.  Nothing should happen that could result in process termination.
> Yes, I do see that.
>
> It may be less aggressive with M_NOWAIT, but it still kills processes.

Are you compiling world with MALLOC_PRODUCTION?  The latest version of 
jemalloc uses significantly more memory when debugging options are 
enabled.  This first came up in a thread titled "10-CURRENT and swap 
usage" back in June.

Even at its most aggressive, M_WAITOK, contigmalloc() does not directly 
kill processes.  If process death coincides with the use of 
contigmalloc(), it is simply the result of earlier, successful 
contigmalloc() calls (or, for that matter, any other physical memory 
allocations) having depleted the pool of free pages to the point that 
the page daemon runs and invokes vm_pageout_oom().


> Here is what I get with a kernel on a QEMU machine that does not
> have enough memory. This output appears while the kernel loops
> around a contigmalloc() call, getting one page at a time (I have
> rate-limited the prints). The function is
> netmap_finalize_obj_allocator(): it works for a while, then some
> processes get killed, the allocations keep succeeding, more
> processes get killed, and so on. Eventually memory runs out and you
> see the 'Unable to create...' message at the end.
>
>      ...
>      269.005884 netmap_finalize_obj_allocator [593] cluster at 63182 ok
> 					
>      PicoBSD (default) (ttyv0)
> 					
>      login: pid 60 (getty), uid 0, was killed: out of swap space
>      pid 63 (init), uid 0, was killed: out of swap space
>      pid 62 (init), uid 0, was killed: out of swap space
>      pid 61 (init), uid 0, was killed: out of swap space
>      pid 64 (init), uid 0, was killed: out of swap space
>      pid 51 (getty), uid 0, was killed: out of swap space
>      pid 50 (getty), uid 0, was killed: out of swap space
>      pid 65 (init), uid 0, was killed: out of swap space
>      pid 49 (getty), uid 0, was killed: out of swap space
>      pid 48 (getty), uid 0, was killed: out of swap space
>      pid 47 (getty), uid 0, was killed: out of swap space
>      pid 57 (pkt-gen), uid 0, was killed: out of swap space
>      269.602751 netmap_finalize_obj_allocator [600] Unable to create cluster at 95452 for 'netmap_buf' allocator
>
> In case it helps: this machine has no swap configured
> (it is a PicoBSD image run within QEMU).
>
> cheers
> luigi
>
Received on Thu Aug 23 2012 - 15:08:44 UTC