Re: OOMA kill with vm.pfault_oom_attempts="-1" on RPi3 at r357147 (a vm_pfault_oom_attempts < 0 handling bug as of head -r357026)

From: Mark Millard <marklmi_at_yahoo.com>
Date: Tue, 28 Jan 2020 13:31:45 -0800
[I recommend not sending kib our other exchanges: that he has been
notified is enough. I also sent the material to 2 folks that I forgot at
the time and replied to one more related message with the information.
For most of these folks our general exchange is likely viewed as noise
after then basic, important point for them. Note: kib was the reviewer,
not the creator or submitter of -r357026 .]

On 2020-Jan-28, at 12:11, bob prohaska <fbsd at www.zefox.net> wrote:

> On Tue, Jan 28, 2020 at 11:28:14AM -0800, Mark Millard wrote:
>> 
>> 
>> On 2020-Jan-28, at 11:02, bob prohaska <fbsd at www.zefox.net> wrote:
>> 
>>> On Tue, Jan 28, 2020 at 09:42:17AM -0800, Mark Millard wrote:
>>>> 
>>>> 
>>>> 
>>> The (partly)modified kernel compiled and booted without
>>> obvious trouble. It's trying to finish buildworld now.
>>> 
> Stopped already, with 
> Jan 28 11:41:59 www kernel: pid 29909 (cc), jid 0, uid 0, was killed: fault's page allocation failed
> 

Yea, what I did in vm_pageout_oom for its messages does
indictae when the vm_pageout_oom(VM_OOM_MEM_PF)
happens of itself. So the printf before that call is
not essential.

With the bug that we have identified, this is the
expected notification until things are fixed.

> 
>>>> If you are testing with vm.pfault_oom_attempts="-1" then
>>>> the vm_fault printf message should never happen anyway.
>>>> 
>>> Would it not be interesting if the message appeared in that
>>> case? 
>> 
>> Thanks for the question: looking at the new code found a bug
>> causing oom where it used to be avoided in head -r357025 and
>> before.
> 
> 
> Glad to be of service, even if inadvertently 8-)
> 
> 
>> After vm_waitpfault(dset, vm_pfault_oom_wait * hz)
>> the -r357026 code does a vm_pageout_oom(VM_OOM_MEM_PF) no
>> matter what, even when vm_pfault_oom_attempts < 0 ||
>> fs->oom < vm_pfault_oom_attempts :
>> 
>> New code in head -r357026
>> ( nothing to avoid the vm_pageout_oom(VM_OOM_MEM_PF)
>> for vm_pfault_oom_attempts < 0 ||
>> fs->oom < vm_pfault_oom_attempts ):
>> 
>> 	if (fs->m == NULL) {
>> 		unlock_and_deallocate(fs);
>> 		if (vm_pfault_oom_attempts < 0 ||
>> 		    fs->oom < vm_pfault_oom_attempts) {
>> 			fs->oom++;
>> 			vm_waitpfault(dset, vm_pfault_oom_wait * hz);
>> 		}
>> 		if (bootverbose)
>> 			printf(
>> "proc %d (%s) failed to alloc page on fault, starting OOM\n",
>> 			    curproc->p_pid, curproc->p_comm);
>> 		vm_pageout_oom(VM_OOM_MEM_PF);
>> 		return (KERN_RESOURCE_SHORTAGE);
>> 	}
>> 
>> Old code in head -r357025
>> ( has the goto RetryFault_oom after vm_waitpfault(. . .),
>> thereby avoiding the vm_pageout_oom(VM_OOM_MEM_PF) for
>> vm_pfault_oom_attempts < 0 || fs->oom < vm_pfault_oom_attempts ) :
>> 
>> 			if (fs.m == NULL) {
>> 				unlock_and_deallocate(&fs);
>> 				if (vm_pfault_oom_attempts < 0 ||
>> 				    oom < vm_pfault_oom_attempts) {
>> 					oom++;
>> 					vm_waitpfault(dset,
>> 					    vm_pfault_oom_wait * hz);
>> 					goto RetryFault_oom;
>> 				}
>> 				if (bootverbose)
>> 					printf(
>> 	"proc %d (%s) failed to alloc page on fault, starting OOM\n",
>> 					    curproc->p_pid, curproc->p_comm);
>> 				vm_pageout_oom(VM_OOM_MEM_PF);
>> 				goto RetryFault;
>> 			}
>> 
>> I expect this is the source of the behavioral
>> difference folks have been seeing for OOM kills.
>> 
>> 
>> As for "gather evidence" messages . . .
>> 
>>>> You may be able to just look and manually delete or
>>>> comment out the bootverbose line in the more modern
>>>> source that currently looks like:
>>>> 
>>>> 		if (bootverbose)
>>>> 			printf(
>>>> "proc %d (%s) failed to alloc page on fault, starting OOM\n",
>>>> 			    curproc->p_pid, curproc->p_comm);
>>>> 		vm_pageout_oom(VM_OOM_MEM_PF);
>>>> 		return (KERN_RESOURCE_SHORTAGE);
>>>> 
>>> 
>>> I can find those lines in /usr/src/sys/vm/vm_fault.c, but
>>> unclear on the motivation to comment the lines out. Perhaps 
>>> to eliminate the return(...) ?  Anyway, is it sufficient 
>>> to insert /* before and */ after? 
>> 
>> The only line to delete or comment out in that
>> code block is:
>> 
>> 		if (bootverbose)
>> 
>> Disabling that line makes the following printf
>> always happen, even when a verbose boot was not
>> done.
> Oops, it's commented out now and the kernel is rebuilding.

Not a big deal, given the "was killed: fault's page
allocation failed" message that is separately generated.

>> 
>> Based on the above reported code change, having
>> a message before vm_pageout_oom(VM_OOM_MEM_PF) is
>> important to getting a report of the kill being
>> via that code.
>> 

I did not think of what I'd done in vm_pageout_oom
when I wrote that.

My hope is that at least something like what I did
in vm_pageout_oom for message content will be adopted
so the notices are accurate to context and more
traceable.


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
Received on Tue Jan 28 2020 - 20:31:52 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:22 UTC