Re: panic with out of memory

From: Alan Cox <alc_at_rice.edu>
Date: Wed, 20 Jun 2012 11:44:01 -0500

On 06/20/2012 08:25, Konstantin Belousov wrote:
> On Wed, Jun 20, 2012 at 08:19:39AM -0400, John Baldwin wrote:
>> On Tuesday, June 19, 2012 9:30:59 pm Steve Wills wrote:
>>> Hi,
>>>
>>> I just got a panic out of my r237195 system. The panic looks like:
>>>
>>> Sleeping thread (tid 173153, pid 42034) owns a non-sleepable lock
>>> KDB: stack backtrace of thread 173153:
>>> sched_switch() at sched_switch+0x28a
>>> mi_switch() at mi_switch+0xdf
>>> sleepq_timedwait() at sleepq_timedwait+0x3a
>>> _sleep() at _sleep+0x266
>>> swp_pager_meta_build() at swp_pager_meta_build+0x259
>>> swap_pager_copy() at swap_pager_copy+0x17b
>>> vm_object_collapse() at vm_object_collapse+0x123
>>> vm_object_deallocate() at vm_object_deallocate+0x457
>>> vm_map_process_deferred() at vm_map_process_deferred+0x72
>>> vm_pageout_oom() at vm_pageout_oom+0x180
>>> swp_pager_meta_build() at swp_pager_meta_build+0x248
>>> swap_pager_copy() at swap_pager_copy+0x17b
>>> vm_object_collapse() at vm_object_collapse+0x123
>>> vm_object_deallocate() at vm_object_deallocate+0x457
>>> vm_map_process_deferred() at vm_map_process_deferred+0x72
>>> vm_map_remove() at vm_map_remove+0x116
>>> exec_new_vmspace() at exec_new_vmspace+0x1bc
>>> exec_elf64_imgact() at exec_elf64_imgact+0x5f4
>>> kern_execve() at kern_execve+0x6f0
>>> sys_execve() at sys_execve+0x37
>>> amd64_syscall() at amd64_syscall+0x351
>>> Xfast_syscall() at Xfast_syscall+0xfb
>>> --- syscall (59, FreeBSD ELF64, sys_execve), rip = 0x800d2eddc, rsp =
>>> 0x7fffffffd328, rbp = 0x7fffffffd470 ---
>>> panic: sleeping thread
>>> cpuid = 4
>>>
>>> The system was very busy and using lots of swap, but I didn't expect a
>>> panic. If any more detail is needed or I should just get more RAM, let
>>> me know. :)
>> Hmm, this is due to a bug I noticed recently as well.  I had been talking
>> with Alan and Konstantin about the proper fix.  Hmm, thinking about this some
>> more, perhaps a simpler fix would be to have a 'I'm already in
>> vm_map_process_deferred()' flag.  Or even better, just move the entire list
>> off into a static variable so that we don't get caught in recursion.
>> Something like this:
>>
>> Index: vm_map.c
>> ===================================================================
>> --- vm_map.c	(revision 237227)
>> +++ vm_map.c	(working copy)
>> @@ -475,12 +475,14 @@ static void
>>   vm_map_process_deferred(void)
>>   {
>>   	struct thread *td;
>> -	vm_map_entry_t entry;
>> +	vm_map_entry_t entry, next;
>>   	vm_object_t object;
>>
>>   	td = curthread;
>> -	while ((entry = td->td_map_def_user) != NULL) {
>> -		td->td_map_def_user = entry->next;
>> +	entry = td->td_map_def_user;
>> +	td->td_map_def_user = NULL;
>> +	while (entry != NULL) {
>> +		next = entry->next;
>>   		if ((entry->eflags & MAP_ENTRY_VN_WRITECNT) != 0) {
>>   			/*
>>   			 * Decrement the object's writemappings and
>> @@ -494,6 +496,7 @@ vm_map_process_deferred(void)
>>   			    entry->end);
>>   		}
>>   		vm_map_entry_deallocate(entry, FALSE);
>> +		entry = next;
>>   	}
>>   }
> Yes, looks like it should work.

I'll add, "Me too."  I'm much happier with this than the previous patch.
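
For anyone following along, the idea in the patch (detach the whole
deferred list first, then walk the private copy) can be tried outside
the kernel.  What follows is only a userland sketch under assumptions:
struct entry, struct thread_ctx, defer(), release_entry() and
drain_deferred() are hypothetical stand-ins, not kernel code, and
release_entry() re-enters the drain unconditionally just to mimic the
nested vm_map_process_deferred() call visible in the backtrace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a deferred map entry. */
struct entry {
	struct entry *next;
	int id;
};

/* Hypothetical stand-in for the per-thread state. */
struct thread_ctx {
	struct entry *deferred;	/* plays the role of td_map_def_user */
	int depth;		/* current nesting of drain_deferred() */
	int max_depth;		/* deepest nesting observed */
	int processed;		/* entries released so far */
};

void drain_deferred(struct thread_ctx *td);

/* Push one entry onto the thread's deferred list. */
void
defer(struct thread_ctx *td, int id)
{
	struct entry *e;

	e = malloc(sizeof(*e));
	assert(e != NULL);
	e->id = id;
	e->next = td->deferred;
	td->deferred = e;
}

/*
 * Release one entry.  The nested drain_deferred() call is an assumption
 * made for illustration: it models a release path that ends up back in
 * the drain function.
 */
void
release_entry(struct thread_ctx *td, struct entry *e)
{
	drain_deferred(td);
	td->processed++;
	free(e);
}

/*
 * Detach-then-drain, same shape as the patched loop: the list head is
 * cleared before any entry is released, so a nested call sees an empty
 * list and returns immediately.
 */
void
drain_deferred(struct thread_ctx *td)
{
	struct entry *entry, *next;

	td->depth++;
	if (td->depth > td->max_depth)
		td->max_depth = td->depth;

	entry = td->deferred;
	td->deferred = NULL;
	while (entry != NULL) {
		next = entry->next;	/* save before the entry is freed */
		release_entry(td, entry);
		entry = next;
	}
	td->depth--;
}
```

With three entries queued, the nested calls in this sketch never go
more than one level deep, because each of them finds the list already
empty.  If the loop instead popped one entry at a time from
td->deferred, the nested call would pick up the remaining entries and
the nesting would grow with the length of the list.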

Alan