Re: Swapped out procs not brought in immediately after child exits

From: David Xu <davidxu_at_freebsd.org>
Date: Sun, 06 Mar 2005 09:30:10 +0800
Sam Lawrance wrote:

>>Submitter-Id:  current-users
>>Originator:    Sam Lawrance
>>Confidential:  no 
>>Synopsis:      Swapped out procs not brought in immediately after child exits
>>Severity:      non-critical
>>Priority:      medium
>>Category:      kern
>>Class:         sw-bug
>>Release:       FreeBSD 5.4-PRERELEASE i386
>>Environment:
>>    
>>
>System: FreeBSD dirk.no.domain 5.4-PRERELEASE FreeBSD 5.4-PRERELEASE #10:
>Sun Mar 6 10:45:13 EST 2005
>root@dirk.no.domain:/usr/testbuild/src5/sys/i386/compile/GENERIC i386
>
>
>  
>
>>Description:
>>    
>>
>
>I run -stable on my lonely box, but AFAICS this affects current.
>
>This problem is similar in flavour to one that I reported a while ago,
>since fixed.
>
>Here's an example. Below we have a login, shell and su which have
>swapped out, and a shell which is active:
>
>root 4291  0.0  0.0  1664     0  v3  IWs  -         0:00.00 login [pam] (login)
>sam  4298  0.0  0.0  2260     0  v3  IW   -         0:00.00 -bash (bash)
>root 4299  0.0  0.0  1644     0  v3  IW   -         0:00.00 su
>root 4300  0.0  0.4  2952  1132  v3  S+    3:23PM   0:00.66 su (bash)
>
>When 4300 exits, it will sit in the zombie state for a long
>time, waiting for 4299 to be swapped in.  Same for 4299 and 4298.
>
>The kernel call stack for 4300 would be something like
>
>	exit1
>	  kern_exit
>	    wakeup (parent process as wait channel)
>	      sleepq_broadcast
>	        sleepq_resume_thread (on parent process)
>	          setrunnable
>
>In setrunnable, curthread->td_pflags is flagged with TDP_WAKEPROC0 to
>indicate the vm scheduler should be awoken to do its thing.
>
>David Xu's original change was to check for TDP_WAKEPROC0 in
>critical_exit() and wakeup(&proc0) from there. Things were arranged
>this way in order to prevent an LOR between sched_lock and sleepqueue
>locks.
>
>  
>
My first patch put the TDP_WAKEPROC0 flag in per-CPU data, so that after
a switch to another thread there is always a critical_exit() to act on
it. But scottl told me that using a per-CPU flag reduces performance to a
visible degree; I assume he is using a badly designed P4, a long-pipeline
core. At least on a PIII, I do not see any performance reduction when
using the per-CPU flag.

David Xu

>That scheme doesn't take into account that exit1() does a
>critical_enter() that has no corresponding critical_exit() in that
>thread (because the exiting thread grabs sched_lock which is held across
>cpu_throw).
>
>So the wakeup is not done, and we just have to wait for the vm's tsleep
>on proc0 to time out. The same thing might occur across other exit
>points, but I don't know what they are.
>
>  
>
>>How-To-Repeat:
>>    
>>
>
>Run a shell somewhere (first). Su or run another shell or similar (second).
>Wait until the first shell has swapped out (might require running some other
>memory hogs). Exit the second shell. Notice that the second shell takes a
>long time to exit.
>
>  
>
>>Fix:
>>    
>>
>
>A possible solution might be to wakeup(&proc0) after waking the parent
>and before grabbing sched_lock:
>
>Index: kern_exit.c
>===================================================================
>RCS file: /home/ncvs/FreeBSD/src/sys/kern/kern_exit.c,v
>retrieving revision 1.256
>diff -u -r1.256 kern_exit.c
>--- kern_exit.c	29 Jan 2005 14:03:41 -0000	1.256
>+++ kern_exit.c	6 Mar 2005 01:17:35 -0000
>@@ -503,6 +503,7 @@
> 	mtx_unlock_spin(&sched_lock);
> 	wakeup(p->p_pptr);
> 	PROC_UNLOCK(p->p_pptr);
>+	wakeup(&proc0);
> 	mtx_lock_spin(&sched_lock);
> 	critical_exit();
> 
Received on Sun Mar 06 2005 - 00:30:05 UTC
