On Thu, Mar 09, 2017 at 04:46:46PM +0200, Konstantin Belousov wrote:
> Yes, there is a race, apparently, with the child zombie still not finishing
> sending the SIGCHLD to the parent and parent exiting.  The following should
> fix the issue, but I do not think that reproducing the problem is easy.

> diff --git a/sys/kern/kern_exit.c b/sys/kern/kern_exit.c
> index c524fe5df37..ba5ff84e9de 100644
> --- a/sys/kern/kern_exit.c
> +++ b/sys/kern/kern_exit.c
> @@ -189,6 +189,7 @@ exit1(struct thread *td, int rval, int signo)
>  {
>  	struct proc *p, *nq, *q, *t;
>  	struct thread *tdt;
> +	ksiginfo_t ksi;
>
>  	mtx_assert(&Giant, MA_NOTOWNED);
>  	KASSERT(rval == 0 || signo == 0, ("exit1 rv %d sig %d", rval, signo));
> @@ -456,7 +457,12 @@ exit1(struct thread *td, int rval, int signo)
>  			proc_reparent(q, q->p_reaper);
>  			if (q->p_state == PRS_ZOMBIE) {
>  				PROC_LOCK(q->p_reaper);
> -				pksignal(q->p_reaper, SIGCHLD, q->p_ksi);
> +				if (q->p_ksi != NULL) {
> +					ksiginfo_init(&ksi);
> +					ksiginfo_copy(q->p_ksi, &ksi);
> +				}
> +				pksignal(q->p_reaper, SIGCHLD, q->p_ksi !=
> +				    NULL ? &ksi : NULL);
>  				PROC_UNLOCK(q->p_reaper);
>  			}
>  		} else {

This patch introduces a subtle correctness bug. A real SIGCHLD ksiginfo
should always be the zombie's p_ksi; otherwise, the siginfo may be lost
if there are too many signals pending for the target process or in the
system. If the siginfo is lost and the reaper normally passes si_pid to
waitpid() or similar (instead of passing WAIT_ANY or P_ALL), a zombie
will remain until the reaper terminates.

Conceptually the siginfo is sent to one process at a time only, so the
bug is an artifact of the implementation. Perhaps the piece of code
added in r309886 can be moved or the ksiginfo can be removed from the
parent's queue.

If such a fix is not possible, it may be better to send a bare SIGCHLD
(si_code is SI_KERNEL or 0, depending on how many signals are pending)
in this situation and document that reapers must use WAIT_ANY or P_ALL.
(However, compared to the pre-r309886 situation they can still use
SIGCHLD to get notified when to call waitpid() or similar.)

-- 
Jilles Tjoelker

Received on Thu Mar 09 2017 - 22:11:55 UTC
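
For illustration, a reaper that does not depend on si_pid could drain its
zombies with WAIT_ANY each time it notices SIGCHLD. The following is only
a minimal userland sketch and not part of the patch under discussion: the
helper name reap_exited_children() is hypothetical, and the fallback
definition of WAIT_ANY is an assumption for systems whose <sys/wait.h>
does not provide it.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef WAIT_ANY
#define	WAIT_ANY	(-1)	/* assumed fallback; FreeBSD defines this itself */
#endif

/*
 * Hypothetical helper: reap every child that has already exited, without
 * looking at si_pid from the SIGCHLD siginfo.  Meant to be called from the
 * reaper's main loop whenever a SIGCHLD has been noticed.
 * Returns the number of children reaped by this call.
 */
static int
reap_exited_children(void)
{
	int reaped, status;
	pid_t pid;

	reaped = 0;
	for (;;) {
		pid = waitpid(WAIT_ANY, &status, WNOHANG);
		if (pid > 0) {
			/* One zombie collected; there may be more. */
			reaped++;
			continue;
		}
		if (pid == -1 && errno == EINTR)
			continue;	/* interrupted, just retry */
		/* 0: remaining children still run; -1: e.g. ECHILD. */
		assert(pid == 0 || pid == -1);
		break;
	}
	return (reaped);
}
```

Because the loop never names a specific pid, a SIGCHLD that arrives
without usable siginfo still causes all pending zombies to be collected,
which is the behavior the bare-SIGCHLD fallback would rely on.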