On Fri, Mar 17, 2017 at 11:26:43AM -0700, Bryan Drewery wrote: > On 3/10/2017 2:44 AM, Konstantin Belousov wrote: > > On Fri, Mar 10, 2017 at 12:11:51AM +0100, Jilles Tjoelker wrote: > >> This patch introduces a subtle correctness bug. A real SIGCHLD ksiginfo > >> should always be the zombie's p_ksi; otherwise, the siginfo may be lost > >> if there are too many signals pending for the target process or in the > >> system. If the siginfo is lost and the reaper normally passes si_pid to > >> waitpid() or similar (instead of passing WAIT_ANY or P_ALL), a zombie > >> will remain until the reaper terminates. > >> > >> Conceptually the siginfo is sent to one process at a time only, so the > >> bug is an artifact of the implementation. Perhaps the piece of code > >> added in r309886 can be moved or the ksiginfo can be removed from the > >> parent's queue. > >> > >> If such a fix is not possible, it may be better to send a bare SIGCHLD > >> (si_code is SI_KERNEL or 0, depending on how many signals are pending) > >> in this situation and document that reapers must use WAIT_ANY or P_ALL. > >> (However, compared to the pre-r309886 situation they can still use > >> SIGCHLD to get notified when to call waitpid() or similar.) > >> > > > > IMO it is acceptable for reaper to know and handle the case of lost > > siginfo. But also it seems to be not too hard to guarantee the queueing > > of the SIGCHLD for zombie re-parerning. > > > > diff --git a/sys/kern/kern_exit.c b/sys/kern/kern_exit.c > > index c524fe5df37..1a1dcc3a4c7 100644 > > --- a/sys/kern/kern_exit.c > > +++ b/sys/kern/kern_exit.c > > _at__at_ -189,6 +189,7 _at__at_ exit1(struct thread *td, int rval, int signo) > > { > > struct proc *p, *nq, *q, *t; > > struct thread *tdt; > > + ksiginfo_t *ksi, *ksi1; > > > > mtx_assert(&Giant, MA_NOTOWNED); > > KASSERT(rval == 0 || signo == 0, ("exit1 rv %d sig %d", rval, signo)); > > _at__at_ -449,14 +450,23 _at__at_ exit1(struct thread *td, int rval, int signo) > > wakeup(q->p_reaper); > > for (; q != NULL; q = nq) { > > nq = LIST_NEXT(q, p_sibling); > > + ksi = ksiginfo_alloc(TRUE); > > PROC_LOCK(q); > > q->p_sigparent = SIGCHLD; > > > > if (!(q->p_flag & P_TRACED)) { > > proc_reparent(q, q->p_reaper); > > if (q->p_state == PRS_ZOMBIE) { > > + if (q->p_ksi == NULL) { > > + ksi1 = NULL; > > + } else { > > + ksiginfo_copy(q->p_ksi, ksi); > > + ksi->ksi_flags |= KSI_INS; > > + ksi1 = ksi; > > + ksi = NULL; > > + } > > PROC_LOCK(q->p_reaper); > > - pksignal(q->p_reaper, SIGCHLD, q->p_ksi); > > + pksignal(q->p_reaper, SIGCHLD, ksi1); > > PROC_UNLOCK(q->p_reaper); > > } > > } else { > > _at__at_ -489,6 +499,8 _at__at_ exit1(struct thread *td, int rval, int signo) > > kern_psignal(q, SIGKILL); > > } > > PROC_UNLOCK(q); > > + if (ksi != NULL) > > + ksiginfo_free(ksi); > > } > > > > /* > > > Ping? Is this still progressing to be committed? r315159 ?Received on Fri Mar 17 2017 - 18:05:25 UTC
This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:10 UTC