Re: Deadlock involving truss -f, pdfork() and wait4()

From: Mariusz Zaborski <oshogbo_at_freebsd.org>
Date: Fri, 13 Sep 2019 16:05:21 +0200

Hello Ryan,

Can you verify whether this patch fixes your issue:
https://reviews.freebsd.org/D20362

Thanks,
Mariusz

On Thu, 12 Sep 2019 at 21:37, Ryan Stone <rysto32_at_gmail.com> wrote:
>
> I've hit an issue with a simple use of pdfork().  I have a process
> that calls pdfork() and the parent immediately does a wait4() on the
> child pid.  This works fine under normal conditions, but if the parent
> is run under truss -f, the three processes deadlock.  If I switch out
> pdfork() for fork(), the deadlock does not occur.
>
> This C file demonstrates the issue:
>
> https://people.freebsd.org/~rstone/pdfork.c
>
> If I run "truss -f ./pdfork", which uses fork(), it completes within a
> second.  If I run "truss -f ./pdfork -p", which uses pdfork(), the
> processes deadlock.  If I run "./pdfork -p" without truss, it
> completes normally.
>
> procstat reports the following kernel stacks:
>
> 27572 102043 truss               -                   mi_switch+0xe2
> sleepq_catch_signals+0x425 sleepq_wait_sig+0xf _sleep+0x1bf
> kern_wait6+0x695 sys_wait6+0x9f amd64_syscall+0x36e
> fast_syscall_common+0x101
> 27573 102469 pdfork              -                   mi_switch+0xe2
> sleepq_catch_signals+0x425 sleepq_wait_sig+0xf _sleep+0x1bf
> kern_wait6+0x695 sys_wait4+0x78 amd64_syscall+0x36e
> fast_syscall_common+0x101
> 27574 102053 pdfork              -                   mi_switch+0xe2
> thread_suspend_switch+0xd4 ptracestop+0x13b fork_return+0x14e
> fork_exit+0x83 fork_trampoline+0xe
>
> As near as I can tell, truss is blocked waiting for ptrace events, the
> parent process is blocked in wait4, and the child process is perhaps
> waiting for its parent to exit the kernel so it can send the ptrace
> event?
>
> I really don't see anything obvious in the pdfork() code path that
> would cause this to happen when fork() doesn't have the problem.  It
> may be that pdfork() just changes the timing enough to expose a latent
> bug.
>
> I'm seeing this on a recentish current (r351363).
Received on Fri Sep 13 2019 - 12:05:37 UTC