Re: ptrace attach in multi-threaded processes

From: Mark Johnston <markj_at_FreeBSD.org>
Date: Fri, 15 Jul 2016 11:01:59 -0700

On Fri, Jul 15, 2016 at 10:27:20AM +0300, Konstantin Belousov wrote:
> On Thu, Jul 14, 2016 at 11:16:05AM -0700, Mark Johnston wrote:
> > Please see the program here:
> > https://people.freebsd.org/~markj/ptrace_stop.c
> > 
> > It cheats a bit: it uses SIGSTOP to stop the child before sending a
> > SIGHUP to it. However, this is just for convenience; note that PT_ATTACH
> > will result in a call to thread_unsuspend() on the child, so PT_ATTACH's
> > SIGSTOP will be delivered to a running process. When ptrace attaches,
> > the child stops and WSTOPSIG(status) == SIGHUP. When ptrace detaches,
> > the child is left stopped.
> No, it is not for convenience, it relies on another bug to get the effect,
> see below.

I see. I should have noted that the result can be reproduced without the
first SIGSTOP, just not reliably. That is, I still occasionally get the
following output when the kill(SIGSTOP) and subsequent waitpid() call
are removed:

stopping signal is 1
waiting on child...
child is stopped after detach (sig 17)
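
For reference, the "stopping signal" line is just WSTOPSIG() applied to
the waitpid(2) status. Below is a minimal sketch of that decoding. It is
not taken from ptrace_stop.c: the helper names are made up for
illustration, and the child stops itself with raise(SIGSTOP) rather than
being attached with ptrace, so the result does not depend on the race
described above.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that stops itself with SIGSTOP.
 * Returns the child's pid in the parent, or -1 if fork() failed.
 */
pid_t
spawn_stopped_child(void)
{
	pid_t pid;

	pid = fork();
	if (pid == 0) {
		raise(SIGSTOP);		/* child stops here */
		_exit(0);		/* only reached if resumed */
	}
	return (pid);
}

/*
 * Hypothetical helper: wait until the child stops and return the
 * signal that stopped it, or -1 if it did not report a stop.
 */
int
wait_for_stop(pid_t pid)
{
	int status;

	if (waitpid(pid, &status, WUNTRACED) != pid)
		return (-1);
	if (!WIFSTOPPED(status))
		return (-1);
	return (WSTOPSIG(status));
}

/* Hypothetical helper: kill the stopped child and reap it. */
void
reap_child(pid_t pid)
{
	int status;

	kill(pid, SIGKILL);
	waitpid(pid, &status, 0);
}
```

In the real test the parent is a tracer, so the stop is reported without
WUNTRACED, and the value that comes back is SIGHUP rather than SIGSTOP.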

> 
> As I understand your intent, you prefer to get SIGSTOP from the first
> waitpid(2) call after successful PT_ATTACH, am I right?

Hm, I don't care very much about that. I was just addressing your claim
that the "debugger interface guarantees that SIGSTOP is noted." 

> At least for
> single-threaded case, this can be achieved with a flag indicating that
> we are doing first cursig(9) action after the attach, and preferring
> SIGSTOP over any other queued signal.  The new flag P2_PTRACE_FSTP
> does just that.  For mt case, I believe that some enhancements to
> my proc_next_xthread() would fix that.

This seems like a sound approach to me. It provides the guarantee I
referenced above, and ensures that the SIGSTOP from PT_ATTACH is
delivered before PT_DETACH.

> 
> But when debugging the code, I found that it still does not work reliably
> for your test.  The reason is that issignal() consumes a queued stop signal
> after the thread_suspend_switch().  It allows the attach to occur, but then
> sigqueue_delete() calls ('take the signal!') eat the signal for attach. It
> seems that we should consume stops before going to stop state.  An open
> question is how much this hurts when another (non-debugging) SIGSTOP is
> queued while in stopped state.
> 
> Please try this.

Thanks, this seems to give the desired behaviour in the single-threaded
case. I'll write a test case for the multi-threaded case next.

Am I correct in thinking that r302179 could be reverted if your change
is committed?

Received on Fri Jul 15 2016 - 15:58:42 UTC
