Re: ptrace attach in multi-threaded processes

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Wed, 13 Jul 2016 06:30:36 +0300
On Tue, Jul 12, 2016 at 11:24:14AM -0700, Mark Johnston wrote:
> On Tue, Jul 12, 2016 at 08:51:50PM +0300, Konstantin Belousov wrote:
> > On Tue, Jul 12, 2016 at 10:05:02AM -0700, Mark Johnston wrote:
> > > On Tue, Jul 12, 2016 at 08:57:53AM +0300, Konstantin Belousov wrote:
> > > I suppose it is not strictly incorrect. I find it surprising that a
> > > PT_ATTACH followed by a PT_DETACH may leave the process in a different
> > > state than it was in before the attach. This means that it is not
> > > possible to gcore a process without potentially leaving it stopped, for
> > > instance. This result may occur in a single-threaded process
> > > as well, since a signal may already be queued when the PT_ATTACH handler
> > > sends SIGSTOP.
> > I still miss something.  Isn't this an expected outcome of sending a
> > signal with STOP action?
> 
> It is. But I also expect a PT_DETACH operation to resume a stopped
> process, assuming that a second SIGSTOP was not posted while the
> process was suspended.
But as far as the situation has been discussed, it seems that a real SIGSTOP
raced with PT_ATTACH. The offered interpretation, that SIGSTOP was delivered
'a bit later' than PT_ATTACH, would fit the description.

> 
> > 
> > > Indeed, I somehow missed that. I had assumed that the leaked TDB_XSIG
> > > represented a bug in ptracestop().
> > It could; I did not make any statement that denies the bug:
> 
> To be clear, the root of my issue comes from the following: the SIGSTOP
> from PT_ATTACH may be handled concurrently with a second signal
> delivered to a second thread in the same process. Then, the resulting
> behaviour depends on the order in which the recipient threads suspend in
> ptracestop(). If the thread that received SIGSTOP suspends last, its
> td_xsig will be overwritten with the userland-provided value in the
> PT_DETACH handler. If it suspends first, its td_xsig will be preserved,
> and upon PT_DETACH the process will be suspended again in issignal().
> 
> I'm not sure if this is considered a bug. ptracestop() is handling all
> signals (including the SIGSTOP generated by the PT_ATTACH handler) in a
> consistent way, but this results in inconsistent behaviour from the
> perspective of a ptrace(2) consumer.
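
For illustration, the ordering dependence described in the quoted paragraph can be captured in a toy model. This is only a sketch of the behaviour as stated there, not kernel code: `struct toy_thread` and `toy_detach_stops_again()` are hypothetical names, only `td_xsig` is borrowed from the discussion, and the assumption is that PT_DETACH rewrites `td_xsig` solely for the thread that suspended last.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-thread state under discussion. */
struct toy_thread {
	int td_xsig;	/* signal the thread acts on once resumed */
};

/*
 * Toy PT_DETACH: only the thread that suspended last has its td_xsig
 * replaced by the value supplied from userland.  Returns true if some
 * thread still carries SIGSTOP afterwards, i.e. the process would be
 * suspended again after the detach.
 */
static bool
toy_detach_stops_again(struct toy_thread *thr, int nthr, int last, int usersig)
{
	bool stops = false;

	assert(last >= 0 && last < nthr);
	thr[last].td_xsig = usersig;
	for (int i = 0; i < nthr; i++) {
		if (thr[i].td_xsig == SIGSTOP)
			stops = true;
	}
	return (stops);
}
```

With two threads, one holding SIGSTOP and one holding another signal, the model returns false when the SIGSTOP thread is `last` and true otherwise, which is the asymmetry the quoted text describes.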

I still do not understand what is inconsistent.

Let's look at it from the other side (so far, we have discussed the
implementation in the kernel).  Does this happen in gcore(1)?  If yes, gcore's
interaction with ptrace(2) looks like this:
	ptrace(PT_ATTACH, g_pid, 0, 0);
	waitpid(g_pid, &g_status, 0);
	...
	if (sig == SIGSTOP)
		sig = 0;
	ptrace(PT_DETACH, g_pid, (caddr_t)1, sig);
It sounds as if you want to modify gcore(1) to consume
all signals, or at least all STOP signals, before PT_DETACH.  I do not
understand why you want that, but it would probably give you the
behaviour you want:
	ptrace(PT_ATTACH, g_pid);
	waitpid(g_pid, &g_status, 0);
	...
	/* still consume implicit SIGSTOP from attach */
	if (sig == SIGSTOP)
		sig = 0;
	do {
		error = waitpid(g_pid, &g_status, WNOHANG | WSTOPPED);
	} while (error == 0);
	ptrace(PT_DETACH, g_pid, 1, sig);
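
The `if (sig == SIGSTOP)` step in both fragments can be isolated as a small helper. This is a minimal sketch, not the actual gcore source: the function names are hypothetical, and it assumes that `sig` is taken from the waitpid() status via WSTOPSIG(), which the fragments above elide.

```c
#include <assert.h>
#include <signal.h>
#include <sys/wait.h>

/*
 * Hypothetical helper: choose the signal to hand to PT_DETACH.  The
 * SIGSTOP generated by the attach itself is swallowed; any other
 * signal is passed through unchanged.
 */
static int
detach_signal_for(int stopsig)
{
	return (stopsig == SIGSTOP ? 0 : stopsig);
}

/*
 * Same decision, starting from the status filled in by waitpid().
 * Assumes the tracee was reported as stopped.
 */
static int
detach_signal_from_status(int status)
{
	assert(WIFSTOPPED(status));
	return (detach_signal_for(WSTOPSIG(status)));
}
```

A caller would pass the result as the last argument of the PT_DETACH request.  Note that this only filters the one stop reported to the debugger; it does nothing about a SIGSTOP still pending on another thread, which is the case being discussed in this thread.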
Received on Wed Jul 13 2016 - 01:30:49 UTC
