Re: thread suspension when dumping core

From: Jilles Tjoelker <jilles_at_stack.nl> Date: Tue, 7 Jun 2016 16:24:53 +0200 · This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:41:05 UTC

On Tue, Jun 07, 2016 at 07:29:56AM +0300, Konstantin Belousov wrote:
> On Mon, Jun 06, 2016 at 09:17:41PM -0700, Mark Johnston wrote:
> > Sure, see below. For reference:

> > td_flags = 0xa84c = INMEM | SINTR | CANSWAP | ASTPENDING | SBDRY | NEEDSUSPCHK
> > td_pflags = 0
> > td_inhibitors = 0x2 = SLEEPING
> > td_locks = 0

> > stack:
> > mi_switch+0x21e sleepq_catch_signals+0x377 sleepq_wait_sig+0xb _sleep+0x29d ...

> > p_flag = 0x10080080 = INMEM | STOPPED_SINGLE | HADTHREADS
> > p_flag2 = 0

> > The thread is sleeping interruptibly. The NEEDSUSPCHK flag is set, yet the
> > SLEEPABORT flag is not, so the thread can not have been sleeping when
> > thread_single() was called - else sleepq_abort() would have been
> > invoked and set SLEEPABORT. We are at the second sleepq_switch() call in
> > sleepq_catch_signals(), and no signal was pending, so we called
> > thread_suspend_check(), which returned 0 because of SBDRY. So we went to
> > sleep.

> > I note that this couldn't have happened prior to r283320. That change
> > was apparently motivated by a similar hang, but in that case the thread
> > was suspended (with a vnode lock held) rather than asleep. It looks like
> > our internal fix also added a change to set TDF_SBDRY around
> > filesystem-specific syscalls, which often sleep interruptibly while
> > holding vnode locks. But I don't think that's the problem here, as you
> > noted with lf_advlock().

> > With r283320 reverted, P_STOPPED_SIG would not have been set, so
> > thread_suspend_check() would have suspended us, allowing the core dump
> > to proceed. I had thought that using SINGLE_BOUNDRY beforing coredumping
> > would fix both hangs, but I guess that wouldn't help SINGLE_ALLPROC, so
> > this is probably the wrong place to be solving the problem.

> This looks as if we should not ignore suspension requests in
> thread_suspend_check() completely in TDF_SBDRY case, but return either
> EINTR or ERESTART (most likely ERESTART). Note that the goal of
> TDF_SBDRY is to avoid suspending in the protected region, not to make an
> impression that the suspension does not occur at all.

This looks like it would revert r246417 and re-introduce the bug fixed
by it (unexpected [EINTR] and short reads/writes after stop signals).

After r246417, TDF_SBDRY is intended for sleeps that occur while holding
resources such as vnode locks and are normally short but should be
interruptible by fatal signals because they may occasionally be
indefinitely long (such as a non-responsive NFS server).

It looks like yet another kind of sleep may be required, since advisory
locks still hold some filesystem resources across the sleep (though not
vnode locks).

We then have four kinds:

* uninterruptible by signals, ignores stops (default)
* interruptible by signals, ignores stops (current TDF_SBDRY with
  PCATCH)
* interruptible by signals, freezes in place on stops (avoids
  unexpected short I/O) (current PCATCH, otherwise)
* interruptible by signals, fails with [ERESTART] on stops (avoids
  holding resources across a stop) (new)

The new kind of sleep would fail with [ERESTART] only for stops, since
[EINTR] should only be returned if a signal handler was called. There
cannot be a signal handler since a SIGTSTP/SIGTTIN/SIGTTOU signal with a
handler does not stop the process.

-- 
Jilles Tjoelker