Re: thread suspension when dumping core

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Wed, 8 Jun 2016 16:56:35 +0300
On Wed, Jun 08, 2016 at 06:35:08AM -0700, Mark Johnston wrote:
> On Wed, Jun 08, 2016 at 07:30:55AM +0300, Konstantin Belousov wrote:
> > On Tue, Jun 07, 2016 at 11:19:19PM +0200, Jilles Tjoelker wrote:
> > > I also wonder whether we may be overengineering things here. Perhaps
> > > the advlock sleep can simply turn off TDF_SBDRY.
> > Well, this was the very first patch suggested.  I would be fine with that,
> > but again, out-of-tree code seems to be not quite fine with that local
> > solution.
> 
> In our particular case, we could possibly use a similar approach. In
> general, it seems incorrect to clear TDF_SBDRY if the thread calling
> sx_sleep() has any locks held. It is easy to verify that all callers of
> lf_advlock() are safe in this respect, but this kind of auditing is
> generally hard. In fact, I believe the sx_sleep that led to the problem
> described in D2612 is the same as the one in my case. That is, the
> sleeping thread may or may not hold a vnode lock depending on context.

I do not think that in-tree code sleeps with a vnode lock held in
lf_advlock().  Otherwise, the system would hang in a lock cascade on
an attempt to obtain an advisory lock.  I think we can even assert
this with witness.

There is another sleep, which Jilles mentioned, in lf_purgelocks(),
called from vgone(). This sleep indeed occurs under the vnode lock, and
as such must be non-suspendable. The sleep waits until other threads
leave lf_advlock() for the reclaimed vnode, and they should leave in
deterministic time due to the issued wakeups.  So this sleep is exempt
from these considerations, and TDF_SBDRY there is correct.

I am fine with either the braces around sx_sleep() in lf_advlock() to
clear TDF_SBDRY (sigdeferstop()), or with the latest patch I sent,
which adds a temporary override for TDF_SBDRY with TDF_SRESTART. My
understanding is that you prefer the latter. If I do not misrepresent
your position, I understand why you prefer that.
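
To make the first option concrete, here is a rough userland model of
the bracket around the sleep. It is only a sketch and not kernel code:
struct thread_model, the flag value, and every function name below are
made up for illustration, and the real sx_sleep() is replaced by a
simple check of the flag. The second option (the TDF_SRESTART
override) is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag value; the real definition lives in sys/proc.h. */
#define TDF_SBDRY	0x1	/* stop only on usermode boundary */

/* Hypothetical stand-in for the few bits of struct thread we need. */
struct thread_model {
	int	td_flags;
	int	vnode_locks_held;
};

/* Open the bracket: clear TDF_SBDRY and return its previous state. */
static int
bracket_begin(struct thread_model *td)
{
	int saved = td->td_flags & TDF_SBDRY;

	td->td_flags &= ~TDF_SBDRY;
	return (saved);
}

/* Close the bracket: restore whatever bracket_begin() saved. */
static void
bracket_end(struct thread_model *td, int saved)
{
	td->td_flags |= saved;
}

/* In this model a sleeping thread may be suspended iff TDF_SBDRY is clear. */
static bool
suspendable_in_sleep(const struct thread_model *td)
{
	return ((td->td_flags & TDF_SBDRY) == 0);
}

/*
 * Model of the advisory-lock sleep.  Returns whether the thread was
 * suspendable while "sleeping".
 */
static bool
advlock_sleep_model(struct thread_model *td)
{
	/* Stand-in for the witness assertion: no vnode lock held here. */
	assert(td->vnode_locks_held == 0);

	int saved = bracket_begin(td);
	bool ok = suspendable_in_sleep(td);	/* the sleep would go here */
	bracket_end(td, saved);
	return (ok);
}
```

In this model a thread that enters with TDF_SBDRY set is suspendable
only inside the bracket, and gets the flag back on exit. The
lf_purgelocks() sleep would simply not use the bracket, since it
sleeps under the vnode lock.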
Received on Wed Jun 08 2016 - 11:56:46 UTC
