Re: call suspend_cpus() under smp_ipi_mtx

From: John Baldwin <jhb_at_freebsd.org>
Date: Mon, 1 Apr 2013 10:52:18 -0400

On Saturday, March 23, 2013 5:48:50 am Andriy Gapon wrote:
> 
> Looks like this issue needs more thinking and discussing.
> 
> The basic idea is that suspend_cpus() must be called with smp_ipi_mtx held (on
> SMP systems).
> This is for exactly the same reason that we take smp_ipi_mtx before calling
> stop_cpus() in the shutdown path.  Essentially, one CPU could be holding
> smp_ipi_mtx (and thus running with interrupts disabled[*]) while waiting for an
> acknowledgement from other CPUs (e.g. in smp_rendezvous or in a TLB shootdown),
> while another CPU could be running with interrupts explicitly disabled (as in
> the shutdown or ACPI suspend paths) and trying to deliver an IPI to other CPUs.
> 
> In my opinion, we must consistently use the same lock, smp_ipi_mtx, for all
> regular (non-NMI) synchronous IPI-based communication between CPUs.  Otherwise a
> deadlock is quite possible.
> 
> Some obstacles for just going ahead and making the suggested change:
> 
> - acpi_sleep_machdep() calls intr_suspend() with interrupts disabled; currently
> witness(9) is not aware of that, but if the smp_ipi_mtx spin lock is used, then
> we would have to make intr_table_lock and msi_lock spin locks as well;
> - AcpiLeaveSleepStatePrep() (from ACPICA) is called with interrupts disabled and
> currently performs an action that requires memory allocation; again, with
> interrupts disabled via intr_disable() this is not visible to witness(9), but
> with smp_ipi_mtx held it would have to be handled somehow.
> 
> I talked to the ACPICA guys about the last issue and they told me that what is
> currently done in AcpiLeaveSleepStatePrep does not need to be done with
> interrupts disabled and can be moved to AcpiLeaveSleepState.  This is after the
> _BFS and _GTS support was removed.
> 
> What do you think?
> Thank you.

Hmm, I think intr_table_lock used to be a spin lock at some point.  I don't remember
why we changed it to a regular mutex.  It may be that there was a lock order reason
for that. :(
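
For anyone following the thread, the single-lock rule from the quoted message can be tried out in a small userspace model. This is only a sketch with pthreads, not kernel code: ipi_lock stands in for smp_ipi_mtx, and pending[], acks, service_ipi(), sync_ipi_all() and run_model() are all hypothetical names made up for the illustration. It also assumes that a "CPU" waiting for the lock keeps acknowledging requests, while the lock holder does not.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>

#define NCPU 2                      /* number of simulated CPUs (threads) */

/* Hypothetical stand-in for smp_ipi_mtx: serializes all initiators. */
static pthread_mutex_t ipi_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int pending[NCPU];    /* 1 while a request waits for this CPU */
static atomic_int acks;             /* acks collected by the current initiator */
static atomic_int done_count;       /* initiators that have finished */

/* Acknowledge a pending request, if any (models taking the IPI). */
static void service_ipi(int self)
{
    if (atomic_exchange(&pending[self], 0))
        atomic_fetch_add(&acks, 1);
}

/* Synchronous request to all other CPUs, always under ipi_lock. */
static void sync_ipi_all(int self)
{
    /* Assumption: a waiter stays responsive until it owns the lock. */
    while (pthread_mutex_trylock(&ipi_lock) != 0) {
        service_ipi(self);
        sched_yield();
    }
    atomic_store(&acks, 0);
    for (int i = 0; i < NCPU; i++)
        if (i != self)
            atomic_store(&pending[i], 1);
    /* The holder does not service requests ("interrupts disabled"). */
    while (atomic_load(&acks) < NCPU - 1)
        sched_yield();
    pthread_mutex_unlock(&ipi_lock);
}

static void *cpu_main(void *arg)
{
    int self = (int)(intptr_t)arg;

    sync_ipi_all(self);
    atomic_fetch_add(&done_count, 1);
    /* Keep acknowledging until every initiator has finished. */
    while (atomic_load(&done_count) < NCPU) {
        service_ipi(self);
        sched_yield();
    }
    return NULL;
}

/* Runs the model once; returns the number of initiators that completed. */
int run_model(void)
{
    pthread_t t[NCPU];

    atomic_store(&acks, 0);
    atomic_store(&done_count, 0);
    for (int i = 0; i < NCPU; i++)
        atomic_store(&pending[i], 0);
    for (int i = 0; i < NCPU; i++) {
        int rc = pthread_create(&t[i], NULL, cpu_main, (void *)(intptr_t)i);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NCPU; i++)
        pthread_join(t[i], NULL);
    return atomic_load(&done_count);
}
```

Build with -pthread and call run_model() from a small main(). In this model both threads finish because only one of them at a time can be waiting for acknowledgements. If one thread skipped ipi_lock and spun for acknowledgements without servicing requests, the two could end up waiting on each other forever, which is the scenario described in the quoted message.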

-- 
John Baldwin
Received on Mon Apr 01 2013 - 13:36:00 UTC
