Re: Boot still broken from r349133-r349160 - Was re:(Problem with USB after r349133)

From: Hans Petter Selasky <hps_at_selasky.org>
Date: Tue, 13 Aug 2019 15:31:09 +0200
Hi,

After tearing ACPI apart, I found what appears to be the following issue:

1) AcpiUtAcquireMutex() doesn't support recursion, but it also fails to
report an error when recursion occurs. Here is the backtrace of the
illegal mutex recursion:

 > AcpiUtAcquireMutex() at AcpiUtAcquireMutex+0x1fc/frame 0xffffffff834815d0
 > AcpiWalkNamespace() at AcpiWalkNamespace+0x8a/frame 0xffffffff83481640
 > AcpiNsInitializeObjects() at AcpiNsInitializeObjects+0x9b/frame 0xffffffff834816c0
 > AcpiExLoadTableOp() at AcpiExLoadTableOp+0x21c/frame 0xffffffff83481730
 > AcpiExOpcode_6A_0T_1R() at AcpiExOpcode_6A_0T_1R+0x22e/frame 0xffffffff83481790
 > AcpiDsExecEndOp() at AcpiDsExecEndOp+0x1dc/frame 0xffffffff83481830
 > AcpiPsParseLoop() at AcpiPsParseLoop+0x75a/frame 0xffffffff83481880
 > AcpiPsParseAml() at AcpiPsParseAml+0xfd/frame 0xffffffff834818d0
 > AcpiPsExecuteMethod() at AcpiPsExecuteMethod+0x27d/frame 0xffffffff83481940
 > AcpiNsEvaluate() at AcpiNsEvaluate+0x336/frame 0xffffffff834819b0
 > AcpiEvaluateObject() at AcpiEvaluateObject+0x223/frame 0xffffffff83481a10
 > AcpiEvaluateObjectTyped() at AcpiEvaluateObjectTyped+0xe0/frame 0xffffffff83481aa0
 > acpi_EvaluateOSC() at acpi_EvaluateOSC+0xef/frame 0xffffffff83481b90
 > acpi_cpu_attach() at acpi_cpu_attach+0x432/frame 0xffffffff83481cb0
 > DEVICE_ATTACH() at DEVICE_ATTACH+0x87/frame 0xffffffff83481cf0
 > device_attach() at device_attach+0xb9/frame 0xffffffff83481d80
 > device_probe_and_attach() at device_probe_and_attach+0x106/frame 0xffffffff83481dc0
 > bus_generic_attach() at bus_generic_attach+0x2c/frame 0xffffffff83481df0
 > acpi_probe_children() at acpi_probe_children+0x77/frame 0xffffffff83481e30
 > acpi_attach() at acpi_attach+0xbfe/frame 0xffffffff83482050
 > DEVICE_ATTACH() at DEVICE_ATTACH+0x87/frame 0xffffffff83482090
 > device_attach() at device_attach+0xb9/frame 0xffffffff83482120
 > device_probe_and_attach() at device_probe_and_attach+0x106/frame 0xffffffff83482160
 > bus_generic_attach() at bus_generic_attach+0x2c/frame 0xffffffff83482190
 > nexus_acpi_attach() at nexus_acpi_attach+0x59/frame 0xffffffff834821b0
 > DEVICE_ATTACH() at DEVICE_ATTACH+0x87/frame 0xffffffff834821f0
 > device_attach() at device_attach+0xb9/frame 0xffffffff83482280
 > device_probe_and_attach() at device_probe_and_attach+0x106/frame 0xffffffff834822c0
 > bus_generic_new_pass() at bus_generic_new_pass+0xb5/frame 0xffffffff83482300
 > BUS_NEW_PASS() at BUS_NEW_PASS+0x87/frame 0xffffffff83482340
 > bus_set_pass() at bus_set_pass+0x8f/frame 0xffffffff83482360
 > root_bus_configure() at root_bus_configure+0xe/frame 0xffffffff83482370
 > configure() at configure+0x11/frame 0xffffffff83482390
 > mi_startup() at mi_startup+0x2dc/frame 0xffffffff834823f0
 > btext() at btext+0x2c
 > ACPI Error: AE_ALREADY_ACQUIRED, During WalkNamespace (20190703/nsinit-232)


The illegal mutex recursion ends up leaking a lock, which later causes
a deadlock during boot because accesses to ACPI hang forever.


2) The following patch works around the issue:

 > diff --git a/sys/contrib/dev/acpica/components/utilities/utmutex.c b/sys/contrib/dev/acpica/components/utilities/utmutex.c
 > index 4853bf5c3a6..33a67a731c6 100644
 > --- a/sys/contrib/dev/acpica/components/utilities/utmutex.c
 > +++ b/sys/contrib/dev/acpica/components/utilities/utmutex.c
 > @@ -378,6 +378,16 @@ AcpiUtAcquireMutex (
 >
 >      ThisThreadId = AcpiOsGetThreadId ();
 >
 > +    if (AcpiGbl_MutexInfo[MutexId].ThreadId == ThisThreadId)
 > +    {
 > +       ACPI_ERROR ((AE_INFO,
 > +           "Mutex [%s] already acquired by this thread [%u]",
 > +           AcpiUtGetMutexName (MutexId),
 > +           (UINT32) ThisThreadId));
 > +
 > +       return (AE_ALREADY_ACQUIRED);
 > +    }
 > +
 >  #ifdef ACPI_MUTEX_DEBUG
 >      {
 >          UINT32
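
For readers who want to try the idea outside the kernel, the owner check
in the patch can be sketched in userland C roughly as follows. This is
only an illustration under stated assumptions, not ACPICA code: the
struct owned_mutex type, the ST_* status codes and the owned_mutex_*()
functions are hypothetical stand-ins for AcpiGbl_MutexInfo[], ACPI_STATUS
and AcpiUtAcquireMutex()/AcpiUtReleaseMutex(), and POSIX threads replace
the ACPI OSL layer.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical status codes, standing in for ACPI_STATUS values. */
enum status { ST_OK = 0, ST_ALREADY_ACQUIRED, ST_NOT_ACQUIRED };

/* Hypothetical stand-in for one AcpiGbl_MutexInfo[] entry. */
struct owned_mutex {
    pthread_mutex_t lock;   /* the non-recursive lock itself */
    pthread_mutex_t meta;   /* protects owner and has_owner */
    pthread_t owner;        /* valid only while has_owner is true */
    bool has_owner;
};

#define OWNED_MUTEX_INIT                      \
    { .lock = PTHREAD_MUTEX_INITIALIZER,      \
      .meta = PTHREAD_MUTEX_INITIALIZER,      \
      .has_owner = false }

/* Return true if the calling thread currently holds m->lock. */
static bool owned_by_self(struct owned_mutex *m)
{
    pthread_mutex_lock(&m->meta);
    bool self = m->has_owner && pthread_equal(m->owner, pthread_self());
    pthread_mutex_unlock(&m->meta);
    return self;
}

/* Acquire the lock, refusing recursion instead of corrupting state. */
static enum status owned_mutex_acquire(struct owned_mutex *m)
{
    assert(m != NULL);
    if (owned_by_self(m))
        return ST_ALREADY_ACQUIRED;  /* same check as in the patch */

    pthread_mutex_lock(&m->lock);
    pthread_mutex_lock(&m->meta);
    m->owner = pthread_self();
    m->has_owner = true;
    pthread_mutex_unlock(&m->meta);
    return ST_OK;
}

/* Release the lock; only the owning thread is allowed to do so. */
static enum status owned_mutex_release(struct owned_mutex *m)
{
    assert(m != NULL);
    if (!owned_by_self(m))
        return ST_NOT_ACQUIRED;

    pthread_mutex_lock(&m->meta);
    m->has_owner = false;
    pthread_mutex_unlock(&m->meta);
    pthread_mutex_unlock(&m->lock);
    return ST_OK;
}
```

With this sketch, a second owned_mutex_acquire() from the thread that
already holds the lock returns ST_ALREADY_ACQUIRED and leaves the owner
bookkeeping untouched, so the caller can back out without releasing a
lock it still needs.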

--HPS

On 2019-08-01 15:58, Scott Long wrote:
> I’m 99% sure that the boot breakage is due to this commit:
> 
> Author: jkim
> Date: Tue Jul  9 18:02:36 2019
> New Revision: 349863
> URL: https://svnweb.freebsd.org/changeset/base/349863
> 
> Log:
>   MFV:	r349861
> 
>   Import ACPICA 20190703.
> 
> I have two systems now that are affected, and both of them
> are “fixed” by reverting this.  I don’t know the root cause yet,
> see my email to the svn-src-all mailing list.
> 
Received on Tue Aug 13 2019 - 11:31:53 UTC
