Re: acpi timer reads all ones [Was: efirtc + atrtc at the same time]

From: Justin Hibbits <chmeeedalf_at_gmail.com>
Date: Wed, 27 May 2020 08:41:07 -0500
On Wed, 27 May 2020 06:27:16 -0700
John Baldwin <jhb_at_FreeBSD.org> wrote:

> On 5/27/20 2:39 AM, Andriy Gapon wrote:
> > On 27/05/2020 11:13, Andriy Gapon wrote:  
> >> I added more diagnostics and it seems to support the idea that the
> >> problem is related to I/O cycles and bridges.
> >>
> >> ACPI timer suddenly starts returning 0xffffffff and that lasts for
> >> tens of microseconds before the timer goes back to returning
> >> normal values with an expected increase.
> >> AMD provides a proprietary way to access ACPI registers via MMIO
> >> (0xfed808xx).  That mechanism is unaffected: the ACPI timer
> >> register always returns good values when read that way.
> >>
> >> The problem seems to happen when restoring configuration of a
> >> particular PCI bridge.  What's interesting is that the bridge
> >> decodes one memory range and one I/O range.
> >>
> >> Looking at pci_cfg_restore() I wonder if it is wise to restore
> >> PCIR_COMMAND so early.  Could it be that after the resume the
> >> bridge is configured with a wrong I/O range (e.g., too wide) and
> >> by writing PCIR_COMMAND we enable that decoding?  If so, the bridge
> >> steals I/O cycles destined for ACPI support hardware.  If there is
> >> nothing behind the bridge to handle those ports, then we get those
> >> bad readings.  Once the bridge configuration is fully restored, the
> >> I/O handling goes back to normal.
> > 
> > From what I see, this looks like a BIOS bug.
> > Upon resume, it swaps window configurations of pcib1 and pcib2
> > (until FreeBSD restores them).  pcib1 originally does not have an
> > I/O window, so the BIOS programs both the base and the limit of the
> > pcib2 I/O window to zero.  When FreeBSD then writes pcib2's command
> > register to enable I/O decoding, the bridge starts claiming the
> > 0x0 - 0xFFF I/O port range.  That covers the ACPI ports at 0x8xx.
> > 
> > Some printf-s.
> > From (verbose) boot time:
> > pcib1:   domain            0
> > pcib1:   secondary bus     1
> > pcib1:   subordinate bus   1
> > pcib1:   memory decode     0xfea00000-0xfeafffff
> > pcib2:   domain            0
> > pcib2:   secondary bus     2
> > pcib2:   subordinate bus   2
> > pcib2:   I/O decode        0xf000-0xffff
> > pcib2:   memory decode     0xfe900000-0xfe9fffff
> > 
> > My printf-s from resume time:
> > pcib1: old I/O base (low): 0xf1
> > pcib1: old I/O base (high): 0x0
> > pcib1: old I/O limit (low): 0x1
> > pcib1: old I/O limit (high): 0x0
> > pcib2: old I/O base (low): 0x1
> > pcib2: old I/O base (high): 0x0
> > pcib2: old I/O limit (low): 0x1
> > pcib2: old I/O limit (high): 0x0  
> 
> The "solution" I think is to have resume be multi-pass and to resume
> all the bridges first before trying to resume leaf devices (including
> timers), but that's a fair bit of work.  It might be that we just
> need to resume timer interrupts later after the new-bus resume (I
> think we currently do it before?), though the reason for that was to
> allow resume methods in devices to sleep (I'm not sure if any do).
> 

That sounds like a good fit for https://reviews.freebsd.org/D203 .
Someone (TM) just needs to take it over the finish line... 6 years
later.

- Justin
Received on Wed May 27 2020 - 11:41:12 UTC
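
The resume-time register values quoted above can be decoded by hand to
see why pcib2 ends up claiming the ACPI ports.  What follows is only a
minimal sketch, assuming the standard PCI-to-PCI bridge layout of the
I/O base/limit registers (upper nibble of the low byte = address bits
15:12, lower nibble = addressing type, limit padded with 0xfff).  The
struct and function names are hypothetical and are not taken from the
FreeBSD sources, and port 0x808 is just an assumed example of a port
in the 0x8xx range mentioned in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: a decoded bridge I/O window. */
struct io_window {
	uint32_t base;    /* first port claimed by the bridge */
	uint32_t limit;   /* last port claimed by the bridge */
	bool     enabled; /* a window with base > limit decodes nothing */
};

/*
 * Decode the four raw register values printed in the thread
 * ("I/O base (low/high)", "I/O limit (low/high)").
 */
static struct io_window
decode_io_window(uint8_t base_lo, uint16_t base_hi,
    uint8_t limit_lo, uint16_t limit_hi)
{
	struct io_window w;

	/* Bits 7:4 of the low byte supply address bits 15:12. */
	w.base = (uint32_t)(base_lo & 0xf0) << 8;
	/* The limit is inclusive, so its low 12 bits read as 0xfff. */
	w.limit = ((uint32_t)(limit_lo & 0xf0) << 8) | 0xfff;

	/* Type 1 in the low nibble means 32-bit I/O addressing. */
	if ((base_lo & 0x0f) == 0x01) {
		w.base |= (uint32_t)base_hi << 16;
		w.limit |= (uint32_t)limit_hi << 16;
	}

	w.enabled = (w.base <= w.limit);
	return (w);
}

/* Would the bridge claim this port once I/O decoding is enabled? */
static bool
window_claims_port(const struct io_window *w, uint32_t port)
{
	assert(w != NULL);
	return (w->enabled && port >= w->base && port <= w->limit);
}
```

With the pcib2 values from resume time (base low 0x1, limit low 0x1,
both high halves 0x0) this decodes to 0x0000-0x0fff, which contains
0x808.  The pcib1 values (base low 0xf1, limit low 0x1) decode to a
base of 0xf000 above a limit of 0x0fff, i.e. a disabled window, which
matches the observation that the two bridges' windows look swapped.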
