acpi timer reads all ones [Was: efirtc + atrtc at the same time]

From: Andriy Gapon <avg_at_FreeBSD.org> Date: Tue, 26 May 2020 18:22:13 +0300

On 25/05/2020 11:37, Andriy Gapon wrote:
> Also, there is another issue related to atrtc.
> When I have both drivers attached, and also when I have only atrtc attached
> (efi.rt.disabled=1), system clock jumps 10 minutes forward after each suspend /
> resume cycle (S0 -> S3 -> S0).  That does not happen for reboot and shutdown
> cycles.  I haven't investigated this deeper, but it is a curious problem.

Actually, I was wrong.  The problem can also occur with efirtc alone.
Also, sometimes there is a different problem where there are no callouts for a
period of time on the order of minutes.  I tracked it to cc_lastscan being set
to a value greater than the current uptime.  So any newly scheduled callout is
placed at cc_lastscan, and it takes a while for the uptime to catch up.
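
To make the stall concrete, here is a minimal userland sketch of the behaviour
described above.  It is not the kernel's callout code: the function names and
the millisecond units are hypothetical, and the only assumption taken from the
description is that a callout is never placed earlier than the last scan time.

```c
#include <stdint.h>

/*
 * Hypothetical model, not kernel code.  All times are milliseconds of
 * uptime.  A callout is never placed before the last scan time.
 */
static int64_t
effective_fire_time(int64_t requested, int64_t lastscan)
{
	return (requested < lastscan ? lastscan : requested);
}

/* Extra delay a callout suffers beyond the timeout that was asked for. */
static int64_t
extra_delay(int64_t now, int64_t timeout, int64_t lastscan)
{
	int64_t requested = now + timeout;

	return (effective_fire_time(requested, lastscan) - requested);
}
```

With a sane last scan time (in the past) the extra delay is zero.  If the last
scan time was recorded while the uptime was bogusly some minutes ahead, every
new callout waits until the real uptime reaches that value.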

It seemed that both issues were connected and were a result of the uptime
jumping forward by some minutes and then jumping back to a sane value.
If something important happened during the weird period, like getting time of
day from hardware or invoking a callout, it led to the observed effects.

So, that gave me some ideas where to add debugging checks.
What I determined is that the ACPI timer (ACPI-fast) can return a reading of
all ones, as happens when the hardware does not respond.
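
To illustrate why a single all-ones reading can move the uptime forward by
minutes, here is a minimal userland sketch.  It assumes the usual timecounter
arithmetic (elapsed ticks are the difference between the current and the
previous reading, masked to the counter width) and a 32-bit PM timer; the
helper names and the previous reading are made up for the example.  The
3579545 Hz rate is the standard ACPI PM timer frequency.

```c
#include <assert.h>
#include <stdint.h>

#define PM_TIMER_HZ	3579545u	/* ACPI PM timer frequency */

/* Hypothetical helper: ticks elapsed since 'last', modulo the counter width. */
static uint32_t
counter_delta(uint32_t now, uint32_t last, uint32_t mask)
{
	return ((now - last) & mask);
}

/* Hypothetical helper: whole seconds represented by 'ticks' at rate 'hz'. */
static uint32_t
ticks_to_seconds(uint32_t ticks, uint32_t hz)
{
	assert(hz != 0);
	return (ticks / hz);
}
```

For example, counter_delta(0xffffffffu, 0x10000000u, 0xffffffffu) yields
0xefffffff ticks, which ticks_to_seconds() turns into 1124 seconds at
PM_TIMER_HZ.  With a 32-bit counter the bogus delta can be anything up to
about 20 minutes, depending on the previous reading; with a 24-bit counter it
would stay under 5 seconds.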

I caught one such instance and got a stack trace for it (but no crash dump
because devices had not resumed yet):
tc_windup() at tc_windup+0x318/frame 0xfffffe00a7a19300
tc_ticktock() at tc_ticktock+0x4b/frame 0xfffffe00a7a19320
hardclock() at hardclock+0x107/frame 0xfffffe00a7a19360
handleevents() at handleevents+0xb3/frame 0xfffffe00a7a193a0
timercb() at timercb+0x196/frame 0xfffffe00a7a193f0
lapic_handle_timer() at lapic_handle_timer+0x98/frame 0xfffffe00a7a19420
Xtimerint() at Xtimerint+0xb1/frame 0xfffffe00a7a19420
--- interrupt, rip = 0xffffffff80b34500, rsp = 0xfffffe00a7a194f8, rbp = 0xfffffe00a7a19540 ---
acpi_pcib_write_config() at acpi_pcib_write_config/frame 0xfffffe00a7a19540
pci_cfg_restore() at pci_cfg_restore+0x2cc/frame 0xfffffe00a7a195a0
pci_resume_child() at pci_resume_child+0xee/frame 0xfffffe00a7a195e0
pci_resume() at pci_resume+0x49/frame 0xfffffe00a7a19630
bus_generic_resume_child() at bus_generic_resume_child+0x43/frame 0xfffffe00a7a19650
bus_generic_resume() at bus_generic_resume+0x29/frame 0xfffffe00a7a19680
bus_generic_resume_child() at bus_generic_resume_child+0x43/frame 0xfffffe00a7a196a0
bus_generic_resume() at bus_generic_resume+0x29/frame 0xfffffe00a7a196d0
bus_generic_resume_child() at bus_generic_resume_child+0x43/frame 0xfffffe00a7a196f0
bus_generic_resume() at bus_generic_resume+0x29/frame 0xfffffe00a7a19720
bus_generic_resume_child() at bus_generic_resume_child+0x43/frame 0xfffffe00a7a19740
root_resume() at root_resume+0x29/frame 0xfffffe00a7a19770
acpi_EnterSleepState() at acpi_EnterSleepState+0x73b/frame 0xfffffe00a7a197f0
acpi_AckSleepState() at acpi_AckSleepState+0x144/frame 0xfffffe00a7a19820
devfs_ioctl() at devfs_ioctl+0xcb/frame 0xfffffe00a7a19870
vn_ioctl() at vn_ioctl+0x132/frame 0xfffffe00a7a19980
devfs_ioctl_f() at devfs_ioctl_f+0x1e/frame 0xfffffe00a7a199a0
kern_ioctl() at kern_ioctl+0x27b/frame 0xfffffe00a7a19a00
sys_ioctl() at sys_ioctl+0x123/frame 0xfffffe00a7a19ad0
amd64_syscall() at amd64_syscall+0x140/frame 0xfffffe00a7a19bf0
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe00a7a19bf0

I am not sure whether this is just a coincidence, but it appears as if a write
to some PCI configuration register can temporarily interfere with access to the
PM timer I/O port.
Is that plausible?

I'll try to dig up more data.

-- 
Andriy Gapon