Re: amd64 head -r329465 (non-debug build, but with symbols): "panic: spin lock held too long" during make check-old, reported during a sys_vfork

From: Mateusz Guzik <mjguzik_at_gmail.com>
Date: Tue, 20 Feb 2018 18:06:47 +0100

I missed a consumer; try this:

diff --git a/sys/kern/sys_procdesc.c b/sys/kern/sys_procdesc.c
index 5e8928cb1534..174fffc5c666 100644
--- a/sys/kern/sys_procdesc.c
+++ b/sys/kern/sys_procdesc.c
@@ -398,7 +398,6 @@ procdesc_close(struct file *fp, struct thread *td)
                         * process's reference to the process descriptor when it
                         * calls back into procdesc_reap().
                         */
-                       PROC_SLOCK(p);
                        proc_reap(curthread, p, NULL, 0);
                } else {
                        /*
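
For anyone following along, here is a rough user-space sketch of why the extra PROC_SLOCK() matters. It assumes, going by the backtrace quoted below, that proc_reap() now waits in mtx_spin_wait_unlocked() for the process slock to be released, so a caller that still holds the lock makes that wait impossible to satisfy. Every toy_* name and the spin limit are hypothetical stand-ins, not the kernel's mutex code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a spin mutex; NOT FreeBSD's struct mtx. */
struct toy_spin {
	atomic_bool locked;
};

/* Arbitrary bound standing in for the kernel's "held too long" check. */
#define TOY_SPIN_LIMIT 1000000UL

static void
toy_lock(struct toy_spin *m)
{
	/* Spin until we are the one who flips the flag from false to true. */
	while (atomic_exchange(&m->locked, true))
		;
}

static void
toy_unlock(struct toy_spin *m)
{
	assert(atomic_load(&m->locked));
	atomic_store(&m->locked, false);
}

/*
 * Wait until the lock is observed unlocked, without acquiring it.
 * Returns false if the bound is hit (the toy "held too long" case).
 */
static bool
toy_wait_unlocked(struct toy_spin *m)
{
	for (unsigned long i = 0; i < TOY_SPIN_LIMIT; i++) {
		if (!atomic_load(&m->locked))
			return (true);
	}
	return (false);
}

/* Models a callee that expects to be entered WITHOUT the lock held. */
static bool
toy_reap(struct toy_spin *m)
{
	return (toy_wait_unlocked(m));
}

/* Caller before the patch: takes the lock, then calls the callee. */
static bool
toy_close_old(struct toy_spin *m)
{
	bool ok;

	toy_lock(m);
	ok = toy_reap(m);	/* waits on a lock we hold ourselves */
	toy_unlock(m);
	return (ok);
}

/* Caller after the patch: calls the callee without taking the lock. */
static bool
toy_close_new(struct toy_spin *m)
{
	return (toy_reap(m));
}
```

On a fresh lock, toy_close_old() returns false because the wait can never see the lock released, while toy_close_new() returns true immediately. In the kernel the failed wait is a panic rather than a return value.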


On Tue, Feb 20, 2018 at 5:50 PM, Juan Ramón Molina Menor <listjm_at_club.fr>
wrote:

>> I committed the fix in
>> https://svnweb.freebsd.org/base?view=revision&revision=329542
>>
>> i.e. should be stable from this point on.
>>
>
> Hi!
>
> This may be unrelated, but recent commits have broken my system with a
> similar error. I did not have panics with a system built around December,
> but since updating first to r329555 then today to r329641 I’m getting a
> reproducible panic when logging out from a Lumina desktop session:
>
> Unread portion of the kernel message buffer:
> spin lock 0xfffff8000d440020 (process slock) held by 0xfffff8000daed560 (tid 100111) too long
> panic: spin lock held too long
> cpuid = 1
> time = 1519143505
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe00005c15e0
> vpanic() at vpanic+0x18d/frame 0xfffffe00005c1640
> panic() at panic+0x43/frame 0xfffffe00005c16a0
> _mtx_lock_indefinite_check() at _mtx_lock_indefinite_check+0x71/frame 0xfffffe00005c16b0
> mtx_spin_wait_unlocked() at mtx_spin_wait_unlocked+0x59/frame 0xfffffe00005c16e0
> proc_reap() at proc_reap+0x24/frame 0xfffffe00005c1720
> procdesc_close() at procdesc_close+0x125/frame 0xfffffe00005c1760
> closef() at closef+0x251/frame 0xfffffe00005c17f0
> fdescfree_fds() at fdescfree_fds+0x90/frame 0xfffffe00005c1840
> fdescfree() at fdescfree+0x4df/frame 0xfffffe00005c1900
> exit1() at exit1+0x508/frame 0xfffffe00005c1970
> sys_sys_exit() at sys_sys_exit+0xd/frame 0xfffffe00005c1980
> amd64_syscall() at amd64_syscall+0xa48/frame 0xfffffe00005c1ab0
> fast_syscall_common() at fast_syscall_common+0x101/frame 0x7fffffffea90
> Uptime: 17m45s
> Dumping 327 out of 3990 MB:..5%..15%..25%..35%..44%..54%..64%..74%..84%..93%
>
> Reading symbols from /boot/kernel/linux.ko...done.
> Loaded symbols for /boot/kernel/linux.ko
> Reading symbols from /boot/kernel/linux_common.ko...done.
> Loaded symbols for /boot/kernel/linux_common.ko
> Reading symbols from /boot/kernel/acpi_ibm.ko...done.
> Loaded symbols for /boot/kernel/acpi_ibm.ko
> Reading symbols from /boot/kernel/iwm7260fw.ko...done.
> Loaded symbols for /boot/kernel/iwm7260fw.ko
> Reading symbols from /boot/kernel/coretemp.ko...done.
> Loaded symbols for /boot/kernel/coretemp.ko
> Reading symbols from /boot/kernel/if_iwm.ko...done.
> Loaded symbols for /boot/kernel/if_iwm.ko
> Reading symbols from /boot/kernel/acpi_video.ko...done.
> Loaded symbols for /boot/kernel/acpi_video.ko
> Reading symbols from /boot/kernel/nullfs.ko...done.
> Loaded symbols for /boot/kernel/nullfs.ko
> Reading symbols from /boot/kernel/fdescfs.ko...done.
> Loaded symbols for /boot/kernel/fdescfs.ko
> Reading symbols from /boot/kernel/i915kms.ko...done.
> Loaded symbols for /boot/kernel/i915kms.ko
> Reading symbols from /boot/kernel/drm2.ko...done.
> Loaded symbols for /boot/kernel/drm2.ko
> Reading symbols from /boot/kernel/iicbus.ko...done.
> Loaded symbols for /boot/kernel/iicbus.ko
> Reading symbols from /boot/kernel/iic.ko...done.
> Loaded symbols for /boot/kernel/iic.ko
> Reading symbols from /boot/kernel/iicbb.ko...done.
> Loaded symbols for /boot/kernel/iicbb.ko
> #0  cpustop_handler () at /usr/src/sys/x86/x86/mp_x86.c:1324
> 1324            CPU_SET_ATOMIC(cpu, &stopped_cpus);
> (kgdb) bt
> #0  cpustop_handler () at /usr/src/sys/x86/x86/mp_x86.c:1324
> #1  0xffffffff80e29fb4 in ipi_nmi_handler () at /usr/src/sys/x86/x86/mp_x86.c:1280
> #2  0xffffffff80d09a79 in trap (frame=0xffffffff8158bef0)
>     at /usr/src/sys/amd64/amd64/trap.c:188
> #3  0xffffffff80cec054 in nmi_calltrap () at /usr/src/sys/amd64/amd64/exception.S:633
> #4  0xffffffff80e1aaef in acpi_cpu_idle_mwait (mwait_hint=0) at cpufunc.h:611
> Previous frame inner to this frame (corrupt stack?)
> Current language:  auto; currently minimal
>
> kgdb is over my head, but I can provide more details under some guidance.
>
> Hope it helps,
> Juan
>
>


-- 
Mateusz Guzik <mjguzik gmail.com>
Received on Tue Feb 20 2018 - 16:06:48 UTC
