2009/7/29 John Baldwin <jhb_at_freebsd.org>:
> On Tuesday 28 July 2009 10:43:36 pm Attilio Rao wrote:
>> 2009/5/23 Stefan Bethke <stb_at_lassitu.de>:
>> > I wrote:
>> >
>> >> Syncing disks, vnodes remaining...0 done
>> >> All buffers synced.
>> >> GEOM_MIRROR: Device diesel_root: provider mirror/diesel_root destroyed.
>> >> Uptime: 6m32s
>> >> GEOM_MIRROR: Device diesel_root destroyed.
>> >> Rebooting...
>> >> cpu_reset: Stopping other CPUs
>> >> spin lock 0xffffffff8078c900 (sched lock 1) held by 0xffffff00014d4ab0 (tid 100002) too long
>> >> panic: spin lock held too long
>> >> cpuid = 0
>> >> KDB: enter: panic
>> >> [thread pid 77 tid 100090 ]
>> >> Stopped at kdb_enter+0x3d: movq $0,0x48bbd0(%rip)
>> >> db> bt
>> >> Tracing pid 77 tid 100090 td 0xffffff000457bab0
>> >> kdb_enter() at kdb_enter+0x3d
>> >> panic() at panic+0x17b
>> >> _mtx_lock_spin_failed() at _mtx_lock_spin_failed+0x39
>> >> _mtx_lock_spin() at _mtx_lock_spin+0x9e
>> >> _mtx_lock_spin_flags() at _mtx_lock_spin_flags+0x72
>> >> sched_balance_group() at sched_balance_group+0xc5
>> >> sched_balance_group() at sched_balance_group+0x1f8
>> >> sched_balance() at sched_balance+0xa2
>> >> sched_clock() at sched_clock+0xf6
>> >> statclock() at statclock+0xbd
>> >> lapic_handle_timer() at lapic_handle_timer+0x197
>> >> Xtimerint() at Xtimerint+0x8c
>> >> --- interrupt, rip = 0xffffffff80541cc4, rsp = 0xffffff80771dba90, rbp = 0xffffff80771dbab0 ---
>> >> DELAY() at DELAY+0x64
>> >> cpu_reset() at cpu_reset+0xdd
>> >> boot() at boot+0x2e6
>> >> reboot() at reboot+0x42
>> >> syscall() at syscall+0x1a5
>> >> Xfast_syscall() at Xfast_syscall+0xd0
>> >> --- syscall (55, FreeBSD ELF64, reboot), rip = 0x800788eec, rsp = 0x7fffffffeca8, rbp = 0 ---
>> >
>> >
>> > I've only seen this once. If I should encounter it again, is there
>> > something you'd like me to look at?
>>
>> [ Sorry, trying to add anyone who already reported such a problem even
>> if I know many of you experienced it on -STABLE]
>>
>> Could you try this patch against -CURRENT:
>> http://www.freebsd.org/~attilio/stop_nmi.diff
>>
>> This patch basically does 2 things:
>> 1) Removing the STOP_NMI option, and adding the infrastructure for
>> using NMI on KDB invocation and normal stop IPIs on standard cpu
>> shutdown.
>> In order to accomplish that and foresee a better design than what
>> STOP_NMI does now, 2 new functions are introduced:
>> * ipi_hstop_selected() which does, if the architecture offers such an
>> option, the possibility to send a "forced" IPI through a privileged
>> channel (NMI on amd64 and ia32) in order to stop CPUs passed in the
>> mask. Note that for the other architectures that are not amd64 and
>> ia32 ipi_hstop_selected() is defaulted to ipi_selected(..., STOP_IPI),
>> but if maintainers want to override that they can simply implement
>> something harder
>
> Why not just add a new IPI_STOP_HARD that maps to IPI_STOP on most archs and
> does the NMI logic on x86. This avoids adding a new API
> (ipi_hstop_selected()) instead just adding a new logical IPI.

When choosing between the two, since we already have APIs like
ipi_all_but_self(), I thought we preferred explicit APIs over
logical ones.
Anyway, I can reimplement it that way; it is something I like more as
well. For now I just want to know whether the patch fixes the problem
for users.

Thanks,
Attilio

-- 
Peace can only be achieved by understanding - A. Einstein

Received on Wed Jul 29 2009 - 12:13:24 UTC
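
The two designs discussed in this message (a dedicated ipi_hstop_selected() function versus a new logical IPI_STOP_HARD) can be sketched as a small userland model. This is only an illustration under stated assumptions, not the contents of stop_nmi.diff and not FreeBSD kernel code: struct arch, has_nmi, enum delivery, and the extra arch argument passed to ipi_selected() are hypothetical, added so the mapping can be compiled and checked outside the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel CPU mask type (assumption for this sketch). */
typedef unsigned int cpumask_t;

/* Logical IPIs; IPI_STOP_HARD is the name suggested in the thread. */
enum ipi { IPI_STOP, IPI_STOP_HARD };

/* Hypothetical: how a stop request ends up being delivered. */
enum delivery { DELIVER_IPI, DELIVER_NMI };

/* Hypothetical per-architecture description. */
struct arch {
	const char *name;
	bool has_nmi;	/* per the thread: true on amd64 and ia32 */
};

/*
 * Logical-IPI design: a single entry point. IPI_STOP_HARD is delivered
 * as an NMI where the architecture supports it and degrades to an
 * ordinary stop IPI everywhere else.
 */
enum delivery
ipi_selected(const struct arch *a, cpumask_t cpus, enum ipi ipi)
{
	assert(a != NULL && cpus != 0);
	if (ipi == IPI_STOP_HARD && a->has_nmi)
		return (DELIVER_NMI);
	return (DELIVER_IPI);
}

/*
 * Explicit-API design: a dedicated function. Architectures without a
 * privileged channel fall back to ipi_selected(..., IPI_STOP).
 */
enum delivery
ipi_hstop_selected(const struct arch *a, cpumask_t cpus)
{
	assert(a != NULL && cpus != 0);
	if (a->has_nmi)
		return (DELIVER_NMI);
	return (ipi_selected(a, cpus, IPI_STOP));
}
```

In this model both designs pick the same delivery method on every architecture; the difference is only whether callers see a new function or a new constant passed to the existing one.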