I think I follow the proposal. Sure, I'll apply your patch and run with it on
my SMP box. It may take a while to reach a conclusion on its merits due to the
racy nature of the crash.

On Thursday 03 November 2005 11:27 am, John Baldwin wrote:
> On Sunday 09 October 2005 05:49 pm, Lonnie VanZandt wrote:
> > Attached is the patch for the revised subr_kdb.c from FreeBSD 5.4 STABLE.
> > (the rcsid is __FBSDID("$FreeBSD: src/sys/kern/subr_kdb.c,v 1.5.2.2.2.1
> > 2005/05/01 05:38:14 dwhite Exp $"); )
>
> I've looked at this, but I think it could maybe be done slightly
> differently. Here's a suggested patch that would close the race you are
> seeing I think while allowing semantics such that if two CPUs try to enter
> KDB at the same time, they would serialize and the second CPU would enter
> kdb after the first had exited. Could you at least test it to see if it
> addresses your race condition?
>
> --- //depot/projects/smpng/sys/kern/subr_kdb.c 2005/10/27 19:51:50
> +++ //depot/user/jhb/ktrace/kern/subr_kdb.c 2005/11/03 18:24:38
> @@ -39,6 +39,7 @@
>  #include <sys/smp.h>
>  #include <sys/sysctl.h>
>
> +#include <machine/cpu.h>
>  #include <machine/kdb.h>
>  #include <machine/pcb.h>
>
> @@ -462,12 +463,21 @@
>                 return (0);
>
>         /* We reenter the debugger through kdb_reenter(). */
> -       if (kdb_active)
> +       if (kdb_active == PCPU_GET(cpuid) + 1)
>                 return (0);
>
>         critical_enter();
>
> -       kdb_active++;
> +       /*
> +        * If more than one CPU tries to enter KDB at the same time
> +        * then force them to serialize and go one at a time.
> +        */
> +       while (!atomic_cmpset_int(&kdb_active, 0, PCPU_GET(cpuid) + 1)) {
> +               critical_exit();
> +               while (kdb_active)
> +                       cpu_spinwait();
> +               critical_enter();
> +       }
>
>  #ifdef SMP
>         if ((did_stop_cpus = kdb_stop_cpus) != 0)
> @@ -484,13 +494,17 @@
>
>         handled = kdb_dbbe->dbbe_trap(type, code);
>
> +       /*
> +        * We have to exit KDB before resuming the other CPUs so that they
> +        * may run in a debugger-less context.
> +        */
> +       kdb_active = 0;
> +
>  #ifdef SMP
>         if (did_stop_cpus)
>                 restart_cpus(stopped_cpus);
>  #endif
>
> -       kdb_active--;
> -
>         critical_exit();
>
>         return (handled);

Received on Thu Nov 03 2005 - 18:31:58 UTC
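
For readers who want to experiment with the ownership scheme in the quoted patch outside the kernel, here is a rough userland model. It is only a sketch under several assumptions: C11 atomic_compare_exchange_strong() stands in for atomic_cmpset_int(), sched_yield() stands in for cpu_spinwait(), the critical_enter()/critical_exit() pairing is left out entirely, and the names kdb_active_model, model_enter() and model_exit() are made up for illustration and do not exist in FreeBSD.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical stand-in for kdb_active: 0 = free, cpuid + 1 = owning CPU. */
static atomic_int kdb_active_model;

/*
 * Try to become the owner of the modeled debugger.
 * Returns 0 if this "CPU" already owns it (the re-entry case),
 * or 1 once ownership has been acquired.
 */
static int
model_enter(int cpuid)
{
	int expected;

	/* Same test as the patched re-entry check. */
	if (atomic_load(&kdb_active_model) == cpuid + 1)
		return (0);

	for (;;) {
		/* The compare-exchange overwrites 'expected' on failure. */
		expected = 0;
		if (atomic_compare_exchange_strong(&kdb_active_model,
		    &expected, cpuid + 1))
			return (1);
		/* Another CPU owns it: wait until it is released, then retry. */
		while (atomic_load(&kdb_active_model) != 0)
			sched_yield();
	}
}

/* Release ownership; only the current owner may call this. */
static void
model_exit(int cpuid)
{
	assert(atomic_load(&kdb_active_model) == cpuid + 1);
	atomic_store(&kdb_active_model, 0);
}
```

Storing cpuid + 1 rather than a plain counter is what lets the re-entry test tell "this CPU is already in the debugger" apart from "some other CPU is in the debugger", which the original kdb_active++ could not do.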