On Saturday 21 January 2006 01:05, Thierry Herbelot wrote:
> On Wednesday 4 January 2006 14:38, John Baldwin wrote:
> > On Wednesday 04 January 2006 02:06 am, Thierry Herbelot wrote:
> > [SNIP previous similar panic]
>
> > Next time you get this, can you use 'show threads' to figure out the tid
> > for the thread whose pointer is in the printf (0xc16de480 in this case)
> > and then do a trace of that thread?
>
> Hello,
>
> Here is a more detailed crash session :
>
> is this (zomb) problematic ? (in ps) :
>     8 c182e228    0     1     0 0002204 zomb[INACTIVE] g_mirror gm0s1
>
> I keep the machine in DDB, if there are more detailed commands to
> investigate the panic (the machine is an SMP BP6, runs a GENERIC current
> kernel, and stores its local files in two g_mirror partitions).
>
> The problematic spinlock is held by 0xc16de340 which is cpustop_handler.
>
> 	TfH
>
> PS : printout of the crash :
>
> # reboot
> Waiting (max 60 seconds) for system process `vnlru' to stop...done
> Waiting (max 60 seconds) for system process `bufdaemon' to stop...done
> Waiting (max 60 seconds) for system process `syncer' to stop...
> Syncing disks, vnodes remaining...3 2 2 2 0 0 done
> All buffers synced.
> Uptime: 39m52s
> GEOM_MIRROR: Device files1: provider mirror/files1 destroyed.
> GEOM_MIRROR: Device files1 destroyed.
> GEOM_MIRROR: Device gm0s1: provider mirror/gm0s1 destroyed.
> GEOM_MIRROR: Device gm0s1 destroyed.
> Rebooting...
> cpu_reset: Stopping other CPUs
> spin lock sched lock held by 0xc16de340 for > 5 seconds
> panic: spin lock held too long

Ok, it's not a fatal panic in that your disks should already be clean at this
point, etc.
You can try this hack to see if it fixes it:

Index: vm_machdep.c
===================================================================
RCS file: /usr/cvs/src/sys/i386/i386/vm_machdep.c,v
retrieving revision 1.267
diff -u -r1.267 vm_machdep.c
--- vm_machdep.c	14 Nov 2005 00:43:44 -0000	1.267
+++ vm_machdep.c	23 Jan 2006 20:49:21 -0000
@@ -533,6 +533,7 @@
 		;	/* Wait for other cpu to see that we've started */
 	stop_cpus((1<<cpu_reset_proxyid));
 	printf("cpu_reset_proxy: Stopped CPU %d\n", cpu_reset_proxyid);
+	disable_intr();
 	DELAY(1000000);
 	cpu_reset_real();
 }
@@ -581,6 +582,7 @@
 			/* NOTREACHED */
 		}
 
+		disable_intr();
 		DELAY(1000000);
 	}
 #endif

The better fix is that we really should take CPUs offline more gracefully
during a shutdown (at least during an orderly shutdown).

-- 
John Baldwin <jhb_at_FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve"  =  http://www.FreeBSD.org

Received on Mon Jan 23 2006 - 20:00:03 UTC