Hi.

I'm seeing several snapshot-related crashes in -current, cvsup'd 08/12/2005 at 15:15 GMT+0200. I suspect a ULE scheduler/snapshot interaction.

/var/crash/info.1 reveals

Dump header from device /dev/ad0s1b
  Architecture: i386
  Architecture Version: 33554432
  Dump Length: 528023552B (503 MB)
  Blocksize: 512
  Dumptime: Mon Aug 15 12:32:00 2005
  Hostname: citadel.os.org.za
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 7.0-CURRENT #0: Fri Aug 12 22:44:36 SAST 2005
    khetan_at_citadel.os.org.za:/usr/src/sys/i386/compile/CITADEL5
  Panic String: snapacct_ufs2: bad block
  Dump Parity: 1551260746
  Bounds: 1
  Dump Status: good

Kgdb reveals

[citadel] /var/crash# kgdb -c vmcore.1 /usr/src/sys/i386/compile/CITADEL5/kernel.debug
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:
ÀÍÁ_at_ ÁÄ Á¢ÁÀÍÁ ÁÁ ¢ÁÀÍÁÀ ÁDÁ0¢ÁÀÍÁÁ Á_at_¢Á

#0 doadump () at pcpu.h:165
165 pcpu.h: No such file or directory.
in pcpu.h
(kgdb) backtrace
#1 0xc050212c in boot (howto=260) at ../../../kern/kern_shutdown.c:397
#2 0xc0502481 in panic (fmt=0xc06bb00a "snapacct_ufs2: bad block")
    at ../../../kern/kern_shutdown.c:553
#3 0xc05f9d95 in snapacct_ufs2 (vp=0xc2720880, oldblkp=0xc2673dd0,
    lastblkp=0xc2676000, fs=0xc1a75800, lblkno=12, expungetype=2)
    at ../../../ufs/ffs/ffs_snapshot.c:1338
#4 0xc05f9b3b in indiracct_ufs2 (snapvp=0xc2720880, cancelvp=0xc1ca9990,
    level=0, blkno=Unhandled dwarf expression opcode 0x93
) at ../../../ufs/ffs/ffs_snapshot.c:1253
#5 0xc05f9905 in expunge_ufs2 (snapvp=0xc2720880, cancelip=0xc1c58bdc,
    fs=0xc1a75800, acctfunc=0xc05f9c7c <snapacct_ufs2>, expungetype=2)
    at ../../../ufs/ffs/ffs_snapshot.c:1185
#6 0xc05f7eaa in ffs_snapshot (mp=0xc1c05c00, snapfile=0xc1c58ce4 "`\214ÅÁ")
    at ../../../ufs/ffs/ffs_snapshot.c:605
#7 0xc0605de1 in ffs_mount (mp=0xc1c05c00, td=0xc24bb000)
    at ../../../ufs/ffs/ffs_vfsops.c:302
#8 0xc05556fc in vfs_domount (td=0xc24bb000, fstype=0xc1cb01f0 "ufs",
    fspath=0xc1cb0a00 "/", fsflags=16842752, fsdata=0xc2f23710)
    at ../../../kern/vfs_mount.c:739
#9 0xc0554ee9 in vfs_donmount (td=0xc24bb000, fsflags=16842752,
    fsoptions=0xd7041c04) at ../../../kern/vfs_mount.c:503
#10 0xc0557444 in kernel_mount (ma=0xc2311330, flags=16842752) at pcpu.h:162
#11 0xc0606041 in ffs_cmount (ma=0xc2311330, data=0x0, flags=16842752,
    td=0xc24bb000) at ../../../ufs/ffs/ffs_vfsops.c:384
#12 0xc05550c6 in mount (td=0xc24bb000, uap=0xd7041d04)
    at ../../../kern/vfs_mount.c:566
#13 0xc066f0db in syscall (frame=
      {tf_fs = 59, tf_es = 59, tf_ds = 59, tf_edi = 134523985,
       tf_esi = -1077941244, tf_ebp = -1077943848, tf_isp = -687596188,
       tf_ebx = -1077943792, tf_edx = -1, tf_ecx = -1077940433, tf_eax = 21,
       tf_trapno = 12, tf_err = 2, tf_eip = 671848243, tf_cs = 51,
       tf_eflags = 582, tf_esp = -1077944004, tf_ss = 59})
    at ../../../i386/i386/trap.c:986
#14 0xc065bb0f in Xint0x80_syscall () at
    ../../../i386/i386/exception.s:200
#15 0x0000003b in ?? ()
#16 0x0000003b in ?? ()
#17 0x0000003b in ?? ()
#18 0x0804ac51 in ?? ()
#19 0xbfbfec04 in ?? ()
#20 0xbfbfe1d8 in ?? ()
#21 0xd7041d64 in ?? ()
#22 0xbfbfe210 in ?? ()
#23 0xffffffff in ?? ()
#24 0xbfbfef2f in ?? ()
#25 0x00000015 in ?? ()
#26 0x0000000c in ?? ()
#27 0x00000002 in ?? ()
#28 0x280b9733 in ?? ()
#29 0x00000033 in ?? ()
#30 0x00000246 in ?? ()
#31 0xbfbfe13c in ?? ()
#32 0x0000003b in ?? ()
#33 0x00000000 in ?? ()
#34 0x00000000 in ?? ()
#35 0x00000000 in ?? ()
#36 0x00000000 in ?? ()
#37 0x12471000 in ?? ()
#38 0xc24bb154 in ?? ()
#39 0xc19b27d0 in ?? ()
#40 0xd7041504 in ?? ()
#41 0xd70414e8 in ?? ()
#42 0xc24bb000 in ?? ()
#43 0xc0514827 in sched_switch (td=0xbfbfe210, newtd=0xbfbfec04,
    flags=Cannot access memory at address 0xbfbfe1e8
) at ../../../kern/sched_ule.c:1387
Previous frame inner to this frame (corrupt stack?)

This points to a ULE scheduler issue, right?

My dmesg shows

Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD 7.0-CURRENT #0: Fri Aug 12 22:44:36 SAST 2005
    khetan_at_citadel.os.org.za:/usr/src/sys/i386/compile/CITADEL5
WARNING: debug.mpsafenet forced to 0 as ipsec requires Giant
WARNING: MPSAFE network stack disabled, expect reduced performance.
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: Intel(R) Celeron(R) CPU 2.00GHz (1999.95-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0xf29 Stepping = 9
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x4400<CNTX-ID,<b14>>
real memory = 528416768 (503 MB)
avail memory = 507617280 (484 MB)
ACPI APIC Table: <P4M266 AWRDACPI>
ioapic0 <Version 0.3> irqs 0-23 on motherboard
npx0: [FAST]
npx0: <math processor> on motherboard
npx0: INT 16 interface
acpi0: <P4M266 AWRDACPI> on motherboard
acpi0: Power Button (fixed)
pci_link0: <ACPI PCI Link LNKA> on acpi0
pci_link1: <ACPI PCI Link LNKB> on acpi0
pci_link2: <ACPI PCI Link LNKC> irq 11 on acpi0
pci_link3: <ACPI PCI Link LNKD> on acpi0
pci_link4: <ACPI PCI Link ALKA> irq 0 on acpi0
pci_link5: <ACPI PCI Link ALKB> irq 0 on acpi0
pci_link6: <ACPI PCI Link ALKC> irq 0 on acpi0
pci_link7: <ACPI PCI Link ALKD> irq 0 on acpi0
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
cpu0: <ACPI CPU> on acpi0
acpi_button0: <Power Button> on acpi0
pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
pci0: <ACPI PCI bus> on pcib0
agp0: <VIA 8703 (P4M266x/P4N266) host to PCI bridge> mem 0xeb000000-0xeb7fffff at device 0.0 on pci0
pcib1: <PCI-PCI bridge> at device 1.0 on pci0
pci1: <PCI bus> on pcib1
pci1: <display, VGA> at device 0.0 (no driver attached)
fxp0: <Intel 82550 Pro/100 Ethernet> port 0xd000-0xd03f mem 0xeb820000-0xeb820fff,0xeb800000-0xeb81ffff irq 18 at device 8.0 on pci0
miibus0: <MII bus> on fxp0
inphy0: <i82555 10/100 media interface> on miibus0
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
fxp0: Ethernet address: 00:02:b3:ed:ec:a2
fxp0: [GIANT-LOCKED]
isab0: <PCI-ISA bridge> at device 17.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <VIA 8235 UDMA133 controller> port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xe000-0xe00f
    at device 17.1 on pci0
ata0: <ATA channel 0> on atapci0
ata1: <ATA channel 1> on atapci0
acpi_tz0: <Thermal Zone> on acpi0
fdc0: <floppy drive controller> port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FAST]
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A
ppc0: <Standard parallel printer port> port 0x378-0x37f irq 7 on acpi0
ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode
ppbus0: <Parallel port bus> on ppc0
plip0: <PLIP network interface> on ppbus0
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
ppi0: <Parallel I/O> on ppbus0
atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
pmtimer0 on isa0
orm0: <ISA Option ROM> at iomem 0xcc000-0xcd7ff pnpid ORM0000 on isa0
sc0: <System console> at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
sio1: configured irq 3 not in bitmap of probed irqs 0
sio1: port may not be enabled
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
Timecounter "TSC" frequency 1999954984 Hz quality 800
Timecounters tick every 1.000 msec
IPsec: Initialized Security Association Processing.
ipfw2 (+ipv6) initialized, divert loadable, rule-based forwarding disabled, default to deny, logging unlimited
ad0: 39266MB <HDS722540VLAT20 V31OA6EA> at ata0-master UDMA100
ad2: 39266MB <HDS722540VLAT20 V31OA6EA> at ata1-master UDMA100
Trying to mount root from ufs:/dev/ad0s1a
fxp0: Microcode loaded, int_delay: 1000 usec bundle_max: 6
fxp0: Microcode loaded, int_delay: 1000 usec bundle_max: 6
fxp0: Microcode loaded, int_delay: 1000 usec bundle_max: 6
fxp0: Microcode loaded, int_delay: 1000 usec bundle_max: 6
fxp0: Microcode loaded, int_delay: 1000 usec bundle_max: 6
fxp0: Microcode loaded, int_delay: 1000 usec bundle_max: 6
Accounting enabled

I'd appreciate any pointers!

Thanks.

PS: The problem is that the machine is hosted in a remote data centre, so someone has to intervene manually to re-fsck it every time this crash occurs. For now, I've disabled snapshots and set fsck_y_enable="YES" and background_fsck="NO" in /etc/rc.conf, in the vain hope that if the machine barfs, it'll pick itself up again. That is logical, yes?

Khetan Gajjar
--
Services | +27 11 575 3832
Internet Solutions | http://www.is.co.za/

Received on Mon Aug 15 2005 - 13:09:48 UTC