Panic String: ffs_alloccg: map corrupted [/dev/gpt/tmp]

From: O. Hartmann <ohartman_at_zedat.fu-berlin.de>
Date: Wed, 11 Jun 2014 19:55:04 +0200
A machine running FreeBSD

Version String: FreeBSD 11.0-CURRENT #3 r267294: Mon Jun  9 22:07:15 CEST 2014 amd64

crashes without a panic message, and /var/crash/info.0 contains this message:

Dump header from device /dev/gpt/swap
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 968962048B (924 MB)
  Blocksize: 512
  Dumptime: Wed Jun 11 19:19:19 2014
  Hostname: thor.sb211.zbv
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 11.0-CURRENT #3 r267294: Mon Jun  9 22:07:15 CEST 2014
    root@thor.sb211.zbv:/usr/obj/usr/src/sys/THOR
  Panic String: ffs_alloccg: map corrupted
  Dump Parity: 3034136388
  Bounds: 0
  Dump Status: good
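
For anyone who wants to pull the panic string out of several such info files automatically, here is a minimal Python sketch. It assumes the layout shown above (a "Dump header from device" line, two-space indented "Key: value" lines, and deeper-indented continuation lines); the function name parse_dump_info and the shortened SAMPLE text are made up for illustration and are not part of savecore.

```python
# Shortened copy of the info.0 text above, used only as sample input.
SAMPLE = """\
Dump header from device /dev/gpt/swap
  Architecture: amd64
  Dump Length: 968962048B (924 MB)
  Version String: FreeBSD 11.0-CURRENT #3 r267294: Mon Jun  9 22:07:15 CEST 2014
    root@thor.sb211.zbv:/usr/obj/usr/src/sys/THOR
  Panic String: ffs_alloccg: map corrupted
  Dump Status: good
"""


def parse_dump_info(text):
    """Return the fields of an info.N style dump header as a dict.

    Hypothetical helper: it only assumes the line layout described above.
    """
    fields = {}
    last_key = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith("Dump header from device"):
            # Tolerate both "device /dev/..." and "device: /dev/...".
            rest = line[len("Dump header from device"):]
            fields["Device"] = rest.strip(" :")
            last_key = None
        elif line.startswith("    ") and last_key is not None:
            # A deeper-indented line continues the previous value.
            fields[last_key] += " " + line.strip()
        else:
            # Split on the first ": " only, so values may contain colons.
            key, sep, value = line.strip().partition(": ")
            if sep:
                fields[key] = value
                last_key = key
    return fields


info = parse_dump_info(SAMPLE)
print(info["Panic String"])
```

Splitting on the first ": " only is what keeps "ffs_alloccg: map corrupted" intact as one value.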

I'm very confused about the panic string, since it seems to tell me something is bad with
FFS/UFS.

More disturbing is the fact that the boot process into multi-user mode stops with a
complaint about an unclean /dev/gpt/tmp filesystem (mounted on /tmp): the OS stops at the
Password: prompt for single-user mode maintenance.

I cannot understand why the system stops and complains about a broken /tmp
filesystem. I regard /tmp in particular as fully corrupt after a fault anyway, and I'd
like to ask whether there is a way to override this check and force a repair and mount,
even if the data contained in /tmp is corrupt after the forced cleaning.

With a tmpfs-backed /tmp there shouldn't be any stop/fault of that kind, so it would be
consistent to have the same behaviour for a hard-drive-backed /tmp, or am I wrong?

This is not the first time that I have seen this kind of crash under heavy load. The box
is an 8 GB system with the following CPU specs:

FreeBSD clang version 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
CPU: Intel(R) Core(TM)2 Duo CPU     E8400  @ 3.00GHz (2999.72-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x10676  Family=0x6  Model=0x17  Stepping=6
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x8e3fd<SSE3,DTES64,MON,DS_CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1>
  AMD Features=0x20100800<SYSCALL,NX,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant, performance statistics
real memory  = 8589934592 (8192 MB)
avail memory = 8278880256 (7895 MB)
Event timer "LAPIC" quality 400
ACPI APIC Table: <A_M_I_ OEMAPIC >
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
FreeBSD/SMP: 1 package(s) x 2 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
[...]

The not-so-funny part is that I see those crashes under heavy load very frequently on ALL
C2D systems (one E8400 as shown, another has a Q4400 CPU, but also 8 GB RAM, same
motherboard). In all cases of a sudden crash, /tmp gets corrupted and the system refuses
to boot into multi-user mode, complaining about the broken /tmp filesystem, which cannot
be repaired automatically.

Apart from this specific question about an unclean /tmp, this kind of crash under heavy
load on a specific hardware architecture with the most recent CURRENT is puzzling (it has
occurred several times within the past 8 weeks, each time with the same stupid blocking at
the broken /tmp partition). I also checked the hardware with tools like memtest86 to make
sure there is no faulty memory, but I cannot exclude some kind of CPU overheating, since I
noticed that with CLANG and -O3 (which is supposed to optimise for vector units if
available, if I'm right) the average CPU temperature increases by about 3 - 5 degrees
Celsius. This is more obvious on a Dell Latitude E6510 with a first-generation Sandy Bridge
mobile CPU and FreeBSD 9.2/9.3: compiling the OS with gcc 4.2 (the base compiler on that
system), the temperature is 2 - 4 degrees lower than when using CLANG 3.4.1 with -O3
enabled (reading the ACPI-reported temperature via "sysctl -a | grep tempe"). This is
funny, isn't it?
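
To compare such readings between builds more systematically than by eyeballing grep output, something like the following Python sketch could be used. It assumes temperature lines of the form "name.temperature: 48.1C", as printed for dev.cpu.N.temperature (with coretemp loaded) and hw.acpi.thermal.tz0.temperature; the values in SAMPLE_SYSCTL are invented placeholders, not measurements from the machines above, and the function name temperatures is hypothetical.

```python
import re

# Invented example input; real text would come from "sysctl -a".
SAMPLE_SYSCTL = """\
dev.cpu.0.temperature: 52.0C
dev.cpu.1.temperature: 50.0C
hw.acpi.thermal.tz0.temperature: 48.1C
kern.ostype: FreeBSD
"""

# Assumed line format: "<oid ending in temperature>: <number>C".
TEMP_LINE = re.compile(r"^(\S*temperature): (-?\d+(?:\.\d+)?)C$")


def temperatures(sysctl_text):
    """Map each temperature sysctl name to its reading in degrees Celsius."""
    readings = {}
    for line in sysctl_text.splitlines():
        match = TEMP_LINE.match(line.strip())
        if match:
            readings[match.group(1)] = float(match.group(2))
    return readings


temps = temperatures(SAMPLE_SYSCTL)
for name, value in sorted(temps.items()):
    print(name, value)
```

Logging the result once per minute during a gcc build and during a CLANG -O3 build would give two series that can be compared directly.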

Regards and thanks in advance, 
Oliver 

Received on Wed Jun 11 2014 - 15:55:18 UTC
