Re: reproduced: ffs_blkfree: freeing free block

From: Don Lewis <truckman_at_FreeBSD.org>
Date: Sun, 16 Jan 2005 01:27:36 -0800 (PST)
On 15 Jan, Don Lewis wrote:
> On 15 Jan, Poul-Henning Kamp wrote:
>> 
>> Quite by accident my test-machine here can now reliably reproduce
>> the dreaded "panic: ffs_blkfree: freeing free block" in a few minutes
>> of time.
>> 
>> It is very interesting that the location of the actual error is a
>> very narrow stripe of the filesystem:
>> 
>> 	dev = ad8, block = 13456368, fs = /hex
>> 	dev = ad8, block = 13455888, fs = /hex
>> 	dev = ad8, block = 13454688, fs = /hex
>> 	dev = ad8, block = 13455040, fs = /hex
>> 	dev = ad8, block = 13455200, fs = /hex
>> 	dev = ad8, block = 13455880, fs = /hex
>> 
>> The application I'm running at the time adds/modifies records in a
>> db(3) hash file and nothing much besides.
>> 
>> Before I start implementing complete I/O traces and spend days
>> groveling over UFS/FFS on-disk bits, are there anybody who has
>> suggestions for things I should try to enable/disable to narrow
>> this down ?
> 
> I'm reasonably certain that the on-disk bits are ok.  I'm seeing it in
> the case where a freshly written file is being re-written.  If I unmount
> the file system after the file is initially created, the file system
> fsck's clean, and if I then remount the file system, I am unable to
> reproduce the problem.  Fsdb lists the block causing the panic as one
> that was initially allocated to the file.  It looks like the in-core
> block bitmap is getting corrupted.
> 
> In my openoffice build example, the following final step is sufficient
> to trigger the panic:
>     jot -b x 174113 > \
>       /mnt/usr/ports/editors/openoffice-1.1/work/config_office/configure
> Merely truncating the file with "cat /dev/null" does not appear to be
> sufficient to trigger the panic.

If I run the openoffice-1.1 make targets from extract through
patch-autotools, then truncate and rewrite teh configure script, I can
reproduce the panic:

fsync /mnt/usr/ports/editors/openoffice-1.1/work/config_office/configure
ls -li /mnt/usr/ports/editors/openoffice-1.1/work/config_office/configure
1274809 -rwxr-xr-x  1 root  wheel  348277 Aug 27 03:03 /mnt/usr/ports/editors/openoffice-1.1/work/config_office/configure
fsdb -r /dev/da0s2a
** /dev/da0s2a (NO WRITE)
Examining file system `/dev/da0s2a'
Last Mounted on /mnt
current inode: directory
I=2 MODE=40755 SIZE=512
        MTIME=Feb 14 11:55:43 2004 [0 nsec]
        CTIME=Feb 14 11:55:43 2004 [0 nsec]
        ATIME=Jan 14 18:24:05 2005 [0 nsec]
OWNER=root GRP=wheel LINKCNT=7 FLAGS=0 BLKCNT=4 GEN=53557245
fsdb (inum: 2)> inode 1274809
current inode: regular file
I=1274809 MODE=100755 SIZE=348277
        MTIME=Aug 27 03:03:28 2004 [0 nsec]
        CTIME=Jan 16 00:47:00 2005 [0 nsec]
        ATIME=Dec 31 16:00:00 1969 [0 nsec]
OWNER=root GRP=wheel LINKCNT=1 FLAGS=0 BLKCNT=2e0 GEN=791e742c
fsdb (inum: 1274809)> blocks
Blocks for inode 1274809:
Direct blocks:
5089312, 5089320, 5089328, 5089336, 5089344, 5089352, 5089360, 5089368, 5089424, 5089432, 5089440, 5089448
Indirect blocks:
5089456, 5089464, 5089472, 5089480, 5089488, 5089496, 5089504, 5089512, 
5089520, 5089528, 
fsdb (inum: 1274809)> q
cat /dev/null > /mnt/usr/ports/editors/openoffice-1.1/work/config_office/configure
sleep 120
jot -b x 174113 > /mnt/usr/ports/editors/openoffice-1.1/work/config_office/configure
sleep 120

The system panics during this final sleep:

dev = da0s2a, block = 5089472, fs = /mnt
panic: ffs_blkfree: freeing free block
cpuid = 0
KDB: enter: panic

The block number in question is one that was initially allocated to the
configure file.  Because of the sleep command, I would think that the
free block bitmap should have been updated to mark this block free
sometime between when the configure file was truncated by "cat
/dev/null" and when the configure file was re-written by jot, so it
looks like block 5089472 is being freed twice.  Without the final jot
command to rewrite the file, I can wait an arbitrarily long time and the
system will not panic.


GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".
doadump () at pcpu.h:159
(kgdb) where
#0  doadump () at pcpu.h:159
#1  0xc04661e6 in db_fncall (dummy1=-1065580187, dummy2=0, dummy3=-455755064, 
    dummy4=0xe4d5ba60 "\224؛صن") at /usr/src/sys/ddb/db_command.c:531
#2  0xc046657c in db_command_loop () at /usr/src/sys/ddb/db_command.c:349
#3  0xc0467fa8 in db_trap (type=3, code=0) at /usr/src/sys/ddb/db_main.c:221
#4  0xc063baee in kdb_trap (type=3, code=0, tf=0xe4d5bb88)
    at /usr/src/sys/kern/subr_kdb.c:421
#5  0xc07dba6c in trap (frame=
      {tf_fs = 24, tf_es = 16, tf_ds = 16, tf_edi = 0, tf_esi = 1, tf_ebp = -455754808, tf_isp = -455754828, tf_ebx = -455754764, tf_edx = -1065055680, tf_ecx = -1064242016, tf_eax = -1065046973, tf_trapno = 3, tf_err = 0, tf_eip = -1067206872, tf_cs = 8, tf_eflags = 646, tf_esp = -455754776, tf_ss = -1067304789})
    at /usr/src/sys/i386/i386/trap.c:573
#6  0xc07ca3da in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7  0x00000018 in ?? ()
#8  0x00000010 in ?? ()
#9  0x00000010 in ?? ()
#10 0x00000000 in ?? ()
#11 0x00000001 in ?? ()
#12 0xe4d5bbc8 in ?? ()
#13 0xe4d5bbb4 in ?? ()
#14 0xe4d5bbf4 in ?? ()
#15 0xc0848a40 in ?? ()
#16 0xc090f4a0 in shutdown_howto ()
#17 0xc084ac43 in ?? ()
#18 0x00000003 in ?? ()
#19 0x00000000 in ?? ()
#20 0xc063b728 in kdb_enter (msg=0x0) at cpufunc.h:56
#21 0xc06238ab in panic (fmt=0x0) at /usr/src/sys/kern/kern_shutdown.c:538
#22 0xc0753f58 in ffs_blkfree (fs=0xc2861800, devvp=0xc298e114, bno=5089472, 
    size=16384, inum=1274809) at /usr/src/sys/ufs/ffs/ffs_alloc.c:1791
#23 0xc0762dae in handle_workitem_freefrag (freefrag=0xc63ae540)
    at /usr/src/sys/ufs/ffs/ffs_softdep.c:1588
#24 0xc0765a38 in process_worklist_item (matchmnt=0x0, flags=0)
    at /usr/src/sys/ufs/ffs/ffs_softdep.c:768
#25 0xc076a3aa in softdep_process_worklist (matchmnt=0x0)
    at /usr/src/sys/ufs/ffs/ffs_softdep.c:623
#26 0xc0677f12 in sched_sync () at /usr/src/sys/kern/vfs_subr.c:1595
#27 0xc0610ba2 in fork_exit (callout=0xc0677b60 <sched_sync>, arg=0x0, 
    frame=0xe4d5bd48) at /usr/src/sys/kern/kern_fork.c:787
#28 0xc07ca43c in fork_trampoline () at /usr/src/sys/i386/i386/exception.s:208


Note that the inum parameter to ffs_blkfree() matches the inode number
of the configure file.  I would not expect softupdates to be freeing
blocks for this inode at the time when the machine paniced.


At the time of the panic, there is one more item on the worklist,
another struct freefrag that points to block #5089464.

(kgdb) frame 25
#25 0xc076a3aa in softdep_process_worklist (matchmnt=0x0)
    at /usr/src/sys/ufs/ffs/ffs_softdep.c:623
623                     if ((cnt = process_worklist_item(matchmnt, 0)) == -1)
(kgdb) print num_on_worklist
$2 = 2
(kgdb) frame 24 
#24 0xc0765a38 in process_worklist_item (matchmnt=0x0, flags=0)
    at /usr/src/sys/ufs/ffs/ffs_softdep.c:768
768                     handle_workitem_freefrag(WK_FREEFRAG(wk));
print *(struct freefrag *)wk->wk_list->le_next
$8 = {ff_list = {wk_list = {le_next = 0xc57cd9e0, le_prev = 0xc096a018}, 
    wk_type = 7, wk_state = 32768}, ff_mnt = 0xc26a9800, ff_blkno = 5089464, 
  ff_fragsize = 16384, ff_inum = 1274809}
Received on Sun Jan 16 2005 - 08:27:44 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:38:26 UTC