Re: panic: Negative bio_offset (-15050100712783872) on bio 0xc7725d50

From: Bruce Evans <bde_at_zeta.org.au> Date: Tue, 16 Sep 2003 10:42:38 +1000 (EST)

On Mon, 15 Sep 2003, Kris Kennaway wrote:

> bad block 8239054478774324592, ino 3229486
> bad block 7021770428354685254, ino 3229486
> panic: Negative bio_offset (-15050100712783872) on bio 0xc7725d50
> Debugger("panic")
> Stopped at      Debugger+0x54:  xchgl   %ebx,in_Debugger.0
> db> trace
> Debugger(c043aa25,c04ac1c0,c0435bc2,cd1d1980,100) at Debugger+0x54
> panic(c0435bc2,5d2e4000,ffca8803,c7725d50,c0440989) at panic+0xd5
> g_dev_strategy(c7725d50,0,c29b42e4,1,c0439d9c) at g_dev_strategy+0xa7
> spec_xstrategy(c29c9920,c7725d50,0,c29b42e4,0) at spec_xstrategy+0x3ef
> spec_specstrategy(cd1d1a60,cd1d1a84,c02b2982,cd1d1a60,0) at spec_specstrategy+0x72
> spec_vnoperate(cd1d1a60,0,ffffe544,4000,0) at spec_vnoperate+0x18
> breadn(c29c9920,1ae9720,ffffe544,4000,0) at breadn+0x122
> bread(c29c9920,1ae9720,ffffe544,4000,0) at bread+0x4c
                 ^^^^^^^^^^^^^^^^
> ffs_blkfree(c29d4800,c29c9920,6c6c69,6f747561,4000) at ffs_blkfree+0x286
> indir_trunc(c4abe200,313a0e0,0,0,c) at indir_trunc+0x334
> handle_workitem_freeblocks(c4abe200,0,2,c04af748,c26acc00) at handle_workitem_freeblocks+0x21e
> process_worklist_item(0,0,3f65f661,0,c32961e0) at process_worklist_item+0x1fd
> softdep_process_worklist(0,0,c044228c,6f4,0) at softdep_process_worklist+0xe0
> sched_sync(0,cd1d1d48,c04383a9,312,20) at sched_sync+0x304
> fork_exit(c02c4e30,0,cd1d1d48) at fork_exit+0xcf
> fork_trampoline() at fork_trampoline+0x1a
> --- trap 0x1, eip = 0, esp = 0xcd1d1d7c, ebp = 0 ---
> db>
>
> Is this disk corruption, or a bug?

This is either disk corruption or an ffs bug.  ffs passes the garbage
block number 0xffffe54401ae9720 to bread.  GEOM then handles this austerely
by panicking.  Garbage block numbers, including negative ones, can possibly
be created by applications seeking to preposterous offsets, so they should
not be handled with panics.
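
A minimal sketch of the arithmetic behind that block number, in Python.
It assumes an i386 stack on which ddb shows each 64-bit argument as two
32-bit words, low word first, and DEV_BSIZE = 512.  The helper names are
made up for illustration.  Joining the two words marked in the bread()
frame and scaling by DEV_BSIZE gives the offset in the panic message.

```python
DEV_BSIZE = 512  # assumed sector size in bytes

def to_signed64(x):
    """Wrap an integer to a signed 64-bit value."""
    x &= (1 << 64) - 1
    return x - (1 << 64) if x >> 63 else x

def join_words(low, high):
    """Combine two 32-bit stack words (low word first) into a signed 64-bit value."""
    return to_signed64((high << 32) | low)

# Second and third arguments of bread() in the trace: 1ae9720, ffffe544
blkno = join_words(0x01AE9720, 0xFFFFE544)

# Byte offset that would be handed down to GEOM
offset = to_signed64(blkno * DEV_BSIZE)

print(hex(blkno & ((1 << 64) - 1)))  # 0xffffe54401ae9720
print(blkno)                         # -29394727954656
print(offset)                        # -15050100712783872
```

Under the same assumption, the second and third arguments of panic()
(5d2e4000, ffca8803) join to the same offset.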

The following script (with edits to turn off avoiding the bugs) demonstrated
related bugs the last time I tried it (about 6 months ago).

%%%
#!/bin/sh

SOMEFILE=/c/tmp/zz

# Bugs:
# (1) md silently truncates sizes (in DEV_BSIZE'ed units) mod 2^32.
# (2) at least pre-GEOM versions of md get confused by this and cause a
# null pointer panic in devstat.
#
# Use the maximum size that works (2^32 - 1).  Unfortunately, this prevents
# testing of file systems with size 2 TB or larger.
dd if=/dev/zero of=$SOMEFILE oseek=0xFFFFFFFE count=1

mdconfig -a -t vnode -f ${SOMEFILE} -u 0

# The large values here are more to make newfs not take too long than to
# get a large maxfilesize.
newfs -O 1 -b 65536 -f 65536 -i 6533600 /dev/md0

# Note that this reports a very large maxfilesize (2282 TB).  This is the
# size associated with the triple indirect block limit, not the correct
# one.  I think the actual limit should be 64 TB (less epsilon).
dumpfs /dev/md0 | grep maxfilesize

mount /dev/md0 /mnt

# Bugs:
# (1) fsbtodb(nb) overflows when nb has type ufs1_daddr_t and the result
# should be larger than (2^31 - 1).
# (2) dscheck() used to detect garbage block numbers caused by (1) (if the
# garbage happened to be negative or too large).  Then it reported the error
# and invalidated the buffer.  GEOM doesn't detect any error.  It apparently
# passes on the garbage, so the error is eventually detected by ffs (since
# md0 is on an ffs vnode) (if the garbage is preposterous).  ffs_balloc_ufs1()
# eventually sees the error as an EFBIG returned by bwrite() and gives up.
# But the buffer stays in the buffer cache to cause problems later.
# (3) The result of bwrite() is sometimes ignored.
#
# Chop a couple of F's off the seek so that we don't get an EFBIG error.
# Unfortunately, this breaks testing for files of size near 2282 TB.
dd if=/dev/zero of=/mnt/zz oseek=0xFFFFFE count=1

ls -l /mnt/zz

# Bugs:
# (1) umount(2) returns the undocumented errno EFBIG for the unwriteable
# buffer.
# (2) umount -f and unmount at reboot fail too (the latter leaving all file
# systems dirty).
#
# Removing the file clears the problem.
rm /mnt/zz
umount /mnt

# Since we couldn't demonstrate files larger than 2 TB on md0, demonstrate
# one near ${SOMEFILE}.
dumpfs /c | egrep '(^bsize|^fsize|maxfilesize)'
dd if=/dev/zero of="$SOMEFILE"-bigger oseek=0x3FFFFFFFE count=1
ls -l "$SOMEFILE"-bigger
rm "$SOMEFILE"-bigger

mdconfig -d -u 0
rm $SOMEFILE
%%%
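
The two 32-bit arithmetic problems named in the script comments (md
sizes taken mod 2^32, and the fsbtodb() overflow) can be sketched in
Python.  This is only an illustration under stated assumptions:
DEV_BSIZE = 512, fsbtodb() is a plain left shift by
log2(fsize / DEV_BSIZE) (7 for the 64K fragments used above), and a
32-bit result wraps the way it does in practice on i386.  The function
names are hypothetical, not kernel interfaces.

```python
DEV_BSIZE = 512  # assumed sector size in bytes

def to_int32(x):
    """Wrap an integer to a signed 32-bit value."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x

def md_sectors_truncated(nbytes):
    """Size in DEV_BSIZE units, keeping only the low 32 bits."""
    return (nbytes // DEV_BSIZE) & 0xFFFFFFFF

def fsbtodb_32(frag, shift):
    """Shift carried out in signed 32-bit arithmetic (ufs1_daddr_t operand)."""
    return to_int32(frag << shift)

def fsbtodb_64(frag, shift):
    """The same shift carried out in a wide type, so nothing wraps."""
    return frag << shift

# 65536-byte fragments over 512-byte sectors: shift by 7
shift = (65536 // DEV_BSIZE).bit_length() - 1

# A backing file of exactly 2^32 sectors looks like size 0 after truncation,
# while 2^32 - 1 sectors (the size used in the script) survives intact.
print(md_sectors_truncated(2**32 * DEV_BSIZE))
print(md_sectors_truncated((2**32 - 1) * DEV_BSIZE))

# The first fragment number whose sector address exceeds 2^31 - 1
print(fsbtodb_32(1 << 24, shift), fsbtodb_64(1 << 24, shift))
```

With these assumptions, fragment numbers from 2^24 upward come out
negative in the 32-bit version, which is the kind of garbage that
dscheck() used to catch.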

Bruce