SUJ file system corruption.

From: Tim Kientzle <tim_at_kientzle.com>
Date: Sun, 13 May 2012 15:35:58 -0700
FYI:  Saw a crash due to filesystem corruption when running SUJ.

This is on an ARM AM335x system (BeagleBone) that is
still pretty experimental, so I certainly cannot rule out other
problems, but in case it means something to
someone, here's the scenario:

I reset the board to reboot it (which is routine for these
small embedded boards). When it came back up,
it went through SUJ recovery, and a little later
the kernel panicked with this stack trace:

rm: /var/run/dmesg.boot: Bad file descriptor
panic: ffs_write: type 0xc1e86660 0 (0,1024)
KDB: enter: panic
[ thread pid 492 tid 100044 ]
Stopped at      $d:     ldrb    r15, [r15, r15, ror r15]!
db> bt
Tracing pid 492 tid 100044 td 0xc1dbc5c0
kdb_enter() at kdb_enter+0xc
scp=0xc0321c08 rlv=0xc02f0024 (panic+0xe8)
        rsp=0xcb3e3ba8 rfp=0xcb3e3bbc
        r4=0x00000100
panic() at panic+0x10
scp=0xc02eff4c rlv=0xc043b2f4 (ffs_write+0x114)
        rsp=0xcb3e3bd0 rfp=0xcb3e3c48
ffs_write() at ffs_write+0xc
scp=0xc043b1ec rlv=0xc049d55c (VOP_WRITE_APV+0x128)
        rsp=0xcb3e3c4c rfp=0xcb3e3cf0
        r10=0x00020001 r9=0x00000000
        r8=0x00000000 r7=0x00000000 r6=0x00000000 r5=0xcb3e3cfc
        r4=0xc055a78c
VOP_WRITE_APV() at VOP_WRITE_APV+0xc
scp=0xc049d440 rlv=0xc0390ca4 (vn_write+0x28c)
        rsp=0xcb3e3cf4 rfp=0xcb3e3d3c
        r7=0xcb3e3db4 r6=0xc1dc09a0
        r5=0xc1e86660 r4=0x00000000
vn_write() at vn_write+0xc
scp=0xc0390a24 rlv=0xc0339c88 (dofilewrite+0x98)
        rsp=0xcb3e3d40 rfp=0xcb3e3d70
        r10=0x00000000 r9=0x00000400
        r8=0xc1dc09a0 r7=0xc1dbc5c0 r6=0x00000001 r5=0xcb3e3db4
        r4=0xffffffff
dofilewrite() at dofilewrite+0xc
scp=0xc0339bfc rlv=0xc033b508 (kern_writev+0x60)
        rsp=0xcb3e3d74 rfp=0xcb3e3da8
        r10=0x00000000 r9=0xbfffecec
        r8=0xc1dbc5c0 r7=0xcb3e3db4 r6=0x00000001 r5=0x00000000
        r4=0x00000000
kern_writev() at kern_writev+0xc
scp=0xc033b4b4 rlv=0xc033b620 (sys_write+0x58)
        rsp=0xcb3e3dac rfp=0xcb3e3de0
        r8=0x00000000 r7=0xc1d9a000
        r6=0xc1dbc5c0 r5=0xcb3e3eac r4=0x2047c400
sys_write() at sys_write+0xc
scp=0xc033b5d4 rlv=0xc048934c (swi_handler+0x2d0)
        rsp=0xcb3e3de4 rfp=0xcb3e3ea8
swi_handler() at swi_handler+0xc
scp=0xc0489088 rlv=0xc047c440 (swi_entry+0x28)
        rsp=0xcb3e3eac rfp=0xbfffea5c
        r10=0x2017be50 r8=0x2041c000
        r7=0x0000002d r6=0x00000400 r5=0x2017cc18 r4=0x2047c400
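
For anyone who wants to pull the numbers out of the panic line, here is a minimal sketch. It assumes the message has the layout "ffs_write: type <pointer> <type> (<offset>,<resid>)"; that layout and the field names (ptr, vtype, offset, resid) are a reading of the message, not something confirmed in this report.

```python
import re

# Assumed layout (not confirmed by the report):
#   panic: <func>: type <pointer> <type> (<offset>,<resid>)
PANIC_RE = re.compile(
    r"panic: (?P<func>\w+): type (?P<ptr>0x[0-9a-fA-F]+) (?P<vtype>\d+) "
    r"\((?P<offset>\d+),(?P<resid>\d+)\)"
)

def parse_panic(line):
    """Split a panic line of the assumed layout into named fields."""
    m = PANIC_RE.search(line)
    if m is None:
        return None  # line does not follow the assumed layout
    return {
        "func": m.group("func"),
        "ptr": int(m.group("ptr"), 16),   # hypothetical name: pointer argument
        "vtype": int(m.group("vtype")),   # hypothetical name: type field
        "offset": int(m.group("offset")),
        "resid": int(m.group("resid")),
    }

fields = parse_panic("panic: ffs_write: type 0xc1e86660 0 (0,1024)")
print(fields)
```

The pointer value 0xc1e86660 also appears as r5 in the trace above, and 1024 is the 0x400 that appears in r9 and r6.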


I then rebooted, ran fsck -y without using the journal, and noticed:

** Phase 2 - Check Pathnames
UNALLOCATED  I=244  OWNER=root MODE=0
SIZE=0 MTIME=Jan  1 00:00 1970 
NAME=/var/run/dmesg.boot

UNEXPECTED SOFT UPDATE INCONSISTENCY


If I can find a way to reproduce this, I'll let you know.

Cheers,

Tim
Received on Sun May 13 2012 - 20:35:59 UTC