fsck bug in replaying partial frag truncate journal on UFS SU+J?

From: <takehara.mikihito_at_jp.panasonic.com>
Date: Tue, 13 May 2014 21:42:19 +0900
Hello,

I think fsck_ffs has a bug in replaying partial frag truncate journal on UFS SU+J.
Bellow I tested about the issue.

I tested blocksize==4KB, fragsize=512byte UFS SU+J. But I think these parameters are not related to 
this issue.
1) Preparing 4096byte(1block) test file.
2) Use "truncate" to shorten filesize to 3584byte(7frags).
3) Shutdown without unmount after journal is written and before inode size ufs2_dinode is written.
4) Run fsck_ffs with journal.
5) Mount again and remove test file. Then I face panic.
   "panic: ffs_blkfree_cg: freeing free block"

This seems to be caused by fsck_ffs to replay JOP_FREEBLK whose "blkno" is not block-aligned.
The above case 2), JOP_FREEBLK journal is like this:
   FREEBLK ino=5, blkno=1727, lbn=0, frags=1, oldfrags=0  <---(a)
fsck_ffs handles JOP_NEWBLK almost same as JOP_FREEBLK. But my understanding is that in case of 
JOP_NEWBLK kernel always create block-aligned blkno. So this issue is only in case of JOP_FREEBLK.

To analyze this issue, I also tested that using the above test file 3584byte(7frags) and write 
512byte with append to largen to 4096byte(1block). In this case JOP_NEWBLK journal is like this:
   JOP_NEWBLK ino=5, blkno=1720, lbn=0, frags=8, oldfrags=7  <---(b)

I think the bug is in fsck_ffs suj.c's arround blk_check(). My understanding is that blk_check() 
cannot handle non-block-aligned blkno so there is a little trick in blk_build() bellow.
----------------------------------------------------------------------------------------
        /*                                                                              
         * Rewrite the record using oldfrags to indicate the offset into                
         * the block.  Leave jb_frags as the actual allocated count.                    
         */
        blkrec->jb_blkno -= frag;
        blkrec->jb_oldfrags = frag;                                                     
----------------------------------------------------------------------------------------
By this trick, (a) is modified like:
   JOP_FREEBLK ino=5, blkno=1720, lbn=0, frags=1, oldfrags=7  <---(a')
and (b) is modified like:
   JOP_NEWBLK ino=5, blkno=1720, lbn=0, frags=8, oldfrags=0  <---(b')
But blk_check() cannot handle the case "oldfrags!=0". If "oldfrags!=0", blk_check()'s "isat" comes 
to be 0 even though the blk is present. So (a) should be modified same as (b')
The following is my patch to fix like this. According to my test, this patch works fine.
=======================================================================
diff --git a/sbin/fsck_ffs/suj.c b/sbin/fsck_ffs/suj.c
index e21ffc6..2512522 100644
--- a/sbin/fsck_ffs/suj.c
+++ b/sbin/fsck_ffs/suj.c
_at__at_ -2503,11 +2503,11 _at__at_ blk_build(struct jblkrec *blkrec)
        frag = fragnum(fs, blkrec->jb_blkno);
        sblk = blk_lookup(blk, 1);
        /*
-        * Rewrite the record using oldfrags to indicate the offset into
-        * the block.  Leave jb_frags as the actual allocated count.
+        * Rewrite the record to indicate the offset into the block.
         */
        blkrec->jb_blkno -= frag;
-       blkrec->jb_oldfrags = frag;
+       blkrec->jb_frags += frag;
+       blkrec->jb_oldfrags = 0;
        if (blkrec->jb_oldfrags + blkrec->jb_frags > fs->fs_frag)
                err_suj("Invalid fragment count %d oldfrags %d\n",
                    blkrec->jb_frags, frag);
=======================================================================

I think we can fix this by changing kernel code not to create non-block-aligned JOP_FREEBLK.
But I also think it is better to change fsck by considering compatibility.
Received on Tue May 13 2014 - 10:42:49 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:49 UTC