Re: ffs_truncate3 panics

From: Konstantin Belousov <kostikbel_at_gmail.com> Date: Sat, 11 Aug 2018 15:37:55 +0300

On Sat, Aug 11, 2018 at 12:05:25PM +0000, Rick Macklem wrote:
> Konstantin Belousov wrote:
> >On Thu, Aug 09, 2018 at 08:38:50PM +0000, Rick Macklem wrote:
> >> >BTW, does NFS server use extended attributes ?  What for ?  Can you, please,
> >> >point out the code which does this ?
> >> For the pNFS service, there are two system namespace extended attributes for
> >> each file stored on the service.
> >> pnfsd.dsfile - Stores where the data for the file is. Can be displayed by the
> >>      pnfsdsfile(8) command.
> >>
> >> pnfsd.dsattr - Cached attributes that change when a file is written (size, mtime,
> >> change) so that the MDS doesn't have to do a Getattr on the data server for every client Getattr.
> >>
> >
> >My reading of the nfsd code + ffs extattr handling reminds me that you
> >already reported this issue some time ago.  I suspected ufs_balloc() at
> >that time.
> Yes. I had almost forgotten about them, because I have been testing with a
> couple of machines (not big, but amd64 with a few Gbytes of RAM) and they
> never hit the panic(). Recently, I've been using the 256Mbyte i386 and started
> seeing them again.
> 
> >Now I think that the situation with the stray buffers hanging on the
> >queue is legitimate, ffs_extread() might create such buffer and release
> >it to a clean queue, then removal of the file would see inode with no
> >allocated ext blocks but with the buffer.
> >
> >I think the easiest way to handle it is to always flush buffers and pages
> >in the ext attr range, regardless of the number of allocated ext blocks.
> >Patch below was not tested.
> [patch deleted for brevity]
> Well, the above sounds reasonable, but the patch didn't help.
> Here's a small portion of the log from a test run last night.
> - First, a couple of things about the printf()s. When they start with "CL=<N>",
>   the printf() is at the start of ffs_truncate(). "<N>" is a static counter of calls to
>   ffs_truncate(), so "same value" indicates same call.
> 
> 
> CL=31816 flags=0xc00 vtyp=1 bodirty=0 boclean=1 diextsiz=320
> buf at 0x429f260
> b_flags = 0x20001020<vmio,reuse,cache>, b_xflags=0x2<clean>, b_vflags=0x0
> b_error = 0, b_bufsize = 4096, b_bcount = 4096, b_resid = 0
> b_bufobj = (0xfa3f734), b_data = 0x4c90000, b_blkno = -1, b_lblkno = -1, b_dep = 0
> b_kvabase = 0x4c90000, b_kvasize = 32768
> 
> CL=34593 flags=0xc00 vtyp=1 bodirty=0 boclean=1 diextsiz=320
> buf at 0x429deb0
> b_flags = 0x20001020<vmio,reuse,cache>, b_xflags=0x2<clean>, b_vflags=0x0
> b_error = 0, b_bufsize = 4096, b_bcount = 4096, b_resid = 0
> b_bufobj = (0xfd3da94), b_data = 0x5700000, b_blkno = -1, b_lblkno = -1, b_dep = 0
> b_kvabase = 0x5700000, b_kvasize = 32768
> 
> FFST3=34593 vtyp=1 bodirty=0 boclean=1
> buf at 0x429deb0
> b_flags = 0x20001020<vmio,reuse,cache>, b_xflags=0x2<clean>, b_vflags=0x0
> b_error = 0, b_bufsize = 4096, b_bcount = 4096, b_resid = 0
> b_bufobj = (0xfd3da94), b_data = 0x5700000, b_blkno = -1, b_lblkno = -1, b_dep = 0
> b_kvabase = 0x5700000, b_kvasize = 32768
The problem with this buffer is that the BX_ALTDATA bit is not set.
This is why vinvalbuf(V_ALT) skips it.
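
The skip can be modelled in userland with a minimal sketch. This is not the kernel code: the flag values and the helper name flush_would_skip() are assumptions for illustration, with 0x2 chosen only to match the b_xflags=0x2<clean> shown in the log above. See sys/buf.h and sys/vnode.h for the real definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; assumed, not copied from the kernel headers. */
#define BX_VNCLEAN 0x2u	/* buffer is on the clean list (matches the log) */
#define BX_ALTDATA 0x4u	/* buffer holds ext attr (alternate) data */
#define V_ALT      0x2	/* flush only alternate-data buffers */
#define V_NORMAL   0x4	/* flush only regular-data buffers */

/*
 * Hypothetical helper: returns true if a flush requested with "flags"
 * would leave a buffer with these b_xflags untouched.
 */
static bool
flush_would_skip(int flags, uint32_t b_xflags)
{
	bool alt = (b_xflags & BX_ALTDATA) != 0;

	/* A "normal data only" flush ignores alternate-data buffers. */
	if ((flags & V_NORMAL) != 0 && alt)
		return (true);
	/* An "alternate data only" flush ignores everything else. */
	if ((flags & V_ALT) != 0 && !alt)
		return (true);
	return (false);
}
```

With b_xflags = 0x2 as in the log, flush_would_skip(V_ALT, 0x2) is true, so the buffer survives the flush. Once BX_ALTDATA is or'ed in, the same call returns false.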

> 
> So, the first one is what typically happens and there would be no panic().
>  The second/third would be a panic(), since the one that starts with "FFST3"
> is a printf() that replaces the panic() call.
> - Looking at the second/third, the number at the beginning is the same, so it is
>   the same call, but for some reason, between the start of the function and
>   where the ffs_truncate3 panic() test is, di_extsize has been set to 0, but the
>   buffer is still there (or has been re-created there by another thread?).
> 
> Looking at the code, I can't see how this could happen, since there is a vinvalbuf()
> call after the only place in the code that sets di_extsize == 0, from what I can see?
> I am going to add printf()s after the vinvalbuf() calls, to make sure they are
> happening and getting rid of the buffer.
> 
> If another thread could somehow (re)create the buffer concurrently with the
> ffs_truncate() call, that would explain it, I think?
The vnode is exclusively locked. Another thread must not be able to
instantiate a buffer under us.

> 
> Just a wild guess, but I suspect softdep_slowdown() is flipping, due to the small
> size of the machine and this makes the behaviour of ffs_truncate() confusing.

This is the patch that I posted a long time ago.  It is obviously related
to the missing BX_ALTDATA.  Can you add this patch to your kernel?

diff --git a/sys/ufs/ffs/ffs_balloc.c b/sys/ufs/ffs/ffs_balloc.c
index 552c295753d..6d89a229ea7 100644
--- a/sys/ufs/ffs/ffs_balloc.c
+++ b/sys/ufs/ffs/ffs_balloc.c
@@ -682,8 +682,16 @@ ffs_balloc_ufs2(struct vnode *vp, off_t startoffset, int size,
 				    ffs_blkpref_ufs2(ip, lbn, (int)lbn,
 				    &dp->di_extb[0]), osize, nsize, flags,
 				    cred, &bp);
-				if (error)
+				if (error != 0) {
+					/* getblk does truncation, if needed */
+					bp = getblk(vp, -1 - lbn, osize, 0, 0,
+					    GB_NOCREAT);
+					if (bp != NULL) {
+						bp->b_xflags |= BX_ALTDATA;
+						brelse(bp);
+					}
 					return (error);
+				}
 				bp->b_xflags |= BX_ALTDATA;
 				if (DOINGSOFTDEP(vp))
 					softdep_setup_allocext(ip, lbn,
@@ -699,8 +707,17 @@ ffs_balloc_ufs2(struct vnode *vp, off_t startoffset, int size,
 			error = ffs_alloc(ip, lbn,
 			   ffs_blkpref_ufs2(ip, lbn, (int)lbn, &dp->di_extb[0]),
 			   nsize, flags, cred, &newb);
-			if (error)
+			if (error != 0) {
+				bp = getblk(vp, -1 - lbn, nsize, 0, 0,
+				    GB_NOCREAT);
+				if (bp != NULL) {
+					bp->b_xflags |= BX_ALTDATA;
+					bp->b_flags |= B_RELBUF | B_INVAL;
+					bp->b_flags &= ~B_ASYNC;
+					brelse(bp);
+				}
 				return (error);
+			}
 			bp = getblk(vp, -1 - lbn, nsize, 0, 0, gbflags);
 			bp->b_blkno = fsbtodb(fs, newb);
 			bp->b_xflags |= BX_ALTDATA;
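
A note on the "-1 - lbn" argument to getblk() in the patch: it maps an ext attr block number onto a negative buffer block number, which is why the buffers in the log above show b_lblkno = -1 (ext block 0). A minimal sketch of that arithmetic follows; both helper names are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical helpers mirroring the "-1 - lbn" expression in the patch.
 * Ext attr block 0 maps to buffer block -1, block 1 to -2, and so on.
 */
static int64_t
ext_lbn_to_buf_lblkno(int64_t lbn)
{
	return (-1 - lbn);
}

/* The mapping is its own inverse. */
static int64_t
buf_lblkno_to_ext_lbn(int64_t lblkno)
{
	return (-1 - lblkno);
}
```

For example, ext_lbn_to_buf_lblkno(0) is -1, and buf_lblkno_to_ext_lbn(-1) gives back 0.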