Re: Softupdate/kernel panic ffs_fsync

From: Sven Willenberger <sven_at_dmv.com>
Date: Tue, 15 Jun 2004 09:16:02 -0400
On Mon, 2004-06-14 at 13:29 -0400, Sven Willenberger wrote:
> Once upon a time I wrote:
> 
> > I have seen a few (unresolved) questions similar to this searching
> > (google|archives). On a 5.2.1-Release-P2 system (actually a couple with
> > essentially identical configs) I get the following Stack Backtrace
> > messages:
> > 
> > backtrace(c070cbf8,2,e5b3af60,0,22) at backtrace+0x17
> > getdirtybuf(f7f99bbc,0,1,e5b3af60,1) at getdirtybuf+0x30
> > flush_deplist(c724e64c,1,f7f99be4,f7f99be8,0) at flush_deplist+0x43
> > flush_inode_deps(c6c35000,5c108,f7f99c10,c0510fe3,f7f99c40) at flush_inode_deps+0xa3
> > softdep_sync_metadata(f7f99ca8,0,c06da90f,124,0) at softdep_sync_metadata+0x87
> > ffs_fsync(f7f99ca8,0,c06d0c8b,beb,0) at ffs_fsync+0x3b9
> > fsync(c7c224780,f7f99d14,c06e15c0,3ee,1) at fsync+0x151
> > syscall(80e002f,bfbf002f,bfbf0028,0,80f57e0) at syscall+0x2a0
> > Xint0x80_syscall() at Xint0x80_syscall+0x1d
> > --- syscall (95), eip=0x282a89af, esp=0xbfbfa10c, ebp=0xbfbfba68 ---
> > 
> > 
> > The systems in question are mail servers that act as gateways (no local
> > delivery) running mimedefang (2.39 - 2.42) with spamassassin. The work
> > directory is not swap/memory mounted but rather on
> > /var/spool/MIMEDefang. The frequency of these messages increases when
> > bayes filtering is added (as the bayes tokens db file also resides on
> > the same filesystem/directory).
> > 
> > I have read that it may be that getdirtybuf() was passed a corrupt
> > buffer header; has anything further ever been made of this and if not,
> > where/how do I start to help contributing to finding a solution?
> 
> I have yet to see a resolution to this issue. I am now running all the
> boxen using 5.2.1-Release-P8 with perl 5.8.4 and all ports upgraded.
> 
> I have created 256MB Ramdisks on each machine that MIMEDefang now uses
> for its temp files and bayesian database but, if anything, the
> frequency of backtraces has actually increased, rather than decreased.
> 
> What do I need to do to further delineate this issue? For me this is a
> showstopper as it will occasionally cause a panic/reboot. I have these
> machines clustered so as not to interrupt services but it is slowly
> becoming frustrating as the machines are bailing under heavy traffic.
> Is there any output I can provide or diagnostics I can run to help find
> a solution?
> 
> Sven
> 

Would this have anything to do with background fscking? Or is the bgfsck
only run once at bootup[+delay] if the system determines it is needed? I
am trying to find some common factor here, and the only thing I can find
is that the kernel produces the backtrace during heavy incoming mail
load (when many perl processes, courtesy of MIMEDefang, are running).
This is still odd because all the temp files are on a RAMdisk
(malloc-based) - is it possible that softupdates is trying to fsync
either swap and/or other memory devices? The following is a typical
layout of the boxes in question:

/dev/da0s1a on / (ufs, local)
devfs on /dev (devfs, local)
/dev/da0s1e on /tmp (ufs, local, soft-updates)
/dev/da0s1f on /usr (ufs, local, soft-updates)
/dev/da0s1d on /var (ufs, local, soft-updates)
/dev/md10 on /var/spool/MIMEDefang (ufs, local)

where the ramdisk is configured with mdconfig -a -t malloc -s 256m -u 10
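For completeness, the full sequence used to bring up that ramdisk is along these lines (the newfs and mount invocations here are a sketch of the typical steps, not necessarily the exact flags used on these boxes; note that newfs is run without -U, matching the absence of soft-updates on /var/spool/MIMEDefang in the mount output above):

```shell
# Attach a 256 MB malloc-backed memory disk as unit 10 (/dev/md10)
mdconfig -a -t malloc -s 256m -u 10

# Lay down a UFS filesystem on it; adding -U here would enable
# soft updates, which the mount output above shows is not in use
newfs /dev/md10

# Mount it over the MIMEDefang work directory
mount /dev/md10 /var/spool/MIMEDefang
```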

Sven
Received on Tue Jun 15 2004 - 11:17:18 UTC
