Re: HAVE TRACE & DDB Re: FreeBSD 5.2-RC1 released

From: Jeff Roberson <jroberson_at_chesapeake.net>
Date: Fri, 12 Dec 2003 07:03:54 -0500 (EST)
I have some more information based on debugging done by dwhite.  See
below.

On Fri, 12 Dec 2003, Doug White wrote:

> Ok, after playing with make release, I can get this to drop to ddb.
>
> Here are the details:
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address   = 0x0
> fault code              = supervisor read, page not present
> instruction pointer     = 0x8:0xc0632e63
> stack pointer           = 0x10:0xd8b89afc
> frame pointer           = 0x10:0xd8b89b18
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 53 (syncer)
> kernel: type 12 trap, code=0
> Stopped at      _mtx_lock_flags+0x43:   cmpl    $0xc08b815c,0(%ebx)
> db> tr
> _mtx_lock_flags(0,0,c0865696,900,0) at _mtx_lock_flags+0x43
> vfs_setdirty(cee49ff0,0,c0865696,3fc,0) at vfs_setdirty+0x79
> bdwrite(cee49ff0,cee49ff0,0,4000,0) at bdwrite+0x358
> ffs_update(c4e16b2c,0,c0874d12,143,d8b89c28) at ffs_update+0x333
> ffs_fsync(d8b89c58,10012,c4ac0c80,495,0) at ffs_fsync+0x42f
> ffs_sync(c4cfb800,3,c20f7200,c4ac0c80,c4cfb800) at ffs_sync+0x1d4
> sync_fsync(d8b89cd4,30002,c4ac0c80,6cd,0) at sync_fsync+0x16a
> sched_sync(0,d8b89d48,c085d045,311,ffffffff) at sched_sync+0x286
> fork_exit(c06979b0,0,d8b89d48) at fork_exit+0xb4
> fork_trampoline() at fork_trampoline+0x8
> --- trap 0x1, eip = 0, esp = 0xd8b89d7c, ebp = 0 ---
>
> I'm getting a fsync message before things fall over, not sure if I can
> recover it.  More later.
>
> On Fri, 12 Dec 2003, Martin Blapp wrote:
>
> > >fsync: giving up on dirty: 0xc4e18e38: tag devfs, type VCHR, usecount 50,
> > >writecount 0, refcount 14,
> > > flags (VI_XLOCK|VV_OBJBUF), lock type devfs: EXCL (count 1) by thread
> > >0xc20ff500
> > >        dev ar0s1a

This fsync message seems to be common.  From dwhite's box:

fsync: giving up on dirty: 0xc4e18000: tag devfs, type VCHR, usecount 44,
writecount 0, refcount 14, flags (VI_XLOCK|VV_OBJBUF), lock type devfs: EXCL
(count 1) by thread 0xc20ff500

Locked vnodes
0xc4e16b2c: tag ufs, type VDIR, usecount 10, writecount 0, refcount 2,
flags (VV_ROOT|VV_OBJBUF), lock type ufs: EXCL (count 1) by thread 0xc4ac0c80
ino 2, on dev ar0s1a (4, 31)
0xc4e16a28: tag syncer, type VNON, usecount 1, writecount 0, refcount 1,
lock type syncer: EXCL (count 1) by thread 0xc4ac0c80


Looks like we're syncing the root inode for this filesystem.  Here's the
buffer:
b_flags = 0x200202a0<vmio,clusterok,done,delwri,cache>
b_error = 0, b_bufsize = 16384, b_bcount = 16384, b_resid = 0
b_dev = (4,31), b_data = 0xcf4de000, b_blkno = 224
b_npages = 4, pages(OBJ, IDX, PA):
(0, 0x1c, 0x181ef000),
(0, 0x1d, 0x17e50000),
(0, 0x1e, 0x180d1000),
(0, 0x1f, 0x180b2000)
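
For anyone decoding the flags word by hand, here is a minimal sketch in
Python.  The bit values are not copied from the kernel headers.  They are
inferred from this dump: 0x200202a0 has exactly five bits set and ddb
prints five names from the highest bit down, so treat the table as an
assumption.

```python
# Bit-to-name table inferred from the ddb output above (five set bits,
# five names printed from the highest bit down). This is an assumption,
# not a copy of the kernel header.
FLAG_NAMES = {
    0x20000000: "vmio",
    0x00020000: "clusterok",
    0x00000200: "done",
    0x00000080: "delwri",
    0x00000020: "cache",
}


def decode_b_flags(value):
    """Return (names, leftover) for a b_flags word, highest bit first.

    leftover holds any bits that have no entry in FLAG_NAMES.
    """
    names = []
    leftover = value
    for bit in sorted(FLAG_NAMES, reverse=True):
        if value & bit:
            names.append(FLAG_NAMES[bit])
            leftover &= ~bit
    return names, leftover


names, leftover = decode_b_flags(0x200202A0)
print(",".join(names), hex(leftover))
```

A leftover of zero means every set bit was accounted for.  The delwri bit
is at least consistent with the bdwrite() frame in the trace.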

And the page addresses:
0xcee4a120:     c1b04af0        c1af4638        c1affa80       c1aff1c8

First page:
0xc1b04af0:     c1b041f0        c09522a0        c1af4638        c4e17770
0xc1b04b00:     0               c1ae4330        0               1c
0xc1b04b10:     0               181ef000        0               0
0xc1b04b20:     c1b04b1c        0               1000f           0
0xc1b04b30:     ff              0

The page looks valid.  Flags are 0, valid is 0xff, one of the splay
pointers is NULL, and the object is NULL.  The pages are all within the
vm_page_array.
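
To cross-check the page dump against the buffer's page list, the words
can be indexed by address and compared.  This is only a sketch: the three
offsets below are hypothetical labels picked because the values at those
positions match the dump, not offsets taken from the struct vm_page
definition.

```python
# Words copied from the ddb dump of the first page above.
DUMP = """\
0xc1b04af0:     c1b041f0        c09522a0        c1af4638        c4e17770
0xc1b04b00:     0               c1ae4330        0               1c
0xc1b04b10:     0               181ef000        0               0
0xc1b04b20:     c1b04b1c        0               1000f           0
0xc1b04b30:     ff              0
"""


def parse_dump(text):
    """Map the address of each 32-bit word to its value."""
    words = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        addr_text, rest = line.split(":", 1)
        addr = int(addr_text, 16)
        for i, word in enumerate(rest.split()):
            words[addr + 4 * i] = int(word, 16)
    return words


words = parse_dump(DUMP)
base = 0xC1B04AF0

# Hypothetical offsets, chosen to match the dump (assumption).
OFF_OBJECT = 0x18  # the zero word read here as the NULL object
OFF_PINDEX = 0x1C  # matches IDX 0x1c in the buffer's page list
OFF_PHYS = 0x24    # matches PA 0x181ef000 in the buffer's page list

print(hex(words[base + OFF_OBJECT]))
print(hex(words[base + OFF_PINDEX]))
print(hex(words[base + OFF_PHYS]))
```

If the object's mutex sits at offset 0 of the object (an assumption), a
NULL object here would also fit the _mtx_lock_flags(0, ...) frame and the
fault at virtual address 0x0.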

Two things stand out to me.  The first is the fsync failure, which
interestingly points to a VCHR vnode with VI_XLOCK set.  This is during
sysinstall, so I'm not sure why we'd be xlocking anything, but I don't
know much about the mechanics of sysinstall.  The second is the NULL
object, which is quite confusing since we have valid-looking object
offsets.
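
As a rough illustration of why the offsets look valid, the numbers in the
buffer dump agree with each other.  The sketch below assumes b_blkno is
counted in 512-byte units, pages are 4096 bytes, and the page index is the
byte offset on the device divided by the page size.  None of that is
stated in the dump itself.

```python
DEV_BSIZE = 512   # assumed unit of b_blkno
PAGE_SIZE = 4096  # assumed page size on this machine

# Values copied from the buffer dump.
B_BLKNO = 224
B_BUFSIZE = 16384
PAGE_INDICES = [0x1C, 0x1D, 0x1E, 0x1F]
PAGE_ADDRS = [0x181EF000, 0x17E50000, 0x180D1000, 0x180B2000]


def expected_page_indices(blkno, bufsize):
    """Page indices a buffer would cover under the assumptions above."""
    first = (blkno * DEV_BSIZE) // PAGE_SIZE
    return [first + i for i in range(bufsize // PAGE_SIZE)]


indices_match = expected_page_indices(B_BLKNO, B_BUFSIZE) == PAGE_INDICES
addrs_aligned = all(pa % PAGE_SIZE == 0 for pa in PAGE_ADDRS)
print(indices_match, addrs_aligned)
```

Block 224 works out to byte offset 0x1c000, which lands on page index
0x1c, and a 16384-byte buffer covers four pages.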

Furthermore, people report that this goes away if you don't enable soft
updates during the install.  It has also been reproduced by running
sysinstall over a previously created filesystem.

Anyone care to suggest further avenues for debugging?  I'm going to sleep
soon.

Cheers,
Jeff

> >
> > I get exactly the same type of panic with an Adaptec AAC 2200S RAID
> > system. Shortly after the install starts making progress, the system
> > instantly panics.
>
>
>
> >
> > The bug is the same in CURRENT from a month ago as it is in
> > 5.2-RC1.
> >
> > Martin
> >
> > Martin Blapp, <mb_at_imp.ch> <mbr_at_FreeBSD.org>
> > ------------------------------------------------------------------
> > ImproWare AG, UNIXSP & ISP, Zurlindenstrasse 29, 4133 Pratteln, CH
> > Phone: +41 61 826 93 00 Fax: +41 61 826 93 01
> > PGP: <finger -l mbr_at_freebsd.org>
> > PGP Fingerprint: B434 53FC C87C FE7B 0A18 B84C 8686 EF22 D300 551E
> > ------------------------------------------------------------------
> >
>
> --
> Doug White                    |  FreeBSD: The Power to Serve
> dwhite_at_gumbysoft.com          |  www.FreeBSD.org
> _______________________________________________
> freebsd-current_at_freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe_at_freebsd.org"
>
Received on Fri Dec 12 2003 - 03:04:07 UTC
