Re: boot-time crash in today's -current, and other threading problems

From: Attilio Rao <attilio_at_freebsd.org>
Date: Wed, 21 Nov 2007 20:39:01 +0100

2007/11/21, Doug Barton <dougb_at_freebsd.org>:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: RIPEMD160
>
> I'm spamming everyone who's had fingers in the threading code lately
> since I can't seem to find a specific commit that looks guilty.
>
> On 19 Nov. I updated my -current system and noticed a regression where
> alpine (a new version of the pine mail client that uses threads) would
> crash while opening my mail folders with a sig 6. I figured I'd wait a
> day or two since it was obvious that there was some work going on with
> threads, and other things were working.
>
> Today I upgraded again, and the new kernel crashes on startup.
> Traceback is below. Suggestions welcome.
>
> Doug
>
> ...
> bge0: <Broadcom NetXtreme Gigabit Ethernet Controller, ASIC rev.
> 0x6002> mem 0xecef0000-0xecefffff irq 18 at device 0.0 on pci9
> miibus0: <MII bus> on bge0
> brgphy0: <BCM5752 10/100/1000baseTX PHY> PHY 1 on miibus0
> brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
> 1000baseT-FDX, auto
> bge0: Ethernet address: 00:15:c5:55:f0:5b
> bge0: [ITHREAD]
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 0; apic id = 00
> fault virtual address   = 0xdeadc0ee
> fault code              = supervisor read, page not present
> instruction pointer     = 0x20:0xc0556053
> stack pointer           = 0x28:0xe5b00c54
> frame pointer           = 0x28:0xe5b00c64
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, def32 1, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 11 (swi4: clock sio)
> Physical memory: 2029 MB
> Dumping 70 MB: 55 39 23 7
>
> #0  doadump () at pcpu.h:195
> 195     pcpu.h: No such file or directory.
>         in pcpu.h
>
> (kgdb) where
> #0  doadump () at pcpu.h:195
> #1  0xc0455619 in db_fncall (dummy1=-441447812, dummy2=0, dummy3=582,
>     dummy4=0xe5b009e8 "ÀxEÀ") at /usr/local/src/sys/ddb/db_command.c:486
> #2  0xc0455b85 in db_command_loop () at
> /usr/local/src/sys/ddb/db_command.c:401
> #3  0xc04572f5 in db_trap (type=12, code=0)
>     at /usr/local/src/sys/ddb/db_main.c:222
> #4  0xc058b136 in kdb_trap (type=12, code=0, tf=0xe5b00c14)
>     at /usr/local/src/sys/kern/subr_kdb.c:502
> #5  0xc071f4af in trap_fatal (frame=0xe5b00c14, eva=3735929070)
>     at /usr/local/src/sys/i386/i386/trap.c:863
> #6  0xc071f6d0 in trap_pfault (frame=0xe5b00c14, usermode=0,
> eva=3735929070)
>     at /usr/local/src/sys/i386/i386/trap.c:785
> #7  0xc071ff92 in trap (frame=0xe5b00c14)
>     at /usr/local/src/sys/i386/i386/trap.c:463
> #8  0xc0706dab in calltrap () at
> /usr/local/src/sys/i386/i386/exception.s:139
> #9  0xc0556053 in _mtx_assert (m=0xdeadc0de, what=20,
>     file=0xc0763525 "/usr/local/src/sys/kern/kern_mutex.c", line=167)
>     at /usr/local/src/sys/kern/kern_mutex.c:632
> #10 0xc055676a in unlock_mtx (lock=0xdeadc0de)
>     at /usr/local/src/sys/kern/kern_mutex.c:167
> #11 0xc0574c3f in softclock (dummy=0x0)
>     at /usr/local/src/sys/kern/kern_timeout.c:297
> #12 0xc0546405 in ithread_loop (arg=0xc54daa00)
>     at /usr/local/src/sys/kern/kern_intr.c:1034
> #13 0xc0543988 in fork_exit (callout=0xc0546250 <ithread_loop>,
>     arg=0xc54daa00, frame=0xe5b00d38)
>     at /usr/local/src/sys/kern/kern_fork.c:788
> #14 0xc0706e20 in fork_trampoline ()
>     at /usr/local/src/sys/i386/i386/exception.s:205

My last callout commit caused this.
Obviously it did not show up in my tests.
After a chat with jhb the situation is clearer; I will post a patch
for you to try soon.
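
For anyone reading the trace, here is a rough sketch of the address
arithmetic. It is only a reading of the numbers in the report, not the
fix. It assumes that 0xdeadc0de is the junk pattern a debugging kernel
writes over freed memory (the thread itself does not say so), and the
names FAULT_EVA, LOCK_POINTER, JUNK_PATTERN and describe_fault are made
up for illustration.

```python
# Values copied from the trap message and the backtrace quoted above.
FAULT_EVA = 3735929070      # eva= in the trap_fatal / trap_pfault frames
LOCK_POINTER = 0xdeadc0de   # m= / lock= in the _mtx_assert / unlock_mtx frames

# Assumption: this is the fill pattern written over freed kernel memory
# when allocator debugging is enabled.
JUNK_PATTERN = 0xdeadc0de


def describe_fault(eva, pointer, junk=JUNK_PATTERN):
    """Return (fault address in hex, offset from pointer, pointer == junk)."""
    return hex(eva), eva - pointer, pointer == junk


print(describe_fault(FAULT_EVA, LOCK_POINTER))
```

This prints ('0xdeadc0ee', 16, True): the faulting read is 16 bytes past
the lock pointer, and the pointer itself matches the assumed junk
pattern. That would be consistent with softclock picking up a lock
pointer from memory that had already been freed, but it is an inference
from the trace only.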

Thanks,
Attilio


-- 
Peace can only be achieved by understanding - A. Einstein
Received on Wed Nov 21 2007 - 18:39:04 UTC