Re: bind fails with sig11 on start / pthread failure on ARM?

From: Bernd Walter <ticso_at_cicely7.cicely.de>
Date: Fri, 19 Feb 2010 00:45:37 +0100
On Thu, Feb 18, 2010 at 03:10:10PM +0200, Kostik Belousov wrote:
> On Thu, Feb 18, 2010 at 01:49:07PM +0100, Bernd Walter wrote:
> > On Tue, Feb 16, 2010 at 07:39:51PM +0100, Bernd Walter wrote:
> > > On Mon, Feb 15, 2010 at 10:39:07PM +0100, Bernd Walter wrote:
> > > [55]Please.tell.me.who.am.I# gdb /usr/sbin/named named.core 
> > > GNU gdb 6.1.1 [FreeBSD]
> > > Copyright 2004 Free Software Foundation, Inc.
> > > GDB is free software, covered by the GNU General Public License, and you are
> > > welcome to change it and/or distribute copies of it under certain conditions.
> > > Type "show copying" to see the conditions.
> > > There is absolutely no warranty for GDB.  Type "show warranty" for details.
> > > This GDB was configured as "arm-marcel-freebsd"...(no debugging symbols found)...
> > > Core was generated by `named'.
> > > Program terminated with signal 5, Trace/breakpoint trap.
> > > Reading symbols from /lib/libcrypto.so.6...(no debugging symbols found)...done.
> > > Loaded symbols for /lib/libcrypto.so.6
> > > Reading symbols from /lib/libthr.so.3...(no debugging symbols found)...done.
> > > Loaded symbols for /lib/libthr.so.3
> > > Reading symbols from /lib/libc.so.7...(no debugging symbols found)...done.
> > > Loaded symbols for /lib/libc.so.7
> > > Reading symbols from /libexec/ld-elf.so.1...(no debugging symbols found)...done.
> > > Loaded symbols for /libexec/ld-elf.so.1
> > > #0  0x203571b0 in _thread_bp_create () from /lib/libthr.so.3
> > > [New Thread 20804280 (LWP 100062)]
> > > [New Thread 20804140 (LWP 100052)]
> > > (gdb) bt
> > > #0  0x203571b0 in _thread_bp_create () from /lib/libthr.so.3
> > > #1  0x203572b8 in _thread_bp_death () from /lib/libthr.so.3
> > > #2  0x20349da4 in pthread_create () from /lib/libthr.so.3
> > > #3  0x00164cb8 in ?? ()
> > > (gdb) 
> > > 
> > > Do we have a general threading problem on ARM?
> > 
> > I see two different types of crashes.
> > Both have in common that one or more threads are in _umtx_op.
> > Unfortunately I don't know enough details about those things to isolate
> > any more.
> > 
> > The one from above:
> > #0  0x203571b0 in _thread_bp_create () from /lib/libthr.so.3
> > [New Thread 20804280 (LWP 100062)]
> > [New Thread 20804140 (LWP 100052)]
> > (gdb) bt
> > #0  0x203571b0 in _thread_bp_create () from /lib/libthr.so.3
> > #1  0x203572b8 in _thread_bp_death () from /lib/libthr.so.3
> > #2  0x20349da4 in pthread_create () from /lib/libthr.so.3
> > #3  0x00164cb8 in ?? ()
> > (gdb) thread 1
> > [Switching to thread 1 (Thread 20804280 (LWP 100062))]#0  0x203ab6f0 in _umtx_op () from /lib/libc.so.7
> > (gdb) bt
> > #0  0x203ab6f0 in _umtx_op () from /lib/libc.so.7
> > #1  0x2035769c in pthread_cleanup_push () from /lib/libthr.so.3
> > #2  0x20357cc0 in pthread_cleanup_push () from /lib/libthr.so.3
> > #3  0x20349540 in pthread_getprio () from /lib/libthr.so.3
> > #4  0x203499a0 in pthread_create () from /lib/libthr.so.3
> > #5  0x00164cb8 in ?? ()
> > 
> > And another, which is what I get most of the time:
> > (gdb) thread 1
> > [Switching to thread 1 (Thread 20804500 (LWP 100100))]#0  0x20435f28 in kevent () from /lib/libc.so.7
> > (gdb) bt
> > #0  0x20435f28 in kevent () from /lib/libc.so.7
> > #1  0x0014f2dc in ?? ()
> > (gdb) thread 2
> > [Switching to thread 2 (Thread 208043c0 (LWP 100099))]#0  0x203ab6f4 in _umtx_op () from /lib/libc.so.7
> > (gdb) bt
> > #0  0x203ab6f4 in _umtx_op () from /lib/libc.so.7
> > #1  0x2035769c in pthread_cleanup_push () from /lib/libthr.so.3
> > #2  0x20357a78 in pthread_cleanup_push () from /lib/libthr.so.3
> > #3  0x20355580 in pthread_cond_signal () from /lib/libthr.so.3
> > #4  0x00000000 in ?? ()
> > (gdb) thread 3
> > [Switching to thread 3 (Thread 20804280 (LWP 100098))]#0  0x203ab6f4 in _umtx_op () from /lib/libc.so.7
> > (gdb) bt
> > #0  0x203ab6f4 in _umtx_op () from /lib/libc.so.7
> > #1  0x2035769c in pthread_cleanup_push () from /lib/libthr.so.3
> > #2  0x20357a78 in pthread_cleanup_push () from /lib/libthr.so.3
> > #3  0x20355580 in pthread_cond_signal () from /lib/libthr.so.3
> > #4  0x2092d008 in ?? ()
> > (gdb) thread 4
> > [Switching to thread 4 (Thread 20804140 (LWP 100043))]#0  0x0015755c in ?? ()
> > (gdb) bt
> > #0  0x0015755c in ?? ()
> 
> Compile and install ld-elf.so, libc and libthr with debugging symbols:
> (cd libexec/rtld-elf && make all install DEBUG_FLAGS=-g)
> (cd lib/libc && make all install DEBUG_FLAGS=-g)
> (cd lib/libthr && make all install DEBUG_FLAGS=-g)
> 
> Then reproduce the crash and try to see where in the code it happens.

It is still compiling, but since kdump was fixed recently (thanks guys!)
I already have some other data.
To be honest, I don't see anything useful in it.
The last entry before the segfault was a successful syslog submission.
I hope the build finishes soon so I can get more detailed backtraces.

[...]
 59537 named    CALL  kevent(0x8,0xbfffe91c,0x1,0,0,0)
 59537 named    GIO   fd 8 wrote 20 bytes
       0x0000 0500 0000 ffff 0100 0000 0000 0000 0000 0000 0000  |....................|

 59537 named    GIO   fd 8 read 0 bytes
       ""
 59537 named    RET   kevent 0
 59537 named    CALL  mmap(0xbfafc000,0x101000,PROT_READ|PROT_WRITE,MAP_STACK,0xffffffff,0,0)
 59537 named    RET   mmap -1079001088/0xbfafc000
 59537 named    CALL  mprotect(0xbfafc000,0x1000,PROT_NONE)
 59537 named    RET   mprotect 0
 59537 named    CALL  thr_new(0xbfffe8b4,0x34)
 59537 named    RET   thr_new 0
 59537 named    RET   fork 0
 59537 named    CALL  kevent(0x8,0,0,0x209c6100,0x40,0)
 59537 named    CALL  clock_gettime(0xd,0xbfffce58)
 59537 named    RET   clock_gettime 0
 59537 named    CALL  getpid
 59537 named    RET   getpid 59537/0xe891
 59537 named    CALL  sendto(0x3,0xbfffd304,0x3a,0,0,0)
 59537 named    GIO   fd 3 wrote 58 bytes
       "<30>Feb 18 23:38:10 named[59537]: using up to 4096 sockets"
 59537 named    RET   sendto 58/0x3a
 59537 named    PSIG  SIGSEGV SIG_DFL
 59537 named    RET   _umtx_op -1 errno 4 Interrupted system call
 59537 named    GIO   fd 8 wrote 0 bytes
       ""
 59537 named    RET   kevent -1 errno 4 Interrupted system call
 59537 named    RET   _umtx_op -1 errno 4 Interrupted system call
 59537 named    NAMI  "named.core"
 59536 initial thread GIO   fd 5 read 0 bytes
       ""
 59536 initial thread RET   read 0
 59536 initial thread CALL  exit(0x1)
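
Reading a trace like the one above by hand amounts to finding the PSIG
record and looking at the CALL/RET records of the same pid just before it.
A rough sketch of that in Python follows. It is only an illustration: the
function name records_before_signal is made up here, and the regular
expression assumes the record types that show up in this trace (CALL, RET,
GIO, PSIG, NAMI); real kdump output has more types and may be laid out
differently.

```python
import re

# ASSUMPTION: records look like "<pid> <command> <TYPE> <details>", and only
# the record types seen in the trace above are recognised. The command may
# contain spaces (e.g. "initial thread"), so the type keyword is the anchor.
RECORD = re.compile(r"^\s*(\d+)\s+(.+?)\s+(CALL|RET|GIO|PSIG|NAMI)\s+(.*)$")


def records_before_signal(kdump_text, signal="SIGSEGV", context=3):
    """Return the last `context` CALL/RET records of the signalled pid
    that appear before the first PSIG record for `signal`."""
    history = []  # (pid, "CALL ..." or "RET ...") in order of appearance
    for line in kdump_text.splitlines():
        m = RECORD.match(line)
        if not m:
            continue  # hex dumps, quoted GIO payloads, blank lines
        pid, _command, kind, detail = m.groups()
        if kind == "PSIG" and detail.split()[:1] == [signal]:
            same_pid = [text for p, text in history if p == pid]
            return same_pid[-context:]
        if kind in ("CALL", "RET"):
            history.append((pid, kind + " " + detail.strip()))
    return []  # no matching PSIG record found


# A few lines copied from the trace above.
SAMPLE = """\
 59537 named    CALL  getpid
 59537 named    RET   getpid 59537/0xe891
 59537 named    CALL  sendto(0x3,0xbfffd304,0x3a,0,0,0)
 59537 named    GIO   fd 3 wrote 58 bytes
       "<30>Feb 18 23:38:10 named[59537]: using up to 4096 sockets"
 59537 named    RET   sendto 58/0x3a
 59537 named    PSIG  SIGSEGV SIG_DFL
 59537 named    RET   _umtx_op -1 errno 4 Interrupted system call
"""

for entry in records_before_signal(SAMPLE, context=2):
    print(entry)
```

On the excerpt this only confirms what is already visible: the last
completed call before the signal is the sendto() to syslog. It says nothing
about which thread faulted, so the debug build is still needed.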


-- 
B.Walter <bernd_at_bwct.de> http://www.bwct.de
Modbus/TCP Ethernet I/O modules, ARM-based FreeBSD computers, and much more.
Received on Thu Feb 18 2010 - 22:45:49 UTC
