Re: bind fails with sig11 on start / pthread failure on ARM?

From: M. Warner Losh <imp_at_bsdimp.com>
Date: Thu, 18 Feb 2010 20:22:08 -0700 (MST)
In message: <20100219031200.GY43625_at_cicely7.cicely.de>
            Bernd Walter <ticso_at_cicely7.cicely.de> writes:
: On Thu, Feb 18, 2010 at 03:10:10PM +0200, Kostik Belousov wrote:
: > On Thu, Feb 18, 2010 at 01:49:07PM +0100, Bernd Walter wrote:
: > > On Tue, Feb 16, 2010 at 07:39:51PM +0100, Bernd Walter wrote:
: > > > On Mon, Feb 15, 2010 at 10:39:07PM +0100, Bernd Walter wrote:
: > > > [55]Please.tell.me.who.am.I# gdb /usr/sbin/named named.core 
: > > > GNU gdb 6.1.1 [FreeBSD]
: > > > Copyright 2004 Free Software Foundation, Inc.
: > > > GDB is free software, covered by the GNU General Public License, and you are
: > > > welcome to change it and/or distribute copies of it under certain conditions.
: > > > Type "show copying" to see the conditions.
: > > > There is absolutely no warranty for GDB.  Type "show warranty" for details.
: > > > This GDB was configured as "arm-marcel-freebsd"...(no debugging symbols found)...
: > > > Core was generated by `named'.
: > > > Program terminated with signal 5, Trace/breakpoint trap.
: > > > Reading symbols from /lib/libcrypto.so.6...(no debugging symbols found)...done.
: > > > Loaded symbols for /lib/libcrypto.so.6
: > > > Reading symbols from /lib/libthr.so.3...(no debugging symbols found)...done.
: > > > Loaded symbols for /lib/libthr.so.3
: > > > Reading symbols from /lib/libc.so.7...(no debugging symbols found)...done.
: > > > Loaded symbols for /lib/libc.so.7
: > > > Reading symbols from /libexec/ld-elf.so.1...(no debugging symbols found)...done.
: > > > Loaded symbols for /libexec/ld-elf.so.1
: > > > #0  0x203571b0 in _thread_bp_create () from /lib/libthr.so.3
: > > > [New Thread 20804280 (LWP 100062)]
: > > > [New Thread 20804140 (LWP 100052)]
: > > > (gdb) bt
: > > > #0  0x203571b0 in _thread_bp_create () from /lib/libthr.so.3
: > > > #1  0x203572b8 in _thread_bp_death () from /lib/libthr.so.3
: > > > #2  0x20349da4 in pthread_create () from /lib/libthr.so.3
: > > > #3  0x00164cb8 in ?? ()
: > > > (gdb) 
: > > > 
: > > > Do we have a general threading problem on ARM?
: > > 
: > > I see two different types of crashes.
: > > What both have in common is that one or more threads are in _umtx_op.
: > > Unfortunately I don't know enough about these internals to narrow it
: > > down any further.
: > > 
: > > The one from above:
: > > #0  0x203571b0 in _thread_bp_create () from /lib/libthr.so.3
: > > [New Thread 20804280 (LWP 100062)]
: > > [New Thread 20804140 (LWP 100052)]
: > > (gdb) bt
: > > #0  0x203571b0 in _thread_bp_create () from /lib/libthr.so.3
: > > #1  0x203572b8 in _thread_bp_death () from /lib/libthr.so.3
: > > #2  0x20349da4 in pthread_create () from /lib/libthr.so.3
: > > #3  0x00164cb8 in ?? ()
: > > (gdb) thread 1
: > > [Switching to thread 1 (Thread 20804280 (LWP 100062))]#0  0x203ab6f0 in _umtx_op () from /lib/libc.so.7
: > > (gdb) bt
: > > #0  0x203ab6f0 in _umtx_op () from /lib/libc.so.7
: > > #1  0x2035769c in pthread_cleanup_push () from /lib/libthr.so.3
: > > #2  0x20357cc0 in pthread_cleanup_push () from /lib/libthr.so.3
: > > #3  0x20349540 in pthread_getprio () from /lib/libthr.so.3
: > > #4  0x203499a0 in pthread_create () from /lib/libthr.so.3
: > > #5  0x00164cb8 in ?? ()
: > > 
: > > And another, which is what I get most of the time:
: > > (gdb) thread 1
: > > [Switching to thread 1 (Thread 20804500 (LWP 100100))]#0  0x20435f28 in kevent () from /lib/libc.so.7
: > > (gdb) bt
: > > #0  0x20435f28 in kevent () from /lib/libc.so.7
: > > #1  0x0014f2dc in ?? ()
: > > (gdb) thread 2
: > > [Switching to thread 2 (Thread 208043c0 (LWP 100099))]#0  0x203ab6f4 in _umtx_op () from /lib/libc.so.7
: > > (gdb) bt
: > > #0  0x203ab6f4 in _umtx_op () from /lib/libc.so.7
: > > #1  0x2035769c in pthread_cleanup_push () from /lib/libthr.so.3
: > > #2  0x20357a78 in pthread_cleanup_push () from /lib/libthr.so.3
: > > #3  0x20355580 in pthread_cond_signal () from /lib/libthr.so.3
: > > #4  0x00000000 in ?? ()
: > > (gdb) thread 3
: > > [Switching to thread 3 (Thread 20804280 (LWP 100098))]#0  0x203ab6f4 in _umtx_op () from /lib/libc.so.7
: > > (gdb) bt
: > > #0  0x203ab6f4 in _umtx_op () from /lib/libc.so.7
: > > #1  0x2035769c in pthread_cleanup_push () from /lib/libthr.so.3
: > > #2  0x20357a78 in pthread_cleanup_push () from /lib/libthr.so.3
: > > #3  0x20355580 in pthread_cond_signal () from /lib/libthr.so.3
: > > #4  0x2092d008 in ?? ()
: > > (gdb) thread 4
: > > [Switching to thread 4 (Thread 20804140 (LWP 100043))]#0  0x0015755c in ?? ()
: > > (gdb) bt
: > > #0  0x0015755c in ?? ()
: > 
: > Compile and install ld-elf.so, libc and libthr with debugging symbols:
: > (cd libexec/rtld-elf && make all install DEBUG_FLAGS=-g)
: > (cd lib/libc && make all install DEBUG_FLAGS=-g)
: > (cd lib/libthr && make all install DEBUG_FLAGS=-g)
: > 
: > Then reproduce the crash and try to see where in the code it happens.
: 
: Currently I can only reproduce this type.
: I ran the unstripped named to get all the function names.
: 
: (gdb) thread 1
: [Switching to thread 1 (Thread 20804500 (LWP 100100))]#0  0x20435308 in kevent () at kevent.S:3
: 3       RSYSCALL(kevent)
: Current language:  auto; currently asm
: (gdb) bt
: #0  0x20435308 in kevent () at kevent.S:3
: #1  0x0014f2dc in watcher ()
: #2  0x203495b0 in thread_start (curthread=0x20804500) at /data/builder/arm-current/head/lib/libthr/thread/thr_create.c:288
: #3  0x20349a20 in _pthread_create (thread=0x2046caa8, attr=0xbfffe7f8, start_routine=0x14f2ac <watcher>, arg=Variable "arg" is not available.
: )
:     at /data/builder/arm-current/head/lib/libthr/thread/thr_create.c:174
: #4  0x00000000 in ?? ()
: Cannot access memory at address 0x40
: (gdb) thread 2
: [Switching to thread 2 (Thread 208043c0 (LWP 100099))]#0  0x203ab558 in _umtx_op () at _umtx_op.S:3
: 3       RSYSCALL(_umtx_op)
: (gdb) bt
: #0  0x203ab558 in _umtx_op () at _umtx_op.S:3
: #1  0x2035753c in _umtx_op_err (obj=Variable "obj" is not available.
: ) at /data/builder/arm-current/head/lib/libthr/thread/thr_umtx.c:36
: #2  0x20357910 in _thr_ucond_wait (cv=Variable "cv" is not available.
: ) at /data/builder/arm-current/head/lib/libthr/thread/thr_umtx.c:182
: #3  0x20355464 in cond_wait_common (cond=Variable "cond" is not available.
: ) at /data/builder/arm-current/head/lib/libthr/thread/thr_cond.c:204
: #4  0x20355600 in __pthread_cond_wait (cond=Variable "cond" is not available.
: ) at /data/builder/arm-current/head/lib/libthr/thread/thr_cond.c:228
: #5  0x00153080 in run ()
: #6  0x203495b0 in thread_start (curthread=0x208043c0) at /data/builder/arm-current/head/lib/libthr/thread/thr_create.c:288
: #7  0x20349a20 in _pthread_create (thread=0x0, attr=0xbfffe8f8, start_routine=0x152d70 <run>, arg=Variable "arg" is not available.
: )
:     at /data/builder/arm-current/head/lib/libthr/thread/thr_create.c:174
: #8  0x2092b088 in ?? ()
: (gdb) thread 3
: [Switching to thread 3 (Thread 20804280 (LWP 100098))]#0  0x203ab558 in _umtx_op () at _umtx_op.S:3
: 3       RSYSCALL(_umtx_op)
: (gdb) bt
: #0  0x203ab558 in _umtx_op () at _umtx_op.S:3
: #1  0x2035753c in _umtx_op_err (obj=Variable "obj" is not available.
: ) at /data/builder/arm-current/head/lib/libthr/thread/thr_umtx.c:36
: #2  0x20357910 in _thr_ucond_wait (cv=Variable "cv" is not available.
: ) at /data/builder/arm-current/head/lib/libthr/thread/thr_umtx.c:182
: #3  0x20355464 in cond_wait_common (cond=Variable "cond" is not available.
: ) at /data/builder/arm-current/head/lib/libthr/thread/thr_cond.c:204
: #4  0x20355600 in __pthread_cond_wait (cond=Variable "cond" is not available.
: ) at /data/builder/arm-current/head/lib/libthr/thread/thr_cond.c:228
: #5  0x0016379c in run ()
: #6  0x203495b0 in thread_start (curthread=0x20804280) at /data/builder/arm-current/head/lib/libthr/thread/thr_create.c:288
: #7  0x20349a20 in _pthread_create (thread=0x0, attr=0xbfffe8f8, start_routine=0x163724 <run>, arg=Variable "arg" is not available.
: )
:     at /data/builder/arm-current/head/lib/libthr/thread/thr_create.c:174
: #8  0x2092b088 in ?? ()
: (gdb) thread 4
: [Switching to thread 4 (Thread 20804140 (LWP 100053))]#0  0x0015755c in isc_atomic_cmpxchg ()
: (gdb) bt
: #0  0x0015755c in isc_atomic_cmpxchg ()
: #1  0x00157dac in isc_rwlock_lock ()
: #2  0x000f9790 in dns_db_register ()
: #3  0x0004d590 in dns_sdb_register ()
: #4  0x0000c974 in ns_builtin_init ()
: #5  0x0001aa90 in $a ()
: #6  0x0001aa90 in $a ()
: 
: isc_atomic_cmpxchg really sounds quite interesting though.
: It is not only the crashing function, it is also the type of function
: that sounds error-prone.

ARM atomics require the help of the kernel to get right...
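
For readers unfamiliar with the primitive in frame #0 of thread 4, here is
a minimal sketch of what a compare-and-exchange is expected to guarantee.
It is not the BIND or FreeBSD implementation: the sketch_* names are
hypothetical, and it uses portable C11 atomics rather than ARM assembly.
The assumed contract (the previous value is returned, and the store only
happens when that value equals cmpval) is an assumption about how
isc_atomic_cmpxchg is meant to behave. The relevance to the remark above,
as I understand it, is that ARM cores before ARMv6 have no LDREX/STREX
and offer only an atomic swap (SWP), so the whole read-compare-write step
cannot be made atomic in userland without some cooperation from the
kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a compare-and-exchange primitive.
 * Atomically: if *p == cmpval, store val into *p.
 * Returns the value *p held before the operation in either case.
 */
static int32_t
sketch_cmpxchg(_Atomic int32_t *p, int32_t cmpval, int32_t val)
{
	int32_t expected = cmpval;

	/* On failure, C11 writes the current value of *p into expected. */
	atomic_compare_exchange_strong(p, &expected, val);
	return (expected);
}

/* Try to take a lock word (0 = free, 1 = held). Returns 1 on success. */
static int
sketch_trylock(_Atomic int32_t *lock)
{
	return (sketch_cmpxchg(lock, 0, 1) == 0);
}

/* Release a lock word that the caller currently holds. */
static void
sketch_unlock(_Atomic int32_t *lock)
{
	int32_t prev = sketch_cmpxchg(lock, 1, 0);

	assert(prev == 1);
	(void)prev;
}
```

If the read, compare and write are not one indivisible step, two threads
can both see the lock word as free and both "win" it, which is the kind
of failure a lock built on this primitive would then show.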

Warner
Received on Fri Feb 19 2010 - 02:26:23 UTC
