Re: named crashes on assertion in rbtdb.c on sparc64/SMP

From: Marius Strobl <marius_at_alchemy.franken.de>
Date: Fri, 15 Jul 2011 01:21:26 +0200
On Thu, Jul 14, 2011 at 09:53:42AM +0400, KOT MATPOCKuH wrote:
> 2011/7/11 KOT MATPOCKuH <matpockuh_at_gmail.com>:
> >> Oops, sorry, I forgot to revert the previous patch when test-compiling.
> >> Please re-fetch sparc64_isc_atomic.h.diff2 and try again.
> > I started named from ports (dns/bind96) at Sat Jul  9 10:08:41 MSD,
> > and it worked properly till Sun Jul 10 22:25:41 MSD.
> > At 22:25:41 I restarted bind from base system with your
> > sparc64_isc_atomic.h.diff2.
> > From that moment until today, 15:57:05, it crashed 3 times:
> > Jul 10 23:19:19 sunrise kernel: pid 45352 (named), uid 53: exited on signal 6
> > Jul 11 14:52:20 sunrise kernel: pid 52032 (named), uid 53: exited on signal 6
> > Jul 11 15:14:15 sunrise kernel: pid 71300 (named), uid 53: exited on signal 6
> >
> > To make sure that bind from ports operates properly, I started it again
> > at 15:57:05, and I think we need to wait several days.
> And from that time till now bind from ports never died and works properly...
> 

Okay.
Doug, could you please disable the use of atomic operations for sparc64
in the in-tree BIND via the following patch in order to match what the
vendor source does?
http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff
I've no idea why they don't work properly (apart from the fact that there
should additionally be memory barriers, at least when used for reference
counting, just like the alpha version of the ISC atomic operations uses).
I can only say that they match what we use in the kernel without problems
pretty closely and that they work as described in the respective comments
when tested stand-alone. So my best guess is that the BIND source
additionally depends on some x86-specific behavior of the atomic operations
there or in general, but from a glance at the source it's not obvious to me
what that could be. Given that the vendor source doesn't even use atomic
operations on Solaris/SPARC I suspect this is a non-trivial problem.
It probably would be a good idea to also disable the use of atomic
operations for arm again just like the vendor source does as they don't
work there either but nobody seems to care (see PR 154306).
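
To illustrate the memory barrier remark above, here is a minimal sketch of
reference counting with explicit ordering, written with C11 <stdatomic.h>.
It is not taken from the BIND source or from the patch; struct obj,
obj_new(), obj_ref() and obj_unref() are hypothetical names, and C11
atomics are only an assumption chosen for illustration. The point is that
the decrement needs release semantics and that the thread which drops the
last reference needs an acquire barrier before it tears the object down.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical reference-counted object; not from the BIND source. */
struct obj {
	atomic_uint	refs;
	int		payload;
};

static struct obj *
obj_new(int payload)
{
	struct obj *o;

	o = malloc(sizeof(*o));
	if (o == NULL)
		return (NULL);
	atomic_init(&o->refs, 1);	/* the creator holds the first reference */
	o->payload = payload;
	return (o);
}

static void
obj_ref(struct obj *o)
{

	/* Taking another reference only needs atomicity, not ordering. */
	atomic_fetch_add_explicit(&o->refs, 1, memory_order_relaxed);
}

/* Returns 1 if this call dropped the last reference and freed the object. */
static int
obj_unref(struct obj *o)
{
	unsigned int prev;

	/* Release: our earlier stores to *o are ordered before the decrement. */
	prev = atomic_fetch_sub_explicit(&o->refs, 1, memory_order_release);
	assert(prev > 0);
	if (prev != 1)
		return (0);
	/* Acquire: observe the stores other threads made before their release. */
	atomic_thread_fence(memory_order_acquire);
	free(o);
	return (1);
}
```

Without the release/acquire pair, a weakly ordered CPU may let the thread
that frees the object miss stores another thread made before dropping its
reference. Whether this is what bites named here is not established.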

Marius
Received on Thu Jul 14 2011 - 21:21:29 UTC
