Re: [CURRENT]: weird memory/linker problem?

From: O. Hartmann <ohartman_at_zedat.fu-berlin.de>
Date: Tue, 1 Jul 2014 17:33:35 +0200
Am Tue, 01 Jul 2014 17:23:14 +0200
Willem Jan Withagen <wjw_at_digiware.nl> schrieb:

> On 2014-07-01 16:48, Rang, Anton wrote:
> > DOT => DOD
> >
> > 444F54 => 444F44
> >
> > That's a single-bit flip.  Bad memory, perhaps?
> 
> Very likely, especially if the system does not have ECC....
> It just happens on rare occasions that a alpha particle, power cycle, or 
> any things else disruptive damages a memory cell. And it could be that 
> it requires a special pattern of accesses to actually exhibit the error.
> 
> In the past (199x's) 'make buildworld' used to be a rather good memory 
> tester. But nowadays look at
> 	http://www.memtest.org/
> 
> This tool has found all of the bad memory in all the systems I used and 
> or build for others...
> Note that it might take a few runs and some more heat to actually 
> trigger the faulty cell, but memtest86 will usually find it.
> 
> Note that on big systems with lots of memory it can take a loooooong 
> time to run just one full testset to completion.
> 
> --WjW

I already testet via memtest86+ (had to download the linux image, the port on FreeBSD is
broken on CURRENT). It didn't find anything strange so far.

I will do another test.

I realised, that on that that specific box, the chipset temperature is 81 Grad Celius.
The chipset is a Eaglelake P45 - in which the memory controller resides on that old
platform. dmidecode gives:

        Manufacturer: ASUSTeK Computer INC.
        Product Name: P5Q-WS
        Version: Rev 1.xx

> 
> 
> >
> > Anton
> >
> > -----Original Message-----
> > From: owner-freebsd-current_at_freebsd.org [mailto:owner-freebsd-current_at_freebsd.org] On
> > Behalf Of O. Hartmann Sent: Tuesday, July 01, 2014 8:08 AM
> > To: Dimitry Andric
> > Cc: Adrian Chadd; FreeBSD CURRENT
> > Subject: Re: [CURRENT]: weird memory/linker problem?
> >
> > Am Mon, 23 Jun 2014 17:22:25 +0200
> > Dimitry Andric <dim_at_FreeBSD.org> schrieb:
> >
> >> On 23 Jun 2014, at 16:31, O. Hartmann <ohartman_at_zedat.fu-berlin.de> wrote:
> >>> Am Sun, 22 Jun 2014 10:10:04 -0700
> >>> Adrian Chadd <adrian_at_freebsd.org> schrieb:
> >>>> When they segfault, where do they segfault?
> >> ...
> >>> GIMP, LaTeX work, nothing special, but a bit memory consuming
> >>> regrading GIMP) I tried updating the ports tree and surprisingly the
> >>> tree is left over in a unclean condition while /usr/bin/svn segfault
> >>> (on console: pid 18013 (svn), uid 0: exited on signal 11 (core dumped)).
> >>>
> >>> Using /usr/local/bin/svn, which is from the devel/subversion port,
> >>> performs well, while FreeBSD 11's svn contribution dies as described. It did not
> >>> hours ago!
> >>
> >> I think what Adrian meant was: can you run svn (or another crashing
> >> program) in gdb, and post a backtrace?  Or maybe run ktrace, and see
> >> where it dies?
> >>
> >> Alternatively, put a core dump and the executable (with debug info) in
> >> a tarball, and upload it somewhere, so somebody else can analyze it.
> >>
> >> -Dimitry
> >>
> >
> > It's me again, with the same weird story.
> >
> > After a couple of days silence, the mysterious entity in my computer is back. This
> > time it is again a weird compiler message of failure (trying to buildworld):
> >
> > [...]
> > c++  -O2 -pipe -O3 -O3
> > c++ -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/include
> > -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/tools/clang/include
> > -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support -I.
> > -I/usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/../../lib/clang/include
> > -DLLVM_ON_UNIX -DLLVM_ON_FREEBSD -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS
> > -fno-strict-aliasing -DLLVM_DEFAULT_TARGET_TRIPLE=\"x86_64-unknown-freebsd11.0\"
> > -DLLVM_HOST_TRIPLE=\"x86_64-unknown-freebsd11.0\" -DDEFAULT_SYSROOT=\"\"
> > -Qunused-arguments -I/usr/obj/usr/src/tmp/legacy/usr/include -std=c++11
> > -fno-exceptions -fno-rtti -Wno-c++11-extensions
> > -c /usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support/Host.cpp -o
> > Host.o --- GraphWriter.o --- In file included
> > from /usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/lib/Support/GraphWriter.cpp:14: /usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/include/llvm/Support/GraphWriter.h:269:10:
> > error: use of undeclared identifier 'DOD'; did you mean 'DOT'? O <<
> > DOD::EscapeString(Label); ^~~
> > DOT /usr/src/lib/clang/libllvmsupport/../../../contrib/llvm/include/llvm/Support/GraphWriter.h:35:11:
> > note: 'DOT' declared here namespace DOT {  // Private functions... ^ 1 error
> > generated. *** [GraphWriter.o] Error code 1
> >
> >
> > Well, in the past I saw many of those messages, especially not found labels of
> > routines in shared objects/libraries or even those "funny" misspelled messages shown
> > above.
> >
> > I can not reproduce them after a reboot, but as long as the system is running with
> > this error occured, it is sticky. So in order to compile the OS successfully, I
> > reboot.
> >
> > Does anyone have an idea what this could be? Since it affects at the moment only one
> > machine (the other CoreDuo has been retired in the meanwhile), it feels a bit like a
> > miscompilation on a certain type of CPU.
> >
> > Thanks for your patience,
> >
> > Oliver
> > _______________________________________________
> > freebsd-current_at_freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-current
> > To unsubscribe, send any mail to "freebsd-current-unsubscribe_at_freebsd.org"
> >



Received on Tue Jul 01 2014 - 13:33:45 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:50 UTC