[CURRENT]: weird memory/linker problem?

From: O. Hartmann <ohartman_at_zedat.fu-berlin.de>
Date: Sun, 22 Jun 2014 16:56:39 +0200
Hello.

I am facing a strange problem on a set of boxes running CURRENT. The systems in question
all run the same version of CURRENT (more or less; they differ by a week or so).

The boxes affected have 8 GB of RAM and are old-style Core2Duo systems.

The phenomenon:

Starting up the box shows the operating system working. But sometimes it is impossible to
start certain applications, like Firefox - they segfault. More disturbing is the failure of
the linker when building world. Sometimes I get strange messages like

relocation truncated to fit: R_X86_64_PC32 against symbol `__error' defined in .text

when compiling/linking. The funny thing is: rebooting the box and doing exactly the same
thing very often leaves the system operable - starting applications works, compiling works!

First I thought this could be an indication of a dying system, so I checked the memory
for two days non-stop without any indication of anything wrong. The boxes do not have ECC
RAM - it's Intel.

I see this problem relatively often on two C2D based boxes (one a dual-core E8400, the
other a quad-core Q6600; both systems have 8 GB RAM). This phenomenon also occurred two or
three months ago on another machine with 32 GB RAM and a Core-i7 3930K, but it went away
(it was the very same error as shown above).

Another system, an i3-3220 with 16 GB RAM, has never shown the problem, although that
system builds world on a regular basis just as frequently as the C2D systems do.

Well, I feel a bit confused. At first glance, the problem looks weird and it indicates
a kind of memory problem - but testing the memory didn't show anything wrong.

Today "windowmaker" stopped starting due to a malformed command in one of windowmaker's
libraries. I rebooted the box and everything was all right. Then, also today, I tried
compiling world and got a weird error message about a misspelled "Int__xxx" (I cannot
remember the exact text); I rebooted and everything was all right again.

Those errors are frequent on the 8 GB, C2D based systems and at the moment no longer
present on the more modern systems with more memory described above. This could be a
coincidence, but it is strange anyway.

I do not exclude dying hardware, but I'd like to ask whether there is something strange
going on with FreeBSD's memory management at the moment and whether those problems could
also be triggered by some nasty bug. I never see a crash (which would also indicate
faulty hardware); I mostly notice this strange behaviour either after a fresh boot or
after I have run some memory and disk I/O intensive jobs, like updating the ports tree.

By the way, FreeBSD CURRENT suffers from a tremendous performance cut these days when
compiling world while updating the ports tree and running portmaster. On one box, on which
ports reside on a UFS partition, portmaster -da takes more than 8 minutes to complete,
although it is quick when world is not being compiled. On another system, on which
/usr/ports resides on ZFS (the i3-3220 box with 16 GB RAM!), an "svn update" sometimes
takes 30(!) minutes while compiling world; it takes 6 - 15 minutes when the box is relaxed
and the ports tree is updated for the first time (every subsequent update is much
faster).

Well, I know these reports of mine are a bit weird since I have no exact log of the
problems, but I think that if there is an issue that is not with the hardware, I should
report it.
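
One way such a log could be collected is to record checksums of the affected binaries
and libraries while the system behaves, and compare them again when something segfaults
or the linker fails. If the digest of an unchanged file differs before a reboot and
matches again afterwards, that points at corrupted cached file contents rather than at
the file on disk. The following is only a minimal sketch under that assumption; the
function names and the manifest file are hypothetical and not part of any FreeBSD tool.

```python
import hashlib
import json
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(paths):
    """Map each readable regular file in paths to its digest."""
    result = {}
    for path in paths:
        try:
            if os.path.isfile(path):
                result[path] = sha256_of(path)
        except OSError:
            # Unreadable files are skipped rather than aborting the run.
            continue
    return result


def changed(old, new):
    """Return sorted paths present in both snapshots whose digests differ."""
    return sorted(p for p in old if p in new and old[p] != new[p])


def save_manifest(manifest_path, digests):
    """Write a snapshot to a JSON file (hypothetical manifest format)."""
    with open(manifest_path, "w") as handle:
        json.dump(digests, handle, indent=2, sort_keys=True)


def load_manifest(manifest_path):
    """Read a snapshot previously written by save_manifest."""
    with open(manifest_path) as handle:
        return json.load(handle)
```

A possible use: call snapshot() on the list of suspect files and save_manifest() the
result while everything works, then, when a failure shows up, take a second snapshot and
pass both to changed(). An empty list means the file contents read back identically, so
the cause would have to be looked for elsewhere.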

Regards,

oh

Received on Sun Jun 22 2014 - 13:00:41 UTC