Re: C++ in jemalloc

From: Mark Millard <markmi_at_dsl-only.net>
Date: Fri, 6 Oct 2017 09:28:37 -0700
On 2017-Oct-6, at 7:15 AM, Justin Hibbits <jrh29 at alumni.cwru.edu> wrote:

> Hi Mark,
> 
> On Thu, Oct 5, 2017 at 11:58 PM, Mark Millard <markmi_at_dsl-only.net> wrote:
>> Warner Losh imp at bsdimp.com wrote on
>> Thu Oct 5 21:01:26 UTC 2017 :
>> 
>>> Starting in FreeBSD 11, arm and powerpc are supported by clang,
>>> but not super well. For FreeBSD 12, we're getting close for everything
>>> except sparc64 (whose fate has not yet been finally decided).
>> 
>> My understanding of the powerpc and powerpc64 status
>> follows. This is based on my use of head via clang
>> as much as I can for them.
>> 
>> For TARGET_ARCH=powerpc64 and TARGET_ARCH=powerpc :
>> 
>> lld is far from working last I knew. (I've focused
>> more on the compilers for testing, using other
>> linkers and such.) [lldb may be in a similar state
>> for powerpc64. It does not build for powerpc.]
>> 
>> clang 5 (and 4) generated code crashes on any thrown
>> C++ exception. For example:
>> 
>> #include <exception>
>> 
>> int main(void)
>> {
>>    try { throw std::exception(); }
>>    catch (std::exception& e) {}
>>    return 0;
>> }
>> 
>> crashes.
>> 
>> Luckily most kernel and world code that I actively use
>> does not throw C++ exceptions in my use.
> 
> Do you see this problem using libstdc++, et al, from base gcc 4.2.1?
> Or using libc++?

gcc 4.2.1 buildkernel buildworld work fine for anything that I've
tried. They are libstdc++ based.

I've not tried clang with libstdc++, just libc++. (Note: clang 3.8,
3.9, 4.0, and 5.0 have all had the problem. My llvm bug submittals
tend to be from the earlier time frame. Many of my submittals for
other types of issues have been addressed.)

But my llvm bugzilla submittals for C++ exceptions indicate
errors/incompletenesses in the DW_CFA_<?> generation, such as
for scratch register handling. (Warning: I've not been through
the details in some time so this is from a vague memory.) 26844
and 26856 are the relevant ones if I remember right. 31590 might
be relevant depending on what libunwind is to be used.

Be warned that I do not believe Roman Divacky agrees with my
interpretation and I'd never studied the exception handling
techniques prior to these investigations. Still I think that
I was correct about there being problems in the DW_CFA_<?>
sequences generated.

For a separate issue, llvm 31716 is relevant for .plt and the
function descriptor layout. I use Roman Divacky's patch listed in
Comment 1; it is included below as well.

The llvm patches that I have are both from Roman as I remember:

Index: /usr/src/contrib/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
===================================================================
--- /usr/src/contrib/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp       (revision 324071)
+++ /usr/src/contrib/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp       (working copy)
@@ -1178,7 +1178,7 @@
       // For SVR4, don't emit a move for the CR spill slot if we haven't
       // spilled CRs.
       if (isSVR4ABI && (PPC::CR2 <= Reg && Reg <= PPC::CR4)
-          && !MustSaveCR)
+          && (!MustSaveCR && isPPC64))
         continue;
 
       // For 64-bit SVR4 when we have spilled CRs, the spill location
Index: /usr/src/contrib/llvm/tools/lld/ELF/Arch/PPC64.cpp
===================================================================
--- /usr/src/contrib/llvm/tools/lld/ELF/Arch/PPC64.cpp  (revision 324071)
+++ /usr/src/contrib/llvm/tools/lld/ELF/Arch/PPC64.cpp  (working copy)
@@ -60,7 +60,8 @@
 static uint16_t applyPPCHighesta(uint64_t V) { return (V + 0x8000) >> 48; }
 
 PPC64::PPC64() {
-  PltRel = GotRel = R_PPC64_GLOB_DAT;
+  GotRel = R_PPC64_GLOB_DAT;
+  PltRel = R_PPC64_JMP_SLOT;
   RelativeRel = R_PPC64_RELATIVE;
   GotEntrySize = 8;
   GotPltEntrySize = 8;



> I don't have the time right now to look into it, but if no one else is
> able to in the next couple months I'll try to make the time when
> higher priorities are done.

Are you familiar with what the DW_CFA_<?> sequences should
be like given the powerpc scratch register usage and the
like?

>> But devel/kyua is majorly broken by the C++ exception
>> issue: It makes extensive use of C++ exceptions. In my
>> view that disqualifies clang as being "close": I view
>> my activity as a hack until devel/kyua is generally
>> operable and so available for use in testing.
>> 
>> clang 5 currently can not build the TARGET_ARCH=powerpc
>> kernel. (I was able to back in clang 4 days --but the
>> resultant build failed to execute init fully after
>> finishing the prior boot activity.) For the 32-bit
>> context I use gcc 4.2.1 for building the kernel and
>> clang 5 for building the world, system binutils
>> in use in both cases.
> 
> What problem(s) do you see with this?  If they're just compile time
> failures they can be fixed pretty readily.

I submitted FreeBSD bugzilla 221107 for the:

R_PPC_PLTREL24 reloc against local symbol

failures. This was using system binutils.

In parallel builds it is a race between agp.* vs.
aha.* reporting this and stopping the build.


>> I do build the kernel and world for
>> TARGET_ARCH=powerpc64 via system clang 5. But I
>> use ports binutils as well in order for this to
>> finish and work overall.
>> 
>> 
>> As for more modern devel/powerpc64-xtoolchain-gcc
>> (so devel/powerpc64-gcc) versions being used to
>> build the world and kernel for powerpc64 I've never
>> been able to get lib32 on powerpc64 to work via
>> such a build: it builds but fails to execute from
>> dereferencing via an R30 that has an inappropriate
>> value for the attempt ( lib32/crtbeginS.o code in
>> _init in /usr/lib32/libc.so.* ). (The clang-based
>> builds do not have this problem.) It has been a
>> while since I've done devel/powerpc64-gcc
>> experiments.
>> 
>> I'm not aware of a devel/powerpc-xtoolchain-gcc
>> as an alternative for 32-bit powerpc targeting.
> 
> There's documentation floating around (on the wiki maybe?) for doing
> this.  I won't check now, but it's not difficult (not trivial, but not
> difficult).  With the proposal to eliminate gcc 4.2.1 from our tree by
> the end of the year, we need to get everything in place to make a
> seamless transition, whether it be to external toolchain or to finish
> up clang for powerpc.  I really hope we can finish up clang.  Please
> continue to file bugs with as much detail as necessary to track down
> and fix the problems--both in FreeBSD and upstream LLVM.

I've never run into instructions for targeting 32-bit
powerpc FreeBSD via some gcc vintage/variant. As I
remember: When I tried I failed to figure out how to
get devel/powerpc64-gcc and related things to produce
what was needed. But I've not retried in a long time.



My intended primary environment for FreeBSD build
activity is to be unavailable for a month or so. So
I'm currently limited to slower alternatives. Another
amd64 that I otherwise use for cross builds looks like
it will have to go out for repair.

I did finally, and recently, get both 32-bit and 64-bit
powerpc to jump from well before INO64 to something fairly
modern: head -r324071. Even the old iMac G3 boots the
32-bit variant.


You might want to stop reading here.

Details of the /usr/src that my activity is based on
(some of the list below is not for the powerpc
families):

# svnlite status /usr/src | sort
?       /usr/src/sys/amd64/conf/GENERIC-DBG
?       /usr/src/sys/amd64/conf/GENERIC-NODBG
?       /usr/src/sys/arm/conf/GENERIC-DBG
?       /usr/src/sys/arm/conf/GENERIC-NODBG
?       /usr/src/sys/arm64/conf/GENERIC-DBG
?       /usr/src/sys/arm64/conf/GENERIC-NODBG
?       /usr/src/sys/powerpc/conf/GENERIC64vtsc-DBG
?       /usr/src/sys/powerpc/conf/GENERIC64vtsc-NODBG
?       /usr/src/sys/powerpc/conf/GENERICvtsc-DBG
?       /usr/src/sys/powerpc/conf/GENERICvtsc-NODBG
M       /usr/src/contrib/llvm/lib/Target/PowerPC/PPCFrameLowering.cpp
M       /usr/src/contrib/llvm/tools/lld/ELF/Arch/PPC64.cpp
M       /usr/src/crypto/openssl/crypto/armcap.c
M       /usr/src/lib/Makefile
M       /usr/src/lib/libkvm/kvm_powerpc.c
M       /usr/src/lib/libkvm/kvm_private.c
M       /usr/src/sys/arm64/arm64/identcpu.c
M       /usr/src/sys/arm64/arm64/mp_machdep.c
M       /usr/src/sys/boot/ofw/Makefile.inc
M       /usr/src/sys/boot/powerpc/Makefile.inc
M       /usr/src/sys/boot/powerpc/boot1.chrp/Makefile
M       /usr/src/sys/boot/powerpc/kboot/Makefile
M       /usr/src/sys/boot/uboot/Makefile.inc
M       /usr/src/sys/conf/kmod.mk
M       /usr/src/sys/conf/ldscript.powerpc
M       /usr/src/sys/ddb/db_main.c
M       /usr/src/sys/ddb/db_script.c
M       /usr/src/sys/kern/subr_pcpu.c
M       /usr/src/sys/powerpc/aim/mmu_oea64.c
M       /usr/src/sys/powerpc/ofw/ofw_machdep.c
M       /usr/src/sys/powerpc/powerpc/interrupt.c
M       /usr/src/sys/powerpc/powerpc/mp_machdep.c
M       /usr/src/sys/powerpc/powerpc/trap.c

I do have some patches trying to catch a problem
that I saw earlier for 32-bit powerpc FreeBSD on
old G5 PowerMacs but I've not seen the issue in a
long while. I also did something crude to libkvm to
get basic raw memory dumps working for that context
so I could examine stacks and other things. So some
of the powerpc C code files above just have some
sort of additional cross check code that is not
generally relevant.

First I list the changes more directly tied to building
in various ways, not the ones associated with those
cross checks:

Index: /usr/src/lib/Makefile
===================================================================
--- /usr/src/lib/Makefile       (revision 324071)
+++ /usr/src/lib/Makefile       (working copy)
@@ -158,7 +158,7 @@
 .if ${MK_LIBCPLUSPLUS} != "no"
 _libcxxrt=     libcxxrt
 _libcplusplus= libc++
-.if ${MACHINE_CPUARCH} != "arm" && ${MACHINE_CPUARCH} != "mips"
+.if ${MACHINE_CPUARCH} != "arm" && ${MACHINE_CPUARCH} != "mips" && ${MACHINE_CPUARCH} != "powerpc"
 _libcplusplus+=        libc++experimental
 .endif
 .endif

Index: /usr/src/sys/boot/ofw/Makefile.inc
===================================================================
--- /usr/src/sys/boot/ofw/Makefile.inc  (revision 324071)
+++ /usr/src/sys/boot/ofw/Makefile.inc  (working copy)
@@ -2,7 +2,7 @@
 
 .if ${MACHINE_ARCH} == "powerpc64"
 CFLAGS+=       -m32 -mcpu=powerpc
-LDFLAGS+=      -m elf32ppc_fbsd
+LDFLAGS+=      -Wl,-m -Wl,elf32ppc_fbsd
 .endif
 
 .include "../Makefile.inc"

Index: /usr/src/sys/boot/powerpc/Makefile.inc
===================================================================
--- /usr/src/sys/boot/powerpc/Makefile.inc      (revision 324071)
+++ /usr/src/sys/boot/powerpc/Makefile.inc      (working copy)
@@ -2,6 +2,7 @@
 
 .if ${MACHINE_ARCH} == "powerpc64"
 CFLAGS+=       -m32 -mcpu=powerpc
+LDFLAGS+=      -Wl,-m -Wl,elf32ppc_fbsd
 .endif
 
 .include "../Makefile.inc"

Index: /usr/src/sys/boot/powerpc/boot1.chrp/Makefile
===================================================================
--- /usr/src/sys/boot/powerpc/boot1.chrp/Makefile       (revision 324071)
+++ /usr/src/sys/boot/powerpc/boot1.chrp/Makefile       (working copy)
@@ -8,7 +8,7 @@
 INSTALLFLAGS=   -b
 
 FILES=         boot1.hfs
-SRCS=          boot1.c ashldi3.c syncicache.c
+SRCS=          boot1.c qdivrem.c udivdi3.c ashldi3.c syncicache.c
 
 MAN=
 

Index: /usr/src/sys/boot/powerpc/kboot/Makefile
===================================================================
--- /usr/src/sys/boot/powerpc/kboot/Makefile    (revision 324071)
+++ /usr/src/sys/boot/powerpc/kboot/Makefile    (working copy)
@@ -69,8 +69,6 @@
 LIBFICL=       ${.OBJDIR}/../../ficl/libficl.a
 .endif
 
-CFLAGS+=       -mcpu=powerpc64
-
 # Always add MI sources
 .PATH:         ${.CURDIR}/../../common ${.CURDIR}/../../../libkern
 .include       "${.CURDIR}/../../common/Makefile.inc"
@@ -86,9 +84,6 @@
 
 LDFLAGS=       -nostdlib -static -T ${.CURDIR}/ldscript.powerpc
 
-# 64-bit bridge extensions
-CFLAGS+= -Wa,-mppc64bridge
-
 # Pull in common loader code
 #.PATH:                ${.CURDIR}/../../ofw/common
 #.include      "${.CURDIR}/../../ofw/common/Makefile.inc"

Index: /usr/src/sys/boot/uboot/Makefile.inc
===================================================================
--- /usr/src/sys/boot/uboot/Makefile.inc        (revision 324071)
+++ /usr/src/sys/boot/uboot/Makefile.inc        (working copy)
@@ -2,7 +2,7 @@
 
 .if ${MACHINE_ARCH} == "powerpc64"
 CFLAGS+=       -m32 -mcpu=powerpc
-LDFLAGS+=      -m elf32ppc_fbsd
+LDFLAGS+=      -Wl,-m -Wl,elf32ppc_fbsd
 .endif
 
 .include "../Makefile.inc"

Index: /usr/src/sys/conf/kmod.mk
===================================================================
--- /usr/src/sys/conf/kmod.mk   (revision 324071)
+++ /usr/src/sys/conf/kmod.mk   (working copy)
@@ -151,8 +151,12 @@
 .endif
 
 .if ${MACHINE_CPUARCH} == powerpc
+.if ${COMPILER_TYPE} == "gcc"
 CFLAGS+=       -mlongcall -fno-omit-frame-pointer
+.else
+CFLAGS+=       -fno-omit-frame-pointer
 .endif
+.endif
 
 .if ${MACHINE_CPUARCH} == mips
 CFLAGS+=       -G0 -fno-pic -mno-abicalls -mlong-calls


Index: /usr/src/sys/powerpc/ofw/ofw_machdep.c
===================================================================
--- /usr/src/sys/powerpc/ofw/ofw_machdep.c      (revision 324071)
+++ /usr/src/sys/powerpc/ofw/ofw_machdep.c      (working copy)
@@ -111,26 +111,27 @@
         * Assume that interrupt are disabled at this point, or
         * SPRG1-3 could be trashed
         */
-#ifdef __powerpc64__
-       __asm __volatile("mtsprg1 %0\n\t"
-                        "mtsprg2 %1\n\t"
-                        "mtsprg3 %2\n\t"
-                        :
-                        : "r"(ofmsr[2]),
-                        "r"(ofmsr[3]),
-                        "r"(ofmsr[4]));
-#else
-       __asm __volatile("mfsprg0 %0\n\t"
-                        "mtsprg0 %1\n\t"
-                        "mtsprg1 %2\n\t"
-                        "mtsprg2 %3\n\t"
-                        "mtsprg3 %4\n\t"
-                        : "=&r"(ofw_sprg0_save)
-                        : "r"(ofmsr[1]),
-                        "r"(ofmsr[2]),
-                        "r"(ofmsr[3]),
-                        "r"(ofmsr[4]));
+#ifndef __powerpc64__
+       if (!(cpu_features & PPC_FEATURE_64))
+               __asm __volatile("mfsprg0 %0\n\t"
+                                "mtsprg0 %1\n\t"
+                                "mtsprg1 %2\n\t"
+                                "mtsprg2 %3\n\t"
+                                "mtsprg3 %4\n\t"
+                                : "=&r"(ofw_sprg0_save)
+                                : "r"(ofmsr[1]),
+                                "r"(ofmsr[2]),
+                                "r"(ofmsr[3]),
+                                "r"(ofmsr[4]));
+       else
 #endif
+               __asm __volatile("mtsprg1 %0\n\t"
+                                "mtsprg2 %1\n\t"
+                                "mtsprg3 %2\n\t"
+                                :
+                                : "r"(ofmsr[2]),
+                                "r"(ofmsr[3]),
+                                "r"(ofmsr[4]));
 }
 
 static __inline void
@@ -147,7 +148,8 @@
         * PCPU data cannot be used until this routine is called !
         */
 #ifndef __powerpc64__
-       __asm __volatile("mtsprg0 %0" :: "r"(ofw_sprg0_save));
+       if (!(cpu_features & PPC_FEATURE_64))
+               __asm __volatile("mtsprg0 %0" :: "r"(ofw_sprg0_save));
 #endif
 }
 #endif
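
Reading the ofw_machdep.c diff above, a 32-bit build now decides at
run time, via cpu_features, whether SPRG0 is saved and replaced,
instead of deciding only at compile time. The following is a rough
user-space sketch of that control flow, not kernel code. Everything
prefixed sketch_/SKETCH_ is hypothetical: SKETCH_FEATURE_64 stands in
for PPC_FEATURE_64 (its value here is just an assumption for the
sketch), SKETCH_64BIT_BUILD stands in for __powerpc64__, and
sketch_sprgs_written() only records which SPRGs would be written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for PPC_FEATURE_64; the value is an assumption. */
#define SKETCH_FEATURE_64 0x40000000u

/*
 * Hypothetical helper: records which of SPRG0..SPRG3 the patched code
 * path would write. written[i] is 1 if SPRGi is written, otherwise 0.
 * Define SKETCH_64BIT_BUILD to mimic a __powerpc64__ build.
 */
void
sketch_sprgs_written(uint32_t cpu_features, int written[4])
{
	assert(written != NULL);
	(void)cpu_features;	/* unused when SKETCH_64BIT_BUILD is defined */

	/* SPRG1..SPRG3 are written on every path. */
	written[1] = written[2] = written[3] = 1;

#ifndef SKETCH_64BIT_BUILD
	/* 32-bit build: only a 32-bit-only CPU saves and replaces SPRG0. */
	if (!(cpu_features & SKETCH_FEATURE_64))
		written[0] = 1;
	else
#endif
		written[0] = 0;
}
```

In a 32-bit style build (SKETCH_64BIT_BUILD not defined), passing 0
for cpu_features marks all four SPRGs as written, while passing
SKETCH_FEATURE_64 leaves SPRG0 untouched. The "else" sitting just
before the #endif is the same trick the diff uses so that the 64-bit
build keeps only the final statement.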


The cross check code is code like the following
but should not be important outside my context:

+if ((((uintptr_t) frame) & 0x3) != 0x0) { panic("trap: frame misaligned"); } // HACK
+if ((void*) frame < (void*) 0x1000)     { panic("trap: frame too small"); }  // HACK

and:

+if ((((uintptr_t) framep) & 0x3) != 0x0) { panic("powerpc_interrupt: framep misaligned"); } // HACK
+if ((void*) framep < (void*) 0x1000)     { panic("powerpc_interrupt: framep too small"); }  // HACK

and:

avoiding VM_PROT_EXECUTE on most kernel pages with
no code (when PPC_FEATURE_64 is present):

        struct pvo_entry *pvo, *oldpvo;
 
        pvo = alloc_pvo_entry(0);
+#if defined(AIM) && !defined(__powerpc64__)
+       if (cpu_features & PPC_FEATURE_64)
+       {
+               if ( va < ((vm_offset_t)(etext+(PAGE_SIZE-1)) & ~PAGE_MASK) )
+                       pvo->pvo_pte.prot = VM_PROT_READ | VM_PROT_WRITE | VM_PROT_EXECUTE;
+
+               else if (  ((vm_offset_t)_GOT_START_ & ~PAGE_MASK) <= va
+                       && va < ((vm_offset_t)(_GOT_END_+(PAGE_SIZE-1)) & ~PAGE_MASK)
+                       )
+                       pvo->pvo_pte.prot = VM_PROT_READ | VM_PROT_WRITE | VM_PROT_EXECUTE;
+
+               else if ( va < (__endkernel & ~PAGE_MASK) )
+                       pvo->pvo_pte.prot = VM_PROT_READ | VM_PROT_WRITE;
+
+               else // Otherwise do as before the HACK:
+                       pvo->pvo_pte.prot = VM_PROT_READ | VM_PROT_WRITE | VM_PROT_EXECUTE;
+       }
+       else
+#endif
        pvo->pvo_pte.prot = VM_PROT_READ | VM_PROT_WRITE | VM_PROT_EXECUTE;
        pvo->pvo_pte.pa = (pa & ~ADDR_POFF) | moea64_calc_wimg(pa, ma);
        pvo->pvo_vaddr |= PVO_WIRED;

combined with:

Index: /usr/src/sys/conf/ldscript.powerpc
===================================================================
--- /usr/src/sys/conf/ldscript.powerpc  (revision 324071)
+++ /usr/src/sys/conf/ldscript.powerpc  (working copy)
@@ -24,6 +24,9 @@
   _etext = .;
   PROVIDE (etext = .);
 
+  /* Force after this to start on a separate page from what is *before* _etext/etext */
+  . = ((. + 0x1000 - 1) & ~(0x1000 - 1));
+
   .interp     : { *(.interp)   }
   .hash          : { *(.hash)          }
   .dynsym        : { *(.dynsym)                }
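
The page arithmetic in the mmu_oea64.c hack and in the ldscript line
can be tried out in isolation. Below is a minimal sketch, assuming
4 KiB pages and 32-bit addresses; the sketch_ names are hypothetical
and the parameters simply stand in for etext, _GOT_START_, _GOT_END_
and __endkernel. It mirrors the order of the tests in the hack: text
(rounded up to a page) and the GOT pages stay executable, the rest of
the kernel image becomes read/write only, and anything past the image
is treated as before.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 0x1000u	/* 4 KiB, matching the 0x1000 in the ldscript */
#define SKETCH_PAGE_MASK (SKETCH_PAGE_SIZE - 1u)

enum sketch_prot { SKETCH_RW, SKETCH_RWX };

/* Round addr up to the next page boundary (same form as the ldscript line). */
uint32_t
sketch_round_page(uint32_t addr)
{
	return ((addr + SKETCH_PAGE_MASK) & ~SKETCH_PAGE_MASK);
}

/* Truncate addr down to the start of its page. */
uint32_t
sketch_trunc_page(uint32_t addr)
{
	return (addr & ~SKETCH_PAGE_MASK);
}

/* Pick a protection for kernel virtual address va, in the hack's test order. */
enum sketch_prot
sketch_kernel_prot(uint32_t va, uint32_t etext, uint32_t got_start,
    uint32_t got_end, uint32_t endkernel)
{
	assert(etext <= endkernel);

	if (va < sketch_round_page(etext))
		return (SKETCH_RWX);	/* pages holding kernel text */
	if (sketch_trunc_page(got_start) <= va &&
	    va < sketch_round_page(got_end))
		return (SKETCH_RWX);	/* pages holding the GOT */
	if (va < sketch_trunc_page(endkernel))
		return (SKETCH_RW);	/* rest of the image: no execute */
	return (SKETCH_RWX);		/* otherwise as before the hack */
}
```

Because the ldscript change pushes everything after _etext/etext onto
a fresh page, rounding etext up to a page boundary in the first test
does not sweep any non-text sections into the executable range.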


As for the libkvm hacks to get raw memory dumps:

Index: /usr/src/lib/libkvm/kvm_powerpc.c
===================================================================
--- /usr/src/lib/libkvm/kvm_powerpc.c   (revision 324071)
+++ /usr/src/lib/libkvm/kvm_powerpc.c   (working copy)
@@ -209,6 +209,53 @@
        if (be32toh(vm->ph->p_paddr) == 0xffffffff)
                return ((int)powerpc_va2off(kd, va, ofs));
 
+       // HACK in something for what I observe in
+       // a debug.minidump=0 vmcore.* for 32-bit powerpc
+       //
+       if (  be32toh(vm->ph->p_vaddr)  == 0xffffffff
+          && be32toh(vm->ph->p_paddr)  == 0
+          && be16toh(vm->eh->e_phnum)  == 1
+          ) {
+               // Presumes p_memsz is either unsigned
+               // 32-bit or is 64-bit, same for va .
+
+               if (be32toh(vm->ph->p_memsz) <= va)
+                       return 0; // Like powerpc_va2off
+
+               // If ofs was (signed) 32-bit there
+               // would be a problem for sufficiently
+               // large positive memsz's and va's
+               // near the end --because of p_offset
+               // and dmphdrsz causing overflow/wrapping
+               // for some large va values.
+               // Presumes 64-bit ofs for such cases.
+               // Also presumes dmphdrsz+p_offset
+               // is non-negative so that small
+               // non-negative va values have no
+               // problems with ofs going negative.
+
+               *ofs =    vm->dmphdrsz
+                       + be32toh(vm->ph->p_offset)
+                       + va;
+
+               // The normal return value overflows/wraps
+               // for p_memsz == 0x80000000u when va == 0 .
+               // Avoid this by depending on calling code's
+               // loop for sufficiently large cases.
+               // This code presumes p_memsz/2 <= INT_MAX .
+               // 32-bit powerpc FreeBSD does not allow
+               // using more than 2 GiBytes of RAM but
+               // does allow using 2 GiBytes on 64-bit
+               // hardware.
+               //
+               if (  (int)be32toh(vm->ph->p_memsz) < 0
+                  && va < be32toh(vm->ph->p_memsz)/2
+                  )
+                       return be32toh(vm->ph->p_memsz)/2;
+
+               return be32toh(vm->ph->p_memsz) - va;
+       }
+
        _kvm_err(kd, kd->program, "Raw corefile not supported");
        return (0);
 }
Index: /usr/src/lib/libkvm/kvm_private.c
===================================================================
--- /usr/src/lib/libkvm/kvm_private.c   (revision 324071)
+++ /usr/src/lib/libkvm/kvm_private.c   (working copy)
@@ -128,7 +128,9 @@
 {
 
        return (kd->nlehdr.e_ident[EI_CLASS] == class &&
-           kd->nlehdr.e_type == ET_EXEC &&
+           (  kd->nlehdr.e_type == ET_EXEC ||
+              kd->nlehdr.e_type == ET_DYN
+           ) &&
            kd->nlehdr.e_machine == machine);
 }
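
The offset and return-value arithmetic in the kvm_powerpc.c hack can
be checked outside libkvm. The following is only a sketch under
assumptions: struct sketch_phdr and sketch_raw_va2off() are
hypothetical stand-ins, the header fields are taken to be already in
host byte order (so no be32toh()), and ofs is a 64-bit signed value
as the comments in the hack presume. The assert encodes the hack's
stated presumption that no more than 2 GiBytes of RAM is described.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the single program header, host byte order. */
struct sketch_phdr {
	uint32_t p_offset;
	uint32_t p_memsz;
};

/*
 * Returns how many bytes can be read starting at va and stores the
 * file offset in *ofs. Returns 0 when va is outside the dump.
 */
int
sketch_raw_va2off(const struct sketch_phdr *ph, uint64_t dmphdrsz,
    uint32_t va, int64_t *ofs)
{
	assert(ph != NULL && ofs != NULL);
	assert(ph->p_memsz <= 0x80000000u);	/* at most 2 GiBytes of RAM */

	if (ph->p_memsz <= va)
		return (0);

	/* 64-bit sum, so large va values do not wrap. */
	*ofs = (int64_t)dmphdrsz + ph->p_offset + va;

	/*
	 * p_memsz - va does not fit in an int for p_memsz == 0x80000000
	 * and small va, so report half and rely on the caller's loop.
	 */
	if (ph->p_memsz > (uint32_t)INT_MAX && va < ph->p_memsz / 2)
		return ((int)(ph->p_memsz / 2));

	return ((int)(ph->p_memsz - va));
}
```

For a 2 GiByte dump (p_memsz == 0x80000000) a request at va == 0
reports 0x40000000 readable bytes rather than a wrapped negative
count; the caller is expected to come back for the second half.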
 



===
Mark Millard
markmi at dsl-only.net
Received on Fri Oct 06 2017 - 14:28:42 UTC
