Re: buildworld on CPU-A, installworld on CPU-B ends up with SIGILL

From: Ruslan Garipov <ruslanngaripov_at_gmail.com>
Date: Mon, 25 Nov 2019 23:26:46 +0500
On 11/25/2019 10:30 PM, Miroslav Lachman wrote:
> Ruslan Garipov wrote on 2019/11/25 15:06:
>> Hello.
>>
>> I want to build kernel and world (of FreeBSD 13.0-CURRENT) on a fast
>> virtual machine for other ones (all the virtual machines are hosted on
>> VMware ESXi hypervisors, which have different physical CPUs).
>>
>> After `make -j16 buildworld` finishes successfully on the build
>> machine, some of the resulting programs, for example
>> /usr/obj/usr/src/amd64.amd64/tmp/legacy/bin/install, contain the
>> shlxq instruction (part of the BMI2 instruction set extension). This
>> eventually causes make installkernel and make installworld to fail with
>> SIGILL on a virtual machine that has to consume the built world and
>> kernel and that is hosted on another ESXi instance with an older
>> physical CPU (read: a CPU that does not implement shlxq).
>>
>> The build machine (FreeBSD 13.0-CURRENT r354802) builds (x)install using
>> the following commands (a part of buildworld):
>>
>> $ cc -O2 -pipe -O2 -march=x86-64 -pipe -I/usr/src/contrib/mtree
>> -I/usr/src/lib/libnetbsd -DHAVE_STRUCT_STAT_ST_FLAGS=1 -MD
>> -MF.depend.xinstall.o -MTxinstall.o -std=gnu99 -Wno-format-zero-length
>> -Qunused-arguments -I/usr/obj/usr/src/amd64.amd64/tmp/legacy/usr/include
>> -c /usr/src/usr.bin/xinstall/xinstall.c -o xinstall.o
>> $ cc -O2 -pipe -O2 -march=x86-64 -pipe -I/usr/src/contrib/mtree
>> -I/usr/src/lib/libnetbsd -DHAVE_STRUCT_STAT_ST_FLAGS=1 -MD
>> -MF.depend.getid.o -MTgetid.o -std=gnu99 -Wno-format-zero-length
>> -Qunused-arguments -I/usr/obj/usr/src/amd64.amd64/tmp/legacy/usr/include
>> -c /usr/src/contrib/mtree/getid.c -o getid.o
>> $ cc -O2 -pipe -O2 -march=x86-64 -pipe -I/usr/src/contrib/mtree
>> -I/usr/src/lib/libnetbsd -DHAVE_STRUCT_STAT_ST_FLAGS=1 -std=gnu99
>> -Wno-format-zero-length -Qunused-arguments
>> -I/usr/obj/usr/src/amd64.amd64/tmp/legacy/usr/include -static
>> -L/usr/obj/usr/src/amd64.amd64/tmp/legacy/usr/lib -o xinstall xinstall.o
>> getid.o -L/usr/obj/usr/src/amd64.amd64/tmp/obj-tools/lib/libmd -lmd -legacy
>>
>> This produces an xinstall binary containing shlxq instructions:
>>
>> $ llvm-objdump --disassemble xinstall | grep -c shlxq
>> 135
>>
>> I believe the statically linked /usr/lib/libmd.a is what brings the
>> shlxq instructions into xinstall.  I didn't examine it further, sorry...
>>
>> My question is: is it possible to buildworld without emitting
>> instructions that are specific to the build host's CPU?  I have neither
>> /etc/make.conf nor /etc/src.conf on the build machine.  Is it possible
>> at all for FreeBSD CURRENT to be built on one host and installed on
>> another host later?
>>
>> Just as a reference:
>>
>> My build machine has an Intel(R) Xeon(R) Gold 6150 CPU, which supports BMI2:
>>
>> # cpucontrol -i 7 /dev/cpuctl0
>> cpuid level 0x7: 0x00000000 0xd19f6ffb 0x00000018 0xbc000000
>>
>> (Bit 08 in EBX is set.)
>>
>> The consuming machine has an Intel(R) Xeon(R) CPU E5-4617, which
>> doesn't support BMI2:
>>
>> # cpucontrol -i 7 /dev/cpuctl0
>> cpuid level 0x7: 0x00000000 0x00000002 0x00000000 0xbc000000
>>
>> (Bit 08 in EBX is unset).
>>
>> Both machines install the kernel and world without any problem when
>> these are built locally.
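
The BMI2 check quoted above can be reproduced by hand from the cpucontrol output. This is only a minimal sketch, assuming the second value printed by cpucontrol is EBX (which matches the bit values described above); has_bmi2 is a hypothetical helper name, not an existing tool:

```shell
# Hypothetical helper: report whether bit 8 (BMI2) is set in the
# EBX value returned by CPUID leaf 7.
has_bmi2() {
    # $1 is the EBX value as printed by cpucontrol, e.g. 0xd19f6ffb
    if [ $(( ($1 >> 8) & 1 )) -eq 1 ]; then
        echo yes
    else
        echo no
    fi
}

has_bmi2 0xd19f6ffb   # build machine (Xeon Gold 6150): prints yes
has_bmi2 0x00000002   # consuming machine (Xeon E5-4617): prints no
```

Running the same check against the EBX value of every target host would show in advance which machines would hit SIGILL on a BMI2 instruction such as shlxq.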
> 
> I haven't tried this with CURRENT, but I am using it with STABLE (11.3
> at this time). I build on a Xeon E3-1240v3 and install on many different
> machines. Some of them are 10+ year old AMD Opterons, some are Xeon
> E5649s, some are 10-year-old Intel Pentiums.
> So at least it worked in the past (11.3 amd64). Did you use this
> workflow in the past, and did it work?
No, unfortunately I didn't.  I have always built world/kernel on the
target host.

> I remember some issue in the past which was (accidentally?) fixed by
> running "make buildworld && make buildkernel && make installkernel &&
> make installworld" on the build host (to some different DESTDIR) and
> then "make installkernel && make installworld" on the target host (the
> build machine is shared via NFS).
So this trick somehow "fixes" the /usr/obj shared on the build
machine?  I'll try this later.  Thanks!

> 
> Miroslav Lachman
> 
Received on Mon Nov 25 2019 - 17:26:53 UTC
