Re: A head buildworld race visible in the ci.freebsd.org build history

From: Bryan Drewery <bdrewery_at_FreeBSD.org>
Date: Mon, 18 Jun 2018 15:33:56 -0700
On 6/18/2018 3:31 PM, Li-Wen Hsu wrote:
> On Mon, Jun 18, 2018 at 6:27 PM Bryan Drewery <bdrewery_at_freebsd.org> wrote:
>>
>> On 6/18/2018 1:45 PM, Konstantin Belousov wrote:
>>> On Mon, Jun 18, 2018 at 12:42:46PM -0700, Bryan Drewery wrote:
>>>> On 6/15/2018 10:55 PM, Mark Millard wrote:
>>>>> In watching ci.freebsd.org builds I've seen a notable
>>>>> number of one time failures, such as (example from
>>>>> powerpc64):
>>>>>
>>>>> --- all_subdir_lib/libufs ---
>>>>> ranlib -D libufs.a
>>>>> ranlib: fatal: Failed to open 'libufs.a'
>>>>> *** [libufs.a] Error code 70
>>>>>
>>>>> where the next build works despite the change being
>>>>> irrelevant to whatever ranlib complained about.
>>>>>
>>>>> Other builds failed similarly:
>>>>>
>>>>> --- all_subdir_lib/libbsm ---
>>>>> ranlib -D libbsm_p.a
>>>>> ranlib: fatal: Failed to open 'libbsm_p.a'
>>>>> *** [libbsm_p.a] Error code 70
>>>>>
>>>>> and:
>>>>>
>>>>> --- kerberos5/lib__L ---
>>>>> ranlib -D libgssapi_spnego_p.a
>>>>> --- libgssapi_spnego.a ---
>>>>> ranlib -D libgssapi_spnego.a
>>>>> --- libgssapi_spnego_p.a ---
>>>>> ranlib: fatal: Failed to open 'libgssapi_spnego_p.a'
>>>>> *** [libgssapi_spnego_p.a] Error code 70
>>>>>
>>>>> and so on.
>>>>>
>>>>>
>>>>> It is not limited to powerpc64. For example, for aarch64
>>>>> there are:
>>>>>
>>>>> --- libpam_exec.a ---
>>>>> building static pam_exec library
>>>>> ar -crD libpam_exec.a `NM='nm' NMFLAGS=''  lorder pam_exec.o  | tsort -q`
>>>>> ranlib -D libpam_exec.a
>>>>> ranlib: fatal: Failed to open 'libpam_exec.a'
>>>>> *** [libpam_exec.a] Error code 70
>>>>>
>>>>> and:
>>>>>
>>>>> --- all_subdir_lib/libusb ---
>>>>> ranlib -D libusb.a
>>>>> ranlib: fatal: Failed to open 'libusb.a'
>>>>> *** [libusb.a] Error code 70
>>>>>
>>>>> and:
>>>>>
>>>>> --- all_subdir_lib/libbsnmp ---
>>>>> ranlib: fatal: Failed to open 'libbsnmp.a'
>>>>> --- all_subdir_lib/ncurses ---
>>>>> --- all_subdir_lib/ncurses/panelw ---
>>>>> --- panel.pico ---
>>>>> --- all_subdir_lib/libbsnmp ---
>>>>> *** [libbsnmp.a] Error code 70
>>>>>
>>>>>
>>>>> Even amd64 gets such:
>>>>>
>>>>> --- libpcap.a ---
>>>>> ranlib -D libpcap.a
>>>>> ranlib: fatal: Failed to open 'libpcap.a'
>>>>> *** [libpcap.a] Error code 70
>>>>>
>>>>> and:
>>>>>
>>>>>
>>>>> --- libkafs5.a ---
>>>>> ranlib: fatal: Failed to open 'libkafs5.a'
>>>>> --- libkafs5_p.a ---
>>>>> ranlib: fatal: Failed to open 'libkafs5_p.a'
>>>>> --- cddl/lib__L ---
>>>>> /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/lua/lbaselib.c:60:26: note: include the header <ctype.h> or explicitly provide a declaration for 'toupper'
>>>>> --- kerberos5/lib__L ---
>>>>> *** [libkafs5_p.a] Error code 70
>>>>>
>>>>> make[5]: stopped in /usr/src/kerberos5/lib/libkafs5
>>>>> --- libkafs5.a ---
>>>>> *** [libkafs5.a] Error code 70
>>>>>
>>>>> and:
>>>>>
>>>>>
>>>>> --- lib__L ---
>>>>> ranlib -D libclang_rt.asan_cxx-i386.a
>>>>> ranlib: fatal: Failed to open 'libclang_rt.asan_cxx-i386.a'
>>>>> *** [libclang_rt.asan_cxx-i386.a] Error code 70
>>>>>
>>>>>
>>>>> (Notice the variability in which .a file the ranlib fails for.)
>>>>>
>>>>
>>>> I looked at this a few days ago and don't believe it's actually a build
>>>> race. I think there is something wrong with the ar/ranlib on that system
>>>> or something else. I've found no evidence of concurrent building of the
>>>> .a files in question.
>>>
>>> FWIW, I got a similar failure when I did the last checks for the OFED
>>> commit.  For me, it was libgcc.a.
>>>
>>
>> If it was -lgcc_s then it's a known rare build race due to
>> tools/install.sh not handling -S.
> 
> It seems to be a more general problem. This one:
> 
> https://ci.freebsd.org/job/FreeBSD-head-aarch64-build/8190/console
> 
> calls for libcuse_p.a, while this one:
> 
> https://ci.freebsd.org/job/FreeBSD-head-mips-build/2919/console
> 
> calls for libfifolog.a
> 

Well, why is ar -> ranlib so special? Nothing else is failing.
What filesystem are these using for objdirs?
What revision is the host kernel?
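
For anyone collecting more data points across jobs, here is a minimal
sketch (not part of any CI tooling) that tallies which archives ranlib
failed to open in a saved console log. The function name and the sample
text are hypothetical; the pattern is copied from the failure lines
quoted above.

```python
import re
from collections import Counter

# Matches the failure line as it appears in the quoted console output.
RANLIB_FAIL = re.compile(r"ranlib: fatal: Failed to open '([^']+)'")


def tally_ranlib_failures(log_text):
    """Count how often each archive name shows up in a ranlib open failure."""
    return Counter(RANLIB_FAIL.findall(log_text))


# Hypothetical sample, shaped like the amd64 excerpt quoted above.
sample = """--- libkafs5.a ---
ranlib: fatal: Failed to open 'libkafs5.a'
--- libkafs5_p.a ---
ranlib: fatal: Failed to open 'libkafs5_p.a'
*** [libkafs5_p.a] Error code 70
"""

for name, count in sorted(tally_ranlib_failures(sample).items()):
    print(name, count)
```

Run over several saved console logs, the counts would show whether the
failures cluster on particular libraries or are spread evenly, which is
what a host-side problem would suggest.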

-- 
Regards,
Bryan Drewery


Received on Mon Jun 18 2018 - 20:34:01 UTC
