Re: Apparent race in buildworld (head/amd64, r322214 -> r322304)

From: Bryan Drewery <bdrewery_at_FreeBSD.org>
Date: Wed, 9 Aug 2017 12:22:20 -0700


On 8/9/2017 10:57 AM, David Wolfskill wrote:
> On Wed, Aug 09, 2017 at 10:49:04AM -0700, Bryan Drewery wrote:
>> ...
>>> on one machine, but the other never had an issue.  On the "failing" one,
>>> a re-start of the buildworld completed (apparently) successfully.
>>
>> Yeah, I've gotten reports of this one for years.  I fixed a few problems
>> with it in the past but something else must have crept in.
> 
> Or I just got "lucky." :-)
> 
>> I don't believe it is related to META_MODE though.
> 
> Fair enough; I pointed it out just in case it might be relevant.  (I try
> to avoid hiding possibly-relevant information when I'm trying to work
> with someone to solve a problem.  I know that's weird, but... :-} )
> 
>> The last time I fixed this (AFAIK) it was related to an early error
>> being ignored.  I'll review your log to see if I can find anything like
>> that.
> 
> Cool.  FWIW, the scheduler will see 8 cores on each machine, so the
> "make buildworld" will have been "make -j16 buildworld" (on each).
> 
>> ....
> 

This should fix it:
https://people.freebsd.org/~bdrewery/patches/gcc_s-install-race.diff

The problem has consistently been, from your reports, that gcc_s is
being installed to WORLDTMP *while* something is trying to link to it.

> --- gnu/lib/libgcc__L ---
> Building /common/S4/obj/usr/src/world32/usr/src/gnu/lib/libgcc/_libinstall
> --- kerberos5/lib/libhx509__L ---
> Building /common/S4/obj/usr/src/world32/usr/src/kerberos5/lib/libhx509/keyset.So
> --- secure/lib/libssl__L ---
> /usr/obj/usr/src/tmp/usr/bin/ld: cannot find -lgcc_s
> 
> 
> Building /common/S3/obj/usr/src/world32/usr/src/gnu/lib/libgcc/_libinstall
> --- lib/ncurses/ncursesw__L ---
> Building /common/S3/obj/usr/src/world32/usr/src/lib/ncurses/ncursesw/nc_panel.po
> --- lib/ncurses/ncurses__L ---
> Building /common/S3/obj/usr/src/world32/usr/src/lib/ncurses/ncurses/comp_parse.po
> --- lib/ncurses/ncursesw__L ---
> Building /common/S3/obj/usr/src/world32/usr/src/lib/ncurses/ncursesw/resizeterm.po
> --- lib/libc++__L ---
> /usr/obj/usr/src/tmp/usr/bin/ld: cannot find -lgcc_s
> 
> --- lib/libgcc_s__L ---
> Building /common/S4/obj/usr/src/world32/usr/src/lib/libgcc_s/_libinstall
> --- kerberos5/lib/libwind__L ---
> --- obj ---
> --- secure/lib/libcrypto__L ---
> --- all_subdir_secure/lib/libcrypto/engines/libatalla ---
> /usr/obj/usr/src/tmp/usr/bin/ld: cannot find -lgcc_s
> cc: error: linker command failed with exit code 1 (use -v to see invocation)
> --- all_subdir_secure/lib/libcrypto/engines/libsureware ---
> /usr/obj/usr/src/tmp/usr/bin/ld: cannot find -lgcc_s



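By default, install(1) unlinks the existing target and then copies the
new file into place, so there is a window during which the library is
missing or only partially written.  Setting PRECIOUSLIB adds the -S flag
to install, which copies to a temporary file and then renames it over
the target, making the installation atomic.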
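
To make the difference concrete, here is a rough Python sketch of the two
strategies.  It is only an illustration, not the install(1) source: the
function names unsafe_install and safe_install are made up for this
example, and the real tool also handles modes, ownership and flags.

```python
import os
import shutil
import tempfile


def unsafe_install(src, dst):
    """Default-style install: unlink the target, then copy the new file."""
    if os.path.exists(dst):
        os.unlink(dst)  # from here until the copy finishes, dst is missing or partial
    shutil.copyfile(src, dst)


def safe_install(src, dst):
    """-S style install: copy to a temporary file, then rename it over the target."""
    dst_dir = os.path.dirname(os.path.abspath(dst))
    # The temporary file lives in the target directory so that the final
    # rename stays on one filesystem and is therefore atomic.
    fd, tmp = tempfile.mkstemp(dir=dst_dir, prefix=".inst.")
    try:
        with os.fdopen(fd, "wb") as out, open(src, "rb") as inp:
            shutil.copyfileobj(inp, out)
        os.replace(tmp, dst)  # a concurrent reader sees the old file or the new one
    except BaseException:
        # Clean up the temporary file if the copy or rename failed.
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

With unsafe_install, a linker that opens the library between the unlink
and the end of the copy fails with "cannot find -lgcc_s" (or reads a
truncated file).  With safe_install the path always names a complete
file.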

Note that this patch is not what I will commit.  At Isilon we changed our
install to always use -S for library installation, without forcing schg
on.  I am considering making that the default: use -S for all
libraries.


-- 
Regards,
Bryan Drewery


Received on Wed Aug 09 2017 - 17:22:46 UTC
