My build work and goals

From: Bryan Drewery <bdrewery_at_FreeBSD.org>
Date: Thu, 10 Dec 2015 11:56:25 -0800
This mail is to outline my recent work, goals and motivations.  This is
long.  This is not really an architectural review or discussion mail.  I
am leaving those and details for their own threads where needed, so
please do not start discussions about any of the details here.  We will
have them later.  No one has really objected to my work but many have
asked what my goals are and if the churn is worth it.

A lot of my work lately has been to remove hacks in the build and use
normal mechanisms or framework to achieve the same thing.  Some of this
is to remove hacks from the META build.  Some of this is just natural
evolution of the framework since we haven't had a real maintainer for so
long.  That may be putting it wrong but I think it is fair to say we're
all a little scared of touching share/mk and try to do so as little as
possible.  Consider bsd.files.mk/bsd.incs.mk/bsd.confs.mk(new) are all
95% the same.  I have not yet combined them but plan to.  Why is SCRIPTS
only in bsd.prog.mk?  There are many problems in these files that need
fixing.  The vast majority of my work and "churn" lately has been
improving the META mode build which is not the default build.  So it may
appear that I am churning the build a lot when I am really not.  More on
the META build later.

First, an introduction and some background.

Who am I?  I came to the FreeBSD community as a Ports committer who was
primarily focused on building up the Ports framework and its build
tools.  I took over upstream maintenance of Portupgrade and then got
involved with Poudriere in its early stages and helped bring it to what
it is today.  I think it is fair to say I have been the maintainer of
Poudriere for some time now.  I am one of the maintainers of the
pkg.FreeBSD.org package builds and oversee its automation.  I've written
a few FreeBSD Journal articles about that.  99% of this was not
sponsored though.

For my day job at Isilon I recently moved my efforts to the base build.
 In many people's eyes the build system, 'buildworld', mostly "just
works".  The problems come in when "works" to you is time and productivity.

At Isilon we have a range of development efforts that span from
developers only caring about the kernel to ones who care about 1
userland tool. All of us expect that we should be able to just build our
1 thing rather than everything. Some of these 1 userland tool cases
though have hundreds of dependencies.  Most developers instinctively try
building manually rather than using Jenkins as it feels like it should be
quicker.  This leads to grief.  The problem comes down to productivity.
 I've been given a great opportunity to address these problems and am
running with it.

Isilon has a quite convoluted build.  Our product has its own ELF
brand/ABI/KBI and cannot run on native FreeBSD systems. The build is
done from a FreeBSD system for reasons, so is entirely a cross-build by
expectations.  We have a buildworld, ports phase, and then we have a
buildworld-type thing of stuff that depends on ports!  Both the ports
and ports-dependent pieces are built in a jail using a special hack of a
kernel module that provides KBI compatibility to the native FreeBSD
system so we can run target binaries.  QEMU does not apply here as it's
not an architecture problem, it's a syscall/KBI problem.  Solving ports
cross-building will remove the need for this.

Stepping back to pure FreeBSD now.

Some various problems I've observed in the build:
- Building clang twice.  We build clang from the source tree so we can
build everything with it even if /usr/bin/cc qualifies as "new enough"
and "capable of cross-building the target".  We later build clang for
the target as well.  Try doing this for 'make universe' for N
architectures and you build clang N*2 times rather than N times.
- Not being highly parallel.
- Requiring building everything to build anything (without being an
expert on manually building dependencies).  AKA, no reliable
sub-directory builds.
- Incremental builds don't work reliably.  We have stealth dependencies
that are not tracked such as csu, libgcc, compiler_rt, compiler and
Makefile (CFLAGS/LDFLAGS,build command).
- There is gross under-linkage in libraries and over-linkage in binaries.
- No foreign OS cross-build support.  You must build from FreeBSD 8.0+.
 This is a problem when people prefer to run OSX or Linux for their
primary system.
- No cross-build support in Ports.
- share/mk and Makefile.inc1 have a ton of bitrot.

Some various problems I've observed with maintaining the build:
- Adding new libraries into the build usually results in doing it wrong
in Makefile.inc1's handling of 'make libraries'. I think it is fair to
say that most people don't understand how any of this works.  Just
yesterday I discovered more of how it works that surprised me.
- Adding new libraries into the build is usually done wrong in terms of
the new framework.
- There are few build framework sanity checks.  The ports build has
grown a large array of sanity checks over the past few years that is a
decent model for the base system.  Such as telling developers that they
have used the framework incorrectly or forgotten something.
- Adding new DESTDIR directories can lead to installing directories as
files (think installing a header as /usr/include/foo rather than
/usr/include/foo/file.h).  This happens when the new directory is not
added to the MTREE files.
- No one really was trying to improve it head-on and focused on
FreeBSD's general audience needs.
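
To illustrate the "directory installed as a file" problem above, here is
a minimal sketch of the kind of sanity check that would catch it.  It is
not code from the tree; install_file() is a hypothetical helper and the
paths are made up.

```python
import os
import shutil
import tempfile


def install_file(src, dest_dir):
    """Copy src into dest_dir, refusing if dest_dir is not a directory.

    Without this check, copying to a path that does not exist yet
    creates a regular file carrying the directory's name.
    """
    if not os.path.isdir(dest_dir):
        raise NotADirectoryError(
            f"{dest_dir} does not exist; was it added to the mtree files?")
    return shutil.copy(src, os.path.join(dest_dir, os.path.basename(src)))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        header = os.path.join(root, "file.h")
        with open(header, "w") as f:
            f.write("/* example header */\n")

        target_dir = os.path.join(root, "include", "foo")
        try:
            install_file(header, target_dir)
        except NotADirectoryError as err:
            print("refused:", err)

        # Once the directory exists, the install goes through.
        os.makedirs(target_dir)
        print("installed:", install_file(header, target_dir))
```

The point is only that failing loudly at install time is cheaper than
discovering a stray file named "foo" later.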

The maintenance problems come down to expertise.  Most developers are
not experts in the build and don't have time or interest to become so.
It is not intuitive that adding a new include file and directory should
require modifying an MTREE file somewhere else in the tree, or that __L
and SUBDIR_depend entries are needed.  Removing these landmines and
adding sanity checks improves maintenance for everyone.

Now for my goals and work.

- My general goal is to be able to go into any src or ports directory,
type 'make', and have everything just work (and have src/ports work with
each other).  Have all dependencies build, have them be incremental, and
have parallelism work.  Being able to do this on a foreign OS, such as
OSX or Linux, is a stretch goal that I want to see but will require a
lot of work and support from the Community to achieve.  This goal may in
the end replace 'make' with a different spelling but not one of having
to set anything up like Poudriere requires.
- Let the build tell you when you added something wrong so you can fix
it before it harms someone else's time.

Some improvements I have made recently:
- WITH_FAST_DEPEND: Replacing the antiquated 'make depend'/'mkdep' with
compiler dependency flags to generate the .depend files as a side-effect
of compiling.  This saves tremendous time in buildworld and buildkernel.
 Before this we were preprocessing files *twice*.  Now we only do so
while compiling.  https://svnweb.freebsd.org/changeset/base/290433 has
more details.  This effort has mostly removed the need to even have a
'make depend' phase and I may still remove it with this option.  I plan
to enable this option by default quite soon.
- WITH_CCACHE_BUILD: Utilizing ccache without -DNO_CLEAN to achieve an
incremental build where builds are quite frequent.
https://svnweb.freebsd.org/changeset/base/290526 has more details.
- Having 'make install' error if trying to install a file into a
non-existent directory.  This has bitten us a lot.
- Add more checks for adding LIBADD properly.
- Allowing bsd.subdir.mk to run directories in parallel more often, such
as always in 'make obj', since there is no harm in doing so.  This can
seem small at first but it adds up when we do many tree-walks.
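
As a rough illustration of the FAST_DEPEND idea (not the share/mk
implementation): compilers such as GCC and Clang can write a make-style
dependency rule as a side-effect of compiling, for example with -MD and
-MF.  The sketch below parses such a rule and decides whether an object
is stale.  parse_depend() and needs_rebuild() are hypothetical names,
and the modification times are passed in as a plain dict to keep the
example self-contained.

```python
def parse_depend(text):
    """Parse 'target: prereq ...' rules, joining backslash continuations."""
    deps = {}
    joined = text.replace("\\\n", " ")
    for line in joined.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        target, _, prereqs = line.partition(":")
        deps.setdefault(target.strip(), []).extend(prereqs.split())
    return deps


def needs_rebuild(target, deps, mtimes):
    """A target is stale if it is missing or older than any prerequisite.

    A prerequisite with no recorded mtime is treated as newer than the
    target, which forces a rebuild.
    """
    if target not in mtimes:
        return True
    return any(mtimes.get(p, float("inf")) > mtimes[target]
               for p in deps.get(target, []))


if __name__ == "__main__":
    sample = "foo.o: foo.c foo.h \\\n  bar.h\n"
    deps = parse_depend(sample)
    print(deps)
    print(needs_rebuild("foo.o", deps,
                        {"foo.o": 10, "foo.c": 5, "foo.h": 5, "bar.h": 11}))
```

Because the rule is produced by the same compiler invocation that
produces the object, no separate preprocessing pass is needed.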

Some improvements I have planned soon:
- Removing all of the __L and SUBDIR_depend in the tree.  This depends
on LIBADD, which is why I've been polishing it lately.
- Tracking csu, libgcc and libcompiler_rt in DPADD.
- Removing _DP_ duplication (since LIBADDs already define it) in
src.libnames.mk.

Midterm goals:
- Building clang less.  Being smart about when it needs to build a
cross-compiler.  Only building the cross-compiler with targets that are
being built.
- MFCing external compiler support to stable/10 as promised at the
Vendor summit.  I just haven't had time yet, but it is not that much.
- Replacing 'make native-xtools' with something that works correctly.
It is currently broken as it does not build required libraries for the
host.  This is only important for the current Poudriere+QEMU effort.
- Reliable incremental builds in buildworld.  This most likely will
utilize the .MAKE.MODE=meta support from bmake along with filemon.  See
'man make'.
- Sub-directory builds.
- Under-linkage of libraries and over-linkage of binaries.  Generally a
library should link in all of its needed dependencies, rather than
forcing a consumer to link in something it doesn't care about
(under-linking).  There are exceptions to this of course.  Consumers
(both library and binary) also should not link in libraries they and
their dependents do not need (over-linking).  I went through a massive
effort at Isilon to fix this for our internal libraries/binaries and
plan to do the same for FreeBSD.  We have tests in our build that check
for under/over-linking that utilize the tools/build/check-links.sh
script.  Special cases usually involve loadable modules that depend on
their parent to provide symbols, cyclic dependencies and optional
symbols that usually should be using weak symbols instead.  The benefits
here are build parallelism, decreasing startup time for
binaries/rtld, reducing memory mapping overhead for libraries that won't
be used, and providing hints on what libraries are needed.
- Possibly defaulting SUBDIR_PARALLEL to on and changing how we handle
SUBDIR_DEPEND to be automated.
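
The under/over-linking definitions above can be sketched as a small
set computation.  This is a simplified model, not what
tools/build/check-links.sh does: it only looks at direct linkage for a
single consumer, and the symbol and library names are purely
illustrative.  check_links() is a hypothetical helper.

```python
def check_links(undefined, linked, exports):
    """Return (under, over) for one consumer.

    undefined: set of symbols the consumer needs from elsewhere
    linked:    list of libraries the consumer links directly
    exports:   dict mapping library name -> set of symbols it defines
    """
    provided = set()
    for lib in linked:
        provided |= exports.get(lib, set())
    # Under-linked: symbols needed but not provided by any linked library.
    under = sorted(undefined - provided)
    # Over-linked: linked libraries that satisfy none of the needed symbols.
    over = sorted(lib for lib in linked
                  if not (exports.get(lib, set()) & undefined))
    return under, over


if __name__ == "__main__":
    exports = {
        "libfoo": {"foo_open", "foo_close"},
        "libbar": {"bar_compress"},
        "libbaz": {"baz_cos"},
    }
    under, over = check_links(
        undefined={"foo_open", "bar_compress"},
        linked=["libfoo", "libbaz"],
        exports=exports,
    )
    print("under-linked symbols:", under)
    print("over-linked libraries:", over)
```

The special cases mentioned above (loadable modules, cyclic
dependencies, optional symbols) would need exceptions that this sketch
does not model.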

Longterm goals:
- Ports cross-building.  It is a massive effort that several people have
explored before and found to be "too hard" (because 24,000 ports IS too
hard for major Ports changes).  I admit that my only interest is in a
handful of ports and I will not be trying to solve ports I have no
interest in.  I will not try to get all of ports cross-building, but I
will hopefully help bring us a framework that may lead to not needing
QEMU.
- Foreign OS builds.  I think NetBSD got this right.  I'm speaking from
a very high-level and quick overview, but it seems that the gist here is
that they build all of their libraries (libc being most important) into
a libcompat that is linked into all "host"/"build" tools.  We have
libegacy (-legacy) that serves the same goal but we have by-design kept
this extremely minimal.  This is why I say the Community needs to
support it as we've so far taken the minimum-required route for building
on foreign (older) releases.


Some details on solving the primary goal of sub-directory and incremental
builds:

This problem is already solved by the recently imported
META_MODE/DIRDEPS_BUILD mode from Juniper/sjg_at_.

Simon presented this at BSDCan 2014:
http://www.bsdcan.org/2014/schedule/events/460.en.html

The DIRDEPS_BUILD is a collection of several features.  It was recently
renamed from META_MODE so META_MODE could be used on its own.  Enabling
it enables all of these features:
1. META_MODE (.MAKE.MODE=meta) which does for any shell command the same
as what the compiler -MM flags for generating dependencies
(.depend/FAST_DEPEND) do for compiling.  It uses filemon(4) to track all
files read/written by the shell command and then considers those files
as dependencies for the target later.  It tracks these in a .meta file
that can be thought of as a .depend file; it mostly only benefits
incremental builds.  It also tracks the build command which fixes
changing of CFLAGS/etc not rebuilding.  This feature solves incremental
builds.  This feature also allows not doing the same thing twice, such
as installing into a STAGEDIR.  If the file already exists and
dependencies didn't cause a rebuild then there is no need to install it
again.  This sounds magical but is just using a cookie on the install
target and the cookie depends on all build dependencies via its .meta file.
2. AUTO_OBJ prevents needing to tree-walk with 'make obj' as it just
creates the objdir as soon as it touches a directory.
3. DIRDEPS_BUILD uses the Makefile.depend files to get a list of
DIRDEPS. This is just a list of directories that need to be built before
this directory. These Makefile.depend files are included from the first
make process (non-recursive process-wise), which generates a dependency
graph.  From this graph sub-makes are called to build each dependency.
This also supports building a 'host' version of a directory first so it
can be used as a build tool.  This feature solves sub-directory builds
at the cost of having generated files checked in.
4. STAGING installs files after they are built into a STAGEDIR (like
WORLDTMP).  This allows linking to libraries already built.  This is not
that different from what buildworld does now for 'make libraries' but is
subtly different in that it does not tree-walk 'all' before 'install',
it just installs right after 'all' in each directory.  This subtle
difference leads to the surprising behavior I referred to in __L/'make
libraries' earlier.  The STAGING feature also creates a FILE.dirdep in
the STAGEDIR for every file to specify which SRCDIR installed it which
UPDATE_DEPENDFILE uses.
5. UPDATE_DEPENDFILE utilizes the information from META_MODE/filemon(4)
to see which files from the OBJDIR or STAGEDIR were used in the build.
Any SRCDIR from there, considering the FILE.dirdep for used files in
STAGEDIR, is added to the DIRDEPS list.  A basic .depend-like file is
also generated for generating of objects, called "local dependencies".
This last piece is due to not running 'make depend' and through my
findings is largely unneeded now.  All of this information, the DIRDEPS
and local dependencies, are written out into the Makefile.depend for
that build directory.  It is expected to check in this file as otherwise
you cannot build in this directory later since it is unknown what it
needs to build.
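
To make the META_MODE idea from item 1 concrete, here is a minimal
sketch of the up-to-date decision it enables.  This is not bmake's
implementation and does not model the real .meta file format: in the
real thing filemon(4) collects the files read, whereas here they are
passed in by hand, and modification times live in a plain dict.
record_build() and is_up_to_date() are hypothetical names.

```python
def record_build(target, command, reads, meta):
    """Remember the command that built target and the files it read."""
    meta[target] = {"cmd": command, "reads": sorted(reads)}


def is_up_to_date(target, command, meta, mtimes):
    """Decide whether target can be skipped.

    meta:   dict target -> {"cmd": str, "reads": [paths]} from last build
    mtimes: dict path -> modification time
    """
    record = meta.get(target)
    if record is None or target not in mtimes:
        return False              # never built, or the output is gone
    if record["cmd"] != command:
        return False              # e.g. CFLAGS changed since last build
    # Every file read last time must still exist and not be newer.
    return all(p in mtimes and mtimes[p] <= mtimes[target]
               for p in record["reads"])


if __name__ == "__main__":
    meta = {}
    record_build("foo.o", "cc -O2 -c foo.c", ["foo.c", "foo.h"], meta)
    mtimes = {"foo.o": 10, "foo.c": 5, "foo.h": 7}
    print(is_up_to_date("foo.o", "cc -O2 -c foo.c", meta, mtimes))
    print(is_up_to_date("foo.o", "cc -O0 -c foo.c", meta, mtimes))
```

The same check applied to an install cookie is what lets staging skip
re-installing a file whose inputs did not change.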

Some problems with the combined DIRDEPS_BUILD build:

It is very hard to maintain, prone to massive churn in directories you
did and did not touch, requires checking in fickle generated files, has
a chicken-and-egg problem for adding new things, does not respect
WITH/WITHOUT options or ARCH-dependent SUBDIR, has subtle problems with
a static dependencies list around options, and has no 'installworld'
support.

The lack of 'installworld' is a deal breaker, albeit probably trivial to
fix.  The reason for this is that the system is used to generate
packages (not pkgng style) at Juniper.  We have our own packaging effort
occurring in projects/release-pkg though and won't likely use the same
methods that the DIRDEPS_BUILD does (of which none of the support is
really in the src tree).

This build does not use recursive SUBDIR walking (by design).  It uses a
static list of directories to build in the
targets/pseudo/userland/*/Makefile.depend files.  So we have to maintain
new/removed applications in two places.  The effort by Simon and me here
was lacking in 100% OPTION and ARCH support as well.  Warner and I have
discussed moving OPTION/ARCH checks from SUBDIR Makefile into the actual
build files.  That would help maintenance of targets/ some but still
leave the connect/disconnect maintenance issue.

Some directories still not built are rescue/rescue and sys/modules.

The checked-in Makefile.depend files have a flaw (feature for static
builds perhaps) that invoking the linker will cause non-direct
dependencies to show up in Makefile.depend.  This leads to situations
such as https://svnweb.freebsd.org/changeset/base/291558.  There is also
a situation where a local build may set MK_SSP=no and yet need to build
a dependency that wants MK_SSP=yes (libssp dependency) but because of
how DIRDEPS are processed it means that libssp is never built and the
build fails.  They also require hacks to support ARCH/OPTION-dependent
dependencies by modifying local.dirdeps.mk and local.gendirdeps.mk,
which can easily lead to over-depending on something.  The checked-in,
auto-generated Makefile.depend files can change unexpectedly, causing a
developer either to waste time looking into why, or to skip committing
the change and waste someone else's time when it won't build.
 Most developers naturally do not want to check in generated files and
so I generally see this as unworkable.  Lastly, generating
Makefile.depend when you have none, or different options/dependencies
than upstream, is an exhaustive chicken-and-egg problem.  These problems
make checked-in Makefile.depend infeasible.  "But wait, can you just not
check them in?"  Not really.

The targets/ and checked-in Makefile.depend are largely to decrease
startup overhead costs by not needing recursive tree walks.  I have
recently added support to local.dirdeps.mk to make an educated guess on
what DIRDEPS should be even if there is no Makefile.depend.  This is
based on what is being built (C and C++ objects have common
dependencies) and what DPADD/LIBADD there are.  In most cases this is
enough.  It has greatly helped when bootstrapping Makefile.depend.
Going further though I have written a script that just recursively walks
the tree to generate the DIRDEPS graph from the first make process
rather than including the Makefile.depend files.  This removes the need
for having any Makefile.depend and allows targets/ to be dynamic and
utilize the existing SUBDIR around the tree.  Obviously the trade-off
here is startup time in exchange for maintainability.  The script I
have written for this is very similar to what Poudriere is doing though
so I have a lot of experience and code for optimizing it.  I plan to use
this as a build option at Isilon rather than having checked-in
Makefile.depend files.  I will add support for this in FreeBSD as well.
 None of this or DIRDEPS_BUILD would be on-by-default though.
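
A minimal sketch of the approach described above, under stated
assumptions: guess each directory's DIRDEPS from its LIBADD-style list,
then order the directories so dependencies are built first.  This is
not the script I mention; guess_dirdeps() and build_order() are
hypothetical names, the directory and library names are made up, and
the ordering relies on Python's standard graphlib (3.9 or later).

```python
from graphlib import TopologicalSorter


def guess_dirdeps(libadd, lib_dirs):
    """Map LIBADD-style names to the directories that build them."""
    return sorted(lib_dirs[name] for name in libadd if name in lib_dirs)


def build_order(dirdeps):
    """Return directories in an order where dependencies come first.

    dirdeps maps each directory to the directories it depends on.
    graphlib raises CycleError if the graph has a cycle.
    """
    return list(TopologicalSorter(dirdeps).static_order())


if __name__ == "__main__":
    # Hypothetical library name -> source directory table.
    lib_dirs = {
        "base": "lib/libbase",
        "bar": "lib/libbar",
        "foo": "lib/libfoo",
    }
    # Hypothetical LIBADD per directory, as if read from each Makefile.
    libadd = {
        "usr.bin/app": ["foo"],
        "lib/libfoo": ["bar", "base"],
        "lib/libbar": ["base"],
        "lib/libbase": [],
    }
    graph = {d: guess_dirdeps(libs, lib_dirs) for d, libs in libadd.items()}
    for directory in build_order(graph):
        print(directory)
```

Walking the tree to fill in the table is where the startup cost comes
from; the ordering step itself is cheap.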

As for buildworld, I have been working on improving it as well in case
my plans with the META build did not pan out.  Having split my time
between them I see now that much of buildworld can directly benefit from
the range of features that DIRDEPS_BUILD uses.  I plan to bring in
.MAKE.MODE=meta to buildworld to fix incremental and redundant WORLDTMP
staging.  I plan to rework 'make libraries' significantly utilizing my
DIRDEPS script to generate the __L in an automatic and hidden way.  It
may be that my first iteration of this is to have us generate a file
and check it in until I can improve the speeds.  This is still better
than the current situation of unmaintainable __L in Makefile.inc1.


P.S. Working on this stuff can be exhausting.  Mistakes are easy.
Especially when existing behavior is obscure, undocumented and possibly
accidental and things depend on that behavior.  I get burned out on it
often and shift my efforts around to keep going.  Some may notice I am
skipping review on a lot of it.  The reasoning is that it is either too
obvious, too prone to bikeshedding over insignificant details, or so
obscure and unobvious that a review would not serve much benefit as no
one really knows the code in question or would just be too unsure to put
their name on it.  All of which I see as a time waste for others.
Objectionable or easily-wrong changes I am getting reviews for or going
slow with (such as FAST_DEPEND).  I am MFCing relevant and safe changes
mostly only to reduce merge conflicts for others.  I have no real
interest in building stable/* and don't want to cause issues for others
there.  Thank you for your patience and hopefully when I'm done we will
have something better that we can all agree with and work with.

-- 
Regards,
Bryan Drewery
Received on Thu Dec 10 2015 - 18:56:37 UTC