Re: rtld + static linking

From: Terry Lambert <tlambert2_at_mindspring.com>
Date: Wed, 26 Nov 2003 04:16:06 -0800

Marcel Moolenaar wrote:
> On Tue, Nov 25, 2003 at 05:44:18PM -0800, Terry Lambert wrote:
> > "E.B. Dreger" wrote:
> > > Dynamic linking works by the kernel running the dynamic linker,
> > > which loads shared objects and fixes the symbol tables, yes?
> >
> > No.
> >
> > Dynamic linking works because the crt0 mmap's the /usr/libexec/ld.so
> > file as executable, and then points known stub offsets into it.
> 
> No.
> 
> Dynamic linking works because the kernel loads and runs the dynamic
> linker when it sees that the executable defines an interpreter.

Since I have patches to make dlopen work with static binaries, and
it doesn't work this way, I must conclude you have not really looked
deeply into solving the problem.

While the ELF specification, and the SVID III, specifically set
aside "enough space" that you can leave the first page unmapped
and have room for the kernel to load in the ld.so (and thereby
save yourself recreating that part of the address space on exec
of non-setuid/setgid binaries, where it's not a security issue),
FreeBSD doesn't do this.  I can only conclude that either no one
has gotten around to it since I pointed it out 6 years ago, or
someone is intent on FreeBSD's "exec" being slower than it needs
to be.

In any case, I point you to /usr/src/lib/csu/i386/crt0.c, which
contains these lines of code:

	[...]
        /* Map in ld.so */
        crt.crt_ba = (int)_mmap(0, hdr.a_text,
                        PROT_READ|PROT_EXEC,
                        MAP_FILE|MAP_PRIVATE,
                        crt.crt_ldfd, N_TXTOFF(hdr));
	[...]
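
For readers without the source tree handy, the mapping step can be
sketched in isolation.  This is a minimal illustration, not the
crt0.c code: map_text() is a hypothetical helper, and the path,
length and offset are supplied by the caller instead of being read
from an a.out header the way crt0.c does it.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `len` bytes of the file at `path`,
 * starting at the page-aligned `offset`, readable and executable.
 * Returns MAP_FAILED if the file cannot be opened or mapped.
 */
void *map_text(const char *path, size_t len, off_t offset)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return MAP_FAILED;

    /* Same protection and flags as the crt0.c call; MAP_FILE is the
     * default (and 0) on the systems that define it, so it is omitted. */
    void *base = mmap(NULL, len, PROT_READ | PROT_EXEC,
                      MAP_PRIVATE, fd, offset);

    close(fd);  /* the mapping stays valid after the descriptor is closed */
    return base;
}
```

A caller would check the result against MAP_FAILED before using it,
and munmap() the region when done.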


> A complete executable (i.e. statically linked) does not export any
> symbols, or at least not in the same way a shared executable and
> shared libraries do.

This is why I keep saying that compiler work would be needed to
export symbols from the original binary and any statically linked
libraries.


> If I try to dynamically link libbar into a
> complete executable foo and libbar depends on libc, then there's
> no guarantee that all the required bits from libc are in foo,

Correct, for the current (flawed) implementation of the linker.


> nor is there any guarantee that the bits are actually visible or even
> accessible (no linkage table).

In fact, there is a guarantee that they are not there, for the
current (flawed) implementation of the linker.


> Dynamically loading libc for use by libbar can work, but it's not
> guaranteed. One failure mode is suddenly having two instances of
> a common variable instead of one. Another is the clobbering of
> data caused by reinitializations.

The first problem is a non-issue, at least for FreeBSD supplied
libraries.  For other libraries, yes, it could be an issue.

The second is always a non-issue, since the loading of shared
objects occurs in a hierarchy, and you are perfectly allowed to
have multiple instances of different versions of shared objects
existing simultaneously.  In fact, if you read the dlsym(3) manual
page, you will see that there exists a special handle identifier,
RTLD_NEXT, specifically to enable a program to traverse this
hierarchy in order to obtain the correct symbol.  If you are still
interested in the details of obtaining the correct subhierarchy
from an arbitrary graph of such hierarchies, the way in which this
is normally accomplished is to look for a symbol at your own
hierarchy level, for which the address is known (since both the
program iterating and the symbol you are looking for come from the
same .so file and can be compared), and thereafter descend into
your own branch to find the symbol in question.
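
The basic RTLD_NEXT lookup can be sketched as follows.  This is only
the first step of the traversal described above, not the full
subhierarchy search; next_definition() is a hypothetical name, and
the _GNU_SOURCE define is an assumption needed for glibc (FreeBSD's
<dlfcn.h> exposes RTLD_NEXT without it).  On older glibc the program
must also be linked with -ldl.

```c
#define _GNU_SOURCE   /* assumption: glibc hides RTLD_NEXT without this */
#include <dlfcn.h>
#include <stddef.h>

/*
 * Hypothetical helper: find the next definition of `name` in the
 * objects loaded after the one containing this function.
 * Returns NULL if no later object defines the symbol.
 */
void *next_definition(const char *name)
{
    return dlsym(RTLD_NEXT, name);
}
```

Called from the main program of a dynamically linked binary,
next_definition("getpid") would be expected to resolve to the libc
definition, since libc follows the program in the search order.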


> So, the problem of dynamic linking a shared library into a static
> process is non-trivial and probably cannot be solved generically.

And yet both Solaris and SVR4 manage to accomplish this.  Of course,
they are not using the crt0.o code supplied by the GCC people, who
have so far failed to ship a set of tools (gcc, binutils) that can
do the job.

The problem can be resolved generically.


> Under restricted and controlled conditions you can make it work.
> I would call it the ability to have plugins, not the ability to
> load dynamic libraries.

Then you are probably using the implementation that was posted to
-current about half a year ago, which failed to take into account
the symbol sets, and made no changes to the constructor argument
list.

If you look in crt0.c again, you will see a number of lines that
look like this:

#ifdef DYNAMIC
        /* ld(1) convention: if DYNAMIC = 0 then statically linked */
        /* sometimes GCC is too smart/stupid for its own good */
        x = (caddr_t)&_DYNAMIC;
        if (x)
                __do_dynamic_link(argv);
#endif /* DYNAMIC */

This is the conditional compilation unit for the dynamic crt0.o, as
the same source file is used for both dynamic and static linking.

The values of 'x' and 'argv' in this case are, respectively, the
location of the 'struct _dynamic' that is used to gate the call
to __do_dynamic_link(), and the environment I spoke of in my previous
post.
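
The same kind of gate can be sketched for an ELF toolchain.  This is
an illustration under stated assumptions, not the crt0.c logic: it
assumes GCC or Clang, and that the linker defines _DYNAMIC only when
the output has a dynamic section.  linked_dynamically() is a
hypothetical name, and the result depends entirely on how the
program was linked.

```c
#include <stddef.h>

/*
 * Assumption: the linker provides _DYNAMIC for dynamically linked
 * output only.  Declaring it weak lets a static link succeed and
 * leaves the symbol's address as 0, mirroring the ld(1) convention
 * quoted in the crt0.c comment.
 */
extern const int _DYNAMIC __attribute__((weak));

/* Hypothetical helper: 1 if a dynamic section is present, else 0. */
int linked_dynamically(void)
{
    const int *x = &_DYNAMIC;   /* same shape as the crt0.c test */
    return x != NULL;
}
```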

The __do_dynamic_link() function then fills out a struct crt_ldso,
the address of which is passed to the ld.so entry point.

The only data passed into the __do_dynamic_link() is the argv
pointer from the start() function.

Now you should look at c++rt0.c (this is in the same directory).

Specifically, look at the __ctors() and the __init function that
calls it.

Finally, look at i386-elf/crt1.c and common/crtbegin.c.

At this point, it should be pretty obvious that it would be trivial
to add code to propagate the argv through the code to the constructors,
and use a constructor in a static library to accomplish the mmap() of
the ld.so program into a static binary.  The *only* issues are that
(1) the constructors must take a void * parameter so you can pass it
into a constructor based implementation of __do_dynamic_link(), and
(2) compiler changes are necessary so that the decorated symbols are
correct on C++ constructors called for statically declared instances
of C++ classes in C++ object files linked into an arbitrary program.
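
The shape of point (1) can be sketched in plain C.  Everything here
is hypothetical and for illustration only: the ctor_fn signature,
run_ctors(), and dl_static_ctor() are invented names, and a real
implementation would live in the startup code and in the static
libdl rather than in one file.

```c
#include <stddef.h>

/* Hypothetical constructor signature taking one opaque argument. */
typedef void (*ctor_fn)(void *);

/* Call every constructor in the list, handing each one `arg`
 * (here, the argv pointer from start()).  Returns the number called. */
size_t run_ctors(ctor_fn const *ctors, size_t count, void *arg)
{
    size_t called = 0;
    for (size_t i = 0; i < count; i++) {
        ctors[i](arg);
        called++;
    }
    return called;
}

/* Hypothetical constructor that a static libdl might supply: it
 * records argv so a later __do_dynamic_link()-style routine can use it. */
static char **saved_argv;

void dl_static_ctor(void *arg)
{
    saved_argv = (char **)arg;
}

char **dl_saved_argv(void)
{
    return saved_argv;
}
```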

As I said before, this is all pretty easy to implement.

If you wanted to get lazy, and not make the compiler changes for
the constructor void * argument (which could be generally useful in
a lot of ways), you could always add a global weak symbol that was
stomped by a strong symbol from the static libdl, and make the
constructor list iterator check for NULL before calling it, and use
the linker set length field, rather than NULL, to know how many
members there were in the constructor linker set.
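
A rough sketch of that kludge, assuming GCC or Clang weak symbols.
The names static_libdl_init, run_ctor_set() and demo_ctor() are
hypothetical; the point is only that a weak entry evaluates to NULL
when no strong definition is linked in, so the iterator has to skip
NULL entries and rely on a length instead of a NULL terminator.

```c
#include <stddef.h>

typedef void (*ctor0_fn)(void);

/* Hypothetical hook: weak here, to be overridden by a strong
 * definition if (and only if) the static libdl is linked in. */
extern void static_libdl_init(void) __attribute__((weak));

/* Walk `len` entries, skipping any that resolved to NULL.
 * Returns the number of constructors actually called. */
size_t run_ctor_set(ctor0_fn const *set, size_t len)
{
    size_t called = 0;
    for (size_t i = 0; i < len; i++) {
        if (set[i] == NULL)
            continue;           /* weak hook with no strong definition */
        set[i]();
        called++;
    }
    return called;
}

static int demo_calls;

void demo_ctor(void)
{
    demo_calls++;
}

/* The weak hook sits in the middle of the set, which is why a NULL
 * terminator cannot be used to find the end of the list. */
size_t run_demo_set(void)
{
    ctor0_fn set[] = { demo_ctor, static_libdl_init, demo_ctor };
    return run_ctor_set(set, sizeof set / sizeof set[0]);
}
```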

Either way, you still need to deal with the linker changes necessary
to export the symbol set for all statically linked objects, and to
force the inclusion of all archive members when statically linking,
if one of the linked libraries is libdl, if you wanted a full
implementation.

BTW: IEEE 1003.1-2003 requires a full implementation of dlopen, and
does not permit an exception for statically linked binaries:

http://www.opengroup.org/onlinepubs/007904975/functions/dlopen.html

I'll also point out that the ELF specification does not define static
linking *at all*.

In any case, even with the weak symbol kludge to forward the argv to
the constructor, if it exists, and not doing the linker fix, it's a
couple hours work to make "plugins" work for static binaries, with
standard dlopen() semantics for everything but the "NULL" shared object
(self reference) or its subhierarchy of statically linked libraries.
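
For reference, this is what the "NULL" self reference looks like
from the caller's side in an ordinary dynamically linked program.
It is a minimal sketch using only the standard dlopen()/dlsym()
calls; lookup_in_self() is a hypothetical name, and older glibc
needs -ldl at link time.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Hypothetical helper: look `name` up through the handle returned
 * by dlopen(NULL, ...), i.e. the program itself and the objects
 * loaded with it.  Returns NULL if the symbol is not visible.
 */
void *lookup_in_self(const char *name)
{
    void *self = dlopen(NULL, RTLD_LAZY);
    if (self == NULL)
        return NULL;

    void *addr = dlsym(self, name);
    dlclose(self);              /* drops our reference only */
    return addr;
}
```

Without the linker exporting the symbol set of the statically linked
objects, this is the lookup that has nothing to search.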

Obviously, "doing it right", with the symbol export and archive
inclusion modifications to the linker (a new linker option would
probably be easiest to implement, e.g. "-dlopen"), would take a
little bit longer.

Considering the major suckage in the linker (e.g. the JNI problems
we had at Whistle several years back have never been addressed; to
do so would require treating linking of a binary as an "RTLD_NOW"
at link time, to ensure all symbols could be satisfied, even if it
was used as "RTLD_LAZY" in practice, to force a link error when it
would be impossible to load a JNI module), I don't think that not
fixing that suckage in this case makes it "plugins" any more
than the current situation makes any nested opens "plugins", so
calling them that as a justification to not do the work is a cop-out.
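
The run-time half of the RTLD_NOW distinction can be sketched like
this.  load_plugin_now() is a hypothetical name; the sketch only
shows that RTLD_NOW makes dlopen() itself fail when the object or
its symbols cannot be resolved, instead of deferring the failure to
the first call as RTLD_LAZY may.  It says nothing about the
link-time check, which is the part the linker does not do.

```c
#include <dlfcn.h>
#include <stdio.h>

/*
 * Hypothetical helper: load `path` with all relocations performed
 * up front.  Returns the handle, or NULL after reporting dlerror().
 */
void *load_plugin_now(const char *path)
{
    void *handle = dlopen(path, RTLD_NOW);
    if (handle == NULL) {
        const char *why = dlerror();
        fprintf(stderr, "dlopen(%s): %s\n", path,
                why != NULL ? why : "unknown error");
    }
    return handle;
}
```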

I'm surprised that people have found it worth arguing about so long,
considering how little effort is required to implement it.

As to the inevitable "where are the patches?", please check the -current
list archives, you will find at least one set there.

-- Terry
Received on Wed Nov 26 2003 - 03:16:16 UTC
