Re: rtld + static linking

From: Terry Lambert <tlambert2_at_mindspring.com> Date: Tue, 25 Nov 2003 17:44:18 -0800

"E.B. Dreger" wrote:
> After watching the recent shared/dynamic threads, and reading the
> archives from five or six years ago, I have a question...
> 
> Dynamic linking works by the kernel running the dynamic linker,
> which loads shared objects and fixes the symbol tables, yes?

No.

Dynamic linking works because the crt0 mmap's the /usr/libexec/ld.so
file as executable, and then points known stub offsets into it.  It
then passes this as part of the environment, at a negative offset,
into the _main, which fills out a little glue table.  After all this
is set up, then _main calls the location .entry in the executable,
which is usually main, but can be set to something else at link time.
This all works because the crt0.o has some self-knowledge for a number
of symbol offsets, and because _main is called with the environment
descriptor.  Basically, the environment descriptor is lost in the
static linking case.
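
To make the handoff concrete, here is a toy model in plain C.  It is
only a sketch: the struct layouts and the names env_desc, glue_table,
model_main and sample_entry are hypothetical and do not match the real
crt0 or ld.so, and the static case is modelled as a descriptor with no
ld.so mapping rather than a descriptor that is truly lost.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the environment descriptor crt0 passes on. */
struct env_desc {
    void *ldso_base;                 /* where ld.so was mapped, NULL if none */
    int (*entry)(int, char **);      /* the executable's .entry, usually main */
};

/* Hypothetical glue table that the _main stage fills out. */
struct glue_table {
    void *ldso_base;
    int   can_dlopen;
};

static struct glue_table glue;

/* Model of _main: fill out the glue table, then call the entry point. */
int model_main(const struct env_desc *env, int argc, char **argv)
{
    assert(env != NULL && env->entry != NULL);
    glue.ldso_base  = env->ldso_base;
    glue.can_dlopen = (env->ldso_base != NULL);
    return env->entry(argc, argv);
}

/* Sample entry point: returns 1 if the glue table says dlopen could work. */
int sample_entry(int argc, char **argv)
{
    (void)argc;
    (void)argv;
    return glue.can_dlopen;
}
```

Calling model_main with a descriptor that carries a mapping makes the
sample entry point return 1; with ldso_base set to NULL it returns 0,
which is the position a statically linked program is in.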

> Is there some reason that a statically-linked program couldn't
> include some "ld-elf.a" type of intelligence?  Would that be
> necessary and sufficient to allow statically-linked programs to
> load shared objects?

Yes, and yes.

The main reason that there is no dlopen is that the environment
descriptor is lost.  This is pretty trivial to remedy, but it
means passing the environment descriptor to something that can
use it to set up the startup.

This is complicated by the fact that only a single .init entry
point is usable, so if you were to override it, you would lose
library initialization for C libraries, you would fail to call
constructor code for statically declared class instances in
C++ code, and you would lose out on other linker set stuff, such
as per-thread exception stacks, etc.

The trivial fix for this is to add a void * parameter to the
constructor iterator in the crt0, and then pass the environment
there, so that you could implement a libdlopen that took that
and used it to obtain the self-knowledge of the crt0 that was
needed to (1) adjust the stub pointers to an mmap'ed ld.so, and
(2) provide the access to the symbol table needed so that when
you loaded modules, they would link properly vs. the symbols in
the executable itself.  You would either need to change the list
to specifically reference the libc symbols, or you would need
to reload libc.so.  The problem with the latter is that you
might get a different libc out of it, and the executable symbols
that may replace the libc symbols wouldn't do so for modules, so
that's a non-starter.
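
A minimal sketch of that constructor change, in plain C.  Everything
named here (ctor_fn, run_ctors, libc_init, dlopen_init,
dlopen_available) is hypothetical and only illustrates the shape of
the idea; it is not the actual crt0 or libgcc code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: every constructor gains a void * argument. */
typedef void (*ctor_fn)(void *env);

static void *saved_env;    /* what a hypothetical libdlopen would keep */
static int   libc_ready;

/* An ordinary constructor simply ignores the new argument. */
void libc_init(void *env)
{
    (void)env;
    libc_ready = 1;
}

/* The libdlopen constructor remembers the environment descriptor. */
void dlopen_init(void *env)
{
    saved_env = env;
}

/* Constructor iterator: pass the same descriptor to every constructor. */
void run_ctors(const ctor_fn *list, size_t n, void *env)
{
    assert(list != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        list[i](env);
}

/* Loading modules is only possible if a descriptor was handed over. */
int dlopen_available(void)
{
    return saved_env != NULL;
}
```

With env == NULL (the default case) every constructor still runs and
dlopen_available() stays 0.  With a real descriptor passed in, the
other constructors are unaffected and libdlopen has what it needs.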

It's actually a pretty trivial crt0 and ld change to deal with
this; the sticking point is that you also have to change
libgcc and other GNU code to pass a NULL value to the void *
constructor parameter in the default case.  This would make
the code slightly incompatible with Linux, etc., unless you could
get the GCC people to pick up your change.

-- Terry