Re: Intercepting calls in PIC mode

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Fri, 4 Jul 2014 17:38:50 +0300
On Fri, Jul 04, 2014 at 04:12:51PM +0400, Ivan A. Kosarev wrote:
> Hello,
> 
> Consider the following:
> 
> ---
> #include <stdio.h>
> #include <string.h>
> 
> extern "C" void* memset(void *block, int c, size_t size)
>      __attribute__((weak, alias("__int_memset"), visibility("default")));
> 
> extern "C" __attribute__((visibility("default")))
> void* __int_memset(void *block, int c, size_t size) {
>      puts("Hello");
>      return NULL;
> }
> 
> int main()
> {
>      void *(*F)(void *b, int c, size_t len) = memset;
>      char a[5];
>      memset(a, 0, sizeof(a));
>      F(a, 0, sizeof(a));
>      return 0;
> }
> ---
> 
> It intercepts the memset() calls without issue on both x86-64 FreeBSD 
> 9.2 and Linux. However, with the -fPIC option specified on the cc 
> command line, only the first (direct) call works on FreeBSD, but not the 
> second (indirect) one. Note that on Linux both calls are 
> intercepted, whether or not the -fPIC option is specified.

Your example is rather convoluted; I will try to explain why below.

First, I am sure that C99 does not allow a program to override the semantics
of the standard-defined functions.  Moreover, a call to memset(3) can be
inlined by the compiler, so there may be no call left to intercept.
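
To illustrate the inlining point, here is a minimal sketch that is not part
of the original program; both helper names are made up.  A direct memset()
call may be expanded by the compiler as a builtin, whereas a call through a
volatile function pointer is, in practice, emitted as a real indirect call.
Building with -fno-builtin (accepted by both GCC and Clang) is the other
usual way to keep the calls.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the compiler is free to expand this memset() inline,
 * in which case no call to an external memset is emitted at all. */
void *clear_maybe_inlined(void *buf, size_t len)
{
    return memset(buf, 0, len);
}

/* Hypothetical helper: the pointer is volatile, so its value must be read
 * at run time and the call goes through it as an indirect call. */
void *clear_with_real_call(void *buf, size_t len)
{
    void *(*volatile fill)(void *, int, size_t) = memset;
    return fill(buf, 0, len);
}
```

Both helpers clear the buffer and return it; only the second one is
reasonably certain to reach whatever memset the linker bound.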

Second, the FreeBSD implementation of weak ELF symbols is non-compliant:
the dynamic linker prioritizes non-weak symbols over weak ones.  This at
least explains why your code snippet does not segfault: the memset(3)
from libc is not interposed by your memset() implementation, so libc can
at least initialize itself.  If you remove the weak attribute from memset(),
a debug version of libc fails with assertions in jemalloc, while a normal
build just segfaults.

That said, there are also differences in the static linker behaviour.
Clang generates the following code to obtain the address of the memset(3)
function:
		movq	memset@GOTPCREL(%rip), %rsi

The in-tree ld from binutils 2.17.redhat generates an
R_X86_64_GLOB_DAT relocation to fill the GOT entry for the memset
symbol.  The GLOB_DAT handler in rtld-elf always starts the lookup of
the requested symbol in the object that follows the main one.
For your code, this means that libc is searched for memset to fill the
slot, and you get the libc symbol.
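
One way to check which object supplied the address in a given build is to
ask the dynamic linker with dladdr(3).  This is a rough sketch only: the
helper names are hypothetical, and the result depends on the linker and
flags used.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* glibc declares dladdr() only when this is defined */
#endif
#include <dlfcn.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: path of the loaded object that contains addr,
 * or NULL if the dynamic linker cannot match the address to any object. */
const char *object_containing(void *addr)
{
    Dl_info info;

    if (dladdr(addr, &info) == 0)
        return NULL;
    return info.dli_fname;
}

/* Hypothetical helper: which object does the address of memset, as seen
 * by this translation unit, point into? */
const char *memset_provider(void)
{
    void *(*f)(void *, int, size_t) = memset;

    /* Function-to-object pointer cast: not ISO C, but relied on by POSIX. */
    return object_containing((void *)f);
}
```

Added to the test program above and printed from main(), memset_provider()
would be expected to name libc when the GOT slot was filled from libc, and
the program itself when the symbol was resolved locally.  On older glibc
the link step may also need -ldl.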

The ld from the stock build of binutils 2.24, on the other hand, does not
generate a relocation at all: it resolves memset internally from the same
object file and writes the offset directly into the instruction.  I.e., when
the program is linked with the new ld, it works as you intend.  This is
probably the reason why it worked for you on Linux.

I am not sure what conclusion can be drawn from the story I just told you.
Perhaps 'do not try to interpose std C functions' and 'put interposers
into LD_PRELOADed objects'?
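
For the second suggestion, an interposer kept in its own preloaded object
could look roughly like the sketch below.  Everything named here is an
assumption made for illustration: app_fill() stands in for some
non-standard library function with a memset-like signature, and
app_fill_intercepted() is a made-up counter accessor.  The wrapper forwards
to the next definition in the search order, found with
dlsym(RTLD_NEXT, ...), and fills the buffer itself if there is none.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* glibc declares RTLD_NEXT only when this is defined */
#endif
#include <dlfcn.h>
#include <stddef.h>

typedef void *(*fill_fn)(void *, int, size_t);

static unsigned long intercepted_calls;

/* Interposer for the hypothetical app_fill().  Not thread-safe: the cached
 * pointer and the counter are updated without any locking. */
void *app_fill(void *block, int c, size_t size)
{
    static fill_fn next;
    unsigned char *p = block;
    size_t i;

    /* Find the definition that would have been used without this object. */
    if (next == NULL)
        next = (fill_fn)dlsym(RTLD_NEXT, "app_fill");

    intercepted_calls++;

    if (next != NULL)
        return next(block, c, size);

    /* No other definition is loaded: do the fill by hand. */
    for (i = 0; i < size; i++)
        p[i] = (unsigned char)c;
    return block;
}

/* Number of calls that went through the interposer so far. */
unsigned long app_fill_intercepted(void)
{
    return intercepted_calls;
}
```

Such a file would typically be built with cc -shared -fPIC and named in
LD_PRELOAD when starting the program, so that it precedes the library that
provides the real function in the lookup order.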

Received on Fri Jul 04 2014 - 12:39:00 UTC
