Re: Re: Improving the kernel/i386 timecounter performance (GSoC proposal)

From: Robert Watson <rwatson_at_FreeBSD.org>
Date: Fri, 27 Mar 2009 22:59:39 +0000 (GMT)
On Fri, 27 Mar 2009, Sergey Babkin wrote:

>   Would not a normal mmap be duplicated on fork? I'd do it as a small
>   pseudo-driver that allows to mmap this page. Then libc would open this
>   pseudo-device and mmap it, either in the on-load handler or on the
>   first call of gettimeofday().  I think, that should be it, no special
>   magic necessary.
>   The per-process is more difficult and would require the magic :-) Or
>   maybe no magic as such: just mmap the file from the /proc filesystem.
>   Then on fork in the child unmap this page, open the new file, and map
>   it. vfork will still be tricky :-)
>   It also means wasting an extra page per process.
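
For reference, the userland half of the pseudo-device approach might look roughly like the sketch below. It is only an illustration: the function name map_shared_page() and the idea of passing the device path as a parameter are assumptions made here, and no device name is implied.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map the first page of the file or device at `path` read-only.
 * Hypothetical helper: libc would call something like this once,
 * from its on-load handler or on the first gettimeofday() call.
 * Returns NULL on any failure.
 */
static const void *
map_shared_page(const char *path)
{
	long pagesz;
	void *p;
	int fd;

	pagesz = sysconf(_SC_PAGESIZE);
	if (pagesz <= 0)
		return (NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return (NULL);

	p = mmap(NULL, (size_t)pagesz, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);		/* the mapping stays valid after close */

	return (p == MAP_FAILED ? NULL : p);
}
```

Note that this costs an open(), an mmap() and a close() in every process that uses it.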

Part of the point of mapping in the page at execve()-time, or fork()-time for
per-process pages (which I'm not entirely convinced we need yet), is to avoid
the cost of an extra device open, mmap, etc., for every execve(), which can be
quite expensive.  I posted a prototype here a year or two ago that maps a page
from a special device exporting time information:

   http://www.watson.org/~robert/freebsd/20080203-evilmem.diff
   http://www.watson.org/~robert/freebsd/evilmem_test.c

This doesn't do TSC-based adjustment, just drops a timestamp in from the 
callout wheel, but was intended to allow Kris to do a bit of comparative 
benchmarking and decide if it might be a viable approach to invest further 
work in.  Obviously, the above code should never, ever, get near a production 
kernel, since it was a 2-hour hack for experimental purposes.
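
To make the reader side concrete, here is a minimal sketch of the "two reads" consistency check suggested in the quoted text at the end of this message. It is not taken from the evilmem patch: the struct layout, the generation counter, and the function names are all assumptions for illustration, and the writer function merely stands in for whatever the kernel would do when it updates the page.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical page layout; NOT the layout used by evilmem. */
struct time_page {
	_Atomic uint32_t gen;	/* odd while an update is in progress */
	_Atomic int64_t  sec;
	_Atomic int64_t  nsec;
};

/* Writer side (stand-in for the kernel updating the page). */
static void
time_page_publish(struct time_page *p, int64_t sec, int64_t nsec)
{
	uint32_t g = atomic_load_explicit(&p->gen, memory_order_relaxed);

	atomic_store_explicit(&p->gen, g + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&p->sec, sec, memory_order_relaxed);
	atomic_store_explicit(&p->nsec, nsec, memory_order_relaxed);
	atomic_store_explicit(&p->gen, g + 2, memory_order_release);
}

/* Reader side: retry until the generation is even and unchanged. */
static void
time_page_read(struct time_page *p, int64_t *sec, int64_t *nsec)
{
	uint32_t before, after;

	do {
		before = atomic_load_explicit(&p->gen, memory_order_acquire);
		*sec = atomic_load_explicit(&p->sec, memory_order_relaxed);
		*nsec = atomic_load_explicit(&p->nsec, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		after = atomic_load_explicit(&p->gen, memory_order_relaxed);
	} while (before != after || (before & 1) != 0);
}
```

In the common case the reader does two loads of the counter and two loads of data, with no system call and no lock.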

I think the right way forward is to prototype: map the page in at
execve()-time in the kernel, pass its address to rtld via ELF auxiliary
arguments, and have rtld link it (by some means or another), exposing symbols
or code or whatever to libc.  If someone wants to make it a dynamic shared
object in ELF-speak, then I'm all for that, as it would minimize the work rtld
has to do.
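
As a rough sketch of the rtld side, finding the page would amount to walking the auxiliary vector for a new tag. Everything named below is hypothetical: struct aux_entry is a simplified stand-in for the real ELF auxiliary entry type, and AUX_TIMEPAGE is an invented tag value, not an allocated AT_* constant.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for an ELF auxiliary vector entry. */
struct aux_entry {
	uintptr_t type;
	uintptr_t value;
};

#define	AUX_NULL	0	/* AT_NULL terminates the vector */
#define	AUX_TIMEPAGE	0x7001	/* hypothetical tag, for illustration only */

/*
 * Return the address the kernel passed for the shared page, or NULL
 * if the kernel did not supply one (e.g. an older kernel).
 */
static void *
find_time_page(const struct aux_entry *auxv)
{
	for (; auxv->type != AUX_NULL; auxv++) {
		if (auxv->type == AUX_TIMEPAGE)
			return ((void *)auxv->value);
	}
	return (NULL);
}
```

A NULL return gives libc a natural point to fall back to the ordinary system call.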

I guess an interesting question is whether it would be desirable to have
per-process, per-CPU, or per-thread mappings.  If there are non-synchronized
TSCs, then there might be some interesting advantages to a per-CPU page.
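
To illustrate why per-CPU data could help with non-synchronized TSCs, here is a sketch in which each CPU gets its own calibration slot. The slot layout, the fixed-point mult/shift scaling, and the function name are assumptions made for this example; how userland would reliably learn which CPU it is running on is deliberately left out.

```c
#include <stdint.h>

/* Hypothetical per-CPU calibration slot; one per CPU in the page. */
struct cpu_time_slot {
	uint64_t tsc_base;	/* this CPU's TSC value at the last update */
	uint64_t ns_base;	/* nanoseconds of system time at that point */
	uint32_t mult;		/* ns per tick as a fixed-point fraction */
	uint32_t shift;		/* ... scaled by 2^shift */
};

/*
 * Convert a TSC reading taken on CPU `cpu` to nanoseconds using that
 * CPU's own base and scale.  Assumes slots are refreshed often enough
 * that (tsc - tsc_base) * mult does not overflow 64 bits.
 */
static uint64_t
tsc_to_ns(const struct cpu_time_slot *slots, unsigned int cpu, uint64_t tsc)
{
	const struct cpu_time_slot *s = &slots[cpu];
	uint64_t delta = tsc - s->tsc_base;

	return (s->ns_base + ((delta * s->mult) >> s->shift));
}
```

Because each slot carries its own base, two CPUs whose counters disagree can still report consistent times.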

Robert N M Watson
Computer Laboratory
University of Cambridge

>   -SB
>   Mar 27, 2009 12:51:56 PM, scottl_at_samsco.org wrote:
>
>     I've been talking about this for years. All I need is help with
>     the VM magic to create the page on fork. I also want two pages,
>     one global for gettimeofday (and any other global data we can
>     think of) and one per-process for static data like getpid/getgid.
>     Scott
>     Sergey Babkin wrote:
>     > (Sorry for the top quoting). Probably the best implementation of
>     > gettimeofday() is to have a page in the kernel mapped read-only
>     > to all the user processes. Put the kernel's idea of time
>     > into this page. Then getting the time becomes a simple read
>     > (OK, two reads, to make sure that no update has happened in
>     > between).
Received on Fri Mar 27 2009 - 21:59:40 UTC
