Re: libthr and main thread stack size

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Tue, 23 Sep 2014 12:24:49 +0300
On Mon, Sep 22, 2014 at 09:19:30AM -0400, Daniel Eischen wrote:
> On Sun, 21 Sep 2014, Julian Elischer wrote:
> 
> > On 9/20/14, 3:27 AM, John Baldwin wrote:
> >> On Tuesday, September 16, 2014 11:13:24 AM Konstantin Belousov wrote:
> >>> On Mon, Sep 15, 2014 at 03:47:41PM -0600, Justin T. Gibbs wrote:
> >>>> On Aug 8, 2014, at 5:22 AM, Konstantin Belousov <kostikbel_at_gmail.com>
> >>>> wrote:
> >>>> 
> >>>> ...
> >>>> 
> >>>>> Below is the patch which adds environment variable
> >>>>> LIBPTHREAD_BIGSTACK_MAIN. Setting it to any value results in the
> >>>>> main thread stack left as is, and other threads allocate stack
> >>>>> below the area of RLIMIT_STACK. Try it. I do not want to set this
> >>>>> behaviour as default.
> >>>> Is there a reason this should not be the default? Looking at the
> >>>> getrlimit() page on the Open Group's site they say:
> >>>> 
> >>>> RLIMIT_STACK This is the maximum size of the initial thread's stack,
> >>>> in bytes. The implementation does not automatically grow the stack
> >>>> beyond this limit. If this limit is exceeded, SIGSEGV shall be
> >>>> generated for the thread. If the thread is blocking SIGSEGV, or the
> >>>> process is ignoring or catching SIGSEGV and has not made arrangements
> >>>> to use an alternate stack, the disposition of SIGSEGV shall be set to
> >>>> SIG_DFL before it is generated.
> >>>> 
> >>>> Does posix say something different?
> >>>> 
> >>>> I ran into this issue when debugging a segfault on Postgres when
> >>>> running an (arguably quite bogus) query that should have fit within
> >>>> both the configured stack rlimit and Postgres' configured stack limit.
> >>>> The Postgres backend is really just single threaded, but happens
> >>>> to pull in libpthread due to the threading support in some of the
> >>>> libraries it uses. The segfault definitely violates POLA.
> >>>> 
> >>>> -- Justin
> >>> I am conservative to not disturb the address space layout in single go.
> >>> If enough people test this setting, I can consider flipping the default
> >>> to the reverse.
> >>> 
> >>> I am still curious why the things were done in this way, but nobody
> >>> replied.
> >> I suspect it was done out of reasons of being overly conservative in
> >> interpreting RLIMIT_STACK.  I think it is quite surprising behavior
> >> though and would rather we make your option the default and implement
> >> what the Open Group says above.
> >> 
> > that is my memory..
> > The transition from a non threaded app to a threaded app with one
> > thread is sort of an undefined area.
> > Feel free to change it if Dan agrees..
> 
> I'm all for adopting what POSIX specifies as the default.  I
> would shy away from adding another knob (LIBPTHREAD_BIGSTACK_MAIN)
> if possible.

In the patch, the default behaviour is to provide an RLIMIT_STACK-sized
stack for the main thread.  The knobs are there to restore the old
address space layout if my fears about binary compatibility become real
one day, and to keep the interface compatible with stable/10, which
already got a knob merged.
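
To spell out how the two knobs combine, here is a minimal sketch of the
decision, lifted from the condition in the thr_init.c hunk of the patch.
The helper name keep_full_main_stack() is hypothetical and exists only
for illustration; libthr itself does the check inline in init_private().

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of libthr.  Takes the results of
 * getenv("LIBPTHREAD_BIGSTACK_MAIN") and
 * getenv("LIBPTHREAD_SPLITSTACK_MAIN") and returns 1 when the main
 * thread keeps the full RLIMIT_STACK-sized stack, 0 when the old
 * split layout is requested.  Only the presence of a variable
 * matters, not its value.
 */
static int
keep_full_main_stack(const char *env_bigstack, const char *env_splitstack)
{

	/* BIGSTACK overrides SPLITSTACK; with neither set, keep the stack. */
	return (env_bigstack != NULL || env_splitstack == NULL);
}
```

So the only combination that selects the old layout is
LIBPTHREAD_SPLITSTACK_MAIN set with LIBPTHREAD_BIGSTACK_MAIN unset.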

That said, below is the patch with the libthr.7 man page merged into
libthr.3, and with the editing applied.

diff --git a/lib/libthr/libthr.3 b/lib/libthr/libthr.3
index bfbebec..aa4572c 100644
--- a/lib/libthr/libthr.3
+++ b/lib/libthr/libthr.3
@@ -1,6 +1,11 @@
 .\" Copyright (c) 2005 Robert N. M. Watson
+.\" Copyright (c) 2014 The FreeBSD Foundation, Inc.
 .\" All rights reserved.
 .\"
+.\" Part of this documentation was written by
+.\" Konstantin Belousov <kib_at_FreeBSD.org> under sponsorship
+.\" from the FreeBSD Foundation.
+.\"
 .\" Redistribution and use in source and binary forms, with or without
 .\" modification, are permitted provided that the following conditions
 .\" are met:
@@ -24,7 +29,7 @@
 .\"
 .\" $FreeBSD$
 .\"
-.Dd October 19, 2007
+.Dd September 20, 2014
 .Dt LIBTHR 3
 .Os
 .Sh NAME
@@ -45,8 +50,216 @@ has been optimized for use by applications expecting system scope thread
 semantics, and can provide significant performance improvements
 compared to
 .Lb libkse .
+.Pp
+The library is tightly integrated with the run-time link editor
+.Xr ld-elf.so.1 1
+and
+.Lb libc ;
+all three components must be built from the same source tree.
+Mixing
+.Li libc
+and
+.Nm
+libraries from different versions of
+.Fx
+is not supported.
+The run-time linker
+.Xr ld-elf.so.1 1
+has some code to ensure backward-compatibility with older versions of
+.Nm .
+.Pp
+This man page documents the quirks and tunables of
+.Nm .
+When linking with
+.Li -lpthread ,
+the run-time dependency
+.Li libthr.so.3
+is recorded in the produced object.
+.Sh MUTEX ACQUISITION
+A locked mutex (see
+.Xr pthread_mutex_lock 3 )
+is represented by a volatile variable of type
+.Dv lwpid_t ,
+which records the global system identifier of the thread
+owning the lock.
+.Nm
+performs a contested mutex acquisition in three stages, each of which
+is more resource-consuming than the previous.
+.Pp
+First, a spin loop
+is performed, where the library attempts to acquire the lock by
+.Xr atomic 9
+operations.
+The loop count is controlled by the
+.Ev LIBPTHREAD_SPINLOOPS
+environment variable, with a default value of 2000.
+.Pp
+If the spin loop
+was unable to acquire the mutex, a yield loop
+is executed, performing the same
+.Xr atomic 9
+acquisition attempts as the spin loop,
+but each attempt is followed by a yield of the CPU time
+of the thread using the
+.Xr sched_yield 2
+syscall.
+By default, the yield loop
+is not executed.
+This is controlled by the
+.Ev LIBPTHREAD_YIELDLOOPS
+environment variable.
+.Pp
+If both the spin and yield loops
+failed to acquire the lock, the thread is taken off the CPU and
+put to sleep in the kernel with the
+.Xr umtx 2
+syscall.
+The kernel wakes up a thread and hands the ownership of the lock to
+the woken thread when the lock becomes available.
+.Sh THREAD STACKS
+Each thread is provided with a private user-mode stack area
+used by the C runtime.
+The size of the main (initial) thread stack is set by the kernel, and is
+controlled by the
+.Dv RLIMIT_STACK
+process resource limit (see
+.Xr getrlimit 2 ) .
+.Pp
+By default, the main thread's stack size is equal to the value of
+.Dv RLIMIT_STACK
+for the process.
+If the
+.Ev LIBPTHREAD_SPLITSTACK_MAIN
+environment variable is present in the process environment
+(its value does not matter),
+the main thread's stack is reduced to 4MB on 64bit architectures, and to
+2MB on 32bit architectures, when the threading library is initialized.
+The rest of the address space area which has been reserved by the
+kernel for the initial process stack is used for non-initial thread stacks
+in this case.
+The presence of the
+.Ev LIBPTHREAD_BIGSTACK_MAIN
+environment variable overrides
+.Ev LIBPTHREAD_SPLITSTACK_MAIN ;
+it is kept for backward-compatibility.
+.Pp
+The size of stacks for threads created by the process at run-time
+with the
+.Xr pthread_create 3
+call is controlled by thread attributes: see
+.Xr pthread_attr 3 ,
+in particular, the
+.Xr pthread_attr_setstacksize 3 ,
+.Xr pthread_attr_setguardsize 3
+and
+.Xr pthread_attr_setstackaddr 3
+functions.
+If no attributes for the thread stack size are specified, the default
+non-initial thread stack size is 2MB for 64bit architectures, and 1MB
+for 32bit architectures.
+.Sh RUN-TIME SETTINGS
+The following environment variables are recognized by
+.Nm
+and adjust the operation of the library at run-time:
+.Bl -tag -width LIBPTHREAD_SPLITSTACK_MAIN
+.It Ev LIBPTHREAD_BIGSTACK_MAIN
+Disables the reduction of the initial thread stack enabled by
+.Ev LIBPTHREAD_SPLITSTACK_MAIN .
+.It Ev LIBPTHREAD_SPLITSTACK_MAIN
+Causes a reduction of the initial thread stack, as described in the
+section
+.Sx THREAD STACKS .
+This was the default behaviour of
+.Nm
+before
+.Fx 11.0 .
+.It Ev LIBPTHREAD_SPINLOOPS
+The integer value of the variable overrides the default count of
+iterations in the
+.Li spin loop
+of the mutex acquisition.
+The default count is 2000, set by the
+.Dv MUTEX_ADAPTIVE_SPINS
+constant in the
+.Nm
+sources.
+.It Ev LIBPTHREAD_YIELDLOOPS
+A non-zero integer value enables the yield loop
+in the process of the mutex acquisition.
+The value is the count of loop operations.
+.It Ev LIBPTHREAD_QUEUE_FIFO
+The integer value of the variable specifies how often blocked
+threads are inserted at the head of the sleep queue, instead of its tail.
+Bigger values reduce the frequency of the FIFO discipline.
+The value must be between 0 and 255.
+.El
+.Sh INTERACTION WITH RUN-TIME LINKER
+The
+.Nm
+library must appear before
+.Li libc
+in the global order of depended objects.
+.Pp
+Loading
+.Nm
+with the
+.Xr dlopen 3
+call in the process after the program binary is activated
+is not supported, and causes miscellaneous and hard-to-diagnose misbehaviour.
+This is due to
+.Nm
+interposing several important
+.Li libc
+symbols to provide thread-safe services.
+In particular,
+.Dv errno
+and the locking stubs from
+.Li libc
+are affected.
+This requirement is currently not enforced.
+.Pp
+If the program loads any modules at run-time, and those modules may require
+threading services, the main program binary must be linked with
+.Li libpthread ,
+even if it does not require any services from the library.
+.Pp
+.Nm
+cannot be unloaded; the
+.Xr dlclose 3
+function does not perform any action when called with a handle for
+.Nm .
+One of the reasons is that the interposing of
+.Li libc
+functions cannot be undone.
+.Sh SIGNALS
+The implementation also interposes the user-installed
+.Xr signal 3
+handlers.
+This interposing is done to postpone signal delivery to threads which
+entered (libthr-internal) critical sections, where the calling
+of the user-provided signal handler is unsafe.
+An example of such a situation is owning the internal library lock.
+When a signal is delivered while the signal handler cannot be safely
+called, the call is postponed and performed after the exit from
+the critical section.
+This should be taken into account when interpreting
+.Xr ktrace 1
+logs.
 .Sh SEE ALSO
-.Xr pthread 3
+.Xr ktrace 1 ,
+.Xr ld-elf.so.1 1 ,
+.Xr getrlimit 2 ,
+.Xr umtx 2 ,
+.Xr dlclose 3 ,
+.Xr dlopen 3 ,
+.Xr errno 3 ,
+.Xr getenv 3 ,
+.Xr libc 3 ,
+.Xr pthread_attr 3 ,
+.Xr pthread_attr_setstacksize 3 ,
+.Xr pthread_create 3 ,
+.Xr signal 3 ,
+.Xr atomic 9 .
 .Sh AUTHORS
 .An -nosplit
 The
diff --git a/lib/libthr/thread/thr_init.c b/lib/libthr/thread/thr_init.c
index 9bf0e29..72a067a 100644
--- a/lib/libthr/thread/thr_init.c
+++ b/lib/libthr/thread/thr_init.c
@@ -445,7 +445,7 @@ init_private(void)
 	struct rlimit rlim;
 	size_t len;
 	int mib[2];
-	char *env;
+	char *env, *env_bigstack, *env_splitstack;
 
 	_thr_umutex_init(&_mutex_static_lock);
 	_thr_umutex_init(&_cond_static_lock);
@@ -473,8 +473,9 @@ init_private(void)
 		len = sizeof (_usrstack);
 		if (sysctl(mib, 2, &_usrstack, &len, NULL, 0) == -1)
 			PANIC("Cannot get kern.usrstack from sysctl");
-		env = getenv("LIBPTHREAD_BIGSTACK_MAIN");
-		if (env != NULL) {
+		env_bigstack = getenv("LIBPTHREAD_BIGSTACK_MAIN");
+		env_splitstack = getenv("LIBPTHREAD_SPLITSTACK_MAIN");
+		if (env_bigstack != NULL || env_splitstack == NULL) {
 			if (getrlimit(RLIMIT_STACK, &rlim) == -1)
 				PANIC("Cannot get stack rlimit");
 			_thr_stack_initial = rlim.rlim_cur;
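
For anyone testing the new default, the following sketch may help to
check which soft stack limit the process actually sees.  It repeats the
getrlimit(RLIMIT_STACK) query that the hunk above uses to set
_thr_stack_initial.  The helper name initial_stack_limit() is
hypothetical and is not part of libthr.

```c
#include <sys/types.h>
#include <sys/time.h>
#include <sys/resource.h>

/*
 * Hypothetical helper, not part of libthr.  Stores the soft
 * RLIMIT_STACK value of the calling process in *out and returns 0,
 * or returns -1 if getrlimit(2) fails.  With the patch applied and
 * no knobs set, this is the size the main thread stack may grow to.
 */
static int
initial_stack_limit(rlim_t *out)
{
	struct rlimit rlim;

	if (getrlimit(RLIMIT_STACK, &rlim) == -1)
		return (-1);
	*out = rlim.rlim_cur;
	return (0);
}
```

Printing the result from a small program run with and without
LIBPTHREAD_SPLITSTACK_MAIN in the environment gives the limit to compare
against the observed depth at which the main thread faults.
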
Received on Tue Sep 23 2014 - 07:24:57 UTC