Re: libthr and main thread stack size

From: Konstantin Belousov <kostikbel_at_gmail.com>
Date: Tue, 23 Sep 2014 13:48:14 +0300
On Tue, Sep 23, 2014 at 02:02:01PM +0400, Sergey Kandaurov wrote:
> > +               env_bigstack = getenv("LIBPTHREAD_BIGSTACK_MAIN");
> > +               env_splitstack = getenv("LIBPTHREAD_SPLITSTACK_MAIN");
> > +               if (bigstack != NULL || env_splitstack == NULL) {
> looks like a typo: s/bigstack/env_bigstack/

Indeed, thank you for noting.  This patch actually booted, and I verified
all 5 conditions.

diff --git a/lib/libthr/libthr.3 b/lib/libthr/libthr.3
index bfbebec..a5b75d4 100644
--- a/lib/libthr/libthr.3
+++ b/lib/libthr/libthr.3
_at__at_ -1,6 +1,11 _at__at_
 .\" Copyright (c) 2005 Robert N. M. Watson
+.\" Copyright (c) 2014 The FreeBSD Foundation, Inc.
 .\" All rights reserved.
 .\"
+.\" Part of this documentation was written by
+.\" Konstantin Belousov <kib_at_FreeBSD.org> under sponsorship
+.\" from the FreeBSD Foundation.
+.\"
 .\" Redistribution and use in source and binary forms, with or without
 .\" modification, are permitted provided that the following conditions
 .\" are met:
_at__at_ -24,7 +29,7 _at__at_
 .\"
 .\" $FreeBSD$
 .\"
-.Dd October 19, 2007
+.Dd September 20, 2014
 .Dt LIBTHR 3
 .Os
 .Sh NAME
_at__at_ -45,8 +50,216 _at__at_ has been optimized for use by applications expecting system scope thread
 semantics, and can provide significant performance improvements
 compared to
 .Lb libkse .
+.Pp
+The library is tightly integrated with the run-time link editor
+.Xr ld-elf.so.1 1
+and
+.Lb libc ;
+all three components must be built from the same source tree.
+Mixing
+.Li libc
+and
+.Nm
+libraries from different versions of
+.Fx
+is not supported.
+The run-time linker
+.Xr ld-elf.so.1 1
+has some code to ensure backward-compatibility with older versions of
+.Nm .
+.Pp
+The man page documents the quirks and tunables of the
+.Nm .
+When linking with
+.Li -lpthread ,
+the run-time dependency
+.Li libthr.so.3
+is recorded in the produced object.
+.Sh MUTEX ACQUISITION
+A locked mutex (see
+.Xr pthread_mutex_lock 3 )
+is represented by a volatile variable of type
+.Dv lwpid_t ,
+which records the global system identifier of the thread
+owning the lock.
+.Nm
+performs a contested mutex acquisition in three stages, each of which
+is more resource-consuming than the previous.
+.Pp
+First, a spin loop
+is performed, where the library attempts to acquire the lock by
+.Xr atomic 9
+operations.
+The loop count is controlled by the
+.Ev LIBPTHREAD_SPINLOOPS
+environment variable, with a default value of 2000.
+.Pp
+If the spin loop
+was unable to acquire the mutex, a yield loop
+is executed, performing the same
+.Xr atomic 9
+acquisition attempts as the spin loop,
+but each attempt is followed by a yield of the CPU time
+of the thread using the
+.Xr sched_yield 2
+syscall.
+By default, the yield loop
+is not executed.
+This is controlled by the
+.Ev LIBPTHREAD_YIELDLOOPS
+environment variable.
+.Pp
+If both the spin and yield loops
+failed to acquire the lock, the thread is taken off the CPU and
+put to sleep in the kernel with the
+.Xr umtx 2
+syscall.
+The kernel wakes up a thread and hands the ownership of the lock to
+the woken thread when the lock becomes available.
+.Sh THREAD STACKS
+Each thread is provided with a private user-mode stack area
+used by the C runtime.
+The size of the main (initial) thread stack is set by the kernel, and is
+controlled by the
+.Dv RLIMIT_STACK
+process resource limit (see
+.Xr getrlimit 2 ) .
+.Pp
+By default, the main thread's stack size is equal to the value of
+.Dv RLIMIT_STACK
+for the process.
+If the
+.Ev LIBPTHREAD_SPLITSTACK_MAIN
+environment variable is present in the process environment
+(its value does not matter),
+the main thread's stack is reduced to 4MB on 64bit architectures, and to
+2MB on 32bit architectures, when the threading library is initialized.
+The rest of the address space area which has been reserved by the
+kernel for the initial process stack is used for non-initial thread stacks
+in this case.
+The presence of the
+.Ev LIBPTHREAD_BIGSTACK_MAIN
+environment variable overrides
+.Ev LIBPTHREAD_SPLITSTACK_MAIN ;
+it is kept for backward-compatibility.
+.Pp
+The size of stacks for threads created by the process at run-time
+with the
+.Xr pthread_create 3
+call is controlled by thread attributes: see
+.Xr pthread_attr 3 ,
+in particular, the
+.Xr pthread_attr_setstacksize 3 ,
+.Xr pthread_attr_setguardsize 3
+and
+.Xr pthread_attr_setstackaddr 3
+functions.
+If no attributes for the thread stack size are specified, the default
+non-initial thread stack size is 2MB for 64bit architectures, and 1MB
+for 32bit architectures.
+.Sh RUN-TIME SETTINGS
+The following environment variables are recognized by
+.Nm
+and adjust the operation of the library at run-time:
+.Bl -tag -width LIBPTHREAD_SPLITSTACK_MAIN
+.It Ev LIBPTHREAD_BIGSTACK_MAIN
+Disables the reduction of the initial thread stack enabled by
+.Ev LIBPTHREAD_SPLITSTACK_MAIN .
+.It Ev LIBPTHREAD_SPLITSTACK_MAIN
+Causes a reduction of the initial thread stack, as described in the
+section
+.Sx THREAD STACKS .
+This was the default behaviour of
+.Nm
+before
+.Fx 11.0 .
+.It Ev LIBPTHREAD_SPINLOOPS
+The integer value of the variable overrides the default count of
+iterations in the
+.Li spin loop
+of the mutex acquisition.
+The default count is 2000, set by the
+.Dv MUTEX_ADAPTIVE_SPINS
+constant in the
+.Nm
+sources.
+.It Ev LIBPTHREAD_YIELDLOOPS
+A non-zero integer value enables the yield loop
+in the process of the mutex acquisition.
+The value is the count of loop operations.
+.It Ev LIBPTHREAD_QUEUE_FIFO
+The integer value of the variable specifies how often blocked
+threads are inserted at the head of the sleep queue, instead of its tail.
+Bigger values reduce the frequency of the FIFO discipline.
+The value must be between 0 and 255.
+.El
+.Sh INTERACTION WITH RUN-TIME LINKER
+The
+.Nm
+library must appear before
+.Li libc
+in the global order of depended objects.
+.Pp
+Loading
+.Nm
+with the
+.Xr dlopen 3
+call in the process after the program binary is activated
+is not supported, and causes miscellaneous and hard-to-diagnose misbehaviour.
+This is due to
+.Nm
+interposing several important
+.Li libc
+symbols to provide thread-safe services.
+In particular,
+.Dv errno
+and the locking stubs from
+.Li libc
+are affected.
+This requirement is currently not enforced.
+.Pp
+If the program loads any modules at run-time, and those modules may require
+threading services, the main program binary must be linked with
+.Li libpthread ,
+even if it does not require any services from the library.
+.Pp
+.Nm
+cannot be unloaded; the
+.Xr dlclose 3
+function does not perform any action when called with a handle for
+.Nm .
+One of the reasons is that the interposing of
+.Li libc
+functions cannot be undone.
+.Sh SIGNALS
+The implementation also interposes the user-installed
+.Xr signal 3
+handlers.
+This interposing is done to postpone signal delivery to threads which
+entered (libthr-internal) critical sections, where the calling
+of the user-provided signal handler is unsafe.
+An example of such a situation is owning the internal library lock.
+When a signal is delivered while the signal handler cannot be safely
+called, the call is postponed and performed until after the exit from
+the critical section.
+This should be taken into account when interpreting
+.Xr ktrace 1
+logs.
 .Sh SEE ALSO
-.Xr pthread 3
+.Xr ktrace 1 ,
+.Xr ld-elf.so.1 1 ,
+.Xr getrlimit 2 ,
+.Xr umtx 2 ,
+.Xr dlclose 3 ,
+.Xr dlopen 3 ,
+.Xr errno 3 ,
+.Xr getenv 3 ,
+.Xr libc 3 ,
+.Xr pthread_attr 3 ,
+.Xr pthread_attr_setstacksize 3 ,
+.Xr pthread_create 3 ,
+.Xr signal 3 ,
+.Xr atomic 9
 .Sh AUTHORS
 .An -nosplit
 The
diff --git a/lib/libthr/thread/thr_init.c b/lib/libthr/thread/thr_init.c
index 9bf0e29..6d6a532 100644
--- a/lib/libthr/thread/thr_init.c
+++ b/lib/libthr/thread/thr_init.c
_at__at_ -445,7 +445,7 _at__at_ init_private(void)
 	struct rlimit rlim;
 	size_t len;
 	int mib[2];
-	char *env;
+	char *env, *env_bigstack, *env_splitstack;
 
 	_thr_umutex_init(&_mutex_static_lock);
 	_thr_umutex_init(&_cond_static_lock);
_at__at_ -473,8 +473,9 _at__at_ init_private(void)
 		len = sizeof (_usrstack);
 		if (sysctl(mib, 2, &_usrstack, &len, NULL, 0) == -1)
 			PANIC("Cannot get kern.usrstack from sysctl");
-		env = getenv("LIBPTHREAD_BIGSTACK_MAIN");
-		if (env != NULL) {
+		env_bigstack = getenv("LIBPTHREAD_BIGSTACK_MAIN");
+		env_splitstack = getenv("LIBPTHREAD_SPLITSTACK_MAIN");
+		if (env_bigstack != NULL || env_splitstack == NULL) {
 			if (getrlimit(RLIMIT_STACK, &rlim) == -1)
 				PANIC("Cannot get stack rlimit");
 			_thr_stack_initial = rlim.rlim_cur;
Received on Tue Sep 23 2014 - 08:48:22 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:52 UTC