Probably a known issue, but I thought it worthwhile reporting it, if nothing else for archival purposes.

I think our userland thread library (libc_r) has some bugs in handling descriptors. I can reproduce the behaviour on -current and 4.x, and I believe it applies to 5.x too.

Following is a description of the problem and some code to replicate it. The code includes a workaround, but it is not particularly nice. Any better ideas? I am not sure what to do, but perhaps the only sensible thing is to add a note with this workaround (or better ones, if available) to our pthreads manpage.

--- PROBLEM DESCRIPTION ---

Basically, our libc_r keeps two views of i/o descriptors. One (external) is for threads and reflects the modes requested by the threads (blocking or not, etc.). The "internal" view instead is how descriptors are actually set in the kernel, and there they should always be set as O_NONBLOCK to avoid blocking on a syscall.

The bug occurs when a process does a fork(), and then either a close() or an exec(). A similar thing also occurs with popen(). The relevant source code is in

    /usr/src/lib/libc_r/uthread/uthread_execve.c
    /usr/src/lib/libc_r/uthread/uthread_close.c

Right before the exec(), the internal descriptors are put into blocking mode if the external ones are blocking, and they are only reset to O_NONBLOCK after termination of the child (upon SIGCHLD). The same occurs for close(). Note that close() has hacks to leave pipes alone, but the same code is not present in the execve() case, where I believe it would be necessary.

Another thing to note is that there is some kind of 'fate sharing' among the stdio descriptors (0, 1, 2) which is not totally clear to me, but it seems to require setting O_NONBLOCK on all 3 to make sure that they are not changed to blocking mode.
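The underlying sharing can be checked without libc_r at all. What follows is a minimal sketch, assuming a POSIX system and the regular C library; the function name is hypothetical and is not taken from libc_r or from the attached programs. It sets O_NONBLOCK on a pipe in the parent, lets a forked child clear it (roughly what happens to the internal descriptors before the exec()), and then looks at the flag again from the parent.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if clearing O_NONBLOCK in a forked child also clears it
 * in the parent, 0 if the parent's flag survives, -1 on error.
 */
int child_flag_change_visible_in_parent(void)
{
    int fd[2];
    int fl, status;
    pid_t p;

    if (pipe(fd) == -1)
        return -1;

    /* Parent: make the read end non-blocking. */
    fl = fcntl(fd[0], F_GETFL);
    if (fl == -1 || fcntl(fd[0], F_SETFL, fl | O_NONBLOCK) == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    p = fork();
    if (p == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (p == 0) {
        /* Child: put the inherited descriptor back into blocking mode. */
        int cfl = fcntl(fd[0], F_GETFL);
        if (cfl == -1 || fcntl(fd[0], F_SETFL, cfl & ~O_NONBLOCK) == -1)
            _exit(1);
        _exit(0);
    }

    /* Parent: wait for the child, then re-read the flags. */
    if (waitpid(p, &status, 0) == -1 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }

    fl = fcntl(fd[0], F_GETFL);
    close(fd[0]);
    close(fd[1]);
    if (fl == -1)
        return -1;

    return (fl & O_NONBLOCK) ? 0 : 1;
}
```

The file status flags (O_NONBLOCK among them) belong to the open file description, which parent and child share after fork(), so the helper is expected to return 1: the child's change is visible in the parent.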
Because descriptors are shared between parent and child, for the lifetime of the child the descriptors in the parent will be blocking, and the scheduling of threads will be completely broken.

The only fix I have found is to act as follows:

    pipe(fd);       /* create a pipe with the child */
    p = fork();
    if (p == 0) {   /* child */
        /* call fcntl() _before_ close() to avoid resetting
         * O_NONBLOCK on the internal descriptors. After that,
         * close the descriptors not needed in the child.
         */
        for (i = 0; i < getdtablesize(); i++) {
            int fl = fcntl(i, F_GETFL);     /* -1 if i is not open */
            if (fl != -1 && i != fd[0]) {
                /* open and must be closed in the child */
                fcntl(i, F_SETFL, O_NONBLOCK | fl);
                close(i);
            }
        }
        /* standard stuff (dup2, exec*()...) */
        dup2(fd[0], STDOUT_FILENO);     /* as an example */
        execl(....);
    } else {        /* parent */
        close(fd[0]);   /* close child end. */
        ...
    }

but of course this is rather unintuitive. On the other hand, I have no idea of a better way to address the problem, and being fairly new to threads programming, maybe others know better.

I am attaching two minimal programs to demonstrate the bug.

simple.c is a simple program (linked against the regular C library)

    cc -o simple simple.c

that only plays with blocking mode on the descriptors.

thre.c is meant to be linked with libc_r.

    cc -o thre thre.c -lc_r

It does a fork and exec of the other program. If you call it without arguments, it does not implement the above workaround, and you see how the 'internal' descriptors change to blocking mode. If you call it with an argument, it implements the workaround.

enjoy
luigi

Received on Wed Jun 15 2005 - 07:54:48 UTC