Re: unbounded sleep on [fifoow] while open a named pipe: is it a feature?

From: Jilles Tjoelker <jilles_at_stack.nl>
Date: Fri, 28 Jan 2011 13:14:12 +0100

On Fri, Jan 28, 2011 at 12:43:38PM +0300, Sergey Kandaurov wrote:
> That's FreeBSD 8.1-RELEASE i386 (w/o debug).

> It's observed for bash processes which end up in unbounded sleep
> in [fifoow] wchan while executing the next script:

> %%%
> #!/usr/local/bin/bash

> LOG_FACILITY="local7.notice"
> LOG_TOPIC_OUT="$TAG-output[$$]"
> LOG_TOPIC_ERR="$TAG-error[$$]"

> cd $WORK_DIR

> exec  > >(logger -p "$LOG_FACILITY" -t "$LOG_TOPIC_OUT" )
> exec 2> >(logger -p "$LOG_FACILITY" -t "$LOG_TOPIC_ERR" )
> eval exec $1
> %%%

> It's used to call $1 and redirect its stdout and stderr streams
> to different files. Bash implements this functionality by creating
> on every execution a new named pipe stored at /var/tmp/ directory.
> This script is used to run periodically from cron(8).

> Note, that bash has a misfeature to not delete those named pipes.

I think that has more to do with the 'exec'. Bash deletes process
substitution fifos when the process using them is done, but 'exec'
replaces the shell with the new program, so bash never gets the chance
to clean up.

The original use case for process substitution was
  program <(command)
Since it is not known when 'program' opens the fifo, the fifo has to be
kept around until 'program' is done.

> With a high creation frequency of named pipes (several per minute) it
> eventually ends up with a large number [~137000] of existing named
> pipes, when it was found that a number [~44] of bash processes sleep
> on [fifoow], which is guessed to be likely caused by the number of
> fifo pipes created.

> For example, this one sleeps for about 1.5 days:
>  1001 78809 78803   0  50  0  4564  1952 fifoow Is    ??    0:00.00
> /usr/local/bin/bash ./logger_wrapper.sh ./sync_overlords.pl

> # procstat -f 78809
> [...]
> 78809 bash                1 f - -w------   1       0 -   /var/tmp/sh-np-[random]
> [...]

If a process gets "stuck" in [fifoow], it means that the process that is
supposed to read from the fifo fails to do so.
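The blocking behaviour can be reproduced in isolation. The following is
only a minimal sketch in plain sh; the directory and fifo names are made
up for the demonstration and have nothing to do with the names bash
generates. A writer is started with no reader present, is checked after
one second, and is then released by attaching a reader. On FreeBSD the
blocked writer should show up with the fifoow wchan while it waits.

```shell
#!/bin/sh
# Demonstration only: names below are hypothetical.
dir=$(mktemp -d "${TMPDIR:-/tmp}/fifodemo.XXXXXXXXXX") || exit
fifo=$dir/demo.fifo
mkfifo "$fifo" || exit

# Writer: the redirection blocks in open(2) because nobody has the
# fifo open for reading yet.
( echo "hello" >"$fifo" ) &
writer=$!

# Give the writer time to reach the open, then see whether it is still there.
sleep 1
if kill -0 "$writer" 2>/dev/null; then
    writer_state="blocked"
else
    writer_state="finished"
fi
echo "writer after 1s with no reader: $writer_state"

# Attaching a reader lets the writer's open (and its write) complete.
received=$(cat "$fifo")
wait "$writer"
echo "reader got: $received"

rm -rf "$dir"
```

If the reader never shows up, as in the reported case, the writer simply
stays in that first state forever.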

This may be caused by the same pathname being used more than once;
bash's random name generation algorithm is not very good.

How about something like this instead of the two exec commands with
process substitution (untested):

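# Private directory, so the fifo names cannot collide with anything else.
tempdir=$(mktemp -d /tmp/logtmp.XXXXXXXXXX) || exit
fifo1=$tempdir/stdout.fifo
fifo2=$tempdir/stderr.fifo
mkfifo "$fifo1" "$fifo2" || { rm -rf "$tempdir"; exit 1; }
# Start the readers first; each one waits in open until a writer appears.
logger -p "$LOG_FACILITY" -t "$LOG_TOPIC_OUT" <"$fifo1" &
logger -p "$LOG_FACILITY" -t "$LOG_TOPIC_ERR" <"$fifo2" &
# These opens block until both loggers have their fifo open for reading.
exec >"$fifo1" 2>"$fifo2"
# Both ends are open now, so the names are no longer needed.
rm -rf "$tempdir"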

Opening a fifo for writing blocks until a reader has it open. So once
the fifos have been opened for writing, the readers must already have
opened them too, and it is safe to unlink them.
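That unlinking an open fifo does not disturb the data flow can be checked
with a small stand-alone test. This is a sketch only: cat stands in for
logger, and the file and directory names are invented for the example.

```shell
#!/bin/sh
# Demonstration only: cat plays the role of logger, names are hypothetical.
dir=$(mktemp -d "${TMPDIR:-/tmp}/unlinkdemo.XXXXXXXXXX") || exit
out=$(mktemp "${TMPDIR:-/tmp}/unlinkdemo.out.XXXXXXXXXX") || exit
fifo=$dir/demo.fifo
mkfifo "$fifo" || exit

# Reader in the background; it opens the fifo for reading.
cat <"$fifo" >"$out" &
reader=$!

# Open the write end on fd 3; this returns only once the reader has
# the fifo open.
exec 3>"$fifo"

# Both ends are open, so the name can go away.
rm -rf "$dir"

# Data written after the unlink still reaches the reader.
echo "written after unlink" >&3
exec 3>&-          # close the write end so the reader sees EOF
wait "$reader"

result=$(cat "$out")
echo "reader received: $result"
rm -f "$out"
```

The fifo itself lives on until the last descriptor referring to it is
closed; only its directory entry is removed.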

If you have a secure directory you may create the fifos there instead of
creating a temporary directory.

Alternatively, mount the full /dev/fd and compile bash to use it. This
avoids needing to unlink fifos.
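To see which mechanism a given bash build uses, one can look at what a
process substitution expands to. The sketch below assumes bash is in
PATH; the helper function name "show" is made up. The helper also reads
from the path it is given, so that on a fifo-based build the writer side
is not left waiting for a reader.

```shell
#!/bin/sh
# Print the pathname that bash substitutes for <(...).
# "show" is a hypothetical helper: it echoes the path and then drains it.
if command -v bash >/dev/null 2>&1; then
    path=$(bash -c 'show() { echo "$1"; cat "$1" >/dev/null; }; show <(true)')
else
    path=""
fi

case $path in
    /dev/fd/*) mechanism="/dev/fd" ;;
    "")        mechanism="unknown (bash not found)" ;;
    *)         mechanism="named pipe" ;;
esac
echo "process substitution expands to: ${path:-n/a} ($mechanism)"
```

A path under /dev/fd means no fifo is created in the file system at all,
so there is nothing left behind to unlink.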

-- 
Jilles Tjoelker
Received on Fri Jan 28 2011 - 11:14:14 UTC
