Re: unbounded sleep on [fifoow] while open a named pipe: is it a feature?

From: Sergey Kandaurov <pluknet_at_gmail.com>
Date: Wed, 2 Feb 2011 18:06:55 +0300
On 28 January 2011 15:14, Jilles Tjoelker <jilles_at_stack.nl> wrote:
> On Fri, Jan 28, 2011 at 12:43:38PM +0300, Sergey Kandaurov wrote:
>> That's FreeBSD 8.1-RELEASE i386 (w/o debug).
>
>> We observe bash processes ending up in an unbounded sleep
>> on the [fifoow] wchan while executing the following script:
>
>> %%%
>> #!/usr/local/bin/bash
>
>> LOG_FACILITY="local7.notice"
>> LOG_TOPIC_OUT="$TAG-output[$$]"
>> LOG_TOPIC_ERR="$TAG-error[$$]"
>
>> cd $WORK_DIR
>
>> exec  > >(logger -p "$LOG_FACILITY" -t "$LOG_TOPIC_OUT" )
>> exec 2> >(logger -p "$LOG_FACILITY" -t "$LOG_TOPIC_ERR" )
>> eval exec $1
>> %%%
>
>> It's used to call $1 and redirect its stdout and stderr streams
>> to different files. Bash implements this by creating a new named
>> pipe in the /var/tmp/ directory on every execution.
>> This script is run periodically from cron(8).
>
>> Note that bash has a misfeature: it does not delete those named pipes.
>
> I think that is more to do with the 'exec'. Bash will delete process
> substitution fifos when the process using them is done, but 'exec' will
> prevent this.

Yes, indeed.

>
> The original use case for process substitution was
>  program <(command)
> and it may not be known when 'program' opens the fifo so the fifo needs
> to be kept around until 'program' is done.
>
>> With a high creation frequency of named pipes (several per minute), we
>> eventually ended up with a large number [~137000] of existing named
>> pipes, at which point we found a number [~44] of bash processes sleeping
>> on [fifoow]. We guess this is caused by the number of
>> fifo pipes created.
>
>> For example, this one has been sleeping for about 1.5 days:
>>  1001 78809 78803   0  50  0  4564  1952 fifoow Is    ??    0:00.00
>> /usr/local/bin/bash ./logger_wrapper.sh ./sync_overlords.pl
>
>> # procstat -f 78809
>> [...]
>> 78809 bash                1 f - -w------   1       0 -   /var/tmp/sh-np-[random]
>> [...]
>
> If a process gets "stuck" in [fifoow], it means that the process that is
> supposed to read from the fifo fails to do so.
>
> This may be because of multiple uses of the same pathname -- bash's random
> name generation algorithm is not very good.
>
> How about something like this instead of the two exec commands with
> process substitution (untested):
>
> tempdir=$(mktemp -d /tmp/logtmp.XXXXXXXXXX) || exit
> fifo1=$tempdir/stdout.fifo
> fifo2=$tempdir/stderr.fifo
> mkfifo "$fifo1" "$fifo2" || exit
> logger -p "$LOG_FACILITY" -t "$LOG_TOPIC_OUT" <"$fifo1" &
> logger -p "$LOG_FACILITY" -t "$LOG_TOPIC_ERR" <"$fifo2" &
> exec >"$fifo1" 2>"$fifo2"
> rm -rf "$tempdir"
>
> Once the fifos have been opened for writing, the readers must have
> already opened them too, therefore it is safe to unlink them.

We switched to this solution today (with a minor change).
So far, so good. Thank you very much.
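
For reference, a minimal sketch of why unlinking right after the open is
safe. It is not the script we deployed: cat stands in for logger(1) so the
data can be inspected, and the names fifodemo, capture and result are
made up for the illustration. The writer's open only returns once the
reader has opened the fifo, so the pathname is no longer needed afterwards.

```shell
#!/bin/sh
# Illustration only: 'cat' replaces logger(1); all names here are hypothetical.

tempdir=$(mktemp -d "${TMPDIR:-/tmp}/fifodemo.XXXXXXXXXX") || exit 1
capture=$(mktemp "${TMPDIR:-/tmp}/fifodemo-out.XXXXXXXXXX") || exit 1
fifo=$tempdir/out.fifo
mkfifo "$fifo" || exit 1

# Reader: opens the fifo for reading in the background.
cat <"$fifo" >"$capture" &
reader=$!

# Writer: this open blocks until the reader has opened the fifo
# (a writer with no reader is what sits in [fifoow]).
exec 3>"$fifo"

# Both ends are open now, so the directory entry can go away.
rm -rf "$tempdir"

echo "hello through an unlinked fifo" >&3
exec 3>&-            # close the write end so the reader sees EOF
wait "$reader"

result=$(cat "$capture")
rm -f "$capture"
echo "$result"
```

If the background reader never starts, the exec line is where the script
would hang, which matches the stuck processes described above.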

-- 
wbr,
pluknet
Received on Wed Feb 02 2011 - 14:06:59 UTC
