Re: qmail uses 100% cpu after FreeBSD-5.0 to 5.1 upgrade

From: Don Lewis <truckman_at_FreeBSD.org> Date: Mon, 16 Jun 2003 10:29:57 -0700 (PDT)

On 16 Jun, Bruce Evans wrote:
> On Mon, 16 Jun 2003, Don Lewis wrote:
> 
>> On 16 Jun, Bruce Evans wrote:
>> > In my review of 1.87, I forgot to ask you how atomic the close is with part
>> > of it moved out to fifo_inactive().  I think it's important that all
>> > traces of the old open have gone away (as far as applications can tell)
>> > when the last close returns.
>>
>> I hadn't taken queued data into consideration.  Now that I've looked at
>> this more closely, there are other problems in both the old and new
>> code.  If a process calls fcntl(fd, F_SETOWN, ...) on one end of the
>> fifo, that should be undone when that end of the fifo is closed.  In the
>> old implementation, that only happens when both ends of the fifo are
>> closed and the sockets are deleted.
> 
> F_SETOWN (and associated signal delivery) is even more broken than that :-].
> This fcntl() should be applied to the file (though not just the file descriptor),
> so its effect should be limited to fd's open in the file instance and go
> away when all these are closed.  However, F_SETOWN (and associated signal
> delivery) actually applies to the socket for fifos.  It doesn't work quite
> right for ttys either.  F_SETOWN apparently isn't used in ways complicated
> enough to require it to work right.

There is a fundamental architectural problem -- devices and files don't
have a list of the descriptors that have them open.  That would require
putting descriptors on another list (and dealing with the necessary
locking), which would also bloat the size of the descriptor structure.
Storing the F_SETOWN info there would bloat all descriptors even more
rather than the relative handful of device structures that support this
feature.

>> >> Now there are two questions that I can't answer:
>> >>
>> >> 	Why is my analysis of select() and the SS_CANTRCVMORE flag
>> >> 	incorrect in 5.1-current with version 1.87 or 1.88 of
>> >> 	fifo_vnops.c?
>> >
>> > I think it is correct, assuming that something writes to the fifo.
>> > Writing might be part of synchronization but actually reading the
>> > data should not be necessary since the last close must discard the
>> > data (POSIX spec).
>>
>> It sure looks to me like SS_CANTRCVMORE is always set when the write end
>> of the fifo is closed, no matter whether the sockets were freshly
>> allocated by a fifo_open() call on the read end of the fifo, or because
>> the last writer closed the write end of the fifo.  It sure looks
>> like select() should immediately return if this flag is set, but it is
>> not returning ...
> 
> Alfred changed the semantics for 5.x.  I thought that you knew this.
> I finally gave up resisting this change after a lot of email :-).  In
> 5.x, SS_CANTRCVMORE often has no effect for fifos (it still works
> normally for sockets).  fifo_poll() normally calls soo_poll() with
> POLLIN converted to POLLINIGNEOF.  This causes soo_poll() (sopoll())
> to skip the usual SS_CANTRCVMORE check (which is inside soreadable())
> and check the watermark instead, so that select() on a fifo normally
> waits for data even when the fifo is open in nonblocking mode and
> SS_CANTRCVMORE is set.

Nope, I didn't know this, and I missed the POLLIN->POLLINIGNEOF
conversion when I was tracing the code.