Re: Deadlock, exclusive sx so_rcv_sx, amd64

From: Robert Watson <rwatson_at_FreeBSD.org>
Date: Fri, 26 Oct 2007 22:42:13 +0100 (BST)

On Fri, 26 Oct 2007, John Baldwin wrote:

> "sbwait" is waiting for data to come in on a socket and "pfault" is waiting 
> on disk I/O.  It is a bit odd that 1187 is holding a lock while sleeping 
> though that is permitted with an sx lock.  Still, if it's supposed to 
> protect the socket's receive buffer that is odd.  Maybe get a trace of the 
> process blocked in "sbwait" (tr <pid>) and bug rwatson_at_ about it.

This is normal -- there are two kinds of locks on each socket buffer: a mutex 
protecting the integrity of the data structure, and an sx lock serializing I/O 
on the socket buffer.  The latter is intended to prevent I/O interlacing, and 
replaced the older sblock/sbunlock implemented using tsleep(), flags, and the 
mutex as an interlock.  It is normal for the sx lock to be held over sleeps -- 
both sbwait, indicating that the I/O has not yet been completed but is waiting 
on the network or remote endpoint, and a page fault, indicating that a data 
copy to or from user space is in progress and has blocked waiting on paging. 
Other threads blocked on the sx lock sleep interruptibly, thanks to Attilio's 
addition of interruptible sx lock calls.
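
To make the two-lock arrangement concrete, here is a rough userspace sketch 
using pthreads rather than the kernel primitives.  It is an analogy only, not 
the socket code: struct sb_sketch, sb_deliver(), sb_receive() and run_demo() 
are hypothetical names, a plain pthread mutex stands in for the sx lock, and a 
condition variable wait stands in for sbwait.  The point it shows is that the 
I/O lock stays held across the sleep, so two concurrent receives cannot 
interlace their data.

```c
#include <pthread.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical userspace stand-in for a socket receive buffer. */
struct sb_sketch {
    pthread_mutex_t mtx;   /* protects data/head/tail (the "mutex") */
    pthread_cond_t  cv;    /* signalled when bytes arrive (the "sbwait" wakeup) */
    pthread_mutex_t io;    /* serializes whole receives (stands in for the sx lock) */
    char   data[64];
    size_t head, tail;     /* bytes in [head, tail) are unread */
};

static struct sb_sketch sb = {
    .mtx = PTHREAD_MUTEX_INITIALIZER,
    .cv  = PTHREAD_COND_INITIALIZER,
    .io  = PTHREAD_MUTEX_INITIALIZER,
};

/* "Network" side: append bytes and wake any sleeping receiver. */
static void sb_deliver(const char *p, size_t n)
{
    pthread_mutex_lock(&sb.mtx);
    if (n > sizeof(sb.data) - sb.tail)      /* sketch only: drop what does not fit */
        n = sizeof(sb.data) - sb.tail;
    memcpy(sb.data + sb.tail, p, n);
    sb.tail += n;
    pthread_cond_broadcast(&sb.cv);
    pthread_mutex_unlock(&sb.mtx);
}

/* Receive exactly `want` bytes, consuming them as they trickle in. */
static void sb_receive(char *out, size_t want)
{
    size_t got = 0;

    pthread_mutex_lock(&sb.io);             /* held across every sleep below */
    while (got < want) {
        pthread_mutex_lock(&sb.mtx);
        while (sb.head == sb.tail)
            pthread_cond_wait(&sb.cv, &sb.mtx);   /* drops mtx, keeps io */
        size_t n = sb.tail - sb.head;
        if (n > want - got)
            n = want - got;
        /* Copied under mtx for brevity; a real copy to user space could fault. */
        memcpy(out + got, sb.data + sb.head, n);
        sb.head += n;
        got += n;
        pthread_mutex_unlock(&sb.mtx);
    }
    pthread_mutex_unlock(&sb.io);
}

static void *reader(void *arg)
{
    char *out = arg;

    sb_receive(out, 4);
    out[4] = '\0';
    return NULL;
}

/* Two concurrent 4-byte receives against 8 bytes arriving one at a time. */
void run_demo(char a[5], char b[5])
{
    pthread_t t1, t2;
    const char *wire = "AAAABBBB";

    pthread_create(&t1, NULL, reader, a);
    pthread_create(&t2, NULL, reader, b);
    for (size_t i = 0; i < 8; i++) {
        sb_deliver(wire + i, 1);
        usleep(1000);
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
}
```

Calling run_demo() leaves one reader with the first four bytes and the other 
with the last four, in either order.  Without the io lock, both readers would 
compete for each byte as it arrived and their results could be interlaced.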

It's not impossible that there are deadlocks involved, but if so, they likely 
existed before the change to formal sx locks as the previous "by hand" lock 
construction had essentially identical (but slower) properties.  There is an 
interesting question about whether the strong semantics in the presence of 
interlaced I/O requests (i.e., simultaneous requests from multiple threads on 
a single socket) are required; if not, we might be able to weaken the 
locking here with some reworking of the socket buffer data structures and 
send/receive routines.  For the time being we should leave them as-is for 
stream sockets; we have already optimized them out for UDP sockets by virtue 
of a simplified sosend_dgram(), which was part of our optimization work for 
BIND.  FYI, BIND uses a single UDP socket for all transactions, and since each 
transaction is atomic (being a datagram), the overhead of socket buffer 
locking was significant, not to mention unnecessary.  This problem was 
originally pointed out by Jinmei Tatuya.
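
As a rough illustration of why datagrams do not need the serializing lock, 
here is a userspace sketch of a datagram queue in which each send or receive 
moves one whole record under the data-structure mutex alone.  It is not how 
sosend_dgram() is implemented; struct dgram_queue, dq_send() and dq_recv() are 
hypothetical names, and the fixed slot count and size are arbitrary.

```c
#include <pthread.h>
#include <string.h>

#define DQ_SLOTS  8     /* arbitrary queue depth for the sketch */
#define DQ_MAXLEN 32    /* arbitrary maximum datagram size */

struct dgram {
    size_t len;
    char   payload[DQ_MAXLEN];
};

struct dgram_queue {
    pthread_mutex_t mtx;            /* the only lock: protects the ring */
    struct dgram    slot[DQ_SLOTS];
    size_t          head, count;
};

static struct dgram_queue dq = { .mtx = PTHREAD_MUTEX_INITIALIZER };

/* Enqueue one whole datagram; returns 0, or -1 if it is too big or the queue is full. */
int dq_send(const char *p, size_t len)
{
    int rc = -1;

    pthread_mutex_lock(&dq.mtx);
    if (len <= DQ_MAXLEN && dq.count < DQ_SLOTS) {
        struct dgram *d = &dq.slot[(dq.head + dq.count) % DQ_SLOTS];
        memcpy(d->payload, p, len);
        d->len = len;
        dq.count++;
        rc = 0;
    }
    pthread_mutex_unlock(&dq.mtx);
    return rc;
}

/* Dequeue one whole datagram into out (DQ_MAXLEN bytes); returns its length, or -1 if empty. */
long dq_recv(char *out)
{
    long len = -1;

    pthread_mutex_lock(&dq.mtx);
    if (dq.count > 0) {
        struct dgram *d = &dq.slot[dq.head];
        memcpy(out, d->payload, d->len);
        len = (long)d->len;
        dq.head = (dq.head + 1) % DQ_SLOTS;
        dq.count--;
    }
    pthread_mutex_unlock(&dq.mtx);
    return len;
}
```

Because a record is never partially consumed, concurrent callers cannot 
interlace within a datagram, so nothing has to be held across a sleep.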

So, in summary: sleeping while holding the so_rcv/so_snd sx locks is normal, 
but deadlocks are not, so if the pointer comes back in the direction of the 
socket code after some more investigation, let me know.

Robert N M Watson
Computer Laboratory
University of Cambridge
Received on Fri Oct 26 2007 - 19:42:14 UTC
