Re: Deadlock, exclusive sx so_rcv_sx, amd64

From: John Baldwin <jhb_at_freebsd.org>
Date: Fri, 26 Oct 2007 12:22:28 -0400
On Friday 26 October 2007 05:52:07 am Gleb Kozyrev wrote:
> On 25/10/2007, John Baldwin <jhb_at_freebsd.org> wrote:
> > > Running rtorrent and ftp brings my system to a deadlock
> > > in a few hours. Kernel still responds to pings and sends some
> > > TCP acks.
> ...
> > > Please suggest any other commands to run in DDB if needed.
> > > Cores are saved.
> >
> > show sleepchain <pid>  will show if it's a real deadlock or not.
> >
> 
> This time the freeze was a matter of minutes.
> 
> db> ps
>   pid  ppid  pgrp   uid   state   wmesg         wchan        cmd
>  1229   991   991     0  ?                                   smbd
>  1201  1195  1201  1001  SL+     pfault   0xffffffff80b1359c rtorrent
>  1199  1193  1199  1001  Ss+     ttyin    0xffffff0001211410 tcsh
>  1197  1193  1197  1001  Ss+     ttyin    0xffffff0001218810 tcsh
>  1195  1193  1195  1001  Ss+     pause    0xffffff000624a0c0 tcsh
>  1193  1192  1193  1001  SLs     pfault   0xffffffff80b1359c screen
>  1192  1190  1190  1001  S+      pause    0xffffff00013c10c0 screen
>  1190  1189  1190  1001  Ss+     pause    0xffffff00065b40c0 tcsh
>  1189  1187  1187  1001  S       select   0xffffffff80af79d0 sshd
>  1187  1097  1187     0  Ss      sbwait   0xffffff00065346cc sshd
> ...
> 
> db> show alllocks
> Process 1187 (sshd) thread 0xffffff00065ad350 (100166)
> exclusive sx so_rcv_sx r = 0 (0xffffff0006534670) locked @
> /usr/src/sys/kern/uipc_sockbuf.c:145
> 
> db> show sleepchain 1187
> thread 100166 (pid 1187, sshd) sleeping on 0xffffff00065346cc "sbwait"
> db> show sleepchain 1201
> thread 100164 (pid 1201, rtorrent) sleeping on 0xffffffff80b1359c "pfault"
> 
> Nothing interesting I guess...
> Maybe this is not a deadlock, what else can cause such a freeze?
> I won't reboot it for a while -- maybe someone can suggest anything else.

"sbwait" means the thread is waiting for data to come in on a socket, and
"pfault" means it is waiting on disk I/O.  It is a bit odd that 1187 is
holding a lock while sleeping, though that is permitted with an sx lock.
Still, if the lock is supposed to protect the socket's receive buffer, that
is odd.  Maybe get a trace of the process blocked in "sbwait" (tr <pid>) and
bug rwatson_at_ about it.
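
To illustrate why a thread sleeping while it holds a sleepable lock is not by
itself a deadlock, here is a rough userland analogy in Python.  It is not
kernel code: SockBuf, io_lock, receive(), deliver() and demo() are all
hypothetical names made up for the sketch.  io_lock stands in for an
sx-style lock that serializes readers, and the condition variable stands in
for the "sbwait" sleep.  The assumption being illustrated is that whoever
delivers the data never needs the lock the sleeper is holding.

```python
import threading
import time


class SockBuf:
    """Toy stand-in for a socket receive buffer (hypothetical, not kernel code)."""

    def __init__(self):
        # Plays the role of the sx lock: serializes readers, may be held across a sleep.
        self.io_lock = threading.Lock()
        # Plays the role of the "sbwait" sleep: wait here until data arrives.
        self.cond = threading.Condition()
        self.data = bytearray()

    def receive(self, timeout=None):
        """Take io_lock, then sleep (still holding it) until data is available."""
        with self.io_lock:
            with self.cond:
                if not self.cond.wait_for(lambda: len(self.data) > 0, timeout):
                    return None  # timed out with no data
                out = bytes(self.data)
                self.data.clear()
                return out

    def deliver(self, payload):
        """Add data and wake the sleeper; this path never touches io_lock."""
        with self.cond:
            self.data += payload
            self.cond.notify_all()


def demo():
    buf = SockBuf()
    received = []
    reader = threading.Thread(
        target=lambda: received.append(buf.receive(timeout=5.0))
    )
    reader.start()
    # Poll until the reader has taken io_lock; it then sleeps waiting for data.
    for _ in range(500):
        if buf.io_lock.locked():
            break
        time.sleep(0.01)
    held_while_waiting = buf.io_lock.locked()
    buf.deliver(b"hello")  # succeeds even though the reader holds io_lock
    reader.join(5.0)
    return held_while_waiting, received[0]
```

In this sketch demo() observes the lock held by a sleeping thread and the
delivery still goes through.  A real deadlock would need the waker to be
blocked on that same lock, which is what "show sleepchain" is meant to
reveal.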

-- 
John Baldwin
Received on Fri Oct 26 2007 - 16:21:22 UTC