Re: Deadlock, exclusive sx so_rcv_sx, amd64

From: Kostik Belousov <kostikbel_at_gmail.com>
Date: Fri, 26 Oct 2007 21:45:36 +0300

On Fri, Oct 26, 2007 at 12:22:28PM -0400, John Baldwin wrote:
> On Friday 26 October 2007 05:52:07 am Gleb Kozyrev wrote:
> > On 25/10/2007, John Baldwin <jhb_at_freebsd.org> wrote:
> > > > Running rtorrent and ftp brings my system to a deadlock
> > > > in a few hours. Kernel still responds to pings and sends some
> > > > TCP acks.
> > ...
> > > > Please suggest any other commands to run in DDB if needed.
> > > > Cores are saved.
> > >
> > > show sleepchain <pid>  will show if it's a real deadlock or not.
> > >
> > 
> > This time the freeze was a matter of minutes.
> > 
> > db> ps
> >   pid  ppid  pgrp   uid   state   wmesg         wchan        cmd
> >  1229   991   991     0  ?                                   smbd
> >  1201  1195  1201  1001  SL+     pfault   0xffffffff80b1359c rtorrent
> >  1199  1193  1199  1001  Ss+     ttyin    0xffffff0001211410 tcsh
> >  1197  1193  1197  1001  Ss+     ttyin    0xffffff0001218810 tcsh
> >  1195  1193  1195  1001  Ss+     pause    0xffffff000624a0c0 tcsh
> >  1193  1192  1193  1001  SLs     pfault   0xffffffff80b1359c screen
> >  1192  1190  1190  1001  S+      pause    0xffffff00013c10c0 screen
> >  1190  1189  1190  1001  Ss+     pause    0xffffff00065b40c0 tcsh
> >  1189  1187  1187  1001  S       select   0xffffffff80af79d0 sshd
> >  1187  1097  1187     0  Ss      sbwait   0xffffff00065346cc sshd
> > ...
> > 
> > db> show alllocks
> > Process 1187 (sshd) thread 0xffffff00065ad350 (100166)
> > exclusive sx so_rcv_sx r = 0 (0xffffff0006534670) locked _at_
> > /usr/src/sys/kern/uipc_sockbuf.c:145
> > 
> > db> show sleepchain 1187
> > thread 100166 (pid 1187, sshd) sleeping on 0xffffff00065346cc "sbwait"
> > db> show sleepchain 1201
> > thread 100164 (pid 1201, rtorrent) sleeping on 0xffffffff80b1359c "pfault"
> > 
> > Nothing interesting I guess...
> > Maybe this is not a deadlock, what else can cause such a freeze?
> > I won't reboot it for a while -- maybe someone can suggest anything else.
> 
> "sbwait" is waiting for data to come in on a socket and "pfault" is
> waiting on disk I/O. It is a bit odd that 1187 is holding a lock while

No, pfault means that the process is handling a page fault and is sleeping
while it waits for a page to become available (either from the cache or
from the free list).

> sleeping, though that is permitted with an sx lock. Still, if it's
> supposed to protect the socket's receive buffer, that is odd. Maybe get
> a trace of the process blocked in "sbwait" (tr <pid>) and bug rwatson_at_
> about it.
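
To illustrate why a thread sleeping in "sbwait" while holding an sx lock is
not by itself a deadlock, here is a rough userland analogy using pthreads.
It is only a sketch and rests on an assumption: that the sx lock merely
serializes receivers on the socket, while a separate mutex protects the
buffer contents and is the one dropped during the sleep. All names below
(rcv_serial, buf_mtx, buf_cv, receiver, deliver, run_demo) are hypothetical
and are not kernel identifiers.

```c
/* Userland analogy only; none of these names are FreeBSD kernel APIs. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t rcv_serial = PTHREAD_MUTEX_INITIALIZER; /* plays the role of so_rcv_sx */
static pthread_mutex_t buf_mtx = PTHREAD_MUTEX_INITIALIZER;    /* protects the buffer state */
static pthread_cond_t buf_cv = PTHREAD_COND_INITIALIZER;       /* plays the role of the "sbwait" channel */
static int buf_bytes = 0;                                      /* bytes currently in the buffer */

/* Receiver: takes the serialization lock, then sleeps until data arrives. */
static void *receiver(void *arg)
{
    int *got = arg;

    pthread_mutex_lock(&rcv_serial);          /* stays held across the sleep below */
    pthread_mutex_lock(&buf_mtx);
    while (buf_bytes == 0)
        pthread_cond_wait(&buf_cv, &buf_mtx); /* releases buf_mtx only while asleep */
    *got = buf_bytes;
    buf_bytes = 0;
    pthread_mutex_unlock(&buf_mtx);
    pthread_mutex_unlock(&rcv_serial);
    return NULL;
}

/* Sender: needs only buf_mtx, so the held rcv_serial never blocks it. */
static void deliver(int n)
{
    pthread_mutex_lock(&buf_mtx);
    buf_bytes += n;
    pthread_cond_signal(&buf_cv);
    pthread_mutex_unlock(&buf_mtx);
}

/* Starts one receiver, delivers n bytes, and returns what the receiver saw. */
static int run_demo(int n)
{
    pthread_t t;
    int got = -1;
    int rc = pthread_create(&t, NULL, receiver, &got);

    assert(rc == 0);
    deliver(n);
    pthread_join(t, NULL);
    return got;
}
```

In this model the receiver makes progress as soon as data is delivered, even
though it held rcv_serial the whole time. A lone holder of the sx lock sleeping
in "sbwait" is therefore expected behaviour if the assumption above holds; it
becomes suspicious only when the data never arrives or another thread is
blocked on the same sx lock.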

Received on Fri Oct 26 2007 - 16:45:51 UTC