Re: truss(1) locked in kernel with 8.0-BETA2

From: Kostik Belousov <kostikbel_at_gmail.com>
Date: Sat, 22 Aug 2009 22:47:24 +0300

On Sat, Aug 22, 2009 at 08:58:12PM +0200, Jeremie Le Hen wrote:
> Hi,
> 
> I've upgraded my laptop to 8.0-BETA2 and ran portupgrade in script(1).
> But according to top, it seems script(1) is going crazy, even after I've
> hit ^C:
> 
> % CPU:  0.0% user,  7.9% nice, 92.1% system,  0.0% interrupt,  0.0% idle
> % Mem: 64M Active, 675M Inact, 141M Wired, 21M Cache, 111M Buf, 90M Free
> % Swap: 1024M Total, 1024M Free
> % 
> % PID USERNAME    THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
> % 47998 root          1 128   10  3344K   788K RUN      3:02 100.00% script
> 
> jarjarbinks:~:7# procstat -k 47998
>   PID    TID COMM             TDNAME           KSTACK                       
> 47998 100087 script           -                mi_switch critical_exit intr_event_handle intr_execute_handlers lapic_handle_intr Xapic_isr1 binuptime bintime microtime gettimeofday syscall Xint0x80_syscall 
> 
> truss(1) shows the following sequence indefinitely:
> % gettimeofday({1250965404.830938 },0x0)           = 0 (0x0)
> % select(5,{0 4},0x0,0x0,{30.000000 })             = 1 (0x1)
> % read(0,0xbfbfdf5c,1024)                          = 0 (0x0)
> % write(4,0xbfbfdf5c,0)                            = 0 (0x0)
> 
> And when I try to stop truss, it never ends and seems to be blocked in the kernel:
> 
> ps aux:
> root   49506  0.0  0.1  3284   912   2  I+    8:23PM   0:00.06 truss -p 47998
> root   47998  0.0  0.1  3344   788   1  TNX+  8:39AM   6:37.36 /usr/bin/script -qa /tmp/portupgrade20090822-4080-3knxs0-0 env UPGRA
> 
> procstat -k:
> 49506 100121 truss            -                mi_switch sleepq_switch sleepq_catch_signals sleepq_wait_sig _sleep kern_wait wait4 syscall Xint0x80_syscall
> 47998 100087 script           -                mi_switch thread_suspend_switch ptracestop syscall Xint0x80_syscall 
> 
> Note that once I've truss(1)'ed script(1), it stops consuming all the
> CPU cycles.  It seems to be deadlocked, but DDB shows only one lock.  So
> this is not a deadlock strictly speaking, but truss(1) is obviously
> waiting for something that will never come.  Nonetheless, I can kill it
> with a SIGKILL.
> 
> 
> The same thing can happen with dialog(1); truss shows:
> % read(0,0xbfbfcf10,1)                             = 0 (0x0)
> % read(0,0xbfbfcf10,1)                             = 0 (0x0)
> % read(0,0xbfbfcf10,1)                             = 0 (0x0)
> % read(0,0xbfbfcf10,1)                             = 0 (0x0)
> 
> And so on.  And after ^C the processes are stuck in the same state in
> the kernel.
> 
> I've reproduced this twice in a row across a reboot.  I think I can
> reproduce it on demand.

Does kill -9 kill truss? I expect that it does, and that the debuggee
is then killable.
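
As an aside on the quoted truss output: the repeating select() = 1,
read(0, ...) = 0, write(4, ..., 0) sequence looks consistent with a
descriptor sitting at end-of-file that is still kept in the select set.
The sketch below is not script(1)'s source; it only reproduces that
pattern on a pipe whose writer has been closed. The names count_spins
and drop_on_eof are made up for illustration.

```python
import os
import select


def count_spins(fd, max_iters=5, drop_on_eof=False):
    """Count select() wakeups for a descriptor that is at EOF.

    max_iters bounds the loop so the demonstration terminates; a real
    loop without such a bound would spin until killed.
    """
    watched = [fd]
    wakeups = 0
    for _attempt in range(max_iters):
        if not watched:
            break  # nothing left to wait for
        ready, _w, _x = select.select(watched, [], [], 1.0)
        if not ready:
            break  # timed out: the descriptor was not reported readable
        wakeups += 1
        data = os.read(fd, 1024)
        if data == b"" and drop_on_eof:
            # read() returning 0 bytes means EOF: stop selecting on fd
            watched.remove(fd)
    return wakeups


def demo():
    r, w = os.pipe()
    os.close(w)  # no writer left, so the read end is at EOF
    try:
        naive = count_spins(r)                    # keeps fd in the set
        fixed = count_spins(r, drop_on_eof=True)  # drops fd after EOF
    finally:
        os.close(r)
    return naive, fixed


if __name__ == "__main__":
    print(demo())
```

With the descriptor left in the set, select() reports it readable on
every pass and read() returns 0 each time, so the loop never blocks.
Removing the descriptor after the first zero-length read stops the
spinning. Whether this is what happens inside script(1) here is an
assumption drawn only from the trace.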

Received on Sat Aug 22 2009 - 17:47:39 UTC
