Re: Process stuck in vmmaps on 8.0-BETA1

From: John Marshall <john.marshall_at_riverwillow.com.au>
Date: Thu, 9 Jul 2009 18:52:42 +1000

On Thu, 09 Jul 2009, 17:30 +1000, John Marshall wrote:
> On Thu, 09 Jul 2009, 10:42 +0400, pluknet wrote:
> > 2009/7/9 John Marshall <john.marshall_at_riverwillow.com.au>:
> > > After upgrading...
> > >  - boot new kernel to single-user
> > >  - make installworld
> > >  - make delete-old
> > >  - make delete-old-libs
> > >  - mergemaster
> > >  - reboot
> > >
> > > I re-built a few of my applications.  I noticed a problem with ntpd
> > > 4.2.4p7.  The build was fine, it started fine, but got stuck in vmmaps
> > > and I couldn't kill it.  Stopping the operating system appears to be the
> > > only remedy.  I have re-built a few times (starting with 'make
> > > distclean') just to make sure.
> > >
> > >  UID   PID  PPID CPU PRI NI   VSZ   RSS MWCHAN STAT  TT       TIME COMMAND
> > >    0   791     1   0  44  0  4944  4920 vmmaps Ds    ??    0:00.01 ntpd
> > >
> > 
> > Can you post the output of 'procstat -k 791', where 791 is the pid of ntpd?
> > It would also be nice if you went through all the ddb steps described in
> > the Debugging Deadlocks chapter of the FreeBSD Developers' Handbook.
> 
> Here is some procstat output.  I'm just rebuilding the kernel with the
> debugging options enabled - not something I've ever done before.
> 
> rwsrv05# procstat 2788
>   PID  PPID  PGID   SID  TSID THR LOGIN    WCHAN     EMUL          COMM        
>  2788     1  2788  2788     0   1 john     vmmaps    FreeBSD ELF32 ntpd        
> rwsrv05# procstat -k 2788
>   PID    TID COMM             TDNAME           KSTACK                       
>  2788 100164 ntpd             -                mi_switch sleepq_switch sleepq_wait _sleep vm_map_unlock_and_wait vm_map_delete vm_map_fixed vm_mmap mmap syscall Xint0x80_syscall 
> rwsrv05# procstat -v 2788
>   PID      START        END PRT  RES PRES REF SHD FL TP PATH
>  2788  0x8048000  0x807e000 r-x   54   60   2   1 CN vn /usr/local/bin/ntpd
>  2788  0x807e000  0x8080000 rw-    2    0   1   0 C- vn /usr/local/bin/ntpd
>  2788  0x8080000  0x8100000 rw-  128    0   1   0 C- df 
>  2788 0x2807e000 0x280ab000 r-x   45    0 171  75 CN vn /libexec/ld-elf.so.1
>  2788 0x280ab000 0x280ad000 rw-    2    0   1   0 C- vn /libexec/ld-elf.so.1
>  2788 0x280ad000 0x280c0000 rw-   19    0   1   0 C- df 
>  2788 0x280c0000 0x280d7000 r-x   23    0   1   0 CN vn /lib/libm.so.5
>  2788 0x280d7000 0x280d8000 r-x    1    0   1   0 CN vn /lib/libm.so.5
>  2788 0x280d8000 0x280d9000 rw-    1    0   1   0 C- vn /lib/libm.so.5
>  2788 0x280d9000 0x28211000 r-x  312    0   1   0 CN vn /lib/libcrypto.so.5
>  2788 0x28211000 0x28212000 r-x    1    0   1   0 CN vn /lib/libcrypto.so.5
>  2788 0x28212000 0x2822a000 rw-   24    0   1   0 C- vn /lib/libcrypto.so.5
>  2788 0x2822a000 0x2822c000 rw-    2    0   1   0 C- df 
>  2788 0x2822c000 0x28232000 r-x    6    0   1   0 CN vn /lib/libkvm.so.4
>  2788 0x28232000 0x28233000 r-x    1    0   1   0 CN vn /lib/libkvm.so.4
>  2788 0x28233000 0x28234000 rw-    1    0   1   0 C- vn /lib/libkvm.so.4
>  2788 0x28234000 0x2824c000 r-x   24    0   1   0 CN vn /usr/lib/libelf.so.1
>  2788 0x2824c000 0x2824d000 r-x    1    0   1   0 CN vn /usr/lib/libelf.so.1
>  2788 0x2824d000 0x2824e000 rw-    1    0   1   0 C- vn /usr/lib/libelf.so.1
>  2788 0x2824e000 0x28251000 r-x    3    0  15  10 CN vn /usr/lib/librt.so.1
>  2788 0x28251000 0x28252000 r-x    1    0   1   0 CN vn /usr/lib/librt.so.1
>  2788 0x28252000 0x28253000 rw-    1    0   1   0 C- vn /usr/lib/librt.so.1
>  2788 0x28253000 0x28260000 r-x   13    0   1   0 CN vn /lib/libmd.so.4
>  2788 0x28260000 0x28261000 r-x    1    0   1   0 CN vn /lib/libmd.so.4
>  2788 0x28261000 0x28262000 rw-    1    0   1   0 C- vn /lib/libmd.so.4
>  2788 0x28262000 0x28351000 r-x  239    0   1   0 CN vn /lib/libc.so.7
>  2788 0x28351000 0x28352000 r-x    1    0   1   0 CN vn /lib/libc.so.7
>  2788 0x28352000 0x28358000 rw-    6    0   1   0 C- vn /lib/libc.so.7
>  2788 0x28358000 0x2836e000 rw-   22    0   1   0 C- df 
>  2788 0x2836e000 0x2837a000 ---    0    0   0   0 -- -- 
>  2788 0x28400000 0x28500000 rw-  256    0   1   0 C- df 
>  2788 0xbfbe0000 0xbfc00000 rwx   32    0   1   0 C- df 
> rwsrv05# 

OK, I've now rebuilt the kernel with the debugging options enabled, and
I'm getting a number of 'lock order reversal' messages printed on the
console.  Is that normal?

From the Debugging Deadlocks chapter that pluknet referred me to
(above), it appears that I need to run 'sysctl debug.kdb.enter=1' or
'sysctl debug.kdb.panic=1' once I have the process in the desired
'stuck' state.  When I run either of those commands, the system reboots.
Now *I'm* stuck.
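
Since the PID changes every time ntpd is restarted (791 earlier, 2788
in the procstat run), a small helper can pick out whichever processes
are sleeping on a given wait channel and hand them to 'procstat -k'.
This is only a rough sketch: it assumes 'ps -axl' output with the PID
and MWCHAN column headers seen in the listing quoted above, and
pids_on_wchan is a made-up name, not an existing tool.

```shell
# Print the PIDs of processes sleeping on a given wait channel.
# Reads `ps -axl`-style output on stdin and locates the PID and MWCHAN
# columns by header name instead of assuming fixed positions.
# pids_on_wchan is a hypothetical helper name used for illustration.
pids_on_wchan() {
    awk -v chan="$1" '
        NR == 1 {
            # Header line: remember which columns hold PID and MWCHAN.
            for (i = 1; i <= NF; i++) {
                if ($i == "PID")    pidcol = i
                if ($i == "MWCHAN") chancol = i
            }
            next
        }
        # Data lines: print the PID when the wait channel matches.
        pidcol && chancol && $chancol == chan { print $pidcol }
    '
}

# Intended use on the affected FreeBSD box (not run here):
#   ps -axl | pids_on_wchan vmmaps | xargs -n 1 procstat -k
```

The helper only filters text, so it can be tried against a saved copy
of the ps output before being pointed at the live system.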

-- 
John Marshall

Received on Thu Jul 09 2009 - 06:52:48 UTC
