Re: Process stuck in vmmaps on 8.0-BETA1

From: John Marshall <john.marshall_at_riverwillow.com.au>
Date: Fri, 10 Jul 2009 13:58:49 +1000

On Thu, 09 Jul 2009, 17:21 +0300, Kostik Belousov wrote:
> On Thu, Jul 09, 2009 at 06:52:42PM +1000, John Marshall wrote:
> > OK, now that I've rebuilt the kernel with the debugging options not
> > commented out, I'm getting a number of 'lock order reversal' messages
> > printed on the console: is that normal?
> > 
> > From the Debugging Deadlocks chapter to which I was referred by pluknet
> > (above) it appears that I need to enter 'sysctl debug.kdb.enter=1' or
> > 'sysctl debug.kdb.panic=1' after I get the process into the desired
> > 'stuck' state.  If I enter either of those commands, the system reboots.
> > Now *I'm* stuck.
> 
> Since you have a mostly working system, and the interesting information is
> most easily accessible through kgdb, attach it to the live kernel:
> kgdb <path to kernel.debug> /dev/mem
> 
> From there, switch to the stuck process,
> 	process <pid>

I tried that...

  (kgdb) process 1373
  Undefined command: "process".  Try "help".

It took me several more hours to discover "proc", which I assume is what
you meant?

> do
> 	bt
> find the frame for vm_map_delete, and print the entry:
> 	p entry

I have no idea which number(s) to plug in here.  I hope I guessed the
right one.

> Also, I need to see the information you posted earlier, namely, procstat
> -k and -v output for the process.

rwsrv05# procstat 1373
  PID  PPID  PGID   SID  TSID THR LOGIN    WCHAN     EMUL          COMM        
 1373     1  1373  1373     0   1 john     vmmaps    FreeBSD ELF32 ntpd        
rwsrv05# procstat -k 1373
  PID    TID COMM             TDNAME           KSTACK                       
 1373 100168 ntpd             -                mi_switch sleepq_switch sleepq_wait _sleep vm_map_unlock_and_wait vm_map_delete vm_map_fixed vm_mmap mmap syscall Xint0x80_syscall 
rwsrv05# procstat -v 1373
  PID      START        END PRT  RES PRES REF SHD FL TP PATH
 1373  0x8048000  0x807e000 r-x   54   60   2   1 CN vn /usr/local/bin/ntpd
 1373  0x807e000  0x8080000 rw-    2    0   1   0 C- vn /usr/local/bin/ntpd
 1373  0x8080000  0x8100000 rw-  128    0   1   0 C- df 
 1373 0x2807e000 0x280ab000 r-x   45    0 143  62 CN vn /libexec/ld-elf.so.1
 1373 0x280ab000 0x280ad000 rw-    2    0   1   0 C- vn /libexec/ld-elf.so.1
 1373 0x280ad000 0x280c0000 rw-   19    0   1   0 C- df 
 1373 0x280c0000 0x280d7000 r-x   23    0   1   0 CN vn /lib/libm.so.5
 1373 0x280d7000 0x280d8000 r-x    1    0   1   0 CN vn /lib/libm.so.5
 1373 0x280d8000 0x280d9000 rw-    1    0   1   0 C- vn /lib/libm.so.5
 1373 0x280d9000 0x28211000 r-x  312    0   1   0 CN vn /lib/libcrypto.so.5
 1373 0x28211000 0x28212000 r-x    1    0   1   0 CN vn /lib/libcrypto.so.5
 1373 0x28212000 0x2822a000 rw-   24    0   1   0 C- vn /lib/libcrypto.so.5
 1373 0x2822a000 0x2822c000 rw-    2    0   1   0 C- df 
 1373 0x2822c000 0x28232000 r-x    6    0   1   0 CN vn /lib/libkvm.so.4
 1373 0x28232000 0x28233000 r-x    1    0   1   0 CN vn /lib/libkvm.so.4
 1373 0x28233000 0x28234000 rw-    1    0   1   0 C- vn /lib/libkvm.so.4
 1373 0x28234000 0x2824c000 r-x   24    0   1   0 CN vn /usr/lib/libelf.so.1
 1373 0x2824c000 0x2824d000 r-x    1    0   1   0 CN vn /usr/lib/libelf.so.1
 1373 0x2824d000 0x2824e000 rw-    1    0   1   0 C- vn /usr/lib/libelf.so.1
 1373 0x2824e000 0x28251000 r-x    3    0  15  10 CN vn /usr/lib/librt.so.1
 1373 0x28251000 0x28252000 r-x    1    0   1   0 CN vn /usr/lib/librt.so.1
 1373 0x28252000 0x28253000 rw-    1    0   1   0 C- vn /usr/lib/librt.so.1
 1373 0x28253000 0x28260000 r-x   13    0   1   0 CN vn /lib/libmd.so.4
 1373 0x28260000 0x28261000 r-x    1    0   1   0 CN vn /lib/libmd.so.4
 1373 0x28261000 0x28262000 rw-    1    0   1   0 C- vn /lib/libmd.so.4
 1373 0x28262000 0x28351000 r-x  239    0   1   0 CN vn /lib/libc.so.7
 1373 0x28351000 0x28352000 r-x    1    0   1   0 CN vn /lib/libc.so.7
 1373 0x28352000 0x28358000 rw-    6    0   1   0 C- vn /lib/libc.so.7
 1373 0x28358000 0x2836e000 rw-   22    0   1   0 C- df 
 1373 0x2836e000 0x2837a000 ---    0    0   0   0 -- -- 
 1373 0x28400000 0x28500000 rw-  256    0   1   0 C- df 
 1373 0xbfbe0000 0xbfc00000 rwx   32    0   1   0 C- df 

rwsrv05# kgdb /spare/obj8/usr/src/sys/RWSRV05/kernel.debug /dev/mem
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd"...
#0  sched_switch (td=0xc08ad090, newtd=0xc4d4d900, flags=260)
    at /usr/src/sys/kern/sched_ule.c:1864
1864			cpuid = PCPU_GET(cpuid);
Ready to go.  Enter 'tr' to connect to the remote target
with /dev/cuad0, 'tr /dev/cuad1' to connect to a different port
or 'trf portno' to connect to the remote target with the firewire
interface.  portno defaults to 5556.

Type 'getsyms' after connection to load kld symbols.

If you're debugging a local system, you can use 'kldsyms' instead
to load the kld symbols.  That's a less obnoxious interface.
(kgdb) proc 1373
[Switching to thread 154 (Thread 100168)]#0  sched_switch (td=0xc5776240, newtd=0xc4d4db40, flags=0x104)
    at /usr/src/sys/kern/sched_ule.c:1864
1864			cpuid = PCPU_GET(cpuid);
(kgdb) bt 1373
#0  sched_switch (td=0xc5776240, newtd=0xc4d4db40, flags=0x104) at /usr/src/sys/kern/sched_ule.c:1864
During symbol reading, Incomplete CFI data; unspecified registers at 0xc05fe876.
#1  0xc05e572f in mi_switch (flags=0x104, newtd=0x0) at /usr/src/sys/kern/kern_synch.c:444
#2  0xc06147fc in sleepq_switch (wchan=0xc5a5f338, pri=0x44) at /usr/src/sys/kern/subr_sleepqueue.c:505
#3  0xc0615495 in sleepq_wait (wchan=0xc5a5f338, pri=0x44) at /usr/src/sys/kern/subr_sleepqueue.c:584
#4  0xc05e5bd9 in _sleep (ident=0xc5a5f338, lock=0xc0a243a4, priority=0x244, wmesg=0xc08357af "vmmaps", timo=0x0)
    at /usr/src/sys/kern/kern_synch.c:232
#5  0xc075f8d7 in vm_map_unlock_and_wait (map=0xc5a5f2b8, timo=0x0) at /usr/src/sys/vm/vm_map.c:638
#6  0xc075f987 in vm_map_delete (map=0xc5a5f2b8, start=0x2836e000, end=0x28374000) at /usr/src/sys/vm/vm_map.c:2703
#7  0xc076136e in vm_map_fixed (map=0xc5a5f2b8, object=0xc52ffc38, offset=0x0, start=0x2836e000, length=0x6000, 
    prot=0x5, max=0x7, cow=0x112) at /usr/src/sys/vm/vm_map.c:1367
#8  0xc0763a48 in vm_mmap (map=0xc5a5f2b8, addr=0xe7840c70, size=0x6000, prot=Variable "prot" is not available.
) at /usr/src/sys/vm/vm_mmap.c:1439
#9  0xc07641ef in mmap (td=0xc5776240, uap=0xe7840cf8) at /usr/src/sys/vm/vm_mmap.c:390
#10 0xc07b955f in syscall (frame=0xe7840d38) at /usr/src/sys/i386/i386/trap.c:1073
#11 0xc079dff0 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:261
#12 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb) p *0xc5a5f2b8 
$1 = 0xc58cb048
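
For reference, a sketch of the sequence the quoted instructions describe.
It assumes the vm_map_delete frame is #6, as in the backtrace above, and
that "entry" is a local variable visible in that frame; neither assumption
is verified here.

  (kgdb) proc 1373
  (kgdb) bt
  (kgdb) frame 6
  (kgdb) p entry
  (kgdb) p *entry

"frame 6" selects the vm_map_delete frame so that "p entry" is evaluated
in its scope, and "p *entry" dereferences the pointer to show the fields
of the map entry itself.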

-- 
John Marshall

Received on Fri Jul 10 2009 - 01:58:55 UTC
