Re: How to help debugging of lock-up

From: Doug White <dwhite_at_gumbysoft.com>
Date: Tue, 14 Jun 2005 19:48:27 -0700 (PDT)

On Tue, 14 Jun 2005, Jun Kuriyama wrote:

>
> I'm running current (minus ssouhlal_at_'s recent commit).
>
> This kernel usually locks up when the daily backup begins (invoked by
> the amanda server), but it sometimes locks up in other, random situations.
>
> When it locks up there is no ping response, but I can enter the
> debugger from the serial console.
>
> I'm not sure which process I should suspect.  Is there anything I can
> provide to help debug this?

Both traces look normal for a workload that is network- and disk-bound.
Perhaps your NIC is overloaded or hung?  Where is the amanda backup going --
back to the same system?
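
To make the "network- and disk-bound" reading easier to check against the
full ps listing, here is a rough sketch (not part of any FreeBSD tool) that
tallies the wmesg column of pasted ddb "ps" output. In the quoted traces,
biord is a thread waiting on a buffer read (bufwait/breadn) and sbwait is a
thread waiting on a socket buffer (soreceive). The names ROW, parse_ps,
tally_wmesg and SAMPLE are made up for illustration, and the regular
expression assumes the column layout shown in this particular post.

```python
import re
from collections import Counter

# One row of ddb "ps" output for a thread on a sleep queue, e.g.
#  2488 c1648800   91  2484  2483 0004000 [SLPQ biord 0xc3a27da8][SLP] dump
# Assumption: columns are pid, proc, uid, ppid, pgrp, flag, as in this post.
ROW = re.compile(
    r"^\s*(?P<pid>\d+)\s+[0-9a-f]+\s+\d+\s+\d+\s+\d+\s+[0-9a-f]+\s+"
    r"\[SLPQ\s+(?P<wmesg>\S+)\s+(?P<wchan>0x[0-9a-f]+)\]\[\w+\]\s+(?P<cmd>.+)$"
)


def parse_ps(text):
    """Return (pid, wmesg, wchan, cmd) for every sleeping thread found."""
    rows = []
    for line in text.splitlines():
        line = line.lstrip("> ")  # drop mail quoting and leading spaces
        m = ROW.match(line)
        if m:  # rows without an SLPQ entry (e.g. [IWAIT]) are skipped
            rows.append((int(m["pid"]), m["wmesg"], m["wchan"], m["cmd"].strip()))
    return rows


def tally_wmesg(rows):
    """Count how many threads are sleeping on each wait message."""
    return Counter(wmesg for _pid, wmesg, _wchan, _cmd in rows)


# A few rows copied from the quoted listing.
SAMPLE = """\
>  2488 c1648800   91  2484  2483 0004000 [SLPQ biord 0xc3a27da8][SLP] dump
>  2451 c1b11000   91  2445  2440 0000000 [SLPQ sbwait 0xc186b9b0][SLP] dump
>   362 c1648200    0     1   362 0000000 [SLPQ biord 0xc3a14508][SLP] syslogd
>    50 c133ee00    0     0     0 0000204 [IWAIT] swi0: sio
"""

if __name__ == "__main__":
    for wmesg, count in tally_wmesg(parse_ps(SAMPLE)).most_common():
        print(wmesg, count)
```

Feeding it the whole listing shows at a glance how many threads sit in
disk waits (biord) versus socket waits (sbwait) versus plain pipe and
select sleeps, which is only a starting point for deciding what to trace
next.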

>
> The kernel is currently compiled with INVARIANTS, INVARIANT_SUPPORT,
> WITNESS, WITNESS_SKIPSPIN and DEBUG_VFS_LOCKS, with debug.mpsafevfs="0" set.
>
>
> -----
> KDB: enter: Break sequence on console
> [thread pid 12 tid 100005 ]
> Stopped at      kdb_enter+0x2b: nop
> db> ps
>   pid   proc     uid  ppid  pgrp  flag   stat  wmesg    wchan  cmd
>  2491 c186d400   91  2489  2483 0004000 [SLPQ piperd 0xc16cba80][SLP] sed
>  2490 c1802e00   91  2489  2483 0004000 [SLPQ piperd 0xc183e000][SLP] restore
>  2489 c1648a00   91  2487  2483 0004000 [SLPQ wait 0xc1648a00][SLP] sh
>  2488 c1648800   91  2484  2483 0004000 [SLPQ biord 0xc3a27da8][SLP] dump
>  2487 c15e4a00   91  2484  2483 0000000 [SLPQ piperd 0xc16f1d80][SLP] sendbackup
>  2486 c1871c00   91  2484  2483 0004000 [SLPQ piperd 0xc16ca000][SLP] gzip
>  2484 c1b12000   91     1  2483 0004000 [SLPQ piperd 0xc16f1c00][SLP] sendbackup
>  2454 c1802a00   91  2451  2440 0000000 [SLPQ pause 0xc1802a34][SLP] dump
>  2453 c1b11c00   91  2451  2440 0000000 [SLPQ pipdwt 0xc16cbc00][SLP] dump
>  2452 c1871400   91  2451  2440 0000000 [SLPQ pause 0xc1871434][SLP] dump
>  2451 c1b11000   91  2445  2440 0000000 [SLPQ sbwait 0xc186b9b0][SLP] dump
>  2445 c1b12200   91  2441  2440 0004000 [SLPQ wait 0xc1b12200][SLP] dump
>  2444 c15e4200   91  2441  2440 0000000 [SLPQ pipewr 0xc16f1900][SLP] sendbackup
>  2443 c1645e00   91  2441  2440 0004000 [SLPQ sbwait 0xc1a7e638][SLP] gzip
>  2441 c1b11400   91     1  2440 0004000 [SLPQ piperd 0xc16cb600][SLP] sendbackup
>  1033 c186de00 1021  1032  1033 0004002 [SLPQ ttyin 0xc15b9410][SLP] zsh
>  1032 c186da00 1021  1030  1030 0000100 [SLPQ select 0xc075cda4][SLP] sshd
>  1030 c1800200    0   583  1030 0004100 [SLPQ sbwait 0xc186c480][SLP] sshd
>   717 c1648e00    0     1   717 0004002 [SLPQ ttyin 0xc1541010][SLP] getty
>   716 c1800a00    0     1   716 0004002 [SLPQ ttyin 0xc153cc10][SLP] getty
>   715 c1871a00    0     1   715 0004002 [SLPQ ttyin 0xc155a810][SLP] getty
>   714 c1800600    0     1   714 0004002 [SLPQ ttyin 0xc153b010][SLP] getty
>   700 c1800000    0     1   700 0000000 [SLPQ select 0xc075cda4][SLP] inetd
>   679 c1871600    0     1   679 0000000 [SLPQ select 0xc075cda4][SLP] moused
>   656 c1871e00    0     1   655 0000000 [SLPQ select 0xc075cda4][SLP] snmpd
>   637 c1800400    0     1   637 0000000 [SLPQ select 0xc075cda4][SLP] pptpd
>   607 c1645800    0     1   607 0000000 [SLPQ nanslp 0xc070faac][SLP] cron
>   595 c1645600   25     1   595 0000100 [SLPQ pause 0xc1645634][SLP] sendmail
>   589 c186d200    0     1   589 0000100 [SLPQ select 0xc075cda4][SLP] sendmail
>   583 c15e4600    0     1   583 0000100 [SLPQ select 0xc075cda4][SLP] sshd
>   565 c186d000    0     1   565 0000000 [SLPQ select 0xc075cda4][SLP] ntpd
>   511 c1648000    0     1   511 0000000 [SLPQ select 0xc075cda4][SLP] usbd
>   492 c15e4400    0   490   490 0000100 [SLPQ nfslockd 0xc07654c8][SLP] rpc.lockd
>   490 c15e4800    0     1   490 0000000 [SLPQ select 0xc075cda4][SLP] rpc.lockd
>   485 c15e5c00    0     1   485 0000000 [SLPQ select 0xc075cda4][SLP] rpc.statd
>   479 c1648600    0   478   478 0000000 [SLPQ - 0xc15cb800][SLP] nfsd
>   478 c1645a00    0     1   478 0000000 [SLPQ select 0xc075cda4][SLP] nfsd
>   476 c1645400    0     1   476 0000000 [SLPQ select 0xc075cda4][SLP] mountd
>   398 c1648400    0     1   398 0000000 [SLPQ select 0xc075cda4][SLP] ypbind
>   395 c1645200    0     1   395 0000000 [SLPQ select 0xc075cda4][SLP] rpcbind
>   362 c1648200    0     1   362 0000000 [SLPQ biord 0xc3a14508][SLP] syslogd
>   325 c1645000    0     1   325 0000000 [SLPQ select 0xc075cda4][SLP] devd
>   218 c15e4000    0     1   218 0000000 [SLPQ pause 0xc15e4034][SLP] adjkerntz
>    62 c15e4c00    0     0     0 0000204 [SLPQ - 0xc83cfd04][SLP] schedcpu
>    61 c15e4e00    0     0     0 0000204 [SLPQ - 0xc076526c][SLP] nfsiod 3
>    60 c15e5000    0     0     0 0000204 [SLPQ - 0xc0765268][SLP] nfsiod 2
>    59 c15e5200    0     0     0 0000204 [SLPQ - 0xc0765264][SLP] nfsiod 1
>    58 c15e5400    0     0     0 0000204 [SLPQ - 0xc0765260][SLP] nfsiod 0
>    57 c15e5600    0     0     0 0000204 [SLPQ syncer 0xc070f81c][SLP] syncer
>    56 c15e5800    0     0     0 0000204 [SLPQ vlruwt 0xc15e5800][SLP] vnlru
>    55 c133e400    0     0     0 0000204 [SLPQ psleep 0xc075d2ec][SLP] bufdaemon
>    54 c133e600    0     0     0 000020c [SLPQ pgzero 0xc076b704][SLP] pagezero
>    53 c133e800    0     0     0 0000204 [SLPQ psleep 0xc076b254][SLP] vmdaemon
>    52 c133ea00    0     0     0 0000204 [SLPQ psleep 0xc076b210][SLP] pagedaemon
>    51 c133ec00    0     0     0 0000204 [SLPQ m:w2 0xc15b5d00][SLP] g_mirror data
>    50 c133ee00    0     0     0 0000204 [IWAIT] swi0: sio
>    49 c13ad000    0     0     0 0000204 [SLPQ - 0xc14c5a3c][SLP] fdc0
>    48 c13ad200    0     0     0 0000204 [SLPQ tzpoll 0xc0882694][SLP] acpi_thermal
>    47 c13ad400    0     0     0 0000204 [SLPQ usbevt 0xc1525210][SLP] usb2
> db> trace 362
> Tracing pid 362 tid 100081 td 0xc15e7c00
> sched_switch(c15e7c00,0,1) at sched_switch+0x177
> mi_switch(1,0) at mi_switch+0x270
> sleepq_switch(c3a14508,cc6b99dc,c050da15,c3a14508,0) at sleepq_switch+0xe0
> sleepq_wait(c3a14508,0,0,c06aec19,e52) at sleepq_wait+0x30
> msleep(c3a14508,c075d3c0,4c,c06af341,0) at msleep+0x311
> bwait(c3a14508,4c,c06af341) at bwait+0x47
> bufwait(c3a14508,1,0,0,c16b2000) at bufwait+0x1a
> breadn(c1adedd0,0,0,800,0) at breadn+0x266
> bread(c1adedd0,0,0,800,0) at bread+0x20
> ffs_balloc_ufs2(c1adedd0,4f,0,57,c12f5a00) at ffs_balloc_ufs2+0xcbf
> ffs_write(cc6b9c40,c1a4a510,c1adedd0,cc6b9c8c,c0565fd6) at ffs_write+0x2b4
> VOP_WRITE_APV(c06f7600,cc6b9c40) at VOP_WRITE_APV+0x9b
> vn_write(c1a4a510,c1bac100,c12f5a00,0,c15e7c00) at vn_write+0x1ea
> kern_writev(c15e7c00,8,c1bac100,c1bac100,0) at kern_writev+0x8e
> writev(c15e7c00,cc6b9d04,3,39,292) at writev+0x30
> syscall(3b,3b,3b,8054cde,bfbfde70) at syscall+0x22f
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (121, FreeBSD ELF32, writev), eip = 0x280c9563, esp = 0xbfbfd8ac, ebp = 0xbfbfde98 ---
> db> trace 2451
> Tracing pid 2451 tid 100120 td 0xc1803600
> sched_switch(c1803600,0,1) at sched_switch+0x177
> mi_switch(1,0) at mi_switch+0x270
> sleepq_switch(c186b9b0,0,ccb6dbb4,c050da06,c186b9b0) at sleepq_switch+0xe0
> sleepq_wait_sig(c186b9b0,0,100,c06adec5,3f1) at sleepq_wait_sig+0xc
> msleep(c186b9b0,c186b97c,158,c06ae15c,0) at msleep+0x302
> sbwait(c186b964,c070f180,2,4,0) at sbwait+0x4b
> soreceive(c186b914,0,ccb6dc78,0,0) at soreceive+0x2da
> soo_read(c1a4a318,ccb6dc78,c1b01380,0,c1803600) at soo_read+0x41
> dofileread(c1803600,c1a4a318,12,bfbee028,4) at dofileread+0xad
> read(c1803600,ccb6dd04,3,449,292) at read+0x3b
> syscall(3b,3b,bfbe003b,12,bfbee028) at syscall+0x22f
> Xint0x80_syscall() at Xint0x80_syscall+0x1f
> --- syscall (3, FreeBSD ELF32, read), eip = 0x280c1e83, esp = 0xbfbedfdc, ebp = 0xbfbee008 ---
>
>
>

-- 
Doug White                    |  FreeBSD: The Power to Serve
dwhite_at_gumbysoft.com          |  www.FreeBSD.org
Received on Wed Jun 15 2005 - 00:48:27 UTC
