Re: 'swap_pager: indefinite wait buffer' with swapfile

From: Kris Kennaway <kris_at_obsecurity.org> Date: Tue, 13 Sep 2005 01:43:18 -0400

On Sun, Sep 11, 2005 at 03:51:57AM -0400, Kris Kennaway wrote:
> I configured a vnode-backed md and enabled swapping on it.  A few
> hours later after moderate swap use the console showed:
> 
> swap_pager: indefinite wait buffer: bufobj: 0, blkno: 889347, size: 8192
> [...repeated...]
> 
> The backing store was a sparse file, but there was ample space:
> 
> # ls -l /data2/swapfile
> -rw-r--r--  1 root  wheel  17179869184 Sep 11 16:50 /data2/swapfile
> # df /data2
> Filesystem       1K-blocks     Used    Avail Capacity  Mounted on
> /dev/stripe/data  51666218 27042730 20490192    57%    /data2
> # swapinfo
> Device          1K-blocks     Used    Avail Capacity
> /dev/da0b         6297480   949304  6297480    15%
> /dev/md41        16777216   842544 16777216     5%
> Total            23074696  1791848 21282848     8%
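
For anyone who wants to check how much of a sparse backing file is
actually allocated, here is a minimal sketch. It assumes a POSIX system
where st_blocks is counted in 512-byte units; sparse_report is a made-up
helper name, not part of any FreeBSD tool.

```python
import os
import tempfile


def sparse_report(path):
    """Return (apparent_bytes, allocated_bytes, looks_sparse) for path."""
    st = os.stat(path)
    apparent = st.st_size
    # Assumption: st_blocks is in 512-byte units, as on POSIX systems.
    allocated = st.st_blocks * 512
    return apparent, allocated, allocated < apparent


# Demo on a throwaway file: extend it to 1 MiB without writing any data.
with tempfile.NamedTemporaryFile() as f:
    f.truncate(1 << 20)
    f.flush()
    print(sparse_report(f.name))
```

On the real file you would call sparse_report("/data2/swapfile") and
compare the allocated figure against the Avail column from df.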

I think these messages happen when a lot of data gets paged out at
once, since the vnode backing store is likely to be a significant
bottleneck.  I got a few dozen of these in groups of 5 or 6 over the
past 24 hours, and the system seemed to be fine (i.e. no immediate
panics or filesystem corruption from lost transactions).

However, the system has now deadlocked.

    0 c04188e0    0     0     0 0000200 [SLPQ vmwait 0xc05695e0][SLP] swapper
db> wh 0
Tracing pid 0 tid 0 td 0xc0418c30
mi_switch() at mi_switch+0x2b0
sleepq_switch() at sleepq_switch+0xf4
sleepq_wait() at sleepq_wait+0x3c
msleep() at msleep+0x378
vm_wait() at vm_wait+0xa8
scheduler() at scheduler+0x58
mi_startup() at mi_startup+0x12c
btext() at btext+0x34

This is the md that I'm swapping onto:

 9151 fffff8010fbf5a80    0     0     0 0000204 [SLPQ vmwait 0xc05695e0][SLP] md41
db> wh 9151
Tracing pid 9151 tid 100359 td 0xfffff801047a30a0
mi_switch() at mi_switch+0x2b0
sleepq_switch() at sleepq_switch+0xf4
sleepq_wait() at sleepq_wait+0x3c
msleep() at msleep+0x378
vm_wait() at vm_wait+0xa8
allocbuf() at allocbuf+0x614
getblk() at getblk+0x598
breadn() at breadn+0x58
bread() at bread+0x20
ffs_balloc_ufs2() at ffs_balloc_ufs2+0xcf0
ffs_write() at ffs_write+0x2a4
VOP_WRITE_APV() at VOP_WRITE_APV+0x120
mdstart_vnode() at mdstart_vnode+0x16c
md_kthread() at md_kthread+0x1f8
fork_exit() at fork_exit+0x94
fork_trampoline() at fork_trampoline+0x8
db>

Most other processes on the system are sleeping in various states and/or
trying to swap, e.g.:

30588 fffff8010051ba80    0 30586 44971 0004000 [SLPQ vmwait 0xc05695e0][SLP] bsdtar
30587 fffff800b05169f0    0 30586 44971 0004000 [SLPQ vmwait 0xc05695e0][SLP] bsdtar
30586 fffff800d48d73e0    0 28045 44971 0004000 [SLPQ wait 0xfffff800d48d73e0][SLP][SWAP] sh
30585 fffff80017de73e0    0 30583 45059 0004000 [SLPQ vmwait 0xc05695e0][SLP] bsdtar
30584 fffff800425d6000    0 30583 45059 0004000 [SLPQ pipdwt 0xfffff8005088e780][SLP] bsdtar
30583 fffff800b0517730    0 28564 45059 0004000 [SLPQ wait 0xfffff800b0517730][SLP][SWAP] sh
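
One possible reading of the traces above, sketched as a wait-for chain.
Only the first edge is read directly off the md41 trace (vm_wait() under
allocbuf() and ffs_write()); the other two edges are assumptions about
how memory would be freed, and they ignore /dev/da0b as an alternative
swap target. The names and the find_cycle helper are illustrative only.

```python
def find_cycle(waits_for, start):
    """Follow 'X waits for Y' edges from start; return the loop, or None."""
    path = []
    seen = {}
    node = start
    while node is not None and node not in seen:
        seen[node] = len(path)
        path.append(node)
        node = waits_for.get(node)
    if node is None:
        return None  # chain ended without looping back
    return path[seen[node]:]


# Hypothetical wait-for edges; each maps a waiter to what it waits on.
WAITS_FOR = {
    # From the trace: md41 sleeps in vm_wait() while writing to the vnode.
    "md41 worker": "free pages",
    # Assumption: pages are freed by paging out to swap.
    "free pages": "pageout to swap",
    # Assumption: swap writes to /dev/md41 are serviced by the md41 thread.
    "pageout to swap": "md41 worker",
}

cycle = find_cycle(WAITS_FOR, "md41 worker")
print(" -> ".join(cycle + cycle[:1]))
```

If those assumed edges hold, nothing in the loop can make progress
without something else in the loop going first.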

Looks like swapfiles are broken.

Kris