panic: destroying non-empty racct: 2113536 allocated for resource 4

From: Andriy Gapon <avg_at_FreeBSD.org>
Date: Tue, 17 May 2016 09:22:57 +0300
To be fair, I got this panic after a rather exotic sequence of events: running
poudriere, sending SIGSTOP to one of the build processes, forgetting about it,
seeing poudriere time out that job, and then sending SIGCONT...

This is amd64 head r297350.

Some details:
(kgdb) bt
#0  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:295
#1  0xffffffff8062d7ef in kern_reboot (howto=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:363
#2  0xffffffff8062de38 in vpanic (fmt=<optimized out>, ap=0xfffffe0519b73920) at /usr/src/sys/kern/kern_shutdown.c:639
#3  0xffffffff8062db43 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:572
#4  0xffffffff8061ef1c in racct_destroy_locked (racctp=<optimized out>) at /usr/src/sys/kern/kern_racct.c:478
#5  0xffffffff8061ee45 in racct_destroy (racct=0xfffff802f6301518) at /usr/src/sys/kern/kern_racct.c:495
#6  0xffffffff805fdd3c in prison_racct_free_locked (prr=0xfffff802f6301400) at /usr/src/sys/kern/kern_jail.c:4564
#7  0xffffffff805fdc8d in prison_racct_free (prr=0xfffff802f6301400) at /usr/src/sys/kern/kern_jail.c:4583
#8  0xffffffff805fddee in prison_racct_detach (pr=0xfffff802b0730000) at /usr/src/sys/kern/kern_jail.c:4658
#9  0xffffffff805fb2cb in prison_deref (pr=<optimized out>, flags=3) at /usr/src/sys/kern/kern_jail.c:2663
#10 0xffffffff805fca25 in prison_remove_one (pr=<optimized out>) at /usr/src/sys/kern/kern_jail.c:2358
#11 0xffffffff805fc8e4 in sys_jail_remove (td=<optimized out>, uap=<optimized out>) at /usr/src/sys/kern/kern_jail.c:2313
#12 0xffffffff80820ddd in syscallenter (td=0xfffff801146019e0, sa=0xfffffe0519b73b80) at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:135
#13 0xffffffff808209af in amd64_syscall (td=0xfffff801146019e0, traced=0) at /usr/src/sys/amd64/amd64/trap.c:943

RACCT_RSS is 4.
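
For scale, the leftover amount in the panic message can be turned into a page count. The helper below is only an illustrative sketch: it assumes that RSS is accounted in bytes and that the base page size is 4096 bytes (the amd64 default). The names ASSUMED_PAGE_SIZE and rss_bytes_to_pages are made up for this example and do not come from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB base pages, as on amd64. */
#define ASSUMED_PAGE_SIZE 4096ULL

/*
 * Convert an RSS byte count into whole pages.
 * Hypothetical helper for illustration only; it expects a
 * page-granular value and asserts if it is not.
 */
uint64_t
rss_bytes_to_pages(uint64_t bytes)
{
	assert(bytes % ASSUMED_PAGE_SIZE == 0);
	return (bytes / ASSUMED_PAGE_SIZE);
}
```

Under those assumptions, rss_bytes_to_pages(2113536) gives 516, i.e. the jail's racct was still charged for 516 resident pages (about 2 MB) at the time it was destroyed.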

(kgdb) p *prr
$5 = {
  prr_next = {
    le_next = 0xfffff80382fe4400,
    le_prev = 0xfffff8017ac90600
  },
  prr_name = "basejail-default-job-03", '\000' <repeats 232 times>,
  prr_refcount = 0,
  prr_racct = 0xfffff802e3f520b0
}
(kgdb) p *prr->prr_racct
$6 = {
  r_resources = {13884177072, 0, 0, 0, 2113536, 0 <repeats 14 times>, 13611325009, 0},
  r_rule_links = {
    lh_first = 0x0
  }
}
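
Only three of the 21 counters in that dump are non-zero (indices 0, 4 and 19), yet the panic names resource 4 rather than resource 0. The sketch below is a hypothetical userland model of why that might be, not the code in kern_racct.c. It assumes that the teardown check skips cumulative counters that are never expected to drop back to zero, taken here to be indices 0, 19 and 20 (presumably CPU time, wallclock time and %CPU). The names model_is_checked and model_first_leftover are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Number of counters in the r_resources array shown by kgdb. */
#define MODEL_NRES 21

/* Counter values copied from the kgdb dump of prr->prr_racct. */
const uint64_t dump_resources[MODEL_NRES] = {
	[0] = 13884177072ULL,
	[4] = 2113536ULL,
	[19] = 13611325009ULL,
};

/*
 * Assumption for illustration: indices 0, 19 and 20 are cumulative
 * counters and are therefore not expected to be zero at teardown.
 */
bool
model_is_checked(size_t res)
{
	return (res != 0 && res != 19 && res != 20);
}

/*
 * Return the index of the first checked counter that is still
 * non-zero, or -1 if the model would consider the racct empty.
 */
int
model_first_leftover(const uint64_t *res, size_t n)
{
	assert(res != NULL);
	for (size_t i = 0; i < n; i++) {
		if (model_is_checked(i) && res[i] != 0)
			return ((int)i);
	}
	return (-1);
}
```

With the values from the dump, model_first_leftover(dump_resources, MODEL_NRES) returns 4, which matches the resource number in the panic message. If the assumption about skipped counters is wrong, the model does not apply.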

Could it be that somehow the CONT'd process failed to deduct its resources from
the jail's resources because the jail was already marked for destruction or
something like that?

-- 
Andriy Gapon
Received on Tue May 17 2016 - 04:24:02 UTC
