Re: pkg with an ssh repo crashes CURRENT

From: Mark Felder <feld_at_FreeBSD.org>
Date: Thu, 20 Aug 2015 15:26:10 -0500
On Thu, Aug 20, 2015, at 06:50, Konstantin Belousov wrote:
> On Wed, Aug 19, 2015 at 04:52:56PM -0500, Mark Felder wrote:
> > panic: children list
> > cpuid = 0
> > KDB: stack backtrace:
> > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe01228ea840
> > vpanic() at vpanic+0x189/frame 0xfffffe01228ea8c0
> > kassert_panic() at kassert_panic+0x132/frame 0xfffffe01228ea930
> > kern_procctl_single() at kern_procctl_single+0x81c/frame 0xfffffe01228eaa00
> > kern_procctl() at kern_procctl+0x223/frame 0xfffffe01228eaa50
> > sys_procctl() at sys_procctl+0xa5/frame 0xfffffe01228eaae0
> > amd64_syscall() at amd64_syscall+0x282/frame 0xfffffe01228eabf0
> > Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfffffe01228eabf0
> 
> The fired assert means that there was a reaper process with some children
> but without descendants to be reaped.  Hm, I can imagine this situation
> happening if e.g. some non-reaper forks and then acquires reaper status.
> The patch below removes the overly aggressive asserts.
> 
> Still, it would be interesting to look into the process table.  Please
> repeat the procedure to panic, then in ddb do 'ps'.  After that do
> 'dump' and please keep kernel.debug and vmcore around.  First I want to
> look at the ps output.

I've recreated this in a bhyve VM with the latest CURRENT snapshot,
r286893. You can grab the whole /var/crash dump at
https://feld.me/freebsd/crash.tar.gz

I've pasted the ps output below, but it's also included in the info.0
file.

Stopped at      kdb_enter+0x3e: movq    $0,kdb_why
db> ps
  pid  ppid  pgrp   uid   state   wmesg         wchan        cmd
  667   666   665     0  S+      select   0xfffff80003c53840 ssh
  666   665   665     0  R+      CPU 0                       pkg
  665   629   665     0  S+      wait     0xfffff800039e0548 pkg
  629   628   629     0  S+      pause    0xfffff8001947eb38 csh
  628     1   628     0  Ss+     wait     0xfffff80003db8a90 login
  627     1   627     0  Ss+     ttyin    0xfffff80003c0f0a8 getty
  626     1   626     0  Ss+     ttyin    0xfffff80003c0f4a8 getty
  625     1   625     0  Ss+     ttyin    0xfffff8000387a0a8 getty
  624     1   624     0  Ss+     ttyin    0xfffff8000387a4a8 getty
  623     1   623     0  Ss+     ttyin    0xfffff8000387a8a8 getty
  622     1   622     0  Ss+     ttyin    0xfffff8000387aca8 getty
  621     1   621     0  Ss+     ttyin    0xfffff8000387b0a8 getty
  620     1   620     0  Ss+     ttyin    0xfffff8000387b4a8 getty
  577     1   577     0  Ss      nanslp   0xffffffff81ab2561 cron
  573     1   573    25  Ss      pause    0xfffff80003d040a8 sendmail
  570     1   570     0  Ss      select   0xfffff80003849c40 sendmail
  542     1   542     0  Ss      select   0xfffff80003c53ec0 sshd
  443     1   443     0  Ss      select   0xfffff80003849d40 casperd
  442     1   442     0  Ss      select   0xfffff80003c540c0 casperd
  342     1   342     0  Ss      select   0xfffff80003849dc0 syslogd
  271     1   271     0  Ss      select   0xfffff80003849ec0 devd
   16     0     0     0  DL      vlruwt   0xfffff800039e0a90 [vnlru]
   15     0     0     0  DL      syncer   0xffffffff81c41cf8 [syncer]
   14     0     0     0  DL      (threaded)                  [bufdaemon]
100042                   D       psleep   0xffffffff81c40f04 [bufdaemon]
100057                   D       sdflush  0xfffff80003d870e8 [/ worker]
    9     0     0     0  DL      pgzero   0xffffffff81c4aee4 [pagezero]
    8     0     0     0  DL      psleep   0xffffffff81c4a6b8 [vmdaemon]
    7     0     0     0  DL      (threaded)                  [pagedaemon]
100039                   D       psleep   0xffffffff81cf6684 [pagedaemon]
100045                   D       umarcl   0xffffffff81c4a040 [uma]
    6     0     0     0  DL      waiting_ 0xffffffff81ce8640 [sctp_iterator]
    5     0     0     0  DL      (threaded)                  [cam]
100017                   D       -        0xffffffff818d6e00 [doneq0]
100038                   D       -        0xffffffff818d6c48 [scanner]
    4     0     0     0  DL      crypto_r 0xffffffff81c48b88 [crypto returns]
    3     0     0     0  DL      crypto_w 0xffffffff81c48a30 [crypto]
   13     0     0     0  DL      (threaded)                  [geom]
100010                   D       -        0xffffffff81cc0aa0 [g_event]
100011                   D       -        0xffffffff81cc0aa8 [g_up]
100012                   D       -        0xffffffff81cc0ab0 [g_down]
   12     0     0     0  WL      (threaded)                  [intr]
100006                   I                                   [swi4: clock (0)]
100007                   I                                   [swi4: clock (1)]
100008                   I                                   [swi3: vm]
100009                   I                                   [swi1: netisr 0]
100018                   I                                   [swi6: task queue]
100019                   I                                   [swi6: Giant taskq]
100021                   I                                   [swi5: fast taskq]
100026                   I                                   [irq264: virtio_pci0]
100027                   I                                   [irq265: virtio_pci0]
100028                   I                                   [irq266: virtio_pci0]
100031                   I                                   [irq267: virtio_pci1]
100032                   I                                   [irq268: virtio_pci1]
100033                   I                                   [swi0: uart uart]
100034                   I                                   [irq1: atkbd0]
   11     0     0     0  RL      (threaded)                  [idle]
100004                   CanRun                              [idle: cpu0]
100005                   Run     CPU 1                       [idle: cpu1]
    2     0     0     0  DL      -        0xffffffff81a03ca0 [rand_harvestq]
    1     0     1     0  SLs     wait     0xfffff8000362f548 [init]
   10     0     0     0  DL      audit_wo 0xffffffff81cedc10 [audit]
    0     0     0     0  DLs     (threaded)                  [kernel]
100000                   D       swapin   0xffffffff81cc0ad8 [swapper]
100013                   D       -        0xfffff80003611300 [firmware taskq]
100016                   D       -        0xfffff80003610e00 [ffs_trim taskq]
100020                   D       -        0xfffff80003610400 [thread taskq]
100022                   D       -        0xfffff80003820100 [acpi_task_0]
100023                   D       -        0xfffff80003820100 [acpi_task_1]
100024                   D       -        0xfffff80003820100 [acpi_task_2]
100025                   D       -        0xfffff8000381fc00 [kqueue taskq]
100029                   D       -        0xfffff8000381f200 [vtnet0 rxq 0]
100030                   D       -        0xfffff8000381f100 [vtnet0 txq 0]
100035                   D       -        0xffffffff81ab1330 [deadlkres]
100037                   D       -        0xfffff80003610c00 [CAM taskq]




> 
> diff --git a/sys/kern/kern_procctl.c b/sys/kern/kern_procctl.c
> index d65ba5a..8ef72901 100644
> --- a/sys/kern/kern_procctl.c
> +++ b/sys/kern/kern_procctl.c
> @@ -187,8 +187,6 @@ reap_status(struct thread *td, struct proc *p,
>  		}
>  	} else {
>  		rs->rs_pid = -1;
> -               KASSERT(LIST_EMPTY(&reap->p_reaplist), ("reap children list"));
> -               KASSERT(LIST_EMPTY(&reap->p_children), ("children list"));
>  	}
>  	return (0);
>  }

I'll try compiling a kernel with your patch and see what happens.
Received on Thu Aug 20 2015 - 18:26:11 UTC
