Re: crash on process exit.. current at about r332467

From: Bryan Drewery <bdrewery_at_FreeBSD.org>
Date: Wed, 25 Apr 2018 13:35:20 -0700
On 4/25/18 12:41 PM, Bryan Drewery wrote:
> On 4/25/18 12:39 PM, Bryan Drewery wrote:
>> On 4/23/18 7:50 AM, Julian Elischer wrote:
>>> back trace at:  http://www.freebsd.org/~julian/bob-crash.png
>>>
>>> If anyone wants to take a look..
>>>
>>> In the exit syscall, while deallocating a vm object.
>>>
>>> I haven't seen references to a similar crash in the last 10 days or so..
>>> But if it rings any bells...
>>>
>>
>> I just hit this on r332455 and have a dump.
>>
>>> panic: Bad link elm 0xfffff811cd920e60 prev->next != elm
>>> cpuid = 10
>>> time = 1524682939
>>> KDB: stack backtrace:
>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfffffe23f450c3b0
>>> vpanic() at vpanic+0x18d/frame 0xfffffe23f450c410
>>> panic() at panic+0x43/frame 0xfffffe23f450c470
>>> vm_object_destroy() at vm_object_destroy/frame 0xfffffe23f450c4d0
>>> vm_object_deallocate() at vm_object_deallocate+0x45c/frame 0xfffffe23f450c530
>>> vm_map_process_deferred() at vm_map_process_deferred+0x99/frame 0xfffffe23f450c560
>>> vm_map_remove() at vm_map_remove+0xc6/frame 0xfffffe23f450c590
>>> exec_new_vmspace() at exec_new_vmspace+0x185/frame 0xfffffe23f450c5f0
>>> exec_elf64_imgact() at exec_elf64_imgact+0x8fb/frame 0xfffffe23f450c6e0
>>> kern_execve() at kern_execve+0x82c/frame 0xfffffe23f450ca40
>>> sys_execve() at sys_execve+0x4c/frame 0xfffffe23f450cac0
>>> amd64_syscall() at amd64_syscall+0x786/frame 0xfffffe23f450cbf0
>>> fast_syscall_common() at fast_syscall_common+0x101/frame 0xfffffe23f450cbf0
>>> --- syscall (59, FreeBSD ELF64, sys_execve), rip = 0x800d7af7a, rsp = 0x7fffffffbd28, rbp = 0x7fffffffbe70 ---
>>
>>
>>
> 
> It's a different stack than Julian's but it seems like the same issue to me.
> 

Here's the real stack:

> #12 0xffffffff80b6d1c3 in panic (fmt=0xffffffff81dee958 <cnputs_mtx> "\322\237'\201\377\377\377\377") at /usr/src/sys/kern/kern_shutdown.c:764
> #13 0xffffffff80eac060 in vm_object_terminate (object=0xfffff80587c4de00) at /usr/src/sys/vm/vm_object.c:868
> #14 0xffffffff80eaaf2c in vm_object_deallocate (object=0xfffff80587c4de00) at /usr/src/sys/vm/vm_object.c:684
> #15 0xffffffff80ea0089 in vm_map_entry_deallocate (system_map=<error reading variable: Cannot access memory at address 0x0>, entry=<optimized out>) at /usr/src/sys/vm/vm_map.c:2997
> #16 vm_map_process_deferred () at /usr/src/sys/vm/vm_map.c:541
> #17 0xffffffff80ea5186 in _vm_map_unlock (map=<optimized out>, file=<optimized out>, line=3189) at /usr/src/sys/vm/vm_map.c:554
> #18 vm_map_remove (map=<optimized out>, start=4096, end=140737488355328) at /usr/src/sys/vm/vm_map.c:3189
> #19 0xffffffff80b24c35 in exec_new_vmspace (imgp=0xfffffe23f450c8b0, sv=<optimized out>) at /usr/src/sys/kern/kern_exec.c:1099
> #20 0xffffffff80afaf1b in exec_elf64_imgact (imgp=<optimized out>) at /usr/src/sys/kern/imgact_elf.c:922
> #21 0xffffffff80b2380c in do_execve (td=<optimized out>, args=<optimized out>, mac_p=<optimized out>) at /usr/src/sys/kern/kern_exec.c:606
> #22 kern_execve (td=<optimized out>, args=<optimized out>, mac_p=<optimized out>) at /usr/src/sys/kern/kern_exec.c:351
> #23 0xffffffff80b22c9c in sys_execve (td=0xfffff801493c6000, uap=0xfffff801493c63c0) at /usr/src/sys/kern/kern_exec.c:225

> (kgdb) frame 13
> #13 0xffffffff80eac060 in vm_object_terminate (object=0xfffff80587c4de00) at /usr/src/sys/vm/vm_object.c:868
> 868             KASSERT(object->cred == NULL || object->type == OBJT_DEFAULT ||
> (kgdb) p *object
> $2 = {lock = {lock_object = {lo_name = 0xffffffff81214cff "vm object", lo_flags = 627245056, lo_data = 0, lo_witness = 0xfffff8123fd6a700}, rw_lock = 18446735283140190208}, object_list = {
>     tqe_next = 0xfffff80587c67000, tqe_prev = 0xfffff80587c4dd20}, shadow_head = {lh_first = 0x0}, shadow_list = {le_next = 0xfffff80a9606c500, le_prev = 0xfffff8054f0c2c30}, memq = {
>     tqh_first = 0xfffff811d00c9980, tqh_last = 0xfffff811d850b8a0}, rtree = {rt_root = 18446735322516082688}, size = 2048, domain = {dr_policy = 0x0, dr_iterator = 0}, generation = 1, ref_count = 0,
>   shadow_count = 0, memattr = 6 '\006', type = 0 '\000', flags = 12296, pg_color = 1024, paging_in_progress = 0, resident_page_count = 5, backing_object = 0x0, backing_object_offset = 0,
>   pager_object_list = {tqe_next = 0x0, tqe_prev = 0x0}, rvq = {lh_first = 0xfffff811cbac2240}, handle = 0x0, un_pager = {vnp = {vnp_size = 0, writemappings = 0}, devp = {devp_pglist = {
>         tqh_first = 0x0, tqh_last = 0x0}, ops = 0x0, dev = 0x0}, sgp = {sgp_pglist = {tqh_first = 0x0, tqh_last = 0x0}}, swp = {swp_tmpfs = 0x0, swp_blks = {pt_root = 0}}}, cred = 0xfffff807ebd91500,
>   charge = 8388608, umtx_data = 0x0}

object->type = OBJT_DEFAULT.
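
For reference, a minimal userland sketch of the predicate tested by the KASSERT at vm_object.c:868, replayed with the two relevant fields from the dump (cred set, type = 0). The struct and function names here are hypothetical stand-ins, not kernel code, and the enum only mirrors the first few obj_type values. Under those assumptions the predicate is true, so this assertion would not be expected to fire for this object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's enum obj_type (assumption: OBJT_DEFAULT is 0,
 * which matches "type = 0" in the dump). */
enum obj_type { OBJT_DEFAULT, OBJT_SWAP, OBJT_VNODE };

/* Hypothetical cut-down object: only the two fields the KASSERT reads. */
struct fake_object {
    const void   *cred;
    enum obj_type type;
};

/* Same boolean expression as the KASSERT quoted from line 868. */
bool cred_assert_holds(const struct fake_object *o)
{
    return o->cred == NULL || o->type == OBJT_DEFAULT ||
        o->type == OBJT_SWAP;
}

/* Replays the check with the values from the dump: cred set, type 0. */
void replay_dump_values(void)
{
    static const int cred_stand_in = 0;
    struct fake_object obj = { &cred_stand_in, OBJT_DEFAULT };

    assert(cred_assert_holds(&obj));
}
```

Calling replay_dump_values() returns without tripping the assert; only an object with a non-NULL cred and a type other than OBJT_DEFAULT or OBJT_SWAP would fail the predicate.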

Er, I'm not sure what's going on here: line 868 is a totally different
assert from the queue(3) one reported in the msgbuf.

> 868         KASSERT(object->cred == NULL || object->type == OBJT_DEFAULT ||
> 869             object->type == OBJT_SWAP,
> 870             ("%s: non-swap obj %p has cred", __func__, object));
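
By contrast, the "Bad link elm ... prev->next != elm" text in the msgbuf is the kind of message produced by the queue(3) link checks that debug kernels run before unlinking an element. A minimal userland sketch of that invariant follows, assuming a LIST-style entry layout; the struct and function names are hypothetical, and the real macros panic rather than return false. It only illustrates what the check tests and does not say how the list in this dump got corrupted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical element type; mimics the LIST_ENTRY layout from queue(3). */
struct elm {
    struct elm  *le_next;   /* next element, or NULL */
    struct elm **le_prev;   /* address of the previous element's le_next */
};

/* True when the back link is consistent, i.e. prev->next == elm. */
bool link_ok(const struct elm *e)
{
    return *e->le_prev == e;
}

/* Unlink e, refusing (returning false) when the invariant is broken. */
bool remove_checked(struct elm *e)
{
    if (!link_ok(e))
        return false;   /* a debug kernel would panic at this point */
    if (e->le_next != NULL)
        e->le_next->le_prev = e->le_prev;
    *e->le_prev = e->le_next;
    return true;
}

/* One way to break the invariant: removing the same element twice. */
void demo_double_remove(void)
{
    struct elm *head = NULL;
    struct elm a = { NULL, &head };

    head = &a;
    assert(remove_checked(&a));    /* first removal: links consistent */
    assert(!remove_checked(&a));   /* second: head no longer points at a */
}
```

The point of the sketch is that this check reads only the list linkage, whereas the KASSERT at line 868 reads cred and type, so the two failures are unrelated conditions.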



-- 
Regards,
Bryan Drewery


Received on Wed Apr 25 2018 - 18:35:27 UTC
