It turns out that the bt's from the example panics are repeatable for the pc and lr sequence involved (but not for the specific sp's and fp's). I report this in case it suggests anything.

I'll note that the build had a production style kernel for a build of -r317015.

The first type of panic is actually a back-to-back sequence of two bt's: this is the sleeping-thread type of example. The second type is just one bt by itself.

There is one variable lr in the bt for the sleeping-thread type of example (the first type of panic of the two shown later, the one with back-to-back bt's):

131,133c131,133
< handle_el0_sync() at 0x40040070
<          pc = 0xffff0000006079e8  lr = 0x0000000040040070
<          sp = 0xffff000065dfdba0  fp = 0x0000ffffffffeb00
---
> handle_el0_sync() at 0x40044490
>          pc = 0xffff0000006079e8  lr = 0x0000000040044490
>          sp = 0xffff000040229ba0  fp = 0x0000ffffffffe3d0

Otherwise the two bt's in the example match for the pc/lr sequence. I only have the two examples of this type to compare so far (one diff).

I have 3 examples of the second type and they had no such variation.

One thing in common to all 5 of these examples is the sequence:

data_abort() at handle_el1h_sync+0x70
         lr = 0xffff000000607870

handle_el1h_sync() at pmap_remove_pages+0x2a8
         pc = 0xffff000000607870  lr = 0xffff0000006175d4

That is, pmap_remove_pages() is involved in each example.

I'm not saying that I can cause any panics at will, but when either of the two types happens the bt is (mostly) stable for the pc and lr sequence, and the short sequence above is involved.

I have seen one other type of panic but I have not managed to record a bt for it yet. It involved the instruction cache instead of arm64_dcache_wb_range.

I quote the previously reported example bt's below.

On 2017-May-2, at 5:24 AM, Mark Millard <markmi at dsl-only.net> wrote:

> On 2017-May-2, at 3:37 AM, Mark Millard <markmi at dsl-only.net> wrote:
>
>> On 2017-May-2, at 2:53 AM, Mark Millard <markmi at dsl-only.net> wrote:
>>
>> . . .
>> FYI:
>>
>> I do sometimes get things like:
>>
>>
>> System shutdown time has arrived
>> Apr 30 19:43:15 ODC2FBSD shutdown: power-down by root:
>> Sleeping thread (tid 100093, pid 708) owns a non-sleepable lock
>> KDB: stack backtrace of thread 100093:
>> sched_switch() at mi_switch+0x100
>>          pc = 0xffff000000347d44  lr = 0xffff000000327358
>>          sp = 0xffff000040237e00  fp = 0xffff000040237e20
>>
>> mi_switch() at sleepq_wait+0x3c
>>          pc = 0xffff000000327358  lr = 0xffff00000036c174
>>          sp = 0xffff000040237e30  fp = 0xffff000040237e50
>>
>> sleepq_wait() at _sleep+0x29c
>>          pc = 0xffff00000036c174  lr = 0xffff000000326c7c
>>          sp = 0xffff000040237e60  fp = 0xffff000040237ee0
>>
>> _sleep() at vm_page_sleep_if_busy+0xb0
>>          pc = 0xffff000000326c7c  lr = 0xffff0000005cfcf4
>>          sp = 0xffff000040237ef0  fp = 0xffff000040237f10
>>
>> vm_page_sleep_if_busy() at vm_fault_hold+0xcc8
>>          pc = 0xffff0000005cfcf4  lr = 0xffff0000005ba17c
>>          sp = 0xffff000040237f20  fp = 0xffff000040238070
>>
>> vm_fault_hold() at vm_fault+0x70
>>          pc = 0xffff0000005ba17c  lr = 0xffff0000005b9464
>>          sp = 0xffff000040238080  fp = 0xffff0000402380b0
>>
>> vm_fault() at data_abort+0xe0
>>          pc = 0xffff0000005b9464  lr = 0xffff00000061ad94
>>          sp = 0xffff0000402380c0  fp = 0xffff000040238170
>>
>> data_abort() at handle_el1h_sync+0x70
>>          pc = 0xffff00000061ad94  lr = 0xffff000000607870
>>          sp = 0xffff000040238180  fp = 0xffff000040238290
>>
>> handle_el1h_sync() at pmap_enter+0x678
>>          pc = 0xffff000000607870  lr = 0xffff000000615684
>>          sp = 0xffff0000402382a0  fp = 0xffff0000402383b0
>>
>> pmap_enter() at vm_fault_hold+0x17c0
>>          pc = 0xffff000000615684  lr = 0xffff0000005bac74
>>          sp = 0xffff0000402383c0  fp = 0xffff000040238510
>>
>> vm_fault_hold() at vm_fault+0x70
>>          pc = 0xffff0000005bac74  lr = 0xffff0000005b9464
>>          sp = 0xffff000040238520  fp = 0xffff000040238550
>>
>> vm_fault() at data_abort+0xe0
>>          pc = 0xffff0000005b9464  lr = 0xffff00000061ad94
>>          sp = 0xffff000040238560  fp = 0xffff000040238610
>>
>> data_abort() at handle_el1h_sync+0x70
>>          pc = 0xffff00000061ad94  lr = 0xffff000000607870
>>          sp = 0xffff000040238620  fp = 0xffff000040238730
>>
>> handle_el1h_sync() at pmap_remove_pages+0x2a8
>>          pc = 0xffff000000607870  lr = 0xffff0000006175d4
>>          sp = 0xffff000040238740  fp = 0xffff000040238870
>>
>> pmap_remove_pages() at vmspace_exit+0xb0
>>          pc = 0xffff0000006175d4  lr = 0xffff0000005c020c
>>          sp = 0xffff000040238880  fp = 0xffff0000402388b0
>>
>> vmspace_exit() at exit1+0x604
>>          pc = 0xffff0000005c020c  lr = 0xffff0000002db5e0
>>          sp = 0xffff0000402388c0  fp = 0xffff000040238920
>>
>> exit1() at sys_sys_exit+0x10
>>          pc = 0xffff0000002db5e0  lr = 0xffff0000002dafd8
>>          sp = 0xffff000040238930  fp = 0xffff000040238930
>>
>> sys_sys_exit() at do_el0_sync+0xa48
>>          pc = 0xffff0000002dafd8  lr = 0xffff00000061b91c
>>          sp = 0xffff000040238940  fp = 0xffff000040238a70
>>
>> do_el0_sync() at handle_el0_sync+0x6c
>>          pc = 0xffff00000061b91c  lr = 0xffff0000006079e8
>>          sp = 0xffff000040238a80  fp = 0xffff000040238b90
>>
>> handle_el0_sync() at 0x38cc0
>>          pc = 0xffff0000006079e8  lr = 0x0000000000038cc0
>>          sp = 0xffff000040238ba0  fp = 0x0000ffffffffed00
>>
>> panic: sleeping thread
>> cpuid = 2
>> time = 1493581440
>> KDB: stack backtrace:
>> db_trace_self() at db_trace_self_wrapper+0x28
>>          pc = 0xffff000000605cc0  lr = 0xffff0000000869cc
>>          sp = 0xffff000065dfd320  fp = 0xffff000065dfd530
>>
>> db_trace_self_wrapper() at vpanic+0x164
>>          pc = 0xffff0000000869cc  lr = 0xffff00000031d464
>>          sp = 0xffff000065dfd540  fp = 0xffff000065dfd5b0
>>
>> vpanic() at panic+0x4c
>>          pc = 0xffff00000031d464  lr = 0xffff00000031d2fc
>>          sp = 0xffff000065dfd5c0  fp = 0xffff000065dfd640
>>
>> panic() at propagate_priority+0x2d0
>>          pc = 0xffff00000031d2fc  lr = 0xffff000000374558
>>          sp = 0xffff000065dfd650  fp = 0xffff000065dfd690
>>
>> propagate_priority() at turnstile_wait+0x340
>>          pc = 0xffff000000374558  lr = 0xffff00000037503c
>>          sp = 0xffff000065dfd6a0  fp = 0xffff000065dfd6e0
>>
>> turnstile_wait() at __rw_wlock_hard+0x208
>>          pc = 0xffff00000037503c  lr = 0xffff000000319138
>>          sp = 0xffff000065dfd6f0  fp = 0xffff000065dfd770
>>
>> __rw_wlock_hard() at pmap_enter+0xe98
>>          pc = 0xffff000000319138  lr = 0xffff000000615ea4
>>          sp = 0xffff000065dfd780  fp = 0xffff000065dfd810
>>
>> pmap_enter() at vm_fault_hold+0x28c
>>          pc = 0xffff000000615ea4  lr = 0xffff0000005b9740
>>          sp = 0xffff000065dfd820  fp = 0xffff000065dfd970
>>
>> vm_fault_hold() at vm_fault+0x70
>>          pc = 0xffff0000005b9740  lr = 0xffff0000005b9464
>>          sp = 0xffff000065dfd980  fp = 0xffff000065dfd9b0
>>
>> vm_fault() at data_abort+0xe0
>>          pc = 0xffff0000005b9464  lr = 0xffff00000061ad94
>>          sp = 0xffff000065dfd9c0  fp = 0xffff000065dfda70
>>
>> data_abort() at handle_el0_sync+0x6c
>>          pc = 0xffff00000061ad94  lr = 0xffff0000006079e8
>>          sp = 0xffff000065dfda80  fp = 0xffff000065dfdb90
>>
>> handle_el0_sync() at 0x40040070
>>          pc = 0xffff0000006079e8  lr = 0x0000000040040070
>>          sp = 0xffff000065dfdba0  fp = 0x0000ffffffffeb00
>>
>> KDB: enter: panic
>> [ thread pid 709 tid 100086 ]
>> Stopped at kdb_enter+0x44: undefined d4200000
>> db>
>
> Another example failure is:
>
> Fatal data abort:
>   x0: 400a9000
>   x1: 1000
>   x2: 0
>   x3: 40
>   x4: 3f
>   x5: fffffd00304e5000
>   x6: 2b52
>   x7: c
>   x8: b
>   x9: fffffd000076d5d0
>   x10: 68
>   x11: 40000000
>   x12: 704c5000
>   x13: 42b42003
>   x14: 42b42003
>   x15: 40000000
>   x16: c
>   x17: ffffffffffffffff
>   x18: ffff000065dd5310
>   x19: 800000000000000
>   x20: 1
>   x21: fffffd0002b43000
>   x22: 12000004556478b
>   x23: f000000000000000
>   x24: fffffd0002b41bc8
>   x25: 40
>   x26: fffffd0002b42548
>   x27: 7b
>   x28: 3
>   x29: ffff000065dd53c0
>   sp: ffff000065dd5310
>   lr: ffff0000006175d8
>   elr: ffff00000060589c
>   spsr: 60000345
>   far: 400a9000
>   esr: 96000147
> [ thread pid 715 tid 100078 ]
> Stopped at arm64_dcache_wb_range+0x18: undefined d50b7a20
> db> bt
> Tracing pid 715 tid 100078 td 0xfffffd00007849c0
> db_trace_self() at db_stack_trace+0xf0
>          pc = 0xffff000000605cc0  lr = 0xffff0000000840e0
>          sp = 0xffff000065dd4cb0  fp = 0xffff000065dd4ce0
>
> db_stack_trace() at db_command+0x23c
>          pc = 0xffff0000000840e0  lr = 0xffff000000083d58
>          sp = 0xffff000065dd4cf0  fp = 0xffff000065dd4dd0
>
> db_command() at db_command_loop+0x60
>          pc = 0xffff000000083d58  lr = 0xffff000000083b00
>          sp = 0xffff000065dd4de0  fp = 0xffff000065dd4e00
>
> db_command_loop() at db_trap+0xf4
>          pc = 0xffff000000083b00  lr = 0xffff000000086b34
>          sp = 0xffff000065dd4e10  fp = 0xffff000065dd5030
>
> db_trap() at kdb_trap+0x180
>          pc = 0xffff000000086b34  lr = 0xffff00000035f650
>          sp = 0xffff000065dd5040  fp = 0xffff000065dd50a0
>
> kdb_trap() at data_abort+0x1a0
>          pc = 0xffff00000035f650  lr = 0xffff00000061ae54
>          sp = 0xffff000065dd50b0  fp = 0xffff000065dd5160
>
> data_abort() at handle_el1h_sync+0x70
>          pc = 0xffff00000061ae54  lr = 0xffff000000607870
>          sp = 0xffff000065dd5170  fp = 0xffff000065dd5280
>
> handle_el1h_sync() at pmap_remove_pages+0x2a8
>          pc = 0xffff000000607870  lr = 0xffff0000006175d4
>          sp = 0xffff000065dd5290  fp = 0xffff000065dd53c0
>
> pmap_remove_pages() at exec_new_vmspace+0x1a4
>          pc = 0xffff0000006175d4  lr = 0xffff0000002d9da0
>          sp = 0xffff000065dd53d0  fp = 0xffff000065dd5430
>
> exec_new_vmspace() at exec_elf64_imgact+0xa70
>          pc = 0xffff0000002d9da0  lr = 0xffff0000002b7c14
>          sp = 0xffff000065dd5440  fp = 0xffff000065dd5550
>
> exec_elf64_imgact() at kern_execve+0x664
>          pc = 0xffff0000002b7c14  lr = 0xffff0000002d8730
>          sp = 0xffff000065dd5560  fp = 0xffff000065dd58b0
>
> kern_execve() at sys_execve+0x54
>          pc = 0xffff0000002d8730  lr = 0xffff0000002d7d90
>          sp = 0xffff000065dd58c0  fp = 0xffff000065dd5930
>
> sys_execve() at do_el0_sync+0xa48
>          pc = 0xffff0000002d7d90  lr = 0xffff00000061b91c
>          sp = 0xffff000065dd5940  fp = 0xffff000065dd5a70
>
> do_el0_sync() at handle_el0_sync+0x6c
>          pc = 0xffff00000061b91c  lr = 0xffff0000006079e8
>          sp = 0xffff000065dd5a80  fp = 0xffff000065dd5b90
>
> handle_el0_sync() at 0x24a90
>          pc = 0xffff0000006079e8  lr = 0x0000000000024a90
>          sp = 0xffff000065dd5ba0  fp = 0x0000ffffffffe7d0
>
> db>

===
Mark Millard
markmi at dsl-only.net

Received on Tue May 02 2017 - 19:29:08 UTC
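For anyone wanting to repeat the pc/lr comparison described above, here is a minimal sketch of one way to do it. It assumes each frame is printed as a "function() at caller" line followed by a "pc = ... lr = ..." line, as in the quoted bt's. The names pc_lr_sequence, differing_frames, BT_A and BT_B are hypothetical, chosen only for this illustration; the sample data is just the one differing frame from the diff shown earlier, with its diff markers left in place.

```python
import re

# Frame header, e.g. "handle_el0_sync() at 0x40040070".
FRAME_RE = re.compile(r"(\w+)\(\) at (\S+)")
# The "pc = 0x...  lr = 0x..." line that follows a frame header.
PC_LR_RE = re.compile(r"pc = (0x[0-9a-f]+)\s+lr = (0x[0-9a-f]+)")


def pc_lr_sequence(bt_text):
    """Return [(function, pc, lr), ...] for a bt, ignoring sp and fp."""
    frames = []
    func = None
    for line in bt_text.splitlines():
        # Drop email-quote or diff markers ("> ", ">> ", "< ") and indentation.
        line = line.lstrip("<> ").strip()
        header = FRAME_RE.match(line)
        if header:
            func = header.group(1)
            continue
        regs = PC_LR_RE.search(line)
        if regs and func is not None:
            frames.append((func, regs.group(1), regs.group(2)))
            func = None
    return frames


def differing_frames(bt_a, bt_b):
    """Pair up frames by position and keep the pairs whose pc or lr differ.

    zip() stops at the shorter bt, so a length mismatch is not reported here.
    """
    return [(a, b)
            for a, b in zip(pc_lr_sequence(bt_a), pc_lr_sequence(bt_b))
            if a != b]


# The one frame that differed between the two sleeping-thread examples.
BT_A = """\
< handle_el0_sync() at 0x40040070
<          pc = 0xffff0000006079e8  lr = 0x0000000040040070
<          sp = 0xffff000065dfdba0  fp = 0x0000ffffffffeb00
"""

BT_B = """\
> handle_el0_sync() at 0x40044490
>          pc = 0xffff0000006079e8  lr = 0x0000000040044490
>          sp = 0xffff000040229ba0  fp = 0x0000ffffffffe3d0
"""

if __name__ == "__main__":
    for frame_a, frame_b in differing_frames(BT_A, BT_B):
        print(frame_a, "!=", frame_b)
```

Because sp and fp are never captured, two bt's that differ only in stack addresses compare as equal, which is the sense of "repeatable" used above.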