On 2020-Jun-11, at 19:25, Justin Hibbits <chmeeedalf at gmail.com> wrote:

> On Thu, 11 Jun 2020 17:30:24 -0700
> Mark Millard <marklmi_at_yahoo.com> wrote:
> 
>> On 2020-Jun-11, at 16:49, Mark Millard <marklmi at yahoo.com> wrote:
>> 
>>> On 2020-Jun-11, at 14:42, Justin Hibbits <chmeeedalf at gmail.com>
>>> wrote:
>>> 
>>> On Thu, 11 Jun 2020 14:36:37 -0700
>>> Mark Millard <marklmi_at_yahoo.com> wrote:
>>> 
>>>> On 2020-Jun-11, at 13:55, Justin Hibbits <chmeeedalf at gmail.com>
>>>> wrote:
>>>> 
>>>>> On Wed, 10 Jun 2020 18:56:57 -0700
>>>>> Mark Millard <marklmi_at_yahoo.com> wrote:
>>> . . .
>>>> 
>>>> 
>>>>> That said, the attached patch effectively copies
>>>>> what's done in OEA64 into the OEA pmap. Can you test it?
>>>> 
>>>> I'll try it once I get a chance, probably later
>>>> today.
>>>> . . .
>>> 
>>> No luck at the change being a fix, I'm afraid.
>>> 
>>> I verified that the build ended up with:
>>> 
>>> 00926cb0 <moea_protect+0x2ec> bl      008e8dc8 <PHYS_TO_VM_PAGE>
>>> 00926cb4 <moea_protect+0x2f0> mr      r27,r3
>>> 00926cb8 <moea_protect+0x2f4> addi    r3,r3,36
>>> 00926cbc <moea_protect+0x2f8> hwsync
>>> 00926cc0 <moea_protect+0x2fc> lwarx   r25,0,r3
>>> 00926cc4 <moea_protect+0x300> li      r4,0
>>> 00926cc8 <moea_protect+0x304> stwcx.  r4,0,r3
>>> 00926ccc <moea_protect+0x308> bne-    00926cc0 <moea_protect+0x2fc>
>>> 00926cd0 <moea_protect+0x30c> andi.   r3,r25,128
>>> 00926cd4 <moea_protect+0x310> beq     00926ce0 <moea_protect+0x31c>
>>> 00926cd8 <moea_protect+0x314> mr      r3,r27
>>> 00926cdc <moea_protect+0x318> bl      008e9874 <vm_page_dirty_KBI>
>>> 
>>> in the installed kernel, so I doubt a mis-build is involved.
>>> It is still a head -r360311 based context. World is built
>>> without MALLOC_PRODUCTION, so the jemalloc code executes its
>>> asserts, catching more problems, and catching them earlier,
>>> than it otherwise would.
>>> 
>>> First test . . .
>>> 
>>> The only thing that the witness kernel reported was:
>>> 
>>> Jun 11 15:58:16 FBSDG4S2 kernel: lock order reversal:
>>> Jun 11 15:58:16 FBSDG4S2 kernel: 1st 0x216fb00 Mountpoints (UMA zone) @ /usr/src/sys/vm/uma_core.c:4387
>>> Jun 11 15:58:16 FBSDG4S2 kernel: 2nd 0x1192d2c kernelpmap (kernelpmap) @ /usr/src/sys/powerpc/aim/mmu_oea.c:1524
>>> Jun 11 15:58:16 FBSDG4S2 kernel: stack backtrace:
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #0 0x5ec164 at witness_debugger+0x94
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #1 0x5ebe3c at witness_checkorder+0xb50
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #2 0x536d5c at __mtx_lock_flags+0xcc
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #3 0x92636c at moea_kextract+0x5c
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #4 0x965d30 at pmap_kextract+0x98
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #5 0x8bfdbc at zone_release+0xf0
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #6 0x8c7854 at bucket_drain+0x2f0
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #7 0x8c728c at bucket_free+0x54
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #8 0x8c74fc at bucket_cache_reclaim+0x1bc
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #9 0x8c7004 at zone_reclaim+0x128
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #10 0x8c3a40 at uma_reclaim+0x170
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #11 0x8c3f70 at uma_reclaim_worker+0x68
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #12 0x50fbac at fork_exit+0xb0
>>> Jun 11 15:58:16 FBSDG4S2 kernel: #13 0x9684ac at fork_trampoline+0xc
>>> 
>>> The processes that were hit were listed as:
>>> 
>>> Jun 11 15:59:11 FBSDG4S2 kernel: pid 971 (cron), jid 0, uid 0: exited on signal 11 (core dumped)
>>> Jun 11 16:02:59 FBSDG4S2 kernel: pid 1111 (stress), jid 0, uid 0: exited on signal 6 (core dumped)
>>> Jun 11 16:03:27 FBSDG4S2 kernel: pid 871 (mountd), jid 0, uid 0: exited on signal 6 (core dumped)
>>> Jun 11 16:03:40 FBSDG4S2 kernel: pid 1065 (su), jid 0, uid 0: exited on signal 6
>>> Jun 11 16:04:13 FBSDG4S2 kernel: pid 1088 (su), jid 0, uid 0: exited on signal 6
>>> Jun 11 16:04:28 FBSDG4S2 kernel: pid 968 (sshd), jid 0, uid 0: exited on signal 6
>>> Jun 11 16:05:42 FBSDG4S2 kernel: pid 1028 (login), jid 0, uid 0: exited on signal 6
>>> Jun 11 16:05:46 FBSDG4S2 kernel: pid 873 (nfsd), jid 0, uid 0: exited on signal 6 (core dumped)
>>> 
>>> Rebooting and rerunning, this time showing the stress output and
>>> such (I did not capture copies during the first test, but the
>>> first test had similar messages at the same sort of points):
>>> 
>>> Second test . . .
>>> 
>>> # stress -m 2 --vm-bytes 1700M
>>> stress: info: [1166] dispatching hogs: 0 cpu, 0 io, 2 vm, 0 hdd
>>> <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)"
>>> <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258: Failed assertion: "slab == extent_slab_get(extent)"
>>> ^C
>>> 
>>> # exit
>>> <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/sz.h:200: Failed assertion: "ret == sz_index2size_compute(index)"
>>> Abort trap
>>> 
>>> The other output was similar to the first test, so it is not
>>> repeated here.
>> 
>> The updated code looks odd to me for how "m" is
>> handled (part of an egrep to ensure I show all the
>> usage of m):
>> 
>> moea_protect(mmu_t mmu, pmap_t pm, vm_offset_t sva, vm_offset_t eva,
>>         vm_page_t m;
>>                         if (pm != kernel_pmap && m != NULL &&
>>                             (m->a.flags & PGA_EXECUTABLE) == 0 &&
>>                                 if ((m->oflags & VPO_UNMANAGED) == 0)
>>                                         vm_page_aflag_set(m, PGA_EXECUTABLE);
>>                                 m = PHYS_TO_VM_PAGE(old_pte.pte_lo & PTE_RPGN);
>>                                 refchg = atomic_readandclear_32(&m->md.mdpg_attrs);
>>                                         vm_page_dirty(m);
>>                                         vm_page_aflag_set(m, PGA_REFERENCED);
>> 
>> Or more completely, with notes mixed in:
>> 
>> void
>> moea_protect(mmu_t mmu, pmap_t pm, vm_offset_t sva, vm_offset_t eva,
>>     vm_prot_t prot)
>> {
>> . . .
>>         vm_page_t m;
>> . . .
>>         for (pvo = RB_NFIND(pvo_tree, &pm->pmap_pvo, &key);
>>             pvo != NULL && PVO_VADDR(pvo) < eva; pvo = tpvo) {
>> . . .
>>                 if (pt != NULL) {
>> . . .
>>                         if (pm != kernel_pmap && m != NULL &&
>> 
>> NOTE: m seems to be uninitialized here, yet it is tested for
>> being NULL above.
>> 
>>                             (m->a.flags & PGA_EXECUTABLE) == 0 &&
>> 
>> Note: this can end up using a random, non-NULL value for m
>> during the evaluation of m->a.flags .
>> 
>> . . .
>> 
>>                         if ((pvo->pvo_vaddr & PVO_MANAGED) &&
>>                             (pvo->pvo_pte.prot & VM_PROT_WRITE)) {
>>                                 m = PHYS_TO_VM_PAGE(old_pte.pte_lo & PTE_RPGN);
>> 
>> Note: m finally is potentially initialized(/set) here.
>> 
>>                                 refchg = atomic_readandclear_32(&m->md.mdpg_attrs);
>>                                 if (refchg & PTE_CHG)
>>                                         vm_page_dirty(m);
>>                                 if (refchg & PTE_REF)
>>                                         vm_page_aflag_set(m, PGA_REFERENCED);
>> . . .
>> 
>> Note: So, if m is set above, the next loop iteration(s) would
>> use this then-stale value instead of a freshly initialized one.
>> 
>> It looks to me like at least one assignment
>> to m is missing.
>> 
>> moea64_pvo_protect has pg, which seems analogous to
>> m, and it has:
>> 
>>         pg = PHYS_TO_VM_PAGE(pvo->pvo_pte.pa & LPTE_RPGN);
>> . . .
>>         if (pm != kernel_pmap && pg != NULL &&
>>             (pg->a.flags & PGA_EXECUTABLE) == 0 &&
>>             (pvo->pvo_pte.pa & (LPTE_I | LPTE_G | LPTE_NOEXEC)) == 0) {
>>                 if ((pg->oflags & VPO_UNMANAGED) == 0)
>>                         vm_page_aflag_set(pg, PGA_EXECUTABLE);
>> 
>> . . .
>>         if (pg != NULL && (pvo->pvo_vaddr & PVO_MANAGED) &&
>>             (oldprot & VM_PROT_WRITE)) {
>>                 refchg |= atomic_readandclear_32(&pg->md.mdpg_attrs);
>>                 if (refchg & LPTE_CHG)
>>                         vm_page_dirty(pg);
>>                 if (refchg & LPTE_REF)
>>                         vm_page_aflag_set(pg, PGA_REFERENCED);
>> 
>> This might suggest some of what is missing.
> 
> Can you try moving the assignment to 'm' to right below the
> moea_pte_change() call?

That change panics during boot. (The svnlite diff that I tried is
shown later.) The panic happened just after the lines reporting the
ada0 and cd0 details. (I do not know which internal boot stage it was
in.) Hand translated from a picture of the screen:

panic: vm_page_free_prep: mapping flags set in page 0xd032a078
. . .
panic
vm_page_free_prep
vm_page_free_toq
vm_page_free
vm_object_collapse
vm_object_deallocate
vm_map_process_deferred
vm_map_remove
exec_new_vmspace
exec_elf32_imgact
kern_execve
sys_execve
trap
powerpc_interrupt
user SC trap by 0x100d7af8
. . .
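To make the stale/uninitialized use of m concrete outside the kernel,
here is a minimal standalone C mock. It is not FreeBSD source: the
struct names, pvo_to_page(), and the flag values are invented
stand-ins used only to model the control flow described above, where
m is tested and dereferenced before the current loop pass assigns it,
so the first pass sees whatever m already held and later passes can
see a value left over from an earlier pass.

/* stale_m_demo.c: illustrative mock only; not FreeBSD code. */
#include <stdio.h>
#include <stddef.h>

struct vm_page_mock { int flags; };
struct pvo_mock {
        int managed;
        int writable;
        struct vm_page_mock *page;
};

/* Stand-in for PHYS_TO_VM_PAGE(): resolve a pvo entry to its page. */
static struct vm_page_mock *
pvo_to_page(const struct pvo_mock *pvo)
{
        return (pvo->page);
}

int
main(void)
{
        struct vm_page_mock pages[3] = { { 0x1 }, { 0x2 }, { 0x4 } };
        struct pvo_mock pvos[3] = {
                { 0, 0, &pages[0] },    /* m is never assigned this pass */
                { 1, 1, &pages[1] },    /* m is assigned at the end of this pass */
                { 0, 0, &pages[2] },    /* m still points at pages[1] here */
        };
        struct vm_page_mock *m = NULL;  /* the kernel code leaves m uninitialized */
        int i;

        for (i = 0; i < 3; i++) {
                /*
                 * Problem pattern: m is consulted before this pass
                 * assigns it, so it is either unset (first pass) or a
                 * leftover value from an earlier pass.
                 */
                if (m != NULL)
                        printf("pass %d tests a stale m: flags 0x%x\n",
                            i, m->flags);
                else
                        printf("pass %d tests m before any assignment\n", i);

                /* The only assignment to m, and it is conditional. */
                if (pvos[i].managed && pvos[i].writable)
                        m = pvo_to_page(&pvos[i]);
        }
        return (0);
}

The moea64_pvo_protect ordering quoted above avoids both problems by
resolving the page from the pvo at the top of each pass, before any
test of the page pointer.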
# svnlite diff /usr/src/sys/powerpc/aim/mmu_oea.c
Index: /usr/src/sys/powerpc/aim/mmu_oea.c
===================================================================
--- /usr/src/sys/powerpc/aim/mmu_oea.c  (revision 360322)
+++ /usr/src/sys/powerpc/aim/mmu_oea.c  (working copy)
@@ -1773,6 +1773,9 @@
 {
         struct  pvo_entry *pvo, *tpvo, key;
         struct  pte *pt;
+        struct  pte old_pte;
+        vm_page_t m;
+        int32_t refchg;
 
         KASSERT(pm == &curproc->p_vmspace->vm_pmap || pm == kernel_pmap,
             ("moea_protect: non current pmap"));
@@ -1800,12 +1803,31 @@
                 pvo->pvo_pte.pte.pte_lo &= ~PTE_PP;
                 pvo->pvo_pte.pte.pte_lo |= PTE_BR;
+                old_pte = *pt;
+
                 /*
                  * If the PVO is in the page table, update that pte as well.
                  */
                 if (pt != NULL) {
                         moea_pte_change(pt, &pvo->pvo_pte.pte, pvo->pvo_vaddr);
+                        m = PHYS_TO_VM_PAGE(old_pte.pte_lo & PTE_RPGN);
+                        if (pm != kernel_pmap && m != NULL &&
+                            (m->a.flags & PGA_EXECUTABLE) == 0 &&
+                            (pvo->pvo_pte.pa & (PTE_I | PTE_G)) == 0) {
+                                if ((m->oflags & VPO_UNMANAGED) == 0)
+                                        vm_page_aflag_set(m, PGA_EXECUTABLE);
+                                moea_syncicache(pvo->pvo_pte.pa & PTE_RPGN,
+                                    PAGE_SIZE);
+                        }
                         mtx_unlock(&moea_table_mutex);
+                        if ((pvo->pvo_vaddr & PVO_MANAGED) &&
+                            (pvo->pvo_pte.prot & VM_PROT_WRITE)) {
+                                refchg = atomic_readandclear_32(&m->md.mdpg_attrs);
+                                if (refchg & PTE_CHG)
+                                        vm_page_dirty(m);
+                                if (refchg & PTE_REF)
+                                        vm_page_aflag_set(m, PGA_REFERENCED);
+                        }
                 }
         }
 
         rw_wunlock(&pvh_global_lock);

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went away in early 2018-Mar)