Hi Mark,

On Wed, 13 May 2020 01:43:23 -0700
Mark Millard <marklmi_at_yahoo.com> wrote:

> [I'm adding a reference to an old arm64/aarch64 bug that had
> pages turning to zero, in case this 32-bit powerpc issue is
> somewhat analogous.]
>
> On 2020-May-13, at 00:29, Mark Millard <marklmi at yahoo.com> wrote:
>
> > [stress alone is sufficient to have jemalloc asserts fail
> > in stress, no need for a multi-socket G4 either. No need
> > to involve nfsd, mountd, rpcbind or the like. This is not
> > a claim that I know all the problems to be the same, just
> > that a jemalloc reported failure in this simpler context
> > happens and zeroed pages are involved.]
> >
> > Reminder: head -r360311 based context.
> >
> >
> > First I show a single CPU/core PowerMac G4 context failing
> > in stress. (I actually did this later, but it is the
> > simpler context.) I simply moved the media from the
> > 2-socket G4 to this slower, single-cpu/core one.
> >
> > cpu0: Motorola PowerPC 7400 revision 2.9, 466.42 MHz
> > cpu0: Features 9c000000<PPC32,ALTIVEC,FPU,MMU>
> > cpu0: HID0 8094c0a4<EMCP,DOZE,DPM,EIEC,ICE,DCE,SGE,BTIC,BHT>
> > real memory  = 1577857024 (1504 MB)
> > avail memory = 1527508992 (1456 MB)
> >
> > # stress -m 1 --vm-bytes 1792M
> > stress: info: [1024] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
> > <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258:
> > Failed assertion: "slab == extent_slab_get(extent)"
> > stress: FAIL: [1024] (415) <-- worker 1025 got signal 6
> > stress: WARN: [1024] (417) now reaping child worker processes
> > stress: FAIL: [1024] (451) failed run completed in 243s
> >
> > (Note: 1792 is the biggest it allowed with M.)
> >
> > The following still pages in and out and fails:
> >
> > # stress -m 1 --vm-bytes 1290M
> > stress: info: [1163] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
> > <jemalloc>: /usr/src/contrib/jemalloc/include/jemalloc/internal/arena_inlines_b.h:258:
> > Failed assertion: "slab == extent_slab_get(extent)"
> > . . .
> >
> > By contrast, the following had no problem for as
> > long as I let it run --and did not page in or out:
> >
> > # stress -m 1 --vm-bytes 1280M
> > stress: info: [1181] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
> > ...
>
> The following was a fix for a "pages magically
> turn into zeros" problem on amd64/aarch64. The
> original 32-bit powerpc context did not seem a
> match to me --but the stress test behavior that
> I've just observed seems closer from an
> external-test point of view: swapping is involved.
>
> Maybe this will suggest something to someone who
> knows what they are doing.
>
> (Note: dsl-only.net closed down, so the E-mail
> address reference is no longer valid.)
>
> Author: kib
> Date: Mon Apr 10 15:32:26 2017
> New Revision: 316679
> URL: https://svnweb.freebsd.org/changeset/base/316679
>
> Log:
>   Do not lose dirty bits for removing PROT_WRITE on arm64.
>
>   Arm64 pmap interprets accessed writable ptes as modified, since
>   ARMv8.0 does not track Dirty Bit Modifier in hardware. If writable
>   bit is removed, page must be marked as dirty for MI VM.
>
>   This change is most important for COW, where fork caused losing
>   content of the dirty pages which were not yet scanned by pagedaemon.
>
>   Reviewed by:            alc, andrew
>   Reported and tested by: Mark Millard <markmi at dsl-only.net>
>   PR:                     217138, 217239
>   Sponsored by:           The FreeBSD Foundation
>   MFC after:              2 weeks
>
> Modified:
>   head/sys/arm64/arm64/pmap.c
>
> Modified: head/sys/arm64/arm64/pmap.c
> ==============================================================================
> --- head/sys/arm64/arm64/pmap.c  Mon Apr 10 12:35:58 2017  (r316678)
> +++ head/sys/arm64/arm64/pmap.c  Mon Apr 10 15:32:26 2017  (r316679)
> @@ -2481,6 +2481,11 @@ pmap_protect(pmap_t pmap, vm_offset_t sv
>  	    sva += L3_SIZE) {
>  		l3 = pmap_load(l3p);
>  		if (pmap_l3_valid(l3)) {
> +			if ((l3 & ATTR_SW_MANAGED) &&
> +			    pmap_page_dirty(l3)) {
> +				vm_page_dirty(PHYS_TO_VM_PAGE(l3 &
> +				    ~ATTR_MASK));
> +			}
>  			pmap_set(l3p, ATTR_AP(ATTR_AP_RO));
>  			PTE_SYNC(l3p);
>  			/* XXX: Use pmap_invalidate_range */
>
>
> ===
> Mark Millard
> marklmi at yahoo.com
> ( dsl-only.net went
> away in early 2018-Mar)

Thanks for this reference.  I took a quick look at the 3 pmap
implementations we have (haven't checked the new radix pmap yet), and it
looks like only mmu_oea.c (32-bit AIM pmap, for G3 and G4) is missing
vm_page_dirty() calls in its pmap_protect() implementation, analogous to
the change you posted right above.  Given this, I think it's safe to say
that this missing piece is necessary.  We'll work on a fix for this;
looking at moea64_protect(), there may be additional work needed to
support this as well, so it may take a few days.

- Justin
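[For context on the shape of the change being discussed: the sketch below shows roughly what the analogous dirty-bit handling could look like in the 32-bit AIM pmap. It is only a sketch under assumptions, not the committed fix. The helper name moea_protect_downgrade_one() is hypothetical and stands in for the body of moea_protect()'s per-mapping loop; the PVO/PTE names (PVO_MANAGED, PTE_CHG, PTE_RPGN, PTE_PP, PTE_BR) and moea_pte_change() are taken from sys/powerpc/aim/mmu_oea.c as understood here; and the assumption that moea_pte_change() folds the hardware Referenced/Changed bits back into the cached PTE may not hold exactly. The idea mirrors r316679: report a managed, modified page to the MI VM via vm_page_dirty() while the Changed bit is still visible, before the mapping is left read-only.]

static void
moea_protect_downgrade_one(struct pvo_entry *pvo, struct pte *pt)
{
        /* Remove write access from the cached copy of the PTE. */
        pvo->pvo_pte.pte.pte_lo &= ~PTE_PP;
        pvo->pvo_pte.pte.pte_lo |= PTE_BR;

        /*
         * If the mapping currently sits in the hash table, push the new
         * protection out to the hardware PTE.  moea_pte_change() is
         * assumed here to also fold the hardware Referenced/Changed
         * bits back into the cached copy.
         */
        if (pt != NULL)
                moea_pte_change(pt, &pvo->pvo_pte.pte, pvo->pvo_vaddr);

        /*
         * The mapping was writable and may have been modified: if the
         * Changed (C) bit is recorded for a managed mapping, tell the
         * MI VM layer now, before the dirty state is forgotten.  This
         * is the piece the arm64 change quoted above added for its pmap.
         */
        if ((pvo->pvo_vaddr & PVO_MANAGED) != 0 &&
            (pvo->pvo_pte.pte.pte_lo & PTE_CHG) != 0)
                vm_page_dirty(PHYS_TO_VM_PAGE(
                    pvo->pvo_pte.pte.pte_lo & PTE_RPGN));
}

[That would be consistent with the symptom in the stress runs quoted above: the assertion failures only show up once the working set is large enough to page in and out, which is exactly when a modification that never reaches vm_page_dirty() can let a dirty page be reclaimed as clean and later come back zero-filled.]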