Re: svn commit: r360233 - in head: contrib/jemalloc . . . : This partially breaks a 2-socket 32-bit powerpc (old PowerMac G4) based on head -r360311

From: Mark Millard <marklmi_at_yahoo.com>
Date: Thu, 11 Jun 2020 14:36:37 -0700
On 2020-Jun-11, at 13:55, Justin Hibbits <chmeeedalf at gmail.com> wrote:

> On Wed, 10 Jun 2020 18:56:57 -0700
> Mark Millard <marklmi_at_yahoo.com> wrote:
> 
>> On 2020-May-13, at 08:56, Justin Hibbits <chmeeedalf_at_gmail.com> wrote:
>> 
>>> Hi Mark,  
>> 
>> Hello Justin.
> 
> Hi Mark,

Hello again, Justin.

>> 
>>> On Wed, 13 May 2020 01:43:23 -0700
>>> Mark Millard <marklmi_at_yahoo.com> wrote:
>>> 
>>>> [I'm adding a reference to an old arm64/aarch64 bug that had
>>>> pages turning to zero, in case this 32-bit powerpc issue is
>>>> somewhat analogous.]
>>>> 
>>>>> . . .  
>>> ...  
>>>> . . .
>>>> 
>>>> (Note: dsl-only.net closed down, so the E-mail
>>>> address reference is no longer valid.)
>>>> 
>>>> Author: kib
>>>> Date: Mon Apr 10 15:32:26 2017
>>>> New Revision: 316679
>>>> URL: https://svnweb.freebsd.org/changeset/base/316679
>>>> 
>>>> 
>>>> Log:
>>>> Do not lose dirty bits for removing PROT_WRITE on arm64.
>>>> 
>>>> Arm64 pmap interprets accessed writable ptes as modified, since
>>>> ARMv8.0 does not track Dirty Bit Modifier in hardware. If writable
>>>> bit is removed, page must be marked as dirty for MI VM.
>>>> 
>>>> This change is most important for COW, where fork caused losing
>>>> content of the dirty pages which were not yet scanned by
>>>> pagedaemon.
>>>> 
>>>> Reviewed by:	alc, andrew
>>>> Reported and tested by:	Mark Millard <markmi at dsl-only.net>
>>>> PR:	217138, 217239
>>>> Sponsored by:	The FreeBSD Foundation
>>>> MFC after:	2 weeks
>>>> 
>>>> Modified:
>>>> head/sys/arm64/arm64/pmap.c
>>>> 
>>>> Modified: head/sys/arm64/arm64/pmap.c
>>>> ==============================================================================
>>>> --- head/sys/arm64/arm64/pmap.c	Mon Apr 10 12:35:58 2017	(r316678)
>>>> +++ head/sys/arm64/arm64/pmap.c	Mon Apr 10 15:32:26 2017	(r316679)
>>>> @@ -2481,6 +2481,11 @@ pmap_protect(pmap_t pmap, vm_offset_t sv
>>>> 		    sva += L3_SIZE) {
>>>> 			l3 = pmap_load(l3p);
>>>> 			if (pmap_l3_valid(l3)) {
>>>> +				if ((l3 & ATTR_SW_MANAGED) &&
>>>> +				    pmap_page_dirty(l3)) {
>>>> +					vm_page_dirty(PHYS_TO_VM_PAGE(l3 &
>>>> +					    ~ATTR_MASK));
>>>> +				}
>>>> 				pmap_set(l3p, ATTR_AP(ATTR_AP_RO));
>>>> 				PTE_SYNC(l3p);
>>>> 				/* XXX: Use pmap_invalidate_range */
>>>> 
>>>> . . .
>>>> 
>>> 
>>> Thanks for this reference.  I took a quick look at the 3 pmap
>>> implementations we have (haven't checked the new radix pmap yet), and
>>> it looks like only mmu_oea.c (32-bit AIM pmap, for G3 and G4) is
>>> missing vm_page_dirty() calls in its pmap_protect() implementation,
>>> analogous to the change you posted right above. Given this, I think
>>> it's safe to say that this missing piece is necessary.  We'll work
>>> on a fix for this; looking at moea64_protect(), there may be
>>> additional work needed to support this as well, so it may take a
>>> few days.  
>> 
>> Ping? Any clue when the above might happen?
>> 
>> I've been avoiding the old PowerMacs and leaving
>> them at head -r360311 , pending an update that
>> would avoid the kernel zeroing pages that it
>> should not zero. But I've seen that you were busy
>> with more modern contexts for about the last month.
>> 
>> And, clearly, my own context has left pending
>> (for much longer) other more involved activities
>> (compared to just periodically updating to
>> more recent FreeBSD vintages).
>> 
>> . . .
>> 
> 
> Sorry for the delay, I got sidetracked with a bunch of other
> development.

> I did install a newer FreeBSD on my dual G4 and couldn't
> see the problem.

How did you test?

In my context it was far easier to see the problem
with builds that did not use MALLOC_PRODUCTION. In
other words: jemalloc having its asserts tested.

The easiest way I found to get the asserts to fail
was to do (multiple processes (-m) and totaling to
more than enough to force paging/swapping):

stress -m 2 --vm-bytes 1700M &

(Possibly setting up some shells first
to potentially later exit.)

Normally stress itself would hit jemalloc
asserts. Apparently the asserts did not
stop the code and it ran until a failure
occurred (via dtv=0x0). I never had to
manually stop the stress processes.

If no failures occur during the stress run, then
exit the shells that likely were swapped out or
partially paged out during it. They hit jemalloc
asserts during their cleanup activity in my
testing.
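
As an aside, a more direct check for pages that were
wrongly zeroed could look roughly like the sketch below.
This is not what I ran (my detection was the jemalloc
asserts); fill_nonzero and count_zeroed_pages are
hypothetical helper names, and the page size is simply
passed in by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Fill every byte with a nonzero value so a zeroed page stands out. */
static void
fill_nonzero(unsigned char *buf, size_t len)
{
	for (size_t i = 0; i < len; i++)
		buf[i] = (unsigned char)(i % 251) + 1;	/* 1..251, never 0 */
}

/* Count the pages in buf that consist entirely of zero bytes. */
static size_t
count_zeroed_pages(const unsigned char *buf, size_t len, size_t page_size)
{
	size_t zeroed = 0;

	assert(page_size > 0 && len % page_size == 0);
	for (size_t off = 0; off < len; off += page_size) {
		size_t i = 0;

		while (i < page_size && buf[off + i] == 0)
			i++;
		if (i == page_size)
			zeroed++;
	}
	return (zeroed);
}
```

The idea would be to fill a large buffer before starting
the stress run and to count zeroed pages after the buffer
has had a chance to be paged out and back in. Any nonzero
count would indicate lost page content.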


> That said, the attached patch effectively copies
> what's done in OEA64 into OEA pmap.  Can you test it?

I'll try it once I get a chance, probably later
today.

I gather from what I see that moea64_protect did not
need the changes that you originally thought might
be required? I only see moea_protect changes in the
patch.
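
For anyone following along, the general idea behind the
r316679 change quoted earlier (record the modified state
in the MI page before write permission is removed) can be
modeled in userland roughly as below. This is a toy
sketch only: struct toy_page, struct toy_pte and
toy_protect_readonly are made-up names, not kernel types
or functions, and it is not the content of the moea_protect
patch. It also assumes that the MD modified state goes
away together with write access.

```c
#include <stdbool.h>

/* Toy stand-ins; these are NOT the FreeBSD kernel types. */
struct toy_page {
	bool dirty;		/* MI "page is dirty" state */
};

struct toy_pte {
	bool valid;
	bool managed;
	bool writable;
	bool changed;		/* MD "modified" state held in the mapping */
	struct toy_page *page;
};

/* Remove write permission, recording the modified state first. */
static void
toy_protect_readonly(struct toy_pte *pte)
{
	if (!pte->valid)
		return;
	if (pte->managed && pte->changed)
		pte->page->dirty = true;	/* role played by vm_page_dirty() */
	pte->writable = false;
	pte->changed = false;	/* assumption: MD state is lost here */
}
```

Without the dirty transfer in the middle, the only record
that the page had been written would be discarded, which
is how modified content could later be treated as clean.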

===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
Received on Thu Jun 11 2020 - 19:36:47 UTC
