Re: arm64 fork/swap data corruptions: A ~110 line C program demonstrating an example (Pine64+ 2GB context) [Corrected subject: arm64!]

From: Mark Millard <markmi_at_dsl-only.net> Date: Tue, 14 Mar 2017 21:33:08 -0700

A single Byte access to a 4K Byte aligned region between
the fork and wait/sleep/swap-out prevents that specific
4K Byte region from having the (bad) zeros.

Sounds like a page sized unit of behavior to me.
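If the behavior really is per page, one byte read in each 4K Byte unit between the fork and the wait/sleep would be the natural thing to try for larger regions. The following is only a minimal sketch of such a read loop and was not part of the test program: touch_each_page and PAGE_BYTES are hypothetical names, the 4096 Byte page size is an assumption, and whether this avoids the failure for larger regions has not been tested here.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_BYTES 4096u /* assumption: 4K Byte pages, per the observed unit */

/* Hypothetical helper: reads one byte at every PAGE_BYTES step of the region,
   plus the final byte in case the region is not page-aligned and its tail
   spills into one more page. Returns the number of reads performed. */
static size_t touch_each_page(const volatile unsigned char *region, size_t size) {
    assert(region != NULL || size == 0u);
    size_t reads = 0u;
    for (size_t i = 0u; i < size; i += PAGE_BYTES) {
        (void)region[i]; /* volatile read: the compiler must keep the access */
        reads++;
    }
    if (size > 0u) {
        (void)region[size - 1u];
        reads++;
    }
    return reads;
}
```

In the test program this would presumably be called in the child (0==pid) on the dynamically allocated array, at the point where partial_test_check() is called.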

Details follow.

On 2017-Mar-14, at 3:28 PM, Mark Millard <markmi_at_dsl-only.net> wrote:

> [test_check() between the fork and the wait/sleep prevents the
> failure from occurring. Even a small access to the memory at
> that stage prevents the failure. Details follow.]
> 
> On 2017-Mar-14, at 11:07 AM, Mark Millard <markmi_at_dsl-only.net> wrote:
> 
>> [This is just a correction to the subject-line text to say arm64
>> instead of amd64.]
>> 
>> On 2017-Mar-14, at 12:58 AM, Mark Millard <markmi_at_dsl-only.net> wrote:
>> 
>> [Another correction I'm afraid --about alternative program variations
>> this time.]
>> 
>> On 2017-Mar-13, at 11:52 PM, Mark Millard <markmi_at_dsl-only.net> wrote:
>> 
>>> I'm still at a loss about how to figure out what stages are messed
>>> up. (Memory coherency? Some memory not swapped out? Bad data swapped
>>> out? Wrong data swapped in?)
>>> 
>>> But at least I've found a much smaller/simpler example to demonstrate
>>> some problem in my Pine64+ 2GB context.
>>> 
>>> The Pine64+ 2GB is the only amd64 context that I have access to.
>> 
>> Someday I'll learn to type arm64 the first time instead of amd64.
>> 
>>> The following program fails its check for data
>>> having its expected byte pattern in dynamically
>>> allocated memory after a fork/swap-out/swap-in
>>> sequence.
>>> 
>>> I'll note that the program sleeps for 60s after
>>> forking to give time to do something else to
>>> cause the parent and child processes to swap
>>> out (RES=0 as seen in top).
>> 
>> The following about the extra test_check() was
>> wrong.
>> 
>>> Note the source code line:
>>> 
>>> // test_check(); // Adding this line prevents failure.
>>> 
>>> It seems that accessing the region contents before forking
>>> and swapping avoids the problem. But there is a problem
>>> if the region was only written-to before the fork/swap.
> 
> There is a place where adding a test_check call keeps the
> problem from happening at any stage: I tried putting a
> call between the fork and the later wait/sleep code:

I changed the byte sequence patterns to avoid
zero values since the bad values are zeros:

static value_type value(size_t v) { return (value_type)((v&0xFEu)|0x1u); }
                  // value now avoids the zero value since the failures
                  // are zeros.
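As a quick sanity check of that pattern, the sketch below confirms that value() cannot produce a zero byte. It assumes value_type is unsigned char (the typedef is not shown in this message), and value_never_zero is a hypothetical helper name, not part of the test program.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char value_type; /* assumption: one byte per element */

/* Same expression as in the message: keep bits 1-7 of the index, then force
   bit 0 on, so the result is always odd and therefore never zero. */
static value_type value(size_t v) { return (value_type)((v & 0xFEu) | 0x1u); }

/* Hypothetical helper: returns 1 if no index in [0, limit) maps to zero. */
static int value_never_zero(size_t limit) {
    assert(limit > 0u);
    for (size_t v = 0u; v < limit; v++) {
        if (value(v) == 0u) return 0;
    }
    return 1;
}
```

Under that assumption value(0u) is 1, so array[0] is a usable probe once the zero value has been removed from the pattern.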

With that I can then test accurately what bytes have
bad values vs. do not. I also changed to:

void partial_test_check(void) {
    if (value(0u)!=gbl_region.array[0])    raise(SIGABRT);
    if (value(0u)!=(*dyn_region).array[0]) raise(SIGABRT);
}

since previously [0] had a zero value and so I'd used [1].

On this basis I'm now using the below. See the comments tied
to partial_test_check() calls:

extern void test_setup(void);         // Sets up the memory byte patterns.
extern void test_check(void);         // Tests the memory byte patterns.
extern void partial_test_check(void); // Tests just [0] of each region
                                      // (gbl_region and dyn_region).

int main(void) {
    test_setup();
    test_check(); // Before fork() [passes]

    pid_t pid = fork();
    int wait_status = 0;

    // After fork; before wait/sleep/swap-out.

    if (0==pid) partial_test_check();
                     // Even the above is sufficient by
                     // itself to prevent failure for
                     // region_size 1u through
                     // 4u*1024u!
                     // But 4u*1024u+1u and above fail
                     // with this access to memory.
                     // The failing test is of
                     // (*dyn_region).array[4096u].
                     // This test never fails here.

    if (0<pid) partial_test_check(); // This never prevents
                                     // later failures (and
                                     // never fails here).

    if (0<pid) { wait(&wait_status); }

    if (-1!=wait_status && 0<=pid) {
        if (0==pid) {
            sleep(60);

            // During this sleep, manually force this process
            // to swap out. I use something like:

            // stress -m 1 --vm-bytes 1800M

            // in another shell and ^C'ing it after top
            // shows the swapped status desired. 1800M
            // just happened to work on the Pine64+ 2GB
            // that I was using. I watch with top -PCwaopid .
        }

        test_check(); // After wait/sleep [fails for small-enough region_sizes]
    }
}
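To see which 4K Byte units hold bad zeros after swap-in, a scan along the following lines could replace the all-or-nothing SIGABRT check. This is a minimal sketch, not part of the program above: count_chunks_with_zero and PAGE_BYTES are hypothetical names, the 4096 Byte unit is an assumption, and it relies on the byte pattern containing no zeros, as with the changed value().

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_BYTES 4096u /* assumption: 4K Byte pages, per the observed unit */

/* Hypothetical helper: counts the PAGE_BYTES-sized chunks of a region that
   contain at least one zero byte. Chunks are measured from the start of the
   region, so they match hardware pages only if the region is page-aligned. */
static size_t count_chunks_with_zero(const unsigned char *region, size_t size) {
    assert(region != NULL);
    size_t bad_chunks = 0u;
    for (size_t start = 0u; start < size; start += PAGE_BYTES) {
        /* The last chunk may be shorter than PAGE_BYTES. */
        size_t end = (size - start < PAGE_BYTES) ? size : start + PAGE_BYTES;
        for (size_t i = start; i < end; i++) {
            if (region[i] == 0u) { bad_chunks++; break; }
        }
    }
    return bad_chunks;
}
```

Called on the dynamically allocated array after the wait/sleep, a nonzero result would indicate how many units were affected.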

> This suggests to me that the small access is forcing one or more things to
> be initialized for memory access that fork is not establishing of itself.
> It appears that if established correctly then the swap-out/swap-in
> sequence would work okay without needing the manual access to the memory.
> 
> 
> So far via this test I've not seen any evidence of problems with the global
> region but only the dynamically allocated region.
> 
> However, the symptoms that started this investigation in a much more
> complicated context had an area of global memory from a .so that ended
> up being zero.
> 
> I think that things should be fixed for this simpler context first and
> that further investigation of the sh/su related problem should wait to see what
> things are like after this test case works.

===
Mark Millard
markmi at dsl-only.net