Re: arm64 fork/swap data corruptions: A ~110 line C program demonstrating an example (Pine64+ 2GB context) [Corrected subject: arm64!]

From: Scott Bennett <bennett_at_sdf.org>
Date: Thu, 16 Mar 2017 01:07:31 -0500
Mark Millard <markmi at dsl-only.net> wrote:

> [Something strange happened to the automatic CC: fill-in for my original
> reply. Also I should have mentioned that for my test program if a
> variant is made that does not fork the swapping works fine.]
>
> On 2017-Mar-15, at 9:37 AM, Mark Millard <markmi at dsl-only.net> wrote:
>
> > On 2017-Mar-15, at 6:15 AM, Scott Bennett <bennett at sdf.org> wrote:
> > 
> >>    On Tue, 14 Mar 2017 18:18:56 -0700 Mark Millard
> >> <markmi at dsl-only.net> wrote:
> >>> On 2017-Mar-14, at 4:44 PM, Bernd Walter <ticso_at_cicely7.cicely.de> wrote:
> >>> 
> >>>> On Tue, Mar 14, 2017 at 03:28:53PM -0700, Mark Millard wrote:
> >>>>> [test_check() between the fork and the wait/sleep prevents the
> >>>>> failure from occurring. Even a small access to the memory at
> >>>>> that stage prevents the failure. Details follow.]
> >>>> 
> >>>> Maybe a stupid question, since you might have written it somewhere.
> >>>> What medium do you swap to?
> >>>> I've seen broken firmware on microSD cards doing silent data
> >>>> corruption for some access patterns.
> >>> 
> >>> The root filesystem is on a USB SSD on a powered hub.
> >>> 
> >>> Only the kernel is from the microSD card.
> >>> 
> >>> I have several examples of the USB SSD model and have
> >>> never observed such problems in any other context.
> >>> 
> >>> [remainder of irrelevant material deleted  --SB]
> >> 
> >>    You gave a very long-winded non-answer to Bernd's question, so I'll
> >> repeat it here.  What medium do you swap to?
> > 
> > My wording of:
> > 
> > The root filesystem is on a USB SSD on a powered hub.
> > 
> > was definitely poor. It should have explicitly mentioned the
> > swap partition too:
> > 
> > The root filesystem and swap partition are both on the same
> > USB SSD on a powered hub.
> > 
> > More detail from dmesg -a for usb:
> > 
> > usbus0: 12Mbps Full Speed USB v1.0
> > usbus1: 480Mbps High Speed USB v2.0
> > usbus2: 12Mbps Full Speed USB v1.0
> > usbus3: 480Mbps High Speed USB v2.0
> > ugen0.1: <Generic OHCI root HUB> at usbus0
> > uhub0: <Generic OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
> > ugen1.1: <Allwinner EHCI root HUB> at usbus1
> > uhub1: <Allwinner EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus1
> > ugen2.1: <Generic OHCI root HUB> at usbus2
> > uhub2: <Generic OHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus2
> > ugen3.1: <Allwinner EHCI root HUB> at usbus3
> > uhub3: <Allwinner EHCI root HUB, class 9/0, rev 2.00/1.00, addr 1> on usbus3
> > . . .
> > uhub0: 1 port with 1 removable, self powered
> > uhub2: 1 port with 1 removable, self powered
> > uhub1: 1 port with 1 removable, self powered
> > uhub3: 1 port with 1 removable, self powered
> > ugen3.2: <GenesysLogic USB2.0 Hub> at usbus3
> > uhub4 on uhub3
> > uhub4: <GenesysLogic USB2.0 Hub, class 9/0, rev 2.00/90.20, addr 2> on usbus3
> > uhub4: MTT enabled
> > uhub4: 4 ports with 4 removable, self powered
> > ugen3.3: <OWC Envoy Pro mini> at usbus3
> > umass0 on uhub4
> > umass0: <OWC Envoy Pro mini, class 0/0, rev 2.10/1.00, addr 3> on usbus3
> > umass0:  SCSI over Bulk-Only; quirks = 0x0100
> > umass0:0:0: Attached to scbus0
> > . . .
> > da0 at umass-sim0 bus 0 scbus0 target 0 lun 0
> > da0: <OWC Envoy Pro mini 0> Fixed Direct Access SPC-4 SCSI device
> > da0: Serial Number <REPLACED>
> > da0: 40.000MB/s transfers
> > 
> > (Edited a bit because there is other material interlaced, even
> > internal to some lines. Also: I removed the serial number of the
> > specific example device.)

     Thank you.  That presents a much clearer picture.
> > 
> >>    I will further note that any kind of USB device cannot automatically
> >> be trusted to behave properly.  USB devices are notorious, for example,
> >> 
> >>   [reasons why deleted  --SB]
> >> 
> >>    You should identify where you page/swap to and then try substituting
> >> a different device for that function as a test to eliminate the possibility
> >> of a bad storage device/controller.  If the problem still occurs, that
> >> means there still remains the possibility that another controller or its
> >> firmware is defective instead.  It could be a kernel bug, it is true, but
> >> making sure there is no hardware or firmware error occurring is important,
> >> and as I say, USB devices should always be considered suspect unless and
> >> until proven innocent.
> > 
> > [FYI: This is a ufs context, not a zfs one.]

     Right.  It's only a Pi, after all. :-)
> > 
> > I'm aware of such things. No evidence so far suggests that the
> > USB devices that I can replace are a problem. Otherwise
> > I'd not be going down this path. I only have access to the one arm64
> > device (a Pine64+ 2GB) so I've no ability to substitution-test what
> > is on that board.

     There isn't even one open port on that hub that you could plug a
flash drive into temporarily to be the paging device?  You could then
try your tests before returning to the normal configuration.  If there
isn't an open port, then how about plugging a second hub into one of
the first hub's ports and moving the displaced device to the second
hub?  A flash drive could then be plugged in.  That kind of configuration
is obviously a bad idea for the long run, but just to try your tests it
ought to work well enough.  (BTW, if a USB storage device containing a
paging area drops off-line even momentarily and the system needs to use
it, that is the beginning of the end, even though it may take up to a few
minutes for everything to lock up.  You probably won't be able to do an
orderly shutdown, but will instead have to crash it with the power switch.
In the case of something like a Pi, this is an unpleasant fact of life,
to be sure.)
     I think I buy your arguments, given the evidence you've collected
thus far, including what you've added below.  I just like to eliminate
possibilities that are much simpler to deal with before facing nastinesses
like bugs in the VM subsystem. :-)
> > 
> > It would be neat if some folks used my code to test other arm64
> > contexts and reported the results. I'd be very interested.
> > (This is easier to do on devices that do not have massive
> > amounts of RAM, which may limit the range of devices or
> > device configurations that are reasonable to test.)
> > 
> > Other people using other devices have reported the behavior
> > that started this investigation. I can produce the
> > behavior that they reported, although I've not seen anyone else
> > listing specific steps that lead to the problem or ways to tell
> > if the symptom is going to happen before it actually does. Nor
> > have I seen any other core dump analysis. (I have bugzilla
> > submittals 217138 and 217239 tied to symptoms others have
> > reported as well as this test program material.)
> > 
> > Also, considering that for my test program I can control which pages
> > get the zeroed-problem by read-accessing even one byte of any 4K
> > Byte page that I want to make work normally, doing so in the child
> > process of the fork, between the fork and the sleep/swap-out, it does
> > not suggest USB-device-specific behavior. The read-access is changing
> > the status of the page in some way as far as I can tell.
> > 
> > (Such read-accesses in the parent process make no difference to the
> > behavior.)
>
> I should have noted another comparison/contrast between
> having memory corruption and not in my context:
>
> I've tried variants of my test program that do not fork but
> just sleep for 60s to allow me to force the swap-out. I
> did this before adding fork and before using
> partial_test_check, for example. I gradually added things
> apparently involved in the reports others had made
> until I found a combination that produced a memory
> corruption test failure.
>
> The tests that do not involve fork find no problems with
> the memory content after the swap-in.
>
> For my test program it appears that fork-before-swap-out
> or the like is essential to having the problem occur.
>
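     For anyone following along, the shape of the test described in the
quoted text (fill memory with a known pattern, fork, optionally read one
byte of each page in the child, wait, then re-check the pattern) can be
sketched roughly as below.  This is not the ~110 line program from the
subject line, only an illustration.  The names fork_then_check,
pattern_byte, fill_region and count_bad_pages, the byte pattern, and the
use of malloc() are all assumptions made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical pattern: derived from the offset and never zero, so a
 * page that comes back zero-filled is always detected. */
static unsigned char pattern_byte(size_t i)
{
    return (unsigned char)(i % 251u + 1u);
}

static void fill_region(unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++)
        buf[i] = pattern_byte(i);
}

/* Count pages that contain at least one byte differing from the pattern. */
static size_t count_bad_pages(const unsigned char *buf, size_t len, size_t page)
{
    size_t bad = 0;
    for (size_t off = 0; off < len; off += page) {
        size_t end = (len - off < page) ? len : off + page;
        for (size_t i = off; i < end; i++) {
            if (buf[i] != pattern_byte(i)) {
                bad++;
                break;
            }
        }
    }
    return bad;
}

/* Fill npages pages, fork, let the child sleep delay_s seconds (the
 * window in which a page-out would be forced from outside), then check
 * the pattern in both processes.  If touch_in_child is nonzero, the
 * child reads one byte of every page before sleeping.  Returns the
 * number of bad pages seen by child plus parent (child count capped at
 * 255 by the exit status), or -1 on error. */
int fork_then_check(size_t npages, unsigned delay_s, int touch_in_child)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0 || npages == 0 || npages > SIZE_MAX / (size_t)ps)
        return -1;
    size_t page = (size_t)ps;
    size_t len = npages * page;

    unsigned char *buf = malloc(len);
    if (buf == NULL)
        return -1;
    fill_region(buf, len);

    pid_t pid = fork();
    if (pid < 0) {
        free(buf);
        return -1;
    }
    if (pid == 0) {
        if (touch_in_child) {
            volatile unsigned char sink = 0;
            for (size_t off = 0; off < len; off += page)
                sink = buf[off];    /* read-only access, one byte per page */
            (void)sink;
        }
        sleep(delay_s);
        size_t bad = count_bad_pages(buf, len, page);
        _exit(bad > 255 ? 255 : (int)bad);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) {
        free(buf);
        return -1;
    }
    size_t parent_bad = count_bad_pages(buf, len, page);
    free(buf);
    return WEXITSTATUS(status) + (int)(parent_bad > 255 ? 255 : parent_bad);
}
```

     A caller would use something like fork_then_check(npages, 60, 0),
with npages chosen large enough that the region can actually be pushed
out of RAM, and would apply memory pressure by other means during the
60 s.  A nonzero result would correspond to the failure described above;
on a healthy system it should be 0 whether or not touch_in_child is set.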
     A comment about terminology seems in order here.  It bothers
me considerably to see you writing "swap out" or "swapping" where
it seems like you mean to write "page out" or "paging".  A BSD
system whose swapping mechanism gets activated has already waded
very deeply into the quicksand and frequently cannot be gotten out
in a reasonable amount of time even with manual assistance.  It is
often quicker to crash it, reboot, and wait for the fsck(8) cleanups
to complete.  Orderly shutdowns, even of the kind that results from
a quick poke to the power button, typically get mired in the same
mess that already has the system in knots.  Also, BSD systems since
3.0BSD, unlike older AT&T (pre-SysVR2.3) systems, do not swap in,
just out.  A swapped out process, once the system determines that it
has adequate resources again to attempt to run the process, will have
the interrupted text page paged in and the rest will be paged in by
the normal mechanism of page faults and page-in operations.  I assume
you must already know all this, which is a large part of why it grates
on me that you appear to be using the wrong terms.


                                  Scott Bennett, Comm. ASMELG, CFIAG
**********************************************************************
* Internet:   bennett at sdf.org   *xor*   bennett at freeshell.org  *
*--------------------------------------------------------------------*
* "A well regulated and disciplined militia, is at all times a good  *
* objection to the introduction of that bane of all free governments *
* -- a standing army."                                               *
*    -- Gov. John Hancock, New York Journal, 28 January 1790         *
**********************************************************************
Received on Thu Mar 16 2017 - 05:08:06 UTC