Re: FYI: WITH_REPRODUCIBLE_BUILD= problem for some files?

From: Mark Millard <marklmi_at_yahoo.com>
Date: Tue, 4 May 2021 10:01:15 -0700
On 2021-May-4, at 08:51, Mark Millard <marklmi at yahoo.com> wrote:

> On 2021-May-4, at 06:01, Ed Maste <emaste at freebsd.org> wrote:
> 
>> On Mon, 3 May 2021 at 22:26, Mark Millard <marklmi_at_yahoo.com> wrote:
>>> 
>>> But I'll note that I've built and installed py37-diffoscope
>>> (new to me). A basic quick test showed that it reports:
>>> 
>>> W: diffoscope.main: Fuzzy-matching is currently disabled as the "tlsh" module is unavailable.
>> 
>> I just looked up tlsh - it's "A Locality Sensitive Hash"; I presume
>> diffoscope uses it to infer file renames. I believe the warning
>> emitted here should have no impact on the output we're looking for.
> 
> Okay.
> 
>> As far as the utf-8 issues go, diffoscope requires a utf-8 locale and
>> I suspect that is the issue. If you don't have LANG set already, try
>> setting LANG=C.UTF-8 in your environment.
> 
> That is not the issue for the UnicodeDecodeError:
> 
> # echo $LANG
> C.UTF-8
> 
> # diffoscope /.zfs/snapshot/2021-04-*-01:40:48-0/bin/sh
> 2021-05-04 08:49:21 W: diffoscope.main: Fuzzy-matching is currently disabled as the "tlsh" module is unavailable.
> Traceback (most recent call last):
>  File "/usr/local/lib/python3.7/site-packages/diffoscope/main.py", line 745, in main
>    sys.exit(run_diffoscope(parsed_args))
>  File "/usr/local/lib/python3.7/site-packages/diffoscope/main.py", line 677, in run_diffoscope
>    difference = load_diff_from_path(path1)
>  File "/usr/local/lib/python3.7/site-packages/diffoscope/readers/__init__.py", line 31, in load_diff_from_path
>    return load_diff(codecs.getreader("utf-8")(fp), path)
>  File "/usr/local/lib/python3.7/site-packages/diffoscope/readers/__init__.py", line 35, in load_diff
>    return JSONReaderV1().load(fp, path)
>  File "/usr/local/lib/python3.7/site-packages/diffoscope/readers/json.py", line 33, in load
>    raw = json.load(fp)
>  File "/usr/local/lib/python3.7/json/__init__.py", line 293, in load
>    return loads(fp.read(),
>  File "/usr/local/lib/python3.7/codecs.py", line 504, in read
>    newchars, decodedbytes = self.decode(data, self.errors)
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb7 in position 18: invalid start byte
> 
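One possible reading of the traceback, offered as a guess rather
than a confirmed diagnosis: it goes through load_diff_from_path and
JSONReaderV1, so diffoscope was treating the one path it was given
as a saved JSON diff and decoding it as UTF-8. In a little-endian
ELF64 header, e_machine sits at byte offset 18, and EM_AARCH64 is
183 (0xb7), which would match the reported byte and position. The
sketch below builds an illustrative 20-byte header prefix (the
e_type and OSABI values are assumptions, not taken from the real
/bin/sh) and reports where UTF-8 decoding first fails.

```python
import struct

EM_AARCH64 = 183  # 0xb7, the ELF machine number for aarch64


def elf_header_prefix(e_type=2, e_machine=EM_AARCH64):
    """Build the first 20 bytes of a little-endian ELF64 header (illustrative)."""
    # e_ident: magic, ELFCLASS64, ELFDATA2LSB, EV_CURRENT, ELFOSABI_FREEBSD, padding
    e_ident = b"\x7fELF" + bytes([2, 1, 1, 9]) + bytes(8)
    # e_type occupies bytes 16-17 and e_machine bytes 18-19
    return e_ident + struct.pack("<HH", e_type, e_machine)


def first_utf8_error(data):
    """Return (offset, byte, reason) for the first UTF-8 decode failure, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start, data[exc.start], exc.reason
    return None


if __name__ == "__main__":
    print(first_utf8_error(elf_header_prefix()))
```

If that reading is right, the LANG setting would not matter here,
since the failure comes from decoding binary file content and not
from the locale.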

Well, the list of differing files is huge. But in each file that
I checked, the differing bytes lie inside the .gnu_debuglink
section. I'll note that this time I did installworld but not the
likes of distrib-dirs or distribution.

This test did buildworld to two distinct directories:

zroot/BUILDs/13_0R-CA72-nodbg-clang       5.13G   118G     5.13G  /usr/obj/BUILDs/13_0R-CA72-nodbg-clang
zroot/BUILDs/13_0R-CA72-nodbg-clang-alt   4.28G   118G     4.28G  /usr/obj/BUILDs/13_0R-CA72-nodbg-clang-alt

and installworld to 2 distinct directories:

zroot/DESTDIRs/13_0R-CA72-instwrld-alt    1.44G   118G     1.44G  /usr/obj/DESTDIRs/13_0R-CA72-instwrld-alt
zroot/DESTDIRs/13_0R-CA72-instwrld-norm   1.44G   118G     1.44G  /usr/obj/DESTDIRs/13_0R-CA72-instwrld-norm

Previously (armv7 target) I had built, installed, cleaned out and
rebuilt in the same directory, and then installed to an alternate
directory. That left only a few files different, but I do not
know (yet) whether the procedural difference is what made the
difference.

The start of the list of differing files this time:

# diff -rq /usr/obj/DESTDIRs/13_0R-CA72-instwrld-norm/ /usr/obj/DESTDIRs/13_0R-CA72-instwrld-alt/ | more
Files /usr/obj/DESTDIRs/13_0R-CA72-instwrld-norm/bin/[ and /usr/obj/DESTDIRs/13_0R-CA72-instwrld-alt/bin/[ differ
Files /usr/obj/DESTDIRs/13_0R-CA72-instwrld-norm/bin/cat and /usr/obj/DESTDIRs/13_0R-CA72-instwrld-alt/bin/cat differ
Files /usr/obj/DESTDIRs/13_0R-CA72-instwrld-norm/bin/chflags and /usr/obj/DESTDIRs/13_0R-CA72-instwrld-alt/bin/chflags differ
Files /usr/obj/DESTDIRs/13_0R-CA72-instwrld-norm/bin/chio and /usr/obj/DESTDIRs/13_0R-CA72-instwrld-alt/bin/chio differ
. . .
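
For anyone wanting the same kind of list in a script-friendly form,
here is a minimal Python sketch of a byte-for-byte tree comparison.
It is not what produced the output above (that was diff -rq); the
function name differing_files is made up for illustration, and it
only looks at regular files present in both trees, skipping
symbolic links.

```python
import filecmp
import os


def differing_files(left_root, right_root):
    """Yield relative paths of regular files in both trees whose bytes differ."""
    for dirpath, dirnames, filenames in os.walk(left_root):
        dirnames.sort()  # walk subdirectories in a stable order
        rel_dir = os.path.relpath(dirpath, left_root)
        for name in sorted(filenames):
            left = os.path.join(dirpath, name)
            right = os.path.join(right_root, rel_dir, name)
            # Skip symlinks and files that have no regular counterpart.
            if os.path.islink(left) or not os.path.isfile(right):
                continue
            # shallow=False forces a content comparison, not just a stat check.
            if not filecmp.cmp(left, right, shallow=False):
                yield os.path.normpath(os.path.join(rel_dir, name))
```

Calling list(differing_files(norm_dir, alt_dir)) on the two DESTDIRs
would give relative paths such as bin/cat, which is easier to feed
into later per-file checks than the "Files ... differ" lines.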

Looking closer, native aarch64 programs in my builds typically
differ in a single run of 4 consecutive bytes:

# cmp -x /usr/obj/DESTDIRs/13_0R-CA72-instwrld-norm/bin/cat /usr/obj/DESTDIRs/13_0R-CA72-instwrld-alt/bin/cat
00003bd4 1d 65
00003bd5 eb a3
00003bd6 bb ca
00003bd7 8e 1a

# ls -Tld /usr/obj/DESTDIRs/13_0R-CA72-instwrld-norm/bin/cat /usr/obj/DESTDIRs/13_0R-CA72-instwrld-alt/bin/cat
-r-xr-xr-x  1 root  wheel  18448 May  4 08:55:01 2021 /usr/obj/DESTDIRs/13_0R-CA72-instwrld-alt/bin/cat
-r-xr-xr-x  1 root  wheel  18448 May  3 23:16:36 2021 /usr/obj/DESTDIRs/13_0R-CA72-instwrld-norm/bin/cat

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
. . .
 25 .gnu_debuglink 00000010  0000000000000000  0000000000000000  00003bc8  2**0
                  CONTENTS, READONLY

00003bd4-00003bc8 == 0xC

# cmp -x /usr/obj/DESTDIRs/13_0R-CA72-instwrld-norm/bin/chflags /usr/obj/DESTDIRs/13_0R-CA72-instwrld-alt/bin/chflags
00002208 88 a1
00002209 e6 40
0000220a 60 94
0000220b bf ce

# ls -Tld /usr/obj/DESTDIRs/13_0R-CA72-instwrld-norm/bin/chflags /usr/obj/DESTDIRs/13_0R-CA72-instwrld-alt/bin/chflags
-r-xr-xr-x  1 root  wheel  11440 May  4 08:55:01 2021 /usr/obj/DESTDIRs/13_0R-CA72-instwrld-alt/bin/chflags
-r-xr-xr-x  1 root  wheel  11440 May  3 23:16:36 2021 /usr/obj/DESTDIRs/13_0R-CA72-instwrld-norm/bin/chflags

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
. . .
 25 .gnu_debuglink 00000014  0000000000000000  0000000000000000  000021f8  2**0
                  CONTENTS, READONLY

00002208-000021f8 == 0x10

# cmp -x /usr/obj/DESTDIRs/13_0R-CA72-instwrld-norm/bin/chio /usr/obj/DESTDIRs/13_0R-CA72-instwrld-alt/bin/chio
000050c4 6b 3e
000050c5 08 ca
000050c6 7a 2f
000050c7 5d 64

# ls -Tld /usr/obj/DESTDIRs/13_0R-CA72-instwrld-norm/bin/chio /usr/obj/DESTDIRs/13_0R-CA72-instwrld-alt/bin/chio
-r-xr-xr-x  1 root  wheel  23728 May  4 08:55:01 2021 /usr/obj/DESTDIRs/13_0R-CA72-instwrld-alt/bin/chio
-r-xr-xr-x  1 root  wheel  23728 May  3 23:16:37 2021 /usr/obj/DESTDIRs/13_0R-CA72-instwrld-norm/bin/chio

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
. . .
 25 .gnu_debuglink 00000010  0000000000000000  0000000000000000  000050b8  2**0
                  CONTENTS, READONLY

000050c4-000050b8 == 0xC

For all I know, some individual byte(s) of the 4 might accidentally
match sometimes. The additional offset past .gnu_debuglink's file
offset does vary (0xC and 0x10 above).
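
The varying offset is consistent with the .gnu_debuglink layout that
the GDB documentation describes for separate debug files: a
NUL-terminated file name, zero padding up to a 4-byte boundary, then
a 4-byte CRC. The sketch below works through that arithmetic. It
assumes the debug files are named <prog>.debug, which is not shown
anywhere in the output above, and debuglink_layout is just an
illustrative helper name.

```python
def debuglink_layout(debug_name):
    """Return (crc_offset, section_size) for a .gnu_debuglink holding debug_name."""
    name_len = len(debug_name.encode("ascii")) + 1  # file name plus terminating NUL
    crc_offset = (name_len + 3) & ~3                # pad to a 4-byte boundary
    return crc_offset, crc_offset + 4               # the CRC itself is 4 bytes


if __name__ == "__main__":
    for prog in ("cat", "chflags", "chio"):
        offset, size = debuglink_layout(prog + ".debug")
        print(f"{prog}: crc at +{offset:#x}, section size {size:#x}")
```

Under that naming assumption the computed offsets are 0xC, 0x10 and
0xC, and the computed section sizes are 0x10, 0x14 and 0x10, the
same as the Size column reported for cat, chflags and chio.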

The content of those differences does not look like file path
components: the 0x08 byte, for example, is not a printable
character.
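
If the four differing bytes are the checksum that ends a
.gnu_debuglink section, they would be a CRC of the separate debug
file, stored in the target's byte order (little-endian here). That
would mean the installed programs differ only because their .debug
files differ. A hedged sketch for checking this: parse_debuglink
splits raw section bytes into the file name and CRC, and
debug_file_crc computes a CRC over a file with zlib.crc32, which as
far as I know is the same CRC-32 that GDB documents for this
section. Both helper names are made up, and getting the raw section
bytes out of the ELF file is left aside.

```python
import struct
import zlib


def parse_debuglink(section):
    """Split raw .gnu_debuglink bytes into (debug file name, CRC)."""
    nul = section.index(b"\0")                 # the name is NUL-terminated
    name = section[:nul].decode("ascii")
    (crc,) = struct.unpack("<I", section[-4:])  # last 4 bytes, little-endian
    return name, crc


def debug_file_crc(path):
    """CRC-32 over a file's full contents (assumed to match the stored CRC)."""
    crc = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF
```

With the bytes from the cmp output for cat, the stored values would
decode as 0x8ebbeb1d (norm) and 0x1acaa365 (alt). Comparing each
against debug_file_crc() of the matching cat.debug would show
whether the debug files are where the builds really diverge.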

I do build with:

# Avoid stripping but do not control host -g status as well:
DEBUG_FLAGS+=
#
WITH_REPRODUCIBLE_BUILD=
WITH_DEBUG_FILES=

But that was also true for the earlier armv7 target example
that I reported, which got only a few files with
differences.


===
Mark Millard
marklmi at yahoo.com
( dsl-only.net went
away in early 2018-Mar)
Received on Tue May 04 2021 - 15:01:25 UTC
