Re: dump trying to access incorrect block numbers? [It is not just dump that can get such]

From: Mark Millard <markmi_at_dsl-only.net>
Date: Fri, 7 Jul 2017 16:02:51 -0700
Michael Butler imb at protected-networks.net  wrote on 
Fri Jul 7 14:45:12 UTC 2017 :

> Recent builds doing a backup (dump) cause nonsensical errors in syslog:
> 
> Jul  7 00:10:24 toshi kernel: 
> g_vfs_done():ada0s3a[READ(offset=6050375794688, length=32768)]error = 5
> Jul  7 00:10:24 toshi kernel: 
> g_vfs_done():ada0s3a[READ(offset=546222112768, length=32768)]error = 5
> Jul  7 00:10:24 toshi kernel: 
> g_vfs_done():ada0s3a[READ(offset=2142846844928, length=32768)]error = 5
> Jul  7 00:10:24 toshi last message repeated 7 times
> Jul  7 00:10:24 toshi kernel: 
> g_vfs_done():ada0s3a[READ(offset=2226879725568, length=32768)]error = 5
> Jul  7 00:10:24 toshi kernel: 
> g_vfs_done():ada0s3a[READ(offset=2941159211008, length=32768)]error = 5
> Jul  7 00:10:24 toshi last message repeated 2 times
> Jul  7 00:10:24 toshi kernel: 
> g_vfs_done():ada0s3a[READ(offset=3067208531968, length=32768)]error = 5
> Jul  7 00:10:24 toshi kernel: 
> g_vfs_done():ada0s3a[READ(offset=3277290733568, length=32768)]error = 5
> Jul  7 00:10:24 toshi kernel: 
> g_vfs_done():ada0s3a[READ(offset=3487372935168, length=32768)]error = 5
> Jul  7 00:10:24 toshi kernel: 
> g_vfs_done():ada0s3a[READ(offset=3697455136768, length=32768)]error = 5
> Jul  7 00:10:24 toshi kernel: 
> g_vfs_done():ada0s3a[READ(offset=3865520898048, length=32768)]error = 5
> 
> FSCK declares nothing to be wrong with the file-system. I even used the 
> '-r' inode reclaim option and '-Z' to zero unused blocks to no effect.
> 
> I now have two UFS-based systems showing the same symptoms - what's up 
> with this?

I've seen these kinds of messages on 32-bit powerpc -r320570 when
using "boot -s" (single-user) and doing an fsck after making the
ufs root file system writable. (-r320570 could not boot
multi-user all the way without workarounds due to socket software
errors.) [Context was a production-style kernel build, not the
debug style, but I likely did not try this for a debug kernel
build.]

The messages came out before the following:
(manually retyped from a camera picture)

** //.snap/fsck_snapshot
** Last Mounted on /
** Root file system
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
Reclaimed: 0 directories, 1 files, 22680 fragments
780914 files, 4797127 used, 19552199 free (443479 frags, 3288590 blocks, 1.8% fragmentation)

***** FILE SYSTEM MARKED CLEAN *****


There were a lot of these messages.


I've not checked whether anything after -r320570
for 32-bit powerpc shows such messages. (The
socket software problem has an official fix
checked in: -r320652 . But I've not yet got as
far as progressing to it or beyond.)

-r320570 was a fix for another major problem,
related to the use of __pthread_cleanup_push_imp
stubs.

At the time I was not sure whether the g_vfs_done
notices were a distinct issue from the other issues
of that time frame, and I did not get as far as
investigating that question.


Both dump and fsck are likely using snapshots,
so the issue is likely tied to ufs snapshots.
Maybe there is an INO64 incompleteness that
gives the huge offsets. (Wild guess.)
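
To put a scale on the quoted offsets, here is a minimal Python sketch
(not part of the original report) that parses g_vfs_done READ notices
and expresses each offset in TiB and as a multiple of the request
length. The helper names parse_notices and describe are made up for
illustration, and the sample text is two of the quoted notices with
the syslog line wrapping removed. Error 5 is EIO on FreeBSD.

```python
import re

# Matches the g_vfs_done() READ notices in the form quoted above
# (with the syslog prefix and line wrapping already removed).
NOTICE = re.compile(
    r"g_vfs_done\(\):(\S+?)\[READ\(offset=(\d+), length=(\d+)\)\]error = (\d+)"
)

def parse_notices(text):
    """Return (device, offset, length, errno) tuples found in text."""
    return [
        (m.group(1), int(m.group(2)), int(m.group(3)), int(m.group(4)))
        for m in NOTICE.finditer(text)
    ]

def describe(offset, length):
    """Express a byte offset in TiB and in length-sized units."""
    tib = offset / 2**40
    units, remainder = divmod(offset, length)
    return tib, units, remainder

sample = """
g_vfs_done():ada0s3a[READ(offset=6050375794688, length=32768)]error = 5
g_vfs_done():ada0s3a[READ(offset=546222112768, length=32768)]error = 5
"""

for dev, offset, length, err in parse_notices(sample):
    tib, units, rem = describe(offset, length)
    print(f"{dev}: offset={offset} ({tib:.2f} TiB), "
          f"{units} x {length} bytes, remainder {rem}, error {err}")
```

Comparing the TiB figures against the actual size of the partition
(for example from "gpart show") would show whether the reads fall
outside the device, which is what the error would suggest.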

If your context was more typical then the issue
spans little-endian and big-endian, since the
powerpc context is big-endian but most usage
is little-endian.

===
Mark Millard
markmi at dsl-only.net
Received on Fri Jul 07 2017 - 21:02:55 UTC
