Re: r327359: cylinder checksum failed: cg0, cgp: 0x4515d2a3 != bp: 0xd9fba319 Dec 30 23:29:24 <0.2>

From: Warner Losh <imp_at_bsdimp.com>
Date: Mon, 8 Jan 2018 09:12:16 -0700
On Jan 8, 2018 8:34 AM, "Mark Johnston" <markj_at_freebsd.org> wrote:

On Thu, Jan 04, 2018 at 09:10:37AM +0100, Michael Tuexen wrote:
> > On 31. Dec 2017, at 02:45, Warner Losh <imp_at_bsdimp.com> wrote:
> >
> > On Sat, Dec 30, 2017 at 4:41 PM, O. Hartmann <ohartmann_at_walstatt.org> wrote:
> >
> >> On the most recent CURRENT I face the error shown below on the /tmp
> >> filesystem (UFS2), which resides on a Samsung 850 Pro SSD:
> >>
> >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> >> != bp: 0xd9fba319
> >> handle_workitem_freefile: got error 5 while accessing filesystem
> >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> >> != bp: 0xd9fba319
> >> handle_workitem_freefile: got error 5 while accessing filesystem
> >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> >> != bp: 0xd9fba319
> >> handle_workitem_freefile: got error 5 while accessing filesystem
> >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> >> != bp: 0xd9fba319
> >> handle_workitem_freefile: got error 5 while accessing filesystem
> >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> >> != bp: 0xd9fba319
> >> handle_workitem_freefile: got error 5 while accessing filesystem
> >>
> >> I've already formatted the /tmp filesystem, but obviously without any
> >> success.
> >>
> >> Since I face such strange errors also on NanoBSD images dd'ed to SD cards,
> >> I guess there is something fishy ...
> >
> >
> > It indicates a problem. We've seen these 'corruptions' on data in motion at
> > work, but I hacked fsck to report checksum mismatches (it silently corrects
> > them today) and we've not seen any mismatch when we unmount and fsck the
> > filesystem.
> Not sure this helps, but we have also seen this after system panics
> with soft update journaling enabled. With soft update journaling
> disabled, we did not observe this after several panics.
> Just to be clear: the panics are not related to this issue,
> but to other network development we do.

I saw the same issue this morning on a mirrored root filesystem after my
workstation came up following a power failure. fsck recovered using the
journal, and I subsequently saw a number of these checksum failures.
Upon shutdown, I saw the same handle_workitem_freefile errors as above.
I then ran a full fsck from single-user mode, which didn't turn up any
inconsistencies, and after that the checksum failure errors disappeared,
presumably because fsck fixed them.


Yes. Fsck automatically fixes issues like that, and it does so silently. I have
patched it to make it noisy, and in the dozen cases where I saw the errors, fsck
stayed silent even with my whiny patches. I can put them up for review if people want...
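
For anyone following along, the kind of check being discussed (compare the
checksum stored in a metadata block against one recomputed from the block's
contents, and complain on mismatch instead of quietly rewriting it) can be
sketched roughly as below. This is only an illustration, not the fsck patch
mentioned above and not the kernel code: the CRC-32C variant, the 4-byte
little-endian field, the `ck_off` offset and the names `seal_block`,
`check_block` and `report` are all assumptions made up for the example.

```python
from typing import Tuple


def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def _with_zeroed_field(block: bytes, ck_off: int) -> bytes:
    # The checksum field itself is excluded by zeroing it before hashing.
    return block[:ck_off] + b"\x00" * 4 + block[ck_off + 4:]


def seal_block(block: bytes, ck_off: int) -> bytes:
    """Hypothetical helper: store a fresh checksum at ck_off."""
    ck = crc32c(_with_zeroed_field(block, ck_off))
    return block[:ck_off] + ck.to_bytes(4, "little") + block[ck_off + 4:]


def check_block(block: bytes, ck_off: int) -> Tuple[bool, int, int]:
    """Return (ok, stored, computed) for the checksum field at ck_off."""
    stored = int.from_bytes(block[ck_off:ck_off + 4], "little")
    computed = crc32c(_with_zeroed_field(block, ck_off))
    return stored == computed, stored, computed


def report(name: str, block: bytes, ck_off: int) -> str:
    """'Noisy' variant: describe a mismatch rather than silently fixing it."""
    ok, stored, computed = check_block(block, ck_off)
    if ok:
        return f"{name}: checksum ok"
    return f"{name}: checksum mismatch: stored 0x{stored:08x} != computed 0x{computed:08x}"
```

Sealing a buffer with `seal_block`, flipping any byte outside the checksum
field, and then calling `report` yields a mismatch line; a silent repair would
just call `seal_block` again without saying anything.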

Warner
Received on Mon Jan 08 2018 - 15:12:18 UTC
