Re: r327359: cylinder checksum failed: cg0, cgp: 0x4515d2a3 != bp: 0xd9fba319 Dec 30 23:29:24 <0.2>

From: Chris H <bsd-lists_at_BSDforge.com>
Date: Sun, 07 Jan 2018 21:09:47 -0800
On Sun, 7 Jan 2018 12:31:34 +0100 "O. Hartmann" <ohartmann_at_walstatt.org> said

> On Thu, 4 Jan 2018 12:14:47 +0100
> "O. Hartmann" <ohartmann_at_walstatt.org> wrote:
> 
> > On Thu, 4 Jan 2018 09:10:37 +0100
> > Michael Tuexen <tuexen_at_freebsd.org> wrote:
> > 
> > > > On 31. Dec 2017, at 02:45, Warner Losh <imp_at_bsdimp.com> wrote:
> > > > 
> > > > On Sat, Dec 30, 2017 at 4:41 PM, O. Hartmann <ohartmann_at_walstatt.org>
> > > > wrote:
> > > >     
> > > >> On most recent CURRENT I face the error shown below on /tmp filesystem
> > > >> (UFS2) residing
> > > >> on a Samsung 850 Pro SSD:
> > > >> 
> > > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> > > >> != bp: 0xd9fba319
> > > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> > > >> != bp: 0xd9fba319
> > > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> > > >> != bp: 0xd9fba319
> > > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> > > >> != bp: 0xd9fba319
> > > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > > >> UFS /dev/gpt/tmp (/tmp) cylinder checksum failed: cg 0, cgp: 0x4515d2a3
> > > >> != bp: 0xd9fba319
> > > >> handle_workitem_freefile: got error 5 while accessing filesystem
> > > >> 
> > > >> I've already formatted the /tmp filesystem, but obviously without any
> > > >> success.
> > > >> 
> > > >> Since I face such strange errors also on NanoBSD images dd'ed to SD cards,
> > > >> I guess there is something fishy ...
> > > > 
> > > > 
> > > > It indicates a problem. We've seen these 'corruptions' on data in motion at
> > > > work, but I hacked fsck to report checksum mismatches (it silently corrects
> > > > them today) and we've not seen any mismatch when we unmount and fsck the
> > > > filesystem.
> > > Not sure this helps: But we have seen this also after system panics
> > > when having soft update journaling enabled. Having soft update journaling
> > > disabled, we did not observe this after several panics.
> > > Just to be clear: The panics are not related to this issue,
> > > but to other network development we do.
> > > 
> > > You can check using tunefs -p devname if soft update journaling is enabled
> > > or not.
> > 
> > In all cases I reported, earlier and now, softupdates ARE ENABLED on all
> > partitions in question (always GPT, in my cases also all on flash based
> > devices, SD card and/or SSD).
> 
> 
> ... and journalling as well!
> 
> In case of the SD, I produced the layout of the NanoBSD image via "dd",
> including the /cfg partition. The problem occurred even after overwriting
> the SD card with a new image.
> The problem went away once I unmounted /cfg and reformatted it via newfs.
> After that, I did not see any faults again! I have no explanation for this
> behaviour, except that the dd didn't overwrite "faulty" areas or that the
> obligatory "gpart recover" at the end of the procedure restored something
> faulty.
> 
> The /tmp filesystem I reported was also from an earlier date, and I didn't
> format it as I said; I confused the partition in question with another one.
> The partition was created and formatted months ago under CURRENT.
> 
> In single user mode, I reformatted the partition again, with journaling and
> softupdates enabled. As with the /cfg partition on NanoBSD with the SD card,
> I haven't noticed any faults since then.
> 
FWIW I *also* experience this on gpart/FFS2 partitioned/formatted drives
*with* journaling enabled. As a result, if the system crashes, more often
than not fsck(8) canNOT use the journal and indicates that it must
"fall through" to complete the task. This is on a SATA (ahci) driven
disk. My experience suggests that journaling is the cause.
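
For anyone wanting to script the "tunefs -p devname" check mentioned
upthread across several partitions, here is a minimal sketch in Python.
It rests on assumptions: that each tunefs -p line looks like
"tunefs: soft update journaling: (-j)    enabled", and that the output may
arrive on stderr rather than stdout. The helper names (parse_tunefs,
journaling_enabled, query_device) are made up for illustration; check the
output of your own tunefs before relying on it.

```python
import re
import subprocess

# Assumed line shape (verify against your own tunefs -p output):
#   tunefs: soft update journaling: (-j)                       enabled
_FLAG_RE = re.compile(
    r"^(?:tunefs:\s*)?(?P<name>.+?):\s*\((?P<flag>-\w)\)\s+(?P<value>.+?)\s*$"
)


def parse_tunefs(text):
    """Map each flag letter (e.g. '-j') to the value tunefs reports for it."""
    settings = {}
    for line in text.splitlines():
        match = _FLAG_RE.match(line.strip())
        if match:
            settings[match.group("flag")] = match.group("value")
    return settings


def journaling_enabled(text):
    """True if the parsed output reports soft update journaling (-j) as enabled."""
    return parse_tunefs(text).get("-j") == "enabled"


def query_device(devname):
    """Run tunefs -p on devname and parse whatever it prints."""
    proc = subprocess.run(
        ["tunefs", "-p", devname],
        capture_output=True,
        text=True,
        check=False,
    )
    # Combine both streams, since the settings may be written to stderr.
    return parse_tunefs(proc.stdout + proc.stderr)


if __name__ == "__main__":
    import sys

    for dev in sys.argv[1:]:
        flags = query_device(dev)
        print(dev, "soft updates:", flags.get("-n"), "SUJ:", flags.get("-j"))
```

Run it as root with the device names as arguments, e.g. /dev/gpt/tmp.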
> > 
> > 
> > > 
> > > Best regards
> > > Michael  
> > > > 
> > > > Warner
> -- 
> O. Hartmann
> 
> I object to the use or transfer of my data for advertising purposes
> or for market or opinion research (§ 28 Abs. 4 BDSG).
--Chris
Received on Mon Jan 08 2018 - 04:09:39 UTC