Re: ZFS melting under postgres...

From: Pawel Jakub Dawidek <pjd_at_FreeBSD.org>
Date: Tue, 22 Jan 2008 10:45:47 +0100

On Wed, Dec 12, 2007 at 03:17:29PM -0800, Peter Losher wrote:
> Hi,
> 
> As part of our testing 7.0/ZFS we tried putting it through its paces
> having ZFS act as our storage medium for some test pgsql db's (like for
> sqlgrey, etc) and in both BETA2 and BETA4 (amd64) we get the same
> results with a RAIDZ2 container:
> 
> -=-
> Dec 12 14:24:12 nsa sqlgrey: fatal: setconfig error at
> /usr/local/sbin/sqlgrey line 186.
> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
> path=/dev/ad4 offset=3665128448 size=22016
[...]
> It basically corrupts the container from the inside until it fails
> completely (usually within 24-48 hours depending on how busy the db is)
> 
> I had thought it was a bad SATA replicator/controller, but we had that
> replaced w/ one from Supermicro.  So it's either the disks, or something
> in ZFS.  Anyone used ZFS to backend any db's (mysql or pgsql?)
> 
> If you need more info, let me know...

It is hard for me to believe that this is a FreeBSD-specific bug, because
checksumming happens below the FreeBSD-specific code. Of course anything is
possible, but I think it's unlikely.

I'd start by configuring UFS on top of GELI with authentication. GELI
will also detect silent data corruption:

	# geli init -a hmac/md5 -e null -s 4096 -P -K /dev/null /dev/ad4
	# geli attach -p -k /dev/null /dev/ad4
	# dd if=/dev/zero of=/dev/ad4.eli bs=1m (this will take a while)
	# newfs -U /dev/ad4.eli
	# mkdir -p /mnt/tmp
	# mount -o noatime /dev/ad4.eli /mnt/tmp

Try your DB test on this file system.
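
Once the DB test has run for a while, one way to look for corruption is to
read the whole provider back. This is only a minimal sketch: the function
name readback_check is hypothetical, and it assumes that a sector failing
GELI's HMAC verification is returned to the reader as an I/O error, so that
dd exits non-zero. The block size is given in bytes so the same line works
with dd implementations that do not accept the "1m" suffix.

```shell
#!/bin/sh
# Hypothetical helper (not part of GELI): read a provider, or any file,
# from start to end and report whether every block could be read.
# Assumption: with GELI authentication enabled, a sector whose HMAC does
# not verify is reported as a read error, which makes dd fail.
readback_check() {
    target="$1"
    # 1048576 bytes = 1 MB blocks; dd's own statistics are discarded.
    if dd if="$target" of=/dev/null bs=1048576 2>/dev/null; then
        echo "OK: $target read back cleanly"
    else
        echo "FAIL: read error on $target"
        return 1
    fi
}
```

For the setup above this would be called as "readback_check /dev/ad4.eli".
If it fails, the kernel log (for example "dmesg | grep GEOM_ELI") is the
place I would expect to find details about the affected offset.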

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd_at_FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

Received on Tue Jan 22 2008 - 08:46:07 UTC
