Re: ZFS melting under postgres...

From: Hugo Silva <hugo_at_barafranca.com>
Date: Thu, 13 Dec 2007 02:58:43 +0000
Benjamin Close wrote:
> Peter Losher wrote:
>> Hi,
>>
>> As part of our testing of 7.0/ZFS, we tried putting it through its paces,
>> having ZFS act as our storage medium for some test pgsql db's (like for
>> sqlgrey, etc) and in both BETA2 and BETA4 (amd64) we get the same
>> results with a RAIDZ2 container:
>>
>> -=-
>> Dec 12 14:24:12 nsa sqlgrey: fatal: setconfig error at
>> /usr/local/sbin/sqlgrey line 186.
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad4 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad6 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad8 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad10 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad12 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad14 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad16 offset=3665128448 size=21504
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad18 offset=3665128448 size=21504
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad4 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad6 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad8 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad10 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad12 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad14 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad16 offset=3665128448 size=21504
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad18 offset=3665128448 size=21504
>> Dec 12 16:49:53 nsa root: ZFS: zpool I/O failure, zpool=vault error=86
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad4 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad6 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad8 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad10 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad12 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad14 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad16 offset=3665128448 size=21504
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad18 offset=3665128448 size=21504
>> Dec 12 16:49:53 nsa postgres[50527]: [5-1] PANIC:  could not write to
>> log file 2, segment 53 at offset 7864320, length 8192: Input/output 
>> error
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad4 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad6 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad8 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad10 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad12 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad14 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad16 offset=3665128448 size=21504
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad18 offset=3665128448 size=21504
>> Dec 12 16:49:53 nsa root: ZFS: zpool I/O failure, zpool=vault error=86
>> Dec 12 16:49:53 nsa postgres[50596]: [1-1] FATAL:  the database system
>> is starting up
>> Dec 12 16:49:53 nsa kernel: pid 50527 (postgres), uid 70: exited on
>> signal 6 (core dumped)
>> -=-
>>
>> It basically corrupts the container from the inside until it fails
>> completely (usually within 24-48 hours, depending on how busy the db is).
>>
>> I had thought it was a bad SATA replicator/controller, but we had that
>> replaced w/ one from Supermicro.  So it's either the disks or something
>> in ZFS.  Has anyone used ZFS as the backend for any db's (mysql or pgsql)?
>>
>> If you need more info, let me know...
>>
>>   
> Try turning off the ZIL. Whilst I don't use a db, I have ZFS under high 
> load, and I've found that unless the ZIL is turned off I see checksum 
> corruption as well:
>
> /boot/loader.conf
>
> vfs.zfs.zil_disable=1
>
> Cheers,
>    Benjamin

Wouldn't it be a bad idea to disable the ZIL?

http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Disabling_the_ZIL_.28Don.27t.29
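
For anyone triaging logs like the ones quoted above, here is a minimal
sketch in Python that tallies the checksum mismatches per pool, device and
offset. The function name and the regular expression are my own
assumptions, modelled only on the syslog lines in this thread; it is not
part of any ZFS tool.

```python
import re
from collections import Counter

# Pattern modelled on the syslog lines quoted in this thread (an assumption,
# not an official format). \s+ tolerates the line-wrapping added by mail clients.
MISMATCH = re.compile(
    r"ZFS: checksum mismatch, zpool=(\S+)\s+path=(\S+)\s+offset=(\d+)\s+size=(\d+)"
)

def tally_mismatches(text):
    """Count checksum mismatches per (pool, device, offset)."""
    # Drop mail quote markers ("> ", ">> ") so wrapped lines join cleanly.
    unquoted = re.sub(r"(?m)^(?:>[ \t]?)+", "", text)
    counts = Counter()
    for pool, path, offset, _size in MISMATCH.findall(unquoted):
        counts[(pool, path, int(offset))] += 1
    return counts

sample = """\
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad4 offset=3665128448 size=22016
>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>> path=/dev/ad6 offset=3665128448 size=22016
"""

for key, n in sorted(tally_mismatches(sample).items()):
    print(key, n)
```

In the quoted log every device from ad4 through ad18 reports the same
offset, which a tally like this makes easy to see at a glance.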

Regards,

Hugo

Received on Thu Dec 13 2007 - 01:58:29 UTC
