Re: ZFS melting under postgres...

From: David Duchscher <daved_at_tamu.edu>
Date: Thu, 13 Dec 2007 07:59:35 -0600

On Dec 12, 2007, at 10:25 PM, Benjamin Close wrote:

> Hugo Silva wrote:
>> Benjamin Close wrote:
>>> Peter Losher wrote:
>>>> Hi,
>>>>
>>>> As part of our testing of 7.0/ZFS we tried putting it through its paces
>>>> by having ZFS act as the storage medium for some test pgsql dbs (such as
>>>> the one for sqlgrey), and in both BETA2 and BETA4 (amd64) we get the same
>>>> results with a RAIDZ2 container:
>>>>
>>>> -=-
>>>> Dec 12 14:24:12 nsa sqlgrey: fatal: setconfig error at
>>>> /usr/local/sbin/sqlgrey line 186.
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad4 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad6 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad8 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad10 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad12 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad14 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad16 offset=3665128448 size=21504
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad18 offset=3665128448 size=21504
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad4 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad6 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad8 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad10 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad12 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad14 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad16 offset=3665128448 size=21504
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad18 offset=3665128448 size=21504
>>>> Dec 12 16:49:53 nsa root: ZFS: zpool I/O failure, zpool=vault error=86
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad4 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad6 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad8 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad10 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad12 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad14 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad16 offset=3665128448 size=21504
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad18 offset=3665128448 size=21504
>>>> Dec 12 16:49:53 nsa postgres[50527]: [5-1] PANIC:  could not write to
>>>> log file 2, segment 53 at offset 7864320, length 8192: Input/output error
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad4 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad6 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad8 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad10 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad12 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad14 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad16 offset=3665128448 size=21504
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad18 offset=3665128448 size=21504
>>>> Dec 12 16:49:53 nsa root: ZFS: zpool I/O failure, zpool=vault error=86
>>>> Dec 12 16:49:53 nsa postgres[50596]: [1-1] FATAL:  the database system
>>>> is starting up
>>>> Dec 12 16:49:53 nsa kernel: pid 50527 (postgres), uid 70: exited on
>>>> signal 6 (core dumped)
>>>> -=-
>>>>
>>>> It basically corrupts the container from the inside until it fails
>>>> completely (usually within 24-48 hours, depending on how busy the db is).
>>>>
>>>> I had thought it was a bad SATA replicator/controller, but we had that
>>>> replaced w/ one from Supermicro.  So it's either the disks, or something
>>>> in ZFS.  Has anyone used ZFS to backend any dbs (mysql or pgsql)?
>>>>
>>>> If you need more info, let me know...
>>>>
>>>>
>>> Try turning off the ZIL. Whilst I don't use a db, I have ZFS under
>>> high load, and I've found that unless the ZIL is turned off I see
>>> checksum corruption as well:
>>>
>>> /boot/loader.conf
>>>
>>> vfs.zfs.zil_disable=1
>>>
>>> Cheers,
>>>    Benjamin
>>
>> Wouldn't it be a bad idea to disable the ZIL?
>>
>> http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Disabling_the_ZIL_.28Don.27t.29
>
> A good read is:
>
> http://blogs.sun.com/perrin/entry/the_lumberjack
>
> It shows why the ZIL exists.
>
> Cheers,
>    Benjamin

So does anybody know of a battery-backed NVRAM card that works with
FreeBSD and that the ZIL could be offloaded to?
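
As an aside on the syslog excerpt quoted above: a minimal Python sketch, not
anything from the original poster's setup, for tallying the "checksum
mismatch" records per device and per offset. The function name `summarize`
and the regular expression are illustrative assumptions, based only on the
record format visible in the quoted log.

```python
import re
from collections import Counter

# Matches one "checksum mismatch" record; \s+ tolerates the line wrap
# between "zpool=..." and "path=..." seen in the quoted log.
RECORD = re.compile(
    r"ZFS: checksum mismatch, zpool=(?P<pool>\S+)\s+"
    r"path=(?P<path>\S+)\s+offset=(?P<offset>\d+)\s+size=(?P<size>\d+)"
)


def summarize(log_text):
    """Count checksum-mismatch records per device and per offset."""
    # Drop mail quote markers ("> ", ">>>> ") so wrapped records rejoin.
    cleaned = "\n".join(
        re.sub(r"^[>\s]+", "", line) for line in log_text.splitlines()
    )
    per_device = Counter()
    per_offset = Counter()
    for m in RECORD.finditer(cleaned):
        per_device[m["path"]] += 1
        per_offset[int(m["offset"])] += 1
    return per_device, per_offset


if __name__ == "__main__":
    # Two records copied from the quoted log, still carrying quote markers.
    sample = """\
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad4 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad6 offset=3665128448 size=22016
"""
    devices, offsets = summarize(sample)
    for path, count in sorted(devices.items()):
        print(path, count)
    for offset, count in sorted(offsets.items()):
        print(offset, count)
```

A tally like this makes it easier to see whether the errors cluster on one
device or, as in the quoted excerpt, show up on every device in the pool at
the same offset.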

--
DaveD
Received on Thu Dec 13 2007 - 13:15:27 UTC
