Re: ZFS melting under postgres...

From: Benjamin Close <Benjamin.Close_at_clearchain.com>
Date: Thu, 13 Dec 2007 14:51:22 +1030
Kip Macy wrote:
> On Dec 12, 2007 6:58 PM, Hugo Silva <hugo_at_barafranca.com> wrote:
>   
>> Benjamin Close wrote:
>>     
>>> Peter Losher wrote:
>>>       
>>>> Hi,
>>>>
>>>> As part of our testing 7.0/ZFS we tried putting it thru it's paces
>>>> having ZFS act as our storage medium for some test pgsql db's (like for
>>>> sqlgrey, etc) and in both BETA2 and BETA4 (amd64) we get the same
>>>> results with a RAIDZ2 container:
>>>>
>>>> -=-
>>>> Dec 12 14:24:12 nsa sqlgrey: fatal: setconfig error at
>>>> /usr/local/sbin/sqlgrey line 186.
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad4 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad6 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad8 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad10 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad12 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad14 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad16 offset=3665128448 size=21504
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad18 offset=3665128448 size=21504
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad4 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad6 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad8 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad10 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad12 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad14 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad16 offset=3665128448 size=21504
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad18 offset=3665128448 size=21504
>>>> Dec 12 16:49:53 nsa root: ZFS: zpool I/O failure, zpool=vault error=86
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad4 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad6 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad8 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad10 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad12 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad14 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad16 offset=3665128448 size=21504
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad18 offset=3665128448 size=21504
>>>> Dec 12 16:49:53 nsa postgres[50527]: [5-1] PANIC:  could not write to
>>>> log file 2, segment 53 at offset 7864320, length 8192: Input/output
>>>> error
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad4 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad6 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad8 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad10 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad12 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad14 offset=3665128448 size=22016
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad16 offset=3665128448 size=21504
>>>> Dec 12 16:49:53 nsa root: ZFS: checksum mismatch, zpool=vault
>>>> path=/dev/ad18 offset=3665128448 size=21504
>>>> Dec 12 16:49:53 nsa root: ZFS: zpool I/O failure, zpool=vault error=86
>>>> Dec 12 16:49:53 nsa postgres[50596]: [1-1] FATAL:  the database system
>>>> is starting up
>>>> Dec 12 16:49:53 nsa kernel: pid 50527 (postgres), uid 70: exited on
>>>> signal 6 (core dumped)
>>>> -=-
>>>>
>>>> It basically corrupts the container from the inside until it fails
>>>> completely (usually withing 24-48 hours depending on how busy the db is)
>>>>
>>>> I had thought it was a bad SATA replicator/controller, but we had that
>>>> replaced w/ one from Supermicro.  So it's either the disks, or something
>>>> in ZFS.  Anyone used ZFS to backend any db's (mysql or pgsql?)
>>>>
>>>> If you need more info, let me know...
>>>>
>>>>
>>>>         
>>> Try turning of zil, whilst I don't use a db, I have zfs under high
>>> load. I've found without zil turned off I see checksum corruption as
>>> well:
>>>
>>> /boot/loader.conf
>>>
>>> vfs.zfs.zil_disable=1
>>>
>>> Cheers,
>>>    Benjamin
>>>       
>> Wouldn't it be a bad idea to disable ZIL ?
>>
>> http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Disabling_the_ZIL_.28Don.27t.29
>>     
>
> Yes. However, FreeBSD suffers from deadlocks under load if ZIL is enabled.
>
>  -Kip
>   
It also comes down to what your doing. ZFS is always consistent on disk. 
ZIL provides the journal between the last pool transaction write and 
what has changed since that write. Either way zfs will come up cleanly 
after a power failure, it's just whether you have those last few sync's 
or not.
For the application I'm using zfs for (rsynced backups, snapshoted 
daily) that'll be corrected the next day anyway. For a DB, this could be 
a show stopper.

Cheers,
    Benjamin
Received on Thu Dec 13 2007 - 03:18:19 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:39:24 UTC