Re: Would ZFS and gmirror work well together in a two-node failover cluster?

From: Sven Willenberger <sven_at_dmv.com>
Date: Thu, 24 Jul 2008 14:52:41 -0400

On Fri, 2008-07-18 at 12:06 -0400, Maurice Volaski wrote:
> I am looking to put together a two-node high-availability cluster 
> where each node has identical data storage consisting of a set of 
> internal data drives (separate from the boot drive). I want ZFS to 
> manage the drives as a JBOD in a RAIDZ2 configuration. Thus, if an 
> individual drive misbehaves or fails, ZFS detects and handles the 
> fault.
> 
> But I'm also looking to mirror this entire setup in real time to a 
> second identical server.
> 
> Basically, my question is can this work well on FreeBSD while taking 
> full advantage of ZFS?
> 
> Specifically, my understanding is that the only way to handle the 
> real time mirror is with gmirror and ggated, but it's not clear how 
> gmirror would interact with ZFS.
> 

My findings have been that ZFS and ggate[cd] do *not* play nicely
together, so I would concur that gmirror/ggate[cd] would be the way to
create a real-time mirror. (For those interested, I have posted some
information on how ggate[cd] and zpool will cause lockups rendering it
unsuitable for real-time multi-host mirroring over on -STABLE).

> I am assuming that gmirror operates only on individual drives, so if 
> I had a set of 24 drives on each server, there would be 24 mirrored 
> drive pairs.
> 
> One concern I have is that this setup could run into trouble with 
> gmirror's potentially sabotaging ZFS's RAIDZ2. For example, when a 
> drive starts failing, won't gmirror see it before ZFS does and take 
> the unfavorable action of substituting the corresponding drive in the 
> failover server in subsequent I/O, leaving ZFS's RAIDZ2 out of the 
> loop?
> 
> This is just one particular scenario, but in general, it's not 
> entirely clear that it's possible to have fine-grained control of 
> when, how much and in what direction gmirror manages synchronization 
> among drive pairs.
> 
> -Maurice

As was suggested in other followups, it would seem that zfs send/recv
may be a viable option depending on whether it is granular enough
(timewise) to be practical.
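
To make the granularity question concrete, here is a minimal sketch of one
replication cycle built on incremental snapshots. It is only an illustration
under stated assumptions: the dataset, snapshot, and host names are
hypothetical, ssh is assumed as the transport, and the -F flag on the
receiving side assumes a ZFS version that supports it. How often the cycle
can run depends on how long each incremental send takes.

```python
import subprocess


def replication_commands(dataset, prev_snap, new_snap, remote_host, remote_dataset):
    """Build the three commands for one replication cycle (nothing is executed).

    All argument values are supplied by the caller; the names used in any
    example call are hypothetical.
    """
    snapshot = ["zfs", "snapshot", f"{dataset}@{new_snap}"]
    if prev_snap is None:
        # First cycle: send the full snapshot stream.
        send = ["zfs", "send", f"{dataset}@{new_snap}"]
    else:
        # Later cycles: send only the changes between the two snapshots.
        send = ["zfs", "send", "-i", f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]
    # Assumption: ssh transport; -F rolls the target back to its latest
    # snapshot before receiving, discarding any local changes there.
    recv = ["ssh", remote_host, "zfs", "recv", "-F", remote_dataset]
    return snapshot, send, recv


def replicate(dataset, prev_snap, new_snap, remote_host, remote_dataset):
    """Run one cycle: take a snapshot, then pipe zfs send into the remote zfs recv."""
    snapshot, send, recv = replication_commands(
        dataset, prev_snap, new_snap, remote_host, remote_dataset
    )
    subprocess.run(snapshot, check=True)
    sender = subprocess.Popen(send, stdout=subprocess.PIPE)
    try:
        subprocess.run(recv, stdin=sender.stdout, check=True)
    finally:
        sender.stdout.close()
        if sender.wait() != 0:
            raise RuntimeError("zfs send failed")
```

Calling replicate("tank/data", "t0", "t1", "node2", "tank/data") from cron
every few minutes would bound the data lost in a failover to roughly one
interval, which is the trade-off against a true real-time mirror.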

As far as failover goes (e.g. yank power cord out) your best bet would
be using CARP as a virtual IP and using devd or ifstated to trigger an
event on CARP interface change (from BACKUP to MASTER or vice-versa).
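
As a rough illustration of the decision such a trigger has to make, the
sketch below reads the CARP state out of ifconfig text and maps a state
change to an action. The function names and the "promote"/"demote" labels
are hypothetical, and the parser assumes ifconfig prints a line of the form
"carp: MASTER vhid ..." for the interface; what promote and demote actually
do (importing or exporting the pool, starting services) is left to the
script that devd or ifstated would call.

```python
import re


def carp_state(ifconfig_output):
    """Return 'MASTER', 'BACKUP', or 'INIT' from ifconfig text, or None if absent.

    Assumes the output contains a line such as 'carp: MASTER vhid 1 ...'.
    """
    match = re.search(r"carp:\s+(MASTER|BACKUP|INIT)\b", ifconfig_output)
    return match.group(1) if match else None


def failover_action(previous, current):
    """Map a CARP state transition to a hypothetical action label."""
    if previous == current:
        return None          # no change, nothing to do
    if current == "MASTER":
        return "promote"     # this node just took over the virtual IP
    if previous == "MASTER":
        return "demote"      # this node just gave up the virtual IP
    return None              # e.g. INIT -> BACKUP: nothing to take over
```

A handler script would call carp_state() on the output of "ifconfig carp0"
each time it is triggered, compare the result with the last state it
recorded, and act only when failover_action() returns something.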

I would be interested in seeing how well the zfs option of snapshots
would work as rebuilding a 500GB gmirror is really a lengthy process ...

Sven