Re: ZFS txg implementation flaw

From: Slawa Olhovchenkov <slw_at_zxy.spb.ru>
Date: Tue, 29 Oct 2013 00:57:34 +0400
On Mon, Oct 28, 2013 at 04:51:02PM -0400, Allan Jude wrote:

> On 2013-10-28 16:48, Slawa Olhovchenkov wrote:
> > On Mon, Oct 28, 2013 at 02:28:04PM -0400, Allan Jude wrote:
> >
> >> On 2013-10-28 14:16, Slawa Olhovchenkov wrote:
> >>> On Mon, Oct 28, 2013 at 10:45:02AM -0700, aurfalien wrote:
> >>>
> >>>> On Oct 28, 2013, at 2:28 AM, Slawa Olhovchenkov wrote:
> >>>>
> >>>>> I may be wrong.
> >>>>> As far as I can see, ZFS creates a separate thread for each txg write.
> >>>>> Also for writing to L2ARC.
> >>>>> As a result, up to several thousand threads are created and destroyed per
> >>>>> second, along with hundreds of thousands of page allocations, zeroings,
> >>>>> mappings, unmappings and frees per second. Very high overhead.
> >>>>>
> >>>>> In systat -vmstat I see totfr up to 600000, prcfr up to 200000.
> >>>>>
> >>>>> Estimated overhead -- 30% of system time.
> >>>>>
> >>>>> Can anybody implement a thread and page pool for txg?
> >>>> Would lowering vfs.zfs.txg.timeout be a way to tame or mitigate this?
> >>> vfs.zfs.txg.timeout: 5
> >>>
> >>> That allows at most a 5x reduction (less in practice with bursty writes), and it means more fragmentation on writes, etc.
> >> From my understanding, increasing the timeout so you are doing fewer
> >> transaction groups, would actually be the way to increase performance,
> >> at the cost of 'bursty' writing and the associated uneven latency.
> > This (increasing the timeout) dramatically decreases read
> > performance because of the very large I/O bursts.
> It shouldn't affect read performance, except during the flush operations
> (every txg.timeout seconds)

Yes, that is exactly the period I am talking about.

> If you watch with 'gstat' or 'gstat -f ada.$' you should see the cycle
> reading quickly, then every txg.timeout seconds (and for maybe longer),
> it flushes the entire transaction group (may be 100s of MBs) to the
> disk, this high write load may make reads slow until it is finished.

Yes. And reads may be delayed for several seconds.
This is unacceptable in my case.
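
To put rough numbers on the trade-off, here is a back-of-envelope sketch.
It is only an illustration: the function name, the 40 MB/s incoming write
rate and the 200 MB/s disk throughput are made-up assumptions, and it
ignores the ZFS write throttle and any limit on txg size.

```python
def flush_estimate(write_mb_per_s, txg_timeout_s, disk_write_mb_per_s):
    """Return (burst_mb, flush_seconds) for one transaction group.

    Simplified model (an assumption, not how ZFS actually schedules I/O):
    all data written during one timeout interval is flushed in a single
    burst at the full sequential write speed of the disk.
    """
    burst_mb = write_mb_per_s * txg_timeout_s
    flush_seconds = burst_mb / disk_write_mb_per_s
    return burst_mb, flush_seconds


if __name__ == "__main__":
    # Hypothetical workload: 40 MB/s of incoming writes, 200 MB/s disk.
    for timeout in (1, 5, 30):
        burst, stall = flush_estimate(40, timeout, 200)
        print(f"timeout={timeout:>2}s  burst={burst:.0f} MB  flush~{stall:.1f}s")
```

In this model the share of time spent flushing is the same for every
timeout (write rate divided by disk throughput); what grows with the
timeout is the length of each individual stall, and that is what hurts
reads here.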
Received on Mon Oct 28 2013 - 19:55:37 UTC
