Re: ZFS secondarycache on SSD problem on r255173

From: Steven Hartland <killing_at_multiplay.co.uk>
Date: Wed, 16 Oct 2013 15:55:26 +0100
I'm not clear what you rolled back there, as r255173 has nothing to do
with this. Could you clarify?

Any errors recorded in /var/log/messages?

Could you add code to record the non-zero value of zio->io_error in
l2arc_read_done, as this may give some indication of the underlying
issue.
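
Once the value is being logged, it helps to turn the number into a name.
A minimal sketch, assuming io_error carries an errno-style code and that
the log is read on a machine whose errno table matches the server's;
describe_io_error is a hypothetical helper, not part of ZFS:

```python
import errno
import os


def describe_io_error(code):
    """Return a readable label for an errno-style code (hypothetical helper)."""
    if code == 0:
        return "0 (no error)"
    name = errno.errorcode.get(code)
    if name is None:
        # ZFS-specific codes may not be present in the host's errno table.
        return f"{code} (not in this host's errno table)"
    return f"{code} {name} ({os.strerror(code)})"


print(describe_io_error(errno.EIO))
```

A code that falls outside the table is still worth reporting as the raw
number, since it narrows down which part of the read path failed.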

Additionally, you could always put a panic in that code path too and then
create a dump so the details can be fully examined.

In terms of the slowness, that's going to be a side effect of the cache
failures.
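
The cache failures themselves are visible in the
kstat.zfs.misc.arcstats.l2_cksum_bad and kstat.zfs.misc.arcstats.l2_io_error
counters quoted below. A minimal sketch, assuming sysctl output captured as
text in the usual "name: value" form, for seeing how much each counter grew
between two samples; the function names and sample numbers are made up for
illustration:

```python
L2_COUNTERS = (
    "kstat.zfs.misc.arcstats.l2_cksum_bad",
    "kstat.zfs.misc.arcstats.l2_io_error",
)


def parse_counters(text):
    """Parse 'name: value' lines and keep only the L2ARC error counters."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        name = name.strip()
        if sep and name in L2_COUNTERS:
            counters[name] = int(value.strip())
    return counters


def counter_growth(before, after):
    """Return how much each counter grew between two parsed samples."""
    return {name: after[name] - before.get(name, 0) for name in after}


# Illustrative samples only, not real measurements.
sample_1 = (
    "kstat.zfs.misc.arcstats.l2_cksum_bad: 0\n"
    "kstat.zfs.misc.arcstats.l2_io_error: 0\n"
)
sample_2 = (
    "kstat.zfs.misc.arcstats.l2_cksum_bad: 12\n"
    "kstat.zfs.misc.arcstats.l2_io_error: 3\n"
)
print(counter_growth(parse_counters(sample_1), parse_counters(sample_2)))
```

Comparing growth over a fixed interval, rather than the absolute values,
makes it easier to tell whether errors are still accumulating after a change.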

Oh, could you also confirm that the issue doesn't exist if you:
1. Exclude r255753
2. Set vfs.zfs.max_auto_ashift=9
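
For context on the second item: ashift is the base-2 logarithm of the
sector size ZFS assumes for a vdev, so 9 corresponds to 512-byte sectors
and 12 to 4096-byte sectors. A small worked example (the function name is
hypothetical):

```python
def sector_size(ashift):
    """Sector size in bytes for a given ashift (sector size = 2 ** ashift)."""
    return 1 << ashift


for ashift in (9, 12, 13):
    print(ashift, sector_size(ashift))
```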

    Regards
    Steve
----- Original Message ----- 
From: "Vitalij Satanivskij" <satan_at_ukr.net>
To: "Steven Hartland" <killing_at_multiplay.co.uk>
Cc: "Vitalij Satanivskij" <satan_at_ukr.net>; "Dmitriy Makarov" <supportme_at_ukr.net>; "Justin T. Gibbs" <gibbs_at_freebsd.org>; "Borja Marcos" <borjam_at_sarenet.es>; <freebsd-current_at_freebsd.org>
Sent: Wednesday, October 16, 2013 3:10 PM
Subject: Re: ZFS secondarycache on SSD problem on r255173


> Yes
>
> We have 15 servers, and all of them had the problem when running with the ashift patch, so we rolled back the patch (for r255173)
> and all of them worked for a week without those problems. Yesterday one of the servers was updated to stable/10 (beta1),
> which includes the patch, and after around 12 hours of work the L2ARC began to get errors like these:
>
> kstat.zfs.misc.arcstats.l2_cksum_bad
> kstat.zfs.misc.arcstats.l2_io_error
>
>
> For now the patch is disabled in our production.
>
>
> Please note we have a very heavy load on the ZFS pool, so the 90GB ARC and 3x180GB L2ARC take a very large number of hits.
>
>
> The SSDs used for the caches are Intel SSD 530 series. SMART for all devices is in a normal state,
> with no bad values.
>
> Steven Hartland wrote:
> SH> Have you confirmed the ashift changes are the actual cause of this
> SH> by backing out just those changes and retesting on the same hardware?
> SH>
> SH> Also worth checking your disks' SMART values to confirm there are no
> SH> visible signs of HW errors.
> SH>
> SH>     Regards
> SH>     Steve
> SH>
> SH> ----- Original Message ----- 
> SH> From: "Vitalij Satanivskij" <satan_at_ukr.net>
> SH> To: "Dmitriy Makarov" <supportme_at_ukr.net>
> SH> Cc: "Steven Hartland" <killing_at_multiplay.co.uk>; "Justin T. Gibbs" <gibbs_at_freebsd.org>; "Borja Marcos" <borjam_at_sarenet.es>;
> SH> <freebsd-current_at_freebsd.org>
> SH> Sent: Wednesday, October 16, 2013 9:01 AM
> SH> Subject: Re: ZFS secondarycache on SSD problem on r255173
> SH>
> SH>
> SH> > Hello.
> SH> >
> SH> > The patch broke cache functionality.
> SH> >
> SH> > Look at Dmitriy's mail from Mon, 07 Oct 2013 21:09:06 +0300
> SH> >
> SH> > With subject ZFS L2ARC - incorrect size and abnormal system load on r255173
> SH> >
> SH> > As the patch is already in head and BETA, that's not good.
> SH> >
> SH> > Yesterday we updated one machine to beta1 and forgot about the patch. So after 12 hours the cache was broken... :((
> SH> >
> SH> >
> SH> >
> SH> > Dmitriy Makarov wrote:
> SH> > DM> The attached patch by Steven Hartland fixes issue for me too. Thank you!
> SH> > DM>
> SH> > DM>
> SH> > DM> --- Original message ---
> SH> > DM> From: "Steven Hartland" < killing_at_multiplay.co.uk >
> SH> > DM> Date: 18 September 2013, 01:53:10
> SH> > DM>
> SH> > DM> ----- Original Message ----- 
> SH> > DM> From: "Justin T. Gibbs" <
> SH> > DM>
> SH> > DM> --- 
> SH> > DM> Дмитрий Макаров
> SH> > DM> _______________________________________________
> SH> > DM> freebsd-current_at_freebsd.org mailing list
> SH> > DM> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> SH> > DM> To unsubscribe, send any mail to "freebsd-current-unsubscribe_at_freebsd.org"
> SH> >
> SH>
> SH>
> SH> ================================================
> SH> This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the
> SH> event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any
> SH> information contained in it.
> SH>
> SH> In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337
> SH> or return the E.mail to postmaster_at_multiplay.co.uk.
> 


Received on Wed Oct 16 2013 - 12:55:11 UTC

This archive was generated by hypermail 2.4.0 : Wed May 19 2021 - 11:40:43 UTC