Re: zpool: multiple IDs, CURRENT drops all pools after reboot

From: Steven Hartland <killing@multiplay.co.uk>
Date: Tue, 16 Sep 2014 22:06:36 +0100
> One of my backup drives dedicated to a zpool is faulting and showing up with multiple
> IDs. The only working ID is 257822624560506537.
> 
> FreeBSD CURRENT with three ZFS disks and only 4 GB of RAM is very "flaky" regarding this
> issue: twice today the whole set of pools vanished after a reboot. Giving the box 8 GB
> total and rebooting doesn't show the problem; it gets more frequent when the RAM is
> reduced to 4 GB (FreeBSD 11.0-CURRENT #2 r271684: Tue Sep 16 20:41:47 CEST 2014). This
> is a bit spooky.
> 
> Below is the faulted hard drive. I guess the drive/pool shown below somehow triggers the
> loss of all the other pools (I have to re-import the other pools, which do not have any
> defects, but they drop out and vanish again after a reboot).
> 
> Is there a way of getting rid of the faulty IDs without destroying the pool?
> 
> Regards,
> 
> Oliver 
> 
>  root@thor: [/etc] zpool import
>    pool: BACKUP00
>      id: 9337833315545958689
>   state: FAULTED
>  status: One or more devices contains corrupted data.
>  action: The pool cannot be imported due to damaged devices or data.
>         The pool may be active on another system, but can be imported using
>         the '-f' flag.
>    see: http://illumos.org/msg/ZFS-8000-5E
>  config:
> 
>         BACKUP00               FAULTED  corrupted data
>           8544670861382329237  UNAVAIL  corrupted data
> 
>    pool: BACKUP00
>      id: 257822624560506537
>   state: ONLINE
>  action: The pool can be imported using its name or numeric identifier.
>  config:
> 
>         BACKUP00    ONLINE
>           ada3p1    ONLINE
> 
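
On the import question: since the healthy instance has its own
numeric ID, you should be able to select it explicitly rather than
by name (standard zpool syntax, though untested against your exact
setup):

    zpool import 257822624560506537

The FAULTED entry is just a stale label left behind on some device.
If you can identify which device it lives on, zpool labelclear
should remove it; for a hypothetical leftover partition /dev/adaXpY
that would be:

    zpool labelclear -f /dev/adaXpY

Note labelclear is destructive, so triple-check you're pointing it
at the stale device and not at the live ada3p1.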

Might be a long shot but check out the patches on:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=187594

Specifically:
https://bugs.freebsd.org/bugzilla/attachment.cgi?id=147070

And if that doesn't work:
https://bugs.freebsd.org/bugzilla/attachment.cgi?id=147286
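
To try one, apply it to your source tree and rebuild the kernel,
roughly as below; this is a sketch, the output filename is just an
example and the -p level depends on how the diff was generated, so
adjust if patch complains:

    cd /usr/src
    fetch -o /tmp/zfs-dirty.patch \
        'https://bugs.freebsd.org/bugzilla/attachment.cgi?id=147070'
    patch -p0 < /tmp/zfs-dirty.patch
    make buildkernel && make installkernel && shutdown -r now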

The second contains all the changes from the first, plus some
additional changes which dynamically size the max dirty data.
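
For reference, the knobs involved can be inspected (and tuned) via
sysctl; assuming your CURRENT is new enough to have the new ZFS
write throttle, that's:

    sysctl vfs.zfs.dirty_data_max
    sysctl vfs.zfs.dirty_data_max_percent

dirty_data_max defaults to a percentage of physical memory, which
may be part of why a 4 GB box behaves so differently from an 8 GB
one here.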

These changes are still under discussion, and it's likely the
additions in the second patch aren't the right direction, but they
have been reported to show good improvements under high memory
pressure for certain workloads, so it would be interesting to see
if they help with your problem.

All that said, you shouldn't end up with corrupt data no matter
what.
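
Once you do get the pool imported, it would be worth scrubbing it
to confirm the on-disk data really is intact:

    zpool scrub BACKUP00
    zpool status -v BACKUP00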

Are there any other symptoms? Has the memory been checked for
faults, etc.?

    Regards
    Steve
Received on Tue Sep 16 2014 - 19:06:44 UTC
