Replacing a failed disk in raidz2 zfs (and gpt)

From: Philip M. Gollucci <pgollucci_at_p6m7g8.com>
Date: Thu, 3 Feb 2011 06:11:34 +0000
All,

I have a zroot(mirror)+zmysql(raidz2) setup on a MySQL db box.
One drive failed (mfid3).  We've since replaced it.

I can't for the life of me get zpool to replace it. I can't remember why
I used GPT partitions instead of whole disks for the zmysql pool (but
that's how it is).  I've tried all of the following commands, each
failing with a different error, and I must say I'm stumped.  I've done
this several times before for the ASF (but with no GPT in play there).

$ zpool scrub zmysql
just runs, and completes, no error

$ zpool replace zmysql gpt/disk3
cannot replace gpt/disk3 with gpt/disk3: one or more devices is
currently unavailable

$ zpool remove zmysql gpt/disk3
cannot remove gpt/disk3: only inactive hot spares or cache devices can
be removed

$ zpool offline zmysql gpt/disk3
cannot offline gpt/disk3: no valid replicas

$ zpool add zmysql gpt/disk3
invalid vdev specification
use '-f' to override the following errors:
mismatched replication level: pool uses raidz and new vdev is disk

I would say that's because I didn't run any GPT commands on the new
disk, but see below.  I think the partition table either got copied over
via the RAID card's pass-through, or the system just hasn't rescanned
the disk yet.
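
For reference, "running GPT commands" on the replacement disk could look
roughly like the dry run below. This is only a sketch under assumptions
that have not been verified on this box: that the controller exposes the
new drive as mfid3, that no stale partition table is still attached to
it, and that the start/size values and the swap3/disk3 labels match the
other disks as listed further down. The function name is made up for
illustration, and it only prints the commands rather than executing them.

```shell
# Dry run: print (do not execute) the commands that would recreate the
# partition layout on a replacement disk and hand it back to ZFS.
# Start/size values are taken from the gpart show output in this post.
print_replace_plan() {
    disk=$1   # e.g. mfid3
    n=$2      # label suffix, e.g. 3 for swap3 / disk3
    pool=$3   # e.g. zmysql
    echo "gpart create -s gpt $disk"
    echo "gpart add -b 34 -s 128 -t freebsd-boot $disk"
    echo "gpart add -b 162 -s 50331648 -t freebsd-swap -l swap$n $disk"
    echo "gpart add -b 50331810 -s 90177536 -t freebsd-zfs -l disk$n $disk"
    echo "zpool replace $pool gpt/disk$n"
}

print_replace_plan mfid3 3 zmysql
```

Printing first makes it easy to review each line against the layout of
a healthy disk before anything destructive is run.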

$ zpool online zmysql gpt/disk3
warning: device 'gpt/disk3' onlined, but remains in faulted state
use 'zpool replace' to replace devices that are no longer present

$ zpool add zmysql spare gpt/disk3
cannot add to 'zmysql': one or more devices is currently unavailable

$ zpool replace zmysql gpt/disk3 gpt/disk3
cannot replace gpt/disk3 with gpt/disk3: one or more devices is
currently unavailable

Below is some system information.  More details on request.
No, I cannot export/import the pool or reboot the box.

Thanks in advance!


$ zpool status -v zmysql
  pool: zmysql
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: scrub completed after 0h16m with 0 errors on Tue Feb  1 21:13:41 2011
config:

        NAME           STATE     READ WRITE CKSUM
        zmysql         DEGRADED     0     0     0
          raidz2       DEGRADED     0     0     0
            gpt/disk2  ONLINE       0     0     0
            gpt/disk3  UNAVAIL     15 6.96M     0  experienced I/O failures
            gpt/disk4  ONLINE       0     0     0
            gpt/disk5  ONLINE       0     0     0
            gpt/disk6  ONLINE       0     0     0
            gpt/disk7  ONLINE       0     0     0

errors: No known data errors
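
To pull just the unhealthy members out of status output like the above,
a small filter can help. This is a minimal sketch; the function name is
hypothetical and it assumes the NAME/STATE column layout shown here. The
sample input is an excerpt of the output above.

```shell
# List config entries whose STATE column is not ONLINE.
# Reads `zpool status` text on stdin; prints "name state" per line.
unhealthy_vdevs() {
    awk '
        # The header row marks the start of the config table.
        $1 == "NAME" && $2 == "STATE" { in_config = 1; next }
        # The table ends before the errors line.
        /^errors:/ { in_config = 0 }
        in_config && NF >= 2 && $2 != "ONLINE" { print $1, $2 }
    '
}

# Excerpt of the status output in this post.
sample_status='        NAME           STATE     READ WRITE CKSUM
        zmysql         DEGRADED     0     0     0
          raidz2       DEGRADED     0     0     0
            gpt/disk2  ONLINE       0     0     0
            gpt/disk3  UNAVAIL     15 6.96M     0  experienced I/O failures
            gpt/disk4  ONLINE       0     0     0

errors: No known data errors'

printf '%s\n' "$sample_status" | unhealthy_vdevs
```

On a live system the same filter would be fed from the real command,
for example: zpool status zmysql | unhealthy_vdevs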


$ zpool upgrade
This system is currently running ZFS pool version 13.

All pools are formatted using this version.

$ zfs upgrade
This system is currently running ZFS filesystem version 3.

All filesystems are formatted with the current version.

$ hd -v /dev/mfid3p1 | head
hd: /dev/mfid3p1: Input/output error

$ hd -v /dev/gpt/disk3 | head
hd: /dev/gpt/disk3: Input/output error

$ ls /dev/mfid3*
crw-r-----  1 root  operator  -   0,  97 Nov 17 08:03:12 2010 mfid3
crw-r-----  1 root  operator  -   0, 107 Nov 17 08:03:12 2010 mfid3p1
crw-r-----  1 root  operator  -   0, 108 Nov 17 08:03:12 2010 mfid3p2
crw-r-----  1 root  operator  -   0, 109 Nov 17 08:03:12 2010 mfid3p3

$ ls /dev/gpt
total 1
dr-xr-xr-x  2 root  wheel     -      512 Nov 17 08:03:12 2010 ./
dr-xr-xr-x  7 root  wheel     -      512 Nov 17 08:03:12 2010 ../
crw-r-----  1 root  operator  -   0, 117 Nov 17 08:03:12 2010 disk0
crw-r-----  1 root  operator  -   0, 122 Nov 17 08:03:12 2010 disk1
crw-r-----  1 root  operator  -   0, 127 Nov 17 08:03:12 2010 disk2
crw-r-----  1 root  operator  -   0, 132 Nov 17 08:03:12 2010 disk3
crw-r-----  1 root  operator  -   0, 149 Nov 17 08:03:12 2010 disk4
crw-r-----  1 root  operator  -   0, 154 Nov 17 08:03:12 2010 disk5
crw-r-----  1 root  operator  -   0, 159 Nov 17 08:03:12 2010 disk6
crw-r-----  1 root  operator  -   0, 164 Nov 17 08:03:12 2010 disk7
crw-r-----  1 root  operator  -   0, 115 Nov 17 08:03:12 2010 swap0
crw-r-----  1 root  operator  -   0, 120 Nov 17 08:03:12 2010 swap1
crw-r-----  1 root  operator  -   0, 125 Nov 17 08:03:12 2010 swap2
crw-r-----  1 root  operator  -   0, 130 Nov 17 08:03:12 2010 swap3
crw-r-----  1 root  operator  -   0, 147 Nov 17 08:03:12 2010 swap4
crw-r-----  1 root  operator  -   0, 152 Nov 17 08:03:12 2010 swap5
crw-r-----  1 root  operator  -   0, 157 Nov 17 08:03:12 2010 swap6
crw-r-----  1 root  operator  -   0, 162 Nov 17 08:03:12 2010 swap7

(yes, I know it's time to update; I'm waiting on 8.2)
$ uname -a
FreeBSD x 8.0-RELEASE-p2 FreeBSD 8.0-RELEASE-p2 #1 r203057: Wed Jan 27 06:42:10 UTC 2010     root_at_Z:/usr/obj/usr/src/sys/X  amd64

$ gpart show
=>       34  142081981  mfid0  GPT  (68G)
         34        128      1  freebsd-boot  (64K)
        162   50331648      2  freebsd-swap  (24G)
   50331810   90177536      3  freebsd-zfs  (43G)
  140509346    1572669         - free -  (768M)

=>       34  142081981  mfid1  GPT  (68G)
         34        128      1  freebsd-boot  (64K)
        162   50331648      2  freebsd-swap  (24G)
   50331810   90177536      3  freebsd-zfs  (43G)
  140509346    1572669         - free -  (768M)

=>       34  142081981  mfid2  GPT  (68G)
         34        128      1  freebsd-boot  (64K)
        162   50331648      2  freebsd-swap  (24G)
   50331810   90177536      3  freebsd-zfs  (43G)
  140509346    1572669         - free -  (768M)

=>       34  142081981  mfid3  GPT  (68G)
         34        128      1  freebsd-boot  (64K)
        162   50331648      2  freebsd-swap  (24G)
   50331810   90177536      3  freebsd-zfs  (43G)
  140509346    1572669         - free -  (768M)

=>       34  142081981  mfid4  GPT  (68G)
         34        128      1  freebsd-boot  (64K)
        162   50331648      2  freebsd-swap  (24G)
   50331810   90177536      3  freebsd-zfs  (43G)
  140509346    1572669         - free -  (768M)

=>       34  142081981  mfid5  GPT  (68G)
         34        128      1  freebsd-boot  (64K)
        162   50331648      2  freebsd-swap  (24G)
   50331810   90177536      3  freebsd-zfs  (43G)
  140509346    1572669         - free -  (768M)

=>       34  142081981  mfid6  GPT  (68G)
         34        128      1  freebsd-boot  (64K)
        162   50331648      2  freebsd-swap  (24G)
   50331810   90177536      3  freebsd-zfs  (43G)
  140509346    1572669         - free -  (768M)

=>       34  142081981  mfid7  GPT  (68G)
         34        128      1  freebsd-boot  (64K)
        162   50331648      2  freebsd-swap  (24G)
   50331810   90177536      3  freebsd-zfs  (43G)
  140509346    1572669         - free -  (768M)
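
As a sanity check on the layout above, the sector counts can be turned
back into sizes and the partition boundaries checked for gaps. This is
a sketch that assumes 512-byte sectors (which matches the rounded sizes
gpart prints); the function names are made up for illustration.

```shell
# Convert a sector count to whole MiB, assuming 512-byte sectors
# (2048 sectors per MiB).
sectors_to_mib() {
    echo $(( $1 / 2048 ))
}

# Print the first sector after a partition: start + size.
next_start() {
    echo $(( $1 + $2 ))
}

sectors_to_mib 50331648      # freebsd-swap: 24576 MiB = 24 GiB
sectors_to_mib 90177536      # freebsd-zfs:  44032 MiB = 43 GiB

next_start 34 128            # 162, where the swap partition begins
next_start 162 50331648      # 50331810, where the zfs partition begins
next_start 50331810 90177536 # 140509346, where the free space begins
```

The start of the zfs partition, sector 50331810, is also the block
named in the "hard error cmd=read fsbn 50331810" console message.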


$ pciconf -lv |grep ....
mfi0_at_pci0:2:14:0:       class=0x010400 card=0x1f031028 chip=0x00151028 rev=0x00 hdr=0x00
    vendor     = 'Dell Computer Corporation'
    device     = 'Integrated RAID controller (PERC 5/i RAID Controller)'

console/dmesg during hot swap:
mfi0: sense error 0, sense_key 0, asc 0, ascq 0
mfid3: hard error cmd=read fsbn 50331810
mfi0: 17960 (349585200s/0x0020/info) - Patrol Read started
mfi0: 18038 (349586341s/0x0020/info) - Patrol Read complete
mfi0: 18039 (349891840s/0x0002/WARN) - Removed: PD 03(e1/s3)
mfi0: 18040 (349891840s/0x0002/info) - Removed: PD 03(e1/s3) Info: enclPd=08, scsiType=0, portMap=08, sasAddr=5000c50001439195,0000000000000000
mfi0: 18041 (349891840s/0x0002/info) - State change on PD 03(e1/s3) from UNCONFIGURED_BAD(1) to FAILED(11)
mfi0: 18042 (349891840s/0x0002/info) - State change on PD 03(e1/s3) from FAILED(11) to UNCONFIGURED_BAD(1)
mfi0: 18043 (349891857s/0x0002/info) - Inserted: PD 03(e1/s3)
mfi0: 18044 (349891857s/0x0002/info) - Inserted: PD 03(e1/s3) Info: enclPd=08, scsiType=0, portMap=08, sasAddr=5000c5001ce0e065,0000000000000000
mfi0: 18045 (349891857s/0x0002/info) - State change on PD 03(e1/s3) from UNCONFIGURED_BAD(1) to UNCONFIGURED_GOOD(0)
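
To confirm from the log that a physically different drive went into the
slot, the sasAddr reported at removal can be compared with the one
reported at insertion. This is a minimal sketch; the function name is
hypothetical, and it accepts the Info record either on one line or with
the sasAddr on a continuation line. The sample input is an excerpt of
the messages above.

```shell
# Print "<event> <sasAddr>" for each Removed/Inserted Info record.
pd_sas_events() {
    awk '
        /Removed: PD/  { event = "Removed" }
        /Inserted: PD/ { event = "Inserted" }
        # "sasAddr=" is 8 characters; keep only the hex digits after it.
        match($0, /sasAddr=[0-9a-f]+/) && event != "" {
            print event, substr($0, RSTART + 8, RLENGTH - 8)
            event = ""
        }
    '
}

# Excerpt of the console messages in this post.
sample_log='mfi0: 18040 (349891840s/0x0002/info) - Removed: PD 03(e1/s3) Info:
enclPd=08, scsiType=0, portMap=08, sasAddr=5000c50001439195,0000000000000000
mfi0: 18043 (349891857s/0x0002/info) - Inserted: PD 03(e1/s3)
mfi0: 18044 (349891857s/0x0002/info) - Inserted: PD 03(e1/s3) Info:
enclPd=08, scsiType=0, portMap=08, sasAddr=5000c5001ce0e065,0000000000000000'

printf '%s\n' "$sample_log" | pd_sas_events
```

Here the two addresses differ, so the controller did see a new drive in
slot 3; the final state it reports for that drive is
UNCONFIGURED_GOOD(0).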

-- 
------------------------------------------------------------------------
1024D/DB9B8C1C B90B FBC3 A3A1 C71A 8E70  3F8C 75B8 8FFB DB9B 8C1C
Philip M. Gollucci (pgollucci_at_p6m7g8.com) c: 703.336.9354
VP Apache Infrastructure; Member, Apache Software Foundation
Committer,                        FreeBSD Foundation
Consultant,                       P6M7G8 Inc.
Sr. System Admin,                 Ridecharge Inc.

Work like you don't need the money,
love like you'll never get hurt,
and dance like nobody's watching.


Received on Thu Feb 03 2011 - 05:21:42 UTC
