Need to replace disk in zpool ... confused

I need to replace a bad disk in a zpool on FreeNAS.

zpool status shows

  pool: raid-5x3
 state: ONLINE
 scrub: scrub completed after 15h52m with 0 errors on Sun Mar 30 13:52:46 2014
config:

    NAME                                            STATE     READ WRITE CKSUM
    raid-5x3                                        ONLINE       0     0     0
      raidz1                                        ONLINE       0     0     0
        ada5p2                                      ONLINE       0     0     0
        gptid/a767b8ef-1c95-11e2-af4c-f46d049aaeca  ONLINE       0     0     0
        ada8p2                                      ONLINE       0     0     0
        ada10p2                                     ONLINE       0     0     0
        ada7p2                                      ONLINE       0     0     0

errors: No known data errors

  pool: raid2
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
    the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: none requested
config:

    NAME                                            STATE     READ WRITE CKSUM
    raid2                                           DEGRADED     0     0     0
      raidz1                                        DEGRADED     0     0     0
        gptid/5f3c0517-3ff2-11e2-9437-f46d049aaeca  ONLINE       0     0     0
        gptid/5fe33556-3ff2-11e2-9437-f46d049aaeca  UNAVAIL      0     0     0  cannot open
        gptid/60570005-3ff2-11e2-9437-f46d049aaeca  ONLINE       0     0     0
        gptid/60ebeaa5-3ff2-11e2-9437-f46d049aaeca  ONLINE       0     0     0
        gptid/61925b86-3ff2-11e2-9437-f46d049aaeca  ONLINE       0     0     0

errors: No known data errors

glabel status shows

                                      Name  Status  Components
                             ufs/FreeNASs3     N/A  da0s3
                             ufs/FreeNASs4     N/A  da0s4
                    ufsid/4fa405ab96518680     N/A  da0s1a
                            ufs/FreeNASs1a     N/A  da0s1a
                            ufs/FreeNASs2a     N/A  da0s2a
gptid/5f3c0517-3ff2-11e2-9437-f46d049aaeca     N/A  ada1p2
gptid/60570005-3ff2-11e2-9437-f46d049aaeca     N/A  ada3p2
gptid/60ebeaa5-3ff2-11e2-9437-f46d049aaeca     N/A  ada4p2
gptid/a767b8ef-1c95-11e2-af4c-f46d049aaeca     N/A  ada6p2
gptid/61925b86-3ff2-11e2-9437-f46d049aaeca     N/A  ada9p2
gptid/4599731b-8f15-11e1-a14c-f46d049aaeca     N/A  ada10p2

camcontrol devlist shows

<Hitachi HDS723030BLE640 MX6OAAB0>  at scbus0 target 0 lun 0 (pass0,ada0)
<ST3000VX000-9YW166 CV13>          at scbus4 target 0 lun 0 (pass1,ada1)
<ST3000VX000-9YW166 CV13>          at scbus6 target 0 lun 0 (pass3,ada3)
<Hitachi HDS723030BLE640 MX6OAAB0>  at scbus7 target 0 lun 0 (pass4,ada4)
<ST3000DM001-9YN166 CC4C>          at scbus8 target 0 lun 0 (pass5,ada5)
<WDC WD30EZRX-00MMMB0 80.00A80>    at scbus8 target 1 lun 0 (pass6,ada6)
<WDC WD30EZRX-00MMMB0 80.00A80>    at scbus9 target 0 lun 0 (pass7,ada7)
<ST3000DM001-9YN166 CC4C>          at scbus9 target 1 lun 0 (pass8,ada8)
<Hitachi HDS723030BLE640 MX6OAAB0>  at scbus10 target 0 lun 0 (pass9,ada9)
<Hitachi HDS5C3030ALA630 MEAOA580>  at scbus11 target 0 lun 0 (pass10,ada10)
< USB Flash Memory 1.00>           at scbus12 target 0 lun 0 (pass11,da0)

I'm pretty sure that ada2 is the bad disk.

It appears that I left a spare in there - ada0 - last time I was in the box. Can I replace ada2 with ada0 remotely, until someone gets to the office? If so, with what commands?

Here's what I don't understand:

  1. Why don't ada0, ada2, ada5, ada7, and ada8 appear in glabel status?
  2. Why does zpool status show those long gptids for some disks, and "ada" names for others?
  3. If I want to zpool replace raid2 -- what do I use for the device and new-device names?

Solution 1:

FreeNAS is a NAS appliance, and as such it hides some technical choices behind its firmware, system, or GUI.

Here is the partition scheme FreeNAS uses on a disk in a ZFS pool it created (example from a small VM):

$ glabel status
                                      Name  Status  Components
gptid/a699226f-bcc4-11e3-952d-0800271cd34d     N/A  ada4p2
gptid/a6cfc072-bcc4-11e3-952d-0800271cd34d     N/A  ada5p2
gptid/a707f034-bcc4-11e3-952d-0800271cd34d     N/A  ada6p2

A closer look at disk ada4:

$ gpart show ada4
=>      34  62914493  ada4  GPT  (30G)
        34        94        - free -  (47k)
       128   4194304     1  freebsd-swap  (2.0G)
   4194432  58720095     2  freebsd-zfs  (28G)

FreeNAS adds a small swap partition to each disk it sets up; the remaining space goes to the second partition (the p2 in ada4p2).

Why?

Why not? IMHO it might have something to do with partition alignment, but it could also simply be because FreeNAS is usually installed on a USB key or a small CF card without any swap (or one is the excuse for the other).
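
To make the layout and the alignment guess concrete, here is a minimal Python sketch that reads the `gpart show ada4` output above and reports each partition's size and whether its start offset falls on a 4 KiB boundary. It assumes gpart is counting 512-byte sectors, which the output itself does not state; `SECTOR_BYTES`, `ALIGN_BYTES`, and `parse_partitions` are names made up for this illustration.

```python
import re

# Output copied from the `gpart show ada4` example above.
GPART_SHOW = """\
=>      34  62914493  ada4  GPT  (30G)
        34        94        - free -  (47k)
       128   4194304     1  freebsd-swap  (2.0G)
   4194432  58720095     2  freebsd-zfs  (28G)
"""

SECTOR_BYTES = 512   # assumption: gpart counts 512-byte sectors on this disk
ALIGN_BYTES = 4096   # assumption: 4 KiB is the alignment we care about


def parse_partitions(text):
    """Return (index, type, start_sector, size_sectors) for each partition row.

    The header row ("=>") and the "- free -" row do not match the pattern
    and are skipped.
    """
    rows = []
    for line in text.splitlines():
        m = re.match(r"\s*(\d+)\s+(\d+)\s+(\d+)\s+(\S+)", line)
        if m:
            start, size, index, ptype = m.groups()
            rows.append((int(index), ptype, int(start), int(size)))
    return rows


for index, ptype, start, size in parse_partitions(GPART_SHOW):
    offset = start * SECTOR_BYTES
    gib = size * SECTOR_BYTES / 2**30
    aligned = offset % ALIGN_BYTES == 0
    print(f"p{index} {ptype}: {gib:.1f} GiB, "
          f"starts at byte {offset}, 4K-aligned: {aligned}")
```

Under the 512-byte assumption, both partitions start on a 4 KiB boundary (the 94 free sectors push the first partition to sector 128), which is at least consistent with the alignment explanation.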

About your questions:

  1. Why don't ada0, ada2, ada5, ada7, and ada8 appear in glabel status?

    glabel is short for GEOM labelling, so it only displays information about the partitions/filesystems it supports (see man glabel for the complete list). In this case, the disks themselves and the swap partitions are not shown.

  2. Why does zpool status show those long gptids for some disks, and "ada" names for others?

    Same as question 1: because of GEOM labelling or, in this particular case, the lack of it.

    Sometimes partitions are not initialized/labelled through glabel (or they lose this information). Either way, don't worry too much: this is only a naming matter. It is not the end of the world if one partition has a gptid and another a plain device name.

    Of course, you cannot change the label once the partitions are in a zpool (the system prevents you from modifying partitions that are in use, which makes sense).

  3. If I want to zpool replace raid2 -- what do I use for the device and new-device names?

    As seen above, it might be better to let FreeNAS handle the disk partitioning for you: see "Replacing a failed drive with FreeNAS" in the links below.

    However, it is also possible to do it by hand without worrying about partitioning. The resilvering kicks in automatically and, to give you an order of magnitude, takes about as long as a scrub usually does:

    $ zpool replace raid2 gptid/5fe33556-3ff2-11e2-9437-f46d049aaeca /dev/ada0
    

    Once the resilvering is done, you would have something like this:

    $ zpool status raid2
      pool: raid2
     state: ONLINE
      scan: resilvered ...G in ?h?m with 0 errors on Sun Apr  6 17:17:25 2014
    config:
    
            NAME                                              STATE     READ WRITE CKSUM
            raid2                                             ONLINE       0     0     0
              raidz1                                          ONLINE       0     0     0
                gptid/5f3c0517-3ff2-11e2-9437-f46d049aaeca    ONLINE       0     0     0
                replacing-0
                  gptid/5fe33556-3ff2-11e2-9437-f46d049aaeca  UNAVAIL      0     0     0  cannot open
                  ada0                                        ONLINE       0     0     0
                gptid/60570005-3ff2-11e2-9437-f46d049aaeca    ONLINE       0     0     0
                gptid/60ebeaa5-3ff2-11e2-9437-f46d049aaeca    ONLINE       0     0     0
                gptid/61925b86-3ff2-11e2-9437-f46d049aaeca    ONLINE       0     0     0
    

    Then run zpool detach raid2 gptid/5fe33556-3ff2-11e2-9437-f46d049aaeca to remove the faulty device from the zpool.

    You can still plan a cleaner, more "in-line" replacement once the faulty disk has been swapped for a new one. I advise you to rehearse the whole procedure on a VM first (as it seems you are new to this).

    ZFS is a nice filesystem with a lot of great features, BUT it requires planning.
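
Before running zpool replace, it may help to cross-check which physical disk backs each pool member. The following is a rough Python sketch, not a FreeNAS tool: it uses data transcribed by hand from the zpool status, glabel status, and camcontrol devlist outputs in the question, and the names `LABELS`, `POOLS`, `ATTACHED`, `disk_of`, and `audit` are invented for this illustration.

```python
import re

# gptid label -> partition, transcribed from `glabel status` in the question.
LABELS = {
    "gptid/5f3c0517-3ff2-11e2-9437-f46d049aaeca": "ada1p2",
    "gptid/60570005-3ff2-11e2-9437-f46d049aaeca": "ada3p2",
    "gptid/60ebeaa5-3ff2-11e2-9437-f46d049aaeca": "ada4p2",
    "gptid/a767b8ef-1c95-11e2-af4c-f46d049aaeca": "ada6p2",
    "gptid/61925b86-3ff2-11e2-9437-f46d049aaeca": "ada9p2",
    "gptid/4599731b-8f15-11e1-a14c-f46d049aaeca": "ada10p2",
}

# Pool members, transcribed from `zpool status` in the question.
POOLS = {
    "raid-5x3": [
        "ada5p2",
        "gptid/a767b8ef-1c95-11e2-af4c-f46d049aaeca",
        "ada8p2",
        "ada10p2",
        "ada7p2",
    ],
    "raid2": [
        "gptid/5f3c0517-3ff2-11e2-9437-f46d049aaeca",
        "gptid/5fe33556-3ff2-11e2-9437-f46d049aaeca",
        "gptid/60570005-3ff2-11e2-9437-f46d049aaeca",
        "gptid/60ebeaa5-3ff2-11e2-9437-f46d049aaeca",
        "gptid/61925b86-3ff2-11e2-9437-f46d049aaeca",
    ],
}

# ada disks that `camcontrol devlist` reports as attached (ada2 is absent).
ATTACHED = ["ada0", "ada1", "ada3", "ada4", "ada5",
            "ada6", "ada7", "ada8", "ada9", "ada10"]


def disk_of(member):
    """Resolve a pool member to its disk name, or None if its label is unknown."""
    partition = LABELS.get(member) if member.startswith("gptid/") else member
    if partition is None:
        return None
    return re.sub(r"p\d+$", "", partition)  # strip the partition suffix: ada1p2 -> ada1


def audit(pools, attached):
    """Return (pool members with no backing disk, attached disks used by no pool)."""
    used, missing = set(), []
    for pool, members in pools.items():
        for member in members:
            disk = disk_of(member)
            if disk is None:
                missing.append((pool, member))
            else:
                used.add(disk)
    unused = [d for d in attached if d not in used]
    return missing, unused


missing, unused = audit(POOLS, ATTACHED)
print("members with no backing disk:", missing)
print("attached but unused disks:", unused)
```

With the question's data, the only unresolved member is gptid/5fe33556-3ff2-11e2-9437-f46d049aaeca in raid2 and the only attached disk not used by either pool is ada0, which matches the device and new-device arguments used in the zpool replace command above.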

For more information:

  • ZFS and partition alignment
  • man glabel
  • Replacing a failed drive with FreeNAS

Solution 2:

I think you should reconsider your use of FreeNAS. You've had an uncharacteristically large number of issues with your FreeNAS installation(s) over the years.

Many of these issues were planning and ZFS design problems. It may be time to refactor or rebuild your environment now that you have some knowledge of best (or at least better) practices.