HUH721010AL5200 drives with different number of sectors (WD vs Dell)

I ordered some spare hard drives to replace a failed hard drive in my zfs pool (raidz2 with 12 10TB hard drives on a supermicro server running Ubuntu). I made sure to order hard drives with the exact same model number as those already in the pool (HUH721010AL5200). However, it appears that the new ones I received are Dell OEM, while the original drives are Western Digital.

When I tried to replace the failed (WD) hard drive with a new one (Dell), the zpool replace command failed with the following error message: "Device is too small".

Upon closer inspection, it appears that the two drives have a different number of sectors.

Here is the result of gdisk for my new drive:

gdisk -l /dev/sdi
GPT fdisk (gdisk) version 1.0.3

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sdi: 19134414848 sectors, 8.9 TiB
Model: HUH721010AL5200
Sector size (logical/physical): 512/4096 bytes
Disk identifier (GUID): ED5CF966-DA38-2D4B-8F8E-3C3867C25E07
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 33
First usable sector is 34, last usable sector is 19134414814
Partitions will be aligned on 2048-sector boundaries
Total free space is 4029 sectors (2.0 MiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048     19134396415   8.9 TiB     BF01  zfs-57963ba0e4d1284c
   9     19134396416     19134412799   8.0 MiB     BF07

And the same thing for one of the old drives:

gdisk -l /dev/sdh
GPT fdisk (gdisk) version 1.0.3

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sdh: 19532873728 sectors, 9.1 TiB
Model: HUH721010AL5200
Sector size (logical/physical): 512/4096 bytes
Disk identifier (GUID): 6EEE7537-C089-544B-A500-EE19A147CA99
Partition table holds up to 128 entries
Main partition table begins at sector 2 and ends at sector 33
First usable sector is 34, last usable sector is 19532873694
Partitions will be aligned on 2048-sector boundaries
Total free space is 4029 sectors (2.0 MiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048     19532855295   9.1 TiB     BF01  zfs-2363298e7ec25d90
   9     19532855296     19532871679   8.0 MiB     BF07

As you can see, the new drive has fewer sectors --> smaller capacity --> zfs refuses to use it.
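
To put a number on the gap, here is a minimal Python sketch of the arithmetic. The sector counts are copied from the two gdisk outputs above and the 512-byte logical sector size from their "Sector size" lines; the variable names are only illustrative.

```python
# Logical sector size reported by gdisk: "Sector size (logical/physical): 512/4096 bytes"
LOGICAL_SECTOR_BYTES = 512

new_sectors = 19_134_414_848  # /dev/sdi, the new (Dell OEM) drive
old_sectors = 19_532_873_728  # /dev/sdh, one of the original drives

new_bytes = new_sectors * LOGICAL_SECTOR_BYTES
old_bytes = old_sectors * LOGICAL_SECTOR_BYTES
missing_sectors = old_sectors - new_sectors
missing_bytes = old_bytes - new_bytes

print(f"new drive: {new_bytes:,} bytes")
print(f"old drive: {old_bytes:,} bytes")
print(f"shortfall: {missing_sectors:,} sectors = {missing_bytes:,} bytes "
      f"({missing_bytes / 2**30:.1f} GiB, "
      f"{100 * missing_bytes / old_bytes:.2f}% of the old drive)")
```

The two byte totals should agree with the "User Capacity" lines in the smartctl output further down, which suggests the whole difference lies in how many logical blocks each drive exposes rather than in how gdisk counts them.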

I was told to update the firmware of the hard drive, but I'm not sure how to proceed and I want to be super careful not to lose any data.

Does anyone have an idea, short of returning these drives and finding/buying the version made by WD?

Thank you,

jf

EDIT: adding the result of smartctl in response to the comment by @shodanshok

For the new drive (too small):

smartctl --all /dev/sdi
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.18.0-21-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               HGST
Product:              HUH721010AL5200
Revision:             LS17
Compliance:           SPC-4
User Capacity:        9,796,820,402,176 bytes [9.79 TB]
Logical block size:   512 bytes
Physical block size:  4096 bytes
Formatted with type 2 protection
LU is fully provisioned
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000cca27349c848
Serial number:        2YH9KX5D
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Wed May 12 11:15:06 2021 CDT
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     31 C
Drive Trip Temperature:        50 C

Manufactured in week 36 of year 2018
Specified cycle count over device lifetime:  50000
Accumulated start-stop cycles:  3
Specified load-unload count over device lifetime:  600000
Accumulated load-unload cycles:  5
Elements in grown defect list: 0

Vendor (Seagate) cache information
  Blocks sent to initiator = 17381195776

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0         95          0.044           0
write:         0        0         0         0          4          0.011           0
verify:        0        0         0         0        271          0.000           0

Non-medium error count:        0

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                   -   12484                 - [-   -    -]
# 2  Background short  Completed                   -   12413                 - [-   -    -]

Long (extended) Self Test duration: 63514 seconds [1058.6 minutes]

And for comparison, one of the older drives:

smartctl --all /dev/sdh
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.18.0-21-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               HGST
Product:              HUH721010AL5200
Revision:             A384
Compliance:           SPC-4
User Capacity:        10,000,831,348,736 bytes [10.0 TB]
Logical block size:   512 bytes
Physical block size:  4096 bytes
LU is fully provisioned
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000cca2732326c0
Serial number:        2YGMA92D
Device type:          disk
Transport protocol:   SAS (SPL-3)
Local Time is:        Wed May 12 11:19:35 2021 CDT
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     32 C
Drive Trip Temperature:        85 C

Manufactured in week 23 of year 2018
Specified cycle count over device lifetime:  50000
Accumulated start-stop cycles:  41
Specified load-unload count over device lifetime:  600000
Accumulated load-unload cycles:  2669
Elements in grown defect list: 0

Vendor (Seagate) cache information
  Blocks sent to initiator = 53115368926871552

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0       41         0        41    3717572      74439.685           0
write:         0        0         0         0     605524      58839.145           0
verify:        0        0         0         0      49259          0.000           0

Non-medium error count:        0

No self-tests have been logged

Thanks again for the help.

jf


The two drives probably use different Host Protected Area (HPA) settings. Please check this with hdparm -N /dev/yourdisk

EDIT: based on your smartctl output, the first (smaller) disk is formatted with additional per-sector integrity data (note the "Formatted with type 2 protection" line), for example with 520-byte physical sectors. This naturally means that a smaller portion of the raw storage capacity is available for user data.
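
The telling difference is the "Formatted with type 2 protection" line, which appears for /dev/sdi and is absent for /dev/sdh. If you have several spares to check, here is a minimal Python sketch for picking that line out of saved smartctl text. The function name protection_type and the two sample strings (trimmed from your outputs) are only illustrative, and treating a missing line as "type 0" is my assumption.

```python
import re

def protection_type(smartctl_text):
    """Return the protection type smartctl reports, or 0 if the line is absent."""
    match = re.search(r"^Formatted with type (\d) protection",
                      smartctl_text, re.MULTILINE)
    return int(match.group(1)) if match else 0

# Excerpt from the new (smaller) drive, /dev/sdi
new_drive = """Physical block size:  4096 bytes
Formatted with type 2 protection
LU is fully provisioned"""

# Excerpt from one of the original drives, /dev/sdh
old_drive = """Physical block size:  4096 bytes
LU is fully provisioned"""

print(protection_type(new_drive))  # 2
print(protection_type(old_drive))  # 0
```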

You should be able to reformat the disk with sg_format, i.e. by issuing something similar to sg_format --format --size=512 --fmtpinfo=0 /dev/yourdisk
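
Keep in mind that a low-level format erases everything on the target disk and can run for a long time on a drive this size, so double-check the device node before starting. If you want to script it across several spares, here is a minimal Python sketch that only prints the command unless explicitly told to run it. The helper names sg_format_command and reformat and the really_run switch are hypothetical; the sg_format options are the ones given above.

```python
import shlex
import subprocess

def sg_format_command(device, block_size=512, fmtpinfo=0):
    """Build the sg_format argument list; nothing is executed here."""
    return ["sg_format", "--format",
            f"--size={block_size}", f"--fmtpinfo={fmtpinfo}", device]

def reformat(device, really_run=False):
    """Print the command; run it only when really_run is True (destroys all data)."""
    cmd = sg_format_command(device)
    print(shlex.join(cmd))
    if really_run:
        subprocess.run(cmd, check=True)
    return cmd

# Dry run with the placeholder device name from the answer
reformat("/dev/yourdisk")
```

Once the format completes, re-running gdisk -l or smartctl on the disk should show whether the sector count now matches the original drives.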