Trigger ZFS dedup one-off scan/rededup

Solution 1:

No, you can't dedup existing data without copying it. Remember, you'll only benefit from dedup if the whole dedup table (DDT) fits into RAM/L2ARC.

You can estimate the benefits of dedup with zdb -S poolname without even turning on dedup:

pfexec zdb -S rpool

Simulated DDT histogram:

bucket              allocated                       referenced          
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1     313K   13.4G   13.4G   13.4G     313K   13.4G   13.4G   13.4G
     2     111K   5.27G   5.27G   5.27G     233K   10.7G   10.7G   10.7G
     4    5.15K   96.2M   96.2M   96.2M    22.4K    403M    403M    403M
     8    1.03K   12.2M   12.2M   12.2M    10.3K    111M    111M    111M
    16      384   16.3M   16.3M   16.3M    8.10K    350M    350M    350M
    32      157   6.17M   6.17M   6.17M    6.47K    250M    250M    250M
    64       83   6.52M   6.52M   6.52M    6.37K    511M    511M    511M
   128       17    395K    395K    395K    2.61K   62.5M   62.5M   62.5M
   256        2      5K      5K      5K      802   2.24M   2.24M   2.24M
    2K        1     512     512     512    2.66K   1.33M   1.33M   1.33M
    8K        1    128K    128K    128K    8.21K   1.03G   1.03G   1.03G
 Total     431K   18.8G   18.8G   18.8G     613K   26.8G   26.8G   26.8G

dedup = 1.43, compress = 1.00, copies = 1.00, dedup * compress / copies = 1.43
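To check whether the DDT would actually fit in RAM, you can turn the histogram's "allocated blocks" total into a rough memory estimate. The ~320 bytes per DDT entry below is a commonly cited rule of thumb, not an exact figure; the block count is taken from the Total row above (431K ≈ 441,344 blocks):

```shell
# Rough DDT RAM estimate from the zdb -S output above.
blocks=441344        # "Total allocated blocks" from the histogram (431K)
bytes_per_entry=320  # rough per-entry DDT size in core (assumption)
echo "$(( blocks * bytes_per_entry / 1024 / 1024 )) MiB"
# → 134 MiB
```

So this particular pool's DDT is small; the concern is pools with hundreds of millions of blocks, where the table can run to many gigabytes.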

Solution 2:

Beware that the current dedup implementation (build 134) is RAM-demanding and has an outstanding issue where deleting large amounts of data can more or less brick your ZFS pool for a significant period of time: http://bugs.opensolaris.org/bugdatabase/view_bug.do;jsessionid=a24a5761748eedbb50cd39d3530e?bug_id=6924390

As for deduping existing data, copying or moving the files one by one while staying on the same pool should do the trick.
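A minimal sketch of that rewrite-in-place approach, assuming dedup is already enabled on the dataset and using a hypothetical path /tank/data. It copies each file (the copy is written as new, dedup-eligible blocks) and then replaces the original. It does not handle hardlinks, sparse files, or files open for writing, and old blocks stay referenced by any existing snapshots:

```shell
# Rewrite every file so ZFS writes fresh blocks through the dedup path.
# /tank/data is a placeholder; run `zfs set dedup=on pool/dataset` first.
find /tank/data -type f | while IFS= read -r f; do
    tmp="$f.rededup.$$"
    cp -p -- "$f" "$tmp" &&   # copy creates new, deduplicated blocks
    mv -- "$tmp" "$f"         # replace the original with the rewritten copy
done
```

Destroying old snapshots afterwards is what actually frees the pre-dedup blocks.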

Solution 3:

Great answer by blasafer. I'd just add that block pointer rewrite is a planned feature that would enable re-compression of already stored data; maybe it could be used to re-dedup too. But that's in the future, and I'm just guessing anyway.