How can I remove dedup from my pool without running out of RAM?
I have a server with 8 disk bays filled with 3 TB disks. Using 4 mirrored vdevs of 2 disks each, this gives me 12 TB of redundant storage.
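For reference, that layout corresponds to roughly the following (the /dev/sd* names here are just placeholders, not my actual devices):

    # hypothetical device names; a pool of four 2-way mirrors
    zpool create tank \
        mirror sda sdb \
        mirror sdc sdd \
        mirror sde sdf \
        mirror sdg sdh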
Here's the issue: I read somewhere that you need "x GB of RAM for each TB of deduped data" (paraphrasing). I stupidly took this to mean that if my pool contained mostly data that could not be deduplicated, it wouldn't use much RAM. To my dismay, it seems that by "deduped data" the author meant all of the data in a pool that has dedup enabled.
The result is that my system recently started locking up, presumably from running out of RAM, and needed to be reset. When I realized my mistake, I thought I could fix it by creating a new dataset with dedup disabled, copying all my data over to the new dataset, and then destroying the old dataset. Luckily I've only filled about 35% of my pool. Before attempting this, I disabled dedup on all my datasets.
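For the record, the migration I had in mind was roughly the following; the dataset names are made up for illustration and don't match my real layout:

    # turn dedup off pool-wide (this only affects newly written blocks)
    zfs set dedup=off tank
    zfs get -r dedup tank            # verify it is off on every dataset

    # create a replacement dataset with dedup explicitly disabled
    zfs create -o dedup=off tank/data-new

    # copy everything over, then drop the old, deduped dataset
    rsync -aHAX /tank/data/ /tank/data-new/
    zfs destroy -r tank/data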
Unfortunately, any time I attempt to delete something from the old dataset, all 16 threads on my system go to 100%, all 24 GB of RAM suddenly fills up (I can see this happening in htop), and then my system locks up.
Is there any way I can dig myself out of this hole without destroying my entire pool and starting over?
Solution 1:
I actually figured this out on my own just by fumbling around. My system was automatically mounting ZFS volumes at boot time. If I booted normally, it would freeze during boot with the text "A start job is running for Mount ZFS datasets..." or something to that effect. If I booted into rescue mode, it would boot fine and get me to a prompt, but ZFS would silently try to mount my datasets in the background, eventually locking up the machine after 10-15 minutes. This also prevented me from making any changes to my pool.
I got around this by disabling the systemd unit zfs-mount.service and rebooting into rescue mode. Now I could selectively mount datasets and make changes to my pool without locking up the machine.
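Concretely, that boiled down to something like this (the dataset name in the mount command is just an example):

    # keep ZFS from mounting everything automatically at boot
    systemctl disable zfs-mount.service

    # after rebooting into rescue mode, mount only what is safe
    zfs list -o name,canmount,mounted
    zfs mount tank/some-dataset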
I still haven't solved my problem though. Even though I've disabled dedup, copied all the data from my deduped dataset into a new one and deleted the old dataset, I still have a huge DDT:
    dedup: DDT entries 29022001, size 975 on disk, 315 in core

    bucket              allocated                       referenced
    ______   ______________________________   ______________________________
    refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
    ------   ------   -----   -----   -----   ------   -----   -----   -----
         1    27.7M   2.78T   2.78T   2.78T    27.7M   2.78T   2.78T   2.78T
         2    1.65K    191M    191M    191M    3.30K    382M    382M    383M
         4       71   4.27M   4.27M   4.39M      310   19.4M   19.4M   19.8M
         8      132   16.3M   16.3M   16.3M    1.18K    149M    149M    149M
        16        3   32.5K   32.5K     36K       51    537K    537K    600K
        4K        1     16K     16K     16K    6.61K    106M    106M    106M
      128K        1    128K    128K    128K     146K   18.3G   18.3G   18.3G
     Total    27.7M   2.78T   2.78T   2.78T    27.8M   2.80T   2.80T   2.80T
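(That histogram comes from the pool's status output; something like the following prints it, and zdb can give a more detailed breakdown if you want one:)

    # summary DDT histogram, as shown above
    zpool status -D tank

    # more detailed DDT statistics
    zdb -DD tank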
However, since I've figured out the "running out of RAM" part, I'll consider this problem solved and post a new question later if necessary.
Quick edit: My DDT appears to be shrinking, and quite fast at that. Perhaps it will shrivel up and disappear in due time. We shall see.
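(To watch it drain, re-running the status command periodically is enough; watch is just one convenient way to do that:)

    # re-check the DDT size every 30 seconds
    watch -n 30 'zpool status -D tank'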
Another quick edit: Awesome! The DDT shrank faster and faster until finally the command zpool status -D tank returned "dedup: no DDT entries".
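(As a final sanity check, the pool-wide dedup ratio should also be back to 1.00x:)

    # with the DDT gone, the pool-wide dedup ratio reads 1.00x again
    zpool get dedupratio tank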