What kind of periodic maintenance should I carry out on HDD backup?
Since it seems to have been missed by most posters here, this is my recommended answer to the specifics of your question, using this excellent post, What medium should be used for long term, high volume, data storage (archival)?, as the guide. I won't re-cite the references and research from there, as he did an excellent job, and reading the whole post is better than any summary for this case.
Limiting yourself to one HDD in cold storage (offline), with the two options given you should connect the drive every couple of years, or thereabouts, and spin it up. The biggest reason for doing so is to keep the spindle grease from hardening and seizing. The grease will harden over time, and spinning the disk once in a while can significantly delay that eventuality. If you want some insight into how important the grease is to a HDD, look at the amount of effort that Minebea, a HDD motor manufacturer, puts into its research on it in this report.
While the disk is connected, you may as well run some SMART diagnostics to look for signs of impending failure of the electronics, the mechanics, or the platters. According to the research presented at FAST '07 by Google and Carnegie Mellon University (winning 'Best Paper' that year), the SMART data can be indicative of failure, but a 'passing' test is not necessarily indicative of good health. Nevertheless, checking won't hurt. Yes, it is old research, but nobody seems to have replaced it with anything newer.
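On Linux, for example, assuming smartmontools is installed (the device name below is just a placeholder), that check might look like:

    # Start an extended (long) self-test; smartctl prints the estimated duration
    sudo smartctl -t long /dev/sdX

    # After the test has had time to finish, review the attributes and the self-test log
    sudo smartctl -a /dev/sdX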
Having the drive running for a while, and accessing the data, will also renew the strength of the magnetic fields holding that data. Some will argue that it is not necessary, based on plenty of anecdotal evidence, but what research there is seems to indicate that weakening of the magnetic fields is possible. I present three papers from the University of Wisconsin-Madison: Parity Pollution, Data Corruption, and Disk-Pointer Corruption. After reading these you can decide how much their conclusions threaten your data, and how much effort it is worth to protect against it.
Suggested curation routine
I don't know what OS you use, what tools you have or prefer, nor what file system you choose. Therefore my suggestions will be generic only, allowing you to choose the tools that best fit your configuration and preferences.
First is the setup for storage. Before saving the files to the HDD, create archives of them. This doesn't imply compression, nor does it rule it out. Choose an archive format that gives you error recovery or 'self-healing' abilities. Don't create one massive archive; rather, archive things that belong together, creating a library of archives. If you choose compression, be sure that it doesn't interfere with the error recovery ability. For most music, video, movie, and picture formats there is no point in compressing: such file formats are already compressed, and trying to compress them rarely gains space, sometimes creates larger files, and wastes your time and CPU power in the bargain. Still, archive them for the error recovery above.

Then create a check-sum for each archive file, using the digest algorithm of your choice. Security isn't the issue here, merely a sanity check for the file, so MD5 should suffice, but anything will work. Save a copy of the check-sums with the archive files, and in a second place on the same HDD - perhaps a dedicated directory for the total collection of check-sums. All this is saved to the disk.

Next, and quite important, also save on that HDD the tools you used to create the check-sums and to restore the archives (and to uncompress them as well, if you used compression). Depending on your system this could be the programs themselves, or it might need to be the installers for them. Now you can store the HDD how you choose.
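As a concrete sketch of that preparation on Linux - assuming you pick par2 as the error-recovery layer and MD5 for the sanity checks, with placeholder file names - it could look like this:

    # Bundle related files into one archive per group (no compression for media files)
    tar -cf photos-2019.tar photos/2019/

    # Add ~10% recovery data so small amounts of corruption can be repaired later
    par2 create -r10 photos-2019.par2 photos-2019.tar

    # Record a check-sum for the archive...
    md5sum photos-2019.tar > photos-2019.tar.md5

    # ...and keep a second copy of every check-sum in one dedicated directory
    mkdir -p /mnt/backup/checksums
    cp photos-2019.tar.md5 /mnt/backup/checksums/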
Second is the storage. Current HDDs are reasonably well protected from physical shock (shaking and bouncing), but there's no point in pushing it either. Store it pretty much the way you have mentioned in your question. I would add: try to avoid areas where it is likely to be subjected to electro-magnetic forces. Not in the same closet as your circuit breaker panel or above your HAM radio, for example. Lightning miles away is something you can't avoid, but the vacuum cleaner and the power saw are avoidable. If you want to get extreme, get a Faraday shield or Faraday bag for it. Of your suggestions, two are either pointless or bad: changing its physical position while it's stored will not affect anything that matters, and shaking it could cause damage - it shouldn't, as most drives have good G-shock protection, but it is possible.
Last are the periodic measures. On a schedule you choose - annually or bi-annually, for example - remove the drive from storage and reconnect it to the computer. Run the SMART test, and actually read the results. Be prepared to replace the disk when the SMART results show you should: not "next time," but "this time." While it's connected, check all the archive files against their check-sums. If any fail the check, try to use the archive format's error recovery abilities to restore that file, then recreate the archive and its check-sum and resave it.

Since you also gave option 2 as having a "nice amount" of free space, copy the archives to new directories and then delete the originals. Simply "moving" them may not move them at all: on most file systems, moving a file only changes which directory it is listed in, while the file contents stay where they are. By copying the file you force it to be written somewhere else, and then you can free up the space by deleting the original. If you have many archive files, none is likely to be so large as to fill the free space on the HDD. After you have verified or restored all the files, and moved any you choose to, restore your packaging and put the drive back in storage until next time.
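A minimal sketch of that periodic pass, assuming the md5sum-style check-sum files from above and placeholder paths:

    # Check drive health first, and actually read the output
    sudo smartctl -a /dev/sdX

    # Verify every archive against its stored check-sum
    cd /mnt/backup
    md5sum -c checksums/*.md5

    # Re-write an archive to fresh sectors: copy it, verify the copy, then delete the original
    mkdir -p refreshed
    cp photos-2019.tar refreshed/
    (cd refreshed && md5sum -c ../checksums/photos-2019.tar.md5)
    rm photos-2019.tar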
Extra things to pay attention to: when you upgrade your system or, worse, switch to a different OS, make sure you still have the ability to read that HDD in the new configuration. If you have anything that is not plain text, make sure that you don't lose the ability to read the file as saved. For example, MS Word documents can contain equations created in an older format that newer versions cannot read; see this for that very problem. Word isn't the only possible source of trouble, however, and not even open-source formats guarantee that your data is future-proof. For a major blunder in this realm, read about the failed Digital Domesday Book project. As new technologies appear, consider updating your collection as well. If you have movies saved as AVI files and you like MKV better, convert them. If you have word processing documents and upgrade your program, resave the archived ones in the new format.
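For the AVI-to-MKV case, a container change usually doesn't even require re-encoding; assuming ffmpeg is available and the streams inside the AVI are ones MKV supports (they usually are), a remux is enough:

    # Copy the existing audio/video streams into a Matroska container without re-encoding
    ffmpeg -i movie.avi -c copy movie.mkv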
From a professional point of view, your options are:
- Pray.
- Make multiple copies, on multiple devices.
In your "option 1" (much more space) you could very marginally increase your odds by making multiple copies on the same hardware, but the fact is that hardware fails, not infrequently rendering the whole disk unreadable. A single copy is not a viable backup strategy.
I'm unclear if this is an actual backup (of files still on a primary device) or an archive (of files removed from the primary device). The extra copy is somewhat more important if you care at all about the archive case - in the backup case there is, in theory, a primary copy, so you need at least two failures before you are totally out of luck.
If you have more free space than the backup data uses - your option 1 in the question - or if you have multiple copies of the data, I've got an idea that would "do something"; if you think SpinRite really helps with hard drive "maintenance" and/or want to completely overwrite and then re-write every bit of your data, this would do it.
Whether you should do something or not, I'm not too sure... bit-rot or Data Degradation seems to really exist, and questions like this one here on superuser and this one on serverfault seem to advise backups or maybe an error-correcting or fault-tolerant RAID (but for only a single hard drive I'd pick multiple backups & hash/CRC checks & not worry about what to do if a RAID fails).
I'm leaning towards the simpler and lazier "do-nothing" approach, but the following is at least a good "make sure I can still read my data once a year, and might as well re-write it too" idea.
Linux DIY Emulation of some SpinRite maintenance features
Lots of people seem convinced that SpinRite really works, but it's not free and I run Linux, so I've listened to Steve Gibson's HOW does SpinRite work? video and he says that one of the things SpinRite does now is:
- Reads the entire drive
- Flips the bits & writes them
- Reads them again
- Flips the bits back & writes them
- Reads them again
If the drive finds any (minor) problems, this should "induce the drive itself to swap the bad sectors with good ones."
How often should you do this? Steve says "no one really knows how often that is, but every few months should be often enough". I'm just guessing every 6 months or every year or so.
badblocks
The reading/flipping/reading/flipping process sounds nearly identical to what badblocks does when it uses its write-mode testing (the -w option), except it doesn't really "bit-flip" your data; it destructively writes, reads & flips all the bits on the partition:
With this option, badblocks scans for bad blocks by writing some patterns (0xaa, 0x55, 0xff, 0x00) on every block of the device, reading every block and comparing the contents.
Not coincidentally, those patterns are, in binary: 10101010, 01010101, 11111111, 00000000.
So badblocks writes, reads & flips bits pretty thoroughly, and it's free too. If you have mke2fs run badblocks for you (with mke2fs -cc), it'll save the list of bad blocks so ext2/3/4 will avoid them, if any were found.
The downside is badblocks' write testing is destructive, so you'll need at least two partitions for this to work (to save & write back your data).
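For reference, a minimal run of that destructive test on Linux might look like this (the partition name is a placeholder, and everything on it will be erased):

    # Destructive write-mode test: writes the 0xaa, 0x55, 0xff, 0x00 patterns to every
    # block, reads each one back, and logs any blocks that don't match
    sudo badblocks -wsv -o badlist.txt /dev/sdX2

    # When recreating the file system, tell ext4 to avoid any blocks found above
    sudo mke2fs -t ext4 -l badlist.txt /dev/sdX2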
- Keep two copies of your data on the hard drive, each on DIFFERENT PARTITIONS. This lets you overwrite every bit on a single partition with 10, 01, 11, 00 and doubles your recovery chances if bad areas develop.
- Keep a list of checksums/hashes for your data files, like CRC32 or MD5 (though MD5/SHA is very slow compared to CRC, and random errors shouldn't be missed by CRC).

Every few months:

- Read your backup copies & verify they still match the checksums/hashes.
- "Pseudo"-bit-flip a partition with badblocks -w or mke2fs -cc (only ONE partition - do not overwrite all your data, just one copy!)
- Copy your data back onto the freshly flipped partition.
- "Pseudo"-bit-flip the other partition (the one that hasn't been flipped yet).
- Copy your data back onto that freshly flipped partition.
This is similar to just reformatting & copying your data back, but a quick/standard format won't usually write to every sector, so you may end up not changing/flipping many of the bits.
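Here is one way the whole cycle could look in practice, assuming two data partitions mounted at placeholder paths (/mnt/copy1 and /mnt/copy2 on /dev/sdX1 and /dev/sdX2) and md5sum-style hash lists; treat it as a sketch under those assumptions, not a tested script:

    # 1. Verify both copies against the stored hashes
    (cd /mnt/copy1 && md5sum -c checksums.md5)
    (cd /mnt/copy2 && md5sum -c checksums.md5)

    # 2. Flip every bit on the second partition only (this erases that copy)
    sudo umount /mnt/copy2
    sudo badblocks -wsv /dev/sdX2
    sudo mke2fs -t ext4 /dev/sdX2
    sudo mount /dev/sdX2 /mnt/copy2

    # 3. Re-copy the data onto the freshly flipped partition and re-verify it
    sudo cp -a /mnt/copy1/. /mnt/copy2/
    (cd /mnt/copy2 && md5sum -c checksums.md5)

    # 4. Next time around, swap the roles of the two partitions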
The best solution is always multiple copies on multiple devices.
I've read that optical media could be readable for 10, 20, maybe even 50+ years, and two identical discs/ISOs would fit well with gddrescue (below).
Cloud storage is often free for a few GBs, so storing files there (optionally encrypted) may be a good idea, especially if the free amounts keep going up.
Also, saving your files in an error-correcting archive may help if any errors do turn up, but losing one file out of a million may not be as bad as losing a whole archive of a million files. If any separate error-correcting software existed, like an ECC-CRC, that could help, but I don't know of any, and an extra copy of the data would be even better.
Tangentially related, SpinRite also "tries very hard" to read data from a bad sector of a hard drive, reading from different directions & velocities, which also sounds very similar to gddrescue, in case (or when) you do run into trouble reading your data. gddrescue can also read from two copies of data with errors and hopefully piece together one full good copy. I'm tempted to suggest making two (or more) identical copies of your data partition with dd, but then if badblocks does find any bad sectors, you couldn't avoid them, since doing so would change the identical copies.
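If you ever do hit read errors, a typical GNU ddrescue invocation (the Debian/Ubuntu package is named gddrescue, the binary is ddrescue; device and file names are placeholders) would be:

    # First pass: copy everything that reads cleanly, recording bad areas in a map file
    sudo ddrescue -d /dev/sdX1 rescued.img rescue.map

    # Second pass: retry only the bad areas a few more times, reusing the same image and map
    sudo ddrescue -d -r3 /dev/sdX1 rescued.img rescue.map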