How to tell rsync to skip files on a damaged hard drive block, instead of being stuck trying to read it

Solution 1:

Short answer: rsync is not the right tool for this case, and using it can even be harmful.
Use ddrescue instead (it is better than dd_rescue). It can do what you are asking for.


If the disk is physically damaged, any attempt to repair it in place may brick it.

This is not only a question of wasted time when rsync seems to hang forever on a damaged sector. The problem is that repeated read attempts can cause an irreparable failure, after which you will no longer be able to rescue your data without expensive parts replacement (assuming that is still possible and you have not bricked your HDD).

In this case the safest procedure I have found is:

  1. Create a raw image on another, healthy disk.
  2. Create a copy of that image.
  3. Work on the copy to fix the filesystem and rescue the files.

Why the copy? Because if something goes wrong in the filesystem-fixing step, you can always start again without touching the original damaged HDD.
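Steps 2 and 3 can be sketched roughly as follows. This is only a minimal sketch: the function name make_working_copy and the file names are hypothetical, and the check shown in the comment assumes the image holds an ext2/3/4 filesystem.

```shell
# Hypothetical helper: duplicate a rescued image and verify the copy
# before any repair tool is pointed at it.
make_working_copy() {
    image=$1    # raw image taken from the damaged disk
    copy=$2     # working copy that repair tools are allowed to modify

    # Never overwrite an existing file by accident.
    if [ -e "$copy" ]; then
        echo "refusing to overwrite $copy" >&2
        return 1
    fi

    cp "$image" "$copy" || return 1

    # cmp -s is silent and exits non-zero if the two files differ.
    cmp -s "$image" "$copy"
}

# Example (placeholder file names; e2fsck -n only checks, it changes nothing):
#   make_working_copy imagefile imagefile.work && e2fsck -n imagefile.work
```

If a repair attempt ruins imagefile.work, delete it and make a fresh copy from imagefile; the damaged disk is not read again.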

I suggest using ddrescue to make the raw disk image, defects included, because it works well even when there are read errors.


How to do it with ddrescue

You can use ddrescue exactly as you wanted to use rsync: it skips the damaged sectors without retrying or splitting them, and copies as much data as possible.
The command is below (replace /dev/hda1 with your device):

ddrescue --no-split /dev/hda1 imagefile logfile

After this first pass (the faster one), you can try to refine the image by retrying the failed areas up to 3 times.

ddrescue --direct --max-retries=3 /dev/hda1 imagefile logfile 

You can continue to refine the image by repeating the ddrescue invocation with other options, each time trying to extract more data (see the references). When you are finished, you can create the copy (if you have enough space) and then fix the filesystem on it.
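To judge whether another refining pass is worth it, you can total up what the logfile (called a mapfile in newer ddrescue documentation) records. The sketch below is an assumption-laden helper, not part of ddrescue: the name mapfile_summary is made up, and it assumes the layout described in the ddrescue manual, where lines starting with # are comments, the first remaining line is a status line, and every later line is "pos size status" with hexadecimal numbers, + meaning rescued and - meaning bad sector.

```shell
# Hypothetical helper: sum block sizes per status in a ddrescue logfile/mapfile.
# Assumed format: '#' comments, one status line, then "pos size status" rows.
mapfile_summary() {
    rescued=0
    bad=0
    pending=0
    status_line_seen=0

    while read -r pos size status _; do
        # Skip blank lines and comments.
        case $pos in
            ''|'#'*) continue ;;
        esac
        # The first non-comment line is the status line, not a data block.
        if [ "$status_line_seen" -eq 0 ]; then
            status_line_seen=1
            continue
        fi
        case $status in
            '+') rescued=$((rescued + $size)) ;;   # finished (rescued) block
            '-') bad=$((bad + $size)) ;;           # bad-sector block
            *)   pending=$((pending + $size)) ;;   # not tried, trimmed or scraped yet
        esac
    done < "$1"

    echo "rescued=$rescued bad=$bad pending=$pending"
}

# Example: mapfile_summary logfile
```

The totals are in bytes. If bad and pending stop shrinking between passes, further retries are unlikely to recover more.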

Note that the raw image will be as big as the original HDD.
You can find many questions and answers on the internet, on this and on other StackExchange sites, about how to rescue data with ddrescue or other tools.
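Because the raw image is as large as the source device (and the working copy needs the same amount again), it can help to check the free space on the destination before starting. A minimal sketch, assuming a POSIX df; the function name has_room_for is hypothetical:

```shell
# Hypothetical helper: succeed if the filesystem holding DIR has at least
# NEED_BYTES bytes available.
has_room_for() {
    need_bytes=$1
    dir=$2

    # df -P gives one line per filesystem, -k reports 1024-byte blocks;
    # column 4 of the second line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] || return 2

    # Round the requirement up to whole kilobytes.
    need_kb=$(( (need_bytes + 1023) / 1024 ))
    [ "$avail_kb" -ge "$need_kb" ]
}

# Example (blockdev is part of util-linux and usually needs root):
#   size=$(blockdev --getsize64 /dev/hda1)
#   has_room_for $((size * 2)) /mnt/rescue || echo "not enough space for image plus copy"
```

The path /mnt/rescue is only a placeholder for wherever the image will be written.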

References:

  • On "Forensics Wiki"
  • On "What's the difference between ddrescue, gddrescue, and dd_rescue?"
  • On internet "LINUX - dd_rescue VS ddrescue (gddrescue BEST)"

Solution 2:

I found this topic while searching for a way to save data from a failing SD card (Ubuntu 20.04).

Hastur's answer is awesome, although the commands don't work for me on Ubuntu 20.04.

Based on Grmpfhmbl's comment and ddrescue --help:

  • the --no-split flag was removed, so you can use --no-scrape instead

    -n, --no-scrape skip the scraping phase

ddrescue --no-scrape /dev/hda1 imagefile logfile

And the command to refine has changed, too.

  • use -d or --idirect for direct disk access for the input file

    -d, --idirect use direct disc access for input file

  • instead of --max-retries=3 we can use -r or --retry-passes=<n>

    -r, --retry-passes=<n> exit after <n> retry passes (-1=infinity) [0]

ddrescue --idirect --retry-passes=3 /dev/hda1 imagefile logfile 
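Since the option names differ between ddrescue versions, a script can pick the first-pass flag by looking at the help text of whatever version is installed. This is only a sketch: the function name pick_first_pass_flag is made up, and it assumes the help output mentions either --no-scrape (newer versions) or --no-split (older ones).

```shell
# Hypothetical helper: read `ddrescue --help` output on stdin and print the
# flag that skips the slow phase in that version.
pick_first_pass_flag() {
    help_text=$(cat)
    case $help_text in
        *--no-scrape*) echo "--no-scrape" ;;   # newer ddrescue
        *--no-split*)  echo "--no-split" ;;    # older ddrescue
        *) return 1 ;;                         # neither flag is listed
    esac
}

# Example (prints the command for review instead of running it):
#   flag=$(ddrescue --help | pick_first_pass_flag) &&
#       echo ddrescue "$flag" /dev/hda1 imagefile logfile
```

Printing the command first lets you check it before anything touches the failing device.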

I claim no credit for this post; it is just an update of Hastur's answer.