Is there any way to speed up ddrescue?

I had a 500 GB HDD crash about 5 days ago. I ran ddrescue on the important partition a few days ago, and it has been on "Trimming failed blocks" for almost 2 days now.

Original command:

ddrescue -n /dev/rdisk1s2 /Volumes/OSXBackup/rdisk1s2.img /Volumes/OSXBackup/rdisk1s2.log

Current output:

Initial status (read from logfile)
rescued:   248992 MB,  errsize:   1007 MB,  errors:   15867
Current status
rescued:   249021 MB,  errsize:    978 MB,  current rate:    17408 B/s
   ipos:    44405 MB,   errors:   15866,    average rate:     2784 B/s
   opos:    44405 MB,     time from last successful read:       0 s
Trimming failed blocks...

The original command used the ddrescue -n parameter, and I have restarted the process a few times as needed (and it seemed to pick up right where it left off each time).

Is there any way to speed up this process?

Edit: Six hours later, this is the current status:

rescued:   249079 MB,  errsize:    920 MB,  current rate:      409 B/s
   ipos:    39908 MB,   errors:   15851,    average rate:     2698 B/s
   opos:    39908 MB,     time from last successful read:       0 s
Trimming failed blocks...

It appears that while "errors" counts down excruciatingly slowly, ipos/opos counts down how much data it still has to churn through. It dropped from 44405 MB to 39908 MB in six hours, which works out to roughly 750 MB/hour; at that rate, the remaining ~39908 MB will take about 53 more hours. Yikes.

Edit #2: Two days later, still running. However, there is hope. It has moved past the "Trimming failed blocks" portion and on to the next phase, "Splitting failed blocks". If anything, the takeaway from this question is that this definitely takes a long time when a good amount of data/errors is involved. My only hope is that I can successfully recover some important data when all is said and done.

rescued:   249311 MB,  errsize:    688 MB,  current rate:        0 B/s
ipos:    26727 MB,   errors:   15905,    average rate:     1331 B/s
opos:    26727 MB,     time from last successful read:      20 s
Splitting failed blocks...

Solution 1:

I observed that using the -n (no-split) option together with -r 1 (retry once) and setting -c (cluster size) to a smaller value can help.

My impression is that the splitting step is very slow because ddrescue splits and re-splits the damaged areas, which takes a lot of time since it tries to restore very small portions of data. So I prefer to use -n (no-split) together with -c 64, then -c 32, -c 16, and so on.
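
For example (hypothetical device and image paths; -c is the cluster size in sectors), a first pass and a follow-up pass might look like this:

ddrescue -n -r 1 -c 64 /dev/sdX /media/backup/rescue.img /media/backup/rescue.log
ddrescue -n -r 1 -c 32 /dev/sdX /media/backup/rescue.img /media/backup/rescue.log

Because both runs share the same logfile, the second pass only revisits the areas the first one left behind.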

The -n (no-split) option should probably always be used for a first pass, in both forward and reverse directions. It seems that the more the data has been split, the slower the cloning, although I'm not sure about this. I assume that the larger the untreated areas are, the better it is when running ddrescue again, because more contiguous sectors remain to be cloned.

As I'm using a logfile, I don't hesitate to cancel the command with Ctrl+C when the data read speed becomes too low.

I also use the -R (reverse) mode, and after a first pass it often gives me higher speeds reading backwards than forwards.
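
As a sketch (same hypothetical paths as above), a forward pass followed by a reverse pass reusing the same logfile might look like this:

ddrescue -n /dev/sdX /media/backup/rescue.img /media/backup/rescue.log
ddrescue -n -R /dev/sdX /media/backup/rescue.img /media/backup/rescue.log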

It's not clear to me how already-retried sectors (-r N) are handled when running the ddrescue command again, especially when alternating forward (default) and reverse (-R) cloning commands. I'm not sure whether the number of times they were tried is stored in the logfile; if it isn't, the work is probably repeated uselessly.

The -i (input position) flag can probably help speed things up too.
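
For instance (hypothetical byte offset and paths), you can tell ddrescue to start reading at a given input position, which is handy if you know where the interesting data begins:

ddrescue -n -i 40000000000 /dev/sdX /media/backup/rescue.img /media/backup/rescue.log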

Solution 2:

It can be very hard to see the progress of ddrescue, but there is another command included called ddrescuelog.

A simple command, ddrescuelog -t YourLog.txt, will output this useful information:

current pos:     2016 GB,  current status: trimming
domain size:     3000 GB,  in    1 area(s)
rescued:     2998 GB,  in 12802 area(s)  ( 99.91%)
non-tried:         0 B,  in    0 area(s)  (  0%)

errsize:     2452 MB,  errors:   12801  (  0.08%)
non-trimmed:   178896 kB,  in 3395 area(s)  (  0.00%)
non-split:     2262 MB,  in 9803 area(s)  (  0.07%)
bad-sector:    10451 kB,  in 19613 area(s)  (  0.00%)

You can even use it while ddrescue is running...
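
For example, wrapping it in watch refreshes the summary every minute alongside the running rescue (assuming the same YourLog.txt logfile as above):

watch -n 60 ddrescuelog -t YourLog.txt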

Solution 3:

If your aim is to obtain the bulk of the data intact, then you could speed up its extraction. But if you really want to rescue as much data as possible, then letting ddrescue nibble at each and every sector is the route to take.

Solution 4:

I have found that playing with the -K parameter can speed things up. From what I've seen, if ddrescue finds an error while running with the -n option, it tries to jump a fixed number of sectors; if it still can't read, it jumps double the size. If you have large damaged areas, you can specify a big -K value (for example 100M), so the jump on the first error will be larger and it will be easier to get past problematic areas quickly in the first pass.
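
A minimal sketch (hypothetical device and paths, assuming a ddrescue version where -K is the skip-size option, as this answer describes):

ddrescue -n -K 100M /dev/sdX /media/backup/rescue.img /media/backup/rescue.log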

By the way, there is a wonderful graphical application to analyze the log.

http://sourceforge.net/projects/ddrescueview/

Solution 5:

One more way to monitor ddrescue's progress (on Linux, at least) is through the use of strace.

First, find the PID for the ddrescue process using "ps aux | grep ddrescue":

root@mojo:~# ps aux | grep ddrescue
root     12083  0.2  0.0  15764  3248 pts/1    D+   17:15   0:04 ddrescue --direct -d -r0 /dev/sdb1 test.img test.logfile
root     12637  0.0  0.0  13588   940 pts/4    S+   17:46   0:00 grep --color=auto ddrescue

Then run "strace" against that process. You'll see something like:

root@mojo:~# strace -p 12083
Process 12083 attached - interrupt to quit
lseek(4, 1702220261888, SEEK_SET)       = 1702220261888
write(4, "\3101\316\335\213\217\323\343o\317\22M\346\325\322\331\3101\316\335\213\217\323\343o\317\22M\346\325\322\331"..., 512) = 512
lseek(3, 1702220261376, SEEK_SET)       = 1702220261376
read(3, "\3101\316\335\213\217\323\343o\317\22M\346\325\322\331\3101\316\335\213\217\323\343o\317\22M\346\325\322\331"..., 512) = 512
lseek(4, 1702220261376, SEEK_SET)       = 1702220261376
write(4, "\3101\316\335\213\217\323\343o\317\22M\346\325\322\331\3101\316\335\213\217\323\343o\317\22M\346\325\322\331"..., 512) = 512
^C

...and so on. The output is fast and ugly, so I then pipe it through "grep" to pull out just the lseek calls I care about:

root@mojo:/media/u02/salvage# nice strace -p 12083 2>&1|grep lseek
lseek(4, 1702212679168, SEEK_SET)       = 1702212679168
lseek(3, 1702212678656, SEEK_SET)       = 1702212678656
lseek(4, 1702212678656, SEEK_SET)       = 1702212678656
lseek(3, 1702212678144, SEEK_SET)       = 1702212678144
lseek(4, 1702212678144, SEEK_SET)       = 1702212678144
lseek(3, 1702212677632, SEEK_SET)       = 1702212677632
lseek(4, 1702212677632, SEEK_SET)       = 1702212677632
lseek(3, 1702212677120, SEEK_SET)       = 1702212677120
lseek(4, 1702212677120, SEEK_SET)       = 1702212677120
lseek(3, 1702212676608, SEEK_SET)       = 1702212676608
^C

In that example, the "1702212676608" equates to "the amount of data that still needs to be processed on that 2 TB disk you're trying to salvage." (Yeah. Ouch.) ddrescue is spitting out a similar number -- albeit as "1720 GB" -- in its screen output.

strace gives you a MUCH higher granularity data stream for you to examine; it's one more way to evaluate the speed of ddrescue and estimate a completion date.
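
As a rough sketch (assuming the PID 12083 from the example above, and that the offsets move monotonically during the current phase), you could sample the lseek offset twice and turn the difference into a bytes-per-second estimate:

off1=$(timeout 2 strace -p 12083 -e trace=lseek 2>&1 | grep -m1 -o '= [0-9]*' | tr -dc '0-9')
sleep 60
off2=$(timeout 2 strace -p 12083 -e trace=lseek 2>&1 | grep -m1 -o '= [0-9]*' | tr -dc '0-9')
echo "approx. $(( (off1 - off2) / 60 )) bytes/s"   # offsets fall during trimming; swap off1/off2 if the pass runs forwards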

Running it constantly is probably a bad plan since it would compete with ddrescue for CPU time. I've taken to piping it to "head" so I can grab the first 10 values:

root@mojo:~# strace -p 4073 2>&1 | grep lseek | head

Hope this helps someone.