Wrong dd command on main drive - How to recover data?
I assume the partition table and boot partition can be recreated easily, so I will focus on the ext4 partition.
The layout of the filesystem is somewhat dependent on the options used when creating it. I'll describe the common case. You can see if this matches yours by running `dumpe2fs` on the device (which will hopefully find all of the top-level metadata in cache rather than reading from disk).
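For example, a minimal check might look like this (assuming the damaged filesystem is `/dev/nvme1n1p2`, as discussed below; adjust the device name to match yours):

```
# Superblock and feature summary only
dumpe2fs -h /dev/nvme1n1p2

# Full dump including the per-group layout, saved for later reference
dumpe2fs /dev/nvme1n1p2 > ~/dumpe2fs-nvme1n1p2.txt
```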
The normal block size for ext4 filesystems is 4096 bytes, so you have lost 1024 blocks (4 MiB overwritten ÷ 4096 bytes per block).
The first thing overwritten was block 0, the primary superblock. This is not a problem by itself, because there are backup superblocks. After that is the group descriptor table, which also has backups within the filesystem.
Then there are block bitmaps and inode bitmaps. This is where the news starts to get slightly worse. If any of these are below block 1024, which they probably are, you've lost information about which inodes and blocks are in use. This information is redundant, and will be reconstructed by fsck based on what it finds traversing all the directories and inodes, if those are intact.
But the next thing is the inode table, and here you've probably lost a lot of inodes, including the root directory, journal, and other special inodes. It will be nice to have those back. Obviously the root directory at least is still functional, or just about all commands you try to run would be failing already.
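If you're curious whether the on-disk copy of the root inode survived (as opposed to the cached copy the still-running system is using), a read-only peek with `debugfs` is one way to check. This is just a sketch, again assuming `/dev/nvme1n1p2`:

```
# debugfs opens the device read-only by default; <2> is the root directory's inode number
debugfs -R 'stat <2>' /dev/nvme1n1p2
```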
If you run a `dd if=/dev/nvme1n1p2 of=/some/external/device bs=4096 count=1024` now, you'll get a backup copy of whatever is in your cache currently, mixed with the bad data for the blocks that aren't cached. Then after booting a rescue disk you can do the same `dd` in reverse, to put that partially-good data back on the disk, overwriting the all-bad stuff that's there now.
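The reverse step would look something like this, run from the rescue environment (the device names are the same placeholders as above and may differ under the rescue system, so double-check them before writing):

```
# From the rescue disk: write the saved 4 MiB back over the start of the partition
dd if=/some/external/device of=/dev/nvme1n1p2 bs=4096 count=1024
```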
After this you might find automated recovery tools (`fsck`, `testdisk`) work well enough. If not, you have a map you can use to help with manual recovery. Using the "free block" lists from `dumpe2fs`, you know which blocks to ignore.
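A rough sketch of pulling that map out (assuming `/dev/nvme1n1p2` again):

```
# Per-group lists of blocks and inodes that were free, i.e. not worth scanning for data
dumpe2fs /dev/nvme1n1p2 | grep -E 'Free (blocks|inodes):'
```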
Most of what you lost is probably inodes. It's actually fairly likely that you had no file contents in the first 4MB of disk. (I ran `mkfs.ext4` with no options on a 1TB image file, and the first non-metadata block turned out to be block 9249.)
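If you want to reproduce that experiment yourself (a sketch only; the exact first data block depends on your mke2fs version and defaults):

```
# Sparse 1 TB image; uses almost no real disk space
truncate -s 1T test.img

# Format with default options; some versions ask for confirmation on a regular file
mkfs.ext4 test.img

# The first free-block *range* of group 0 shows where non-metadata blocks begin
dumpe2fs test.img | grep -m1 'Free blocks: [0-9]*-'
```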
Every inode you manage to recover will identify the data blocks of a whole file. And those data blocks might be located all over the disk, not necessarily nearby.
Day 2
The dump posted on pastebin reveals great news:
```
Group 0: (Blocks 0-32767) csum 0x9569 [ITABLE_ZEROED]
  Primary superblock at 0, Group descriptors at 1-117
  Reserved GDT blocks at 118-1141
  Block bitmap at 1142 (+1142)
  Inode bitmap at 1158 (+1158)
  Inode table at 1174-1685 (+1174)
  21349 free blocks, 8177 free inodes, 2 directories, 8177 unused inodes
  Free blocks: 11419-32767
  Free inodes: 16-8192
```
Since we think only 4MB at the start of the filesystem have been overwritten, we only need to worry about blocks 0-1023. And the reserved GDT blocks go all the way out to block 1141! Everything you overwrote was redundant superblock and group-descriptor data; the block bitmap, inode bitmap, and inode table (blocks 1142, 1158, and 1174 onward) all lie beyond the damaged region. This is the kind of damage that should be repaired by a simple `e2fsck -b $backup_superblock_number` (after a reboot). You could at least try that with `-n` to see what it thinks.
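A minimal sketch of that, assuming the default 4096-byte block size (in which case the first backup superblock is normally at block 32768; `mke2fs -n` on the device will print the actual backup locations without changing anything):

```
# Dry run: -n answers "no" to every question, so nothing is written
e2fsck -n -b 32768 /dev/nvme1n1p2

# The real repair, from the rescue disk with the filesystem unmounted
e2fsck -b 32768 /dev/nvme1n1p2
```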
If the disk used GPT, the partition table should be recoverable by using the backup GPT data at the end of the disk. You can do this with `gdisk`; see the `gdisk` documentation on data recovery for details. In brief: When you launch `gdisk` on the disk, it will probably notice the damage and ask you if you want to use the backup GPT data or the MBR data. If you pick the GPT option and then write the changes, the partition table will be fixed. If `gdisk` does not ask which partition table to use, you might still be able to load the backup table using the `c` option on the recovery & transformation menu.
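Roughly, the interactive session I have in mind (prompts vary a little between gdisk versions):

```
# gdisk /dev/nvme1n1
r    # recovery and transformation menu
c    # load backup partition table from disk (rebuilding main)
p    # print the rebuilt table and sanity-check it
w    # write the repaired GPT to disk
```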
If this fails, you could still re-create the partition table (or at least, the partitions' start and end points) by using the data in the `/sys/block/nvme1n1/nvme1n1p1/start` and `/sys/block/nvme1n1/nvme1n1p1/size` files (and similarly for `/dev/nvme1n1p2`). If you resort to this data, though, it's imperative that you NOT shut down the computer, contrary to what hek2mgl advised. That said, hek2mgl is not wrong that continuing to use the disk in its current state runs the risk of making matters worse. Overall, I'd say the best compromise is to try to fix the partition table problem as quickly as possible and then shut down and fix the filesystem problem from an emergency disk.
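For what it's worth, capturing those numbers before anything else goes wrong could look like this (recording only; how you then feed them back into a partitioning tool such as sfdisk or gdisk is up to you, and the values should be double-checked before writing anything):

```
# Start sector and length (in 512-byte sectors) of each partition, straight from the kernel
for p in nvme1n1p1 nvme1n1p2; do
    echo "$p start=$(cat /sys/block/nvme1n1/$p/start) size=$(cat /sys/block/nvme1n1/$p/size)"
done | tee ~/partition-geometry.txt
```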
Unfortunately, your ESP is toast. Given your disk layout, I'm guessing you mounted the ESP at `/boot` and stored your kernels there. Thus, you'll need to re-install your kernel packages using a `chroot` or some other means. Ditto for your boot loader or boot manager.
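A rough outline of that chroot step from a live system, once the partition table and ext4 filesystem have been repaired (the mount points, ESP layout, and reinstall commands here are assumptions; adjust for your distro):

```
# Re-create the ESP filesystem, since its contents are gone anyway
mkfs.fat -F32 /dev/nvme1n1p1

# Mount the repaired root filesystem and the fresh ESP at /boot, matching the original layout
mount /dev/nvme1n1p2 /mnt
mount /dev/nvme1n1p1 /mnt/boot

# Bind the virtual filesystems and enter the chroot
for d in dev proc sys; do mount --bind /$d /mnt/$d; done
chroot /mnt

# Inside the chroot: reinstall the kernel package(s) and the boot loader,
# e.g. with your distro's package manager plus grub-install or bootctl install
```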