Recovering a partition with no valid superblocks
My system had these partitions:
- DellUtility
- Windows 7
- Ubuntu 14.04 (64bit) - extended
- /home - extended
- virtualbox - extended
- swap - extended
(A Dell box with Ubuntu dual-boot and separate /home and "other data" partitions)
Sequence of Events
- Distro upgrade to 16.04.1, apparently successful
- reboot and end up in Emergency Mode as described e.g. here
- use of systemctl etc. was unsuccessful; a suggestion was using Upstart via GRUB, which booted to desktop
- the next boot (into Emergency Mode) did not succeed; now the /home partition was unavailable
- boot via USBDrive; /home still unavailable, listed as "unknown partition" by gparted
- lengthy investigation via fdisk, testdisk etc. (following a wiki and a howto) showed that all superblocks were corrupted, with no valid backups available. I'm not a recovery expert, but this seems unusual.
- Make a testdisk image into "virtualbox" partition
- Follow process described here without success, including last-ditch use of
mke2fs -S
- testdisk deeper search finds the partition, but not the superblocks, along with messages like
No ext2, JFS, Reiser, cramfs or XFS marker
- Tools like photorec can retrieve some files, but they are garbled, lack filenames and directory structure, and many are encrypted because this was a /home partition
- Decide that, as I have a backup of the partition and of the original data, I can delete and re-create the /home partition, which succeeds. Boot into "home-new" and set it up...
- and on the next boot, imagine my surprise to find both "home-new" and the "virtualbox" partition are unavailable. Same superblock issue. Well, great.
- Using USBDrive, re-re-create home partition in a different disk location; this appears stable. Delete "home-new".
- Realise that the backup of /home, and some critical data, are on that now-unreadable partition; obtain a USB disk and make an image on there
Current Situation
My partitions are:
- DellUtility (OK)
- Windows 7 (OK)
- Ubuntu 16.04 (OK)
- /home-new (bad, deleted, in same disk location as original /home) - extended
- /home (seems OK, at different disk location) - extended
- virtualbox (bad) - extended
- swap (OK) - extended
The "virtualbox" partition is ~550GB. It is unreadable and has no valid superblocks.
Important data on it:
- a testdisk image of my original /home partition, ~50GB, itself unreadable with no valid superblocks
- some original data from /home that wasn't handled by Crashplan. Not enough to be a critical problem, but enough to be annoying.
Note that the first boot after each change was fine; the issues only appeared on the following boot. The "virtualbox" partition has been backed up as a Testdisk image on a USB drive.
fdisk:
sudo fdisk -l /dev/sda
Disk /dev/sda: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x15642c74
Device     Boot      Start        End    Sectors   Size Id Type
/dev/sda1  *            63      80324      80262  39.2M  6 FAT16
/dev/sda2         18395136  427995135  409600000 195.3G  7 HPFS/NTFS/exFAT
/dev/sda3        427995136 1920120831 1492125696 711.5G  f W95 Ext'd (LBA)
/dev/sda4       1920122880 1953523711   33400832  15.9G 82 Linux swap / Solaris
/dev/sda5        438233088  489433087   51200000  24.4G 83 Linux
/dev/sda6        773240832 1920120831 1146880000 546.9G 83 Linux
/dev/sda7        633921536  736321535  102400000  48.8G 83 Linux
Partition 1 does not start on physical sector boundary.
Partition table entries are not in disk order.
dumpe2fs:
sudo dumpe2fs /dev/sda6
dumpe2fs 1.42.13 (17-May-2015)
dumpe2fs: Bad magic number in super-block while trying to open /dev/sda6
Couldn't find valid filesystem superblock.
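For reference, dumpe2fs only reads the primary superblock unless told otherwise. With the sparse_super feature, ext2/3/4 keeps backup copies at the start of block group 1 and of groups that are powers of 3, 5 and 7. The sketch below lists where those backups would sit on /dev/sda6, assuming the mke2fs defaults of 4 KiB blocks and 32768 blocks per group, which have not been confirmed for this partition. The function name `backup_superblocks` is made up for illustration.

```shell
#!/usr/bin/env bash
# List the block numbers where ext2/3/4 backup superblocks would sit,
# assuming the sparse_super layout (group 1 and powers of 3, 5 and 7).
# Arguments: number of block groups, blocks per group (assumed 32768).
backup_superblocks() {
    local groups=$1 per_group=${2:-32768} base g
    {
        [ "$groups" -gt 1 ] && echo 1
        for base in 3 5 7; do
            g=$base
            while [ "$g" -lt "$groups" ]; do
                echo "$g"
                g=$((g * base))
            done
        done
    } | sort -n | while read -r g; do
        echo $((g * per_group))
    done
}

# /dev/sda6 has 1146880000 sectors of 512 bytes (from fdisk).
# ASSUMPTION: 4096-byte filesystem blocks, 32768 blocks per group.
blocks=$((1146880000 * 512 / 4096))
groups=$(( (blocks + 32767) / 32768 ))
backup_superblocks "$groups" 32768
```

Each candidate can then be tried without writing to the disk, for example with `e2fsck -n -b 32768 -B 4096 /dev/sda6`. Running `mke2fs -n` on the device prints the locations mke2fs itself would choose, also without writing anything.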
parted:
sudo parted -l
Model: ATA ST1000DM003-1CH1 (scsi)
Disk /dev/sda: 1000GB
Sector size (logical/physical): 512B/4096B
Partition Table: msdos
Disk Flags:
Number  Start   End     Size    Type      File system     Flags
 1      32.3kB  41.1MB  41.1MB  primary   fat16           boot
 2      9418MB  219GB   210GB   primary   ntfs
 3      219GB   983GB   764GB   extended                  lba
 5      224GB   251GB   26.2GB  logical   ext4
 7      325GB   377GB   52.4GB  logical   ext4
 6      396GB   983GB   587GB   logical
 4      983GB   1000GB  17.1GB  primary   linux-swap(v1)
Hypothesis
As you can imagine, this is very frustrating. Note the "Partition 1 does not start on physical sector boundary." warning in fdisk, which did not appear before.
Going by the question "Partition does not start on physical sector boundary?", I suspect there is some alignment issue, introduced by the 16.04 upgrade, which confuses low-level disk utilities, but that's just a suspicion.
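The warning itself comes down to simple arithmetic: with 512-byte logical and 4096-byte physical sectors, a partition is aligned only when its start sector is a multiple of 8. A quick sketch that checks the start sectors from the fdisk listing (the function name `is_aligned` is made up for illustration):

```shell
#!/usr/bin/env bash
# Succeed if a start sector falls on a physical sector boundary.
# Arguments: start sector, logical sector size, physical sector size.
is_aligned() {
    local start=$1 logical=${2:-512} physical=${3:-4096}
    [ $(( start * logical % physical )) -eq 0 ]
}

# Start sectors copied from the fdisk output, sda1 through sda7.
for start in 63 18395136 427995136 1920122880 438233088 773240832 633921536; do
    if is_aligned "$start"; then
        echo "$start aligned"
    else
        echo "$start NOT aligned"
    fi
done
```

Only the first value, the start of /dev/sda1, is reported as not aligned. The damaged /dev/sda6 starts on a boundary, so misalignment alone seems an unlikely explanation for the corruption.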
gdisk:
sudo gdisk -l /dev/sda
GPT fdisk (gdisk) version 1.0.1
Partition table scan:
MBR: MBR only
BSD: not present
APM: not present
GPT: not present
***************************************************************
Found invalid GPT and valid MBR; converting MBR to GPT format
in memory.
***************************************************************
Disk /dev/sda: 1953525168 sectors, 931.5 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): 8C15FD44-F839-4637-853E-C092F0959C48
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 1953525134
Partitions will be aligned on 8-sector boundaries
Total free space is 209964007 sectors (100.1 GiB)
Number  Start (sector)    End (sector)  Size        Code  Name
   1              63           80324   39.2 MiB    0700  Microsoft basic data
   2        18395136       427995135   195.3 GiB   0700  Microsoft basic data
   4      1920122880      1953523711   15.9 GiB    8200  Linux swap
   5       438233088       489433087   24.4 GiB    8300  Linux filesystem
   6       773240832      1920120831   546.9 GiB   8300  Linux filesystem
   7       633921536       736321535   48.8 GiB    8300  Linux filesystem
Questions
- How can I identify the cause? My system now seems stable, but I am wary of further low-level disk writes.
- Is it possible to recover the superblock-less "virtualbox" partition - and then hopefully the superblock-less "home" partition image within?
Perhaps the superblocks are intact, but at an offset that the system isn't aware of.
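One way to test the offset idea without writing anything to the disk: an ext2/3/4 primary superblock starts 1024 bytes into the filesystem, and its magic number 0xEF53 sits 56 bytes into the superblock, stored little-endian as the bytes 53 ef. The sketch below reads those two bytes at a chosen offset. The name `has_ext_magic` and the scratch file are made up for illustration; against a real device this would need root and should stay strictly read-only.

```shell
#!/usr/bin/env bash
# Check for the ext2/3/4 magic (0xEF53, on disk as bytes "53 ef"),
# assuming a superblock begins at byte offset $2 of file or device $1.
has_ext_magic() {
    local target=$1 sb_offset=${2:-1024} magic
    magic=$(dd if="$target" bs=1 skip=$((sb_offset + 56)) count=2 2>/dev/null \
            | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "53ef" ]
}

# Demo on a scratch file, not a real disk: plant the magic where a
# primary superblock would keep it, then probe two offsets.
scratch=$(mktemp)
dd if=/dev/zero of="$scratch" bs=1024 count=8 2>/dev/null
printf '\123\357' | dd of="$scratch" bs=1 seek=$((1024 + 56)) conv=notrunc 2>/dev/null

has_ext_magic "$scratch" 1024 && echo "magic found at 1024"
has_ext_magic "$scratch" 4096 || echo "no magic at 4096"
rm -f "$scratch"
```

Looping this over a range of candidate offsets on the partition or on an image of it would show whether a superblock survives somewhere unexpected. A hit is only a hint, since the same two bytes can occur by chance in ordinary data.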
Solution 1:
Have made some progress (and a partial answer):
Noticed that in some applications (e.g. System Monitor), at some point the swap became reported as 546.9GB. The swap should be 15.9GB, and 546.9GB is suspiciously close to the size of the broken "virtualbox" partition.
lsblk
showed that /dev/sda6 - the "virtualbox" partition - was also mapped via cryptswap1 to swap.
/etc/crypttab had:
cryptswap1 /dev/sda6 /dev/urandom swap,cipher=aes-cbc-essiv:sha256
The smoking gun! So the hypothesis now is that the 16.04 upgrade failed to re-configure swap correctly, and on later boots the swap startup overwrote part of the partition (which would explain why first boots were successful).
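A quick way to spot this kind of mismatch is to list every crypttab entry that carries the swap option and check by eye that each source device really is the swap partition. A minimal sketch; `crypttab_swap_sources` is a made-up name, and the demo reads a scratch copy of the line above rather than the real /etc/crypttab:

```shell
#!/usr/bin/env bash
# Print "name source" for each crypttab entry whose option field
# (4th column) includes "swap". Such a device is set up as swap at
# boot, so pointing one at a data partition overwrites that data.
crypttab_swap_sources() {
    awk '
        /^[ \t]*#/ { next }              # skip comment lines
        NF >= 4 {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "swap") { print $1, $2; break }
        }
    ' "$1"
}

# Demo on a scratch file holding the entry quoted above.
demo=$(mktemp)
cat > "$demo" <<'EOF'
# comment line, skipped
cryptswap1 /dev/sda6 /dev/urandom swap,cipher=aes-cbc-essiv:sha256
EOF
crypttab_swap_sources "$demo"
rm -f "$demo"
```

For the entry above this prints `cryptswap1 /dev/sda6`, while fdisk lists the actual swap partition as /dev/sda4.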
- Disable swap everywhere (used technique in What to do about "the disk drive for /dev/mapper/cryptswap1 is not ready yet or not present"? )
- Reboot and confirm swap-less state
- Use testdisk to investigate. It attempts to mark active partitions as deleted and can't find the partition, so quit.
- Confirm state unchanged with
sudo dumpe2fs /dev/sda6
dumpe2fs: Bad magic number in super-block while trying to open /dev/sda6
Couldn't find valid filesystem superblock.
- So use the last ditch method described above:
sudo /sbin/mkfs.ext4 -S -v /dev/sda6
sudo mount /dev/sda6 /media/USER/virtualbox-image/
- ... and ls lists some of the files / directories!
Well, hooray!!
I couldn't access the files, so ran fsck. There were so many errors I gave up with preen and interactive modes and just used -y. Here are some fsck messages for reference:
- Group descriptor 4349 checksum is 0xf6d0, should be 0x2ed1. FIXED.
- /dev/sda6: Inode 13434881 is in use, but has dtime set. FIXED.
- /dev/sda6: Inode 13434881 has an extra size (336) which is invalid FIXED.
- /dev/sda6: Inode 13434881 has INDEX_FL flag set but is not a directory. HTREE INDEX CLEARED.
- /dev/sda6: Inode 13434881, i_blocks is 137157068659908, should be 0. FIXED.
- Inodes that were part of a corrupted orphan linked list found. Fix? yes
- Inode 13434886 was part of the orphaned inode list. FIXED.
- Inode 13434886 has imagic flag set. Clear? yes
- Inode 13434886 has an extra size (62340) which is invalid Fix? yes
- Inode 13434886 has compression flag set on filesystem without compression support. Clear? yes
- Inode 13434886 has INDEX_FL flag set but is not a directory. Clear HTree index? yes
- Inode 13434886, i_size is 18440780219561279704, should be 0. Fix? yes
- Inode 13434886, i_blocks is 219803506189340, should be 0. Fix? yes
- Inode 13495674 has a bad extended attribute block 21496064. Clear? yes
- Inode 13495674 has illegal block(s). Clear? yes
- Illegal block #0 (1376321536) in inode 13495674. CLEARED.
- File /image_new_superblock.dd (inode #45605, mod time Wed Oct 28 12:58:24 2015) has 1 multiply-claimed block(s), shared with 1 file(s): ... (inode #13455772, mod time Thu Jul 4 03:48:32 1996) Clone multiply-claimed blocks? yes
Many screens of the above, with timestamps all over the place. fsck actually aborted several times due to memory allocation failures, LOL. Eventually it ran clean, with messages:
- Running additional passes to resolve blocks claimed by more than one inode...
- Pass 1B: Rescanning for multiply-claimed blocks
- Pass 1C: Scanning directories for inodes with multiply-claimed blocks
- Pass 1D: Reconciling multiply-claimed blocks
Can now mount the partition and copy data. A lot of files still seem intact.
But there's more to do; this partition has been affected for 3 weeks. Presumably if I repeat this on the image made initially, I can recover more or nearly all the data. And still need to investigate "Partition 1 does not start on physical sector boundary." But, looking good!