Recovering a partition with no valid superblocks

My system had these partitions:

  • DellUtility
  • Windows 7
  • Ubuntu 14.04 (64bit) - extended
  • /home - extended
  • virtualbox - extended
  • swap - extended

(A Dell box with Ubuntu dual-boot and separate home / "other data" partitions)

Sequence of Events

  • Distro upgrade to 16.04.1, apparently successful
  • reboot and end up in Emergency Mode as described e.g. here
  • use of systemctl etc. was unsuccessful; a suggestion was using Upstart via GRUB, which booted to desktop
  • the next boot (into Emergency Mode) did not succeed; now the /home partition was unavailable
  • boot via USBDrive; /home still unavailable, listed as "unknown partition" by gparted
  • lengthy investigation via fdisk, testdisk etc. (following a wiki and a howto) showed that all superblocks were corrupted, with no valid backups. I'm not a recovery expert, but this seems unusual.
  • Make a testdisk image into "virtualbox" partition
  • Follow process described here without success, including last-ditch use of mke2fs -S.
  • testdisk deeper search finds the partition but not the superblocks, along with messages like No ext2, JFS, Reiser, cramfs or XFS marker
  • Tools like photorec can retrieve some files, but they are garbled, lack filenames and directory structure, and many are encrypted because this was a /home partition
  • Decide that, since I have a backup of the partition and the original data, I can delete and recreate the /home partition, which succeeds. Boot into "home-new" and set it up...
  • and on the next boot, imagine my surprise to find both "home-new" and the "virtualbox" partition are unavailable. Same superblock issue. Well, great.
  • Using USBDrive, re-re-create home partition in a different disk location; this appears stable. Delete "home-new".
  • Realise that the backup of /home, and some critical data, are on that now-unreadable partition; obtain a USB disk and make an image on there

Current Situation

My partitions are:

  • DellUtility (OK)
  • Windows 7 (OK)
  • Ubuntu 16.04 (OK)
  • /home-new (bad, deleted, in same disk location as original /home) - extended
  • /home (seems OK, at different disk location) - extended
  • virtualbox (bad) - extended
  • swap (OK) - extended

The "virtualbox" partition is ~550GB. It is unreadable and has no valid superblocks.

Important data on it:

  • a testdisk image of my original /home partition, ~50GB, itself unreadable with no valid superblocks
  • some original data from /home that wasn't handled by Crashplan. Not enough to be a critical problem, but enough to be annoying.

Note that the first boot after each change was fine; the problems only appeared on the following boot. The "virtualbox" partition has been backed up as a Testdisk image on a USB drive.

fdisk:

sudo fdisk -l /dev/sda
Disk /dev/sda: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: dos
Disk identifier: 0x15642c74

Device     Boot      Start        End    Sectors   Size Id Type
/dev/sda1  *            63      80324      80262  39.2M  6 FAT16
/dev/sda2         18395136  427995135  409600000 195.3G  7 HPFS/NTFS/exFAT
/dev/sda3        427995136 1920120831 1492125696 711.5G  f W95 Ext'd (LBA)
/dev/sda4       1920122880 1953523711   33400832  15.9G 82 Linux swap / Solaris
/dev/sda5        438233088  489433087   51200000  24.4G 83 Linux
/dev/sda6        773240832 1920120831 1146880000 546.9G 83 Linux
/dev/sda7        633921536  736321535  102400000  48.8G 83 Linux

Partition 1 does not start on physical sector boundary.
Partition table entries are not in disk order.

dumpe2fs:

sudo dumpe2fs /dev/sda6
dumpe2fs 1.42.13 (17-May-2015)
dumpe2fs: Bad magic number in super-block while trying to open /dev/sda6
Couldn't find valid filesystem superblock.

parted:

sudo parted -l
Model: ATA ST1000DM003-1CH1 (scsi)
Disk /dev/sda: 1000GB
Sector size (logical/physical): 512B/4096B
Partition Table: msdos
Disk Flags: 

Number  Start   End     Size    Type      File system     Flags
 1      32.3kB  41.1MB  41.1MB  primary   fat16           boot
 2      9418MB  219GB   210GB   primary   ntfs
 3      219GB   983GB   764GB   extended                  lba
 5      224GB   251GB   26.2GB  logical   ext4
 7      325GB   377GB   52.4GB  logical   ext4
 6      396GB   983GB   587GB   logical
 4      983GB   1000GB  17.1GB  primary   linux-swap(v1)

Hypothesis

As you can imagine, this is very frustrating. Note the "Partition 1 does not start on physical sector boundary." warning in fdisk, which did not appear before.

Going by "Partition does not start on physical sector boundary?", I suspect the 16.04 upgrade introduced some alignment issue that confuses low-level disk utilities, but that's just a suspicion.
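
The arithmetic behind that warning is easy to check by hand. A minimal sketch, assuming the 512-byte logical and 4096-byte physical sector sizes that fdisk reports above; is_aligned is a name made up for illustration:

```shell
# Report whether a partition's first sector falls on a physical-sector boundary.
# Arguments: start sector, logical sector size (bytes), physical sector size (bytes).
is_aligned() {
    start=$1 logical=${2:-512} physical=${3:-4096}
    if [ $(( start * logical % physical )) -eq 0 ]; then
        echo aligned
    else
        echo misaligned
    fi
}

is_aligned 63          # /dev/sda1 start sector from the fdisk listing
is_aligned 773240832   # /dev/sda6 start sector from the fdisk listing
```

Sector 63 is the traditional DOS starting sector, so sda1 has probably been misaligned since the factory install. Every other start sector in the fdisk listing is a multiple of 8, i.e. aligned.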

gdisk:

sudo gdisk -l /dev/sda
GPT fdisk (gdisk) version 1.0.1

Partition table scan:
  MBR: MBR only
  BSD: not present
  APM: not present
  GPT: not present


***************************************************************
Found invalid GPT and valid MBR; converting MBR to GPT format
in memory. 
***************************************************************

Disk /dev/sda: 1953525168 sectors, 931.5 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): 8C15FD44-F839-4637-853E-C092F0959C48
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 1953525134
Partitions will be aligned on 8-sector boundaries
Total free space is 209964007 sectors (100.1 GiB)

Number  Start (sector)    End (sector)  Size       Code  Name
   1              63           80324   39.2 MiB    0700  Microsoft basic data
   2        18395136       427995135   195.3 GiB   0700  Microsoft basic data
   4      1920122880      1953523711   15.9 GiB    8200  Linux swap
   5       438233088       489433087   24.4 GiB    8300  Linux filesystem
   6       773240832      1920120831   546.9 GiB   8300  Linux filesystem
   7       633921536       736321535   48.8 GiB    8300  Linux filesystem

Questions

  1. How can I identify the cause? My system now seems stable, but I am wary of further low-level disk writes.
  2. Is it possible to recover the superblock-less "virtualbox" partition - and then hopefully the superblock-less "home" partition image within?

Perhaps the superblocks are intact, but at an offset that the system isn't aware of.
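
One read-only way to test the offset idea is to look for the ext magic number directly. A rough sketch, relying on the ext2/3/4 layout where the primary superblock starts 1024 bytes into the filesystem and the magic 0xEF53 sits 56 bytes into it; ext_magic_at and its second argument (a byte offset to try) are my own illustration, not an existing tool:

```shell
# Print the two bytes where an ext2/3/4 superblock keeps its magic number.
# Primary superblock: 1024 bytes into the filesystem; s_magic: 56 bytes into
# the superblock, so the bytes of interest are at offset 1080.
# 0xEF53 is stored little-endian, so a healthy filesystem prints "53ef".
ext_magic_at() {
    dev=$1 base=${2:-0}    # base: hypothetical byte offset of the filesystem start
    dd if="$dev" bs=1 skip=$(( base + 1080 )) count=2 2>/dev/null |
        od -An -tx1 | tr -d ' \n'
}
```

Run as root against the partition, or as a normal user against an image file, e.g. ext_magic_at /path/to/image.dd 0 and then again with other candidate offsets. It only reads, so it should be safe to try on the damaged partition.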


Solution 1:

Have made some progress (and a partial answer):

Noticed that in some applications (e.g. System Monitor), at some point the swap became reported as 546.9GB. The swap should be 15.9GB, and 546.9GB is exactly the size fdisk reports for the broken "virtualbox" partition (/dev/sda6).

lsblk showed that /dev/sda6, the broken partition, was also mapped via cryptswap1 to swap.

/etc/crypttab had:

cryptswap1 /dev/sda6 /dev/urandom swap,cipher=aes-cbc-essiv:sha256

The smoking gun! So the hypothesis now is that the 16.04 upgrade failed to re-configure swap correctly, and on later boots the swap startup broke the partition (which would explain why first boots were successful).
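
For anyone wanting to check their own machine, here is a small sketch that lists which devices crypttab will turn into swap. It assumes the usual four-field crypttab layout (name, source device, key file, options); swap_targets is a made-up helper name:

```shell
# List crypttab entries that carry the "swap" option: mapping name and source device.
# Comment lines are skipped; the file defaults to /etc/crypttab.
swap_targets() {
    awk '!/^[[:space:]]*#/ && $4 ~ /(^|,)swap(,|$)/ { print $1, $2 }' "${1:-/etc/crypttab}"
}
```

Calling swap_targets with no argument reads /etc/crypttab. Each device it prints should be the real swap partition in fdisk -l. Here it printed /dev/sda6 (Id 83, Linux), while the actual swap is /dev/sda4 (Id 82).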

  • Disable swap everywhere (used technique in What to do about "the disk drive for /dev/mapper/cryptswap1 is not ready yet or not present"? )
  • Reboot and confirm swap-less state
  • Use testdisk to investigate. It attempts to mark active partitions as deleted and can't find the partition, so quit.
  • Confirm state unchanged with sudo dumpe2fs /dev/sda6
  • dumpe2fs: Bad magic number in super-block while trying to open /dev/sda6 Couldn't find valid filesystem superblock.
  • So use the last ditch method described above:
  • sudo /sbin/mkfs.ext4 -S -v /dev/sda6
  • sudo mount /dev/sda6 /media/USER/virtualbox-image/
  • ... and ls lists some of the files / directories!

Well, hooray!!
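
A note on why -S can work at all: it rewrites the superblocks and group descriptors where a fresh mkfs would put them, so it only helps if the block size and group size match the original filesystem. The sketch below computes where the backup superblocks should sit, assuming 4096-byte blocks, 32768 blocks per group and the sparse_super layout (I believe these are the defaults, but I have not verified them for this partition); backup_superblocks is a hypothetical helper:

```shell
# Print the block numbers of backup superblocks, assuming the sparse_super layout:
# copies live in group 1 and in groups that are powers of 3, 5 and 7.
backup_superblocks() {
    total_blocks=$1 per_group=${2:-32768}
    groups=$(( (total_blocks + per_group - 1) / per_group ))
    {
        [ "$groups" -gt 1 ] && echo 1
        for base in 3 5 7; do
            g=$base
            while [ "$g" -lt "$groups" ]; do
                echo "$g"
                g=$(( g * base ))
            done
        done
    } | sort -n | while read -r g; do
        echo $(( g * per_group ))
    done
}

# /dev/sda6: 1146880000 sectors of 512 bytes = 143360000 blocks of 4096 bytes
backup_superblocks 143360000 | head -n 5
```

With those assumptions the partition has 4375 block groups, which is at least consistent with fsck complaining about group descriptor 4349. Before resorting to -S, each candidate can be tried read-only with something like sudo e2fsck -n -b 32768 -B 4096 /dev/sda6, and mke2fs -n shows what a real mkfs would do without writing anything.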

I couldn't access the files, so ran fsck. There were so many errors that I gave up on preen and interactive modes and just used -y. Here are some fsck messages for reference:

  • Group descriptor 4349 checksum is 0xf6d0, should be 0x2ed1. FIXED.
  • /dev/sda6: Inode 13434881 is in use, but has dtime set. FIXED.
  • /dev/sda6: Inode 13434881 has an extra size (336) which is invalid FIXED.
  • /dev/sda6: Inode 13434881 has INDEX_FL flag set but is not a directory. HTREE INDEX CLEARED.
  • /dev/sda6: Inode 13434881, i_blocks is 137157068659908, should be 0. FIXED.
  • Inodes that were part of a corrupted orphan linked list found. Fix? yes
  • Inode 13434886 was part of the orphaned inode list. FIXED.
  • Inode 13434886 has imagic flag set. Clear? yes
  • Inode 13434886 has an extra size (62340) which is invalid Fix? yes
  • Inode 13434886 has compression flag set on filesystem without compression support. Clear? yes
  • Inode 13434886 has INDEX_FL flag set but is not a directory. Clear HTree index? yes
  • Inode 13434886, i_size is 18440780219561279704, should be 0. Fix? yes
  • Inode 13434886, i_blocks is 219803506189340, should be 0. Fix? yes
  • Inode 13495674 has a bad extended attribute block 21496064. Clear? yes
  • Inode 13495674 has illegal block(s). Clear? yes
  • Illegal block #0 (1376321536) in inode 13495674. CLEARED.
  • File /image_new_superblock.dd (inode #45605, mod time Wed Oct 28 12:58:24 2015) has 1 multiply-claimed block(s), shared with 1 file(s): ... (inode #13455772, mod time Thu Jul 4 03:48:32 1996) Clone multiply-claimed blocks? yes

Many screens of the above, timestamps all over the place. fsck actually aborted several times due to memory allocation, LOL. Eventually it ran clean, with messages:

  • Running additional passes to resolve blocks claimed by more than one inode...
  • Pass 1B: Rescanning for multiply-claimed blocks
  • Pass 1C: Scanning directories for inodes with multiply-claimed blocks
  • Pass 1D: Reconciling multiply-claimed blocks

Can now mount the partition and copy data. A lot of files still seem intact.

But there's more to do; this partition has been affected for 3 weeks. Presumably if I repeat this on the image made initially, I can recover more or nearly all the data. And still need to investigate "Partition 1 does not start on physical sector boundary." But, looking good!