APFS: Snapshot is invalid
error: sibling_map_val
Let's break this down. First, according to the APFS spec (PDF):
Hard links that all refer to the same inode are called siblings. Each sibling has its own identifier thatʼs used instead of the shared inode number when siblings need to be distinguished. ... You use sibling links and sibling maps to convert between sibling identifiers and inode numbers. Sibling-link records let you find all the hard links whose target is a given inode. Sibling-map records let you find the target inode of a given hard link.
So the sibling_map is just like a spreadsheet with a couple of columns in it: a key that identifies one particular hard link (its sibling ID), and a value that holds the inode number of the file that link points to. In this case, the value for your ID is not the correct length, indicating it is corrupted.
Further, that corrupted data appears to be in an incomplete snapshot, so the solution is to delete that snapshot, which can be quite difficult.
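To make "siblings" concrete, here is a minimal sketch that creates two hard links and shows that they share one inode number. It only uses ordinary POSIX tools; the helper name inode_of and the file names are made up for illustration. It does not read the APFS sibling IDs themselves, which these tools don't expose.

```shell
#!/bin/sh
# inode_of is a hypothetical helper: print the inode number of a path via `ls -i`.
inode_of() {
    ls -di "$1" | awk '{print $1}'
}

demo_dir=$(mktemp -d)
echo "hello" > "$demo_dir/original.txt"
# Create a hard link (not a symlink): a second name for the same inode.
ln "$demo_dir/original.txt" "$demo_dir/sibling.txt"

# Both names report the same inode number, which is what makes them siblings.
echo "original: $(inode_of "$demo_dir/original.txt")"
echo "sibling:  $(inode_of "$demo_dir/sibling.txt")"

rm -r "$demo_dir"
```

The sibling map is what lets the file system go from one specific link back to that shared inode, so a malformed value there means a link can no longer be resolved reliably.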
Possible Solutions (Least to Most Destructive)
1. Delete Local Snapshot
Yes, you mentioned this, but it's an important first step. First, make sure you turn off Time Machine.
Is there any way to force-delete the snapshot?
Yes, and you may as well make a script of it because it's a frequent problem. Nearly always the culprit is the oldest snapshot or the one listed as dateless.
tmutil listlocalsnapshots /
The output of that command looks like this:
Snapshots for volume group containing disk /:
com.apple.TimeMachine.2020-04-01-122516.local
com.apple.TimeMachine.2020-04-01-132348.local
com.apple.TimeMachine.2020-04-01-143800.local
com.apple.TimeMachine.2020-04-01-153811.local
com.apple.TimeMachine.2020-04-01-183757.local
com.apple.TimeMachine.2020-04-01-193758.local
com.apple.TimeMachine.2020-04-01-203828.local
You just need to copy the timestamp from each line you want to kill and paste it into the next command. Again, deleting just the oldest (top) one will usually resolve related issues.
sudo tmutil deletelocalsnapshots 2020-04-01-122516
If successful, the command exits with status 0; depending on the macOS version, it may print a short confirmation or nothing at all.
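Since this comes up often, here is a rough sketch of the "make a script of it" idea. It assumes the listing format shown above (com.apple.TimeMachine.<timestamp>.local); the function names snapshot_dates and oldest_snapshot_date are made up for this example. The functions only parse text from stdin, so nothing is deleted until you run the tmutil line yourself.

```shell
#!/bin/sh
# Hypothetical helper: extract the YYYY-MM-DD-HHMMSS stamps from
# `tmutil listlocalsnapshots` output read on stdin. The header line and
# anything without a date stamp (e.g. a dateless entry) is skipped.
snapshot_dates() {
    sed -n 's/^com\.apple\.TimeMachine\.\([0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}-[0-9]\{6\}\)\.local$/\1/p'
}

# Hypothetical helper: print only the oldest stamp. The stamps sort
# chronologically when sorted as plain text.
oldest_snapshot_date() {
    snapshot_dates | sort | head -n 1
}

# Usage on a Mac (left commented out so nothing is deleted by accident):
#   oldest=$(tmutil listlocalsnapshots / | oldest_snapshot_date)
#   [ -n "$oldest" ] && sudo tmutil deletelocalsnapshots "$oldest"
```

A dateless entry won't match the pattern, so that one still has to be handled by hand.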
2. Delete the Offending Snapshot
WARNING: You should not proceed if you don't have a full backup of your drive. You could lose some data. You could lose all your data.
Boot into single user mode (reboot into recovery mode and enter commands in the terminal as root user) and try to find the location of the snapshot. Something like:
find / -name '*com.apple.apfs.purgatory.84779e*' 2>/dev/null # Totally untested
Once you find it, rm that file. If you are unable to locate the file, on to step 3.
3. Do a Safe Reinstall
While still in recovery mode, exit terminal and perform a reinstall of the operating system. This is "safe" in that only the system files are recreated. Your $HOME folder will be left in place, so if all goes smoothly, you shouldn't have to recover your hard drive from a backup.
Once finished, run fsck_apfs again to verify the issue is resolved. If not...
4. Do a Full Reinstall
Still in recovery mode, open Disk Utility and delete the partition on which the OS is installed. Doing this will delete all of your information. Recreate the partition (consider using HFS+ if you have frequent issues with APFS) just as it was before. Exit Disk Utility back into recovery mode.
Before doing a reinstall, use fsck_apfs on the new partition to verify that it doesn't come back with any errors containing the word physical. Any errors at this point likely indicate an issue with the hard drive itself, and it may need to be replaced. Examples of such errors include:
Unable to mark physical extent range
found physical extent corruption
Try repairing, of course, but if you aren't successful, consider replacing your drive.
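If you'd rather not eyeball the fsck_apfs output, something like the following sketch can pick out the worrying lines. The function name physical_errors is made up, and /dev/diskXsY in the usage comment is a placeholder for your own device identifier, not a real path.

```shell
#!/bin/sh
# Hypothetical helper: print every line of fsck output (read on stdin)
# that mentions "physical". Like grep, it exits non-zero when nothing matches.
physical_errors() {
    grep -i 'physical'
}

# Usage from the recovery terminal (-n checks without attempting repairs;
# replace diskXsY with your own identifier):
#   fsck_apfs -n /dev/diskXsY 2>&1 | physical_errors
```

No output and a non-zero exit status from the helper means no line mentioned "physical"; it says nothing about other kinds of errors, so still read the full report.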
Then proceed with an install just as you did in step 3, followed by running a recovery from your most recent backup.
Good luck.