What's a good setup to back up macOS data and restore to Linux?

Background

I have an old MBP which works fine right now. I have no immediate plans to change it.

But it is getting old, the discrete GPU died, and I expect it could die at any time. I would like to keep my options open to move to a Linux laptop when this machine dies. Most of the software I use is either open source or runs on Linux as well, and I already rely heavily on Terminal/bash. Given that, and not wanting to be tied to macOS for lack of a solid cross-platform backup, here's my current backup strategy to allow either a Mac or a Linux-based machine to be the next computer for my current data.

modified backups

  • keep frequent backups, but rather than relying only on Time Machine, back up /Users/ via rsync as well. (What filesystem is my best bet here?)

  • run preliminary trials on restoring the rsync backups into a test Linux VM.

As you can see, I haven't decided which filesystem I should rsync to: Time Machine backs up to HFS+, and since I haven't picked a Linux distribution it's not clear whether I want to (or can) mount HFS+ to read the backup data.

What setup allows me to restore my data that is backed up with rsync to any Linux?


Solution 1:

Use the cloud to store your data.

I actively use three different operating systems: macOS, Windows, and BSD (I do use Linux from time to time, but generally avoid it). To access my data on each of these platforms, I use the cloud (a hybrid cloud to be exact).

This can range from something as simple as a OneDrive or Dropbox account plus a USB external drive for backups, to a highly integrated hybrid cloud consisting of cloud storage, a local NAS with its own cloud sync clients, and external backup.

The point is: separate your data from your "operations." In other words, become platform agnostic. Because, when you structure your data this way, regardless of what you move to, your data will be ready.

Simple Cloud

If you were to use OneDrive or Dropbox, for example, you could store all of your documents in the cloud. Dropbox has official clients for both macOS and Linux; OneDrive has no official Linux client, though third-party tools such as rclone can sync with it, so check client support for whichever service you pick.

As for backups, both operating systems have native backup software (e.g. Time Machine on macOS) that allows you to efficiently back up your machine and data to an external USB or NAS device.

Hybrid Cloud

I am a huge fan of Synology NAS devices. I store my data on cloud services (like OneDrive) and sync it via Synology's cloud sync software back to my Synology. My Mac (and my Windows and BSD machines) all have external USB drives for backup and my NAS has its own external drive(s) for backup (I have two).

Is it overkill? Probably. However, I haven't lost data in over 15 years. (Before cloud storage came out, I was doing this strictly with a NAS: NFS, SMB, AFP, etc.)

Some notes

  • I keep the data files of utilities like KeePass (a cross-platform password database) in the cloud so all of my apps can access them from any device, anywhere.

  • I use local incremental or snapshot backups so I can easily restore data should I type the incorrect rm command.

  • I keep my data and backups in multiple places - it's in the cloud, it's synced to three machines (where applicable) which are locally backed up, it's synced to a NAS which itself is locally backed up.

  • I back up my "settings" (e.g. .bash_profile, .ssh/config, .ssh/authorized_keys, etc.) to the cloud.
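The last note above could be sketched roughly as follows. This is only an illustration, not the setup actually used in this answer: the file list and the ~/Dropbox/settings destination are assumptions, and copy_settings is a hypothetical helper name. It copies each settings file that exists into a synced folder, keeping its path relative to the home directory.

```python
import shutil
from pathlib import Path

# Hypothetical list of settings files, relative to the home directory.
DOTFILES = [".bash_profile", ".ssh/config", ".ssh/authorized_keys"]


def copy_settings(home: Path, sync_dir: Path, names=DOTFILES) -> list[Path]:
    """Copy each existing settings file into sync_dir, keeping its relative path."""
    copied = []
    for name in names:
        src = home / name
        if not src.is_file():
            continue  # skip files that do not exist on this machine
        dest = sync_dir / name
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # copy2 also copies modification time and mode bits
        copied.append(dest)
    return copied


if __name__ == "__main__":
    # "~/Dropbox/settings" is an assumed location; substitute your own synced folder.
    for path in copy_settings(Path.home(), Path.home() / "Dropbox" / "settings"):
        print("copied", path)
```

Because the files are plain text, the same copies can be dropped into a home directory on either macOS or Linux.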

TL;DR

Efficient use of cloud services and/or technologies will enable you to store your data in a centralized location, making migration from one platform to another quite simple.

Solution 2:

One approach would be to keep on using Time Machine and defer the decision until the Mac dies.

  1. Purchase a new Mac from Apple. Keep all packaging.

  2. Restore from Time Machine backup (very simple).

  3. Evaluate the replacement Mac (or not, if you've already made up your mind).

  4. If you decide to migrate to Linux, just copy your user files to your target Linux system.

  5. Return the Mac to Apple. You have up to 14 days from purchase (double-check that in your country) to return it, no questions asked. Apple is extremely good with this; I've done it before, though not for this reason. 14 days is more than enough to do this, if you can get hold of a Linux system soon enough.
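Before returning the Mac in step 5, it may be worth confirming that the copy made in step 4 is complete. A minimal sketch, assuming you can run Python on both machines: sha256_manifest and compare are hypothetical helper names, and only ordinary file contents (the data fork) are hashed, not extended attributes or resource forks.

```python
import hashlib
from pathlib import Path


def sha256_manifest(root: Path) -> dict[str, str]:
    """Map each regular file's path (relative to root) to the SHA-256 of its contents."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file() and not path.is_symlink():
            digest = hashlib.sha256()
            with path.open("rb") as fh:
                # Read in 1 MiB chunks so large files are not loaded into memory at once.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            manifest[path.relative_to(root).as_posix()] = digest.hexdigest()
    return manifest


def compare(source: dict[str, str], copy: dict[str, str]) -> dict[str, list[str]]:
    """Report files missing from the copy and files whose contents differ."""
    return {
        "missing": sorted(set(source) - set(copy)),
        "changed": sorted(k for k in source.keys() & copy.keys() if source[k] != copy[k]),
    }
```

One way to use it is to build the manifest on the Mac, save it as JSON, build a second manifest on the Linux side, and pass both to compare; empty "missing" and "changed" lists suggest the file contents arrived intact.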

Solution 3:

There are three problems you're dealing with here:

  1. How do you back up your Mac filesystem in such a way that you don't lose any data?

  2. How do you make that data accessible to Linux?

  3. How do you use that data in Linux?

None of these is straightforward when moving between the Apple ecosystem and any other (Windows or Linux).

The reason for these difficulties is that Apple's filesystems (in order of historical introduction: HFS, HFS+ and APFS) use "Extended Attributes" (EAs), which include "forks."

These metadata components may not be translatable in any obvious way to another filesystem. For example, Apple filesystems have two forks as standard (though they may technically have any number): the data fork, which contains most of what we usually think of as file data, and the resource fork, often but not solely used by executables. (And though that Wikipedia link does not mention the resource fork in APFS, they do still exist there.) There is other metadata, including that for the Finder program and "a separate area for metadata distinct from either the data or resource fork... However, the amount of data stored here is minimal, being just the creation and modification timestamps, the file type and creator codes, fork lengths, and the file name."1

An Approach that Puts Off Part of the Problem by Backing up to an Apple Formatted Disk

One approach is to back up your present files using some Mac solution that will keep copies of your files on an HFS+ or APFS filesystem. When the time comes to move to Linux, you will have your files and be able to read them (though not write them) using Linux's hfsutils, hfsprogs and hfsplus for HFS+, or apfs-fuse (installation tutorial), apfsprogs-git and linux-apfs-dkms-git for APFS.

Make sure that your backup system does not store your files in some proprietary archive format that you will not be able to read under Linux, which may happen if you are not using a cross-platform tool. For-fee solutions include Get Backup Pro and ChronoSync Express. While the first would be a true backup (keeping historical copies of files), the second could be either a backup or a simple mirror. It is possible that Time Machine might also work, though you will have to confirm that it doesn't use an archive format unreadable under Linux. You just want an HFS+ or APFS filesystem, with your files copied to it.

Later You Will Want to Use Your Files on Linux

Of course, for the purposes of your question you will also want to know how to represent all of your files in a usable way on some Linux filesystem. Clearly the Finder data is of no use, and you will most likely lose HFS+'s "birthtime" attribute (see below), because Linux tools generally cannot set a file's creation time when restoring. The data fork contains the bulk of the information, but how relevant the resource fork and some of the other metadata are will depend on the file. How problematic this becomes for you may not be clear until you try.

The following approaches will allow you to save all of your macOS data to a Linux-formatted disk instead, thereby doing without Mac backup software, Time Machine, etc., and also dispensing with the later need to read an Apple disk under Linux; though you will still be faced with the question of proper use of that data under Linux. You may do well to consider this article, "Command Line Backup Solutions on Mac OS X," before proceeding. As discussed in that article, be careful to note that using macOS's resource-fork-aware version of rsync (or tar) will produce output that is not usable by Linux's version of rsync or tar!
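If creation times matter to you, one workaround is to record them in a sidecar file before the move. The sketch below is only an illustration: timestamp_manifest and save_manifest are hypothetical names, and it assumes the script runs on the Mac, where Python's os.stat result has an st_birthtime field. On platforms without that field it records None.

```python
import json
from pathlib import Path


def timestamp_manifest(root: Path) -> dict[str, dict]:
    """Record mtime and, where the platform exposes it, birthtime for each file under root."""
    manifest = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        st = path.stat()
        manifest[path.relative_to(root).as_posix()] = {
            "mtime": st.st_mtime,
            # st_birthtime is present on macOS; getattr keeps the script usable elsewhere.
            "birthtime": getattr(st, "st_birthtime", None),
        }
    return manifest


def save_manifest(root: Path, out_file: Path) -> None:
    """Write the manifest as JSON so it can be read back on any platform."""
    out_file.write_text(json.dumps(timestamp_manifest(root), indent=2))
```

The JSON file is plain text, so it travels with the backup regardless of which filesystem the backup lands on.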

Backing up to a Linux disk with rsync

There is a project called rsync+hfsmode that will handle backing up to Linux-formatted disks properly, at least for HFS+, but it does so by creating two files on the backup drive: filename, containing the data fork, and ._filename, containing the resource fork and Finder metadata. Furthermore, when copying back to an HFS+ disk, a second step is needed to reconstitute those two files into a proper HFS+ data structure. You can see a more complete discussion at the project page. The filename/._filename system for storing HFS+/APFS files on other filesystems is called AppleDouble format. It is not clear to me whether this same approach will work for APFS; the question asked on the Apple Developer Forum was met with silence, so perhaps not.
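To get a feel for what ends up in those ._filename files once the backup is on a Linux disk, the sketch below walks a directory tree, pairs each sidecar with its data file, and lists the entries in the sidecar's header. It assumes the sidecars follow the published AppleDouble layout (a big-endian magic number 0x00051607, a version, 16 filler bytes, an entry count, then 12-byte entry descriptors); whether rsync+hfsmode writes exactly this layout is something to verify on your own files. The function names are hypothetical.

```python
import struct
from pathlib import Path

APPLEDOUBLE_MAGIC = 0x00051607
# Entry IDs from the AppleSingle/AppleDouble format; only two are named here.
ENTRY_NAMES = {2: "resource fork", 9: "Finder info"}


def parse_appledouble(data: bytes) -> list[tuple[str, int]]:
    """Return (entry name, length in bytes) pairs from an AppleDouble header."""
    if len(data) < 26:
        raise ValueError("too short to be an AppleDouble file")
    # Header: magic (4 bytes), version (4), filler (16), number of entries (2).
    magic, _version, count = struct.unpack(">II16xH", data[:26])
    if magic != APPLEDOUBLE_MAGIC:
        raise ValueError("not an AppleDouble file")
    if len(data) < 26 + 12 * count:
        raise ValueError("truncated entry table")
    entries = []
    for i in range(count):
        start = 26 + 12 * i
        # Each descriptor: entry ID (4 bytes), offset (4), length (4).
        entry_id, _offset, length = struct.unpack(">III", data[start:start + 12])
        entries.append((ENTRY_NAMES.get(entry_id, f"entry {entry_id}"), length))
    return entries


def find_sidecars(root: Path):
    """Yield (data file or None, sidecar) for every ._name file under root."""
    for sidecar in sorted(root.rglob("._*")):
        if sidecar.is_file() and len(sidecar.name) > 2:
            data_file = sidecar.with_name(sidecar.name[2:])
            yield (data_file if data_file.exists() else None), sidecar


if __name__ == "__main__":
    import sys
    for data_file, sidecar in find_sidecars(Path(sys.argv[1])):
        try:
            print(sidecar, "->", data_file, parse_appledouble(sidecar.read_bytes()))
        except ValueError as err:
            print(sidecar, "skipped:", err)
```

A sidecar with a zero-length resource fork and only Finder info is usually safe to ignore on Linux; a large resource fork suggests the file may not be fully usable from its data fork alone.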

Backing up to any kind of disk with dar

Disk Archiver (dar), which is cross-platform and available in Homebrew, can handle the unique characteristics of macOS filesystems (its documentation does not distinguish between HFS+ and APFS, but says it can handle extended attributes, including file forks). According to its Features page:

EXTENDED ATTRIBUTES (EA) references: MacOS X FILE FORKS / ACL Dar is able to save and restore EA, all or just those matching a given pattern.

File Forks (MacOS X) are implemented over EA as well as Linux's ACL, they are thus transparently saved, tested, compared and restored by dar. Note that ACL under MacOS seem to not rely on EA, thus while they are marginally used they are ignored by dar.

FILESYSTEM SPECIFIC ATTRIBUTES (FSA) references: MacOSX/FreeBSD Birthdate, Linux FS attributes

Since release 2.5.0 dar is able to take care of filesystem specific attributes. Those are grouped by family strongly linked to the filesystem they have been read from, but perpendicularly each FSA is designated also by a function. This way it is possible to translate FSA from a filesystem into another filesystem when there is a equivalency in role.

currently two families are present: HFS+ family contains only one function : the birthtime. In addition to ctime, mtime and atime, dar can backup, compare and restore all four dates of a given inode (well, ctime is not possible to restore)

I also had some discussion on these issues with the developer.

Since dar is cross-platform, you don't have to worry about the format it stores the files in, since you'll also be able to install dar on Linux when it is time to move there. In this case it probably makes sense to format your backup disk with some Linux filesystem. You could use APFS if you wanted, as it's also readable under Linux, but that seems pointless.

Restoring to a Linux disk will produce error messages when metadata cannot be reproduced. You will be able to save the problematic files in a smaller archive. It is not yet clear to me whether you can explore the attributes of those failing files using Linux tools.
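As a rough sketch of what the dar round trip might look like, the helper below builds the command lines for creating an archive on the Mac and extracting it on Linux. It uses only dar's basic -c (create), -x (extract) and -R (root directory) options; the paths are placeholders, the function names are hypothetical, and any options for extended attributes or FSA handling should be taken from the dar man page rather than from this example.

```python
import shlex
import subprocess


def dar_create(archive_base: str, source_root: str) -> list[str]:
    """Command to create an archive; dar appends the slice number and .dar itself."""
    return ["dar", "-c", archive_base, "-R", source_root]


def dar_extract(archive_base: str, target_root: str) -> list[str]:
    """Command to restore the archive's contents under target_root."""
    return ["dar", "-x", archive_base, "-R", target_root]


def run(argv: list[str], dry_run: bool = True):
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(argv))
    if dry_run:
        return None
    # A non-zero exit status is returned rather than raised, since a restore
    # onto a Linux filesystem may report metadata it could not reproduce.
    return subprocess.run(argv, check=False).returncode


if __name__ == "__main__":
    # Placeholder paths: adjust to your backup disk and home directory.
    run(dar_create("/Volumes/Backup/home", "/Users/me"))
    run(dar_extract("/mnt/backup/home", "/home/me/restored"))
```

Running it as written only prints the two commands, which makes it easy to review them before passing dry_run=False.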

Backing up to Any Kind of Disk Using Restic

Restic is similarly cross-platform and available in Homebrew, and can handle Apple disks. (Though again, its documentation does not distinguish HFS+ from APFS.) There is a detailed bug report describing how restic behaves when backing up HFS+, showing what it is able to handle and where it fails.

As with dar, restoring to a Linux disk will produce error messages when metadata cannot be reproduced. It is not yet clear to me whether you will be able to manipulate or save those problematic files separately.

Here is a short description of its installation and use on macOS, along with a scheduler.
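For orientation, a restic workflow has three basic steps: initialise a repository, back up into it, and restore from it. The sketch below only assembles those three commands using restic's init, backup and restore subcommands; the repository and directory paths are placeholders, restic_plan is a hypothetical helper, and restic will ask for the repository password (or read it from the RESTIC_PASSWORD environment variable) when the commands are actually run.

```python
import shlex


def restic_plan(repo: str, source: str, target: str) -> list[list[str]]:
    """Return the init, backup and restore commands for one repository."""
    return [
        ["restic", "-r", repo, "init"],                                 # once, on the Mac
        ["restic", "-r", repo, "backup", source],                       # repeat as often as needed
        ["restic", "-r", repo, "restore", "latest", "--target", target],  # later, on Linux
    ]


if __name__ == "__main__":
    # Placeholder paths: adjust to your backup disk and home directory.
    for argv in restic_plan("/Volumes/Backup/restic-repo", "/Users/me", "/home/me/restored"):
        print(shlex.join(argv))
```

The repository format is the same on both platforms, so the disk holding it can use whichever filesystem both machines can read and write.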