Hot clone a living Linux service

Solution 1:

You may not be able to "hot clone" a whole server (unless it is a virtual machine), but you can freeze and restore a single process with criu (Checkpoint/Restore In Userspace).

This allows you to save the program's internal state to disk and stop the program, and later, to restore the program to that state from the saved files.

To support your desired operation, you can copy the files representing the saved program to another server, and restore it there.

criu requires a recent kernel with various features compiled in, so older Linux distributions might not work. You can run criu check on a particular machine to determine if the prerequisites for criu are present.
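As a rough illustration of the check, checkpoint, copy and restore cycle described above, here is a minimal Python sketch that only assembles the command lines and prints them for review. The PID, image directory and host name are placeholders I made up, and using rsync and ssh to move the files is my assumption, not something criu requires. criu normally has to run as root, and depending on the process you may also need options such as --shell-job or --tcp-established.

```python
import shlex

# Hypothetical placeholders, not taken from the original answer.
PID = 12345
IMAGE_DIR = "/var/tmp/criu-images"
TARGET_HOST = "newserver"


def migration_commands(pid, image_dir, target_host):
    """Build the command lines for a checkpoint/copy/restore cycle."""
    return [
        # Verify the criu prerequisites on the source and on the target.
        ["criu", "check"],
        ["ssh", target_host, "criu", "check"],
        # Save the process tree rooted at pid into image_dir (must already exist).
        # Without --leave-running the process does not keep running afterwards.
        ["criu", "dump", "--tree", str(pid), "--images-dir", image_dir],
        # Copy the image files to the same path on the target (assumed: rsync over ssh).
        ["rsync", "-a", f"{image_dir}/", f"{target_host}:{image_dir}/"],
        # Restore on the target; --restore-detached lets the process outlive the ssh session.
        ["ssh", target_host, "criu", "restore", "--images-dir", image_dir,
         "--restore-detached"],
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for command in migration_commands(PID, IMAGE_DIR, TARGET_HOST):
        print(shlex.join(command))
```

Keep in mind that the restore only works if the target has the same executable, libraries and open files available at the same paths as on the original machine.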

Solution 2:

It may be a bit outside the scope of your current environment, but the industry-standard way of doing this is to virtualize your server. Many virtualization hosts (VMware, VirtualBox, etc.) allow “snapshots” that save the state of a server, which can then be cloned into new instances. These new instances will have exactly the same state as the original, down to running processes. Of course you’ll want to make sure that the software you’re running will still perform correctly in a virtual environment (CUDA/GPU computation springs to mind).

Solution 3:

The question you mention refers to a link, http://www.linuxfocus.org/English/March2005/article370.shtml, which describes all the ways I could think of to do what you ask.

That these options exist says little about whether they are safe for what is running on the server. Any file that changes during the cloning process may end up inconsistent on the target machine. The post you link to talks about databases, and cloning one like that gives no assurance of data integrity.

It is not exactly clear what you meant by "until we feed it several times".

But if I understood your question correctly, keep in mind that cloning a system takes time to copy the data and consumes computing resources while it runs.

To run an "ON/OFF" setup, better called an active/backup environment, the server has to be properly configured as part of a cluster.

I'm sorry if this is not the answer you expected, but those are the options you have.

Solution 4:

There are many potential issues with what you are trying to do, and of course as you know it would be best to take the server offline and clone it while no data is being dynamically stored.

However, what you seek to do is entirely possible, as I have done it before. If you use dd you can clone the full server at the block level to another drive or another server. It will however take some additional setup on the new server, and you probably won't be able to simply turn the old one off and the new one on. To advise you properly, we need to know a few things about your server hardware and software.

Firstly, in order to determine the best data strategy, it would be helpful to know what is updating regularly. Do you have an SQL server which is dynamically updating but otherwise static content? Alternatively, do you have a team of developers using a version control system like git to send constant updates to your content? What is updating will determine the best course of action.

If, for example, only the SQL database is updating regularly, then you can migrate to a new server while the original server is live in the following manner:

  • Use dd to clone all data to the new server.
  • Start setting up the new server. It may take some work, especially if it is different hardware, but it may still be faster than setting up from scratch.
  • It may also take some DNS changes, since both servers can't use the same DNS name if you need to work on the second server while the first server is still live.
  • After the new server is complete and running independently, take a final backup of the SQL server on the original server, and import it into the new server.
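The first and last steps above could look roughly like this. It is only a sketch in Python that builds the shell pipelines and prints them, so nothing is copied or overwritten by running it. The device names and host name are placeholders I made up, streaming over ssh is my assumption, and the SQL part assumes MySQL/MariaDB with credentials already configured, since the database was never named. Also note that a block-level copy of a mounted, live filesystem may need a filesystem check on the target.

```python
import shlex

# Hypothetical placeholders for illustration only.
SOURCE_DEVICE = "/dev/sda"
TARGET_DEVICE = "/dev/sda"
TARGET_HOST = "newserver"


def dd_clone_pipeline(source_device, target_host, target_device, block_size="4M"):
    """Shell pipeline that streams a block device to another server over ssh."""
    read_side = shlex.join(["dd", f"if={source_device}", f"bs={block_size}"])
    write_side = shlex.join(["dd", f"of={target_device}", f"bs={block_size}"])
    # The write side runs remotely, so it is quoted as a single ssh argument.
    return f"{read_side} | ssh {shlex.quote(target_host)} {shlex.quote(write_side)}"


def final_sql_sync_pipeline(target_host):
    """Shell pipeline that dumps all databases and imports them on the target.

    Assumes MySQL/MariaDB; --single-transaction gives a consistent dump
    for InnoDB tables without locking them.
    """
    dump = shlex.join(["mysqldump", "--single-transaction", "--all-databases"])
    return f"{dump} | ssh {shlex.quote(target_host)} mysql"


if __name__ == "__main__":
    # Dry run: print the pipelines for review instead of executing them.
    print(dd_clone_pipeline(SOURCE_DEVICE, TARGET_HOST, TARGET_DEVICE))
    print(final_sql_sync_pipeline(TARGET_HOST))
```

Double-check the target device name before running the dd pipeline for real, because dd overwrites whatever is on it without asking.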

You may need to take your original server offline temporarily to ensure that you don't miss any data. Alternatively, to have zero downtime, you could make the second server live, point the DNS to it, and then update any DNS entries manually on the new server. This is more hassle than accepting a few minutes of downtime to back up the SQL database and restore it on the new server, but it may be necessary if you cannot afford any downtime.

This of course is only one use case example, and depending on your configuration and several variables, you may need to create your own strategy for the migration based on your specific case.

The other issue is the server hardware configuration. Is the new server 100% identical in hardware to the old server? If so, then the setup is easier. If, on the other hand, the hardware configuration is completely different, you may need a different strategy: set up the second server ahead of time, then back up all your data and SQL databases on the first server and manually migrate them over, changing configuration as desired.

Server migration is by no means trivial, and a successful move requires deep knowledge of servers, or staff on hand who have it. In any case, I highly recommend that you immediately take a full backup and store it in a third location, even your local computer, so that if the worst-case scenario happens (both servers crash and die irreparably), you still have another copy of your data to rebuild your servers with.
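For the backup itself, one possible approach is to archive the data, copy the archive to the third location, and confirm by checksum that the copy is intact. The Python sketch below does that for a single directory. The function names and arguments are made up for illustration, the third location is assumed to be an existing directory (for example a mounted external drive), and a real full-server backup would cover more than one directory.

```python
import hashlib
import shutil
import tarfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(source_dir, archive_path, third_location):
    """Archive source_dir, copy the archive to third_location, verify the copy."""
    # Create a gzip-compressed tar archive of the whole directory.
    with tarfile.open(str(archive_path), "w:gz") as archive:
        archive.add(str(source_dir), arcname=Path(source_dir).name)
    # Copy the archive into the third location (an existing directory).
    copied = Path(shutil.copy2(str(archive_path), str(third_location)))
    # Refuse to report success if the copy differs from the original archive.
    if sha256_of(archive_path) != sha256_of(copied):
        raise RuntimeError(f"checksum mismatch for {copied}")
    return copied
```

A call such as backup_and_verify("/var/www", "/tmp/www-backup.tar.gz", "/mnt/external") would return the path of the verified copy; those paths are examples only.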

Hope this helps, and good luck with your server move!