How to build a high availability Postfix system?

I need to set up a remote mirror for a Postfix server (the content of both mail servers should be the same at all times).

The idea is that if the main server goes down at some point, the mirror server will take its place and handle new incoming mail. When the main server comes back up, the mirror will update it with the new e-mails and hand control of incoming mail back to it.

The mail servers will be hosted in different places (i.e. maindomain.com, themirrorsite.com).

Getting a simple back-up server doesn't seem too difficult:

  • http://beginlinux.com/blog/2010/03/backup-mx-with-postfix/
  • http://www.postfix.org/STANDARD_CONFIGURATION_README.html#backup
  • http://www.linuxmail.info/postfix-backup-mx/

But the problem is that this configuration wouldn't make the back-up site a complete mirror of the main mail server (it will hold only the e-mails received while the main server is down).

Is there a way to achieve the required configuration?


The outcome you want to achieve, and the manner in which you have decided to do it, are very different things. To be blunt, what you want to implement is a bad idea, and if you can somehow manage to make it work, it won't work for very long (or very well).

What makes this question difficult to answer is that you've leapt straight to the implementation, and haven't described anything useful about your environment or what you're actually trying to achieve. Please don't do that; you'll get much better results here if you "show your working".

Let me posit a couple of scenarios, though, to give you a taste of what's possible, practical, and useful:

  • Ensuring no mail gets lost: (I don't think this is what you need, as the documentation you refer to covers it adequately) All you want to have here is assurance that regardless of how long your mail delivery and management infrastructure is down for, you won't bounce any mail, and you can control when delivery is made. For this, a "simple" off-site backup MX will work adequately. I say "simple" because you need to replicate a lot of data to the backup (all anti-spam logic, valid user/alias information so you can properly bounce invalid mail at SMTP time, that sort of thing), but it's all scriptable, automatable, and fairly trivially implementable with a bit of care. As long as you've got enough disk to queue all the mail, you can queue for weeks or months until your primary site comes back and then you poke the backup MX and it dumps a metric buttload of mail into your mail infrastructure and your users go "aaaaaaahhh!"
  • Ensuring full mail system availability: It sounds like this is what you want, but it's not simple or pretty. Basically, you want to be able to provide "full" mail service to your userbase in the event of a complete site failure. In principle, this is actually impossible, because replication isn't instantaneous, but you can get to a reasonable level of reliability at least. The difficult bit isn't the MTA, though; it's the mail store itself. You'll need to figure out a way of replicating all mail storage operations (new mail delivery, message state changes, deletion) to the second site in near-real-time -- and do it both ways, depending on which site is live. You can take the cheap option, of a periodic rsync (with the risk that anything done since the last rsync is gone forever if you need to failover), or go for various file- or block-level replication techniques to try and keep things in-sync in near-real-time (reducing the amount of data loss in exchange for significantly more complicated configuration and operation). Some mail systems have support for some sort of replication built-in, which can make life easier. Then there's the whole issue of failing over, and how do you do that, and then failing back, which is harder again, and finally you've got to test it periodically, to ensure that OS upgrade you did a while back didn't break anything...
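
For the first scenario, the "valid user information" part can be scripted. The following is a minimal sketch, not a definitive implementation: it assumes you can export the primary's valid addresses as a plain list, and the function name `build_relay_recipients` and the `example.com` domain are hypothetical. It produces text in the `address OK` form that Postfix accepts as the source of a `relay_recipient_maps` table, so the backup MX can reject unknown recipients at SMTP time.

```python
def build_relay_recipients(addresses, relay_domain):
    """Return the text of a Postfix relay_recipient_maps source file.

    addresses    -- iterable of e-mail addresses exported from the primary
                    (assumption: one address per item, no display names)
    relay_domain -- the domain the backup MX relays for
    """
    domain = relay_domain.lower()
    seen = set()
    lines = []
    for raw in addresses:
        addr = raw.strip().lower()
        # Skip blank and malformed entries (exactly one "@" expected).
        if not addr or addr.count("@") != 1:
            continue
        local, _, dom = addr.partition("@")
        # Keep only addresses in the relayed domain.
        if not local or dom != domain:
            continue
        # Drop duplicates so each key appears once in the table.
        if addr in seen:
            continue
        seen.add(addr)
        lines.append(f"{addr} OK")
    # Sorted output keeps diffs between runs small.
    return "\n".join(sorted(lines)) + ("\n" if lines else "")


if __name__ == "__main__":
    exported = ["Alice@Example.com", "bob@example.com", "eve@other.org"]
    print(build_relay_recipients(exported, "example.com"), end="")
```

On the backup MX you would write the result to a file such as `/etc/postfix/relay_recipients`, run `postmap` on it, and point `relay_recipient_maps` at the resulting `hash:` table, as described in the Postfix backup MX documentation linked in the question. How and how often you push the list across is up to you.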

Basically, the latter option is painful and annoying. My personal preference, if you can get away with it (and you'd be surprised how often you can) is to put all your eggs in one basket, after making sure you've got a really good, sturdy basket (proper systems engineering), keeping a stock of basket-patches and tools on hand (focusing on High Recoverability), and ensuring that people know that every once in a while, a few eggs might get broken and you're really sorry but life isn't perfect (don't make SLA guarantees that aren't reasonable).

There are times when you need ultra high availability, and I've built systems that ensure it, but they're not simple, and in many cases they're not cost-effective, and cost-effectiveness is what we're here for. Yes, HA is cool and sexy, and you get geek cred for building some towering monstrosity of complexity, but we're not here to stroke our egos. We're here to deliver business value, and I'm sorry, but a Rube Goldberg highly-available multi-site mail cluster isn't likely to be providing as much value as a simple, robust mail service and the occasional "we're sorry for the mail outage, we'll have the systems back in an hour, please feel free to have a coffee and a muffin on us" announcement.


You can achieve this with MX-based DNS failover plus a data replication system.

For MX failover, see the question "Two mail servers, need help with dns configuration for the backup one".

For data replication: http://www.drbd.org/docs/install/
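
To illustrate the MX failover half: a sending mail server tries the MX host with the lowest preference value first and falls back to higher values when that host cannot be reached. The sketch below simulates that selection in plain Python. The hostnames, the `pick_mx` name, and the `is_reachable` callback are hypothetical stand-ins, and real MTAs also randomise among hosts of equal preference and retry later, which this ignores.

```python
def pick_mx(mx_records, is_reachable):
    """Return the hostname a sender would deliver to, or None.

    mx_records   -- list of (preference, hostname) tuples, as published in DNS
    is_reachable -- callable(hostname) -> bool; stands in for an SMTP connect
    """
    # Lower preference value means higher priority.
    for _preference, host in sorted(mx_records, key=lambda rec: rec[0]):
        if is_reachable(host):
            return host
    # No MX host answered; a real sender would queue the mail and retry.
    return None


if __name__ == "__main__":
    # Hypothetical records: primary at preference 10, mirror at 20.
    records = [(20, "mx.themirrorsite.com"), (10, "mx.maindomain.com")]

    # Primary up: mail goes to the primary.
    print(pick_mx(records, lambda host: True))

    # Primary down: mail goes to the mirror instead.
    print(pick_mx(records, lambda host: host != "mx.maindomain.com"))
```

This only covers where new mail lands. Getting the mailboxes themselves onto the second site is the job of the replication layer.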


I've used dbmail to build a similar solution. dbmail stores all the e-mail in a database, so you can set up database replication to make sure your e-mails are also stored at the remote location. The trade-off is that managing the mail system becomes more complicated, as you have to manage the database as well as the e-mail.