How can we make sure that mail still gets queued and redelivered when our mail server is unreachable (connection down)?

We have had some issues with losing our internet connection once in a while. Normally that means our mail server is unreachable and incoming email bounces. We've set up a backup MX (a second MX record after ours). For some providers that was enough (Gmail fell back to the second MX, which queued the mail), but others (Hotmail) still bounced the email.

Is there anything we can do to make sure that we never lose email? We would like to host email internally because there is a lot of internal email traffic. We are also considering a backup internet connection (what would be the proper DNS/MX setup for that?). Even with a backup connection we sometimes lose power, so having email queued off-site would be perfect, but the backup MX doesn't seem to be reliable enough.

Please let me know what's possible in our situation to improve email reliability.

Thanks.


Solution 1:

You can never be sure you won't lose email. Guaranteed delivery just isn't part of the protocol.

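However, a properly configured sending server will not drop a message after the first failed attempt. It queues the message and keeps retrying, typically for something like 12 to 48 hours, with longer and longer intervals between attempts. RFC 5321 actually suggests giving up only after at least 4 to 5 days, so well-behaved senders often retry for longer than that.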
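To get a feel for what "longer and longer windows" means in practice, here is a minimal sketch of an exponential backoff retry schedule. The numbers (first retry after 30 minutes, doubling up to a 4 hour cap, giving up after 48 hours) are assumptions chosen for illustration, and `retry_schedule` and `survives_outage` are hypothetical helper names. Real mail servers each use their own intervals and limits.

```python
from datetime import timedelta


def retry_schedule(first_delay=timedelta(minutes=30),
                   factor=2.0,
                   max_delay=timedelta(hours=4),
                   give_up_after=timedelta(hours=48)):
    """Return the elapsed times at which a sender would retry delivery.

    All defaults are illustrative assumptions, not the settings of any
    particular mail server.
    """
    attempts = []
    elapsed = timedelta(0)
    delay = first_delay
    # Keep scheduling retries until the next one would fall past the deadline.
    while elapsed + delay <= give_up_after:
        elapsed += delay
        attempts.append(elapsed)
        # Grow the wait between attempts, but never beyond the cap.
        delay = min(delay * factor, max_delay)
    return attempts


def survives_outage(outage, schedule=None):
    """True if at least one retry happens after the outage has ended."""
    if schedule is None:
        schedule = retry_schedule()
    return any(attempt > outage for attempt in schedule)


if __name__ == "__main__":
    for attempt in retry_schedule():
        print(attempt)
    print("6 hour outage survived:", survives_outage(timedelta(hours=6)))
```

The point of the sketch is that a sender with a schedule like this still delivers after a multi-hour outage. The mail that bounces immediately comes from senders that do not retry this way, which is outside your control.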

You can't control what other mail servers do or whether they're configured properly. Part of the problem, in my experience, is that people treat email as if it were instant messaging. It's not. It's unreliable and rickety, and it's amazing that it is still usable in this day. The base protocol has nothing built in for authenticating or verifying users, and it isn't encrypted. Because of spammers, it's not uncommon for legitimate messages to get stuck in junk folders, or for email servers to drop messages suspected of being spam without notifying the sender that the message was never delivered.

If you already have a second MX record pointing to an off-site host, you're already doing the right thing. The next thing you could try is setting up redundant links with two different providers and playing with BGP, along with the wonderful and unique world of hurt that redundant links bring.

The real question is whether this downtime affected your mail and business enough to justify pursuing that route. How often do you experience outages, and how long do they typically last? If the business impact is high enough, you can look at the redundancy route. Otherwise, maintain an off-site second MX system (with a separate provider), and that should be good enough until the cost/benefit ratio dictates otherwise.
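For reference, this is roughly how a standards-following sender uses a second MX record: it sorts the MX hosts by preference (lowest number first), shuffles hosts that share a preference, and tries them in that order. The sketch below only models that ordering logic. The hostnames and preference values are made up, and `delivery_order` and `first_reachable` are hypothetical names, not part of any mail server.

```python
import random


def delivery_order(mx_records, rng=random):
    """Order MX hosts the way a sender should try them.

    mx_records is a list of (preference, hostname) tuples. Lower preference
    values are tried first; hosts with equal preference are shuffled.
    """
    by_preference = {}
    for preference, host in mx_records:
        by_preference.setdefault(preference, []).append(host)

    ordered = []
    for preference in sorted(by_preference):
        hosts = list(by_preference[preference])
        rng.shuffle(hosts)  # spread load across equal-preference hosts
        ordered.extend(hosts)
    return ordered


def first_reachable(mx_records, is_reachable):
    """Return the first host that accepts a connection, or None.

    is_reachable is a caller-supplied function standing in for a real
    SMTP connection attempt.
    """
    for host in delivery_order(mx_records):
        if is_reachable(host):
            return host
    # No host answered: a well-behaved sender keeps the message queued
    # and tries again later.
    return None


if __name__ == "__main__":
    # Hypothetical records: the on-site server plus an off-site backup.
    records = [(10, "mail.example.com"), (20, "backup-mx.example.net")]
    primary_down = lambda host: host != "mail.example.com"
    print(first_reachable(records, primary_down))
```

A sender that stops after the first host fails, instead of walking the whole list, is the kind that bounces mail even though a backup MX exists.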

Solution 2:

You could use a service like Postini. Since it's now owned by Google, you can leverage their spam filtering and also have a place for mail to queue if the connection your Exchange server is on goes down. I've used it in the past, and it was fantastic when the datacenter hosting our Exchange servers was experiencing issues.