Traffic routing with unreliable connections

I have a group of remote offices, each connected to the main office over a DSL link on the remote end to save costs. (We're a non-profit, don't ask.)

We've historically had noticeable problems with the interconnect between the ISP that serves our remote sites and the ISP that provides the T1 lines our OpenVPN runs over, so those links frequently go down.

Our mail server's public interface is on the first provider's network, so it stays reachable from the remote sites when the tunnel is down, but that path is much slower because it's also DSL.

To work around the upstream unreliability, I wrote a script that modifies the DNS records at the remote sites to point to the internal IP when the VPN tunnel to the main site is up, or to the public IP when the tunnel is down.

How can I do this in a more elegant fashion that will be instantaneous (instead of my cron-driven scripts) and transparent to the users?

Edit: Remote offices: Ubuntu 9.10 LTSP servers behind various vendor-provided Actiontec and Motorola units, and a few behind Netgear and Linksys firewalls. Main office: almost 100% Linux (CentOS, in this case), with multiple Netgear FVS318/338 series firewalls, one for each IP on our /27. (Another don't ask; it was set up before I got here.)


Solution 1:

OpenVPN can execute commands when a tunnel is created and when it is torn down (the up and down script options). Instead of running this job from cron, you can have the DNS record shuffling triggered by these events. Then you just have to monitor something over the unreliable link to know when to restart the VPN tunnel.
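As a rough illustration of the hook idea, here is a minimal Python sketch of a script that OpenVPN could call on both events. It relies on the script_type environment variable that OpenVPN sets to "up" or "down" for these hooks. Everything else is an assumption for the example: the hostname, both IP addresses, the output path, and the idea that the local resolver reads a hosts-format file (dnsmasq with addn-hosts would be one way to do that).

```python
#!/usr/bin/env python3
"""Hypothetical OpenVPN up/down hook: point a hostname at the right IP."""
import os
import tempfile

# All values below are placeholders, not taken from the original setup.
MAIL_HOST = "mail.example.org"
INTERNAL_IP = "10.0.0.25"      # reachable only through the tunnel
PUBLIC_IP = "203.0.113.25"     # public interface, reachable without the VPN
HOSTS_FILE = "/etc/mail-failover.hosts"  # assumed extra hosts file for the resolver


def pick_address(script_type, internal_ip=INTERNAL_IP, public_ip=PUBLIC_IP):
    """Return the internal IP when the tunnel came up, otherwise the public IP."""
    return internal_ip if script_type == "up" else public_ip


def render_record(script_type, host=MAIL_HOST):
    """Build a hosts-format line for the current tunnel state."""
    return f"{pick_address(script_type)} {host}\n"


def write_atomically(path, text):
    """Write via a temp file and rename so readers never see a partial file."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    os.chmod(tmp, 0o644)  # mkstemp creates the file as 0600
    os.replace(tmp, path)


def main():
    # OpenVPN exports script_type ("up" or "down") to the hook's environment.
    write_atomically(HOSTS_FILE, render_record(os.environ["script_type"]))


# Only act when invoked as an OpenVPN hook, i.e. when script_type is set.
if __name__ == "__main__" and "script_type" in os.environ:
    main()
```

To wire this in, the OpenVPN config on the remote site would need script-security 2 plus up and down lines pointing at the script (the script path is up to you). The resolver also has to notice the change; with dnsmasq, sending it SIGHUP makes it re-read its hosts files.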

Solution 2:

It depends on your budget. Cisco's IP SLA (and equivalent features from other vendors) does exactly that. Here is an excellent starting point.

You might be able to pull that off without anything else. I assume that your users' DNS settings point to the remote site's router. On that router, you can set the primary DNS to your first provider's and the secondary DNS to your second provider's. Most routers these days are clever enough to fail over to the secondary once the primary stops responding.

EDIT: To be fair, depending on your DSL, you could find a used Cisco router from $60, since IP SLAs have been supported since IOS 12.3(14)T.
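If buying hardware is not an option, the general idea behind IP SLA (probe a target across the link and react when the probes stop succeeding) can be approximated in software on the Linux boxes. The sketch below is not Cisco's implementation, just a Python illustration of that idea. The threshold of three consecutive results, the five-second interval, and the choice of a TCP connect as the probe are all assumptions, and on_change is a hypothetical callback where you would switch DNS or restart the tunnel.

```python
import socket
import time


def tcp_probe(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class LinkTracker:
    """Flip the link state only after several consecutive contrary probes."""

    def __init__(self, threshold=3, up=True):
        self.threshold = threshold
        self.up = up
        self._streak = 0  # consecutive probes that disagree with current state

    def record(self, probe_ok):
        """Feed one probe result; return True if the link state just changed."""
        if probe_ok == self.up:
            self._streak = 0
            return False
        self._streak += 1
        if self._streak >= self.threshold:
            self.up = probe_ok
            self._streak = 0
            return True
        return False


def watch(host, port, on_change, interval=5.0, threshold=3):
    """Probe forever, calling on_change(is_up) on each state transition."""
    tracker = LinkTracker(threshold=threshold)
    while True:
        if tracker.record(tcp_probe(host, port)):
            on_change(tracker.up)
        time.sleep(interval)
```

Requiring several consecutive failures before declaring the link down keeps a single dropped probe on a flaky DSL line from flapping the DNS records back and forth.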