Server becomes unreachable and comes back up on its own (most likely a network issue)

Solution 1:

This kind of problem usually doesn't generate many log messages. You have found the two important ones, which show the interface going down and coming back up. The same messages can be produced by unplugging the Ethernet cable and plugging it back in.

It could be a bad cable between the NIC and the router. My first steps (done one at a time) would be:

  • Replace the cable connected to eth0 and see if that resolves the problem.
  • Reconfigure the network interfaces so the traffic currently on eth0 is on eth1 and vice versa. (Requires a network restart and cable swap.) If the problem moves with the traffic, then it is likely a failing NIC.
  • Verify the status of the upstream device and its power supply. If it loses power or is otherwise failing, you can see this kind of behavior.
  • Run netstat -i or ifconfig and examine the error counts. Normally, they should be 0 or in the single digits. High carrier or frame errors may indicate a duplex mismatch. A duplex mismatch can be checked by uploading and then downloading a large file: a large speed difference accompanied by increasing error counts indicates a mismatch on the link. Cable modems usually have different upload and download bandwidths, so local transfers work better for this test.
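
If you want to watch the error counts from the last step over time, a small helper like the one below may be easier to compare than full ifconfig output. It is only a sketch: it assumes a Linux system that exposes per-interface counters under /sys/class/net, and the function name iface_errors and the SYSFS_NET override are made up for this example.

```shell
#!/bin/sh
# Print selected error counters for one interface, read from sysfs.
# SYSFS_NET is an illustrative override (not a standard variable) so the
# function can be pointed at another directory; it defaults to /sys/class/net.
iface_errors() {
    iface="$1"
    base="${SYSFS_NET:-/sys/class/net}/$iface/statistics"
    for counter in rx_errors tx_errors rx_frame_errors tx_carrier_errors collisions; do
        # Skip counters that this kernel or driver does not expose.
        if [ -r "$base/$counter" ]; then
            printf '%s %s\n' "$counter" "$(cat "$base/$counter")"
        fi
    done
}
```

Running iface_errors eth0 before and after the large file transfer, and comparing the two outputs, should show whether the counters are climbing.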

One tool I use for this is mtr. I run a command like mtr -i 15 -n google.com to monitor connectivity. Consider using one of your ISP's servers instead of google.com. mtr can also be run in report mode from a batch job. If the problem is upstream of the server, the output should help identify where it is occurring.
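
For the batch case, something along these lines could be called from cron to keep a history you can look at after an outage. Treat it as a rough sketch: the function name log_mtr_report, the MTR override, and the ten-probe count are my own choices, and it assumes mtr is installed and allowed to send probes for the user running it.

```shell
#!/bin/sh
# Append one timestamped mtr report to a log file.
# MTR is an illustrative override for the mtr binary (handy for dry runs).
log_mtr_report() {
    target="$1"
    logfile="$2"
    {
        printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
        # -r: report mode, -c 10: ten probes per hop, -n: skip DNS lookups
        "${MTR:-mtr}" -r -c 10 -n "$target"
    } >> "$logfile" 2>&1
}
```

A call such as log_mtr_report google.com /var/log/mtr-report.log every few minutes gives you reports from just before and just after the link drops, which is usually where the interesting hop shows up.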

Solution 2:

BillThor has some great suggestions. If none of them resolves the issue, auto-negotiation could be to blame (though that is unlikely). Try forcing the speed and duplex of the connection. These instructions are for Red Hat, but other distros are similar.

Edit /etc/sysconfig/network-scripts/ifcfg-eth0:

ETHTOOL_OPTS="speed 100 duplex full autoneg off"

Then restart the interface:

/etc/init.d/network restart
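
After the restart it is worth confirming that the settings took effect. The helper below is a sketch that condenses the output of ethtool eth0 into one line; the name link_summary is made up here, and it assumes ethtool prints its usual "Speed:", "Duplex:" and "Auto-negotiation:" lines, which can vary by driver. Keep in mind that the switch or router port generally needs the same forced settings, since forcing only one end can itself cause a duplex mismatch.

```shell
#!/bin/sh
# Summarize link settings from `ethtool <iface>` output read on stdin.
link_summary() {
    awk -F': *' '
        /^[[:space:]]*Speed:/            { speed = $2 }
        /^[[:space:]]*Duplex:/           { duplex = $2 }
        /^[[:space:]]*Auto-negotiation:/ { autoneg = $2 }
        END { printf "speed=%s duplex=%s autoneg=%s\n", speed, duplex, autoneg }
    '
}
```

Use it as ethtool eth0 | link_summary and check that the values match what you put in ETHTOOL_OPTS.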