How do I audit changes made to our servers, routers, etc.?
Solution 1:
Let me side-step some gratuitous comments about the apparent lack of control in your environment. My apologies for the situation; try to rein those cowboys in :)
Definitely look at Rancid for your networking needs. You can monitor changes to configurations. Additional integration will let you automate backups upon detection of configuration changes based on Syslog messages or SNMP trap notifications.
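To make the "monitor changes to configurations" idea concrete, here is a minimal sketch of the underlying technique: keep the last known copy of a device config and diff each new copy against it. This is not Rancid's own code; the function name `config_diff` and the sample config lines are made up for illustration.

```python
import difflib


def config_diff(old, new, name="device.cfg"):
    """Return unified-diff lines between two config snapshots (empty if unchanged)."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile=f"{name} (previous)",
            tofile=f"{name} (current)",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Hypothetical snapshots; in practice these would be fetched from the device.
    previous = "hostname core-sw1\nsnmp-server community public RO\n"
    current = "hostname core-sw1\nsnmp-server community s3cret RO\n"
    for line in config_diff(previous, current):
        print(line)
```

A non-empty result is the cue to store the new snapshot in version control and notify someone.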
For Linux, consider forcing admins to access hosts through a logging portal (like an SSH jump with a ForceCommand that wraps script(1) before connecting to a destination host). Venerable tools like Tripwire can log inappropriate changes made to system files.
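As a rough sketch of the jump-host idea, a wrapper like the one below could be named in a ForceCommand line in the jump host's sshd_config so that every session runs under script(1). The log directory, the file naming scheme, and the wrapper itself are assumptions for illustration; the `-q`, `-f` and `-c` options are those of the util-linux version of script.

```python
#!/usr/bin/env python3
"""Hypothetical ForceCommand wrapper that records an SSH session with script(1)."""
import os
import time

LOG_DIR = "/var/log/ssh-sessions"  # assumed location; must be writable by the user


def build_command(user, original_command, now=None, log_dir=LOG_DIR):
    """Return (argv for script(1), path of the session log it will write)."""
    stamp = time.strftime("%Y%m%dT%H%M%S", time.gmtime(now))
    log_path = os.path.join(log_dir, f"{user}-{stamp}-{os.getpid()}.log")
    # Run what the client asked for, or fall back to the user's shell.
    target = original_command or os.environ.get("SHELL", "/bin/sh")
    # -q: quiet, -f: flush after each write, -c: command to run under script.
    return ["script", "-q", "-f", "-c", target, log_path], log_path


def main():
    user = os.environ.get("USER", "unknown")
    # sshd puts the client's requested command in SSH_ORIGINAL_COMMAND.
    argv, _ = build_command(user, os.environ.get("SSH_ORIGINAL_COMMAND"))
    os.execvp(argv[0], argv)  # replace this process with script(1)


# Only take over the session when actually started by sshd.
if __name__ == "__main__" and "SSH_CONNECTION" in os.environ:
    main()
```

Anything the admin then does from the jump host, including the onward ssh to the destination, ends up in the per-session log. Keep in mind that logs written with the user's own privileges can be tampered with, so shipping them off-host is worth considering.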
For Windows, check out the pretty software from ObserveIT, which can do host-based monitoring of interactive sessions.
Given that you seem to have already had some face-blowing-up going on, I strongly encourage you to foster a culture of responsibility about this (a "soft" control / policies). Some admins do behave like cowboys, but surely most understand that undocumented, unannounced changes lead to problems. Establish work windows, production blackouts, change notifications, etc.
These are simply smart practices, which both the admins and your customers will come to appreciate: the admins because they'll find out more easily when they've shot themselves in the foot, and the customers because they'll feel more aware of what's going on.
Solution 2:
Record everything on the command line (CentOS / Fedora / Ubuntu)
How to keep a detailed audit trail of what’s being done on your Linux systems
Solution 3:
Manage your sysadmins. As Jeff said, try an illustrative story:
At my last job we had X thousand users on Y hundred devices and Z dozen sites. Some bright spark decided to change the configs on all Y hundred devices and reboot them remotely to apply the configs.
One day later, the config started failing. He ended up having to drive to every site to reset the devices using a serial cable and a laptop. He learnt his lesson and never did it again. Now he's a manager and makes sure his sysadmins don't repeat his mistakes.
Or whatever suits.
(The above is an exaggerated version of my own experience with a remote firmware update on a switch; we had to bring the switch back from the site and re-flash it using a serial cable.)
Solution 4:
I have more or less the same requirements and quickly came across rancid to track changes on my network devices. I liked how simple yet powerful rancid is, so I went looking for the same kind of tool to track configuration changes on Windows servers. Since no such tool existed, I gave it a try and created ranwinconf.
I can say it does the job: it notifies me by mail when an important part of the configuration has been modified on any of the 200+ Windows servers I'm responsible for.
Hope it can be helpful to other people too!