System administration standard [closed]
I have been given the responsibility to manage a team of 4 system admins. They are managing 70+ servers. They don't yet have written processes/procedures/practices. I don't know much about system administration. Is there a standard which we can follow to standardize our work or choose best practices?
I'd endorse what others have said about not jumping in and laying down the law. You say the team, right now, is managing 70+ servers, so my first question is: how well are they doing? Is there lots of unscheduled downtime, working-day outages, constant scrambling to fix stuff just before it explodes? Or are they doing a pretty good job from a service-delivery standpoint, with only the occasional unforeseeable disaster of the sort that happens to us all to mar the peace?
If it's the latter, then you've got yourself a good team which seems to know what it's doing, and not trying to fix what isn't broken is an important part of not putting your team's backs up.
If it's the former, you may still have a good team; good teams can flounder because of a lack of support and engagement from the business (no budget for new kit, no agreement on compensation for the midnight work that would be required to upgrade things without working-day outages, no clear agreement on SLAs), or internal frictions, or a host of other non-technical reasons.
If it's the former, of course, you may just have an inadequate team.
The right response varies wildly across these three scenarios, and will also be affected by the personalities involved.
If you have a good team, working well, then let them lead you. What they're doing is right, but you need to understand what it is that they do, and how. They'll tell you, if you ask, and if you ask nicely they'll probably tell you in the most useful way, by writing it all down. Annual reviews and agreed-on goals are a good way of inserting more documentation into the working sysadmin's life. Essentially, what they're doing now is close to best practice, so try to get them to document it in a mutually useful way, rather than imposing anything on them.
If you have a good team working badly, they probably know what needs to change in order to become a good team working well. Listen to them, and work out how to convert their needs into justified requirements to be passed back to the business. You can add a lot of value as the bridge between tech world and business world, if you're prepared to listen to both sides, and say "no" to both sides in appropriate measure.
If you have a bad team working badly, then you have your work cut out for you. Identifying and documenting what's going wrong will be important in being able to discipline and, if necessary, replace people without exposing the business to liability. Identifying low-hanging fruit - things that could be easily nudged into going well - is important in getting some quick team-motivational and business-credibility wins, and baselining what's wrong is helpful here in being able to show that some quick improvements have been made.
I see I have rambled off-track somewhat, but I honestly believe that best-practice and standardisation exist to satisfy the needs of the business and the people to get the job done, rather than being some ivory pinnacle of documentation excellence standing alone in a vacuum, so my answer reflects my interconnected approach. I'm sorry if it's overlong!
Consider starting with ITIL: http://en.wikipedia.org/wiki/Information_Technology_Infrastructure_Library
ITIL gives detailed descriptions of a number of important IT practices and provides comprehensive checklists, tasks and procedures that any IT organisation can tailor to its needs.
Don't expect to read an ITIL book and know everything, but it is a good place to start. Jumping in after reading ITIL and telling the sys admins "the new law" might get you some unhappy sys admins.
What I would suggest is sitting them down and discussing with them how best to improve the documentation, and how to cover time tracking and similar practices.