What routine maintenance tasks should be done for a data center room?
Beyond the work done "inside" equipment (administration, programming, network config), there are the physical equipment and the rooms that house it. These need routine care and feeding as well.
So, the question is, for a generic data center (insert server room, comm closet, etc.) with servers, racks, networking equipment, etc., what routine maintenance should be done with regard to the room itself?
Disclaimer: When working with A/C, electrical, or fire suppression systems, always use licensed professionals and follow local laws/codes. Know your limits.
Also, I have found this book (albeit old) to be a wealth of knowledge for sysadmins:
Sun Blueprints - Enterprise Data Center Design and Methodology - Rob Snevely
The book can be downloaded in its entirety here: http://java.coe.psu.ac.th/SunDocuments/SunBluePrints/edcdesign.pdf
The point of this question, though, isn't how to design a data center, but rather how to handle its routine maintenance.
AIR CONDITIONING/COOLING
- Verify temp and humidity levels as well as proper airflow (electronic monitoring/logging/alerting is a plus here) in multiple spots in the room, not just at the thermostat
- Have filters changed on a schedule
- Have an a/c tech perform scheduled maintenance per OEM recommendations
- make sure your cold/hot aisle layout still works as equipment comes and goes
- keep a log of maintenance along with any issues/notes for future reference
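As a rough illustration of the "multiple spots, with alerting" idea above, here is a minimal Python sketch that checks a set of readings against limits. The sensor labels, the sample values, and the thresholds are all assumptions made up for the example; substitute whatever your sensors report and whatever limits your equipment vendors specify.

```python
from dataclasses import dataclass

# Assumed limits for illustration only; use the values your equipment
# vendors and facility actually specify.
TEMP_RANGE_C = (18.0, 27.0)
HUMIDITY_RANGE_PCT = (20.0, 80.0)


@dataclass
class Reading:
    location: str        # hypothetical sensor label
    temp_c: float
    humidity_pct: float


def out_of_range(readings):
    """Return human-readable alerts for readings outside the limits."""
    alerts = []
    for r in readings:
        if not TEMP_RANGE_C[0] <= r.temp_c <= TEMP_RANGE_C[1]:
            alerts.append(f"{r.location}: temperature {r.temp_c:.1f} C out of range")
        if not HUMIDITY_RANGE_PCT[0] <= r.humidity_pct <= HUMIDITY_RANGE_PCT[1]:
            alerts.append(f"{r.location}: humidity {r.humidity_pct:.1f}% out of range")
    return alerts


# Made-up sample data: the thermostat looks fine while other spots do not.
readings = [
    Reading("thermostat", 21.5, 45.0),
    Reading("rack-12-top-rear", 31.2, 38.0),
    Reading("cold-aisle-floor", 19.0, 85.0),
]
alerts = out_of_range(readings)
for alert in alerts:
    print(alert)
```

The sample data is chosen to show why a single thermostat reading is not enough: the thermostat passes while a rack-top sensor and a floor-level sensor both trip an alert.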
ELECTRICAL
- have an electrician verify proper input/output/load/grounding
- IF APPLICABLE, schedule (and actually conduct) building power failover tests to both UPS battery and generator
- Have a site power analysis performed that checks the soundness of the wiring, the quality of the AC voltage, and the source of any power disturbances
FIRE SUPPRESSION
- have the fire suppression system tested per code requirements
- instruct anyone with access to the room on how the fire suppression works as well as how to operate any handheld fire suppression equipment (this should be done more than once)
6S/CLEANUP/VISUAL INSPECTION/LABELING/VISUAL INDICATORS
http://www.vitalentusa.com/learn/6s_article.php
- Do a visual walkthrough of the room (best done with more than one person), looking for things out of place. Using 6S methods, clean up the room. Put things in their proper place (tools, logbooks, documents, dvd/cds, tapes, loose equipment, etc.)
- Trash - never leave trash in a data center; empty it frequently. Boxes, extra/spare equipment, etc. should be kept in a separate room if possible or in a locked storage cabinet within the room.
- Contaminants - avoid eating/drinking in the data center. Contaminants such as bugs, hair, skin, dust, etc. will happen, so dry "Swiffering" or similar is recommended on a weekly basis. Do not use a wet mop.
- Labeling - keep labels up to date, concise, and understandable (to more than just you). Label EVERYTHING that makes sense to have a label. Equipment, cabling, outlets, A/C, etc. should carry useful labels, and those labels should be verified as correct and up to date.
- Visual Indicators - Alert lighting/LEDs, alarm panels, visual check logs, etc. should be easily viewable and up to date. LEDs/panels should be checked constantly; don't rely on software monitoring to be accurate/timely.
- FLOORING/WALLS/CEILING/LADDER-RACKS - check to make sure these are in good physical shape. Raised floor tiles should be checked to make sure the sub floor is sound and the tiles themselves are in proper condition with the right supports beneath. Walls and ceilings should be checked for any cracking/holes that could cause issues if not dealt with. Ladder racks should be inspected for safety.
- NEATNESS - Make sure equipment, cabling, etc. is neat and orderly. Think in terms of "tomorrow my data center will be showcased on Google's homepage". Would you be proud or humiliated?
PHYSICAL ACCESS
- Verify who has access to the room and adjust accordingly (proximity card or other electronic access methods are preferred over simple keys)
- Verify doors close properly and have a tight seal to keep the room pressure correct (especially important with fire suppression)
- Run scheduled reports (if possible) on room access
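If your access system can export its log, the access review above can be partly automated. The following is only a sketch: the CSV layout (`timestamp,badge_holder,door`), the names, and the `audit_access` helper are hypothetical, since every access control product exports something different.

```python
import csv
import io


def audit_access(authorized, log_rows):
    """Compare who entered against who is supposed to have access.

    Returns (unauthorized, unused): holders seen in the log but not on the
    authorized list, and authorized holders who never appear in the log.
    """
    seen = {row["badge_holder"] for row in log_rows}
    unauthorized = sorted(seen - authorized)
    unused = sorted(authorized - seen)
    return unauthorized, unused


# Hypothetical export format; adapt the column names to your system.
SAMPLE_LOG = """timestamp,badge_holder,door
2024-01-03T08:15:00,alice,dc-main
2024-01-03T09:02:00,bob,dc-main
2024-01-04T22:47:00,mallory,dc-main
"""

authorized = {"alice", "bob", "carol"}
rows = list(csv.DictReader(io.StringIO(SAMPLE_LOG)))
unauthorized, unused = audit_access(authorized, rows)
print("Entered without being on the list:", unauthorized)
print("On the list but never entered:", unused)
```

The second list is useful too: people who hold access but never use it are candidates for having that access removed at the next review.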
I'm sure there are others I didn't think of, so I'd love to hear more.