What level of documentation do you expect to be provided to you by developers?

Solution 1:

All of these things should be documented in detail, although where an operation is standard for the operating system, application server, web server, etc., you can probably assume the IT operations people already know how to do it.

Installation: document everything about how it is installed and configured, including how to tell if it is operating correctly.

Tell us about the architecture, especially the communication between the various solution components (e.g. range of ports - RPC mechanisms often use a range of ports - we need to know what the range is and when the application might run out of ports).

Patching: document anything specific to the application - what needs to be shut down before patching, and any follow up actions after patching (caches, indexes, proxies that may need to be cleared or rebuilt).

Maintenance: document what normal and abnormal operation looks like - what queues and other things should be monitored and what the normal range of each is.
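As a rough illustration of what "document the normal range" could look like in a form operations can act on, here is a minimal Python sketch. The metric names and ranges are hypothetical placeholders; the real values would have to come from the developers' documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NormalRange:
    """Documented normal operating range for one monitored value."""
    low: float
    high: float


# Hypothetical metrics and ranges, for illustration only.
DOCUMENTED_RANGES = {
    "inbound_queue_depth": NormalRange(0, 500),
    "worker_threads_busy": NormalRange(1, 64),
}


def classify(metric: str, value: float) -> str:
    """Return 'normal', 'abnormal', or 'undocumented' for a reading."""
    normal = DOCUMENTED_RANGES.get(metric)
    if normal is None:
        # Nobody told operations what to expect for this value.
        return "undocumented"
    return "normal" if normal.low <= value <= normal.high else "abnormal"
```

A reading that comes back as "undocumented" is itself a gap in the documentation worth reporting.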

Tell us how to manage the data - especially tables and files that grow without limit (e.g. log files and transaction histories). How should these be purged, and what is the impact of removing old entries (on reporting, etc.)?

Tell us how to carry out standard "business as usual"/in-life management actions - this might be adding or modifying user accounts, for example.

Tell us about any other regular management actions that might be required (e.g. which certificates are used and what to do when they expire).
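For the certificate case, a small check like the following can back up the documentation. This is only a sketch in Python: it assumes you already have the certificate's notAfter value as text (in the format shown in the docstring), and the function name is made up for illustration.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Whole days left before a certificate's notAfter time.

    `not_after` uses the text format found in certificate dumps,
    e.g. "Jan  5 09:34:43 2018 GMT". A negative result means the
    certificate has already expired.
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    return int((expires_at - now.timestamp()) // 86400)


if __name__ == "__main__":
    today = datetime.now(timezone.utc)
    print(days_until_expiry("Jan  5 09:34:43 2018 GMT", today))
```

The documentation still has to say which certificates exist, where they live, and what to do when the number gets small.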

For all changes tell us how to roll them back (not all changes are successful). And tell us that you've tested the rollback plans!

Diagnosis: document log file formats and locations and EVERY application error message that might turn up, stating what has gone wrong when that message appears and what might need to be changed to fix it. Never use the same error message for two different events.
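One way developers could keep themselves honest on the "never reuse a message" rule is to hold the messages in a catalog and check it for duplicates. The sketch below is Python, and the error codes, messages and fixes in it are invented for illustration.

```python
from collections import defaultdict

# Hypothetical catalog: code -> (message text, what went wrong, suggested fix).
ERROR_CATALOG = {
    "APP-1001": ("Cannot open database connection",
                 "The database host refused or timed out the connection.",
                 "Check the database is up and the connection settings are right."),
    "APP-1002": ("Configuration file not readable",
                 "The process lacks read permission on its config file.",
                 "Fix file ownership or permissions and restart."),
}


def duplicate_messages(catalog):
    """Return message texts that are shared by more than one error code."""
    codes_by_message = defaultdict(list)
    for code, (message, _meaning, _fix) in catalog.items():
        codes_by_message[message].append(code)
    return {msg: codes for msg, codes in codes_by_message.items()
            if len(codes) > 1}
```

The same catalog can be exported as the error message reference that operations asks for, so the documentation and the code cannot drift apart.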

Shut down and start up: how, in what order, and any special procedures (e.g. letting servers drain connections before shutting them down).

I strongly disagree that the best way of doing this is to throw the application over the fence and let the IT people work out what is needed. The operational documentation (and, in general, the manageability features of the application) needs to be thought about up front.

Solution 2:

A follow-on question would be: what happens when (not if) the developers don't supply sufficient documentation?

I recommend that IT have the ability to enter defect reports against the software, using whatever defect tracking system the developers use. That way, if they didn't tell you, for instance, that the files in a particular folder need to be purged, and that only a week's worth should be kept, you could enter a defect saying "application fills the disk with log files", and suggest they work with IT on a documented technique for purging that folder.
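The "documented technique" that comes out of such a defect might end up as simple as the Python sketch below. The seven-day default mirrors the example above; the function name and the dry-run default are my own assumptions, and whether deleting by modification time is safe for a given application is exactly what the developers need to confirm.

```python
import time
from pathlib import Path
from typing import List, Optional


def purge_old_files(folder: Path, keep_days: int = 7,
                    now: Optional[float] = None,
                    dry_run: bool = True) -> List[Path]:
    """Delete regular files in `folder` last modified more than `keep_days` ago.

    Returns the files that were (or, with dry_run, would be) removed.
    Subdirectories are left alone.
    """
    now = time.time() if now is None else now
    cutoff = now - keep_days * 86400
    expired = sorted(p for p in folder.iterdir()
                     if p.is_file() and p.stat().st_mtime < cutoff)
    if not dry_run:
        for path in expired:
            path.unlink()
    return expired
```

Running it with the default `dry_run=True` first shows what would be deleted without touching anything.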

Solution 3:

My list of requirements for documentation would be (not in any specific order):

(documentation on:)

  • all command line switches
  • all exit states and return values
  • log messages (not so much the content, but an explanation of the fields if the format is not configurable)
  • configuration syntax
  • switches in the config files
  • memory usage
  • is it threaded or forked
  • what are the signals the server reacts on
    • are there any signals that don't restart the server but make it re-read the config?
    • how does it behave then? (does it wait for existing threads/processes to finish with the old config, does it kill them, ...)
  • what happens on unclean shutdown (especially if it is some kind of persistence service/server)
  • does it log through system-provided calls or with something it wrote itself (yuck for apache and its access log - I clearly prefer on-board tools for logging)
  • IPv4 and IPv6 ready if it's a network service
  • documentation on trunk and documentation on a specific version
    • nothing is as bad as configuring something for hours just to find out it will be ignored because the config option is only available in trunk
  • which config option is valid in which version (available since: v1.0, deprecated since: v1.2 or something alike)
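To make the last point concrete: if the "available since / deprecated since" information exists in a structured form, checking a config against the version you actually run becomes trivial. Below is a minimal Python sketch of that idea; the option names and version numbers are hypothetical.

```python
from typing import Dict, NamedTuple, Optional, Tuple

Version = Tuple[int, int]


class OptionInfo(NamedTuple):
    since: Version                        # first release that reads the option
    deprecated: Optional[Version] = None  # release that deprecated it, if any


# Hypothetical option names and versions, for illustration only.
OPTIONS: Dict[str, OptionInfo] = {
    "listen_address": OptionInfo(since=(1, 0)),
    "legacy_auth": OptionInfo(since=(1, 0), deprecated=(1, 2)),
    "worker_pool_size": OptionInfo(since=(1, 3)),
}


def option_status(name: str, running: Version) -> str:
    """Say how the `running` release treats a config option."""
    info = OPTIONS.get(name)
    if info is None:
        return "unknown"
    if running < info.since:
        return "ignored (not available yet)"
    if info.deprecated is not None and running >= info.deprecated:
        return "deprecated"
    return "valid"
```

With a table like this, `option_status("worker_pool_size", (1, 2))` would flag the trunk-only option before anyone spends hours configuring it.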

Documentation like this is an example of good documentation:

  • http://httpd.apache.org/docs/2.2/
  • http://www.postgresql.org/docs/8.3/static/index.html

I'd consider documentation like this to be full of fail:

  • http://tomcat.apache.org/tomcat-6.0-doc/index.html

The FreeBSD Handbook is also a great example of documentation, as is OpenBSD's approach: they kick out stuff that isn't properly documented.

EDIT: this list is by no means complete; it is just the basic stuff that immediately came to mind. Also, the documentation should be readable, not something that reads like someone threw up.

Solution 4:

In short, I expect the documentation I specify and contract for.

Too many times this critical detail is left out of an agreement. The end user expects it and, of course, wants it for free. Good developers will correct this oversight early in the process and set expectations, including a price and a time requirement.