Backup is not Archiving - But what requirements does archiving have?

There's a related thread on how to handle archiving. I'm asking how to find a company's requirements for the way archiving should be done, and also the requirements that need to be fulfilled so that archiving is actually done instead of just backups.

  • How long should data be kept
    • at least?
    • at most?
  • What about legal implications (I know this depends on your region)?
  • How do you ensure archiving from a policy/requirement point of view?
  • How do you ensure archives can be restored from a policy/requirement point of view?
  • Which data should be archived?
  • ...
  • ...

I'm looking for good reasoning on archiving and a framework of policies that can be adapted: not answers as such, but a set of questions that may lead to something that will still be valid in n months/years/decades (which of those applies is itself a question).


Backups are there to restore things to the way they were at a previous point in time. We do backups to restore service, and usually don't think much about the underlying data.

Archives are very data focused, so you really need to think about things like classification and the information life-cycle. Much like software development, a deep understanding of the business problem and the development of good specs are key to a successful (and affordable) implementation. Email is pretty easy. Forms are very easy: retention is usually easily discernible (1040 form for 10 years, MV-22 for 2 years, etc.). If your contract/procurement people are well organized, classifying that data is usually pretty easy. Random files are hard.

I work for a government agency, so our archives actually have five objectives -- objectives that often conflict and are largely out of our control. They are:

  1. Classify and preserve data for the purposes of historical record.
  2. Preserve business records to comply with applicable law and regulation. In our case we have all sorts of records which must be preserved for periods ranging from 3 years to 28 years. We also have records which must be kept forever.
  3. Respond to discovery requests related to litigation. (In our case requiring legal holds that have been around for as long as a decade)
  4. Respond to discovery requests related to internal audit, internal investigations or other investigations. (Lawful intercept, etc)
  5. Reduce the operating costs of the underlying IT service. If you archive all email after 90 days, you've managed to substantially reduce the footprint of your email systems, especially in large environments.
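
As a rough illustration of how objectives 2 and 3 interact, here is a minimal sketch of a purge-eligibility check. The record classes, the `RETENTION_YEARS` table, and the function names are all hypothetical; the 3-year and 28-year figures simply reuse the range mentioned above, and real retention periods have to come from your own legal and records people.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical retention schedule: record class -> years to keep.
# None means "keep forever". These values are placeholders, not legal advice.
RETENTION_YEARS = {
    "email": 3,
    "contract": 28,
    "historical": None,
}


@dataclass
class Record:
    record_class: str
    created: date
    legal_hold: bool = False  # set while litigation or an investigation is open


def purge_date(record: Record) -> Optional[date]:
    """Return the first day the record may be purged, or None if never."""
    years = RETENTION_YEARS[record.record_class]
    if years is None:
        return None
    target_year = record.created.year + years
    try:
        return record.created.replace(year=target_year)
    except ValueError:
        # Created on Feb 29 and the target year is not a leap year.
        return date(target_year, 3, 1)


def may_purge(record: Record, today: date) -> bool:
    """A legal hold always wins over the retention schedule."""
    if record.legal_hold:
        return False
    expiry = purge_date(record)
    return expiry is not None and today >= expiry
```

The point of the sketch is the ordering of the checks: a legal hold overrides the schedule, and "keep forever" classes never become eligible, so a purge job cannot delete something that discovery may later ask for.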

Let's talk about litigation and audit, since they are probably the most common reason to archive if you're getting started now. Now you need to think about other stuff:

  • Scope. Email? IM? Phones? File servers? SharePoint? Wiki? Desktops? Laptops? PDAs? If you need more than email and IM, you've just eliminated a whole ton of options.
  • Default retention time. If you get sued often, you'll want to purge as quickly as you are legally able. Otherwise, you'll want to strike a balance between due diligence and productivity of your workers.
  • IT Policy. Once you're archiving, PSTs must be banished. If anyone finds out that you possess information relevant to litigation on your PCs, that will reflect badly upon you.
  • Chargeback. You must present the legal folks with costs so they can weigh the value of the suit vs. the cost of compliance. Otherwise, you'll be keeping everything forever.
  • Cost avoidance. Some archive solutions facilitate disaster recovery and eliminate the need for backups. Hosted archives sometimes provide an emergency mail access point.
  • Special Requirements. If you have a presence in Europe, my understanding is that you need to use solutions like EMC Centera for WORM storage for some compliance purposes. That usually isn't needed in the US, but your industry may have a need for more exotic measures.
  • Production. If you get sued a lot, your legal team probably already has specialized tools like Concordance that they use to organize and present evidence. If you don't, you may want to use your archiving tools to provide some workflow capability for your attorneys or outside counsel. This is another area where exploring early will help reduce costs and get you a successful implementation.

Keep asking questions, you're on the right path -- there is no one answer to the archiving problem.


Rule #1: Ignore (politely) anyone that isn't paying for the archive.

All of the above answers are excellent but they leave out one thing: The important thing is "who pays for this?"

The department that pays for it has requirements: how fast data can be restored, length of time to hold the data, security requirements, etc. Other people might have "helpful advice" but if they aren't paying for it, it's just a suggestion.

Nobody wants archives. They have some external reason for needing them. Usually it is "customers" or "compliance" or a mix of both. If the answer is "customers" then some product manager should have an idea of what they need. If the answer is "compliance" then some manager should be able to explain what is required, and auditors can validate the requirements.

There should be give-and-take. After receiving the requirements, you should return with a proposal with a cost. The people needing the archives may need something less expensive and then you should be willing to negotiate (remove) features.

Sometimes the "who pays" question is reversed. Once I was at a company where the legal department required certain docs to be archived but we had to pay for the archiving. Therefore, we put together the proposal and asked them to sign off on it. We were paying for it, but only doing the minimum to "cover our butts".

Another time I was required to archive certain data as part of a lawsuit. The legal department needed to track all costs associated with the lawsuit (I think that if we won, we could get the plaintiff to pay our costs). They gave us a special cost center code to use to buy tapes, pay for shipping, boxes, new tape drives, labor, and so on. If they hadn't been willing to pay for our costs, we would never have done such a good job.