Should I ever DELETE (SQL and DB) anything?

I am curious, should I ever delete anything? Right now I am building a site (for myself) which allows you to subscribe to users which you then receive a message every time the user has upload content.

Or comments, if there is a thread and someone writes a direct comment to your comment you get a message saying so. Should I ever delete those or just simply hide it?

Each subscription has three (64bit) int. id, commentId, recipientId. You can find out who wrote to you looking at the comment table via commentId. If i dont use delete it will have a 4th int saying the status (show, hidden/delete).

Should I leave them or delete them? If I should delete them, then why? I can see maybe when there is personal user that you must delete on request but other then that should I ever delete?

I don't know which SQL DB I'll be using.

-edit-

Thanks guys. Right now I'll delete nothing except for things i can generate. Such as that subscription thing about.


Solution 1:

The company I work for offers software to people in certain regulated industries so generally I have the "never delete anything" attitude because if you delete anything you have lost the completeness of your audit trail. Instead mark the information as deleted (or move it to an archive version of the tables) and record who "deleted" it and when.

The only reasons to really delete stuff are

  • if you are running out of space (but disk is cheap these days)
  • for efficiency (but if your data structure is well indexed and not badly fragmented this will make little difference)
  • for legal reasons (if you are asked to remove someone's details you will most likely have to comply, depending on local data protection laws, or if the content itself infringes some law)

Your users may be thankful of nothing getting really deleted, if they accidentally delete something useful and you can get it back. And if an upset user who has previously provided valuable information to the site throws a hissy fit and deletes all their posts in vengeance you can retract the deletions easily.

One extra very important point: you shuold make it clear in your terms of service that information may not be really deleted when the user can not see it any more, and provide a route (if only "email [email protected] and ask for it to be done") for them to really delete data they have the right under relevant laws to ask to be deleted.

Solution 2:

Typically, today's modern disk sizes and IO performance means you don't have to delete records to save space or maintain performance. Usually a "record deleted" field on the record can mark the record as deleted (or as other status's) with an audit trail.

Some industries mandate that you never delete "transactional" data for regulatory reasons. You would already know if you need to do this. If there is any payment information you will typically need to keep the data (or make the data available) for 7 years (UK accounting law).

For other purposes, there is actually a good reason to physically delete data.

If it's not there, it's not discoverable.

The Freedom of Information Act (in the UK) states that if the data is discoverable then it is included in the scope for any search. This includes "soft deleted" records and historical backups.

For some systems, we ensure we PURGE old records and reuse / destroy old backup tapes / files after 'so many' months to ensure that it's not available for FOI requests. (Servicing an FOI request that goes back several years and requires restoring of hundreds of old mailboxes from archive backups is VERY costly).

This is different from OPERATIONAL backups. We keep backups so we can restore in the event of disaster. We also have a "Records Store" for both paper based and electronic media that MUST be kept, and we copy out emails and such to that store.