What is the best way to implement soft deletion?
Working on a project at the moment and we have to implement soft deletion for the majority of users (user roles). We decided to add an is_deleted='0'
field on each table in the database and set it to '1'
if particular user roles hit a delete button on a specific record.
For future maintenance now, each SELECT
query will need to ensure they do not include records where is_deleted='1'
.
Is there a better solution for implementing soft deletion?
Update: I should also note that we have an Audit database that tracks changes (field, old value, new value, time, user, ip) to all tables/fields within the Application database.
Solution 1:
I would lean towards a deleted_at
column that contains the datetime of when the deletion took place. Then you get a little bit of free metadata about the deletion. For your SELECT
just get rows WHERE deleted_at IS NULL
Solution 2:
You could perform all of your queries against a view that contains the WHERE IS_DELETED='0'
clause.
Solution 3:
Having is_deleted
column is a reasonably good approach.
If it is in Oracle, to further increase performance I'd recommend partitioning the table by creating a list partition on is_deleted
column.
Then deleted and non-deleted rows will physically be in different partitions, though for you it'll be transparent.
As a result, if you type a query like
SELECT * FROM table_name WHERE is_deleted = 1
then Oracle will perform the 'partition pruning' and only look into the appropriate partition. Internally a partition is a different table, but it is transparent for you as a user: you'll be able to select across the entire table no matter if it is partitioned or not. But Oracle will be able to query ONLY the partition it needs. For example, let's assume you have 1000 rows with is_deleted = 0
and 100000 rows with is_deleted = 1
, and you partition the table on is_deleted
. Now if you include condition
WHERE ... AND IS_DELETED=0
then Oracle will ONLY scan the partition with 1000 rows. If the table weren't partitioned, it would have to scan 101000 rows (both partitions).