Should you enforce constraints at the database level as well as the application level?
I've been reading the book "Enterprise Rails" by Dan Chak and it got me thinking: Do you feel that you should have data constraints at both the database level and the application level? Or do you feel similarly to opinionated frameworks like Ruby on Rails - the database is just a "dumb repository" for data, and all checks should be done in your application (I'm not trying to single out RoR here - I'm a huge fan of Rails myself, but I disagree with its approach to the database)?
Personally, I feel that you should have them both in order to make sure your database and application are well-secured. What I mean is that you should make use of not-null constraints, give your fields a length if known ( as opposed to leaving them all at nvarchar(255) ), have things such as Foreign Keys, Check Constraints and Triggers on your database, and then also enforce this through business logic rules in your application. IMO this makes your application robust through its user interface, and also secure against someone who might have direct access to the database.
The counter-argument I most often see is that it requires what amounts to duplicate logic; once at the database level, and once at the application level -- Let's say you have a check constraint to verify that a product's SKU is entered (i.e. it's length is greater than zero).
You would now need to also include validation methods in your business logic to make sure the value the user entered has a length greater than zero, and also possibly some client-side Javascript to catch the error as the user types data.
I for one do not see this as a bad thing - yes you have some duplicate logic, but the end result is the "database as fortress" mindset, since if you think about it your data is the single most important part of your application; after all, what good is your shiny new Web 2.0 application if the data can easily be corrupted and compromised?
What are your thoughts on this? Should the database be an impenetrable fortress like Fort Knox, or an open safe that's guarded by lasers? In other words, should you sacrifice some duplication of logic to ensure a secure data model, or leave everything to your application and use the database simply to store data?
In short : database should enforce constraints.
Why :
- Easier. For ex. to have a constraint set on a particular data column there is only one place to set it : the column itself. The data might come from various sources but the check is put where the data is finally put to rest.
- Integrity. Database should be responsible for the data it hosts. An inconsistent database is as good as no database.
- Flexibility. New UI development environments come all too frequently. If database puts up its hand to say that it will take care of the constraints , the front end development and functional testing are easier.
Yes, if you want to restrict what goes in the database. The layers should be distinct from each other as much as possible and your database shouldn't rely on another layer ensuring that it follows the rules.
There's no guarantee that a buggy (or malicious) "business logic" layer will not insert toxic data into your tables. Of course, if you can trust the other layers, you probably won't need it. But I work in a mainframe shop where the DBAs are always having to fix problems caused by the young Java whippersnappers rolling out their buggy code to production without adequate (any?) testing :-).
Database tables that are shared between different development areas (and that's all of them for us) should always protect themselves from errant data. When App A puts dodgy data into the table used by App B, it's not the App A developers that take the heat, it's the DBAs.
If you follow the Jeff Atwood school of a database is just a dumb data storage & retrieval system then you would put all the validation in the application layer.
However, I find that applications are like small children. Unchecked they will throw everything around the room. It will be up to the parents to clean up the mess. In this case it will be the DBAs doing the cleaning.
However, I think you need to be careful about using every database data integrity feature, just because it is there. Overloading your database with foreign key constraints and triggers might create more problems than you think. I tend to use foreign keys only on tables which are very closely related, such as a header/detail table pair. If you start adding foreign keys everywhere you can end up with an unmagageable database.
I rarely use triggers. I think they make a database very opaque. You issue a simple update/insert/delete command and strange things might happen. I guess there are two places where triggers are unavoidable:
When you don't have source code to the application writing to the database and you need to modify the behaviour. Triggers are your only option.
If you are performing CRUD operations on a view. Triggers are mandatory for the insert/update/delete operations.
I tend to perform basic validation in the app. This way the user is given immediate feedback that something is wrong. Complex validation that requires looking up related tables is probably best done in the database (as well as the simple validation that the app does). I would argue that some forms of validation are almost impossible to guarantee at the application level, without using complicated locking strategies.
If you have multiple applications, possibly written in different languages on different platforms, then the case for putting more of the validation into the database layer is strengthened. The liklihood of two or more applications, written by different programmers, performing identical validation is fairly remote. Best do it in one place.
The Jeff Atwoods of this world would suggest that you write a web service that all the apps use to communicate with. The web service performs the data validation. Doing this allows the database to remain a dumb storage container, thus enabling you to switch database engines. In reality you rarely change database engines (unless you started out with Microsoft Access!). If you are writing web services purely to centralise your data validation then I thnk you are going overboard.
If you're certain that you will never have another client application, you may get away with treating the database as a simple storage. However, if you'll ever have more than one client application obviously you will have to replicate constraints in all client applications, which is a bad idea. Remember that other clients include developer tools.
Also, by using the database as a "dumb repository", you will most likely end up with a less efficient application. The database can do a lot of stuff much more efficiently than your application can. Why not take advantage of that?
Typically there is always some duplication, and databases aren't just dumb repositories.
db
The database ensures integrity at a data level. Foreign key constraints, non null constraints, and size constraints are all fundamentally covered by the database.
You can't do everything in the db, but you can do a lot. Protect the data.
business layer
Moving up a level, you have the business logic. Typically this is your point of integration with other applications (web service, your own UI, etc). Here business logic in encoded into the application. Things like if a product has an end date of x, then it can't also occur in y, if y has a different end date.
It would be hard to describe that sort of rule in the database, so you don't. But your business logic layer still intercepts things that it knows are invalid. e.g. if description field isn't allowed to be null, the business logic should never send it to the database. It'll error regardless, but you try to intercept things at the point they are known to be bad.
it's also hard to express in a database other 'rules' like 'new users have a 1 year expiry date if they are from Arkensas, 2 years otherwise, unless they have 3 children and one of them is named Barry'. We can laugh at this example, but a seasoned programmer will tell you that business logic is one of the biggest oxymorons around.
ui
Moving up to the UI, the UI also regularly encodes business logic in screens. Forms and other pages will oftentimes exist in an invalid state, and it's the job of the UI to know the rules at least most of the time. Hopefully the UI defers logic to the business layer, but the business layer doesn't have a clue that field 1 is the due date and field 2 is the description.
The UI knows to search products with X if the user has already selected widget Y. The UI knows that description is required, and that the item count is > 0 and < 100 (In these examples a good UI will rely on the business layer to tell it, for example, the min and max, but the UI is still aware of the relationship)
In web UI's we also add client script that again duplicates logic in server code. We use client script to have a better user experience, but in the end don't trust anything that comes from the client (script can be turned off, form fields manually manipulated, etc)
So you can see that logic will be duplicated. You try to reduce duplication as much as possible, but realistically it's rarely possible given a nontrivial program.