Java Programming - Where should SQL statements be stored? [closed]
Usually, the more the application grows in terms of size and/or reusability, the greater the need to externalize/abstract the SQL statements.
Hardcoding (as static final constants) is the first step. Storing them in a file (properties/XML file) is the next step. Metadata-driven (as done by an ORM like Hibernate/JPA) is the last step.
Hardcoding has the disadvantage that your code is likely to become DB-specific and that you need to rewrite/rebuild/redistribute on every change. The advantage is that you have it in one place.
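To make the hardcoded variant concrete, here is a minimal sketch of a DAO that keeps its SQL in a static final constant. The class, table, and column names (UserDao, users, name, id) are made up for illustration and are not from any particular codebase.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical DAO: class, table, and column names are illustrative only.
final class UserDao {

    // The SQL lives in one place, but changing it means rebuilding the app.
    static final String SQL_FIND_NAME_BY_ID =
            "SELECT name FROM users WHERE id = ?";

    private final Connection connection;

    UserDao(Connection connection) {
        this.connection = connection;
    }

    // Returns the user's name, or null when no row matches.
    String findNameById(long id) throws SQLException {
        try (PreparedStatement statement =
                     connection.prepareStatement(SQL_FIND_NAME_BY_ID)) {
            statement.setLong(1, id); // binds the single "?" placeholder
            try (ResultSet resultSet = statement.executeQuery()) {
                return resultSet.next() ? resultSet.getString("name") : null;
            }
        }
    }
}
```

Any vendor-specific syntax in that constant is compiled straight into the application, which is where the DB-specific risk comes from.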
Storing them in a file has the disadvantage that it can become unmaintainable when the application grows. The advantage is that you don't need to rewrite/rebuild the app, unless you need to add an extra DAO method.
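The file-based variant could look roughly like the sketch below, which reads key=SQL pairs through java.util.Properties. SqlCatalog, the key user.findNameById, and the idea of a queries.properties file are assumptions for the example; the demo method feeds in an in-memory string so the snippet runs without a real file.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper: looks up SQL text by key from a properties source.
final class SqlCatalog {

    private final Properties statements = new Properties();

    // Loads key=SQL pairs. In a real app the Reader would wrap a file or
    // classpath resource such as a (hypothetical) queries.properties.
    SqlCatalog(Reader source) throws IOException {
        statements.load(source);
    }

    // Fails fast when a DAO asks for a statement that was never defined.
    String get(String key) {
        String sql = statements.getProperty(key);
        if (sql == null) {
            throw new IllegalArgumentException("No SQL defined for key: " + key);
        }
        return sql;
    }

    // Small demo that uses an in-memory string instead of a real file.
    static String demo() {
        String fileContents =
                "user.findNameById=SELECT name FROM users WHERE id = ?\n";
        try {
            return new SqlCatalog(new StringReader(fileContents))
                    .get("user.findNameById");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The DAO then asks for get("user.findNameById") instead of holding the literal, so the SQL text can change without a rebuild. A new DAO method still needs new code, and a large application can end up with hundreds of keys to keep track of.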
Metadata-driven has the disadvantage that your model code is very tightly coupled to the database model. For every change in the database model you'll need to rewrite/rebuild/redistribute code. The advantage is that it is very abstract and that you can easily switch DB servers without needing to change your model (but ask yourself: how often would a company switch DB servers? Likely no more than once every 3 years, if that).
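As a toy illustration of the metadata-driven idea (not Hibernate or JPA themselves, which are far more involved), the sketch below defines its own Table and Column annotations and derives a SELECT statement from them via reflection. Every name here is invented for the example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Home-made annotations standing in for ORM mapping metadata.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Table { String name(); }

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface Column { String name(); }

// Hypothetical model class: the mapping mirrors the database schema,
// which is exactly the tight coupling described above.
@Table(name = "users")
class User {
    @Column(name = "id") long id;
    @Column(name = "user_name") String name;
}

final class SelectBuilder {

    // Builds "SELECT <columns> FROM <table>" from the annotations alone.
    // Column order follows getDeclaredFields(), which the Java spec does
    // not guarantee to be declaration order.
    static String selectAll(Class<?> type) {
        Table table = type.getAnnotation(Table.class);
        if (table == null) {
            throw new IllegalArgumentException(type.getName() + " has no @Table");
        }
        List<String> columns = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            Column column = field.getAnnotation(Column.class);
            if (column != null) {
                columns.add(column.name());
            }
        }
        return "SELECT " + String.join(", ", columns) + " FROM " + table.name();
    }
}
```

No SQL string appears in the model code, yet renaming a column in the database still forces a change to the annotated class and a rebuild.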
I wouldn't call stored procedures a "good" solution for this. They have an entirely different purpose. Even then, your code would still depend on the DB/configuration used.
I don't know if this is optimal, but in my experience they end up hardcoded (i.e. String literals) in the DAO layer.
I don't think anyone will give you the pro/con breakdown you want, as it is a rather large question. So instead, here is what I've used in the past and what I will be using going forward.
I used to use SQL hardcoded in the DAL. I thought this was fine until the DBAs wanted to play with the SQL. Then you have to dig it out, format it, and fire it over to the DBAs, who will laugh at it and replace it all, but without the nice question marks, or with the question marks in the wrong order, and leave you to stick it back into the Java code.
We have also used an ORM, and while this is great for developers, our DBAs hated it as there is no SQL for them to laugh at. We also used an odd ORM (a custom one from a 3rd-party supplier) which had a habit of killing the database. I've used JPA since and it was great, but getting anything complicated past the DBAs with it is an uphill battle.
We now use stored procedures (with the call statement hardcoded). Now, the first thing everyone will complain about is that you are tied to the database. You are. However, how often have you changed databases? I know for a fact that we simply could not even attempt it, given the amount of other code dependent on it, plus retraining our DBAs, plus migrating the data. It would be a very expensive operation. However, if in your world changing DBs at the drop of a hat is required, SPs are likely out.
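For reference, "stored procedure with the call statement hardcoded" might look something like this in plain JDBC. The procedure name get_user_name and its one-IN, one-OUT parameter layout are assumptions for the sketch, not an actual schema.

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Types;

// Hypothetical wrapper around a stored procedure owned by the DBAs.
final class UserProcedures {

    // JDBC escape syntax for a procedure call; the procedure name and
    // parameter layout (IN id, OUT name) are assumed for illustration.
    static final String CALL_GET_USER_NAME = "{call get_user_name(?, ?)}";

    private final Connection connection;

    UserProcedures(Connection connection) {
        this.connection = connection;
    }

    String getUserName(long id) throws SQLException {
        try (CallableStatement call = connection.prepareCall(CALL_GET_USER_NAME)) {
            call.setLong(1, id);                         // IN parameter
            call.registerOutParameter(2, Types.VARCHAR); // OUT parameter
            call.execute();
            return call.getString(2);
        }
    }
}
```

The only SQL left in the Java code is the call itself. The query text lives in the database, where the DBAs can change it without touching the application, as long as the procedure signature stays the same.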
Going forward I would like to use stored procedures with code generation tools to create Java classes from Oracle packages.
Edit 2013-01-31: A few years and DBAs later, we now use Hibernate, going to SQL (stored procs in the DB) only when absolutely required. This, I think, is the best solution. 99% of the time the DBAs don't need to worry about the SQL, and for the 1% where they do, it is in a place they are already comfortable with.
By using an ORM (such as Hibernate) you will hopefully have no SQL statements to worry about. Performance is usually acceptable and you get vendor independence as well.
Should SQL code be considered “code” or “metadata”?
Code.
Should stored procedures be used only for performance optimization, or are they a legitimate abstraction of the database structure?
Stored procedures allow for reuse, including inside of other stored procedures. This means that you can make one trip to the database & have it execute supporting instructions - the least amount of traffic is ideal. ORM or sproc, the time on the wire going to the db & back is something you can't recoup.
ORM doesn't lend itself to optimization because of its abstraction. IME, ORM also means a lack of referential integrity, which makes a database difficult to report from. The complexity that was saved up front comes back when you need to get the data out in a workable fashion.
Is performance a key factor in the decision? What about vendor lock-in?
No, simplicity is. Vendor lock-in happens with the database as well - SQL is relatively standardized, but there are still vendor-specific ways of doing things.