Doesn't Linq to SQL miss the point? Aren't ORM-mappers (SubSonic, etc.) sub-optimal solutions?

I'd like the community's take on some thoughts I've had about Linq to SQL and other ORMs.

I like Linq to SQL and the idea of expressing data access logic (or CRUD operations in general) in your native development tongue rather than having to deal with the "impedance mismatch" between C# and SQL. For example, to return an ObjectDataSource-compatible list of Event instances for a business layer, we use:

return db.Events.Select(c => new EventData() { EventID = c.EventID, Title = c.Title });

If I were to implement this using old SQL-to-C# constructs, I'd have to create a Command class, add the EventID parameter (using a string to describe the "@EventID" argument), add the SQL query string to the Command class, execute the command, and then use (cast-type)nwReader["FieldName"] to pull each returned field value and assign it to a member of a newly created instance of my EventData class (yuck).

So, that is why people like Linq/SubSonic/etc. and I agree.

However, in the bigger picture I see a number of things that are wrong. My sense is that Microsoft also sees something wrong and that is why they are killing Linq to SQL and trying to move people to Linq to Entities. Only, I think that Microsoft is doubling-down on a bad bet.

So, what is wrong?

The problem is that there are architecture astronauts, especially at Microsoft, who look at Linq to SQL and realize that it is not a true data management tool: there are still many things you cannot do easily or comfortably in C#, and they aim to fix that. You see this manifested in the ambitions behind Linq to Entities, blog posts about the revolutionary nature of Linq and even the LinqPad challenge.

And the problem with that is that it assumes that SQL is the problem. That is, in order to reduce a mild discomfort (impedance mismatch between SQL and C#), Microsoft has proposed the equivalent of a space suit (full isolation) when a band-aid (Linq to SQL or something similar) would do just fine.

As far as I can see, developers are quite smart enough to master the relational model and then apply it intelligently in their development efforts. In fact, I would go one step further and say that Linq to SQL, SubSonic, etc. are already too complex: the learning curve isn't that much different from mastering SQL itself. Since, for the foreseeable future, developers must master SQL and the relational model, we're now faced with learning two query / CRUD languages. Worse yet, Linq is often difficult to test (you don't have a query window), removes us one layer from the real work we are doing (it generates SQL), and has very clumsy support (at best) for SQL constructs like date handling (e.g. DateDiff), "Having" and even "Group By".

What is the alternative? Personally, I don't need a different model for data access like Linq to Entities. I'd prefer to simply pop up a window in Visual Studio, enter and validate my SQL, and then press a button to generate or supplement a C# class to encapsulate the call. Since you already know SQL, wouldn't you prefer to just enter something like this:

Select EventID, Title From Events Where Location=@Location

and end up with an EventData class that A) contains the EventID and Title fields as properties and B) has a factory method that takes a 'Location' string as an argument and that generates a List<EventData>? You'd have to think carefully about the object model (the above example obviously doesn't deal with that) but the fundamental approach of still using SQL while eliminating the impedance mismatch appeals to me a great deal.
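To make the idea concrete, here is a rough sketch of what such a generated class might look like. Everything in it beyond the query text is an assumption for illustration: the method name GetByLocation is hypothetical, EventID is assumed to be an int column, Title is assumed to be non-null, the caller is assumed to pass an already-open connection, and the "@Location" parameter syntax is the SQL Server style and may differ for other providers. It uses only the provider-neutral System.Data interfaces.

```csharp
using System.Collections.Generic;
using System.Data;

// Hypothetical output of the imagined "generate a class from this SQL" button.
public class EventData
{
    // A) One property per column in the select list (types are assumed).
    public int EventID { get; set; }
    public string Title { get; set; } = "";

    // B) Factory method: one argument per query parameter, returns a list.
    // Assumes 'connection' is already open.
    public static List<EventData> GetByLocation(IDbConnection connection, string location)
    {
        var results = new List<EventData>();

        using (IDbCommand command = connection.CreateCommand())
        {
            // The SQL is kept exactly as the developer typed and validated it.
            command.CommandText =
                "Select EventID, Title From Events Where Location=@Location";

            // The parameter plumbing that the tool would write for you.
            IDbDataParameter parameter = command.CreateParameter();
            parameter.ParameterName = "@Location";
            parameter.Value = location;
            command.Parameters.Add(parameter);

            using (IDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    // The casts and string field names stay hidden in here.
                    results.Add(new EventData
                    {
                        EventID = (int)reader["EventID"],
                        Title = (string)reader["Title"]
                    });
                }
            }
        }

        return results;
    }
}
```

Calling code would then only see something like EventData.GetByLocation(connection, "Seattle"), with no string-keyed parameters or reader casts leaking out.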

The question is: am I wrong? Should Microsoft rewrite the SQL infrastructure so that you don't have to learn SQL / relational data management any more? Can they rewrite the SQL infrastructure in this way? Or do you think that a very thin layer on top of SQL to eliminate the pain of setting up parameters and accessing data fields is quite sufficient?

Update: I wanted to promote two links to the top because I think that they capture important aspects of what I am after. First, CodeMonkey points out an article entitled "The Vietnam of Computer Science." It takes a while to get started but is a very interesting read. Second, AnSGri points to one of Joel Spolsky's more prominent pieces: The Law of Leaky Abstractions. It isn't exactly on topic but it is close and is a great read.

Update 2: I've given the "answer" to ocdecio, although there are many great answers here and the choice of the "right" answer is purely subjective. In this case, his answer squared with what I think is truly the best practice given the current state of technology. This is an area that I fully expect to evolve, however, so things may well change. I'd like to thank everyone who contributed; I've upvoted everyone who I think gave a thoughtful answer.


Solution 1:

Let me preface this by saying that I am a dyed-in-the-wool database guy.

As a gross over-generalization: Developers don't know SQL. Developers don't really want to know SQL. They can write it, they can design tables, but it makes them feel icky. They tend to do stupid things when the necessary query is more than a simple join. Not because the developers are stupid, but because they can't be bothered. They like living in a world where they only have to deal with one concept space; moving from objects to tables and back is a context switch whose price they don't like paying.

This doesn't mean they are bad, or wrong; it means there is an opportunity for improvement. If your customers (in this case, developers using your framework) don't like SQL and tables -- give them a layer of abstraction that lets them get away without dealing with the underlying mess.

It's the same logic that makes garbage collection / automated memory management a big hit. Yes, developers can deal with it; yes, they can write code that is better optimized without it; but not having to deal with it makes them happier and more productive.

Solution 2:

I think the popularity of ORMs has been spawned by developers developing data layers and writing the same CRUD code over and over again application after application. ORMs are just another tool/technology that lets developers spend less time writing the same SQL statements over and over and concentrate on the logic of the application instead (hopefully).

Solution 3:

For at least 6 years I have been using my own ORM that is based on a very simple concept: projection. Each table is projected into a class, and SQL is generated on the fly based on the class definition. It still requires me to know SQL, but it takes care of the 90% of cases that are simple CRUD, I have never had to manage connections and the like, and it works for the major DB vendors.
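As a minimal sketch of the projection idea (not the actual ORM described above, whose code isn't shown here), SQL text can be derived from a class definition with reflection. The Event class, the Projection helper, and the conventions that the table name equals the class name and each column name equals a property name are all assumptions made for illustration. Properties are sorted by name because reflection does not guarantee an order.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical class standing in for the projection of an "Event" table.
public class Event
{
    public int EventID { get; set; }
    public string Title { get; set; } = "";
    public string Location { get; set; } = "";
}

public static class Projection
{
    // Column names are taken from the public instance properties of T,
    // sorted ordinally so the generated SQL is deterministic.
    private static string[] Columns<T>()
    {
        return typeof(T)
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Select(p => p.Name)
            .OrderBy(name => name, StringComparer.Ordinal)
            .ToArray();
    }

    // Builds "SELECT <columns> FROM <class name>".
    public static string SelectSql<T>()
    {
        return "SELECT " + string.Join(", ", Columns<T>())
             + " FROM " + typeof(T).Name;
    }

    // Builds a parameterized INSERT with one @parameter per column.
    public static string InsertSql<T>()
    {
        string[] columns = Columns<T>();
        return "INSERT INTO " + typeof(T).Name
             + " (" + string.Join(", ", columns) + ")"
             + " VALUES (" + string.Join(", ", columns.Select(c => "@" + c)) + ")";
    }
}
```

With these assumptions, Projection.SelectSql<Event>() yields "SELECT EventID, Location, Title FROM Event". A real implementation would also need key handling, type mapping and vendor-specific dialects, which is where knowing SQL still matters.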

I'm happy with what I have and didn't find anything worth dropping it for.

Solution 4:

IMHO, OR/M is not only about 'abstracting the SQL away' or hiding the SQL, or enabling multi-DBMS support.

It enables you to put more focus on your problem domain, since you have to spend less time writing the boring CRUD SQL queries. On the other hand, a good OR/M should still let you write SQL queries yourself when that is necessary.

An OR/M can be a powerful tool if you use it properly; it can take care of lazy loading, polymorphic queries / associations ...
Don't get me wrong; there's nothing wrong with plain SQL, but if you have to take care of translating your (well-thought-out and normalized) relational model to an expressive OO/domain model yourself, then I think you're spending way too much time doing plumbing.

Using an OR/M also does not mean that you, as a developer, need no knowledge of SQL. The contrary is true imho.
Knowing SQL, and knowing how to write an efficient SQL query, is imho what enables you to use an OR/M properly.

I must also admit that I'm writing this with NHibernate in mind. This is the OR/M that I'm using atm, and I haven't used Linq to SQL or Linq to Entities (yet).