.NET ORM Comparison [closed]

I was talking with someone about the Entity Framework. I'm not really into it yet, but I'd like to learn it. However, I'm still unsure whether I should. I've heard a lot of people say you shouldn't use the Entity Framework, but I haven't heard any arguments for why that is.

So my question is: what are the pros and cons of using the Entity Framework compared to other products, such as

  • NHibernate
  • DataObjects.Net
  • etc.

In terms of ease of use, testability, semantics...

I know there are some duplicate questions about this, but they are all rather outdated (2008, 2009) and, to be honest, the arguments in them are also lacking. I know Entity Framework 4.0 is available, and I haven't found a good (complete) comparison yet.


Answers

Some of the nice people here have answered my question by explaining some details of the different frameworks. I thought it would be good to list them here for future reference.

  • J. Tihon has made an excellent post explaining how to make EF work when you need more extensibility.
  • Diego Mijelshon has created an answer with some of the pitfalls of EF and how NHibernate solves them.

Solution 1:

Since J. Tihon did a great job of explaining EF features, I'll just list the areas where NHibernate runs circles around EF:

  • Caching
    • EF has nothing out of the box; there's just an unsupported sample
    • NH has complete caching support, including DB-based invalidation. It's also extensible and provider-based, meaning it works with different types of local and distributed caches
  • Batching
    • EF has none
    • NH has extensive support for lazy-loading groups of entities or collections at once (in any DB), and for persisting changes in the same way (Oracle and SQL Server). There are also MultiQueries and Future Queries, which let you group arbitrary queries so they are sent in one roundtrip.
  • User types
    • EF has no extensibility at all. It doesn't even support Enum properties
    • No type mappings are hardcoded in NH. You can extend it to support any value types you can create, modify the way existing types are mapped, etc
  • Collection support
    • EF supports only simple collections of entities. Many-to-many always uses a composite key
    • NH supports collections of entities, value types, component types, and also indexed collections and dictionaries (where both the key and the value can be of any type). Many-to-many collections with their own key are supported (idbag)
  • Logging
    • EF has no logging out of the box. There's the same unsupported sample listed above
    • NH has extensive logging, allowing you to debug issues easily. It uses log4net by default, but you can use any logging framework you want
  • Querying
    • EF has LINQ as the main query language. LINQ has a high impedance when mapping to relational databases. EF's provider does not support using entities as parameters; you always have to use Ids. There's also a query language that's poorly documented
    • NH has LINQ (not as complete as EF's, though), HQL, QueryOver and Criteria.
  • Event system and interceptors
    • EF has almost nothing
    • NH has a powerful event system that allows you to extend or replace its behavior at any point of the session lifecycle: loading objects, persisting changes, flushing, etc.

I think extensibility is the main selling point. Every aspect of NH is correctly decoupled from the rest, using interfaces and base classes that you can extend whenever you need to, and exposed in configuration options.

EF follows the usual MS pattern of making things closed by default, and we'll see what's extensible later.

Solution 2:

I spent a huge amount of time bending the Entity Framework to my needs and can therefore say that it fulfills most of the requirements you would have for an ORM. But some aspects are far too complex, as other ORMs have shown that they can be made easier.

For example, getting started with the Entity Framework is fairly easy, since you can just fire up the Designer in Visual Studio and have a working ORM in a matter of minutes. But you end up with entity classes tied to the ObjectContext created by the designer (this can be avoided using a custom T4 template). This is not necessarily a bad thing, but it's the kind of Microsoft "Getting Started" approach that you don't want to use in a real application.

But if you dive deeper into the Entity Framework, you can see how to avoid most of its pitfalls. The Designer generates an EDMX file, which (if you look at it in an XML editor) is nothing more than a combination of the three main aspects of an ORM: the physical storage (your database), the conceptual model (your entity classes) and the mapping between the two. The custom build action applied to .edmx files in Visual Studio splits those three parts into separate files and adds them to the assembly as embedded resources. When an ObjectContext is created, the paths to those three files are given in the connection string (which always looks a bit confusing to me). What you can actually do here is do all of this yourself: write the storage schema, conceptual model and mapping in an XML editor (much like NHibernate) and embed them in the assembly containing your model.

The Entity Framework base class "ObjectContext" can then be constructed from those three files (it takes an EntityConnection, which can be built from a MetadataWorkspace), but the point is that you have full control over how the ObjectContext gets created. This opens the door to a lot of functionality you might not expect from the Entity Framework. For example, you can embed multiple SSDL storage schemas in the same assembly, each matching a specific database type (I usually add one for SQL Server and one for SQL Server CE 4.0), and create a constructor overload that chooses the appropriate storage schema for a specific kind of DbConnection.
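The schema selection described above could be sketched roughly as follows. This is only an illustration: the `StorageSchemas` class and the `Catalog.*` resource names are made up, and the sketch only composes the `res://*/` metadata string (conceptual model, storage schema and mapping separated by `|`), leaving the construction of the actual ObjectContext out.

```csharp
using System;
using System.Collections.Generic;
using System.Data.Common;

public static class StorageSchemas
{
    // Hypothetical embedded resource names; only the res://*/ prefix and
    // the '|' separator follow the EF metadata convention.
    private const string Conceptual = "res://*/Catalog.csdl";
    private const string Mapping = "res://*/Catalog.msl";

    // One embedded SSDL per supported connection type, keyed by type name
    // so that this sketch needs no reference to the provider assemblies.
    private static readonly Dictionary<string, string> StorageByConnectionType =
        new Dictionary<string, string>
        {
            { "System.Data.SqlClient.SqlConnection", "res://*/Catalog.SqlServer.ssdl" },
            { "System.Data.SqlServerCe.SqlCeConnection", "res://*/Catalog.SqlServerCe.ssdl" },
        };

    // Returns the metadata string for the given connection type name.
    public static string MetadataFor(string connectionTypeName)
    {
        string storage;
        if (!StorageByConnectionType.TryGetValue(connectionTypeName, out storage))
            throw new NotSupportedException("No storage schema embedded for " + connectionTypeName);
        return Conceptual + "|" + storage + "|" + Mapping;
    }

    // Convenience overload: pick the schema from the runtime type of the connection.
    public static string MetadataFor(DbConnection connection)
    {
        return MetadataFor(connection.GetType().FullName);
    }
}
```

A constructor overload on your own ObjectContext could call `StorageSchemas.MetadataFor(connection)` and hand the result to whatever code loads the metadata.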

Since you now have your own ObjectContext implementation, you can implement various interfaces on it, such as your own IRepository. But since I like the ObjectContext approach, I create something like:

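using System.Linq;

// A catalog exposes one queryable set per entity type.
interface ICatalog
{
    IEntitySet<Article> Articles { get; }   // property name is illustrative
    void Save();
}

// A queryable set that also supports adding and removing entities.
interface IEntitySet<T> : IQueryable<T>
{
    void Add(T entity);
    void Remove(T entity);
}

class EntityFrameworkCatalog : ICatalog
{
    ...
}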

But creating a Repository is really easy if you have an Entity Framework ObjectContext, plus you get an IQueryable. With this in place you can avoid strong class coupling between your services and the ORM, and completely mock out the Entity Framework in tests. Also, when testing your Entity Framework implementation, you can use a SQL Server CE database during unit tests to ensure that your mappings are fine (usually the difference between the storage schema for CE and the full-blown SQL Server is just a few data types). So you can actually test all behaviors of your Entity Framework implementation just fine.

This makes Entity Framework play nicely with modern software concepts, but it doesn't force such practices on you, which makes getting started easier.
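To illustrate the "mock out the Entity Framework" point, here is a minimal in-memory stand-in for the catalog. It is a sketch under assumptions: the interfaces are repeated so the block compiles on its own, and `Articles`, `InMemoryEntitySet`, `InMemoryCatalog`, `SaveCount` and the `Article` members are illustrative names, not part of EF or of the original answer.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Article
{
    public int Id { get; set; }
    public string Title { get; set; }
}

public interface IEntitySet<T> : IQueryable<T>
{
    void Add(T entity);
    void Remove(T entity);
}

public interface ICatalog
{
    IEntitySet<Article> Articles { get; }
    void Save();
}

// Backs the set with a List<T> and delegates all querying to LINQ to Objects.
public sealed class InMemoryEntitySet<T> : IEntitySet<T>
{
    private readonly List<T> items = new List<T>();
    private readonly IQueryable<T> query;

    public InMemoryEntitySet()
    {
        // The queryable wraps the live list, so later Add/Remove calls are visible.
        query = items.AsQueryable();
    }

    public void Add(T entity) { items.Add(entity); }
    public void Remove(T entity) { items.Remove(entity); }

    public Type ElementType { get { return query.ElementType; } }
    public Expression Expression { get { return query.Expression; } }
    public IQueryProvider Provider { get { return query.Provider; } }

    public IEnumerator<T> GetEnumerator() { return items.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

// Test double for ICatalog: nothing is persisted, Save() is only counted.
public sealed class InMemoryCatalog : ICatalog
{
    private readonly InMemoryEntitySet<Article> articles = new InMemoryEntitySet<Article>();

    public IEntitySet<Article> Articles { get { return articles; } }
    public int SaveCount { get; private set; }

    public void Save() { SaveCount++; }
}
```

A service that only depends on ICatalog can be handed an `InMemoryCatalog` in unit tests, and a test can assert on `SaveCount` or query `Articles` with ordinary LINQ. Keep in mind that LINQ to Objects is more permissive than EF's provider, so a query that passes against this double can still fail against the real database.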

Now to the complex bits: The Entity Framework supports only a small set of CLR types, basically the primitive ones such as ints, strings and byte arrays. It also provides some level of complex types, which follow the same rules. But what if you have a complex entity property, such as a DOM representation of a document, that you would like to have serialized to XML in the database? As far as I know, NHibernate provides a feature called IUserType, which allows you to define such a mapping. In Entity Framework this gets much more complicated, but it's still pretty in its own way. The conceptual model allows you to include assembly-internal complex types, as long as you tell the ObjectContext about them (ObjectContext.CreateProxyTypes(Type[])). So you can create a wrapper for your original type that is known only to the Entity Framework, like so:

class Document : IXmlSerializable { /* GetSchema, ReadXml and WriteXml omitted */ }

class Article
{
    public virtual Document Content { get; set; }
}

// Known only to the Entity Framework: exposes the document as a mappable string.
internal class EntityFrameworkDocument : Document
{
    public string Xml
    {
        get
        {
            // Use XmlSerializer to generate the XML string for this instance.
        }
        set
        {
            // Use XmlSerializer to read the XML string into this instance.
        }
    }
}

Although EF can now return those serialized documents from storage, writing them requires you to intercept the storing of an Article and replace a plain Document with an EntityFrameworkDocument, to ensure that EF can serialize it. I'm sure other ORMs do this pretty easily, and it gets worse: currently there is no way to do the same with the System.Uri class (which is immutable, but would otherwise work) or with an Enum. Apart from those restrictions, you can fit EF to most of your needs. But you will spend a lot of time on it (like I did).

Since my experience with other ORMs is limited, I would summarize:

  • Entity Framework is in the GAC, even in the Client Profile
  • Entity Framework can be customized to represent even complex entity types (including some self-referencing many-to-many relations, for example, or the XML serialization above)
  • It can be "abstracted" away, so you can stick to IRepository etc.
  • IQueryable implementation (although it's not as complete as that of DataObjects.Net)
  • It only requires System.Data and System.Data.Entity. You can even include multiple storage schemas for other providers, which would normally require a reference to each provider; if you stick to DbConnection, you can just do this:

    ICatalog Create(DbConnection connection, string storageSchemaPath) { ... }

    ICatalog CreateMySql(DbConnection mySqlConnection)
    {
        return Create(mySqlConnection, "res://Assembly/Path.To.Embedded.MySql.Storage.ssdl");
    }

Edit: I recently found out that if your entities and your "catalog" implementation are in the same assembly, you can use internal properties for the XML serialization. So instead of deriving an internal EntityFrameworkDocument from Document, you can add an internal property called Xml to the Document class itself. This still only applies if you have full control over your entities, but it removes the need to intercept changes to the catalog to make sure that your derived class is used. The CSDL looks the same; EF just allows the mapped property to be internal. I still have to verify that this works in Medium-Trust environments.
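A rough sketch of that internal-property variant, with the EF mapping itself left out. The `Body` property is a hypothetical stand-in for the real document content, and XElement is used here only to keep the round trip short; the original answer mentions XmlSerializer instead.

```csharp
using System.Xml.Linq;

public class Document
{
    // Hypothetical payload; a real Document would carry a richer DOM.
    public string Body { get; set; }

    // Internal, so only code in the same assembly (including the mapping
    // layer) sees the serialized form. This is the property EF would map.
    internal string Xml
    {
        get { return new XElement("Document", Body ?? string.Empty).ToString(); }
        set { Body = XElement.Parse(value).Value; }
    }
}

public class Article
{
    // Callers keep working with the plain Document type.
    public virtual Document Content { get; set; }
}
```

Because the property lives on Document itself, no derived EntityFrameworkDocument has to be swapped in before saving.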