Why use 'virtual' for class properties in Entity Framework model definitions?

In the following blog: http://weblogs.asp.net/scottgu/archive/2010/07/16/code-first-development-with-entity-framework-4.aspx

The blog contains the following code sample:

public class Dinner
{
   public int DinnerID { get; set; }
   public string Title { get; set; }
   public DateTime EventDate { get; set; }
   public string Address { get; set; }
   public string HostedBy { get; set; }
   public virtual ICollection<RSVP> RSVPs { get; set; }
}

public class RSVP
{
   public int RsvpID { get; set; }
   public int DinnerID { get; set; }
   public string AttendeeEmail { get; set; }
   public virtual Dinner Dinner { get; set; }
}

What is the purpose of using virtual when defining a property in a class? What effect does it have?

It allows the Entity Framework to create a proxy around the virtual property so that the property can support lazy loading and more efficient change tracking. See What effect(s) can the virtual keyword have in Entity Framework 4.1 POCO Code First? for a more thorough discussion.

Edit to clarify "create a proxy around": By "create a proxy around", I'm referring specifically to what the Entity Framework does. The Entity Framework requires your navigation properties to be marked as virtual so that lazy loading and efficient change tracking are supported. See Requirements for Creating POCO Proxies.
The Entity Framework uses inheritance to support this functionality, which is why it requires certain properties to be marked virtual in your base class POCOs. It literally creates new types that derive from your POCO types. So your POCO is acting as a base type for the Entity Framework's dynamically created subclasses. That's what I meant by "create a proxy around".

The dynamically created subclasses that the Entity Framework creates become apparent when using the Entity Framework at runtime, not at static compilation time. And only if you enable the Entity Framework's lazy loading or change tracking features. If you opt to never use the lazy loading or change tracking features of the Entity Framework (which is not the default), then you needn't declare any of your navigation properties as virtual. You are then responsible for loading those navigation properties yourself, either using what the Entity Framework refers to as "eager loading", or manually retrieving related types across multiple database queries. You can and should use lazy loading and change tracking features for your navigation properties in many scenarios though.

If you were to create a standalone class and mark properties as virtual, and simply construct and use instances of those classes in your own application, completely outside of the scope of the Entity Framework, then your virtual properties wouldn't gain you anything on their own.

Edit to describe why properties would be marked as virtual

Properties such as:

 public ICollection<RSVP> RSVPs { get; set; }

Are not fields and should not be thought of as such. These are called getters and setters, and at compilation time, they are converted into methods.

//Internally the code looks more like this:
public ICollection<RSVP> get_RSVPs()
{
    return _RSVPs;
}

public void set_RSVPs(RSVP value)
{
    _RSVPs = value;
}

private RSVP _RSVPs;

That's why they're marked as virtual for use in the Entity Framework; it allows the dynamically created classes to override the internally generated get and set functions. If your navigation property getter/setters are working for you in your Entity Framework usage, try revising them to just properties, recompile, and see if the Entity Framework is able to still function properly:

 public virtual ICollection<RSVP> RSVPs;

The virtual keyword in C# enables a method or property to be overridden by child classes. For more information please refer to the MSDN documentation on the 'virtual' keyword

_{UPDATE: This doesn't answer the question as currently asked, but I'll leave it here for anyone looking for a simple answer to the original, non-descriptive question asked.}

I understand the OPs frustration, this usage of virtual is not for the templated abstraction that the defacto virtual modifier is effective for.

If any are still struggling with this, I would offer my view point, as I try to keep the solutions simple and the jargon to a minimum:

Entity Framework in a simple piece does utilize lazy loading, which is the equivalent of prepping something for future execution. That fits the 'virtual' modifier, but there is more to this.

In Entity Framework, using a virtual navigation property allows you to denote it as the equivalent of a nullable Foreign Key in SQL. You do not HAVE to eagerly join every keyed table when performing a query, but when you need the information -- it becomes demand-driven.

I also mentioned nullable because many navigation properties are not relevant at first. i.e. In a customer / Orders scenario, you do not have to wait until the moment an order is processed to create a customer. You can, but if you had a multi-stage process to achieve this, you might find the need to persist the customer data for later completion or for deployment to future orders. If all nav properties were implemented, you'd have to establish every Foreign Key and relational field on the save. That really just sets the data back into memory, which defeats the role of persistence.

So while it may seem cryptic in the actual execution at run time, I have found the best rule of thumb to use would be: if you are outputting data (reading into a View Model or Serializable Model) and need values before references, do not use virtual; If your scope is collecting data that may be incomplete or a need to search and not require every search parameter completed for a search, the code will make good use of reference, similar to using nullable value properties int? long?. Also, abstracting your business logic from your data collection until the need to inject it has many performance benefits, similar to instantiating an object and starting it at null. Entity Framework uses a lot of reflection and dynamics, which can degrade performance, and the need to have a flexible model that can scale to demand is critical to managing performance.

To me, that always made more sense than using overloaded tech jargon like proxies, delegates, handlers and such. Once you hit your third or fourth programming lang, it can get messy with these.

It’s quite common to define navigational properties in a model to be virtual. When a navigation property is defined as virtual, it can take advantage of certain Entity Framework functionality. The most common one is lazy loading.

Lazy loading is a nice feature of many ORMs because it allows you to dynamically access related data from a model. It will not unnecessarily fetch the related data until it is actually accessed, thus reducing the up-front querying of data from the database.

From book "ASP.NET MVC 5 with Bootstrap and Knockout.js"

In the context of EF, marking a property as virtual allows EF to use lazy loading to load it. For lazy loading to work EF has to create a proxy object that overrides your virtual properties with an implementation that loads the referenced entity when it is first accessed. If you don't mark the property as virtual then lazy loading won't work with it.

Why use 'virtual' for class properties in Entity Framework model definitions?

Related

Recent Posts