RESTful design: when to use sub-resources? [closed]

Solution 1:

A year later, I ended with the following compromise (for database rows that contain a unique identifier):

  1. Assign all resources a canonical URI at the root (e.g. /companies/{id} and /employees/{id}).
  2. If a resource cannot exist without another, it should be represented as its sub-resource; however, treat the operation as a search engine query. Meaning, instead of carrying out the operation immediately, simply return HTTP 307 ("Temporary redirect") pointing at the canonical URI. This will cause clients to repeat the operation against the canonical URI.
  3. Your specification document should only expose root resources that match your conceptual model (not dependent on implementation details). Implementation details might change (your rows might no longer be unique identifiable) but your conceptual model will remain intact. In the above example, you'd tell clients about /companies but not /employees.

This approach has the following benefits:

  1. It eliminates the need to do unnecessary database look-ups.
  2. It reduces the number of sanity-checks to one per request. At most, I have to check whether an employee belongs to a company, but I no longer have to do two validation checks for /companies/{companyId}/employees/{employeeId}/computers/{computerId}.
  3. It has a mixed impact on database scalability. On the one hand you are reducing lock contention by locking less tables, for a shorter period of time. But on the other hand, you are increasing the possibility of deadlocks because each root resource must use a different locking order. I have no idea whether this is a net gain or loss but I take comfort in the fact that database deadlocks cannot be prevented anyway and the resulting locking rules are simpler to understand and implement. When in doubt, opt for simplicity.
  4. Our conceptual model remains intact. By ensuring that the specification document only exposes our conceptual model, we are free to drop URIs containing implementation details in the future without breaking existing clients. Remember, nothing prevents you from exposing implementation details in intermediate URIs so long as your specification declares their structure as undefined.

Solution 2:

This is problematic because it's no longer obvious that a user belongs to a particular company.

Sometimes this may highlight a problem with your domain model. Why does a user belong to a company? If I change companies, am I whole new person? What if I work for two companies? Am I two different people?

If the answer is yes, then why not take some company-unique identifier to access a user?

e.g. username:

company/foo/user/bar

(where bar is my username that is unique within that specific company namespace)

If the answer is no, then why am I not a user (person) by myself, and the company/users collection merely points to me: <link rel="user" uri="/user/1" /> (note: employee seems to be more appropriate)

Now outside of your specific example, I think that resource-subresource relationships are more appropriate when it comes to use rather than ownership (and that's why you're struggling with the redundancy of identifying a company for a user that implicitly identifies a company).

What I mean by this is that users is actually a sub-resource of a company resource, because the use is to define the relationship between a company and its employees - another way of saying that is: you have to define a company before you can start hiring employees. Likewise, a user (person) has to be defined (born) before you can recruit them.

Solution 3:

Your rule to decide if a resource should be modeled as sub resource is valid. Your problem does not arise from a wrong conceptual model but you let leak your database model into your REST model.

From a conceptual view an employee if it can only exist within a company relationship is modeled as a composition. The employee could be thus only identified via the company. Now databases come into play and all employee rows get a unique identifier.

My advice is don't let the database model leak in your conceptional model because you're exposing infrastructure concerns to your API. For example what happens when you decide to switch to a document oriented database like MongoDB where you could model your employees as part of the company document and no longer has this artificial unique id? Would you want to change your API?

To answer your extra questions

How should I represent the fact that a resource to belongs to another?

Composition via sub resources, other associations via URL links.

How should I represent the fact that a resource cannot be identified without another?

Use both id values in your resource URL and make sure not to let your database leak into your API by checking if the "combination" exists.

What relationships are sub-resources meant and not meant to model?

Sub resources are well suited for compositions but more generally spoken to model that a resource cannot exist without the parent resource and always belongs to one parent resource. Your rule when a resource could not exist without another, it should be represented as its sub-resource is a good guidance for this decision.