Creating Classes in R: S3, S4, R5 (RC), or R6?

Solution 1:

It seems you are already aware of some of the definitions and uses for the various OOP types. I will give my opinion on when it is appropriate to use which.

  1. Use S3 classes for situations where both of the following apply: (a) your object is static and not self-modifying, and (b) you do not care about multi-argument method signatures, i.e., your method dispatches purely on its first argument, the S3 class of the object. Additionally, S3 classes are a good solution when you can live with these restrictions and want to overload many operators.

  2. Use S4 classes if your object is static and not self-modifying, but you care about multi-argument method signatures. In my experience, S4 OOP has always been more hassle than it is worth, although it "guarantees" type safety to some extent.

  3. Use reference classes if your object is self-modifying. Otherwise, you will have to define many replacement functions (e.g., some_method<-, which is called with the syntax some_method(obj) <- value). This is awkward and can be computationally slow, since R may create a full copy of the object on each call. R6 is a good substitute, although I have not found it necessary for my purposes.
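The three cases above can be sketched roughly as follows. This is only a minimal illustration under my own assumptions: the names `new_temperature`, `celsius<-`, `Circle`, `Square`, `describe_pair`, and `Counter` are hypothetical and made up for this example. Only base R and the `methods` package are used.

```r
library(methods)  # ships with R; provides S4 and reference classes

## 1. S3: dispatch on the class of the first argument only.
## (new_temperature is a hypothetical constructor.)
new_temperature <- function(celsius) {
  structure(list(celsius = celsius), class = "temperature")
}

# Overloading an operator takes a single function definition.
"+.temperature" <- function(e1, e2) {
  new_temperature(e1$celsius + e2$celsius)
}

# Without reference semantics, mutation needs a replacement function,
# used as celsius(x) <- value; x is rebound to the modified copy.
"celsius<-" <- function(x, value) {
  x$celsius <- value
  x
}

## 2. S4: dispatch on the classes of several arguments.
setClass("Circle", slots = c(r = "numeric"))
setClass("Square", slots = c(side = "numeric"))

setGeneric("describe_pair", function(a, b) standardGeneric("describe_pair"))
setMethod("describe_pair", signature("Circle", "Square"),
          function(a, b) "circle then square")
setMethod("describe_pair", signature("Square", "Circle"),
          function(a, b) "square then circle")

## 3. Reference class: the object modifies itself in place.
Counter <- setRefClass("Counter",
  fields = list(count = "numeric"),
  methods = list(
    increment = function() {
      count <<- count + 1  # updates the field on this object
      invisible(.self)
    }
  )
)

# Usage
temp_sum <- new_temperature(20) + new_temperature(5)
temp_new <- new_temperature(20)
celsius(temp_new) <- 30

pair <- describe_pair(new("Circle", r = 1), new("Square", side = 2))

cnt <- Counter$new(count = 0)
cnt$increment()
cnt$increment()
```

Note that `cnt` changes without being reassigned, whereas `temp_new` has to be rebound through the replacement function.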

Many people new to R think the language is confused: that there are so many OOP implementations because there was never any consensus.

This is incorrect.

Because R is a statistical language, most heterogeneous structures in it (i.e., things that should be object-like) end up being the result of a statistical algorithm: an lm, glmnet, gbm, etc. object. It usually suffices to bundle this information and provide the expected interfaces for summarizing it: print, summary, etc.
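As a rough sketch of that pattern, here is a made-up fitting routine whose result is a plain list tagged with an S3 class, plus print and summary methods for it. The names `fit_mean` and `mean_fit` are hypothetical, and a real modelling function would of course store much more.

```r
# Hypothetical fitting routine: bundles its results in a list tagged
# with an S3 class, in the same spirit as the object returned by lm().
fit_mean <- function(x) {
  structure(
    list(
      estimate = mean(x),
      se       = sd(x) / sqrt(length(x)),
      n        = length(x)
    ),
    class = "mean_fit"
  )
}

# Called automatically when the object is printed at the console.
print.mean_fit <- function(x, ...) {
  cat("Mean estimate:", format(x$estimate), "from", x$n, "observations\n")
  invisible(x)
}

# Returns a plain data frame here; a fuller design might return
# its own "summary.mean_fit" class with a dedicated print method.
summary.mean_fit <- function(object, ...) {
  data.frame(estimate = object$estimate, std_error = object$se, n = object$n)
}

fit <- fit_mean(c(2, 4, 6, 8))
print(fit)
s <- summary(fit)
```

No class definition or inheritance setup is needed: setting the class attribute and naming the functions `print.mean_fit` and `summary.mean_fit` is enough for dispatch to find them.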

This approach, a legacy of R's origins as a statistical playground, frees the user from having to think about more advanced concepts like inheritance and allocation / de-allocation, and opens the playing field to more contributors. The trade-off is that it is slightly more annoying to create complex projects (e.g., web servers, text parsers, graphical interfaces) in R than in a typical object-oriented language like Ruby, but the lack of a uniform OOP system is balanced by ease of use.

One final way to think about it is that the different approaches are like phases of matter: solid, liquid, gas. Rather than treating all heterogeneous structures (i.e., OOP-like things) uniformly, it is better to accept that some fall more naturally under one system than another. If I am wrapping a simple list in an S3 class to display nicely with an overloaded print method, it would be rather silly to set up a whole reference class for this purpose.