Inheritance in R

Solution 1:

A first pass, not quite good enough

Here are two classes:

.A <- setClass("A", representation(a="integer"))
.B <- setClass("B", contains="A", representation(b="integer"))

The symbol .A is a class generator function (essentially a call to new()), and is a relatively new addition to the methods package.
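As a minimal sketch of that equivalence, the generator and an explicit call to new() should build the same object. The class name "Gen" and the symbol .Gen are made up for this illustration and are not part of the example that follows.

```r
library(methods)  # attached by default in interactive R; explicit for Rscript

## Hypothetical class used only for this illustration
.Gen <- setClass("Gen", representation(a="integer"))

g1 <- .Gen(a=1:3)          # via the generator returned by setClass()
g2 <- new("Gen", a=1:3)    # via new() directly

identical(g1, g2)                    # compare the two routes
is(.Gen, "classGeneratorFunction")   # the generator is itself an S4 object
```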

Here we write an initialize,A-method that uses callNextMethod() to invoke the next method for the class (the default constructor, initialize,ANY-method):

setMethod("initialize", "A", function(.Object, ..., a=integer()) {
    ## do work of initialization
    cat("A\n")
    callNextMethod(.Object, ..., a=a)
})

The slot argument a comes after ... in the signature so that unnamed arguments are never matched to a. This matters because, according to ?initialize, unnamed arguments are used to initialize base classes, not slots; the importance of this becomes apparent below. Similarly for "B":

setMethod("initialize", "B", function(.Object, ..., b=integer()) {
    cat("B\n")
    callNextMethod(.Object, ..., b=b)
})

and in action

> b <- .B(a=1:5, b=5:1)
B
A
> b
An object of class "B"
Slot "b":
[1] 5 4 3 2 1

Slot "a":
[1] 1 2 3 4 5

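Actually, this is not quite correct, because the default initialize is also a copy constructor: given an existing object and some named slots, it returns a copy with only those slots updated.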

.C <- setClass("C", representation(c1="numeric", c2="numeric"))
c <- .C(c1=1:5, c2=5:1)

> initialize(c, c1=5:1)
An object of class "C"
Slot "c1":
[1] 5 4 3 2 1

Slot "c2":
[1] 5 4 3 2 1

Our initialize methods have broken this aspect of the contract: slot b is reset to its default, integer(), instead of being copied from the existing object.

>  initialize(b, a=1:5)   # BAD: no copy construction
B
A
An object of class "B"
Slot "b":
integer(0)

Slot "a":
[1] 1 2 3 4 5

Copy construction turns out to be quite handy, so we don't want to break it.
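One place where copy construction pays off is in functions that update a single slot. The sketch below assumes a hypothetical class "Interval" and a hypothetical helper setHi(), neither of which appears in the original example. It relies on the default initialize method copying the untouched slots and validating the result.

```r
library(methods)

## Hypothetical class: an interval with a validity check
.Interval <- setClass("Interval",
    representation(lo="numeric", hi="numeric"),
    validity=function(object) {
        if (length(object@lo) != length(object@hi))
            "'lo' and 'hi' must have the same length"
        else TRUE
    })

## Hypothetical setter: replace one slot, copy the rest, re-validate
setHi <- function(x, value) initialize(x, hi=value)

i1 <- .Interval(lo=1, hi=2)
i2 <- setHi(i1, 10)          # 'lo' is copied from i1, 'hi' is replaced
i2@lo
i2@hi

## An update that violates the validity check is rejected with an error
res <- tryCatch(setHi(i1, c(1, 2)), error=function(e) "invalid")
res
```

The original object i1 is left unchanged; setHi() returns a modified copy.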

Retaining copy construction

Two solutions are used to retain copy construction. The first avoids defining an initialize method and instead provides a plain old function as a constructor:

.A1 <- setClass("A1", representation(a="integer"))
.B1 <- setClass("B1", contains="A1", representation(b="integer"))

A1 <- function(a = integer(), ...) {
    .A1(a=a, ...)
}

B1  <- function(a=integer(), b=integer(), ...) {
    .B1(A1(a), b=b, ...)
}

These functions include ... as an argument so that class "B1" can be extended and its constructor still used. This is actually quite attractive: the constructor can have a sensible signature with documented arguments, and initialize can still be used as a copy constructor (remember, there is no initialize,A1-method or initialize,B1-method, so the call .A1() invokes the default initialize method, which supports copy construction). The expression .B1(A1(a), b=b, ...) says "call the generator for class B1, with an unnamed argument that creates its superclass using the A1 constructor, and a named argument corresponding to slot b". As mentioned above, from ?initialize, unnamed arguments are used to initialize superclasses (more than one when the class structure involves multiple inheritance). The use of constructors means that classes A1 and B1 can be ignorant of each other's structure and implementation.
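A short usage sketch of these constructors, with the definitions repeated so that the block runs on its own. The variable names b1 and b1copy are chosen here for illustration.

```r
library(methods)

## Same definitions as above, repeated so the block is self-contained
.A1 <- setClass("A1", representation(a="integer"))
.B1 <- setClass("B1", contains="A1", representation(b="integer"))

A1 <- function(a=integer(), ...) .A1(a=a, ...)
B1 <- function(a=integer(), b=integer(), ...) .B1(A1(a), b=b, ...)

b1 <- B1(a=1:3, b=3:1)
b1@a                         # filled in from the unnamed A1 instance
b1@b                         # filled in from the named argument

## No initialize method was defined, so copy construction still works
b1copy <- initialize(b1, b=10L)
b1copy@a                     # carried over from b1
b1copy@b                     # replaced
```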

The second solution, less commonly used in its full glory, is to write an initialize method that retains copy construction, along the lines of

.A2 <- setClass("A2", representation(a1="integer", a2="integer"),
     prototype=prototype(a1=1:5, a2=5:1))

setMethod("initialize", "A2", 
    function(.Object, ..., a1=.Object@a1, a2=.Object@a2)
{
    callNextMethod(.Object, ..., a1=a1, a2=a2)
})

The argument a1=.Object@a1 uses the current value of the a1 slot of .Object as the default, which is what lets the method act as a copy constructor. The example also illustrates the use of a prototype to provide initial values other than 0-length vectors. In action:

> a <- .A2(a2=1:3)
> a
An object of class "A2"
Slot "a1":
[1] 1 2 3 4 5

Slot "a2":
[1] 1 2 3

> initialize(a, a1=-(1:3))    # GOOD: copy constructor
An object of class "A2"
Slot "a1":
[1] -1 -2 -3

Slot "a2":
[1] 1 2 3

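Unfortunately, this approach fails when trying to initialize a derived class from an instance of its base class: the defaults a1=.Object@a1 and a2=.Object@a2 are taken from the prototype of the derived class and passed on as named arguments, and named arguments take precedence over slots supplied by the unnamed base-class instance.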
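The failure can be reproduced with a small sketch. The derived class "B2" is hypothetical and added here only for illustration; "A2" and its initialize method are repeated so that the block runs on its own.

```r
library(methods)

## A2 and its initialize method, as defined above
.A2 <- setClass("A2", representation(a1="integer", a2="integer"),
    prototype=prototype(a1=1:5, a2=5:1))

setMethod("initialize", "A2",
    function(.Object, ..., a1=.Object@a1, a2=.Object@a2)
{
    callNextMethod(.Object, ..., a1=a1, a2=a2)
})

## Hypothetical derived class; it inherits the initialize,A2-method
.B2 <- setClass("B2", contains="A2", representation(b="integer"))

a <- .A2(a1=1:2, a2=2:1)
b <- .B2(a, b=1L)            # try to initialize B2 from an A2 instance

identical(b@a1, a@a1)        # FALSE means the values from 'a' were lost
b@a1                         # compare with the prototype value of a1
```

Here .Object is the prototype of "B2" when the inherited method runs, so the default values for a1 and a2 come from the prototype rather than from the A2 instance passed as an unnamed argument.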

Other considerations

One final point is the structure of the initialize method itself. Illustrated above is the pattern

## do class initialization steps, then...
callNextMethod(<...>)

so callNextMethod() is at the end of the initialize method. An alternative is

.Object <- callNextMethod(<...>)
## do class initialization steps by modifying .Object, e.g.,...
.Object@a <- <...>
.Object

The reason to prefer the first approach is that less copying is involved. The default initialize,ANY-method populates slots with a minimum of copying, whereas the slot-update approach copies the entire object each time a slot is modified; this can be very bad if the object contains large vectors.
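For concreteness, here are the two patterns side by side as a sketch. The classes "Pre" and "Post" are hypothetical, and sorting a slot stands in for the "class initialization steps". The block only shows that both patterns produce the same result; it does not measure copying.

```r
library(methods)

## Pattern 1: do the work first, then hand the slot values to callNextMethod()
.Pre <- setClass("Pre", representation(x="numeric"))
setMethod("initialize", "Pre", function(.Object, ..., x=.Object@x) {
    callNextMethod(.Object, ..., x=sort(x))
})

## Pattern 2: call the next method first, then modify slots one at a time
.Post <- setClass("Post", representation(x="numeric"))
setMethod("initialize", "Post", function(.Object, ...) {
    .Object <- callNextMethod(.Object, ...)
    .Object@x <- sort(.Object@x)
    .Object
})

p1 <- .Pre(x=c(3, 1, 2))
p2 <- .Post(x=c(3, 1, 2))
identical(p1@x, p2@x)        # both patterns end up with the same slot value
```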