What is the difference between mode and class in R?

I am learning R (I started just this week), and I've been struggling with the concepts of typeof, mode, storage.mode and class. I've searched high and low (official R documentation, StackOverflow, Google, etc.), and I haven't been able to find a clear explanation of the difference between these. (A few StackOverflow and CrossValidated answers have not really cleared things up for me.) Finally (I hope), I think I understand it, so my question is whether my understanding is correct.

mode vs storage.mode: mode and storage.mode are basically the same thing, except for a tiny difference in how the "single" datatype is handled.

mode vs typeof: Very similar, except for a few differences, most notably that both (typeof "integer" and "double") = (mode "numeric"); and both (typeof "special" and "builtin") = (mode "function").
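One way to check these relationships is to print the three values side by side. The sketch below uses a small helper, kinds(), which is a hypothetical name introduced here for illustration and not part of base R. The results suggest that storage.mode() follows typeof() more closely than it follows mode(), differing from typeof() only for functions.

```r
# kinds() is a hypothetical helper: it collects the three low-level
# descriptions of an object into one named character vector.
kinds <- function(x) {
  c(typeof = typeof(x), mode = mode(x), storage.mode = storage.mode(x))
}

kinds(1L)        # typeof "integer", mode "numeric",  storage.mode "integer"
kinds(1.5)       # typeof "double",  mode "numeric",  storage.mode "double"
kinds(sum)       # typeof "builtin", mode "function", storage.mode "function"
kinds(`if`)      # typeof "special", mode "function", storage.mode "function"
kinds(quote(a))  # typeof "symbol",  mode "name",     storage.mode "symbol"
```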

class: Class is based on R's object-oriented class hierarchy. I'm having a hard time finding this graphically laid out, but the best I've been able to find is this diagram:

[Diagram: class hierarchy from the rpy2 robjects package]

(If anyone can point me to a more accurate R class hierarchy, I'll replace it.)

Although the class names don't correspond exactly with the results of the R class() function, I believe the hierarchy is basically accurate. My understanding is that the "class" of an object--that is, the result of the class() function--is the root class in the hierarchy. So, for example, "Vector" is not a root class and so it never shows up as the result of the class() function. The root class might rather be "StrVector" ("character") or "BoolVector" ("logical"). In contrast, "Matrix" is itself a root class; hence, its class is "matrix".

Apparently, R supports multiple inheritance, and so some objects can have more than one class.
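As a small illustration of objects with more than one class, the sketch below looks at two base R objects whose class() result has length two. In the S3 system the class attribute is an ordered character vector, and inherits() tests membership in it, so this is arguably closer to a single chain of classes than to multiple inheritance in the usual sense. The variable names are chosen only for this example.

```r
# An ordered factor carries two classes, most specific first.
sizes <- factor(c("small", "large"),
                levels = c("small", "large"),
                ordered = TRUE)
class(sizes)                 # "ordered" "factor"

# inherits() checks whether a class name appears anywhere in that vector.
inherits(sizes, "ordered")   # TRUE
inherits(sizes, "factor")    # TRUE
inherits(sizes, "numeric")   # FALSE

# Date-times are another common two-class object.
now <- Sys.time()
class(now)                   # "POSIXct" "POSIXt"
```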

typeof/mode/storage.mode vs class: This was the hardest part for me to understand. My understanding now is this: typeof/mode/storage.mode (which I will refer to henceforth simply as "mode") is basically the most complex datatype that an R object can hold as one of its values. So, for example, since matrices, arrays and vectors can hold only one vector datatype, their mode (that is, the most complex datatype they can hold) is typically numeric, character or logical, even though their class (their position in the class hierarchy) is something completely different.

Where this gets most interesting (that is, confusing) is with objects like lists. A mode of "list" means that each value in the object can itself be a list (that is, an object that can hold diverse datatypes). Thus, regardless of whether the class itself is "list", there are multiple objects (e.g. data frames) that can contain diverse values, and whose mode is therefore "list", even if their class is something else.
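The data frame case can be checked directly. The sketch below builds a small data frame (the column names are made up for the example) and shows that its class is "data.frame" while its underlying type is "list", with each column keeping its own type.

```r
# A small example data frame; stringsAsFactors = FALSE keeps the
# character column as character on every R version.
df <- data.frame(id = 1:3,
                 label = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

class(df)            # "data.frame"
typeof(df)           # "list"
mode(df)             # "list"
is.list(df)          # TRUE

# Each column is an ordinary vector with its own type.
sapply(df, typeof)   # id: "integer", label: "character"

# Removing the class attribute leaves a plain list.
class(unclass(df))   # "list"
```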

So, in summary, my understanding is:

  • typeof/mode/storage.mode (almost the same thing) is basically the most complex datatype that an R object can hold as one of its values; whereas

  • class is an object's object-oriented classification according to the R class hierarchy.

Is my understanding accurate? If not, could someone please give a more accurate explanation?


Solution 1:

'mode' is a mutually exclusive classification of objects according to their basic structure. The 'atomic' modes are numeric, complex, character and logical. Recursive objects have modes such as 'list' or 'function' or a few others. An object has one and only one mode.

'class' is a property assigned to an object that determines how generic functions operate with it. It is not a mutually exclusive classification. If an object has no specific class assigned to it, such as a simple numeric vector, its class is usually the same as its mode, by convention.
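To make the link between class and generic functions concrete, here is a minimal S3 sketch. The generic describe(), its methods, and the "temperature" class are all hypothetical names invented for this example. It shows dispatch on an implicit class (no class attribute set) and on an explicitly assigned one.

```r
# A hypothetical S3 generic with a few methods.
describe <- function(x) UseMethod("describe")
describe.default     <- function(x) "something else"
describe.numeric     <- function(x) "a number"
describe.matrix      <- function(x) "a matrix"
describe.temperature <- function(x) "a temperature reading"

# No class attribute: dispatch falls back on the implicit class.
describe(1.5)                       # "a number"
describe(1L)                        # "a number" (integer also tries "numeric")
describe(matrix(1:4, nrow = 2))     # "a matrix"
describe("text")                    # "something else"

# An explicit class attribute takes over, while the mode is unchanged.
temp <- 21.5
class(temp) <- "temperature"
describe(temp)                      # "a temperature reading"
mode(temp)                          # "numeric"
```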

Changing the mode of an object is often called 'coercion'. The mode of an object can change without necessarily changing the class. For example:

> x <- 1:16
> mode(x)
[1] "numeric"
> dim(x) <- c(4,4)
> mode(x)
[1] "numeric"
> class(x)
[1] "matrix"
> is.numeric(x)
[1] TRUE
> mode(x) <- "character"
> mode(x)
[1] "character"
> class(x)
[1] "matrix"

However:

> x <- factor(x)
> class(x)
[1] "factor"
> mode(x)
[1] "numeric"

At this stage, even though x has mode numeric again, its new class, factor, prevents it from being used in arithmetic operations.
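A short sketch of what that looks like in practice, using a made-up factor: arithmetic on the factor itself yields NA values (with a warning, suppressed here), whereas the underlying integer codes and the labels can each be converted explicitly.

```r
# A factor whose labels happen to look like numbers.
f <- factor(c("10", "20", "10"))

mode(f)     # "numeric"
class(f)    # "factor"
typeof(f)   # "integer" (the stored level codes)

# Arithmetic on the factor is refused: the result is all NA.
res <- suppressWarnings(f + 1)
all(is.na(res))                    # TRUE

# The stored codes are 1, 2, 1, not the labels.
as.integer(f)                      # 1 2 1

# To compute with the labels, convert via character first.
as.numeric(as.character(f)) + 1    # 11 21 11
```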

In practice, mode is not used very much, other than to define a class implicitly when no explicit class has been assigned.

Solution 2:

I hope the following examples are helpful. In particular, have a look at the last two examples.

x <- 1L
print(c(class(x), mode(x), storage.mode(x), typeof(x)))

x <- 1
print(c(class(x), mode(x), storage.mode(x), typeof(x)))

x <- letters
print(c(class(x), mode(x), storage.mode(x), typeof(x)))

x <- TRUE
print(c(class(x), mode(x), storage.mode(x), typeof(x)))

x <- cars
print(c(class(x), mode(x), storage.mode(x), typeof(x)))

x <- cars[1]
print(c(class(x), mode(x), storage.mode(x), typeof(x)))

x <- cars[[1]]
print(c(class(x), mode(x), storage.mode(x), typeof(x)))

x <- matrix(cars)
print(c(class(x), mode(x), storage.mode(x), typeof(x)))

x <- new.env()
print(c(class(x), mode(x), storage.mode(x), typeof(x)))

x <- expression(1 + 1)
print(c(class(x), mode(x), storage.mode(x), typeof(x)))

x <- quote(y <- 1 + 1)
print(c(class(x), mode(x), storage.mode(x), typeof(x)))

x <- ls
print(c(class(x), mode(x), storage.mode(x), typeof(x)))

The results are:

[1] "integer"      "numeric"      "integer"      "integer"
[1] "numeric"      "numeric"      "double"       "double" 
[1] "character"    "character"    "character"    "character"
[1] "logical"      "logical"      "logical"      "logical"
[1] "data.frame"   "list"         "list"         "list"      
[1] "data.frame"   "list"         "list"         "list"      
[1] "numeric"      "numeric"      "double"       "double" 
[1] "matrix"       "list"         "list"         "list"  
[1] "environment"  "environment"  "environment"  "environment"
[1] "expression"   "expression"   "expression"   "expression"
[1] "<-"           "call"         "language"     "language"
[1] "function"     "function"     "function"     "closure" 

The last example shows you a case where typeof() != storage.mode().
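The repeated print() call can be wrapped in a small helper, sketched here as type_summary() (a hypothetical name, not a base R function). It labels each value and collapses class() into a single string, which matters because class() can return more than one element; for instance, from R 4.0.0 onwards a matrix reports both "matrix" and "array", so the unlabelled vector above would have five elements instead of four.

```r
# type_summary() is a hypothetical helper that returns the four
# descriptions as a named character vector of fixed length 4.
type_summary <- function(x) {
  c(class        = paste(class(x), collapse = "/"),
    mode         = mode(x),
    storage.mode = storage.mode(x),
    typeof       = typeof(x))
}

type_summary(ls)                    # function, function, function, closure
type_summary(quote(y <- 1 + 1))     # <-, call, language, language
type_summary(matrix(1:4, nrow = 2)) # class is "matrix/array" on R >= 4.0.0
```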

Solution 3:

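These are the same examples as in Solution 2, with each result shown directly below the code that produces it, just for better readability.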

I hope the following examples are helpful. In particular, have a look at the last two examples.

x <- 1L
print(c(class(x), mode(x), storage.mode(x), typeof(x)))
[1] "integer"      "numeric"      "integer"      "integer"

x <- 1
print(c(class(x), mode(x), storage.mode(x), typeof(x)))
[1] "numeric"      "numeric"      "double"       "double" 

x <- letters
print(c(class(x), mode(x), storage.mode(x), typeof(x)))
[1] "character"    "character"    "character"    "character"

x <- TRUE
print(c(class(x), mode(x), storage.mode(x), typeof(x)))
[1] "logical"      "logical"      "logical"      "logical"

x <- cars
print(c(class(x), mode(x), storage.mode(x), typeof(x)))
[1] "data.frame"   "list"         "list"         "list"   

x <- cars[1]
print(c(class(x), mode(x), storage.mode(x), typeof(x)))
[1] "data.frame"   "list"         "list"         "list"      

x <- cars[[1]]
print(c(class(x), mode(x), storage.mode(x), typeof(x)))
[1] "numeric"      "numeric"      "double"       "double" 

x <- matrix(cars)
print(c(class(x), mode(x), storage.mode(x), typeof(x)))
[1] "matrix"       "list"         "list"         "list"  

x <- new.env()
print(c(class(x), mode(x), storage.mode(x), typeof(x)))
[1] "environment"  "environment"  "environment"  "environment"

x <- expression(1 + 1)
print(c(class(x), mode(x), storage.mode(x), typeof(x)))
[1] "expression"   "expression"   "expression"   "expression"

x <- quote(y <- 1 + 1)
print(c(class(x), mode(x), storage.mode(x), typeof(x)))
[1] "<-"           "call"         "language"     "language"

x <- ls
print(c(class(x), mode(x), storage.mode(x), typeof(x)))
[1] "function"     "function"     "function"     "closure"

Credits - @SHUAICHENG WANG