Create a data.frame where a column is a list

I know how to add a list column:

> df <- data.frame(a=1:3)
> df$b <- list(1:1, 1:2, 1:3)
> df
  a       b
1 1       1
2 2    1, 2
3 3 1, 2, 3

This works, but not:

> df <- data.frame(a=1:3, b=list(1:1, 1:2, 1:3))
Error in data.frame(1L, 1:2, 1:3, check.names = FALSE, stringsAsFactors = TRUE) : 
  arguments imply differing number of rows: 1, 2, 3

Why?

Also, is there a way to create df (above) in a single call to data.frame?


Slightly obscurely, from ?data.frame:

If a list or data frame or matrix is passed to ‘data.frame’ it is as if each component or column had been passed as a separate argument (except for matrices of class ‘"model.matrix"’ and those protected by ‘I’).

(emphasis added).

So

data.frame(a=1:3,b=I(list(1,1:2,1:3)))

seems to work.


If you are working with data.tables, then you can avoid the call to I()

library(data.table)
# the following works as intended
data.table(a=1:3,b=list(1,1:2,1:3))

   a     b
1: 1     1
2: 2   1,2
3: 3 1,2,3

data_frames (variously called tibbles, tbl_df, tbl) natively support the creation of list columns using the data_frame constructor. To use them, load one of the many libraries with them such as tibble, dplyr or tidyverse.

> data_frame(abc = letters[1:3], lst = list(1:3, 1:3, 1:3))
# A tibble: 3 × 2
    abc       lst
  <chr>    <list>
1     a <int [3]>
2     b <int [3]>
3     c <int [3]>

They are actually data.frames under the hood, but somewhat modified. They can almost always be used as normal data.frames. The only exception I've found is that when people do inappropriate class checks, they cause problems:

> #no problem
> data.frame(x = 1:3, y = 1:3) %>% class
[1] "data.frame"
> data.frame(x = 1:3, y = 1:3) %>% class == "data.frame"
[1] TRUE
> #uh oh
> data_frame(x = 1:3, y = 1:3) %>% class
[1] "tbl_df"     "tbl"        "data.frame"
> data_frame(x = 1:3, y = 1:3) %>% class == "data.frame"
[1] FALSE FALSE  TRUE
> #dont use if with improper testing!
> if(data_frame(x = 1:3, y = 1:3) %>% class == "data.frame") "something"
Warning message:
In if (data_frame(x = 1:3, y = 1:3) %>% class == "data.frame") "something" :
  the condition has length > 1 and only the first element will be used
> #proper
> data_frame(x = 1:3, y = 1:3) %>% inherits("data.frame")
[1] TRUE

I recommending reading about them in R 4 Data Science (free).