Unnesting character string list and converting into wide data

r tidyverse

I often have nested lists containing character strings in a tibble. I would like to convert these to wide data, with one column per string. My attempts so far have used the tidyr unnest_wider function, but I have been unable to produce the desired result.
Suppose you have a data frame with a nested character list like the following:

#Producing a list with random character strings and embedding into dataframe.
set.seed(112)
df <- data.frame(id = 1:10)
df$alist <- replicate(10, list(sample(letters[1:5],round(runif(1, min = 1, max = 5)))))

Using tidyr::unnest_wider almost gets me there, but I would like for the strings to go into the column names.

library(tidyverse)
df %>% unnest_wider(alist, names_sep = "_")

Using two nested for loops, I'm able to achieve what I want:

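# Collect the distinct strings, sorted, to use as column names.
let <- sort(unique(unlist(df$alist)))

# Add one column per string, initialised to NA.
df[let] <- NA

# For each row, record whether each string occurs in that row's list.
for (index in let) {
  for (i in seq_len(nrow(df))) {
    df[i, index] <- index %in% df$alist[[i]]
  }
}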

id	alist	a	b	c	d	e
1	b, d, a	TRUE	TRUE	FALSE	TRUE	FALSE
2	a, b	TRUE	TRUE	FALSE	FALSE	FALSE
3	d, a, c, e	TRUE	FALSE	TRUE	TRUE	TRUE
4	d, a, b, c	TRUE	TRUE	TRUE	TRUE	FALSE
5	a, c	TRUE	FALSE	TRUE	FALSE	FALSE
6	e	FALSE	FALSE	FALSE	FALSE	TRUE
7	c	FALSE	FALSE	TRUE	FALSE	FALSE
8	c, e, d	FALSE	FALSE	TRUE	TRUE	TRUE
9	a, d, e, c, b	TRUE	TRUE	TRUE	TRUE	TRUE
10	a	TRUE	FALSE	FALSE	FALSE	FALSE

But frankly I am looking for a more elegant solution to the problem. I have the sense I am missing something obvious here...

Solution 1:

set.seed(112)
df <- data.frame(id = 1:10)
df$alist <-
  replicate(10, list(sample(letters[1:5], round(
    runif(1, min = 1, max = 5)
  ))))

library(tidyverse)

nm <- sort(unique(unlist(df$alist)))

bind_cols(df, map_df(
  .x = df$alist,
  .f = ~ purrr::set_names(x = nm %in% .x, nm = nm)
)) 
#>    id         alist     a     b     c     d     e
#> 1   1       b, d, a  TRUE  TRUE FALSE  TRUE FALSE
#> 2   2          a, b  TRUE  TRUE FALSE FALSE FALSE
#> 3   3    d, a, c, e  TRUE FALSE  TRUE  TRUE  TRUE
#> 4   4    d, a, b, c  TRUE  TRUE  TRUE  TRUE FALSE
#> 5   5          a, c  TRUE FALSE  TRUE FALSE FALSE
#> 6   6             e FALSE FALSE FALSE FALSE  TRUE
#> 7   7             c FALSE FALSE  TRUE FALSE FALSE
#> 8   8       c, e, d FALSE FALSE  TRUE  TRUE  TRUE
#> 9   9 a, d, e, c, b  TRUE  TRUE  TRUE  TRUE  TRUE
#> 10 10             a  TRUE FALSE FALSE FALSE FALSE

Created on 2022-01-23 by the reprex package (v2.0.1)
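The call above can be read one row at a time: for each list element, `nm %in% .x` gives a logical vector with one entry per distinct string, and `set_names` labels those entries so that `map_df` can bind them into columns. A minimal sketch for a single row, with the values typed in by hand from the first row of the output above (`x` and `row_flags` are illustrative names, not part of the answer):

```r
library(purrr)

# All distinct strings, matching what `nm` holds in the answer's output.
nm <- c("a", "b", "c", "d", "e")

# Strings in one row (copied by hand from id 1 in the output above).
x <- c("b", "d", "a")

# Named logical vector: TRUE where the string occurs in this row.
row_flags <- set_names(nm %in% x, nm)
row_flags
```

Each such named vector becomes one row of the logical columns, which is why the column names end up being the strings themselves.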

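A different route, not taken from the answer above, is to go long first and then pivot wide. The following is only a sketch, assuming tidyr 1.1.0 or later (for `names_sort`) and that every list element is non-empty, as in the example data. The names `letter`, `present`, `wide`, and `result` are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Same example data as in the question.
set.seed(112)
df <- data.frame(id = 1:10)
df$alist <- replicate(10, list(sample(letters[1:5], round(runif(1, min = 1, max = 5)))))

wide <- df %>%
  # One row per (id, string) pair; the strings land in a column called `letter`.
  unnest_longer(alist, values_to = "letter") %>%
  # Flag every pair that exists.
  mutate(present = TRUE) %>%
  # One column per distinct string; missing combinations become FALSE.
  pivot_wider(
    names_from = letter,
    values_from = present,
    values_fill = FALSE,
    names_sort = TRUE
  )

# Reattach the original list column, which unnest_longer() consumed.
result <- left_join(df, wide, by = "id")
result
```

The join at the end is only needed if the original `alist` column should be kept next to the new logical columns; otherwise `wide` on its own already holds the id and the flags.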