Extract intersection list from upset object

I'm making some comparisons with UpSetR, and I'd like to save the lists of elements that fall into each intersection. Is this possible? I can't find it anywhere...

It would be pretty tedious to do it manually (there are many lists), and since the intersections are calculated anyway, not being able to save them is frustrating.


There is no ready-made UpSetR function for this (yet), but it is possible to extract the elements:

library(UpSetR)

# Example input as list, expected output is 1 and 5:
listInput <- list(one = c(1, 2, 3, 5, 7, 8, 11, 12, 13), 
                  two = c(1, 2, 4, 5, 10),
                  three = c(1, 5, 6, 7, 8, 9, 10, 12, 13))

When its result is assigned, upset() returns an object that also includes the data:

x <- upset(fromList(listInput))
x$New_data
#    one two three
# 1    1   1     1
# 2    1   1     0
# 3    1   0     0
# 4    1   1     1
# 5    1   0     1
# 6    1   0     1
# 7    1   0     0
# 8    1   0     1
# 9    1   0     1
# 10   0   1     0
# 11   0   1     1
# 12   0   0     1
# 13   0   0     1

From here we can see that the 1st and 4th rows belong to all three sets. The rows are ordered by the first appearance of each item in the input list:

x1 <- unlist(listInput, use.names = FALSE)
x1 <- x1[ !duplicated(x1) ]
x1
# [1]  1  2  3  5  7  8 11 12 13  4 10  6  9

Now we know which item each row number of "New_data" refers to in our list. Since we have 3 columns, filter the rows whose sum is 3:

x1[ rowSums(x$New_data) == 3 ]
# [1] 1 5

Or we could just use Reduce:

Reduce(intersect, listInput)
# [1] 1 5
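Both approaches above return only the elements common to all three sets. To get the elements of every intersection at once, one option is to label each element with the sets it belongs to and then split on that label. This is a minimal base R sketch, not an UpSetR feature; the names `elements`, `membership`, `labels` and `intersections` are made up for illustration, and the "&" separator is an arbitrary choice.

```r
listInput <- list(one = c(1, 2, 3, 5, 7, 8, 11, 12, 13),
                  two = c(1, 2, 4, 5, 10),
                  three = c(1, 5, 6, 7, 8, 9, 10, 12, 13))

# All unique elements, in order of first appearance
elements <- unique(unlist(listInput, use.names = FALSE))

# Logical membership matrix: one row per element, one column per set
membership <- sapply(listInput, function(s) elements %in% s)

# Label each element with the names of the sets that contain it
labels <- apply(membership, 1, function(r) {
  paste(colnames(membership)[r], collapse = "&")
})

# One list entry per (exclusive) intersection
intersections <- split(elements, labels)

intersections[["one&two&three"]]
# [1] 1 5
intersections[["one&three"]]
# [1]  7  8 12 13
```

Each element lands in exactly one entry, so these are exclusive intersections: "one&three" holds the elements found in one and three but not in two.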

Here is my take at extracting the different intersections together with the list of elements in them.

The main idea is to paste all the 0's and 1's from the binary table together to create a unique identifier for each intersection, and then use the dplyr::group_by function to extract the information.

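library(UpSetR)
library(dplyr)  # %>%, group_by, summarise, arrange, mutate, pull
library(tidyr)  # unite

data <- data.frame(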
  entry = paste0("Entry.", 1:10),
  "A" = c(0,0,0,0,1,0,1,1,0,0),
  "B" = c(1,0,0,0,1,1,1,1,1,0),
  "C" = c(1,1,1,1,0,0,1,0,1,1)
)

# NOT REQUIRED. Only to confirm that upset works with these data
upset(data)

You can then identify the intersections by pasting all the binary columns together. I use the tidyr::unite convenience function for this:

NB: you may have to change this depending on whether your data stores the element names as row names or in a separate column.

data_with_intersection <- data %>%
  unite(col = "intersection", -c("entry"), sep = "")

From here, you can easily calculate the size of each intersection:

# Table of intersections and the number of entries
data_with_intersection %>%
  group_by(intersection) %>%
  summarise(n = n()) %>%
  arrange(desc(n))

Or even extract the list of entries/elements in each intersection:

# List of intersections and their entries
data_with_intersection %>%
  group_by(intersection) %>%
  summarise(list = list(entry)) %>%
  mutate(list = setNames(list, intersection)) %>%
  pull(list)
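The names of the resulting list are binary strings such as "011", which are hard to read once there are many sets. A small helper can translate them back into set names. This is only a sketch: `decode_intersection` is a hypothetical helper, not part of dplyr, tidyr or UpSetR, and it assumes the set names are given in the same order as the columns that were pasted together.

```r
# Set names, in the same order as the binary columns that were united
set_names <- c("A", "B", "C")

# Hypothetical helper: turn a code such as "011" into a label such as "B&C"
decode_intersection <- function(code, sets) {
  bits <- strsplit(code, "")[[1]] == "1"  # TRUE where the set is part of the intersection
  paste(sets[bits], collapse = "&")
}

# The intersection codes that occur in the example data
codes <- c("001", "010", "011", "110", "111")
decoded <- vapply(codes, decode_intersection, character(1), sets = set_names)
decoded
#     001     010     011     110     111
#     "C"     "B"   "B&C"   "A&B" "A&B&C"
```

The decoded labels could then be used in place of the binary strings, for example when naming the list of entries.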