How to select rows with exactly 2 unique values with tidyverse?
What I have:
library(magrittr)
set.seed(1234)
what_i_have <- tibble::tibble(
  A = c(0, 1) |> sample(5, replace = TRUE),
  B = c(0, 1) |> sample(5, replace = TRUE),
  C = c(0, 1) |> sample(5, replace = TRUE)
)
It looks like this:
> what_i_have
# A tibble: 5 x 3
      A     B     C
  <dbl> <dbl> <dbl>
1     1     1     1
2     1     0     1
3     1     0     1
4     1     0     0
5     0     1     1
What I want:
what_i_want <- what_i_have %>% .[apply(., 1, function(row) row |> unique() |> length() == 2),]
It looks like this:
# A tibble: 4 x 3
      A     B     C
  <dbl> <dbl> <dbl>
1     1     0     1
2     1     0     1
3     1     0     0
4     0     1     1
My question is: is there a tidyverse way to do the above?
I tried this:
what_i_have |>
  dplyr::rowwise() |>
  dplyr::filter_all(function(row) row |> unique() |> length() == 2)
but it returns the following empty tibble, and I do not know why:
# A tibble: 0 x 3
# Rowwise:
# … with 3 variables: A <dbl>, B <dbl>, C <dbl>
Thank you.
Here is one option with tidyverse. Here, I treat each row as a vector (via c_across), then get the number of distinct values using n_distinct and return TRUE for the rows that have exactly 2 unique values.
library(tidyverse)
what_i_have %>%
  rowwise() %>%
  filter(n_distinct(c_across(everything())) == 2)
Output
      A     B     C
  <dbl> <dbl> <dbl>
1     0     1     1
2     1     0     1
3     1     0     0
4     1     1     0
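To illustrate why c_across matters here, below is a minimal sketch (not part of the original code) that contrasts a per-column computation with a per-row one under rowwise(). The names per_column and per_row are only illustrative, and the data is the same as in the Data section. Under rowwise(), each column contributes a single value per row, so anything applied column by column sees only 1 unique value. That is likely why the filter_all attempt in the question returned no rows.

```r
library(dplyr)

# Same values as in the Data section of this answer
what_i_have <- tibble(
  A = c(0, 1, 1, 1, 1),
  B = c(1, 0, 0, 1, 1),
  C = c(1, 1, 0, 1, 0)
)

# Per-column view: with rowwise(), each column is a single value per row,
# so counting distinct values column by column always gives 1
per_column <- what_i_have %>%
  rowwise() %>%
  mutate(across(everything(), n_distinct)) %>%
  ungroup()

# Per-row view: c_across() first combines the row's values into one vector,
# so n_distinct() counts the distinct values across A, B and C
per_row <- what_i_have %>%
  rowwise() %>%
  mutate(n_unique = n_distinct(c_across(everything()))) %>%
  ungroup()
```

Here per_column contains only 1s, while per_row$n_unique holds the per-row count that the filter compares against 2.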
A mixed approach that combines filter with base R's apply could be:
what_i_have %>%
  filter(apply(., 1, \(x) length(unique(x))) == 2)
Data
what_i_have <-
  structure(
    list(
      A = c(0, 1, 1, 1, 1),
      B = c(1, 0, 0, 1, 1),
      C = c(1, 1, 0, 1, 0)
    ),
    class = c("tbl_df", "tbl", "data.frame"),
    row.names = c(NA, -5L)
  )
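As a further variant, here is a sketch that avoids rowwise() by iterating over rows with purrr::pmap_int. This is an alternative I am suggesting, not something from the question, and it assumes R >= 4.1 for the \(...) lambda syntax. The data is rebuilt with tibble() so the snippet runs on its own.

```r
library(dplyr)
library(purrr)

# Same values as in the Data section above
what_i_have <- tibble(
  A = c(0, 1, 1, 1, 1),
  B = c(1, 0, 0, 1, 1),
  C = c(1, 1, 0, 1, 0)
)

# pmap_int() calls the function once per row, passing the row's values as
# arguments; c(...) gathers them into one vector for n_distinct()
what_i_want <- what_i_have %>%
  filter(pmap_int(., \(...) n_distinct(c(...))) == 2)
```

Unlike the rowwise() version, the result is a plain tibble, so no ungroup() is needed afterwards.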