How to create two variables and filter with map_dbl [duplicate]
We may use colMeans on a logical matrix in base R, then convert the named vector to a two-column data.frame with stack:
stack(+(colMeans(Data == "") > 0.05))[2:1]
Explanation: Data == "" returns a logical matrix. colMeans takes the mean of each logical column, which is the proportion of TRUE values (multiply by 100 for the percentage). Comparing that with 0.05 (5 percent) gives a logical vector, which can be coerced to binary with either unary + or as.integer. The output of colMeans is a named vector, and the names are kept through the comparison and the unary + (as.integer drops the names, so + is the one to use before stack). stack converts the named vector to a two-column data.frame, and indexing with [2:1] reorders the columns so that the second column appears first, followed by the first.
-output
ind values
1 Year 0
2 Month 0
3 Day 0
4 Hour 0
5 Id_Type 1
6 Code_Intersecction 1
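The base R steps described above can be traced one at a time. This is only a sketch on a small made-up data frame (the column values here are assumptions, not the asker's data), with empty strings standing in for missing entries:

```r
# Hypothetical toy data: empty strings mark missing entries
Data <- data.frame(
  Year    = c("2020", "2021", "2022", "2023"),
  Id_Type = c("A", "", "B", ""),
  stringsAsFactors = FALSE
)

is_empty <- Data == ""          # logical matrix, one column per variable
prop     <- colMeans(is_empty)  # named numeric vector: proportion of TRUE per column
flag     <- +(prop > 0.05)      # named 0/1 vector; unary plus coerces logical to integer
result   <- stack(flag)[2:1]    # two-column data.frame, reordered to ind, values
print(result)
```

Inspecting is_empty, prop and flag in turn shows where the column names come from and why they survive until stack is called.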
With tidyverse, the equivalent is enframe (from tibble):
library(dplyr)
library(tidyr)
library(purrr)
library(tibble)  # provides enframe()
map(Data, ~ +(round(mean(.x == ""), 3) * 100 >= 5)) %>%
  enframe(name = 'Variables') %>%
  unnest(value)
# A tibble: 6 × 2
Variables value
<chr> <int>
1 Year 0
2 Month 0
3 Day 0
4 Hour 0
5 Id_Type 1
6 Code_Intersecction 1
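Since the question title mentions map_dbl, the same idea might also be written without the unnest step, because map_dbl returns a named numeric vector that enframe can turn into a tibble directly. A minimal sketch, assuming the same kind of data (the toy Data below and the column name Null are illustrative assumptions) and a plain > 0.05 comparison instead of the rounded percentage:

```r
library(purrr)
library(tibble)

# Hypothetical toy data: empty strings mark missing entries
Data <- data.frame(
  Year    = c("2020", "2021", "2022", "2023"),
  Id_Type = c("A", "", "B", ""),
  stringsAsFactors = FALSE
)

# One 0/1 flag per column; as.numeric() makes the double return type explicit
flags <- map_dbl(Data, ~ as.numeric(mean(.x == "") > 0.05))

# Named vector -> two-column tibble
result <- enframe(flags, name = "Variables", value = "Null")
print(result)
```

The flag column comes out as a double here, rather than the integer produced by the unary + version.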
Use tibble::rownames_to_column:
tibble::rownames_to_column(Data_Null, var = "Variables")
# A tibble: 6 x 2
Variables Null
<chr> <dbl>
1 Year 0
2 Month 0
3 Day 0
4 Hour 0
5 Id_Type 1
6 Code_Intersecction 1
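Data_Null is not shown in this answer, so the following is only a sketch with a made-up stand-in: a data frame whose row names are the variable names and whose single column Null holds the 0/1 flags.

```r
library(tibble)

# Hypothetical stand-in for Data_Null: flags in a column, variable names as row names
Data_Null <- data.frame(
  Null = c(0, 0, 1),
  row.names = c("Year", "Month", "Id_Type")
)

# Move the row names into a regular first column called Variables
result <- rownames_to_column(Data_Null, var = "Variables")
print(result)
```

This only helps when the variable names really are stored as row names; if they are the names of a vector instead, enframe or stack is the closer fit.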