pivot_longer a grouped summary statistics data frame
There are multiple instances of _
in the column name so, we may use names_pattern
to capture characters as a group i.e. the 'metric' column should get all the characters before the _
(.*
) and the value columns (.value
) would be one or more characters that are not a _
([^_]+
) till the end ($
) of the string in column name
library(dplyr)
library(tidyr)
summary_statistics %>%
pivot_longer(cols = -major, names_to = c("metric", ".value"),
names_pattern = "(.*)_([^_]+)$")
-output
# A tibble: 32 × 6
major metric median mean sd skewness
<chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 Agricultural Production Crops K_int 34.9 760. 2681. 5.39
2 Agricultural Production Crops K_phys 107. 776. 1781. 3.98
3 Agricultural Production Crops Intangibles_intensity 0.325 0.346 0.255 0.382
4 Agricultural Production Crops g_k_it_to_K_int 0 0.132 0.217 1.56
5 Agricultural Production Crops total_q 0.764 1.40 1.83 2.70
6 Agricultural Production Crops i_phys 0.0471 0.169 1.63 18.5
7 Agricultural Production Crops i_int 0.0543 0.0885 0.106 2.99
8 Agricultural Production Crops i_tot 0.124 0.258 1.63 18.3
9 Agricultural Production Crops c_tot 0.0907 0.123 0.506 10.8
10 Agricultural Production Crops operating_activities_net_cash_flow 9.35 144. 500. 4.72
# … with 22 more rows
With names_sep
we can use _
with a regex lookaround to suggest that there are no more _
till the end ($
) of the string that follows the _
summary_statistics %>%
pivot_longer(
cols = -major,
names_to = c("metric", ".value"),
names_sep = "_(?=[^_]+$)"
)
# A tibble: 32 × 6
major metric median mean sd skewness
<chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 Agricultural Production Crops K_int 34.9 760. 2681. 5.39
2 Agricultural Production Crops K_phys 107. 776. 1781. 3.98
3 Agricultural Production Crops Intangibles_intensity 0.325 0.346 0.255 0.382
4 Agricultural Production Crops g_k_it_to_K_int 0 0.132 0.217 1.56
5 Agricultural Production Crops total_q 0.764 1.40 1.83 2.70
6 Agricultural Production Crops i_phys 0.0471 0.169 1.63 18.5
7 Agricultural Production Crops i_int 0.0543 0.0885 0.106 2.99
8 Agricultural Production Crops i_tot 0.124 0.258 1.63 18.3
9 Agricultural Production Crops c_tot 0.0907 0.123 0.506 10.8
10 Agricultural Production Crops operating_activities_net_cash_flow 9.35 144. 500. 4.72
# … with 22 more rows