Sorting a range of columns case-wise and applying that sort order to another range of columns
In R I've got a data frame containing the indices of images (i1,i2,i3) that were presented in a randomized order and as well a rating for each image (a1,a2,a3):
Now I would like to add new variables that represent the indices of the presented images in ascending order (i1_s,i2_s,i3_s). This re-ordering has to be applied to the corresponding answers as well (a1_s,a2_s,a3_s). So the final data frame should look like this:
How can I do this task in R? I think this should be a pretty common task that has to be done a lot in randomized study designs, but I couldn't find how-tos or tutorials. The problem is, I don't know the proper technical terms for this kind of operation.
Note: Since in the actual study there are hundreds of image- and answer-variables, a generic approach that operates on patterns of variable names is highly appreciated!
EDIT: added code for above sample data:
df <- data.frame(
ID = c("1", "2", "3"),
i1 = c(78, 19, 4),
i2 = c(19, 4, 78),
i3 = c(4, 78, 19),
a1 = c("B", "D", "M"),
a2 = c("A", "H", "A"),
a3 = c("D", "A", "E"),
i1_s = c(NA,NA,NA),
i2_s = c(NA,NA,NA),
i3_s = c(NA,NA,NA),
a1_s = c(NA,NA,NA),
a2_s = c(NA,NA,NA),
a3_s = c(NA,NA,NA)
)
Curious to see how others do this but here's one solution.
df %>%
as_tibble() %>%
mutate(across(-ID, as.character)) %>%
pivot_longer(cols = -ID) %>%
mutate(
num = substring(name,2),
name = substring(name,1,1)
) %>%
pivot_wider(
names_from = name
) %>%
mutate(
i = as.numeric(i)
) %>%
arrange(ID, i) %>%
group_by(ID) %>%
mutate(
rank1 = row_number()
) %>%
mutate(
i = as.character(i)
) %>%
pivot_longer(cols = c(i,a)) %>%
mutate(
name = paste0(name, rank1, '_s')
) %>%
select(-num, -rank1) %>%
arrange(ID, name) %>%
pivot_wider() %>%
mutate(across(starts_with('i'), as.numeric))
Output:
ID a1_s a2_s a3_s i1_s i2_s i3_s
<int> <chr> <chr> <chr> <dbl> <dbl> <dbl>
1 1 D A B 4 19 78
2 2 H D A 4 19 78
3 3 M E A 4 19 78
Data:
df <- structure(list(ID = 1:3, i1 = c(78L, 19L, 4L), i2 = c(19L, 4L,
78L), i3 = c(4L, 78L, 19L), a1 = c("B", "D", "M"), a2 = c("A",
"H", "A"), a3 = c("D", "A", "E")), class = "data.frame", row.names = c(NA,
-3L))
Here's another tidyverse
approach. You can provide a names_pattern
when putting your data into long form. After sorting values with arrange
for each ID
, you can pivot_wider
to put back into wide form if desired.
library(tidyverse)
df %>%
pivot_longer(cols = -ID,
names_to = c(".value", "number"),
names_pattern = "(\\w+)(\\d+)") %>%
group_by(ID) %>%
arrange(i) %>%
mutate(number = row_number()) %>%
pivot_wider(id_cols = ID,
names_from = number,
names_glue = "{.name}_s",
values_from = c(-ID, -number),
names_sep = "")
Output
ID i1_s i2_s i3_s a1_s a2_s a3_s
<chr> <dbl> <dbl> <dbl> <chr> <chr> <chr>
1 1 4 19 78 D A B
2 2 4 19 78 H D A
3 3 4 19 78 M E A