Sorting a range of columns case-wise and applying that sort order to another range of columns

r sorting

In R I've got a data frame containing the indices of images (i1,i2,i3) that were presented in a randomized order and as well a rating for each image (a1,a2,a3):

enter image description here

Now I would like to add new variables that represent the indices of the presented images in ascending order (i1_s,i2_s,i3_s). This re-ordering has to be applied to the corresponding answers as well (a1_s,a2_s,a3_s). So the final data frame should look like this:

enter image description here

How can I do this task in R? I think this should be a pretty common task that has to be done a lot in randomized study designs, but I couldn't find how-tos or tutorials. The problem is, I don't know the proper technical terms for this kind of operation.

Note: Since in the actual study there are hundreds of image- and answer-variables, a generic approach that operates on patterns of variable names is highly appreciated!

EDIT: added code for above sample data:

df <- data.frame(
  ID = c("1", "2", "3"),
  i1 = c(78, 19, 4),
  i2 = c(19, 4, 78),
  i3 = c(4, 78, 19),
  a1 = c("B", "D", "M"),
  a2 = c("A", "H", "A"),
  a3 = c("D", "A", "E"),
  i1_s = c(NA,NA,NA),
  i2_s = c(NA,NA,NA),
  i3_s = c(NA,NA,NA),
  a1_s = c(NA,NA,NA),
  a2_s = c(NA,NA,NA),
  a3_s = c(NA,NA,NA)
)

Curious to see how others do this but here's one solution.

df %>%
  as_tibble() %>%
  mutate(across(-ID, as.character)) %>%
  pivot_longer(cols = -ID) %>%
  mutate(
    num = substring(name,2),
    name = substring(name,1,1)
  ) %>%
  pivot_wider(
    names_from = name
  ) %>%
  mutate(
    i = as.numeric(i)
  ) %>%
  arrange(ID, i) %>%
  group_by(ID) %>%
  mutate(
    rank1 = row_number()
  ) %>%
  mutate(
    i = as.character(i)
  ) %>%
  pivot_longer(cols = c(i,a)) %>%
  mutate(
    name = paste0(name, rank1, '_s')
  ) %>%
  select(-num, -rank1) %>%
  arrange(ID, name) %>%
  pivot_wider() %>%
  mutate(across(starts_with('i'), as.numeric))

Output:

     ID a1_s  a2_s  a3_s   i1_s  i2_s  i3_s
  <int> <chr> <chr> <chr> <dbl> <dbl> <dbl>
1     1 D     A     B         4    19    78
2     2 H     D     A         4    19    78
3     3 M     E     A         4    19    78

Data:

df <- structure(list(ID = 1:3, i1 = c(78L, 19L, 4L), i2 = c(19L, 4L, 
78L), i3 = c(4L, 78L, 19L), a1 = c("B", "D", "M"), a2 = c("A", 
"H", "A"), a3 = c("D", "A", "E")), class = "data.frame", row.names = c(NA, 
-3L))

Here's another tidyverse approach. You can provide a names_pattern when putting your data into long form. After sorting values with arrange for each ID, you can pivot_wider to put back into wide form if desired.

library(tidyverse)

df %>%
  pivot_longer(cols = -ID, 
               names_to = c(".value", "number"), 
               names_pattern = "(\\w+)(\\d+)") %>%
  group_by(ID) %>%
  arrange(i) %>%
  mutate(number = row_number()) %>%
  pivot_wider(id_cols = ID, 
              names_from = number, 
              names_glue = "{.name}_s", 
              values_from = c(-ID, -number), 
              names_sep = "")

Output

  ID     i1_s  i2_s  i3_s a1_s  a2_s  a3_s 
  <chr> <dbl> <dbl> <dbl> <chr> <chr> <chr>
1 1         4    19    78 D     A     B    
2 2         4    19    78 H     D     A    
3 3         4    19    78 M     E     A

Sorting a range of columns case-wise and applying that sort order to another range of columns

Related

Recent Posts