Conditionally replace matching values from one data.frame to values in another data.frame

I have a df1 that looks like

df1:

df1 <- data.frame(
  DE201 = c("A15", "A18"),
  DE207 = c("A903", "A906")
)

DE201	DE207
A15	A903
A18	A906

... and another df2 that has the dictionary values for those values, row wise

df2:

df2 <- data.frame(
  module = rep("Fall", 4),
  Data_Element_ID = c(rep("DE201", 2), rep("DE207", 2)),
  Data_Element_Name = c(rep("Injury result", 2), rep("Patient activity", 2)),
  Answer_Code = c("A15", "A18", "A903", "A906"),
  Answer_value = c("Yes", "No", "Ambulating with assistance", "Intracranial injury"),
  DE_original = c("DE201_A15", "DE201_A18", "DE207_A903", "DE207_A906")
)

module	Data_Element_ID	Data_Element_Name	Answer_Code	Answer_Value	DE_original
Fall	DE201	Injury result	A15	Yes	DE201_A15
Fall	DE201	Injury result	A18	No	DE201_A18
Fall	DE207	Patient activity	A903	Ambulating with assistance	DE207_A903
Fall	DE207	Patient activity	A906	Intracranial injury‬	DE207_A906

...

I want to replace all the values in df1, including its column name, to be the matching values in df2 (df1 column names to be matched df2's Data_Element_Name, and df1 values to be matched df2's Answer_Value)

for example, my desired output is something like

output <- data.frame(
  Injury_result = c("Yes", "No"),
  Patient_activity = c("Ambulating with assistance", "Intracranial injury")
)

Injury result	Patient activity
Yes	Ambulating with assistance
No	Intracranial injury‬

...

I've tried below two methods but none of them worked, and merge/join are not really applicable here because they don't really have the same key/ID to join/merge by, since one is column wise elements , and one is row wise elements. Tried mutate() with if_else() and case_when():

mutate(
  DE201 = case_when(
    DE201 == df2$Answer_Code ~ df2$Answer_Value, 
    TRUE                     ~ DE201
  ),
  DE204 = if_else(
    DE204 %in% df2$Answer_Code,
    df2$Answer_Value, DE204
  )
)

There might be a more direct way of doing this, but it's not a straightforward task. In this case, I reshaped the data from df1 to a long format using pivot_longer() in order to tie in the data from df2 using left_join(). Then, it's a question of reshaping the data back to a wide format using pivot_wider() for those columns of data you wanted to keep.

Note that I'm adding a row id that I eventually discard because we need to keep track of what lines of data goes where when we are reshaping from wide to long back to wide.

library(dplyr)
library(tidyr)

df1 |>
  mutate(row_id = row_number()) |>
  pivot_longer(-row_id, names_to = "Data_Element_ID", values_to = "Answer_Code") |>
  left_join(df2) |>
  select(row_id, Data_Element_Name, Answer_value) |>
  pivot_wider(names_from = Data_Element_Name, values_from = Answer_value) |>
  select(-row_id)

# A tibble: 2 x 2
  `Injury result` `Patient activity`        
  <chr>           <chr>                     
1 Yes             Ambulating with assistance
2 No              Intracranial injury

Conditionally replace matching values from one data.frame to values in another data.frame

Related

Recent Posts