Rename phylo tip labels with separate df

I am looking to rename 1290 tip labels in my phylo tree using a second df. I am able to rename one label at a time using the code below:

phylo$tip.label[phylo$tip.label=="e54924083c02bd088c69537d02406eb8"] <- "something"

but this is clearly inefficient. How may I rename all labels with a second df that contains the original tip labels and the new labels? I can provide example data if necessary (the files are very large).

Thanks!


Solution 1:

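You can use match() to look up each tip label in the "old labels" column of your data frame and then replace the matched tips with the corresponding "new labels". The value matching operator %in% alone is not enough here: it tells you which tips have a replacement but not which row of the data frame they correspond to, so the new labels would be assigned in tree order rather than paired with their old labels.

library(ape)

## A random tree
my_tree <- rtree(20)

## A data.frame of names to replace
my_data_frame <- data.frame(old.labels = c("t1", "t3", "t9"),
                            new.labels = c("tip_1", "tip_3", "tip_9"),
                            stringsAsFactors = FALSE)

## For each tip, the row of its old label in the data frame (NA if absent)
rows <- match(my_tree$tip.label, my_data_frame$old.labels)

## Replace only the tips that were found
found <- !is.na(rows)
my_tree$tip.label[found] <- my_data_frame$new.labels[rows[found]]

match(my_tree$tip.label, my_data_frame$old.labels) returns, for each tip, the position of its label in my_data_frame$old.labels, or NA if the tip is not listed. Indexing my_data_frame$new.labels with those positions pairs every old label with its own new label, whatever the order of the tips in the tree or of the rows in the data frame. The stringsAsFactors = FALSE argument keeps the labels as character strings in R versions before 4.0, where data.frame() would otherwise convert them to factors.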
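For a large lookup table such as the 1290 labels in the question, it may help to wrap the renaming in a small function that pairs each old label with its new one and checks the table first. The following is only a sketch in base R: rename_labels, tips, and lookup are hypothetical names chosen for illustration, and the toy labels stand in for the real data.

```r
## Hypothetical helper: map each label through an old -> new lookup table.
rename_labels <- function(labels, old, new) {
  old <- as.character(old)
  new <- as.character(new)
  ## The table must be one-to-one on the old labels
  stopifnot(length(old) == length(new), !anyDuplicated(old))
  ## Row of each label in the table, NA when the label is not listed
  rows <- match(labels, old)
  found <- !is.na(rows)
  labels[found] <- new[rows[found]]
  labels
}

## Toy example, with the tips deliberately in a different order from the table
tips <- c("t9", "t2", "t1", "t3")
lookup <- data.frame(old.labels = c("t1", "t3", "t9"),
                     new.labels = c("tip_1", "tip_3", "tip_9"),
                     stringsAsFactors = FALSE)
renamed <- rename_labels(tips, lookup$old.labels, lookup$new.labels)

## Old labels that do not occur among the tips, useful for spotting typos
missing_from_tree <- setdiff(lookup$old.labels, tips)
```

With a phylo object this would be applied as phylo$tip.label <- rename_labels(phylo$tip.label, df$old.labels, df$new.labels), assuming the second data frame has columns with those names. Tips that are not listed in the table keep their original label.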