Rename phylo tip labels with separate df
I am looking to rename 1290 tip labels in my phylo tree using a second df. I am able to rename one label at a time using the below code:
phylo$tip.label[phylo$tip.label=="e54924083c02bd088c69537d02406eb8"] <- "something"
but this is clearly inefficient. How may I rename all labels with a second df that contains the original tip labels and the new labels? I can provide example data if necessary (the files are very large).
Thanks!
Solution 1:
You can use the value matching operator %in% to first detect which tip labels appear in the "old labels" column of your data frame, and then replace them with the corresponding "new labels".
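library(ape)
## A random tree
my_tree <- rtree(20)
## A data.frame of names to replace
my_data_frame <- data.frame(old.labels = c("t1", "t3", "t9"),
                            new.labels = c("tip_1", "tip_3", "tip_9"),
                            stringsAsFactors = FALSE)
## Find the old labels in the tree
to_rename <- my_tree$tip.label %in% my_data_frame$old.labels
## Look up each matched tip's new label (tips are not in data frame row order)
new_names <- my_data_frame$new.labels[match(my_tree$tip.label[to_rename], my_data_frame$old.labels)]
my_tree$tip.label[to_rename] <- new_names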
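my_tree$tip.label %in% my_data_frame$old.labels returns a logical vector (TRUE/FALSE) marking each tip whose label appears in my_data_frame$old.labels. You can then replace those tips with a vector of the same length (i.e. sum(my_tree$tip.label %in% my_data_frame$old.labels) must equal the length of the replacement vector). Keep in mind that the replacement values are assigned in the tree's tip order, not in the data frame's row order, so the new labels have to be aligned to the tips first, for example with match().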
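For the full renaming case in the question (a lookup table covering many tips, possibly in a different order and possibly containing labels that are not in the tree), a match()-based lookup may be the simplest route. The sketch below is a minimal illustration, not a definitive implementation: the tree, the lookup data frame, and its column names old.labels and new.labels are made up for the example, and it assumes each old label occurs at most once in the lookup table.

```r
library(ape)

# A small deterministic tree; tip labels are read in Newick order: a, b, c, d
tree <- read.tree(text = "((a,b),(c,d));")

# Hypothetical lookup table, deliberately in a different order than the tree
# and containing one label ("x") that is not in the tree
lookup <- data.frame(old.labels = c("d", "a", "x"),
                     new.labels = c("tip_d", "tip_a", "tip_x"),
                     stringsAsFactors = FALSE)

# Row of the lookup table for each tip label (NA when the tip is not listed)
idx <- match(tree$tip.label, lookup$old.labels)

# Rename only the tips that were found; the others keep their original label
found <- !is.na(idx)
tree$tip.label[found] <- lookup$new.labels[idx[found]]

tree$tip.label
# "tip_a" "b" "c" "tip_d"
```

Because match() returns the lookup row for each tip, the result does not depend on how the data frame is sorted, and tips missing from the lookup table are left unchanged.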