Recycle vector by starting from the end
I'm trying to recycle a vector, but not in the way R recycles by default.
Imagine I have 2 vectors with unequal numbers of elements:
gen1 = 2:10
gen2 = 1:10
rbind(gen1,gen2)
This gives the following table:
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
gen1    2    3    4    5    6    7    8    9   10     2
gen2    1    2    3    4    5    6    7    8    9    10
As you can see in the last column, the 2 gets paired with 10. But I want this:
gen1 = c(2,2:10)
gen2 = 1:10
rbind(gen1,gen2)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
gen1    2    2    3    4    5    6    7    8    9    10
gen2    1    2    3    4    5    6    7    8    9    10
Now the 2 is duplicated, but at the front. I don't want to do this by hand, since I have a whole collection of these unequal-length vectors that I want to apply this trick to. Is there a way to do this?
Alternatively, is there a way to pair each element with the 'closest' possible position in the other vector?
For example, if I have
     [,1] [,2] [,3] [,4] [,5] [,6]
gen1    8    9   10    8    9   10
gen2    5    6    7    8    9   10
I would like this to be:
     [,1] [,2] [,3] [,4] [,5] [,6]
gen1    8    8    9    9   10   10
gen2    5    6    7    8    9   10
First example in question
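1) Convert each vector to a ts series with the appropriate alignment (gen1 starts at time 2 so that its end lines up with the end of gen2), then use na.locf with fromLast = TRUE so that the leading NA is filled with the next observed value.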
library(zoo)
# inputs
gen1 <- 2:10; gen2 <- 1:10
t(na.locf(cbind(gen1 = ts(gen1, start = 2), gen2 = ts(gen2)), fromLast = TRUE))
giving:
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
gen1    2    2    3    4    5    6    7    8    9    10
gen2    1    2    3    4    5    6    7    8    9    10
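To see why this works, it can help to look at the intermediate result before na.locf is applied. The following is a minimal sketch using only base R. The name aligned is introduced here for illustration and is not part of the answer's code.

```r
gen1 <- 2:10
gen2 <- 1:10

# gen1 starts at time 2, so it has no value at time 1;
# cbind on ts objects takes the union of the time indexes and fills the gap with NA
aligned <- cbind(gen1 = ts(gen1, start = 2), gen2 = ts(gen2))

# first two rows as a plain matrix: gen1 is NA in row 1 only
aligned[1:2, ]
```

With fromLast = TRUE, na.locf then replaces that leading NA with the next value in the same column, which is the 2.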
2) The same thing can be written with the native pipe (R 4.1 or later):
cbind(gen1 = ts(gen1, start = 2), gen2 = ts(gen2)) |>
na.locf(fromLast = TRUE) |>
t()
3) If you want to derive the alignment from the data itself, use this:
maxlen <- max(length(gen1), length(gen2))
cbind(gen1 = ts(gen1, end = maxlen), gen2 = ts(gen2, end = maxlen)) |>
na.locf(fromLast = TRUE) |>
t()
4) Another approach is to use dynamic time warping.
library(dtw)
with(dtw(gen1, gen2), rbind(gen1 = gen1[index1], gen2 = gen2[index2]))
giving:
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
gen1    2    2    3    4    5    6    7    8    9    10
gen2    1    2    3    4    5    6    7    8    9    10
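For a whole collection of vectors, the same front-padding can also be sketched in base R without any packages. This is an alternative to the approaches above, not a restatement of them. The helper pad_front and the list vecs are hypothetical names introduced here, and the helper assumes the target length n is at least length(x).

```r
# Hypothetical helper (not from the answer): left-pad x to length n
# by repeating its first element. Assumes n >= length(x).
pad_front <- function(x, n) c(rep(x[1], n - length(x)), x)

gen1 <- 2:10
gen2 <- 1:10

# put all vectors in a named list so the names become row names
vecs <- list(gen1 = gen1, gen2 = gen2)
maxlen <- max(lengths(vecs))

# pad every vector to the longest length, then stack them as rows
res <- do.call(rbind, lapply(vecs, pad_front, n = maxlen))
res
```

Because the padding is computed per vector, more vectors can be added to vecs without changing the rest of the code.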
Last example in question
The last example in the question seems entirely different and is just a matter of sorting each row.
# input in reproducible form
m <- rbind(gen1 = c(8, 9, 10, 8, 9, 10), gen2 = c(5, 6, 7, 8, 9, 10))
t(apply(m, 1, sort))
giving:
     [,1] [,2] [,3] [,4] [,5] [,6]
gen1    8    8    9    9   10   10
gen2    5    6    7    8    9   10