Fastest way to multiply matrix columns with vector elements in R
I have a matrix m
and a vector v
. I would like to multiply first column of matrix m
by the first element of vector v
, and multiply the second column of matrix m
by the second element of vector v
, and so on. I can do it with the following code, but I am looking for a way which does not require the two transpose calls. How can I do this faster in R?
m <- matrix(rnorm(120000), ncol=6)
v <- c(1.5, 3.5, 4.5, 5.5, 6.5, 7.5)
system.time(t(t(m) * v))
# user system elapsed
# 0.02 0.00 0.02
Solution 1:
Use some linear algebra and perform matrix multiplication, which is quite fast in R
.
eg
m %*% diag(v)
some benchmarking
m = matrix(rnorm(1200000), ncol=6)
v=c(1.5, 3.5, 4.5, 5.5, 6.5, 7.5)
library(microbenchmark)
microbenchmark(m %*% diag(v), t(t(m) * v))
## Unit: milliseconds
## expr min lq median uq max neval
## m %*% diag(v) 16.57174 16.78104 16.86427 23.13121 109.9006 100
## t(t(m) * v) 26.21470 26.59049 32.40829 35.38097 122.9351 100
Solution 2:
If you have a larger number of columns your t(t(m) * v) solution outperforms the matrix multiplication solution by a wide margin. However, there is a faster solution, but it comes with a high cost in in memory usage. You create a matrix as large as m using rep() and multiply elementwise. Here's the comparison, modifying mnel's example:
m = matrix(rnorm(1200000), ncol=600)
v = rep(c(1.5, 3.5, 4.5, 5.5, 6.5, 7.5), length = ncol(m))
library(microbenchmark)
microbenchmark(t(t(m) * v),
m %*% diag(v),
m * rep(v, rep.int(nrow(m),length(v))),
m * rep(v, rep(nrow(m),length(v))),
m * rep(v, each = nrow(m)))
# Unit: milliseconds
# expr min lq mean median uq max neval
# t(t(m) * v) 17.682257 18.807218 20.574513 19.239350 19.818331 62.63947 100
# m %*% diag(v) 415.573110 417.835574 421.226179 419.061019 420.601778 465.43276 100
# m * rep(v, rep.int(nrow(m), ncol(m))) 2.597411 2.794915 5.947318 3.276216 3.873842 48.95579 100
# m * rep(v, rep(nrow(m), ncol(m))) 2.601701 2.785839 3.707153 2.918994 3.855361 47.48697 100
# m * rep(v, each = nrow(m)) 21.766636 21.901935 23.791504 22.351227 23.049006 66.68491 100
As you can see, using "each" in rep() sacrifices speed for clarity. The difference between rep.int and rep seems to be neglible, both implementations swap places on repeated runs of microbenchmark(). Keep in mind, that ncol(m) == length(v).
Solution 3:
As @Arun points out, I don't know that you'll beat your solution in terms of time efficiency. In terms of code understandability, there are other options though:
One option:
> mapply("*",as.data.frame(m),v)
V1 V2 V3
[1,] 0.0 0.0 0.0
[2,] 1.5 0.0 0.0
[3,] 1.5 3.5 0.0
[4,] 1.5 3.5 4.5
And another:
sapply(1:ncol(m),function(x) m[,x] * v[x] )
Solution 4:
For the sake of completeness, I added sweep
to the benchmark. Despite its somewhat misleading attribute names, I think it may be more readable than other alternatives, and also quite fast:
n = 1000
M = matrix(rnorm(2 * n * n), nrow = n)
v = rnorm(2 * n)
microbenchmark::microbenchmark(
M * rep(v, rep.int(nrow(M), length(v))),
sweep(M, MARGIN = 2, STATS = v, FUN = `*`),
t(t(M) * v),
M * rep(v, each = nrow(M)),
M %*% diag(v)
)
Unit: milliseconds
expr min lq mean
M * rep(v, rep.int(nrow(M), length(v))) 5.259957 5.535376 9.994405
sweep(M, MARGIN = 2, STATS = v, FUN = `*`) 16.083039 17.260790 22.724433
t(t(M) * v) 19.547392 20.748929 29.868819
M * rep(v, each = nrow(M)) 34.803229 37.088510 41.518962
M %*% diag(v) 1827.301864 1876.806506 2004.140725
median uq max neval
6.158703 7.606777 66.21271 100
20.479928 23.830074 85.24550 100
24.722213 29.222172 92.25538 100
39.920664 42.659752 106.70252 100
1986.152972 2096.172601 2432.88704 100
Solution 5:
As done by bluegrue, a simple rep would suffice as well to perform element-wise multiplication.
The number of multiplications and summations is reduced by a wide-margin as if simple matrix multiplication with diag()
is performed, where for this case a lot of zero-multiplications can be avoided.
m = matrix(rnorm(1200000), ncol=6)
v=c(1.5, 3.5, 4.5, 5.5, 6.5, 7.5)
v2 <- rep(v,each=dim(m)[1])
library(microbenchmark)
microbenchmark(m %*% diag(v), t(t(m) * v), m*v2)
Unit: milliseconds
expr min lq mean median uq max neval cld
m %*% diag(v) 11.269890 13.073995 16.424366 16.470435 17.700803 95.78635 100 b
t(t(m) * v) 9.794000 11.226271 14.018568 12.995839 15.010730 88.90111 100 b
m * v2 2.322188 2.559024 3.777874 3.011185 3.410848 67.26368 100 a