What is the right way to multiply a data frame by a vector?
Solution 1:
This works too:
data.frame(mapply(`*`,df,v))
This solution takes advantage of the fact that a data.frame is a type of list, so you can iterate over the elements of df and v at the same time with mapply.
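A minimal sketch of that column-by-column pairing, using a small made-up df and v (both are hypothetical example data, not taken from the question):

```r
# Hypothetical example data: two columns and one multiplier per column.
df <- data.frame(A = c(1, 2, 3), B = c(4, 5, 6))
v <- c(10, 100)

# mapply pairs df[[1]] with v[1], df[[2]] with v[2], and so on.
res <- data.frame(mapply(`*`, df, v))
res$A  # 10 20 30
res$B  # 400 500 600
```

Each column is multiplied by the element of v in the same position, so v is assumed to have one entry per column.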
Unfortunately, you are limited in what mapply can return: a simple list, or a matrix (which is what you get here by default, since SIMPLIFY = TRUE). If your data are huge, this would likely be more efficient:
data.frame(mapply(`*`,df,v,SIMPLIFY=FALSE))
That is because SIMPLIFY=FALSE makes mapply return a list, which is cheaper to convert to a data.frame than a matrix is.
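To make the difference concrete, here is a small sketch comparing the two return types on made-up data (df and v are hypothetical, chosen only for illustration):

```r
# Hypothetical example data.
df <- data.frame(A = c(1, 2, 3), B = c(4, 5, 6))
v <- c(10, 100)

m <- mapply(`*`, df, v)                    # default SIMPLIFY = TRUE
l <- mapply(`*`, df, v, SIMPLIFY = FALSE)  # keep the per-column results as a list

is.matrix(m)  # TRUE
is.list(l)    # TRUE

# Both convert to the same data.frame; only the intermediate object differs.
all.equal(data.frame(m), data.frame(l))  # TRUE
```

The result is the same either way; the list version just skips building the intermediate matrix.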
Solution 2:
If you're looking for speed and memory efficiency, data.table to the rescue:
library(data.table)
dt = data.table(df)
for (i in seq_along(dt))
  dt[, (i) := dt[[i]] * v[i]]
# Benchmark of the three approaches
library(microbenchmark)
eddi = function(dt) { for (i in seq_along(dt)) dt[, (i) := dt[[i]] * v[i]] }
arun = function(df) { df * matrix(v, ncol=ncol(df), nrow=nrow(df), byrow=TRUE) }
nograpes = function(df) { data.frame(mapply(`*`,df,v,SIMPLIFY=FALSE)) }
N = 1e6
dt = data.table(A = rnorm(N), B = rnorm(N))
v = c(0,2)
microbenchmark(eddi(copy(dt)), arun(copy(dt)), nograpes(copy(dt)), times = 10)
#Unit: milliseconds
#               expr       min        lq      mean    median        uq       max neval
#     eddi(copy(dt))  23.01106  24.31192  26.47132  24.50675  28.87794  34.28403    10
#     arun(copy(dt)) 337.79885 363.72081 450.93933 433.21176 516.56839 644.70103    10
# nograpes(copy(dt))  19.44873  24.30791  36.53445  26.00760  38.09078  95.41124    10
As Arun points out in the comments, one can also use the set function from the data.table package to do this in-place modification on data.frames as well:
for (i in seq_along(df))
  set(df, j = i, value = df[[i]] * v[i])
This of course also works for data.tables, and could be significantly faster if the number of columns is large.
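As a rough illustration of the set approach on a plain data.frame, here is a small sketch with made-up data (df and v are hypothetical; it assumes the data.table package is installed):

```r
library(data.table)

# Hypothetical example data: a plain data.frame, not a data.table.
df <- data.frame(A = c(1, 2, 3), B = c(4, 5, 6))
v <- c(10, 100)

# set() overwrites column j in place, so no copy of df is assigned back.
for (i in seq_along(df)) {
  set(df, j = i, value = df[[i]] * v[i])
}

df$A  # 10 20 30
df$B  # 400 500 600
```

Note that df itself is changed, so take a copy first if the original values are still needed.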