For each row in an R dataframe
Solution 1:
You can use the by()
function:
by(dataFrame, seq_len(nrow(dataFrame)), function(row) dostuff)
But iterating over the rows directly like this is rarely what you want to; you should try to vectorize instead. Can I ask what the actual work in the loop is doing?
Solution 2:
You can try this, using apply()
function
> d
name plate value1 value2
1 A P1 1 100
2 B P2 2 200
3 C P3 3 300
> f <- function(x, output) {
wellName <- x[1]
plateName <- x[2]
wellID <- 1
print(paste(wellID, x[3], x[4], sep=","))
cat(paste(wellID, x[3], x[4], sep=","), file= output, append = T, fill = T)
}
> apply(d, 1, f, output = 'outputfile')
Solution 3:
First, Jonathan's point about vectorizing is correct. If your getWellID() function is vectorized, then you can skip the loop and just use cat or write.csv:
write.csv(data.frame(wellid=getWellID(well$name, well$plate),
value1=well$value1, value2=well$value2), file=outputFile)
If getWellID() isn't vectorized, then Jonathan's recommendation of using by
or knguyen's suggestion of apply
should work.
Otherwise, if you really want to use for
, you can do something like this:
for(i in 1:nrow(dataFrame)) {
row <- dataFrame[i,]
# do stuff with row
}
You can also try to use the foreach
package, although it requires you to become familiar with that syntax. Here's a simple example:
library(foreach)
d <- data.frame(x=1:10, y=rnorm(10))
s <- foreach(d=iter(d, by='row'), .combine=rbind) %dopar% d
A final option is to use a function out of the plyr
package, in which case the convention will be very similar to the apply function.
library(plyr)
ddply(dataFrame, .(x), function(x) { # do stuff })