Calculate the mean for each column of a data frame in R
I am working in R using RStudio. I need to calculate the mean of each column of a data frame.
cluster1 # a 5-by-4 data frame
mean(cluster1)
I got:
Warning message:
In mean.default(cluster1) :
argument is not numeric or logical: returning NA
But I can use
mean(cluster1[[1]])
to get the mean of the first column.
How can I get the means of all columns?
Any help would be appreciated.
Solution 1:
You can use colMeans:
### Sample data
set.seed(1)
m <- data.frame(matrix(sample(100, 20, replace = TRUE), ncol = 4))
### Your error
mean(m)
# [1] NA
# Warning message:
# In mean.default(m) : argument is not numeric or logical: returning NA
### The result using `colMeans`
colMeans(m)
# X1 X2 X3 X4
# 47.0 64.4 44.8 67.8
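As a small sketch building on the answer above: colMeans also accepts a plain numeric matrix and has an na.rm argument, and rowMeans is the row-wise counterpart. The matrix `mat` and its column names `u` and `v` are made up for illustration.

```r
# Hypothetical 3-by-2 numeric matrix with one missing value
mat <- matrix(c(1, 2, NA, 4, 5, 6), ncol = 2,
              dimnames = list(NULL, c("u", "v")))

# By default a missing value makes that column's mean NA
col_default <- colMeans(mat)              # u = NA, v = 5

# na.rm = TRUE drops missing values before averaging
col_clean <- colMeans(mat, na.rm = TRUE)  # u = 1.5, v = 5

# rowMeans does the same across rows
row_clean <- rowMeans(mat, na.rm = TRUE)  # 2.5, 3.5, 6

print(col_default)
print(col_clean)
print(row_clean)
```

Both functions require every column to be numeric, so a data frame containing a character or factor column has to be subset first.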
Solution 2:
You can use 'apply' to run a function on the rows or columns of a matrix or numeric data frame:
cluster1 <- data.frame(a=1:5, b=11:15, c=21:25, d=31:35)
apply(cluster1,2,mean) # applies function 'mean' to 2nd dimension (columns)
apply(cluster1,1,mean) # applies function to 1st dimension (rows)
sapply(cluster1, mean) # also takes mean of columns, treating data frame like list of vectors
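One difference between the two approaches is worth a quick illustration: apply converts a data frame to a matrix first, so a single character column turns every value into a string, while sapply looks at each column separately. The data frame `df` below is a made-up example, not the asker's data.

```r
# Hypothetical data frame with two numeric columns and one character column
df <- data.frame(a = 1:3,
                 b = c(2.5, 3.5, 4.5),
                 label = c("x", "y", "z"))

# apply() coerces df with as.matrix(), giving a character matrix,
# so mean() returns NA (with a warning) for every column
res_apply <- suppressWarnings(apply(df, 2, mean))

# sapply() works column by column; keep only the numeric columns first
res_sapply <- sapply(df[sapply(df, is.numeric)], mean)

print(res_apply)   # a, b and label are all NA
print(res_sapply)  # a = 2, b = 3.5
```

For a data frame whose columns are all numeric, as in the answer above, both calls give the same column means.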
Solution 3:
If your data contain NAs:
sapply(data, mean, na.rm = TRUE) # Returns a vector (with names)
lapply(data, mean, na.rm = TRUE) # Returns a list
Remember that "mean" needs numeric data. If your data frame has columns of mixed classes, keep only the numeric ones first:
numdata <- data[sapply(data, is.numeric)]
sapply(numdata, mean, na.rm = TRUE) # Returns a vector
lapply(numdata, mean, na.rm = TRUE) # Returns a list
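To try the lines above on something concrete, here is a minimal self-contained sketch. The data frame `dat` and its columns are invented for illustration. It also shows vapply, a stricter variant of sapply that checks each result is a single number.

```r
# Hypothetical mixed-class data frame with missing values
dat <- data.frame(id = c("a", "b", "c", "d"),
                  x  = c(1, NA, 3, 5),
                  y  = c(10, 20, NA, 30))

# Keep only the numeric columns (drops the character column 'id')
numdat <- dat[sapply(dat, is.numeric)]

# Column means ignoring NAs, as a named numeric vector
means_sapply <- sapply(numdat, mean, na.rm = TRUE)  # x = 3, y = 20

# Same result, but vapply() errors if any result is not one numeric value
means_vapply <- vapply(numdat, mean, numeric(1), na.rm = TRUE)

print(means_sapply)
print(means_vapply)
```

vapply can be a safer choice inside functions, because sapply may silently return a list when results have unexpected lengths.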