Error: In mean.default(y) : argument is not numeric or logical: returning NA

From reading other people's issues, I tried converting my data to numeric. However, I still get the error when I run the last "means" line. I copied the code exactly from the worksheet, so I'm not sure where to go from here.

apdata$concen <- as.numeric(apdata$concen)

meansVector <- function(data, times, size, var) {
  v <- c()
  for (i in 1:times) {
    y <- sample(data[,var], size, replace=TRUE)
    m <- mean(y)
    v[i] <- m
  }
  return(v)
}


means <- meansVector(apdata, 10, 100, "concen")

My guess is that your apdata is a tbl_df, not a vanilla data.frame.

For instance, let's use ggplot2::mpg as a dataset. (This can easily be shown using mtcars versus mtcars_tbl <- tibble(mtcars), if you don't have ggplot2 available.)

data("mpg", package = "ggplot2")
mpg
# # A tibble: 234 x 11
#    manufacturer model      displ  year   cyl trans      drv     cty   hwy fl    class  
#    <chr>        <chr>      <dbl> <int> <int> <chr>      <chr> <int> <int> <chr> <chr>  
#  1 audi         a4           1.8  1999     4 auto(l5)   f        18    29 p     compact
#  2 audi         a4           1.8  1999     4 manual(m5) f        21    29 p     compact
#  3 audi         a4           2    2008     4 manual(m6) f        20    31 p     compact
#  4 audi         a4           2    2008     4 auto(av)   f        21    30 p     compact
#  5 audi         a4           2.8  1999     6 auto(l5)   f        16    26 p     compact
#  6 audi         a4           2.8  1999     6 manual(m5) f        18    26 p     compact
#  7 audi         a4           3.1  2008     6 auto(av)   f        18    27 p     compact
#  8 audi         a4 quattro   1.8  1999     4 manual(m5) 4        18    26 p     compact
#  9 audi         a4 quattro   1.8  1999     4 auto(l5)   4        16    25 p     compact
# 10 audi         a4 quattro   2    2008     4 manual(m6) 4        20    28 p     compact
# # ... with 224 more rows

Using your function:

meansVector(mpg, 2, 3, "displ")
# Warning in mean.default(y) :
#   argument is not numeric or logical: returning NA
# Warning in mean.default(y) :
#   argument is not numeric or logical: returning NA
# [1] NA NA

This comes down to how single-column subsetting resolves in frames. With a plain data.frame, data[,var] returns a vector, but with data.table and tbl_df it returns a single-column frame, so sample() and then mean() end up working on a frame rather than a numeric vector. In base R, data[,var,drop=FALSE] does the same thing, returning a single-column frame.
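To see the difference directly, here is a small sketch that compares the two subsetting forms on the same data. It assumes the tibble package is installed and uses mtcars (which ships with R) as a stand-in for your apdata; the names df and tbl are just for illustration.

```r
# mtcars is a plain data.frame; tbl is the same data as a tibble (tbl_df).
# Assumption: the tibble package is installed.
df  <- mtcars
tbl <- tibble::as_tibble(mtcars)

# Single-bracket subsetting: the data.frame drops to a vector,
# the tibble keeps a one-column frame.
df_single  <- df[, "mpg"]
tbl_single <- tbl[, "mpg"]

is.numeric(df_single)      # TRUE
is.data.frame(tbl_single)  # TRUE
is.numeric(tbl_single)     # FALSE, which is why mean() complains

# Double-bracket subsetting returns the bare column in both cases.
is.numeric(df[["mpg"]])    # TRUE
is.numeric(tbl[["mpg"]])   # TRUE
```

Running class(apdata) on your own data should tell you which of the two cases you are in.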

We can test that theory further with:

meansVector(as.data.frame(mpg), 2, 3, "displ")
# [1] 3.900000 2.933333

How to fix? Use data[[var]] instead, as that is guaranteed to return just the column instead of a single-column frame. (Note that if it is a list-column, the return value is still not a vector, but that's a separate issue and unlikely in your situation.)

meansVector <- function(data, times, size, var) {
  v <- c()
  for (i in 1:times) {
    # data[[var]] always extracts the column itself, for data.frame and tbl_df alike
    y <- sample(data[[var]], size, replace=TRUE)
    m <- mean(y)
    v[i] <- m
  }
  return(v)
}

meansVector(mpg, 2, 3, "displ")
# [1] 3.533333 3.066667
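
If you want the function to fail loudly instead of quietly returning NA, one option is to check the column up front. This is only a sketch of that idea, not something from your worksheet: the name meansVectorChecked is made up here, and replicate() is swapped in for the explicit loop. It uses mtcars so it runs without extra packages.

```r
# Hypothetical variant of meansVector: same arguments, but it stops with an
# error if the selected column is not numeric, and uses replicate() in place
# of the for loop.
meansVectorChecked <- function(data, times, size, var) {
  x <- data[[var]]             # bare column, regardless of frame type
  stopifnot(is.numeric(x))     # fail early rather than return NA
  replicate(times, mean(sample(x, size, replace = TRUE)))
}

set.seed(1)  # only so the run is reproducible
res <- meansVectorChecked(mtcars, 10, 100, "mpg")
length(res)  # 10
```

Because each value is the mean of a resample, every element of res falls between min(mtcars$mpg) and max(mtcars$mpg).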