Standard Deviation in R Seems to be Returning the Wrong Answer - Am I Doing Something Wrong?
A simple example of calculating the standard deviation:
d <- c(2,4,4,4,5,5,7,9)
sd(d)
yields
[1] 2.13809
but when done by hand, the answer is 2. What am I missing here?
Try this
R> sd(c(2,4,4,4,5,5,7,9)) * sqrt(7/8)
[1] 2
R>
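and see the Wikipedia article on standard deviation for the discussion of how it is estimated. The formula employed 'by hand' divides by N, which gives a biased estimate of the variance when applied to a sample, so sd() in R divides by N - 1 instead. Multiplying by sqrt((N-1)/N) converts R's result back to the by-hand value. Here is a key quote: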
The term standard deviation of the sample is used for the uncorrected estimator (using N) while the term sample standard deviation is used for the corrected estimator (using N − 1). The denominator N − 1 is the number of degrees of freedom in the vector of residuals, (x_1 − x̄, …, x_N − x̄).
Looks like R is assuming (n-1) in the denominator, not n.
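A quick way to check this is to compute both versions by hand in R. This is only a sketch using the data from the question; the variable names (ss, sd_sample, sd_population) are made up for illustration and are not part of base R.

```r
d <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(d)

# Sum of squared deviations from the mean (illustrative name)
ss <- sum((d - mean(d))^2)

# Divide by n - 1: this is the value sd() returns
sd_sample <- sqrt(ss / (n - 1))

# Divide by n: this is the "by hand" value from the question
sd_population <- sqrt(ss / n)

print(sd_sample)                    # 2.13809
print(sd_population)                # 2
print(all.equal(sd_sample, sd(d)))  # TRUE
```

If the n-denominator version is what you need, sqrt(mean((d - mean(d))^2)) gives the same result in one line.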