What is the difference between statistical mean and calculus mean?

For example, in statistics we learn that mean = E(x) of a function, which is defined as

$$\mu = \int_a^b xf(x) \,dx$$

however in calculus we learn that

$$\mu = \frac {1}{b-a}\int_a^b f(x) \,dx $$

What is the difference between the means in statistics and calculus and why don't they give the same answer?

Thank you.


Solution 1:

This seems to be based on confusion resulting from the resemblance between the notations used in the two situations.

In probability and statistics, one learns that $\displaystyle\int_{-\infty}^\infty x f(x)\,dx$ is the mean, NOT of the function $f$, but of a random variable denoted (capital) $X$ (whereas lower-case $x$ is used in the integral) whose probability density function is $f.$ This is the same as $\displaystyle \int_a^b xf(x)\,dx$ in cases where the probability is $1$ that the random variable $X$ is between $a$ and $b.$ (The failure, in the posted question, to distinguish between the lower-case $x$ used in the integral and the capital $X$ used in the expression $\operatorname E(X)$ is an error that can make it impossible to understand expressions like $\Pr(X\le x)$ and some other things.)

In calculus, the expression $\displaystyle \frac 1 {b-a} \int_a^b f(x)\,dx$ is the mean, NOT of any random variable $X,$ but of the function $f$ itself, on the interval $[a,b].\vphantom{\dfrac11}$
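To make the contrast concrete, here is a minimal numerical sketch. It assumes the hypothetical density $f(x) = 2x$ on $[0,1]$ (nonnegative and integrating to $1$, so a valid density): the statistical mean $\int_0^1 x f(x)\,dx = 2/3$, while the calculus mean of $f$ itself on $[0,1]$ is $1$.

```python
# Numerical check of the two "means" for a hypothetical density
# f(x) = 2x on [0, 1]: it is nonnegative and integrates to 1,
# so it is a valid probability density.

def trapezoid(g, a, b, n=100_000):
    """Approximate the integral of g over [a, b] with the trapezoid rule."""
    h = (b - a) / n
    total = (g(a) + g(b)) / 2
    for i in range(1, n):
        total += g(a + i * h)
    return total * h

f = lambda x: 2 * x
a, b = 0.0, 1.0

# Statistical mean of the random variable X:  E[X] = integral of x * f(x) dx  -> 2/3
stat_mean = trapezoid(lambda x: x * f(x), a, b)

# Calculus mean of the function f on [a, b]:  (1/(b-a)) * integral of f(x) dx -> 1
calc_mean = trapezoid(f, a, b) / (b - a)

print(stat_mean)  # ≈ 0.6667, and note it lies between a and b
print(calc_mean)  # ≈ 1.0
```

Note that the statistical mean lands between $a$ and $b$, as it must, while the calculus mean is a value on the $f(x)$-axis and is under no such constraint.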

Notice that in probability, you necessarily have $\displaystyle \int_a^b f(x)\,dx=1$ and $f(x)\ge 0,$ and the mean $\displaystyle \int_a^b xf(x)\,dx$ is necessarily between $a$ and $b.$ But none of that applies to the calculus problem, since the quantity whose mean is found is on the $f(x)$-axis, not on the $x$-axis.

$$\S \qquad\qquad \S \qquad\qquad \S$$

Postscript: Nine people including me have up-voted "Jack M"'s comment, so just to satisfy that point of view I will add some things.

If $f$ is the density function of the probability distribution of the random variable (capital) $X,$ then the mean of $g(X)$ (where $g$ is some other function) is $$ \int_{-\infty}^\infty g(x) f(x)\,dx. $$ Applying that to the situation in calculus, one can say that the density function of the uniform distribution on the interval $[a,b]$ is $1/(b-a),$ so if $X$ is a random variable with that distribution, then $$ \operatorname E(f(X)) = \int_a^b f(x) \frac 1 {b-a} \, dx. $$ And a random variable $X$ itself can be regarded as a function whose domain is a sample space $\Omega,$ with the probability measure $P$ assigning probabilities to subsets of $\Omega,$ and then you have $$ \operatorname E(X) = \int_\Omega X(\omega)\, P(d\omega). $$
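The identity $\operatorname E(f(X)) = \int_a^b f(x)\,\frac{1}{b-a}\,dx$ for $X$ uniform on $[a,b]$ can be checked by Monte Carlo. This sketch assumes the hypothetical choice $f = \sin$ on $[0, \pi]$, whose calculus mean is $\frac{1}{\pi}\int_0^\pi \sin x\,dx = 2/\pi$.

```python
# Monte Carlo check: for X uniform on [a, b], E[f(X)] equals the
# calculus mean of f on [a, b].  Hypothetical example: f = sin on [0, pi].
import math
import random

random.seed(0)
a, b = 0.0, math.pi
f = math.sin

# Draw X uniformly on [a, b] many times and average f(X).
n = 1_000_000
mc_mean = sum(f(random.uniform(a, b)) for _ in range(n)) / n

# Calculus mean of sin on [0, pi]: (1/pi) * integral of sin dx = 2/pi ≈ 0.6366
print(mc_mean)
```

So the calculus mean of $f$ is itself an expected value, just of a different random quantity: $f(X)$ rather than $X$.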

Solution 2:

Maybe you can consider the second form of mean as a sample mean, analogous to $\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i$ in the discrete case.

That is: the function $f(x)$ in this case would be the realization of a random variable.

Please consider the notion of a stochastic process, where a collection of random variables $X$ is indexed by, or associated with, a continuous deterministic variable $t$. For instance, $X(t)$ could model the weather temperature at a given moment in time $t$, during the interval $[a, b]$. So, $X(a)$ would not be a value, but a whole random variable, with a given mean, variance, etc. And when you write $X(t)$ you have a collection of random variables, one for each $t \in [a, b]$.

If all these variables have the same properties, one says $X(t)$ is stationary, and therefore $\mu$, the mean of $X(t)$, can be estimated using the series of values of a realization of $X(t)$ -- in our case, some actual measurements of the weather temperature in a given range of time. In that case, your function $f(x)$ would be renamed $x(t)$ (with a lowercase $x$) to indicate one realization of the stochastic process $X(t)$. And then

$$\bar{x} = \frac{1}{b - a} \int_a^b x(t)\,dt$$

would be an estimator of $\mu$, the actual mean of the stationary process $X(t)$.
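The estimator above can be sketched numerically. This assumes a hypothetical stationary model $X(t) = \mu + \text{noise}$ (independent Gaussian noise at each sample point, true mean $\mu = 20$, say a temperature): the discretized time average of one realization lands close to $\mu$.

```python
# Sketch of the time-average estimator for a hypothetical stationary
# process X(t) = mu + noise, sampled on [a, b].
import random

random.seed(1)
mu = 20.0          # true mean of the process (e.g. a temperature)
a, b = 0.0, 10.0
n = 100_000        # number of sample points of the realization
dt = (b - a) / n

# One realization x(t): the true mean plus independent Gaussian noise
# at each sampled instant.
x = [mu + random.gauss(0, 2) for _ in range(n)]

# Discrete version of x_bar = (1/(b-a)) * integral of x(t) dt
x_bar = sum(xi * dt for xi in x) / (b - a)
print(x_bar)  # close to mu = 20
```

The discretized integral here reduces to the ordinary sample mean of the measurements, which is exactly the analogy with $\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i$ drawn above.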