Probabilistic techniques, methods, and ideas in ("undergraduate") real analysis

As the book Probabilistic Techniques in Analysis by Richard F. Bass shows, techniques drawn from probability are nowadays used to tackle problems in analysis.

The book just mentioned surveys these methods "at the level of a beginning Ph.D. student", but I would like to see some "more basic" examples: applications of probability to undergraduate (so to speak) real analysis — in other words, probabilistic reasoning applied to calculus problems.


Here is an example.

The question is how to show that $$\binom{n}{k}^{-1}=(n+1)\int_0^1 x^k (1-x)^{n-k} \, dx. $$

To make this self-contained, I'll paste this answer below:

Let's do it somewhat like the way the Rev. Thomas Bayes did it in the 18th century (but I'll phrase it in modern probabilistic terminology).

Suppose $n+1$ independent random variables $X_0,X_1,\ldots,X_n$ are uniformly distributed on the interval $[0,1]$.

Suppose for $i=1,\ldots,n$ (starting with $1$, not with $0$) we define $$Y_i = \begin{cases} 1 & \text{if }X_i<X_0 \\ 0 & \text{if }X_i>X_0\end{cases}$$ (the case $X_i=X_0$ has probability zero, so it can be ignored).

Then $Y_1,\ldots,Y_n$ are conditionally independent given $X_0$, and $\Pr(Y_i=1\mid X_0)= X_0$.

So $\Pr(Y_1+\cdots+Y_n=k\mid X_0) = \dbinom{n}{k} X_0^k (1-X_0)^{n-k},$ and hence $$\Pr(Y_1+\cdots+Y_n=k) = \mathbb{E}\left(\dbinom{n}{k} X_0^k (1-X_0)^{n-k}\right).$$

Since $X_0$ is uniformly distributed on $[0,1]$, this is equal to $$ \int_0^1 \binom nk x^k(1-x)^{n-k}\;dx. $$

But the event $\{Y_1+\cdots+Y_n=k\}$ says exactly that $k$ of the values $X_1,\ldots,X_n$ fall below $X_0$, i.e., that $X_0$ occupies the $(k+1)$th position when $X_0,X_1,\ldots,X_n$ are sorted into increasing order.

By symmetry, each of the $n+1$ variables is equally likely to occupy that position, so this probability is $1/(n+1)$.

Thus $$\int_0^1\binom nk x^k(1-x)^{n-k}\;dx = \frac{1}{n+1},$$ which rearranges to the identity in the question.
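If you want to convince yourself numerically, here is a minimal Monte Carlo sketch of the story above in Python (the parameters $n=7$, $k=3$ and the trial count are arbitrary choices of mine, not from the original answer):

```python
import random

def billiards_trial(n, k):
    """One trial: draw X_0, ..., X_n uniform on [0,1] and check
    whether exactly k of X_1, ..., X_n fall below X_0."""
    x0 = random.random()
    count = sum(random.random() < x0 for _ in range(n))
    return count == k

# Estimate P(Y_1 + ... + Y_n = k); the argument says it is 1/(n+1)
# regardless of k.
n, k, trials = 7, 3, 200_000
estimate = sum(billiards_trial(n, k) for _ in range(trials)) / trials
print(estimate, 1 / (n + 1))  # e.g. ~0.125 vs 0.125
```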


The Weierstrass approximation theorem says that continuous real-valued functions on the unit interval can be uniformly approximated arbitrarily closely by polynomials. That is, for every continuous function $f:[0,1]\to\mathbb{R}$ and each $\varepsilon>0$, there is a polynomial $p$ such that $|f(x)-p(x)|<\varepsilon$ for all $x\in[0,1]$.

An elementary probabilistic proof goes as follows:

Let $U_1,U_2,\ldots$ be independent uniformly distributed random variables on $[0,1]$. For $n\in\mathbb{N}$, define a function $p_n:[0,1]\to\mathbb{R}$ by $$p_n(x) \triangleq \mathbb{E}\Big[\,f\Big(\frac{1_{U_1<x}+1_{U_2<x}+\cdots+1_{U_n<x}}{n}\Big) \Big]\;, $$ where $1_{U_i<x}$ is the indicator random variable of the event $\{U_i<x\}$. For brevity, let us write $\overline{X}_n(x)\triangleq \frac{1}{n}(1_{U_1<x}+1_{U_2<x}+\cdots+1_{U_n<x})$, so that $p_n(x)=\mathbb{E}[f(\overline{X}_n(x))]$. Since $1_{U_1<x},\ldots,1_{U_n<x}$ are i.i.d. Bernoulli random variables with parameter $x$, their sum is Binomial$(n,x)$, and hence $$p_n(x) = \sum_{k=0}^n f\Big(\frac{k}{n}\Big)\binom{n}{k}x^k(1-x)^{n-k}\;,$$ which is a polynomial in $x$ of degree at most $n$ (the $n$th Bernstein polynomial of $f$). Intuitively, by the law of large numbers, $\overline{X}_n(x)$ is going to be close to $x$ when $n$ is large. Hence, $f(\overline{X}_n(x))$ will also be close to $f(x)$.

To make this precise, let $\varepsilon>0$. A continuous function on a compact space is uniformly continuous and bounded. Therefore, there is a $\delta>0$ such that $|f(x)-f(y)|<\varepsilon/2$ for each $x,y\in[0,1]$ satisfying $|x-y|<\delta$. Moreover, there is a constant $c<\infty$ such that $|f(x)|<c$ for each $x\in[0,1]$.

Now, for each $x\in[0,1]$, we have \begin{align} |f(x)-p_n(x)| &= \Big|\mathbb{E}\big[f(x)-f(\overline{X}_n(x))\big]\Big| \\ &\leq \mathbb{E}\big|f(x)-f(\overline{X}_n(x))\big| \\ &\leq \underbrace{\mathbb{P}\big(|\overline{X}_n(x)-x|<\delta\big)}_{\leq 1} \frac{\varepsilon}{2} + \underbrace{\mathbb{P}\big(|\overline{X}_n(x)-x|\geq\delta\big)}_{ \text{via Chebyshev's} }\,2c \;, \end{align} since $|f(x)-f(\overline{X}_n(x))|<\varepsilon/2$ on the event $\{|\overline{X}_n(x)-x|<\delta\}$ and $|f(x)-f(\overline{X}_n(x))|\leq 2c$ always. By Chebyshev's inequality, we have $$ \mathbb{P}\big(|\overline{X}_n(x)-x|\geq\delta\big) \leq \frac{\mathrm{Var}[\overline{X}_n(x)]}{\delta^2} = \frac{x(1-x)}{n\delta^2} \leq \frac{1}{4n\delta^2} \;, $$ which is smaller than $\frac{\varepsilon}{4c}$ for $n>\frac{c}{\varepsilon\delta^2}$. It follows that $|f(x)-p_n(x)|<\varepsilon$ for all $x\in[0,1]$, provided $n>\frac{c}{\varepsilon\delta^2}$, and this concludes the proof.
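To see the construction concretely, here is a minimal numerical sketch in Python of the Bernstein polynomials derived above; the test function $f(x)=|x-1/2|$ and the evaluation grid are arbitrary choices of mine:

```python
from math import comb

def bernstein(f, n):
    """Return the n-th Bernstein polynomial of f, i.e.
    p_n(x) = E[f(S_n / n)] where S_n ~ Binomial(n, x)."""
    def p_n(x):
        return sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
                   for k in range(n + 1))
    return p_n

# The sup-norm error on a grid shrinks as n grows, illustrating the
# uniform convergence proved above.
f = lambda x: abs(x - 0.5)
grid = [i / 200 for i in range(201)]
for n in (10, 100, 500):
    p = bernstein(f, n)
    print(n, max(abs(f(x) - p(x)) for x in grid))
```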


Thomas Bayes showed that, for all integers $k,n$ with $0 \leq k \leq n$, $${n \choose k}\int_0^1 x^k (1-x)^{n-k}\,\mathrm dx = \frac{1}{n+1}$$ by pure thought, without using calculus. His argument, known as Bayes' billiards argument, uses two equivalent probabilistic stories about picking random points on the number line from $0$ to $1$.

That's just one example, though, not a general technique. A more general and powerful technique that's not normally emphasized in math courses is probabilistic interpretation: there are many integrals that, after some pattern matching (possibly in tandem with other techniques such as substitution, integration by parts, and differentiation under the integral sign), can be recognized as the integral of a known probability density function, as a moment of a known distribution, or as a convolution integral. The Normal, Beta, and Gamma distributions are especially important in this context.
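As a toy instance of this pattern matching (my own example, not one from the post above): $\int_0^\infty x^3 e^{-2x}\,dx$ matches, up to a constant, the $\mathrm{Gamma}(4,\ \mathrm{rate}=2)$ density $\frac{2^4}{\Gamma(4)}x^3 e^{-2x}$, so the integral equals $\Gamma(4)/2^4 = 3!/16 = 3/8$ with no integration by parts. Equivalently, it is $\frac{1}{2}\mathbb{E}[X^3]$ for $X\sim\mathrm{Exp}(\text{rate}=2)$, which suggests a quick Monte Carlo check in Python:

```python
import random
from math import factorial

# Interpret I = integral of x^3 e^{-2x} over [0, inf) probabilistically:
# the density of X ~ Exp(rate=2) is 2 e^{-2x}, so I = (1/2) E[X^3].
# Matching against the Gamma(4, rate=2) density instead gives the
# closed form I = Gamma(4) / 2^4 = 3!/16 = 0.375.
trials = 500_000
mc = 0.5 * sum(random.expovariate(2) ** 3 for _ in range(trials)) / trials
print(mc, factorial(3) / 2**4)  # Monte Carlo estimate vs exact 0.375
```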

A good explanation of the probabilistic argument for this integral can also be found in Qiaochu Yuan's post on the intuition behind the derivation of the beta function.


This problem, which has a bounty on it, contains a reference to a solution using probability theory. I think the probabilistic solution is quite elegant, though it works by proving an even stronger result than the one asked for. So I suspect another proof must exist, but that proof might be more complex.