Motivation of the Gaussian Integral
Solution 1:
You asked about a natural problem that leads to this integral. Here's a summary of the argument I give in my undergraduate probability theory class. (It's due to Dan Teague; he has the article here.)
Imagine throwing a dart at the origin in the plane. You're aiming at the origin, but there is some variability in your throws. The following assumptions perhaps seem reasonable.
- Errors do not depend on the orientation of the coordinate system.
- Errors in perpendicular directions are independent. (Being too high doesn't affect the probability of being too far to the right.)
- Large errors are less likely than small errors.
Let the probability of landing in a thin vertical strip from $x$ to $x + \Delta x$ be $p(x) \Delta x$. Similarly, let the probability of landing in a thin horizontal strip from $y$ to $y + \Delta y$ be $p(y) \Delta y$. So the probability of the dart landing in the intersection of the two strips is $p(x)\, p(y)\, \Delta x\, \Delta y$. Since the orientation doesn't matter, any similar region $r$ units away from the origin has the same probability, and so we could express this probability in polar form as $p(r)\, \Delta x\, \Delta y$; i.e., $p(r) = p(x)\, p(y)$.
Differentiating both sides of $p(r) = p(x) p(y)$ with respect to $\theta$ yields $0 = p(x) \frac{dp(y)}{d \theta} + p(y) \frac{dp(x)}{d \theta}$. Using $x = r \cos \theta$, $y = r \sin \theta$, simplifying, and separating variables produces the differential equation $$\frac{p'(x)}{x p(x)} = \frac{p'(y)}{y p(y)}.$$
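Spelling out the chain-rule step behind that simplification: since $r$ does not depend on $\theta$, the left side differentiates to zero, while on the right
$$\frac{dp(x)}{d\theta} = p'(x)\,\frac{dx}{d\theta} = -p'(x)\, r\sin\theta = -y\, p'(x), \qquad \frac{dp(y)}{d\theta} = p'(y)\, r\cos\theta = x\, p'(y),$$
so $0 = x\, p(x)\, p'(y) - y\, p(y)\, p'(x)$, which rearranges to the displayed equation.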
Now, we assumed that $x$ and $y$ are independent, yet this differential equation holds for any $x$ and $y$. This is only possible if, for some constant $C$, $$\frac{p'(x)}{x p(x)} = \frac{p'(y)}{y p(y)} = C.$$ Solving the $x$ version of this differential equation yields $$\frac{dp}{p} = Cx \, dx \Rightarrow \ln p = \frac{Cx^2}{2} + c \Rightarrow p(x) = Ae^{Cx^2/2}.$$ Finally, since large errors are less likely than small errors, $C$ must be negative, say $C = -k$ with $k > 0$. So we have $$p(x) = A e^{-kx^2/2}.$$ Since $p(x)$ is a probability density function, $$\int_{-\infty}^{\infty} A e^{-kx^2/2}\, dx = 1,$$ which is just a scaled version of your original integral.
(A little more work shows that $A = \sqrt{k/2\pi}$. Also, if you think about it some, it makes sense that $k$ should be inversely related to the variability in your throwing. And for the normal pdf, we do in fact have $k = 1/\sigma^2$.)
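For completeness, the "little more work" is the standard polar-coordinates trick applied to the square of the integral:
$$\left(\int_{-\infty}^{\infty} e^{-kx^2/2}\,dx\right)^2 = \int_{-\infty}^{\infty}\!\int_{-\infty}^{\infty} e^{-k(x^2+y^2)/2}\,dx\,dy = \int_0^{2\pi}\!\!\int_0^{\infty} e^{-kr^2/2}\, r\,dr\,d\theta = \frac{2\pi}{k},$$
so $\int_{-\infty}^{\infty} e^{-kx^2/2}\,dx = \sqrt{2\pi/k}$ and hence $A = \sqrt{k/2\pi}$.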
Solution 2:
In the early 18th century, Abraham de Moivre wrote a book on probability called The Doctrine of Chances. He wrote in English because he had fled to England to escape the persecution of Protestants in France. He considered the probability distribution of the number of heads that appear when a fair coin is tossed $n$ times. The exact probability that that number is $x$ takes a while to compute. The mean is $\mu=n/2$ and the standard deviation is $\sigma= \sqrt{n}/2$. Consider $$ \varphi(x) = (\text{some normalizing constant}) \cdot e^{-x^2/2},\text{ and }\Phi(x) = \int_{-\infty}^x \varphi(u)\,du, $$ where the constant is chosen so that $\varphi$ integrates to 1. de Moivre found the normalizing constant numerically, and later his friend James Stirling showed that it is $1/\sqrt{2\pi}$, which I think was stated in a later edition of de Moivre's book.
de Moivre showed that the cumulative probability distribution of the number of heads, evaluated at $x$, approaches $F(x)=\Phi((x-\mu)/\sigma)$ as $n$ grows. This was an early version of the central limit theorem. The probability that the number of heads is exactly $x$ is approximated by $F(x+1/2) - F(x-1/2)$.
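As a quick numerical check of that half-integer approximation, here is a small Python snippet (the choice $n = 100$ and the sample values of $x$ are just for illustration):

```python
import math

def Phi(z):
    # Standard normal c.d.f., written in terms of the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

n = 100                    # number of fair-coin tosses (illustrative choice)
mu = n / 2                 # mean n/2
sigma = math.sqrt(n) / 2   # standard deviation sqrt(n)/2

for x in [45, 50, 55, 60]:
    exact = math.comb(n, x) / 2**n                                # exact binomial probability
    approx = Phi((x + 0.5 - mu) / sigma) - Phi((x - 0.5 - mu) / sigma)
    print(f"P(heads = {x}): exact {exact:.5f}, de Moivre approximation {approx:.5f}")
```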
That is a reason to consider this function.
In the 19th century, Carl Gauss showed that least-squares estimates of regression coefficients coincide with maximum-likelihood estimates precisely if the cumulative probability distribution function (c.d.f.) of the errors is $x\mapsto\Phi(x/\sigma)$ for some $\sigma>0$. Apparently that's how the name "Gaussian" got attached to these functions.
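A brief sketch of the easy direction (assuming independent errors with density proportional to $e^{-\varepsilon^2/(2\sigma^2)}$, and writing $\hat y_i$ for the fitted values): the log-likelihood of the observations is
$$\sum_i \log\left(\frac{1}{\sigma\sqrt{2\pi}}\, e^{-(y_i-\hat y_i)^2/(2\sigma^2)}\right) = \text{const} - \frac{1}{2\sigma^2}\sum_i (y_i-\hat y_i)^2,$$
so maximizing the likelihood over the regression coefficients is the same as minimizing the sum of squared residuals.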
James Clerk Maxwell showed that if $X_1,\dots,X_n$ are independent, identically distributed random variables whose joint distribution is spherically symmetric in Euclidean space, then again, the c.d.f. of each $X_i$ must be that same "Gaussian" function $x\mapsto\Phi(x/\sigma)$.
Solution 3:
My guess (and it is only a guess) is that Laplace was motivated by applications to the heat equation. As it turns out, a scaled Gaussian describes how heat propagates from a point in Euclidean space, so you can describe how heat propagates from an arbitrary initial distribution by adding up a bunch of Gaussians. Of course, an arbitrary initial distribution may be continuous, and then the sum turns into a convolution.
If you expect that the distribution of heat propagating from a point is proportional to a Gaussian (which I guess you can motivate by heuristically applying the central limit theorem), then the proportionality constant is a Gaussian integral, which you now need to know the value of.
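Concretely, in one space dimension the heat kernel, i.e. the temperature profile spreading from a unit point source under $u_t = \alpha\, u_{xx}$ (writing $\alpha$ for the diffusivity), is
$$u(x,t) = \frac{1}{\sqrt{4\pi\alpha t}}\, e^{-x^2/(4\alpha t)},$$
and the requirement that the total heat $\int_{-\infty}^{\infty} u(x,t)\,dx$ equal 1 is exactly a Gaussian integral.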
Solution 4:
The function $e^{-x^2}$ is proportional to the probability density of the normal distribution (with mean 0, variance $1/2$). So you want to find the constant $C$ to make this a probability density:
$$\int_{-\infty}^{\infty}\!Ce^{-x^2}\,dx=1$$
(since you want the total probability to be 1).
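Here the constant can be written down explicitly: since $\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}$, we need $C = 1/\sqrt{\pi}$, which agrees with the usual normal density $\frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-x^2/(2\sigma^2)}$ at $\sigma^2 = 1/2$.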
I suspect that's the historic reason as well.
Solution 5:
The function $e^{-x^2}$ is natural for investigation for lots of different reasons. One reason is that, depending on your normalization, it is essentially a fixed point of the Fourier Transform. That is, $$ \int_{\mathbb{R}^n} e^{-\pi |x|^2} e^{-2\pi ix\cdot t}\,\mathrm{d}x=e^{-\pi |t|^2}. $$ Another reason is tied to the Central Limit Theorem. Suppose that $f$ satisfies $\int_{\mathbb{R}^n}f(x)\;\mathrm{d}x=1$, $\int_{\mathbb{R}^n}x\;f(x)\;\mathrm{d}x=0$, and $\int_{\mathbb{R}^n}|x|^2\;f(x)\;\mathrm{d}x=1$ (these can be attained by translating and scaling the domain and scaling the range of $f$). Let $f^{\;*k}$ be the convolution of $f$ with itself $k$ times. Then $k^{n/2}f^{\;*k}(x\sqrt{k})\to \frac{1}{\sqrt{2\pi}^n}e^{-|x|^2/2}$ as $k\to\infty$.
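A quick numerical illustration of the fixed-point property in the one-dimensional case $n=1$ (the grid, the cutoff at $|x|=10$, and the sample values of $t$ are arbitrary choices; the Gaussian tails beyond the cutoff are negligible):

```python
import numpy as np

# Check numerically that, with this normalization, exp(-pi x^2) is essentially
# its own Fourier transform in one dimension.
x = np.linspace(-10.0, 10.0, 200001)
f = np.exp(-np.pi * x**2)

for t in [0.0, 0.5, 1.0, 2.0]:
    # Trapezoidal approximation of the Fourier integral; the imaginary part is ~0 by symmetry.
    transform = np.trapz(f * np.exp(-2j * np.pi * x * t), x)
    print(f"t = {t}: transform = {transform.real:.6f}, exp(-pi t^2) = {np.exp(-np.pi * t**2):.6f}")
```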