Is the Dirac Delta "Function" really a function?

First, you should confront the question: why think of the $\delta$-function as a function at all? If you are trying to imagine it as a real-valued function of real inputs, which just happens to be $0$ just about everywhere, then you are off to a bad (but very common) start. You can define $\delta$ as a symbol with certain properties relating to combining it with an actual function and some other symbols (e.g. $\int$), and this really suffices for most purposes, so why insist on trying to cram such an interesting object into a limited definition of "function"?

So instead, let's take a different approach. Let $f : \mathbb{R} \to \mathbb{C}$ be a generic function from the reals to the complexes. Consider the set of all$^1$ such functions, and call it $L$ for lack of a better letter. $L$ is a set just like $\mathbb{R}$, and so we can define maps (read: functions) from it to $\mathbb{C}$ as well. The $\delta$-function is one such beast, defined by \begin{align} \delta : L & \to \mathbb{C} \\ f & \mapsto f(0). \end{align} Thus it is a function, but not of real numbers. It is a function of functions of reals, which is sometimes called a functional.
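As a rough sketch of this point of view (in Python, with the names `Func` and `delta` made up for illustration), the $\delta$ functional is just a higher-order function: its argument is itself a function, and its output is that function's value at $0$.

```python
import cmath
from typing import Callable

# Hypothetical alias: a member of L is any map from reals to complexes.
Func = Callable[[float], complex]

def delta(f: Func) -> complex:
    """The delta functional: takes a function f and returns f(0)."""
    return f(0.0)

# delta is fed functions, not numbers.
print(delta(lambda x: x**2 + 3))            # 3.0
print(delta(lambda x: cmath.exp(1j * x)))   # (1+0j)
```

Nothing here asks for a "value of $\delta$ at $x$"; the only thing the definition supplies is how $\delta$ acts on an $f$.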

So what about the integrals? Well, you can also approach this in a limiting fashion. One way is to note that $$ \lim_{\sigma\to0} \int\limits_\mathbb{R} f(x) \frac{1}{\sqrt{2\pi\sigma^2}} \mathrm{e}^{-x^2/2\sigma^2} \mathrm{d}x = f(0). $$ Exchange the limit and the integral$^2$, and you see that there is a "function" - or rather a limit of a sequence of functions from $L$ that is itself not a member of $L$ - whose values seem to be given by $$ \delta(x) = \lim_{\sigma\to0} \frac{1}{\sqrt{2\pi\sigma^2}} \mathrm{e}^{-x^2/2\sigma^2}. $$ This is what a distribution is, with terminology suggestive of the probability distributions one so often integrates against (though I could be mistaken on the etymology). Note though that we really weren't allowed to switch that limit and integral while we still called that Gaussian-looking thing a member of $L$. After all, taking the pointwise limit first produces something that vanishes everywhere except at a single point, and such an object will cause the Lebesgue integral we were using to vanish as well.
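The limit can be checked numerically. The following is a minimal sketch, assuming a plain midpoint rule and the test function $f = \cos$ (both are illustrative choices, and the helper names `gaussian` and `smeared_value` are made up here). The integrals approach $f(0) = 1$ as $\sigma$ shrinks, while the Gaussian itself collapses to $0$ at any fixed $x \neq 0$.

```python
import math

def gaussian(x, sigma):
    """Normalized Gaussian of width sigma, centred at 0."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / math.sqrt(2.0 * math.pi * sigma * sigma)

def smeared_value(f, sigma, half_width=10.0, steps=20_000):
    """Midpoint-rule approximation of the integral of f(x) * gaussian(x, sigma)
    over [-half_width * sigma, half_width * sigma]; the tails beyond that
    are negligible for a bounded f."""
    a = -half_width * sigma
    h = 2.0 * half_width * sigma / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        total += f(x) * gaussian(x, sigma)
    return total * h

# Integrate first, then shrink sigma: the values tend to cos(0) = 1.
for sigma in (1.0, 0.1, 0.01):
    print(sigma, smeared_value(math.cos, sigma))

# Shrink sigma first at a fixed x != 0: the integrand itself dies off.
for sigma in (1.0, 0.1, 0.01):
    print(sigma, gaussian(0.5, sigma))
```

The two loops are the two orders of operations discussed above: the first keeps the limit outside the integral, the second looks at the pointwise limit that one would get by pulling it inside.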

In any event, the integral was there from the very beginning. You can think of this as overbearing notation for what we really wanted to say: "Give the value that results when $\delta$ acts on $f$." The integral notation has another advantage, though, and that is in connection with inner product spaces. Secretly, we constructed $L$ to be a vector space over $\mathbb{C}$. Then the set of linear maps from $L$ to $\mathbb{C}$ forms its dual space $L^*$. For every $g \in L$ there is a corresponding $g^* \in L^*$, which can conveniently be represented in this integral notation as the complex conjugate of $g$.$^3$ The inner product of $f$ and $g$ is $$ \langle f | \underbrace{g}_{g\in L} \rangle = \int\limits_\mathbb{R} f(x) \underbrace{g^*}_{g,g^*\in L}(x) \mathrm{d}x, $$ and so you can identify \begin{align} \underbrace{g^*}_{g^*\in L^*} : L & \to \mathbb{C} \\ f & \mapsto \int\limits_\mathbb{R} f\underbrace{g^*}_{g,g^*\in L}. \end{align}

Now for every $g \in L$ there is a corresponding dual member that you can write as the complex conjugate of $g$ for the purposes of such integration, but the converse is not true.$^4$ $\delta$ is an example of a member of $L^*$ that has no actual function in $L$ we can complex conjugate and integrate against to replicate its behavior.
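For contrast, here is a small sketch of the finite-dimensional situation, where the converse does hold. Vectors in $\mathbb{C}^n$ stand in for functions, a finite sum stands in for the integral, and the helper names `inner` and `kronecker` are invented for this illustration. The "evaluate at index $j$" functional is represented by an honest vector, namely the one whose entries are Kronecker deltas.

```python
def inner(f, g):
    """<f|g> = sum_k f[k] * conj(g[k]): the finite-sum analogue of the integral."""
    return sum(fk * complex(gk).conjugate() for fk, gk in zip(f, g))

def kronecker(n, j):
    """The vector in C^n with entries delta_{jk}: 1 in slot j, 0 elsewhere."""
    return [1 if k == j else 0 for k in range(n)]

f = [2 + 1j, -3, 0.5j, 7]
e2 = kronecker(4, 2)

# Pairing f with e2 picks out the entry f[2], just as delta picks out f(0).
print(inner(f, e2))
print(f[2])
```

In the infinite-dimensional $L$ there is no member of $L$ that can play the role of `e2`, which is exactly the failure of the converse described above.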


$^1$ In practice this is often too much. It's better to restrict attention to, e.g., all square-integrable functions from $\mathbb{R}$ to $\mathbb{C}$.

$^2$ Beware! A very dangerous thing to do!

$^3$ Yes, we are about to thoroughly abuse the two meanings of $*$ - be on the lookout.

$^4$ It won't be in general unless $L$ is finite-dimensional, but in that case you have Kronecker deltas and finite sums rather than Dirac deltas and integrals.


The reason that we try to make sense of the $\delta$-function inside of an integral is that its defining characteristic is given in terms of an integral. That is, the $\delta$-function is zero everywhere other than the origin, and

$$\int_{\mathbb{R}} \delta(x) dx = 1$$

Heuristically, the $\delta$-function concentrates all its mass at the origin. Of course, no actual function $f(x)$ enjoys this property, since if $f(x)$ were zero everywhere other than the origin, it would have integral $0$, even if $f(0) = + \infty$.

That being said, there are situations, such as in the study of electromagnetism, in which we would like to talk about positive mass existing at a single point. The language of integral calculus is indispensable, but it does not classically allow for such constructions. Thus, the $\delta$-function emerges as a way of allowing our theory of integration to make sense of these point masses.

One way to remedy the fact that the $\delta$-function is not a function is to reinterpret it as a distribution, as Chris explained above. Another option is to think of it as a measure. In case you haven't studied measure theory, I'll skip the technical details and mention only that a measure is a way of assigning a size to a set. The integral above ends with $dx$, which corresponds to the measure that assigns to every set its "obvious" size. The size of $[0,4]$ is $4$, the size of a point set is $0$, and this can be extended to most "messy" sets in a sensible way. When we used the measure $dx$, it was impossible for our integral to detect point masses, since a point was assigned zero size, and hence was inconsequential with respect to integration.

However, we can define a measure $\delta_0$ that assigns a set size $1$ if it contains $0$, and assigns it a size of $0$ otherwise. If we integrate using this measure, all mass is concentrated at the origin, and indeed we have

$$\int_{\mathbb{R}} f(x)\, d\delta_0 = f(0) = \text{“}\int_{\mathbb{R}} f(x)\delta(x)\, dx\text{”}$$
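To see the two measures side by side, here is a toy sketch. It assumes sets are closed intervals $[lo, hi]$ and integrates only step-function approximations of $f$, so it is an illustration of the idea and not a real measure-theoretic integral; the names `length`, `dirac` and `integrate_step` are invented for the example.

```python
import math

def length(lo, hi):
    """The "obvious" size (Lebesgue measure) of the closed interval [lo, hi]."""
    return hi - lo

def dirac(lo, hi):
    """Dirac measure delta_0 of [lo, hi]: 1 if the interval contains 0, else 0."""
    return 1 if lo <= 0.0 <= hi else 0

def integrate_step(f, measure, width=0.01, n=500):
    """Integrate a step-function approximation of f against `measure`.

    The step function takes the value f(k * width) on the cell of the given
    width centred at k * width, for k = -n, ..., n.
    """
    total = 0.0
    for k in range(-n, n + 1):
        centre = k * width
        total += f(centre) * measure(centre - width / 2, centre + width / 2)
    return total

print(length(0, 4), dirac(0, 4))   # 4 1
print(length(0, 0), dirac(0, 0))   # 0 1  (a single point: no length, full Dirac mass)
print(length(1, 4), dirac(1, 4))   # 3 0

print(integrate_step(math.cos, length))  # an ordinary Riemann-style sum
print(integrate_step(math.cos, dirac))   # only the cell containing 0 contributes
```

With `dirac` as the measure, every cell except the one containing the origin is weighted by $0$, so the sum reduces to $f(0)$.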


There are several ways to look at this. Personally, I think it is clearer to say that the Dirac $\delta$ function is actually a function -- that's what the name says. Unfortunately, the Dirac $\delta$ function does not exist. There are closely related objects, however, which do exist, and which let us do most of the things that we wish we could do with the $\delta$ function if it actually existed. Understanding both of the previous two sentences is key to understanding what is going on.

Because the $\delta$ function does not exist, many people redefine the term "$\delta$ function" to mean something that does exist, so that they can still talk "as if" the $\delta$ function existed, even though it does not. I find this somewhat revisionist, but it is the most common approach I have seen among people who use the $\delta$ function in their day-to-day work, so you have to expect it when you look at the literature.

The idea is that any statement involving the $\delta$ function is actually an abbreviation of a different statement, or family of statements, each of which only involves objects that actually exist. For example, the equation $$ \int f \delta\,dx = f(0) $$ can be viewed as an abbreviation for $$ \lim_{n \to \infty} \int f\phi_n\,dx = f(0) $$ where $(\phi_n)$ is a particular sequence of actually-existing functions. Thus the $\delta$ function is replaced by the $\delta$ distribution. The answer by Chris White has more details.
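To make this concrete, here is one possible choice of $(\phi_n)$, picked purely for illustration and not necessarily the sequence anyone above had in mind: the box functions $\phi_n = n\,\mathbf{1}_{[-1/2n,\,1/2n]}$, each of which has total integral $1$. Assuming $f$ is continuous at $0$, $$ \left| \int f\phi_n\,dx - f(0) \right| = \left| n\int_{-1/2n}^{1/2n} \bigl(f(x) - f(0)\bigr)\,dx \right| \le \sup_{|x|\le 1/2n} \bigl|f(x)-f(0)\bigr| \to 0 \quad \text{as } n \to \infty, $$ so every statement in the family involves only ordinary functions and ordinary integrals.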

On the other hand, $$ \int f \delta\,dx = f(0) $$ can also be viewed as an abbreviation for $$ \int f\,d\delta = f(0) $$ where the "$\delta$" on the right-hand side is not the $\delta$ function but the Dirac measure. This is explained in more detail by Isaac Solomon. Again, "$\int f\delta\,dx$" is a purely formal abbreviation because (in the jargon of measure theory) the $\delta$ measure is not absolutely continuous with respect to Lebesgue measure and so it has no Radon-Nikodym derivative; if this derivative existed, it would be the $\delta$ function.
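The obstruction can be spelled out in one line. Suppose some measurable function $g$ (a hypothetical density, introduced only for this argument) satisfied $\delta(A) = \int_A g\,dx$ for every measurable set $A$. Taking $A = \{0\}$ would then give $$ 1 = \delta(\{0\}) = \int_{\{0\}} g\,dx = 0, $$ because $\{0\}$ has Lebesgue measure zero. No such $g$ can exist, which is the precise sense in which the $\delta$ function is missing.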

One reason to continue writing $$ \int f \delta\,dx = f(0) $$ is that it does not commit us to either of these two interpretations; we can switch back and forth between them whenever it is convenient. It is also convenient for setting up computational problems, in the same way that some people set up integrals by drawing diagrams labeled with infinitesimals while at the same time accepting that infinitesimals don't exist.

Another example of a non-existent but still useful mathematical object is the field with one element. There is no field with one element - so this object does not, strictly speaking, exist. But it has nevertheless been useful as a way of thinking about results that involve objects that do exist.