Solution 1:

I'll try to present the role that distributions play when we're looking for solutions of differential equations. For the moment, let's work in $\mathbb{R}^n$ and with smooth functions only. Suppose that we would like to solve the Poisson equation $\Delta u = f$, given the function $f \in C^{\infty}$. If such a function $u \in C^{\infty}$ exists, then for any $v \in C^{\infty}_0$, we have \begin{equation} \int_{\mathbb{R}^n} v \Delta u = \int_{\mathbb{R}^n} fv. \tag{1} \end{equation} Integrating by parts (there are no boundary terms, since $v$ has compact support), \begin{equation} -\int_{\mathbb{R}^n} \nabla u \cdot \nabla v = \int_{\mathbb{R}^n} fv. \tag{2} \end{equation} Let us now reverse the logic: if we can find a function $u$ that satisfies $(2)$ for every $v \in C^{\infty}_0$, does it necessarily satisfy $\Delta u = f$? Whether or not it does, we'll call such a function $u$ a $\textbf{weak solution}$ of the differential equation $\Delta u = f$. It is in general much easier to show that a particular differential equation has a weak solution than a solution in the usual sense.
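To see why enlarging the class of admissible solutions can help, here is a one-dimensional illustration (it steps outside the smooth setting above, so take it only as a side remark): with $f(x) = \operatorname{sign}(x)$, the function $u(x) = \tfrac{1}{2}\, x\,|x|$ is $C^1$ but not $C^2$, yet integrating by parts on each half-line gives $$ -\int_{\mathbb{R}} u'(x)\, v'(x)\, dx = \int_{\mathbb{R}} \operatorname{sign}(x)\, v(x)\, dx \qquad \text{for every } v \in C^{\infty}_0, $$ so $u$ is a weak solution of $u'' = f$ even though the equation fails to make classical sense at $x = 0$.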

The idea of distributions takes this one step further: we can view the map $v \mapsto -\int_{\mathbb{R}^n} \nabla u \cdot \nabla v$ as a linear functional $C^{\infty}_0 \to \mathbb{R}$. We'll define a $\textbf{distribution}$ to be a continuous linear functional $C^{\infty}_0 \to \mathbb{R}$ (with respect to a suitable topology on $C^{\infty}_0$); as before, we can ask: if we have a distribution $\nu$ whose (distributional) Laplacian agrees with the linear functional $v \mapsto \int_{\mathbb{R}^n} fv$, can we build from it a solution of $\Delta u =f$? (Such a distribution will often also be called a weak solution.) While it is not clear whether or not this will always work, this is the setting in which the machinery of functional analysis is best suited to tackle the problem.
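Concretely (sketching the standard convention): the Laplacian of a distribution $T$ is defined by duality, $$ (\Delta T)(v) := T(\Delta v), \qquad v \in C^{\infty}_0, $$ so "$\Delta T = f$ in the sense of distributions" just means $T(\Delta v) = \int_{\mathbb{R}^n} fv$ for every test function $v$. When $T$ happens to be integration against a $C^2$ function $u$, two integrations by parts show that this is equivalent to $\Delta u = f$ pointwise.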

In many nice cases (such as the Poisson equation for suitable $f$), if we have a distribution that solves our PDE in the weak sense, then it turns out that this distribution is in fact given by a function, and its regularity depends on that of the given function $f$ (for example, if $f \in C^{\infty}$ and $\Delta u = f$ in the weak sense, then $u \in C^{\infty}$ as well; this is an example of elliptic regularity).
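One common quantitative version of this (stated here only as a side remark, in Sobolev-space language): if $\Delta u = f$ holds in the distributional sense and $f \in H^{k}_{\mathrm{loc}}(\mathbb{R}^n)$, then $u \in H^{k+2}_{\mathrm{loc}}(\mathbb{R}^n)$; letting $k \to \infty$ and using the Sobolev embedding recovers the $C^{\infty}$ statement above.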

The takeaway from the above is the following: we would like to find $u$ satisfying $\Delta u = f$ in some function space, so we enlarge that space to the space of distributions, show that a solution can be found there, and then show that this solution was in fact a member of the original function space all along (of course, this last step will not always work out, but this is the philosophy and the hope!).

Solution 2:

The most basic application is the use of the fundamental solution (also known as the Green's function) to solve inhomogeneous linear problems. When $*$ is convolution and $\delta$ is the Dirac delta centered at zero, $\delta * f=f$ for a wide class of $f$. On the other hand, if $L$ is a linear differential operator with constant coefficients, then $(Lu) * f=L(u*f)$, since differentiation commutes with convolution. (This certainly holds when $u,f$ are smooth and decay fast enough for the convolutions to make sense.) This means that if you could find a solution to $Lu=\delta$, then you could convolve both sides with $f$ to get $L(u*f)=f$. So $u*f$ is a solution to $Lv=f$ if $u$ is a solution to $Lw=\delta$. This $u$ is called the fundamental solution or Green's function for the operator $L$.
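As a quick numerical sanity check of this recipe (a sketch, assuming NumPy; the operator is $L = d^2/dx^2$ on the line, whose fundamental solution is $\tfrac{1}{2}|x|$, and the choice $f(x)=e^{-x^2}$ is arbitrary), one can convolve $f$ with the fundamental solution on a grid and verify that applying $L$ to the result recovers $f$:

```python
import numpy as np

# Grid and a rapidly decaying right-hand side f (effectively compactly supported).
L_box, n = 10.0, 2001
x = np.linspace(-L_box, L_box, n)
h = x[1] - x[0]
f = np.exp(-x**2)

# Fundamental solution of L = d^2/dx^2 on the line: G(x) = |x|/2, since (|x|/2)'' = delta.
# Approximate u = G * f by a Riemann sum of the convolution integral.
G = 0.5 * np.abs(x[:, None] - x[None, :])
u = (G @ f) * h

# Check L(u) = u'' ~ f in the interior with a central second difference.
u_xx = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / h**2
print(np.max(np.abs(u_xx - f[1:-1])))  # close to zero
```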

Duhamel's principle lets us extend this to time-dependent problems, provided the spatial differential operator is linear and does not depend on time. Cf. http://en.wikipedia.org/wiki/Duhamel%27s_principle#General_considerations
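Schematically (stated here only as a sketch, for an abstract evolution problem): if $L$ is such an operator and $u_t = Lu + f(t)$ with $u(0) = u_0$, then $$ u(t) = e^{tL} u_0 + \int_0^t e^{(t-s)L} f(s)\, ds, $$ where $e^{tL} u_0$ denotes the solution at time $t$ of the homogeneous problem $u_t = Lu$ with initial data $u_0$; the forcing is treated as a superposition of homogeneous initial-value problems started at each time $s$.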

Solution 3:

Below is the minimal mathematics - or rather the minimal Linear Systems Theory - needed to understand the role of distributions and Green's functions in connection with differential equations. It is extracted from old notes on "Signals and Systems".
Signals are represented by $\,x(t)\,$ and $\,y(t)\,$, where $\,x\,$ is an input signal / excitation, $\,y\,$ is an output signal / response and $\,t\,$ is time. A linear system $S$ is represented by $y(t) = S\, x(t)$; it produces an output when given an input, and linearity means that: $$ S \left[ \lambda\, a(t) + \mu\, b(t) \right] = \lambda \, S \, a(t) \, + \, \mu \, S \, b(t) $$ So $S$ is a linear one-dimensional operator. More about linear operators, especially in connection with (ordinary) differential equations, can be found in:

  • Operator Calculus

A system $S$ is homogeneous in time - also called time invariant or shift invariant - iff for every input signal $x(t)$ with response $y(t) = S\, x(t)$ and every time-shift $\tau$ : $$ S \, x(t-\tau) = y(t-\tau) $$ Properties of linear, time-invariant systems are for example: $$ S x'(t) = y'(t) \quad \Longleftrightarrow \quad S \frac{d}{dt} x(t) = \frac{d}{dt} S x(t) \quad \Longleftrightarrow \quad \left[ S \, , \, i\, \hbar \frac{d}{dt} \right] = 0 $$ The response to the derivative of the input is the derivative of the output; time differentiation commutes with the operator of the system; conservation of energy is guaranteed (QM).
Consider the following summation, for a broad class of functions $f(t)$ : $$ S \left[ \sum_i f(\tau_i)\, x(t-\tau_i)\, \Delta\tau_i \right] = \sum_i f(\tau_i) \, S x(t-\tau_i) \, \Delta\tau_i = \sum_i f(\tau_i)\, y(t-\tau_i)\, \Delta\tau_i $$ Taking the limit of this Riemann sum for $\Delta\tau_i \to 0$ yields: $$ S \left[ \int_{-\infty}^{+\infty} f(\tau)\, x(t-\tau)\, d\tau \right] = \int_{-\infty}^{+\infty} f(\tau)\, y(t-\tau)\, d\tau $$ Here the convolution integrals $f * x$ and $f * y$ are recognized.
Suppose that a linear and homogeneous system is excited with a Dirac delta as its input. Then the corresponding response is called the delta response, written as $\,h(t)\,$ by definition. So we have: $$ S\, x(t) = y(t) \qquad ; \qquad S\, \delta(t) = h(t) $$ This one-dimensional function is the analogue of the Green's function when generalized to more dimensions, e.g. space-time. The fundamental property of the Dirac delta says that $x(t) = x(t) * \delta(t)$ ; in English this is usually called the sifting property (the Dutch notes call it the "diafragma" property). Hence: $$ y(t) = S\, x(t) = S \left\{ \, x(t) * \delta(t) \, \right\} = x(t) * h(t) $$ Thus the superposition integral of $S$ has been found: $$ y(t) = h(t) * x(t) = \int_{-\infty}^{+\infty} h(\tau)\, x(t-\tau)\, d\tau $$ Consequently: if we know the delta response, then we know the response of the system to any input.
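A small numerical illustration of this superposition principle (a sketch, assuming NumPy; the particular system, the constant $a$, and the input pulse are arbitrary choices made only for this example): simulate a simple linear, time-invariant system, measure its delta response $h$, and check that convolving $h$ with an arbitrary input reproduces the directly simulated output.

```python
import numpy as np

dt, N, a = 0.01, 1000, 2.0
t = np.arange(N) * dt

def S(x):
    """A simple linear, time-invariant system: y' = -a*y + x, y(0) = 0,
    integrated with the forward Euler scheme."""
    y = np.zeros(N)
    for n in range(N - 1):
        y[n + 1] = (1.0 - a * dt) * y[n] + dt * x[n]
    return y

# Delta response h: excite the system with a discrete approximation of delta(t).
delta = np.zeros(N)
delta[0] = 1.0 / dt
h = S(delta)

# An arbitrary input signal, e.g. a Gaussian pulse centered at t = 3.
x = np.exp(-((t - 3.0) ** 2))

# Superposition integral y = h * x, discretized as dt times a discrete convolution.
y_conv = dt * np.convolve(h, x)[:N]
y_direct = S(x)

print(np.max(np.abs(y_conv - y_direct)))  # agrees up to rounding error
```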
The above explains some essentials in a nutshell, using one-dimensional linear, time-invariant systems as a vehicle. I hope it nevertheless serves a purpose and that the reader is able to see how to generalize this material to more than one dimension.