Understanding the Laplace operator conceptually
The Laplace operator: those of you who now understand it, how would you explain what it "does" conceptually? How do you wish you had been taught it?
Any good essays (combining both history and conceptual understanding) on the Laplace operator and its generalizations (e.g. the Laplace-Beltrami operator) that you would highly recommend?
The Laplacian $\Delta f (p)$ is the lowest-order measurement of how $f$ deviates from $f(p)$ "on average" - you can interpret this either probabilistically (expected change in $f$ as you take a random walk) or geometrically (the change in the average of $f$ over balls centred at $p$). To make this second interpretation precise, write the Taylor series
$$ f(p+x) = f(p) + f_i(p) x^i + \frac12 f_{ij}(p) x^i x^j + \cdots$$
and integrate:
$$ \int_{B_r(p)} f = f(p) V(B_r) + f_i(p)\int_{B_r(0)}x^idx+\frac12 f_{ij}(p)\int_{B_r(0)}x^ix^jdx + \cdots.$$
The integrals $\int x^i dx$ vanish because $x^i$ is an odd function under reflection in the $x^i$ direction, and similarly the integrals $\int x^i x^j dx$ vanish whenever $i\ne j$; so this simplifies to
$$ \frac{1}{V(B_r)}\int_{B_r(p)} f = f(p) + C \Delta f(p) r^2 + \cdots $$
where $C$ is a constant depending only on the dimension.
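If you want to see this expansion in action, here is a quick numerical sanity check in $\mathbb{R}^2$ (a sketch only: the helper `ball_average` and the test function are my own choices; in dimension $n$ the constant works out to $C = \frac{1}{2(n+2)}$):

```python
import numpy as np

rng = np.random.default_rng(0)

def ball_average(f, p, r, n_samples=2_000_000):
    """Monte Carlo average of f over the disk B_r(p) in R^2."""
    pts = rng.uniform(-r, r, size=(n_samples, 2))   # sample the bounding square
    pts = pts[np.sum(pts**2, axis=1) <= r**2] + p   # keep the disk, centre at p
    return np.mean(f(pts))

# Test function with known Laplacian: f = x^2 + 3y^2, so Laplacian = 2 + 6 = 8.
f = lambda q: q[:, 0]**2 + 3 * q[:, 1]**2
p, r = np.array([0.5, -0.2]), 0.1
fp = p[0]**2 + 3 * p[1]**2

C = 1 / (2 * (2 + 2))                  # C = 1/(2(n+2)) with n = 2
print(ball_average(f, p, r) - fp)      # measured deviation of the ball average
print(C * 8.0 * r**2)                  # predicted C * Laplacian * r^2 = 0.01
```

Since this $f$ is exactly quadratic, the higher-order terms vanish and the two numbers agree up to Monte Carlo noise.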
The Laplace-Beltrami operator is essentially the same thing in the more general Riemannian setting - all the nasty curvy terms will be higher order, so the same formula should hold.
To gain some (very rough) intuition for the Laplacian, I think it's helpful to think of the Laplacian on $\mathbb{R}$, which is just the second derivative $\frac{d^2}{dx^2}$. (This answer may be more elementary than the OP was looking for, but I wish I had kept some of these things in mind when I first learned about the Laplacian.)
Just as Anthony's answer discusses, the second derivative at $p \in \mathbb{R}$ measures how much $f(p)$ deviates from average values of $f$ on either side of it. If the second derivative is positive, then $f(p)$ is smaller than the average of $f(p + h)$ and $f(p - h)$ for small $h$. (As I would tell my calculus students, the trapezoid rule for Riemann sums is an overestimate when the second derivative is positive.)
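A two-line check of this, with arbitrary choices of function, point, and step size:

```python
# f'' > 0 means f(p) sits below the average of its two neighbours.
f = lambda x: x**2                      # f''(p) = 2 everywhere
p, h = 1.0, 0.01
avg = (f(p - h) + f(p + h)) / 2
print(f(p), avg)                        # 1.0 < 1.0001
# The deviation recovers f''(p), the 1D analogue of the ball-average formula:
print(2 * (avg - f(p)) / h**2)          # 2.0
```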
Generally, a continuous function is harmonic if and only if it satisfies the mean value property. In $\mathbb{R}$, harmonic functions are simply the linear polynomials $f(x) = ax + b$, which of course are precisely the functions that satisfy the mean value property.
The maximum principle states, roughly, that if $\Delta u \geq 0$ on a region, then $u$ attains its maximum on the boundary; in particular, $u$ has no strict interior local maxima. This generalizes the familiar "second derivative test" from calculus, which says that if the second derivative of $u$ is positive, then local maxima of $u$ do not occur (the graph of $u$ is concave up).
Finally, let me go up one dimension and mention some of my intuition for harmonic functions $u(x,y)$ of two variables, in which case $\Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}$. If $u$ is harmonic, then $\frac{\partial^2 u}{\partial x^2} = - \frac{\partial^2 u}{\partial y^2}$. This says that the graph of $u$ must always look like a saddle: if, say, the graph is concave up in the $x$-direction ($\frac{\partial^2 u}{\partial x^2} > 0$), then it must be concave down in the $y$-direction ($\frac{\partial^2 u}{\partial y^2} < 0$). When I picture a saddle-shaped graph in my head, I think I can also see why the maximum principle has to hold for harmonic functions, since a saddle has no local extrema.
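To make the saddle picture concrete, here is a small check (the sample point and step size are arbitrary) on the classic harmonic function $u = x^2 - y^2$:

```python
import numpy as np

# u(x, y) = x^2 - y^2 is a saddle: u_xx = 2 > 0, u_yy = -2 < 0, so u is harmonic.
u = lambda x, y: x**2 - y**2
x0, y0, h = 0.7, -0.3, 0.01

# Mean value property: u at the centre equals the average of its neighbours.
nbrs = [u(x0 + h, y0), u(x0 - h, y0), u(x0, y0 + h), u(x0, y0 - h)]
print(u(x0, y0), np.mean(nbrs))               # both 0.4 (exact, since u is quadratic)
print((sum(nbrs) - 4 * u(x0, y0)) / h**2)     # five-point stencil: Laplacian = 0
```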
I think the most important property of the Laplace operator $\Delta$ is that it is invariant under rotations. In fact, if a differential operator on Euclidean space is rotation and translation invariant, then it must be a polynomial in $\Delta$. That is why it is of such prominence in physical problems.
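This invariance is easy to test numerically. A sketch (the test function, the angle, and the finite-difference helper `laplacian` are my own choices), checking that $\Delta(f\circ R) = (\Delta f)\circ R$ for a rotation $R$:

```python
import numpy as np

def laplacian(g, x, y, h=1e-3):
    """Five-point finite-difference Laplacian of g at (x, y)."""
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h)
            - 4 * g(x, y)) / h**2

f = lambda x, y: np.exp(x) * np.sin(y) + x**2 * y     # arbitrary test function

theta = 0.83                                          # arbitrary rotation angle
c, s = np.cos(theta), np.sin(theta)
fR = lambda x, y: f(c * x - s * y, s * x + c * y)     # f composed with R

x0, y0 = 0.4, 0.9
print(laplacian(fR, x0, y0))                          # Laplacian of f∘R at (x0, y0)
print(laplacian(f, c * x0 - s * y0, s * x0 + c * y0)) # Laplacian of f at R(x0, y0)
```

The two printed values agree (to finite-difference accuracy); the analogous test for, say, $\partial^2/\partial x^2$ alone would fail.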
Some good books on the subject:
Rosenberg's The Laplacian on a Riemannian Manifold.
Gurarie's Symmetries and Laplacians.
Another view along the lines of the answers above:
Suppose you have some region $\Omega$ in the plane, and you are given the value of some scalar function $f$ along the boundary $\partial \Omega$. You now want to fill in $f$ on the interior of $\Omega$ "as smoothly as possible." (A common physical interpretation is that $f$ is temperature: you fix the temperature along the boundary of $\Omega$ and want to know what the temperature on the interior will be at steady state.)
What does "as smooth as possible" mean? Well, one measure of the smoothness of $f$ is to look at its gradient $\nabla f$ and measure $$E(f) = \int_{\Omega} \|\nabla f\|^2\,dA.$$ Notice that this integral, called the Dirichlet energy of $f$, achieves its lowest possible value of $0$ when $f$ is constant. The less smooth (to first order) that $f$ is, the higher the Dirichlet energy will be. Making $f$ as smooth as possible means finding the $f$ that satisfies the boundary conditions and minimizes $E$.
How do we minimize $E$? We "take the derivative and set it to zero": $$\nabla E(f) = 0.$$ It may look a little weird to differentiate a scalar (the Dirichlet energy) with respect to a function, but the idea is the same as for the ordinary gradient. Recall that for an ordinary scalar function $g(x,y,z):\mathbb{R}^3\to\mathbb{R}$, the gradient $\nabla g$ at a point is the unique vector that, when you dot it with any direction $v$, gives you the directional derivative of $g$ in that direction: $$\nabla g(x,y,z)\cdot v = \frac{d}{dt}g[(x,y,z)+t v]\biggr\vert_{t=0}.$$

The gradient of $E$ works the same way: $\nabla E(f)$ is the unique function over $\Omega$ that, when you take its inner product with any variation $\delta f$ of $f$, gives you the directional derivative of $E$ in that "direction": $$\int_{\Omega} \nabla E(f)\, \delta f\,dA = \frac{d}{dt}E(f+t \delta f)\biggr\vert_{t=0}.$$ Carrying out the multivariable calculus, $$\frac{d}{dt}E(f+t \delta f)\biggr\vert_{t=0} = \int_{\Omega} \nabla f \cdot \nabla \delta f\,dA = -\int_{\Omega} (\Delta f)\, \delta f\,dA,$$ where the last step is integration by parts; the boundary term vanishes because admissible variations must preserve the boundary values, so $\delta f = 0$ on $\partial \Omega$. Hence $\nabla E(f) = -\Delta f.$ Several takeaways from this:
The function $f$ that interpolates the boundary conditions as smoothly as possible (in the sense of minimizing the Dirichlet energy) is the solution to the Laplace equation $-\Delta f = 0$.
Given some function $f$ that interpolates the boundary conditions but does not minimize the Dirichlet energy, the gradient $\nabla E(f) = -\Delta f$ is the "direction of steepest ascent" of $E$ -- the direction to change $f$ in if you want to increase $E$ most quickly. Its negative, $\Delta f$, is the direction that most quickly decreases $E$: if you are trying to smooth $f$, this is the direction in which you want to flow $f$. This insight leads to the heat equation $$\frac{\partial f}{\partial t} = \Delta f,$$ which, given initial temperatures on $\Omega$, flows in the direction that best decreases the Dirichlet energy until the heat has diffused as smoothly as possible over the region (a small numerical sketch follows below).
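The sketch: a 1D interval with fixed boundary values, evolved by an explicit finite-difference scheme (grid, time step, and initial data are arbitrary choices of mine):

```python
import numpy as np

n, dx = 101, 1 / 100
x = np.linspace(0, 1, n)
f = np.sin(3 * np.pi * x) + x      # initial data; f(0) = 0, f(1) = 1 stay fixed

dirichlet_energy = lambda f: 0.5 * np.sum(np.diff(f)**2) / dx

dt = 0.4 * dx**2                   # small enough for the explicit scheme to be stable
for step in range(2000):
    lap = (f[2:] - 2 * f[1:-1] + f[:-2]) / dx**2   # discrete Laplacian, interior only
    f[1:-1] += dt * lap                            # flow df/dt = (Laplacian of f)
    if step % 500 == 0:
        print(dirichlet_energy(f))                 # decreases monotonically

print(np.max(np.abs(f - x)))       # f converges to the linear interpolant x
```

The printed energies shrink at every stage, and the flow settles on the harmonic (here: linear) interpolant of the boundary values, i.e. the minimizer from the first takeaway.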
Nowhere in the above discussion was it essential that $\Omega$ was a piece of the plane: as long as you can define functions on $\Omega$ and take the gradients needed for the Dirichlet energy, everything works equally well, and this is one way of motivating the Laplace-Beltrami operator on surfaces in $\mathbb{R}^3$ (and on Riemannian manifolds generally). The physical picture here is that you have some conductive plate in empty space; you heat up the boundary of the plate and watch how the heat equalizes over the plate.
Here is some intuition:
I think the most basic thing to know about the Laplacian $\Delta$ is that $-\Delta = -\text{div} \, \nabla$, and $-\text{div}$ is the adjoint of $\nabla$. Hence, $-\Delta$ has the familiar form $A^T A$ which recurs throughout linear algebra. We see that $-\Delta$ is a self-adjoint positive semidefinite operator, and so we would expect (or hope) that the familiar properties of positive semidefinite operators in linear algebra hold true for $-\Delta$. Namely, we expect that $-\Delta$ has real nonnegative eigenvalues, and that there should exist (in some sense) an orthonormal basis of eigenfunctions for $-\Delta$. This provides some intuition or motivation for the topic of "eigenfunctions of the Laplacian". (By the way, I think the Laplacian should have been defined to be $- \text{div} \, \nabla$.)
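A minimal sketch of this $A^T A$ structure, assuming the simplest possible discretization (a 1D grid with zero boundary values, forward differences playing the role of $\nabla$):

```python
import numpy as np

# D maps n interior node values to the n+1 edge differences (zero boundary),
# so D is the discrete gradient and D^T D is the discrete -Laplacian.
n, dx = 6, 1.0
D = (np.eye(n + 1, n) - np.eye(n + 1, n, k=-1)) / dx

L = D.T @ D
print(L)                          # the familiar tridiag(-1, 2, -1) matrix
print(np.allclose(L, L.T))        # True: self-adjoint by construction
print(np.linalg.eigvalsh(L))      # real, nonnegative eigenvalues, as A^T A demands
```

(Here the eigenvalues are in fact strictly positive, because the zero boundary condition rules out constant functions.)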
Notice that the integration by parts formula can be interpreted as telling us that $-\frac{d}{dx}$ is the adjoint of $\frac{d}{dx}$ (in a setting where boundary terms vanish). Fourier series can be discovered by computing the eigenfunctions of the anti-self-adjoint operator $\frac{d}{dx}$ in an appropriate setting. Moreover, a multivariable integration by parts formula can be interpreted as telling us that $-\text{div}$ is the adjoint of $\nabla$. Green's second identity can be interpreted as expressing the self-adjointness of the Laplacian.
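Continuing the sketch above: the eigenvectors of that discrete $-\Delta$ are exactly discrete sine waves, which is the Fourier-series connection in miniature (grid size and mode index are arbitrary):

```python
import numpy as np

n = 6
L = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # discrete -Laplacian, zero boundary
vals, vecs = np.linalg.eigh(L)                         # eigh, since L is symmetric

x = np.arange(1, n + 1) / (n + 1)                      # interior grid points
mode = np.sin(np.pi * x)                               # lowest sine mode (k = 1)
mode /= np.linalg.norm(mode)
print(np.allclose(np.abs(vecs[:, 0]), mode))           # True: same eigenvector up to sign
print(vals[0], 2 - 2 * np.cos(np.pi / (n + 1)))        # matching eigenvalue
```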