How to interpret the adjoint?
Let $V \neq \{\mathbf{0}\}$ be an inner product space, and let $f:V \to V$ be a linear transformation on $V$.
I understand the definition¹ of the adjoint of $f$ (denoted by $f^*$), but I can't say I really grok this other linear transformation $f^*$.
For example, it is completely unexpected to me that to say that $f^* = f^{-1}$ is equivalent to saying that $f$ preserves all distances and angles (as defined by the inner product on $V$).
It is even more surprising to me to learn that to say that $f^* = f$ is equivalent to saying that there exists an orthonormal basis for $V$ that consists entirely of eigenvectors of $f$.
Now, I can follow the proofs of these theorems perfectly well, but the exercise gives me no insight into the nature of the adjoint.
For example, I can visualize a linear transformation $f:V\to V$ whose eigenvectors are orthogonal and span the space, but this visualization tells me nothing about what $f^*$ should be like when this is the case, largely because I'm completely in the dark about the adjoint in general.
Similarly, I can visualize a linear transformation $f:V\to V$ that preserves lengths and angles, but, again, and for the same reason, this visualization tells me nothing about what this implies for $f^*$.
Is there a (coordinate-free, representation-agnostic) way to interpret the adjoint that will make theorems like the ones mentioned above less surprising?
¹ The adjoint of $f:V\to V$ is the unique linear transformation $f^*:V\to V$ (guaranteed to exist for every such linear transformation $f$) such that, for all $u, v \in V$,
$$ \langle f(u), v\rangle = \langle u, f^*(v)\rangle \,.$$
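As a numerical sketch of my own (not part of the question): with respect to an orthonormal basis, the matrix of $f^*$ is the conjugate transpose of the matrix of $f$, and the defining identity can be checked directly. I use the convention (also adopted in Solution 1 below) that the inner product is conjugate-linear in the first argument, which matches NumPy's `vdot`.

```python
import numpy as np

# With respect to an orthonormal basis, the adjoint of f is represented by
# the conjugate transpose of f's matrix.  We verify <f(u), v> = <u, f*(v)>
# numerically, with <.,.> conjugate-linear in the first argument.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))  # matrix of f
A_star = A.conj().T                                          # matrix of f*

def inner(u, v):
    return np.vdot(u, v)  # sum(conj(u) * v): conjugate-linear in u

u = rng.normal(size=3) + 1j * rng.normal(size=3)
v = rng.normal(size=3) + 1j * rng.normal(size=3)

assert np.isclose(inner(A @ u, v), inner(u, A_star @ v))
```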
Solution 1:
For simplicity, let me consider only the finite-dimensional picture. In the infinite-dimensional world, you should consider bounded maps between Hilbert spaces, and the continuous duals of Hilbert spaces.
Recall that an inner product on a real [complex] vector space $V$ defines a canonical [conjugate-linear] isomorphism from $V$ to its dual space $V^\ast$ by $v \mapsto (w \mapsto \langle v,w\rangle)$, where I shamelessly use the mathematical physicist's convention that an inner product is linear in the second argument and conjugate-linear in the first; let us denote this isomorphism $V \cong V^\ast$ by $R$, so that $R(v)(w) := \langle v,w\rangle$.
Now, recall that a linear transformation $f : V \to W$ automatically induces a linear transformation $f^T : W^\ast \to V^\ast$, the transpose of $f$, by $\phi \mapsto \phi \circ f$; all $f^T$ does is use $f$ in the obvious way to turn functionals over $W$ into functionals over $V$, and it really represents the image of $f$ through the looking glass, as it were, of taking dual spaces.

However, if you have inner products on $V$ and $W$, then you have corresponding [conjugate-linear] isomorphisms $R_V : V \cong V^\ast$ and $R_W : W \cong W^\ast$, so you can use $R_V$ and $R_W$ to reinterpret $f^T : W^\ast \to V^\ast$ as a map $W \to V$, i.e., you can form $R_V^{-1} \circ f^T \circ R_W : W \to V$. If you unpack the definitions, you'll find that $R_V^{-1} \circ f^T \circ R_W$ is none other than your adjoint $f^\ast$.

So, given fixed inner products on $V$ and $W$, $f^\ast$ is simply $f^T$, arguably the more fundamental object, reinterpreted as a map between your original vector spaces rather than their duals. If you like commutative diagrams, then $f^T$, $f^\ast$, $R_V$ and $R_W$ all fit into a very nice commutative diagram.
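The chain $R_V^{-1} \circ f^T \circ R_W$ can be made concrete in coordinates. This is my own sketch, assuming the inner products on $V$ and $W$ are given by Gram matrices $G_V$, $G_W$ via $\langle u, v\rangle = u^H G v$ (conjugate-linear first slot); unpacking then gives the matrix $G_V^{-1} A^H G_W$ for $f^\ast$.

```python
import numpy as np

# If <u, v>_V = u^H G_V v and <w, z>_W = w^H G_W z, then the map
# R_V^{-1} o f^T o R_W has matrix G_V^{-1} A^H G_W, and it satisfies
# the defining identity of the adjoint with respect to these inner products.
rng = np.random.default_rng(1)

def random_gram(n):
    B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return B.conj().T @ B + n * np.eye(n)   # Hermitian positive definite

G_V, G_W = random_gram(3), random_gram(2)
A = rng.normal(size=(2, 3)) + 1j * rng.normal(size=(2, 3))   # f : V -> W
A_star = np.linalg.inv(G_V) @ A.conj().T @ G_W               # f* : W -> V

u = rng.normal(size=3)
w = rng.normal(size=2)
lhs = (A @ u).conj() @ G_W @ w        # <f(u), w>_W
rhs = u.conj() @ G_V @ (A_star @ w)   # <u, f*(w)>_V
assert np.isclose(lhs, rhs)
```

Note that with the standard inner product ($G_V = G_W = I$) this collapses to the familiar conjugate transpose.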
As for the specific cases of unitary and self-adjoint operators:
If you want to be resolutely geometrical about everything, the fundamental notion is not that of a unitary, but rather that of an isometry, i.e., a linear transformation $f : V \to V$ such that $\langle f(u),f(v)\rangle = \langle u,v\rangle$ for all $u, v \in V$. You can then define a unitary as an invertible isometry, which is equivalent to the definition in terms of the adjoint. In fact, if $V$ is finite-dimensional, you can check that $f$ is unitary if and only if it is isometric.
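A quick numerical illustration of my own: a plane rotation is an isometry of the standard inner product, and its adjoint (here just its transpose, since the entries are real) is its inverse.

```python
import numpy as np

# A rotation preserves the standard inner product, and its adjoint
# (its transpose) is its inverse -- i.e., it is unitary.
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

u = np.array([1.0, 2.0])
v = np.array([-0.5, 3.0])
assert np.isclose((Q @ u) @ (Q @ v), u @ v)   # <Qu, Qv> = <u, v>: isometry
assert np.allclose(Q.T @ Q, np.eye(2))        # Q* Q = I, so Q* = Q^{-1}
```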
In light of the longish discussion above, an operator $f : V \to V$ is self-adjoint if and only if $f^T : V^\ast \to V^\ast$ is exactly the same as $f$ after applying the [conjugate-linear] isomorphism $R_V : V \to V^\ast$, i.e., $f = R_V^{-1} \circ f^T \circ R_V$, or equivalently, $R_V \circ f = f^T \circ R_V$, which you can interpret as commutativity of a certain diagram. That self-adjointness implies all these nice spectral properties arguably shouldn't be considered obvious---at the end of the day, the spectral theorem, even in the finite-dimensional case, is a highly non-trivial theorem in every respect, especially conceptually!
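Non-trivial as the spectral theorem is, its finite-dimensional content is easy to witness numerically. A sketch of my own: for a Hermitian (self-adjoint) matrix, `np.linalg.eigh` returns real eigenvalues and an orthonormal basis of eigenvectors.

```python
import numpy as np

# Spectral theorem, finite-dimensional case: a self-adjoint matrix H
# (H = H^H) has real eigenvalues and an orthonormal eigenbasis.
rng = np.random.default_rng(2)
B = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = B + B.conj().T                    # H is Hermitian by construction

w, U = np.linalg.eigh(H)              # H = U diag(w) U^H
assert np.allclose(U.conj().T @ U, np.eye(4))   # columns are orthonormal
assert np.allclose(H @ U, U * w)                # columns are eigenvectors
assert np.allclose(w.imag, 0)                   # eigenvalues are real
```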
I'm not sure that my overall spiel about adjoints and transposes is all that convincing, but I stand by my statement that the notion of isometry is the more fundamental one geometrically, one that happens to yield the notion of unitary simply out of the very definition of an adjoint, and that the spectral properties of a self-adjoint operator really are a highly non-trivial fact that shouldn't be taken for granted.
Solution 2:
It's difficult to give an intuitive description of the adjoint. Note that the adjoint is there even before we have scalar products; it's just that a scalar product allows us to interpret the adjoint of a map $A:\ V\to V$ as a map on one and the same space $V$.
A linear map $A:\ V\to W$ from one vector space $V$ to some other vector space $W$ (of any dimensions) produces for each vector $x \in V$ a vector $y:=Ax\in W$.
Assume now that on $W$ a linear function $\phi:\ W\to{\mathbb R}$, i.e., an element of $W^*$, is given, which assigns, e.g., to each point $y\in W$ a temperature value $\phi(y)$, or computes for each $y\in W$ the first coordinate with respect to some basis of $W$. Then the function $$\psi:\quad V\to{\mathbb R},\qquad x\mapsto\phi\bigl(Ax\bigr)$$ computes for each point $x\in V$ the temperature felt at $Ax\in W$, "even before $x$ is actually mapped to $W$". In this way we can regard $\psi$ as a "virtual" temperature distribution on $V$. It is obvious that $\psi$ is a linear function from $V$ to ${\mathbb R}$, i.e., an element of $V^*$.
What we have described here for one $\phi\in W^*$ can of course be done with every $\phi\in W^*$: for each such $\phi$ we get a corresponding $\psi\in V^*$. All in all, the map $A:\ V\to W$ given at the beginning induces a certain map $$W^*\to V^*, \quad \phi\mapsto\psi\ .$$ This map is called the transpose of $A$ and is denoted by $A^*$. By definition we have the identity $$A^*\phi.x\ =\ \phi. Ax\qquad\forall x\in V,\ \forall\phi\in W^*\ .$$ Here the . means that the linear functional on the left of the dot is applied to the vector on the right of the dot.
An example: Assume that $(e_k)_{1\leq k\leq n}$ is a basis in $V$ and $(f_i)_{1\leq i\leq m}$ is a basis in $W$. Then $A$ has a certain matrix $[a_{ik}]$ with respect to these bases. Now let $$\phi:=f_1^*:\quad y\mapsto y_1\tag{1}$$ be the functional that computes the first coordinate of any given vector $y\in W$. Then $\phi.Ax$ is the first coordinate $y_1$ of the vector $y:=Ax$. We all know that $$y_1=\sum_{k=1}^n a_{1k} x_k\ .\tag{2}$$ Since we can write $x_k$ as $x_k=e_k^*.x$ we can interpret $(2)$ as $$A^*\phi.x=\phi.Ax=\sum_{k=1}^n a_{1k} e_k^*.x\qquad(x\in V)\ ,$$ or, given the definition $(1)$ of $\phi$: $$A^* f_1^*=\sum_{k=1}^n a_{1k} e_k^*\ .$$
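The computation above can be run in coordinates. A sketch of my own, with the usual identification of covectors with row vectors: applying the transpose to the coordinate functional $f_1^*$ produces the row $(a_{11},\ldots,a_{1n})$, i.e., the coefficients in $(2)$.

```python
import numpy as np

# Covectors as row vectors: (A* phi)(x) = phi(A x) becomes (phi @ A) @ x,
# so A* phi is the row vector phi @ A.  Applying A* to the coordinate
# functional f_1* picks out the first row of A.
rng = np.random.default_rng(3)
A = rng.normal(size=(2, 4))        # matrix of A : V -> W, with m=2, n=4

f1_star = np.array([1.0, 0.0])     # f_1* : y |-> y_1
psi = f1_star @ A                  # A* f_1*, as a row vector

x = rng.normal(size=4)
assert np.isclose(psi @ x, (A @ x)[0])   # (A* phi).x == phi.(A x)
assert np.allclose(psi, A[0])            # A* f_1* = sum_k a_{1k} e_k*
```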
Solution 3:
The adjoint allows us to go from an active transformation view of a linear map to a passive view and vice versa. Consider a map $T$, a vector $u$, and a set of basis covectors $e^i$, $i = 1, 2, \ldots$. Given the definition of the adjoint, we have
$$\left[\underline T(u), e^i \right] = \left[u, \overline T(e^i) \right]$$
where the commonly used bracket notation $[x, f]$ means $f(x)$.
On the left, we're actively transforming $u$ to a new vector and evaluating components in some pre-established basis. On the right, we're passively transforming our space to use a new basis and evaluating $u$ in terms of that basis. So for each active transformation, there is a corresponding (and equivalent) passive one.
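In coordinates this identity is just associativity of matrix multiplication, which is easy to check; a small sketch of my own, with covectors as row vectors:

```python
import numpy as np

# [T(u), e^i] = e^i @ (T @ u)  (transform the vector, then read off
#                               component i in the old basis);
# [u, T^T(e^i)] = (e^i @ T) @ u (transform the covector, then apply it).
# The two agree for every i -- associativity of matrix multiplication.
rng = np.random.default_rng(4)
T = rng.normal(size=(3, 3))
u = rng.normal(size=3)
E = np.eye(3)                      # rows are the basis covectors e^i

for i in range(3):
    assert np.isclose(E[i] @ (T @ u), (E[i] @ T) @ u)
```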
That said, while I think this can help identify the meaning of the adjoint, I don't see how this helps make intuitive the theorems you described.