Why do natural transformations express the fact that a vector space is canonically embedded in its double-dual but not in its dual?

I've been struggling for quite a while to understand why a vector space is considered to be "canonically embedded" into its double dual, but not its dual. As has been remarked in many other places, the question of whether an (iso-)morphism is "natural" can often seem vague and unintuitive. For me particularly, I think that part of the problem is that this sort of statement seems to run entirely counter to something I was taught early in Abstract Algebra as a Profound and Fundamental Lesson: "Isomorphic structures are exactly the same in all respects. When two things are isomorphic, all the things that can be said about one carry over verbatim to the other. There is no distinction between them." However, moving into more abstract linear algebra, a sort of about-face is made, and now we are effectively saying, "My isomorphism is better than yours." To justify this apparent contradiction, the argument is typically made that the (iso-)morphism into the double dual does not require any "choices", while any embedding into the dual will require some "choice" to be made. However, this seems... unconvincing. So what if you can jury-rig a bilinear form out of whatever embedding/isomorphism I pick? Do we really have to pay attention to that? Again, this seems rather vague and unintuitive.

To make the argument more precise then, it is claimed that the ultimate answer lies in that Fountain of Eternal Truth - Category Theory. More specifically, it is claimed that the fact that there is a natural transformation from the identity functor on vector spaces to the double-dual functor justifies the claim that the embedding into the double-dual is "natural", while the fact that there is no such transformation between the identity and the dualizing functor shows that any such embedding into the dual is "not natural". This is elucidated beautifully in this thread. However, I claim that this is still not the final nail in the coffin of doubt. More specifically, I do not understand how natural transformations actually express the idea that a construction is (quotation marks) "natural".

How does this business with commuting diagrams make precise the idea that the embedding of a vector space into its double dual is "natural"? How does the implication that any association with the dual is "not natural" stem from a theorem saying that a certain collection of diagrams will never commute?

Another thing to note, that took me a little by surprise, is that the content of these arguments depends not only on the construction of the dual and double dual spaces, but also on this other construction called the transpose, which associates a linear map $f^*: F^* \to E^*$ to every linear map $f: E \to F$. So the fact that a map between a space and its dual is not "natural" also depends on the fact that we define an association between linear maps, that we package together with the dual operation to form the dualizing functor; however, this association between linear maps seems rather external to the association between a vector space and its dual. This is also bothering me.

I will not deny that the transpose operation does seem like a very natural thing to pair along with the dual operation, but what does seem odd is that the intrusiveness of this transpose operation should make or break the "naturalness" of something strictly between a vector space and its dual - honestly, who ordered that? Why can't I concoct some other association between linear maps - one that's covariant, at that, and package that together with the dual space to make something that admits a natural transformation from the identity? Note however that this would also break the "theorem" that a vector space is canonically embedded in its double dual, so this sort of train of thought is a double-edged sword.

Ultimately, I feel that I don't understand natural transformations in general very well; this example is really just the biggest one that sticks out to me and the one that I care about the most. I may post another question about the general case of understanding natural transformations, depending on how well this one goes and also whether I can manage to formulate it in a manner that seems intriguing and not simply lost and confused. At any rate, I look forward to any potential answers and would greatly appreciate whatever illumination you may be able to provide.


Solution 1:

I am answering as somebody who has struggled through a related matter, as you noted in the OP. I do not think I will be able to satisfy every one of your related threads of dissatisfaction and I am not sure I will be able to satisfy any at all. On the flip side, as the question is a year old, you may have resolved it for yourself long ago.

But let's give it a whirl. I love the question.

First of all, separate from the question about how the category language speaks to (or doesn't speak to) matters, it seems to me you are not convinced that there even is a substantive difference between the isomorphism of a finite dimensional vector space to its dual and the isomorphism to its double dual, a propos of your Profound and Fundamental Lesson of Abstract Algebra -- aren't they both isomorphisms? So, before even engaging the category theory, let me speak to this:

(1) I think you will gain useful insight about the situation from studying cases where the substance of the difference between a space and its dual is felt. user254665 mentioned one such instance in her/his answer. In general, the infinite-dimensional topological vector spaces of functional analysis provide an abundant source of examples. While the dual of a finite dimensional vector space is finite-dimensional of the same dimension, and therefore isomorphic, the dual of a Banach space is typically a different Banach space. For example, for $1<p<\infty$ the dual of $L^p$ is $L^q$ with $p^{-1}+q^{-1} = 1$, and these are two different Banach spaces unless $p=2$. The dual of the space of continuous, compactly supported functions on a locally compact Hausdorff space is a space of measures, i.e. it is not even a space of functions!

Even in these situations where the dual is really a different animal, the original space does embed in its double dual, as usual by mapping a vector to the functional on functionals obtained by evaluation at that vector. (I will avoid controversy by not lionizing this embedding as "natural".) In many cases, the embedding is proper, i.e. the double dual is bigger than the original space. Nonetheless, there's often no obvious embedding of the original space in the (single) dual at all.

I am not a functional analyst, but a place I've encountered this substance in my own life is in the difference between a locally compact abelian group and its character group, i.e. its Pontryagin dual. Like vector spaces, this is a situation where finiteness gives a non-canonical isomorphism to the dual, and there is a canonical isomorphism to the double dual. A finite abelian group $A$ is isomorphic to its dual $\hat A$, but an infinite group need not be. For example, the additive group $\mathbb{Z}$ of integers and the circle group $S^1 = \{z\in\mathbb{C}^\times \mid |z| = 1\}$ are Pontryagin duals of each other, and they don't even have the same cardinality. In the finite case, where they are isomorphic, I've still "bumped into" the difference between $A$ and $\hat A$, for example in trying to understand the relationship between an action of a group $G$ of automorphisms on $A$ and the induced action of $G$ on $\hat A$, e.g. see this question.
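To make the $\mathbb{Z}$ versus $S^1$ example concrete, here is a short sketch of the standard identification (the notation $\chi_z$ is mine, introduced only for illustration). A character of $\mathbb{Z}$ is a homomorphism $\chi:\mathbb{Z}\to S^1$, and it is determined by the single value $z=\chi(1)$:

$$
\begin{aligned}
\chi_z(n) &= z^n \qquad (z\in S^1,\ n\in\mathbb{Z}),\\
\chi_z\,\chi_w &= \chi_{zw},\\
\hat{\mathbb{Z}} &\cong S^1 \quad\text{via}\quad \chi\mapsto\chi(1).
\end{aligned}
$$

In the other direction, the continuous characters of $S^1$ are the maps $z\mapsto z^n$ for $n\in\mathbb{Z}$, which is how $\mathbb{Z}$ reappears as the dual of $S^1$.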

All of this is to say that study of such examples can help convince one that the dual is really not the same as the original object, so that even when they're isomorphic it's worth keeping track of which is which. (More so than it is worth distinguishing the object from its double-dual when they are isomorphic.)

(2) How can one make sense of this difference in light of your Profound and Fundamental Lesson (PaFL), that isomorphic objects are to all intents and purposes the same?

This is a question about the scope of the PaFL.

The PaFL is the right way to see things when you view the objects in isolation from their surroundings and each other. Let $A$ and $B$ be isomorphic objects (e.g. vector spaces or groups). Any specific isomorphism $\phi:A\rightarrow B$ gives you a dictionary to translate statements about the isolated object $A$ to statements about the isolated object $B$ and vice versa. For example: if $A,B$ are vector spaces, then $\phi$ carries bases to bases, so there is a perfect bijective correspondence between bases of $A$ and bases of $B$. It carries linear transformations of $A$ to linear transformations of $B$ (via $T\mapsto \phi T\phi^{-1}$) so there is a bijection between such transformations. If we think of $\phi$ as a "renaming", then we can think of $B$ as just $A$ with different names.

From this point of view, $A$ and $B$ are "the same", and any "renaming" $\phi$ works as well as any other to show this. This is the PaFL.

But. If we allow $A$ and $B$ to interact with other objects (even each other!), then distinct isomorphisms start to feel very different! For example:

Let $A = \mathbb{R}^2$, seen as a real vector space. Let $B$ be $A$'s vector space dual, i.e. the space of linear functionals $A\rightarrow \mathbb{R}$, with pointwise addition and scalar multiplication. $B$ is isomorphic to $A$ since it is also a 2-dimensional real vector space. One has a wide choice of isomorphisms: fixing a basis of $A$, one can send it to any basis of $B$. There is a 4-dimensional manifold's worth of choice.

Now along comes a linear transformation $T$ acting on $A$, say by scaling the $x$-axis by a factor of $2$. One can pick some isomorphism $\phi:A\rightarrow B$ and translate $T$ into a transformation of $B$ as above (i.e. $\phi T \phi^{-1}$). But there is another (natural??) way that $T$ acts on $B$, irrespective of any choice of $\phi$, which is to send a functional $f:A\rightarrow\mathbb{R}$ to the functional $f\circ T$. Now one can ask about any given $\phi$: does the transformation of $B$ into which it translates $T$ equal this (natural??) action of $T$ on $B$? I.e. does $\phi T \phi^{-1} (f) = f\circ T$ for all $f\in B$? A priori, some $\phi$'s may be compatible with the action of $T$ on $B$ in this respect, and some may not.

One could go further. I chose a specific $T$ at the front end of this. But one could ask if there is a $\phi$ such that $\phi T\phi^{-1}(f)$ will equal $f\circ T$ regardless of the choice of $T$. This $\phi$, if it existed, would clearly (?) be "awesome" in some way that other isomorphisms aren't.
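For what it's worth, one can check in coordinates how restrictive this demand is. This is only a sketch under my own conventions: write $P$ for the matrix of $\phi$ with respect to the standard basis of $A=\mathbb{R}^2$ and the dual basis of $B$, and identify $T$ with its matrix. The action $f\mapsto f\circ T$ then has matrix $T^{\mathsf T}$ in the dual basis, so the condition reads $PTP^{-1}=T^{\mathsf T}$, i.e. $PT=T^{\mathsf T}P$. Writing $P=\begin{pmatrix}a&b\\c&d\end{pmatrix}$:

$$
\begin{aligned}
&T=\begin{pmatrix}2&0\\0&1\end{pmatrix}: && PT=\begin{pmatrix}2a&b\\2c&d\end{pmatrix},\quad T^{\mathsf T}P=\begin{pmatrix}2a&2b\\c&d\end{pmatrix}\ \Longrightarrow\ b=c=0,\\
&T=\begin{pmatrix}0&1\\0&0\end{pmatrix}: && PT=\begin{pmatrix}0&a\\0&0\end{pmatrix},\quad T^{\mathsf T}P=\begin{pmatrix}0&0\\a&0\end{pmatrix}\ \Longrightarrow\ a=0 \quad(\text{using } b=c=0).
\end{aligned}
$$

So the $\phi$'s compatible with the axis-scaling $T$ are exactly the diagonal ones, and a diagonal $P$ compatible with the second $T$ would have $a=0$ and fail to be invertible. On $\mathbb{R}^2$, then, no single $\phi$ works for every $T$. The underlying reason is that $T\mapsto PTP^{-1}$ preserves the order of products while $T\mapsto T^{\mathsf T}$ reverses it.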

Perhaps you respond by saying, well, why did you bring $T$, and especially its action on $B$ by $f\mapsto f\circ T$, into it? This is a perfectly legitimate question. From the point of view where you only look at $A$ and $B$ as self-contained systems, there's no reason to. But my point is that mathematical objects are often embedded in a network of other mathematical objects (such as $T$, or a wide variety of choices of $T$, and their related actions on $A$ and $B$), and when we bring these other objects and the interactions between them into it, it complicates the (overly?) simplistic picture drawn by the PaFL. Maybe some isomorphisms play better than others with the network of relationships in which $A$ and $B$ are embedded.

(3) This is a segue into the matter of categories. A natural isomorphism between two functors is not an isomorphism between two isolated objects. It is some kind of construction that works simultaneously across an entire category, in such a way that the isomorphisms all interact well with a bunch of other maps.

Thus, the way in which the categorical language translates the word "natural" is, loosely, "working simultaneously across all the objects of a whole category, in such a way that it cooperates with the other relevant maps in the category." The naturality lies in the everywhere-at-once-ness and in the fits-in-with-what-was-already-going-on-ness.

To get specific to the case. Let $\mathscr{V}$ be the category of finite dimensional $\mathbb{R}$-vector spaces.

Let's try to carry out what you proposed in the penultimate paragraph of the OP, i.e. try to reconstruct the dualizing functor as a covariant functor; call it $D$. We are already given the map on objects: it sends $V\in\operatorname{Obj}\mathscr{V}$ to its dual $V^*$. We need to design, for every $T\in \operatorname{Hom}(V,W)$, a map $D(T):V^* \rightarrow W^*$, in such a way that the identity map always gets sent to the identity map, and for any $U\xrightarrow{S} V\xrightarrow{T}W$ occurring in $\mathscr{V}$, we have $D(TS) = D(T)D(S)$.

It seems to me that this is actually possible, modulo some axiom-of-choice-type issues. If we separately chose an isomorphism $\phi_V:V\rightarrow V^*$ for each $V\in \operatorname{Obj}\mathscr{V}$, then we could send $T:V\rightarrow W$ to $D(T) = \phi_W T\phi_V^{-1}$, which maps $V^*$ to $W^*$. Furthermore, it seems to me that the maps $\phi_V:V\rightarrow V^*$ would then constitute a natural isomorphism from the identity functor to our new "dualizing functor" $D$.
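As a sanity check on this claim, the functor axioms and the naturality square can be verified directly from the definition $D(T)=\phi_W T\phi_V^{-1}$, for $U\xrightarrow{S}V\xrightarrow{T}W$ in $\mathscr{V}$:

$$
\begin{aligned}
D(\mathrm{id}_V) &= \phi_V\,\mathrm{id}_V\,\phi_V^{-1} = \mathrm{id}_{V^*},\\
D(T)\,D(S) &= \phi_W T\phi_V^{-1}\,\phi_V S\phi_U^{-1} = \phi_W (TS)\,\phi_U^{-1} = D(TS),\\
D(T)\,\phi_V &= \phi_W T\phi_V^{-1}\,\phi_V = \phi_W\,T.
\end{aligned}
$$

The last line is exactly the commuting square for a natural transformation from the identity functor to $D$. Notice that nothing in the computation uses what the elements of $V^*$ are.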

I think some readers will be given pause by the fact that this construction needs some form of the axiom of choice to be carried out. (I'm out of my set-theoretic league on what's needed. It seems to me that the category at hand is not a small category; thus we need an even stronger axiom like global choice, right?) But you've indicated that the need to make choices doesn't strike you as a barrier to "naturalness," so I assume that this high degree of nonconstructiveness of the construction won't be a problem. However, I see another issue as well:

This construction loses any information related to the fact that $V^*$ is supposed to be the dual of $V$. It completely ignores the fact that the elements of $V^*$ are supposed to be functionals on $V$. We could replace $V^*$ with any other vector space of the same dimension and carry out the same construction. Thus it seems to me $D$ doesn't really send $V$ to its dual in any meaningful sense. Thus, while it uses a nonconstructive axiom (global choice?) to get past the category-theoretic insistence that a natural transformation happen "all at once across a whole category", it doesn't (honestly anyway, it seems to me) meet the second condition that it "cooperates with what was already going on."

This is where the transpose (also called the adjoint) comes in. You ask, "who ordered that?" I.e. isn't the adjoint map extrinsic to the relationship between $V$ and its dual? I contend it's actually essential. If $T:V\rightarrow W$ is a map between vector spaces, then the adjoint $T^*:W^*\rightarrow V^*$ between their duals is defined as $f\overset{T^*}{\mapsto} f\circ T$. This $T^*$ cooperates with what was already going on! I.e. it transforms the dual space in accordance with what the elements in the dual space are supposed to mean. Without a relationship like that between $T$ and $T^*$ that incorporates the fact that the elements of $V^*$ are supposed to be the contents of $\operatorname{Hom}(V,\mathbb{R})$, a functor sending $V$ to $V^*$ is only meaningfully sending it to some other vector space of the same dimension, not actually its dual.
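One consequence of the definition $T^*(f)=f\circ T$ is worth writing out, since it is what forces the genuine dualizing functor to be contravariant. For $U\xrightarrow{S}V\xrightarrow{T}W$ and $f\in W^*$:

$$
\begin{aligned}
(TS)^*(f) &= f\circ(TS) = (f\circ T)\circ S = S^*\bigl(T^*(f)\bigr),\\
(TS)^* &= S^*\,T^*,\qquad (\mathrm{id}_V)^* = \mathrm{id}_{V^*}.
\end{aligned}
$$

So the adjoint reverses the order of composition, which is not something one chooses; it falls out of treating elements of the dual as functionals.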

Thus a natural isomorphism to the dual really should somehow respect the adjoint, or something like it. Otherwise, what makes the dual the dual?

Obviously the question was soft and this is a soft answer. So let me know if any of this speaks to any of the issues you outlined.

Solution 2:

Consider the space $l_0$ (more commonly written $c_0$) of real sequences $(x_n)_{n\in N}$ that converge to $0,$ with $\|(x_n)_n\|=\sup_n |x_n|,$ and its dual $l_1,$ the space of absolutely summable real sequences $(y_n)_{n\in N}$, i.e. those with $\sum_{n\in N}|y_n|<\infty,$ with norm $\|(y_n)_n\|=\sum_{n\in N}|y_n|.$ The space $l_0$ contains many positive sequences that are not summable, e.g. $y_n=1/n$ for each $n.$ We should expect an embedding $E$ from $l_0$ into $l_1$ to preserve the algebraic structure and the topological structure; in other words, $E$ should be a continuous linear bijection onto its image, and $E^{-1},$ acting on the image of $E$, should also be continuous. Such an $E$ doesn't exist. As a special case of a fairly recent theorem, $l_0$ and $l_1$ are homeomorphic, but by a non-linear mapping $F$, so the algebraic structure is not preserved by $F$.

Solution 3:

It seems to me that there are two possible meanings (close to each other).

One is that such an isomorphism $**$ is defined via very simple and "expected" means. Another word commonly used for this is canonical. The definition $(v,w):=w(v)$ for $w\in V^*$ identifies $v$ with an element of $V^{**}$ and this does not depend on any additional structure on $V$, such as a metric. In this sense, it is "natural": you pair elements of $V^*$ with elements of $V$, so you can consider it the other way around as a pairing between elements of $V$ and $V^*$.

Another meaning is, as you say, the categorical one. This basically says that not only can you apply $**$ to spaces but also to linear maps, and the corresponding diagram commutes. That is, you can identify $f: X\to Y$ with $f^{**}: X^{**}\to Y^{**}$ (again, the identification goes via simple and expected means). As before, the functorial definition of $**$ does not depend on any additional structures such as metrics or scalar products.
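For concreteness, here is the diagram chase spelled out. The names $\mathrm{ev}_X$, $f^*$ and $\xi$ are my notation, assumed only for this sketch: $\mathrm{ev}_X:X\to X^{**}$ is $\mathrm{ev}_X(x)(w)=w(x)$, the transpose is $f^*(w)=w\circ f$, and $f^{**}(\xi)=\xi\circ f^*$. Then for $x\in X$ and $w\in Y^*$:

$$
\begin{aligned}
\bigl(f^{**}(\mathrm{ev}_X(x))\bigr)(w)
&= \mathrm{ev}_X(x)\bigl(f^*(w)\bigr)
= \mathrm{ev}_X(x)(w\circ f)\\
&= w\bigl(f(x)\bigr)
= \bigl(\mathrm{ev}_Y(f(x))\bigr)(w),
\end{aligned}
$$

that is, $f^{**}\circ \mathrm{ev}_X = \mathrm{ev}_Y\circ f$ for every linear $f:X\to Y$. No basis, metric or scalar product enters at any step.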

These two meanings often go hand-in-hand: if something has a simple and expected definition (or a complicated one, but satisfying simple axioms), usually it can be converted into something categorical. It seems to me that if one wants to highlight the categorical meaning, (s)he uses the word natural, if one wants to highlight the simple and expected thing, (s)he often uses the word canonical.

But I'm not sure if this answers your question because I guess that you are aware of all of this.

Extension + edit: a small attempt to give some intuition of why natural transformations are "natural". Consider a finite-dimensional vector space $V$ and two of its bases $\mathcal{B}_1$ and $\mathcal{B}_2$. Let $\mathcal{V}$ be a category with only one object $V$ and morphisms $Mor(V,V)$ being all linear transformations; let $\mathcal{R}$ be a category with one object $\Bbb R^n$ and morphisms being all linear transformations. You can define two functors $F$ and $G$ from $\mathcal{V}$ to $\mathcal{R}$ that express a linear transformation as a matrix with respect to the coordinates $\mathcal{B}_1$ resp. $\mathcal{B}_2$. A natural transformation between $F$ and $G$ assigns to the object $V$ in $Obj(\mathcal{V})$ the morphism $x\mapsto C^{-1}x$ of $\Bbb R^n$ (an element of $Mor(\Bbb R^n, \Bbb R^n)$), where $C$ is the transition matrix from basis $\mathcal{B}_1$ to $\mathcal{B}_2$. This morphism is just the coordinate transformation in $\Bbb R^n$.

The fact that it is a natural transformation just reflects that any linear map $f: V\to V$ (an element of $Mor(V,V)$) gives rise to the commutative diagram

$$\begin{array}{ccc} \Bbb R^n & \stackrel{F(f)}{\to} & \Bbb R^n \\ \downarrow_{C^{-1}} && \downarrow_{C^{-1}} \\ \Bbb R^n & \stackrel{G(f)}{\to} & \Bbb R^n \\ \end{array}$$

or equivalently,

$$\begin{array}{ccc} \Bbb R^n & \stackrel{M}{\to} & \Bbb R^n \\ \downarrow_{C^{-1}} && \downarrow_{C^{-1}} \\ \Bbb R^n & \stackrel{C^{-1}MC}{\to} & \Bbb R^n \\ \end{array}$$

where $M = F(f)$ is the matrix of $f$ with respect to $\mathcal{B}_1$. In physics, this corresponds to a change of observer: observer $\mathcal{B}_2$ will just "see" a vector $C^{-1}x$ and/or "use" the matrix $C^{-1}MC$ whenever observer $\mathcal{B}_1$ "sees" the vector $x$ and "uses" the matrix $M$. But they both see the same "real object". In this sense, the natural transformation is "natural".
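In case it helps, the commutativity of the square is a one-line check. Going right and then down, or down and then right, a coordinate vector $x\in\Bbb R^n$ ends up in the same place:

$$
\begin{aligned}
\text{right, then down:}\quad & x \mapsto Mx \mapsto C^{-1}Mx,\\
\text{down, then right:}\quad & x \mapsto C^{-1}x \mapsto (C^{-1}MC)(C^{-1}x) = C^{-1}Mx.
\end{aligned}
$$

The same $C^{-1}$ works for every $f$ at once, which is the "all maps simultaneously" flavour of naturality in this small example.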