'Free Vector Space' and 'Vector Space'

Solution 1:

Yes, the definition probably looks "like" definitions of free objects you may have encountered. The reason is that this author is essentially defining vector spaces as free objects without saying so. And the reason the author can do this is that, as it turns out, every vector space is a free object in the category of vector spaces (at least, every finite dimensional vector space is; the statement that every vector space is free, that is, that every vector space has a basis, is equivalent to the Axiom of Choice).

Let's consider a vector space you are probably very familiar with: $\mathbb{R}^2$; we usually think of this vector space as being "given" by a collection of things called vectors, which are all things of the form $(x,y)$ with $x,y\in\mathbb{R}$. Then we define a vector sum by $(x,y)+(z,w) = (x+z,y+w)$ (where the sums on the right are sums of real numbers), a scalar product given by $\lambda(x,y) = (\lambda x,\lambda y)$, and show that these satisfy the axioms, etc.

How is this way of thinking about $\mathbb{R}^2$ related to the definition given by the author? Well, a simple way to think about it is that an element of $\mathbb{R}^2$ is an ordered pair of reals. But an ordered pair is "essentially" the same thing as a function $\{1,2\}\to\mathbb{R}$. How so? Well, the pair $(x,y)$ corresponds to the function $1\mapsto x,\ 2\mapsto y$, because the first entry is $x$ and the second entry is $y$. Conversely, given a function $f\colon\{1,2\}\to\mathbb{R}$, the function uniquely determines an ordered pair by the rule $f\mapsto (f(1),f(2))$. So instead of thinking of the elements of $\mathbb{R}^2$ as ordered pairs, we can think of them as functions with domain $\{1,2\}$. I'll write a function $f\colon\{1,2\}\to \mathbb{R}$ as $\mathbf{f}$ when I want to think of it as a vector.

If we think of the elements of $\mathbb{R}^2$ as functions, how do the operations translate into "operations on functions"? Well, if $\mathbf{f}$ and $\mathbf{g}$ are functions from $\{1,2\}$ to $\mathbb{R}$, what should $\mathbf{f}+\mathbf{g}$ be? Since $\mathbf{f}$ corresponds to $(f(1),f(2))$ and $\mathbf{g}$ corresponds to $(g(1),g(2))$, then $\mathbf{f}+\mathbf{g}$ should correspond to $(f(1),f(2))+(g(1),g(2)) = (f(1)+g(1),f(2)+g(2))$. The function whose value at $1$ is $f(1)+g(1)$ and whose value at $2$ is $f(2)+g(2)$ is what we usually call the "sum" of $\mathbf{f}$ and $\mathbf{g}$, so $\mathbf{f}+\mathbf{g}$ just corresponds to the "pointwise sum" of $f$ and $g$: to find the value of $\mathbf{f}+\mathbf{g}$ at $i$, we find the value of $f$ and the value of $g$ at $i$ and add them. Similarly, $\alpha\mathbf{f}$ should have value $\alpha f(1)$ at $1$ and $\alpha f(2)$ at $2$; that is, the "pointwise scalar product". So in a very natural way, we can think of $\mathbb{R}^2$ as "the set of all functions $f\colon\{1,2\}\to\mathbb{R}$" instead of as "the set of all ordered pairs $(x,y)$ with $x,y\in\mathbb{R}$."

It should be fairly easy to see that you can do the same thing with $\mathbb{R}^n$, using functions $f\colon\{1,2,\ldots,n\}\to\mathbb{R}$ to represent $n$-tuples (in fact, that is the set-theoretic definition of an $n$-tuple!). A function $f\colon\{1,2,\ldots,n\}\to\mathbb{R}$ corresponds to the $n$-tuple $(f(1),f(2),\ldots,f(n))$.
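To make the "tuples as functions" picture concrete, here is a minimal Python sketch (mine, not part of the original answer) that stores a vector of $\mathbb{R}^n$ as a dictionary playing the role of a function $\{1,\ldots,n\}\to\mathbb{R}$; all names are made up for illustration.

```python
# Sketch: a vector of R^n viewed as a function {1, ..., n} -> R,
# represented as a dict from indices to floats (labels are illustrative).

def tuple_to_function(t):
    """(x_1, ..., x_n)  ->  the function i |-> x_i."""
    return {i + 1: x for i, x in enumerate(t)}

def function_to_tuple(f):
    """The function f  ->  (f(1), ..., f(n))."""
    return tuple(f[i] for i in sorted(f))

def pointwise_sum(f, g):
    """(f + g)(i) = f(i) + g(i), matching coordinatewise addition."""
    return {i: f[i] + g[i] for i in f}

def pointwise_scale(c, f):
    """(c f)(i) = c * f(i), matching coordinatewise scalar multiplication."""
    return {i: c * f[i] for i in f}

f = tuple_to_function((1.0, 2.0))   # corresponds to (1, 2)
g = tuple_to_function((3.0, 4.0))   # corresponds to (3, 4)
print(function_to_tuple(pointwise_sum(f, g)))      # (4.0, 6.0)
print(function_to_tuple(pointwise_scale(2.0, f)))  # (2.0, 4.0)
```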

Of course, this gives you the "obvious" thing to do with $\mathbb{R}^n$. What about other vector spaces, like, say, $\mathbf{P}_n[x]$, the vector space of all polynomials of degree at most $n$ with coefficients in $\mathbb{R}$?

The key to dealing with these vector spaces is the fact that they have bases. Fix a basis $\beta$ for $\mathbf{P}_n[x]$ (or more generally, for $\mathbf{V}$). If $\beta=\{\mathbf{v}_1,\ldots,\mathbf{v}_n\}$ (assuming finite dimensionality for now), then every vector $\mathbf{v}\in\mathbf{V}$ can be written uniquely as a linear combination of elements of $\beta$: $$\mathbf{v}=\alpha_1\mathbf{v}_1+\cdots + \alpha_n\mathbf{v}_n.$$ Thus, if we agree on $\beta$, then we can describe any vector in $\mathbf{V}$ by specifying the coefficients of the linear combination; this is the "coordinate vector" relative to $\beta$; that is, if $\mathbf{v}$ happens to be the linear combination mentioned above, then the coordinate vector of $\mathbf{v}$ with respect to $\beta$ is $[\mathbf{v}]_{\beta}$, with $$[\mathbf{v}]_{\beta} = (\alpha_1,\ldots,\alpha_n).$$ And so, given $\beta$, there is a natural way to define a bijection $\mathbf{V}\leftrightarrow \mathbb{R}^n$, and hence to think of $\mathbf{V}$ as a "set of functions". But in order to keep track of $\beta$, and make sure we are talking about the same basis, it makes a bit more sense in this case to make the vectors of $\mathbf{V}$ correspond to functions with domain $\beta$ rather than $\{1,2,\ldots,n\}$. That is, we make $\mathbf{v}$ correspond to the function $f_{\mathbf{v}}\colon\beta\to \mathbb{R}$ given by $f_{\mathbf{v}}(\mathbf{v}_i) = \alpha_i$, the coefficient of $\mathbf{v}_i$ in the linear combination that expresses $\mathbf{v}$ in terms of $\beta$. Thus, in a very natural way, we can think of the elements of $\mathbf{V}$ as "functions from $\beta$ to $\mathbb{R}$"; it turns out that pointwise addition and scalar multiplication correspond to vector addition and scalar multiplication in $\mathbf{V}$, so everything works out.
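As a small illustration (again mine, with made-up string labels), here is the coordinate function $f_{\mathbf{v}}\colon\beta\to\mathbb{R}$ for a vector of $\mathbf{P}_2[x]$, using the standard basis $\{1, x, x^2\}$:

```python
# Sketch: the coordinate function f_v : beta -> R for v = 3 + 2x^2 in P_2[x],
# with the (assumed) standard basis beta = {1, x, x^2} labelled by strings.

beta = ["1", "x", "x^2"]

f_v = {"1": 3.0, "x": 0.0, "x^2": 2.0}   # v = 3 + 0*x + 2*x^2

def add(f, g):
    """Pointwise sum of coordinate functions; mirrors addition in P_2[x]."""
    return {b: f[b] + g[b] for b in beta}

def scale(c, f):
    """Pointwise scalar product; mirrors scalar multiplication in P_2[x]."""
    return {b: c * f[b] for b in beta}

f_w = {"1": 1.0, "x": 1.0, "x^2": 0.0}   # w = 1 + x
print(add(f_v, f_w))    # {'1': 4.0, 'x': 1.0, 'x^2': 2.0}, i.e. 4 + x + 2x^2
print(scale(2.0, f_v))  # {'1': 6.0, 'x': 0.0, 'x^2': 4.0}, i.e. 6 + 4x^2
```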

Can we do this wholesale? Yes: fix a set $X$, and consider all functions $f\colon X\to\mathbb{R}$ with pointwise addition and pointwise scalar multiplication. Then you get a vector space; and you can realize every vector space as isomorphic to one of these vector spaces by picking a basis and doing the stuff we did above (at least in the finite dimensional case; more on infinite dimensions below). Different bases will yield different "realizations" of the vector space (if $\beta\neq\gamma$ as sets, then thinking of $\mathbf{V}$ as "all functions $\beta\to\mathbb{R}$" makes it a different object than if we think of $\mathbf{V}$ as "all functions $\gamma\to\mathbb{R}$"); but they will be isomorphic as vector spaces: there is a natural bijection between them which is linear, so there is a way to "translate" from one to the other.
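Here is a tiny sketch, under my own choice of bases, of the "different realizations, with a linear translation between them" point for $\mathbb{R}^2$ with $\beta = \{e_1, e_2\}$ and $\gamma = \{e_1+e_2,\ e_1-e_2\}$:

```python
# Sketch: the vector (3, 1) in R^2 realized as a function on two different
# bases.  beta = {e1, e2}; gamma = {u, v} where u = e1 + e2, v = e1 - e2.

v_beta  = {"e1": 3.0, "e2": 1.0}   # (3, 1) = 3*e1 + 1*e2
v_gamma = {"u": 2.0, "v": 1.0}     # (3, 1) = 2*u + 1*v

def beta_to_gamma(f):
    """The linear "translation" between the two realizations:
    x*e1 + y*e2 = ((x + y)/2)*u + ((x - y)/2)*v."""
    x, y = f["e1"], f["e2"]
    return {"u": (x + y) / 2, "v": (x - y) / 2}

print(beta_to_gamma(v_beta) == v_gamma)   # True
```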

The reason this all works is that every finite dimensional vector space is a free object in the category of finite dimensional (real) vector spaces. (I'll say a bit about the infinite dimensional case later.) Take the category of finite dimensional real vector spaces, and let $S$ be a set. A "free vector space on $S$" is a (finite dimensional real) vector space $\mathbf{V}$, together with a set-theoretic function $i\colon S\to \mathbf{V}$ such that for any (finite dimensional real) vector space $\mathbf{W}$ and set-theoretic function $j\colon S\to\mathbf{W}$, there exists a unique linear transformation $T\colon \mathbf{V}\to\mathbf{W}$ such that $j = T\circ i$. A vector space $\mathbf{V}$ is "free" if there exists a set $S$ such that $\mathbf{V}$ is free on $S$.

Since a linear transformation is completely determined by the image of a basis, and any selection of images of a basis defines a (unique) linear transformation, if you let $\beta$ be a basis for $\mathbf{V}$, and you let $i\colon\beta\to\mathbf{V}$ be the natural inclusion, then this gives you a free object: given any $\mathbf{W}$ and any $j\colon\beta\to\mathbf{W}$, there is a unique linear $T\colon\mathbf{V}\to\mathbf{W}$ that 'extends' $j$; that is, such that $j = T\circ i$. So $\mathbf{V}$ is free, because it is free on (any of) its bases. And since two sets are "isomorphic" in the category $\mathcal{S}et$ if and only if they are bijectable, and free objects are unique up to unique isomorphism, two vector spaces with bases of the same size are isomorphic via a unique isomorphism that identifies the bases.
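A hedged sketch of this "extend by linearity" step (not the author's code, just an illustration with $\mathbf{W} = \mathbb{R}^3$ and invented names): given a set map $j$ defined on a basis, the unique linear $T$ is computed by summing the coefficients against the chosen images.

```python
# Sketch: a linear map is determined by where it sends a basis.  Vectors of V
# are coordinate functions beta -> R (dicts); vectors of W = R^3 are tuples.
# Extending a set map j : beta -> W by linearity gives the unique T with
# T o i = j, where i is the inclusion of the basis.

def extend_linearly(j):
    def T(v):
        # T(sum_b v(b) * b) = sum_b v(b) * j(b)
        out = (0.0, 0.0, 0.0)
        for b, coeff in v.items():
            out = tuple(a + coeff * x for a, x in zip(out, j[b]))
        return out
    return T

j = {"e1": (1.0, 0.0, 2.0), "e2": (0.0, 1.0, -1.0)}   # any set map on the basis
T = extend_linearly(j)

v = {"e1": 3.0, "e2": 1.0}   # v = 3*e1 + 1*e2
print(T(v))                  # (3.0, 1.0, 5.0) = 3*j(e1) + 1*j(e2)
```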

The problem here is that in order to think of a vector space as "free", you really need to think about the basis as well. That is, it is really "vector-space-together-with-a-specific-basis" that gives you a free object, and not the vector space by itself. Pick a different basis, you get a different (though isomorphic) free object.

Going to infinite dimensional vector spaces presents a bit more trouble; given an infinite basis $\beta$, it is no longer true that every function $f\colon\beta\to\mathbb{R}$ represents a vector (though every vector corresponds to a function), because linear combinations must be finite; we only get the functions with "finite support" (only finitely many nonzero values). You can restrict yourself to those, and then things work out generally okay, with some caveats (for example, a linear transformation $\mathbf{V}\to\mathbf{W}$ is still completely and uniquely determined by an arbitrary function $f\colon\beta\to\mathbf{W}$, where $\beta$ is a basis for $\mathbf{V}$; this function need not have finite support).
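For instance, here is a rough sketch (mine) of the finite-support convention for the space of all real polynomials, whose basis $\{x^k : k \ge 0\}$ is infinite; a vector only stores its finitely many nonzero coefficients.

```python
# Sketch: the space of all real polynomials, with infinite basis {x^k : k >= 0}.
# A vector is a *finitely supported* function from the basis to R: the dict
# only stores the finitely many nonzero values.

p = {0: 5.0, 3: 2.0}    # p(x) = 5 + 2x^3
q = {1: 1.0, 3: -2.0}   # q(x) = x - 2x^3

def add(f, g):
    """Pointwise sum, then drop zeros so the support stays finite."""
    keys = set(f) | set(g)
    out = {k: f.get(k, 0.0) + g.get(k, 0.0) for k in keys}
    return {k: c for k, c in out.items() if c != 0.0}

print(add(p, q))   # {0: 5.0, 1: 1.0} (key order may vary), i.e. 5 + x
```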

What you are looking at with this definition, though, is really the representability of vector spaces as Hom-sets. Given a fixed vector space $\mathbf{W}$, for any vector space $\mathbf{V}$ you can consider the set of "homomorphisms" of $\mathbf{V}$ to $\mathbf{W}$; this is the set $\mathcal{L}(\mathbf{V},\mathbf{W})$ of linear transformations from $\mathbf{V}$ to $\mathbf{W}$. You are probably aware that this set is in fact a vector space itself with pointwise addition and scalar multiplication, and that (in the finite dimensional case) it has dimension $\dim(\mathbf{V})\dim(\mathbf{W})$. If you set $\mathbf{W}=\mathbb{R}$ (thought of as a vector space over itself), then $\mathcal{L}(\mathbf{V},\mathbb{R})$ is isomorphic to $\mathbf{V}$ when $\mathbf{V}$ is finite dimensional, because it has the same dimension (this vector space is called the dual of $\mathbf{V}$). And of course, $\mathcal{L}(\mathbf{V},\mathbb{R})$ is naturally bijectable with the collection of all set-theoretic functions $\beta\to\mathbb{R}$, where $\beta$ is a fixed basis for $\mathbf{V}$. So the author is kind of looking at the dual spaces instead of looking at the spaces; this leads to some difficulties in defining linear transformations, because passing to the dual is contravariant, as Qiaochu points out.

Solution 2:

What you've defined is not the free vector space on $S$. It is isomorphic to it, but is not the same thing. The problem is that "free vector space" is not just an operation on sets; it is a functor, so it also operates on functions between sets. And when you look at the space of functions $S \to F$, the natural way to extend this to a functor is contravariant: given a set function $f : S \to T$, there is a natural map from the space of functions $T \to F$ to the space of functions $S \to F$ given by precomposition. The free vector space functor, however, is covariant.
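A small sketch of this direction-of-arrows point, with an invented set map $f$: the pullback on functions precomposes (contravariant), while the free vector space construction pushes basis elements forward (covariant).

```python
# Sketch: a set map f : S -> T induces maps in opposite directions.
# Functions T -> F pull back to functions S -> F (contravariant);
# vectors in Free(S) push forward to Free(T) (covariant).

S = ["a", "b", "c"]
T = ["x", "y"]
f = {"a": "x", "b": "x", "c": "y"}   # an invented set map f : S -> T

def pullback(g):
    """g : T -> F  |->  g o f : S -> F  (precomposition)."""
    return {s: g[f[s]] for s in S}

def pushforward(v):
    """v in Free(S), a finitely supported S -> F, maps to Free(T) by
    sending the basis element s to the basis element f(s)."""
    out = {}
    for s, coeff in v.items():
        out[f[s]] = out.get(f[s], 0.0) + coeff
    return out

print(pullback({"x": 1.0, "y": 2.0}))     # {'a': 1.0, 'b': 1.0, 'c': 2.0}
print(pushforward({"a": 1.0, "b": 3.0}))  # {'x': 4.0}
```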

What is true is that there is a natural dual pairing between the free vector space on $S$ and the space of functions $S \to F$ given by composition, and finite-dimensional vector spaces which are dual have the same dimension. Using this pairing, the dual vector to an element $s \in S$ in the free vector space on $S$ is the indicator function $1_s : S \to F$ which is equal to $1$ at $s$ and equal to $0$ otherwise. So authors will sometimes identify $s$ with its indicator function. (Note that when $S$ is infinite it is highly false that the two spaces have the same dimension, which is another reason to avoid this definition.)
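As a sketch of that pairing (my notation, not the answer's): pairing a finitely supported vector with a function sums the products of their values, and the basis element $s$ pairs to $1$ with its indicator $1_s$.

```python
# Sketch: the pairing between Free(S) (finitely supported S -> F) and the
# space of all functions S -> F:  <v, h> = sum over s of v(s) * h(s).

S = ["a", "b", "c"]

def pairing(v, h):
    return sum(coeff * h[s] for s, coeff in v.items())

one_a = {s: (1.0 if s == "a" else 0.0) for s in S}   # the indicator 1_a

a = {"a": 1.0}            # the basis element a of the free vector space on S
b = {"b": 1.0}
print(pairing(a, one_a))  # 1.0: the indicator 1_a pairs to 1 with a itself
print(pairing(b, one_a))  # 0.0: and to 0 with every other basis element
```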

One might argue that the distinction is academic. If you think that, write down what happens to a left group action of a group $G$ on $S$ in each of the two cases. (If you can't tell what the difference is, replace "group" with "monoid.")
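If you do want a concrete check, here is one possible sketch (two transpositions acting on a three-element set; all of it is my own illustration, so treat it as a spoiler for the exercise): the pushforward on the free vector space respects the composite action, while precomposition on functions reverses the order.

```python
# Sketch: a left action of G on S = {1, 2, 3} via two transpositions.
# Pushing forward on the free vector space gives a left action; precomposing
# on functions S -> F reverses the order of composition (a right action),
# and turning it back into a left action needs inverses, hence the remark
# about monoids.

S = [1, 2, 3]
g1 = {1: 2, 2: 1, 3: 3}              # the transposition (1 2)
g2 = {1: 1, 2: 3, 3: 2}              # the transposition (2 3)
g1g2 = {s: g1[g2[s]] for s in S}     # the product g1 * g2 acting on S

def push(g, v):                      # covariant: basis element s |-> g(s)
    out = {}
    for s, c in v.items():
        out[g[s]] = out.get(g[s], 0.0) + c
    return out

def pull(g, h):                      # precomposition: (g . h)(s) = h(g(s))
    return {s: h[g[s]] for s in S}

v = {1: 1.0}                         # a basis vector of the free vector space
h = {1: 1.0, 2: 0.0, 3: 0.0}         # the indicator function of 1

print(push(g1g2, v) == push(g1, push(g2, v)))   # True: a genuine left action
print(pull(g1g2, h) == pull(g1, pull(g2, h)))   # False: the order is flipped
```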

Solution 3:

This is not the definition of an arbitrary vector space, for (at least) two reasons.

1) The requirement that $S$ be finite means your vector space -- let me call it $F^{(S)}$ -- is a finite-dimensional vector space. So infinite-dimensional vector spaces are missing. For this, one can extend the construction to an arbitrary set $S$, but we need to slightly redefine $F^{(S)}$ to be the set of all functions $f: S \rightarrow F$ such that $f^{-1}(F \setminus \{0\})$ is finite (finitely nonzero functions). Note that this is precisely the definition in the planetmath article on Free Vector Spaces that you link to.

2) This definition captures all vector spaces up to isomorphism only. It is much closer to the concept of "vector space with distinguished basis". Indeed, if $V$ is an $F$-vector space, then taking $S$ to be a basis of $V$, we get a map from $V$ to $F^{(S)}$ by mapping each basis element $e \in S$ to the characteristic (or "delta") function $1_e: S \rightarrow F$ (namely, for $e' \in S$, $1_e(e') = 1$ if $e = e'$ and $0$ otherwise) and extending by linearity. This is consistent with the universal mapping property underlying the definition of "free vector space", i.e., every vector space can be viewed as (or more accurately, canonically endowed with the structure of) the free vector space on any one of its bases in this way.
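A brief sketch of this map (my labels, not from the answer): the delta functions for a three-element basis, and the image of $v = 2e_1 + 3e_2$ under the linear extension.

```python
# Sketch: the delta functions 1_e for a basis S = {e1, e2, e3} of V, and the
# image of v = 2*e1 + 3*e2 under the map V -> F^(S) (extend e |-> 1_e linearly).

S = ["e1", "e2", "e3"]

def delta(e):
    """1_e : S -> F, equal to 1 at e and 0 elsewhere."""
    return {s: (1.0 if s == e else 0.0) for s in S}

# v = 2*e1 + 3*e2  |->  2*1_{e1} + 3*1_{e2}:
image_of_v = {s: 2.0 * delta("e1")[s] + 3.0 * delta("e2")[s] for s in S}
print(image_of_v)   # {'e1': 2.0, 'e2': 3.0, 'e3': 0.0}
```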

In terms of examples: for finite sets $S$, from a concrete perspective this is close to just considering the vector space $F^n$ of $n$-tuples of elements of $F$. Indeed, arguably $F^n$ is an example of this with $S = \{1,\ldots,n\}$. In general, the only difference I can see is that the finite set $S$ does not come with a distinguished ordering, so whereas $F^n$ has a canonical ordered basis $e_1,\ldots,e_n$, $F^{(S)}$ has a canonical unordered basis $\{1_s \ | \ s \in S\}$.