Why is the free monoid free?

I think you are confusing the universal property of the initial object of a category and the universal property of a free object.

What follows may be a bit too abstract for your taste, so you can skim it and skip ahead to the discussion of $\Sigma^*$:

Definition. Given a category $\mathcal{C}$, an initial object in $\mathcal{C}$ is an object $s$ such that for each object $a\in \mathcal{C}$ there is exactly one arrow $s\to a$.

An initial object, if it exists, is unique up to unique isomorphism: if $s$ and $s'$ are both initial objects, then there is a unique map $f\colon s\to s'$ and a unique map $g\colon s'\to s$. Then $g\circ f\colon s\to s$ is the unique map from $s$ to $s$, so $g\circ f = \mathrm{id}_s$; and similarly $f\circ g = \mathrm{id}_{s'}$.

By contrast, a free object is defined in terms of sets:

Definition. Let $\mathcal{C}$ be a category in which every object is a set and every arrow is a set function. Let $\mathbf{U}\colon\mathcal{C}\to\mathcal{S}et$ be the underlying set functor that maps every object to its underlying set. If $X$ is a set, then a free $\mathcal{C}$-object on $X$ is an object $F(X)$ of $\mathcal{C}$, together with a function $i\colon X\to \mathbf{U}(F(X))$ from $X$ to the underlying set of $F(X)$, such that:

  • For every object $C\in\mathcal{C}$ and every set-theoretic function $j\colon X\to \mathbf{U}(C)$, there exists a unique arrow $f\colon F(X)\to C$ in $\mathcal{C}$ such that $j=f\circ i$.

A free object on $X$, if it exists, is unique up to unique isomorphism. But the map $f$ need not be onto.

If $\mathcal{C}$ has free objects on every set, then the free object on the empty set is necessarily the initial object in $\mathcal{C}$. Thus, the trivial group is the initial object in $\mathcal{G}roup$, and the trivial monoid is the initial object in $\mathcal{M}onoid$, because the trivial group (resp. monoid) is the free group (resp. monoid) on the empty set. On the other hand, there is no free semigroup on the empty set.

The free $\mathcal{C}$-object on $X$, when it exists, can be interpreted as an initial object, but in a different category: you consider the category of all pairs $(C,j)$, where $C$ is an object of $\mathcal{C}$ and $j\colon X\to \mathbf{U}(C)$ is a set-theoretic function, and where the arrows $(C,j)\to (D,k)$ are the morphisms $f\colon C\to D$ from $\mathcal{C}$ such that $f\circ j = k$. Then $(F(X),i)$ is an initial object of this category.


If $\Sigma^*$ is the monoid of all strings drawn from the alphabet $\Sigma$, then it turns out that $\Sigma^*$ is the free monoid on the set $\Sigma$. The universal property it satisfies is not being "an initial object in the category of monoids", but rather that described above: given any set-theoretic function $j\colon \Sigma\to M$ into a monoid $M$, there is a unique monoid homomorphism $f\colon \Sigma^*\to M$ such that $f|_{\Sigma}=j$ (that is, $f$, restricted to the set $\Sigma$ viewed as a subset of $\Sigma^*$, is equal to $j$). But this map is not necessarily onto. It is onto if and only if $j(\Sigma)$ generates $M$ (as a monoid).

The proof that $\Sigma^*$ is indeed the free monoid on $\Sigma$ is fairly straightforward. Given $j\colon \Sigma \to M$, we define $f\colon \Sigma^*\to M$ recursively, using the definition of $\Sigma^*$:

  1. Define $f$ on $\Sigma_0$ (which consists of just the empty word) as mapping to the identity element of $M$.
  2. Define $f$ on $\Sigma_1 = \Sigma$ to be equal to $j$.
  3. Having defined $f$ on $\Sigma_i$, define $f$ on $\Sigma_{i+1}$ as follows: given $uv\in\Sigma_{i+1}$, with $u\in\Sigma_i$ and $v\in \Sigma$, define $f(uv)$ to be $f(u)j(v)$.

This defines $f$ on all of $\Sigma^*$. Using associativity, you show that this is a monoid homomorphism. For uniqueness, you again argue by induction: any other monoid homomorphism $g$ which agrees with $f$ on $\Sigma_1$ must also agree with it on $\Sigma_0$ (because monoid homomorphisms send the identity to the identity), and if they agree on $\Sigma_i$, then they agree on $\Sigma_{i+1}$, since $g(uv) = g(u)g(v) = g(u)j(v) = f(u)j(v)=f(uv)$.

This shows that for any monoid $M$ and any set-theoretic function $j\colon \Sigma\to M$, there is a unique monoid homomorphism $f\colon \Sigma^*\to M$ such that $f(\sigma)=j(\sigma)$ for all $\sigma\in\Sigma$, so $\Sigma^*$ is the free monoid on $\Sigma$.
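
The recursive construction of $f$ can be made concrete with a minimal Python sketch. It assumes that words over $\Sigma$ are represented as Python sequences of letters (for instance strings), and that the monoid $M$ is handed over as a binary operation together with its identity element; the names `extend`, `op`, and `identity` are hypothetical, chosen only for this illustration.

```python
from typing import Callable, Iterable, TypeVar

S = TypeVar("S")  # letters of the alphabet Sigma
T = TypeVar("T")  # elements of the target monoid M


def extend(
    j: Callable[[S], T], op: Callable[[T, T], T], identity: T
) -> Callable[[Iterable[S]], T]:
    """Build f: Sigma* -> M from j: Sigma -> M (hypothetical helper)."""

    def f(word: Iterable[S]) -> T:
        result = identity  # step 1: the empty word goes to the identity of M
        for letter in word:
            result = op(result, j(letter))  # step 3: f(uv) = f(u) j(v)
        return result

    return f


# M = (integers, +, 0) with j sending every letter to 1: f is word length.
length = extend(lambda _letter: 1, lambda x, y: x + y, 0)

# M = (strings, concatenation, "") with an arbitrarily chosen j.
code = {"a": "01", "b": "1"}
encode = extend(lambda letter: code[letter], lambda x, y: x + y, "")

u, v = "ab", "ba"
assert length("") == 0 and length(u + v) == length(u) + length(v)
assert encode(u + v) == encode(u) + encode(v)  # homomorphism property on one pair
```

The point of the sketch is that `extend` has no choices to make: once `j`, `op`, and `identity` are fixed, the value of `f` on every word is forced, which is exactly the uniqueness part of the universal property.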

Again, $f$ will be onto if and only if $\langle j(\Sigma)\rangle = M$. It is always onto $\langle j(\Sigma)\rangle$, since the image is a submonoid of $M$ that contains $j(\Sigma)$, and every element in the image is a product of elements of $j(\Sigma)$.
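
To see the surjectivity criterion in a small case, here is a sketch that computes the image of $f$ for a finite monoid by closing the identity under right multiplication by elements of $j(\Sigma)$. The helper name `generated_submonoid` is hypothetical, and the choice of $M=\mathbb{Z}/6\mathbb{Z}$ under addition is just an assumed example.

```python
from typing import Callable, Hashable, Iterable, Set, TypeVar

T = TypeVar("T", bound=Hashable)  # elements of a finite monoid M


def generated_submonoid(
    generators: Iterable[T], op: Callable[[T, T], T], identity: T
) -> Set[T]:
    """Return <generators>, i.e. the image of f, inside a finite monoid."""
    gens = list(generators)
    elements = {identity}   # f(empty word)
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = op(x, g)    # f(uv) = f(u) j(v)
            if y not in elements:
                elements.add(y)
                frontier.append(y)
    return elements


def add_mod6(x: int, y: int) -> int:
    return (x + y) % 6


# j(a) = 2, j(b) = 4: the image is a proper submonoid, so f is not onto.
image_even = generated_submonoid([2, 4], add_mod6, 0)

# j(a) = 2, j(b) = 3: the image is all of Z/6Z, so f is onto.
image_all = generated_submonoid([2, 3], add_mod6, 0)

assert image_even == {0, 2, 4}
assert image_all == set(range(6))
```

In both cases a unique homomorphism $f$ exists; only the second choice of $j$ makes it onto.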