What is the difference between Categories and Relations?

For a common basis, I'll state basic definitions of a category and of the kind of relation I have in mind. They're here for quick clarity, not precision, so feel free to revise them in an answer.

Category:
A collection of objects, a collection of arrows each with a source and target among said objects, identity arrows, and a composition operator satisfying an associative law over arrows.

Reflexive relation on A × A:
A collection of objects, a labelled multi-set of pairs of said objects satisfying the reflexive law, and the standard relation composition operation (which satisfies the associative law).
Note that labelled multi-set relations are commonly expressed and used in discrete math as multi-graphs. I collapse graphs and relations here for a unified axiomatic treatment, so as not to treat, e.g., paths over graphs as different from relation composition.
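To make the comparison concrete, here is roughly how I picture the two structures as data (a rough Python sketch; the names and the tiny example are my own, nothing standard):

```python
# Rough sketch, my own encoding: a small category and a reflexive relation
# over the same objects, presented as plain data.

objects = {1, 2, 3}

# Category: individually named arrows with source/target, plus a composition
# table that must be supplied as data and must be associative.
arrows = {"id_1": (1, 1), "id_2": (2, 2), "id_3": (3, 3),
          "f": (1, 2), "g": (2, 3), "gf": (1, 3)}
composition = {("g", "f"): "gf"}           # which arrow "g after f" is

# Reflexive relation: a set of pairs containing (a, a) for every object;
# its composition is not extra data but is computed by the usual formula.
pairs = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 3)}

def rel_compose(r, s):
    """(a, c) is in the composite whenever (a, b) is in r and (b, c) is in s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

print(rel_compose(pairs, pairs))           # contains (1, 3) and all pairs above
```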

What, exactly, separates these two? They appear essentially identical in definition and meaning to me. Relations immediately appear more general, given that categories correspond only to a certain restricted class of relations.
I note some notational differences, such as how categories name each arrow, but I don't see how that would change mathematical power. The same goes for the explicit treatment of composition. While such contrivances can come in handy for certain proofs or explorations, I don't see how it justifies treating the two as separate branches rather than syntactic shims on identical concepts.

[EDITS: fixed associativity statement, extended relations to multi-set representation with graph analogy]


Solution 1:

Qiaochu is right: categories are much more general. Let me provide a concrete example which, while in a sense trivial, may illustrate just how much more you could get from a category than you can from a reflexive and transitive relation.

Consider the positive integers $\mathbb P$. The usual total order $\leqslant$ is a perfectly good reflexive, transitive relation on this set. We can think of this as a category consisting of the identity map $\mathrm{id}_n$ on each integer n ∈ $\mathbb P$ together with all composites of the successor maps $\sigma_n: n \to n+1$. The result is essentially an almost acyclic directed graph (all circuits are stationary walks on a single vertex).
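To make this concrete, here is a rough Python sketch (my own toy encoding, nothing canonical) of the category generated by ≤: between n and m there is at most one arrow, so composition is forced.

```python
# Sketch: the category generated by <= on the positive integers (a "thin"
# category: at most one arrow between any two objects).

def hom(n, m):
    """The set of arrows n -> m: a single token when n <= m, empty otherwise."""
    return {("leq", n, m)} if n <= m else set()

def identity(n):
    return ("leq", n, n)

def compose(g, f):
    """Compose f: n -> m with g: m -> k; the result must be the unique arrow n -> k."""
    _, n, m1 = f
    _, m2, k = g
    assert m1 == m2, "arrows are not composable"
    return ("leq", n, k)

# Composition is forced: (2 <= 5) after (1 <= 2) is the unique arrow 1 -> 5.
assert compose(("leq", 2, 5), ("leq", 1, 2)) == ("leq", 1, 5)

# Every circuit is a stationary walk: the only arrow n -> n is the identity.
assert all(hom(n, n) == {identity(n)} for n in range(1, 10))
```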

Now let me describe to you a different category on the positive integers: the category of injective linear maps on finite-dimensional real vector spaces — which is to say, the category whose objects are the positive integers, and whose arrows n → m are the m × n matrices over $\mathbb R$ with rank n (so that m ≥ n).

An aside. "What!?" you may exclaim; "How can you talk about matrices having domains and codomains consisting of positive integers?" Well, the shape of each matrix defines what sorts of other matrices it can be composed with, and any action of a matrix M on a vector v ∈ $\mathbb R^n$ could be viewed as composing the m × n matrix M with the n × 1 matrix v to obtain an m × 1 matrix Mv. The integers stand in for the dimension of the vector space $\mathbb R^n$ or $\mathbb R^m$ which are the "real" domain and codomain of the matrix; and because we can substitute the linear action of operators on $\mathbb R^n$ with the effect of composing two linear operators (one of which happens to have a domain of $\mathbb R^1$), we may ignore the structure inside of the vector spaces and refer to them just by their dimensions.
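For instance (a small numpy check, with sample values of my own choosing), acting with a matrix on a vector gives exactly the same numbers as composing it with an n × 1 matrix:

```python
import numpy as np

# A 3 x 2 matrix of rank 2: an arrow 2 -> 3 in the category described above.
M = np.array([[1., 0.],
              [0., 1.],
              [2., 3.]])
v = np.array([5., 7.])                    # a vector in R^2, alias an arrow 1 -> 2

as_action      = M @ v                    # "apply M to v", shape (3,)
as_composition = M @ v.reshape(2, 1)      # "compose M with the 2 x 1 matrix v"

assert np.allclose(as_action, as_composition.ravel())
```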

As with the directed graph generated by the total order ≤, there exist arrows n → m for any m ≥ n (by definition). All of the maps from n to m for m > n+1 can be obtained by composing maps n → n+1 and n+1 → m; and furthermore, every map n → n+1 can be obtained by composing any single designated map $\sigma_n: n \to n+1$ with a suitable invertible map $\tau: n+1 \to n+1$. So this category has a number of things in common with the one defined by the total order ≤. But there are important differences, owing in part to the fact that there are infinitely many maps n → n for each n. Because of this, this second category has much more structure than a binary relation.
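Here is a small numpy sketch of these claims (the matrices are sample values of my own choosing, not canonical): σ_n is taken to be the standard inclusion, there are plainly many distinct arrows n → n, and an arbitrary injective map n → n+1 factors through σ_n via an invertible map n+1 → n+1.

```python
import numpy as np

n = 2
# Infinitely many arrows n -> n: every invertible n x n matrix is one.
tau_a = np.eye(n)
tau_b = np.array([[2., 1.],
                  [0., 3.]])              # a second, different arrow 2 -> 2
assert np.linalg.det(tau_a) != 0 and np.linalg.det(tau_b) != 0

# A designated arrow sigma_n : n -> n+1, here the standard inclusion.
sigma = np.vstack([np.eye(n), np.zeros((1, n))])     # shape (3, 2), rank 2

# An arbitrary injective map f : 2 -> 3 (full column rank) ...
f = np.array([[1., 4.],
              [2., 5.],
              [3., 6.]])
# ... factors as f = tau o sigma, where tau : 3 -> 3 is invertible; here tau
# is built by extending the columns of f to a basis of R^3.
tau = np.hstack([f, np.array([[0.], [0.], [1.]])])
assert abs(np.linalg.det(tau)) > 1e-12               # tau is invertible
assert np.allclose(tau @ sigma, f)                   # the factorization holds
```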

For instance, it becomes meaningful and interesting to talk about coproducts.

Definition. Let A, B be two objects. A coproduct of A and B is an object (A $\amalg$ B), together with two maps $i_A$ : A → (A $\amalg$ B) and $i_B$ : B → (A $\amalg$ B) such that, for any other object X and maps f : A → X and g : B → X, there exists a unique map (f | g) : (A $\amalg$ B) → X such that f = (f | g) $\circ i_A$ and g = (f | g) $\circ i_B$. That is, in the following diagram, any two directed walks between a common pair of points represent the same map:

$\begin{matrix} \qquad\! A \!\!&\!\! \xrightarrow{\textstyle \;i_A\;} \!\!&\!\! (A \amalg B) \!\!&\!\! \xleftarrow{\textstyle \;i_B\;} \!\!&\!\! B \!\qquad \\ \mathrm{id_A}\Bigg\updownarrow & & \Bigg\downarrow(f|g)\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!\!& &\Bigg\updownarrow\mathrm{id_B} \\ \qquad\! A \!\!&\!\! \xrightarrow[\textstyle \quad f \quad] \!\!\!\!\!\!\!\!\!\!&\!\! X \!\!&\!\!\!\!\!\!\!\!\!\! \xleftarrow[\textstyle \quad g \quad] \!\!&\!\! B\!\qquad \end{matrix}$

  1. In the case of the total order ≤, the coproduct of A and B is just max{A, B}, which you should verify for yourself.
  2. In the category of matrices we described above, the coproduct of A and B is not the maximum of A and B, but is instead the integer A+B, corresponding to splitting a vector of dimension A+B into a block of size A and a block of size B. (Strictly speaking, for the universal property to hold for every choice of X, f, and g, one should allow all linear maps as arrows rather than only the injective ones, since the mediating map (f | g) need not be injective; the extra structure being illustrated is the same either way.)

    • The "inclusion" map $i_A$ would correspond to mapping a vector in v ∈ $\mathbb R^A$ into the first A coefficients of a vector in $\mathbb R^{A+B}$, with the others set to zero;
    • The "inclusion" map $i_B$ would be similar, mapping vectors in $\mathbb R^B$ into the final B coefficients.
    • For maps f : A → X and g : B → X, the map (f | g) corresponds to applying f to the first block and g to the second block of the vector, and adding the two results in $\mathbb R^X$.

    Of course, there's more than one way to decompose $\mathbb R^{A+B}$ into blocks; we could have chosen a different definition for the maps $i_A$ and $i_B$, by having $i_B$ map vectors in $\mathbb R^B$ into the first B coefficients, and $i_A$ map vectors into the final A coefficients. This way of embedding $\mathbb R^A$ and $\mathbb R^B$ into the same space at the same time is equivalent to the one we described above; in fact we can show that there is a unique isomorphism, compatible with the inclusion maps, between these two ways of constructing (A $\amalg$ B). So, the coproduct (A $\amalg$ B) depends on the choice of maps $i_A$ and $i_B$ — but not in any way which is really important. (A small numerical check of the construction appears below.)
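Here is that quick sanity check of item 2 (a numpy sketch; the dimensions and the random matrices are my own sample choices): the two triangles in the diagram above commute for the block inclusions.

```python
import numpy as np

A, B, X = 2, 3, 6                                    # sample dimensions (objects)

# The "inclusion" arrows into A + B = 5: pad with zeros outside the block.
i_A = np.vstack([np.eye(A), np.zeros((B, A))])       # (A+B) x A
i_B = np.vstack([np.zeros((A, B)), np.eye(B)])       # (A+B) x B

# Sample arrows f : A -> X and g : B -> X (full column rank with probability 1).
rng = np.random.default_rng(0)
f = rng.standard_normal((X, A))
g = rng.standard_normal((X, B))

# The mediating arrow (f | g) : A + B -> X places f and g side by side.
f_bar_g = np.hstack([f, g])                          # X x (A+B)

# The triangles in the diagram commute: (f|g) o i_A = f and (f|g) o i_B = g.
assert np.allclose(f_bar_g @ i_A, f)
assert np.allclose(f_bar_g @ i_B, g)
```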

Even though the first category generated by ≤ is what you get when you forget the matrix structure of the second category — by replacing each (possibly infinite) collection of parallel arrows with a single arrow between the same pair of objects — that second category of injective linear operators is substantially different from the simple one described by ≤.
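In the same toy style as above (a trivial sketch of my own), the forgetting step just records whether an arrow exists at all:

```python
# Forgetting the matrix structure: keep only WHETHER an arrow n -> m exists.
# An injective linear map R^n -> R^m exists exactly when n <= m, so the
# collapsed category is (the category generated by) the relation <= again.
def arrow_exists(n, m):
    return n <= m                          # nonempty hom-set iff n <= m

relation = {(n, m) for n in range(1, 5) for m in range(1, 5) if arrow_exists(n, m)}
print(sorted(relation))                    # exactly the pairs of the order <=
```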

This is just scratching the surface of one fingernail of category theory; but hopefully it convinces you that categories and reflexive, transitive binary relations are not trivially equivalent.

Solution 2:

A directed multigraph can be thought of as a one-dimensional simplicial complex: the vertices are points, and the edges are 1-simplices.

Given a category, we can form its nerve, and the 1-skeleton will essentially be a multi-graph.

The observation of the OP is that these two constructions are roughly inverse to each other.

The relationship between simplicial sets and categories is an important one, and is at the basis of the theory of higher categories and its relationship to homotopy theory. Thus the OP's observation and question shouldn't be dismissed.

On the other hand, I'm not sure that I agree with the sentiment of the OP's final non-parenthetical sentence: "I don't see how it justifies treating the two as separate branches rather than syntactic shims on identical concepts." Certainly (multi-)graph theorists and (higher-)category/homotopy theorists both make intensive studies of 1-dimensional simplicial sets, but the questions they ask and the constructions they are concerned with are typically extremely different. If they do find common ground, that's great, but they needn't be obliged to seek it.

Most working mathematicians using categories are, furthermore, not thinking in a homotopical/simplicial way at all about the structures involved, but are just taking advantage of the very useful language and range of concepts that category theory provides. These are concepts that (as far as I know) are not used at all by those studying multi-graphs and related structures, and so I don't see any reason that people should be obliged or even encouraged to translate (what to them are) familiar category-theoretic notions into a different terminological framework of relations/multi-graphs.

[Added: Ryan Budney's comment above makes essentially the same point in a pithier fashion.]

Solution 3:

This is an example of how the axioms of a theory don't say much about the intentions behind the theory. The purpose of relations has never been to study universal properties, for example.