Which insights are particularly simple to get through category theory?
I'm trying to motivate myself to learn category theory, but I don't have a clear view of what would be the benefit. I've heard that it can be used to find similitudes between mathematical theories, but I haven't seen an example of this or maybe I can't even recognize them among the literature because I haven't mastered Category Theory to begin with.
Therefore, I'm looking for an insight that requires very little knowledge on the subject (for motivation; you can't ask people to study a large subject without offering small rewards among the path) and that highlights its virtudes.
In my opinion, what's "really happening" with category theory is that it is the model theory of type theory. On the type theory Wikipedia page just linked John Lane Bell is quoted as saying "Roughly speaking, a category may be thought of as a type theory shorn of its syntax." Type theory (in the broad sense of the term) can be thought of as (potentially) proof-relevant logic, and thus subsumes various other systems of logic. For example, universal algebra is very nicely captured by category theory in multiple, interrelated ways.
What this means is any "purely" categorical result corresponds to a provable theorem in a syntactic type theory/logic, which then applies to all "models" of that type theory/logic. (Or really, very much in the spirit of category theory of focusing on how things relate rather than what they are, it will be a result about how provable theorems in different logics relate.) A quintessential example of this is the notion of adjunction and particularly the quite easy to prove fact that right adjoints preserve all limits and, dually, left adjoints preserve all colimits. These two facts alone prove a surprisingly large number of facts once you recognize things as adjoints (or there are similar results for the even more broadly applicable notion of a representable functor), and very, very many constructions are adjoints. Let me provide a simple example. Consider a pair of adjoint functors $F \dashv U : \mathcal{C}\to\mathbf{Set}$ (this means $F : \mathbf{Set}\to\mathcal{C}$ the opposite direction of $U$). Here are just instances of the above theorems about preservation of (co)limits. $\coprod_{i\in I}F(1)\cong F(I)$, $U(\prod_{i\in I}M_i)\cong\prod_{i\in I}UM_i$ (whenever $\prod_{i\in I}M_i$ exists), $F(S/{\sim}) \cong F(S)/{\sim'}$ (i.e. $F$ applied to a quotient set is a quotient object), $N \rightarrowtail M \implies UN\rightarrowtail UM$, that is whenever $N$ is a subobject of $M$, then $UN$ is a subobject of $UM$, i.e. $UN$ is a subset of $UM$ up to isomorphism. These are completely general facts that apply to all adjoint functors whose right adjoint targets the category of sets and functions for any category $\mathcal{C}$. You can look through the examples of adjoint functors on the Wikipedia page and consider what they would look like in those examples.
Let's instantiate them to our example of universal algebra. Every category of algebraic objects (e.g. rings, lattices, groups, modules but not fields) has such an adjoint where $F$ maps to the free ring/lattice/group/etc. and $U$ maps to the underlying set or carrier of the ring/lattice/group/etc. (This adjunction satisfies a further property called monadicity which isn't relevant here, but is another piece of "pure" category theory from which we can derive many more results.) The theorems above are then: "the free object on a set is the coproduct of copies of the free object on one generator indexed by by that set", "the underlying set of a product of algebraic objects is the product of their underlying sets", "the free object on the quotient of a set is the quotient of the free object on the set", "the underlying set of subobject (subring, subgroup, submodule, sublattice, etc.) of an algebraic object is a subset (up to isomorphism) of the underlying set of the object". None of these results are particularly hard to prove, but proving any one of them concretely for a particular algebraic object, say rings, would be comparable or harder than proving the general result about preservation of (co)limits by adjoint functors, and obviously would be far less general. I've answered multiple questions here using little more than these preservation properties or the failure thereof.
In a similar but different vein is the notion of an elementary topos. In this case, instead of having a category with fairly minimal structure and proving results with sweeping generality, we'll consider much more highly structured categories which, whenever we can show a category is an example of one, will give us a variety of tools. The short description from the perspective of logic is an elementary topos is a category that allows us to do intuitionistic higher order logic, or, more familiarly, we can do intuitionistic set theory in an elementary topos (with natural numbers object (NNO) if we want the axiom of infinity). The most striking piece of "pure category theory" here is Lawvere's axioms for an elementary topos which are 1) the category has all finite limits (a very common property) and 2) the category has power objects. It's remarkable that these two simple properties lead to such a wealth of structure and (with NNOs) allows you to do a large chunk of (constructive) mathematics. What's even more remarkable is that toposes arose from algebraic geometry for completely unrelated reasons, and this provides quite a lot of examples of toposes, so it turns out that there are many models of intuitionistic set theory. (There are many examples of (pre)sheaves outside of algebraic geometry too, such as simplicial sets.) A notable class is sheaves on a topological space. Cohen forcing used to prove the independence of the continuum hypothesis and the axiom of choice can be formulated, quite directly, as talking about particular sheaves. We can feed these internal language based approaches back into algebraic geometry to make occasionally intimidating external definitions be completely pedestrian statements internally. For example, a locally ringed space is just the external view of the existence of a local ring (constructively formulated) internally. Furthermore, anything you can (constructively) prove about local rings, now becomes (also) a theorem about locally ringed spaces. Abelian categories have a similar story with categories of modules replacing categories of "sets". A similar story happens with symmetric monoidal categories and linear logic, but symmetric monoidal categories also connect to physics. Yet another example is $(\infty,1)$-categories and homotopy type theory.
Summarizing, "pure" category theory is often the study of classes of categories which, via internal languages, we can (but historically often haven't) reframe as talking about the model theory of some logic/type theory. Since internal languages are sound and complete, we can then instead talk syntactically about what's provable/constructible in the internal language. These syntactic results, which are often quite simple to establish in the internal language, can then be interpreted into many specific cases. My "checklist" when considering some new category includes things like: is this an abelian category? is this a topos? does this have finite (co)limits? is the category of models for a sketch? is it cartesian closed? is it symmetric monoidally closed? Basically, these questions help me figure out what the features of the type theory that is the internal language are, which usually provides a very compact notation and a pile of intuitions and results. For example, a cartesian closed category models a typed lambda calculus (with products), so I'm free to use that notation and all the reasoning tools I have from functional programming to apply to it. As mentioned above, if it's a topos then I additionally know that typed lambda calculus has dependent sums and products as well as sum types and a variety of other constructs.
Here's an example of categorical thinking clarifying a choice of construction in point-set topology. Take the Cartesian product of some sets, say indexed by a set $I$. I'll write it as $\prod_{i \in I} X_i$. Now, suppose we want to make $\prod_{i \in I} X_i$ into a topological space given that we have already defined topological spaces on the $X_i$. There are a many ways to do this. For the sake of this comment, I'll compare two. The product topology and the box topology.
The box topology is to some extent, the "intuitive" notion to use. It says if each $U_i \subseteq X_i$ are open sets, then the set-theoretic product $\prod_{i \in I} U_i$ is an open set of $\prod_{i \in I} X_i$ (and the arbitrary union of all such sets is open in the product too). The product topology, on the other hand, appears to come from nowhere. It says if each $V_i \subseteq X_i$ is open and for only finitely many $i \in I$, we have $V_i \neq X_i$, then the product $\prod_{i \in I} V_i$ is an open set of $\prod_{i \in I} X_i$ (and the arbitrary union of all such sets is open in the product too).
If $I$ is finite, these two notions coincide. On the other hand, if $I$ is infinite, it turns out that the product topology is coarser than the box topology. As the name suggests, we normally take the product topology as the natural topology on the Cartesian product of sets, rather than the box topology. Why?
Well, it turns out that in $\mathbf{Top}$, the category of topological spaces, the categorical product corresponds to the product topology, not the box topology. This is because of the projection maps $\pi_j: \prod_{i \in I} X_i \to X_j$. Given any continuous map $f: Y \to \prod_{i \in I} X_i$ (endowed with the product topology), there is a unique continuous map $f_j: Y \to X_j$ such that $f_j = \pi_j \circ f$. This is exactly the universal property of the product in a category.
We take the construction that gives the "best" properties, even if that construction isn't the most intuitive. This example shows, I think, that category theory can sometimes clarify what those "best" properties are.