Why isn't the axiom of extensionality considered a definition of equality?

First-order logic can be defined with or without equality. If you are working in a first-order logic with equality, which is the typical case, then stating that $X = Y \iff \forall x.x\in X\Leftrightarrow x \in Y$ is a substantial axiom that could lead to contradiction. The benefit of working in a logic with equality is that you have the rule for replacing equals for equals in predicates. As bof mentioned, this is the indiscernibility of identicals $\cfrac{x = y\qquad P(x)}{P(y)}$. Having the equality as part of the logic also impacts semantics (though considering models of set theory isn't a thing one normally does unless they are a set theorist). If equality is part of the logic, then we require equality to actually be equality in the semantics.

If, instead, we use a first-order logic without equality, then $=$ is just another binary predicate symbol like any other. No different than $\in$. In this case, the "axiom" of extensionality is merely its definition. The indiscernibility of identicals is now a meta-theorem that you would need to prove, and could easily fail to be true. This wouldn't be a problem but it would mean that the $=$ relation wouldn't behave like equality. (Again, to contrast, in FOL with equality, a failure of the indiscernibility of identicals would be a contradiction.) Semantically, $=$ would just be modeled as an equivalence relation that was compatible with $\in$.

So whether the "axiom of extensibility" is a non-trivial logical assertion or a definition depends on whether equality is already defined or not. In FOL with equality, it is and thus this is an axiom that, quite possibly, could cause a contradiction. In FOL without equality, this is merely a definition of a binary relation symbol and can't possibly cause contradiction; it can only be misleading. I should say from the perspective of informal proofs, it makes no difference which view you take. For more formal proofs, if you can use the meta-theorem proving the indiscernibility of identicals, then it also won't be much different, but that meta-theorem just means that an actual formal proof exists. The actual formal proof could be quite large where, if we were working in a logic with equality, it would be a single step of inference. This would be an issue for a mechanized implementation of logic or if you wanted to actually write out the formal proof. In this vein, working in a logic with equality is much more convenient.


If you're asking about why the axiom of extensionality is needed in set theory specifically, the answer isn't about $=$ but rather about $\in$.

Remember that a priori, "$\in$" is just some binary relation symbol. Structures in the language $\{\in\}$ are essentially just directed graphs where there is at most one edge between any two vertices. Thought of this way, extensionality says:

If $a, b$ are vertices, and every vertex with an edge to $a$ also has an edge to $b$ and vice versa, then $a=b$.

Obviously this is not true in general! E.g. consider a graph with fifteen vertices and no edges. Basically, the point is that the semantics for first-order logic does understand what "$=$" means; however, it doesn't understand what "$\in$" is supposed to mean until we add some axioms.