Polynomial equations $p(A, B) = 0$ for matrices that ensure $AB = BA$
Let $k$ be a field with characteristic different from $2$, and $A$ and $B$ be $2 \times 2$ matrices with entries in $k$. Then we can prove, with a bit art, that $A^2 - 2AB + B^2 = O$ implies $AB = BA$, hence $(A - B)^2 = O$. It came to a surprise for me when I first succeeded in proving this, for this seemed quite nontrivial to me.
I am curious if there is a similar or more general result for the polynomial equations of matrices that ensures commutativity. (Of course, we do not consider trivial cases such as the polynomial $p(X, Y) = XY - YX$ corresponding to commutator)
p.s. This question is purely out of curiosity. I do not know even this kind of problem is worth considering, so you may regard this question as a recreational one.
Your question is very interesting, unfortunately that's not a complete answer, and in fact not an answer at all, or rather a negative answer.
You might think, as generalization of $A^2+B^2=2AB$, of the following matrix equation in $\mathcal M_n\Bbb C$ : $$ (E) :\ \sum_{l=0}^k (-1)^k \binom kl A^{k-l}B^l = 0. $$
This equation implies the commutativity if and only if $n=1$ of $(n,k)=(2,2)$, which is the case you studied.However, the equation (E) has a remarkable property : if $A$ and $B$ satisfy (E) then their characteristic polynomials are equal. Isn't it amazing ? You can have a look at this paper for a proof.
I'm neither an expert on this field nor on the unrelated field of the facts I'm about to cite, so this is more a shot in the dark. But: Given a set of matrices, the problem whether there is some combination in which to multiply them resulting in zero is undecidable, even for relatively small cases (such as two matrices of sufficient size or a low fixed number of $3\times3$ matrices).
A solution to one side of this problem (the "is there a polynomial such that..." side) looks harder (though I have no idea beyond intuition whether it really is!) than the mortality problem mentioned above. If that is actually true, then it would at least suggest that $AB = BA$ does not guarantee the existance of a solution (though it might still happen).
In any case, the fact that the mortality problem is decidable for $2 \times 2$ matrices at least shows that the complexity of such problems increases rapidly with dimension, which could explain why your result for $2$ does not easily extend to higher dimensions.
Apologies for the vagueness of all this, I just figured it might give someone with more experience in the field a different direction to think about the problem. If someone does want to look that way, this paper has the mentioned results as well as references to related literature.