Why does a group homomorphism preserve more structure than a monoid homomorphism while satisfying fewer equations?
Solution 1:
I'm not sure about a "categorical" reason, but generally, if you restrict to a nice subclass of objects, you have less "bad behavior," and so your theorems and definitions need fewer restrictions and hypotheses.
As an extreme case, what if we just considered the category of "Trivial Groups"? Then every function between two objects is a homomorphism (there is only one such function)! So we don't even need to specify that maps are operation-preserving.
An analogy: In general, functions can be 1-to-1 or not, onto or not. But what if we restrict our attention to sets of size $15$? Then a function from a set of size $15$ to a set of size $15$ is 1-to-1 iff it's onto. Thus in my world of size $15$, I can define bijections to be 1-to-1 functions (I get onto for free). The definition of "bijection" is simplified merely because I've moved into a very restrictive world where the phenomena of 1-to-1 and onto are equivalent.
Solution 2:
Preserving the operation is a stronger condition for a group operation than for a monoid operation, since the group operation carries more information (namely, the existence of inverse elements). So it is not too surprising that preservation of the identity element has to be required explicitly for monoids, while it is automatic for groups.
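To make this concrete, here is a short sketch of the standard argument. The names $f$, $G$, $H$, $e_G$, $e_H$ and the example monoid $M$ are my own notation, not from the question. Suppose $f\colon G \to H$ is a map between groups with $f(xy) = f(x)f(y)$ for all $x, y \in G$. Then

$$f(e_G) = f(e_G e_G) = f(e_G)\,f(e_G),$$

and multiplying both sides by $f(e_G)^{-1}$, which exists because $H$ is a group, gives

$$e_H = f(e_G).$$

Inverses then come for free as well:

$$f(x)\,f(x^{-1}) = f(x x^{-1}) = f(e_G) = e_H \quad\Longrightarrow\quad f(x^{-1}) = f(x)^{-1}.$$

The cancellation step is exactly what can fail in a monoid. For example, take $M = (\{0, 1\}, \cdot)$ with identity $1$, and let $f\colon M \to M$ be the constant map $f(x) = 0$. Then

$$f(xy) = 0 = 0 \cdot 0 = f(x)\,f(y) \quad\text{but}\quad f(1) = 0 \neq 1,$$

so $f$ preserves the operation without preserving the identity, because $0$ has no inverse to cancel with.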
Maybe you can compare this to those induction proofs where you actually prove a stronger statement, but the induction step becomes easier because the induction hypothesis carries more information.