Why should the underlying set of a model be a set?

One of the reasons is that the $\sf ZFC$ axioms are not equipped to handle proper classes. For example, $\sf ZF$ proves that all the axioms of $\sf ZFC$ hold in the inner model $L$, which is a proper class. But we cannot give a truth definition for that class within the theory, as that would violate Tarski's theorem. And the proof that all the axioms hold in $L$ is not a single proof, but rather a schema of proofs, one for each axiom.
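
To make the contrast concrete, here is a sketch in standard notation that the text above does not itself use: $\varphi^L$ is my shorthand for the relativization of $\varphi$ to $L$, $\ulcorner\varphi\urcorner$ is a code for the sentence $\varphi$, and $\mathrm{Tr}_L$ is a hypothetical truth predicate. What we do have is one theorem per axiom:

$$\text{for each axiom } \varphi \text{ of } {\sf ZFC}: \qquad {\sf ZF} \vdash \varphi^L.$$

What Tarski's theorem rules out, assuming $\sf ZF$ is consistent, is a single formula $\mathrm{Tr}_L(x)$ such that

$$\text{for every sentence } \varphi: \qquad {\sf ZF} \vdash \varphi^L \leftrightarrow \mathrm{Tr}_L(\ulcorner\varphi\urcorner).$$

Roughly, such a formula would serve as a truth predicate for the whole universe in ${\sf ZF} + V=L$, where $\varphi$ and $\varphi^L$ are equivalent.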

Sure, we can handle some class models reasonably well (e.g. the surreal numbers, which have a relatively simple theory), but arbitrary ones? We want our foundational theory to have access to the truth definition of a structure. Otherwise, what sort of foundation does the theory provide us?

Notice that almost all statements about class models of set theory are meta-theoretic statements.

Yes, we could use class theories like $\sf KM$ (Kelley-Morse) and $\sf NBG$ (von Neumann-Bernays-Gödel), and this might allow us to extend the reach of truth definitions for various class models; but certainly not for the entire universe, as that would contradict Tarski's theorem on the undefinability of truth. So the universe of set theory will, essentially, always be an internal class.