I cannot join machines to the domain when the primary DC fails; everything else works fine

I checked the FSMO roles, and all of them are held by the primary DC that was off. Could this last point be it?

Yes.

How can I make the roles fail over to the secondary DC when the primary is off?

There isn't a good way to fix this.

If the DC holding the FSMO roles will be down for longer than expected, the roles should be moved to the other DC. But these roles are not something you typically move between DCs to maintain availability, and not something I would expect to be "automated". Rather, the objective should be to minimize downtime for the DC that holds them.
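
For the manual move, here is a rough sketch of how the command could be put together. It is a small Python helper that only builds the PowerShell command line for the ActiveDirectory module's Move-ADDirectoryServerOperationMasterRole cmdlet; it does not run anything. The target name "DC2" and the helper name build_move_fsmo_command are made up for illustration, so substitute your own surviving DC. A graceful transfer needs the current role holder to be reachable. If it is offline, the cmdlet's -Force switch seizes the roles instead, which is generally meant for a holder that is not coming back.

```python
# The five FSMO roles, spelled the way the cmdlet's
# -OperationMasterRole parameter accepts them.
FSMO_ROLES = (
    "SchemaMaster",
    "DomainNamingMaster",
    "PDCEmulator",
    "RIDMaster",
    "InfrastructureMaster",
)


def build_move_fsmo_command(target_dc, seize=False, roles=FSMO_ROLES):
    """Return a PowerShell command line that moves FSMO roles to target_dc.

    target_dc is a hypothetical DC name supplied by the caller.
    seize=True appends -Force, which seizes the roles when the
    current holder is offline instead of transferring them gracefully.
    """
    if not target_dc:
        raise ValueError("target_dc must be a non-empty DC name")
    unknown = [role for role in roles if role not in FSMO_ROLES]
    if unknown:
        raise ValueError("unknown FSMO role(s): " + ", ".join(unknown))

    parts = [
        "Move-ADDirectoryServerOperationMasterRole",
        '-Identity "{}"'.format(target_dc),
        "-OperationMasterRole " + ",".join(roles),
    ]
    if seize:
        parts.append("-Force")
    return " ".join(parts)


if __name__ == "__main__":
    # "DC2" is a placeholder for the DC that is still online.
    print(build_move_fsmo_command("DC2"))
    print(build_move_fsmo_command("DC2", seize=True))
```

Paste the printed line into an elevated PowerShell session on a machine with the ActiveDirectory module, and review it before running, since moving or seizing roles is a deliberate one-off step and not a failover mechanism.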