How to get a leading zero?

Rather than checking whether a format specification character is present anywhere in the format string, you would be better off processing the string character by character, in sequence, and acting on each character as it "arrives".

A small state machine can read each character of a format specifier and simply store its effect. For example, a 0 right after the % sets a zero-padding flag, and the digits that follow set the minimum field width. Once you hit a character that ends a format specifier (like d or s), you grab an argument from the variable argument list and use the stored effects to modify the output.

Then clear the effects and start on the next specifier.
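
The approach above could be sketched roughly as follows. This is only a minimal illustration, not a full printf: the names mini_format, spec_state, put and put_padded are made up for this example, it writes into a caller-supplied buffer, and it assumes only the 0 flag, a decimal width, and the d and s conversions are needed.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Effects stored while reading one specifier (hypothetical helper type). */
struct spec_state {
    int zero_pad;   /* saw the '0' flag */
    int width;      /* minimum field width, 0 if none given */
};

/* Append one character, always leaving room for the terminator. */
static void put(char *out, size_t cap, size_t *len, char c)
{
    if (*len + 1 < cap)
        out[(*len)++] = c;
}

/* Emit n chars of s, left-padded to the stored width.
   With zero padding the sign goes before the zeros. */
static void put_padded(char *out, size_t cap, size_t *len,
                       const struct spec_state *st,
                       int negative, const char *s, size_t n)
{
    size_t used = n + (negative ? 1 : 0);
    size_t pad = (size_t)st->width > used ? (size_t)st->width - used : 0;
    size_t i;

    if (!st->zero_pad)
        for (i = 0; i < pad; i++) put(out, cap, len, ' ');
    if (negative)
        put(out, cap, len, '-');
    if (st->zero_pad)
        for (i = 0; i < pad; i++) put(out, cap, len, '0');
    for (i = 0; i < n; i++) put(out, cap, len, s[i]);
}

/* Hypothetical mini formatter: returns the number of chars written. */
size_t mini_format(char *out, size_t cap, const char *fmt, ...)
{
    va_list ap;
    size_t len = 0;
    int in_spec = 0;                 /* 0 = literal text, 1 = inside "%..." */
    struct spec_state st = {0, 0};

    assert(out != NULL && cap > 0);
    va_start(ap, fmt);
    for (; *fmt != '\0'; fmt++) {
        char c = *fmt;

        if (!in_spec) {
            if (c == '%') {          /* start a specifier with cleared effects */
                in_spec = 1;
                st.zero_pad = 0;
                st.width = 0;
            } else {
                put(out, cap, &len, c);
            }
        } else if (c == '0' && st.width == 0) {
            st.zero_pad = 1;         /* leading 0 is the flag, not a width digit */
        } else if (c >= '0' && c <= '9') {
            st.width = st.width * 10 + (c - '0');
        } else if (c == 'd') {       /* terminator: consume an int argument */
            int v = va_arg(ap, int);
            unsigned u = v < 0 ? 0u - (unsigned)v : (unsigned)v;
            char digits[24];
            size_t pos = sizeof digits;

            do {                     /* build the digits back to front */
                digits[--pos] = (char)('0' + u % 10);
                u /= 10;
            } while (u != 0);
            put_padded(out, cap, &len, &st, v < 0,
                       digits + pos, sizeof digits - pos);
            in_spec = 0;
        } else if (c == 's') {       /* terminator: consume a string argument */
            const char *s = va_arg(ap, const char *);

            st.zero_pad = 0;         /* this sketch never zero-pads strings */
            put_padded(out, cap, &len, &st, 0, s, strlen(s));
            in_spec = 0;
        } else {                     /* "%%" or unsupported: emit literally */
            put(out, cap, &len, c);
            in_spec = 0;
        }
    }
    va_end(ap);
    out[len] = '\0';
    return len;
}
```

With this sketch, a call such as mini_format(buf, sizeof buf, "%05d", 42) leaves 00042 in buf. The leading zero falls out naturally because the 0 flag and the width were stored before the d was reached, and both are reset when the next % starts a new specifier.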