What can I use for input conversion instead of scanf?
I have very frequently seen people discouraging others from using scanf and saying that there are better alternatives. However, all I end up seeing is either "don't use scanf" or "here's a correct format string", and never any examples of the "better alternatives" mentioned.
For example, let's take this snippet of code:
scanf("%c", &c);
This reads the whitespace that was left in the input stream after the last conversion. The usual suggested solution to this is to use:
scanf(" %c", &c);
or to not use scanf.
Since scanf is bad, what are some ANSI C options for converting input formats that scanf can usually handle (such as integers, floating-point numbers, and strings) without using scanf?
The most common ways of reading input are:
- using fgets with a fixed size, which is what is usually suggested, and
- using fgetc, which may be useful if you're only reading a single char.
To convert the input, there are a variety of functions that you can use:
- strtoll, to convert a string into an integer
- strtof/strtod/strtold, to convert a string into a floating-point number
- sscanf, which is not as bad as simply using scanf, although it does have most of the downfalls mentioned below
- There are no good ways to parse a delimiter-separated input in plain ANSI C. Either use strtok_r from POSIX or strtok, which is not thread-safe. You could also roll your own thread-safe variant using strcspn and strspn, as strtok_r doesn't involve any special OS support.
- It may be overkill, but you can use lexers and parsers (flex and bison being the most common examples).
- No conversion, simply just use the string
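As a rough sketch of the "roll your own" option mentioned in the list above, a re-entrant tokenizer can be built from strspn and strcspn alone. The name next_token and its interface are made up for this illustration; it keeps its position in a caller-supplied pointer instead of hidden static state, and like strtok it modifies the string and treats runs of delimiters as one.

```c
#include <string.h>

/*
 * next_token: hypothetical helper, a re-entrant stand-in for strtok_r.
 * *cursor points at the unparsed remainder of a writable string.
 * Returns the next token (NUL-terminated in place), or NULL when no
 * tokens remain.
 */
char *next_token(char **cursor, const char *delims)
{
    char *start;
    char *end;

    if (cursor == NULL || *cursor == NULL)
        return NULL;

    /* Skip any leading delimiters. */
    start = *cursor + strspn(*cursor, delims);
    if (*start == '\0') {
        *cursor = start;
        return NULL;
    }

    /* Find the end of the token and cut the string there. */
    end = start + strcspn(start, delims);
    if (*end != '\0') {
        *end = '\0';
        *cursor = end + 1;
    } else {
        *cursor = end;
    }
    return start;
}
```

To use it, set a char * to the start of a writable buffer and call next_token(&cur, ", ") in a loop until it returns NULL. Because all state lives in cur, two strings can be tokenized at the same time.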
Since I didn't go into exactly why scanf is bad in my question, I'll elaborate:
- With the conversion specifiers %[...] and %c, scanf does not eat up whitespace. This is apparently not widely known, as evidenced by the many duplicates of this question.
- There is some confusion about when to use the unary & operator when referring to scanf's arguments (specifically with strings).
- It's very easy to ignore the return value from scanf. This could easily cause undefined behavior from reading an uninitialized variable.
- It's very easy to forget to prevent buffer overflow in scanf. scanf("%s", str) is just as bad as, if not worse than, gets.
- You cannot detect overflow when converting integers with scanf. In fact, overflow causes undefined behavior in these functions.
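By contrast, the strtol family reports overflow through errno, so an out-of-range number can be rejected instead of triggering undefined behavior. The following is a minimal sketch, assuming the text has already been read into a string (for example with fgets). The helper name parse_int and its return codes are invented for this example, and strtol is used rather than strtoll so that it also works in strict C89.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * parse_int: hypothetical helper. Converts the whole of `text` to an int.
 * Returns 0 on success, -1 if there are no digits or there is trailing
 * junk, and -2 if the value does not fit in an int.
 */
int parse_int(const char *text, int *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);

    if (end == text)
        return -1;              /* no digits at all */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return -2;              /* overflow or underflow */

    /* Allow trailing whitespace such as the newline that fgets keeps. */
    while (*end == ' ' || *end == '\t' || *end == '\n' || *end == '\r')
        end++;
    if (*end != '\0')
        return -1;              /* trailing junk such as "12abc" */

    *out = (int)value;
    return 0;
}
```

The same pattern should carry over to strtod for floating-point input: clear errno, convert, then check both the end pointer and ERANGE.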
TL;DR
fgets is for getting the input. sscanf is for parsing it afterwards. scanf tries to do both at the same time. That's a recipe for trouble. Read first and parse later.
Why is scanf bad?
The main problem is that scanf was never intended to deal with user input. It's intended to be used with "perfectly" formatted data. I put "perfectly" in quotes because that's not completely true, but scanf is not designed to parse data as unreliable as user input. By nature, user input is not predictable. Users misunderstand instructions, make typos, accidentally press Enter before they are done, and so on. One might reasonably ask why a function that should not be used for user input reads from stdin. If you are an experienced *nix user, the explanation will not come as a surprise, but it might confuse Windows users. On *nix systems, it is very common to build programs that work via piping, which means that you send the output of one program to another by piping the stdout of the first program to the stdin of the second. This way, you can make sure that the output and input are predictable. Under these circumstances, scanf actually works well. But when working with unpredictable input, you risk all sorts of trouble.
So why aren't there any easy-to-use standard functions for user input? One can only guess here, but I assume that old hardcore C hackers simply thought that the existing functions were good enough, even though they are very clunky. Also, when you look at typical terminal applications, they very rarely read user input from stdin. Most often you pass all the user input as command line arguments. Sure, there are exceptions, but for most applications, user input is a very minor thing.
So what can you do?
First of all, gets is NOT an alternative. It's dangerous and should NEVER be used. Read here why: Why is the gets function so dangerous that it should not be used?
My favorite is fgets in combination with sscanf. I once wrote an answer about that, but I will re-post the complete code. Here is an example with decent (but not perfect) error checking and parsing. It's good enough for debugging purposes.
Note
I don't particularly like asking the user to input two different things on one single line. I only do that when they belong to each other in a natural way. For instance:
printf("Enter the price in the format <dollars>.<cent>: "); fgets(buffer, bsize, stdin);
and then use sscanf(buffer, "%d.%d", &dollar, &cent). I would never do something like printf("Enter height and base of the triangle: "). The main point of using fgets below is to encapsulate the inputs to ensure that one input does not affect the next.
#include <stdio.h>
#include <stdlib.h>

#define bsize 100

/* Report the offending line and how many conversions succeeded, then exit. */
void error_function(const char *buffer, int no_conversions) {
    fprintf(stderr, "An error occurred. You entered:\n%s\n", buffer);
    fprintf(stderr, "%d successful conversions\n", no_conversions);
    exit(EXIT_FAILURE);
}

int main(void) {
    char c, buffer[bsize] = ""; // Initialized so it is printable if fgets fails
    int x, y;
    float f, g;
    int r;

    printf("Enter two integers: ");
    fflush(stdout); // Make sure that the printf is executed before reading
    if(! fgets(buffer, bsize, stdin)) error_function(buffer, 0);
    if((r = sscanf(buffer, "%d%d", &x, &y)) != 2) error_function(buffer, r);

    // Unless the input buffer was too small we can be sure that stdin is empty
    // when we come here.
    printf("Enter two floats: ");
    fflush(stdout);
    if(! fgets(buffer, bsize, stdin)) error_function(buffer, 0);
    if((r = sscanf(buffer, "%f%f", &f, &g)) != 2) error_function(buffer, r);

    // Reading single characters can be especially tricky if the input buffer
    // is not emptied before. But since we're using fgets, we're safe.
    printf("Enter a char: ");
    fflush(stdout);
    if(! fgets(buffer, bsize, stdin)) error_function(buffer, 0);
    if((r = sscanf(buffer, "%c", &c)) != 1) error_function(buffer, r);

    printf("You entered %d %d %f %c\n", x, y, f, c);
    return 0;
}
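One gap in the example above is that sscanf stops at the first character it cannot convert, so a line such as "1 2 abc" is still accepted. A possible way to tighten this, sketched here with a made-up helper name parse_two_ints, is the %n directive, which stores how many characters have been consumed so far. Note that this does not fix the integer overflow problem of the scanf family.

```c
#include <stdio.h>

/*
 * parse_two_ints: hypothetical helper. Returns 1 only if `line` holds
 * exactly two integers followed by nothing but whitespace, else 0.
 */
int parse_two_ints(const char *line, int *x, int *y)
{
    int consumed = -1;

    /*
     * The space before %n skips trailing whitespace (including the newline
     * kept by fgets). %n stores the number of characters read so far and
     * does not count toward the return value of sscanf.
     */
    if (sscanf(line, "%d %d %n", x, y, &consumed) != 2)
        return 0;

    /* Anything left after the consumed part is unexpected input. */
    if (consumed < 0 || line[consumed] != '\0')
        return 0;

    return 1;
}
```

With this, the check in the example could become if(!parse_two_ints(buffer, &x, &y)), at the cost of no longer knowing how many conversions succeeded.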
If you do a lot of these, I would recommend creating a wrapper that always flushes:
#include <stdarg.h>
#include <stdio.h>

int printfflush(const char *format, ...) {
    va_list arg;
    int done;
    va_start(arg, format);
    done = vfprintf(stdout, format, arg);
    fflush(stdout);
    va_end(arg);
    return done;
}
Doing it like this eliminates a common problem, which is the trailing newline that can mess with the next input. But it has another issue, which is what happens if the line is longer than bsize. You can check that with if(buffer[strlen(buffer)-1] != '\n'). If you want to remove the newline, you can do that with buffer[strcspn(buffer, "\n")] = 0.
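Putting those two checks together, one possible shape for a line reader is sketched below. The function name read_line and its return codes are assumptions made for this example. When the line does not fit, it throws away the rest of it so that the leftover characters cannot leak into the next read.

```c
#include <stdio.h>
#include <string.h>

/*
 * read_line: hypothetical helper. Reads one line from `in` into `buffer`
 * and removes the newline. Returns 1 on success, 0 if the line was too
 * long (the rest of the line is discarded), -1 on end of file or error.
 */
int read_line(FILE *in, char *buffer, size_t size)
{
    size_t len;
    int ch;

    if (size == 0 || fgets(buffer, (int)size, in) == NULL)
        return -1;

    len = strcspn(buffer, "\n");
    if (buffer[len] == '\n') {
        buffer[len] = '\0';     /* complete line: strip the newline */
        return 1;
    }

    /*
     * No newline was stored. Either the text fit exactly and the newline
     * (or end of file) comes next, or the line was truncated.
     */
    ch = getc(in);
    if (ch == EOF || ch == '\n')
        return 1;

    /* Truncated: discard the remainder of the line. */
    while (ch != '\n' && ch != EOF)
        ch = getc(in);
    return 0;
}
```

A caller would typically re-prompt when this returns 0, since buffer then only holds the beginning of what the user typed.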
In general, I would advise against expecting the user to enter input in some weird format that you then have to parse into different variables. If you want to assign the variables height and width, don't ask for both at the same time. Allow the user to press Enter between them. Also, this approach is very natural in one sense. You will never get the input from stdin until the user hits Enter, so why not always read the whole line? Of course this can still lead to issues if the line is longer than the buffer. Did I remember to mention that user input is clunky in C? :)
To avoid problems with lines longer than the buffer, you can use a function that automatically allocates a buffer of appropriate size, such as getline(). The drawback is that you will need to free the result afterwards. This function is not guaranteed to exist by the standard, but POSIX has it. You could also implement your own, or find one on SO: How can I read an input string of unknown length?
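For the "implement your own" route, here is a minimal sketch in standard C that grows a buffer with realloc while reading with fgets. The name read_whole_line is invented for this example and this is not how getline() itself is specified. It assumes the input contains no embedded NUL characters and that lines are far shorter than INT_MAX.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * read_whole_line: hypothetical helper, not a standard function.
 * Reads one line of any length from `in` and returns it in a malloc'd
 * buffer without the newline. Returns NULL at end of file with nothing
 * read, or on allocation failure. The caller must free() the result.
 */
char *read_whole_line(FILE *in)
{
    size_t cap = 64;
    size_t len = 0;
    char *line = malloc(cap);
    char *grown;

    if (line == NULL)
        return NULL;

    while (fgets(line + len, (int)(cap - len), in) != NULL) {
        len += strlen(line + len);
        if (len > 0 && line[len - 1] == '\n') {
            line[len - 1] = '\0';   /* complete line: strip the newline */
            return line;
        }
        /* No newline yet: double the buffer and keep reading. */
        grown = realloc(line, cap * 2);
        if (grown == NULL) {
            free(line);
            return NULL;
        }
        line = grown;
        cap *= 2;
    }

    if (len == 0) {                 /* end of file before any character */
        free(line);
        return NULL;
    }
    return line;                    /* final line without a newline */
}
```

Typical use would be char *s = read_whole_line(stdin); followed by parsing s and then free(s).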
Stepping up the game
If you're serious about creating programs in C with user input, I would recommend having a look at a library like ncurses, because then you likely also want to create applications with some terminal graphics. Unfortunately, you will lose some portability if you do that, but it gives you far better control of user input. For instance, it gives you the ability to read a key press instantly instead of waiting for the user to press Enter.
Interesting reading
Here is a rant about scanf: https://web.archive.org/web/20201112034702/http://sekrit.de/webdocs/c/beginners-guide-away-from-scanf.html