Safe Alternative to gets [duplicate]
When I try to compile C code that uses the gets()
function with GCC, I get this warning:
(.text+0x34): warning: the `gets' function is dangerous and should not be used.
I remember this has something to do with stack protection and security, but I'm not sure exactly why.
How can I remove this warning and why is there such a warning about using gets()
?
If gets()
is so dangerous then why can't we remove it?
In order to use gets
safely, you have to know exactly how many characters you will be reading, so that you can make your buffer large enough. You will only know that if you know exactly what data you will be reading.
Instead of using gets
, you want to use fgets
, which has the signature
char* fgets(char *string, int length, FILE * stream);
(fgets
, if it reads an entire line, will leave the '\n'
in the string; you'll have to deal with that.)
gets
remained an official part of the language up to the 1999 ISO C standard, but it was officially removed in the 2011 standard. Most C implementations still support it, but at least gcc issues a warning for any code that uses it.
Why is gets()
dangerous
The first internet worm (the Morris Internet Worm) escaped about 30 years ago (1988-11-02), and it used gets()
and a buffer overflow as one of its methods of propagating from system to system. The basic problem is that the function doesn't know how big the buffer is, so it continues reading until it finds a newline or encounters EOF, and may overflow the bounds of the buffer it was given.
You should forget you ever heard that gets()
existed.
The C11 standard ISO/IEC 9899:2011 eliminated gets()
as a standard function, which is A Good Thing™ (it was formally marked as 'obsolescent' and 'deprecated' in ISO/IEC 9899:1999/Cor.3:2007 — Technical Corrigendum 3 for C99, and then removed in C11). Sadly, it will remain in libraries for many years (meaning 'decades') for reasons of backwards compatibility. If it were up to me, the implementation of gets()
would become:
char *gets(char *buffer)
{
assert(buffer != 0);
abort();
return 0;
}
Given that your code will crash anyway, sooner or later, it is better to head the trouble off sooner rather than later. I'd be prepared to add an error message:
fputs("obsolete and dangerous function gets() called\n", stderr);
Modern versions of the Linux compilation system generates warnings if you link gets()
— and also for some other functions that also have security problems (mktemp()
, …).
Alternatives to gets()
fgets()
As everyone else said, the canonical alternative to gets()
is fgets()
specifying stdin
as the file stream.
char buffer[BUFSIZ];
while (fgets(buffer, sizeof(buffer), stdin) != 0)
{
...process line of data...
}
What no-one else yet mentioned is that gets()
does not include the newline but fgets()
does. So, you might need to use a wrapper around fgets()
that deletes the newline:
char *fgets_wrapper(char *buffer, size_t buflen, FILE *fp)
{
if (fgets(buffer, buflen, fp) != 0)
{
size_t len = strlen(buffer);
if (len > 0 && buffer[len-1] == '\n')
buffer[len-1] = '\0';
return buffer;
}
return 0;
}
Or, better:
char *fgets_wrapper(char *buffer, size_t buflen, FILE *fp)
{
if (fgets(buffer, buflen, fp) != 0)
{
buffer[strcspn(buffer, "\n")] = '\0';
return buffer;
}
return 0;
}
Also, as caf points out in a comment and paxdiablo shows in his answer, with fgets()
you might have data left over on a line. My wrapper code leaves that data to be read next time; you can readily modify it to gobble the rest of the line of data if you prefer:
if (len > 0 && buffer[len-1] == '\n')
buffer[len-1] = '\0';
else
{
int ch;
while ((ch = getc(fp)) != EOF && ch != '\n')
;
}
The residual problem is how to report the three different result states — EOF or error, line read and not truncated, and partial line read but data was truncated.
This problem doesn't arise with gets()
because it doesn't know where your buffer ends and merrily tramples beyond the end, wreaking havoc on your beautifully tended memory layout, often messing up the return stack (a Stack Overflow) if the buffer is allocated on the stack, or trampling over the control information if the buffer is dynamically allocated, or copying data over other precious global (or module) variables if the buffer is statically allocated. None of these is a good idea — they epitomize the phrase 'undefined behaviour`.
There is also the TR 24731-1 (Technical Report from the C Standard Committee) which provides safer alternatives to a variety of functions, including gets()
:
§6.5.4.1 The
gets_s
functionSynopsis
#define __STDC_WANT_LIB_EXT1__ 1 #include <stdio.h> char *gets_s(char *s, rsize_t n);
Runtime-constraints
s
shall not be a null pointer.n
shall neither be equal to zero nor be greater than RSIZE_MAX. A new-line character, end-of-file, or read error shall occur within readingn-1
characters fromstdin
.25)3 If there is a runtime-constraint violation,
s[0]
is set to the null character, and characters are read and discarded fromstdin
until a new-line character is read, or end-of-file or a read error occurs.Description
4 The
gets_s
function reads at most one less than the number of characters specified byn
from the stream pointed to bystdin
, into the array pointed to bys
. No additional characters are read after a new-line character (which is discarded) or after end-of-file. The discarded new-line character does not count towards number of characters read. A null character is written immediately after the last character read into the array.5 If end-of-file is encountered and no characters have been read into the array, or if a read error occurs during the operation, then
s[0]
is set to the null character, and the other elements ofs
take unspecified values.Recommended practice
6 The
fgets
function allows properly-written programs to safely process input lines too long to store in the result array. In general this requires that callers offgets
pay attention to the presence or absence of a new-line character in the result array. Consider usingfgets
(along with any needed processing based on new-line characters) instead ofgets_s
.25) The
gets_s
function, unlikegets
, makes it a runtime-constraint violation for a line of input to overflow the buffer to store it. Unlikefgets
,gets_s
maintains a one-to-one relationship between input lines and successful calls togets_s
. Programs that usegets
expect such a relationship.
The Microsoft Visual Studio compilers implement an approximation to the TR 24731-1 standard, but there are differences between the signatures implemented by Microsoft and those in the TR.
The C11 standard, ISO/IEC 9899-2011, includes TR24731 in Annex K as an optional part of the library. Unfortunately, it is seldom implemented on Unix-like systems.
getline()
— POSIX
POSIX 2008 also provides a safe alternative to gets()
called getline()
. It allocates space for the line dynamically, so you end up needing to free it. It removes the limitation on line length, therefore. It also returns the length of the data that was read, or -1
(and not EOF
!), which means that null bytes in the input can be handled reliably. There is also a 'choose your own single-character delimiter' variation called getdelim()
; this can be useful if you are dealing with the output from find -print0
where the ends of the file names are marked with an ASCII NUL '\0'
character, for example.
Because gets
doesn't do any kind of check while getting bytes from stdin and putting them somewhere. A simple example:
char array1[] = "12345";
char array2[] = "67890";
gets(array1);
Now, first of all you are allowed to input how many characters you want, gets
won't care about it. Secondly the bytes over the size of the array in which you put them (in this case array1
) will overwrite whatever they find in memory because gets
will write them. In the previous example this means that if you input "abcdefghijklmnopqrts"
maybe, unpredictably, it will overwrite also array2
or whatever.
The function is unsafe because it assumes consistent input. NEVER USE IT!
You should not use gets
since it has no way to stop a buffer overflow. If the user types in more data than can fit in your buffer, you will most likely end up with corruption or worse.
In fact, ISO have actually taken the step of removing gets
from the C standard (as of C11, though it was deprecated in C99) which, given how highly they rate backward compatibility, should be an indication of how bad that function was.
The correct thing to do is to use the fgets
function with the stdin
file handle since you can limit the characters read from the user.
But this also has its problems such as:
- extra characters entered by the user will be picked up the next time around.
- there's no quick notification that the user entered too much data.
To that end, almost every C coder at some point in their career will write a more useful wrapper around fgets
as well. Here's mine:
#include <stdio.h>
#include <string.h>
#define OK 0
#define NO_INPUT 1
#define TOO_LONG 2
static int getLine (char *prmpt, char *buff, size_t sz) {
int ch, extra;
// Get line with buffer overrun protection.
if (prmpt != NULL) {
printf ("%s", prmpt);
fflush (stdout);
}
if (fgets (buff, sz, stdin) == NULL)
return NO_INPUT;
// If it was too long, there'll be no newline. In that case, we flush
// to end of line so that excess doesn't affect the next call.
if (buff[strlen(buff)-1] != '\n') {
extra = 0;
while (((ch = getchar()) != '\n') && (ch != EOF))
extra = 1;
return (extra == 1) ? TOO_LONG : OK;
}
// Otherwise remove newline and give string back to caller.
buff[strlen(buff)-1] = '\0';
return OK;
}
with some test code:
// Test program for getLine().
int main (void) {
int rc;
char buff[10];
rc = getLine ("Enter string> ", buff, sizeof(buff));
if (rc == NO_INPUT) {
printf ("No input\n");
return 1;
}
if (rc == TOO_LONG) {
printf ("Input too long\n");
return 1;
}
printf ("OK [%s]\n", buff);
return 0;
}
It provides the same protections as fgets
in that it prevents buffer overflows but it also notifies the caller as to what happened and clears out the excess characters so that they do not affect your next input operation.
Feel free to use it as you wish, I hereby release it under the "do what you damn well want to" licence :-)