While (( c = getc(file)) != EOF) loop won't stop executing

I can't figure out why my while loop won't work. The code works fine without it... The purpose of the code is to find a secret message in a bin file. So I got the code to find the letters, but now when I try to get it to loop until the end of the file, it doesn't work. I'm new at this. What am I doing wrong?

main(){

FILE* message;
int i, start;
long int size;
char keep[1];

message = fopen("c:\\myFiles\\Message.dat", "rb");

if(message == NULL){
    printf("There was a problem reading the file. \n");
    exit(-1);
}

//the first 4 bytes contain an int that tells how many subsequent bytes you can throw away
fread(&start, sizeof(int), 1, message);
printf("%i \n", start); //#of first 4 bytes was 280
fseek(message, start, SEEK_CUR); //skip 280 bytes
keep[0] = fgetc(message); //get next character, keep it
printf("%c", keep[0]); //print character

while( (keep[0] = getc(message)) != EOF) {
    fread(&start, sizeof(int), 1, message);
    fseek(message, start, SEEK_CUR);
    keep[0] = fgetc(message);
    printf("%c", keep[0]);
}

fclose(message);

system("pause");
}

EDIT:

After looking at my code in the debugger, it looks like having "getc" in the while loop threw everything off. I fixed it by creating a new char called letter, and then replacing my code with this:

fread(&start, sizeof(int), 1, message);
fseek(message, start, SEEK_CUR);

while( (letter = getc(message)) != EOF) {
    printf("%c", letter);
    fread(&start, sizeof(int), 1, message);
    fseek(message, start, SEEK_CUR);
}

It works like a charm now. Any more suggestions are certainly welcome. Thanks everyone.


Solution 1:

The return value from getc() and its relatives is an int, not a char.

If you assign the result of getc() to a char, one of two things happens when it returns EOF:

  • If plain char is unsigned, then EOF is converted to 0xFF, and 0xFF != EOF, so the loop never terminates.
  • If plain char is signed, then EOF is equivalent to a valid character (in the 8859-1 code set, that's ÿ, y-umlaut, U+00FF, LATIN SMALL LETTER Y WITH DIAERESIS), and your loop may terminate early.

Given the problem you face, we can tentatively guess you have plain char as an unsigned type.

The reason that getc() et al return an int is that they have to return every possible value that can fit in a char and also a distinct value, EOF. In the C standard, it says:

ISO/IEC 9899:2011 §7.21.7.1 The fgetc() function

int fgetc(FILE *stream);

If the end-of-file indicator for the input stream pointed to by stream is not set and a next character is present, the fgetc function obtains that character as an unsigned char converted to an int ...

If the end-of-file indicator for the stream is set, or if the stream is at end-of-file, the end-of- file indicator for the stream is set and the fgetc function returns EOF.

Similar wording applies to the getc() function and the getchar() function: they are defined to behave like the fgetc() function except that if getc() is implemented as a macro, it may take liberties with the file stream argument that are not normally granted to standard macros — specifically, the stream argument expression may be evaluated more than once, so calling getc() with side-effects (getc(fp++)) is very silly (but change to fgetc() and it would be safe, but still eccentric).


In your loop, you could use:

int c;

while ((c = getc(message)) != EOF) {
    keep[0] = c;

This preserves the assignment to keep[0]; I'm not sure you truly need it.

You should be checking the other calls to fgets(), getc(), fread() to make sure you are getting what you expect as input. Especially on input, you cannot really afford to skip those checks. Sooner, rather than later, something will go wrong and if you aren't religiously checking the return statuses, your code is likely to crash, or simply 'go wrong'.

Solution 2:

There are 256 different char values that might be returned by getc() and stored in a char variable like keep[0] (yes, I'm oversummarising wildly). To detect end-of-file reliably, EOF has to have a value different from all of them. That's why getc() returns int rather than char: because a 257th distinct value for EOF wouldn't fit into a char.

Thus you need to store the value returned by getc() in an int at least until you check it against EOF:

int tmpc;
while( (tmpc = getc(message)) != EOF) {
    keep[0] = tmpc;
    ...