What are all the reasons `fgetc()` might return `EOF`?

Certainly fgetc() returns EOF when end-of-file or an input error occurs.
Is that all and does that mean no more data is available?

FILE *inf = ...;
int ch;
while ((ch = fgetc(inf)) != EOF) {
  ;
}
if (feof(inf)) puts("End-of-file");
else if (ferror(inf)) puts("Error");
else puts("???");

Is testing with feof() and ferror() sufficient?

Note: EOF here is a macro that evaluates to some negative int, often -1. It is not a synonym for end-of-file.

I have found some questions that are close to this issue, yet none that enumerate all possibilities.


Solution 1:

Is that all and does that mean no more data available?

No, there are more ways to get EOF.
EOF does not necessarily mean no more data is available; it depends.

The C standard lists three cases where fgetc() returns EOF.

If the end-of-file indicator for the stream is set, or if the stream is at end-of-file, the end-of-file indicator for the stream is set and the fgetc function returns EOF. Otherwise, the fgetc function returns the next character from the input stream pointed to by stream. If a read error occurs, the error indicator for the stream is set and the fgetc function returns EOF. C17dr § 7.21.7.1 3

Recall each stream, like stdin, has an end-of-file indicator and error indicator.

  • stream just encountered the end-of-file

    (Most common) An attempt has been made to get more data, but there was none.

  • end-of-file indicator for the stream is set

    fgetc() first examines the stream's end-of-file indicator. If the indicator is set, it returns EOF. No attempt is made to see if more data exists. With some types of streams, data may have arrived after the prior EOF report. Until the end-of-file indicator is cleared, as with clearerr(), the return remains EOF.

  • Input error

    Unlike the end-of-file indicator, the stream's error indicator is not examined first. Yet the function failed to read data for some reason other than end-of-file. A common example is fgetc(stdout): an attempt to read from a stream opened for output. Often input errors are persistent. Some are not. More data may be available. The common strategy is to end the input.

      // Example where ferror() is true, yet fgetc() does not return EOF
      FILE *inf = stdin;
      printf("end-of-file:%d error:%d\n", feof(inf), ferror(inf));
      printf("fputc():%d\n", fputc('?', inf));  // EOF reported
      printf("end-of-file:%d error:%d\n", feof(inf), ferror(inf));
      printf("fgetc():%d\n", fgetc(inf));  // User typed in `A`, 'A' reported
      printf("end-of-file:%d error:%d\n", feof(inf), ferror(inf));
    

    Output

    end-of-file:0 error:0
    fputc():-1
    end-of-file:0 error:1
    fgetc():65
    end-of-file:0 error:1
    

    When ferror() is true, it does not mean the error just occurred, only that one occurred sometime in the past.
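
The second case above (end-of-file indicator already set) can be sketched as follows. This is a minimal illustration, not a definitive test: `sticky_eof_demo` and the scratch file path are hypothetical names introduced here, and the result of the first read after appending assumes a library that follows the quoted standard text (some older libraries did not).

```c
#include <stdio.h>

/* Hypothetical helper: demonstrates the sticky end-of-file indicator.
 * `path` names a scratch file that this function creates and removes.
 * On success returns 0 and stores:
 *   results[0]: fgetc() after data was appended, indicator still set
 *   results[1]: fgetc() after clearerr()
 * Returns -1 if any setup step fails. */
int sticky_eof_demo(const char *path, int results[2]) {
  FILE *w = fopen(path, "w");
  if (w == NULL) return -1;
  if (fputc('A', w) == EOF) { fclose(w); return -1; }
  if (fclose(w) == EOF) return -1;

  FILE *r = fopen(path, "r");
  if (r == NULL) return -1;
  /* Read the only byte, then hit end-of-file so the indicator is set. */
  if (fgetc(r) != 'A' || fgetc(r) != EOF || !feof(r)) {
    fclose(r);
    return -1;
  }

  /* More data arrives after end-of-file was already reported. */
  FILE *a = fopen(path, "a");
  if (a == NULL) { fclose(r); return -1; }
  fputc('B', a);
  if (fclose(a) == EOF) { fclose(r); return -1; }

  results[0] = fgetc(r);  /* indicator still set: the standard says EOF */
  clearerr(r);            /* clears the end-of-file and error indicators */
  results[1] = fgetc(r);  /* a fresh read attempt is now made */

  fclose(r);
  remove(path);
  return 0;
}
```

Calling `sticky_eof_demo("sticky_eof_demo.tmp", results)` and comparing `results[0]` with `EOF` and `results[1]` with `'B'` shows whether a given library keeps reporting EOF until clearerr() is called.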

Other cases

  • Apparent EOF due to improperly saving as char

    fgetc() returns an int whose value is either in the unsigned char range or EOF, a negative value.
    When fgetc() reads character code 255 and the result is saved in a char on a system where char is signed, the char commonly ends up with the same value as EOF, yet end-of-file did not occur.

        FILE *f = fopen("t", "w");
        fputc(EOF & 255, f);
        fclose(f);
        f = fopen("t", "r");
        char ch = fgetc(f); // Should be int ch
        printf("%d %d\n", ch == EOF, ch);
        printf("end-of-file:%d error:%d\n", feof(f), ferror(f));
        fclose(f);
    

    Output

    1 -1  // ch == EOF !
    end-of-file:0 error:0
    
  • Systems where UCHAR_MAX == UINT_MAX. Rare.

    (I have only come across this in some older graphics processors, yet it is still something C allows.) In that case, fgetc() may read an unsigned char value outside the int range, which may then convert to EOF on the function return. Thus fgetc() returns a character code that happens to equal EOF. This is mostly an oddity of C history. One way to mostly handle this is:

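      // Keep looping when EOF was returned yet neither indicator is set:
      // that EOF is a character value. Assumes both indicators were clear beforehand.
      while ((ch = fgetc(inf)) != EOF || (!feof(inf) && !ferror(inf))) {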
        ;
      }
    

    Such pedantic code is rarely needed.

  • Undefined behavior

    Of course when UB occurs, anything is possible.

          FILE * f = fopen("Some_non_existent_file", "r");
          // Should have tested f == NULL here
          printf("%d\n", fgetc(f) == EOF); // Result may be 1
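
For the first of the other cases above (saving the result in a char), here is a minimal sketch of the int-based loop reading past byte value 255. `count_bytes` is a hypothetical helper name introduced for illustration, and the sketch assumes the usual case where UCHAR_MAX <= INT_MAX.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes remaining in `f`.
 * The result of fgetc() is kept in an int, so byte value 255 stays
 * distinct from EOF on the usual systems where UCHAR_MAX <= INT_MAX.
 * Returns the count, or -1 if the loop ended on a read error. */
long count_bytes(FILE *f) {
  long n = 0;
  int ch;  /* int, not char */
  while ((ch = fgetc(f)) != EOF) {
    n++;
  }
  return ferror(f) ? -1 : n;
}
```

With a stream holding the three bytes 'A', 255, 'B', this loop visits all three, whereas a loop that stores the result in a signed char would stop at the second byte.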
    

A robust way to handle the return from fgetc().

FILE *inf = ...;
if (inf) {  // Add test
  int ch; // USE int !

  // Pedantic considerations, usually can be ignored
  // UCHAR_MAX and INT_MAX come from <limits.h>
  #if UCHAR_MAX > INT_MAX
    clearerr(inf); // Clear history of prior flags
    // EOF with neither indicator set is a character value: keep reading
    while ((ch = fgetc(inf)) != EOF || (!feof(inf) && !ferror(inf))) {
      ;
    }
  #else
    while ((ch = fgetc(inf)) != EOF) {
      ;
    }
  #endif

  if (feof(inf)) puts("End-of-file");
  else puts("Error");
}

If code needs to look for data after end-of-file or error, call clearerr() and repeat the if() block.

Solution 2:

Another case where EOF doesn't necessarily mean 'no more data' was (rather than 'is') reading magnetic tapes. You could have multiple files on a single tape, with the end of each marked with EOF. When you encountered EOF, you used clearerr(fp) to reset the EOF and error states on the file stream, and you could then continue reading the next file on the tape. However, magnetic tapes have (for the most part) gone the way of the dodo, so this barely counts any more.

Solution 3:

Here's one obscure reason:

On Windows, reading byte 0x1A in text mode causes EOF.

By "Windows" I mean both MSVC and MinGW (so it's probably a quirk of Microsoft's CRT). This doesn't happen on Cygwin.
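
Since the behavior described above is tied to text mode, opening the file in binary mode should sidestep it. A minimal sketch, assuming that is acceptable for the data being read; `count_bytes_binary` is a hypothetical helper name, not part of any library.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes in the file at `path`, opened in
 * binary mode ("rb") so that no byte value, including 0x1A, is given
 * special meaning by text-mode translation.
 * Returns the count, or -1 if the file cannot be opened or a read fails. */
long count_bytes_binary(const char *path) {
  FILE *f = fopen(path, "rb");
  if (f == NULL) return -1;

  long n = 0;
  int ch;  /* int, so every byte value stays distinct from EOF */
  while ((ch = fgetc(f)) != EOF) {
    n++;
  }
  int failed = ferror(f);
  fclose(f);
  return failed ? -1 : n;
}
```

On systems that do not distinguish text and binary streams, the "b" has no effect, so the same code can be used everywhere.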