What are all the reasons `fgetc()` might return `EOF`?
Certainly `fgetc()` returns `EOF` when end-of-file or an input error occurs.
Is that all, and does that mean no more data is available?
FILE *inf = ...;
int ch;
while ((ch = fgetc(inf)) != EOF) {
;
}
if (feof(inf)) puts("End-of-file");
else if (ferror(inf)) puts("Error");
else puts("???");
Is testing with `feof()` and `ferror()` sufficient?
Note: `EOF` here is a macro that evaluates to some negative `int`, often -1. It is not a synonym for end-of-file.
I have found some questions and more that are close to this issue, yet none that enumerate all possibilities.
Solution 1:
Is that all and does that mean no more data available?
No, there are more ways to get `EOF`.
An `EOF` does not necessarily mean no more data; it depends.
The C standard lists three cases where `fgetc()` returns `EOF`.
If the end-of-file indicator for the stream is set, or if the stream is at end-of-file, the end-of-file indicator for the stream is set and the `fgetc` function returns `EOF`. Otherwise, the `fgetc` function returns the next character from the input stream pointed to by `stream`. If a read error occurs, the error indicator for the stream is set and the `fgetc` function returns `EOF`. C17dr § 7.21.7.1 3
Recall each stream, like `stdin`, has an end-of-file indicator and an error indicator.
- Stream just encountered the end-of-file
  (Most common) An attempt has been made to get more data, but there was none.
- End-of-file indicator for the stream is set
  The stream first examines its end-of-file indicator. If the indicator is set, `fgetc()` returns `EOF`. No attempt is made to see if more data exists. With some types of streams, data may have arrived after the prior `EOF` report. Until the end-of-file indicator is cleared, as with `clearerr()`, the return remains `EOF`.
- Input error
  The stream error indicator is not examined. Yet the function failed, for some reason other than end-of-file, to read data. Often input errors are persistent. Some are not. More data may be available. The common strategy is to end the input.
  A common example of the error indicator being set without a failed read is writing to an input stream, as with `fputc('?', stdin)`.
  // Example where ferror() is true, yet fgetc() does not return EOF
  FILE *inf = stdin;
  printf("end-of-file:%d error:%d\n", feof(inf), ferror(inf));
  printf("fputc():%d\n", fputc('?', inf)); // EOF reported
  printf("end-of-file:%d error:%d\n", feof(inf), ferror(inf));
  printf("fgetc():%d\n", fgetc(inf)); // User typed in `A`, 'A' reported
  printf("end-of-file:%d error:%d\n", feof(inf), ferror(inf));
  Output
  end-of-file:0 error:0
  fputc():-1
  end-of-file:0 error:1
  fgetc():65
  end-of-file:0 error:1
  When `ferror()` is true, it does not mean the error just occurred, just that it occurred sometime in the past.
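The sticky end-of-file indicator can be sketched with an ordinary file that grows after the reader has already seen `EOF`. This is only an illustration under assumptions: the helper name `read_after_eof` and the scratch file path are hypothetical, and whether the first retry returns `EOF` depends on how strictly the C library follows the standard's wording, so the sketch clears the indicator only if it is still set.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not from the answer: a reader hits end-of-file,
   a writer then appends one byte, and the reader tries again.
   Returns the character finally read, or EOF if any step fails. */
int read_after_eof(const char *path)
{
    assert(path != NULL);

    FILE *r = NULL;
    int result = EOF;
    FILE *w = fopen(path, "wb");
    if (w == NULL) return EOF;

    if (fputc('A', w) == EOF || fflush(w) == EOF) goto done;
    r = fopen(path, "rb");
    if (r == NULL) goto done;

    if (fgetc(r) != 'A') goto done;  /* the only byte so far */
    if (fgetc(r) != EOF) goto done;  /* end-of-file: indicator is now set */

    /* More data arrives after end-of-file was reported. */
    if (fputc('B', w) == EOF || fflush(w) == EOF) goto done;

    result = fgetc(r);               /* may still be EOF: indicator is sticky */
    if (result == EOF && feof(r)) {
        clearerr(r);                 /* clear the indicator, then retry */
        result = fgetc(r);
    }

done:
    if (r != NULL) fclose(r);
    fclose(w);
    remove(path);
    return result;
}
```

Calling `read_after_eof("scratch.tmp")` should yield `'B'` once the indicator has been cleared, even though an earlier call on the same stream reported `EOF`.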
Other cases
- Apparent `EOF` due to improperly saving as `char`
  `fgetc()` returns an `int` with a value in the `unsigned char` range, or `EOF`, a negative value.
  When `fgetc()` reads character code 255, yet saves that as a `char` on a system where `char` is signed, that commonly results in the `char` having the same value as `EOF`, yet end-of-file did not occur.
  FILE *f = fopen("t", "w");
  fputc(EOF & 255, f);
  fclose(f);
  f = fopen("t", "r");
  char ch = fgetc(f); // Should be int ch
  printf("%d %d\n", ch == EOF, ch);
  printf("end-of-file:%d error:%d\n", feof(f), ferror(f));
  fclose(f);
  Output
  1 -1 // ch == EOF !
  end-of-file:0 error:0
- Systems where `UCHAR_MAX == UINT_MAX`
  Rare. (I have only come across this in some older graphics processors; still, it is something C allows.) In that case, `fgetc()` may read an `unsigned char` outside the `int` range and so convert it to `EOF` on the function return. Thus `fgetc()` is returning a character code that happens to equal `EOF`. This is mostly an oddity in C history. A way to mostly handle this is to keep reading when `EOF` is returned but neither indicator is set:
  while ((ch = fgetc(inf)) != EOF || (!feof(inf) && !ferror(inf))) {
    ;
  }
  Such pedantic code is rarely needed.
- Undefined behavior
  Of course when UB occurs, anything is possible.
  FILE *f = fopen("Some_non_existent_file", "r");
  // Should have tested f == NULL here
  printf("%d\n", fgetc(f) == EOF); // Result may be 1
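As a contrast to the `char` pitfall and the unchecked `fopen()` above, here is a minimal sketch that stores the result in an `int` and verifies the stream before reading. The helper names `count_bytes` and `count_with_ff_byte` are hypothetical, and `tmpfile()` is assumed to succeed in the environment where this runs.

```c
#include <assert.h>
#include <stdio.h>

/* Count the bytes remaining in `stream`; returns -1 on a read error. */
long count_bytes(FILE *stream)
{
    assert(stream != NULL);  /* fgetc(NULL) would be undefined behavior */
    long n = 0;
    int ch;                  /* int, not char: 0xFF stays distinct from EOF */
    while ((ch = fgetc(stream)) != EOF) {
        n++;
    }
    return ferror(stream) ? -1 : n;
}

/* Write 'x', 0xFF, 'y' to a temporary file and count them back.
   Returns -1 if the temporary file cannot be created or written. */
long count_with_ff_byte(void)
{
    FILE *f = tmpfile();
    if (f == NULL) return -1;
    if (fputc('x', f) == EOF || fputc(0xFF, f) == EOF || fputc('y', f) == EOF) {
        fclose(f);
        return -1;
    }
    rewind(f);
    long n = count_bytes(f);
    fclose(f);
    return n;
}
```

With `int ch`, all three bytes are counted. Had `ch` been a signed `char`, the loop would commonly stop at the 0xFF byte, as described above.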
A robust way to handle the return from `fgetc()`.
#include <limits.h> // UCHAR_MAX, INT_MAX
#include <stdio.h>

FILE *inf = ...;
if (inf) { // Add test
  int ch; // USE int !
  // Pedantic considerations, usually can be ignored
  #if UCHAR_MAX > INT_MAX
  clearerr(inf); // Clear history of prior flags
  // EOF with neither indicator set is a character, so keep reading
  while ((ch = fgetc(inf)) != EOF || (!feof(inf) && !ferror(inf))) {
    ;
  }
  #else
  while ((ch = fgetc(inf)) != EOF) {
    ;
  }
  #endif
  if (feof(inf)) puts("End-of-file");
  else puts("Error");
}
If code needs to look for data after end-of-file or error, call `clearerr()` and repeat the `if()` block.
Solution 2:
Another case where EOF doesn't necessarily mean 'no more data' was (rather than 'is') reading magnetic tapes. You could have multiple files on a single tape, with the end of each marked with EOF. When you encountered EOF, you used `clearerr(fp)` to reset the EOF and error states on the file stream, and you could then continue reading the next file on the tape. However, magnetic tapes have (for the most part) gone the way of the dodo, so this barely counts any more.
Solution 3:
Here's one obscure reason:
On Windows, reading byte `0x1A` in text mode causes `EOF`.
By "Windows" I mean both MSVC and MinGW (so it's probably a quirk of Microsoft's CRT). This doesn't happen on Cygwin.
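One way to sidestep this quirk is to open the file in binary mode, where Microsoft's CRT does not treat `0x1A` as an end-of-file marker. The sketch below is an illustration only: the helper name `count_bytes_binary` and the scratch file path are hypothetical. On other platforms `"rb"` and `"r"` generally behave the same, so the code stays portable.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write 'a', 0x1A, 'b' to `path`, then count the
   bytes read back in binary mode. Returns -1 on any failure. */
long count_bytes_binary(const char *path)
{
    assert(path != NULL);
    static const unsigned char data[] = { 'a', 0x1A, 'b' };

    FILE *f = fopen(path, "wb");
    if (f == NULL) return -1;
    size_t written = fwrite(data, 1, sizeof data, f);
    if (fclose(f) == EOF || written != sizeof data) {
        remove(path);
        return -1;
    }

    f = fopen(path, "rb");  /* "r" (text mode) may stop at 0x1A on Windows */
    if (f == NULL) {
        remove(path);
        return -1;
    }
    long n = 0;
    int ch;
    while ((ch = fgetc(f)) != EOF) {
        n++;
    }
    int failed = ferror(f);
    fclose(f);
    remove(path);
    return failed ? -1 : n;
}
```

In binary mode all three bytes come back, including the one after `0x1A`.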