Read from file or stdin
I am writing a utility which accepts either a filename, or reads from stdin.
I would like to know the most robust / fastest way of checking whether stdin exists (data is being piped to the program) and, if so, reading that data in. If it doesn't exist, the processing will take place on the filename given. I have tried using the following to test for the size of stdin, but I believe that since it's a stream and not an actual file, it's not working as I expected, and it always prints -1. I know I could always read the input one character at a time while != EOF, but I would like a more generic solution so that I end up with either an fd or a FILE* if stdin exists, and the rest of the program functions seamlessly. I would also like to be able to know its size, provided the stream has been closed by the previous program.
#include <stdio.h>
#include <stdlib.h>

long getSizeOfInput(FILE *input) {
    long retvalue = 0;
    fseek(input, 0L, SEEK_END);
    retvalue = ftell(input);
    fseek(input, 0L, SEEK_SET);
    return retvalue;
}

int main(int argc, char **argv) {
    printf("Size of stdin: %ld\n", getSizeOfInput(stdin));
    exit(0);
}
Terminal:
$ echo "hi!" | myprog
Size of stdin: -1
Solution 1:
You're thinking about it the wrong way around.
What you are trying to do:
If stdin exists use it, else check whether the user supplied a filename.
What you should be doing instead:
If the user supplies a filename, then use the filename. Else use stdin.
You cannot know the total length of an incoming stream unless you read it all and keep it buffered. You just cannot seek backwards into pipes. This is a limitation of how pipes work. Pipes are not suitable for all tasks and sometimes intermediate files are required.
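To illustrate the "read it all and keep it buffered" option, here is a minimal sketch. The helper name slurp and the 4096-byte starting capacity are my own choices, not anything from the question; it simply grows a heap buffer until fread reports a short read.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read everything from `in` until EOF into a
 * malloc'd buffer. On success returns the buffer (caller frees it) and
 * stores the byte count in *len. Returns NULL on read or allocation
 * failure. */
char *slurp(FILE *in, size_t *len) {
    size_t cap = 4096, used = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    for (;;) {
        used += fread(buf + used, 1, cap - used, in);
        if (used < cap)
            break;                      /* short read: EOF or error */

        /* Buffer is full: double it, guarding against overflow. */
        if (cap > (size_t)-1 / 2) {
            free(buf);
            return NULL;
        }
        char *tmp = realloc(buf, cap * 2);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }

    if (ferror(in)) {
        free(buf);
        return NULL;
    }
    *len = used;
    return buf;
}
```

Once slurp(stdin, &len) returns, len is the total size of the piped input and the data can be processed from memory as often as needed. For input too large to hold in memory, an intermediate file is the usual fallback.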
Solution 2:
First, ask the program to tell you what is wrong by checking errno, which is set on failure, such as during fseek or ftell.
Others (tonio & LatinSuD) have explained the mistake with handling stdin versus checking for a filename. Namely, first check argc (the argument count) to see whether any command-line parameters were specified, if (argc > 1), treating - as a special case meaning stdin.
If no parameters are specified, then assume the input is going to come from stdin, which is typically a pipe or terminal stream rather than a file on disk, so the fseek function fails on it.
In the case of a stream, where you cannot use file-on-disk oriented library functions (i.e. fseek and ftell), you simply have to count the number of bytes read (including trailing newline characters) until receiving EOF (end-of-file).
For large files you could speed this up by using fgets to read into a char array, which reads the bytes of a (text) file more efficiently. For a binary file you need to open it with fopen(filename, "rb") and use fread instead of fgetc/fgets.
You could also check feof(stdin) / ferror(stdin) when using the byte-counting method, to detect any errors when reading from a stream.
The sample below should be C99 compliant and portable.
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>

long getSizeOfInput(FILE *input) {
    long retvalue = 0;
    int c;

    if (input != stdin) {
        if (-1 == fseek(input, 0L, SEEK_END)) {
            fprintf(stderr, "Error seek end: %s\n", strerror(errno));
            exit(EXIT_FAILURE);
        }
        if (-1 == (retvalue = ftell(input))) {
            fprintf(stderr, "ftell failed: %s\n", strerror(errno));
            exit(EXIT_FAILURE);
        }
        if (-1 == fseek(input, 0L, SEEK_SET)) {
            fprintf(stderr, "Error seek start: %s\n", strerror(errno));
            exit(EXIT_FAILURE);
        }
    } else {
        /* for stdin, we need to read in the entire stream until EOF */
        while (EOF != (c = fgetc(input))) {
            retvalue++;
        }
    }

    return retvalue;
}

int main(int argc, char **argv) {
    FILE *input;

    if (argc > 1) {
        if (!strcmp(argv[1], "-")) {
            input = stdin;
        } else {
            input = fopen(argv[1], "r");
            if (NULL == input) {
                fprintf(stderr, "Unable to open '%s': %s\n",
                        argv[1], strerror(errno));
                exit(EXIT_FAILURE);
            }
        }
    } else {
        input = stdin;
    }

    printf("Size of file: %ld\n", getSizeOfInput(input));

    return EXIT_SUCCESS;
}
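For the fread variant mentioned above, a minimal sketch could look like the following. The name count_bytes and the 8192-byte chunk size are arbitrary choices of mine. It counts in chunks rather than byte by byte, and uses ferror to tell a read error apart from a normal EOF.

```c
#include <stdio.h>

/* Hypothetical helper: count the bytes remaining on `in` by reading
 * fixed-size chunks. Returns the byte count, or -1 if a read error
 * occurred. Works for binary data, including embedded NUL bytes. */
long count_bytes(FILE *in) {
    char buf[8192];
    long total = 0;
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0)
        total += (long)n;

    return ferror(in) ? -1L : total;
}
```

Note that this consumes the stream, so on a pipe the data is gone once it has been counted. If you need the contents afterwards, keep them in a buffer while counting.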
Solution 3:
You may want to look at how this is done in the cat utility, for example.
See code here.
If there is no filename as argument, or it is "-", then stdin is used for input.
stdin will be there even if no data is pushed to it (but then your read call may wait forever).
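The convention can be sketched in a few lines. This is not cat's actual source, just an illustration, and the helper names open_input and close_input are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns stdin when `name` is NULL or "-",
 * otherwise opens the named file for reading. Returns NULL (with errno
 * set by fopen) if the file cannot be opened. */
FILE *open_input(const char *name) {
    if (name == NULL || strcmp(name, "-") == 0)
        return stdin;
    return fopen(name, "rb");
}

/* Close only what open_input() actually opened, never stdin. */
void close_input(FILE *fp) {
    if (fp != NULL && fp != stdin)
        fclose(fp);
}
```

In main you would call it as open_input(argc > 1 ? argv[1] : NULL), and the rest of the program works on the returned FILE* without caring where it came from.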
Solution 4:
You can just read from stdin unless the user supplies a filename.
If not, treat the special "filename" - as meaning "read from stdin". The user would then have to start the program like cat file | myprogram - to pipe data to it, and myprogram file to have it read from a file.
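#include <stdio.h>
#include <string.h>

void usage(void);  /* prints a usage message; defined elsewhere */

int main(int argc, char *argv[]) {
    FILE *input;

    if (argc != 2) {
        usage();
        return 1;
    }
    if (!strcmp(argv[1], "-")) {
        input = stdin;
    } else {
        input = fopen(argv[1], "rb");
        if (input == NULL) {
            perror(argv[1]);
            return 1;
        }
    }
    /* ... process input ... */
    return 0;
}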
If you're on *nix, you can check whether stdin is a fifo:
/* requires #include <sys/stat.h> */
struct stat st_info;
if (fstat(0, &st_info) != 0) {
    /* error */
}
if (S_ISFIFO(st_info.st_mode)) {
    /* stdin is a pipe */
}
Though that won't handle the user doing myprogram <file.
You can also check whether stdin is a terminal/console (isatty is declared in <unistd.h>):
if (isatty(0)) {
    /* stdin is a terminal */
}
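Putting those checks together, and also covering the myprogram <file case, might look roughly like this on a POSIX system. The enum and the function name classify_fd are invented for the example. For a regular file, fstat also gives you the size up front.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical classification of where a descriptor's data comes from. */
enum input_kind { INPUT_ERROR, INPUT_TTY, INPUT_PIPE, INPUT_FILE, INPUT_OTHER };

/* Classify file descriptor `fd` (0 for stdin). For a regular file,
 * *size (if not NULL) receives the file size in bytes; in every other
 * case it is set to -1 because the size cannot be known in advance. */
enum input_kind classify_fd(int fd, off_t *size) {
    struct stat st;

    if (size != NULL)
        *size = -1;
    if (fstat(fd, &st) != 0)
        return INPUT_ERROR;
    if (S_ISREG(st.st_mode)) {          /* e.g. myprogram <file */
        if (size != NULL)
            *size = st.st_size;
        return INPUT_FILE;
    }
    if (S_ISFIFO(st.st_mode))           /* e.g. cat file | myprogram */
        return INPUT_PIPE;
    if (isatty(fd))                     /* interactive terminal */
        return INPUT_TTY;
    return INPUT_OTHER;
}
```

Keep in mind that st_size is the size of the whole file, regardless of the current read offset, and that a pipe still has to be read to EOF before its length is known.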