Reading a file character by character in C

There are a number of things wrong with your code:

char *readFile(char *fileName)
{
    FILE *file;
    char *code = malloc(1000 * sizeof(char));
    file = fopen(fileName, "r");
    do 
    {
      *code++ = (char)fgetc(file);

    } while(*code != EOF);
    return code;
}
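  1. What if the file is larger than 1,000 bytes? Nothing stops the loop from writing past the end of the buffer.
  2. You increment code each time you read a character and then return it to the caller, so it no longer points at the first byte of the memory block returned by malloc. The loop condition has a related problem: *code is tested after the increment, so it looks at a byte that has not been written yet.
  3. You cast the result of fgetc(file) to char before checking for EOF. fgetc returns an int so that EOF can be told apart from every valid character; compare the int against EOF first, then convert it to char.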
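
Point 3 can be illustrated with a small sketch. The helper name countBytes is hypothetical and not part of the original code; it only shows the pattern of keeping the return value of fgetc in an int until it has been compared with EOF. On platforms where char is signed and EOF is -1 (the usual case), a byte with value 0xFF cast to char compares equal to EOF, so a loop that casts first may stop early.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes in a file.
   Returns -1 if the file cannot be opened. */
long countBytes(const char *fileName)
{
    FILE *file = fopen(fileName, "rb");
    long count = 0;
    int c; /* int, not char: must hold EOF as well as any byte value */

    if (file == NULL)
        return -1;

    /* compare the int with EOF before doing anything else with it */
    while ((c = fgetc(file)) != EOF)
        count++;

    fclose(file);
    return count;
}
```

With this pattern a file containing the bytes 'a', 0xFF, 'b' should be counted as three bytes, regardless of whether char is signed.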

It is important to keep the original pointer returned by malloc so that you can free it later. If we ignore the file-size problem for now, the function can be written like this:

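#include <stdio.h>
#include <stdlib.h>

char *readFile(char *fileName)
{
    FILE *file = fopen(fileName, "r");
    char *code;
    size_t n = 0;
    int c;

    if (file == NULL)
        return NULL; // could not open file

    code = malloc(1000);
    if (code == NULL)
    {
        fclose(file);
        return NULL; // out of memory
    }

    // stop after 999 characters so there is room for the terminator
    while (n < 999 && (c = fgetc(file)) != EOF)
    {
        code[n++] = (char) c;
    }

    // don't forget to terminate with the null character
    code[n] = '\0';

    fclose(file);
    return code; // the caller is responsible for calling free()
}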

To handle larger files, find out the file size before allocating the buffer. There are various system calls that will give you the size of a file; on POSIX systems a common one is stat.
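
A minimal sketch of the stat approach, assuming a POSIX system. The helper name fileSize is hypothetical; st_size is only meaningful for regular files, so treat the result with care for pipes or devices.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: returns the size of a file in bytes using POSIX stat,
   or -1 if stat fails (for example, because the file does not exist). */
long long fileSize(const char *fileName)
{
    struct stat st;

    if (stat(fileName, &st) != 0)
        return -1;

    return (long long) st.st_size;
}
```

The result plus one (for the terminating '\0') could then be passed to malloc instead of the fixed 1000.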


Expanding on the code above from @dreamlax:

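char *readFile(char *fileName) {
    FILE *file = fopen(fileName, "r");
    char *code;
    size_t n = 0;
    int c;

    if (file == NULL) return NULL; // could not open file
    fseek(file, 0, SEEK_END);
    long f_size = ftell(file);
    fseek(file, 0, SEEK_SET);
    if (f_size < 0) { fclose(file); return NULL; } // could not get size

    code = malloc((size_t)f_size + 1); // +1 for the terminating '\0'
    if (code == NULL) { fclose(file); return NULL; } // out of memory

    while (n < (size_t)f_size && (c = fgetc(file)) != EOF) {
        code[n++] = (char)c;
    }

    code[n] = '\0';

    fclose(file);
    return code; // the caller is responsible for calling free()
}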

This uses fseek and ftell to get the length of the file, then reads it character by character. The buffer must hold one byte more than the file length so that the terminating '\0' fits.
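
The fseek/ftell approach needs a seekable file, so it does not work for input such as pipes. As an alternative (a different technique from the one above, sketched here under the assumption that doubling the buffer is acceptable), the buffer can grow with realloc while reading. The name readAll and the starting capacity of 64 are arbitrary choices for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical variant: reads a whole file without knowing its size,
   doubling the buffer whenever it fills up. Caller must free the result. */
char *readAll(const char *fileName)
{
    FILE *file = fopen(fileName, "r");
    size_t cap = 64;  /* current buffer capacity (arbitrary starting value) */
    size_t n = 0;     /* number of characters stored so far */
    char *code;
    int c;

    if (file == NULL)
        return NULL;

    code = malloc(cap);
    if (code == NULL) {
        fclose(file);
        return NULL;
    }

    while ((c = fgetc(file)) != EOF) {
        if (n + 1 >= cap) {  /* always keep one byte free for '\0' */
            char *tmp = realloc(code, cap * 2);
            if (tmp == NULL) {
                free(code);
                fclose(file);
                return NULL;
            }
            code = tmp;
            cap *= 2;
        }
        code[n++] = (char) c;
    }

    code[n] = '\0';
    fclose(file);
    return code;
}
```

Assigning the result of realloc to a temporary first means the original block can still be freed if the reallocation fails.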


Here's one simple way to ignore everything but valid brainfuck characters:

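#include <string.h>

#define BF_VALID "+-><[].,"

// strchr also matches the terminating '\0', so exclude it explicitly
if (c != '\0' && strchr(BF_VALID, c))
    code[n++] = (char)c;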
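
Put together with the reading loop, the filter might look like the following sketch. The function name readBrainfuck is hypothetical; it sizes the buffer with fseek/ftell as in the earlier answer, relying on the fact that the filtered program cannot be longer than the file itself.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BF_VALID "+-><[].,"

/* Hypothetical helper: reads a file and keeps only brainfuck commands.
   Returns NULL on failure. Caller must free the result. */
char *readBrainfuck(const char *fileName)
{
    FILE *file = fopen(fileName, "r");
    char *code;
    long f_size;
    size_t n = 0;
    int c;

    if (file == NULL)
        return NULL;

    /* find the file length; the filtered text is never longer than this */
    if (fseek(file, 0, SEEK_END) != 0 || (f_size = ftell(file)) < 0) {
        fclose(file);
        return NULL;
    }
    rewind(file);

    code = malloc((size_t) f_size + 1); /* +1 for the terminating '\0' */
    if (code == NULL) {
        fclose(file);
        return NULL;
    }

    while (n < (size_t) f_size && (c = fgetc(file)) != EOF) {
        /* strchr also matches '\0', so rule that out first */
        if (c != '\0' && strchr(BF_VALID, c))
            code[n++] = (char) c;
    }

    code[n] = '\0';
    fclose(file);
    return code;
}
```

Comments, whitespace and newlines in the source file are dropped, so the interpreter only ever sees the eight command characters.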