Having issues writing and reading binary from a .bin file
I'm working on an encoding/decoding program using the Huffman algorithm. Writing the tree to the file works great, but I have run into a problem when writing the encoded characters. For some reason it stops printing to the file if a byte has all bits set to 1.
The example that doesn't work is with the following encoding:
String: AAAAACCEEEEEEEEKDDDD
A: 10
C: 1101
D: 111
E: 0
K: 1100
This gives the encoded string:
101010101011011101000000001100111111111111
This encoded string is stored in an array of unsigned chars.
Before the encoded characters I have 4 bytes containing an unsigned long int, which represents the number of characters to be decoded. The tree itself is placed before the encoded bits.
The full length of bits being printed is the following:
00000000 00000000 00000000 00010100 01010001 01010100 00010010 10010111 01000011
10100010 01010101 01011011 10100000 00011001 11111111 11100000
And my issue lies in writing this encoded string to the .bin file. For some reason it stops writing after a byte with eight 1s. I have tried, for example, changing the string so that it doesn't end up with a char containing only 1s, and then it works just fine.
How I write to the file:
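    #include <stdio.h>

    void write_to_file(const unsigned char *array, const int size, FILE *output);

    int main(void)
    {
        FILE *output = fopen("filename.bin", "wb");
        /* 255 is the all-ones byte shown in the bit listing above */
        unsigned char array[] = {0, 0, 0, 20, 81, 84, 18, 151, 67, 162,
                                 85, 91, 160, 25, 255, 224};
        int size = 16;
        write_to_file(array, size, output);
        fclose(output);
        return 0;
    }

    void write_to_file(const unsigned char *array, const int size, FILE *output)
    {
        fwrite(array, sizeof(char), size, output);
    }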
How I read the file:
    int main(void)
    {
        FILE *input = fopen("filename.bin", "r");
        read_file(input);
        return 0;
    }

    void read_file(FILE *input)
    {
        unsigned long int num;
        fread(&num, sizeof(unsigned long int), 1, input);
        char c;
        while((c = fgetc(input)) != EOF){
            for(int i = 0; i < 8; i++){
                int bit = !!((c << i) & 0x80);
                do_something_with_bit(bit);
            }
        }
    }
I have tried switching between char and unsigned char, since I thought a negative char might be the problem, but it didn't change the outcome.
I can see no reason why it wouldn't work, but I've exhausted every solution I can come up with, and I can't find anything in the documentation saying this isn't possible. Any clues as to why this would not work? Am I writing/reading the information incorrectly? Thanks!
The problem is here:
    char c;
    while((c = fgetc(input)) != EOF){
Change the definition of c to int c.
fgetc returns an int, not a char, and c, being a char, cannot hold all the values of an int. When fgetc reads a character, it returns the code for that character as an unsigned char converted to an int (so you should not use a char type to hold these values). When it encounters an error or end-of-file, it returns EOF.
EOF is commonly −1, although it can be another negative value determined by the C implementation. When fgetc reads a character that is an eight-bit byte with all 1 bits, it returns 255. When 255 is assigned to an eight-bit signed char, the value is converted. The conversion is implementation-defined, but it commonly wraps modulo 256. This means that, for a c declared to be a char, the assignment c = 255 sets c to −1. Then c == EOF is true, and your loop stops.