I'm having trouble with a text file being marked as binary

I can answer at least the first question. If you're on Unix/Linux, you can use tr to delete the null bytes:

tr -d '\000' < filein > fileout

where \000 is the octal escape for the null character. You can also strip all non-printable characters the same way; see the examples in "Unix Text Editing: sed, tr, cut, od, awk".
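
As a rough sketch of the "strip all non-printable chars" variant, something like the following should work with a typical tr. The file names sample_in.txt and sample_out.txt are made up for the demonstration, and the set of characters to keep (tab, newline, carriage return, printable ASCII) is my assumption about what you want to preserve:

```shell
# Build a small sample file containing a NUL and a BEL control character
# (sample_in.txt / sample_out.txt are hypothetical names).
printf 'hello\000 wor\007ld\n' > sample_in.txt

# -c complements the set, -d deletes: everything that is NOT
# tab (\11), newline (\12), carriage return (\15) or printable ASCII (\40-\176)
# gets removed.
tr -cd '\11\12\15\40-\176' < sample_in.txt > sample_out.txt

cat sample_out.txt
```

Be aware that this also deletes every byte above 127, so it will mangle accented or other non-ASCII characters in a UTF-8 file. If you only need to get rid of nulls, the plain tr -d '\000' above is the safer choice.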

Regarding your second question, I don't know which programming language you're using, but I'd look for uninitialized variables that could end up being printed to the output file.


I'm going to make a guess....

Your program writes the file in UTF-16, an encoding of Unicode that uses two bytes for most characters (and four for the rest). For plain ASCII text, every second byte is a null.

iconv -f utf-16 -t utf-8 < filein > fileout

will convert it to UTF-8, which most coreutils are comfortable with.
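
Since this is only a guess, it may be worth checking the bytes before converting. Here is a small sketch that fakes a UTF-16 file, dumps it with od, and converts it back. The file names are invented for the example, and I use utf-16le explicitly because the sample has no byte order mark; for a real file that starts with a BOM, plain utf-16 as above should be fine:

```shell
# Create a tiny little-endian UTF-16 sample without a BOM
# (sample_utf16.txt / sample_utf8.txt are hypothetical names).
printf 'hi\n' | iconv -f utf-8 -t utf-16le > sample_utf16.txt

# Dump the bytes: if the guess is right, a \0 shows up after every ASCII character.
od -c sample_utf16.txt | head -n 1

# Convert back to UTF-8, naming the byte order explicitly since there is no BOM.
iconv -f utf-16le -t utf-8 < sample_utf16.txt > sample_utf8.txt

cat sample_utf8.txt
```

If od shows the nulls before each character instead of after, the file is big-endian and utf-16be would be the encoding to pass to -f.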