Difference between a text file and a binary file

At the bottom level, they are all bits... true. However, some transmission channels carry seven bits per byte, and others carry eight. If you transmit ASCII text over a seven-bit channel, then all is fine, because ASCII only uses seven bits per character. Binary data gets mangled, because the eighth bit of each byte is typically stripped or altered along the way.
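To make the seven-bit point concrete, here is a minimal sketch. The function `through_7bit_channel` is hypothetical: it imitates a channel that simply clears the top bit of every byte, which is only one of the ways a real 7-bit link might behave. The binary sample is the 8-byte PNG file signature, chosen just because its first byte has the high bit set.

```python
def through_7bit_channel(data: bytes) -> bytes:
    """Simulate a hypothetical 7-bit channel by clearing the high bit of each byte."""
    return bytes(b & 0x7F for b in data)


# Plain ASCII text: every byte is below 0x80, so nothing changes.
ascii_text = "Hello, world!\n".encode("ascii")

# Binary sample: the PNG signature starts with 0x89, which has the high bit set.
png_signature = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

text_survives = through_7bit_channel(ascii_text) == ascii_text
binary_survives = through_7bit_channel(png_signature) == png_signature

print(text_survives)    # True
print(binary_survives)  # False: 0x89 arrived as 0x09
```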

Additionally, different systems use different conventions for line endings: LF and CRLF are common, but some systems use CR or NEL. A text transmission mode will convert line endings automatically, which will damage binary files.
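As a rough illustration of why automatic line-ending conversion is harmless for text but destructive for binary data, the sketch below applies a naive LF-to-CRLF rewrite to both. The helper `to_crlf` is made up for this example and is not how any particular FTP client implements its text mode.

```python
def to_crlf(data: bytes) -> bytes:
    """Naive text-mode conversion: normalise CRLF to LF, then write every LF as CRLF."""
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


# For text, the conversion only changes how line breaks are spelled.
text = b"line one\nline two\n"
converted_text = to_crlf(text)

# For binary data, a 0x0A byte is just a value that happens to equal LF.
# Inserting 0x0D before it changes the length and shifts everything after it.
binary = bytes([0x00, 0x0A, 0xFF, 0x0A, 0x10])
converted_binary = to_crlf(binary)

print(converted_text)                       # b'line one\r\nline two\r\n'
print(len(binary), len(converted_binary))   # 5 7
```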

However, this is all mostly of historical interest these days. Most transmission channels are eight bit (such as HTTP) and most users are fine with whatever line ending they get.

Some examples of 7-bit channels: SMTP (nominally, without extensions), SMS, Telnet, some serial connections. The internet wasn't always built on TCP/IP, and it shows.

Additionally, the HTTP/1.1 spec (RFC 2616, section 3.7.1) states:

When in canonical form, media subtypes of the "text" type use CRLF as the text line break. HTTP relaxes this requirement and allows the transport of text media with plain CR or LF alone representing a line break when it is done consistently for an entire entity-body.


All files are saved in one of two file formats - binary or text. The two file types may look the same on the surface, but their internal structures are different.

While both binary and text files contain data stored as a series of bits (binary values of 1s and 0s), the bits in text files represent characters, while the bits in binary files represent custom data.
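A small sketch of the same idea: the four bytes below are an arbitrary example, and the little-endian 32-bit integer is just one of many possible "custom data" layouts a binary format might define.

```python
import struct

raw = b"ABCD"  # four arbitrary example bytes

# Read as text: each byte maps to one ASCII character.
as_text = raw.decode("ascii")

# Read as binary: the same four bytes form one little-endian unsigned 32-bit integer.
(as_int,) = struct.unpack("<I", raw)

print(as_text)      # ABCD
print(hex(as_int))  # 0x44434241
```

The bytes are identical in both cases; only the agreed interpretation differs.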


Distinguishing between the two is important because different OSs treat text files differently. For example, in *nix you end your lines with just \n, in MS OSs you use \r\n, and on classic Mac OS (before OS X) you used \r; modern macOS uses \n like other *nix systems. Software such as FTP clients tries to change the line endings of text files to match the destination OS by adding or removing these characters. This is to make sure that the text file displays properly on the destination OS.

For example, if you create a text file with line breaks in *nix, copy it to a Windows box as a binary file, and open it in an older version of Notepad, you will not see any of the line breaks, just one long run of text.
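The Notepad scenario can be imitated in a few lines. The viewer function here is hypothetical and only stands in for any program that recognises CRLF, and nothing else, as a line break.

```python
def lines_seen_by_crlf_only_viewer(raw: bytes) -> list:
    """Hypothetical viewer that treats only CRLF as a line break."""
    return raw.split(b"\r\n")


unix_file = b"first\nsecond\nthird"

# Binary transfer: bytes arrive untouched, so the viewer finds no CRLF at all.
copied_as_binary = unix_file

# Text transfer: every LF is rewritten as CRLF for the destination OS.
copied_as_text = unix_file.replace(b"\n", b"\r\n")

print(len(lines_seen_by_crlf_only_viewer(copied_as_binary)))  # 1
print(len(lines_seen_by_crlf_only_viewer(copied_as_text)))    # 3
```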


Important to add to the answers already provided is that text files and binary files both consist of bytes, but text files differ from binary files in that the bytes are understood to represent characters. The mapping of bytes to characters is applied consistently over the whole file, using a certain code page or a Unicode encoding. With 7-bit or 8-bit code pages you can spin the dial when reading these files and interpret them with an English alphabet, a German alphabet, a Russian alphabet, or others. Spinning the dial doesn't affect the bytes; it only affects which characters are chosen to correspond to them.
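Spinning the dial can be tried directly. In this sketch the three byte values are picked for illustration, and Latin-1 and Windows-1251 stand in for a "German" and a "Russian" code page; other code pages would give other results.

```python
# Three bytes with the high bit set, chosen for illustration.
raw = bytes([0xE4, 0xF6, 0xFC])

# Same bytes, two different code pages.
as_western = raw.decode("latin-1")   # German-style reading
as_cyrillic = raw.decode("cp1251")   # Russian-style reading

print(as_western)   # äöü
print(as_cyrillic)  # дць

# The bytes themselves never changed; encoding back with the
# matching code page returns the original data either way.
print(as_western.encode("latin-1") == raw)   # True
print(as_cyrillic.encode("cp1251") == raw)   # True
```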

As others have stated, there is also the issue of the encoding of line break separators which is unique to text files and which may differ from platform to platform. The "line break" is not a letter in our alphabet or a symbol you can write, so other rules apply to it.

With binary files there is no implicit convention on character encoding or on the definition of a "line".