Why isn't a binary file shown as 0s and 1s?

Solution 1:

When dealing with computers, there are two interpretations of the word binary.

  1. In terms of number system, it refers to a base 2 number system which uses two symbols, 0 and 1.

  2. When speaking of a file, it refers to a file containing non-textual data (executables, libraries, data files etc.).

A binary file that can be run as a process is called an executable binary.

The fact that a file is binary doesn't mean it will be displayed as 0's and 1's. There are layers of abstraction between how a file is stored and how a computer displays it.

Showing a binary file as 0's and 1's would make the output unnecessarily lengthy and is rarely useful. A text editor instead shows a binary file according to the default character encoding set for that editor.

Also, if an editor were configured to show binary output, it would display every file, even plain text ones, as 0's and 1's. Everything ultimately boils down to binary when stored in a computer.

To view a file in its most basic form of 0's and 1's, you'll need a tool or editor mode capable of displaying the raw bits of a file. One way is the xxd command included with macOS. Type the following command in Terminal:

xxd -b filename

$ xxd -b a.out | head
00000000: 11001111 11111010 11101101 11111110 00000111 00000000  ......
00000006: 00000000 00000001 00000011 00000000 00000000 10000000  ......
0000000c: 00000010 00000000 00000000 00000000 00001111 00000000  ......
00000012: 00000000 00000000 11000000 00000100 00000000 00000000  ......
00000018: 10000101 00000000 00100000 00000000 00000000 00000000  .. ...
0000001e: 00000000 00000000 00011001 00000000 00000000 00000000  ......
00000024: 01001000 00000000 00000000 00000000 01011111 01011111  H...__
0000002a: 01010000 01000001 01000111 01000101 01011010 01000101  PAGEZE
00000030: 01010010 01001111 00000000 00000000 00000000 00000000  RO....
00000036: 00000000 00000000 00000000 00000000 00000000 00000000  ......

It will display the binary dump of filename on standard output. This works equally for both binary and text files.
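To make the dump format less mysterious, here is a minimal Python sketch of roughly what xxd -b prints for each row: an offset, eight bits per byte, and a text column where non-printable bytes become dots. The function name bit_dump and its width parameter are made up for illustration, and this is not how xxd itself is implemented (for example, it does not pad a short final row).

```python
def bit_dump(data: bytes, width: int = 6) -> str:
    """Format bytes roughly like `xxd -b`: offset, bits, printable text."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Each byte is written as exactly eight binary digits.
        bits = " ".join(f"{byte:08b}" for byte in chunk)
        # Printable ASCII is shown as-is; everything else becomes a dot.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}: {bits}  {text}")
    return "\n".join(lines)


# The first six bytes of the a.out dump shown above.
print(bit_dump(bytes([0xCF, 0xFA, 0xED, 0xFE, 0x07, 0x00])))
```

Running it on those six bytes reproduces the first row of the xxd -b output above.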

A more compact and more commonly used form is hexadecimal, a base 16 number system (0-9, A-F) in which two digits represent one byte. To show the contents of filename in hexadecimal, run xxd with just the filename as its argument in Terminal:

xxd filename

$ xxd a.out | head
00000000: cffa edfe 0700 0001 0300 0080 0200 0000  ................
00000010: 0f00 0000 c004 0000 8500 2000 0000 0000  .......... .....
00000020: 1900 0000 4800 0000 5f5f 5041 4745 5a45  ....H...__PAGEZE
00000030: 524f 0000 0000 0000 0000 0000 0000 0000  RO..............
00000040: 0000 0000 0100 0000 0000 0000 0000 0000  ................
00000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000060: 0000 0000 0000 0000 1900 0000 d801 0000  ................
00000070: 5f5f 5445 5854 0000 0000 0000 0000 0000  __TEXT..........
00000080: 0000 0000 0100 0000 0010 0000 0000 0000  ................
00000090: 0000 0000 0000 0000 0010 0000 0000 0000  ................
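The hexadecimal view can be sketched the same way in Python. The helper name hex_dump is hypothetical and the layout only approximates xxd's default output; the point is that each pair of hex digits stands for one byte, so a row of 16 bytes fits where the bit view managed only 6.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Format bytes roughly like plain `xxd`: offset, hex pairs, text."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hexed = chunk.hex()  # two hex digits per byte
        # Group the digits two bytes at a time, as in the output above.
        groups = " ".join(hexed[i:i + 4] for i in range(0, len(hexed), 4))
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}: {groups}  {text}")
    return "\n".join(lines)


# The 16 bytes from the row at offset 0x20 in the dump above.
sample = bytes.fromhex("19000000480000005f5f504147455a45")
print(hex_dump(sample))
```

Apart from the offset (this sketch starts counting at zero), the printed row matches the one containing __PAGEZE above.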

Solution 2:

Nearly all modern computers deal with bytes rather than individual bits. A single byte, as you may know, can store any of 256 different values, from eight zeroes to eight ones.

When you open the binary file in a text editor, it is showing you these byte-sized chunks instead of each individual bit. The symbols it picks are determined by your editor's default encoding. Often, one character in your editor corresponds to one byte in the actual file, though there are special cases.
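As a small illustration of how the encoding decides which symbols appear, the Python sketch below decodes the same four bytes in two ways. The byte values are chosen purely for the example. Latin-1 maps every byte to exactly one character, while UTF-8 can combine several bytes into one character, which is one reason the one-character-per-byte rule only holds "often".

```python
# Four bytes chosen for illustration: "H", "i", then 0xC3 0xA9.
data = bytes([0x48, 0x69, 0xC3, 0xA9])

as_latin1 = data.decode("latin-1")  # one character per byte
as_utf8 = data.decode("utf-8")      # 0xC3 0xA9 together form one character

print(len(data), len(as_latin1), len(as_utf8))
print(as_latin1)
print(as_utf8)
```

The bytes in the file never change; only the editor's interpretation of them does.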

If you see a string of readable text, such as __stub_helper, it means that particular text is stored as-is within the binary file.

The special cases I mentioned before are so-called control characters, which are displayed with an escape code. Escape codes, as seen here, begin with ^ and are followed by a single additional character. Such a pair, for example ^@, is taken together to represent a single byte. In fact, ^@ represents the value zero, meaning the bits at that location are eight zeroes.
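The ^ escape codes follow a simple convention known as caret notation: a control byte is shown as ^ followed by the character whose code is 64 higher. A rough Python sketch of that mapping follows; the function names are made up, and the \x fallback for bytes above 127 is just a choice for this example, since real editors differ in how they show those.

```python
def caret_notation(byte: int) -> str:
    """Render one byte the way many editors show control characters."""
    if byte < 0x20:
        # Control codes 0-31 map to ^@, ^A, ... ^_ (code + 64).
        return "^" + chr(byte + 0x40)
    if byte == 0x7F:
        return "^?"  # DEL is conventionally shown as ^?
    if byte < 0x7F:
        return chr(byte)  # printable ASCII is shown as-is
    return f"\\x{byte:02x}"  # assumption: this sketch's own fallback


def render(data: bytes) -> str:
    return "".join(caret_notation(b) for b in data)


print(render(b"\x00\x00\x00RO"))
```

Three zero bytes followed by the letters RO come out as ^@^@^@RO, the kind of run that shows up when a binary file is opened as text.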

The reason that your text editor displays the binary file in this manner is that it simply displays all files in this manner. If you were to use a hex editor, it would display any and all files in hexadecimal instead. In fact, there's no fundamental difference between the contents of a binary file and the contents of a text file -- it's the metadata and file headers that let your computer know which is which.

Solution 3:

Start vi in binary mode. From there you can run xxd to get a hex view or a binary view, and edit the file as you normally would. (Of course most system binaries are read-only, but that comes down to permissions/SIP rather than the editor.)

vi -b /bin/ps

Then, to convert the buffer to 1's and 0's:

:%!xxd -b

Then you can see all the Mach-O executable binary goodness, right from within the editor. If you drop the -b, you get the more typical hexadecimal representation, which is more space efficient. Either way, you no longer see the values mapped to ASCII-derived characters, where so many values end up as ^@^@^@ when you start from a plain-text-centric default.