Proof of Closed Form Formula to convert a binary number to it's Gray code

Solution 1:

Let the bits of an $n$ bit binary number $x$ be $b_{n-1},$ $b_{n-2},$ $\ldots,$ $b_0.$ The bits of $\lfloor x/2\rfloor$ are then $0,$ $b_{n-1},$ $b_{n-2},$ $\ldots,$ $b_1.$ Let the bits of the encoding of $x$ be $c_{n-1},$ $c_{n-2},$ $\ldots,$ $c_0.$ Then the rule for encoding, $$x\mapsto x\oplus\lfloor x/2\rfloor,$$ is given by the sum $$\begin{array}{cccccc} & b_{n-1} & b_{n-2} & b_{n-3} & \ldots & b_0\\ \oplus & 0 & b_{n-1} & b_{n-2} & \ldots & b_1\\ \hline & c_{n-1} & c_{n-2} & c_{n-3} & \ldots & c_0. \end{array}$$

Note that encoding is a one-to-one map from the $n$ bit binary numbers to the $n$ bit binary numbers. This follows from the decoding rule, which can be represented as follows $$\begin{array}{cccccc} & c_{n-1} & c_{n-2} & c_{n-3} & \ldots & c_0\\ \oplus & 0 & c_{n-1} & c_{n-2} & \ldots & c_1\\ \oplus & 0 & 0 & c_{n-1} & \ldots & c_2\\ & \vdots & \vdots & \vdots & & \vdots\\ \oplus & 0 & 0 & 0 & \ldots & c_{n-1}\\ \hline & b_{n-1} & b_{n-2} & b_{n-3} & \ldots & b_0. \end{array}$$ Why this works can be seen from an example: $$ \begin{aligned} c_{n-3}\oplus c_{n-2}\oplus c_{n-1}&=(b_{n-3}\oplus b_{n-2})\oplus (b_{n-2}\oplus b_{n-1})\oplus b_{n-1}\\ &=b_{n-3}\oplus(b_{n-2}\oplus b_{n-2})\oplus (b_{n-1}\oplus b_{n-1})=b_{n-3}. \end{aligned} $$ Formally, the decoding rule is given by $$x\mapsto x\oplus\lfloor x/2\rfloor\oplus\lfloor x/4\rfloor\oplus\ldots\oplus\lfloor x/2^{n-1}\rfloor.$$

The main thing to be proved, however, is that this is a Gray code, that is, that the encodings of $x$ and of $x+1$ differ in exactly one bit. Let $x$ be a binary number between $0$ and $2^n-2.$ How do the bits of $x$ and $x+1$ compare? If $f$ is the index of the rightmost $0$ bit in $x,$ so that the last $f+1$ bits of $x$ are $011\ldots1,$ then, by the carry rule, the last $f+1$ bits of $x+1$ are $100\ldots0.$ So let's compare the encodings of $x$ $$\begin{array}{cccccccccc} & b_{n-1} & b_{n-2} & \ldots & b_{f+1} & 0 & 1 & 1 & \ldots & 1\\ \oplus & 0 & b_{n-1} & \ldots & b_{f+2} & b_{f+1} & 0 & 1 & \ldots & 1\\ \hline & c_{n-1} & c_{n-2} & \ldots & c_{f+1} & b_{f+1} & 1 & 0 & \ldots & 0 \end{array}$$ and of $x+1$ $$\begin{array}{cccccccccc} & b_{n-1} & b_{n-2} & \ldots & b_{f+1} & 1 & 0 & 0 & \ldots & 0\\ \oplus & 0 & b_{n-1} & \ldots & b_{f+2} & b_{f+1} & 1 & 0 & \ldots & 0\\ \hline & c_{n-1} & c_{n-2} & \ldots & c_{f+1} & b_{f+1}+1 & 1 & 0 & \ldots & 0. \end{array}$$ One can see that the only bit that differs in these two sums is the bit with index $f.$

Added: Another way of characterizing this code is given in the Wikipedia article linked to in the original post; it uses a recursive procedure for listing the codewords of length $n+1,$ given the list of codewords of length $n.$ That procedure and the procedure above are equivalent. Here's an explanation:

First observe that if $x$ is an $n$-bit number then $x$ can also be regarded as an $n+1$-bit number by padding the binary representation with an extra $0$ on the left. Examining the procedure above, the $n+1$-bit Gray encoding of $x$ will also be the same as the $n$-bit Gray encoding except for an extra $0$ on the left.

Let the complement of bit $b,$ that is, the bit $1\oplus b,$ be denoted $\bar{b}.$ If $b_{n-1}b_{n-2}\ldots b_0$ represents the integer $x,$ then the complement, $\bar{b}_{n-1}\bar{b}_{n-2}\ldots\bar{b}_0,$ represents the integer $2^n-1-x.$ Let's examine the Gray encoding of the complement. If, as before, the Gray encoding of $b_{n-1}b_{n-2}\ldots b_0$ is $c_{n-1}c_{n-2}\ldots c_0,$ then the Gray encoding of $\bar{b}_{n-1}\bar{b}_{n-2}\ldots\bar{b}_0$ is $$\begin{array}{cccccc} & \bar{b}_{n-1} & \bar{b}_{n-2} & \bar{b}_{n-3} & \ldots & \bar{b}_0\\ \oplus & 0 & \bar{b}_{n-1} & \bar{b}_{n-2} & \ldots & \bar{b}_1\\ \hline & \bar{c}_{n-1} & c_{n-2} & c_{n-3} & \ldots & c_0. \end{array}$$ This implies that the Gray encoding of $x$ is the same as the Gray encoding of $2^n-1-x$ except for the highest order bit.

Therefore, if the $n+1$-bit Gray encodings of $0$, $1$, $\ldots,$ $2^{n+1}-1$ are listed in order, we get (for $n+1=4$) $$\begin{array}{c|ccc} 0&0&0&0\\ 0&0&0&1\\ 0&0&1&1\\ 0&0&1&0\\ 0&1&1&0\\ 0&1&1&1\\ 0&1&0&1\\ 0&1&0&0\\ \hline 1&1&0&0\\ 1&1&0&1\\ 1&1&1&1\\ 1&1&1&0\\ 1&0&1&0\\ 1&0&1&1\\ 1&0&0&1\\ 1&0&0&0, \end{array}$$ which consists of the codewords of length $n$ padded on the left with $0$ and listed in order, followed by the codewords of length $n$ padded on the left with $1$ and listed in reverse order. This is the recursive procedure given in the Wikipedia article for listing the Gray codewords of length $n+1$. Because this procedure involves reversing the list of words of length $n,$ this Gray code is called the binary reflected Gray code.