Addition and multiplication in a Galois Field

I am attempting to generate QR codes on an extremely limited embedded platform. Everything in the specification seems fairly straightforward except for generating the error correction codewords. I have looked at a bunch of existing implementations, and they all try to implement a bunch of polynomial math that goes straight over my head, particularly with regards to the Galois fields. The most straightforward way I can see, both in mathematical complexity and in memory requirements is a circuit concept that is laid out in the spec itself:

circuit diagram

With their description, I am fairly confident I could implement this with the exception of the parts labeled GF(256) addition and GF(256) Multiplication.

They offer this help:

The polynomial arithmetic for QR Code shall be calculated using bit-wise modulo 2 arithmetic and byte-wise modulo 100011101 arithmetic. This is a Galois field of 2^8 with 100011101 representing the field's prime modulus polynomial x^8+x^4+x^3+x^2+1.

which is all pretty much greek to me.

So my question is this: What is the easiest way to perform addition and multiplication in this kind of Galois field arithmetic? Assume both input numbers are 8 bits wide, and my output needs to be 8 bits wide also. Several implementations precalculate, or hardcode in two lookup tables to help with this, but I am not sure how those are calculated, or how I would use them in this situation. I would rather not take the 512 byte memory hit for the two tables, but it really depends on what the alternative is. I really just need help understanding how to do a single multiplication and addition operation in this circuit.


For your purposes, the elements of $GF(256)$ are polynomials in $x$ of degree $\le 7$ with coefficients in $GF(2)$. $GF(2)$ is just $\{0,1\}$ with binary addition and multiplication, but there are no carries: 1+1=0 in $GF(2)$.

So for example, $x^4 + x^3 + 1$, $x^7 + x^2 + x$, $x^5 + x^3 + 1$, are all elements of $GF(256)$. There are 256 elements in all (hence the name $GF(256)$).

Addition of 2 polynomials in $GF(256)$ is straightforward. For example: $(x^4 + x^3 + 1) + (x^3 + x^2 + 1) = x^4 + x^2$. This is just normal addition of polynomials, but the coefficients of the calculations take place in $GF(2)$. So when I added the 2 $x^3$ terms together, the coefficient became 1+1=0 (so the $x^3$ term disappeared altogether). As bit sequences this would be: 11001 + 1101 = 10100. This is straightforward to implement in most programming languages: It's just an XOR of the bit sequences.

To multiply 2 polynomials in $GF(256)$, first you multiply the polynomials just like ordinary polynomials but again, remembering that the calculations take place in $GF(2)$. Then divide the result by $x^8+x^4+x^3+x^2+1$ and take the remainder. Unlike addition, this is less straightforward to implement from scratch. (You can't use the usual multiplication operator in your programming language to do the multiplication, because ordinary multiplication has carries, but $GF(2)$ does not.)

See the example here which is a slightly different polynomial (they use $x^8+x^4+x^3+x+1$ instead of $x^8+x^4+x^3+x^2+1$), but it's the same process. There is also a description of a multiplication algorithm there (you'll have to change it slightly for your polynomial).


To add to Ted's answer, depending on the capabilities of your platform, the XOR of bit sequences that implements addition in GF$(256)$ might be implementable as the XOR of bytes; often available as a machine instruction, even on simple processors.

On the other hand, implementing the multiplication of two elements of GF$(256)$ as multiplication of two polynomials and then taking the remainder modulo $x^8+x^4+x^3+x^2+1$ is very time-consuming. I recommend you read an excellent tutorial description by @Jyrki Lahtonen. If you have memory available, Galois field multiplication can be implemented via log tables. See here for a description of log tables, and do follow all the links there to get some ideas on how to speed up things.

Edit: added new material on multiplication

Suppose that we want to multiply binary polynomials $c(x) = \sum_{i=0}^7 c_ix^i$ and $d(x) = \sum_{i=0}^7 d_ix^i$, and find the remainder after dividing the result by $m(x) = x^8+x^4+x^3+x^2+1$, that is, we want to compute $c(x)d(x)\bmod m(x)$. Sometimes it is convenient to embed the division by $m(x)$ into the multiplication process, and this idea can be particularly useful in the encoder figure shown in the question in which one polynomial (the one being fed back in the top line of the figure) is to be multiplied by $k$ different (fixed) polynomials $g^{(i)}(x), 0 \leq i \leq k-1$. We have $$\begin{align*} e(x) &= c(x)d(x) \bmod m(x)\\ &= \left(\sum_{i=0}^7 c_ix^i\right) d(x) \bmod m(x)\\ &= \left[c_0d(x) + c_1xd(x) + c_2x^2d(x) + \cdots + c_7x^7d(x)\right] \bmod m(x)\\ &= c_0d(x) + c_1[xd(x)\bmod m(x)] + c_2[x (xd(x)\bmod m(x))\bmod m(x) ]+\cdots\\ &\quad\quad\quad\quad\quad\quad\quad \cdots + c_7[x(x^6d(x)\bmod m(x)) \bmod m(x)]\\ &= \sum_{i=0}^7 c_if^{(i)}(x) \end{align*}$$ where $f^{(0)}(x) = f(x)$ and $f^{(i)}(x) = xf^{(i-1)}(x) \bmod m(x) = x^id(x) \bmod m(x)$. Note that the $c_i$ are $0$ or $1$. So, $e(x)$ is initialized to the zero polynomial and then if $c_0 = 1$, $d(x)$ is added to $e(x)$. Then, if $d(x)$ is not needed any more (it is not in the QR code application being considered here), it is replaced by $$xd(x) \bmod m(x) = \sum_{i=0}^7 d_ix^{i+1} \bmod m(x) = \sum_{i=0}^7 (d_{i-1}\oplus d_7m_i)x^{i}. $$ If $d(x)$ is needed elsewhere, this operation is carried out on the copy that has been called $f(x)$. Next, if $c_1 = 1$, the new $d(x)$ is added to the sum being accumulated in $e(x)$ and the new $d(x)$ replaced by $xd(x) \bmod d(x)$, etc. In short, the following two-step process is executed for $k = 0, \ldots , 7$.

  • If $c_k = 1$, $e(x):= e(x) + d(x)$
  • $d(x) := \sum_{i=0}^7 (d_{i-1}\oplus d_7m_i)x^{i}$

to complete the GF$(256)$ multiplication in eight iterations of the two-step process.

There are two advantages of this approach for the QR code application First, if $d(x)$ is chosen to be the polynomial being fed back and $c(x)$ one of the $g^{(j)}(x)$, then the coefficients of $c(x)$ are fixed and known ahead of time, and so the first of the two steps above can be simplified. Second, since the same $d(x)$ is being used to multiply $k$ different $g^{(j)}(x)$'s, the update of $d(x)$ can be shared among the $k$ different multiplications if they are arranged to proceed in parallel instead of one by one. Also, as a minor improvement, since the product $e(x)$ is to be added to a $b(x)$, $e(x)$ can be initialized to have value $b(x)$ instead of the zero polynomial.


I might be a little late, but in case you are still looking at methods for modular multiplication (ie. Multiplication of finite field elements followed by modular reduction), you might want to look at the Wikipedia page for Finite Field Arithmetic.

You want to read the modified version of 'Peasant's algorithm', described right below the long-division figure.