Fermat's Two Square Theorem: How do you prove that an odd prime is the sum of two squares iff it is congruent to 1 mod 4?

It is a theorem in elementary number theory that if $p$ is a prime and congruent to 1 mod 4, then it is the sum of two squares. Apparently there is a trick involving arithmetic in the gaussian integers that lets you prove this quickly. Can anyone explain it?


Let $p$ be a prime congruent to 1 mod 4. Then to write $p = x^2 + y^2$ for $x,y$ integers is the same as writing $p = (x+iy)(x-iy) = N(x+iy)$ for $N$ the norm.

It is well-known that the ring of Gaussian integers $\mathbb{Z}[i]$ is a principal ideal domain, even a euclidean domain. Now I claim that $p$ is not prime in $\mathbb{Z}[i]$. To determine how a prime $p$ of $\mathbb{Z}$ splits in $\mathbb{Z}[i]$ is equivalent to determining how the polynomial $X^2+1$ splits modulo $p$.

First off, $-1$ is a quadratic residue modulo $p$ because $p \equiv 1 \mod 4$. Consequently, there is $t \in \mathbb{Z}$ with $t^2 \equiv -1 \mod p$, so $X^2+1$ splits modulo $p$, and $p$ does not remain prime in $\mathbb{Z}[i]$. (Another way of seeing this is to note that if $p$ remained prime, then we'd have $p \mid (t+i)(t-i)$, which means that $p \mid t+i$ or $t \mid t-i$.)

Anyway, as a result there is a non-unit $x+iy$ of $\mathbb{Z}[i]$ that properly divides $p$. This means that the norms properly divide as well. In particular, $N(x+iy) = x^2+y^2$ properly divides $p^2$, so is $p$ or $1$. It cannot be the latter since otherwise $x+iy$ would be a unit. So $x^2+y^2 = p$.


Perhaps my favorite argument (other than any arguably "correct" arguments, such as the one Akhil has given, or arguments starting from the fact that $x^2 + y^2$ is the unique binary quadratic form of discriminant $-4$ up to equivalence) uses continued fractions. Suppose that $u^2 \equiv -1 \pmod{p}$, and consider the continued fraction expansion of the rational number $u/p$. Let $r/s$ be the last convergent to $u/p$ with the property that $s < \sqrt{p}$. Then, setting $x=s$ and $y = rp-us$, one has $x^2 + y^2 = p$.

Here's the argument: let $r'/s'$ be the convergent following $r/s$. Then the basic theory of continued fractions gives the estimate $|r/s - u/p| < 1/ss'$, and the right-hand side is less than $1/s\sqrt{p}$ by hypothesis. Clearing denominators gives $y < \sqrt{p}$, so that $0 < x^2 + y^2 < 2p$. On the other hand $x^2 + y^2$ is checked to be divisible by $p$ (by choice of $u$), so must be equal to $p$.


Here is another proof without complex numbers.

We start with proving that there exists $z \in \mathbb{N}$ such that $z^2 + 1 \equiv 0 \pmod p$. We do this in the same way as Akhil Mathew.

Let we have $a^2 + b^2 = pm$. Take $x$ and $y$ such that
$x \equiv a \pmod m$ and
$y \equiv b \pmod m$ and
$x, y \in [-m/2, m/2)$.

Consider $u = ax + by$ and $v = ay - bx$. Then $u^2 + v^2 = (a^2 + b^2)(x^2 + y^2)$. Moreover, $u$ and $v$ are multiples of $m$. Hence $(u/m)^2 + (v/m)^2 = p (x^2 + y^2)/m$. $(x^2 + y^2)/m$ is an integer because of the definition of $x$ and $y$ and that $a^2 + b^2 = pm$.
Also $(x^2 + y^2)/m$ is less than $m/2$.

Now we change $a$ by $u$ and $b$ by $v$ and continue this process until we get $m=1$.

Notice that this is quite efficient way to find representation of $p$ as a sum of two squares - it takes $O(\log p)$ steps to find it provided we have found $z$ such that $z^2 + 1$ is multiple of $p$.


There is an amazing proof of this due to Don Zagier : one-sentence proof.


Taken From I.N. Herstein There are 2 results which i am going to use.

  1. Let $p$ be a prime integer and suppose that for some integer $c$ relatively prime to $p$ we can find integers $x$ and $y$ such that $x^{2}+y^{2}=cp$. Then $p$ can be written as the sum of 2 squares.

  2. If $p$ is a prime of the form $4n+1$, then we can solve the congruence $x^{2} \equiv \ -1 \ (mod) \ p$.

Now the main result. If $p$ is a prime of the form $4n+1$ then $p$ is the sum of 2 squares.

Proof. By 2 there exists and $x$ such that $x^{2} \equiv -1 \text{mod} \ p$. So $x$ can be chose such that $0 \leq x \leq (p-1)$. We can restrict the size of $p$ even further, namely to satisfy $|x| \leq \frac{p}{2}$. For if $x > p/2$ then, $y=p-x$ satisfies $y^{2} \equiv -1 \text{mod} \ p$ but $|y| \leq p/2$. Thus we may assume that we have an integer $x$ such that $|x| \leq p/2$ and $x^{2}+1$ is a multiple of $p$ say $cp$. Now $cp=x^{2}+1 \leq p^{2}/4 +1 < p^{2}$, hence $c < p$ and hence $(c,p)=1$. Invoking (1) we have $p=a^{2}+b^{2}$.