What is the probability that a natural number is a sum of two squares?

Some natural numbers can be expressed as a sum of two squares:

$$2=1^2+1^2$$ $$25=3^2+4^2$$ $$50=7^2+1^2$$

If one chooses a random natural number, what would be the probability that that number is a sum of two squares? Is it zero?

I read about Lagrange´s theorem on squares, but it looks it can´t be useful here.


NOTE 1: "Square" means "square of a natural number".

NOTE 2: I am aware that the expression "random natural number" is not a strict math notion. However, as I said in a comment, one can adopt a reasonable strict definition, which is not difficult to devise at all. It is mentioned also in an answer below.

NOTE 3: A related question on SE: How to determine whether a number can be written as a sum of two squares?


Solution 1:

Without going into detail what a "random natural number" might be, we could consider the density of such numbers.

The possibility to express $n$ depends on the prime factorization of $n$: It may be divisible by $2$ and by primes $\equiv 1\pmod 4$ as much as it likes, but each prime $\equiv 3\pmod 4$ must occur in an even power.

The prime $3$ "spoils" all numbers divisible by $3$ (that's $\frac13$), except those divisible by $9$ (that's $\frac19$), though it does spoil those divisible by $27$, execept ... All in all the prime $3$ spoils $\frac13-\frac19+\frac1{27}-\frac1{81}\pm\ldots = \frac14$ (geometric series) of all numbers in a large range. In general, a prime $p\equiv 3\pmod 4$ spoils $\frac1{p+1}$ and the effects of distinct primes are independent. Hence the limit density of natural numbers expressible as sum of squares is $$ \prod_{p\equiv 3\pmod 4}\left(1-\frac1{p+1}\right)$$ This product however diverges to $0$ because it is well-known that $\sum_p\frac1p$ diverges (where it doesn't matter that we use only half of all primes).

Thus: For sufficiently large $N$, the probability that a number picked uniformly from $\{1,\ldots,N\}$ is the sum of two squares becomes arbitrarily small. It is for any $\let\epsilon\varepsilon\epsilon>0$ we can find $N_0$ such that for all $N>N_0$ said probability is $<\epsilon$.

Solution 2:

Though this is not an answer to the original question, I'll write it in answer to VividD's question under Hagen von Eitzen post (and I believe it is not totally unrelated to the question).

Let $A_n$ be the number of different pairs $(x,y)$ of non-negative integers solving the equation $x^2 + y^2 = n$. Then it is natural to define the expected number of compositions of a randomly chosen natural number into sum of two squares as
$$ A = \lim_{n\to\infty} \frac1n \sum_{k=1}^n A_k. $$ But the latter quantity is $\frac1n$ times the number of points of the form $(x/\sqrt{n},y/\sqrt{n})$, where $x,y$ are integers, inside the non-negative quarter of the unit circle centered at the origin (minus one, but this does not matter). So here we have a Jordan approximation of the area of the quarter. Therefore, $A = \frac\pi4$.

So, quite paradoxically, the expected value of the number of compositions of a randomly chosen non-negative integer into two squares is positive despite this number is zero "with probability 1".

Solution 3:

Consider the function $r: \mathbb{N} \to \mathbb{C}$: $$ r(n) = \begin{cases} 1 &n \text{ is the sum of two squares (0 is a square)} \\ 0 &\text{otherwise}. \end{cases} $$ One way to define the "probability that a random integer is the sum of two squares" would be to consider the distribution on the integers where $n$ selected with probability proportional to $n^x$, for $x > 1$, and then to take the limit as $x \to 1$. That is, we can try to compute: $$ \lim_{x \to 1} \frac{\sum_{i=1}^\infty \frac{r(n)}{n^x}}{\sum_{i=1}^\infty \frac{1}{n^x}} \tag{1}. $$

As mentioned here, $r(n)$ is $1$ iff every prime of the form $4k-1$ occurs an even number of times in $n$. It follows that $r(n)$ is multiplicative, and \begin{align*} \frac{\sum_{i=1}^\infty \frac{r(n)}{n^x}}{\sum_{i=1}^\infty \frac{1}{n^x}} &= \frac{\prod_{p \text{ prime}} \sum_{i=0}^\infty \frac{r(p^i)}{p^{ix}}}{\prod_{p \text{ prime}} \sum_{i=0}^\infty \frac{1}{p^{ix}}} \\ &= \prod_{p \text{ prime}} \frac{1 + \frac{r(p)}{p^x} + \frac{r(p^2)}{p^{2x}} + \cdots}{1 + \frac{1}{p^x} + \frac{1}{p^{2x}} + \cdots} \\ &= \prod_{p \equiv 3 \pmod{4}} \frac{1 + 0 + \frac{1}{p^{2x}} + 0 + \frac{1}{p^{4x}} + \cdots}{1 + \frac{1}{p^x} + \frac{1}{p^{2x}} + \cdots} \prod_{p \equiv 1,2 \pmod{4}} \frac{1 + \frac{1}{p^x} + \frac{1}{p^{2x}} + \cdots}{1 + \frac{1}{p^x} + \frac{1}{p^{2x}} + \cdots} \\ &= \prod_{p \equiv 3 \pmod{4}} \frac{1 + \frac{1}{p^{2x}} + \frac{1}{p^{4x}} + \cdots}{1 + \frac{1}{p^x} + \frac{1}{p^{2x}} + \cdots} \\ &= \prod_{p \equiv 3 \pmod{4}} \frac{1 \big/ \left(1 - \tfrac{1}{p^{2x}}\right)}{1 \big/ \left(1 - \tfrac{1}{p^{x}}\right)} \\ &= \prod_{p \equiv 3 \pmod{4}} \frac{1}{1 + p^{-x}}. \\ \end{align*} Now if you plug in $x = 1$, you get the product $$ \prod_{p \equiv 3 \pmod{4}} \frac{p}{p + 1} = \prod_{p \equiv 3 \pmod{4}} \left( 1 - \frac{1}{p + 1} \right) = 0, $$ by the reasoning of Hagen von Eitzen. So by this definition of probability, the probability that a random integer is a sum of two squares is zero.

Solution 4:

the density is zero, and one may be quite precise about it: the count of numbers up to some large real $x$ that are the sum of two squares is asymptotic to $$ \frac{0.7642... \, x}{\sqrt{\log x}} $$ where the logarithm is base $e,$ and the $0.7642...$ is defined by an infinite product. See the last few pages in LeVeque. This is combined volumes 1 and 2, it is the last few pages in volume 2.

Solution 5:

A simple numerical experiment shall confirm the answer given by zhoraster: the latter is only deviant from mine by a factor $2$. We define an initial segment of the naturals with length $N$ and count all sums of two squares in that segment. The accompanying program is in Pascal:

program kwadraat;
procedure test(N : integer); type data = record b : boolean; i,j : integer; end; var i,j,k,t : integer; rij : array of data; begin t := 0; SetLength(rij,N); for k := 0 to N-1 do begin rij[k].b := false; end; i := 0; while true do begin if sqr(i) > N-1 then Break; j := i; while true do begin k := sqr(i)+sqr(j); if k > N-1 then Break; { Writeln(k,' = ',i,'^2 + ',j,'^2'); } rij[k].i := i; rij[k].j := j; rij[k].b := true; j := j + 1; t := t + 1; end; i := i + 1; end; if N < 100 then for k := 0 to N-1 do begin if rij[k].b then Writeln(k,' = ',rij[k].i,'^2 + ',rij[k].j,'^2'); end; Writeln(t/N,' ->',Pi/8); end;
begin test(10); test(1000); test(100000); test(10000000); end.
Output (details for $N=10$ only):

0 = 0^2 + 0^2
1 = 0^2 + 1^2
2 = 1^2 + 1^2
4 = 0^2 + 2^2
5 = 1^2 + 2^2
8 = 2^2 + 2^2
9 = 0^2 + 3^2
 7.00000000000000E-0001 -> 3.92699081698724E-0001
 4.19000000000000E-0001 -> 3.92699081698724E-0001
 3.95420000000000E-0001 -> 3.92699081698724E-0001
 3.92969900000000E-0001 -> 3.92699081698724E-0001
Note that the results converge to $\;\pi/8$ , quite in agreement with the argument given by zhoraster, provided though that $i^2 + j^2$ and $j^2 + i^2$ give a double count of $\,\pi/4\,$ which must be halved.

EDIT. Question & Answer is related to : Double Think about Numerosity .

BONUS. In one of the comments with the answer by zhoraster, VividD has been asking for a variant of the original question for the sum of three/four squares. Minor modification of the above program gives the following output for the three squares case. It is seen that some numbers can be written as a sum of three squares in more than one way. Therefore two cases shall be distinguished: with or without these duplicates. Details again for $N=10$ :

0 = 0^2 + 0^2 + 0^2
1 = 0^2 + 0^2 + 1^2
4 = 0^2 + 0^2 + 2^2
9 = 0^2 + 0^2 + 3^2
2 = 0^2 + 1^2 + 1^2
5 = 0^2 + 1^2 + 2^2
8 = 0^2 + 2^2 + 2^2
3 = 1^2 + 1^2 + 1^2
6 = 1^2 + 1^2 + 2^2
9 = 1^2 + 2^2 + 2^2
    with duplicates = 10/10
 without duplicates = 9/10
    with duplicates = 3254/1000
 without duplicates = 835/1000
    with duplicates = 2807201/100000
 without duplicates = 83336/100000
    with duplicates = 87741031/1000000
 without duplicates = 833336/1000000
If the duplicates are counted, then the results are seen to diverge $\to \infty$ .
It is conjectured that, without the duplicates, the results converge to : $5/6$ (Michael Lugo).
Any takers to prove the latter statement?