If $a$ is a quadratic residue modulo every prime $p$, it is a square - without using quadratic reciprocity.

Let $D$ be a nonsquare. Equation (3.3) of Selberg's "An elementary proof of the prime number theorem for arithmetic progressions" states that $$\sum_{p \leq x,\ \left( \frac{D}{p} \right)=1} \frac{\log p}{p} = \frac{1}{2} \log x + O(1).$$ Combined with Mertens' first theorem $$\sum_{p \leq x} \frac{\log p}{p} = \log x + O(1),$$ this clearly implies that $\left( \frac{D}{p} \right)$ is infinitely often $1$ and infinitely often $-1$. As an exercise for myself, I will rewrite the proofs of Mertens' and Selberg's results as cleanly as I can. I learned about Selberg's approach from a beautiful Mathoverflow answer by Vesselin Dimitrov.


We start with the proof of Merten's theorem. Consider $x! = \prod_{n=1}^{x} n$. We estimate the size of $\log x!$ in two ways. On the one hand, $$\log x! \approx \int_{t=1}^x \log t dt = x \log x -x+1 = x \log x + O(x). \quad (1)$$

On the other hand, consider any prime number $p \leq x$. Then $p$ divides $(x/p)+O(1)$ of the terms in $x!$. Ignoring for the moment that some of those terms may be divisible by $p^2$, we get a contribution of $\sum_{p \leq x} \frac{x}{p} \log p + \sum_{p \leq x} O(\log p)$ from those terms. Using Chebyshev's bound $\sum_{p \leq x} \log p = O(x)$, we conclude that $$\log x! = x \sum_{p \leq x} \frac{\log p}{p} + O(x). \quad (2)$$ Including the possibility of $p$ dividing a term more than once only contributes $\sum_{p \leq x} \sum_{k=2}^{\infty} O(\frac{x \log p}{p^k}) = O(x)$, so that doesn't change (2). Combining (1) and (2) gives Merten's first theorem.


We now apply the same idea to prove Selberg's theorem. Fix a bounded convex region $C$ in $\mathbb{R}^2$ of area $1$, containing the origin. Selberg uses a particular parallelogram, but I find that more confusing than helpful. Write $\sqrt{x}C$ for the $\sqrt{x}$-fold dilation of $C$. We will estimate $$P(x) := \prod_{(u,v) \in \sqrt{x} C \setminus \{(0,0) \}} |u^2-D v^2| .$$ in two ways. Since $D$ is not square, the product is not zero. (However, this is not the main use of the fact that $D$ is not square; that is yet to come.)

We can explain the factor of $\sqrt{x}$ by noting that $\sqrt{x} C$ will contain roughly $x \mathrm{Area}(C) = x$ lattice points, so $P(x)$ is like a factorial.

On the one hand, we can justify approximating $\log P(x)$ by the integral $$ \log P(x) \approx \int_{(u,v) \in \sqrt{x} C} \log |u^2 - D v^2| du dv. $$ Putting $u_2=u/\sqrt{x}$ and $v_2=v/\sqrt{x}$, this is $$ \int_{(u_2, v_2) \in C} (\log x + \log |u_2^2-D v_2^2|) (\sqrt{x} du_2) (\sqrt{x} dv_2) = \int_{(u_2, v_2) \in C} (x \log x) du_2 dv_2 +O(x) $$ $$ = (x \log x) \mathrm{Area}(C)+O(x) = x \log x + O(x). \quad (3)$$ You can compare this to our earlier formula $\log x! = x \log x + O(x)$, also obtained by integration.

On the other hand, consider a prime $p$ and think about how many times it can divide $P(x)$. If $\left( \frac{D}{p} \right)=-1$, then the only way for $p$ to divide $u^2-D v^2$ is for $p$ to divide both $u$ and $v$. So the number of times this happens is $\# (\sqrt{x} C) \cap (p \mathbb{Z}^2) \approx \frac{\mathrm{Area}(\sqrt{x} C)}{p^2} = x/p^2$. As before, $\sum \frac{x \log p}{p^2} = O(x)$, so this is unimportant.

Now suppose that $p$ is odd, not dividing $D$, and $\left( \frac{D}{p} \right) = 1$. Then there are two square roots, $\pm \mu$, of $D$ in $\mathbb{Z}/p \mathbb{Z}$. We define two lattices $\Lambda^p+$ and $\Lambda^p_-$ by $\Lambda^p_{\pm} = \{ u \equiv \pm \mu v \bmod p \} \subset \mathbb{Z}^2$. Then $p$ divides $u^2-D v^2$ if and only if $p$ is in either $\Lambda^p_+$ or $\Lambda^p_-$. The fundamental domains of the $\Lambda^p_{\pm}$ both have area $p$. So the rough number of $p$ for which this happens is $2 \frac{\mathrm{Area}(\sqrt{x} C)}{p} = \frac{2x}{p}$, and we get $P(x) \approx \sum_{p \leq x,\ \left( \frac{D}{p} \right) = 1} \frac{2x}{p} \log p$. One can show that the contributions from $p$ dividing $u^2-D v^2$ more than once, and from $p$ dividing $D$ or equal to $2$ are negligible, and wind up with the conclusion $$\log P(x) = 2x \sum_{p \leq x,\ \left( \frac{D}{p} \right) = 1} \frac{\log p}{p} + O(x). \quad (4)$$

Comparing (3) and (4), we deduce Selberg's equation. QED, but we are not quite done.

We need to justify that the number of points in $\sqrt{x} C \cap \Lambda^p_{\pm}$ is well approximated by $\frac{\mathrm{Area}(\sqrt{x} C)}{\det \Lambda^p_{\pm}}$, where $\det \Lambda$ is the area of a fundamental domain for $\Lambda$. This requires us to show that $\Lambda^p_{\pm}$ is not too crazy. (We also need a similar claim for $p \mathbb{Z}^2$, but that is similar and easier.) Specifically, we will need

Lemma All nonzero vectors in $\Lambda_{\pm}$ have length at least $\sqrt{p/|D|} = \sqrt{\det \Lambda^p_{\pm}}/\sqrt{|D|}$.

Proof Let $(u,v) \in \Lambda^p_{\pm}$. Then $u^2 - D v^2 \equiv 0 \bmod p$. But, since $D$ is not square, $u^2-D v^2 \neq 0$ and we know that $|u^2-D v^2| \geq p$. Then $u^2+v^2 \geq \frac{|u^2-D v^2|}{D} \geq p/d$, as required. $\square$


We now turn to proving a bunch of Lemmas about integer points in Lattices.

Specifically, we show:

Key lattice lemma Let $C$ be as above. Fix a constant $c>0$, and let $\Lambda \subset \mathbb{R}^2$ be lattice in which each vector has length $\geq c \sqrt{\det \Lambda}$. Then, for $R>0$, $$ \#((RC) \cap \Lambda) = \frac{R^2}{\det \Lambda} + O(R/\sqrt{\det \Lambda}) .$$

This section amounts to me unpacking the phrases "it is easily shown" and "hence" in the paragraph at the bottom of Selberg's page 71.

Lemma Let $\Lambda$ be any lattice in $\mathbb{R}^2$. Then $\Lambda$ has a basis $v_1$, $v_2$ for which the angle between $v_1$ and $v_2$ is between $\pi/3$ and $2 \pi/3$.

Proof Let $v_1$ be the shortest nonzero vector in $\Lambda$. Rotating and rescaling, we may assume that $v_1 = (1,0)$. Let $w=(a,b)$ be an arbitrary other vector which forms a basis together with $v_1$. Then $w+k v_1 = (k+a, b)$ also forms a basis together with $v_1$. Suppose, for the sake of contradiction, that the angle between $v_1$ and $w+k v_1)$ never lies between $\pi/3$ and $2 \pi/3$. Then there is some $k$ where $\angle(v_1, w+k v_1) > 2 \pi/3$ and then $\angle(v_1, w+(k+1) v_1) < \pi/3$. But, also, $w+k v_1$ and $w+kv_2$ lie outside the unit circle, since we choose $v_1$ minimal. It is impossible to find two points which are the endpoints of a horizontal line segment of length $1$ so that both lie outside the unit circle, with one at angle $<\pi /3$ from horizontal and the other at angle $>2 \pi/3$. $\square$

From now on, let $\Lambda$ obey the condition that all vectors have length at least $c \sqrt{\det \Lambda}$.

Lemma $\Lambda$ has a basis $v_1$ and $v_2$ where both $v_i$ have length $\leq 2 \sqrt{\det \Lambda}/c$.

Proof Choose the basis from the previous Lemma, and let $\theta$ be the angle between $v_1$ and $v_2$. Then $\det \Lambda = |v_1 \times v_2| = |v_1| \ |v_2| \ |\sin \theta| \geq |v_1| \ |v_2|/2$. Since $|v_1| \geq c \sqrt{\det \Lambda}$, we deduce $|v_2| \leq 2 \sqrt{\det \lambda}/c$, and vice versa. $\square$

Now, apply a change of basis to make $v_1$, $v_2$ into the standard basis $(1,0)$ and $(0,1)$. Let $C'$ be the image of $C$ under this change of basis. So we are counting ordinate lattice points in $RC'$. The area of $RC'$ is $R^2/\det \Lambda$. The perimeter of $RC'$ is $O(R/\sqrt{\det \Lambda})$, because the fact that $v_1$ and $v_2$ are roughly the same length and form an angle not too far from $\pi/2$ means that our linear transformation can't distort $C$ too badly.

Now the Key Lattice Lemma follows from Lemma 2.1.1 of Huxley on counting points of $\mathbb{Z}^2$ in a convex region. (I can't believe I couldn't find a more elementary reference than Huxley, but it works, and this particular lemma is explained in a very clear and elementary way.) Now we get to write QED.