Unusual pattern in the distribution of odd primes

I have recently noticed an unusual pattern in the distribution of odd primes.

Each one of the following sets contains approximately half of all odd primes:

$A_n=\{4k+1: 0\leq k\leq n\}=\{1,5,9,13,\dots,4n+1\}$
$B_n=\{4k+3: 0\leq k\leq n\}=\{3,7,11,15,\dots,4n+3\}$
$C_n=\{6k+1: 0\leq k\leq n\}=\{1,7,13,19,\dots,6n+1\}$
$D_n=\{6k+5: 0\leq k\leq n\}=\{5,11,17,23,\dots,6n+5\}$

More precisely:

Let $P(S)$ denote the number of odd primes in the set $S$
Let $\pi(x)$ denote the number of odd primes smaller than $x$

Then for every sufficiently large value of $n$:

${P(A_n)}\approx{P(B_n)}\approx\frac12\pi(4n+4)$
${P(C_n)}\approx{P(D_n)}\approx\frac12\pi(6n+6)$

Now, all of this is pretty easy to observe (though probably not so easy to prove).

The following facts are subsequently obvious for every sufficiently large $n$ as well:

${P(A_n)}\leq{P(B_n)}\implies{P(A_n)}\leq\frac12\pi(4n+4)\leq{P(B_n)}$
${P(A_n)}\geq{P(B_n)}\implies{P(A_n)}\geq\frac12\pi(4n+4)\geq{P(B_n)}$
${P(C_n)}\leq{P(D_n)}\implies{P(C_n)}\leq\frac12\pi(6n+6)\leq{P(D_n)}$
${P(C_n)}\geq{P(D_n)}\implies{P(C_n)}\geq\frac12\pi(6n+6)\geq{P(D_n)}$

This is because $A_n$ and $B_n$ as well as $C_n$ and $D_n$ are "complementary" to each other:

The set ${A_n}\cap{B_n}$ is empty, and the set ${A_n}\cup{B_n}$ contains all odd primes smaller than $4n+4$
The set ${C_n}\cap{D_n}$ is empty, and the set ${C_n}\cup{D_n}$ contains all odd primes smaller than $6n+6$

Nevertheless, for almost every value of $n$:

${P(A_n)}\leq{P(B_n)}$
${P(C_n)}\leq{P(D_n)}$

The graphs and table below provide some empirical evidence:

enter image description here

 range     | odd primes | cases where either P(A)>P(B) or P(C)>P(D)
-----------|------------|-------------------------------------------
 10000     | 1228       | 0
 100000    | 9591       | 1
 1000000   | 78497      | 239
 10000000  | 664578     | 239
 100000000 | 5761454    | 1940

I would expect primes to be equally distributed between $A_n$ and $B_n$ and between $C_n$ and $D_n$.

In other words, I would expect:

[Number of primes of the form $4k+1$] $\approx$ [Number of primes of the form $4k+3$]
[Number of primes of the form $6k+1$] $\approx$ [Number of primes of the form $6k+5$]

But since the empirical evidence above suggests otherwise, my questions are:

Is this indeed the case, or do they become equally distributed on a larger range?
If this is indeed the case, what research has been conducted attempting to explain it?

Thanks

The phenomenon you observe is real. This is known, the case for $4$, as Chebyshev's bias Another relevant keyword is Shanks–Rényi race problem.

"Prime number races" by Granville and Martin is a fantastic introduction to this circle of ideas.

But, let me include some basic information here (more-or-less self-plagiarizing an MO-answer).

On a rough scale the frequency counts of primes congruent $1$ and $3$ modulo $4$ are the same; both counting functions are asymptotic to $\frac{1}{2} \text{li}(x)$ with error terms essentially as commonly know from the prime counting function. This is the well-know Prime Number Theorem for arithmetic progressions (mentioned by Dietrich Burde).

However, if one compares the exact counts of primes congruent to $1$ and $3$ modulo $4$ respectively, let me call the respective counting functions $\pi_1(x)$ and $\pi_3(x)$, then one notes (at least at the start) that there are more congruent to $3$ than congruent to $1$, so $\pi_3(x) > \pi_1(x)$, an observation made by Chebyshev. However, Littlewood showed that the difference $\pi_3(x) - \pi_1(x)$ can also be negative, and even is infinitely often essentially as negative as it can get (under the assumption that both should not deviate from $\text{li}(x)/2$ by more than $\sqrt{x}$ and a little).

So, now one might think that one just came across a phenomenon of small numbers, as you said, with this initial bias however this is not so there is a bias in the distribution.

If one defines $P$ to be the set of all integers such that $\pi_3(x) > \pi_1(x)$ then these are not "half of the integers". Rubinstein and Sarnak proved (under widely believe conjectures on zeros of L-functions, GRH and GSH) that the logarithmic density of this set, that is the limit of $$ \frac{1}{\log x} \sum_{n \in P} \frac{1}{n} $$
is $0.9959...$, so quite close to $1$ though not equal to $1$ so "almost all" is perhaps too strong.

The Dirichlet density of primes $p\equiv 1 \bmod 4$ and of primes $p\equiv 3 \bmod 4$ is both equal to $\frac{1}{2}$, but the number of primes up to $x$ in both classes can differ. This is what is meant by Prime number races ( see the answer of quid).

Unusual pattern in the distribution of odd primes

Related

Recent Posts