On the regularity of the alterning sum of prime numbers
Let's define $(p_n)_{n\in \mathbb N}$ the ordered list of prime numbers ($p_0=2$, $p_1=3$, $p_2=5$...).
I am interested in the following sum:
$$S_n:=\sum_{k=1}^n (-1)^kp_k$$
Since the sequence $(S_n)$ is related to the gaps between prime numbers, I would expect it to be quite irregular.
But if we plot $(S_n)_{1\leqslant n\leqslant N}$ for $N\in \{50,10^3,10^5,10^6,10^7\}$, we obtain the following:
We can observe a great regularity.
So my questions are:
Why is there so much regularity?
Can we find the equations of the two lines forming $(S_n)$?
Is there a proof that it will continue to be that regular forever?
Any contribution, even partial, will be greatly appreciated.
Updates.
Thanks to mixedmath and Daniel Fischer, here is more curves:
in blue, you have $S_n$;
in red, you have $\displaystyle 2^{1/6}\displaystyle \sum (-1)^k k\log k$;
in green, you have $\displaystyle \sum (-1)^k k\log k$;
in purple, you have $\displaystyle \sum (-1)^k k(\log k+\log\log k)$;
in yellow, you have $\displaystyle \sum (-1)^k k(\log k+\log\log k-1)$.
My question seems quite related to this one.
Solution 1:
This is a great question. Unfortunately, this is an incomplete answer. But I thought about this a bit and I noticed something interesting, but which I do not know how to explain.
With $$ S_n = \sum_{k \leq n} (-1)^k p_k,$$ where $p_n$ is the $n$th prime, some patterns are immediately clear. It is obvious that the sequence of $S_n$ alternates in sign for example. But some patterns are not obvious or clear.
By the prime number theorem, we expect that $p_n \approx n \log n$. If we plot $\sum_{k \leq n} (-1)^k k \log k$ against $S_n$ for all primes up to one million, we get
This is apparently a bit too small, it seems. This sort of makes sense, as deviations from the approximation $p_n \approx n \log n$ compound here.
However, I noticed that $$ 1.12 \sum_{k \leq n} (-1)^k k \log k $$ is actually a very good (experimental) estimate of what's going on, as can be seen in the following plot.
Perhaps $1.12$ is an incorrect choice --- it just happened to be a very nearby reasonable seeming number, and it does appear to reflect what's going on. I do not know why, though.
If we conjecture for a moment that $1.12 \sum (-1)^k k \log k$ is a good estimator, then we can write a good asymptotic for this series using partial summation. Namely
$$ \begin{align} \sum_{k \leq n} (-1)^k k \log k &= \left( \sum_{k \leq n} (-1)^k k \right) \log n - \int_1^n \left( \sum_{k \leq t} (-1)^k k \right) \frac{1}{t} dt \\ &= (-1)^n \left \lfloor \frac{n+1}{2} \right \rfloor \log n - \int_1^n (-1)^{\lfloor t \rfloor} \left \lfloor \frac{\lfloor t \rfloor+1}{2} \right \rfloor \frac{1}{t} dt \\ &= (-1)^n \left \lfloor \frac{n+1}{2} \right \rfloor \log n + O \left( \int_1^n \left( \frac{t+1}{2t} + \frac{2}{t} \right) dt\right) \\ &= (-1)^n \left \lfloor \frac{n+1}{2} \right \rfloor \log n + O(n). \end{align}$$
So I conjecture that $$S_n \approx 1.12 (-1)^n \left \lfloor \frac{n+1}{2} \right \rfloor \log n + O(n).$$ For comparison, the size of the alternating sum of the first 1001 primes is $3806$, where this estimate gives about $3876.6$. For $10001$, the actual is $52726$, compared to the estimated $51588.7$. These are both close, although apparently not super accurate.
It may be possible to describe the actual behavior of $S_n$ a bit more by using secondary terms in the prime number theorem, but I was not successful in my back-of-the-envelope computations. Nor do I know how to explain the $1.12$ that appears in this answer (or how to determine if it is $1.12$ as opposed to, say, $1.15$). Perhaps someone else will see how to fill in these gaps.
(Edited in after Daniel Fischer's comment)
Here are updated images, including plots of $\sum (-1)^n n (\log n + \log \log n)$.
As we can see, $\sum (-1)^n n (\log n + \log \log n)$ grows in magnitude just a little bit more quickly. Focusing a bit on just the upper half, we get