On the regularity of the alterning sum of prime numbers

Let's define $(p_n)_{n\in \mathbb N}$ the ordered list of prime numbers ($p_0=2$, $p_1=3$, $p_2=5$...).

I am interested in the following sum:

$$S_n:=\sum_{k=1}^n (-1)^kp_k$$

Since the sequence $(S_n)$ is related to the gaps between prime numbers, I would expect it to be quite irregular.

But if we plot $(S_n)_{1\leqslant n\leqslant N}$ for $N\in \{50,10^3,10^5,10^6,10^7\}$, we obtain the following:

enter image description here

enter image description here

enter image description here

enter image description here

enter image description here

We can observe a great regularity.

So my questions are:

  • Why is there so much regularity?

  • Can we find the equations of the two lines forming $(S_n)$?

  • Is there a proof that it will continue to be that regular forever?

Any contribution, even partial, will be greatly appreciated.


Updates.

Thanks to mixedmath and Daniel Fischer, here is more curves:

  • in blue, you have $S_n$;

  • in red, you have $\displaystyle 2^{1/6}\displaystyle \sum (-1)^k k\log k$;

  • in green, you have $\displaystyle \sum (-1)^k k\log k$;

  • in purple, you have $\displaystyle \sum (-1)^k k(\log k+\log\log k)$;

  • in yellow, you have $\displaystyle \sum (-1)^k k(\log k+\log\log k-1)$.

enter image description here

enter image description here

enter image description here


My question seems quite related to this one.


Solution 1:

This is a great question. Unfortunately, this is an incomplete answer. But I thought about this a bit and I noticed something interesting, but which I do not know how to explain.

With $$ S_n = \sum_{k \leq n} (-1)^k p_k,$$ where $p_n$ is the $n$th prime, some patterns are immediately clear. It is obvious that the sequence of $S_n$ alternates in sign for example. But some patterns are not obvious or clear.

By the prime number theorem, we expect that $p_n \approx n \log n$. If we plot $\sum_{k \leq n} (-1)^k k \log k$ against $S_n$ for all primes up to one million, we get

A pretty good, but not great, plot

This is apparently a bit too small, it seems. This sort of makes sense, as deviations from the approximation $p_n \approx n \log n$ compound here.

However, I noticed that $$ 1.12 \sum_{k \leq n} (-1)^k k \log k $$ is actually a very good (experimental) estimate of what's going on, as can be seen in the following plot.

A better fit

Perhaps $1.12$ is an incorrect choice --- it just happened to be a very nearby reasonable seeming number, and it does appear to reflect what's going on. I do not know why, though.

If we conjecture for a moment that $1.12 \sum (-1)^k k \log k$ is a good estimator, then we can write a good asymptotic for this series using partial summation. Namely

$$ \begin{align} \sum_{k \leq n} (-1)^k k \log k &= \left( \sum_{k \leq n} (-1)^k k \right) \log n - \int_1^n \left( \sum_{k \leq t} (-1)^k k \right) \frac{1}{t} dt \\ &= (-1)^n \left \lfloor \frac{n+1}{2} \right \rfloor \log n - \int_1^n (-1)^{\lfloor t \rfloor} \left \lfloor \frac{\lfloor t \rfloor+1}{2} \right \rfloor \frac{1}{t} dt \\ &= (-1)^n \left \lfloor \frac{n+1}{2} \right \rfloor \log n + O \left( \int_1^n \left( \frac{t+1}{2t} + \frac{2}{t} \right) dt\right) \\ &= (-1)^n \left \lfloor \frac{n+1}{2} \right \rfloor \log n + O(n). \end{align}$$

So I conjecture that $$S_n \approx 1.12 (-1)^n \left \lfloor \frac{n+1}{2} \right \rfloor \log n + O(n).$$ For comparison, the size of the alternating sum of the first 1001 primes is $3806$, where this estimate gives about $3876.6$. For $10001$, the actual is $52726$, compared to the estimated $51588.7$. These are both close, although apparently not super accurate.

It may be possible to describe the actual behavior of $S_n$ a bit more by using secondary terms in the prime number theorem, but I was not successful in my back-of-the-envelope computations. Nor do I know how to explain the $1.12$ that appears in this answer (or how to determine if it is $1.12$ as opposed to, say, $1.15$). Perhaps someone else will see how to fill in these gaps.


(Edited in after Daniel Fischer's comment)

Here are updated images, including plots of $\sum (-1)^n n (\log n + \log \log n)$.

now including a third line

As we can see, $\sum (-1)^n n (\log n + \log \log n)$ grows in magnitude just a little bit more quickly. Focusing a bit on just the upper half, we get

this actually looks pretty much the same