Intuition for the prime number theorem

(surprisingly, it appears that this question has not been asked before)

Let $\pi(n)$ denote the number of primes $\leq n$. The prime number theorem states that

$$\pi(n) \sim \frac{n}{\log n} \ \text{as} \ n \to +\infty$$

After painstakingly reading through Erdos's elementary proof of this theorem, I think I understand the mechanics of it from a formal perspective. However, I still don't seem to understand intuitively why this theorem is true. I would like some intuitive insight as to why this theorem holds.

I understand that for a result as deep as this one, even the intuition is going to contain some nitty-gritty details. It's probably not the sort of thing that you could explain to a child, for example. Nevertheless, I will ask this question regardless. There has to be some convincing argument for this theorem beyond the technical details of the proofs.


Solution 1:

Not a full explanation, but it is too long for a comment.

Consider the Sieve of Eratosthenes.

Start with the first $n$ numbers. Remove (one less than) $\frac 12$ of them as multiples of $2$, Of the remainder, remove $\frac 13$ of them as multiples of $3$. Of the remainder, remove $\frac 15$ as multiples of $5$, etc. You should be left with about $$n\prod_{p\le n, \text{ prime}}1 - \frac 1p$$

values as primes below $n$. Now, when multiplied out

$$\prod_{p\le n, \text{ prime}}1 - \frac 1p = 1 - \sum_{k \in S_n} \frac 1k$$ Where $S_n$ is the set of all square-free integers $> 1$ whose prime factors are $\le n$.

It remains to estimate that $1 - \sum_{k \in S_n} \frac 1k\approx \frac 1{\log n}$.

Edit:

Since it was already chosen as the solution and Winther has kindly provided Merten's 3rd theorem which says just what was needed, I could just let this go. But Merten's theorem strikes me as hardly more intuitively obvious than the prime number theorem itself, so I've been thinking on heuristic concepts to explain it.

Now for $|x| < 1$, $\frac1{1-x} = 1 + x + x^2 + ...$ Therefore $$\frac 1{\prod\limits_{p\le n}1 - \frac 1p} = \prod\limits_{p\le n}\left(1 + \frac 1p + \frac 1{p^2} + ...\right)$$

Multiplying the right side out, we get $\sum_{k \in R_n} \frac 1k$, where $R_n$ is the set of all integers whose prime factors are all $\le n$. Of these, it should be expected (I'm being heuristic here, so I can get away with that weasel-wording) that the sum for $k > n$ will be significantly smaller than for $k \le n$. Thus it seems reasonable that $$\sum_{k \in R_n} \frac 1k \sim \sum_{k \in R_n, k\le n} \frac 1k = \sum_{k=1}^n \frac 1k \sim \log n$$