On the number of integers with an even number of distinct prime factors

Solution 1:

The Generalized Prime Number Theorem gives a decent estimate of the number of integers up to $n$ with exactly $k$ prime factors. You can use it to see that, asymptotically, the count of integers with $2,4,6,\dots$ prime factors should not differ from the count of integers with $1,3,5,\dots$ prime factors.

The Generalized PNT states:

$$\pi_k(x) \sim \frac{x}{\log x}\frac{(\log\log x)^{k-1}}{(k-1)!} $$

in which $\pi_k(x)$ is the number of integers less than or equal to $x$ with exactly $k$ prime factors.
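
As a rough numerical sanity check, here is a minimal Python sketch of my own (not taken from the cited references); it sieves the number of distinct prime factors and compares $\pi_k(x)$ with Landau's formula at $x = 10^6$. Expect only loose agreement, since the asymptotic converges slowly.

```python
# Hedged sketch: compare pi_k(x) (integers <= x with exactly k distinct
# prime factors) against Landau's asymptotic x/log x * (loglog x)^(k-1)/(k-1)!.
from math import log, factorial

def omega_sieve(limit):
    """w[n] = number of distinct prime factors of n, for 0 <= n <= limit."""
    w = [0] * (limit + 1)
    for p in range(2, limit + 1):
        if w[p] == 0:  # p has no smaller prime factor, so p is prime
            for m in range(p, limit + 1, p):
                w[m] += 1
    return w

x = 10**6
w = omega_sieve(x)
for k in range(1, 5):
    actual = sum(1 for n in range(2, x + 1) if w[n] == k)
    landau = x / log(x) * log(log(x)) ** (k - 1) / factorial(k - 1)
    print(f"k={k}: actual={actual}, Landau~{landau:.0f}")
```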

See also problem 168307 (Generalized PNT in the limit as numbers get large), which shows that in a sense even numbers have more prime factors than odd ones, and which cites a result of Ramanujan that the "normal order" of the number of prime factors of $n$, with or without repetitions, is about $\log\log n$.

For the last, see Ramanujan, Collected Works, p. 274. For the Generalized PNT see, for example, G. J. O. Jameson, The Prime Number Theorem, p. 145.

Edit in response to comment: This is similar to Chebyshev's bias, and it could well be that generalized primes have similar properties. I don't think the proposition in the OP has been proven either way, and even the proof of Chebyshev's bias is conditional.

Solution 2:

Besides the generalized PNT, which does get the point across, here is how to see that the number of integers with an even number of prime factors equals the number of integers with an odd number of prime factors, asymptotically (this method will let us generalize to possible extensions of the problem).

(This technique is similar to, say, the one on page 27 of the book Ten Lectures on the Probabilistic Method, so you can convince yourself that it can be made rigorous.)

Pick an integer $x$ uniformly at random from $1$ to $n$. For a prime $p$, let $X_p$ be the indicator that $p \mid x$, and let $X = \sum_p X_p$, where the sum is taken over all primes (so $X = \omega(x)$, the number of distinct prime factors of $x$). We will show that $E[(-1)^X] \to 0$ as $n$ goes to infinity, from which our claim follows (since this means that $X \equiv 0 \pmod 2$ and $X \equiv 1 \pmod 2$ are asymptotically equally likely).

First, note that as $n \to \infty$, we have $P(X_q = 1 \mid X_p = 1) = P(X_q = 1)$ for distinct primes $p$ and $q$, because $p$ and $q$ are of course coprime (take a minute to see why this is only approximately true when $n$ is finite, where the counts $\lfloor n/p \rfloor$, $\lfloor n/q \rfloor$, $\lfloor n/pq \rfloor$ involve rounding, but holds exactly in the limit), so $X_p \perp X_q$.

Thus, as $n \to \infty$, we have $E[(-1)^X] = E[(-1)^{\sum_p X_p}] = \prod_p E[(-1)^{X_p}]$ by independence. But of course $E[(-1)^{X_2}] = 1 - 2P(X_2 = 1) \to 0$, since $P(X_2 = 1) \to 1/2$.
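
Here is a quick Monte Carlo sketch of this expectation (my own illustration; it assumes sympy is available, and the sample size is an arbitrary choice):

```python
# Monte Carlo sketch: estimate E[(-1)^X] for x uniform on {1, ..., n},
# where X = omega(x) is the number of distinct prime factors of x.
import random
from sympy import primefactors

n = 10**6
samples = 20_000
est = sum((-1) ** len(primefactors(random.randint(1, n)))
          for _ in range(samples)) / samples
print(est)  # should be near 0, up to Monte Carlo noise ~ 1/sqrt(samples)
```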

The intuition here is that $2$ messes everything up (or makes it nice, depending on your view), and that the independence due to coprimality allows us to consider $2$ separately from everything else. Essentially, $2$ messes things up for the following reason. Regardless of anything else, our number $x$ has a prime factor of $2$ half the time, and not the other half of the time. So, intuitively, regardless of anything else we add an extra prime factor with probability $1/2$ and don't with probability $1/2$; that is, we change the parity of $X$ with probability $1/2$, just by virtue of having $X_2$ in the sum. This is like having a biased coin but flipping its outcome with probability $1/2$: see why, by doing this, we will always get heads with probability $1/2$ and tails with probability $1/2$, regardless of the original bias!
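
A tiny simulation of the coin analogy, as a sketch (the bias $p = 0.9$ is an arbitrary value of mine):

```python
# Sketch of the coin analogy: take a bit that is 1 with probability p (a
# biased coin), then flip it with probability 1/2; the XOR is fair for any p,
# just as X_2 symmetrizes the parity of X.
import random

p = 0.9  # arbitrary bias
trials = 100_000
heads = sum((random.random() < p) ^ (random.random() < 0.5)
            for _ in range(trials))
print(heads / trials)  # ~0.5 regardless of p
```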

If we restrict ourselves to not counting $2$ as a prime factor (perhaps by looking only at the odd numbers), then the expectation $E[(-1)^X]$ is related to the Euler product $\prod_{p\ge 3}E[(-1)^{X_p}] = \prod_{p\ge 3} \left(1-\frac{2}{p}\right)$ (to see why, note each factor is $P(X_p = 0) - P(X_p = 1) = (1-\frac{1}{p}) - \frac{1}{p}$), which gives us information on the proportion of numbers with an even/odd number of (non-$2$) prime factors.
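
For what it's worth, here is a sketch of the partial Euler product; note the partial products themselves keep drifting toward $0$ as the cutoff grows, since $\sum 2/p$ diverges:

```python
# Sketch: partial Euler product prod_{3 <= p <= P} (1 - 2/p), which the
# heuristic above relates to E[(-1)^X] when the prime 2 is excluded.
from sympy import primerange

for P in (10**2, 10**3, 10**4, 10**5):
    prod = 1.0
    for p in primerange(3, P + 1):
        prod *= 1 - 2 / p
    print(P, prod)
```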

Of course, there could be some bias involved, similar to Chebyshev's bias. Write $A(n)$ for the number of integers up to $n$ with an even number of distinct prime factors and $B(n)$ for those with an odd number. If, for example, the sequence $(-1)^{\omega(n)}$ went $-1,1,-1,1,-1,1,\dots$ then it would be true that $E[(-1)^{X}]=0$, but it would also be true that $B(n)>A(n)$ far more often than $A(n)>B(n)$.

So here is a graph of $A(n)-B(n) = \sum_{x\le n}(-1)^{\omega(x)}$ for the first 1,000,000 values of $n$:

[Graph: $A(n) - B(n)$ for the first 1,000,000 values of $n$]
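
For reproducibility, here is a sketch of the computation behind the graph (a sieve for $\omega$, then the running sum; the printed checkpoints are my own arbitrary choices):

```python
# Sketch reproducing the plotted quantity
# A(n) - B(n) = sum_{x <= n} (-1)^omega(x), with omega sieved directly.
N = 10**6
w = [0] * (N + 1)
for p in range(2, N + 1):
    if w[p] == 0:  # p has no smaller prime factor, so p is prime
        for m in range(p, N + 1, p):
            w[m] += 1

running = 0
for x in range(1, N + 1):
    running += (-1) ** w[x]
    if x in (10**3, 10**4, 10**5, 10**6):
        print(x, running)  # running value of A(x) - B(x)
```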

From this it looks like the statement we would want to prove is that (treating $x$ as a random variable on $[n]$) the quantity $A(x)-B(x) = \sum_{k \le x}(-1)^{\omega(k)}$ has expectation equal to $-\log(f(x))$ ($f$ going to infinity, of course) and very small relative variance, from which the claim would follow (and which looks to be true based on the graph). This should be doable with the probabilistic techniques above. The complication is that whereas before we could treat $x$ as a random variable inside $\omega(x)$, here $x$ acts as a random stopping time for the sum, and the $\omega$'s can no longer be treated as random: they are fixed, and the only random thing is when we stop summing. Put another way, the notion of randomness is "easily" captured when the randomness sits in the argument of our arithmetic function; but once we shift the randomness to a stopping time, the arithmetic function ceases to be random (each summand is non-random), and we have no good idea of how these summands behave, especially in a case as delicate as this, where we need to be exact in our computation of each.

The estimate that $\omega(n)$ is typically about $\log \log n$ (again, page 27 of the book) may be useful for some complex analysis, but it is not immediately obvious how to apply it, and I suck at complex analysis.
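
As a quick empirical look at that estimate (a sketch; the range up to $10^7$ and the sample size are arbitrary):

```python
# Sketch: average omega(n) over random n up to x, compared with log log x
# (the Hardy-Ramanujan normal order).
import random
from math import log
from sympy import primefactors

x = 10**7
vals = [len(primefactors(random.randint(2, x))) for _ in range(2000)]
print(sum(vals) / len(vals), log(log(x)))  # both should land near 2.8-3
```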

I'll update if I come up with something.

DISCLAIMER: I'm not a mathematician, not even close, so someone should check this, but it looks right...