Evaluating $\sum\limits_{n=1}^{\infty} \frac{1}{n\operatorname{ GPF}(n)}$, where $\operatorname{ GPF}(n)$ is the greatest prime factor

$\operatorname{ GPF}(n)=$Greatest prime factor of $n$, eg. $\operatorname{ GPF}(17)=17$, $\operatorname{ GPF}(18)=3$.

How to evaluate convergence/divergence/value of the sum

$$\sum_{n=1}^{\infty} \frac{1}{n\operatorname{ GPF}(n)}\,?$$


If $\{r_1,\dots,r_k,p\}$ is the set of primes $\le p$, then

$$\{n\in \mathbb{Z}: \operatorname{gpf}(n)=p\}=\{r_1^{e_1}\cdots r_k^{e_k}p^f:\text{each }e_i\ge 0,f\ge1\}.$$

Now we have the Euler product factorization

$$\sum_{e_1\,\ge0}\cdots\sum_{e_k\,\ge0}\sum_{f\ge1}\frac{1}{r_1^{e_1}\cdots r_k^{e_k} p^f}= \prod_{i=1}^k \left(1+\frac{1}{r_i}+\frac{1}{r_i^2}+\cdots\right)\cdot \left(\frac{1}{p}+\frac{1}{p^2}+\frac{1}{p^3}+\cdots\right) $$

$$\implies \bullet =\sum_p \frac{1}{p}\sum_{\operatorname{gpf}(n)=p}\frac{1}{n}=\sum_p \frac{1}{p}\left[\,\prod_{q\,< \,p}\left(1-\frac{1}{q}\right)^{-1}\right]\frac{1/p}{1-1/p}$$

$$=\sum_p \frac{1}{p(p-1)}\prod_{q<p}\left(1-\frac{1}{q}\right)^{-1}.$$

With Mertens' Theorem, we know that the $\prod$ above is $\le C\log p$ eventually, so after we ignore a finite sum corresponding to the $p$ before this "eventually" we may say that

$$\frac{\bullet}{C} + \zeta_P'(2)\le \sum_p \frac{\log p}{p}\left(\frac{1}{p-1}-\frac{1}{p}\right)\le \sum_p\frac{\log p}{p^2}=-\zeta_P'(2).$$

Here $\zeta_P$ is the prime zeta function. With monotonicity of the terms, this proves convergence.


Call the sum $S$, and assume that $\text{GPF}(1)$ is defined to be $1$. For $n\geq1$ and $p,q$ prime, $\text{GPF}(n)=p$ iff $p|n$ and $n$ is a product of primes $q\leq p$, so that, by the Euler product factorization (as noted by @anon), the sum of reciprocals of all such $n$ is $$ T_p= \sum_{\text{GPF}(n)=p} \frac{1}{n}= \frac{1}{p}\; \prod_{q \leq p}\; \sum_{i=0}^\infty q^{-i} = \frac{1}{p}\; \prod_{q \leq p} \left( 1-\frac{1}{q} \right)^{-1}. $$ Also (as @anon noted) by Mertens' third theorem, $p\;T_p\rightarrow e^\gamma\log p$ so that $T_p\rightarrow e^\gamma\frac{\log p}{p} \rightarrow 0$ as $p\rightarrow\infty$, with $T_p-e^\gamma\frac{\log p}{p}$ in fact changing sign infinitely often (Robin 1983) and $$ S = \sum_{n=1}^\infty \frac{1}{n\text{GPF}(n)} = 1 + \sum_{p} \frac{1}{p} T_p $$ must converge: $\sum\frac{T_p}{p}\sim \sum\frac{\log p}{p^2} <\sum\frac{\log n}{n^2}$ all converge by the limit comparison and integral tests since $\int_1^\infty\frac{logx}{x^2}dx= -\left|\frac{1+\log x}{x}\right|_1^\infty= 1$. Note that $$ R_p= \sum_{\text{GPF}(n)\leq p}\frac{1}{n\text{GPF}(n)}= 1+ \sum_{q\leq p}\; \sum_{\text{GPF}(n)=q} \frac{1}{nq}= 1+ \sum_{q\leq p} \frac{1}{q} T_q $$ monotonically increases to $S$ as $p\rightarrow\infty$, and that for $p,q$ and $r$ prime, $$ \sum_{\text{GPF}(n)>p}\frac{1}{n\text{GPF}(n)}= S-R_p= \sum_{q>p} \frac{1}{q} T_q= \sum_{q>p} \frac{1}{q^2} \prod_{r\leq q} \left( 1-\frac{1}{r} \right)^{-1}. $$ Furthermore, note that multiplying by $(1-\frac{1}{p})$ on the right above has the effect of removing the $r=p$ term from the product, thus eliminating all terms with $p\mid n$ from the summation on the left. Thus $$ S-R_p+\frac{1}{p}T_p= \sum_{\text{GPF}(n)\geq p}\frac{1}{n\text{GPF}(n)}= S_p+\sum_{ \begin{matrix} \text{GPF}(n)>p \\p\nmid n \end{matrix} } \frac{1}{n\text{GPF}(n)} $$ $$ =S_p+ \left(1-\frac{1}{p}\right) \sum_{q>p}\frac{1}{q}T_q =S_p+ \left(1-\frac{1}{p}\right) \left(S-R_p\right) $$ $$ \implies\quad 0 < S - R_p = p S_p - T_p \rightarrow 0 \quad\text{as}\quad p\rightarrow\infty $$ where $S_p$ is the sum of terms from $S$ for which $n$ is divisible by the prime $p$: $$ S_p= \sum_{p|n}\frac{1}{n\text{GPF}(n)}= \sum_{n=1}^{\infty}\frac{1}{pn\text{GPF}(pn)}. %ERROR: %= %\frac{1}{p^2}- %\frac{1}{p}+ %\sum_{n=1}^{\infty}\frac{1}{pn\text{GPF}(n)} %= %\frac{S}{p}-\frac{p-1}{p^2}. $$

To check this, we might want the first few values of $T_p$: $T_2=T_3=1$, $T_5=\frac{3}{4}$, $T_7=\frac{5}{8}$, $T_{11}=\frac{7}{16}$, $T_{13}=\frac{77}{192}$ & $T_{17}=\frac{1001}{3072}$, along with the following recursion, which follows from the formula above for $T_p$: if $p_k$ is the $k^{\text{th}}$ prime, then $$ \frac{T_{p_{k+1}}}{T_{p_k}} = \frac{p_k}{p_{k+1}-1}. $$ Working in sage (online), we could calculate and plot a rough lower bound for $S$ as follows:

P = Primes()
p = P.first()
T = 1; R = 3/2 # R = R_2 = 1 + T_2/2
x = []; y = []
for k in range(1,10000):
    q = p
    p = P.next(p)
    T = T * q/(p-1)
    R = R + (T/p).n(digits=16)
    x.append(log(p).n(digits=6))
    y.append(R)
print 'p = %3d T =' % p, T.n(digits=12), 'R =', R.n(digits=12)
print 'T ~', (e^euler_gamma*log(p)/p).n(digits=12)
list_plot(zip(x,y))

p = 104729 T = 0.000196636242440 R = 2.25441837672
T ~ 0.000196580221393

$R_p$ versus $\log p$ for primes $2$ through $10000$

where we plot the partial sums $R_p$ versus $\log p$ for $p\in\{p_2=3,\dots,p_{10000}=104729\}$. The last computed term (with $p=104729$) has contributed about $T_p/p \approx 0.000196636/104729 \approx 2\times 10^{-9}$ to $R_p \approx 2.2544184$, so the convergence seems adequate to offer the estimate as a fairly good lower bound. We also compare the computed (exact) value of $T_p=0.00019663624$ with the asymptotic estimate, $e^\gamma\frac{\log p}{p}=0.000196580221393$.

Notice that the plot above appears to repeatedly cross its "best fit" continuous approximation. This observation corresponds to the 1983 result of Guy Robin, and can be made more precise, yielding in the process another, presumably course analytic estimate for $S$ by way of the asymptotic Merten approximation $T_p \approx e^\gamma\frac{\log p}{p}$, whose corresponding sum can also be expressed analytically in terms of the prime zeta function: $$ S=1+\sum_p\frac{1}{p}\prod_{q\leq p}\left(1-\frac{1}{q}\right)^{-1} =1+\sum_p\frac{T_p}{p} \quad\implies $$ $$ S \approx 1-e^\gamma\zeta_{P}'(2) = 1+e^\gamma\sum_p\frac{\log p}{p^2} = 1+e^\gamma\sum_{k=1}^\infty\frac{\log p_k}{p_k^2}, \quad\text{where} $$ $$ \zeta_P(s) = \sum_p p^{-s} \implies \zeta_{P}'(s) = -\sum_{p}\frac{log p}{p^s} \quad\text{since}\quad \frac{d}{ds}e^{-s\log p} = e^{-s\log p}(-\log p). $$ Called from sage, mpmath calculates $\zeta_{P}'(2)=-0.93754825431584377$, yielding the "Merten" estimate $S\approx 1-e^\gamma\zeta_{P}'(2)=2.669841336296809$:

import mpmath; mpmath.zeta(2,1,1)
S_Merten = 1 + e^euler_gamma * 0.93754825431584377
S_Merten.n(digits=16) # -> 2.669841336296809

This in turn can be analytically approximated by transforming the discrete variables $k$ and $p=p_k$ to continuous variables $t=\pi(x)$ and $x$, respectively, where $\pi(x)$ is the prime-counting function. Using the logarithmic integral approximation for $\pi(x)$, the fundamental theorem of calculus (FTOC) and integration by substitution, we have: $$ t=\pi(x) \approx \text{li}(x)=\int_0^x\frac{dt}{\log t} \quad\implies\quad dt\approx\frac{dx}{\log x} \qquad\text{(FTOC)} $$ $$ \frac{\log p_k}{p_k^2} \approx \int_{t=k-.5}^{t=k+.5} \frac{\log x}{x^2}dt \quad \implies \quad S \approx 1 + e^\gamma \int_{\text{li}^{-1}(.5)}^\infty \frac{\log x}{x^2} \frac{dx}{\log x} = 1 + \frac{e^\gamma}{{\text{li}^{-1}(.5)}} $$

x=var('x'); c=find_root(Ei(log(x))==.5, 1, 10)
S=1+e^euler_gamma/c; [c, S.n(digits=12)]

Sage reports $\text{li}^{-1}(.5)\approx 1.6719305730098757$ and $S\approx 1+0.598111e^\gamma$ $=2.06527893367$, which is surprisingly accurate given the inherent inaccuracy in the approximation $\pi(x)=\text{li}(x)$, $$ \frac{\text{li}(x)-\pi(x)}{\pi(x)} \approx \log x \cdot \exp\left(-\frac{\sqrt{\log x}}{15}\right), $$ and given the initial negative bias in the Merten approximation for $T_p$. However, $T_p-e^\gamma\frac\log{p}{p}$ changes sign infinitely often (Robin 1983), as does $\pi(x)-\text{li}(x)$ (Littlewood 1914), which exhibits quite a delay before the first sign change.

$S-1$ also appears to be rather close to the number $1.25506$ offered (apparently without citation here) as a bound on the supremum of $\frac{\pi(x)\log(x)}{x}$ for $x>1$.

Robin, G. (1983). "Sur l’ordre maximum de la fonction somme des diviseurs". Séminaire Delange–Pisot–Poitou, Théorie des nombres (1981–1982). Progress in Mathematics 38: 233–244.

Littlewood, J. E. (1914), "Sur la distribution des nombres premiers", Comptes Rendus 158: 1869–1872


Let $p_k$ be the $k$-th prime number. $$ \sum_{n\ge1}\frac{1}{n\operatorname{GPF}(n)}=\sum_{k\ge1}\frac{1}{p_k}\sum_{\operatorname{GPF}(n)=p_k}\frac{1}{n}. $$ If $\operatorname{GPF}(n)=p_k$, then $n=p_1^{i_1}\dots p_{k-1}^{i_{k-1}}\,p_k^{i_k}$ with $i_j\ge0$, $1\le j<k$ and $i_k\ge1$. It folows that $$ \begin{align*} \sum_{\operatorname{GPF}(n)=p_k}\frac{1}{n}&=\Bigl(\sum_{i_1\ge0}p_1^{-i_1}\Bigr)\dots\Bigl(\sum_{i_{k-1}\ge0}p_{k-1}^{-i_{k-1}}\Bigr)\Bigl(\sum_{i_k\ge1}p_k^{-i_k}\Bigr)\\ &=\frac{1}{p_k}\,\prod_{i=1}^k\Bigl(1-\frac{1}{p_i}\Bigr)^{-1}. \end{align*} $$ From Merten's theorem $$ \prod_{i=1}^k\Bigl(1-\frac{1}{p_i}\Bigr)^{-1}\sim e^\gamma\,\log p_k\ , $$ and the original series has te same character as $$ \sum_{k\ge1}\frac{\log p_k}{p_k^2} , $$ which is convergent.


I've attempted to put a numerical lower bound on the sum. Let $S=\sum_{j=1}^\infty 1/jGPF(j)$, where I assume $GPF(1)=1$. Let $r_i$ be the $i$th prime. Then using the geometric series, we have

$S=1+\displaystyle\sum_{\{e_i\}} r_n^{-2}\left[\frac{1}{1-r_n^{-1}}\right]\prod r_i^{-e_i}$

where the sum is over sets of exponents $\{e_i\}$ such that the highest $i$ occurring is $k$, where $0 \le k < n$. The idea here is that most big contributions to $S$ come from terms with fairly small $k$. Let $m=\sum e_i$. Then

$S \ge 1+\displaystyle\sum_{n=1}^\infty r_n^{-2}\left[\frac{1}{1-r_n^{-1}}\right] \left[1+\sum_{m=1}^\infty \sum_{k=1}^{n-1} r_k^{-m} \sum_{\{e_i\}} 1\right]$ ,

where the 1 term before the sum over $m$ is to account for the $m=0$, $k=0$ term. Now let $B(m,k)$ be the number of partitions of $m$ into $k$ nonnegative terms, with order counted as significant, and with the final term being nonzero. Then $B(m,k)={{m+k-1} \choose k}\ge (1+(m-1)/k)^k$, so

$S \ge 1+\displaystyle\sum_{n=1}^\infty r_n^{-2}\left[\frac{1}{1-r_n^{-1}}\right] \left[1+\sum_{m=1}^\infty \sum_{k=1}^{n-1} r_k^{-m}\left(1+\frac{m-1}{k}\right)^k\right]$

The following code, written in the open-source programming language Yacas, is intended to evaluate this expression:

n_max := 100;
m_max := 100;
prec := 8; /* digits of precision */
n := 1;
rn := 1;
s := N(1,prec); /* take GPF(1)=1, so first term=1 */
While (n<=n_max) [
  u := N(1,prec); /* 1=contribution from m=0, k=0 */
  rn := NextPrime(rn); /* rn=nth prime */
  m := 1;
  While (m<=m_max) [
    k := 1;
    rk := 1;
    While (k<n) [
      rk := NextPrime(rk); /* rk=kth prime */
      u := N(u+(1+(m-1)/k)^k*rk^-m,prec);
      k := k+1;
    ];
    m := m+1;
  ];
  sn := N(u*rn^-2*(1/(1-1/rn)),prec);
  s := N(s+sn,prec);
  Write(n,sn,s); NewLine();
  n := n+1;
];
Write(s); NewLine();

Summing up to maximum $n$ and $m$ values of 100 gives $S\ge 1.39$, which is in agreement with, but poorer than, the estimates by J.M. and bgins.

I've caught and corrected several errors in this after I initially posted it. Maybe there are more...