Consider the notorious Collatz function $$ T(n) = \begin{cases}(3n+1)/2&\text{ if $n$ is odd,}\\n/2&\text{ if $n$ is even.}\end{cases} $$

One of the most important acceleration techniques of the convergence test is the usage of a sieve (test $k$ least significant bits of $n$, the sieve has the size of $2^k$ entries), and test only those numbers that do not join the path of a lower number in $k$ steps. This technique is greatly explained, e.g., here or here.

For example, consider the sieve for $k=2$ and particularly the numbers of the form $4n+1$ which join the path of $3n+1$ in two steps. Their path is $$ 4n+1 \rightarrow 6n+2 \rightarrow 3n+1 \text{.}$$

What I don't understand is how this can be used to search for the highest number occurring in the sequence (path records in the terminology of Eric Roosendaal). The sieve cuts the calculation before the computation of any intermediate value (which can actually be the maximum, like the value $6n+2$ in the above example). How can I detect that $4n+1$ does lead to a maximum if no $6n+2$ is computed? Testing the path of $3n+1$ no longer makes sense since the maximum $6n+2$ occurs before this term. Am I missing something?


Solution 1:

Quote: "As $k$ increases, the search only needs to check those residues $b$ that are not eliminated by lower values of $k$"

Take residue 15 for instance. It survives $\mod 2^5$ but is eliminated while sieving $2^7$ so any value $x\equiv 15 \mod 2^7$ will not be searched anymore for $k>7$

Residue 15 was eliminated because it reached a lower value then himself $\mod 2^7$. It means that these numbers can't reach higher values, later with $k>7$, that were not reached (with a smaller $k$) by the lower value they just hit.

Solution 2:

(Notation: residue $n_0\mod 2^{\lceil i \log_23\rceil}$ = residue $b\mod2^k$ from your wiki page)

About the "discarded" 5 reaching maximum 8 (or 16), already reached by "surviving" 3:

  • One of the discarded sequence is the inverse V-Shape sequence which rise for $i$ steps of $f(x)=\frac{3x+1}{2}$ and then fall bellow the initial value by successive division by $2$ (See here). Of all discarded sequences $2^{\lceil i \log_23\rceil}n+n_0$ for a specific $n$, this is the type of sequence that potentially reaches the highest value: $$(2^{\lceil i \log_23\rceil}n+n_0+1)\frac{3^i}{2^{i}}-1$$

Note: $n_0\leq 2^{\lceil i \log_23\rceil}-3$ and the exact value can be found in the link above

e.g. with $4n+1=5$ where $n_0=1$, $i=1$,$n=1$ which reaches $8$ before dropping to $4<5$

  • One of the surviving sequence is the straight line which rise for the whole $k={\lceil i \log_23\rceil}$ steps of $f(x)=\frac{3x+1}{2}$. Of all surviving sequences for a specific $n$, this is the sequence (starting from $2\cdot2^{\lceil i \log_23\rceil}n-1$) that reaches the highest value (limited to $k={\lceil i \log_23\rceil}$ steps): $$3^{\lceil i \log_23\rceil}(n+1)-1$$

Note: here we always have $n_0= 2^{\lceil i \log_23\rceil}-1$

e.g. with $4n+3=7$ where $i=1$,$n=1$ which reaches $17$ (in 2 steps), or with $n=0$: $3$ reaches $8$

Now it is easy to show that the highest value that can be reached by a discarded sequence at $n$ is smaller (or equal) than the highest value already reached by a surviving sequence at $n-1$

e.g with discarded $4(1)+1=5$ reaches $8$ which was already reached by surviving $4(1-1)+3=3$

Surviving highest value at $n-1$ is greater then discarded value at $n$?

$$3^{\lceil i \log_23\rceil}n-1 \geq (2^{\lceil i \log_23\rceil}n+n_0+1)\frac{3^i}{2^{i}}-1$$ and with $n_0< 2^{\lceil i \log_23\rceil}-1$, we just need to show that $$3^{\lceil i \log_23\rceil}n-1 \geq (2^{\lceil i \log_23\rceil}(n+1))\frac{3^i}{2^{i}}-1$$ $$\Big(\frac{3}{2}\Big)^{\lceil i \log_23\rceil}n \geq \Big(\frac{3}{2}\Big)^i(n+1)$$ $$\Big(\frac{3}{2}\Big)^{\lceil i \log_2\frac{3}{2}\rceil} \geq 1+\frac{1}{n}$$ which is already true for $n-1=0$ when $i\geq 3$ (manually checked for $i=1$ and $i=2$ by using the exact value of $n_0$ in those cases)

e.g. with $n-1=0$: discarded $32n+23$ reaches $188$ but surviving $32(n-1)+31$ already reached $242$

Note: you can multiply both side by 2 to get the "real" maximum (16 instead of 8).

The key idea is that even if the discarded inverse V-Shape at $n$ was at the highest possible residue $n_0= 2^{\lceil i \log_23\rceil}-3$, it would reach a smaller value than the straight line at $n-1$ (always with residue $n_0= 2^{\lceil i \log_23\rceil}-1$).

This means that record paths are always found in residue $b\mod2^k$ (in other word, at $2^k\cdot n+b$ with $n=0$)

EDIT:

even more, when sieving $2^{k+1}$: values below $2^k$ that are dropping cannot produce new path records (obviously), but value above $2^k$ that are not surviving after $2^{k+1}$ sieve are now known, and there maximum is still the RHS above: indeed the condition $n_0+2^{\lceil i \log_23\rceil}< 2^{\lceil i \log_23\rceil+1}-1$ or $n_0< 2^{\lceil i \log_23\rceil}-1$ do not change, and the value of $i$ (climbing steps) neither since the last step was a drop bellow initial value.

So even if the max value on the LHS do not climb anymore at step $k+1$, it would still be higher (the whole equation would stay the same).

This means that new record paths are only found in surviving residue $b\mod2^k$

No need to check discarded residue at all, even within the sieve range.