approximate greatest common divisor

Solution 1:

I made a similar question here, where I propose a partial solution.

How to find the approximate basic frequency or GCD of a list of numbers?

In summary, I came with this

  • being $v$ the list $\{v_1, v_2, \ldots, v_n\}$,
  • $\operatorname{mean}_{\sin}(v, x)$ $= \frac{1}{n}\sum_{i=1}^n\sin(2\pi v_i/x)$
  • $\operatorname{mean}_{\cos}(v, x)$ $= \frac{1}{n}\sum_{i=1}^n\cos(2\pi v_i/x)$
  • $\operatorname{gcd}_{appeal}(v, x)$ = $1 - \frac{1}{2}\sqrt{\operatorname{mean}_{\sin}(v, x)^2 + (\operatorname{mean}_{\cos}(v, x) - 1)^2}$

And the goal is to find the $x$ which maximizes the $\operatorname{gcd}_{appeal}$. Using the formulas and code described there, using CoCalc/Sage you can experiment with them and, in the case of your example, find that the optimum GCD is ~100.18867794375123:

testSeq = [399, 710, 105, 891, 402, 102, 397]
gcd = calculateGCDAppeal(x, testSeq)
find_local_maximum(gcd,90,110)
plot(gcd,(x, 10, 200), scale = "semilogx")

plot of GCD appeal

Solution 2:

One approach: take the minimum of all your numbers, here $102$ as the first trial. Divide it into all your other numbers, choosing the quotient that gives the remainder of smallest absolute value. For your example, this would give $-9,2,3,-27,-6,0,-11$ The fact that your remainders are generally negative says your divisor is too large, so try a number a little smaller. Keep going until the remainders get positive and larger. Depending on how the errors accumulate, you might also add up all the numbers and assume the sum is a multiple of the minimum interval. Here your numbers add to $3006$, so you might think this is $30$ periods of $100.2$ Are your periods constrained to integers?

If you have a stubbornly large remainder, you can think that the smallest number is not one interval but two. You might have a number around $150$, so the fundamental period is $50$, not $100$. The challenge will be that if $100$ fits, any factor will fit as well.

Solution 3:

This problem is actually not new. It is usually called the ACD problem (approximate common divisor problem) or the AGCD problem (approximate greatest common divisor) and there exist several algorithms to solve it.

Although no algorithm is efficient in general, in your case, since the integers are tiny, you can simply try to eliminate the noise and compute the gcd.

Namely, you want to recover $p$ given many values of the form $$x_i = pq_i + r_i$$ where $|r_i|$ is very small, say, smaller than $50$.

Then, you can take two of those values, say, $x_1$ and $x_2$, and compute $d_{a, b} = \gcd(x_1 - a, x_2 - b)$ for all $a, b \in [-50, 50]\cap\mathbb{Z}$. Notice that this step costs only $50^2$ gcds computations, which is very cheap for any computer.

It is guaranteed that one $d_{a,b}$ is equal to $p$. Thus, to check which one is the correct one, just compute the "centered modular reduction" $r_i' := x_i \bmod d_{a,b} \in [-d_{a,b}/2,~ d_{a,b}/2]\cap\mathbb{Z}$ for $i \ge 3$.

If $d_{a,b} = p$, then each $r_i'$ equals $r_i$, therefore, all the values $r_i'$ that you get are small (e.g., smaller than 50). Otherwise, the values $r_i'$ will look randomly distributed in $[-d_{a,b}/2,~ d_{a,b}/2]\cap\mathbb{Z}$.