Counting minimum elements needed such that their sum covers the whole finite space.

In $\mathbb{F}_p$, how to find a subset with smallest cardinality such that the sum between its pairs cover $\mathbb{F}_p\setminus\{0\}$. So for a subset with cardinality $n$ there are ${n\choose 2}$ pairs and correspondingly ${n\choose 2}$ sums. I want $n$ to be smallest such that those ${n\choose 2}$ sums cover $\mathbb{F}_p\setminus\{0\}$.

For example in $\mathbb{F}_7$, let $A=\{0,1,2,4\}$, The sums between its pairs covers $\mathbb{F}_7\setminus\{0\}$ since \begin{align*} 0+1=1,0+2=2,1+2=3,0+4=4,1+4=5,2+4=6\;. \end{align*} I want to know if over large $p$ similar set can be easily constructed and what would be the size of that set (I expect it to be close to $\sqrt{2p}$). Any similar literature is also appreciated.


Solution 1:

You can solve the problem via integer linear programming as follows. For $i\in \{0,\dots,n-1\}$, let binary decision variable $x_i$ indicate whether $i$ is selected. For $0 \le i < j \le n-1$, let binary decision variable $y_{i,j}$ represent the product $x_i x_j$. The problem is to minimize $\sum_i x_i$ subject to \begin{align} \sum_{\substack{0 \le i < j \le n-1:\\ i + j = k}} y_{i,j} &\ge 1 &&\text{for $k \in \{1,\dots,n-1\}$} \tag1 \\ y_{i,j} &\le x_i &&\text{for $0 \le i < j \le n-1$} \tag2 \\ y_{i,j} &\le x_j &&\text{for $0 \le i < j \le n-1$} \tag3 \end{align} Constraint $(1)$ forces each sum $k$ to be covered. Constraints $(2)$ and $(3)$ enforce the logical implication $y_{i,j} \implies (x_i \land x_j)$.

The values for $n \in \{1,\dots,50\}$ are $$ \begin{matrix} 0 &2 &3 &3 &4 &4 &4 &5 &5 &5\\ 6 &6 &6 &6 &7 &7 &7 &7 &8 &8 \\ 8 &8 &8 &9 &9 &9 &9 &9 &10 &10 \\ 10 &10 &10 &10 &11 &11 &11 &11 &11 &11\\ 11 &12 &12 &12 &12 &12 &12 &12 &13 &13 \end{matrix} $$ enter image description here

If you change the condition to $(i+j) \mod n = k$ in constraint $(1)$, the values for $n \in\{1,\dots,50\}$ are instead \begin{matrix} 0 &2 &3 &3 &4 &4 &4 &5 &5 &5 \\ 5 &6 &6 &6 &7 &7 &7 &7 &7 &7 \\ 8 &8 &8 &8 &8 &9 &9 &9 &9 &9 \\ 9 &10 &10 &10 &10 &10 &10 &11 &11 &11 \\ 11 &11 &11 &11 &12 &12 &12 &12 &12 &12 \end{matrix} enter image description here enter image description here

Solution 2:

Nice question! First, as in RobPratt's answer I'll generalize to the non-prime case. Rob discusses two variants of the question both of which seem quite interesting to me: the first is to find a set of non-negative integers of minimal size whose distinct pairwise sums cover $\{ 1, \dots n-1 \}$, and the second is the same question but working $\bmod n$. For lack of better notation let me call these $f_{+}(n)$ and $f_{cyc}(n)$. Neither of them appear to be in the OEIS but A046693 has similar values and is related, answering a similar question but about pairwise differences rather than sums; "perfect ruler" may be a useful keyword here. If we allow sums of the same element then we are studying the sumset from additive combinatorics and there may be useful tools there also.

We clearly have $f_{cyc}(n) \le f_{+}(n)$. Interestingly we also have $f_{+}(n) \le f_{+}(n-1) + 1$ since if we can cover $\{ 1, \dots n-1 \}$ we need at most one new integer to cover $\{ 1, \dots n \}$, suggesting that the integer version of the problem may have more usable inductive structure. The "information-theoretic lower bound" is

$$f_L(n) \le f_{cyc}(n) \le f_{+}(n)$$

where $f_L(n)$ is the smallest nonnegative integer $k$ such that ${k \choose 2} \ge n-1$, or equivalently $(2k - 1)^2 \ge 8n - 7$, hence

$$f_L(n) = \left\lceil \frac{\sqrt{8n-7} + 1}{2} \right\rceil \approx \sqrt{2n}.$$

(The above formula doesn't work for $n = 1$.) $f_L(n)$ is just the sequence $0, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, \dots $ which, after the somewhat degenerate value $0$, takes the value $k$ a total of $k-1$ times. This lower bound is achievable iff it is possible to find a subset of the above sum whose distinct pairwise sums cover $\{ 1, \dots n-1 \}$ exactly once each. According to Rob's data the first point at which $f_L(n) < f_{+}(n)$ is $n = 11$, where $f_L(11) = 5$ but $f_{+}(11) = 6$, and the first point at which $f_L(n) < f_{cyc}(n)$ is $n = 15$, where $f_L(15) = 6$ but $f_{cyc}(15) = 7$.

Here is the best upper bound I have so far. The construction is quite simple but all of the more complicated constructions I've come up with so far have done worse!

Proposition:

$$f_{+}(n) \le \text{min} \left( 2 \lceil \sqrt{n} \rceil - 1, 2 \left\lceil \frac{ \sqrt{4n+1} - 1}{2} \right\rceil \right).$$

Proof. Let $k, \ell$ be positive integers. Every integer in the interval $[0, k \ell - 1]$ can be written uniquely in the form $ak + b$ where $0 \le b \le k-1, 0 \le a \le \ell - 1$. It follows that if we take $S = \{ ak : 0 \le a \le \ell - 1 \} \cup \{ b : 0 \le b \le k - 1 \}$, which has size $k + \ell - 1$, then distinct pairwise sums of elements of $S$ cover $\{ 1, \dots k \ell - 1 \}$.

So we want to chose $k$ and $\ell$ such that $k + \ell - 1$ is as small as possible subject to the constraint that $k \ell \ge n$. Standard arguments imply that this means we want to choose $k$ and $\ell$ to be as close as possible. If we take $k = \ell = \lceil \sqrt{n} \rceil$ then this gives $f_{+}(n) \le 2 \lceil \sqrt{n} \rceil - 1$, which was the bound I had previously written down here.

But actually it is possible to do a little better than this for some values of $n$ by taking $\ell = k-1$. Then $k(k-1) \ge n$ gives $(2k-1)^2 \ge 4n+1$ which gives

$$k = \left\lceil \frac{ \sqrt{4n+1} + 1}{2} \right\rceil$$

and hence

$$f_{+}(n) \le 2k-2 = 2 \left\lceil \frac{ \sqrt{4n+1} - 1}{2} \right\rceil.$$

which is occasionally better, e.g. when $n = 50$ (the worst case for the $2 \lceil \sqrt{n} \rceil - 1$ bound, one larger than a perfect square) the original bound gives $15$ but this bound gives $14$. With a little more effort we should be able to characterize when one bound or the other is better. $\Box$

So, in summary, we have

$$\boxed{ \left\lceil \frac{\sqrt{8n-7} + 1}{2} \right\rceil \le f_{cyc}(n) \le f_{+}(n) \le \text{min} \left( 2 \lceil \sqrt{n} \rceil - 1, 2 \left\lceil \frac{ \sqrt{4n+1} - 1}{2} \right\rceil \right) }.$$

Rob's data implies that neither the upper nor the lower bound is sharp, although for $2 \le n \le 50$ it does give that the upper bound is sharp for $f_{+}(n)$ when $n$ or $4n+1$ is a perfect square. I am very curious if this will continue to be true. (Edit: Rob checked in the comments and it fails for $n = 64$. Welp!) For easier comparison with Rob's data here's a table of the upper bound for $n \in \{ 1, \dots 50 \}$, with points where it differs from $f_{+}(n)$ highlighted in red (in each case it's off by $1$):

$$ \begin{matrix} \color{red}{1} &2 &3 &3 &4 &4 & \color{red}{5} &5 &5 & \color{red}{6}\\ 6 &6 & \color{red}{7} & \color{red}{7} &7 &7 &\color{red}{8} &\color{red}{8} &8 &8 \\ \color{red}{9} & \color{red}{9} & \color{red}{9} &9 &9 & \color{red}{10} & \color{red}{10} & \color{red}{10} &10 &10 \\ \color{red}{11} & \color{red}{11} & \color{red}{11} & \color{red}{11} &11 &11 & \color{red}{12} & \color{red}{12} & \color{red}{12} & \color{red}{12}\\ \color{red}{12} &12 & \color{red}{13} & \color{red}{13} & \color{red}{13} & \color{red}{13} & \color{red}{13} & \color{red}{13} &13 & \color{red}{14} \end{matrix} $$

And here's a graph generated in Google Sheets:

enter image description here

It would be good to have empirical evidence for larger values of $n$. I don't know if Rob's approach will give, say, $n = 100$ or $n = 1000$, but I think larger values of $n$ should be approachable by just generating subsets randomly and checking their pairwise sums; for subsets close to the optimal bound this should work with high probability. The probabilistic method was the first thing I tried to write down a bound but I couldn't make it work.

As a concrete challenge I'd love to see if anyone could offer better upper or lower bounds than the above for $n = 100$ or $n = 1000$. You could take $n = 101$ or $n = 1009$ if you want to stick to primes. The lower and upper bounds here are, respectively,

$$15 \le f_{cyc}(100) \le f_{+}(100) \le 19$$ $$15 \le f_{cyc}(101) \le f_{+}(101) \le 20$$ $$46 \le f_{cyc}(1000) \le f_{+}(1000) \le 63$$ $$46 \le f_{cyc}(1009) \le f_{+}(1009) \le 63.$$

Edit: Lots of stuff including some references in this very closely related but not entirely identical question, which considers sumsets and so allows non-distinct sums, and also asks to cover $0$; this is arguably more natural (e.g. it gives the problem more symmetries). These are called sum covers or finite additive bases. Fitch and Jamison's Minimum sum covers of small cyclic groups opens with more or less the same bounds as above, considers both distinct and non-distinct sums, and has a table of values and of optimal covers for $3 \le n \le 54$.

Solution 3:

In this thread I answered a similar question for sumsets. Some of results mentioned in the thread can be applicable to this question. Namely, lower size sumsets bounds for $\Bbb Z$ provide lower bounds for $f_+(n)$ (see the beginning of Qiaochu Yuan’s answer for the definition of $f_+(n)$).:

it seems that in this direction was mainly investigated the Z counterpart of the problem, for which is devoted a lot of papers, in which the lower bound was more and more improved by complicated techniques, see, for instance [Mos], [GN], [Hor], [Yu].

I am checking the mentioned papers, but I already can say that the results of [Mos] imply that $f_+(n)>\sqrt{\tfrac {2n-2}{1-.0197}}$ for sufficiently large $n$.

I also found that another my colleague, Oleg Pikhurko dealt with a related stuff in his paper [Pih]. In particular, he noted that Mrose [Mro] constructed a set $S\subset [0, 10t^2+8t]$ of size $7t + 3$ such that $S + S\supset [0, 14t^2 + 10t − 1]$. In fact, $S$ is the union of five disjoint arithmetic progressions, see the description at p. 6 of Pikhurko’s paper.

Since sums of distinct elements of $S$ cover the discrete segment $[0, 14t^2 + 10t − 1]$ but a number $8t^2+4t-2$, see the last paragraph of [Pih, p.6], we can add one element to the set $S$ and obtain a set $S’$ such that sums of distinct elements of $S’$ cover the discrete segment $[0, 14t^2 + 10t − 1]$, which follows $f_+(14t^2 + 10t)\le 7t +4$.

This fact provides an upper bound for $f_+(n)$ as follows. Pick the smallest natural $t$ such that $14t^2+10t\ge n$. Then $n\ge 14(t-1)^2+10(t-1)+1=14t^2-18t+5$. Solving the respective quadratic equation, we see that for $n\ge 4$ we have $t\le\tfrac {9+\sqrt{56n-199}}{28}$ and so $$f_+(n)\le 7t+4\le \frac{25+\sqrt{56n-199}}{4}.$$

References

[GN] C. Sinan Güntürk, Melvyn B. Nathanson A new upper bound for finite additive bases, Acta Arithmetica 124 (2006), 235-255.

[Hor] G. Horváth An improvement of an estimate for finite additive bases, Acta Arithmetica 130 (2007), 369-380.

[Mos] L. Moser On the representation of $1,2,\dots, n$ by sums, Acta Arithmetica 6 (1960), 11-13.

[Pih] Oleg Pikhurko Dense Edge-Magic Graphs and Thin Additive Bases.

[Yu] Gang Yu Upper bounds for finite additive $2$-bases, Proc. Amer. Math. Soc. 137 (2009), 11-18.