maximize $\sum_{A\subseteq [q], A\neq \emptyset} \alpha_A \log(|A|)$ with nonlinear constraints

Let $[q]:=\{1,2,3, \ldots,q\}$, where $q$ is a positive integer. Consider a vector $\underline{\alpha}=(\alpha_A)_{A\subseteq [q], A\neq \emptyset}$, where each $\alpha_A \in \mathbb{R}$. Note that such a vector $\underline{\alpha}$ has $2^q-1$ entries.

Given such an $\underline{\alpha}$, we define the following:

$$ OBJ(\underline{\alpha}):=\sum_{A\neq \emptyset} \alpha_A \log(|A|), \quad v(\underline{\alpha})=\sum_{A\neq \emptyset} \alpha_A, \quad E(\underline{\alpha})=\sum_{\{ \{A,B\}: A,B \subseteq [q], A\cap B=\emptyset \}} \alpha_A \alpha_B, $$

where the sum in the definition of $E(\underline{\alpha})$ is taken over all unordered pairs of disjoint subsets of $[q]$.

Define $FEAS(1/4)=\{\underline{\alpha}: \alpha_A \geq 0 \text{ for all nonempty A },\, v(\underline{\alpha})=1, \, E(\underline{\alpha})\geq 1/4 \}$.

I believe that the following is true: $$OPT(1/4):= \max_{\underline{\alpha} \in FEAS(1/4) } OBJ(\underline{\alpha})=\frac{\log(\lfloor q/2 \rfloor \cdot \lceil q/2 \rceil)}{2}. $$

I know that it is true when $q$ is even (it was proven), but I want to show that it also holds for odd $q$. I have verified that this is true for $q=3,5,7,9$ using SageMath, but would like to prove it by hand for general odd $q$.

I think that this problem can be solved using the language of probability. It is natural to do so given that the sum $v(\underline{\alpha})=\sum \alpha_A=1$ in the set $FEAS(1/4)$.

Consider a random variable $X$ on $2^{q}\setminus \{\emptyset\}$ (the power set of $[q]$, excluding the empty set) such that $\mathbb{P}[X=A]=\alpha_A$, where $A \subseteq [q]$.

Note that $2\cdot E(\underline{\alpha})$ can be interpreted as the probability that from the sets in $2^{[q]}\setminus \{\emptyset\}$ one selects two disjoint sets $A$ and $B$. Since by definition of $FEAS(1/4)$, $2\cdot E(\underline{\alpha}) \geq 1/2$, we see that we are more likely to select two disjoint sets rather than two sets which have a nonempty intersection.

One can verify that the set $FEAS(1/4)$ is compact, so there exists some $\underline{\alpha}^*\in FEAS(1/4)$ for which $F(\underline{\alpha}^*)=\max_{\underline{\alpha} \in FEAS(1/4) } F(\underline{\alpha})$. I conjecture that this vector $\underline{\alpha}^*$ has exactly 2 nonzero entries $\alpha^*_{A_1}=1/2$ and $\alpha^*_{A_2}=1/2$, where $|A_1|=\lfloor q/2 \rfloor$, $|A_2|=\lceil q/2 \rceil$, $A_1\cap A_2 =\emptyset$, and $A_1\cup A_2=[q]$. That is, the sets $A_1$ and $A_2$ form a partition of the first $q$ positive integers and their sizes are as equal as possible.

If it is difficult to prove this for general odd $q$, how would I prove it for, say, $q=5$? I would like to avoid using Lagrange multipliers with so many variables.

Let $\gamma \in \mathbb{R}$ such that $0\leq \gamma \leq \frac{q-1}{2q}$. Then $$OPT(\gamma)=\{\underline{\alpha}: \alpha_A \geq 0 \text{ for all nonempty A },\, v(\underline{\alpha})=1, \, E(\underline{\alpha})\geq \gamma \}.$$

In the paper for which I provided the link above, Serguei Norine shows that $OPT(\gamma)\leq \log(q(1-2\gamma))$ , with equality holding if and only if $\gamma=\frac{r-1}{2r}$ for some positive integer $r$ dividing $q$. When $r=2$ we see that $\frac{r-1}{2r}=1/4$, so $OPT(1/4)=\log(q/2)$ if and only if $q$ is divisible by two.


Solution 1:

Let's do it for sufficiently large $q$. It is going to be somewhat long and boring but, I hope, somewhat illuminating as to what is going on here.

Step 1: If there is any counterexample, then there is a counterexample in which the number of sets with positive associated probabilities is the minimal possible. Now notice that such a minimal counterexample cannot contain $3$ sets with positive associated probabilities with pairwise non-empty intersections or two such intersecting sets of the same cardinality. Indeed, if $A,B,C$ are three pairwise intersecting sets, then (I prefer $p$ to $\alpha$ for probabilities) we can fix $p_A+p_B+p_C$ and $p_A\log |A|+p_B\log|B|+p_C\log|C|$ (two linear constraints allowing at least one-dimensional linear perturbation) and notice that the sum of products of probabilities of pairwise disjoint sets can be written as $ap_A+bp_B+cp_C+d$ where $a,b,c,d$ depend only on probabilities of other sets, i.e., we have another linear functional of $p_A,p_B,p_C$ whose value we can now increase or, at least, keep fixed when doing our linear perturbation all the way until one of the probabilities $p_A,p_B,p_C$ hits $0$. The same argument works for two intersecting sets $A,B$ of the same cardinality if one notices that fixing $p_A+p_B$ and $p_A\log|A|+p_B\log|B|$ is now one linear constraint rather than two.

Step 2: Let us recap the proof of the bound $\log(q/2)$. It doesn't use Step 1 but can be combined with it any time. Assume that we have some sets $A_i$ with positive associated probabilities $p_i$. Define $$ Q_i=\frac{|A_i|}{q};\quad P_i=\sum_{j:A_j\cap A_i\ne\varnothing}p_j. $$ Then our conditions are $\sum_i p_i=1$, $\sum_i p_iP_i\le \frac 12$ and we want to show that $\sum_i p_i\log (2Q_i)\le 0$. We just use the elementary inequalities $$ \log(2Q_i)\le 2(\sqrt{2Q_i}-1)\le \frac {Q_i}{P_i}+2P_i-2. $$ Now if we multiply by $p_i$, add up, and notice in addition to the given conditions that $\sum_i p_i\frac {Q_i}{P_i}\le 1$ (just because $\sum_i \frac{p_i}{P_i}\chi_{A_i}\le 1$ pointwise where $\chi_A$ is the characteristic function of the set $A$), we get the desired inequality at once.

For large odd $q$ we would like to write the same inequality but with $\log (2Q_i)+\frac 12\log\frac 1{1-\delta^2}$ on the left hand side where $\delta=1/q$. The advantage now is that $2Q_i$ can take only the values $1\pm\delta,1\pm 3\delta,\dots$, so we can gain a bit. Let's see how much. Assume $2Q_i=1+m\delta$ where $m$ is $\pm 1$ or $\pm 3$ (since $\log X$ is a concave function of $\sqrt X$, the gain for $|m|\ge 3$ is at least as large as the gain for $|m|=3$). Then in the desired inequality $$ \log (2Q_i)+\frac 12\log\frac 1{1-\delta^2}\le 2(\sqrt{2Q_i}-1) $$ we have the left hand side equal to $m\delta-\frac{m^2-1}2\delta^2+O(\delta^3)$ and the right hand side equal to $m\delta-\frac{m^2}4\delta^2+O(\delta^3)$. For $m=\pm 1$, we have not a gain, but the loss of $\frac 14\delta^2+O(\delta^3)$ (the amount by which the LHS exceeds the RHS), but for $m=\pm 3$ we have already a gain of $\frac 74\delta^2+O(\delta^3)$, which is almost $7$ times larger. It means that if we have a counterexample, then $$ \sum_{i:2Q_i=1\pm\delta}p_i\ge \frac 78-\varepsilon(q) $$ where $\varepsilon(q)\to 0$ as $q\to\infty$. Combined with Step 1, we see that in any minimal counterexample, we have at most 3 sets (at most one of cardinality $\frac{q+1}2$ and at most two of cardinality $\frac{q-1}2$) that dominate the whole family.

Step 3: In any counterexample we must have at least one set with $2Q_i=1-\delta$ and at least one set with $2Q_i=1+\delta$. Indeed, we do not really need the factor on the RHS in the inequality $$ \log (2Q_i)+\frac 12\log\frac 1{1-\delta^2}\le 2(\sqrt{2Q_i}-1) $$ to be exactly $2$. Any inequality of the kind $$ \log (2Q_i)+\frac 12\log\frac 1{1-\delta^2}\le \beta(\sqrt{2Q_i}-1) $$ with positive $\beta$ will work equally well. So if one of the test-points $1\pm\delta$ is actually missing, we can rotate the line a bit to pass it through the other, then necessarily present, point, for which we need to move the corresponding value by about $\frac 14\delta^2$. This will move the values at $1\pm3\delta$ by about $\frac 34\delta^2$, which is safe because the leeway there is about $\frac 74\delta^2$ and the rest of the points also stay below the line by concavity. I leave the details to you.

Thus, in a minimal counterexample, we must have exactly one set of cardinality $\frac {q+1}2$ (let's call it $A$) and at least one set of cardinality $\frac{q-1}2$.

The rest is a boring casework. Let's start with easy cases.

Suppose that the complement of $A$ is not present (in the sense that the associated probability is $0$). Then we should have either one set $B$ of cardinality $\frac{q-1}2$ overlapping with $A$ or 2 such sets $B$ and $C$ with $B\cap C=\varnothing$. In any case, there cannot be any set $D$ of cardinality greater than $\frac{q+1}2$ in the family because then $A,B,D$ will form a triple of pairwise intersecting sets. Thus, it will suffice to show that $p_A\le\frac 12$. Assume the contrary. Then in the first subcase (single set $B$), we have $(p_A+p_B)^2\le \frac 12$, so $p_A+p_B\le\frac 1{\sqrt 2}$, which contradicts the inequality $p_A+p_B\ge \frac 78-\varepsilon(q)$ from Step 2 for large $q$. Assume now that we have two sets $B$ and $C$. Then the condition that the probability of choosing two intersecting sets is at most $\frac 12$ implies that $(p_A+p_B+p_C)^2-2p_Bp_C\le\frac 12$. However, if $p_A>\frac 12$, then $p_B+p_C\le \frac 12$, so $2p_Bp_C\le \frac 18$ and $p_A+p_B+p_C\le \sqrt{\frac 58}$, which is again short of $\frac 78-\varepsilon(q)$ for large $q$.

Thus the complement of $A$ (let's call it $B$) has to be present. Assume that there is one more set $C$ of cardinality $\frac{q-1}2$. Then that set has to intersect with $A$ and, again, no set $D$ can have cardinality greater than $\frac{q+1}2$. All sets other than $A,B,C$ must have cardinalities at most $\frac{q-3}2$. Let $\sigma$ be the sum of their probabilities. Since every set must intersect either $A$ or $B$ and we are interested only in the case $p_A>\frac 12$ in this setting, we have $$ (p_A+p_C)^2+p_B^2+2p_B\sigma\le\frac 12. $$ One can check (just comparing $\log$ with the linear function through the two points of interest) that we have the desired inequality if $p_A\le\frac 12+\sigma$. However, the inequality above rewrites as $$ \frac 12+2(p_A+p_C-\frac 12)^2-\sigma^2=(p_A+p_C)^2+(p_B+\sigma)^2-\sigma^2\le\frac 12 $$ (the first identity follows from $p_A+p_B+p_C+\sigma=1$), so we conclude that even $p_A+p_C\le\frac 12+\sigma$.

Thus we are left with the case when we have one set $A$ of cardinality $\frac{q+1}2$, its complement $B$ and some sets of other cardinalities. If there is no set of cardinality $\frac{q+3}2$ or higher, we can treat this case similarly to the one just considered. So, suppose that a set $C$ of high cardinality is present. Then it intersects $A$ and $B$ but no other set (otherwise we would get an intersecting triple). Thus, if $|C|=\frac{q+m}{2}$ with some $m\ge 3$, then all other sets have cardinalities $\le\frac{q-m}{2}$. Let again $\sigma$ be the sum of probabilities of sets other than $A,B,C$. By the same comparison of $\log$ with the linear function through 2 interesting points, we conclude that it would suffice to prove the inequality $$ p_A-p_B\le 3(\sigma-p_C). $$ The restriction is now $$ p_A(p_A+p_C)+p_B(p_B+p_C)+p_C(p_A+p_B+p_C)+2\min(p_A,p_B)\sigma\le\frac 12. $$ Let us denote $p=\max(p_A,p_B), q=\min(p_A,p_B), r=p_C$ for brevity. The restriction will rewrite as $$ p(p+r)+q(q+r)+r(p+q+r)+2q\sigma\le\frac 12 $$ and we need to conclude that $p-q\le 3(\sigma-r)$. We also have the normalization $p+q+r+\sigma=1$. Let $p=s+t, q=s-t$. Then we can rewrite the restriction as $$ 2s^2+2t^2+r^2+4sr+2s\sigma-2t\sigma\le\frac 12 $$ or $$ (s+\sigma)^2+(s+r)^2+2sr+t^2+(t-\sigma)^2-2\sigma^2\le \frac 12 $$ Since $(s+\sigma)+(s+r)=p+q+r+\sigma=1$, the sum of the first two squares is already at least $\frac 12$, so we conclude that $2sr-2\sigma^2\le 0$. Since $2s=p+q$ is at least almost $7/8$, we have $\sigma\le s$ and thus $\sigma\ge r$. Let $\sigma=r+u$, $u>0$. Then the sum of the first two squares in the last displayed formula is $\frac 12+\frac 12u^2$, so we get $$ \frac 12u^2+2sr+2t^2-2t(r+u)-(r+u)^2\le 0 $$ i.e., $$ 2sr+2t^2-2tr-2tu-r^2-2ru-\frac 12u^2\le 0 $$ and we want to prove that $u\ge \frac 23t$.

Assume that $u<\frac 23 t$. Then the LHS is at least $$ \frac 49 t^2+2sr-\frac{10}3rt-r^2=\left(\frac 23 t-\frac 52 r\right)^2+2sr-\frac{29}{4} r^2 $$ so it remains to check that $2s>\frac{29}4 r$. However $2s$ is almost $\frac 78$ while $\frac{29}4r\le 4(2r)$ and $2r\le 2r+u=r+\sigma=1-2s$ is only about $\frac 18$ at most.

The careful analysis of the proof shows that it is enough to get $4/5$ instead of $7/8$. It looks like choosing the comparison line parallel to the line through the 2 interesting points instead of parallel to the tangent line at $1$ allows one to cover all $q\ge 5$ ($q=3$ is special anyway), but I haven't checked this carefully. Feel free to ask question if something is unclear.

Solution 2:

This is not an answer but some thoughts.

It seems difficult to me to verify that OPT even for $q=5.$ What you have is, naively, a nonconvex optimization problem over $2^q-1$ variables. So without some extra insight it seems hopelessly difficult computationally.

Here is an idea for a stronger conjecture that can be verified as a nonconvex optimization in $q^2$ dimensions, more specifically testing copositivity of a $q^2\times q^2$ matrix. That's still very difficult computationally, but slightly better in theory.

Let $$\hat E(\underline\alpha)=\sum_{A,B\in 2^{[q]}\setminus\{\emptyset\}}\alpha_A\alpha_B\frac{|A\cap B|}{\min(|A|,|B|)}.$$

$\hat E(\underline\alpha)$ is a lower bound on $1-2E(\underline\alpha).$ Optimistically we might have $$OBJ(\underline\alpha)\leq mq\hat E(\underline\alpha)+c\qquad(?)$$ where $m,c$ are the unique numbers satisfying $mx+c=\log x$ for $x\in \{\tfrac{q-1}2,\tfrac{q+1}2\}.$ This would imply your OPT by plugging in $\hat E(\underline\alpha)=\tfrac12.$

Homogenizing, we want $$OBJ(\underline\alpha)v(\underline\alpha)\leq mq\hat E(\underline\alpha)+cv(\underline\alpha)^2\qquad(?)$$ for all vectors $\alpha$ with non-negative entries.

The functions $OBJ,v,\hat E$ are homogeneous linear, linear, and quadratic functions respectively of the $q^2$ numbers

$$p_{i,j}(\underline\alpha)=\sum_{\substack{|A|=i\\A\ni j}}\alpha_A/|A|$$

In particular

$$\hat E(\underline\alpha)=\sum_{i,j,k=1}^q\max(i,k)p_{i,j}(\underline\alpha)p_{k,j}(\underline\alpha).$$

So this reduces to a slightly easier problem computationally.