Maximizing curious symmetric function from simple combinatorics

A curious symmetric function crossed my way in some quantum mechanics calculations, and I'm interested its maximum value (for which I do have a conjecture). (This question has been posted at MathOverflow on 13.12.2014; the question has been solved in a very nice way there.)

The problem

There are $n$ different objects $A_1,...,A_n$, and there are sets containing $m$ different $A_i$s: $C_i=(A_{i_1}, A_{i_2}, ..., A_{i_m})$. There are $i_{max}=\binom{n}{m}$ different combinations $C_i$. Each combination $C_i$ has a probability $p_i$ (with $\sum_{i=1}^{i_{max}} p_i=1$).

Defining the function

For a given pair of objects $A_k$ and $A_l$:

  • $f_1(k,l)$ contains all probabilities $p_i$ of the sets $C_i$, which contains both objects $A_k$ and $A_l$.
  • $f_2(k,l)$ contains all probabilities $p_i$ of the sets $C_i$, which contains either object $A_k$ or $A_l$ (if it contains both elements, we add $p_i$ twice).
  • $F(k,l)=\frac{f_1(k,l)}{f_2(k,l)}$

With that, we get the main-function $$D^{(n,m)}=\sum_{k=1}^{n-1} \sum_{l=k+1}^{n} F(k,l)$$

What is the maximum of $D^{(n,m)}$, given that the sum of all probabilities $p_i$ is 1?

Special cases

n=2, m=2

This is a trivial case. We have two objects $A_1$ and $A_2$, and only one set of combinations $C_1=(A_1,A_2)$ with $p_1$.

Thus $f_1(1,2)=p_1$, $f_2(1,2)=p_1+p_1$. This leads to $D^{(2,2)}=F(1,2)=\frac{1}{2}$.

Every other case with $n=m$ can be solved easily by $D^{(n,m)}=\frac{1}{n}$


n=3, m=2

This case is simple (but not trivial) and I found a solution:

We have n=3 objects $A_1$, $A_2$ and $A_3$, and combinations $C_i$ of m=2 objects $C_1$=($A_1$, $A_2$), $C_2$=($A_1$, $A_3$), $C_3$=($A_2$, $A_3$), with $p_1$, $p_2$, $p_3$ respectivly.

For k=1, l=2 we have $f_1(1,2)=p_1$ (because only $C_1$ contains both $A_1$ and $A_2$), and $f_2(1,2)=2p_1+p_2+p_3$ (because $A_1$ is contained in $C_1$ and $C_2$ and $A_2$ is in $C_1$ and $C_3$).

So we get $$D^{(3,2)}=F(1,2) + F(1,3) + F(2,3) = \frac{p_1}{2p_1+p_2+p_3} + \frac{p_2}{p_1+2p_2+p_3} + \frac{p_3}{p_1+p_2+2p_3}$$ A maximum can be found easily (due to normalisation of $p_1+p_2+p_3=1$): $$D^{(3,2)} = \frac{p_1}{1+p_1} + \frac{p_2}{1+p_2} + \frac{p_3}{1+p_3}$$ so each term can be maximized individually, which gives $D^{(3,2)}=\frac{3}{4}$ for $p_1=p_2=p_3$.


n=4, m=2

We have four objects $A_1$, $A_2$, $A_3$, $A_4$, and six combinations $C_1=(A_1,A_2)$, $C_2=(A_1,A_3)$, ..., $C_6=(A_3, A_4)$.

Therefore we get: $$D^{(4,2)} = \frac{p_1}{2p_1+p_2+p_3+p_4+p_5} + \frac{p_2}{p_1+2p_2+p_3+p_4+p_6} + \frac{p_3}{p_1+p_2+3p_3+p_5+p_6} + \frac{p_4}{p_1+p_2+2p_4+p_5+p_6} + \frac{p_5}{p_1+p_3+p_4+2p_5+p_6} + \frac{p_6}{p_2+p_3+p_4+p_5+2p_6}$$

I was not able to find a method for proving a global maximum.


n=4, m=3

We have $C_1=(A_1,A_2,A_3)$, $C_2=(A_1,A_2,A_4)$, $C_3=(A_1,A_3,A_4)$, $C_4=(A_2,A_3,A_4)$, which gives

$$D^{(4,3)}=\frac{p_1+p_2}{2p_1+2p_2+p_3+p_4}+\frac{p_1+p_3}{2p_1+p_2+2p_3+p_4}+\frac{p_1+p_4}{2p_1+p_2+p_3+2p_4}+\frac{p_2+p_3}{p_1+2p_2+2p_3+p_4}+\frac{p_2+p_4}{p_1+2p_2+p_3+2p_4}+\frac{p_3+p_4}{p_1+p_2+2p_3+2p_4}$$

This case can be simplified aswell, similar to $n=3,m=2$ case to $$D^{(4,3)}=\frac{p_1+p_2}{1+p_1+p_2}+\frac{p_1+p_3}{1+p_1+p_3}+\frac{p_1+p_4}{1+p_1+p_4}+\frac{p_2+p_3}{1+p_2+p_3}+\frac{p_2+p_4}{1+p_2+p_4}+\frac{p_3+p_4}{1+p_3+p_4}$$

but I'm not able to find any further method to calculate the maximum.

Conjecture

The two cases I solved had a maximum at equal $p_i=\frac{1}{\binom{n}{m}}$. Furthermore, the function $D^{(n,m)}$ is very symmetric, so I expect that the maximum is always at $p_1=p_2=...=p_i$. Numerical search up to n=7 confirms my expectation (but I'm not 100% sure about my Mathematica-based numerical maximization).

Questions

  1. How can you prove (or disprove) that the maximum for $D^{(n,m)}$ for arbitrary $n$ and $m$ is always at $p_1=p_2=...=p_i$?
  2. Is there literature on similar problems or is this function even known? Has the similarity to the Shapiro inequality some significance or is it just a coincidence?
  3. Is there a better (maybe geometrical) interpretation of this function?
  4. Can you find solutions for any other special case than $n=m$ (always trivial) and $n=3,m=2$?

Solution 1:

Partal solution:


The case of $m=n-1$ can be solved easily.

It is easy to check that $$ D^{(n,n-1)} = \sum_{1\le i < j \le n} \frac{p_1+\cdots + p_{i-1}+p_{i+1}+\cdots+p_{j-1}+p_{j+1}+\cdots +p_n}{2p_1+\cdots + 2p_{i-1}+p_i+2p_{i+1}+\cdots+2p_{j-1}+p_j+2p_{j+1}+\cdots +2p_n} \\ =\sum_{1\le i < j \le n} \frac{1-p_i-p_j}{2-p_i-p_j}$$

Now since $f(x)=\frac{1-x}{2-x}$ is concave, we may apply Jensen's inequality to obtain $$\sum_{1\le i < j \le n} \frac{1-p_i-p_j}{2-p_i-p_j} \le \frac{1-2/n}{2-2/n} = \frac{n-2}{2n-2}$$ since the average of the $p_i+p_j$s is $\frac{2}{n}$. Equality holds if and only if $p_1=\cdots = p_n=1/n$.


The case of $m=2,n=4$ is also not so difficult to solve.

We shall first prove that $\frac{a}{1+a-b} + \frac{b}{1+b-a} \le a+b$ if $a+b \le 1$. This inequality is true since it is equivalent to $(1-a-b)(a-b)^2 \ge 0$ after expansion. Now $$D^{(4,2)} = \frac{p_1}{2p_1+p_2+p_3+p_4+p_5} + \frac{p_2}{p_1+2p_2+p_3+p_4+p_6} + \frac{p_3}{p_1+p_2+2p_3+p_5+p_6} + \frac{p_4}{p_1+p_2+2p_4+p_5+p_6} + \frac{p_5}{p_1+p_3+p_4+2p_5+p_6} + \frac{p_6}{p_2+p_3+p_4+p_5+2p_6} \\ = \sum_{i=1}^{3} \left(\frac{p_i}{1+p_i-p_{i+3}} + \frac{p_{i+3}}{1+p_{i+3}-p_i} \right) \le \sum_{i=1}^{3} p_i+p_{i+3} = 1$$ Equality obviously holds if $p_1=p_2=\cdots=p_6$, but there are other cases of equality such as $(p_1,\cdots,p_6)=(k,0,0,1-k,0,0)$ or $(a,b,c,a,b,c)$.


However, I expect it will be extremely complicated to push it further by these kinds of brute-force methods.

Solution 2:

Not an answer - In case this helps:

Let $T$ be the indicator $c \times n$ matrix, with $c = {n\choose m}$, and $T_{i,j} \in \{0,1\}$

Then $$f_1(k,j)=\sum_{i=1}^c p_i T_{i,k} T_{i,j}$$

$$f_2(k,j)=\sum_{i=1}^c p_i (T_{i,k} + T_{i,j})$$

$$D = \sum_{k<j}\frac{\sum_{i=1}^c p_i T_{i,k} T_{i,j}}{\sum_{i=1}^c p_i (T_{i,k} + T_{i,j})}=\sum_{\ell}\frac{a_\ell}{b_\ell} \tag{1}$$ here $\ell$ indexes all pairs $(j<k)$.

Notice that $A=\sum_\ell a_\ell= \#\ell \, \langle T_{i,j} T_{i,k}\rangle $ where $\#\ell=n(n-1)/2$ and $$\langle T_{i,j} T_{i,k} \rangle=\frac{m (m-1)}{n (n-1)}$$ is an average over all pairs $(j<k)$ and all rows (notice that it not depends on $p_i$).

Similarly $B=\sum_\ell b_\ell= \#\ell \, \langle T_{i,j}+ T_{i,k}\rangle $, with $$\langle T_{i,j} + T_{i,k}\rangle=\frac{2 m }{n}$$

Further, when $p_i=1/c$ we get $$D_0 =\#\ell \frac{\langle T_{i,j} T_{i,k} \rangle}{\langle T_{i,j} +T_{i,k}\rangle}= \#\ell \frac{A}{B}= \frac{n(n-1)}{2} \frac{m-1}{2(n-1)} = \frac{n(m-1)}{4}$$

So, for the conjecture to be true, we'd need to prove that $(1)$ is concave in $p_i$ or

$$\sum_{\ell}\frac{a_\ell}{b_\ell}\le \#\ell \frac{A}{B}=\#\ell \frac{\sum_{\ell} a_\ell}{\sum_{\ell} b_\ell}$$

or

$$ \langle \frac{a_\ell}{b_\ell} \rangle \le \frac{\langle a_\ell \rangle }{\langle b_\ell \rangle}$$

Update: The expression $(1)$ does not seem to be concave on $p$ - so the conjecture stands unresolved.

For the record, I evaluated $D_e$ as the limit value for the probability concentrated on a single value, and I got

$$D_e=\frac{n}{(m-1)4}\frac{1-\frac{1}{n}+\frac{2}{n\,\left( n-m\right) }}{1+\frac{2}{m\,\left( n-m\right) }} < D_0$$

Though this extreme is less than the conjectured maximum $D_0$, notice that the difference vanishes as $n,m$ grow.