How much information can you transfer by choosing one number out of two?

Solution 1:

For large $n$ we can approach an efficiency of $0.188$ bits per symbol, in the following way:

In this protocol Alice can send a bit to Bob in one of two modes: "reliable-but-slow" or "fast-but-unreliable".

In reliable but slow mode, Bob ignores all symbols other than $0$ or $1$, and Alice simply waits until she gets a pair that allows her to send the symbol she needs. This takes $n/2$ symbols on average.
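(As a quick check of that figure, a step not spelled out in the original: a uniformly random pair contains the one specific symbol Alice needs with probability $\frac{n-1}{\binom n2}=\frac 2n$, so the waiting time is geometric with mean $n/2$.)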

In fast but unreliable mode, Alice sends any even number for $0$ and any odd number for $1$. One fourth of the bits she sends will be garbled because she doesn't have a number of the right parity to send.

Alice and Bob agree in advance on a block size $b$, and the communication starts when Alice has accumulated $b$ bits to send. She sends them using fast-but-unreliable mode. Alice knows which bits went wrong, so she computes a correction sequence that Bob should XOR with what he received in order to recover the real message. The bits in the correction sequence are not uniformly distributed: $0$ is three times as common as $1$. Therefore Alice can compress it using arithmetic coding to a length of about $0.811$ times the original $b$ (namely $-\frac34 \log_2(\frac34) - \frac14 \log_2(\frac14) \approx 0.811$ bits per bit).
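A minimal simulation sketch of this step (my own illustration; the symbol range, block size, and helper names are arbitrary choices, not part of the answer) confirms that about a quarter of the correction bits come out as $1$, so the correction sequence indeed compresses to roughly $0.811$ bits per bit:

```python
import random
from math import log2

def entropy(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def fast_block(bits, n):
    """Send a block in fast-but-unreliable mode (even number = 0, odd number = 1).
    Returns the correction sequence: 1 wherever Alice was forced to send the wrong parity."""
    correction = []
    for b in bits:
        x, y = random.sample(range(1, n + 1), 2)     # the pair Alice is handed this turn
        available_parities = {x % 2, y % 2}
        sent = b if b in available_parities else 1 - b
        correction.append(sent ^ b)
    return correction

random.seed(0)
n, blocksize = 1000, 200_000
corr = fast_block([random.getrandbits(1) for _ in range(blocksize)], n)
p1 = sum(corr) / len(corr)
print(f"fraction of garbled bits  ~ {p1:.3f}  (expected ~ 1/4)")
print(f"compressed length per bit ~ {entropy(p1):.3f}  (expected ~ 0.811)")
```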

Alice uses reliable-but-slow bits to tell Bob the exact length of the compressed correction sequence, and then sends it using the same procedure as the original block -- that is, with its own correction sequence following, etc.

When the compressed correction sequence is shorter than some prearranged length, Alice switches to sending it in reliable-but-slow mode.

Analysis. When $b$ is large, the expected number of fast-but-unreliable symbols sent is $$ b + 0.811b + 0.811^2b + 0.811^3b + \cdots = \frac{1}{1-0.811}\, b. $$ There will be on the order of $\log b$ nested correction sequences, each needing up to $\log b$ reliable-but-slow bits to encode its length. So, all in all, the reliable-but-slow phases use about $\frac n2 (\log^2 b+C)$ symbols, where $C$ accounts for the final reliable-but-slow correction sequence.

For each particular $n$, by choosing $b$ large enough, Alice and Bob can make the $\frac n2 (\log^2 b+C)$ overhead as small relative to $\frac{1}{1-0.811} b$ as they want! So in the limit of large $b$ they ought to achieve $1-0.811 \approx 0.189$ bits per symbol.
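To get a feel for how quickly the $\frac n2(\log^2 b + C)$ overhead becomes negligible, here is a rough numerical evaluation (my own script; the constant $C=64$ is an arbitrary placeholder, not a value from the answer):

```python
from math import log2

H_QUARTER = 0.8113  # H(1/4): length of the compressed correction sequence per bit

def approx_rate(b, n, C=64):
    """Crude bits-per-symbol estimate: b payload bits cost about
    b/(1 - H(1/4)) fast symbols plus (n/2)*(log2(b)^2 + C) slow symbols."""
    fast = b / (1 - H_QUARTER)
    slow = (n / 2) * (log2(b) ** 2 + C)
    return b / (fast + slow)

n = 1000
for b in (10**4, 10**6, 10**8, 10**10):
    print(f"b = {b:>14,}: ~{approx_rate(b, n):.4f} bits/symbol")
# approaches 1 - 0.8113 = 0.1887 as b grows
```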


If $n\le 10$, the reliable-but-slow mode alone (at $2/n$ bits per symbol) is actually faster than the asymptotic rate here, and should be used instead.


I suspect this might be asymptotically optimal for large $n$. I tried variants, such as decreasing the error percentage of the "fast" mode at the expense of speed by assigning a fraction of the $n$ symbols as "ignore this symbol" markers instead of letting all of them encode actual bits. But it looks like that always results in worse overall performance.

Solution 2:

If Alice were not able to see the numbers before sending the max or min, the best possible rate would be:

$$C=\fbox{$1-\frac{1}{n}\sum_{k=2}^{n-1}H\left(\frac{n-k}{n-1}\right)$}$$

Alice in the OP's problem has strictly more capabilities than Alice as I have written her, so the rate "OP's Alice" can achieve is at least as high as the above formula.

For large $n$ the sum behaves like an integral: $$1 - \frac{1}{n}\sum_{y=2}^{n-1} H\left(\frac{n-y}{n-1}\right) \approx 1-\int_0^1 H(p)\, dp = 1-\frac{1}{\ln 4}\approx 0.2787.$$
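A quick numerical check of this convergence (a throwaway script I added, evaluating the finite-$n$ formula directly):

```python
from math import log2, log

def H(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def capacity(n):
    """C = 1 - (1/n) * sum_{y=2}^{n-1} H((n-y)/(n-1))."""
    return 1 - sum(H((n - y) / (n - 1)) for y in range(2, n)) / n

for n in (3, 10, 100, 1000, 10_000):
    print(f"n = {n:>6}: C = {capacity(n):.4f}")
print(f"limit 1 - 1/ln(4) = {1 - 1/log(4):.4f}")
```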


Your setup of choosing one of two numbers specifies a channel state:

$$Z=(z_1,z_2)\sim \operatorname{Uniform}\{(a,b), 1\leq a< b\leq n\}$$

Each turn, the sender gets to input either $z_1$ or $z_2.$ So the channel input is some $X$ over an alphabet $\{\phi_i:\phi_i(Z)=z_i\},$ and the channel output is $Y=X(Z).$ Now you have a channel:

$$X\rightarrow \overset{\huge\overset{Z}{\downarrow}}{\fbox{$P_{Y|X,Z}$}}\rightarrow Y$$

and by the noisy channel coding theorem it is sufficient to compute the channel capacity: \begin{align} C = \max_{p_X} I(X;Y) &=\max_{P_X} H(X)-H(X|Y) \\ &=\max_{p} H(p)-H(X|Y) \end{align} where $p=\mathbb{P}(X=\phi_1),\ H(p)=-p\log_2(p)-(1-p)\log_2(1-p)$ and $H(X|Y)=\sum_y P_Y(y)H(P_{X|Y=y}(\phi_1))$.

To be able to go further we need to start computing probabilities, which is simple if you set it up right. View your space of $Z$'s laid out as a triangle: \begin{matrix} (1,2) \\ (1,3) & (2,3) \\ \vdots & & \ddots \\ (1,n) & \dots & & (n-1, n) \end{matrix} There are $\binom{n}{2}=\frac{n\cdot(n-1)}{2}$ pairs in all, so: \begin{align} P_{Y|X=\phi_1}(y)&\textstyle =P(Z\text{ in }y^{th}\text{ column})=\mathbf{1}_{[1:n-1]}(y)\frac{n-y}{\frac{n\cdot (n-1)}{2}}=\mathbf{1}_{[1:n-1]}(y)\frac{2\cdot (n-y)}{n\cdot (n-1)},\\ P_{Y|X=\phi_2}(y)&\textstyle =P(Z\text{ in }(y-1)^{th}\text{ row})=\mathbf{1}_{[2:n]}(y)\frac{y-1}{\frac{n\cdot (n-1)}{2}}=\mathbf{1}_{[2:n]}(y)\frac{2\cdot (y-1)}{n\cdot (n-1)},\\ P_Y(y)&\textstyle = pP_{Y|X=\phi_1}(y)+(1-p)P_{Y|X=\phi_2}(y) = \left\lbrace \begin{matrix} \frac{2 p\cdot(n-y)}{n\cdot (n-1)} & y=1 \\ \frac{2 p\cdot(n-y)+2\cdot(1-p)\cdot(y-1)}{n\cdot (n-1)} & y\in [2:n-1] \\ \frac{2\cdot(1-p)\cdot(y-1)}{n\cdot (n-1)} & y=n \end{matrix}\right.\\ P_{X|Y=y}(\phi_1)&\textstyle = \left\lbrace \begin{matrix} 1 & y=1 \\ \frac{p\cdot (n-y)}{p\cdot(n-y)+(1-p)\cdot(y-1)} & y\in [2:n-1] \\ 0 & y=n \end{matrix}\right. \end{align}
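These column and row counts can be double-checked by brute-force enumeration (a small verification script I added; the normalizing constant is the total number of pairs, $\binom n2$):

```python
from fractions import Fraction
from itertools import combinations

n = 7
pairs = list(combinations(range(1, n + 1), 2))   # all (a, b) with 1 <= a < b <= n
total = len(pairs)                               # = n(n-1)/2

for y in range(1, n + 1):
    p_min = Fraction(sum(1 for a, b in pairs if a == y), total)  # P(Y=y | X=phi_1)
    p_max = Fraction(sum(1 for a, b in pairs if b == y), total)  # P(Y=y | X=phi_2)
    assert p_min == (Fraction(2 * (n - y), n * (n - 1)) if y <= n - 1 else 0)
    assert p_max == (Fraction(2 * (y - 1), n * (n - 1)) if y >= 2 else 0)
print("transition probabilities match the formulas above")
```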

So in our entropy expansion we have: \begin{align} \max_{P_X}I(X;Y) &= \max_{p\in[0,1]} H(p) - \sum_{y=2}^{n-1}P(Y=y)H(X|Y=y) \\ &= \textstyle \max_{p\in[0,1]} H(p) - \sum_{y=2}^{n-1}\frac{2p\cdot(n-y)+2\cdot(1-p)\cdot(y-1)}{n\cdot (n-1)} H\left(\frac{p\cdot (n-y)}{p\cdot(n-y)+(1-p)\cdot(y-1)}\right). \end{align}

Writing $q_y=\frac{p\cdot (n-y)}{p\cdot(n-y)+(1-p)\cdot(y-1)}$ for the posterior appearing above, the critical points of this function in $p$ are: \begin{align} \textstyle \left\lbrace p: - \log_2\left(\frac{p}{1-p}\right) + \sum_{y=2}^{n-1}\frac{2}{n\cdot (n-1)}\Big[(n-y)\log_2\left(q_y\right) - (y-1)\log_2\left(1-q_y\right)\Big] = 0 \right\rbrace. \end{align}

It is easy to check that $p^\ast=0.5$ is a critical point. (Indeed, at $p=\frac12$ the first log vanishes and $q_y=\frac{n-y}{n-1}$, so the summand for $y$ is exactly the negative of the summand for $n+1-y$, and the sum cancels to $0$.)

It is also the only critical point: the second derivative is strictly negative (a simple but tedious computation; recall $H''(p)=1/(p^2-p)$ and $1-1/x \leq \log(x)\leq x-1$), thus the function is concave with its maximum at its critical point $p^\ast = 0.5.$
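If one prefers a numerical sanity check, a small grid search over $p$ (my own script, using the transition probabilities above) confirms the maximum sits at $p=\tfrac12$:

```python
from math import log2

def H(p):
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def mutual_info(p, n):
    """I(X;Y) = H(p) - sum_y P(Y=y) H(X|Y=y) for P(X=phi_1) = p."""
    cond = 0.0
    for y in range(2, n):                 # y = 1 and y = n contribute H(X|Y=y) = 0
        a, b = p * (n - y), (1 - p) * (y - 1)
        cond += 2 * (a + b) / (n * (n - 1)) * H(a / (a + b))
    return H(p) - cond

n = 20
grid = [i / 1000 for i in range(1, 1000)]
best = max(grid, key=lambda p: mutual_info(p, n))
print(f"argmax on the grid: p = {best}, I = {mutual_info(best, n):.4f}")   # p = 0.5
```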

So the maximum possible rate of reliable information transmission boils down to:

\begin{align} C&= 1 - \sum_{y=2}^{n-1}\frac{1}{n} H\left(\frac{n-y}{n-1}\right). \end{align}

Solution 3:

Interesting problem.

A quick bound comes from considering, for $n$ even, that half of the values represent a binary $X=0$ input, the other half $X=1$. This gives a BSC (binary symmetric channel) whose crossover probability is the probability that both numbers I must choose from fall in the "wrong" half:

$$ p_e = \frac{\binom{n/2}{2}}{\binom{n}{2}}=\frac14\, \frac{n-2}{n-1}\approx \frac14$$

This gives a capacity of $C=1-h(p_e)\approx 1-h(\tfrac14)=0.18872 \text{ bits}$ for large $n$, which agrees with Henning Makholm's answer.
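For concreteness, the crossover probability and the resulting BSC capacity can be tabulated (a quick script of mine):

```python
from math import comb, log2

def h(p):
    return -p * log2(p) - (1 - p) * log2(1 - p)

for n in (4, 10, 100, 1000):
    p_e = comb(n // 2, 2) / comb(n, 2)        # both numbers fall in the "wrong" half
    print(f"n = {n:>4}: p_e = {p_e:.4f}, 1 - h(p_e) = {1 - h(p_e):.5f}")
print(f"n -> inf: 1 - h(1/4) = {1 - h(0.25):.5f} bits")
```

(For small $n$ the crossover probability is below $\tfrac14$, so this simple scheme actually does a little better than the limiting $0.18872$ bits.)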

Some caveats

  • We are wasting bits (when both numbers are "good", the freedom of choosing one of the two is not used). Hence the capacity should be larger.

  • This result (and this encoding scheme) is in the context of the classic Shannon capacity, where information is sent with vanishingly small probability of error - but not necessarily with zero probability of error.


A better, probably optimal (and even simpler) scheme:

Let our "channel" simply pick the maximum of the pair when $X_n=1$, the minimum otherwise. This is channel with two inputs and $n$ outputs. The transition matrix is given by the probability of the extrema:

$$P(Y = y \mid X = 1) = \frac{2 (y-1)}{n (n-1)} =g(y;n)$$

while for $P(Y = y \mid X = 0)$ we have the symmetric distribution.

Then the capacity of this (weakly symmetric) channel is

$$C=\log(n) - H(g(y;n))$$

where $H(g(y;n))=h_T(n-1)$ is the entropy of a triangular distribution with $n-1$ positive values. Calling $s=\frac{2}{n(n-1)}$

$$h_T(n-1)= -s \left( \log s \sum_{k=1}^{n-1} k + \sum_{k=1}^{n-1} k \log k\right)\\ =- \log s - s \log({\mathcal H}(n-1))$$

where ${\mathcal H}$ is the hyperfactorial.

The capacity decreases asymptotically (and very slowly) to $C_{\infty}=1-\frac{\log_2 e}{2}\approx 0.2786$ bits (this agrees with enthdegree's value).
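A short script (mine, just for verification) evaluates $C=\log n - h_T(n-1)$ both from the hyperfactorial expression and by direct summation, and shows the slow decrease toward $C_\infty$:

```python
from math import log2, e

def h_T(m):
    """Entropy (bits) of the triangular distribution p_k = s*k, k = 1..m."""
    s = 2 / (m * (m + 1))                                  # with m = n-1 this is 2/(n(n-1))
    log2_hyperfactorial = sum(k * log2(k) for k in range(1, m + 1))
    return -log2(s) - s * log2_hyperfactorial

def h_T_direct(m):
    s = 2 / (m * (m + 1))
    return -sum(s * k * log2(s * k) for k in range(1, m + 1))

for n in (3, 10, 100, 1000):
    assert abs(h_T(n - 1) - h_T_direct(n - 1)) < 1e-9      # the two expressions agree
    print(f"n = {n:>4}: C = {log2(n) - h_T(n - 1):.4f} bits")
print(f"C_inf = 1 - log2(e)/2 = {1 - log2(e) / 2:.4f} bits")
```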

[Plot: capacity $C$ versus $n$.]

BTW: The $n=3$ case is, in this scheme, essentially a binary erasure channel, with erasure probability $p_e=1/3$ and $C=2/3$ bits. A zero-error code with this rate (but unbounded length) can be built (rather trivially) if (and only if) we have feedback. But in this case the "feedback" is inside this virtual channel, so it would be feasible (as noted by Henning Makholm).
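To make the erasure structure explicit, here is the $n=3$ transition matrix of the min/max scheme by enumeration (my own check):

```python
from fractions import Fraction
from itertools import combinations

n = 3
pairs = list(combinations(range(1, n + 1), 2))            # (1,2), (1,3), (2,3)
P = {x: {y: Fraction(0) for y in range(1, n + 1)} for x in (0, 1)}
for a, b in pairs:
    P[0][a] += Fraction(1, len(pairs))                     # X = 0: send the minimum
    P[1][b] += Fraction(1, len(pairs))                     # X = 1: send the maximum

for x in (0, 1):
    print(f"P(Y=. | X={x}):", {y: str(P[x][y]) for y in P[x]})
# Output 2 is equally likely under both inputs, so it acts as the erasure symbol
# (probability 1/3), giving C = 1 - 1/3 = 2/3 bits.
```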