Minimizing the probability of a draw in a democratic poll

A group of $k$ people wants to choose democratically between $n$ possible options. They arrange a poll in which every person votes for $r$ out of the $n$ options without repetition, meaning there are $n \choose r$ possible choices per person. The highest scoring option will be the winning choice - unless there's a draw of course...

Question: Assuming each person chooses $r$ out of $n$ options with uniform probability. What is the value of $r$ for which the probability of the poll ending in a draw is minimal? Hopefully the answer will be unique and in a closed form $r=f(k,n)$.

Sadly beyond trivial remarks about the extreme cases (like $r=n$ implies draw will happen with full probability and $r,k << n$ implies draw will be pretty likely because of sparse voting) I have nothing valuable to say about my attempts so far.

Edit: To anyone who downvotes this - please explain what you think is wrong with the question. To make things clear: This is not a homework problem!. Still, if you think this is a trivial problem please explain your solution - if not then please reconsider the downvote.


Solution 1:

We treat this rigorously for $k=2$, and conjecturally for all $k$.

For $k=2$, we find $r \sim \sqrt{n}$.

When A and B each vote $r$ options, they avoid a tie iff B chooses $1$ option already chosen by A, and then chooses $(r-1)$ options not chosen by A. So the probability of avoiding a tie is $$p=r {n-r \choose r-1} \ / \ {n \choose r}.$$ This probability can not be maximized at a boundary like $r=0$ or $r=n$, since those give a probability of $0$, so it must be maximized somewhere where the discrete derivative is roughly $0$.

The optimal $r$ will therefore be a solution to $$\frac{r {n-r \choose r-1}}{ {n \choose r}} \sim \frac{(r+1) {n-r-1 \choose r}}{ {n \choose r+1}}.$$ We can actually solve this: Cross-multiplying and cancelling reduces it to $$r^2(n-r)^2 \sim (r+1)^2(n-2r)(n-2r+1).$$ Neglecting the final $+1$ and square-rooting leads to a quadratic whose solution is $$r \sim \sqrt{n+1} -1.$$ To a first approximation $r\sim\sqrt{n}$, and more precisely $r$ is an integer within $2$ of $\sqrt{n}$.

Generalization for higher k

This suggests a strategy for higher $k$, namely targeting an expectation that exactly one option will receive two votes.

The $k,n,r$ scenario leads to exactly one option getting two votes with probability $$p=\frac{{n \choose kr-1}(kr-1)!(k-1)r^2} {\left({n \choose r}r!\right)^k}$$

Meanwhile the expected number of options receiving two votes is $$n{k \choose 2}\left(\frac{r}{n}\right)^2 \left(\frac{n-r}{n}\right)^{k-2}.$$ So we choose $$r \sim \sqrt{n/{k \choose 2}}$$ and for large $n$, this avoids ties with limiting probability $$p \sim \frac{2}{ke}.$$

Solution 2:

I think this is quite difficult.

Here's a simulation with $n=15$, $k=6$. The abscisa is $r$, in blue (right axis) we have the probability of draw, and in red (left axis) the average votes for the winner.

I cannot be sure about programming errors, but I'm quite sure that the striking uglyness of the blue line (the one that determines the minimum we are interested in, i.e, $r=f(k,n)$) is not due to statistical noise - each run consists of 10^6 trials, and different simulations give practically the same values.

enter image description here

In case you are interested, here's the source (mind you, you need to keep the number of tries low if you want to run it in Ideone).

Update: Here's the same graph for $n=100$, $k=5$. Honestly, I find that oscillation totally unexpected and hard to believe, but...

enter image description here

... but it seems that that surprising oscillation also appears in the somewhat simpler model (it can be evaluated numerically, at least) in which each vote is iid Bernoulli with $p=r/n$ (or equivalently, where the number of votes by each voter follows a Binomial distribution with mean $r$). Interesting.

Update 2: In the (not the same but similar) model in mentioned above, the probability of draw is given by

$$\begin{align} P_d(n,k,p) &= 1- n \sum_{m=1}^k B(m;p,k) \left[\sum_{t=0}^{m-1} B(t;p,k) \right]^{n-1}\\ &= 1- n \sum_{m=1}^k p^{m} (1-p)^{k-m} \binom{k}{m} \left[\sum_{t=0}^{m-1} p^{t} (1-p)^{k-t} \binom{k}{t} \right]^{n-1} \end{align} \tag{1} $$

where $p=r/n$, and $B(m;p,k)$ is the Binomial distribution with parameters $p,k$, evaluated at $m$. We should expect this model to give similar results to the original one, at least for not-so-small values of $n,r$; and indeed numerical experimentation seem to confirm this (green line in the graph above corresponds to $P_d$ in $(1)$).

Sadly, it doesn't seem easy to study the behaviour of $P_d$ in $(1)$ with respect to $p$, nor even asymptotically...

Added: more graphs of $P_d$, for $n=1000$, and $k=5,7,9,23$.

enter image description here

BTW, Matt F.'s answer seems to fit quite well with the first minimum (which is also the only local minimum for $k=2$).