I've been thinking about the following game for a while and am curious whether anyone has ideas on how to analyze it.

Problem description

Say I have two biased coins: coin 1 that shows heads with probability $p$ and coin 2 that shows heads with probability $q$. You and I both know the statistics of the coins.

The game proceeds in multiple rounds as follows (a simulation sketch of the protocol appears after the list):

  • In the starting round $n=0$:

    • I (privately) pick a coin and flip it, we both observe the outcome
    • you decide whether to guess which coin I just flipped or to continue watching
    • if you guess correctly, I pay you $\$100$; if you guess incorrectly, you receive no reward and the game is over
  • At each subsequent round $n\ge 1$:

    • I decide to stay with my current coin or reach into my pocket and swap out the current coin for the other coin
    • you can see whether I swapped out the coin or not (assume I must switch if I reach into my pocket)
    • I flip the coin and we both observe the outcome
    • you decide to guess which coin was just flipped, or keep watching
    • if you guess correctly, I pay you $\$100\cdot\delta^n$, where $\delta\in(0,1)$; if you guess incorrectly the game ends with you getting nothing
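To make the protocol concrete, here is a minimal Python sketch of the game loop. All the names are placeholders of mine, the uniformly random initial pick is just an assumption (in the real game that pick is part of my strategy), and the toy policies at the end only exercise the interface.

```python
import random

def play_game(p, q, delta, switcher_policy, guesser_policy, rng=None):
    """Simulate one play of the game described above.

    switcher_policy(history, coin) -> True to swap before the next flip
        (only consulted for rounds n >= 1; `coin` is the switcher's private
        knowledge of which coin is currently in hand)
    guesser_policy(history) -> None to keep watching, or 1 / 2 to guess
    `history` is the public record: a list of (swapped, outcome) pairs,
    with outcome 'H' or 'T'.  Returns the guesser's payout in dollars.
    """
    rng = rng or random.Random()
    probs = {1: p, 2: q}
    coin = rng.choice([1, 2])   # assumption: uniform initial pick; in general
                                # the initial pick is part of the switcher's strategy
    history = []
    n = 0
    while True:
        swapped = False
        if n >= 1 and switcher_policy(history, coin):
            coin = 3 - coin     # swap 1 <-> 2, visible to the guesser
            swapped = True
        outcome = 'H' if rng.random() < probs[coin] else 'T'
        history.append((swapped, outcome))
        guess = guesser_policy(history)
        if guess is not None:
            return 100 * delta ** n if guess == coin else 0.0
        n += 1

if __name__ == "__main__":
    # Naive placeholder policies, just to exercise the interface.
    always_swap = lambda history, coin: True
    guess_after_first_toss = lambda history: 1 if history[0][1] == 'H' else 2
    payouts = [play_game(0.8, 0.3, 0.9, always_swap, guess_after_first_toss,
                         rng=random.Random(i)) for i in range(20000)]
    print(sum(payouts) / len(payouts))   # roughly 75 for these parameters
```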

Question

I want to find the "best" switching strategy to minimize the (expected) amount of money I have to pay you.

Notes

The probabilities $p$ and $q$ can take on any value, but let's assume that they cannot be equal.

Since you are trying to maximize your reward, the discount factor $\delta$ incentivizes you to guess correctly as quickly as possible.

Since there are only two coins and you observe when I switch, you are trying to discern between two possible coin sequences, one where the initial coin was coin 1 and the other where the initial coin was coin 2.
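Concretely, because you see every swap, your belief boils down to a single posterior probability that my initial coin was coin 1. Here is a sketch of that update; the uniform prior is a placeholder assumption, since in the real game your prior would be whatever mixed strategy you expect me to use for the initial pick.

```python
def posterior_initial_coin_1(history, p, q, prior=0.5):
    """Posterior probability that the initial coin was coin 1, given the public
    history of (swapped, outcome) pairs with outcome 'H' or 'T'.
    Because every swap is observed, each hypothesis about the initial coin
    determines exactly which coin was flipped at every round, so this is a
    plain two-hypothesis likelihood computation.
    The default uniform prior is a placeholder assumption."""
    like1 = like2 = 1.0           # likelihood of the outcomes under each hypothesis
    coin_if_1, coin_if_2 = 1, 2   # coin in play this round, under each hypothesis
    for swapped, outcome in history:
        if swapped:
            coin_if_1, coin_if_2 = 3 - coin_if_1, 3 - coin_if_2
        ph1 = p if coin_if_1 == 1 else q   # P(heads) this round if the initial coin was 1
        ph2 = p if coin_if_2 == 1 else q   # P(heads) this round if the initial coin was 2
        like1 *= ph1 if outcome == 'H' else 1 - ph1
        like2 *= ph2 if outcome == 'H' else 1 - ph2
    return prior * like1 / (prior * like1 + (1 - prior) * like2)
```

Guessing the coin with the larger posterior is the myopically best move; whether to guess now or keep watching is exactly the stopping problem that the discount factor creates.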

My first thoughts are that I would want to keep the empirical averages (of the two sequences) as close as possible to each other. Intuitively this will be easy if $p$ and $q$ are close, but hard if they are far apart.


Here are solutions for some very special cases. Assume, without loss of generality, that $\ p> q\ $.

If $\ p+q \le 1\ $, and $\ \delta\le \frac{1-q}{2-p-q}\ $, you can keep my expected winnings to at most $\ \frac{100\,(1-q)}{2-p-q}\ $ dollars by choosing coin $1$ with probability $ \frac{1-q}{2-p-q}\ $ and coin $2$ with probability $ \frac{1-p}{2-p-q}\ $. I can ensure that my expected winnings are at least $\ \frac{100\,(1-q)}{2-p-q}\ $ dollars by guessing coin $1$ if the result of the first toss is heads, or coin $1$ with probability $ \frac{1-p-q}{2-p-q}\ $ and coin $2$ with probability $ \frac{1}{2-p-q}\ $ if the result of the first toss is tails. Because I can't win more than $\ 100\,\delta \le \frac{100\,(1-q)}{2-p-q}\ $ dollars by waiting until the next toss, I have nothing to gain by doing so.
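As a sanity check on this case, here is a small sympy sketch verifying symbolically that the guessing strategy above wins with probability exactly $\ \frac{1-q}{2-p-q}\ $ whichever coin is flipped first, so it really is a guarantee:

```python
import sympy as sp

p, q = sp.symbols('p q', positive=True)

# Guessing strategy for the p + q <= 1 case: guess coin 1 on heads; on tails
# guess coin 1 with probability a and coin 2 with probability 1 - a.
a = (1 - p - q) / (2 - p - q)

# Probability of a correct guess, conditioned on which coin was actually flipped.
win_if_coin1 = p * 1 + (1 - p) * a
win_if_coin2 = q * 0 + (1 - q) * (1 - a)

target = (1 - q) / (2 - p - q)
print(sp.simplify(win_if_coin1 - target))   # 0
print(sp.simplify(win_if_coin2 - target))   # 0
```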

If $\ p+q = 1\ $ (and therefore $\ p>\frac{1}{2}\ $, given the above assumption that $\ p>q\ $), and $\ \delta\le p $, you can keep my expected winnings to at most $\ 100\,p\ $ by choosing either coin $1$ or $2$ with probability $\ \frac{1}{2}\ $ each. I can ensure my expected winnings are at least $\ 100\,p\ $ by guessing coin $1$ if the result of the first toss is heads, or coin $2$ if the result of the first toss is tails. Again I can do no better by waiting for the second toss.

If $\ p+q > 1\ $, and $\ \delta\le \frac{p}{p+q}\ $, you can keep my expected winnings to at most $\ \frac{100\,p}{p+q}\ $ by choosing coin $1$ with probability $\ \frac{q}{p+q}\ $ and coin $2$ with probability $\ \frac{p}{p+q}\ $. I can ensure my expected winnings are at least $\ \frac{100\,p}{p+q}\ $ by guessing coin $2$ if the result of the first toss is tails, or coin $1$ with probability $\ \frac{1}{p+q}\ $ and coin $2$ with probability $\ \frac{p+q-1}{p+q}\ $ if the result of the first toss is heads. Once again, I can't do better by waiting for the second toss.
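All three closed forms can also be checked numerically by looking at the first toss in isolation: against a prior $\ \pi\ $ on coin $1$, a player who guesses immediately and optimally wins with probability $\ \max(\pi p,(1-\pi)q)+\max(\pi(1-p),(1-\pi)(1-q))\ $, and minimizing this over $\ \pi\ $ should reproduce the values above (the conditions on $\ \delta\ $ are exactly what makes waiting unprofitable). A rough grid-search sketch:

```python
def one_toss_value(p, q, grid=100001):
    """Value to the guesser (as a fraction of $100) of the game truncated to a
    single toss: minimize, over the switcher's prior pi = P(coin 1), the
    winning probability of a guesser who guesses immediately and optimally."""
    return min(
        max(pi * p, (1 - pi) * q) + max(pi * (1 - p), (1 - pi) * (1 - q))
        for pi in (i / (grid - 1) for i in range(grid))
    )

def closed_form(p, q):
    """Closed forms from the three cases above (assuming p > q)."""
    return (1 - q) / (2 - p - q) if p + q <= 1 else p / (p + q)

for p, q in [(0.4, 0.2), (0.7, 0.3), (0.9, 0.6)]:   # one pair per regime
    print(p, q, round(one_toss_value(p, q), 4), round(closed_form(p, q), 4))
```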


This is a problem of Bayesian parameter estimation. The following link, Bayesian coin flips, might be useful. If you are interested in simulation, check out this link.

In any case, the essential point here is that the guessing player can keep track of the outcomes produced by each of the two coins (even if they do not initially know which one is the $p$-coin and which one is the $q$-coin).

The best strategy for the switching player is to keep the number of outcomes produced by each of the two coins equal, or as close to equal as possible, because the more outcomes the guessing player observes from a given coin, the higher their probability of identifying it correctly. In other words, the switching player should switch coins at almost every stage of the game.
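Here is a sketch of that count-balancing rule, written to plug into the `play_game` simulation sketch earlier in the thread (the switching player knows which coin is currently in hand, so the per-coin counts can be reconstructed from the public history together with that knowledge). This is only the heuristic; no optimality claim is made:

```python
def balance_counts_switcher(history, current_coin):
    """Count-balancing heuristic: swap whenever the coin currently in hand has
    already been flipped at least as often as the other one, so that the two
    per-coin sample counts never differ by more than one.  In practice this
    means switching at (almost) every round.  Heuristic sketch only."""
    counts = {1: 0, 2: 0}
    coin = current_coin                  # coin used at the most recent flip
    for swapped, _ in reversed(history):
        counts[coin] += 1
        if swapped:                      # that coin was swapped in before the flip,
            coin = 3 - coin              # so the previous round used the other coin
    return counts[current_coin] >= counts[3 - current_coin]
```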

Edit. If you want an interesting statistical approach, see the related question confidence two biased dice are the same (and, in one of the comments, the Kolmogorov-Smirnov test). The transition from dice to coins is obvious.