Optimal strategy for Jackpot Rock Paper Scissors
Jackpot Rock Paper Scissors is a gambling variant of Rock Paper Scissors, wherein ties result in the wager being carried forward into a jackpot.
If a player plays the same hand (rock, paper or scissors) 3 or more consecutive turns and wins on that turn ... they win the total jackpot, and the game ends.
Each turn, the player must wager $1. The possible outcomes:
Winning: You get $2 (your money back, and the opponents)
Losing: You lose the $1 bet amount (it's given to the opponent)
Tie: You and your opponent lose $1. The jackpot increases by 2 dollars.
To prevent problem gambling (and infinite games), if 1000 turns elapse and the game has not ended and the jackpot is split 50-50,
The goal of the game is to maximize your own earnings -- not to necessarily win the jackpot.
Running a simple simulation, playing purely randomly -- I get these statistics:
--- Random vs Random (over 1000000 games) ---
Expected Return: -0.0212
Average turns: 48.355552
Average jackpot: 32.245392
Biggest jackpot: 440
Longest Game: 600 turns
Biggest Win: 210
----------------------
However, I have trivially been able to beat pure random play in my simulations (give more bias to a play that would win you the jackpot), and trivially been beat that strategy. What I have not been able to do, however, is come up with a strategy that I can not beat myself.
So I put it to you: For such a game, what would be the optimal strategy?
Solution 1:
Call the strategies of rock, paper, and scissors A, B, and C: C beats B beats A beats C. Label the possible positions in this game with $2(n-1)$ dollars in the pot as either: $T_{n-1}$ if the previous result was a tie; $G_{n-1}$ if player I has a winning streak of 1 using strategy A (where A could be any of rock, paper or scissors); or $H_{n-1}$ if player I has a winning streak of 2 using strategy A (and will thus take the pot if he wins the next round using strategy A). We are using $n-1$ for the amount in the pot rather than $n$ for algebraic convenience when presenting the sub-game matrices.
(Clearly, this omits the cases where player II has a winning streak, but the values of those positions $G^{-}_{n-1}$ and $H^{-}_{n-1}$ are just the negatives of the values of the corresponding $G_{n-1}$ and $H_{n-1}$ positions.)
Let the value of any position in this game be the optimum game value minus the result that would happen if the players were to quit and immediately split the pot. Then by symmetry $V(T_{n-1}) = 0$ for all $n >0$ and the optimal strategy for any $T_{n-1}$ position is the trivial one of choosing A, B, or C each with probability 1/3.
Because the value of $T_{n-1}$ is zero, in the position analyses the values and strategies are unaffected if we were to say we don't care how the $2(n-1$ dollars got into the pot, and that any tie ends the game by splitting the pot (and that the players. So the game overall for some value of $n-1$ is characterized by which of 4 positions you are at.
From $G_{n-1}$ the game may end (by a tie, in which case the players restart the game at $T_n$ but since that has zero value we don't care), transition to $H_{n-1}$ with player I gaining a dollar, or transition to $G^{-}_{n-1}$ with player II gaining a dollar.
From $H_{n-1}$ the game may end by a tie, transition to $G_{n-1}$ with player I gaining a dollar having won using strategy B or C, transition to $G^{-}_{n-1}$ with player II gaining a dollar, or end with player I winning the pot (gaining $n$ dollars altogether) by winning using strategy A.
For $n=1$, with no money in he pot, their is no advantage at all to having a winning streak, the position values are all zero, and the optimal strategies are the trivial equal-probability strategies. All the interesting features are in games where there is someting in the pot.
The two game matrices (for $n>1) are: $$ G_{n-1} = \left( \begin{array}{ccc} 0 & -g & h \\ +g & 0 & -g \\ -g & g & 0 \end{array} \right) $$ (where for convenience we introduce $g \equiv 1+V(G_{n-1})$ and $h \equiv 1+V(H_{n-1})$) and $$ H_{n-1} = \left( \begin{array}{ccc} 0 & -g & n \ +g & 0 & -g \ -g & g & 0 \end{array} \right) $$
$$ H_{n-1} = \left( \begin{array}{cc|c} 0 & -g & n \\ +g & 0 & -g \\ -g & g & 0 \end{array} \right) $$
To solve $G_{n-1}$ we do the usual mantra of subtracting columns and taking determinants: $$ G_{n-1}: \begin{array}{ccc|cc|c} 0 & -g & h & g & -h & 3g^2 \\ +g & 0 & -g & g & 2g & g^2 + 2hg \\ -g & g & 0 & -26 & -g & 2g^2 + hg \\ \hline -g & g & h+g & & & \\ g & -2g & h & & & \\ \hline 2g^2 + hg & g^2 + 2hg & 3g^2 & & & \end{array} $$ He game value is $\frac{hg-g^2}{3h+g6}$. So $$ g = 1 + V(G_{n-1}) = 1 + \frac{hg-g^2}{3h+g6} $$
Similarly, we solve $H_{n-1}$: $$ H_{n-1}: \begin{array}{ccc|cc|c} 0 & -g & n & g & -n & 3g^2 \\ +g & 0 & -g & g & 2g & g^2 + 2ng \\ -g & g & 0 & -26 & -g & 2g^2 + ng \\ \hline -g & g & n+g & & & \\ g & -2g & n & & & \\ \hline 2g^2 + ng & g^2 + 2ng & 3g^2 & & & \end{array} $$ The game value is $\frac{hg-g^2}{3h+g6}$. So $$ h = 1 + \frac{ng-g^2}{3n+g6} $$
Before working with these two equations, we can look at the situation for $n$ very large. There, $ h = 1 + g/3 + O(1/n)$ and we can substitute to find that $$ \begin{array}{l} g = \frac{15+\sqrt{1053}}{46} \approx 1.031521 \\ V(G_{\infty}) \approx 0.031521 \\ h = 1 + g/3 \approx 1.34384 \\ V(H_{\infty}) \approx 0.34384 \end{array} $$ That is, with a very large pot, if player I has a winning streak of 2 wins playing strategy A, player II will very rarely (probability $\frac{3g}{3N+6g}$) risk using strategy C, which could result in losing the whole pot. Player I will almost always choose strategies B (about 2/3 of the time) or C (about 1/3), and the game is favorable to player I with a value of $+\frac{1}{3}$. And if player 1 has just a 1 game winning streak, then the roughly 1.33 reward in the upper right hand corner (going over to $H_{n-1}$ with another A win) biases the game in player I's favor, but only by about 0.03.
Okay, now look at the case for $n$ not large enought to ignore $1/n$ effects: Substituting the expression for $h$ into the expression for $g$ we find $$ 40g^3 + (23n-21)g^2 - (15n+18)g - 9n = 0 $$
So for example, if there is $2 \times 1$ dollars in the pot ($n = 2$) then the value of the game to a player with a winning streak is $G_1 \approx 0.008118$ and the value of a two game winning streak is $G_1 \approx 0.082991$. In game $G_1$ the strategy for player I is to choose A (the winning streak choice) to (B which beats A) to C in ratio $$ 3g : g + 2h : 2g + h = 3.024 : 3.172 : 3.099 $$ and in game $H_1$ the ratios are $$ 3g : g + 2n : 2g + n = 3.024 : 5.008 : 4.016 $$ The game values for $n = 11$ (ten dollars in the pot) are $G_{10} = 0.024413$ and $H_{10} = 0.261048$; the strategies for player 1 in game $H_{10}$ are in ratio $$ 3.073 : 21.049: 1.024 \approx 1 : 7 : 4 $$ and for player II they are in ratio $$ 2n+g : n+2g : 3g = 21:024 : 12.049 : 3.073 \approx 7 : 4 : 1 $$ for A, B and C in that order.