Would you ever stop rolling the die?
You have a six-sided die. You keep a cumulative total of your dice rolls. (E.g. if you roll a 3, then a 5, then a 2, your cumulative total is 10.) If your cumulative total is ever equal to a perfect square, then you lose, and you go home with nothing. Otherwise, you can choose to go home with a payout of your cumulative total, or to roll the die again.
My question is about the optimal strategy for this game. In particular, this means that I am looking for an answer to this question: if my cumulative total is $n$, do I choose to roll or not to roll in order to maximize my expected payout? Is there some integer $N$ after which the answer to this question is always to roll?
I think that there is such an integer, and I conjecture that this integer is $4$. My reasoning is that the square numbers become sufficiently sparse that the expected value is always increased by rolling the die again.
As an example, suppose your cumulative total is $35$. Rolling a $1$ and hitting 36 means we go home with nothing, so the expected value of rolling once is:
$$E(Roll|35) = \frac 0 6 + \frac {37} 6 + \frac {38} 6 + \frac{39} 6 + \frac {40} {6} + \frac{41}{6} = 32.5$$
i.e.
$$E(Roll|35) = \frac 1 6 \cdot (37 + 38 + 39 + 40 + 41) = 32.5$$
But the next square after $36$ is $49$. So in the event that we don't roll a $36$, we get to keep rolling the die at no risk as long as the cumulative total is at most $42$. For the sake of simplification, let's say that if we roll and don't hit $36$, then we will roll exactly once more. That die roll has an expected value of $3.5$. This means the expected value of rolling on $35$ is:
$$E(Roll|35) = \frac 1 6 \cdot (40.5 + 41.5 + 42.5 + 43.5 + 44.5) = 35.42$$
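Both estimates are easy to reproduce with a short calculation (a sketch; the $3.5$ bonus in the second figure assumes exactly one extra risk-free roll, as described above):

```python
# EV of rolling once at 35: rolling a 1 hits 36 (a square) and pays 0.
one_step = sum(35 + d for d in range(1, 7) if 35 + d != 36) / 6
print(one_step)  # (37+38+39+40+41)/6 = 32.5

# Refined estimate: each surviving total takes one more risk-free roll,
# worth 3.5 on average (safe because the next square, 49, is out of reach).
two_step = sum(35 + d + 3.5 for d in range(1, 7) if 35 + d != 36) / 6
print(round(two_step, 2))  # 35.42
```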
And since $35.42 > 35$, the profit-maximizing choice is to roll again. This reasoning can seemingly be applied to every total; I don't see when rolling would cease to be the reasonable move, though I haven't attempted to verify it computationally. I intuitively think about this in terms of diverging sequences.
I recently had this question in a job interview, and thought it was quite interesting. (And counter-intuitive, since this profit-maximizing strategy invariably results in going home with nothing.)
Solution 1:
How to use your observation in general:
Checking a single value like $35$ isn't indicative of the general case, which is different for large totals. Take your argument and apply it to a general square $n^2$.
Suppose you're at $n^2-1$. You can leave with $n^2-1$, or you can roll. On a roll, you lose everything with probability $\frac{1}{6}$. With probability $\frac{5}{6}$ you'll get at least $(n+1)^2-6 = n^2 + 2n - 5$ by rolling until it isn't safe to any more. So, for a simple lower bound, we want to know if $\frac{5}{6}(n^2 + 2n - 5)$ is greater than $n^2-1$. For large $n$, this is not the case, and we can actually get an upper bound by similar reasoning:
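This flip is easy to locate numerically (a quick sketch; it just compares $\frac{5}{6}(n^2+2n-5)$ with $n^2-1$ for each $n$):

```python
# Does the crude lower bound for rolling at n^2 - 1 beat staying?
def bound_favors_rolling(n):
    return 5 * (n**2 + 2 * n - 5) / 6 > n**2 - 1

# Start at n = 3; for smaller n the square spacing makes the bound vacuous.
first_fail = next(n for n in range(3, 50) if not bound_favors_rolling(n))
print(first_fail)  # 8: from n = 8 on, this bound no longer justifies rolling
```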
An upper bound:
[A tidied up version of the original follows. Keeping the original upper bound would have required a few extra lines of logic, so I've just upped it slightly to keep it brief.]
Suppose we're between $n^2-6$ and $n^2-1$. If we roll, the best things could go for us would be to lose $\frac{1}{6}$ of the time and, when we don't lose, to land in the range $(n+1)^2-6$ to $(n+1)^2-1$ (the highest we could get without taking another risk). Just comparing current valuations, you're trading at least $n^2-6$ for at most $\frac{5}{6}(n^2 + 2n)$ by the time you get to the next decision. The difference is $-\frac{1}{6}n^2 + \frac{5}{3}n + 6$, which is negative for $n \geq 13$. So if you're at $13^2-k$ for $1 \leq k \leq 6$, don't roll. (Higher $n$ makes this difference even more negative, which gives the general conclusion; see the next paragraph for details.)
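The sign claim is simple to verify (a sketch; it compares the best-case post-roll value $\frac{5}{6}(n^2+2n)$ against $n^2-6$, the low end of the danger zone):

```python
# Best-case value after rolling, minus the worst current value in the
# danger zone [n^2 - 6, n^2 - 1]. Rolling near n^2 is bad when negative.
def diff(n):
    return 5 * (n**2 + 2 * n) / 6 - (n**2 - 6)

# The quadratic is still positive at n = 12 but negative from n = 13 on.
print(diff(12), diff(13))  # 2.0 -0.5
```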
Extra details for the logic giving this upper bound: Let $W_n$ be the expected winnings of your strategy, evaluated at the first time you are within $6$ of $n^2$ (counting runs that already busted or stopped on their own at a lower value). The above shows that, if your strategy ever includes rolling when within $6$ of $n^2$ for some $n \geq 13$, then $W_{n+1}$ is less than $W_n$. Therefore nothing increases without bound, and you should indeed never go above $168$.
An easy lower bound (small numbers):
For $n = 3$ we have $(5+6+7+8)/6 > 3$ (both $4$ and $9$ are reachable), so roll at $n = 3$. Except for the case $n = 3$, if we roll at $n$, the five surviving outcomes average at least $n+3$, so rolling is worth at least $\frac{5}{6}(n+3)$. Thus if $\frac{5}{6}(n+3) - n > 0$, i.e. $n < 15$, roll again. At $n = 15$ the only losing roll is a $1$, so the survivors average $n+4$ and $\frac{5}{6}(15+4) > 15$; at $n = 17$ and $n = 18$ we can't bust at all. So roll again for all $n \leq 18$.
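A direct check of the one-step expected values confirms this (a sketch; it shows that a single roll followed by stopping already beats staying for every attainable $n \leq 18$):

```python
import math

def is_square(k):
    r = math.isqrt(k)
    return r * r == k

def one_roll_ev(n):
    # EV of rolling exactly once from n and then stopping (squares pay 0).
    return sum(0 if is_square(n + d) else n + d for d in range(1, 7)) / 6

for n in range(2, 19):
    if not is_square(n):
        assert one_roll_ev(n) > n
print("rolling beats staying for every non-square n <= 18")
```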
An algorithm and reported results for the general case:
Start by assigning expected value $n$ and the decision "stay" to each $n$ with $163 \leq n \leq 168$ (by the upper bound, you never roll there). Then work backwards, treating perfect squares as busts worth $0$, to obtain the EV and decision at each smaller $n$. From this, you can read off where to roll or stay, and which totals are reachable under the optimal strategy.
A quick script I wrote outputs the following: Stay at 30, 31, 43, 44, 45, 58, 59, 60, 61, 62, and 75+. You'll never exceed 80. The overall EV of the game is roughly 7.2. (Standard disclaimer that you should program it yourself and check.)
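A minimal backward-induction script along these lines might look as follows (a sketch in Python; it assumes the $163$–$168$ boundary from the upper-bound argument, and the variable names are mine):

```python
import math

CAP = 168  # 13^2 - 1: by the upper bound, never roll at 163 or beyond

def is_square(k):
    r = math.isqrt(k)
    return r * r == k

ev = [0.0] * (CAP + 1)      # ev[s] stays 0 for squares (busts)
roll = [False] * (CAP + 1)  # optimal decision at each total

for n in range(CAP, -1, -1):
    if is_square(n) and n > 0:
        continue                 # losing state
    if n >= CAP - 5:
        ev[n] = n                # boundary: stay on 163..168
        continue
    roll_ev = sum(ev[n + d] for d in range(1, 7)) / 6
    ev[n] = max(n, roll_ev)
    roll[n] = roll_ev > n

stay = [n for n in range(2, CAP + 1) if not is_square(n) and not roll[n]]
print("game EV:", round(ev[0], 2))
print("first stay totals:", stay[:5])
```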
Solution 2:
If you keep rolling the die forever, you will hit a perfect square with probability 1. Intuitively, every time you get close to a square (within distance 6), you have a 1/6 chance to hit the square. This happens infinitely many times, and so you're bound to hit a square eventually.
Slightly more formally, suppose that you roll the die infinitely often whatever happens. Your trajectory (sequence of partial sums) has the property that the difference between adjacent points is between $1$ and $6$. In particular, for each number $N$, there will be a point $x$ in the trajectory such that $N-6 \leq x < N$. If $x$ is the first such point, then you have a chance of $1/6$ to hit $N$ as your next point. Furthermore, if $N_2 > N_1+6$, then these events are independent. So your probability of hitting either $N_1$ or $N_2$ at the first shot once "in range" is at least $1-(5/6)^2$. The same argument works for any finite number of separated points, and given infinitely many points, no matter how distant, we conclude that you hit one of them almost surely.
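A quick simulation illustrates this (a sketch; the cap of $10^5$, the trial count, and the seed are arbitrary choices of mine):

```python
import math
import random

def is_square(k):
    r = math.isqrt(k)
    return r * r == k

def always_roll_busts(cap=100_000, rng=random):
    """Roll forever; return True if the total hits a perfect square before cap."""
    total = 0
    while total < cap:
        total += rng.randint(1, 6)
        if is_square(total):
            return True
    return False

random.seed(0)
trials = 1_000
busts = sum(always_roll_busts() for _ in range(trials))
# Virtually certain to print 1000/1000: surviving ~314 squares is hopeless.
print(f"{busts}/{trials} trajectories hit a square")
```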