A die is rolled once. Call the result N. Then, the die is rolled N times, and those rolls which are equal to or greater than N are summed

A die is rolled once. Call the result N. Then, the die is rolled N times, and those rolls which are equal to or greater than N are summed (other rolls are not summed). What is the distribution of the resulting sum? What is the expected value of the sum?

The probability of a sum of $k$ is the coefficient of $x^k$ in the polynomial

$$\frac{1}{6}(\frac{1}{6}(x + x^2 + x^3 + x^4 + x^5 + x^6))+ \frac{1}{6}(\frac{1}{6}(1 + x^2 + x^3 + x^4 + x^5 + x^6))^2+\frac{1}{6}(\frac{1}{6}(2 + x^3 + x^4 + x^5 + x^6))^3+ \frac{1}{6}(\frac{1}{6}(3 + x^4 + x^5 + x^6))^4 + \frac{1}{6}(\frac{1}{6}(4 + x^5 + x^6))^5+ \frac{1}{6}(\frac{1}{6}(5 + x^6))^6$$

Why do we have the constants $1, 2, 3, 4, 5$ in these terms? Why don't we just have the powers of $x$?


Solution 1:

I will expand slightly on my and Jaap's comments.

Denote by $S$ the random variable of the resulting sum. By the law of total probability, you can find its distribution by

$$ P(S=s) = \sum_{n=1}^6 P(S=s|N=n) P(N=n) = \frac16 \sum_{n=1}^6 P(S=s|N=n). $$

As the OP described, the probabilities $P(S=s|N=n)$ can be found via generating functions. I will briefly describe the idea and highlight why the constants must be included in the terms. Define the generating functions

$$f_n(x) = \left( \frac{n-1}6 + \sum_{i=n}^6 \frac{x^i}{6} \right)^n. $$

Then, $P(S=s|N=n)$ is the coefficient of $x^s$ in $f_n(x)$. This is because, conditioned on $N=n$, the variable $S$ is the sum of $n$ independent variables, each of which takes the value $0$ with probability $\frac{n-1}{6}$ (the roll is below $n$ and is not summed) and each of the values $n,n+1,\dots,6$ with probability $\frac16$. Hence, the coefficient of $x^s$ in $f_n(x)$ adds up the probabilities of all outcomes in which the sum is $S=s$. Crucially, in computing this probability, we need to take into account the outcomes in which some of the rolls contribute $0$. Consider, for example, $N=2$ and $S=5$. The sum $S=5$ can be obtained from the outcomes $0+5$, $5+0$, $2+3$, or $3+2$. Therefore, it is necessary to include the constants $\frac{n-1}{6}$ in the generating functions, as they account for the rolls that contribute $0$.
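To make the $N=2$, $S=5$ case concrete, here is the coefficient worked out by hand (this only applies the definition of $f_2$ given earlier):

$$ f_2(x) = \left( \frac16 + \frac{x^2 + x^3 + x^4 + x^5 + x^6}{6} \right)^2, \qquad [x^5]\, f_2(x) = \underbrace{2 \cdot \frac16 \cdot \frac16}_{0+5,\ 5+0} + \underbrace{2 \cdot \frac16 \cdot \frac16}_{2+3,\ 3+2} = \frac{4}{36} = \frac19. $$

Without the constant $\frac16$, only the second term would survive and the coefficient would be $\frac{1}{18}$, which misses the outcomes where one of the two rolls is a $1$.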

Putting everything together, we see that the probability $P(S=s)$ is the coefficient of $x^s$ in $\frac16 \sum_{n=1}^6 f_n(x)$, which is exactly the polynomial presented by the OP.
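The question also asks for the distribution and the expected value explicitly. A minimal sketch of how one might compute them is below: it expands each $f_n$ by repeated polynomial multiplication with exact fractions and then averages over $n$. The function names `conditional_pgf` and `sum_distribution` are my own, chosen for illustration, and a fair six-sided die is assumed.

```python
from fractions import Fraction


def conditional_pgf(n, sides=6):
    """Coefficients of f_n(x): entry s is P(S = s | N = n)."""
    # A single roll contributes 0 with probability (n-1)/sides,
    # or the value i with probability 1/sides for each i >= n.
    single = [Fraction(0)] * (sides + 1)
    single[0] = Fraction(n - 1, sides)
    for i in range(n, sides + 1):
        single[i] = Fraction(1, sides)

    # Raise the single-roll polynomial to the n-th power.
    result = [Fraction(1)]
    for _ in range(n):
        product = [Fraction(0)] * (len(result) + sides)
        for a, pa in enumerate(result):
            for b, pb in enumerate(single):
                product[a + b] += pa * pb
        result = product
    return result


def sum_distribution(sides=6):
    """Coefficients of (1/sides) * sum_n f_n(x): entry s is P(S = s)."""
    total = [Fraction(0)] * (sides * sides + 1)
    for n in range(1, sides + 1):
        for s, p in enumerate(conditional_pgf(n, sides)):
            total[s] += p / sides  # P(N = n) = 1/sides
    return total


dist = sum_distribution()
expected = sum(s * p for s, p in enumerate(dist))
print(expected)  # 133/18, roughly 7.39
```

As a cross-check, linearity of expectation gives $E[S \mid N=n] = n \sum_{i=n}^6 \frac{i}{6}$, and averaging over $n = 1, \dots, 6$ gives $\frac{266}{36} = \frac{133}{18}$, in agreement with the computed value.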