Formula for the odds of succeeding at least once in n attempts at a 1/4096 roll, and at what point should it reasonably have been successful
I'm trying to calculate the "expected" success/failure rate for getting a shiny Pokémon after n encounters, which for my purposes has a flat 1/4096 chance of occurring.
To my understanding, the formula for the percentage of people expected to have succeeded would be 1-(1-x)^n, where n is the number of attempts and x is the chance of success on each attempt (in this case, 1/4096). The chance of failure would just be (1-x)^n.
I'm not too familiar with statistics, but this seems to check out. With a coin (x = 1/2), marking heads as a successful trial and tails as a failure, it holds up. With a standard 6-sided die (x = 1/6) it also seems to hold true, but something just feels off when it is applied to a huge number like 4096, and I want to make sure the formula still applies.
With 1/4096 odds, you can expect it to take 4096 tries on average to succeed, so using this number as our "n" gets us 1-(1-(1/4096))^4096 = ~0.6322. So would it be fair to estimate that about 63% of players still would not have gotten lucky, and about 37% would have? Would it therefore be a 50% split at around n=2839 attempts, since $\log_{1-(1/4096)}0.5$ = ~2838.78, meaning 50% of players should have gotten a successful roll by this many attempts?
Or is this misleading, or is the formula wrong to begin with?
Solution 1:
If $X$ is the random number of attempts needed to observe the first success, then $$\Pr[X \le x] = 1 - (1-p)^x,$$ where $p = \frac{1}{4096}$. This represents the probability of being successful within $x$ attempts. So for instance, if we want to compute the probability of being successful within $x = 4096$ attempts, this is $$\Pr[X \le 4096] = 1 - \left(1 - \frac{1}{4096}\right)^{4096} \approx 0.632165.$$ You have interpreted this to be the probability of remaining unsuccessful; instead, it means that on average, about $63.2\%$ of players would have at least one successful attempt in $4096$ tries.
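As a side note on why the formula does not break down for a large denominator such as $4096$: writing $N = 1/p$ (a symbol introduced here only for illustration), the chance of failing all $N$ attempts tends to a fixed limit, $$\left(1 - \frac{1}{N}\right)^{N} \longrightarrow e^{-1} \approx 0.3679 \quad \text{as } N \to \infty,$$ so $$1 - \left(1 - \frac{1}{4096}\right)^{4096} \approx 1 - e^{-1} \approx 0.6321.$$ For comparison, the same formula gives $1 - \left(\frac{1}{2}\right)^{2} = 0.75$ for a coin flipped twice and $1 - \left(\frac{5}{6}\right)^{6} \approx 0.6651$ for a die rolled six times, so the value for $4096$ is simply close to the limit rather than anomalous.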
The median number of attempts needed is given by $m$, where $$\Pr[X \le m] = 1 - (1 - p)^m = \frac{1}{2},$$ hence we require $$m = \frac{\log \frac{1}{2}}{\log (1-p)} \approx 2838.784,$$ but since $m$ must be a positive integer, we round up to get $m = 2839$ attempts needed. But all this says is that the chance of being successful within this many tries is 50-50.
The required number of attempts to have a $100(1-\alpha)\%$ chance of being successful is called a quantile (in this case, a percentile): it is $$\Pr[X \le q_{1-\alpha}] = 1 - (1 - p)^{q_{1-\alpha}} = 1 - \alpha$$ or $$q_{1-\alpha} = \left\lceil \frac{\log \alpha}{\log (1-p)} \right\rceil,$$ so for a $95\%$ chance, it is $$q_{0.95} = \left\lceil \frac{\log 0.05}{\log \frac{4095}{4096}} \right\rceil = 12270,$$ and for a $99\%$ chance, it is $$q_{0.99} = 18861.$$
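The figures above can be checked numerically. The following is a minimal Python sketch, not part of the original answer; the names `P_SHINY`, `prob_within`, and `attempts_for` are made up for illustration, and it assumes every attempt is independent with the same chance $p$.

```python
import math

# Per-attempt success chance taken from the question (assumed constant).
P_SHINY = 1 / 4096


def prob_within(n, p=P_SHINY):
    """Probability of at least one success in n independent attempts."""
    return 1 - (1 - p) ** n


def attempts_for(chance, p=P_SHINY):
    """Smallest whole number of attempts whose success probability is >= chance."""
    # Solve 1 - (1 - p)^n = chance for n, then round up to an integer.
    return math.ceil(math.log(1 - chance) / math.log(1 - p))


if __name__ == "__main__":
    print(f"P(success within 4096 attempts) = {prob_within(4096):.6f}")
    for chance in (0.5, 0.95, 0.99):
        n = attempts_for(chance)
        print(f"{chance:.0%} chance needs {n} attempts")
```

Rounding up with `math.ceil` mirrors the ceiling in the quantile formula: the number of attempts has to be a whole number, and rounding down would leave the probability slightly below the target.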
That said, because the outcome of each attempt is independent of all previous attempts, having already made $a$ unsuccessful attempts does not increase your chances of being successful in the future. In other words, if you tried $2000$ times and failed, you would still need another $2839$ attempts to have a 50-50 chance of success. This is what we call the memoryless property.
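The memoryless property can also be verified directly from the formula. This is a small illustrative Python sketch under the same independence assumption; the helper names are hypothetical. It compares the chance of success within $2839$ attempts from a fresh start with the conditional chance of success within $2839$ further attempts after $2000$ failures.

```python
# Per-attempt success chance taken from the question (assumed constant).
P_SHINY = 1 / 4096


def prob_within(n, p=P_SHINY):
    """Probability of at least one success in n independent attempts."""
    return 1 - (1 - p) ** n


def prob_within_given_failures(extra, already_failed, p=P_SHINY):
    """P(success within `extra` more attempts | first `already_failed` attempts all failed)."""
    # Conditional probability: P(a < X <= a + extra) / P(X > a).
    still_unsuccessful = (1 - p) ** already_failed
    success_in_window = prob_within(already_failed + extra, p) - prob_within(already_failed, p)
    return success_in_window / still_unsuccessful


fresh = prob_within(2839)
after_2000 = prob_within_given_failures(2839, 2000)

if __name__ == "__main__":
    # The two values agree up to floating-point rounding.
    print(f"fresh start:         {fresh:.6f}")
    print(f"after 2000 failures: {after_2000:.6f}")
```

The two probabilities match because the factor $(1-p)^{2000}$ cancels in the conditional probability, which is exactly what memorylessness means for this distribution.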