Regression towards the mean vs. the Gambler's fallacy

[TL;DR:] The key distinction, I think, is between the theoretical probability of the next event and the cumulative empirical proportion over many events.

The Gambler's Fallacy is the assumption that the probability on the 10th toss is anything other than exactly 50/50. That assumption is wrong (assuming the coin is fair).

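However, since the probability is 50/50 on every toss, the expected result over the next 90 throws is 45 heads and 45 tails (which is also the single most likely outcome). So if the proportion of heads was 90% in the first 10 tosses, the expected proportion over all 100 tosses (including the first ten) is (9 + 45)/100 = 54%. That is regression (moving closer) to the mean (here, 50%): the early streak is diluted by the later tosses, not cancelled out by them.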
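The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `expected_overall_proportion` and its parameters are hypothetical, and it assumes a fair coin with independent tosses.

```python
from fractions import Fraction

def expected_overall_proportion(heads_so_far, tosses_so_far, future_tosses,
                                p=Fraction(1, 2)):
    """Expected share of heads over all tosses, given what was already observed.

    Assumes independent tosses with heads probability p (fair coin by default).
    """
    expected_future_heads = p * future_tosses  # e.g. 45 heads out of 90
    total_tosses = tosses_so_far + future_tosses
    return (heads_so_far + expected_future_heads) / total_tosses

# 9 heads in the first 10 tosses, then 90 more tosses
print(float(expected_overall_proportion(9, 10, 90)))   # 0.54
# the same start, followed by 990 more tosses
print(float(expected_overall_proportion(9, 10, 990)))  # 0.504
```

Note that the expected number of heads in the future tosses never changes because of the early streak. The overall proportion moves toward 50% only because the first ten tosses make up a smaller and smaller share of the total.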

[Update:] Hadn't noticed it before, but @AndreNicholas got to it before me in the comments.


This is interesting because it shows how tricky the mind can be. I arrived at this web site after reading Kahneman's book "Thinking, Fast and Slow". I do not see a contradiction between the gambler's fallacy and regression towards the mean. According to the regression principle, the best prediction of the next measurement of a random variable is the mean. This is precisely what is assumed when each toss is treated as an independent event; that is, the mean (0.5 probability) is the best prediction. This applies to the next event's theoretical probability, so there is no need to consider a further batch of tosses.

The reason we are inclined to think that, after a "long" run of repeated outcomes, the best prediction is something other than the mean has to do with heuristics. According to Abelson's first law of statistics, "Chance is lumpy". I quote Abelson: "People generally fail to appreciate that occasional long runs of one or the other outcome are a natural feature of random sequences." Some studies have shown that people are poor generators of random numbers. When asked to write down a series of chance outcomes, subjects tend to avoid long runs of either outcome and write sequences that alternate quickly between outcomes. This is because we expect random outcomes to be "representative" of the process that generates them (a problem related to heuristics).

Therefore, assuming that the best prediction for the tenth toss in your example should be anything other than 0.5 is a consequence of what we unconsciously ("fast thinking") want to see represented in our sample. Foolish gamblers are bad samplers.

Alfredo Hernandez
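To see the "chance is lumpy" point for yourself, here is a small Python sketch that measures the longest run in a simulated sequence of fair tosses and compares it with a strictly alternating sequence. The helper `longest_run`, the seed, and the alternating sequence (a caricature of what people tend to write down) are illustrative assumptions, not taken from Abelson or from the studies mentioned.

```python
import random

def longest_run(seq):
    """Length of the longest run of identical consecutive items (0 if empty)."""
    best = current = 0
    previous = object()  # sentinel that compares unequal to every item
    for item in seq:
        current = current + 1 if item == previous else 1
        best = max(best, current)
        previous = item
    return best

rng = random.Random(0)  # fixed seed so the run is repeatable
fair = [rng.choice("HT") for _ in range(100)]
alternating = list("HT" * 50)  # caricature of a hand-written "random" sequence

print("".join(fair))
print("longest run, simulated fair coin:", longest_run(fair))
print("longest run, strict alternation:", longest_run(alternating))
```

Trying a few different seeds should show that runs of several identical outcomes in a row are routine in 100 fair tosses, even though such runs look "non-random" to most of us.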


The Gambler's Fallacy is the incorrect belief that after a sequence of random events of one kind, the next event is more likely to be of an opposite or different kind. In the case of a fair coin toss, the odds for the next toss are the same as for the previous one: an equal chance of heads or tails. That is not inconsistent with our sense that things tend to even out over time, as long as we appreciate that "over time" refers to a long sequence of events, not to a single event.

The result of a coin toss is random: it cannot be determined in advance, only described probabilistically, as 50/50 for each toss and as a proportion of heads that tends toward 50% over an indefinitely long sequence of tosses.
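Both halves of this claim can be checked with a rough simulation. The Python sketch below is only an illustration under stated assumptions: the helper names `heads_rate_after_streak` and `running_proportion`, the seed, the streak length of 5, and the sample size are all hypothetical choices, and the "coin" is Python's pseudo-random generator.

```python
import random
from itertools import accumulate

def heads_rate_after_streak(tosses, streak_len):
    """Share of heads among tosses that immediately follow `streak_len` heads in a row.

    Tosses are encoded as 1 = heads, 0 = tails. Returns None if no such streak occurs.
    """
    followers = [
        tosses[i]
        for i in range(streak_len, len(tosses))
        if all(tosses[i - streak_len:i])  # the previous streak_len tosses were all heads
    ]
    return sum(followers) / len(followers) if followers else None

def running_proportion(tosses):
    """Proportion of heads after each toss."""
    return [total / n for n, total in enumerate(accumulate(tosses), start=1)]

rng = random.Random(42)  # fixed seed so the run is repeatable
tosses = [rng.randint(0, 1) for _ in range(200_000)]

# Per event: a streak of heads does not change the chance on the next toss.
print("heads rate right after 5 heads in a row:", heads_rate_after_streak(tosses, 5))

# Per sequence: the cumulative proportion drifts toward 0.5 as tosses accumulate.
props = running_proportion(tosses)
for n in (10, 100, 10_000, 200_000):
    print(f"proportion of heads after {n} tosses: {props[n - 1]:.4f}")
```

If the reasoning above is right, the first number should come out close to 0.5, and the running proportion should settle near 0.5 as the number of tosses grows, without any toss "compensating" for earlier ones.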