Kolmogorov's probability axioms

You seem to be tackling several issues at once. First though, some inaccuracies. You write "when creating a system of axioms like these..." I'm not sure what 'these' refers to. Then you say "it's necessary the list of axioms is complete." Do you mean by 'complete' that there is only one model of the axioms (up to isomorphism)? If so, why is that necessary for modelling probability events? Your comparison with the axioms of geometry is unclear as well. If you omit the fifth, you do not automatically get hyperbolic geometry; you can also get projective geometry. To claim that any of those is not what we wanted to have is peculiar, particularly from a modern perspective. Geometry encompasses much more than just Euclidean geometry. And again, even with the fifth there is not just one Euclidean geometry (up to isomorphism), but infinitely many (of various dimensions).

Now I will try to address the question of what is so great about Kolmogorov's axiomatisation. The mathematics of probability is fraught with difficulties, both conceptual and technical. There are endless examples of seemingly simple questions that turn out to be very complicated or have severely counterintuitive answers (the Monty Hall paradox, for instance). Problems that appear identical may turn out to be significantly different just because of changes in the protocol. In short, it's not easy.

Having said that, the probability theory of finite probability spaces is quite simple, at least in the sense that it is clear how to model finite probability spaces: Given a finite set of equally likely outcomes, the probability of an event (a subset of the outcomes) is the ratio of the size of that subset to the size of the entire set. Sweet. From it flows quite a lot, but only when the total set of outcomes is finite.
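As a minimal sketch of the finite case, assuming equally likely outcomes: the fair six-sided die and the helper name `prob` below are illustrative choices, not part of the question.

```python
from fractions import Fraction

# Sample space: outcomes of one roll of a fair six-sided die (illustrative assumption).
omega = frozenset(range(1, 7))

def prob(event, space=omega):
    """Probability of an event (a subset of the space) when all outcomes are equally likely."""
    event = frozenset(event)
    if not event <= space:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(space))

even = {2, 4, 6}
at_least_five = {5, 6}

print(prob(even))                  # 1/2
print(prob(at_least_five))         # 1/3
print(prob(even | at_least_five))  # 2/3
```

Exact fractions are used so that the ratios are not blurred by floating-point rounding.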

Often, the set of outcomes is infinite. For instance, modelling throwing a dart at a dartboard is often done by imagining the dartboard as a disk in $\mathbb R^2$, and then a throw of a dart corresponds to a choice of a point in the disk. Of course the disk has infinitely many points. What is the probability that the dart hits a given point, say the centre of the disk? Well, assuming the dart lands randomly according to a uniform distribution over all points, the only possible answer is $0$. A point is just too small. This is already counterintuitive enough and raises the question of how to model all of this. Well, this is all related to the notion of how big a set is. An innocent question with a highly complicated answer. It's not simple at all to develop the theory that answers this question: measure theory. Issues related to the axiom of choice quickly crop up. A famous theorem of Vitali shows that it is impossible (assuming the axiom of choice) to assign a measure to each and every subset of $\mathbb R$ in a way that is countably additive, translation invariant, and gives every interval its length.
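A rough simulation can make the dartboard claim concrete. The sketch below assumes a unit disk and a uniformly distributed landing point. The function names and the choice of radii are hypothetical. It compares the simulated chance of landing within distance $r$ of the centre with the area ratio $r^2$, which shrinks to $0$ as $r$ does.

```python
import random

def throw_dart(rng):
    """Draw a point uniformly from the unit disk by rejection sampling."""
    while True:
        x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:
            return x, y

def hit_fraction(radius, n_throws, rng):
    """Fraction of throws landing within `radius` of the centre."""
    hits = 0
    for _ in range(n_throws):
        x, y = throw_dart(rng)
        if x * x + y * y <= radius * radius:
            hits += 1
    return hits / n_throws

def exact_probability(radius):
    """Area of the small disk over the area of the unit disk: (pi r^2) / (pi 1^2)."""
    return radius ** 2

rng = random.Random(0)  # fixed seed so that runs are repeatable
for radius in (0.5, 0.1, 0.01):
    estimate = hit_fraction(radius, 100_000, rng)
    print(radius, exact_probability(radius), estimate)
```

The simulation only illustrates the trend. The statement that a single point has probability exactly $0$ is a statement about area, not something a finite number of throws can establish.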

Now, measure theory was not developed to provide foundations for probability theory. Instead, it arose from questions of integrability. Kolmogorov's wonderful insight was that he realised the same formalism can be used to turn the intuition of what probability theory should be (as you say, pretty obvious axioms) into actual axioms. Before measure theory and Kolmogorov's seminal contribution, nobody knew how to meaningfully and accurately work with infinite probability spaces. Thanks to Kolmogorov, a formalism was born. Now that is truly wonderful.
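For reference, here are the axioms in their usual modern textbook form. The notation $(\Omega,\mathcal F,P)$ is the standard one and is not taken from the question. A probability space consists of a set $\Omega$, a $\sigma$-algebra $\mathcal F$ of subsets of $\Omega$ (the events), and a function $P\colon\mathcal F\to\mathbb R$ such that

$$\begin{aligned}
&\text{(1)}\quad P(A)\ge 0 \ \text{ for all } A\in\mathcal F,\\
&\text{(2)}\quad P(\Omega)=1,\\
&\text{(3)}\quad P\Bigl(\bigcup_{n=1}^{\infty}A_n\Bigr)=\sum_{n=1}^{\infty}P(A_n) \ \text{ for pairwise disjoint } A_1,A_2,\ldots\in\mathcal F.
\end{aligned}$$

In the dartboard example one would take $\Omega$ to be the disk, $\mathcal F$ the Borel subsets of the disk, and $P(A)=\operatorname{area}(A)/\operatorname{area}(\Omega)$, so that a single point gets probability $0$. Vitali's theorem is the reason $\mathcal F$ is not simply taken to be the collection of all subsets.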

Lastly, the paragraph you quote is talking about something altogether different. Quantum mechanical considerations defy many conceptually obvious properties, among them Kolmogorov's axiomatisation of probability. In the world of quantum mechanics even probability behaves differently from what we are used to. Such is life.


Kolmogorov was interested both in axioms and in how probability is realized in systems. For the latter, see this paper.

Probability is notoriously difficult to correctly axiomatize. Kolmogorov's probability was a revolution in that it laid the foundations for a theory that is not only rigorous, but very applicable. The only similar "easy" example I can think of is the notion of compact sets for proving stuff in real analysis.

Kolmogorov's axioms by themselves are nothing new. However, it was Kolmogorov's reinterpretation of probability through measure theory that was truly revolutionary. This allowed for a much broader and more rigorous foundation for probability theory. Everything from Kolmogorov's 0-1 law to interpreting $P(A|B)$ when $P(B)=0$ becomes natural and useful in this measure-theoretic approach. A further example is Brownian motion, whose rigorous foundations are rooted solely in measure theory.

Whether or not Kolmogorov's theory works in quantum mechanics is a completely separate issue. Quantum probability is a generalization, and you can find ways of connecting it to Kolmogorov's theory here.


Isn't the reason for their success precisely the fact that the Kolmogorov axioms are

  • small in number
  • simple statements
  • everyone can agree with?

(I repeat here the points of your statement, but doesn't your quote contradict the last of these points?)

It gets a bit problematic when we talk about completeness in this context: The intent of Euclid's axioms was to describe a single abstract object, "the" geometry of "the" plane (or "the" 3D space). We might also ask: Are the three group axioms (associativity, neutral element, inverses) complete? In a sense they are not, for neither the statement $\forall x,y\colon xy=yx$ nor its negation can be proved from them. But that is because these axioms are there to describe many objects (i.e., models of the axiom system). And on the other end of the spectrum there are structures that fail to be groups (such as $\mathbb N$) and therefore do not suggest themselves to be treated with group theory methods.
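The point about the group axioms can be checked by brute force on small examples. The sketch below is an illustration under chosen assumptions: the integers modulo 6 and the permutations of three points are both groups, yet only the first is commutative, so commutativity can be neither proved nor refuted from the axioms alone. Since $\mathbb N$ is infinite and cannot be enumerated, a finite stand-in (capped addition on $\{0,\ldots,5\}$, a hypothetical example) is used to show a structure with a neutral element but no inverses.

```python
from itertools import permutations, product

def is_group(elements, op):
    """Brute-force check of closure, associativity, neutral element and inverses."""
    elements = list(elements)
    # Closure is implicit in "op is an operation on the set", but we check it anyway.
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elements, repeat=3)):
        return False
    neutrals = [e for e in elements
                if all(op(e, a) == a and op(a, e) == a for a in elements)]
    if not neutrals:
        return False
    e = neutrals[0]
    return all(any(op(a, b) == e and op(b, a) == e for b in elements)
               for a in elements)

def is_commutative(elements, op):
    elements = list(elements)
    return all(op(a, b) == op(b, a) for a, b in product(elements, repeat=2))

# Integers modulo 6 under addition.
z6 = range(6)
def add_mod6(a, b):
    return (a + b) % 6

# Permutations of three points under composition: (p o q)(i) = p[q[i]].
s3 = list(permutations(range(3)))
def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

# Finite stand-in for N: addition capped at 5 (associative, neutral element 0, no inverses).
capped = range(6)
def capped_add(a, b):
    return min(a + b, 5)

print(is_group(z6, add_mod6), is_commutative(z6, add_mod6))  # True True
print(is_group(s3, compose), is_commutative(s3, compose))    # True False
print(is_group(capped, capped_add))                          # False
```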

Kolmogorov's axioms fall more in the second category: They are applicable to many different situations. And if $P(A\lor B)=P(A)+P(B)-P(A\land B)$ does not hold in real life, then this cannot be modelled as probability, just like $\mathbb N$ is no group.
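As a small sanity check of that identity, here is a sketch on a four-point space with made-up, non-uniform weights, where set union and intersection play the roles of $\lor$ and $\land$. The function `Q` is a deliberately non-additive assignment, included only to show what failing the identity looks like. All names and numbers are illustrative assumptions.

```python
from fractions import Fraction
from itertools import chain, combinations

# A small non-uniform probability space (the weights are made up for illustration).
weights = {"a": Fraction(1, 2), "b": Fraction(1, 4),
           "c": Fraction(1, 8), "d": Fraction(1, 8)}

def P(event):
    """Additive probability: sum of the weights of the outcomes in the event."""
    return sum((weights[x] for x in event), Fraction(0))

def Q(event):
    """Not a probability: non-negative with Q(everything) = 1, but not additive."""
    return P(event) ** 2

def all_events(outcomes):
    """Every subset of the outcome set, as frozensets."""
    items = list(outcomes)
    return [frozenset(s) for s in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]

def satisfies_identity(f, events):
    """Check f(A or B) = f(A) + f(B) - f(A and B) for all pairs of events."""
    return all(f(A | B) == f(A) + f(B) - f(A & B) for A in events for B in events)

events = all_events(weights)
print(satisfies_identity(P, events))  # True
print(satisfies_identity(Q, events))  # False
```

For `Q`, the events $\{a\}$ and $\{b\}$ already give a counterexample: $Q(\{a,b\})=9/16$, while $Q(\{a\})+Q(\{b\})-Q(\emptyset)=5/16$.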