Intuition behind the Definition of Conditional Probability (for 2 Events)

What is some intuitive insight regarding the conditional probability definition: $P(A\mid B) = \large \frac{P(A \cap B)}{P(B)}$ ? I am looking for an intuitive motivation. My textbook merely gives a definition, but no true development of that definition. Hopefully that's not too much to ask.


Consider probabilities as proportions. To say that something has probability one-sixth is to say it occurs one-sixth of the time (this is only one interpretation: it suits our purposes and our intuition, so let's not worry too much about what it means philosophically). Often we calculate probabilities simply by dividing the number of possibilities in which our event of interest occurs by the total number of possibilities. For example, to calculate the probability of throwing an even number on a six-sided die, we calculate $3/6$. (This works because each of the possibilities we are counting is equally likely, by assumption.)

Now let's say we want to work out how often $A$ occurs, given that we know $B$ has occurred. Well, we need to find the occurrences of $A$ in this scenario, and divide by the total number of possibilities. When we know $B$ occurred, the occurrences of $A$ are exactly those situations in which both $A$ and $B$ occur, and since we're assuming $B$ occurred, the total number of possibilities is reduced to only those in which $B$ happened.

Hence \[\mathbb P(A\mid B) = \frac{\text{# occurrences of } A \text{ and } B}{\text{# occurrences of } B} = \frac{\mathbb P(A \cap B)}{\mathbb P(B)}\]

because the "total number of possibilities" in the expressions for $\mathbb P(B)$ and $\mathbb P(A \cap B)$ cancel.
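To spell out that cancellation, write $N$ for the total number of equally likely possibilities, and $|B|$ and $|A \cap B|$ for the number of possibilities in which $B$, respectively both $A$ and $B$, occur (these symbols are introduced here for illustration only). Then, assuming $|B| > 0$:

\[\frac{\mathbb P(A \cap B)}{\mathbb P(B)} = \frac{|A \cap B| / N}{|B| / N} = \frac{|A \cap B|}{|B|} = \mathbb P(A\mid B)\]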

Essentially, what we are doing is focussing on a particular subsection of the possible outcomes, and considering what proportion of that subsection satisfies whatever property we're interested in (think of Venn diagrams). So, for example, if you roll a six-sided die and are told the result was even, the result is less likely to be less than $4$ than it was before you knew that: half the numbers in $\{1,2,3,4,5,6\}$ are less than $4$, but only a third of the numbers in our subsection $\{2,4,6\}$ are.
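The die example can be checked by brute-force counting. The following is only a minimal sketch, assuming a fair six-sided die; the helper `prob` and the variable names are made up for illustration.

```python
from fractions import Fraction

# All equally likely outcomes of one roll of a fair six-sided die (assumption).
outcomes = {1, 2, 3, 4, 5, 6}

def prob(event, space=outcomes):
    """Proportion of the outcomes in `space` that belong to `event`."""
    return Fraction(len(event & space), len(space))

A = {n for n in outcomes if n < 4}       # result is less than 4
B = {n for n in outcomes if n % 2 == 0}  # result is even

p_A = prob(A)                    # 3 of 6 outcomes
p_B = prob(B)                    # 3 of 6 outcomes
p_A_and_B = prob(A & B)          # only the outcome 2 lies in both A and B

# The definition: P(A | B) = P(A and B) / P(B)
p_A_given_B = p_A_and_B / p_B

# The same number obtained by counting inside the subsection B directly
direct = Fraction(len(A & B), len(B))

print(p_A, p_A_given_B, direct)
```

Both routes give one third, against one half before conditioning, which is the "restrict to the subsection, then take a proportion" picture in miniature.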


Think about this: if $B$ is very unlikely, but $A$ becomes likely when $B$ happens, then $P(A \text{ and } B)$ is small while $P(A|B)$ is large.

I am extremely unlikely to win the lottery jackpot this weekend ($B$), but if I do, I am likely to become a millionaire ($A$). So the probability that I win the lottery jackpot and then become a millionaire, $P(A \text{ and } B)$, is small, but the probability that I become a millionaire if I win the lottery jackpot, $P(A|B)$, is high.
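To put rough numbers on this, here is a small sketch. The two input probabilities are purely hypothetical values chosen for illustration, not real lottery figures; the point is only that dividing a tiny joint probability by a tiny $P(B)$ can give a large conditional probability.

```python
from fractions import Fraction

# Hypothetical values, assumed only for illustration:
p_B = Fraction(1, 10_000_000)  # P(B): I win the jackpot this weekend
p_A_given_B = Fraction(9, 10)  # P(A | B): I become a millionaire if I win

# Rearranging the definition: P(A and B) = P(A | B) * P(B)
p_A_and_B = p_A_given_B * p_B

# Dividing the joint probability by P(B) recovers the conditional probability.
recovered = p_A_and_B / p_B

print(float(p_A_and_B), float(recovered))
```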