Please explain the algebra for the conditional probability $P(C \mid B) = \dots = 1 \cdot P(A \mid B) + 0 \cdot P(A^c \mid B)$

Blitzstein, Introduction to Probability (2nd edn, 2019), Chapter 2, Exercise 22, p. 87.

  1. A bag contains one marble which is either green or blue, with equal probabilities. A green marble is put in the bag (so there are 2 marbles now), and then a random marble is taken out. The marble taken out is green. What is the probability that the remaining marble is also green?

Solution:

Let A be the event that the initial marble is green, B be the event that the removed marble is green, and C be the event that the remaining marble is green. We need to find $P(C \mid B)$. There are several ways to find this; one natural way is to condition on whether the initial marble is green:

$$ P(C \mid B) = P(C \mid B \cap A)\,P(A \mid B) + P(C \mid B \cap A^c)\,P(A^c \mid B) = 1 \cdot P(A \mid B) + 0 \cdot P(A^c \mid B) $$

I am having trouble seeing how the author reached the RHS of this conditional probability expression, in particular where the coefficients $1$ and $0$ come from. Could someone please explain the algebra?


$P(C \mid (A \cap B))$ is the conditional probability that the remaining marble is green, given that the original marble was green and a green marble was drawn. It should, I hope, be obvious that $P(C \mid (A \cap B)) = 1$.

$P(C \mid (A^c \cap B))$ is the conditional probability that the remaining marble is green, given that the original marble was blue and a green marble was drawn. It should, I hope, be obvious that $P(C \mid (A^c \cap B)) = 0$.

So the rightmost expression is obtained by substituting these values into the middle expression. It might have been better to write $\times$ or $\cdot$ between the numbers and $P$ and thus avoid the possibility of confusion.
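If it helps to see the numbers, the answer can also be sanity-checked by simulation. Below is a quick Monte Carlo sketch in Python (not from the book; the variable names are my own). Conditioning on $B$ means keeping only the trials where the drawn marble is green, and the resulting estimate of $P(C \mid B)$ should be close to $2/3$:

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

def trial():
    """One run of the experiment: returns (drawn marble, remaining marble)."""
    initial = random.choice(["green", "blue"])  # original marble, equal probabilities
    bag = [initial, "green"]                    # a green marble is added
    drawn = bag.pop(random.randrange(2))        # a random marble is taken out
    return drawn, bag[0]

n = 200_000
b = cb = 0
for _ in range(n):
    drawn, remaining = trial()
    if drawn == "green":          # condition on event B
        b += 1
        if remaining == "green":  # event C occurred as well
            cb += 1

print(cb / b)  # estimate of P(C | B); should be close to 2/3
```

Note that the naive guess of $1/2$ does not survive this check: conditioning on the drawn marble being green makes it more likely the initial marble was green.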


This problem is used in Professor Blitzstein's course Statistics 110, which is available from iTunes U. It is stated in Strategic Practice and Homework 2, problem 3.1. The first equality wasn't immediately clear to me, so here are the details:

By the definition of conditional probability we have that $$ P(X \mid Y) = \frac{P(X \cap Y)}{P(Y)} $$

so to restate this in terms of the problem we have that $$ P(C \mid B) = \frac{P(C \cap B)}{P(B)} $$

Now we want to find $P(C \cap B)$. By the additivity axiom of probability, the probability of a union of disjoint events is the sum of the probabilities of the individual events. So for any event $Z$, we can decompose it into two disjoint events like this: $$ P(Z) = P((Z \cap A) \cup (Z \cap A^c)) = P(Z \cap A) + P(Z \cap A^c) $$

Notice that one can also use the law of total probability to get the above result, since $A$ and $A^c$ partition the sample space $S$ for any event $A \subseteq S$.

Let $Z = C \cap B$. This leads to $$ P(C \cap B) = P((C \cap B \cap A) \cup (C \cap B \cap A^c)) = P(C \cap B \cap A) + P(C \cap B \cap A^c) $$

By using the definition of conditional probability once on each term in the above we get $$ P(C \cap B) = P(C \mid (B \cap A)) P(A \cap B) + P(C \mid (B \cap A^c)) P(A^c \cap B) $$

Using the definition of conditional probability again on the second factor of each term we get $$ P(C \cap B) = P(C \mid (B \cap A)) P(A \mid B) P(B) + P(C \mid (B \cap A^c)) P(A^c \mid B) P(B) $$

resulting in \begin{align} P(C \mid B) &= \frac{P(C \cap B)}{P(B)} \\ &= \frac{P(C \mid (B \cap A)) P(A \mid B) P(B) + P(C \mid (B \cap A^c)) P(A^c \mid B) P(B)}{P(B)} \\ &= P(C \mid (B \cap A)) P(A \mid B) + P(C \mid (B \cap A^c)) P(A^c \mid B) \end{align}
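Since the sample space here is tiny, every quantity in this derivation can also be verified exactly by enumerating the four equally likely outcomes (initial marble green or blue, and which of the two marbles is drawn). A sketch in Python using exact fractions (my own check, not from the course materials):

```python
from fractions import Fraction

# Each outcome: (initial marble, drawn marble, remaining marble, probability)
# The initial marble is green or blue (prob 1/2 each), then either the
# initial or the added green marble is drawn (prob 1/2 each), so 1/4 per outcome.
outcomes = []
for initial in ("green", "blue"):
    for drawn_slot in ("initial", "added"):
        drawn = initial if drawn_slot == "initial" else "green"
        remaining = "green" if drawn_slot == "initial" else initial
        outcomes.append((initial, drawn, remaining, Fraction(1, 4)))

p_b  = sum(p for i, d, r, p in outcomes if d == "green")                   # P(B) = 3/4
p_cb = sum(p for i, d, r, p in outcomes if d == "green" and r == "green")  # P(C ∩ B) = 1/2
p_ab = sum(p for i, d, r, p in outcomes if d == "green" and i == "green")  # P(A ∩ B) = 1/2

print(p_cb / p_b)  # P(C | B) = 2/3
print(p_ab / p_b)  # P(A | B) = 2/3, so P(C | B) = 1·P(A | B) + 0·P(A^c | B) checks out
```

This makes the $0$ and $1$ coefficients concrete: the only conditioned-on-$B$ outcomes with a green remaining marble are exactly those where the initial marble was green.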