Why does conditioning over a conditional probability give this equation (from Harvard's STAT 110 problem set)?
Regarding $P(C \cap A \mid B) = P(C \mid B, A) P(A \mid B)$.
- One way to see this is to note that it generalizes $P(C \cap A) = P(C \mid A) P(A)$, with the unconditional probability replaced by the conditional probability given $B$. Adding "conditioned on $B$" to all three terms gives $P(C \cap A \mid B) = P(C \mid A, B) P(A \mid B)$.
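- If that is not convincing enough, you can always fall back on the definitions (assuming $P(A \cap B) > 0$ so that everything is defined). Just substitute $P(C\cap A \mid B) = \frac{P(A \cap B \cap C)}{P(B)}$ and $P(C \mid B, A) = \frac{P(A \cap B \cap C)}{P(A \cap B)}$ and $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$. The factors $P(A \cap B)$ cancel in the product on the right, leaving $\frac{P(A \cap B \cap C)}{P(B)}$, which is exactly the left-hand side.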
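
If you would like a numerical sanity check of the identity, here is a minimal sketch in Python. The sample space (two fair dice) and the events `A`, `B`, `C` are hypothetical choices made only for illustration; they are not from the problem set. The helpers `prob` and `cond` are likewise made-up names that compute probabilities straight from the definitions, using exact fractions.

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered outcomes of two fair six-sided dice (an illustrative choice).
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event (a subset of omega) under the uniform measure."""
    return Fraction(len(event), len(omega))

def cond(event, given):
    """Conditional probability P(event | given), computed from the definition."""
    return prob(event & given) / prob(given)

# Hypothetical events, chosen only so that P(A and B) > 0.
A = {w for w in omega if w[0] % 2 == 0}      # first die is even
B = {w for w in omega if w[0] + w[1] >= 7}   # sum is at least 7
C = {w for w in omega if w[1] > w[0]}        # second die beats the first

lhs = cond(C & A, B)               # P(C and A | B)
rhs = cond(C, A & B) * cond(A, B)  # P(C | A, B) * P(A | B)
print(lhs, rhs, lhs == rhs)
```

Swapping in any other events with $P(A \cap B) > 0$ should give the same agreement, since the check only exercises the definitions used in the second bullet.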
$P(B \mid A^c) = \frac{1}{2}$ because if $A^c$ occurs, then the bag contains one green and one blue marble, so the [conditional] probability of taking out a green marble is $1/2$.