Bayes' Theorem with multiple random variables

I'm reviewing some notes regarding probability, and the section regarding Conditional Probability gives the following example:

$P(X,Y|Z)=\frac{P(Z|X,Y)P(X,Y)}{P(Z)}=\frac{P(Y,Z|X)P(X)}{P(Z)}$

The middle expression is clearly just the application of Bayes' Theorem, but I can't see how the third expression is equal to the second. Can someone please clarify how the two are equal?


Solution 1:

We know $$P(X,Y)=P(X)P(Y|X)$$ and $$P(Y,Z|X)=P(Y|X)P(Z|X,Y)$$ (to see this, note that if you ignore the fact that everything is conditioned on $X$, it is just the first identity again).

Therefore \begin{align*} P(Z|X,Y)P(X,Y)&=P(Z|X,Y)P(X)P(Y|X)\\ &=P(Y,Z|X)P(X) \end{align*} which derives the third expression from the second.
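Both identities (and the resulting equality) can be checked numerically on an arbitrary joint distribution. Here is a minimal sketch of my own (the variable names and the random setup are not part of the answer above), using a random joint distribution over three binary variables:

```python
import itertools
import random

random.seed(0)

# A random joint distribution P(X, Y, Z) over binary variables, normalized to 1.
outcomes = list(itertools.product([0, 1], repeat=3))
weights = [random.random() for _ in outcomes]
total = sum(weights)
P = {o: w / total for o, w in zip(outcomes, weights)}

def marg(**fixed):
    """Marginal probability of the given assignments to x, y, z."""
    idx = {"x": 0, "y": 1, "z": 2}
    return sum(p for o, p in P.items()
               if all(o[idx[v]] == val for v, val in fixed.items()))

for x, y, z in outcomes:
    # P(Y,Z|X) = P(Y|X) P(Z|X,Y): the chain rule inside the conditioning on X.
    chain_lhs = marg(x=x, y=y, z=z) / marg(x=x)
    chain_rhs = (marg(x=x, y=y) / marg(x=x)) * (marg(x=x, y=y, z=z) / marg(x=x, y=y))
    assert abs(chain_lhs - chain_rhs) < 1e-12

    # The resulting equality: P(Z|X,Y) P(X,Y) = P(Y,Z|X) P(X).
    left = (marg(x=x, y=y, z=z) / marg(x=x, y=y)) * marg(x=x, y=y)
    right = (marg(x=x, y=y, z=z) / marg(x=x)) * marg(x=x)
    assert abs(left - right) < 1e-12
```

Both asserts pass for every outcome, since each side is just a different factorization of the same joint probability $P(X,Y,Z)$.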

(However I don't have any good intuition for what the third expression means. Does anyone else?)

Solution 2:

We have

$$P(X,Y\mid Z) \tag1$$

Treating $X$ and $Y$ as a single event, which we call $A$, we have

$$P(A\mid Z) = P(X,Y\mid Z) \tag2$$

Using the Joint Probabilities Rule, we have

$$P(A,Z) = P(A\mid Z)\times P(Z) \tag3$$

So we can say that

$$P(A\mid Z) = \frac{P(A,Z)}{P(Z)} \tag4$$

We know that

$$P(A,Z) = P(Z,A) \tag5$$

Again using the Joint Probabilities Rule, we have

$$P(Z,A) = P(Z \mid A)\times P(A) \tag6$$

We defined $A$ as the joint event $(X,Y)$, so

$$P(A) = P(X,Y) \tag7$$

Again using the Joint Probabilities Rule, we have

$$P(X,Y) = P(X\mid Y)\times P(Y) \tag8$$

Plugging $(8)$ into $(7)$, we have

$$P(A) = P(X\mid Y)\times P(Y) \tag9$$

Plugging $(9)$ into $(6)$, we have

$$P(Z,A) = P(Z\mid A)\times P(X\mid Y)\times P(Y) \tag{10}$$

Plugging $(10)$ into $(5)$ we have

$$P(A,Z) = P(Z\mid A)\times P(X\mid Y)\times P(Y) \tag{11}$$

Plugging $(11)$ into $(4)$, we have

$$P(A\mid Z) = \frac{P(Z\mid A)\times P(X\mid Y)\times P(Y)}{P(Z)} \tag{12}$$

Plugging $(12)$ into $(2)$, we have

$$P(X,Y\mid Z) = \frac{P(Z\mid A)\times P(X\mid Y)\times P(Y)}{P(Z)} \tag{13}$$

Observe that in $(13)$, using the Joint Probabilities Rule, we have

$$P(X,Y) = P(X\mid Y)\times P(Y) \tag{14}$$

Since we defined $P(A)$ as $P(X,Y)$, we have

$$P(A) = P(X\mid Y)\times P(Y) \tag{15}$$

Plugging $(15)$ into $(13)$, we have

$$P(X,Y\mid Z) = \frac{P(Z\mid A)\times P(A)}{P(Z)} \tag{16}$$

Observe that in $(16)$, using the Joint Probabilities Rule, we have

$$P(Z\mid A) = \frac{P(Z,A)}{P(A)} \tag{17}$$

Plugging $(17)$ into $(16)$, we have

$$P(X,Y\mid Z) = \frac{P(Z,A)}{P(Z)} \tag{18}$$

Now observe the following

$$P(Z,A) = P(Z,X,Y) = P(Y,Z,X) \tag{19}$$

Similar to what we did at the beginning, treating $Y$ and $Z$ as a single event and using the Joint Probabilities Rule, we have

$$P(Y,Z,X) = P(Y,Z\mid X)\times P(X) \tag{20}$$

Plugging $(20)$ into $(19)$, we have

$$P(Z,A) = P(Y,Z\mid X)\times P(X) \tag{21}$$

Plugging $(21)$ into $(18)$, we have

$$P(X,Y\mid Z) = \frac{P(Y,Z\mid X)\times P(X)}{P(Z)} \tag{22}$$
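As a sanity check, the endpoints of this chain (the identity from the question) can be verified numerically on a concrete joint distribution. A small sketch of my own, using an arbitrary hand-picked table over binary variables (the numbers are illustrative, not from the question):

```python
# An arbitrary joint probability table P(X, Y, Z) over binary X, Y, Z.
joint = {
    (0, 0, 0): 0.10, (0, 0, 1): 0.05,
    (0, 1, 0): 0.15, (0, 1, 1): 0.10,
    (1, 0, 0): 0.20, (1, 0, 1): 0.05,
    (1, 1, 0): 0.15, (1, 1, 1): 0.20,
}

def p(**fixed):
    """Marginal of the joint table for the fixed coordinates x, y, z."""
    idx = {"x": 0, "y": 1, "z": 2}
    return sum(q for o, q in joint.items()
               if all(o[idx[k]] == v for k, v in fixed.items()))

for x in (0, 1):
    for y in (0, 1):
        for z in (0, 1):
            lhs = p(x=x, y=y, z=z) / p(z=z)                               # P(X,Y|Z)
            mid = (p(x=x, y=y, z=z) / p(x=x, y=y)) * p(x=x, y=y) / p(z=z) # Bayes form
            rhs = (p(x=x, y=y, z=z) / p(x=x)) * p(x=x) / p(z=z)           # third form
            assert abs(lhs - mid) < 1e-12 and abs(lhs - rhs) < 1e-12
```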

I don't know whether this clarifies things or complicates them further, but I wanted to include it here nevertheless.

Right now, I can't prove why treating multiple joint events as if they were a single event is "legal".
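One way to see that it is legal: "treating $(X,Y)$ as a single event $A$" is just a bijective relabeling of the outcome space, which cannot change any probability. A small sketch of my own illustrating this (the table is an arbitrary example):

```python
# An arbitrary joint distribution over (X, Y, Z), all binary, summing to 1.
P_XYZ = {
    (0, 0, 0): 0.05, (0, 0, 1): 0.10,
    (0, 1, 0): 0.10, (0, 1, 1): 0.15,
    (1, 0, 0): 0.15, (1, 0, 1): 0.10,
    (1, 1, 0): 0.20, (1, 1, 1): 0.15,
}

# Relabel: A takes one value per (x, y) pair.  Since this pairing is a bijection
# on outcomes, P(A=a, Z=z) is literally the same number as P(X=x, Y=y, Z=z).
P_AZ = {((x, y), z): p for (x, y, z), p in P_XYZ.items()}

for (x, y, z), p in P_XYZ.items():
    assert P_AZ[((x, y), z)] == p

# Hence any statement about A, such as P(A | Z), is the same statement
# about the pair (X, Y), which is what Solution 2 relies on throughout.
```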