Intuition for Inclusion-Exclusion Principle
Many of us are familiar with the inclusion-exclusion principle. I think the principle makes total sense when applied to the two or three sets and we have the following:
$|A\cup B|=|A|+|B|-|A\cap B|$
$|A\cup B\cup C|=|A|+|B|+|C|-|A\cap B|-|A\cap C|-|B\cap C|+|A+B+C|\text{.}$
However, it is not as easy to understand how this works in the general case. In lieu of a rigorous proof, it is easy to see that the IEP rests on the following principle: suppose that $x$ is a member of $n$ sets. Then $x$ gets counted $n$ times on the first count, subtracted $n$ choose $2$ times on the second count, added back in $n$ choose $3$ times on the third count, etc. In other words:
$${n \choose 1}-{n\choose 2}+{n\choose 3}-{n\choose 4}+\cdots+(-1)^{n+1}{n \choose n}=1$$
Or, perhaps more appropriately written as
$${n\choose 0}-{n \choose 1}+{n\choose 2}+\cdots+(-1)^{n}{n \choose n}=0$$
I should clarify that the proof of this is easy to do algebraically, but I am looking for a useful intuitive explanation of the above property, and I'm curious how people view the IEP from a combinatorial perspective.
Solution 1:
One essential aspect of the Inclusion-Exclusion Principle (IEP) is the transformation of at least information to exact information.
If counting objects having at least a number of properties is simple, but counting objects having exactly a number of properties is difficult, than the IEP comes into play.
The objects are represented by the elements contained in $A_1,\dots,A_n$ and the properties of an element $x$ are the sets $A_j,1\leq j\leq n$, which contain $x$.
If we have this essence of IEP in mind and we look at the expression:
\begin{align*} \left|\bigcup_{j=1}^{n}A_j\right|=\sum_{j=1}^{n}\left|A_j\right| -\sum_{1\leq i \leq j \leq n}\left|A_i\cap A_j\right|\pm\cdots+(-1)^{n-1}\left|\bigcap_{j=1}^{n}A_j\right| \end{align*}
we observe, that the right hand side (RHS) consists of summands with at least information.
Note, that the summand
$$\left|A_i\cap A_j\right|\quad \text{in}\quad\sum_{1\leq i \leq j \leq n}\left|A_i\cap A_j\right|$$
is not only the number of elements which are in $A_i$ and $A_j$ it is more precisely the number of elements which are at least in $A_i$ and $A_j$, since elements $x$ in $A_i\cap A_j$ may also be contained in other sets of $A_1,\dots,A_n$.
Whereas the LHS $$\left|\bigcup_{j=1}^{n}A_j\right|$$ presents the number of elements which are exactly in $\bigcup_{j=1}^{n}A_j$.
We observe the IEP transforms counting information with at least properties into counting information with exact properties.
Note: This intuitive connection between at least and exact information has a formal represention. Using generating functions you will get a kind of eye-birds view at hand, which transforms the at least to an exact information as simple shift by one of the argument. For more information about this approach you may have a look at section 4.2 of H.S. Wilfs Generatingfunctionology.
Solution 2:
The way I usually think of the Inclusion-Exclusion Principle goes something like this:
If something is in $n$ of the $S_j$, it will be counted $\binom{n}{k}$ times in the sum of the sizes of intersections of $k$ of the $S_j$. Therefore, it will be counted $$ \sum_{k\ge1}(-1)^{k-1}\binom{n}{k}=1\tag{1} $$ time in the expression $$ \begin{align} &\overbrace{\sum_i\left|S_i\right|}^{\binom{n}{1}}-\overbrace{\sum_{i\lt j}\left|S_i\cap S_j\right|}^{\binom{n}{2}}+\overbrace{\sum_{i\lt j\lt k}\left|S_i\cap S_j\cap S_k\right|}^{\binom{n}{3}}-\dots\tag{2} \end{align} $$ where the expression above each sum is the number of times an object in $n$ of the $S_j$ will be counted in that sum.
Thus, because of $(1)$, any object, no matter how many of the $S_j$ it is in (no matter what $n$ is), will be counted only once in $(2)$.