Probability Spaces and Events - Key Proof

We present a detailed proof of the inclusion-exclusion principle, one of the most important counting and probability formulas. This result generalizes the simple addition rule for disjoint events.

The Inclusion-Exclusion Principle

Theorem

For events $A_1, A_2, \ldots, A_n$ in a probability space: $P\left(\bigcup_{i=1}^n A_i\right) = \sum_{k=1}^n (-1)^{k+1} \sum_{1 \leq i_1 < \cdots < i_k \leq n} P(A_{i_1} \cap \cdots \cap A_{i_k})$

Equivalently: $P(A_1 \cup \cdots \cup A_n) = \sum_i P(A_i) - \sum_{i<j} P(A_i \cap A_j) + \sum_{i<j<k} P(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n+1} P(A_1 \cap \cdots \cap A_n)$

Proof

We prove by induction on $n$ .

Base Case ( $n=2$ ): We need to show: $P(A_1 \cup A_2) = P(A_1) + P(A_2) - P(A_1 \cap A_2)$

Decompose $A_1 \cup A_2$ into disjoint pieces: $A_1 \cup A_2 = A_1 \cup (A_2 \setminus A_1)$

By additivity: $P(A_1 \cup A_2) = P(A_1) + P(A_2 \setminus A_1)$

Since $A_2 = (A_2 \cap A_1) \cup (A_2 \setminus A_1)$ is a disjoint union: $P(A_2) = P(A_2 \cap A_1) + P(A_2 \setminus A_1)$

Therefore $P(A_2 \setminus A_1) = P(A_2) - P(A_1 \cap A_2)$ , and: $P(A_1 \cup A_2) = P(A_1) + P(A_2) - P(A_1 \cap A_2)$

Inductive Step: Assume the formula holds for $n-1$ events. For $n$ events, write: $\bigcup_{i=1}^n A_i = \left(\bigcup_{i=1}^{n-1} A_i\right) \cup A_n$

By the base case: $P\left(\bigcup_{i=1}^n A_i\right) = P\left(\bigcup_{i=1}^{n-1} A_i\right) + P(A_n) - P\left[\left(\bigcup_{i=1}^{n-1} A_i\right) \cap A_n\right]$

By the inductive hypothesis applied to $A_1, \ldots, A_{n-1}$ : $P\left(\bigcup_{i=1}^{n-1} A_i\right) = \sum_{k=1}^{n-1} (-1)^{k+1} \sum_{1 \leq i_1 < \cdots < i_k \leq n-1} P(A_{i_1} \cap \cdots \cap A_{i_k})$

For the intersection term, note: $\left(\bigcup_{i=1}^{n-1} A_i\right) \cap A_n = \bigcup_{i=1}^{n-1} (A_i \cap A_n)$

Applying the inductive hypothesis to events $A_1 \cap A_n, \ldots, A_{n-1} \cap A_n$ : $P\left(\bigcup_{i=1}^{n-1} (A_i \cap A_n)\right) = \sum_{k=1}^{n-1} (-1)^{k+1} \sum_{1 \leq i_1 < \cdots < i_k \leq n-1} P(A_{i_1} \cap \cdots \cap A_{i_k} \cap A_n)$

Combining these expressions and collecting terms by the size of intersections yields the inclusion-exclusion formula for $n$ events. □

■

Application: Derangements

Example

A derangement is a permutation where no element appears in its original position. The probability that a random permutation of $n$ elements is a derangement can be computed using inclusion-exclusion.

Let $A_i$ = "element $i$ is in position $i$ ". The probability of at least one fixed point is: $P\left(\bigcup_{i=1}^n A_i\right)$

We have:

$P(A_i) = (n-1)!/n! = 1/n$
$P(A_i \cap A_j) = (n-2)!/n! = 1/(n(n-1))$
$P(A_{i_1} \cap \cdots \cap A_{i_k}) = (n-k)!/n! = 1/(n(n-1)\cdots(n-k+1))$

There are $\binom{n}{k}$ terms in the $k$ -th sum, so: $P\left(\bigcup_{i=1}^n A_i\right) = \sum_{k=1}^n (-1)^{k+1} \binom{n}{k} \frac{(n-k)!}{n!} = \sum_{k=1}^n \frac{(-1)^{k+1}}{k!}$

The probability of a derangement is: $P(\text{derangement}) = 1 - \sum_{k=1}^n \frac{(-1)^{k+1}}{k!} = \sum_{k=0}^n \frac{(-1)^k}{k!} \approx e^{-1} \approx 0.368$

Remark

The inclusion-exclusion principle has both combinatorial and probabilistic interpretations. It corrects for overcounting when computing the size or probability of a union by alternately adding and subtracting intersection terms.