Expectation and Variance - Key Proof

We present a complete proof of the Law of Total Expectation, a fundamental result that connects conditional and unconditional expectations.

Law of Total Expectation

Theorem

Let $X$ and $Y$ be random variables on the same probability space, with $E[|X|] < \infty$. Then: $E[X] = E[E[X|Y]]$

More generally, if $\{B_1, B_2, \ldots, B_n\}$ is a partition of the sample space: $E[X] = \sum_{i=1}^n E[X|B_i] P(B_i)$

Proof

We first prove the partition formula for discrete $X$, then derive the general statement $E[X] = E[E[X|Y]]$ for discrete and for continuous $Y$.

Discrete Case (Partition Formula): Let $\{B_1, \ldots, B_n\}$ be a partition with $P(B_i) > 0$ for all $i$.

By definition of conditional expectation: $E[X|B_i] = \sum_x x \cdot P(X = x | B_i)$

Therefore: $\sum_{i=1}^n E[X|B_i] P(B_i) = \sum_{i=1}^n \left[\sum_x x \cdot P(X = x | B_i)\right] P(B_i)$

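Rearranging the sums (permitted because the sum over $i$ is finite and the sum over $x$ converges absolutely when $E[|X|] < \infty$): $= \sum_x x \sum_{i=1}^n P(X = x | B_i) P(B_i)$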

By the law of total probability: $\sum_{i=1}^n P(X = x | B_i) P(B_i) = P(X = x)$

Substituting: $= \sum_x x \cdot P(X = x) = E[X]$ □
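The partition formula can be checked exactly on a small example. The sketch below is illustrative only: the fair six-sided die, the choice $X(\omega) = \omega^2$, the particular partition, and all function names are assumptions made for the check, not part of the proof.

```python
from fractions import Fraction

# Hypothetical setup: a fair die as the sample space, with exact probabilities.
prob = {w: Fraction(1, 6) for w in range(1, 7)}

# Hypothetical partition of the sample space into three blocks B_1, B_2, B_3.
partition = [{1, 2}, {3, 4, 5}, {6}]


def X(w):
    """Illustrative random variable X(w) = w^2."""
    return w * w


def expectation(prob):
    """E[X] computed directly from the definition."""
    return sum(X(w) * p for w, p in prob.items())


def block_probability(prob, block):
    """P(B) for a block B of the partition."""
    return sum(prob[w] for w in block)


def conditional_expectation(prob, block):
    """E[X | B] = sum over w in B of X(w) P(w) / P(B)."""
    return sum(X(w) * prob[w] for w in block) / block_probability(prob, block)


def total_expectation(prob, partition):
    """Right-hand side of the partition formula: sum_i E[X | B_i] P(B_i)."""
    return sum(
        conditional_expectation(prob, block) * block_probability(prob, block)
        for block in partition
    )


direct = expectation(prob)
via_partition = total_expectation(prob, partition)
print(direct, via_partition)
```

Exact rational arithmetic is used so the two sides can be compared with equality rather than a floating-point tolerance.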

General Case ($E[E[X|Y]]$): When conditioning on a random variable $Y$ rather than a partition, we use the fact that $E[X|Y]$ is itself a random variable (a function of $Y$).

For discrete $Y$ taking values $y_1, y_2, \ldots$: $E[E[X|Y]] = \sum_j E[X|Y = y_j] \cdot P(Y = y_j)$

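This is precisely the partition formula with $B_j = \{Y = y_j\}$, taken over the values with $P(Y = y_j) > 0$. If $Y$ takes countably many values, the same argument applies with a countable partition.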

For jointly continuous $(X, Y)$, where $Y$ has density $f_Y$: $E[E[X|Y]] = \int_{-\infty}^{\infty} E[X|Y = y] \cdot f_Y(y) \, dy$

By the definition of conditional expectation: $E[X|Y = y] = \int_{-\infty}^{\infty} x \cdot f_{X|Y}(x|y) \, dx$

Therefore: $E[E[X|Y]] = \int_{-\infty}^{\infty} \left[\int_{-\infty}^{\infty} x \cdot f_{X|Y}(x|y) \, dx\right] f_Y(y) \, dy$

Interchanging the order of integration (by Fubini's theorem, which applies when $E[|X|] < \infty$): $= \int_{-\infty}^{\infty} x \left[\int_{-\infty}^{\infty} f_{X|Y}(x|y) f_Y(y) \, dy\right] dx$

Since $f_{X|Y}(x|y) f_Y(y) = f_{X,Y}(x,y)$ (joint density): $= \int_{-\infty}^{\infty} x \left[\int_{-\infty}^{\infty} f_{X,Y}(x,y) \, dy\right] dx$

The inner integral gives the marginal density $f_X(x)$: $= \int_{-\infty}^{\infty} x \cdot f_X(x) \, dx = E[X]$ □

■

Applications and Consequences

Example

Computing Expectation via Conditioning: Roll a fair die. If the result is even, toss that many fair coins; if odd, toss one coin. Let $X$ = number of heads.

Let $D$ be the die outcome. Partition by whether $D$ is even or odd: $E[X] = E[X|D \text{ even}]P(D \text{ even}) + E[X|D \text{ odd}]P(D \text{ odd})$

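When $D$ is even ($D \in \{2,4,6\}$), each even value has conditional probability $\frac{1}{3}$, and tossing $d$ fair coins gives $E[X|D=d] = \frac{d}{2}$: $E[X|D \text{ even}] = \frac{1}{3}(E[X|D=2] + E[X|D=4] + E[X|D=6]) = \frac{1}{3}(1 + 2 + 3) = 2$

When $D$ is odd, a single coin is tossed: $E[X|D \text{ odd}] = 0.5$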

Therefore: $E[X] = 2 \times \frac{1}{2} + 0.5 \times \frac{1}{2} = 1.25$
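The value $1.25$ can be confirmed by conditioning on each die outcome separately instead of on parity. The following is a minimal sketch; the function names are hypothetical, and $E[X|D=d]$ is summed directly from the binomial distribution of the number of heads rather than assumed to be half the number of coins.

```python
from fractions import Fraction
from math import comb


def coins_tossed(d):
    """Number of fair coins tossed when the die shows d (rule from the example)."""
    return d if d % 2 == 0 else 1


def expected_heads_given(d):
    """E[X | D = d], summed from the Binomial(n, 1/2) pmf with n coins."""
    n = coins_tossed(d)
    return sum(Fraction(k * comb(n, k), 2 ** n) for k in range(n + 1))


def expected_heads():
    """E[X] = sum over d of E[X | D = d] P(D = d) for a fair die."""
    return sum(Fraction(1, 6) * expected_heads_given(d) for d in range(1, 7))


print(expected_heads())  # 5/4
```

Conditioning on the six outcomes uses a finer partition than even/odd, and both partitions give the same answer.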

Example

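Recursive Calculation: The expectation of a geometric random variable $N \sim \text{Geometric}(p)$, the number of independent trials up to and including the first success, can be computed recursively by conditioning on the first trial.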

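On the first trial, either we succeed (probability $p$, giving $N = 1$), or we fail (probability $1-p$), in which case one trial has been used and, because the trials are independent, the number of remaining trials has the same distribution as $N$: $E[N] = p \cdot 1 + (1-p) \cdot (1 + E[N])$

Solving: $E[N] = p + (1-p) + (1-p)E[N] = 1 + (1-p)E[N]$, so $p \cdot E[N] = 1$ and $E[N] = \frac{1}{p}$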
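As a sanity check on the recursion, the sketch below verifies two things for the illustrative value $p = \frac{1}{4}$ (an assumption chosen for the example): that $\frac{1}{p}$ is a fixed point of the recursive equation, and that a truncated version of the defining series $\sum_n n \, p (1-p)^{n-1}$ is numerically close to $\frac{1}{p}$. The function names and the truncation length are hypothetical.

```python
from fractions import Fraction


def recursion_rhs(p, e):
    """Right-hand side of E[N] = p * 1 + (1 - p) * (1 + E[N]) with E[N] = e."""
    return p * 1 + (1 - p) * (1 + e)


def truncated_mean(p, terms=1000):
    """Partial sum of sum_{n>=1} n * p * (1 - p)^(n - 1), truncated after `terms` terms."""
    return sum(n * p * (1 - p) ** (n - 1) for n in range(1, terms + 1))


p = Fraction(1, 4)
candidate = 1 / p  # the claimed value E[N] = 1/p

# Exact check: plugging 1/p into the recursion returns 1/p.
fixed_point_ok = recursion_rhs(p, candidate) == candidate

# Numerical check in floating point: the truncated series is close to 1/p.
series_value = truncated_mean(0.25)

print(fixed_point_ok, series_value)
```

The fixed-point check is exact because it uses rational arithmetic; the series check is only approximate, since the sum is truncated and evaluated in floating point.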

Remark

The law of total expectation is an indispensable tool for computing expectations in complex situations. By conditioning on an appropriate random variable or partition, we can break difficult problems into simpler conditional pieces, then reassemble the result.