
Expectation and Variance - Applications

Variance decomposition and conditional expectation provide powerful techniques for analyzing complex probability problems by breaking them into manageable pieces.

Law of Total Variance

Theorem

For random variables $X$ and $Y$: $\text{Var}(X) = E[\text{Var}(X|Y)] + \text{Var}(E[X|Y])$

This decomposes total variance into:

  • Within-group variance: $E[\text{Var}(X|Y)]$ (the average variance within each value of $Y$)
  • Between-group variance: $\text{Var}(E[X|Y])$ (the variance of the group means)
Example

A company has two factories. Factory 1 (probability 0.6) produces items with mean weight 100g and variance 25g². Factory 2 (probability 0.4) produces items with mean 110g and variance 16g².

Overall mean: $E[X] = 0.6(100) + 0.4(110) = 104$

Within-factory variance: $E[\text{Var}(X|Y)] = 0.6(25) + 0.4(16) = 21.4$

Between-factory variance: $\text{Var}(E[X|Y]) = E[(E[X|Y])^2] - (E[E[X|Y]])^2 = 0.6(100^2) + 0.4(110^2) - 104^2 = 10840 - 10816 = 24$

Total variance: $\text{Var}(X) = 21.4 + 24 = 45.4$
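The decomposition can be checked by simulation. The sketch below assumes (purely for illustration) that weights within each factory are normally distributed with the stated means and variances; the decomposition itself does not depend on that choice.

```python
import random

random.seed(0)

# Monte Carlo check of the law of total variance for the two-factory model.
# Factory 1 is chosen with probability 0.6 (mean 100, variance 25);
# Factory 2 with probability 0.4 (mean 110, variance 16).
# Normal within-factory weights are an illustrative assumption only.
n = 200_000
weights = []
for _ in range(n):
    if random.random() < 0.6:
        weights.append(random.gauss(100, 5))   # factory 1: sd = sqrt(25)
    else:
        weights.append(random.gauss(110, 4))   # factory 2: sd = sqrt(16)

mean = sum(weights) / n
var = sum((w - mean) ** 2 for w in weights) / n

within = 0.6 * 25 + 0.4 * 16                                # E[Var(X|Y)] = 21.4
between = 0.6 * (100 - 104) ** 2 + 0.4 * (110 - 104) ** 2   # Var(E[X|Y]) = 24
print(mean, var, within + between)   # empirical var is close to 21.4 + 24 = 45.4
```

The empirical variance of the pooled sample matches the sum of the within-group and between-group pieces, up to Monte Carlo error.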

Wald's Equation

Theorem

Let $X_1, X_2, \ldots$ be IID random variables with mean $\mu$, and let $N$ be a non-negative integer-valued random variable independent of the $X_i$. Then: $E\left[\sum_{i=1}^N X_i\right] = E[N] \cdot \mu$

This applies when the number of terms is itself random.

Example

A gambler plays until winning, with each game costing 5 dollars. The number of games follows Geometric($p$), with $E[N] = 1/p$. Total cost: $E[\text{Total cost}] = E[N] \times 5 = \frac{5}{p}$

For $p = 0.2$, the expected cost is $5/0.2 = 25$ dollars.
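A minimal simulation of this example: play repeated rounds with success probability 0.2, charge 5 dollars per game, and average the total cost over many trials.

```python
import random

random.seed(0)

# Monte Carlo check of Wald's equation for the gambling example:
# N ~ Geometric(p = 0.2), each game costs $5, so E[cost] = 5/p = 25.
p = 0.2
n_trials = 100_000
total_cost = 0.0
for _ in range(n_trials):
    games = 1
    while random.random() >= p:   # keep playing until the first win
        games += 1
    total_cost += 5 * games

avg_cost = total_cost / n_trials
print(avg_cost)   # close to 25
```

Here the number of terms in the sum (games played) is itself random, which is exactly the situation Wald's equation handles.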

Compound Distributions

Definition

A compound distribution arises when $X = \sum_{i=1}^N Y_i$, where $N$ is random and the $Y_i$ are IID.

Compound Mean: If $N$ and the $Y_i$ are independent: $E[X] = E[N] \cdot E[Y]$

Compound Variance: $\text{Var}(X) = E[N] \cdot \text{Var}(Y) + \text{Var}(N) \cdot (E[Y])^2$

Example

Insurance Claims: An insurance company receives $N \sim \text{Poisson}(50)$ claims per month, each with amount $Y \sim \text{Exponential}(1/1000)$ (mean 1000 dollars).

Total monthly claims $S = \sum_{i=1}^N Y_i$: $E[S] = 50 \times 1000 = 50{,}000$, $\text{Var}(S) = 50 \times 1000^2 + 50 \times 1000^2 = 100{,}000{,}000$, so $\sigma_S = 10{,}000$.
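A simulation sketch of the insurance example. Poisson draws are generated by counting exponential interarrival times in a unit interval (a standard construction, used here to stay within the standard library).

```python
import math
import random

random.seed(0)

# Monte Carlo check of the compound-distribution formulas for
# S = sum_{i=1}^N Y_i with N ~ Poisson(50), Y ~ Exponential(mean 1000).
def poisson(lam):
    # Count rate-lam exponential interarrival times that fit in [0, 1].
    t, k = 0.0, 0
    while True:
        t += random.expovariate(lam)
        if t > 1.0:
            return k
        k += 1

n_months = 20_000
totals = []
for _ in range(n_months):
    n_claims = poisson(50)
    totals.append(sum(random.expovariate(1 / 1000) for _ in range(n_claims)))

mean = sum(totals) / n_months
sd = math.sqrt(sum((s - mean) ** 2 for s in totals) / n_months)
print(mean, sd)   # near 50,000 and 10,000
```

Both sample moments agree with the compound mean and compound variance formulas above, up to Monte Carlo error.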

Jensen's Inequality

Theorem

If $g$ is a convex function and $X$ is a random variable: $E[g(X)] \geq g(E[X])$

If $g$ is concave, the inequality reverses.

Example

Since $g(x) = x^2$ is convex: $E[X^2] \geq (E[X])^2$

This shows $\text{Var}(X) = E[X^2] - (E[X])^2 \geq 0$.

Since $g(x) = \log x$ is concave (for $x > 0$): $E[\log X] \leq \log E[X]$

This is the AM-GM inequality in probabilistic form.
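Both inequalities also hold exactly for sample moments (the empirical distribution is itself a random variable). The sketch below checks them on an arbitrary sample drawn from a hypothetical Uniform(1, 10) distribution.

```python
import math
import random

random.seed(0)

# Jensen's inequality on sample moments:
# convex g(x) = x^2 gives E[X^2] >= (E[X])^2;
# concave g(x) = log x gives E[log X] <= log E[X].
# Uniform(1, 10) is an arbitrary illustrative choice of distribution.
xs = [random.uniform(1, 10) for _ in range(10_000)]

mean = sum(xs) / len(xs)
mean_sq = sum(x * x for x in xs) / len(xs)
mean_log = sum(math.log(x) for x in xs) / len(xs)

print(mean_sq >= mean ** 2)        # True: E[X^2] >= (E[X])^2
print(mean_log <= math.log(mean))  # True: E[log X] <= log E[X]
```

The second check is the AM-GM inequality in disguise: exponentiating $E[\log X] \leq \log E[X]$ says the geometric mean of the sample never exceeds its arithmetic mean.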

Remark

These tools—total variance decomposition, Wald's equation, compound distributions, and Jensen's inequality—are essential for modeling complex real-world phenomena. They allow us to compute moments of complicated random structures by breaking them down into simpler independent components.