
Expectation and Variance - Applications

Variance decomposition and conditional expectation provide powerful techniques for analyzing complex probability problems by breaking them into manageable pieces.

Law of Total Variance

Theorem

For random variables $X$ and $Y$: $\text{Var}(X) = E[\text{Var}(X|Y)] + \text{Var}(E[X|Y])$

This decomposes total variance into:

  • Within-group variance: $E[\text{Var}(X|Y)]$ (the average variance within each value of $Y$)
  • Between-group variance: $\text{Var}(E[X|Y])$ (the variance of the group means)
Example

A company has two factories. Factory 1 (probability 0.6) produces items with mean weight 100g and variance 25g². Factory 2 (probability 0.4) produces items with mean 110g and variance 16g².

Overall mean: $E[X] = 0.6(100) + 0.4(110) = 104$

Within-factory variance: $E[\text{Var}(X|Y)] = 0.6(25) + 0.4(16) = 21.4$

Between-factory variance: $\text{Var}(E[X|Y]) = E[(E[X|Y])^2] - (E[E[X|Y]])^2 = 0.6(100^2) + 0.4(110^2) - 104^2 = 10840 - 10816 = 24$

Total variance: $\text{Var}(X) = 21.4 + 24 = 45.4$
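The decomposition can be checked by simulation. The sketch below assumes (purely for illustration) that weights within each factory are normally distributed with the stated means and variances; the decomposition itself does not depend on that choice.

```python
import random

random.seed(0)

# Monte Carlo check of the law of total variance for the two-factory model.
# Factory 1 is chosen with probability 0.6 (mean 100, variance 25);
# Factory 2 with probability 0.4 (mean 110, variance 16).
# Normal within-factory weights are an illustrative assumption only.
n = 200_000
weights = []
for _ in range(n):
    if random.random() < 0.6:
        weights.append(random.gauss(100, 5))   # factory 1: sd = sqrt(25)
    else:
        weights.append(random.gauss(110, 4))   # factory 2: sd = sqrt(16)

mean = sum(weights) / n
var = sum((w - mean) ** 2 for w in weights) / n

within = 0.6 * 25 + 0.4 * 16                                # E[Var(X|Y)] = 21.4
between = 0.6 * (100 - 104) ** 2 + 0.4 * (110 - 104) ** 2   # Var(E[X|Y]) = 24
print(mean, var, within + between)   # empirical var is close to 21.4 + 24 = 45.4
```

The empirical variance of the pooled sample matches the sum of the within-group and between-group pieces, up to Monte Carlo error.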

Wald's Equation

Theorem

Let $X_1, X_2, \ldots$ be IID random variables with mean $\mu$, and let $N$ be a non-negative integer-valued random variable independent of the $X_i$. Then: $E\left[\sum_{i=1}^N X_i\right] = E[N] \cdot \mu$

This applies when the number of terms is itself random.

Example

A gambler plays until winning, with each game costing 5 dollars. The number of games follows Geometric($p$), with $E[N] = 1/p$. Total cost: $E[\text{Total cost}] = E[N] \times 5 = \frac{5}{p}$

For $p = 0.2$, the expected cost is $5/0.2 = 25$ dollars.
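A minimal simulation of this example: play repeated rounds with success probability 0.2, charge 5 dollars per game, and average the total cost over many trials.

```python
import random

random.seed(0)

# Monte Carlo check of Wald's equation for the gambling example:
# N ~ Geometric(p = 0.2), each game costs $5, so E[cost] = 5/p = 25.
p = 0.2
n_trials = 100_000
total_cost = 0.0
for _ in range(n_trials):
    games = 1
    while random.random() >= p:   # keep playing until the first win
        games += 1
    total_cost += 5 * games

avg_cost = total_cost / n_trials
print(avg_cost)   # close to 25
```

Here the number of terms in the sum (games played) is itself random, which is exactly the situation Wald's equation handles.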

Compound Distributions

Definition

A compound distribution arises when $X = \sum_{i=1}^N Y_i$, where $N$ is random and the $Y_i$ are IID.

Compound Mean: If $N$ and the $Y_i$ are independent: $E[X] = E[N] \cdot E[Y]$

Compound Variance: $\text{Var}(X) = E[N] \cdot \text{Var}(Y) + \text{Var}(N) \cdot (E[Y])^2$

Example

Insurance Claims: An insurance company receives $N \sim \text{Poisson}(50)$ claims per month, each with amount $Y \sim \text{Exponential}(1/1000)$ (mean 1000 dollars).

Total monthly claims $S = \sum_{i=1}^N Y_i$: $E[S] = 50 \times 1000 = 50{,}000$, $\text{Var}(S) = 50 \times 1000^2 + 50 \times 1000^2 = 100{,}000{,}000$, so $\sigma_S = 10{,}000$.
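A simulation sketch of the insurance example. Poisson draws are generated by counting exponential interarrival times in a unit interval (a standard construction, used here to stay within the standard library).

```python
import math
import random

random.seed(0)

# Monte Carlo check of the compound-distribution formulas for
# S = sum_{i=1}^N Y_i with N ~ Poisson(50), Y ~ Exponential(mean 1000).
def poisson(lam):
    # Count rate-lam exponential interarrival times that fit in [0, 1].
    t, k = 0.0, 0
    while True:
        t += random.expovariate(lam)
        if t > 1.0:
            return k
        k += 1

n_months = 20_000
totals = []
for _ in range(n_months):
    n_claims = poisson(50)
    totals.append(sum(random.expovariate(1 / 1000) for _ in range(n_claims)))

mean = sum(totals) / n_months
sd = math.sqrt(sum((s - mean) ** 2 for s in totals) / n_months)
print(mean, sd)   # near 50,000 and 10,000
```

Both sample moments agree with the compound mean and compound variance formulas above, up to Monte Carlo error.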

Jensen's Inequality

Theorem

If $g$ is a convex function and $X$ is a random variable: $E[g(X)] \geq g(E[X])$

If $g$ is concave, the inequality reverses.

Example

Since $g(x) = x^2$ is convex: $E[X^2] \geq (E[X])^2$

This shows $\text{Var}(X) = E[X^2] - (E[X])^2 \geq 0$.

Since $g(x) = \log x$ is concave (for $x > 0$): $E[\log X] \leq \log E[X]$

This is the AM-GM inequality in probabilistic form.
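Both inequalities also hold exactly for sample moments (the empirical distribution is itself a random variable). The sketch below checks them on an arbitrary sample drawn from a hypothetical Uniform(1, 10) distribution.

```python
import math
import random

random.seed(0)

# Jensen's inequality on sample moments:
# convex g(x) = x^2 gives E[X^2] >= (E[X])^2;
# concave g(x) = log x gives E[log X] <= log E[X].
# Uniform(1, 10) is an arbitrary illustrative choice of distribution.
xs = [random.uniform(1, 10) for _ in range(10_000)]

mean = sum(xs) / len(xs)
mean_sq = sum(x * x for x in xs) / len(xs)
mean_log = sum(math.log(x) for x in xs) / len(xs)

print(mean_sq >= mean ** 2)        # True: E[X^2] >= (E[X])^2
print(mean_log <= math.log(mean))  # True: E[log X] <= log E[X]
```

The second check is the AM-GM inequality in disguise: exponentiating $E[\log X] \leq \log E[X]$ says the geometric mean of the sample never exceeds its arithmetic mean.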

Remark

These tools—total variance decomposition, Wald's equation, compound distributions, and Jensen's inequality—are essential for modeling complex real-world phenomena. They allow us to compute moments of complicated random structures by breaking them down into simpler independent components.