
The Law of Large Numbers

The Law of Large Numbers (LLN) formalizes the intuitive notion that averages of many independent observations converge to the expected value, providing the mathematical justification for the frequency interpretation of probability.


Weak Law

Theorem 7.7 (Weak Law of Large Numbers, WLLN)

Let $X_1, X_2, \ldots$ be i.i.d. random variables with finite mean $\mu = E[X_i]$. Then the sample mean converges in probability to $\mu$:
$$\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i \xrightarrow{P} \mu.$$
That is, for every $\epsilon > 0$, $\lim_{n \to \infty} P(|\bar{X}_n - \mu| > \epsilon) = 0$.

When $\sigma^2 = \operatorname{Var}(X_i)$ is finite, the WLLN follows immediately from Chebyshev's inequality, since $\operatorname{Var}(\bar{X}_n) = \sigma^2/n$:
$$P(|\bar{X}_n - \mu| > \epsilon) \leq \frac{\sigma^2}{n\epsilon^2} \to 0.$$
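The Chebyshev bound can be checked numerically. The sketch below (an illustration, not part of the text) estimates the deviation probability for i.i.d. Uniform(0, 1) samples, where $\mu = 1/2$ and $\sigma^2 = 1/12$, and compares it to the bound $\sigma^2/(n\epsilon^2)$:

```python
import random

def deviation_prob(n, eps, trials=2000, seed=0):
    """Monte Carlo estimate of P(|X̄_n - mu| > eps) for i.i.d.
    Uniform(0, 1) samples (mu = 0.5, sigma^2 = 1/12)."""
    rng = random.Random(seed)
    count = 0
    for _ in range(trials):
        xbar = sum(rng.random() for _ in range(n)) / n
        if abs(xbar - 0.5) > eps:
            count += 1
    return count / trials

eps = 0.05
for n in [10, 100, 1000]:
    p_hat = deviation_prob(n, eps)
    chebyshev = (1 / 12) / (n * eps**2)  # sigma^2 / (n * eps^2)
    print(f"n={n:5d}  P(|mean - 0.5| > {eps}) ~ {p_hat:.3f}  (Chebyshev bound: {chebyshev:.3f})")
```

As $n$ grows the estimated probability drops toward 0, as the WLLN requires; the Chebyshev bound is valid but typically very loose.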


Strong Law

Theorem 7.8 (Strong Law of Large Numbers, SLLN)

Let $X_1, X_2, \ldots$ be i.i.d. random variables with $E[|X_i|] < \infty$ (finite first moment). Then
$$\bar{X}_n \xrightarrow{a.s.} \mu = E[X_1].$$
That is, $P\left(\lim_{n \to \infty} \bar{X}_n = \mu\right) = 1$.

The strong law is a deeper result than the weak law: it says not only that $\bar{X}_n$ is close to $\mu$ with high probability for each large $n$, but that the entire sequence of averages converges to $\mu$ for almost every outcome.


Applications

Example (Monte Carlo estimation)

To estimate $\mu = E[g(X)]$ for a random variable $X$, generate i.i.d. samples $X_1, \ldots, X_n$ and compute $\hat{\mu}_n = \frac{1}{n}\sum_{i=1}^n g(X_i)$. The SLLN guarantees $\hat{\mu}_n \to \mu$ almost surely. The CLT gives the error estimate $\hat{\mu}_n \approx N(\mu, \sigma^2/n)$ with $\sigma^2 = \operatorname{Var}(g(X))$, so halving the error requires quadrupling $n$.
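As a minimal sketch of this recipe, the code below (the choice $g(x) = x^2$ with $X \sim$ Uniform(0, 1) is ours, giving the known value $\mu = 1/3$) shows the estimate tightening at roughly the $1/\sqrt{n}$ rate:

```python
import random

def mc_estimate(g, sampler, n, seed=0):
    """Plain Monte Carlo estimate of E[g(X)] from n i.i.d. samples
    drawn by `sampler`."""
    rng = random.Random(seed)
    return sum(g(sampler(rng)) for _ in range(n)) / n

# Example: mu = E[X^2] for X ~ Uniform(0, 1); true value is 1/3.
g = lambda x: x * x
for n in [100, 10_000, 1_000_000]:
    est = mc_estimate(g, lambda rng: rng.random(), n)
    print(f"n={n:8d}  estimate={est:.5f}  |error|={abs(est - 1/3):.5f}")
```

Each 100-fold increase in $n$ shrinks the typical error by about a factor of 10, matching the $\sigma/\sqrt{n}$ standard error from the CLT.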

Example (Glivenko–Cantelli theorem)

The empirical distribution function $F_n(x) = \frac{1}{n}\sum_{i=1}^n \mathbf{1}_{X_i \leq x}$ satisfies $\sup_x |F_n(x) - F(x)| \xrightarrow{a.s.} 0$ (the Glivenko–Cantelli theorem). This "fundamental theorem of statistics" says the empirical CDF converges uniformly to the true CDF, a far-reaching generalization of the SLLN.
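The supremum distance is easy to compute for sorted samples: it is attained at a sample point, via the standard Kolmogorov–Smirnov formula $\max_i \max\!\left(\frac{i}{n} - x_{(i)},\; x_{(i)} - \frac{i-1}{n}\right)$. A sketch for Uniform(0, 1) samples, where $F(x) = x$ (the uniform choice is ours, for convenience):

```python
import random

def ks_distance_uniform(n, seed=0):
    """sup_x |F_n(x) - F(x)| for n i.i.d. Uniform(0, 1) samples,
    where F(x) = x. For sorted samples x_(1) <= ... <= x_(n) the
    supremum is max over i of max(i/n - x_(i), x_(i) - (i-1)/n)."""
    rng = random.Random(seed)
    xs = sorted(rng.random() for _ in range(n))
    return max(max(i / n - x, x - (i - 1) / n)
               for i, x in enumerate(xs, start=1))

for n in [100, 10_000, 100_000]:
    print(f"n={n:7d}  sup|F_n - F| ~ {ks_distance_uniform(n):.4f}")
```

The distance shrinks toward 0 as $n$ grows, consistent with the uniform convergence the theorem asserts.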


Remark (Finite mean is necessary for the SLLN)

The condition $E[|X_1|] < \infty$ cannot be removed. If the $X_i$ are i.i.d. standard Cauchy random variables (density $f(x) = \frac{1}{\pi(1+x^2)}$), then $E[|X_i|] = \infty$, and in fact $\bar{X}_n$ has the same Cauchy distribution as $X_1$ for every $n$: the average does not concentrate at all.
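This failure is easy to observe. Standard Cauchy variables can be generated as $\tan(\pi(U - \tfrac{1}{2}))$ for $U \sim$ Uniform(0, 1), and since $\bar{X}_n$ is again standard Cauchy, $P(|\bar{X}_n| > 1) = 1/2$ for every $n$. A sketch:

```python
import random, math

def cauchy_sample_means(n, trials=2000, seed=0):
    """Independent realizations of X̄_n for i.i.d. standard Cauchy
    variables, generated via the inverse-CDF map tan(pi*(U - 1/2))."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        s = sum(math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n))
        means.append(s / n)
    return means

# P(|X̄_n| > 1) stays at 1/2 for every n: no concentration.
fracs = []
for n in [1, 10, 100]:
    means = cauchy_sample_means(n)
    frac = sum(abs(m) > 1 for m in means) / len(means)
    fracs.append(frac)
    print(f"n={n:3d}  P(|mean| > 1) ~ {frac:.3f}")
```

Contrast this with the Uniform or coin-flip examples above, where the same deviation probability collapses to 0 as $n$ grows.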