
Berry-Esseen Theorem and CLT Refinements

While the CLT states that normalized sums converge to the normal distribution, the Berry-Esseen theorem quantifies the rate of convergence, providing non-asymptotic error bounds.


Berry-Esseen Bound

Theorem 7.9 (Berry-Esseen Theorem)

Let $X_1, X_2, \ldots$ be i.i.d. random variables with $E[X_i] = 0$, $E[X_i^2] = \sigma^2 > 0$, and $E[|X_i|^3] = \rho < \infty$. Let $S_n = \frac{X_1 + \cdots + X_n}{\sigma\sqrt{n}}$. Then there exists a universal constant $C$ such that

$$\sup_{x \in \mathbb{R}} |P(S_n \leq x) - \Phi(x)| \leq \frac{C\rho}{\sigma^3 \sqrt{n}},$$

where $\Phi$ is the standard normal CDF. The best known upper bound is $C \leq 0.4748$ (Shevtsova, 2011).

The Berry-Esseen theorem shows the CLT approximation error is $O(1/\sqrt{n})$, and this rate is sharp in general. The bound depends on the ratio $\rho/\sigma^3$, a scale-invariant normalized third absolute moment that measures the "non-normality" of the distribution.

Example (Bernoulli case)

For $X_i \sim \text{Bernoulli}(1/2) - 1/2$: $\sigma^2 = 1/4$, $\rho = 1/8$, so $\rho/\sigma^3 = 1$. The Berry-Esseen bound gives:

$$\sup_x |P(S_n \leq x) - \Phi(x)| \leq \frac{0.4748 \cdot 1/8}{(1/2)^3 \sqrt{n}} = \frac{0.4748}{\sqrt{n}}$$

For $n = 100$: error $\leq 0.0475$, meaning the normal approximation to the CDF is off by at most about $0.05$ in absolute probability (roughly 5 percentage points).
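The Bernoulli example can be checked numerically. The sketch below is an illustration, not part of the original text: it computes the exact Kolmogorov distance between the standardized centered Bernoulli(1/2) sum and the standard normal, and compares it with the Berry-Esseen bound. The function names `std_normal_cdf`, `kolmogorov_distance_bernoulli`, and `berry_esseen_bound` are hypothetical helpers introduced here, and the constant 0.4748 is the value quoted in the theorem.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Standard normal CDF Phi(x), computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def kolmogorov_distance_bernoulli(n: int) -> float:
    """Exact sup_x |P(S_n <= x) - Phi(x)| for X_i = Bernoulli(1/2) - 1/2.

    S_n is a step function of x with atoms at (k - n/2) / (sigma * sqrt(n)),
    k = 0..n, so the supremum is attained at an atom, either at the CDF
    value there or at its left limit.
    """
    sigma = 0.5
    cdf = 0.0      # P(S_n < current atom); updated after each atom
    worst = 0.0
    for k in range(n + 1):
        x = (k - n / 2) / (sigma * math.sqrt(n))
        target = std_normal_cdf(x)
        worst = max(worst, abs(cdf - target))   # left limit at the atom
        cdf += math.comb(n, k) / 2**n           # add the binomial mass
        worst = max(worst, abs(cdf - target))   # value at the atom
    return worst


def berry_esseen_bound(n: int, rho: float = 1 / 8, sigma: float = 0.5,
                       C: float = 0.4748) -> float:
    """Right-hand side C * rho / (sigma^3 * sqrt(n)) of the bound."""
    return C * rho / (sigma**3 * math.sqrt(n))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, kolmogorov_distance_bernoulli(n), berry_esseen_bound(n))
```

For even $n$ the binomial distribution has an atom at the mean, so by symmetry the true distance is at least half the central binomial probability. This is one way to see that the $1/\sqrt{n}$ rate cannot be improved for lattice distributions.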


Lindeberg-Feller CLT

Theorem 7.10 (Lindeberg-Feller CLT)

Let $X_1, X_2, \ldots$ be independent (not necessarily identically distributed) with $E[X_i] = 0$, $\sigma_i^2 = \operatorname{Var}(X_i)$, and $s_n^2 = \sum_{i=1}^n \sigma_i^2$. If the Lindeberg condition holds: for every $\epsilon > 0$,

$$\frac{1}{s_n^2} \sum_{i=1}^n E\left[X_i^2 \cdot \mathbf{1}_{\{|X_i| > \epsilon s_n\}}\right] \to 0 \quad \text{as } n \to \infty,$$

then $\frac{X_1 + \cdots + X_n}{s_n} \xrightarrow{d} N(0, 1)$.

The Lindeberg condition ensures that no single summand dominates the sum. It is sufficient and, under a mild uniformity assumption ($\max_{1 \leq i \leq n} \sigma_i^2 / s_n^2 \to 0$), also necessary.
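To make the "no single summand dominates" reading concrete, here is a minimal sketch that evaluates the Lindeberg sum in closed form for an assumed family of summands: $X_i = \pm c_i$ with probability $1/2$ each. For this family $E[X_i^2 \mathbf{1}_{\{|X_i| > \epsilon s_n\}}]$ equals $c_i^2$ when $c_i > \epsilon s_n$ and $0$ otherwise. The family and the helper name `lindeberg_sum` are illustrative choices, not taken from the text.

```python
import math


def lindeberg_sum(scales, eps):
    """Lindeberg sum for independent X_i = +/- c_i (probability 1/2 each).

    `scales` holds the assumed constants c_1..c_n. Since |X_i| = c_i always,
    the truncated second moment is c_i^2 if c_i > eps * s_n, else 0.
    """
    s2 = sum(c * c for c in scales)          # s_n^2 = sum of variances
    s = math.sqrt(s2)
    return sum(c * c for c in scales if c > eps * s) / s2


if __name__ == "__main__":
    eps = 0.5
    for n in (5, 10, 20):
        equal = [1.0] * n                            # comparable summands
        growing = [2.0**i for i in range(1, n + 1)]  # last summand dominates
        print(n, lindeberg_sum(equal, eps), lindeberg_sum(growing, eps))
```

With equal scales the sum is exactly zero once $\sqrt{n} > 1/\epsilon$, so the condition holds. With $c_i = 2^i$ the last summand carries about three quarters of the total variance, the sum stays near $3/4$ for $\epsilon = 0.5$, and the condition fails.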


Remark (Multivariate CLT)

The CLT extends to random vectors: if $\mathbf{X}_1, \mathbf{X}_2, \ldots$ are i.i.d. random vectors in $\mathbb{R}^d$ with mean $\boldsymbol{\mu}$ and covariance matrix $\Sigma$, then

$$\sqrt{n}(\bar{\mathbf{X}}_n - \boldsymbol{\mu}) \xrightarrow{d} N(\mathbf{0}, \Sigma)$$

This multivariate version is the basis for multivariate statistical methods including principal component analysis and multivariate hypothesis testing.
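A small simulation can illustrate the multivariate statement. The sketch below assumes a specific toy distribution that is not from the text: $\mathbf{X} = (U, U + V)$ with $U, V$ independent and uniform on $[-1, 1]$, so $\boldsymbol{\mu} = \mathbf{0}$ and $\Sigma = \begin{pmatrix} 1/3 & 1/3 \\ 1/3 & 2/3 \end{pmatrix}$. If $\mathbf{Z}_n = \sqrt{n}\,\bar{\mathbf{X}}_n$ is approximately $N(\mathbf{0}, \Sigma)$, then $\mathbf{Z}_n^\top \Sigma^{-1} \mathbf{Z}_n$ is approximately chi-square with 2 degrees of freedom, so the fraction of draws with $\mathbf{Z}_n^\top \Sigma^{-1} \mathbf{Z}_n \leq q$ should be close to $1 - e^{-q/2}$. The helper name `ellipse_coverage` is hypothetical.

```python
import math
import random


def ellipse_coverage(n: int, reps: int, q: float, seed: int = 0) -> float:
    """Fraction of simulated Z_n = sqrt(n) * mean with Z^T Sigma^{-1} Z <= q.

    Toy model (an assumption of this sketch): X = (U, U + V) with U, V
    independent Uniform(-1, 1), so Sigma = [[1/3, 1/3], [1/3, 2/3]] and
    Sigma^{-1} = [[6, -3], [-3, 3]].
    """
    rng = random.Random(seed)
    inside = 0
    for _ in range(reps):
        s1 = s2 = 0.0
        for _ in range(n):
            u = rng.uniform(-1.0, 1.0)
            v = rng.uniform(-1.0, 1.0)
            s1 += u
            s2 += u + v
        # The mean vector is zero, so sqrt(n) * (mean - mu) = sum / sqrt(n).
        z1 = s1 / math.sqrt(n)
        z2 = s2 / math.sqrt(n)
        quad = 6.0 * z1 * z1 - 6.0 * z1 * z2 + 3.0 * z2 * z2
        if quad <= q:
            inside += 1
    return inside / reps


if __name__ == "__main__":
    q = 2.0
    print("simulated:", ellipse_coverage(n=30, reps=4000, q=q))
    print("normal limit:", 1.0 - math.exp(-q / 2.0))
```

Checking the coverage of an ellipse rather than the sample covariance is deliberate: the covariance of $\sqrt{n}\,\bar{\mathbf{X}}_n$ equals $\Sigma$ for every $n$, so only a distributional quantity like this one reflects the convergence to normality.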