
Sufficiency and the Rao-Blackwell Theorem

Sufficient statistics capture all the information in a sample relevant to estimating a parameter, and the Rao-Blackwell theorem shows how to improve any estimator by conditioning on a sufficient statistic.


Sufficient Statistics

Definition

A statistic $T = T(X_1, \ldots, X_n)$ is sufficient for $\theta$ if the conditional distribution of the sample $(X_1, \ldots, X_n)$ given $T$ does not depend on $\theta$. Intuitively, once $T$ is known, the remaining sample variation is "noise" that carries no information about $\theta$.

Theorem 8.6 (Fisher-Neyman Factorization)

A statistic $T(\mathbf{X})$ is sufficient for $\theta$ if and only if the joint density (or mass function) factors as $f(\mathbf{x}; \theta) = g(T(\mathbf{x}), \theta) \cdot h(\mathbf{x})$, where $g$ depends on $\mathbf{x}$ only through $T(\mathbf{x})$ and $h$ does not depend on $\theta$.
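As a concrete instance (a standard worked example, not numbered in the text), the factorization for an i.i.d. $\text{Poi}(\lambda)$ sample is:

```latex
f(\mathbf{x}; \lambda)
= \prod_{i=1}^{n} \frac{e^{-\lambda}\lambda^{x_i}}{x_i!}
= \underbrace{e^{-n\lambda}\,\lambda^{\sum_i x_i}}_{g\left(\sum_i x_i,\ \lambda\right)}
  \cdot
  \underbrace{\left(\prod_{i=1}^{n} x_i!\right)^{-1}}_{h(\mathbf{x})}
```

Reading off the factorization, $T(\mathbf{x}) = \sum_i x_i$ is sufficient for $\lambda$.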

Example (Sufficient statistics for common families)
  • Normal $N(\mu, \sigma^2)$: $(\bar{X}, \sum(X_i - \bar{X})^2)$ is sufficient for $(\mu, \sigma^2)$
  • Poisson $\text{Poi}(\lambda)$: $\sum X_i$ is sufficient for $\lambda$
  • Exponential families: the natural sufficient statistic is $T(\mathbf{x}) = \sum t(x_i)$
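The Poisson case can be checked numerically against the definition. A minimal sketch (the helper names `pois_pmf` and `cond_pmf` are ours): given the total $t$, the conditional law of $X_1$ is $\text{Binomial}(t, 1/n)$ no matter what $\lambda$ is, which is exactly what sufficiency requires:

```python
import math

def pois_pmf(k, mu):
    """Poisson pmf P(X = k) for mean mu."""
    return math.exp(-mu) * mu**k / math.factorial(k)

def cond_pmf(k, t, n, lam):
    """P(X_1 = k | X_1 + ... + X_n = t) for i.i.d. Poisson(lam) draws."""
    # Joint event: X_1 = k and the remaining n-1 variables sum to t - k,
    # where that partial sum is Poisson((n-1)*lam) and the total is Poisson(n*lam).
    joint = pois_pmf(k, lam) * pois_pmf(t - k, (n - 1) * lam)
    return joint / pois_pmf(t, n * lam)

n, t = 5, 8
for k in range(t + 1):
    binom = math.comb(t, k) * (1 / n) ** k * (1 - 1 / n) ** (t - k)
    # The conditional law is Binomial(t, 1/n) -- identical for every lambda.
    assert abs(cond_pmf(k, t, n, 2.0) - binom) < 1e-12
    assert abs(cond_pmf(k, t, n, 7.5) - binom) < 1e-12
print("conditional distribution of X_1 given the total is free of lambda")
```

The same check works for any pair of $\lambda$ values, since $\lambda$ cancels entirely in the ratio.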

The Rao-Blackwell Theorem

Theorem 8.7 (Rao-Blackwell Theorem)

Let $\hat{\theta}$ be an unbiased estimator of $\theta$ and $T$ a sufficient statistic. Define $\tilde{\theta} = E[\hat{\theta} \mid T]$. Then:

  1. $\tilde{\theta}$ is a function of $T$ only; by sufficiency the conditional expectation does not depend on $\theta$, so $\tilde{\theta}$ is a genuine statistic
  2. $E[\tilde{\theta}] = \theta$ (unbiased)
  3. $\operatorname{Var}(\tilde{\theta}) \leq \operatorname{Var}(\hat{\theta})$ for all $\theta$, with equality iff $\hat{\theta}$ is already a function of $T$

The Rao-Blackwell theorem says that conditioning any unbiased estimator on a sufficient statistic never increases the variance and typically strictly reduces it.
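A Monte Carlo sketch of this improvement (our illustration, assuming NumPy is available): to estimate $e^{-\lambda} = P(X = 0)$ from Poisson data, start with the crude unbiased indicator $1\{X_1 = 0\}$ and condition on the sufficient total $T = \sum X_i$. Since $X_1 \mid T = t$ is $\text{Binomial}(t, 1/n)$, the conditioned estimator is $E[1\{X_1 = 0\} \mid T] = ((n-1)/n)^T$:

```python
import numpy as np

rng = np.random.default_rng(0)
lam, n, reps = 2.0, 20, 50_000

x = rng.poisson(lam, size=(reps, n))
t = x.sum(axis=1)

# Crude unbiased estimator of P(X = 0) = exp(-lam): indicator that X_1 = 0.
crude = (x[:, 0] == 0).astype(float)

# Rao-Blackwellized version: E[1{X_1 = 0} | T] = ((n-1)/n)^T,
# since X_1 | T = t is Binomial(t, 1/n).
rb = ((n - 1) / n) ** t

print(f"target       : {np.exp(-lam):.4f}")
print(f"crude        : mean {crude.mean():.4f}, var {crude.var():.6f}")
print(f"Rao-Blackwell: mean {rb.mean():.4f}, var {rb.var():.6f}")
```

Both estimators center on $e^{-2} \approx 0.1353$, but the conditioned one has far smaller variance, as Theorem 8.7 guarantees.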


Complete Statistics and MVUE

Definition

A sufficient statistic $T$ is complete if for every measurable function $g$, $E[g(T)] = 0$ for all $\theta$ implies $g(T) = 0$ almost surely. Completeness ensures uniqueness: at most one function of $T$ can be an unbiased estimator of $\theta$ (any two such estimators must agree almost surely).

Remark (Lehmann-Scheffé Theorem)

If $T$ is a complete sufficient statistic and $\tilde{\theta} = g(T)$ is an unbiased estimator of $\theta$, then $\tilde{\theta}$ is the unique minimum variance unbiased estimator (MVUE). The Lehmann-Scheffé theorem thus provides a constructive recipe: find a complete sufficient statistic, then find an unbiased function of it, for example by Rao-Blackwellizing any crude unbiased estimator.
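Continuing the Poisson example (a standard application, stated here as an illustration): $T = \sum X_i \sim \text{Poi}(n\lambda)$ is complete sufficient, and the estimator $((n-1)/n)^T$ is unbiased for $e^{-\lambda}$:

```latex
E\!\left[\left(\tfrac{n-1}{n}\right)^{T}\right]
= \sum_{t=0}^{\infty} \left(\tfrac{n-1}{n}\right)^{t} \frac{e^{-n\lambda}(n\lambda)^{t}}{t!}
= e^{-n\lambda}\, e^{(n-1)\lambda}
= e^{-\lambda}
```

Being an unbiased function of a complete sufficient statistic, $((n-1)/n)^T$ is therefore the MVUE of $e^{-\lambda}$.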