
Proof of the Cramér-Rao Lower Bound

We prove the Cramér-Rao inequality, establishing the fundamental limit on the precision of unbiased estimators.


Proof

Theorem (Cramér-Rao): Let $X_1, \dots, X_n$ be i.i.d. with density $f(x; \theta)$. If $\hat{\theta}$ is an unbiased estimator of $\theta$ and the regularity conditions hold, then $\operatorname{Var}(\hat{\theta}) \geq 1/(nI(\theta))$.

Step 1: The score function.

Define the score function

$$S_n(\theta) = \frac{\partial}{\partial \theta} \log L(\theta) = \sum_{i=1}^n \frac{\partial}{\partial \theta} \log f(X_i; \theta),$$

where $L(\theta) = \prod_{i=1}^n f(X_i; \theta)$ is the likelihood.

The score has mean zero:

$$E[S_n(\theta)] = E\left[\frac{\partial}{\partial \theta} \log L(\theta)\right] = \int \frac{\partial}{\partial \theta} f(\mathbf{x}; \theta) \cdot \frac{1}{f(\mathbf{x};\theta)} \cdot f(\mathbf{x};\theta)\,d\mathbf{x} = \frac{\partial}{\partial \theta} \int f(\mathbf{x};\theta)\,d\mathbf{x} = \frac{\partial}{\partial \theta} 1 = 0,$$

where $f(\mathbf{x}; \theta)$ denotes the joint density and we used the regularity condition to interchange differentiation and integration.

The variance of the score is the Fisher information:

$$\operatorname{Var}(S_n(\theta)) = E[S_n(\theta)^2] = nI(\theta),$$

since the $n$ per-observation scores are i.i.d. with mean zero, so their variances add.
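As a quick numerical sanity check of these two identities (a simulation sketch, not part of the proof; the Bernoulli model and the values of $p$ and the number of draws are illustrative assumptions), for a single Bernoulli($p$) observation the score is $x/p - (1-x)/(1-p)$ and $I(p) = 1/(p(1-p))$:

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.3          # true parameter (illustrative)
reps = 200_000   # Monte Carlo draws of a single observation

x = rng.binomial(1, p, size=reps)
# Score of one Bernoulli(p) observation: d/dp log f(x; p) = x/p - (1 - x)/(1 - p)
score = x / p - (1 - x) / (1 - p)

print(np.mean(score))                     # ≈ 0: the score has mean zero
print(np.var(score), 1 / (p * (1 - p)))  # both ≈ 4.76: Var(score) = I(p)
```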

Step 2: Covariance calculation.

Since $\hat{\theta}$ is unbiased, $E[\hat{\theta}] = \theta$ for all $\theta$. Differentiating both sides with respect to $\theta$:

$$\frac{\partial}{\partial \theta} E[\hat{\theta}] = 1.$$

$$\frac{\partial}{\partial \theta} \int \hat{\theta}(\mathbf{x})\, f(\mathbf{x}; \theta)\,d\mathbf{x} = \int \hat{\theta}(\mathbf{x})\, \frac{\partial f}{\partial \theta}(\mathbf{x}; \theta)\,d\mathbf{x}$$

$$= \int \hat{\theta}(\mathbf{x})\, \frac{\partial \log f}{\partial \theta}(\mathbf{x}; \theta)\, f(\mathbf{x}; \theta)\,d\mathbf{x} = E[\hat{\theta} \cdot S_n(\theta)].$$

Since $E[S_n] = 0$:

$$\operatorname{Cov}(\hat{\theta}, S_n) = E[\hat{\theta} \cdot S_n] - E[\hat{\theta}] \cdot E[S_n] = 1 - \theta \cdot 0 = 1.$$
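The identity $\operatorname{Cov}(\hat{\theta}, S_n) = 1$ can also be checked by simulation (a sketch under an assumed $N(\theta, 1)$ model, where the score is $S_n(\theta) = \sum_i (X_i - \theta)$; all parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
theta, n, reps = 2.0, 50, 100_000

# reps independent datasets, each of n i.i.d. N(theta, 1) draws.
x = rng.normal(theta, 1.0, size=(reps, n))
theta_hat = x.mean(axis=1)        # unbiased estimator: the sample mean
score = (x - theta).sum(axis=1)   # S_n(theta) = sum_i (x_i - theta) for N(theta, 1)

# Cov(theta_hat, S_n) should be close to 1, regardless of n.
print(np.cov(theta_hat, score)[0, 1])
```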

Step 3: Apply the Cauchy-Schwarz inequality.

By the Cauchy-Schwarz inequality:

$$[\operatorname{Cov}(\hat{\theta}, S_n)]^2 \leq \operatorname{Var}(\hat{\theta}) \cdot \operatorname{Var}(S_n)$$

$$1 = 1^2 \leq \operatorname{Var}(\hat{\theta}) \cdot nI(\theta).$$

Therefore:

$$\operatorname{Var}(\hat{\theta}) \geq \frac{1}{nI(\theta)}.$$

Step 4: Equality condition.

Equality in Cauchy-Schwarz holds if and only if $\hat{\theta} - \theta = c \cdot S_n(\theta)$ for some constant $c$ (possibly depending on $\theta$, but not on $\mathbf{x}$). This occurs precisely when the estimator is a linear function of the score, which is characteristic of exponential family distributions. $\square$
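Both the bound and the equality condition can be illustrated numerically (a sketch under an assumed $N(\theta, 1)$ model, where $I(\theta) = 1$ and the bound is $1/n$): the sample mean satisfies $\hat{\theta} = \theta + S_n/n$, so it is linear in the score and attains the bound, while the sample median does not, its variance being approximately $\pi/(2n) > 1/n$:

```python
import numpy as np

rng = np.random.default_rng(2)
theta, n, reps = 0.0, 51, 100_000   # odd n so the median is a single order statistic

x = rng.normal(theta, 1.0, size=(reps, n))
var_mean = x.mean(axis=1).var()          # ≈ 1/n: the mean attains the bound
var_median = np.median(x, axis=1).var()  # ≈ pi/(2n) > 1/n: the median does not

print(var_mean, var_median, 1.0 / n)
```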


Remark: Extension to biased estimators

For a biased estimator $\hat{\theta}$ with $E[\hat{\theta}] = g(\theta)$, the Cramér-Rao bound becomes $\operatorname{Var}(\hat{\theta}) \geq [g'(\theta)]^2 / (nI(\theta))$. The proof is identical except that $\operatorname{Cov}(\hat{\theta}, S_n) = g'(\theta)$ instead of $1$.
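A brief simulation sketch of this biased version (illustrative assumptions throughout: $N(\theta, 1)$ data and the biased estimator $\bar{X}^2$ of $\theta^2$, for which $E[\bar{X}^2] = g(\theta) = \theta^2 + 1/n$, so $g'(\theta) = 2\theta$ and, with $I(\theta) = 1$, the bound is $4\theta^2/n$):

```python
import numpy as np

rng = np.random.default_rng(3)
theta, n, reps = 2.0, 50, 200_000

x = rng.normal(theta, 1.0, size=(reps, n))
est = x.mean(axis=1) ** 2          # biased: E[est] = g(theta) = theta^2 + 1/n
score = (x - theta).sum(axis=1)    # S_n(theta) for N(theta, 1)

bound = (2 * theta) ** 2 / n       # [g'(theta)]^2 / (n I(theta))
print(np.cov(est, score)[0, 1])    # ≈ g'(theta) = 4
# Exact variance is 4*theta^2/n + 2/n^2, just above the bound:
print(est.var(), bound)
```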