
Point Estimation

Point estimation concerns the problem of using sample data to produce a single best guess for an unknown population parameter, along with criteria for evaluating the quality of estimators.


Estimators and Their Properties

Definition

Let $X_1, \ldots, X_n$ be a random sample from a distribution $F_\theta$ parameterized by $\theta \in \Theta$. A point estimator of $\theta$ is a statistic $\hat{\theta} = T(X_1, \ldots, X_n)$, i.e., a function of the observed data that does not depend on $\theta$.

Definition

Key properties of an estimator $\hat{\theta}$ for the parameter $\theta$:

  1. Bias: $\operatorname{Bias}(\hat{\theta}) = E[\hat{\theta}] - \theta$. The estimator is unbiased if $E[\hat{\theta}] = \theta$ for all $\theta$.
  2. Variance: $\operatorname{Var}(\hat{\theta}) = E[(\hat{\theta} - E[\hat{\theta}])^2]$
  3. Mean squared error: $\operatorname{MSE}(\hat{\theta}) = E[(\hat{\theta} - \theta)^2] = \operatorname{Var}(\hat{\theta}) + [\operatorname{Bias}(\hat{\theta})]^2$
  4. Consistency: $\hat{\theta}_n \xrightarrow{P} \theta$ as $n \to \infty$ for all $\theta$
  5. Efficiency: the variance of $\hat{\theta}$ attains the Cramér-Rao lower bound among unbiased estimators
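The decomposition in property 3 can be verified numerically. The sketch below (a minimal illustration using only Python's standard library; the choice of $\bar{X}$ as the estimator and the normal population are illustrative assumptions) estimates bias, variance, and MSE by Monte Carlo and checks that the identity $\operatorname{MSE} = \operatorname{Var} + \operatorname{Bias}^2$ holds for the empirical quantities:

```python
import random

random.seed(0)

def mse_components(estimator, sample_draw, theta, reps=20_000):
    """Monte Carlo estimates of bias, variance, and MSE of an estimator."""
    estimates = [estimator(sample_draw()) for _ in range(reps)]
    mean_est = sum(estimates) / reps
    bias = mean_est - theta
    var = sum((e - mean_est) ** 2 for e in estimates) / reps
    mse = sum((e - theta) ** 2 for e in estimates) / reps
    return bias, var, mse

# Illustrative setup: sample mean of n = 10 draws from N(mu = 2, sigma = 1)
n, mu = 10, 2.0
draw = lambda: [random.gauss(mu, 1.0) for _ in range(n)]
xbar = lambda xs: sum(xs) / len(xs)

bias, var, mse = mse_components(xbar, draw, mu)
# bias ≈ 0 (Xbar is unbiased), var ≈ sigma^2/n = 0.1, and the
# empirical identity mse = var + bias^2 holds up to floating-point error
print(bias, var, mse)
```

Note that for the empirical versions computed from the same simulated estimates, the decomposition is an exact algebraic identity, not just an approximation.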

Common Estimators

Example: Sample mean and variance

For i.i.d. $X_1, \ldots, X_n$ with mean $\mu$ and variance $\sigma^2$:

  • $\bar{X} = \frac{1}{n}\sum X_i$ is unbiased for $\mu$ with $\operatorname{Var}(\bar{X}) = \sigma^2/n$
  • $S^2 = \frac{1}{n-1}\sum(X_i - \bar{X})^2$ is unbiased for $\sigma^2$
  • The biased version $\hat{\sigma}^2 = \frac{1}{n}\sum(X_i - \bar{X})^2$ has $E[\hat{\sigma}^2] = \frac{n-1}{n}\sigma^2$ and lower MSE than $S^2$ for normal populations
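Both claims about the variance estimators can be checked by simulation. The sketch below (stdlib-only; the sample size $n = 10$ and standard normal population are illustrative assumptions) shows $S^2$ averaging to $\sigma^2 = 1$ while the biased estimator $\hat{\sigma}^2$ attains a smaller MSE:

```python
import random

random.seed(1)

n, sigma2, reps = 10, 1.0, 100_000
sum_s2 = 0.0
sse_unbiased = 0.0   # accumulates (S^2 - sigma^2)^2
sse_biased = 0.0     # accumulates (sigma_hat^2 - sigma^2)^2

for _ in range(reps):
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)
    s2 = ss / (n - 1)       # unbiased estimator S^2
    sig2_hat = ss / n       # biased estimator sigma_hat^2
    sum_s2 += s2
    sse_unbiased += (s2 - sigma2) ** 2
    sse_biased += (sig2_hat - sigma2) ** 2

mean_s2 = sum_s2 / reps              # ≈ 1.0: S^2 is unbiased
mse_unbiased = sse_unbiased / reps   # ≈ 2/(n-1) ≈ 0.222 for normal data
mse_biased = sse_biased / reps       # ≈ (2n-1)/n^2 = 0.19: smaller MSE
print(mean_s2, mse_unbiased, mse_biased)
```

The comments give the known closed-form values for a normal population, which the simulation reproduces up to Monte Carlo error.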

The Bias-Variance Tradeoff

Remark: MSE decomposition

The MSE decomposition $\operatorname{MSE} = \operatorname{Variance} + \operatorname{Bias}^2$ reveals a fundamental tradeoff: reducing bias may increase variance and vice versa. A biased estimator can have lower MSE than an unbiased one if the variance reduction more than compensates for the squared bias. This tradeoff is central to modern statistical methods, including regularization, shrinkage estimators, and machine learning.
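A shrinkage estimator makes this tradeoff concrete. The sketch below (stdlib-only; the estimator family $\hat{\theta}_c = c\bar{X}$ and the particular $n$, $\mu$, $\sigma$ are illustrative assumptions) uses the analytic MSE $\operatorname{MSE}(c) = c^2\sigma^2/n + (c-1)^2\mu^2$: minimizing over $c$ gives $c^* = \mu^2/(\mu^2 + \sigma^2/n)$, a deliberately biased estimator that beats the unbiased choice $c = 1$:

```python
import random

random.seed(2)

# Shrinkage estimator theta_hat_c = c * Xbar for the mean mu.
# Analytically, MSE(c) = c^2 * sigma^2/n + (c - 1)^2 * mu^2, so some
# c < 1 wins when mu^2 is small relative to the variance sigma^2/n.
n, mu, sigma, reps = 5, 0.5, 2.0, 50_000

def mc_mse(c):
    """Monte Carlo MSE of c * Xbar as an estimator of mu."""
    total = 0.0
    for _ in range(reps):
        xbar = sum(random.gauss(mu, sigma) for _ in range(n)) / n
        total += (c * xbar - mu) ** 2
    return total / reps

mse_unbiased = mc_mse(1.0)                  # pure variance: sigma^2/n = 0.8
c_star = mu**2 / (mu**2 + sigma**2 / n)     # MSE-optimal shrinkage factor
mse_shrunk = mc_mse(c_star)                 # accepts bias, cuts variance
print(mse_unbiased, mse_shrunk)
```

Here shrinking toward zero roughly halves the MSE despite introducing bias, which is exactly the mechanism behind ridge-style regularization.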