The Gauss-Markov Theorem
The Gauss-Markov theorem establishes that ordinary least squares (OLS) has the smallest variance among all linear unbiased estimators, justifying the central role of OLS in statistical practice.
The Theorem
In the linear model $y = X\beta + \varepsilon$ with $E[\varepsilon] = 0$ and $\operatorname{Var}(\varepsilon) = \sigma^2 I$ (no normality assumed), the OLS estimator $\hat\beta = (X^\top X)^{-1} X^\top y$ is the Best Linear Unbiased Estimator (BLUE) of $\beta$. Specifically, for any linear combination $c^\top \beta$ and any other linear unbiased estimator $\tilde\beta$ with $E[\tilde\beta] = \beta$ for all $\beta$:

$$\operatorname{Var}(c^\top \hat\beta) \le \operatorname{Var}(c^\top \tilde\beta).$$
Best means minimum variance, Linear means the estimator is a linear function of $y$ (of the form $Ay$ for some matrix $A$), and Unbiased means $E[\hat\beta] = \beta$.
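The theorem can be checked by simulation. The sketch below (a minimal illustration; the design matrix, coefficients, and perturbation scale are arbitrary choices) builds a second linear unbiased estimator $Ay$ with $AX = I$ by adding to the OLS weights a matrix whose rows are orthogonal to the columns of $X$, then compares the two estimators' variances over repeated samples:

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps, sigma = 50, 20000, 1.0

# Fixed design: intercept plus one regressor (values are arbitrary).
X = np.column_stack([np.ones(n), np.linspace(0, 1, n)])
beta = np.array([1.0, 2.0])

# OLS is linear in y: beta_hat = W y with W = (X'X)^{-1} X'.
W = np.linalg.solve(X.T @ X, X.T)

# Another linear unbiased estimator A y with A X = I: adding any matrix
# whose rows are orthogonal to the columns of X preserves A X = I,
# because (I - P) X = 0 for the hat matrix P.
P = X @ W
A = W + 0.3 * rng.normal(size=(2, n)) @ (np.eye(n) - P)

Y = X @ beta + sigma * rng.normal(size=(reps, n))  # reps independent samples
slope_ols = Y @ W[1]                 # c = (0, 1): estimate the slope
slope_alt = Y @ A[1]

print(slope_ols.mean(), slope_alt.mean())  # both near 2.0 (unbiased)
print(slope_ols.var(), slope_alt.var())    # OLS variance is the smaller one
```

Both estimators are unbiased, but every perturbation away from the OLS weights only adds variance, exactly as the theorem predicts.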
Implications
For the model $y_i = \mu + \varepsilon_i$ (intercept only), $\hat\mu = \bar{y}$ and $\operatorname{Var}(\bar{y}) = \sigma^2/n$. The Gauss-Markov theorem says $\bar{y}$ has the smallest variance among all linear unbiased estimators of $\mu$. No other linear combination $\sum_i a_i y_i$ with $\sum_i a_i = 1$ can have smaller variance.
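This special case is easy to verify numerically. In the sketch below (illustrative values for $n$, $\mu$, $\sigma$, and the alternative weights), any weight vector summing to 1 gives an unbiased estimator, but only equal weights $1/n$ attain the variance $\sigma^2/n$:

```python
import numpy as np

rng = np.random.default_rng(1)
n, sigma, mu = 25, 2.0, 5.0

# An arbitrary unbiased weighting: decaying weights that sum to 1.
a = np.linspace(2, 0, n)
a /= a.sum()

reps = 50000
Y = mu + sigma * rng.normal(size=(reps, n))
mean_est = Y.mean(axis=1)   # equal weights 1/n: the sample mean
alt_est = Y @ a             # unequal weights, still unbiased

print(mean_est.var(), sigma**2 / n)  # close to sigma^2/n = 0.16
print(alt_est.var())                 # sigma^2 * sum(a^2) > sigma^2/n
```

By Cauchy-Schwarz, $\sum a_i^2 \ge 1/n$ whenever $\sum a_i = 1$, with equality only at equal weights, which is why the sample mean wins.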
Extensions
If $\operatorname{Var}(\varepsilon) = \sigma^2 V$ for a known positive definite matrix $V$ (heteroscedasticity or correlation), the BLUE is the generalized least squares (GLS) estimator:

$$\hat\beta_{\mathrm{GLS}} = (X^\top V^{-1} X)^{-1} X^\top V^{-1} y.$$

This reduces to OLS when $V = I$.
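One standard way to compute GLS is whitening: factor $V = LL^\top$ (Cholesky), premultiply the model by $L^{-1}$ so the transformed errors are spherical, and run OLS on the transformed data. A minimal sketch, with an assumed heteroscedastic toy model whose error variance grows with the regressor:

```python
import numpy as np

def gls(X, y, V):
    # Whitening: V = L L'. Solving L Xw = X and L yw = y gives a model
    # with Var = sigma^2 I, so OLS on (Xw, yw) is the GLS/BLUE estimate.
    L = np.linalg.cholesky(V)
    Xw = np.linalg.solve(L, X)
    yw = np.linalg.solve(L, y)
    beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    return beta

# Toy data: error variance proportional to x (known up to sigma^2).
rng = np.random.default_rng(3)
n = 200
x = np.linspace(1, 10, n)
X = np.column_stack([np.ones(n), x])
V = np.diag(x)
y = X @ np.array([2.0, 0.5]) + np.sqrt(x) * rng.normal(size=n)

beta_gls = gls(X, y, V)
beta_ols = gls(X, y, np.eye(n))  # V = I recovers ordinary least squares
print(beta_gls, beta_ols)        # both near [2.0, 0.5]
```

The whitening form is numerically preferable to inverting $V$ directly, and makes the reduction to OLS at $V = I$ immediate.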
The Gauss-Markov theorem restricts attention to linear unbiased estimators. Biased estimators (like ridge regression, $\hat\beta_{\mathrm{ridge}} = (X^\top X + \lambda I)^{-1} X^\top y$) can achieve lower MSE by trading a small bias for a large variance reduction, especially when predictors are highly correlated (multicollinearity). This observation motivates regularization methods that dominate modern statistics and machine learning.
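The bias-variance trade-off is easy to exhibit in a simulation. The sketch below (illustrative choices for the sample size, the collinearity level, and the penalty $\lambda$) draws two nearly identical predictor columns, so $X^\top X$ is close to singular and OLS variance explodes, while ridge stays stable:

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps, lam = 40, 2000, 1.0
beta = np.array([1.0, 1.0])

# Highly correlated predictors: X'X is near-singular (fixed design).
z = rng.normal(size=n)
X = np.column_stack([z, z + 0.05 * rng.normal(size=n)])
I2 = np.eye(2)

mse_ols = mse_ridge = 0.0
for _ in range(reps):
    y = X @ beta + rng.normal(size=n)
    b_ols = np.linalg.solve(X.T @ X, X.T @ y)
    b_ridge = np.linalg.solve(X.T @ X + lam * I2, X.T @ y)
    mse_ols += np.sum((b_ols - beta) ** 2)
    mse_ridge += np.sum((b_ridge - beta) ** 2)

print(mse_ols / reps)    # large: OLS is unbiased but wildly variable here
print(mse_ridge / reps)  # much smaller: small bias, big variance reduction
```

Ridge is still linear in $y$, so it does not contradict Gauss-Markov; it simply steps outside the unbiased class, where lower MSE becomes available.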