
The Spectral Theorem

The spectral theorem is the crown jewel of linear algebra, establishing that symmetric matrices have complete orthonormal eigenvector bases. This result underlies countless applications across mathematics and science.

Theorem: The Spectral Theorem (Real Case)

Let $A$ be a real $n \times n$ symmetric matrix. Then:

  1. All eigenvalues of $A$ are real
  2. Eigenvectors corresponding to distinct eigenvalues are orthogonal
  3. $\mathbb{R}^n$ has an orthonormal basis consisting of eigenvectors of $A$
  4. $A$ is orthogonally diagonalizable: $A = Q\Lambda Q^T$ where:
    • $Q$ is orthogonal with $Q^TQ = I$
    • $\Lambda = \text{diag}(\lambda_1, \ldots, \lambda_n)$ contains the eigenvalues
    • Columns of $Q$ are orthonormal eigenvectors

This decomposition is unique up to reordering the eigenvalues and the choice of orthonormal basis within each eigenspace; for a simple eigenvalue, that choice is just a sign.
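The real-case decomposition can be checked numerically. The sketch below (using NumPy's `eigh`, which is designed for symmetric/Hermitian matrices) builds a random symmetric matrix and verifies $A = Q\Lambda Q^T$, $Q^TQ = I$, and realness of the eigenvalues:

```python
import numpy as np

# Build a random symmetric matrix and verify its spectral decomposition.
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                       # symmetrize: A is now symmetric

eigvals, Q = np.linalg.eigh(A)          # eigh handles symmetric matrices
Lam = np.diag(eigvals)

assert np.allclose(Q @ Lam @ Q.T, A)    # A = Q Λ Q^T
assert np.allclose(Q.T @ Q, np.eye(4))  # Q is orthogonal
assert np.all(np.isreal(eigvals))       # all eigenvalues are real
```

Note that `eigh` exploits symmetry and always returns real eigenvalues, unlike the general-purpose `eig`.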

Theorem: The Spectral Theorem (Complex Case)

Let $A$ be an $n \times n$ Hermitian matrix (i.e., $A^* = A$). Then:

  1. All eigenvalues of $A$ are real
  2. $\mathbb{C}^n$ has an orthonormal basis of eigenvectors
  3. $A$ is unitarily diagonalizable: $A = U\Lambda U^*$ where $U$ is unitary and $\Lambda$ is real diagonal
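The complex case works the same way in code: a sketch constructing a Hermitian matrix and confirming that its eigenvalues come back real and that $A = U\Lambda U^*$ with $U$ unitary:

```python
import numpy as np

# Build a random Hermitian matrix A = A^* and verify A = U Λ U^*.
rng = np.random.default_rng(1)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A = (M + M.conj().T) / 2                 # Hermitian by construction

eigvals, U = np.linalg.eigh(A)           # eigenvalues are returned as reals
assert eigvals.dtype == np.float64       # real despite A being complex
assert np.allclose(U @ np.diag(eigvals) @ U.conj().T, A)
assert np.allclose(U.conj().T @ U, np.eye(3))   # U is unitary
```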

The spectral theorem guarantees maximal diagonalizability: unlike general matrices (which may require Jordan form), symmetric/Hermitian matrices always have enough eigenvectors for a complete orthonormal basis.

Theorem: Simultaneous Diagonalization

Two symmetric matrices $A$ and $B$ are simultaneously diagonalizable by the same orthogonal matrix if and only if $AB = BA$ (they commute).

If $AB = BA$, there exists an orthogonal $Q$ such that both $Q^TAQ$ and $Q^TBQ$ are diagonal.
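A quick way to see this theorem in action is to construct two symmetric matrices that share an eigenbasis by design, then check that they commute and that the shared $Q$ diagonalizes both. A sketch:

```python
import numpy as np

# Two symmetric matrices sharing the eigenbasis Q commute, and Q
# diagonalizes both of them simultaneously.
rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # random orthogonal Q
A = Q @ np.diag([1.0, 2.0, 3.0, 4.0]) @ Q.T
B = Q @ np.diag([5.0, -1.0, 0.5, 2.0]) @ Q.T

assert np.allclose(A @ B, B @ A)                   # they commute

# Q^T A Q and Q^T B Q are both diagonal (off-diagonal parts vanish)
for M in (Q.T @ A @ Q, Q.T @ B @ Q):
    assert np.allclose(M - np.diag(np.diag(M)), 0)
```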

Example: Application to Principal Component Analysis

Given a mean-centered data matrix $X$ whose rows are samples, the covariance matrix is $C = \frac{1}{n}X^TX$, which is symmetric positive semidefinite (PSD).

The spectral theorem gives $C = Q\Lambda Q^T$ where:

  • The eigenvectors (columns of $Q$) are the principal components: the directions of maximum variance
  • The eigenvalues $\lambda_i$ measure the variance along each principal component
  • The transformation $Y = XQ$ rotates the data onto the principal axes

PCA reduces dimensionality by keeping only eigenvectors with largest eigenvalues.
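A minimal PCA sketch along these lines, assuming rows of $X$ are samples and using `eigh` on the covariance matrix (the column scalings `3, 1, 0.1` are arbitrary choices for illustration):

```python
import numpy as np

# PCA via the spectral theorem: eigendecompose the covariance matrix
# and project onto the top eigenvectors.
rng = np.random.default_rng(3)
X = rng.standard_normal((200, 3)) * np.array([3.0, 1.0, 0.1])
X = X - X.mean(axis=0)                   # mean-center the data
n = X.shape[0]
C = (X.T @ X) / n                        # covariance matrix (symmetric PSD)

eigvals, Q = np.linalg.eigh(C)           # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]        # sort descending by variance
eigvals, Q = eigvals[order], Q[:, order]

Y = X @ Q[:, :2]                         # keep the top-2 principal components
print(Y.shape)                           # → (200, 2)
```

In practice the eigendecomposition is often computed via the SVD of $X$ directly, which is numerically preferable to forming $X^TX$.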

Theorem: Rayleigh Quotient

For a symmetric matrix $A$ and nonzero $\mathbf{x} \in \mathbb{R}^n$, the Rayleigh quotient is: $R(\mathbf{x}) = \frac{\mathbf{x}^TA\mathbf{x}}{\mathbf{x}^T\mathbf{x}}$

Let $\lambda_{\min}$ and $\lambda_{\max}$ be the smallest and largest eigenvalues of $A$. Then: $\lambda_{\min} \leq R(\mathbf{x}) \leq \lambda_{\max}$

Moreover, $R(\mathbf{x})$ attains its maximum $\lambda_{\max}$ when $\mathbf{x}$ is an eigenvector for $\lambda_{\max}$, and its minimum $\lambda_{\min}$ when $\mathbf{x}$ is an eigenvector for $\lambda_{\min}$.
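These bounds are easy to probe empirically. A sketch that evaluates the Rayleigh quotient on random vectors and confirms that the extreme eigenvectors attain the extreme values:

```python
import numpy as np

# Check λ_min <= R(x) <= λ_max on random vectors, and that the
# eigenvector for λ_max attains the maximum.
rng = np.random.default_rng(4)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                        # symmetric test matrix

def rayleigh(A, x):
    return (x @ A @ x) / (x @ x)

eigvals, Q = np.linalg.eigh(A)           # ascending eigenvalue order
lam_min, lam_max = eigvals[0], eigvals[-1]

for _ in range(100):
    x = rng.standard_normal(5)
    r = rayleigh(A, x)
    assert lam_min - 1e-12 <= r <= lam_max + 1e-12

assert np.isclose(rayleigh(A, Q[:, -1]), lam_max)  # eigenvector attains max
assert np.isclose(rayleigh(A, Q[:, 0]), lam_min)   # eigenvector attains min
```

This is the idea behind power-iteration-style methods: maximizing the Rayleigh quotient recovers the dominant eigenpair.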

Remark

The spectral theorem is foundational because it reveals the geometric structure of symmetric matrices: they are exactly those linear transformations that can be understood purely through orthogonal scaling along perpendicular axes. This decomposition appears everywhere: in vibrating systems (normal modes), quantum mechanics (observables), statistics (decorrelation), and optimization (quadratic forms). The theorem's elegance—that symmetry implies such strong spectral properties—is one of the most beautiful results in mathematics.