
The Spectral Theorem

The spectral theorem is the crown jewel of linear algebra, establishing that symmetric matrices have complete orthonormal eigenvector bases. This result underlies countless applications across mathematics and science.

Theorem: The Spectral Theorem (Real Case)

Let $A$ be a real $n \times n$ symmetric matrix. Then:

  1. All eigenvalues of $A$ are real
  2. Eigenvectors corresponding to distinct eigenvalues are orthogonal
  3. $\mathbb{R}^n$ has an orthonormal basis consisting of eigenvectors of $A$
  4. $A$ is orthogonally diagonalizable: $A = Q\Lambda Q^T$ where:
    • $Q$ is orthogonal with $Q^TQ = I$
    • $\Lambda = \text{diag}(\lambda_1, \ldots, \lambda_n)$ contains the eigenvalues
    • Columns of $Q$ are orthonormal eigenvectors

This decomposition is unique up to reordering the eigenvalues and the choice of orthonormal basis within each eigenspace; for a simple eigenvalue, that choice is just a sign.
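The real-case decomposition can be checked numerically. The sketch below (using NumPy's `eigh`, which is designed for symmetric/Hermitian matrices) builds a random symmetric matrix and verifies $A = Q\Lambda Q^T$, $Q^TQ = I$, and realness of the eigenvalues:

```python
import numpy as np

# Build a random symmetric matrix and verify its spectral decomposition.
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                       # symmetrize: A is now symmetric

eigvals, Q = np.linalg.eigh(A)          # eigh handles symmetric matrices
Lam = np.diag(eigvals)

assert np.allclose(Q @ Lam @ Q.T, A)    # A = Q Λ Q^T
assert np.allclose(Q.T @ Q, np.eye(4))  # Q is orthogonal
assert np.all(np.isreal(eigvals))       # all eigenvalues are real
```

Note that `eigh` exploits symmetry and always returns real eigenvalues, unlike the general-purpose `eig`.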

Theorem: The Spectral Theorem (Complex Case)

Let $A$ be an $n \times n$ Hermitian matrix (i.e., $A^* = A$). Then:

  1. All eigenvalues of $A$ are real
  2. $\mathbb{C}^n$ has an orthonormal basis of eigenvectors
  3. $A$ is unitarily diagonalizable: $A = U\Lambda U^*$ where $U$ is unitary and $\Lambda$ is real diagonal
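The complex case works the same way in code: a sketch constructing a Hermitian matrix and confirming that its eigenvalues come back real and that $A = U\Lambda U^*$ with $U$ unitary:

```python
import numpy as np

# Build a random Hermitian matrix A = A^* and verify A = U Λ U^*.
rng = np.random.default_rng(1)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A = (M + M.conj().T) / 2                 # Hermitian by construction

eigvals, U = np.linalg.eigh(A)           # eigenvalues are returned as reals
assert eigvals.dtype == np.float64       # real despite A being complex
assert np.allclose(U @ np.diag(eigvals) @ U.conj().T, A)
assert np.allclose(U.conj().T @ U, np.eye(3))   # U is unitary
```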

The spectral theorem guarantees maximal diagonalizability: unlike general matrices (which may require Jordan form), symmetric/Hermitian matrices always have enough eigenvectors for a complete orthonormal basis.

Theorem: Simultaneous Diagonalization

Two symmetric matrices $A$ and $B$ are simultaneously diagonalizable by the same orthogonal matrix if and only if $AB = BA$ (they commute).

If $AB = BA$, there exists an orthogonal $Q$ such that both $Q^TAQ$ and $Q^TBQ$ are diagonal.
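A quick way to see this theorem in action is to construct two symmetric matrices that share an eigenbasis by design, then check that they commute and that the shared $Q$ diagonalizes both. A sketch:

```python
import numpy as np

# Two symmetric matrices sharing the eigenbasis Q commute, and Q
# diagonalizes both of them simultaneously.
rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # random orthogonal Q
A = Q @ np.diag([1.0, 2.0, 3.0, 4.0]) @ Q.T
B = Q @ np.diag([5.0, -1.0, 0.5, 2.0]) @ Q.T

assert np.allclose(A @ B, B @ A)                   # they commute

# Q^T A Q and Q^T B Q are both diagonal (off-diagonal parts vanish)
for M in (Q.T @ A @ Q, Q.T @ B @ Q):
    assert np.allclose(M - np.diag(np.diag(M)), 0)
```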

Example: Application to Principal Component Analysis

Given a mean-centered data matrix $X$ whose rows are samples, the covariance matrix is $C = \frac{1}{n}X^TX$, which is symmetric positive semidefinite (PSD).

The spectral theorem gives $C = Q\Lambda Q^T$ where:

  • The eigenvectors (columns of $Q$) are the principal components: the directions of maximum variance
  • The eigenvalues $\lambda_i$ measure the variance along each principal component
  • The transformation $Y = XQ$ rotates the data onto the principal axes

PCA reduces dimensionality by keeping only eigenvectors with largest eigenvalues.
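A minimal PCA sketch along these lines, assuming rows of $X$ are samples and using `eigh` on the covariance matrix (the column scalings `3, 1, 0.1` are arbitrary choices for illustration):

```python
import numpy as np

# PCA via the spectral theorem: eigendecompose the covariance matrix
# and project onto the top eigenvectors.
rng = np.random.default_rng(3)
X = rng.standard_normal((200, 3)) * np.array([3.0, 1.0, 0.1])
X = X - X.mean(axis=0)                   # mean-center the data
n = X.shape[0]
C = (X.T @ X) / n                        # covariance matrix (symmetric PSD)

eigvals, Q = np.linalg.eigh(C)           # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]        # sort descending by variance
eigvals, Q = eigvals[order], Q[:, order]

Y = X @ Q[:, :2]                         # keep the top-2 principal components
print(Y.shape)                           # → (200, 2)
```

In practice the eigendecomposition is often computed via the SVD of $X$ directly, which is numerically preferable to forming $X^TX$.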

Theorem: Rayleigh Quotient

For a symmetric matrix $A$ and nonzero $\mathbf{x} \in \mathbb{R}^n$, the Rayleigh quotient is: $R(\mathbf{x}) = \frac{\mathbf{x}^TA\mathbf{x}}{\mathbf{x}^T\mathbf{x}}$

Let $\lambda_{\min}$ and $\lambda_{\max}$ be the smallest and largest eigenvalues of $A$. Then: $\lambda_{\min} \leq R(\mathbf{x}) \leq \lambda_{\max}$

Moreover, $R(\mathbf{x})$ attains its maximum $\lambda_{\max}$ when $\mathbf{x}$ is an eigenvector for $\lambda_{\max}$, and its minimum $\lambda_{\min}$ when $\mathbf{x}$ is an eigenvector for $\lambda_{\min}$.
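These bounds are easy to probe empirically. A sketch that evaluates the Rayleigh quotient on random vectors and confirms that the extreme eigenvectors attain the extreme values:

```python
import numpy as np

# Check λ_min <= R(x) <= λ_max on random vectors, and that the
# eigenvector for λ_max attains the maximum.
rng = np.random.default_rng(4)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                        # symmetric test matrix

def rayleigh(A, x):
    return (x @ A @ x) / (x @ x)

eigvals, Q = np.linalg.eigh(A)           # ascending eigenvalue order
lam_min, lam_max = eigvals[0], eigvals[-1]

for _ in range(100):
    x = rng.standard_normal(5)
    r = rayleigh(A, x)
    assert lam_min - 1e-12 <= r <= lam_max + 1e-12

assert np.isclose(rayleigh(A, Q[:, -1]), lam_max)  # eigenvector attains max
assert np.isclose(rayleigh(A, Q[:, 0]), lam_min)   # eigenvector attains min
```

This is the idea behind power-iteration-style methods: maximizing the Rayleigh quotient recovers the dominant eigenpair.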

Remark

The spectral theorem is foundational because it reveals the geometric structure of symmetric matrices: they are exactly those linear transformations that can be understood purely through orthogonal scaling along perpendicular axes. This decomposition appears everywhere: in vibrating systems (normal modes), quantum mechanics (observables), statistics (decorrelation), and optimization (quadratic forms). The theorem's elegance—that symmetry implies such strong spectral properties—is one of the most beautiful results in mathematics.