
Proof of the Spectral Theorem

We prove that every real symmetric matrix is orthogonally diagonalizable. The proof proceeds by induction on the size of the matrix, using three key facts: eigenvalues are real, eigenvectors for distinct eigenvalues are orthogonal, and the orthogonal complement of an eigenspace is invariant under a symmetric matrix.


Statement

Theorem 8.5 (Spectral Theorem, real symmetric case)

Let $A \in M_{n \times n}(\mathbb{R})$ with $A^T = A$. Then there exists an orthogonal matrix $Q$ ($Q^T Q = I$) such that $Q^T A Q = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$, where all $\lambda_i \in \mathbb{R}$.


Preliminary lemmas

Lemma 1: Eigenvalues of symmetric matrices are real

Proof. Let $A = A^T$ (with real entries) and $Av = \lambda v$ for some $v \in \mathbb{C}^n$, $v \neq 0$. We show $\lambda \in \mathbb{R}$.

Consider $v^* A v$ (where $v^* = \bar{v}^T$):

$$v^* A v = v^* (\lambda v) = \lambda (v^* v) = \lambda \|v\|^2.$$

Also, since $A = A^T = \bar{A}$ (real matrix) and hence $A^* = \bar{A}^T = A^T = A$:

$$v^* A v = v^* A^* v = (Av)^* v = (\lambda v)^* v = \bar{\lambda} (v^* v) = \bar{\lambda} \|v\|^2.$$

Comparing: $\lambda \|v\|^2 = \bar{\lambda} \|v\|^2$. Since $\|v\|^2 > 0$, we get $\lambda = \bar{\lambda}$, so $\lambda \in \mathbb{R}$. $\blacksquare$

Example: Verifying real eigenvalues

$A = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}$: $p(\lambda) = \lambda^2 - 5\lambda = \lambda(\lambda - 5)$. Eigenvalues: $0, 5$. Both real ✓.

$A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$: eigenvalues $1, -1$. Real ✓.

Compare with the non-symmetric $B = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ ($B^T \neq B$): eigenvalues $\pm i$. Not real. Symmetry is essential.
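The same contrast can be checked numerically. This is an illustrative sketch using NumPy (not part of the proof): a random symmetric matrix has real eigenvalues, while the rotation matrix $B$ above has eigenvalues $\pm i$.

```python
import numpy as np

# Sketch of Lemma 1: eigenvalues of a real symmetric matrix are real,
# while the non-symmetric rotation matrix B has eigenvalues +-i.
rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M + M.T                              # symmetric by construction
eig_sym = np.linalg.eigvals(A)
print(np.max(np.abs(eig_sym.imag)))      # 0 (up to rounding)

B = np.array([[0.0, -1.0], [1.0, 0.0]])
print(np.linalg.eigvals(B))              # +-i: purely imaginary
```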

Lemma 2: Orthogonality of eigenvectors for distinct eigenvalues

Proof. Let $A = A^T$, $Av_1 = \lambda_1 v_1$, $Av_2 = \lambda_2 v_2$, with $\lambda_1 \neq \lambda_2$.

$$\lambda_1 (v_1 \cdot v_2) = (Av_1) \cdot v_2 = v_1^T A^T v_2 = v_1^T A v_2 = v_1 \cdot (Av_2) = \lambda_2 (v_1 \cdot v_2).$$

So $(\lambda_1 - \lambda_2)(v_1 \cdot v_2) = 0$. Since $\lambda_1 \neq \lambda_2$, we conclude $v_1 \cdot v_2 = 0$. $\blacksquare$

Example: Checking orthogonality

$A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$, eigenvalues $3, 1$, eigenvectors $(1,1)$ and $(1,-1)$.

$(1,1) \cdot (1,-1) = 0$ ✓ (orthogonal).
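The same check in NumPy (illustrative; `np.linalg.eigh` is the routine designed for symmetric matrices and returns an orthonormal eigenbasis directly):

```python
import numpy as np

# Lemma 2 check: eigenvectors for the distinct eigenvalues 1 and 3 of a
# symmetric matrix are orthogonal.
A = np.array([[2.0, 1.0], [1.0, 2.0]])
vals, vecs = np.linalg.eigh(A)                 # eigenvalues ascending
print(vals)                                    # [1. 3.]
print(abs(vecs[:, 0] @ vecs[:, 1]) < 1e-12)    # True: orthogonal columns
```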

Lemma 3: Orthogonal complements of eigenspaces are invariant

Let $A = A^T$ and let $\lambda$ be an eigenvalue with eigenspace $E_\lambda$. Then $E_\lambda^\perp$ is invariant under $A$.

Proof. If $v \in E_\lambda^\perp$ (meaning $v \cdot w = 0$ for all $w \in E_\lambda$), then for any $w \in E_\lambda$:

$$(Av) \cdot w = v \cdot (A^T w) = v \cdot (Aw) = v \cdot (\lambda w) = \lambda (v \cdot w) = 0.$$

So $Av \in E_\lambda^\perp$. This means $A$ maps $E_\lambda^\perp$ to itself, so $A$ restricts to a linear map on $E_\lambda^\perp$. $\blacksquare$

Example: Checking invariance of the orthogonal complement

$A = \begin{pmatrix} 3 & 1 & 0 \\ 1 & 3 & 0 \\ 0 & 0 & 5 \end{pmatrix}$, eigenvalue $\lambda = 5$ with $E_5 = \operatorname{span}\{(0,0,1)\}$.

$E_5^\perp = \operatorname{span}\{(1,0,0), (0,1,0)\}$ (the $xy$-plane).

$A(1,0,0) = (3,1,0) \in E_5^\perp$ ✓ and $A(0,1,0) = (1,3,0) \in E_5^\perp$ ✓. The restriction is $\begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix}$ (still symmetric).
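The invariance check can be scripted as a small sanity check (assuming NumPy):

```python
import numpy as np

# Lemma 3 check: A maps the xy-plane E_5^perp into itself, i.e. the
# third coordinate of Av stays 0 for v in the plane.
A = np.array([[3.0, 1.0, 0.0],
              [1.0, 3.0, 0.0],
              [0.0, 0.0, 5.0]])
for v in (np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])):
    print(A @ v)   # [3. 1. 0.] then [1. 3. 0.]: both stay in the plane
```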


Main proof

Proof of the Spectral Theorem by induction

Induction on $n$. The case $n = 1$ is trivial ($A = (\lambda_1)$ is already diagonal).

Inductive step. Assume the theorem holds for all symmetric matrices of size less than $n$. Let $A \in M_{n \times n}(\mathbb{R})$ be symmetric.

Step 1: Find an eigenvalue. The characteristic polynomial $p_A(\lambda) = \det(A - \lambda I)$ is a real polynomial of degree $n$. Over $\mathbb{C}$, it has at least one root $\lambda_1$. By Lemma 1, $\lambda_1 \in \mathbb{R}$.

Step 2: Find an eigenvector. Since $\lambda_1$ is real, $(A - \lambda_1 I)v = 0$ has a nonzero real solution $v_1 \in \mathbb{R}^n$. Normalize: $q_1 = v_1 / \|v_1\|$, so $\|q_1\| = 1$.

Step 3: Reduce to a smaller matrix. Let $W = \operatorname{span}\{q_1\}^\perp = \{v \in \mathbb{R}^n : v \cdot q_1 = 0\}$, which has dimension $n - 1$.

By Lemma 3, $A$ maps $W$ to itself: if $v \in W$, then $Av \in W$. The restriction $A|_W : W \to W$ is a linear operator on an $(n-1)$-dimensional space, and it is symmetric with respect to the standard inner product restricted to $W$.

Step 4: Choose a basis for $W$. Extend $q_1$ to an orthonormal basis $\{q_1, w_2, \ldots, w_n\}$ of $\mathbb{R}^n$ (using Gram--Schmidt on any extension). Let $\tilde{Q}_1 = (q_1 \mid w_2 \mid \cdots \mid w_n)$. Then:

$$\tilde{Q}_1^T A \tilde{Q}_1 = \begin{pmatrix} \lambda_1 & 0 \\ 0 & A' \end{pmatrix},$$

where $A' \in M_{(n-1) \times (n-1)}(\mathbb{R})$ is the matrix of $A|_W$ with respect to $\{w_2, \ldots, w_n\}$. The zeros in the first row and column arise because $Aq_1 = \lambda_1 q_1$ and $q_1 \perp w_j$ for $j \geq 2$:

$$(\tilde{Q}_1^T A \tilde{Q}_1)_{1j} = q_1^T A w_j = (Aq_1)^T w_j = \lambda_1 q_1^T w_j = 0 \quad \text{for } j \geq 2.$$

By symmetry, $(\tilde{Q}_1^T A \tilde{Q}_1)_{j1} = 0$ for $j \geq 2$ as well.

Step 5: Apply induction. $A'$ is symmetric (since $A$ is, and the restriction preserves symmetry). By the inductive hypothesis, there exists an orthogonal matrix $Q'$ such that $(Q')^T A' Q' = \operatorname{diag}(\lambda_2, \ldots, \lambda_n)$.

Step 6: Combine. Let $Q_2 = \begin{pmatrix} 1 & 0 \\ 0 & Q' \end{pmatrix}$ (orthogonal). Then $Q = \tilde{Q}_1 Q_2$ is orthogonal and:

$$Q^T A Q = Q_2^T \begin{pmatrix} \lambda_1 & 0 \\ 0 & A' \end{pmatrix} Q_2 = \begin{pmatrix} \lambda_1 & 0 \\ 0 & (Q')^T A' Q' \end{pmatrix} = \operatorname{diag}(\lambda_1, \ldots, \lambda_n). \quad \blacksquare$$
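The six steps translate directly into a recursive algorithm. The sketch below (NumPy; `orthogonal_diagonalize` is my own illustrative name, not a standard routine) mirrors the induction: the eigenpair of Steps 1-2 comes from a library call, since the proof only needs its existence, and Step 4's basis extension uses a QR factorization in place of Gram--Schmidt.

```python
import numpy as np

def orthogonal_diagonalize(A):
    """Recursively orthogonally diagonalize a real symmetric matrix,
    following the inductive proof. Returns (Q, D) with A ~ Q D Q^T."""
    n = A.shape[0]
    if n == 1:                              # base case n = 1
        return np.eye(1), A.copy()
    # Steps 1-2: one real eigenvalue and unit eigenvector.
    vals, vecs = np.linalg.eigh(A)
    q1 = vecs[:, 0]
    # Step 4: extend q1 to an orthonormal basis via QR of [q1 | I].
    Q1, _ = np.linalg.qr(np.column_stack([q1, np.eye(n)]))
    B = Q1.T @ A @ Q1                       # [[lambda1, 0], [0, A']]
    A_prime = B[1:, 1:]
    # Steps 5-6: recurse on A' and combine.
    Qp, Dp = orthogonal_diagonalize(A_prime)
    Q2 = np.eye(n)
    Q2[1:, 1:] = Qp
    Q = Q1 @ Q2
    D = np.zeros_like(A)
    D[0, 0] = B[0, 0]
    D[1:, 1:] = Dp
    return Q, D

A = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
Q, D = orthogonal_diagonalize(A)
print(np.allclose(Q.T @ Q, np.eye(3)))   # True: Q is orthogonal
print(np.allclose(Q @ D @ Q.T, A))       # True: A = Q D Q^T
```

This is a didactic sketch, not a production eigensolver; in practice one calls `np.linalg.eigh` once on the whole matrix.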


Worked example of the inductive proof

Example: Walking through the proof for a 3×3 matrix

$A = \begin{pmatrix} 2 & 1 & 0 \\ 1 & 3 & 1 \\ 0 & 1 & 2 \end{pmatrix}$.

Steps 1-2: Characteristic polynomial: $p(\lambda) = -\lambda^3 + 7\lambda^2 - 14\lambda + 8 = -(\lambda - 1)(\lambda - 2)(\lambda - 4)$.

Eigenvalues: $1, 2, 4$. Eigenvector for $\lambda = 4$: solve $(A - 4I)v = 0$ with $A - 4I = \begin{pmatrix} -2 & 1 & 0 \\ 1 & -1 & 1 \\ 0 & 1 & -2 \end{pmatrix}$. Row reduction gives $v_1 = (1, 2, 1)$, so $q_1 = \frac{1}{\sqrt{6}}(1, 2, 1)$.

Steps 3-4: $W = q_1^\perp$ has dimension $2$. An orthonormal basis for $W$: $w_2 = \frac{1}{\sqrt{2}}(1, 0, -1)$, $w_3 = \frac{1}{\sqrt{3}}(-1, 1, -1)$ (orthogonal to $q_1$ and to each other).

$$\tilde{Q}_1 = \begin{pmatrix} 1/\sqrt{6} & 1/\sqrt{2} & -1/\sqrt{3} \\ 2/\sqrt{6} & 0 & 1/\sqrt{3} \\ 1/\sqrt{6} & -1/\sqrt{2} & -1/\sqrt{3} \end{pmatrix}.$$

$$\tilde{Q}_1^T A \tilde{Q}_1 = \begin{pmatrix} 4 & 0 & 0 \\ 0 & a' & b' \\ 0 & b' & c' \end{pmatrix}, \quad \text{where } A' = \begin{pmatrix} a' & b' \\ b' & c' \end{pmatrix} \text{ is the restriction.}$$

Computing $A'$: $Aw_2 = \frac{1}{\sqrt{2}}(2, 0, -2) = 2w_2$ and $Aw_3 = \frac{1}{\sqrt{3}}(-1, 1, -1) = w_3$, so here $w_2$ and $w_3$ happen to be eigenvectors already, and $A' = \operatorname{diag}(2, 1)$. Its eigenvalues $2$ and $1$ are exactly the remaining eigenvalues of $A$.

Steps 5-6: Diagonalize $A'$ (a $2 \times 2$ symmetric matrix) to get $Q'$, then combine. In this example $A'$ is already diagonal, so $Q' = I$ works.
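A quick numerical check of the worked example (illustrative, assuming NumPy): the basis from Steps 3-4 block-diagonalizes $A$ as claimed.

```python
import numpy as np

# Verify Step 4 of the worked example: Q1~ block-diagonalizes A, and the
# lower-right block A' comes out diagonal here, diag(2, 1).
A = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
q1 = np.array([1.0, 2.0, 1.0]) / np.sqrt(6)
w2 = np.array([1.0, 0.0, -1.0]) / np.sqrt(2)
w3 = np.array([-1.0, 1.0, -1.0]) / np.sqrt(3)
Q1 = np.column_stack([q1, w2, w3])
B = Q1.T @ A @ Q1
print(np.round(B, 10))   # diag(4, 2, 1) up to rounding
```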


Alternative proof via Schur decomposition

Proof. Every complex matrix $A$ has a Schur decomposition $A = UTU^*$, where $U$ is unitary and $T$ is upper triangular with the eigenvalues of $A$ on the diagonal.

If $A = A^*$ (Hermitian), then $T = U^* A U$ satisfies $T^* = (U^* A U)^* = U^* A^* U = U^* A U = T$. An upper triangular matrix that is also Hermitian must be diagonal: the off-diagonal entries $t_{ij}$ with $i < j$ must equal $\overline{t_{ji}} = 0$. So $T = \Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$.

For the real case: the real Schur decomposition gives $A = QTQ^T$ with $Q$ orthogonal and $T$ quasi-upper-triangular (block upper triangular with $1 \times 1$ and $2 \times 2$ blocks on the diagonal). If $A = A^T$, then $T = Q^T A Q$ is symmetric, hence block diagonal; moreover, the $2 \times 2$ blocks in a real Schur form carry complex conjugate eigenvalue pairs, which Lemma 1 rules out for symmetric $A$. Hence $T$ is diagonal and $A = Q \Lambda Q^T$. $\blacksquare$
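The contrast can be seen numerically with SciPy's `scipy.linalg.schur` (an illustration, not part of the proof; requires SciPy):

```python
import numpy as np
from scipy.linalg import schur

# Real Schur form: diagonal (up to rounding) for a symmetric matrix,
# genuinely upper triangular for a non-symmetric one.
A = np.array([[2.0, 1.0], [1.0, 2.0]])
T, Z = schur(A)
print(np.allclose(T, np.diag(np.diag(T))))    # True: T is diagonal

B = np.array([[2.0, 1.0], [0.0, 3.0]])
TB, ZB = schur(B)
print(np.allclose(TB, np.diag(np.diag(TB))))  # False: TB stays triangular
```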

Example: The Schur form of a symmetric matrix is diagonal

$A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$.

Schur decomposition over $\mathbb{R}$: since $A$ is symmetric, the Schur form $T$ is diagonal: $T = \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}$ with $Q = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$.

Compare with the non-symmetric matrix $B = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}$: its Schur form is $T = \begin{pmatrix} 2 & 1 \\ 0 & 3 \end{pmatrix}$ (still upper triangular, not diagonal).


Verification examples

Example: Full verification for a 2×2 matrix

$A = \begin{pmatrix} 4 & 2 \\ 2 & 1 \end{pmatrix}$, eigenvalues $5, 0$.

$v_1 = (2, 1)$ for $\lambda = 5$, $v_2 = (-1, 2)$ for $\lambda = 0$.

$Q = \frac{1}{\sqrt{5}}\begin{pmatrix} 2 & -1 \\ 1 & 2 \end{pmatrix}$, $Q^T Q = I$ ✓.

$$Q^T A Q = \frac{1}{5}\begin{pmatrix} 2 & 1 \\ -1 & 2 \end{pmatrix}\begin{pmatrix} 4 & 2 \\ 2 & 1 \end{pmatrix}\begin{pmatrix} 2 & -1 \\ 1 & 2 \end{pmatrix} = \begin{pmatrix} 5 & 0 \\ 0 & 0 \end{pmatrix}. ✓$$
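The same verification in NumPy (a sanity check of the hand computation):

```python
import numpy as np

# Verify the 2x2 example: Q is orthogonal and Q^T A Q = diag(5, 0).
A = np.array([[4.0, 2.0], [2.0, 1.0]])
Q = np.array([[2.0, -1.0], [1.0, 2.0]]) / np.sqrt(5)
print(np.allclose(Q.T @ Q, np.eye(2)))                 # True
print(np.allclose(Q.T @ A @ Q, np.diag([5.0, 0.0])))   # True
```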

Example: Verification with a repeated eigenvalue

$A = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 5 \end{pmatrix}$: eigenvalue $3$ has algebraic multiplicity $2$, with $E_3 = \operatorname{span}\{e_1, e_2\}$. Any orthonormal basis of $E_3$ works (e.g., $e_1, e_2$). Taking $Q = I$ gives $Q^T A Q = A = \operatorname{diag}(3, 3, 5)$ ✓.

A non-diagonal example with a repeated eigenvalue: $A = \begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix}$. Eigenvalues: $4$ (once) and $1$ (twice). Eigenvector for $4$: $(1,1,1)/\sqrt{3}$. Eigenspace for $1$: $\{(x,y,z) : x+y+z = 0\}$, a $2$-dimensional space. Choose the orthonormal basis $(1,-1,0)/\sqrt{2}$ and $(1,1,-2)/\sqrt{6}$.
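With a repeated eigenvalue, `np.linalg.eigh` still returns an orthonormal eigenbasis; the basis chosen inside the $2$-dimensional eigenspace is just one valid choice among many. A quick check:

```python
import numpy as np

# Repeated-eigenvalue example: eigenvalues are 1, 1, 4 and the returned
# eigenvector matrix Q is orthogonal with Q^T A Q diagonal.
A = np.array([[2.0, 1.0, 1.0], [1.0, 2.0, 1.0], [1.0, 1.0, 2.0]])
vals, Q = np.linalg.eigh(A)
print(np.round(vals, 10))                        # [1. 1. 4.]
print(np.allclose(Q.T @ A @ Q, np.diag(vals)))   # True
```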

Example: Verification for a negative definite matrix

$A = \begin{pmatrix} -5 & 2 \\ 2 & -2 \end{pmatrix}$. Eigenvalues: $\frac{-7 \pm \sqrt{9+16}}{2} = \frac{-7 \pm 5}{2}$, so $\lambda_1 = -1$ and $\lambda_2 = -6$. Both negative (negative definite ✓).

Eigenvectors: $(1, 2)/\sqrt{5}$ for $\lambda = -1$ (check: $A(1,2)^T = (-1,-2)^T$) and $(-2, 1)/\sqrt{5}$ for $\lambda = -6$ (check: $A(-2,1)^T = (12,-6)^T = -6(-2,1)^T$).

$Q = \frac{1}{\sqrt{5}}\begin{pmatrix} 1 & -2 \\ 2 & 1 \end{pmatrix}$, $Q^T A Q = \operatorname{diag}(-1, -6)$ ✓.
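The corresponding NumPy check for this matrix (with the eigenvectors $(1,2)/\sqrt{5}$ and $(-2,1)/\sqrt{5}$):

```python
import numpy as np

# Negative definite example: Q orthogonally diagonalizes A with both
# eigenvalues negative.
A = np.array([[-5.0, 2.0], [2.0, -2.0]])
Q = np.array([[1.0, -2.0], [2.0, 1.0]]) / np.sqrt(5)
print(np.allclose(Q.T @ Q, np.eye(2)))                   # True
print(np.allclose(Q.T @ A @ Q, np.diag([-1.0, -6.0])))   # True
```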

Example: Spectral theorem for a 4×4 matrix

$A = I + ee^T$ where $e = (1, 1, 1, 1)^T$. Then $A$ has eigenvalues $1$ (with multiplicity $3$, eigenspace $e^\perp$) and $5$ (with multiplicity $1$, eigenspace $\operatorname{span}\{e\}$).

Since $\|e\| = 2$, any orthonormal basis whose first vector is $e/2$ gives $Q^T A Q = \operatorname{diag}(5, 1, 1, 1)$.
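A check of this rank-one update example (assuming NumPy):

```python
import numpy as np

# A = I + e e^T: eigenvalue 5 on span{e}, eigenvalue 1 on e-perp.
e = np.ones(4)
A = np.eye(4) + np.outer(e, e)
vals, Q = np.linalg.eigh(A)
print(np.round(vals, 10))                        # [1. 1. 1. 5.]
print(np.allclose(Q @ np.diag(vals) @ Q.T, A))   # True
```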


Why the theorem fails for non-symmetric matrices

Example: Orthogonal diagonalization fails without symmetry

$A = \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}$: not symmetric. Eigenvalues $1, 2$, eigenvectors $(1, 0)$ and $(1, 1)$.

$(1, 0) \cdot (1, 1) = 1 \neq 0$: the eigenvectors are not orthogonal.

$A$ is diagonalizable (distinct eigenvalues), but not orthogonally diagonalizable. Orthogonal diagonalizability requires $A = A^T$.
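Numerically, the failure shows up as non-orthogonal eigenvector columns from the general-purpose `np.linalg.eig` (an illustration):

```python
import numpy as np

# Non-symmetric A: the unit eigenvectors (1,0) and (1,1)/sqrt(2) are not
# orthogonal, so no orthogonal Q can diagonalize A.
A = np.array([[1.0, 1.0], [0.0, 2.0]])
vals, vecs = np.linalg.eig(A)
print(np.sort(vals))                    # [1. 2.]
print(abs(vecs[:, 0] @ vecs[:, 1]))     # ~0.707, not 0
```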

Example: Non-diagonalizable non-symmetric matrix

$A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$: $A^T \neq A$, and $A$ is not even diagonalizable (the single eigenvalue $1$ has geometric multiplicity $1$). This cannot happen for symmetric matrices -- the spectral theorem guarantees full diagonalizability.


Summary

Remark: Two proofs, one theorem

The Spectral Theorem admits two clean proofs:

Inductive proof: Peel off one eigenvector at a time. The key is that the orthogonal complement of an eigenvector is invariant under a symmetric operator, allowing the induction to proceed.

Schur decomposition proof: Start with the existence of the Schur form (upper triangular, unitarily similar to $A$). Symmetry forces the upper triangular matrix to be diagonal.

Both proofs use the same essential facts: real eigenvalues (from $\lambda = \bar{\lambda}$) and orthogonality of eigenspaces (from $A = A^T$). The result is the most perfect form a matrix can take: a real diagonal matrix in an orthonormal eigenbasis.