Diagonalization

A matrix is diagonalizable if it is similar to a diagonal matrix -- equivalently, if there exists a basis of eigenvectors. Diagonalization is the most desirable form a matrix can take: it reduces matrix computations to scalar operations and reveals the geometric action of the transformation as scaling along independent directions.


Definition

Definition 5.8 (Diagonalizable matrix)

A matrix $A \in M_{n \times n}(F)$ is diagonalizable (over $F$) if there exists an invertible matrix $P$ such that

$$P^{-1}AP = D = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n),$$

where $\lambda_1, \ldots, \lambda_n$ are the eigenvalues of $A$. Equivalently, $A = PDP^{-1}$.

The columns of $P$ are eigenvectors of $A$: $Ap_i = \lambda_i p_i$, where $p_i$ is the $i$-th column of $P$.

Theorem 5.3 (Diagonalizability criterion)

$A \in M_{n \times n}(F)$ is diagonalizable over $F$ if and only if:

  1. The characteristic polynomial $p_A(\lambda)$ splits completely over $F$ (all roots lie in $F$), and
  2. For each eigenvalue $\lambda$, the geometric multiplicity equals the algebraic multiplicity: $m_g(\lambda) = m_a(\lambda)$.

Equivalently, $A$ is diagonalizable iff $F^n$ has a basis consisting of eigenvectors of $A$.


Diagonalization procedure

Example (Diagonalizing a 2x2 matrix)

$A = \begin{pmatrix} 4 & 1 \\ 2 & 3 \end{pmatrix}$.

Step 1: Characteristic polynomial: $\lambda^2 - 7\lambda + 10 = (\lambda - 5)(\lambda - 2)$.

Step 2: Eigenvalues: $\lambda_1 = 5$, $\lambda_2 = 2$.

Step 3: Eigenvectors:

  • $\lambda = 5$: $(A - 5I)v = 0$ gives $\begin{pmatrix} -1 & 1 \\ 2 & -2 \end{pmatrix}v = 0$, so $v_1 = (1, 1)$.
  • $\lambda = 2$: $(A - 2I)v = 0$ gives $\begin{pmatrix} 2 & 1 \\ 2 & 1 \end{pmatrix}v = 0$, so $v_2 = (1, -2)$.

Step 4: $P = \begin{pmatrix} 1 & 1 \\ 1 & -2 \end{pmatrix}$, $D = \begin{pmatrix} 5 & 0 \\ 0 & 2 \end{pmatrix}$, and $A = PDP^{-1}$.
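The factorization can be sanity-checked directly. A small self-contained Python sketch (standard library only; the `matmul` and `inv2` helpers are defined inline, and `fractions` keeps the arithmetic exact):

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(X):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[ X[1][1] / det, -X[0][1] / det],
            [-X[1][0] / det,  X[0][0] / det]]

A = [[Fraction(4), Fraction(1)], [Fraction(2), Fraction(3)]]
P = [[Fraction(1), Fraction(1)], [Fraction(1), Fraction(-2)]]
D = [[Fraction(5), Fraction(0)], [Fraction(0), Fraction(2)]]

# Reassembling P D P^{-1} should recover A exactly.
reconstructed = matmul(matmul(P, D), inv2(P))
```

Using exact rationals avoids the floating-point noise that would otherwise make the equality check fuzzy.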

Example (Diagonalizing a 3x3 matrix)

$A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & -2 \\ 0 & 1 & 4 \end{pmatrix}$.

Characteristic polynomial: $(1 - \lambda)[(1-\lambda)(4-\lambda) + 2] = (1-\lambda)(\lambda^2 - 5\lambda + 6) = (1-\lambda)(\lambda - 2)(\lambda - 3)$.

Eigenvalues: $1, 2, 3$ (all distinct, so $A$ is automatically diagonalizable).

Eigenvectors: $v_1 = (1, 0, 0)$ for $\lambda = 1$; $v_2 = (0, -2, 1)$ for $\lambda = 2$; $v_3 = (0, -1, 1)$ for $\lambda = 3$.

$P = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -2 & -1 \\ 0 & 1 & 1 \end{pmatrix}$, $D = \operatorname{diag}(1, 2, 3)$.


When diagonalization fails

Example (A non-diagonalizable 2x2 matrix)

$A = \begin{pmatrix} 3 & 1 \\ 0 & 3 \end{pmatrix}$.

Characteristic polynomial: $(\lambda - 3)^2$. Eigenvalue $\lambda = 3$ with $m_a = 2$.

$A - 3I = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$, so $\ker(A - 3I) = \operatorname{span}\{(1, 0)\}$ and $m_g = 1$.

Since $m_g = 1 < 2 = m_a$, $A$ is not diagonalizable. There is no basis of $\mathbb{R}^2$ consisting of eigenvectors.
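The one-dimensional kernel is easy to exhibit concretely. A minimal Python check (the `apply` helper is defined inline) that $(1,0)$ lies in $\ker(A - 3I)$ while $(0,1)$ does not:

```python
A = [[3, 1], [0, 3]]
M = [[A[0][0] - 3, A[0][1]],
     [A[1][0], A[1][1] - 3]]  # A - 3I = [[0,1],[0,0]]

def apply(X, v):
    """Apply a 2x2 matrix to a vector."""
    return [X[0][0] * v[0] + X[0][1] * v[1],
            X[1][0] * v[0] + X[1][1] * v[1]]

in_kernel_e1 = apply(M, [1, 0]) == [0, 0]   # (1,0) is an eigenvector
in_kernel_e2 = apply(M, [0, 1]) == [0, 0]   # (0,1) is not
```

Since only one of the two standard basis vectors is killed by $A - 3I$, the eigenspace is a line, confirming $m_g = 1$.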

Example (Non-diagonalizable over $\mathbb{R}$ but diagonalizable over $\mathbb{C}$)

$A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$.

Over $\mathbb{R}$: $p_A(\lambda) = \lambda^2 + 1$ has no real roots, so $A$ is not diagonalizable over $\mathbb{R}$.

Over $\mathbb{C}$: eigenvalues $i, -i$ with eigenvectors $(1, -i)$ and $(1, i)$. Then $P^{-1}AP = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$.
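The complex eigenpair claims can be verified with Python's built-in complex numbers (a self-contained sketch; the `apply` helper is defined inline, and a tolerance absorbs floating-point roundoff):

```python
A = [[0, -1], [1, 0]]

def apply(X, v):
    """Apply a 2x2 matrix to a vector."""
    return [X[0][0] * v[0] + X[0][1] * v[1],
            X[1][0] * v[0] + X[1][1] * v[1]]

v1, lam1 = [1, -1j],  1j   # claimed eigenpair for  i
v2, lam2 = [1,  1j], -1j   # claimed eigenpair for -i

# Check A v = lambda v entrywise.
err1 = max(abs(a - lam1 * b) for a, b in zip(apply(A, v1), v1))
err2 = max(abs(a - lam2 * b) for a, b in zip(apply(A, v2), v2))
```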

Example (Nonzero nilpotent matrices are never diagonalizable)

If $N \neq 0$ is nilpotent with $N^k = 0$, then the only eigenvalue of $N$ is $0$. If $N$ were diagonalizable, $D$ would be the zero matrix, giving $N = P \cdot 0 \cdot P^{-1} = 0$ and contradicting $N \neq 0$.

Example: $N = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}$ has $N^3 = 0$, eigenvalue $0$ with $m_a = 3$ but $m_g = 1$.
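Both nilpotency claims for this $N$ check out by direct multiplication (self-contained Python; the `matmul` helper is defined inline):

```python
def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

N = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]
Z = [[0] * 3 for _ in range(3)]

N2 = matmul(N, N)
N3 = matmul(N2, N)
# N is nilpotent of index 3: N^2 is nonzero but N^3 vanishes.
```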


Sufficient conditions for diagonalizability

Theorem (Distinct eigenvalues imply diagonalizability)

If $A \in M_{n \times n}(F)$ has $n$ distinct eigenvalues in $F$, then $A$ is diagonalizable.

Example (Distinct eigenvalues guarantee diagonalizability)

$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$. Characteristic polynomial: $p(\lambda) = \lambda^2 - 5\lambda - 2$, with discriminant $25 + 8 = 33 > 0$.

Two distinct real eigenvalues $\frac{5 \pm \sqrt{33}}{2} \approx 5.37, -0.37$. Since they are distinct, $A$ is diagonalizable over $\mathbb{R}$.

Theorem (Symmetric matrices are diagonalizable)

Every real symmetric matrix ($A^T = A$) is diagonalizable over $\mathbb{R}$. Moreover, it is orthogonally diagonalizable: there exists an orthogonal matrix $Q$ ($Q^T Q = I$) such that $Q^T A Q = D$.

Example (Orthogonal diagonalization of a symmetric matrix)

$A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$.

Eigenvalues: $3, 1$. Eigenvectors: $\frac{1}{\sqrt{2}}(1, 1)$ and $\frac{1}{\sqrt{2}}(1, -1)$.

$Q = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$, $Q^TAQ = \begin{pmatrix} 3 & 0 \\ 0 & 1 \end{pmatrix}$.
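Both the orthogonality of $Q$ and the diagonalization can be checked numerically (pure standard-library Python with inline helpers; a small tolerance handles the irrational entries):

```python
from math import sqrt

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

s = 1 / sqrt(2)
A = [[2, 1], [1, 2]]
Q = [[s, s], [s, -s]]

QtQ = matmul(transpose(Q), Q)              # should be the identity
QtAQ = matmul(transpose(Q), matmul(A, Q))  # should be diag(3, 1)
```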


Powers and functions of diagonalizable matrices

Theorem (Powers via diagonalization)

If $A = PDP^{-1}$ where $D = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$, then

$$A^k = PD^kP^{-1} = P \operatorname{diag}(\lambda_1^k, \ldots, \lambda_n^k) P^{-1}.$$

Example (Computing $A^{10}$)

$A = \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix}$, eigenvalues $1, 2$, $P = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, $P^{-1} = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}$.

$A^{10} = P \begin{pmatrix} 1 & 0 \\ 0 & 1024 \end{pmatrix} P^{-1} = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 1024 \end{pmatrix}\begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1023 \\ 0 & 1024 \end{pmatrix}$.
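The closed-form answer can be compared against brute-force repeated multiplication (self-contained Python with an inline `matmul` helper; exact integer arithmetic):

```python
def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, 1], [0, 2]]
M = [[1, 0], [0, 1]]  # identity
for _ in range(10):
    M = matmul(M, A)  # after the loop, M = A^10
```

For a single small power the loop is fine; the point of diagonalization is that $A^k$ costs the same for any $k$ once $P$ and $D$ are known.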

Example (Fibonacci numbers via diagonalization)

The Fibonacci recurrence $F_{n+2} = F_{n+1} + F_n$ is encoded by $\begin{pmatrix} F_{n+1} \\ F_n \end{pmatrix} = A^n \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, where $A = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$.

Eigenvalues: $\phi = \frac{1+\sqrt{5}}{2}$ and $\hat{\phi} = \frac{1-\sqrt{5}}{2}$.

By diagonalization, $A^n = PD^nP^{-1}$, leading to Binet's formula:

$$F_n = \frac{\phi^n - \hat{\phi}^n}{\sqrt{5}}.$$
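Binet's formula can be checked against the recurrence itself (standard-library Python; rounding absorbs the tiny $\hat{\phi}^n$ term, since $|\hat{\phi}| < 1$, along with floating-point error):

```python
from math import sqrt

phi = (1 + sqrt(5)) / 2
phi_hat = (1 - sqrt(5)) / 2

def fib_binet(n):
    """F_n via Binet's formula, rounded to the nearest integer."""
    return round((phi ** n - phi_hat ** n) / sqrt(5))

# Iterative Fibonacci for comparison: F_0 = 0, F_1 = 1.
fibs = [0, 1]
for _ in range(30):
    fibs.append(fibs[-1] + fibs[-2])
```

The rounding trick works for moderate $n$; for very large $n$ the floating-point powers lose precision and exact integer matrix powers are preferable.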

Example (Matrix exponential via diagonalization)

If $A = PDP^{-1}$, then $e^A = Pe^DP^{-1} = P\operatorname{diag}(e^{\lambda_1}, \ldots, e^{\lambda_n})P^{-1}$.

For $A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$: eigenvalues $i, -i$, so $e^A = P\begin{pmatrix} e^i & 0 \\ 0 & e^{-i} \end{pmatrix}P^{-1}$. Converting back to real coordinates gives $e^A = \begin{pmatrix} \cos 1 & \sin 1 \\ -\sin 1 & \cos 1 \end{pmatrix}$, a rotation by $1$ radian.
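An independent way to confirm the rotation is the defining Taylor series $e^A = \sum_k A^k/k!$, truncated; at this scale about two dozen terms are ample. A pure-Python sketch with an inline `matmul` helper:

```python
from math import cos, sin

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[0.0, 1.0], [-1.0, 0.0]]
E = [[0.0, 0.0], [0.0, 0.0]]     # running sum for e^A
term = [[1.0, 0.0], [0.0, 1.0]]  # current term, starting at A^0 / 0!

for k in range(1, 25):
    E = [[E[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    term = [[x / k for x in row] for row in matmul(term, A)]  # next: A^k / k!

expected = [[cos(1), sin(1)], [-sin(1), cos(1)]]
```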


Simultaneous diagonalization

Theorem (Simultaneous diagonalization)

Two diagonalizable matrices $A, B \in M_{n \times n}(F)$ are simultaneously diagonalizable (i.e., there exists $P$ such that both $P^{-1}AP$ and $P^{-1}BP$ are diagonal) if and only if $AB = BA$.

Example (Commuting matrices are simultaneously diagonalizable)

$A = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}$, $B = \begin{pmatrix} 3 & 0 \\ 0 & 5 \end{pmatrix}$. Since both are diagonal, $AB = BA$, and they are already simultaneously diagonalized (with $P = I$).

A nontrivial example: $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$, $B = \begin{pmatrix} 4 & -1 \\ -1 & 4 \end{pmatrix}$. Check: $AB = \begin{pmatrix} 7 & 2 \\ 2 & 7 \end{pmatrix} = BA$. Common eigenvectors: $(1,1)$ and $(1,-1)$. In this basis, $A = \operatorname{diag}(3, 1)$ and $B = \operatorname{diag}(3, 5)$.
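The nontrivial pair can be checked end to end (self-contained Python; inline `matmul` helper, exact arithmetic via `fractions`, with $P^{-1}$ written out by hand):

```python
from fractions import Fraction as Fr

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[Fr(2), Fr(1)], [Fr(1), Fr(2)]]
B = [[Fr(4), Fr(-1)], [Fr(-1), Fr(4)]]
P = [[Fr(1), Fr(1)], [Fr(1), Fr(-1)]]            # columns (1,1) and (1,-1)
P_inv = [[Fr(1, 2), Fr(1, 2)], [Fr(1, 2), Fr(-1, 2)]]

commute = matmul(A, B) == matmul(B, A)
DA = matmul(P_inv, matmul(A, P))  # expect diag(3, 1)
DB = matmul(P_inv, matmul(B, P))  # expect diag(3, 5)
```

The same change of basis diagonalizes both matrices precisely because they share a full set of common eigenvectors.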

Example (Non-commuting diagonalizable matrices)

$A = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}$, $B = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$. Then $AB = \begin{pmatrix} 0 & 1 \\ 2 & 0 \end{pmatrix} \neq \begin{pmatrix} 0 & 2 \\ 1 & 0 \end{pmatrix} = BA$.

Both are diagonalizable (distinct eigenvalues), but they cannot be simultaneously diagonalized. Their eigenvector bases differ: $A$ uses $e_1, e_2$ while $B$ uses $(1,1), (1,-1)$.


Applications of diagonalization

Example (Systems of differential equations)

Consider the system $\mathbf{x}'(t) = A\mathbf{x}(t)$ with $A = \begin{pmatrix} -3 & 1 \\ 1 & -3 \end{pmatrix}$.

Eigenvalues: $-2, -4$. Eigenvectors: $(1,1), (1,-1)$.

General solution: $\mathbf{x}(t) = c_1 e^{-2t}\begin{pmatrix} 1 \\ 1 \end{pmatrix} + c_2 e^{-4t}\begin{pmatrix} 1 \\ -1 \end{pmatrix}$.

Both eigenvalues are negative, so all solutions decay to $0$ as $t \to \infty$ (the origin is a stable node).
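Each term of the general solution satisfies $\mathbf{x}' = A\mathbf{x}$ because $\frac{d}{dt}\, e^{\lambda t} v = \lambda e^{\lambda t} v = A(e^{\lambda t} v)$ whenever $Av = \lambda v$, so the only facts to verify are the eigenpairs themselves. A minimal check (inline `apply` helper):

```python
A = [[-3, 1], [1, -3]]

def apply(X, v):
    """Apply a 2x2 matrix to a vector."""
    return [X[0][0] * v[0] + X[0][1] * v[1],
            X[1][0] * v[0] + X[1][1] * v[1]]

pair1 = apply(A, [1, 1]) == [-2 * 1, -2 * 1]      # eigenvalue -2
pair2 = apply(A, [1, -1]) == [-4 * 1, -4 * (-1)]  # eigenvalue -4
```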

Example (Steady states of Markov chains)

$A = \begin{pmatrix} 0.8 & 0.3 \\ 0.2 & 0.7 \end{pmatrix}$ (transition matrix, columns summing to $1$). Eigenvalues: $\lambda_1 = 1$, $\lambda_2 = 0.5$.

Eigenvectors: $v_1 = (3, 2)$ for $\lambda_1 = 1$, $v_2 = (1, -1)$ for $\lambda_2 = 0.5$.

As $n \to \infty$, $A^n$ converges to the projection onto $E_1$. The steady-state distribution is $\pi = \frac{1}{5}(3, 2)$: $60\%$ in state $1$, $40\%$ in state $2$.
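Convergence to the steady state can be observed by just powering the matrix (floating-point Python sketch with an inline `matmul` helper):

```python
def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[0.8, 0.3], [0.2, 0.7]]
M = [[1.0, 0.0], [0.0, 1.0]]
for _ in range(50):
    M = matmul(M, A)  # M = A^50

# Every column of A^n approaches the steady state (0.6, 0.4),
# since the lambda = 0.5 component shrinks like 0.5^n.
col0 = (M[0][0], M[1][0])
col1 = (M[0][1], M[1][1])
```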


Summary

Remark (Diagonalization as the ideal decomposition)

Diagonalization decomposes $V$ into a direct sum of eigenspaces: $V = E_{\lambda_1} \oplus E_{\lambda_2} \oplus \cdots \oplus E_{\lambda_k}$. In this decomposition:

  • $T$ acts on each $E_{\lambda_i}$ by scalar multiplication by $\lambda_i$.
  • Powers $T^n$ act by $\lambda_i^n$ on each eigenspace.
  • Functions $f(T)$ act by $f(\lambda_i)$ on each eigenspace.
  • Matrix equations reduce to scalar equations.

When diagonalization fails ($m_g < m_a$ for some eigenvalue), the next best option is the Jordan normal form, which introduces Jordan blocks to handle the deficiency.