Cayley-Hamilton Theorem

The Cayley--Hamilton theorem states that every square matrix satisfies its own characteristic polynomial. This deep result connects the polynomial algebra of eigenvalues to the matrix algebra itself, and has far-reaching consequences for computing matrix functions, finding inverse matrices, and understanding the structure of linear operators.

Statement

Theorem 5.4 (Cayley-Hamilton Theorem)

Let $A \in M_{n \times n}(F)$ be a square matrix over a field $F$, and let $p_A(\lambda) = \det(A - \lambda I)$ be its characteristic polynomial. Then:

$p_A(A) = 0.$

That is, if $p_A(\lambda) = (-1)^n \lambda^n + c_{n-1}\lambda^{n-1} + \cdots + c_1 \lambda + c_0$, then:

$(-1)^n A^n + c_{n-1}A^{n-1} + \cdots + c_1 A + c_0 I = 0.$

Remark (What the theorem does NOT say)

The theorem does not follow from simply "substituting $A$ for $\lambda$" in $\det(A - \lambda I)$. The expression $\det(A - AI) = \det(0) = 0$ is trivially true but irrelevant: it is a scalar identity, whereas $p_A(A) = 0$ is an identity between $n \times n$ matrices. The content is that when you expand $p_A(\lambda)$ as a polynomial in $\lambda$, collect the scalar coefficients, and then form the matrix polynomial $p_A(A) = (-1)^n A^n + c_{n-1}A^{n-1} + \cdots + c_1 A + c_0 I$, the result is the zero matrix.
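
The "expand, collect coefficients, then substitute" procedure can be sketched in code. The following is a minimal illustration, not a reference implementation: it computes the coefficients with the Faddeev-LeVerrier recursion (a technique chosen here for convenience, not one used in the text), works with the monic polynomial $\det(\lambda I - A) = (-1)^n p_A(\lambda)$, and assumes exact rational entries so that division by $k$ is valid. The function names are hypothetical.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly_monic(A):
    """Coefficients [c_0, ..., c_{n-1}, 1] of det(lambda*I - A).

    Uses the Faddeev-LeVerrier recursion, which divides by k and so
    assumes a field of characteristic 0 (here: the rationals).
    """
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    coeffs = [Fraction(0)] * n + [Fraction(1)]
    M = [[Fraction(0)] * n for _ in range(n)]
    for k in range(1, n + 1):
        AM = mat_mul(A, M)
        c = coeffs[n - k + 1]
        # M_k = A * M_{k-1} + c_{n-k+1} * I
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)]
             for i in range(n)]
        AM = mat_mul(A, M)
        # c_{n-k} = -tr(A * M_k) / k
        coeffs[n - k] = -sum(AM[i][i] for i in range(n)) / k
    return coeffs


def eval_poly_at_matrix(coeffs, A):
    """Evaluate sum_i coeffs[i] * A^i by Horner's rule."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    result = [[Fraction(0)] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += c  # add c * I
    return result


A = [[1, 2], [3, 4]]
coeffs = char_poly_monic(A)
residual = eval_poly_at_matrix(coeffs, A)  # expected to be the zero matrix
```

Because the monic polynomial and $p_A$ differ only by the sign $(-1)^n$, one vanishes at $A$ exactly when the other does.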

Verification in small cases

Example (Cayley-Hamilton for a 2x2 matrix)

$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ .

Characteristic polynomial (monic): $p(\lambda) = \lambda^2 - 5\lambda - 2$ .

Verify $A^2 - 5A - 2I = 0$ :

$A^2 = \begin{pmatrix} 7 & 10 \\ 15 & 22 \end{pmatrix}$ , $5A = \begin{pmatrix} 5 & 10 \\ 15 & 20 \end{pmatrix}$ , $2I = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$ .

$A^2 - 5A - 2I = \begin{pmatrix} 7-5-2 & 10-10-0 \\ 15-15-0 & 22-20-2 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$ .

Example (Cayley-Hamilton for a 3x3 matrix)

$A = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix}$ .

Characteristic polynomial: $p(\lambda) = (2 - \lambda)^3 = -\lambda^3 + 6\lambda^2 - 12\lambda + 8$ , or in monic form: $p(\lambda) = \lambda^3 - 6\lambda^2 + 12\lambda - 8$ .

Verify: $A^3 - 6A^2 + 12A - 8I = 0$ .

$A - 2I = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}$ , $(A-2I)^2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ , $(A-2I)^3 = 0$ .

So $p(A) = (A - 2I)^3 = 0$ .
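
The last step uses that $A$ and $I$ commute, so the binomial expansion applies to the matrix expression just as it does to scalars. As a worked check:

$(A - 2I)^3 = A^3 - 3 \cdot 2\, A^2 + 3 \cdot 4\, A - 8I = A^3 - 6A^2 + 12A - 8I.$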

Example (Cayley-Hamilton for diagonal matrices)

If $A = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$, then in monic form $p(\lambda) = \prod_i (\lambda - \lambda_i)$, and $p(A) = \prod_i (A - \lambda_i I) = \operatorname{diag}\left(\prod_i (\lambda_1 - \lambda_i), \ldots, \prod_i (\lambda_n - \lambda_i)\right) = 0$, since the $j$-th diagonal entry contains the factor $(\lambda_j - \lambda_j) = 0$.

Applications

Theorem (Matrix inverse via Cayley-Hamilton)

If $A$ is invertible ($\det A \neq 0$), the Cayley--Hamilton theorem provides an explicit formula for $A^{-1}$ as a polynomial in $A$.

For a $2 \times 2$ matrix, $p(\lambda) = \lambda^2 - \operatorname{tr}(A)\lambda + \det(A)$, so Cayley--Hamilton gives $A^2 - \operatorname{tr}(A) \cdot A + \det(A) \cdot I = 0$. Multiplying by $A^{-1}$ and solving for it:

$A^{-1} = \frac{1}{\det(A)}(\operatorname{tr}(A) \cdot I - A).$
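
The same manipulation works for any invertible $n \times n$ matrix. As a sketch in the notation of the theorem statement, where $c_0 = p_A(0) = \det A \neq 0$: move $c_0 I$ to the other side of $p_A(A) = 0$ and factor out $A$:

$A\left((-1)^n A^{n-1} + c_{n-1}A^{n-2} + \cdots + c_1 I\right) = -c_0 I,$

so

$A^{-1} = -\frac{1}{\det A}\left((-1)^n A^{n-1} + c_{n-1}A^{n-2} + \cdots + c_1 I\right).$

For $n = 2$, where $c_1 = -\operatorname{tr}(A)$, this reduces to the $2 \times 2$ formula above.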

Example (Computing inverse via Cayley-Hamilton)

$A = \begin{pmatrix} 4 & 3 \\ 2 & 1 \end{pmatrix}$ , $\operatorname{tr}(A) = 5$ , $\det(A) = -2$ .

$A^{-1} = \frac{1}{-2}(5I - A) = \frac{1}{-2}\begin{pmatrix} 1 & -3 \\ -2 & 4 \end{pmatrix} = \begin{pmatrix} -1/2 & 3/2 \\ 1 & -2 \end{pmatrix}$ .

Verify: $AA^{-1} = \begin{pmatrix} 4 & 3 \\ 2 & 1 \end{pmatrix}\begin{pmatrix} -1/2 & 3/2 \\ 1 & -2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ .
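
The $2 \times 2$ formula translates directly into code. The following is a minimal sketch, assuming integer or rational entries and using exact `Fraction` arithmetic. The function name is hypothetical.

```python
from fractions import Fraction


def inverse_2x2_cayley_hamilton(A):
    """Return A^{-1} = (tr(A) * I - A) / det(A) for a 2x2 matrix A."""
    (a, b), (c, d) = A
    tr = Fraction(a + d)
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular; the formula does not apply")
    # Entries of tr(A) * I - A, each divided by det(A).
    return [[(tr - a) / det, -b / det],
            [-c / det, (tr - d) / det]]


A_inv = inverse_2x2_cayley_hamilton([[4, 3], [2, 1]])
```

For the matrix of this example, `A_inv` holds the same entries as computed by hand: $-1/2$, $3/2$, $1$, $-2$.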

Example (Reducing higher powers of A)

For $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$ with $p(\lambda) = \lambda^2 - 5\lambda - 2$ , the relation $A^2 = 5A + 2I$ lets us express any power of $A$ as a linear combination of $A$ and $I$ :

$A^3 = 5A^2 + 2A = 5(5A + 2I) + 2A = 27A + 10I$ .
$A^4 = 27A^2 + 10A = 27(5A + 2I) + 10A = 145A + 54I$ .
In general, $A^k = a_k A + b_k I$ for $k \geq 1$, where $a_1 = 1$, $b_1 = 0$, and $a_{k+1} = 5a_k + b_k$, $b_{k+1} = 2a_k$ (multiply by $A$ and substitute $A^2 = 5A + 2I$).
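
The coefficient recurrence can be checked against direct matrix multiplication. The following is a small sketch for this specific matrix. It starts the recurrence from $A^0 = I$, i.e. coefficients $(0, 1)$, which is a convention added here. The function names are hypothetical.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def power_coefficients(k):
    """Return (a, b) with A^k = a*A + b*I for A = [[1, 2], [3, 4]]."""
    a, b = 0, 1  # A^0 = 0*A + 1*I
    for _ in range(k):
        # Multiply by A and replace A^2 with 5A + 2I.
        a, b = 5 * a + b, 2 * a
    return a, b


def power_direct(A, k):
    """Compute A^k by repeated multiplication."""
    n = len(A)
    P = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        P = mat_mul(P, A)
    return P


def recurrence_matches(k):
    """Check that a*A + b*I equals the directly computed power A^k."""
    A = [[1, 2], [3, 4]]
    a, b = power_coefficients(k)
    combo = [[a * A[i][j] + (b if i == j else 0) for j in range(2)]
             for i in range(2)]
    return combo == power_direct(A, k)
```

The recurrence needs only two integer updates per step, whereas direct multiplication needs a full matrix product.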

Example (Cayley-Hamilton for nilpotent matrices)

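A nilpotent matrix $N$ with $N^k = 0$ has $0$ as its only eigenvalue, even over an algebraic closure of $F$ (if $Nv = \mu v$ with $v \neq 0$, then $0 = N^k v = \mu^k v$, so $\mu = 0$). Hence its characteristic polynomial is $p(\lambda) = (-1)^n \lambda^n$, and Cayley--Hamilton gives $(-1)^n N^n = 0$, i.e., $N^n = 0$. This means: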

The index of nilpotency of an $n \times n$ nilpotent matrix is at most $n$ .

For $N = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}$: $N^2 \neq 0$ but $N^3 = 0$. Since $N$ is strictly upper triangular, $p(\lambda) = -\lambda^3$, so Cayley--Hamilton guarantees $N^3 = 0$ without computing any powers.
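
The bound on the index of nilpotency also gives a finite test for nilpotency: it suffices to check powers up to the $n$-th. The following is a minimal sketch, assuming exact (integer or rational) entries. The function name is hypothetical.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def nilpotency_index(N):
    """Smallest k >= 1 with N^k = 0, or None if N is not nilpotent.

    By Cayley-Hamilton, a nilpotent n x n matrix satisfies N^n = 0,
    so no power beyond the n-th needs to be examined.
    """
    n = len(N)
    P = [row[:] for row in N]  # P holds N^k at the start of iteration k
    for k in range(1, n + 1):
        if all(x == 0 for row in P for x in row):
            return k
        P = mat_mul(P, N)
    return None


shift = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
index = nilpotency_index(shift)
```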

Cayley--Hamilton and the minimal polynomial

Definition 5.9 (Minimal polynomial)

The minimal polynomial $m_A(\lambda)$ of $A$ is the monic polynomial of smallest degree such that $m_A(A) = 0$ .

By Cayley--Hamilton, $\deg m_A \leq n$ . The minimal polynomial divides the characteristic polynomial: $m_A \mid p_A$ .
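
The divisibility claim follows from polynomial division. As a short derivation (the names $q$ and $r$ are introduced here for the quotient and remainder), write

$p_A(\lambda) = q(\lambda)\, m_A(\lambda) + r(\lambda), \qquad r = 0 \ \text{or} \ \deg r < \deg m_A.$

Substituting $A$ and using $p_A(A) = 0$ and $m_A(A) = 0$ gives

$r(A) = p_A(A) - q(A)\, m_A(A) = 0.$

If $r$ were nonzero, rescaling it to be monic would give an annihilating polynomial of smaller degree than $m_A$, contradicting minimality. Hence $r = 0$ and $m_A \mid p_A$.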

Example (Minimal polynomial vs characteristic polynomial)

$A = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}$ : $p_A(\lambda) = (\lambda - 2)(\lambda - 3)$ . Since $A$ is diagonalizable with distinct eigenvalues, $m_A(\lambda) = (\lambda - 2)(\lambda - 3) = p_A(\lambda)$ .
$A = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} = 2I$ : $p_A(\lambda) = (\lambda - 2)^2$ . But $(A - 2I) = 0$ , so $m_A(\lambda) = \lambda - 2$ . Here $m_A \neq p_A$ .
$A = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}$ : $p_A(\lambda) = (\lambda - 2)^2$ . Check: $(A - 2I) = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \neq 0$ . So $m_A(\lambda) = (\lambda - 2)^2 = p_A(\lambda)$ .

Example (Minimal polynomial of block diagonal)

$A = \operatorname{diag}\left(\begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}, \begin{pmatrix} 3 & 0 \\ 0 & 3 \end{pmatrix}\right)$ .

$p_A(\lambda) = (\lambda - 2)^2 (\lambda - 3)^2$ , but $m_A(\lambda) = (\lambda - 2)^2(\lambda - 3)$ since the $3$ -block is $3I$ (killed by $\lambda - 3$ ) while the $2$ -block requires $(\lambda - 2)^2$ .
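
Since $m_A$ divides $p_A$, it must be one of the monic factors $(\lambda - 2)^i (\lambda - 3)^j$ with $0 \leq i, j \leq 2$, so the claim can be checked by brute force. The following sketch is specific to this example: the roots $2$ and $3$ are hard-coded, and the function names are hypothetical.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def shifted(A, c):
    """Return A - c*I."""
    n = len(A)
    return [[A[i][j] - (c if i == j else 0) for j in range(n)]
            for i in range(n)]


def mat_pow(X, e):
    """Return X^e by repeated multiplication (X^0 = I)."""
    n = len(X)
    P = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(e):
        P = mat_mul(P, X)
    return P


def is_zero(X):
    return all(x == 0 for row in X for x in row)


def minimal_exponents(A, max_exp=2):
    """Lowest-degree (i, j) with (A - 2I)^i (A - 3I)^j = 0, 0 <= i, j <= max_exp."""
    B, C = shifted(A, 2), shifted(A, 3)
    hits = [(i, j)
            for i in range(max_exp + 1)
            for j in range(max_exp + 1)
            if is_zero(mat_mul(mat_pow(B, i), mat_pow(C, j)))]
    # Cayley-Hamilton guarantees (2, 2) is a hit for this matrix, so hits is nonempty.
    return min(hits, key=sum)


A = [[2, 1, 0, 0],
     [0, 2, 0, 0],
     [0, 0, 3, 0],
     [0, 0, 0, 3]]
exponents = minimal_exponents(A)
```

For this matrix the search returns the exponent pair $(2, 1)$, matching $m_A(\lambda) = (\lambda - 2)^2(\lambda - 3)$.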

Consequences

Example (Every matrix satisfies a polynomial of degree n)

Cayley--Hamilton guarantees that $A^n$ can be written as a linear combination of $I, A, A^2, \ldots, A^{n-1}$ . This means:

The set $\{I, A, A^2, \ldots, A^{n-1}\}$ spans all powers of $A$ . In particular, $A^{-1}$ (when it exists) is a polynomial in $A$ of degree at most $n - 1$ .

For an $n \times n$ matrix, the algebra $F[A] = \{p(A) : p \in F[x]\}$ has dimension at most $n$ over $F$ (in fact, dimension equals $\deg m_A$ ).
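
The dimension of $F[A]$ can be computed as the number of linearly independent matrices among $I, A, A^2, \ldots$, and the first power that depends on the earlier ones gives $\deg m_A$. The following is a minimal sketch over the rationals using incremental Gaussian elimination on flattened matrices. The function name is hypothetical.

```python
from fractions import Fraction


def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def minimal_polynomial_degree(A):
    """Smallest d such that A^d lies in the span of I, A, ..., A^{d-1}.

    This d equals deg m_A = dim F[A]; Cayley-Hamilton guarantees d <= n.
    """
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    P = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # A^0
    basis = []  # pairs (pivot index, reduced vector)
    for d in range(n + 1):
        v = [x for row in P for x in row]  # flatten A^d to a vector
        # Eliminate the pivot coordinate of every earlier basis vector.
        for pivot, b in basis:
            if v[pivot] != 0:
                f = v[pivot] / b[pivot]
                v = [vi - f * bi for vi, bi in zip(v, b)]
        pivot = next((i for i, x in enumerate(v) if x != 0), None)
        if pivot is None:
            return d  # A^d is a combination of lower powers
        basis.append((pivot, v))
        P = mat_mul(P, A)
    return n  # not reached: A^n is always dependent by Cayley-Hamilton


degree = minimal_polynomial_degree([[2, 1], [0, 2]])
```

Applied to $2I$ the function returns $1$, and applied to the $2 \times 2$ Jordan block it returns $2$, consistent with the minimal polynomials computed earlier.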

Example (Newton's identities from Cayley-Hamilton)

For $A \in M_{2 \times 2}(F)$ : $A^2 - \operatorname{tr}(A) \cdot A + \det(A) \cdot I = 0$ .

Taking traces, and using $\operatorname{tr}(I) = 2$: $\operatorname{tr}(A^2) - \operatorname{tr}(A)^2 + 2\det(A) = 0$, so, provided the characteristic of $F$ is not $2$:

$\det(A) = \frac{\operatorname{tr}(A)^2 - \operatorname{tr}(A^2)}{2}.$

This is a special case of Newton's identities relating power sums to elementary symmetric functions.

For $A = \begin{pmatrix} 3 & 1 \\ 2 & 4 \end{pmatrix}$ : $\operatorname{tr}(A) = 7$ , $\operatorname{tr}(A^2) = \operatorname{tr}\begin{pmatrix} 11 & 7 \\ 14 & 18 \end{pmatrix} = 29$ . So $\det(A) = \frac{49 - 29}{2} = 10$ .
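
The trace identity can be compared with the usual determinant formula. The following is a small sketch, assuming integer entries. The function names are hypothetical.

```python
from fractions import Fraction


def det_from_traces(A):
    """det(A) = (tr(A)^2 - tr(A^2)) / 2 for a 2x2 integer matrix A."""
    (a, b), (c, d) = A
    tr_A = a + d
    # The diagonal entries of A^2 are a*a + b*c and c*b + d*d.
    tr_A2 = (a * a + b * c) + (c * b + d * d)
    return Fraction(tr_A ** 2 - tr_A2, 2)


def det_direct(A):
    """det(A) = ad - bc."""
    (a, b), (c, d) = A
    return a * d - b * c


A = [[3, 1], [2, 4]]
agree = det_from_traces(A) == det_direct(A)
```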

Summary

Remark (Significance of Cayley-Hamilton)

The Cayley--Hamilton theorem is a cornerstone result:

Every $n \times n$ matrix satisfies a polynomial relation of degree $n$ , constraining the algebra generated by $A$ .
It provides formulas for $A^{-1}$ and higher powers of $A$ in terms of lower powers.
It establishes that the minimal polynomial divides the characteristic polynomial.
It is the starting point for the theory of canonical forms (Jordan form, rational canonical form).
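It generalizes beyond fields: the same identity holds for square matrices over any commutative ring, and a module-theoretic version applies to endomorphisms of finitely generated modules over a commutative ring. It does not extend verbatim to arbitrary operators on infinite-dimensional spaces, which need not satisfy any nonzero polynomial.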