Positive Definite Matrices

Positive definite matrices are symmetric matrices whose eigenvalues are all positive. They arise naturally in optimization, statistics, and geometry as measures of "positive curvature."

Definition: Positive Definite Matrix

A symmetric matrix $A$ is:

Positive definite (PD) if $\mathbf{x}^TA\mathbf{x} > 0$ for all $\mathbf{x} \neq \mathbf{0}$
Positive semi-definite (PSD) if $\mathbf{x}^TA\mathbf{x} \geq 0$ for all $\mathbf{x}$
Negative definite if $\mathbf{x}^TA\mathbf{x} < 0$ for all $\mathbf{x} \neq \mathbf{0}$
Indefinite if $\mathbf{x}^TA\mathbf{x}$ takes both positive and negative values

The quadratic form $Q(\mathbf{x}) = \mathbf{x}^TA\mathbf{x}$ generalizes the notion of squared length: for $A = I$ it reduces to $\mathbf{x}^T\mathbf{x} = \|\mathbf{x}\|^2$.
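To make the definitions concrete, the following minimal sketch evaluates the quadratic form directly. The helper name `quadratic_form` and the two example matrices are illustrative assumptions, not notation from the text above.

```python
def quadratic_form(A, x):
    """Return x^T A x for a square matrix A (list of rows) and a vector x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

# Illustrative matrices (assumed for this sketch).
A_pd = [[2, 1], [1, 2]]     # positive definite
A_indef = [[1, 2], [2, 1]]  # indefinite

print(quadratic_form(A_pd, [1, -1]))     # 2 - 1 - 1 + 2 = 2
print(quadratic_form(A_indef, [1, 1]))   # 1 + 2 + 2 + 1 = 6
print(quadratic_form(A_indef, [1, -1]))  # 1 - 2 - 2 + 1 = -2
```

The last two lines show one quadratic form taking both a positive and a negative value, which is what "indefinite" means. Sampling a few vectors can only disprove definiteness, never prove it, because the condition must hold for every nonzero $\mathbf{x}$.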

Theorem: Characterizations of Positive Definiteness

For a symmetric matrix $A$, the following are equivalent:

$A$ is positive definite
All eigenvalues of $A$ are positive
All leading principal minors are positive (Sylvester's criterion)
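There exists an invertible matrix $B$ such that $A = B^TB$ (the Cholesky decomposition is the special case in which $B$ is upper triangular with positive diagonal entries)
$A = Q\Lambda Q^T$ with $Q$ orthogonal and $\Lambda$ diagonal, where all diagonal entries of $\Lambda$ are positive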
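Sylvester's criterion translates directly into code. The sketch below is one possible implementation, assuming the matrix is given as a list of rows and is already symmetric. The function names are illustrative, and exact rational arithmetic via `fractions.Fraction` is a choice made here to avoid sign errors from rounding.

```python
from fractions import Fraction

def determinant(M):
    """Determinant by Gaussian elimination in exact rational arithmetic."""
    n = len(M)
    M = [[Fraction(v) for v in row] for row in M]  # work on an exact copy
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= M[col][col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n):
                M[r][c] -= factor * M[col][c]
    return det

def leading_principal_minors(A):
    """Determinants of the top-left k-by-k blocks, k = 1..n."""
    return [determinant([row[:k] for row in A[:k]])
            for k in range(1, len(A) + 1)]

def is_positive_definite_sylvester(A):
    """Sylvester's criterion; assumes A is symmetric."""
    return all(m > 0 for m in leading_principal_minors(A))

T = [[2, -1, 0], [-1, 2, -1], [0, -1, 2]]  # illustrative tridiagonal matrix
print(is_positive_definite_sylvester(T))                 # True (minors 2, 3, 4)
print(is_positive_definite_sylvester([[1, 2], [2, 1]]))  # False (minors 1, -3)
```

Computing $n$ determinants this way is fine for small matrices. For large ones, attempting a Cholesky factorization is usually the cheaper test.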

Example: Testing Positive Definiteness

Consider $A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$.

Method 1 (Eigenvalues): The characteristic equation $\lambda^2 - 4\lambda + 3 = 0$ gives $\lambda_1 = 3$, $\lambda_2 = 1$. Both are positive, so $A$ is PD.

Method 2 (Sylvester): The leading principal minors are $a_{11} = 2 > 0$ and $\det(A) = 3 > 0$, confirming that $A$ is PD.

Method 3 (Quadratic form): $\mathbf{x}^TA\mathbf{x} = 2x_1^2 + 2x_1x_2 + 2x_2^2 = (x_1+x_2)^2 + x_1^2 + x_2^2 > 0$ for $\mathbf{x} \neq \mathbf{0}$.
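The factorization characterization $A = B^TB$ from the theorem gives a fourth check on the same matrix. As a worked sketch, assuming $B$ is taken to be upper triangular with positive diagonal, solving for its entries one at a time gives

$$B = \begin{bmatrix} \sqrt{2} & \tfrac{1}{\sqrt{2}} \\ 0 & \sqrt{\tfrac{3}{2}} \end{bmatrix}, \qquad B^TB = \begin{bmatrix} 2 & 1 \\ 1 & \tfrac{1}{2} + \tfrac{3}{2} \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} = A.$$

Both diagonal entries of $B$ are nonzero, so $B$ is invertible and $A$ is PD.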

Definition: Matrix Square Root

If $A$ is positive definite with spectral decomposition $A = Q\Lambda Q^T$, then its (unique positive definite) matrix square root is $A^{1/2} = Q\Lambda^{1/2}Q^T$,

where $\Lambda^{1/2}$ has diagonal entries $\sqrt{\lambda_i}$. Because $Q^TQ = I$, this satisfies $(A^{1/2})^2 = Q\Lambda^{1/2}Q^TQ\Lambda^{1/2}Q^T = Q\Lambda Q^T = A$.

Similarly, $A^{-1/2} = Q\Lambda^{-1/2}Q^T$ is well-defined, since every $\lambda_i$ is strictly positive.
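As a worked sketch of this definition, take $A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$, which has eigenvalues $3$ and $1$ with orthonormal eigenvectors $\tfrac{1}{\sqrt{2}}(1, 1)^T$ and $\tfrac{1}{\sqrt{2}}(1, -1)^T$. Then

$$Q = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \qquad \Lambda = \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}, \qquad A^{1/2} = Q\begin{bmatrix} \sqrt{3} & 0 \\ 0 & 1 \end{bmatrix}Q^T = \frac{1}{2}\begin{bmatrix} \sqrt{3}+1 & \sqrt{3}-1 \\ \sqrt{3}-1 & \sqrt{3}+1 \end{bmatrix}.$$

Squaring confirms the result: the diagonal entries of $(A^{1/2})^2$ are $\tfrac{1}{4}\left[(\sqrt{3}+1)^2 + (\sqrt{3}-1)^2\right] = 2$ and the off-diagonal entries are $\tfrac{1}{4}\cdot 2(\sqrt{3}+1)(\sqrt{3}-1) = 1$.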

Example: Applications of Positive Definite Matrices

Covariance matrices in statistics are always PSD; a covariance matrix is PD exactly when no nontrivial linear combination of the variables has zero variance
Hessian matrices in optimization: at a critical point (zero gradient), a PD Hessian implies a strict local minimum
Inner products: for symmetric $A$, $\langle \mathbf{x}, \mathbf{y} \rangle_A = \mathbf{x}^TA\mathbf{y}$ defines an inner product iff $A$ is PD
Ellipsoids: The set $\{\mathbf{x} : \mathbf{x}^TA\mathbf{x} \leq 1\}$ is a solid ellipsoid when $A$ is PD, with semi-axes of length $1/\sqrt{\lambda_i}$ along the eigenvectors of $A$
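The inner-product application can be illustrated with a short sketch. The helper names `inner_A` and `norm_A` are assumptions introduced here, and the matrix is the $2 \times 2$ example used earlier.

```python
import math

def inner_A(A, x, y):
    """A-weighted inner product <x, y>_A = x^T A y (A given as a list of rows)."""
    n = len(x)
    return sum(x[i] * A[i][j] * y[j] for i in range(n) for j in range(n))

def norm_A(A, x):
    """Norm induced by <., .>_A; real and positive for x != 0 only if A is PD."""
    return math.sqrt(inner_A(A, x, x))

A = [[2, 1], [1, 2]]
e1, e2 = [1, 0], [0, 1]

print(inner_A(A, e1, e2))            # 1: e1 and e2 are not A-orthogonal
print(inner_A(A, e2, e1))            # 1: symmetry, because A is symmetric
print(inner_A(A, [1, 1], [1, -1]))   # 0: these eigenvectors are A-orthogonal
print(norm_A(A, e1))                 # sqrt(2), the A-length of e1
```

If $A$ were indefinite, `inner_A(A, x, x)` could be negative for some `x` and `norm_A` would fail, which is why positive definiteness is required.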

Remark

Positive definiteness has a geometric meaning: the graph of the associated quadratic form is an upward-opening "bowl" with a unique minimum at the origin. In machine learning, kernel (Gram) matrices must be PSD for the kernel to define a valid similarity measure. In numerical analysis, PD systems $A\mathbf{x} = \mathbf{b}$ can be solved efficiently via Cholesky decomposition, which needs roughly half the arithmetic of LU decomposition (about $n^3/3$ versus $2n^3/3$ floating-point operations) and is numerically stable without pivoting.
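The Cholesky-based solve mentioned above can be sketched in plain Python. This is a minimal illustration, not a production routine: it uses the lower-triangular convention $A = LL^T$ (so $B = L^T$ in the notation of the theorem), and the function names are assumptions.

```python
import math

def cholesky_lower(A):
    """Return lower-triangular L with A = L L^T.

    Raises ValueError if a non-positive pivot appears, which means the
    symmetric input is not positive definite.
    """
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0:
                    raise ValueError("matrix is not positive definite")
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

def solve_cholesky(A, b):
    """Solve A x = b for symmetric PD A using two triangular solves."""
    L = cholesky_lower(A)
    n = len(b)
    # Forward substitution: solve L y = b.
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    # Back substitution: solve L^T x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

x = solve_cholesky([[2, 1], [1, 2]], [3, 3])
print(x)  # approximately [1.0, 1.0]
```

A failed factorization doubles as a definiteness test: calling `cholesky_lower([[1, 2], [2, 1]])` raises `ValueError` because the second pivot is negative.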