Cauchy-Schwarz Inequality
The Cauchy--Schwarz inequality is arguably the single most important inequality in mathematics. It bounds the inner product of two vectors by the product of their norms, provides the foundation for defining angles, and implies the triangle inequality. It holds in every inner product space, from $\mathbb{R}^n$ to function spaces.
Statement
For all vectors $u, v$ in an inner product space $V$:
$$|\langle u, v \rangle| \le \|u\| \, \|v\|.$$
Equality holds if and only if $u$ and $v$ are linearly dependent (i.e., one is a scalar multiple of the other).
Special cases
In $\mathbb{R}^n$: $\left( \sum_{i=1}^n a_i b_i \right)^2 \le \left( \sum_{i=1}^n a_i^2 \right) \left( \sum_{i=1}^n b_i^2 \right)$, or equivalently: $|a \cdot b| \le \|a\| \, \|b\|$.
For $a = (1, 2)$, $b = (3, 4)$: $a \cdot b = 11$ and $\|a\| \, \|b\| = \sqrt{5} \cdot 5 = \sqrt{125} \approx 11.18$. So $11 \le 11.18$ ✓.
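The $\mathbb{R}^n$ form is easy to verify numerically. A minimal sketch, using the sample vectors $a = (1, 2)$ and $b = (3, 4)$ (the helper names `dot` and `norm` are illustrative):

```python
import math

def dot(a, b):
    """Standard inner product on R^n."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Euclidean norm induced by the inner product."""
    return math.sqrt(dot(a, a))

a, b = (1.0, 2.0), (3.0, 4.0)
lhs = abs(dot(a, b))      # |a . b| = 11
rhs = norm(a) * norm(b)   # sqrt(5) * 5 = sqrt(125) ~ 11.18
assert lhs <= rhs         # Cauchy--Schwarz holds
```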
For continuous functions $f, g$ on $[a, b]$:
$$\left( \int_a^b f(x) g(x) \, dx \right)^2 \le \left( \int_a^b f(x)^2 \, dx \right) \left( \int_a^b g(x)^2 \, dx \right).$$
For $f(x) = 1$ and $g(x) = x$ on $[0, 1]$: $\int_0^1 fg \, dx = \tfrac{1}{2}$ and $\int_0^1 f^2 \, dx \cdot \int_0^1 g^2 \, dx = 1 \cdot \tfrac{1}{3} = \tfrac{1}{3}$. So $\tfrac{1}{4} \le \tfrac{1}{3}$ ✓.
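For polynomial integrands like $f(x) = 1$, $g(x) = x$ on $[0, 1]$, the integral form can be checked exactly with rational arithmetic, using $\int_0^1 x^k \, dx = 1/(k+1)$ (a sketch; the `moment` helper is illustrative):

```python
from fractions import Fraction

def moment(k):
    """Exact value of the integral of x^k over [0, 1]."""
    return Fraction(1, k + 1)

# f(x) = 1, g(x) = x on [0, 1]
fg = moment(1)   # integral of f*g  = 1/2
ff = moment(0)   # integral of f^2  = 1
gg = moment(2)   # integral of g^2  = 1/3

assert fg ** 2 <= ff * gg   # 1/4 <= 1/3, exactly
```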
For real numbers $a, b, c, d$:
$$(ac + bd)^2 \le (a^2 + b^2)(c^2 + d^2).$$
For $a = 1$, $b = 2$, $c = 3$, $d = 4$: $(ac + bd)^2 = 11^2 = 121$. Expanding: $(a^2 + b^2)(c^2 + d^2) = 5 \cdot 25 = 125$. The difference is $(ad - bc)^2 = (1 \cdot 4 - 2 \cdot 3)^2 = 4 \ge 0$ ✓.
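The identity behind the two-variable case, $(a^2 + b^2)(c^2 + d^2) - (ac + bd)^2 = (ad - bc)^2$, can be checked with exact integer arithmetic (a sketch; the function name is illustrative):

```python
def lagrange_gap(a, b, c, d):
    """(a^2+b^2)(c^2+d^2) - (a*c+b*d)^2, which equals (a*d-b*c)^2."""
    return (a * a + b * b) * (c * c + d * d) - (a * c + b * d) ** 2

# The gap is a perfect square, hence nonnegative: Cauchy--Schwarz in R^2.
assert lagrange_gap(1, 2, 3, 4) == (1 * 4 - 2 * 3) ** 2 == 4
# Proportional pairs give gap 0 (the equality case): (2, 3) and (4, 6).
assert lagrange_gap(2, 3, 4, 6) == 0
```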
Equality condition
Equality in $|\langle u, v \rangle| \le \|u\| \, \|v\|$ holds iff $v = \lambda u$ for some scalar $\lambda$ or $u = 0$.
$u = (1, 2)$, $v = (2, 4)$: $\langle u, v \rangle = 10$ and $\|u\| \, \|v\| = \sqrt{5} \cdot \sqrt{20} = 10$. Equality holds since $v = 2u$.
$u = (1, 0)$, $v = (1, 1)$: $\langle u, v \rangle = 1$ and $\|u\| \, \|v\| = \sqrt{2}$. Strict inequality because $u$ and $v$ are linearly independent.
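For integer vectors, the equality case can be tested exactly by comparing $\langle u, v \rangle^2$ with $\langle u, u \rangle \langle v, v \rangle$, avoiding square roots. A sketch, using the pairs $(1,2), (2,4)$ and $(1,0), (1,1)$ (the function name is illustrative):

```python
def is_tight(u, v):
    """Exact equality test for Cauchy--Schwarz over the integers:
    <u,v>^2 == <u,u><v,v> iff u and v are linearly dependent."""
    uv = sum(x * y for x, y in zip(u, v))
    return uv * uv == sum(x * x for x in u) * sum(y * y for y in v)

assert is_tight((1, 2), (2, 4))       # v = 2u: equality
assert not is_tight((1, 0), (1, 1))   # independent: strict inequality
```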
The Cauchy--Schwarz inequality is equivalent to $|\cos \theta| \le 1$, where $\theta$ is the angle between $u$ and $v$:
$$\cos \theta = \frac{\langle u, v \rangle}{\|u\| \, \|v\|}.$$
Equality ($|\cos \theta| = 1$) means $\theta = 0$ or $\theta = \pi$, i.e., the vectors are parallel.
Consequences
For all $u, v$: $\|u + v\| \le \|u\| + \|v\|$ (the triangle inequality).
$$\|u + v\|^2 = \|u\|^2 + 2\langle u, v \rangle + \|v\|^2 \le \|u\|^2 + 2\|u\| \, \|v\| + \|v\|^2 = (\|u\| + \|v\|)^2.$$
$u = (3, 0)$, $v = (0, 4)$: $\|u + v\| = \|(3, 4)\| = 5 \le 3 + 4 = 7$.
$u = (1, 2)$, $v = (2, 4)$: $\|u + v\| = \|(3, 6)\| = 3\sqrt{5}$ and $\|u\| + \|v\| = \sqrt{5} + 2\sqrt{5} = 3\sqrt{5}$. Equality because $v = 2u$ with a positive scalar.
Also from Cauchy--Schwarz, the reverse triangle inequality: $\big| \|u\| - \|v\| \big| \le \|u - v\|$.
$u = (3, 0)$, $v = (0, 4)$: $|3 - 4| = 1 \le \|(3, -4)\| = 5$ ✓.
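Both triangle inequalities can be checked numerically. A minimal sketch with $u = (3, 0)$, $v = (0, 4)$ (helper names illustrative):

```python
import math

def norm(u):
    """Euclidean norm."""
    return math.sqrt(sum(x * x for x in u))

u, v = (3.0, 0.0), (0.0, 4.0)
s = tuple(x + y for x, y in zip(u, v))   # u + v = (3, 4)
d = tuple(x - y for x, y in zip(u, v))   # u - v = (3, -4)

assert norm(s) <= norm(u) + norm(v)        # triangle: 5 <= 7
assert abs(norm(u) - norm(v)) <= norm(d)   # reverse triangle: 1 <= 5
```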
Applications
In probability, the Cauchy--Schwarz inequality applied to $\langle X, Y \rangle = E[XY]$ (with mean-centered random variables) gives:
$$\mathrm{Cov}(X, Y)^2 \le \mathrm{Var}(X) \, \mathrm{Var}(Y),$$
or equivalently $|\mathrm{Cov}(X, Y)| \le \sigma_X \sigma_Y$. The correlation coefficient $\rho = \mathrm{Cov}(X, Y) / (\sigma_X \sigma_Y)$ satisfies $|\rho| \le 1$ by Cauchy--Schwarz.
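The covariance bound can be illustrated on simulated data. A sketch assuming synthetic samples where $Y$ depends linearly on $X$ plus noise (all names and parameters here are illustrative choices, not from the text):

```python
import random

random.seed(0)
xs = [random.gauss(0, 1) for _ in range(1000)]
ys = [0.5 * x + random.gauss(0, 1) for x in xs]   # positively correlated

def mean(zs):
    return sum(zs) / len(zs)

mx, my = mean(xs), mean(ys)
cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
var_x = mean([(x - mx) ** 2 for x in xs])
var_y = mean([(y - my) ** 2 for y in ys])

assert cov ** 2 <= var_x * var_y   # Cov(X,Y)^2 <= Var(X) Var(Y)
rho = cov / (var_x * var_y) ** 0.5
assert -1.0 <= rho <= 1.0          # |rho| <= 1 by Cauchy--Schwarz
```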
Apply Cauchy--Schwarz to $(a_1, \dots, a_n)$ and $(1, \dots, 1)$:
$$\left( \sum_{i=1}^n a_i \right)^2 \le n \sum_{i=1}^n a_i^2,$$
i.e., $\frac{a_1 + \dots + a_n}{n} \le \sqrt{\frac{a_1^2 + \dots + a_n^2}{n}}$, which says AM $\le$ QM (arithmetic mean at most quadratic mean).
For $(1, 2, 3)$: $\left( \sum a_i \right)^2 = 36$, $n \sum a_i^2 = 3 \cdot 14 = 42$. So $36 \le 42$ ✓.
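The AM $\le$ QM consequence, checked on the sample $(1, 2, 3)$ (a sketch; `am` and `qm` are illustrative names):

```python
import math

def am(a):
    """Arithmetic mean."""
    return sum(a) / len(a)

def qm(a):
    """Quadratic mean (root mean square)."""
    return math.sqrt(sum(x * x for x in a) / len(a))

a = (1, 2, 3)
assert am(a) <= qm(a)                                  # 2 <= sqrt(14/3)
assert sum(a) ** 2 <= len(a) * sum(x * x for x in a)   # 36 <= 42
```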
For sequences $a$ and $b$ with $\sum_i a_i^2 = 1$ and $\sum_i b_i^2 = 1$:
$$\left| \sum_i a_i b_i \right| \le 1.$$
For $a = (1/\sqrt{2}, 1/\sqrt{2})$ and $b = (1, 0)$: $\sum a_i^2 = 1$, $\sum b_i^2 = 1$, $\sum a_i b_i = 1/\sqrt{2}$. So $1/\sqrt{2} \le 1$ ✓.
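The unit-vector form $|\sum_i a_i b_i| \le 1$ can be stress-tested on random unit vectors (an illustrative sketch; the dimension and sample count are arbitrary choices):

```python
import math
import random

random.seed(1)

def normalize(u):
    """Scale u to unit length."""
    n = math.sqrt(sum(x * x for x in u))
    return tuple(x / n for x in u)

# For unit vectors a, b: |<a, b>| <= 1 (tolerance covers float rounding).
for _ in range(100):
    a = normalize([random.uniform(-1, 1) for _ in range(5)])
    b = normalize([random.uniform(-1, 1) for _ in range(5)])
    assert abs(sum(x * y for x, y in zip(a, b))) <= 1.0 + 1e-12
```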
The maximum of $\langle u, v \rangle$ subject to $\|u\| = 1$ is $\|v\|$, achieved when $u = v / \|v\|$.
This follows directly from Cauchy--Schwarz: $\langle u, v \rangle \le \|u\| \, \|v\| = \|v\|$.
Example: maximize $3x + 4y$ subject to $x^2 + y^2 = 1$. This is $\langle u, v \rangle$ with $u = (x, y)$ and $v = (3, 4)$. Maximum is $\|v\| = 5$, achieved at $(x, y) = (3/5, 4/5)$.
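The maximization claim for $v = (3, 4)$ can be confirmed by sampling the unit circle (a sketch; the grid resolution is an arbitrary choice):

```python
import math

v = (3.0, 4.0)
nv = math.sqrt(sum(x * x for x in v))           # ||v|| = 5

# Maximizer predicted by Cauchy--Schwarz: u* = v / ||v||.
u_star = tuple(x / nv for x in v)               # (0.6, 0.8)
best = sum(x * y for x, y in zip(u_star, v))    # <u*, v> = ||v|| = 5

# No sampled unit vector does better (tolerance covers float rounding).
for k in range(360):
    t = 2 * math.pi * k / 360
    u = (math.cos(t), math.sin(t))
    assert 3 * u[0] + 4 * u[1] <= best + 1e-12
```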
For an $n \times n$ matrix $A$ with columns $a_1, \dots, a_n$, repeated application of Cauchy--Schwarz (via Gram--Schmidt) gives Hadamard's inequality:
$$|\det A| \le \prod_{i=1}^n \|a_i\|.$$
Equality holds iff the columns are orthogonal. This is because Gram--Schmidt gives $A = QR$ with $|\det A| = \prod_i |r_{ii}|$ and $|r_{ii}| \le \|a_i\|$.
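Hadamard's inequality checked on a small example matrix (a sketch; the $3 \times 3$ matrix is an arbitrary illustrative choice, and `det3` is a hand-rolled cofactor expansion):

```python
import math

def det3(a):
    """Determinant of a 3x3 matrix given as rows, by cofactor expansion."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = a
    return (a11 * (a22 * a33 - a23 * a32)
            - a12 * (a21 * a33 - a23 * a31)
            + a13 * (a21 * a32 - a22 * a31))

rows = [(1, 2, 0), (0, 1, 3), (2, 0, 1)]
cols = list(zip(*rows))   # Hadamard bounds |det| by column norms

bound = math.prod(math.sqrt(sum(x * x for x in c)) for c in cols)
assert abs(det3(rows)) <= bound   # |det A| = 13 <= 5*sqrt(10) ~ 15.81
```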
Summary
The Cauchy--Schwarz inequality underpins the entire structure of inner product spaces:
- It ensures the angle between vectors is well-defined.
- It implies the triangle inequality, making every inner product space a metric space.
- It bounds correlations in probability ($|\rho| \le 1$).
- It gives Hadamard's bound on determinants.
- It provides the optimality condition for maximizing linear functionals.
- Its equality case characterizes linear dependence of two vectors.