
Cauchy-Schwarz Inequality

The Cauchy-Schwarz inequality is arguably one of the most important inequalities in mathematics. It bounds the inner product of two vectors by the product of their norms, makes the angle between two vectors well defined, and implies the triangle inequality. It holds in every inner product space, from $\mathbb{R}^n$ to $L^2$ function spaces.


Statement

Theorem 6.5 (Cauchy-Schwarz Inequality)

For all vectors $u, v$ in an inner product space $V$:

$$|\langle u, v \rangle| \leq \|u\| \cdot \|v\|.$$

Equality holds if and only if $u$ and $v$ are linearly dependent (i.e., one is a scalar multiple of the other).


Special cases

Example (Cauchy-Schwarz for the dot product)

In $\mathbb{R}^n$: $|x \cdot y| \leq \|x\| \cdot \|y\|$, or equivalently:

$$\left(\sum_{i=1}^n x_i y_i\right)^2 \leq \left(\sum_{i=1}^n x_i^2\right) \left(\sum_{i=1}^n y_i^2\right).$$

For $x = (1, 2, 3)$, $y = (4, 5, 6)$: $|x \cdot y| = |4 + 10 + 18| = 32$ and $\|x\| \cdot \|y\| = \sqrt{14} \cdot \sqrt{77} = \sqrt{1078} \approx 32.83$. So $32 \leq 32.83$ ✓.
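The arithmetic in this example can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `dot` and `norm` are illustrative choices, not part of the original text.

```python
import math

def dot(x, y):
    """Euclidean dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Euclidean norm induced by the dot product."""
    return math.sqrt(dot(x, x))

x = (1, 2, 3)
y = (4, 5, 6)

lhs = abs(dot(x, y))      # |x . y|
rhs = norm(x) * norm(y)   # ||x|| * ||y||

print(lhs, round(rhs, 2), lhs <= rhs)  # prints: 32 32.83 True
```

The same two helpers work for vectors of any length, so other pairs can be tried by changing `x` and `y`.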

Example (Cauchy-Schwarz for integrals)

For continuous functions on $[a, b]$:

$$\left|\int_a^b f(x)g(x)\,dx\right|^2 \leq \int_a^b f(x)^2\,dx \cdot \int_a^b g(x)^2\,dx.$$

For $f(x) = x$ and $g(x) = 1$ on $[0, 1]$: $\left|\int_0^1 x\,dx\right|^2 = \frac{1}{4}$ and $\int_0^1 x^2\,dx \cdot \int_0^1 1\,dx = \frac{1}{3}$. So $\frac{1}{4} \leq \frac{1}{3}$ ✓.
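The integral form can also be checked approximately. The sketch below assumes a simple midpoint rule as the quadrature method (an illustrative choice, any reasonable numerical integrator would do), and the name `integrate` is hypothetical.

```python
def integrate(h, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    width = (b - a) / n
    return width * sum(h(a + (k + 0.5) * width) for k in range(n))

def f(x):
    return x

def g(x):
    return 1.0

# Left side: square of the integral of f*g; right side: product of the
# integrals of f^2 and g^2, all over [0, 1].
lhs = integrate(lambda x: f(x) * g(x), 0.0, 1.0) ** 2
rhs = integrate(lambda x: f(x) ** 2, 0.0, 1.0) * integrate(lambda x: g(x) ** 2, 0.0, 1.0)

print(lhs <= rhs)
```

With these inputs `lhs` is close to 1/4 and `rhs` is close to 1/3, matching the hand computation up to quadrature error.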

Example (Cauchy-Schwarz for finite sums, classical form)

For real numbers $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$:

$$(a_1 b_1 + \cdots + a_n b_n)^2 \leq (a_1^2 + \cdots + a_n^2)(b_1^2 + \cdots + b_n^2).$$

For $n = 2$: $(a_1 b_1 + a_2 b_2)^2 \leq (a_1^2 + a_2^2)(b_1^2 + b_2^2)$. Expanding both sides:

$$a_1^2 b_1^2 + 2a_1 b_1 a_2 b_2 + a_2^2 b_2^2 \leq a_1^2 b_1^2 + a_1^2 b_2^2 + a_2^2 b_1^2 + a_2^2 b_2^2.$$

The right side minus the left side is $(a_1 b_2 - a_2 b_1)^2 \geq 0$ ✓.


Equality condition

Example (When equality holds)

Equality $|\langle u, v \rangle| = \|u\| \|v\|$ holds iff $u = \alpha v$ for some scalar $\alpha$, or $v = 0$.

$u = (2, 4)$, $v = (1, 2)$: $|\langle u, v \rangle| = |2 + 8| = 10$ and $\|u\| \|v\| = \sqrt{20} \cdot \sqrt{5} = 10$. Equality holds since $u = 2v$.

$u = (1, 0)$, $v = (0, 1)$: $|\langle u, v \rangle| = 0$ and $\|u\| \|v\| = 1$. The inequality is strict because $u$ and $v$ are linearly independent.
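One way to test the equality case numerically is to look at the gap between the two sides. This is a sketch for the Euclidean inner product only; the names `cs_gap` and `is_equality_case` and the tolerance `tol` are assumptions made for illustration, since floating-point arithmetic cannot decide exact equality.

```python
import math

def cs_gap(u, v):
    """Return ||u|| * ||v|| - |<u, v>| for the Euclidean inner product."""
    inner = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return norm_u * norm_v - abs(inner)

def is_equality_case(u, v, tol=1e-12):
    """True when the gap vanishes up to the rounding tolerance tol."""
    return cs_gap(u, v) <= tol

print(is_equality_case((2, 4), (1, 2)))  # dependent vectors, u = 2v
print(is_equality_case((1, 0), (0, 1)))  # independent vectors
```

The first call reports equality and the second reports strict inequality, in line with the two examples above.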

Example (Angle interpretation)

For nonzero vectors in a real inner product space, the Cauchy-Schwarz inequality is equivalent to $|\cos\theta| \leq 1$, where $\theta$ is the angle between $u$ and $v$:

$$\cos\theta = \frac{\langle u, v \rangle}{\|u\| \|v\|} \in [-1, 1].$$

Equality ($|\cos\theta| = 1$) means $\theta = 0$ or $\theta = \pi$, i.e., the vectors are parallel.
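Because the quotient lies in the interval from -1 to 1, the inverse cosine is always defined. The sketch below computes the angle for nonzero real vectors; the function name `angle` is hypothetical, and the clamping step is an added safeguard against floating-point rounding, not part of the mathematical statement.

```python
import math

def angle(u, v):
    """Angle in radians between two nonzero real vectors."""
    inner = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cos_theta = inner / (norm_u * norm_v)
    # Cauchy-Schwarz guarantees |cos_theta| <= 1 exactly; clamp only to
    # absorb rounding error before calling acos.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.acos(cos_theta)

print(angle((1, 0), (0, 1)))   # orthogonal vectors
print(angle((1, 0), (-1, 0)))  # antiparallel vectors
```

The two calls return values close to pi/2 and pi respectively.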


Consequences

Theorem (Triangle inequality, from Cauchy-Schwarz)

For all $u, v$: $\|u + v\| \leq \|u\| + \|v\|$.

Proof (Derivation from Cauchy-Schwarz)

$$\begin{aligned}
\|u + v\|^2 &= \langle u + v, u + v \rangle = \|u\|^2 + 2\operatorname{Re}\langle u, v \rangle + \|v\|^2 \\
&\leq \|u\|^2 + 2|\langle u, v \rangle| + \|v\|^2 \\
&\leq \|u\|^2 + 2\|u\|\|v\| + \|v\|^2 = (\|u\| + \|v\|)^2.
\end{aligned}$$

The first inequality uses $\operatorname{Re} z \leq |z|$, the second is Cauchy-Schwarz, and taking square roots gives the claim.

Example (Triangle inequality in $\mathbb{R}^2$)

$u = (3, 0)$, $v = (0, 4)$: $\|u + v\| = 5 \leq 3 + 4 = 7$.

$u = (1, 2)$, $v = (3, 6)$: $\|u + v\| = \|(4, 8)\| = 4\sqrt{5}$ and $\|u\| + \|v\| = \sqrt{5} + 3\sqrt{5} = 4\sqrt{5}$. Equality holds because $v = 3u$ is a positive multiple of $u$.

Example (Reverse triangle inequality)

Also from Cauchy-Schwarz, via the triangle inequality: $\big|\|u\| - \|v\|\big| \leq \|u - v\|$.

$u = (5, 0)$, $v = (3, 4)$: $\big|\|u\| - \|v\|\big| = |5 - 5| = 0 \leq \|(2, -4)\| = 2\sqrt{5}$ ✓.
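Both the triangle inequality and its reverse form can be checked together on the example vectors. This is a small sketch for the Euclidean norm; `check_triangle` is a hypothetical helper, and `tol` is an assumed tolerance to absorb rounding in the equality cases.

```python
import math

def norm(x):
    """Euclidean norm of a sequence of real numbers."""
    return math.sqrt(sum(a * a for a in x))

def check_triangle(u, v, tol=1e-12):
    """Check ||u + v|| <= ||u|| + ||v|| and | ||u|| - ||v|| | <= ||u - v||."""
    total = [a + b for a, b in zip(u, v)]
    diff = [a - b for a, b in zip(u, v)]
    upper_ok = norm(total) <= norm(u) + norm(v) + tol
    lower_ok = abs(norm(u) - norm(v)) <= norm(diff) + tol
    return upper_ok and lower_ok

for u, v in [((3, 0), (0, 4)), ((1, 2), (3, 6)), ((5, 0), (3, 4))]:
    print(u, v, check_triangle(u, v))
```

Each of the three pairs from the examples passes both checks.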


Applications

Example (Correlation coefficient)

In probability, the Cauchy-Schwarz inequality applied to $\langle X, Y \rangle = E[XY]$ (with mean-centered random variables) gives:

$$|E[XY]|^2 \leq E[X^2] \cdot E[Y^2],$$

or equivalently $|\operatorname{Cov}(X, Y)| \leq \sigma_X \sigma_Y$. The correlation coefficient $\rho = \frac{\operatorname{Cov}(X,Y)}{\sigma_X \sigma_Y}$ satisfies $|\rho| \leq 1$ by Cauchy-Schwarz.
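The same bound holds for sample data, where expectations are replaced by averages. The sketch below computes a sample correlation coefficient by centering the data first; the function name `correlation` and the data in `xs` and `ys` are made up purely for illustration.

```python
import math

def correlation(xs, ys):
    """Sample correlation: covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cx = [x - mean_x for x in xs]  # mean-centered copies
    cy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(cx, cy)) / n
    sigma_x = math.sqrt(sum(a * a for a in cx) / n)
    sigma_y = math.sqrt(sum(b * b for b in cy) / n)
    return cov / (sigma_x * sigma_y)

# Illustrative data only (not from the original text).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

rho = correlation(xs, ys)
print(abs(rho) <= 1.0 + 1e-12)
```

Feeding in exactly linear data, such as `ys = [2 * x + 1 for x in xs]`, gives a value at the boundary of the bound, which corresponds to the equality case of Cauchy-Schwarz.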

Example (AM-QM inequality from Cauchy-Schwarz)

Apply Cauchy-Schwarz to $a = (a_1, \ldots, a_n)$ and $b = (1, 1, \ldots, 1)$:

$$(a_1 + \cdots + a_n)^2 \leq n(a_1^2 + \cdots + a_n^2),$$

i.e., $\left(\frac{a_1 + \cdots + a_n}{n}\right)^2 \leq \frac{a_1^2 + \cdots + a_n^2}{n}$, which after taking square roots says AM $\leq$ QM (the arithmetic mean is at most the quadratic mean).

For $a = (1, 2, 3)$: $\text{AM} = 2$, $\text{QM} = \sqrt{14/3} \approx 2.16$. So $2 \leq 2.16$ ✓.
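The two means are easy to compute directly. A minimal sketch, with the helper names `arithmetic_mean` and `quadratic_mean` chosen here for illustration:

```python
import math

def arithmetic_mean(a):
    """Sum of the entries divided by their count."""
    return sum(a) / len(a)

def quadratic_mean(a):
    """Square root of the mean of the squared entries."""
    return math.sqrt(sum(x * x for x in a) / len(a))

a = (1, 2, 3)
am = arithmetic_mean(a)
qm = quadratic_mean(a)

print(am, round(qm, 2), am <= qm)  # prints: 2.0 2.16 True
```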

Example (Cauchy-Schwarz for infinite series)

For sequences $(a_n)$ and $(b_n)$ with $\sum a_n^2 < \infty$ and $\sum b_n^2 < \infty$:

$$\left(\sum_{n=1}^\infty a_n b_n\right)^2 \leq \sum_{n=1}^\infty a_n^2 \cdot \sum_{n=1}^\infty b_n^2.$$

For $a_n = 1/n$ and $b_n = 1/n^2$: $\sum a_n b_n = \sum 1/n^3 = \zeta(3) \approx 1.202$, $\sum a_n^2 = \pi^2/6$, $\sum b_n^2 = \pi^4/90$. So $\zeta(3)^2 \approx 1.44 \leq \frac{\pi^2}{6} \cdot \frac{\pi^4}{90} \approx 1.78$ ✓.
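Partial sums give a quick numerical check of this example. The sketch below truncates each series at an arbitrarily chosen cutoff `N` (an assumption of this illustration), so the values only approximate the infinite sums; the finite-sum form of Cauchy-Schwarz still applies to the truncated sequences.

```python
import math

N = 100_000  # truncation point, chosen arbitrarily for illustration

sum_ab = sum(1.0 / n**3 for n in range(1, N + 1))  # approximates zeta(3)
sum_a2 = sum(1.0 / n**2 for n in range(1, N + 1))  # approximates pi^2 / 6
sum_b2 = sum(1.0 / n**4 for n in range(1, N + 1))  # approximates pi^4 / 90

lhs = sum_ab ** 2
rhs = sum_a2 * sum_b2

print(round(lhs, 2), round(rhs, 2), lhs <= rhs)
```

The partial sum for `sum_a2` converges slowly (its tail is of order 1/N), so it is the least accurate of the three.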

Example (Maximizing a linear functional)

In a real inner product space with $v \neq 0$, the maximum of $\langle u, v \rangle$ subject to $\|u\| = 1$ is $\|v\|$, achieved when $u = v / \|v\|$.

This follows directly from Cauchy-Schwarz: $\langle u, v \rangle \leq |\langle u, v \rangle| \leq \|u\| \|v\| = \|v\|$, and the choice $u = v / \|v\|$ attains the bound.

Example: maximize $x + 2y + 3z$ subject to $x^2 + y^2 + z^2 = 1$. This is $\langle (x,y,z), (1,2,3) \rangle$ with $\|(x,y,z)\| = 1$. The maximum is $\|(1,2,3)\| = \sqrt{14}$, achieved at $(x,y,z) = \frac{1}{\sqrt{14}}(1,2,3)$.
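The optimality claim can be probed empirically by comparing the proposed maximizer against random unit vectors. This is only a sanity check, not a proof; the helper `random_unit_vector`, the sample count, and the seed are all illustrative assumptions.

```python
import math
import random

v = (1.0, 2.0, 3.0)
v_norm = math.sqrt(sum(c * c for c in v))

# Candidate maximizer u* = v / ||v|| and the value it attains.
u_star = tuple(c / v_norm for c in v)
best = sum(a * b for a, b in zip(u_star, v))

def random_unit_vector(dim):
    """Normalize a Gaussian sample to get a point on the unit sphere."""
    w = [random.gauss(0.0, 1.0) for _ in range(dim)]
    length = math.sqrt(sum(c * c for c in w))
    return [c / length for c in w]

random.seed(0)  # fixed seed so the run is repeatable
samples = [
    sum(a * b for a, b in zip(random_unit_vector(3), v))
    for _ in range(1000)
]

print(max(samples) <= best + 1e-12)
```

No sampled unit vector exceeds the value attained at `u_star`, which equals the norm of `v` up to rounding.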

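Example (Hadamard's inequality from Cauchy-Schwarz)

For a real $n \times n$ matrix $A$ with columns $a_1, \ldots, a_n$, applying Cauchy-Schwarz to each step of Gram-Schmidt gives Hadamard's inequality:

$$|\det A| \leq \|a_1\| \cdot \|a_2\| \cdots \|a_n\|.$$

Equality holds iff the columns are mutually orthogonal or some column is zero. If $A$ is singular, then $\det A = 0$ and the bound is immediate. Otherwise Gram-Schmidt gives $A = QR$ with $Q$ orthogonal (columns $q_1, \ldots, q_n$) and $R$ upper triangular with positive diagonal, so $|\det A| = |\det R| = r_{11} \cdots r_{nn}$. Each diagonal entry satisfies $r_{jj} = \langle q_j, a_j \rangle \leq \|q_j\| \|a_j\| = \|a_j\|$ by Cauchy-Schwarz.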
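Hadamard's bound can be checked on a small case. The sketch below is restricted to 3-by-3 matrices so that the determinant can be written out by cofactor expansion; the names `det3` and `column_norm_product` and the matrices `A` and `D` are examples invented for illustration. It uses `math.prod`, which requires Python 3.8 or later.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix (list of rows) by cofactor expansion."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def column_norm_product(m):
    """Product of the Euclidean norms of the columns of m."""
    columns = zip(*m)  # transpose rows into columns
    return math.prod(math.sqrt(sum(x * x for x in col)) for col in columns)

A = [[1, 2, 0],
     [0, 1, 3],
     [4, 0, 1]]   # generic example: strict inequality expected

D = [[2, 0, 0],
     [0, 3, 0],
     [0, 0, 4]]   # orthogonal columns: equality expected

print(abs(det3(A)), round(column_norm_product(A), 2))  # prints: 25 29.15
print(abs(det3(D)), round(column_norm_product(D), 2))  # prints: 24 24.0
```

For `A` the determinant is 25 while the column norms multiply to the square root of 850, and for the diagonal matrix `D` both sides equal 24.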


Summary

Remark (The most fundamental inequality)

The Cauchy-Schwarz inequality underpins the entire structure of inner product spaces:

  • It ensures the angle between vectors is well-defined.
  • It implies the triangle inequality, making every inner product space a normed space and hence a metric space.
  • It bounds correlations in probability ($|\rho| \leq 1$).
  • It gives Hadamard's bound on determinants.
  • It provides the optimality condition for maximizing linear functionals.
  • Its equality case characterizes linear dependence of two vectors.