
Gram-Schmidt Process

The Gram--Schmidt process is an algorithm that takes any linearly independent set of vectors and produces an orthogonal (or orthonormal) set spanning the same subspace. It guarantees the existence of orthonormal bases in every finite-dimensional inner product space and is the constructive backbone of many algorithms in numerical linear algebra.


Statement

Theorem 6.3 (Gram-Schmidt Orthogonalization)

Let $V$ be an inner product space and $\{v_1, v_2, \ldots, v_k\}$ a linearly independent set. Define recursively:

$$u_1 = v_1, \qquad u_j = v_j - \sum_{i=1}^{j-1} \frac{\langle v_j, u_i \rangle}{\langle u_i, u_i \rangle}\, u_i \quad \text{for } j = 2, \ldots, k.$$

Then $\{u_1, u_2, \ldots, u_k\}$ is an orthogonal set with $\operatorname{span}\{u_1, \ldots, u_j\} = \operatorname{span}\{v_1, \ldots, v_j\}$ for each $j$.

Normalizing via $e_j = u_j / \|u_j\|$ gives an orthonormal set.

Remark (Why it works)

At each step, $u_j = v_j - \operatorname{proj}_{\operatorname{span}\{u_1, \ldots, u_{j-1}\}}(v_j)$ subtracts the component of $v_j$ in the previously constructed orthogonal directions, leaving only the component perpendicular to all previous vectors. Since $v_j \notin \operatorname{span}\{v_1, \ldots, v_{j-1}\}$ (linear independence), this perpendicular component is nonzero.
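The recursion translates directly into code. A minimal sketch in Python with NumPy (the helper name `gram_schmidt` is ours, not a library function):

```python
import numpy as np

def gram_schmidt(vectors):
    """Classical Gram-Schmidt: orthogonalize, then normalize.

    Implements u_j = v_j - sum_i <v_j, u_i>/<u_i, u_i> * u_i,
    exactly as in the theorem statement.
    """
    us = []
    for v in vectors:
        v = np.asarray(v, dtype=float)
        # subtract the projection of v onto each previously built u_i
        u = v - sum((np.dot(v, ui) / np.dot(ui, ui)) * ui for ui in us)
        us.append(u)
    # normalize to obtain an orthonormal set
    return [u / np.linalg.norm(u) for u in us]
```

Each returned vector has unit norm, and distinct vectors are orthogonal up to floating-point roundoff.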


Worked examples

Example (Gram-Schmidt in R^3)

Input: $v_1 = (1, 1, 0)$, $v_2 = (1, 0, 1)$, $v_3 = (0, 1, 1)$.

Step 1: $u_1 = v_1 = (1, 1, 0)$.

Step 2: $u_2 = v_2 - \frac{\langle v_2, u_1 \rangle}{\langle u_1, u_1 \rangle} u_1 = (1, 0, 1) - \frac{1}{2}(1, 1, 0) = (\frac{1}{2}, -\frac{1}{2}, 1)$.

Check: $\langle u_1, u_2 \rangle = \frac{1}{2} - \frac{1}{2} + 0 = 0$ ✓.

Step 3: $u_3 = v_3 - \frac{\langle v_3, u_1 \rangle}{\langle u_1, u_1 \rangle} u_1 - \frac{\langle v_3, u_2 \rangle}{\langle u_2, u_2 \rangle} u_2$.

$\langle v_3, u_1 \rangle = 0 + 1 + 0 = 1$, $\langle v_3, u_2 \rangle = 0 - \frac{1}{2} + 1 = \frac{1}{2}$, $\langle u_2, u_2 \rangle = \frac{1}{4} + \frac{1}{4} + 1 = \frac{3}{2}$.

$u_3 = (0, 1, 1) - \frac{1}{2}(1, 1, 0) - \frac{1/2}{3/2}(\frac{1}{2}, -\frac{1}{2}, 1) = (0, 1, 1) - (\frac{1}{2}, \frac{1}{2}, 0) - \frac{1}{3}(\frac{1}{2}, -\frac{1}{2}, 1) = (-\frac{2}{3}, \frac{2}{3}, \frac{2}{3})$.

Check: $\langle u_1, u_3 \rangle = -\frac{2}{3} + \frac{2}{3} = 0$ ✓, $\langle u_2, u_3 \rangle = -\frac{1}{3} - \frac{1}{3} + \frac{2}{3} = 0$ ✓.

Normalize: $e_1 = \frac{1}{\sqrt{2}}(1, 1, 0)$, $e_2 = \frac{1}{\sqrt{3/2}}(\frac{1}{2}, -\frac{1}{2}, 1) = \frac{1}{\sqrt{6}}(1, -1, 2)$, $e_3 = \frac{1}{\sqrt{4/3}}(-\frac{2}{3}, \frac{2}{3}, \frac{2}{3}) = \frac{1}{\sqrt{3}}(-1, 1, 1)$.
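The arithmetic above is easy to verify numerically; a quick NumPy check reproducing each step directly:

```python
import numpy as np

v1 = np.array([1., 1., 0.])
v2 = np.array([1., 0., 1.])
v3 = np.array([0., 1., 1.])

u1 = v1
u2 = v2 - (np.dot(v2, u1) / np.dot(u1, u1)) * u1   # = (1/2, -1/2, 1)
u3 = v3 - (np.dot(v3, u1) / np.dot(u1, u1)) * u1 \
        - (np.dot(v3, u2) / np.dot(u2, u2)) * u2   # = (-2/3, 2/3, 2/3)

e3 = u3 / np.linalg.norm(u3)                       # = (-1, 1, 1)/sqrt(3)
```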

Example (Gram-Schmidt in R^2)

Input: $v_1 = (3, 4)$, $v_2 = (1, 0)$.

$u_1 = (3, 4)$, $e_1 = \frac{1}{5}(3, 4)$.

$u_2 = (1, 0) - \frac{\langle (1,0), (3,4) \rangle}{25}(3, 4) = (1, 0) - \frac{3}{25}(3, 4) = (\frac{16}{25}, -\frac{12}{25})$.

$e_2 = \frac{25}{20}(\frac{16}{25}, -\frac{12}{25}) = \frac{1}{5}(4, -3)$ (since $\|u_2\| = \frac{20}{25} = \frac{4}{5}$).

The ONB is $\{\frac{1}{5}(3, 4), \frac{1}{5}(4, -3)\}$.

Example (Gram-Schmidt for polynomials)

$V = \mathcal{P}_2$ with $\langle f, g \rangle = \int_{-1}^{1} f(x)g(x)\,dx$, starting from $\{1, x, x^2\}$.

$u_1 = 1$, $\|u_1\|^2 = 2$.

$u_2 = x - \frac{\langle x, 1 \rangle}{\langle 1, 1 \rangle} \cdot 1 = x - \frac{0}{2} = x$ (since $\int_{-1}^1 x\,dx = 0$).

$u_3 = x^2 - \frac{\langle x^2, 1 \rangle}{2} \cdot 1 - \frac{\langle x^2, x \rangle}{\langle x, x \rangle}\, x = x^2 - \frac{2/3}{2} - 0 = x^2 - \frac{1}{3}$.

These are (up to normalization) the Legendre polynomials: $P_0 = 1$, $P_1 = x$, $P_2 = x^2 - 1/3$.
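The same recursion works with exact arithmetic. A sketch using only the standard library, representing a polynomial as a coefficient list $[c_0, c_1, \ldots]$ and using $\int_{-1}^{1} x^m\,dx = \frac{2}{m+1}$ for even $m$ (and $0$ for odd $m$); the helpers `ip` and `sub_scaled` are ours:

```python
from fractions import Fraction

def ip(p, q):
    """<p, q> = integral of p*q over [-1, 1], for coefficient lists."""
    total = Fraction(0)
    for a, ca in enumerate(p):
        for b, cb in enumerate(q):
            if (a + b) % 2 == 0:            # odd powers integrate to 0
                total += ca * cb * Fraction(2, a + b + 1)
    return total

def sub_scaled(p, c, q):
    """p - c*q as coefficient lists (padded to equal length)."""
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [pi - c * qi for pi, qi in zip(p, q)]

basis = [[Fraction(1)],                          # 1
         [Fraction(0), Fraction(1)],             # x
         [Fraction(0), Fraction(0), Fraction(1)]]  # x^2
us = []
for v in basis:
    u = list(v)
    for ui in us:
        u = sub_scaled(u, ip(v, ui) / ip(ui, ui), ui)
    us.append(u)
# us[2] represents x^2 - 1/3
```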


QR decomposition

Theorem 6.4 (QR decomposition)

Every $m \times n$ matrix $A$ with linearly independent columns can be factored as $A = QR$, where:

  • $Q$ is an $m \times n$ matrix with orthonormal columns,
  • $R$ is an $n \times n$ upper triangular matrix with positive diagonal entries.

The columns of $Q$ are the orthonormalized versions of the columns of $A$ (via Gram--Schmidt), and $R$ records the coefficients.

Example (QR decomposition of a 3×2 matrix)

$A = \begin{pmatrix} 1 & 1 \\ 1 & 0 \\ 0 & 1 \end{pmatrix}$.

Column 1: $v_1 = (1, 1, 0)$, $\|v_1\| = \sqrt{2}$, $q_1 = \frac{1}{\sqrt{2}}(1, 1, 0)$.

Column 2: $v_2 = (1, 0, 1)$, $\langle v_2, q_1 \rangle = \frac{1}{\sqrt{2}}$, $u_2 = (1, 0, 1) - \frac{1}{\sqrt{2}} q_1 = (1, 0, 1) - (\frac{1}{2}, \frac{1}{2}, 0) = (\frac{1}{2}, -\frac{1}{2}, 1)$.

$\|u_2\| = \sqrt{3/2}$, $q_2 = \frac{1}{\sqrt{6}}(1, -1, 2)$.

$Q = \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{6} \\ 1/\sqrt{2} & -1/\sqrt{6} \\ 0 & 2/\sqrt{6} \end{pmatrix}$, $R = \begin{pmatrix} \sqrt{2} & 1/\sqrt{2} \\ 0 & \sqrt{3/2} \end{pmatrix}$.
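NumPy's built-in QR reproduces this factorization. Note that `np.linalg.qr` may return $Q$ and $R$ with flipped signs, so the sketch below normalizes the diagonal of $R$ to be positive (the convention of Theorem 6.4) before comparing:

```python
import numpy as np

A = np.array([[1., 1.],
              [1., 0.],
              [0., 1.]])
Q, R = np.linalg.qr(A)          # reduced QR: Q is 3x2, R is 2x2

# flip signs so that R has a positive diagonal
s = np.sign(np.diag(R))
Q, R = Q * s, s[:, None] * R
```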

Example (QR decomposition of a square matrix)

$A = \begin{pmatrix} 1 & 2 \\ 0 & 3 \end{pmatrix}$.

$v_1 = (1, 0)$, $q_1 = (1, 0)$. $v_2 = (2, 3)$, $u_2 = (2, 3) - 2(1, 0) = (0, 3)$, $q_2 = (0, 1)$.

$Q = I$, $R = A$. When $A$ is already upper triangular with positive diagonal entries, its QR decomposition is trivial.

For $A = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$: $q_1 = \frac{1}{\sqrt{2}}(1, 1)$, $\langle v_2, q_1 \rangle = 0$, so $q_2 = \frac{1}{\sqrt{2}}(1, -1)$.

$Q = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$, $R = \begin{pmatrix} \sqrt{2} & 0 \\ 0 & \sqrt{2} \end{pmatrix}$.


Applications

Example (Least squares via QR)

The least-squares solution to $Ax = b$ (overdetermined system) is $x = R^{-1}Q^T b$, obtained by back-substitution on the triangular system $Rx = Q^T b$; this is numerically more stable than the normal equations $A^T A x = A^T b$.

For $A = \begin{pmatrix} 1 & 1 \\ 1 & 0 \\ 0 & 1 \end{pmatrix}$, $b = \begin{pmatrix} 1 \\ 2 \\ 0 \end{pmatrix}$: using the QR from before, $Q^T b = \begin{pmatrix} 3/\sqrt{2} \\ -1/\sqrt{6} \end{pmatrix}$, then solve $Rx = Q^T b$ to get $x = (\frac{5}{3}, -\frac{1}{3})$.
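Carried out numerically (a sketch; `np.linalg.lstsq` appears only as a cross-check):

```python
import numpy as np

A = np.array([[1., 1.],
              [1., 0.],
              [0., 1.]])
b = np.array([1., 2., 0.])

Q, R = np.linalg.qr(A)
x = np.linalg.solve(R, Q.T @ b)   # solve the triangular system R x = Q^T b

x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)
```

Both routes give $x = (5/3, -1/3)$.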

Example (Every inner product space has an ONB)

Starting from any basis $\{v_1, \ldots, v_n\}$, Gram--Schmidt produces an ONB $\{e_1, \ldots, e_n\}$. This proves:

Every finite-dimensional inner product space has an orthonormal basis.

This existence proof is fully constructive: the algorithm builds an explicit orthonormal basis from any starting basis, though the resulting ONB depends on the choice and ordering of the starting vectors.

Example (Projection via Gram-Schmidt)

To find $\operatorname{proj}_W(v)$ where $W = \operatorname{span}\{w_1, w_2\}$: first orthonormalize $\{w_1, w_2\}$ to $\{e_1, e_2\}$ via Gram--Schmidt, then:

$$\operatorname{proj}_W(v) = \langle v, e_1 \rangle e_1 + \langle v, e_2 \rangle e_2.$$

For $W = \operatorname{span}\{(1,0,1), (0,1,1)\}$ and $v = (1, 1, 1)$: after Gram--Schmidt, $e_1 = \frac{1}{\sqrt{2}}(1,0,1)$, $e_2 = \frac{1}{\sqrt{6}}(-1, 2, 1)$.

$\operatorname{proj}_W(v) = \frac{2}{\sqrt{2}} e_1 + \frac{2}{\sqrt{6}} e_2 = (1, 0, 1) + \frac{1}{3}(-1, 2, 1) = (\frac{2}{3}, \frac{2}{3}, \frac{4}{3})$.
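A compact sketch combining Gram-Schmidt with the projection formula (the helper `project` is ours):

```python
import numpy as np

def project(v, spanning):
    """proj_W(v) for W = span(spanning): orthonormalize the spanning
    set, then sum <v, e_i> e_i over the resulting orthonormal vectors."""
    es = []
    for w in spanning:
        u = np.asarray(w, dtype=float)
        for e in es:
            u = u - np.dot(u, e) * e
        es.append(u / np.linalg.norm(u))
    return sum(np.dot(v, e) * e for e in es)

p = project(np.array([1., 1., 1.]), [(1, 0, 1), (0, 1, 1)])
# p == (2/3, 2/3, 4/3)
```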


Numerical stability

Remark (Modified Gram-Schmidt)

The classical Gram--Schmidt algorithm can suffer from numerical instability due to floating-point errors. The modified Gram--Schmidt algorithm orthogonalizes against each previously computed vector one at a time (rather than computing all projections simultaneously), and is significantly more stable:

For $j = 1, \ldots, k$: set $u_j = v_j$; then for $i = 1, \ldots, j-1$, update $u_j \leftarrow u_j - \langle u_j, e_i \rangle e_i$; finally set $e_j = u_j / \|u_j\|$.

Mathematically equivalent, but numerically superior. Householder reflections provide even better stability for the QR decomposition.
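The difference is visible on a nearly dependent set such as the classic Läuchli-type matrix below (with $\varepsilon$ near the square root of machine precision). A sketch comparing the loss of orthogonality of the two variants; the helper names are ours:

```python
import numpy as np

def cgs(A):
    """Classical Gram-Schmidt on the columns of A (normalized)."""
    Q = np.zeros_like(A)
    for j in range(A.shape[1]):
        v = A[:, j]
        # all projection coefficients computed from the ORIGINAL v
        u = v - Q[:, :j] @ (Q[:, :j].T @ v)
        Q[:, j] = u / np.linalg.norm(u)
    return Q

def mgs(A):
    """Modified Gram-Schmidt: project out one e_i at a time,
    recomputing each inner product from the UPDATED vector."""
    Q = np.zeros_like(A)
    for j in range(A.shape[1]):
        u = A[:, j].copy()
        for i in range(j):
            u -= np.dot(u, Q[:, i]) * Q[:, i]
        Q[:, j] = u / np.linalg.norm(u)
    return Q

eps = 1e-8
A = np.array([[1.,  1.,  1.],
              [eps, 0.,  0.],
              [0.,  eps, 0.],
              [0.,  0.,  eps]])

# loss of orthogonality: ||Q^T Q - I||
err = lambda Q: np.linalg.norm(Q.T @ Q - np.eye(A.shape[1]))
```

On this matrix, `err(mgs(A))` stays near $\varepsilon$ while `err(cgs(A))` is of order 1: classical Gram-Schmidt effectively loses orthogonality.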


Summary

Remark (Gram-Schmidt as the fundamental construction)

The Gram--Schmidt process is the algorithmic heart of inner product space theory:

  • It proves existence of orthonormal bases.
  • It computes the QR factorization, used in solving least-squares problems and eigenvalue algorithms.
  • It connects the abstract theory (inner product, orthogonality) to computational practice.
  • The process preserves the "progressive spans": $\operatorname{span}\{e_1, \ldots, e_j\} = \operatorname{span}\{v_1, \ldots, v_j\}$ for all $j$.