
Iterative Methods for Linear Systems

Iterative methods generate a sequence of approximations $x^{(k)} \to x$ converging to the solution of $Ax = b$. They are essential for large sparse systems where direct methods are too expensive in memory or computation.


Classical Iterative Methods

Definition 5.4 (Jacobi and Gauss-Seidel Methods)

Write $A = D - L - U$, where $D$ is the diagonal of $A$, $-L$ is its strictly lower triangular part, and $-U$ is its strictly upper triangular part. The Jacobi iteration is $x^{(k+1)} = D^{-1}(L + U)x^{(k)} + D^{-1}b$, i.e., $x_i^{(k+1)} = \frac{1}{a_{ii}}\left(b_i - \sum_{j \neq i} a_{ij} x_j^{(k)}\right)$. The Gauss-Seidel iteration uses the latest available values: $x_i^{(k+1)} = \frac{1}{a_{ii}}\left(b_i - \sum_{j < i} a_{ij} x_j^{(k+1)} - \sum_{j > i} a_{ij} x_j^{(k)}\right)$.
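Both sweeps can be sketched in a few lines of NumPy (the function names and the dense-matrix layout are illustrative; production codes use sparse storage):

```python
import numpy as np

def jacobi(A, b, x0, iters):
    """Jacobi sweep: every component uses only the previous iterate."""
    x = x0.copy()
    d = np.diag(A)                     # the diagonal D as a vector
    R = A - np.diagflat(d)             # off-diagonal part, R = -(L + U)
    for _ in range(iters):
        x = (b - R @ x) / d            # x^{(k+1)} = D^{-1}(b - R x^{(k)})
    return x

def gauss_seidel(A, b, x0, iters):
    """Gauss-Seidel sweep: each component uses the latest values."""
    x = x0.copy()
    n = len(b)
    for _ in range(iters):
        for i in range(n):
            s = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
            x[i] = (b[i] - s) / A[i, i]
    return x
```

Note that Jacobi updates all components from the old vector at once (and so parallelizes trivially), while Gauss-Seidel overwrites `x[i]` in place so later components see the new values.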

Definition 5.5 (Successive Over-Relaxation, SOR)

The SOR method with relaxation parameter $\omega \in (0, 2)$ is: $x_i^{(k+1)} = (1 - \omega) x_i^{(k)} + \frac{\omega}{a_{ii}}\left(b_i - \sum_{j < i} a_{ij} x_j^{(k+1)} - \sum_{j > i} a_{ij} x_j^{(k)}\right)$. For $\omega = 1$, SOR reduces to Gauss-Seidel. The iteration matrix is $M_\omega = (D - \omega L)^{-1}[(1-\omega)D + \omega U]$. Kahan's theorem gives $\rho(M_\omega) \geq |\omega - 1|$, so $0 < \omega < 2$ is necessary for convergence.
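The SOR update is a one-line change to the Gauss-Seidel sweep; a sketch under the same illustrative dense-matrix assumptions:

```python
import numpy as np

def sor(A, b, x0, omega, iters):
    """SOR sweep: blend the old value with the Gauss-Seidel candidate."""
    x = x0.copy()
    n = len(b)
    for _ in range(iters):
        for i in range(n):
            s = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
            gs = (b[i] - s) / A[i, i]            # Gauss-Seidel candidate
            x[i] = (1 - omega) * x[i] + omega * gs
    return x
```

With `omega = 1.0` each update reduces to the plain Gauss-Seidel step; $\omega > 1$ over-relaxes, $\omega < 1$ under-relaxes.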


Convergence Theory

Remark (Spectral Radius and Convergence)

An iterative method $x^{(k+1)} = Mx^{(k)} + c$ converges for every initial guess if and only if the spectral radius satisfies $\rho(M) < 1$. The asymptotic convergence rate is $r = -\log_{10}\rho(M)$: roughly $1/r$ iterations reduce the error by a factor of 10. For strictly diagonally dominant matrices ($|a_{ii}| > \sum_{j\neq i}|a_{ij}|$), both Jacobi and Gauss-Seidel converge. For symmetric positive definite (SPD) matrices, Gauss-Seidel always converges.
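The criterion can be checked numerically by assembling the iteration matrices from the splitting $A = D - L - U$; a small sketch for a strictly diagonally dominant matrix (the matrix is illustrative):

```python
import numpy as np

A = np.array([[4., 1.], [1., 3.]])     # strictly diagonally dominant
D = np.diag(np.diag(A))
L = -np.tril(A, k=-1)                  # so that A = D - L - U
U = -np.triu(A, k=1)

M_J = np.linalg.solve(D, L + U)        # Jacobi matrix D^{-1}(L + U)
M_GS = np.linalg.solve(D - L, U)       # Gauss-Seidel matrix (D - L)^{-1} U

spectral_radius = lambda M: max(abs(np.linalg.eigvals(M)))
# Both spectral radii are below 1, so both methods converge here,
# and Gauss-Seidel's smaller radius means faster asymptotic decay.
```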

Example (Optimal SOR Parameter)

For the discrete Laplacian on an $n \times n$ grid with mesh size $h = 1/(n+1)$, the Jacobi spectral radius is $\rho_J = \cos(\pi h)$. The optimal SOR parameter is $\omega^* = \frac{2}{1 + \sqrt{1 - \rho_J^2}} \approx 2 - 2\pi h$ for small $h$, giving $\rho(M_{\omega^*}) = \omega^* - 1 \approx 1 - 2\pi h$. This reduces the iteration count from $O(1/h^2)$ (Jacobi and Gauss-Seidel) to $O(1/h)$.
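The formulas can be checked numerically on the 1D model problem, the tridiagonal $(-1, 2, -1)$ Laplacian, whose Jacobi spectral radius is also $\cos(\pi h)$ (a sketch; the grid size is illustrative):

```python
import numpy as np

n = 50
h = 1.0 / (n + 1)
# 1D model Laplacian tridiag(-1, 2, -1); its Jacobi spectral radius is
# cos(pi h), the same formula as in the 2D case, so the optimal-omega
# theory applies to it as well.
A = 2*np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

rho_J = np.cos(np.pi * h)
omega_star = 2.0 / (1.0 + np.sqrt(1.0 - rho_J**2))

D = np.diag(np.diag(A))
L = -np.tril(A, k=-1)
U = -np.triu(A, k=1)
M = np.linalg.solve(D - omega_star*L, (1 - omega_star)*D + omega_star*U)
rho_star = max(abs(np.linalg.eigvals(M)))
# rho_star equals omega_star - 1 (about 1 - 2*pi*h), far below rho_J.
```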


Krylov Subspace Methods

Definition 5.6 (Conjugate Gradient Method)

For SPD $A$, the conjugate gradient (CG) method generates iterates $x^{(k)} \in x^{(0)} + \mathcal{K}_k(A, r^{(0)})$ that minimize $\|x - x^{(k)}\|_A$ over the Krylov subspace $\mathcal{K}_k = \mathrm{span}\{r^{(0)}, Ar^{(0)}, \ldots, A^{k-1}r^{(0)}\}$. Each iteration requires only one matrix-vector product, two inner products, and three vector updates. In exact arithmetic CG converges in at most $n$ iterations, and it satisfies $\|x - x^{(k)}\|_A \leq 2\left(\frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1}\right)^k \|x - x^{(0)}\|_A$, where $\kappa = \lambda_{\max}/\lambda_{\min}$ is the condition number.
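A minimal dense-NumPy sketch of the algorithm as described (one matrix-vector product, two inner products, and three vector updates per iteration):

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10):
    """Plain CG for SPD A, started from x^{(0)} = 0."""
    x = np.zeros_like(b)
    r = b - A @ x                      # initial residual r^{(0)}
    p = r.copy()                       # first search direction
    rs = r @ r
    for _ in range(10 * len(b)):       # generous cap; exact arithmetic needs <= n
        Ap = A @ p                     # the one matvec per iteration
        alpha = rs / (p @ Ap)          # exact line search along p
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p      # keep directions A-conjugate
        rs = rs_new
    return x
```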

Remark (Preconditioning)

Preconditioning replaces $Ax = b$ with $M^{-1}Ax = M^{-1}b$, where $M \approx A$ is easy to invert. Preconditioned CG converges at a rate governed by $\kappa(M^{-1}A) \ll \kappa(A)$. Common preconditioners include incomplete Cholesky ($M = \tilde{L}\tilde{L}^T$), algebraic multigrid (AMG), and domain decomposition methods. For a well-designed preconditioner, the iteration count becomes $O(1)$, independent of problem size.
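As a sketch, preconditioned CG can take the preconditioner solve as a function argument. For illustration, a simple diagonal (Jacobi) preconditioner stands in for the heavier options named above; the interface is the same:

```python
import numpy as np

def pcg(A, b, apply_Minv, tol=1e-10):
    """Preconditioned CG; apply_Minv(r) applies M^{-1} to a vector."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = apply_Minv(r)                  # preconditioned residual
    p = z.copy()
    rz = r @ z
    for _ in range(10 * len(b)):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = apply_Minv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p      # direction update uses z, not r
        rz = rz_new
    return x
```

For an incomplete Cholesky factor $\tilde{L}$, `apply_Minv` would instead perform two triangular solves with $\tilde{L}$ and $\tilde{L}^T$.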