
Optimization in Several Variables

Finding maxima and minima of multivariable functions requires extending the critical point analysis of single-variable calculus to higher dimensions, using the gradient and Hessian matrix.


Critical Points

Definition

A point $\mathbf{a}$ is a critical point of $f : \mathbb{R}^n \to \mathbb{R}$ if $\nabla f(\mathbf{a}) = \mathbf{0}$ (i.e., all partial derivatives vanish). A critical point is a local minimum if $f(\mathbf{x}) \geq f(\mathbf{a})$ for all $\mathbf{x}$ near $\mathbf{a}$, a local maximum if $f(\mathbf{x}) \leq f(\mathbf{a})$ for all $\mathbf{x}$ near $\mathbf{a}$, and a saddle point if it is neither.

Definition

The Hessian matrix of $f$ at $\mathbf{a}$ is the matrix of second partial derivatives:

$$H_f(\mathbf{a}) = \begin{pmatrix} f_{x_1 x_1} & f_{x_1 x_2} & \cdots & f_{x_1 x_n} \\ f_{x_2 x_1} & f_{x_2 x_2} & \cdots & f_{x_2 x_n} \\ \vdots & \vdots & \ddots & \vdots \\ f_{x_n x_1} & f_{x_n x_2} & \cdots & f_{x_n x_n} \end{pmatrix}$$

By Clairaut's theorem, $H_f$ is symmetric when $f \in C^2$.
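As a quick sketch of this definition (assuming SymPy is available), the Hessian can be computed symbolically; the function here is the same one used in the worked example later in this section:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 + y**2 - 2*x - 4*y + 5

# Hessian: matrix of second partial derivatives.
# It is symmetric, as Clairaut's theorem guarantees for C^2 functions.
H = sp.hessian(f, (x, y))
print(H)  # Matrix([[2, 0], [0, 2]])
```

Here the Hessian is constant because $f$ is quadratic; in general its entries are functions of the point at which it is evaluated.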


The Second Derivative Test

Theorem 8.4 (Second Derivative Test)

Let $f : \mathbb{R}^n \to \mathbb{R}$ be $C^2$ and $\mathbf{a}$ a critical point of $f$. Then:

  1. If $H_f(\mathbf{a})$ is positive definite (all eigenvalues $> 0$), then $\mathbf{a}$ is a local minimum.
  2. If $H_f(\mathbf{a})$ is negative definite (all eigenvalues $< 0$), then $\mathbf{a}$ is a local maximum.
  3. If $H_f(\mathbf{a})$ is indefinite (has both positive and negative eigenvalues), then $\mathbf{a}$ is a saddle point.
  4. If $H_f(\mathbf{a})$ is only semidefinite (has a zero eigenvalue but no eigenvalues of both signs), the test is inconclusive.
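The four cases of the theorem can be sketched as an eigenvalue check (a minimal illustration assuming NumPy; the function name and tolerance are choices made here, not from the text):

```python
import numpy as np

def classify_critical_point(H, tol=1e-9):
    """Classify a critical point from the eigenvalues of its symmetric Hessian."""
    eig = np.linalg.eigvalsh(H)  # eigvalsh: eigenvalues of a symmetric matrix
    if np.all(eig > tol):
        return "local minimum"        # positive definite
    if np.all(eig < -tol):
        return "local maximum"        # negative definite
    if np.any(eig > tol) and np.any(eig < -tol):
        return "saddle point"         # indefinite
    return "inconclusive"             # a (numerically) zero eigenvalue

# Hessian of f(x, y) = x^2 - y^2 at the origin is diag(2, -2): a saddle.
print(classify_critical_point(np.diag([2.0, -2.0])))  # saddle point
```

The tolerance `tol` guards against floating-point noise; symbolically, the boundary cases are exact zero eigenvalues.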
Example (Two-variable second derivative test)

For $f(x, y)$ with critical point $(a, b)$, let $D = f_{xx}f_{yy} - (f_{xy})^2$ (the determinant of the Hessian), with all derivatives evaluated at $(a, b)$:

  • $D > 0$ and $f_{xx} > 0$: local minimum
  • $D > 0$ and $f_{xx} < 0$: local maximum
  • $D < 0$: saddle point
  • $D = 0$: inconclusive

For $f(x,y) = x^2 + y^2 - 2x - 4y + 5$: the critical point is $(1, 2)$, with $D = 4 > 0$ and $f_{xx} = 2 > 0$, so $(1, 2)$ is a local minimum with $f(1,2) = 0$.
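The worked example above can be checked end to end (a sketch assuming SymPy): solve $\nabla f = \mathbf{0}$ for the critical point, then evaluate $D$ and $f_{xx}$.

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 + y**2 - 2*x - 4*y + 5

# Critical points: solve grad f = 0
grad = [sp.diff(f, v) for v in (x, y)]
crit = sp.solve(grad, (x, y))   # {x: 1, y: 2}

# Second derivative test quantities
H = sp.hessian(f, (x, y))
D = H.det()                     # f_xx * f_yy - f_xy^2 = 4
fxx = H[0, 0]                   # 2

# D > 0 and f_xx > 0, so (1, 2) is a local minimum with value f(1, 2) = 0
print(crit, D, fxx, f.subs(crit))
```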


Lagrange Multipliers

Remark (Constrained optimization)

To optimize $f(\mathbf{x})$ subject to a constraint $g(\mathbf{x}) = 0$, the method of Lagrange multipliers states that at a constrained extremum where $\nabla g \neq \mathbf{0}$, we have $\nabla f = \lambda \nabla g$ for some scalar $\lambda$. This condition says the gradient of $f$ is parallel to the gradient of $g$ at the optimum: $f$ has zero rate of change in every direction tangent to the constraint set, so it can only increase or decrease by moving off the constraint.
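The Lagrange condition reduces constrained optimization to solving the system $\nabla f = \lambda \nabla g$, $g = 0$. A minimal sketch (assuming SymPy; the objective and constraint here are illustrative choices, not from the text) minimizes $f(x,y) = x^2 + y^2$ on the line $x + y = 1$:

```python
import sympy as sp

x, y, lam = sp.symbols('x y lam')
f = x**2 + y**2      # objective (illustrative choice)
g = x + y - 1        # constraint g(x, y) = 0 (illustrative choice)

# Lagrange conditions: grad f = lam * grad g, together with g = 0
eqs = [sp.diff(f, x) - lam * sp.diff(g, x),
       sp.diff(f, y) - lam * sp.diff(g, y),
       g]
sols = sp.solve(eqs, (x, y, lam), dict=True)
print(sols)  # single candidate: x = y = 1/2, lam = 1, where f = 1/2
```

Note that solving the Lagrange system only yields candidate points; deciding whether each is a minimum, maximum, or neither requires a separate argument.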