
Optimization in Several Variables

Finding maxima and minima of multivariable functions requires extending the critical point analysis of single-variable calculus to higher dimensions, using the gradient and Hessian matrix.


Critical Points

Definition

A point $\mathbf{a}$ is a critical point of $f : \mathbb{R}^n \to \mathbb{R}$ if $\nabla f(\mathbf{a}) = \mathbf{0}$ (i.e., all partial derivatives vanish). A critical point is a local minimum if $f(\mathbf{x}) \geq f(\mathbf{a})$ for all $\mathbf{x}$ near $\mathbf{a}$, a local maximum if $f(\mathbf{x}) \leq f(\mathbf{a})$ for all $\mathbf{x}$ near $\mathbf{a}$, and a saddle point if it is neither.

Definition

The Hessian matrix of $f$ at $\mathbf{a}$ is the matrix of second partial derivatives:

$$H_f(\mathbf{a}) = \begin{pmatrix} f_{x_1 x_1} & f_{x_1 x_2} & \cdots & f_{x_1 x_n} \\ f_{x_2 x_1} & f_{x_2 x_2} & \cdots & f_{x_2 x_n} \\ \vdots & \vdots & \ddots & \vdots \\ f_{x_n x_1} & f_{x_n x_2} & \cdots & f_{x_n x_n} \end{pmatrix}$$

By Clairaut's theorem, $H_f$ is symmetric when $f \in C^2$.
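As a quick sketch of this definition (assuming SymPy is available), the Hessian can be computed symbolically; the function here is the same one used in the worked example later in this section:

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 + y**2 - 2*x - 4*y + 5

# Hessian: matrix of second partial derivatives.
# It is symmetric, as Clairaut's theorem guarantees for C^2 functions.
H = sp.hessian(f, (x, y))
print(H)  # Matrix([[2, 0], [0, 2]])
```

Here the Hessian is constant because $f$ is quadratic; in general its entries are functions of the point at which it is evaluated.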


The Second Derivative Test

Theorem 8.4 (Second Derivative Test)

Let $f : \mathbb{R}^n \to \mathbb{R}$ be $C^2$ and $\mathbf{a}$ a critical point of $f$. Then:

  1. If $H_f(\mathbf{a})$ is positive definite (all eigenvalues $> 0$), then $\mathbf{a}$ is a local minimum.
  2. If $H_f(\mathbf{a})$ is negative definite (all eigenvalues $< 0$), then $\mathbf{a}$ is a local maximum.
  3. If $H_f(\mathbf{a})$ is indefinite (has both positive and negative eigenvalues), then $\mathbf{a}$ is a saddle point.
  4. If $H_f(\mathbf{a})$ is only semidefinite (has a zero eigenvalue but no eigenvalues of both signs), the test is inconclusive.
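The four cases of the theorem can be sketched as an eigenvalue check (a minimal illustration assuming NumPy; the function name and tolerance are choices made here, not from the text):

```python
import numpy as np

def classify_critical_point(H, tol=1e-9):
    """Classify a critical point from the eigenvalues of its symmetric Hessian."""
    eig = np.linalg.eigvalsh(H)  # eigvalsh: eigenvalues of a symmetric matrix
    if np.all(eig > tol):
        return "local minimum"        # positive definite
    if np.all(eig < -tol):
        return "local maximum"        # negative definite
    if np.any(eig > tol) and np.any(eig < -tol):
        return "saddle point"         # indefinite
    return "inconclusive"             # a (numerically) zero eigenvalue

# Hessian of f(x, y) = x^2 - y^2 at the origin is diag(2, -2): a saddle.
print(classify_critical_point(np.diag([2.0, -2.0])))  # saddle point
```

The tolerance `tol` guards against floating-point noise; symbolically, the boundary cases are exact zero eigenvalues.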
Example (Two-variable second derivative test)

For $f(x, y)$ with critical point $(a, b)$, let $D = f_{xx}f_{yy} - (f_{xy})^2$ (the determinant of the Hessian), with all derivatives evaluated at $(a, b)$:

  • $D > 0$ and $f_{xx} > 0$: local minimum
  • $D > 0$ and $f_{xx} < 0$: local maximum
  • $D < 0$: saddle point
  • $D = 0$: inconclusive

For $f(x,y) = x^2 + y^2 - 2x - 4y + 5$: the critical point is $(1, 2)$, with $D = 4 > 0$ and $f_{xx} = 2 > 0$, so $(1, 2)$ is a local minimum with $f(1,2) = 0$.
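The worked example above can be checked end to end (a sketch assuming SymPy): solve $\nabla f = \mathbf{0}$ for the critical point, then evaluate $D$ and $f_{xx}$.

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 + y**2 - 2*x - 4*y + 5

# Critical points: solve grad f = 0
grad = [sp.diff(f, v) for v in (x, y)]
crit = sp.solve(grad, (x, y))   # {x: 1, y: 2}

# Second derivative test quantities
H = sp.hessian(f, (x, y))
D = H.det()                     # f_xx * f_yy - f_xy^2 = 4
fxx = H[0, 0]                   # 2

# D > 0 and f_xx > 0, so (1, 2) is a local minimum with value f(1, 2) = 0
print(crit, D, fxx, f.subs(crit))
```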


Lagrange Multipliers

Remark (Constrained optimization)

To optimize $f(\mathbf{x})$ subject to a constraint $g(\mathbf{x}) = 0$, the method of Lagrange multipliers states that at a constrained extremum where $\nabla g \neq \mathbf{0}$, we have $\nabla f = \lambda \nabla g$ for some scalar $\lambda$. This condition says the gradient of $f$ is parallel to the gradient of $g$ at the optimum: $f$ has zero rate of change in every direction tangent to the constraint set, so it can only increase or decrease by moving off the constraint.
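The Lagrange condition reduces constrained optimization to solving the system $\nabla f = \lambda \nabla g$, $g = 0$. A minimal sketch (assuming SymPy; the objective and constraint here are illustrative choices, not from the text) minimizes $f(x,y) = x^2 + y^2$ on the line $x + y = 1$:

```python
import sympy as sp

x, y, lam = sp.symbols('x y lam')
f = x**2 + y**2      # objective (illustrative choice)
g = x + y - 1        # constraint g(x, y) = 0 (illustrative choice)

# Lagrange conditions: grad f = lam * grad g, together with g = 0
eqs = [sp.diff(f, x) - lam * sp.diff(g, x),
       sp.diff(f, y) - lam * sp.diff(g, y),
       g]
sols = sp.solve(eqs, (x, y, lam), dict=True)
print(sols)  # single candidate: x = y = 1/2, lam = 1, where f = 1/2
```

Note that solving the Lagrange system only yields candidate points; deciding whether each is a minimum, maximum, or neither requires a separate argument.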