The Method of Lagrangian Multipliers and The KKT Conditions

This article simply introduced strategies for finding the stationary points of the objective function subject to one or more equality or inequality constraints.

Consider a standard form of continuous optimization problem, $$ \min\limits_{\bf x} f({\bf x}) \ {\rm s.t.};;g_k({\bf x})\leq0,;;k=1,2,\cdots,m \ h_l({\bf x})=0,;;l=1,2,\cdots,p \ $$ in which $$ {\bf x} \in \Re^{n} \ f,g_k,h_l:\Re^n \rightarrow \Re;;{\rm for};;k=1,2,\cdots,m;;{\rm and};;l=1,2,\cdots,p \ m,p\geq0 $$ And $f,g_k,h_l$ are all continuous differentable.

We divided the problem into two cases: $p = 0$ or $p \neq 0$. For the former we introduced The Method of Lagrange Multipliers as the solving strategy, and simply introduced KKT Conditions for the other one when it suits some Regularity Conditions.

Notice that it was easy to treat a maximization problem by negating the objective function, we only use the maximization problem as a general example.

The Method of Lagrange Multipliers

Consider the optimization problem $$ \min_{\bf x}f({\bf x}) \ {\rm s.t.};;g_k({\bf x})=0;;{\rm for};;k=1,2,\cdots,m. $$ To transfer the constrained problem into an unconstrained one, we define the Lagrange Function, $$ L({\bf x, \overrightarrow{\lambda}})=f({\bf x})+\sum_{k=1}^m \lambda_k g_k({\bf x}) $$ In which $\lambda_{k} \in \Re ;; {\rm for} ;; k=1,2,\cdots,m$. Then we have the necessary conditions for the optimal solution, which is $$ \left{\begin{matrix} \bigtriangledown_{\bf x}L={\bf 0} \ \bigtriangledown_{\bf \overrightarrow{\lambda}}L={\bf 0} \end{matrix}\right. $$ called Stationary Equation and Constraints separately. Then solving the $n+m$ simultaneous equations, and the solution $({\bf x^},\overrightarrow{\lambda}!^)$ are stationary points and corresponding coefficients.

${\bf N{\scriptsize OTE}}.$ Cause we could get stationary point through Lagrange Multipliers directly, the minimization/maximization problems are treated in same way.

${\bf E{\scriptsize XAMPLE}}.$ Maximize $f(x,y,z) = 8xyz$ subject to $\dfrac{x^2}{a^2}+\dfrac{y^2}{b^2}+\dfrac{z^2}{c^2}=1$.

${\bf S{\scriptsize OLUTION}}.$ Form the Lagrange Function $$ L(x,y,z,\lambda)=8xyz+\lambda(\dfrac{x^2}{a^2}+\dfrac{y^2}{b^2}+\dfrac{z^2}{c^2}-1) $$ Then calculate gradient of the function $L$ and set it to $\bf 0$. $$ \bigtriangledown L=\begin{bmatrix} (8yz+\dfrac{2\lambda x}{a^2}) & (8xz+\dfrac{2\lambda y}{b^2}) & (8xy+\dfrac{2\lambda z}{c^2}) & (\dfrac{x^2}{a^2}+\dfrac{y^2}{b^2}+\dfrac{z^2}{c^2}-1) \end{bmatrix} = {\bf 0} $$ We could get the solution $$ \left{\begin{matrix} x=\dfrac{\sqrt{3}}{3}a \ y=\dfrac{\sqrt{3}}{3}b \ z=\dfrac{\sqrt{3}}{3}c \ \end{matrix}\right. $$ Considering the background of the question, the maximum solution must exist. Now we can get the answer $$ f_{\rm max}(x,y,z)=\dfrac{8\sqrt{3}}{9}abc $$

KKT Conditions

For convinience, we consider the case without equality constraint but with a single inequality constraint first, $$ \min_{\bf x}f({\bf x}) \ {\rm s.t.} ;; g({\bf x}) \leq 0. $$ Then define the feasible region $\mathbb{K}=\left{ \Re^n,|,g({\bf x}) \leq 0 \right}$ and assuming that ${\bf x}^$ is the best solution under the constraint condition $g$. According to whether ${\bf x}^$ is on the border of $\mathbb{K}$ or not, we can divide the problem into two cases and discuss them separately.

${\bf C{\scriptsize ASE}; 1}.;; g({\bf x}^) <0.$ The best solution is inside $\mathbb{K}$. At this time we call ${\bf x}^$ as the interior solution. Obviously, at this time, an infinitesimal displacement of the point towards any direction will not against the constraint, so we call that the constraint condition$g$ is inactive.

${\bf C{\scriptsize ASE};2}.;;g({\bf x}^)=0.$ The best solution is on the border of $\mathbb{K}$. At this time we call ${\bf x}^$ as the boundary solution. Correspondingly, now we call that the constraint condition $g$ is active.

Likely, defining the Lagrangian Function $$ L({\bf x},\lambda)=f({\bf x})+\lambda g({\bf x}) $$ According to $g$ is active or inactive, the necessary condition for us to get the best solution are different.

${\bf C{\scriptsize ASE};I{\scriptsize NACTIVE}}. $Cause the constraint condition $g$ has no influence on getting the best solution, we could make $\lambda = 0$ directly. Now the task is equivalent to unconstrained optimization, and only $\bigtriangledown f={\bf 0}$ is needed.

${\bf C{\scriptsize ASE} ; A{\scriptsize CTIVE}}.$ Now the constraint condition $g$ is equivalent to $$ g({\bf x})=0 $$ Notice that for every points ${\bf x} \in g$, there is $\bigtriangledown g({\bf x})$ orthogonal to $g$. Likely, it is obvious to find that $\bigtriangledown f({\bf x}^)$ is also orthogonal to $g$. So, we can easily prove that $\bigtriangledown f \in {\rm span}(\bigtriangledown g)$ at ${\bf x}^$. That is to say, there exists $\lambda$ which makes that $$ \bigtriangledown_{\bf x} f=-\lambda \bigtriangledown_{\bf x} g $$ It’s easy to find that $\lambda \geq 0$ should be kept, cause we want to minimize $f$, and $\bigtriangledown f({\bf x}^*)$ (pointing to the fastest growing direction) should point to the interior of $\mathbb{K}$. However, $\bigtriangledown g$ points to the outside of $\mathbb{K}$, so $\lambda$ should be kept not less than$0$, which is called dual feasibility. Likely, if we want to maximize $f$, we should keep $\lambda \leq 0$.

Obviously, there will always be either $\lambda$ or $g$ equal to $0$, so it always holds that $\lambda g({\bf x})=0$, which is called complementary slackness.

Thus we can summarize all necessary conditions mentioned above as KKT Conditions,

$$ \begin{aligned} \bigtriangledown_{\bf x} f + \lambda \bigtriangledown_{\bf x} g &= {\bf 0} \ {\bf g}({\bf x}) &\leq 0 \ \lambda &\geq 0 \ \lambda g({\bf x}) &= 0 \end{aligned} $$

Similarly, we can also extend the conclusion to the general continuous optimization problem. The corresponding Lagrangian Function is defined as $$ L({\bf x}, \overrightarrow \lambda, \overrightarrow \mu)=f({\bf x})+\overrightarrow \lambda!^\top {\bf g}({\bf x}) + \overrightarrow \mu!^\top {\bf h}({\bf x}) $$ And it’s also convinient to write down the corresponding KKT Conditions $$ \begin{aligned} \bigtriangledown_{\bf x} f + \overrightarrow \lambda!^\top \bigtriangledown_{\bf x}{\bf g}+\overrightarrow \mu!^\top\bigtriangledown_{\bf x}{\bf h} &= {\bf 0} \ g_k({\bf x}) &\leq 0, ;; k=1,2,\cdots, m \ h_l({\bf x}) &= 0, ;; l=1,2,\cdots, p \ \lambda_k &\geq 0,\ \lambda_k g_k({\bf x}) &= 0 \end{aligned} $$

${\bf N{\scriptsize OTE}}.$ In order for existing a point ${\bf x}^*$ fitting the KKT Conditions, The primal question should satisfy some regular conditions, which has been listed on Wikipedia.

${\bf E{\scriptsize XAMPLE}}.$ Minimize $x_1^2 + x_2^2$ subject to $x_1 + x_2=1$ and $x_2 \leq \alpha$, in which $x_1, x_2, \alpha \in \Re$.

${\bf S{\scriptsize OLUTION}}.$ The corresponding Langrangian Function is $$ L({\bf x}, \lambda, \mu)=x_1^2 + x_2^2 +\lambda(x_2-\alpha)+\mu(x_1 + x_2 - 1) $$ According to KKT Condition, there must be $$ \left { \begin{aligned} \frac{\partial L}{\partial x_1} = \frac{\partial L}{\partial x_2} &= 0 \ x_1 + x_2 &= 1 \ x_2 &\leq \alpha \ \lambda &\geq 0 \ \lambda(x_2 - \alpha) &= 0 \end{aligned} \right. $$ which is equivalent to $$ \left { \begin{aligned} x_1 &= -\frac{\mu}{2} \ x_2 &= \frac{\mu}{2} + 1 \ \mu &\leq -1 \ \mu &\leq 2\alpha - 2 \end{aligned} \right. $$ Now we can divide the problem into 2 cases according to whether $2\alpha - 2 \geq -1$ or not.

${\bf C{\scriptsize ASE}};;\alpha \geq \dfrac{1}{2}.$ It is easy to verify that $\mu = -1$ satisfies all KKT Conditions above, so when $x_1 = x_2 = \dfrac{1}{2}$, $x_1^2 + x_2^2$ takes minimum value $\dfrac{1}{2}$.

${\bf C{\scriptsize ASE}};;\alpha < \dfrac{1}{2}.$ There is $\mu=2\alpha - 2$ satisfies all KKT Conditions above only, so when $x_1=1-\alpha$and$x_2 = \alpha$, $x_1^2 + x_2^2$ takes minimum value $1 - 2\alpha + 2\alpha^2$.