This article simply introduced strategies for finding the stationary points of the objective function subject to one or more equality or inequality constraints.

Consider a standard form of continuous optimization problem, $$ \min\limits_{\bf x} f({\bf x}) \ {\rm s.t.};;g_k({\bf x})\leq0,;;k=1,2,\cdots,m \ h_l({\bf x})=0,;;l=1,2,\cdots,p \ $$ in which $$ {\bf x} \in \Re^{n} \ f,g_k,h_l:\Re^n \rightarrow \Re;;{\rm for};;k=1,2,\cdots,m;;{\rm and};;l=1,2,\cdots,p \ m,p\geq0 $$ And $f,g_k,h_l$ are all continuous differentable.

We divided the problem into two cases: $p = 0$ or $p \neq 0$. For the former we introduced **The Method of Lagrange Multipliers** as the solving strategy, and simply introduced **KKT Conditions** for the other one when it suits some **Regularity Conditions**.

Notice that it was easy to treat a maximization problem by negating the objective function, we only use the maximization problem as a general example.

## The Method of Lagrange Multipliers

Consider the optimization problem
$$
\min_{\bf x}f({\bf x}) \
{\rm s.t.};;g_k({\bf x})=0;;{\rm for};;k=1,2,\cdots,m.
$$
To transfer the constrained problem into an unconstrained one, we define the **Lagrange Function**,
$$
L({\bf x, \overrightarrow{\lambda}})=f({\bf x})+\sum_{k=1}^m \lambda_k g_k({\bf x})
$$
In which $\lambda_{k} \in \Re ;; {\rm for} ;; k=1,2,\cdots,m$. Then we have the **necessary conditions** for the optimal solution, which is
$$
\left{\begin{matrix}
\bigtriangledown_{\bf x}L={\bf 0} \
\bigtriangledown_{\bf \overrightarrow{\lambda}}L={\bf 0}
\end{matrix}\right.
$$
called **Stationary Equation** and **Constraints** separately. Then solving the $n+m$ simultaneous equations, and the solution $({\bf x^*},\overrightarrow{\lambda}!^*)$ are stationary points and corresponding coefficients.

${\bf N{\scriptsize OTE}}.$ *Cause we could get stationary point through Lagrange Multipliers directly, the minimization/maximization problems are treated in same way.*

${\bf E{\scriptsize XAMPLE}}.$ Maximize $f(x,y,z) = 8xyz$ subject to $\dfrac{x^2}{a^2}+\dfrac{y^2}{b^2}+\dfrac{z^2}{c^2}=1$.

${\bf S{\scriptsize OLUTION}}.$ Form the Lagrange Function $$ L(x,y,z,\lambda)=8xyz+\lambda(\dfrac{x^2}{a^2}+\dfrac{y^2}{b^2}+\dfrac{z^2}{c^2}-1) $$ Then calculate gradient of the function $L$ and set it to $\bf 0$. $$ \bigtriangledown L=\begin{bmatrix} (8yz+\dfrac{2\lambda x}{a^2}) & (8xz+\dfrac{2\lambda y}{b^2}) & (8xy+\dfrac{2\lambda z}{c^2}) & (\dfrac{x^2}{a^2}+\dfrac{y^2}{b^2}+\dfrac{z^2}{c^2}-1) \end{bmatrix} = {\bf 0} $$ We could get the solution $$ \left{\begin{matrix} x=\dfrac{\sqrt{3}}{3}a \ y=\dfrac{\sqrt{3}}{3}b \ z=\dfrac{\sqrt{3}}{3}c \ \end{matrix}\right. $$ Considering the background of the question, the maximum solution must exist. Now we can get the answer $$ f_{\rm max}(x,y,z)=\dfrac{8\sqrt{3}}{9}abc $$

## KKT Conditions

For convinience, we consider the case without equality constraint but with a single inequality constraint first,
$$
\min_{\bf x}f({\bf x}) \
{\rm s.t.} ;; g({\bf x}) \leq 0.
$$
Then define the **feasible region** $\mathbb{K}=\left{ \Re^n,|,g({\bf x}) \leq 0 \right}$ and assuming that ${\bf x}^*$ is the best solution under the constraint condition $g$. According to whether ${\bf x}^*$ is on the border of $\mathbb{K}$ or not, we can divide the problem into two cases and discuss them separately.

${\bf C{\scriptsize ASE}; 1}.;; g({\bf x}^*) <0.$ The best solution is inside $\mathbb{K}$. At this time we call ${\bf x}^*$ as the

**interior solution**. Obviously, at this time, an infinitesimal displacement of the point towards any direction will not against the constraint, so we call that the constraint condition$g$ is

**inactive**.

${\bf C{\scriptsize ASE};2}.;;g({\bf x}^*)=0.$ The best solution is on the border of $\mathbb{K}$. At this time we call ${\bf x}^*$ as the

**boundary solution**. Correspondingly, now we call that the constraint condition $g$ is

**active**.

Likely, defining the Lagrangian Function
$$
L({\bf x},\lambda)=f({\bf x})+\lambda g({\bf x})
$$
According to $g$ is *active* or *inactive*, the necessary condition for us to get the *best* solution are different.

${\bf C{\scriptsize ASE};I{\scriptsize NACTIVE}}. $Cause the constraint condition $g$ has no influence on getting the *best* solution, we could make $\lambda = 0$ directly. Now the task is equivalent to *unconstrained optimization*, and only $\bigtriangledown f={\bf 0}$ is needed.

${\bf C{\scriptsize ASE} ; A{\scriptsize CTIVE}}.$ Now the constraint condition $g$ is equivalent to
$$
g({\bf x})=0
$$
Notice that for every points ${\bf x} \in g$, there is $\bigtriangledown g({\bf x})$ orthogonal to $g$. Likely, it is obvious to find that $\bigtriangledown f({\bf x}^*)$ is also orthogonal to $g$. So, we can easily prove that $\bigtriangledown f \in {\rm span}(\bigtriangledown g)$ at ${\bf x}^*$. That is to say, there exists $\lambda$ which makes that
$$
\bigtriangledown_{\bf x} f=-\lambda \bigtriangledown_{\bf x} g
$$
It’s easy to find that $\lambda \geq 0$ should be kept, cause we want to minimize $f$, and $\bigtriangledown f({\bf x}^*)$ (pointing to the fastest growing direction) should point to the interior of $\mathbb{K}$. However, $\bigtriangledown g$ points to the outside of $\mathbb{K}$, so $\lambda$ should be kept not less than$0$, which is called **dual feasibility**. Likely, if we want to maximize $f$, we should keep $\lambda \leq 0$.

Obviously, there will always be either $\lambda$ or $g$ equal to $0$, so it always holds that $\lambda g({\bf x})=0$, which is called **complementary slackness**.

Thus we can summarize all necessary conditions mentioned above as **KKT Conditions**,

$$ \begin{aligned} \bigtriangledown_{\bf x} f + \lambda \bigtriangledown_{\bf x} g &= {\bf 0} \ {\bf g}({\bf x}) &\leq 0 \ \lambda &\geq 0 \ \lambda g({\bf x}) &= 0 \end{aligned} $$

Similarly, we can also extend the conclusion to the general continuous optimization problem. The corresponding Lagrangian Function is defined as $$ L({\bf x}, \overrightarrow \lambda, \overrightarrow \mu)=f({\bf x})+\overrightarrow \lambda!^\top {\bf g}({\bf x}) + \overrightarrow \mu!^\top {\bf h}({\bf x}) $$ And it’s also convinient to write down the corresponding KKT Conditions $$ \begin{aligned} \bigtriangledown_{\bf x} f + \overrightarrow \lambda!^\top \bigtriangledown_{\bf x}{\bf g}+\overrightarrow \mu!^\top\bigtriangledown_{\bf x}{\bf h} &= {\bf 0} \ g_k({\bf x}) &\leq 0, ;; k=1,2,\cdots, m \ h_l({\bf x}) &= 0, ;; l=1,2,\cdots, p \ \lambda_k &\geq 0,\ \lambda_k g_k({\bf x}) &= 0 \end{aligned} $$

${\bf N{\scriptsize OTE}}.$ In order for existing a point ${\bf x}^*$ fitting the KKT Conditions, The primal question should satisfy some regular conditions, which has been listed on Wikipedia.

${\bf E{\scriptsize XAMPLE}}.$ Minimize $x_1^2 + x_2^2$ subject to $x_1 + x_2=1$ and $x_2 \leq \alpha$, in which $x_1, x_2, \alpha \in \Re$.

${\bf S{\scriptsize OLUTION}}.$ The corresponding Langrangian Function is $$ L({\bf x}, \lambda, \mu)=x_1^2 + x_2^2 +\lambda(x_2-\alpha)+\mu(x_1 + x_2 - 1) $$ According to KKT Condition, there must be $$ \left { \begin{aligned} \frac{\partial L}{\partial x_1} = \frac{\partial L}{\partial x_2} &= 0 \ x_1 + x_2 &= 1 \ x_2 &\leq \alpha \ \lambda &\geq 0 \ \lambda(x_2 - \alpha) &= 0 \end{aligned} \right. $$ which is equivalent to $$ \left { \begin{aligned} x_1 &= -\frac{\mu}{2} \ x_2 &= \frac{\mu}{2} + 1 \ \mu &\leq -1 \ \mu &\leq 2\alpha - 2 \end{aligned} \right. $$ Now we can divide the problem into 2 cases according to whether $2\alpha - 2 \geq -1$ or not.

${\bf C{\scriptsize ASE}};;\alpha \geq \dfrac{1}{2}.$ It is easy to verify that $\mu = -1$ satisfies all KKT Conditions above, so when $x_1 = x_2 = \dfrac{1}{2}$, $x_1^2 + x_2^2$ takes minimum value $\dfrac{1}{2}$.

${\bf C{\scriptsize ASE}};;\alpha < \dfrac{1}{2}.$ There is $\mu=2\alpha - 2$ satisfies all KKT Conditions above only, so when $x_1=1-\alpha$and$x_2 = \alpha$, $x_1^2 + x_2^2$ takes minimum value $1 - 2\alpha + 2\alpha^2$.