The Method of Lagrangian Multipliers and The KKT Conditions

This article simply introduced strategies for finding the stationary points of the objective function subject to one or more equality or inequality constraints.

Consider a standard form of continuous optimization problem, minxf(x) s.t.;;gk(x)0,;;k=1,2,,m hl(x)=0,;;l=1,2,,p  in which xn f,gk,hl:n;;for;;k=1,2,,m;;and;;l=1,2,,p m,p0 And f,gk,hl are all continuous differentable.

We divided the problem into two cases: p=0 or p0. For the former we introduced The Method of Lagrange Multipliers as the solving strategy, and simply introduced KKT Conditions for the other one when it suits some Regularity Conditions.

Notice that it was easy to treat a maximization problem by negating the objective function, we only use the maximization problem as a general example.

The Method of Lagrange Multipliers

Consider the optimization problem minxf(x) s.t.;;gk(x)=0;;for;;k=1,2,,m. To transfer the constrained problem into an unconstrained one, we define the Lagrange Function, L(x,λ)=f(x)+k=1mλkgk(x) In which λk;;for;;k=1,2,,m. Then we have the necessary conditions for the optimal solution, which is $$ \left{xL=0 λL=0\right. $$ called Stationary Equation and Constraints separately. Then solving the n+m simultaneous equations, and the solution $({\bf x^},\overrightarrow{\lambda}!^)$ are stationary points and corresponding coefficients.

NOTE. Cause we could get stationary point through Lagrange Multipliers directly, the minimization/maximization problems are treated in same way.

EXAMPLE. Maximize f(x,y,z)=8xyz subject to x2a2+y2b2+z2c2=1.

SOLUTION. Form the Lagrange Function L(x,y,z,λ)=8xyz+λ(x2a2+y2b2+z2c21) Then calculate gradient of the function L and set it to 0. L=[(8yz+2λxa2)(8xz+2λyb2)(8xy+2λzc2)(x2a2+y2b2+z2c21)]=0 We could get the solution $$ \left{x=33a y=33b z=33c \right. Consideringthebackgroundofthequestion,themaximumsolutionmustexist.Nowwecangettheanswer f_{\rm max}(x,y,z)=\dfrac{8\sqrt{3}}{9}abc $$

KKT Conditions

For convinience, we consider the case without equality constraint but with a single inequality constraint first, minxf(x) s.t.;;g(x)0. Then define the feasible region Missing or unrecognized delimiter for \left and assuming that ${\bf x}^isthebestsolutionundertheconstraintconditiong.Accordingtowhether{\bf x}^isontheborderof\mathbb{K}$ or not, we can divide the problem into two cases and discuss them separately.

${\bf C{\scriptsize ASE}; 1}.;; g({\bf x}^) <0.$ The best solution is inside K. At this time we call ${\bf x}^$ as the interior solution. Obviously, at this time, an infinitesimal displacement of the point towards any direction will not against the constraint, so we call that the constraint conditiong is inactive.

${\bf C{\scriptsize ASE};2}.;;g({\bf x}^)=0.$ The best solution is on the border of K. At this time we call ${\bf x}^$ as the boundary solution. Correspondingly, now we call that the constraint condition g is active.

Likely, defining the Lagrangian Function L(x,λ)=f(x)+λg(x) According to g is active or inactive, the necessary condition for us to get the best solution are different.

CASE;INACTIVE.Cause the constraint condition g has no influence on getting the best solution, we could make λ=0 directly. Now the task is equivalent to unconstrained optimization, and only f=0 is needed.

CASE;ACTIVE. Now the constraint condition g is equivalent to g(x)=0 Notice that for every points xg, there is g(x) orthogonal to g. Likely, it is obvious to find that $\bigtriangledown f({\bf x}^)isalsoorthogonaltog.So,wecaneasilyprovethat\bigtriangledown f \in {\rm span}(\bigtriangledown g)at{\bf x}^.Thatistosay,thereexists\lambdawhichmakesthatxf=λxgItseasytofindthat\lambda \geq 0shouldbekept,causewewanttominimizef,and\bigtriangledown f({\bf x}^*)(pointingtothefastestgrowingdirection)shouldpointtotheinteriorof\mathbb{K}.However,\bigtriangledown gpointstotheoutsideof\mathbb{K},so\lambdashouldbekeptnotlessthan0$, which is called dual feasibility. Likely, if we want to maximize f, we should keep λ0.

Obviously, there will always be either λ or g equal to 0, so it always holds that λg(x)=0, which is called complementary slackness.

Thus we can summarize all necessary conditions mentioned above as KKT Conditions,

xf+λxg=0 g(x)0 λ0 λg(x)=0

Similarly, we can also extend the conclusion to the general continuous optimization problem. The corresponding Lagrangian Function is defined as L(x,λ,μ)=f(x)+λ!g(x)+μ!h(x) And it’s also convinient to write down the corresponding KKT Conditions xf+λ!xg+μ!xh=0 gk(x)0,;;k=1,2,,m hl(x)=0,;;l=1,2,,p λk0, λkgk(x)=0

NOTE. In order for existing a point x fitting the KKT Conditions, The primal question should satisfy some regular conditions, which has been listed on Wikipedia.

EXAMPLE. Minimize x12+x22 subject to x1+x2=1 and x2α, in which x1,x2,α.

SOLUTION. The corresponding Langrangian Function is L(x,λ,μ)=x12+x22+λ(x2α)+μ(x1+x21) According to KKT Condition, there must be $$ \left { Lx1=Lx2=0 x1+x2=1 x2α λ0 λ(x2α)=0 \right. whichisequivalentto \left { x1=μ2 x2=μ2+1 μ1 μ2α2 \right. $$ Now we can divide the problem into 2 cases according to whether 2α21 or not.

CASE;;α12. It is easy to verify that μ=1 satisfies all KKT Conditions above, so when x1=x2=12, x12+x22 takes minimum value 12.

CASE;;α<12. There is μ=2α2 satisfies all KKT Conditions above only, so when x1=1αandx2=α, x12+x22 takes minimum value 12α+2α2.

updatedupdated2023-03-112023-03-11