Review of Quantitative Methods: Lectures on Optimization in Economic Theory
Qing Liu
Department of Economics
Chinese University of Hong Kong
September, 2014
Qing (CUHK) Math Camp September, 2014 1 / 56
1. Unconstrained Optimization
1.1 One Variable Function
Consider a C2 function f (x) of one variable
Definitions
Local versus global max (or min)
Interior versus boundary max (or min)
Critical point x0 : f ′ (x0) = 0.
First order (necessary) condition
If x0 is an interior max or min of f , then x0 is a critical point
Note that the converse is not true!
e.g., f (x) = x3
Second order derivative: curvature of the curve
f ′′ (x) > 0: the slope of the curve increases.
f ′′ (x) < 0: the slope of the curve decreases.
1.1 One Variable Function: SOC
Second order (sufficient) condition
If f ′ (x0) = 0 and f ′′ (x0) < 0, then x0 is a local max of f . (f ′ changes sign from + to −.)
If f ′ (x0) = 0 and f ′′ (x0) > 0, then x0 is a local min of f . (f ′ changes sign from − to +.)
If f ′ (x0) = 0 and f ′′ (x0) = 0, then x0 can be a max, a min, or neither.
Inflection Point c of a C2 function f
f ′′ (c) = 0 and f ′′ (x) changes sign at c.
For example: f (x) = x3 at x = 0.
1.1 One Variable Function: Example
Example
f (x) = x3 − 12x2 + 36x + 8
f ′ (x) = 3x2 − 24x + 36 = 0
⇒ x∗ = 2 or 6; (f (2) = 40; f (6) = 8)
f ′′ (x) = 6x − 24
f ′′ (2) < 0 and f ′′ (6) > 0
x∗ = 2 is a local max and x∗ = 6 is a local min.
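The classification above can be checked in a few lines; a minimal numerical sketch of the worked example (not part of the original slides):

```python
# A numerical sketch of the worked example f(x) = x^3 - 12x^2 + 36x + 8:
# evaluate f' at the critical points and classify them by the sign of f''.

def f(x):
    return x**3 - 12*x**2 + 36*x + 8

def f_prime(x):
    return 3*x**2 - 24*x + 36

def f_double_prime(x):
    return 6*x - 24

def classify(x):
    """Classify a critical point of f by curvature (second order condition)."""
    assert f_prime(x) == 0, "not a critical point"
    c = f_double_prime(x)
    return "local max" if c < 0 else "local min" if c > 0 else "inconclusive"
```

Here `classify(2)` reports a local max and `classify(6)` a local min, matching the sign pattern of f ′′ above.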
Exercise:
f (x) = x2 − 4 − x4/10 + 3x6/1000
1.1 One Variable Function: Global Max and Min
Global max and min
If f is a C2 function defined on an interval I and f ′′ (x) ≠ 0
for any x ∈ I, then f has at most one critical point in I. This
critical point is a global min if f ′′ (x) > 0 and a global max if
f ′′ (x) < 0.
1.2. Multivariate Functions
Let f : U → R be a C2 function of n variables defined on a
subset U of Rn:
y = f (x1, x2, ..., xn).
First order condition
If x∗ ∈ Int(U) is a local max or min of f in U, then
Df (x∗) = 0 or, equivalently, ∂f /∂xi (x∗) = 0 for i = 1, ...,n.
We say that x∗ is a critical point of f .
SOC
Suppose that x∗ is a critical point of f .
Hessian
D2f (x∗) =
f11 f12 ... f1n
f21 f22 ... f2n
... ... ... ...
fn1 ... ... fnn .
Sufficient condition:
If D2f (x∗) is negative definite, then x∗ is a local max.
If D2f (x∗) is positive definite, then x∗ is a local min.
If D2f (x∗) is indefinite, then x∗ is neither a local max nor a
local min (saddle point).
SOC: Proof
Necessary condition: If x∗ is a local max (min), then D2f (x∗) is negative (positive) semidefinite.
If f is a concave (convex) function and Df (x∗) = 0, then x∗
is a global max (min).
Proof. By Taylor approximation,
f (x∗ + h) ≈ f (x∗) + Df (x∗)h + (1/2) hT D2f (x∗)h.
Since x∗ is a critical point of f (Df (x∗) = 0),
f (x∗ + h) − f (x∗) ≈ (1/2) hT D2f (x∗)h.
If D2f (x∗) is negative definite, then, for all small enough h, the
right-hand side is negative. Thus, f (x∗ + h)− f (x∗) < 0 and x∗
is a local max.
Applications in Economics: Discriminating Monopolist
A monopolist sells one product in two distinct and separated
markets. The inverse demand functions are P1 = 50− 5Q1
and P2 = 100− 10Q2 respectively, where Qi is the amount
supplied to market i and Pi is the corresponding price. The
monopolist’s cost function is C (Q) = 90+ 20Q. How to
maximize profit?
The monopolist’s profit function is
π (Q1,Q2) = Q1 (50− 5Q1) +Q2 (100− 10Q2)
− (90+ 20 (Q1 +Q2)) .
First order condition:
∂π/∂Q1 = 50 − 10Q1 − 20 = 0 ⇒ Q1 = 3,
∂π/∂Q2 = 100 − 20Q2 − 20 = 0 ⇒ Q2 = 4.
Second order condition:
π11 = −10, π22 = −20, π12 = π21 = 0,
the Hessian is negative definite. Thus, the monopolist
maximizes its profit at (Q1 = 3,Q2 = 4).
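A quick numerical sanity check of the monopolist example (a sketch, using the profit function and Hessian from the slides):

```python
# Check the monopolist's first order conditions, the Hessian test,
# and the profit at the candidate (Q1, Q2) = (3, 4).

def profit(q1, q2):
    return q1*(50 - 5*q1) + q2*(100 - 10*q2) - (90 + 20*(q1 + q2))

# FOCs: 50 - 10*Q1 - 20 = 0 and 100 - 20*Q2 - 20 = 0.
q1_star, q2_star = 30/10, 80/20

# Hessian: diagonal (-10, -20), off-diagonal 0 -> negative definite,
# since the leading principal minors alternate: -10 < 0 and 200 > 0.
minor1 = -10
minor2 = (-10)*(-20) - 0*0
```

Perturbing either quantity away from (3, 4) lowers profit, confirming the local (here global) max.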
Applications in Economics
Application: Least Squares Analysis
Suppose that we are interested in the (linear) relationship
between two variables and that we have n observations or
data points: (xi , yi) , i = 1, ...,n. For a given line y = mx + b,
we can measure the vertical distance (|mxi + b− yi |) from
each data point (xi , yi) to the line. Method of least
squares: choose m∗ and b∗ such that the sum of the
squared distances is minimized, that is,
min(m,b) S (m,b) = ∑i (mxi + b − yi)2.
First order condition:
∂S/∂m = ∑i 2 (mxi + b − yi) xi = 0
∂S/∂b = ∑i 2 (mxi + b − yi) = 0,
after rearrangement, we obtain
(∑i xi2) m + (∑i xi) b = ∑i xiyi
(∑i xi) m + n · b = ∑i yi .
Using Cramer’s rule:
m∗ = [n ∑i xiyi − (∑i xi)(∑i yi)] / [n ∑i xi2 − (∑i xi)2]
b∗ = [(∑i xi2)(∑i yi) − (∑i xi)(∑i xiyi)] / [n ∑i xi2 − (∑i xi)2].
Second order condition:
∂2S/∂m2 = 2 ∑i xi2, ∂2S/∂b2 = 2n, ∂2S/∂m∂b = ∂2S/∂b∂m = 2 ∑i xi .
The Hessian is positive semidefinite since n (∑i xi2) ≥ (∑i xi)2, by the Cauchy–Schwarz inequality; it is positive definite, and (m∗,b∗) the global minimizer, whenever the xi are not all equal (generically).
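The closed-form formulas for m∗ and b∗ can be tested directly; a sketch on a small made-up dataset (the data points are illustrative, not from the slides):

```python
# Compute m* and b* from the Cramer's-rule formulas and confirm that
# perturbing either coefficient increases the sum of squared distances.

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
n = len(xs)

sx  = sum(xs)
sy  = sum(ys)
sxx = sum(x*x for x in xs)
sxy = sum(x*y for x, y in zip(xs, ys))

denom  = n*sxx - sx*sx
m_star = (n*sxy - sx*sy) / denom
b_star = (sxx*sy - sx*sxy) / denom

def S(m, b):
    """Sum of squared vertical distances to the line y = m*x + b."""
    return sum((m*x + b - y)**2 for x, y in zip(xs, ys))
```

For this dataset the formulas give m∗ = 1.94 and b∗ = 0.15, and S is smallest there.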
2. Equality Constraints
2.1. Two Variables and One Constraint
Consider the two-variable problem
max(x ,y) f (x , y) subject to g(x , y) = c
f (x , y) is the objective function.
g(x , y) is a constraint function
Graphical Illustration
[Figure: level curves f (x , y) = k and f (x , y) = k′ tangent to the constraint curve g (x , y) = c.]
Note that the slope of the level curve of f at point (x0, y0) is:
−fx (x0, y0)/fy (x0, y0)
2.1.1 Necessary Conditions for an Optimum
Assume that f and g are differentiable. At a solution (x∗, y∗) of the problem, the constraint curve is tangent to a level curve of f , so that
−fx (x∗, y∗)/fy (x∗, y∗) = −gx (x∗, y∗)/gy (x∗, y∗),
or
fx (x∗, y∗)/gx (x∗, y∗) = fy (x∗, y∗)/gy (x∗, y∗),
assuming that gx (x∗, y∗) ≠ 0 and gy (x∗, y∗) ≠ 0.
Now introduce a new variable, λ, and let
λ = fx (x∗, y∗)/gx (x∗, y∗) = fy (x∗, y∗)/gy (x∗, y∗). Then,
fx (x∗, y∗) − λgx (x∗, y∗) = 0
fy (x∗, y∗) − λgy (x∗, y∗) = 0
Lagrangian Function
Thus the following conditions must be satisfied:
fx (x∗, y∗) − λgx (x∗, y∗) = 0
fy (x∗, y∗) − λgy (x∗, y∗) = 0
g(x∗, y∗) = c
Lagrangian Function
L (x , y ,λ) ≡ f (x , y)− λ (g(x , y)− c)
First order conditions
fx (x , y)− λgx (x , y) = 0
fy (x , y)− λgy (x , y) = 0
g(x , y) = c
Lagrangian Method
Theorem Let f and g be C1 functions of two variables.
Suppose that (x∗, y∗) is a solution for
max(x ,y)
f (x , y) subject to g(x , y) = c
Suppose further that (x∗, y∗) is not a critical point of g.
Then, there is a real number λ∗ such that (x∗, y∗,λ∗) is a
critical point of the Lagrangian function
L (x , y ,λ) ≡ f (x , y)− λ (g(x , y)− c) .
In other words, at (x∗, y∗,λ∗),
∂L/∂x = ∂L/∂y = ∂L/∂λ = 0.
Lagrangian Method
Remarks
The theorem holds whether we are maximizing or minimizing
f with the same constraint.
There is no restriction on the sign of the multiplier.
The method reduces the constrained problem to an unconstrained problem.
Example 1: Utility Maximization with Budget Constraint
max(x ,y) {xy} subject to pxx + pyy = I
The Lagrangian is
L (x , y ,λ) ≡ xy − λ (pxx + pyy − I)
The first-order conditions are
∂L/∂x = y − λpx = 0
∂L/∂y = x − λpy = 0
∂L/∂λ = pxx + pyy − I = 0
The unique solution is: (x∗, y∗,λ∗) = (I/(2px), I/(2py), I/(2pxpy)).
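A numerical check of the closed-form solution, for illustrative values of px, py and I (the parameter values are assumptions, not from the slides):

```python
# Verify that x* = I/(2*px), y* = I/(2*py), lam* = I/(2*px*py) satisfies
# the budget constraint and the first order conditions, and beats nearby
# points on the budget line.

px, py, I = 2.0, 5.0, 100.0
x_star = I / (2*px)       # 25.0
y_star = I / (2*py)       # 10.0
lam    = I / (2*px*py)    # 5.0

def u_on_budget(x):
    """Utility x*y along the budget line y = (I - px*x)/py."""
    return x * (I - px*x) / py
```

The FOCs y = λpx and x = λpy hold exactly, and utility along the budget line peaks at x∗.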
Example 2
Consider the problem
max(x ,y) {x2y} subject to 2x2 + y2 = 3
The Lagrangian is
L (x , y ,λ) ≡ x2y − λ (2x2 + y2 − 3)
The first-order conditions are
∂L/∂x = 2x(y − 2λ) = 0
∂L/∂y = x2 − 2λy = 0
∂L/∂λ = 2x2 + y2 − 3 = 0
Analysis: Either x = 0 or y = 2λ.
If x = 0, then y = ±√3 and λ = 0.
If y = 2λ, then x2 = 2λy = 4λ2, so x = ±2λ; the constraint gives 12λ2 = 3, hence λ = ±1/2 and x = ±1.
If x = 1, then y = 1 and λ = 1/2, or y = −1 and λ = −1/2.
If x = −1, then y = 1 and λ = 1/2, or y = −1 and λ = −1/2.
The first-order conditions have six solutions:
1. (x , y ,λ) = (0, √3, 0), f (x , y) = 0.
2. (x , y ,λ) = (0, −√3, 0), f (x , y) = 0.
3. (x , y ,λ) = (1, 1, 1/2), f (x , y) = 1.
4. (x , y ,λ) = (1, −1, −1/2), f (x , y) = −1.
5. (x , y ,λ) = (−1, 1, 1/2), f (x , y) = 1.
6. (x , y ,λ) = (−1, −1, −1/2), f (x , y) = −1.
We conclude that the problem has two solutions,
(x , y) = (1,1) and (x , y) = (−1,1).
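The six critical points can be compared directly; a short sketch:

```python
# Evaluate f(x, y) = x^2 * y at the six critical points of Example 2,
# confirm each lies on the constraint 2x^2 + y^2 = 3, and pick the best.

import math

candidates = [
    (0.0,  math.sqrt(3)), (0.0, -math.sqrt(3)),
    (1.0,  1.0), (1.0, -1.0),
    (-1.0, 1.0), (-1.0, -1.0),
]

def f(x, y):
    return x*x*y

def on_constraint(x, y):
    return abs(2*x*x + y*y - 3) < 1e-9

best_value = max(f(x, y) for x, y in candidates)
maximizers = [(x, y) for x, y in candidates if f(x, y) == best_value]
```

The comparison recovers the conclusion: the maximum value is 1, attained at (1, 1) and (−1, 1).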
2.1.2. Sufficient Conditions for a Local
Optimum
Consider the two-variable problem
max(x ,y) f (x , y) subject to g(x , y) = c
Let h be implicitly defined by g(x ,h(x)) = c. Then the
problem is:
maxx f (x ,h(x))
Define F (x) = f (x ,h(x)). Then
F ′(x) = fx (x ,h(x)) + fy (x ,h(x))h′(x)
Let x∗ be a critical point of F (i.e. F ′(x∗) = 0). A sufficient
condition for x∗ to be a local maximizer of F is that
F ′′(x∗) < 0. We have
F ′′(x∗) = fxx (x∗,h(x∗)) + 2fxy (x∗,h(x∗))h′(x∗) + fyy (x∗,h(x∗)) (h′(x∗))2 + fy (x∗,h(x∗))h′′(x∗)
Now, since g(x ,h(x)) = c for all x , we have
gx (x ,h(x)) + gy (x ,h(x))h′(x) = 0, so
h′(x) = −gx (x ,h(x))/gy (x ,h(x))
Using this expression we can find h′′(x∗), and substitute it
into the expression for F ′′(x∗):
F ′′(x∗) = −|H (x∗, y∗,λ∗)| / (gy (x∗, y∗))2
Bordered Hessian of the Lagrangian
H (x∗, y∗,λ∗) =
0             gx (x∗, y∗)   gy (x∗, y∗)
gx (x∗, y∗)   Z ∗xx         Z ∗xy
gy (x∗, y∗)   Z ∗yx         Z ∗yy
where
Z ∗xx = fxx (x∗, y∗) − λ∗gxx (x∗, y∗)
Z ∗yy = fyy (x∗, y∗) − λ∗gyy (x∗, y∗)
Z ∗xy = Z ∗yx = fxy (x∗, y∗) − λ∗gxy (x∗, y∗)
Generalization
Theorem Consider the problems
max(x ,y) f (x , y) subject to g(x , y) = c; or
min(x ,y) f (x , y) subject to g(x , y) = c.
Suppose that (x∗, y∗,λ∗) satisfies the first order conditions
fx (x∗, y∗) − λ∗gx (x∗, y∗) = 0
fy (x∗, y∗) − λ∗gy (x∗, y∗) = 0
g(x∗, y∗) = c
1. If |H (x∗, y∗,λ∗)| > 0, then (x∗, y∗) is a local maximizer of f subject to g(x , y) = c.
2. If |H (x∗, y∗,λ∗)| < 0, then (x∗, y∗) is a local minimizer of f subject to g(x , y) = c.
Example
Utility maximization revisited
max(x ,y) {xy} subject to pxx + pyy = I
The Lagrangian is: L (x , y ,λ) ≡ xy − λ (pxx + pyy − I)
The unique solution is
(x∗, y∗,λ∗) = (I/(2px), I/(2py), I/(2pxpy)).
Bordered Hessian of the Lagrangian
H (x∗, y∗,λ∗) =
0    px   py
px   0    1
py   1    0
|H (x∗, y∗,λ∗)| = 2pxpy > 0, so (I/(2px), I/(2py)) is indeed a local maximizer.
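The determinant |H| = 2pxpy can be verified by expanding the 3×3 bordered Hessian by cofactors; a sketch with illustrative prices:

```python
# Expand the bordered Hessian of the utility example by cofactors.
# The prices px, py below are illustrative values, not from the slides.

def det3(M):
    """Determinant of a 3x3 matrix given as nested lists."""
    a, b, c = M[0]
    d, e, f = M[1]
    g, h, i = M[2]
    return a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)

px, py = 2.0, 5.0
H = [[0.0, px,  py],
     [px,  0.0, 1.0],
     [py,  1.0, 0.0]]

det_H = det3(H)   # expands symbolically to 2*px*py
```

A positive determinant (here n − m = 1, so this single minor carries the sign of (−1)n) confirms the local max.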
2.2. Generalization
The Lagrangian method can easily be generalized to a
problem of the form
maxx f (x) subject to gj(x) = cj for j = 1, ...,m
with n variables (x = (x1, ...xn)) and m constraints.
The Lagrangian for this problem is
L (x,λ) ≡ f (x) − ∑j λj (gj(x) − cj),
that is, one Lagrange multiplier for each constraint
NDCQ: Nondegenerate Constraint Qualification
With two variables and one constraint, we require
(∂g/∂x1 (x∗), ∂g/∂x2 (x∗)) ≠ (0,0).
With n variables and one constraint, we generalize it as
(∂g/∂x1 (x∗), ..., ∂g/∂xn (x∗)) ≠ (0, ...,0).
With n variables and m constraints, we further generalize it: the Jacobian matrix
Dg (x∗) =
∂g1/∂x1 (x∗) ... ∂g1/∂xn (x∗)
∂g2/∂x1 (x∗) ... ∂g2/∂xn (x∗)
... ... ...
∂gm/∂x1 (x∗) ... ∂gm/∂xn (x∗)
has rank m.
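The rank condition can be checked numerically; a minimal sketch, where the constraint functions g1 = x1 + x2 and g2 = x1·x3 are illustrative assumptions (not from the slides):

```python
# NDCQ check: the Jacobian of the constraints at a candidate point must
# have full row rank m. Illustrative constraints: g1 = x1 + x2, g2 = x1*x3.

import numpy as np

def jacobian(x):
    x1, x2, x3 = x
    return np.array([
        [1.0, 1.0, 0.0],   # gradient of g1 = x1 + x2
        [x3,  0.0, x1],    # gradient of g2 = x1*x3
    ])

def ndcq_holds(x, m=2):
    return np.linalg.matrix_rank(jacobian(x)) == m
```

For these constraints the NDCQ holds at (1, 2, 3) but fails at the origin, where the second gradient vanishes.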
2.2.1 First Order Conditions
Theorem Let f and g1, ...,gm be C1 functions of n variables.
Consider the problem of maximizing (or minimizing) f (x) on
the constraint set:
Cg ≡ {(x1, ..., xn) : gj(x) = cj for j = 1, ...,m}.
Suppose that x∗ ∈ Cg is a (local) max or min of f on Cg . Suppose further that x∗ satisfies the NDCQ condition above.
Then, there exist λ∗ = (λ∗1, ...,λ∗m) such that (x∗,λ∗) is a
critical point of the Lagrangian
L (x,λ) ≡ f (x) − ∑j λj (gj(x) − cj).
In other words, at (x∗,λ∗),
∂L/∂xi = 0, i = 1,2, ...,n,
∂L/∂λj = 0, j = 1,2, ...,m.
2.2.2. Second Order Conditions
Bordered Hessian
H ≡
( 0           Dg (x∗)
  Dg (x∗)T    D2xL (x∗,λ∗) )
where
Dg (x∗) =
∂g1/∂x1 (x∗) ... ∂g1/∂xn (x∗)
∂g2/∂x1 (x∗) ... ∂g2/∂xn (x∗)
... ... ...
∂gm/∂x1 (x∗) ... ∂gm/∂xn (x∗)
and
D2xL (x∗,λ∗) =
∂2L/∂x1∂x1 (x∗) ... ∂2L/∂x1∂xn (x∗)
∂2L/∂x2∂x1 (x∗) ... ∂2L/∂x2∂xn (x∗)
... ... ...
∂2L/∂xn∂x1 (x∗) ... ∂2L/∂xn∂xn (x∗).
SOC
Theorem Let f and g1, ...,gm be C2 functions on Rn. Consider the problem of maximizing (or minimizing) f (x) on the constraint set
Cg ≡ {(x1, ..., xn) : gj(x) = cj for j = 1, ...,m}.
Suppose that (x∗,λ∗) is a critical point of the Lagrangian
L (x,λ) ≡ f (x) − ∑j λj (gj(x) − cj).
1. If the Hessian of L with respect to x at (x∗,λ∗), D2xL (x∗,λ∗), is negative definite on the linear constraint set {v : Dg (x∗) v = 0}; that is,
v ≠ 0 and Dg (x∗) v = 0 ⇒ vT (D2xL (x∗,λ∗)) v < 0,
then x∗ is a strict local constrained max of f on Cg .
(cont.)
2. If the Hessian of L with respect to x at (x∗,λ∗), D2xL (x∗,λ∗), is positive definite on the linear constraint set {v : Dg (x∗) v = 0}; that is,
v ≠ 0 and Dg (x∗) v = 0 ⇒ vT (D2xL (x∗,λ∗)) v > 0,
then x∗ is a strict local constrained min of f on Cg .
Remember that:
Negative definite: the last (n − m) leading principal minors of the bordered Hessian H alternate in sign, with the sign of the determinant of H the same as the sign of (−1)n.
Positive definite: the last (n − m) leading principal minors of the bordered Hessian H all have the same sign as (−1)m.
2.2.3. Summary
Consider the problem of maximizing f (x) on the constraint
set
Cg ≡ {(x1, ..., xn) : gj(x) = cj for j = 1, ...,m}.
Suppose that (x∗,λ∗) is a critical point of the Lagrangian
(i.e., first order condition)
L (x,λ) ≡ f (x) − ∑j λj (gj(x) − cj).
If the second order condition is satisfied at (x∗,λ∗), then x∗
is a strict local constrained max.
If L is concave, in particular, if f is concave and λjgj is
convex for j = 1, ...,m, then x∗ is a global constrained max.
If f is strictly quasiconcave and Cg is convex, then x∗ is the unique global constrained max.
3. Inequality Constraints
3.1. Example: Failure of Lagrange Principle
Consider the consumer optimization problem with log-linear
utility function
max (x1 + ln x2) s.t. p1x1 + p2x2 ≤ I, x1 ≥ 0, x2 ≥ 0    (3.1)
The Lagrangian is
L (x1, x2,λ) = x1 + ln x2 − λ (p1x1 + p2x2 − I) ,
and the FONCs set forth by the Lagrange Theorem are
∂L/∂x1 = 1 − λp1 = 0
∂L/∂x2 = 1/x2 − λp2 = 0
p1x1 + p2x2 = I
⇒ λ = 1/p1, x2 = p1/p2, x1 = (I − p1)/p1.
Example: Failure of Lagrange Principle
Solution:
λ = 1/p1, x2 = p1/p2, x1 = (I − p1)/p1
If p1 > I, then the above solution is not feasible. But notice that the problem (3.1) certainly has an optimal solution. Therefore, if p1 > I, this optimal solution does not satisfy the above Lagrange condition.
In fact, as we will see soon, if p1 > I, the optimal solution is:
x1 = 0, x2 = I/p2 (corner solution) with Lagrange multiplier
λ = 1/I. Hence at the optimum, ∂L/∂x2 = 0 while
∂L/∂x1 < 0.
Remark:
The failure demonstrated above is due to the fact that the optimum is reached at the boundary of the set of feasible points, i.e., the "interior" requirement of the Lagrange Theorem is violated.
3.2. Kuhn-Tucker Conditions
Many models in economics are naturally formulated as
optimization problems with inequality constraints.
Consider, for example, a consumer’s choice problem.
maxx u(x) subject to p · x ≤ w and x ≥ 0.
Consider a problem of the firm
maxx f (x) s.t. gj (x) ≤ cj for j = 1, ...,m,
where f and gj for j = 1, ...,m are functions of n variables, x = (x1, ..., xn), and cj for j = 1, ...,m are constants.
Case with One Constraint
Consider the case with one constraint
maxx f (x) subject to g (x) ≤ c,
which has a solution x∗. There are two possibilities:
The constraint is binding: g (x∗) = c.
The constraint is NOT binding: g (x∗) < c.
Define the Lagrangian function as before
L(x,λ) = f (x)− λ(g(x)− c).
If g(x∗) = c and the constraint satisfies a regularity condition, then ∂L/∂xi (x∗,λ∗) = 0 for all i . In this case, it must be that λ ≥ 0. To see this, suppose that λ < 0; then a small decrease in c raises the value of f . In other words, moving x∗ inside the constraint set raises the value of f , contradicting the fact that x∗ is a local max.
Complementary Slackness Condition
If g(x∗) < c, then ∂f /∂xi (x∗) = 0 for all i . In this case, the value of λ does not enter the conditions, so we can choose any value for it. Given the interpretation of λ, setting λ = 0 makes sense. Under this assumption we have ∂f /∂xi (x) = ∂L/∂xi (x,λ) for all x, so that ∂L/∂xi (x∗,λ∗) = 0 for all i .
We now combine the two cases by writing the conditions as
∂L/∂xi (x∗,λ∗) = 0 for i = 1, ...,n
λ∗ ≥ 0, g(x∗) ≤ c
λ∗ [g(x∗) − c] = 0
Such a condition in which one of the two inequalities must
be binding is called a complementary slackness
condition.
Kuhn-Tucker Conditions
Definition The Kuhn-Tucker conditions for the problem
maxx f (x) s.t. gj (x) ≤ cj for j = 1, ...,m
are
∂L/∂xi (x,λ) = 0 for i = 1, ...,n
λj ≥ 0, gj (x) ≤ cj
λj [gj (x) − cj] = 0, for j = 1, ...,m,
where
L(x,λ) = f (x) − ∑j λj(gj(x) − cj).
Example
Example: Consider the problem
max(x ,y) [−(x − 4)2 − (y − 4)2] subject to x + y ≤ 4 and x + 3y ≤ 9
Lagrangian:
L (x , y ,λ1,λ2) = [−(x − 4)2 − (y − 4)2] − λ1 (x + y − 4) − λ2 (x + 3y − 9) .
The Kuhn-Tucker conditions are
−2(x − 4)− λ1 − λ2 = 0
−2(y − 4)− λ1 − 3λ2 = 0
x + y ≤ 4,λ1 ≥ 0
x + 3y ≤ 9,λ2 ≥ 0
λ1 (x + y − 4) = 0 and λ2 (x + 3y − 9) = 0
Analysis: There are four cases.
1. None of the constraints is binding. In this case, λ1 = λ2 = 0, so x = y = 4; but then the constraints are not satisfied.
2. Both constraints are binding. In this case, x = 3/2, y = 5/2, which gives λ1 = 6 and λ2 = −1 < 0.
3. Constraint 1 is binding, 2 is not. In this case, λ2 = 0, and we have
2(x − 4) + λ1 = 0
2(y − 4) + λ1 = 0
x + y − 4 = 0
We get x = y = 2 and λ1 = 4, with f (2,2) = −8, and constraint 2 is satisfied (2 + 6 ≤ 9).
4. Constraint 2 is binding, 1 is not. In this case, λ1 = 0, and we have
2(x − 4) + λ2 = 0
2(y − 4) + 3λ2 = 0
x + 3y − 9 = 0
We get x = 33/10, y = 19/10, but then constraint 1 is violated.
Thus, (2,2) is the solution.
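A brute-force check of the case analysis (a sketch, not from the slides): search a grid over the feasible set, confirm the optimum is (2, 2), and verify stationarity with the Case-3 multipliers (λ1, λ2) = (4, 0):

```python
# Grid search over the feasible set of the Kuhn-Tucker example.

def f(x, y):
    return -(x - 4)**2 - (y - 4)**2

def feasible(x, y):
    return x + y <= 4 and x + 3*y <= 9

# Grid over [0, 4] x [0, 3] in steps of 1/20.
points = [(i / 20, j / 20) for i in range(81) for j in range(61)]
best = max((p for p in points if feasible(*p)), key=lambda p: f(*p))

# Stationarity of the Lagrangian at (2, 2) with lam1 = 4, lam2 = 0:
lam1, lam2 = 4.0, 0.0
stat_x = -2*(2 - 4) - lam1 - lam2      # first KT equation
stat_y = -2*(2 - 4) - lam1 - 3*lam2    # second KT equation
```

The grid optimum coincides with the analytic candidate and both stationarity equations are zero.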
Failure of Kuhn-Tucker Conditions
It is not difficult to imagine that the Kuhn-Tucker conditions are not sufficient for a local constrained max. Without an extra regularity condition, they are not even necessary conditions.
Example: Consider the problem
max(x ,y) {x} s.t. y − (1− x)3 ≤ 0; y ≥ 0
The solution is clearly (1,0). The Lagrangian is
L(x , y ,λ1,λ2) = x − λ1(y − (1− x)3) + λ2y .
The Kuhn-Tucker conditions are
1 − 3λ1 (1− x)2 = 0
−λ1 + λ2 = 0
y − (1− x)3 ≤ 0, λ1 ≥ 0, and λ1(y − (1− x)3) = 0
−y ≤ 0, λ2 ≥ 0, and λ2y = 0.
The Kuhn-Tucker conditions have no solution. From the
last condition, either λ2 = 0 or y = 0. If λ2 = 0 then λ1 = 0
from the second condition, so that no value of x is
compatible with the first condition. If y = 0 then from the
third condition either λ1 = 0 or x = 1, both of which are
incompatible with the first condition.
NDCQ: Suppose that k out of the m inequality constraints are binding at x∗. Then the NDCQ requires that, at x∗, the rank of the Jacobian matrix of the binding constraints is k .
If at some x∗ in the constraint set Cg the NDCQ is NOT satisfied, then x∗ remains a possible maximizer; there is no need to check whether the Kuhn-Tucker conditions are also satisfied at x∗.
If at some x∗ in the constraint set Cg the NDCQ is satisfied but the Kuhn-Tucker conditions are not satisfied, then x∗ is NOT a possible maximizer.
3.3. Nonnegativity Constraints
Many of the optimization problems in economic theory have
nonnegativity constraints on the variables. For example, a
consumer chooses a bundle x of goods to maximize her
utility u(x) subject to her budget constraint p · x ≤ I and the
condition xi ≥ 0. The general form of such a problem is
maxx f (x) subject to
gj (x) ≤ cj for j = 1, ...,m
xi ≥ 0 for i = 1, ...,n
This problem is a special case of the general maximization
problem with inequality constraints.
The Lagrangian is
L(x,λ, ν) = f (x) − ∑j λj(gj(x) − cj) − ∑i νi (−xi) .
Kuhn-Tucker Conditions
The Kuhn-Tucker conditions are
∂L/∂xi (x,λ, ν) = 0 for i = 1, ...,n
λj ≥ 0, gj(x) ≤ cj
λj [gj(x) − cj] = 0, for j = 1, ...,m,
νi ≥ 0, xi ≥ 0, and νixi = 0, for all i .
In this way we have to work with n + m Lagrange multipliers, which can be difficult if n is large.
Modified Lagrangian
Note that
∂L/∂xi = ∂f /∂xi − ∑j λj ∂gj/∂xi + νi ,
so the conditions ∂L/∂xi = 0, νi ≥ 0, and νixi = 0 imply
∂f /∂xi − ∑j λj ∂gj/∂xi ≤ 0
xi [∂f /∂xi − ∑j λj ∂gj/∂xi] = 0
This allows us to simplify the calculations as follows.
Modified Lagrangian
L̃ (x,λ) = f (x) − ∑j λj(gj(x) − cj).
Note that this Lagrangian does not include the nonnegativity constraints explicitly.
Modified Lagrangian
Notice the difference:
1. Modified Lagrangian:
L̃ (x,λ) = f (x) − ∑j λj(gj(x) − cj).
2. Previously:
L(x,λ, ν) = f (x) − ∑j λj(gj(x) − cj) − ∑i νi (−xi) .
Kuhn-Tucker Conditions
Then we give the Kuhn-Tucker conditions for the modified Lagrangian:
∂L̃/∂xi (x,λ) ≤ 0, xi ≥ 0, and xi ∂L̃/∂xi (x,λ) = 0, for i = 1, ...,n
λj ≥ 0, gj(x) ≤ cj
λj [gj(x) − cj] = 0, for j = 1, ...,m.
If (x,λ, ν) satisfies the original Kuhn-Tucker conditions, then (x,λ) satisfies the conditions for the modified Lagrangian; and if (x,λ) satisfies the conditions for the modified Lagrangian, then we can find numbers (ν1, ..., νn) such that (x,λ, ν) satisfies the original set of conditions.
Kuhn-Tucker Conditions
Remark:
This result means that in any problem for which the original
Kuhn-Tucker conditions may be used, we may alternatively
use the conditions for the modified Lagrangian. For most
problems in which the variables are constrained to be
nonnegative, the Kuhn-Tucker conditions for the modified
Lagrangian are easier to work with than the conditions for
the original Lagrangian.
Example
Example: Consider the problem
max(x ,y) xy subject to x + y ≤ 2, x ≥ 0 and y ≥ 0
The modified Lagrangian is
L̃ (x , y ,λ) = xy − λ (x + y − 2) ,
and the Kuhn-Tucker conditions are
y − λ ≤ 0, x ≥ 0, and x(y − λ) = 0
x − λ ≤ 0, y ≥ 0, and y(x − λ) = 0
λ ≥ 0, x + y ≤ 2, and λ(x + y − 2) = 0.
Analysis
1. If x > 0, then from the first set of conditions we have y = λ. If y = 0 in this case, then λ = 0, so that the second set of conditions implies x ≤ 0, contradicting x > 0. Hence y > 0, and thus x = λ, so that x = y = λ = 1.
2. If x = 0, then if y > 0 we have λ = 0 from the second set of conditions, so that the first condition contradicts y > 0. Thus y = 0, and hence λ = 0 from the third set of conditions.
We conclude (as before) that there are two solutions of the
Kuhn-Tucker conditions, in this case (x , y ,λ) = (1,1,1) and
(0,0,0). Since the value of the objective function at (1,1) is
greater than the value of the objective function at (0,0), the
solution of the problem is (1,1).
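The complementary-slackness bookkeeping above can be wrapped in a small checker; a sketch for this example:

```python
# Checker for the modified-Lagrangian Kuhn-Tucker conditions of
# max xy s.t. x + y <= 2, x >= 0, y >= 0.

def kt_modified(x, y, lam, tol=1e-9):
    Lx = y - lam           # dL~/dx
    Ly = x - lam           # dL~/dy
    slack = 2 - x - y
    return (Lx <= tol and x >= -tol and abs(x*Lx) <= tol and
            Ly <= tol and y >= -tol and abs(y*Ly) <= tol and
            lam >= -tol and slack >= -tol and abs(lam*slack) <= tol)
```

Both (1, 1, 1) and (0, 0, 0) pass the checker, while points that violate a slackness or sign condition do not.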
3.4. Mixed Constraints
Consider a problem of the form
maxx f (x) subject to
h1 (x) = c1, ...,hm (x) = cm
g1 (x) ≤ b1, ...,gk (x) ≤ bk ,
where f , g1, ...,gk , h1, ...,hm are C1 functions of n variables, and b1, ...,bk , c1, ..., cm are constants.
The Lagrangian is
L(x,λ, µ) = f (x) − ∑j λj(gj(x) − bj) − ∑j µj (hj (x) − cj),
where the first sum runs over j = 1, ..., k and the second over j = 1, ...,m.
FOC
The first order conditions are
∂L/∂xi (x,λ, µ) = 0 for i = 1, ...,n
λ1 ≥ 0, ...,λk ≥ 0
g1 (x) ≤ b1, ...,gk (x) ≤ bk
λ1 [g1 (x) − b1] = 0, ...,λk [gk (x) − bk ] = 0
h1 (x) = c1, ...,hm (x) = cm.
NDCQ: Without loss of generality, suppose that the first k0
inequality constraints are binding at x∗, and the other
inequality constraints are not binding. Then, the NDCQ
requires that the rank of the Jacobian matrix of the m
equality constraints and the k0 binding inequality constraints
is k0 + m.
Log-Linear Example Revisited
Let us now apply the Kuhn-Tucker necessary conditions to problem (3.1):
max (x1 + ln x2) s.t. p1x1 + p2x2 ≤ I, x1 ≥ 0, x2 ≥ 0,
where the Lagrangian is given by
L (x1, x2,λ) = x1 + ln x2 − λ (p1x1 + p2x2 − I) .
The Kuhn-Tucker conditions for this problem are
Lx1 ≤ 0, x1 ≥ 0 with "CS"
Lx2 ≤ 0, x2 ≥ 0 with "CS"
p1x1 + p2x2 ≤ I, λ ≥ 0 with "CS"
or
1 ≤ λp1, x1 ≥ 0 with "CS" (3.7)
1/x2 ≤ λp2, x2 ≥ 0 with "CS" (3.8)
p1x1 + p2x2 ≤ I, λ ≥ 0 with "CS" (3.9)
According to (3.7), λ > 0. Hence "CS" in (3.9) implies that
p1x1 + p2x2 = I (3.10)
If x2 = 0, then by (3.8) λ = +∞. But then "CS" in (3.7) would imply x1 = 0, which violates "CS" in (3.9) (0 < I and λ > 0). The contradiction shows that we must have x2 > 0. Then "CS" in (3.8) implies 1/x2 = λp2. Substituting this into (3.10), we obtain:
p1x1 + 1/λ = I (3.11)
We obtain: p1x1 + 1/λ = I.
1. Case 1: Check whether x1 = 0 is consistent with the Kuhn-Tucker conditions. Substituting x1 = 0 in (3.11) gives λ = 1/I. Then (3.7) becomes: 1 ≤ p1/I. This is consistent with "CS" if p1 ≥ I. Thus, if p1 ≥ I, then x1 = 0, λ = 1/I is consistent with the Kuhn-Tucker conditions, and in this case x2 = I/p2. Hence, we have shown that x1 = 0, x2 = I/p2 is the point satisfying the Kuhn-Tucker conditions when p1 ≥ I.
2. Case 2: Now let p1 < I. Then the above arguments imply that x1 = 0 would violate (3.7). Therefore, we must have x1 > 0. But then by "CS" in (3.7), 1 = λp1, so λ = 1/p1. It follows that x2 = p1/p2 and x1 = (I − p1)/p1. Note that this solution satisfies the Kuhn-Tucker conditions with λ = 1/p1 if p1 ≤ I.
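The two cases combine into a piecewise demand function; a sketch following the case analysis above (the parameter values used in the checks are illustrative):

```python
# Piecewise solution of max x1 + ln(x2) s.t. p1*x1 + p2*x2 <= I, x >= 0,
# following the corner (p1 >= I) and interior (p1 < I) cases.

def demand(p1, p2, I):
    """Return (x1, x2, lam) solving the log-linear consumer problem."""
    if p1 >= I:                               # corner: x1 = 0, lam = 1/I
        return 0.0, I / p2, 1.0 / I
    return (I - p1) / p1, p1 / p2, 1.0 / p1   # interior in x1, lam = 1/p1
```

In both regimes the budget constraint holds with equality, as complementary slackness with λ > 0 requires.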