Notes on Constrained Maximization, Lagrangians, and First-order Conditions

by Duncan K. Foley

1. Constrained maximization problems

The problem of maximizing a function subject to constraints arises in many contexts in economics. Many people have learned one or another way to solve such problems, typically using the technique of Lagrange multipliers. The purpose of these notes is to summarize the logic of this method, and to highlight the typical cases where economists come to grief in attempting to apply it.

The general set up of a constrained maximization problem is to choose the levels of one or more variables, {x1, x2, ..., xn}, in order to maximize the value of a function f[x1, x2, ..., xn] subject to some constraints g1[x1, x2, ..., xn] ≤ b1, g2[x1, x2, ..., xn] ≤ b2, …, gm[x1, x2, ..., xn] ≤ bm. The function f is called the objective function, and the set of values of the variables that satisfies the constraints is called the feasible set.

Typical examples are a household maximizing utility by choosing a consumption bundle subject to a budget constraint, a firm maximizing profit by choosing a production plan subject to a technology constraint, and a socialist planner choosing consumption and production plans subject to resource constraints. The applications of the general set up in economics are limited only by the imagination of the researcher.

Mathematicians call the set of real numbers the space R, and regard the list of variables x = {x1, x2, ..., xn} as a member of R^n, the space of n-dimensional vectors with real components. Mathematicians write f : R^n → R and gj : R^n → R to indicate that the functions have the set of n-dimensional real vectors, R^n, as their domain and the real numbers, R, as their range. For notational convenience, in these notes I will often abbreviate by writing a single letter for a vector, so that x stands for the vector {x1, x2, ..., xn}, g[x] for the vector {g1[x], g2[x], ..., gm[x]}, and b for the vector {b1, b2, ..., bm}. All of these vectors are column vectors. If we regard the constraint levels b = {b1, …, bm} as a vector in R^m and the list of functions g = {g1, …, gm} as a single vector-valued function g : R^n → R^m, we can write the constraints compactly as g[x] ≤ b. (Here ≤ means that each component of the first vector is less than or equal to the corresponding component of the second.) In economic applications it is very often the case that the choice of variables is limited to some economically relevant set, which can often be arranged to be the set of vectors (R^n)+ = {x ∈ R^n | x ≥ 0} with nonnegative components.

With these notational conventions, we can write the basic maximization problem (P) as:

P: choose x ≥ 0 to Max f[x] subject to g[x] ≤ b

The basic idea of the Lagrangian approach to this problem is to transform it into the problem of solving a set of equations, the first-order conditions. There are two complications that arise in this process. First, solutions to the first-order conditions are not always solutions to the basic maximization problem, (P). Second, it may not be easy to solve the first-order conditions when they involve nonlinear functions, as they frequently do in economic applications. It is always worthwhile spending some time considering whether a brute-force approach might yield the solution to a given maximum problem before (or in addition to) resorting to the Lagrangian formalism.
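To make the set up concrete, here is a minimal numerical sketch of a brute-force solution of a small instance of (P), using Python's scipy.optimize. The particular objective function, prices, and resource level are illustrative assumptions, not part of the formal development.

    import numpy as np
    from scipy.optimize import minimize

    # Illustrative instance of (P): maximize f[x] = sqrt(x1*x2)
    # subject to the single constraint 1*x1 + 2*x2 <= 12 and x >= 0.
    p, m = np.array([1.0, 2.0]), 12.0

    def f(x):
        return np.sqrt(x[0] * x[1])

    res = minimize(lambda x: -f(x),                      # scipy minimizes, so negate the objective
                   x0=[1.0, 1.0],
                   bounds=[(0.0, None), (0.0, None)],    # x >= 0
                   constraints=[{"type": "ineq",         # requires m - p.x >= 0, i.e. g[x] <= b
                                 "fun": lambda x: m - p @ x}],
                   method="SLSQP")

    print(res.x)      # approximately [6., 3.]
    print(-res.fun)   # approximately 4.243 = sqrt(18)

In this symmetric example half the resource is devoted to each variable, as one would expect.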

2. The Lagrangian and saddle-points

The first step in trying to reduce (P) to a set of equations is to construct a new function, the Lagrangian, which involves the original variables x, and a new set of variables, one for each constraint, λ = {λ1, λ2, ..., λm}. These variables are called Lagrange multipliers, or, more evocatively in an economic context, shadow prices, since they have many of the properties of market prices, but arise in a completely different way as purely mathematical objects in the solution of the constrained maximization problem. When y = {y1, y2, ..., ym} is in R^m, the dot or inner product λ·y = λᵀy = λ1 y1 + λ2 y2 + ... + λm ym has the natural economic interpretation of the value of the vector y at the prices λ. This interpretation also suggests that shadow prices must be nonnegative, λ ≥ 0. The Lagrangian corresponding to the maximization problem (P) is (L):

L: L[x, λ] = f[x] − λ·(g[x] − b) = f[x] − λᵀ(g[x] − b)

In considering this expression, it is crucial to remember that x, λ, g, and b are all potentially vectors, that is, lists of variables. I will sometimes call f[x] the objective function, and −λ·(g[x] − b) the penalty function. The variables x are often called the primal variables of the problem, and the shadow prices λ the dual variables.

A key role in the analysis is played by the concept of a saddle-point of the Lagrangian function. A saddle-point is a pair of vectors (x*, λ*) such that x* maximizes the Lagrangian L[x, λ*] taking λ* as given, and λ* minimizes the Lagrangian L[x*, λ] taking x* as given, with both x and λ restricted to be nonnegative. Though the saddle-point seems to be an arbitrary mixing up of maximization and minimization from a purely mathematical point of view, it actually expresses a very natural economic insight, which is that producers and consumers act to maximize their objective functions, but resource owners act to maximize the value of their resources. Since λ·b is the value of the resources, resource owners would want to put as high a price on each resource as they can, as long as it is fully utilized. There is thus a strong connection between the saddle-point concept and the theory of rent. Another way to conceptualize the saddle-point is to imagine that the solution of the maximization problem is decentralized into an operations department, which controls x, and an accounting department, which controls λ. The operations department is instructed to choose x to maximize profit at the prices λ* without regard to resource availability, and the accounting department is instructed to choose λ to maximize the value of the over-utilization of resources, g[x*] − b, taking x* as given. The accounting department will raise the price of any resource that the operations department plans to use in excess of its availability, and will lower the price of any resource that the operations department is not fully utilizing.
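For the small illustrative problem introduced above, the saddle-point property can be checked numerically. The sketch below assumes the maximizer x* = (6, 3) and shadow price λ* = 1/(2√2) (the marginal value of the resource in that illustration); these values are specific to that example, not general formulas.

    import numpy as np

    # Same illustrative problem: f[x] = sqrt(x1*x2), constraint x1 + 2*x2 <= 12.
    p, m = np.array([1.0, 2.0]), 12.0
    f = lambda x: np.sqrt(x[0] * x[1])
    L = lambda x, lam: f(x) - lam * (p @ x - m)          # the Lagrangian (L)

    x_star   = np.array([6.0, 3.0])                      # candidate maximizer
    lam_star = 1.0 / (2.0 * np.sqrt(2.0))                # candidate shadow price

    rng  = np.random.default_rng(0)
    xs   = rng.uniform(0.01, 12.0, size=(10_000, 2))     # random nonnegative x
    lams = rng.uniform(0.0, 2.0, size=10_000)            # random nonnegative lambda

    # Saddle-point: L[x, lam*] <= L[x*, lam*] <= L[x*, lam] for all x >= 0, lam >= 0.
    print(np.all(L(xs.T, lam_star) <= L(x_star, lam_star) + 1e-9))   # True
    print(np.all(L(x_star, lams)   >= L(x_star, lam_star) - 1e-9))   # True

The second inequality holds trivially here because the constraint binds exactly at x*, so the penalty term vanishes for every λ; the first inequality is the substantive one.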

These economic interpretations show that very simple arguments immediately establish the first relation between the Lagrangian (L) and the constrained maximization problem (P):

Proposition 1: If (x*, λ*) is a saddle-point of (L), then x* is a solution to (P).

Proof: First, x* must be feasible, that is, g[x*] ≤ b. If any resource were over-utilized, it would be possible to lower the value of the Lagrangian by increasing that resource's price, which contradicts the assumption that (x*, λ*) is a saddle-point of (L). Note also that since λ* minimizes L[x*, λ] over λ ≥ 0, the complementary slackness condition λ*·(g[x*] − b) = 0 must hold, so the penalty at x* is zero. Second, any alternative feasible plan x must have at least as high a value of the penalty, −λ*·(g[x] − b) ≥ 0, as x*. If x also achieved a higher value for the objective function, we would have L[x, λ*] > L[x*, λ*], which also would contradict the assumption that (x*, λ*) is a saddle-point of (L). Thus x* must maximize f among all the feasible x, so it is a solution to (P).

Notice that in proving Proposition 1 we have made no assumptions about the functions f[x] and g[x]. Proposition 1 holds for any objective function and any set of constraints. But this may not be much help, because the solution to (P) may not be a saddle-point of (L) for arbitrary objective functions and constraints.

3. Convex sets and concave and quasi-concave functions

There is a big difference between situations in which it is possible and desirable to average outcomes and strategies, and situations where it is either impossible or undesirable to do so. We can express a weighted average of two points x′ = {x1′, …, xn′} and x″ = {x1″, …, xn″} as x = t x′ + (1 − t) x″ = {t x1′ + (1 − t) x1″, …, t xn′ + (1 − t) xn″}, where t is a real number between 0 and 1. Notice that when we make a weighted average of vectors, we average each component with the same weights. Thus if x′ = (1, 5), and x″ = (4, 2), and t = 1/3, x = (3, 3). Geometrically, the weighted average of two points lies on the straight line between them.
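A two-line check of this arithmetic, using exactly the points and weight from the example:

    import numpy as np

    x_p, x_pp, t = np.array([1.0, 5.0]), np.array([4.0, 2.0]), 1.0 / 3.0
    print(t * x_p + (1 - t) * x_pp)     # [3. 3.]: each component is averaged with the same weights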

Definition (Convex Set): A set X is convex if t x′ + (1 − t) x″ ∈ X whenever x′ ∈ X and x″ ∈ X, for all t ∈ [0, 1].

The surfaces enclosed by a triangle, a square, and a circle are convex sets; the triangle, square, and circle themselves (the boundary curves) are not convex sets. A solid disc is convex, but a disc with a central hole cut out is not convex. The set of points below the graph of the function Log[x] is a convex set, but the set of points above the same graph is not.

In economic applications, we are frequently interested in the level sets of a function, that is, the sets of points in the domain of the function at which the function exceeds (or falls short of) a given value. For example, the "no-worse-than" set for a utility function u[x] and a utility level ū is {x | u[x] ≥ ū}. The functions whose level sets are convex thus play a special role in economics.

Definition (Quasi-concave function): The function f : X → R is quasi-concave if its upper level sets {x ∈ X | f[x] ≥ a} are convex for all a ∈ R.

Definition (Quasi-convex function): The function g : X → R is quasi-convex if its lower level sets {x ∈ X | g[x] ≤ a} are convex for all a ∈ R.

Quasi-concavity and quasi-convexity are the mathematical representation of the economic concepts of "diminishing returns" and "diminishing marginal utility". A consumer with a quasi-concave utility function regards a weighted average of two consumption bundles as at least as good as the less preferred of the two (in particular, an average of two indifferent bundles is at least as good as either), a sign of diminishing marginal utility.

A good exercise is to prove that a monotonically increasing function of a quasi-concave function is quasi-concave. What type of monotonic transformation preserves quasi-convexity?

A concave function has the property that its value at the weighted average of two points in its domain is at least as great as the weighted average of its values at the two points:

Definition (Concave function): The function f : X → R is concave if f[t x′ + (1 − t) x″] ≥ t f[x′] + (1 − t) f[x″] for all x′, x″ ∈ X and all t ∈ [0, 1].

Definition (Convex function): The function g : X → R is convex if g[t x′ + (1 − t) x″] ≤ t g[x′] + (1 − t) g[x″] for all x′, x″ ∈ X and all t ∈ [0, 1].

It is easy to prove that if a function f is concave, −f is convex. It is also a good exercise to show that a concave function is quasi-concave and a convex function is quasi-convex, and to find examples of quasi-concave and quasi-convex functions that are not concave or convex. Is a monotonically increasing transformation of a concave function necessarily concave? Is it necessarily quasi-concave?
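A numerical sketch bearing on the last two questions (the particular function and the sampling scheme are illustrative choices): h[x] = x1 x2 on the positive orthant is a monotonically increasing transformation (squaring) of the concave function √(x1 x2). Random sampling suggests, correctly, that h violates the defining inequality for concavity but satisfies h[t x′ + (1 − t) x″] ≥ min{h[x′], h[x″]}, which characterizes quasi-concavity on a convex domain.

    import numpy as np

    h = lambda x: x[0] * x[1]     # h = (sqrt(x1*x2))**2, a monotone transform of a concave function

    rng = np.random.default_rng(1)
    a = rng.uniform(0.1, 10.0, size=(2, 50_000))    # random points x'  in the positive orthant
    b = rng.uniform(0.1, 10.0, size=(2, 50_000))    # random points x''
    t = rng.uniform(0.0, 1.0, size=50_000)
    mid = t * a + (1 - t) * b

    concave_ineq = h(mid) >= t * h(a) + (1 - t) * h(b)        # definition of concavity
    quasi_ineq   = h(mid) >= np.minimum(h(a), h(b)) - 1e-12   # characterization of quasi-concavity

    print(concave_ineq.all())   # False: h is not concave
    print(quasi_ineq.all())     # True:  h is quasi-concave on the positive orthant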

Linear functions c·x = c1 x1 + … + cn xn are both concave and convex, a fact which plays a crucial role in certain economic arguments.

A local maximum of a concave function is also a global maximum. Suppose that x* is a local maximum of a concave function f. If there were a point x′ with f[x′] > f[x*], the points x[t] = t x′ + (1 − t) x* would have f[x[t]] > f[x*] for all 0 < t ≤ 1, since f is concave. But x[t] can be made arbitrarily close to x* by taking t arbitrarily close to 0, which contradicts the assumption that x* is a local maximum. By similar reasoning, a local minimum of a convex function is a global minimum.

If the function f is twice differentiable, then we can compactly write the Hessian matrix of its second partial derivatives ∂²f/∂x², with a typical entry ∂²f/∂xi∂xj. The Hessian matrix of a concave function is everywhere negative semi-definite. Thus any critical point of a concave function, where ∂f/∂x = (∂f/∂x1, …, ∂f/∂xn) = 0, is a local maximum and hence a global maximum. By similar reasoning, any critical point of a convex function is a local and global minimum.
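A minimal sketch of this check for a particular concave quadratic (the function and numbers are illustrative assumptions): its Hessian is constant, its eigenvalues are all negative, and its single critical point is a global maximum.

    import numpy as np

    # Illustrative concave objective: f[x] = -(x1-1)^2 - (x1-1)(x2-2) - (x2-2)^2.
    # Its Hessian is the constant matrix below.
    H = np.array([[-2.0, -1.0],
                  [-1.0, -2.0]])
    print(np.linalg.eigvalsh(H))      # [-3. -1.]: all eigenvalues <= 0, so H is negative semi-definite

    f = lambda x: -(x[0] - 1)**2 - (x[0] - 1) * (x[1] - 2) - (x[1] - 2)**2
    x_crit = np.array([1.0, 2.0])     # the gradient vanishes exactly here

    samples = np.random.default_rng(2).normal(size=(2, 10_000)) * 5 + x_crit[:, None]
    print((f(samples) <= f(x_crit) + 1e-12).all())    # True: the critical point is a global maximum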

These mathematical properties of concave and convex functions greatly simplify the analysis of models that are based on them.

4. Price systems and convex sets

There is a very fundamental theorem relating convex sets and price systems, the Separating Hyperplane Theorem. Here is one version:

Theorem (Separating hyperplane): Let X be a convex set and y a point not in the interior of X. Then there exists at least one vector λ ≠ 0 with λ·y ≥ λ·x for all x ∈ X.

Theorems of this type hold in a great many mathematical constructs besides Euclidean vector spaces, as we will discuss below. The intuition behind the theorem is suggested by considering the case of a point and a disc. When the point is located inside the disc, it is impossible to draw a line with the disc on one side and the point on the other, but when the point is located on the boundary of the disc or outside it altogether, it is always possible to draw a line with the point on one side and the disc on the other. The separating hyperplane theorem is important in economics because it mathematically creates prices, or shadow prices in constrained maximization problems.
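A minimal numerical sketch of the disc intuition (the set, the point, and the choice of separating vector are illustrative; for a point outside the unit disc centered at the origin, the point's own coordinate vector happens to serve as a separating λ):

    import numpy as np

    # X is the closed unit disc, y = (2, 1) lies outside it.
    y   = np.array([2.0, 1.0])
    lam = y                                   # candidate separating vector for this particular case

    theta = np.linspace(0.0, 2.0 * np.pi, 1_000)
    r     = np.linspace(0.0, 1.0, 200)
    disc  = np.stack([np.outer(r, np.cos(theta)).ravel(),
                      np.outer(r, np.sin(theta)).ravel()])   # dense sample of points of the disc

    print(lam @ y, (lam @ disc).max())        # 5.0 versus about 2.236
    print((lam @ y >= lam @ disc).all())      # True: lam.y >= lam.x for every sampled x in X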

For example, the separating hyperplane theorem allows us to prove a partial converse to Proposition 1:

Proposition 2: If x* is the solution to problem (P) when the objective function is concave and the constraint functions are convex, and there exists a point x ≥ 0 such that g[x] < b, then there exists a system of shadow prices λ* ≥ 0 such that (x*, λ*) is a saddle-point of (L).

Proof: Consider the set

Y = {(y0, y1, …, ym) | y0 ≤ f[x], yj ≤ −gj[x] (j = 1, …, m), for some x ≥ 0}.

The intuitive idea of the set Y is that you can produce the objective function outcome y0 with the resources y1, …, ym given the technology defined by the constraints and the objective function. From the concavity of the objective function and the convexity of the constraint functions it is evident that Y is a convex set. Since x* is the solution to (P), we know that {f[x*], −b1, …, −bm} is not interior to Y. By the separating hyperplane theorem there must be a non-zero price system λ with λ·{f[x*], −b1, …, −bm} ≥ λ·y for all y ∈ Y. By the definition of Y, λ ≥ 0, since if any component of λ were negative, we could decrease the corresponding component of y without limit and violate the inequality. Since {f[x], −g1[x], …, −gm[x]} ∈ Y for any x ≥ 0,

λ0 f[x*] − λ1 g1[x*] − … − λm gm[x*] ≥ λ0 f[x*] − λ1 b1 − … − λm bm ≥ λ0 f[x] − λ1 g1[x] − … − λm gm[x]

This is almost the theorem, since if λ0 > 0, we could divide through by it to get the Lagrangian shadow prices called for. If λ0 = 0, we would have λ1 b1 + … + λm bm ≤ λ1 g1[x] + … + λm gm[x] for all x ≥ 0. But we have assumed that there is some point x ≥ 0 with gj[x] < bj for all j; since λ ≠ 0 and λ ≥ 0, evaluating the inequality at that point gives a contradiction. Thus λ0 > 0, and the price system (λ1/λ0, …, λm/λ0) is the required system of shadow prices.

If (x*, λ*) is a saddle-point of the Lagrangian (L), then x* is always a solution to the original maximization problem, (P), whatever the properties of the objective function and the constraint functions, but the converse is not generally true. If, however, the objective function is concave and the constraint functions are convex (often abbreviated by saying that (P) is a concave maximization problem), the converse is true (given a strictly feasible point, as in Proposition 2): the maximum of the original maximization problem (P), x*, is part of a saddle-point of the Lagrangian, (x*, λ*). This result is of enormous importance in the development of marginalist and neoclassical economic thought. The marginalist vision sees real-world market prices as the shadow prices of a constrained maximization problem. In general these shadow prices may not exist unless the objective function is concave and the constraint functions convex, so the marginalist program takes these conditions as paradigmatic, despite the fact that in the real world violations of diminishing marginal utility and diminishing returns are extremely common.

5. First-order conditions

If we are lucky enough to be able to frame an economic model as a concave maximization problem, we would also like to be able to solve it explicitly, or at least to make some general statements about the properties of the solution (for example, the effect of changes in the constraints on the solution in a comparative statics analysis). Thus it is desirable to be able to convert the saddle-point condition into a set of equations whose solutions characterize the saddle-point. If (x*, λ*) is a saddle-point of the Lagrangian, then a small change in xi in either direction at x* must not raise L. If L is differentiable at x* and xi* > 0, this implies that ∂L/∂xi = 0. If xi* = 0, it still must be the case that increasing xi cannot raise L, so that ∂L/∂xi ≤ 0. Thus when L is differentiable, either xi* > 0 and ∂L[x*, λ*]/∂xi = 0, or xi* = 0 and ∂L[x*, λ*]/∂xi ≤ 0. Similarly, either λj* > 0 and ∂L[x*, λ*]/∂λj = 0, or λj* = 0 and ∂L[x*, λ*]/∂λj ≥ 0. These complementary slackness conditions can be expressed compactly as the first-order conditions (FOC):

FOC: ∂L[x*, λ*]/∂x ≤ 0 and (∂L[x*, λ*]/∂x)·x* = 0

     ∂L[x*, λ*]/∂λ ≥ 0 and λ*·(∂L[x*, λ*]/∂λ) = 0

Here ∂L[x, λ]/∂x means the 1×n vector of partial derivatives of L with respect to the n components of x, (∂L/∂x1, …, ∂L/∂xn), and the product (∂L/∂x)·x* is the dot product ∂L/∂x1 x1* + … + ∂L/∂xn xn*, and ∂L[x, λ]/∂λ means the 1×m vector of partial derivatives of L with respect to the m components of λ. Thus we have:

Proposition 3: If (x*, λ*) is a saddle-point of a differentiable Lagrangian (L), then (x*, λ*) satisfies the first-order conditions (FOC).

Notice that the proof of Proposition 3 doesn't require any assumptions about the properties of the objective function or the constraints. Unfortunately it does not help much, either, because there might be solutions to the first-order conditions that are not saddle-points of the Lagrangian. If we have a concave maximization problem, however, in which the objective function is concave and the constraint functions are convex, the Lagrangian will be concave in the primal variables x, and convex in the dual variables λ. In this case a solution to the first-order conditions is a saddle-point of the Lagrangian.

Proposition 4: If (x*, λ*) is a solution to the first-order conditions (FOC) when the objective function is concave and the constraint functions are convex, then (x*, λ*) is a saddle-point of the Lagrangian (L).

Proof: The first-order conditions imply that x* is a local maximum of L[·, λ*], so that when L is concave in x, it is also a global maximum. Similar reasoning establishes that λ* is a global minimum of L[x*, ·], so (x*, λ*) is a saddle-point.

The first line of the first-order conditions actually represents n equations, since either ∂L/∂xi = 0 or xi = 0, and the second line represents m equations, in the m + n variables x* and λ*. Thus in principle there are just enough equations in the first-order conditions to determine the unknown variables, and we might hope to solve the system explicitly to find the solution (x*, λ*). The main complication here is that the first-order conditions define a number of different equation systems, depending on which variables might be set to zero. This issue arises in a large number of economic models, and further hypotheses are often required to pin down just which set of equations is relevant.

Thus a basic strategy for solving a concave maximum problem (P) is to construct the Lagrangian, calculate the first-order conditions, and solve the appropriate set of equations to find the saddle-point. Two things might go wrong with this plan.

First, we might have set up a maximum problem (P) that has no solution, either because the constraints are inconsistent, or because the constraints are not sufficient to limit the value of the objective function, so that it can be made unboundedly large. In these cases, there is no guarantee that the Lagrangian has a saddle-point, and as a result, there may be no solution to the first-order conditions at all.

Second, even if the problem (P) is well-defined and has a maximum, it may not be possible to solve the first-order conditions because the equations they define involve non-linear functions that cannot easily be inverted to express the solution explicitly. Economics, for some reason, puts a very high premium on the explicit solution of models, which has had some unfortunate consequences. Frequently researchers choose functional forms for models, such as Cobb-Douglas production or utility functions, because they lead to soluble first-order conditions, even if there is no empirical evidence to support their relevance to the situation being modeled.
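As an illustration of this strategy, here is a minimal sketch for one of the soluble cases just mentioned, a Cobb-Douglas (log-form) utility maximization with a single budget constraint. The particular parameter values are assumptions for the example; the closed-form demands and shadow price come from solving the interior first-order conditions by hand, and the numerical check uses scipy.

    import numpy as np
    from scipy.optimize import minimize

    # Illustrative problem: maximize a*log(x1) + (1-a)*log(x2) subject to p1*x1 + p2*x2 <= m.
    # Solving the interior FOC by hand gives x_i* = share_i * m / p_i and lam* = 1/m.
    a, p, m = 0.3, np.array([2.0, 5.0]), 100.0
    shares  = np.array([a, 1 - a])

    x_foc   = shares * m / p        # [15., 14.] from the first-order conditions
    lam_foc = 1.0 / m               # 0.01, the shadow price of the budget constraint

    u = lambda x: shares @ np.log(x)
    res = minimize(lambda x: -u(x), x0=[1.0, 1.0],
                   bounds=[(1e-9, None)] * 2,
                   constraints=[{"type": "ineq", "fun": lambda x: m - p @ x}],
                   method="SLSQP")

    print(x_foc, res.x)                                  # analytic and numerical solutions agree
    print(np.max(np.abs(shares / res.x - lam_foc * p)))  # FOC residual du/dx_i - lam*p_i is ~ 0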

One recourse in cases where it is impossible to solve the first-order conditions explicitly is to establish comparative statics results by differentiating the first-order conditions with respect to exogenous parameters of the model. This always results in a system of linear equations in the derivatives of the variables with respect to the parameter, which can be solved by general methods. In many cases it turns out that plausible assumptions on the objective function and the constraints are sufficient to establish the sign of the derivatives with respect to the parameters of interest.

We can sum up the logic of this analysis in the following terms. A saddle-point of the Lagrangian function (L) is always a solution to the maximum problem (P), and always satisfies the first-order conditions (FOC), whether or not the original problem has a concave objective function and convex constraint functions. But in general a solution to the first-order conditions may not be the solution to the original problem. There may be solutions to the first-order conditions which are not saddle-points of the Lagrangian, or the solution to the original problem may not be a saddle-point of the Lagrangian.

If the original problem is a concave maximization problem, with a concave objective function and convex constraint functions, however, the solution to the original problem is always a saddle-point of the Lagrangian, and a solution to the first-order conditions is also always a saddle-point of the Lagrangian, so that solving the first-order conditions is equivalent to solving the original maximization problem.

6. The Envelope Theorem

In economic applications of the Lagrangian idea, there are often parameters that enter the basic maximization problem. The comparative static analysis of the impact of changing parameters on the maximizing point is considerably simplified because of certain general relations that hold in the first-order conditions, which lead to the envelope theorem.

Consider the modified constrained maximization problem:

P': choose x[b, u] ≥ 0 to Max f[x, u] subject to g[x, u] ≤ b

Here u is a vector of parameters that might change the objective function and the constraints. We will assume that the impact of the parameters is as smooth as necessary to make the following arguments work mathematically. We will also assume that f is concave and the gj are convex, so that the first-order conditions are both necessary and sufficient to characterize the solution. Let F[b, u] = f[x[b, u], u] be the value function, the maximal value of the problem P'.

The Lagrangian for P' is:

L': L[x, λ; b, u] = f[x, u] − λ·(g[x, u] − b)

The first-order conditions are:

FOC': ∂f[x*, u]/∂x − λ*[b, u]ᵀ(∂g[x*, u]/∂x) ≤ 0 and (∂f[x*, u]/∂x − λ*[b, u]ᵀ(∂g[x*, u]/∂x))·x* = 0

−(g[x*, u] − b) ≥ 0 and λ*[b, u]ᵀ(g[x*, u] − b) = 0

First, consider a change in the resource constraints b.

We know for any variable i that either (x*)i = 0 or:

(∂f[x*, u]/∂x − λ*[b, u]ᵀ(∂g[x*, u]/∂x))i = 0

But if (x*)i = 0 and (∂f[x*, u]/∂x − λ*[b, u]ᵀ(∂g[x*, u]/∂x))i < 0, a very small change in the constraints b cannot make (x*)i > 0. Thus in these cases the planning variables are "stuck" at zero for small changes in the constraints, and:

∂(x*)i[b, u]/∂b = 0

As a result, it must be the case that:

(∂f[x*, u]/∂x)(∂x*[b, u]/∂b) = λ*[b, u]ᵀ(∂g[x*, u]/∂x)(∂x*[b, u]/∂b)

Differentiating the complementary slackness condition for the shadow prices with respect to the resource availabilities, we see that:

∂(λ*[b, u]ᵀ(g[x*, u] − b))/∂b = λ*[b, u]ᵀ((∂g[x*, u]/∂x)(∂x*[b, u]/∂b) − I) + (g[x*, u] − b)ᵀ(∂λ*[b, u]/∂b) = 0

We know for any constraint j that either gj[x*, u] − bj = 0 or (λ*)j = 0. But if (λ*)j = 0 and gj[x*, u] − bj < 0, a very small change in the constraints b cannot make (λ*)j > 0. Thus in these cases the shadow prices are "stuck" at zero for small changes in the constraints, and:

∂(λ*)j[b, u]/∂b = 0

As a result, we know that:

(g[x*, u] − b)ᵀ(∂λ*[b, u]/∂b) = 0

λ*[b, u]ᵀ((∂g[x*, u]/∂x)(∂x*[b, u]/∂b) − I) = 0

λ*[b, u]ᵀ(∂g[x*, u]/∂x)(∂x*[b, u]/∂b) = λ*[b, u]ᵀ

But these observations imply that when we differentiate the value function with respect to the constraints, we will have:

∂F[b, u]/∂b = (∂f[x*[b, u], u]/∂x)(∂x*[b, u]/∂b) = λ*[b, u]ᵀ(∂g[x*, u]/∂x)(∂x*[b, u]/∂b) = λ*[b, u]ᵀ

The shadow prices represent the marginal contribution to the maximized value of the program from a small increase in the availability of the resources.
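This envelope result, ∂F[b, u]/∂b = λ*[b, u]ᵀ, is easy to check numerically on the illustrative Cobb-Douglas example used earlier (the numbers are again illustrative assumptions): the finite-difference derivative of the value function with respect to the budget should match the shadow price λ* = 1/m.

    import numpy as np
    from scipy.optimize import minimize

    a, p = 0.3, np.array([2.0, 5.0])
    shares = np.array([a, 1 - a])

    def F(m):
        """Value function: maximized log Cobb-Douglas utility for budget m."""
        res = minimize(lambda x: -(shares @ np.log(x)), x0=[1.0, 1.0],
                       bounds=[(1e-9, None)] * 2,
                       constraints=[{"type": "ineq", "fun": lambda x: m - p @ x}],
                       method="SLSQP")
        return -res.fun

    m, eps = 100.0, 0.1
    print((F(m + eps) - F(m - eps)) / (2 * eps))   # approximately 0.01
    print(1.0 / m)                                 # lam* = 0.01: dF/db equals the shadow price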

What happens to the shadow prices when the parameters change? Differentiating the complementary slackness condition for the shadow prices with respect to the parameters, we have:

∂(λ*[b, u]ᵀ(g[x*, u] − b))/∂u = λ*[b, u]ᵀ((∂g[x*, u]/∂x)(∂x*[b, u]/∂u) + ∂g[x*, u]/∂u) + (g[x*, u] − b)ᵀ(∂λ*[b, u]/∂u)

= λ*[b, u]ᵀ((∂g[x*, u]/∂x)(∂x*[b, u]/∂u) + ∂g[x*, u]/∂u) = 0

λ*[b, u]ᵀ(∂g[x*, u]/∂x)(∂x*[b, u]/∂u) = −λ*[b, u]ᵀ(∂g[x*, u]/∂u)

For the value function, this implies:

∂F[b, u]/∂u = (∂f[x*[b, u], u]/∂x)(∂x*[b, u]/∂u) + ∂f[x*[b, u], u]/∂u

= λ*[b, u]ᵀ(∂g[x*, u]/∂x)(∂x*[b, u]/∂u) + ∂f[x*[b, u], u]/∂u

= −λ*[b, u]ᵀ(∂g[x*, u]/∂u) + ∂f[x*[b, u], u]/∂u

Thus we can evaluate the impact of a small change in the parameters on the value of the problem by simply adding up the impact on the constraints valued at the shadow prices and the direct impact on the objective function.

For example, if f is a utility function, and g is a single budget constraint, g[x, p] = pᵀx ≤ m, we can treat the prices p as parameters. In this case F[m, p] is the indirect utility function, and since the prices do not directly affect utility, we have:

∂F[m, p]/∂p = −λ*[m, p](∂g[x*, p]/∂p) = −λ*[m, p] x*[m, p]

∂F[m, p]/∂m = λ*[m, p]

x*[m, p] = −(∂F[m, p]/∂p) / (∂F[m, p]/∂m)

This is "Roy's Identity" which links the Marshallian demand functions to the derivatives of the indirectutility function.
