
MUÑOZ,F. Advanced microeconomic theory. WSU

Date post: 28-Nov-2015
EconS 501: ADVANCED MICROECONOMIC THEORY – I

LECTURE NOTES

Felix Munoz-Garcia 1

School of Economic Sciences

Washington State University

This document contains a set of partial lecture notes that are intended to serve as a starting point when coming to class, so every student can complement them with additional examples, exercises and applications discussed in class. (Do not quote).

1 103G Hulbert Hall, School of Economic Sciences, Washington State University. Pullman, WA 99164-6210, [email protected]. Tel. 509-335-8402.



Chapter 1 – Preferences and Utility

Preference and Choice

We begin our analysis of individual decision-making in an abstract setting. We first specify a set of possible alternatives (denoted by X) for a particular decision maker. This set might include the consumption bundles that an individual is considering, the career paths that a student is considering, or any general list of alternatives. Given this set, we approach the decision-making process in two different ways: first using the “preference-based approach” and second using the “choice-based approach”. The first approach analyzes how the individual would use his preferences to choose an element (or elements) from the set of alternatives X. We will then impose some rationality assumptions on the individual’s preferences. The second approach analyzes, instead, the actual choices the individual makes when he is called to choose an element (or elements) from the set of possible alternatives. As in the preference-based approach, we will also impose some consistency conditions on the choices that the individual makes.

Both approaches have their own advantages. For instance, the choice-based approach is based on observables (the actual choices made by the individual decision-maker) while the preference-based approach is based on unobservables (the individual’s preferences).1 On the other hand, the preference-based approach is more tractable than the choice-based approach, especially when the set of alternatives X contains many elements (which is usually the case in individual decision-making problems).2

After describing both approaches, and the assumptions that we impose on each of them, we want to understand the relationship (and potential equivalence) between them. Hence, we will examine under which conditions rational preferences imply consistent choice behavior, and under which conditions the opposite relationship holds.

Preference-based approach

Let us start with the preference-based approach.3 We understand preferences as “attitudes” of the decision-maker towards the set of alternatives X; preferences should hence specify the attitude of the decision-maker towards each pair of alternatives. These attitudes are obtained by presenting a questionnaire Q to the individual. In particular, for all elements x and y that belong to the set of alternatives X, this questionnaire asks: how do you compare elements x and y? Check one and only one box.

□ I prefer x to y (which we write as $x \succ y$), or

□ I prefer y to x (which we write as $y \succ x$), or

□ I am indifferent (which we write as $x \sim y$).

1 The choice-based approach could in principle allow for more general behavioral motives than the preference-based approach. However, as we will see, this is only in principle, since the preference-based approach will also allow for very general individual preferences.

2 This reason explains why the preference-based approach is explained in more detail in most intermediate microeconomics textbooks.

3 We will be using Rubinstein (lecture one) and MWG (Ch. 1B).



Note that we are asking the individual decision-maker to check only one box. This is related to the completeness assumption on individual preferences.4 In particular, we say that a preference relation is complete if, for any two alternatives x and y that belong to the set of alternatives X, either alternative x is strictly preferred to y, or y is strictly preferred to x, or the individual decision-maker is indifferent between x and y. This implies that the individual is capable of comparing any pair of alternatives that we present to him. This might be a relatively strong assumption if we think about goods that we haven't consumed in the past or goods that we haven’t even seen before. Think, for instance, about the last time you were in a new ethnic restaurant in which the descriptions in the menu did not help you decide what to order. This assumption hence considers that the individual decision-maker has had enough time to compare all alternatives, and that he is ready to express his preference for one of them (or indifference between the two) when we ask him to compare any two alternatives x and y.

Remark: note, however, that not all binary relations satisfy completeness. Indeed, the binary relation “is the brother of” does not hold between every pair of elements (persons) in the set of available alternatives (set X in this case could be a given group of people). If we select John and Bob from this group, we might observe that neither John is the brother of Bob nor Bob is the brother of John; i.e., they are not related. That is, not all pairs of alternatives are comparable according to this binary relation. Hence, this binary relation does not satisfy completeness. Similarly, the binary relation “is the father of” doesn't satisfy completeness since, from a group of people, we can select two persons that are not related.

Let us now turn to weak preferences. In order to learn the weak preferences of an individual we present a questionnaire R to him as follows: for all alternatives x and y in the set of alternatives X (where x and y are not necessarily distinct),5 is alternative x at least as preferred as y?

□ Yes, which we write as $x \succsim y$.

□ No, which we write as $y \succ x$.

The respondents therefore must answer yes, no, or both.6 We are now ready to define what we mean by a rational preference relation. We say that a preference relation $\succsim$ is rational if it possesses the following two properties:

4 Note also that we do not allow the individual to add a new box in which he writes “I love x and y”; in other words, we do not allow him to specify the intensity of his preferences over two alternatives.

5 Note that we do not assume that alternatives x and y are different. In the case that they coincide, the definition of completeness becomes the reflexivity assumption. We discuss this assumption below, but at this stage, we can understand the reflexivity assumption as a condition on the preference relation guaranteeing that every alternative x is weakly preferred to, at least, one alternative: itself.

6 Note that this refers to the assumption of completeness again, since we ask the individual to be able to compare any pair of alternatives, where now this comparison is done using the weak preference symbol rather than the strict preference symbol.



Completeness: For any pair of alternatives x and y in the set of alternatives X, either $x \succsim y$, or $y \succsim x$, or both ($x \sim y$).

Transitivity: For any three alternatives x, y and z in the set of alternatives X, if $x \succsim y$ and $y \succsim z$, then it must be that $x \succsim z$.
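The two properties can be checked mechanically when the set of alternatives is finite. The following is a minimal sketch, assuming the weak preference relation is stored as a set of ordered pairs (a, b) meaning "a is at least as good as b"; the function names and the two sample relations are illustrative and not part of the notes.

```python
from itertools import product

def is_complete(X, weak):
    """True if, for every pair (a, b), a is weakly preferred to b or b to a."""
    return all((a, b) in weak or (b, a) in weak for a, b in product(X, repeat=2))

def is_transitive(X, weak):
    """True if a >= b and b >= c always imply a >= c."""
    return all(
        (a, c) in weak
        for a, b, c in product(X, repeat=3)
        if (a, b) in weak and (b, c) in weak
    )

def is_rational(X, weak):
    return is_complete(X, weak) and is_transitive(X, weak)

X = ["apple", "banana", "orange"]
reflexive = {(a, a) for a in X}

# Illustrative ranking: apple over banana over orange.
ranking = reflexive | {("apple", "banana"), ("banana", "orange"), ("apple", "orange")}

# Illustrative cycle: apple over banana over orange over apple.
cycle = reflexive | {("apple", "banana"), ("banana", "orange"), ("orange", "apple")}

print(is_rational(X, ranking))  # complete and transitive
print(is_complete(X, cycle), is_transitive(X, cycle))  # complete but not transitive
```

The cyclic relation is complete, since every pair can be compared, but it fails transitivity, so it is not rational in the sense defined above.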

The assumption of transitivity is often understood as requiring that individual preferences do not cycle. To understand this point, consider an example in which an individual’s preferences do not satisfy transitivity. James weakly prefers an apple to a banana, and he weakly prefers a banana to an orange. However, he prefers an orange to an apple. (Note that according to transitivity, he should have weakly preferred an apple to an orange.) What is the problem with this intransitive preference relation? James would be wiped out of the market. Indeed, businessmen could approach James (when James owns an orange) and offer him a banana in exchange for his orange plus one dollar. James will probably accept the deal since he prefers a banana to an orange. Then the businessmen could approach James again and offer him an apple in exchange for his banana plus a dollar, something James will also accept, since he prefers an apple to a banana. Finally, the businessmen could approach James again, offering him an orange in exchange for the apple he now owns plus another dollar. Since James’ preferences are intransitive (and therefore he prefers an orange to an apple) he would accept this deal. However, this makes James return to his original position, owning an orange, but having spent three dollars in the process. Of course, this cycle could be repeated ad infinitum, extracting all of James’ wealth.
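The trading cycle can be traced step by step. The sketch below assumes, as the story does, that James accepts every trade towards a good he prefers and pays a fixed fee of one dollar each time; the function name `money_pump` and the list encoding of his preferences are illustrative.

```python
def money_pump(start_good, prefers, fee=1, rounds=3):
    """Trade up whenever an offered good is preferred to the one held.

    prefers is a list of pairs (better, worse). Each accepted trade costs `fee`.
    Returns the good held at the end and the total amount paid.
    """
    holding, paid = start_good, 0
    for _ in range(rounds):
        # The businessman offers the good that James prefers to his current one.
        offer = next(better for (better, worse) in prefers if worse == holding)
        holding, paid = offer, paid + fee
    return holding, paid

# James' cyclic preferences from the example.
james = [("banana", "orange"), ("apple", "banana"), ("orange", "apple")]

print(money_pump("orange", james))  # ('orange', 3)
```

After three trades James holds an orange again and has paid three dollars; raising `rounds` repeats the cycle and the payments grow without bound.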

Despite the previous argument for why we should not observe individual decision-makers with intransitive preference relations, there are situations in which intransitivities might arise:

First example. Comparing elements that are too close to be distinguishable.

When two alternatives are extremely similar we are often unable to state which of them we prefer. Consider the following example. Take the set of alternatives X to be the real numbers, e.g., the size of a piece of pie. An individual states that he prefers alternative x to y if x>=y+1 (x-1>=y), but he is indifferent between x and y if the two alternatives are very close together, i.e., |x-y|<1. Intuitively, he prefers x to y only when alternative x is larger than y by at least one unit. If the difference between the two alternatives is smaller than one he cannot tell them apart, and the individual is indifferent between them. Then,

Alternative 1.5 is indifferent to 0.8 since 1.5-0.8=0.7<1, and

Alternative 0.8 is indifferent to 0.3 since 0.8-0.3=0.5<1

Therefore, by transitivity we would have that 1.5 is indifferent to 0.3, but in fact 1.5 is preferred to 0.3, since the former is larger than the latter by more than one unit (1.5-0.3=1.2>1). This shows the presence of an intransitive preference relation.7
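The arithmetic of this example can be verified directly. This is a minimal sketch, assuming that strict preference requires a difference of at least one unit and indifference a difference smaller than one, as described above; the function names are illustrative.

```python
def strictly_prefers(x, y):
    """x is preferred to y only when x exceeds y by at least one unit."""
    return x - y >= 1

def indifferent(x, y):
    """Alternatives closer than one unit cannot be told apart."""
    return abs(x - y) < 1

a, b, c = 1.5, 0.8, 0.3

print(indifferent(a, b))       # True
print(indifferent(b, c))       # True
print(indifferent(a, c))       # False: transitivity of indifference fails
print(strictly_prefers(a, c))  # True
```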

7 Note that this example could be applicable to milligrams of sugar in your coffee (very difficult to distinguish) or to similar shades of gray paint on the wall in a room. You might not be able to distinguish one milligram more of sugar in your coffee (a slightly darker gray color on your office walls, respectively), but you can probably detect when your coffee is becoming too sweet (or your office is almost black!).



Second example. Framing effects. In certain cases transitivity might be violated because of the way in which alternatives are presented to the individual decision-maker, also referred to as “framing effects.” Let us consider an example from Rubinstein, where he showed the following holiday packages to his Masters students, asking each student: “Which holiday package do you prefer?”

a. A weekend in Paris for 574 at a four star hotel.

b. A weekend in Paris at the four star hotel for 574.

c. A weekend in Rome at the five star hotel for 612.

Alternatives a and b are, of course, the same. This was indeed detected by most of the students, since they stated that they were indifferent between alternatives a and b. Moreover, they strictly preferred alternative b to c. By transitivity, we should hence expect that students who gave these responses would strictly prefer alternative a to c when asked to compare options a and c. However, this didn't happen. Indeed, more than 50% of the students responded that they strictly preferred alternative c to a, showing an intransitive preference relation, merely induced by the way in which the options were presented (framed) to the students.

Third example. Aggregation of considerations. In some cases several individual preferences must be aggregated into only one. In these situations we might find that the resulting preference relation violates transitivity. Let us consider the following example. The set of possible alternatives X contains three universities where you were admitted: MIT, WSU, and your home University. When considering which university to attend you might compare them according to different criteria. First, if you only consider academic prestige, a reasonable comparison would be:

$\succ_1$ : MIT $\succ_1$ WSU $\succ_1$ Home Univ

Second, considering the city size or congestion, your comparison could be:

$\succ_2$ : WSU $\succ_2$ Home Univ $\succ_2$ MIT

Finally, considering the proximity of the university to your family and friends, a reasonable comparison would be:

$\succ_3$ : Home Univ $\succ_3$ MIT $\succ_3$ WSU

We must now aggregate all of these considerations (for example, using majority rule). We do so by making all possible pair-wise comparisons and checking, for each pair, which university wins according to most of the criteria described above. When comparing MIT versus WSU, the former wins according to criteria 1 and 3. When comparing WSU versus your home university, the former beats the latter according to criteria 1 and 2. Finally, comparing your home university with MIT, the former wins against the latter according to criteria 2 and 3. However, this resulting preference relation violates transitivity.8 A similar argument can be used for the aggregation of individual preferences in group decision-making, where every person in the group has a different (transitive) preference relation but the group preferences (aggregated from these individual preferences in order to have a ranking of alternatives that politicians can use to take decisions affecting the welfare of the entire group) are not necessarily transitive.9
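The majority-rule aggregation can be reproduced in a few lines. The sketch below assumes the three criterion rankings that are consistent with the pairwise results reported above (prestige: MIT, WSU, Home; city size: WSU, Home, MIT; proximity: Home, MIT, WSU); the dictionary keys and the function name are illustrative.

```python
from itertools import combinations

# One ranking per criterion, listed from best to worst (inferred from the
# pairwise comparisons described in the text).
rankings = {
    "prestige": ["MIT", "WSU", "Home"],
    "city size": ["WSU", "Home", "MIT"],
    "proximity": ["Home", "MIT", "WSU"],
}

def majority_winner(a, b, rankings):
    """Return the alternative ranked higher by most of the criteria."""
    votes_for_a = sum(r.index(a) < r.index(b) for r in rankings.values())
    return a if votes_for_a > len(rankings) / 2 else b

for a, b in combinations(["MIT", "WSU", "Home"], 2):
    print(f"{a} vs {b}: {majority_winner(a, b, rankings)} wins")
```

MIT beats WSU, WSU beats Home, and Home beats MIT, so the aggregated relation cycles even though each individual criterion is transitive.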

Fourth example. Intransitive preferences because there is a change in the underlying preferences. This is very common in individuals’ preferences over goods that create a strong dependency or that become addictive. For instance, when an individual starts smoking his preferences over cigarettes might be: one cigarette is weakly preferred to no smoking, and no smoking is weakly preferred to smoking heavily. Hence, according to transitivity, he should prefer one cigarette to smoking heavily. However, once this individual has been smoking for several years, his preferences over cigarettes could have changed to: smoking heavily is weakly preferred to one cigarette, and one cigarette is weakly preferred to no smoking at all. According to this new preference relation, and using transitivity, we can conclude that this individual now prefers smoking heavily over having only one cigarette. But this conclusion contradicts the individual’s past preferences when he started to smoke.10

Utility function

Once we have defined the main assumptions behind a rational preference relation, we are ready to define a utility function. A function u:X→R from the set of alternatives to the set of real numbers is a utility function representing a preference relation $\succsim$ if, for every pair of alternatives x and y that belong to X,

$x \succsim y$ is equivalent to $u(x) \geq u(y)$.

Let us emphasize two main points from this definition. First, only the ranking of alternatives matters. Indeed, a utility function u(x) such that u(x)=14 and u(y)=10 provides the same ranking of alternatives x and y as a utility function u’(x) where u’(x)=2000 and u’(y)=3, since both utility functions rank alternative x above alternative y. Hence, the individual does not care about cardinality (the number that the utility function associates with each alternative) but instead cares only about ordinality (the ranking of utility values among alternatives). Second, we can apply any strictly increasing function f(.) to the utility function u(x) and obtain a new function v(x)=f(u(x)). Importantly, the values associated with this new function keep the ranking of alternatives intact, and therefore the new function still represents the same preference relation.11

8 You can check this by noticing that we created a cycle, since: MIT $\succ$ WSU (criteria 1 and 3), WSU $\succ$ Home Univ (criteria 1 and 2), and Home Univ $\succ$ MIT (criteria 2 and 3).

9 This is the so-called Condorcet paradox, extensively studied in social choice problems.

10 Whether this is a real form of intransitivity has been questioned, since the individual decision-maker could be regarded as a different individual according to the period of time in which he states his preferences over alternatives. Hence we will only refer to the first three types of intransitivities.
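The invariance of the ranking under strictly increasing transformations of the utility function can be illustrated numerically. The sketch below uses the values u(x)=14 and u(y)=10 from the text and adds a hypothetical third alternative z; the logarithm and the decreasing transformation are additions for contrast and do not appear in the notes.

```python
import math

# u(x) and u(y) come from the text; the value for z is a hypothetical addition.
u = {"x": 14, "y": 10, "z": 3}

def ranking(util):
    """Alternatives ordered from highest to lowest utility."""
    return sorted(util, key=util.get, reverse=True)

def transform(util, f):
    """Build v(x) = f(u(x)) for every alternative."""
    return {alt: f(value) for alt, value in util.items()}

increasing = {
    "3u": lambda t: 3 * t,
    "5u+8": lambda t: 5 * t + 8,
    "log u": math.log,  # strictly increasing on positive utility values
}

for name, f in increasing.items():
    print(name, ranking(transform(u, f)) == ranking(u))  # True in each case

# A strictly decreasing transformation reverses the ranking instead.
print(ranking(transform(u, lambda t: -t)))  # ['z', 'y', 'x']
```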

Note on reflexivity. Note that we didn't define reflexivity in our previous discussion. In particular a preference relation satisfies reflexivity if for any alternative x in X, we have that:

1. $x \sim x$, so that any bundle is indifferent to itself,

2. $x \succsim x$, so that any bundle is preferred or indifferent to itself, and

3. $x \nsucc x$.

This assumption ensures that any bundle belongs to at least one indifference set12, namely the set containing itself if nothing else. Note, however, that reflexivity is implied by completeness. Indeed, if we set y equal to x in the definition of completeness, we obtain $x \succsim x$, which is the assumption of reflexivity.

Choice based approach

In the choice based approach we focus on the actual choices made by the individual, rather than on the process of introspection by which the individual discovers his own preferences over different alternatives. In the choice based approach we use the so-called “choice structure,” which contains two elements:

1. $\mathcal{B}$ is a family of nonempty subsets of X, so that every element of $\mathcal{B}$ is a set $B \subset X$. Let us provide some examples of sets B.

a. In consumer theory, set B can be understood as a particular set of all the affordable bundles for a consumer, given his wealth and the market prices. (We refer to this set of affordable bundles as the consumer’s budget set.) Note that the budget set can be defined as a subset of the real numbers.13

b. Set B can also be understood as a particular list of all the universities where you were admitted, among all universities in the scope of your imagination X, i.e., $B \subset X$.

2. c(.) is a choice rule that selects, for each budget set B, a subset of elements of B, with the interpretation that c(B) are the chosen elements from B.

11 For instance, v(x)=3u(x), v(x)=5u(x)+8, etc. are all examples of strictly increasing functions applied to the original utility function u(x) that represent the same preference relation as u(x) since all of them maintain the same ranking of utility values associated to each alternative x in X.

12 Below we provide a more detailed description of indifference sets, but note that they can be understood as the set of alternatives over which the consumer is indifferent. Using an example from consumer theory, recall that indifference sets are graphically represented using indifference curves, reflecting the set of bundles for which the consumer reaches the same utility level.

13 In the case of consuming only two goods, the set of affordable bundles B (budget set) becomes a subset of $\mathbb{R}^2_+$, i.e., a subset of the positive quadrant, which represents all possible bundles.



a. Following with our example of consumer theory, c(B) would be the bundle/s that the individual chooses to buy, among all bundles he can afford in the budget set B; and

b. In the example of the universities you were admitted to, c(B) would contain the university that you choose to attend.

Note that c(B) might contain a single element, in which case the choice rule is a function, or it might contain more than one element, in which case the choice rule is a correspondence.14

Examples. Let us now see two examples of choice structures. Define the set of alternatives as X={x,y,z}, and consider two different budget sets (both of them being subsets of the set of alternatives X), budget set B1={x,y} and budget set B2={x,y,z}.

In choice structure one, the individual chooses element x, and only x, regardless of which budget set is presented to him. That is, c1({x,y})={x} and c1({x,y,z})={x}. In choice structure two the individual still selects only alternative x when he is confronted with budget set B1, i.e., c2({x,y})={x}. However, when the budget set is enlarged to contain alternative z as well, as it does in B2, his choice switches to only alternative y, i.e., c2({x,y,z})={y}. (We will comment on the consistency of this choice rule below).

Consistency on choices: the Weak Axiom of Revealed Preference (WARP)

Paralleling the rationality assumption on the preference-based approach, we now impose a consistency requirement on the choice-based approach. Specifically, we consider that the actual choices of an individual are consistent if they satisfy the following weak axiom of revealed preference (WARP).

We say that the choice structure $(\mathcal{B}, c(\cdot))$ satisfies the WARP if the following holds:

If, for some budget set $B \in \mathcal{B}$ with $x, y \in B$, we have that element x is chosen, $x \in c(B)$, then for any other budget set $B' \in \mathcal{B}$ where alternatives x and y are also available, $x, y \in B'$, and where alternative y is chosen, $y \in c(B')$, we must have that alternative x is chosen as well, $x \in c(B')$.

Example. When the individual decision-maker faces budget set B={x,y}, he chooses only alternative x. When he faces an enlarged budget set B’ (which contains the same alternatives as budget set B (x and y), but also alternative z), then his “legal” choices according to the WARP are:

1. x, which can be rationalized because alternative x is still the best alternative even after including z as an additional alternative.

2. z, which can be explained because the new option z is better than the previous alternatives x and y.

14 Informally, we generally understand a function as a mathematical mapping that provides a single element in the range to each element in the domain, while a correspondence is understood as a mapping providing more than one element in the range to each element in the domain.



3. x and z, which can be justified because new option z is very similar to alternative x and as a consequence the decision-maker chooses both.

Note that the individual decision-maker cannot select alternative y alone if his choice rule satisfies WARP. Indeed, this alternative was available under budget set B but the individual did not select it when B was presented to him. Hence, the fact that new options are now available should not cause an alternative that was affordable but not selected under the old budget set to be selected under the newly enlarged budget set B’. For the same reason, the individual cannot choose {x,y} when facing budget set B’, since alternative y is contained in this choice. As suggested in the previous argument, y was not selected under budget set B, and therefore cannot be part of the choices made by the individual under budget set B’.
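The WARP condition and the example above can be checked by brute force on a finite set of alternatives. This is a minimal sketch, assuming a choice structure is stored as a dictionary from budget sets (frozensets) to sets of chosen elements; `satisfies_warp` and `nonempty_subsets` are illustrative helper names.

```python
from itertools import combinations

def satisfies_warp(choice):
    """choice maps each budget set (a frozenset) to the set of chosen elements."""
    for B, chosen_B in choice.items():
        for B2, chosen_B2 in choice.items():
            for x in chosen_B:
                for y in chosen_B2:
                    # x is chosen from B with y available, and y is chosen from B2
                    # with x available: WARP then requires x to be chosen from B2.
                    if y in B and x in B2 and x not in chosen_B2:
                        return False
    return True

def nonempty_subsets(S):
    items = sorted(S)
    return [set(c) for r in range(1, len(items) + 1) for c in combinations(items, r)]

B, B_prime = frozenset("xy"), frozenset("xyz")

# The two choice structures from the Examples above.
structure_one = {B: {"x"}, B_prime: {"x"}}
structure_two = {B: {"x"}, B_prime: {"y"}}
print(satisfies_warp(structure_one))  # True
print(satisfies_warp(structure_two))  # False

# Every choice from B' that is compatible with WARP when only x is chosen from B.
legal = [c for c in nonempty_subsets(B_prime) if satisfies_warp({B: {"x"}, B_prime: c})]
print(legal)
```

The enumeration leaves exactly three choices from B’: x alone, z alone, and the pair x and z, which matches the list given in the example.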

The following figure illustrates a choice rule that satisfies the WARP. Indeed, the individual decision-maker selects alternatives x and y, both when facing budget set B and B’.

Figure #1.1

In contrast, the figure below represents a choice rule violating WARP, since the individual chooses only x when facing budget set B, but switches to y when facing budget set B’, despite the fact that both alternatives x and y were available under budget set B. (Note that this choice rule is similar to choice rule 2 in our example in the previous section, where the individual decision-maker chooses only x when confronted with budget set B, and only y when facing budget set B’, where both x and y belong to budget sets B and B’. For this reason, we can conclude that choice rule 2 above violates the WARP).



Figure #1.2

We can now construct the preferences that the individual reveals in his actual choices when he is asked to choose an element (or elements) from different budget sets.

A. First, if there is some budget set B for which the individual chooses x, where alternatives x and y belong to B, then we can say that alternative x is revealed at least as good as alternative y, and denote it as $x \succsim^* y$.

B. Second, if there is some budget set B for which the individual chooses x but he does not select y, where alternatives x and y belong to B, then we can say that alternative x is revealed preferred to alternative y, and denote it as $x \succ^* y$.

[Note that when $x \succsim^* y$ in the above point, the individual decision-maker is allowed to choose both alternative x and y. However, when $x \succ^* y$, the individual is only allowed to choose x.]

Let $C^*(B, \succsim)$ be the set of optimal choices generated by the preference relation $\succsim$ when facing a budget set B. Using the revealed preference notation, we can restate the WARP as follows:

If alternative x is revealed at least as good as y, then y cannot be revealed preferred to x, i.e., if $x \succsim^* y$, then we cannot have $y \succ^* x$.

We finally examine the relationship between the preference-based approach and the choice-based approach. In particular we want to investigate under which conditions a rational preference relation implies that the choice structure satisfies the WARP, and under which conditions the opposite relationship holds. Let us next check that a rational preference relation implies that the choice structure satisfies the WARP.


First, suppose that for some budget set $B \in \mathcal{B}$, we have that $x, y \in B$ and $x \in C^*(B, \succsim)$. Then,

$x \in C^*(B, \succsim) \Rightarrow x \succsim y$, for all $y \in B$.

In order to check WARP, assume some other budget set $B' \in \mathcal{B}$ with $x, y \in B'$, and $y \in C^*(B', \succsim)$. Then,

$y \in C^*(B', \succsim) \Rightarrow y \succsim z$, for all $z \in B'$.

Combining the conclusions from the previous two points, $x \succsim y$ and $y \succsim z$, we can apply transitivity (because the preference relation is rational), and we obtain $x \succsim z$ for all $z \in B'$. Then $x \in C^*(B', \succsim)$, and we find that $x, y \in C^*(B', \succsim)$, which proves that WARP is satisfied.
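The result can also be checked numerically on a small example. The sketch below assumes a rational preference relation over X={x,y,z} that is summarized by hypothetical utility numbers (x and y indifferent, both preferred to z), generates the optimal choices for every nonempty budget set, and then tests WARP; all names and values are illustrative.

```python
from itertools import combinations

def optimal_choices(B, u):
    """C*(B, >=): elements of B that are weakly preferred to every element of B."""
    return {x for x in B if all(u[x] >= u[y] for y in B)}

def satisfies_warp(choice):
    """choice maps each budget set (a frozenset) to the set of chosen elements."""
    for B, chosen_B in choice.items():
        for B2, chosen_B2 in choice.items():
            for x in chosen_B:
                for y in chosen_B2:
                    if y in B and x in B2 and x not in chosen_B2:
                        return False
    return True

X = ["x", "y", "z"]
u = {"x": 2, "y": 2, "z": 1}  # hypothetical values representing x ~ y, both above z

# All nonempty subsets of X serve as budget sets.
budget_sets = [frozenset(c) for r in range(1, len(X) + 1) for c in combinations(X, r)]
choice = {B: optimal_choices(B, u) for B in budget_sets}

print(sorted(choice[frozenset(X)]))  # ['x', 'y']
print(satisfies_warp(choice))        # True
```

A single numerical example does not replace the proof above, but it shows how the choices generated by a rational preference relation pass the WARP test.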

The opposite relationship (where a choice structure satisfying the WARP implies a rational preference relation) does not hold in general. It does hold if the family of budget sets $\mathcal{B}$ includes all subsets of X of up to three elements. MWG describes a proof from Arrow about this result.

Consumption sets

In this lecture we start considering the set of feasible bundles for the consumer. In one common usage, the consumption set is the set of affordable bundles: one way to define it is by a set of prices, one for each possible good, and a budget. A consumption set could also be defined in a model by some other set of restrictions on the set of possible consumption bundles. More generally, the consumption set is a subset of the commodity space, $X \subset \mathbb{R}^L$, whose elements are the consumption bundles that the individual can conceivably consume given the physical constraints imposed by his environment. For example, if consumer i can consume nonnegative quantities of all goods, it is standard to take his consumption bundle $x_i$ to be a member of $\mathbb{R}^L_+$, where L is the number of goods. Normally, if the agent is endowed with a set of goods, the endowment is in the consumption set.

Let's denote a commodity bundle x as the vector of L components. For generality, at this stage we allow each component to be positive or negative.

We can impose physical or economic constraints on the consumption set. Physical constraints stem from the consumer's environment, including legal restrictions (such as the maximum amount of consumption of a particular good, the maximum amount of working hours, etc.). In contrast, economic constraints on the consumption set emerge from market prices and the individual's income, which determine the set of affordable bundles for the individual.



Physical constraints

Let us first have a look at the effect of imposing physical constraints on a consumption set. For simplicity we will assume that the consumption set is a subset of the nonnegative real numbers. The following figure illustrates a particular physical constraint in the labor market. Specifically, a worker cannot work more than 24 hours a day, and therefore his maximum amount of leisure is 24 hours. If a law establishes a maximum working day of 16 hours, his consumption set would shrink, and would be represented by the area from 8 to 24 hours of leisure per day.

Figure #1.3

The following figure indicates the presence of indivisibilities in the consumption of good two. Indeed, this good can only be consumed in integer amounts, while good one can be consumed in arbitrarily small amounts. Therefore, the consumption set is given by the union of different horizontal lines, each of them representing a particular amount of the indivisible good.

Figure #1.4

The next figure represents the consumption of two goods that cannot be enjoyed simultaneously in the same consumption bundle. In particular, good one denotes the consumption of bread in Seattle at noon, while good two denotes the consumption of bread in New York City at noon on the same day. Therefore, the consumption set coincides with the two axes. Indeed, the consumer can choose to consume any amount of bread in Seattle, or any amount in New York City, but not a combination of the two.



Figure #1.5

Finally, the following figure illustrates the presence of a minimum amount of bread that guarantees survival. Specifically, the consumer must eat at least four slices of either type of bread (white or brown) in order to avoid starvation. He can do so by consuming four slices of one type of bread, or a combination of the two types.

Figure #1.6

We can now define convexity in consumption sets. We say that a consumption set X is convex if, for any two consumption bundles x and x’ in X, the bundle

$x'' = \alpha x + (1 - \alpha) x'$, for all $\alpha \in (0,1)$,

is also an element of the consumption set X. Intuitively, a consumption set is convex if for any two bundles that belong to the set we can construct a straight line connecting them that lies completely in the set. As practice, let us check if some of the previous consumption sets satisfy this definition of convexity. First, note that the consumption set representing the presence of indivisibilities in consumption, where the individual can only consume integer amounts of good two, is not convex, since the linear combination (straight line) between any two bundles that belong to the set does not necessarily lie in the set. Similarly, the consumption set with consumption of bread in Seattle and New York City at noon on the same day does not satisfy convexity either. Indeed, the linear combination of a bundle with bread only in Seattle and a bundle with bread only in New York City lies entirely outside the consumption set.15 Note that by aggregating data, such as considering the consumption of bread in Seattle (or in New York City) during an entire month, we would be able to “convexify” the consumption set, since individuals would be able to consume both goods during that time span.

15 As a remark, note that we are not considering the endpoints of the straight line that connects bundles x and x’ in the definition of convexity, since we impose that alpha is strictly larger than 0 and strictly smaller than 1.
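The convexity checks discussed above can be reproduced with a few sample bundles. The sketch below encodes three of the consumption sets as membership tests and evaluates a convex combination with alpha equal to one half; the function names and the sample bundles are illustrative. A single bundle falling outside the set is enough to show that a set is not convex, whereas a bundle falling inside only illustrates, and does not prove, convexity.

```python
def convex_combination(a, b, alpha):
    """Bundle alpha*a + (1 - alpha)*b, computed component by component."""
    return tuple(alpha * ai + (1 - alpha) * bi for ai, bi in zip(a, b))

def in_indivisible_set(bundle):
    """Good two can only be consumed in integer amounts."""
    x1, x2 = bundle
    return x1 >= 0 and x2 >= 0 and float(x2).is_integer()

def in_two_cities_set(bundle):
    """Bread in Seattle or in New York City at noon, never both."""
    x1, x2 = bundle
    return x1 >= 0 and x2 >= 0 and (x1 == 0 or x2 == 0)

def in_survival_set(bundle):
    """At least four slices of bread in total."""
    x1, x2 = bundle
    return x1 >= 0 and x2 >= 0 and x1 + x2 >= 4

# Indivisibilities: the midpoint of (1, 0) and (1, 1) has half a unit of good two.
print(in_indivisible_set(convex_combination((1, 0), (1, 1), 0.5)))  # False

# Two cities: the midpoint of (2, 0) and (0, 2) has bread in both locations.
print(in_two_cities_set(convex_combination((2, 0), (0, 2), 0.5)))   # False

# Survival: the midpoint of (4, 0) and (0, 4) still contains four slices.
print(in_survival_set(convex_combination((4, 0), (0, 4), 0.5)))     # True
```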

Economic constraints

Before defining the economic constraints in the individual’s consumption set, let us first discuss some of the assumptions we make on the price vector.

1. We assume that all commodities can be traded in a market, at prices that are publicly observable. This is the so-called principle of completeness of markets (or universality of markets), since all goods can be traded. Importantly, note that this assumption discards the possibility that some goods cannot be traded, such as pollution when no property rights are clearly defined.

2. Prices are assumed strictly positive for all L goods. We denote this by writing p>>0, i.e., pk>0 for all goods k. Note that some prices could be negative in some circumstances, such as the price of pollution, since individuals would be willing to pay in order to have less of it. We do not allow for negative prices in the following chapters, but we return to this possibility when we discuss externalities.

3. Price-taking assumption: a consumer’s demand for each of the goods he consumes represents a small fraction of the total demand for that good. Therefore, his decision on whether or not to buy the good does not affect market prices.16

We are now ready to define the set of affordable bundles for the consumer. In particular bundle x, describing the amounts purchased of L different goods, is affordable if

$p_1 x_1 + p_2 x_2 + \dots + p_L x_L \le w$

or, in vector notation,

$p \cdot x \le w$

Note that $p \cdot x$ represents the total cost of bundle x at market prices p, while w represents the total wealth of the consumer.17 When the consumption set coincides with the set of nonnegative real vectors, $\mathbb{R}^L_+$, the set of feasible (affordable) consumption bundles consists of the elements in the following set:

$B_{p,w} = \{x \in \mathbb{R}^L_+ : p \cdot x \le w\}$

Let us next see one example of a set of affordable consumption bundles where, for simplicity, we only consider two goods.

16 Note that this assumption will not be valid if the consumer possesses monopsony power in his demand for a particular good. This is the case, for instance, in labor markets where only one employer buys labor services in a relatively small locality.

17 Note here the usual distinction between wealth and income: wealth refers to all of the resources of the consumer during a certain time span (which can potentially include his entire lifetime), whereas income refers to the individual’s resources during a single time period.



Figure #1.7

Graphically, the upper boundary of the set of affordable consumption bundles represents the set of bundles for which the individual entirely exhausts his wealth buying different combinations of goods one and two, i.e., p1x1+p2x2=w, or in vector notation, px=w. We refer to this upper boundary as the budget line. Intuitively, note that if the individual exhausts all his wealth buying only good two (one), the maximum amount of this good he can afford is w/p2 (w/p1, respectively). Finally, note that the slope of the budget line is given by the price ratio –p1/p2.18 In the case that the consumer can buy more than two goods, the budget line is usually referred to as the budget hyperplane. The following figure illustrates the budget hyperplane for the case in which the consumer buys three different goods. Graphically, note that the budget hyperplane represents the surface of bundles for which the consumer exhausts his wealth.
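As a minimal sketch of the budget set for two goods, the following Python snippet checks affordability and computes the intercepts and slope of the budget line. The prices, wealth and bundles are hypothetical numbers chosen only for illustration, and the function names are not part of the notes.

```python
def cost(prices, bundle):
    """Total expenditure p.x of a bundle at the given prices."""
    return sum(p * x for p, x in zip(prices, bundle))


def is_affordable(prices, wealth, bundle):
    """True if the bundle is nonnegative and costs no more than wealth."""
    return all(x >= 0 for x in bundle) and cost(prices, bundle) <= wealth


# Hypothetical prices (p1, p2) and wealth w, chosen for illustration only.
prices = (2.0, 4.0)
wealth = 20.0

x1_intercept = wealth / prices[0]  # maximum amount of good one: w/p1
x2_intercept = wealth / prices[1]  # maximum amount of good two: w/p2
slope = -prices[0] / prices[1]     # slope of the budget line: -p1/p2

on_line = is_affordable(prices, wealth, (4.0, 3.0))    # costs exactly 20
too_costly = is_affordable(prices, wealth, (6.0, 3.0))  # costs 24

print(x1_intercept, x2_intercept, slope, on_line, too_costly)
```

With these numbers the bundle (4, 3) lies on the budget line, while (6, 3) lies outside the budget set.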

Figure #1.8

One important characteristic of the price vector is that it is orthogonal to the budget line. In order to see this, first note that on the budget line px=w for any x on the budget line. We can then take any other

18 Note that, solving for good two, the equation of the budget line is given by $x_2 = \frac{w}{p_2} - \frac{p_1}{p_2} x_1$, where w/p2 represents the vertical intercept while –p1/p2 represents the negative slope.



bundle x’ which also lies on the budget line, so that px’=w. Similarly, for any other bundle $\bar{x}$ on the budget line, $p\bar{x}=w$. We can now combine these results, finding that $p\bar{x}=px'=w$, or $p(x'-\bar{x})=0$, or simply

$p \cdot \Delta x = 0$

Since this result is valid for any two bundles on the budget line, the price vector must be perpendicular to the difference $\Delta x$ between any such pair. Hence, the price vector is perpendicular (orthogonal) to the budget line, as depicted in the following figure.
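The orthogonality argument can be checked numerically. The sketch below uses hypothetical prices and two hypothetical bundles that both exhaust the same wealth, and verifies that the inner product of the price vector with their difference is zero.

```python
def dot(u, v):
    """Inner product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical prices and wealth; both bundles below cost exactly w = 20.
prices = (2.0, 4.0)
wealth = 20.0
x_prime = (10.0, 0.0)  # 2*10 + 4*0 = 20
x_bar = (2.0, 4.0)     # 2*2  + 4*4 = 20

# Difference between the two bundles on the budget line.
delta_x = tuple(a - b for a, b in zip(x_prime, x_bar))

# p . delta_x = 0, so p is orthogonal to the budget line.
inner_product = dot(prices, delta_x)
print(delta_x, inner_product)
```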

Figure #1.9

Finally, we impose an assumption on the budget set which will become very convenient in later chapters when we analyze the optimal consumption bundle that the consumer selects among all the bundles he can afford. In particular, we require the budget set to be convex. That is, for any two bundles x and x’ in the budget set, the linear combination

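$x'' = \alpha x + (1-\alpha)x'$, with $\alpha \in (0,1)$,

also belongs to the budget set.19

We know that $p \cdot x \le w$ and $p \cdot x' \le w$. Then,

$p \cdot x'' = p \cdot [\alpha x + (1-\alpha)x'] = \alpha\, p \cdot x + (1-\alpha)\, p \cdot x' \le \alpha w + (1-\alpha) w = w$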

Note that the budget sets described above for two and three goods satisfied this definition of convexity since we could select any two bundles from the budget set, construct a linear combination between both of them (straight-line), and check that all the bundles in this linear combination belong to the budget set as well.

Let us see next an example of a budget set that doesn't satisfy convexity. In particular, it describes the set of affordable bundles for an individual working for a firm, with his consumption of leisure in the

19 Similarly to our definition of convexity for consumption sets, note that here we only consider alpha>0 and <1 strictly, since otherwise the extremes of the linear combination between bundles x and x’ (that is, bundles x and x’ themselves) would be included in our definition of convexity.



horizontal axis and his consumption of all other goods on the vertical axis. Starting from the horizontal intercept (where this individual enjoys 24 hours of leisure with no consumption of other goods), this individual can start working and obtain a wage of s dollars per hour for his first eight hours of work. If he works more than eight hours, he receives an overtime wage of s’>s dollars per hour, which allows him to consume a larger amount of other goods. However, when his labor income exceeds M dollars, he must pay a proportion t of his income in taxes, reducing his real wage (after taxes) to s’(1-t). Graphically, this implies that the budget line is relatively flat for the first eight hours of work, becomes steeper when the worker starts to receive overtime pay, but becomes flatter again when the worker is taxed.

Figure #1.10

Importantly, this budget set is not convex since, for some pairs of bundles, such as x and x’ in the figure, their linear combination does not lie in the budget set.20
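The failure of convexity can be illustrated with a small numerical sketch. The wage, overtime wage, income threshold and tax rate below are hypothetical, and the code assumes, for simplicity, that the tax rate t applies only to labor income above M (so that the after-tax wage on those hours is s’(1-t)).

```python
def max_consumption(leisure, s=10.0, s_over=15.0, M=250.0, t=0.2):
    """Upper boundary of the budget set: the most consumption affordable
    for a given number of leisure hours (out of 24).

    Assumption (not from the notes): tax rate t is applied only to the
    part of labor income that exceeds M.
    """
    hours = 24.0 - leisure
    regular_hours = min(hours, 8.0)
    overtime_hours = max(hours - 8.0, 0.0)
    gross_income = s * regular_hours + s_over * overtime_hours
    taxable_income = max(gross_income - M, 0.0)
    return gross_income - t * taxable_income


def affordable(leisure, consumption):
    """True if the (leisure, consumption) bundle lies in the budget set."""
    return 0.0 <= leisure <= 24.0 and 0.0 <= consumption <= max_consumption(leisure)


x = (24.0, 0.0)                        # no work at all
x_prime = (8.0, max_consumption(8.0))  # 16 hours of work, on the boundary

# Midpoint of x and x' (alpha = 1/2).
midpoint = tuple(0.5 * a + 0.5 * b for a, b in zip(x, x_prime))

print(affordable(*x), affordable(*x_prime), affordable(*midpoint))
```

Both endpoints are affordable, but their midpoint requires more consumption than eight hours of regular-wage work can buy, so it lies outside the budget set.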

Quasilinear preference relations

20 In our initial discussion of convexity of the budget set, we suggested that a non-convex budget set could lead to potential problems when solving for the optimal bundle that the consumer selects when solving his utility maximization problem. Indeed, note that for several preference relations the above non-convex budget set could lead to multiple solutions. Graphically, a given indifference curve could be tangent to the above budget line at several points.



Intuitively, the first condition simply states that if two bundles lie on the same indifference curve and we increase the amount of the first good contained in both bundles by the same quantity, then the newly created bundles must also lie on the same indifference curve. The second condition, on the other hand, states that if we increase the amount of the first good in bundle x, the newly created bundle must be strictly preferred to the original bundle x. These conditions can be easily understood by looking at the following figure:

Figure #1.11

Finally, note an implication of the above two conditions. In particular, if bundle x is strictly preferred to bundle y, then if we increase the amount of good one in bundles x and y by the same quantity, the enlarged bundle x must be preferred to the enlarged bundle y. This property is also illustrated in the figure.

After analyzing the definition of quasilinear preferences we can discuss how to detect quasilinear utility functions. In particular, a quasilinear utility function that you might have encountered in your intermediate microeconomics classes looks as follows





An example from undergraduate courses:

$U(x,y) = v(x) + b\,y$, where $b>0$ and $v(x)$ is non-linear.

This is easily generalizable to more goods,

$U(x,y,z) = v(x,y) + b\,z$, where $v(x,y)$ is non-linear in all other goods and z is a desirable good.

The MRS of such functions is constant in the good that enters linearly in the utility function. In other words, for a given level of good one, an increase in the amounts of good two does not affect the slope of the indifference curve. Let us see that with an example.

Figure #1.12
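A small numerical sketch can illustrate this property. It assumes a particular quasilinear form, U(x,y) = sqrt(x) + b*y with b = 2, which is only one hypothetical choice of v(x), and approximates marginal utilities with central differences.

```python
import math


def utility(x, y, b=2.0):
    """Hypothetical quasilinear utility: v(x) = sqrt(x), linear in y."""
    return math.sqrt(x) + b * y


def mrs(x, y, b=2.0, h=1e-6):
    """MRS between x and y, approximated with central differences."""
    mu_x = (utility(x + h, y, b) - utility(x - h, y, b)) / (2 * h)
    mu_y = (utility(x, y + h, b) - utility(x, y - h, b)) / (2 * h)
    return mu_x / mu_y


# Hold the non-linear good fixed at x = 4 and vary the linear good y.
mrs_values = [mrs(4.0, y) for y in (1.0, 5.0, 20.0)]
print([round(value, 4) for value in mrs_values])
```

For a fixed x, the MRS does not change as y varies; analytically it equals (1/(2*sqrt(x)))/b, which is 0.125 at x = 4 and b = 2.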

Note that another example is a linear preference relation (perfect substitutes), where both goods enter linearly into the utility function. We can therefore conclude that preferences over perfectly substitutable goods are a particular case of quasilinear preferences.

So far we have examined the assumptions behind preference relations, as well as particular types of preference relations and utility functions. However, we have not analyzed under which conditions we can guarantee that a preference relation can be represented with a utility function. Specifically, the assumptions we have considered so far are not enough to guarantee that any preference relation can be represented with a utility function. One example of a preference relation that cannot be represented by a utility function is the so-called lexicographic preference relation that we discuss next.

Lexicographic preferences:

$(x_1, x_2) \succsim (y_1, y_2)$ iff $x_1 > y_1$, or if $x_1 = y_1$ and $x_2 \ge y_2$

Intuitively, note that this preference relation works like alphabetizing a dictionary: first, the individual prefers bundle x if it contains more of good one than bundle y; if, however, both bundles contain the same amount of good one, then the individual prefers the bundle which contains more of the second good. One important characteristic of this preference relation is that its indifference set cannot be drawn as an



indifference curve. For a given bundle, there are no other bundles to which the consumer is indifferent. Let us examine this property by identifying the upper contour set, lower contour set, and the indifference set.

Figure #1.13

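For bundle $x' = (x_1', x_2')$:

$UCS(x') = \{x : x_1 > x_1', \text{ or } x_1 = x_1' \text{ and } x_2 \ge x_2'\}$

$LCS(x') = \{x : x_1 < x_1', \text{ or } x_1 = x_1' \text{ and } x_2 \le x_2'\}$

$IND(x') = \{x'\}$ (singletons)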

First, note that the upper contour set (UCS) of bundle x’ is the set of bundles containing more of good one, together with those bundles that contain the same amount of good one but have more of good two. Similarly, the lower contour set (LCS) is defined by those bundles that contain less of good one and those that, containing the same amount of good one, have less of good two. Hence, the UCS and LCS span the entire positive quadrant, leaving no room for the indifference set of bundle x’, other than the bundle itself. As a consequence, we say that the indifference set for bundle x’ is the bundle itself, or in other words, that IND(x’) is a singleton.
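The lexicographic ranking over two goods can be sketched in a few lines of Python. The reference bundle and the candidate bundles are hypothetical; the point is only that the indifference set collapses to the reference bundle itself.

```python
def weakly_prefers(x, y):
    """True if bundle x is at least as good as bundle y under
    lexicographic preferences over two goods."""
    return x[0] > y[0] or (x[0] == y[0] and x[1] >= y[1])


def indifferent(x, y):
    """True if each bundle is at least as good as the other."""
    return weakly_prefers(x, y) and weakly_prefers(y, x)


# Hypothetical reference bundle x' and a handful of candidate bundles.
x_prime = (2.0, 3.0)
candidates = [(3.0, 0.0), (2.0, 4.0), (2.0, 3.0), (2.0, 1.0), (1.0, 9.0)]

ucs = [x for x in candidates if weakly_prefers(x, x_prime)]
lcs = [x for x in candidates if weakly_prefers(x_prime, x)]
ind = [x for x in candidates if indifferent(x, x_prime)]

print(ucs, lcs, ind)
```

As a side note, Python compares tuples element by element from the left, so for two-good bundles `x >= y` produces the same ranking as `weakly_prefers(x, y)`.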

Hence, the previous example suggests that we need to impose an additional condition on preference relations in order to guarantee that they can be represented with a utility function. This property is continuity as we define below.

Continuity. A preference relation defined on X is continuous if it is preserved under limits. That is, for any sequence of pairs $\{(x^n, y^n)\}_{n=1}^{\infty}$ with $x^n \succsim y^n$ for all n, and $\lim_{n\to\infty} x^n = x$ and $\lim_{n\to\infty} y^n = y$, the preference relation is maintained at the limiting points, $x \succsim y$. Intuitively, this property states that there can be no sudden jumps in an individual’s preferences over a sequence of bundles, i.e., there are no sudden preference reversals. The following figure illustrates



preferences that satisfy continuity, where the individual decision-maker prefers bundle x1 to y1, x2 to y2, … and similarly at the limiting points of the sequence, where he still prefers bundle x to y.

Figure #1.14

Let us next show why a lexicographic preference relation doesn't satisfy continuity.

Figure #1.15

Notice the limits of the sequences. Intuitively, the individual prefers bundle x1 to y1 since the former contains more of good one than the latter. Similarly, the individual prefers bundle x2 to y2 given that the former still contains more of good one than the latter. However, at the limiting points of the sequence,



bundle x becomes (0,0) while bundle y is still (0,1). Therefore, both bundles contain the same amount of good one, and the individual ranks them based on the content of good two, leading to bundle y being strictly preferred to bundle x. This is a preference reversal and, as a result, a violation of continuity.
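The preference reversal can be reproduced with a short sketch. The limiting bundles (0,0) and (0,1) come from the discussion above; the specific sequence x^n = (1/n, 0) with y^n = (0, 1) for all n is an assumption made here for illustration.

```python
def weakly_prefers(x, y):
    """Lexicographic 'at least as good as' relation over two goods."""
    return x[0] > y[0] or (x[0] == y[0] and x[1] >= y[1])


# Assumed sequences: x^n = (1/n, 0) converges to (0, 0); y^n = (0, 1) is constant.
along_sequence = []
for n in (1, 10, 1000, 10**6):
    x_n = (1.0 / n, 0.0)
    y_n = (0.0, 1.0)
    along_sequence.append(weakly_prefers(x_n, y_n))

# Limiting bundles.
x_limit = (0.0, 0.0)
y_limit = (0.0, 1.0)
at_limit = weakly_prefers(x_limit, y_limit)

print(along_sequence, at_limit)
```

Every x^n is preferred to y^n because it contains strictly more of good one, yet the ranking flips at the limit, which is exactly what continuity rules out.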

After describing continuity, we are ready to establish under which conditions any preference relation can be represented using a utility function.

Figure #1.16



Note: a utility function can satisfy continuity but still be non-differentiable. For instance, the Leontief utility function, min{ax1,bx2}, is continuous but cannot be differentiated at the kink.



Chapter 2 – Demand functions

The utility maximization problem

We are now ready to combine the tastes of the individual, embodied in his utility function, and the budget line representing the set of bundles he can afford, in order to examine the set of optimal choices for the individual. In particular, the consumer maximizes his utility level by selecting a bundle x (the choice variable) subject to the constraint that the cost of such a bundle cannot exceed his wealth.

$\max_{x \ge 0} \; u(x)$ (UMP)

s.t. $p \cdot x \le w$

One important point is to know whether the above maximization problem has a solution. The Weierstrass theorem provides us with an answer: since the objective function we are maximizing (the utility function) is continuous and the budget constraint defines a closed and bounded set (given that p>>0 and w>0), the problem does have a solution. Regarding the number of solutions to the above maximization problem, note that if preferences are strictly convex, then the solution is unique.

For simplicity, we denote the solution to the UMP as the argmax of the UMP, where argmax means the argument, x, that solves the maximization problem. We denote this solution as x(p,w), the Walrasian demand.

We can conclude three main properties from the solution of the above maximization problem.

First, note that homogeneity of degree zero should come as no surprise. Specifically, an increase in both the price vector and the wealth level in the same proportion doesn't change the consumer’s budget set. Since the budget set is unchanged, the optimal bundle selected by the individual shouldn’t change either. Second, note that Walras' law (WL) follows from local nonsatiation (LNS). Indeed, if the consumer were selecting a bundle x that lies strictly inside the budget set (so that he is not exhausting all of his wealth), we could find another bundle y at epsilon distance from bundle x that is still affordable and strictly preferred by the individual to bundle x. In this case, the initial bundle x cannot be utility maximizing because there are other bundles that are still affordable and which are strictly preferred by the consumer. If bundle x, in contrast, lies on the budget line, we could identify bundles that are strictly preferred to x, but these bundles would be unaffordable to the consumer.



Figure #2.1

Finally, note that if preferences are convex (but not strictly convex), the set of bundles that maximize the individual's utility defines a convex set, as the figure below illustrates. If, in contrast, the consumer’s preferences are strictly convex, he selects a unique bundle as his Walrasian demand.

Figure #2.2

After describing the UMP, we can now examine the first order conditions of this maximization problem.
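As a sketch of what these conditions look like, assuming a differentiable utility function and writing $\lambda$ for the multiplier on the budget constraint, the standard Kuhn-Tucker (KT) conditions of the UMP can be written as follows. This is the textbook formulation rather than a reproduction of the original derivation.

```latex
% Lagrangian of the UMP
\mathcal{L}(x,\lambda) = u(x) + \lambda\,(w - p \cdot x)

% KT conditions for every good k = 1, ..., L
\frac{\partial u(x^*)}{\partial x_k} - \lambda p_k \le 0,
\quad \text{with equality if } x_k^* > 0

% Complementary slackness on the budget constraint
\lambda\,(w - p \cdot x^*) = 0, \qquad \lambda \ge 0

% At an interior solution, dividing the conditions for goods l and k
\frac{\partial u(x^*)/\partial x_l}{\partial u(x^*)/\partial x_k}
  = \frac{p_l}{p_k}
```

The last line is the familiar tangency condition: the MRS between any two goods consumed in positive amounts equals their price ratio.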



A natural question at this point is whether the above necessary conditions are also sufficient. In other words, under which conditions can we guarantee that the Walrasian demand that we have found is the maximum of the UMP and not the minimum? In particular, this is the case when the utility function is quasiconcave and monotone, and the vector of first order derivatives is different from zero for all x. Let us briefly analyze these conditions. First, the condition stating that the utility function should be monotone only implies that if we increase both goods simultaneously we reach a higher utility level, which is expected in most applications. Second, the condition that the first order derivatives are different from zero simply guarantees that there are no bliss points. Intuitively, if the vector of first-order derivatives were zero, we would have reached the “peak” of utility. At this point, however, the individual would not be able to find any other preferred bundle, thus violating LNS. Finally, the condition that the utility function satisfies quasiconcavity is also easy to justify. The following figure represents an indifference map of an individual whose preferences do not satisfy quasiconcavity.

Figure #2.3



Indeed, note that the UCS is not convex. This implies that the tangency condition between an indifference curve and the budget line is not a sufficient condition for a utility-maximizing bundle. Specifically, note that a point of tangency such as bundle C gives a lower utility level than a point of non-tangency, such as bundle B. Therefore, if preferences do not satisfy quasiconcavity, the KT conditions (graphically represented by the tangency condition) are not sufficient for a maximum.1 Because the three requirements for the necessary conditions to become sufficient are relatively mild, we can then expect KT conditions to be sufficient in most economic applications.

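Note: why does the MRS represent the slope of the indifference curve? Answer: note that in order to find the slope of the indifference curve we must modify both x1 and x2 without altering the utility level of the individual. We do that by totally differentiating the individual’s utility function and setting the change in utility equal to zero, $du = \frac{\partial u}{\partial x_1}dx_1 + \frac{\partial u}{\partial x_2}dx_2 = 0$, which yields $\frac{dx_2}{dx_1} = -\frac{\partial u/\partial x_1}{\partial u/\partial x_2} = -MRS_{1,2}$.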

Importantly, note that so far we have been analyzing interior solutions. If, however, the individual prefers to consume zero amounts of some of the goods, the above tangency condition will not be satisfied. In particular, at the corner solution we find that, after taking the first order conditions,

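$\frac{\partial u(x^*)/\partial x_l}{p_l} \ge \frac{\partial u(x^*)/\partial x_k}{p_k}$, or alternatively, $MRS_{l,k} \ge \frac{p_l}{p_k}$,

because the consumer would like to consume even more of good l.

In the FOCs, this implies that, for those goods whose consumption is zero, $x_k^* = 0$,

$\frac{\partial u(x^*)}{\partial x_k} \le \lambda p_k$,

and, for the good for which consumption is positive, $x_l^* > 0$,

$\frac{\partial u(x^*)}{\partial x_l} = \lambda p_l$.

Combining both conditions,

$\underbrace{\frac{\partial u(x^*)/\partial x_l}{p_l}}_{MU \text{ per dollar spent on good } l} \ge \underbrace{\frac{\partial u(x^*)/\partial x_k}{p_k}}_{MU \text{ per dollar spent on good } k}$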

1 Note that the true maximum in this case is bundle A.



Figure #2.4

A note on the Lagrange multiplier. The Lagrange multiplier is usually referred to as the marginal value of relaxing the constraint in the UMP (or alternatively as the shadow price of wealth). Let us analyze why this is the case. First, note that if we relax the budget constraint in the UMP, the consumer is capable of reaching a higher indifference curve and, as a consequence, of obtaining a higher utility level. The following figure illustrates this point.

Figure #2.5

Hence, we want to measure the increase in utility resulting from a marginal increase in wealth. In order to do so, we differentiate the individual’s utility level, evaluated at the bundle that maximizes his utility (the Walrasian demand), with respect to wealth.
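A sketch of this derivation, assuming an interior solution with differentiable Walrasian demand and writing $\lambda$ for the multiplier on the budget constraint, runs as follows. It is the standard argument, not necessarily the exact presentation used in class.

```latex
% Utility evaluated at the Walrasian demand, differentiated with respect to w
\frac{\partial u(x(p,w))}{\partial w}
  = \sum_{k=1}^{L} \frac{\partial u(x(p,w))}{\partial x_k}
    \frac{\partial x_k(p,w)}{\partial w}

% Use the interior first order conditions, \partial u / \partial x_k = \lambda p_k
  = \lambda \sum_{k=1}^{L} p_k \frac{\partial x_k(p,w)}{\partial w}

% Differentiating Walras' law, \sum_k p_k x_k(p,w) = w, with respect to w
% gives \sum_k p_k \, \partial x_k(p,w) / \partial w = 1, so that
  = \lambda
```

Hence the multiplier measures the marginal utility of an additional unit of wealth.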



As an example, note that if lambda=5, then a marginal increase in wealth induces an increase of five units of utility.

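Example: Let us consider an example of the utility maximization problem. Take the Cobb-Douglas utility function $U(X,Y) = X^{\alpha}Y^{\beta}$, which is maximized subject to the budget constraint $I = P_X X + P_Y Y$, where for convenience we assume α+β=1. We can now solve for the utility-maximizing values of X and Y for any prices $(P_X, P_Y)$ and income (I). Setting up the Lagrangian expression

$\mathcal{L} = X^{\alpha}Y^{\beta} + \mu\,(I - P_X X - P_Y Y)$

yields the first order conditions:

$\frac{\partial \mathcal{L}}{\partial X} = \alpha X^{\alpha-1}Y^{\beta} - \mu P_X = 0$

$\frac{\partial \mathcal{L}}{\partial Y} = \beta X^{\alpha}Y^{\beta-1} - \mu P_Y = 0$

$\frac{\partial \mathcal{L}}{\partial \mu} = I - P_X X - P_Y Y = 0$

Taking the ratio of the first two conditions shows that

$\frac{\alpha Y}{\beta X} = \frac{P_X}{P_Y}$, or $P_Y Y = \frac{\beta}{\alpha}P_X X = \frac{1-\alpha}{\alpha}P_X X$,

where the final equation follows because α+β=1. Substituting this expression into the budget constraint gives $I = P_X X + P_Y Y = P_X X\left(1 + \frac{1-\alpha}{\alpha}\right) = \frac{1}{\alpha}P_X X$. Solving for X yields $X^* = \alpha I/P_X$, and a similar set of manipulations would give $Y^* = \beta I/P_Y$.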
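As a rough numerical check of this example, the sketch below compares the closed-form Cobb-Douglas demands with a brute-force search along the budget line. The parameter values (α = 0.3, prices 2 and 5, income 100) are hypothetical, and the grid search is only an illustrative device, not a recommended solution method.

```python
def utility(x, y, alpha):
    """Cobb-Douglas utility with exponents alpha and 1 - alpha."""
    return x ** alpha * y ** (1 - alpha)


def closed_form_demand(alpha, p_x, p_y, income):
    """Demands X* = alpha*I/P_X and Y* = (1 - alpha)*I/P_Y."""
    return alpha * income / p_x, (1 - alpha) * income / p_y


def grid_search_demand(alpha, p_x, p_y, income, steps=10_000):
    """Search over bundles that exhaust income for the highest utility."""
    best_bundle, best_utility = (0.0, 0.0), -1.0
    for i in range(steps + 1):
        x = (income / p_x) * i / steps
        y = max((income - p_x * x) / p_y, 0.0)  # guard against rounding below zero
        u = utility(x, y, alpha)
        if u > best_utility:
            best_bundle, best_utility = (x, y), u
    return best_bundle


# Hypothetical parameter values.
alpha, p_x, p_y, income = 0.3, 2.0, 5.0, 100.0

x_star, y_star = closed_form_demand(alpha, p_x, p_y, income)
x_grid, y_grid = grid_search_demand(alpha, p_x, p_y, income)

print((x_star, y_star), (x_grid, y_grid))
```

The consumer spends the share α of income on X and the share β = 1 - α on Y, regardless of prices, which is the well-known constant-expenditure-share property of Cobb-Douglas preferences.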



Walrasian demand

We found the Walrasian demand function, x(p,w), as the solution to the UMP. This demand function satisfies several properties:



Walras' Law: for every $p \gg 0$, $w > 0$ we have

$p \cdot x = w$ for every $x \in x(p,w)$

Generally, homogeneity of degree R of a function f(x,y):

$f(ax, ay) = a^R f(x,y)$

Example from production:

$f(2L, 2K) = 2^R f(L,K)$

Recall that homogeneity of degree zero can easily be understood by the fact that an increase in prices and wealth in the same proportion does not modify the consumer’s budget set.2 Regarding Walras' law, note that it only relies on LNS.
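Both properties can be verified numerically for a specific demand function. The sketch below assumes a Cobb-Douglas demand with a hypothetical expenditure share of 0.3 on good one, and checks that scaling prices and wealth by the same factor leaves demand unchanged and that expenditure equals wealth.

```python
import math


def demand(prices, wealth, alpha=0.3):
    """Cobb-Douglas Walrasian demand (assumed functional form)."""
    p1, p2 = prices
    return (alpha * wealth / p1, (1 - alpha) * wealth / p2)


def expenditure(prices, bundle):
    """Total cost p.x of the bundle."""
    return sum(p * x for p, x in zip(prices, bundle))


# Hypothetical prices, wealth and scaling factor.
prices, wealth, factor = (2.0, 5.0), 100.0, 3.0
scaled_prices = tuple(factor * p for p in prices)

x = demand(prices, wealth)
x_scaled = demand(scaled_prices, factor * wealth)

# Homogeneity of degree zero: demand is unchanged after scaling.
homogeneous = all(math.isclose(a, b) for a, b in zip(x, x_scaled))

# Walras' law: the consumer spends all of his wealth.
walras = math.isclose(expenditure(prices, x), wealth)

print(x, x_scaled, homogeneous, walras)
```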

Let us now analyze how the Walrasian demand is affected by changes in the individual’s wealth level or in the prices of some of the goods. When the demand for a good increases in wealth we say that the good is normal, while when it decreases in wealth we refer to the good as inferior. Examples of the former can be computers, whereas examples of the latter are Two-Buck Chuck or shopping at Wal-Mart during the economic crisis.3 Graphically, an increase in the wealth level produces an outward shift in the budget line, as the following figure illustrates.

Figure #2.6

2 Remember that we say that a function is homogeneous of degree R if multiplying all the arguments of the function by a factor alpha multiplies the value of the function by alpha to the power of R. Hence, when a function is homogeneous of degree zero, a proportional increase in all its arguments does not modify the initial value of the function.

3 Indeed, several reports suggest that a decrease in average wealth during the 2009 economic crisis produced an increase in the sales of certain discount supermarkets such as Wal-Mart.



At a given price level, the consumer chooses an optimal consumption bundle, as described in the figure. We can then connect all these optimal consumption bundles for different levels of wealth, forming what we refer to as the wealth expansion path (plotting the demand for a single good against wealth yields the Engel curve). When the wealth expansion path indicates an increase (decrease) in the consumption of good j as a consequence of further increments in the wealth level, we say that good j is normal (inferior, respectively). The above figure illustrates an example in which good one is initially normal but then becomes inferior, while good two is normal for all levels of wealth.

We now move to the analysis of how demand reacts to price changes. When the demand for good k decreases as a result of an increase in the price of good k, we simply regard that good as a usual good, since its quantity demanded reacts negatively to its own price. If, in contrast, the quantity demanded of good k increases as a result of an increase in the price of good k, we regard that good as Giffen.4 We can illustrate these negative and positive relationships in the following two figures, with demand for good k on the horizontal axis and its own price on the vertical axis.

Figure #2.7

Other than analyzing the effect of its own price, we are interested in examining the effect of a change in the price of good l on the quantity demanded of good k (more compactly referred to as “cross-price effects”). We can find either that this relationship is positive, for two goods regarded by the consumer as substitutable (such as two brands of mineral water), or negative, for two goods regarded as complementary in consumption (such as left and right shoes, cars and gasoline, etc.). We can use a similar graphical representation to the one employed above in order to represent these cross-price effects.

4 One of the few examples of Giffen goods is that of potatoes in Ireland during the 19th century. However, there is still a strong controversy among economists on whether demand for potatoes actually moved in the same direction as its own price.



Figure #2.8

In the figure on the left side we can observe that an increase in the price of one brand of mineral water increases the demand for the other brand of mineral water, which the consumer regards as a close substitute. In the figure on the right, we observe how an increase in the price of gasoline reduces the demand for cars, shifting it inwards.
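These wealth, own-price and cross-price effects can be read off any demand function by perturbing one argument at a time. The sketch below assumes, purely for illustration, the demand generated by perfect-complements preferences u = min{x1, x2}, for which x1 = w/(p1+p2); the baseline prices and wealth are hypothetical.

```python
def demand_good1(p1, p2, w):
    """Demand for good one under u = min{x1, x2} (assumed preferences)."""
    return w / (p1 + p2)


def classify(change, positive_label, negative_label):
    """Label a good according to the sign of a change in demand."""
    if change > 0:
        return positive_label
    if change < 0:
        return negative_label
    return "no effect"


# Hypothetical baseline: p1 = 2, p2 = 3, w = 100.
base = demand_good1(2.0, 3.0, 100.0)

wealth_effect = demand_good1(2.0, 3.0, 110.0) - base  # raise w
own_price_effect = demand_good1(2.5, 3.0, 100.0) - base  # raise p1
cross_price_effect = demand_good1(2.0, 3.5, 100.0) - base  # raise p2

labels = (
    classify(wealth_effect, "normal", "inferior"),
    classify(own_price_effect, "Giffen", "usual"),
    classify(cross_price_effect, "substitute", "complement"),
)
print(labels)
```

With perfect complements, good one is normal, reacts negatively to its own price, and its demand falls when the price of the other good rises, which is the complementary case described above.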

We have discussed the set of properties of the optimal consumption bundle (Walrasian demand) as the solution of the UMP. There are still, however, some important points about the UMP that we must stress.

First, if we insert the optimal consumption bundle into the individual’s utility function we obtain the highest utility level that the individual can achieve by solving this UMP. More formally, we refer to the utility function evaluated at the solution of the UMP as the indirect utility function, v(p,w). [More generally, we will refer to the objective function of an optimization problem evaluated at the solution of the optimization problem as the “value function.” Hence, the value function of the UMP is the indirect utility function]. Function v(p,w) satisfies several properties:

1. Homogeneity of degree zero.

2. Strictly increasing in w and nonincreasing in $p_k$ for any k.

3. Quasiconvex: the set $\{(p,w) : v(p,w) \le \bar{v}\}$ is convex for any $\bar{v}$ (see the figures in Rubinstein and MWG for examples).

4. Continuous in p and w.

First, note that homogeneity of degree zero should come as no surprise. In particular, it states that increasing market prices and wealth by the same proportion does not modify the consumer’s budget set. As a consequence, such an increase does not modify the consumer’s optimal consumption bundle, and therefore it doesn't modify the maximal utility level that the individual can reach, as measured by v(p,w). The second property states that if we increase the wealth level of the individual we are enlarging the set of feasible bundles he can afford and, as a consequence, raising the indifference curve he can reach when selecting his optimal consumption bundle. Therefore, the maximal utility level that he can reach is strictly increasing in his wealth level. In contrast, an increase in the price of any good shrinks the set of affordable bundles and, as a consequence, the individual can only reach indifference curves associated to lower utility levels. Thus, an increase in the price of any good k produces a (weak) reduction in the maximal utility level that the individual



can obtain by solving this UMP. Regarding the third property, quasiconvexity, let us provide an intuitive explanation by using the following figures.

Figure #2.9

First, note that the indirect utility function is depicted with prices on the horizontal axis and the wealth level on the vertical axis. Hence, when prices increase from P11 to P12, wealth must also increase in order to maintain the same utility level for this individual. In addition, note that lower prices and higher wealth levels are associated to higher maximal utilities. Quasiconvexity tells us that, if the max utility associated to a given pair of prices and wealth (A) is weakly higher than the max utility associated to another pair of prices and wealth (B), then the max utility associated to any linear combination of prices and wealth between A and B is weakly lower than that associated with A.

We can provide an alternative interpretation of quasiconvexity as follows. The indirect utility function satisfies quasiconvexity if the set of pairs of prices and wealth for which the max utility that the consumer can reach is weakly lower than that under pair (p*,w*) is a convex set. More compactly,

$v(p,w)$ is quasiconvex if the set of $(p,w)$ pairs for which $v(p,w) \le v(p^*,w^*)$ is convex,

i.e., $\{(p,w) : v(p,w) \le v(p^*,w^*)\}$ is convex.
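The properties of v(p,w) can be spot-checked numerically. The sketch below assumes Cobb-Douglas preferences with a hypothetical exponent of 0.3 on good one, builds v(p,w) by plugging the Walrasian demand into the utility function, and tests homogeneity, monotonicity and quasiconvexity at a few hypothetical price-wealth pairs. A finite set of checks illustrates the properties but does not prove them.

```python
import math


def indirect_utility(prices, wealth, alpha=0.3):
    """v(p, w): Cobb-Douglas utility evaluated at the Walrasian demand."""
    p1, p2 = prices
    x1 = alpha * wealth / p1
    x2 = (1 - alpha) * wealth / p2
    return x1 ** alpha * x2 ** (1 - alpha)


# Two hypothetical price-wealth pairs, A and B.
p_a, w_a = (2.0, 5.0), 100.0
p_b, w_b = (4.0, 1.0), 60.0
v_a = indirect_utility(p_a, w_a)
v_b = indirect_utility(p_b, w_b)

# 1. Homogeneity of degree zero: doubling prices and wealth leaves v unchanged.
homogeneous = math.isclose(indirect_utility((4.0, 10.0), 200.0), v_a)

# 2. Increasing in wealth, nonincreasing in prices.
increasing_in_w = indirect_utility(p_a, 110.0) > v_a
nonincreasing_in_p = indirect_utility((2.5, 5.0), w_a) <= v_a

# 3. Quasiconvexity: v at any convex combination of A and B does not
#    exceed the larger of v(A) and v(B).
quasiconvex = True
for i in range(1, 10):
    weight = i / 10
    p_mix = tuple(weight * a + (1 - weight) * b for a, b in zip(p_a, p_b))
    w_mix = weight * w_a + (1 - weight) * w_b
    if indirect_utility(p_mix, w_mix) > max(v_a, v_b) + 1e-12:
        quasiconvex = False

print(homogeneous, increasing_in_w, nonincreasing_in_p, quasiconvex)
```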



Figure #2.10

An alternative way to understand quasiconvexity uses only goods one and two on the axes, as follows.

Figure #2.11

Let us construct this figure sequentially. First, when the individual decision-maker is facing budget set Bp,w, his optimal consumption bundle is x(p,w). Second, when prices and wealth change to p’ and w’, he faces budget set Bp’,w’, and therefore selects bundle x(p’,w’). Third, note that both bundles x(p,w) and x(p’,w’) yield the same indirect utility, v(p,w)=v(p’,w’)=ubar. Fourth, we can now construct a linear combination of prices and wealth

$p'' = \alpha p + (1-\alpha) p'$

$w'' = \alpha w + (1-\alpha) w'$

which together define $B_{p'',w''}$.

This combination of prices and wealth provides us with budget set Bp’’,w’’. Finally, note that any solution to the UMP facing budget set Bp’’,w’’ must provide an optimal consumption bundle that lies on a weakly lower indifference curve (associated to a weakly lower utility level) than ubar.



WARP and demand

After presenting different properties of the UMP, its solution and its value function, we are now ready to relate the optimal consumption bundle obtained in the above UMP to the WARP. Hence, we want to understand if the consistency requirement imposed by the WARP limits the set of optimal consumption bundles that the individual decision-maker can select when solving the UMP.

WARP and Demand: Take two different consumption bundles $x(p,w)$ and $x(p',w')$, both being affordable under (p,w):

$p \cdot x(p',w') \le w$

When prices and wealth are (p,w), the consumer chooses $x(p,w)$ even though $x(p',w')$ was also affordable. Then he “reveals” a preference for $x(p,w)$ over $x(p',w')$ when both are affordable. Hence, we should expect him to choose $x(p,w)$ over $x(p',w')$ whenever both are affordable (consistency). Therefore, bundle $x(p,w)$ must not be affordable at $(p',w')$, because the consumer chooses $x(p',w')$. That is, $p' \cdot x(p,w) > w'$. We can conclude that Walrasian demand satisfies WARP if, for two different consumption bundles, $x(p,w) \ne x(p',w')$:

$p \cdot x(p',w') \le w \implies p' \cdot x(p,w) > w'$

In words, if bundle x(p’,w’) is affordable under budget set Bp,w, then bundle x(p,w) cannot be affordable under budget set Bp’,w’.

Let us first present an example of optimal consumption bundles that satisfy WARP. In the following figure, note that bundles x(p,w) and x(p’,w’) are both affordable under initial prices and wealth, since they both lie on or below budget line Bp,w. However, bundle x(p,w) is not affordable under final prices and wealth, since it lies above the budget line Bp’,w’. Therefore, WARP is satisfied.

Figure #2.12



Let us now examine an example in which optimal consumption bundles do not satisfy the premise of WARP. In the following figure, demand under final prices and wealth, represented by bundle x(p’,w’), is not affordable under initial prices and wealth, since it lies above budget line Bp,w.5

Figure #2.13

Note the general procedure we have been using to test whether two particular bundles satisfy WARP.

First, we check if bundles x(p,w) and x(p’,w’) are both affordable under the initial prices and wealth. Graphically, this implies that both bundles lie on or below budget line Bp,w. If this first step of the procedure is satisfied, then we can move to step two. Otherwise, the premise of the WARP is not satisfied, which doesn't allow us to continue checking whether it is violated or not. In these cases, we say that the WARP is “not violated.”

Second, we check if bundle x(p,w) is affordable under final prices and wealth. Graphically, bundle x(p,w) must lie on or below budget line Bp’,w’. If this condition is satisfied, then this Walrasian demand violates WARP. If, in contrast, this second step is not satisfied, then this Walrasian demand satisfies WARP.6
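The two-step procedure above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `cost` and `check_warp`, the tolerance `tol`, and the numerical observations are assumptions made for this example and are not taken from the figures in these notes.

```python
def cost(prices, bundle):
    """Total expenditure p.x of a bundle at the given prices."""
    return sum(p * x for p, x in zip(prices, bundle))


def check_warp(p, w, x, p_new, w_new, x_new, tol=1e-9):
    """Apply the two-step test to one pair of observations.

    x is the bundle chosen at (p, w) and x_new the bundle chosen at
    (p_new, w_new). Returns "premise fails", "violated" or "satisfied".
    """
    # Step 1 (premise): the two bundles differ and both are affordable
    # at the initial prices and wealth.
    both_affordable = cost(p, x) <= w + tol and cost(p, x_new) <= w + tol
    if tuple(x) == tuple(x_new) or not both_affordable:
        return "premise fails"  # WARP is "not violated"
    # Step 2: the initial bundle must NOT be affordable at the final
    # prices and wealth; if it is, the choices are inconsistent.
    if cost(p_new, x) <= w_new + tol:
        return "violated"
    return "satisfied"


# Hypothetical observations (illustrative numbers only).
print(check_warp((1, 1), 10, (5, 5), (2, 1), 10, (2, 6)))   # satisfied
print(check_warp((1, 1), 10, (2, 8), (2, 1), 12, (5, 2)))   # violated
print(check_warp((1, 1), 10, (2, 8), (2, 1), 12, (1, 10)))  # premise fails
```

Note that the sketch, like the procedure in the text, only checks one direction. A complete check of a pair of observations would also call the function with the roles of the initial and final situations swapped.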

Let us next evaluate another example in which the premise of WARP does not hold, so that WARP is not violated. The figure below represents another case in which demand under final prices and wealth, represented by bundle x(p’,w’), is not affordable under initial prices and wealth, since it lies above budget line Bp,w.

Figure #2.14

5 Importantly, note that here we cannot check the conclusion of WARP, since the premise of WARP is not satisfied.

6 For more examples and practice about Walrasian demand functions that satisfy or violate WARP, see homework assignment #2.

The following figure represents a similar case.

Figure #2.15

In the following figure, the optimal consumption bundle under final prices and wealth, x(p’,w’), is affordable under initial prices and wealth, since it lies below the budget line Bp,w. However, the optimal consumption bundle x(p,w) under the initial prices and wealth is not affordable under the new prices and wealth, given that it lies above budget line Bp’,w’. Hence, WARP is satisfied.

Figure #2.16

In our last example below, the optimal consumption bundle under the final prices and wealth, x(p’,w’), is again affordable under initial prices and wealth, since it lies below budget line Bp,w. In this case, however, the demand x(p,w) is also affordable under the new prices and wealth, since it lies below budget line Bp’,w’. Therefore, WARP is not satisfied.7

Figure #2.17

7 On the course website you can find more applications of WARP to taxes and subsidies, since these types of policies modify the set of affordable bundles for the individual in a similar fashion as in the figures above.

Implications of WARP

Interestingly, WARP has important implications for the set of optimal consumption bundles that a given consumer chooses before and after a price change. Let us analyze these implications by considering a reduction in the price of good one, which the following figure illustrates as an outward pivot of the budget line.

Figure #2.18

After the price change, however, we want to adjust the consumer’s wealth so that he can consume his initial demand x(p,w) at the new prices. In other words, we shift the final budget line inwards (reducing this consumer’s wealth) until it reaches the initial consumption bundle x(p,w). Importantly, the budget line after the shift (after the reduction in wealth) is parallel to budget line Bp’,w, reflecting the final price ratio. But what, in particular, is the reduction in wealth that we must apply to this consumer in order for him to just afford bundle x(p,w)?
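As a reference for this question, the standard form of the Slutsky wealth compensation can be sketched as follows. The expression is consistent with the differential version dw = x(p,w)·dp used later in this chapter; the notation Δp = p’ − p and Δw = w’ − w is an assumption introduced here for the sketch.

```latex
% Compensated wealth: just enough to buy the initial bundle at the new prices
w' = p' \cdot x(p,w)
% Using Walras' law, p \cdot x(p,w) = w, the change in wealth is
\Delta w = w' - w = (p' - p) \cdot x(p,w) = \Delta p \cdot x(p,w)
```

When only the price of good one falls, Δp has a single negative component, so Δw < 0: the compensation is a reduction in wealth, as in the figure.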

Hence, the Slutsky wealth compensation reflects that the consumer’s wealth has been reduced so that he can afford his initial consumption bundle before the price change.8 Given this definition of the Slutsky wealth compensation, we are now ready to establish a relationship between the law of demand and the WARP.

This is indeed an important result. It establishes that if, after the price change, the consumer’s wealth is compensated “à la Slutsky” as described above, then WARP becomes equivalent to the law of demand, i.e., quantity demanded and price move in opposite directions.

Let us next see one example in which the WARP restricts behavior when we apply Slutsky wealth compensation.

Figure #2.19

8 In contrast, the so-called Hicksian wealth compensation is such that the wealth level of the individual after the price change is adjusted so that he can still reach the same indifference curve he was reaching before the price change. We will comment on this type of wealth compensation later on in this chapter.

The figure depicts a price change similar to that represented above, where the price of good one decreases while the price of good two is unaffected. After pivoting the budget line outwards, we apply a Slutsky wealth compensation so that the consumer can afford his initial bundle x(p,w). The consumer’s budget line after the wealth compensation is hence Bp’,w’.

A natural question at this point is where the optimal consumption bundle under Bp’,w’, x(p’,w’), can lie.

Let us first examine whether this bundle can lie to the left-hand side of bundle x(p,w) (on segment A). First, note that the premise of WARP is satisfied because both bundles x(p,w) and x(p’,w’) would be affordable under budget set Bp,w, since they both lie on or below budget line Bp,w. However, bundle x(p,w) is also affordable under final prices and wealth, given that it lies on budget line Bp’,w’, implying a violation of WARP. Therefore, bundle x(p’,w’) cannot lie on segment A.

Let us next examine whether this bundle can lie to the right-hand side of bundle x(p,w) (on segment B). First, note that bundle x(p,w) is affordable under initial prices and wealth, since it lies on budget line Bp,w, but bundle x(p’,w’) would not be affordable, given that it would lie above budget line Bp,w. Hence, the premise of WARP does not hold and, as a consequence, WARP would not be violated if bundle x(p’,w’) lies on segment B.

Thus, bundle x(p’,w’) must contain more of good one than bundle x(p,w). We can therefore conclude that a decrease in the price of good one (when we appropriately compensate wealth effects) leads to an increase in the quantity demanded of that good. This is what we refer to as the compensated law of demand.9
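The compensated law of demand can be illustrated numerically. The sketch below assumes a Cobb-Douglas consumer with equal expenditure shares (an assumption made only for this example, not part of the notes), lowers the price of good one, applies the Slutsky wealth compensation, and checks that price and quantity move in opposite directions.

```python
def cobb_douglas_demand(p, w, shares=(0.5, 0.5)):
    """Walrasian demand of a hypothetical Cobb-Douglas consumer."""
    return [a * w / p_k for a, p_k in zip(shares, p)]


def dot(u, v):
    """Inner product of two vectors given as sequences."""
    return sum(a * b for a, b in zip(u, v))


p, w = [2.0, 1.0], 100.0
x = cobb_douglas_demand(p, w)              # [25.0, 50.0]

p_new = [1.0, 1.0]                         # the price of good one falls
w_new = dot(p_new, x)                      # Slutsky compensation: 75.0
x_new = cobb_douglas_demand(p_new, w_new)  # [37.5, 37.5]

dp = [a - b for a, b in zip(p_new, p)]
dx = [a - b for a, b in zip(x_new, x)]
print(dot(dp, dx))          # -12.5, so (p' - p).(x' - x) <= 0
print(dot(p, x_new) > w)    # True: x(p',w') lies on segment B
```

In this example the new bundle contains more of good one (37.5 instead of 25) and is not affordable at the initial prices and wealth, which is the segment B case discussed above.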

Note an important distinction between the uncompensated law of demand and the compensated law of demand we just described. Specifically, the demand for good one can fall as a consequence of a decrease in its price, but only when wealth is not compensated, as illustrated in the following figure.

Figure #2.20

This figure depicts a reduction in the price of good one similar to the one that we analyzed before. The individual demand after the price change is given by x(p’,w), where the quantity demanded of good one goes down despite the fact that the good became cheaper. In this case, therefore, the uncompensated law of demand is not satisfied, since quantity demanded and price move in the same direction. This is the reason why we say that WARP is not a sufficient condition to yield the uncompensated law of demand, i.e., the law of demand for price changes that are not compensated. Hence, WARP and the compensated law of demand are equivalent, but WARP and the uncompensated law of demand aren't necessarily related.

We can examine the last point by checking whether WARP is satisfied in this example. In particular, bundle x(p,w) was affordable under the initial budget set Bp,w, but the consumption bundle after the (uncompensated) price change, x(p’,w), was not affordable, since it lies above budget line Bp,w. Therefore, the premise of WARP is not satisfied and hence WARP is not violated. This example shows a case in which WARP is not violated but the uncompensated law of demand is. As a consequence, the example illustrates that WARP and the uncompensated law of demand are not necessarily related.

9 Interesting practice: can you repeat this analysis for the case of an increase in the price of good one? First, you will need to pivot the budget line inwards. Second, note that the wealth compensation must imply in this case an increase in the consumer’s wealth. Finally, you will have a budget set after the wealth compensation with two segments A and B. Determine which one is restricted or allowed according to WARP.

The Walrasian demand function is differentiable in both prices and wealth under relatively general conditions. Let us next examine the relationship between the compensated law of demand and the WARP. In order to do so, let us first totally differentiate the Walrasian demand function, as follows:

$$ dx = D_p x(p,w)\,dp + D_w x(p,w)\,dw $$

And since the consumer's wealth is compensated, we can use the expression obtained from the Slutsky wealth compensation,

$$ dw = x(p,w) \cdot dp $$

(this is the differential analog of $\Delta w = \Delta p \cdot x(p,w)$). Substituting, we obtain

$$ dx = D_p x(p,w)\,dp + D_w x(p,w)\,[x(p,w) \cdot dp] $$

or equivalently,

$$ dx = \left[ D_p x(p,w) + D_w x(p,w)\,x(p,w)^T \right] dp $$

Hence, the compensated law of demand, $dp \cdot dx \le 0$, can also be expressed as

$$ dp \cdot \left[ D_p x(p,w) + D_w x(p,w)\,x(p,w)^T \right] dp \le 0 $$

where the term in brackets is the so-called Slutsky (or substitution) matrix,

$$ S(p,w) = \begin{bmatrix} s_{11}(p,w) & \cdots & s_{1L}(p,w) \\ \vdots & \ddots & \vdots \\ s_{L1}(p,w) & \cdots & s_{LL}(p,w) \end{bmatrix}, \qquad s_{lk}(p,w) = \frac{\partial x_l(p,w)}{\partial p_k} + \frac{\partial x_l(p,w)}{\partial w}\,x_k(p,w) $$

The next proposition describes the conditions under which the substitution matrix is negative semidefinite.

Proposition: If x(p,w) is differentiable and satisfies Walras' law (WL), homogeneity of degree zero (Homog(0)) and WARP, then S(p,w) is negative semidefinite, that is,

$$ v \cdot S(p,w)\,v \le 0 \quad \text{for any } v \in \mathbb{R}^L $$

The fact that the substitution matrix is negative semidefinite implies that all terms in the main diagonal of the matrix must be weakly negative. In particular, the terms in the main diagonal are sll(p,w), representing the “own-price effect”, i.e., how the (compensated) quantity demanded of good l is affected by the price of good l.10
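To make the substitution matrix concrete, the sketch below approximates it by central finite differences for a hypothetical Cobb-Douglas demand with equal shares and evaluates the quadratic form v·S(p,w)v for a few vectors v. The demand function, the step size `h`, and the function names are assumptions chosen for this illustration only.

```python
def demand(p, w, shares=(0.5, 0.5)):
    """Walrasian demand of a hypothetical Cobb-Douglas consumer."""
    return [a * w / p_k for a, p_k in zip(shares, p)]


def slutsky_matrix(demand_fn, p, w, h=1e-5):
    """Approximate s_lk = dx_l/dp_k + (dx_l/dw) * x_k by central differences."""
    n = len(p)
    x = demand_fn(p, w)
    x_up, x_dn = demand_fn(p, w + h), demand_fn(p, w - h)
    dx_dw = [(x_up[l] - x_dn[l]) / (2 * h) for l in range(n)]
    S = [[0.0] * n for _ in range(n)]
    for k in range(n):
        p_up, p_dn = list(p), list(p)
        p_up[k] += h
        p_dn[k] -= h
        x_pu, x_pd = demand_fn(p_up, w), demand_fn(p_dn, w)
        for l in range(n):
            dx_dp = (x_pu[l] - x_pd[l]) / (2 * h)
            S[l][k] = dx_dp + dx_dw[l] * x[k]
    return S


def quadratic_form(S, v):
    """Compute v.S v for a square matrix S stored as nested lists."""
    n = len(v)
    return sum(v[l] * S[l][k] * v[k] for l in range(n) for k in range(n))


S = slutsky_matrix(demand, [2.0, 1.0], 100.0)
for row in S:
    print([round(s, 4) for s in row])
for v in ([1, 0], [0, 1], [1, 1], [1, -1]):
    print(v, round(quadratic_form(S, v), 4))
```

For this demand function the diagonal terms are approximately −6.25 and −25, both weakly negative, and the quadratic form is nonpositive for every v tried, as the proposition requires. A finite sample of vectors is of course only a check, not a proof of negative semidefiniteness.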

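As described above, the substitution effect sll(p,w) embodies two effects:

$$ \underbrace{s_{ll}(p,w)}_{\text{substitution effect } (-)} = \underbrace{\frac{\partial x_l(p,w)}{\partial p_l}}_{\text{total effect}} + \underbrace{\frac{\partial x_l(p,w)}{\partial w}\,x_l(p,w)}_{\text{income effect}} $$

where the total effect is (−) for usual goods and (+) for Giffen goods, and the income effect is (+) for normal goods and (−) for inferior goods.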

We can rearrange this expression in order to state that the total effect is equal to the substitution effect minus the income effect. [Recall that when the quantity demanded of good l decreases (increases) in the price of good l we refer to that good as usual (Giffen, respectively). Similarly, when the quantity demanded of good l increases (decreases) in wealth, we refer to that good as normal (inferior, respectively).] Let us describe each of the terms in the above expression. First, from our previous discussion, we note that the substitution effect is negative for all types of goods, since sll(p,w)<=0. Second, the total effect measures the change in the quantity demanded of good l as a result of a change in the price of good l. Hence, it captures an uncompensated price effect, given that it reflects a change in the price of good l without adjusting the wealth of the individual. Finally, the third term measures the wealth (income) effect, since it measures the change in demand due to the adjustment in wealth as a result of a Slutsky wealth compensation.11
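The rearrangement described in this paragraph can be written out explicitly. The following is a sketch that uses the same symbols as the definition of sll(p,w):

```latex
\underbrace{\frac{\partial x_l(p,w)}{\partial p_l}}_{\text{total effect}}
  = \underbrace{s_{ll}(p,w)}_{\text{substitution effect } (\le 0)}
  - \underbrace{\frac{\partial x_l(p,w)}{\partial w}\, x_l(p,w)}_{\text{income effect}}
```

For a normal good the income effect is positive, so both terms on the right-hand side push the total effect downwards and the good is usual. For an inferior good the income effect is negative, so it works against the substitution effect; the good is Giffen only when this second term is large enough to outweigh sll(p,w).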

Let us next provide a more graphical intuition of the previous discussion. The following figure represents a reduction in the price of good one, where good one is considered “normal”. First, the reduction in the price of good one enlarges the set of feasible bundles for this consumer, which allows him to reach a higher indifference curve, at which he increases the amount of good one consumed from x10 to x12. We also include a graphical representation of the Walrasian demand associated with the above figure. The demand indicates that a decrease in the price of good one leads to an increase in the quantity demanded, i.e., it induces a negatively sloped Walrasian demand curve (so the good is “usual”).

10 Importantly, note that the fact that the substitution matrix is negative semidefinite (NSD) doesn't imply that this matrix is symmetric. As a remark, in section 3G of MWG we will see that if the utility function is continuous and represents a strictly convex preference relation that satisfies LNS, then the substitution matrix is symmetric. In many applications, these conditions are assumed and therefore the substitution matrix will be symmetric. However, note that in the case of preferences for perfect substitutes the indifference curves will not satisfy strict convexity and we will not be able to use the results in section 3G in order to guarantee that the substitution matrix is symmetric.

11 We describe the concept of the Slutsky wealth compensation below. Recall from the lectures the difference between a Slutsky and a Hicksian wealth compensation. The former adjusts the consumer’s wealth so that he can still afford the initial bundle before the price change, whereas the latter adjusts wealth so that the consumer can still reach the same utility level as before the price change (graphically reaching the same indifference curve).

Figure #2.21

The increase in the quantity demanded of good one as a result of a decrease in its price represents what we referred to above as the total effect. Nonetheless, it is interesting to disentangle the total effect into the substitution and income effects. The following figures replicate the reduction in the price of good one and the total effect. The figures also include the Slutsky wealth compensation. In particular, after the price change we reduce this consumer’s wealth so that he can afford the same consumption bundle that he was buying before the price change. Graphically, we do so by shifting the budget line after the price change inwards until it “touches” the initial bundle. The consumer can afford any consumption bundle along the new budget line. In the top figure, the consumer’s optimal point involves buying only X13 units of good one. In the bottom figure we replicate the Walrasian demand we found above, but we also include the so-called “constant purchasing power” (CPP) demand curve, resulting from applying the Slutsky wealth compensation. Specifically, note that a given reduction in the price of good one produces a relatively small increase in the quantity demanded of good one when we hold the consumer’s purchasing power constant (applying the Slutsky wealth compensation), but implies a relatively large increase in the quantity demanded of good one when we do not hold the consumer’s purchasing power constant (in the case of the Walrasian demand).

Figure #2.22

In the following figure we replicate our previous figure, adding the so-called Hicksian demand (also referred to as the “constant utility” demand curve). The graph shows the Hicksian wealth compensation after the price change. The consumer’s wealth level is adjusted so that he can reach his initial utility level (graphically reaching the same indifference curve). In this graphical example, this implies a more significant wealth reduction than when we apply the Slutsky wealth compensation. Therefore, the new budget line after the wealth compensation is tangent to the initial indifference curve at a point where the consumer demands X11 units of good one. In the bottom figure, the Hicksian demand curve reflects that, for a given decrease in the price of good one, the consumer increases his consumption of good one by less than along the CPP and Walrasian demand curves.

Figure #2.23

Let us now summarize our analysis about how different types of demand for good one are affected by a decrease in the price of that good. First, a decrease in the price of good one increases the quantity demanded of that good solely due to the price effect (either measured by the Hicksian demand curve or the CPP demand curve). This increase is smaller than the increase in the quantity demanded measured by the optimal consumption bundle x(p,w), since x(p,w) measures both price and wealth effects. Second, the wealth compensation (wealth reduction in our example) that maintains the original level of utility of the individual is larger than the wealth compensation that maintains purchasing power unaltered.12
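The comparison summarized above can be reproduced with a small numerical sketch. It assumes a Cobb-Douglas consumer with exponent `A = 0.5` (a hypothetical choice for illustration, under which good one is normal), and it obtains the Hicksian bundle by evaluating the Walrasian demand at the minimum expenditure needed to reach the initial utility level.

```python
A = 0.5  # hypothetical Cobb-Douglas exponent on good one


def walrasian(p, w):
    """Walrasian demand x(p, w) for u(x) = x1**A * x2**(1 - A)."""
    return (A * w / p[0], (1 - A) * w / p[1])


def utility(x):
    return x[0] ** A * x[1] ** (1 - A)


def expenditure(p, u):
    """Minimum expenditure e(p, u) for the same utility function."""
    return u * (p[0] / A) ** A * (p[1] / (1 - A)) ** (1 - A)


p, w = (2.0, 1.0), 100.0
p_new = (1.0, 1.0)                 # the price of good one falls
x0 = walrasian(p, w)               # initial bundle
u0 = utility(x0)                   # initial utility level

x_walras = walrasian(p_new, w)                       # no compensation
w_slutsky = p_new[0] * x0[0] + p_new[1] * x0[1]      # afford the old bundle
x_cpp = walrasian(p_new, w_slutsky)                  # CPP demand
w_hicks = expenditure(p_new, u0)                     # reach the old utility
x_hicks = walrasian(p_new, w_hicks)                  # Hicksian demand

print("good one:", x0[0], round(x_hicks[0], 2), x_cpp[0], x_walras[0])
print("wealth reductions:", w - w_slutsky, round(w - w_hicks, 2))
```

In this example the quantity of good one rises from 25 to roughly 35.36 along the Hicksian demand, to 37.5 along the CPP demand, and to 50 along the Walrasian demand, while the Hicksian wealth reduction (about 29.29) exceeds the Slutsky one (25).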

Let us now briefly review the Slutsky equation, and the classification of goods as normal, inferior, or Giffen. The following three figures depict a decrease in the price of the good in the horizontal axis (food). In the first figure the substitution effect moves in the opposite direction as the price change (so a reduction in the price of food implies a positive substitution effect), and the income effect is also positive indicating that this good is normal.

Figure #2.24

In the second figure (below), the substitution effect is still moving in the opposite direction as price, but the income effect is now negative, partially offsetting the increase in the quantity demanded associated with the substitution effect. Nonetheless, the total effect is still positive. In this case, food is an inferior good since the income effect is negative.

12 Reality check: note that, as an undergrad, you have been using the Hicksian wealth compensation when examining income and substitution effects. However, the Slutsky equation (and the Slutsky matrix) is obtained from applying the Slutsky wealth compensation. An interesting practice is to redo the graphical representation of the income and substitution effects of your intermediate microeconomics textbook applying the Slutsky wealth compensation rather than the (probably used) Hicksian wealth compensation.

Figure #2.25

Finally, in the third figure the income effect is also negative and sufficiently large to completely offset the substitution effect. As a consequence, the total effect becomes negative. [Graphically, note that the first two figures generate a negatively sloped Walrasian demand curve, while the third figure produces a positively sloped Walrasian demand curve.]13

Figure #2.26

13 For a readable first approach to income and substitution effects, see NS pp. 141-158.

Comparing the second and third figures we can conclude that an inferior good doesn't necessarily have to be a Giffen good, as shown in the second figure.

In our preceding discussions of consumer theory we considered an individual decision-maker who solves his Utility Maximizing Problem (UMP) by choosing a bundle that maximizes his utility level subject to his budget constraint. Alternatively, we can understand the process by which the consumer chooses an optimal consumption bundle as an optimization problem in which the consumer minimizes his total expenditure on goods and services subject to the constraint that he wants to reach a given utility. We refer to this minimization problem as the “expenditure minimization problem” (EMP). More formally,


Expenditure minimization problem:

$$ \min_{x \ge 0} \; p \cdot x \quad \text{s.t.} \quad u(x) \ge u $$

As the following figure indicates, the EMP can be understood as a problem in which the consumer wants to reach a utility level associated with a particular indifference curve while spending as little as possible (shifting the budget line towards the origin). Intuitively, note that a budget line strictly above bundle X* cannot be a solution to the EMP since, despite reaching utility level U, it does not minimize the total expenditure of this consumer: there is another budget line for which total expenditure is lower and utility level U is still reached. Finally, a budget line strictly below X* cannot be the solution to the EMP since, despite being cheaper, it does not satisfy the constraint of reaching utility level U.14

Similarly to the UMP, we can set up the Lagrangian associated with this minimization problem, and then apply first-order (necessary) conditions with respect to each good and the Lagrange multiplier. Setting up the Lagrangian:

Focusing on interior solutions, the above first order conditions are satisfied with equality, therefore:

Solving the EMP:

14 Assuming college students need a certain amount of groceries to survive the week, a student may choose a bundle on a budget line above x*, in which case the student is spending more than necessary (expensive, name-brand items). However, when choosing a bundle on a budget line below x*, the student is not reaching the required level of utility, because he is buying all the cheapest brands and not getting his favorite cereal.

At the optimal bundle, x*, the slope of indifference curve = slope of budget line.
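The tangency condition follows from the first-order conditions of the EMP. A sketch of the standard derivation is given below; the symbol μ for the Lagrange multiplier is a notational assumption of this sketch.

```latex
% Lagrangian of the EMP
\mathcal{L}(x,\mu) = p \cdot x + \mu \left[ u - u(x) \right]
% First-order conditions at an interior solution x^*
\frac{\partial \mathcal{L}}{\partial x_k} = p_k - \mu \frac{\partial u(x^*)}{\partial x_k} = 0
  \quad \text{for every good } k,
\qquad
\frac{\partial \mathcal{L}}{\partial \mu} = u - u(x^*) = 0
% Dividing the conditions for goods l and k
\frac{\partial u(x^*) / \partial x_l}{\partial u(x^*) / \partial x_k} = \frac{p_l}{p_k}
```

The last line states that the marginal rate of substitution (the slope of the indifference curve) equals the price ratio (the slope of the budget line), while the condition on μ requires the utility constraint to hold with equality.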

The bundle that solves the EMP is denoted as the Hicksian demand, h(p,u), which is a function of the price vector and the utility level that the individual wants to reach.15

Hicksian demand satisfies different properties. In particular, when the utility function is continuous and represents a preference relation satisfying LNS, the Hicksian demand associated to the EMP satisfies:

1. Homogeneity of degree zero in prices: h(p,u)=h(αp,u) for any p, u and α>0. This property states that if bundle X is a solution to the EMP when facing a price vector p, then it must also be a solution to the problem when all prices have been scaled by the factor α. Intuitively, a common change in all prices doesn't alter the slope of the consumer’s budget line. The following figure provides a graphical representation. First, note that an increase in the price of both goods one and two in the same proportion produces a downward shift in the budget line. However, the consumer must reach utility level U in order to satisfy the constraint of the EMP. Hence, he will need to spend more in order to buy bundle X* at the new prices. As a consequence, he selects the same optimal bundle before and after the price change.16, 17

2. No excess utility: For any optimal consumption bundle x* that solves the EMP, the utility level satisfies u(x*)=u. The following figure provides a graphical intuition of this property. First consider a bundle X that solves the EMP, for which the consumer obtains a utility level of u(X)=u1>u. That is, he obtains a utility level higher than the one he must reach when solving his EMP. But then we can find another bundle X’ by scaling down X, X’=αX, with X’ very close to X (α<1 but close to 1), for which the utility level is still larger than U, u(X’)>U. [Graphically, this implies that the indifference curve passing through bundle X’ is associated with a utility level higher than U.] Therefore, bundle X’ exceeds the minimal utility level that the consumer must reach in his EMP and it is cheaper than the original bundle X, which contradicts X being a solution. In conclusion, for a given utility level U that the consumer has to reach in the EMP, bundle h(p,u) does not exceed U, since otherwise he would be able to find a cheaper bundle that exactly reaches utility level U.18

15 As a formality, note that when the EMP provides more than one solution we write that X* belongs to h(p,u), whereas in the case in which the EMP has only one solution, we write X*=h(p,u).

16 As a remark, note that we define homogeneity of degree zero for all alpha>0. This implies that an argument similar to the one developed here for a price increase in both goods one and two (for alpha>1) can be extended to a price decrease in both goods as well (when alpha<1).

17 On a weekly basis, you must have 2 sodas and 1 candy bar in order to function. Each of those items costs $1 in the vending machine. Therefore, you spend $3 each week to satisfy your needs, x*. If prices were to increase by 50%, each would now cost $1.50. With only $3 in your pocket, your budget constraint must shift downward. In order to reach your original indifference curve, you must now spend $4.50 per week in order to consume your ideal bundle, x*.

Figure #2.27

3. Convexity: If the preference relation is convex, then h(p,u) is a convex set, which may contain multiple bundles that solve the EMP. (This property is closely related to the uniqueness property below.)

4. Uniqueness: If the preference relation is strictly convex (a more restrictive condition than convexity), then h(p,u) contains a single element. The following figures describe properties 3 and 4.19

18 Example of soda and candy bar continued: If you fulfill your necessary soda/candy ratio, but still have some loose change in your pocket, there is the opportunity to reach a higher utility by purchasing additional items. As a rational consumer, your goal is to maximize your happiness each week, therefore, you must spend all the money in your pockets.

19 Example 3E1 in MWG provides an explanation about how to find the Hicksian demand for a Cobb-Douglas utility function. It is simple to check for homogeneity of degree zero in prices, no excess utility and uniqueness.

Figure #2.28

5. Compensated law of demand:

For changing prices p and p’,

$$ (p' - p) \cdot [h(p',u) - h(p,u)] \le 0 $$

Implication: for every good k, when only the price of good k changes,

$$ (p'_k - p_k)\,[h_k(p',u) - h_k(p,u)] \le 0 $$

Interestingly, note that this property always holds for the Hicksian demand, where we have already compensated the wealth level of the consumer so that he can still reach his initial utility level. However, this property is not necessarily true for the uncompensated Walrasian demand, as described in previous lectures.
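Some of the properties listed above can be checked numerically on a specific example. The sketch below assumes the utility function u(x) = x1^A · x2^(1−A) with `A = 0.5` (a hypothetical choice for illustration) and uses the closed-form Hicksian demand implied by the tangency condition for that function. It checks homogeneity of degree zero, no excess utility, and the compensated law of demand.

```python
import math

A = 0.5  # hypothetical Cobb-Douglas exponent on good one


def utility(x):
    return x[0] ** A * x[1] ** (1 - A)


def hicksian(p, u):
    """Closed-form Hicksian demand h(p, u) for u(x) = x1**A * x2**(1 - A)."""
    p1, p2 = p
    h1 = u * (A * p2 / ((1 - A) * p1)) ** (1 - A)
    h2 = u * ((1 - A) * p1 / (A * p2)) ** A
    return (h1, h2)


u_target = 10.0
p, p_new = (2.0, 1.0), (1.0, 1.0)
h, h_new = hicksian(p, u_target), hicksian(p_new, u_target)

# Property 1: scaling all prices by alpha = 3 leaves the bundle unchanged.
h_scaled = hicksian((6.0, 3.0), u_target)
print(all(math.isclose(a, b) for a, b in zip(h, h_scaled)))

# Property 2: the solution reaches exactly the required utility level.
print(math.isclose(utility(h), u_target))

# Property 5: (p' - p).[h(p', u) - h(p, u)] <= 0.
law = sum((pn - po) * (hn - ho)
          for pn, po, hn, ho in zip(p_new, p, h_new, h))
print(law <= 0)
```

All three checks print True for these numbers. The example only illustrates the properties for one utility function and one pair of price vectors; it is not a substitute for the general proofs.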

Plugging the result of the EMP into the objective function, we obtain the value function of this optimization problem

$$ p \cdot h(p,u) = e(p,u) $$

e(p,u) represents the minimum expenditure that the consumer needs to make in order to reach utility level U when prices are p. This expenditure function also satisfies a set of interesting properties when the utility function is continuous and it represents a preference relation that satisfies LNS:

1. Homogeneity of degree one in prices. Intuitively, we know that the optimal bundle of the EMP, h(p,u), is not changed when all prices change to the same extent (because the Hicksian demand satisfies homogeneity of degree zero in prices). Such a change in prices just makes it more or less expensive to buy the same bundle. That is, e(αp,u)=αe(p,u) for any α>0.

2. Strictly increasing in u. Intuitively, reaching a higher utility level for a given price vector requires an increase in expenditure, as the following figure indicates.

Figure #2.29

3. Non-decreasing in prices, pk, for any good k.

That is, higher prices mean (weakly) higher expenditure in order to reach a given utility level. The following argument provides a more detailed explanation of this property:

Note that we only increase the price of good k, leaving the prices of all other goods unaltered, so that p’ ≥ p. Let x’ be the expenditure-minimizing bundle at prices p’ and x the expenditure-minimizing bundle at prices p. Then the minimal expenditure that the consumer must make in order to reach utility U at prices p’ is weakly higher than the cost of the same bundle x’ at the lower prices p, i.e., p’·x’ ≥ p·x’. In turn, this cost is weakly larger than the minimum expenditure at prices p, i.e., p·x’ ≥ p·x. (Indeed, at prices p, bundle x minimizes the consumer's expenditure, so it is weakly cheaper than any other bundle x’ that reaches utility level U.) Combining both inequalities, e(p’,u) = p’·x’ ≥ p·x’ ≥ p·x = e(p,u).

4. Concave in prices. We provide a graphical intuition of this property in the following figure.

Figure #2.30

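Let x’ and x’’ be the expenditure-minimizing bundles at prices p’ and p’’, respectively, and let x̄ be the expenditure-minimizing bundle at the intermediate price vector p̄ = αp’ + (1−α)p’’, where α ∈ [0,1]. Starting from bundle x’, we know that the total expenditure at prices p’ is lower when buying bundle x’ than when buying any other bundle x that reaches utility level u. For instance, this is true for bundle x̄, since p’·x’ ≤ p’·x̄. Similarly, at prices p’’, the total expenditure when buying bundle x’’ is lower than when buying any other bundle x, where for instance x = x̄, so that p’’·x’’ ≤ p’’·x̄. Using the results found above for prices p’ and p’’, we obtain

$$ \alpha\,\underbrace{p' \cdot x'}_{e(p',u)} + (1-\alpha)\,\underbrace{p'' \cdot x''}_{e(p'',u)} \le \alpha\,p' \cdot \bar{x} + (1-\alpha)\,p'' \cdot \bar{x} = [\alpha p' + (1-\alpha) p''] \cdot \bar{x} = \underbrace{\bar{p} \cdot \bar{x}}_{e(\bar{p},u)} $$

$$ \alpha\,e(p',u) + (1-\alpha)\,e(p'',u) \le e(\bar{p},u) \quad \text{(concavity is confirmed)} $$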

5. Continuous in prices and utility.
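The properties of the expenditure function can also be illustrated numerically. The sketch below assumes the closed-form expenditure function of a Cobb-Douglas utility u(x) = x1^A · x2^(1−A) with `A = 0.5`; the functional form and all numbers are assumptions made for this example only.

```python
import math

A = 0.5  # hypothetical Cobb-Douglas exponent on good one


def expenditure(p, u):
    """e(p, u) for u(x) = x1**A * x2**(1 - A)."""
    return u * (p[0] / A) ** A * (p[1] / (1 - A)) ** (1 - A)


u = 10.0
p_lo, p_hi = (1.0, 4.0), (4.0, 1.0)   # two price vectors p' and p''
alpha = 0.5
p_bar = tuple(alpha * a + (1 - alpha) * b for a, b in zip(p_lo, p_hi))

# 1. Homogeneity of degree one in prices: e(3p, u) = 3 e(p, u).
print(math.isclose(expenditure((3.0, 12.0), u), 3 * expenditure(p_lo, u)))

# 2. Strictly increasing in u.
print(expenditure(p_lo, u + 1) > expenditure(p_lo, u))

# 3. Non-decreasing in the price of good one.
print(expenditure((2.0, 4.0), u) >= expenditure(p_lo, u))

# 4. Concavity: e(p_bar, u) >= alpha e(p', u) + (1 - alpha) e(p'', u).
mix = alpha * expenditure(p_lo, u) + (1 - alpha) * expenditure(p_hi, u)
print(expenditure(p_bar, u) >= mix)
```

With these numbers, e(p’,u) and e(p’’,u) are both approximately 40 while e(p̄,u) is approximately 50, so the expenditure at the average price vector exceeds the average of the expenditures, as concavity requires.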

In previous lectures we have described both the utility maximization problem (UMP), the so-called primal problem in consumer theory, and its dual: the expenditure minimization problem (EMP). But when can we guarantee that the solutions x* to both problems coincide?

Rubinstein provides a very intuitive approach to this question. Consider a function M(t) that describes the distance that a turtle travels in time t, as the following figure illustrates.

Figure #2.31

In this case the primal and dual problems would provide us with the same answer: “the maximal distance that a turtle can travel in t* units of time is x*” (in this case we are moving from the time axis to the distance axis), or alternatively, “the minimal time that a turtle needs to travel distance x* is t*” (here we are moving from the distance axis to the time axis). Note that in order to obtain the same answer from both statements we critically need the function M(t) to satisfy monotonicity. Otherwise, we would have a figure like the one below, in which the maximal distance traveled after both t1 and t2 is x*.20 In that case, time t is not a function of distance x.

Figure #2.32

20 Note that this figure is extremely counterintuitive since the turtle increases the distance traveled until reaching a point from which the turtle starts to travel backwards, inducing a decrease in the distance traveled.

Similarly, we also require that the function M(t) satisfies continuity. Otherwise, the function would show a pattern similar to that depicted below. Specifically, the minimum time required to travel distance x1 and x2 is t* for both distances.21

Figure #2.33

Given the conditions of monotonicity and continuity, we are now ready to specify under which conditions we can guarantee that the solutions to the UMP and the EMP coincide: let the utility function be monotonic and continuous. Then, if bundle x* is the solution to the UMP, it must also be the solution to the EMP.

Let us now approach the issue of duality in consumption more formally. In this regard, we first need a few definitions.
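Before turning to those definitions, the coincidence of the two solutions can be illustrated with a numerical sketch. It assumes a Cobb-Douglas utility function with exponent `A = 0.3`, which is monotonic and continuous; the functional form, the prices and the wealth level are hypothetical choices for this example.

```python
import math

A = 0.3  # hypothetical Cobb-Douglas exponent on good one


def utility(x):
    return x[0] ** A * x[1] ** (1 - A)


def walrasian(p, w):
    """Solution to the UMP for u(x) = x1**A * x2**(1 - A)."""
    return (A * w / p[0], (1 - A) * w / p[1])


def hicksian(p, u):
    """Solution to the EMP for the same utility function."""
    p1, p2 = p
    h1 = u * (A * p2 / ((1 - A) * p1)) ** (1 - A)
    h2 = u * ((1 - A) * p1 / (A * p2)) ** A
    return (h1, h2)


p, w = (2.0, 5.0), 120.0
x_star = walrasian(p, w)            # UMP solution at (p, w)
u_star = utility(x_star)            # utility reached in the UMP
h_star = hicksian(p, u_star)        # EMP solution when the target is u_star
spent = p[0] * h_star[0] + p[1] * h_star[1]

print(x_star, h_star)               # the two bundles coincide
print(math.isclose(spent, w))       # and the minimal expenditure equals w
```

The bundle that maximizes utility given wealth w is the same bundle that minimizes expenditure when the utility target is the maximal utility level reached in the UMP, and the minimal expenditure equals the original wealth.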

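Hyperplane: for some $p \in \mathbb{R}^L$ with $p \neq 0$, and some $c \in \mathbb{R}$, the set of points in $\mathbb{R}^L$ such that

$$ \{ x \in \mathbb{R}^L : p \cdot x = c \} $$

Half-space: the set of bundles $x$ for which $p \cdot x \ge c$. That is,

$$ \{ x \in \mathbb{R}^L : p \cdot x \ge c \} $$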

The following figure depicts a hyperplane and a half-space in R2. First note that the hyperplane is simply represented by the set of bundles whose cost is c. (Note that this definition can be generalized to more dimensions, indicating the consumption of more than two goods. For instance, if the consumer is considering bundles with three different goods, the hyperplane would be a flat surface in R3.) On the other hand, the half-space represents the set of bundles whose cost is at least c.

21 Again, this is a very counterintuitive turtle since, after traveling a distance x1, the turtle makes a huge jump from x1 to x2 in a nanosecond.

Figure #2.34

We are now ready to state the separating hyperplane theorem.

Separating hyperplane theorem: For every convex and closed set $K \subset \mathbb{R}^L$ and every point $\bar{x}$ outside of this set, there is a half-space containing $K$ and excluding $\bar{x}$. That is, there exist $p \in \mathbb{R}^L$ and $c \in \mathbb{R}$ such that

$$ p \cdot x \ge c \ \text{ for all elements in the set, } x \in K, \qquad \text{but} \qquad p \cdot \bar{x} < c \ \text{ for the point outside the set, } \bar{x} \notin K. $$

Intuitively, this theorem states that every convex and closed set K can be equivalently described as the intersection of the half-spaces that contain it. Indeed, as the following figure indicates, as we create more and more half-spaces, their intersection becomes set K.

Figure #2.35

A natural question at this point is what would happen if the set that we are trying to “equivalently describe” is not convex. As the following figure suggests, the intersection of half-spaces doesn't coincide with set K, and we cannot use several half-spaces to “equivalently describe” set K. Interestingly, the intersection of the half-spaces that contain set K is the smallest closed, convex set that contains K. This set is known as the convex hull of set K, and is denoted as Kbar. (Of course, the convex hull Kbar is itself convex, unlike set K.)

Figure #2.36

Let us now introduce an additional definition that will be used frequently in our future expositions. We define the support function of the nonempty closed set $K$ as

$$\mu_K(p) = \inf_x \{\, p \cdot x \text{ for all } x \in K \text{ and } p \in \mathbb{R}^L \,\}$$

Intuitively, for a given price vector $p$, the support function identifies the bundle $x$ in $K$ that minimizes total expenditure $p \cdot x$, and returns that minimal expenditure.22 Using the support function as defined above, we can now construct set K. In particular, for every price vector p, we can define a half-space whose boundary is given by the support function of set K. That is, we define the set of bundles for which $p \cdot x \geq \mu_K(p)$. [Note that each such half-space contains all elements of set K; intersecting these half-spaces over all price vectors then excludes the elements outside set K.]23

Therefore, the intersection of the half-spaces generated by all possible values of p describes (“reconstructs”) the set K. That is, set K can be described by all those bundles x such that $p \cdot x \geq \mu_K(p)$ for every p:

$$K = \{x \in \mathbb{R}^L : p \cdot x \geq \mu_K(p) \text{ for every } p\}$$

By the same logic, if $K$ is not convex, then the set $\{x \in \mathbb{R}^L : p \cdot x \geq \mu_K(p) \text{ for every } p\}$ defines the smallest closed, convex set containing $K$ (i.e., the convex hull of set $K$).

We describe the previous intuitions in the following figure. First, note that for a given price vector, the support function $\mu_K(p)$ selects the element in the set K that minimizes total expenditure $p \cdot x$. This element is bundle $x_1$ in this graphical example. Second, we can now define the half-space of the previous hyperplane, as follows: $p \cdot x \geq p \cdot x_1$ for all x in K, where $p \cdot x_1 = \mu_K(p)$. Graphically, this inequality identifies all bundles to the left of the hyperplane $p \cdot x_1$. Third, we can repeat the previous procedure for any other bundle (for example, a bundle $x_3$ in the north boundary of set K). Repeating this process enough times provides a full description of set K.

22 Remember that, from a mathematical point of view, we use the inf operator when we cannot guarantee that the min of a particular function is well defined. Generally, however, most of the functions we encounter in this course have a well defined min and max.

23 Concavity of the support function is an interesting mathematical result. You will prove it in a homework assignment.

Figure #2.37
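The reconstruction of a set from its support function can be sketched numerically. The example below is only an illustration under stated assumptions: set K is a hypothetical finite set (the four corners of the unit square), so the inf is attained and can be computed with `min`, and the half-spaces are generated by 360 price vectors spread evenly around the unit circle rather than by every possible p.

```python
import math


def dot(p, x):
    """Inner product p.x, i.e., the expenditure on bundle x at prices p."""
    return sum(pi * xi for pi, xi in zip(p, x))


def support(K, p):
    """Support function of a finite set K at p: minimal expenditure and a minimizer."""
    best = min(K, key=lambda x: dot(p, x))
    return dot(p, best), best


def in_all_half_spaces(K, x, directions, tol=1e-9):
    """True if p.x >= mu_K(p) for every price vector p in `directions`."""
    return all(dot(p, x) >= support(K, p)[0] - tol for p in directions)


# Hypothetical set K: the corners of the unit square (K itself is not convex)
K = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# Assumed sample of price vectors pointing in every direction of the plane
directions = [
    (math.cos(2 * math.pi * k / 360), math.sin(2 * math.pi * k / 360))
    for k in range(360)
]

mu, x_min = support(K, (1.0, 2.0))
inside = in_all_half_spaces(K, (0.5, 0.5), directions)
outside = in_all_half_spaces(K, (1.5, 0.5), directions)
print(mu, x_min, inside, outside)
```

The point (0.5, 0.5) does not belong to K but passes every half-space test, which is consistent with the intersection of half-spaces describing the convex hull of K rather than K itself. The point (1.5, 0.5) lies outside the hull and is excluded by at least one half-space.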

The above definition of the support function provides us with a useful duality theorem that we will use in the future. Consider a nonempty, closed set K, and let $\mu_K(\cdot)$ be its support function. Then there is a unique element in set K, $\bar{x}$, such that

$$\bar{p} \cdot \bar{x} = \mu_K(\bar{p}) \iff \mu_K(\cdot) \text{ is differentiable at } \bar{p}$$

Moreover, in this case,

$$\frac{\partial \mu_K(\bar{p})}{\partial p_k} = \bar{x}_k \text{ for every } k$$

Intuitively, note that the above theorem simply states that, for a given price vector $\bar{p}$, the support function chooses the bundle $\bar{x}$ that minimizes total expenditure, and that such expenditure is therefore $\bar{p} \cdot \bar{x}$. In addition, the derivative of total expenditure with respect to price (when evaluated at the optimum) is $\bar{x}$. We will use this theorem in our discussion of the expenditure function below.

Relationships between the expenditure function and Hicksian demand

Let us assume that the utility function is continuous and represents a preference relation that satisfies LNS and that is strictly convex. Then for all p and u, we have

$$h_k(p,u) = \frac{\partial e(p,u)}{\partial p_k} \text{ for every good } k$$



The identity tells us that, if we want to find the Hicksian demand for good k and we have information about the expenditure function, we just need to differentiate e(p,u) with respect to the price of good k.
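The identity can be checked numerically for a specific case. The sketch below assumes a Cobb-Douglas utility function $u(x,y) = x^{0.5} y^{0.5}$ (an assumption made only for this illustration), for which the expenditure function is $e(p,u) = 2u\sqrt{p_x p_y}$ and the Hicksian demand for good x is $h_x(p,u) = u\sqrt{p_y/p_x}$. The price and utility values are hypothetical, and the derivative is approximated by a central finite difference.

```python
import math


def expenditure(px, py, u):
    """e(p,u) = 2*u*sqrt(px*py) under the assumed utility u(x,y) = x^0.5 * y^0.5."""
    return 2.0 * u * math.sqrt(px * py)


def hicksian_x(px, py, u):
    """h_x(p,u) = u*sqrt(py/px) under the same assumed utility."""
    return u * math.sqrt(py / px)


def d_expenditure_dpx(px, py, u, step=1e-6):
    """Central finite-difference approximation of the derivative of e with respect to px."""
    return (expenditure(px + step, py, u) - expenditure(px - step, py, u)) / (2.0 * step)


# Hypothetical prices and utility level
px, py, u = 2.0, 8.0, 5.0

print(d_expenditure_dpx(px, py, u))  # numerical derivative of e(p,u) with respect to px
print(hicksian_x(px, py, u))         # Hicksian demand for good x
```

Both numbers should agree up to the approximation error of the finite difference.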

Proof I (using the Duality Theorem)

First note that the expenditure function is the support function for the set of all bundles for which utility reaches at least a level u. That is, the set

$$\{x \in \mathbb{R}^L_+ : u(x) \geq u\}$$

or, alternatively, the upper contour set (UCS) of bundle x, which is convex and closed. If we use the duality theorem, we can then state that there is a unique bundle in this set, h(p,u), such that

$$p \cdot h(p,u) = e(p,u)$$

where e(p,u) is the support function of this problem. Let us see this result in the following figure.

Figure #2.38

First, note that the upper contour set of bundle x is indeed closed and convex. We can then identify the support function of this set as the hyperplane associated with the lowest cost among the elements that still belong to the upper contour set. In particular, this occurs with hyperplane $p \cdot x^* = e(p,u)$, which provides us with the minimal expenditure that still reaches utility level u.

We can more formally see the extreme similarity between the duality theorem and the main result of the EMP as follows.

Duality Theorem: If $K$ is closed and $\mu_K(\cdot)$ is its support function, then there is a unique element $\bar{x} \in K$ such that:

$$\bar{p} \cdot \bar{x} = \mu_K(\bar{p}) \iff \mu_K(\cdot) \text{ is differentiable at } \bar{p}$$

EMP: taking $K = UCS$, $\bar{x} = h(p,u)$ and $\mu_K(\cdot) = e(\cdot,u)$:

$$p \cdot h(p,u) = e(p,u) \iff e(\cdot,u) \text{ is differentiable at } p$$

Moreover, note that from the duality theorem the gradient of the support function coincides with this unique bundle (the expenditure minimizing bundle). That is,

$$\nabla_p \mu_K(\bar{p}) = \bar{x} \quad \Longrightarrow \quad \nabla_p e(p,u) = h(p,u)$$

$$\frac{\partial e(p,u)}{\partial p_k} = h_k(p,u) \text{ for every good } k$$

Proof II - using first order conditions

Proof III - using the envelope theorem24

Let us first understand the economic intuition behind the envelope theorem.

24 For an intuitive description of the envelope theorem from an economic point of view, see NS pp. 32-36.


Figure #2.39

Using the envelope theorem in the expenditure function we easily obtain:

$$\frac{\partial e(p,u)}{\partial p_k} = \frac{\partial [p \cdot h(p,u)]}{\partial p_k} = h_k(p,u) + p \cdot \frac{\partial h(p,u)}{\partial p_k}$$

and since the Hicksian demand is already at the optimum, indirect effects are negligible, $p \cdot \frac{\partial h(p,u)}{\partial p_k} = 0$, implying

$$\frac{\partial e(p,u)}{\partial p_k} = h_k(p,u)$$

Note that this result is convenient from a practical point of view. In particular, if the researcher does not know the actual expression of the expenditure function e(p,u), or if its expression is relatively intractable (a huge expression), this result states that the researcher can still measure the reaction of a consumer’s minimal expenditure to changes in prices by knowing the Hicksian demand.



Relationship between the Walrasian and Hicksian demand

Consider a continuous utility function, representing a strictly convex preference relation that satisfies LNS. Then for all p, w and u=v(p,w), we have

$$\frac{\partial h_l(p,u)}{\partial p_k} = \frac{\partial x_l(p,w)}{\partial p_k} + \frac{\partial x_l(p,w)}{\partial w} x_k(p,w) \text{ for all goods } l \text{ and } k$$

Importantly, its expression coincides with the $s_{lk}(p,w)$ that we discussed in our explanation of the Slutsky equation. Therefore, the matrix of partial derivatives of the Hicksian demand, $D_p h(p,u)$, coincides with the Slutsky matrix.25 Let us further consider the relationship between these two demand curves. Consider a consumer facing prices and wealth $(\bar{p},\bar{w})$ and attaining a utility level $\bar{u}$.

Note that $\bar{w} = e(\bar{p},\bar{u})$. In addition, we know that for any $(p,u)$, $h_l(p,u) = x_l(p, e(p,u))$. Differentiating this expression with respect to $p_k$, and evaluating it at $(\bar{p},\bar{u})$, we get

$$\frac{\partial h_l(\bar{p},\bar{u})}{\partial p_k} = \frac{\partial x_l(\bar{p}, e(\bar{p},\bar{u}))}{\partial p_k} + \frac{\partial x_l(\bar{p}, e(\bar{p},\bar{u}))}{\partial w} \frac{\partial e(\bar{p},\bar{u})}{\partial p_k}$$

Since $\frac{\partial e(\bar{p},\bar{u})}{\partial p_k} = h_k(\bar{p},\bar{u}) = x_k(\bar{p},\bar{w})$ and $\bar{w} = e(\bar{p},\bar{u})$, this expression coincides with the one stated above.

25 Recall that both of these matrices have LxL dimension, since they reflect both own- and cross-price effects among all goods.

Application of IE and SE: The consumer as a labor supplier

In this section we apply our analysis of the income and substitution effects to an individual’s decision about how many hours to work. In particular, this individual enjoys consumption of all other goods, x (a vector of N different goods), and leisure hours, L. Thus, his UMP can be expressed as follows

$$\max_{x,L} \; u(x,L)$$

$$\text{s.t. } \sum_i p_i x_i = M, \quad M = wz + \bar{M}, \quad \text{and } T = z + L$$

where M is the individual’s total wealth coming from two sources: the z hours he dedicates to work (paid at a wage rate of w per hour) and his non-labor income, $\bar{M}$, e.g., inheritance, government subsidies, etc. (Note that his total time, T, must be dedicated to either work, z, or leisure, L.)



We can rewrite the above UMP using the Composite Commodity Theorem, as follows: if the prices of all goods maintain a constant proportion with respect to the price of labor (wage), i.e., $p_1 = \alpha_1 w$, $p_2 = \alpha_2 w$, …, then we can represent these goods by a single good (the composite commodity), y, with price p. This is useful because, when we examine many goods, the relationships between the demands for all of them become very complicated. Grouping them together allows us to examine the remaining good of interest against a single composite good. We therefore collapse the above UMP to only two goods: the composite commodity y and the number of hours dedicated to work z. That is, we can rewrite the above UMP as follows:

$$\max_{y,z} \; v(y,z)$$

$$\text{s.t. } py = wz + \bar{M}$$

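The Lagrangian associated to this UMP is

$$L = v(y,z) + \lambda(\bar{M} + wz - py)$$

and the FOCs (for interior optimum) are

$$\frac{\partial L}{\partial y}: \; v_y - \lambda p = 0$$

$$\frac{\partial L}{\partial z}: \; v_z + \lambda w = 0$$

and solving for $\lambda$ in both of them, we obtain

$$\underbrace{-\frac{v_z}{w}}_{\text{marginal disutility per dollar of labor}} = \underbrace{\frac{v_y}{p}}_{\text{marginal utility per dollar of good consumption}} \iff -\frac{v_z}{v_y} = \frac{w}{p}$$

Using the constraint, we finally obtain the Walrasian demand for the composite commodity $x_y(w,p,\bar{M})$ and the labor supply function $x_z(w,p,\bar{M})$.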
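To make the first order conditions concrete, the following sketch solves the problem for one particular functional form. The utility function $v(y,z) = a \ln y + (1-a)\ln(T - z)$, the parameter values, and the function name `labor_choice` are assumptions introduced only for this illustration. Under this utility function, the individual spends a share a of his “full income” $\bar{M} + wT$ on the composite commodity and the remaining share on leisure.

```python
def labor_choice(w, p, M_bar, T, a):
    """Optimal (y, z) for the assumed utility v(y, z) = a*ln(y) + (1 - a)*ln(T - z).

    Assumes an interior solution, i.e., 0 < z < T.
    """
    full_income = M_bar + w * T          # wealth if all T hours were worked
    y = a * full_income / p              # demand for the composite commodity
    z = T - (1.0 - a) * full_income / w  # hours of work (time not spent on leisure)
    return y, z


# Hypothetical wage, price, non-labor income, time endowment and preference weight
w, p, M_bar, T, a = 10.0, 2.0, 40.0, 16.0, 0.5
y, z = labor_choice(w, p, M_bar, T, a)

v_y = a / y                 # marginal utility of consumption
v_z = -(1.0 - a) / (T - z)  # marginal (dis)utility of an extra hour of work

print(y, z)
print(-v_z / w, v_y / p)     # both sides of the tangency condition
print(p * y, M_bar + w * z)  # both sides of the budget constraint
```

The last two lines check that the candidate solution satisfies both the tangency condition and the budget constraint derived above.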

The following figure illustrates the individual’s optimal choice of an amount of the composite commodity, y, and hours of work, z.



Figure #2.40

First, note that the budget line is represented by an upward sloping straight line. Intuitively, an increase in the amount of hours worked provides the individual with a larger amount of wealth to spend on consumption goods, measured by the composite commodity in the vertical axis.26 Second, note that this individual’s indifference curves show increasing utility levels as we move northwest, indicating that the individual is better off when his consumption of the composite commodity increases and the number of working hours decreases. For a starting wage rate $w$, this individual selects bundle A, where he works $z_1$ hours. When the wage rate increases to $w_1$ the budget line becomes steeper (i.e., for every extra hour of work the individual can afford a larger amount of the composite commodity). At this new wage rate $w_1$ the individual chooses bundle B, spending $z_2$ hours working. The change from bundle A to B in the top figure is also reflected in the bottom figure. Specifically, we represent working hours in the horizontal axis and the wage rate in the vertical axis. For an increase in the wage rate from $w$ to $w_1$, the number of working hours increases from $z_1$ to $z_2$, indicating that a higher wage induces this worker to spend more hours on the job. When the wage rate experiences a further increase, from $w_1$ to $w_2$, the budget line becomes steeper in the top figure, and the individual selects bundle C, illustrating a reduction in the number of working hours, from $z_2$ to $z_3$. This effect is also reflected in the bottom figure, where an increase in the wage rate from $w_1$ to $w_2$ induces a reduction in the individual’s labor supply.

26 In particular, note that the budget line originates at $\bar{M}/p$, which indicates the amount of the composite commodity he can afford when he does not obtain any resources from working. In addition, the positive slope of the individual’s budget line is given by the price ratio w/p. Also note that an increase in labor supplied must be accompanied by an increase in the composite consumption commodity to keep utility constant in the top figure. Further, the indifference curves reflect the fact that preferences are quasiconcave. Review the GR handout on labor supply on the class website for further details on this.

Summarizing, labor supply initially increases as a result of higher wages but then decreases, acquiring a backward-bending pattern like that in the bottom figure.27 This is due to the relative size of the substitution and income effects associated with the increase in the wage rate (a change in the price of one of the goods that the individual consumes), as we examine next. Intuitively, this also makes sense. For most goods, a higher price induces suppliers to offer more units. In this case, however, the individual only has a fixed amount of “supply” (hours of labor), so higher wages should not cause much of an increase in supply and can quite naturally cause supply to decrease.

The following figure illustrates the substitution and income effects from a wage increase.

Figure #2.41

27 This effect has been empirically confirmed in many occupations, such as nursing services in Massachusetts. Experiencing a shortage of nurses, managers of hospital facilities decided to increase the wage per hour in order to attract more nurses. Unfortunately, the increase in wages was counterproductive, at least in the short run, since it induced nurses currently working for those hospitals to reduce the number of hours they chose to work.



Using the same analysis as for two consumption goods, we make a wealth compensation after the price change that leaves the consumer just as well off as before the price change. This is indicated in the figure by the downward shift in the budget line after the price change towards a new budget line that is tangent to the indifference curve the individual reaches before the price change, $I_1$. Given this new budget line, the individual selects bundle D. Thus, the substitution effect of an increase in the wage rate is measured by an increase in the number of working hours, from $z_a$ to $z_d$, while the income effect is represented by a decrease in the number of working hours, from $z_d$ to $z_b$. Intuitively, working hours become more attractive (relative to the composite good) for the individual, leading him to offer more labor services, as reflected in the substitution effect. This higher wage per hour, however, allows the individual to afford more consumption without the need to work so many hours per day, which induces him to reduce his working hours, as indicated by the negative income effect. Therefore, the income effect partially offsets the substitution effect, leading to a relatively minor (but still positive) total effect. (Note that when the income effect is significantly negative and, in absolute value, larger than the substitution effect, the total effect of a higher wage rate becomes negative. In such a case, working hours decrease as a result of an increase in the wage rate, becoming a Giffen good. Another way of looking at this is that the worker is now choosing to consume another good, ‘freetime’, so the person has reached the point where his marginal utility of ‘freetime’ outweighs his marginal utility of income.)

The figure also illustrates the compensating variation (CV)28 associated with the wage increase. In particular, after the wage increase the worker’s wealth is compensated (reduced) so that the worker can maintain his initial utility level (before the wage increase). Graphically, we do so by shifting the worker’s budget line downwards after the wage increase in a parallel fashion (maintaining the price ratio) until the worker reaches his initial utility level. This wealth compensation represents the compensating variation. Indeed, recall that the vertical intercept of the worker’s initial budget line is $\bar{M}/p$, while the vertical intercept of his final budget line is $M'/p$, where $M'$ represents the wealth level that the individual needs in order to reach his initial utility level at the final price ratio, i.e., $M'$ is the minimal expenditure evaluated at the final wage and the initial utility level. Therefore, the amount of money that the individual is willing to give up in order to reach his initial utility level at the final price ratio (the compensating variation) is $\frac{\bar{M}}{p} - \frac{M'}{p}$. Hence, the difference in the vertical intercepts, $\bar{M} - M'$, represents the compensating variation if the price of all other goods is normalized to one.

After describing the income and substitution effect in the labor market from a graphical approach, we next formalize these two effects using the Slutsky equation.

28 Recall that the compensating variation is the amount of money which must be taken from the consumer in the new situation to make him as well off as he was in the initial situation.




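First, let us state the previous problem as an EMP

$$\min_{y,z} \; \bar{M} = py - wz$$

$$\text{s.t. } v(y,z) = v$$

where $v$ on the right-hand side of the constraint denotes the utility level to be reached. From this EMP we can find the optimal Hicksian demands, $h_y(w,p,v)$ and $h_z(w,p,v)$, and inserting them into the objective function, we obtain the value function of this EMP (the expenditure function):

$$e(w,p,v) = p\,h_y(w,p,v) - w\,h_z(w,p,v)$$

We know that

$$x_z(w,p,e(w,p,v)) = h_z(w,p,v)$$

Differentiating on both sides and using the chain rule,

$$\frac{\partial x_z}{\partial w} + \frac{\partial x_z}{\partial e}\frac{\partial e}{\partial w} = \frac{\partial h_z}{\partial w} \quad \Longrightarrow \quad \frac{\partial x_z}{\partial w} = \frac{\partial h_z}{\partial w} - \frac{\partial x_z}{\partial e}\frac{\partial e}{\partial w}$$

and since we know that $\frac{\partial e(w,p,v)}{\partial w} = -h_z(w,p,v)$, then

$$\frac{\partial x_z}{\partial w} = \frac{\partial h_z}{\partial w} + \frac{\partial x_z}{\partial e}\, h_z(w,p,v)$$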

Let us next interpret the above Slutsky equation in terms of the substitution and income effects in the labor market. First, the term $\partial h_z/\partial w$ represents the substitution effect. It is always positive, indicating that an increase in wages increases the worker's supply of labor (as long as we compensate the wealth of the worker, so that his initial utility level is unaffected29). The second term, $(\partial x_z/\partial e)\,h_z(w,p,v)$, denotes the income effect. When $\partial x_z/\partial e > 0$, an increase in wages makes the worker richer, and he decides to work more. In this case, working hours are regarded as a normal good, and the income effect reinforces the substitution effect, yielding an upward sloping labor supply curve. If, in contrast, $\partial x_z/\partial e < 0$, an increase in wages makes the worker richer, but he decides to work less. In this case, working hours are regarded as an inferior good, and the (negative) income effect moves in the opposite direction of the substitution effect. When the income effect is relatively small, the total effect of an increase in wages is still positive, producing a positively sloped labor supply curve. If, however, the income effect is sufficiently negative, it more than offsets the (positive) substitution effect, implying that the total effect of an increase in wages becomes negative, yielding a negatively sloped labor supply curve.
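The decomposition can be verified numerically for a particular case. The sketch below assumes the utility function $v(y,z) = A \ln y + (1-A)\ln(T-z)$ (an assumption made only for this illustration, together with all parameter values and function names), for which closed forms of the Walrasian and compensated labor supply are available. Derivatives are approximated with central finite differences.

```python
import math

T, A = 16.0, 0.5  # hypothetical time endowment and preference weight


def labor_supply(w, p, M):
    """Walrasian labor supply x_z(w, p, M) under v = A*ln(y) + (1 - A)*ln(T - z)."""
    return T - (1.0 - A) * (M + w * T) / w


def hicksian_labor(w, p, v):
    """Compensated labor supply h_z(w, p, v) under the same assumed utility."""
    leisure = math.exp(v) * ((1.0 - A) * p / (A * w)) ** A
    return T - leisure


def indirect_utility(w, p, M):
    """Utility level reached at the Walrasian optimum."""
    z = labor_supply(w, p, M)
    y = (M + w * z) / p
    return A * math.log(y) + (1.0 - A) * math.log(T - z)


def derivative(f, x, step=1e-5):
    """Central finite-difference derivative of a one-argument function."""
    return (f(x + step) - f(x - step)) / (2.0 * step)


w, p, M = 10.0, 2.0, 40.0  # hypothetical wage, price and non-labor income
v = indirect_utility(w, p, M)

total = derivative(lambda wage: labor_supply(wage, p, M), w)
substitution = derivative(lambda wage: hicksian_labor(wage, p, v), w)
income = derivative(lambda wealth: labor_supply(w, p, wealth), M) * hicksian_labor(w, p, v)

print(total, substitution, income)
```

In this parameterization working hours behave as an inferior good (the income term is negative), but the substitution effect dominates, so the total effect of a wage increase on hours worked remains positive.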

The following figure illustrates a case in which the income effect resulting from a wage increase is positive, yielding a positively sloped labor supply curve. For completeness, we also include the compensated labor supply curve, which reflects the substitution effect due to the wage increase, but not the income effect. As a consequence, the compensated supply curve is always positively sloped. The uncompensated labor supply curve represents both the substitution and income effects associated with the wage increase.

29 Note that this would imply reducing the worker’s wealth in the case of a wage increase, or increasing the worker’s wealth in the case of a wage decrease.



Figure #2.42

Hence, the uncompensated labor supply curve is positively sloped when the income effect is positive, or when, despite being negative, it is still smaller than the substitution effect (in absolute value). When the income effect is negative and sufficiently strong to more than offset the substitution effect, the uncompensated supply curve becomes negatively sloped, as the figure below illustrates.30

30 From an empirical point of view, there is substantial evidence showing that: (1) the labor supply curve for British, American, Japanese and Dutch men is virtually vertical, indicating that the substitution and income effect completely offset each other; (2) the labor supply curve for British and German married women is almost vertical; (3) the labor supply curve for US and Canadian married women is slightly backward bending, indicating that the income effect becomes significantly negative for relatively high wages; and (4) supply curve for single women in most developed countries is clearly positively sloped, indicating that the income effect is not significantly negative (i.e., the elasticity of labor supply is larger than 4 in many studies).



Figure #2.43

An interesting application of the income and substitution effects in the labor market is related to the so-called Laffer curve. The supporters of this curve argue that increasing the taxes on wages might initially increase tax revenue but, after a certain tax rate, further increasing taxes reduces the incentives to work as a consequence of workers being subjected to a higher tax rate. This means that there is an optimal tax rate which raises the most tax revenue, and any increase or decrease from that point will cause a reduction in tax revenue. Graphically, the Laffer curve resembles an inverted-U, with the tax rate in the horizontal axis and total tax revenue in the vertical axis.31

31 Invented by Arthur Laffer, the curve shows the relationship between tax rates and tax revenue.




Figure #2.44

In order to better understand the precise relationship between tax rate and tax revenue in this setting, let us define tax revenue as

$$T = \tau w H(\omega),$$

where $H(\omega)$ represents the total number of hours supplied when the wage net of taxes is $\omega = (1-\tau)w$ (so $w$ is the nominal wage). Differentiating with respect to the tax rate, we obtain

$$\frac{\partial T}{\partial \tau} = \underbrace{w H(\omega)}_{\text{positive effect}} \; \underbrace{- \; \tau w^2 \frac{dH}{d\omega}}_{\text{negative effect}}$$

Intuitively, the first term represents an increase in total revenue because, for a given number of working hours, a higher tax rate increases total revenue. The second term denotes, however, a negative effect from increasing the tax rate. In particular, higher taxes induce workers to work fewer hours, decreasing the tax revenue. Therefore, the negative effect dominates the positive effect (and as a consequence an increase in the tax rate produces a reduction in tax revenue) if

$$\frac{1}{\tau} < \frac{dH}{d\omega}\frac{w}{H(\omega)}$$

And multiplying both sides by $(1-\tau)$,

$$\frac{1-\tau}{\tau} < \frac{dH}{d\omega}\frac{(1-\tau)w}{H(\omega)} \equiv E$$

where the term on the right-hand side is the elasticity of labor supply with respect to net wages. Hence, for tax revenue to fall from a small increase in the tax rate, it must be that the elasticity of labor supply is larger than $\frac{1-\tau}{\tau}$, suggesting that, for an elasticity no larger than one, $\tau$ would have to be greater than 0.50. This condition, however, is relatively difficult to satisfy. If taxes on labor are significant (as in Japan, Sweden, or the US during the 1970s), the elasticity of labor supply was indeed larger than $\frac{1-\tau}{\tau}$. Nonetheless, an average US worker nowadays (making around $35,000 per year) is subject to approximately a 25% wage tax. This would imply that, in order for an increase in the tax rate to be counterproductive (actually reducing tax revenue), the elasticity of labor supply should be larger than $(1-\tau)/\tau = (1-0.25)/0.25 = 3$, which is very unlikely to hold among most workers. Moreover, the outrage from such a huge tax increase would probably offset any potential tax gains the country would earn.

32 The Laffer curve is a large part of supply-side economics, which holds that changes in marginal tax rates have a great effect on economic activity. www.econlib.org/library/Enc/SupplySideEconomics.html
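The elasticity condition above can be illustrated with a short numerical sketch. It assumes a constant-elasticity labor supply function, $H(\omega) = \omega^E$, which is not part of the original discussion and is used only because it makes the elasticity of labor supply equal to E at every net wage. The wage level, the elasticity value, and the grid of tax rates are hypothetical.

```python
def threshold_elasticity(tau):
    """Elasticity (1 - tau)/tau above which a small tax increase lowers revenue."""
    return (1.0 - tau) / tau


def tax_revenue(tau, w, elasticity):
    """Revenue tau*w*H(omega), assuming H(omega) = omega**elasticity and omega = (1 - tau)*w."""
    net_wage = (1.0 - tau) * w
    return tau * w * net_wage ** elasticity


# Hypothetical nominal wage and labor supply elasticity
w, elasticity = 20.0, 3.0

# Search for the revenue-maximizing tax rate on a grid 0.001, 0.002, ..., 0.999
grid = [k / 1000.0 for k in range(1, 1000)]
peak = max(grid, key=lambda tau: tax_revenue(tau, w, elasticity))

print(threshold_elasticity(0.25))  # threshold elasticity at a 25% tax rate
print(peak)                        # tax rate at the top of this Laffer curve
```

Under this assumed labor supply function, revenue peaks at the tax rate where the elasticity equals $(1-\tau)/\tau$, so an elasticity of 3 places the top of the Laffer curve at a 25% tax rate, consistent with the threshold computed in the text.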

Income and substitution effects among different goods

In previous chapters we focused on the substitution and income effects of varying the price of good k on the demand for that same good k. In this chapter we emphasize the SE and IE of varying the price of good k on the demand for other goods j. In terms of the Slutsky matrix, we paid close attention to the elements along the main diagonal of the matrix. We briefly examined the elements away from the main diagonal, but in this chapter we investigate these elements in more detail.

We start our analysis using only two goods. Of course, the type of relationships that can occur between only two goods are relatively limited, but this will help us illustrate some intuitions using simple figures. Later on we generalize our analysis to N>2 goods.

Let us start by defining goods that are gross complements or substitutes in consumption. In particular, when the price of good y falls, the substitution effect (which, by definition, results exclusively from a change in relative prices, so that the consumption bundle remains on the same indifference curve as before) can be so small that the consumer purchases a larger amount of both goods x and y. In this case, we refer to goods x and y as gross complements. The following figure illustrates this case. Specifically, the consumer starts purchasing bundle A. Then the price of good y decreases, producing an upward pivoting effect on the consumer’s budget line. We can then apply a Hicksian wealth compensation, so that the consumer can maintain his utility level intact after the price change. The reduction in consumption of good x from A to B reflects the substitution effect, whereas the increase in consumption of good x from B to C illustrates the income effect. Indeed, the total effect on the consumption of good x is positive. Note also that the total effect on the consumption of good y is positive, since the consumer increases his consumption of good y from $y_0$ to $y_1$. We can therefore conclude that a reduction in the price of good y produces an increase in the consumption of good x, and thus the goods can be regarded as gross complements.



Because $\partial x / \partial p_y < 0$, they are gross complements

Figure #2.45

Let us now describe the opposite case. Specifically, when the price of good y falls, the substitution effect may be so large that the consumer purchases less of good x and more of good y. In this case, we regard goods x and y as gross substitutes in consumption. The following figure reflects the situation. As in our previous figure, the price of good y decreases, rotating the consumer’s budget line. The reduction in consumption of good x from A to B illustrates the substitution effect, while the (small) increase in consumption of good x from B to C reflects the income effect. The total effect is negative, implying that a decrease in the price of good y leads to a reduction in the consumption of good x. Hence, goods x and y can be regarded as gross substitutes.

Figure #2.46

Because $\partial x / \partial p_y > 0$, they are gross substitutes



After providing a graphical representation of the definition of gross substitutes and complements, let us next introduce a more mathematical treatment of the relationship between these two goods. In particular, the change in the consumption of good x caused by changes in $p_y$ is explained using the Slutsky equation, as follows.

$$\underbrace{\frac{\partial x}{\partial p_y}}_{\text{combined effect (ambiguous)}} = \underbrace{\left.\frac{\partial x}{\partial p_y}\right|_{U=\text{constant}}}_{\text{substitution effect }(+)} \; \underbrace{- \; y\,\frac{\partial x}{\partial I}}_{\text{income effect }(-)\text{ if } x \text{ is normal}}$$

First, note that the substitution effect is positive. Intuitively, we are just saying that a decrease in the price of good y induces the consumer to buy less of good x, if his utility level is kept constant, i.e., graphically, the consumer moves along the same indifference curve. Indeed, good y became relatively cheaper and good x relatively more expensive, inducing the consumer to modify his consumption patterns towards the cheaper good.33 Second, the derivative $\partial x/\partial I$ from the second term of the Slutsky equation is positive when good x is a normal good but negative if x is an inferior good. Because of the minus sign in front of the income effect, the income effect is therefore negative for normal goods and positive for inferior goods. Intuitively, the income effect in this case represents that an increase in $p_y$ reduces the consumer’s real purchasing power (it makes him “poorer”), leading him to reduce his consumption of good x when this good is normal. As a consequence, an increase in $p_y$ reduces the consumption of good x due to the income effect (if good x is normal) or increases the consumption of good x due to the income effect (if good x is inferior). Overall, the total effect of an increase in $p_y$ is therefore ambiguous, and depends on the relative size of the substitution and income effects. The previous Slutsky equation can also be represented using elasticity terms, as described in previous chapters, as follows.

$$E_{x,p_y} = E_{x^c,p_y} - s_y E_{x,I}$$

This expression just confirms our previous intuition: the combined effect of an increase in $p_y$ (via the SE and IE) on the observable Walrasian demand, x(p,w), is ambiguous, i.e., elasticity $E_{x,p_y}$ can be positive, negative, or zero. Also, the impact that a change in $p_y$ has on purchasing power depends on how important good y is to the person.

33 Alternatively, this positive substitution effect represents that an increase in $p_y$ implies an increase in the consumption of good x. Intuitively, good x becomes relatively cheaper while good y becomes more expensive. As a consequence, the consumer increases (decreases) his consumption of the former (latter).

Example. In the following example we use the Walrasian and Hicksian demands associated with a Cobb-Douglas utility function, $u(x,y) = x^{0.5} y^{0.5}$, in order to show the substitution and income effects across different goods. In particular, the Walrasian and Hicksian demands for good x are:

$$x(p_x,p_y,I) = \frac{1}{2}\frac{I}{p_x}$$

$$x^c(p_x,p_y,V) = V\left(\frac{p_y}{p_x}\right)^{0.5}$$

First note that an increase in $p_y$ doesn't affect the Walrasian demand for good x but affects the Hicksian demand for good x (an increase in $p_y$ increases the Hicksian demand for x). Indeed,

$$\frac{\partial x(p_x,p_y,I)}{\partial p_y} = 0$$

$$\frac{\partial x^c(p_x,p_y,V)}{\partial p_y} = \frac{1}{2}\frac{V}{(p_x p_y)^{0.5}} > 0$$

We can now find the substitution effect of changing $p_y$. We do so by taking the derivative of the Hicksian demand with respect to $p_y$,

$$\frac{\partial x^c}{\partial p_y} = \frac{1}{2}\frac{V}{(p_x p_y)^{0.5}}, \text{ and plugging in } V = \frac{I}{2(p_x p_y)^{0.5}} \text{ gives us the SE: } \frac{\partial x^c}{\partial p_y} = \frac{1}{4}\frac{I}{p_x p_y}$$

In order to find the income effect associated with the price change, we operate as follows

$$-y\frac{\partial x}{\partial I} = -\left[\frac{1}{2}\frac{I}{p_y}\right]\left[\frac{1}{2}\frac{1}{p_x}\right] = -\frac{1}{4}\frac{I}{p_x p_y}$$

Therefore, we can now express the total effect of a change in $p_y$ on the Walrasian demand of good x as the combination of the substitution and income effects:

$$\frac{\partial x}{\partial p_y} = \frac{1}{4}\frac{I}{p_x p_y} - \frac{1}{4}\frac{I}{p_x p_y} = 0$$

Intuitively, this implies that the substitution and income effects completely offset each other.34

34 A common mistake is to interpret this result as saying that goods x and y cannot be substituted in consumption. That is, they must be consumed in fixed amounts. This statement is only true if the income effect is zero.

More generally, we can extend the Slutsky equation to the case of N>2 goods as follows: for any two goods i and j, a change in the price of good j produces

$$\frac{\partial x_i}{\partial p_j} = \left.\frac{\partial x_i}{\partial p_j}\right|_{U=\text{constant}} - x_j\frac{\partial x_i}{\partial I}$$

Therefore, the concepts of gross substitutes35 and complements36 include both the substitution and income effects. In particular, we say that two goods are gross substitutes if the total effect is positive, $\partial x / \partial p_y > 0$, whereas we refer to two goods as gross complements if the total effect is negative, $\partial x / \partial p_y < 0$.
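The cross-price decomposition for the Cobb-Douglas utility function $u(x,y) = x^{0.5}y^{0.5}$ can also be checked numerically. The sketch below uses the demand functions of that utility function, hypothetical values for prices and income, and central finite differences to approximate the derivatives. The function names are illustrative.

```python
import math


def walrasian_x(px, py, income):
    """Walrasian demand for x: I / (2*px)."""
    return income / (2.0 * px)


def walrasian_y(px, py, income):
    """Walrasian demand for y: I / (2*py)."""
    return income / (2.0 * py)


def hicksian_x(px, py, v):
    """Hicksian demand for x: V * sqrt(py/px)."""
    return v * math.sqrt(py / px)


def indirect_utility(px, py, income):
    """Indirect utility V = I / (2*sqrt(px*py))."""
    return income / (2.0 * math.sqrt(px * py))


def derivative(f, x, step=1e-5):
    """Central finite-difference derivative of a one-argument function."""
    return (f(x + step) - f(x - step)) / (2.0 * step)


px, py, income = 2.0, 4.0, 80.0  # hypothetical prices and income
v = indirect_utility(px, py, income)

substitution = derivative(lambda q: hicksian_x(px, q, v), py)
income_effect = -walrasian_y(px, py, income) * derivative(
    lambda m: walrasian_x(px, py, m), income
)
total = derivative(lambda q: walrasian_x(px, q, income), py)

print(substitution, income_effect, total)
```

With these values, $I/(4 p_x p_y) = 80/32 = 2.5$, so the substitution effect is approximately 2.5, the income effect approximately -2.5, and the total effect zero. In this example good x is therefore neither a gross substitute nor a gross complement of good y, even though the substitution effect alone is strictly positive.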

Asymmetry of the gross definitions

Importantly, the definitions of gross substitutes and complements are not necessarily symmetric. In particular, it is possible for good $x_1$ to be a substitute for good $x_2$, and simultaneously for good $x_2$ to be a complement of good $x_1$. Let us next see this potential asymmetry with one example.

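Example. Suppose that the utility function for two goods, x and y, is given by U(x,y) = ln x + y. Setting up the Lagrangian,

$$L = \ln x + y + \lambda(I - p_x x - p_y y)$$

We obtain the following first order conditions:

$$\frac{\partial L}{\partial x} = \frac{1}{x} - \lambda p_x = 0$$

$$\frac{\partial L}{\partial y} = 1 - \lambda p_y = 0$$

$$\frac{\partial L}{\partial \lambda} = I - p_x x - p_y y = 0$$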

Manipulating the first two equations we get \(p_xx=p_y\). Inserting this information into the budget constraint, we find the Walrasian demand for good y, \(p_yy=I-p_y\). We can observe that an increase in \(p_y\) causes a decline in spending on y. Therefore, we can conclude that the spending on good x must rise, since \(p_x\) and I are unchanged. That is, \(\partial x/\partial p_y>0\). Hence, good y is a gross substitute of good x.

However, spending on good y is independent of \(p_x\) (the demand \(y=I/p_y-1\) does not contain \(p_x\)). Therefore, \(\partial y/\partial p_x=0\), yielding that good x is neither a gross substitute nor a gross complement of good y. This shows the asymmetry between \(\partial x/\partial p_y>0\) and \(\partial y/\partial p_x=0\).
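The asymmetry in this example can be illustrated with a short numerical sketch. The demands below are the interior solutions derived from the first-order conditions above; the price and income values are hypothetical and are chosen so that income exceeds \(p_y\).

```python
def demand_x(px, py, income):
    # From the first two first-order conditions: p_x * x = p_y.
    return py / px


def demand_y(px, py, income):
    # From the budget constraint: p_y * y = income - p_y (interior solution).
    return (income - py) / py


# Hypothetical values for illustration only.
px, py, income = 2.0, 4.0, 100.0
step = 1e-6

# Cross-price effects computed by central finite differences.
dx_dpy = (demand_x(px, py + step, income) - demand_x(px, py - step, income)) / (2 * step)
dy_dpx = (demand_y(px + step, py, income) - demand_y(px - step, py, income)) / (2 * step)

print(dx_dpy)  # close to 1 / px, which is positive
print(dy_dpx)  # zero, since the demand for y does not depend on px
```

The first derivative is positive while the second is zero, which is the asymmetry of the gross definitions described in the example.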

This conclusion suggests that, depending on how we check for gross substitutability or complementarity between two goods, we may obtain different results. A natural question at this point is whether there is some other, more precise measure to check if two goods are complements or substitutes in consumption. We next present such a measure.

35 Two goods are substitutes if one good may replace the other in use. For example: tea & coffee, butter & margarine.

36 Two goods are complements if they are used together. For example: coffee & cream, fish & chips.



Net substitutes and Net complements

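The concept of net substitutes and net complements focuses solely on the substitution effect. In particular, two goods are regarded as net substitutes if 37

\[\frac{\partial x_i^c}{\partial p_j}=\left.\frac{\partial x_i}{\partial p_j}\right|_{U=\text{constant}}>0\]

while two goods are regarded as net complements if

\[\frac{\partial x_i^c}{\partial p_j}=\left.\frac{\partial x_i}{\partial p_j}\right|_{U=\text{constant}}<0\]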

Graphically, this condition looks only at the shape of the indifference curve. We are analyzing how an increase in the price of one good affects the demand for another good, when the consumer remains at the same indifference curve.

In contrast to our definition of gross substitutes and gross complements, this definition is symmetric across two goods. This means that once two goods are determined to be net substitutes or net complements, they stay that way no matter in which direction the definition is applied. Specifically,

\[\left.\frac{\partial x_i}{\partial p_j}\right|_{U}=\left.\frac{\partial x_j}{\partial p_i}\right|_{U}\]

In terms of the substitution matrix, this condition states that every element above the main diagonal coincides with the corresponding element below the main diagonal.

Note that the symmetry in the elements away from the main diagonal is easy to show. First recall that

\[h_k(p,u)=\frac{\partial e(p,u)}{\partial p_k}\]


37 \(x_i^c\) is just the \(h_i(p,u)\) in MWG.


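Hence we can express the substitution effect as

\[\frac{\partial h_k(p,u)}{\partial p_j}=\frac{\partial^2 e(p,u)}{\partial p_k\,\partial p_j}\]

And using Young’s theorem, we know that

\[\frac{\partial^2 e(p,u)}{\partial p_k\,\partial p_j}=\frac{\partial^2 e(p,u)}{\partial p_j\,\partial p_k}\quad\Longrightarrow\quad\frac{\partial h_k(p,u)}{\partial p_j}=\frac{\partial h_j(p,u)}{\partial p_k}\]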
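As an illustration of this symmetry, the sketch below builds the substitution matrix numerically for a three-good Cobb-Douglas consumer. The utility function, the exponents in `SHARES`, and the price and utility levels are hypothetical choices made only for this check; the Hicksian demands come from applying \(h_k=\partial e/\partial p_k\) to the corresponding expenditure function.

```python
import math

# Hypothetical Cobb-Douglas exponents (they sum to one).
SHARES = (0.2, 0.3, 0.5)


def expenditure(prices, utility):
    """Expenditure function for U = x1^a1 * x2^a2 * x3^a3 with a1 + a2 + a3 = 1."""
    return utility * math.prod((p / a) ** a for p, a in zip(prices, SHARES))


def hicksian(k, prices, utility):
    """Hicksian demand for good k, i.e. de/dp_k for the function above."""
    return SHARES[k] * expenditure(prices, utility) / prices[k]


def cross_effect(k, j, prices, utility, step=1e-6):
    """Central finite-difference approximation of dh_k/dp_j."""
    up, down = list(prices), list(prices)
    up[j] += step
    down[j] -= step
    return (hicksian(k, up, utility) - hicksian(k, down, utility)) / (2 * step)


prices, utility = (1.0, 2.0, 4.0), 10.0
n = len(prices)
substitution_matrix = [
    [cross_effect(k, j, prices, utility) for j in range(n)] for k in range(n)
]

for row in substitution_matrix:
    print([round(value, 4) for value in row])
```

The printed matrix is symmetric, its diagonal entries are negative, and (for this particular utility function) all off-diagonal entries are positive, so every pair of goods is a pair of net substitutes.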

Since our definition of net complements and net substitutes focuses solely on the substitution effect, two goods can be regarded as gross complements even if they are net substitutes. Let us see an example. The following figure illustrates a decrease in \(p_y\) that induces an increase in the consumption of good x due to the substitution effect (so goods x and y are regarded as net substitutes), but an overall reduction in the consumption of good x due to the total effect (so goods x and y are regarded as gross complements).

Figure #2.47

More generally, the fact that the MRS between two goods is diminishing indicates that the substitution effect must be negative. Indeed, if good y becomes cheaper, and the consumer remains at the same indifference curve, the budget line becomes steeper and as a consequence the consumer reduces his consumption of good x (since this good became relatively more expensive) but increases his purchases of good y (since this good is now relatively cheaper).38

38 The opposite explanation is applicable for the case in which the MRS is increasing (i.e., indifference curves are bowed out from the origin). In particular, if good y becomes cheaper, and the consumer remains at the same indifference curve, the budget line becomes steeper and as a consequence the consumer increases his consumption of good x but reduces his consumption of good y.



A note on the Euler’s theorem (and its relationship with homogeneity of degree k).

Let us briefly recall the definition of homogeneity. We say that the function \(f(x_1,x_2)\) is homogeneous of degree k if

\[f(tx_1,tx_2)=t^kf(x_1,x_2)\qquad(1)\]

Note that differentiating this expression with respect to \(x_1\), we obtain

\[\frac{\partial f(tx_1,tx_2)}{\partial (tx_1)}\,t=t^k\frac{\partial f(x_1,x_2)}{\partial x_1}\]

And rearranging, where \(f_1\) denotes the partial derivative of f with respect to its first argument,

\[f_1(tx_1,tx_2)=t^{k-1}f_1(x_1,x_2)\]

We can hence conclude that, if a function is homogeneous of degree k, its first-order derivative must be homogeneous of degree k-1. This is a useful result that we use below.

Differentiating both sides of expression (1) with respect to the proportionality factor t, we obtain

\[\frac{\partial f(tx_1,tx_2)}{\partial t}=f_1(tx_1,tx_2)\,x_1+f_2(tx_1,tx_2)\,x_2\quad\text{and}\quad\frac{\partial\left[t^kf(x_1,x_2)\right]}{\partial t}=k\,t^{k-1}f(x_1,x_2)\]

Therefore we have,

\[f_1(tx_1,tx_2)\,x_1+f_2(tx_1,tx_2)\,x_2=k\,t^{k-1}f(x_1,x_2)\]

And making the proportionality factor t=1, we obtain

\[f_1(x_1,x_2)\,x_1+f_2(x_1,x_2)\,x_2=k\,f(x_1,x_2)\]

where k is the degree of homogeneity of the original function \(f(x_1,x_2)\). First, note that if the original function is homogeneous of degree zero, i.e., k=0, then we obtain that the left-hand side of the previous expression is zero. This result is intuitive: it says that if a function is homogeneous of degree zero, increasing the proportionality factor t will not affect its value. Second, note that if the original function is homogeneous of degree one, i.e., k=1, then we obtain that the left-hand side of the previous expression is \(f(x_1,x_2)\). Intuitively, if we marginally increase the proportionality factor t, the function increases in proportion to its entire initial value \(f(x_1,x_2)\).
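Euler's theorem can be verified numerically for a specific function. The sketch below uses a hypothetical Cobb-Douglas form \(f(x_1,x_2)=x_1^{0.5}x_2^{1.5}\), which is homogeneous of degree k=2; the exponents, the evaluation point, and the helper `partial` are assumptions introduced only for this check.

```python
def f(x1, x2, a=0.5, b=1.5):
    """Hypothetical Cobb-Douglas form, homogeneous of degree k = a + b."""
    return x1 ** a * x2 ** b


def partial(func, args, index, step=1e-6):
    """Central finite-difference partial derivative with respect to args[index]."""
    up, down = list(args), list(args)
    up[index] += step
    down[index] -= step
    return (func(*up) - func(*down)) / (2 * step)


x1, x2, k, t = 3.0, 2.0, 2.0, 3.0

# Euler's theorem: f_1 * x1 + f_2 * x2 = k * f.
euler_lhs = partial(f, (x1, x2), 0) * x1 + partial(f, (x1, x2), 1) * x2
euler_rhs = k * f(x1, x2)

# The first derivative is homogeneous of degree k - 1.
scaled_derivative = partial(f, (t * x1, t * x2), 0)
expected_derivative = t ** (k - 1) * partial(f, (x1, x2), 0)

print(euler_lhs, euler_rhs)
print(scaled_derivative, expected_derivative)
```

Both pairs of printed numbers coincide up to finite-difference error, matching the two results derived above.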

We can apply this result to the Hicksian demand function. We know that the Hicksian demand is homogeneous of degree zero in prices, i.e., k=0. That is, \(h(tp_1,tp_2,\ldots,tp_n,u)=h(p_1,p_2,\ldots,p_n,u)\). Hence,

\[\frac{\partial x^c}{\partial p_1}p_1+\frac{\partial x^c}{\partial p_2}p_2+\cdots+\frac{\partial x^c}{\partial p_n}p_n=0\]


We can now continue with our previous discussion of net complementarity and substitutability between different goods. In particular, we want to understand whether substitutability or complementarity is more prevalent from an empirical point of view. This is of interest because whether two goods are net complements or net substitutes depends on each individual's preferences. However, using the Hicksian demand for good i, \(h_i(p_1,p_2,\ldots,p_n,u)\), we can apply Euler’s theorem (as discussed above), yielding

\[\left.\frac{\partial x_i}{\partial p_1}\right|_{U=\text{constant}}p_1+\left.\frac{\partial x_i}{\partial p_2}\right|_{U=\text{constant}}p_2+\cdots+\left.\frac{\partial x_i}{\partial p_n}\right|_{U=\text{constant}}p_n=0\]

Alternatively, dividing by \(x_i\), we can write the above expression in terms of compensated elasticities, as follows

\[E^c_{i1}+E^c_{i2}+\cdots+E^c_{in}=0\]

We know, however, that own-price substitution effects are non-positive (the elements in the main diagonal of the Slutsky matrix are non-positive). This implies that \(E^c_{ii}\le 0\). Therefore, the sum of the compensated cross-price elasticities for all other n-1 goods must be nonnegative, \(\sum_{j\neq i}E^c_{ij}\ge 0\), since the sum of the compensated elasticities for all n goods must be exactly equal to zero. Intuitively, this result implies that most goods must be net substitutes. This is usually referred to as Hicks' second law of demand.39

Composite commodities

When analyzing the consumer’s purchasing decision among n goods, we deal with potentially n different demand functions, with \(\frac{n(n+1)}{2}\) 40 different substitution effects. It is therefore often convenient to group goods into larger aggregates, for instance, food, clothing, or more generally, “all other goods” different from the good that we are analyzing. In order to do that we make use of the so-called composite commodity theorem.

Suppose that consumers choose among n goods, and that the demand for \(x_1\) depends on the prices of all other n-1 goods. If all of these prices move together, it may make sense to group them into a single composite commodity (y). Let \(p_2^0,\ldots,p_n^0\) represent the initial prices of these other commodities. Let's assume that they all vary together (so that the relative prices of \(x_2,x_3,\ldots,x_n\) do not change). We can now define the composite commodity y as the total expenditure on all other goods, \(x_2,x_3,\ldots,x_n\), at the initial prices. That is,

\[y=p_2^0x_2+p_3^0x_3+\cdots+p_n^0x_n\]

39 Note that some textbooks use the term Hicksian substitutes to refer to two goods that are net substitutes in consumption. Similarly, they refer to goods that are net complements as Hicksian complements.

40 There are \(N^2\) elements but, because of symmetry, only \(\frac{N^2}{2}\) are unrepeated, plus \(\frac{N}{2}\) from half of the main diagonal elements. Thus the number of different substitution effects is \(\frac{N^2}{2}+\frac{N}{2}=\frac{N(N+1)}{2}\).


The individual's budget constraint is therefore

\[I=p_1x_1+p_2^0x_2+\cdots+p_n^0x_n=p_1x_1+y\]

Moreover, if we assume that the prices of all other goods, \(p_2^0,\ldots,p_n^0\), change by the same factor (t>0), then the above budget constraint becomes

\[I=p_1x_1+tp_2^0x_2+\cdots+tp_n^0x_n=p_1x_1+ty\]

And therefore we can analyze the substitution effect associated with price changes, where the prices that can be changed in this context are only \(p_1\) and t. Hence, this theorem allows us to say that as long as \(p_2^0,\ldots,p_n^0\) move together, we can restrict our examination of demand choices to two types of goods: the good we are analyzing, \(x_1\), and everything else. As a consequence, we can represent our results in two-dimensional figures, with \(x_1\) on the horizontal axis and the composite commodity on the vertical axis.41

Example. Let us next examine an example of how to use the composite commodity theorem. Suppose that an individual receives utility from three goods: food (x), housing services (y), measured in hundreds of square feet, and household operations (z), measured by electricity use. Let us next assume a CES utility function:

\[U(x,y,z)=-\frac{1}{x}-\frac{1}{y}-\frac{1}{z}\]

We can find the Walrasian demand function for each of the three goods

\[x=\frac{I}{p_x+\sqrt{p_xp_y}+\sqrt{p_xp_z}}\]

\[y=\frac{I}{p_y+\sqrt{p_yp_x}+\sqrt{p_yp_z}}\]

\[z=\frac{I}{p_z+\sqrt{p_zp_x}+\sqrt{p_zp_y}}\]

If initially the consumer’s income is I=$100 and prices are \(p_x\)=$1, \(p_y\)=$4 and \(p_z\)=$1, we obtain that the quantity demanded for the three goods is x*=25, y*=12.5 and z*=25 units. Hence, $25 is spent on food and $75 is spent on housing-related needs.

If we assume that prices \(p_y\) and \(p_z\) move together, we can use the initial prices to find the composite commodity “housing” (h), as follows

\[h=4y+1z\]
41 As a disadvantage of the composite commodity theorem, however, note that the theorem makes no prediction about how the choices among \(x_2,x_3,\ldots,x_n\) behave, since it only focuses on the total expenditure on these other goods.


The expenditure on housing goods is evaluated at a price of $4 for good y and $1 for good z. Therefore, the initial quantity of good h is the total amount of money spent on housing ($75). Hence, \(p_h\)=$1 and

\[p_h=p_z=\frac{p_y}{4}\]

Plugging this information into the Walrasian demand for good x, we obtain

\[x=\frac{I}{p_x+\sqrt{p_xp_y}+\sqrt{p_xp_z}}=\frac{I}{p_x+\sqrt{4p_xp_h}+\sqrt{p_xp_h}}=\frac{I}{p_x+3\sqrt{p_xp_h}}\]

And if the consumer’s income is I=$100 and prices are \(p_x\)=$1 and \(p_h\)=$1, we obtain

\[x^*=\frac{100}{1+3\sqrt{1}}=\frac{100}{4}=25\]

Finally, in order to find the optimal amount of housing demanded by the consumer, h*, we just need to use the budget constraint

\[p_xx^*+p_hh^*=I\quad\Longrightarrow\quad\$1\cdot25+\$1\cdot h^*=\$100\quad\Longrightarrow\quad h^*=75\]

Therefore, the Walrasian demand for good x can be shown as a function of income, \(p_x\) and \(p_h\),

\[x=\frac{I}{p_x+3\sqrt{p_xp_h}}\]

And if the income is I=$100, \(p_x\)=$1, \(p_y\)=$4 and \(p_h\)=$1, we obtain x*=25 and h*=75.

Note that if \(p_y\) rises from $4 to $16 and \(p_z\) rises from $1 to $4 (but \(p_x\) remains at $1), \(p_h\) would also rise to \(p_h\)=$4. Indeed,

\[p_h=\frac{1}{4}p_y=\frac{1}{4}\cdot16=4\quad\text{and}\quad p_h=p_z=4\]

Due to this price change the Walrasian demand for good x would fall to

\[x^*=\frac{100}{1+3\sqrt{4}}=\frac{100}{7}\]

while housing purchases would be given by

\[p_hh^*=100-\frac{100}{7}=\frac{600}{7}\approx85.7\]

And since \(p_h\)=$4, then h*=85.7/4=21.43.

Finally, note that we could also find these results by plugging the information about income and the new prices, I=$100 and \(p_x\)=$1, \(p_y\)=$16 and \(p_z\)=$4, into the expressions of Walrasian demand for all three goods, obtaining

x*=100/7, y*=100/28 and z*=100/14

which implies that the amount of housing consumed is h*=4y*+1z*=21.43.
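The two routes to h* can be compared in a few lines of code. This is a minimal sketch of the example's arithmetic: the demand formulas are the ones given above for the CES utility function, while the function names and the `results` list are illustrative choices.

```python
from math import sqrt


def demand(own, other_a, other_b, income):
    """Walrasian demand for a good with price `own`, given the other two prices."""
    return income / (own + sqrt(own * other_a) + sqrt(own * other_b))


def demand_x_composite(px, ph, income):
    """Demand for x written in terms of the composite (housing) price p_h."""
    return income / (px + 3 * sqrt(px * ph))


income, px = 100.0, 1.0
results = []
for py, pz in [(4.0, 1.0), (16.0, 4.0)]:
    ph = pz  # p_h = p_z = p_y / 4, since p_y and p_z move together

    # Route 1: solve the three-good problem and aggregate housing as h = 4y + 1z.
    x = demand(px, py, pz, income)
    y = demand(py, px, pz, income)
    z = demand(pz, px, py, income)
    housing_direct = 4 * y + 1 * z

    # Route 2: solve the two-good problem in (x, h) and use the budget constraint.
    x_composite = demand_x_composite(px, ph, income)
    housing_composite = (income - px * x_composite) / ph

    results.append((x, housing_direct, x_composite, housing_composite))

for row in results:
    print([round(value, 2) for value in row])
```

For both price vectors, the three-good and the composite-commodity calculations return the same food demand and the same amount of housing.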

For more practice with this concept, problem 6.8 in Nicholson and Snyder provides a useful exercise.



Chapter 3 – Aggregate demand

Aggregate demand

In this chapter we move from individual demand, \(x_i(p,w_i)\), where \(w_i\) denotes individual i’s wealth level, to aggregate demand,

\[\sum_{i=1}^{I}x_i(p,w_i)\]

In particular, this chapter focuses on answering 3 main questions:

1. We know that the individual demand depends on prices and individual wealth, \(x_i(p,w_i)\). But when can we express aggregate demand as a function of prices and the aggregate wealth level? That is,

\[\sum_{i=1}^{I}x_i(p,w_i)=x\left(p,\sum_{i=1}^{I}w_i\right)\]

2. We know that individual demand satisfies the WARP as long as preferences are rational. But when does aggregate demand satisfy the WARP?

3. Finally, we know how to measure welfare changes associated with a price change in the case of individual demand (using, for instance, the CV, EV, and AV). But can we apply the same measures of welfare change in the case of aggregate demand?

First question: Aggregate demand and aggregate wealth

First, we want to understand under which conditions we can guarantee that the aggregate demand defined as \(x(p,w_1,w_2,\ldots,w_I)=\sum_{i=1}^{I}x_i(p,w_i)\) satisfies

\[\sum_{i=1}^{I}x_i(p,w_i)=x\left(p,\sum_{i=1}^{I}w_i\right)\]

That is, under which conditions aggregate demand depends only upon prices and the aggregate wealth level in the economy. The above condition is satisfied if, for any two distributions of wealth, \((w_1,w_2,\ldots,w_I)\) and \((w_1',w_2',\ldots,w_I')\) with the same aggregate wealth, \(\sum_{i=1}^{I}w_i=\sum_{i=1}^{I}w_i'\), we have that

\[\sum_{i=1}^{I}x_i(p,w_i)=\sum_{i=1}^{I}x_i(p,w_i')\]

Intuitively, a change in the wealth distribution across individuals that does not modify the aggregate wealth in the economy might change individual demands but will not modify the aggregate demand for a particular good. In order for the above condition to be satisfied, let us start with an initial wealth distribution \((w_1,w_2,\ldots,w_I)\) and then apply a differential change in wealth \((dw_1,dw_2,\ldots,dw_I)\) such that the aggregate wealth level is unchanged, i.e., \(\sum_{i=1}^{I}dw_i=0\). Note that, if aggregate demand is just a function of aggregate wealth, then we must have that

\[\sum_{i=1}^{I}\frac{\partial x_{ki}(p,w_i)}{\partial w_i}dw_i=0\]

That is, the wealth effects of different individuals are compensated in the aggregate. Or, more compactly,

\[\frac{\partial x_{ki}(p,w_i)}{\partial w_i}=\frac{\partial x_{kj}(p,w_j)}{\partial w_j}\]

for every good k, and for every two individuals i and j. Note that this result implies that the income effects for individuals i and j are equal in absolute value. That is, any redistribution of wealth between i and j will lead to

\[\frac{\partial x_{ki}(p,w_i)}{\partial w_i}dw_i+\frac{\partial x_{kj}(p,w_j)}{\partial w_j}dw_j=0\]

For example, if we redistribute wealth from subject i to subject j, we have

\[\frac{\partial x_{ki}(p,w_i)}{\partial w_i}dw_i<0<\frac{\partial x_{kj}(p,w_j)}{\partial w_j}dw_j,\quad\text{and also}\quad\left|\frac{\partial x_{ki}(p,w_i)}{\partial w_i}\right|=\left|\frac{\partial x_{kj}(p,w_j)}{\partial w_j}\right|\]

This indicates not only that the income effect for subject i (subject j) is negative (positive, respectively), but also that the absolute values of these two income effects exactly coincide across individuals. Summarizing, the above condition states that for any fixed price vector p, for any good k, and for any wealth level of any two individuals i and j, the wealth effect of a redistribution of wealth is the same across individuals. In other words, the wealth effects arising from the redistribution of wealth across consumers cancel out.

Graphically, this condition is saying that all consumers exhibit parallel, straight wealth expansion paths. First, note that straight wealth expansion paths imply that wealth effects do not depend upon the individual's wealth level. That is, a given increase in wealth produces a change in the consumption of good k that is independent of the individual’s wealth level. The following figure illustrates a straight wealth expansion path where an increase in wealth produces an increase in the consumption of goods 1 and 2 of the same size when the consumer’s wealth increases from w to w’ and from w’ to w’’. In contrast, a non-straight (curvy) wealth expansion path, such as the one depicted in the figure below, implies that a given increase in wealth might lead to changes in the consumption of good k that depend on the individual’s wealth level. This is illustrated in the figure, where the consumer regards good 1 as normal when his wealth level increases from w to w’, but considers good 1 an inferior good when his wealth is further increased from w’ to w’’.



Figure #3.1

Second, note that parallel wealth expansion paths across individuals imply that wealth effects must coincide across individuals. We illustrate this property in the figure below, where the wealth expansion paths for consumers 1 and 2 are parallel to each other, indicating that both individuals' demands for goods 1 and 2 change similarly as they become richer.

Figure #3.2

Recall that in previous lectures we have seen several examples of preference relations that imply straight and parallel wealth expansion paths: homothetic preferences (when they are identical across consumers), quasilinear preferences (with respect to the same good), etc. Hence, if both individuals exhibit either of these preference relations, we can guarantee that their wealth expansion paths will be straight and parallel to each other and, as a consequence, demand can be expressed as a function of market prices and aggregate wealth.



One interesting question at this stage is whether we can group all these types of preference relations (homothetic, quasilinear, etc.) as special cases of a particular type of preference. Indeed, there is such a general type of preference relation. In particular, a necessary and sufficient condition for consumers to exhibit parallel, straight wealth expansion paths is that every consumer’s indirect utility function can be expressed as

\[v_i(p,w_i)=a_i(p)+b(p)\,w_i\]

This indirect utility function is usually referred to as the Gorman form.1

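Let us next show that an indirect utility function that can be represented using the Gorman form satisfies the property that aggregate demand can be represented as a function of prices and aggregate wealth. First, using Roy's identity on \(v_i(p,w_i)\) we obtain individual i's demand for good j,

\[x_{ji}(p,w_i)=-\frac{\partial v_i(p,w_i)/\partial p_j}{\partial v_i(p,w_i)/\partial w_i}=-\frac{\frac{\partial a_i(p)}{\partial p_j}+\frac{\partial b(p)}{\partial p_j}w_i}{b(p)}=-\frac{\partial a_i(p)/\partial p_j}{b(p)}-\frac{\partial b(p)/\partial p_j}{b(p)}w_i\]

so that

\[x_{ji}(p,w_i)=A_{ji}(p)+B_j(p)\,w_i\]

where \(A_{ji}(p)\equiv-\frac{\partial a_i(p)/\partial p_j}{b(p)}\) and \(B_j(p)\equiv-\frac{\partial b(p)/\partial p_j}{b(p)}\).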

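And using the same approach in order to find the Walrasian demand of individual i for all goods, we have

\[x_i(p,w_i)=-\frac{\nabla_pv_i(p,w_i)}{\partial v_i(p,w_i)/\partial w_i}=-\frac{\nabla_pa_i(p)}{b(p)}-\frac{\nabla_pb(p)}{b(p)}w_i=A_i(p)+B(p)\,w_i\]

where \(A_i(p)\equiv-\nabla_pa_i(p)/b(p)\) and \(B(p)\equiv-\nabla_pb(p)/b(p)\).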

Therefore, summing over all I individuals in the economy, we obtain

\[\sum_{i=1}^{I}x_i(p,w_i)=\sum_{i=1}^{I}\left[A_i(p)+B(p)\,w_i\right]=\sum_{i=1}^{I}A_i(p)+B(p)\sum_{i=1}^{I}w_i=x\left(p,\sum_{i=1}^{I}w_i\right)\]

1 The Gorman utility function presents some interesting features. First, note that an increase in the individual's wealth level produces the same increase in utility level, b(p), across all individuals. Nonetheless, this utility function is not symmetric for all i, since it allows asymmetries in the first term, \(a_i(p)\). Finally, note that in the case of quasilinear preferences, using \(b(p)=1/p_k\), we can represent the utility function of an individual with quasilinear preferences using the Gorman form as follows, \(v_i(p,w_i)=a_i(p)+\frac{1}{p_k}w_i\). Practice: take some of the examples we have seen about quasilinear preferences, find the indirect utility function, and show that it can be expressed in its Gorman form representation.


Hence, we can indeed express aggregate demand as a function of prices and aggregate wealth.
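The role of the Gorman form can be illustrated with a small numerical sketch. It compares two hypothetical two-consumer economies under a redistribution that keeps aggregate wealth constant: one with quasilinear utilities \(U_i=c_i\ln x+y\) (which admit a Gorman form with \(b(p)=1/p_y\)), and one with Cobb-Douglas utilities with different exponents (straight but non-parallel wealth expansion paths). The taste parameters, prices, and wealth levels are assumptions made only for illustration.

```python
def quasilinear_demand(c, px, py, wealth):
    """Demand (x, y) for U = c*ln(x) + y, assuming an interior solution (wealth > c*py)."""
    return (c * py / px, wealth / py - c)


def cobb_douglas_demand(share, px, py, wealth):
    """Demand (x, y) for U = x^share * y^(1 - share)."""
    return (share * wealth / px, (1 - share) * wealth / py)


def aggregate(demand, params, px, py, wealths):
    """Sum individual bundles good by good."""
    bundles = [demand(a, px, py, w) for a, w in zip(params, wealths)]
    return tuple(sum(good) for good in zip(*bundles))


px, py = 2.0, 1.0
before = [60.0, 40.0]  # initial wealth distribution
after = [30.0, 70.0]   # redistribution with the same aggregate wealth

gorman_before = aggregate(quasilinear_demand, [1.0, 3.0], px, py, before)
gorman_after = aggregate(quasilinear_demand, [1.0, 3.0], px, py, after)

other_before = aggregate(cobb_douglas_demand, [0.2, 0.8], px, py, before)
other_after = aggregate(cobb_douglas_demand, [0.2, 0.8], px, py, after)

print(gorman_before, gorman_after)  # identical aggregate bundles
print(other_before, other_after)    # aggregate bundles differ
```

In the quasilinear economy the redistribution leaves aggregate demand unchanged, whereas in the second economy aggregate demand depends on who holds the wealth.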

We conclude that we can represent aggregate demand as a function of prices and aggregate wealth when preferences can be represented with a Gorman form indirect utility function. This condition, however, might be somewhat restrictive. We wonder, hence, whether we can obtain the same results using weaker conditions. The literature has shown that we can, following two different approaches. First, rather than assuming that aggregate demand depends only on total (aggregate) wealth, we could allow aggregate demand to depend on a wider set of variables, e.g., the average wealth level, the variance of the wealth distribution, etc., as shown in Deaton and Muellbauer (1980). The second approach restricts the type of admissible wealth distributions. Indeed, in our previous analysis we were allowing for any type of wealth distribution. However, the distribution of wealth among individuals is usually a direct consequence of the labor market (wage distribution), stock ownership, governmental programs, taxes, etc.2,3

Second question: aggregate demand and the WARP

In this section we seek to understand under which conditions aggregate demand satisfies the WARP. For simplicity, let us use wealth distribution rules, \(w_i(p,w)\), i.e., functions that assign a wealth level to every individual i, depending on the price level p and the aggregate wealth in the economy w. In particular, we consider only wealth distribution rules that are independent of prices, and assign a constant fraction of the aggregate wealth to every individual,4

\[w_i(p,w)=\alpha_iw,\quad\text{where }\alpha_i\ge0\text{ and }\sum_{i=1}^{I}\alpha_i=1\]

We can then express the aggregate demand as a function of the wealth distribution rule. That is,

2 One particular example of this approach uses the so-called wealth distribution rule, which considers a function wi(p,w) and assigns a wealth level to every individual i, depending on the price level and the aggregate wealth in the economy w.

3 Example: Having missed an opportunity with the recent vampire craze created by the Twilight series, Mattel has offered a new product targeted at younger kids than their competitors: the Vampire Teddy Bear. The Vampire Teddy Bear is a small, fluffy bear with two plastic fangs that (safely, according to Mattel) drill into the child’s neck. Mattel, is planning a blitz marketing campaign emphasizing that the more bears you buy for your child the longer they will stay silent (because of their extreme satisfaction with the bear). In fact, they provide an equation in the commercial (the marketing director is on vacation and the chief economist has been pitching in): Minutes of Silence = Number of Bears*20. Assuming this message penetrates the parenting market equally and there isn’t a diminishing return on silence from a child, should Mattel be concerned with income distribution? No, as long as the total income does not change, Mattel will sell exactly the same number of bears. Why? Because we assume all the parents want silence from their child, any demand lost from one parent with lower income is gained by a richer parent who wants more minutes of silence.

4 Note that this wealth distribution rule allows for different amounts of wealth to be distributed to every individual, i.e., αi being different from αj for any two subjects i and j, or to coincide across all individuals, i.e., αi=αj.

\[x(p,w)=\sum_{i=1}^{I}x_i\left(p,w_i(p,w)\right)=\sum_{i=1}^{I}x_i(p,\alpha_iw)\]

We can now describe under which conditions the aggregate demand function satisfies the WARP. In particular, we extend the definition of WARP that we discussed in the chapter on Walrasian demand to the aggregate demand function, as follows: aggregate demand x(p,w) satisfies the WARP if, whenever

1. the new bundle that consumers choose under p’ and w’, x(p’,w’), is affordable under the old prices and wealth, i.e., p·x(p’,w’)≤w, and differs from the old bundle x(p,w), then

2. the old bundle that consumers choose under p and w, x(p,w), is NOT affordable under the new prices and wealth, i.e., p’·x(p,w)>w’.

One interesting property of this definition of the WARP at the aggregate level is that individual Walrasian demand might satisfy WARP at the individual level but the aggregate demand might violate WARP at the aggregate level. Let us illustrate this possibility with an example. For simplicity, consider that the wealth distribution rule assigns the same share of the wealth to both individuals 1 and 2, i.e., each receives w/2. The following figure represents individual 1’s Walrasian demand. It satisfies WARP since the new bundle x(p’,w/2) is affordable under the old budget set Bp,w/2, and the old bundle x(p,w/2) is not affordable under the new budget set Bp’,w/2.

Figure #3.3

The figure below illustrates individual 2’s Walrasian demand. It also satisfies WARP given that the new bundle is unaffordable under the old budget set5 \(B_{p,w/2}\).

Figure #3.4

We can now aggregate the Walrasian demand for individuals 1 and 2. For completeness, the following figure illustrates individual and aggregate demands. First, note that, for the old budget set \(B_{p,w/2}\), the average consumption across both individuals, \(\tfrac{1}{2}x(p,w)\), lies at the midpoint of the segment connecting individual 1’s and 2’s demands on the old budget line. (A similar argument is applicable for the new budget line \(B_{p',w/2}\) and the midpoint \(\tfrac{1}{2}x(p',w)\).) Using these midpoints, we obtain

\[p\cdot\tfrac{1}{2}x(p',w)<\tfrac{w}{2}\]

since bundle B is below the old budget line \(B_{p,w/2}\), but

\[p'\cdot\tfrac{1}{2}x(p,w)<\tfrac{w}{2}\]

given that bundle A is also below the new budget line \(B_{p',w/2}\). Multiplying both sides of these expressions by 2, we obtain a violation of the WARP at the aggregate level:

\[p\cdot x(p',w)<w\quad\text{and}\quad p'\cdot x(p,w)<w\]
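To make this possibility concrete, the sketch below checks the WARP on a pair of observations for two individuals and for their aggregate. All prices, wealth levels, and bundles are hypothetical numbers chosen so that each consumer spends exactly w/2 in both situations; the helper `violates_warp` is an illustrative implementation of the definition given above.

```python
def dot(a, b):
    """Inner product of a price vector and a bundle."""
    return sum(ai * bi for ai, bi in zip(a, b))


def violates_warp(p_old, w_old, x_old, p_new, w_new, x_new):
    """True if two distinct bundles are each affordable in the other's budget set."""
    if x_old == x_new:
        return False
    return dot(p_old, x_new) <= w_old and dot(p_new, x_old) <= w_new


# Hypothetical data: aggregate wealth w = 8, so each individual receives w/2 = 4.
p_old, p_new = (2.0, 1.0), (1.0, 2.0)
individuals = {
    1: {"old": (1.0, 2.0), "new": (1.0, 1.5)},
    2: {"old": (1.8, 0.4), "new": (1.6, 1.2)},
}

individual_violations = {
    i: violates_warp(p_old, 4.0, d["old"], p_new, 4.0, d["new"])
    for i, d in individuals.items()
}

# Aggregate demand is the good-by-good sum of individual demands.
x_old = tuple(sum(g) for g in zip(*(d["old"] for d in individuals.values())))
x_new = tuple(sum(g) for g in zip(*(d["new"] for d in individuals.values())))
aggregate_violation = violates_warp(p_old, 8.0, x_old, p_new, 8.0, x_new)

print(individual_violations)  # no individual violates the WARP
print(aggregate_violation)    # the aggregate does
```

Individual 1's new bundle is affordable at the old prices while the old one is not affordable at the new prices; individual 2's new bundle is unaffordable at the old prices. Yet both aggregate bundles cost less than w under the other price vector.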

But how can it be that the WARP is satisfied at the individual level but violated at the aggregate level? First, note that the WARP at the individual level is equivalent to the compensated law of demand (CLD):

\[(p'-p)\cdot\left[x_i(p',w_i')-x_i(p,w_i)\right]\le0\]

5 Recall that if, when applying the definition of the WARP the premise is false, then WARP cannot be violated.

where \(w_i'=p'\cdot x_i(p,w_i)\) is the wealth compensation we must make to the consumer so that he can still afford his old bundle \(x_i(p,w_i)\) at the new prices p’, i.e., Slutsky wealth compensation. And, if the change in prices is compensated for all the individuals, \(w_i'=\alpha_iw'\), and thus \(\alpha_iw'=p'\cdot x_i(p,\alpha_iw)\), then we have

\[(p'-p)\cdot\left[x_i(p',\alpha_iw')-x_i(p,\alpha_iw)\right]\le0\]

for every individual i. Adding over all individuals, we have that in the aggregate

\[(p'-p)\cdot\left[x(p',w')-x(p,w)\right]\le0\]

which implies that the compensated law of demand is satisfied in the aggregate and, as a consequence, the WARP is also satisfied in the aggregate.

Price changes, however, might not be accompanied by a wealth compensation for all individuals, i.e., \(\alpha_iw'\) might differ from \(p'\cdot x_i(p,\alpha_iw)\). In such a case we would have that

\[(p'-p)\cdot\left[x_i(p',\alpha_iw')-x_i(p,\alpha_iw)\right]\le0\]

doesn't hold for all individuals. As a result, the compensated law of demand

\[(p'-p)\cdot\left[x(p',w')-x(p,w)\right]\le0\]

might not hold for aggregate demand, which implies that the WARP might not be necessarily satisfied either. The following figure summarizes our results:

Figure #3.5

Remark about the uncompensated law of demand at the aggregate level and the WARP: note that when a price change is not accompanied with a wealth compensation, we might have that for some individual i,

a. If the income effect is negative, then it “reinforces” the substitution effect and the uncompensated law of demand holds.

b. If the income effect is positive, then it goes in the opposite direction to the substitution effect. If the income effect only partially offsets the substitution effect, the uncompensated law of demand still holds. However, if the income effect more than offsets the substitution effect, then the uncompensated law of demand doesn't hold for this individual i.

Importantly, when the uncompensated law of demand doesn't hold for some individuals we might have that the uncompensated law of demand doesn't hold in the aggregate, and therefore, the WARP is not necessarily satisfied.

The possibility of having the WARP satisfied at the individual but not at the aggregate level raises the question of whether we can impose some minimal conditions on the preference relations that guarantee that the WARP is satisfied for the aggregate Walrasian demand. The following proposition shows that we can.

Proposition. If every consumer’s Walrasian demand function \(x_i(p,w_i)\) satisfies the uncompensated law of demand (ULD), then aggregate demand \(x(p,w)=\sum_{i=1}^{I}x_i(p,\alpha_iw)\) satisfies the uncompensated law of demand and it also satisfies the WARP.

Proof. If \(x_i(p,w_i)\neq x_i(p',w_i)\) for some i, then adding

\[(p'-p)\cdot\left[x_i(p',w_i)-x_i(p,w_i)\right]<0\]

over all i, we have

\[(p'-p)\cdot\left[x(p',w)-x(p,w)\right]<0\]

Let us now check WARP:

1) Take any (p,w) and (p',w') with \(x(p,w)\neq x(p',w')\), such that

\[p\cdot x(p',w')\le w\]

i.e., the new bundle is affordable at the old (p,w).

2) Define \(p''=\frac{w}{w'}p'\).

3) By homogeneity of degree zero of x(p,w), we have

\[x(p'',w)=x\left(\tfrac{w}{w'}p',\tfrac{w}{w'}w'\right)=x(p',w')\]

Hence \(x(p'',w)=x(p',w')\).

4) a) From ULD at the aggregate level, we know that

\[(p''-p)\cdot\left[x(p'',w)-x(p,w)\right]<0\]

b) From the equality in (3) and step (1) [affordability] we have:

\[p\cdot x(p'',w)=p\cdot x(p',w')\le w\]

c) From Walras' law, we know that \(p''\cdot x(p'',w)=w\) and \(p\cdot x(p,w)=w\).

5) From step 4(a), we can conclude:

\[\underbrace{p''\cdot x(p'',w)}_{w}-p''\cdot x(p,w)-p\cdot x(p'',w)+\underbrace{p\cdot x(p,w)}_{w}<0\]

so that, from step 4(b),

\[2w<p''\cdot x(p,w)+p\cdot x(p'',w)\le p''\cdot x(p,w)+w\]

6) Hence,

\[p''\cdot x(p,w)>w\]

which implies:

\[\frac{w}{w'}\,p'\cdot x(p,w)>w\quad\Longrightarrow\quad p'\cdot x(p,w)>w'\]

That is, the bundle x(p,w) was unaffordable at the new prices and wealth (p',w'), and the WARP is satisfied.

Intuitively, this proposition states that if the uncompensated law of demand property is satisfied at the individual level then everything will work out nicely at the aggregate level: uncompensated law of demand will hold at the aggregate level and the WARP will be satisfied as well.

Remark on the uncompensated law of demand and NSD: recall that if the derivative of the Walrasian demand with respect to prices, \(D_px_i(p,w_i)\), is negative semidefinite (NSD), then the elements of the main diagonal of \(D_px_i(p,w_i)\) must be weakly negative. Intuitively, own-price effects must be weakly negative. Therefore, the uncompensated law of demand holds. We can hence conclude that if \(D_px_i(p,w_i)\) is NSD then the uncompensated law of demand holds for \(x_i(p,w_i)\).


One question at this point is whether assuming that the uncompensated law of demand holds across all consumers at the individual level is a very restrictive assumption, i.e.,

6 In homework #4 you are asked to show that the converse relationship is not necessarily true.

\[(p'-p)\cdot\left[x_i(p',w_i)-x_i(p,w_i)\right]\le0\]

Let us next see one example of individual preference relations for which the uncompensated law of demand holds at the individual level (and, as a consequence, at the aggregate level as well).

If a preference relation is homothetic, then this individual's Walrasian demand satisfies the uncompensated law of demand (while the converse is not necessarily true).


Slutsky equation:

$$S_i(p,w_i) = D_p x_i(p,w_i) + D_w x_i(p,w_i)\,x_i(p,w_i)^T$$

and for homothetic preference relations, $\frac{\partial x_{ki}(p,w_i)}{\partial w_i}\frac{w_i}{x_{ki}(p,w_i)}=1$, or alternatively, $\frac{\partial x_{ki}(p,w_i)}{\partial w_i}=\frac{x_{ki}(p,w_i)}{w_i}$, which we can write as

$$D_w x_i(p,w_i) = \frac{1}{w_i}\,x_i(p,w_i).$$

Plugging this expression into the Slutsky equation and rearranging,

$$D_p x_i(p,w_i) = S_i(p,w_i) - \frac{1}{w_i}\,x_i(p,w_i)\,x_i(p,w_i)^T$$

Now we pre- and post-multiply all elements by $dp$,

$$dp\cdot D_p x_i(p,w_i)\,dp = \underbrace{dp\cdot S_i(p,w_i)\,dp}_{<0 \text{ if } dp\neq\alpha p,\;\; =0 \text{ if } dp=\alpha p} - \underbrace{\frac{1}{w_i}\left[dp\cdot x_i(p,w_i)\right]\left[x_i(p,w_i)\cdot dp\right]}_{>0 \text{ if } dp\cdot x_i\neq 0,\;\; =0 \text{ if } dp\cdot x_i=0}$$

Either way, $dp\cdot D_p x_i(p,w_i)\,dp<0$, except when zero consumption ($x_i(p,w_i)\cdot dp=0$) and the change in prices is proportional to the initial price level, i.e., $dp=\alpha p$. $D_p x_i(p,w_i)$ is then negative semidefinite, and we already know that the ULD holds whenever $D_p x_i(p,w_i)$ is negative semidefinite. Hence, $x_i(p,w_i)$ satisfies ULD.

[Note: we just showed that homotheticity in preferences implies ULD.]
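As a numerical illustration of the result above, the following sketch checks the uncompensated law of demand for one particular homothetic preference relation. The Cobb-Douglas demand $x_k = a_k w/p_k$, the expenditure shares, the wealth level and the price range are all assumptions chosen for this example; they do not come from the notes.

```python
import random

def cobb_douglas_demand(p, w, shares):
    """Walrasian demand x_k = a_k * w / p_k (Cobb-Douglas, hence homothetic)."""
    return [a * w / pk for a, pk in zip(shares, p)]

def uld_term(p_old, p_new, w, shares):
    """(p' - p) . [x(p', w) - x(p, w)], which the ULD requires to be <= 0."""
    x_old = cobb_douglas_demand(p_old, w, shares)
    x_new = cobb_douglas_demand(p_new, w, shares)
    return sum((pn - po) * (xn - xo)
               for po, pn, xo, xn in zip(p_old, p_new, x_old, x_new))

random.seed(0)
shares = [0.3, 0.7]   # assumed expenditure shares
wealth = 10.0         # assumed wealth level

# Largest value of the ULD term over many random price pairs.
worst_case = max(
    uld_term([random.uniform(0.5, 5), random.uniform(0.5, 5)],
             [random.uniform(0.5, 5), random.uniform(0.5, 5)],
             wealth, shares)
    for _ in range(1000)
)
print("ULD holds on all sampled price pairs:", worst_case <= 0)
```

A random search of this kind can only fail to find a violation; it is a sanity check, not a substitute for the proof.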


Recall that the homothetic preferences we analyzed above are just one example of a preference relation that satisfies the uncompensated law of demand at the individual level (and therefore it also satisfies WARP at the aggregate level). Can we identify more general conditions under which the uncompensated law of demand holds? First, recall that

$$\frac{\partial x_{ki}(p,w_i)}{\partial p_k} = \frac{\partial h_{ki}(p,u)}{\partial p_k} - \frac{\partial x_{ki}(p,w_i)}{\partial w_i}\,x_{ki}(p,w_i), \quad\text{where } \frac{\partial h_{ki}(p,u)}{\partial p_k}\le 0.$$

For normal goods (which have a positive income effect), the income effect reinforces the substitution effect, and therefore the total effect associated with a price change is negative. In other words, the uncompensated law of demand holds. For inferior goods, the income effect is negative, which implies that we can have either of two cases: (1) the absolute value of the substitution effect is still larger than that of the income effect and, as a consequence, the total effect is still negative. In this case the uncompensated law of demand still holds; (2) the absolute value of the substitution effect is smaller than that of the income effect and, as a result, the total effect is positive, implying that the uncompensated law of demand is violated (intuitively, this good is a so-called Giffen good). We can therefore conclude that the uncompensated law of demand is satisfied at the individual level as long as consumer i doesn't regard good k as a Giffen good.
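The three cases just described can be sketched with a few lines of arithmetic. The function below simply evaluates the Slutsky decomposition (total effect = substitution effect minus income derivative times quantity); the numerical values are hypothetical and chosen only to produce one example of each case.

```python
def total_effect(substitution_effect, income_derivative, quantity):
    """Own-price Slutsky equation: dx/dp = dh/dp - (dx/dw) * x."""
    return substitution_effect - income_derivative * quantity

# Hypothetical values: substitution effect of -2 and a quantity of 4 in all cases.
cases = {
    "normal good": total_effect(-2.0, 0.5, 4.0),                # income effect reinforces
    "inferior, not Giffen": total_effect(-2.0, -0.25, 4.0),     # partially offsets
    "Giffen good": total_effect(-2.0, -1.0, 4.0),               # more than offsets
}
for label, effect in cases.items():
    verdict = "ULD holds" if effect <= 0 else "ULD violated"
    print(f"{label}: total effect = {effect:+.2f} ({verdict})")
```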

Hence, at the aggregate level, the uncompensated law of demand is satisfied as long as the positive total effect (TE>0, associated with those consumers who might regard the good as a Giffen good), $\frac{\partial x_{ki}(p,w_i)}{\partial p_k}>0$, does not completely offset the negative total effect (TE<0, associated with usual goods) from the rest of consumers, $\frac{\partial x_{ki}(p,w_i)}{\partial p_k}<0$. Therefore, we can conclude that assuming the uncompensated law of demand at the individual level doesn't seem a very restrictive assumption (since Giffen goods are rare), and it constitutes an even milder assumption at the aggregate level.

In all our previous discussion we showed that if the uncompensated law of demand is satisfied at the individual level then it must be satisfied at the aggregate level as well. The converse, however, is not necessarily true. Let's see one example.

Example. Suppose that all consumers have identical preferences, with individual demand functions $\tilde{x}(p,w)$ (denoted without a subscript because all individuals have the same individual demand function), and that individual wealth is uniformly distributed on the interval $[0,\bar{w}]$ (with a continuum of consumers). Then, the aggregate demand function

$$x(p) = \int_0^{\bar{w}} \tilde{x}(p,w)\,dw$$

satisfies the uncompensated law of demand.7,8

7 In homework #5 you are asked to show that this aggregate Walrasian demand satisfies the uncompensated law of demand at the aggregate level even if it is not satisfied at the individual level (you can also read about this in MWG, page 113).

8 Example: After a congressional investigation, it is revealed that the silence equation provided by Mattel for the Vampire Teddy Bear is inaccurate. The equation only holds up to five bears. At that point, the effect on the child diminishes and by ten bears, no additional minutes of silence are obtained. Other than the inevitable lawsuits, should this concern Mattel? Yes. Mattel assumed that they could base pricing and production decisions on aggregate demand. Aggregation depends on the uncompensated law of demand being satisfied which, in turn, depends on homothetic preferences. Because of the diminishing nature of the minutes of silence, the preferences are not homothetic. Intuitively this makes sense. If you already purchased ten bears for your child and someone shifted



Aggregate demand and the representative consumer (3rd question of the chapter)

In this section we analyze under which conditions we can use the welfare measures from previous chapters (CV, EV, and AV) to evaluate aggregate welfare. In other words, we seek to answer the question: when can we treat aggregate demand as if it were generated by a fictional representative consumer whose preferences can be used as a measure of aggregate social welfare? We begin the section with two definitions of the representative consumer: one from a positive (or behavioral) perspective and the other from a normative perspective.

Positive or behavioral definition: a positive representative consumer exists if there is a rational preference relation ≿ on R^L_+ such that the aggregate demand x(p,w) is precisely the Walrasian demand function generated by this preference relation. That is, the bundle chosen by the aggregate demand function x(p,w) is strictly preferred to any other affordable bundle, i.e., x(p,w) is strictly preferred to any bundle x such that p·x≤w, where x is different from x(p,w). Or, in other words, x(p,w) is the argmax of the representative consumer's UMP when facing budget set p·x≤w.

Normative definition:

In order to be able to assign welfare significance to this fictional individual’s demand function we must first define what we mean by social welfare. Let’s first introduce the concept of the social welfare function (SWF).

Bergson-Samuelson SWF: a social welfare function (SWF) is a function W:RI→R that assigns a utility value to each possible vector of individual utility levels (u1,u2,…,uI) for each of the I consumers in the economy. We assume that the social welfare function W(u1,u2,…,uI) is increasing, concave and differentiable in each argument. The figure below illustrates an example of a social welfare function. Note that W(.) is increasing in u1 and experiences an upward shift when u2 increases.

Figure #3.6

income from some poorer individual to you, you would not buy an eleventh bear. However, the change in income for the poorer individual would result in fewer bears being purchased.



Note that a utilitarian social welfare function such as W(u1,u2)=u1+u2 satisfies the above properties, and so does a social welfare function such as W(u1,u2)=au1+bu2, where a,b>0 denote the weights that society assigns to the welfare of individuals 1 and 2, respectively. More generally, note that social welfare functions do not need to be additive. Indeed, a social welfare function such as W(u1,u2)=(au1·bu2)^(1/2) also satisfies the above assumptions.9 Recall also the Rawlsian social welfare function W(u1,u2)=min{au1,bu2}, where a,b>0. Intuitively, this function represents that society is only as well off as the individual in the worst position. Finally, we assume that there is some process (central authority, benevolent planner, IRS…) that, for any prices and aggregate wealth (p,w), gives the optimal wealth distribution among individuals by solving:

$$\max_{w_1,w_2,\dots,w_I}\; W\big(v_1(p,w_1),\dots,v_I(p,w_I)\big) \quad\text{s.t. } \sum_{i=1}^{I} w_i \le w$$

Intuitively, the social planner first distributes a wealth level wi to every individual i, and afterwards every individual chooses his optimal consumption bundle xi(p,wi) by independently solving his UMP, reaching an associated utility level vi(p,wi). (For this reason, the social planner considers in his maximization problem the maximum utility level that every individual will achieve when independently solving his own UMP, vi(p,wi).) The optimal value of the previous social planner’s maximization problem defines a value function v(p,w), which is referred to as the social indirect utility function.
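To make the planner's problem concrete, here is a minimal sketch with two consumers. It assumes, purely for illustration, that each indirect utility is $v_i(p,w_i)=\ln w_i$ (prices are held fixed, so price terms are dropped as constants) and that the SWF is the weighted sum $W=a\,u_1+b\,u_2$ mentioned above. Since W is increasing, the wealth constraint binds and $w_2=w-w_1$; under these assumptions the optimum is $w_1=\frac{a}{a+b}w$, which a simple grid search recovers.

```python
import math

def social_welfare(w1, total_wealth, a, b):
    """W(v1, v2) = a*ln(w1) + b*ln(w2), with the wealth constraint binding."""
    return a * math.log(w1) + b * math.log(total_wealth - w1)

def optimal_split(total_wealth, a, b, steps=10_000):
    """Grid search over interior wealth levels for consumer 1."""
    candidates = (total_wealth * k / steps for k in range(1, steps))
    return max(candidates, key=lambda w1: social_welfare(w1, total_wealth, a, b))

total_wealth, a, b = 100.0, 2.0, 1.0          # hypothetical values
w1_star = optimal_split(total_wealth, a, b)
print("grid-search w1:", w1_star)
print("closed form  w1:", a / (a + b) * total_wealth)
```

The maximized value of `social_welfare` at `w1_star` plays the role of the social indirect utility function v(p,w) in this toy setting.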

We can relate the results from our previous maximization problem with the concept of a representative consumer. Suppose that for every (p,w), the wealth distribution (w1(p,w), w2(p,w), …, wI(p,w)) solves the above maximization problem. Then the resulting social indirect utility function v(p,w) is an indirect utility function of a positive representative consumer for the aggregate demand function


$$x(p,w) = \sum_{i=1}^{I} x_i\big(p, w_i(p,w)\big)$$

That is, from the above maximization problem we obtain a positive representative consumer for the aggregate demand function in which every consumer’s wealth is the argmax of the above problem, i.e., a wealth distribution among individuals that maximizes social welfare. We can now define the normative representative consumer.10

9 Let’s consider functions that cannot be regarded as social welfare functions: W(u1,u2)=au1-bu2 and W(u1,u2)=u1/u2, since they are both decreasing in u2; and W(u1,u2)=(u1)^2+u2, since it is convex in u1. (Note that in the last function, individual 1 raises social welfare so much that it could be convenient to make u2=0.)

10 For some examples, see MWG pp. 119-120.



Proposition 4.D.3.

The positive representative consumer (with preferences ≿) for the aggregate demand function $x(p,w)=\sum_i x_i(p,w_i(p,w))$ is a normative representative consumer (relative to the SWF $W(\cdot)$) if, for every pair $(p,w)$, the distribution of wealth among consumers $(w_1(p,w),\dots,w_I(p,w))$ solves problem 4.D.1, and therefore the value function $v(p,w)$ from problem 4.D.1 is the social indirect utility function.

The regularizing effects of aggregation

The theory of aggregate demand presents two advantages: first, most of the data is in aggregate terms, making the theory extremely applicable. Second, the use of aggregate (rather than individual) demand has “regularizing” effects. In particular, the average (per consumer) demand tends to be more continuous, as a function of prices, than the individual demands separately. For example, we know that if individual preferences are strictly convex then individual demand functions are continuous. In this case, aggregate demand is continuous as well. But, what if individual demands are not continuous? In this case, aggregate demand can be (nearly) continuous. Let us elaborate on this property. The following figure illustrates the case in which preferences are strictly convex and, as a consequence, individual demand is continuous.

Figure #3.7



If preferences are concave, however, discontinuities might arise, as the next figure depicts. Specifically, at the initial price ratio the consumer chooses bundle A, spending all his income on good 2. When the price of good 1 decreases sufficiently, the consumer might find it optimal to switch all his consumption towards good 1, since by doing so he can reach a higher indifference curve I2. The bottom figure illustrates the demand curve at different price levels: when the price of good 1 slightly decreases, the consumer still uses all his income to buy good 2 alone. When the price of good 1 decreases below a certain cutoff level, however, the consumer stops buying good 2 and starts using all his income on purchases of good 1.11

Figure #3.8

In cases such as the one analyzed above, the only requirement we will impose in order to guarantee that the aggregate demand is continuous in prices is that the preferences of all individuals are not too concentrated around the prices for which individual demands are discontinuous. Let us examine an example.

11 Note that a similar discussion is applicable for the case of linear preferences (when the consumer regards two goods as perfectly substitutable).

Example 4.AA.1. Consider two goods, and consumers with quasilinear preferences with respect to the first good (we will treat the second good as the numeraire). For simplicity, we assume that the first good is only available in integer amounts, and consumers have no wish for more than one unit of it (e.g., appliances, cars, etc.). We can hence normalize consumer i’s utility to be zero when he consumes zero amounts of the first good, v1i=0 when x1=0, and positive otherwise, v1i>0. Thus, v1i represents the utility of holding positive amounts of good 1 in terms of good 2 (the numeraire). Consumer i’s demand for the first good is hence

$$x_{1i}(p_1) = \begin{cases} 1 & \text{if } p_1 < v_{1i} \\ \{0,1\} & \text{if } p_1 = v_{1i} \\ 0 & \text{if } p_1 > v_{1i} \end{cases}$$

Graphically, this demand function can be represented as the figure below.

Figure #3.9

Intuitively, v1i can be interpreted as the reservation price of good 1 for consumer i, since it denotes the maximum price at which he is willing to buy the good: for prices above that level he demands zero units, while for prices below he demands one unit. When constructing the aggregate demand for good 1, we can horizontally add the individual demand for all consumers. Note that some consumers might be in the interval of zero demand (where p1>v1i), others might be in their interval of one-unit demand (because p1<v1i), while other consumers might simply be indifferent between buying and not buying the good (i.e., price is exactly at the discontinuity point of individual demand p1=v1i). The only assumption we need to impose is that most of these consumers are not at the discontinuity point p1=v1i.

Denote by x1(p1) the aggregate (per consumer) demand, i.e., the share of consumers with one-unit demand for the good (those with a sufficiently high reservation price, v1i>p1). Therefore, if consumers’ reservation prices are distributed according to a cumulative distribution function G(p1), representing the share of consumers with reservation prices below p1, v1i<p1, we can express x1(p1) more compactly as x1(p1)=1- G(p1). Hence, as long as G(.) is continuous, the aggregate demand x1(p1) is a continuous function even though none of the individual demand correspondences is.12 The figure below represents a demand curve. Note that the vertical intercept reflects the highest reservation price among all consumers, while the horizontal intercept represents a low enough price such that all consumers (100%) decide to buy the good.

12 Note that if we are dealing with a small number of consumers, the above condition on “not everybody having the same v1i point” might not be satisfied. For this reason, we assume a continuum of consumers.

Figure #3.10

Finally, the following figure illustrates the role of G(p1) on the demand for good 1: G(p1) can be understood as the share of consumers for which a particular price p1 is higher than the reservation price for the good, and who therefore demand zero units of it. In contrast, 1- G(p1) reflects the share of consumers for which the particular price p1 is low enough to justify a one-unit demand.

Figure #3.11
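The regularizing effect of aggregation in this example can be sketched with a simulation. The code below assumes, for illustration only, that reservation prices are uniformly distributed on [0,1], so that G(p1)=p1, and approximates the continuum with a large finite sample. Each individual demand jumps from one to zero at the consumer's reservation price, yet the average demand tracks the smooth curve 1-G(p1).

```python
import random

def individual_demand(p1, reservation_price):
    """One unit if the price is below the reservation price, zero otherwise.

    The tie p1 == reservation_price is resolved as zero demand (an arbitrary choice).
    """
    return 1 if p1 < reservation_price else 0

def average_demand(p1, reservation_prices):
    """Per-consumer aggregate demand: the share of consumers buying one unit."""
    buyers = sum(individual_demand(p1, v) for v in reservation_prices)
    return buyers / len(reservation_prices)

random.seed(1)
# Assumed distribution of reservation prices: uniform on [0, 1], so G(p1) = p1.
reservation_prices = [random.random() for _ in range(100_000)]

for p1 in (0.2, 0.5, 0.8):
    simulated = average_demand(p1, reservation_prices)
    print(f"p1={p1}: simulated share={simulated:.3f}, 1 - G(p1)={1 - p1:.3f}")
```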



Chapter 4 – Production Theory

Production theory

In this chapter we introduce the production set and production function of a firm and their related properties. In addition, we will examine the firm’s profit maximization problem, and its dual: the firm’s cost minimization problem. We will also describe the cost function, and aggregate production decision among several firms.1 Finally, we will investigate under which conditions we can state that a firm's production decision is efficient.2

Production sets

Let us define a production vector (or production plan) y=(y1,y2,…,yL) as a vector with L components. If a particular component of the vector is positive, e.g., y2>0, it denotes that the firm is producing positive amounts of good 2. If, instead, a component of the vector is negative, y2<0, it denotes that the firm is using good 2 as an input in its production process.

We are especially interested in production plans that are technologically feasible. We represent all technologically feasible production plans as part of the production set Y.

$$Y = \{\, y\in\mathbb{R}^L : F(y)\le 0 \,\}$$

where F(y) is the transformation function. This function can be intuitively understood as a production function, as the following figure illustrates.

Figure #4.1

1 Aggregation in production theory will prove easier than in consumer theory. In particular, no wealth effects arise in production theory, making the aggregation among several firms easier.

2 This chapter follows chapter 5 in MWG. For an intermediate microeconomics approach see chapters 6 to 8 in Besanko and Braeutigam, and for a presentation combining sections of MWG and intermediate microeconomics, see Varian (chapters 1-5).



In particular, the firm uses units of good 1 as an input in its production process in order to produce units of good 2 as an output. For this reason, on the left-hand side of the figure y1<0 (input) while y2>0 (output). On the one hand, the boundary of the production set indicates production plans for which F(y)=0. (We also refer to this boundary as the transformation frontier.) On the other hand, points below the transformation frontier indicate production plans that are also feasible, since F(y)<0.

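Therefore, for any production plan $\bar{y}$ on the transformation frontier, i.e., such that $F(\bar{y})=0$, we can totally differentiate the transformation function with respect to goods k and l, as follows

$$\frac{\partial F(\bar{y})}{\partial y_k}\,dy_k + \frac{\partial F(\bar{y})}{\partial y_l}\,dy_l = 0$$

and solving for $dy_k/dy_l$, we obtain

$$\frac{dy_k}{dy_l} = -\frac{\partial F(\bar{y})/\partial y_l}{\partial F(\bar{y})/\partial y_k} = -MRT_{l,k}(\bar{y}), \quad\text{where } MRT_{l,k}(\bar{y}) = \frac{\partial F(\bar{y})/\partial y_l}{\partial F(\bar{y})/\partial y_k}$$

The marginal rate of transformation between goods l and k, evaluated at $\bar{y}$, $MRT_{l,k}(\bar{y})$, measures how much the (net) output of good k can increase if the firm decreases the (net) output of good l by one marginal unit.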

Let us next denote inputs and outputs with different letters. In particular,

$$q = (q_1, q_2, \dots, q_M) \ge 0 \quad\text{outputs}$$

$$z = (z_1, z_2, \dots, z_{L-M}) \ge 0 \quad\text{inputs}$$

where the total number of goods, L, is larger than or equal to the number of outputs, M, so that there are L-M inputs. In this case, hence, inputs are transformed into outputs by the production function f(z1,z2,…,zL-M), i.e., f:R^(L-M)→R^M. Let us consider an example. A firm producing one single output, M=1, using L-1 inputs, has a production set Y that can be described as

$$Y = \{\, (-z_1,-z_2,\dots,-z_{L-1},\,q) : q - f(z_1,z_2,\dots,z_{L-1}) \le 0 \text{ and } (z_1,z_2,\dots,z_{L-1}) \ge 0 \,\}$$

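Totally differentiating this firm's production function f(z), similarly to what we did above for the transformation function, and holding the output level fixed, we obtain

$$\frac{\partial f(\bar{z})}{\partial z_k}\,dz_k + \frac{\partial f(\bar{z})}{\partial z_l}\,dz_l = 0$$

and rearranging

$$\frac{dz_k}{dz_l} = -\frac{\partial f(\bar{z})/\partial z_l}{\partial f(\bar{z})/\partial z_k} = -MRTS_{l,k}(\bar{z}), \quad\text{where } MRTS_{l,k}(\bar{z}) = \frac{\partial f(\bar{z})/\partial z_l}{\partial f(\bar{z})/\partial z_k}$$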



Intuitively, the marginal rate of technical substitution between inputs l and k, evaluated at input vector $\bar{z}$, $MRTS_{l,k}(\bar{z})$, measures the additional amount of input k that must be used when we decrease the amount of input l marginally and we want to keep the output level unchanged at $\bar{q}=f(\bar{z})$.3 The following figure depicts the combinations of inputs 1 and 2 (e.g., capital and labor) for which the firm reaches a production level of 200 units. We refer to this level set of (z1,z2) pairs reaching the same total output as an isoquant curve for the firm.4 As we have described above, the slope of the isoquant is (in absolute value) the MRTS1,2, since it depicts by how much we must increase the use of input 2 if we are to marginally decrease the amount of input 1.

Figure #4.2

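Example. Let us next find the $MRTS_{1,2}(z)$ for the Cobb-Douglas production function.

Cobb-Douglas production function:

$$f(z_1,z_2) = z_1^{\alpha} z_2^{\beta}, \quad\text{where } \alpha\ge 0 \text{ and } \beta\ge 0$$

$$\frac{\partial f(z_1,z_2)}{\partial z_1} = \alpha z_1^{\alpha-1} z_2^{\beta}$$

$$\frac{\partial f(z_1,z_2)}{\partial z_2} = \beta z_1^{\alpha} z_2^{\beta-1}$$

$$MRTS_{1,2}(z) = \frac{\alpha z_1^{\alpha-1} z_2^{\beta}}{\beta z_1^{\alpha} z_2^{\beta-1}} = \frac{\alpha\, z_2^{\beta-(\beta-1)}}{\beta\, z_1^{\alpha-(\alpha-1)}} = \frac{\alpha z_2}{\beta z_1}$$

If we were given a particular value of inputs, $\bar{z}=(z_1,z_2)=(2,3)$, then

$$MRTS_{1,2}(\bar{z}) = \frac{\alpha\cdot 3}{\beta\cdot 2}$$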
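The closed-form expression for the Cobb-Douglas MRTS can be checked numerically. The sketch below approximates both marginal products with central finite differences at the input vector (2,3) used in the example; the exponents α=0.4 and β=0.6 and the step size are assumptions made only for this check.

```python
def cobb_douglas(z1, z2, alpha, beta):
    """Cobb-Douglas production function f(z1, z2) = z1^alpha * z2^beta."""
    return z1 ** alpha * z2 ** beta

def mrts_12(z1, z2, alpha, beta, h=1e-6):
    """Ratio of marginal products f_1 / f_2, using central finite differences."""
    f1 = (cobb_douglas(z1 + h, z2, alpha, beta)
          - cobb_douglas(z1 - h, z2, alpha, beta)) / (2 * h)
    f2 = (cobb_douglas(z1, z2 + h, alpha, beta)
          - cobb_douglas(z1, z2 - h, alpha, beta)) / (2 * h)
    return f1 / f2

alpha, beta = 0.4, 0.6                      # assumed exponents
numeric = mrts_12(2.0, 3.0, alpha, beta)
closed_form = (alpha * 3.0) / (beta * 2.0)  # alpha*z2 / (beta*z1) at z = (2, 3)
print("finite-difference MRTS:", numeric)
print("closed-form MRTS:      ", closed_form)
```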

3 Note the close relationship between the MRTSl,k in production theory and the MRSx,y in consumer theory. In particular, the latter measures the additional amount of good y that an individual must consume when we decrease the amount of good x marginally and we want to keep the utility level of this individual unchanged.

4 Note also the close relationship between isoquant curves for a firm (representing combinations of inputs for which the firm reaches the same level of output) and indifference curves for a consumer (representing combinations of goods for which the consumer reaches the same utility level).



Properties of production sets

Let us next describe different properties of production sets. Note that some of the following properties can be mutually exclusive.

1. Y is nonempty: That is, we have inputs and/or outputs. If, using inputs we can only obtain zero amounts of output, our production set would still be nonempty.

2. Y is closed: the production set includes its boundary points.

3. No free lunch: this property states that the firm must use inputs in order to produce output. Or, in other words, the firm cannot be producing outputs 1 and 2 without using any inputs. The following figure represents a production set that satisfies the no free lunch condition, since the firm is using amounts of input 1 in order to produce positive amounts of good 2 as an output. In contrast, the two figures at the bottom illustrate production plans that violate the no free lunch condition given that the firm produces positive amounts of goods 1 and 2 without the need to use any inputs.

Figure #4.3

4. Possibility of inaction. This condition states that the firm can choose to use no inputs and obtain no output as a consequence. In other words, the input-output vector 0 (in the origin of the figure) is part of the production set.



Figure #4.4

Let us examine the relationship between this condition and the presence of fixed or sunk costs. If the firm experiences fixed costs, as the figure below depicts, the firm is using an amount of input 1 without obtaining any output in return. Inaction, however, is still possible since the origin still belongs to the production set. If the fixed costs that the firm must incur (e.g., setup costs) are sunk, then the firm cannot move towards the origin 0. For instance, the firm already signed for the purchase of $\bar{y}_1$ units of input, and cannot renege on such a contract. In this case, inaction is not possible.


Figure #4.4

5. Free disposal: if y is a production plan that belongs to the production set Y, and y’≤y, then y’ must also belong to production set Y. Intuitively, note that production plan y’ is less efficient than production plan y: either it produces the same output using more inputs, or it produces less output using the same amount of inputs. Then, production plan y’ must also belong to the firm's production set. Hence, the producer can use more inputs without the need to reduce his output: in particular, the producer can dispose of (eliminate) the additional inputs he doesn't need at no cost (for this reason this property is referred to as “free disposal”). The following two figures illustrate the free disposal property for two very different production sets: if production plan y belongs to the production set Y, then y’<y must also be part of Y.



Figure #4.5

6. Irreversibility: suppose that production plan y belongs to production set Y (and that it does not coincide with the origin). Then, production plan –y cannot belong to Y. This property illustrates that there is “no way back” for the firm: outputs cannot be turned back into the inputs that were used to produce them. It is easy to construct a production set Y and a production plan y belonging to that set, for which its “mirror” in the right-hand side quadrant does not belong to the production set.

7. Nonincreasing returns to scale: if production plan y belongs to Y, then a scaling down of production plan y, αy for α ∈ [0,1], is also part of the production set Y. The following figures illustrate a production set satisfying nonincreasing returns to scale (since scaling down any production plan y yields a new production plan that also lies in the production set Y), and a production set that violates nonincreasing returns to scale (where scaling down production plan y creates a new production plan that does not belong to the production set Y, the right graph).

Figure #4.6

Nonincreasing returns to scale maintain an interesting relationship with the presence of fixed and sunk costs. In particular, the presence of any of these costs implies that the firm's production set violates nonincreasing returns to scale. The following two figures illustrate that scaling down a given production plan y when the firm incurs fixed or sunk costs yields a new (scaled down) production plan that doesn’t necessarily lie within production set Y.



Figure #4.7
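The violation of nonincreasing returns to scale under fixed costs can be sketched with a hypothetical single-input, single-output technology. The production set below assumes that output cannot exceed the square root of the input left over after covering a fixed input requirement; this functional form and all numbers are illustrative assumptions. Plans are written as (z, q), with z the amount of input used (a positive number) rather than with the sign convention of the production vector y.

```python
import math

def in_production_set(z, q, fixed_cost=0.0):
    """True if z units of input can deliver q units of output.

    Assumed technology: q <= sqrt(z - fixed_cost) once the fixed cost is covered,
    and q <= 0 otherwise. Inaction (0, 0) is always feasible, so the cost is
    fixed but not sunk.
    """
    if z < 0 or q < 0:
        return False
    return q <= math.sqrt(max(z - fixed_cost, 0.0))

def scaled_plan_feasible(z, q, alpha, fixed_cost=0.0):
    """Check whether the scaled plan (alpha*z, alpha*q) is still feasible."""
    return in_production_set(alpha * z, alpha * q, fixed_cost)

# Without fixed costs, scaling down a feasible plan keeps it feasible.
print(in_production_set(4.0, 2.0), scaled_plan_feasible(4.0, 2.0, 0.5))
# With a fixed cost of 3 input units, the scaled-down plan falls outside Y.
print(in_production_set(4.0, 1.0, fixed_cost=3.0),
      scaled_plan_feasible(4.0, 1.0, 0.5, fixed_cost=3.0))
```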

8. Nondecreasing returns to scale: if production plan y belongs to Y, then a scaling up of production plan y, αy for α ≥ 1, is also part of the production set Y. The following figure (left) depicts a production set satisfying nondecreasing returns to scale, since scaling up any production plan y yields a new production plan that also belongs to the production set Y. In contrast, the figure on the right shows a production set that violates nondecreasing returns to scale: scaling up production plan y yields a new production plan that does not belong to production set Y.

Figure #4.8

Unlike our previous discussion about the relationship between nonincreasing returns to scale and fixed and sunk costs, nondecreasing returns to scale can be satisfied even when firms incur fixed and sunk costs. The next two figures illustrate this point: scaling up production plan y yields a new production plan that belongs to production set Y, both when firms incur fixed costs (left figure) and when they incur sunk costs (right figure).



Figure #4.9

9. Constant returns to scale: if production plan y belongs to Y, then production plan αy also belongs to Y, for any α ≥ 0. Hence, both scaling down an original production plan y (when alpha takes a value between zero and one, α ∈ [0,1]) and scaling up an original production plan y (when alpha takes a value larger than one, α >1) yield new production plans that still belong to the production set Y. This point is illustrated in the following figure, which emphasizes that in order for constant returns to scale to be satisfied we need the transformation frontier to be represented with a straight line.5 Another way to interpret constant returns to scale is by noticing that production set Y satisfies both non-increasing and non-decreasing returns to scale simultaneously.6

5 Note that in the case in which the firm uses two inputs in order to produce one output, constant returns to scale implies that the cone representing the production set for the firm must have a straight surface. That is, making a vertical slice of the cone we can obtain a production set in 2D as that in the above figure.

6 One interesting exercise is to check whether constant returns to scale can be satisfied when the firm incurs fixed or sunk costs. Another interesting exercise is to show that a production set Y satisfies constant returns to scale if and only if the production function is homogeneous of degree one (see Exercise 5.B.2, and the review sessions).



Figure #4.10

In the following figures we include an alternative graphical representation of constant, increasing and decreasing returns to scale using isoquants. The figure on the left shows constant returns to scale: an increase in both labor and capital by a factor of 2 (doubling their initial amounts) increases output proportionally (doubling it, allowing the firm to move from isoquant Q=100 to Q=200). The figure at the center shows that the same increase in labor and capital increases output more than proportionally (the firm moves from isoquant Q=100 to Q=300) when the production process exhibits increasing returns to scale. Finally, the figure on the right-hand side reflects that a similar increase in both inputs produces a less than proportional increase in output when the firm exhibits decreasing returns to scale.

Figure #4.11



Importantly, note that the presence of increasing, decreasing or constant returns to scale can have regulatory implications. In particular, if a firm exhibits significant increasing returns to scale, it will be able to produce a given amount of output at a lower cost per unit than could two equal-size smaller firms, each of them producing half as much output. In such a context, a market would be more efficiently served by one large firm than by several small firms. If, in contrast, firms in an industry exhibit decreasing returns to scale, the opposite argument applies, and a market would be more efficiently served by several firms than by one large firm.7

Note that the presence of constant returns to scale does not necessarily imply a constant marginal product. Indeed, as the figure below illustrates, a firm can exhibit constant returns to scale (an increase in both inputs by a factor of 2 produces a proportional increase in output) but diminishing marginal product of labor, since an increase in labor from 10 to 20 workers (holding capital fixed) produces an increase in output of 40 units (from 100 to 140), but a further increase in labor from 20 to 30 workers only induces an increase in output of 30 units (from 140 to 170), i.e., the marginal product of labor is positive but decreasing.

Figure #4.12
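The coexistence of constant returns to scale and diminishing marginal product can be reproduced with a simple hypothetical technology. The sketch below assumes f(K,L)=K^0.5·L^0.5 and a capital stock of 100 units; these choices are illustrative and do not correspond to the numbers in the figure.

```python
import math

def output(K, L):
    """Assumed Cobb-Douglas technology with exponents summing to one."""
    return K ** 0.5 * L ** 0.5

K = 100.0  # capital held fixed when computing the marginal product of labor

# Constant returns to scale: doubling both inputs doubles output.
print(math.isclose(output(2 * K, 2 * 10.0), 2 * output(K, 10.0)))

# Diminishing marginal product of labor: equal increments of labor,
# with capital fixed, add less and less output.
first_increment = output(K, 20.0) - output(K, 10.0)
second_increment = output(K, 30.0) - output(K, 20.0)
print(round(first_increment, 2), round(second_increment, 2))
```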

7 Another measure of returns to scale is the so-called “scale elasticity”, which measures the percent increase in output due to a 1% increase in the amounts of all inputs. That is, $E_{q,t} = \frac{\partial f(tk,tl)}{\partial t}\cdot\frac{t}{f(k,l)}$. Exercise 9.9 in NS provides additional practice on scale elasticities.

Example. Let us next check returns to scale in the Cobb-Douglas production function

$$f(\lambda z_1,\lambda z_2) = (\lambda z_1)^{\alpha}(\lambda z_2)^{\beta} = \lambda^{\alpha+\beta} z_1^{\alpha} z_2^{\beta} = \lambda^{\alpha+\beta} f(z_1,z_2)$$

Therefore, when α+β=1, we have constant returns to scale. Importantly, note the strong relationship between returns to scale and homogeneity of the production function. The definition of homogeneity of degree one states that if we increase all inputs in the same proportion we must see the total output of the firm increase in the same proportion. This is exactly what occurs when constant returns to scale are satisfied, i.e., when α+β=1. When the sum α+β>1, we then have increasing returns to scale and the production function is homogeneous of degree larger than one. Finally, when α+β<1, the production set satisfies decreasing returns to scale and the production function is homogeneous of degree less than one.8

Several empirical applications use the Cobb-Douglas production function to test for the presence of increasing, decreasing or constant returns to scale. The table below reports the estimated sum of the exponents, α+β, separating industries into three groups: those with increasing returns to scale (α+β>1), those with (approximately) constant returns to scale (α+β=1), and those with decreasing returns to scale (α+β<1). Note that, for example, doubling all inputs in the tobacco industry implies that output grows less than proportionally (by a factor of 1.42), while increasing inputs in a similar fashion in the primary metal industry produces a more than proportional increase in output (by a factor of 2.36).

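| Returns to scale | Industry | $\alpha+\beta$ |
| --- | --- | --- |
| Decreasing returns | Tobacco | 0.51 |
| | Food | 0.91 |
| Constant returns | Apparel and textile | 1.01 |
| | Furniture | 1.02 |
| | Electronics | 1.02 |
| Increasing returns | Paper products | 1.09 |
| | Petroleum and coal | 1.18 |
| | Primary metal | 1.24 |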

Example. The linear production function exhibits constant returns to scale. Indeed,

$$f(K,L) = aK + bL$$

$$f(tK,tL) = atK + btL = t(aK + bL) = t \cdot f(K,L)$$

8 For a more detailed discussion of the relationship between returns to scale and homogeneity of the production function, see NS 302-304.



And similarly the fixed proportion production function exhibits constant returns to scale since

$$f(K,L) = \min\{aK, bL\}$$

$$f(tK,tL) = \min\{atK, btL\} = t \cdot \min\{aK, bL\} = t \cdot f(K,L)$$

One interesting property of a production function $f(K,L)$ exhibiting constant returns to scale is that we can incorporate increasing or decreasing returns to scale by simply applying a transformation $F(\cdot)$,

$$F(K,L) = [f(K,L)]^{\gamma}, \quad \text{where } \gamma > 0$$

Indeed, by CRS of $f(K,L)$,

$$F(tK,tL) = [f(tK,tL)]^{\gamma} = [t \cdot f(K,L)]^{\gamma} = t^{\gamma} [f(K,L)]^{\gamma} = t^{\gamma} \cdot F(K,L)$$

Then if γ>1, the transformed production function F(K,L) exhibits increasing returns to scale, if γ=1 it exhibits constant returns to scale, and if γ<1 it exhibits decreasing returns to scale.
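The returns-to-scale results above can be checked numerically. The following is a minimal sketch, not part of the original notes: it recovers the degree of homogeneity d from f(tk, tl) = t^d f(k, l) for the Cobb-Douglas, linear, fixed-proportions and transformed technologies. All parameter values (exponents 0.3 and 0.7, a = 2, b = 3, γ = 1.5) and the evaluation point are illustrative assumptions.

```python
import math

def degree_of_homogeneity(f, k, l, t=2.0):
    """Return d such that f(t*k, t*l) == t**d * f(k, l) at the point (k, l)."""
    return math.log(f(t * k, t * l) / f(k, l)) / math.log(t)

# Illustrative technologies; all parameter values are assumptions.
def cobb_douglas(k, l, alpha=0.3, beta=0.7):
    return k**alpha * l**beta            # alpha + beta = 1

def linear(k, l, a=2.0, b=3.0):
    return a * k + b * l                 # perfect substitutes

def fixed_proportions(k, l, a=2.0, b=3.0):
    return min(a * k, b * l)             # perfect complements

def transformed(k, l, gamma=1.5):
    return linear(k, l) ** gamma         # F = f**gamma with f exhibiting CRS

for f in (cobb_douglas, linear, fixed_proportions, transformed):
    print(f.__name__, round(degree_of_homogeneity(f, 4.0, 9.0), 6))
```

A degree equal to one corresponds to constant returns to scale, while the transformed technology should report the assumed value of γ.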

10. Additivity (or free entry): if production plans y and y’ individually belong to production set Y, then their sum y+y’ must also belong to Y. That is, if one plant produces y and another plant enters producing y’, then the aggregate production y+y’ must be feasible.

11. Convexity: if two production plans y and y’ belong to the production set Y, then any convex combination of them must also belong to Y. That is,

$$y, y' \in Y \text{ and } \alpha \in [0,1] \implies \alpha y + (1-\alpha) y' \in Y$$

The left figure below illustrates a production set that satisfies convexity. In particular, the linear combination between two production plans on the production frontier belongs to the production set Y, as does the linear combination between two production plans that do not belong to the production frontier. In contrast, the figure on the right reflects a production set that violates convexity. Specifically, the linear combination between two production plans does not necessarily lie within production set Y. Intuitively, convexity means that balanced input combinations (using a mixture of the inputs in the two production plans) are more productive than unbalanced input combinations.9

9 In addition, the convexity of the production set maintains a close relationship with the concavity of the production function; see exercise 5B3 in MWG, and the review sessions.



Figure #4.13

The following two figures examine convexity for production sets in which the firm incurs fixed costs (left figure) or sunk costs (right figure). In particular, note that when the firm incurs fixed costs, a linear combination of two production plans can yield a new production plan that lies outside production set Y, violating convexity. In contrast, when the firm incurs sunk costs, any linear combination between two production plans lies within production set Y, satisfying convexity.

Figure #4.14

Let us now make a brief detour in order to discuss under which conditions the marginal rate of technical substitution (MRTS), representing the slope of the firm’s isoquants, is decreasing.

$$MRTS_{l,k} = -\frac{dk}{dl} = \frac{f_l}{f_k}$$

$$\frac{\partial MRTS_{l,k}}{\partial l} = \frac{f_k \left( f_{ll} + f_{lk} \cdot \frac{dk}{dl} \right) - f_l \left( f_{kl} + f_{kk} \cdot \frac{dk}{dl} \right)}{(f_k)^2}$$

We hence want to check under which conditions this derivative is smaller than zero.





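Using the fact that $\frac{dk}{dl} = -\frac{f_l}{f_k}$ along an isoquant, and Young's Theorem, $f_{lk} = f_{kl}$,

$$\frac{\partial MRTS_{l,k}}{\partial l} = \frac{f_k \left( f_{ll} - f_{lk} \frac{f_l}{f_k} \right) - f_l \left( f_{kl} - f_{kk} \frac{f_l}{f_k} \right)}{(f_k)^2} = \frac{f_k f_{ll} - f_{lk} f_l - f_l f_{lk} + f_{kk} \frac{(f_l)^2}{f_k}}{(f_k)^2}$$

Hence, multiplying numerator and denominator by $f_k$,

$$\frac{\partial MRTS_{l,k}}{\partial l} = \frac{(f_k)^2 f_{ll} + f_{kk} (f_l)^2 - 2 f_l f_k f_{lk}}{(f_k)^3}$$

where, under positive and diminishing marginal products, $(f_k)^2 > 0$, $f_{ll} < 0$, $f_{kk} < 0$, $(f_l)^2 > 0$ and $f_l f_k > 0$, while the sign of $f_{lk}$ is not determined. Therefore,

$$\frac{\partial MRTS_{l,k}}{\partial l} < 0 \iff (f_k)^2 f_{ll} + f_{kk} (f_l)^2 - 2 f_l f_k f_{lk} < 0$$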

First, note that if flk>0 (i.e., if an increase in the amount of capital raises the marginal productivity of workers), then MRTS is decreasing in labor, and the isoquants get flatter as we move to larger numbers of workers. If, however, flk<0, then we can have two cases:


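1) If $f_{lk} > 0$ (i.e., $\uparrow k \Rightarrow \uparrow MP_l$), then $MRTS_{l,k}$ is decreasing in $l$.

2) If $f_{lk} < 0$, then we can have:

a) $\left| (f_k)^2 f_{ll} + f_{kk} (f_l)^2 \right| > \left| 2 f_l f_k f_{lk} \right| \Rightarrow \frac{\partial MRTS_{l,k}}{\partial l} < 0$

b) $\left| (f_k)^2 f_{ll} + f_{kk} (f_l)^2 \right| < \left| 2 f_l f_k f_{lk} \right| \Rightarrow \frac{\partial MRTS_{l,k}}{\partial l} > 0$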

We summarize our results in the following two figures. The first one (left) illustrates isoquants where the MRTS is decreasing in labor, and embodies the case in which flk>0 and the case in which flk<0 (but an increase in the amount of capital produces a relatively small decrease in the marginal productivity of workers). The second figure (right) reflects, in contrast, isoquants where the MRTS is increasing in labor, and represents the case in which flk<0 and, in addition, an increase in the amount of capital produces a large drop in the marginal productivity of workers.



Figure #4.15

Example. Consider the production function

$$f(k,l) = 600 k^2 l^2 - k^3 l^3$$

1) Marginal Products:

$$MP_l = f_l = 1200 k^2 l - 3 k^3 l^2 > 0 \text{ iff } kl < 400$$

$$MP_k = f_k = 1200 k l^2 - 3 k^2 l^3 > 0 \text{ iff } kl < 400$$

2) Decreasing Marginal Productivity:

$$\frac{\partial MP_l}{\partial l} = f_{ll} = 1200 k^2 - 6 k^3 l < 0 \text{ iff } kl > 200$$

$$\frac{\partial MP_k}{\partial k} = f_{kk} = 1200 l^2 - 6 k l^3 < 0 \text{ iff } kl > 200$$

We can therefore summarize our results about the values of kl for which MPL and MPK are positive and decreasing in the shaded area of the following figure.

Figure #4.16

But is the above condition (graphically represented in the area 200<kl<400) a sufficient condition in order to guarantee that the MRTS is diminishing and we obtain the standard bowed isoquants? No. As we described in our previous discussion, in order to guarantee that the MRTS is diminishing in labor we need to check the sign of flk.

$$f_{lk} = f_{kl} = 2400 kl - 9 k^2 l^2 > 0 \text{ if and only if } kl < \frac{2400}{9} \approx 266.67$$

Figure #4.17

We know that when fkl>0 we can guarantee that MRTS is diminishing. Within the region 200<kl<400, this occurs in particular at values below kl≈266.67, as depicted in the figure. For values above that cutoff, however, fkl becomes negative, and this sufficient condition alone no longer guarantees that the MRTS is diminishing.
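The sign conditions in this example can be verified directly. The sketch below is an illustration rather than part of the original notes: it evaluates the partial derivatives of f(k,l) = 600k²l² − k³l³ and the numerator of ∂MRTS/∂l, namely f_k² f_ll + f_kk f_l² − 2 f_l f_k f_lk, at two arbitrarily chosen points, one with kl = 250 and one with kl = 300. The evaluation points are assumptions chosen to lie on each side of the cutoff kl ≈ 266.67.

```python
def partials(k, l):
    """First and second partial derivatives of f(k,l) = 600 k^2 l^2 - k^3 l^3."""
    f_l = 1200 * k**2 * l - 3 * k**3 * l**2
    f_k = 1200 * k * l**2 - 3 * k**2 * l**3
    f_ll = 1200 * k**2 - 6 * k**3 * l
    f_kk = 1200 * l**2 - 6 * k * l**3
    f_lk = 2400 * k * l - 9 * k**2 * l**2
    return f_l, f_k, f_ll, f_kk, f_lk

def mrts_slope_numerator(k, l):
    """Numerator of dMRTS/dl; its sign is the sign of dMRTS/dl when f_k > 0."""
    f_l, f_k, f_ll, f_kk, f_lk = partials(k, l)
    return f_k**2 * f_ll + f_kk * f_l**2 - 2 * f_l * f_k * f_lk

# Illustrative points (assumptions): kl = 250 and kl = 300.
for k, l in [(10, 25), (15, 20)]:
    f_lk = partials(k, l)[4]
    print(f"kl={k * l}: f_lk={f_lk}, numerator={mrts_slope_numerator(k, l)}")
```

At the second point f_lk is negative and yet the numerator is still negative, which corresponds to case 2a above: a positive cross-partial is sufficient, but not necessary, for a diminishing MRTS.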

Remark on CRS: Recall that when a production function exhibits constant returns to scale we have that an increase in all inputs by the same proportion produces an increase in the firm’s output in the same proportion. That is,

$$f(tk, tl) = t \cdot f(k, l)$$

But we know that if the production function exhibits constant returns to scale, then it must be homogeneous of degree one. In addition, we know that if a function is homogeneous of degree one, its first-order partial derivatives must be homogeneous of degree zero. Hence, the marginal product of labor and of capital must be homogeneous of degree zero. Therefore,


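$$MP_l = \frac{\partial f(k,l)}{\partial l} = \frac{\partial f(tk,tl)}{\partial (tl)}, \quad \text{i.e.,} \quad f_l(k,l) = f_l(tk,tl)$$

and, if we set $t = \frac{1}{l}$,

$$MP_l = f_l(k,l) = f_l\left( \frac{k}{l}, 1 \right)$$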

We can therefore conclude that, when the firm’s production function exhibits constant returns to scale, the marginal product of labor only depends on the ratio of capital to labor, but not on the absolute levels of capital and labor used by the firm. A similar argument can be extended to the marginal product of capital. Hence, the ratio of the marginal products MRTS=MPL/MPK only depends on the ratio of capital to labor, but not on the absolute levels of capital and labor. Graphically, this implies that the slope of a firm’s isoquants coincides at all points along a ray from the origin.10 This occurs, in particular, when the firm’s production function is homothetic.11 We illustrate this property in the figure below.

10 Note that a ray from the origin maintains the ratio of k/l constant.

11 Recall a similar property in consumer theory.


Figure #4.18

Elasticity of substitution. The elasticity of substitution measures the proportionate change in the ratio k/l relative to the proportionate change in the MRTS along an isoquant. That is,

$$\sigma = \frac{\%\Delta (k/l)}{\%\Delta MRTS} = \frac{d(k/l)}{dMRTS} \cdot \frac{MRTS}{k/l} = \frac{\partial \ln (k/l)}{\partial \ln MRTS}$$

Note that the value of the elasticity of substitution is positive because when k/l decreases (increases) the MRTS decreases as well (increases as well, respectively). Indeed, if we move along an isoquant towards higher amounts of labor, then the ratio k/l decreases and the isoquant becomes flatter (reducing MRTS). A similar argument applies when we move towards higher amounts of capital.12 This is illustrated in the following figure, whereby a movement from point A to B reduces the ratio k/l and also the slope of the isoquant, measured by MRTS. The elasticity of substitution provides us with a measure of which of these two magnitudes changes the most.

Figure #4.19

12 Note that this reasoning is only valid when the isoquants are bowed in towards the origin, i.e., when MRTS is decreasing in labor. If, in contrast, MRTS increases in labor, a decrease in ratio k/l (moving towards higher amounts of labor) might cause the isoquant to become steeper (higher MRTS).



First, note that if the elasticity of substitution is high, this implies that the MRTS is not substantially changing relative to k/l. This occurs, in particular, when the isoquants are relatively flat, as the figure below depicts.

Figure #4.20

Second, if the elasticity of substitution is low, then the MRTS is substantially changing relative to k/l. This occurs when the slope of the isoquant changes significantly when we alter the input combination. We provide a graphical illustration of an isoquant associated with a low elasticity of substitution below.

Figure #4.21



Let us briefly analyze the extreme cases described above. Let us start with the linear production function q=f(k,l)=ak+bl. Importantly, this production function exhibits CRS since

$$f(tk,tl) = atk + btl = t(ak + bl) = t \cdot f(k,l)$$

In addition, all isoquants are straight lines (so their slope is constant for all amounts of labor, and therefore, for all k/l ratios). The figure below illustrates a linear production function in which labor and capital are perfect substitutes in the production process. Importantly, this implies that the MRTS is constant in k/l and hence the elasticity of substitution is infinite.13

Figure #4.22

Let us next examine the other polar case in which both inputs must be used in fixed proportions, i.e., q=min{ak,bl} where a,b>0. In this case, the MRTS changes dramatically from infinite (for labor amounts below the kink of the isoquant) to zero (for labor amounts beyond the kink). This implies that the change in the MRTS is infinite, defining as a consequence a zero elasticity of substitution.

Figure #4.23

Let us now analyze the Cobb-Douglas production function, $q = f(k,l) = A k^a l^b$, where $A, a, b > 0$.

13 Another associated property of the linear production function is that it is homothetic, i.e., the slope of its isoquants is constant along any ray from the origin.



As we described in previous classes, this production function can exhibit any returns to scale, depending on the sum of its exponents.14 Importantly, this production function can be linearized by applying logarithms, as follows.

$$\ln q = \ln A + a \ln k + b \ln l$$

where $a$ is the elasticity of output with respect to capital, i.e., $E_{q,k} = \frac{\partial \ln q}{\partial \ln k} = a$, and $b$ is the elasticity of output with respect to labor, i.e., $E_{q,l} = \frac{\partial \ln q}{\partial \ln l} = b$.

Note that the elasticity of substitution for the Cobb-Douglas production function can be shown to be exactly one, for any parameter values. Indeed,


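$$f(k,l) = A k^a l^b$$

$$MRTS_{l,k} = \frac{MP_l}{MP_k} = \frac{b \cdot A k^{a} l^{b-1}}{a \cdot A k^{a-1} l^{b}} = \frac{b}{a} \cdot \frac{k}{l}$$

$$\ln MRTS_{l,k} = \ln \frac{b}{a} + \ln \frac{k}{l}$$

$$\ln \frac{k}{l} = \ln MRTS_{l,k} - \ln \frac{b}{a}$$

$$\sigma = \frac{\partial \ln (k/l)}{\partial \ln MRTS_{l,k}} = 1$$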

Let us finally examine the CES production function:

$$q = f(k,l) = [k^{\rho} + l^{\rho}]^{\gamma/\rho}, \quad \rho \le 1,\ \rho \ne 0,\ \gamma > 0$$

where parameter $\gamma$ determines whether this function exhibits increasing, decreasing or constant returns to scale (when $\gamma>1$, $\gamma<1$ or $\gamma=1$, respectively). On the other hand, note that for this production function we can define the elasticity of substitution as follows:

14 It is easy to prove that the elasticity of substitution of the Cobb-Douglas production function is exactly one. The proof is below. A good practice would be to prove it on your own.

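$$MRTS_{l,k} = \frac{\partial q / \partial l}{\partial q / \partial k} = \frac{\frac{\gamma}{\rho} [k^{\rho} + l^{\rho}]^{\frac{\gamma}{\rho}-1} (\rho \, l^{\rho-1})}{\frac{\gamma}{\rho} [k^{\rho} + l^{\rho}]^{\frac{\gamma}{\rho}-1} (\rho \, k^{\rho-1})} = \left( \frac{l}{k} \right)^{\rho-1} = \left( \frac{k}{l} \right)^{1-\rho}$$

$$\ln MRTS_{l,k} = (1-\rho) \ln \frac{k}{l}$$

and solving for $\ln \frac{k}{l}$, we have

$$\ln \frac{k}{l} = \frac{1}{1-\rho} \ln MRTS_{l,k}$$

Therefore, the elasticity of substitution between capital and labor is

$$\sigma = \frac{\partial \ln (k/l)}{\partial \ln MRTS_{l,k}} = \frac{1}{1-\rho}$$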

And therefore the CES production function embodies all the production functions described above. First, when ρ=1 we obtain that the elasticity of substitution between two inputs becomes infinite (i.e., linear production functions). Second, when ρ→−∞ we obtain that the elasticity of substitution becomes zero, indicating that inputs cannot be substituted in the production process (i.e., fixed proportions production function). Finally, when ρ→0 we find that the elasticity of substitution becomes one, indicating that inputs can be somewhat easily substituted in the production process, as in the Cobb-Douglas production function.
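The relationship σ = 1/(1−ρ) can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the notes' own method: it uses the expression MRTS = MP_l/MP_k = (k/l)^(1−ρ), approximates σ as the ratio of log changes in k/l and in the MRTS after a small increase in k, and relies on the fact that for the CES function the MRTS depends only on the ratio k/l, so the change in k does not need to be compensated to stay on the same isoquant. The values of ρ and the evaluation point are arbitrary; ρ=1 is excluded because the MRTS is then constant and the ratio is undefined.

```python
import math

def ces_mrts(k, l, rho):
    """MRTS_{l,k} = MP_l / MP_k for q = (k**rho + l**rho)**(gamma/rho).

    The common factor gamma * (k**rho + l**rho)**(gamma/rho - 1) cancels,
    so gamma does not affect the MRTS.
    """
    return l ** (rho - 1) / k ** (rho - 1)

def elasticity_of_substitution(k, l, rho, step=1e-3):
    """Approximate d ln(k/l) / d ln(MRTS) with a small change in k (rho != 1)."""
    k_new = k * (1 + step)
    d_ln_ratio = math.log(k_new / l) - math.log(k / l)
    d_ln_mrts = math.log(ces_mrts(k_new, l, rho)) - math.log(ces_mrts(k, l, rho))
    return d_ln_ratio / d_ln_mrts

# Illustrative parameter values (assumptions).
for rho in (0.5, -1.0, -9.0):
    sigma = elasticity_of_substitution(2.0, 1.0, rho)
    print(f"rho={rho}: sigma={sigma:.6f}, 1/(1-rho)={1 / (1 - rho):.6f}")
```

As ρ becomes more negative the computed σ approaches zero, in line with the fixed-proportions limit described above.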

We include below some empirical evidence about the elasticity of substitution for different German industries. In particular, note that inputs cannot be easily substituted in the chemical industry (while maintaining output constant), while they can be in the food industry.

Finally, note that the elasticity of substitution is defined as the percentage change in the ratio of two inputs to the percentage change in the MRTS, keeping the firm’s output and the amounts of all other inputs fixed. If we allow for variations in the output, we can then find a value of the elasticity of substitution between two inputs associated to different levels of production. In particular, the elasticity of substitution can take a different value for a particular production scale (i.e. a specific isoquant), but might change when we increase or decrease the level of production. First, we illustrate one case in which the elasticity of substitution decreases in scale: when the production level is low, at q0 and q1, the firm can easily substitute between labor and capital, but when the production level increases to q5 and q6, the substitution among inputs becomes more difficult.



Figure #4.24

The next figure reflects the opposite case. Now the firm can more easily substitute capital for labor as its production level increases.

Figure #4.24



Profit maximization

Let us now analyze the profit maximization problem and its dual: the cost minimization problem. We will henceforth assume that firms are price takers, i.e., the production plans of every individual firm do not alter market price p. In addition we assume that the production set satisfies nonemptiness, closedness and free-disposal. (Note that, for generality, we do not impose specific conditions about convexity or returns to scale, but we comment on that later on.) The profit maximization problem can also be viewed as maximizing the difference between total revenues and total economic costs.

The profit maximization problem (PMP) for the firm is


$$\max_{y} \ p \cdot y \quad \text{s.t. } y \in Y, \qquad \text{or alternatively} \qquad \max_{y} \ p \cdot y \quad \text{s.t. } F(y) \le 0,$$

where $F(\cdot)$ is a transformation function describing Y.

The value function resulting from this maximization problem, π(p), known as the “profit function” of the firm, associates every price vector p with the highest amount of profits, attained at the profit-maximizing production plan y. More formally, we can define the profit function as

$$\pi(p) = \max_{y} \{ p \cdot y : y \in Y \}$$

And the supply correspondence y(p) associates to every price vector p the profit-maximizing production plan, i.e., the supply correspondence y(p) is the argmax of the above PMP. That is,

$$y(p) = \{ y \in Y : p \cdot y = \pi(p) \}$$

Positive components in the supply correspondence reflect the firm’s outputs supplied to the market, while negative components are inputs in the production process demanded by the firm. The following figure represents the profit maximization problem.



Figure #4.25

First, note that in addition to the firm's production set Y, we can depict the firm’s isoprofit lines. Intuitively, an isoprofit line represents the combinations of inputs and output for which the firm obtains a given level of profits, e.g., 1 million dollars. Note that if the firm uses a larger amount of inputs, it must also obtain a larger amount of output to be sold in the market in order to maintain its profit level. Graphically, this implies that isoprofit lines increase in output when the firm uses more inputs, i.e., the isoprofit lines have a negative slope (recall that inputs enter the production plan as negative components). In particular, the slope of the isoprofit line is given by the price ratio –p1/p2, which is sometimes otherwise denoted as –w/p. In addition, note that an increase in profits is associated with higher isoprofit lines, which graphically shifts the isoprofit lines northeast. Intuitively, if the firm is capable of producing more units of output, sold at a constant market price p, at a given input usage,15 its profits increase. Therefore, the firm moves to the highest possible isoprofit line (maximizing profits) that still contains a production plan that is technologically feasible according to production set Y. In the above example, this occurs at the tangency point between the production set Y and the isoprofit line associated with profit level π(p). Hence, the firm chooses production plan y(p), which reaches a profit level of π(p).16

15 Note that market prices have not changed, since isoprofit lines are moving in a parallel fashion. Hence larger profits can only be associated with a higher productivity of inputs.

A natural question at this point is that of existence. In particular, one might wonder whether there are PMPs with no supply correspondence, i.e., PMPs with no well defined profit-maximizing production plan y(p). The following example illustrates such a case.

Example. Consider a firm with a production function f(z)=q, where every unit of input to the production process is transformed into a unit of output. The following two figures illustrate this production function (which exhibits constant returns to scale). In particular, the figure at the top reflects the case in which input price pz is lower than the output price p, and as a consequence, isoprofit lines are relatively flat compared with the transformation frontier. In this case, the firm can increase the amount of input used (and output obtained), reaching higher profits (associated with higher isoprofit lines). If the firm is unconstrained in the use of inputs, it can always increase the amount of inputs in order to reach higher isoprofit lines. Hence, the supply correspondence is not well defined, since the firm could always increase the input.17 If, in contrast, the input price is larger than the output price, then isoprofit lines become steeper than the transformation frontier. This case is illustrated in the figure at the bottom. In this example, the firm tries to reach higher isoprofit lines by reducing the input in the production process. In the extreme, the firm reduces z until reaching z=0, i.e., y(p)=0 with associated profit function π(p)=0. To summarize: if pz<p, the isoprofit curves are flatter than the transformation frontier, whereas if pz>p, the isoprofit curves are steeper than the transformation frontier.

16 Note that the firm doesn't want to choose production plans associated to lower profit levels (such as those in isoprofit line π0) since, despite being technologically feasible, they do not reach the highest possible profit. Similarly, the firm cannot reach profits beyond π(p) (i.e., to the northeast of y(p)), since they are not technologically feasible.

17 However, note that if the firm is constrained in the use of inputs in the interval [0,zbar], then the firm can only increase the amount of inputs up to zbar, making the firm’s PMP well defined.



Figure #4.26

In other terms, if pz≥p, then q=0 and π(p)=0, and if pz<p, then q=∞ and π(p)=∞, so that the supply correspondence is not well defined in the latter case.
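A quick numerical illustration of this nonexistence result may help. The sketch below is a hypothetical example, with arbitrarily chosen prices: it evaluates profits p·z − pz·z for the technology q = f(z) = z at increasing input levels.

```python
def profit(z, p, p_z):
    """Profit with the one-to-one technology q = f(z) = z."""
    return p * z - p_z * z

# Case p_z < p (illustrative prices): profits keep growing with z, so no maximizer exists.
growing = [profit(z, p=2.0, p_z=1.0) for z in (1, 10, 100, 1000)]

# Case p_z > p (illustrative prices): profits fall with z, so z = 0 is optimal.
falling = [profit(z, p=1.0, p_z=2.0) for z in (0, 1, 10, 100)]

print(growing)
print(falling)
```

In the first case no finite input level maximizes profits unless input usage is bounded above, as noted in footnote 17.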

Let us now go back to the solution of the profit maximization problem, approaching it algebraically. Taking first order conditions with respect to the firm’s production plan y, we obtain

$$p_k - \lambda \frac{\partial F(y^*)}{\partial y_k} \le 0$$

And for interior solutions, we have that $p_k = \lambda \frac{\partial F(y^*)}{\partial y_k}$, or using matrix notation, $p = \lambda \nabla F(y^*)$.

Intuitively, this condition says that, at the solution of the profit maximization problem, the firm selects a production plan at which the price vector and the gradient vector are proportional (as depicted on the above figure of the profit maximization problem for interior solutions). Therefore, from the first order conditions of the PMP we have

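$$\frac{p_k}{\partial F(y^*) / \partial y_k} = \lambda \text{ for every good } k; \quad \text{and hence} \quad \frac{p_k}{\partial F(y^*) / \partial y_k} = \frac{p_l}{\partial F(y^*) / \partial y_l}$$

$$\frac{p_k}{p_l} = \frac{\partial F(y^*) / \partial y_k}{\partial F(y^*) / \partial y_l} = MRT_{k,l}(y^*)$$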

Graphically, this condition implies that the slope of the transformation frontier (at the profit-maximizing production plan y*), i.e., MRTk,l(y*), coincides with the price ratio, pk/pl. (This condition is also graphically illustrated in the above figure representing interior solutions of the PMP.)

To better understand the above PMP, let us now focus on the case in which the firm uses several inputs in order to produce a single output. In particular, consider a production function f(z) that produces a single output using a vector z of inputs. We can then represent the profit maximization problem as follows.

$$\max_{z \ge 0} \ p f(z) - w \cdot z$$


Note that, because of producing a single output, the only choice variable for the firm is the input vector z. Taking first order conditions with respect to each input zk, we obtain

$$p \frac{\partial f(z^*)}{\partial z_k} \le w_k, \quad \text{with equality if } z_k^* > 0$$

Therefore, for interior solutions, this states that the market value of the marginal product obtained from using additional units of input k, $p \cdot MP_k$, must coincide with the price of this input, $w_k$. For the case of only two inputs, this condition implies that


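$$p = \frac{w_k}{\partial f(z^*) / \partial z_k} = \frac{w_l}{\partial f(z^*) / \partial z_l}; \quad \text{and hence} \quad \frac{w_k}{w_l} = \frac{\partial f(z^*) / \partial z_k}{\partial f(z^*) / \partial z_l} = MRTS_{k,l}(z^*)$$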

From this interior solution of the PMP, we see the ratio of input prices must be equal to the ratio of marginal products. Alternatively, we can express this condition by saying that the marginal productivity per dollar spent on input k is equal to that spent on input l.18
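As an illustration of the first order condition p·f'(z) = w, consider a single-input firm. The sketch below assumes the technology f(z) = √z and the prices p = 10 and w = 2, none of which appear in the original notes. It compares the closed-form solution z* = (p/2w)² with a brute-force search over a grid of input levels.

```python
import math

def profit(z, p, w):
    """Profit with the assumed technology f(z) = sqrt(z)."""
    return p * math.sqrt(z) - w * z

def optimal_input(p, w):
    """Solve p * f'(z) = w, where f'(z) = 1 / (2 * sqrt(z))."""
    return (p / (2 * w)) ** 2

p, w = 10.0, 2.0                      # illustrative prices (assumptions)
z_star = optimal_input(p, w)

# Brute-force check on a grid of input levels from 0 to 20 in steps of 0.01.
grid = [i / 100 for i in range(0, 2001)]
z_grid = max(grid, key=lambda z: profit(z, p, w))

# Value of the marginal product at z_star, to be compared with the input price w.
value_marginal_product = p * 0.5 / math.sqrt(z_star)

print(z_star, z_grid, value_marginal_product)
```

Because this technology is strictly concave, the first order condition is sufficient and the grid search should select the same input level.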

When the production set is convex, these first order conditions are also sufficient. We illustrate this property in the following two figures. Note that in addition to the isoquant, we also depict isocost lines for the firm, where

$$w_1 z_1 + w_2 z_2 = \bar{c},$$

which reflects combinations of inputs z1 and z2 for which the firm incurs the same total cost, $\bar{c}$, for given input prices w1 and w2.

18 This condition is analogous to the “bang for the buck” condition we described in consumer theory, but applied to the marginal productivity per dollar spent on inputs, rather than the marginal utility per dollar spent on a particular good.

With convex production sets, we have isoquants bowed in toward the origin, i.e., defining convex upper contour sets, as the figure below reflects. In this case, first-order necessary conditions are also sufficient.

Figure #4.27

However, if production sets are not convex, isoquants can be bowed out from the origin, defining nonconvex upper contour sets, as the figure below illustrates.

Figure #4.28



In this case, the tangency condition specified in the first-order necessary conditions of the PMP defines an input vector (z1,z2) –denoted as A in the figure— which is clearly not profit-maximizing. Instead, the profit-maximizing vector is at a corner solution, where the firm uses none of input 1 but only input 2. Indeed, the firm reaches the farthest out isoquant for a given isocost line at that input combination where the slope of isoquant and isocost do not coincide. In particular, the isoquant is flatter than the isocost, reflecting that the marginal productivity per dollar spent on input 1 is lower than that spent on input 2, which leads the firm to spend all its money on input 2 alone.

Profit function. Let us next describe some properties of the profit function, π(p). Assume that the production set Y is closed and satisfies the free disposal property. Then,

1. Homogeneity: The profit function π(p) is homogeneous of degree one in prices. That is, increasing the prices of all inputs and outputs produces a proportional increase in the firm's profits, i.e., π(λp)= λπ(p). Note that the profit function can be expressed as

$\pi = p \cdot q - w_1 z_1 - w_2 z_2 - \dots$, where inputs and outputs are evaluated at their profit-maximizing amounts. Scaling all prices up by a common factor λ, we obtain

$$\lambda p \cdot q - \lambda w \cdot z = \lambda (p \cdot q - w \cdot z) = \lambda \cdot \pi(p)$$

which shows that the firm's profits increase in the same proportion as input and output prices, i.e., homogeneity of degree one holds.

2. Convex in output prices: The profit function π(p) is convex. Intuitively, the profits obtained at the average of two price vectors are no larger than the average of the profits obtained at each of them, since the firm can adjust its production plan to each price vector.

3. If the production set Y is convex, then

$$Y = \{ y \in \mathbb{R}^L : p \cdot y \le \pi(p) \text{ for all } p \gg 0 \}$$

4. If y(·) is a differentiable function at $\bar{p}$, then $Dy(\bar{p}) = D^2 \pi(\bar{p})$ is a symmetric and positive semi-definite matrix with $Dy(\bar{p}) \bar{p} = 0$. Here Dy(p) is the supply substitution matrix, whose properties parallel those of substitution matrices in demand theory, although with the sign reversed.

Intuitively, property number 3 implies that the production set Y can be alternatively represented by this “dual” set. It specifies that, for any given prices p, every production vector y in Y generates profits p·y that are no larger than the maximal profits π(p). Let us next provide a graphical representation of this property. The following figure represents a convex production set Y, the supply correspondence y(p) that maximizes profits, and the associated isoprofit line π=pq-wz.


Figure #4.29

First, note that all input-output combinations on or below this isoprofit line yield weakly lower profits for the firm. That is, pq-wz≤ π(p). Alternatively, the isoprofit line can be represented by

$$q = \frac{\pi}{p} + \frac{w}{p} z$$

Note that if the price ratio w/p is constant (i.e., different levels of input usage or different levels of output sales do not affect input or output prices, respectively), then the slope of the isoprofit lines is constant in z, and therefore the set of input-output combinations on or below an isoprofit line is convex. (The linear combination of any two points (z,q) and (z’,q’) on or below the isoprofit line is also on or below it, i.e., lies within the set.) If, however, input prices (w) and/or output prices (p) are not constant, the price ratio might not be constant. In this case, isoprofit curves are no longer straight lines.

a. Let us first focus on the case in which input prices are a function of input usage, i.e., w=f(z) where f’(z) is different from zero. Then either:
i. f’(z)<0, and the firm gets a price discount per unit of input from suppliers when ordering large amounts of inputs, e.g., loans; or
ii. f’(z)>0, and the firm has to pay more per unit of input when ordering large amounts of inputs, e.g., scarce qualified labor.

b. Now we analyze the case in which output prices are a function of production, i.e., p=g(q) where g’(q) is different from zero. Then either:
i. g’(q)<0, and the firm offers price discounts to its customers; or
ii. g’(q)>0, and the firm applies price surcharges to its customers.

For the time being we ignore the possibility that a change in the firm’s production affects output prices. We will return to this topic in later chapters.

When we consider the possibility that w=f(z), we can then express the isoprofit curve as $q = \frac{\pi}{p} + \frac{f(z)}{p} z$.

a. If f’(z)<0 (as described in point a.i. above), we then have strictly convex isoprofit curves, as the following figure illustrates. Intuitively, the price ratio becomes lower as we increase z, and therefore the isoprofit curve becomes flatter.



Figure #4.30

b. If f’(z)>0 (as described in point a.ii. above), we then have strictly concave isoprofit curves, as the following figure illustrates. Intuitively, the price ratio increases as we increase z, and therefore the isoprofit curve becomes steeper.

Figure #4.31

c. If f’(z)=0, we then have straight isoprofit lines, as in our previous examples where input prices are independent of input usage.

More comments about the profit function are in order. First, recall that it is a value function, measuring firm profit only for the profit-maximizing vector y*. Second, the profit function can be understood as a support function. In particular, let us first take the negative of the production set Y, i.e., -Y. Then we can define the support function of this –Y set as

$$\mu_{-Y}(p) = \min_{y} \{ p \cdot (-y) : y \in Y \}$$

The support function first evaluates the profits resulting from all production vectors y in Y, py; second, it takes the negative of all these profits, p(-y); and finally, the support function chooses the smallest one. Of course, this procedure selects the same production vector as maximizing the profits py over all production vectors y in Y. We provide below a simple example for comparison.19

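| $\max \ p \cdot y$ | | $\min \ p \cdot (-y)$ | |
| --- | --- | --- | --- |
| $p \cdot y_1$ | Highest ranking | $p \cdot (-y_1)$ | Lowest ranking |
| $p \cdot y_2$ | ↑ | $p \cdot (-y_2)$ | ↓ |
| $p \cdot y_3$ | ↑ | $p \cdot (-y_3)$ | ↓ |
| … | ↑ | … | ↓ |
| … | Lowest ranking | … | Highest ranking |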

Therefore, the profit function π(p) is the negative of the support function of the negative production set –Y, i.e., $\pi(p) = -\mu_{-Y}(p)$, since $\min_y \{ p \cdot (-y) \} = -\max_y \{ p \cdot y \}$. Note that the representation of the profit function as a support function allows us to “equivalently” represent the production set using the support function. We do that in the following figure. First, note that the production set Y, which we are trying to equivalently describe, is convex. Then, for a given price vector p, we select the supply correspondence y(p) resulting from solving the PMP at prices p. We obtain an associated profit level π(p). We can then take all production plans y for which profits are weakly lower, i.e., $\{ y : p \cdot y \le \pi(p) \}$.

Graphically, this set considers all production plans on or below the isoprofit line associated with y(p) in the figure. For a different price vector p’, we can similarly select the supply correspondence y(p’) resulting from solving the PMP, which yields a profit level π(p’). We can now take all production plans y such that $\{ y : p' \cdot y \le \pi(p') \}$.

Graphically, the set considers all production plans on or below the isoprofit line associated with y(p’), which overlaps with the set described above for price p. If we repeat this process for any other price vector p, we can define infinitely many sets whose overlap exactly coincides with the area of production set Y. The representation of the profit function as a support function, therefore, allows us to equivalently describe production set Y.
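The properties of the profit function discussed above can be illustrated with a simple closed-form case. The sketch below assumes a single-input technology q = √z with output price p and input price w, which is not taken from the original notes; under this assumption the profit function is π(p,w) = p²/(4w). The code checks homogeneity of degree one, convexity at a midpoint of two price vectors, and the dual representation (no feasible plan earns more than π(p,w)). All numerical values are illustrative.

```python
import math

def profit_function(p, w):
    """pi(p, w) = max_z p*sqrt(z) - w*z = p**2 / (4*w), attained at z* = (p/(2*w))**2."""
    return p ** 2 / (4 * w)

# 1. Homogeneity of degree one: scaling all prices by lam scales profits by lam.
lam, p, w = 3.0, 10.0, 2.0
homogeneous = math.isclose(profit_function(lam * p, lam * w),
                           lam * profit_function(p, w))

# 2. Convexity: profits at average prices are no larger than average profits.
p1, w1, p2, w2 = 10.0, 2.0, 4.0, 5.0
profit_at_midpoint = profit_function((p1 + p2) / 2, (w1 + w2) / 2)
average_profit = (profit_function(p1, w1) + profit_function(p2, w2)) / 2
convex_at_midpoint = profit_at_midpoint <= average_profit

# 3. Dual representation: every feasible plan (z, q = sqrt(z)) satisfies
#    p*q - w*z <= pi(p, w), up to a small floating-point tolerance.
feasible_plans = [i / 10 for i in range(0, 500)]
dual_holds = all(p * math.sqrt(z) - w * z <= profit_function(p, w) + 1e-12
                 for z in feasible_plans)

print(homogeneous, convex_at_midpoint, dual_holds)
```

A single midpoint check does not prove convexity, of course; it only illustrates the inequality for one pair of price vectors.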

Supply correspondence. Let us now describe the properties of the supply correspondence y(p) that result from solving the profit maximization problem.

1. If the production set Y is weakly convex, then the supply correspondence y(p) is a convex set for all p. Moreover, if the production set Y is strictly convex, then the supply correspondence y(p) is single-valued (if nonempty).

19 Note that this is applicable to the argmax of any objective function. If x*1 is the argument that maximizes function f(x), then x*1 coincides with the argument that minimizes the negative of this objective function. That is, if x*2 is the argument that minimizes -f(x), then x*1= x*2.



In the following figure the production set is weakly convex. In particular, it has a flat segment along which the isoprofit line associated with the highest profit level is tangent. Therefore, we can identify the set of production plans in the supply correspondence y(p), all of which generate the highest profit for the firm. Intuitively, the firm manager is indifferent among the input-output combinations within the y(p) region of tangency between the isoprofit line and the production set, since all these combinations yield the same profit level. Such a set is, of course, convex, since any convex combination of two production plans in the y(p) region also lies within that region. We can therefore conclude that the supply correspondence is a convex set.

Figure #4.32

(These graphs could also include the price vector, orthogonal to the isoprofit line.)

If production set Y is strictly convex, as the following figure illustrates, then the tangency condition between the isoprofit line and the production set occurs at a single point. Therefore, in this case the supply correspondence y(p) is single-valued.

Figure #4.33

(These graphs could also include the price vector, orthogonal to the isoprofit line.)



2. Hotelling’s lemma: If the supply correspondence y(pbar) consists of a single point, then the profit function π(p) is differentiable at pbar. Moreover, its gradient yields

$$\nabla \pi(\bar{p}) = y(\bar{p})$$

This lemma is an immediate application of the duality theorem that we described in consumer theory. The law of supply also applies here: quantities respond in the same direction as price changes. Mathematically expressed,

$$(p - p') \cdot (y - y') \geq 0, \quad \text{where } y \in y(p) \text{ and } y' \in y(p')$$





3. If the supply correspondence y(p) is differentiable at pbar, then its derivative $Dy(\bar{p}) = D^2\pi(\bar{p})$ is a symmetric and positive semidefinite matrix with $Dy(\bar{p})\,\bar{p} = 0$. This property has two immediate consequences. First, it implies that the elements in the main diagonal of the matrix Dy(pbar) are nonnegative. Recall that the elements in the main diagonal of this matrix describe the own substitution effects. We therefore know that

$$\frac{\partial y_k(p)}{\partial p_k} \geq 0 \quad \text{for all } k$$

Moreover, since the matrix Dy(pbar) is symmetric, we can conclude that the cross substitution effects are symmetric. That is,

$$\frac{\partial y_l(p)}{\partial p_k} = \frac{\partial y_k(p)}{\partial p_l} \quad \text{for all } l \text{ and } k$$

Importantly, nonnegative own substitution effects imply that quantities and prices move in the same direction, that is,

$$(p - p') \cdot (y - y') \geq 0$$

This implies that the supply function of the firm is positively sloped, as the following figure indicates. That is, the law of supply holds.
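The three properties of the supply correspondence can be verified numerically. The sketch below assumes a hypothetical technology q=z^{1/2} with output price p and input price w (these names and numbers are illustrative, not from the notes), and uses central finite differences rather than analytical derivatives.

```python
# Hypothetical technology: q = sqrt(z), output price p, input price w.
# Netput vector y = (q, -z); price vector (p, w).

def supply(p, w):
    """Profit-maximizing plan (q, -z), from the FOC p / (2*sqrt(z)) = w."""
    z = (p / (2 * w)) ** 2
    return (z ** 0.5, -z)

def profit(p, w):
    q, neg_z = supply(p, w)
    return p * q + w * neg_z

def gradient(f, p, w, h=1e-6):
    """Central finite-difference gradient of f at (p, w)."""
    return ((f(p + h, w) - f(p - h, w)) / (2 * h),
            (f(p, w + h) - f(p, w - h)) / (2 * h))

p, w = 8.0, 2.0
y = supply(p, w)                   # (2.0, -4.0)
grad_pi = gradient(profit, p, w)   # Hotelling's lemma: close to y

# Jacobian Dy: row i holds the derivatives of y_i with respect to (p, w).
Dy = [gradient(lambda a, b, i=i: supply(a, b)[i], p, w) for i in range(2)]

# Law of supply for a discrete price change from (p, w) to (p2, w2).
p2, w2 = 10.0, 1.5
y2 = supply(p2, w2)
law_of_supply = (p2 - p) * (y2[0] - y[0]) + (w2 - w) * (y2[1] - y[1])

print(grad_pi, Dy, law_of_supply)
```

In this example the Jacobian is approximately [[0.25, -1], [-1, 4]]: it is symmetric, has nonnegative diagonal entries, and satisfies Dy·(p,w)=0, while the law-of-supply product is positive.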

Figure #4.34



Note that, unlike in demand theory, the firm faces no budget constraint, and hence there is no wealth compensation requirement. This implies that there are no income effects, only substitution effects.20 Alternatively, the same result can be obtained from a revealed preference perspective:

when prices are p, the firm chooses y, so that $p \cdot y \geq p \cdot y'$;
when prices are p’, the firm chooses y’, so that $p' \cdot y' \geq p' \cdot y$.

Adding up both inequalities yields

$$(p - p') \cdot (y - y') = (p \cdot y - p \cdot y') + (p' \cdot y' - p' \cdot y) \geq 0$$

Cost minimization

Let us now analyze the combination of inputs the firm selects in order to minimize its total cost of production, conditional on reaching a particular output level. For simplicity, we focus on the single output case, where z is the input vector, f(z) is the production function, q denotes the units of the single output, and w>>0 is the vector of input prices.

Therefore, the cost minimization problem (CMP) can be stated as follows (we assume free disposal of output):


$$\min_{z \geq 0} \; w \cdot z \quad \text{s.t.} \quad f(z) \geq q \quad \text{(productive feasibility)}$$

In words, the firm selects a vector of inputs (or factors of production), z, that minimizes total costs, wz, subject to productive feasibility, i.e., f(z)≥q. The optimal vector of inputs is denoted as z(w,q), and it is usually referred to as the conditional factor demand correspondence (or function, if it is always single-valued).21 Intuitively, z(w,q) reflects the optimal demand for inputs of a firm when input prices are w and the firm wants to reach a production level q. The following figure provides a graphical representation of the above cost minimization problem for a firm producing output using two inputs, z1 and z2.

20 We return to this issue when analyzing the cost-minimizing problem for the firm, where we describe it in more detail.

21 The term “conditional” in this expression simply refers to the fact that z(w,q) represents the firm's demand for inputs, conditional on the requirement that the output level q be produced.



Figure #4.35

First, note that the input combinations on or above the isoquant f(z)=q are technologically feasible, while those below the isoquant are not. Therefore, the above CMP can be summarized as: “for input combinations along a given isoquant f(z)=q, choose the input combination associated with the lowest cost, wz, i.e., with the isocost line closest to the origin.” At input combination z(w,q) the firm cannot reduce its costs any further and still produce output level q. At this input combination, the firm's costs are wz=c(w,q), as depicted in the figure.22 Therefore, the input combination that minimizes costs is z(w,q), and the isocost line associated with that combination of inputs is {z : wz=c(w,q)}, where c(w,q) represents the lowest cost of producing output level q when input prices are w, and it is usually referred to as the cost function.23 Graphically, note that at the cost minimizing input combination z(w,q) the firm's isoquant is tangent to the isocost line. Let us prove this result by using the first order conditions of the above CMP.

$$w_k - \lambda \frac{\partial f(z^*)}{\partial z_k} \geq 0 \quad (= 0 \text{ if interior solution, } z_k^* > 0)$$

or in matrix notation,

$$w - \lambda \nabla f(z^*) \geq 0$$

and solving for the Lagrange multiplier, we obtain that

22 Note that input combinations on isocost lines above c(w,q), which use more inputs, still reach the isoquant f(z)=q and thus satisfy the constraint of this CMP. However, because they use more inputs, these input combinations are more costly than z(w,q) and are hence not cost minimizing. Similarly, isocost lines below c(w,q), which use fewer inputs, cannot be optimal either, since they do not reach output level q.

23 Note that, mathematically, the cost function c(w,q) is the value function of the CMP.

$$\frac{w_k}{w_l} = \frac{\partial f(z^*)/\partial z_k}{\partial f(z^*)/\partial z_l} = \left| MRTS_{k,l}(z^*) \right|$$

Note that, alternatively, this condition states that at the cost minimizing input combination, the marginal product per dollar spent on input k must be equal to the marginal product per dollar spent on input l. Otherwise, if the marginal product per dollar is larger for one input, the firm cannot be at the optimum, since it would have incentives to spend more money on the input for which the marginal product per dollar is larger. (Importantly, note that this tangency condition coincides with the one obtained for the profit maximization problem some pages above, showing that the CMP is the dual problem of the PMP.)24
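The tangency condition can be illustrated with the production function Q=50(LK)^{1/2} that these notes use as an example. The sketch below is only one way to check it: the closed-form demands follow from the first order conditions above, while the function names and the numerical values w=4, r=9, Q=300 are illustrative assumptions.

```python
import math

def conditional_demands(w, r, Q):
    """Cost-minimizing (L, K) for Q = 50*sqrt(L*K) at input prices w, r > 0."""
    scale = Q / 50.0
    return scale * math.sqrt(r / w), scale * math.sqrt(w / r)

def marginal_products(L, K):
    """Partial derivatives of 50*sqrt(L*K) with respect to L and K."""
    return 25.0 * math.sqrt(K / L), 25.0 * math.sqrt(L / K)

w, r, Q = 4.0, 9.0, 300.0             # illustrative numbers
L, K = conditional_demands(w, r, Q)   # approximately L = 9, K = 4
MPL, MPK = marginal_products(L, K)

output = 50.0 * math.sqrt(L * K)      # reaches the required output level
cost = w * L + r * K                  # minimal cost of producing Q

# Tangency: marginal product per dollar is equalized across inputs.
print(MPL / w, MPK / r)
print(output, cost)
```

Both ratios are approximately 4.17, so neither input offers a higher marginal product per dollar, and the minimal cost is approximately 72.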

Sufficiency: similarly to the PMP, the above first-order necessary conditions become sufficient when the production set is convex. The following figure illustrates a nonconvex production set, in which the input combination satisfying the first-order conditions is not the cost minimizing input combination z(w,q). Instead, z(w,q) occurs at the corner, where the firm only uses input 1.

Figure #4.36

A similar argument can be extended to linear production functions, as we describe in the following example.

Corner solutions: consider a firm with production function Q=10L+2K, where L and K denote the amounts of labor and capital, respectively. It is easy to check that the isoquant is a straight line with slope MRTS=-MPL/MPK=-10/2=-5. If input prices are w=$5 and r=$2, the isocost lines have a slope of -w/r=-2.5. If the firm wants to reach an output level of Q=200 units, the marginal product per dollar spent on labor (10/5=2) is higher than that on capital (2/2=1), inducing the firm to choose the input combination L=20, K=0 (a corner), for which the above tangency condition (the first order condition with equality) does not hold. The following figure illustrates this case.
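The corner solution in this example can be reproduced with a few lines of code. This is a minimal sketch: because both the isoquant and the isocost are straight lines, it only compares the two corners, which is enough whenever their slopes differ (if the slopes coincided, every point on the isoquant would be cost minimizing). The function name is an illustrative assumption.

```python
def cheapest_corner(Q, w, r, mp_l=10.0, mp_k=2.0):
    """Cost-minimizing corner for the linear technology Q = mp_l*L + mp_k*K."""
    only_labor = (Q / mp_l, 0.0)     # produce Q using labor alone
    only_capital = (0.0, Q / mp_k)   # produce Q using capital alone

    def cost(bundle):
        return w * bundle[0] + r * bundle[1]

    return min(only_labor, only_capital, key=cost)

L, K = cheapest_corner(Q=200, w=5, r=2)
print(L, K)             # 20.0 0.0
print(5 * L + 2 * K)    # 100.0, versus 200.0 when using capital alone
```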

24 Exercise: for a firm with production function Q=50(LK)^{1/2} (where L and K denote the amounts of labor and capital, respectively) that wants to reach a production level of Q units and faces input prices w and r, find the conditional factor demands for labor and capital.



Figure #4.37

Lagrange multiplier: Finally, note that the Lagrange multiplier λ can be interpreted as the increase in costs that the firm experiences when it needs to produce a marginally higher output level q.25 Therefore, the Lagrange multiplier is the marginal cost of production: the marginal increase in the firm's costs from producing an additional unit.
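As a worked illustration of this interpretation, consider again the production function Q=50(LK)^{1/2} with input prices w and r. The closed-form expressions below are a sketch derived from the first order conditions of the CMP (they are not stated in the notes), and they show that the multiplier coincides with the derivative of the cost function with respect to output.

```latex
\begin{aligned}
L(w,r,Q) &= \frac{Q}{50}\sqrt{\frac{r}{w}}, \qquad
K(w,r,Q) = \frac{Q}{50}\sqrt{\frac{w}{r}}, \\
c(w,r,Q) &= wL + rK = \frac{Q}{25}\sqrt{wr}, \\
\lambda &= \frac{w}{\partial f/\partial L}
         = \frac{w}{25\sqrt{K/L}}
         = \frac{w}{25\sqrt{w/r}}
         = \frac{\sqrt{wr}}{25}
         = \frac{\partial c(w,r,Q)}{\partial Q}.
\end{aligned}
```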

Comparative statics of z(w,q). Let us now continue with the comparative statics analysis. We first describe how the conditional factor demand correspondence is affected by changes in input prices. When the price of labor, w, falls, two effects occur:

1. A substitution effect. If output is held constant, there will be a tendency for the firm to substitute labor for capital in the production process.

2. An output effect. A reduction in the price of labor, w, reduces the firm’s costs, allowing it to produce larger amounts.

Let us next provide a graphical intuition behind these two effects and later on describe them mathematically. The following figure illustrates the substitution effect associated with a wage decrease. Starting from an initial cost-minimizing input combination z0(w,q), a reduction in the wage pivots outwards the firm’s isocost associated with cost level c(w,q).26 However, the firm is not cost minimizing if it selects a point along the new isocost. Indeed, it can reduce its total costs (graphically, pushing the new isocost inwards in a parallel fashion) until it reaches a tangency point with the isoquant. At the new cost-minimizing input combination z1(w,q) the firm is indeed selecting the input combination that minimizes total costs (at the new input prices) and reaches output level q. That is, z1(w,q) solves the new CMP for the firm after the change in input prices. Comparing the cost-minimizing input combinations before and after the fall in w, z0(w,q) and z1(w,q), we can observe that the firm uses more labor (the factor of production that became relatively cheaper) and less capital (the input that became relatively more expensive).

25 Recall that, generally, the Lagrange multiplier represents the change in the value function resulting from the optimization problem if we relax the constraint, e.g., a change in the wealth level in the UMP, in the utility level that must be reached in the EMP, etc.

26 Note that the isocost associated with the cost minimizing input combination entails a cost level c(w,q)=wl(w,q)+rk(w,q), where l(w,q) denotes the cost minimizing amount of labor and k(w,q) that of capital. A reduction in w therefore pivots the isocost line outwards, as depicted in the figure.

Figure #4.38

We can therefore conclude that the substitution effect in production is negative: a decrease in the price of one input increases the firm's demand for (use of) that input.27 There is, however, another effect associated with a decrease in the price of labor. In particular, the firm can now reach higher output levels while incurring the same total costs as before the input price change. We refer to this effect as the output effect. The following figure represents the output effect for our previous example.

Figure #4.39

27 Note that this is a consequence of the diminishing MRTS (isoquants becoming flatter as we increase labor in the figure).



Starting from the cost minimizing input combination after the price change (denoted as B in the figure), we can observe that the firm is able to reach a higher isoquant f(z)=q1 incurring the same total costs as before the price change. In particular, note that the isocost passing through input combination A (at old input prices) and that passing through C (at new input prices) are equally costly. We can hence decompose the increasing labor demand associated to a reduction in labor prices into two effects: a substitution effect (measured by the increase in labor demand from LA to LB) where the firm still produces the same amount as before the price change, and an output effect (measured by the increase in labor demand from LB to LC) where the firm still incurs the same total costs as before the price change, but is capable of reaching higher output levels. The sum of these two effects reflects the total effect of a decrease in wages on labor demand.

A couple of comments are in order. First, note that the own substitution effect (the effect of a change in the price of one input on the demand for that same input) is negative. The output effect is, perhaps surprisingly, also negative, even when inputs are regarded as inferior in production (i.e., when an increase in output implies using that input in lower amounts).28 Second, the cross-price effect is not necessarily negative, i.e., a decrease in wages can either increase or decrease the firm's demand for capital. We elaborate on these two points in our following mathematical treatment of the substitution and output effects.

Let lc(r,w,q) denote the conditional demand for labor (where “conditional” refers to the fact that the firm always produces output q)29, and let l(p,r,w) denote the unconditional demand for labor (which depends on the market price of the output and on input prices, but does not depend on a particular output level q). We know that at the profit maximizing output level, q(p,r,w), the conditional and unconditional demands for labor must coincide. That is,

$$l(p,r,w) = l^c(r,w,q(p,r,w))$$

Differentiating with respect to w yields

$$\frac{\partial l(p,r,w)}{\partial w} = \underbrace{\frac{\partial l^c(r,w,q)}{\partial w}}_{\text{substitution effect}} + \underbrace{\frac{\partial l^c(r,w,q)}{\partial q} \cdot \frac{\partial q}{\partial w}}_{\text{output effect}}$$

As indicated above, a reduction in wages produces an increase in the demand for labor when the firm maintains its production level unmodified. This increase in labor demand is reflected in the substitution effect. Nonetheless, a reduction in the price of labor allows the firm to increase production (reach a higher isoquant), i.e., dq/dw<0, and an increase in production is associated with an increase in the demand for labor, dlc(r,w,q)/dq>0. As a result, the output effect is also negative, reinforcing the substitution effect. Hence, the unconditional labor demand l(p,r,w) must be negatively sloped.30 The following figure illustrates the conditional and unconditional labor demands. The reduction in wages produces a relatively small increase in labor demand if output is fixed at q1, i.e., moving from A to B along the conditional labor demand lc(r,w,q1). This increase is reinforced by the output effect, since the firm is now capable of reaching a higher production level q2. The total effect, moving from A to C, is reflected by the unconditional labor demand l(p,r,w). Note that, because the total effect is larger than the substitution effect for all types of inputs, the unconditional labor demand must be flatter than the conditional labor demand.

28 For a longer discussion on why the output effect is always negative, see NS pp. 378-379 (especially footnote 15, and the accompanying explanations).

29 Recall that this conditional demand is denoted as z(w,q) in MWG using vector notation, i.e., z(w,q) includes the firm’s demand for all inputs, and w is the vector of input prices. Otherwise, both expressions are equivalent.

30 That is, we cannot observe “Giffen inputs”.
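The decomposition into substitution and output effects can be checked numerically. The sketch below assumes a hypothetical Cobb-Douglas technology q=l^{0.25}k^{0.25} (decreasing returns to scale, so that a profit maximizing output exists); the closed-form demands follow from the CMP and PMP first order conditions, and all names and numbers are illustrative assumptions. Derivatives are approximated by central finite differences.

```python
A, B = 0.25, 0.25   # hypothetical exponents, A + B < 1

def conditional_labor(w, r, q):
    """l^c(r, w, q): cost-minimizing labor when producing q = l**A * k**B."""
    return q ** (1 / (A + B)) * (A * r / (B * w)) ** (B / (A + B))

def optimal_output(p, w, r):
    """q(p, r, w): profit-maximizing output level."""
    return (p ** (A + B) * (A / w) ** A * (B / r) ** B) ** (1 / (1 - A - B))

def unconditional_labor(p, w, r):
    """l(p, r, w) = l^c(r, w, q(p, r, w))."""
    return conditional_labor(w, r, optimal_output(p, w, r))

def d(f, x, h=1e-6):
    """Central finite-difference derivative of a one-variable function."""
    return (f(x + h) - f(x - h)) / (2 * h)

p, w, r = 40.0, 4.0, 1.0
q_star = optimal_output(p, w, r)   # approximately 5

substitution = d(lambda x: conditional_labor(x, r, q_star), w)
output_effect = (d(lambda x: conditional_labor(w, r, x), q_star)
                 * d(lambda x: optimal_output(p, x, r), w))
total = d(lambda x: unconditional_labor(p, x, r), w)

print(substitution, output_effect, total)
```

With these numbers the substitution effect is approximately -1.56 and the output effect approximately -3.13, which add up to the total effect of approximately -4.69. The unconditional demand therefore responds more strongly to the wage than the conditional demand does.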

Figure #4.40

Let us now turn to the cross-price effects associated with a reduction in the price of one input. Importantly, we cannot make a precise prediction about how capital usage responds to a wage change. On one hand, after a fall in wages the firm will substitute away from capital (since it became relatively more expensive). As a consequence, the cross-price substitution effect is positive, i.e., $\frac{\partial K^c(r,w,q)}{\partial w} > 0$. On the other hand, the output effect we described above will cause more capital to be demanded by the firm as it expands production. This implies that the cross-price output effect is negative, i.e., $\frac{\partial K^c(r,w,q)}{\partial q} \cdot \frac{\partial q}{\partial w} < 0$. Therefore, we cannot conclude whether the cross-price substitution effect dominates the output effect (implying that the cross-price total effect is positive) or, instead, the cross-price output effect dominates the substitution effect (in which case the cross-price total effect is negative).31

31 For an interesting example of the substitution and output effects, see pages 379-380 in NS. (If you are revising these lecture notes you should expand on this example).

Cost function. Let us next describe some properties of the cost function c(w,q) (i.e., the value function associated with solving the CMP). If the production set Y is closed and satisfies the free disposal property, then

(i) c(.) is homogeneous of degree one in w and nondecreasing in q.
(ii) c(.) is a concave function of w.
(iii) If the sets {z≥0: f(z)≥q} are convex for every q, then Y = {(-z,q): w·z≥c(w,q) for all w>>0}.
(iv) z(.) is homogeneous of degree zero in w.
(v) If the set {z≥0: f(z)≥q} is convex, then z(w,q) is a convex set. Moreover, if {z≥0: f(z)≥q} is a strictly convex set, then z(w,q) is single valued.
(vi) Shephard’s lemma holds.

These properties are discussed in detail below:

1. The cost function c(w,q) is homogeneous of degree one in the input prices w, i.e., c(λw,q)=λc(w,q) for every λ>0. That is, increasing all input prices by a common factor λ induces a proportional increase in the minimal cost of production. As the following figure illustrates, an increase in all input prices by the same proportion produces a parallel downward shift in the firm's isocost line. If the firm needs to reach isoquant f(z)=q again, it needs to incur larger costs (shifting its isocost upwards) until it reaches f(z)=q.

Figure #4.41

2. The cost function c(w,q) is nondecreasing in output level q. Intuitively, producing higher output

levels implies a weakly higher minimal cost of production. The following figure illustrates this property.



Figure #4.42

3. If the sets {z : f(z)≥q} are convex for every output level q, then the production set can be equivalently described as

$$Y = \{(-z, q) : w \cdot z \geq c(w,q) \text{ for every } w \gg 0\}$$

The following figure illustrates this property. First, take an isoquant f(z)=q. Next, for input prices w=(w1,w2), find the cost function c(w,q) by solving the CMP. (We do that at input combination z(w,q) in the figure, with associated cost c(w,q)). Note that all input combinations satisfying the constraint f(z)≥q of the CMP lie on or above the isocost line wz=c(w,q), and none of them is cheaper than the cost minimizing input vector z(w,q). We can now repeat this process for other input prices w’=(w1’,w2’), for which we can find the cost minimizing input vector z(w’,q) with associated cost c(w’,q). If we repeat this process for infinitely many input price vectors, the intersection of the “more costly” input combinations, wz≥c(w,q) for every input price vector w>>0, describes the set f(z)≥q.

Figure #4.43
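The first two properties of the cost function, together with its concavity in input prices, can be checked on a concrete case. The sketch below assumes a hypothetical Cobb-Douglas production function f(z)=z1^a z2^b, whose cost function has a known closed form; the function name and the numerical values are illustrative assumptions, and the concavity check is only a midpoint test at two price vectors, not a proof.

```python
def cost(w1, w2, q, a=0.5, b=0.5):
    """Cost function of the hypothetical technology f(z) = z1**a * z2**b."""
    s = a + b
    return s * q ** (1 / s) * (w1 / a) ** (a / s) * (w2 / b) ** (b / s)

base = cost(1.0, 4.0, 10.0)     # approximately 40

# Homogeneity of degree one in input prices: tripling w triples the cost.
tripled = cost(3.0, 12.0, 10.0)

# Nondecreasing in output.
higher_q = cost(1.0, 4.0, 11.0)

# Concavity in w (midpoint test): the cost at the average price vector is
# at least the average of the costs at the two original price vectors.
other = cost(4.0, 1.0, 10.0)
midpoint = cost(2.5, 2.5, 10.0)

print(base, tripled, higher_q, midpoint)
```

Here the cost at the averaged prices is approximately 50, above the average cost of approximately 40, because the firm can re-optimize its input mix when relative prices change.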



Conditional factor demand correspondence, z(w,q). If the production set Y is closed and satisfies the free disposal property, then

1. The conditional factor demand correspondence z(w,q) is homogeneous of degree zero in the input prices, w, i.e., z(λw,q)=z(w,q). Intuitively, an increase in all input prices by the same proportion does not alter the firm's demand for inputs. We provide a graphical example of this property below. The firm is initially choosing the cost minimizing input vector z(w,q). When all inputs become more expensive, the firm's isocost line shifts downwards (in a parallel fashion, since the ratio of input prices has not been modified). However, if the firm wants to reach output level q, it must shift the isocost line upwards until reaching isoquant f(z)=q again. This, however, implies incurring larger costs, as described in our discussion of the cost function. Importantly, since relative input prices have not changed, the tangency between the isoquant and the isocost occurs at the same input combination, and therefore z(w,q) is unaffected by a common change in all input prices.

Figure #4.44

2. If the set {z : f(z)≥q} is strictly convex, then the firm's demand correspondence z(w,q) is single valued. If, in contrast, the set {z : f(z)≥q} is weakly convex, then the demand correspondence z(w,q) is a convex set. These two properties are illustrated in the following two figures, respectively. When the set {z : f(z)≥q} is strictly convex, a unique combination of inputs is cost minimizing, and therefore the demand correspondence z(w,q) is single valued. When, in contrast, the set {z : f(z)≥q} is weakly convex (e.g., it has a flat segment such as that in the figure), the firm can identify a set of cost minimizing input combinations where the isocost is tangent to the isoquant. This set of cost minimizing input combinations is itself convex, since any convex combination of two input vectors in the set yields an input combination that also lies in the set.



Figure #4.45

3. Shephard’s lemma. If the conditional factor demand correspondence z(wbar,q) consists of a single point, then the cost function c(w,q) is differentiable with respect to input prices, w, at wbar, and

$$\nabla_w c(\bar{w}, q) = z(\bar{w}, q)$$

Note that this lemma is another application of the duality theorem described in previous chapters.32

4. If z(w,q) is differentiable at wbar, then $D_w^2 c(\bar{w},q) = D_w z(\bar{w},q)$ is a symmetric and negative semidefinite (NSD) matrix, with $D_w z(\bar{w},q)\,\bar{w} = 0$.

a. First, note that Dwz(wbar,q) is a matrix representing how the firm's demand for every input responds to changes in the price of such input, or in the price of other inputs. Therefore, the fact that this matrix is negative semidefinite implies that the elements along the main diagonal must be negative (or zero). That is, own substitution effects are weakly negative,

$$\frac{\partial z_k(w,q)}{\partial w_k} \leq 0 \quad \text{for every input } k$$

Intuitively, an increase in the price of input k implies a (weak) reduction in the demand for this input.

32 If you are revising these lecture notes you should expand on the connection between Shephard’s lemma and the duality theorem.

b. Second, the fact that matrix Dwz(wbar,q) is symmetric implies that cross substitution effects are symmetric. That is,

$$\frac{\partial z_l(w,q)}{\partial w_k} = \frac{\partial z_k(w,q)}{\partial w_l} \quad \text{for all inputs } k \text{ and } l$$
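Shephard’s lemma and the properties of the matrix Dwz(w,q) can be verified numerically. The sketch below assumes a hypothetical technology f(z)=(z1 z2)^{1/2}, for which c(w,q)=2q(w1 w2)^{1/2}; names and numbers are illustrative assumptions, and derivatives are approximated by central finite differences.

```python
def cost(w1, w2, q):
    """Cost function of the hypothetical technology f(z) = sqrt(z1 * z2)."""
    return 2 * q * (w1 * w2) ** 0.5

def demand(w1, w2, q):
    """Conditional factor demands (z1, z2) for the same technology."""
    return (q * (w2 / w1) ** 0.5, q * (w1 / w2) ** 0.5)

def gradient_w(f, w1, w2, h=1e-6):
    """Central finite-difference gradient with respect to (w1, w2)."""
    return ((f(w1 + h, w2) - f(w1 - h, w2)) / (2 * h),
            (f(w1, w2 + h) - f(w1, w2 - h)) / (2 * h))

w1, w2, q = 4.0, 1.0, 10.0
z = demand(w1, w2, q)                                       # (5.0, 20.0)
grad_c = gradient_w(lambda a, b: cost(a, b, q), w1, w2)     # close to z

# Jacobian D_w z: row k holds the derivatives of z_k w.r.t. (w1, w2).
Dz = [gradient_w(lambda a, b, k=k: demand(a, b, q)[k], w1, w2)
      for k in range(2)]

# D_w z times w should be the zero vector (homogeneity of degree zero).
Dz_times_w = [row[0] * w1 + row[1] * w2 for row in Dz]

print(grad_c, Dz, Dz_times_w)
```

The Jacobian is approximately [[-0.625, 2.5], [2.5, -10]]: own-price effects are negative, cross-price effects coincide, and multiplying by the price vector gives approximately zero.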

Production function, f(z). If the production set Y is closed and satisfies the free disposal property, then

1. If the production function f(z) is homogeneous of degree one (i.e., if the production function exhibits constant returns to scale), then the cost function c(w,q) and the conditional factor demand correspondence z(w,q) are both homogeneous of degree one in output, i.e., c(w,λq)=λc(w,q) and z(w,λq)=λz(w,q). Intuitively, if the production function exhibits constant returns to scale, an increase in the output level the firm wants to reach induces an increase by the same proportion in the firm's demand for inputs and in its costs. The following figure illustrates this property. In particular, doubling the output level that the firm wants to produce (from 10 to 20 units, for instance) doubles the amount of inputs that the firm needs to use (because the firm's production function exhibits constant returns to scale). This increase in input usage implies, in turn, a doubling of the minimum cost that the firm must incur.

Figure #4.46

2. If the production function f(z) is concave, then the cost function c(w,q) is a convex function of output, q. In particular, marginal costs are nondecreasing in output. That is,33

$$\frac{\partial^2 c(w,q)}{\partial q^2} \geq 0, \quad \text{i.e., } \frac{\partial c(w,q)}{\partial q} \text{ weakly increases in } q$$
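Both properties can be seen in a single worked example. The expressions below are a sketch for a hypothetical Cobb-Douglas production function f(z)=z_1^a z_2^b with a,b>0 (not an example taken from the notes), which is homogeneous of degree one when a+b=1 and concave when a+b≤1.

```latex
\begin{aligned}
c(w,q) &= (a+b)\, q^{\frac{1}{a+b}}
          \left(\frac{w_1}{a}\right)^{\frac{a}{a+b}}
          \left(\frac{w_2}{b}\right)^{\frac{b}{a+b}}, \\
a+b = 1 \;&\Rightarrow\; c(w,\lambda q) = \lambda\, c(w,q), \\
\frac{\partial^2 c(w,q)}{\partial q^2}
       &= \frac{1-(a+b)}{a+b}\; q^{\frac{1}{a+b}-2}
          \left(\frac{w_1}{a}\right)^{\frac{a}{a+b}}
          \left(\frac{w_2}{b}\right)^{\frac{b}{a+b}}
          \;\geq\; 0 \quad \text{whenever } a+b \leq 1.
\end{aligned}
```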

33 If you are revising these lecture notes you should expand on this property.



Alternative representation of the PMP. We can alternatively represent the PMP using the cost function (i.e., the value function of the CMP). In particular,

$$\max_{q \geq 0} \; pq - c(w,q)$$

Note that in our previous discussion the firm chose an input combination yielding a particular output level, i.e., the z vector was the choice variable in the version of the PMP analyzed above. In contrast, the firm now chooses an output level, which yields a particular cost level, reflected in the cost function, c(w,q). (Recall that, in particular, the cost function contains information about the minimum cost that the firm must incur in order to produce output level q at given input prices w).

The first order conditions for q* to be profit maximizing in the above PMP are

$$p - \frac{\partial c(w,q^*)}{\partial q} \leq 0; \quad \text{and in interior solutions, } p = \frac{\partial c(w,q^*)}{\partial q}$$

Intuitively, at an interior optimum q*, price equals marginal cost, dc(w,q*)/dq.34
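The condition that price equals marginal cost can be checked with a brute-force search over output levels. The sketch below assumes a hypothetical cost function c(q)=2q²(wr)^{1/2} (the cost function of the technology q=l^{0.25}k^{0.25}) with illustrative prices w=4, r=1 and p=40; the grid search is a deliberately simple stand-in for solving the first order condition.

```python
W, R = 4.0, 1.0   # illustrative input prices

def cost(q):
    """Hypothetical cost function c(q) = 2 * q**2 * sqrt(w * r)."""
    return 2 * q ** 2 * (W * R) ** 0.5

def marginal_cost(q):
    """Analytical derivative of the cost function above."""
    return 4 * q * (W * R) ** 0.5

def best_output(p, q_max=20, n=20000):
    """Output level on an evenly spaced grid that maximizes p*q - c(q)."""
    grid = [i * q_max / n for i in range(n + 1)]
    return max(grid, key=lambda q: p * q - cost(q))

p = 40.0
q_star = best_output(p)
print(q_star, marginal_cost(q_star))   # approximately 5 and 40
```

At the profit maximizing output the marginal cost is approximately equal to the price of 40, as the first order condition requires.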

Firm’s expansion path

A firm’s expansion path represents the locus of cost-minimizing tangencies as the firm reaches higher production levels. We provide a graphical example of an expansion path below, in which the firm increases its demand for both labor and capital when it raises its output from q0 to q1 and from q1 to q2.35

34 MWG present a nice example of this problem. See example 5.C.1.

35 Note the analogy with wealth expansion paths in consumer theory: the wealth expansion path is the locus of utility-maximizing bundles for the consumer, i.e., it shows how the consumer’s demand for good 1 and 2 increases as wealth increases.



Figure #4.47

Intuitively, this figure shows that in order to produce more output, the firm needs more of all inputs. Graphically, this implies that the firm’s expansion path is positively sloped. Hence, both inputs are regarded as “normal” inputs (as opposed to inferior inputs) since

$$\frac{\partial K^c(w,q)}{\partial q} \geq 0 \quad \text{and} \quad \frac{\partial l^c(w,q)}{\partial q} \geq 0$$

If, instead, the firm uses fewer units of one input as output increases, we denote that input as inferior. The following figure illustrates one example in which both inputs are normal when the firm increases production from q0 to q1, but labor becomes inferior when output is further increased from q1 to q2.

Figure #4.48



The intuition behind inferior inputs is noteworthy: a firm uses fewer units of an input as it increases production. Indeed, most inputs are normal and few can be regarded as inferior. Note that even in the presence of an inferior input, the isoquants may keep their usual convex shape. Inferior inputs are easier to identify when the list of inputs used by a firm is relatively disaggregated. For example, among the labor inputs within a company we might have CEOs, executives, managers, accountants, secretaries, janitors, etc. First, note that these inputs do not necessarily increase in the same proportion as the firm increases output, i.e., expansion paths do not need to be straight lines. Moreover, after reaching a certain scale of output, the firm might buy, for instance, a powerful computer with which accounting can be done with fewer accountants, making the specific input “labor from accountants” an inferior input for the firm.

Remark: Note that if the firm’s expansion paths are straight lines from the origin, then all inputs increase in the same proportion as output is increased, as occurs, for instance, when the firm’s production function exhibits constant returns to scale (and, more generally, when it is homothetic). (Recall the figure of constant returns to scale from previous chapters).

Cost and Supply in the single output case

In this section we analyze cost functions and their relationship with the firm’s production function analyzed in previous sections of this chapter. Let us assume a given vector of input prices wbar>>0. Then the cost function c(wbar,q) can be reduced to C(q), where we consider that the vector of input prices remains constant. Therefore, the expressions of average and marginal costs are

$$AC(q) = \frac{C(q)}{q} \quad \text{and} \quad C'(q) = \frac{\partial C(q)}{\partial q},$$

where C’(q) = MC.

Recall also from our discussion of the PMP in the previous section that p≤c’(q) (where p=c’(q) at interior solutions36).

Remark: In previous classes we showed that the cost function is homogeneous of degree one in input prices. Let us now demonstrate that we can extend this property to the average and marginal cost expressions. First, if we increase all input prices by a common factor t, average cost becomes

$$AC(tw,q) = \frac{C(tw,q)}{q} = \frac{t \cdot C(w,q)}{q} = t \cdot AC(w,q)$$

And similarly for marginal costs,

$$MC(tw,q) = \frac{\partial C(tw,q)}{\partial q} = \frac{t \cdot \partial C(w,q)}{\partial q} = t \cdot MC(w,q)$$

At this point it is important to clarify a common confusion. Some students consider that our last result about the marginal cost function violates Euler’s theorem, since we show that both the cost function and its first order derivative with respect to output (the marginal cost function) are homogeneous of degree one in input prices. However, Euler’s theorem does not rule out this result; it predicts a different one: if the cost function is homogeneous of degree one in input prices, then the derivative of the cost function with respect to input prices, $\frac{\partial C(w,q)}{\partial w} = z(w,q)$, is homogeneous of degree zero in input prices. That is, the conditional factor demand correspondence z(w,q) is homogeneous of degree zero in input prices, which holds, as shown in previous sections of this chapter.

36 Recall that this expression states that all output levels for which the firm’s marginal cost equals the market price of the output belong to the firm’s supply correspondence, y(p).

Graphical analysis of total costs.

Let us examine next the relationship between returns to scale and total costs for different production functions. The following figure represents the case of a constant returns to scale technology, such as a Cobb-Douglas production function where Q=50(LK)^{1/2}. In this case, total costs maintain a constant relation with output, i.e., TC=c·Q.37

Figure #4.49

As a consequence, we can conclude that the average cost of this firm is constant, i.e., AC(Q)=TC/Q=c, and so is the marginal cost, i.e., $MC = \frac{\partial TC}{\partial Q} = c$, as the next figure depicts.

Figure #4.50

37 For a specific example, consider the case in which a firm with this production function faces input prices w=$5 and r=$100. Solving the CMP yields TC(Q)=(Q/25)(wr)^{1/2}, which in this example implies TC(Q)≈0.89Q.



In the case that total costs are not proportional to output (i.e., the production function does not exhibit constant returns to scale), the analysis of average and marginal costs becomes more involved, as the following figure illustrates.

Figure #4.51

In this figure total costs initially grow very rapidly, then become relatively flat, and for high production levels increase rapidly again.38 Graphically, note that average costs are represented by the slope of the ray connecting any point along the total cost curve, such as A, with the origin.39 Rays connecting the origin with the total cost curve are initially very steep for low production levels (implying high average costs in the bottom figure), become flatter as we increase production, reaching a minimum slope (where the AC also reaches its minimum in the bottom figure), and finally, when output is further increased, rays from the origin to the total cost curve become steeper again, leading to an increase in the corresponding AC curve.

Similarly, the firm’s marginal costs of production are represented by the slope of the total cost curve at any given point, such as A, where the slope of tangent line BAC is 10. Initially the slope of the total cost function is high but decreasing in the concave portion of the total cost curve (i.e., marginal costs are initially decreasing in output); it becomes almost zero at the inflection point of the total cost curve (where the corresponding marginal cost curve is close to zero); and it grows again in the convex region of the total cost curve (i.e., marginal costs increase in output).

38 This might occur, for instance, when a third factor of production is present in the production process, such as the entrepreneurial skills of the founder of the firm: total costs grow fast initially, then they are almost unaffected by increases in production, but when the firm’s scale (output) becomes sufficiently large, the entrepreneur cannot manage the firm by himself and needs to hire additional managers who do not have the specific skills that he possesses, inducing a significant increase in costs.

39 This can be further understood by noticing that at point A, total costs are $1,500 and output is 50 units, implying an average cost of $1,500/50=$30, which coincides with the slope of the ray connecting point A with the origin.

Three elements of the above figures are especially noteworthy.

1. First, both the AC and MC curves originate at the same level for output q=0, as the following figure illustrates.

Figure #4.52

In order to show this property, note that we cannot directly compute the average cost at q=0, given that AC(0)=TC(0)/0=0/0. We can nonetheless apply L’Hôpital’s rule, as follows

\[ \lim_{q \to 0} AC(q) = \lim_{q \to 0} \frac{C(q)}{q} = \lim_{q \to 0} \frac{\partial C(q)/\partial q}{\partial q/\partial q} = \lim_{q \to 0} MC(q) \]

We can therefore conclude that AC=MC at q=0.

2. When MC>AC, the AC increases, and when MC<AC, the AC declines. The relationship between the average of a variable (in this case costs) and the marginal of that same variable can be understood using the example of grades. If, before revealing your result in a new exam, your instructor tells you that your score in the new exam is helping you raise your average grade in the class, it must be that such new grade is better than your average so far. In this case, the marginal effect is higher than your previous average, inducing an increase in your average grade in the class. (For the case of MC>AC, this implies that producing an additional unit increases total costs so much that the firm’s average costs per unit experience an increase). In contrast, if your instructor informs you that your score in the new exam lowers your current average, it means that your score in the exam was below your average in the class. (For the case of total costs, MC<AC implies that additional production induces only a slight increase in total costs, producing a decline in the firm’s average costs per unit).

3. Finally, note that the AC and MC curves cross at exactly the minimum of the AC curve. In order to show that, let us find the minimum of the AC curve.

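\[ \frac{\partial AC(q)}{\partial q} = \frac{\partial}{\partial q}\left(\frac{c(q)}{q}\right) = \frac{q \cdot \frac{\partial c(q)}{\partial q} - c(q) \cdot 1}{q^2} = \frac{q \cdot MC(q) - c(q)}{q^2} = 0 \]

\[ q \cdot MC(q) - c(q) = 0 \;\Rightarrow\; q \cdot MC(q) = c(q) \]

\[ MC(q) = \frac{c(q)}{q} = AC(q) \quad \text{at the value of } q \text{ for which } AC(q) \text{ is minimized} \]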

Hence, MC(q)=AC(q) at the value of q for which the AC(q) curve is minimized.

Let us now continue with our analysis of cost and supply curves in the single output case. The following figures depict a firm with a technology that exhibits strictly decreasing returns to scale. Indeed, an increase in the use of inputs produces a less than proportional increase in output, i.e., the firm’s production set is strictly convex, as figure (a) illustrates. For simplicity, we normalize the price of input z to $1. Thus, the use of zbar units of input implies a cost of zbar dollars. This normalization helps us represent the firm’s cost function as a 90-degree rotation of the production set, where the vertical axis in figure (a) (representing output) becomes the horizontal axis in figure (b), and the horizontal axis in figure (a) becomes the firm’s total cost on the vertical axis of figure (b). As a consequence, the firm’s total cost function (figure b) is convex. This, in turn, implies that marginal costs are increasing (as depicted in figure c), since the slope of the firm’s total cost function increases in q. Similarly, average costs are also increasing in output (as represented in figure c), given that the slope of the ray connecting the origin with any point along the total cost function increases as we raise output. Finally, note that the firm’s supply correspondence is identified by the locus of points for which the firm produces an output level q such that market price equals marginal cost. We showed in previous sections of this chapter that this condition is not only necessary but also sufficient for an optimal production plan in the PMP when production sets are convex, as in the case we are analyzing.40

Figure #4.53

40 When production sets are non-convex, we might encounter cases in which this first-order necessary condition is not sufficient for a profit-maximizing production plan. We expand on this result below.

The following figures provide a similar analysis for technologies exhibiting constant returns to scale. First, note that an increase in input usage produces a proportional increase in output. Rotating the firm’s production set 90 degrees, we obtain a linear total cost function, with constant average and marginal cost curves, as described in previous sections. In this case, the firm’s supply correspondence consists of the output levels q that satisfy p=MC(q). Intuitively, if p<MC(q) for any q>0 the firm does not supply positive amounts; if p=MC(q) the firm is indifferent among all output levels, since each of them yields zero profits; and if p>MC(q) the firm would want to supply infinitely large amounts of output.

Figure #4.54

Finally, when the firm’s production set is non-convex, such as that depicted in figure (a) below, we obtain a total cost curve that first increases very rapidly, becomes almost flat for intermediate levels of output, and increases rapidly again for large scales of output. In this case, as described above, AC and MC start from the same origin, MC lies below AC for low levels of output, and MC lies above AC for output levels above the minimum of the AC curve. Importantly, in this case the firm’s supply curve does not exactly coincide with the marginal cost curve, but rather with the portion of the MC curve above AC. Indeed, note that for market prices below AC(qbar), the firm would sell units at a price below AC (incurring a loss of AC-MC per unit). As a consequence, the firm sells no units for p<AC(qbar) (and we represent that by the vertical spike at the vertical axis) but sells output levels along the MC curve for p>AC(qbar).

Figure #4.55
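The three properties of the AC and MC curves, and the resulting supply rule, can be checked with a numerical sketch. The cubic cost function C(q)=q^3-4q^2+10q used below is a hypothetical example (it is not the function behind the figures), chosen because it has no fixed costs and an S-shaped total cost curve.

```python
import math

def C(q):
    """Hypothetical S-shaped total cost function with C(0) = 0."""
    return q**3 - 4 * q**2 + 10 * q

def MC(q):
    """Marginal cost: derivative of C with respect to q."""
    return 3 * q**2 - 8 * q + 10

def AC(q):
    """Average cost, defined only for q > 0."""
    return C(q) / q

# Property 1: AC and MC originate at the same level as q approaches zero
print(AC(1e-6), MC(0.0))

# Property 2: AC falls where MC < AC and rises where MC > AC
print(MC(1.0) < AC(1.0), AC(1.1) < AC(1.0))
print(MC(3.0) > AC(3.0), AC(3.1) > AC(3.0))

# Property 3: AC(q) = q^2 - 4q + 10 is minimized at q = 2, where MC = AC
q_bar = 2.0
print(MC(q_bar), AC(q_bar))

def supply(p):
    """Supply: zero for p below min AC, otherwise the q on the rising part of MC with MC(q) = p."""
    if p < AC(q_bar):
        return 0.0
    # Larger root of 3q^2 - 8q + (10 - p) = 0
    return (8 + math.sqrt(64 - 12 * (10 - p))) / 6

for p in (5.0, 6.0, 10.0):
    q = supply(p)
    print(p, q, p * q - C(q))   # price, output supplied, profits
```

For prices below AC(qbar) the sketch returns zero output (the vertical spike), and for higher prices it returns the output level on the portion of the MC curve lying above AC, where profits are nonnegative.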

Although it is logical to take the assumption of preference maximization as a primitive concept for the theory of the consumer, the same cannot be said for the assumption of profit maximization by the firm. The objectives of the firm should emerge from the objectives of those individuals who control it. A firm owned by a single individual has well-defined objectives: those of the owner. In this case, the only issue is whether this objective coincides with profit maximization. Whenever there is more than one owner, however, we have an added level of complexity (for a full analysis of the objectives of the firm, see MWG pp. 152-154).

Cost and supply in the single output case

The following figures examine the presence of nonconvexities in the production set Y arising from the existence of fixed setup costs, K, which are nonsunk. In particular, the firm's cost function can be represented in this case as

\[ C(q) = \begin{cases} 0 & \text{if } q = 0 \\ K + C_v(q) & \text{if } q > 0 \end{cases} \]

where Cv(q) denotes variable costs. The figure below illustrates the case in which variable costs are linear in output. Note that the firm's costs in figure (b) are zero for q=0, but K (or greater) for any strictly positive amount of output, because the firm will choose not to incur the fixed costs if it does not want to produce q>0. In addition, note that average costs are K/q+Cv(q)/q, where Cv(q)/q is a constant due to the linearity of the production function, i.e., Cv(q)=c*q implying that Cv(q)/q=c. Thus, the marginal cost is constant in output at c=C’v(q). Average costs decline in q since K/q declines in q and Cv(q)/q is constant in q. Note that since average costs are K/q+c, average costs, despite declining, lie above the firm’s marginal cost for all production levels, approaching a horizontal asymptote at the marginal cost as q goes to infinity. Finally, regarding the firm’s supply curve, recall that the firm supplies positive amounts only if market prices are high enough to recover both variable and fixed costs, i.e., if prices are above average total costs. Since in this case average costs lie above marginal costs for all output levels, the firm’s supply curve is a vertical spike (zero output) for prices below pbar.

Figure #4.56. Panels: (a) production function, (b) cost curve, (c) supply curve.

In the case that variable costs are nonlinear in output, the firm's total cost function also starts at K, but increasing variable costs imply that every additional unit is more costly for the firm, i.e., total costs are convex in output, as depicted in figure (e). This is confirmed in figure (f), where marginal costs are positive and increasing in output (indicating that the slope of the total cost curve increases in output).41 Regarding the average cost curve, note that it initially decreases and then increases in q. Intuitively, in the decreasing portion of the AC curve, the firm benefits from spreading its fixed costs over larger output levels (while variable costs are still relatively low). In the increasing portion of the AC curve, in contrast, the firm’s larger average variable costs more than offset the lower average fixed costs from spreading its fixed costs over larger output levels and, as a consequence, average total costs grow. Note that AC crosses the MC curve at exactly the output level qbar for which the slope of the total cost curve in figure (b), i.e., the MC, coincides with the slope of the ray connecting the total cost curve at that point with the origin, i.e., the AC. Finally, the firm produces positive amounts of output when market prices are above average total costs of production. Otherwise the firm produces zero output, as depicted in the supply curve of figure (c).

Figure #4.57

In the following figure, we slightly modify our above description by assuming that the firm's fixed costs are now sunk. First, note that a portion of the firm's production set is now not included, since the firm cannot modify input levels within that interval (figure a), i.e., inaction is not possible in this case. Similarly, the total cost curve now originates at K, given that the firm must incur fixed sunk costs K. Finally, note that the supply locus in this case comprises the entire marginal cost curve, and not only the output levels for which MC>AC, as in the case in which the firm experiences convex costs under the presence of fixed (nonsunk) costs. Intuitively, since the firm now faces sunk costs, it will not shut down even if it is obtaining negative profits in the short run.42

41 Examples of total cost curves: (1) TC(q)=a+bq, where a,b>0, is a linear cost function with fixed costs a>0; (2) TC(q)=bq^2 represents the presence of convex production costs, but without fixed costs (note that in this case, marginal costs lie above average costs); and (3) TC(q)=a+bq^2 illustrates the presence of convex variable costs and fixed costs.

42 Note that this supply locus resembles that of a firm with a convex cost function but facing no fixed costs at all.

Figure #4.58
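The difference between nonsunk and sunk fixed costs can be illustrated with a short sketch. It assumes the cost function TC(q)=a+bq^2 with hypothetical parameters a=100 and b=1, so that MC(q)=2bq and AC(q)=a/q+bq; none of these numbers come from the figures.

```python
import math

A_FIXED, B = 100.0, 1.0   # hypothetical parameters of TC(q) = a + b*q^2

def mc(q):
    return 2 * B * q

def ac(q):
    return A_FIXED / q + B * q

q_bar = math.sqrt(A_FIXED / B)   # output level minimizing AC
p_bar = ac(q_bar)                # minimum average cost, where AC crosses MC
print(q_bar, p_bar, mc(q_bar))

def supply_nonsunk(p):
    """Fixed cost avoidable by producing zero: supply only the part of MC above AC."""
    return p / (2 * B) if p >= p_bar else 0.0

def supply_sunk(p):
    """Fixed cost already paid: supply along the entire MC curve."""
    return p / (2 * B)

def profit(p, q, sunk):
    """Profits, paying the fixed cost when it is sunk or when output is positive."""
    fixed = A_FIXED if (sunk or q > 0) else 0.0
    return p * q - fixed - B * q**2

p = 15.0   # a price below the minimum average cost
print(supply_nonsunk(p), profit(p, supply_nonsunk(p), sunk=False))
print(supply_sunk(p), profit(p, supply_sunk(p), sunk=True), profit(p, 0.0, sunk=True))
```

At a price below pbar, the firm with nonsunk costs prefers inaction (zero profits), whereas the firm with sunk costs produces where p=MC: it still makes losses, but smaller ones than if it produced nothing and lost the entire sunk cost.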

Short-run total costs

In this section we examine the firm's minimal cost of production when one of the inputs is fixed at a certain level. Since the firm doesn't have the flexibility of input choice in the short run, the firm will generally incur higher costs than in the long run. In other words, the firm will not be able to choose an input combination in which the isoquant and isocost are tangent to each other and, as a consequence, the MRTS will not be equal to the ratio of input prices.

Let us first analyze an example, depicted in the following figure, where capital is fixed in the short run at Kbar.43 In the long run, if the firm were capable of choosing any cost-minimizing input combination, it would select the input vector denoted by A in the figure, where isoquant Q0 and the isocost line are tangent. In the short run, however, the firm cannot alter the amount of capital from Kbar and hence, if the firm must still reach a production level of Q0, the firm manager will need to choose input combination F, associated with a higher isocost line. Therefore, the firm's inability to modify the amount of capital being used induces the firm to incur higher costs.

43 Capital can be fixed in the short run if the process of financing the acquisition of new equipment is relatively slow, or for other technological reasons, making labor more flexible in the short run, i.e., having to build a new production plant vs. hiring more workers to keep the factory open longer in the short run. Nonetheless, a similar analysis can be extended to production processes in which labor is the fixed input in the short run while capital is variable. This might be the case in certain highly-qualified occupations where the scarce resource is the precise human capital of the job candidate, whereas the capital equipment that the firm uses is so standardized that the firm can easily acquire it in 1-2 business days, e.g., computers, software packages, etc.

Figure #4.59

The following figure illustrates a similar situation where, for an output level of 1 million TVs per year, the firm chooses input combination (k1,l1) both in the long run and in the short run when its capital structure is fixed at exactly k=k1. In this case, we can conclude that the firm's minimal cost of producing 1 million TVs per year is the same in the long run and in the short run when its capital structure is fixed at k=k1. This point is graphically illustrated in the figure below where we represent the firm's long run cost function TC(q), and short run cost function when k=k1. When production requirements are increased to 2 million TVs per year, however, a capital level of k=k1 does not allow the firm to minimize costs. Indeed, in the short run the firm selects input combination B, associated to a higher isocost line, while in the long run the firm selects input combination C, associated to a lower isocost line. This difference in the short and long run costs for a capital of k=k1 is also illustrated in the bottom figure where short run total costs when k=k1 are higher than long run costs for a production level of q=2 million.

Figure #4.60

Figure #4.61

We can repeat this analysis for different capital structures, reaching similar conclusions, as the following figure illustrates. Indeed, short run total costs lie above long-run total costs, except for the case in which, for a given output level, in the long run the firm chooses to use exactly the amount of capital that the firm is obliged to use (fixed input) in the short run, i.e. the input is fixed at the long run optimal level.

Figure #4.62

Let us next provide an example of our previous discussion. Consider a firm using two inputs in order to produce one output. The firm’s cost function in the long run is given by \( C(q) = w_1 z_1 + w_2 z_2 \), where both inputs 1 and 2 are variable. In the short run, however, input 2 is fixed at a level z2bar, while input 1 is variable. The firm’s short-run cost function when input 2 is fixed at a level z2bar is therefore44

\[ C(q \mid \bar{z}_2) = w_1 z_1 + w_2 \bar{z}_2, \quad \text{where } \bar{z}_2 \text{ is fixed} \]

The following figure compares the firm’s long-run, C(q), and short-run cost curves, C(q|z2), for different levels of the fixed input (input 2). As described above, C(q)≤C(q|z2) for any given level of z2, since in the long run the firm is capable of selecting the exact value of input 2, z2, that minimizes the firm’s cost of producing q units of output. In contrast, in the short run the firm must take the value of z2 as given.

Figure #4.63

Note that the long-run and short-run cost functions coincide (the firm incurs the same costs) at those output levels, q, for which the firm’s long-run factor demand for input 2, z2(w,q), exactly coincides with the level at which input 2 is being fixed in the short run, z2bar. A similar argument extends to the short-run cost function when input 2 is fixed at z21, which coincides with the long-run cost function when the firm’s (long-run) demand for input 2, z2(w,q), is exactly z21.45

44 Note that this implies that the firm uses only input 1 in order to reach output level q, i.e., chooses z1 such that f(z1,z2bar)=q. (This explanation parallels our previous discussion about a firm increasing labor amounts, for a fixed capital level Kbar, in order to reach a particular output level Q0). Therefore, the only choice variable for the firm in the short run is the amount of input 1, z1.

45 This discussion parallels our above explanation about the case in which input combination A is cost-minimizing both in the long run and in the short run (when capital level is fixed at K1) since, in the long run, the firm’s demand for capital when

We can therefore conclude that when the demand for input 2 is at its long-run value, z2(w,q), the short-run and long-run costs coincide,

C(q)=C(q|z2(w,q)) for all output levels q

From the above figure we can obtain an additional conclusion: when the short-run cost function is evaluated at the long-run demand for input 2, not only do the level of the long-run and short-run cost functions coincide (i.e., their heights coincide in the figure), but their slopes coincide as well. That is,

C’(q)=C’(q|z2(w,q)) for all output levels q

Geometrically, this means that the slope of the long-run cost curve coincides with that of the short-run cost curve at the output level q for which the fixed level of input 2 equals its long-run demand, z2(w,q); in other words, the long-run and short-run cost curves are tangent at that point. This result, together with our above result of C(q)≤C(q|z2), implies that the long-run cost curve C(q) is the lower envelope of the short-run cost curves, C(q|z2).
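The envelope relationship can be verified with a minimal numerical sketch. It assumes the technology f(z1,z2)=(z1z2)^{1/2} and hypothetical input prices w1=4 and w2=9, neither of which is taken from the notes; under these assumptions the long-run cost function is C(q)=2q(w1w2)^{1/2} and the long-run demand for input 2 is z2(w,q)=q(w1/w2)^{1/2}.

```python
import math

W1, W2 = 4.0, 9.0   # hypothetical input prices

def long_run_cost(q):
    """Long-run cost C(q) for the assumed f(z1, z2) = sqrt(z1 * z2)."""
    return 2 * q * math.sqrt(W1 * W2)

def long_run_z2(q):
    """Long-run conditional demand for input 2, z2(w, q)."""
    return q * math.sqrt(W1 / W2)

def short_run_cost(q, z2_bar):
    """Short-run cost C(q | z2_bar): input 1 adjusts so that sqrt(z1 * z2_bar) = q."""
    z1 = q**2 / z2_bar
    return W1 * z1 + W2 * z2_bar

def slope(f, q, h=1e-5):
    """Central-difference approximation of f'(q)."""
    return (f(q + h) - f(q - h)) / (2 * h)

q0 = 6.0
z2_bar = long_run_z2(q0)   # input 2 fixed at its long-run optimal level for q0

# Short-run costs are never below long-run costs, and coincide at q0
for q in (3.0, 6.0, 12.0):
    print(q, long_run_cost(q), short_run_cost(q, z2_bar))

# At q0 the slopes of the two cost curves coincide as well (tangency)
print(slope(long_run_cost, q0), slope(lambda q: short_run_cost(q, z2_bar), q0))
```

Repeating the exercise for other values of q0 traces out a family of short-run cost curves, each touching the long-run cost curve at exactly one output level.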

Aggregation in production

In this section of our discussion on production theory we investigate under which conditions the “law of supply” holds at the aggregate level, and under which conditions we can define a “representative producer” that parallels the “representative consumer” in consumer theory.

As a side note, an aggregate production function is a function that maps aggregate inputs into aggregate outputs. In other words, it describes the maximum level of output that can be obtained if the inputs are used efficiently in the production process.

Consider J firms with production sets Y1, Y2, …, YJ, where each production set Yj is nonempty, closed, and satisfies the free-disposal property. In addition, assume that every supply correspondence yj(p) for firm j is single valued46 and differentiable in prices (where p>>0). Let us define the aggregate supply correspondence for this economy as the sum of the individual supply correspondences

\[ y(p) = \sum_{j=1}^{J} y_j(p) = \left\{ y \in \mathbb{R}^L : y = \sum_{j=1}^{J} y_j \right\} \]

producing Q=1 million TVs is exactly K=K1. For different capital levels, however, the short-run cost-minimizing input combination does not coincide with that of the long-run, leading to higher costs in the short-run.

46 Note that this implies that production sets are strictly convex and hence the tangency condition between the firm’s isoprofit line and the production set holds at a single input-output point.

For firm j’s profit-maximizing production plan yj(p), for all firms j=1,2,…,J.

Law of supply

The law of supply is satisfied in the aggregate. We can show this in either of two ways:

1. Using the derivative of every firm’s supply correspondence with respect to prices, Dpyj(p). This derivative defines a symmetric positive semidefinite matrix, for every firm j. Since this property is preserved under addition (when we aggregate across all firms in the economy), we can conclude that the derivative of the aggregate supply correspondence with respect to prices, Dpy(p), must also define a symmetric positive semidefinite matrix. Intuitively, an increase in market prices increases the aggregate output supplied by all firms.

2. Using a revealed preference argument. In particular, recall that for every firm j we have that

\[ [p - p'] \cdot [y_j(p) - y_j(p')] \geq 0 \quad \text{for every firm } j. \]

We can hence add over all J firms, obtaining

\[ [p - p'] \cdot [y(p) - y(p')] \geq 0, \]

which implies that market prices and aggregate supply move in the same direction, i.e., the law of supply holds in the aggregate.
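The revealed preference inequality can be checked numerically. The sketch below assumes three firms with technology q=a_j z^{1/2} (a hypothetical specification with made-up productivity parameters), where each production plan is written as a net supply vector (-z, q) and the price vector is (w, p_out). It draws random pairs of price vectors and evaluates [p-p']·[y(p)-y(p')] for the aggregate supply.

```python
import random

A = [1.0, 2.0, 3.5]   # hypothetical productivity parameters a_j

def firm_supply(prices, a):
    """Profit-maximizing net supply (-z, q) for the assumed technology q = a*sqrt(z)."""
    w, p_out = prices
    z = (p_out * a / (2 * w)) ** 2   # first-order condition of the firm's PMP
    return (-z, a * z ** 0.5)

def aggregate_supply(prices):
    """Sum of the individual firms' net supply vectors."""
    plans = [firm_supply(prices, a) for a in A]
    return tuple(sum(component) for component in zip(*plans))

def supply_gap(p, p_new):
    """Inner product [p - p'] . [y(p) - y(p')], nonnegative under the law of supply."""
    y, y_new = aggregate_supply(p), aggregate_supply(p_new)
    return sum((a - b) * (c - d) for a, b, c, d in zip(p, p_new, y, y_new))

random.seed(0)
gaps = []
for _ in range(1000):
    p = (random.uniform(0.5, 5.0), random.uniform(0.5, 5.0))
    p_new = (random.uniform(0.5, 5.0), random.uniform(0.5, 5.0))
    gaps.append(supply_gap(p, p_new))

# Smallest value found across all random price pairs (allowing for rounding error)
print(min(gaps) >= -1e-9)
```

The check is only as general as the assumed technology, but it shows the mechanics of the argument: each firm's inequality holds separately, so their sum holds as well.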

Representative producer

Let us first define the aggregate production set as

\[ Y = Y_1 + Y_2 + \dots + Y_J = \left\{ y \in \mathbb{R}^L : y = \sum_{j=1}^{J} y_j \right\} \]

for every firm j’s production plan yj in Yj. Note that in \( y = \sum_{j=1}^{J} y_j \), every production plan for firm j, yj, is just a feasible production plan for firm j, but not necessarily firm j’s profit-maximizing production plan (i.e., its supply correspondence, yj(p)). Let y*(p) be the supply correspondence for the aggregate production set Y (i.e., the supply correspondence that maximizes aggregate profits), and let π*(p) denote the associated profits from this supply correspondence y*(p).

We can now claim that there exists a representative producer producing an aggregate supply y*(p) that exactly coincides with the sum of the individual firms’ supply correspondences, i.e., \( y^*(p)=\sum_{j=1}^{J} y_j(p) \), and obtaining aggregate profits π*(p) that exactly coincide with the sum of the individual firms’ profit functions, i.e., \( \pi^*(p)=\sum_{j=1}^{J} \pi_j(p) \). Intuitively, the aggregate profit obtained by each firm maximizing profits separately (taking prices as given) is the same as that which would be obtained if all firms were to coordinate their actions (their production plans yj) in a joint profit-maximizing decision. Importantly, this is a “decentralization” result. Indeed, it suggests that in order to find the solution of the joint profit maximization problem for given prices p, it is enough to say “let each individual firm do what’s best for it” and add the solutions of their individual PMPs. This result is sometimes referred to as supporting “laissez faire” arguments, since it suggests that the social planner should let every firm j choose its own production plan yj that maximizes its own profits (i.e., every firm independently selecting its own yj(p)), since this production plan will maximize aggregate profits.
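The decentralization result can be illustrated with a small sketch. It assumes two price-taking firms with technologies q=a_j z^{1/2} and hypothetical prices and parameters (none taken from the notes). The joint problem is solved by brute-force grid search over both firms' input levels, and the result is compared with the sum of the solutions to the individual PMPs.

```python
W, P = 1.0, 2.0    # hypothetical input and output prices
A = (1.0, 2.0)     # hypothetical productivity parameters a_1, a_2

def profit(a, z):
    """Profit of a firm with the assumed technology q = a*sqrt(z) using z units of input."""
    return P * a * z ** 0.5 - W * z

def individual_optimum(a):
    """Closed-form solution of the firm's own PMP (from its first-order condition)."""
    z = (P * a / (2 * W)) ** 2
    return z, profit(a, z)

grid = [i * 0.05 for i in range(161)]   # candidate input levels from 0 to 8

# Joint problem: choose (z1, z2) to maximize the sum of both firms' profits
joint_profit, z1_joint, z2_joint = max(
    (profit(A[0], z1) + profit(A[1], z2), z1, z2) for z1 in grid for z2 in grid
)

separate_profit = sum(individual_optimum(a)[1] for a in A)

print(joint_profit, separate_profit)
print((z1_joint, z2_joint), [individual_optimum(a)[0] for a in A])
```

Because both firms take the same prices as given, the joint optimum selects the same input levels that each firm would choose on its own, and aggregate profits equal the sum of individual profits.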

This intuition is illustrated in the following figure, representing firm 1’s and firm 2’s production sets, Y1 and Y2. Firm 1 maximizes profits choosing a supply correspondence y1, and firm 2 does so selecting y2.47 If we add vectors y1 and y2, we obtain y1+y2 in the figure. Importantly, the aggregate supply correspondence y1+y2 coincides with the supply correspondence that a single firm manager would select if the firm’s production set was described by the aggregate production set Y=Y1+Y2 when facing the same price vector as firms 1 and 2. Hence, jointly both firms would be selecting y1+y2 given p, and given aggregate production set Y. Note also that all isoprofit lines are parallel, since all firms face the same price vector.

Figure #4.64

Finally, note that one of the key assumptions in order to obtain the above “decentralization result” is that firms take prices as given. If, in contrast, firms’ decision about how much to produce has an effect on market prices, the above “decentralization result” is not necessarily satisfied.48

Efficient production

Let us continue with our discussion of when individual firms choose profit-maximizing production plans that maximize aggregate profits. In this regard, let us define efficient production vectors. We say that a production vector y∈Y is efficient if there is no other production vector y’∈Y such that y’≥y and y’≠y. That is, y is efficient if there is no other feasible production vector y’ producing more output with the same amount of inputs (or alternatively, producing the same amount of output with fewer inputs).

47 Note that the isoprofit line (that firms use to choose the tangency point where the isoprofit line is tangent to the production set) has the same slope for firm 1 and 2 since both firms face the same market prices. Nonetheless, firm 2’s profits are higher than firm 1’s, since firm 2’s isoprofit line at y2 is further from the origin than firm 1’s isoprofit line when evaluated at y1.

48 A simple example is that of oligopoly markets where firms compete in quantities (a la Cournot). In particular, when every firm independently selects a profit-maximizing output level it does not take into account the effect that its additional production has on the units sold by its competitors. This leads every firm to overproduce, relative to the output level that maximizes joint profits (i.e., the output level that every firm would produce if they coordinated by forming a cartel).

Graphically, note that this definition of efficiency implies that if a production plan is efficient then it lies on the boundary of the production set Y, as the following figure illustrates. In particular, y is efficient, whereas y’ and y’’ are inefficient (y’ is inefficient because it uses the same amount of inputs as y, but produces less output. y’’ is inefficient because it produces the same output as y, but uses more inputs).

Figure #4.65

The converse argument (that every production plan lying on the boundary of the production set must be efficient) is not necessarily true, as the next figure shows. Specifically, production plan y’ –despite lying on the boundary of production set Y— is inefficient since it produces the same amount of output as y, but uses more inputs.

Figure #4.66

After defining efficient production plans, we can now present the first and second fundamental theorems of welfare economics (FTWE).

First FTWE: if a production plan y∈Y is profit maximizing for some price vector p>>0, then y must be efficient.

Proof. Let us prove the first FTWE by contradiction. Hence, suppose that production plan y∈Y is profit maximizing, i.e., py≥py’ for every y’∈Y, but y is not efficient. Then, there is another production plan y’∈Y such that y’≥y and y’≠y. Multiplying both sides by price vector p, we obtain py’>py, since p>>0. But then y cannot be profit maximizing (as the premise of this proof established). We have then reached a contradiction, proving the 1st FTWE. ■

Importantly, note that for this result we do not need the production set Y to be convex. The following two figures illustrate convex and non-convex production sets. In both cases production plan y is profit maximizing, which implies that it must lie on the boundary of the production set, for both convex and non-convex production sets.

Figure #4.67

Furthermore, note that when applied to the aggregate, the 1st FTWE says that if a collection of firms each independently maximizes profits with respect to the same price vector p>>0, then the aggregate production plan is socially efficient.

In addition, note that the assumption p>>0 on the price vector cannot be relaxed to p≥0. In order to see why, take a production set Y with an upper flat surface, such as that in the following figure. Any production plan y in the flat segment of the production set can be profit maximizing if prices are p=(0,1). Indeed, this price vector implies that the slope of the isoprofit line is zero. The firm hence can choose among a region of profit-maximizing production plans (where the isoprofit line and the production set are tangent to each other, as depicted in the figure). However, not all of these profit-maximizing production plans are efficient. Indeed, only the production plan y, lying exactly on the kink of the production set, is efficient, i.e., all other profit-maximizing production plans to the left of y are inefficient since they use more inputs than y in order to produce the same amount of output. Hence, in order to apply the 1st FTWE we need p>>0, i.e., the price vector must be strictly positive in all components.

Figure #4.68
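The role of the assumption p>>0 can be illustrated with a finite example. The production set below is a hypothetical list of six production plans (input, output), with the input entering as a negative number; it is not the set depicted in the figure. The sketch identifies the efficient plans and compares them with the profit-maximizing plans under a strictly positive price vector and under p=(0,1).

```python
# Hypothetical finite production set: each plan is (net input <= 0, output)
Y = [(0, 0), (-1, 1), (-2, 1), (-2, 2), (-3, 2), (-3, 1.5)]

def is_efficient(y, plans):
    """y is efficient if no other feasible plan is weakly larger in every component."""
    return not any(
        other != y and all(a >= b for a, b in zip(other, y)) for other in plans
    )

def profit_maximizers(p, plans):
    """All plans attaining the maximal value of p.y over the production set."""
    profits = [sum(pi * yi for pi, yi in zip(p, y)) for y in plans]
    best = max(profits)
    return [y for y, value in zip(plans, profits) if value == best]

print([y for y in Y if is_efficient(y, Y)])

# With strictly positive prices, every profit-maximizing plan is efficient
for p in [(1, 1), (0.5, 2), (3, 1)]:
    best = profit_maximizers(p, Y)
    print(p, best, all(is_efficient(y, Y) for y in best))

# With a zero input price, an inefficient plan can also be profit maximizing
best_zero = profit_maximizers((0, 1), Y)
print(best_zero, [is_efficient(y, Y) for y in best_zero])
```

Under p=(0,1) the plans (-2,2) and (-3,2) yield the same profits, but the second uses more input to produce the same output, so it is profit maximizing without being efficient.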

The 2nd FTWE states the converse of the 1st FTWE (i.e., if a production plan y is efficient, then it must be profit-maximizing). Note that the converse of the 1st FTWE is not necessarily true. The following figures illustrate that when the production set is convex, then every efficient production plan (lying on the boundary of the production set) must also be profit maximizing. When the production set is non-convex, however, the fact that a production plan is efficient does not imply that such plan maximizes the firm’s profits. This is evident in production plan y’ which lies on the boundary of the production set but is not profit-maximizing. Indeed, production plan y is the profit-maximizing vector.

Figure #4.69

The 2nd FTWE is therefore restricted to convex production sets. Specifically, the 2nd FTWE states that, if the production set Y is convex, then every efficient production plan y in Y is a profit-maximizing production plan, for some non-zero price vector p≥0.

In order to prove the 2nd FTWE, let us use the following steps. First, take an efficient production plan, such as y in the next figure. Let us now define the set of production plans that are strictly more efficient than y, that is, \( P_y = \{ y' \in \mathbb{R}^L : y' \gg y \} \). As the figure depicts, this set contains all production plans producing more than production plan y using the same inputs, and those producing the same output amount using fewer inputs. Furthermore, note that the boundaries of the set are not included, since we only consider production plans that are strictly more efficient than y, i.e., set Py is an open set. Since y is efficient, there exists no intersection point between set Py and the production set Y. In addition, note that set Py is a convex set, since any convex combination of any two production plans in Py lies within the set.

Figure #4.70

We can now apply the Separating Hyperplane Theorem. In particular, we can claim that there exists some price vector p≠0 such that py’≥py’’ for all production plans y’ in Py and y’’ in Y.49 Since this is true for all y’’ in Y, it must also be true for the efficient production plan y, which belongs to Y. Therefore, py’≥py for every production plan y’ that is more efficient than y, i.e., y’>>y. We can now take any production plan y’’ in Y, to obtain py’≥py’’ for all y’ in the set of “more efficient” production plans Py. Finally, since we can choose y’ to be arbitrarily close to the efficient production plan y, we have py≥py’’ for every production plan y’’ in Y. Therefore, production plan y must be profit-maximizing.

One interesting property of the 2nd FTWE is that we are not imposing that all prices must be strictly positive, i.e., p>>0, but only that all must be weakly positive, i.e., p≥0. Hence, we just assume that the price vector is not zero at every single component, i.e., p≠(0,0,…,0). Note that this implies that the slope of the isoprofit line can be zero (which occurs when the price of the input y1 is zero). The following figure illustrates this case. In particular, note that there is a set of profit-maximizing production plans (where the isoprofit line is tangent to the production set). However, there is only one efficient production plan, y, situated at the kink of the production set Y. According to the 2nd FTWE, such efficient production plan y must also be part of the set of profit-maximizing production plans, which holds in this case. Hence, the 2nd FTWE can be satisfied even if some input prices are zero.50

49 Note that production plan y’ is not technologically feasible since it lies outside production set Y.

50 Recall that, in contrast, the 1st FTWE does not necessarily hold if some input prices are zero, as described above.



Figure #4.71

Despite allowing for some input prices to be zero, the 2nd FTWE does not allow for input prices to be negative. Let us examine if the 2nd FTWE could still hold if the price of one input was negative. Let us hence consider the case in which the price of input l was negative, pl<0. We would then have that py’<py for some production plan y’ that is more efficient than y, i.e., y’>>y, when the difference y’l-yl is sufficiently large. Let us show why we can have py’<py with the following example.

Example. Consider a price vector p=(p1,p2)=(3,-5), and assume that the efficient production plan is y=(1,4) while a “more efficient” production plan (a production plan that is technologically unfeasible) is y’=(6,25), then

py=3*1+(-5)*4=3-20=-17, and

py’=3*6+(-5)*25=18-125=-107.

Hence, py’<py, i.e., the firm obtains a larger profit from production plan y than from a technologically unfeasible production plan y’. This contradicts the 2nd FTWE. Therefore, we need that the price vector satisfies p≥0.
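The arithmetic in this example can be reproduced with a few lines of Python. The sketch below is only an illustrative check: the helper name `profit` is ours (it does not appear in the notes), and the vectors are the ones given in the example.

```python
def profit(p, y):
    """Return the inner product p·y of a price vector and a production plan."""
    return sum(p_l * y_l for p_l, y_l in zip(p, y))


p = (3, -5)         # price vector from the example (second price is negative)
y = (1, 4)          # efficient production plan
y_prime = (6, 25)   # "more efficient" plan, y' >> y

# y' is strictly larger than y in every component
assert all(a > b for a, b in zip(y_prime, y))

print(profit(p, y))        # -17
print(profit(p, y_prime))  # -107
```

Because the second price is negative, the large second component of y’ drags its value below that of y, which is the failure of separation that the example describes.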



Chapter 5 – Competitive Markets

Competitive markets

In this chapter we bring consumer and producer theory together. In particular, we analyze competitive equilibrium allocations and compare them with Pareto optimal allocations. We then discuss the two fundamental theorems of welfare economics, measurements of welfare changes in the partial equilibrium analysis, and long-run considerations where firms are allowed to enter or exit the industry.1

Pareto optimality and competitive equilibrium

In this section we analyze the allocation of goods and inputs across the economy when consumers and firms interact in perfectly competitive markets, and compare such an allocation with that selected by a benevolent planner maximizing social welfare (also referred to as a Pareto optimal allocation). Let's start by describing Pareto optimal allocations. In this regard, we first need to define what we mean by an economic allocation in a society involving I consumers and J firms. Specifically, an economic allocation (x1, x2,…,xI,y1,y2,…,yJ) is a specification including:

1. A consumption vector xi∈Xi for every consumer i=1,2,…,I, where xi=(x1i,x2i,…,xLi) describes the amount that individual i consumes of each of the L goods, and

2. A production vector yj∈Yj for every firm j=1,2,…,J, where yj=(y1j,y2j,…,yLj) describes the amount that firm j produces of each of the L goods.

Given this definition of economic allocation, we say that an allocation is feasible if, for every good l, we have

$$\sum_{i=1}^{I} x_{li} \leq w_l + \sum_{j=1}^{J} y_{lj}$$

Intuitively, the allocation is feasible for good l if the total consumption of this good by all I consumers in the economy does not exceed the sum of the initial endowment of this good2 and the production of this good by all J firms. We are now ready to define Pareto optimal allocations. Specifically, a feasible allocation (x1, x2,…,xI,y1,y2,…,yJ) is Pareto optimal (or Pareto efficient) if there is no other feasible allocation (x’1, x’2,…,x’I,y’1,y’2,…,y’J) such that

ui(x’i)≥ui(xi) for all subjects i=1,2,…,I; and ui(x’i)>ui(xi) for some subject.

That is, there is no alternative way to arrange the distribution of goods among consumers and/or the production of goods such that some individual is made strictly better off with the alternative allocation, i.e., ui(x’i)>ui(xi), and no consumer is made worse off, i.e., ui(x’i) ≥ui(xi). Importantly, the definition of Pareto optimality implies a notion of efficiency in production (since there is no way to rearrange inputs in order to produce more output) and in consumption (since there is no alternative way to achieving a Pareto

1 In this chapter we follow MWG Chapter 10. For a good discussion of these topics, see Varian Ch. 13 and NS Ch. 12 (although none of them is as complete as MWG).

2 Note that the endowment of good l, wl, is in fact a vector describing the amount of good l that every consumer initially owns, i.e., wl=(wl1,wl2,…,wlI).



improvement). The above definition can be graphically represented using the utility possibility set (UPS). In particular, this set can be defined as

$$U = \left\{ (u_1, u_2) : \text{there is a feasible allocation } (x_1, x_2, y_1, y_2, \ldots, y_J) \text{ such that } u_i \leq u_i(x_i) \text{ for all subjects } i = 1, 2 \right\}$$

Figure #5.1

Intuitively, note that utility pairs on the frontier of the UPS are Pareto optimal. Indeed, for a utility pair such as (u1hat, u2hat) we cannot improve individual 1’s utility level without reducing that of individual 2. Two remarks are noteworthy. First, we do not need the UPS to be convex in order for points on the frontier to be Pareto optimal. Indeed, the above figure illustrates a nonconvex UPS, and yet utility pairs on the frontier are Pareto optimal.3 Second, it is important to distinguish efficiency (measured in the Pareto optimal sense) from equity. Indeed, an allocation where all resources in the economy are assigned to individual 2 (and none to individual 1), such as the one depicted on the vertical axis in the above figure, is extremely unequal and yet is Pareto optimal, since we cannot increase the utility level of one individual without decreasing that of the other.

Let us next describe a competitive equilibrium, CE (or Walrasian equilibrium) allocation. In particular, in this context we consider that consumers and firms interact in markets where their relative size is

3 Importantly, note that we only need that the frontier of the UPS doesn't have increasing segments. If it did, we would be able to increase both consumers’ utility levels by choosing utility pairs away from the frontier.



negligible. As a consequence, their individual purchasing or selling decisions do not affect market prices for outputs or inputs. Specifically, we say that an allocation (x*1, x*2,…,x*I,y*1,y*2,…,y*J) and a price vector $p^* \in \mathbb{R}^L$ constitute a CE if:

1. PMP: for each firm j, yj* solves

$$\max_{y_j \in Y_j} \; p^* \cdot y_j$$

2. UMP: for each consumer i, xi* solves

$$\max_{x_i \in X_i} \; u_i(x_i) \quad \text{s.t.} \quad p^* \cdot x_i \leq p^* \cdot w_i + \sum_{j=1}^{J} \theta_{ij} \, (p^* \cdot y_j^*)$$

3. Market clearing condition: for each good l,

$$\sum_{i=1}^{I} x_{li}^* = w_l + \sum_{j=1}^{J} y_{lj}^*$$

Intuitively, the first condition states that, when consumers and firms interact in a market, every firm individually solves its own PMP (as described in the chapter on production theory). Similarly, the second condition says that every consumer individually solves his UMP, given a budget constraint which slightly differs from that considered in previous chapters. In particular, consumer i must select a bundle xi for which its cost, p*xi, does not exceed the sum of the value of this individual’s initial endowment, p*wi, and the participation of this individual in the profits of all J firms, $\sum_{j=1}^{J} \theta_{ij} (p^* \cdot y_j^*)$. Finally, the market clearing condition states that, for every good l, the total consumption of this good by all consumers must be equal to the initial endowment of this good in the economy plus the total production of this good by the J firms. Hence, this condition implies that in equilibrium there can be no excess demand for a good (since otherwise some consumers would have incentives to offer higher prices for the good in order to obtain more units of it) nor excess supply of the good (since otherwise some firms would have incentives to offer the good at lower prices in order to sell more units of it). An interesting property (which follows from all budget constraints holding with equality) is that, if the market clearing condition is satisfied for all but one good, then it must be satisfied for that good as well. Furthermore, note that if an allocation (x*1, x*2,…,x*I,y*1,y*2,…,y*J) and a price vector p* constitute a CE, then this allocation and price vector αp* (for any α>0) must also constitute a CE. Hence we can normalize prices, keeping the same equilibrium allocation. Finally, we will assume that market prices are all positive, since otherwise the consumer would demand infinite amounts.

Partial equilibrium competitive analysis

In this section we analyze competitive allocations. Let us start by analyzing the behavior of firms. For a given price vector p*, every firm j’s equilibrium output level qj* must solve the PMP


$$\max_{q_j \geq 0} \; p^* q_j - c_j(q_j)$$

which has the necessary and sufficient condition



$$p^* \leq c_j'(q_j^*), \quad \text{with equality if } q_j^* > 0$$

which, in the case of interior solutions, states that every firm j operating in a perfectly competitive market increases output until the point at which the marginal cost of producing such output equals the market price, as described in the previous chapter.

Let’s now turn to the consumer. For simplicity, we consider that every consumer in the economy has a quasilinear utility function $u_i(m_i, x_i) = m_i + \phi_i(x_i)$, where mi denotes the numeraire and $\phi_i'(x_i) > 0$ but $\phi_i''(x_i) < 0$ for all xi>0, i.e., the consumer obtains a positive but diminishing marginal utility from additional units of good xi. In addition, we consider that every individual obtains zero utility from good xi when consuming zero units of it, i.e., $\phi_i(0) = 0$.4 Therefore, consumer i’s UMP is


$$\max_{m_i \in \mathbb{R}, \; x_i \in \mathbb{R}_+} \; m_i + \phi_i(x_i) \quad \text{s.t.} \quad m_i + p^* x_i \leq w_{mi} + \sum_{j=1}^{J} \theta_{ij} \left( p^* q_j^* - c_j(q_j^*) \right)$$

Since the budget constraint must hold with equality (i.e., Walras’ law holds), we have

$$m_i = -p^* x_i + \left[ w_{mi} + \sum_{j=1}^{J} \theta_{ij} \left( p^* q_j^* - c_j(q_j^*) \right) \right]$$

and plugging the budget constraint into the objective function we can rewrite the UMP as

$$\max_{x_i \in \mathbb{R}_+} \; \phi_i(x_i) - p^* x_i + \left[ w_{mi} + \sum_{j=1}^{J} \theta_{ij} \left( p^* q_j^* - c_j(q_j^*) \right) \right]$$

where now the only choice variable for consumer i is good xi. Taking first order conditions with respect to xi we obtain

$$\phi_i'(x_i^*) \leq p^*, \quad \text{with equality if } x_i^* > 0$$

which intuitively states that a consumer increases the amount bought of good xi until the point at which the marginal utility he obtains from consuming further units of the good exactly coincides with the market price he has to pay for them.

Summarizing, an allocation (x*1, x*2,…,x*I,y*1,y*2,…,y*J) and a price vector p*∈RL constitute a CE if:

4 Recall that with quasilinear utility functions, wealth effects for all non-numeraire commodities (such as xi) are zero. Our model examines, for instance, the consumption of a good xi that represents a small share of all monthly expenses for consumers, since in that case wealth effects are negligible.



$$p^* \leq c_j'(q_j^*), \quad \text{with equality if } q_j^* > 0$$

$$\phi_i'(x_i^*) \leq p^*, \quad \text{with equality if } x_i^* > 0$$

$$\sum_{i=1}^{I} x_i^* = \sum_{j=1}^{J} q_j^*$$

Note that the previous conditions do not depend upon the consumer’s initial endowment.5 We next provide a graphical illustration of the above conditions. The following figure represents consumer i’s demand for good xi. In particular, note that for prices above $\phi_i'(0)$, the consumer’s marginal utility from purchasing the first unit of the good is lower than the market price p, leading him to buy zero units of good xi. For prices below this cutoff, the consumer purchases a positive amount of the good, increasing xi until the point at which the marginal utility from the last unit coincides with the going market price.6

Figure #5.2

We can now horizontally sum individual demands in order to obtain the aggregate demand for good x, as the following figure illustrates. Interestingly, we can identify three segments of aggregate demand x(p). First, when the market price is above $\max_i \phi_i'(0)$, no individual demands a positive amount of good x, implying that aggregate demand is also zero. Intuitively, in this range of (high) market prices the marginal utility that every consumer obtains from buying the first unit of the good is still lower than the current market price, and hence no positive units are demanded. For intermediate prices, however, individual 2 in the figure demands positive amounts of good x (his marginal utility from the first unit exceeds the market price), but individual 1 does not. As a result, aggregate demand coincides with individual 2’s demand for this range of prices. Finally, when market prices are sufficiently low, aggregate demand reflects the horizontal sum of all individuals’ demand curves.
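The horizontal summation can be sketched numerically. The example below is only illustrative: it assumes a quadratic form $\phi_i(x) = a_i x - b_i x^2/2$ (so that $\phi_i'(x) = a_i - b_i x$ on the relevant range), and the parameter values and function names are hypothetical, not taken from the notes.

```python
def individual_demand(p, a, b):
    """Demand implied by phi'(x) <= p (with equality if x > 0) when phi'(x) = a - b*x."""
    return max(0.0, (a - p) / b)


def aggregate_demand(p, consumers):
    """Horizontal sum of individual demands at price p."""
    return sum(individual_demand(p, a, b) for a, b in consumers)


# Hypothetical parameters (a_i, b_i); consumer 2 has the higher intercept phi_2'(0) = 10
consumers = [(4.0, 1.0), (10.0, 2.0)]

for price in (12.0, 6.0, 2.0):
    print(price, aggregate_demand(price, consumers))
# 12.0 0.0  -> price above max_i phi_i'(0): nobody buys
# 6.0 2.0   -> only consumer 2 buys
# 2.0 6.0   -> both consumers buy
```

The three printed prices fall in the three segments of aggregate demand described above.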

5 Note that this result arises from quasilinearity. Indeed, an increase in the initial endowment raises consumer i’s initial wealth. This helps him increase the amount consumed of all other goods, but leaves his demand of good xi unaffected, i.e., no wealth effects.

6 Importantly, note that inverting $\phi_i'(x_i)$ we can obtain this consumer’s Walrasian demand xi(p).



Figure #5.3

Let us now examine the firm’s supply curve. The following figure represents the supply curve for an individual firm j. Note that when market prices are sufficiently low, i.e., p<c’j(0), firm j’s marginal cost of producing the first unit is higher than current market prices, leading the firm to supply zero units of the good. When market prices, however, are above that cutoff, the firm increases production until the point at which the marginal cost of such level of output exactly coincides with the market price the firm obtains from selling those units in the market, i.e., p=c’j(qj), as described in previous chapters.7

Figure #5.4

7 An interesting question at this point is whether the firm’s supply curve could look like the one that we examined in the chapter on production theory, where the firm produces positive amounts for prices above the minimum of the average cost curve. Note that in this chapter we assume convex total costs. When the firm incurs no fixed costs, the corresponding marginal cost curve starts at the origin (and coincides with the firm’s supply curve). In the case in which the firm incurs fixed (nonsunk) setup costs, its marginal cost curve also starts at the origin, but the firm’s supply curve has a vertical segment along the vertical axis (zero output) for prices below the minimum of the average cost curve, coinciding with the firm’s marginal cost curve otherwise.



Aggregate supply can be obtained by horizontally summing individual supply curves. Similarly as in the case of aggregate demand, we can identify three regions in the aggregate supply curve q(p), as the following figure reflects. First, when market prices are below the marginal cost of producing the first unit for the most efficient firm (the firm with the lowest marginal cost of production, firm 2 in the figure), then no firm chooses to supply positive units to the market, and aggregate supply is zero. When market prices are intermediate, only the most efficient firm finds it profitable to supply positive units of good x, and aggregate supply coincides with individual supply for the most efficient firm (firm 2 in the figure). Finally, when market prices are sufficiently high, both firms supply positive units and, as a consequence, aggregate supply is the horizontal sum of the individual supplies of firms 1 and 2.

Figure #5.5

We can now combine aggregate demand and aggregate supply in a single figure in order to obtain the competitive equilibrium allocation of good x. First, note that in order to guarantee that a competitive equilibrium exists (i.e., aggregate demand crosses aggregate supply in the figure), we need that $\max_i \phi_i'(0) \geq p^* \geq \min_j c_j'(0)$. Graphically, note that this condition states that the vertical intercept of the aggregate demand curve lies above that of the aggregate supply curve.



Figure #5.6

Note that if, instead, $\max_i \phi_i'(0) < \min_j c_j'(0)$ holds, we cannot guarantee that there is a positive production or consumption of good x, as the following figure illustrates. Intuitively, this condition indicates that the willingness to pay of the consumer most interested in the good is still lower than the marginal cost of production for the most efficient firm. As a consequence, there is no room for a profitable exchange, and no units of the good are produced or consumed.

Figure #5.7

Additionally, since the marginal utility $\phi_i'(x_i)$ is downward sloping for every consumer, i.e., $\phi_i''(x_i) < 0$ for all i, and the marginal cost c’j(qj) is upward sloping in output for every firm j, i.e., c’’j(qj)>0 for all j, then aggregate demand and supply cross at a unique point, and therefore the CE allocation is unique.
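Because excess demand is decreasing in price under these assumptions, the unique crossing point can be located by bisection. The sketch below assumes linear marginal utilities $\phi_i'(x) = a_i - b_i x$ and linear marginal costs $c_j'(q) = c_j + d_j q$; these functional forms, the parameter values, and the function names are hypothetical and only serve as an illustration.

```python
def demand(p, consumers):
    """Aggregate demand when phi_i'(x) = a - b*x for each (a, b) in consumers."""
    return sum(max(0.0, (a - p) / b) for a, b in consumers)


def supply(p, firms):
    """Aggregate supply when c_j'(q) = c + d*q for each (c, d) in firms."""
    return sum(max(0.0, (p - c) / d) for c, d in firms)


def competitive_price(consumers, firms, tol=1e-10):
    """Bisect on excess demand; return None when no trade is possible."""
    lo = min(c for c, _ in firms)       # at or below this price supply is zero
    hi = max(a for a, _ in consumers)   # at or above this price demand is zero
    if hi <= lo:
        return None                     # max_i phi_i'(0) <= min_j c_j'(0)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if demand(mid, consumers) > supply(mid, firms):
            lo = mid                    # excess demand: price must rise
        else:
            hi = mid                    # excess supply: price must fall
    return (lo + hi) / 2


consumers = [(4.0, 1.0), (10.0, 2.0)]   # hypothetical (a_i, b_i)
firms = [(1.0, 1.0), (2.0, 2.0)]        # hypothetical (c_j, d_j)

p_star = competitive_price(consumers, firms)
print(round(p_star, 4), round(demand(p_star, consumers), 4))   # 3.6667 3.5
```

In this example both consumers and both firms are active at the equilibrium price of 11/3, and 3.5 units are traded.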



Finally, note that we can understand the inverse of the aggregate supply function as the industry’s marginal cost function. In particular, taking any given output qbar on the horizontal axis, we can map it through the aggregate supply curve onto the vertical axis. Then, the inverse q^{-1}(qbar) can be viewed as the industry marginal cost of production.

Figure #5.8

We can similarly understand the inverse of the aggregate demand function as the marginal social benefit function. Specifically, take any consumption level xbar on the horizontal axis and map it through the aggregate demand curve onto the vertical axis. Then, the inverse of the aggregate demand curve, x^{-1}(xbar), which is also referred to as p(xbar), represents the marginal social benefit of xbar units of consumption.

Figure #5.9

Therefore, at the CE output, the aggregate marginal cost of producing such level of output coincides with the marginal social benefit that all consumers obtain from consuming it.

Comparative statics



In this section we examine how the competitive equilibrium output and prices are affected by changes in the parameters of the model. Specifically, let's assume that consumers’ preferences are affected by a vector of parameters $\alpha \in \mathbb{R}^M$, where M≤L.8 Then, consumer i’s utility from good x becomes $\phi_i(x_i, \alpha)$. Similarly, firms’ technology is affected by a vector of parameters $\beta \in \mathbb{R}^S$, where S≤L. Then, firm j’s cost function becomes cj(qj,β). When a tax t is introduced, we will use pihat(p,t) to denote the effective price paid by consumer i and pjhat(p,t) to denote the effective price received by firm j.9 If consumption and production are strictly positive in the CE, then the following conditions must hold

$$\phi_i'(x_i^*, \alpha) = \hat{p}_i(p^*, t) \quad \text{for every consumer } i$$

$$c_j'(q_j^*, \beta) = \hat{p}_j(p^*, t) \quad \text{for every firm } j$$

$$\sum_{i=1}^{I} x_i^* = \sum_{j=1}^{J} q_j^*$$

We then have I+J+1 equations, which depend on parameter values α, β and t. In order to understand how optimal consumption bundles xi* and profit-maximizing production plans qj* depend on parameters α and β, we can use the Implicit Function Theorem as long as the functions are differentiable.

Remark, Implicit Function Theorem: Let u(x,y) be a utility function, where x and y are amounts of two goods.

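If $\frac{\partial u(x, y)}{\partial y} \neq 0$ when evaluated at $(\bar{x}, \bar{y})$, then

$$\frac{dy}{dx}(\bar{x}, \bar{y}) = - \frac{\partial u(\bar{x}, \bar{y}) / \partial x}{\partial u(\bar{x}, \bar{y}) / \partial y}$$

Similarly, if $\frac{\partial u(x, y)}{\partial x} \neq 0$ when evaluated at $(\bar{x}, \bar{y})$, then

$$\frac{dx}{dy}(\bar{x}, \bar{y}) = - \frac{\partial u(\bar{x}, \bar{y}) / \partial y}{\partial u(\bar{x}, \bar{y}) / \partial x}$$

These expressions hold for all (x, y) at which the corresponding partial derivative is nonzero.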

Similarly, if the utility function describes the consumption of a single good x, u(x,α), where α determines the consumer’s preference for x, and $\frac{\partial u(x, \alpha)}{\partial x} \neq 0$, then,



8 This implies that there are fewer parameters than goods. In most economic applications this is normally the case, where only a few parameters are modified simultaneously.

9 Hence, in order to denote a per unit tax (charged on every unit sold), we use pihat(p,t)=p+t, where the consumer’s total expenditure on that good thus becomes pq+tq, whereas to denote an ad valorem tax (i.e., a sales tax) we use pihat(p,t)=p+pt=p(1+t), where the consumer’s total expenditure on that good becomes pq+tpq=p(1+t)q.



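$$\underbrace{\frac{dx(\alpha)}{d\alpha}}_{\text{unknown: we would have to solve the entire UMP}} = - \underbrace{\frac{\partial u(x, \alpha) / \partial \alpha}{\partial u(x, \alpha) / \partial x}}_{\text{easy to solve}}$$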




For a more detailed description of the Implicit Function Theorem with applications to economics, see Simon and Blume, pp. 339-341. ■
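As a quick illustration of the first expression in the remark, consider the hypothetical utility function u(x,y)=xy (our choice, for illustration only) and the indifference curve passing through the point (2,3):

```latex
\begin{align*}
u(x, y) &= x y, \qquad \frac{\partial u}{\partial x} = y, \qquad \frac{\partial u}{\partial y} = x \neq 0 \text{ at } (2, 3) \\
\frac{dy}{dx}(2, 3) &= - \frac{\partial u / \partial x}{\partial u / \partial y} = - \frac{y}{x} = - \frac{3}{2}
\end{align*}
```

The same slope is obtained by solving the indifference curve explicitly: xy=6 gives y=6/x, so dy/dx=-6/x^2, which equals -3/2 at x=2.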

Example: Introducing a sales tax. The expression of the aggregate demand now becomes x(p+t) since the effective price that the consumer pays is actually p+t, i.e., the sales tax is equivalent to an increase in the price paid by consumers. In equilibrium, the market price after imposing the tax, p*(t), must satisfy

$$x(p^*(t) + t) = q(p^*(t))$$


Hence, if the sales tax is marginally increased (and functions are differentiable at p=p*(t)), we obtain

$$x'(p^*(t) + t) \cdot \left[ p^{*\prime}(t) + 1 \right] = q'(p^*(t)) \cdot p^{*\prime}(t)$$

$$p^{*\prime}(t) \cdot \left[ x'(p^*(t) + t) - q'(p^*(t)) \right] = - x'(p^*(t) + t)$$

$$p^{*\prime}(t) = - \frac{x'(p^*(t) + t)}{x'(p^*(t) + t) - q'(p^*(t))}$$

Since the aggregate demand function x(p) is decreasing in prices and the aggregate supply function q(p) is increasing in prices, then x’(p*(t)+t)<0<q’(p*(t)), and

$$p^{*\prime}(t) = - \frac{x'(p^*(t) + t)}{x'(p^*(t) + t) - q'(p^*(t))} = - \frac{(-)}{(-) - (+)} = - \frac{(-)}{(-)} = (-)$$

Moreover, the above ratio is (weakly) larger than -1, which implies that p*’(t) lies in the interval [-1,0). Therefore, we can conclude that the equilibrium price p*(t) decreases in t, i.e., the price received by producers falls in the tax. Additionally, since p*(t)+t is the price paid by consumers, then p*’(t)+1 is the marginal increase in the price paid by consumers when the tax is marginally increased. Since p*’(t)≥-1, then p*’(t)+1≥0, and consumers’ cost of the product (weakly) rises. The following figure summarizes the effect of imposing a tax on competitive equilibrium price and quantity. Before the introduction of the tax, the CE occurs at p*(0) and x*(0), where the aggregate demand x(p) and aggregate supply q(p) cross each other. The imposition of the tax shifts the aggregate demand curve from x(p) to x(p+t), without affecting the supply curve, q(p). (Note that the vertical distance between these two curves is equal to the tax, t, at any output level.) This implies that the new CE after the introduction of the tax occurs at a lower output level, decreasing output from x*(0) to x*(t). Regarding prices, note that consumers pay p*(t)+t after the imposition of the tax, rather than the price p*(0) they paid before the tax was introduced, while producers receive a price p*(t) for each of the x*(t) units they sell after the tax, rather than the price p*(0) they received per unit before the tax.
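The expression for p*’(t) can be checked on a simple case. The sketch below assumes linear aggregate demand x(p)=a-bp and linear aggregate supply q(p)=dp; these functional forms, the parameter values, and the function names are hypothetical and are not part of the notes.

```python
a, b, d = 10.0, 1.0, 3.0   # hypothetical parameters: x(p) = a - b*p, q(p) = d*p


def producer_price(t):
    """Price p*(t) received by producers, solving x(p + t) = q(p)."""
    return (a - b * t) / (b + d)


def slope_formula():
    """p*'(t) = -x'/(x' - q') with x' = -b and q' = d."""
    x_prime, q_prime = -b, d
    return -x_prime / (x_prime - q_prime)


dt = 1.0
finite_difference = (producer_price(dt) - producer_price(0.0)) / dt

print(slope_formula())        # -0.25
print(finite_difference)      # -0.25 (exact here because p*(t) is linear in t)
print(1 + slope_formula())    # 0.75: change in the consumer price p*(t) + t
```

With these numbers, a $1 tax lowers the producer price by $0.25 and raises the consumer price by $0.75, so p*’(t) indeed lies between -1 and 0.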



Figure #5.10

At this point, we can easily examine the case in which the supply curve is very responsive to price, i.e., q’(p*(t)) is large. In this case,

$$p^{*\prime}(t) = - \frac{x'(p^*(t) + t)}{\underbrace{x'(p^*(t) + t) - q'(p^*(t))}_{\text{huge and negative}}} \to 0$$

Therefore, p*’(t)→0, and the price received by producers does not fall after the introduction of the tax, i.e., p*(t) remains at p*(0), as depicted in the following figure. However, consumers still have to pay p*(t)+t, which rises with the tax at a rate p*’(t)+1=0+1=1. That is, the tax is mainly borne by consumers. Indeed, as the figure illustrates, the price paid by consumers increases by the full amount of the tax.

Figure #5.11

If, in contrast, supply is not responsive to price changes, i.e., if q’(p*(t)) is close to zero, then



$$p^{*\prime}(t) = - \frac{x'(p^*(t) + t)}{x'(p^*(t) + t) - q'(p^*(t))} \approx - \frac{x'(p^*(t) + t)}{x'(p^*(t) + t)} = -1$$

Therefore, p*’(t) → -1, and the price received by producers falls by $1 for every extra dollar in taxes, i.e., producers bear most of the tax burden. In contrast, consumers pay p*(t)+t, which changes with the tax at a rate p*’(t)+1=-1+1=0. That is, consumers do not bear the tax burden. This is illustrated in the following figure, where consumers’ cost of the good does not increase, from p*(0) before the tax to p*(t)+t=p*(0) after the tax, whereas the price received by producers falls by $1 for every extra dollar in taxes, i.e., from p*(0) before the tax to p*(t) after the tax.

Figure #5.12

A remark on nonconvex cost functions. In all our previous discussion we considered that firms’ cost function is convex. Let us examine the effect of considering cost functions with concave segments, as in the example in the following figure. In the figure, aggregate demand x(p) is decreasing in prices, while aggregate supply is not weakly increasing in prices, since the cost function is nonconvex. Then, aggregate supply is represented by two intervals (shaded segments of q(p) in the figure). Intuitively, for relatively high prices, firms prefer to supply more units than less (and hence select the region of the q(p) curve in which, for the same price level, firms produce the largest output). Because of the specific pattern of the firms’ nonconvex cost function, we might have that no crossing point exists between aggregate demand and supply, and no CE exists.10

10 Note that alternatively, more than one crossing point can occur if firms’ cost function is nonconvex. In that case, we could observe that the same equilibrium price level is associated to different equilibrium output.



Figure #5.13

Tax incidence

In this subsection we use N&S’s approach to tax incidence. In order to discuss the effects of a per-unit tax, t, we need to distinguish between the price paid by consumers (pd) and the price received by sellers (ps), where pd=ps+t, or alternatively, ps=pd-t. As a result, note that the wedge between both prices is t=pd-ps. When examining the effect of a small increase in the tax, we have

$$dt = dp_d - dp_s$$

and since we must maintain the market clearing condition in equilibrium, dQd=dQs, or $D_p \, dp_d = S_p \, dp_s$, where Dp and Sp denote the derivatives of demand and supply with respect to price. Substituting the above expression of a marginal change in the tax rate into this condition, we obtain

$$D_p \, dp_d = S_p \, dp_s = S_p \, (dp_d - dt)$$

where the last equality originates from the fact that ps=pd-t. Hence,

$$D_p \, dp_d = S_p \, dp_d - S_p \, dt, \quad \text{or} \quad S_p \, dt = (S_p - D_p) \, dp_d$$

We can now solve for the effect of the tax on the price paid by consumers, pd, obtaining



$$\frac{dp_d}{dt} = \frac{S_p}{S_p - D_p} \cdot \frac{p/q}{p/q} = \frac{e_S}{e_S - e_D} > 0$$

where the expression on the right-hand side is obtained by multiplying the numerator and denominator by p/q. We can similarly solve for the effect of the tax on the price received by suppliers, ps, obtaining





$$\frac{dp_s}{dt} = \frac{D_p}{S_p - D_p} \cdot \frac{p/q}{p/q} = \frac{e_D}{e_S - e_D} < 0$$

Since the price-elasticity of demand is negative, eD≤0, but that of supply is positive, eS≥0, we obtain that dpd/dt>0 while dps/dt<0. Intuitively, an increase in the tax increases the price that consumers have to pay for the good and decreases what producers receive for the good, expanding the wedge between both prices. In the extreme case in which demand is perfectly inelastic, i.e., eD=0, the per-unit tax is completely borne by consumers (note that eD=0 implies dpd/dt=1, reflecting that a $1 increase in the tax produces a $1 increase in the price paid by consumers). In contrast, when demand is perfectly elastic, i.e., eD=-∞, the per-unit tax is completely borne by producers. In particular, eD=-∞ implies that dpd/dt approaches zero while dps/dt approaches -1.
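The two incidence expressions are easy to evaluate for given elasticities. The following sketch is illustrative only; the function name `tax_incidence` and the elasticity values are hypothetical.

```python
def tax_incidence(e_d, e_s):
    """Return (dpd/dt, dps/dt) given the demand elasticity e_d <= 0 and supply elasticity e_s >= 0."""
    denominator = e_s - e_d
    if denominator == 0:
        raise ValueError("at least one elasticity must be nonzero")
    return e_s / denominator, e_d / denominator


print(tax_incidence(-0.5, 1.5))   # (0.75, -0.25)
print(tax_incidence(0.0, 1.5))    # (1.0, 0.0): perfectly inelastic demand
print(tax_incidence(-1e9, 1.5))   # close to (0, -1): very elastic demand
```

In every case the two derivatives differ by exactly one, reflecting that the wedge pd-ps must grow one-for-one with the tax.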

The above discussion illustrates that the actor (consumer or producer) with the less elastic response bears most of the price change caused by the tax. Indeed, if we divide the above two expressions, we obtain

$$- \frac{dp_s / dt}{dp_d / dt} = - \frac{e_D}{e_S}$$

Finally, note that the introduction of a tax reduces consumer surplus by an amount pdFEp* in the following figure. (Note that, out of this area, region pdFHp* represents the money transferred to the government in the form of tax revenue.) Similarly, the tax reduces producer surplus by an amount p*EGps, where p*HGps denotes the money transferred by producers to the government in the form of tax revenue. Hence, the “net” loss in CS and the “net” loss in PS illustrate the welfare that this economy loses as a result of the tax, after taking into account the welfare that is merely transferred from either of the agents to the government.11 This is usually referred to as the deadweight loss of the tax, and is represented in the figure by area FEG.12
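These areas can be computed explicitly in a linear example. The sketch below assumes demand Q=a-b·pd and supply Q=d·ps; the functional forms, parameter values, and the function name `tax_welfare` are hypothetical and chosen only for illustration.

```python
def tax_welfare(t, a=10.0, b=1.0, d=3.0):
    """Welfare effects of a per-unit tax t with demand Q = a - b*pd and supply Q = d*ps."""
    p0 = a / (b + d)              # pre-tax equilibrium price p*
    q0 = d * p0                   # pre-tax quantity
    ps = (a - b * t) / (b + d)    # price received by producers after the tax
    pd = ps + t                   # price paid by consumers after the tax
    qt = d * ps                   # post-tax quantity

    cs_loss = (pd - p0) * qt + 0.5 * (pd - p0) * (q0 - qt)   # area pdFEp*
    ps_loss = (p0 - ps) * qt + 0.5 * (p0 - ps) * (q0 - qt)   # area p*EGps
    revenue = t * qt                                          # area pdFGps
    deadweight = cs_loss + ps_loss - revenue                  # area FEG
    return cs_loss, ps_loss, revenue, deadweight


print(tax_welfare(1.0))   # (5.34375, 1.78125, 6.75, 0.375)
```

With a $1 tax in this example, the combined loss in consumer and producer surplus (7.125) exceeds tax revenue (6.75) by the deadweight loss of 0.375, which equals the triangle 0.5·t·(Q0-Qt).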

11 Total tax revenue is therefore represented by area pdFGps.

12 MWG present a more formal example of the deadweight loss of taxation (see example 10.E.1).



Figure #5.14

Mathematical model of supply and demand

Suppose that the demand function is represented by a function QD=D(p,α) that depends upon market prices (negatively) and on a parameter α that shifts the demand curve, i.e., dD/dα=Dα can have any sign (e.g., positive for a transfer, negative for a tax). Similarly, the supply relationship can be expressed by a function QS=S(p,β) that depends on market prices (positively) and on a parameter β that shifts the supply curve, i.e., dS/dβ=Sβ can have any sign. Equilibrium requires that market demand equals market supply, so QD=QS. In order to arrive at the comparative statics of this model, we need to totally differentiate the supply and demand functions,

$$Q_D = D(P, \alpha) \;\rightarrow\; dQ_D = D_P \, dP + D_\alpha \, d\alpha$$

$$Q_S = S(P, \beta) \;\rightarrow\; dQ_S = S_P \, dP + S_\beta \, d\beta$$

Since the market must still be in equilibrium, we must have that the change in demand is offset by the change in supply, or dQD=dQS. For simplicity consider that the demand parameter α changes while the supply parameter β remains constant. The equilibrium condition hence requires that






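$$D_P \, dP + D_\alpha \, d\alpha = S_P \, dP \quad \Rightarrow \quad \frac{\partial P}{\partial \alpha} = \frac{D_\alpha}{\underbrace{S_P}_{+} - \underbrace{D_P}_{-}}$$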

And since Sp-Dp>0, the derivative dp/dα will have the same sign as Dα. For instance, a fad making certain clothing fashionable, i.e., Dα>0, implies that equilibrium price increases in α.

If, in contrast, the supply parameter β changes while the demand parameter α remains constant, we have that the equilibrium condition requires

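$$D_P \, dP + D_\alpha \, d\alpha = S_P \, dP + S_\beta \, d\beta, \quad \text{with } d\alpha = 0$$

$$\Rightarrow \quad - S_\beta \, d\beta = \left[ S_P - D_P \right] dP$$

$$\Rightarrow \quad \frac{dP}{d\beta} = - \frac{S_\beta}{\underbrace{S_P}_{+} - \underbrace{D_P}_{-}}$$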

Therefore, the derivative dp/dβ will have the opposite sign of Sβ. For instance, the introduction of a new technology that reduces firms’ costs implies that Sβ>0 since aggregate supply is positively affected by this technology. Hence, dp/dβ<0 implying that the introduction of this technology reduces market prices.
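Both comparative statics results can be verified on a linear example. The sketch below assumes demand D(p,α)=a-bp+α and supply S(p,β)=dp+β, so that Dα=Sβ=1; the functional forms, parameter values, and names are hypothetical.

```python
a, b, d = 10.0, 1.0, 3.0   # hypothetical parameters


def equilibrium_price(alpha=0.0, beta=0.0):
    """Price equating demand a - b*p + alpha with supply d*p + beta."""
    return (a + alpha - beta) / (b + d)


# Partial derivatives of the assumed demand and supply functions
D_p, D_alpha = -b, 1.0
S_p, S_beta = d, 1.0

dp_dalpha = D_alpha / (S_p - D_p)     # formula from the text
dp_dbeta = -S_beta / (S_p - D_p)      # formula from the text

h = 1.0  # finite step; exact here because the equilibrium price is linear in alpha and beta
print(dp_dalpha, equilibrium_price(alpha=h) - equilibrium_price())   # 0.25 0.25
print(dp_dbeta, equilibrium_price(beta=h) - equilibrium_price())     # -0.25 -0.25
```

The outward shift in demand raises the equilibrium price, while the outward shift in supply lowers it by the same amount, as the sign analysis predicts.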

We can easily convert all our previous analysis into elasticities. Indeed, multiplying by α/P on both sides of the expression of ∂P/∂α, we obtain

$$e_{P,\alpha} = \frac{\partial P}{\partial \alpha} \cdot \frac{\alpha}{P} = \frac{D_\alpha}{S_P - D_P} \cdot \frac{\alpha}{P}$$

Dividing the numerator and denominator by Q, we have

$$e_{P,\alpha} = \frac{D_\alpha \, (\alpha / Q)}{(S_P - D_P)(P / Q)} = \frac{e_{D,\alpha}}{e_{S,P} - e_{D,P}}$$

which states that a 1% increase in the demand parameter α produces an eP,α percent change in the equilibrium price. A similar analysis can be extended to the supply parameter β.

Fundamental Welfare Theorems

In this section we relate competitive equilibrium allocations with those chosen by the social planner as being Pareto optimal. Before setting a formal comparison, let us emphasize some interesting properties of consumers’ quasilinear demand for good x. Recall that when preferences are quasilinear,

$$u_i(x_i) = m_i + \phi_i(x_i)$$



Therefore, for a given allocation (x1bar, x2bar, q1bar, q2bar),


$$u_1(\bar{x}_1) + u_2(\bar{x}_2) = \underbrace{m_1 + m_2}_{w_m - \sum_{j=1}^{J} c_j(\bar{q}_j)} + \phi_1(\bar{x}_1) + \phi_2(\bar{x}_2)$$

We can therefore define the utility possibility frontier as the set of all those (u1,u2) pairs for which

$u_1 + u_2 \le u_1(\bar{x}_1) + u_2(\bar{x}_2)$, or in other words

$$u_1 + u_2 \le w_m - \sum_{j=1}^{J} c_j(\bar{q}_j) + \phi_1(\bar{x}_1) + \phi_2(\bar{x}_2)$$

Importantly, note that the right-hand side of the above inequality depends neither on u1 nor on u2. As a consequence, the utility possibility frontier is a straight line with slope -1, and changes in the endowment wm, in the output, or in the amount of consumption (x1bar, x2bar) shift the entire utility frontier upward or downward, without altering its slope. The following figure depicts two different utility possibility frontiers: one in which consumption and production are given by x1bar, x2bar and qjbar, and another one in which they are given by amounts x1*, x2* and qj*.

Figure #5.15

Pareto optimal allocations. Let us now examine Pareto optimal allocations. In particular a benevolent planner chooses the optimal consumption vector (x1,x2,…,xI)≥0 and production vector (y1,y2,…,yJ)≥0 such that

$$\begin{aligned}\max\quad & w_m - \sum_{j=1}^{J} c_j(q_j) + \sum_{i=1}^{I}\phi_i(x_i)\\ \text{s.t.}\quad & \sum_{i=1}^{I} x_i = \sum_{j=1}^{J} q_j\end{aligned}$$

Intuitively, the above maximization problem states that the social planner wants to maximize aggregate surplus (i.e., the sum of all individuals' utility functions less total production costs) subject to the market clearing condition (stating, as usual, that aggregate consumption must be equal to aggregate production). Taking first order conditions with respect to xi and qj we obtain

$$\begin{aligned}\mu &\le c_j'(q_j^*) && \text{with equality if } q_j^* > 0\\ \phi_i'(x_i^*) &\le \mu && \text{with equality if } x_i^* > 0\\ \sum_{i=1}^{I} x_i^* &= \sum_{j=1}^{J} q_j^*\end{aligned}$$

These first order conditions probably look familiar to you. Indeed, they coincide with the first order conditions for competitive equilibrium allocations for the specific case in which the Lagrange multiplier μ exactly coincides with the vector of market prices p*. Intuitively, this implies that the equilibrium price is equal to the shadow price of good l.13 We can now state the first connection between competitive equilibrium and Pareto optimal allocations.

1st FTWE: If price p* and allocation (x1*, x2*, …,xI*, y1*, y2*,…, yJ*) constitute a CE, then this allocation is also PO.

This result, despite being applicable in many cases, crucially depends on two conditions. First, market participants (consumers and firms) must be price takers; otherwise, we would have monopsony, monopoly, or other forms of market power. Second, markets must be complete, that is, there must be a market for every relevant commodity.14

The 2nd FTWE examines under which conditions we can state the converse of the 1st FTWE, as follows.

2nd FTWE. For any PO utility levels (u1*,u2*, …, uI*) there are transfers of the numeraire commodity (T1,T2,…,TI) satisfying $\sum_{i=1}^{I} T_i = 0$ (i.e., transfers that redistribute the fixed amount of the numeraire commodity among all individuals) such that a competitive equilibrium reached from the endowments $(w_{m1}+T_1, \dots, w_{mI}+T_I)$ yields precisely the PO utility levels (u1*,u2*, …, uI*).

That is, the 2nd FTWE states that a particular PO allocation in which individuals achieve utility levels (u1*,u2*, …, uI*) can be implemented by a central authority that transfers money among consumers and then “allows the market to work”, i.e., allows every individual to choose his/her optimal consumption bundle given his/her new wealth level wmi+Ti. The CE resulting from such a new initial state will induce PO utility levels (u1*,u2*, …, uI*). A natural question at this point is whether the 2nd FTWE tells us that

13 That is, in the CE: (1) every firm, by producing until the point at which its marginal cost is equal to the market price, makes its marginal cost equal to the marginal social value of output (μ); and (2) every consumer, by consuming until the point at which the marginal benefit from additional units is equal to the market price, makes the marginal benefit from consumption equal to its marginal cost.

14 Note that this assumption does not hold when there is incomplete information about the product being exchanged in the market, as in the used-car market, where the presence of incomplete information might drive all good cars out of the market. This is the standard argument of the market for “lemons”.



“redistribution” is always good. Importantly, this theorem holds only under relatively strong assumptions. In particular, we require that preferences and production sets are convex and, of course, we are assuming that agents have complete information, which might be very restrictive in certain cases.15

Note that an alternative way to set up the social planner problem is

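$$\begin{aligned}\max_{\{x_i,m_i\}_{i=1}^{I},\,\{z_j,q_j\}_{j=1}^{J}}\quad & m_1 + \phi_1(x_1)\\ \text{s.t.}\quad & m_i + \phi_i(x_i) \ge \bar{u}_i \quad \text{for all } i = 2,3,\dots,I\\ & \sum_{i=1}^{I} x_i \le \sum_{j=1}^{J} q_j\\ & \sum_{i=1}^{I} m_i + \sum_{j=1}^{J} z_j \le w_m\\ & z_j \ge c_j(q_j) \quad \text{for all } j = 1,2,\dots,J\end{aligned}$$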

Intuitively, this problem states that the benevolent planner wants to maximize the utility level of individual 1 without reducing the utility level of any other individual in the society below a certain cutoff uibar, while satisfying two resource constraints (one for good x and one for the numeraire) and a technological constraint for every firm.

A note on the social welfare function. We consider that society measures the social welfare generated by a given vector of utility levels among individuals (u1,u2, …, uI) by using a social welfare function W(u1,u2, …, uI). The following figure depicts an example of this function. First, note that from our previous discussion the utility possibility frontier is a straight line indicating the pairs of utility levels that the society can reach given its endowment and current technology. Intuitively, this set represents utility pairs that are feasible for the society. The social welfare function, in contrast, helps select one particular pair among all those that are feasible. For the initial consumption and production levels x10, x20 and qj0, society prefers utility pair u0 since at this point society can reach the highest social welfare level.16 When consumption and production are increased to x11, x21 and qj1, the utility possibility set shifts outwards. If, after the change in consumption and production levels society is at a utility pair u1, a policy of transfers among consumers allows society to reach a higher social welfare level moving along the utility possibility set towards utility pair u1*.17

15 Standard presentations of general equilibrium theory show that the 2nd FTWE doesn't hold if these conditions are not satisfied, while the 1st FTWE still holds. (For a reference, see section 16.D in MWG).

16 Graphically, the figure represents utility pairs for which the society reaches the same social welfare level, i.e., iso-welfare curves.

17 Note that, in the specific case in which the social welfare function is “utilitarian”, i.e., $W(u_1,u_2)=\sum_{i=1}^{2} u_i$, the iso-welfare curves become straight lines, so that the tangency condition with the utility possibility set becomes a complete overlap. In that particular case, any utility pair along the utility possibility set is Pareto optimal.



Figure #5.16

Welfare analysis

When evaluating how a change in consumption or production due to a change in some parameters (for instance, after the introduction of a tax) modifies aggregate social welfare, we use aggregate Marshallian surplus, defined as the total benefit from consumption less the total cost of production,

$$S = \sum_{i=1}^{I}\phi_i(x_i) - \sum_{j=1}^{J} c_j(q_j)$$

Consider now a differential change in the quantity of good k that individuals consume and that firms produce, such that $\sum_{i=1}^{I} dx_i = \sum_{j=1}^{J} dq_j$. Then, the change in the aggregate Marshallian surplus is

$$dS = \sum_{i=1}^{I}\phi_i'(x_i)\,dx_i - \sum_{j=1}^{J} c_j'(q_j)\,dq_j$$

and since the marginal benefit from additional units of consumption φ’i(xi) coincides with the inverse demand function P(x) for all consumers (i.e., every individual consumes until the marginal benefit from additional units is equal to the market price), and c’j(qj)=C’(q) for all firms (i.e., every firm j’s marginal cost at its equilibrium production coincides with the aggregate marginal cost), then


$$\begin{aligned} dS &= \sum_{i=1}^{I} P(x)\,dx_i - \sum_{j=1}^{J} C'(q)\,dq_j\\ dS &= P(x)\sum_{i=1}^{I} dx_i - C'(q)\sum_{j=1}^{J} dq_j\end{aligned}$$

But since $\sum_{i=1}^{I} dx_i = \sum_{j=1}^{J} dq_j = dx$, and $x = q$ by market feasibility, then



$$dS = \left[P(x) - C'(x)\right]dx$$

Therefore, the change in Marshallian surplus of a marginal increase in consumption (and production) is the difference between the consumers’ additional utility and firms’ additional cost of production. This intuition is graphically represented in the following figure, where the differential change in Marshallian surplus produced by a marginal increase in x is depicted by the vertical distance between the marginal benefit that consumers obtain from additional units of the good and the marginal cost that firms incur in order to produce those additional units.

Figure #5.17

We can also integrate the above expression, eliminating the differentials, so we can obtain the total Marshallian surplus for an aggregate consumption level of x, as follows.

$$S(x) = S_0 + \int_0^x \left[P(s) - C'(s)\right]ds$$

Where S0=S(0) is the constant of integration, and represents aggregate surplus when aggregate consumption is zero, x=0. The next figure represents aggregate Marshallian surplus for a given aggregate consumption level x.



Figure #5.18

A natural question at this point is “for which consumption level is aggregate Marshallian surplus S(x) maximized?” Differentiating the expression of S(x) with respect to x, we obtain the first order necessary condition

S’(x*)=P(x*)-C’(x*)≤0, or rearranging P(x*)≤C’(x*)

And the second order (sufficient) condition,

$$S''(x^*) = P'(x^*) - C''(x^*) < 0$$

where this expression is negative since P’(x*)<0, given that the inverse demand function decreases in quantity, and C’’(x*)≥0, since firms’ costs are convex in output (and therefore aggregate production costs are convex as well). Hence, function S(x) is concave and we can confirm that x* constitutes a maximum of S(x). In addition, in interior solutions, where x*>0, aggregate surplus S(x) is maximized at an output level where P(x*)=C’(x*). This implies that the aggregate surplus S(x) is maximized at the competitive equilibrium allocation. This could be anticipated by a visual examination of the above figure, where S(x) increases until x=x*. This result coincides with the 1st FTWE, namely, every CE allocation is also PO, i.e., the CE allocation maximizes aggregate welfare.18
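The maximization of aggregate Marshallian surplus can be sketched numerically. The snippet below assumes, only for illustration, a linear inverse demand P(x)=10-x and a linear aggregate marginal cost C'(x)=x (neither is taken from the notes), integrates P(s)-C'(s) with a midpoint rule, and checks that surplus is highest where P(x)=C'(x).

```python
# Hypothetical parameters, assumed only for illustration:
#   inverse demand P(x) = A - B*x, aggregate marginal cost C'(x) = C*x
A, B, C = 10.0, 1.0, 1.0


def inverse_demand(x):
    return A - B * x


def marginal_cost(x):
    return C * x


def surplus(x, s0=0.0, steps=10_000):
    """S(x) = S0 + integral from 0 to x of [P(s) - C'(s)] ds (midpoint rule)."""
    width = x / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * width
        total += (inverse_demand(s) - marginal_cost(s)) * width
    return s0 + total


# Interior first order condition P(x*) = C'(x*)
x_star = A / (B + C)

for x in (4.0, x_star, 6.0):
    print(x, surplus(x))
```

Under these assumed functions the surplus at x* exceeds the surplus at nearby output levels on either side, as the concavity of S(x) suggests.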

Concluding remarks. Let us briefly recall the assumptions in this chapter. First, all prices except for pk are fixed. When is it valid to use this assumption? When studying groups of commodities, as long as prices between the groups do not substantially change. Second, we were considering the absence of wealth effects (i.e., we were using a quasilinear utility function). When wealth effects are present, our supply and demand analysis, the definition of competitive equilibrium allocation, comparative statics,

18 For an interesting example related with the use of aggregate Marshallian surplus see Example 10.E.1 in MWG.



etc. are still valid. However, the welfare analysis (evaluating Marshallian surplus) is not accurate when wealth effects are present, since neither AV=CV nor AV=EV.



Chapter 6: Choice under Uncertainty

Expected Utility Theory

In contrast to our analysis in previous chapters, where the individual or firm selects among a set of certain outcomes, we now examine choices under uncertain outcomes. In this section we present the decision maker’s preferences over uncertain outcomes, and how to represent this preference relation with an expected utility function.

In particular, consider a set of possible outcomes (or consequences) C. This set might include, for instance, simple monetary payoffs (either positive or negative), in which case C=Reals, or instead, represent consumption bundles, in which case C=X (where X is a subset of RL, as in previous chapters). For simplicity, outcomes are considered finite, and hence the set of possible outcomes C contains N elements. In addition, the probabilities associated to every possible outcome are objectively known,1 being p1 for outcome 1, p2 for outcome 2, etc. In this chapter we use the concept of lotteries to represent uncertain outcomes. In particular, a simple lottery is a list


$$L = (p_1, p_2, \dots, p_N)$$

with $p_n \ge 0$ for every outcome n, and $\sum_{n=1}^{N} p_n = 1$, where pn is the probability of outcome n occurring.2 We can graphically represent a simple lottery with two possible outcomes as a point along the line connecting (0,1) and (1,0), as depicted below.

Figure 6.1

Intuitively, note that the horizontal (vertical) intercept represents a “degenerate” probability distribution, where outcome 1 (outcome 2, respectively) is certain. Strictly positive probability pairs (p1,p2) on the line p1+p2=1, in contrast, describe a lottery where none of the outcomes is certain and therefore the individual faces some uncertainty. We can easily extend this graphical representation of lotteries to the case of a

1 In later sections of this chapter we consider that the decision maker does not perfectly know the probability associated to every outcome (e.g., he does not know how likely is outcome 1).

2 Note that some textbooks describe lotteries as lists of not only probabilities, but also the outcome associated to every probability.



lottery of 3 possible outcomes with associated probabilities (p1,p2,p3), as the following figure illustrates. First, note that the intercepts also represent degenerate probability distributions where one outcome is certain. Second, note that points strictly inside the hyperplane connecting the three intercepts denote a lottery where the individual faces uncertainty, such as the point depicted in the figure. This figure is usually referred to as the probability simplex of lotteries with N=3 outcomes.

Figure 6.2

In order to simplify our graphical analysis, we can do a 2-dimensional projection of the above hyperplane, as the following figure illustrates. First, note that the vertices represent the intercepts (where one outcome is certain). Second, a simple lottery where the individual faces uncertainty is represented by an interior point of the triangle, where the distance from the point to each side of the triangle represents the probability that the outcome at the opposite vertex occurs.

Figure 6.3



We can now use our previous notation to define compound lotteries. Specifically, given a list of K simple lotteries, where

$L_k = (p_1^k, p_2^k, \dots, p_N^k)$ for every lottery k=1,2,…,K

with associated probabilities αk≥0 for every lottery k, with $\sum_{k=1}^{K} \alpha_k = 1$, the compound lottery (L1,L2,…,LK;α1,α2,…,αK) is the risky alternative that yields the simple lottery Lk with probability αk.

We can hence intuitively interpret a compound lottery as a “lottery of lotteries”: first, we face a probability α1 of playing lottery L1 and, if lottery 1 occurs, we then face a probability p11 of outcome 1 occurring, a probability p21 of outcome 2 occurring, etc. Repeating this argument for every lottery k, the probability of outcome 1 is in fact

$$p_1 = \alpha_1 p_1^1 + \alpha_2 p_1^2 + \dots + \alpha_K p_1^K$$

Therefore, for any compound lottery (L1,L2,…,LK;α1,α2,…,αK), we can calculate a corresponding reduced lottery as the simple lottery L=(p1,p2,…,pN) that generates the same ultimate distribution of outcomes. That is, the reduced lottery L of any compound lottery can be obtained by

$$L = \alpha_1 L_1 + \alpha_2 L_2 + \dots + \alpha_K L_K \in \Delta$$

Let us see two examples of reduced lotteries. In example 1 below, all lotteries are equally likely (αi=1/3 for i=1,2,3) but, if lottery 1 occurs, we are guaranteed outcome 1, while if lotteries 2 or 3 occur, we face a positive probability of obtaining any of the three possible outcomes. The probability of outcome 1 in this compound lottery is therefore $\tfrac{1}{3}\cdot 1 + \tfrac{1}{3}\cdot\tfrac{1}{4} + \tfrac{1}{3}\cdot\tfrac{1}{4} = \tfrac{1}{2}$. Similarly, the probability of outcome 2 is $\tfrac{1}{3}\cdot 0 + \tfrac{1}{3}\cdot\tfrac{3}{8} + \tfrac{1}{3}\cdot\tfrac{3}{8} = \tfrac{1}{4}$ (the probability of outcome 3 can be found in a similar manner, also being ¼). The reduced lottery of the compound lottery represented in example 1 is therefore $\left(\tfrac{1}{2}, \tfrac{1}{4}, \tfrac{1}{4}\right)$.
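The reduction of a compound lottery can be sketched in a few lines. The helper name `reduce_lottery` is hypothetical, and the probabilities used for lotteries 2 and 3 are an assumption read off the worked computation for this example (the figure itself is not reproduced here); exact fractions avoid rounding error.

```python
from fractions import Fraction as F


def reduce_lottery(simple_lotteries, weights):
    """Return the reduced lottery sum_k alpha_k * L_k as a tuple of probabilities."""
    if sum(weights) != 1:
        raise ValueError("the weights alpha_k must add up to one")
    n_outcomes = len(simple_lotteries[0])
    return tuple(
        sum(a * lottery[n] for a, lottery in zip(weights, simple_lotteries))
        for n in range(n_outcomes)
    )


# Example 1 (values for L2 and L3 are assumed, see the lead-in)
L1 = (F(1), F(0), F(0))            # outcome 1 for sure
L2 = (F(1, 4), F(3, 8), F(3, 8))
L3 = (F(1, 4), F(3, 8), F(3, 8))
weights = (F(1, 3), F(1, 3), F(1, 3))

reduced = reduce_lottery([L1, L2, L3], weights)
print(reduced)
```

Any other compound lottery that produces the same tuple is, under consequentialism, equivalent for the decision maker.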

Figure 6.4

In example 2 below, lotteries 4 and 5 are equally likely. The probability of each outcome is



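$$\begin{aligned}\text{outcome 1:}\quad & \tfrac{1}{3}\cdot 1 + \tfrac{1}{3}\cdot\tfrac{1}{4} + \tfrac{1}{3}\cdot\tfrac{1}{4} = \tfrac{1}{2}\\ \text{outcome 2:}\quad & \tfrac{1}{3}\cdot 0 + \tfrac{1}{3}\cdot\tfrac{3}{8} + \tfrac{1}{3}\cdot\tfrac{3}{8} = \tfrac{1}{4}\\ \text{outcome 3:}\quad & \tfrac{1}{3}\cdot 0 + \tfrac{1}{3}\cdot\tfrac{3}{8} + \tfrac{1}{3}\cdot\tfrac{3}{8} = \tfrac{1}{4}\end{aligned}$$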

Figure 6.5

The reduced lottery of the compound lottery represented in example 2 is therefore $\left(\tfrac{1}{2}, \tfrac{1}{4}, \tfrac{1}{4}\right)$. Interestingly, both compound lotteries induce the same reduced lottery, despite originating from a different set of simple lotteries. This reduced lottery (which assigns the same probability weight to lottery L4 and L5) is graphically represented as the linear combination between these two lotteries in the probability simplex below.

Figure 6.6

Preferences over lotteries



Regarding the preferences of decision makers who face uncertain outcomes, we assume that individuals only care about the reduced lottery that a compound lottery induces, so they are indifferent between compound lotteries that induce the same reduced lottery, as in the previous example where two different compound lotteries induced the same reduced lottery. We refer to this assumption as “consequentialism” since only consequences (outcomes), and the probability associated with every consequence, matter for the decision maker.

In addition, we consider the set of all simple lotteries over outcomes C, £. We assume that the decision maker has a complete and transitive preference relation over lotteries in £, allowing him to compare any pair of simple lotteries L and L’. That is,

1. Completeness: Either $L \succsim L'$ or $L' \succsim L$, or both, for all L, L’ ∈ £

2. Transitivity: If $L \succsim L'$ and $L' \succsim L''$, then $L \succsim L''$, for all L, L’, L’’ ∈ £

Examples. Let us now describe some examples of preference relations over lotteries. First, we consider examples of preferences over lotteries where the decision maker is only concerned about the probability distribution over outcomes.

1. Extreme preference for certainty: The decision maker prefers lottery L to L’ if and only if

$$\max_{n\in N} p_n \ge \max_{n\in N} p_n'$$


Intuitively, this preference relation represents a decision maker who is only concerned about the probability associated to the most likely outcome. That is, he considers the most likely outcome in lottery L and L’ and chooses the lottery in which such outcome is the most likely. (Note that such outcome might differ from lottery L to lottery L’).

2. Smallest size of the support: The decision maker prefers lottery L to L’ if and only if |supp(L)|≤|supp(L’)|

where supp(L)={n : pn>0} denotes the support of lottery L, i.e., the set of outcomes with a strictly positive probability, and |supp(L)| denotes the number of outcomes in this set. Intuitively, this preference relation considers a decision maker who prefers the lottery whose probability distribution is concentrated over the smallest set of possible outcomes.

Let us next examine preference relations over lotteries for which the decision maker cares about not only probability distributions but also outcomes.

3. Lexicographic preferences: first, we order outcomes from most to least preferred. Then, the decision maker prefers lottery L to L’ if and only if

p1>p1’, or if p1=p1’ and p2>p2’, or

if p1=p1’ and p2=p2’ and p3>p3’, or…

Intuitively, the decision maker prefers lottery L to L’ if outcome 1 (the most preferred outcome) is more likely to occur in lottery L than in lottery L’. If such outcome is equally likely in both lotteries, i.e., p1=p1’, then the decision maker prefers lottery L to L’ if outcome 2 (the second most preferred outcome) is more likely to occur in lottery L than in lottery L’, etc.

4. The worst case scenario: First, the decision maker attaches a number v(.) to every outcome, v(z). Then, he prefers lottery L to L’ if and only if



min{v(z):p(z)>0} > min{v(z):p’(z)>0}

Intuitively, this implies that this decision maker prefers lottery L if the worst utility he can get from playing lottery L, min v(z), is higher than the worst utility he can get from playing lottery L’.3
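The four preference relations above can be sketched as simple comparison functions. All function names, the example lotteries and the numbers v attached to the outcomes are hypothetical, and lotteries are represented as tuples of probabilities ordered from the most to the least preferred outcome; the weak comparison in the first function is an assumption.

```python
def support(lottery):
    """Set of outcome indices with strictly positive probability."""
    return {n for n, p in enumerate(lottery) if p > 0}


def prefers_certainty(L, Lp):
    """Example 1: compare the probability of the most likely outcome."""
    return max(L) >= max(Lp)


def prefers_small_support(L, Lp):
    """Example 2: the lottery with fewer possible outcomes is preferred."""
    return len(support(L)) <= len(support(Lp))


def prefers_lexicographic(L, Lp):
    """Example 3: tuples compare lexicographically, outcome 1 first."""
    return tuple(L) > tuple(Lp)


def prefers_worst_case(L, Lp, v):
    """Example 4: compare the worst value v(z) among possible outcomes."""
    worst_L = min(v[n] for n in support(L))
    worst_Lp = min(v[n] for n in support(Lp))
    return worst_L > worst_Lp


# Hypothetical lotteries over three outcomes and hypothetical values v
L = (0.5, 0.5, 0.0)
Lp = (0.4, 0.3, 0.3)
v = (10, 5, 0)

print(prefers_certainty(L, Lp), prefers_small_support(L, Lp),
      prefers_lexicographic(L, Lp), prefers_worst_case(L, Lp, v))
```

With these assumed lotteries all four criteria rank L above L’, although in general they need not agree.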

Let us next define continuity of preferences in this context of preferences over lotteries. For completeness, we present two equivalent definitions.

Continuity 1. For any three lotteries L, L’ and L’’, the sets

$$\left\{\alpha \in [0,1] : \alpha L + (1-\alpha)L' \succsim L''\right\} \subset [0,1] \quad\text{and}\quad \left\{\alpha \in [0,1] : L'' \succsim \alpha L + (1-\alpha)L'\right\} \subset [0,1]$$

are closed. The following definition of continuity is probably more intuitive. We therefore elaborate on the intuition behind continuity after presenting the following definition.

Continuity 2. If lottery L is strictly preferred to L’, then there are small neighborhoods of L and L’, B(L) and B(L’), such that for all La ∈ B(L) and Lb ∈ B(L’), we have that La is strictly preferred to Lb. The following figure illustrates the intuition behind this definition. In particular, small changes in the probability distribution of lotteries L and L’ do not change the decision maker’s preference over the two lotteries.

Figure 6.7

Using an example from MWG, if a decision maker prefers a car trip to staying at home (both outcomes being certain), then he must still prefer the car trip to staying at home if we include a sufficiently small probability of suffering a car accident, as the following figure illustrates. In particular, he slightly moves away from one of the vertices, but still prefers lottery La (a car trip with a small probability of a car accident) to lottery Lb (staying at home).

3 This preference ordering over uncertain outcomes is sometimes observed in computer science, where one algorithm is preferred to another if it performs better in the worst case scenario, independently of the probability that such worst case scenario occurs (as long as it is positive).



Figure 6.8

The above continuity assumption, as in consumer theory, implies the existence of a utility function from the set of all lotteries £ to the reals, i.e., U:£→R, such that lottery L is weakly preferred to lottery L’ if and only if U(L)≥U(L’).

We must however impose an additional assumption on preferences over lotteries in order to guarantee that the decision maker’s preferences satisfy “consequentialism” as suggested above. We do so by imposing the so-called independence axiom (IA).

A preference relation over lotteries satisfies the IA if, for any three lotteries L, L’ and L’’, and α ∈ (0,1), we have that L is weakly preferred to L’ if and only if αL+(1-α)L’’ is weakly preferred to αL’+(1-α)L’’. Intuitively, if we mix each of two lotteries, L and L’, with a third one L’’, the preference ordering of the two resulting compound lotteries does not depend on (is independent of) the particular third lottery L’’ that we use. We provide a graphical illustration of the IA below. Specifically, in the figure on the left the individual prefers lottery L to L’. Hence, it must be that, when we construct a linear combination of the first two lotteries with any third lottery L’’, the linear combination of L and L’’ is still preferred to that of L’ and L’’.

Figure 6.9



The following example emphasizes the intuition behind the IA. Consider a decision maker who prefers lottery L to L’. We can construct a compound lottery where, after a coin toss, the decision maker plays lottery L when heads comes up and L’’ when tails does, and another compound lottery where the decision maker plays lottery L’ when heads comes up and L’’ when tails does. The IA tells us that this decision maker must still prefer the first to the second compound lottery.4 For examples of preferences that do not satisfy the IA, see Rubinstein (pages 91-92).

Figure 6.10

Given the above assumptions, we can now state that the utility function over lotteries has the so-called expected utility form.

The utility function U:£→R has the expected utility form if there is an assignment of numbers (u1,u2,…,uN) to the N possible outcomes such that, for every simple lottery L=(p1,p2,…,pN) ∈ £, we have

$$U(L) = u_1 p_1 + u_2 p_2 + \dots + u_N p_N$$


In addition, a utility function with the expected utility form is also referred to as a von-Neumann-Morgenstern (vNM) expected utility function. Note that this function is linear in the probabilities, as the following result states.

A utility function U:£→R has the expected utility form if and only if it is linear. That is, if and only if

$$U\!\left(\sum_{k=1}^{K}\alpha_k L_k\right) = \sum_{k=1}^{K}\alpha_k \cdot U(L_k)$$

for any K lotteries Lk ∈ £, k=1,2,…,K, and probabilities (α1,α2,…,αK)≥0 with $\sum_{k=1}^{K}\alpha_k = 1$. Intuitively, the utility of the expected value of the K lotteries, $U\!\left(\sum_{k=1}^{K}\alpha_k L_k\right)$, coincides with the expected utility of the K lotteries, $\sum_{k=1}^{K}\alpha_k \cdot U(L_k)$. Indeed, note that the utility of the expected value of playing the K lotteries is

4 Although the IA seems a sensible assumption in the theory of choice under uncertainty, note that it did not necessarily hold in consumer theory (under certain outcomes). In particular, a consumer might prefer good A over good B, but the combination of A with a third good C does not need to be preferred to the combination of B with the third good C, i.e., the consumer might regard A and C as substitutes but B and C as complements in consumption.





$$U\!\left(\sum_{k=1}^{K}\alpha_k L_k\right) = \sum_{n=1}^{N} u_n \cdot \left(\sum_{k=1}^{K}\alpha_k p_n^k\right)$$

Where, for a given outcome n, the decision maker finds the joint probability of outcome n occurring in lottery 1, α1pn1, plus the joint probability of outcome n occurring in lottery 2, α2pn2, and similarly for all K lotteries. Summing the joint probability of outcome n occurring along the K lotteries, we obtain the total joint probability of outcome n occurring, and we multiply it times the utility that the decision maker gets from outcome n, un. We can then repeat this process for every possible outcome n=1,2,…,N. Similarly, the expected utility from playing the K lotteries is indeed represented by


$$\sum_{k=1}^{K}\alpha_k \cdot U(L_k) = \sum_{k=1}^{K}\alpha_k \cdot \left(\sum_{n=1}^{N} u_n p_n^k\right)$$

where, for a given lottery k, we find the expected utility from outcome 1 occurring in lottery k, u1p1k, plus the expected utility from outcome 2 occurring in lottery k, u2p2k, etc. Summing over all possible outcomes, we obtain the expected utility from playing a given lottery k. We can then multiply this expected utility by the associated probability of lottery k occurring, and then repeat this process for all lotteries k=1,2, …, K.
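The linearity property can be checked numerically. In the sketch below the Bernoulli utilities, the three simple lotteries and the weights are all hypothetical values chosen for illustration; the two sides of the linearity condition are computed separately with exact fractions and then compared.

```python
from fractions import Fraction as F

# Hypothetical utilities u_n for N = 3 outcomes
u = (F(10), F(4), F(0))


def expected_utility(lottery, utilities):
    """U(L) = sum_n u_n * p_n."""
    return sum(p * un for p, un in zip(lottery, utilities))


def mix(lotteries, alphas):
    """Reduced lottery sum_k alpha_k * L_k."""
    n_outcomes = len(lotteries[0])
    return tuple(
        sum(a * lot[n] for a, lot in zip(alphas, lotteries))
        for n in range(n_outcomes)
    )


# Hypothetical simple lotteries and weights (the weights add up to one)
lotteries = [
    (F(1), F(0), F(0)),
    (F(1, 4), F(3, 8), F(3, 8)),
    (F(0), F(1, 2), F(1, 2)),
]
alphas = (F(1, 2), F(1, 4), F(1, 4))

# Left-hand side: utility of the mixed (reduced) lottery
lhs = expected_utility(mix(lotteries, alphas), u)
# Right-hand side: weighted sum of the utilities of each lottery
rhs = sum(a * expected_utility(lot, u) for a, lot in zip(alphas, lotteries))

print(lhs, rhs)
```

Both sides coincide because the double sum over outcomes n and lotteries k can be taken in either order.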

Note that the above EU property is a cardinal property (not ordinal). That is, not only the ranking matters, but also the particular number resulting from the utility function U:£→R. Hence, the EU form of an original utility function U(L) is preserved only under increasing linear transformations, such as βU(L)+γ, where β>0, as the following result confirms. A utility function $\tilde{U}$:£→R is another vNM utility function for the decision maker’s preferences over lotteries if and only if $\tilde{U}(L)$=βU(L)+γ for every lottery L ∈ £, where β>0.5

Using the above assumptions we can now state the following result. Suppose that the decision maker’s preference relation over lotteries satisfies rationality (completeness and transitivity), continuity and the independence axiom. Then, this preference relation admits a utility representation of the expected utility form. That is, we can assign a number un to every outcome n=1,2,…,N in such a manner that for any two lotteries L=(p1,p2,…,pN) and L’=(p1’,p2’,…,pN’), lottery L is weakly preferred to lottery L’ if and only if U(L)≥U(L’), or $\sum_{n=1}^{N} p_n \cdot u_n \ge \sum_{n=1}^{N} p_n' \cdot u_n$. (Note that un is the utility that the decision maker assigns to outcome n. It is usually referred to as the Bernoulli utility function.)

Up to this point the decision maker’s preference over lotteries had not been graphically represented with indifference curves. Let us next analyze the effect of the IA on individual’s indifference curves over lotteries. In particular, the IA implies that indifference curves must be straight and parallel lines.

5 βU(L)+γ is also referred to as an affine transformation, i.e., an increasing linear transformation.



1. Indifference curves must be straight lines. Indeed, if a decision maker is indifferent between two lotteries L and L’, then applying the IA he must be indifferent between αL+(1-α)L’ and αL+(1-α)L=L for all 1>α>0 (note that we only added αL on both sides of the indifference relation). This result is graphically illustrated in the following figure, where the decision maker is indifferent between L and L’, and therefore he must also be indifferent between L and any linear combination of L and L’, i.e., any point on the line connecting lotteries L and L’.

Figure 6.11

Alternatively, note that if a decision maker is indifferent between lotteries L and L’, then using the IA we obtain that he must be indifferent between αL+(1-α)L’ and L. This is graphically represented in the following figure, where the individual is indifferent between lotteries L and L’ (so they both lie on the same indifference curve), and therefore the IA implies that the compound lottery αL+(1-α)L’ should also lie on the same indifference curve, which graphically implies that indifference curves must be straight. Note that if, in contrast, indifference curves are curved, as in the next figure, the compound lottery αL+(1-α)L’ does not lie on the same indifference curve as lotteries L and L’, and hence the decision maker is not indifferent between such compound lottery and the simple lotteries L and L’.

Figure 6.12



2. Indifference curves must be parallel. If a decision maker is indifferent between two lotteries L and L’, then applying the IA he must be indifferent between αL+(1-α)L’’ and αL’+(1-α)L’’. In the figure below, this implies that, starting from two lotteries L and L’ over which the decision maker is indifferent (and which therefore lie on the same indifference curve), the linear combinations of each of these two lotteries with a third lottery L’’ should also lie on a common indifference curve. If these two compound lotteries αL+(1-α)L’’ and αL’+(1-α)L’’ lie on different indifference curves, as they do in the figure below, then the IA is violated.6

Figure 6.13

Violations of the IA

Despite its intuitive appeal, many individuals violate the IA in their choices among uncertain outcomes. Let us next present some examples.

Allais’ paradox. Consider a lottery over three possible monetary outcomes: a first prize of US$2.5 million, a second prize of half a million dollars, and a third prize of zero dollars. The decision maker is initially asked to choose among lotteries L1 and L1’, where

$$L_1 = (0,1,0) \quad\text{and}\quad L_1' = \left(\tfrac{10}{100}, \tfrac{89}{100}, \tfrac{1}{100}\right)$$

and he/she is then asked to select one lottery between the following two:

$$L_2 = \left(0, \tfrac{11}{100}, \tfrac{89}{100}\right) \quad\text{and}\quad L_2' = \left(\tfrac{10}{100}, 0, \tfrac{90}{100}\right)$$

Interestingly, more than 50% of the students confronted with these two choices express preferring lottery L1 to L1’, but preferring L2’ to L2. (This result has been recurrently observed in different countries, and among subjects with different backgrounds.) Let us next show why this preference relation violates the IA. If the decision maker’s preferences over lotteries satisfied all previous assumptions (and hence can be represented with an expected utility function), the fact that L1 is strictly preferred to L1’ implies that

6 Indeed, in the figure the decision maker is indifferent between lotteries L and L’, but is not indifferent between the two compound lotteries obtained by mixing each of them with L’’, violating the IA.



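$$u_5 > \tfrac{10}{100}u_{25} + \tfrac{89}{100}u_5 + \tfrac{1}{100}u_0$$

where u25, u5 and u0 denote the utility of the first prize (2.5 million dollars), the second prize (half a million dollars) and the third prize (zero dollars), respectively. By the IA, we can add (89/100)u0 − (89/100)u5 on both sides, and we obtain

$$u_5 + \tfrac{89}{100}u_0 - \tfrac{89}{100}u_5 > \tfrac{10}{100}u_{25} + \tfrac{89}{100}u_5 + \tfrac{1}{100}u_0 + \tfrac{89}{100}u_0 - \tfrac{89}{100}u_5$$

and simplifying

$$\tfrac{11}{100}u_5 + \tfrac{89}{100}u_0 > \tfrac{10}{100}u_{25} + \tfrac{90}{100}u_0 \iff L_2 \succ L_2'$$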

Hence, if the decision maker prefers L1 to L1’ the IA implies that he must prefer L2 to L2’. The dissonance between theoretical predictions and people’s actual choices over lotteries has produced several reactions.
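The algebra above can be checked numerically. The following is a minimal sketch, not part of the original notes: the utility values passed to `eu_gaps` are hypothetical, and the point is only that EU(L1) − EU(L1’) and EU(L2) − EU(L2’) coincide for any utilities, so an expected utility maximizer cannot prefer L1 to L1’ and L2’ to L2 at the same time.

```python
from fractions import Fraction as Fr

# Probabilities over the three prizes (2.5 million, 0.5 million, 0), as in the text.
L1 = (Fr(0), Fr(1), Fr(0))
L1_PRIME = (Fr(10, 100), Fr(89, 100), Fr(1, 100))
L2 = (Fr(0), Fr(11, 100), Fr(89, 100))
L2_PRIME = (Fr(10, 100), Fr(0), Fr(90, 100))


def expected_utility(lottery, utils):
    """Expected utility of a simple lottery given utilities (u25, u5, u0)."""
    return sum(p * u for p, u in zip(lottery, utils))


def eu_gaps(utils):
    """Return (EU(L1) - EU(L1'), EU(L2) - EU(L2')) for the given utilities."""
    gap_1 = expected_utility(L1, utils) - expected_utility(L1_PRIME, utils)
    gap_2 = expected_utility(L2, utils) - expected_utility(L2_PRIME, utils)
    return gap_1, gap_2


# Hypothetical utility values for (2.5 million, 0.5 million, 0); any numbers work.
for utils in [(10, 9, 0), (100, 50, 0), (3, 1, -20)]:
    gap_1, gap_2 = eu_gaps(utils)
    print(utils, gap_1, gap_2, gap_1 == gap_2)
```

Exact fractions are used so that the equality of the two gaps is not blurred by floating-point rounding.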

1. Approximation to rationality. One reaction to the Allais’ paradox considers that people might violate the IA the first time (or the first few times) they are confronted with choices among different lotteries. However, they are capable of adapting, and we shouldn’t expect that subjects still violate the IA after a sufficient period of time.

2. Little economic significance. Another reaction to the Allais’ paradox says that the lotteries presented to subjects involve probabilities that are close to zero and one, which rarely occur in real economic settings.

3. Regret theory. Some subjects justify their choice of lottery L1 over L1’ saying that they did not “want to regret a sure win of half a million!” These justifications led to the development of regret theory in the context of choice under uncertainty.7

4. Use of a weaker assumption. Finally, another reaction to the Allais’ paradox is to give up the IA in favor of a weaker assumption, such as the betweenness axiom (as discussed in the review session).

Machina’s paradox. Consider a decision maker with the following preference over certain outcomes: he prefers a trip to Venice (Italy) to watching a movie about Venice, and he prefers a movie about Venice to staying at home without watching the movie. Let us now consider the following two lotteries over the above three outcomes.

$$L_1 = \left(\tfrac{99}{100}, \tfrac{1}{100}, 0\right) \quad\text{and}\quad L_2 = \left(\tfrac{99}{100}, 0, \tfrac{1}{100}\right)$$

Intuitively, the first lottery involves a 99% probability of winning a trip to Venice and a 1% probability of winning the movie about Venice. The second lottery still maintains the same 99% probability of winning a trip to Venice but shifts the 1% probability towards the outcome in which the individual does not watch the movie about Venice. One interesting feature of the IA is that, from the previous preferences over certain outcomes, we can infer this decision maker’s preference over the above two lotteries. Denote by T the trip to Venice, M the movie about Venice and H staying at home (without the movie). Using the fact that this decision maker prefers T to M, we have

7 One of the exercises in your homework assignment explores a decision maker whose preferences over lotteries reflect regret.



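$$\tfrac{99}{100}T + \tfrac{1}{100}T \succ \tfrac{99}{100}T + \tfrac{1}{100}M$$

Second, from M ≻ H, we have

$$\tfrac{99}{100}T + \tfrac{1}{100}M \succ \tfrac{99}{100}T + \tfrac{1}{100}H$$

Hence, by transitivity,

$$T \succ L_1 \succ L_2$$

and, in particular, L1 ≻ L2.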

Hence, a decision maker whose preference over lotteries satisfies the IA should prefer the first to the second lottery. Interestingly, many subjects in experimental settings prefer L2 to L1, violating as a consequence the IA. Similarly as in the Allais’ paradox, many subjects explain choosing L2 over L1 because of the disappointment they would experience in the case of losing the trip to Venice, and having to watch a movie about it instead.

The above two examples present situations in which subjects’ actual behavior is inconsistent with the IA. Can we still rely on the IA as a sensible assumption about individuals’ preferences among lotteries? A way to answer this question is by asking what would happen to individuals whose behavior violates the IA. In short, they would be weeded out of the market because they would be open to the acceptance of the so-called “Dutch books,” leading them to a sure loss of money.

Example of Dutch Books: Consider an individual who prefers lottery L to L’ and lottery L to L’’, i.e., L ≻ L’ and L ≻ L’’. However, in violation of the IA, αL’ + (1−α)L’’ ≻ L for some α ∈ (0,1). So if we present the individual with the chance to trade lottery L for a compound lottery of L’ with probability α and lottery L’’ with probability (1−α), αL’ + (1−α)L’’, for a small fee (x dollars), he would accept the trade. But as soon as the first stage of the compound lottery is over the individual will have either L’ or L’’. Since he prefers lottery L to both of these lotteries L’ and L’’, we could present him with the chance to trade his lottery for lottery L for a small fee (y dollars) and he will accept the trade. Thus after both trades he will have paid two small fees (x+y dollars) and ended up exactly where he began. We could start the cycle again, extracting more money from this individual. Hence, individuals who systematically violate the IA in their choices among risky lotteries would be weeded out of the marketplace.
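The cycle of trades can be sketched as a small simulation. This is only an illustration under assumed values: the function name `money_pump` and the fee amounts are hypothetical, and the agent is assumed to accept every trade described above.

```python
def money_pump(cycles, fee_x, fee_y):
    """Simulate an agent who accepts both trades in every cycle.

    Returns the lottery held at the end and the total fees paid.
    """
    holding = "L"
    total_paid = 0
    for _ in range(cycles):
        # Trade 1: give up L for the compound lottery aL' + (1-a)L'' and pay fee_x.
        holding = "compound"
        total_paid += fee_x
        # The first stage resolves, leaving L' or L''. Since L is preferred
        # to both, the agent pays fee_y to get L back.
        holding = "L"
        total_paid += fee_y
    return holding, total_paid


# Hypothetical fees of 1 and 2 dollars over three cycles.
print(money_pump(cycles=3, fee_x=1, fee_y=2))
```

After every cycle the agent holds L again, but his total payments grow by x+y each time.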

Money lotteries

In the following sections we restrict our attention to lotteries over monetary outcomes, i.e., C=ℝ. Since money is a continuous variable, x∈ℝ, this allows us to describe money lotteries as a cumulative distribution function (cdf)

F(x)=prob{y≤x} for all x∈ℝ

That is, F(x) represents the probability that the realized payoff y is less than or equal to x. The following figure illustrates an example of a money lottery that assigns the same probability to every possible payoff and it can therefore be represented with a uniform cdf F(x)=x.



Figure 6.14

whereas the next figure depicts a money lottery that assigns a larger probability to the initial values (approximately before $40) than to the last values (beyond $60).

Figure 6.15

The above examples consider continuous probability distributions. The decision maker can nonetheless face a money lottery that is distributed according to a discrete probability distribution, as the following figure illustrates.

Figure 6.16





$$F(x) = \begin{cases} 0 & \text{if } x < 1 \\ \ldots & \text{if } x \in [1,4) \\ \ldots & \text{if } x \in [4,6) \\ 1 & \text{if } x \ge 6 \end{cases}$$

In addition, if there is a density function f(x) associated with the cdf F(x), then

$$F(x) = \int_{-\infty}^{x} f(t)\,dt$$

The following figures illustrate the density function f(x) for the above continuous and discrete cdfs.

Figure 6.17

Figure 6.18

In the context of money lotteries, we can represent compound lotteries as follows. If the list of cdf’s F1(x), F2(x), …, FK(x) represents K simple money lotteries, each occurring with probability α1, α2, …, αK, then the compound lottery can be represented as

$$F(x) = \sum_{k=1}^{K} \alpha_k F_k(x)$$

which intuitively represents the probability-weighted average of the cdfs of the K simple money lotteries.
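As an illustration, the following sketch builds the cdf of a compound lottery from two simple money lotteries. The two uniform cdfs and the equal weights are assumptions chosen for the example; they do not come from the notes.

```python
def uniform_cdf(low, high):
    """Return the cdf of a money lottery that is uniform on [low, high]."""
    def cdf(x):
        if x <= low:
            return 0.0
        if x >= high:
            return 1.0
        return (x - low) / (high - low)
    return cdf


def compound_cdf(alphas, cdfs):
    """Return F(x) = sum_k alpha_k * F_k(x) for probabilities alpha_k."""
    if abs(sum(alphas) - 1.0) > 1e-12:
        raise ValueError("the probabilities alpha_k must add up to one")

    def cdf(x):
        return sum(a * f_k(x) for a, f_k in zip(alphas, cdfs))
    return cdf


# Hypothetical example: two uniform lotteries, each played with probability 1/2.
F1 = uniform_cdf(0.0, 1.0)
F2 = uniform_cdf(0.0, 2.0)
F = compound_cdf([0.5, 0.5], [F1, F2])
print(F(1.0))  # 0.5 * F1(1) + 0.5 * F2(1)
```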



For simplicity, we hereafter consider that money lotteries are distributed over non-negative amounts of money.8 We can now express the expected utility that the decision maker obtains from playing a particular money lottery as follows

$$U(F) = \int u(x) f(x)\,dx, \quad\text{or}\quad U(F) = \int u(x)\,dF(x)$$

where u(x) denotes the utility value that the decision maker obtains when the lottery gives him a monetary amount of x dollars.9 Note that U(F) is the mathematical expectation of the values of u(x), over all possible values of x. Furthermore, note that this expression is linear in the probabilities. Indeed, in the case that the cdf is a discrete probability distribution (as that described in the previous examples), we can find the EU from playing such a money lottery by writing p1u(x1)+ p2u(x2)+…

Importantly, this expected utility representation is sensitive not only to the mean of the distribution, but also to the variance, and higher moments of the distribution of monetary payoffs. We show this property of the expected utility function in the following example.

Example. Let us show that if u(x) = βx² + γx then EU is determined by mean and variance alone.

$$EU = \int u(x)\,dF(x) = \int \left[\beta x^2 + \gamma x\right] dF(x) = \beta \int x^2\,dF(x) + \gamma \int x\,dF(x)$$

and on the other hand, we know that

$$Var(x) = E(x^2) - \left(E(x)\right)^2. \quad\text{Hence,}\quad E(x^2) = Var(x) + \left(E(x)\right)^2$$

Substituting E(x²) in the above expression,

$$EU = \beta\,Var(x) + \beta\left(E(x)\right)^2 + \gamma E(x)$$

And as a consequence, the EU is determined by the mean and variance alone.
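A quick numerical check of this result: the two discrete lotteries below are hypothetical, chosen so that they share the same mean (2) and variance (1) but have different shapes. Under the quadratic utility their expected utilities coincide, and both match the expression in terms of mean and variance. The parameter values β = −0.1 and γ = 1 are likewise assumptions.

```python
def mean_and_variance(lottery):
    """Mean and variance of a discrete lottery given as (payoff, probability) pairs."""
    mean = sum(p * x for x, p in lottery)
    variance = sum(p * (x - mean) ** 2 for x, p in lottery)
    return mean, variance


def eu_quadratic(lottery, beta, gamma):
    """Expected utility computed directly with u(x) = beta*x^2 + gamma*x."""
    return sum(p * (beta * x ** 2 + gamma * x) for x, p in lottery)


def eu_from_moments(mean, variance, beta, gamma):
    """EU = beta*Var(x) + beta*E(x)^2 + gamma*E(x)."""
    return beta * variance + beta * mean ** 2 + gamma * mean


lottery_a = [(1, 0.5), (3, 0.5)]
lottery_b = [(0, 0.125), (2, 0.75), (4, 0.125)]
beta, gamma = -0.1, 1.0

for lottery in (lottery_a, lottery_b):
    mean, variance = mean_and_variance(lottery)
    print(mean, variance,
          eu_quadratic(lottery, beta, gamma),
          eu_from_moments(mean, variance, beta, gamma))
```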

Importantly, note that we imposed a relatively limited set of assumptions on the decision maker’s Bernoulli utility function, u(x) (the utility he obtains from a particular outcome or monetary outcome): that it is increasing in money and continuous. We must however impose an additional assumption: that u(x) is bounded. Otherwise, we can end up in relatively absurd situations, such as that illustrated in the so-called St. Petersburg-Menger paradox, which we present next.

St. Petersburg-Menger paradox. Consider an unbounded Bernoulli utility function, u(x). We can then find an amount of money xm such that u(xm)>2^m, for any integer m. In particular, consider a lottery in

8 Note that this just implies a normalization which shifts all possible payoffs in the lottery, e.g., adding a constant P to all of them, where P represents the absolute value of the most negative payoff that the decision maker can obtain in the lottery. This normalization guarantees that all resulting payoffs are zero or positive.

9 Note that if there is a density function f(x) associated with the cdf F(x), we can use either of the above expressions. Otherwise we can only use the latter. In addition, note that we did not write the intervals of integration. We hereafter assume that the integral is defined over the full range of possible realizations of x, i.e., from zero to infinity.



which we toss a coin repeatedly until tails comes up. We then give a monetary amount xm if tails comes up at the m-th toss. Since the probability that tails comes up in the m-th toss is ½·½·…·½ = 1/2^m, the expected utility from playing this lottery is

$$\sum_{m=1}^{\infty} \frac{1}{2^m}\, u(x_m)$$

But because u(xm)>2^m, we then have that

$$\sum_{m=1}^{\infty} \frac{1}{2^m}\, u(x_m) \ge \sum_{m=1}^{\infty} \frac{1}{2^m}\, 2^m = +\infty$$

where the expression on the right is infinitely large, and hence so is the expected utility on the left. Hence, this individual would be willing to pay infinite amounts of money to be able to play this lottery. It might therefore seem reasonable to assume that the Bernoulli utility function, u(x), is bounded.10
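To see the divergence, the sketch below computes partial sums of the expected utility. It assumes, purely for illustration, the unbounded utility u(x)=x and prizes xm = 2^m + 1, so that u(xm) > 2^m as the paradox requires; neither choice comes from the notes.

```python
from fractions import Fraction as Fr


def u(x):
    """An unbounded Bernoulli utility function (assumed here to be linear)."""
    return x


def prize(m):
    """Hypothetical prize paid when tails first comes up at toss m."""
    return 2 ** m + 1  # guarantees u(prize(m)) > 2**m


def partial_expected_utility(n_terms):
    """Sum of (1/2^m) * u(x_m) over the first n_terms tosses."""
    return sum(Fr(1, 2 ** m) * u(prize(m)) for m in range(1, n_terms + 1))


for n in (10, 20, 40):
    print(n, float(partial_expected_utility(n)))
```

Every term of the sum exceeds one, so the partial sum over the first n tosses exceeds n and grows without bound.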

Measuring risk preferences

In this section we evaluate the preference towards risky lotteries of different individuals. First, we start with the measure of risk aversion. In particular, we say that an individual’s utility exhibits risk aversion if, for any money lottery F(.),

$$\int u(x)\,dF(x) \le u\!\left(\int x\,dF(x)\right)$$

If this relationship holds with equality, we denote this individual as risk neutral. If, instead, the sign of the inequality is reversed, we denote him as risk lover. Intuitively, the above expression says that the utility that this individual obtains from receiving the expected value of playing the lottery is higher than the expected utility from playing such lottery. The following figure illustrates this intuition. In particular, it considers a lottery with two possible outcomes: $1 and $3 which are equally likely. Note that, first, we depict the utility from outcomes $1 and $3, u(1) and u(3) respectively, by mapping $1 and $3 into the utility function. We then find the expected value of the lottery ($2) and map it into the utility function,

obtaining u(2). We can then connect u(1) and u(3). The midpoint of this line represents ½u(1) + ½u(3), which is the expected utility of playing the lottery. Clearly, the utility from the expected value of the lottery, u(2), is higher than the expected utility from playing the lottery, ½u(1) + ½u(3). We can therefore conclude that this individual’s utility exhibits risk aversion.

10 Alternatively, we can avoid situations such as that described in the St. Petersburg-Menger paradox by checking that the distribution function we are using does not allow for this type of paradox. (You can read more about this paradox, and potential solutions, in NS pp. 203-205. I strongly recommend that you read the Query on page 205 and check your answer at the back of the book.)



Figure 6.19

Note that the above definition of risk aversion is a direct application of Jensen’s inequality. This suggests a strong connection between the concavity of an individual’s utility function and his degree of risk aversion. We return to this topic below.

The next figure depicts an individual who is risk neutral. In this case the utility from the expected value of the lottery, u(2), coincides with the expected utility of the lottery, ½u(1) + ½u(3). Thus, this individual exhibits risk neutrality.

Figure 6.20

Finally, if an individual is a risk lover, as the following figure illustrates, the utility from the expected value of the lottery, u(2), is lower than the expected utility from playing the lottery, ½u(1) + ½u(3).



Figure 6.21
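The three cases depicted in the figures can be reproduced for the lottery between $1 and $3. The sketch below is only illustrative: the three utility functions (square root, linear, and square) are assumptions standing in for the concave, linear and convex curves in the figures, and the comparison is made for this single lottery, whereas the definition requires it to hold for any lottery F(.).

```python
import math

# The ongoing example: outcomes 1 and 3, equally likely.
LOTTERY = [(1.0, 0.5), (3.0, 0.5)]


def expected_value(lottery):
    return sum(p * x for x, p in lottery)


def expected_utility(lottery, u):
    return sum(p * u(x) for x, p in lottery)


def risk_attitude(u, lottery, tol=1e-12):
    """Compare u(E[x]) with E[u(x)] for one given lottery."""
    utility_of_mean = u(expected_value(lottery))
    eu = expected_utility(lottery, u)
    if utility_of_mean > eu + tol:
        return "risk averse"
    if utility_of_mean < eu - tol:
        return "risk lover"
    return "risk neutral"


print(risk_attitude(math.sqrt, LOTTERY))         # concave utility
print(risk_attitude(lambda x: x, LOTTERY))       # linear utility
print(risk_attitude(lambda x: x ** 2, LOTTERY))  # convex utility
```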

An alternative way to measure risk aversion is by finding the certainty equivalent of a lottery. In particular, the certainty equivalent of money lottery F(.) for an individual with utility function u(.), c(F,u), is the amount of money for which the individual is indifferent between playing lottery F(.) and accepting a certain (sure) amount c(F,u). More compactly, the certainty equivalent can be expressed as

$$u(c(F,u)) = \int u(x)\,dF(x)$$

where the right-hand side denotes the expected utility that this individual obtains from playing lottery F(.). The following figure illustrates the certainty equivalent for a risk-averse individual. Specifically, note that c(F,u) is the amount of money that makes the individual reach the same utility as if he played the lottery. Because he is risk averse, the certainty equivalent c(F,u) is below the expected value of the lottery, $2. In particular, c(F,u) can be found by applying the above definition to this particular lottery, u(c(F,u)) = ½u(1) + ½u(3).11 The difference between the expected value of the lottery and the certainty equivalent that a risk-averse individual would be willing to accept in order to avoid the risky lottery is also used as a measure of how risk averse an individual is. In particular, this measure is commonly referred to as the risk premium of a lottery, RP, and is defined as RP=EV – c(F,u). The figure below includes the risk premium that this individual is willing to bear in order to avoid the lottery, i.e., he is willing to accept the certainty equivalent, which is below the expected value of the lottery.

11 If, for instance, u(x)=√x, then √c(F,u) = ½√1 + ½√3, or √c(F,u) ≈ 1.366. Squaring both sides of the equality, we obtain a certainty equivalent of c(F,u) ≈ 1.87.
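The certainty equivalent and the risk premium can also be computed numerically. The sketch below solves u(c) = EU by bisection, which assumes that u is increasing and continuous; the function names are hypothetical. It uses the square-root utility of the footnote, and, as a contrasting assumption, u(x)=x² for a risk lover.

```python
import math

LOTTERY = [(1.0, 0.5), (3.0, 0.5)]


def expected_value(lottery):
    return sum(p * x for x, p in lottery)


def certainty_equivalent(lottery, u, tol=1e-12):
    """Solve u(c) = E[u(x)] by bisection, assuming u is increasing."""
    eu = sum(p * u(x) for x, p in lottery)
    low = min(x for x, _ in lottery)
    high = max(x for x, _ in lottery)
    while high - low > tol:
        mid = (low + high) / 2
        if u(mid) < eu:
            low = mid
        else:
            high = mid
    return (low + high) / 2


def risk_premium(lottery, u):
    """RP = EV - c(F,u)."""
    return expected_value(lottery) - certainty_equivalent(lottery, u)


print(certainty_equivalent(LOTTERY, math.sqrt), risk_premium(LOTTERY, math.sqrt))
print(certainty_equivalent(LOTTERY, lambda x: x ** 2),
      risk_premium(LOTTERY, lambda x: x ** 2))
```

For the square-root utility the certainty equivalent is about 1.87 and the risk premium about 0.13; for the convex utility the certainty equivalent exceeds $2 and the risk premium is negative.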



Figure 6.22

In the case that we examine a risk lover, the previous rankings are reversed, as the following figure illustrates. Indeed, the certainty equivalent c(F,u) lies above the expected value of the lottery, $2. As a consequence, the risk premium for this individual, RP=EV – c(F,u) is actually negative since EV<c(F,u). Intuitively, this implies that this individual would have to be given an amount of money above the expected value of the lottery in order to convince him to “stop playing” the lottery (he loves risk!!).

Figure 6.23

Finally, note that in the case of a risk-neutral decision maker, the certainty equivalent c(F,u) coincides with the expected value of the lottery, $2, and therefore the risk premium is zero, RP=EV – c(F,u)=0, reflecting that this individual is not willing to accept money to avoid playing the lottery (as for risk averse individuals), nor must we compensate him in order to stop playing the lottery (as for risk lovers).

Figure 6.24

All the measures of riskiness discussed above focus on money. The next measure, in contrast, focuses on probabilities. The probability premium measures the excess in winning probability over fair odds (equally likely outcomes) that makes the individual indifferent between the certain outcome x and a gamble between the two outcomes x+ε and x-ε. That is,

$$u(x) = \left[\tfrac{1}{2} + \pi(x,\varepsilon,u)\right] u(x+\varepsilon) + \left[\tfrac{1}{2} - \pi(x,\varepsilon,u)\right] u(x-\varepsilon)$$

Intuitively, this implies that a risk averse individual, in order to be attracted to play a particular lottery, must be given better than fair odds since otherwise he would not accept the risk associated with the lottery. This intuition is graphically represented in the following figure for our on-going example of the lottery between monetary amounts $1 and $3. First, we map $1 and $3 into the utility function, obtaining u(1) and u(3). We then find the expected value of the lottery, $2, and map it into the utility function, u(2). Then, the “extra winning probability” (extra probability of outcome $3 occurring) that the risk averse decision maker needs in order to make the EU of the lottery rise until it coincides with the utility from the expected value of the lottery, u(2), is the value of π that solves

$$u(2) = \left(\tfrac{1}{2} + \pi\right) \cdot u(3) + \left(\tfrac{1}{2} - \pi\right) \cdot u(1)$$

Graphically, note that the EU from the lottery with fair odds (equally likely outcomes) lies below the lottery in which the winning probability has been increased by π.12
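Since the definition is linear in π, the probability premium can be solved for directly: π = [u(x) − ½u(x+ε) − ½u(x−ε)] / [u(x+ε) − u(x−ε)]. A minimal sketch for the on-going example (x=2, ε=1), assuming as an illustration the square-root utility, is:

```python
import math


def probability_premium(u, x, eps):
    """Solve u(x) = (1/2 + pi) u(x + eps) + (1/2 - pi) u(x - eps) for pi."""
    fair_odds_eu = 0.5 * u(x + eps) + 0.5 * u(x - eps)
    return (u(x) - fair_odds_eu) / (u(x + eps) - u(x - eps))


pi_averse = probability_premium(math.sqrt, 2.0, 1.0)
pi_neutral = probability_premium(lambda x: x, 2.0, 1.0)
print(pi_averse, pi_neutral)
```

The premium is positive for the concave utility and zero for the linear one, in line with the intuition above.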

12 The probability premium of a lottery is referred to as the “insurance premium” by NS. For examples on the probability premium, see the exercises on the handout of the review session and example 7.2 in NS (pp. 209-210).



Figure 6.25

Given the above different measures of risk aversion, we next establish a connection between them. In particular, the following properties are equivalent:

1. The decision maker is risk averse.

2. The utility function is concave, u’’(x)≤0.

3. The certainty equivalent is lower than the expected value of the lottery, i.e., c(F,u) ≤ ∫x dF(x), where ∫x dF(x) denotes the expected value of lottery F(x).

4. The risk premium is positive, i.e., RP=EV-c(F,u)≥0, which simply implies EV≥c(F,u).

5. The probability premium is positive for all x and ε, i.e., π(x,ε,u)≥0.

Example. The following example examines an individual’s decision about how much insurance to acquire. Consider a risk averse individual with utility function u(.) and wealth w. In the case that no loss occurs (which happens with probability 1-π), his utility is given by u(w-αq), where αq denotes the amount of money he spends on α units of insurance at a price of q per unit. If a loss occurs (which happens with probability π), his utility is now given by u(w-αq-D+α) where D denotes the dollar amount of the loss he suffers and α represents that the insurance company gives him $1 per unit of insurance bought. Hence, this decision maker’s expected utility maximization problem becomes

$$\max_{\alpha \ge 0}\ (1-\pi)\,u(w - \alpha q) + \pi\, u(w - \alpha q - D + \alpha)$$

where α is this individual’s only choice variable (the number of units of insurance he buys). Taking first order conditions with respect to α we obtain

$$-q(1-\pi)\,u'(w - \alpha^* q) + \pi(1-q)\,u'(w - \alpha^* q - D + \alpha^*) \le 0$$

When the FOC is satisfied with equality (at an interior optimum) we have



$$-q(1-\pi)\,u'(w - \alpha^* q) = \pi(q-1)\,u'(w - \alpha^* q - D + \alpha^*)$$

$$(-q + q\pi)\,u'(w - \alpha^* q) = (\pi q - \pi)\,u'(w - \alpha^* q - D + \alpha^*)$$

Now, assuming that q=π (and hence the insurance is actuarially fair, since the price of every unit of insurance is equal to the probability of a loss), then

$$(-\pi + \pi^2)\,u'(w - \alpha^* \pi) = (\pi^2 - \pi)\,u'(w - \alpha^* \pi - D + \alpha^*)$$

$$u'(w - \alpha^* \pi) = u'(w - \alpha^* \pi - D + \alpha^*)$$

and since u’(.) is strictly decreasing (by concavity), we obtain

$$w - \alpha^* \pi = w - \alpha^* \pi - D + \alpha^*$$

and rearranging α*=D. Thus, if insurance is actuarially fair, the decision maker insures completely, i.e., he acquires a number of units of insurance that are exactly equal to the loss he can suffer.13
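The full-insurance result can be verified numerically by solving the first order condition. The sketch below is an illustration under assumed values: logarithmic utility (so u'(x)=1/x), wealth w=100, loss D=50 and loss probability π=0.2. It finds the root of the FOC by bisection, which relies on the FOC being decreasing in α when u is concave; the function names are hypothetical.

```python
def foc(alpha, w, D, pi, q, u_prime):
    """Left-hand side of the first order condition with respect to alpha."""
    no_loss = u_prime(w - alpha * q)
    loss = u_prime(w - alpha * q - D + alpha)
    return -q * (1 - pi) * no_loss + pi * (1 - q) * loss


def optimal_insurance(w, D, pi, q, u_prime, upper, iterations=200):
    """Units of insurance that solve the FOC on [0, upper], or 0 at a corner."""
    if foc(0.0, w, D, pi, q, u_prime) <= 0:
        return 0.0  # corner solution: no insurance is bought
    low, high = 0.0, upper
    for _ in range(iterations):
        mid = (low + high) / 2
        if foc(mid, w, D, pi, q, u_prime) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2


def log_marginal_utility(x):
    """u'(x) for the assumed utility u(x) = ln(x)."""
    return 1.0 / x


# Assumed parameters: w = 100, D = 50, pi = 0.2.
fair = optimal_insurance(100.0, 50.0, 0.2, 0.2, log_marginal_utility, upper=100.0)
unfair = optimal_insurance(100.0, 50.0, 0.2, 0.25, log_marginal_utility, upper=100.0)
print(fair, unfair)
```

With the actuarially fair price q=π the solution is α*=D=50, while with the higher price q=0.25 the individual buys less than full insurance.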

Arrow-Pratt coefficients of absolute and relative risk aversion. In this subsection we examine other forms of measuring risk aversion. In particular, focusing on the connection between risk aversion and the concavity of a decision maker’s utility function, we next present the Arrow-Pratt coefficient of absolute risk aversion, rA(x).

$$r_A(x) = -\frac{u''(x)}{u'(x)}$$

Clearly, the greater the curvature of the utility function, u’’(x), the larger the coefficient rA(x). Despite being interested in the curvature of the utility function (as described by u’’(x)), we cannot simply use u’’(x) to measure an individual’s risk aversion. In particular, such a measure is not invariant to positive linear transformations of the utility function. For instance, if v(x)=βu(x) with β>0, then v’’(x)=βu’’(x) (which is not invariant to the linear transformation), whereas the coefficient rA(x) is invariant since

$$r_A(x,v) = -\frac{v''(x)}{v'(x)} = -\frac{\beta u''(x)}{\beta u'(x)} = -\frac{u''(x)}{u'(x)} = r_A(x,u)$$

Example. Take a utility function u(x) = −e^(−ax) where a>0. Then, the Arrow-Pratt coefficient of absolute risk aversion, rA(x), is

$$r_A(x) = -\frac{-a^2 e^{-ax}}{a\, e^{-ax}} = a \quad\text{for all } x$$

where rA(x) is constant in the individual’s wealth level, x. The literature refers to this utility function as the Constant Absolute Risk Aversion (CARA) utility function. ■

13 If insurance is not actuarially fair, i.e., q>π, then a different result follows. See homework assignment.



If, instead, coefficient rA(x) decreases as we increase wealth x, we say that such utility function satisfies decreasing absolute risk aversion, i.e., ∂rA(x)/∂x < 0. Intuitively, this implies that wealthier individuals are willing to bear more risk than poorer individuals. Note, however, that this is not due to different utility functions between these two groups of people, but rather, because the same utility function is evaluated at higher/lower wealth levels.

The following coefficient is unaffected by the wealth level at which risk aversion is evaluated. In particular, the coefficient of relative risk aversion can be expressed as follows.

$$r_R(x) = -x \cdot \frac{u''(x)}{u'(x)}, \quad\text{that is,}\quad r_R(x) = x \cdot r_A(x)$$

and therefore

$$\frac{\partial r_R(x)}{\partial x} = r_A(x) + x \cdot \frac{\partial r_A(x)}{\partial x}$$

And the utility function for which the coefficient of relative risk aversion is constant is commonly referred to as the Constant Relative Risk Aversion (CRRA) utility function, U(x)=x^b. (It is easy to check that rR(x)=1−b for this utility function, since U’(x)=bx^(b−1) and U’’(x)=b(b−1)x^(b−2).)
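Both coefficients can be approximated numerically with finite differences, which gives a quick check of the CARA and CRRA cases. This is a rough sketch: the step size h, the parameter values a=0.5 and b=0.3, and the wealth levels are assumptions chosen for illustration. For u(x)=x^b the computation returns a relative risk aversion of 1−b.

```python
import math


def arrow_pratt(u, x, h=1e-3):
    """Approximate (r_A(x), r_R(x)) using central finite differences."""
    first = (u(x + h) - u(x - h)) / (2 * h)
    second = (u(x + h) - 2 * u(x) + u(x - h)) / h ** 2
    r_absolute = -second / first
    return r_absolute, x * r_absolute


def cara(x, a=0.5):
    """u(x) = -exp(-a x), with the assumed value a = 0.5."""
    return -math.exp(-a * x)


def crra(x, b=0.3):
    """u(x) = x^b, with the assumed value b = 0.3."""
    return x ** b


for wealth in (1.0, 2.0, 5.0):
    print(wealth, arrow_pratt(cara, wealth), arrow_pratt(crra, wealth))
```

The absolute coefficient of the CARA function stays at a for every wealth level, whereas for the CRRA function the absolute coefficient falls with wealth and the relative coefficient stays constant.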

Finally, let us now establish equivalences between the above measures of risk aversion. For two utility functions u1 and u2, where u2 is a concave transformation of u1 (i.e., u2 is more concave than u1), we have that:

1. The coefficient of absolute risk aversion for the more concave utility function is higher, i.e., rA(x,u2)≥ rA(x,u1).

2. There exists an increasing concave function φ(.) such that u2(x)= φ(u1(x)) at all x. That is, u2(.) is a concave transformation of u1(.), i.e., u2(.) is a more concave function than u1(.).

3. The certainty equivalent that the decision maker with utility function u2(.) is willing to accept in order to avoid the lottery is lower than that of the decision maker with utility function u1(.), i.e., c(F,u2)≤c(F,u1) for any lottery F(.).

4. The probability premium that the individual with utility function u2(.) needs in order to accept playing lottery F(.) is higher than that of the individual with utility function u1(.), i.e., π(x,ε,u2)≥ π(x,ε,u1).

5. Whenever u2(.) finds a lottery F(.) at least as good as a riskless outcome xbar, then u1(.) also finds such lottery F(.) at least as good as xbar. That is

$$\int u_2(x)\,dF(x) \ge u_2(\bar{x}) \quad\text{implies}\quad \int u_1(x)\,dF(x) \ge u_1(\bar{x})$$

The following figure summarizes some of the above results. First, note that u1(.) and u2(.) are evaluated at the same wealth level x. Then, we map outcomes $1 and $3 into u1(.) and into u2(.), separately. Connecting u1(1) and u1(3) we obtain the expected utility of playing the lottery for individual 1, and similarly for individual 2. Note that EU1>EU2. We then find the certainty equivalent for each individual, i.e., the amount of money that provides each individual with the same utility as what he expects to obtain if actually playing the lottery. As the figure depicts, the certainty equivalent that individual 2 is willing to accept in order to avoid playing the lottery is lower than that of individual 1, reflecting that individual 2 is more risk averse than individual 1.



Figure 6.26
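The ranking of certainty equivalents can be illustrated with two specific utility functions. As an assumption for this sketch, take u1(x)=√x and u2(x)=ln(x); since ln(x) = 2·ln(√x), u2 is an increasing concave transformation of u1. Their absolute risk aversion coefficients are 1/(2x) and 1/x, respectively.

```python
import math

LOTTERY = [(1.0, 0.5), (3.0, 0.5)]


def certainty_equivalent(lottery, u, u_inverse):
    """c(F,u) = u^{-1}(E[u(x)]) when the inverse of u is known in closed form."""
    eu = sum(p * u(x) for x, p in lottery)
    return u_inverse(eu)


c1 = certainty_equivalent(LOTTERY, math.sqrt, lambda v: v ** 2)  # individual 1
c2 = certainty_equivalent(LOTTERY, math.log, math.exp)           # individual 2
print(c1, c2)
```

Individual 2, with the more concave utility function, has the lower certainty equivalent, and both lie below the expected value of $2.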

Comparison of payoff functions

In previous sections we analyzed risk preferences for a given lottery and described different measures of riskiness. In this section we examine different distributions of payoffs, and how some might be more attractive than others. Specifically, we will use two main evaluation criteria:

1. If a lottery F(.) yields unambiguously higher returns than G(.), the first lottery seems more attractive than the second lottery. We will explore this idea through the definition of first-order stochastic dominance (FOSD). This concept is connected with the mean of the lottery: if F(.) FOSD G(.), then the mean of F(.) is at least as high as that of G(.), although a higher mean alone does not imply FOSD.

2. If, however, two lotteries F(.) and G(.) have the same mean, but lottery F(.) is unambiguously less risky than G(.), i.e., it is distributed over a smaller support, then we can anticipate that lottery F(.) would be preferred to lottery G(.). In this case, the concept developed to rank lotteries is related to the variance of a lottery, and we will explore it in the definition of second-order stochastic dominance (SOSD).

FOSD. The distribution of monetary payoffs in lottery F(.) first-order stochastically dominates (FOSD) the distribution of monetary payoffs in lottery G(.) if and only if

1-F(x)≥1-G(x) for every payoff x, or alternatively F(x)≤G(x)



First, note that for a given lottery F(.), 1-F(x) intuitively represents the probability of obtaining prizes above x. Hence, the above condition for FOSD implies that, at any given outcome x, the probability of obtaining prizes above x is higher with lottery F(.) than with lottery G(.). This intuition is graphically represented in the following figure, where for a given outcome xbar, F(xbar)≤G(xbar), or alternatively 1-F(xbar)≥1-G(xbar). Graphically, this implies that the cdf of lottery F(.) lies below that of G(.). Indeed, the probability weight that lottery F(.) assigns to high monetary outcomes is larger than that of lottery G(.).

Figure 6.27

Let us now examine an example with lotteries over discrete outcomes (the above examples of lotteries F(.) and G(.) considered continuous cdfs). In the following figure, we consider lottery G(.), which assigns half probability to the monetary outcome $1 and half to outcome $4. Lottery F(.), in contrast, shifts the probability weight lying in outcome $1 towards outcomes $2 and $3 equally (with a probability of ¼ each), while the probability weight in outcome $4 is shifted to $5. The total probability weight is kept unaltered.

Figure 6.28

The following figure illustrates these two lotteries, which provides a visual comparison of their cdfs. In particular, we can easily check that F(.) lies below lottery G(.), and therefore F(.) FOSD G(.).



Figure 6.29
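The comparison of the two cdfs can be automated. The sketch below encodes the two discrete lotteries under the assumption that the whole probability weight of outcome $4 moves to $5 in lottery F(.), and checks F(x)≤G(x) at every jump point, which is sufficient because both cdfs are step functions.

```python
from fractions import Fraction as Fr

# Discrete lotteries as {outcome: probability}.
G = {1: Fr(1, 2), 4: Fr(1, 2)}
F = {2: Fr(1, 4), 3: Fr(1, 4), 5: Fr(1, 2)}  # assumed reading of the example


def cdf(lottery, x):
    """Probability that the realized payoff is less than or equal to x."""
    return sum(p for outcome, p in lottery.items() if outcome <= x)


def fosd(f, g):
    """True if lottery f first-order stochastically dominates lottery g."""
    points = sorted(set(f) | set(g))
    return all(cdf(f, x) <= cdf(g, x) for x in points)


print(fosd(F, G), fosd(G, F))
```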

The previous example with discrete probability distributions suggests that an “upward probabilistic shift” (such as the one described from lottery G(.) to lottery F(.)) produces a new cdf that FOSD the original cdf. Generally, if we take any outcome x, and add an amount z, where z is distributed according to a cdf Hx(.), with Hx(0)=0, then

$$\int u(x+z)\,dH_x(z) \ge u(x)$$

for every nondecreasing utility function u(x), since the distribution generates a final return of at least x with probability one. (Recall the previous example.) Applying this shift to every outcome x of lottery G(.), we obtain

$$\int u(x)\,dF(x) = \int \left[\int u(x+z)\,dH_x(z)\right] dG(x) \ge \int u(x)\,dG(x)$$

Intuitively, note that the above condition simply states that lottery F(.) generates a higher expected utility than lottery G(.), where F(.) is simply the “upward probabilistic shift” that function Hx(.) produces in the original cdf G(.).

SOSD. We now focus on the dispersion of monetary outcomes in a lottery, as opposed to the higher/lower returns that FOSD analyzes. To focus on the dispersion of the lottery only, we assume that lotteries F(.) and G(.) both have the same mean (i.e., the same expected outcome). We then say that lottery F(.) SOSD G(.) if, for every nondecreasing concave utility function u(x), u: ℝ→ℝ (mapping certain monetary outcomes into utility levels), we have that

$$\int u(x)\,dF(x) \ge \int u(x)\,dG(x)$$

That is, lottery F(.) SOSD G(.) if the former generates a larger expected utility than the latter, where both of them yield the same mean.

Example 1: Mean Preserving Spread. Let us first consider lottery F(.), which assigns an equal probability to outcomes $2 and $3 occurring. Then we spread the probability weight of these two outcomes over the probability of these and other outcomes. In particular, we spread the probability weight of $2 (1/2) over outcomes $1 and $2 equally (1/4 each). Similarly, we spread the probability weight of $3 (1/2) over outcomes $3 and $4 equally (1/4 each), obtaining lottery G(.). First, note that the expected value of both lotteries coincides, being 5/2 for both F(.) and G(.). Hence, the mean is preserved across lotteries. However, lottery G(.) spreads the probability weight of lottery F(.) over a larger set of outcomes.

Figure 6.30

We can conclude that lottery F(.) SOSD G(.) since they both have the same mean, but the former concentrates its probability weight over a smaller support, i.e., F(.) is less dispersed than G(.). Note, however, that neither lottery FOSD the other. Indeed, as the following figure indicates, F(.) is not above G(.) for all x, or below G(.) for all x.

Figure 6.31
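The mean preserving spread example can be checked numerically. The sketch below relies on a characterization that is not stated in these notes and is therefore an assumption of the example: for two lotteries with the same mean, F(.) SOSD G(.) if and only if the area under the cdf of G(.) up to x is at least as large as the area under the cdf of F(.), for every x. For discrete lotteries that area equals E[max(x − X, 0)], and it suffices to compare it at the outcomes of both lotteries.

```python
from fractions import Fraction as Fr

# Lotteries of Example 1 as {outcome: probability}.
F = {2: Fr(1, 2), 3: Fr(1, 2)}
G = {1: Fr(1, 4), 2: Fr(1, 4), 3: Fr(1, 4), 4: Fr(1, 4)}


def mean(lottery):
    return sum(p * x for x, p in lottery.items())


def integrated_cdf(lottery, x):
    """Area under the cdf up to x, which equals E[max(x - X, 0)]."""
    return sum(p * (x - outcome) for outcome, p in lottery.items() if outcome <= x)


def sosd(f, g):
    """True if f second-order stochastically dominates g (equal means required)."""
    if mean(f) != mean(g):
        return False
    points = sorted(set(f) | set(g))
    return all(integrated_cdf(g, x) >= integrated_cdf(f, x) for x in points)


print(mean(F), mean(G), sosd(F, G), sosd(G, F))
```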

Example 2: Elementary Increase in Risk. We say that a cdf G(.) is an Elementary Increase in Risk (EIR) of another cdf F(.) if G(.) takes all the probability weight of an interval [x’,x’’] and transfers it to the end points of this interval, x’ and x’’, such that the mean is preserved. Hence, both cdfs F(.) and G(.) maintain the same mean but G(.) concentrates more probability at the end points of the interval [x’,x’’] than the original distribution F(.). The following figure illustrates an EIR.



Figure 6.32

Note that an EIR is a mean preserving spread (MPS), but the converse is not necessarily true.14 In the above example, F(.) and G(.) share the same mean but F(.) is less dispersed than G(.). As a consequence, lottery F(.) SOSD G(.).15

For exercises related to FOSD and SOSD, see MWG 6.D.2 and 6.D.3.

State-dependent utility

In all our previous discussions the decision maker only cared about the payoff arising from every outcome of the lottery. In this section, we assume that the decision maker cares not only about his monetary outcomes, but also about the state of nature that causes every outcome. Intuitively, this implies that, for a given outcome x, the decision maker might experience a different utility if such outcome originates from state of nature 1 occurring than from state of nature 2. In the following subsections, we will first discuss how we can describe uncertainty using states of nature paralleling outcomes from our previous discussions. Secondly, we will analyze how these state-dependent preferences can be used to obtain an “extended” expected utility representation.

Using states of nature to represent utility

Let us now assume that each of the possible monetary payoffs in a lottery is generated by an underlying cause (an underlying state of nature). Let’s consider two different examples:

1. The monetary payoff of an insurance policy is generated by a car accident. In this case, state of nature={car accident, no car accident}.

2. The monetary payoff of a corporate stock is generated by the state of the economy. In particular, state of nature={economic growth, economic depression}.

14 This would be the case if the MPS shifts some probability weight towards points away from interval [x’,x’’], satisfying the definition of a MPS but not that of an EIR.

15 Note that, similarly to the above example, we cannot determine whether lottery F(.) FOSD G(.) since neither of them lies above or below the other for all monetary outcomes x.



Generally, we denote every state of nature as s∈S, where S is a finite set containing all states of nature. Every state s has a well-defined probability of occurrence πs≥0. Finally, a random variable is a function g:S→ℝ+ that maps states of nature in S into monetary payoffs. Let us extend our previous examples.

1. Car accident: the random variable assigns a monetary value to the state of nature “car accident” (e.g., -$1,000, with probability πacc) and to the state of nature “no accident” (e.g., -$100 where the driver only pays his insurance premium, with probability 1-πacc).

State of nature      Monetary payoff
car accident         Deductible – premium
no car accident      Premium (-)

2. Corporate stock: the random variable assigns a monetary value to the state of nature “economic growth” (e.g., $250 in increased value of the shares, with probability πgrowth), and to the state of nature “economic depression” (e.g., -$125 in decreased value of the shares, with probability 1-πgrowth).

Probability     State of nature        Monetary payoff
πgrowth         economic growth        Dividends, higher price of shares
πdepression     economic depression    No dividends, loss if we sell shares

Every random variable g(.) can be used to represent the monetary lottery F(.). In particular,

$$F(x) = \sum_{\{s:\, g(s)\le x\}} \pi_s$$

where {s : g(s)≤x} represents all those states of nature s for which the monetary payoff arising from them, g(s), is lower than or equal to a particular monetary payoff x.16 Hence, the random variable g(.) generates a monetary payoff for every state of nature s∈S and, since set S is finite, we can represent this list of monetary payoffs as

$$(x_1, x_2, \ldots, x_S) \in \mathbb{R}^S_+$$

16 For an example, think of stocks: F($200) represents the cumulative probability of obtaining a payoff equal to or lower than $200 from the stock.

where xs is the monetary payoff corresponding to state of nature s. The following figure provides an example of a random variable g(.). Specifically, outcomes are ordered from lower to higher monetary payoffs, i.e., x4≥x3≥x2≥x1. In addition, outcome 1 occurs with probability 50%, outcomes 2 and 4 occur with probability 25% each, while outcome 3 receives zero probability.
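As a minimal sketch of how a random variable over states generates a cdf, the snippet below adds up the probabilities of all states whose payoff does not exceed x. The four payoff values are hypothetical (the notes only order them as x1≤x2≤x3≤x4), while the probabilities are those of the four-outcome example described in the text; the function name is also an assumption.

```python
def cdf_from_states(payoffs, probs, x):
    """F(x): sum of the probabilities pi_s of all states s with g(s) <= x."""
    return sum(p for g_s, p in zip(payoffs, probs) if g_s <= x)


# Hypothetical payoffs x1 <= x2 <= x3 <= x4, one per state of nature.
payoffs = [10.0, 20.0, 30.0, 40.0]
# Probabilities from the example: 50%, 25%, 0% and 25%.
probs = [0.5, 0.25, 0.0, 0.25]

cdf_values = [cdf_from_states(payoffs, probs, x) for x in payoffs]
print(cdf_values)  # [0.5, 0.75, 0.75, 1.0]
```

Note that the cdf only records cumulative probabilities: two states with the same payoff would be indistinguishable once we move from g(.) to F(.).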



Figure 6.33

We can hence express the cumulative probability of every outcome as follows

$$\begin{aligned}
F(x_1) &= \pi_1 = \tfrac{1}{2} \quad \text{since } \nexists \text{ states with } g(s) < x_1 \\
F(x_2) &= \pi_1 + \pi_2 = \tfrac{1}{2} + \tfrac{1}{4} = \tfrac{3}{4} \\
F(x_3) &= \pi_1 + \pi_2 + \pi_3 = \tfrac{1}{2} + \tfrac{1}{4} + 0 = \tfrac{3}{4} \\
F(x_4) &= \pi_1 + \pi_2 + \pi_3 + \pi_4 = 1
\end{aligned}$$

This example reveals one disadvantage of using F(x). In particular, for a given outcome x, we cannot keep track of which states of nature generated x.

“Extended” expected utility representation

We can now define a preference relation ≿ over lists of monetary payoffs (x1,x2,…,xS)∈ℝ^S_+. It is important to note the similarity of this setting with that in consumer theory. Indeed, in that context we described preferences over bundles, while now we describe preferences over lists of monetary payoffs. Since the list of monetary payoffs (x1,x2,…,xS) specifies one payoff for each state of nature (one for each contingency), this list is usually referred to as “contingent commodities.”

We now expand our previous EU representation to this state-dependent utility. In particular, we say that a preference relation ≿ has an Extended EU representation if, for every state of nature s, there is a utility function us: ℝ+→ℝ (mapping the monetary outcome in state s, xs, into a utility value us(xs) in ℝ), such that for any two lists of monetary outcomes

$$(x_1, x_2, \ldots, x_S) \in \mathbb{R}^S_+ \quad \text{and} \quad (x_1', x_2', \ldots, x_S') \in \mathbb{R}^S_+$$

$$(x_1, x_2, \ldots, x_S) \succsim (x_1', x_2', \ldots, x_S') \quad \text{iff} \quad \sum_s \pi_s u_s(x_s) \ge \sum_s \pi_s u_s(x_s')$$

Interestingly, note that the main difference with previous sections is that now the Bernoulli utility function is state-dependent, us(.), whereas in the previous sections it was state-independent, u(.).
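The Extended EU comparison can be sketched numerically as follows. The two states, their probabilities, the utility functions u1 and u2 and the two lists of contingent payoffs are hypothetical values chosen for illustration only. The sketch shows that two lists that are indifferent under a state-independent utility function can be strictly ranked once utility depends on the state.

```python
import math


def extended_expected_utility(probs, utilities, payoffs):
    """Sum over states s of pi_s * u_s(x_s)."""
    return sum(p * u(x) for p, u, x in zip(probs, utilities, payoffs))


probs = [0.5, 0.5]

# Hypothetical state-dependent utilities: money is valued twice as much in state 1.
state_dependent = [lambda x: 2.0 * math.sqrt(x), math.sqrt]
# State-independent benchmark: the same utility function in both states.
state_independent = [math.sqrt, math.sqrt]

list_a = [16.0, 4.0]   # high payoff in state 1
list_b = [4.0, 16.0]   # high payoff in state 2

eu_a_dep = extended_expected_utility(probs, state_dependent, list_a)
eu_b_dep = extended_expected_utility(probs, state_dependent, list_b)
eu_a_ind = extended_expected_utility(probs, state_independent, list_a)
eu_b_ind = extended_expected_utility(probs, state_independent, list_b)

print("state-dependent:  ", eu_a_dep, eu_b_dep)   # 5.0 4.0
print("state-independent:", eu_a_ind, eu_b_ind)   # 3.0 3.0
```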



Let us next provide a graphical representation of a decision maker’s state-dependent preferences. First, we depict the monetary outcomes arising in states of nature 1 and 2 (x1 and x2, respectively) on the horizontal and vertical axes. In addition, note that along the “certainty line” the decision maker receives the same monetary amount regardless of the state of nature, i.e., x1=x2 (45-degree line). Second, all the (x1,x2) pairs on a given indifference curve must satisfy

$$\pi_1 \cdot u_1(x_1) + \pi_2 \cdot u_2(x_2) = \bar{U}$$

Third, note that the upper contour set of an indifference curve that passes through point (x1bar,x2bar) is the set of pairs (x1,x2) satisfying

$$\pi_1 \cdot u_1(x_1) + \pi_2 u_2(x_2) \ge \pi_1 \cdot u_1(\bar{x}_1) + \pi_2 \cdot u_2(\bar{x}_2)$$

or more generally,

$$\sum_s \pi_s u_s(x_s) \ge \sum_s \pi_s u_s(\bar{x}_s)$$

Figure 6.34

Furthermore, note that movements along a given indifference curve do not change the decision maker’s utility level. Hence, totally differentiating (as we did in order to find the MRS in consumer theory), we obtain

$$\pi_1 \cdot \frac{\partial u_1(\bar{x}_1)}{\partial x_1}\, dx_1 + \pi_2 \cdot \frac{\partial u_2(\bar{x}_2)}{\partial x_2}\, dx_2 = 0$$

and rearranging,

$$\frac{dx_2}{dx_1} = -\frac{\pi_1 \cdot \partial u_1(\bar{x}_1)/\partial x_1}{\pi_2 \cdot \partial u_2(\bar{x}_2)/\partial x_2} = -\frac{\pi_1 \cdot u_1'(\bar{x}_1)}{\pi_2 \cdot u_2'(\bar{x}_2)}$$

which represents the slope of the indifference curve, evaluated at point (x1bar,x2bar). Finally, note that if the Bernoulli utility function were state-independent, i.e., u1()=u2()=…=uS(), then the slope of the indifference curve at the certainty line (where x1bar=x2bar, so marginal utilities cancel out) would be

$$\frac{dx_2}{dx_1} = -\frac{\pi_1}{\pi_2}$$

Example. Insurance with state-dependent utility. Starting from an initial situation without insurance, the pair of monetary outcomes for a particular individual with wealth level w is (w, w-D), where D represents the loss he suffers from a certain accident. After purchasing insurance, the decision-maker gets a payment z1 in state 1 (no accident) and a payment z2 in state 2 (accident). That is, the pair of monetary payoffs becomes (w+z1, w-D+z2). Moreover, if the policy is actuarially fair, its expected payoff is zero

$$\pi_1 z_1 + \pi_2 z_2 = 0$$

Figure 6.35

First, note that insurance allows this individual to consume at any point along his budget line. In addition, note that the slope of the budget line is -π1/π2, which coincides with the slope of the decision maker’s indifference curve at the certainty line x1=x2 when his preferences are state-independent. Therefore, in this case the indifference curve is tangent to the budget line at the certainty line. This implies that this individual would insure completely since his consumption level is completely unaffected by the possibility of suffering an accident: his consumption with/without accident coincides. In the case that the decision maker’s preferences are state-dependent, however, indifference curves are not tangent to the budget line at the certainty line. Instead, the decision-maker prefers a point such as (x1’,x2’) to the certain outcome (xbar,xbar). That is, at (xbar,xbar) he prefers higher payoffs in state 1 than in state 2, since u1’(xbar)>u2’(xbar). Otherwise, he would prefer higher payoffs in state 2 than in state 1. In addition, note that u1’(xbar)>u2’(xbar) implies that u1’(xbar)/u2’(xbar)>1, and

$$-\frac{\pi_1 \cdot u_1'(\bar{x})}{\pi_2 \cdot u_2'(\bar{x})} < -\frac{\pi_1}{\pi_2}$$

so that, at the certainty line, the indifference curve is steeper than the budget line.

Figure 6.36

(For more about the state dependent approach, see NS pp. 216-220, and for extra practice see Examples 7.3 and 7.4 for the CARA and CRRA, respectively, and the “Portfolio problem” in pp. 214-215)

Let us now allow for the possibility that the monetary payoff under state s, xs, is not a certain amount of money, but a random amount with distribution function Fs(.). Hence, the monetary outcomes arising from the S states of nature can be described by a lottery for every state of nature, L=(F1,F2,…,FS). Given this “extended” definition of lotteries to account for state-dependence, we can then rewrite the IA as the “extended” IA, as follows.




The preference relation ≿ over lotteries satisfies the “extended” IA if, for all L, L’ and L’’ and α∈(0,1), we have

$$L \succsim L' \quad \text{if and only if} \quad \alpha L + (1-\alpha) L'' \succsim \alpha L' + (1-\alpha) L''$$

Hence, this “extended” IA is a mere extension of the standard IA to the case of “extended” lotteries L=(F1,F2,…,FS). We can now state the Extended EU Theorem.

Extended EU Theorem. Suppose that a decision maker’s preferences over lotteries satisfy continuity and the extended IA. Then, we can assign a utility function us(.) for money in state s, such that for any two lotteries

$$L = (F_1, F_2, F_3, \ldots, F_S) \quad \text{and} \quad L' = (F_1', F_2', F_3', \ldots, F_S')$$

we have

$$L \succsim L' \quad \text{iff} \quad \sum_s \left( \int u_s(x_s)\, dF_s(x_s) \right) \ge \sum_s \left( \int u_s(x_s)\, dF_s'(x_s) \right)$$

where each integral is the expected utility in state s, E[us(xs)].

The last inequality simply says that the decision maker prefers “extended” lottery L to L’ if the expected utility from L is higher than that from L’. In particular, note that the expected utility from extended lottery L can be expressed as above, since for a given state of nature s, we have payoffs distributed according to the cdf Fs(.).

17 You can find more exercises on lotteries in Rubinstein, Lecture 8.

Subjective probability theory

Suggested additional reading: Varian chapter 11, pages 190-194

In previous sections we assumed that the probabilities of every possible outcome were objective and observable by the decision maker. This might not be the case in settings where, instead, people hold probabilistic beliefs about the likelihood of a certain event. We will refer to these probabilistic beliefs as the individual’s subjective probabilities. Because they are subjective, a natural question is whether we can infer a decision maker’s subjective probabilities from his/her actual behavior. The answer is that we can. For instance, consider a decision maker who prefers a gamble giving him $1 in state 1 and $0 in state 2 to another gamble in which he gets $0 in state 1 and $1 in state 2. If the value of money is the same across states, we can infer that this decision maker is assigning a higher subjective probability to state 1 than to state 2.

In this section we want to extend the EU theorem we described in previous parts of this chapter for objective probabilities to the case of subjective probabilities. Before stating this extension of the EU theorem we must, however, start with some definitions.



Let us start by defining an individual’s preferences over two lotteries in state s. In particular, we define state s preferences ≿s on state s lotteries Fs(.) by saying that an individual prefers lottery Fs(.) to F’s(.), both of them in state s, if and only if the expected utility from lottery Fs(.) is at least as large as that from lottery F’s(.), or

$$F_s(\cdot) \succsim_s F_s'(\cdot) \quad \text{iff} \quad \int u_s(x_s)\, dF_s(x_s) \ge \int u_s(x_s)\, dF_s'(x_s)$$

Hence, the state preferences (≿1, ≿2, …, ≿S) on state lotteries (F1,F2,…,FS) are state uniform if

$$\succsim_s \,=\, \succsim_{s'} \quad \text{for any two states } s \text{ and } s'$$

That is, if for any two states, s and s′, the ranking of lotteries Fs(.) ≿ F’s(.) coincides for any two lotteries Fs(.) and F’s(.).

Alternative interpretation: for any two states, s and s′, the ranking of expected utilities from playing two lotteries Fs(.) and F’s(.) coincides; for example,

$$\int u_s(x_s)\, dF_s(x_s) \ge \int u_s(x_s)\, dF_s'(x_s)$$

holds in both states.

With our above definition of state uniform preferences, the Bernoulli utility functions in states s and s’, us(.) and us’(.), can differ only up to an increasing linear transformation. That is, there is a utility function u(.) such that

$$u_s(\cdot) = \pi_s u(\cdot) + \beta_s$$

for every state s, where πs>0 and βs is a constant (and similarly for us’(.), so that the ranking between the expected utility in state s and s’ is unaffected).

We can now state the extension of the EU theorem to subjective probabilities. Suppose that a preference relation ≿ over lotteries satisfies continuity and the extended IA. Suppose, in addition, that the derived state preferences over lotteries are state uniform. Then, there are subjective probabilities (π1,π2,…,πS)>>0 and a utility function u(.) on certain amounts of money, such that for any two lists of monetary amounts (x1,x2,…,xS) and (x’1,x’2,…,x’S),

$$(x_1, x_2, \ldots, x_S) \succsim (x_1', x_2', \ldots, x_S') \quad \text{iff} \quad \sum_s \pi_s u(x_s) \ge \sum_s \pi_s u(x_s')$$

Intuitively, the last expression says that a decision maker prefers the first list of monetary outcomes to the second if the “subjective” expected utility he obtains from the first is larger than that he obtains from the second. This result is interesting since it allows us to generalize much of our previous methodology to the case of subjective probabilities (i.e., beliefs). Nonetheless, the predictions of the subjective EU theorem are not necessarily satisfied in certain experimental settings. The following example (the so-called Ellsberg paradox) presents a behavioral pattern that violates the subjective EU theorem, paralleling the “anomalies” we described after presenting the IA, namely, the Allais paradox and Machina’s paradox.

Ellsberg paradox: Consider the following game. Your instructor shows up in the classroom with an urn containing 300 balls. He/she informs you that, among the 300 balls, 100 are red, but the remaining 200 can be either blue or green. He/she then presents you with the following two gambles, asking you to choose only one of them. (In all gambles you first insert your hand in the urn without being able to see which ball you extract from the urn.)

Gamble A: $1000 if the ball you extract is red.

Gamble B: $1000 if the ball you extract is blue.

Confronted with these two gambles, many subjects select gamble A, revealing that the objective probability that the ball is red (1/3) must be higher than the subjective probability that they assign to the ball being blue. Now your instructor offers you the choice between two other gambles.

Gamble C: $1000 if the ball you extract is not red.

Gamble D: $1000 if the ball you extract is not blue.

In this case, many subjects choose gamble C, revealing that the objective probability that the ball is not red (2/3) must be higher than the subjective probability they assign to the ball not being blue. However, this choice contradicts the previous choice (of gamble A over B) if the decision maker uses the subjective EU theory described above. First, from gamble A being preferred to B we can infer that

1/3>p(Blue),

where p(Blue) denotes the subjective probability that the individual assigns to the ball being blue. Second, from gamble C being preferred to D, we infer that

2/3>p(Not Blue),

where p(Not Blue) represents the subjective probability that this individual assigns to the ball not being blue. However, from standard probability, we know that p(Blue)=1-p(Not Blue). Hence, the second inequality implies 2/3>1-p(Blue), i.e., p(Blue)>1/3, which contradicts the first inequality, 1/3>p(Blue).
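The contradiction can also be verified by brute force. The sketch below, whose function name and grid size are assumptions made for illustration, scans candidate subjective probabilities p(Blue) between 0 and 2/3 (since 100 of the 300 balls are known to be red) and keeps those that rationalize both observed choices under subjective EU. Exact fractions are used to avoid rounding issues at the boundary value 1/3.

```python
from fractions import Fraction


def beliefs_consistent_with_choices(grid_size=600):
    """Return every p(Blue) on the grid that rationalizes A over B and C over D."""
    consistent = []
    for i in range(grid_size + 1):
        p_blue = Fraction(2 * i, 3 * grid_size)          # p(Blue) in [0, 2/3]
        prefers_a_to_b = Fraction(1, 3) > p_blue         # p(Red) > p(Blue)
        prefers_c_to_d = Fraction(2, 3) > 1 - p_blue     # p(Not Red) > p(Not Blue)
        if prefers_a_to_b and prefers_c_to_d:
            consistent.append(p_blue)
    return consistent


print(beliefs_consistent_with_choices())  # []
```

The empty list confirms that no single subjective probability for the blue ball supports both choices.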



Chapter 7: Monopoly

In this chapter we examine the output and pricing decisions of a firm that holds market power when selling its product to a group of customers, i.e., a monopolist. In addition, we evaluate the welfare effects of monopolies, and describe price discrimination practices often used by monopolists to increase their profits beyond those they obtain when setting a single price for all customers, a practice usually referred to as uniform pricing.

Profit maximizing output under monopoly

Let us start by considering a general demand function x(p), which is continuous and strictly decreasing in p, i.e., x’(p)<0. As in our discussion of perfectly competitive markets, we assume that there is a price pbar<∞ such that x(p)=0 for all p>pbar, as described in the figure below. Intuitively, this guarantees that if market prices are sufficiently high, no consumers buy positive amounts of the good.

Figure 7.1

In addition, consider a general cost function c(q) which is increasing and convex in q. (Recall that convexity guarantees that the first derivative is itself increasing in q, and hence marginal costs are increasing in q.) Hence, the monopolist’s profit maximization problem can be expressed as a choice of a price p such that

$$\max_{p} \; p\, x(p) - c(x(p))$$

Alternatively, taking the inverse demand function p(q)=x^{-1}(q), we can rewrite the monopolist’s problem as follows, where the choice variable of the monopolist is now its output,

$$\max_{q \ge 0} \; p(q)\, q - c(q)$$

Differentiating with respect to q, we obtain

$$p(q^m) + p'(q^m)\, q^m - c'(q^m) \le 0$$

or, equivalently,

$$\underbrace{p(q^m) + p'(q^m)\, q^m}_{\frac{d[p(q)q]}{dq}} \le c'(q^m) \quad \text{with equality if } q^m > 0$$

In addition, we assume that p(0)>c’(0), which graphically implies that the vertical intercept of the demand curve lies above that of the marginal cost curve, as depicted in the figure. This guarantees that an interior optimum exists, and hence our above first-order condition holds with equality. That is,

$$p(q^m) + p'(q^m)\, q^m = c'(q^m)$$

This implies that the monopolist increases production until the point where the marginal revenue from selling an additional unit equals the marginal cost from producing such unit. The following figure illustrates this result.

Figure 7.2

The figure also shows that market demand lies above the marginal revenue curve. Indeed, since p’(qm)<0 (i.e., market prices decrease in total output), marginal revenue must be weakly lower than demand, p(q).1 This result calls for a more elaborate explanation of the economic intuition behind the MR(q).

$$MR = p(q^m) + p'(q^m)\, q^m$$

Intuitively, an increase in output produces two effects on the monopolist’s total revenue. First, it produces a direct (positive) effect, since the monopolist can sell a larger number of units at a price p(qm). Second, however, it produces an indirect (negative) effect. In particular, the increased production implies a movement along the demand curve, lowering prices. Hence, the monopolist must reduce the price it charges not only for the new additional unit it produces (the marginal unit) but also for the units it was already producing (the so-called inframarginal units, i.e., those below the margin). Specifically, this indirect effect is embodied in the second term of the MR(q) expression, which shows the reduction in market price due to the increased production, p’(qm), times all the units the monopolist produces, qm. Since p’(qm)<0, the indirect effect of larger output on total revenue is negative.

1 In addition, note that for q=0 both MR and p(q) coincide. This is also illustrated in the figure, where the vertical intercept of both curves coincides. You can easily check that MR(q)≤p(q) for the case of linear demand p(q)=a-bq, where MR(q)=a-2bq.

Furthermore, since p’(qm)<0 in the above expression of MR(q), we have that p(qm)>c’(qm). In words, the price the monopolist sets is higher than its marginal costs from producing the last unit. In addition, since in perfectly competitive markets p(q*)=c’(q*), we can then conclude that pm>p* and, given that the demand curve is negatively sloped, qm<q*. The above figure compares market prices under monopoly and perfectly competitive industries, i.e., pm>p* , and total output, qm<q*.

The above first-order conditions lead us to conclude that the monopolist increases production until the point in which marginal revenue and marginal costs coincide. In order to show that this is indeed a production level that maximizes (and not minimizes) monopoly profits, we must next check the second-order conditions associated to the above profit maximization problem. Taking the FOCs and differentiating with respect to q again, we obtain

$$\underbrace{p'(q) + p'(q) + p''(q)\, q}_{\frac{dMR}{dq}} - \underbrace{c''(q)}_{\frac{dMC}{dq}} \le 0$$

Or more compactly, this states that the slope of the marginal revenue must be weakly smaller than the slope of the marginal cost function at the profit-maximizing output qm. This is indeed satisfied at the optimum, as the following figure illustrates. In particular, the slope of the marginal cost curve is positive while that of the marginal revenue curve is negative, which implies that the above SOC is satisfied. (Note that this condition holds even if the marginal costs are constant in output, where the slope of the marginal cost is zero). Alternatively, this condition can be understood as that the MR(q) curve crosses the MC(q) curve from above.

Figure 7.3



Let us finally check if this property holds when the marginal cost curve is decreasing in output, as the following figure indicates. In this case, the marginal cost and marginal revenue curves are both decreasing in q, i.e., both slopes are negative. Thus, the SOC imposes the restriction that, at the optimum, the marginal revenue curve must be steeper than the marginal cost curve (or, alternatively, that the MR curve crosses the MC curve from above).

Figure 7.4

The monopolist’s profit-maximizing condition we found in the above FOCs can be alternatively expressed in terms of the mark-up that the monopolist charges over marginal costs. Indeed, taking the MR function MR(q), we have

$$MR = p(q) + p'(q)\, q = p + \frac{\partial p}{\partial q}\, q$$

and, multiplying the second term by p/p,

$$MR = p + p\, \frac{\partial p}{\partial q} \frac{q}{p} = p + \frac{p}{\varepsilon}$$

where ε denotes the price elasticity of demand. Since at the profit-maximizing output the monopolist sets MR(q)=MC(q), we can set the above expression of MR(q) equal to MC(q) to obtain

$$\frac{p - MC}{p} = -\frac{1}{\varepsilon}$$

Intuitively, this index (the so-called Lerner index of market power) says that the price mark-up over marginal cost that a monopolist can charge (as a percentage of price) is a function of the elasticity of demand. In particular, markets with more elastic demand curves have low price mark-ups, while markets with more inelastic demands (for instance, because there are no close substitutes to the product sold by the monopolist) imply substantial price mark-ups. The Lerner index can also be written as

$$p = \frac{MC}{1 + \frac{1}{\varepsilon}}$$



Perloff uses two examples of different elasticities of demand. First, in the case of the heart-burn medicine Prilosec OTC, price-elasticity of demand has been estimated at approximately -1.2. Using the Lerner index, this implies that p=5.88MC, or that the price that this drug company can charge is 5.88 times higher than its marginal cost of production.2 The second example considers designer jeans, with a slightly more elastic demand of -2. In such case, the Lerner index shows that p=2MC, showing a lower price mark-up than in the previous example.
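The pricing rule implied by the Lerner index, p=MC/(1+1/ε), can be evaluated with a short sketch. The function names are assumptions introduced for illustration. Note that an elasticity of exactly -1.2 yields a price of 6 times marginal cost; the 5.88 figure quoted above would correspond to an elasticity of roughly -1.205, so the gap presumably reflects rounding in the reported estimate.

```python
def price_to_marginal_cost_multiple(elasticity):
    """Return p / MC implied by the rule p = MC / (1 + 1/elasticity)."""
    if elasticity >= -1:
        raise ValueError("the rule requires an elastic demand, i.e., elasticity < -1")
    return 1.0 / (1.0 + 1.0 / elasticity)


def lerner_index(price, marginal_cost):
    """Price mark-up over marginal cost as a share of price, (p - MC) / p."""
    return (price - marginal_cost) / price


for eps in (-1.2, -2.0):
    multiple = price_to_marginal_cost_multiple(eps)
    # With MC normalized to 1, the Lerner index must equal -1/elasticity.
    print(eps, round(multiple, 4), round(lerner_index(multiple, 1.0), 4))
```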

Special case 1: Monopoly facing a linear demand curve

Consider a monopolist facing a market with linear inverse demand function p(q)=a-bq, where b>0, implying that the inverse demand curve is negatively sloped. The monopolist’s cost function is c(q)=cq, where c>0. In addition, we usually assume that a>c. Note that this assumption just guarantees an interior solution, since it is the application of condition p(0)>c’(0) to the current case, i.e., p(0)=a-b0=a and c’(q)=c for all q, so c’(0)=c. In this case, note that the objective function for the monopolist (its profit function) becomes

$$\pi = (a - bq)\, q - cq$$

Taking FOCs we obtain

$$a - 2bq - c = 0$$

which identifies a maximum since the SOC (-2b) is indeed negative, implying concavity of the profit function.3 Solving for the optimal qm in the above FOC we obtain

$$q^m = \frac{a - c}{2b}$$

And inserting qm into the inverse demand function p(q), we obtain the monopoly price pm,

$$p^m = a - b \left( \frac{a - c}{2b} \right) = \frac{a + c}{2}$$

Finally, inserting pm and qm into the monopolist’s profit function we can find the monopolist’s profits at the optimum,

$$\pi^m = \underbrace{p^m q^m}_{\text{revenue}} - \underbrace{c\, q^m}_{\text{costs}} = \frac{(a - c)^2}{4b}$$

We can graphically represent our previous results in the following figure. Interestingly, note that for linear demand curves, the MR curve has double the slope of the inverse demand function, i.e., it crosses the horizontal axis at the midpoint between the origin and the horizontal intercept of the inverse demand curve.

2 Note that this data corresponds to the elasticity of demand before Prilosec OTC lost its patent and could also be sold as a generic drug by retailers such as RiteAid, Safeway, etc. As a consequence, we can anticipate that the demand for this drug is now probably more elastic, which reduces the price mark-up.

3 Note that this result is due to the assumption of negatively sloped demand (i.e., b>0). If, instead, demand was positively sloped (as in the case of Giffen goods) and b<0, then the output decision of the monopolist represented in the FOC would not guarantee an output level that maximizes the firm’s profits.

Figure 7.5
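The closed-form results of the linear-demand case can be checked with a brief sketch. The function name and the parameter values a=10, b=1 and c=2 are hypothetical; the competitive output is obtained from the condition p(q*)=c’(q*) used earlier in the chapter, i.e., a-bq*=c.

```python
def linear_monopoly(a, b, c):
    """Monopoly and competitive outcomes for p(q) = a - b*q and c(q) = c*q."""
    if not (b > 0 and a > c > 0):
        raise ValueError("need b > 0 and a > c > 0 for an interior solution")
    q_m = (a - c) / (2 * b)      # from the FOC a - 2bq - c = 0
    p_m = a - b * q_m            # price read off the inverse demand curve
    profit_m = (p_m - c) * q_m   # revenue minus costs
    q_star = (a - c) / b         # competitive output, where p(q*) = c
    return {"q_m": q_m, "p_m": p_m, "profit_m": profit_m, "q_star": q_star, "p_star": c}


result = linear_monopoly(a=10.0, b=1.0, c=2.0)
print(result)
# q_m = 4.0, p_m = 6.0, profit_m = 16.0, q_star = 8.0, p_star = 2.0
```

In this example the monopolist produces half of the competitive output and charges pm=(a+c)/2=6, above the competitive price of 2.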

In addition, note that our previous analysis can be easily extended to the case in which marginal costs are not constant in q, but instead increasing, as the following figure illustrates. Indeed, demand is still linear in output, but the cost function is convex in output, e.g., c(q)=cq^2, implying a marginal cost curve c’(q)=2cq, with a positive slope of 2c as indicated in the figure.

Figure 7.6



Special case 2: Constant elasticity demand function

Another case of interest where we can readily apply the Lerner index is that in which the monopolist faces a constant elasticity demand function q(p)=Ap^{-b}, where -b represents the price-elasticity of demand. (Applying the definition of price-elasticity of demand, it is easy to show that

$$\varepsilon_{q,p} = \frac{\partial q(p)}{\partial p} \frac{p}{q(p)} = -b A p^{-(b+1)} \cdot \frac{p}{A p^{-b}} = -b\, p^{-(b+1)} \cdot p \cdot p^{b} = -b$$

as claimed.)

Hence, using the Lerner index, we can find the monopoly price, as follows

$$p^m = \frac{c}{1 + \frac{1}{\varepsilon_{q,p}}} = \frac{c}{1 - \frac{1}{b}}$$

Welfare loss of monopoly

The following figure illustrates the welfare loss associated with monopoly. In particular, note that consumer surplus decreases by areas B and C (i.e., -B-C) when moving from a perfectly competitive market to a monopoly, while the producer surplus changes by B-E. Therefore, area B simply represents a transfer from consumers to producers, whereas areas C and E illustrate a net loss for the economy, since they are not transferred to any other economic agent.



Figure 7.7

Thus, the shaded area C+E denotes the deadweight loss of monopoly. More formally, the deadweight loss is identified by the area below the inverse demand curve, p(q), and above the marginal cost curve, c’(q), lying between the monopoly and the perfectly competitive outputs qm and q*. That is, the deadweight loss is the area represented by the integral

$$DWL = \int_{q^m}^{q^*} \left[ p(s) - c'(s) \right] ds$$
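The deadweight loss integral can be approximated numerically. The sketch below uses a simple midpoint rule and, as an illustrative assumption, a linear inverse demand p(q)=a-bq with constant marginal cost c and parameter values a=10, b=1, c=2; the function name is also hypothetical. For this case the integral has the closed form (a-c)^2/(8b), which provides a check.

```python
def deadweight_loss(inverse_demand, marginal_cost, q_m, q_star, steps=10_000):
    """Midpoint-rule approximation of the integral of p(s) - c'(s) from q_m to q_star."""
    width = (q_star - q_m) / steps
    total = 0.0
    for i in range(steps):
        s = q_m + (i + 0.5) * width
        total += (inverse_demand(s) - marginal_cost(s)) * width
    return total


a, b, c = 10.0, 1.0, 2.0            # hypothetical demand and cost parameters
q_m = (a - c) / (2 * b)             # monopoly output
q_star = (a - c) / b                # perfectly competitive output
dwl = deadweight_loss(lambda q: a - b * q, lambda q: c, q_m, q_star)

print(round(dwl, 6), (a - c) ** 2 / (8 * b))  # 8.0 8.0
```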



Interestingly, note that this expression decreases as demand (and/or supply) become more elastic.4

The following two graphs show how monopoly profits and deadweight loss vary with demand elasticity.






Figure 7.8

D2 is more elastic than D1. With the less elastic demand (D1), the monopolist charges a higher price ($50) than with the more elastic demand (D2), where it charges only $35. For both demands, the profit-maximizing quantity is found where MR = MC, with MC1=MC2=MC.

Monopoly profit and social deadweight loss increase as elasticity decreases. The dark red triangle in the following figure shows the deadweight loss due to the difference in elasticity of D1 and D2, and the monopoly profit is $750 (=(50-35)(50)) larger under D1 than under D2 (shown in black).

4 This result resembles that we found when analyzing the size of the deadweight loss associated to the imposition of a sales tax under perfectly competitive markets.









Figure 7.9. Monopoly profits and deadweight loss vary with elasticity of demand (Carlton and Perloff, page 98, adjusted to fit the explanation here). Horizontal axis: Q, with Qm=50 and Qc=100.

If, for instance, demand becomes infinitely elastic, p’(q)=0. In that case, the inverse demand curve becomes totally flat, as the following figure represents. Since p’(q)=0 then MR(q)=p(q)+0*q=p(q). This explains why the figure does not show any marginal revenue curve: it essentially coincides with the inverse demand curve (See recitation #9 Ex. 1).

Figure 7.10

The monopolist’s profit maximizing rule, MR=MC, becomes therefore p(q)=MC(q), which exactly coincides with that of a firm operating in a perfectly competitive industry. As a consequence, qm=q* in the figure, and thus deadweight loss is measured as the area between the inverse demand curve and the marginal cost curve between two coincident output levels, qm and q*, leading to a null deadweight loss of monopoly.

$$DWL = \int_{q^m}^{q^*} \left[ p(s) - c'(s) \right] ds = \int_{q^m}^{q^m} \left[ p(s) - c'(s) \right] ds = 0 \quad \text{since } q^* = q^m$$

$$p(q) = p \text{ for all } q \;\Rightarrow\; TR = p \cdot q \;\Rightarrow\; MR = p$$

Demand and MR curves totally overlap.

Welfare losses and elasticity

Let us next examine more closely the connection between the welfare loss of a monopoly (DWL) and elasticity. For simplicity, let us consider a monopolist with constant marginal and average costs, c, who faces a market demand with constant elasticity, i.e., q(p)=p^e, with e<-1 denoting the price elasticity of demand.5 In the competitive equilibrium, price equals marginal cost, i.e., pc=c, whereas under monopoly, price pm can be found by the Lerner index (or rearranging it in the so-called inverse elasticity pricing rule, IEPR), as follows

$$p^m = \frac{c}{1 + \frac{1}{e}}$$

Given that demand is q(p)=p^e, the consumer surplus associated with any price p0 can be computed as the area above p0 and below that demand curve,

$$CS = \int_{p_0}^{\infty} Q(P)\, dP = \int_{p_0}^{\infty} P^{e}\, dP = \left. \frac{P^{e+1}}{e+1} \right|_{p_0}^{\infty} = -\frac{p_0^{e+1}}{e+1}$$

Therefore, under perfect competition, where p0=c, consumer surplus becomes

$$CS_c = -\frac{c^{e+1}}{e+1}$$

5 Recall that, at the optimum, the monopolist operates in the elastic portion of the demand curve, i.e., price-elasticity is smaller than -1. When the monopolist faces a linear demand curve (where elasticity is different at different points along the curve) we don’t need to impose any further assumptions. In this case, however, price-elasticity is the same along all points of the demand curve. We must hence impose the condition that e<-1.

whereas consumer surplus under monopoly, where pm=c/(1+1/e), becomes

$$CS_m = -\frac{\left( \frac{c}{1 + \frac{1}{e}} \right)^{e+1}}{e+1}$$

Taking the ratio of these two surpluses, CSm/CSc, we obtain a measure of the difference in consumer surplus between monopoly and perfectly competitive markets. In particular, a ratio close to zero implies that the difference between CSm and CSc is extremely large, while a ratio close to one implies that consumers’ welfare is approximately the same in both market structures. In particular, dividing CSm over CSc we obtain

$$\frac{CS_m}{CS_c} = \left( \frac{1}{1 + \frac{1}{e}} \right)^{e+1}$$

For instance, if price-elasticity is -2, this ratio becomes ½, intuitively saying that CS under monopoly is half of that under perfectly competitive markets. The following figure illustrates how this ratio is affected by reductions in the price-elasticity demand. In particular, the ratio becomes closer to zero as demand becomes more inelastic (movements to the left in the figure). Intuitively, as demand becomes more insensitive to price, the monopolist can exercise its market power –charging higher mark-ups over marginal cost— ultimately reducing CSm.

Figure 7.11
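As a quick numerical check of the surplus ratio derived above, the following Python sketch evaluates CSm/CSc directly from the consumer surplus formula and the IEPR. The function names and the marginal cost value c=1 are illustrative assumptions (the ratio does not depend on c); only the formulas come from the notes.

```python
def consumer_surplus(p0, e):
    """Area under q(p) = p**e above price p0; finite only for e < -1."""
    if e >= -1:
        raise ValueError("consumer surplus is only finite for e < -1")
    return -(p0 ** (e + 1)) / (e + 1)


def monopoly_price(c, e):
    """Inverse elasticity pricing rule: pm = c / (1 + 1/e)."""
    return c / (1 + 1 / e)


def cs_ratio(e, c=1.0):
    """CSm / CSc; c is an arbitrary illustrative marginal cost."""
    return consumer_surplus(monopoly_price(c, e), e) / consumer_surplus(c, e)


# Compare the direct computation with the closed form (1/(1+1/e))**(e+1).
for e in (-1.5, -2.0, -5.0):
    closed_form = (1 / (1 + 1 / e)) ** (e + 1)
    print(f"e={e}: CSm/CSc={cs_ratio(e):.3f}, closed form={closed_form:.3f}")
```

For e=-2 the sketch reproduces the ratio of ½ reported in the text; the other elasticities are arbitrary values chosen for illustration.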

Let us now focus on the monopoly profits. Notice that, after the monopolist sets its profit-maximizing output and price, profits are given by

$$\pi_m = p_m Q_m - cQ_m = \left(\frac{c}{1+\frac{1}{e}} - c\right)Q_m = \left(\frac{-\frac{c}{e}}{1+\frac{1}{e}}\right)\cdot\left(\frac{c}{1+\frac{1}{e}}\right)^{e} = -\left(\frac{c}{1+\frac{1}{e}}\right)^{e+1}\cdot\frac{1}{e}$$



Hence, the monopoly profits can be expressed as a function of only two parameters: marginal costs and the price elasticity of demand. In addition, in order to evaluate the transfer of welfare from CS into monopoly profits that consumers experience when moving from a perfectly competitive market to a monopolistic market, we can divide monopoly profits by CSc, as follows

$$\frac{\pi_m}{CS_c} = \left(\frac{e+1}{e}\right)\left(\frac{1}{1+\frac{1}{e}}\right)^{e+1} = \left(\frac{e}{1+e}\right)^{e}$$

Notice that, if price-elasticity is -2, then this ratio is ¼. The following figure illustrates that a more elastic demand (leftward movements) increases the ratio of monopoly profits to the consumer surplus under perfectly competitive markets, i.e., the share of CSc that is transferred to monopoly profits increases, approaching roughly 0.37 as demand becomes infinitely elastic. As e approaches -1, in contrast, this ratio falls toward zero. A good exercise for additional practice: HW#9 Exercise #6.
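The split of the competitive consumer surplus can also be checked numerically. The sketch below is a minimal illustration under the assumptions of this example (demand q(p)=p^e, constant marginal cost c). It treats the deadweight loss as the remainder CSc-CSm-πm, which is valid here because competitive profits are zero when marginal and average costs are constant. Function names and the default c=1 are hypothetical.

```python
def monopoly_outcome(c, e):
    """Price, quantity and profit for demand q(p) = p**e and marginal cost c."""
    price = c / (1 + 1 / e)          # inverse elasticity pricing rule
    quantity = price ** e
    profit = (price - c) * quantity
    return price, quantity, profit


def cs(p0, e):
    """Consumer surplus at price p0 (finite only for e < -1)."""
    return -(p0 ** (e + 1)) / (e + 1)


def welfare_shares(e, c=1.0):
    """Shares of competitive consumer surplus that are kept by consumers,
    transferred to monopoly profits, and lost (the remainder)."""
    price, _, profit = monopoly_outcome(c, e)
    cs_c = cs(c, e)
    kept = cs(price, e) / cs_c
    transferred = profit / cs_c
    return kept, transferred, 1 - kept - transferred


kept, transferred, lost = welfare_shares(-2.0)
print(kept, transferred, lost)
```

With e=-2 the transferred share matches the ¼ stated in the text, and the kept share matches the ½ obtained earlier for CSm/CSc.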

Finally, note that the social costs of monopoly are not only evaluated by the DWL area defined in previous sections. Indeed, there are more social costs of monopoly. Here are some examples of social costs associated to monopolists:

1. R&D expenditure. In some cases, it might be excessive. For instance, in a patent race all (or most) of the R&D expenditure made by the firm that didn’t get the patent is a social cost.

2. Persuasive (not informative) advertising.

3. Lobbying costs (different from bribes). Indeed, note that a bribe cannot be strictly considered a social cost, since it simply implies a transfer of money from the monopolist to politicians. Other types of lobbying costs, such as the time spent by lobbyists trying to convince politicians about the benefits of certain policies (for instance, guaranteeing a legal barrier to entry), can be considered a social cost associated with monopolies.

4. Resources to avoid entry of potential firms in the industry. This occurs, for instance, if the incumbent (monopoly) overinvests in capacity that sits idle afterwards just to guarantee that any entry will be severely fought by flooding the market with products for a few periods.

As the above discussion suggests, some expenditures cannot be strictly considered social costs associated to monopolies. For example, bribes are just a wealth transfer from the monopolist to politicians. Similarly, some forms of R&D (not directly related with patent races) might produce benefits for the firm in the long run, and cannot therefore be considered social costs.

Comparative statics

In this subsection we examine how qm varies as a function of marginal cost. As the following figure indicates, we expect monopoly output to decrease in marginal cost.



Figure 7.12

We know that at the optimum, qm(c),

$$\frac{\partial \pi(q_m(c), c)}{\partial q} = 0$$

Differentiating with respect to c and using the chain rule, we obtain

$$\frac{\partial^2 \pi(q_m(c), c)}{\partial q^2}\,\frac{dq_m(c)}{dc} + \frac{\partial^2 \pi(q_m(c), c)}{\partial q\,\partial c} = 0$$

And solving for dqm(c)/dc, we have

$$\frac{dq_m(c)}{dc} = -\frac{\dfrac{\partial^2 \pi(q_m(c), c)}{\partial q\,\partial c}}{\dfrac{\partial^2 \pi(q_m(c), c)}{\partial q^2}}$$


Note that this expression could be immediately obtained by using the Implicit Function Theorem. In addition, since the monopolist profit function is concave in q, the denominator of the above expression must be negative. This implies that the sign of the dqm(c)/dc –intuitively representing how qm is affected by changes in c— only depends on the sign of the term at the numerator. This conclusion is valid for any demand function, and even more generally, for the case in which we don’t perfectly observe the demand function, but we have information about the cross-derivative represented in the numerator. Let us next apply this rule to a linear demand curve p(q)=a-bq. In this case, the cross-derivative is

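$$\frac{\partial^2 \pi(q_m(c), c)}{\partial q\,\partial c} = \frac{\partial}{\partial c}\left(\frac{\partial\left[(a-bq)q - cq\right]}{\partial q}\right) = \frac{\partial\left[a - 2bq - c\right]}{\partial c} = -1$$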



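Inserting this result into the above expression for dqm(c)/dc we obtain that this derivative must be negative since

$$\frac{dq_m(c)}{dc} = -\frac{\dfrac{\partial^2 \pi(q_m(c), c)}{\partial q\,\partial c}}{\dfrac{\partial^2 \pi(q_m(c), c)}{\partial q^2}} = \frac{1}{\dfrac{\partial^2 \pi(q_m(c), c)}{\partial q^2}} < 0$$

With the linear demand and the cost term cq used in the cross-derivative, the second derivative of profits with respect to q is -2b, so that dqm(c)/dc=-1/(2b),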


which intuitively implies that an increase in marginal costs, c, produces a decrease in monopoly output qm. The following two figures illustrate an increase in marginal costs when the monopolist faces a linear demand function, first for the case in which marginal costs are increasing in q (i.e., total costs are convex in output) and second, for the case in which marginal costs are constant in q. Of course, the sign of the above result does not depend on whether the marginal cost curve is increasing or constant.

Figure 7.13
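The comparative statics result for the linear demand case can be verified numerically. The sketch below compares the Implicit Function Theorem expression with a finite-difference approximation of dqm/dc. It assumes constant marginal cost c, so that qm=(a-c)/(2b) follows from the first order condition a-2bq-c=0; the parameter values are hypothetical.

```python
def monopoly_output(a, b, c):
    """Solves the FOC a - 2*b*q - c = 0 for p(q) = a - b*q and constant MC c."""
    return (a - c) / (2 * b)


def dq_dc_implicit(b):
    """-(cross-derivative) / (second derivative in q) for linear demand."""
    cross_derivative = -1.0          # d/dc of (a - 2bq - c)
    second_derivative = -2.0 * b     # d/dq of (a - 2bq - c)
    return -cross_derivative / second_derivative


# Hypothetical parameter values, used only for illustration.
a, b, c, step = 20.0, 1.0, 2.0, 1e-3
finite_difference = (
    monopoly_output(a, b, c + step) - monopoly_output(a, b, c - step)
) / (2 * step)
print(dq_dc_implicit(b), finite_difference)
```

Both numbers should agree up to rounding error, and both are negative, in line with the sign derived above.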



Multiplant monopolist

In this subsection we briefly analyze the monopolist production decision when it operates more than one plant. For instance, the monopolist produces in different countries (with potentially different efficiency levels, salaries, etc.) but sells its global production to an international market. The monopolist decision about how much total output to produce resembles our previous analysis. Nonetheless, its decision about how to distribute total output among its different plants requires a more detailed discussion.6 In particular, the monopolist produces output q1, q2,…, qN across the N plants it operates, with total costs TCi(qi) at each plant i={1,2,…,N}. Hence, the multiplant monopolist profit-maximization problem becomes

$$\max_{q_1,\ldots,q_N}\ \left[a - b\sum_{i=1}^{N} q_i\right]\sum_{i=1}^{N} q_i - \sum_{i=1}^{N} TC_i(q_i)$$

And taking FOCs with respect to the production level at any individual plant j, qj, we have


$$a - 2b\sum_{i=1}^{N} q_i - MC_j(q_j) = 0 \iff MR(Q) = MC_j(q_j) \quad \text{for all } j$$

Note that the last condition states that the monopolist must increase its production in plant j, qj, until the point in which the marginal cost of producing further units in that plant, MCj(qj) coincides with the marginal revenue that the monopolist obtains by selling this additional unit in the international market, i.e., MR(Q). The following figure illustrates this profit maximizing condition for a multiplant monopolist operating two plants with marginal costs MC1 and MC2.

Figure 7.14

6 For introductory references, see Besanko and Braeutigam (section 11.4) and Shy (section 5.4).



First, note that the monopolist’s total marginal cost curve, MCtotal, is given by the horizontal sum of the marginal cost curves of both plants, i.e., for every level of marginal cost we add the outputs q1 and q2 at which MC1 and MC2 reach that level. Given MCtotal, we can easily find the total production for this monopolist by setting its total marginal costs, MCtotal, equal to its marginal revenue curve, MR, which occurs at point A. Hence, the monopolist produces Qtotal among both plants, and sells this total output at a price pm. At this point, we must distribute the total production Qtotal between the two plants. In order to do that, we first evaluate MR(Qtotal), graphically represented by the height of point A in the figure, and we find the point of the MC1 curve for plant 1 that reaches this same height. This determines the output q1 that plant 1 produces, i.e., q1 solves MR(Qtotal)=MC1(q1). Similarly for plant 2, starting from the height of point A, which depicts MR(Qtotal), we extend a dotted line to the left crossing MC2. This determines q2, i.e., q2 solves MR(Qtotal)=MC2(q2).

In order to find closed-form solutions in the above example, consider that all plants are symmetric and have the cost function TCi(qi)=F+c(qi)^2. Hence, they all produce the same output level, q1=q2=…=qN=q, and the above FOCs become

$$a - 2bNq_j = 2cq_j \quad \text{which implies} \quad q_j = \frac{a}{2(bN+c)}$$


which implies that the total output produced by the monopolist, Q, is

$$Q = Nq_j = \frac{Na}{2(bN+c)}$$


and market price is

$$p = a - bQ = a - b\,\frac{Na}{2(bN+c)} = \frac{a(bN+2c)}{2(bN+c)}$$

As special cases, note that if the monopolist operates a single plant, i.e., N=1, then total output and price coincide with that under standard monopoly models, i.e., Q=qm and p=pm. In addition, note that an increase in the number of plants N decreases the individual production for every plant qj. Furthermore, an increase in N reduces the profits for every individual plant. Finally, we briefly examine an example of a multiplant monopolist operating two asymmetric plants.
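The closed-form solution for the symmetric case can be checked against the first order condition it comes from. The following sketch is only an illustration: the function names and the parameter values (a=120, b=3, c=2) are hypothetical, while the formulas are those of the symmetric multiplant problem with TCi(qi)=F+c(qi)^2.

```python
def symmetric_multiplant(a, b, c, n):
    """Per-plant output, total output and price with N = n identical plants."""
    q = a / (2 * (b * n + c))
    total = n * q
    price = a - b * total
    return q, total, price


def foc_residual(a, b, c, n, q):
    """Left-hand side minus right-hand side of a - 2*b*N*q = 2*c*q."""
    return a - 2 * b * n * q - 2 * c * q


# Hypothetical demand and cost parameters.
a, b, c = 120.0, 3.0, 2.0
for n in (1, 2, 5):
    q, total, price = symmetric_multiplant(a, b, c, n)
    print(n, round(q, 3), round(total, 3), round(price, 3),
          round(foc_residual(a, b, c, n, q), 12))
```

In this illustration per-plant output falls as the number of plants rises, while the FOC residual stays at zero for every N.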

Example. Consider a monopolist facing linear demand p(Q)=120-3Q and operating two plants with marginal costs MC1(q1)=10+20q1 and MC2(q2)=60+5q2, respectively. First, we seek to determine total output Qtotal. In order to obtain Qtotal we need to find MCtotal. However, note that we cannot obtain MCtotal by summing (10+20q1)+(60+5q2). Indeed, this would be a vertical sum (not a horizontal sum) of the marginal cost functions. Hence, we first must invert the marginal cost functions for each plant, obtaining

$$MC_1 = 10 + 20q_1 \iff q_1 = \frac{MC_1}{20} - \frac{1}{2} \quad \text{and similarly} \quad MC_2 = 60 + 5q_2 \iff q_2 = \frac{MC_2}{5} - 12$$

We can then sum q1+q2=Qtotal, evaluating both expressions at the same marginal cost MCtotal, to obtain Qtotal=0.25MCtotal-12.5, or inverting it, MCtotal=50+4Qtotal. Setting MR(Q)=MCtotal(Q), i.e., 120-6Q=50+4Q, we obtain Qtotal=7 and a market price of p=$99. We can now evaluate the marginal revenue at Qtotal, i.e., MR(Qtotal), obtaining 120-6*7=$78. This allows us to set MR(Qtotal)=MC1(q1), or 78=10+20q1, which implies q1=3.4 units. Similarly for plant 2, setting MR(Qtotal)=MC2(q2), or 78=60+5q2, implies q2=3.6 units. As a check, q1+q2=3.4+3.6=7=Qtotal.
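The steps of this example (invert each marginal cost, sum horizontally, equate to marginal revenue, then split output across plants) can be sketched in a few lines of Python. The function below is an illustrative helper, not part of the notes; it assumes linear demand, linear marginal costs, and that both plants are active at the optimum, and it raises an error otherwise.

```python
def solve_two_plant_monopoly(a, b, k1, s1, k2, s2):
    """Demand p(Q) = a - b*Q; plant i has MC_i(q_i) = k_i + s_i*q_i."""
    # Horizontal sum: Q(MC) = (MC - k1)/s1 + (MC - k2)/s2 = slope*MC - intercept
    slope = 1 / s1 + 1 / s2
    intercept = k1 / s1 + k2 / s2
    # Inverting gives MC_total(Q) = (Q + intercept)/slope.
    # Setting MR(Q) = a - 2*b*Q equal to MC_total(Q) and solving for Q:
    total = (a - intercept / slope) / (2 * b + 1 / slope)
    marginal_revenue = a - 2 * b * total
    q1 = (marginal_revenue - k1) / s1
    q2 = (marginal_revenue - k2) / s2
    if q1 < 0 or q2 < 0:
        raise ValueError("one plant would be idle; the horizontal sum used here does not apply")
    price = a - b * total
    return total, price, marginal_revenue, q1, q2


# Values from the example: p(Q) = 120 - 3Q, MC1 = 10 + 20*q1, MC2 = 60 + 5*q2
total, price, mr, q1, q2 = solve_two_plant_monopoly(120, 3, 10, 20, 60, 5)
print(total, price, mr, q1, q2)
```

The plant outputs returned by the function add up to total output, which is a useful consistency check on the allocation.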



Price discrimination

In this section we analyze how monopoly profits can be further increased by setting different prices for purchases of different quantities of the good (or different prices to different customers).7 Intuitively, the monopolist is making positive profits, but could still capture a larger surplus from two segments of customers, as the following figure illustrates. On one hand, the customers buying the product at pm would be willing to pay more for the good, paying prices p>pm. On the other hand, there is a segment of customers who didn’t buy the good at pm, but whose willingness to pay for the good is still higher than the marginal cost of production, i.e., pm>p>c. When setting a uniform price pm for all units, however, the monopolist captures neither of these segments of potential customers. In fact, in order to capture these additional surpluses, the monopolist must abandon uniform pricing and use a form of price discrimination.

Figure 7.15

In particular, we will discuss two types of price discrimination: first (or perfect) price discrimination –where the monopolist charges to every customer his/her maximum willingness to pay for the object— and third degree price discrimination –where the monopolist charges different prices to two or more groups of customers. We do not examine second degree price discrimination –where the monopolist offers a menu (or plan) to customers so that every type of customer self-selects the most convenient menu— since that exposition involves elements of game theory that haven’t been discussed yet.

7 In this section we follow some parts of NS (pp. 503-509) and of Varian (sections 14.5-14.8).



First-degree price discrimination. Under first-degree (or perfect) price discrimination, the monopolist charges a different price to every buyer (i.e., a “personalized” price). The first buyer pays p1 for the q1 units, the second pays p2 for q2-q1 units, and similarly for all other buyers, as the next figure illustrates. Specifically, the monopolist continues doing so until the last buyer is willing to pay exactly the marginal cost of production. (Increasing sales any further would imply losses for the monopolist.)

Figure 7.16

In the limit, the monopolist captures all the area below the demand curve (representing consumers’ willingness to pay) and above marginal cost, as depicted in the following figure.

Figure 7.17

Let us prove the above result in a more formal way. Suppose that the monopolist can offer a combination of a fixed fee, r*, and an amount of the good, q*, that maximizes its profits. This implies choosing (r*,q*) that solve the following PMP


$$\max_{r,\,q}\ r - cq \quad \text{s.t. } u(q) \geq r$$



First, note that the monopolist wants to raise the fee r until u(q)=r. (Otherwise, the monopolist could still increase its profits by further increasing fee r). Hence, we can reduce the set of choice variables (from (r,q) to only q), as follows

maxq u(q)-cq

Taking first order conditions with respect to q, we obtain u’(q*)-c=0, i.e., u’(q*)=c. Intuitively, the monopolist practicing first-degree price discrimination increases output until the marginal utility that consumers obtain from additional units (graphically represented by the inverse demand curve) coincides with the marginal cost of production. Given this level of production q*, we can obtain the optimal fee, r*=u(q*). This result states that the monopolist charges a fee r* that coincides with the utility that the consumer obtains from consuming q* units of output. Both of these results are graphically represented in the following figure where: (1) fee r* is depicted by all the area below p(q) until q* units; and (2) the monopoly profits are therefore r*-cq*, i.e., the area below the demand curve and above marginal cost.

Figure 7.18

Example. Let us next consider a simple example. A monopolist faces inverse demand curve p(q)=20-q and constant marginal costs c=$2. When it practices uniform pricing, setting MR equal to MC, the monopolist produces q=9 units at a price p=$11 with associated profits of $81. These profits are represented by the shaded area in the following figure.



Figure 7.19

If, instead, the monopolist practices first-degree price discrimination, it sets p(q)=MC, producing q=18 units at a price of p=$2, with corresponding profits of $162, graphically represented by the area below the demand curve and above marginal costs in the previous figure. As expected, the practice of first-degree price discrimination increases the monopolist’s profits.8 □
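The numbers in this example can be reproduced with a short Python sketch. It assumes a linear inverse demand p(q)=a-bq and constant marginal cost c, and it takes the fee r*=u(q*) to be the area below the inverse demand curve up to q*, as in the graphical description above. The function names are hypothetical.

```python
def uniform_pricing(a, b, c):
    """MR = a - 2*b*q = c under a single price for all units."""
    q = (a - c) / (2 * b)
    p = a - b * q
    return q, p, (p - c) * q


def first_degree(a, b, c):
    """Output where p(q) = c; fee equals the area under inverse demand up to q."""
    q = (a - c) / b
    fee = a * q - b * q ** 2 / 2
    return q, fee, fee - c * q


# Values from the example: p(q) = 20 - q and c = 2
print(uniform_pricing(20, 1, 2))
print(first_degree(20, 1, 2))
```

The first call returns the uniform-pricing output, price and profit; the second returns the discriminating output, the fee and the resulting profit.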

Summarizing, under first-degree price discrimination, output coincides with that in perfectly competitive markets, where p(q*)=c. Unlike perfectly competitive markets, however, the consumer does not capture any surplus. In contrast, the producer captures now all this surplus. Because this type of price discrimination requires an enormous amount of information, we do not see many examples of it in real applications. Nonetheless, some examples approach this type of price discrimination to a large extent. For instance, financial aid in undergraduate education is often cited as a form of “tuition discrimination” practiced by many US colleges. In particular, application forms ask many details about the student’s (and his/her family) finances in order to determine his/her willingness to pay for higher education. On a lighter note, Coca-Cola tried to apply first-degree price discrimination by installing a thermometer in their vending machines. Specifically, the vending machine increased soda prices according to the temperature, where potential buyers’ willingness to pay was higher on a hot day.9

8 For another example, see Example 14.4 in NS.

9 Coca-Cola’s public image among many customers was damaged by these vending machines, and the company finally decided to take the vending machines away.

Third-degree price discrimination. In this type of price discrimination, the monopolist sells its product to two (or more) different types of customers that are easily identifiable by the monopolist, e.g., youth versus adult customers at the movies (who can be identified by showing a valid ID).10 The monopolist PMP hence becomes

$$\max_{x_1,\,x_2}\ p_1(x_1)x_1 + p_2(x_2)x_2 - cx_1 - cx_2$$

Taking first order conditions with respect to x1 and x2 we obtain,

$$\begin{aligned}
p_1(x_1) + p_1'(x_1)x_1 - c = 0 \ &\Rightarrow\ MR_1 = MC \\
p_2(x_2) + p_2'(x_2)x_2 - c = 0 \ &\Rightarrow\ MR_2 = MC
\end{aligned}$$

Interestingly, these FOCs coincide with those of a regular monopolist who practices uniform pricing in each of two separate markets, i.e., MR1=MC and MR2=MC. The following figure illustrates this idea for the example of adults (market 1) and seniors (market 2) at the movies. In particular, p1(x1)=38-x1 for adults, p2(x2)=14-(1/4)x2 for seniors and MC=$10 for both markets. Indeed, it is easy to check that MR1(x1)=38-2x1, which crosses MC=10 at x1=14 units, implying a price for adults of p1=$24. Similarly for seniors, MR2(x2)=14-0.5x2, which crosses MC=10 at x2=8 units, implying a price for seniors of only p2=$12.

Figure 7.20

Since MRi=MC for every type of customer i, we can rewrite this condition using the IEPR, just as we did for the monopolist practicing uniform pricing in previous sections of this chapter. In particular,

10 Recall that this differentiates this type of price discrimination with that under second-degree, where the monopolist cannot easily identify different groups of customers, and must offer a menu in order to achieve self-separation, i.e., that every customer chooses the most convenient menu, e.g., calling plan in a phone company.





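$$p_1(x_1) = \frac{c}{1-\frac{1}{|\varepsilon_1|}} \quad \text{and} \quad p_2(x_2) = \frac{c}{1-\frac{1}{|\varepsilon_2|}}$$

where |εi| is the absolute value of the price elasticity of demand in market i. Note that p1(x1)>p2(x2) if and only if

$$\frac{c}{1-\frac{1}{|\varepsilon_1|}} > \frac{c}{1-\frac{1}{|\varepsilon_2|}}$$

which implies

$$1-\frac{1}{|\varepsilon_2|} > 1-\frac{1}{|\varepsilon_1|} \iff \frac{1}{|\varepsilon_2|} < \frac{1}{|\varepsilon_1|} \iff |\varepsilon_2| > |\varepsilon_1|.$$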

Therefore, the market with the more elastic demand (the market that is more sensitive to price changes, i.e., market 2 in our above example) is where the monopolist charges the lower price.
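To connect this result with the movie theater example, the sketch below solves MRi=MC in each market and then computes the point elasticity of demand at the optimum, so that the IEPR can be checked directly. It assumes linear inverse demands of the form p(x)=intercept-slope·x; the function names are illustrative.

```python
def market_optimum(intercept, slope, mc):
    """Solves MR = intercept - 2*slope*x = mc for p(x) = intercept - slope*x."""
    x = (intercept - mc) / (2 * slope)
    p = intercept - slope * x
    return x, p


def point_elasticity(slope, x, p):
    """(dx/dp) * (p/x), where dx/dp = -1/slope for a linear inverse demand."""
    return -(1 / slope) * p / x


def iepr_price(mc, elasticity):
    """Inverse elasticity pricing rule: p = mc / (1 + 1/elasticity)."""
    return mc / (1 + 1 / elasticity)


# Adults: p1 = 38 - x1; seniors: p2 = 14 - 0.25*x2; MC = 10 in both markets
for name, intercept, slope in (("adults", 38, 1), ("seniors", 14, 0.25)):
    x, p = market_optimum(intercept, slope, 10)
    eps = point_elasticity(slope, x, p)
    print(name, x, p, round(eps, 3), round(iepr_price(10, eps), 3))
```

In this example the seniors' demand is the more elastic one at the optimum, and they are charged the lower price, consistent with the condition derived above.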

Example. A single airline operates the route Pullman-Seattle and considers charging different prices for its business class seats and economy seats.11 According to demand estimates, the price-elasticity of demand for business class seats is -1.15 while that for economy seats is -1.52, showing a larger sensitivity to price changes. From the first estimate, and using the IEPR, we can conclude that the price charged for every business class seat must satisfy pB(1-1/1.15)=MC, i.e., 0.13pB=MC. Similarly, using the IEPR we obtain that the price charged for every economy class seat must satisfy pV(1-1/1.52)=MC, i.e., 0.342pV=MC. Therefore, 0.13pB=0.342pV, or pB≈2.62pV (using unrounded coefficients). That is, the airline maximizes its profits by charging business class seats a price about 2.6 times higher than that of economy class seats.12,13 □
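The arithmetic of the airline example can be reproduced as follows. This is a minimal sketch that only applies the IEPR to the two elasticities given in the example; the function name is hypothetical, and marginal cost cancels out of the price ratio because it is the same for both seat classes.

```python
def iepr_factor(elasticity):
    """Returns 1 + 1/e, the factor such that p * (1 + 1/e) = MC."""
    return 1 + 1 / elasticity


business_factor = iepr_factor(-1.15)   # multiplies the business class price
economy_factor = iepr_factor(-1.52)    # multiplies the economy class price

# Both products equal the same MC, so pB / pV = economy_factor / business_factor.
price_ratio = economy_factor / business_factor
print(round(business_factor, 3), round(economy_factor, 3), round(price_ratio, 2))
```

Small differences in the second decimal of the ratio arise depending on whether the two factors are rounded before dividing.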

Regulation of Natural Monopolies

Some monopolies exhibit decreasing cost structures, with the MC curve lying below the AC curve, as the following figure depicts. In this case, having a single firm serving the entire market might seem better (more “natural”) than having multiple firms, since average costs would be lower in the former than in the latter case. For this reason, monopolies with decreasing costs are usually referred to as “natural” monopolies. An unregulated natural monopoly, however, would maximize profits at the point where MR=MC, producing Q1 units in the figure and selling them at a price p1. If a regulatory agency dislikes this monopoly output and price and forces the monopoly to set its price equal to marginal cost (as if the market structure were perfectly competitive), the monopoly will have to charge p2 (where demand crosses MC) and produce Q2 units. This production level, however, implies a loss of c2-p2 per unit in the figure, since the average cost c2 lies above the price p2.

11 If you have been in that plane, you know that the airline’s marginal cost of offering business class and economy class seats is exactly the same!

12 NS presents a similar example in Example 14.5.

13 Note that third-degree price discrimination might imply serving (not serving) some customers who might be not served (served, respectively) under uniform pricing. This implies that the practice of third-degree price discrimination can be welfare improving (or welfare reducing) under certain conditions. For a detailed discussion on this topic, see Varian pp. 250-253.



Figure 7.21

The above discussion illustrates a dilemma for regulatory agencies when dealing with natural monopolies: either they abandon the policy of setting prices equal to marginal cost altogether, or they continue applying marginal cost pricing but must subsidize the natural monopoly (providing c2-p2 per unit of production) forever. One way in which regulatory agencies can avoid this dilemma is the implementation of a multiprice system: charging some users a high price while maintaining a low price (e.g., marginal cost pricing) for other users. For instance, the regulatory commission can allow charging a high price p1 to some users while other users are offered a lower price p2, as the following figure illustrates. Specifically, this produces a benefit of p1-c1 per unit of output from 0 to Q1 and a loss of c2-p2 per unit of output for the additional units (Q2-Q1) sold to the second segment of customers. This approach is frequently used by utility companies (electricity, water supply, etc.) that set different prices for different types of customers (e.g., businesses, households, etc.).

Figure 7.22

An alternative approach to the regulation of natural monopolies is to allow the monopoly to charge a price above marginal cost that is sufficient to earn a “fair” rate of return on capital investments. This approach, however, presents two difficulties. First, it might be prone to different interpretations about what is a “fair” rate of return on capital investments. Second, it leads to overcapitalization, as we show more formally below.



Overcapitalization of natural monopolies (Averch-Johnson effect). Suppose a regulated utility company has a production function of the form q=f(k,l). Suppose that the rate of return on capital investments, s, is constrained by a regulatory agency to be equal to s0. Then, the firm’s profit maximization problem is represented by the following Lagrangian

$$L = pf(k,l) - wl - vk + \lambda\left[wl + s_0k - pf(k,l)\right]$$

where the constraint states that the rate of return on capital investment dictated by the regulatory agency is s0. Note that λ cannot be zero. Otherwise, the above objective would simply become pf(k,l)-wl-vk, in which case the regulation would be ineffective and the monopolist would behave as any profit-maximizing firm. Similarly, λ cannot be equal to one. Otherwise, the above objective reduces to (s0-v)k and, assuming that the rate of return dictated by the regulatory agency, s0, is higher than the rental rate of capital prevailing in the market, v, i.e., s0>v, the monopoly would hire infinite amounts of capital. It must therefore be that 0<λ<1. In particular, the FOCs are



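$$\begin{aligned}
\frac{\partial L}{\partial l} &= pf_l - w + \lambda(w - pf_l) = 0 \\
\frac{\partial L}{\partial k} &= pf_k - v + \lambda(s_0 - pf_k) = 0 \\
\frac{\partial L}{\partial \lambda} &= wl + s_0k - pf(k,l) = 0
\end{aligned}$$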

From the first FOC, which can be written as (1-λ)(pfl-w)=0, we obtain that the regulated monopoly increases l (hiring more workers) until pfl=w, i.e., until the point at which the value of the marginal product of labor coincides with the marginal cost of an additional worker. The result obtained in the second FOC, in contrast, implies that the monopolist increases capital until


$$\begin{aligned}
(1-\lambda)\,pf_k &= v - \lambda s_0 \\
pf_k &= \frac{v - \lambda s_0}{1-\lambda} = v - \frac{\lambda(s_0 - v)}{1-\lambda}
\end{aligned}$$

and since s0>v and 0<λ<1, the above condition implies

$$pf_k = v - \frac{\lambda(s_0 - v)}{1-\lambda} < v$$

Hence, pfk<v. Therefore, the firm would hire more capital (achieving a lower marginal product of capital fk) than under unregulated conditions, where pfk=v. This result suggests why some regulated natural monopolies (such as electricity and water suppliers) might be “overcapitalized” after being regulated. The following figure illustrates this overcapitalization result (also referred to as the Averch-Johnson effect). In particular, before regulation the firm selects the input combination (LBR, KBR). After the regulation is introduced, capital becomes cheaper in relative terms (flatter isocost), which leads the firm to choose an input combination with a larger amount of capital (LAR, KAR).
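The inequality pfk<v can be illustrated with a small numerical sketch. Note that λ is determined endogenously by the regulated firm's problem; here it is simply treated as a given number between zero and one, and the values of v, s0 and λ are hypothetical.

```python
def regulated_vmp_of_capital(v, s0, lam):
    """Value of the marginal product of capital, p*f_k, implied by the second FOC."""
    if not 0 < lam < 1:
        raise ValueError("the derivation requires 0 < lambda < 1")
    if s0 <= v:
        raise ValueError("the derivation assumes s0 > v")
    return (v - lam * s0) / (1 - lam)


# Hypothetical values: market cost of capital 5%, allowed return 8%, lambda = 0.3
v, s0, lam = 0.05, 0.08, 0.3
vmp = regulated_vmp_of_capital(v, s0, lam)
print(vmp, v - lam * (s0 - v) / (1 - lam), vmp < v)
```

The two printed expressions coincide up to rounding, and the comparison confirms that the regulated firm equates the value of the marginal product of capital to a number below v.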



Chapter 8 – Externalities and Public Goods


An externality is present when the well-being of a consumer (or the production possibilities of a firm) is directly affected by the actions of another agent in the economy. Externalities can arise in many ways but, however they arise, their effects are always the same: the actions of a consumer or a producer benefit or harm other consumers or producers. We accordingly distinguish between positive and negative externalities.

Negative externalities. Externalities are negative if they impose costs on, or reduce the benefits of, other producers and consumers. A standard example is pollution generated in production, where the manufacturer of an industrial good causes environmental damage by polluting the air or water. Suppose that a factory produces and sells tires. In the course of production, smoke is generated, and everybody who lives in the neighborhood of the factory suffers because of it. The price consumers are willing to pay for tires is given by the benefit derived from using the tires. Hence, at the market equilibrium, the marginal private cost of producing a tire is equal to the marginal benefit of using the tire, but the market does not incorporate the additional cost of pollution imposed on those who live near the factory. Thus, from the social point of view, too many tires are produced by the market. In short, with a negative externality the marginal social cost exceeds the marginal private cost.

Positive externalities. With a positive externality, the marginal social benefit from the good or service exceeds the marginal private benefit. An example of a positive externality in production is the development of a new technology, like the laser or the transistor, which benefits not only the inventor but also many other producers and consumers in the economy. Another simple example is a local bakery producing bread. People who walk by the bakery benefit from the pleasant smell of baking bread, and this is not incorporated into the price of bread. Therefore, the marginal social benefit of another loaf of bread is the sum of the benefit people get from eating the bread and the benefit people get from the pleasant smell of baking bread. However, since bread purchasers do not take into account the benefit provided to people who do not purchase bread, at the equilibrium price the marginal social benefit of additional bread is greater than its marginal cost. From a social perspective, too little bread is produced.1 In the next section, to illustrate these externality issues, we discuss bilateral externalities and solutions to the externality problem.

Bilateral externalities.2 In this section we focus on two consumers, i={1,2}, who belong to an economy with several consumers and firms and who consume L traded goods with price vector p. For simplicity, we will assume price-taking behavior.3 Every individual’s wealth level is wi, and his/her utility function is ui(x1i,x2i,…,xLi,h), where h is the measure of the externality that consumer 1’s actions cause on consumer 2’s wellbeing, e.g., tons of CO2 in the air. Negative externalities, such as the pollution of a river, global warming, the loud music of your officemate, etc., imply that dui/dh < 0. In contrast, when externalities are positive (such as your neighbor’s care of his garden, vaccination decisions in your community, etc.), the individual’s utility level increases in h, i.e., dui/dh > 0.4,5

1 For more examples of negative and positive externalities, please refer to Besanko and Braeutigam, 2005, pp.638-

2 An additional explanation of bilateral (interfirm) externalities can be found in Snyder and Nicholson, 2008, p.671

3 Note that this implies that the L commodities are traded in perfectly competitive markets. Modifying this assumption to one in which commodities are traded in monopoly or oligopoly markets can have significant consequences for our results. See, for instance, Kolstad.

Let us define, for a given level of the externality, h, individual i’s Utility Maximization Problem (UMP)

$$\max_{x_i \geq 0}\ u_i(x_i, h) \quad \text{s.t. } p\cdot x_i \leq w_i$$

and define vi(p,wi,h) to be the value function associated with the above maximization problem. For simplicity, consider the following quasilinear utility function

$$u_i(x_i, h) = x_{1i} + g_i(x_{-1i}, h)$$

where x-1i denotes individual i’s consumption of all goods other than good 1, i.e., x-1i=(x2i,x3i,…,xLi). Then, the Walrasian demand for these L-1 goods, x-1i(p,h), is independent of her wealth. Therefore,

$$v_i(p, w_i, h) = x_{1i} + g_i(x_{-1i}(p,h), h)$$

But we know that (with the price of good 1 normalized to one)

$$x_{1i} + p\cdot x_{-1i}(p,h) = w_i \ \Rightarrow\ x_{1i} = w_i - p\cdot x_{-1i}(p,h)$$

Hence,

$$v_i(p, w_i, h) = w_i - p\cdot x_{-1i}(p,h) + g_i(x_{-1i}(p,h), h)$$

And denoting

$$\phi_i(p,h) \equiv g_i(x_{-1i}(p,h), h) - p\cdot x_{-1i}(p,h)$$

we have

$$v_i(p, w_i, h) = w_i - p\cdot x_{-1i}(p,h) + g_i(x_{-1i}(p,h), h) = \phi_i(p,h) + w_i$$

That is,

$$v_i(p, w_i, h) = \phi_i(p,h) + w_i$$

and since the prices of the L goods are unaffected by changes in the externality level h, we can simply write φi(h). In particular, φi(h) reflects how individual i’s utility is affected by the externality, where we

4 This property of positive externalities is very similar to public goods, where larger contributions of other individuals to the public good increase every individual’s utility level (because of the non-rivalry property). We return to the connection between positive externalities and public goods below.

5 Externalities can also arise in production. For a worked-out example, see Example 19.1 in NS.



assume that φ’i(h)≤0 and that φ’’i(h)<0 (indicating that the first derivative φi’(h) is decreasing in h), as the following figure illustrates.

Figure 8.1

Hence, individual i obtains a positive and significant additional benefit from the first unit of the externality-generating activity, but the additional benefit becomes lower as the amount of the activity increases. This could be the case, for instance, of a firm generating pollution as a side-effect of its production process.

Competitive equilibrium: In the competitive equilibrium, every individual independently chooses the level of the externality-generating activity, h, that solves the UMP

$$\max_{h \geq 0}\ \phi_i(h) + w_i$$

Taking FOC with respect to h, we obtain

$$\phi_i'(h^*) \leq 0, \quad \text{with equality if } h^* > 0 \text{ (interior)}$$

This result is graphically represented (for an interior solution) in the following figure, where individual i increases the externality-generating activity until the marginal benefits he would obtain from an additional unit (net of marginal costs) are exactly zero, at h*.



Figure 8.2

Pareto optimal: In contrast, the social planner selects the level of h that maximizes social welfare, that is

$$\max_{h \geq 0}\ \phi_1(h) + \phi_2(h)$$

The first-order condition for a maximum is:

$$\phi_1'(h^0) + \phi_2'(h^0) \leq 0, \quad \text{with equality if } h^0 > 0$$

where h0 is the Pareto optimal amount of h.

That is,

$$\phi_1'(h^0) \leq -\phi_2'(h^0), \quad \text{or} \quad \phi_1'(h^0) = -\phi_2'(h^0) \ \text{in the case of interior solutions.}$$

Intuitively, this condition states that, at an interior solution, the marginal benefit that consumer 1 obtains from an additional unit of the externality-generating activity, φ’1(h), must be equal to the marginal damage that this additional unit imposes on consumer 2, -φ’2(h), as the following figure depicts.



Figure 8.3

Importantly, note that the above figure represents the case of a negative externality, i.e., φ’2(h)<0, so that the activity is a bad for consumer 2 (e.g., loud music). In this case h*>h0, i.e., too much of the externality h is produced.

If, in contrast, the activities of consumer 1 produce a positive externality of consumer 2’s wellbeing (baking bread smell or beautification of the yard), i.e., φ’2(h)>0, h*<h0 (i.e., there is an underproduction of the externality-generating activity) as the following figure illustrates.



Figure 8.4
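The comparison between h* and h0 can also be checked numerically. The following is a minimal sketch in Python, assuming hypothetical linear marginal benefit curves that do not come from these notes: φ’1(h)=10-h for consumer 1, φ’2(h)=-2-h for the negative externality, and φ’2(h)=2-0.1h for the positive externality. The helper `solve_foc` is also an assumption of this sketch: it uses bisection to find where a decreasing marginal curve crosses zero, and returns the corner h=0 when the curve is never positive.

```python
# Illustrative marginal benefit curves (assumed functional forms, not from the notes).
def phi1_prime(h):
    return 10.0 - h          # consumer 1's marginal benefit, decreasing in h


def phi2_prime_negative(h):
    return -2.0 - h          # negative externality: consumer 2 is harmed by h


def phi2_prime_positive(h):
    return 2.0 - 0.1 * h     # positive externality: consumer 2 benefits from h


def solve_foc(marginal, lo=0.0, hi=100.0, tol=1e-10):
    """Bisection on a decreasing marginal curve; returns the corner lo if marginal(lo) <= 0."""
    if marginal(lo) <= 0:
        return lo
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if marginal(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Competitive equilibrium: consumer 1 ignores consumer 2, so phi1'(h*) = 0.
h_star = solve_foc(phi1_prime)

# Pareto optimum: phi1'(h0) + phi2'(h0) = 0.
h_opt_negative = solve_foc(lambda h: phi1_prime(h) + phi2_prime_negative(h))
h_opt_positive = solve_foc(lambda h: phi1_prime(h) + phi2_prime_positive(h))

print(f"h* = {h_star:.3f}")
print(f"h0 (negative externality) = {h_opt_negative:.3f}")
print(f"h0 (positive externality) = {h_opt_positive:.3f}")
```

Under these assumed curves the equilibrium level lies above the optimum in the negative externality case and below it in the positive externality case, in line with Figures 8.3 and 8.4.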

A couple of points are in order. First, negative externalities are not necessarily eliminated at the Pareto optimal solution. Indeed, this would only occur in the extreme case in which the externality-generating activity produces a sufficiently high damage for consumer 2, such that -φ’2(0)>φ’1(0), so that the Pareto optimal solution occurs at the corner where h0=0. Second, in this example the quasilinearity assumption eliminates wealth effects. A natural question is what would happen if we did not assume that consumers have quasilinear utility functions. This question is explored in exercises given in the homework assignment.6

Solutions to the externality problem

There are two traditional approaches for solving externality problems:

1. Setting quotas (emission standards). If the social planner is perfectly informed about the benefits and damages of the externality for all consumers, he can choose to set an emission standard banning production levels higher than the Pareto optimal level h0.

2. Imposing a tax on the externality-generating activity, or Pigouvian taxation.7 This policy sets a tax th per unit of the externality-generating activity h. But what is the level of the tax th that restores efficiency? In order to answer this question, let us start by rewriting consumer 1’s UMP including the tax, as follows

$$\max_{h \ge 0} \; \phi_1(h) - t_h h$$

Taking FOCs with respect to h, we obtain

$$\phi_1'(h) - t_h \le 0 \;\Rightarrow\; \phi_1'(h) \le t_h$$

which, in the case of interior solutions, implies φ’1(h)=th. We know that, at the optimal level h0, φ’1(h0)=-φ’2(h0). Thus, setting th=-φ’2(h0) (which is positive) leads consumer 1 to choose ht=h0, implementing the social optimum, as the figure below shows.

6 Exercise 24.1 from Varian, Microeconomic Analysis, and exercise 11.B.4 in MWG.

7 For additional references about this policy measure, see pp. 355-56 in MWG and Ch. 7 in Kolstad.

Figure 8.5

Importantly, note that the tax produces a downward shift in the curve representing consumer 1’s marginal benefit from additional units of the externality-generating activity, as the next figure depicts. This allows for the new curve of marginal benefits to exactly cross the horizontal axis at h0, indicating that after the tax is imposed consumer 1 voluntarily chooses a level of h that coincides with the Pareto optimal level h0.

Figure 8.6
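The logic of the optimality-restoring tax can be illustrated with a small numerical sketch. It assumes hypothetical linear marginal curves φ’1(h)=a-bh and φ’2(h)=-c-dh, which are not taken from these notes, and the function name `pigouvian_tax` is likewise only illustrative. A negative value of c turns the example into a positive externality, in which case the computed tax is negative, i.e., a subsidy.

```python
def pigouvian_tax(a, b, c, d):
    """Assumed linear forms: phi1'(h) = a - b*h and phi2'(h) = -c - d*h, with b > 0 and b + d > 0.

    Returns (h0, t_h, h_taxed): the Pareto optimal level, the tax t_h = -phi2'(h0),
    and consumer 1's choice once the tax is in place.
    """
    h_opt = max(0.0, (a - c) / (b + d))   # solves phi1'(h) + phi2'(h) = 0, or the corner h = 0
    t_h = c + d * h_opt                   # t_h = -phi2'(h0)
    h_taxed = max(0.0, (a - t_h) / b)     # consumer 1 sets phi1'(h) = t_h, or the corner h = 0
    return h_opt, t_h, h_taxed


negative_case = pigouvian_tax(a=10.0, b=1.0, c=2.0, d=1.0)    # consumer 2 is harmed by h
positive_case = pigouvian_tax(a=10.0, b=1.0, c=-2.0, d=0.1)   # consumer 2 benefits from h
corner_case = pigouvian_tax(a=1.0, b=1.0, c=5.0, d=1.0)       # damage so large that h0 = 0

for label, (h_opt, t_h, h_taxed) in [("negative", negative_case),
                                     ("positive", positive_case),
                                     ("corner", corner_case)]:
    print(f"{label}: h0 = {h_opt:.3f}, t_h = {t_h:.3f}, choice under the tax = {h_taxed:.3f}")
```

In each of the three assumed cases consumer 1's choice under the tax coincides with h0, which is the property the tax is designed to deliver.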

Intuitively, note that the optimality-restoring tax th is equal to the marginal externality at the optimal solution. That is, it is equal to the amount of money that consumer 2 would be willing to pay consumer 1 in order to reduce h slightly from its optimal level h0. As suggested above, the tax th induces consumer 1 to internalize the externality that he is causing to consumer 2. These types of optimality-restoring taxes are referred to as Pigouvian taxes. Finally, note that in the case that such negative externality is very substantial (and h0=0), we need to impose a tax th=-φ’2(0) or higher, as described in the following figure.

Figure 8.7

All our previous discussion can also be extended to positive externalities. In particular, we similarly set a tax th=-φ’2(h0). However, now φ’2(h0)>0 (since further units of h increase consumer 2’s welfare), which implies that the tax th=-φ’2(h0)<0, i.e., the tax becomes a subsidy to consumer 1 for every unit of the positive externality h that he generates. Graphically, this per-unit subsidy produces an upward shift in the curve representing the marginal benefits that consumer 1 obtains from increasing the amount of the externality-generating activity, as the following figure represents. This implies that consumer 1 has incentives to increase h beyond the competitive equilibrium level h* until the Pareto optimal level h0.

Figure 8.8



Some important points about Pigouvian taxation:8

a. A tax th on the negative externality is equivalent to a subsidy inducing agents to reduce the externality until the Pareto optimal level h0. In particular, consider that the social planner sets a subsidy sh=-φ’2(h0)>0 for every unit by which consumer 1’s choice of h falls below the equilibrium level h*. Hence, consumer 1’s UMP becomes

$$\max_{h \ge 0} \; \phi_1(h) + \underbrace{s_h (h^* - h)}_{\text{subsidy}} = \phi_1(h) + s_h h^* \underbrace{- \, s_h h}_{\text{per-unit tax}}$$

Taking FOCs with respect to h, we obtain φ’1(h)-sh≤0, i.e., φ’1(h)≤sh, where sh=th. Importantly, this FOC coincides with that under the Pigouvian taxation described above (taxing the negative externality at a rate th); the two programs differ only in the lump-sum transfer shh*. Hence, a subsidy for the reduction of the externality (combined with a lump-sum transfer shh*) can exactly replicate the outcome of the Pigouvian tax.9

b. The Pigouvian tax levies a tax on the externality-generating activity (e.g., pollution) but not on the output that generated such pollution. In this sense, the externality-generating activity is directly taxed. If, instead, output were taxed, the firm would reduce output, which is not guaranteed to reduce pollution emissions.10

c. The quota and the Pigouvian tax are equally effective under complete information, i.e., when the social planner has accurate information about all agents’ benefit and cost functions. This might not be the case if governments lack relevant information about the benefits and costs of the externality for consumers and firms.11
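Point (a) above can be verified with a short sketch. It assumes a hypothetical quadratic benefit function φ1(h)=10h-h²/2 and a marginal damage -φ’2(h)=2+h (neither is taken from these notes), and it searches over a grid of candidate levels of h. The sketch checks that consumer 1's payoff under the subsidy differs from his payoff under the tax only by the constant sh·h*, so both policies lead to the same choice of h.

```python
# Assumed primitives (illustrative only): phi1(h) = A*h - 0.5*B*h^2, -phi2'(h) = C + D*h.
A, B, C, D = 10.0, 1.0, 2.0, 1.0


def phi1(h):
    return A * h - 0.5 * B * h ** 2


h_star = A / B                  # competitive equilibrium: phi1'(h*) = 0
h_opt = (A - C) / (B + D)       # Pareto optimum: phi1'(h0) = -phi2'(h0)
s_h = C + D * h_opt             # s_h = t_h = -phi2'(h0)


def payoff_under_tax(h):
    return phi1(h) - s_h * h


def payoff_under_subsidy(h):
    return phi1(h) + s_h * (h_star - h)


grid = [i / 100 for i in range(0, 1501)]        # candidate levels of h in [0, 15]
choice_tax = max(grid, key=payoff_under_tax)
choice_subsidy = max(grid, key=payoff_under_subsidy)

# The two payoffs differ by the lump-sum amount s_h * h_star at every h.
largest_gap = max(abs(payoff_under_subsidy(h) - payoff_under_tax(h) - s_h * h_star) for h in grid)

print(choice_tax, choice_subsidy, h_opt, largest_gap)
```

A grid search is used here only to avoid relying on the closed-form first order condition; any other maximization routine would serve the same purpose.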

Fostering bargaining over externalities

8 For a worked-out example on a Pigouvian tax on newsprint, see Example 19.2 in NS; detailed graphical illustrations are given in Nechyba, Microeconomics, pp. 746-751.

9 Kolstad (pp. 124-128) expands on the equivalence between Pigouvian taxes and subsidies.

10 There is, however, one exception: the case in which emissions bear a fixed monotonic relationship to the level of output, so that every unit of output generates a constant proportion (e.g., α) of emissions. In such a case, emissions can be measured by simply observing output, and a tax on output induces the firm to reduce output (and, as a consequence, emissions) to its optimal level. Therefore, in this case imposing a direct tax on emissions or an indirect tax on output would yield the same results in terms of total pollution. (One exercise in the homework assignment, MWG 11.B.5, explores this possibility.)

11 See Kolstad for regulation under contexts of incomplete information.



In this subsection we examine a less intrusive approach to solving the externality problem, namely, allowing bargaining between the parties generating and affected by the externality. That is, this approach relies on the parties to negotiate a solution. The success of this system depends on a clear assignment of property rights. Does consumer 1 have the right to produce the externality h? If so, how much? Can consumer 2 prevent consumer 1 from producing the externality? The result is that, as long as property rights are clearly assigned, the two parties will negotiate in such a way that the optimal level of the externality-producing activity is implemented (a result known as the Coase Theorem12). Unlike the previous solutions (quotas, taxes or subsidies), note that bargaining does not require government intervention.

Let us first assume that we assign property rights to consumer 2 (the individual suffering the negative externality), so that at the initial state no externality is generated, i.e., h=0. We refer to this state as the “externality-free” environment. In this context, consumer 1 (the polluter) must pay consumer 2 if he wants to increase the externality above zero. In particular, let us assume that consumer 2 makes a take-it-or-leave-it offer where consumer 1 pays T dollars in exchange for the right to produce h units of pollution. Specifically, consumer 1 agrees to pay $T to consumer 2 (in order to pollute h units) if and only if

$$\phi_1(h) - T \ge \underbrace{\phi_1(0)}_{\text{current state}}$$

Given this constraint on the set of acceptable offers, consumer 2 will choose (h, T) in order to solve the problem


$$\max_{h \ge 0,\; T} \; \phi_2(h) + T \quad \text{s.t.} \quad \phi_1(h) - T \ge \phi_1(0)$$

Note that the constraint of the UMP is binding (holding with equality) since consumer 2 will raise the fixed fee $T he charges to consumer 1 until the point where consumer 1 is made indifferent between accepting and rejecting such offer. That is,

$$\phi_1(h) - T = \phi_1(0) \;\Rightarrow\; T = \phi_1(h) - \phi_1(0)$$

Plugging this result into consumer 2’s UMP, we obtain

$$\max_{h \ge 0} \; \phi_2(h) + \phi_1(h) - \phi_1(0)$$

and taking first order conditions with respect to h,

$$\phi_2'(h) + \phi_1'(h) \le 0 \iff \phi_1'(h) \le -\phi_2'(h)$$

Importantly, this first order condition coincides with that solving the social planner’s problem. Therefore, the level of the externality h is set at the optimal level h=h0. The following figure illustrates this result. In particular, starting from an initial state where h=0 (externality-free environment), the above result shows that consumer 1 (the polluter) is willing to pay $T to consumer 2 in order to increase pollution until h=h0.13

12 The Coase Theorem states that, regardless of how property rights are assigned with an externality, the allocation of resources will be efficient when the parties can costlessly bargain with each other (Besanko, 2005, p. 653).

Figure 8.9

What happens if, instead, the property rights are assigned to the polluter? First, note that if there is no bargaining between consumers 1 and 2, consumer 1 would pollute until his marginal benefits are zero, i.e., h=h*. However, consumer 2 can pay $T to consumer 1 in exchange for a lower level of pollution, h, where h is reduced from h*. Note that consumer 1 is willing to take this offer if and only if

$$\phi_1(h) + T \ge \underbrace{\phi_1(h^*)}_{\text{current state}}$$

Hence, consumer 2’s UMP becomes


$$\max_{h \ge 0,\; T} \; \phi_2(h) - T \quad \text{s.t.} \quad \phi_1(h) + T \ge \phi_1(h^*)$$

13 Note that the polluter does not have incentives to raise pollution beyond h0, since the payment he would have to make to consumer 2 (in order to compensate him for his marginal costs) is above the marginal benefit the polluter obtains from additional units of the externality.

(Note that the fee $T now enters negatively into consumer 2’s utility, but positively into consumer 1’s, unlike in the previous case, where property rights were assigned to consumer 2.) As in our previous discussion, consumer 2 reduces the offer T until the point where consumer 1 is indifferent between accepting and rejecting the offer. That is,

$$\phi_1(h) + T = \phi_1(h^*) \;\Rightarrow\; T = \phi_1(h^*) - \phi_1(h)$$

Inserting this result into consumer 2’s UMP, we obtain

$$\max_{h \ge 0} \; \phi_2(h) - \phi_1(h^*) + \phi_1(h)$$

Taking first order conditions with respect to h, we obtain

$$\phi_2'(h) + \phi_1'(h) \le 0 \;\Rightarrow\; \phi_1'(h) \le -\phi_2'(h)$$

which again coincides with the first order conditions at the optimal level of the externality (social planner’s problem), where h=h0. The following figure depicts the voluntary reduction of the externality associated with the bargaining process. Specifically, starting from an initial situation where h=h*, consumer 2 pays $T to consumer 1 in order to reduce pollution until h=h0.14

Figure 8.10

We have just shown that, regardless of the initial assignment of property rights over the externality-generating activity, agents can negotiate the increase or reduction of the externality level until reaching the Pareto optimal level. This result is usually referred to as the Coase Theorem, and we present it below.

14 Note that consumer 2 is not willing to reduce pollution below h0, since he would have to compensate consumer 1 for his relatively high marginal benefits. Since consumer 2’s marginal cost of additional units of pollution (for all h<h0) is lower than consumer 1’s marginal benefits from such pollution, consumer 2 is not willing to further reduce pollution below h0. Note that this argument parallels our discussion of why agents do not agree to pollution levels above h0 when property rights were assigned to consumer 2.



Coase Theorem. If bargaining between the agents generating and affected by the externality is possible, then the initial allocation of property rights does not affect the level of the externality. In particular, the externality is finally set at the optimal level h=h0.15

Nonetheless, the allocation of property rights affects the final wealth of the two agents:

1. If property rights are assigned to consumer 2 (the individual affected by the externality), consumer 1 must pay $T = \phi_1(h^0) - \phi_1(0)$ to consumer 2. Indeed, if property rights are allocated to consumer 2, consumer 2’s utility is

$$\phi_2(h^0) + T = \phi_2(h^0) + \underbrace{\phi_1(h^0) - \phi_1(0)}_{T}$$

while that of consumer 1 is

$$\phi_1(h^0) - T = \phi_1(h^0) - \left( \phi_1(h^0) - \phi_1(0) \right) = \phi_1(0)$$

Hence, consumer 2’s utility is higher than that of consumer 1 if

$$\phi_2(h^0) + \phi_1(h^0) - \phi_1(0) > \phi_1(0) \iff \phi_1(h^0) + \phi_2(h^0) > 2\phi_1(0)$$

2. If, instead, property rights are assigned to consumer 1 (the polluter), consumer 2 must pay $T = \phi_1(h^*) - \phi_1(h^0)$ to consumer 1. Indeed, if property rights are allocated to consumer 1, consumer 1’s utility is

$$\phi_1(h^0) + T = \phi_1(h^0) + \underbrace{\phi_1(h^*) - \phi_1(h^0)}_{T} = \phi_1(h^*)$$

while that of consumer 2 is

$$\phi_2(h^0) - T = \phi_2(h^0) - \left( \phi_1(h^*) - \phi_1(h^0) \right)$$

Hence, consumer 1’s utility is higher than that of consumer 2 if

$$\phi_1(h^*) > \phi_2(h^0) - \phi_1(h^*) + \phi_1(h^0) \iff 2\phi_1(h^*) > \phi_1(h^0) + \phi_2(h^0)$$

Therefore, under either assignment, the agent holding the property rights obtains a higher utility than the other agent if

$$2\phi_1(h^*) > \underbrace{\phi_1(h^0) + \phi_2(h^0)}_{\text{aggregate welfare at the Pareto optimum}} > 2\phi_1(0)$$
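The two bargaining cases can be compared in a small numerical sketch. It assumes hypothetical quadratic functions φ1(h)=10h-h²/2 and φ2(h)=-2h-h²/2, which are not taken from these notes, and a grid of candidate externality levels. As in the text, consumer 2 makes the take-it-or-leave-it offer and holds consumer 1 to the utility of his status quo; the function name `bargain` and the sign convention for the transfer (positive when consumer 1 pays consumer 2) are choices of this sketch.

```python
# Assumed primitives (illustrative only).
A, B, C, D = 10.0, 1.0, 2.0, 1.0


def phi1(h):
    return A * h - 0.5 * B * h ** 2      # consumer 1 (polluter)


def phi2(h):
    return -C * h - 0.5 * D * h ** 2     # consumer 2 (affected by the externality)


GRID = [i / 100 for i in range(0, 1001)]  # candidate levels of h in [0, 10]

# Without bargaining, consumer 1 maximizes phi1 alone.
h_star = max(GRID, key=phi1)


def bargain(status_quo_h):
    """Consumer 2's take-it-or-leave-it offer when the status quo is status_quo_h.

    The binding participation constraint gives the transfer from consumer 1 to
    consumer 2, T = phi1(h) - phi1(status_quo_h); a negative value means that
    consumer 2 pays consumer 1.
    """
    reservation = phi1(status_quo_h)
    h = max(GRID, key=lambda x: phi2(x) + phi1(x) - reservation)
    transfer = phi1(h) - reservation
    u1 = phi1(h) - transfer
    u2 = phi2(h) + transfer
    return h, transfer, u1, u2


rights_to_consumer_2 = bargain(status_quo_h=0.0)      # externality-free status quo
rights_to_consumer_1 = bargain(status_quo_h=h_star)   # status quo at h*

print("rights to consumer 2:", rights_to_consumer_2)
print("rights to consumer 1:", rights_to_consumer_1)
```

Under these assumptions the negotiated level of h is the same in both cases and so is the sum of utilities, while the transfer and the individual utility levels depend on who holds the property rights.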

Let us examine the distribution of utility levels before/after bargaining using a utility possibility set, representing the distribution of utility levels (u1,u2) among the two parties.

15 For an excellent discussion of the Coase theorem, see Kolstad chapter 6.



Figure 8.11

Point a denotes the case in which we assign property rights to consumer 2 (and the externality is initially h=0, at the “externality-free” environment). In contrast, point b represents the case in which we assign property rights to consumer 1 (and the externality is initially h*). When individual 2 makes the take-it-or-leave-it offer to individual 1, as we assumed above, individual 2 holds all the bargaining power, and bargaining leads to point f in the first case and to point e in the second case. If, instead, the bargaining procedure were the opposite, and individual 1 made a take-it-or-leave-it offer to individual 2, then individual 1 would capture all the gains from bargaining, reaching point d when property rights are assigned to consumer 2 and point c when they are assigned to consumer 1, as the following figure depicts.

Figure 8.12

Finally, note that other more complex bargaining procedures (allowing for offers and counteroffers during multiple periods, as in game-theoretic models) would yield a more intermediate allocation of utility levels, graphically represented in points along segment [f,d] (segment [e,c]) when property rights are allocated to consumer 2 (consumer 1, respectively).

Let us finally emphasize some of the advantages and disadvantages of bargaining as a solution to the externality problem, i.e., the Coase theorem. The main disadvantage of the Coase theorem is its assumption that property rights must be perfectly defined. Otherwise, the agents might not know who they should bargain with, and as a consequence the externality problem might never be solved. In addition, property rights must be perfectly enforced, i.e., the level of h must be perfectly observable and measurable by both parties. This might be technologically feasible for some types of externalities, but not for others, especially when several polluters might be responsible for the externality. Indeed, the above two assumptions (perfectly defined and enforced property rights) are not satisfied for many externalities, which hampers the possibility of using negotiations in order to solve the externality problem.

Nonetheless, if property rights are well defined and enforceable, the Coase theorem presents an important advantage over other solutions to the externality problem such as taxes, subsidies or quotas. In particular, only the parties involved must know the marginal benefits and costs associated with the externality, i.e., the regulator does not need to know them. However, note that this informational assumption is also relatively strong, since the polluter must know the cost of the externality for the affected consumers and, similarly, consumers must know by how much the profits of the firm increase as a result of higher emissions, i.e., the polluter’s profit function.16

Externalities as missing markets. An alternative way to interpret externalities is simply by considering that externalities are a commodity which lacks a market where it can be traded. Let us show that, if externalities were a traded commodity, the level of externality produced in the economy exactly coincides with the Pareto optimal level h=h0. Let us start by assuming well defined property rights, and a competitive market for the right to engage in the externality-generating activity. In addition, let ph denote the price of engaging in one unit of this activity. In this setting, consumer 1 (the polluter) decides how many “polluting rights” to purchase, say h1, by solving

$$\max_{h_1 \ge 0} \; \phi_1(h_1) - p_h h_1$$


and taking first order conditions with respect to h1, we obtain17

$$\phi_1'(h_1) \le p_h, \quad \text{with equality if } h_1 > 0$$

Similarly, consumer 2 (the individual affected by pollution) decides how many “polluting rights” to sell, say h2, by solving

$$\max_{h_2 \ge 0} \; \phi_2(h_2) + p_h h_2$$


where now the revenues from selling polluting rights, phh2, enter positively into consumer 2’s utility function. Taking first order conditions with respect to h2, we obtain18

$$\phi_2'(h_2) + p_h \le 0, \quad \text{with equality if } h_2 > 0$$

or, equivalently,

$$p_h \le -\phi_2'(h_2), \quad \text{with equality if } h_2 > 0$$

16 Note that if the two parties are firms (such as a fishery and a refinery), a form of bargaining could be the sale of one firm to the other. This would imply a Pareto efficient level of the externality, since the now merged firm would internalize the effects of pollution on the production process of the fishery.

17 In addition, note that second-order conditions are also satisfied since $\phi_1''(h_1) < 0$ by definition.

18 Note that in this case second-order conditions are also satisfied since $\phi_2''(h_2) < 0$ by definition.

In addition, in the competitive equilibrium, the market for polluting rights must clear. Hence, h1=h2=h**, and we must therefore have

$$\phi_1'(h^{**}) \le p_h \le -\phi_2'(h^{**})$$

or simply

$$\phi_1'(h^{**}) \le -\phi_2'(h^{**}), \quad \text{with equality if } h^{**} > 0$$

Importantly, this condition coincides with the first order condition that characterizes the Pareto optimal level of the externality h0. Thus, the amount of polluting rights exchanged in this market for the externality-generating activity, h**, coincides with the socially optimal level, h**=h0, and the market price for the externality is then

$$p_h^* = \phi_1'(h^0) = -\phi_2'(h^0)$$
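The market-clearing argument can be sketched numerically. The example below assumes hypothetical linear marginal curves φ’1(h)=10-h and φ’2(h)=-2-h (not taken from these notes), derives consumer 1's demand for polluting rights and consumer 2's supply from their first order conditions, and finds the price at which the two coincide by bisection. The function names are illustrative only.

```python
# Assumed linear marginal curves: phi1'(h) = A - B*h and phi2'(h) = -C - D*h.
A, B, C, D = 10.0, 1.0, 2.0, 1.0


def demand(p):
    """Rights bought by consumer 1: phi1'(h1) = p, or the corner h1 = 0."""
    return max(0.0, (A - p) / B)


def supply(p):
    """Rights sold by consumer 2: -phi2'(h2) = p, or the corner h2 = 0."""
    return max(0.0, (p - C) / D)


def clearing_price(lo=0.0, hi=A, tol=1e-10):
    """Bisection on excess demand, which is decreasing in the price."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if demand(mid) > supply(mid):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


p_h = clearing_price()
h_traded = demand(p_h)
h_opt = (A - C) / (B + D)      # Pareto optimum: phi1'(h0) = -phi2'(h0)

print(f"p_h = {p_h:.4f}, h** = {h_traded:.4f}, h0 = {h_opt:.4f}")
```

With these assumed curves the quantity of rights traded at the clearing price matches h0, and the price equals both φ’1(h0) and -φ’2(h0).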

Multilateral Externalities

In this subsection we extend our previous discussion to externalities that are generated by multiple parties and felt by multiple parties. In particular, we will differentiate between depletable and non-depletable externalities. Specifically, a depletable externality is one in which the experience of the externality by one agent reduces the amount that will be felt by other agents. For instance, dumping of garbage on people's property constitutes a depletable externality. Indeed, if an additional unit of garbage is dumped on one property, that same unit cannot be dumped on other properties. That is, the externality is rival in consumption and therefore shares the features of private goods. In contrast, a non-depletable externality is one in which the amount of the externality experienced by one agent does not reduce the amount felt by other agents. Examples of non-depletable externalities are pollution, global warming, etc. In particular, this type of externality shares the characteristics of a public good (or, more precisely, a public bad), since it is non-rival in consumption. Let us start by showing that in the case of depletable externalities the amount of the externality produced under the competitive equilibrium is Pareto optimal.19

Depletable externalities. Consider a group of I consumers and J firms, both of them sufficiently large so that none of them maintains any market power. Let p denote a price vector of L traded goods. Every firm j generates an externality hj≥0 with associated profit of πj(hj). Every consumer i experiences utility $\phi_i(\tilde{h}_i)$ when the amount of externality he suffers is $\tilde{h}_i$. Note that, since we are dealing with a depletable externality, the amount of externality suffered by individual i, $\tilde{h}_i$, is not experienced by any other individual (rivalry in consumption). We assume that the above profit and utility functions are twice differentiable and strictly concave, i.e., πj’’(hj)<0 and $\phi_i''(\tilde{h}_i) < 0$.20 For simplicity, we analyze a negative externality, so that $\phi_i'(\tilde{h}_i) \le 0$, but a similar analysis can be extended to positive externalities. First, note that at the competitive equilibrium, every firm j (polluter) chooses the level of hj that solves its PMP

$$\max_{h_j \ge 0} \; \pi_j(h_j)$$

Taking first order conditions with respect to hj, we obtain

$$\pi_j'(h_j^*) \le 0, \quad \text{with equality if } h_j^* > 0$$

19 Related exercises are given in the homework.

In contrast, the Pareto optimal allocation of the externality involves choosing a profile describing the externality received by every consumer, $(\tilde{h}_1^0, \tilde{h}_2^0, \ldots, \tilde{h}_I^0)$, and the externality produced by every firm, $(h_1^0, h_2^0, \ldots, h_J^0)$, which solves

$$\max_{(\tilde{h}_1, \ldots, \tilde{h}_I) \ge 0,\; (h_1, \ldots, h_J) \ge 0} \; \sum_{i=1}^{I} \phi_i(\tilde{h}_i) + \sum_{j=1}^{J} \pi_j(h_j) \quad \text{s.t.} \quad \sum_{i=1}^{I} \tilde{h}_i = \sum_{j=1}^{J} h_j$$

Note that the previous constraint reflects the depletability of the externality. Intuitively, if consumer i experiences one more unit of the externality, the total amount of externality to be experienced by all other consumers decreases by exactly one unit. The Lagrangian of this maximization problem is

$$\mathcal{L} = \sum_{i=1}^{I} \phi_i(\tilde{h}_i) + \sum_{j=1}^{J} \pi_j(h_j) - \mu \left[ \sum_{i=1}^{I} \tilde{h}_i - \sum_{j=1}^{J} h_j \right]$$

Taking first order conditions with respect to $\tilde{h}_i$, we obtain

$$\phi_i'(\tilde{h}_i^0) - \mu \le 0, \quad \text{with equality if } \tilde{h}_i^0 > 0$$

taking first order conditions with respect to hj, we have

$$\pi_j'(h_j^0) + \mu \le 0, \quad \text{with equality if } h_j^0 > 0$$

and taking first order conditions with respect to μ, we obtain

$$\sum_{i=1}^{I} \tilde{h}_i^0 = \sum_{j=1}^{J} h_j^0$$

20 Recall that, intuitively, this implies that the firm's profit function is concave in the level of the externality, and the marginal cost that consumers suffer from additional units of the externality is increasing in h, as depicted in all our previous figures.

Importantly, the previous three conditions resemble those we obtain in competitive markets. In particular, conditions 10.D.3 to 10.D.5 for perfectly competitive markets establish that21

$$\phi_i'(x_i^*) - \mu \le 0, \quad \text{with equality if } x_i^* > 0 \qquad (10.D.4)$$

$$-c_j'(q_j^*) + \mu \le 0, \quad \text{with equality if } q_j^* > 0 \qquad (10.D.3)$$

$$\sum_{i=1}^{I} x_i^* = \sum_{j=1}^{J} q_j^* \qquad (10.D.5)$$

Hence, we can conclude that if well-defined and enforceable property rights can be specified over the externality, if the externality is depletable, and if the numbers of consumers and firms, I and J, are sufficiently large so that price taking is a reasonable assumption, then a competitive market for the externality leads to its Pareto optimal level.

Multilateral externalities: non-depletable externalities

When the externality is non-depletable, the market alone is typically unable to produce an efficient outcome. Let us now assume that the externality is completely non-rival in consumption. Hence, if all J firms in the economy generate an aggregate amount of externality $\sum_{j=1}^{J} h_j$, every consumer suffers an externality $\sum_{j=1}^{J} h_j$. In the competitive equilibrium, each firm increases its level of hj until the point where πj’(hj*)=0, i.e., marginal benefits from further increases in the externality-generating activity are zero. In contrast, any Pareto optimal allocation involves externality generation levels $(h_1^0, \ldots, h_J^0)$ that solve the social planner’s problem

$$\max_{(h_1, \ldots, h_J) \ge 0} \; \sum_{i=1}^{I} \phi_i\Big( \sum_{j=1}^{J} h_j \Big) + \sum_{j=1}^{J} \pi_j(h_j)$$

Taking FOCs with respect to every hj, we obtain22

$$\sum_{i=1}^{I} \phi_i'\Big( \sum_{j=1}^{J} h_j^0 \Big) + \pi_j'(h_j^0) \le 0, \quad \text{with equality if } h_j^0 > 0$$

which exactly coincides with the optimality conditions for a public good (as shown in condition 11.C.1 in MWG):

$$\sum_{i=1}^{I} \phi_i'(q^0) - c'(q^0) \le 0, \quad \text{with equality if } q^0 > 0$$

where q0 represents the total amount of public good provided at the optimum.

21 Note that the negative of the profit function can be viewed as the firm's cost function of producing the externality.

Therefore, hj* does not necessarily coincide with hj0 and, unlike in the case of depletable externalities analyzed in the previous section, the introduction of a market for the externality will not lead to an optimal outcome. Intuitively, the free-rider problem (common in public good contexts) emerges in non-depletable externalities and, as a consequence, the equilibrium level of the negative externality exceeds its optimal level (overproduction of the negative externality).23
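The gap between the equilibrium and the optimum in the non-depletable case can be illustrated with a minimal sketch. It assumes a hypothetical economy, not taken from these notes, with two firms whose marginal profits are πj’(hj)=aj-b·hj and four identical consumers whose marginal utility from the aggregate externality H is φ’i(H)=-c-d·H. Under these assumptions the planner's first order conditions can be summed over firms to obtain the optimal aggregate level in closed form.

```python
# Assumed primitives (illustrative only).
I = 4                   # number of consumers, each with phi_i'(H) = -c - d*H
a = [10.0, 8.0]         # firm intercepts: pi_j'(h_j) = a[j] - b*h_j
b = 1.0
c, d = 0.5, 0.1
J = len(a)

# Competitive equilibrium: every firm sets pi_j'(h_j*) = 0.
h_star = [a_j / b for a_j in a]
H_star = sum(h_star)

# Planner (interior solution): a_j - b*h_j = I*(c + d*H) for every firm j.
# Summing over j: sum(a) - b*H = J*I*(c + d*H), which can be solved for H.
H_opt = (sum(a) - J * I * c) / (b + J * I * d)
marginal_damage = I * (c + d * H_opt)              # -sum_i phi_i'(H0)
h_opt = [(a_j - marginal_damage) / b for a_j in a]

print(f"equilibrium: h* = {h_star}, aggregate = {H_star:.3f}")
print(f"optimum:     h0 = {[round(h, 3) for h in h_opt]}, aggregate = {H_opt:.3f}")
```

Each firm's optimal level lies below its equilibrium level here because the firm ignores the damage that its emissions impose on all I consumers at once.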

If the regulator possesses adequate information about firms’ profit functions and consumers’ damage from the externality, however, it can achieve optimality using quotas or taxes.

1. Setting quotas. First, if the regulator uses quotas, the optimal externality level can be obtained by setting a quota of $h_1^0$ for firm 1, $h_2^0$ for firm 2, etc.

2. Taxes. If, instead, the regulator uses taxes, the tax th that he must impose per unit of externality generated by every firm j must be

$$t_h = -\sum_{i=1}^{I} \phi_i'\Big( \sum_{j=1}^{J} h_j^0 \Big)$$

Intuitively, the tax must be equal to the marginal cost (disutility) that the externality generates to all consumers in the economy. It is easy to show that this tax induces every firm j to voluntarily choose the optimal externality level hj0. In particular, firm j’s PMP after the tax is imposed becomes

$$\max_{h_j \ge 0} \; \pi_j(h_j) - t_h h_j$$

Taking FOCs with respect to hj, we obtain $\pi_j'(h_j) - t_h \le 0$. Therefore, the value of th that makes this FOC coincide with that of the social planner is $t_h = -\sum_{i=1}^{I} \phi_i'\big( \sum_{j=1}^{J} h_j^0 \big)$. Indeed, in that case the FOC from the firm’s PMP becomes

$$\pi_j'(h_j) + \sum_{i=1}^{I} \phi_i'\Big( \sum_{j=1}^{J} h_j^0 \Big) \le 0, \quad \text{with equality if } h_j > 0$$

which exactly coincides with the FOC at the optimal level of the externality, hj0, that we found above.

3. Tradable externality permits. Regulators might instead use externality permits to solve the externality problem. Every externality permit grants the right to generate one unit of the externality. Suppose that the regulator chooses a number of total permits equal to the socially optimal aggregate externality, h0, i.e., $h^0 = \sum_{j=1}^{J} h_j^0$. In particular, every firm j receives $\bar{h}_j$ permits.24 In addition, assume that there is a sufficiently large number of firms, so that they regard the market price of externality permits as given (i.e., price taking assumption). Specifically, let ph* denote the equilibrium price of these permits. Therefore, every firm j’s PMP now becomes

$$\max_{h_j \ge 0} \; \pi_j(h_j) + p_h^* (\bar{h}_j - h_j)$$

where firm j must pay a price ph* for every permit it needs to buy in excess of its initial endowment $\bar{h}_j$.25 Taking first order conditions with respect to hj, we obtain26

$$\pi_j'(h_j) - p_h^* \le 0, \quad \text{with equality if } h_j > 0$$

In addition, if all J firms are carrying out this PMP, we need the market clearing condition $h^0 = \sum_{j=1}^{J} h_j$. Given the above first order conditions for the J firms and the market clearing condition, efficiency is restored at a permit price of

$$p_h^* = -\sum_{i=1}^{I} \phi_i'(h^0)$$

Indeed, at this price firm j’s FOC becomes

$$\pi_j'(h_j) + \sum_{i=1}^{I} \phi_i'(h^0) \le 0, \quad \text{with equality if } h_j > 0$$

which exactly coincides with the FOC that solves the social planner’s problem. Therefore, every firm j is induced to voluntarily choose the optimal externality level hj=hj0. Interestingly, the advantage of tradable externality permits, relative to other policy instruments such as quotas or taxes, is that government officials do not need as much information. In particular, they only need data about the optimal level of pollution, h0. This simply implies having information about aggregate firms’ profits (industry profits) and about consumers’ damage from the externality in aggregate terms, but not necessarily about individual firms’ profit functions or individual consumers’ damage functions.27

22 Second-order conditions are also satisfied since $\sum_{i=1}^{I} \phi_i''\big( \sum_{j=1}^{J} h_j^0 \big) + \pi_j''(h_j^0) < 0$.

23 Worked-out example 19.3 in NS illustrates the free-rider problem.

24 The particular procedure by which externality permits are assigned to firms is not explicitly described here, but it could be done according to every firm's history of emissions, using an auction, etc. For a discussion of different assignments of permits, see Kolstad.

25 Note that if the firm sells permits (because it does not need to use all of its initial $\bar{h}_j$ permits) profits increase, while if the firm needs to buy further permits (beyond $\bar{h}_j$) profits decrease.

26 Note that second-order conditions are also satisfied since π’’j(hj)<0 by definition.
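The tax and the permit market can be compared in a short sketch under the same kind of hypothetical economy used for illustration purposes only: two firms with marginal profits πj’(hj)=aj-b·hj and four identical consumers with φ’i(H)=-c-d·H, none of which comes from these notes. The function names `firm_choice` and `permit_price` are assumptions of this sketch. Since a firm's initial endowment of permits only shifts its profits by a lump sum, it does not enter the firm's first order condition and is omitted below.

```python
# Assumed primitives (illustrative only).
I = 4
a = [10.0, 8.0]         # pi_j'(h_j) = a[j] - b*h_j
b = 1.0
c, d = 0.5, 0.1         # phi_i'(H) = -c - d*H
J = len(a)

# Optimal aggregate externality (interior solution of the planner's problem).
H_opt = (sum(a) - J * I * c) / (b + J * I * d)

# Tax equal to the aggregate marginal damage at the optimum.
t_h = I * (c + d * H_opt)


def firm_choice(a_j, price):
    """Firm j's choice when each unit of the externality costs `price`: pi_j'(h_j) = price, or h_j = 0."""
    return max(0.0, (a_j - price) / b)


h_taxed = [firm_choice(a_j, t_h) for a_j in a]


def permit_price(total_permits, lo=0.0, hi=max(a), tol=1e-10):
    """Bisection on the price at which firms' total permit use equals the number of permits issued."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sum(firm_choice(a_j, mid) for a_j in a) > total_permits:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


p_h = permit_price(H_opt)
h_permits = [firm_choice(a_j, p_h) for a_j in a]

print(f"tax t_h = {t_h:.4f}, choices under the tax = {[round(h, 3) for h in h_taxed]}")
print(f"permit price = {p_h:.4f}, choices under permits = {[round(h, 3) for h in h_permits]}")
```

In this assumed economy the permit price that clears the market coincides with the tax, although the regulator only had to fix the total number of permits and did not need to compute firm-specific levels.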

Public goods

A good is a (pure) public good if, once produced, no one can be excluded from benefiting from its availability and if the good is non-rival, i.e., the marginal cost of an additional consumer is zero. Therefore, public goods are characterized by two properties: non-rivalry and non-excludability. First, non-rivalry implies that the consumption of the good by one individual does not reduce the quantity available for consumption to other individuals; put differently, a good is non-rival if consumption of additional units of the good involves zero social marginal costs of production. Second, non-excludability means that if the good is provided, no consumer can be excluded from consuming it (or, more precisely, the cost of excluding consumers from enjoying the good is extremely high). Examples of public goods include national defense, mosquito control, public parks, television and radio signals, and artwork in public places. The following matrix represents a taxonomy of four different types of goods:

               | Rivalrous                | Non-rivalrous
Excludable     | Private good             | Club good
Non-excludable | Common property resource | Public good

1. Private goods, e.g., an apple. These goods are rival in consumption –since the consumption of the good by one individual reduces the amount available to other individuals— and excludable in consumption, given that it is easy to exclude an individual who did not pay for the good;

2. Club goods, e.g., a golf course. These goods are non-rival in consumption (since the consumption of the good by one individual does not reduce the amount available to other individuals28) but excludable in consumption, given that it is easy to exclude an individual who did not pay for the good (e.g., by charging an entry fee);

3. Common property resources, e.g., fishing grounds. These goods are rival in consumption –given that the consumption of the good by one individual (e.g., fishery) reduces the amount of the good available to other individuals (to other fisheries in the same area)— but non-excludable, since the costs of excluding additional vessels would be extremely high.

4. Public goods, e.g., national defense. These goods are both non-rival and non-excludable, as described in our previous discussion.
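The taxonomy above can be sketched as a small lookup on the two properties. This is only an illustration: the function name `classify_good` and the dictionary of examples are hypothetical, and the examples are the ones used in the list above.

```python
def classify_good(rival: bool, excludable: bool) -> str:
    """Return the category of a good in the two-by-two taxonomy."""
    if rival and excludable:
        return "private good"
    if not rival and excludable:
        return "club good"
    if rival and not excludable:
        return "common property resource"
    return "public good"  # non-rival and non-excludable


# Examples taken from the list above: (rival, excludable)
examples = {
    "apple": (True, True),
    "golf course": (False, True),
    "fishing grounds": (True, False),
    "national defense": (False, False),
}

for name, (rival, excludable) in examples.items():
    print(f"{name}: {classify_good(rival, excludable)}")
```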

Consider I consumers, one public good x and L traded private goods. Every consumer i’s utility from the consumption of x units of the public good is $\phi_i(x)$. Note that x does not have a subscript because of non-excludability, i.e., the total amount of public good in the economy, x, is enjoyed not only by

27 This is a very active area of research, with models analyzing, for instance, how to design the initial distribution of permits, what are the consequences of having a dominant firm in the industry that holds monopolistic power in their purchases of externality permits, etc.

28 This property, of course, assumes that the number of users is sufficiently low so that no congestion effects emerge, which would reduce the utility of previous users.



individual i but also by all other individuals. We consider the case of a public good, where $\phi_i'(x) > 0$ for every individual i.29 In addition, assume that $\phi_i''(x) < 0$, which intuitively implies a decreasing marginal utility from additional units of the public good. The following figure illustrates the marginal benefit from the public good for individual i.

Figure 8.13

On the other hand, the cost of supplying q units of the public good is c(q), where c’(q)>0 and c’’(q)>0 for all q, i.e., costs of providing the public good are convex in q. The following figure depicts the cost function.30

Figure 8.14

29 Note that a “public bad” would imply $\phi_i'(x) < 0$ for every i.

30 Note that if we were describing a public bad, such as pollution, we would need c’(q)<0 since reducing q is costly, but increasing q is not costly.



Let us first find the Pareto optimal allocation. In particular the social planner maximizes aggregate surplus, as follows


$$\max_{q \geq 0} \; \sum_{i=1}^{I} \phi_i(q) - c(q)$$



taking first order conditions with respect to q, we obtain

$$\sum_{i=1}^{I} \phi_i'(q^0) - c'(q^0) \leq 0, \quad \text{with equality if } q^0 > 0$$

and the second order conditions are also satisfied, since

$$\sum_{i=1}^{I} \phi_i''(q^0) - c''(q^0) \leq 0$$

In the case of an interior solution, the above first order condition establishes that the optimal level of the public good is the level $q^0$ such that

$$\sum_{i=1}^{I} \phi_i'(q^0) = c'(q^0)$$


Intuitively, this condition implies that the social planner should increase the provision of a public good until the point at which the sum of the consumers’ marginal benefits from increasing the public good by one more unit (also referred to as the marginal social benefit) is equal to its marginal cost. This condition is commonly referred to as the Samuelson rule. Importantly, the Pareto optimality condition for public goods does not coincide with that of private goods where, for interior solutions, every individual i increases his consumption of the private good until his marginal benefit is equal to the marginal cost, that is

$$\phi_i'(q_i^*) = c_j'(q_j^*)$$
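As a minimal numerical sketch of the Samuelson rule, the following code solves the interior first order condition for the Pareto optimal level of the public good. The functional forms $\phi_i(q) = a_i \ln(1+q)$ and $c(q) = q^2/2$, the parameter values in `a`, and the helper name `solve_decreasing` are all assumptions made here for illustration; they do not come from the notes. They satisfy $\phi_i' > 0$, $\phi_i'' < 0$ and $c'' > 0$ as required above.

```python
import math

# Hypothetical functional forms, chosen only for illustration:
#   phi_i(q) = a_i * ln(1 + q)  ->  phi_i'(q) = a_i / (1 + q) > 0, phi_i''(q) < 0
#   c(q)     = q**2 / 2         ->  c'(q) = q, c''(q) = 1 > 0
a = [1.0, 2.0, 3.0]  # assumed preference parameters of I = 3 consumers


def marginal_social_benefit(q):
    """Sum of all consumers' marginal benefits at quantity q."""
    return sum(a_i / (1.0 + q) for a_i in a)


def marginal_cost(q):
    return q


def solve_decreasing(f, lo=0.0, hi=100.0, tol=1e-10):
    """Bisection root of a strictly decreasing f with f(lo) > 0 > f(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Samuelson rule: sum_i phi_i'(q0) = c'(q0)
q_opt = solve_decreasing(lambda q: marginal_social_benefit(q) - marginal_cost(q))

# With these forms the condition reduces to q**2 + q - sum(a) = 0.
q_closed = (-1.0 + math.sqrt(1.0 + 4.0 * sum(a))) / 2.0

print(q_opt, q_closed)
```

Under these assumed forms the numerical root and the closed-form root coincide, which is a quick check that the first order condition was set up correctly.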

Inefficiency of private provision of public goods

Let us next show that the creation of a market in which every individual purchases amounts of the public good does not eliminate the divergence between the Pareto optimal and the equilibrium amount of the public good. In particular, let us consider the case in which a market exists for the public good and each consumer chooses how much of the public good to buy, denoted as $x_i \geq 0$ units, taking as given a market price of p. The total amount of the public good purchased by all I individuals is hence31

$$x = \sum_{i=1}^{I} x_i$$

Consider a single producer of the public good (e.g., the federal government) with a cost function c(q).32

31 At this point it is important to start thinking intuitively about the incentives of every consumer in this model: if you knew that the amounts of the public good purchased by all other individuals in the society are non-rival (i.e., you can benefit from them), how many units of the public good would you buy?

32 We could change this assumption in order to consider J firms producing the public good, with an aggregate cost function for the entire industry that exactly coincides with c(q). [Note that we can do this because of the price-taking assumption, as we did in perfectly competitive markets.]



Formally, at the competitive equilibrium price $p^*$, each consumer i’s purchase of the public good, $x_i^*$, must solve

$$\max_{x_i \geq 0} \; \phi_i\Big(x_i + \sum_{k \neq i} x_k^*\Big) - p^* x_i$$

Note that, when determining his purchases of the public good, individual i takes the purchases of all the other individuals as given, $\sum_{k \neq i} x_k^*$, and these purchases enter into his utility function because of the non-excludability assumption. In this regard, other individuals’ purchases are a form of positive externality. Finally, note that consumer i pays $p^* x_i$ when acquiring $x_i$ units of the public good. Taking first order conditions with respect to $x_i$, we obtain



$$\phi_i'\Big(x_i^* + \sum_{k \neq i} x_k^*\Big) - p^* \leq 0, \quad \text{with equality if } x_i^* > 0$$

For compactness, let $x^*$ denote the total purchases of the public good, so that $x^* = x_i^* + \sum_{k \neq i} x_k^*$. Hence,

$$\phi_i'(x^*) - p^* \leq 0, \quad \text{with equality if } x_i^* > 0$$

On the other hand the firm producing the public good must solve the PMP,

$$\max_{q \geq 0} \; p^* q - c(q)$$


and taking first order conditions with respect to q, we obtain

$$p^* - c'(q^*) \leq 0, \quad \text{with equality if } q^* > 0$$

Finally the market clearing condition implies that the total amount of the public goods produced coincides with the amount consumed by all individuals q*=x*. Combining the first order conditions for consumers and the firm, we obtain

$$\phi_i'(q^*) = c'(q^*), \quad \text{if } q^* > 0, \text{ and}$$

$$\phi_i'(q^*) < c'(q^*), \quad \text{if } q^* = 0$$

The following figure illustrates the above expression for the case of interior solutions. Intuitively individual i increases his consumption of the public good until the point in which his marginal benefit from the public good equals the marginal cost.



Figure 8.15

If, in contrast, only a corner solution exists, the marginal cost of providing the first unit of the public good is higher than the marginal benefit that individual i would obtain from such unit, as the next figure depicts.



Figure 8.16

Recall that at the Pareto optimal allocation we must have $\sum_{i=1}^{I} \phi_i'(q^0) = c'(q^0)$. Graphically, this implies a vertical summation of the marginal benefits that all individuals obtain from the public good.33 This result is graphically represented in the following figure, which shows that there is an underprovision of the public good relative to the optimal allocation.

33 This is unlike the case of private goods, where in order to obtain aggregate demand we conducted a horizontal sum of individual demands. In that case we found, for a given price p, how many units were demanded by all consumers in the economy. In the case of public goods, in contrast, we find, for a given amount of the public good q, the marginal social benefit that all individuals in the economy obtain.
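The contrast between vertical and horizontal summation can be sketched numerically. The marginal benefit function $\phi_i'(q) = a_i/(1+q)$ and the values in `a` are assumptions made for this illustration only; the function names are hypothetical.

```python
# Assumed marginal benefit: phi_i'(q) = a_i / (1 + q)  (illustration only)
a = [1.0, 2.0, 3.0]


def vertical_sum(q):
    """Public good: for a given quantity q, add marginal benefits across consumers."""
    return sum(a_i / (1.0 + q) for a_i in a)


def horizontal_sum(p):
    """Private good: for a given price p, add the quantities demanded.

    Each consumer's demand solves phi_i'(x_i) = p, i.e. x_i = a_i / p - 1,
    truncated at zero when the marginal benefit of the first unit is below p.
    """
    return sum(max(a_i / p - 1.0, 0.0) for a_i in a)


print("marginal social benefit at q = 2:", vertical_sum(2.0))
print("aggregate private demand at p = 1:", horizontal_sum(1.0))
```

The first function fixes the quantity and returns a willingness to pay; the second fixes the price and returns a quantity, which is the operation used for private goods.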



Figure 8.17

Intuitively, individual i’s purchases of the public good benefit not only him but also all other individuals. In other words, no individual has sufficient incentives to purchase additional amounts of the public good, leading to the standard free rider problem.
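The underprovision result can be illustrated with a small numerical sketch. Everything specific here is an assumption for illustration: the forms $\phi_i(x) = a_i \ln(1+x)$ and $c(q) = q^2/2$, the values in `a`, and the helper `bisect_decreasing`. The sketch also assumes that, with heterogeneous consumers, only the consumer with the largest $a_i$ buys a positive amount, and then checks that the other consumers' first order conditions hold with strict inequality at the resulting price.

```python
# Hypothetical functional forms (illustration only, not from the notes):
#   phi_i(x) = a_i * ln(1 + x), so phi_i'(x) = a_i / (1 + x)
#   c(q)     = q**2 / 2,        so c'(q) = q
a = [1.0, 2.0, 3.0]


def bisect_decreasing(f, lo=0.0, hi=100.0, tol=1e-10):
    """Bisection root of a strictly decreasing f with f(lo) > 0 > f(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Pareto optimum (Samuelson rule): sum_i phi_i'(q0) = c'(q0)
q_opt = bisect_decreasing(lambda q: sum(a) / (1.0 + q) - q)

# Private provision: phi_i'(x*) <= p* for every i, with equality for buyers,
# together with p* = c'(q*) and q* = x*. Assumption: only the consumer with
# the largest a_i buys, so max_i phi_i'(q*) = c'(q*).
q_mkt = bisect_decreasing(lambda q: max(a) / (1.0 + q) - q)
p_mkt = q_mkt  # equilibrium price equals marginal cost, c'(q*) = q*

# Consumers whose marginal benefit at q* is strictly below the price buy
# nothing and free ride on the buyer's purchase.
free_riders = [i for i, a_i in enumerate(a) if a_i / (1.0 + q_mkt) < p_mkt - 1e-9]

print("Pareto optimal level:", q_opt)
print("private provision level:", q_mkt)
print("free riders (indices):", free_riders)
```

Under these assumed forms the privately provided amount is strictly below the Pareto optimal amount, in line with the free rider argument above.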

Not included in these lecture notes:

1. Environmental policy under incomplete information, 2. Groves-Clarke mechanism applied to environmental policy, 3. Oligopoly models (an introduction).
