
The Value of Information

in Monotone Decision Problems∗

Susan Athey

MIT and NBER

Jonathan Levin

Stanford University

First Draft: September 1997

This Draft: February 2001

Abstract

This paper studies decision problems under uncertainty where a decision-maker observes an

imperfect signal about the true state of the world. We analyze the information preferences

and information demand of such decision-makers, based on properties of their payoff functions.

We restrict attention to monotone decision problems, whereby the posterior beliefs induced

by the signal can be ordered so that higher actions are chosen in response to higher signal

realizations. Monotone decision problems are frequently encountered in economic modeling. We

provide necessary and sufficient conditions for all decision makers with different classes of payoff

functions to prefer one information structure to another. We also provide conditions under

which two decision-makers in a given class can be ranked in terms of their marginal value for

information and hence information demand. Applications and examples are given.

JEL Classification: C44, C60, D81.

Keywords: Bayesian decision problems, value of information, stochastic dominance, stochastic orderings, decision-making under uncertainty.

∗We would like to thank Scott Ashworth, Kyle Bagwell, Dirk Bergemann, Glenn Ellison, John Geanakoplos, Bengt

Holmström, Eric Lehmann, Richard Levin, Meg Meyer, Stavros Panageas, Ben Polak, and Lones Smith, as well as

seminar participants at Michigan, MIT, Northwestern, UCLA, Stanford, Yale, the 1998 Summer Meetings of the

Econometric Society, and the Pacific Institute for Mathematical Sciences for valuable discussions. We have both

benefited from the hospitality of the Cowles Foundation at Yale University. Athey acknowledges the support of NSF

Grant SBR-9631760.

1 Introduction

In many economic models, a decision-maker faces uncertainty about her marginal returns to some

action and obtains information about these returns. A common and useful practice in such models

is to assume that the decision problem has an order structure, in the sense that the potential beliefs

the agent could arrive at can be ranked from less to more optimistic, where more optimistic

beliefs are beliefs that induce a higher action. Examples of such monotone decision problems arise

in many contexts, including problems of production under uncertainty about marginal costs or

about demand elasticity, financial and capital investment, auctions, contracting, adverse selection,

coordination under uncertainty, and search.

In this paper, we provide definitions of "more information" that are tailored to different classes

of monotone decision problems. Any agent faced with such a problem will prefer one information

structure to another if and only if they can be ranked according to our conditions. We also

provide conditions under which the incentives of two agents to acquire better information in such

an environment can be ranked. For example, we show that a monopolist has a lower demand for

marginal cost information than the social planner, and we analyze under-investment in information

gathering for delegated decision-making problems.

The stochastic environment we consider is composed of two real-valued random variables: an

unknown state of the world W and a signal X. The decision-maker has a prior belief about the

distribution of W . The decision-maker observes the realization of X, say x, before choosing her

action, and her payoff u(ω, a) depends on her action a and ω, the realization of W . This decision

problem is monotone if observing a higher signal realization induces a higher action.1

A family of decision-makers is defined by (i) a set of possible priors, and (ii) a set of possible

payoff functions. We consider sets of payoff functions that are alike in how the incremental returns

to higher actions change with ω (e.g. the set of supermodular payoff functions, for which the returns

to increasing the action are nondecreasing in ω). A signal X leads to a monotone decision problem

for each agent in the family if the posterior beliefs about W induced by observing X can be ranked

in an appropriate stochastic order. The stochastic order is chosen so that higher posterior beliefs

(in the given order) induce all agents in the family to choose higher actions. For the family of

supermodular payoff functions with a given prior, the restriction is satisfied if the posterior beliefs

induced by X can be ordered by first-order stochastic dominance (FOSD).

1 Karlin and Rubin (1956) introduced the term monotone decision problem to describe decision problems where

the payoff function has a single-crossing property and the density of the signal conditional on the state of the world

satisfies the monotone likelihood ratio property (see also Lehmann, 1988). In this paper, we use the term more

broadly, to describe any decision problem where higher signal realizations induce higher actions.


Now, consider two signals, X and X′, that lead to monotone decision problems for each agent

in a given family of decision-makers. We look for conditions under which, for all decision-makers

in the family, observing X′ is more valuable than observing X. Our answer, stated roughly, is that

X′ is more informative than X for these decision-makers if on average the posteriors induced

by high realizations of X′ are higher than the posteriors induced by high realizations of X, and

conversely for low realizations. Here, posteriors are higher or lower in the appropriate stochastic

order, e.g. FOSD for the family with supermodular payoff functions. Intuitively, better information

allows for a more accurate match between beliefs and actions.

Our second result concerns relative demands for information. Extending the techniques from

our first result, we provide conditions under which two payoff functions u and v can be ranked

in terms of their marginal value for better information, and show how this result can be used to

obtain comparative statics results concerning the demand for information.

Our results extend a line of inquiry begun by Lehmann (1988) and continued by Persico (2000),

who studied a specific class of monotone decision problems introduced by Karlin and Rubin (1956).

In such problems, the payoff function has a single-crossing property and posterior beliefs have the

monotone likelihood ratio property. Lehmann (1988) characterized an informativeness ordering

for these problems, while Persico (2000) identified when one payoff function has a higher marginal

value for information than another. Our result on information demand generalizes Persico's to many

other sets of payoff functions. Our methods differ from Lehmann (1988), who exploited the sign-

preservation properties of distributions with the monotone likelihood ratio property. Nevertheless,

our results are related, and we derive his ordering as a by-product of our analysis. In Section

5, we show a close connection between his ordering for single crossing decision problems and our

informativeness condition for supermodular problems. These two classes of problems often play a

central role in information-theoretic modeling.

More broadly, our analysis builds on the classic work of Blackwell (1951, 1953). Blackwell

introduced a partial order over signals, showing that a signal X′ is more valuable than a signal

X for every decision problem if and only if X′ is statistically sufficient for X. Our problem is

different in that we seek notions of valuable information that are tailored to specific types of

economic contexts, rather than a general statistical property that holds across all environments. In

addition to showing how a problem's structure can be incorporated to order signals in a manner less

restrictive than sufficiency, our results can be interpreted as establishing consequences of statistical

sufficiency for specific classes of decision problems, consequences that may be useful for further

analysis (e.g. comparative statics on information acquisition or market equilibria).


2 Monotone Decision Problems

2.1 The Set-Up

The stochastic environment is composed of some unknown state of the world W , with typical

realization ω ∈ Ω ⊂ ℝ, and a signal X with typical realization x ∈ 𝒳 ⊂ ℝ. Given a prior H ∈ ∆(Ω), the distribution of the signal induces a joint distribution over signals and states, F : Ω × 𝒳 → [0, 1],

referred to as the information structure. Let F_X(·|ω) be the signal distribution conditional on W = ω, and let F_X(·) be the marginal distribution of the signal, F_X(x) = E_W[F_X(x|W)], computed using the prior distribution on W. Let F_W(·|x) denote the conditional distribution of W given

X = x, that is, the agent's posterior belief after observing a realization X = x. Of course, posterior

beliefs must be consistent with the prior: for all ω ∈ Ω, E_X[F_W(ω|X)] = H(ω). We use f, f_W, and f_X to denote the corresponding probability mass functions or densities.

After observing the signal realization, the decision-maker chooses an action a ∈ A, where A is a compact subset of ℝ. She has a payoff function u : Ω × A → ℝ. We assume throughout that payoff functions are continuous in a. The ex ante value of the decision problem ⟨F, u⟩ is:

$$V(F, u) = E_X\left[\max_{a \in A} \int_{\Omega} u(\omega, a)\, dF_W(\omega \mid X)\right]. \tag{1}$$

We use α∗(x) to denote an optimal decision rule that achieves this ex ante value.
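As a hedged illustration of equation (1), the following Python sketch computes the ex ante value when states and signals are finite. The function name `ex_ante_value`, the two-state example, and the payoff u(ω, a) = aω − a²/2 are assumptions chosen for illustration and are not taken from the paper.

```python
def ex_ante_value(joint, states, actions, u):
    """Finite-support version of V(F, u) in equation (1).

    joint[i][j] = Pr(W = states[i], X = x_j); u(w, a) is the payoff function.
    """
    value = 0.0
    for j in range(len(joint[0])):
        # Pr(X = x_j) * max_a E[u(W, a) | X = x_j], computed from joint masses.
        value += max(
            sum(u(w, a) * joint[i][j] for i, w in enumerate(states))
            for a in actions
        )
    return value


# Hypothetical example: two equally likely states, three actions,
# and the supermodular payoff u(w, a) = a*w - a^2/2.
states, actions = [0.0, 1.0], [0.0, 0.5, 1.0]
payoff = lambda w, a: a * w - 0.5 * a * a

uninformative = [[0.25, 0.25], [0.25, 0.25]]
noisy = [[0.4, 0.1], [0.1, 0.4]]
perfect = [[0.5, 0.0], [0.0, 0.5]]

values = [ex_ante_value(f, states, actions, payoff)
          for f in (uninformative, noisy, perfect)]
```

In this toy example the three information structures share the same prior, and the computed values rise as the signal becomes more tightly correlated with the state.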

A family of decision-makers is defined by a pair of sets (Λ, U), where Λ ⊆ ∆(Ω), and U is a set of payoff functions u : Ω × A → ℝ. Each member of the family has a prior belief H ∈ Λ and a payoff function u ∈ U. We ask when a signal X′ is preferred to another signal X for all decision-makers

in a given family (Λ, U). Observe that expanding either the set of allowable priors or the set of

allowable payoff functions will in general lead to a more restrictive ordering on signals. For this

reason, we initially take the set of prior beliefs to be an arbitrary singleton, {H}, and analyze orderings

over signals taking the prior as given. This allows us to focus on the effects of considering different

sets of payoff functions. It is also natural in economic contexts where agents may have objective

information about the prior distribution (e.g., the distribution of worker abilities in a population

is known). In Section 5.2, we analyze conditions under which enlarging the set of priors leads to

more restrictive orderings.

2.2 Monotone Decision Problems

A decision problem is monotone if it has an optimal decision rule α∗(x) that is monotone. We

first characterize monotonicity in terms of a relationship between the payoff function's incremental

return to higher action and posterior beliefs. The next section provides examples.


Let ℛ = {g : Ω → ℝ, g bounded, measurable}. We can then define:

Definition 1 Given R ⊂ ℛ, a payoff function u has R-incremental returns if for any a′ > a, the incremental return function r(ω) = u(ω, a′) − u(ω, a) ∈ R.

Thus, any set R ⊂ ℛ defines a set of payoff functions, letting U^R be the set of payoff functions

with R-incremental returns. As an example, suppose R is the set of nondecreasing functions. Then

U^R is the set of supermodular payoff functions u(ω, a) (subject to the additional restriction that

for all a′ > a, u(·, a′) − u(·, a) ∈ ℛ).

In addition to defining a set of payoff functions, a set R ⊂ ℛ induces a stochastic order on ∆(Ω)

(e.g. on posterior beliefs) as follows. For P, Q ∈ ∆(Ω), write Q ≻_R P if

$$\forall r \in R : \quad \int_{\Omega} r(\omega)\, dP(\omega) \ge 0 \;\Rightarrow\; \int_{\Omega} r(\omega)\, dQ(\omega) \ge 0.$$

This stochastic order, ≻_R, represents a notion of single crossing (where we say that a function g : ℝ → ℝ satisfies single crossing if g(x) ≥ 0 implies g(x′) ≥ 0 for all x′ > x).2 In principle, ≻_R is weaker than standard stochastic dominance, which requires that for all r ∈ R, ∫ r dQ ≥ ∫ r dP. However, for many cases of interest, the two are equivalent (Lemma 2, below). For instance, when

R is the set of nondecreasing functions, Q ≻_R P means exactly that Q dominates P in the sense of FOSD. Our next definition uses this order.

Definition 2 Given R ⊂ ℛ, an information structure F is R-ordered if ≻_R is a complete order on {F_W(·|x)}_{x∈𝒳}, that is, for any x′ > x, F_W(·|x′) ≻_R F_W(·|x).
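For the special case in which R is the set of nondecreasing functions, so that the induced order is FOSD, the R-ordered property can be checked directly on a finite joint distribution. The sketch below is only an illustration under that assumption; the helper names `posterior_cdfs` and `is_fosd_ordered` and the example matrices are hypothetical.

```python
def posterior_cdfs(joint):
    """Return cdfs[j][i] = Pr(W <= w_i | X = x_j).

    joint[i][j] = Pr(W = w_i, X = x_j), with states and signals both listed
    in increasing order and every signal having positive probability.
    """
    n_states, n_signals = len(joint), len(joint[0])
    cdfs = []
    for j in range(n_signals):
        mass = sum(joint[i][j] for i in range(n_states))
        if mass <= 0:
            raise ValueError("each signal realization needs positive probability")
        running, cdf = 0.0, []
        for i in range(n_states):
            running += joint[i][j]
            cdf.append(running / mass)
        cdfs.append(cdf)
    return cdfs


def is_fosd_ordered(joint, tol=1e-12):
    """True if posteriors after higher signals FOSD-dominate those after lower ones."""
    cdfs = posterior_cdfs(joint)
    # FOSD is transitive, so comparing adjacent signal realizations suffices:
    # the higher signal must have a pointwise (weakly) lower posterior CDF.
    return all(
        hi <= lo + tol
        for lower, higher in zip(cdfs, cdfs[1:])
        for lo, hi in zip(lower, higher)
    )


ordered = is_fosd_ordered([[0.4, 0.1], [0.1, 0.4]])
reversed_order = is_fosd_ordered([[0.1, 0.4], [0.4, 0.1]])
```

In the first matrix the high signal shifts mass toward the high state, so the check passes; relabeling the signals in the second matrix reverses the order and the check fails.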

Our first Lemma shows that if a decision-maker with payoff function u ∈ U^R is faced with an R-ordered information structure, she has an optimal decision rule that is monotone.

Lemma 1 Given R ⊂ ℛ, if u has R-incremental returns, and F is R-ordered, then there exists a

function α∗(x) ∈ arg max_{a∈A} ∫_Ω u(ω, a) dF_W(ω|x) that is nondecreasing in x.

Proof. Define U(x, a) = ∫_Ω u(ω, a) dF_W(ω|x). Then if a′ > a, by the R-order, U(x, a′) − U(x, a) ≥ 0

implies U(x′, a′) − U(x′, a) ≥ 0 for any x′ > x. This implies Shannon's (1995) weak single crossing property, from which the result follows. Q.E.D.

A family of decision-makers (Λ, U^R) induces a class of monotone decision problems, where each

decision problem is defined by a decision-maker (H, u) ∈ (Λ, U^R) and a signal X, and each decision problem is monotone. Until Section 5.2, we fix Λ = {H} and so refer to a family of decision-makers by the set of payoff functions U^R. Then, the induced class of monotone decision problems admits

all signals X where the corresponding information structure F is R-ordered.

2 There are a variety of alternative (i.e., weak and strong) notions of single crossing; see Milgrom and Shannon (1994)

or Shannon (1995).


2.2.1 A Characterization Lemma

In what follows, it is essential to have a clean characterization of the stochastic order introduced

above. To obtain this, we introduce a condition on the sets R ⊂ ℛ that we use to define sets of payoff functions. Although it places some structure on the sets of payoff functions we might

consider, it does not restrict the analysis significantly in terms of examples of interest.

Condition 1 (i) R is a closed convex cone; (ii) there exists some r : Ω → ℝ such that r, −r ∈ R, and either (a) r > 0 on a set of positive Lebesgue measure, or (b) r = 1_{ω0} for some ω0 ∈ Ω, and further for some N > 0, r_n^+ = 1_{[ω0, ω0+1/n)} ∈ R and r_n^− = −1_{(ω0−1/n, ω0]} ∈ R for all n > N.3

If R is the set of nondecreasing functions (or, for instance, the set of concave functions), Condition 1 is satisfied with r, −r as the constant functions: r ≡ 1, −r ≡ −1. Part (ii)(b) is useful for analyzing payoff functions that satisfy single crossing (such as portfolio problems), as discussed in

the next section. Condition 1 implies that if u ∈ U^R, then adding a benefit K_a r(ω) to each action a ∈ A results in a new payoff function that is also in U^R.

When r = 1_{ω0}, it will be helpful to refer to a second condition:

Condition 2 Given P ∈ ∆(Ω) and a set R satisfying Condition 1, if r = 1_{ω0}, then either (a) ω0

is a mass point of P, or else (b) the density p is continuous at ω0.

Throughout, we abuse notation by writing E_W[r(W)] in place of lim_{n→∞} n ∫ 1_{[ω0, ω0+1/n)} dP = p(ω0) when part (b) of Condition 2 is satisfied.4 We now observe that when Conditions 1 and 2

are satisfied, the stochastic order ≻_R admits an alternative representation.5

Lemma 2 Given R ⊂ ℛ and P, Q ∈ ∆(Ω), such that (R, P) and (R, Q) satisfy Conditions 1 and 2, then Q ≻_R P if and only if

$$\forall r \in R, \quad \int_{\Omega} r\, dQ \ge \lambda \int_{\Omega} r\, dP \tag{2}$$

for some λ ≥ 0, where λ = (∫ r dQ) / (∫ r dP) if ∫ r dP > 0.

Proof. Suppose Q ≻_R P, and consider the problem of choosing r ∈ R to minimize ∫ r dQ subject to the constraint that ∫ r dP ≥ 0. The minimized value must be nonnegative. Moreover, there is some λ ≥ 0 such that this linear program is equivalent to choosing r ∈ R to minimize ∫ r dQ − λ ∫ r dP. Thus, Q ≻_R P implies ∫ r dQ ≥ λ ∫ r dP for all r ∈ R. But if r, −r ∈ R (and using Condition 2 (ii)(b) if needed), we must have λ = (∫ r dQ) / (∫ r dP). The other direction is immediate. Q.E.D.

3 This condition can be generalized to allow for other functions r; for example, we could require that there exist two sequences of elements of R, {r_n^+} and {r_n^−}, where each element has positive Lebesgue measure and further, r = lim_{n→∞} r_n^+ = − lim_{n→∞} r_n^−. We focus on the more specific condition for simplicity.

4 Although the sequence n 1_{[ω0, ω0+1/n)} is unbounded, we do not require each member of the sequence to be in R; each time this construction is used, we will be working with inequalities of the form ∫ r dQ ≥ λ ∫ r dP, so that we can multiply both sides by n.

5 This approach borrows from Jewitt (1986), Gollier and Kimball (1995), and Athey (1998a).

Note that if R contains the constant functions (so we can choose r ≡ 1), and Q, P are probability distributions, then λ = 1, and ≻_R coincides with standard stochastic dominance.

2.3 Examples

We now describe some sets of payoff functions, and the induced classes of monotone decision

problems, captured by our framework.

1. Payoff functions with nondecreasing incremental returns. A payoff function u(ω, a) with non-

decreasing incremental returns is supermodular in (ω, a). The property that the marginal returns

to action are nondecreasing in some unknown variable is a common feature of economic problems.

To see two simple examples, consider an oligopolistic firm choosing an output plan subject to

uncertainty about marginal cost reductions; or, consider a competitive firm investing in process

innovation to lower marginal cost, when the market-clearing price is unknown. Supermodularity

also arises in research and development problems, welfare economics, and coordination games and

search models (see Milgrom and Roberts (1990), and the recent books by Topkis (1998) and Cooper

(1999) for many examples). Often in such problems, the marginal cost of acting is a parameter

(to allow for varying taxes, subsidies, interest rates, production technologies, etc.); thus, payoff

functions of the form v(ω, a) − c(a) are considered for a rich family of functions c.

To derive the stochastic order for the set of nondecreasing functions R^ND, take r = 1, and

observe that in Lemma 2 λ = (∫ dQ) / (∫ dP) = 1. Thus, ≻_{R^ND} is FOSD, and so if posterior beliefs

are ordered by FOSD, higher signal realizations will lead to higher actions when u is supermodular.

2. Payoff functions with concave incremental returns. Various investment problems with risk-

aversion have the property that the incremental returns to higher investment are concave in the

unknown asset returns. For instance, in the classic portfolio problem, a fixed endowment e must

be allocated between a risky asset (e.g. a market portfolio) with return W and a safe asset with

known return ω0. If the investor has a utility function v over wealth, and invests a in the risky

asset, then u(ω, a) = v(aω + (e − a)ω0). Her payoff function has concave incremental returns as a function of ω if v has constant or increasing relative risk-aversion.

Let R^CV denote concave functions. Applying Lemma 2 (where r ≡ 1 again implies that λ = 1), ≻_{R^CV} coincides with second order stochastic dominance (SOSD) (Rothschild and Stiglitz, 1970). Thus, the investor described above will invest more in the risky asset when she believes asset returns

to be less risky (by SOSD).


3. Payoff functions with incremental returns that cross zero at ω = ω0. Payoff functions

with this property arise in investment theory (Athey, 1998a). In the standard portfolio problem

described in the previous example, the incremental returns to investing are negative for ω < ω0,

and positive for ω > ω0. Formally, we generate a general set of payoff functions with this property

by considering R^SC(ω0) = {r : Ω → ℝ : r(ω) ≤ 0 if ω < ω0 and r(ω) ≥ 0 if ω > ω0}.

The stochastic order, ≻_{R^SC(ω0)}, for this set is a weakening of the monotone likelihood ratio

property (MLRP) (where f_W has the MLRP if for x′ > x, f_W(·|x′)/f_W(·|x) is a nondecreasing function). To apply Lemma 2, we take r = 1_{ω=ω0} and use Condition 2. Then, if f_W(ω0|x) > 0 for all x, we find that F_W(·|x′) ≻_R F_W(·|x) if f_W(·|x′) − [f_W(ω0|x′)/f_W(ω0|x)] f_W(·|x) is single crossing (or, more intuitively, if f_W(·|x) is positive on Ω, f_W(·|x′)/f_W(·|x) − f_W(ω0|x′)/f_W(ω0|x) is single crossing).6

4. Payoff functions with single-crossing incremental returns. A payoff function u has single

crossing incremental returns if the incremental returns to higher action are single crossing as a

function of ω. This is the class of problems studied by Lehmann (1988) and Persico (2000). For

example, if a firm chooses price to maximize its expectation of u(p, ω) = (p − c)D(p, ω), then the payoff function u(p, ω) will have single crossing incremental returns if an increase in ω reduces

demand elasticity (and more generally, a sufficient condition when u is positive is that ln(u) is

supermodular). Single crossing incremental returns also arise in auction models. See Milgrom and

Shannon (1994) and Athey (1998b) for further examples. Formally, define R^SC as follows: r ∈ R^SC if there is some ω0 such that r ∈ R^SC(ω0). While R^SC does not satisfy our Condition 1, we can analyze it indirectly by considering it as the union over all sets R^SC(ω0) such that ω0 ∈ Ω. If posterior densities or mass functions are positive throughout Ω, by requiring that ≻_{R^SC(ω0)} holds for all ω0, we find that the order ≻_R induced by R^SC is the MLRP.7

We may also study other sets of payoff functions. Below, we remark on a class of payoff functions that arise in the study of first-price auctions, that lie between supermodular payoff functions

(example 1) and single-crossing payoff functions (example 4). We might also combine conditions

(i.e. nondecreasing and concave) or place restrictions on higher-order derivatives of r. For example, if R is the set of affine functions, ≻_R compares distribution means, while if R contains

quadratic functions, ≻_R is a mean-variance order (which could be used for investment problems with mean-variance preferences over gambles).

6 Although it may seem restrictive to assume f(ω0|x) > 0, it does not add much beyond the R-order for R^SC(ω0), so long as h(ω0) > 0. (We can take ω0 in the support of H without loss of generality, but we still must assume a positive density.) To see this, suppose in contrast that f(ω0|x) = 0 for some x. This implies (using the R-order) that f(ω0|x′) = 0 for all x′ > x; but in turn, that implies (using Lemma 2 and the fact that f(ω0|x″) > 0 for some x″) that E_W[r(ω)|X = x′] ≥ 0 for all r ∈ R and x′ > x. Thus, for each x′ > x, the support of f(ω|x′) is greater than ω0, and every decision-maker in U^R should take the highest possible action, so that there is no additional value to distinguishing among signals greater than x. As Athey (1998b) shows, if R satisfies stronger notions of single crossing, the R-order imposes weaker restrictions on how the support changes with x, but we focus on R^SC(ω0) because this set is a closed convex cone.

7 The analysis generalizes if densities are not always positive; see Athey (1998a, 1998b) for details.

3 Monotone Information Orders

We now derive conditions under which for all decision-makers in a family (H, U^R), the signal

X′ will be preferred to the signal X given that the corresponding information structures F′ and

F are R-ordered. That is, we provide conditions under which V(F′, u) ≥ V(F, u) for every payoff function u with R-incremental returns.

As a prelude, we recall Blackwell's approach to informativeness. His approach proceeded in

three steps. First, he observed that for any payoff function u, if we define ū : ∆(Ω) → ℝ according

to ū(P) = max_{a∈A} ∫_Ω u(ω, a) dP(ω), ū is convex in the posterior, P.8 Second, when V(F, u) is

formulated as in (1), the problem of comparing X and X′ can be framed as a problem of comparing

the distributions over posteriors (denoted µ(·) and µ′(·)) generated by the two signals. Thus, if ∫ ū(P) dµ′(P) ≥ ∫ ū(P) dµ(P) for all convex functions ū : ∆(Ω) → ℝ, µ′ dominates µ according to the convex stochastic dominance order, and any agent will prefer X′ to X. Third, he gave a

characterization of the stochastic dominance order in terms of the signals X′ and X, showing that

the order is equivalent to requiring that X is a garbling of X′.

In this section, we take a similar approach, in that we first consider the consequences of optimality (in this case, monotone policies) together with properties of the payoff function, and then derive an ordering based on comparisons between distributions over posteriors. In so doing, we relax the stringent requirements of the convex stochastic dominance order. We return to further characterizations of the order over posteriors (analogous to Blackwell's third step) in Section 5.

3.1 Sufficient Conditions

We group together a set of conditions that will be referred to throughout.

Condition 3 Given R ⊂ ℛ, a prior H, and a signal X, (i) R satisfies Condition 1, and (ii) for

each x ∈ 𝒳, F_W(·|X ≤ x) satisfies Condition 2, and E_W[r(W)|X = x] > 0.9

8 Convexity requires

$$\max_{a \in A} \int_{\Omega} u(\omega, a)\, d(\gamma P + (1-\gamma) Q) \le \gamma \max_{a \in A} \int_{\Omega} u(\omega, a)\, dP + (1-\gamma) \max_{a \in A} \int_{\Omega} u(\omega, a)\, dQ,$$

which follows by linearity of the integral and because the agent can do better by optimizing for each posterior rather than choosing the optimal action for the convex combination of posteriors.

9 When r = 1_{ω0}, this requires f(ω0|x) > 0 for all x; see footnote 6 for further discussion.


Given a prior H, a signal X, and R ⊂ ℛ satisfying Condition 3, we now define an indexing

function that identifies each signal realization with a real number between zero and one. Let10

$$T(x) = \Pr(X \le x)\, \frac{E_W[r(W) \mid X \le x]}{E_W[r(W)]}.$$

Observe that T is a nondecreasing function mapping 𝒳 → [0, 1]. An indexing function T′ can be

similarly defined for an alternative information structure F′. Then, T(X) and T′(X′) are signals

with the same information content as X and X′.11

The choice of this indexing function will play a central role in our derivations, in particular

in the derivation of necessary conditions. However, it will be easier to describe how T is derived

after we have shown how it is used. For the moment, we simply observe that if R contains the

constant functions, T(x) ≡ F_X(x): signal realizations are indexed by their ex ante percentile.

If R is the set of functions that are single crossing at ω0, T(x) ≡ F_X(x|ω0): signal realizations

are indexed by their percentile conditional on W = ω0. Below, we consider an example where

R is the set of functions that are nondecreasing for ω < ω0 and zero thereafter, in which case

T(x) ≡ F_X(x|W ≤ ω0).
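For a finite joint distribution, the indexing function can be computed as T(x) = E[r(W) 1{X ≤ x}] / E[r(W)], which is the same ratio as in the definition of T. The Python sketch below illustrates the two special cases just mentioned; the function name `index_function` and the numerical example are assumptions for illustration only.

```python
def index_function(joint, r_values):
    """Return [T(x_1), ..., T(x_m)] for a finite information structure.

    joint[i][j] = Pr(W = w_i, X = x_j) with signals in increasing order;
    r_values[i] = r(w_i) for the reference function r of Condition 1.
    """
    n_signals = len(joint[0])
    # weights[j] = E[r(W) 1{X = x_j}]
    weights = [
        sum(r * row[j] for r, row in zip(r_values, joint))
        for j in range(n_signals)
    ]
    total = sum(weights)  # E[r(W)]
    if total <= 0:
        raise ValueError("the sketch requires E[r(W)] > 0")
    index, running = [], 0.0
    for w in weights:
        running += w
        index.append(running / total)
    return index


# Hypothetical example: two states (rows), three signal realizations (columns).
joint = [[0.30, 0.15, 0.05],
         [0.10, 0.20, 0.20]]

t_constant = index_function(joint, [1.0, 1.0])   # r = 1: ex ante percentile F_X(x)
t_indicator = index_function(joint, [1.0, 0.0])  # r = indicator of the low state
```

With r ≡ 1 the index reproduces the marginal CDF of the signal, while with r equal to the indicator of a state it reproduces the signal CDF conditional on that state.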

Theorem 1 Consider R ⊂ ℛ, a prior H, and two signals X and X′ satisfying Condition 3, where

the corresponding information structures F and F′ are R-ordered. If

$$\forall z \in [0, 1] : \quad F'_W(\cdot \mid T'(X') \ge z) \succ_R F_W(\cdot \mid T(X) \ge z), \tag{MIO}$$

then V(F′, u) ≥ V(F, u) for every u with R-incremental returns.

If the conditions of Theorem 1 are satisfied, we say that X′ is more informative than X for all

decision-makers in the family (H, U^R), or that F′ ≻_{MIO−R} F.

The (MIO) condition compares averages of posterior beliefs, where the averages are computed

according to our indexing function. Because the average of all posteriors is just the prior, (MIO)

is equivalent to saying that:

$$\forall z \in [0, 1] : \quad F'_W(\cdot \mid T'(X') \le z) \prec_R F_W(\cdot \mid T(X) \le z).$$

Thus, our informativeness criterion requires that the high posteriors under F′ be, on average, higher than the high posteriors under F, and the low posteriors be, on average, lower, where "high" and "low" refer to the stochastic order ≻_R induced by R. Put simply, informativeness corresponds to posterior beliefs being more "spread out" in a given stochastic order.

10 If there is more than one r satisfying Condition 1, simply choose one. The choice will not affect the results that follow.

11 T is strictly increasing at x unless Pr(X = x) = 0, in which case no information is lost. Moreover, following Lehmann (1988), X can be transformed into an information-equivalent X∗ such that T∗ is continuous and onto.

Proof of Theorem 1. The proof proceeds in two steps: first, we apply Lemma 2 to obtain an

alternative characterization of (MIO); second, we show that for any monotone policy based on

observing X, there is an alternative policy based on observing X′ that does at least as well.

Step 1: Fix z ∈ [0, 1]. By Lemma 2, (MIO) is equivalent to

$$\forall r \in R : \quad \int_{\Omega} r(\omega)\, dF'_W(\omega \mid T'(X') \ge z) \ge \lambda_z \int_{\Omega} r(\omega)\, dF_W(\omega \mid T(X) \ge z),$$

$$\text{where } \lambda_z = \frac{\int r(\omega)\, dF'_W(\omega \mid T'(X') \ge z)}{\int r(\omega)\, dF_W(\omega \mid T(X) \ge z)} = \frac{\Pr(T(X) \ge z)}{\Pr(T'(X') \ge z)}.$$

The second equality follows from our definition of T, T′ (observe that if r ≡ 1, λ_z = 1). Thus (MIO) holds if and only if for all z ∈ [0, 1] and r ∈ R,

$$E_W[r(W) \mid T'(X') \ge z]\, \Pr(T'(X') \ge z) \ge E_W[r(W) \mid T(X) \ge z]\, \Pr(T(X) \ge z). \tag{3}$$

Step 2: Suppose F, F′ are R-ordered and that (MIO) holds. And suppose A = {a_1, ..., a_n} is finite. Consider an arbitrary payoff function u with R-incremental returns. We can write

$$u(\omega, a_k) = u(\omega, a_1) + \sum_{i=2}^{k} r_i(\omega), \quad \text{where } r_i(\omega) = u(\omega, a_i) - u(\omega, a_{i-1}) \in R.$$

Consider an arbitrary monotone policy α : 𝒳 → A for use with F. It is defined by a set of cut points: x_1 ≤ x_2 ≤ ... ≤ x_{n+1}, with α(x) = a_i when x_i < x < x_{i+1}, and α(x) = max_{j : x_j = x} a_j when x = x_i for some i. The cut points also can be represented by their indices {z_i}_{i=1}^{n+1}, where z_i ≡ T(x_i). The ex-ante payoff using this policy with F is:

$$V(F, u, \alpha) = \int_{\mathcal{X}} \int_{\Omega} u(\omega, \alpha(x))\, dF_W(\omega, x) = E_W[u(W, a_1)] + \sum_{i=2}^{n} E_W[r_i(W) \mid T(X) \ge z_i]\, \Pr(T(X) \ge z_i). \tag{4}$$

We now construct an alternative policy α′ for use with F′ that does better than the policy

α used in conjunction with F. This new policy is given by cut points {x′_i}, where we define T′(x′_i) = T(x_i) = z_i for all i = 1, ..., n + 1. Then, e.g., α′(x) = a_i when x′_i < x < x′_{i+1}. The payoff

to this new policy under F′ is:

$$V(F', u, \alpha') = E_W[u(W, a_1)] + \sum_{i=2}^{n} E_W[r_i(W) \mid T'(X') \ge z_i]\, \Pr(T'(X') \ge z_i). \tag{5}$$

It follows immediately from (3) that if (MIO) holds, then V(F′, u, α′) ≥ V(F, u, α). Since there is a monotone policy under F that is optimal, we are done.

The case where A is compact follows from a limiting argument. Any monotone policy α(x)

used under F can be approximated by a sequence of step functions α^1(x), α^2(x), ... converging

to α(x), and for each step function, we can construct a policy α^{k′}(x) for use with F′ such that

V(F′, u, α^{k′}) ≥ V(F, u, α^k). Moreover, α^{k′}(x) → α′(x), for some monotone policy α′(x). Since u is

continuous in a, it follows that V(F′, u, α′) ≥ V(F, u, α). Q.E.D.

The proof shows that for any monotone policy α : 𝒳 → A for use with F, there is a policy α′ :

𝒳 → A for use with F′ that achieves a higher ex ante expected payoff, where α′(x) = α(T′^{−1}(T(x))).

The key to the result is the transformation T′^{−1}(T) : 𝒳 → 𝒳 (i.e. the indexing functions T, T′),

which plays two important and somewhat subtle roles.

First, the reader may note from the proof that a sufficient condition for X′ to be more informative than X is that there is some transformation Φ : 𝒳 → 𝒳 such that (3) holds with Φ in place of

T and the identity in place of T′ (in which case the policy α(Φ(·)) used with X′ does better than

the policy α(·) used with X). In general, however, (3) bears no relationship to a standard stochastic dominance relation, nor is it intuitive or easy to check. Thus, one role of the transformations

T′, T is to allow us to move directly between the payoff comparison (3) and an easily interpretable

stochastic dominance relation (MIO).

Second, and more subtly, observe that the proof of Theorem 1 does not appeal to the optimality

of the initial policy α, only to its monotonicity. We show below, however, that (MIO) (or equivalently (3)) is both sufficient and necessary for X′ to be more informative than X for the relevant

class of decision problems. Remarkably, this is not true for any transformation Φ : 𝒳 → 𝒳 other

than T′^{−1}(T). That is, the analogue of (3) is sufficient but not necessary for X′ to be more informative than X. Another way to say this is that our choice of indexing function has the very special

property that it fully incorporates an optimality restriction into a standard stochastic dominance

relation (MIO). This point is discussed in more detail following Theorem 2.

3.2 Examples

1. Payoff functions with nondecreasing incremental returns. If a decision-maker's incremental returns are nondecreasing in the state variable, she wants to match high actions with high beliefs about W, and low actions with low beliefs. All such decision-makers with prior H will prefer a signal X′ to another signal X, given that the induced posterior beliefs can be ordered by FOSD, if for all z ∈ [0, 1], F′_W(·|F′_X(X′) ≥ z) dominates F_W(·|F_X(X) ≥ z) by FOSD; that is, if high realizations of X′ induce, on average, higher beliefs (in the sense of FOSD) than high realizations of X.¹²

12 It can further be shown that in this case, (MIO) is equivalent to requiring that for all (ω, x) ∈ (Ω, X), F′(ω, F′_X(x)) ≥ F(ω, F_X(x)). This, in turn, is equivalent to the supermodular stochastic order (see Meyer (1990)


Intuitively, this allows for a better match between actions and beliefs.

The following example illustrates this idea. Fix a prior H ∈ ∆(Ω) with density h and define, for θ ∈ [0, θ̄], a signal X^θ with support [0, 1] such that (W, X^θ) have joint density:

\[ f^{\theta}(\omega, x) = h(\omega) + \theta\, k(\omega)\, l(x), \]

where k : Ω → ℝ and l : [0, 1] → ℝ are bounded increasing functions with ∫_Ω k(ω)dω = ∫_0^1 l(x)dx = 0, and f^θ ≥ 0. Note that this example is constructed so that T^θ(x) = F^θ_X(x) = x. Further, for any θ, the posteriors of W | X^θ = x are ordered by FOSD (typically, they will not have the MLRP). Finally, an increase in θ makes X^θ more informative for supermodular decision-makers. However, it will generally not be the case that X^{θ′} is sufficient for X^θ when θ′ > θ, or even that X^{θ′} dominates X^θ in Lehmann's order (defined below in (12)). To better understand this example, observe that for high (low) values of x, l(x) > (<) 0, and an increase in θ puts more positive (negative) weight on the nondecreasing function k, thus shifting probability weight towards (away from) high values of ω.
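The FOSD comparison in this example can be checked numerically. The sketch below is only an illustration under assumed primitives that are not in the paper: Ω = [0, 1], a uniform prior (h = 1), k(ω) = ω − 1/2 and l(x) = x − 1/2, for which the density is nonnegative when 0 ≤ θ ≤ 4. The function names are hypothetical.

```python
# Hypothetical instance of f_theta(w, x) = h(w) + theta * k(w) * l(x) on [0,1]^2:
# h = 1 (uniform prior), k(w) = w - 1/2, l(x) = x - 1/2, and 0 <= theta <= 4.

def density(w, x, theta):
    """Joint density of (W, X^theta); nonnegative for 0 <= theta <= 4."""
    return 1.0 + theta * (w - 0.5) * (x - 0.5)


def upper_tail_cdf(w, z, theta, n=40):
    """Pr(W <= w | X^theta >= z) for z < 1, by midpoint integration."""
    dw, dx = w / n, (1.0 - z) / n
    mass = 0.0
    for i in range(n):
        for j in range(n):
            mass += density((i + 0.5) * dw, z + (j + 0.5) * dx, theta) * dw * dx
    return mass / (1.0 - z)  # X^theta is uniform, so Pr(X >= z) = 1 - z


def upper_tail_cdf_closed_form(w, z, theta):
    """Same object, integrated by hand: w - theta * w * (1 - w) * z / 4."""
    return w - theta * w * (1.0 - w) * z / 4.0


def fosd_improves(theta_lo, theta_hi, steps=10):
    """True if W | X >= z under theta_hi FOSD-dominates it under theta_lo on a grid."""
    grid_w = [i / steps for i in range(steps + 1)]
    grid_z = [i / steps for i in range(steps)]  # keep z < 1
    return all(
        upper_tail_cdf(w, z, theta_hi) <= upper_tail_cdf(w, z, theta_lo) + 1e-12
        for w in grid_w
        for z in grid_z
    )
```

For these assumed primitives the conditional c.d.f. is ω − θ ω(1 − ω) z / 4, which falls pointwise as θ rises, so the upper-tail posteriors shift up in the FOSD sense, in line with the verbal argument above.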

Below, in Section 5, we provide further characterizations of this information order in terms of

marginal-preserving spreads.

2. Payoff functions with concave incremental returns. A decision-maker with incremental returns that are concave as a function of the state variable wants to match high actions with less risky beliefs. So X′ is more informative than X under (MIO) if, for every z ∈ [0, 1], F′_W(·|F′_X(X′) ≥ z) dominates F_W(·|F_X(X) ≥ z) by SOSD. That is, high realizations of the signal lead, on average, to less risky posteriors under F′ than F, and low realizations of the signal lead to more risky posteriors. Again, (MIO) ensures a better match between actions and beliefs.

The following example illustrates the idea. Fix a prior H ∈ ∆(Ω) with density h and define, for θ ∈ [0, θ̄], a signal X^θ with support [0, 1] such that (W, X^θ) have joint density:

\[ f^{\theta}(\omega, x) = h(\omega) + \theta\, k(\omega)\, l(x), \]

where again l : [0, 1] → ℝ is a bounded increasing function with ∫_0^1 l(x)dx = 0, and k : Ω → ℝ is a bounded function that satisfies

\[ \int_{\Omega} k(\omega)\,d\omega = \int_{\Omega} \omega\, k(\omega)\,d\omega = 0 \quad\text{and}\quad \int_{-\infty}^{t} \left( \int_{-\infty}^{s} k(\omega)\,d\omega \right) ds \le 0. \]

Again the example is constructed so that T^θ(x) = F^θ_X(x) = x. The function k centers mass around the mean of W, so for any θ, a higher realization of X^θ leads to a posterior belief that is higher in the sense of second order stochastic dominance. Similarly, an increase in θ makes X^θ more informative for decision-makers whose payoff functions have concave marginal returns. Again, X^{θ′} need not be sufficient for X^θ when θ′ > θ. To better understand this example, observe

and Shaked and Shanthikumar (1997)) for the comparison between (W, F′_X(X′)) and (W, F_X(X)).


that for high (low) values of x, l(x) > (<)0, and an increase in θ puts more positive (negative)

weight on the centering function k, thus shifting probability weight towards (away from) central

values of ω.
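As a worked check of the three conditions on k, consider one hypothetical choice that is not taken from the paper: Ω = [0, 1] and k(ω) = 1/12 − (ω − 1/2)², which is positive near the center and negative in the tails. With h = 1 and l(x) = x − 1/2 (also assumptions), the density stays nonnegative for θ ≤ 12.

```latex
\begin{align*}
\int_0^1 k(\omega)\,d\omega &= \tfrac{1}{12} - \tfrac{1}{12} = 0, \qquad
\int_0^1 \omega\,k(\omega)\,d\omega = 0 \quad\text{(by symmetry of $k$ about $\tfrac12$)},\\
K(s) &\equiv \int_0^s k(\omega)\,d\omega
      = \tfrac{s}{12} - \tfrac{1}{3}\left(s-\tfrac12\right)^3 - \tfrac{1}{24},\\
\int_0^t K(s)\,ds &= -\tfrac{1}{192}\left(4\left(t-\tfrac12\right)^2 - 1\right)^2 \;\le\; 0
      \quad\text{for all } t\in[0,1].
\end{align*}
```

The last line is the integrated condition in the text: adding θ l(x) k(ω) with l(x) > 0 lowers the integrated posterior c.d.f. at every t, which is the SOSD improvement described above for high signal realizations.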

Below, in Section 5, we provide further characterizations of this information order in terms of

mean and marginal-preserving spreads.

3.3 Necessary Conditions for Smoothly Parameterized Families

In this section, we demonstrate that (MIO) is not just a sufficient condition for X′ to be more informative than X for a given class of monotone decision problems, it is also a necessary condition for small changes in the signal structure (i.e., for making comparisons along a smoothly parameterized family of signals).

A family of signals {X^θ}_{θ∈Θ⊂ℝ} is smoothly parametrized if, for all ω ∈ Ω and x ∈ X, F^θ_X(x|ω) is continuously differentiable in θ.

Theorem 2 Consider R ⊂ R, a prior H, and a smoothly parametrized family of signals {X^θ}_{θ∈Θ} satisfying Condition 3, where for each θ, the information structure F^θ is R-ordered. Then ∂/∂θ V(F^θ, u) ≥ 0 for every u with R-incremental returns if and only if (MIO) holds for F′ = F^{θ+dθ} and F = F^θ.

The "only if" proof proceeds in two steps. First, we use the envelope theorem to show that a comparison between F′ = F^{θ+dθ} and F = F^θ can be made holding the optimal decision rule fixed. We then construct a subset of decision-makers in (H, U^R) with the property that X′ is preferred to X for all of these decision-makers exactly when (MIO) holds.

Proof. The "if" direction is a consequence of Theorem 1. We prove the "only if" direction.

Step 1: We apply the envelope theorem. It is useful to first transform the signals using the indexing functions, and to introduce notation for the joint distribution over states and indexed signals. For each θ, let Z^θ ≡ T^θ(X^θ), which is identical to X^θ in information content. Then, W and Z^θ have joint distribution G^θ : Ω × [0, 1] → [0, 1] with G^θ(ω, z) ≡ F^θ(ω, (T^θ)⁻¹(z)), and

\[ V(F^{\theta}, u) = E_{Z^{\theta}}\left[ \max_{a \in A} \int_{\Omega} u(\omega, a)\, dG^{\theta}_W(\omega \mid Z^{\theta}) \right]. \]

Denoting the optimal policy α^θ(z), and applying the envelope theorem:

\[ \frac{\partial}{\partial\theta} V(F^{\theta}, u) = \int_{[0,1]} \int_{\Omega} u(\omega, \alpha^{\theta}(z))\, d\left\{ \frac{\partial}{\partial\theta} G^{\theta}(\omega, z) \right\}. \tag{6} \]

For small changes in θ, we can hold the optimal policy fixed, and consider the change in expected payoffs due to the change in the joint distribution over signals and states.


Step 2: Let r ∈ R and z ∈ [0, 1] be given. Define a payoff function u_{rz} such that u_{rz}(ω, a) = 0 if a < ā, while u_{rz}(ω, a) = r(ω) − K_{rz} r̄(ω) if a ≥ ā, where K_{rz} = E[r(ω) | Z^θ = z] / E[r̄(ω) | Z^θ = z] is a fixed constant. If r̄ = 1_{ω0}, define a series of such payoff functions corresponding to r_n^+, r_n^− as in Condition 1. Each u_{rz} has R-incremental returns, and the optimal policy for the decision problem ⟨u_{rz}, G^θ⟩ is to choose a ≥ ā if and only if Z^θ ≥ z. So

\[ V(F^{\theta}, u_{rz}) = E_W\left[ r(W) - K_{rz}\,\bar r(W) \mid Z^{\theta} \ge z \right] \Pr(Z^{\theta} \ge z). \]

Our definition of the indexing function implies that E_W[r̄(W) | Z^θ ≥ z] Pr(Z^θ ≥ z) is constant in θ. Thus, a necessary condition for ∂/∂θ V(F^θ, u_{rz}) ≥ 0 for all such u_{rz} is that for all r ∈ R and z ∈ [0, 1],

\[ E_W\left[ r(W) \mid T^{\theta+d\theta}(X^{\theta+d\theta}) \ge z \right] \Pr\left(T^{\theta+d\theta}(X^{\theta+d\theta}) \ge z\right) \ge E_W\left[ r(W) \mid T^{\theta}(X^{\theta}) \ge z \right] \Pr\left(T^{\theta}(X^{\theta}) \ge z\right). \tag{7} \]

The proof of Theorem 1 (Step 1) shows that this is equivalent to the requirement that (MIO) holds for F′ = F^{θ+dθ} and F = F^θ. Q.E.D.

Again, we remark on the crucial role of the indexing functions T^θ. The proof of Theorem 2 shows that for two close signals, (7) (equivalently (MIO)) is a necessary condition for ∂/∂θ V(F^θ, u) ≥ 0. Observe that (7) is the same condition as the payoff comparison (3) in the proof of Theorem 1. Earlier, we observed that an analogue of (7) with some arbitrary transformation Φ : X → X in place of T^θ and the identity in place of T^{θ+dθ} would be a sufficient condition for ∂/∂θ V(F^θ, u) ≥ 0. However, it is not necessary.

To see this, suppose we had used an alternative set of increasing onto functions S^θ : X → [0, 1] to index signals (instead of T^θ). Mimicking the above proof, a necessary condition for ∂/∂θ V(F^θ, u) ≥ 0 for all u ∈ U^R would be that (7) hold (with S^θ in place of T^θ) for all z ∈ [0, 1] and some subset of incremental return functions R_z ⊂ R. Specifically, for each z, R_z contains all functions r ∈ R for which there exists a payoff function u ∈ U^R such that the optimal policy is to switch actions at z and the payoff conditional on using this policy is E[r | S^θ(X^θ) ≥ z] Pr(S^θ(X^θ) ≥ z). Thus, making a tight ranking of information structures for any comparison of signals Φ : X → X other than the one based on our indexing functions requires explicitly incorporating a separate optimality condition for each payoff function in the relevant class! In contrast, by using T^θ-averages to compare information structures, we guarantee that for every z, R_z = R, so the necessary (and sufficient) condition for ∂/∂θ V(F^θ, u) ≥ 0 is simply the stochastic dominance relation (MIO).

Remark. We can be somewhat more mathematically precise about the last comment. Fix some arbitrary indexing functions S^θ. A necessary condition for ∂/∂θ V(F^θ, u) ≥ 0 for all u ∈ U^R is that


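for all z ∈ [0, 1],¹³

\[ 0 \le L(z) = \inf_{r \in R} \Big\{ E\big[r \mid S^{\theta+d\theta}(X^{\theta+d\theta}) \ge z\big] \Pr\big(S^{\theta+d\theta}(X^{\theta+d\theta}) \ge z\big) - E\big[r \mid S^{\theta}(X^{\theta}) \ge z\big] \Pr\big(S^{\theta}(X^{\theta}) \ge z\big) \Big\} \quad \text{s.t.}\quad E\big[r \mid S^{\theta}(X^{\theta}) = z\big] = 0. \tag{8} \]

By standard methods, L(z) ≥ 0 if and only if there exists λ(z) ≥ 0 such that, for all r ∈ R,

\[ \Big\{ E\big[r \mid S^{\theta+d\theta}(X^{\theta+d\theta}) \ge z\big] \Pr\big(S^{\theta+d\theta}(X^{\theta+d\theta}) \ge z\big) - E\big[r \mid S^{\theta}(X^{\theta}) \ge z\big] \Pr\big(S^{\theta}(X^{\theta}) \ge z\big) \Big\} \ge \lambda(z) \cdot E\big[r \mid S^{\theta}(X^{\theta}) = z\big]. \tag{9} \]

Under Condition 3,

\[ \lambda(z) = \Big\{ E\big[r \mid S^{\theta+d\theta}(X^{\theta+d\theta}) \ge z\big] \Pr\big(S^{\theta+d\theta}(X^{\theta+d\theta}) \ge z\big) - E\big[r \mid S^{\theta}(X^{\theta}) \ge z\big] \Pr\big(S^{\theta}(X^{\theta}) \ge z\big) \Big\} \Big/ E\big[r \mid S^{\theta}(X^{\theta}) = z\big]. \]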

We can think of λ(z) as the shadow value of the optimality constraint in (8). For an arbitrary transformation of signals S^θ, λ(z) is positive: the optimality constraint is binding. In other words, a tight comparison of X^{θ+dθ} and X^θ with respect to the indexing S^{θ+dθ}, S^θ must explicitly account for the fact that the decision-maker will always use an optimal policy. However, by comparing X^{θ+dθ} and X^θ with respect to the indexing T^{θ+dθ}, T^θ, we guarantee that λ(z) = 0 for all z. The fact that decision-makers will optimize is relevant for the information comparison only to the extent that it ensures they will use monotone policies. This is the key to obtaining a clean and meaningful characterization of information value in monotone problems.

3.4 Global Necessity for Monotone Testing Problems

Theorem 2, which establishes that our informativeness criterion is necessary for all payoff functions with R-incremental returns to prefer F′ to F, is a local result that applies to marginal changes in information. Lehmann (1988) was able to show a somewhat different necessity result for the class of problems he studied. He showed that his efficiency criterion was globally necessary for improvement in a class of monotone hypothesis testing problems.¹⁴ We now demonstrate that our information orders have an analogous property.

13 To see this, note that if r ∈ R, then E[r | S^θ(X^θ) = z] is single crossing in z. So if E[r | S^θ(X^θ) = z] = 0, define r̃, K_{rz} such that r̃ − K_{rz} r̄ = r to obtain a payoff function u_{rz} as in the above proof. The objective here is the change in payoff for this u_{rz} when θ increases.

14 Lehmann's (1988) order is necessary for information comparisons in exactly the same circumstances that our monotone information orders are (given R); that is, there is no sense in which Lehmann's (1988) order is tighter for the class of single crossing functions than (MIO) is for a given R.


Fix R ⊂ R. For any r ∈ R, we consider the statistical problem of testing the hypothesis H0 : r(W) ≥ 0 against the alternative r(W) < 0, with a constraint on the level of the test. The problem is:

\[ \max_{a : X \to \{0,1\}} \int_{X} \int_{\Omega} a(x)\, r(\omega)\, dF(\omega, x) \quad \text{s.t.} \quad \int_{X} a(x)\, dT(x) = 1 - z, \tag{10} \]

where a = 1 means "accept" and a = 0 means "reject". When R is the set of functions that cross zero at ω0, the constraint in (10) is that the test has level z. That is, the probability of rejection, conditional on W = ω0, is z. When R contains the constant functions, the constraint is that the average (or ex ante) probability of rejection is z. Below, we consider a generalized economic example of such a problem: grading students on a curve.

Call (10) an R-monotone testing problem if F is R-ordered. Then the class of all R-monotone

testing problems is obtained by considering all r ∈ R and all z ∈ [0, 1].

Theorem 3 Consider R ⊂ R, a prior H, and two signals X and X′ satisfying Condition 3, where the corresponding information structures F and F′ are R-ordered. Then F′ is more informative than F for all R-monotone testing problems if and only if (MIO) holds.

Proof. Consider the problem (10). The optimal policy is to accept, a(x) = 1, if and only if X = x ≥ x̄, where T(x̄) = z. The ex ante payoff under F is thus E_W[r(W) | T(X) ≥ z] Pr(T(X) ≥ z). This payoff is higher under F′ if and only if (3), or equivalently (MIO), holds. Q.E.D.

3.4.1 Application: Testing and Evaluation

As an economic example, consider the problem facing an examiner who must grade a large number of students on a curve based on test results. Grading based on a student's percentile in the distribution is common, not only in university classes, but also in entrance exams, aptitude tests, and applications for civil service positions around the world. Suppose the students are ex ante identical with distribution of capabilities given by H, and that student i has unknown capability W_i. For each student, the examiner observes an informative test score X_i. The examiner would like to assign the highest grades to the most capable students (i.e. her payoff function u : Ω × A → ℝ is supermodular). Suppose that the possible grades are given by j = 1, 2, ..., J with J the highest, and that the grade curve mandates that a fraction β_j must receive grade j or lower. Assuming that the posterior beliefs are ordered by FOSD, the optimal grading policy will be monotone in the students' test scores. Given the large number of (ex ante identical) students, the optimal grading


policy is to assign grade j to any student with a test score x such that F_X(x) ∈ (β_{j−1}, β_j]. The question of interest is: under what conditions will one possible test be more valuable than another? The answer is that the examiner necessarily will receive higher expected utility from grading if and only if the information revealed by the test improves in the sense of (MIO) for R^ND, the set of nondecreasing functions.
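The percentile rule above can be sketched in a few lines. This is a minimal illustration, not the paper's procedure: it replaces F_X by the empirical percentile rank in a finite sample, breaks ties by input order, and the function name `grade_on_curve` and the list `betas` are hypothetical.

```python
from bisect import bisect_left


def grade_on_curve(scores, betas):
    """Assign grade j when the percentile rank lies in (beta_{j-1}, beta_j].

    betas[j-1] is the fraction that must receive grade j or lower; it must be
    increasing and end at 1.0. Ties are broken by input order (an assumption;
    the text works with continuous signals, where ties have probability zero).
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])  # lowest score first
    grades = [0] * n
    for rank, i in enumerate(order, start=1):
        percentile = rank / n  # empirical stand-in for F_X(x), in (0, 1]
        grades[i] = bisect_left(betas, percentile) + 1
    return grades


scores = [55, 90, 70, 40, 85, 60, 75, 95, 65, 80]
grades = grade_on_curve(scores, [0.2, 0.5, 1.0])
```

With ten students and the curve (0.2, 0.5, 1.0), the two lowest scores receive grade 1, the next three grade 2, and the top five grade 3. The rule depends on scores only through their ranks, which is why the comparison of tests reduces to a comparison of the indexed signals F_X(X).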

3.5 Further Examples

3/4. Payoff functions with incremental returns that are single crossing. We now use our general (MIO) condition to re-derive Lehmann's efficiency order, and to provide an alternative characterization that seems potentially useful for applications. Consider two information structures F′ and F whose posteriors are ordered by the MLRP. First, fix the point at which incremental returns cross zero, ω0, i.e. take R = R^SC(ω0). We can re-express (MIO) using our earlier definition of the ≻_R order: for all z ∈ [0, 1],

\[ \Pr\big(F'_X(X' \mid \omega_0) \ge z \mid \omega\big) \ge \Pr\big(F_X(X \mid \omega_0) \ge z \mid \omega\big) \]

if and only if ω ≥ ω0.¹⁵ Or, alternatively, for all z ∈ [0, 1],

\[ F_X\big(F_X^{-1}(z \mid \omega_0) \mid \omega\big) \ge F'_X\big(F_X'^{-1}(z \mid \omega_0) \mid \omega\big) \tag{11} \]

if and only if ω ≥ ω0.

For F′ to be more informative than F for all payoff functions with single crossing incremental returns, this condition must hold for all ω0. Or in other words, the inequality (11) must hold for any z ∈ [0, 1] and any ω, ω0 ∈ Ω with ω ≥ ω0. Applying the monotone transformation F′_X⁻¹(·|ω) to each side, and substituting x = F_X⁻¹(z|ω0), we obtain the equivalent condition that:

\[ \forall x \in X,\quad F_X'^{-1}\big(F_X(x \mid \omega) \mid \omega\big) \ \text{is nondecreasing in } \omega. \tag{12} \]

This is the condition derived by Lehmann (1988).
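Condition (12) is easy to evaluate for location families. The sketch below assumes, purely for illustration, normal signals X = ω + σε and X′ = ω + σ′ε′ with standard normal noise; this family and the function names are assumptions, not part of the text. In that case the map in (12) equals ω + (σ′/σ)(x − ω), which is nondecreasing in ω exactly when σ′ ≤ σ.

```python
from statistics import NormalDist


def lehmann_map(x, omega, sigma, sigma_prime):
    """F'^{-1}_X(F_X(x | omega) | omega) for X ~ N(omega, sigma^2), X' ~ N(omega, sigma_prime^2)."""
    p = NormalDist(mu=omega, sigma=sigma).cdf(x)
    return NormalDist(mu=omega, sigma=sigma_prime).inv_cdf(p)


def satisfies_condition_12(sigma, sigma_prime, xs, omegas, tol=1e-9):
    """Check on a grid that the map in (12) is nondecreasing in omega for each x."""
    for x in xs:
        values = [lehmann_map(x, w, sigma, sigma_prime) for w in sorted(omegas)]
        if any(later < earlier - tol for earlier, later in zip(values, values[1:])):
            return False
    return True


xs = [-1.0, 0.0, 1.0]
omegas = [-1.0, -0.5, 0.0, 0.5, 1.0]
less_noise_is_better = satisfies_condition_12(1.0, 0.5, xs, omegas)  # X' less noisy
more_noise_is_better = satisfies_condition_12(0.5, 1.0, xs, omegas)  # X' noisier
```

The grid is kept within a few standard deviations of each mean so that the c.d.f. values stay strictly inside (0, 1), which `inv_cdf` requires.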

We can also use (11), and our approach of expressing information orders in terms of orders over average posterior beliefs, to re-express Lehmann's criterion as a likelihood ratio ordering on posterior beliefs.

15 This condition turns out to have a very nice interpretation in the spirit of hypothesis testing. Suppose r(ω) crosses zero at ω0. Then a statistical test of the hypothesis H0 : r(ω) ≥ 0 against the alternative r(ω) < 0 is a test of ω > ω0. The constraint is Pr(reject|ω0) = z: the probability of rejection conditional on the truth being ω0 is z, and the optimal test has the form: accept if and only if the realization of X is greater than x, where F_X(x|ω0) = z. Thus (MIO) states that X′ provides a more powerful test of level z than X: it is more likely to accept (reject) when the true state is above (below) ω0.


Corollary 1 Suppose that for all θ and all x ∈ X, the posterior belief has a density that is differentiable. Then F_{X^θ,W} is ordered by (MIO) for R = R^SC if and only if, for all ω ∈ Ω and all z ∈ [0, 1],

\[ \frac{f'_W\big(\omega \mid X^{\theta} \ge F^{-1}_{X^{\theta}}(z \mid \omega)\big)}{f_W\big(\omega \mid X^{\theta} \ge F^{-1}_{X^{\theta}}(z \mid \omega)\big)} \ \text{is nondecreasing in } \theta. \tag{13} \]

The proof of this result is in the Appendix. In applying this characterization, it is important to observe that ω appears in two roles in the formula: once as the argument of the density function, and once in determining the cutoff value of x, F⁻¹_{X^θ}(z|ω).

5. Payoff functions with incremental returns that are nondecreasing on ω ≤ ω0, and zero thereafter. Payoff functions that take this form arise in the study of first-price auctions with affiliated values, where W represents the opponent's signal.¹⁶ Formally, define R^NDZ(ω0) as follows: r ∈ R^NDZ(ω0) if r is nondecreasing on ω ≤ ω0 and r(ω) = 0 for all ω > ω0. This class of payoff functions satisfies Condition 1 with r̄(ω) ≡ 1_{ω≤ω0}, and the order ≻_R induced by R^NDZ(ω0) requires that F_W(·|x′) ≻_R F_W(·|x) if F_W(·|x′)/F_W(·|x) ≤ F_W(ω0|x′)/F_W(ω0|x); this is simply FOSD conditional on ω ≤ ω0. Taking the set of payoff functions R^NDZ = ∪_{ω0∈Ω} R^NDZ(ω0) induces as ≻_R the monotone probability ratio order, which requires that F_W(·|x′)/F_W(·|x) is nondecreasing for x′ > x. This order is weaker than the MLRP but stronger than FOSD.¹⁷

The analysis of (MIO) for this set is analogous to that for single crossing functions. Taking R = R^NDZ(ω0), (MIO) reduces to: for all z ∈ [0, 1],

\[ F_X\big(F_X^{-1}(z \mid W \le \omega_0) \mid W \le \omega\big) \le F'_X\big(F_X'^{-1}(z \mid W \le \omega_0) \mid W \le \omega\big) \tag{14} \]

for all ω < ω0. When we let ω0 = sup(Ω), we have exactly (MIO) for R = R^ND, as desired. For the set R^NDZ, we check (14) for all ω0 ∈ Ω, yielding

\[ \forall x \in X,\quad F_X'^{-1}\big(F_X(x \mid W \le \omega) \mid W \le \omega\big) \ \text{is nondecreasing in } \omega. \tag{15} \]

And, analogous to above, we can re-express this in more familiar terms:

16 In a two-bidder, first-price, common-value auction, if bidder 2 uses a strictly monotone bidding strategy β(·), the payoffs to bidder 1 for a given bid b, signal s, and realization of the opponent's signal are given by u(b, ω) = (E[V | S = s, W = ω] − b) 1_{b>β(ω)}. As a function of ω, the returns to choosing b_H rather than b_L < b_H are first negative (and constant) at b_L − b_H, then (in the region where higher bids cause bidder 1 to go from losing to winning by increasing the bid) nondecreasing in ω, and finally the returns are equal to 0 (when both b_L and b_H are losing bids). See Athey (forthcoming) for generalizations and further discussion.

17 Athey (2000) shows that R^NDZ induces the same order ≻_R as the one induced by considering the incremental returns to investment for the class of risk-averse investors making an investment, so that u(a, ω) = v(π(a, ω)), where v is concave and π is supermodular. The order ≻_{R^NDZ} is also the same as the order induced by the set of payoffs r that are single crossing and quasi-concave (Athey (forthcoming)).


Corollary 2 Suppose that for all θ and all x ∈ X, the posterior belief has a density. Then F_{X^θ,W} is ordered by (MIO) for R = R^NDZ if and only if, for all ω ∈ Ω and all z ∈ [0, 1],

\[ \frac{f_W\big(\omega \mid X^{\theta} \ge F^{-1}_{X^{\theta}}(z \mid W \le \omega)\big)}{F_W\big(\omega \mid X^{\theta} \ge F^{-1}_{X^{\theta}}(z \mid W \le \omega)\big)} \ \text{is nondecreasing in } \theta. \tag{16} \]

The proof of this result mimics Corollary 1. Athey (2000) shows that when signals are ordered in this way, under a few additional regularity conditions (as well as the restriction that posterior beliefs are ordered by the monotone probability ratio order), all risk-averse investors will find higher values of θ more valuable.

4 Demand for Information

In this section, we obtain a comparative statics result concerning the relative demand for infor-

mation of two different decision-makers. The following Theorem builds on an approach used by

Persico (2000) for the special case of decision-makers with single-crossing incremental returns.

Theorem 4 Consider R ⊂ R, a prior H, and a smoothly parametrized family of signals {X^θ}_{θ∈Θ} satisfying Condition 3, where for each θ, the information structure F^θ is R-ordered, and {F^θ}_{θ∈Θ} is ordered by (MIO). Define

\[ w(\omega, x) = u(\omega, \alpha^{\theta,u}(x)) - v(\omega, \alpha^{\theta,v}(x)), \]

where α^{θ,u} denotes an optimal decision rule for the decision problem ⟨F^θ, u⟩. If for all x′ > x, w(ω, x′) − w(ω, x) ∈ R, then ∂/∂θ V(θ, u) ≥ ∂/∂θ V(θ, v).

Proof. Recall the normalized information structures G^θ defined in the proof of Theorem 2. Define w(ω, z) = u(ω, α^{θ,u}((T^θ)⁻¹(z))) − v(ω, α^{θ,v}((T^θ)⁻¹(z))). By the Envelope Theorem,

\[ \frac{\partial}{\partial\theta} V(\theta, u) - \frac{\partial}{\partial\theta} V(\theta, v) = \int_{\Omega} \int_{[0,1]} w(\omega, z)\, d\left\{ \frac{\partial}{\partial\theta} G^{\theta}(\omega, z) \right\}. \]

Assume A is finite. Then there is some series of cut-points {z_i}_{i=1}^{N+1}, with z_1 ≤ z_2 ≤ ... ≤ z_{N+1}, such that w(ω, z) = w(ω, z_i) on [z_i, z_{i+1}). Let r_i(ω) = w(ω, z_i) − w(ω, z_{i−1}), and note that r_i(ω) ∈ R for all i = 2, ..., N. Then ∂/∂θ V(θ, u) − ∂/∂θ V(θ, v) ≥ 0 if

\[ \sum_{i=2}^{N} E_W\big[ r_i(W) \mid Z^{\theta+d\theta} \ge z_i \big] \Pr\big(Z^{\theta+d\theta} \ge z_i\big) - \sum_{i=2}^{N} E_W\big[ r_i(W) \mid Z^{\theta} \ge z_i \big] \Pr\big(Z^{\theta} \ge z_i\big) \ge 0, \]

which holds by (MIO). The case where A is continuous follows via a limiting argument. Q.E.D.

Note that the result does not rely on u, v ∈ UR. An immediate consequence follows.


Theorem 5 Suppose the conditions of Theorem 4 hold. For C : Θ → ℝ, let θ*(u) = argmax_{θ∈Θ} V(θ, u) − C(θ). Then θ*(u) ≥ θ*(v) (in the strong set order).

Proof. By Theorem 4, [V(θ, u) − C(θ)] − [V(θ, v) − C(θ)] is nondecreasing in θ. So by Topkis' Monotonicity Theorem (e.g. Topkis, 1998), θ*(u) ≥ θ*(v) in the strong set order. Q.E.D.
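The comparative statics in Theorem 5 can be illustrated on a grid. The value functions below are hypothetical placeholders chosen only so that V(θ, u) − V(θ, v) is nondecreasing in θ; they are not derived from any information structure in the paper, and the cost C(θ) = θ is likewise an assumption.

```python
import math


def argmax_set(objective, grid, tol=1e-12):
    """All grid points whose objective value is within tol of the maximum."""
    values = [objective(t) for t in grid]
    best = max(values)
    return [t for t, val in zip(grid, values) if val >= best - tol]


def value_of_information(theta, scale):
    """Hypothetical V(theta, .) = scale * sqrt(theta); a placeholder, not from the paper."""
    return scale * math.sqrt(theta)


def cost(theta):
    """Hypothetical linear information cost C(theta) = theta."""
    return theta


grid = [i / 100 for i in range(401)]  # theta in [0, 4]
# Decision-maker u has scale 2, v has scale 1, so V(., u) - V(., v) = sqrt(theta) is nondecreasing.
theta_star_u = argmax_set(lambda t: value_of_information(t, 2.0) - cost(t), grid)
theta_star_v = argmax_set(lambda t: value_of_information(t, 1.0) - cost(t), grid)
```

Here the first-order condition scale / (2√θ) = 1 gives θ* = scale²/4, so the decision-maker with the larger marginal value of information chooses the larger θ, as the theorem predicts. With set-valued maximizers, the strong set order would compare both the smallest and the largest elements of the two sets.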

The difficulty in applying Theorem 4 is that the critical condition depends on the properties of

the two objective functions evaluated at their respective optima. Thus, it requires a fair amount of

structure. This should probably not come as a surprise. A change in preferences has at least two

effects in a decision problem under uncertainty. First, it changes the optimal behavior and hence

the responsiveness to different realizations of information. Second, it changes the preferences over

the residual risk faced after a decision is made. A comparison of marginal values for information

must incorporate both effects. Almost by necessity, then, it must deal with potentially subtle

comparisons of the curvature of the payoff functions.

Despite this, Theorem 4 can be applied to obtain new results in a variety of contexts. Persico

(2000) analyzes auction models. Below, we apply the result to a standard model of monopoly.

We also provide an example of a class of applications, namely principal-agent problems, where the

analysis is simplified because the same policy function affects both players.

The requirements of Theorem 4 for primitives can be characterized precisely when we consider small changes in the payoff function, and when 1, −1 ∈ R. We parameterize u by γ, so that u : Ω × A × Γ → ℝ, and define ū : A × X × Γ → ℝ by ū(a, x; γ) = ∫ u(ω, a; γ) dF_W(ω|x). We let subscripts denote partial derivatives.

Corollary 3 Suppose the conditions of Theorem 4 hold, and that (a) 1, −1 ∈ R; (b) Γ, A are compact, convex subsets of ℝ; (c) for each ω, u(ω, ·; ·) is C³; and (d) ū is quasi-concave in a and C². Then ∂²/∂θ∂γ V(θ, u(·; γ)) ≥ 0 if, for each (a, γ), u(·, a; γ) satisfies: (i) u(·, ·; γ) ∈ U^R; (ii) u_γ(·, a; γ) ∈ U^R; (iii) u_{aaγ}(·, a; γ) ≥ 0; and (iv) either u_{aa}(·, a; γ) is a constant function of ω, or else u_a(·, ·; γ) ∈ U^R and u_{aγ}(·, a; γ) ≥ 0.

Proof. By Theorem 4, the result follows if ∂²/∂γ∂x u(·, α^{θ,γ}(x); γ) ∈ R. Differentiating yields:

\[ \frac{\partial^2}{\partial\gamma\,\partial x} u(\cdot, \alpha^{\theta,\gamma}(x); \gamma) = u_{a\gamma}(\cdot, a; \gamma)\, \alpha^{\theta,\gamma}_{x}(x) + u_{a}(\cdot, a; \gamma)\, \alpha^{\theta,\gamma}_{x\gamma}(x) + u_{aa}(\cdot, a; \gamma)\, \alpha^{\theta,\gamma}_{x}(x)\, \alpha^{\theta,\gamma}_{\gamma}(x), \]

evaluated at a = α^{θ,γ}(x). Since F^θ is R-ordered, α^{θ,γ}_x ≥ 0 by (i). Thus, the first term is in R by (ii). If u_{aa} is constant in ω, the last term is constant in ω (and thus in R); otherwise, it is in R if u_{aa} ∈ R and α^{θ,γ}_γ ≥ 0, which follows if u_{aγ} ≥ 0 (as in (iv)). The second term is in R by (i), so long as α^{θ,γ}_{xγ} ≥ 0. To evaluate this, we apply the implicit function theorem, yielding:

\[ \frac{\partial^2}{\partial x\,\partial\gamma} \alpha^{\theta,\gamma}(x) = \left. \frac{-\bar u_{aa}(\cdot)\, \bar u_{ax\gamma}(\cdot) + \bar u_{ax}(\cdot)\, \bar u_{aa\gamma}(\cdot)}{\big(\bar u_{aa}(\cdot)\big)^2} \right|_{a = \alpha^{\theta,\gamma}(x)}. \]

At the optimum, ū_{aa} < 0, and ū_{ax} ≥ 0 since F^θ is R-ordered; ū_{aaγ} ≥ 0 by (iii). Since 1, −1 ∈ R, F^θ_W(·|x) is ordered by stochastic dominance for R; then u_{aγ} ∈ R implies ū_{axγ} ≥ 0. Q.E.D.

Each of conditions (ii)-(iv) has a natural interpretation and ensures that a particular effect works in the right direction. We illustrate with a series of investment examples based on the case of supermodular payoff functions. Let the payoff function be u(ω, a) = v(ω, a) − c(a), where a represents an investment level, c is a cost of investment, and the returns to investment are nondecreasing in ω. Assume c is nondecreasing and convex.

• For condition (ii), suppose γ affects the scale of returns. Let u(ω, a; γ) = γ a r(ω) − c(a), so conditions (iii) and (iv) are trivial. Then an increase in γ increases information demand because it makes the marginal returns to investment more sensitive to changes in ω (i.e. "more nondecreasing" in ω).

• For condition (iii), suppose that γ multiplies the costs of investment. Let u(ω, a; γ) = a r(ω) − (γ̄ − γ)c(a), so that conditions (ii) and (iv) are trivial. Now, an increase in γ increases information demand by making the investment problem less concave in a, and hence the policy function more responsive to information about returns (i.e. ∂/∂x α^{θ,γ} is increasing in γ).

• Finally, condition (iv) becomes relevant when we generalize the functional form for investment. Let u(ω, a; γ) = v(ω, a) − (γ̄ − γ)c(a). Since c is nondecreasing, an increase in γ increases the optimal policy. In turn, this increases the demand for information if it makes marginal returns more sensitive to ω (i.e. condition (iv) requires that v_{aω} be increasing in a). To see an application, suppose that a is an input (e.g. labor). If u(ω, a; γ) = v(ω, a) + γa − c(a) and v_{aω} is increasing in a, the firm invests more (less) in information gathering in response to a wage subsidy (tax).
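The scale-of-returns case under condition (ii) can be worked out in closed form under one extra assumption that is not in the text: a quadratic cost c(a) = a²/2 with an interior optimum. The symbol m^θ is introduced here only for this illustration.

```latex
\begin{align*}
m^{\theta}(x) &\equiv E\big[\,r(W)\mid X^{\theta}=x\,\big], \qquad
\alpha^{\theta,\gamma}(x) = \arg\max_{a}\Big\{\gamma\,a\,m^{\theta}(x) - \tfrac12 a^{2}\Big\} = \gamma\,m^{\theta}(x),\\
V\big(\theta, u(\cdot;\gamma)\big) &= E\Big[\gamma\,\alpha^{\theta,\gamma}(X^{\theta})\,m^{\theta}(X^{\theta}) - \tfrac12\,\alpha^{\theta,\gamma}(X^{\theta})^{2}\Big]
 = \tfrac{\gamma^{2}}{2}\,E\big[m^{\theta}(X^{\theta})^{2}\big],\\
\frac{\partial^{2}}{\partial\theta\,\partial\gamma} V\big(\theta, u(\cdot;\gamma)\big) &= \gamma\,\frac{\partial}{\partial\theta}\,E\big[m^{\theta}(X^{\theta})^{2}\big].
\end{align*}
```

For γ > 0 the cross-partial has the sign of ∂/∂θ E[m^θ(X^θ)²], which is nonnegative whenever V itself is nondecreasing in θ, as Theorem 1 delivers for nondecreasing r when the family is ordered by (MIO). A larger scale γ therefore raises the marginal value of information in this special case.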

4.1 Application: Production under Uncertainty

A growing literature considers the value of information to firms under imperfect competition (see, e.g., Mirman, Samuelson, and Schlee (1994) and references therein). Here, we prove a simple but new result using Theorem 4: under standard conditions, a monopolist will not only produce less but will gather less information about production costs than is socially efficient.

To model this, let P(q) be the inverse demand curve. Suppose the cost of producing q units is C(q, ω), where (letting subscripts denote partial derivatives) C_q is nonincreasing in ω. The monopolist's payoff is u^M(ω, q) = qP(q) − C(q, ω), while the social planner's payoff is u^S(ω, q) = ∫_0^q P(t)dt − C(q, ω). Both payoff functions are supermodular, so consider {F^θ}_{θ∈Θ} satisfying the hypotheses of Theorem 4 for R^ND. The cost of information is c(θ).


By Theorem 1, both monopolist and social planner prefer more information to less according to our definition of information. To show that ∂/∂θ V(θ, u^S) ≥ ∂/∂θ V(θ, u^M), we ask when ∂/∂x u^S(ω, q^S(x)) − ∂/∂x u^M(ω, q^M(x)) is nondecreasing in ω. Eliminating terms that do not depend on ω, we express this difference as

\[ -C_q(q^S(x), \omega)\, q^S_x(x) + C_q(q^M(x), \omega)\, q^M_x(x). \tag{17} \]

Proposition 1 In the production problem with cost uncertainty, a social planner has a higher demand for information than a monopolist, if for any θ, (i) marginal costs are increasing in q and submodular in (q, ω) (i.e. C_qq ≥ 0, C_qqω ≤ 0), (ii) the marginal revenue curve is downward sloping (qP″(q) + P′(q) ≤ 0 for all q), and (iii) the marginal social surplus function (P(q) − E[C_q(q^S, W) | X = x]) is convex in q.

The basic intuition for the result is easily grasped. Different realizations of X shift the perceived marginal cost curve. Under relatively standard conditions, the demand curve faced by a social planner will be flatter than the marginal revenue curve faced by a monopolist, so the social planner will be more responsive to shifts in marginal cost. Consequently, the social planner benefits more from improving the match between perceived and actual marginal costs. A formal proof obtains by using the implicit function theorem to express q^M_x and q^S_x, and checking that (17) is indeed nondecreasing in ω.
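The intuition can be checked in a linear-quadratic special case. Everything in this sketch is an assumption made for illustration: inverse demand P(q) = a − bq, cost C(q, ω) = (c − ω)q + q²/2, and a signal whose posterior mean of W equals the prior mean plus or minus s with equal probability, so that s is a hypothetical stand-in for the informativeness index θ.

```python
def optimal_value(posterior_mean, a, b, c, planner):
    """Maximized expected payoff given E[W | x] = posterior_mean.

    Planner: a*q - b*q^2/2 - E[C]; monopolist: a*q - b*q^2 - E[C],
    with E[C] = (c - posterior_mean)*q + q^2/2.
    """
    slope = (b + 1.0) if planner else (2.0 * b + 1.0)
    q = max(0.0, (a - c + posterior_mean) / slope)  # first-order condition
    gross = a * q - (0.5 * b if planner else b) * q * q
    expected_cost = (c - posterior_mean) * q + 0.5 * q * q
    return gross - expected_cost


def ex_ante_value(s, a, b, c, planner, prior_mean=0.0):
    """Expected payoff when the posterior mean is prior_mean - s or prior_mean + s."""
    return 0.5 * sum(
        optimal_value(prior_mean + shift, a, b, c, planner) for shift in (-s, s)
    )


def marginal_value(s, a, b, c, planner, h=1e-5):
    """Central-difference derivative of the ex ante value with respect to s."""
    up = ex_ante_value(s + h, a, b, c, planner)
    down = ex_ante_value(s - h, a, b, c, planner)
    return (up - down) / (2.0 * h)


planner_mv = marginal_value(1.0, a=10.0, b=1.0, c=4.0, planner=True)
monopoly_mv = marginal_value(1.0, a=10.0, b=1.0, c=4.0, planner=False)
```

In this parameterization the ex ante values are ((a − c)² + s²)/(2(b + 1)) for the planner and ((a − c)² + s²)/(2(2b + 1)) for the monopolist, so the marginal values are s/(b + 1) and s/(2b + 1). The planner's output responds more strongly to the posterior mean, and her marginal value of information is correspondingly larger.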

4.2 Application: Delegated Project Selection

Recently, a rapidly growing literature has focused on delegated information-gathering and decision-making (e.g. Aghion and Tirole, 1997). In such problems, two parties have different objectives but are subject to the same decision rule, greatly simplifying the problem. Suppose we have a principal with payoff function v(ω, a) and an agent with payoff function u(ω, a), both supermodular. The agent must evaluate a set of projects (for simplicity, a continuum) to be accepted or rejected (for example, a set of investment opportunities or workers to be hired), so A = {0, 1}, and ω represents quality. There is a prior distribution of projects, H(·), known to both parties, and a smoothly parameterized family of signals, {X^θ}_{θ∈Θ}, satisfying the conditions of Theorem 4 for R^ND. Higher values of θ are more informative according to (MIO). Project selection is as follows. First, the principal determines a quota (denoted z), stated as the fraction of the pool of potential projects to be accepted. The agent then chooses θ, at cost C(θ), receives a signal about each project, and then makes acceptance decisions, obeying the quota z; no further contracting takes place.

Observe that the principal and agent will agree on the ranking of projects, but not necessarily on what fraction to accept. Moreover, the principal may want to accept more (or fewer) projects


depending on the quality of information. The principal must form a conjecture about what choice of θ her quota z will induce; in equilibrium, her conjecture will be fulfilled.

The agent's selection rule, denoted α(x), is to accept a given project if and only if F_X(x) ≥ z. Given this, the agent's payoff is E_{W,X|θ}[u(W, α(X))] − C(θ) and the principal's is E_{W,X|θ}[v(W, α(X))]. Because u, v are supermodular, both parties enjoy higher payoffs (net of C) if information quality is improved. Moreover, if v − u is supermodular (u − v is supermodular), that is, if the agent's returns to quality for accepted projects are lower (higher) than the principal's, then the agent under-acquires (over-acquires) information relative to what the principal would choose given the same cost function and the same acceptance rule.

5 Further Characterizations

This section provides further characterizations of our informativeness conditions. We first characterize informativeness in terms of marginal-preserving spreads. We then discuss the restrictions implied by our criteria on the conditional signal distributions when we allow for sets of prior beliefs.

5.1 Marginal-Preserving Spreads

Above, we stated (MIO) in terms of average posterior beliefs. We now provide an alternative

characterization in terms of the joint distributions on the signal and the state. For concreteness,

we consider two sets of payoff functions: those with nondecreasing incremental returns (URND),

and those with concave incremental returns (URCV ).

It is convenient to work directly with the indexed signals. Given a signal X, with associated information structure F, define Z = T(X) (= F_X(X)). Then Z is a random variable on [0, 1], and has identical information content to X. Given another signal X′, we can similarly define Z′ (= F′_X(X′)). Let G, G′ be the joint distributions of (W, Z) and (W, Z′). Then (i) G, G′ are R-ordered (for given R) if and only if F, F′ are, and (ii) without loss of generality, G, G′ have equivalent marginals.¹⁸

To minimize notation, let us take Ω to be finite and assume that Z, Z′ are discrete.¹⁹ Let g, g′ be the probability mass functions corresponding to G, G′ and define γ(ω, z) = g′(ω, z) − g(ω, z). We say that G′ can be obtained from G by a single marginal-preserving spread (MgPS) if for

18 The marginal of W is the prior. If X, X′ are continuous, then by normalization G_z(z) = G′_z(z) = z. If X, X′ are discrete, it may not be the case that Z, Z′ have identical uniform marginal distributions. However, using Lehmann's (1988) technique, we can construct continuous random variables X*, X*′ with identical information content to X, X′ such that their indexed versions Z* and Z*′ have uniform marginals.

19 See footnote 18. The continuous analogues are straightforward.

23

z2

z1

ω1 ω2ω3 ω4ω1 ω2

ε

ε

δ

δ

η

η

A M arginal-Preserving Spread A M ean and Marginal-Preserving Spread

z2

z1

Figure 1: Illustrations of Spreads.

some ε > 0, ω1 < ω2, and z1 < z2,

γ(ω1, z1) = γ(ω2, z2) = −γ(ω2, z1) = −γ(ω1, z2) = ε,

and otherwise γ = 0.20 Similarly, G0 can be obtained from G by a single mean and marginal-

preserving spread (MMgPS) if for some δ, η > 0, ω1 < ω2 < ω3 < ω4, and z1 < z2,

γ(ω1, z1) = γ(ω2, z2) = −γ(ω2, z1) = −γ(ω1, z2) = δ,−γ(ω3, z1) = −γ(ω4, z2) = γ(ω4, z1) = γ(ω3, z2) = η.

wherePi ωiγ(ωi, zj) = 0 for j = 1, 2, and otherwise γ = 0.

As Figure 1 illustrates, a marginal-preserving spread captures the idea of making two ran-

dom variables more positively dependent. A mean and marginal-preserving spread adds positive

dependence below E [W ] and negative dependence above E [W ].
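The definition of a single MgPS can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: the helper names (`mgps`, `add`, `marginals`, `cross_moment`) and the 2×2 starting distribution are assumptions chosen for illustration, and states and signals are represented by their integer indices. It builds γ for one spread, applies it to an independent joint distribution, and checks that both marginals are unchanged while the cross moment E[WZ] rises by ε(ω2 − ω1)(z2 − z1).

```python
from fractions import Fraction

def mgps(eps, w1, w2, z1, z2, n_states, n_signals):
    """gamma[z][w] for a single marginal-preserving spread (illustrative helper)."""
    assert eps > 0 and w1 < w2 and z1 < z2
    gamma = [[Fraction(0)] * n_states for _ in range(n_signals)]
    gamma[z1][w1] += eps   # more mass on (low state, low signal)
    gamma[z2][w2] += eps   # more mass on (high state, high signal)
    gamma[z1][w2] -= eps   # less mass on (high state, low signal)
    gamma[z2][w1] -= eps   # less mass on (low state, high signal)
    return gamma

def add(g, gamma):
    """Entrywise sum g + gamma, i.e. the post-spread pmf g'."""
    return [[a + b for a, b in zip(rg, rgam)] for rg, rgam in zip(g, gamma)]

def marginals(g):
    """Return (marginal of the signal, marginal of the state)."""
    signal_marginal = [sum(row) for row in g]
    state_marginal = [sum(col) for col in zip(*g)]
    return signal_marginal, state_marginal

def cross_moment(g):
    """E[W * Z], using the integer indices as the values of W and Z."""
    return sum(z * w * p for z, row in enumerate(g) for w, p in enumerate(row))

# Assumed starting point: W and Z independent, each uniform on two points.
g = [[Fraction(1, 4)] * 2 for _ in range(2)]
g_prime = add(g, mgps(Fraction(1, 8), 0, 1, 0, 1, 2, 2))

print(marginals(g) == marginals(g_prime))      # marginals are preserved
print(cross_moment(g_prime) - cross_moment(g)) # increase in E[WZ]
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the marginal comparison is an exact equality rather than a floating-point approximation.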

Proposition 2 Consider a prior H, and two signals X, X′ whose indexed information structures G, G′ are R-ordered (for the appropriate R). (i) G′ ≻MIO−ND G if and only if G′ can be obtained from G via a series of MgPSs; (ii) G′ ≻MIO−CV G if and only if G′ can be obtained from G via a series of MMgPSs.

Proof: Part (i) follows a result in Meyer (1990). The proof of (ii) uses Rothschild and Stiglitz's (1970) characterization of SOSD in terms of mean-preserving spreads. Below, =d denotes equality in distribution. Suppose G′ is obtained from G via a single MMgPS, where the MMgPS is carried out on z1 < z2. Then W|Z ≥ z =d W|Z′ ≥ z for any z > z2 and any z ≤ z1, and the distribution of W|Z ≥ z is a mean-preserving spread of W|Z′ ≥ z for any z1 < z ≤ z2. So W|Z′ ≥ z dominates W|Z ≥ z by SOSD for any z, and G′ ≻MIO−CV G. With multiple MMgPSs, the argument is repeated.

For the other direction, suppose Z, Z′ take values on {z1, ..., zn} and have uniform marginals, and that G′ ≻MIO−CV G. Then the distribution of W|Z′ = z1 is SOSD dominated by the distribution of W|Z = z1 and hence must be obtained by a finite series of mean-preserving spreads. Starting with G, make a series of MMgPSs using z1, z2 that result in a distribution G(1) with W|Z(1) = z1 =d W|Z′ = z1, and W|Z(1) ≤ z2 =d W|Z ≤ z2. These last two are SOSD dominated by W|Z′ ≤ z2, so in particular, W|Z(1) = z2 is stochastically dominated by W|Z′ = z2. Now, starting with G(1), make a series of MMgPSs using z2, z3 that result in G(2) with W|Z(2) = z2 =d W|Z′ = z2. Continuing in this fashion, we obtain from G, via a series of MMgPSs, a distribution G(n−1) such that for all z < zn, W|Z(n−1) = z =d W|Z′ = z. Since G(n−1) and G′ have equivalent marginals, it must therefore be that G(n−1) = G′. Q.E.D.

20 Hamada (1974), Tchen (1976), Epstein and Tanny (1980) and Meyer (1990) have all discussed this concept in studying positive dependence of random variables.

This result fully characterizes (MIO) when R is the set of nondecreasing or concave functions.21

Example: A MMgPS that Violates Sufficiency. The following example illustrates a series of MMgPSs; it also shows how two signals X′ and X might be information-ranked for payoff functions with concave incremental returns without one being statistically sufficient for the other. Suppose Ω = {−2, −1, 0, 1, 2} and the prior on Ω is uniform. There are two signals, which take realizations on 𝒳 = {x1, x2}, x2 > x1. The joint signal-state distributions are:

Pr(W = ω, X = x)     −2      −1      0       1       2
x1                   11/80   5/80    8/80    5/80    11/80
x2                   5/80    11/80   8/80    11/80   5/80

Pr(W = ω, X′ = x)    −2      −1      0       1       2
x1                   14/80   4/80    4/80    4/80    14/80
x2                   2/80    12/80   12/80   12/80   2/80

Simple calculation shows that the posteriors induced by X do not lie in the convex hull of the posteriors induced by X′, so X′ cannot be sufficient for X. Thus for some decision-makers, X is preferred to X′. However, the posteriors induced by both X and X′ are ordered by SOSD, and the distribution of (W, X′) is obtained from the distribution of (W, X) by a sequence of two MMgPSs (one on (x1, x2; −1, 0, 0, 1), the other on (x1, x2; −2, −1, 1, 2)). So X′ is preferred to X by all decision-makers whose payoff functions have concave marginal returns.
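The two claims in this example can be checked by direct computation. The sketch below is illustrative only: the helper names (`mmgps`, `add`, `in_segment`) are assumptions, and the spread sizes δ = η = 2/80 on states (−1, 0, 0, 1) and δ = η = 3/80 on states (−2, −1, 1, 2) are values we solved for from the tables rather than values stated in the text. Note that the first spread takes ω2 = ω3 = 0, following the example's own listing.

```python
from fractions import Fraction as Fr

states = [-2, -1, 0, 1, 2]
g  = [[Fr(n, 80) for n in (11, 5, 8, 5, 11)],    # Pr(W = w, X  = x1)
      [Fr(n, 80) for n in (5, 11, 8, 11, 5)]]    # Pr(W = w, X  = x2)
gp = [[Fr(n, 80) for n in (14, 4, 4, 4, 14)],    # Pr(W = w, X' = x1)
      [Fr(n, 80) for n in (2, 12, 12, 12, 2)]]   # Pr(W = w, X' = x2)

def mmgps(delta, eta, ws):
    """gamma for one MMgPS on signals (x1, x2) and states ws = (w1, w2, w3, w4)."""
    w1, w2, w3, w4 = (states.index(w) for w in ws)
    gam = [[Fr(0)] * len(states) for _ in range(2)]
    for z, w, s in [(0, w1, delta), (1, w2, delta), (0, w2, -delta), (1, w1, -delta),
                    (0, w3, -eta), (1, w4, -eta), (0, w4, eta), (1, w3, eta)]:
        gam[z][w] += s
    # Mean preservation: sum_i w_i * gamma(w_i, z_j) = 0 for each signal z_j.
    assert all(sum(w * p for w, p in zip(states, row)) == 0 for row in gam)
    return gam

def add(a, b):
    """Entrywise sum of two signal-by-state arrays."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def in_segment(p, q1, q2):
    """True if p = lam*q1 + (1 - lam)*q2 for some lam in [0, 1]."""
    k = next(i for i in range(len(p)) if q1[i] != q2[i])
    lam = (p[k] - q2[k]) / (q1[k] - q2[k])
    return 0 <= lam <= 1 and all(
        p[i] == lam * q1[i] + (1 - lam) * q2[i] for i in range(len(p)))

# Assumed spread sizes (solved from the tables, not given in the text).
step1 = add(g, mmgps(Fr(2, 80), Fr(2, 80), (-1, 0, 0, 1)))
step2 = add(step1, mmgps(Fr(3, 80), Fr(3, 80), (-2, -1, 1, 2)))
print(step2 == gp)   # the two spreads carry (W, X) into (W, X')

# Posteriors: divide each row by the signal's marginal probability.
post  = [[p / sum(row) for p in row] for row in g]
postp = [[p / sum(row) for p in row] for row in gp]
print([in_segment(p, postp[0], postp[1]) for p in post])
```

Because X′ has only two realizations, the convex hull of its posteriors is a line segment, so the hull test reduces to solving for the mixing weight on one coordinate and checking the rest.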

5.2 Information Orders and Prior Beliefs

While (MIO) and the marginal-preserving spread conditions are intuitive and easy to state, they have the disadvantage of intermingling the prior belief and the conditional signal distributions. In many economic contexts, it is reasonable to assume that the prior is known to the analyst; for example, the distribution over previous asset returns, input costs, or employee test scores may be objectively measured. But in other cases, such information may be unavailable, and we would prefer a characterization of information that relies less on knowledge of the prior beliefs.

21 When we further restrict attention to two states and two signals (where posterior beliefs are ordered by FOSD), it can be shown that a MgPS is equivalent to two elementary linear bifurcations on the distribution over posteriors, as in Grant, Kajii and Polak (1998); these authors show that this implies statistical sufficiency, which in turn implies that (MIO) and sufficiency are equivalent for this case.

Put differently, we have characterized information preferences for families of decision-makers of the form (H, UR). To obtain an information ranking for a family of decision-makers (Λ, UR), where Λ is some set of possible prior beliefs, we must check (MIO) for the entire set Λ. Typically, this implies a further restriction on the informativeness order. To see why, write TH in place of T to highlight dependence on the prior, and rewrite (MIO) (using (3)) as:

\[
\forall r \in R,\ z \in [0,1]:\quad \int_{\Omega} r(\omega)\,\Pr\bigl(T'_H(X') \ge z \mid \omega\bigr)\,dH(\omega) \;\ge\; \int_{\Omega} r(\omega)\,\Pr\bigl(T_H(X) \ge z \mid \omega\bigr)\,dH(\omega). \tag{18}
\]

This version of (MIO) compares weighted averages of the conditional signal distributions, where the states are weighted according to the prior H and the return functions r ∈ R.

Clearly, the prior H plays a role. Consequently, X′ might be R-information preferred to X for one prior H1, but not another prior H2. Intuitively, relative to X, X′ might not distinguish well between two states, ω and ω′; but if both are unlikely, X′ might still be R-information preferred to X. Clearly, this feature is desirable when the prior is known, but undesirable when we consider a set of priors where other members place large weight on ω and ω′. An additional complication is that even for a given signal X, it might be the case that the induced posteriors are R-ordered for one prior H1 but not for another prior H2.

Consider first the set from Example 3, RSC(ωo). Here, TH(X) = FX(X|ωo), which does not depend on the prior; further, r · h ∈ RSC(ωo) for all prior densities h if and only if r ∈ RSC(ωo). Thus, the prior does not affect (MIO) in this case. In contrast, for nondecreasing functions RND, TH(X) = FX(X), which depends on the prior; and further, if we consider an alternate prior density h, r nondecreasing implies only that r · h is single crossing. Thus, this analysis indicates that much of the structure imposed by commonly encountered economic restrictions (i.e. monotonicity, concavity) can be undone by allowing for a rich enough set of prior beliefs Λ. We formalize this discussion as follows (recalling RSC contains single crossing functions, and RND ⊂ RSC):

Proposition 3 (i) The information structure F corresponding to X is RND-ordered (i.e., FOSD-ordered) for all prior beliefs H ∈ ∆(Ω) if and only if it is RSC-ordered (i.e., MLRP-ordered). (ii) Given two signals X′ and X that are MLRP-ordered, X′ is more informative than X for all decision-makers in (∆(Ω), URND) if and only if X′ is more informative than X for decision-makers in (∆(Ω), URSC).

Proof: Part (i) is due to Milgrom (1981). For part (ii), observe that for a fixed prior H, X′ ≻MIO−ND X states that

\[
F'_W\bigl(\omega \mid F'_X(X') \ge z\bigr) \le F_W\bigl(\omega \mid F_X(X) \ge z\bigr) \quad \forall \omega \in \Omega,\ z \in [0,1].
\]

Applying Bayes' rule and canceling terms yields

\[
F'_X\bigl({F'_X}^{-1}(z) \mid W \le \omega\bigr) \le F_X\bigl(F_X^{-1}(z) \mid W \le \omega\bigr) \quad \forall \omega \in \Omega,\ z \in [0,1];
\]

substituting z = FX(x) and re-arranging:

\[
{F'_X}^{-1}\bigl(F_X(x)\bigr) \le {F'_X}^{-1}\bigl(F_X(x \mid W \le \omega) \mid W \le \omega\bigr) \quad \forall \omega \in \Omega,\ x \in \mathcal{X}.
\]

Clearly, ${F'_X}^{-1}(F_X(x|\omega)|\omega)$ nondecreasing in ω for all x ∈ 𝒳 implies this last inequality regardless of the prior H. Moreover, for the last inequality to hold for all H ∈ ∆(Ω), it must hold for all priors with two-point support, from which it follows that ${F'_X}^{-1}(F_X(x|\omega)|\omega)$ must be nondecreasing in ω for all x. Finally, as discussed in Example 3, ${F'_X}^{-1}(F_X(x|\omega)|\omega)$ is nondecreasing in ω for all x if and only if X′ ≻MIO−SC X. Q.E.D.

Thus, we see that allowing for a rich enough set of prior beliefs renders useless the knowledge

that r is nondecreasing rather than just single crossing. In contrast, the orderings for single crossing

functions (as well as statistical sufficiency) are independent of the prior, and knowing the prior does

not allow these conditions to be tightened. Intuitively, the sets of payoff functions being considered

are so large that if X′ does not distinguish well between ω and ω′, then even if these states are

unlikely, there is still some payoff function that cares only about the comparison between them.

Thus, we have shown that if the set of payoff functions being studied is large, knowledge of the

prior does not help in characterizing information preferences, while such information is essential to

fully exploit the structure of smaller sets of payoff functions.22

22 Jewitt (1997) shows that for two-point priors, where posteriors are ordered by the monotone likelihood ratio order, Lehmann's order and statistical sufficiency coincide.

Example: With a given prior, X′ might be preferred to X for all decision-makers with nondecreasing incremental returns (but not for all decision-makers with single crossing incremental returns); yet, another prior may disturb the comparison. Suppose Ω = {ω1, ω2, ω3} with ω1 < ω2 < ω3 and 𝒳 = {x1, x2} with x1 < x2. Consider the following two joint distributions, where F′ is obtained from F via a single MgPS:

Pr(W = ω, X = x)     ω1      ω2      ω3
x1                   6/24    3/24    3/24
x2                   2/24    5/24    5/24

Pr(W = ω, X′ = x)    ω1      ω2      ω3
x1                   6/24    4/24    2/24
x2                   2/24    4/24    6/24

Both FW(ω|x) and F′W(ω|x) have the MLRP, but X′ and X do not satisfy (12). Consider the following payoff function, where A = {0, 1}: u(ω, a) = a r(ω), where r(ω1) = −2, r(ω2) ∈ (1, 2), and r(ω3) = 1. It is easily checked that r(ω) satisfies single crossing but is not nondecreasing, and that the ex ante payoff for u is higher under F than under F′. The idea is that X′ distinguishes well between ω1 and {ω2, ω3}, between {ω1, ω2} and ω3, and between ω2 and ω3, but not well between ω1 and ω2. By allowing for a larger set of prior beliefs or payoff functions, one can choose the prior or the payoff function to focus attention on this last fact.
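The payoff comparison in this example is easy to reproduce. The following is a minimal sketch under stated assumptions: the function name `ex_ante_payoff` is ours, r(ω2) = 3/2 is one arbitrary point in the interval (1, 2), and the nondecreasing comparison function `r_nd` is a hypothetical choice added only for contrast. For each signal realization the decision-maker picks a ∈ {0, 1} to maximize a · Σω r(ω) Pr(ω, x), so the ex ante payoff is the sum over signals of the positive part of that expression.

```python
from fractions import Fraction as Fr

F  = [[Fr(6, 24), Fr(3, 24), Fr(3, 24)],   # Pr(W = w, X  = x1)
      [Fr(2, 24), Fr(5, 24), Fr(5, 24)]]   # Pr(W = w, X  = x2)
Fp = [[Fr(6, 24), Fr(4, 24), Fr(2, 24)],   # Pr(W = w, X' = x1)
      [Fr(2, 24), Fr(4, 24), Fr(6, 24)]]   # Pr(W = w, X' = x2)

def ex_ante_payoff(joint, r):
    """Expected payoff of u(w, a) = a * r(w) when a in {0, 1} is chosen
    optimally after each signal realization."""
    return sum(max(Fr(0), sum(rw * p for rw, p in zip(r, row)))
               for row in joint)

# Single crossing but not nondecreasing; r(w2) = 3/2 is an assumed value in (1, 2).
r_sc = [Fr(-2), Fr(3, 2), Fr(1)]
# A hypothetical nondecreasing return function, for contrast.
r_nd = [Fr(-2), Fr(1), Fr(3, 2)]

print(ex_ante_payoff(F, r_sc), ex_ante_payoff(Fp, r_sc))   # F does better
print(ex_ante_payoff(F, r_nd), ex_ante_payoff(Fp, r_nd))   # F' does better
```

With the single crossing function, the decision-maker accepts only after x2 under either signal, and the payoff under F exceeds that under F′ exactly when r(ω2) > 1.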

6 Conclusion

In this paper, we established that the additional structure of monotone decision problems can be exploited to derive necessary and sufficient conditions for all agents in a family to prefer one signal to another, conditions that are typically less restrictive than statistical sufficiency (the ordering for all payoff functions) or Lehmann's order (for single crossing functions). Alternatively, our results can be interpreted as deriving additional consequences of statistical sufficiency and Lehmann's order, for monotone decision problems with additional structure (i.e. payoff functions are supermodular, and the prior distribution is fixed). These consequences lead directly to comparative statics results in decision problems and equilibrium models.

Finally, both our definitions of "more information" and our results on information demand may be useful for building models in which agents acquire their information at some cost prior to playing a game of incomplete information. The examples given here, and Persico's (2000) work on auctions, indicate that this may be a fruitful area for further inquiry.

7 Appendix

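Proof of Corollary 1: Fix θ′ > θ. First, by definition, $F_X^{-1}(F_X(x|\omega_0) \mid \omega_0) = x$, so that

\[
\left.\frac{\partial}{\partial\omega} F_X^{-1}\bigl(F_X(x|\omega_0) \mid \omega\bigr)\right|_{\omega=\omega_0} = -\frac{\frac{\partial}{\partial\omega_0} F_X(x|\omega_0)}{f_X(x \mid \omega_0)},
\]

or, letting $z = F_X(x|\omega_0)$,

\[
\frac{\partial}{\partial\omega_0} F_X^{-1}(z|\omega_0) = -\frac{\left.\frac{\partial}{\partial\omega} F_X\bigl(F_X^{-1}(z|\omega_0) \mid \omega\bigr)\right|_{\omega=\omega_0}}{f_X\bigl(F_X^{-1}(z|\omega_0) \mid \omega_0\bigr)}.
\]

Then, letting $z = F_X(x|\omega_0)$, we differentiate (12), yielding

\[
\frac{\partial}{\partial\omega_0} F_{X'}^{-1}\bigl(F_X(x|\omega_0) \mid \omega_0\bigr) = -\frac{\left.\frac{\partial}{\partial\omega} F_{X'}\bigl(F_{X'}^{-1}(z|\omega_0) \mid \omega\bigr)\right|_{\omega=\omega_0}}{f_{X'}\bigl(F_{X'}^{-1}(z|\omega_0) \mid \omega_0\bigr)} + \frac{\left.\frac{\partial}{\partial\omega} F_X\bigl(F_X^{-1}(z|\omega_0) \mid \omega\bigr)\right|_{\omega=\omega_0}}{f_{X'}\bigl(F_{X'}^{-1}(z|\omega_0) \mid \omega_0\bigr)}.
\]

Multiplying both sides by $f_{X'}(F_{X'}^{-1}(z|\omega_0) \mid \omega_0)$, this is nonnegative if

\[
\left.\frac{\partial}{\partial\omega}\left(\frac{f_W\bigl(\omega \mid X \le F_X^{-1}(z|\omega_0)\bigr)\, F_X\bigl(F_X^{-1}(z|\omega_0)\bigr)}{h(\omega)}\right)\right|_{\omega=\omega_0} \ge \left.\frac{\partial}{\partial\omega}\left(\frac{f_W\bigl(\omega \mid X' \le F_{X'}^{-1}(z|\omega_0)\bigr)\, F_{X'}\bigl(F_{X'}^{-1}(z|\omega_0)\bigr)}{h(\omega)}\right)\right|_{\omega=\omega_0}.
\]

Simplifying, we have

\[
\frac{F_X\bigl(F_X^{-1}(z|\omega_0)\bigr)}{h(\omega_0)} \left[ f'_W\bigl(\omega_0 \mid X \le F_X^{-1}(z|\omega_0)\bigr) - f_W\bigl(\omega_0 \mid X \le F_X^{-1}(z|\omega_0)\bigr)\frac{h'(\omega_0)}{h(\omega_0)} \right]
\ge \frac{F_{X'}\bigl(F_{X'}^{-1}(z|\omega_0)\bigr)}{h(\omega_0)} \left[ f'_W\bigl(\omega_0 \mid X' \le F_{X'}^{-1}(z|\omega_0)\bigr) - f_W\bigl(\omega_0 \mid X' \le F_{X'}^{-1}(z|\omega_0)\bigr)\frac{h'(\omega_0)}{h(\omega_0)} \right]
\]

or, multiplying through by $h(\omega_0)$ and factoring out a term, this is:

\[
f_W\bigl(\omega_0 \mid X \le F_X^{-1}(z|\omega_0)\bigr)\, F_X\bigl(F_X^{-1}(z|\omega_0)\bigr) \left[ \frac{f'_W\bigl(\omega_0 \mid X \le F_X^{-1}(z|\omega_0)\bigr)}{f_W\bigl(\omega_0 \mid X \le F_X^{-1}(z|\omega_0)\bigr)} - \frac{h'(\omega_0)}{h(\omega_0)} \right]
\ge f_W\bigl(\omega_0 \mid X' \le F_{X'}^{-1}(z|\omega_0)\bigr)\, F_{X'}\bigl(F_{X'}^{-1}(z|\omega_0)\bigr) \left[ \frac{f'_W\bigl(\omega_0 \mid X' \le F_{X'}^{-1}(z|\omega_0)\bigr)}{f_W\bigl(\omega_0 \mid X' \le F_{X'}^{-1}(z|\omega_0)\bigr)} - \frac{h'(\omega_0)}{h(\omega_0)} \right]. \tag{19}
\]

But, using Bayes' rule, $f_W(\omega_0 \mid X \le F_X^{-1}(z|\omega_0))\, F_X(F_X^{-1}(z|\omega_0)) = z\,h(\omega_0)$, so (19) becomes

\[
\frac{f'_W\bigl(\omega_0 \mid X \le F_X^{-1}(z|\omega_0)\bigr)}{f_W\bigl(\omega_0 \mid X \le F_X^{-1}(z|\omega_0)\bigr)} \ge \frac{f'_W\bigl(\omega_0 \mid X' \le F_{X'}^{-1}(z|\omega_0)\bigr)}{f_W\bigl(\omega_0 \mid X' \le F_{X'}^{-1}(z|\omega_0)\bigr)}.
\]

Finally, using Bayes' rule and the fact that the expectation of the posteriors must equal the prior, the latter inequality is equivalent to the desired condition.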

8 References

Aghion, Philippe and Jean Tirole (1997): "Formal and Real Authority in Organizations," Journal of Political Economy, 105 (1), 1–29.

Athey, Susan (1998a): "Characterizing Properties of Stochastic Objective Functions," MIT Working Paper 96-1R.

Athey, Susan (2000): "Investment and Information Value for a Risk-Averse Firm," MIT Working Paper 00-30.

Athey, Susan (forthcoming): "Comparative Statics under Uncertainty: Single Crossing Properties and Log-Supermodularity," Quarterly Journal of Economics.

Blackwell, David (1951): "Comparisons of Experiments," Proceedings of the Second Berkeley Symposium on Mathematical Statistics, 93–102.

Blackwell, David (1953): "Equivalent Comparisons of Experiments," Annals of Mathematical Statistics, 24, 265–272.

Cooper, Russell (1999): Coordination Games: Complementarities and Macroeconomics, Cambridge University Press.

Epstein, L. and S. Tanny (1980): "Increasing Generalized Correlation: A Definition and Some Economic Consequences," Canadian Journal of Economics, 13, 16–34.

Grant, Simon, Ben Polak and Atsushi Kajii (1998): "Intrinsic Preference for Information," Journal of Economic Theory, 83, 233–259.

Gollier, C. and M. Kimball (1995): "Toward a Systematic Approach to the Economic Effects of Uncertainty I: Comparing Risks," Mimeo, University of Toulouse.

Hamada, K. (1974): "Comment on Stochastic Dominance in Choice under Uncertainty," in M. Balch, D. McFadden, and S. Wu, eds., Essays on Economic Behavior under Uncertainty. Amsterdam: North Holland.

Jewitt, Ian (1986): "A Note on Comparative Statics and Stochastic Dominance," Journal of Mathematical Economics, 15, 249–254.

Karlin, Samuel and Howard Rubin (1956): "The Theory of Decision Procedures for Distributions with Monotone Likelihood Ratio," Annals of Mathematical Statistics, 27, 272–299.

Lehmann, E.L. (1988): "Comparing Location Experiments," Annals of Statistics, 16 (2), 521–533.

Meyer, Margaret (1990): "Interdependence of Multivariate Distributions: Stochastic Dominance Theorems and an Application to the Measurement of Ex Post Inequality under Uncertainty," Nuffield College Discussion Paper.

Milgrom, Paul (1981): "Good News and Bad News: Representation Theorems and Applications," Bell Journal of Economics, 12, 380–391.

Milgrom, Paul and John Roberts (1990): "Rationalizability, Learning, and Equilibrium in Games with Strategic Complementarities," Econometrica, 58, 1255–1277.

Milgrom, Paul and Christina Shannon (1994): "Monotone Comparative Statics," Econometrica, 62, 157–180.

Mirman, L., L. Samuelson, and E. Schlee (1994): "Strategic Manipulation of Information in Duopolies," Journal of Economic Theory, 36, 364–384.

Persico, Nicola (2000): "Information Acquisition in Auctions," Econometrica.

Rothschild, Michael and Joseph Stiglitz (1970): "Increasing Risk I: A Definition," Journal of Economic Theory, 2, 223–243.

Shaked, Moshe and George Shanthikumar (1997): "Supermodular Stochastic Order and Positive Dependence of Random Variables," Journal of Multivariate Analysis, 61, 86–101.

Shannon, Chris (1995): "Weak and Strong Monotone Comparative Statics," Economic Theory, 5 (2), 209–227.

Tchen, A. (1976): "Inequalities for Distributions with Given Marginals," Ph.D. Thesis, Dept. of Statistics, Stanford, June 1976.

Topkis, Donald (1998): Supermodularity and Complementarity. Princeton University Press.
