The Value of Information
in Monotone Decision Problems∗
Susan Athey
MIT and NBER
Jonathan Levin
Stanford University
First Draft: September 1997
This Draft: February 2001
Abstract
This paper studies decision problems under uncertainty where a decision-maker observes an
imperfect signal about the true state of the world. We analyze the information preferences
and information demand of such decision-makers, based on properties of their payoff functions.
We restrict attention to monotone decision problems, whereby the posterior beliefs induced
by the signal can be ordered so that higher actions are chosen in response to higher signal
realizations. Monotone decision problems are frequently encountered in economic modeling. We
provide necessary and sufficient conditions for all decision makers with different classes of payoff
functions to prefer one information structure to another. We also provide conditions under
which two decision-makers in a given class can be ranked in terms of their marginal value for
information and hence information demand. Applications and examples are given.
JEL Classification: C44, C60, D81.
Keywords: Bayesian decision problems, value of information, stochastic dominance, stochas-
tic orderings, decision-making under uncertainty.
∗We would like to thank Scott Ashworth, Kyle Bagwell, Dirk Bergemann, Glenn Ellison, John Geanakoplos, Bengt
Holmström, Eric Lehmann, Richard Levin, Meg Meyer, Stavros Panageas, Ben Polak, and Lones Smith, as well as
seminar participants at Michigan, MIT, Northwestern, UCLA, Stanford, Yale, the 1998 Summer Meetings of the
Econometric Society, and the Pacific Institute for Mathematical Sciences for valuable discussions. We have both
benefited from the hospitality of the Cowles Foundation at Yale University. Athey acknowledges the support of NSF
Grant SBR-9631760.
1 Introduction
In many economic models, a decision-maker faces uncertainty about her marginal returns to some
action and obtains information about these returns. A common and useful practice in such models
is to assume that the decision problem has an order structure, in the sense that the potential beliefs
the agent could arrive at can be ranked from less to more optimistic, where more optimistic
beliefs are beliefs that induce a higher action. Examples of such monotone decision problems arise
in many contexts, including problems of production under uncertainty about marginal costs or
about demand elasticity, financial and capital investment, auctions, contracting, adverse selection,
coordination under uncertainty, and search.
In this paper, we provide definitions of more information that are tailored to different classes
of monotone decision problems. Any agent faced with such a problem will prefer one information
structure to another if and only if they can be ranked according to our conditions. We also
provide conditions under which the incentives of two agents to acquire better information in such
an environment can be ranked. For example, we show that a monopolist has a lower demand for
marginal cost information than the social planner, and we analyze under-investment in information
gathering for delegated decision-making problems.
The stochastic environment we consider is composed of two real-valued random variables: an
unknown state of the world W and a signal X. The decision-maker has a prior belief about the
distribution of W . The decision-maker observes the realization of X, say x, before choosing her
action, and her payoff u(ω, a) depends on her action a and ω, the realization of W . This decision
problem is monotone if observing a higher signal realization induces a higher action.1
A family of decision-makers is defined by (i) a set of possible priors, and (ii) a set of possible
payoff functions. We consider sets of payoff functions that are alike in how the incremental returns
to higher actions change with ω (e.g. the set of supermodular payoff functions, for which the returns
to increasing the action are nondecreasing in ω). A signal X leads to a monotone decision problem
for each agent in the family if the posterior beliefs about W induced by observing X can be ranked
in an appropriate stochastic order. The stochastic order is chosen so that higher posterior beliefs
(in the given order) induce all agents in the family to choose higher actions. For the family of
supermodular payoff functions with a given prior, the restriction is satisfied if the posterior beliefs
induced by X can be ordered by first-order stochastic dominance (FOSD).
1 Karlin and Rubin (1956) introduced the term monotone decision problem to describe decision problems where
the payoff function has a single-crossing property and the density of the signal conditional on the state of the world
satisfies the monotone likelihood ratio property (see also Lehmann, 1988). In this paper, we use the term more
broadly, to describe any decision problem where higher signal realizations induce higher actions.
Now, consider two signals, X and X′, that lead to monotone decision problems for each agent
in a given family of decision-makers. We look for conditions under which, for all decision-makers
in the family, observing X′ is more valuable than observing X. Our answer, stated roughly, is that
X′ is more informative than X for these decision-makers if on average the posteriors induced
by high realizations of X′ are higher than the posteriors induced by high realizations of X, and
conversely for low realizations. Here, posteriors are higher or lower in the appropriate stochastic
order, e.g. FOSD for the family with supermodular payoff functions. Intuitively, better information
allows for a more accurate match between beliefs and actions.
Our second result concerns relative demands for information. Extending the techniques from
our first result, we provide conditions under which two payoff functions u and v can be ranked
in terms of their marginal value for better information, and show how this result can be used to
obtain comparative statics results concerning the demand for information.
Our results extend a line of inquiry begun by Lehmann (1988) and continued by Persico (2000),
who studied a specific class of monotone decision problems introduced by Karlin and Rubin (1956).
In such problems, the payoff function has a single-crossing property and posterior beliefs have the
monotone likelihood ratio property. Lehmann (1988) characterized an informativeness ordering
for these problems, while Persico (2000) identified when one payoff function has a higher marginal
value for information than another. Our result on information demand generalizes Persico's to many
other sets of payoff functions. Our methods differ from Lehmann (1988), who exploited the sign-
preservation properties of distributions with the monotone likelihood ratio property. Nevertheless,
our results are related, and we derive his ordering as a by-product of our analysis. In Section
5, we show a close connection between his ordering for single crossing decision problems and our
informativeness condition for supermodular problems. These two classes of problems often play a
central role in information-theoretic modeling.
More broadly, our analysis builds on the classic work of Blackwell (1951, 1953). Blackwell
introduced a partial order over signals, showing that a signal X′ is more valuable than a signal
X for every decision problem if and only if X′ is statistically sufficient for X. Our problem is
different in that we seek notions of valuable information that are tailored to specific types of
economic contexts, rather than a general statistical property that holds across all environments. In
addition to showing how a problem's structure can be incorporated to order signals in a manner less
restrictive than sufficiency, our results can be interpreted as establishing consequences of statistical
sufficiency for specific classes of decision problems, consequences that may be useful for further
analysis (e.g. comparative statics on information acquisition or market equilibria).
2 Monotone Decision Problems
2.1 The Set-Up
The stochastic environment is composed of some unknown state of the world W , with typical
realization ω ∈ Ω ⊂ ℝ, and a signal X with typical realization x ∈ X ⊂ ℝ. Given a prior, H ∈ ∆(Ω),
the distribution of the signal induces a joint distribution over signals and states, F : Ω × X → [0, 1],
referred to as the information structure. Let F_X(·|ω) be the signal distribution conditional on
W = ω, and let F_X(·) be the marginal distribution of the signal, F_X(x) = E_W[F_X(x|W)], computed
using the prior distribution on W. Let F_W(·|x) denote the conditional distribution of W given
X = x, that is, the agent's posterior belief after observing a realization X = x. Of course, posterior
beliefs must be consistent with the prior: for all ω ∈ Ω, E_X[F_W(ω|X)] = H(ω). We use f, f_W, and
f_X to denote the corresponding probability mass functions or densities.
After observing the signal realization, the decision-maker chooses an action a ∈ A, where A is a
compact subset of ℝ. She has a payoff function u : Ω × A → ℝ. We assume throughout that payoff
functions are continuous in a. The ex ante value of the decision problem ⟨F, u⟩ is:
\[ V(F, u) = E_X\left[ \max_{a \in A} \int_{\Omega} u(\omega, a)\, dF_W(\omega \mid X) \right]. \tag{1} \]
We use α∗(x) to denote an optimal decision rule that achieves this ex ante value.
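When both the state space and the signal space are finite, the ex ante value in (1) reduces to a sum over signal realizations, which can be sketched in a few lines of Python. The function name `ex_ante_value`, the list-of-lists encoding of the joint distribution, and the two-state guessing example are assumptions made here for illustration; they do not appear in the paper.

```python
def ex_ante_value(joint, payoff, actions):
    """Ex ante value V(F, u) for a finite information structure.

    joint[i][j] is Pr(W = state i, X = signal j); payoff(i, a) is u(state i, a).
    """
    n_states, n_signals = len(joint), len(joint[0])
    value = 0.0
    for j in range(n_signals):
        # Pr(X = j) * max_a E[u(W, a) | X = j]; the posterior is left
        # unnormalised because the factor Pr(X = j) cancels.
        value += max(
            sum(joint[i][j] * payoff(i, a) for i in range(n_states))
            for a in actions
        )
    return value


# Hypothetical example: two equally likely states, a payoff of 1 for guessing
# the state correctly, and a signal that is correct with probability 0.8.
def guess(i, a):
    return 1.0 if a == i else 0.0


informative = [[0.4, 0.1], [0.1, 0.4]]
uninformative = [[0.25, 0.25], [0.25, 0.25]]
v_informative = ex_ante_value(informative, guess, [0, 1])
v_uninformative = ex_ante_value(uninformative, guess, [0, 1])
```

In this toy case the informative signal raises the value from the prior-only level, since the agent can condition her guess on the realization.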
A family of decision-makers is defined by a pair of sets (Λ, U), where Λ ⊆ ∆(Ω), and U is a set
of payoff functions u : Ω × A → ℝ. Each member of the family has a prior belief H ∈ Λ and a payoff
function u ∈ U. We ask when a signal X′ is preferred to another signal X for all decision-makers
in a given family (Λ, U). Observe that expanding either the set of allowable priors or the set of
allowable payoff functions will in general lead to a more restrictive ordering on signals. For this
reason, we initially take the set of prior beliefs to be an arbitrary singleton, {H}, and analyze orderings
over signals taking the prior as given. This allows us to focus on the effects of considering different
sets of payoff functions. It is also natural in economic contexts where agents may have objective
information about the prior distribution (e.g., the distribution of worker abilities in a population
is known). In Section 5.2, we analyze conditions under which enlarging the set of priors leads to
more restrictive orderings.
2.2 Monotone Decision Problems
A decision problem is monotone if it has an optimal decision rule α∗(x) that is monotone. We
first characterize monotonicity in terms of a relationship between the payoff function's incremental
return to higher action and posterior beliefs. The next section provides examples.
Let ℛ = {g : Ω → ℝ, g bounded, measurable}. We can then define:
Definition 1 Given R ⊂ ℛ, a payoff function u has R-incremental returns if for any a′ > a, the
incremental return function r(ω) = u(ω, a′) − u(ω, a) ∈ R.
Thus, any set R ⊂ ℛ defines a set of payoff functions, letting U_R be the set of payoff functions
with R-incremental returns. As an example, suppose R is the set of nondecreasing functions. Then
U_R is the set of supermodular payoff functions u(ω, a) (subject to the additional restriction that
for all a′ > a, u(·, a′) − u(·, a) ∈ ℛ).
In addition to defining a set of payoff functions, a set R ⊂ ℛ induces a stochastic order on ∆(Ω)
(e.g. on posterior beliefs) as follows. For P, Q ∈ ∆(Ω), write Q ≻_R P if
\[ \forall r \in R: \quad \int_{\Omega} r(\omega)\, dP(\omega) \ge 0 \;\Rightarrow\; \int_{\Omega} r(\omega)\, dQ(\omega) \ge 0. \]
This stochastic order, ≻_R, represents a notion of single crossing (where we say that a function
g : ℝ → ℝ satisfies single crossing if g(x) ≥ 0 implies g(x′) ≥ 0 for all x′ > x).2 In principle, ≻_R
is weaker than standard stochastic dominance, which requires that for all r ∈ R, ∫ r dQ ≥ ∫ r dP.
However, for many cases of interest, the two are equivalent (Lemma 2, below). For instance, when
R is the set of nondecreasing functions, Q ≻_R P means exactly that Q dominates P in the sense of
FOSD. Our next definition uses this order.
Definition 2 Given R ⊂ ℛ, an information structure F is R-ordered if ≻_R is a complete order
on {F_W(·|x)}_{x∈X}, that is, for any x′ > x, F_W(·|x′) ≻_R F_W(·|x).
Our first Lemma shows that if a decision-maker with payoff function u ∈ U_R is faced with an
R-ordered information structure, she has an optimal decision rule that is monotone.
Lemma 1 Given R ⊂ ℛ, if u has R-incremental returns, and F is R-ordered, then there exists a
function α∗(x) ∈ argmax_{a∈A} ∫_Ω u(ω, a) dF_W(ω|x) that is nondecreasing in x.
Proof. Define U(x, a) = ∫_Ω u(ω, a) dF_W(ω|x). Then if a′ > a, by the R-order, U(x, a′) − U(x, a) ≥ 0
implies U(x′, a′) − U(x′, a) ≥ 0 for any x′ > x. This implies Shannon's (1995) weak single crossing
property, from which the result follows. Q.E.D.
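Lemma 1 can be checked numerically in the leading case where R is the set of nondecreasing functions, so that R-ordered means ordered by FOSD. The sketch below is only an illustration under stated assumptions: the three-state joint distribution `joint`, the payoff u(ω, a) = ωa − a²/2, and all function names are hypothetical choices made here, not objects from the paper.

```python
def posteriors(joint):
    """Posterior pmf over states for each signal; joint[i][j] = Pr(W=i, X=j)."""
    n_states, n_signals = len(joint), len(joint[0])
    out = []
    for j in range(n_signals):
        p_x = sum(joint[i][j] for i in range(n_states))
        out.append([joint[i][j] / p_x for i in range(n_states)])
    return out


def fosd_dominates(q, p, tol=1e-9):
    """True if pmf q first-order stochastically dominates pmf p."""
    cq = cp = 0.0
    for qi, pi in zip(q, p):
        cq += qi
        cp += pi
        if cq > cp + tol:  # the CDF of q must lie weakly below that of p
            return False
    return True


def is_fosd_ordered(joint):
    """Check that posteriors rise in the FOSD sense with the signal index."""
    post = posteriors(joint)
    return all(fosd_dominates(post[j + 1], post[j]) for j in range(len(post) - 1))


def optimal_rule(joint, payoff, actions):
    """An optimal action for each signal realization."""
    n_states, n_signals = len(joint), len(joint[0])
    return [
        max(actions, key=lambda a: sum(joint[i][j] * payoff(i, a) for i in range(n_states)))
        for j in range(n_signals)
    ]


# Hypothetical information structure with states 0, 1, 2 and three signals.
joint = [[0.20, 0.10, 0.03],
         [0.10, 0.14, 0.10],
         [0.03, 0.10, 0.20]]


def payoff(i, a):
    # Supermodular in (state, action): incremental returns rise with the state.
    return i * a - 0.5 * a * a


rule = optimal_rule(joint, payoff, [0, 1, 2])
swapped = [[row[1], row[0], row[2]] for row in joint]  # relabelled signals
```

Here `rule` is nondecreasing in the signal index, as the lemma predicts, while relabelling the signals as in `swapped` breaks the FOSD ordering.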
A family of decision-makers (Λ, U_R) induces a class of monotone decision problems, where each
decision problem is defined by a decision-maker (H, u) ∈ (Λ, U_R) and a signal X, and each decision
problem is monotone. Until Section 5.2, we fix Λ = {H} and so refer to a family of decision-makers
by the set of payoff functions U_R. Then, the induced class of monotone decision problems admits
all signals X where the corresponding information structure F is R-ordered.
2 There are a variety of alternative (i.e., weak and strong) notions of single crossing; see Milgrom and Shannon (1994)
or Shannon (1995).
2.2.1 A Characterization Lemma
In what follows, it is essential to have a clean characterization of the stochastic order introduced
above. To obtain this, we introduce a condition on the sets R ⊂ ℛ that we use to define sets
of payoff functions. Although it places some structure on the sets of payoff functions we might
consider, it does not restrict the analysis significantly in terms of examples of interest.
Condition 1 (i) R is a closed convex cone; (ii) there exists some r : Ω → ℝ such that r, −r ∈ R,
and either (a) r > 0 on a set of positive Lebesgue measure, or (b) r = 1_{ω0} for some ω0 ∈ Ω, and
further for some N > 0, r_n^+ = 1_{[ω0, ω0+1/n)} ∈ R and r_n^− = −1_{(ω0−1/n, ω0]} ∈ R for all n > N.3
If R is the set of nondecreasing functions (or, for instance, the set of concave functions), Condition 1
is satisfied with r, −r as the constant functions: r ≡ 1, −r ≡ −1. Part (ii)(b) is useful for
analyzing payoff functions that satisfy single crossing (such as portfolio problems), as discussed in
the next section. Condition 1 implies that if u ∈ U_R, then adding a benefit K_a r(ω) to each action
a ∈ A results in a new payoff function that is also in U_R.
When r = 1_{ω0}, it will be helpful to refer to a second condition:
Condition 2 Given P ∈ ∆(Ω) and a set R satisfying Condition 1, if r = 1_{ω0}, then either (a) ω0
is a mass point of P, or else (b) the density p is continuous at ω0.
Throughout, we abuse notation by writing E_W[r(W)] in place of lim_{n→∞} n ∫ 1_{[ω0, ω0+1/n)} dP =
p(ω0) when part (b) of Condition 2 is satisfied.4 We now observe that when Conditions 1 and 2
are satisfied, the stochastic order ≻_R admits an alternative representation.5
Lemma 2 Given R ⊂ ℛ and P, Q ∈ ∆(Ω), such that (R, P) and (R, Q) satisfy Conditions 1 and
2, then Q ≻_R P if and only if
\[ \forall r \in R, \quad \int_{\Omega} r\, dQ \ge \lambda \int_{\Omega} r\, dP \tag{2} \]
for some λ ≥ 0, where λ = (∫ r dQ) / (∫ r dP) if ∫ r dP > 0.
Proof. Suppose Q ≻_R P, and consider the problem of choosing r ∈ R to minimize ∫ r dQ subject to
the constraint that ∫ r dP ≥ 0. The minimized value must be nonnegative. Moreover, there is some
λ ≥ 0 such that this linear program is equivalent to choosing r ∈ R to minimize ∫ r dQ − λ ∫ r dP.
Thus, Q ≻_R P implies ∫ r dQ ≥ λ ∫ r dP for all r ∈ R. But if r, −r ∈ R (and using Condition 2
(ii)(b) if needed), we must have λ = (∫ r dQ) / (∫ r dP). The other direction is immediate. Q.E.D.
Note that if R contains the constant functions (so we can choose r ≡ 1), and Q, P are probability
distributions, then λ = 1, and ≻_R coincides with standard stochastic dominance.
3 This condition can be generalized to allow for other functions r; for example, we could require that there exist
two sequences of elements of R, {r_n^+} and {r_n^−}, where each element has positive Lebesgue measure and further,
r = lim_{n→∞} r_n^+ = − lim_{n→∞} r_n^−. We focus on the more specific condition for simplicity.
4 Although the sequence {n 1_{[ω0, ω0+1/n)}} is unbounded, we do not require each member of the sequence to be in R;
each time this construction is used, we will be working with inequalities of the form ∫ r dQ ≥ λ ∫ r dP, so that we can
multiply both sides by n.
5 This approach borrows from Jewitt (1986), Gollier and Kimball (1995), and Athey (1998a).
2.3 Examples
We now describe some sets of payoff functions, and the induced classes of monotone decision
problems, captured by our framework.
1. Payoff functions with nondecreasing incremental returns. A payoff function u(ω, a) with non-
decreasing incremental returns is supermodular in (ω, a). The property that the marginal returns
to action are nondecreasing in some unknown variable is a common feature of economic problems.
To see two simple examples, consider an oligopolistic firm choosing an output plan subject to
uncertainty about marginal cost reductions; or, consider a competitive firm investing in process
innovation to lower marginal cost, when the market-clearing price is unknown. Supermodularity
also arises in research and development problems, welfare economics, and coordination games and
search models (see Milgrom and Roberts (1990), and the recent books by Topkis (1998) and Cooper
(1999) for many examples). Often in such problems, the marginal cost of acting is a parameter
(to allow for varying taxes, subsidies, interest rates, production technologies, etc.); thus, payoff
functions of the form v(ω, a) − c(a) are considered for a rich family of functions c.
To derive the stochastic order for the set of nondecreasing functions R_ND, take r ≡ 1, and
observe that in Lemma 2 λ = (∫ dQ) / (∫ dP) = 1. Thus, ≻_{R_ND} is FOSD, and so if posterior beliefs
are ordered by FOSD, higher signal realizations will lead to higher actions when u is supermodular.
2. Payoff functions with concave incremental returns. Various investment problems with risk-
aversion have the property that the incremental returns to higher investment are concave in the
unknown asset returns. For instance, in the classic portfolio problem, a fixed endowment e must
be allocated between a risky asset (e.g. a market portfolio) with return W and a safe asset with
known return ω0. If the investor has a utility function v over wealth, and invests a in the risky
asset, then u(ω, a) = v(aω + (e − a)ω0). Her payoff function has concave incremental returns as a
function of ω if v has constant or increasing relative risk-aversion.
Let R_CV denote concave functions. Applying Lemma 2 (where r ≡ 1 again implies that λ = 1),
≻_{R_CV} coincides with second order stochastic dominance (SOSD) (Rothschild and Stiglitz, 1970).
Thus, the investor described above will invest more in the risky asset when she believes asset returns
to be less risky (by SOSD).
3. Payoff functions with incremental returns that cross zero at ω = ω0. Payoff functions
with this property arise in investment theory (Athey, 1998a). In the standard portfolio problem
described in the previous example, the incremental returns to investing are negative for ω < ω0,
and positive for ω > ω0. Formally, we generate a general set of payoff functions with this property
by considering R_SC(ω0) = {r : Ω → ℝ : r(ω) ≤ 0 if ω < ω0 and r(ω) ≥ 0 if ω > ω0}.
The stochastic order, ≻_{R_SC(ω0)}, for this set is a weakening of the monotone likelihood ratio
property (MLRP) (where f_W has the MLRP if for x′ > x, f_W(·|x′)/f_W(·|x) is a nondecreasing
function). To apply Lemma 2, we take r = 1_{ω=ω0} and use Condition 2. Then, if f_W(ω0|x) > 0
for all x, we find that F_W(·|x′) ≻_R F_W(·|x) if f_W(·|x′) − [f_W(ω0|x′)/f_W(ω0|x)] f_W(·|x) is single
crossing (or, more intuitively, if f_W(·|x) is positive on Ω, f_W(·|x′)/f_W(·|x) − f_W(ω0|x′)/f_W(ω0|x)
is single crossing).6
4. Payoff functions with single-crossing incremental returns. A payoff function u has single
crossing incremental returns if the incremental returns to higher action are single crossing as a
function of ω. This is the class of problems studied by Lehmann (1988) and Persico (2000). For
example, if a firm chooses price to maximize its expectation of u(p, ω) = (p − c)D(p, ω), then the
payoff function u(p, ω) will have single crossing incremental returns if an increase in ω reduces
demand elasticity (and more generally, a sufficient condition when u is positive is that ln(u) is
supermodular). Single crossing incremental returns also arise in auction models. See Milgrom and
Shannon (1994) and Athey (1998b) for further examples. Formally, define R_SC as follows: r ∈ R_SC
if there is some ω0 such that r ∈ R_SC(ω0). While R_SC does not satisfy our Condition 1, we can
analyze it indirectly by considering it as the union over all sets R_SC(ω0) such that ω0 ∈ Ω. If
posterior densities or mass functions are positive throughout Ω, by requiring that ≻_{R_SC(ω0)} holds
for all ω0, we find that the order ≻_R induced by R_SC is the MLRP.7
We may also study other sets of payoff functions. Below, we remark on a class of payoff functions
that arise in the study of first-price auctions, that lie between supermodular payoff functions
(example 1) and single-crossing payoff functions (example 4). We might also combine conditions
(e.g. nondecreasing and concave) or place restrictions on higher-order derivatives of r. For example,
if R is the set of affine functions, ≻_R compares distribution means, while if R contains
quadratic functions, ≻_R is a mean-variance order (which could be used for investment problems
with mean-variance preferences over gambles).
6 Although it may seem restrictive to assume f(ω0|x) > 0, it does not add much beyond the R-order for R_SC(ω0),
so long as h(ω0) > 0. (We can take ω0 in the support of H without loss of generality, but we still must assume a
positive density.) To see this, suppose in contrast that f(ω0|x) = 0 for some x. This implies (using the R-order) that
f(ω0|x′) = 0 for all x′ > x; but in turn, that implies (using Lemma 2 and the fact that f(ω0|x″) > 0 for some x″)
that E_W[r(ω)|X = x′] ≥ 0 for all r ∈ R and x′ > x. Thus, for each x′ > x, the support of f(ω|x′) is greater than
ω0, and every decision-maker in U_R should take the highest possible action, so that there is no additional value to
distinguishing among signals greater than x. As Athey (1998b) shows, if R satisfies stronger notions of single crossing,
the R-order imposes weaker restrictions on how the support changes with x, but we focus on R_SC(ω0) because this
set is a closed convex cone.
7 The analysis generalizes if densities are not always positive; see Athey (1998a, 1998b) for details.
3 Monotone Information Orders
We now derive conditions under which for all decision-makers in a family ({H}, U_R), the signal
X′ will be preferred to the signal X given that the corresponding information structures F′ and
F are R-ordered. That is, we provide conditions under which V(F′, u) ≥ V(F, u) for every payoff
function u with R-incremental returns.
As a prelude, we recall Blackwell's approach to informativeness. His approach proceeded in
three steps. First, he observed that for any payoff function u, if we define ū : ∆(Ω) → ℝ according
to ū(P) = max_{a∈A} ∫_Ω u(ω, a) dP(ω), ū is convex in the posterior, P.8 Second, when V(F, u) is
formulated as in (1), the problem of comparing X and X′ can be framed as a problem of comparing
the distributions over posteriors (denoted µ(·) and µ′(·)) generated by the two signals. Thus, if
∫ ū(P) dµ′(P) ≥ ∫ ū(P) dµ(P) for all convex functions ū : ∆(Ω) → ℝ, µ′ dominates µ according
to the convex stochastic dominance order, and any agent will prefer X′ to X. Third, he gave a
characterization of the stochastic dominance order in terms of the signals X′ and X, showing that
the order is equivalent to requiring that X is a garbling of X′.
In this section, we take a similar approach, in that we first consider the consequences of optimality
(in this case, monotone policies) together with properties of the payoff function, and then
derive an ordering based on comparisons between distributions over posteriors. In so doing, we
relax the stringent requirements of the convex stochastic dominance order. We return to further
characterizations of the order over posteriors (analogous to Blackwell's third step) in Section 5.
3.1 Sufficient Conditions
We group together a set of conditions that will be referred to throughout.
Condition 3 Given R ⊂ ℛ, a prior H, and a signal X, (i) R satisfies Condition 1, and (ii) for
each x ∈ X, F_W(·|X ≤ x) satisfies Condition 2, and E_W[r(W)|X = x] > 0.9
8 Convexity requires
\[ \max_{a \in A} \int_{\Omega} u(\omega, a)\, d(\gamma P + (1-\gamma) Q) \le \gamma \max_{a \in A} \int_{\Omega} u(\omega, a)\, dP + (1-\gamma) \max_{a \in A} \int_{\Omega} u(\omega, a)\, dQ, \]
which follows by linearity of the integral and because the agent can do better by optimizing for each posterior
rather than choosing the optimal action for the convex combination of posteriors.
9 When r = 1_{ω0}, this requires f(ω0|x) > 0 for all x; see footnote 6 for further discussion.
Given a prior H, a signal X, and R ⊂ ℛ satisfying Condition 3, we now define an indexing
function that identifies each signal realization with a real number between zero and one. Let10
\[ T(x) = \Pr(X \le x)\, \frac{E_W[r(W) \mid X \le x]}{E_W[r(W)]}. \]
Observe that T is a nondecreasing function mapping X → [0, 1]. An indexing function T′ can be
similarly defined for an alternative information structure F′. Then, T(X) and T′(X′) are signals
with the same information content as X and X′.11
The choice of this indexing function will play a central role in our derivations, in particular
in the derivation of necessary conditions. However, it will be easier to describe how T is derived
after we have shown how it is used. For the moment, we simply observe that if R contains the
constant functions, T(x) ≡ F_X(x): signal realizations are indexed by their ex ante percentile.
If R is the set of functions that are single crossing at ω0, T(x) ≡ F_X(x|ω0): signal realizations
are indexed by their percentile conditional on W = ω0. Below, we consider an example where
R is the set of functions that are nondecreasing for ω < ω0 and zero thereafter, in which case
T(x) ≡ F_X(x|W ≤ ω0).
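For a finite information structure the indexing function can be computed directly from its definition, which makes the two special cases above easy to verify. The following Python sketch is illustrative: `indexing_function`, the vector encoding of r, and the example joint distribution are assumptions introduced here. The point-mass case corresponds to part (a) of Condition 2, where ω0 is a mass point.

```python
def indexing_function(joint, r):
    """T(x_j) = Pr(X <= x_j) * E[r(W) | X <= x_j] / E[r(W)] for each signal j.

    joint[i][j] = Pr(W = state i, X = signal j); r[i] is r at state i.
    """
    n_states, n_signals = len(joint), len(joint[0])
    weight = [sum(r[i] * joint[i][j] for i in range(n_states)) for j in range(n_signals)]
    total = sum(weight)  # E[r(W)], assumed strictly positive
    index, running = [], 0.0
    for w in weight:
        running += w  # equals Pr(X <= x_j) * E[r(W) | X <= x_j]
        index.append(running / total)
    return index


# Hypothetical joint distribution over three states and three signals.
joint = [[0.20, 0.10, 0.03],
         [0.10, 0.14, 0.10],
         [0.03, 0.10, 0.20]]

t_constant = indexing_function(joint, [1.0, 1.0, 1.0])  # r = 1
t_point = indexing_function(joint, [0.0, 1.0, 0.0])     # r = indicator of state 1
```

With the constant r, `t_constant` reproduces the marginal signal CDF; with the indicator of the middle state, `t_point` reproduces the signal CDF conditional on that state.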
Theorem 1 Consider R ⊂ ℛ, a prior H, and two signals X and X′ satisfying Condition 3, where
the corresponding information structures F and F′ are R-ordered. If
\[ \forall z \in [0, 1]: \quad F'_W(\cdot \mid T'(X') \ge z) \succ_R F_W(\cdot \mid T(X) \ge z), \tag{MIO} \]
then V(F′, u) ≥ V(F, u) for every u with R-incremental returns.
If the conditions of Theorem 1 are satisfied, we say that X′ is more informative than X for all
decision-makers in the family ({H}, U_R), or that F′ ≻_{MIO−R} F.
The (MIO) condition compares averages of posterior beliefs, where the averages are computed
according to our indexing function. Because the average of all posteriors is just the prior, (MIO)
is equivalent to saying that:
\[ \forall z \in [0, 1]: \quad F'_W(\cdot \mid T'(X') \le z) \prec_R F_W(\cdot \mid T(X) \le z). \]
Thus, our informativeness criterion requires that the high posteriors under F′ be, on average, higher
than the high posteriors under F, and the low posteriors be, on average, lower, where "high"
and "low" refer to the stochastic order ≻_R induced by R. Put simply, informativeness corresponds
to posterior beliefs being more spread out in a given stochastic order.
10 If there is more than one r satisfying Condition 1, simply choose one. The choice will not affect the results that
follow.
11 T is strictly increasing at x unless Pr(X = x) = 0, in which case no information is lost. Moreover, following
Lehmann (1988), X can be transformed into an information-equivalent X∗ such that T∗ is continuous and onto.
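As an illustration, (MIO) can be checked mechanically in a simplified finite setting. The Python sketch below makes several assumptions that are not in the paper: R is the set of nondecreasing functions (so ≻_R is FOSD and T is the signal CDF), both signals are discrete with identical marginal distributions (so the events T(X) ≥ z and T′(X′) ≥ z are the same upper sets of signal indices), and the "blurred" signal is a hypothetical construction in which the sharp signal is replaced by an independent draw half of the time.

```python
def upper_tail_cdf(joint, cut):
    """CDF over states of W conditional on the signal index being >= cut."""
    mass = [sum(row[cut:]) for row in joint]
    total = sum(mass)
    cdf, running = [], 0.0
    for m in mass:
        running += m / total
        cdf.append(running)
    return cdf


def satisfies_mio_fosd(joint_new, joint_old, tol=1e-9):
    """Check (MIO) under FOSD, assuming identical discrete signal marginals."""
    n_signals = len(joint_old[0])
    for j in range(n_signals):
        p_new = sum(row[j] for row in joint_new)
        p_old = sum(row[j] for row in joint_old)
        if abs(p_new - p_old) > tol:
            raise ValueError("this sketch assumes equal signal marginals")
    for cut in range(n_signals):
        new = upper_tail_cdf(joint_new, cut)
        old = upper_tail_cdf(joint_old, cut)
        # FOSD: the upper-tail CDF under the new signal lies weakly below.
        if any(cn > co + tol for cn, co in zip(new, old)):
            return False
    return True


# Hypothetical example: rows are states, columns are signal realizations.
sharp = [[0.20, 0.10, 0.03],
         [0.10, 0.14, 0.10],
         [0.03, 0.10, 0.20]]
prior = [sum(row) for row in sharp]
signal_marginal = [sum(row[j] for row in sharp) for j in range(3)]
blurred = [[0.5 * sharp[i][j] + 0.5 * prior[i] * signal_marginal[j] for j in range(3)]
           for i in range(3)]

sharp_beats_blurred = satisfies_mio_fosd(sharp, blurred)
blurred_beats_sharp = satisfies_mio_fosd(blurred, sharp)
```

In this example the upper-tail posteriors under `sharp` dominate those under `blurred` at every cut, and the reverse comparison fails, which matches the intuition that the sharp signal spreads posteriors further apart.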
Proof of Theorem 1. The proof proceeds in two steps: first, we apply Lemma 2 to obtain an
alternative characterization of (MIO); second, we show that for any monotone policy based on
observing X, there is an alternative policy based on observing X′ that does at least as well.
Step 1: Fix z ∈ [0, 1]. By Lemma 2, (MIO) is equivalent to
\[ \forall r \in R: \quad \int_{\Omega} r(\omega)\, dF'_W(\omega \mid T'(X') \ge z) \ge \lambda_z \int_{\Omega} r(\omega)\, dF_W(\omega \mid T(X) \ge z), \]
\[ \text{where } \lambda_z = \frac{\int r(\omega)\, dF'_W(\omega \mid T'(X') \ge z)}{\int r(\omega)\, dF_W(\omega \mid T(X) \ge z)} = \frac{\Pr(T(X) \ge z)}{\Pr(T'(X') \ge z)}. \]
The second equality follows from our definition of T, T′ (observe that if r ≡ 1, λ_z = 1). Thus (MIO)
holds if and only if for all z ∈ [0, 1] and r ∈ R,
\[ E_W[r(W) \mid T'(X') \ge z]\, \Pr(T'(X') \ge z) \ge E_W[r(W) \mid T(X) \ge z]\, \Pr(T(X) \ge z). \tag{3} \]
Step 2: Suppose F, F′ are R-ordered and that (MIO) holds. And suppose A = {a_1, ..., a_n} is
finite. Consider an arbitrary payoff function u with R-incremental returns. We can write u(ω, a_k) =
u(ω, a_1) + Σ_{i=2}^{k} r_i(ω), where r_i(ω) = u(ω, a_i) − u(ω, a_{i−1}) ∈ R.
Consider an arbitrary monotone policy α : X → A for use with F. It is defined by a set of
cut points: x_1 ≤ x_2 ≤ ... ≤ x_{n+1}, with α(x) = a_i when x_i < x < x_{i+1}, and α(x) = max_{j: x_j = x} a_j
when x = x_i for some i. The cut points also can be represented by their indices {z_i}_{i=1}^{n+1}, where
z_i ≡ T(x_i). The ex-ante payoff using this policy with F is:
\[ V(F, u, \alpha) = \int_{X} \int_{\Omega} u(\omega, \alpha(x))\, dF_W(\omega, x) = E_W[u(W, a_1)] + \sum_{i=2}^{n} E_W[r_i(W) \mid T(X) \ge z_i]\, \Pr(T(X) \ge z_i). \tag{4} \]
We now construct an alternative policy α′ for use with F′ that does better than the policy
α used in conjunction with F. This new policy is given by cut points {x′_i}, where we define
T′(x′_i) = T(x_i) = z_i for all i = 1, ..., n + 1. Then, e.g., α′(x) = a_i when x′_i < x < x′_{i+1}. The payoff
to this new policy under F′ is:
\[ V(F', u, \alpha') = E_W[u(W, a_1)] + \sum_{i=2}^{n} E_W[r_i(W) \mid T'(X') \ge z_i]\, \Pr(T'(X') \ge z_i). \tag{5} \]
It follows immediately from (3) that if (MIO) holds, then V(F′, u, α′) ≥ V(F, u, α). Since there is
a monotone policy under F that is optimal, we are done.
The case where A is compact follows from a limiting argument. Any monotone policy α(x)
used under F can be approximated by a sequence of step functions α^1(x), α^2(x), ... converging
to α(x), and for each step function, we can construct a policy α^{k′}(x) for use with F′ such that
V(F′, u, α^{k′}) ≥ V(F, u, α^k). Moreover, α^{k′}(x) → α′(x), for some monotone policy α′(x). Since u is
continuous in a, it follows that V(F′, u, α′) ≥ V(F, u, α). Q.E.D.
The proof shows that for any monotone policy α : X → A for use with F, there is a policy α′ :
X → A for use with F′ that achieves a higher ex ante expected payoff, where α′(x) = α(T′^{−1}(T(x))).
The key to the result is the transformation T′^{−1}(T) : X → X (i.e. the indexing functions T, T′),
which plays two important and somewhat subtle roles.
First, the reader may note from the proof that a sufficient condition for X′ to be more informative
than X is that there is some transformation Φ : X → X such that (3) holds with Φ in place of
T and the identity in place of T′ (in which case the policy α(Φ(·)) used with X′ does better than
the policy α(·) used with X). In general, however, (3) bears no relationship to a standard stochastic
dominance relation, nor is it intuitive or easy to check. Thus, one role of the transformations
T′, T is to allow us to move directly between the payoff comparison (3) and an easily interpretable
stochastic dominance relation (MIO).
Second, and more subtly, observe that the proof of Theorem 1 does not appeal to the optimality
of the initial policy α, only to its monotonicity. We show below, however, that (MIO) (or equivalently
(3)) is both sufficient and necessary for X′ to be more informative than X for the relevant
class of decision problems. Remarkably, this is not true for any transformation Φ : X → X other
than T′^{−1}(T). That is, the analogue of (3) is sufficient but not necessary for X′ to be more informative
than X. Another way to say this is that our choice of indexing function has the very special
property that it fully incorporates an optimality restriction into a standard stochastic dominance
relation (MIO). This point is discussed in more detail following Theorem 2.
3.2 Examples
1. Payoff functions with nondecreasing incremental returns. If a decision-maker's incremental
returns are nondecreasing in the state variable, she wants to match high actions with high beliefs
about W, and low actions with low beliefs. All such decision-makers with prior H will prefer a
signal X′ to another X, given that the induced posterior beliefs can be ordered by FOSD, if for all
z ∈ [0, 1], F′_W(·|F′_X(X′) ≥ z) dominates F_W(·|F_X(X) ≥ z) by FOSD; that is, if high realizations
of X′ induce, on average, higher beliefs (in the sense of FOSD) than high realizations of X.12
12 It can further be shown that in this case, (MIO) is equivalent to requiring that for all (ω, x) ∈ (Ω,X ),F 0(ω, F 0X(x)) ≥ F (ω, FX(x)). This, in turn, is equivalent to the supermodular stochastic order (see Meyer (1990)
11
Intuitively, this allows for a better match between actions and beliefs.
The following example illustrates this idea. Fix a prior $H \in \Delta(\Omega)$ with density $h$ and define, for $\theta \in [0, \bar\theta]$, a signal $X^\theta$ with support $[0,1]$ such that $(W, X^\theta)$ have joint density:
$$f^\theta(\omega, x) = h(\omega) + \theta k(\omega) l(x),$$
where $k : \Omega \to \mathbb{R}$ and $l : [0,1] \to \mathbb{R}$ are bounded increasing functions with $\int_\Omega k(\omega)\,d\omega = \int_0^1 l(x)\,dx = 0$, and $f^\theta \ge 0$. Note that this example is constructed so that $T^\theta(x) = F^\theta_X(x) = x$. Further, for any $\theta$, the posteriors of $W \mid X^\theta = x$ are ordered by FOSD (typically, they will not have the MLRP). Finally, an increase in $\theta$ makes $X^\theta$ more informative for supermodular decision-makers. However, it will generally not be the case that $X^{\theta'}$ is sufficient for $X^\theta$ when $\theta' > \theta$, or even that $X^{\theta'}$ dominates $X^\theta$ in Lehmann's order (defined below in (12)). To better understand this example, observe that for high (low) values of $x$, $l(x) > (<)\, 0$, and an increase in $\theta$ puts more positive (negative) weight on the nondecreasing function $k$, thus shifting probability weight towards (away from) high values of $\omega$.
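The construction above can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes $\Omega = [0,1]$, a uniform prior ($h \equiv 1$), and the illustrative choices $k(\omega) = \omega - 1/2$ and $l(x) = x - 1/2$, and it discretizes the density on a midpoint grid. The function names are hypothetical. It verifies that raising $\theta$ leaves both marginals unchanged while weakly raising every joint upper-tail probability $\Pr(W \ge \omega, X \ge z)$, which is the FOSD comparison of the upper-tail posteriors since $\Pr(X \ge z)$ does not depend on $\theta$.

```python
import numpy as np

def joint_cell_probs(theta, n=200):
    """Cell probabilities of f_theta(w, x) = h(w) + theta*k(w)*l(x) on an n-by-n midpoint grid."""
    mid = (np.arange(n) + 0.5) / n            # cell midpoints on [0, 1]
    h = np.ones(n)                            # assumed uniform prior density
    k = mid - 0.5                             # assumed k: increasing, integrates to zero
    l = mid - 0.5                             # assumed l: increasing, integrates to zero
    f = h[:, None] + theta * np.outer(k, l)   # rows index w, columns index x
    return f / n**2

def upper_orthant(p):
    """P(W in cell >= i, X in cell >= j) for every grid cell (i, j)."""
    return p[::-1, ::-1].cumsum(axis=0).cumsum(axis=1)[::-1, ::-1]

p_lo, p_hi = joint_cell_probs(1.0), joint_cell_probs(3.0)

# Both marginals are the same for the two values of theta.
same_marginals = (np.allclose(p_lo.sum(axis=1), p_hi.sum(axis=1))
                  and np.allclose(p_lo.sum(axis=0), p_hi.sum(axis=0)))

# Upper-tail probabilities weakly increase with theta (small tolerance for rounding).
tails_shift_up = bool(np.all(upper_orthant(p_hi) - upper_orthant(p_lo) >= -1e-10))
print(same_marginals, tails_shift_up)
```

With these particular choices the density stays nonnegative for $\theta \le 4$; other choices of $h$, $k$, $l$ would need their own bound on $\theta$.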
Below, in Section 5, we provide further characterizations of this information order in terms of
marginal-preserving spreads.
2. Payoff functions with concave incremental returns. A decision-maker with incremental returns that are concave as a function of the state variable wants to match high actions with less risky beliefs. So $X'$ is more informative than $X$ under (MIO) if, for every $z \in [0,1]$, $F'_W(\cdot \mid F'_X(X') \ge z)$ dominates $F_W(\cdot \mid F_X(X) \ge z)$ by SOSD. That is, high realizations of the signal lead, on average, to less risky posteriors under $F'$ than $F$, and low realizations of the signal lead to more risky posteriors. Again, (MIO) ensures a better match between actions and beliefs.
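The following example illustrates the idea. Fix a prior $H \in \Delta(\Omega)$ with density $h$ and define, for $\theta \in [0, \bar\theta]$, a signal $X^\theta$ with support $[0,1]$ such that $(W, X^\theta)$ have joint density:
$$f^\theta(\omega, x) = h(\omega) + \theta k(\omega) l(x),$$
where again $l : [0,1] \to \mathbb{R}$ is a bounded increasing function with $\int_0^1 l(x)\,dx = 0$, and $k : \Omega \to \mathbb{R}$ is a bounded function that satisfies $\int_\Omega k(\omega)\,d\omega = \int_\Omega \omega k(\omega)\,d\omega = 0$ and $\int_{-\infty}^{t} \left( \int_{-\infty}^{s} k(\omega)\,d\omega \right) ds \le 0$.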
Again the example is constructed so that $T^\theta(x) = F^\theta_X(x) = x$. The function $k$ centers mass around the mean of $W$, so for any $\theta$, a higher realization of $X^\theta$ leads to a posterior belief that is higher in the sense of second order stochastic dominance. Similarly, an increase in $\theta$ makes $X^\theta$ more informative for decision-makers whose payoff functions have concave marginal returns. Again, $X^{\theta'}$ need not be sufficient for $X^\theta$ when $\theta' > \theta$. To better understand this example, observe that for high (low) values of $x$, $l(x) > (<)\, 0$, and an increase in $\theta$ puts more positive (negative) weight on the centering function $k$, thus shifting probability weight towards (away from) central values of $\omega$.
and Shaked and Shanthikumar (1997)) for the comparison between $(W, F'_X(X'))$ and $(W, F_X(X))$.
Below, in Section 5, we provide further characterizations of this information order in terms of
mean and marginal-preserving spreads.
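As a concrete instance of a centering function, the sketch below checks the three integral conditions on $k$ numerically. It is an illustration under stated assumptions only: the state space is taken to be $[0,1]$ and $k(\omega) = 1/12 - (\omega - 1/2)^2$ is a hypothetical choice, not one made in the paper. Integrals are approximated by midpoint and cumulative sums, so the checks use a small tolerance.

```python
import numpy as np

n = 2000
w = (np.arange(n) + 0.5) / n          # midpoint grid on the assumed state space [0, 1]
k = 1.0 / 12.0 - (w - 0.5) ** 2       # assumed centering function: positive in the middle

int_k = k.sum() / n                   # approximates the integral of k(w)
int_wk = (w * k).sum() / n            # approximates the integral of w * k(w)
inner = np.cumsum(k) / n              # s -> integral of k up to s
double = np.cumsum(inner) / n         # t -> integral of the inner integral up to t

tol = 1e-6                            # allowance for discretization error
conditions_hold = (abs(int_k) < tol) and (abs(int_wk) < tol) and (double.max() <= tol)
print(conditions_hold)
```

The first two conditions say that adding $\theta k(\omega) l(x)$ changes neither the total mass nor the mean of the posterior; the third is the usual integral condition for a mean-preserving change in risk.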
3.3 Necessary Conditions for Smoothly Parameterized Families
In this section, we demonstrate that (MIO) is not just a sufficient condition for X 0 to be more infor-
mative than X for a given class of monotone decision problems, it is also a necessary condition for
small changes in the signal structure (i.e., for making comparisons along a smoothly parameterized
family of signals).
A family of signals $\{X^\theta\}_{\theta \in \Theta \subset \mathbb{R}}$ is smoothly parametrized if, for all $\omega \in \Omega$ and $x \in \mathcal{X}$, $F^\theta_X(x \mid \omega)$ is continuously differentiable in $\theta$.
Theorem 2 Consider R ⊂ R, a prior $H$, and a smoothly parametrized family of signals $\{X^\theta\}_{\theta \in \Theta}$ satisfying Condition 3, where for each $\theta$, the information structure $F^\theta$ is R-ordered. Then $\frac{\partial}{\partial\theta} V(F^\theta, u) \ge 0$ for every $u$ with R-incremental returns if and only if (MIO) holds for $F' = F^{\theta + d\theta}$ and $F = F^\theta$.
The "only if" proof proceeds in two steps. First, we use the envelope theorem to show that a comparison between $F' = F^{\theta + d\theta}$ and $F = F^\theta$ can be made holding the optimal decision rule fixed. We then construct a subset of decision-makers in $(H, U^R)$ with the property that $X'$ is preferred to $X$ for all of these decision-makers exactly when (MIO) holds.
Proof. The "if" direction is a consequence of Theorem 1. We prove the "only if" direction.
Step 1: We apply the envelope theorem. It is useful to first transform the signals using the indexing functions, and to introduce notation for the joint distribution over states and indexed signals. For each $\theta$, let $Z^\theta \equiv T^\theta(X^\theta)$, which is identical to $X^\theta$ in information content. Then, $W$ and $Z^\theta$ have joint distribution $G^\theta : \Omega \times [0,1] \to [0,1]$ with $G^\theta(\omega, z) \equiv F^\theta(\omega, (T^\theta)^{-1}(z))$, and
$$V(F^\theta, u) = E_{Z^\theta}\left[ \max_{a \in A} \int_\Omega u(\omega, a)\, dG^\theta_W(\omega \mid Z^\theta) \right].$$
Denoting the optimal policy $\alpha^\theta(z)$, and applying the envelope theorem:
$$\frac{\partial}{\partial\theta} V(F^\theta, u) = \int_{[0,1]} \int_\Omega u(\omega, \alpha^\theta(z))\, d\left\{ \frac{\partial}{\partial\theta} G^\theta(\omega, z) \right\}. \tag{6}$$
For small changes in $\theta$, we can hold the optimal policy fixed, and consider the change in expected payoffs due to the change in the joint distribution over signals and states.
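Step 2: Let $r \in R$ and $z \in [0,1]$ be given. Define a payoff function $u_{rz}$ such that $u_{rz}(\omega, a) = 0$ if $a < \bar a$, while $u_{rz}(\omega, a) = r(\omega) - K_{rz}\bar r(\omega)$ if $a \ge \bar a$, where $K_{rz} = E[r(\omega) \mid Z^\theta = z] \,/\, E[\bar r(\omega) \mid Z^\theta = z]$ is a fixed constant. If $r = 1_{\omega_0}$, define a series of such payoff functions corresponding to $r_n^+, r_n^-$ as in Condition 1. Each $u_{rz}$ has R-incremental returns, and the optimal policy for the decision problem $\langle u_{rz}, G^\theta \rangle$ is to choose $a \ge \bar a$ if and only if $Z^\theta \ge z$. So
$$V(F^\theta, u_{rz}) = E_W[\, r(W) - K_{rz}\bar r(W) \mid Z^\theta \ge z \,] \Pr(Z^\theta \ge z).$$
Our definition of the indexing function implies that $E_W[\, \bar r(W) \mid Z^\theta \ge z \,] \Pr(Z^\theta \ge z)$ is constant in $\theta$. Thus, a necessary condition for $\frac{\partial}{\partial\theta} V(F^\theta, u_{rz}) \ge 0$ for all such $u_{rz}$ is that for all $r \in R$ and $z \in [0,1]$,
$$E_W\left[ r(W) \mid T^{\theta+d\theta}(X^{\theta+d\theta}) \ge z \right] \Pr\left(T^{\theta+d\theta}(X^{\theta+d\theta}) \ge z\right) \ge E_W\left[ r(W) \mid T^\theta(X^\theta) \ge z \right] \Pr\left(T^\theta(X^\theta) \ge z\right). \tag{7}$$
The proof of Theorem 1 (Step 1) shows that this is equivalent to the requirement that (MIO) holds for $F' = F^{\theta + d\theta}$ and $F = F^\theta$. Q.E.D.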
Again, we remark on the crucial role of the indexing functions $T^\theta$. The proof of Theorem 2 shows that for two close signals, (7) (equivalently (MIO)) is a necessary condition for $\frac{\partial}{\partial\theta} V(F^\theta, u) \ge 0$. Observe that (7) is the same condition as the payoff comparison (3) in the proof of Theorem 1. Earlier, we observed that an analogue of (7) with some arbitrary transformation $\Phi : \mathcal{X} \to \mathcal{X}$ in place of $T^\theta$ and the identity in place of $T^{\theta+d\theta}$ would be a sufficient condition for $\frac{\partial}{\partial\theta} V(F^\theta, u) \ge 0$. However, it is not necessary.
To see this, suppose we had used an alternative set of increasing onto functions $S^\theta : \mathcal{X} \to [0,1]$ to index signals (instead of $T^\theta$). Mimicking the above proof, a necessary condition for $\frac{\partial}{\partial\theta} V(F^\theta, u) \ge 0$ for all $u \in U^R$ would be that (7) hold (with $S^\theta$ in place of $T^\theta$) for all $z \in [0,1]$ and some subset of incremental return functions $R_z \subset R$. Specifically, for each $z$, $R_z$ contains all functions $r \in R$ for which there exists a payoff function $u \in U^R$ such that the optimal policy is to switch actions at $z$ and the payoff conditional on using this policy is $E[r \mid S^\theta(X^\theta) \ge z] \Pr(S^\theta(X^\theta) \ge z)$. Thus, making a tight ranking of information structures for any comparison of signals $\Phi : \mathcal{X} \to \mathcal{X}$ other than the one based on our indexing functions requires explicitly incorporating a separate optimality condition for each payoff function in the relevant class! In contrast, by using $T^\theta$-averages to compare information structures, we guarantee that for every $z$, $R_z = R$, so the necessary (and sufficient) condition for $\frac{\partial}{\partial\theta} V(F^\theta, u) \ge 0$ is simply the stochastic dominance relation (MIO).
Remark. We can be somewhat more mathematically precise about the last comment. Fix some arbitrary indexing functions $S^\theta$. A necessary condition for $\frac{\partial}{\partial\theta} V(F^\theta, u) \ge 0$ for all $u \in U^R$ is that
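for all $z \in [0,1]$,13
$$0 \le L(z) = \inf_{r \in R} \left\{ E\left[r \mid S^{\theta+d\theta}(X^{\theta+d\theta}) \ge z\right] \Pr\left(S^{\theta+d\theta}(X^{\theta+d\theta}) \ge z\right) - E\left[r \mid S^\theta(X^\theta) \ge z\right] \Pr\left(S^\theta(X^\theta) \ge z\right) \right\} \quad \text{s.t. } E\left[r \mid S^\theta(X^\theta) = z\right] = 0. \tag{8}$$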
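By standard methods, $L(z) \ge 0$ if and only if there exists $\lambda(z) \ge 0$ such that, for all $r \in R$,
$$\left\{ E\left[r \mid S^{\theta+d\theta}(X^{\theta+d\theta}) \ge z\right] \Pr\left(S^{\theta+d\theta}(X^{\theta+d\theta}) \ge z\right) - E\left[r \mid S^\theta(X^\theta) \ge z\right] \Pr\left(S^\theta(X^\theta) \ge z\right) \right\} \ge \lambda(z) \cdot E\left[r \mid S^\theta(X^\theta) = z\right]. \tag{9}$$
Under Condition 3,
$$\lambda(z) = \left\{ E\left[r \mid S^{\theta+d\theta}(X^{\theta+d\theta}) \ge z\right] \Pr\left(S^{\theta+d\theta}(X^{\theta+d\theta}) \ge z\right) - E\left[r \mid S^\theta(X^\theta) \ge z\right] \Pr\left(S^\theta(X^\theta) \ge z\right) \right\} \Big/ E\left[r \mid S^\theta(X^\theta) = z\right].$$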
We can think of $\lambda(z)$ as the shadow value of the optimality constraint in (8). For an arbitrary transformation of signals $S^\theta$, $\lambda(z)$ is positive: the optimality constraint is binding. In other words, a tight comparison of $X^{\theta+d\theta}$ and $X^\theta$ with respect to the indexing $S^{\theta+d\theta}, S^\theta$ must explicitly account for the fact that the decision-maker will always use an optimal policy. However, by comparing $X^{\theta+d\theta}$ and $X^\theta$ with respect to the indexing $T^{\theta+d\theta}, T^\theta$, we guarantee that $\lambda(z) = 0$ for all $z$. The fact that decision-makers will optimize is relevant for the information comparison only to the extent that it ensures they will use monotone policies. This is the key to obtaining a clean and meaningful characterization of information value in monotone problems.
3.4 Global Necessity for Monotone Testing Problems
Theorem 2, which establishes that our informativeness criterion is necessary for all payoff functions with R-incremental returns to prefer $F'$ to $F$, is a local result that applies to marginal changes in information. Lehmann (1988) was able to show a somewhat different necessity result for the class of problems he studied. He showed that his efficiency criterion was globally necessary for improvement in a class of monotone hypothesis testing problems.14 We now demonstrate that our information orders have an analogous property.
13 To see this, note that if $r \in R$, then $E[r \mid S^\theta(X^\theta) = z]$ is single crossing in $z$. So if $E[r \mid S^\theta(X^\theta) = z] = 0$, define $\tilde r, K_{\tilde r z}$ such that $\tilde r - K_{\tilde r z}\bar r = r$ to obtain a payoff function $u_{\tilde r z}$ as in the above proof. The objective here is the change in payoff for this $u_{\tilde r z}$ when $\theta$ increases.
14 Lehmann's (1988) order is necessary for information comparisons in exactly the same circumstances that our monotone information orders are (given $R$); that is, there is no sense in which Lehmann's (1988) order is tighter for the class of single crossing functions than (MIO) is for a given $R$.
Fix R ⊂ R. For any $r \in R$, we consider the statistical problem of testing the hypothesis $H_0 : r(W) \ge 0$ against the alternative $r(W) < 0$, with a constraint on the level of the test. The problem is:
$$\max_{a : \mathcal{X} \to \{0,1\}} \int_{\mathcal{X}} \int_\Omega a(x) r(\omega)\, dF(\omega, x) \tag{10}$$
$$\text{s.t. } \int_{\mathcal{X}} a(x)\, dT(x) = 1 - z,$$
where $a = 1$ means "accept" and $a = 0$ means "reject". When $R$ is the set of functions that cross zero at $\omega_0$, the constraint in (10) is that the test has level $z$. That is, the probability of rejection, conditional on $W = \omega_0$, is $z$. When $R$ contains the constant functions, the constraint is that the average (or ex ante) probability of rejection is $z$. We consider a generalized economic example of such a problem, grading students on a curve, below.
Call (10) an R-monotone testing problem if $F$ is R-ordered. Then the class of all R-monotone testing problems is obtained by considering all $r \in R$ and all $z \in [0,1]$.
Theorem 3 Consider R ⊂ R, a prior $H$, and two signals $X$ and $X'$ satisfying Condition 3, where the corresponding information structures $F$ and $F'$ are R-ordered. Then $F'$ is more informative than $F$ for all R-monotone testing problems if and only if (MIO) holds.
Proof. Consider the problem (10). The optimal policy is "accept", $a(x) = 1$, if and only if $X = x \ge \bar x$, where $T(\bar x) = z$. The ex ante payoff under $F$ is thus $E_W[r(W) \mid T(X) \ge z] \Pr(T(X) \ge z)$. This payoff is higher under $F'$ if and only if (3), or equivalently (MIO), holds. Q.E.D.
3.4.1 Application: Testing and Evaluation
As an economic example, consider the problem facing an examiner who must grade a large number of students to a curve based on test results. Grading based on a student's percentile in the distribution is common, not only in university classes, but also in entrance exams, aptitude tests, and applications for civil service positions around the world. Suppose the students are ex ante identical with distribution of capabilities given by $H$, and that student $i$ has unknown capability $W_i$. For each student, the examiner observes an informative test score $X_i$. The examiner would like to assign the highest grades to the most capable students (i.e. her payoff function $u : \Omega \times A \to \mathbb{R}$ is supermodular). Suppose that the possible grades are given by $j = 1, 2, \ldots, J$ with $J$ the highest, and that the grade curve mandates that a fraction $\beta_j$ must receive grade $j$ or lower. Assuming that the posterior beliefs are ordered by FOSD, the optimal grading policy will be monotone in the students' test scores. Given the large number of (ex ante identical) students, the optimal grading policy is to assign grade $j$ to any student with a test score $x$ such that $F_X(x) \in (\beta_{j-1}, \beta_j]$. The question of interest is under what conditions will one possible test be more valuable than another? The answer is that the examiner necessarily will receive higher expected utility from grading if and only if the information revealed by the test improves in the sense of (MIO) for $R^{ND}$, the set of nondecreasing functions.
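The percentile-based grading rule can be sketched in a few lines. This is an illustration only: the function name `grade_on_curve`, the use of the empirical distribution of scores in place of $F_X$, and the sample scores and cutoffs are all assumptions introduced here, and ties in scores are not handled.

```python
import numpy as np

def grade_on_curve(scores, betas):
    """Assign grade j (1-based) to a score whose empirical percentile lies in (beta_{j-1}, beta_j]."""
    scores = np.asarray(scores, dtype=float)
    betas = np.asarray(betas, dtype=float)     # cumulative fractions, increasing, ending at 1.0
    ranks = scores.argsort().argsort() + 1     # 1 = lowest score; assumes no tied scores
    pct = ranks / len(scores)                  # empirical analogue of F_X(x)
    # side="left" finds the first beta_j with pct <= beta_j, i.e. pct in (beta_{j-1}, beta_j]
    return np.searchsorted(betas, pct, side="left") + 1

# Hypothetical class of five students; 20% get grade 1, the next 40% grade 2, the rest grade 3.
grades = grade_on_curve([55, 90, 70, 40, 85], betas=[0.2, 0.6, 1.0])
print(grades.tolist())
```

The rule depends on a score only through its percentile, which is why the relevant comparison between tests is stated in terms of the indexed signal $F_X(X)$ rather than the raw score.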
3.5 Further Examples
3/4. Payoff functions with incremental returns that are single crossing. We now use our general (MIO) condition to re-derive Lehmann's efficiency order, and to provide an alternative characterization that seems potentially useful for applications. Consider two information structures $F'$ and $F$ whose posteriors are ordered by the MLRP. First, fix the point at which incremental returns cross zero, $\omega_0$, i.e. take $R = R^{SC}(\omega_0)$. We can re-express (MIO) using our earlier definition of the $\succ_R$ order: for all $z \in [0,1]$:
$$\Pr\left(F'_X(X' \mid \omega_0) \ge z \mid \omega\right) \ge \Pr\left(F_X(X \mid \omega_0) \ge z \mid \omega\right)$$
if and only if $\omega \ge \omega_0$.15 Or, alternatively, for all $z \in [0,1]$,
$$F_X\left(F_X^{-1}(z \mid \omega_0) \mid \omega\right) \ge F'_X\left(F'^{-1}_X(z \mid \omega_0) \mid \omega\right) \tag{11}$$
if and only if $\omega \ge \omega_0$.
For $F'$ to be more informative than $F$ for all payoff functions with single crossing incremental returns, this condition must hold for all $\omega_0$. Or in other words, the inequality (11) must hold for any $z \in [0,1]$ and any $\omega, \omega_0 \in \Omega$ with $\omega \ge \omega_0$. Applying the monotone transformation $F'^{-1}_X(\cdot \mid \omega)$ to each side, and substituting $x = F_X^{-1}(z \mid \omega_0)$, we obtain the equivalent condition that:
$$\forall x \in \mathcal{X}, \quad F'^{-1}_X\left(F_X(x \mid \omega) \mid \omega\right) \text{ is nondecreasing in } \omega. \tag{12}$$
This is the condition derived by Lehmann (1988).
We can also use (11) and our approach of expressing information orders in terms of orders over average posterior beliefs, to re-express Lehmann's criterion as a likelihood ratio ordering on posterior beliefs.
15 This condition turns out to have a very nice interpretation in the spirit of hypothesis testing. Suppose $r(\omega)$ crosses zero at $\omega_0$. Then a statistical test of the hypothesis $H_0 : r(\omega) \ge 0$ against the alternative $r(\omega) < 0$ is a test of $\omega > \omega_0$. The constraint is $\Pr(\text{reject} \mid \omega_0) = z$: the probability of rejection conditional on the truth being $\omega_0$ is $z$, and the optimal test has the form: accept if and only if the realization of $X$ is greater than $x$, where $F_X(x \mid \omega_0) = z$. Thus (MIO) states that $X'$ provides a more powerful test of level $z$ than $X$: it is more likely to accept (reject) when the true state is above (below) $\omega_0$.
Corollary 1 Suppose that for all $\theta$ and all $x \in \mathcal{X}$, the posterior belief has a density that is differentiable. Then $F_{X^\theta, W}$ is ordered by (MIO) for $R = R^{SC}$, if and only if, for all $\omega \in \Omega$ and all $z \in [0,1]$,
$$\frac{f'_W\left(\omega \mid X^\theta \ge F^{-1}_{X^\theta}(z \mid \omega)\right)}{f_W\left(\omega \mid X^\theta \ge F^{-1}_{X^\theta}(z \mid \omega)\right)} \text{ is nondecreasing in } \theta. \tag{13}$$
The proof of this result is in the Appendix. In applying this characterization, it is important to observe that $\omega$ appears in two roles in the formula: once as the argument of the density function, and once in determining the cutoff value of $x$, $F^{-1}_{X^\theta}(z \mid \omega)$.
5. Payoff functions that are nondecreasing on $\omega \le \omega_0$, and zero thereafter. Payoff functions that take this form arise in the study of first-price auctions with affiliated values, where $W$ represents the opponent's signal.16 Formally, define $R^{NDZ}(\omega_0)$ as follows: $r \in R^{NDZ}(\omega_0)$ if $r$ is nondecreasing on $\omega \le \omega_0$ and $r(\omega) = 0$ for all $\omega > \omega_0$. This class of payoff functions satisfies Condition 1 with $\bar r(\omega) \equiv 1_{\omega \le \omega_0}$, and the order $\succ_R$ induced by $R^{NDZ}(\omega_0)$ requires that $F_W(\cdot \mid x') \succ_R F_W(\cdot \mid x)$ if $F_W(\cdot \mid x') / F_W(\cdot \mid x) \le F_W(\omega_0 \mid x') / F_W(\omega_0 \mid x)$; this is simply FOSD conditional on $\omega \le \omega_0$. Taking the set of payoff functions $R^{NDZ} = \cup_{\omega_0 \in \Omega} R^{NDZ}(\omega_0)$ induces as $\succ_R$ the monotone probability ratio order, which requires that $F_W(\cdot \mid x') / F_W(\cdot \mid x)$ is nondecreasing for $x' > x$. This order is weaker than the MLRP but stronger than FOSD.17
The analysis of MIO for this set is analogous to that for single crossing functions. Taking $R = R^{NDZ}(\omega_0)$, (MIO) reduces to: for all $z \in [0,1]$,
$$F_X\left(F_X^{-1}(z \mid W \le \omega_0) \mid W \le \omega\right) \le F'_X\left(F'^{-1}_X(z \mid W \le \omega_0) \mid W \le \omega\right) \tag{14}$$
for all $\omega < \omega_0$. When we let $\omega_0 = \sup(\Omega)$, we have exactly (MIO) for $R = R^{ND}$, as desired. For the set $R^{NDZ}$, we check (14) for all $\omega_0 \in \Omega$, yielding
$$\forall x \in \mathcal{X}, \quad F'^{-1}_X\left(F_X(x \mid W \le \omega) \mid W \le \omega\right) \text{ is nondecreasing in } \omega. \tag{15}$$
And, analogous to above, we can re-express this in more familiar terms:
16 In a two-bidder, first-price, common-value auction, if bidder 2 uses a strictly monotone bidding strategy $\beta(\cdot)$, the payoffs to bidder 1 for a given bid $b$, signal $s$, and realization of the opponent's signal are given by $u(b, \omega) = (E[V \mid S = s, W = \omega] - b) 1_{b > \beta(\omega)}$. As a function of $\omega$, the returns to choosing $b_H$ rather than $b_L < b_H$ are first negative (and constant) at $b_L - b_H$, then (in the region where higher bids cause bidder 1 to go from losing to winning by increasing the bid) nondecreasing in $\omega$, and finally the returns are equal to 0 (when both $b_L$ and $b_H$ are losing bids). See Athey (forthcoming) for generalizations and further discussion.
17 Athey (2000) shows that $R^{NDZ}$ induces the same order $\succ_R$ as the one induced by considering the incremental returns to investment for the class of risk-averse investors making an investment, so that $u(a, \omega) = v(\pi(a, \omega))$, where $v$ is concave and $\pi$ is supermodular. The order $\succ_{R^{NDZ}}$ is also the same as the order induced by the set of payoffs $r$ that are single crossing and quasi-concave (Athey (forthcoming)).
Corollary 2 Suppose that for all $\theta$ and all $x \in \mathcal{X}$, the posterior belief has a density. Then $F_{X^\theta, W}$ is ordered by (MIO) for $R = R^{NDZ}$, if and only if, for all $\omega \in \Omega$ and all $z \in [0,1]$,
$$\frac{f_W\left(\omega \mid X^\theta \ge F^{-1}_{X^\theta}(z \mid W \le \omega)\right)}{F_W\left(\omega \mid X^\theta \ge F^{-1}_{X^\theta}(z \mid W \le \omega)\right)} \text{ is nondecreasing in } \theta. \tag{16}$$
The proof of this result mimics Corollary 1. Athey (2000) shows that when signals are ordered in this way, under a few additional regularity conditions (as well as the restriction that posterior beliefs are ordered by the monotone probability ratio order), all risk-averse investors will find higher values of $\theta$ more valuable.
4 Demand for Information
In this section, we obtain a comparative statics result concerning the relative demand for infor-
mation of two different decision-makers. The following Theorem builds on an approach used by
Persico (2000) for the special case of decision-makers with single-crossing incremental returns.
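Theorem 4 Consider R ⊂ R, a prior $H$, and a smoothly parametrized family of signals $\{X^\theta\}_{\theta \in \Theta}$ satisfying Condition 3, where for each $\theta$, the information structure $F^\theta$ is R-ordered, and $\{F^\theta\}_{\theta \in \Theta}$ is ordered by (MIO). Define
$$w(\omega, x) = u(\omega, \alpha^{\theta,u}(x)) - v(\omega, \alpha^{\theta,v}(x)),$$
where $\alpha^{\theta,u}$ denotes an optimal decision rule for the decision problem $\langle F^\theta, u \rangle$. If for all $x' > x$, $w(\omega, x') - w(\omega, x) \in R$, then $\frac{\partial}{\partial\theta} V(\theta, u) \ge \frac{\partial}{\partial\theta} V(\theta, v)$.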
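Proof. Recall the normalized information structures $G^\theta$ defined in the proof of Theorem 2. Define $w(\omega, z) = u(\omega, \alpha^{\theta,u}((T^\theta)^{-1}(z))) - v(\omega, \alpha^{\theta,v}((T^\theta)^{-1}(z)))$. By the Envelope Theorem,
$$\frac{\partial}{\partial\theta} V(\theta, u) - \frac{\partial}{\partial\theta} V(\theta, v) = \int_\Omega \int_{[0,1]} w(\omega, z)\, d\left\{ \frac{\partial}{\partial\theta} G^\theta(\omega, z) \right\}.$$
Assume $A$ is finite. Then there is some series of cut-points $\{z_i\}_{i=1}^{N+1}$, with $z_1 \le z_2 \le \ldots \le z_{N+1}$, such that $w(\omega, z) = w(\omega, z_i)$ on $[z_i, z_{i+1})$. Let $r_i(\omega) = w(\omega, z_i) - w(\omega, z_{i-1})$, and note that $r_i(\omega) \in R$ for all $i = 2, \ldots, N$. Then, $\frac{\partial}{\partial\theta} V(\theta, u) - \frac{\partial}{\partial\theta} V(\theta, v) \ge 0$ if
$$\sum_{i=2}^{N} E_W\left[ r_i(W) \mid Z^{\theta+d\theta} \ge z_i \right] \Pr\left(Z^{\theta+d\theta} \ge z_i\right) - \sum_{i=2}^{N} E_W\left[ r_i(W) \mid Z^\theta \ge z_i \right] \Pr\left(Z^\theta \ge z_i\right) \ge 0,$$
which holds by (MIO). The case where $A$ is continuous follows via a limiting argument. Q.E.D.
Note that the result does not rely on $u, v \in U^R$. An immediate consequence follows.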
Theorem 5 Suppose the conditions of Theorem 4 hold. For $C : \Theta \to \mathbb{R}$, let $\theta^*(u) = \arg\max_{\theta \in \Theta} V(\theta, u) - C(\theta)$. Then $\theta^*(u) \ge \theta^*(v)$ (in the strong set order).
Proof. By Theorem 4, $[V(\theta, u) - C(\theta)] - [V(\theta, v) - C(\theta)]$ is nondecreasing in $\theta$. So by Topkis' Monotonicity Theorem (e.g. Topkis, 1998), $\theta^*(u) \ge \theta^*(v)$ in the strong set order. Q.E.D.
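The comparative statics step can be illustrated on a grid. The sketch below is purely illustrative: the value functions `V_u`, `V_v` and the cost `C` are hypothetical closed forms chosen so that `V_u - V_v` is nondecreasing in $\theta$ (the conclusion of Theorem 4), and are not derived from any particular decision problem. It then checks that the maximizer for $u$ is no smaller than the maximizer for $v$.

```python
import numpy as np

theta = np.linspace(0.0, 2.0, 201)   # hypothetical grid of information levels
V_u = 2.0 * np.sqrt(theta)           # hypothetical value of information for decision-maker u
V_v = 1.0 * np.sqrt(theta)           # hypothetical value of information for decision-maker v
C = 0.5 * theta**2                   # hypothetical cost of information

# Hypothesis delivered by Theorem 4: V(theta, u) - V(theta, v) is nondecreasing in theta.
increasing_differences = bool(np.all(np.diff(V_u - V_v) >= 0))

# Optimal information choices on the grid.
theta_star_u = theta[np.argmax(V_u - C)]
theta_star_v = theta[np.argmax(V_v - C)]
print(increasing_differences, theta_star_u >= theta_star_v)
```

Because the cost $C(\theta)$ is common to both decision-makers, it drops out of the difference of objectives, which is why no assumption on the shape of $C$ is needed for the ranking.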
The difficulty in applying Theorem 4 is that the critical condition depends on the properties of
the two objective functions evaluated at their respective optima. Thus, it requires a fair amount of
structure. This should probably not come as a surprise. A change in preferences has at least two
effects in a decision problem under uncertainty. First, it changes the optimal behavior and hence
the responsiveness to different realizations of information. Second, it changes the preferences over
the residual risk faced after a decision is made. A comparison of marginal values for information
must incorporate both effects. Almost by necessity, then, it must deal with potentially subtle
comparisons of the curvature of the payoff functions.
Despite this, Theorem 4 can be applied to obtain new results in a variety of contexts. Persico
(2000) analyzes auction models. Below, we apply the result to a standard model of monopoly.
We also provide an example of a class of applications, namely principal-agent problems, where the
analysis is simplified because the same policy function affects both players.
The requirements of Theorem 4 for primitives can be characterized precisely when we consider small changes in the payoff function, and when $1, -1 \in R$. We parameterize $u$ by $\gamma$, so that $u : \Omega \times A \times \Gamma \to \mathbb{R}$, and define the expected payoff $\bar u : A \times \mathcal{X} \times \Gamma \to \mathbb{R}$ by $\bar u(a, x; \gamma) = \int u(\omega, a; \gamma)\, dF_W(\omega \mid x)$. We let subscripts denote partial derivatives.
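Corollary 3 Suppose the conditions of Theorem 4 hold, and that (a) $1, -1 \in R$; (b) $\Gamma, A$ are compact, convex subsets of $\mathbb{R}$; (c) for each $\omega$, $u(\omega, \cdot\,; \cdot)$ is $C^{(3)}$; and (d) the expected payoff $\bar u$ is quasi-concave in $a$ and $C^{(2)}$. Then $\frac{\partial^2}{\partial\theta\,\partial\gamma} V(\theta, u(\cdot\,; \gamma)) \ge 0$ if, for each $(a, \gamma)$, $u(\cdot, a; \gamma)$ satisfies: (i) $u(\cdot, \cdot\,; \gamma) \in U^R$; (ii) $u_\gamma(\cdot, a; \gamma) \in U^R$, (iii) $u_{aa\gamma}(\cdot, a; \gamma) \ge 0$, and (iv) either $u_{aa}(\cdot, a; \gamma)$ is a constant function of $\omega$, or else $u_a(\cdot, \cdot\,; \gamma) \in U^R$ and $u_{a\gamma}(\cdot, a; \gamma) \ge 0$.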
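Proof. By Theorem 4, the result follows if $\frac{\partial^2}{\partial\gamma\,\partial x} u(\cdot, \alpha^{\theta,\gamma}(x); \gamma) \in R$. Differentiating yields:
$$\frac{\partial^2}{\partial\gamma\,\partial x} u(\cdot, \alpha^{\theta,\gamma}(x); \gamma) = u_{a\gamma}(\cdot, a; \gamma)\,\alpha^{\theta,\gamma}_x(x) + u_a(\cdot, a; \gamma)\,\alpha^{\theta,\gamma}_{x\gamma}(x) + u_{aa}(\cdot, a; \gamma)\,\alpha^{\theta,\gamma}_x(x)\,\alpha^{\theta,\gamma}_\gamma(x),$$
evaluated at $a = \alpha^{\theta,\gamma}(x)$. Since $F^\theta$ is R-ordered, $\alpha^{\theta,\gamma}_x \ge 0$ by (i). Thus, the first term is in $R$ by (ii). If $u_{aa}$ is constant in $\omega$, the last term is constant in $\omega$ (and thus in $R$); otherwise, it is in $R$ if $u_{aa} \in R$ and $\alpha^{\theta,\gamma}_\gamma \ge 0$, which follows if $u_{a\gamma} \ge 0$ (as in (iv)). The second term is in $R$ by (i), so long as $\alpha^{\theta,\gamma}_{x\gamma} \ge 0$. To evaluate this, we apply the implicit function theorem to the expected payoff $\bar u(a, x; \gamma) = \int u(\omega, a; \gamma)\, dF_W(\omega \mid x)$, yielding:
$$\frac{\partial^2}{\partial x\,\partial\gamma} \alpha^{\theta,\gamma}(x) = \left. \frac{-\bar u_{aa}(\cdot)\,\bar u_{ax\gamma}(\cdot) + \bar u_{ax}(\cdot)\,\bar u_{aa\gamma}(\cdot)}{(\bar u_{aa}(\cdot))^2} \right|_{a = \alpha^{\theta,\gamma}(x)}$$
At the optimum, $\bar u_{aa} < 0$, and $\bar u_{ax} \ge 0$ since $F^\theta$ is R-ordered; $\bar u_{aa\gamma} \ge 0$ by (iii). Since $1, -1 \in R$, $F^\theta_W(\cdot \mid x)$ is ordered by stochastic dominance for $R$; then $u_{a\gamma} \in R$ implies $\bar u_{ax\gamma} \ge 0$. Q.E.D.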
Each of conditions (ii)-(iv) has a natural interpretation and ensures that a particular effect works in the right direction. We illustrate with a series of investment examples based on the case of supermodular payoff functions. Let the payoff function be $u(\omega, a) = v(\omega, a) - c(a)$, where $a$ represents an investment level, $c$ is a cost of investment, and the returns to investment are nondecreasing in $\omega$. Assume $c$ is nondecreasing and convex.
For condition (ii), suppose $\gamma$ affects the scale of returns. Let $u(\omega, a; \gamma) = \gamma a r(\omega) - c(a)$, so conditions (iii) and (iv) are trivial. Then an increase in $\gamma$ increases information demand because it makes the marginal returns to investment more sensitive to changes in $\omega$ (i.e. "more nondecreasing" in $\omega$).
For condition (iii), suppose that $\gamma$ multiplies the costs of investment. Let $u(\omega, a; \gamma) = a r(\omega) - (\bar\gamma - \gamma) c(a)$, so that conditions (ii) and (iv) are trivial. Now, an increase in $\gamma$ increases information demand by making the investment problem less concave in $a$, and hence the policy function more responsive to information about returns (i.e. $\frac{\partial}{\partial x} \alpha^{\theta,\gamma}$ is increasing in $\gamma$).
Finally, condition (iv) becomes relevant when we generalize the functional form for investment. Let $u(\omega, a; \gamma) = v(\omega, a) - (\bar\gamma - \gamma) c(a)$. Since $c$ is nondecreasing, an increase in $\gamma$ increases the optimal policy. In turn, this increases the demand for information if it makes marginal returns more sensitive to $\omega$ (i.e. condition (iv) requires that $v_{a\omega}$ be increasing in $a$). To see an application, suppose that $a$ is an input (e.g. labor). If $u(\omega, a; \gamma) = v(\omega, a) + \gamma a - c(a)$ and $v_{a\omega}$ is increasing in $a$, the firm invests more (less) in information gathering in response to a wage subsidy (tax).
4.1 Application: Production under Uncertainty
A growing literature considers the value of information to firms under imperfect competition (see, e.g., Mirman, Samuelson, and Schlee (1994) and references therein). Here, we prove a simple but new result using Theorem 4: under standard conditions, a monopolist will not only produce less but will gather less information about production costs than is socially efficient.
To model this, let $P(q)$ be the inverse demand curve. Suppose the cost of producing $q$ units is $C(q, \omega)$, where (letting subscripts denote partial derivatives) $C_q$ is nonincreasing in $\omega$. The monopolist's payoff is $u^M(\omega, q) = qP(q) - C(q, \omega)$, while the social planner's payoff is $u^S(\omega, q) = \int_0^q P(t)\,dt - C(q, \omega)$. Both payoff functions are supermodular, so consider $\{F^\theta\}_{\theta \in \Theta}$ satisfying the hypotheses of Theorem 4 for $R^{ND}$. The cost of information is $c(\theta)$.
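By Theorem 1, both monopolist and social planner prefer more information to less according to our definition of information. To show that $\frac{\partial}{\partial\theta} V(\theta, u^S) \ge \frac{\partial}{\partial\theta} V(\theta, u^M)$, we ask when $\frac{\partial}{\partial x} u^S(\omega, q^S(x)) - \frac{\partial}{\partial x} u^M(\omega, q^M(x))$ is nondecreasing in $\omega$. Eliminating terms that do not depend on $\omega$, we express this difference as
$$-C_q(q^S(x), \omega)\, q^S_x(x) + C_q(q^M(x), \omega)\, q^M_x(x). \tag{17}$$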
Proposition 1 In the production problem with cost uncertainty, a social planner has a higher demand for information than a monopolist, if for any $\theta$, (i) marginal costs are increasing in $q$ and submodular in $(q, \omega)$ (i.e. $C_{qq} \ge 0$, $C_{qq\omega} \le 0$), (ii) the marginal revenue curve is downward sloping ($qP''(q) + P'(q) \le 0$ for all $q$), and (iii) the marginal social surplus function ($P(q) - E[C_q(q^S, W) \mid X = x]$) is convex in $q$.
The basic intuition for the result is easily grasped. Different realizations of $X$ shift the perceived marginal cost curve. Under relatively standard conditions, the demand curve faced by a social planner will be flatter than the marginal revenue curve faced by a monopolist, so the social planner will be more responsive to shifts in marginal cost. Consequently, the social planner benefits more from improving the match between perceived and actual marginal costs. A formal proof obtains by using the implicit function theorem to express $q^M_x$ and $q^S_x$, and checking that (17) is indeed nondecreasing in $\omega$.
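A linear special case makes the comparison concrete. The sketch below is an illustration under assumptions that are not in the text: linear inverse demand $P(q) = a - bq$, cost $C(q, \omega) = (c - \omega) q$ with $\omega \in [0,1]$ (so $C_q$ is nonincreasing in $\omega$), and signals drawn from the joint density $1 + \theta(\omega - 1/2)(x - 1/2)$ on $[0,1]^2$, for which $E[W \mid X = x] = 1/2 + \theta(x - 1/2)/12$. With these forms the optimal quantities are $(a - c + E[W \mid x])/b$ for the planner and half that for the monopolist, giving payoffs of $(a - c + E[W \mid x])^2/(2b)$ and $(a - c + E[W \mid x])^2/(4b)$ respectively. All parameter values and function names are hypothetical.

```python
import numpy as np

a, b, c = 2.0, 1.0, 1.0   # hypothetical demand intercept, demand slope, cost shifter

def posterior_mean(theta, x):
    """E[W | X = x] under the assumed density 1 + theta*(w - 1/2)*(x - 1/2) on [0,1]^2."""
    return 0.5 + theta * (x - 0.5) / 12.0

def value(theta, planner, n=10_000):
    """Ex ante payoff when quantity is chosen optimally after observing X."""
    x = (np.arange(n) + 0.5) / n                 # X is uniform on [0, 1] by construction
    margin = a - c + posterior_mean(theta, x)    # demand intercept minus expected marginal cost
    scale = 2.0 * b if planner else 4.0 * b      # planner: margin^2/(2b); monopolist: margin^2/(4b)
    return np.mean(margin**2) / scale

thetas = np.linspace(0.0, 4.0, 41)               # density is nonnegative for theta <= 4
gain_planner = np.array([value(t, True) - value(0.0, True) for t in thetas])
gain_monopolist = np.array([value(t, False) - value(0.0, False) for t in thetas])

# The planner's gain from information exceeds the monopolist's by a gap that grows in theta.
planner_values_more = bool(np.all(np.diff(gain_planner - gain_monopolist) > 0))
print(planner_values_more)
```

In this special case the planner's value of information is exactly twice the monopolist's, reflecting that the demand curve is half as steep as the marginal revenue curve; the proposition's conditions (i)-(iii) hold trivially because all the relevant second derivatives are constant.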
4.2 Application: Delegated Project Selection
Recently, a rapidly growing literature has focused on delegated information-gathering and decision-making (e.g. Aghion and Tirole, 1997). In such problems, two parties have different objectives but are subject to the same decision rule, greatly simplifying the problem. Suppose we have a principal with payoff function $v(\omega, a)$ and an agent with payoff function $u(\omega, a)$, both supermodular. The agent must evaluate a set of projects (for simplicity, a continuum) to be accepted or rejected (for example, a set of investment opportunities or workers to be hired), so $A = \{0, 1\}$, and $\omega$ represents quality. There is a prior distribution of projects, $H(\cdot)$, known to both agents, and a smoothly parameterized family of signals, $\{X^\theta\}_{\theta \in \Theta}$, satisfying the conditions of Theorem 4 for $R^{ND}$. Higher values of $\theta$ are more informative according to (MIO). Project selection is as follows. First, the principal determines a quota (denoted $z$), stated as the fraction of the pool of potential projects to be accepted. The agent then chooses $\theta$, at cost $C(\theta)$, receives a signal about each project, and then makes acceptance decisions, obeying the quota $z$; no further contracting takes place.
Observe that the principal and agent will agree on the ranking of projects, but not necessarily on what fraction to accept. Moreover, the principal may want to accept more (or fewer) projects depending on the quality of information. The principal must form a conjecture about what choice of $\theta$ her quota $z$ will induce; in equilibrium, her conjecture will be fulfilled.
The agent's selection rule, denoted $\alpha(x)$, is to accept a given project if and only if $F_X(x) \ge z$. Given this, the agent's payoff is $E_{W,X \mid \theta}[u(W, \alpha(X))] - C(\theta)$ and the principal's is $E_{W,X \mid \theta}[v(W, \alpha(X))]$. Because $u, v$ are supermodular, both parties enjoy higher payoffs (net of $C$) if information quality is improved. Moreover, if $v - u$ is supermodular ($u - v$ is supermodular), that is, if the agent's returns to quality for accepted projects are lower (higher) than the principal's, then the agent under-acquires (over-acquires) information relative to what the principal would choose given the same cost function, and same acceptance rule.
5 Further Characterizations
This section provides further characterizations of our informativeness conditions. We first characterize informativeness in terms of marginal-preserving spreads. We then discuss the restrictions implied by our criteria on the conditional signal distributions when we allow for sets of prior beliefs.
5.1 Marginal-Preserving Spreads
Above, we stated (MIO) in terms of average posterior beliefs. We now provide an alternative characterization in terms of the joint distributions on the signal and the state. For concreteness, we consider two sets of payoff functions: those with nondecreasing incremental returns ($U^{R^{ND}}$), and those with concave incremental returns ($U^{R^{CV}}$).
It is convenient to work directly with the indexed signals. Given a signal X, with associated
information structure F , deÞne Z = T (X) (= FX(X)). Then Z is a random variable on [0, 1],
and has identical information content to X. Given another signal X 0, we can similarly deÞne
Z 0 (= F 0X(X0)). Let G,G0 be the joint distributions of (W,Z) and (W,Z0). Then (i) G,G0 are
R-ordered (for given R) if and only if F, F 0 are, and (ii) without loss of generality, G,G0 have
equivalent marginals.18
To minimize notation, let us take Ω to be Þnite and assume that Z,Z 0 are discrete.19 Let g, g0
be the probability mass functions corresponding to G,G0 and deÞne γ(ω, z) = g0(ω, z) − g(ω, z).We say that G0 can be obtained from G by a single marginal-preserving spread (MgPS) if for
18 The marginal of W is the prior. If X,X0 are continuous, then by normalization Gz(z) = G0z(z) = z. If X,X0 are
discrete, it may not be the case that Z,Z0 have identical uniform marginal distributions. However, using Lehmanns(1988) technique, we can construct continuous random variables X∗,X∗0 with identical information content to X,X 0
such that their indexed versions Z∗ and Z∗0 have uniform marginals.19 See footnote 18. The continous analogues are straightforward.
23
z2
z1
ω1 ω2ω3 ω4ω1 ω2
ε
ε
δ
δ
η
η
A M arginal-Preserving Spread A M ean and Marginal-Preserving Spread
z2
z1
Figure 1: Illustrations of Spreads.
some ε > 0, ω1 < ω2, and z1 < z2,
\[
\gamma(\omega_1, z_1) = \gamma(\omega_2, z_2) = -\gamma(\omega_2, z_1) = -\gamma(\omega_1, z_2) = \varepsilon,
\]
and γ = 0 otherwise.20 Similarly, G′ can be obtained from G by a single mean and marginal-preserving
spread (MMgPS) if for some δ, η > 0, ω1 < ω2 < ω3 < ω4, and z1 < z2,
\[
\gamma(\omega_1, z_1) = \gamma(\omega_2, z_2) = -\gamma(\omega_2, z_1) = -\gamma(\omega_1, z_2) = \delta,
\]
\[
-\gamma(\omega_3, z_1) = -\gamma(\omega_4, z_2) = \gamma(\omega_4, z_1) = \gamma(\omega_3, z_2) = \eta,
\]
where Σ_i ω_i γ(ω_i, z_j) = 0 for j = 1, 2, and otherwise γ = 0.
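The MgPS definition can be illustrated with a minimal sketch. The function names `apply_mgps` and `marginal`, and the 2×2 uniform starting distribution, are hypothetical choices made for illustration and do not appear in the paper; the code only applies the four-cell perturbation γ defined above and lets one confirm that both marginals are unchanged.

```python
from fractions import Fraction


def apply_mgps(pmf, w1, w2, z1, z2, eps):
    """Return a new joint pmf, keyed by (state, signal), after one MgPS.

    Mass eps is added to (w1, z1) and (w2, z2) and removed from
    (w2, z1) and (w1, z2), following the definition of gamma in the text.
    """
    if not (w1 < w2 and z1 < z2 and eps > 0):
        raise ValueError("need w1 < w2, z1 < z2 and eps > 0")
    out = dict(pmf)
    cells = (((w1, z1), 1), ((w2, z2), 1), ((w2, z1), -1), ((w1, z2), -1))
    for key, sign in cells:
        out[key] = out.get(key, Fraction(0)) + sign * eps
        if out[key] < 0:
            raise ValueError("spread would create a negative probability")
    return out


def marginal(pmf, axis):
    """Marginal pmf over states (axis=0) or over signals (axis=1)."""
    m = {}
    for key, p in pmf.items():
        m[key[axis]] = m.get(key[axis], Fraction(0)) + p
    return m


# Hypothetical starting point: W and Z independent and uniform on {0, 1}.
g = {(w, z): Fraction(1, 4) for w in (0, 1) for z in (0, 1)}
g_prime = apply_mgps(g, 0, 1, 0, 1, Fraction(1, 8))

# Both marginals are preserved; only the dependence between W and Z changes.
print(marginal(g_prime, 0) == marginal(g, 0))
print(marginal(g_prime, 1) == marginal(g, 1))
```

Exact rational arithmetic is used so that the marginal comparison is not affected by floating-point rounding.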
As Figure 1 illustrates, a marginal-preserving spread captures the idea of making two ran-
dom variables more positively dependent. A mean and marginal-preserving spread adds positive
dependence below E [W ] and negative dependence above E [W ].
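As a short check of the positive-dependence claim, one can compute the effect of a single MgPS on E[WZ]. Because the marginals of W and Z are unchanged, this is also the change in the covariance. The derivation below assumes real-valued ω and z and uses only the definition of γ for an MgPS:

```latex
\[
\Delta \operatorname{Cov}(W, Z)
  = \sum_{\omega, z} \omega z \, \gamma(\omega, z)
  = \varepsilon \left( \omega_1 z_1 + \omega_2 z_2 - \omega_2 z_1 - \omega_1 z_2 \right)
  = \varepsilon (\omega_2 - \omega_1)(z_2 - z_1) > 0 .
\]
```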
Proposition 2 Consider a prior H, and two signals X, X′ whose indexed information structures
G, G′ are R-ordered (for the appropriate R). (i) G′ ≻_{MIO−ND} G if and only if G′ can be obtained
from G via a series of MgPSs; (ii) G′ ≻_{MIO−CV} G if and only if G′ can be obtained from G via a
series of MMgPSs.
Proof: Part (i) follows a result in Meyer (1990). The proof of (ii) uses Rothschild and Stiglitz's
(1970) characterization of SOSD in terms of mean-preserving spreads. Suppose G′ is obtained from
G via a single MMgPS, where the MMgPS is carried out on z1 < z2. Then W | Z ≥ z =_d W | Z′ ≥ z
(equality in distribution) for any z > z2 and any z ≤ z1, and the distribution of W | Z ≥ z is a
mean-preserving spread of W | Z′ ≥ z for any z1 < z ≤ z2. So W | Z′ ≥ z dominates W | Z ≥ z by SOSD for any z, and
20 Hamada (1974), Tchen (1976), Epstein and Tanny (1980) and Meyer (1990) have all discussed this concept in studying positive dependence of random variables.
G′ ≻_{MIO−CV} G. With multiple MMgPSs, the argument is repeated. For the other direction,
suppose Z, Z′ take values on {z1, ..., zn} and have uniform marginals, and that G′ ≻_{MIO−CV} G.
Then the distribution of W | Z′ = z1 is SOSD dominated by the distribution of W | Z = z1 and hence
must be obtained by a finite series of mean-preserving spreads. Starting with G, make a series
of MMgPSs using z1, z2 that result in a distribution G(1) with W | Z(1) = z1 =_d W | Z′ = z1, and
W | Z(1) ≤ z2 =_d W | Z ≤ z2. These last two are SOSD dominated by W | Z′ ≤ z2, so in particular,
W | Z(1) = z2 is stochastically dominated by W | Z′ = z2. Now, starting with G(1), make a series
of MMgPSs using z2, z3 that result in G(2) with W | Z(2) = z2 =_d W | Z′ = z2. Continuing in this
fashion, we obtain from G, via a series of MMgPSs, a distribution G(n−1) such that for all z < zn,
W | Z(n−1) = z =_d W | Z′ = z. Since G(n−1) and G′ have equivalent marginals, it must therefore be
that G(n−1) = G′. Q.E.D.
This result fully characterizes (MIO) when R is the set of nondecreasing or concave functions.21
Example: A MMgPS that Violates Sufficiency. The following example illustrates a series of MMgPSs;
it also shows how two signals X′ and X might be information-ranked for payoff functions
with concave incremental returns without one being statistically sufficient for the other. Suppose
Ω = {−2, −1, 0, 1, 2} and the prior on Ω is uniform. There are two signals, which take realizations
on X = {x1, x2}, x2 > x1. The joint signal-state distributions are:

Pr(W = ω, X = x)     −2      −1       0       1       2
x1                  11/80    5/80    8/80    5/80   11/80
x2                   5/80   11/80    8/80   11/80    5/80

Pr(W = ω, X′ = x)    −2      −1       0       1       2
x1                  14/80    4/80    4/80    4/80   14/80
x2                   2/80   12/80   12/80   12/80    2/80

Simple calculation shows that the posteriors induced by X do not lie in the convex hull of the
posteriors induced by X′, so X′ cannot be sufficient for X. Thus for some decision-makers, X is
preferred to X′. However, the posteriors induced by both X and X′ are ordered by SOSD, and the
distribution of (W, X′) is obtained from the distribution of (W, X) by a sequence of two MMgPSs
(one on (x1, x2; −1, 0, 0, 1), the other on (x1, x2; −2, −1, 1, 2)). So X′ is preferred to X by
all decision-makers whose payoff functions have concave marginal returns.
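The claims in this example can be checked numerically. The sketch below is illustrative only: the helper names (`mmgps`, `sosd_dominates`, `in_hull`) are hypothetical, and the spread sizes δ = η = 2/80 and δ = η = 3/80 are inferred from the two tables rather than stated in the text. It verifies that the difference between the tables is the sum of the two MMgPSs, that each spread preserves the mean, that W | X′ = x2 dominates W | X = x2 by SOSD, and that the posteriors of X are not convex combinations of the posteriors of X′.

```python
from fractions import Fraction as Fr

states = [-2, -1, 0, 1, 2]
# Joint pmfs in units of 1/80; rows are signal realizations x1, x2.
G = [[11, 5, 8, 5, 11], [5, 11, 8, 11, 5]]      # (W, X)
Gp = [[14, 4, 4, 4, 14], [2, 12, 12, 12, 2]]    # (W, X')


def mmgps(w, d):
    """gamma (units of 1/80) for one MMgPS on states w1..w4 with delta = eta = d."""
    row1 = [0] * len(states)
    for wi, sign in zip(w, (1, -1, -1, 1)):
        row1[states.index(wi)] += sign * d
    return [row1, [-v for v in row1]]           # the x2 row offsets the x1 row


def sosd_dominates(p, q):
    """True if p dominates q by SOSD on the unit-spaced grid (p is less risky)."""
    cp = cq = ip = iq = 0
    for a, b in zip(p, q):
        cp, cq = cp + a, cq + b                 # running cdfs
        ip, iq = ip + cp, iq + cq               # running integrated cdfs
        if ip > iq:
            return False
    return True


def in_hull(p, q1, q2):
    """Is p a convex combination of q1 and q2 (which differ in coordinate 0)?"""
    lam = Fr(p[0] - q2[0], q1[0] - q2[0])
    return 0 <= lam <= 1 and all(
        lam * a + (1 - lam) * b == c for a, b, c in zip(q1, q2, p))


s1 = mmgps((-1, 0, 0, 1), 2)
s2 = mmgps((-2, -1, 1, 2), 3)
gamma = [[Gp[j][i] - G[j][i] for i in range(5)] for j in range(2)]

decomposes = all(gamma[j][i] == s1[j][i] + s2[j][i]
                 for j in range(2) for i in range(5))
mean_ok = all(sum(w * v for w, v in zip(states, s[0])) == 0 for s in (s1, s2))

print(decomposes, mean_ok)
print(sosd_dominates(Gp[1], G[1]))              # W | X' = x2 versus W | X = x2
print(in_hull(G[0], Gp[0], Gp[1]), in_hull(G[1], Gp[0], Gp[1]))
```

Since each signal realization has probability 1/2 under both signals, each table row is proportional to the corresponding posterior, so the rows can be compared directly without normalizing.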
5.2 Information Orders and Prior Beliefs
While (MIO) and the marginal preserving spread conditions are intuitive and easy to state, they
have the disadvantage of intermingling the prior belief and the conditional signal distributions. In
21 When we further restrict attention to two states and two signals (where posterior beliefs are ordered by FOSD), it can be shown that a MgPS is equivalent to two elementary linear bifurcations on the distribution over posteriors, as in Grant, Kajii and Polak (1998); these authors show that this implies statistical sufficiency, which in turn implies that (MIO) and sufficiency are equivalent for this case.
many economic contexts, it is reasonable to assume that the prior is known to the analyst; for
example, the distribution over previous asset returns, input costs, or employee test scores may be
objectively measured. But in other cases, such information may be unavailable, and we would
prefer a characterization of information that relies less on knowledge of the prior beliefs.
Put differently, we have characterized information preferences for families of decision-makers
of the form (H, UR). To obtain an information ranking for a family of decision-makers (Λ, UR),
where Λ is some set of possible prior beliefs, we must check (MIO) for the entire set Λ. Typically,
this implies a further restriction on the informativeness order. To see why, write T_H in place of T
to highlight dependence on the prior, and rewrite (MIO) (using (3)) as:
\[
\forall r \in R,\ z \in [0, 1]: \quad
\int_{\Omega} r(\omega) \Pr\left(T'_H(X') \ge z \mid \omega\right) dH(\omega)
\ \ge\ \int_{\Omega} r(\omega) \Pr\left(T_H(X) \ge z \mid \omega\right) dH(\omega). \tag{18}
\]
This version of (MIO) compares weighted averages of the conditional signal distributions, where
the states are weighted according to the prior H and the return functions r ∈ R.
Clearly, the prior H plays a role. Consequently, X′ might be R-information preferred to X for
one prior H1, but not another prior H2. Intuitively, relative to X, X′ might not distinguish well
between two states, ω and ω′; but if both are unlikely, X′ might still be R-information preferred
to X. Clearly, this feature is desirable when the prior is known, but undesirable when we consider
a set of priors where other members place large weight on ω and ω′. An additional complication is
that even for a given signal X, it might be the case that the induced posteriors are R-ordered for
one prior H1 but not for another prior H2.
Consider first the set from Example 3, RSC(ωo). Here, T_H(X) = F_X(X|ωo), which does not
depend on the prior; further, r · h ∈ RSC(ωo) for all prior densities h, if and only if r ∈ RSC(ωo).
Thus, the prior does not affect (MIO) in this case. In contrast, for nondecreasing functions RND,
T_H(X) = F_X(X), which depends on the prior; and further, if we consider an alternate prior density
h, r nondecreasing implies only that r · h is single crossing. Thus, this analysis indicates that
much of the structure imposed by commonly encountered economic restrictions (i.e. monotonicity,
concavity) can be undone by allowing for a rich enough set of prior beliefs Λ. We formalize this
discussion as follows (recalling RSC contains single crossing functions, and RND ⊂ RSC):
Proposition 3 (i) The information structure F corresponding to X is RND-ordered (i.e., FOSD-ordered)
for all prior beliefs H ∈ ∆(Ω) if and only if it is RSC-ordered (i.e., MLRP-ordered).
(ii) Given two signals X′ and X that are MLRP-ordered, X′ is more informative than X for all
decision-makers in (∆(Ω), URND) if and only if X′ is more informative than X for decision-makers
in (∆(Ω), URSC).
Proof: Part (i) is due to Milgrom (1981). For part (ii), observe that for a fixed prior H,
X′ ≻_{MIO−ND} X states that F′_W(ω | F′_X(X′) ≥ z) ≤ F_W(ω | F_X(X) ≥ z) ∀ω ∈ Ω, z ∈ [0, 1].
Applying Bayes' Rule and canceling terms yields F′_X(F′_X^{-1}(z) | W ≤ ω) ≤ F_X(F_X^{-1}(z) | W ≤ ω)
∀ω ∈ Ω, z ∈ [0, 1]; substituting z = F_X(x) and re-arranging:
\[
F_X'^{-1}(F_X(x)) \le F_X'^{-1}\left(F_X(x \mid W \le \omega) \mid W \le \omega\right) \quad \forall \omega \in \Omega,\ x \in \mathcal{X}.
\]
Clearly, F′_X^{-1}(F_X(x|ω) | ω) nondecreasing in ω for all x ∈ X implies this last inequality regardless of
the prior H. Moreover, for the last inequality to hold for all H ∈ ∆(Ω), it must hold for all priors
with two point support, from which it follows that F′_X^{-1}(F_X(x|ω) | ω) must be nondecreasing in ω
for all x. Finally, as discussed in Example 3, F′_X^{-1}(F_X(x|ω) | ω) is nondecreasing in ω for all x if and
only if X′ ≻_{MIO−SC} X. Q.E.D.
Thus, we see that allowing for a rich enough set of prior beliefs renders useless the knowledge
that r is nondecreasing rather than just single crossing. In contrast, the orderings for single crossing
functions (as well as statistical sufficiency) are independent of the prior, and knowing the prior does
not allow these conditions to be tightened. Intuitively, the sets of payoff functions being considered
are so large that if X′ does not distinguish well between ω and ω′, then even if these states are
unlikely, there is still some payoff function that cares only about the comparison between them.
Thus, we have shown that if the set of payoff functions being studied is large, knowledge of the
prior does not help in characterizing information preferences, while such information is essential to
fully exploit the structure of smaller sets of payoff functions.22
Example With a given prior, X′ might be preferred to X for all decision-makers with nondecreasing
incremental returns (but not for all decision-makers with single crossing incremental returns); yet,
another prior may disturb the comparison. Suppose Ω = {ω1, ω2, ω3} with ω1 < ω2 < ω3 and
X = {x1, x2} with x1 < x2. Consider the following two joint distributions, where F′ is obtained
from F via a single MgPS:

Pr(W = ω, X = x)     ω1      ω2      ω3
x1                  6/24    3/24    3/24
x2                  2/24    5/24    5/24

Pr(W = ω, X′ = x)    ω1      ω2      ω3
x1                  6/24    4/24    2/24
x2                  2/24    4/24    6/24
Both F_W(ω|x) and F′_W(ω|x) have the MLRP, but X′ and X do not satisfy (12). Consider the
following payoff function, where A = {0, 1}: u(ω, a) = a r(ω), where r(ω1) = −2, r(ω2) ∈ (1, 2),
and r(ω3) = 1. It is easily checked that r(ω) satisfies single crossing but is not nondecreasing, and
that the ex ante payoff for u is higher under F than under F′. The idea is that X′ distinguishes
well between ω1 and {ω2, ω3}, between {ω1, ω2} and ω3, and between ω2 and ω3, but not well
between ω1 and ω2. By allowing for a larger set of prior beliefs or payoff functions, one can choose
the prior or the payoff function to focus attention on this last fact.
22 Jewitt (1997) shows that for two-point priors, where posteriors are ordered by the monotone likelihood ratio
order, Lehmann's order and statistical sufficiency coincide.
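The payoff comparison in this example can be reproduced with a few lines of exact arithmetic. This is a sketch under stated assumptions: the function name `ex_ante_payoff` is hypothetical, r(ω2) = 3/2 is just one admissible value from the interval (1, 2), and the nondecreasing function `r_nd` is an illustrative choice that does not appear in the paper.

```python
from fractions import Fraction as Fr

# Joint pmfs in units of 1/24; rows are signals x1, x2; columns are w1, w2, w3.
F = [[6, 3, 3], [2, 5, 5]]
Fp = [[6, 4, 2], [2, 4, 6]]


def ex_ante_payoff(joint, r):
    """Expected payoff of u(w, a) = a * r(w) when a in {0, 1} is chosen
    optimally after observing each signal realization."""
    total = Fr(0)
    for row in joint:
        gain = sum(Fr(p, 24) * rw for p, rw in zip(row, r))  # value of a = 1
        total += max(gain, Fr(0))                           # a = 0 yields 0
    return total


# Single crossing but not nondecreasing; r(w2) = 3/2 is one value in (1, 2).
r_sc = [Fr(-2), Fr(3, 2), Fr(1)]
v_F, v_Fp = ex_ante_payoff(F, r_sc), ex_ante_payoff(Fp, r_sc)
print(v_F, v_Fp, v_F > v_Fp)

# A hypothetical nondecreasing r, for contrast: here F' does at least as well.
r_nd = [Fr(-2), Fr(1), Fr(3, 2)]
print(ex_ante_payoff(Fp, r_nd) >= ex_ante_payoff(F, r_nd))
```

With r(ω2) = 3/2 the decision-maker chooses a = 1 only after x2 under both signals, and the payoff difference between F and F′ equals (r(ω2) − r(ω3))/24, which is positive for any r(ω2) in (1, 2).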
6 Conclusion
In this paper, we established that the additional structure of monotone decision problems can be
exploited to derive necessary and sufficient conditions for all agents in a family to prefer one signal to
another, conditions that are typically less restrictive than statistical sufficiency (the ordering for all
payoff functions) or Lehmann's order (for single crossing functions). Alternatively, our results can
be interpreted as deriving additional consequences of statistical sufficiency and Lehmann's order,
for monotone decision problems with additional structure (i.e. payoff functions are supermodular,
and the prior distribution is fixed). These consequences lead directly to comparative statics results
in decision problems and equilibrium models.
Finally, both our definitions of more information and our results on information demand may be
useful for building models in which agents acquire their information at some cost prior to playing a
game of incomplete information. The examples given here, and Persico's (2000) work on auctions,
indicate that this may be a fruitful area for further inquiry.
7 Appendix
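Proof of Corollary 1: Fix θ′ > θ. First, by definition, F_X^{-1}(F_X(x|ω′) | ω′) = x, so that
\[
\left. \frac{\partial}{\partial \omega} F_X^{-1}\left(F_X(x \mid \omega') \mid \omega\right) \right|_{\omega = \omega'}
= - \frac{\frac{\partial}{\partial \omega'} F_X(x \mid \omega')}{f_X(x \mid \omega')},
\]
or, letting z = F_X(x|ω′),
\[
\frac{\partial}{\partial \omega'} F_X^{-1}(z \mid \omega')
= - \frac{\left. \frac{\partial}{\partial \omega} F_X\left(F_X^{-1}(z \mid \omega') \mid \omega\right) \right|_{\omega = \omega'}}
         {f_X\left(F_X^{-1}(z \mid \omega') \mid \omega'\right)}.
\]
Then, letting z = F_X(x|ω′), we differentiate (12), yielding
\[
\frac{\partial}{\partial \omega'} F_{X'}^{-1}\left(F_X(x \mid \omega') \mid \omega'\right)
= - \frac{\left. \frac{\partial}{\partial \omega} F_{X'}\left(F_{X'}^{-1}(z \mid \omega') \mid \omega\right) \right|_{\omega = \omega'}}
         {f_{X'}\left(F_{X'}^{-1}(z \mid \omega') \mid \omega'\right)}
  + \frac{\left. \frac{\partial}{\partial \omega} F_X\left(F_X^{-1}(z \mid \omega') \mid \omega\right) \right|_{\omega = \omega'}}
         {f_{X'}\left(F_{X'}^{-1}(z \mid \omega') \mid \omega'\right)}.
\]
Multiplying both sides by f_{X′}(F_{X′}^{-1}(z|ω′) | ω′), this is nonnegative if
\[
\left. \frac{\partial}{\partial \omega}
\left( \frac{f_W\left(\omega \mid X \le F_X^{-1}(z \mid \omega')\right) F_X\left(F_X^{-1}(z \mid \omega')\right)}{h(\omega)} \right)
\right|_{\omega = \omega'}
\ge
\left. \frac{\partial}{\partial \omega}
\left( \frac{f_W\left(\omega \mid X' \le F_{X'}^{-1}(z \mid \omega')\right) F_{X'}\left(F_{X'}^{-1}(z \mid \omega')\right)}{h(\omega)} \right)
\right|_{\omega = \omega'}.
\]
Simplifying, we have
\[
\frac{F_X\left(F_X^{-1}(z \mid \omega')\right)}{h(\omega')}
\left[ f'_W\left(\omega' \mid X \le F_X^{-1}(z \mid \omega')\right)
     - f_W\left(\omega' \mid X \le F_X^{-1}(z \mid \omega')\right) \frac{h'(\omega')}{h(\omega')} \right]
\ge
\frac{F_{X'}\left(F_{X'}^{-1}(z \mid \omega')\right)}{h(\omega')}
\left[ f'_W\left(\omega' \mid X' \le F_{X'}^{-1}(z \mid \omega')\right)
     - f_W\left(\omega' \mid X' \le F_{X'}^{-1}(z \mid \omega')\right) \frac{h'(\omega')}{h(\omega')} \right]
\]
or, multiplying through by h(ω′) and factoring out a term, this is:
\[
f_W\left(\omega' \mid X \le F_X^{-1}(z \mid \omega')\right) F_X\left(F_X^{-1}(z \mid \omega')\right)
\left[ \frac{f'_W\left(\omega' \mid X \le F_X^{-1}(z \mid \omega')\right)}{f_W\left(\omega' \mid X \le F_X^{-1}(z \mid \omega')\right)}
     - \frac{h'(\omega')}{h(\omega')} \right]
\ge
f_W\left(\omega' \mid X' \le F_{X'}^{-1}(z \mid \omega')\right) F_{X'}\left(F_{X'}^{-1}(z \mid \omega')\right)
\left[ \frac{f'_W\left(\omega' \mid X' \le F_{X'}^{-1}(z \mid \omega')\right)}{f_W\left(\omega' \mid X' \le F_{X'}^{-1}(z \mid \omega')\right)}
     - \frac{h'(\omega')}{h(\omega')} \right]. \tag{19}
\]
But, using Bayes' rule, f_W(ω′ | X ≤ F_X^{-1}(z|ω′)) F_X(F_X^{-1}(z|ω′)) = z h(ω′), so (19) becomes
\[
\frac{f'_W\left(\omega' \mid X \le F_X^{-1}(z \mid \omega')\right)}{f_W\left(\omega' \mid X \le F_X^{-1}(z \mid \omega')\right)}
\ge
\frac{f'_W\left(\omega' \mid X' \le F_{X'}^{-1}(z \mid \omega')\right)}{f_W\left(\omega' \mid X' \le F_{X'}^{-1}(z \mid \omega')\right)}.
\]
Finally, using Bayes' rule and the fact that the expectation of the posteriors must equal the prior,
the latter inequality is equivalent to the desired condition.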
8 References
Aghion, Philippe and Jean Tirole (1997): "Formal and Real Authority in Organizations,"
Journal of Political Economy, 105 (1), 1–29.
Athey, Susan (1998a): "Characterizing Properties of Stochastic Objective Functions," MIT
Working Paper 96-1R.
Athey, Susan (2000): "Investment and Information Value for a Risk-Averse Firm," MIT Working
Paper 00-30.
Athey, Susan (forthcoming): "Comparative Statics under Uncertainty: Single Crossing Properties
and Log-Supermodularity," Quarterly Journal of Economics.
Blackwell, David (1951): "Comparisons of Experiments," Proceedings of the Second Berkeley
Symposium on Mathematical Statistics, 93–102.
Blackwell, David (1953): "Equivalent Comparisons of Experiments," Annals of Mathematical
Statistics, 24, 265–272.
Cooper, Russell (1999): Coordination Games: Complementarities and Macroeconomics,
Cambridge University Press.
Epstein, L. and S. Tanny (1980): "Increasing Generalized Correlation: A Definition and Some
Economic Consequences," Canadian Journal of Economics, 13, 16–34.
Grant, Simon, Ben Polak and Atsushi Kajii (1998): "Intrinsic Preference for Information,"
Journal of Economic Theory, 83, 233–259.
Gollier, C. and M. Kimball (1995): "Toward a Systematic Approach to the Economic Effects
of Uncertainty I: Comparing Risks," Mimeo, University of Toulouse.
Hamada, K. (1974): "Comment on Stochastic Dominance in Choice under Uncertainty," in M.
Balch, D. McFadden, and S. Wu, eds., Essays on Economic Behavior under Uncertainty.
Amsterdam: North Holland.
Jewitt, Ian (1986): "A Note on Comparative Statics and Stochastic Dominance," Journal of
Mathematical Economics, 15, 249–254.
Karlin, Samuel and Howard Rubin (1956): "The Theory of Decision Procedures for Distributions
with Monotone Likelihood Ratio," Annals of Mathematical Statistics, 27, 272–299.
Lehmann, E.L. (1988): "Comparing Location Experiments," Annals of Statistics, 16 (2), 521–533.
Meyer, Margaret (1990): "Interdependence of Multivariate Distributions: Stochastic Dominance
Theorems and an Application to the Measurement of Ex Post Inequality under
Uncertainty," Nuffield College Discussion Paper.
Milgrom, Paul (1981): "Good News and Bad News: Representation Theorems and Applications,"
Bell Journal of Economics, 12, 380–391.
Milgrom, Paul and John Roberts (1990): "Rationalizability, Learning, and Equilibrium in
Games with Strategic Complementarities," Econometrica, 58, 1255–1277.
Milgrom, Paul and Christina Shannon (1994): "Monotone Comparative Statics," Econometrica,
62, 157–180.
Mirman, L., L. Samuelson, and E. Schlee (1994): "Strategic Manipulation of Information in
Duopolies," Journal of Economic Theory, 36, 364–384.
Persico, Nicola (2000): "Information Acquisition in Auctions," Econometrica.
Rothschild, Michael and Joseph Stiglitz (1970): "Increasing Risk I: A Definition," Journal
of Economic Theory, 2, 223–243.
Shaked, Moshe and George Shanthikumar (1997): "Supermodular Stochastic Order and Positive
Dependence of Random Variables," Journal of Multivariate Analysis, 61, 86–101.
Shannon, Chris (1995): "Weak and Strong Monotone Comparative Statics," Economic Theory, 5
(2), March, 209–227.
Tchen, A. (1976): "Inequalities for Distributions with Given Marginals," Ph.D. Thesis, Dept. of
Statistics, Stanford, June 1976.
Topkis, Donald (1998): Supermodularity and Complementarity. Princeton University Press.