Designing and Pricing Information*
Dirk Bergemann† Alessandro Bonatti‡ Alex Smolin§
October 9, 2015
Abstract
A monopolist sells informative experiments to heterogeneous buyers who face a
decision problem. Buyers differ in their prior information, and hence in their willingness
to pay for additional signals. The monopolist offers a menu of experiments. Every
experiment in the optimal menu contains locally non-dispersed information,
i.e., it rules out at least one realization of the underlying state. With binary states and
actions, the optimal menu is coarse: the seller offers at most two experiments, and we
derive conditions under which flat or discriminatory pricing is optimal. We apply our
findings to the sale of consumer-level information to enable targeted advertising.
Keywords: selling information, experiments, mechanism design, price discrimination,
product differentiation.
JEL Codes: D42, D82, D83.
*We thank Ben Brooks, Giacomo Calzolari, Gabriel Carroll, Gonzalo Cisternas, Jeff Ely, Emir Kamenica,
Alessandro Lizzeri, Alessandro Pavan, Phil Reny, Mike Riordan, Maher Said, Juuso Toikka, Alex Wolitzky
and seminar participants at Berkeley, Bocconi, Bologna, Carnegie Mellon, Chicago, Harvard, Mannheim,
NYU, Toulouse, Vienna, Yale and at the World Congress of the Econometric Society for helpful comments.
†Yale University, 30 Hillhouse Ave., New Haven, CT 06520, USA, [email protected]
‡MIT Sloan School of Management, 100 Main Street, Cambridge MA 02142, USA, [email protected]
§Yale University, 30 Hillhouse Ave., New Haven, CT 06520, USA, [email protected].
1 Introduction
The mechanisms by which information is traded can shape the creation and the distribution
of surplus in several important settings. Information about specific assets guides trading
decisions in financial markets; information about consumers facilitates targeting in online
advertising markets; and the indirect transfer of knowledge through consulting and expert
services affects the behavior of many firms. A feature common to all these markets is that
buyers of information often have private knowledge relevant to their decision problem at the
time of contracting (e.g., independent studies of traded companies, prior interactions with
specific consumers, and partial expertise in a given problem). In other words, buyers seek
to acquire incremental information.
Online advertising markets provide our initial motivation to study incremental infor-
mation. Large data brokers (e.g., Acxiom and Bluekai) collect and maintain databases that
record several attributes for each individual consumer. They then sell information about specific
consumers to firms operating on either side of the advertising market: merchants who
seek to tailor their advertising campaigns; and websites that sell their own advertising space
and wish to allocate users to specific "audience segments."1 An increasingly common use of
information is behavioral retargeting, i.e., targeting advertisements to a consumer on the basis
of previous interactions that did not lead to a sale.2 Crucially, when a firm seeks to acquire
external ("3rd-party") data to guide a retargeting campaign, it has proprietary ("1st-party")
data on its own customers and, hence, a private valuation for additional information.
In this paper, we develop a new framework to analyze the sale of incremental infor-
mation. We consider a single buyer who faces a decision problem under uncertainty. A
monopolist seller owns a database containing information about a "state" variable relevant
for the buyer's decision. However, the buyer is partially and privately informed about the
state. We investigate the revenue-maximizing policy for the seller. In other words, how much
information should the seller provide? How should the seller price access to the database?
Environment In order to screen heterogeneous buyers, the seller offers a menu of products.
In our context, these products are experiments in the sense of Blackwell: signals that
reveal information about the buyer's payoff-relevant state. Only the information product is
assumed to be contractible. In contrast, payments cannot be made contingent on the buyer's
actions, or the realized states and signals. Consequently, the value of an experiment for a
given buyer can be computed independently of its price. This is unlike a contract specifying
contingent payments for actions, where the marginal price influences the buyer's behavior
after observing a signal, and hence his willingness to pay ex ante.
1 The option to classify users is available to sellers of advertising on most online platforms, including Google's DoubleClick Ad Exchange.
2 The technology firm Rubicon Project integrates information from several sources to optimize advertising campaigns and website revenues ("Your Online Attention, Bought in an Instant," the New York Times, November 17, 2012). Lambrecht and Tucker (2013) empirically examine the efficacy of retargeting campaigns.
Finally, despite the buyer being privately informed about his beliefs, the analysis
differs considerably from a belief-elicitation problem. Instead, we cast the problem into the
canonical quality-pricing framework where the buyer's demand for information is determined
by his prior beliefs. In other words, the seller's problem reduces to optimally designing and
pricing information products in different versions.3
Results The very nature of information products enriches the scope for price discrimi-
nation and leads to new insights relative to the classic nonlinear pricing problem of Mussa
and Rosen (1978) and Maskin and Riley (1984). In particular, because information is only
valuable if it affects the decision maker's action, buyers with heterogeneous beliefs rank
informative signals differently. More precisely, all buyer types assign the highest value to the
perfectly informative experiment, but their ranking of imperfect experiments differs substantially.
Thus, the value of information naturally has both a vertical quality element and a
horizontal positioning element, and the two cannot be produced independently.
As one would expect, the seller offers a menu that, in general, contains both the fully
revealing experiment and partially informative, "distorted" experiments. However, the distorted
information products are not just noisy versions of the same data. Instead, the seller
introduces systematic distortions in the information provided. In particular, every experiment
offered as part of the optimal menu is locally non-dispersed, i.e., the seller introduces
no noise in the distribution of signals conditional on at least one true state.
We provide a full characterization of the optimal menu for the case of two types. The
ex ante less informed "high" type purchases the fully revealing experiment. The ex ante
more informed "low" type buys a distorted experiment. The seller's screening problem is
facilitated by the possibility of providing "directional" information: the seller introduces
noise in an appropriately chosen subset of signals to optimally reduce the information rents
of the less informed "high" type. As a result, the low type's experiment rules out those
states he deems relatively more likely. Furthermore, the flexibility to design rich information
structures allows the seller to (generically) serve both buyer types, i.e., to avoid exclusion of
the low type.
Despite the additional flexibility in screening, we show that bundling information is
optimal quite generally with more than two types. In particular, with binary states and
actions, the optimal menu for a continuum of types contains at most two experiments: one
is fully informative; and the other (if present) contains two signals, one of which perfectly
reveals the true state. Therefore, buyers never make a mistake whenever they choose one
of their two actions. In other words, the seller induces some buyers to make only type-I or
only type-II errors.
3 In Section 2, we relate the specific elements of our model to the retargeting context. In Section 6, we describe the information products offered by large data brokers and relate them to the properties of the optimal menu.
A fundamental distinction between two types is whether their priors are congruent, i.e.,
whether they would choose the same action in the absence of further information. Because
of the linearity of expected utility in probabilities, we obtain an intuition analogous to that of
Myerson (1981) or Riley and Zeckhauser (1983): if all types have congruent priors, the seller
should adopt flat pricing of the fully informative experiment. When priors are not all congruent,
the seller's problem is to screen buyers both within and across the two groups with congruent
priors. Ideally, the seller
would charge different prices for the fully informative experiment to buyers in each group.
We show that, if the distribution of types satisfies strong regularity conditions, flat pricing of
a single information structure is, in fact, optimal. Conversely, discriminatory pricing of two
different experiments emerges as part of the optimal menu only when "ironing" is required
(Myerson, 1981). The second experiment is offered as a means to serve buyers in one group,
while charging higher prices to the other group.
Related Literature In our earlier work on markets for data and online advertising (Berge-
mann and Bonatti, 2015), we examined the problem of selling information about individual
consumers as encoded in third-party cookies. Relative to the present paper, we allowed buy-
ers of information to choose from a continuum of ex-post actions (e.g., levels of advertising
spending), but we restricted the set of available information structures. In particular, we
allowed buyers to acquire individual cookies only, modeled as perfectly revealing signals about
individual realizations of the underlying state (i.e., the consumer's value). This effectively
divided the customer pool into "targeted" and "residual" consumers. We also required the
sellers of information to charge a constant linear price for each cookie. Thus, we abstracted
from the information-design problem, in order to focus on the effect of information sales on
the market for advertising space.
Our current paper ties the literature on selling information with that on monopolistic
screening. A long literature in economics and marketing studies the properties of information
goods, emphasizing how digitalized production allows sellers to easily customize (or degrade)
the attributes of such products (Shapiro and Varian, 1999). This argument applies even more
forcefully to information products (i.e., experiments), and suggests that versioning should
be an attractive price-discrimination technique (Sarvary, 2012). In this paper, we investigate
the validity of these claims in a simple contracting environment à la Mussa and Rosen (1978).
Admati and Pfleiderer (1986, 1990) provide the classic treatment of selling information to
a continuum of agents who trade in a rational expectations equilibrium. Our approach differs
from theirs along several dimensions. Admati and Pfleiderer (1986) emphasize the value of
selling informative, but noisy and idiosyncratic information to the traders, a result echoed
in Bergemann and Morris (2013). Admati and Pfleiderer (1990) compare direct and indirect
trading of information, allowing the seller to offer shares in a mutual fund. In contrast, our
paper focuses on heterogeneous, risk-neutral buyers who do not interact in a downstream
market. Consequently, the seller offers noisy versions of the data to screen the buyer's initial
information and extract more surplus. This leads to profound differences in the optimal
information structures. More recent contributions to the problem of selling information are
given by Eső and Szentes (2007b) and Hörner and Skrzypacz (2015). These papers focus
on specific, distinct aspects of the problem such as bundling products and information, and
dynamic information disclosure, respectively.
Within the mechanism design literature, our approach is related to, yet conceptually
distinct from, models of discriminatory information disclosure. In these models, the seller of
a good discloses horizontal match-value information, in addition to setting a price. Several
papers, among which Ottaviani and Prat (2001), Johnson and Myatt (2006), Bergemann and
Pesendorfer (2007), Eső and Szentes (2007a), Krähmer and Strausz (2015), and Li and Shi
(2015), analyze this problem from an ex-ante perspective. In these papers, the seller commits
(simultaneously or sequentially) to a disclosure rule and to a pricing policy.4 In related
contributions, Lizzeri (1999) considers vertical information acquisition and disclosure by a
monopoly intermediary, and Abraham, Athey, Babaioff, and Grubb (2014) study vertical
information disclosure in auctions.
Commitment to a disclosure policy is also present in the literature on Bayesian persuasion,
e.g. Rayo and Segal (2010), Kamenica and Gentzkow (2011), and Kolotilin, Li, Mylovanov,
and Zapechelnyuk (2015). In contrast to this line of work, our model admits monetary
transfers and rules out any direct effect of the buyer's ex post action on the seller's utility.
2 Model
We consider a model with a single agent (a buyer of information) facing a decision problem
under uncertainty. The relevant state for the buyer's problem, $\omega$, is drawn from a finite set
$\Omega$. The buyer must choose an action $a$, also from a finite set $A$. Throughout the paper, we
assume the sets of actions and states have the same cardinality, $|A| = |\Omega| = N$.
4 In addition, a number of more recent papers, among which Balestrieri and Izmalkov (2014), Celik (2014), Koessler and Skreta (2014), and Mylovanov and Tröger (2014), analyze this question from an informed-principal perspective.
Payoffs The buyer seeks to match the state and his action. His ex post utility function
$u(a, \omega)$ is given by
\[ u(a, \omega) = \mathbf{1}_{[a = \omega]}. \tag{1} \]
This formulation for the utility function assumes the buyer places uniform weight on his
decisions. More general formulations would be equally tractable, but would complicate the
exposition considerably.
Prior Information The buyer has a type $\theta \in \Theta$ consisting of his interim beliefs about
the state. These beliefs are the buyer's private information. They are generated from a
common prior and from privately observed signals. In particular, suppose the buyer and the
seller share a common prior
\[ \mu \in \Delta\Omega. \]
The buyer privately observes an informative signal $r \in R$ according to a commonly known
information structure
\[ \lambda : \Omega \to \Delta R. \]
The buyer then forms his interim beliefs $\theta \in \Theta$ as
\[ \theta(\omega \mid r) = \frac{\lambda(r \mid \omega)\,\mu(\omega)}{\sum_{\omega'} \lambda(r \mid \omega')\,\mu(\omega')}. \]
The buyer's interim beliefs $\theta(\omega \mid r)$, simply denoted by $\theta$, are his private information.
From the seller's point of view, the distribution over states and the distribution over
private signals simply induce a distribution of initial beliefs
\[ F(\theta), \]
which we take as a primitive of our model. As usual, the model allows for the alternative
interpretations of a single buyer and a continuum of buyers.5
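As a minimal sketch of how a private signal turns the common prior into a type, the following computes interim beliefs by Bayes' rule together with the induced distribution over types. The function names and the two-state numbers are illustrative assumptions and do not come from the paper.

```python
from fractions import Fraction


def interim_beliefs(prior, likelihood, r):
    """Bayes' rule: posterior over states after observing private signal r.

    prior[i]         -- common prior probability of state i
    likelihood[i][r] -- probability of private signal r in state i
    """
    joint = [p * row[r] for p, row in zip(prior, likelihood)]
    total = sum(joint)  # marginal probability of signal r
    if total == 0:
        raise ValueError("signal r has zero probability under the prior")
    return [x / total for x in joint]


def type_distribution(prior, likelihood):
    """List of (probability, interim belief) pairs induced by the private signal."""
    dist = []
    for r in range(len(likelihood[0])):
        prob_r = sum(p * row[r] for p, row in zip(prior, likelihood))
        if prob_r > 0:
            dist.append((prob_r, interim_beliefs(prior, likelihood, r)))
    return dist


# Hypothetical example: two states, two private signals.
prior = [Fraction(1, 2), Fraction(1, 2)]
likelihood = [[Fraction(7, 10), Fraction(3, 10)],
              [Fraction(3, 10), Fraction(7, 10)]]
for prob, beliefs in type_distribution(prior, likelihood):
    print(prob, beliefs)
```

In this hypothetical example each private signal is equally likely, and the two resulting types hold mirror-image beliefs about the state.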
Incremental Information The seller seeks to augment the buyer's private information
with additional experiments. An experiment (an information structure) $I = \{\pi, S\}$ consists
of a set of signals $s \in S$ and a likelihood function
\[ \pi : \Omega \to \Delta(S). \]
We assume throughout that the realizations of the buyer's private signal $r \in R$ and of the
signal $s \in S$ from any experiment $I$ are independent conditional on the state $\omega$. With the
conditional independence assumption, we are adopting the interpretation of a buyer querying
a database, or requesting a diagnostic service. In particular, the buyer and the seller draw
their information from independent sources. Under this interpretation, the seller does not
know the realized state $\omega$ at the time of contracting. The seller can, however, augment the
buyer's original information with arbitrarily precise signals.
5 In order to interpret the model as a continuum of buyers, we shall assume that states $\omega$ are identically and independently distributed across buyers, and that different buyers' signals are conditionally independent.
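We now fix an information structure $I$ and we let $\pi_{ij}$ denote the conditional probability
of signal $s_j$ in state $\omega_i$,
\[ \pi_{ij} = \Pr(s_j \mid \omega_i), \]
with $\sum_j \pi_{ij} = 1$. We can then represent any information structure in matrix form as
\[
I \;=\;
\begin{array}{c|cccc}
         & s_1      & \cdots & s_j      & \cdots \\ \hline
\omega_1 & \pi_{11} & \cdots & \pi_{1j} & \cdots \\
\vdots   & \vdots   &        & \vdots   &        \\
\omega_i & \pi_{i1} & \cdots & \pi_{ij} & \cdots \\
\vdots   & \vdots   &        & \vdots   &
\end{array}
\]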
The following experiments are of particular interest: (a) the fully informative experiment
with $\pi_{ii} = 1$ for all $i$; (b) locally non-dispersed experiments that contain a unit row vector
with $\pi_{ii} = 1$ for some $i$; and (c) locally noise-free experiments that contain a column $s_j$ with
only one positive entry, $\pi_{jj} > 0$ for some $j$. In particular, locally non-dispersed experiments
allow the buyer to rule out state $i$ upon observing any signal other than $s_i$, while locally
noise-free experiments reveal state $i = j$ with positive probability.
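The three classes can be checked mechanically. The sketch below assumes a square matrix `pi` with `pi[i][j]` the probability of signal $s_j$ in state $\omega_i$; the function names and the example matrices are hypothetical and only illustrate the definitions above.

```python
from fractions import Fraction as F


def is_fully_informative(pi):
    """Every state i produces signal s_i with probability one."""
    return all(pi[i][i] == 1 for i in range(len(pi)))


def is_locally_non_dispersed(pi):
    """Some state i produces signal s_i with probability one (a unit row)."""
    return any(pi[i][i] == 1 for i in range(len(pi)))


def is_locally_noise_free(pi):
    """Some signal s_j has positive probability only in state j."""
    n = len(pi)
    return any(
        pi[j][j] > 0 and all(pi[i][j] == 0 for i in range(n) if i != j)
        for j in range(n)
    )


# Hypothetical two-state experiment: state 2 always yields s_2,
# so s_1 can only occur in state 1.
partial = [[F(4, 5), F(1, 5)],
           [F(0), F(1)]]

# Hypothetical three-state experiment: a unit row for state 1,
# but no signal that pins down a single state.
three = [[F(1), F(0), F(0)],
         [F(1, 3), F(1, 3), F(1, 3)],
         [F(1, 3), F(1, 3), F(1, 3)]]

for name, pi in (("partial", partial), ("three", three)):
    print(name, is_fully_informative(pi),
          is_locally_non_dispersed(pi), is_locally_noise_free(pi))
```

The three-state example shows that, with more than two states, a locally non-dispersed experiment need not be locally noise-free.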
An information policy or, more simply, a menu $\mathcal{M} = \{\mathcal{I}, t\}$ consists of a collection of
information structures and an associated tariff function, i.e.,
\[ \mathcal{I} = \{I\}, \qquad t : \mathcal{I} \to \mathbb{R}_+. \]
Our goal is to characterize the revenue-maximizing menu for the seller.
We are deliberately focusing on the pure problem of designing and selling information
structures. We are therefore assuming that the seller commits to a menu before the realization
of the state and the type $(\omega, r)$, and that none of the buyer's action ($a$), the realized
state ($\omega$), or the experimental outcome (the signal $s$) is contractible. Thus, despite the buyer
being privately informed about his beliefs, scoring rules and other belief-elicitation schemes
are not available to the seller. In particular, the timing of the game is as follows: (i) the
seller posts a menu $\mathcal{M}$; (ii) the true state $\omega$ and the buyer's signal $r$ (hence, the type $\theta$) are
realized; (iii) the buyer chooses an experiment $I \in \mathcal{I}$ and pays the corresponding price $t$; (iv)
the buyer observes a signal $s$ from the experiment $I$ (given the true state $\omega$) and chooses an
action $a$.
Finally, we have assumed that the seller is unconstrained in her choice of information
structures, and that it is costless to provide any information. In other words, the data is
already stored. A richer model would distinguish the cost of acquisition of information (i.e.,
building the database) from duplication and distribution of the information. In particular,
the analysis could easily be extended to a first stage where the seller invests in the maximal
precision of the experiments, and to fixed or linear costs of information distribution.
Retargeting interpretation Having completed the description of the model, we now
return to online advertising markets to provide a more detailed interpretation of our setting.
Consider a market with a continuum of consumers indexed by $i \in [0, 1]$. Consumer $i$
spends $\omega \in \mathbb{R}_+$ per website, i.e., she has a budget of $\omega$. Budgets are distributed in the
overall population according to $\mu \in \Delta\Omega$. However, among customers of a given retailer
$r$ (e.g., $r$ corresponds to Walmart, J.C. Penney, Sears, Macy's), budgets are distributed
according to $\theta(\omega \mid r)$. Retailers want to determine which of their previous customers
they should retarget with an advertising campaign.
A large data seller has a record of past digital purchases of consumers i and hence it
perfectly knows the budget ! of each one. The data seller can, therefore, improve the
precision of the retailer's estimate for each individual consumer. In particular, suppose the
database records several attributes for each consumer, i.e., random variables that correlate
(positively or negatively) with individual budgets. The sale of information is then analogous
to a database query: a retailer and the seller contract on which attributes the seller will
disclose about a consumer upon the buyer's request. This allows the seller to offer narrower
or wider brackets for the estimation of individual budgets. When the price of information
is specified, however, the seller does not yet know the realized state, i.e., the budget of the
specific consumer the buyer will actually inquire about.
Because the distribution of budgets of retailer $r$'s previous customers is the retailer's
private information, the retailer is also privately informed about the incremental value of
each attribute. Consequently, the seller offers a menu of information packages.6 Some of
these packages may disclose only a fraction of the attributes relevant to the retailer's decision,
leading the retailer to take action (e.g., to advertise) too frequently or too rarely.
6 It is, of course, reasonable that the seller knows which consumers have visited a large retailer's website in the past. Therefore, the seller should be able to compute the conditional distributions $\theta(\omega \mid r)$. However, under a posted-price mechanism, the seller cannot condition prices and experiments on buyers' identities. This assumption is realistic in large markets with numerous small retailers, and whenever data purchases are made through an intermediary, such as an advertising agency.
Our model is also consistent with alternative interpretations. For example, suppose
each retailer $r$ observes consumer $i$'s budget upon their first interaction. Each consumer's
budget evolves stochastically according to a commonly known Markov process. The retailer
can query the database for information about the consumer's most recent activity. The
retailer is then privately informed about the time of its interaction with the consumer,
which determines its beliefs about the consumer's current budget. Finally, retailers may
have private information about their overall willingness to spend on advertising or about
other dimensions of their demand for information (e.g., the importance of their decision, or
the "stakes of the game").
3 Information Design
3.1 Value of Information
We now characterize the value of the buyer's initial information and the supplemental value
of an information structure $I = \{\pi, S\}$. Let $\theta_i$ denote the initial probability that type $\theta$
assigns to state $\omega_i$, with $i = 1, \dots, N$. The value of the buyer's problem in the absence of
further information is given by
\[ u(\theta) \triangleq \max_a \sum_{i=1}^{N} u(a, \omega_i)\, \theta_i. \]
Now suppose the buyer observes an incremental signal $s_j$ under the information structure
$I = \{\pi, S\}$, where the number of signal realizations is $|S| = J$. The buyer chooses an action
after updating his beliefs, leading to the gross utility
\[ u(\theta, s_j) \triangleq \max_a \sum_{i=1}^{N} u(a, \omega_i)\, \frac{\theta_i \pi_{ij}}{\sum_{i'=1}^{N} \theta_{i'} \pi_{i'j}}, \tag{2} \]
where $\sum_{i=1}^{N} \theta_i \pi_{ij} = \Pr[s_j \mid \theta]$ is the marginal probability of signal $s_j$ from the perspective of
buyer type $\theta$. Taking expectations over all signal realizations $s_j$, and subtracting the value of prior
information, the net value of an information structure $I$ for type $\theta$ is given by
\[ V(I, \theta) \triangleq \mathbb{E}_{s_j}[u(\theta, s_j)] - u(\theta) = \sum_{j=1}^{J} \max_a \sum_{i=1}^{N} u(a, \omega_i)\, \theta_i \pi_{ij} - u(\theta). \]
Under the specific utility function (1), the value of information takes a simpler form:
\[ V(I, \theta) = \sum_{j=1}^{J} \max_i \theta_i \pi_{ij} - \max_i \theta_i. \tag{3} \]
Quite simply, the gross value of an information structure is given by the ex ante probability
of the buyer choosing the correct action given the state.7
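Equation (3) is straightforward to evaluate numerically. The following is a small sketch under the matching utility (1); the function name and the example beliefs and experiments are hypothetical choices made only for illustration.

```python
from fractions import Fraction as F


def value_of_information(pi, theta):
    """Net value of experiment pi for belief theta, as in equation (3).

    pi[i][j]  -- probability of signal j in state i
    theta[i]  -- probability the buyer assigns to state i
    """
    n_signals = len(pi[0])
    # Gross value: after each signal, the buyer picks the most likely state.
    gross = sum(
        max(theta[i] * pi[i][j] for i in range(len(theta)))
        for j in range(n_signals)
    )
    # Subtract the probability of a correct guess under the prior alone.
    return gross - max(theta)


theta = [F(1, 2), F(3, 10), F(1, 5)]          # hypothetical three-state belief
revealing = [[F(1), F(0), F(0)],
             [F(0), F(1), F(0)],
             [F(0), F(0), F(1)]]
uninformative = [[F(1, 3)] * 3 for _ in range(3)]

print(value_of_information(revealing, theta))      # 1 - max(theta)
print(value_of_information(uninformative, theta))  # no change in behavior
```

An experiment whose signals never change the buyer's preferred action has zero net value, while a perfectly revealing one yields $1 - \max_i \theta_i$.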
As for all decision problems, an information structure $I$ is only valuable if different signals
$s_j$ lead to different actions. In particular, if $\arg\max_i \theta_i \pi_{ij}$ is constant across signals $j$, then (3)
immediately implies $V(I, \theta) = 0$. Conversely, the fully informative experiment $I^*$ guarantees that the
buyer takes the correct action ex post. Its value is given by $V(I^*, \theta) = 1 - \max_i \theta_i$.
Viewed as a function of types, the value of an experiment $I$ is piecewise linear in $\theta$ with
a finite number of kinks. Linearity is a consequence of the Bayesian nature of our problem,
where types are probabilities. Downward kinks are due to the max operator in the buyer's
reservation utility $u(\theta)$: they correspond to changes in the buyer's action under no additional
information. Upward kinks are generated by the max operator in (2): they reflect changes
in the buyer's preferred action upon observing a signal.
3.2 The Seller's Problem
We now characterize the menu of experiments that maximizes the seller's profits. By the
Revelation Principle, we may state the seller's problem as designing a direct mechanism
\[ \mathcal{M} = \{I(\theta), t(\theta)\} \]
that assigns an information structure, denoted by $I(\theta)$, and a transfer $t(\theta)$ to each type of the
buyer. The seller's problem consists of maximizing the expected transfers subject to incentive
compatibility and individual rationality:
\[ \max_{\{I(\theta),\, t(\theta)\}} \int t(\theta)\, dF(\theta), \tag{4} \]
\[ \text{s.t.} \quad V(I(\theta), \theta) - t(\theta) \ge V(I(\theta'), \theta) - t(\theta') \quad \forall\, \theta, \theta', \]
\[ \phantom{\text{s.t.} \quad} V(I(\theta), \theta) - t(\theta) \ge 0 \quad \forall\, \theta. \]
The seller's problem can be immediately simplified by restricting attention to a very tractable
class of information structures.
7 This is in contrast with models that specify contingent payments for actions, where the marginal price influences the buyer's behavior after observing a signal and hence his willingness to pay ex ante.
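For a finite set of types, the constraints in problem (4) can be verified directly. The sketch below is only an illustration under the matching utility (1): the helper names, the two hypothetical types, and the candidate prices are assumptions, and the reported revenue is meaningful only when both sets of constraints hold.

```python
from fractions import Fraction as F


def value(pi, theta):
    """Net value of experiment pi for belief theta, as in equation (3)."""
    gross = sum(
        max(t * row[j] for t, row in zip(theta, pi))
        for j in range(len(pi[0]))
    )
    return gross - max(theta)


def check_mechanism(types, weights, experiments, prices):
    """Return (incentive_compatible, individually_rational, revenue).

    Type n is assigned experiments[n] at prices[n]; weights[n] is its mass.
    """
    k = range(len(types))
    # net[n][m]: payoff of type n when picking the item intended for type m.
    net = [[value(experiments[m], types[n]) - prices[m] for m in k] for n in k]
    ic = all(net[n][n] >= net[n][m] for n in k for m in k)
    ir = all(net[n][n] >= 0 for n in k)
    revenue = sum(w * p for w, p in zip(weights, prices))
    return ic, ir, revenue


# Hypothetical two-state example with two equally likely types.
types = [[F(3, 5), F(2, 5)], [F(1, 10), F(9, 10)]]
weights = [F(1, 2), F(1, 2)]
full = [[F(1), F(0)], [F(0), F(1)]]

print(check_mechanism(types, weights, [full, full], [F(1, 10), F(1, 10)]))
print(check_mechanism(types, weights, [full, full], [F(2, 5), F(1, 10)]))
```

The second call shows why a single product cannot be sold at two prices: the type charged more would simply pick the cheaper item, so discrimination requires offering distinct experiments.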
Lemma 1 (Maximal Cardinality) Every incentive compatible information policy can be
represented as a collection of information structures where the signal space has at most the
cardinality of the action space.
This result follows from the revelation principle for multi-stage games in Myerson (1986).
The intuition is straightforward: consider an incentive-compatible information policy that
contains an experiment $I(\theta)$ with more signals than actions; the seller could combine all
signals in $I(\theta)$ that lead to the same choice of action for type $\theta$; clearly, the value of this
experiment stays constant for type $\theta$ (who does not modify his behavior); in addition, because
the new experiment is weakly less informative than the original one, $V(I(\theta), \theta')$ decreases (weakly)
for all $\theta' \neq \theta$, relaxing the incentive constraints.
Armed with this result, we can more fully characterize the buyer's demand for information.
To do so, let us revisit the value of information in a simple binary-state example. In
particular, let $\Omega = \{\omega_1, \omega_2\}$ and, to facilitate the exposition, define
\[ \alpha \triangleq \pi_{11} \quad \text{and} \quad \beta \triangleq \pi_{22}. \]
Lemma 1 allows us to focus on experiments with two signals only, i.e., on information
structures $I(\theta)$ of the following kind:
\[
I(\theta) \;=\;
\begin{array}{c|cc}
         & s_1                & s_2                \\ \hline
\omega_1 & \alpha(\theta)     & 1 - \alpha(\theta) \\
\omega_2 & 1 - \beta(\theta)  & \beta(\theta)
\end{array}
\]
Without loss of generality, we adopt the convention that $\alpha(\theta) + \beta(\theta) \ge 1$ (otherwise we
relabel the two signals). With some abuse of notation, we let $\theta = \Pr[\omega_1]$. Finally, the value
of experiment $(\alpha, \beta)$ for type $\theta$, which is given by (3), can also be written as
\[ V(\alpha, \beta, \theta) = \left[ \alpha\theta + \beta(1 - \theta) - \max\{\theta, 1 - \theta\} \right]^{+}. \tag{5} \]
Figure 1 shows the value of information for experiments $(\alpha, \beta) \in \{(1, 1), (1/2, 1)\}$. The first
one is fully informative, while the second one eliminates type-I errors.
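To make equation (5) concrete, here is a minimal sketch that evaluates the two experiments plotted in Figure 1 at a few beliefs. The function name and the grid of beliefs are assumptions chosen for illustration.

```python
from fractions import Fraction as F


def value_binary(alpha, beta, theta):
    """Equation (5): value of experiment (alpha, beta) when Pr[omega_1] = theta."""
    return max(alpha * theta + beta * (1 - theta) - max(theta, 1 - theta), 0)


# Compare the fully informative experiment (1, 1) with (1/2, 1)
# on a hypothetical grid of beliefs.
for theta in (F(1, 4), F(1, 2), F(3, 4)):
    print(theta, value_binary(1, 1, theta), value_binary(F(1, 2), 1, theta))
```

On this grid the fully informative experiment is valued symmetrically around $\theta = 1/2$, whereas the experiment $(1/2, 1)$ is worth $1/8$ at $\theta = 1/4$ and nothing at $\theta = 3/4$.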
While we are considering a natural type space, corresponding to the buyer's interim
beliefs, the notions of a "high" and "low" type differ from the standard screening setting.
In particular, due to the nature of the buyer's Bayesian problem, the most valuable type for
the seller is the middle type $\theta = 1/2$. Conversely, the two extreme types $\theta \in \{0, 1\}$ have no
value of information. However, the distance $|\theta - 1/2|$ is not a sufficient statistic for the value
of information, as shown in the right panel of Figure 1. In particular, the different slopes on
each side of $\theta = 1/2$ indicate different marginal benefits of avoiding type-I errors.
Figure 1: Value of Information Structures $V(\alpha, \beta, \theta)$
More generally, information always has both a vertical (quality) and a horizontal (positioning)
dimension. Information in this sense is always high-dimensional, even when the
buyer's type is one-dimensional, as in this example. In particular, a fundamental difference
with standard quality pricing is that all types agree on the best product (i.e., the fully informative
experiment), but they disagree on the ranking of "distorted" products. For example,
consider the following two information structures:
\[
I \;=\;
\begin{array}{c|cc}
         & s_1       & s_2   \\ \hline
\omega_1 & 1         & 0     \\
\omega_2 & 1 - \beta & \beta
\end{array}
\qquad
I' \;=\;
\begin{array}{c|cc}
         & s_1    & s_2        \\ \hline
\omega_1 & \alpha & 1 - \alpha \\
\omega_2 & 0      & 1
\end{array}
\]
All types $\theta \in (1/2, 1)$ have a positive value of information for experiment $I$, because it
contains a signal that perfectly reveals state $\omega_2$ and induces them to switch their
action. The converse is true for types $\theta \in (0, 1/2)$, who assign a positive value to experiment $I'$.
We have so far focused on locally non-dispersed experiments, i.e., information structures with
signals that perfectly rule out one state. The next result establishes that the optimal menu
contains only such experiments, further simplifying the seller's problem.
Lemma 2 (Non-Dispersed Information)
1. The fully informative experiment, $\pi_{ii} = 1$ for all $i$, is always part of the optimal menu.
2. Every experiment in an optimal menu is locally non-dispersed, i.e., $\pi_{ii} = 1$ for some $i$.
Part (1.) of this result is immediate: because every type values the fully informative
experiment $I^*$ the most, the seller can always replace the most expensive item on the menu
with $I^*$, keep all prices constant, and weakly increase profits. Part (2.) of Lemma 2 implies
that, for every experiment $I$, there exists a state $\omega$ that is ruled out by $N - 1$ signal
realizations.8 This result follows from the negative correlation present by construction in
the buyer's beliefs over each state. Because beliefs $\theta_i$ sum to one, we can always identify a
vertical quality dimension (e.g., $\pi_{NN}$) that buyers value uniformly. All other dimensions of
the experiment $\pi$ enter the buyer's utility through the difference with $\pi_{NN}$.
To obtain some intuition, consider again the case of binary states, and rearrange the
value of information in (5) as
\[ V(\alpha, \beta, \theta) = \left[ (\alpha - \beta)\theta + \beta - \max\{\theta, 1 - \theta\} \right]^{+}. \]
This formulation for the value of information highlights both level effects (terms that depend on
the allocation or type only) and interaction effects. In particular, the allocation $(\alpha, \beta)$ and
the buyer's type $\theta$ interact through the difference $\alpha - \beta$ only. This means the seller can
increase the value of an experiment at the same rate for all types. In particular, the seller
can increase $\alpha$ and $\beta$ holding $\alpha - \beta$ constant, raising the price at the same rate. This does
not alter the attractiveness of the experiment for any buyer who is considering choosing it.
Thus, in the case of binary state and binary action, every optimal experiment must have
either $\alpha = 1$ or $\beta = 1$ (or both). With two states only, any locally non-dispersed experiment
is also locally noise-free, i.e., at least one signal perfectly reveals the true state.
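The uniform-shift argument can be spelled out in one line. The shift size $\varepsilon > 0$ is notation introduced here for illustration only, with $\varepsilon \le 1 - \max\{\alpha, \beta\}$ so that the shifted entries remain probabilities.

```latex
\[
V(\alpha + \varepsilon, \beta + \varepsilon, \theta)
  = \left[ (\alpha - \beta)\theta + \beta + \varepsilon - \max\{\theta, 1 - \theta\} \right]^{+},
\]
\[
\text{so that}\quad
V(\alpha + \varepsilon, \beta + \varepsilon, \theta) - V(\alpha, \beta, \theta)
\;\begin{cases}
  = \varepsilon   & \text{if } V(\alpha, \beta, \theta) > 0, \\
  \le \varepsilon & \text{if } V(\alpha, \beta, \theta) = 0.
\end{cases}
\]
```

Raising the price by $\varepsilon$ therefore leaves the net utility of every type with a positive value unchanged, and weakly lowers the net utility that any other type would obtain from this experiment, so incentive and participation constraints continue to hold while revenue rises.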
In the context of online advertising, any such experiment consists of asking the data seller
to group potential consumers into two categories: "Buy Ad" and "Don't Buy Ad" (where
action $a_1$ corresponds to buying). With this interpretation, $\alpha = 1 > \beta$ will lead to a broad
match strategy, and vice-versa $\beta = 1 > \alpha$ will lead to excessively narrow targeting.
4 Optimal Menu with Binary Types
We now consider the case of two types of buyers, $\theta \in \{\theta^H, \theta^L\}$. For simplicity we will refer
to the relatively less informed type as the "high type" $\theta^H$ and to the relatively more informed
type as the "low type" $\theta^L$. More specifically, we assume that
\[ \max_i \theta_i^H \le \max_i \theta_i^L. \]
Equivalently, type $\theta^H$ is willing to pay (weakly) more for the fully revealing experiment. The fraction
of high types in the population is given by
\[ \gamma = \Pr(\theta^H). \]
To illustrate the basic mechanics of screening buyers with heterogeneous beliefs, we consider a
simple example with binary states and actions. We then turn to a more general environment.
8 Thus, the expected relative entropy of each experiment is infinite. Cabrales, Gossner, and Serrano (2015) offer an alternative criterion to quantify the informational content of an experiment.
4.1 Binary State Example
The optimal menu with binary states is best illustrated through a concrete example. Let
$\theta^H = 7/10$ and $\theta^L = 1/5$, and assume the two types are equally likely in the population
($\gamma = 1/2$). Now suppose the seller offers the fully informative experiment $\alpha = \beta = 1$ to
type $\theta^H$ and extracts his entire surplus by charging a price of $3/10$. In a canonical screening
model, the seller would now have to exclude type $\theta^L$. However, when selling information,
the monopolist can design another experiment with undesirable properties for the high type.
For example, the seller can offer a partially informative experiment that would lead type $\theta^H$
to choose action $a_1$ when observing signal $s_1$ and to be indifferent between the two actions
when observing signal $s_2$. Because the high type would choose action $a_1$ in the absence of
information, his willingness to pay for such an experiment is nil. The seller can then extract
the entire surplus of type $\theta^L$ as well, and satisfy incentive constraints. In our example, this
amounts to offering the experiment $(\alpha, \beta) = (4/7, 1)$ at a price of $4/35$. The net value of
both experiments as a function of the buyer's type $\theta \in [0, 1]$ is shown in Figure 2 below.
Figure 2: Suboptimal Menu: $(\alpha, \beta) \in \{(1, 1), (4/7, 1)\}$
As is clear from this picture, both incentive compatibility constraints are slack with such
a menu. Therefore, the seller can increase the quality of the experiment sold to the low
type, while satisfying the incentive constraint of the high type, and still extract the entire
surplus. The optimal menu is then characterized by the most informative experiment the
seller can offer while extracting the entire surplus. In our example, the optimal menu is then
given by $(\alpha, \beta) \in \{(1, 1), (4/5, 1)\}$ with the corresponding prices $t \in \{3/10, 4/25\}$. Figure 3
illustrates the net value of the two experiments offered by the monopolist.
Figure 3: Optimal Menu: $(\alpha, \beta) \in \{(1, 1), (4/5, 1)\}$
In this example, the optimal menu is characterized by "no distortion at the top," and
by full rent extraction. The (ex ante less informed) type $\theta$ closer to $1/2$ buys the perfectly
informative experiment. The more informed type buys a partially revealing experiment.
Intuitively, the seller lowers the information content provided to the more informed type to
lower the information rent of the uninformed type. However, the resulting partially informative
experiment is not worthless for the high type. On the contrary, it is as informative as
possible, while satisfying the high type's incentive compatibility and participation constraints
with equality.
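The numbers in this example can be verified mechanically. The sketch below is an illustration only: the helper name `value` is ours, and it encodes the binary-state value of information $V(\alpha, \beta, \theta) = [(\alpha - \beta)\theta + \beta - \max\{\theta, 1 - \theta\}]^+$ with exact fractions. It assumes, as in the example, that the low type's surplus is fully extracted by the price of the partial experiment.

```python
from fractions import Fraction

def value(alpha, beta, theta):
    """Value of experiment (alpha, beta) for a buyer with prior theta = Pr(omega_1)."""
    gross = (alpha - beta) * theta + beta - max(theta, 1 - theta)
    return max(gross, Fraction(0))

theta_H, theta_L = Fraction(7, 10), Fraction(1, 5)
one = Fraction(1)

# Price of the fully informative experiment: the high type's willingness to pay.
t_full = value(one, one, theta_H)

for alpha in (Fraction(4, 7), Fraction(4, 5)):
    t_partial = value(alpha, one, theta_L)               # low type pays his full value
    gain_H = value(alpha, one, theta_H) - t_partial      # high type's net payoff from deviating
    gain_L = value(one, one, theta_L) - t_full           # low type's net payoff from deviating
    print(f"alpha={alpha}: price={t_partial}, H deviation={gain_H}, L deviation={gain_L}")
```

With $\alpha = 4/7$ both deviation payoffs are strictly negative (both incentive constraints are slack), whereas with $\alpha = 4/5$ the high type's deviation payoff is exactly zero, so his incentive constraint binds.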
This example highlights the "horizontal" aspect of selling information that increases the
scope for screening. Towards a more general result, assume without loss that the high type
is given by $\theta^H > 1/2$ and that
\[ |\theta^H - 1/2| \le |\theta^L - 1/2|. \]
We now define the following threshold:
\[ \bar{\gamma} \triangleq \frac{\theta^L}{\theta^H}. \tag{6} \]
Finally, the following distinction is useful: the priors of the two types are said to be
congruent if both types would choose the same action in the absence of the additional
information. With our parametrization, priors are congruent if $1/2 \le \theta^H \le \theta^L$ and they are
non-congruent if $\theta^L \le 1/2 \le \theta^H$. In Proposition 1, we characterize the optimal menu in the
fully binary setting.9
Proposition 1 (Binary Types, States, and Actions)
1. With congruent priors, the seller offers the fully informative experiment only. Both
types participate if and only if $\gamma \le (1 - \theta^L)/(1 - \theta^H)$.
2. With non-congruent priors and $\gamma \le \bar{\gamma}$, both types buy the fully informative experiment.
3. With non-congruent priors and $\gamma > \bar{\gamma}$, the high type buys the fully informative experiment
and the low type buys a partially informative experiment with
\[ \alpha = \frac{2\theta^H - 1}{\theta^H - \theta^L} \quad \text{and} \quad \beta = 1. \]
Furthermore, the seller extracts the entire surplus.
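To see the trade-off between pooling and screening numerically, one can compare the seller's profit from the two candidate menus directly. The following is a minimal sketch for non-congruent priors; the function names are hypothetical, and the screening menu is taken to be the one described in part 3 of the proposition (full surplus extraction from both types).

```python
from fractions import Fraction

def value(alpha, beta, theta):
    """Binary-state value of experiment (alpha, beta) for prior theta = Pr(omega_1)."""
    return max((alpha - beta) * theta + beta - max(theta, 1 - theta), Fraction(0))

def menu_profits(theta_H, theta_L, gamma):
    """Return (pooling profit, screening profit) under non-congruent priors.

    Assumes theta_L <= 1/2 <= theta_H and |theta_H - 1/2| <= |theta_L - 1/2|.
    """
    one = Fraction(1)
    v_H = value(one, one, theta_H)
    v_L = value(one, one, theta_L)
    # Pooling: both types buy the fully informative experiment at the lower valuation.
    pooling = min(v_H, v_L)
    # Screening: distort alpha for the low type, keep beta = 1, extract both surpluses.
    alpha = (2 * theta_H - 1) / (theta_H - theta_L)
    screening = gamma * v_H + (1 - gamma) * value(alpha, one, theta_L)
    return pooling, screening

print(menu_profits(Fraction(7, 10), Fraction(1, 5), Fraction(1, 2)))
```

For the priors of the example in Section 4.1, screening yields 23/100 against 1/5 from pooling when the two types are equally likely, and the two profits coincide when the fraction of high types equals 2/7.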
This result highlights the role of linearity in the seller's problem, and it suggests that
the optimal menu is characterized by a corner solution. In particular, it is never optimal for
the seller to offer a more informative experiment to the low type and simultaneously leave
some rent to the high type by lowering the price of the fully informative experiment. If such
a marginal deviation is profitable, the optimal menu consists of pooling both types at the
fully informative experiment, as in part 2.
With congruent priors, it is more likely that both types receive full information if they
are similar, if the more informed types are more frequent, or if they have more to gain from
additional information. With non-congruent priors, the optimal menu offers the perfect
information structure to both types if they are sufficiently close in their prior information.
Offering two experiments is optimal only if the two types are sufficiently asymmetric. However,
the low type always receives some information in that case (i.e., he is not shut down),
and the high type is indifferent between the two items on the menu. Furthermore, the
high type only receives positive rents if he is pooled with the low type. Otherwise, the seller
extracts the high type's surplus, even in the presence of an alternative information structure.
Even though the shape of the optimal menu depends on whether types have congruent or
non-congruent priors, the comparative statics of the optimal information structure are robust
across the different scenarios. Because the high type buys the fully informative experiment
and the low type's experiment always has $\beta = 1$, we define the noise in the information
structure bought by the low type as $q \triangleq 1 - \alpha$. We can then ask: how do the distribution of
types and the prior information of each type affect the optimal provision of information?
9 This result is a special case of Proposition 2 and its proof is therefore omitted.
Corollary 1 (Comparative Statics)
1. The noise q is increasing in the frequency of the high type.
2. The noise q is increasing in the precision of the prior of the high type.
3. The noise q is decreasing in the precision of the prior of the low type.
In other words, the rent-extraction vs. efficiency trade-off is resolved at the expense of the
informed agent whenever the fraction of uninformed agents is larger, the uninformed agent
has a higher willingness to pay for information, or the informed type has a lower willingness
to pay for the complete information structure.
4.2 General Characterization
We now characterize the optimal menu with two types for the case of N states and actions.
To facilitate the description of the seller's problem, we first establish the familiar properties
of "no distortion at the top" and "no rent at the bottom."
Lemma 3 (Binding Constraints) In the optimal menu: the high type $\theta^H$ purchases the
complete information structure $I^*$; the incentive-compatibility constraint of type $\theta^H$ binds;
and the participation constraint of type $\theta^L$ binds.
We can now simplify the seller's problem (4). If the low type participates, the seller
extracts the entire surplus from the low type. We can then maximize over the information
structure $I(\theta^L)$ and over the information rent of the high type, which we denote by $U(\theta^H)$.
Without loss of generality, we arrange the signals in the experiment $I(\theta^L)$ so that the low
type chooses action $a_i$ when observing signal $s_i$.
Using expression (3) for the value of information and ignoring additive constants, the
seller's problem consists of choosing $U(\theta^H) \ge 0$ and the diagonal entries $\pi_{ii} \in [0, 1]$ of the
low type's information structure $I(\theta^L)$ to maximize
\[ (1 - \gamma) \sum_{i=1}^{n} \theta^L_i \pi_{ii} - \gamma\, U(\theta^H), \tag{7} \]
subject to the high type's incentive-compatibility constraint
\[ \sum_{i=1}^{n} \left(\theta^L_i - \theta^H_i\right) \pi_{ii} \ge \max_i \theta^L_i - \max_i \theta^H_i - U(\theta^H). \tag{8} \]
By inspection of (7) and (8), it is immediate that a higher $\pi_{ii}$ increases profits and relaxes
the constraint for all $\omega_i$ such that $\theta^L_i > \theta^H_i$. In other words, if the low type deems a given state
$\omega_i$ more likely than does the high type, the seller should not distort the distribution of signals
conditional on state $i$, i.e., she should send signal $s_i$ with probability one. In particular, the
optimal menu has $\pi_{ii} = 1$ for the state corresponding to the low type's default action.
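As a concrete reading of (7) and (8), the sketch below evaluates the seller's objective and the slack in the high type's incentive constraint for a given diagonal. The function names are hypothetical, and beliefs are passed as tuples of state probabilities; the binary example of Section 4.1 corresponds to $\theta^L = (1/5, 4/5)$ and $\theta^H = (7/10, 3/10)$.

```python
from fractions import Fraction

def objective(gamma, theta_L, pi_diag, rent_H):
    """Expression (7): (1 - gamma) * sum_i theta_L[i] * pi_ii - gamma * U(theta_H)."""
    return (1 - gamma) * sum(tL * p for tL, p in zip(theta_L, pi_diag)) - gamma * rent_H

def ic_slack(theta_L, theta_H, pi_diag, rent_H):
    """Left-hand side minus right-hand side of (8); the constraint holds iff this is >= 0."""
    lhs = sum((tL - tH) * p for tL, tH, p in zip(theta_L, theta_H, pi_diag))
    rhs = max(theta_L) - max(theta_H) - rent_H
    return lhs - rhs

theta_L = (Fraction(1, 5), Fraction(4, 5))
theta_H = (Fraction(7, 10), Fraction(3, 10))

# Diagonal (4/5, 1) with zero rent: the constraint binds exactly.
print(ic_slack(theta_L, theta_H, (Fraction(4, 5), Fraction(1)), Fraction(0)))
# Undistorted diagonal (1, 1) with zero rent: the constraint is violated.
print(ic_slack(theta_L, theta_H, (Fraction(1), Fraction(1)), Fraction(0)))
```

In this example, an undistorted diagonal becomes feasible only once the high type receives a rent of 1/10, which is exactly what he earns when both types buy the fully informative experiment at the low type's valuation.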
With two states only (e.g., in the previous example where $\alpha < 1 = \beta$), this observation
also identifies the most profitable way of reducing the high type's rent. With $N > 2$
states, the seller has considerably more instruments to distort the low type's allocation. In
particular, the partially informative experiment may contain fewer signals than available
actions: the seller may "drop" some signals from $I(\theta^L)$ to reduce the information rents. The
logic above then suggests that the seller should distort the signal distribution in states the
high type deems very likely but the low type does not.
To formalize this intuition, we re-order the states $\omega_i$ by the likelihood ratios of the two
types' beliefs. In particular, let
\[ \frac{\theta^L_1}{\theta^H_1} \le \cdots \le \frac{\theta^L_i}{\theta^H_i} \le \cdots \le \frac{\theta^L_N}{\theta^H_N}. \tag{9} \]
We then define two critical states. The first state is defined as
\[ i^* = \min\left\{ i : \frac{\theta^L_i}{\theta^H_i} \ge \gamma \right\}. \tag{10} \]
(State $i^*$ is well-defined because the likelihood ratio must exceed one for some $i$.) To define
the second state, consider the following quantity:
\[ k_j := \frac{\sum_{i=j+1}^{n} \left(\theta^L_i - \theta^H_i\right) - \max_i \theta^L_i + \max_i \theta^H_i}{\theta^H_j - \theta^L_j}. \tag{11} \]
In particular, $k_j$ corresponds to the value of $\pi_{jj}$ that satisfies (8) with equality when the
high type's rent is nil and, in addition, $\pi_{ii} = 1$ for all states $i > j$ and $\pi_{ii} = 0$ for all $i < j$.
We then let
\[ j^* := \min\left\{ j : k_j \ge 0 \right\}, \tag{12} \]
which is also well-defined because the definition of a low type implies at least $k_N \ge 0$. In
Proposition 2, we provide a general characterization of the optimal menu.
Proposition 2 (Optimal Menu with Two Types) The optimal experiment $I(\theta^L)$ has
$\pi_{ii} = 0$ for all $i < \min\{i^*, j^*\}$ and $\pi_{ii} = 1$ for all $i > \min\{i^*, j^*\}$. Moreover:
1. If $j^* < i^*$, then $\pi_{j^* j^*} = k_{j^*}$ and $U(\theta^H) = 0$.
2. If $j^* \ge i^*$, then $\pi_{i^* i^*} = 1$ and $U(\theta^H) > 0$.
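The characterization lends itself to a direct computation of the diagonal of the low type's experiment. The sketch below is an illustration under stated assumptions, not the authors' algorithm from the Appendix: the function name is ours, states are assumed to be pre-sorted by the likelihood ratio $\theta^L_i / \theta^H_i$, the two beliefs are assumed to differ in every state (so that the denominator of $k_j$ is non-zero), and indices are 0-based, so state $i$ in the text is index $i - 1$. It returns only the diagonal and whether the high type earns a rent; off-diagonal entries are not assigned.

```python
from fractions import Fraction

def optimal_diagonal(theta_L, theta_H, gamma):
    """Diagonal of I(theta_L) and a flag for U(theta_H) > 0, read off Proposition 2."""
    n = len(theta_L)
    gap = max(theta_H) - max(theta_L)

    def k(j):
        # Value of pi_jj that makes the high type's incentive constraint bind with zero rent.
        tail = sum(theta_L[i] - theta_H[i] for i in range(j + 1, n))
        return (tail + gap) / (theta_H[j] - theta_L[j])

    # First state whose likelihood ratio theta_L / theta_H is at least gamma.
    i_star = next(i for i in range(n) if theta_L[i] >= gamma * theta_H[i])
    # First state at which k_j is nonnegative.
    j_star = next(j for j in range(n) if k(j) >= 0)

    cut = min(i_star, j_star)
    diag = [Fraction(0)] * cut + [Fraction(1)] * (n - cut)
    positive_rent = j_star >= i_star
    if not positive_rent:
        diag[j_star] = k(j_star)
    return diag, positive_rent

theta_L = (Fraction(1, 10), Fraction(1, 10), Fraction(4, 5))
theta_H = (Fraction(2, 5), Fraction(3, 10), Fraction(3, 10))
for gamma in (Fraction(1, 5), Fraction(3, 10), Fraction(1, 2)):
    print(gamma, optimal_diagonal(theta_L, theta_H, gamma))
```

For these beliefs the diagonal moves from $(1, 1, 1)$ to $(0, 1, 1)$ and then to $(0, 1/2, 1)$ as the fraction of high types grows, and the high type's rent vanishes only in the last case.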
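In general, the optimal experiment $I(\theta^L)$ takes the following upper-triangular shape, with
$\pi_{ii} \in \{k_i, 1\}$ depending on whether $i^* \gtrless j^*$, as in Proposition 2:
\[
\begin{pmatrix}
0 & \cdots & 0 & \pi_{1i} & \cdots & \cdots & \pi_{1n} \\
\vdots & & \vdots & \vdots & & & \vdots \\
\vdots & & 0 & \pi_{ii} & \cdots & \cdots & \pi_{in} \\
\vdots & & \vdots & 0 & 1 & \cdots & 0 \\
\vdots & & \vdots & \vdots & & \ddots & \vdots \\
0 & \cdots & 0 & 0 & 0 & \cdots & 1
\end{pmatrix}
\tag{13}
\]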
The distribution of signals conditional on states $i < \min\{i^*, j^*\}$ is not uniquely pinned
down. In the Appendix, we describe three algorithms that compute
the solution and assign the off-diagonal entries $\pi_{ij}$ in such a way that both types follow
the recommendation implied by each signal.
The seller's choice of experiments in the optimal menu depends on two distinct factors: (a)
the extent and the structure of the two types' belief disagreement, and (b) the distribution
of types. Proposition 2 shows that the problem is separable. In particular, the optimal
experiment $I(\theta^L)$ is chosen from a discrete set that only depends on the two types $(\theta^H, \theta^L)$.
Each element of this set distorts progressively more signals as in (13). The distribution of
types then determines the optimal element of the set.
Intuitively, the extent of the two types' belief disagreement should guide the seller's choice
of a partially informative experiment. In particular, the seller is most willing to introduce
noise in signals that the low type considers relatively less likely than the high type, because
these distortions facilitate screening without sacrificing surplus. That is, distortions are
more likely to occur in states that maximize relative disagreement. The extent of disagreement
is captured by the critical state $j^*$, which represents the number of signals that must
be distorted in order to satisfy incentive compatibility, while holding the high type to his
reservation utility.
The profitability of distorting the allocation depends, however, on the distribution of
types. In particular, the measure of high types (as captured by the critical state $i^*$) represents
the shadow cost of providing information to the low types. For sufficiently high shadow cost,
the monopolist is willing to distort the allocation by removing as many signals as necessary
to satisfy (8) and hold the high type to his participation constraint. For low opportunity
cost, the principal prefers to limit distortions and concede rents to the high type. Thus, as
in the binary-state case, the informativeness of the low type's experiment is decreasing in
the fraction of high types $\gamma$.
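We illustrate the findings of Proposition 2 through the following examples in a setting
with three states and actions.
Example 1 (Dissimilar Types) Let $\theta^L = (1/10, 1/10, 4/5)$ and $\theta^H = (1/2, 1/4, 1/4)$.
Because $k_1 > 0$, we have $j^* = 1$ and therefore $\pi_{ii} > 0$ for all $i = 1, 2, 3$. The optimal experiment
$I(\theta^L)$ as a function of the distribution of types is given by
\[
\begin{array}{c|ccc}
 & s_1 & s_2 & s_3 \\ \hline
\omega_1 & 1 & 0 & 0 \\
\omega_2 & 0 & 1 & 0 \\
\omega_3 & 0 & 0 & 1
\end{array}
\quad \text{if } \gamma < 1/5,
\]
and
\[
\begin{array}{c|ccc}
 & s_1 & s_2 & s_3 \\ \hline
\omega_1 & 1/4 & 1/4 & 1/2 \\
\omega_2 & 0 & 1 & 0 \\
\omega_3 & 0 & 0 & 1
\end{array}
\quad \text{if } \gamma > 1/5.
\]
The high type obtains zero rent if and only if $\gamma \ge 1/5$.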
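Example 2 (Similar Types) Let $\theta^L = (1/10, 1/10, 4/5)$ but consider a relatively less
informed high type, i.e., $\theta^H = (2/5, 3/10, 3/10)$. Because $k_1 < 0 < k_2$, the partially informative
experiment $I(\theta^L)$ can involve dropping signal $s_1$, i.e., setting $\pi_{11} = 0$. In particular, the
optimal experiment $I(\theta^L)$ is given by
\[
\begin{array}{c|ccc}
 & s_1 & s_2 & s_3 \\ \hline
\omega_1 & 1 & 0 & 0 \\
\omega_2 & 0 & 1 & 0 \\
\omega_3 & 0 & 0 & 1
\end{array}
\quad \text{if } \gamma < 1/4,
\]
\[
\begin{array}{c|ccc}
 & s_1 & s_2 & s_3 \\ \hline
\omega_1 & 0 & 1/4 & 3/4 \\
\omega_2 & 0 & 1 & 0 \\
\omega_3 & 0 & 0 & 1
\end{array}
\quad \text{if } \gamma \in [1/4, 1/3],
\]
\[
\begin{array}{c|ccc}
 & s_1 & s_2 & s_3 \\ \hline
\omega_1 & 0 & 1/4 & 3/4 \\
\omega_2 & 0 & 1/2 & 1/2 \\
\omega_3 & 0 & 0 & 1
\end{array}
\quad \text{if } \gamma > 1/3.
\]
The high type obtains zero rent if and only if $\gamma \ge 1/3$.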
Notice that the monopolist must distort the allocation more heavily in Example 2, where
the two types are more similar. In both examples, for sufficiently low $\gamma$, the menu contains
the fully informative experiment $I^*$ only.
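One property of the matrices in Examples 1 and 2 that is easy to check by hand, but also easy to get wrong, is that both types are willing to follow every recommendation. The sketch below tests this for a given experiment; the function name is hypothetical, and it assumes (as in the value of the fully informative experiment) that the buyer's best action after a signal is the one matching the most likely state, with ties counted as obedience.

```python
from fractions import Fraction as Fr

def follows_recommendations(theta, experiment):
    """True if, after every signal s_j sent with positive probability, action a_j is weakly optimal."""
    n = len(theta)
    for j in range(n):
        # Unnormalised posterior weights theta_i * pi_ij over states i.
        weights = [theta[i] * experiment[i][j] for i in range(n)]
        if sum(weights) > 0 and weights[j] < max(weights):
            return False
    return True

theta_L = (Fr(1, 10), Fr(1, 10), Fr(4, 5))
theta_H_ex1 = (Fr(1, 2), Fr(1, 4), Fr(1, 4))
theta_H_ex2 = (Fr(2, 5), Fr(3, 10), Fr(3, 10))

# Partially informative experiments from Example 1 and Example 2 (largest-gamma case).
ex1 = [[Fr(1, 4), Fr(1, 4), Fr(1, 2)], [0, 1, 0], [0, 0, 1]]
ex2 = [[0, Fr(1, 4), Fr(3, 4)], [0, Fr(1, 2), Fr(1, 2)], [0, 0, 1]]

print(follows_recommendations(theta_L, ex1), follows_recommendations(theta_H_ex1, ex1))
print(follows_recommendations(theta_L, ex2), follows_recommendations(theta_H_ex2, ex2))
```

In both examples the high type turns out to be exactly indifferent after signal $s_3$, which is consistent with the experiments being as informative as incentive compatibility allows.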
The following corollary of Proposition 2 illustrates two polar cases in which the distribution
of types pins down the rent of the high type.
Corollary 2 (Information Rents)
1. If $\gamma < \theta^L_1 / \theta^H_1$, both types purchase the fully informative experiment $I^*$ and the high type
obtains positive rent $U(\theta^H) > 0$.
2. If $\gamma > \theta^L_i / \theta^H_i$ for all $i$ such that $\theta^L_i / \theta^H_i \le 1$, the high type obtains no rent.
The main idea of this section is that the seller can exploit disagreement between the two
types along several dimensions to profitably offer a screening menu. But when is it profitable
to offer two distinct experiments? Corollary 2 dealt with the case of bunching. Corollary 3
deals with the case of exclusion. It contains sufficient conditions for discriminatory pricing (a
menu with two experiments) to be optimal under some distribution $\gamma$. Recall that congruent
priors are beliefs that lead to identical actions in the absence of new information.
Corollary 3 (Discriminatory Pricing)
1. With non-congruent priors, it is never optimal to exclude type $\theta^L$.
2. With congruent priors and types not lying on a ray in the simplex, there exists $\gamma$ such
that the optimal experiment $I(\theta^L)$ is partially informative.
Thus, discriminatory pricing can be profitable even if the two types agree on the default
action, as long as the likelihood ratio $\theta^L_i / \theta^H_i$ takes more than two distinct values. With only
two states, this is impossible, and congruent priors imply bunching or exclusion (i.e., only
the fully informative experiment is sold). With more than two states, the seller can exploit
disagreement along any dimension and extract more surplus through a richer menu.
5 Optimal Menu with a Continuum of Types
We now investigate which properties of the optimal menu discussed above extend to the more
general setting of a continuum of types. The seller's problem when $N > 2$ is, in general,
a complicated multidimensional screening problem.10 We therefore specialize the model to
the case of binary action and state, i.e., $|A| = |\Omega| = 2$. Recall our notation for information
structures $I(\theta)$ in the binary-state case,
\[
I(\theta) =
\begin{array}{c|cc}
 & s_1 & s_2 \\ \hline
\omega_1 & \alpha(\theta) & 1 - \alpha(\theta) \\
\omega_2 & 1 - \beta(\theta) & \beta(\theta)
\end{array}
\]
and the value of information
\[ V(\alpha, \beta, \theta) = \left[ (\alpha - \beta)\,\theta + \beta - \max\{\theta, 1 - \theta\} \right]^+ . \]
10 The linearity introduced by beliefs does not help in this respect. In fact, kinks in the utility function (3) make our setting substantially more difficult than, for example, the bundling models of Manelli and Vincent (2006), Pavlov (2011), or Daskalakis, Deckelbaum, and Tzamos (2014).
Because the allocation $(\alpha, \beta)$ interacts with the buyer's type only through the difference $\alpha - \beta$,
we define the following one-dimensional allocation rule measuring the relative informativeness
of an experiment
\[ q(\theta) \triangleq \alpha(\theta) - \beta(\theta) \in [-1, 1]. \]
The result in Lemma 2 implies that either $\alpha = 1$ or $\beta = 1$ in any experiment. Therefore,
we adopt the convention that $q(\theta) > 0$ implies $\alpha(\theta) = 1$ and $q(\theta) < 0$ implies $\beta(\theta) = 1$, i.e.,
$\beta = \min\{1, 1 - q\}$. We then express the value of experiment $q \in [-1, 1]$ for type $\theta \in [0, 1]$
as follows:
\[ V(q, \theta) = \left[ \theta q - \max\{q, 0\} + \min\{\theta, 1 - \theta\} \right]^+ . \tag{14} \]
With this notation, two distinct information structures ($q = -1$ and $q = 1$) correspond to
releasing no information to the buyer. (These are the two experiments that show the same
signal with probability one.) Conversely, the fully informative experiment is given by $q = 0$.
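The change of variables from $(\alpha, \beta)$ to $q$ can be checked numerically. The sketch below uses hypothetical function names; it maps each $q$ to the experiment with $\alpha - \beta = q$ and $\max\{\alpha, \beta\} = 1$, and compares the two expressions for the value of information on a small grid of exact fractions.

```python
from fractions import Fraction

def experiment_from_q(q):
    """Map q in [-1, 1] to (alpha, beta) with alpha - beta = q and max(alpha, beta) = 1."""
    return min(1, 1 + q), min(1, 1 - q)

def value_ab(alpha, beta, theta):
    """Value of information written in terms of (alpha, beta)."""
    return max((alpha - beta) * theta + beta - max(theta, 1 - theta), 0)

def value_q(q, theta):
    """Value of information written in terms of q, as in (14)."""
    return max(theta * q - max(q, 0) + min(theta, 1 - theta), 0)

qs = [Fraction(-1), Fraction(-1, 2), Fraction(0), Fraction(1, 3), Fraction(1)]
thetas = [Fraction(0), Fraction(1, 5), Fraction(1, 2), Fraction(7, 10), Fraction(1)]
agree = all(value_ab(*experiment_from_q(q), t) == value_q(q, t) for q in qs for t in thetas)
print(agree)
```

On this grid the two formulas coincide, the experiments $q = -1$ and $q = 1$ are worth zero to every type, and $q = 0$ is worth $\min\{\theta, 1 - \theta\}$.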
The utility function $V(q, \theta)$ has the single-crossing property in $(\theta, q)$. This indicates that
higher-$\theta$ buyers, who are relatively more optimistic about state $\omega_1$, assign a relatively higher
value to information structures with a higher $q$, because such experiments contain a signal
that is more informative about the (less likely) state $\omega_2$.
The formulation in (14) for the value of information reflects the following properties of
our screening problem: (a) buyers have type-dependent participation constraints;11 (b) buyer
type $\theta = 1/2$ has the highest willingness to pay for any experiment $q$; (c) the experiment
$q = 0$ is the most valuable for all types $\theta$; (d) different types $\theta$ assign different ranks to
partially informative experiments $q \neq 0$.
Overall, while the seller's problem is reminiscent of classic nonlinear pricing, selling
information introduces a novel aspect of horizontal differentiation across buyer types. Furthermore,
the "vertical quality" and "horizontal position" of an information product cannot
be chosen separately by the seller. In particular, it is not possible to change the relative
informativeness of a product (i.e., choose a very high or very low $q$) without reducing its
overall informativeness.
11 In general, our setting involves both bunching and shutdown, making it difficult to directly apply existing approaches, such as the one in Jullien (2000).
5.1 Incentive Compatibility
We first simplify the problem by eliminating the non-negativity constraint from (14). To do
so, we consider a natural class of allocations.
With an abuse of terminology, we define an allocation $q(\theta)$ to be responsive if, for any $\theta$,
\[
\theta q(\theta) - \max\{q(\theta), 0\} + \min\{\theta, 1 - \theta\} \le 0
\;\Longrightarrow\;
q(\theta) =
\begin{cases}
-1 & \text{if } \theta < 1/2, \\
+1 & \text{if } \theta \ge 1/2.
\end{cases}
\]
Under a responsive allocation, any type $\theta$ who takes the same action following both signal
realizations receives one of the two completely uninformative experiments. We then have
the following intuitive result.
Lemma 4 (Responsive Allocations) Any optimal allocation $q^*(\theta)$ is responsive.
In other words, if type $\theta$ derives zero value from his experiment $q(\theta)$, then the seller is
(weakly) better off by not providing any information, which allows her to relax all other
types' incentive constraints.
We now use the structure of the problem in order to derive a characterization of all
implementable responsive allocations $q$. We observe that the buyer's utility function has a
downward kink in $\theta$. This is a consequence of having an interior type ($\theta = 1/2$) assign the
highest value to any allocation, and of the linearity of the buyer's problem. We therefore
compute the buyer's rents $U(\theta)$ on $[0, 1/2]$ and $[1/2, 1]$ by applying the envelope theorem to
each subinterval separately. We thus obtain two different expressions for the rent of type
$\theta = 1/2$. Continuity of the rent function then implies
\[ U(1/2) = U(0) + \int_0^{1/2} V_\theta(q, \theta)\, d\theta = U(1) - \int_{1/2}^{1} V_\theta(q, \theta)\, d\theta. \]
Because types $\theta = 0$ and $\theta = 1$ assign zero value to any experiment, incentive compatibility
requires $U(0) = U(1)$. (Optimality will later require setting both to zero.) Therefore, while
any type's utility can always be written in the above form, the novel element of our model
is that no further endogenous variables appear.12 Differentiating (14) with respect to $\theta$ and
simplifying, we obtain the following restriction on an incentive-compatible allocation
\[ \int_0^{1/2} \left( q(\theta) + 1 \right) d\theta = -\int_{1/2}^{1} \left( q(\theta) - 1 \right) d\theta. \tag{15} \]
12 For instance, in Mussa and Rosen (1978) and in Myerson (1981), the rent of the highest type $U(1)$ depends on the entire allocation $q$.
Equation (15) is a new condition that sets our framework apart from most other screening
problems. In particular, incentive constraints impose an aggregate (integral) constraint on
the allocation.13 We formalize this in the following result.
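The step from (15) to a single integral condition is short and can be spelled out. Expanding both sides of (15), the constants $1/2$ cancel:

```latex
\begin{align*}
\int_0^{1/2} \bigl(q(\theta) + 1\bigr)\, d\theta
  &= -\int_{1/2}^{1} \bigl(q(\theta) - 1\bigr)\, d\theta \\
\int_0^{1/2} q(\theta)\, d\theta + \tfrac{1}{2}
  &= -\int_{1/2}^{1} q(\theta)\, d\theta + \tfrac{1}{2} \\
\int_0^{1} q(\theta)\, d\theta &= 0 .
\end{align*}
```

The last line is the integral constraint that appears in Lemma 5.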
Lemma 5 (Implementable Allocations) Any responsive allocation $q$ is implementable if
and only if the following two conditions hold:
\[ q(\theta) \in [-1, 1] \text{ is non-decreasing;} \]
and
\[ \int_0^1 q(\theta)\, d\theta = 0. \tag{16} \]
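For piecewise-constant allocations, the two conditions of Lemma 5 are easy to verify exactly. The following is a minimal sketch with a hypothetical function name; it assumes the allocation is already responsive and is described by a list of cutoffs from 0 to 1 together with the constant value of $q$ on each interval.

```python
from fractions import Fraction

def is_implementable(cutoffs, levels):
    """Check monotonicity and the zero-integral condition for a step allocation.

    levels[k] is the value of q on [cutoffs[k], cutoffs[k + 1]); cutoffs runs from 0 to 1.
    """
    in_range = all(-1 <= q <= 1 for q in levels)
    monotone = all(a <= b for a, b in zip(levels, levels[1:]))
    integral = sum(q * (hi - lo) for q, lo, hi in zip(levels, cutoffs, cutoffs[1:]))
    return in_range and monotone and integral == 0

# Symmetric exclusion regions: implementable.
flat = is_implementable([Fraction(0), Fraction(1, 4), Fraction(3, 4), Fraction(1)], [-1, 0, 1])
# Asymmetric exclusion regions with no intermediate experiment: the integral is not zero.
skewed = is_implementable([Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1)], [-1, 0, 1])
print(flat, skewed)
```

Exact fractions are used so that the integral can be compared with zero directly; with floating-point cutoffs a tolerance would be needed instead.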
The transfers $t(\theta)$ associated with the allocation $q(\theta)$ can be computed in the usual
way on the two intervals $[0, 1/2]$ and $[1/2, 1]$ separately. With the addition of the integral
constraint (16) for implementability, we can state the seller's problem as follows:
\[
\max_{q(\theta)} \int_0^1 \left[ \left( \theta - \frac{F(1/2) - F(\theta)}{f(\theta)} \right) q(\theta) - \max\{q(\theta), 0\} \right] dF(\theta), \tag{17}
\]
\[
\text{s.t. } q(\theta) \in [-1, 1] \text{ non-decreasing}, \quad \int_0^1 q(\theta)\, d\theta = 0.
\]
5.2 Optimal Menu
In order to solve the seller's problem (17) and characterize the optimal menu, we rewrite the
objective with the density $f(\theta)$ explicitly in each term:
\[ \int_0^1 \left[ \left( \theta f(\theta) + F(\theta) \right) q(\theta) - \max\{q(\theta), 0\}\, f(\theta) \right] d\theta. \]
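The rewriting relies on the integral constraint (16). Writing $dF(\theta) = f(\theta)\, d\theta$ in the objective of (17) and collecting terms, the intermediate step is:

```latex
\begin{align*}
&\int_0^1 \left[ \left( \theta - \frac{F(1/2) - F(\theta)}{f(\theta)} \right) q(\theta)
      - \max\{q(\theta), 0\} \right] f(\theta)\, d\theta \\
&\qquad = \int_0^1 \Bigl[ \bigl( \theta f(\theta) + F(\theta) \bigr)\, q(\theta)
      - \max\{q(\theta), 0\}\, f(\theta) \Bigr]\, d\theta
      \;-\; F(1/2) \int_0^1 q(\theta)\, d\theta .
\end{align*}
```

The last term vanishes for every allocation that satisfies (16), so it can be dropped from the objective.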
This minor modification highlights two important features of our problem: (i) the constraint
and the objective have generically different weights, $d\theta$ and $dF(\theta)$; and hence (ii) the problem
is non-separable in the type and the allocation, which interact in two different terms. In
particular, consider the "virtual values" $\phi(\theta, q)$, defined as the partial derivative of the
integrand with respect to $q$. Unlike in the standard separable models, our virtual values are
a function of the allocation. Because our objective is piecewise linear, the function $\phi(\theta, q)$
takes on two values only, i.e.,
\[
\phi(\theta, q) :=
\begin{cases}
\theta f(\theta) + F(\theta) & \text{for } q < 0, \\
(\theta - 1) f(\theta) + F(\theta) & \text{for } q > 0.
\end{cases}
\]
These two values represent the marginal benefit to the seller (ignoring the constraint) of
increasing each type's allocation from $-1$ to $0$, and from $0$ to $1$, respectively.
13 In this sense, our model differs from other instances of screening under integral constraints (e.g., constraints on transfers due to budget or enforceability, or capacity constraints). Finally, while the resemblance to a persuasion budget constraint is purely cosmetic, a similar condition does appear in the model of persuasion with private information of Kolotilin, Li, Mylovanov, and Zapechelnyuk (2015).
If ironing à la Myerson is required, we denote the ironed virtual value for experiment $q$
as $\bar{\phi}(\theta, q)$. Finally, we say that the allocation satisfies the pooling property if it is constant
on any interval where the ironed virtual value is constant. We can then reduce the seller's
problem to the following maximization program.
Proposition 3 (Optimal Allocation Rule) The allocation $q^*(\theta)$ is optimal if and only if
there exists $\lambda^* > 0$ such that: (i) for all $\theta$,
\[ q^*(\theta) = \arg\max_q \left[ \int_{-1}^{q} \left( \bar{\phi}(\theta, x) - \lambda^* \right) dx \right]; \]
(ii) $q^*$ has the pooling property; and (iii) $q^*$ satisfies the integral constraint (16).
The solution to the seller's problem is then obtained by combining standard Lagrange
methods with the ironing procedure developed by Toikka (2011) that extends the approach
of Myerson (1981). In particular, Proposition 3 provides a characterization of the general
solution, and suggests an algorithm to compute it.
To gain some intuition for the shape of the solution, observe that the problem is piecewise-linear
(but concave) in the allocation. Thus, absent the integral constraint, the seller would
choose an allocation that takes values at the kinks, i.e. $q^*(\theta) \in \{-1, 0, 1\}$ for all $\theta$. In
other words, the seller would offer a one-experiment menu consisting of a flat price for the
complete-information structure. It will indeed be optimal for the seller to adopt flat pricing
in a number of circumstances. The main novel result of this section is that the seller can
(sometimes) do better by offering one additional experiment.
Proposition 4 (Optimal Menu) An optimal menu consists of at most two experiments.
1. The first experiment is fully informative.
2. The second experiment (contains a signal that) perfectly reveals one state.
To obtain some intuition for this result, consider a relaxed problem where the seller
contracts separately with buyers $\theta < 1/2$ and $\theta > 1/2$. Because of the linearity of the
problem, the optimal menu for each group is degenerate: the seller offers the perfectly
revealing experiment at a flat price. Now consider the solution to the full problem. If
the optimal prices for each separate group are similar, the seller offers the fully revealing
experiment at an intermediate price. If they are quite different, the seller prefers to distort
the information sold to one group, so as to maintain a high price for the fully revealing
experiment. However, the linearity of the environment prevents the seller from offering more
than one distorted experiment, i.e., no further versioning is optimal.
We now illustrate the optimal menu under flat and discriminatory pricing separately.
5.2.1 Flat Pricing
Let types be uniformly distributed, $F(\theta) = \theta$, and consider the virtual values $\phi(\theta, q)$ for
$q < 0$ and $q > 0$ separately. These values are constant in $q$, hence we refer to $\phi(\theta, -1)$ and
$\phi(\theta, 1)$ respectively. For a given value of the multiplier $\lambda$, the allocation that maximizes
the expected virtual surplus in Proposition 3 assigns $q^*(\theta) = -1$ to all types $\theta$ for which
$\phi(\theta, -1) < \lambda$; it assigns $q^*(\theta) = 0$ to all types $\theta$ for which $\phi(\theta, -1) > \lambda > \phi(\theta, 1)$; and
$q^*(\theta) = 1$ for all types $\theta$ for which $\phi(\theta, 1) > \lambda$.
Figure 4: Uniform Distribution: Virtual Values and Optimal Allocation
Figure 4 illustrates the resulting allocation rule. In order to satisfy the constraint, the
optimal value of the multiplier $\lambda^*$ must identify two symmetric threshold types $(\theta_1, \theta_2)$ that
separate types receiving the efficient allocation $q = 0$ from those receiving no information at
all, $q = -1$ or $q = 1$. The allocation then clearly satisfies the integral constraint (16). More
generally, if both virtual values $\phi(\theta, -1)$ and $\phi(\theta, 1)$ are strictly increasing in $\theta$, the optimal
menu consists of charging the monopoly price for the fully informative experiment.
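The search for the multiplier in the uniform case can be sketched numerically. The code below is an illustration under the assumption that both virtual values are increasing (so no ironing is needed); the function names, the grid size and the bisection bracket are our own choices. It assigns each type its pointwise-optimal allocation for a given multiplier and bisects on the multiplier until the allocation integrates to approximately zero.

```python
def allocation(theta, lam, phi_minus, phi_plus):
    """Pointwise maximiser of the virtual surplus for a given multiplier lam."""
    if phi_plus(theta) > lam:
        return 1
    if phi_minus(theta) > lam:
        return 0
    return -1

def solve_multiplier(phi_minus, phi_plus, n=2000, tol=1e-9):
    """Bisect on lam until the allocation integrates to (approximately) zero on [0, 1]."""
    grid = [(k + 0.5) / n for k in range(n)]  # midpoint rule

    def integral(lam):
        return sum(allocation(t, lam, phi_minus, phi_plus) for t in grid) / n

    lo, hi = 0.0, 2.0  # the integral is non-increasing in lam
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if integral(mid) > 0:
            lo = mid   # too many types receive q = 1: raise the multiplier
        else:
            hi = mid
    return (lo + hi) / 2

# Uniform types: f = 1 and F(theta) = theta.
def phi_minus(theta):
    return 2 * theta          # theta * f + F

def phi_plus(theta):
    return 2 * theta - 1      # (theta - 1) * f + F

lam = solve_multiplier(phi_minus, phi_plus)
print(lam, lam / 2, (1 + lam) / 2)
```

Up to the grid resolution, the multiplier is 1/2 and the threshold types are 1/4 and 3/4, so the fully informative experiment is sold at a price of about 1/4, which is also the price that maximizes $p(1 - 2p)$, the revenue from a flat price $p$ under uniform types.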
Flat pricing is optimal under weaker conditions than strictly increasing virtual values.
We now summarize the sufficient conditions for this result.
Proposition 5 (Flat Pricing) Suppose any of the following conditions hold:
1. $F(\theta) + \theta f(\theta)$ and $F(\theta) + (\theta - 1) f(\theta)$ are strictly increasing;
2. the density $f(\theta) = 0$ for all $\theta > 1/2$ or $\theta < 1/2$;
3. the density $f(\theta)$ is symmetric around $\theta = 1/2$.
Then the optimal menu contains only the fully informative experiment.
An implication of Proposition 5 is that the seller offers a second experiment only if ironing
is required. At the same time, there exist examples with non-monotone virtual values and
one-item menus. Symmetric distributions are one such instance: for any distribution function
$F(\theta)$, whatever its hazard rate, the solution to the restricted problem on $[0, 1/2]$ or $[1/2, 1]$ is a cutoff
policy. Because the cutoffs under a symmetric distribution are symmetric about $1/2$, it
follows that the solutions to the two subproblems satisfy the integral constraint, and hence
provide a tight upper bound to the seller's profits.
5.2.2 Discriminatory Pricing
The monotonicity conditions of Proposition 5 that guarantee increasing virtual values are
not entirely appealing in our context. When types correspond to interim beliefs, it is quite
natural to consider bimodal densities (e.g., a well-informed population in a binary model)
that fail the regularity conditions and introduce the need for ironing. For example, starting
from the common prior, if buyers observe binary signals, a bimodal distribution of beliefs
would result, with types holding beliefs above and below the mean of the common prior.
In general, non-monotone densities and distributions violating the standard monotonicity
conditions are quite a natural benchmark. Therefore, ironing is not a technical curiosity in
our case, but rather a technique that becomes unavoidable because of the features of the
information environment.
We now illustrate the ironing procedure when virtual values are not monotone, and how
it leads to a richer (two-item) optimal menu. Figure 5 considers a bimodal distribution of
types, illustrating the probability density function and the associated virtual values.14
We therefore consider the "ironed" versions of each virtual value, and we identify the
equilibrium value of the multiplier $\lambda^*$. In this case, the multiplier must be at the flat level of
one of the virtual values: suppose not, apply the procedure from the regular case, and verify
that it is impossible to satisfy the integral constraint.
14 In particular, the distribution in the left panel is a mixture (with equal weights) of two Beta distributions with parameters $(8, 30)$ and $(60, 30)$.
Figure 5: Probability Density Function and Virtual Values
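The non-monotonicity of the virtual values under this bimodal density can be reproduced with a few lines of code. The sketch below is only indicative: it assumes the two parameter pairs are the usual shape parameters $(a, b)$ of the Beta distribution, approximates the distribution function with a midpoint rule, and uses function names of our own choosing.

```python
from math import exp, lgamma, log

def beta_pdf(x, a, b):
    """Density of a Beta(a, b) distribution at x (zero outside the open unit interval)."""
    if x <= 0 or x >= 1:
        return 0.0
    log_norm = lgamma(a + b) - lgamma(a) - lgamma(b)
    return exp(log_norm + (a - 1) * log(x) + (b - 1) * log(1 - x))

def density(x):
    """Equal-weight mixture of Beta(8, 30) and Beta(60, 30)."""
    return 0.5 * beta_pdf(x, 8, 30) + 0.5 * beta_pdf(x, 60, 30)

def cdf(x, steps=2000):
    """Midpoint-rule approximation of the distribution function at x."""
    h = x / steps
    return sum(density((k + 0.5) * h) for k in range(steps)) * h

def virtual_values(x):
    """Return (theta * f + F, (theta - 1) * f + F), the virtual values for q < 0 and q > 0."""
    f, big_f = density(x), cdf(x)
    return x * f + big_f, (x - 1) * f + big_f

for theta in (0.05, 0.2, 0.45, 0.7):
    print(theta, virtual_values(theta))
```

Both virtual values fail to be increasing: the one for $q < 0$ is higher at $\theta = 0.2$ than at $\theta = 0.45$, and the one for $q > 0$ is higher at $\theta = 0.05$ than at $\theta = 0.2$, which is why ironing is needed here.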
Figure 6 illustrates the optimal two-item menu. Note that for types in the "pooling"
region $\theta \in [0.17, 0.55]$, the level of the allocation ($q^* \approx -0.22$) is uniquely pinned down by
the pooling property and by the integral constraint.
Figure 6: Ironed Virtual Values and Optimal Allocation
In both examples, extreme types with a low value of information are excluded from the
purchase of informative signals. In the latter example, the monopolist is offering a second
information structure that is tailored towards relatively lower types. This structure (with
$q < 0$) contains one signal that perfectly reveals the high state. This experiment is relatively
unattractive for higher types, and it allows the monopolist to increase the price for the large
mass of types located around $\theta \approx 0.7$.
The properties of the optimal discriminatory pricing scheme reflect the fact that the type
structure is quite different from the standard screening environment. While information rents
$U(\theta)$ peak at $\theta = 1/2$, the ex ante least informed type $\theta = 1/2$ need not purchase the fully
revealing experiment, despite having the highest value of information.15 In the example
above, inducing the types around $\theta = 1/2$ to purchase the fully informative experiment
requires imposing further distortions and charging a lower price for the second experiment.
This leads to a loss of revenue from types around $\theta \approx 0.2$. Because such types are quite
frequent, this loss more than offsets the gain in revenue from types around $1/2$. In other
words, whenever discriminatory pricing is optimal, the menu depends on the distribution of
types in a rich way.
15 This possibility can also be shown in three-type examples with arbitrary state spaces.
5.2.3 Beyond Binary States
As mentioned, the analysis of the optimal menu with $N$ states and actions involves all the
difficulties associated with multidimensional screening. Nevertheless, the construction of
the optimal menu under discriminatory pricing extends to multidimensional settings. We
develop an example that preserves the single-dimensional nature of the screening problem,
while illustrating how the seller can design richer experiments.
In particular, consider a setting with three states and actions, where types $\theta$ are distributed
along the rays of the two-dimensional simplex. If types are uniformly distributed on the
rays, it is immediate to see that the flat pricing menu is optimal. Figure 7 (left) shows the set
of participating (solid) and excluded (dashed) buyers when the fully informative experiment
is offered at the monopoly price.
However, if the distribution of types differs across rays, a richer menu may be more
profitable. In particular, let types along one ray be distributed according to a mixture of
Beta distributions (as in Figure 5), maintaining the uniform distribution along the other two
rays. Then discriminatory pricing is optimal, as shown in Figure 7 (right).
Figure 7: Optimal Allocations under Uniform and Beta Distributions
In particular, the seller offers the fully revealing experiment to all types closest to the
centroid, and a distorted experiment to a subset of types on the ray through the vertex $\omega_1$.
This experiment is given by
\[
I_1 =
\begin{array}{c|ccc}
 & s_1 & s_2 & s_3 \\ \hline
\omega_1 & 1 & 0 & 0 \\
\omega_2 & 0.35 & 0.65 & 0 \\
\omega_3 & 0.35 & 0 & 0.65
\end{array}
\]
and perfectly reveals states $\omega_2$ and $\omega_3$, but leads buyers to take action $a_1$ more often than
would be optimal.
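The sense in which this experiment reveals two of the three states can be made concrete by computing posteriors. In the sketch below the prior $(1/2, 1/4, 1/4)$ is a hypothetical choice made only for illustration, and the function name is ours.

```python
from fractions import Fraction

# Rows are states omega_1..omega_3, columns are signals s_1..s_3.
I1 = [
    [Fraction(1), Fraction(0), Fraction(0)],
    [Fraction(35, 100), Fraction(65, 100), Fraction(0)],
    [Fraction(35, 100), Fraction(0), Fraction(65, 100)],
]

def posterior(theta, experiment, signal):
    """Posterior over states after observing the signal with the given column index."""
    weights = [t * row[signal] for t, row in zip(theta, experiment)]
    total = sum(weights)
    return [w / total for w in weights]

prior = (Fraction(1, 2), Fraction(1, 4), Fraction(1, 4))  # hypothetical prior
for signal in range(3):
    print(signal + 1, posterior(prior, I1, signal))
```

Signals $s_2$ and $s_3$ lead to degenerate posteriors on $\omega_2$ and $\omega_3$, while $s_1$ is also sent with probability 0.35 in each of those two states, so the buyer who follows it takes action $a_1$ in some states where another action would have been correct.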
6 Implications for observable variables
The solution of the screening problem characterized in the previous sections is related to the
practical design of information products by a monopolist. Indeed, our seller's problem is to
degrade the quality of her information and to set the prices of various versions in order to
segment the population of buyers.16
Our results describe the optimal patterns for the degrading of the information offered by
a monopolist. In other words, one should not observe arbitrarily damaged goods. Instead,
the optimal menu of products contains "directionally" informative signals. In the special
case of binary action, one should only observe type-I or type-II errors. Furthermore, absent
costs of transmitting information or other sources of nonlinearity (e.g., buyers' risk aversion),
one should not observe multiple distortions of the same kind, either.
While seemingly abstract patterns, these results can inform the design of information
products offered by several large data brokers. For example, in the retargeting application,
Blackwell experiments correspond to "data packages" (contracts in which the seller of
information commits to identifying consumers with given attributes), and individual signals
correspond to the realized values of these attributes.17 In this context, acquiring a data
package that reveals coarse information about the consumer's attributes will inevitably lead to
both type-I and type-II errors in the targeting of an audience. The optimality of directional
information distortions then suggests limiting information disclosure (i.e., package design) to
attributes correlated with very high- or low-value customers only.
Menu pricing is widely used among data brokers. As one would expect,
most data sellers seek to offer higher-quality, high-markup products, in addition to partially
informative data packages. In particular, one may consider data management platforms
16 Our setting differs from the "damaged goods" model of Deneckere and McAfee (1996) because degrading quality is costless. In many applications, however, it is naturally costly to introduce noise in the data. This is the case, for instance, when the seller is concerned with preserving the anonymity of her information.
17 For example, see http://www.acxiom.com/data-packages/ for an exhaustive list of the packages sold by the data broker Acxiom.
(DMPs), customized software that automates the integration of 1st- and 3rd-party data,
enabling websites to track their users and place them into more precise market segments, as
information products aimed at high-end buyers. Indeed, DMPs promise to use all and only
the relevant data to guide actions, while advertising campaigns run on the basis of limited
targeting criteria provide a low-cost option.18
Menu pricing for information is also used by LinkedIn, which sells member profiles in two
versions: a Lite version that allows one to condition searches on a limited number of criteria;
and a Premium version that grants access to the entire database. Finally, menus of
information products are also offered in contexts where private beliefs constitute a less relevant
dimension of buyer heterogeneity. For example, buyers of time-sensitive data can be screened
according to their value for timely information, i.e., their discount rates. Consequently,
sellers such as Bloomberg or Thomson-Reuters offer menus of essentially homogeneous products
that differ only in the timing of their availability.19
Returning to online advertising markets, information is, of course, also sold in an indirect
way. For instance, large sellers of online space effectively bundle data and advertising
products by offering a menu of targeting opportunities. This naturally raises the question of
richer contract spaces. For example, a large seller of advertising, such as Google or Yahoo!,
may charge different prices for its space as a function of the information requested by the
buyer (i.e., the desired degree of targeting). In this sense, our approach is better suited to
analyzing a data broker's pricing problem when merchants wish to buy retargeted "advertising
products" from a different seller. More generally, our results apply to any transaction where
the advertising space is priced separately from the information required to better allocate
that space.
7 Conclusions
We have examined the problem of a monopolist who sells incremental information to privately
informed buyers. We have deliberately focused on the "packaging" or versioning problem of
a seller who is (a) uninterested in the buyer's actions, and (b) free to acquire and degrade
information. Fundamental to the seller's incentives to degrade information is the Bayesian
nature of the buyer's problem: on the one hand, private beliefs widen the scope for price
18 Acxiom offers "Data Integration" services in multiple plans, and Oracle acquired the Bluekai Data Management Platform in 2014. For a detailed description of DMPs and a comparison with direct sales of data, see "Who Do Advertisers Think You Are?", The New York Times, November 30, 2012.
19 For example, the Consumer Sentiment Index released by Thomson-Reuters and the University of Michigan, and PAWWS Financial Network's portfolio accounting system (Shapiro and Varian, 1999) were initially available in different versions, based on the timing of their release.
discrimination through directional information; on the other hand, the linearity of the buyer's
utility function limits the use of versioning.
In general, the seller's problem consists of screening within and across groups of buyers
with congruent priors. When states and actions are binary, the optimal mechanism involves
at most two experiments, and we obtained sufficient conditions for flat pricing to be optimal.
With arbitrary states and actions, a two-type example illustrates the seller's ability to exploit
relative differences in the buyers' beliefs to reduce information rents while limiting the surplus
that must be sacrificed to provide incentives.
These results provide only a first pass at understanding the trade-offs involved in screening
through information products. While we have relied on belief heterogeneity to motivate the
sale of incremental information, there may be several sources of heterogeneous demand for
information. Buyers may differ in their payoffs from taking each action, in the relevance
of their decision (i.e., their "stakes"), or in their ability to process additional signals. Each
of these extensions can be implemented in the framework we have outlined. Combining
different sources of heterogeneity (e.g., beliefs and stakes) appears more challenging, but
promises deeper insights into the optimal allocation of information. Further interesting
extensions include the role of frictions in the acquisition and transmission of information,
and the effect of competition among sellers of information (i.e., formalizing the intuition that
each seller will be able to extract the surplus related to the innovation element of his database).
Appendix

Proof of Lemma 2. The argument for part (1.) is given in the text. For part (2.), consider
any individually rational and incentive compatible direct mechanism $\mathcal{M} = \{I(\theta), t(\theta)\}$.
Fix a type $\theta$ with the associated experiment $I(\theta)$ and let $\pi_{ij}(\theta)$ denote the conditional
probability of signal $s_j$ in state $\omega_i$ under experiment $I(\theta)$. A consequence of Lemma 1 is
that each type $\theta$ has a different optimal action for each signal in $I(\theta)$. Without loss of
generality, we then arrange the signals in $I(\theta)$ so that type $\theta$ takes action $a_i$ when observing
$s_i$. If type $\theta$ never takes action $a_i$, we drop signal $s_i$ from $I(\theta)$, i.e., we set the $i$-th column
of $\pi(\theta)$ to zero. Because beliefs satisfy $\sum_i \theta_i = 1$, we can write the value of information (3) as
$$
V(I(\theta), \theta) = \left[ \sum_{i=1}^{N} \theta_i \pi_{ii}(\theta) - \max_i \theta_i \right]^+
= \left[ \sum_{i=1}^{N-1} \theta_i \left( \pi_{ii}(\theta) - \pi_{NN}(\theta) \right) + \pi_{NN}(\theta) - \max_i \theta_i \right]^+ . \tag{18}
$$
Now define $\varepsilon(\theta) := 1 - \max_i \pi_{ii}(\theta)$, and construct a new experiment $I'(\theta)$ where
$\pi'_{ii}(\theta) = \pi_{ii}(\theta) + \varepsilon(\theta)$ for all $i$ and for all $\theta$. For each state $\omega_i$, the corresponding off-diagonal entries
$\pi_{ij}(\theta)$, $j \neq i$, are correspondingly reduced by $\varepsilon(\theta)$, without further restrictions on how this
operation is performed. It then follows from (18) that
$$
\left[ V(I'(\theta), \theta) - \varepsilon(\theta) \right]^+ = V(I(\theta), \theta).
$$
Furthermore, for all types $\theta' \neq \theta$, the value of mimicking type $\theta$ increases by at most $\varepsilon(\theta)$
(strictly less, if type $\theta' \neq \theta$ responds to the signals in $I(\theta)$ differently from type $\theta$). Suppose
type $\theta'$ chooses action $a_{i(j)}$ upon observing signal $s_j$ from experiment $I(\theta)$. We then have
$$
V(I(\theta), \theta') = \left[ \sum_{j=1}^{N} \theta'_{i(j)} \pi_{i(j)j}(\theta) - \max_i \theta'_i \right]^+ .
$$
Furthermore, because $i(j)$ need not coincide with $j$, the entries $\pi_{i(j)j}$ in $V(I(\theta), \theta')$ increase
at most by $\varepsilon(\theta)$. Therefore, we have
$$
\left[ V(I'(\theta), \theta') - \varepsilon(\theta) \right]^+ \leq V(I(\theta), \theta').
$$
Consequently, the direct mechanism $\mathcal{M}' = \{I'(\theta), t(\theta) + \varepsilon(\theta)\}$ is also individually rational
and incentive compatible. Moreover, all experiments $I'(\theta)$ are locally non-dispersed by
construction, and all transfers are weakly greater than in the original mechanism $\mathcal{M}$. ∎
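The construction in this proof can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses the matching-payoff value $\sum_j \max_i \theta_i \pi_{ij} - \max_i \theta_i$, it removes the off-diagonal mass proportionally within each row (the proof leaves this choice open), and the experiment and the two types are hypothetical numbers.

```python
def value(pi, theta):
    """Value of experiment pi for prior theta (payoff 1 for matching the state)."""
    n = len(theta)
    gain = sum(max(theta[i] * pi[i][j] for i in range(n)) for j in range(n))
    return max(gain - max(theta), 0.0)


def sharpen(pi):
    """Add eps = 1 - max_i pi_ii to every diagonal entry.

    Assumption: each row's off-diagonal entries are scaled down proportionally
    so that the row still sums to one; any other way of removing eps would do.
    """
    n = len(pi)
    eps = 1.0 - max(pi[i][i] for i in range(n))
    sharpened = []
    for i in range(n):
        off_mass = 1.0 - pi[i][i]
        scale = (off_mass - eps) / off_mass if off_mass > 0 else 0.0
        sharpened.append(
            [pi[i][j] + eps if j == i else pi[i][j] * scale for j in range(n)]
        )
    return sharpened, eps


# Hypothetical experiment: rows are states, columns are signals.
original = [
    [0.90, 0.05, 0.05],
    [0.20, 0.70, 0.10],
    [0.10, 0.30, 0.60],
]
own_type = [0.3, 0.3, 0.4]    # follows every signal of `original`
other_type = [0.1, 0.2, 0.7]  # a potential mimicker

sharpened, eps = sharpen(original)
print(value(original, own_type), value(sharpened, own_type) - eps)
print(value(original, other_type), max(value(sharpened, other_type) - eps, 0.0))
```

In this example the intended buyer is exactly as well off after paying the extra $\varepsilon$, while the mimicking type gains nothing, in line with the two displayed relations in the proof.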
Proof of Lemma 3. We know from Lemma 2 that at least one type must buy the fully
revealing experiment $I^*$. Suppose only type $\theta^L$ buys $I^*$ as part of the optimal menu. Then
the price of $I^*$ is at most $V(I^*, \theta^L)$. By incentive compatibility, if the high type $\theta^H$ purchases
$I \neq I^*$, it must be that $t(\theta^H) < V(I^*, \theta^L)$. Therefore, eliminating the experiment $I(\theta^H)$ from
the menu strictly improves the seller's profits. ∎
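Proof of Proposition 2. We know from Lemma 3 that the high type $\theta^H$ purchases the
fully informative experiment. We now derive the optimal experiment $I(\theta^L)$. Suppose (as we
later verify) that both types $\theta^H$ and $\theta^L$ would choose action $a_i$ after observing signal $s_i$ from
the optimal experiment $I(\theta^L)$. The seller then chooses $\pi_{ii} \in [0, 1]$, $i = 1, \ldots, n$, to solve the
linear program (7) subject to (5). We can write the Lagrangian as
$$
L = (1 - \gamma) \left[ \sum_{i=1}^{n} \pi_{ii} \left( \theta^L_i + \lambda \left( \theta^L_i - \theta^H_i \right) \right) - \lambda \max_i \theta^L_i + \lambda \max_i \theta^H_i + \lambda U(\theta^H) \right] - \gamma U(\theta^H),
$$
where $\lambda \geq 0$ represents the shadow value of increasing the informational rent $U(\theta^H)$.
It follows immediately that
$$
\pi_{ii} =
\begin{cases}
0 & \text{if } \theta^L_i + \lambda \left( \theta^L_i - \theta^H_i \right) < 0, \\
[0, 1] & \text{if } \theta^L_i + \lambda \left( \theta^L_i - \theta^H_i \right) = 0, \\
1 & \text{if } \theta^L_i + \lambda \left( \theta^L_i - \theta^H_i \right) > 0,
\end{cases} \tag{19}
$$
and that
$$
U(\theta^H) =
\begin{cases}
\sum_{i=1}^{n} \left( \theta^H_i - \theta^L_i \right) \pi_{ii} + \max_i \theta^L_i - \max_i \theta^H_i & \text{if } \gamma < \lambda (1 - \gamma), \\
0 & \text{if } \gamma \geq \lambda (1 - \gamma).
\end{cases} \tag{20}
$$
To characterize the solution, we need only identify the value of the multiplier $\lambda^*$. We proceed
in the following steps.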
1. Recall states $\omega_i$ are ordered by increasing likelihood ratio of beliefs $\theta^L_i / \theta^H_i$, and consider
state $\omega_1$. If $\omega_1$ is the only state $i$ for which $\pi_{ii} < 1$, (19) implies that the multiplier is
given by
$$
\lambda^* = \min_i \frac{1}{\theta^H_i / \theta^L_i + 1} = \frac{\theta^L_1}{\theta^L_1 + \theta^H_1}. \tag{21}
$$
Therefore, $\pi_{11}$ solves (5) with equality, i.e., the candidate optimal value is $\pi_{11} = k_1$
as defined in (11). If $k_1 \geq 0$ and $\theta^L_1 / \theta^H_1 < \gamma$, then (20) implies $U(\theta^H) = 0$ and (19)
implies we must set $\pi_{ii} = 1$ for all $i \geq 2$. If instead $\gamma < \theta^L_1 / \theta^H_1$ then $\pi_{11} = 1$
and $U(\theta^H) > 0$ solves (5), i.e., $U(\theta^H) = \max_i \theta^L_i - \max_i \theta^H_i$. Finally, if $k_1 < 0$ and
$\theta^L_1 / \theta^H_1 < \gamma$, there is no solution with the value of $\lambda^*$ given in (21).
2. Suppose the above candidate solution is not, in fact, the optimum. The value of
the multiplier must then satisfy $\lambda^* > \theta^L_1 / (\theta^L_1 + \theta^H_1)$, which implies the non-negativity
constraint binds, i.e., $\pi_{11} = 0$. This means the optimal partially informative experiment
has less than full rank. The next candidate value for the multiplier $\lambda^*$ is $\theta^L_2 / (\theta^L_2 + \theta^H_2)$.
Again, if $k_2 \geq 0$ and $\theta^L_2 / \theta^H_2 < \gamma$, then (20) implies $U(\theta^H) = 0$ and (19) implies $\pi_{ii} = 1$
for all $i \geq 3$. If instead $\gamma < \theta^L_2 / \theta^H_2$ then $\pi_{22} = 1$ and $U(\theta^H) > 0$ solves (5)
with equality. Finally, if $k_2 < 0$ and $\theta^L_2 / \theta^H_2 < \gamma$, there is no solution with the value
of $\lambda^* = \theta^L_2 / (\theta^L_2 + \theta^H_2)$ and the next candidate is $\lambda^* = \theta^L_3 / (\theta^L_3 + \theta^H_3)$, which implies
$\pi_{11} = \pi_{22} = 0$.
3. The procedure iterates until either (a) state $i^*$ is reached, i.e., $\theta^L_i / \theta^H_i > \gamma$, or (b) state
$j^*$ is reached, i.e., $k_j \geq 0$.
4. We must then verify that the off-diagonal entries $\pi_{ij}$, $i \neq j$, can be set to ensure both
types $\theta^H$ and $\theta^L$ choose action $a_i$ when observing signal $s_i$. This requires
$$
\pi_{ii} \theta_i \geq \pi_{ji} \theta_j
$$
for both $\theta^L$ and $\theta^H$ and for all $j < i$, because the signal matrix can be taken to be
lower triangular. Fix a signal $i$ and an alternative action $a_j$. We need
$$
\frac{\theta_i}{\theta_j} \geq \frac{\pi_{ji}}{\pi_{ii}}.
$$
By construction we have $\theta^H_i / \theta^H_j < \theta^L_i / \theta^L_j$ for all $i > j$, hence, we need only worry
about the incentives of type $\theta^H$. We can then assign the probabilities $\pi_{ji}$ following
the procedure described in Algorithm 1: for any $i' > i$, we make type $\theta^H$ indifferent
between following the recommendation of signal $i'$ and choosing action $a_i$; we do so
beginning with $\pi_{jn}$ and proceeding backward as long as required; if the procedure
assigns positive weight to $\pi_{ji}$ then it must be that $\pi_{ji} = 1 - \sum_{s=i+1}^{n} \theta^H_s / \theta^H_j$. It follows
that type $\theta^H$ has strict incentives to follow the recommendation of signal $i$. Note that
$\pi_{ii} \in \{k_i, 1\}$, hence, if $\pi_{ii} = k_i$, we have by definition
$$
k_i \theta^H_i = \frac{\sum_{s=i+1}^{n} \left( \theta^L_s - \theta^H_s \right) - \max_j \theta^L_j + \max_j \theta^H_j}{\theta^H_i - \theta^L_i} \, \theta^H_i .
$$
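Now, because $\arg\max_j \theta^L_j > i$ we can bound the above expression by
$$
k_i \theta^H_i > \frac{\theta^H_i}{\theta^H_i - \theta^L_i} \left( \theta^H_j - \sum_{s=i+1}^{n} \theta^H_s \right) > \theta^H_j - \sum_{s=i+1}^{n} \theta^H_s = \theta^H_j \pi_{ji}.
$$
A fortiori, type $\theta^H$ has strict incentives to choose $a_i$ if $\pi_{ii} = 1$.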
5. Finally, the seller may choose $\pi_{ij}$ such that the high type does not follow every
recommendation. However, it is easy to see that this is not optimal. By revealed preference,
the high type would obtain a higher utility than under the above construction, for
every experiment $\pi$. Thus, the incentive compatibility constraint becomes harder to
satisfy, and the value of the seller's problem decreases.
This ends the proof. ∎
Algorithm 1 (Optimal Menu with Binary Type) We construct the two optimal experiments
as a function of the distribution of types. This algorithm contains three sub-routines,
beginning with the construction of the allocation described in part (1.) of Proposition 2. We
refer to this allocation as the Maximally Informative No-Rents Experiment.
Maximally Informative No-Rents Experiment Order states $i$ in increasing order of
the likelihood ratios $\theta^L_i / \theta^H_i$ and set $U(\theta^H) = 0$. Let $\pi_{ii} = 1$ for $i = 2, \ldots, n$ and solve
(8) with equality with respect to $\pi_{11}$. The solution is given by $k_1$ in (11). If $k_1 \geq 0$,
stop. If $k_1 < 0$, set $\pi_{11} = 0$ and $\pi_{ii} = 1$ for $i = 3, \ldots, n$. Solve (8) with equality with
respect to $\pi_{22}$, which yields $k_2$. If $k_2 \geq 0$, stop, otherwise iterate the procedure. The
procedure terminates at state $j^*$ defined in (12).
We now use the distribution of types to identify which step of the construction yields the
profit-maximizing experiment.
Optimal Experiment Begin at state $j^*$. If $\gamma > \theta^L_{j^*} / \theta^H_{j^*}$ the Maximally Informative
No-Rents Experiment is part of the optimal menu. If $\gamma < \theta^L_{j^*} / \theta^H_{j^*}$, set $\pi_{j^* j^*} = 1$ and
consider $j^* - 1$. If $\gamma > \theta^L_{j^*-1} / \theta^H_{j^*-1}$, stop, and choose $U(\theta^H) > 0$ to satisfy (8) with
equality. Otherwise set $\pi_{j^*-1, j^*-1} = 1$ and consider $j^* - 2$, iterating until reaching state
$i^*$ defined in (10) and adjusting the rent $U(\theta^H)$ to satisfy the incentive constraint.
Thus, as the fraction of high types increases, the optimal menu involves potentially steeper
distortions. Conversely, when $i^* = 1$, the menu involves bunching both types at the fully
revealing experiment. Finally, we illustrate a simple procedure to assign the off-diagonal
probabilities to the partially informative experiment.
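Off-Diagonal Entries Suppose the optimal menu sets $\pi_{ii} = 0$ for all $i < \hat{\imath}$, for some $\hat{\imath} > 1$.
To assign the off-diagonal entries $\pi_{i,j}$ with $i < \hat{\imath}$ and $j \geq \hat{\imath}$, fix a state $i < \hat{\imath}$ and
begin with signal $n$. Let $\pi_{in} = \min\{\theta^H_n / \theta^H_i, 1\}$. If $\theta^H_n / \theta^H_i > 1$, stop. If $\theta^H_n / \theta^H_i < 1$,
set $\pi_{i,n-1} = \min\{\theta^H_{n-1} / \theta^H_i, 1 - \theta^H_n / \theta^H_i\}$, and proceed backwards until reaching $\pi_{i,i+1}$.
The entries so constructed sum to one because by the definition of $k_j$ in (11), we have
$$
\theta^H_{\hat{\imath}} > k_{\hat{\imath}} \theta^H_{\hat{\imath}} > \theta^H_i \left( 1 - \sum_{j=\hat{\imath}+1}^{n} \theta^H_j / \theta^H_i \right).
$$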
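The backward fill described in the Off-Diagonal Entries sub-routine can be sketched as a greedy loop. This is one reading of the procedure, not the authors' code: each entry is capped at $\pi_{jj} \theta^H_j / \theta^H_i$ so that the high type still weakly prefers $a_j$ after signal $s_j$ (with $\pi_{jj} = 1$ this is the cap $\theta^H_j / \theta^H_i$ in the text), and the function name, the zero-based indexing, and the numbers are hypothetical.

```python
def off_diagonal_row(theta_h, diag, i, tol=1e-12):
    """Spread the unit mass of state i over signals j > i, starting from the last.

    theta_h : high type's prior over states
    diag    : diagonal entries pi_jj already chosen for the experiment
    i       : index (zero-based) of a state whose own signal was dropped
    Returns the row of the signal matrix and any mass left unassigned.
    """
    n = len(theta_h)
    row = [0.0] * n
    remaining = 1.0
    for j in range(n - 1, i, -1):
        # Largest pi_ij keeping the high type obedient after signal s_j:
        # theta_h[j] * diag[j] >= theta_h[i] * pi_ij.
        cap = diag[j] * theta_h[j] / theta_h[i]
        row[j] = min(cap, remaining)
        remaining -= row[j]
        if remaining <= tol:
            break
    return row, remaining


theta_h = [0.5, 0.2, 0.3]  # hypothetical high-type prior
diag = [0.0, 1.0, 1.0]     # signal s_1 dropped, states 2 and 3 revealed
row, leftover = off_diagonal_row(theta_h, diag, i=0)
print(row, leftover)
```

A positive leftover would signal that the caps are too tight for the row to sum to one; the inequality at the end of the sub-routine is what rules this out in the paper's construction.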
Proof of Lemma 4. Suppose the allocation $q(\theta)$ is optimal, and that, for some type $\theta$, the
expression in brackets on the right-hand side of (14) is negative, i.e.,
$$
\theta q(\theta) - \max\{q(\theta), 0\} + \min\{\theta, 1 - \theta\} \leq 0.
$$
Then we can replace $q(\theta)$ with $\tilde{q}(\theta) = -1$ if $\theta < 1/2$ and with $\tilde{q}(\theta) = 1$ if $\theta \geq 1/2$. This
does not change the value of the allocation for type $\theta$ and the price the seller can charge,
which are both zero. However, the experiment $\tilde{q}(\theta)$ is less informative, and hence relaxes
incentive constraints for all $\theta' \neq \theta$. ∎
Proof of Lemma 5. We begin with necessity. Consider an incentive compatible allocation
$q$. For any two types $\theta_2 > \theta_1$ we have
$$
V(q_1, \theta_1) - t_1 \geq V(q_2, \theta_1) - t_2, \qquad
V(q_2, \theta_2) - t_2 \geq V(q_1, \theta_2) - t_1 .
$$
Combining the two inequalities,
$$
V(q_2, \theta_2) - V(q_1, \theta_2) \geq t_2 - t_1 \geq V(q_2, \theta_1) - V(q_1, \theta_1).
$$
It follows from the single-crossing property of $V(q, \theta)$ that $q_2 \geq q_1$, hence $q(\theta)$ is increasing.
Because the buyer's rent is differentiable with respect to $\theta$ on $[0, 1/2]$ and $[1/2, 1]$
respectively, we can compute the function $U(\theta)$ on these two intervals separately. We obtain the
expression in the text,
$$
U(1/2) = U(0) + \int_0^{1/2} V_\theta(q, \theta) \, d\theta = U(1) - \int_{1/2}^{1} V_\theta(q, \theta) \, d\theta .
$$
By the envelope theorem $V_\theta(q, \theta) = q + 1$ for $\theta < 1/2$ and $V_\theta(q, \theta) = q - 1$ for $\theta > 1/2$. Finally,
because $U(0) = U(1)$ we obtain
$$
\int_0^1 q(\theta) \, d\theta = 0 .
$$
We now turn to sufficiency. Suppose the allocation $q$ is increasing and satisfies the integral
constraint. Then construct the following transfers
$$
t(\theta) =
\begin{cases}
\theta q(\theta) - \max\{q(\theta), 0\} - \int_0^{\theta} q(x) \, dx & \text{if } \theta < 1/2, \\
\theta q(\theta) - \max\{q(\theta), 0\} + \int_{\theta}^{1} q(x) \, dx & \text{if } \theta \geq 1/2 .
\end{cases} \tag{22}
$$
Because the allocation satisfies the integral constraint, we have
$$
\int_0^{\theta} q(x) \, dx = - \int_{\theta}^{1} q(x) \, dx ,
$$
and we can express all transfers $t(\theta)$ as
$$
t(\theta) = q(\theta) \theta - \max\{q(\theta), 0\} - \int_0^{\theta} q(x) \, dx .
$$
Moreover, the expected utility of type $\theta$ from reporting type $\theta'$ is given by
$$
V(q(\theta'), \theta) - t(\theta') = (\theta - \theta') q(\theta') + \int_0^{\theta'} q(x) \, dx + \min\{\theta, 1 - \theta\} .
$$
Because $q$ is monotone, the expression on the right-hand side is maximized at $\theta' = \theta$ and,
hence, the incentive constraints are satisfied. Finally, because the rent function $U(\theta) =
V(q(\theta), \theta) - t(\theta)$ is non-negative for all $\theta \in [0, 1]$, the participation constraints are satisfied
as well. ∎
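The sufficiency argument can be verified numerically for a particular allocation. The sketch below is a check under assumptions rather than part of the proof: it takes $V(q, \theta) = \theta q - \max\{q, 0\} + \min\{\theta, 1 - \theta\}$ as in the proof of Lemma 4, picks a hypothetical three-step allocation with cutoffs 0.25 and 0.75 (increasing, with zero integral), builds the transfers from the unified expression for $t(\theta)$, and tests truthful reporting and participation on a grid.

```python
def q(theta):
    """Hypothetical increasing allocation with zero integral on [0, 1]."""
    if theta < 0.25:
        return -1.0
    if theta <= 0.75:
        return 0.0
    return 1.0


def q_integral(theta):
    """Closed form of the integral of q from 0 to theta for the steps above."""
    if theta < 0.25:
        return -theta
    if theta <= 0.75:
        return -0.25
    return -0.25 + (theta - 0.75)


def value(q_level, theta):
    """V(q, theta) = theta*q - max{q, 0} + min{theta, 1 - theta}."""
    return theta * q_level - max(q_level, 0.0) + min(theta, 1.0 - theta)


def transfer(theta):
    """t(theta) = theta*q(theta) - max{q(theta), 0} - int_0^theta q(x) dx."""
    return theta * q(theta) - max(q(theta), 0.0) - q_integral(theta)


def payoff(report, theta):
    """Utility of type theta who reports `report`."""
    return value(q(report), theta) - transfer(report)


grid = [k / 100 for k in range(101)]
worst_gain_from_lying = max(
    payoff(r, th) - payoff(th, th) for th in grid for r in grid
)
lowest_rent = min(payoff(th, th) for th in grid)
print(worst_gain_from_lying, lowest_rent, transfer(0.5))
```

For this allocation the transfers collapse to a single price of 0.25 for the fully informative experiment ($q = 0$) and a zero price otherwise, so the example is also an instance of flat pricing.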
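Proof of Proposition 3. We first derive the seller's objective in the usual way. Using (22),
we write the expected transfers as
$$
\int_0^1 t(\theta) \, dF(\theta) = \int_0^1 \left( \theta q(\theta) - \max\{q(\theta), 0\} \right) dF(\theta)
- \int_0^{1/2} \int_0^{\theta} q(x) \, dx \, dF(\theta) + \int_{1/2}^{1} \int_{\theta}^{1} q(x) \, dx \, dF(\theta) .
$$
Integrating the last two terms by parts, we obtain
$$
- F(1/2) \int_0^{1/2} q(x) \, dx + \int_0^{1/2} q(x) F(x) \, dx - F(1/2) \int_{1/2}^{1} q(x) \, dx + \int_{1/2}^{1} q(x) F(x) \, dx ,
$$
and hence
$$
\int_0^1 t(\theta) \, dF(\theta) = \int_0^1 \left[ \left( \theta q(\theta) - \max\{q(\theta), 0\} \right) f(\theta) - \left( F(1/2) - F(\theta) \right) q(\theta) \right] d\theta .
$$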
We now establish that the solution to the seller's problem (17) can be characterized
through Lagrangian methods. For necessity, note that the objective is concave in the
allocation rule; the set of non-decreasing functions is convex; and the integral constraint can be
weakened to the real-valued inequality constraint
$$
\int_0^1 q(\theta) \, d\theta \leq 0 . \tag{23}
$$
Necessity of the Lagrangian then follows from Theorem 8.3.1 in Luenberger (1969).
Sufficiency follows from Theorem 8.4.1 in Luenberger (1969). In particular, any
maximizer of the Lagrangian $q(\theta)$ with
$$
\int_0^1 q(\theta) \, d\theta = \bar{q}
$$
maximizes the original objective subject to the inequality constraint
$$
\int_0^1 q(\theta) \, d\theta \leq \bar{q} .
$$
Thus, any solution to the Lagrangian that satisfies the constraint solves the original problem.
Because the Lagrangian approach is valid, we can apply the results of Toikka (2011) to
solve the seller's problem for a given value of the multiplier $\lambda$ on the integral constraint.
Write the Lagrangian as
$$
\int_0^1 \left[ \left( \theta f(\theta) + F(\theta) - \lambda \right) q(\theta) - \max\{q(\theta), 0\} f(\theta) \right] d\theta .
$$
In order to maximize the Lagrangian subject to the monotonicity constraint, consider the
generalized virtual surplus
$$
\bar{J}(\theta, q) := \int_{-1}^{q} \left( \bar{\phi}(\theta, x) - \lambda^* \right) dx ,
$$
where $\bar{\phi}(\theta, x)$ denotes the ironed virtual value for allocation $x$. Note that $\bar{J}(\theta, q)$ is weakly
concave in $q$. Because the multiplier $\lambda$ shifts all virtual values by a constant, the result in
Proposition 3 then follows from Theorem 4.4 in Toikka (2011). Finally, note that $\bar{\phi}(\theta, q) \geq 0$
for all $\theta$ implies the value $\lambda^*$ is strictly positive (otherwise the solution $q^*$ would have a
strictly positive integral). Therefore, the integral constraint (23) binds. ∎
Proof of Proposition 4. From the Lagrangian maximization, we have the following
necessary conditions
$$
q^*(\theta) =
\begin{cases}
-1 & \text{if } \bar{\phi}(\theta, -1) < \lambda^*, \\
\bar{q} \in [-1, 0] & \text{if } \bar{\phi}(\theta, -1) = \lambda^*, \\
0 & \text{if } \bar{\phi}(\theta, -1) > \lambda^* > \bar{\phi}(\theta, 1), \\
\bar{q}' \in [0, 1] & \text{if } \bar{\phi}(\theta, 1) = \lambda^*, \\
1 & \text{if } \bar{\phi}(\theta, 1) > \lambda^*,
\end{cases}
$$
and
$$
\int_0^1 q^*(\theta) \, d\theta = 0 .
$$
If $\lambda^*$ coincides with the flat portion of one virtual value, then by the pooling property of
Myerson (1981), the optimal allocation rule must be constant over that interval, and the
level of the allocation is uniquely determined by the integral constraint. Finally, suppose $\lambda^*$
equals the value of $\bar{\phi}(\theta, q^*(\theta))$ over more than one flat portion of the virtual values $\bar{\phi}(\theta, -1)$
and $\bar{\phi}(\theta, 1)$. Then, we can focus, without loss of generality, on the allocation $q^*$ that assigns
experiment $q = 0$ or $q \in \{-1, 1\}$ to all types in one of the two intervals. ∎
Proof of Proposition 5. (1.) If $F(\theta) + \theta f(\theta)$ and $F(\theta) + (\theta - 1) f(\theta)$ are strictly
increasing, then ironing is not required and it follows from the analysis in the text that the optimal
solution has $q \in \{-1, 0, 1\}$ for all $\theta$.
(2.) If all types are located at one side from $1/2$ then the integral constraint has no bite
since the allocation rule $q(\theta)$ can always be adjusted on the other side to satisfy it. The
solution on each side of $1/2$ involves a cutoff type and $q \in \{-1, 0\}$ or $q \in \{0, 1\}$, both of
which result in flat pricing.
(3.) If types are symmetrically distributed, then the separately optimal menus for types
$\theta < 1/2$ and $\theta > 1/2$ are identical. Therefore, the union of the two solutions satisfies the
integral constraint, and hence solves the original problem. ∎
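For the regular case in part (1.), the pointwise conditions of Proposition 4 and the integral constraint can be solved numerically. The sketch below is a hedged illustration, not the authors' procedure: it assumes both virtual values are strictly increasing (so no ironing is needed), approximates the integral on a midpoint grid, and finds the multiplier by bisection. The function names and the uniform example are assumptions made for illustration.

```python
def allocation(theta, lam, cdf, pdf):
    """Pointwise maximizer of the Lagrangian when no ironing is required."""
    phi_minus = cdf(theta) + theta * pdf(theta)         # virtual value for q < 0
    phi_plus = cdf(theta) + (theta - 1.0) * pdf(theta)  # virtual value for q > 0
    if phi_minus < lam:
        return -1
    if phi_plus > lam:
        return 1
    return 0


def solve_multiplier(cdf, pdf, n=10000, iterations=60):
    """Bisect on the multiplier until the allocation integrates to zero."""
    grid = [(k + 0.5) / n for k in range(n)]
    # Bracket: below `lo` every type gets q = 1, above `hi` every type gets q = -1.
    lo = min(cdf(t) + (t - 1.0) * pdf(t) for t in grid) - 1.0
    hi = max(cdf(t) + t * pdf(t) for t in grid) + 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        mean_q = sum(allocation(t, mid, cdf, pdf) for t in grid) / n
        if mean_q > 0:
            lo = mid  # too many types receive q = 1: raise the multiplier
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Uniform types on [0, 1]: F(theta) = theta, f(theta) = 1.
uniform_cdf = lambda t: t
uniform_pdf = lambda t: 1.0
lam_star = solve_multiplier(uniform_cdf, uniform_pdf)
print(lam_star, [allocation(t, lam_star, uniform_cdf, uniform_pdf) for t in (0.1, 0.5, 0.9)])
```

With uniform types the multiplier comes out close to 1/2, so types between roughly 0.25 and 0.75 receive the fully informative experiment ($q = 0$) and the remaining types receive $q = \pm 1$, consistent with flat pricing under a symmetric distribution.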
References
Abraham, I., S. Athey, M. Babaioff, and M. Grubb (2014): "Peaches, Lemons, and Cookies: Designing Auction Markets with Dispersed Information," Discussion paper, Microsoft Research, Stanford, and Boston College.
Admati, A. R., and P. Pfleiderer (1986): "A monopolistic market for information," Journal of Economic Theory, 39(2), 400-438.
Admati, A. R., and P. Pfleiderer (1990): "Direct and indirect sale of information," Econometrica, 58(4), 901-928.
Balestrieri, F., and S. Izmalkov (2014): "Informed seller in a Hotelling market," Discussion paper, HP Labs and New Economic School.
Bergemann, D., and A. Bonatti (2015): "Selling Cookies," American Economic Journal: Microeconomics, 7(3), 259-294.
Bergemann, D., and S. Morris (2013): "Robust Predictions in Games with Incomplete Information," Econometrica, 81(4), 1251-1308.
Bergemann, D., and M. Pesendorfer (2007): "Information Structures in Optimal Auctions," Journal of Economic Theory, 137(1), 580-609.
Cabrales, A., O. Gossner, and R. Serrano (2015): "Demand for Information and the Appeal of Information Transactions," Discussion paper, University College London.
Celik, L. (2014): "Information unraveling revisited: disclosure of horizontal attributes," Journal of Industrial Economics, 62(1), 113-136.
Daskalakis, C., A. Deckelbaum, and C. Tzamos (2014): "Strong duality for a multiple-good monopolist," Discussion paper, MIT.
Deneckere, R., and R. McAfee (1996): "Damaged Goods," Journal of Economics and Management Strategy, 5, 149-174.
Eső, P., and B. Szentes (2007a): "Optimal information disclosure in auctions and the handicap auction," Review of Economic Studies, 74(3), 705-731.
Eső, P., and B. Szentes (2007b): "The price of advice," Rand Journal of Economics, 38(4), 863-880.
Hörner, J., and A. Skrzypacz (2015): "Selling Information," Journal of Political Economy, forthcoming.
Johnson, J. P., and D. P. Myatt (2006): "On the Simple Economics of Advertising, Marketing, and Product Design," American Economic Review, 96(3), 756-784.
Jullien, B. (2000): "Participation Constraints in Adverse Selection Models," Journal of Economic Theory, 93(1), 1-47.
Kamenica, E., and M. Gentzkow (2011): "Bayesian Persuasion," American Economic Review, 101(6), 2590-2615.
Koessler, F., and V. Skreta (2014): "Sales Talk," Discussion paper, PSE and UCL.
Kolotilin, A., M. Li, T. Mylovanov, and A. Zapechelnyuk (2015): "Persuasion of a Privately Informed Receiver," Discussion paper, University of New South Wales.
Krähmer, D., and R. Strausz (2015): "Ex post information rents in sequential screening," Games and Economic Behavior, 90, 257-273.
Lambrecht, A., and C. Tucker (2013): "When does retargeting work? Information specificity in online advertising," Journal of Marketing Research, 50(5), 561-576.
Li, H., and X. Shi (2015): "Discriminatory Information Disclosure," Discussion paper, University of British Columbia and University of Toronto.
Lizzeri, A. (1999): "Information revelation and certification intermediaries," Rand Journal of Economics, 30(2), 214-231.
Luenberger, D. G. (1969): Optimization by Vector Space Methods. John Wiley & Sons.
Manelli, A. M., and D. R. Vincent (2006): "Bundling as an optimal selling mechanism for a multiple-good monopolist," Journal of Economic Theory, 127(1), 1-35.
Maskin, E., and J. Riley (1984): "Monopoly with Incomplete Information," Rand Journal of Economics, 15(2), 171-196.
Mussa, M., and S. Rosen (1978): "Monopoly and Product Quality," Journal of Economic Theory, 18(2), 301-317.
Myerson, R. (1981): "Optimal Auction Design," Mathematics of Operations Research, 6, 58-73.
Myerson, R. (1986): "Multistage Games with Communication," Econometrica, 54(2), 323-358.
Mylovanov, T., and T. Tröger (2014): "Mechanism Design by an Informed Principal: Private Values with Transferable Utility," Review of Economic Studies, 81(4), 1668-1707.
Ottaviani, M., and A. Prat (2001): "The value of public information in monopoly," Econometrica, 69(6), 1673-1683.
Pavlov, G. (2011): "Optimal Mechanism for Selling Two Goods," The BE Journal of Theoretical Economics, 11(1), 1-33.
Rayo, L., and I. Segal (2010): "Optimal Information Disclosure," Journal of Political Economy, 118(5), 949-987.
Riley, J., and R. Zeckhauser (1983): "Optimal Selling Strategies: When to Haggle, When to Hold Firm," Quarterly Journal of Economics, 98, 267-290.
Sarvary, M. (2012): Gurus and Oracles: The Marketing of Information. MIT Press.
Shapiro, C., and H. R. Varian (1999): Information Rules: A Strategic Guide to the Network Economy. Harvard Business Press.
Toikka, J. (2011): "Ironing without Control," Journal of Economic Theory, 146(6), 2510-2526.