
Designing and Pricing Information*

Dirk Bergemann†  Alessandro Bonatti‡  Alex Smolin§

October 9, 2015

Abstract

A monopolist sells informative experiments to heterogeneous buyers who face a decision problem. Buyers differ in their prior information, and hence in their willingness to pay for additional signals. The monopolist offers a menu of experiments. Every experiment in the optimal menu contains locally non-dispersed information, i.e., it rules out at least one realization of the underlying state. With binary states and actions, the optimal menu is coarse: the seller offers at most two experiments, and we derive conditions under which flat or discriminatory pricing is optimal. We apply our findings to the sale of consumer-level information to enable targeted advertising.

Keywords: selling information, experiments, mechanism design, price discrimination, product differentiation.

JEL Codes: D42, D82, D83.

*We thank Ben Brooks, Giacomo Calzolari, Gabriel Carroll, Gonzalo Cisternas, Jeff Ely, Emir Kamenica, Alessandro Lizzeri, Alessandro Pavan, Phil Reny, Mike Riordan, Maher Said, Juuso Toikka, Alex Wolitzky and seminar participants at Berkeley, Bocconi, Bologna, Carnegie Mellon, Chicago, Harvard, Mannheim, NYU, Toulouse, Vienna, Yale and in the World Congress of the Econometric Society for helpful comments.

†Yale University, 30 Hillhouse Ave., New Haven, CT 06520, USA, [email protected]

‡MIT Sloan School of Management, 100 Main Street, Cambridge MA 02142, USA, [email protected]

§Yale University, 30 Hillhouse Ave., New Haven, CT 06520, USA, [email protected]


1 Introduction

The mechanisms by which information is traded can shape the creation and the distribution of surplus in several important settings. Information about specific assets guides trading decisions in financial markets; information about consumers facilitates targeting in online advertising markets; and the indirect transfer of knowledge through consulting and expert services affects the behavior of many firms. A feature common to all these markets is that buyers of information often have private knowledge relevant to their decision problem at the time of contracting (e.g., independent studies of traded companies, prior interactions with specific consumers, and partial expertise at a given problem). In other words, buyers seek to acquire incremental information.

Online advertising markets provide our initial motivation to study incremental information. Large data brokers (e.g., Acxiom and Bluekai) collect and maintain databases that record several attributes for each individual consumer. They then sell information about specific consumers to firms operating on either side of the advertising market: merchants who seek to tailor their advertising campaigns; and websites that sell their own advertising space and wish to allocate users to specific "audience segments."¹ An increasingly common use of information is behavioral retargeting, i.e., targeting advertisements to a consumer on the basis of previous interactions that did not lead to a sale.² Crucially, when a firm seeks to acquire external ("3rd-party") data to guide a retargeting campaign, it has proprietary ("1st-party") data on its own customers and, hence, a private valuation for additional information.

In this paper, we develop a new framework to analyze the sale of incremental information. We consider a single buyer who faces a decision problem under uncertainty. A monopolist seller owns a database containing information about a "state" variable relevant for the buyer's decision. However, the buyer is partially and privately informed about the state. We investigate the revenue-maximizing policy for the seller. In other words, how much information should the seller provide? How should the seller price access to the database?

Environment. In order to screen heterogeneous buyers, the seller offers a menu of products. In our context, these products are experiments in the sense of Blackwell, i.e., signals that reveal information about the buyer's payoff-relevant state. Only the information product is assumed to be contractible. In contrast, payments cannot be made contingent on the buyer's actions, or the realized states and signals. Consequently, the value of an experiment for a given buyer can be computed independently of its price. This is unlike a contract specifying contingent payments for actions, where the marginal price influences the buyer's behavior after observing a signal, and hence his willingness to pay ex ante.

¹ The option to classify users is available to sellers of advertising on most online platforms, including Google's DoubleClick Ad Exchange.

² The technology firm Rubicon Project integrates information from several sources to optimize advertising campaigns and website revenues ("Your Online Attention, Bought in an Instant," the New York Times, November 17, 2012). Lambrecht and Tucker (2013) empirically examine the efficacy of retargeting campaigns.

Finally, despite the buyer being privately informed about his beliefs, the analysis differs considerably from a belief-elicitation problem. Instead, we cast the problem into the canonical quality-pricing framework where the buyer's demand for information is determined by his prior beliefs. In other words, the seller's problem reduces to optimally designing and pricing information products in different versions.³

Results. The very nature of information products enriches the scope for price discrimination and leads to new insights relative to the classic nonlinear pricing problem of Mussa and Rosen (1978) and Maskin and Riley (1984). In particular, because information is only valuable if it affects the decision maker's action, buyers with heterogeneous beliefs rank informative signals differently. More precisely, all buyer types assign the highest value to the perfectly informative experiment, but their ranking of imperfect experiments differs substantially. Thus, the value of information naturally has both a vertical quality element and a horizontal positioning element, and the two cannot be produced independently.

As one would expect, the seller offers a menu that, in general, contains both the fully revealing experiment and partially informative, "distorted" experiments. However, the distorted information products are not just noisy versions of the same data. Instead, the seller introduces systematic distortions in the information provided. In particular, every experiment offered as part of the optimal menu is locally non-dispersed, i.e., the seller introduces no noise in the distribution of signals conditional on at least one true state.

We provide a full characterization of the optimal menu for the case of two types. The ex ante less informed "high" type purchases the fully revealing experiment. The ex ante more informed "low" type buys a distorted experiment. The seller's screening problem is facilitated by the possibility of providing "directional" information: the seller introduces noise in an appropriately chosen subset of signals to optimally reduce the information rents of the less informed "high" type. As a result, the low type's experiment rules out those states he deems relatively more likely. Furthermore, the flexibility to design rich information structures allows the seller to (generically) serve both buyer types, i.e., to avoid exclusion of the low type.

Despite the additional flexibility in screening, we show that bundling information is optimal quite generally with more than two types. In particular, with binary states and actions, the optimal menu for a continuum of types contains at most two experiments: one is fully informative; and the other (if present) contains two signals, one of which perfectly reveals the true state. Therefore, for one of their two actions, buyers never make a mistake whenever they choose it. In other words, the seller induces some buyers to make only type-I or type-II errors.

³ In Section 2, we relate the specific elements of our model to the retargeting context. In Section 6, we describe the information products offered by large data brokers and relate them to the properties of the optimal menu.

A fundamental distinction between two types is whether their priors are congruent, i.e., whether they would choose the same action in the absence of further information. Because of the linearity of expected utility in probabilities, we come to an intuition analogous to Myerson (1981) or Riley and Zeckhauser (1983): if all types have congruent priors, the seller should adopt flat pricing of the fully informative experiment. When priors are not all congruent, the seller's problem is to screen buyers both within and across the two groups with congruent priors. Ideally, the seller would charge different prices for the fully informative experiment to buyers in each group. We show that, if the distribution of types satisfies strong regularity conditions, flat pricing of a single information structure is, in fact, optimal. Conversely, discriminatory pricing of two different experiments emerges as part of the optimal menu only when "ironing" is required (Myerson, 1981). The second experiment is offered as a means to serve buyers in one group, while charging higher prices to the other group.

Related Literature. In our earlier work on markets for data and online advertising (Bergemann and Bonatti, 2015), we examined the problem of selling information about individual consumers as encoded in third-party cookies. Relative to the present paper, we allowed buyers of information to choose from a continuum of ex-post actions (e.g., levels of advertising spending), but we restricted the set of available information structures. In particular, we allowed buyers to acquire individual cookies only, modeled as perfectly revealing signals about individual realizations of the underlying state (i.e., the consumer's value). This effectively divided the customer pool into "targeted" and "residual" consumers. We also required the sellers of information to charge a constant linear price for each cookie. Thus, we abstracted from the information-design problem, in order to focus on the effect of information sales on the market for advertising space.

Our current paper ties the literature on selling information with that on monopolistic

screening. A long literature in economics and marketing studies the properties of information

goods, emphasizing how digitalized production allows sellers to easily customize (or degrade)

the attributes of such products (Shapiro and Varian, 1999). This argument applies even more

forcefully to information products (i.e., experiments), and suggests that versioning should

be an attractive price-discrimination technique (Sarvary, 2012). In this paper, we investigate

the validity of these claims in a simple contracting environment à la Mussa and Rosen (1978).

Admati and Pfleiderer (1986, 1990) provide the classic treatment of selling information to a continuum of agents who trade in a rational expectations equilibrium. Our approach differs from theirs along several dimensions. Admati and Pfleiderer (1986) emphasize the value of selling informative, but noisy and idiosyncratic information to the traders, a result echoed in Bergemann and Morris (2013). Admati and Pfleiderer (1990) compare direct and indirect trading of information, allowing the seller to offer shares in a mutual fund. In contrast, our paper focuses on heterogeneous, risk-neutral buyers who do not interact in a downstream market. Consequently, the seller offers noisy versions of the data to screen the buyer's initial information and extract more surplus. This leads to profound differences in the optimal information structures. More recent contributions to the problem of selling information are given by Eső and Szentes (2007b) and Hörner and Skrzypacz (2015). These papers focus on specific, distinct aspects of the problem such as bundling products and information, and dynamic information disclosure, respectively.

Within the mechanism design literature, our approach is related to, yet conceptually distinct from, models of discriminatory information disclosure. In these models, the seller of a good discloses horizontal match-value information, in addition to setting a price. Several papers, including Ottaviani and Prat (2001), Johnson and Myatt (2006), Bergemann and Pesendorfer (2007), Eső and Szentes (2007a), Krähmer and Strausz (2015), and Li and Shi (2015), analyze this problem from an ex-ante perspective. In these papers, the seller commits (simultaneously or sequentially) to a disclosure rule and to a pricing policy.⁴ In related contributions, Lizzeri (1999) considers vertical information acquisition and disclosure by a monopoly intermediary, and Abraham, Athey, Babaioff, and Grubb (2014) study vertical information disclosure in auctions.

Commitment to a disclosure policy is also present in the literature on Bayesian persuasion, e.g. Rayo and Segal (2010), Kamenica and Gentzkow (2011), and Kolotilin, Li, Mylovanov, and Zapechelnyuk (2015). In contrast to this line of work, our model admits monetary transfers and rules out any direct effect of the buyer's ex post action on the seller's utility.

2 Model

We consider a model with a single agent (a buyer of information) facing a decision problem under uncertainty. The relevant state for the buyer's problem $\omega$ is drawn from a finite set $\Omega$. The buyer must choose an action $a$, also from a finite set $A$. Throughout the paper, we assume the sets of actions and states have the same cardinality $|A| = |\Omega| = N$.

⁴ In addition, a number of more recent papers, including Balestrieri and Izmalkov (2014), Celik (2014), Koessler and Skreta (2014), and Mylovanov and Tröger (2014), analyze this question from an informed-principal perspective.

Payoffs. The buyer seeks to match the state and his action. His ex post utility function $u(a, \omega)$ is given by

$$u(a, \omega) = \mathbb{1}_{[a = \omega]}. \tag{1}$$

This formulation for the utility function assumes the buyer places uniform weight on his decisions. More general formulations would be equally tractable, but would complicate the exposition considerably.

Prior Information. The buyer has a type $\theta \in \Theta$ consisting of his interim beliefs about the state. These beliefs are the buyer's private information. They are generated from a common prior and from privately observed signals. In particular, suppose the buyer and the seller share a common prior

$$\mu \in \Delta\Omega.$$

The buyer privately observes an informative signal $r \in R$ according to a commonly known information structure

$$\lambda : \Omega \to \Delta R.$$

The buyer then forms his interim beliefs $\theta \in \Theta$ as

$$\theta(\omega \mid r) = \frac{\lambda(r \mid \omega)\,\mu(\omega)}{\sum_{\omega'} \lambda(r \mid \omega')\,\mu(\omega')}.$$

The buyer's interim beliefs $\theta(\omega \mid r)$, simply denoted by $\theta$, are his private information.

From the seller's point of view, the distribution over states and the distribution over private signals simply induce a distribution of initial beliefs

$$F(\theta),$$

which we take as a primitive of our model. As usual, the model allows for the alternative interpretations of a single buyer and a continuum of buyers.⁵
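The Bayes update that turns the common prior and a private signal into interim beliefs can be sketched in a few lines. This is only an illustration: the function name `interim_beliefs`, the argument names `prior` and `likelihood`, and all numbers are assumptions made here, not part of the paper.

```python
from fractions import Fraction as F

def interim_beliefs(prior, likelihood, r):
    """Posterior over states after private signal r, by Bayes' rule.

    prior[i]         : common prior probability of state i
    likelihood[i][r] : probability of private signal r in state i
    """
    joint = [likelihood[i][r] * prior[i] for i in range(len(prior))]
    total = sum(joint)
    if total == 0:
        raise ValueError("signal r has zero probability under the prior")
    return [x / total for x in joint]

# Hypothetical numbers: two states, a uniform prior, two private signals.
prior = [F(1, 2), F(1, 2)]
likelihood = [
    [F(7, 10), F(3, 10)],  # signal distribution in state 1
    [F(3, 10), F(7, 10)],  # signal distribution in state 2
]
theta_r0 = interim_beliefs(prior, likelihood, 0)
theta_r1 = interim_beliefs(prior, likelihood, 1)
```

With these illustrative numbers the two private signals generate two buyer types whose beliefs point in opposite directions, which is the sense in which the distribution of private signals induces a distribution of types.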

Incremental Information. The seller seeks to augment the buyer's private information with additional experiments. An experiment (an information structure) $I = \{\pi, S\}$ consists of a set of signals $s \in S$ and a likelihood function:

$$\pi : \Omega \to \Delta(S).$$

We assume throughout that the realizations of the buyer's private signal $r \in R$ and of the signal $s \in S$ from any experiment $I$ are independent conditional on the state $\omega$. With the conditional independence assumption, we are adopting the interpretation of a buyer querying a database, or requesting a diagnostic service. In particular, the buyer and the seller draw their information from independent sources. Under this interpretation, the seller does not know the realized state $\omega$ at the time of contracting. The seller can, however, augment the buyer's original information with arbitrarily precise signals.

⁵ In order to interpret the model as a continuum of buyers, we shall assume that states $\omega$ are identically and independently distributed across buyers, and that different buyers' signals are conditionally independent.

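We now fix an information structure $I$ and we let $\pi_{ij}$ denote the conditional probability of signal $s_j$ in state $\omega_i$,

$$\pi_{ij} = \Pr(s_j \mid \omega_i),$$

with $\sum_j \pi_{ij} = 1$. We can then represent any information structure in matrix form as

$$I = \begin{array}{c|ccc} & s_1 & s_j & \cdots \\ \hline \omega_1 & \pi_{11} & \pi_{1j} & \cdots \\ \omega_i & \pi_{i1} & \pi_{ij} & \cdots \\ \vdots & \vdots & \vdots & \ddots \end{array}$$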

The following experiments are of particular interest: (a) the fully informative experiment with $\pi_{ii} = 1$ for all $i$; (b) locally non-dispersed experiments that contain a unit row vector with $\pi_{ii} = 1$ for some $i$; and (c) locally noise-free experiments that contain a column $s_j$ with only one positive entry, $\pi_{jj} > 0$ for some $j$. In particular, locally non-dispersed experiments allow the buyer to rule out state $i$, while locally noise-free experiments reveal state $i = j$ with positive probability.
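The three classes of experiments just listed can be checked mechanically on a likelihood matrix. The sketch below assumes a square matrix stored as a list of rows (rows are states, columns are signals, with signal $j$ labeled to match state $j$); the function names and the two example matrices are hypothetical.

```python
from fractions import Fraction as F

def is_fully_informative(pi):
    """(a): every state i sends its own signal with probability one."""
    return all(pi[i][i] == 1 for i in range(len(pi)))

def is_locally_non_dispersed(pi):
    """(b): some row i is a unit vector with pi[i][i] == 1."""
    return any(pi[i][i] == 1 for i in range(len(pi)))

def is_locally_noise_free(pi):
    """(c): some column j has pi[j][j] > 0 as its only positive entry."""
    n = len(pi)
    return any(
        pi[j][j] > 0 and all(pi[i][j] == 0 for i in range(n) if i != j)
        for j in range(n)
    )

# Hypothetical 2x2 experiments; exact fractions avoid rounding issues.
one_sided = [[F(1), F(0)], [F(1, 4), F(3, 4)]]
noisy = [[F(3, 5), F(2, 5)], [F(2, 5), F(3, 5)]]
identity = [[F(1), F(0)], [F(0), F(1)]]
```

In the `one_sided` matrix the first state always produces the first signal, so observing the second signal rules that state out; the same matrix is also locally noise-free because the second signal arises only in the second state.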

An information policy or more simply a menu $\mathcal{M} = \{\mathcal{I}, t\}$ consists of a collection of information structures and an associated tariff function, i.e.,

$$\mathcal{I} = \{I\}, \qquad t : \mathcal{I} \to \mathbb{R}_+.$$

Our goal is to characterize the revenue-maximizing menu for the seller.

We are deliberately focusing on the pure problem of designing and selling information structures. We are therefore assuming that the seller commits to a menu before the realization of the state and the type $(\omega, r)$, and that none of the buyer's action ($a$), the realized state ($\omega$), or the experimental outcome (the signal $s$) is contractible. Thus, despite the buyer being privately informed about his beliefs, scoring rules and other belief-elicitation schemes are not available to the seller. In particular, the timing of the game is as follows: (i) the seller posts a menu $\mathcal{M}$; (ii) the true state $\omega$ and the buyer's signal $r$ (hence, the type $\theta$) are realized; (iii) the buyer chooses an experiment $I \in \mathcal{I}$ and pays the corresponding price $t$; (iv) the buyer observes a signal $s$ from the experiment $I$ (given the true state $\omega$) and chooses an action $a$.

Finally, we have assumed that the seller is unconstrained in her choice of information

structures, and that it is costless to provide any information. In other words, the data is

already stored. A richer model would distinguish the cost of acquisition of information (i.e.,

building the database) from duplication and distribution of the information. In particular,

the analysis could easily be extended to a first stage where the seller invests in the maximal

precision of the experiments, and to fixed or linear costs of information distribution.

Retargeting interpretation. Having completed the description of the model, we now return to online advertising markets to provide a more detailed interpretation of our setting. Consider a market with a continuum of consumers indexed by $i \in [0, 1]$. Consumer $i$ spends $\omega \in \mathbb{R}_+$ per website, i.e., she has a budget of $\omega$. Budgets are distributed in the overall population according to $\mu \in \Delta\Omega$. However, among customers of a given retailer $r$ (e.g., $r$ corresponds to Walmart, J.C. Penney, Sears, Macy's), budgets are distributed according to $\theta(\omega \mid r)$. Retailers want to determine which of their previous customers they should retarget with an advertising campaign.

A large data seller has a record of the past digital purchases of each consumer $i$ and hence it perfectly knows the budget $\omega$ of each one. The data seller can, therefore, improve the precision of the retailer's estimate for each individual consumer. In particular, suppose the database records several attributes for each consumer, i.e., random variables that correlate (positively or negatively) with individual budgets. The sale of information is then analogous to a database query: a retailer and the seller contract on which attributes the seller will disclose about a consumer upon the buyer's request. This allows the seller to offer narrower or wider brackets for the estimation of individual budgets. When the price of information is specified, however, the seller does not yet know the realized state, i.e., the budget of the specific consumer the buyer will actually inquire about.

Because the distribution of budgets of retailer $r$'s previous customers is the retailer's private information, the retailer is also privately informed about the incremental value of each attribute. Consequently, the seller offers a menu of information packages.⁶ Some of these packages may disclose only a fraction of the attributes relevant to the retailer's decision, leading the retailer to take action (e.g., to advertise) too frequently or too rarely.

⁶ It is, of course, reasonable that the seller knows which consumers have visited a large retailer's website in the past. Therefore, the seller should be able to compute the conditional distributions $\theta(\omega \mid r)$. However, under a posted-price mechanism, the seller cannot condition prices and experiments on buyers' identities. This assumption is realistic in large markets with numerous small retailers, and whenever data purchases are made through an intermediary, such as an advertising agency.

Our model is also consistent with alternative interpretations. For example, suppose each retailer $r$ observes consumer $i$'s budget upon their first interaction. Each consumer's budget evolves stochastically according to a commonly known Markov process. The retailer can query the database for information about the consumer's most recent activity. The retailer is then privately informed about the time of its interaction with the consumer, which determines its beliefs about the consumer's current budget. Finally, retailers may have private information about their overall willingness to spend on advertising or about other dimensions of their demand for information (e.g., the importance of their decision, or the "stakes of the game").

3 Information Design

3.1 Value of Information

We now characterize the value of the buyer's initial information and the supplemental value of an information structure $I = \{\pi, S\}$. Let $\theta_i$ denote the initial probability that type $\theta$ assigns to state $\omega_i$, with $i = 1, \ldots, N$. The value of the agent's problem in the absence of further information is given by

$$u(\theta) \triangleq \max_a \sum_{i=1}^{N} u(a, \omega_i)\, \theta_i.$$

Now suppose the buyer observes an incremental signal $s_j$ under the information structure $I = \{\pi, S\}$ where the number of signal realizations is $|S| = J$. The buyer chooses an action after updating his beliefs, leading to the gross utility

$$u(\theta, s_j) \triangleq \max_a \sum_{i=1}^{N} u(a, \omega_i)\, \frac{\theta_i \pi_{ij}}{\sum_{i'=1}^{N} \theta_{i'} \pi_{i'j}}, \tag{2}$$

where $\sum_{i=1}^{N} \theta_i \pi_{ij} = \Pr[s_j \mid \theta]$ is the marginal probability of signal $s_j$ from the perspective of buyer type $\theta$. Integrating over all signal realizations $s_j$, and subtracting the value of prior information, the net value of an information structure $I$ for type $\theta$ is given by

$$V(I, \theta) \triangleq \mathbb{E}_{s_j}\left[u(\theta, s_j)\right] - u(\theta) = \sum_{j=1}^{J} \max_a \sum_{i=1}^{N} u(a, \omega_i)\, \theta_i \pi_{ij} - u(\theta).$$

Under the specific utility function (1), the value of information takes a simpler form:

$$V(I, \theta) = \sum_{j=1}^{J} \max_i \theta_i \pi_{ij} - \max_i \theta_i. \tag{3}$$

Quite simply, the gross value of an information structure is given by the ex ante probability of the buyer choosing the correct action given the state.⁷
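Formula (3) is straightforward to evaluate numerically. The following is a minimal sketch, assuming the likelihood matrix is stored as a list of rows (states) and the type as a list of probabilities; `net_value`, the three-state belief vector, and the example experiments are all illustrative choices rather than objects from the paper.

```python
from fractions import Fraction as F

def net_value(pi, theta):
    """Equation (3): sum over signals of max_i theta_i * pi_ij, minus max_i theta_i."""
    n_states, n_signals = len(pi), len(pi[0])
    gross = sum(
        max(theta[i] * pi[i][j] for i in range(n_states))
        for j in range(n_signals)
    )
    return gross - max(theta)

theta = [F(1, 2), F(3, 10), F(1, 5)]         # hypothetical beliefs over 3 states
full = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]     # fully informative experiment
babble = [[F(1, 3)] * 3 for _ in range(3)]   # signals independent of the state
partial = [                                  # reveals state 1, pools states 2 and 3
    [1, 0, 0],
    [0, F(1, 2), F(1, 2)],
    [0, F(1, 2), F(1, 2)],
]

v_full = net_value(full, theta)
v_babble = net_value(babble, theta)
v_partial = net_value(partial, theta)
```

The uninformative experiment is worth nothing because the buyer's preferred action never changes, while the fully informative experiment is worth one minus the prior probability of the most likely state.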

As for all decision problems, an information structure $I$ is only valuable if different signals $s_j$ lead to different actions. In particular, if $\arg\max_i \theta_i \pi_{ij}$ is constant across signals $j$, then (3) immediately implies $V(I, \theta) = 0$. Conversely, the fully informative experiment $I^*$ guarantees that the buyer takes the correct action ex post. Its value is given by $V(I^*, \theta) = 1 - \max_i \theta_i$.

Viewed as a function of types, the value of an experiment $I$ is piecewise linear in $\theta$ with a finite number of kinks. Linearity is a consequence of the Bayesian nature of our problem, where types are probabilities. Downward kinks are due to the max operator in the buyer's reservation utility $u(\theta)$: they correspond to changes in the buyer's action under no additional information. Upward kinks are generated by the max operator in (2): they reflect changes in the buyer's preferred action upon observing a signal.

3.2 The Seller's Problem

We now characterize the menu of experiments that maximizes the seller's profits. By the Revelation Principle, we may state the seller's problem as designing a direct mechanism

$$\mathcal{M} = \{I(\theta), t(\theta)\}$$

that assigns an information structure $I(\theta)$ and a payment $t(\theta)$ to each type of the buyer. The seller's problem consists of maximizing the expected transfers subject to incentive compatibility and individual rationality:

$$\begin{aligned} \max_{\{I(\theta),\, t(\theta)\}} \quad & \int t(\theta)\, dF(\theta), \\ \text{s.t.} \quad & V(I(\theta), \theta) - t(\theta) \ge V(I(\theta'), \theta) - t(\theta') \quad \forall\, \theta, \theta', \\ & V(I(\theta), \theta) - t(\theta) \ge 0 \quad \forall\, \theta. \end{aligned} \tag{4}$$

The seller's problem can be immediately simplified by reducing the set of candidate information structures to a very tractable class.
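For a finite set of types, the constraints in (4) can be verified by brute force. The sketch below is an illustration under that finite-type assumption; `is_feasible`, `expected_revenue`, the two belief vectors, and the two flat-price menus are hypothetical and chosen only to show one menu that passes and one that violates individual rationality.

```python
from fractions import Fraction as F

def net_value(pi, theta):
    """Value of experiment pi for beliefs theta under matching utility, eq. (3)."""
    gross = sum(
        max(theta[i] * pi[i][j] for i in range(len(theta)))
        for j in range(len(pi[0]))
    )
    return gross - max(theta)

def is_feasible(types, menu):
    """menu[k] = (experiment, price) is the item intended for types[k]."""
    for k, theta in enumerate(types):
        own = net_value(menu[k][0], theta) - menu[k][1]
        if own < 0:                        # individual rationality fails
            return False
        for experiment, price in menu:     # incentive compatibility
            if net_value(experiment, theta) - price > own:
                return False
    return True

def expected_revenue(weights, menu):
    """Expected transfer when types[k] has probability weights[k]."""
    return sum(w * price for w, (_, price) in zip(weights, menu))

types = [[F(7, 10), F(3, 10)], [F(1, 5), F(4, 5)]]
full = [[1, 0], [0, 1]]
flat_low = [(full, F(1, 5)), (full, F(1, 5))]      # one price both types accept
flat_high = [(full, F(3, 10)), (full, F(3, 10))]   # too expensive for the second type
revenue_low = expected_revenue([F(1, 2), F(1, 2)], flat_low)
```

A search over menus is not attempted here; the point is only that each candidate menu can be screened against the two families of constraints before its revenue is compared.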

⁷ This is in contrast with models that specify contingent payments for actions, where the marginal price influences the buyer's behavior after observing a signal and hence his willingness to pay ex ante.

Lemma 1 (Maximal Cardinality) Every incentive compatible information policy can be represented as a collection of information structures where the signal space has at most the cardinality of the action space.

This result follows from the revelation principle for multi-stage games in Myerson (1986). The intuition is straightforward: consider an incentive-compatible information policy that contains an experiment $I(\theta)$ with more signals than actions; the seller could combine all signals in $I(\theta)$ that lead to the same choice of action for type $\theta$; clearly, the value of this experiment stays constant for type $\theta$ (who does not modify his behavior); in addition, because the new experiment is a garbling of the original one and hence weakly less informative, $V(I(\theta), \theta')$ decreases (weakly) for all $\theta' \neq \theta$, relaxing the incentive constraints.

Armed with this result, we can more fully characterize the buyer's demand for information. To do so, let us revisit the value of information in a simple binary-state example. In particular, let $\Omega = \{\omega_1, \omega_2\}$ and, to facilitate the exposition, define

$$\alpha \triangleq \pi_{11} \quad \text{and} \quad \beta \triangleq \pi_{22}.$$

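Lemma 1 allows us to focus on experiments with two signals only, i.e., on information structures $I(\theta)$ of the following kind:

$$I(\theta) = \begin{array}{c|cc} & s_1 & s_2 \\ \hline \omega_1 & \alpha(\theta) & 1 - \alpha(\theta) \\ \omega_2 & 1 - \beta(\theta) & \beta(\theta) \end{array}$$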

Without loss of generality, we adopt the convention that $\alpha(\theta) + \beta(\theta) \ge 1$ (else we should relabel the two signals). With some abuse of notation, we let $\theta = \Pr[\omega_1]$. Finally, the value of experiment $(\alpha, \beta)$ for type $\theta$, which is given by (3), can also be written as

$$V(\alpha, \beta, \theta) = \left[\alpha\theta + \beta(1 - \theta) - \max\{\theta, 1 - \theta\}\right]^+. \tag{5}$$

Figure 1 shows the value of information for experiments $(\alpha, \beta) \in \{(1, 1), (1/2, 1)\}$. The first one is fully informative, while the second one eliminates type-I errors.
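As a sanity check, the closed form (5) can be compared with the general formula (3) on a grid of binary experiments. The sketch below writes the two diagonal entries of the 2x2 experiment as `alpha` and `beta` and the type as `theta`, the probability of the first state; the grid resolution and the function names are arbitrary choices made here.

```python
from fractions import Fraction as F

def value_general(alpha, beta, theta):
    """Equation (3) for the 2x2 experiment with diagonal entries alpha and beta."""
    gross = (max(theta * alpha, (1 - theta) * (1 - beta))
             + max(theta * (1 - alpha), (1 - theta) * beta))
    return gross - max(theta, 1 - theta)

def value_binary(alpha, beta, theta):
    """Equation (5), stated under the convention alpha + beta >= 1."""
    return max(alpha * theta + beta * (1 - theta) - max(theta, 1 - theta), 0)

grid = [F(k, 20) for k in range(21)]
agree = all(
    value_general(a, b, t) == value_binary(a, b, t)
    for a in grid for b in grid for t in grid
    if a + b >= 1
)

# The two experiments of Figure 1 at two types equidistant from 1/2.
pair = (F(1, 4), F(3, 4))
full_info = [value_binary(F(1), F(1), t) for t in pair]
one_sided = [value_binary(F(1, 2), F(1), t) for t in pair]
```

The fully informative experiment is worth the same to the two types, whereas the one-sided experiment is valuable to only one of them, which illustrates why distance from 1/2 alone does not pin down the value of information.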

While we are considering a natural type space, corresponding to the buyer's interim beliefs, the notions of a "high" and "low" type differ from the standard screening setting. In particular, due to the nature of the buyer's Bayesian problem, the most valuable type for the seller is the middle type $\theta = 1/2$. Conversely, the two extreme types $\theta \in \{0, 1\}$ have no value of information. However, the distance $|\theta - 1/2|$ is not a sufficient statistic for the value of information, as shown in the right panel of Figure 1. In particular, the different slopes on each side of $\theta = 1/2$ indicate different marginal benefits of avoiding type-I errors.

Figure 1: Value of Information Structures $V(\alpha, \beta, \theta)$

More generally, information always has both a vertical (quality) and a horizontal (positioning) dimension. Information in this sense is always high-dimensional, even when the buyer's type is one-dimensional, as in this example. In particular, a fundamental difference with standard quality pricing is that all types agree on the best product (i.e., the fully informative experiment), but they disagree on the ranking of "distorted" products. For example, consider the following two information structures:

$$I = \begin{array}{c|cc} & s_1 & s_2 \\ \hline \omega_1 & 1 & 0 \\ \omega_2 & 1 - \beta & \beta \end{array}, \qquad I' = \begin{array}{c|cc} & s_1 & s_2 \\ \hline \omega_1 & \alpha & 1 - \alpha \\ \omega_2 & 0 & 1 \end{array}.$$

All types $\theta \in (1/2, 1)$ have a positive value of information for experiment $I$, because it contains a signal ($s_2$) that perfectly reveals state $\omega_2$ and induces them to switch their action. The converse is true for types $\theta \in (0, 1/2)$, who value experiment $I'$ because signal $s_1$ perfectly reveals state $\omega_1$.

We have so far focused on locally non-dispersed experiments, i.e., information structures with signals that perfectly rule out one state. The next result establishes that the optimal menu contains only such experiments, further simplifying the seller's problem.

Lemma 2 (Non-Dispersed Information)

1. The fully informative experiment, $\pi_{ii} = 1$ for all $i$, is always part of the optimal menu.

2. Every experiment in an optimal menu is locally non-dispersed, i.e., $\pi_{ii} = 1$ for some $i$.

Part (1.) of this result is immediate: because every type values the fully informative experiment $I^*$ the most, the seller can always replace the most expensive item on the menu with $I^*$, keep all prices constant, and weakly increase profits. Part (2.) of Lemma 2 implies that, for every experiment $I$, there exists a state $\omega$ that is ruled out by $N - 1$ signal realizations.⁸ This result follows from the negative correlation present by construction in the buyer's beliefs over each state. Because beliefs $\theta_i$ sum to one, we can always identify a vertical quality dimension (e.g., $\pi_{NN}$) that buyers value uniformly. All other dimensions of the experiment $\pi$ enter the buyer's utility through the difference with $\pi_{NN}$.

To obtain some intuition, consider again the case of binary states, and rearrange the value of information in (5) as

$$V(\alpha, \beta, \theta) = \left[(\alpha - \beta)\,\theta + \beta - \max\{\theta, 1 - \theta\}\right]^+.$$

This formulation for the value of information highlights both level effects (terms that depend on the allocation or type only) and interaction effects. In particular, the allocation $(\alpha, \beta)$ and the buyer's type $\theta$ interact through the difference $\alpha - \beta$ only. This means the seller can increase the value of an experiment at the same rate for all types. In particular, the seller can increase $\alpha$ and $\beta$ holding $\alpha - \beta$ constant, raising the price at the same rate. This does not alter the attractiveness of the experiment for any buyer who is considering choosing it. Thus, in the case of binary state and binary action, every optimal experiment must have either $\alpha = 1$ or $\beta = 1$ (or both). With two states only, any locally non-dispersed experiment is also locally noise-free, i.e., at least one signal perfectly reveals the true state.
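The level effect described above can be illustrated numerically: raising both diagonal entries of a binary experiment by the same amount raises its value by exactly that amount for every type that already values it. In this sketch `alpha`, `beta`, `delta`, and the two types are hypothetical numbers, and `theta` denotes the probability of the first state.

```python
from fractions import Fraction as F

def value_binary(alpha, beta, theta):
    """Rearranged form of equation (5) for a binary experiment."""
    return max((alpha - beta) * theta + beta - max(theta, 1 - theta), 0)

alpha, beta, delta = F(1, 2), F(3, 4), F(1, 4)   # shift ends at beta + delta = 1
types = [F(2, 5), F(1, 2)]                       # both value (alpha, beta) positively

before = [value_binary(alpha, beta, t) for t in types]
after = [value_binary(alpha + delta, beta + delta, t) for t in types]
gains = [a - b for a, b in zip(after, before)]
```

Because the gain is the same for both types, the seller can raise the price by `delta` without changing either type's net payoff from this experiment, and can keep doing so until one of the two entries reaches one.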

In the context of online advertising, any such experiment consists of asking the data seller to group potential consumers into two categories: "Buy Ad" and "Don't Buy Ad" (where action $a_1$ corresponds to buying). With this interpretation, $\alpha = 1 > \beta$ will lead to a broad match strategy, and vice-versa $\beta = 1 > \alpha$ will lead to excessively narrow targeting.

4 Optimal Menu with Binary Types

We now consider the case of two types of buyers, $\theta \in \{\theta^H, \theta^L\}$. For simplicity we will refer to the relatively less informed type as the "high type" $\theta^H$ and to the relatively more informed type as the "low type" $\theta^L$. More specifically, we assume that

$$\max_i \theta^H_i \le \max_i \theta^L_i.$$

Equivalently, type $\theta^H$ is willing to pay more for the fully revealing experiment. The fraction of high types in the population is given by

$$\gamma = \Pr(\theta^H).$$

To illustrate the basic mechanics of screening buyers with heterogeneous beliefs, we consider a simple example with binary states and actions. We then turn to a more general environment.

⁸ Thus, the expected relative entropy of each experiment is infinite. Cabrales, Gossner, and Serrano (2015) offer an alternative criterion to quantify the informational content of an experiment.

4.1 Binary State Example

The optimal menu with binary states is best illustrated through a concrete example. Let $\theta^H = 7/10$ and $\theta^L = 1/5$, and assume the two types are equally likely in the population ($\gamma = 1/2$). Now suppose the seller offers the fully informative experiment $\alpha = \beta = 1$ to type $\theta^H$ and extracts his entire surplus by charging a price of $3/10$. In a canonical screening model, the seller would now have to exclude type $\theta^L$. However, when selling information, the monopolist can design another experiment with undesirable properties for the high type. For example, the seller can offer a partially informative experiment that would lead type $\theta^H$ to choose action $a_1$ when observing signal $s_1$ and to be indifferent between the two actions when observing signal $s_2$. Because the high type would choose action $a_1$ in the absence of information, his willingness to pay for such an experiment is nil. The seller can then extract the entire surplus of type $\theta^L$ as well, and satisfy incentive constraints. In our example, this amounts to offering the experiment $(\alpha, \beta) = (4/7, 1)$ at a price of $4/35$. The net value of both experiments as a function of the buyer's type $\theta \in [0, 1]$ is shown in Figure 2 below.

Figure 2: Suboptimal Menu: $(\alpha, \beta) \in \{(1, 1), (4/7, 1)\}$

As is clear from this picture, both incentive compatibility constraints are slack with such a menu. Therefore, the seller can increase the quality of the experiment sold to the low type, while satisfying the incentive constraint of the high type, and still extract the entire surplus. The optimal menu is then characterized by the most informative experiment the seller can offer while extracting the entire surplus. In our example, the optimal menu is then given by $(\alpha, \beta) \in \{(1, 1), (4/5, 1)\}$ with the corresponding prices $t \in \{3/10, 4/25\}$. Figure 3 illustrates the net value of the two experiments offered by the monopolist.

Figure 3: Optimal Menu: $(\alpha, \beta) \in \{(1, 1), (4/5, 1)\}$

In this example, the optimal menu is characterized by "no distortion at the top," and by full rent extraction. The (ex ante less informed) type $\theta$ closer to $1/2$ buys the perfectly informative experiment. The more informed type buys a partially revealing experiment. Intuitively, the seller lowers the information content provided to the more informed type to lower the information rent of the uninformed type. However, the resulting partially informative experiment is not worthless for the high type. On the contrary, it is as informative as possible, while satisfying the high type's incentive compatibility and participation constraints with equality.
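The numbers in this example can be checked directly. The following is a minimal Python sketch, assuming the binary value formula $V(\alpha, \beta, \theta) = [(\alpha - \beta)\theta + \beta - \max\{\theta, 1-\theta\}]^+$ that the paper recalls in Section 5. The function and variable names are illustrative and not part of the paper.

```python
from fractions import Fraction as F


def value(alpha, beta, theta):
    """Gross value of experiment (alpha, beta) for a buyer with prior theta = Pr(omega_1)."""
    gain = (alpha - beta) * theta + beta - max(theta, 1 - theta)
    return max(gain, F(0))


theta_H, theta_L = F(7, 10), F(1, 5)
full = (F(1), F(1))            # fully informative experiment
suboptimal = (F(4, 7), F(1))   # low type's experiment in the menu of Figure 2
optimal = (F(4, 5), F(1))      # low type's experiment in the menu of Figure 3

# Prices that extract each type's entire surplus.
price_full = value(*full, theta_H)
price_sub = value(*suboptimal, theta_L)
price_opt = value(*optimal, theta_L)

# Net payoff of the high type if he deviates to the low type's item.
# His payoff from his own item is zero, so a negative number means a slack constraint.
deviation_sub = value(*suboptimal, theta_H) - price_sub
deviation_opt = value(*optimal, theta_H) - price_opt

print(price_full, price_sub, price_opt, deviation_sub, deviation_opt)
```

Under these assumptions the high type's deviation payoff is strictly negative in the first menu and exactly zero in the second, which is the sense in which the second menu is "as informative as possible."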

This example highlights the "horizontal" aspect of selling information that increases the scope for screening. Towards a more general result, assume without loss that the high type is given by $\theta^H > 1/2$ and that
$$\left|\theta^H - 1/2\right| \le \left|\theta^L - 1/2\right|.$$
We now define the following threshold:

$$\hat{\gamma} \triangleq \frac{1 - \theta_1}{1 - \theta_2}. \qquad (6)$$

Finally, the following distinction is useful: the priors of the two types are said to be congruent if both types would choose the same action in the absence of the additional information. With our parametrization (the high type being the one closer to $1/2$), priors are congruent if $1/2 \le \theta^H \le \theta^L$ and they are non-congruent if $\theta^L \le 1/2 \le \theta^H$. In Proposition 1, we characterize the optimal menu in the fully binary setting.9

Proposition 1 (Binary Types, States, and Actions)

1. With congruent priors, the seller offers the fully informative experiment only. Both types participate if and only if $\gamma \le (1 - \theta^L)/(1 - \theta^H)$.

2. With non-congruent priors and $\gamma \le \hat{\gamma}$, both types buy the fully informative experiment.

3. With non-congruent priors and $\gamma > \hat{\gamma}$, the high type buys the fully informative experiment and the low type buys a partially informative experiment with
$$\alpha = \frac{2\theta^H - 1}{\theta^H - \theta^L} \quad \text{and} \quad \beta = 1.$$
Furthermore, the seller extracts the entire surplus.
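As a rough numerical illustration of part 3, the sketch below compares the revenue of three candidate menus under non-congruent priors ($\theta^L \le 1/2 \le \theta^H$). It assumes that each buyer's willingness to pay for the fully informative experiment is $\min\{\theta, 1-\theta\}$ and that the low type pays his full value $\alpha\theta^L$ for the experiment $(\alpha, 1)$. It compares revenues directly rather than evaluating the threshold in (6), and all names are hypothetical.

```python
from fractions import Fraction as F


def low_type_alpha(theta_H, theta_L):
    """alpha of the low type's experiment in part 3 of Proposition 1 (beta = 1)."""
    return (2 * theta_H - 1) / (theta_H - theta_L)


def revenues(theta_H, theta_L, gamma):
    """Revenue of three candidate menus, assuming theta_L <= 1/2 <= theta_H."""
    wtp_H = 1 - theta_H   # high type's value of full information
    wtp_L = theta_L       # low type's value of full information
    alpha = low_type_alpha(theta_H, theta_L)
    price_low = alpha * theta_L   # low type's value of the experiment (alpha, 1)
    return {
        "pooling": min(wtp_H, wtp_L),        # full information sold to both types
        "high_only": gamma * wtp_H,          # full information sold to the high type only
        "screening": gamma * wtp_H + (1 - gamma) * price_low,
    }


r = revenues(F(7, 10), F(1, 5), F(1, 2))
print(low_type_alpha(F(7, 10), F(1, 5)), r)
```

With the parameters of the binary example, the two-item menu yields strictly more revenue than either one-item alternative.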

This result highlights the role of linearity in the seller's problem, and it suggests that the optimal menu is characterized by a corner solution. In particular, it is never optimal for the seller to offer a more informative experiment to the low type and simultaneously leave some rent to the high type by lowering the price of the fully informative experiment. If such a marginal deviation is profitable, the optimal menu consists of pooling both types at the fully informative experiment, as in part (2.).

With congruent priors, it is more likely that both types receive full information if they are similar, if the more informed types are more frequent, or if they have more to gain from additional information. With non-congruent priors, the optimal menu offers the perfect information structure to both types if they are sufficiently close in their prior information. Offering two experiments is optimal only if the two types are sufficiently asymmetric. However, the low type always receives some information in that case (i.e., he is not shut down), and the high type is indifferent between the two items on the menu. Furthermore, the high type only receives positive rents if he is pooled with the low type. Otherwise, the seller extracts the high type's surplus, even in the presence of an alternative information structure.

Even though the shape of the optimal menu depends on whether types have congruent or non-congruent priors, the comparative statics of the optimal information structure are robust across the different scenarios. Because the high type buys the fully informative experiment and the low type's experiment always has $\beta = 1$, we define the noise in the information structure bought by the low type as $q \triangleq 1 - \alpha$. We can then ask: how do the distribution of types and the prior information of each type impact the optimal provision of information?

9 This result is a special case of Proposition 2 and its proof is therefore omitted.

Corollary 1 (Comparative Statics)

1. The noise q is increasing in the frequency of the high type.

2. The noise q is increasing in the precision of the prior of the high type.

3. The noise q is decreasing in the precision of the prior of the low type.

In other words, the rent-extraction vs. efficiency trade-off is resolved at the expense of the informed agent whenever the fraction of uninformed agents is larger, the uninformed agents have a higher willingness to pay for information, or the informed type has a lower willingness to pay for the complete information structure.

4.2 General Characterization

We now characterize the optimal menu with two types for the case of N states and actions.

To facilitate the description of the seller's problem, we first establish the familiar properties of "no distortion at the top" and "no rent at the bottom."

Lemma 3 (Binding Constraints) In the optimal menu: the high type $\theta^H$ purchases the complete information structure $I^*$; the incentive-compatibility constraint of type $\theta^H$ binds; and the participation constraint of type $\theta^L$ binds.

We can now simplify the seller's problem (4). If the low type participates, the seller extracts the entire surplus from the low type. We can then maximize over the information structure $I(\theta^L)$ and over the information rent of the high type, which we denote by $U(\theta^H)$. Without loss of generality, we arrange the signals in the experiment $I(\theta^L)$ so that the low type chooses action $a_i$ when observing signal $s_i$.

Using expression (3) for the value of information and ignoring additive constants, the seller's problem consists of choosing $U(\theta^H) \ge 0$ and the diagonal entries $\pi_{ii} \in [0, 1]$ of the low type's information structure $I(\theta^L)$ to maximize
$$(1 - \gamma) \sum_{i=1}^{n} \theta^L_i \pi_{ii} - \gamma\, U(\theta^H), \qquad (7)$$
subject to the high type's incentive-compatibility constraint
$$\sum_{i=1}^{n} \left(\theta^L_i - \theta^H_i\right) \pi_{ii} \ge \max_i \theta^L_i - \max_i \theta^H_i - U(\theta^H). \qquad (8)$$


By inspection of (7) and (8), it is immediate that a higher $\pi_{ii}$ increases profits and relaxes the constraint for all $\omega_i$ such that $\theta^L_i > \theta^H_i$. In other words, if the low type deems a given state $\omega_i$ more likely than does the high type, the seller should not distort the distribution of signals conditional on state $i$, i.e., she should send signal $s_i$ with probability one. In particular, the optimal menu has $\pi_{ii} = 1$ for the state corresponding to the low type's default action.

With two states only (e.g., in the previous example where $\alpha < 1 = \beta$), this observation also identifies the most profitable way of reducing the high type's rent. With $N > 2$ states, the seller has considerably more instruments to distort the low type's allocation. In particular, the partially informative experiment may contain fewer signals than available actions: the seller may "drop" some signals from $I(\theta^L)$ to reduce the information rents. The logic above then suggests that the seller should distort the signal distribution in states the high type deems very likely but the low type does not.

To formalize this intuition, we re-order the states $\omega_i$ by the likelihood ratios of the two types' beliefs. In particular, let
$$\frac{\theta^L_1}{\theta^H_1} \le \cdots \le \frac{\theta^L_i}{\theta^H_i} \le \cdots \le \frac{\theta^L_N}{\theta^H_N}. \qquad (9)$$
We then define two critical states. The first state is defined as
$$i^* = \min\left\{ i : \frac{\theta^L_i}{\theta^H_i} \ge \gamma \right\}. \qquad (10)$$
(State $i^*$ is well-defined because the likelihood ratio must exceed one for some $i$.) To define the second state, consider the following quantity:
$$k_j := \frac{\sum_{i=j+1}^{n} \left(\theta^L_i - \theta^H_i\right) - \max_i \theta^L_i + \max_i \theta^H_i}{\theta^H_j - \theta^L_j}. \qquad (11)$$
In particular, $k_j$ corresponds to the value of $\pi_{jj}$ that satisfies (8) with equality when the high type's rent is nil and, in addition, $\pi_{ii} = 1$ for all states $i > j$ and $\pi_{ii} = 0$ for all $i < j$. We then let
$$j^* := \min\left\{ j : k_j \ge 0 \right\}, \qquad (12)$$
which is also well-defined because the definition of a low type implies at least $k_N \ge 0$. In Proposition 2, we provide a general characterization of the optimal menu.


Proposition 2 (Optimal Menu with Two Types) The optimal experiment $I(\theta^L)$ has $\pi_{ii} = 0$ for all $i < \min\{i^*, j^*\}$ and $\pi_{ii} = 1$ for all $i > \min\{i^*, j^*\}$. Moreover:

1. If $j^* < i^*$, then $\pi_{j^* j^*} = k_{j^*}$ and $U(\theta^H) = 0$.

2. If $j^* \ge i^*$, then $\pi_{i^* i^*} = 1$ and $U(\theta^H) > 0$.

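In general, the optimal experiment $I(\theta^L)$ takes the following lower-triangular shape, with $\pi_{ii} \in \{k_i, 1\}$ depending on whether $i^* \gtrless j^*$, as in Proposition 2:
$$\begin{pmatrix}
0 & \cdots & 0 & \pi_{1i} & \cdots & \cdots & \pi_{1n} \\
\vdots & & \vdots & \vdots & & & \vdots \\
\vdots & & \vdots & \pi_{ii} & \cdots & \cdots & \pi_{in} \\
\vdots & & \vdots & 0 & 1 & \cdots & 0 \\
\vdots & & \vdots & \vdots & & \ddots & \vdots \\
0 & \cdots & 0 & 0 & 0 & \cdots & 1
\end{pmatrix} \qquad (13)$$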

The distribution of signals conditional on states $i < \min\{i^*, j^*\}$ is not uniquely pinned down. In the Appendix, we describe three algorithms to compute the solution and to assign the off-diagonal entries $\pi_{ij}$ in such a way that both types follow the recommendation implied by each signal.

The seller's choice of experiments in the optimal menu depends on two distinct factors: (a) the extent and the structure of the two types' belief disagreement, and (b) the distribution of types. Proposition 2 shows that the problem is separable. In particular, the optimal experiment $I(\theta^L)$ is chosen from a discrete set that only depends on the two types $(\theta^H, \theta^L)$. Each element of this set distorts progressively more signals as in (13). The distribution of types then determines the optimal element of the set.

Intuitively, the extent of the two types' belief disagreement should guide the seller's choice of a partially informative experiment. In particular, the seller is most willing to introduce noise in signals that the low type considers relatively less likely than the high type, because these distortions facilitate screening without sacrificing surplus. That is, distortions are more likely to occur in states that maximize relative disagreement. The extent of disagreement is captured by the critical state $j^*$, which represents the number of signals that must be distorted in order to satisfy incentive compatibility, while holding the high type to his reservation utility.

The profitability of distorting the allocation depends, however, on the distribution of types. In particular, the measure of high types (as captured by the critical state $i^*$) represents the shadow cost of providing information to the low types. For sufficiently high shadow cost, the monopolist is willing to distort the allocation by removing as many signals as necessary to satisfy (8) and hold the high type to his participation constraint. For low opportunity cost, the principal prefers to limit distortions and concede rents to the high type. Thus, as in the binary-state case, the informativeness of the low type's experiment is decreasing in the fraction of high types $\gamma$.

We illustrate the findings of Proposition 2 through the following examples in a setting with three states and actions.

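Example 1 (Dissimilar Types) Let $\theta^L = (1/10, 1/10, 4/5)$ and $\theta^H = (1/2, 1/4, 1/4)$. Because $k_1 > 0$, we have $j^* = 1$ and therefore $\pi_{ii} > 0$ for all $i = 1, 2, 3$. The optimal experiment $I(\theta^L)$ as a function of the distribution of types is given by
$$\begin{array}{c|ccc}
 & s_1 & s_2 & s_3 \\ \hline
\omega_1 & 1 & 0 & 0 \\
\omega_2 & 0 & 1 & 0 \\
\omega_3 & 0 & 0 & 1
\end{array} \quad \text{for } \gamma < 1/5,$$
and
$$\begin{array}{c|ccc}
 & s_1 & s_2 & s_3 \\ \hline
\omega_1 & 1/4 & 1/4 & 1/2 \\
\omega_2 & 0 & 1 & 0 \\
\omega_3 & 0 & 0 & 1
\end{array} \quad \text{for } \gamma > 1/5.$$
The high type obtains zero rent if and only if $\gamma \ge 1/5$.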

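Example 2 (Similar Types) Let $\theta^L = (1/10, 1/10, 4/5)$ but consider a relatively less informed high type, i.e., $\theta^H = (2/5, 3/10, 3/10)$. Because $k_1 < 0 < k_2$, the partially informative experiment $I(\theta^L)$ can involve dropping signal $s_1$, i.e., setting $\pi_{11} = 0$. In particular, the optimal experiment $I(\theta^L)$ is given by
$$\begin{array}{c|ccc}
 & s_1 & s_2 & s_3 \\ \hline
\omega_1 & 1 & 0 & 0 \\
\omega_2 & 0 & 1 & 0 \\
\omega_3 & 0 & 0 & 1
\end{array} \quad \text{for } \gamma < 1/4,$$
$$\begin{array}{c|ccc}
 & s_1 & s_2 & s_3 \\ \hline
\omega_1 & 0 & 1/4 & 3/4 \\
\omega_2 & 0 & 1 & 0 \\
\omega_3 & 0 & 0 & 1
\end{array} \quad \text{for } \gamma \in [1/4, 1/3],$$
$$\begin{array}{c|ccc}
 & s_1 & s_2 & s_3 \\ \hline
\omega_1 & 0 & 1/4 & 3/4 \\
\omega_2 & 0 & 1/2 & 1/2 \\
\omega_3 & 0 & 0 & 1
\end{array} \quad \text{for } \gamma > 1/3.$$
The high type obtains zero rent if and only if $\gamma \ge 1/3$.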
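The diagonal entries in both examples can be reproduced mechanically from definitions (10) to (12) and Proposition 2. The following Python sketch is only an illustration under stated assumptions: states are already ordered as in (9), indices are 0-based, $\theta^H_j \neq \theta^L_j$ for every index the search visits, and only the diagonal $\pi_{ii}$ is returned (the off-diagonal entries are left to the algorithms in the Appendix). The function name `optimal_diagonal` is hypothetical.

```python
from fractions import Fraction as F


def optimal_diagonal(theta_L, theta_H, gamma):
    """Diagonal entries pi_ii of I(theta_L), following Proposition 2."""
    n = len(theta_L)
    ratios = [l / h for l, h in zip(theta_L, theta_H)]
    assert ratios == sorted(ratios), "states must be ordered as in (9)"
    gap = max(theta_H) - max(theta_L)

    def k(j):
        # Quantity (11): pi_jj that makes (8) bind with zero rent.
        tail = sum(theta_L[i] - theta_H[i] for i in range(j + 1, n))
        return (tail + gap) / (theta_H[j] - theta_L[j])

    i_star = next(i for i in range(n) if ratios[i] >= gamma)   # (10)
    j_star = next(j for j in range(n) if k(j) >= 0)            # (12)
    pivot = min(i_star, j_star)
    diagonal = [F(0)] * pivot + [F(1)] * (n - pivot)
    if j_star < i_star:
        diagonal[pivot] = k(j_star)   # case 1: partial distortion, zero rent
    return diagonal


low = [F(1, 10), F(1, 10), F(4, 5)]
high_1 = [F(1, 2), F(1, 4), F(1, 4)]      # Example 1
high_2 = [F(2, 5), F(3, 10), F(3, 10)]    # Example 2

print(optimal_diagonal(low, high_1, F(1, 2)))
print(optimal_diagonal(low, high_2, F(3, 10)))
print(optimal_diagonal(low, high_2, F(1, 2)))
```

The values of $\gamma$ used in the calls are arbitrary points inside the ranges stated in the examples.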

Notice that the monopolist must distort the allocation more heavily in Example 2, where the two types are more similar. In both examples, for sufficiently low $\gamma$, the menu contains the fully informative experiment $I^*$ only.

The following corollary of Proposition 2 illustrates two polar cases in which the distribution of types pins down the rent of the high type.


Corollary 2 (Information Rents)

1. If $\gamma < \theta^L_1/\theta^H_1$, both types purchase the fully informative experiment $I^*$ and the high type obtains positive rent $U(\theta^H) > 0$.

2. If $\gamma > \theta^L_i/\theta^H_i$ for all $i$ such that $\theta^L_i/\theta^H_i \le 1$, the high type obtains no rent.

The main idea of this section is that the seller can exploit disagreement between the two types along several dimensions to profitably offer a screening menu. But when is it profitable to offer two distinct experiments? Corollary 2 dealt with the case of bunching. Corollary 3 deals with the case of exclusion. It contains sufficient conditions for discriminatory pricing (a menu with two experiments) to be optimal under some distribution $\gamma$. Recall that congruent priors are beliefs that lead to identical actions in the absence of new information.

Corollary 3 (Discriminatory Pricing)

1. With non-congruent priors, it is never optimal to exclude type $\theta^L$.

2. With congruent priors and types not lying on a ray in the simplex, there exists $\gamma$ such that the optimal experiment $I(\theta^L)$ is partially informative.

Thus, discriminatory pricing can be profitable even if the two types agree on the default action, as long as the likelihood ratio $\theta^L_i/\theta^H_i$ takes more than two distinct values. With only two states, this is impossible, and congruent priors imply bunching or exclusion (i.e., only the fully informative experiment is sold). With more than two states, the seller can exploit disagreement along any dimension and extract more surplus through a richer menu.

5 Optimal Menu with a Continuum of Types

We now investigate which properties of the optimal menu discussed above extend to the more general setting of a continuum of types. The seller's problem when $N > 2$ is, in general, a complicated multidimensional screening problem.10 We therefore specialize the model to the case of binary action and state, i.e., $|A| = |\Omega| = 2$. Recall our notation for information structures $I(\theta)$ in the binary-state case,
$$I(\theta) = \begin{array}{c|cc}
 & s_1 & s_2 \\ \hline
\omega_1 & \alpha(\theta) & 1 - \alpha(\theta) \\
\omega_2 & 1 - \beta(\theta) & \beta(\theta)
\end{array}$$

10 The linearity introduced by beliefs does not help in this respect. In fact, kinks in the utility function (3) make our setting substantially more difficult than, for example, the bundling models of Manelli and Vincent (2006), Pavlov (2011), or Daskalakis, Deckelbaum, and Tzamos (2014).


and the value of information
$$V(\alpha, \beta, \theta) = \left[(\alpha - \beta)\theta + \beta - \max\{\theta, 1 - \theta\}\right]^+.$$
Because the allocation $(\alpha, \beta)$ interacts with the buyer's type only through the difference $\alpha - \beta$, we define the following one-dimensional allocation rule measuring the relative informativeness of an experiment
$$q(\theta) \triangleq \alpha(\theta) - \beta(\theta) \in [-1, 1].$$
The result in Lemma 2 implies that either $\alpha = 1$ or $\beta = 1$ in any experiment. Therefore, we adopt the convention that $q(\theta) > 0$ implies $\alpha(\theta) = 1$ and $q(\theta) < 0$ implies $\beta(\theta) = 1$, i.e., $\beta = \min\{1, 1 - q\}$. We then express the value of experiment $q \in [-1, 1]$ for type $\theta \in [0, 1]$ as follows:
$$V(q, \theta) = \left[\theta q - \max\{q, 0\} + \min\{\theta, 1 - \theta\}\right]^+. \qquad (14)$$

With this notation, two distinct information structures ($q = -1$ and $q = 1$) correspond to releasing no information to the buyer. (These are the two experiments that show the same signal with probability one.) Conversely, the fully informative experiment is given by $q = 0$.
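The change of variables from $(\alpha, \beta)$ to $q$ can be sanity-checked numerically. The sketch below assumes the convention stated above, $\beta = \min\{1, 1 - q\}$, together with its counterpart $\alpha = \min\{1, 1 + q\}$, which is implied by $q = \alpha - \beta$ but not written out in the text. Function names are illustrative.

```python
from fractions import Fraction as F


def experiment_from_q(q):
    """Recover (alpha, beta) from q = alpha - beta when alpha = 1 or beta = 1."""
    return min(F(1), 1 + q), min(F(1), 1 - q)


def value_ab(alpha, beta, theta):
    """Value of information in (alpha, beta) form."""
    return max((alpha - beta) * theta + beta - max(theta, 1 - theta), F(0))


def value_q(q, theta):
    """Value of information in the one-dimensional form (14)."""
    return max(theta * q - max(q, F(0)) + min(theta, 1 - theta), F(0))


grid = [F(i, 20) for i in range(21)]              # theta in [0, 1]
q_grid = [F(i, 10) for i in range(-10, 11)]       # q in [-1, 1]

# The two formulas agree at every grid point.
consistent = all(
    value_ab(*experiment_from_q(q), t) == value_q(q, t)
    for q in q_grid for t in grid
)
# q = -1 and q = 1 are worthless to every type; q = 0 is worth min{theta, 1 - theta}.
no_information = all(value_q(F(s), t) == 0 for s in (-1, 1) for t in grid)
full_information = all(value_q(F(0), t) == min(t, 1 - t) for t in grid)
print(consistent, no_information, full_information)
```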

The utility function $V(q, \theta)$ has the single-crossing property in $(\theta, q)$. This indicates that higher-$\theta$ buyers, who are relatively more optimistic about state $\omega_1$, assign a relatively higher value to information structures with a higher $q$, because such experiments contain a signal that is more informative about the (less likely) state $\omega_2$.

The formulation in (14) for the value of information reflects the following properties of our screening problem: (a) buyers have type-dependent participation constraints;11 (b) buyer type $\theta = 1/2$ has the highest willingness to pay for any experiment $q$; (c) the experiment $q = 0$ is the most valuable for all types $\theta$; (d) different types $\theta$ assign different ranks to partially informative experiments $q \neq 0$.

Overall, while the seller's problem is reminiscent of classic nonlinear pricing, selling information introduces a novel aspect of horizontal differentiation across buyer types. Furthermore, the "vertical quality" and "horizontal position" of an information product cannot be chosen separately by the seller. In particular, it is not possible to change the relative informativeness of a product (i.e., choose a very high or very low $q$) without reducing its overall informativeness.

11 In general, our setting involves both bunching and shutdown, making it difficult to directly apply existing approaches, such as the one in Jullien (2000).


5.1 Incentive Compatibility

We first simplify the problem by eliminating the non-negativity constraint from (14). To do so, we consider a natural class of allocations.

With an abuse of terminology, we define an allocation $q(\theta)$ to be responsive if, for any $\theta$,
$$\theta q(\theta) - \max\{q(\theta), 0\} + \min\{\theta, 1 - \theta\} \le 0 \;\Longrightarrow\; q(\theta) = \begin{cases} -1 & \text{if } \theta < 1/2, \\ +1 & \text{if } \theta \ge 1/2. \end{cases}$$

Under a responsive allocation, any type $\theta$ who takes the same action following both signal realizations receives one of the two completely uninformative experiments. We then have the following intuitive result.

Lemma 4 (Responsive Allocations) Any optimal allocation $q^*(\theta)$ is responsive.

In other words, if type $\theta$ derives zero value from his experiment $q(\theta)$, then the seller is (weakly) better off by not providing any information, which allows her to relax all other types' incentive constraints.

We now use the structure of the problem in order to derive a characterization of all implementable responsive allocations $q$. We observe that the buyer's utility function has a downward kink in $\theta$. This is a consequence of having an interior type ($\theta = 1/2$) assign the highest value to any allocation, and of the linearity of the buyer's problem. We therefore compute the buyer's rents $U(\theta)$ on $[0, 1/2]$ and $[1/2, 1]$ by applying the envelope theorem to each subinterval separately. We thus obtain two different expressions for the rent of type $\theta = 1/2$. Continuity of the rent function then implies
$$U(1/2) = U(0) + \int_0^{1/2} V_\theta(q, \theta)\, d\theta = U(1) - \int_{1/2}^{1} V_\theta(q, \theta)\, d\theta.$$

Because types $\theta = 0$ and $\theta = 1$ assign zero value to any experiment, incentive compatibility requires $U(0) = U(1)$. (Optimality will later require setting both to zero.) Therefore, while any type's utility can always be written in the above form, the novel element of our model is that no further endogenous variables appear.12 Differentiating (14) with respect to $\theta$ and simplifying, we obtain the following restriction on an incentive-compatible allocation
$$\int_0^{1/2} \left(q(\theta) + 1\right) d\theta = -\int_{1/2}^{1} \left(q(\theta) - 1\right) d\theta. \qquad (15)$$

12 For instance, in Mussa and Rosen (1978) and in Myerson (1981), the rent of the highest type $U(1)$ depends on the entire allocation $q$.


Equation (15) is a new condition that sets our framework apart from most other screening

problems. In particular, incentive constraints impose an aggregate (integral) constraint on

the allocation.13 We formalize this in the following result.

Lemma 5 (Implementable Allocations) Any responsive allocation $q$ is implementable if and only if the following two conditions hold: $q(\theta) \in [-1, 1]$ is non-decreasing; and
$$\int_0^1 q(\theta)\, d\theta = 0. \qquad (16)$$
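For completeness, the step from (15) to (16) is a short computation. The sketch below only rearranges (15), using the fact that each of the two subintervals has length $1/2$:

$$\begin{aligned}
\int_0^{1/2} \left(q(\theta) + 1\right) d\theta &= -\int_{1/2}^{1} \left(q(\theta) - 1\right) d\theta \\
\int_0^{1/2} q(\theta)\, d\theta + \tfrac{1}{2} &= -\int_{1/2}^{1} q(\theta)\, d\theta + \tfrac{1}{2} \\
\int_0^{1} q(\theta)\, d\theta &= 0.
\end{aligned}$$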

The transfers $t(\theta)$ associated with the allocation $q(\theta)$ can be computed in the usual way on the two intervals $[0, 1/2]$ and $[1/2, 1]$ separately. With the addition of the integral constraint (16) for implementability, we can state the seller's problem as follows:
$$\max_{q(\theta)} \int_0^1 \left[\left(\theta - \frac{F(1/2) - F(\theta)}{f(\theta)}\right) q(\theta) - \max\{q(\theta), 0\}\right] dF(\theta), \qquad (17)$$
$$\text{s.t. } q(\theta) \in [-1, 1] \text{ non-decreasing}, \quad \int_0^1 q(\theta)\, d\theta = 0.$$

5.2 Optimal Menu

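In order to solve the seller's problem (17) and characterize the optimal menu, we rewrite the objective with the density $f(\theta)$ explicitly in each term:
$$\int_0^1 \left[\left(\theta f(\theta) + F(\theta)\right) q(\theta) - \max\{q(\theta), 0\}\, f(\theta)\right] d\theta.$$
(The term involving $F(1/2)$ drops out because it multiplies $\int_0^1 q(\theta)\, d\theta$, which is zero by the integral constraint.)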

This minor modification highlights two important features of our problem: (i) the constraint and the objective have generically different weights, $d\theta$ and $dF(\theta)$; and hence (ii) the problem is non separable in the type and the allocation, which interact in two different terms. In particular, consider the "virtual values" $\phi(\theta, q)$, defined as the partial derivative of the integrand with respect to $q$. Unlike in the standard separable models, our virtual values are a function of the allocation. Because our objective is piecewise linear, the function $\phi(\theta, q)$ takes on two values only, i.e.,
$$\phi(\theta, q) := \begin{cases} \theta f(\theta) + F(\theta) & \text{for } q < 0, \\ (\theta - 1) f(\theta) + F(\theta) & \text{for } q > 0. \end{cases}$$
These two values represent the marginal benefit to the seller (ignoring the constraint) of increasing each type's allocation from $-1$ to $0$, and from $0$ to $1$, respectively.

13 In this sense, our model differs from other instances of screening under integral constraints (e.g., constraints on transfers due to budget or enforceability, or capacity constraints). Finally, while the resemblance to a persuasion budget constraint is purely cosmetic, a similar condition does appear in the model of persuasion with private information of Kolotilin, Li, Mylovanov, and Zapechelnyuk (2015).

If ironing à la Myerson is required, we denote the ironed virtual value for experiment $q$ as $\bar{\phi}(\theta, q)$. Finally, we say that the allocation satisfies the pooling property if it is constant on any interval where the ironed virtual value is constant. We can then reduce the seller's problem to the following maximization program.

Proposition 3 (Optimal Allocation Rule) The allocation $q^*(\theta)$ is optimal if and only if there exists $\lambda^* > 0$ such that: (i) for all $\theta$,
$$q^*(\theta) = \arg\max_q \left[\int_{-1}^{q} \left(\bar{\phi}(\theta, x) - \lambda^*\right) dx\right];$$
(ii) $q^*$ has the pooling property; and (iii) $q^*$ satisfies the integral constraint (16).

The solution to the seller's problem is then obtained by combining standard Lagrange methods with the ironing procedure developed by Toikka (2011) that extends the approach of Myerson (1981). In particular, Proposition 3 provides a characterization of the general solution, and suggests an algorithm to compute it.

To gain some intuition for the shape of the solution, observe that the problem is piecewise-linear (but concave) in the allocation. Thus, absent the integral constraint, the seller would choose an allocation that takes values at the kinks, i.e., $q^*(\theta) \in \{-1, 0, 1\}$ for all $\theta$. In other words, the seller would offer a one-experiment menu consisting of a flat price for the complete-information structure. It will indeed be optimal for the seller to adopt flat pricing in a number of circumstances. The main novel result of this section is that the seller can (sometimes) do better by offering one additional experiment.

Proposition 4 (Optimal Menu) An optimal menu consists of at most two experiments.

1. The first experiment is fully informative.

2. The second experiment (contains a signal that) perfectly reveals one state.

To obtain some intuition for this result, consider a relaxed problem where the seller contracts separately with buyers $\theta < 1/2$ and $\theta > 1/2$. Because of the linearity of the problem, the optimal menu for each group is degenerate: the seller offers the perfectly revealing experiment at a flat price. Now consider the solution to the full problem. If the optimal prices for each separate group are similar, the seller offers the fully revealing experiment at an intermediate price. If they are quite different, the seller prefers to distort the information sold to one group, so as to maintain a high price for the fully revealing experiment. However, the linearity of the environment prevents the seller from offering more than one distorted experiment, i.e., no further versioning is optimal.

We now illustrate the optimal menu under flat and discriminatory pricing separately.

5.2.1 Flat Pricing

Let types be uniformly distributed, $F(\theta) = \theta$, and consider the virtual values $\phi(\theta, q)$ for $q < 0$ and $q \ge 0$ separately. These values are constant in $q$, hence we refer to $\phi(\theta, -1)$ and $\phi(\theta, 1)$ respectively. For a given value of the multiplier $\lambda$, the allocation that maximizes the expected virtual surplus in Proposition 3 assigns $q^*(\theta) = -1$ to all types $\theta$ for which $\phi(\theta, -1) < \lambda$; it assigns $q^*(\theta) = 0$ to all types $\theta$ for which $\phi(\theta, -1) > \lambda > \phi(\theta, 1)$; and $q^*(\theta) = 1$ for all types $\theta$ for which $\phi(\theta, 1) > \lambda$.

Figure 4: Uniform Distribution: Virtual Values and Optimal Allocation

Figure 4 illustrates the resulting allocation rule. In order to satisfy the constraint, the optimal value of the multiplier $\lambda^*$ must identify two symmetric threshold types $(\theta_1, \theta_2)$ that separate types receiving the efficient allocation $q = 0$ from those receiving no information at all, $q = -1$ or $q = 1$. The allocation then clearly satisfies the integral constraint (16). More generally, if both virtual values $\phi(\theta, -1)$ and $\phi(\theta, 1)$ are strictly increasing in $\theta$, the optimal menu consists of charging the monopoly price for the fully informative experiment.
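For the uniform case the multiplier can be found by a simple search. The sketch below assumes the virtual values $\phi(\theta, -1) = 2\theta$ and $\phi(\theta, 1) = 2\theta - 1$ implied by $F(\theta) = \theta$, so that the two thresholds are $\theta_1 = \lambda/2$ and $\theta_2 = (\lambda + 1)/2$. It then bisects on $\lambda$ until the integral constraint (16) holds. The function names and the bisection routine are illustrative and are not the paper's algorithm.

```python
def thresholds(lam):
    """Threshold types for the uniform distribution, clipped to [0, 1]."""
    theta_1 = min(max(lam / 2, 0.0), 1.0)         # below theta_1: q = -1
    theta_2 = min(max((lam + 1) / 2, 0.0), 1.0)   # above theta_2: q = +1
    return theta_1, theta_2


def integral_of_q(lam):
    """Integral of q over [0, 1]: mass at q = +1 minus mass at q = -1."""
    theta_1, theta_2 = thresholds(lam)
    return (1.0 - theta_2) - theta_1


def solve_multiplier(lo=0.0, hi=1.0, tol=1e-12):
    """Bisection on lambda; integral_of_q is decreasing in lambda."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if integral_of_q(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


lam_star = solve_multiplier()
theta_1, theta_2 = thresholds(lam_star)
# The marginal buyer theta_1 pays his full value min{theta, 1 - theta}.
flat_price = min(theta_1, 1 - theta_1)
print(lam_star, theta_1, theta_2, flat_price)
```

Under these assumptions the search returns thresholds close to $1/4$ and $3/4$, which matches the price $p = 1/4$ that maximizes $p(1 - 2p)$, the revenue from selling the fully informative experiment to all types with $\min\{\theta, 1 - \theta\} \ge p$.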

Flat pricing is optimal under weaker conditions than strictly increasing virtual values.

We now summarize the sufficient conditions for this result.


Proposition 5 (Flat Pricing) Suppose any of the following conditions hold:

1. $F(\theta) + \theta f(\theta)$ and $F(\theta) + (\theta - 1) f(\theta)$ are strictly increasing;

2. the density $f(\theta) = 0$ for all $\theta > 1/2$ or $\theta < 1/2$;

3. the density $f(\theta)$ is symmetric around $\theta = 1/2$.

Then the optimal menu contains only the fully informative experiment.

An implication of Proposition 5 is that the seller offers a second experiment only if ironing is required. At the same time, there exist examples with non-monotone virtual values and one-item menus. Symmetric distributions are one such instance: for any distribution function $F(\theta)$ (whatever its hazard rate, for example), the solution to the restricted problem on $[0, 1/2]$ or $[1/2, 1]$ is a cutoff policy. Because the cutoffs under a symmetric distribution are symmetric about $1/2$, it follows that the solutions to the two subproblems satisfy the integral constraint, and hence provide a tight upper bound to the seller's profits.

5.2.2 Discriminatory Pricing

The monotonicity conditions of Proposition 5 that guarantee increasing virtual values are

not entirely appealing in our context. When types correspond to interim beliefs, it is quite

natural to consider bimodal densities (e.g., a well-informed population in a binary model)

that fail the regularity conditions, and introduce the need for ironing. For example, starting

from the common prior, if buyers observe binary signals, a bimodal distribution of beliefs

would result with types holding beliefs above and below the mean of the common prior.

In general, non monotone densities and distributions violating the standard monotonicity

conditions are a quite natural benchmark. Therefore, ironing is not a technical curiosity in

our case, but rather a technique that becomes unavoidable because of the features of the

information environment.

We now illustrate the ironing procedure when virtual values are not monotone, and how

it leads to a richer (two-item) optimal menu. Figure 5 considers a bimodal distribution of

types, illustrating the probability density function and the associated virtual values.14

We therefore consider the "ironed" versions of each virtual value, and we identify the equilibrium value of the multiplier $\lambda^*$. In this case, the multiplier must be at the flat level of one of the virtual values: suppose not, apply the procedure from the regular case, and verify that it is impossible to satisfy the integral constraint.

14 In particular, the distribution in the left panel is a mixture (with equal weights) of two Beta distributions with parameters $(8, 30)$ and $(60, 30)$.


Figure 5: Probability Density Function and Virtual Values

Figure 6 illustrates the optimal two-item menu. Note that for types in the "pooling" region $\theta \in [0.17, 0.55]$, the level of the allocation ($q^* \approx -0.22$) is uniquely pinned down by the pooling property and by the integral constraint.

Figure 6: Ironed Virtual Values and Optimal Allocation

In both examples, extreme types with a low value of information are excluded from the purchase of informative signals. In the latter example, the monopolist is offering a second information structure that is tailored towards relatively lower types. This structure (with $q < 0$) contains one signal that perfectly reveals the high state. This experiment is relatively unattractive for higher types, and it allows the monopolist to increase the price for the large mass of types located around $\theta \approx 0.7$.

The properties of the optimal discriminatory pricing scheme reflect the fact that the type structure is quite different from the standard screening environment. While information rents $U(\theta)$ peak at $\theta = 1/2$, the ex ante least informed type $\theta = 1/2$ need not purchase the fully revealing experiment, despite having the highest value of information.15 In the example above, inducing the types around $\theta = 1/2$ to purchase the fully informative experiment requires imposing further distortions and charging a lower price for the second experiment. This leads to a loss of revenue from types around $\theta \approx 0.2$. Because such types are quite frequent, this loss more than offsets the gain in revenue from types around $1/2$. In other words, whenever discriminatory pricing is optimal, the menu depends on the distribution of types in a rich way.

15 This possibility can also be shown in three-type examples with arbitrary state spaces.

5.2.3 Beyond Binary States

As mentioned, the analysis of the optimal menu with $N$ states and actions involves all the difficulties associated with multidimensional screening. Nevertheless, the construction of the optimal menu under discriminatory pricing extends to multidimensional settings. We develop an example that preserves the single-dimensional nature of the screening problem, while illustrating how the seller can design richer experiments.

In particular, consider a setting with three states and actions, where types $\theta$ are distributed along the rays of the two-dimensional simplex. If types are uniformly distributed on the rays, it is immediate to see that the flat pricing menu is optimal. Figure 7 (left) shows the set of participating (solid) and excluded (dashed) buyers when the fully informative experiment is offered at the monopoly price.

However, if the distribution of types differs across rays, a richer menu may be more profitable. In particular, let types along one ray be distributed according to a mixture of Beta distributions (as in Figure 5), maintaining the uniform distribution along the other two rays. Then discriminatory pricing is optimal, as shown in Figure 7 (right).

Figure 7: Optimal Allocations under Uniform and Beta Distributions

In particular, the seller offers the fully revealing experiment to all types closest to the centroid, and a distorted experiment to a subset of types on the ray through the vertex $\omega_1$. This experiment is given by

$$
\begin{array}{c|ccc}
I_1 & s_1 & s_2 & s_3 \\ \hline
\omega_1 & 1 & 0 & 0 \\
\omega_2 & 0.35 & 0.65 & 0 \\
\omega_3 & 0.35 & 0 & 0.65
\end{array}
$$

and perfectly reveals states $\omega_2$ and $\omega_3$, but leads buyers to take action $a_1$ more often than would be optimal.
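To make the distortion concrete, the following sketch computes the value of the experiment $I_1$ for two illustrative priors. It assumes the matching-payoff decision problem used throughout (the buyer earns 1 when the action matches the state and 0 otherwise); the function name `value_of_experiment` and the two priors `centroid` and `tilted` are hypothetical choices made for illustration and are not taken from the example in Figure 7.

```python
from fractions import Fraction as Fr


def value_of_experiment(pi, theta):
    """Net value of experiment `pi` for a buyer with prior `theta`.

    pi[i][j] is the probability of signal j in state i. Assumption: the buyer
    earns 1 when the action matches the state and 0 otherwise.
    """
    n_states = len(theta)
    n_signals = len(pi[0])
    # After signal j the buyer bets on the state with the largest joint weight.
    with_info = sum(
        max(theta[i] * pi[i][j] for i in range(n_states))
        for j in range(n_signals)
    )
    without_info = max(theta)  # best guess using the prior alone
    return max(with_info - without_info, Fr(0))


# The distorted experiment from the text and the fully revealing benchmark.
I1 = [
    [Fr(1), Fr(0), Fr(0)],
    [Fr(35, 100), Fr(65, 100), Fr(0)],
    [Fr(35, 100), Fr(0), Fr(65, 100)],
]
FULL = [[Fr(int(i == j)) for j in range(3)] for i in range(3)]

centroid = [Fr(1, 3)] * 3             # hypothetical type at the centroid
tilted = [Fr(1, 2), Fr(1, 4), Fr(1, 4)]  # hypothetical type leaning toward omega_1

for name, theta in [("centroid", centroid), ("tilted", tilted)]:
    loss = value_of_experiment(FULL, theta) - value_of_experiment(I1, theta)
    print(name, value_of_experiment(I1, theta), "loss vs. full:", loss)
```

Under these assumptions the type leaning toward $\omega_1$ loses less from the distortion (7/40) than the centroid type does (7/30), which is consistent with the distorted experiment being targeted at types on the ray through $\omega_1$.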

6 Implications for observable variables

The solution of the screening problem characterized in the previous sections is related to the practical design of information products by a monopolist. Indeed, our seller's problem is to degrade the quality of her information and to set the prices of various versions in order to segment the population of buyers.16

Our results describe the optimal patterns for the degrading of the information offered by a monopolist. In other words, one should not observe arbitrarily damaged goods. Instead, the optimal menu of products contains "directionally" informative signals. In the special case of binary actions, one should only observe type-I or type-II errors. Furthermore, absent costs of transmitting information or other sources of nonlinearity (e.g., buyers' risk aversion), one should not observe multiple distortions of the same kind, either.

While seemingly abstract patterns, these results can inform the design of information products offered by several large data brokers. For example, in the retargeting application, Blackwell experiments correspond to "data packages" (contracts in which the seller of information commits to identifying consumers with given attributes) and individual signals correspond to the realized values of these attributes.17 In this context, acquiring a data package that reveals coarse information about the consumer's attributes will inevitably lead to both type-I and type-II errors in the targeting of an audience. The optimality of directional information distortions then suggests limiting information disclosure (i.e., package design) to attributes correlated with very high- or low-value customers only.

Menu pricing is widely used among data brokers. In particular, as one would expect, most data sellers seek to offer higher-quality, high-markup products, in addition to partially informative data packages. For instance, one may consider data management platforms (DMPs), i.e., customized software that automates the integration of 1st- and 3rd-party data, enabling websites to track their users and place them into more precise market segments, as information products aimed at high-end buyers. Indeed, DMPs promise to use all and only the relevant data to guide actions, while advertising campaigns run on the basis of limited targeting criteria provide a low-cost option.18

16 Our setting differs from the "damaged goods" model of Deneckere and McAfee (1996) because degrading quality is costless. In many applications, however, it is naturally costly to introduce noise in the data. This is the case, for instance, when the seller is concerned with preserving the anonymity of her information.

17 For example, see http://www.acxiom.com/data-packages/ for an exhaustive list of the packages sold by the data broker Acxiom.

Menu pricing for information is also used by LinkedIn, which sells access to member profiles in two versions: a Lite version that allows one to condition searches on a limited number of criteria; and a Premium version that grants access to the entire database. Finally, menus of information products are also offered in contexts where private beliefs constitute a less relevant dimension of buyer heterogeneity. For example, buyers of time-sensitive data can be screened according to their value for timely information, i.e., their discount rates. Consequently, sellers such as Bloomberg or Thomson-Reuters offer menus of essentially homogeneous products that differ only in the timing of their availability.19

Returning to online advertising markets, information is, of course, also sold in an indirect way. For instance, large sellers of online space effectively bundle data and advertising products by offering a menu of targeting opportunities. This raises the question of richer contract spaces. For example, a large seller of advertising, such as Google or Yahoo!, may charge different prices for its space as a function of the information requested by the buyer (i.e., the desired degree of targeting). In this sense, our approach is better suited to analyzing a data broker's pricing problem when merchants wish to buy retargeted "advertising products" from a different seller. More generally, our results apply to any transaction where the advertising space is priced separately from the information required to better allocate that space.

7 Conclusions

We have examined the problem of a monopolist who sells incremental information to privately informed buyers. We have deliberately focused on the "packaging" or versioning problem of a seller who is (a) uninterested in the buyer's actions, and (b) free to acquire and degrade information. Fundamental to the seller's incentives to degrade information is the Bayesian nature of the buyer's problem: on the one hand, private beliefs widen the scope for price discrimination through directional information; on the other hand, the linearity of the buyer's utility function limits the use of versioning.

18 Acxiom offers "Data Integration" services in multiple plans, and Oracle acquired the Bluekai Data Management Platform in 2014. For a detailed description of DMPs and a comparison with direct sales of data, see "Who Do Advertisers Think You Are?", The New York Times, November 30, 2012.

19 For example, the Consumer Sentiment Index released by Thomson-Reuters and the University of Michigan, and PAWWS Financial Network's portfolio accounting system (Shapiro and Varian, 1999) were initially available in different versions, based on the timing of their release.

In general, the seller's problem consists of screening within and across groups of buyers with congruent priors. When states and actions are binary, the optimal mechanism involves at most two experiments, and we obtained sufficient conditions for flat pricing to be optimal. With arbitrary states and actions, a two-type example illustrates the seller's ability to exploit relative differences in the buyers' beliefs to reduce information rents while limiting the surplus that must be sacrificed to provide incentives.

These results provide only a first pass at understanding the trade-offs involved in screening through information products. While we have relied on belief heterogeneity to motivate the sale of incremental information, there may be several sources of heterogeneous demand for information. Buyers may differ in their payoffs from taking each action, in the relevance of their decision (i.e., their "stakes"), or in their ability to process additional signals. Each of these extensions can be implemented in the framework we have outlined. Combining different sources of heterogeneity (e.g., beliefs and stakes) appears more challenging, but promises deeper insights into the optimal allocation of information. Further interesting extensions include the role of frictions in the acquisition and transmission of information, and the effect of competition among sellers of information (i.e., formalizing the intuition that each seller will be able to extract the surplus related to the innovation element of his database).

Appendix

Proof of Lemma 2. The argument for part (1.) is given in the text. For part (2.), consider any individually rational and incentive compatible direct mechanism $\mathcal{M} = \{I(\theta), t(\theta)\}$. Fix a type $\theta$ with the associated experiment $I(\theta)$ and let $\pi_{ij}(\theta)$ denote the conditional probability of signal $s_j$ in state $\omega_i$ under experiment $I(\theta)$. A consequence of Lemma 1 is that each type $\theta$ has a different optimal action for each signal in $I(\theta)$. Without loss of generality, we then arrange the signals in $I(\theta)$ so that type $\theta$ takes action $a_i$ when observing $s_i$. If type $\theta$ never takes action $a_i$, we drop signal $s_i$ from $I(\theta)$, i.e., we set the $i$-th column of $\pi(\theta)$ to zero. Because beliefs satisfy $\sum_i \theta_i = 1$, we can write the value of information (3) as

$$
V(I(\theta), \theta) = \Big[\sum_{i=1}^{N} \theta_i \pi_{ii}(\theta) - \max_i \theta_i\Big]^+ = \Big[\sum_{i=1}^{N-1} \theta_i \big(\pi_{ii}(\theta) - \pi_{NN}(\theta)\big) + \pi_{NN}(\theta) - \max_i \theta_i\Big]^+. \tag{18}
$$

Now define $\varepsilon(\theta) := 1 - \max_i \pi_{ii}(\theta)$, and construct a new experiment $I'(\theta)$ where $\pi'_{ii}(\theta) = \pi_{ii}(\theta) + \varepsilon(\theta)$ for all $i$ and for all $\theta$. For each state $\omega_i$, the corresponding off-diagonal entries $\pi_{ij}(\theta)$, $j \neq i$, are correspondingly reduced by $\varepsilon(\theta)$ in total, without further restrictions on how this operation is performed. It then follows from (18) that

$$
[V(I'(\theta), \theta) - \varepsilon(\theta)]^+ = V(I(\theta), \theta).
$$

Furthermore, for all types $\theta' \neq \theta$, the value of mimicking type $\theta$ increases by at most $\varepsilon(\theta)$ (strictly less, if type $\theta' \neq \theta$ responds to the signals in $I(\theta)$ differently from type $\theta$). Suppose type $\theta'$ chooses action $a_{i(j)}$ upon observing signal $s_j$ from experiment $I(\theta)$. We then have

$$
V(I(\theta), \theta') = \Big[\sum_{j=1}^{N} \theta'_{i(j)} \pi_{i(j)j}(\theta) - \max_i \theta'_i\Big]^+.
$$

Furthermore, because $i(j)$ need not coincide with $j$, the entries $\pi_{i(j)j}$ in $V(I(\theta), \theta')$ increase at most by $\varepsilon(\theta)$. Therefore, we have

$$
[V(I'(\theta), \theta') - \varepsilon(\theta)]^+ \leq V(I(\theta), \theta').
$$

Consequently, the direct mechanism $\mathcal{M}' = \{I'(\theta), t(\theta) + \varepsilon(\theta)\}$ is also individually rational and incentive compatible. Moreover, all experiments $I'(\theta)$ are locally non-dispersed by construction, and all transfers are weakly greater than in the original mechanism $\mathcal{M}$. $\blacksquare$
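The construction in this proof can be checked numerically. The sketch below assumes the matching-payoff value of information in (18), picks a hypothetical three-state experiment, and spreads the reduction of the off-diagonal entries proportionally (one of many admissible ways, since the proof leaves this free). The names `value`, `concentrate`, `own` and `other` are illustrative only.

```python
from fractions import Fraction as Fr


def value(pi, theta):
    """Value of information when the buyer best-responds to each signal."""
    n = len(theta)
    with_info = sum(max(theta[i] * pi[i][j] for i in range(n)) for j in range(n))
    return max(with_info - max(theta), Fr(0))


def concentrate(pi):
    """Raise every diagonal entry by eps = 1 - max_i pi_ii.

    Off-diagonal entries in each row are scaled down proportionally so that
    every row still sums to one (an assumed, not unique, way to do it).
    """
    n = len(pi)
    eps = 1 - max(pi[i][i] for i in range(n))
    new = []
    for i in range(n):
        off_mass = 1 - pi[i][i]
        scale = (off_mass - eps) / off_mass if off_mass > 0 else Fr(0)
        new.append([pi[i][j] + eps if j == i else pi[i][j] * scale
                    for j in range(n)])
    return new, eps


# Hypothetical dispersed experiment: no diagonal entry equals one.
pi = [
    [Fr(8, 10), Fr(1, 10), Fr(1, 10)],
    [Fr(2, 10), Fr(7, 10), Fr(1, 10)],
    [Fr(1, 10), Fr(2, 10), Fr(7, 10)],
]
new_pi, eps = concentrate(pi)

own = [Fr(3, 10), Fr(3, 10), Fr(4, 10)]    # type that follows every signal
other = [Fr(8, 10), Fr(1, 10), Fr(1, 10)]  # type that ignores signals 2 and 3

print("eps =", eps)
print("own type:  ", value(pi, own), "->", value(new_pi, own))
print("other type:", value(pi, other), "->", value(new_pi, other))
```

In this example the intended buyer's value rises by exactly $\varepsilon$, so raising the transfer by $\varepsilon$ leaves her rent unchanged, while the deviating type gains strictly less than $\varepsilon$ and therefore finds mimicking less attractive.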


Proof of Lemma 3. We know from Lemma 2 that at least one type must buy the fully revealing experiment $I^*$. Suppose only type $\theta^L$ buys $I^*$ as part of the optimal menu. Then the price of $I^*$ is at most $V(I^*, \theta^L)$. By incentive compatibility, if the high type $\theta^H$ purchases $I \neq I^*$, it must be that $t(\theta^H) < V(I^*, \theta^L)$. Therefore, eliminating the experiment $I(\theta^H)$ from the menu strictly improves the seller's profits. $\blacksquare$

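Proof of Proposition 2. We know from Lemma 3 that the high type $\theta^H$ purchases the fully informative experiment. We now derive the optimal experiment $I(\theta^L)$. Suppose (as we later verify) that both types $\theta^H$ and $\theta^L$ would choose action $a_i$ after observing signal $s_i$ from the optimal experiment $I(\theta^L)$. The seller then chooses $\pi_{ii} \in [0, 1]$, $i = 1, \ldots, n$, to solve the linear program (7) subject to (5). We can write the Lagrangian as

$$
L = (1 - \gamma)\Big[\sum_{i=1}^{n} \pi_{ii}\big(\theta_i^L + \lambda(\theta_i^L - \theta_i^H)\big) - \max_i \theta_i^L + \lambda \max_i \theta_i^H + \lambda U(\theta^H)\Big] - \gamma U(\theta^H),
$$

where $\lambda \geq 0$ represents the shadow value of increasing the informational rent $U(\theta^H)$. It follows immediately that

$$
\pi_{ii} =
\begin{cases}
0 & \text{if } \theta_i^L + \lambda(\theta_i^L - \theta_i^H) < 0, \\
\in [0, 1] & \text{if } \theta_i^L + \lambda(\theta_i^L - \theta_i^H) = 0, \\
1 & \text{if } \theta_i^L + \lambda(\theta_i^L - \theta_i^H) > 0,
\end{cases}
\tag{19}
$$

and that

$$
U(\theta^H) =
\begin{cases}
\sum_{i=1}^{n} (\theta_i^H - \theta_i^L)\pi_{ii} + \max_i \theta_i^L - \max_i \theta_i^H & \text{if } \gamma < \lambda(1 - \gamma), \\
0 & \text{if } \gamma \geq \lambda(1 - \gamma).
\end{cases}
\tag{20}
$$

To characterize the solution, we need only identify the value of the multiplier $\lambda^*$. We proceed in the following steps.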

1. Recall states $\omega_i$ are ordered by increasing likelihood ratio of beliefs $\theta_i^L / \theta_i^H$, and consider state $\omega_1$. If $\omega_1$ is the only state $i$ for which $\pi_{ii} < 1$, (19) implies that the multiplier is given by

$$
\lambda^* = \min_i \frac{1}{\theta_i^H / \theta_i^L + 1} = \frac{\theta_1^L}{\theta_1^L + \theta_1^H}. \tag{21}
$$

Therefore, $\pi_{11}$ solves (5) with equality, i.e., the candidate optimal value is $\pi_{11} = k_1$ as defined in (11). If $k_1 \geq 0$ and $\theta_1^L / \theta_1^H < \gamma$, then (20) implies $U(\theta^H) = 0$ and (19) implies we must set $\pi_{ii} = 1$ for all $i \geq 2$. If instead $\gamma < \theta_1^L / \theta_1^H$, then $\pi_{11} = 1$ and $U(\theta^H) > 0$ solves (5), i.e., $U(\theta^H) = \max_i \theta_i^L - \max_i \theta_i^H$. Finally, if $k_1 < 0$ and $\theta_1^L / \theta_1^H < \gamma$, there is no solution with the value of $\lambda^*$ given in (21).

2. Suppose the above candidate solution is not, in fact, the optimum. The value of the multiplier must then satisfy $\lambda^* > \theta_1^L / (\theta_1^L + \theta_1^H)$, which implies the non-negativity constraint binds, i.e., $\pi_{11} = 0$. This means the optimal partially informative experiment has less than full rank. The next candidate value for the multiplier $\lambda^*$ is $\theta_2^L / (\theta_2^L + \theta_2^H)$. Again, if $k_2 \geq 0$ and $\theta_2^L / \theta_2^H < \gamma$, then (20) implies $U(\theta^H) = 0$ and (19) implies $\pi_{ii} = 1$ for all $i \geq 3$. If instead $\gamma < \theta_2^L / \theta_2^H$, then $\pi_{22} = 1$ and $U(\theta^H) > 0$ solves (5) with equality. Finally, if $k_2 < 0$ and $\theta_2^L / \theta_2^H < \gamma$, there is no solution with the value of $\lambda^* = \theta_2^L / (\theta_2^L + \theta_2^H)$ and the next candidate is $\lambda^* = \theta_3^L / (\theta_3^L + \theta_3^H)$, which implies $\pi_{11} = \pi_{22} = 0$.

3. The procedure iterates until either (a) state $i^*$ is reached, i.e., $\theta_i^L / \theta_i^H > \gamma$, or (b) state $j^*$ is reached, i.e., $k_j \geq 0$.

4. We must then verify that the off-diagonal entries $\pi_{ij}$, $i \neq j$, can be set to ensure both types $\theta^H$ and $\theta^L$ choose action $a_i$ when observing signal $s_i$. This requires

$$
\pi_{ii}\theta_i \geq \pi_{ji}\theta_j
$$

for both $\theta^L$ and $\theta^H$ and for all $j < i$, because the signal matrix can be taken to be lower triangular. Fix a signal $i$ and an alternative action $a_j$. We need

$$
\frac{\theta_i}{\theta_j} \geq \frac{\pi_{ji}}{\pi_{ii}}.
$$

By construction we have $\theta_i^H / \theta_j^H < \theta_i^L / \theta_j^L$ for all $i > j$, hence, we need only worry about the incentives of type $\theta^H$. We can then assign the probabilities $\pi_{ji}$ following the procedure described in Algorithm 1: for any $i' > i$, we make type $\theta^H$ indifferent between following the recommendation of signal $i'$ and choosing action $a_i$; we do so beginning with $\pi_{jn}$ and proceeding backward as long as required; if the procedure assigns positive weight to $\pi_{ji}$ then it must be that $\pi_{ji} = 1 - \sum_{s=i+1}^{n} \theta_s^H / \theta_j^H$. It follows that type $\theta^H$ has strict incentives to follow the recommendation of signal $i$. Note that $\pi_{ii} \in \{k_i, 1\}$, hence, if $\pi_{ii} = k_i$, we have by definition

$$
k_i \theta_i^H = \frac{\sum_{s=i+1}^{n} (\theta_s^L - \theta_s^H) - \max_j \theta_j^L + \max_j \theta_j^H}{\theta_i^H - \theta_i^L}\, \theta_i^H.
$$

Now, because $\arg\max_j \theta_j^L > i$ we can bound the above expression by

$$
k_i \theta_i^H > \frac{\theta_i^H}{\theta_i^H - \theta_i^L}\Big(\theta_j^H - \sum_{s=i+1}^{n} \theta_s^H\Big) > \theta_j^H - \sum_{s=i+1}^{n} \theta_s^H = \theta_j^H \pi_{ji}.
$$

A fortiori, type $\theta^H$ has strict incentives to choose $a_i$ if $\pi_{ii} = 1$.

5. Finally, the seller may choose $\pi_{ij}$ such that the high type does not follow every recommendation. However, it is easy to see that this is not optimal. By revealed preference, the high type would obtain a higher utility than under the above construction, for every experiment $\pi$. Thus, the incentive compatibility constraint becomes harder to satisfy, and the value of the seller's problem decreases.

This ends the proof. $\blacksquare$

Algorithm 1 (Optimal Menu with Binary Type) We construct the two optimal experiments as a function of the distribution of types. This algorithm contains three sub-routines, beginning with the construction of the allocation described in part (1.) of Proposition 2. We refer to this allocation as the Maximally Informative No-Rents Experiment.

Maximally Informative No-Rents Experiment. Order states $i$ in increasing order of the likelihood ratios $\theta_i^L / \theta_i^H$ and set $U(\theta^H) = 0$. Let $\pi_{ii} = 1$ for $i = 2, \ldots, n$ and solve (8) with equality with respect to $\pi_{11}$. The solution is given by $k_1$ in (11). If $k_1 \geq 0$, stop. If $k_1 < 0$, set $\pi_{11} = 0$ and $\pi_{ii} = 1$ for $i = 3, \ldots, n$. Solve (8) with equality with respect to $\pi_{22}$, which yields $k_2$. If $k_2 \geq 0$, stop, otherwise iterate the procedure. The procedure terminates at state $j^*$ defined in (12).
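The first sub-routine can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the authors' implementation: the cutoff $k_i$ is computed from the zero-rent condition, i.e., by setting $\pi_{ss} = 0$ for $s < i$ and $\pi_{ss} = 1$ for $s > i$ in the rent expression (20) and solving for $\pi_{ii}$, which agrees with the expression for $k_i \theta_i^H$ in step 4 of the proof of Proposition 2. Definitions (11) and (12) are not reproduced in this section, and the function names and the two pairs of beliefs are hypothetical.

```python
from fractions import Fraction as Fr


def no_rents_diagonal(theta_L, theta_H):
    """Diagonal of the maximally informative no-rents experiment.

    Assumes states are already ordered by increasing theta_L[i] / theta_H[i]
    and that theta_H[i] > theta_L[i] for every state the loop visits.
    Returns (diagonal entries, 0-based index of the stopping state).
    """
    n = len(theta_L)
    gap = max(theta_L) - max(theta_H)
    diag = [Fr(1)] * n
    for i in range(n):
        tail = sum(theta_L[s] - theta_H[s] for s in range(i + 1, n))
        k = (tail - gap) / (theta_H[i] - theta_L[i])  # zero-rent value of pi_ii
        if k >= 0:
            diag[i] = k
            return diag, i
        diag[i] = Fr(0)  # non-negativity binds, move to the next state
    raise ValueError("no state with a non-negative cutoff was found")


def high_type_rent(diag, theta_L, theta_H):
    """Rent of the high type when both types follow every signal, as in (20)."""
    n = len(diag)
    return (sum((theta_H[i] - theta_L[i]) * diag[i] for i in range(n))
            + max(theta_L) - max(theta_H))


# Hypothetical beliefs; the first example stops at state 1, the second at state 2.
L1, H1 = [Fr(1, 10), Fr(2, 10), Fr(7, 10)], [Fr(5, 10), Fr(3, 10), Fr(2, 10)]
L2, H2 = [Fr(1, 20), Fr(5, 20), Fr(14, 20)], [Fr(6, 20), Fr(8, 20), Fr(6, 20)]

diag1, stop1 = no_rents_diagonal(L1, H1)
diag2, stop2 = no_rents_diagonal(L2, H2)
print(diag1, stop1, high_type_rent(diag1, L1, H1))
print(diag2, stop2, high_type_rent(diag2, L2, H2))
```

In both examples the resulting diagonal leaves the high type with zero rent, which is the defining property of this experiment; the sketch does not assign the off-diagonal entries.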

We now use the distribution of types to identify which step of the construction yields the profit-maximizing experiment.

Optimal Experiment. Begin at state $j^*$. If $\gamma > \theta_{j^*}^L / \theta_{j^*}^H$, the Maximally Informative No-Rents Experiment is part of the optimal menu. If $\gamma < \theta_{j^*}^L / \theta_{j^*}^H$, set $\pi_{j^* j^*} = 1$ and consider $j^* - 1$. If $\gamma > \theta_{j^*-1}^L / \theta_{j^*-1}^H$, stop, and choose $U(\theta^H) > 0$ to satisfy (8) with equality. Otherwise set $\pi_{j^*-1, j^*-1} = 1$ and consider $j^* - 2$, iterating until reaching state $i^*$ defined in (10) and adjusting the rent $U(\theta^H)$ to satisfy the incentive constraint.

Thus, as the fraction of high types increases, the optimal menu involves potentially steeper distortions. Conversely, when $i^* = 1$, the menu involves bunching both types at the fully revealing experiment. Finally, we illustrate a simple procedure to assign the off-diagonal probabilities to the partially informative experiment.

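Off-Diagonal Entries. Suppose the optimal menu sets $\pi_{ii} = 0$ for all $i < \hat{\imath}$, for some $\hat{\imath} > 1$. To assign the off-diagonal entries $\pi_{i,j}$ with $i < \hat{\imath}$ and $j \geq \hat{\imath}$, fix a state $i < \hat{\imath}$ and begin with signal $n$. Let $\pi_{in} = \min\{\theta_n^H / \theta_i^H, 1\}$. If $\theta_n^H / \theta_i^H > 1$, stop. If $\theta_n^H / \theta_i^H < 1$, set $\pi_{i,n-1} = \min\{\theta_{n-1}^H / \theta_i^H, 1 - \theta_n^H / \theta_i^H\}$, and proceed backwards until reaching $\pi_{i,i+1}$. The entries so constructed sum to one because by the definition of $k_j$ in (11), we have

$$
\theta_{\hat{\imath}}^H > k_{\hat{\imath}}\, \theta_{\hat{\imath}}^H > \theta_i^H \Big(1 - \sum_{j=\hat{\imath}+1}^{n} \theta_j^H / \theta_i^H\Big).
$$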

Proof of Lemma 4. Suppose the allocation $q(\theta)$ is optimal, and that, for some type $\theta$, the expression in brackets on the right-hand side of (14) is negative, i.e.,

$$
\theta q(\theta) - \max\{q(\theta), 0\} + \min\{\theta, 1 - \theta\} \leq 0.
$$

Then we can replace $q(\theta)$ with $\tilde{q}(\theta) = -1$ if $\theta < 1/2$ and with $\tilde{q}(\theta) = 1$ if $\theta \geq 1/2$. This does not change the value of the allocation for type $\theta$ and the price the seller can charge, which are both zero. However, the experiment $\tilde{q}(\theta)$ is less informative, and hence relaxes incentive constraints for all $\theta' \neq \theta$. $\blacksquare$

Proof of Lemma 5. We begin with necessity. Consider an incentive compatible allocation $q$. For any two types $\theta_2 > \theta_1$ we have

$$
\begin{aligned}
V(q_1, \theta_1) - t_1 &\geq V(q_2, \theta_1) - t_2, \\
V(q_2, \theta_2) - t_2 &\geq V(q_1, \theta_2) - t_1, \\
V(q_2, \theta_2) - V(q_1, \theta_2) &\geq t_2 - t_1 \geq V(q_2, \theta_1) - V(q_1, \theta_1).
\end{aligned}
$$

It follows from the single-crossing property of $V(q, \theta)$ that $q_2 \geq q_1$, hence $q(\theta)$ is increasing. Because the buyer's rent is differentiable with respect to $\theta$ on $[0, 1/2]$ and $[1/2, 1]$ respectively, we can compute the function $U(\theta)$ on these two intervals separately. We obtain the expression in the text,

$$
U(1/2) = U(0) + \int_0^{1/2} V_\theta(q, \theta)\, d\theta = U(1) - \int_{1/2}^{1} V_\theta(q, \theta)\, d\theta.
$$

By the envelope theorem $V_\theta(q, \theta) = q + 1$ for $\theta < 1/2$ and $V_\theta(q, \theta) = q - 1$ for $\theta > 1/2$. Finally, because $U(0) = U(1)$ we obtain

$$
\int_0^1 q(\theta)\, d\theta = 0.
$$

We now turn to sufficiency. Suppose the allocation $q$ is increasing and satisfies the integral constraint. Then construct the following transfers

$$
t(\theta) =
\begin{cases}
\theta q(\theta) - \max\{q(\theta), 0\} - \int_0^{\theta} q(x)\, dx & \text{if } \theta < 1/2, \\
\theta q(\theta) - \max\{q(\theta), 0\} + \int_{\theta}^{1} q(x)\, dx & \text{if } \theta \geq 1/2.
\end{cases}
\tag{22}
$$

Because the allocation satisfies the integral constraint, we have

$$
\int_0^{\theta} q(x)\, dx = -\int_{\theta}^{1} q(x)\, dx,
$$

and we can express all transfers $t(\theta)$ as

$$
t(\theta) = q(\theta)\theta - \max\{q(\theta), 0\} - \int_0^{\theta} q(x)\, dx.
$$

Finally, the expected utility of type $\theta$ from reporting type $\theta'$ is given by

$$
V(q(\theta'), \theta) - t(\theta') = (\theta - \theta')\, q(\theta') + \int_0^{\theta'} q(x)\, dx + \min\{\theta, 1 - \theta\}.
$$

Because $q$ is monotone, the expression on the right-hand side is maximized at $\theta' = \theta$ and, hence, the incentive constraints are satisfied. Finally, because the rent function $U(\theta) = V(q(\theta), \theta) - t(\theta)$ is non-negative for all $\theta \in [0, 1]$, the participation constraints are satisfied as well. $\blacksquare$
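The sufficiency argument can be verified on a grid for a specific allocation. The sketch below uses a hypothetical step allocation (the uninformative experiments $q = -1$ and $q = 1$ outside $[1/4, 3/4]$ and the fully informative experiment $q = 0$ inside), which is increasing and integrates to zero, together with the transfers in (22) written in their single-formula form. The value function is the bracketed expression $\theta q - \max\{q, 0\} + \min\{\theta, 1 - \theta\}$ used in the proof; all function names are illustrative.

```python
def q(theta):
    """Hypothetical increasing allocation with zero integral on [0, 1]."""
    if theta < 0.25:
        return -1.0
    if theta < 0.75:
        return 0.0
    return 1.0


def Q(theta):
    """Closed form of the integral of q from 0 to theta."""
    if theta < 0.25:
        return -theta
    if theta < 0.75:
        return -0.25
    return -0.25 + (theta - 0.75)


def V(qv, theta):
    """Value of experiment qv for type theta (expression used in the proof)."""
    return theta * qv - max(qv, 0.0) + min(theta, 1.0 - theta)


def t(theta):
    """Transfers from (22), using the integral constraint to merge both cases."""
    return theta * q(theta) - max(q(theta), 0.0) - Q(theta)


def payoff(theta, report):
    return V(q(report), theta) - t(report)


grid = [k / 100 for k in range(101)]
# Largest gain any type on the grid can obtain from misreporting.
worst_gain = max(payoff(th, rep) - payoff(th, th) for th in grid for rep in grid)
# Smallest rent from truthful reporting.
min_rent = min(payoff(th, th) for th in grid)
print("largest gain from misreporting:", worst_gain)
print("smallest truthful rent:", min_rent)
print("price of the fully informative experiment:", t(0.5))
```

On this grid no type gains from misreporting and all rents are non-negative. The allocation amounts to flat pricing: every participating type pays the same transfer for the fully informative experiment, and excluded types pay nothing.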

Proof of Proposition 3. We first derive the seller's objective in the usual way. Using (22), we write the expected transfers as

$$
\int_0^1 t(\theta)\, dF(\theta) = \int_0^1 \big(\theta q(\theta) - \max\{q(\theta), 0\}\big)\, dF(\theta) - \int_0^{1/2}\!\!\int_0^{\theta} q(x)\, dx\, dF(\theta) + \int_{1/2}^{1}\!\!\int_{\theta}^{1} q(x)\, dx\, dF(\theta).
$$

Integrating the last two terms by parts, we obtain

$$
-F(1/2)\int_0^{1/2} q(x)\, dx + \int_0^{1/2} q(x) F(x)\, dx - F(1/2)\int_{1/2}^{1} q(x)\, dx + \int_{1/2}^{1} q(x) F(x)\, dx,
$$

and hence

$$
\int_0^1 t(\theta)\, dF(\theta) = \int_0^1 \Big[\big(\theta q(\theta) - \max\{q(\theta), 0\}\big) f(\theta) - \big(F(1/2) - F(\theta)\big) q(\theta)\Big]\, d\theta.
$$
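For completeness, the integration by parts can be spelled out term by term. The following is a worked step under the assumption that $F$ is continuous with density $f$ on $[0, 1]$; it introduces no new notation.

```latex
\begin{align*}
\int_0^{1/2}\Big(\int_0^{\theta} q(x)\,dx\Big)\,dF(\theta)
  &= \Big[F(\theta)\int_0^{\theta} q(x)\,dx\Big]_{0}^{1/2}
     - \int_0^{1/2} q(\theta)F(\theta)\,d\theta \\
  &= F(1/2)\int_0^{1/2} q(x)\,dx - \int_0^{1/2} q(x)F(x)\,dx, \\[4pt]
\int_{1/2}^{1}\Big(\int_{\theta}^{1} q(x)\,dx\Big)\,dF(\theta)
  &= \Big[F(\theta)\int_{\theta}^{1} q(x)\,dx\Big]_{1/2}^{1}
     + \int_{1/2}^{1} q(\theta)F(\theta)\,d\theta \\
  &= -F(1/2)\int_{1/2}^{1} q(x)\,dx + \int_{1/2}^{1} q(x)F(x)\,dx.
\end{align*}
```

Subtracting the first line from the second gives $-\int_0^1 \big(F(1/2) - F(\theta)\big) q(\theta)\, d\theta$, which is the last term in the expression for expected transfers.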


We now establish that the solution to the seller's problem (17) can be characterized through Lagrangian methods. For necessity, note that the objective is concave in the allocation rule; the set of non-decreasing functions is convex; and the integral constraint can be weakened to the real-valued inequality constraint

$$
\int_0^1 q(\theta)\, d\theta \leq 0. \tag{23}
$$

Necessity of the Lagrangian then follows from Theorem 8.3.1 in Luenberger (1969). Sufficiency follows from Theorem 8.4.1 in Luenberger (1969). In particular, any maximizer of the Lagrangian $q(\theta)$ with

$$
\int_0^1 q(\theta)\, d\theta = \bar{q}
$$

maximizes the original objective subject to the inequality constraint

$$
\int_0^1 q(\theta)\, d\theta \leq \bar{q}.
$$

Thus, any solution to the Lagrangian that satisfies the constraint solves the original problem.

Because the Lagrangian approach is valid, we can apply the results of Toikka (2011) to solve the seller's problem for a given value of the multiplier $\lambda$ on the integral constraint. Write the Lagrangian as

$$
\int_0^1 \big[(\theta f(\theta) + F(\theta) - \lambda)\, q(\theta) - \max\{q(\theta), 0\}\, f(\theta)\big]\, d\theta.
$$

In order to maximize the Lagrangian subject to the monotonicity constraint, consider the generalized virtual surplus

$$
\bar{J}(\theta, q) := \int_{-1}^{q} \big(\bar{\phi}(\theta, x) - \lambda^*\big)\, dx,
$$

where $\bar{\phi}(\theta, x)$ denotes the ironed virtual value for allocation $x$. Note that $\bar{J}(\theta, q)$ is weakly concave in $q$. Because the multiplier $\lambda$ shifts all virtual values by a constant, the result in Proposition 3 then follows from Theorem 4.4 in Toikka (2011). Finally, note that $\bar{\phi}(\theta, q) \geq 0$ for all $\theta$ implies the value $\lambda^*$ is strictly positive (otherwise the solution $q^*$ would have a strictly positive integral). Therefore, the integral constraint (23) binds. $\blacksquare$

Proof of Proposition 4. From the Lagrangian maximization, we have the following necessary conditions

$$
q^*(\theta) =
\begin{cases}
-1 & \text{if } \bar{\phi}(\theta, -1) < \lambda^*, \\
\bar{q} \in [-1, 0] & \text{if } \bar{\phi}(\theta, -1) = \lambda^*, \\
0 & \text{if } \bar{\phi}(\theta, -1) > \lambda^* > \bar{\phi}(\theta, 1), \\
\bar{q}' \in [0, 1] & \text{if } \bar{\phi}(\theta, 1) = \lambda^*, \\
1 & \text{if } \bar{\phi}(\theta, 1) > \lambda^*,
\end{cases}
$$

and

$$
\int_0^1 q^*(\theta)\, d\theta = 0.
$$

If $\lambda^*$ coincides with the flat portion of one virtual value, then by the pooling property of Myerson (1981), the optimal allocation rule must be constant over that interval, and the level of the allocation is uniquely determined by the integral constraint. Finally, suppose $\lambda^*$ equals the value of $\bar{\phi}(\theta, q^*(\theta))$ over more than one flat portion of the virtual values $\bar{\phi}(\theta, -1)$ and $\bar{\phi}(\theta, 1)$. Then, we can focus, without loss of generality, on the allocation $q^*$ that assigns experiment $q = 0$ or $q \in \{-1, 1\}$ to all types in one of the two intervals. $\blacksquare$

Proof of Proposition 5. (1.) If $F(\theta) + \theta f(\theta)$ and $F(\theta) + (\theta - 1) f(\theta)$ are strictly increasing, then ironing is not required and it follows from the analysis in the text that the optimal solution has $q \in \{-1, 0, 1\}$ for all $\theta$.

(2.) If all types are located on one side of $1/2$, then the integral constraint has no bite since the allocation rule $q(\theta)$ can always be adjusted on the other side to satisfy it. The solution on each side of $1/2$ involves a cutoff type and $q \in \{-1, 0\}$ or $q \in \{0, 1\}$, both of which result in flat pricing.

(3.) If types are symmetrically distributed, then the separately optimal menus for types $\theta < 1/2$ and $\theta > 1/2$ are identical. Therefore, the union of the two solutions satisfies the integral constraint, and hence solves the original problem. $\blacksquare$
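Part (1.) can be illustrated numerically in the regular case. The sketch below reads the two marginal virtual values off the Lagrangian integrand in the proof of Proposition 3, namely $\theta f(\theta) + F(\theta)$ for $q \in [-1, 0]$ and $(\theta - 1) f(\theta) + F(\theta)$ for $q \in [0, 1]$, maximizes pointwise, and finds the multiplier by bisection on the integral constraint. The uniform distribution, the bracketing interval $[0, 1]$ for the multiplier, and the function names are assumptions made for this example.

```python
def allocation(theta, lam, f, F):
    """Pointwise maximizer of the Lagrangian when no ironing is needed."""
    low = theta * f(theta) + F(theta)           # slope in q on [-1, 0], before lam
    high = (theta - 1.0) * f(theta) + F(theta)  # slope in q on [0, 1], before lam
    if low < lam:
        return -1.0
    if high > lam:
        return 1.0
    return 0.0


def integral_of_q(lam, f, F, n=10_000):
    """Midpoint approximation of the integral of q over [0, 1]."""
    return sum(allocation((k + 0.5) / n, lam, f, F) for k in range(n)) / n


def solve_multiplier(f, F, lo=0.0, hi=1.0, iters=40):
    """Bisection on lam; assumes the integral is positive at lo, non-positive at hi."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if integral_of_q(mid, f, F) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Uniform types on [0, 1]: both virtual values are strictly increasing.
uniform_pdf = lambda theta: 1.0
uniform_cdf = lambda theta: theta

lam_star = solve_multiplier(uniform_pdf, uniform_cdf)
print("multiplier:", round(lam_star, 3))
for theta in (0.1, 0.5, 0.9):
    print(theta, allocation(theta, lam_star, uniform_pdf, uniform_cdf))
```

For the uniform distribution the multiplier is approximately $1/2$, so types below $1/4$ and above $3/4$ receive the uninformative experiments $q = -1$ and $q = 1$, while types in between receive $q = 0$. The allocation takes values in $\{-1, 0, 1\}$ only, as stated in part (1.).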


References

Abraham, I., S. Athey, M. Babaioff, and M. Grubb (2014): "Peaches, Lemons, and Cookies: Designing Auction Markets with Dispersed Information," Discussion paper, Microsoft Research, Stanford, and Boston College.

Admati, A. R., and P. Pfleiderer (1986): "A monopolistic market for information," Journal of Economic Theory, 39(2), 400-438.

Admati, A. R., and P. Pfleiderer (1990): "Direct and indirect sale of information," Econometrica, 58(4), 901-928.

Balestrieri, F., and S. Izmalkov (2014): "Informed seller in a Hotelling market," Discussion paper, HP Labs and New Economic School.

Bergemann, D., and A. Bonatti (2015): "Selling Cookies," American Economic Journal: Microeconomics, 7(3), 259-294.

Bergemann, D., and S. Morris (2013): "Robust Predictions in Games with Incomplete Information," Econometrica, 81(4), 1251-1308.

Bergemann, D., and M. Pesendorfer (2007): "Information Structures in Optimal Auctions," Journal of Economic Theory, 137(1), 580-609.

Cabrales, A., O. Gossner, and R. Serrano (2015): "Demand for Information and the Appeal of Information Transactions," Discussion paper, University College London.

Celik, L. (2014): "Information unraveling revisited: disclosure of horizontal attributes," Journal of Industrial Economics, 62(1), 113-136.

Daskalakis, C., A. Deckelbaum, and C. Tzamos (2014): "Strong duality for a multiple-good monopolist," Discussion paper, MIT.

Deneckere, R., and R. McAfee (1996): "Damaged Goods," Journal of Economics and Management Strategy, 5, 149-174.

Eső, P., and B. Szentes (2007a): "Optimal information disclosure in auctions and the handicap auction," Review of Economic Studies, 74(3), 705-731.

Eső, P., and B. Szentes (2007b): "The price of advice," Rand Journal of Economics, 38(4), 863-880.

Hörner, J., and A. Skrzypacz (2015): "Selling Information," Journal of Political Economy, forthcoming.

Johnson, J. P., and D. P. Myatt (2006): "On the Simple Economics of Advertising, Marketing, and Product Design," American Economic Review, 96(3), 756-784.

Jullien, B. (2000): "Participation Constraints in Adverse Selection Models," Journal of Economic Theory, 93(1), 1-47.

Kamenica, E., and M. Gentzkow (2011): "Bayesian Persuasion," American Economic Review, 101(6), 2590-2615.

Koessler, F., and V. Skreta (2014): "Sales Talk," Discussion paper, PSE and UCL.

Kolotilin, A., M. Li, T. Mylovanov, and A. Zapechelnyuk (2015): "Persuasion of a Privately Informed Receiver," Discussion paper, University of New South Wales.

Krähmer, D., and R. Strausz (2015): "Ex post information rents in sequential screening," Games and Economic Behavior, 90, 257-273.

Lambrecht, A., and C. Tucker (2013): "When does retargeting work? Information specificity in online advertising," Journal of Marketing Research, 50(5), 561-576.

Li, H., and X. Shi (2015): "Discriminatory Information Disclosure," Discussion paper, University of British Columbia and University of Toronto.

Lizzeri, A. (1999): "Information revelation and certification intermediaries," Rand Journal of Economics, 30(2), 214-231.

Luenberger, D. G. (1969): Optimization by Vector Space Methods. John Wiley & Sons.

Manelli, A. M., and D. R. Vincent (2006): "Bundling as an optimal selling mechanism for a multiple-good monopolist," Journal of Economic Theory, 127(1), 1-35.

Maskin, E., and J. Riley (1984): "Monopoly with Incomplete Information," Rand Journal of Economics, 15(2), 171-196.

Mussa, M., and S. Rosen (1978): "Monopoly and Product Quality," Journal of Economic Theory, 18(2), 301-317.

Myerson, R. (1981): "Optimal Auction Design," Mathematics of Operations Research, 6, 58-73.

Myerson, R. (1986): "Multistage Games with Communication," Econometrica, 54(2), 323-358.

Mylovanov, T., and T. Tröger (2014): "Mechanism Design by an Informed Principal: Private Values with Transferable Utility," Review of Economic Studies, 81(4), 1668-1707.

Ottaviani, M., and A. Prat (2001): "The value of public information in monopoly," Econometrica, 69(6), 1673-1683.

Pavlov, G. (2011): "Optimal Mechanism for Selling Two Goods," The BE Journal of Theoretical Economics, 11(1), 1-33.

Rayo, L., and I. Segal (2010): "Optimal Information Disclosure," Journal of Political Economy, 118(5), 949-987.

Riley, J., and R. Zeckhauser (1983): "Optimal Selling Strategies: When to Haggle, When to Hold Firm," Quarterly Journal of Economics, 98, 267-290.

Sarvary, M. (2012): Gurus and Oracles: The Marketing of Information. MIT Press.

Shapiro, C., and H. R. Varian (1999): Information Rules: A Strategic Guide to the Network Economy. Harvard Business Press.

Toikka, J. (2011): "Ironing without Control," Journal of Economic Theory, 146(6), 2510-2526.
