FOUNDATIONS OF MULTI-AGENT SYSTEMS
A dissertation
submitted to the Graduate School of Business
and the Committee on Graduate Studies
of Stanford University
in partial fulfillment of the requirements
for the degree of
Doctor of Philosophy
in the subject of
Economic Analysis and Policy
Pierfrancesco La Mura
July 1999
I certify that I have read this dissertation and that in my
opinion it is fully adequate, in scope and quality, as a
dissertation for the degree of Doctor of Philosophy.
_____________________________________________
Robert Wilson (Principal Adviser)
I certify that I have read this dissertation and that in my
opinion it is fully adequate, in scope and quality, as a
dissertation for the degree of Doctor of Philosophy.
_____________________________________________
Sven Rady
I certify that I have read this dissertation and that in my
opinion it is fully adequate, in scope and quality, as a
dissertation for the degree of Doctor of Philosophy.
_____________________________________________
Yoav Shoham
Approved for the University Committee on Graduate Studies:
_____________________________________________
Abstract
The subject of multi-agent systems is relevant to both economic theory and artificial intelligence. Yet, very little work so far has tried to bridge the gap between the different formalisms, and provide a unified treatment. In this thesis, we approach the subject of multi-agent systems from first principles, and develop a new approach which enjoys foundational as well as computational advantages with respect to other existing treatments.
First we develop a revealed-preference decision theory for many agents. Our
theory has several advantageous features, including the ability to represent small
worlds, conditional and counterfactual preferences and beliefs, and higher-order
utilities and probabilities.
Next, we introduce a novel representation for single-agent decision problems, Expected Utility Networks (EU nets). EU nets generalize probabilistic networks from the AI literature, and provide a modular and compact framework for strategic inference. The representation relies on a novel notion of utility independence, closely related to its probabilistic counterpart, which we present and discuss in the context of other existing notions. Together, probability and utility independence are shown to imply expected utility (EU) independence. Finally, we argue that expected utility independence can be used to decentralize complex single-agent decision problems, in that conditionally EU independent sub-tasks can be allocated to simpler, conditionally independent sub-agents.
We then extend the EU nets formalism to multi-agent decision problems, and introduce a new class of representations, Game Networks (G nets). G nets constitute an alternative to existing game-theoretic representations, with distinctive advantages. In particular, G nets are more structured and compact than extensive forms, as both probabilities and utilities enjoy a modular representation.
Next, we discuss strategic equilibrium in game networks. Existence and convergence results are presented for the special case of simultaneous networks. Although we do not explicitly consider the general case, the proof technique and the main results should carry over to general G nets.
As a concrete application, we then show how G nets can be used to formally represent a second-price auction and solve for the equilibrium. We also show how G nets can be used to reduce the computational complexity of finding an optimal allocation and the corresponding payments in Generalized Vickrey Auctions. In general, the problem is NP-hard; yet, we show that for an important class of problems the solution can be obtained in linear time.
Our notion of utility independence can be given an alternative justification,
based on the notion of maximum entropy. We present a novel entropy method,
which allows for the characterization of “typical” preference structures in the face
of observed constraints (revealed preferences). We discuss potential applications of
the method, and provide an axiomatic justification.
Acknowledgments
First I would like to thank my family, and all my friends – no life, no truth.
Then I would like to thank my academic advisors, to whom goes my unconditional
admiration and respect not only for their contributions and guidance but also for
their human richness. I would also like to thank all the great people in the Graduate
School of Business and the department of Computer Science who helped me and
advised me. Thanks!
The material in the first three chapters was developed jointly with Yoav Shoham, whose supervision has proved invaluable. Even the material in the remaining chapters was strongly influenced by our discussions, and would certainly have been inferior otherwise. The material in chapter 4 was strongly influenced by several discussions with Bob Wilson and Robert Susserland, to whom goes all my gratitude. The material in chapter 5 was presented at Yoav Shoham’s CoABS seminars, and greatly benefited from the feedback offered by the seminar participants. Chapter 6 is part of an ongoing research project with Peter Grunwald, to whom I would also like to express all my gratitude. Finally, I would like to thank Mario Eboli, who provided constant encouragement and invaluable discussions.
Contents
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1
Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Plan of the work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1 Revealed-preference foundations of multi-agent systems. . . . . . . . . . . . .11
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.2 Hierarchical conditional preferences: the single agent case . . . . . . . . . . . . . . . . . . . 14
1.2.1 Foundation: Jeffrey utility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.2.2 A static expected utility representation of hierarchical preferences . . . . . 17
1.2.3 A dynamic expected utility representation of hierarchical preferences . . 21
1.3 Multi-agent construction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
1.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2 Expected utility networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .32
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.2 Conditional independence of probabilities and utilities . . . . . . . . . . . . . . . . . . . . . . . 34
2.3 Expected utility networks: a formal definition and some structural properties . . 38
2.4 Conditional expected utility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.5 Conditional expected utility independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
2.6 Inference in expected utility networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3 Game networks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .52
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.2 G nets: a formal definition and some results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.3 Some examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4 Strategic equilibrium in Game Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .61
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.2 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.3 Existence of equilibrium in game networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.4 Global convergence to equilibrium in simultaneous games . . . . . . . . . . . . . . . . . . . . 67
4.5 Computing all the Nash equilibria . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.6 Convergence to an optimal policy in single-agent game networks . . . . . . . . . . . . . 70
5 Application: auctions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .75
5.1 An independent-value, second-price auction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.2 The Generalized Vickrey auction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.2.2 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.2.3 Some results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.2.4 Example: a Coasian economy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
6 A maximum entropy method for expected utilities . . . . . . . . . . . . . . . . . . . .85
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.2 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.3 Utility entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
6.4 Axiomatic justification of our maximum entropy method for expected utilities . 92
References. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .97
List of Figures
A simple EU net........................................................................................39
The beer-quiche game...............................................................................57
The random order game............................................................................59
An independent-value, second-price auction............................................76
Introduction
Motivation
The formal analysis of multi-agent systems is a topic of interest to both economic theory (von Neumann and Morgenstern 1944) and artificial intelligence (Hayes-Roth 1995). While game-theoretic notions and methodologies have already populated the economic mainstream, only recently have they started to attract interest in the context of artificial intelligence (Halpern 1997, Koller and Pfeffer 1997), where their integration with existing methodologies constitutes one of the most promising areas of new research. Research in artificial intelligence, on the other hand, has traditionally put a great deal of emphasis on logic and epistemic foundations (McCarthy 1959, Fagin et al. 1995), while the importance of such foundations in the context of economic theory (specifically, of game theory) has only recently been recognized (Mertens and Zamir 1985, Tan and Werlang 1988, Aumann and Brandenburger 1995).
Very little work so far has tried to bridge the gap between the different formalisms, and provide a unified treatment. There are several reasons to believe that such integration would be desirable, and fruitful to both disciplines. For instance, several paradoxes and technical difficulties (such as those related to counterfactual reasoning) appear in many different formalisms, suggesting that the demand for more rigorous foundations may be better addressed through a unified treatment. Moreover, the various applications which motivate foundational work in the two fields together impose very strong requirements on what
would constitute a good foundation. Traditionally, and – we submit – for good reasons, economic theory insists that the basic elements of a good framework should be in principle observable, either through interrogation of a prospective decision-maker or through the examination of an agent’s behavior (revealed preferences). Artificial intelligence, on the other hand, insists – for equally good reasons – on a number of other requirements for a good formalism, such as compactness, modularity, and, most importantly, computational tractability. In this respect, the standard game-theoretic framework proves fairly inadequate for many real-world applications, whose complexity is typically so high that not only explicit strategic reasoning, but even the formal specification of the system (for instance, writing down its extensive-form representation) often turns out to be an exceedingly cumbersome task. We believe that, by addressing all these demands together, an integrated formalism would prove beneficial to both disciplines, and open the possibility for a more intense cross-fertilization between the two fields. Besides its potential for the advancement of core theoretical research in the two fields, such cross-fertilization would be, in our opinion, important in view of several new research areas arising at their boundary, such as electronic commerce, electronic marketing, or the economics of computerized networks.
Plan of the work
In this thesis we approach the subject of multi-agent systems from first principles, and
develop a new approach which enjoys foundational as well as computational advantages
with respect to other existing treatments.
In the first chapter we develop a revealed-preference decision theory for many agents. Our theory has several advantageous features, which are never found together in other treatments. For instance, it includes the ability to represent small worlds; a “small world” (Savage 1954) can be thought of as an incomplete, coarse description of the world, and is opposed to a “large world”, which can be regarded as an exhaustive description in which no detail is left unspecified. In Savage’s theory the decision maker’s preferences are expressed on acts, which in turn are defined on a large world of states of Nature. In a chess game, for instance, an act would be a complete conditional plan for the game, a truly overwhelming object; in turn, the practical impossibility of eliciting preferences over such rich objects curtails the applicability of the theory. Our approach, by contrast, uses as its basic building blocks preferences over Boolean algebras of events, and there is no presumption that any such event constitutes an exhaustive description of the world.
Another important feature of our approach is the ability to represent conditional and counterfactual preferences. Other treatments (Samet 1994, Battigalli and Siniscalchi 1999) also represent conditional and counterfactual beliefs, but they do not offer a decision-theoretic foundation of the epistemic notions they use; rather, subjective beliefs are taken as primitive notions. In our treatment we conform to a long-standing perspective in economic theory (Pareto 1906, De Finetti 1937, Savage 1954), and take as our primitive notion the agent’s conditional behavior, while epistemic notions such as the agents’ subjective beliefs are regarded as aspects of the individual preferences.
Finally, our approach includes higher-order utilities and probabilities. Most other
treatments1 generally focus on higher-order probabilities only (Mertens and Zamir 1985,
Heifetz and Samet 1996), and construct universal spaces in which the agents are classified
according to their epistemic type, i.e. according to the nature of their beliefs. By contrast,
we classify agents according to their decision-theoretic types, i.e. their preferences, and
derive epistemic types from decision-theoretic types.
In the second chapter we introduce a novel representation for single-agent decision
problems, Expected Utility Networks (EU nets). EU nets generalize probabilistic networks
from the AI literature (Pearl 1988), and provide a modular and compact framework for
strategic inference. Modularity is a central notion in AI; it allows concise representations
of otherwise quite complex concepts. In probabilistic networks, modularity is achieved by
exploiting the notion of conditional probabilistic independence. In recent years there have
been several attempts to provide modular utility representations of preferences (Bacchus and Grove 1995, Doyle and Wellman 1995, Shoham 1997).
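To make the modularity point concrete, here is a minimal probabilistic network sketch (our example, with made-up numbers, not one from the thesis): two binary variables that are conditionally independent given a third, so the joint distribution over three variables is specified by 1 + 2 + 2 = 5 numbers instead of the 2³ − 1 = 7 a full joint table would require.

```python
# Illustrative sketch (not from the thesis): a Bayesian network
# A -> B, A -> C, with B and C conditionally independent given A.
# The joint factorizes as p(a, b, c) = p(a) p(b|a) p(c|a).
from itertools import product

p_a = {True: 0.3, False: 0.7}
p_b_given_a = {True: 0.9, False: 0.2}   # P(B = true | A = a)
p_c_given_a = {True: 0.6, False: 0.1}   # P(C = true | A = a)

def joint(a, b, c):
    pb = p_b_given_a[a] if b else 1 - p_b_given_a[a]
    pc = p_c_given_a[a] if c else 1 - p_c_given_a[a]
    return p_a[a] * pb * pc

# The factored representation is still a proper distribution:
total = sum(joint(a, b, c) for a, b, c in product([True, False], repeat=3))
print(round(total, 10))   # -> 1.0
```

The savings grow exponentially with the number of variables, which is exactly the compactness the probabilistic-networks literature exploits.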
It has proven difficult to devise a useful representation of utilities; this difficulty
can certainly be ascribed to the different properties of utility and probability functions, but
also, more fundamentally, to the fact that reasoning about probabilities and utilities together
requires more than simply gluing together a representation of utility and one of probability.
In fact, just as probabilistic inference involves the computation of conditional probabilities, strategic inference – the reasoning process which underlies rational decision-making2 – involves the computation of conditional expected utilities for alternative plans of action, which may not have a modular representation even if probabilities and utilities, taken separately, do.
1 The only notable exception is, to the best of our knowledge, (Epstein and Wang 1996).
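In code, single-agent strategic inference in this sense reduces to comparing expected utilities across alternative plans; a minimal sketch with hypothetical numbers (ours, not the thesis's):

```python
# Hypothetical sketch of strategic inference as expected utility
# maximization: choose the action a maximizing EU(a) = sum_s p(s) u(s, a).
p = {"rain": 0.6, "dry": 0.4}                    # beliefs over states
u = {("rain", "umbrella"): 0.8, ("dry", "umbrella"): 0.6,
     ("rain", "none"): 0.0,     ("dry", "none"): 1.0}

def expected_utility(action):
    return sum(p[s] * u[(s, action)] for s in p)

best = max(["umbrella", "none"], key=expected_utility)
print(best, round(expected_utility(best), 3))    # -> umbrella 0.72
```

The difficulty the text points to is that even when p and u each factor nicely, the products p(s) u(s, a) being summed need not inherit either factorization.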
Our representation relies on a novel notion of utility independence, closely related to its probabilistic counterpart, which we present and discuss in the context of other existing notions. Together, probability and utility independence are shown to imply conditional expected utility (EU) independence. In this respect, choosing the “right” notion of conditional utility independence turns out to be crucial.
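The way the two independence notions combine can be sketched in a toy calculation (ours, for illustration only; the thesis's formal definitions appear in chapter 2). If both probability and utility factor multiplicatively over two sub-domains x and y (an assumption made here purely for concreteness), then expected utility factors as well:

```latex
% Toy illustration (not the thesis's formal definition): suppose
% probability and utility both factor multiplicatively over x and y.
\[
  p(xy) = p(x)\,p(y), \qquad u(xy) = u(x)\,u(y)
  \;\Longrightarrow\;
  \mathrm{EU}(xy) = p(xy)\,u(xy)
                  = \bigl[p(x)\,u(x)\bigr]\bigl[p(y)\,u(y)\bigr]
                  = \mathrm{EU}(x)\,\mathrm{EU}(y).
\]
```

Here the two independencies jointly deliver EU independence; the substantive work in chapter 2 is identifying notions of conditional independence for which this survives conditioning.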
What is important about conditionally independent decisions is that they can be effectively decentralized: a single, complicated agent can be replaced by a number of simpler, conditionally independent sub-agents, who can do just as well. This property is of interest not only to artificial intelligence, since it can be exploited to reduce the complexity of planning, but also to economic theory, as it suggests a principled way for the identification of optimal task allocations within economic organizations.
In the third chapter, we then extend the EU nets formalism to multi-agent decision problems, and introduce a new class of representations for mathematical games, Game Networks (G nets). G nets constitute an alternative to standard game-theoretic representations such as normal and extensive forms, but in several respects G nets are more advantageous than the latter.
2 Here, and elsewhere, the term “strategic” is used in the context of individual decision-making, and does not necessarily refer to a multi-agent scenario.
Extensive-form representations, for instance, are generally more compact than normal-form ones, but they are still quite “redundant”: in concrete examples, many branches of the tree often have the same payoffs, but the recognition of these symmetries does not lead to a more parsimonious representation. Furthermore, changing a few details in the setup – for both normal and extensive forms – usually entails rewriting the whole game; in other words, standard representations are not particularly modular.
Compared to standard game-theoretic representations, G nets are more structured and more compact, as both probabilities and utilities enjoy a modular representation. Moreover, one can easily modify or extend an existing G net: for instance, adding more types of a player or an extra round of moves is usually much easier than in extensive forms.
More fundamentally, G nets provide a computationally advantageous framework for
strategic inference, as one can exploit conditional probability and utility independencies to
simplify the inference process.
Finally, we remark that G nets generalize probabilistic networks in AI, for which a
large and sophisticated algorithmic “toolbox” is already available.
In the fourth chapter, we establish existence and convergence results for strategic equilibrium in simultaneous game networks. While existence results are easily obtained, the issue of convergence deserves special attention. Many convergence procedures have already been proposed in the game-theoretic literature, but none of them has proved completely satisfactory. Fictitious play is one of the most popular; it is simple and intuitively appealing, but it may fail to converge to a Nash equilibrium, due to the possible existence of cycles, which may persist forever even though their frequency goes to zero. The tracing procedure (Harsanyi and Selten 1988) is also problematic: it may not converge, and moreover is based on a non-constructive argument which makes it difficult to use in practice.
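The cycling behavior alluded to above can be observed in a small experiment (our sketch, not a construction from the thesis): fictitious play in matching pennies, where each player best responds to the opponent's empirical action frequencies. Actual play keeps cycling between heads and tails in ever longer runs, while the empirical frequencies approach the mixed equilibrium (1/2, 1/2); convergence of the frequencies in zero-sum games is Robinson's 1951 theorem.

```python
# Sketch (ours): fictitious play in matching pennies.  Player 1 wants
# to match, player 2 to mismatch; each best-responds to the opponent's
# empirical frequencies, with ties broken toward H.
def fictitious_play(rounds=100_000):
    h2 = t2 = 0          # player 1's counts of player 2's H and T
    h1 = t1 = 0          # player 2's counts of player 1's H and T
    for _ in range(rounds):
        a1 = "H" if h2 >= t2 else "T"    # matcher copies the likelier action
        a2 = "H" if t1 >= h1 else "T"    # mismatcher avoids the likelier one
        h1, t1 = h1 + (a1 == "H"), t1 + (a1 == "T")
        h2, t2 = h2 + (a2 == "H"), t2 + (a2 == "T")
    return h1 / rounds, h2 / rounds      # empirical frequencies of H

f1, f2 = fictitious_play()
print(round(f1, 2), round(f2, 2))        # both near 0.50
```

In non-zero-sum games (Shapley's 3x3 example) even the frequencies can cycle forever, which is the failure mode motivating the alternative iterative method below.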
We introduce a new, simple iterative method, which always converges to a unique Nash equilibrium in generic n-player normal-form games. The equilibrium is uniquely determined by the payoff structure of the game, can be computed to any level of accuracy, and involves no weakly dominated strategies.
In many cases one is not interested in finding just one equilibrium, but rather would
like to obtain a complete list of all the Nash equilibria. It turns out that an adaptation of the
previous method can be put to such use.
Finally we address a related issue, which becomes significant in the context of some
potential applications of game networks. In principle, conditionally independent sub-tasks
may be allocated to conditionally independent sub-agents, as conditional EU independence
implies that any optimal policy can be reproduced by the joint behavior of such sub-agents.
Yet, before one can decentralize its execution, one first needs to find an optimal policy, a task which may turn out to be computationally very cumbersome.
Alternatively, one can distribute the decision problem among the sub-agents first, and
then let the system converge to some global policy in a decentralized fashion. On the other
hand, if we do so, we have no a priori guarantee that the system will indeed converge to an
optimal policy. We study the convergence properties of such decentralized systems in the
case of simultaneous G nets.
As a concrete application of G nets, we then show (in chapter 5) how they can be used to formally represent a second-price auction and solve for the equilibrium. Our goal will be to show how to use G nets to formally represent an economic mechanism – in our case, an auction – and reason about it. We shall not try to establish new results: rather, our aim is to show how the informal representations and reasoning typical of auction theory can be made completely formal thanks to the G net machinery. Formal specification is a rich subject area in the AI literature, while it is not significantly represented in the economic literature. We conjecture that electronic commerce applications will motivate the investigation of formal specification methods in the context of economic theory as well.
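The mechanics of the auction being represented are simple to state in code (a sketch under our own toy numbers; the thesis's contribution is the formal G-net representation, not the auction itself): the highest bidder wins and pays the second-highest bid, which makes truthful bidding a weakly dominant strategy.

```python
# Sketch (ours): an independent private values second-price auction.
def second_price(bids):
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]          # winner pays the second-highest bid
    return winner, price

values = {"alice": 10.0, "bob": 7.0, "carol": 4.0}
winner, price = second_price(values)   # truthful bids
print(winner, price)                   # -> alice 7.0

# Shading one's bid never helps: if alice bids below bob, she forgoes
# a trade worth 10 - 7 = 3 to her.
assert second_price(dict(values, alice=6.0))[0] == "bob"
```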
We also show how G nets can be used to reduce the computational complexity of finding an optimal allocation and the corresponding payments in Generalized Vickrey Auctions. In general, the problem is computationally very hard; yet, we show that for an important class of problems the solution can be obtained in linear time, and demonstrate our methodology in the context of a simple economic example.
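A brute-force baseline makes the complexity issue concrete (our sketch, with hypothetical valuations; the thesis's point is that G-net structure can avoid this exponential search for an important class of problems). The Generalized Vickrey Auction picks the welfare-maximizing allocation and charges each winner the externality it imposes on the others.

```python
# Sketch (ours): brute-force VCG for a tiny combinatorial auction.
# Exponential in the number of items; shown only to fix ideas.
from itertools import product

def vcg(items, valuations):
    """valuations[i] maps frozenset(bundle) -> bidder i's value
    (missing bundles, including the empty one, count as 0)."""
    def welfare_max(bidders):
        best_w, best_alloc = 0.0, {i: frozenset() for i in bidders}
        for assign in product(list(bidders) + [None], repeat=len(items)):
            bundles = {i: frozenset(it for it, who in zip(items, assign)
                                    if who == i) for i in bidders}
            w = sum(valuations[i].get(bundles[i], 0.0) for i in bidders)
            if w > best_w:
                best_w, best_alloc = w, bundles
        return best_w, best_alloc

    everyone = list(range(len(valuations)))
    w_all, alloc = welfare_max(everyone)
    payments = {}
    for i in everyone:
        w_without_i, _ = welfare_max([j for j in everyone if j != i])
        others_share = w_all - valuations[i].get(alloc[i], 0.0)
        payments[i] = w_without_i - others_share   # Clarke payment
    return alloc, payments

vals = [
    {frozenset("A"): 5.0},                      # bidder 0 wants only A
    {frozenset("B"): 4.0},                      # bidder 1 wants only B
    {frozenset("A"): 3.0, frozenset("B"): 3.0,
     frozenset("AB"): 10.0},                    # bidder 2 sees complements
]
alloc, pay = vcg(["A", "B"], vals)
print(alloc[2], pay)   # bidder 2 wins both items and pays 9.0
```

Bidder 2 wins the pair (welfare 10 beats the 5 + 4 split) and pays 9, the welfare the others would have achieved without her.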
Our notion of utility independence can be given an alternative justification, based on the notion of maximum entropy. Maximum entropy (Shannon 1949, Shannon and Weaver 1949, Jaynes 1983) has recently attracted considerable interest in the context of Bayesian probability theory (Good 1983, Jaynes 1998), as it provides a relatively simple, yet principled way to represent incomplete knowledge about uncertain domains.
One way to think about maximum entropy is in terms of “typical” beliefs; in cases
where the available data are compatible with a large class of probabilistic models, it always
attempts to pick the most reasonable one, on the basis of symmetry considerations.
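The "typical belief" reading can be made concrete with the classic die example (Jaynes's illustration; our sketch, separate from the thesis's utility-entropy method): among all distributions on {1,...,6} with a given mean, the maximum-entropy one lies in an exponential family p_i ∝ exp(λx_i), and λ can be found by a one-dimensional search.

```python
# Sketch (ours): maximum-entropy distribution on given values subject
# to a mean constraint, via bisection on the tilting parameter lambda.
import math

def maxent_mean(values, target_mean, tol=1e-10):
    def mean(lam):
        w = [math.exp(lam * x) for x in values]
        z = sum(w)
        return sum(x * wi for x, wi in zip(values, w)) / z
    lo, hi = -50.0, 50.0            # mean(lam) is increasing in lam
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2
    w = [math.exp(lam * x) for x in values]
    z = sum(w)
    return [wi / z for wi in w]

p = maxent_mean([1, 2, 3, 4, 5, 6], 4.5)
print([round(pi, 3) for pi in p])   # increasing weights, mean 4.5
```

The data (the mean) are compatible with many distributions; maximum entropy selects the one that assumes nothing else, exactly the "most reasonable" choice the text describes.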
In chapter 6 we derive a symmetric notion of maximum entropy for utilities, and
investigate its connections with the notion of utility independence introduced in chapter 2.
As it turns out, our maximum entropy method for utilities always returns indifference and u-independence whenever possible; hence, utility entropy provides an indirect justification for our notion of u-independence, although this is not our main motivation in carrying out this program.
Our primary motivation is to provide a method for the revealed-preference estimation
of an unknown preference structure, based on observations on the agent’s behavior. For
instance, a current topic of interest in the field of marketing is the estimation of the potential
demand for goods or services based on observations coming from large, heterogeneous
databases of consumer data. Typically, the data are compatible with a broad spectrum of
possible consumer types, and hence the densities of the different types in the population
should first be estimated, and then incorporated in the model to produce an estimate of the
aggregate demand.
To reduce the complexity of this task one may want to resort to a representative agent formulation, in which the set of observed constraints is used to generate a “typical” preference structure consistent with the observed data. Maximum entropy methods characterize
“typical” belief structures, based on invariance considerations. Our aim is to extend those
methods in a principled way to the case of expected utilities, and use them to characterize
“typical” preference structures in the face of observed decision behavior (the representative
agent’s revealed preferences). We do so by imposing a set of axiomatic conditions on the
selection rule, based on invariance considerations. Our conditions mirror those introduced
by Shore and Johnson (Shore and Johnson 1980) for the selection of posterior probabilities.
Even though our result is a straightforward consequence of existing results for the
case of posterior probabilities, we submit that it is not without merit. Surprisingly, the
required assumptions are still quite reasonable if we reinterpret the posterior probability
as value and use the axiomatic derivation to characterize “typical” expected utilities via
our notion of utility entropy. We believe that our maximum entropy method for expected
utilities constitutes a valid and principled way to identify a representative agent from a
large population of types in the face of revealed-preference constraints, and could find
application not only to economic theory, but also to problems in marketing and electronic
commerce.
Chapter 1
Revealed-preference foundations of multi-agent systems
We develop a revealed-preference theory for multiple agents. Some features of our construction, which draws heavily on Jeffrey’s utility theory and on formal constructions by Domotor and Fishburn, are as follows. First, our system enjoys the “small-worlds” property. Second, it represents hierarchical preferences. As a result our expected utility representation is reminiscent of type constructions in game theory, except that our construction features higher-order utilities as well as higher-order probabilities. Finally, our construction includes the representation of conditional preferences, including counterfactual preferences.
1.1 Introduction
Two aspects of game theory are very evident nowadays. The first is that it has become an
indispensable tool, not only in economics but in a variety of other disciplines as well, from
philosophy and psychology to political science and computer science. The other is that
game theory lacks comprehensive foundations of the scope and depth found in single-agent
decision theory. Evidence of this limitation can be found in the many debates concerning
backward induction problems, admissibility, and other examples in which the traditional
game theoretic predictions are either paradoxical or ambiguous. It would be quite handy to
have a Savage-style characterization of game theory, which would clarify the assumptions
underlying different solution concepts, and therefore also the contexts in which each one
applies.
What would such a theory look like? At a minimum it should involve a set of agents and a set of intuitive objects (such as events or acts), individual preferences over these objects for each agent that can plausibly be elicited from people via interrogation or observation, and a representation of each such individual preference ordering in terms of subjective probability and utility particular to the individual agent.
In this chapter we will provide a theory that has these properties. Somewhat paradoxically, while our primary motivation is this multi-agent setting, the bulk of our construction can be explained already in the single-agent setting; the extension to the multi-agent setting then becomes obvious. Indeed, we are critical of existing foundations of decision theory (and in particular of Savage’s framework), and believe that our theory provides better foundations. However, rather than waste our ammunition by attacking Savage’s theory, whose shortcomings can be (and have been) well camouflaged, we simply note that we are not aware of any successful attempt to generalize Savage’s framework to the multi-agent setting, and claim that this is no accident.
After such boasting we must introduce a caveat. One of the attractive features of
Savage’s framework is the treatment of causality, as embodied in the notion of an act.
Although we believe that acts fit in quite naturally in the theory we are about to present and
are working to incorporate them, in this version of our construction acts will play no role.
For this reason we are also not yet in a position to tackle the paradoxes of game theory with
our framework (that too is next on our agenda).
The main ingredients of our construction can be relayed concisely by reference to several existing lines of research in decision theory, on which we draw liberally (the following will not make sense to a reader unfamiliar with the references, but the rest of the chapter is self-contained). From Jeffrey (Jeffrey 1965) we borrow the scalable expected utility construction with the “small worlds” property. We then specialize and extend the framework. We first specialize it by constructing a particular algebra on which to define Jeffrey utilities, one that involves higher-order preferences (that is, preferences over preferences). Applying results due to Domotor (Domotor 1978) we immediately get an expected-utility representation of higher-order preferences, albeit a problematic one. Among its chief deficiencies is the lack of an account of the dynamics of preferences, or how preferences change in the face of new evidence (including counterfactual evidence – this turns out to be an important point). We then exploit the structure of our hierarchical construction, and, adapting and reinterpreting a relatively unknown construction due to Fishburn (Fishburn 1982), we strengthen the expected-utility representation and avoid these deficiencies. In both cases the resulting expected utility representation is in the spirit of existing type constructions in game theory (Mertens and Zamir 1985, Heifetz and Samet 1996), but whereas these nest only probabilities our representation nests both probabilities and utilities. The work which comes closest to our approach is, to the best of our knowledge, (Epstein and Wang 1996). Yet, our representation turns out to be quite different from the one in (Epstein and Wang 1996), and extends to conditional and counterfactual preferences.
The next section contains the bulk of the technical material, and presents our single-
agent construction. In the following section we harvest the fruit of this construction by
easily extending it to the multi-agent case. We conclude with a brief summary.
1.2 Hierarchical conditional preferences: the single agent case
As we have said, most of the work in our construction is done already in the single-agent
case, which we explain in this section. Before we begin the construction, let us point out
three important ingredients in it:
1. “Small worlds” property: One need not be required to express preference only (or
even at all) among entities that depend on objects which are sufficiently rich to resolve
all ambiguity (by way of contrast, Savage’s preferences are defined on acts, which in
turn are defined relative to states which are such rich objects).
2. “Hierarchical preferences”: One can assign preferences over preferences, as in
preferring smoking to not smoking but wishing one did not have that preference.
3. “Conditional preferences” (including counterfactual preferences): One should be
able to specify how one’s preferences change in the face of new evidence, including
evidence given a prior probability of 0.
Although we believe these criteria to be desirable and are proud that our theory meets
them, we do not request the reader to accept this desirability as self evident. We mention
them now because it is helpful to keep them in mind when following the stages of the
construction; in particular, the next three subsections correspond to these three criteria.
1.2.1 Foundation: Jeffrey utility
The technical development in this subsection is due to Jeffrey (Jeffrey 1965) and Domotor
(Domotor 1978). We start with a set of possible worlds W, and a finite Boolean algebra A
of subsets of W. A preference ordering on A is a complete and transitive binary relation ≽
between nonempty pairs E, F ∈ A.
Definition 1 An expected utility representation of an ordering ≽ on a finite algebra
A is a pair (p, u), where p : A → [0, 1] is a probability function and u : A − {∅} → R
is a utility function such that:

1. for all nonempty E, F ∈ A, u(E) ≥ u(F) if and only if E ≽ F

2. for all nonempty E ∈ A, p(E) > 0, u(E) > 0, and u(W) = 1

3. u(E)p(E) = ∑_k u(Ek)p(Ek) for any finite, measurable partition {Ek}, k = 1, ..., K, of E.
Notice the remarkable structure of this definition, as compared to the expected util-
ity representations of von Neumann and Morgenstern, Savage, and others. Here prefer-
ences on events are represented by their relative utilities rather than their relative expected
utilities, and the probabilities only serve to constrain the utilities via the last condition.
One way to think about Jeffrey utilities is as simultaneously playing the role of
traditional (e.g., Savage) utilities and of traditional conditional expected utilities (or
CEUs, where CEU(E) = EU(E|E)). This is a direct reflection of the “small worlds” property; since
the events among which one expresses preference are under-specified states of the world,
the utility of each event has an expectation flavor to it. The value of an event, defined as the
product of its probability and utility, can be thought of as representing the event’s standard
(unconditional) expected utility. Note that in Jeffrey’s framework probability and value are
additive functions, but utility is not.
Remark 1 Since the probability of any nonempty event is positive, p(F|E) = p(E ∩
F)/p(E) is always well defined. Hence, (3) can be equivalently written in the
conditional form u(E) = ∑_k u(Ek)p(Ek|E).
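To make Definition 1 and Remark 1 concrete, here is a minimal numerical sketch (in Python, with invented numbers) of a Jeffrey utility on the power set of a four-element W: atom utilities are chosen so that u(W) = 1, each event receives the value-weighted average of its atoms, and the assertions check condition 3, its conditional form, and the fact that value is additive on disjoint events while utility is not.

```python
# Hypothetical numbers for illustration only: any strictly positive p_atom
# summing to 1, and any positive atom utilities with value(W) = 1, would do.
W = [1, 2, 3, 4]
p_atom = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}
u_atom = {1: 1.25, 2: 0.5, 3: 1.0, 4: 1.5}  # chosen so that u(W) = 1

def p(E):
    """Probability of an event: sum of its atoms' probabilities."""
    return sum(p_atom[w] for w in E)

def value(E):
    """Value of an event: sum over atoms of u_atom(w) * p_atom(w); additive."""
    return sum(u_atom[w] * p_atom[w] for w in E)

def u(E):
    """Jeffrey utility of a nonempty event: value divided by probability."""
    return value(E) / p(E)

assert abs(u(frozenset(W)) - 1.0) < 1e-12       # normalization u(W) = 1
assert all(p(frozenset({w})) > 0 for w in W)    # every nonempty event possible

# Condition 3 of Definition 1 on a partition of an event E:
E = frozenset({1, 2, 3})
parts = [frozenset({1}), frozenset({2, 3})]
assert abs(u(E) * p(E) - sum(u(Ek) * p(Ek) for Ek in parts)) < 1e-12
# Remark 1: the same identity in conditional form, u(E) = sum u(Ek) p(Ek|E).
assert abs(u(E) - sum(u(Ek) * p(Ek) / p(E) for Ek in parts)) < 1e-12

# Value is additive on disjoint events, but utility is not.
A, B = frozenset({1}), frozenset({2})
assert abs(value(A | B) - (value(A) + value(B))) < 1e-12
assert abs(u(A | B) - (u(A) + u(B))) > 1e-9
```

Note that condition 3 holds here by construction: defining u as value/probability is exactly what makes the utility of an event its conditional expected utility given the event.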
The question is whether there exist conditions on the ordering that guarantee the
existence of an expected utility representation. Jeffrey (Jeffrey 1965) gives a number of
such conditions, which constrain not only the ordering but also the algebra itself. Similar
conditions are provided by Bolker (Bolker 1967). These sets of conditions are sufficient
to guarantee the existence of an expected utility representation, and have the additional
advantage of being fairly intuitive. Here, however, we proceed to present an alternative
axiomatization, provided by Domotor (Domotor 1978). Domotor’s axiomatization has two
advantages: it constrains only the ordering but not the algebra, and it is a necessary as well
as sufficient condition for representability via Jeffrey utility (Theorem 1 below). However,
it suffers the disadvantage of being highly technical and unintuitive, and so we introduce
the relevant axiom by reference only.
Definition 2 A preference ordering is regular if and only if it satisfies J2 in (Domotor
1978).
Theorem 1 (Domotor 1978) A preference ordering on a finite algebra is regular if
and only if it admits a (real-valued) expected utility representation.
Remark 2 (Domotor 1978) If the algebra is infinite Theorem 1 continues to hold, the
only difference being that probability and utility must be allowed to take nonstandard
(i.e., infinitesimal) values.
1.2.2 A static expected utility representation of hierarchical preferences
We now proceed to construct a particular Jeffrey/Domotor structure, one that will cap-
ture the hierarchical nature of the agent’s preferences while still enjoying the small-worlds
property.
We first construct a large-worlds ontology, and then use it to define a hierarchical
small-worlds framework. We start with a set W of possible worlds; a possible world is
to be thought of as a rich object that completely captures the truth of all propositions,
including the agent’s preferences. Next, we introduce a function ≽ which associates with
each possible world a regular ordering over 2^W; ≽w is to be thought of as extracting from
each possible world w the agent’s preferences at w.
Remark 3 The reader familiar with modal logic will note the difference between this
construction and Kripke-style possible-worlds semantics (Kripke 1980). From the
conceptual point of view, in the latter a possible world settles on the truth value of
objective facts, whereas here a possible world settles the truth value of all propositions,
including the subjective ones. From the technical standpoint, in a Kripke structure
each world is mapped by the accessibility relation to a set of worlds, whereas here
each world is mapped by % to a total ordering on the power set of worlds.
We are actually not interested in the entire algebra 2^W, but rather in a specific
sub-algebra, A, which is defined as follows. We start with a finite Boolean algebra A0 (a
sub-algebra of 2^W, and ultimately also of A). A0 is thought of as describing the objective
events, ones that do not capture the agent’s mental state. These are the objective events the
agent is aware of, or is capable of imagining. There is no requirement that these “exhaust”
the space of objective events in any sense.

Next, for all nonempty E, F ∈ A0, let [E ≻ F] = {w | E ≻w F}, [E ∼ F] =
{w | E ∼w F}, and [E ≽ F] = [E ≻ F] ∪ [E ∼ F].

Let B0 be the set generated from propositions [E ≻ F], [E ∼ F], where E, F ∈ A0,
by closing off with respect to finite intersections. Then B0 is a π-system³, as it contains
both the empty set ∅ = [E ≻ E] and W = [E ∼ E]. Elements of B0 correspond to partial
orderings on A0; B0 describes the set of zero-order preferences.

³ A π-system on a set X is a family of subsets which contains both X and the empty set, and is closed with respect to finite intersections.
Let A1 = A0 ∪ B0 be the algebra generated by propositions E where E ∈ A0 or
E ∈ B0. Clearly, both A0 and B0 are sub-algebras of A1. Again, let B1 be the π-system of
finite intersections of propositions [E ≻ F], [E ∼ F], where now E, F ∈ A1. Elements of
B1 represent first-order preferences.

Now recursively define the n-th order (n > 1) algebras and preferences as follows:

• An = An−1 ∪ Bn−1,

• Bn is the set of all finite intersections of propositions [E ≻ F], [E ∼ F], where
E, F ∈ An.

Let A = ∪n An be the algebra generated by events E ∈ An, n ≥ 0. Then E ∈ A if
and only if E ∈ An for some n. Let B be the π-system generated by [E ≻ F], [E ∼ F],
where E, F ∈ A. Notice that B ⊂ A, and hence A = A ∪ B. Therefore, further iteration is
superfluous: all the preferences on events in A are already included in A.
Remark 4 Recall that A is the sub-algebra of 2^W in which we are interested. It might
be asked why we do not define the preferences ≽w only on A, and turn the above construction
into a fixpoint definition. It turns out, however, that in a later development in this
chapter (when we define mixture operations) we will need to include events outside
A.
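The first level of the hierarchy above (the objective algebra A0, the bracket events, and the π-system B0) can be sketched computationally. Everything below — the four worlds, the objective algebra, and the per-world utilities inducing each ≻w — is invented for illustration; the point is that bracket events are genuine subsets of W that need not belong to A0.

```python
from itertools import combinations

W = frozenset({0, 1, 2, 3})
# A0: a toy objective algebra, generated by the single event {0, 1}.
A0 = [frozenset(), frozenset({0, 1}), frozenset({2, 3}), W]

# Each world w carries an ordering over A0, here induced by a hypothetical
# world-dependent utility on events: worlds 1 and 2 like {0, 1} extra much.
def u_w(w, E):
    return len(E) + (1.5 if w in (1, 2) and E == frozenset({0, 1}) else 0.0)

def strictly_prefers(E, F):
    """[E > F]: the set of worlds at which E is strictly preferred to F."""
    return frozenset(w for w in W if u_w(w, E) > u_w(w, F))

def indifferent(E, F):
    """[E ~ F]: the set of worlds at which the agent is indifferent."""
    return frozenset(w for w in W if u_w(w, E) == u_w(w, F))

# B0: close the bracket events under finite intersections (a pi-system).
B0 = {strictly_prefers(E, F) for E in A0 for F in A0 if E and F}
B0 |= {indifferent(E, F) for E in A0 for F in A0 if E and F}
changed = True
while changed:
    changed = False
    for X, Y in combinations(list(B0), 2):
        if X & Y not in B0:
            B0.add(X & Y)
            changed = True

# B0 contains the empty set (as [E > E]) and W (as [E ~ E]).
assert frozenset() in B0 and W in B0
# Subjective events need not belong to A0: the worlds preferring {0, 1} to
# {2, 3} form the set {1, 2}, which cuts across the objective algebra.
assert strictly_prefers(frozenset({0, 1}), frozenset({2, 3})) not in A0
```

Iterating this step (taking the algebra generated by A0 ∪ B0 and forming new bracket events) would produce A1, B1, and so on, as in the recursion above.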
We are now close to achieving our first goal, an expected utility representation of
hierarchical preferences. What we are after is, for each w, giving the ordering ≽w on A
an expected utility representation. Our work is done almost automatically by the Jeffrey-Domotor
result. We simply need to note that, if an expected utility representation exists for
preferences on the full algebra 2^W, then it also exists for the restriction of those preferences
to any subalgebra. This fact is recorded in the following lemma.
Lemma 2 If an ordering over an algebra is regular, so is its projection to any sub-
algebra.
So it would seem that we have accomplished our goal, but in fact there are several
interrelated reasons for dissatisfaction:
• The requirement that every possible (i.e., nonempty) event be given a non-zero
probability is conceptually problematic, since it does not allow the agent to recognize
certain events as meaningful (or “possible”) and disbelieve them at the same time.
In particular, in the multi-agent setting, this will prohibit representing dominated
strategies as actual but disbelieved possibilities.
• Beyond the conceptual difficulty, the above requirement has unpleasant technical
ramifications. In particular, since A is in general infinite, the representation we have
uses nonstandard (i.e., infinitesimal) probabilities and utilities.
• The current theory does not account for the way in which preferences (and thus
probabilities and utilities) change in the face of new information; for this reason we
term it “static.” In particular, there is no obvious role for Bayesian conditioning, and
no account of counterfactual conditioning.
• Perhaps most damningly, the current theory makes no real use of the
hierarchical construction, beyond the weak use in Lemma 2. In particular, nothing
in the theory constrains the relationship between preferences at different levels,
contradicting the intuition that such “coherence” constraints ought to exist.
We now proceed to develop a theory that does not have these shortcomings.
1.2.3 A dynamic expected utility representation of hierarchical preferences
Recall that the expected utility representation afforded by the Jeffrey/Domotor construction
involves probabilities and utilities that obey the following equation:

u(E) = ∑_k u(Ek)p(Ek|E)     (1.1)

First on our agenda is to strengthen this property, and ensure that the probability and
utility obey the equation

u(E) = ∑_k u(Ek)pE(Ek)     (1.2)
where p is a conditional probability system (CPS). Recall that, given an algebra A,
a CPS (a.k.a. Popper function) p assigns to every non-empty conditioning event E ∈ A
a probability function over A ∩ E. Furthermore, pE agrees with Bayesian conditioning
whenever possible: for any nonempty E, F, G ∈ A such that G ⊂ F ⊂ E, pE(G) =
pE(F)pF(G). If E = W, and the unconditional probability of F is positive (that is, p(F) =
pW(F) > 0), then the above formula yields

pF(G) = p(G | F) = p(G ∩ F)/p(F).
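A minimal sketch of a CPS, under a hypothetical two-tier lexicographic construction: conditioning is Bayesian when the event has positive prior probability, and falls back to a uniform secondary measure on null events. The numbers are invented (the event {2, 3} is possible but disbelieved), and the exhaustive loop checks the chain-rule property pE(G) = pE(F)pF(G).

```python
from fractions import Fraction
from itertools import combinations

W = frozenset({0, 1, 2, 3})
prior = {0: Fraction(1, 2), 1: Fraction(1, 2), 2: Fraction(0), 3: Fraction(0)}

def p(E):
    return sum(prior[w] for w in E)

def p_cond(G, E):
    """p_E(G) for G a subset of the nonempty conditioning event E."""
    assert E and G <= E
    if p(E) > 0:
        return p(G) / p(E)           # ordinary Bayesian conditioning
    return Fraction(len(G), len(E))  # fallback: uniform on the null event E

subsets = [frozenset(s) for r in range(len(W) + 1)
           for s in combinations(sorted(W), r)]

# Chain rule of a CPS: p_E(G) = p_E(F) * p_F(G) whenever G <= F, F strictly
# inside E (exact arithmetic, so equality is checked exactly).
for E in subsets:
    for F in subsets:
        if not F or not (F < E):
            continue
        for G in subsets:
            if G <= F:
                assert p_cond(G, E) == p_cond(F, E) * p_cond(G, F)

# Conditioning on the disbelieved event {2, 3} falls back to the uniform tier.
assert p(frozenset({2, 3})) == 0
assert p_cond(frozenset({3}), frozenset({2, 3})) == Fraction(1, 2)
```

This illustrates the conceptual point made earlier: an event can be recognized as possible (it receives well-defined conditional probabilities) while being assigned unconditional probability zero.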
If we manage to guarantee (1.2) we will have escaped the first two limitations of the
(1.1)-based representation discussed above. Now let us go a step further. Consider any
E ∈ An and F ∈ Bn. Our claim is that, given the intuition behind our construction, E
ought to be probabilistically independent of F; the lower order events do not determine
the higher order preferences, and vice versa. From this it follows that our expected utility
representation should validate the following property:

u(E ∩ F) = ∑_k u(Ek ∩ F)pE(Ek)     (1.3)

Note that (1.1) is obtained as a special case of (1.3) by selecting p(E) > 0 and
F = W. Note also that we have now escaped the third and fourth limitations of the (1.1)-based
representation.
How do we obtain (1.3)? We do so by leveraging a relatively unknown construction
due to Fishburn (Fishburn 1982). His motivation was different from ours: giving a
conditional version of Savage’s construction. However, we will adapt and reinterpret the
mathematics to fit our intended interpretation. Fishburn starts with preferences defined on
pairs (x, E), where the first argument is an act, and the second an environmental event.⁴

⁴ We don’t discuss Fishburn’s intuition at length here, both because ours is different and because his is perhaps problematic. Briefly, however, the event is taken from an algebra on a set of states of nature, and the act is taken from a mixture set. An act x can be thought of as a probability measure on environmental events, and in this interpretation the representation is an interesting mix between von Neumann-Morgenstern and Savage: the agent chooses objective lotteries, and Nature chooses the context in which the lottery is performed. While the lotteries are objective, the probability of an (external) event in the context of another event is subjective, and can be uniquely derived from the agent’s preferences on pairs (x, E). Unfortunately,
our interpretation, events E ∩ F, where E ∈ An and F ∈ Bn, play the role of (act, event)
pairs (F, E). In other words, acts are viewed as being themselves events: they represent
(generally incomplete) descriptions of the agent’s conditional preferences (and hence
beliefs), and characterize conditional revealed-preference behavior.
We proceed now with the technical construction.
Mixtures

Let R be the set of all expected utility representations (p, u) on some algebra A.
For any two representations (p′, u′), (p′′, u′′) ∈ R, and for any λ ∈ [0, 1], we define
their (λ-)mixture to be a new representation (p, u) = (p′, u′)λ(p′′, u′′) such that:

p(E) = p′λp′′(E) = λp′(E) + (1 − λ)p′′(E)

u(E) = u′λu′′(E) = [λu′(E)p′(E) + (1 − λ)u′′(E)p′′(E)] / p(E).

For any two nonempty subsets of representations x, y ∈ 2^R, we define their λ-mixture
as the (nonempty) subset

xλy = {(p, u) ∈ R | (p, u) = (p′, u′)λ(p′′, u′′), (p′, u′) ∈ x, (p′′, u′′) ∈ y}.

Definition 3 A nonempty subset x ∈ 2^R is (mixture) convex if, for any λ ∈ [0, 1],
x = xλx.
some of the hypotheses introduced by Fishburn in order to derive his representation are quite unappealing in the suggested interpretation, and this is probably why the result never gained much popularity.
Let M ⊂ 2^R be the set of all convex subsets of R. M has the useful property of being
closed with respect to mixtures:

Proposition 3 For any x, y ∈ M, and for any λ ∈ [0, 1], xλy ∈ M.

Proof Let (p′, u′) and (p′′, u′′) be elements of xλy. Then (p′, u′) = (p′x, u′x)λ(p′y, u′y) and
(p′′, u′′) = (p′′x, u′′x)λ(p′′y, u′′y) for some (p′x, u′x), (p′′x, u′′x) ∈ x and (p′y, u′y), (p′′y, u′′y) ∈ y. We
want to show that (p, u) := (p′, u′)μ(p′′, u′′) is also in xλy, where μ ∈ [0, 1]. After some
algebra, one finds that p = p′μp′′ = (p′xμp′′x)λ(p′yμp′′y), and similarly u = (u′xμu′′x)λ(u′yμu′′y).
But (p′xμp′′x, u′xμu′′x) ∈ x and (p′yμp′′y, u′yμu′′y) ∈ y by convexity of x and y, and therefore
(p, u) is in xλy.
Many decision-theoretic treatments postulate that the set of objects on which
preferences are defined is a mixture set:

Definition 4 A set X is a mixture set with respect to a mixture operation xλy if,
for any x, y ∈ X and λ, μ ∈ [0, 1],

1. x1y = x

2. xλy = y(1 − λ)x

3. (xλy)μy = x(λμ)y.
The setM defined above turns out to have the desired structure.
Proposition 4 M is a mixture set.

Proof Properties 1 and 2 in Definition 4 hold trivially, so we shall concentrate on 3. First
we show that (xλy)μy ⊂ x(λμ)y. Let (p, u) be an element of (xλy)μy. Then there exist
(px, ux) ∈ x and (py, uy), (p′y, u′y) ∈ y such that p = (pxλpy)μp′y and u = (uxλuy)μu′y.
Note that p = px(λμ)p′′y, where p′′y = pyγp′y and γ = μ(1 − λ)/(1 − λμ) ∈ [0, 1]. Similarly, one
finds (after some algebra) that u = ux(λμ)u′′y, where u′′y = uyγu′y. But (p′′y, u′′y) is in y
by convexity, and hence (p, u) is in x(λμ)y. Now we show that x(λμ)y ⊂ (xλy)μy. Let
(p, u) be an element of x(λμ)y; then there exist (px, ux) ∈ x and (py, uy) ∈ y such that
p = px(λμ)py and u = ux(λμ)uy. But px(λμ)py = (pxλpy)μpy, and similarly ux(λμ)uy =
(uxλuy)μuy, which implies the result.
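The mixture operation and the mixture-set axioms of Definition 4 can be checked numerically on singleton sets {x}, {y}, for which the set-level mixture reduces to the representation-level one. The two representations below (on a two-atom algebra) are invented for illustration.

```python
def mix(r1, r2, lam):
    """(p, u) = (p', u') lam (p'', u''), per the definition in the text."""
    (p1, u1), (p2, u2) = r1, r2
    p = {E: lam * p1[E] + (1 - lam) * p2[E] for E in p1}
    u = {E: (lam * u1[E] * p1[E] + (1 - lam) * u2[E] * p2[E]) / p[E]
         for E in p1}
    return (p, u)

def close(r1, r2, tol=1e-12):
    """Componentwise comparison of two representations."""
    (p1, u1), (p2, u2) = r1, r2
    return all(abs(p1[E] - p2[E]) < tol and abs(u1[E] - u2[E]) < tol
               for E in p1)

# Two hypothetical representations on the atomic events 'a' and 'b'.
x = ({'a': 0.3, 'b': 0.7}, {'a': 2.0, 'b': 1.0})
y = ({'a': 0.6, 'b': 0.4}, {'a': 0.5, 'b': 3.0})

lam, mu = 0.4, 0.7
# Definition 4, property 1: x 1 y = x.
assert close(mix(x, y, 1.0), x)
# Property 2: x lam y = y (1 - lam) x.
assert close(mix(x, y, lam), mix(y, x, 1 - lam))
# Property 3: (x lam y) mu y = x (lam mu) y.
assert close(mix(mix(x, y, lam), y, mu), mix(x, y, lam * mu))
```

Note how the utility mixture is probability-weighted: it is the values u·p, not the utilities themselves, that mix linearly, which is what makes the mixture of two representations again a representation.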
Axiomatization
As before, let M be the set of all convex subsets of R. For any x ∈ M, let [x] =
{w | there exists (p, u) ∈ x that represents ≽w}.

Remark 5 Note that the slight abuse of notation implicit in our use of the [. . .]
operator is quite helpful. To begin with, [E ≻ F] and [x] are of the same type,
but the relationship is even tighter. Define |E ≻ F| = {(p, u) | u(E) > u(F)} and
|E ∼ F| = {(p, u) | u(E) = u(F)}. Note that both these types of set are convex, as are
their finite intersections. Moreover, since E ≻ (∼)F if and only if u(E) > (=)u(F),
then [E ≻ (∼)F] = [|E ≻ (∼)F|].
With this machinery, given a regular preference ordering ≻ we can define the induced
partial ordering ≻* on pairs (x, E), where x ∈ M and E ∈ A are such that [x] ∩ E is
nonempty.

• (x, E) ≻* (y, F) if and only if [x] ∩ E ≻ [y] ∩ F

• (x, E) ∼* (y, F) if and only if [x] ∩ E ∼ [y] ∩ F

Remark 6 Notice that (x, E) is ranked if and only if [x] ∩ E is nonempty. Hence,
it is immediately verified that ≻* is asymmetric and transitive, and ∼* is symmetric
and transitive.
Next, we introduce five additional axioms. First, we assume that mixtures of possible
preferences are also possible.

A1. If x, y ∈ M correspond to nonempty [x] ∩ E and [y] ∩ E, then [xλy] ∩ E is also
nonempty for all λ ∈ (0, 1).

The second axiom explicitly relates to mixtures.

A2. (Substitution) For all E, F ∈ A and x, y, z, t ∈ M, if (x, E) ∼* (z, F) and
(y, E) ∼* (t, F) then (x½y, E) ∼* (z½t, F).

The third axiom also relates to mixtures, and ensures a classical (i.e., standard)
representation.
A3. (Archimedean) {α : (xαy, E) ≽* (z, F)} and {β : (z, F) ≽* (xβy, E)} are closed
subsets of [0, 1].

We remark that A3 is imposed in order to obtain a classical representation: its main
role is to ensure that probabilities and utilities can be taken to be standard-valued.

The fourth axiom is fairly uninteresting. Its main role is to avoid triviality.

A4. (Relevance) There exist x, y ∈ M such that (x, W) ≻* (y, W).

Finally, the fifth axiom is a consistency requirement.

A5. (Consistency) For all nonempty E, F, G ∈ A with E ∩ F = ∅, [E ≻ F ∼
G] ∩ E ≻ [E ≻ F ∼ G] ∩ F ∼ [E ≻ F ∼ G] ∩ G.

This last axiom says, somewhat tautologically, that in the event that the agent prefers
E to F and is indifferent between F and G (i.e., whenever [E ≻ F ∼ G], where G may or
may not coincide with F), E is indeed preferred to F, and F is indifferent to G.
Theorem 5 Under J2 and A1-A5, there exists a conditional representation (p, u)
such that:

1. p : A × (A − {∅}) → [0, 1]

2. u is a real-valued utility function, defined for all (x, E) ∈ M × A such that
[x] ∩ E ≠ ∅, with the following properties:

(a) u(x, E) > (=)u(y, F) if and only if [x] ∩ E ≻ (∼)[y] ∩ F

(b) u(x, E) = ∑_k u(x, Ek)pE(Ek) for any finite, measurable partition {Ek} of E.
Proof For any nonempty E ∈ A, let M(E) be the set of all x ∈ M such that [x] ∩ E
is nonempty. If A1 holds, then M(E) is a mixture set. Let S be the set of all nonempty
[x] ∩ E, where (x, E) ∈ M × A. Then the following condition holds:

P1. ≻* is an asymmetric weak order on S.

The following condition is a consequence of J2, via Theorem 1:

P5. If (x, E) ≽* (x, F) and E ∩ F = ∅ then (x, E) ≽* (x, E ∪ F) ≽* (x, F).

Moreover, the following two conditions descend from A4:

P6. If E ∩ F = ∅ then (x′, E) ≻* (x′, F) and (y′, F) ≻* (y′, E) for some x′, y′ ∈ M.

P7. If E, F and G are mutually disjoint events in A, and if (x, F) ∼* (x, G) for some
x ∈ M, then there is a y ∈ M at which exactly two of (y, E), (y, F) and (y, G) are
indifferent.

Together with A2-A4, the above conditions validate P1-P7 in (Fishburn 1982, pp.
151-154) on S. The result then follows from the proof of Theorem 2 in (Fishburn 1982, p.
155), replacing the full product space M × A throughout with the subset S ⊂ M × A.
Remark 7 If F is an element of B, the representation specializes to

u′(F ∩ E) = ∑_k u′(F ∩ Ek)pE(Ek),

where u′ is defined by u′([x] ∩ E) = u(x, E), whenever F ∩ Ek ≠ ∅ for all k.
Furthermore, if we take F to be the whole set W, we get u′(E) = ∑_k u′(Ek)pE(Ek).
1.3 Multi-agent construction
The construction introduced in the previous section can be easily generalized to the multi-
agent case. As was mentioned in the introduction, although the multi-agent case is our
primary motivation, this section is brief because it is a straightforward extension of the
single-agent case.
Let I = {1, ..., n} be a set of agents. Agent i ∈ I is assumed to have preferences (and
hence beliefs) not only about the basic events in a finite algebra A0 and its own preferences,
but about other agents’ preferences as well. (In the treatment here we have all agents share the
base algebra A0, though this can be relaxed.)

As in the single-agent case, W represents a set of possible worlds, and ≽iw is a
function associating to each pair (i, w) a regular preference ordering on 2^W.
For any E, F ∈ A0, we denote by [E ≻i F] the proposition {w ∈ W | E ≻iw F}
(“i (strictly) prefers E to F”), and by [E ∼i F] the set {w ∈ W | E ∼iw F} (“i is
indifferent between E and F”). We denote by Bi0 the π-system obtained by taking all finite
intersections of propositions [E ≻i F] and [E ∼i F].

We recursively define n-th order (n > 1) algebras and preferences:

• An = An−1 ∪ (∪i∈I Bi n−1),

• Bin (i ∈ I) is the set of all finite intersections of propositions [E ≻i F], [E ∼i F],
where E, F ∈ An.

• A = ∪n An is the algebra generated by events E ∈ An (n ≥ 0), and Bi is the
π-system generated by [E ≻i F], [E ∼i F], where E, F ∈ A.
Again, we obtain a Jeffrey-style representation of agent i’s preferences, for all An
and i ∈ I:

ui(E)pi(E) = ∑_k ui(Ek)pi(Ek).

Moreover, under the same conditions introduced in the single-agent case, we get a
Fishburn-style representation for pairs (E, F) ∈ A × Bi, which satisfies

ui(E ∩ F) = ∑_k ui(Ek ∩ F)piE(Ek)

whenever E ∩ F ≠ ∅.
1.4 Conclusions
We presented a decision-theoretic approach aimed at overcoming several well-known
limitations of existing constructions, limitations that become particularly apparent, and
disturbing, in multi-agent applications. Our approach enjoys several advantageous features,
including the ability to represent:
including the ability to represent:
• Preferences on incomplete descriptions of the world.
• Conditional behavior, even contingent on disbelieved (counterfactual) events.
• Higher-order preferences (and hence beliefs).
The current limitations include:
• The axioms are not optimized for the proposed interpretation. That is, we glue
together constraints drawn from Domotor (or Jeffrey) and from Fishburn. Together
these are sufficient to guarantee the representation we seek, but there is no reason to
believe that they are necessary.
• We do not account for ‘agent causality’, or actions.
• As a result, we are not yet in a position to apply our construction to the resolution of
most game-theoretic paradoxes.
We view the first limitation as more of a mathematical annoyance than anything else,
and are actively working on removing the second one. We leave the explicit application of
our theory to game theoretic paradoxes to future work.
Chapter 2

Expected utility networks
2.1 Introduction
Modularity is the cornerstone of knowledge representation in artificial intelligence (AI); it
allows concise representations of otherwise quite complex concepts. Logic offers modu-
larity via the compositional nature of the logical connectives, and the property is exploited
by theorem provers. Probability allows this via the notion of probabilistic independence,
a notion fully exploited by Bayesian networks (Pearl 1988). In recent years there have
been several attempts to provide modular utility representations of preferences (Bacchus
and Grove 1995, Doyle and Wellman 1995, Shoham 1997).
It has proven difficult to devise a useful representation of utilities; this difficulty
can certainly be ascribed to the different properties of utility and probability functions, but
also, more fundamentally, to the fact that reasoning about probabilities and utilities together
requires more than simply gluing together a representation of utility and one of probability.
In fact, just as probabilistic inference involves the computation of conditional
probabilities, strategic inference, the reasoning process which underlies rational
decision-making, involves the computation of conditional expected utilities for alternative plans
of action, which may not have a modular representation even if probabilities and utilities,
taken separately, do.
The purpose of this chapter is to introduce a new class of graphical representations,
expected utility networks (EU nets). EU nets are undirected graphs with two types of arc,
representing probability and utility dependencies respectively. In EU nets not only prob-
abilities, but also utilities enjoy a modular representation. The representation of utilities
is based on a novel notion of conditional utility independence, which departs significantly
from other existing proposals, and is defined in close analogy with its probabilistic coun-
terpart.
We also define a novel notion of conditional expected utility (EU) independence, and
show that in EU nets node separation with respect to the probability and utility subgraphs
implies conditional EU independence. In this respect, choosing the “right” notion of con-
ditional utility independence turns out to be crucial.
What is important about conditionally independent decisions is that they can be effec-
tively decentralized: a single, complicated agent can be replaced by simpler, conditionally
independent sub-agents, who can do just as well. This property is of interest not only to ar-
tificial intelligence, since it can be exploited to reduce the complexity of planning, but also
to economic theory, as it suggests a principled way for the identification of optimal task
allocations within economic organizations.
The rest of the chapter is organized as follows. In section 2 we introduce our notion
of conditional utility independence, and discuss it in the context of other recent propos-
als in the literature. In section 3 we formally introduce EU nets, and discuss some of their
structural properties. Next, we extend the utility function from elementary “states” or out-
comes to general events, with the interpretation that the utility of an event is the expected
utility of that event, conditional on it being true. We explain this, and other characteris-
tics of the underlying decision theoretic framework, in section 4. In section 5 we show that
conditional probability and utility independence jointly imply conditional expected utility
independence, and argue that conditionally independent decisions can be effectively de-
centralized. In section 6 we address the issue of probabilistic and strategic inference in
EU nets, and show how conditional probabilities and conditional expected utilities can be
recovered from the structural elements of EU nets.
2.2 Conditional independence of probabilities and utilities
Conditional probability independence is a powerful notion: it incorporates a natural, in-
tuitive notion of relevance, and may dramatically reduce the complexity of probabilistic
inference by allowing a convenient decomposition of the probability function.
In strategic inference, reducing the complexity of the decision problem calls for a
decomposition of utilities along with probabilities. Yet, this is generally not enough: even
if probabilities and utilities are separately decomposable, strategic inference typically
involves computation of the expected utilities for alternative plans of action, and hence what
is really important is the ability to decompose the expected utility function.
Several proposals which recently appeared in the literature (Bacchus and Grove 1995,
Shoham 1997) rely on additive notions of utility independence, while the familiar notion of
probabilistic independence is multiplicative. This difference may account for the difficulty
encountered by these proposals in achieving a convenient decomposition of the expected
utility function, and hence an effective reduction in the complexity of strategic inference.
In this section, we propose a multiplicative notion of conditional utility indepen-
dence, which is a close analogue of its probabilistic counterpart. In the following sections,
we argue that these two notions turn out to play well together, by inducing a modular de-
composition of the expected utility function, and a consequent simplification of the decision
process.
Let {Xi}i∈N (N = {1, ..., n}) be a finite, ordered set of random variables5, and
let x0 = (x01, ..., x
0n) be some arbitrary given realization which will act as the reference
point (we use uppercase letters to denote random variables, and lowercase to denote their
realizations). A joint realizationx = (x1, ..., xn) represents a (global)state, or outcome.
For anyM ⊂ N , we denote byXM the set{Xi}i∈M . Let p be a strictly positive probability
measure defined on the Boolean algebraA generated byXN , and letu be a (utility) function
which assigns to each statex a positive real number. We assume that the decision maker’s
beliefs and preferences are completely characterized by(p, u). Specifically, we assume that
p represents the decision maker’s prior beliefs, and that for any two probability measures
p′ andp′′, p′ Â p′′ (p′ is preferred top′′) if and only if Ep′(u) > Ep′′(u), whereEp(u) =
∑x u(x)p(x). Finally, let
q(xM |xN−M) =p(xM , xN−M)
p(x0M , xN−M)
.
The interpretation ofq is in terms ofceteris paribuscomparisons: it tells us how
the probability changes when the values ofXM are shifted away from the reference point,
while the values ofXN−M are held fixed atxN−M .
^5 To keep the notation simple, we assume that they may take only finitely many values. Yet, the construction is easily extended to more general classes of random variables.
We also define a corresponding ceteris paribus comparison operator for utilities:

    w(x_M | x_{N−M}) = u(x_M, x_{N−M}) / u(x^0_M, x_{N−M}).

One way to interpret w is as a measure of the intensity of preference for x_M (with
respect to the reference point) conditional on x_{N−M}.
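As a toy illustration of the two ceteris paribus operators (our own sketch, not from the text; all numbers are hypothetical), consider a pair of binary variables with reference point x^0 = (0, 0):

```python
# Minimal sketch of the ceteris paribus ratios q and w for two binary
# variables X1, X2, with reference point x0 = (0, 0). Numbers hypothetical.
from itertools import product

p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}  # joint probability
u = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 3.0, (1, 1): 6.0}  # positive utility

def q(x1, x2):
    """q(x1 | x2): probability ratio as X1 shifts away from its reference 0."""
    return p[(x1, x2)] / p[(0, x2)]

def w(x1, x2):
    """w(x1 | x2): utility ratio as X1 shifts away from its reference 0."""
    return u[(x1, x2)] / u[(0, x2)]

for x1, x2 in product((0, 1), repeat=2):
    print(f"q({x1}|{x2}) = {q(x1, x2):.3f}   w({x1}|{x2}) = {w(x1, x2):.3f}")
```

With these numbers w(1|0) = w(1|1) = 3, so the intensity of preference for X1 does not depend on X2, while q(1|0) ≠ q(1|1), so the probability ratio does.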
Suppose that q(x_M | x_{N−M}) only depends on x_K, where K ⊂ N − M, but not on
x_{N−M−K}. It is easily verified that this condition holds for all x_N if and only if X_M is
probabilistically independent of X_{N−M−K} given X_K. We express (and record) this by defining
new quantities q(x_M | x_K) = q(x_M | x_{N−M}), where the conditions x_{N−M−K} are dropped. We
call this notion p-independence: note that it is only defined in terms of states (as opposed
to general events), and corresponds to conditional probability independence whenever it is
defined. Specifically, if A, B and C are three subsets of the set X_N of all random variables,
the statement "A is p-independent of B given C" only makes sense if A, B and C constitute
a partition of X_N, and in that case it is equivalent to the statement that A and B are
probabilistically independent given C.

A corresponding notion of conditional utility independence (u-independence) is defined
accordingly. Suppose that w(x_M | x_{N−M}) depends on x_K, but not on x_{N−M−K}. Hence,
the intensity of preference for the variables in X_M (relative to their reference values)
depends on the values of X_K, but not on those of X_{N−M−K}. In that case, we again define new
quantities w(x_M | x_K) = w(x_M | x_{N−M}), and say that X_M is u-independent of X_{N−M−K}
given X_K.
It is instructive to compare our notion of conditional utility independence with several
other proposals which have appeared in the literature, in the context of an example
adapted from (Bacchus and Grove 1995). Suppose that there are two basic events, H and
W ("health and wealth"), and that the following payoff tables, where payoffs are expressed
as multiples of u(¬H ∩ ¬W) (an arbitrary reference point), represent the decision maker's
preferences over H and W in two different scenarios:

    First scenario:           Second scenario:
           ¬W   W                    ¬W   W
    ¬H      1   2             ¬H      1   2
     H      3   6              H      3   4
Bacchus and Grove's utility independence is, fundamentally, a qualitative notion,
which in the example reduces to payoff dominance. Since H dominates ¬H and W dominates
¬W, utility independence holds in both cases.

Additive utility independence specializes utility independence by requiring that probability
measures with the same marginals be indifferent. In our 2×2 example, this amounts
to the restriction that u(H ∩ W) + u(¬H ∩ ¬W) = u(H ∩ ¬W) + u(¬H ∩ W). Hence,
additive utility independence holds in the second case but not in the first.

Shoham further specializes the notion of additive independence, with the following
intended interpretation: the two Boolean variables in the example are independent if it is
possible to associate to each of them a linear "contribution", such that the utility of a joint
realization is given by the sum of the contributions. In our 2×2 example, this criterion
coincides with additive independence; however, in the general case it is more stringent.
We too introduce quantitative information, but in a different way: the two variables in
the example are independent in our sense (conditional on the empty set) if the increment in
utility relative to the reference point is the product of the increments along each component.
The intended interpretation of utility independence in our case is that the "intensity of
preference" for one variable with respect to its reference value, represented by the ceteris
paribus utility ratio, does not depend on the particular value taken by the other variable.
Hence, in our sense, H and W are independent in the first scenario but not in the second.
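The contrast between the additive and the multiplicative criteria on the two tables above can be made explicit in a few lines. In the sketch below (ours, not from the text), states are encoded as pairs (h, w), with 1 for H or W and 0 for their negations:

```python
# Sketch: testing additive vs. multiplicative utility independence on the two
# health/wealth payoff tables (entries are multiples of u(notH, notW)).
def additive_independent(table):
    # Additive criterion: u(H,W) + u(notH,notW) == u(H,notW) + u(notH,W).
    return abs(table[(1, 1)] + table[(0, 0)]
               - table[(1, 0)] - table[(0, 1)]) < 1e-9

def multiplicative_independent(table):
    # Multiplicative criterion: the increment of the joint realization is the
    # product of the increments along each component, relative to the reference.
    ref = table[(0, 0)]
    return abs(table[(1, 1)] / ref
               - (table[(1, 0)] / ref) * (table[(0, 1)] / ref)) < 1e-9

first = {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 6}   # first scenario
second = {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 4}  # second scenario

print(multiplicative_independent(first), additive_independent(first))    # True False
print(multiplicative_independent(second), additive_independent(second))  # False True
```

The output mirrors the discussion: the multiplicative criterion holds only in the first scenario (6 = 3 · 2), the additive criterion only in the second (1 + 4 = 3 + 2).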
We claim that u-independence is a particularly attractive notion for two reasons:
• it is information that is natural to elicit from people, as it purely involves relevance
considerations and order-of-magnitude comparisons between utilities
• it gives rise to a graphical representation and associated inference mechanism,
expected utility networks (defined in the next section), which is simultaneously
modular in probabilities, utilities and expected utilities.
2.3 Expected utility networks: a formal definition and some structural properties
We define an expected utility network as an undirected graph G with two types of arc,
representing probability and utility dependencies respectively. Each node represents a
random variable (say, X_i), and is associated with two positive functions, q(x_i | x_{P(i)}) and
w(x_i | x_{U(i)}), where P(i) denotes the set of nodes directly connected to X_i via probability
arcs,^6 and U(i) the corresponding set of nodes directly connected to X_i via utility arcs.

These quantities are interpreted as the probability and utility ratios (defined in the
previous section) produced by some expected utility representation (p, u), and may be
assessed by the decision maker through ceteris paribus comparisons. Alternatively, the
probability layer of an EU net may be initially specified as a Bayes network, and the probability
ratios q derived from conditional probability tables.
Figure 1 depicts a simple EU net. Although it is possible to present much richer
examples, we select this one because it ties in with an example discussed in the chapter on
auctions.
Figure 1. A simple EU net. Probability arcs are represented by solid lines, and utility arcs by dashed lines.
^6 P(i) corresponds to the Markov mantle of X_i, i.e., the minimal set of variables such that, conditional on those, X_i is (probabilistically) independent of everything else.
If the q and w functions are specified directly, then any arbitrary assignment of positive
functions q(x_i | x_{BP(i)}, x^0_{AP(i)}) for all i (where AP(i) denotes the set of all variables in
P(i) whose index is greater than i, and BP(i) = P(i) − AP(i)) uniquely identifies a
corresponding probability function. Similarly, any arbitrary assignment of positive functions
w(x_i | x_{BU(i)}, x^0_{AU(i)}) (where AU(i) is the set of all variables in U(i) whose index is greater
than i, and BU(i) = U(i) − AU(i)) identifies a utility function, unique up to normalization.^7
We remark that, if q(x_i | x_{−i}) only depends on x_{P(i)}, then fixing x_{P(i)} completely
specifies the behavior of the probability function along the i-th coordinate (up to the
probability of the reference point), and that such behavior does not depend on the particular
values taken by the other variables. The same is true of the utility of x_i with respect to
its reference value, given x_{U(i)}.
It turns out that node separation with respect to the probability and utility subgraphs
characterizes all the implied p- and u-independencies. More precisely, for any probability-utility
pair (p, u) there exists an undirected graph G such that, if A, B and C are three
subsets of variables (each variable being associated with a node in the graph), A is p-
(resp., u-) independent of B given C if and only if C separates A from B with respect to
the probability (resp., utility) subgraph (in the sense that every path from A to B in the
subgraph must pass through C). In the language of (Pearl 1988), G is a perfect map of the
independence structure.
^7 The particular normalization we adopt is discussed in section 4.
Theorem 6 The set of p- and u-independencies generated by any pair (p, u) has a
perfect map.
Proof We follow the methodology of (Bacchus and Grove 1995): we appeal to a
necessary and sufficient condition in (Pearl and Paz 1989) and check that suitable
generalizations of p-independence and u-independence both possess the following five
properties: symmetry, decomposition, intersection, strong union and transitivity. We prove
it in the case of utility; the proof for probability is analogous.
Let A, B, C, D, R, R′, R″ be subsets of random variables, where R, R′ and R″ always
denote the subset of remaining variables in the appropriate context (so, for instance, in the
context of some A, B and C, R = X_N − A − B − C). As elsewhere in this chapter,
we use uppercase/lowercase to denote (subsets of) random variables and their realizations
respectively.

For the purpose of this proof, let us say that A is independent of B given C, and write
I(A, B | C), if and only if w(a | b, c, r) = w(a | b^0, c, r) for all (a, b, c, r). Note that this notion
generalizes u-independence. Then the following properties hold.
Symmetry: I(A, B | C) ⇒ I(B, A | C).

This follows because

    w(b | a, c, r) = u(a, b, c, r) / u(a, b^0, c, r)
                   = [u(a, b, c, r) / u(a^0, b, c, r)] · [u(a^0, b, c, r) / u(a, b^0, c, r)]
                   = [u(a, b^0, c, r) / u(a^0, b^0, c, r)] · [u(a^0, b, c, r) / u(a, b^0, c, r)]
                   = w(b | a^0, c, r),

where the third equality uses the hypothesis I(A, B | C), which states that
u(a, b, c, r)/u(a^0, b, c, r) = u(a, b^0, c, r)/u(a^0, b^0, c, r).
Decomposition: I(A, B ∪ D | C) ⇒ I(A, B | C) ∧ I(A, D | C).
This is equivalent to saying that w(a | b, c, d, r) = w(a | b^0, c, d^0, r) implies
w(a | b, c, r′) = w(a | b^0, c, r′) and w(a | c, d, r″) = w(a | c, d^0, r″). This follows
trivially, because r′ = (d, r) and r″ = (b, r).
Intersection: I(A, B | C ∪ D) ∧ I(A, D | B ∪ C) ⇒ I(A, B ∪ D | C).

Equivalently, w(a | b, c, d, r) = w(a | b^0, c, d, r) and w(a | b, c, d, r) = w(a | b, c, d^0, r)
imply w(a | b, c, d, r) = w(a | b^0, c, d^0, r).
This also follows quite easily by algebraic manipulation.
Strong union: I(A, B | C) ⇒ I(B, A | C ∪ D).

Equivalently, w(a | b, c, r) = w(a | b^0, c, r) implies w(b | a, c, d, r′) = w(b | a^0, c, d, r′).
This follows by symmetry, and the fact that r = (d, r′).
Transitivity: I(A, B | C) ⇒ I(A, V | C) ∨ I(B, V | C), where V is any single variable.
This is equivalent to saying that w(a | b, c, r) = w(a | b^0, c, r) implies
w(a | v, c, r′) = w(a | v^0, c, r′) or w(b | v, c, r″) = w(b | v^0, c, r″), which follows by
observing that either v is in b or else is in r.
2.4 Conditional expected utility
While probability is a set function, defined for general events, utility is so far only defined
for elementary events (states). The notion of p-independence introduced in section 2
precisely corresponds to conditional probability independence whenever it is defined, and it is
only defined in terms of states; in turn, this enabled us to define a new notion of conditional
utility independence, also defined in terms of states, which we named u-independence.
As we have seen, all p- and u-independencies can be immediately recovered from
the graphical structure of EU nets, because they are fully characterized by node separation
with respect to the probability and utility subgraphs. For instance, in the simple EU net
represented in figure 1, V and C are conditionally p- and u-independent of each other,
given A, B and S.
Suppose now that A and B are controllable variables, in the sense that their values
can be freely set by the decision maker. A rational decision maker will want to choose
values of A and B which maximize expected utility; hence, for each assignment (a, b), the
decision maker should compute the corresponding expected utility, and identify an optimal
decision. Clearly, the decision process becomes quite cumbersome when there are many
decision variables; to reduce its complexity, we seek conditions under which the expected
utility calculations can be conveniently decomposed.
The first step will be to extend utility to be a set function as well, with the following
interpretation: the utility of an event is the expected utility of the event, conditional on the
event being true. Formally,

    u(E) = Σ_{x∈E} u(x) p(x | E).
The following important property is an immediate consequence of the definition: for
any nonempty E ∈ A and for any non-empty, finite partition {E_k} of E, where the E_k may
or may not be elementary "states",
    u(E) = Σ_k u(E_k) p(E_k | E).
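The definition and the partition property are easy to check numerically. Below is a minimal sketch of our own, with hypothetical states and numbers:

```python
# Sketch: utility of an event as conditional expected utility,
# u(E) = sum over x in E of u(x) p(x|E), and the partition property
# u(E) = sum_k u(E_k) p(E_k|E). States and numbers are hypothetical.
states = ["s1", "s2", "s3", "s4"]
p = {"s1": 0.1, "s2": 0.2, "s3": 0.3, "s4": 0.4}
u = {"s1": 2.0, "s2": 1.0, "s3": 4.0, "s4": 0.5}

def pE(E):
    return sum(p[x] for x in E)

def uE(E):
    # p(x|E) = p(x)/p(E) for x in E, since p is defined on states
    return sum(u[x] * p[x] / pE(E) for x in E)

E = ["s1", "s2", "s3"]
E1, E2 = ["s1"], ["s2", "s3"]  # a partition of E

lhs = uE(E)
rhs = uE(E1) * pE(E1) / pE(E) + uE(E2) * pE(E2) / pE(E)
print(abs(lhs - rhs) < 1e-12)  # True
```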
The (von Neumann-Morgenstern) utilities we start from^8 are only defined up to
positive affine transformations. It is natural, then, to normalize the utility measure around
certain values, just as probabilities are normalized to lie between zero and one. Hence,
we require that u(True) = 1, where True denotes the tautological event, or the entire
universe.^9
Although it shall not play a direct role in EU nets, in order to facilitate the exposition
we also define the value (or impact) of event E:

    v(E) = u(E) p(E).

Under the above normalization v is a (strictly positive) probability measure, since it
is an additive set function, and

    v(True) = 1 ≥ v(E) > 0
for all nonempty E. Moreover, since p is also strictly positive, we have that

    u(E) = v(E) / p(E).
^8 We start with von Neumann-Morgenstern utilities and a given prior probability, as is customary in economics. Of course, the decision-theoretic representation developed in chapter 1 is also fully adequate (and, we believe, offers better foundations) for our treatment of EUNs, although we decided to start with von Neumann-Morgenstern utilities to emphasize that none of the results in the foregoing treatment depend on our particular decision-theoretic perspective.
^9 This normalization uniquely identifies the expected utility function if the utility of a second event E_0 is also fixed, or, equivalently, if utilities are expressed as multiples of u(x^0), the utility of an arbitrary reference point, as we do in EUNs.
Note the remarkable structure of conditional expected utility: the utility “measure” is
simply the ratio of two probability measures, one representing value, and the other belief.
Besides being important for the practical construction of EU nets, this normalization
of u allows us to speak about "good" and "bad" events. True, the status quo, is neutral:
neither good nor bad. An event E is said to be good (i.e., better than True) if u(E) > 1,
and bad if u(E) < 1.
The conditional versions of the three set functions, probability, utility, and value,
are defined in the natural way: p(E | F) = p(E ∩ F)/p(F), and similarly
u(E | F) = u(E ∩ F)/u(F), and v(E | F) = v(E ∩ F)/v(F).

The three notions of conditioning are related by

    v(E | F) = u(E | F) p(E | F).
Being a probability measure, p obeys Bayes' rule (and, clearly, so does v):

    p(F | E) = p(E | F) p(F) / [p(E | F) p(F) + p(E | ¬F) p(¬F)].

Bayes' rule does not hold for utilities, but a modified version of it does:

    u(F | E) = u(E | F) u(F) / [u(E | F) u(F) p(F | E) + u(E | ¬F) u(¬F) p(¬F | E)].
Note that this is a “hybrid” relationship: conditional utility depends, among other
things, on conditional probabilities. This is another fact which is important to keep in mind
in connection with EU nets.
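The hybrid rule can be verified numerically. In the sketch below (our own toy example; all numbers are hypothetical), the two sides of the modified Bayes rule are computed independently on a four-state space:

```python
# Sketch: the "hybrid" Bayes rule for utilities, checked against direct
# computation on a toy two-variable state space. Numbers hypothetical.
states = [(e, f) for e in (0, 1) for f in (0, 1)]  # (E true?, F true?)
p = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}
u = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 3.0, (1, 1): 5.0}

def pr(ev):
    return sum(p[s] for s in ev)

def ut(ev):  # u(E): expected utility conditional on the event being true
    return sum(u[s] * p[s] for s in ev) / pr(ev)

E = [s for s in states if s[0] == 1]
F = [s for s in states if s[1] == 1]
notF = [s for s in states if s[1] == 0]
EF = [s for s in E if s in F]
EnF = [s for s in E if s in notF]

# conditional versions: u(E|F) = u(E ∩ F)/u(F), p(F|E) = p(E ∩ F)/p(E), etc.
uEgF, uEgnF = ut(EF) / ut(F), ut(EnF) / ut(notF)
pFgE, pnFgE = pr(EF) / pr(E), pr(EnF) / pr(E)

lhs = ut(EF) / ut(E)  # u(F|E) computed directly
rhs = uEgF * ut(F) / (uEgF * ut(F) * pFgE + uEgnF * ut(notF) * pnFgE)
print(abs(lhs - rhs) < 1e-12)  # True
```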
2.5 Conditional expected utility independence
We have now extended the utility function from complete states to arbitrary events, but
this new concept will be useful only insofar as it can be associated with a corresponding
independence notion, and this extended notion is also captured in the structure of the graph.
In this section we show that both these conditions hold. First we shall define a notion of
conditional expected utility independence, and then show that this notion is indeed captured
in the graphical structure of EU nets.
We define conditional expected utility independence (or, more concisely, conditional
EU independence) for general events in analogy with the familiar notion of conditional
probability independence. Two events, E and F, are said to be conditionally EU independent
given a third event G if

    u(E ∩ F | G) = u(E | G) u(F | G).
Conditional expected utility independence generalizes u-independence from states to
general events, much as conditional probability independence generalizes p-independence.
Yet, since expected utilities involve probabilities as well, the relationship between conditional
EU independence and u-independence is mediated by probabilities.
Let us look at the general case first. Consider a partition of the set of all random
variables into three subsets A, B and C. The conditional expected utility of b given a is

    u(b | a) = u(a, b) / u(a) = [Σ_c u(a, b, c) p(c | a, b)] / [Σ_{b,c} u(a, b, c) p(b, c | a)].
Suppose now that a separates b from c with respect to both the probability and utility
subgraphs. Then the following simplification obtains:

    u(b | a) = w(b | a) / Σ_b w(b | a) p(b | a),        p(b | a) = q(b | a) / Σ_b q(b | a).

Hence, the formula for u(b | a) does not involve terms in C, and similarly u(c | a) does
not involve terms in B.

This is not true if B and C are not p-independent, as the following example shows.
Example 1 Consider the special case in which A is empty, and w(b) = 1 (in which
case, we say that B is payoff-irrelevant). Then B and C are u-independent,
although they may not be p-independent. Hence,

    u(b) / u(b′) = [Σ_c w(c) q(c | b)] / [Σ_c w(c) q(c | b′)],

a quantity which is generally different from one. Intuitively, in this case B is purely
instrumental to C: it is irrelevant in itself, but its expected utility reflects the influence
that a particular choice of B has on the probability of C. If B and C are also
p-independent, then the above expression reduces to u(b)/u(b′) = 1 (in which case, we
say that B is strategically irrelevant).
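The formula in Example 1 can be exercised directly. In the sketch below (hypothetical numbers of our own), B has no intrinsic payoff (w(b) = 1) but shifts the distribution of a payoff-relevant C, so its expected utility ratio differs from one:

```python
# Sketch of Example 1: B is payoff-irrelevant (w(b) = 1) yet not strategically
# irrelevant, because it shifts the probability of C. Numbers hypothetical.
w_c = {0: 1.0, 1: 4.0}                   # w(c): utility ratios over C
q_cb = {(0, 0): 1.0, (1, 0): 0.5,        # q(c|b) for b = 0
        (0, 1): 1.0, (1, 1): 3.0}        # q(c|b) for b = 1

def u_ratio(b, b_prime):
    """u(b)/u(b') = sum_c w(c) q(c|b) / sum_c w(c) q(c|b')."""
    num = sum(w_c[c] * q_cb[(c, b)] for c in (0, 1))
    den = sum(w_c[c] * q_cb[(c, b_prime)] for c in (0, 1))
    return num / den

print(u_ratio(1, 0))  # != 1: choosing b = 1 makes the good outcome c = 1 likelier
```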
These observations are central to the following result.
Theorem 7 p- and u-independence jointly imply conditional expected utility
independence.
Proof Let b and c be p- and u-independent given a. We want to show that u(bc | a) =
u(b | a) u(c | a). We have that

    u(bc | a) = w(a, b, c) / Σ_{b,c} w(a, b, c) p(bc | a)
              = w(c | a, b) w(b | a, c^0) w(a | b^0, c^0) / Σ_{b,c} w(a, b, c) p(b | a) p(c | a)
              = w(c | a) w(b | a) / Σ_{b,c} w(c | a) w(b | a) p(b | a) p(c | a)
              = [w(b | a) / Σ_b w(b | a) p(b | a)] · [w(c | a) / Σ_c w(c | a) p(c | a)]
              = u(b | a) u(c | a),

which proves the result.
Hence, the graphical structure of EU nets can be exploited to identify conditional EU
independencies: node separation with respect to both the utility and probability subgraphs
implies conditional EU independence. The upshot is that, conditional on A, decisions
regarding B and C can be effectively decomposed: if both B and C contain variables
which are under the control of the decision maker, it is neither necessary nor useful to possess
information about C in order to decide on B, and vice versa. One way to think about such
decomposability is in terms of strategic decentralization: a single, centralized decision
maker can be replaced by two conditionally independent, simpler agents, who only need to
worry about their own respective domains in order to make jointly optimal decisions.
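To make Theorem 7 concrete, the following sketch (our own construction, with hypothetical numbers) builds a three-variable model in which B and C are p- and u-independent given A, by letting both p and u factor accordingly, and then verifies u(bc|a) = u(b|a)u(c|a) directly from the definition of conditional expected utility:

```python
# Sketch verifying Theorem 7 numerically: if B and C are p- and u-independent
# given A, then u(bc|a) = u(b|a) u(c|a). A, B, C binary; numbers hypothetical.
from itertools import product

pA = {0: 0.5, 1: 0.5}
pBgA = {(0, 0): 0.7, (1, 0): 0.3, (0, 1): 0.4, (1, 1): 0.6}  # p(b|a)
pCgA = {(0, 0): 0.2, (1, 0): 0.8, (0, 1): 0.5, (1, 1): 0.5}  # p(c|a)
f = {0: 1.0, 1: 2.0}                                          # factor for A
g = {(0, 0): 1.0, (1, 0): 3.0, (0, 1): 1.0, (1, 1): 5.0}     # factor for B given A
h = {(0, 0): 1.0, (1, 0): 2.0, (0, 1): 1.0, (1, 1): 4.0}     # factor for C given A

def p(a, b, c):
    return pA[a] * pBgA[(b, a)] * pCgA[(c, a)]   # p-independence of B, C given A

def u(a, b, c):
    return f[a] * g[(b, a)] * h[(c, a)]          # u-independence of B, C given A

def u_cond(fix):
    """Expected utility of the event {variables agree with fix}, given the event."""
    ev = [s for s in product((0, 1), repeat=3)
          if all(s[i] == v for i, v in fix.items())]
    mass = sum(p(*s) for s in ev)
    return sum(u(*s) * p(*s) for s in ev) / mass

for a, b, c in product((0, 1), repeat=3):
    lhs = u_cond({0: a, 1: b, 2: c}) / u_cond({0: a})          # u(bc|a)
    rhs = (u_cond({0: a, 1: b}) / u_cond({0: a})) * \
          (u_cond({0: a, 2: c}) / u_cond({0: a}))              # u(b|a) u(c|a)
    assert abs(lhs - rhs) < 1e-9
print("u(bc|a) = u(b|a)u(c|a) holds for all (a, b, c)")
```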
2.6 Inference in expected utility networks
In EU nets, probabilities and utilities are implicitly described by the q and w functions,
together with the topological structure of the network. Probabilistic inference involves
the computation of conditional probabilities, and strategic inference the computation of
conditional expected utilities; in this section, we show how these quantities can be readily
recovered from the structural elements of EU nets.
The probability layer of an EU net is essentially a Markov network,^10 even though
probabilities are subject to a somewhat unusual normalization, and hence probabilistic
inference can be performed with standard techniques whenever the (Markov) potentials q are
known. In turn, potentials can either be assigned directly by the decision maker in the form
of ceteris paribus comparisons, or derived from conditional probabilities if one starts with
a Bayes network.
The advantage of using utility “potentials” in EU nets is that they are based purely on
utility comparisons between states, which do not involve probabilities: this enables one to
elicit all the relevant preferences from the decision maker without assuming that he or she
already knows the probabilities.
Although we do not tackle here the issue of computational efficiency for probabilistic
and strategic inference in EU nets (a topic which is next on our research agenda), we shall
show how the two fundamental operations of marginalization and conditionalization for
probabilities and expected utilities can be easily reduced to operations on the probability
and utility potentials.
Once the potentials are known, the computation of p(x)/p(x^0) is straightforward:
^10 For a good introduction to Markov networks, we refer to (Pearl 1988, chapter 3).
    p(x)/p(x^0) = [p(x_1, x^0_{−1}) / p(x^0)] · [p(x) / p(x_1, x^0_{−1})]
                = q(x_1 | x^0_{P(1)}) · [p(x_2, x_1, x^0_{−{1,2}}) / p(x^0_2, x_1, x^0_{−{1,2}})] · [p(x) / p(x_2, x_1, x^0_{−{1,2}})]
                = q(x_1 | x^0_{P(1)}) · q(x_2 | x_1, x^0_{AP(2)}) · [p(x) / p(x_2, x_1, x^0_{−{1,2}})]
                = ... = ∏_i q(x_i | x_{BP(i)}, x^0_{AP(i)}).
One can obtain p(x_M)/p(x^0) (where p(x_M) is now the marginal probability function
for a subset of random variables X_M) by summing over the X_{N−M}:

    p(x_M)/p(x^0) = Σ_{x_{N−M}} p(x_M, x_{N−M}) / p(x^0).

One can then use p(x_M)/p(x^0) to compute ratios of marginal probabilities p(x_A)/p(x_B),
and in particular conditional probabilities p(x_A | x_B) = p(x_A, x_B)/p(x_B).
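The telescoping product above can be checked numerically. The following sketch (ours, not from the text; all numbers hypothetical) builds a chain-structured joint distribution over three binary variables, so that each node's probability neighbours are its immediate predecessor and successor, and verifies that the product of the potentials recovers p(x)/p(x^0):

```python
# Sketch: recovering probability ratios from the potentials q, for a chain
# X1 - X2 - X3 with BP(2) = {1}, AP(2) = {3}, BP(3) = {2}. Numbers hypothetical.
from itertools import product

p1 = {0: 0.4, 1: 0.6}
p2g1 = {(0, 0): 0.5, (1, 0): 0.5, (0, 1): 0.2, (1, 1): 0.8}  # p(x2|x1)
p3g2 = {(0, 0): 0.3, (1, 0): 0.7, (0, 1): 0.9, (1, 1): 0.1}  # p(x3|x2)

def p(x):
    x1, x2, x3 = x
    return p1[x1] * p2g1[(x2, x1)] * p3g2[(x3, x2)]

x0 = (0, 0, 0)  # reference point

# Potentials q(xi | x_{BP(i)}, x0_{AP(i)}): earlier neighbours at their actual
# values, later neighbours clamped at the reference point.
def q1(x1):
    return p((x1, 0, 0)) / p(x0)

def q2(x2, x1):
    return p((x1, x2, 0)) / p((x1, 0, 0))

def q3(x3, x2):
    # The ratio does not depend on x1 (p-independence given x2),
    # so we may clamp x1 at its reference value.
    return p((0, x2, x3)) / p((0, x2, 0))

for x in product((0, 1), repeat=3):
    x1, x2, x3 = x
    assert abs(p(x) / p(x0) - q1(x1) * q2(x2, x1) * q3(x3, x2)) < 1e-12
print("p(x)/p(x0) = q1 * q2 * q3 for all states")
```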
To compute u(x)/u(x^0), we use the same decomposition:

    u(x)/u(x^0) = ∏_i w(x_i | x_{BU(i)}, x^0_{AU(i)}).

The marginal (expected) utility of x_M, relative to the reference point, can hence be
computed as

    u(x_M)/u(x^0) = Σ_{x_{N−M}} [u(x_M, x_{N−M}) / u(x^0)] p(x_{N−M} | x_M).

One can then use the ratios u(x_M)/u(x^0) to compute ratios of expected utilities u(x_A)/u(x_B),
and in particular conditional expected utilities u(x_A | x_B) = u(x_A, x_B)/u(x_B).
Notice that marginal utilities generally depend on probabilities. For instance, the utility
of catching a particular cab rather than not is measured by the ratio u(Cab)/u(¬Cab),
and will generally depend on how likely it is that another cab will show up, on the
probability of rain, and so on.
Again, we remark that using utility “potentials” as the initial data in EU nets (rather
than conditional expected utilities) enables the decision maker to specify all the relevant
preferences without any prior knowledge of the probabilities.
Chapter 3
Game networks
3.1 Introduction
There are two standard representations for mathematical games: the normal (or strategic)
form, and the extensive form. The extensive form is more structured than the normal form:
not only does it describe the identities of the players, the strategies available to each player, and
the payoff functions, as the normal form does; it also describes the information held by the agents
at any possible state of the system, and the causal structure of events in the game.
The extensive form is also more general than the normal form: any game in normal
form can be interpreted as a game with simultaneous moves, and represented in the
extensive form. The reverse operation (from the extensive to the normal form) is also possible,
but some information (namely, the structure of conditional independencies) is lost in
the process, and hence not all extensive-form solution concepts have a normal-form
equivalent. Furthermore, the normal-form representation is often so large that resorting to an
extensive-form representation is the only viable way to even write down the game.
Even though extensive-form representations are more compact than normal-form
ones, they are still quite "redundant": in concrete examples, many branches of the tree
often have the same payoffs or conditional probabilities, but the recognition of these
symmetries does not lead to a more parsimonious representation. Furthermore, changing a few
details in the setup usually entails rewriting the whole game; in other words, the
extensive-form representation is not particularly modular.
Finally, for the vast majority of solution concepts, strategic inference is not explicitly
(i.e., algorithmically) supported in either form; rather, it relies on informal reasoning on
the part of the analyst. This lack of explicit decision-theoretic foundations is reflected in the
proliferation of solution concepts in the literature, each of them backed by different, and to
some degree informal, strategic justifications.
We present a new graphical representation for mathematical games: Game Networks
(G nets). Specifically, G nets:
• are more general than extensive-form representations, but are more structured and
more compact
• admit rigorous foundations
• provide a computationally advantageous framework for strategic inference
• generalize probabilistic networks in AI, for which a large algorithmic “toolbox” is
already available.
3.2 G nets: a formal definition and some results
G nets comprise a finite, ordered set of nodes X_N (where N = {1, ..., n}),
corresponding to a set of relevant variables; a partition I of N which determines the identity of
the agent responsible for the decision at each node (including Nature); and two types of
arc, representing causal and preferential (teleological) dependencies. Causal dependencies
are represented by directed (probability) arcs, and preferential dependencies by undirected
(utility) arcs.

Each node is associated with two quantities, w(x_k | x_{U(k)}) and p(x_k | x_{P(k)}), where
U(k) are the nodes directly connected to x_k via utility arcs, and P(k) is the set of
probability parents^11 of x_k (k ∈ N), which we assume to be empty for one or more (initial)
nodes. While p(x_k | x_{P(k)}) is a conditional probability function, the same for all players, w
is a vector of functions, one for each player. For each i ∈ I, p_i(x_k | x_{P(k)}) is a (subjective)
conditional probability, while w_i(x_k | x_{U(k)}) is a ceteris paribus comparison operator:

    w(x_M | x_{N−M}) = u(x_M, x_{N−M}) / u(x^0_M, x_{N−M}),
where x_N = (x_M, x_{N−M}) is a state and x^0 is an arbitrary reference point; moreover,
u is a strictly positive von Neumann-Morgenstern utility function.
Probability loops are allowed, but they must have zero probability (this will ensure
that p(¬E | E) = 0). Furthermore, zero probability is identified with impossibility.
The incoming probability arcs represent those events which the agent who controls
X_k can observe at the moment of decision. A decision maker (including Nature) may
choose any random rule, as long as it depends on the truth values of the P(k) only.
An element of the partition generated by the P(k) is called an information set at k.
^11 The probability parents of a node x_k are those nodes which are immediate predecessors of x_k in the ordering induced by the (directed) probability arcs.
We say that payoffs are regular if the utilities of all states are positive, and are
expressed as multiples of u(x^0) (the arbitrary reference point). Clearly, any game can be
transformed into one with regular payoffs via a positive affine rescaling of the original
payoffs. Hence, without loss of generality, from now on we concentrate on games with
regular payoffs.
A G net in which only the probabilities of Nature's actions are specified is a G frame.
A G frame can be regarded as the set of all G nets which respect the implied independence
structure, and agree in the utility assignments and in the probabilities of Nature's moves.
Finally, a simultaneous G net is one in which there are no probability dependencies.
Theorem 8 Any finite game in extensive form has a G frame representation.
Proof It is well known that any finite game in extensive form has a normal-form
representation (Fudenberg and Tirole 1991, p. 85). But any normal-form game can be represented
as a simultaneous G frame, where each node is identified with the strategy set of some
player, which in turn implies the result.
Let A_i(H) be the set of actions available to player i at an information set H. A_i(H)
can be regarded as a partition of H into possible actions E_H.

Definition 5 A vector of probability measures {p_i}_{i∈N} is said to satisfy individual
rationality if the following condition holds for all i, for all information sets H, and
for all E_H ∈ A_i(H): p_i(E_H | H) > 0 only if u_i(E_H | H) ≥ u_i(F_H | H) for all F_H ∈ A_i(H).
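The condition of Definition 5 can be checked mechanically for a candidate rule at a single information set. The sketch below is purely illustrative; the actions, conditional utilities, and probabilities are all hypothetical:

```python
# Sketch: checking individual rationality at one information set H with
# actions {a1, a2, a3}. All numbers below are hypothetical.
u_H = {"a1": 2.0, "a2": 2.0, "a3": 1.0}   # u_i(E_H | H) for each action
p_H = {"a1": 0.5, "a2": 0.5, "a3": 0.0}   # candidate p_i(E_H | H)

def individually_rational(p_H, u_H):
    """p_i(E_H|H) > 0 only if E_H maximizes u_i(. | H) over the actions."""
    best = max(u_H.values())
    return all(u_H[a] == best for a, prob in p_H.items() if prob > 0)

print(individually_rational(p_H, u_H))                     # True
print(individually_rational({"a1": 0.5, "a3": 0.5}, u_H))  # False: a3 suboptimal
```

Note that positive probability may be spread over several actions, as long as each of them is optimal; this is what makes existence (Corollary 9 below) a mixed-strategy statement.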
Anticipating the discussion of the existence of strategic equilibrium in Game networks
in the next chapter, we state the following important corollary.

Corollary 9 For any finite, simultaneous G frame, there exists a corresponding G
net which satisfies individual rationality.
3.3 Some examples
As a first example of a G net we present the beer/quiche game. In this game, Nature selects
the type of player 1, who may be either strong (S) or weak (¬S). Player 1 (who knows his
type) is in a pub, and has to decide whether to get beer (B) or quiche (¬B). His decisions
are observed by player 2, who is a bully and, as such, enjoys fighting against weak types.
After observing what player 1 orders, player 2 decides whether to start a fight with player
1 (F) or not (¬F). If player 1 is strong he will fight back, and hence player 2 in that case
prefers not to fight. Player 1 always prefers not to get into a fight, but more strongly so if
he is a weak type and hence knows that he will be beaten up. Finally, strong types of player
1 prefer beer to quiche, while weak types have the opposite preference.
A G net (or, more precisely, a G frame) representation of this game is depicted in
Figure 2, along with the corresponding extensive-form representation. The probability
dependencies are represented by solid arrows, while the dashed and dotted lines represent
utility dependencies for players 1 and 2 respectively.

In the informal description above there are several implicit independence assumptions.
For instance, it is implicitly assumed that Nature's choice cannot depend on what
Figure 2. The beer-quiche game.
the two players will do later, or that player 2's decision contingent on the observation of
player 1's behavior is independent of Nature's choice. Moreover, it is implicitly assumed
that player 1 prefers beer to quiche regardless of whether he will have to fight or not, and
that player 2 only cares about 1's type if he chooses to fight, but not otherwise.
In the G net representation of this game, both probability and utility independencies
are captured in the structure of the network. By contrast, the extensive form only captures
probability independencies, while the existence of utility independencies (which induce
symmetries in the payoff structure) does not lead to a more compact representation. In our
example, compactness is also reflected in the number of parameters needed to specify the
game payoffs. In the extensive form, one needs 16 parameters to identify the payoffs. In
the G net representation, instead, one only needs 8 parameters.
Compactness is only one of several advantages of G nets over extensive forms; another
is modularity. Suppose that, in the context of our beer-quiche example, one is
told that Nature can choose not two but three types, corresponding to different strength
levels. If we already have an extensive-form representation for the case of two types we cannot
els. If we already have an extensive form representation for the case of two types we cannot
easily reuse it; in order to incorporate a third type we need a completely new drawing. In
the G net representation, by contrast, introducing a new type in an existing representation
is quite easy, as it amounts to adding some more probabilities and utility potentials. Sim-
ilarly, in a G net one can easily introduce new moves (for instance, the reaction of a third
player to player2’s decision to fight or not), or change the informational assumptions while
at the same time retaining most of the existing structure (for instance, payoffs do not need
to be completely reassessed if the state space is refined).
A third advantageous feature of G nets is that the relevant information about utilities
can generally be introduced more naturally than in extensive forms. In the context of our
example, for instance, going from the informal description to a numeric assessment of the
payoffs is relatively cumbersome; the decision maker needs to report absolute utility values
for all possible outcomes, while the informal description only compares a few different
scenarios. To construct a G net, instead, one only needs information about payoff
dependencies, and order-of-magnitude comparisons between the relative utilities of alternative
scenarios, which is closer to the type of information contained in the informal description.
A second example is the “random order” game, in which Nature selects the order in
which agents play, but each agent only observes whether it is its turn to play or not. In
this game there are three players, plus Nature. Nature decides who plays first; each player
only observes whether it is its turn to play or not, but does not directly observe the order
of play. When it is its turn, each player chooses either out, in which case the game ends,
or across, in which case the move goes to the next player unless two players have already
moved, in which case the game ends. Figure 3 shows the extensive-form and the G net
representation of this game, where the payoffs have been omitted for simplicity. In the G
net representation, A_i represents player i’s action in the event that it is its turn to move (E_i).
Figure 3. The random order game.
Notice that, in the G net representation of this game, the information sets E_1, E_2, E_3
are payoff-irrelevant: only actions and the order in which they are taken matter. In the
extensive-form representation, that is implicit.
This game illustrates why we do not model the probability layer as a Bayes network,
but rather we allow zero-probability directed loops. Even though the game always ends
after a finite number of moves, no information set logically precedes the others. By way
of contrast, Nature’s decision on who moves first logically precedes all other events, in
the sense that the truth value of no other event can be set, if not conditionally on the truth
value of Nature’s choice. If all information sets are singletons (i.e., the game has perfect
information), then it is always possible to ascribe a unilateral direction of causality to the
events in the game (which simply reduce to actions). Yet, if the information states are
not the same as physical events, then it becomes inconvenient to impose that a unilateral
direction of causality must hold among any arbitrary group of events.
Chapter 4. Strategic equilibrium in Game Networks
4.1 Introduction
In this chapter, we establish existence and convergence results for strategic equilibrium in
game networks. While existence results (which we present in section 3 below) are easily
obtained, the issue of convergence deserves special attention. Many convergence proce-
dures have already been proposed in the game-theoretic literature, but none of them has
proved completely satisfactory. Fictitious play is one of the most popular; it is simple
and intuitively appealing, but it has some problems. The main one is that it may fail to
converge to a Nash equilibrium, due to the possible existence of cycles, which may per-
sist forever even though their frequency goes to zero. The tracing procedure (Harsanyi
and Selten 1988) is also problematic: it may not converge, and moreover is based on a
non-constructive argument which makes it difficult to use in practice.
In section 4 we introduce a new, simple iterative method, which always converges
to a unique Nash equilibrium in generic n-player normal-form games. The equilibrium is
uniquely determined by the payoff structure of the game, can be computed with any level
of accuracy, and involves no weakly dominated strategies.
In many cases one is not interested in finding just one equilibrium, but rather would
like to obtain a complete list of all the Nash equilibria. It turns out that an adaptation of the
previous method can be put to such use. We discuss this method in section 5.
Finally we address a related issue, which becomes significant in the context of some
potential applications of game networks. In principle, conditionally independent sub-tasks
may be allocated to conditionally independent sub-agents, as conditional EU independence
implies that any optimal policy can be reproduced by the joint behavior of such sub-agents.
Yet, before one can decentralize its execution, one first needs to find an optimal policy, and
the task may turn out to be very cumbersome from a computational point of view.
Alternatively, one can distribute the decision problem among the sub-agents first, and
then let the system converge to some global policy in a decentralized fashion. On the other
hand, if we do so, we have no a priori guarantee that the system will indeed converge to
an optimal policy. In section 6 we study the convergence properties of such decentralized
systems in the special case of simultaneous G nets.
4.2 Setup
Let X be a finite set of states, and let A be the Boolean algebra of all the subsets of X.

We assume that the agent’s preferences admit a von Neumann-Morgenstern expected
utility representation (p, u), where p is a probability measure defined on A and u is a
function associating to each state x a positive real number u(x).

We extend the utility function¹² from states to general events by defining

$$u(E) = \sum_{x \in X} u(x)\,p(x \mid E).$$

¹² In the foregoing treatment we depart from the notational convention established in the previous chapters, and denote by u the utility function prior to normalization.
Let $v(E) = \frac{u(E)}{u(X)}\,p(E)$; we say that v is the value of event E. Notice that v is a
probability measure, since it is an additive set function and 0 ≤ v(E) ≤ 1. Moreover, for
any nonempty event E whose probability is nonzero,

$$\frac{u(E)}{u(X)} = \frac{v(E)}{p(E)}.$$
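As a quick numerical sanity check, the additivity of the value measure can be verified on a toy example; the three-state space, probabilities, and utilities below are illustrative, not from the text.

```python
# Sketch (illustrative numbers): checking that v(E) = (u(E)/u(X)) p(E)
# is an additive probability measure on a toy three-state space.
X = ["a", "b", "c"]
p = {"a": 0.5, "b": 0.3, "c": 0.2}        # probability measure on states
u_state = {"a": 2.0, "b": 1.0, "c": 4.0}  # positive state utilities

def u(E):
    """Utility of an event: u(E) = sum_x u(x) p(x|E)."""
    pE = sum(p[x] for x in E)
    return sum(u_state[x] * p[x] / pE for x in E)

def v(E):
    """Value of an event: v(E) = (u(E) / u(X)) p(E)."""
    pE = sum(p[x] for x in E)
    return u(E) / u(X) * pE

# v(X) = 1 and v is additive over disjoint events.
assert abs(v(X) - 1.0) < 1e-12
assert abs(v(["a", "b"]) - (v(["a"]) + v(["b"]))) < 1e-12
```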
4.3 Existence of equilibrium in game networks
In this section we present an existence result which only holds for simultaneous games.
Yet, the argument we use can be extended to general game networks, although we shall
concentrate on simultaneous games to keep the notation simple.
Let X = (X_1, ..., X_n) be a matrix of pure strategies, and u_i : X → R_+ be the payoff
matrix for player i (i = 1, ..., n). Let ∆ be the set of all product measures p on X, and let
v : ∆ → ∆ be the function defined by v(p) = ×_i v_i(p), where

$$v_i(p)(x_i) := \frac{\sum_{x_{-i}} p(x_i, x_{-i})\,u_i(x_i, x_{-i})}{\sum_{x} p(x)\,u_i(x)} = p_i(x_i)\,\frac{u_i(p)(x_i)}{u_i(p)(X)}.$$

Then v is a continuous self-map on a convex and compact subset of R^n, and
hence it has a fixed point by Brouwer’s theorem. A fixed point is characterized by the set
of equalities (p_i = v_i(p))_{i∈N}. Let F be the set of such fixed points.
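The map v can be written down directly for a two-player game; the payoff matrices below are illustrative (any strictly positive payoffs work), and `v_map` is a hypothetical helper name.

```python
import numpy as np

# Sketch (illustrative game): the map v(p) for two players with positive
# payoff matrices U1, U2 (rows: player 1's strategies, cols: player 2's).
U1 = np.array([[3.0, 1.0], [2.0, 2.0]])
U2 = np.array([[1.0, 3.0], [2.0, 1.0]])

def v_map(p1, p2):
    """v_i(p)(x_i) = p_i(x_i) * u_i(p)(x_i) / u_i(p)(X)."""
    eu1 = U1 @ p2      # u_1(p)(x_1): expected payoff of each row
    eu2 = U2.T @ p1    # u_2(p)(x_2): expected payoff of each column
    w1 = p1 * eu1      # numerator: sum_{x_-i} p(x_i, x_-i) u_i(x_i, x_-i)
    w2 = p2 * eu2
    return w1 / w1.sum(), w2 / w2.sum()

p1 = p2 = np.array([0.5, 0.5])
q1, q2 = v_map(p1, p2)
# v(p) stays in the product of simplices, as required for Brouwer's theorem.
assert abs(q1.sum() - 1.0) < 1e-12 and (q1 >= 0).all()
```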
A Nash equilibrium is defined as a probability measure p ∈ ∆ such that

$$\sum_{x_i} p_i(x_i)\,u_i(p)(x_i) \;\ge\; \sum_{x_i} q_i(x_i)\,u_i(p)(x_i)$$

for all q_i ∈ ∆_i and for all i ∈ N. Clearly, the set of Nash equilibria E is contained in
F, since equilibrium probabilities trivially satisfy the condition p_i = v_i(p) for all i. This is
an immediate consequence of the following result.
Proposition 10. The following two statements are equivalent:

1. p is a Nash equilibrium;
2. for all i in N and for all x_i ∈ X_i, either p_i(x_i) > 0 and u_i(p)(x_i)/u_i(p)(X) = 1, or u_i(p)(x_i)/u_i(p)(X) ≤ 1.
Proof. Clearly, if (2) holds, there exists no i ∈ N and q_i ∈ ∆_i such that
$\sum_{x_i} q_i(x_i)\,u_i(p)(x_i) > \sum_{x_i} p_i(x_i)\,u_i(p)(x_i)$, so we just need to prove that (1) implies (2). If p is a Nash equilib-
rium, then the expected utility of any two strategies played with positive probability by
player i ∈ N must be the same; otherwise player i can always obtain a higher expected
utility by relocating probability mass from the less profitable to the more profitable strat-
egy. Hence, $u_i(p)(X) = \sum_{x_i} p_i(x_i)\,u_i(p)(x_i) = u_i(p)(x_i)$, which implies that u_i(p)(x_i)/u_i(p)(X) = 1
for all x_i such that p_i(x_i) > 0. Moreover, it must be the case that u_i(p)(x_i) ≤ u_i(p)(X)
for any strategy x_i such that p_i(x_i) = 0; otherwise player i could obtain a higher expected
utility by giving probability 1 to x_i.
How do we know that the set of Nash equilibria is nonempty? Of course we can
appeal to Nash’s theorem, but we can also prove it directly. Since the direct proof motivates
the convergence method we define later on, we present it here.
Let f_ε = ×_i f_i, where f_i = εz_i + (1 − ε)v_i and z_i is the uniform distribution on player
i’s strategies. Brouwer’s theorem guarantees that the set of fixed points of f_ε is not empty.
Fixed points of f_ε have an important property:

Proposition 11. If p is a fixed point of f_ε, then u_i(p)(x_i) > u_i(p)(x'_i) implies p_i(x_i) > p_i(x'_i).

Proof. To prove the claim, it suffices to observe that: p_i(x_i) = ε/k + (1 − ε)v_i(p)(x_i),
where k is the number of pure strategies for player i; v_i(p)(x_i) > 0, and hence p_i(x_i) > ε/k; and

$$\frac{u_i(p)(x_i)}{u_i(p)(x'_i)} = \frac{p_i(x'_i)\,(p_i(x_i) - \varepsilon/k)}{p_i(x_i)\,(p_i(x'_i) - \varepsilon/k)} > 1$$

trivially implies p_i(x_i) > p_i(x'_i).
We define a robust equilibrium as a limit point of a sequence of fixed points of f_ε, as
ε goes to zero. By compactness of the strategy space any such sequence has a limit point,
and hence the set of robust equilibria is nonempty in ∆.
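A small sketch of the perturbed map in action: the dominance-solvable game and the value of ε below are illustrative (for a fixed ε the plain iteration need not converge in every game; here it does, and every application of f_ε keeps each strategy's probability above ε/k, as in the proof of Proposition 11).

```python
import numpy as np

# Illustrative dominance-solvable game: row 1 dominant for player 1,
# column 1 dominant for player 2. Payoffs are strictly positive.
U1 = np.array([[3.0, 3.0], [1.0, 1.0]])
U2 = np.array([[2.0, 1.0], [2.0, 1.0]])
eps = 1e-3

def f_eps(p1, p2):
    """One application of f_eps = eps * uniform + (1 - eps) * v."""
    z = np.full(2, 0.5)
    eu1, eu2 = U1 @ p2, U2.T @ p1
    v1 = p1 * eu1 / (p1 @ eu1)
    v2 = p2 * eu2 / (p2 @ eu2)
    return eps * z + (1 - eps) * v1, eps * z + (1 - eps) * v2

p1 = p2 = np.array([0.5, 0.5])
for _ in range(5000):
    p1, p2 = f_eps(p1, p2)

# Mass concentrates on the dominant strategies, but every strategy keeps
# probability above eps/k thanks to the uniform perturbation.
assert p1[0] > 0.9 and p2[0] > 0.9
assert (p1 > eps / 2).all() and (p2 > eps / 2).all()
```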
Since a robust equilibrium always exists, the following result ensures that the set of
Nash equilibria is not empty.
Proposition 12. Any robust equilibrium is a Nash equilibrium.

Proof. Suppose p is a robust equilibrium. Then v_i(p)(x_i) = p_i(x_i) for all i and x_i, and
therefore u_i(p)(x_i)/u_i(p)(X) = 1 for all x_i which get positive probability. Suppose now that x'_i has
zero probability in equilibrium, but u_i(p)(x'_i) > u_i(p)(x_i) for some x_i which is played with
positive probability. Then, by continuity, u_i(x'_i) > u_i(x_i) also for the perturbed problem
when ε is small, and hence in the limit p_i(x'_i) ≥ p_i(x_i), which contradicts our assump-
tion that p_i(x'_i) is equal to zero. Therefore, a robust equilibrium satisfies condition (2) in
Proposition 10, which proves the claim.
Robust equilibria are similar to proper equilibria (Myerson 1978), in that they are
limits of sequences of strictly positive measures{pn} where strategies with higher utilities
always have higher probabilities. We conjecture that robustness is enough to guarantee that
weakly dominated strategies are not played in equilibrium.
The following result is also worthy of note.
Proposition 13. Any fixed point of f which is the limit of a sequence of strictly positive
probability measures {p^n} such that $(v_i(p^n) - p^n_i)(p^{n+1}_i - p^n_i) \ge 0$ for all i and n is a
Nash equilibrium.
Proof. To see why this holds, suppose that, under the limit measure p, the probability of x_i
is zero but u_i(p)(x_i)/u_i(p)(X) > 1. Then, for large n, u_i(p^n)(x_i)/u_i(p^n)(X) > 1 by continuity, i.e. v_i(p^n) − p^n_i > 0.
But then p^{n+1}_i ≥ p^n_i, and therefore p^n_i cannot go to zero. Hence, the normalized utility
of any strategy which is played with zero probability is less than or equal to one, and the
normalized utility of any strategy which is played with positive probability is equal to 1
because p is a fixed point of f. But then p is a Nash equilibrium by Proposition 10.
4.4 Global convergence to equilibrium in simultaneous games
Let (p, u) be the individual expected utilities in a finite, simultaneous game. We saw that
the equilibria of this game are probability vectors p = (p_1, ..., p_n) which satisfy F(p) = 0,
where F(p) is the vector

$$F(p) = \big(\, u_i(p)(X)\,[\,p_i - v_i(p)\,] \,\big)_{i \in N}, \quad \text{and}$$

$$v_i(p)(x_i) = \frac{\sum_{x_{-i}} p_i(x_i)\,p_{-i}(x_{-i})\,u_i(x_i, x_{-i})}{\sum_{x} p(x)\,u_i(x)} = p_i(x_i)\,\frac{u_i(p)(x_i)}{u_i(p)(X)}.$$
Note that both the numerator and the denominator of v_i(p) are polynomial functions
of p, and hence ( u_i(p)(X)[p_i − v_i(p)] )_{i∈N} is a vector of polynomial functions, whose zeros
include all the Nash equilibria.
We study convergence to a zero of the vectorF under the assumption that the prob-
ability of a strategy increases or decreases in proportion to its relative utility with respect
to the other available strategies. For now, we shall ignore the fact that some fixed points
may fail to be equilibria; in fact, we show below that the method we are presenting will
converge to a Nash equilibrium in generic games.
Consider the perturbed problem F_ε = u(p)(X)[p − f_ε(p)]. Observe that F_ε can be
rewritten as εF^0(p) + (1 − ε)F(p), where F(p) is the target system whose zeros we want
to find and F^0(p) is the trivial system u(p)(X)[p − 1/k], whose unique solution is p = 1/k.

Then F_ε defines a convex-linear homotopy h(p, t) = F_{1−t} (Morgan 1987, p. 135) with
parameter t ∈ [0, 1]. Note that h coincides with the trivial system for t = 0, and with the
target system for t = 1.
In our setting, h is extremely well behaved: it satisfies conditions 1, 2, 3 and 4b in
(Morgan 1987, p. 122) by construction, and moreover it satisfies condition 5 (in R^n) because
F has no solutions at infinity and has at least one real root (since a Nash equilibrium exists).
Therefore, the homotopy continuation method discussed in (Morgan 1987) is guaranteed
to converge for generic games. The end point of the homotopy path is a robust equilibrium,
as it is the limit, asε goes to zero, of a sequence of solutions for the perturbed problem.
To handle degenerate cases, in which the uniform distribution is a bad choice of initial
condition, it suffices to introduce a slight random perturbation to the initial distribution or
to the game payoffs to guarantee convergence.
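The continuation idea can be sketched crudely by following the fixed points of f_ε as ε shrinks from 1 toward 0 (this is not the predictor-corrector path tracking of Morgan 1987, just an illustration; the coordination game and all numbers below are illustrative assumptions).

```python
import numpy as np

# Illustrative coordination game with common payoffs: two pure equilibria
# plus a mixed one; the path from the uniform start singles out one of them.
U1 = np.array([[3.0, 1.0], [1.0, 2.0]])
U2 = U1.copy()

def f_eps(p1, p2, eps):
    """f_eps = eps * uniform + (1 - eps) * v, applied to both players."""
    z = np.full(2, 0.5)
    eu1, eu2 = U1 @ p2, U2.T @ p1
    v1 = p1 * eu1 / (p1 @ eu1)
    v2 = p2 * eu2 / (p2 @ eu2)
    return eps * z + (1 - eps) * v1, eps * z + (1 - eps) * v2

p1 = p2 = np.full(2, 0.5)
for eps in np.linspace(1.0, 1e-4, 200):  # path parameter t = 1 - eps
    for _ in range(50):                  # settle near the fixed point at this eps
        p1, p2 = f_eps(p1, p2, eps)

# The end point of the path: from the uniform start, both players end up
# on their first strategy (the payoff-dominant equilibrium of this game).
assert p1[0] > 0.95 and p2[0] > 0.95
```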
Now we are in the position of defining a new solution concept (the end point of the
homotopy path), which we name first equilibrium, and claim that:
• any generic simultaneous game has a unique first equilibrium
• the first equilibrium is uniquely determined by the payoff structure of the game
• the first equilibrium of a generic game can be approximated using a simple iterative
procedure
• the first equilibrium is a robust equilibrium of the game.
4.5 Computing all the Nash equilibria
The method we discussed in the previous section only tracks a single robust equilibrium. Yet,
in many cases one wants a complete list of all the Nash equilibria. It turns out that an
adaptation of the previous method can be put to such use.
A Nash equilibrium may not have any homotopy path converging to it in R^n. Yet, the
following result ensures that we can get at them in the complex space C^n. Let F(x) be a
polynomial system, whose zeros we want to find, and let G be the initial system defined by

$$G_i(x) = \alpha_i^{d_i} x_i^{d_i} - \beta_i^{d_i},$$

where d_i is the degree of F_i and α_i and β_i are generic complex constants. Then
G(x) = 0 has d = ×_i d_i solutions. Let h(x, t) be the homotopy defined by

$$h_i(x, t) = (1 - t)\,G_i(x) + t\,F_i(x).$$

Then the following result in (Morgan 1987, p. 124) applies.
Theorem 14. Given F, there are sets of measure zero, A_α and A_β, in C^n such that, if
α ∉ A_α and β ∉ A_β, then:

1. the solution set {(x, t) ∈ C^n × [0, 1) : h(x, t) = 0} is a collection of d non-overlapping (smooth) paths;
2. the paths move from t = 0 to t = 1 without backtracking in t;
3. each geometrically isolated solution of F = 0 of multiplicity m has exactly m continuation paths converging to it;
4. a continuation path can diverge to infinity only as t → 1;
5. if F = 0 has no solutions at infinity, all the paths remain bounded. If F = 0 has a solution at infinity, at least one path will diverge to infinity as t → 1. Each geometrically isolated solution at infinity of F = 0 of multiplicity m will generate exactly m diverging continuation paths.
Observe that this method will identify all the zeros of F, and in particular those which
are not Nash equilibria. Yet, discarding the zeros which are not equilibria is relatively
straightforward (one just needs to check if there are any profitable deviations).
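The final filtering step can be sketched as a simple deviation check based on Proposition 10; the helper name `is_nash` and the example game are illustrative, not from the text.

```python
import numpy as np

# Sketch: keep a candidate zero p of F only if no player has a profitable
# pure-strategy deviation (which suffices, since mixed deviations are
# convex combinations of pure ones).
def is_nash(p_list, payoff_fns, tol=1e-9):
    """p_list: mixed strategies; payoff_fns[i](p_list) -> player i's
    expected payoff for each of her pure strategies."""
    for i, p_i in enumerate(p_list):
        eu = payoff_fns[i](p_list)
        if eu.max() > p_i @ eu + tol:  # a pure deviation beats p_i
            return False
    return True

# Illustrative 2x2 game: player 1 wants to match, player 2 to mismatch.
U1 = np.array([[2.0, 1.0], [1.0, 2.0]])
U2 = np.array([[1.0, 2.0], [2.0, 1.0]])
fns = [lambda ps: U1 @ ps[1], lambda ps: U2.T @ ps[0]]

assert is_nash([np.array([0.5, 0.5]), np.array([0.5, 0.5])], fns)
assert not is_nash([np.array([1.0, 0.0]), np.array([1.0, 0.0])], fns)
```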
4.6 Convergence to an optimal policy in single-agent game networks
To conclude our discussion on equilibrium convergence in game networks we would like to
concentrate on a related issue, which becomes significant in the context of some potential
applications of game networks.
Suppose that a large single-agent decision problem is modeled as a G net, and that the
network topology identifies several conditionally EU independent sub-tasks. In principle,
these sub-tasks may be allocated to conditionally independent sub-agents, as conditional
EU independence implies that any optimal policy can be reproduced by the joint behavior
of such sub-agents. Yet, before one can decentralize its execution, one first needs to find
an optimal policy, and the task may turn out to be very cumbersome from a computational
point of view.
Alternatively, one can distribute the decision problem among the sub-agents first, and
then let the system converge to some global policy in a decentralized fashion. On the other
hand, if we do so, we have no a priori guarantee that the system will indeed converge to an
optimal policy. In this section, we study the convergence properties of such decentralized
systems in the special case of simultaneous G nets, leaving the general case to future work.
Let X = (X_i, X_{-i}) be a matrix of pure strategies for player i, and for a non-strategic
opponent −i (Nature), whose policy p_{-i} is fixed. Also, let u_i(X) be the payoff matrix for
player i. We now define the following iterative method. We start with a uniform prior p^0_i
on X_i, and compute a vector of values for player i as follows:

$$v^0_i(x_i) = \frac{\sum_{x_{-i}} p^0_i(x_i)\,p_{-i}(x_{-i})\,u_i(x_i, x_{-i})}{\sum_{x} p^0_i(x_i)\,p_{-i}(x_{-i})\,u_i(x)}.$$

Next, we define a new probability measure

$$p^1_i(x_i) = v^0_i(x_i),$$

and use it to calculate new vectors v^1_i.

The procedure is formally defined as follows, for k ≥ 1:

$$p^0_i = \text{uniform},$$
$$v^k_i(x_i) = \frac{\sum_{x_{-i}} p^{k-1}(x_i, x_{-i})\,u_i(x_i, x_{-i})}{\sum_{x} p^{k-1}(x)\,u_i(x)},$$
$$p^k(x) = v^k_i(x_i) \times p_{-i}(x_{-i}).$$
Essentially, the procedure increases or decreases the probability of each strategy in
proportion to its relative expected utility. The following theorem says that in single-agent
decision problems the method always picks an optimal strategy.
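The procedure admits a compact sketch in the single-agent case; the payoff matrix and Nature's fixed policy below are illustrative assumptions.

```python
import numpy as np

# Sketch of the iterative procedure against a fixed Nature policy:
# probabilities are reweighted by relative expected utility at each step.
u_i = np.array([[4.0, 1.0],   # payoffs u_i(x_i, x_-i); rows: own strategies
                [2.0, 3.0],
                [1.0, 1.0]])
p_nature = np.array([0.5, 0.5])        # fixed policy p_-i of Nature

u_bar = u_i @ p_nature                 # u(x_i) = sum_{x_-i} p_-i(x_-i) u_i(x)
p = np.full(3, 1/3)                    # p^0: uniform prior
for _ in range(200):
    p = p * u_bar / (p @ u_bar)        # p^k(x_i) = v^k_i(x_i)

# Here u_bar = (2.5, 2.5, 1.0): the limit policy concentrates on the
# expected-utility maximizers, splitting mass between the two ties.
assert p[2] < 1e-6 and abs(p[0] + p[1] - 1.0) < 1e-6
```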
Theorem 15. The iterative procedure always converges (in L^1) to an optimal policy.

Proof. Let f^n = v^n_i, and let $u(x_i) = \sum_{x_{-i}} p_{-i}(x_{-i})\,u_i(x)$. L^p is a complete metric space,
and hence all Cauchy sequences converge. We first show that {f^n} is indeed a Cauchy
sequence in L^1. In our single-agent case, f^n is defined by

$$f^0 = \text{uniform}, \qquad f^1 = \frac{f^0 u}{\sum f^0 u},$$

$$f^{k+1} = \frac{f^k u}{\sum f^k u} = \frac{\dfrac{f^{k-1}u}{\sum f^{k-1}u}\,u}{\sum \dfrac{f^{k-1}u}{\sum f^{k-1}u}\,u} = \cdots = \frac{f^0 u^{k+1}}{\sum f^0 u^{k+1}} = \frac{u^{k+1}}{\sum u^{k+1}}.$$

Hence, we need to show that for any ε there exists an N such that

$$\left\| \frac{u^n}{\sum u^n} - \frac{u^m}{\sum u^m} \right\|_1 < \varepsilon$$

for all n, m > N.

Notice that, if u is constant, then f^n = f^0 for all n and we are done. Hence, we shall
assume that u is not constant. First note that

$$\left\| \frac{u^n}{\sum u^n} - \frac{u^m}{\sum u^m} \right\|_1 = \left\| \frac{u^N}{\bar u^N} \left( \frac{u^k \bar u^N}{\sum u^n} - \frac{u^h \bar u^N}{\sum u^m} \right) \right\|_1,$$

where $\bar u$ is the maximum value of u and k = n − N, h = m − N.

By the Schwarz inequality, the above is less than or equal to

$$\left\| \frac{u^N}{\bar u^N} \right\|_2 \left\| \frac{u^k \bar u^N}{\sum u^n} - \frac{u^h \bar u^N}{\sum u^m} \right\|_2,$$

which in turn is less than or equal to

$$\left\| \frac{u^N}{\bar u^N} \right\|_2 \left( \left\| \frac{u^k \bar u^N}{\sum u^n} \right\|_2 + \left\| \frac{u^h \bar u^N}{\sum u^m} \right\|_2 \right)$$

by the Minkowski inequality.

Since u ≤ $\bar u$, we also have that

$$\frac{u^k \bar u^N}{\sum u^n} \le \frac{\bar u^n}{\sum u^n} \le 1,$$

since the sum in the denominator includes a state at which u attains its maximum; the same
holds for $u^h \bar u^N / \sum u^m$. Therefore, the above expression is less than or equal to
$2\left\| u^N / \bar u^N \right\|_2$.

Furthermore, we have that

$$\left\| \frac{u^N}{\bar u^N} \right\|_2 = \left( \frac{1}{s} \sum_{1}^{s} \left( \frac{u}{\bar u} \right)^{2N} \right)^{\frac{1}{2}},$$

which goes to zero as N gets large (unless u is a constant function, but we already
ruled out that case).

Finally, notice that p^{n+1}_i > (<) p^n_i if and only if v^{n+1}_i > (<) v^n_i. But then the limit
policy p*_i is optimal by the argument used in the proof of Proposition 13.
It should be clear that, in the presence of EU independencies, the computation of
the current values only requires “local” information. Therefore, convergence to an optimal
policy can take place even if EU independent sub-tasks are allocated to independent sub-
agents.
Chapter 5. Application: auctions
5.1 An independent-value, second-price auction
In this section, we propose a concrete application of the G net machinery in the context
of an economic example: a second-price (“Vickrey”) auction. Our goal here is to show
how to use G nets to formally represent an economic mechanism – in our case, an auction
– and reason about it. We shall not try to establish new results: rather, our aim is to
show how the informal representations and reasoning typical of auction theory can be made
completely precise thanks to the G net machinery. Formal specification is a rich subject area
in the AI literature, while it is not significantly represented in the economic literature. We
conjecture that electronic commerce applications will motivate the investigation of formal
specification methods also in the context of economic theory.
In a second-price auction the highest bidder gets the auctioned good, but only pays
the second-highest bid. We assume that there are only two bidders¹³, and that the values of
the good to each bidder correspond to the realizations of independent random variables.

Agent 1 privately observes her own value for the good (denoted by V), and then
decides how much to bid (B). Independently, the value of the good for agent 2 (S) is
realized, and contingent on that he decides how much to bid (C). The two bids jointly
determine the final allocation (A), which is a pair a = (g, m) denoting who gets the good
(g = 1, 2) and how much must be paid for it (m).

¹³ This assumption is made for simplicity only: the same methodology and results apply if there are more than two agents.
Figure 4. An independent-value, second-price auction, from the point of view of agent 1.
To remove potential confusion, we emphasize that we only model the game from the
point of view of a single agent, and solve it as an individual decision problem. Yet, in a
second price auction, as well as in other dominance-solvable games, this also suffices to
identify the unique equilibrium.
Figure 4 represents the auction from the point of view of agent 1. The probability
layer is represented as a Bayes network, to emphasize the causal structure of the events in
the game. Once again, we remark that the probability potentials q (and the corresponding
Markov representation) can be readily derived from the conditional probability tables, in
which case the resulting G net looks like the one depicted in Figure 1. Here we omit this
extra step, as we shall not need it in the context of the example (since we will simplify the
expected utility function directly, rather than appealing to Theorem 7). Also, we omit all
the utility potentials w which are identically equal to 1 (corresponding to payoff-irrelevant
variables).
The probability of ending up with a particular allocation (g, m) depends on b and c,
and is given by¹⁴

$$p(a \mid b, c) = \begin{cases} 1 & \text{if } b \ge c,\; c = m,\; g = 1 \\ 1 & \text{if } b < c,\; b = m,\; g = 2 \\ 0 & \text{otherwise.} \end{cases}$$
For the purpose of exposition, we also postulate a specific (multiplicatively separable)
functional form for agent 1’s preferences. We assume that the following condition holds:

$$w(a \mid v) = \begin{cases} \dfrac{1+v}{1+m} & \text{if } g = 1 \\ 1 & \text{otherwise.} \end{cases}$$
Note that agent 1’s preferences on different allocations depend on her realized value
for the good. We also assume that the distribution of agent 2’s bids, from the point of view
of agent 1, has full support.

Agent 1 chooses her bid in order to maximize utility, given her private value for the
good. The expected utility of b conditional on v is given by:
$$u(b \mid v) = \int u(a, b, c, s \mid v)\,p(da, dc, ds \mid b, v)$$
$$= \frac{u(x_0)}{u(v)} \int w(a, b, c, s, v)\,p(da \mid b, c)\,p(dc, ds)$$
$$= \frac{u(x_0)\,w(v \mid a_0)}{u(v)} \left( \int_{-\infty}^{b} \frac{1+v}{1+c}\,p(dc) + \int_{b}^{\infty} p(dc) \right).$$

¹⁴ For definiteness, we assume that in the case of identical bids agent 1 gets the good.
The first-order condition for optimality is given by

$$\left. \frac{1+v}{1+c}\,p(c) \right|_{c=b^*} = \left. p(c) \right|_{c=b^*}$$

and returns b* = v.
Hence, regardless of what her opponent is going to bid (as long as the distribution
has full support), the optimal strategy for agent 1 is to bid her true evaluation: i.e., to bid
exactly the amount of money which keeps her indifferent between getting the good (and
paying for it) or not.
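The first-order condition can also be checked numerically; the discretization, the uniform opponent-bid distribution on [0, 10], and the private value v = 4 below are all illustrative assumptions.

```python
import numpy as np

# Monte Carlo sketch: expected utility of bid b against a full-support
# distribution of the opponent's bid c, using the payoff w = (1+v)/(1+m)
# on a win (the winner pays m = c) and 1 otherwise.
rng = np.random.default_rng(0)
c = rng.uniform(0.0, 10.0, size=200_000)  # opponent bids, full support
v = 4.0                                    # agent 1's private value

def expected_utility(b):
    win = b >= c                           # ties go to agent 1 (footnote 14)
    return np.mean(np.where(win, (1.0 + v) / (1.0 + c), 1.0))

bids = np.linspace(0.0, 10.0, 101)
best = bids[np.argmax([expected_utility(b) for b in bids])]
# The optimum sits at b* = v, up to grid and sampling noise.
assert abs(best - v) < 0.2
```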
5.2 The Generalized Vickrey auction
5.2.1 Introduction
In this section, we show how G nets can be used to reduce the computational complexity
of the generalized Vickrey auction (MacKie-Mason and Varian 1994). To do that we shall
not need the full power of G nets; in particular, we only need information about the agents’
utilities, and not their subjective probabilities. Therefore, we shall only use the utility layer
of G nets (henceforth, U nets).
5.2.2 Setup
As is customary in auction theory, we assume that the agents’ preferences are represented
by quasi-linear utility functions

$$U_i(x, m) = U_i(x) + m.$$
We say that U_i(x) is the agent’s evaluation (in monetary units) of outcome x.
For our purposes, a generalized Vickrey auction is a simultaneous auction which im-
plements the Groves-Clarke mechanism. Running a generalized Vickrey auction involves
three steps:
• collect the agents’ preferences
• identify an efficient allocation
• determine the corresponding monetary transfers.
The first step is to collect all the individual preferences, in the presumption that the
auction mechanism causes the agents to truthfully reveal them (this presumption is essen-
tially justified in the case of the GVA). Even this step can be quite cumbersome, if the
structure of preferences is not known in advance and the space of possible allocations is
large.
Ideally, the agents should submit a complete payoff table representing their evaluations
of the different outcomes. Yet, the space of possible allocations is often far too large
to even write down such a table.
Fortunately, if preferences are well-behaved (i.e., in the presence of u-independencies),
their U net representation can be much more parsimonious than a payoff table. Hence, col-
lecting the agents’ preferences in the form of U nets achieves the double goal of obtaining
the data in a compact and convenient form, without having to impose predefined, arbitrary
restrictions on the structure of the individual utilities. The agents themselves, when they
submit the U net representing their preferences, also identify the relevant independencies,
and those may in turn be exploited by the auctioneer in order to decrease the complexity of
finding an efficient allocation.
In the next subsection, we illustrate the computational advantages of this method in
the context of a stylized example.
To make them suitable for a U net representation, utilities should first be rescaled to
be strictly positive. Let u_i(x, m) = u_i(x)e^m, where u_i(x) = e^{U_i(x)}. Clearly, u_i represents
the same preferences as U_i, since it is obtained by a monotone increasing transformation
of U_i (remember: these are just utility functions, and not expected utilities). Note that
the u_i are strictly positive, and (multiplicatively) separable in x and m. The latter property
implies that m is u-independent of everything else (no income effects).
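A one-line numerical check that the rescaling preserves the preference order; the outcomes and evaluation values below are illustrative.

```python
import math

# Sketch: u_i(x, m) = e^{U_i(x)} e^m = e^{U_i(x) + m} is strictly increasing
# in U_i(x) + m, so it ranks (outcome, money) pairs identically.
U = lambda x: {"beach": 2.0, "fishery": -1.0}[x]   # illustrative evaluations
u = lambda x, m: math.exp(U(x)) * math.exp(m)

pairs = [("beach", -3.0), ("fishery", 1.0), ("beach", 0.0)]
r_quasilinear = sorted(pairs, key=lambda pr: U(pr[0]) + pr[1])
r_rescaled = sorted(pairs, key=lambda pr: u(*pr))
assert r_quasilinear == r_rescaled   # same preference order
```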
The second problem is to find an efficient allocation; this can be achieved by maximizing
the sum of the evaluations U_i. Equivalently, the problem can be restated as max_{x∈X} u,
where u = ×_i u_i(x), and X is some feasible set.
If we were to maximize a single utility function, we could use the U net representation
to identify u-independencies, and, whenever possible, exploit them in order to reduce the
complexity of utility maximization.
What about the aggregate evaluation u? If we wish to represent it as a U net, we need
to construct the utility potentials

$$w(x_M \mid x_{N-M}) = \frac{u(x_M, x_{N-M})}{u(x^0_M, x_{N-M})},$$

and identify u-independencies. We show how to do this momentarily.
5.2.3 Some results
Suppose that, for all agents, X_A and X_B are u-independent given the set of remaining
variables X_C. This is true if and only if X_C separates X_A from X_B. In that case, we have
that

$$w(x_A \mid x_B, x_C) = \frac{\times_i\, u_i(x_A, x_B, x_C)}{\times_i\, u_i(x^0_A, x_B, x_C)} = \times_i\, w_i(x_A \mid x_C),$$

which shows that the same u-independence condition is also satisfied by the aggregate
evaluation. This fact is recorded in the following proposition.
Proposition 16. Preference aggregation preserves unanimous u-independence.
A straightforward consequence of this result¹⁵ is that, if we take the individual U nets
and superimpose them, so that in the resulting graph there is an arc between two nodes if
and only if the same arc appears in at least one of the individual U nets, the resulting graph
is a perfect map of the u-independence structure for the aggregate evaluation. To obtain the
aggregate U net, we just need to associate to each node in the resulting graph the product
of the individual potentials w_i for the same node.

¹⁵ Incidentally, we remark that a similar result applies for unanimous p-independence, in the more general context of EUNs. Combining the two results, one obtains that aggregation preserves unanimous conditional EU independence.

Now we can perform the second step, and maximize the aggregate evaluation; clearly,
if the independence structure of the individual U nets is identical (i.e., the agents care exactly
for the same things), then the same independence relations also hold for the aggregate
evaluation, and therefore maximizing the latter is as hard as maximizing any of the individual
utilities. On the other hand, if everybody is concerned about different (but related)
aspects of the allocation, then maximizing the aggregate evaluation can be much harder,
since the resulting U net will approximate a completely connected graph. In general, the
problem of finding a maximal allocation is NP-hard¹⁶, but, as in the case of Bayes networks,
for special configurations the problem is substantially simpler, as the following
proposition shows.
Proposition 17. In the case of polytrees, an efficient allocation can be found in linear
time.

Proof. We first consider the case of a linear polytree of Boolean variables, defined as follows.
Let X_1, ..., X_n be Boolean variables, such that X_1 is u-independent of everything else
given X_2, X_2 is u-independent of everything else given X_1 and X_3, X_k (k = 3, ..., n − 1)
is u-independent of everything else given X_{k−1} and X_{k+1}, and X_n given X_{n−1}. Suppose,
without loss of generality, that w(X_1 | X_2) ≥ 1. Then, whenever X_2 is chosen, independently
of the values of the remaining variables, it is convenient to choose X_1 rather than
−X_1. Similarly, according to whether w(X_1 | −X_2) is greater than 1 or not, it is convenient
to choose X_1 or −X_1 whenever −X_2 is chosen. Let x_1(x_2) be the optimal value of x_1 as a
function of x_2. Note that identifying x_1(x_2) requires two binary comparisons, according to
the magnitudes of w(x_1 | x_2). To decide the optimal value of x_2 as a function of x_3 one needs
two more comparisons, according to the magnitudes of w(x_1(x_2), x_2 | x_3). Let x_2(x_3) be the
optimal decision; iterating this procedure, one then finds the optimal values of x_k as a function
of x_{k+1} for k = 3, ..., n − 1, each step requiring two comparisons. Finally, one can
choose the optimal value of x_n according to whether w(x_1(x_2), x_2(x_3), ..., x_k(x_{k+1}), ..., x_n)
is greater than 1 or not, which requires two more comparisons. In total, that makes 2n comparisons,
which is linear in n. By way of contrast, if there are no u-independencies one
would need up to 2^n comparisons, which is exponential in n. A similar argument applies
to general polytrees; if each variable has up to h distinct values and k neighbors, one finds
that no more than nh^{k+1} comparisons are needed, a quantity which is again linear in n.

¹⁶ A problem is said to be NP-hard if every problem in NP reduces to it in polynomial time; no polynomial-time algorithm for any NP-hard problem is known.
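The backward-induction argument in the proof is, in effect, max-product dynamic programming on a chain; the sketch below (with illustrative pairwise potentials) verifies the linear-time answer against brute force over all 2^n assignments.

```python
from itertools import product

# Sketch of the chain case: Boolean variables x_0..x_{n-1} whose aggregate
# evaluation factors into pairwise potentials w_k(x_k, x_{k+1}).
# A constant number of comparisons per variable suffices (linear in n).
n = 8
w = [{(a, b): 1.0 + ((a ^ b) + k % 2) for a in (0, 1) for b in (0, 1)}
     for k in range(n - 1)]

# Forward pass: msg[k][b] = best product over x_0..x_k given x_{k+1} = b.
msg = [{b: max(w[0][(a, b)] for a in (0, 1)) for b in (0, 1)}]
back = [{b: max((0, 1), key=lambda a: w[0][(a, b)]) for b in (0, 1)}]
for k in range(1, n - 1):
    msg.append({b: max(msg[k - 1][a] * w[k][(a, b)] for a in (0, 1))
                for b in (0, 1)})
    back.append({b: max((0, 1), key=lambda a: msg[k - 1][a] * w[k][(a, b)])
                 for b in (0, 1)})

# Backward pass: recover an optimal assignment.
x = [0] * n
x[n - 1] = max((0, 1), key=lambda b: msg[n - 2][b])
for k in range(n - 2, -1, -1):
    x[k] = back[k][x[k + 1]]

def total(xs):
    out = 1.0
    for k in range(n - 1):
        out *= w[k][(xs[k], xs[k + 1])]
    return out

# The linear-time answer matches brute force over all 2^n assignments.
assert total(x) == max(total(xs) for xs in product((0, 1), repeat=n))
```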
The third step is to figure out the payments, once an efficient allocation has been
picked. The complexity of this step, in the case of the generalized Vickrey auction, is the
same as the complexity of finding an efficient allocation.
In the next section, we illustrate the U net methodology in the context of a stylized
example.
5.2.4 Example: a Coase-tal economy
The coast of a lake is divided into n equally sized estates. Each estate can be either left undeveloped, or used for one of three possible activities: a beach resort, a fishery, or a chemical plant. The profitability of each of these activities depends on the type of activity practiced in the two neighboring estates, but not on the other ones. For instance, the profitability of a beach resort is highest when the two neighboring estates are left undeveloped, is only mildly affected by the proximity of other beach resorts, is more significantly affected by being contiguous to a fishery, and is dramatically upset by being close to a chemical plant.
In a GVA the owner of each estate should express his evaluations on 2^{2n} possible configurations, but in practice – given the local nature of preferences – it suffices to elicit his evaluations only on the 2^6 triplets of activities in his neighborhood. The agents' preferences are submitted in the form of U nets, one for each land owner. The individual U nets are then combined to form the aggregate evaluation, whose U net will have a "ring" structure. Such a structure is not a polytree, but it becomes a polytree (of triplets) if one of the triplets is instantiated. Hence, the complexity of finding an efficient allocation (and the complexity of calculating the corresponding payments, with the choice of constants described above) is linear in n. Considering that in the general case finding an efficient allocation is NP-hard, this constitutes a substantial improvement when n is a large number.
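The instantiation idea can be sketched as follows (an illustration, not part of the original text): fixing the activities of the first two estates cuts the ring into a chain, which is then solved by elimination. Here `profit` is a hypothetical table giving each owner's profit as a function of the triplet of activities in his neighborhood, and `ring_optimum` returns only the optimal aggregate value, to keep the sketch short.

```python
from itertools import product

ACTIVITIES = (0, 1, 2, 3)  # undeveloped, beach resort, fishery, chemical plant

def ring_optimum(profit, n):
    """profit[i][(left, own, right)] = profit of estate i on a ring of n >= 3
    estates. Fixing (a_0, a_1) turns the ring into a chain, solved by
    elimination; for a fixed activity set the total work is linear in n."""
    best = float("-inf")
    for a0, a1 in product(ACTIVITIES, repeat=2):
        # value[(prev, cur)]: best total profit of estates 1 .. i-1,
        # given that estate i-1 = prev and estate i = cur
        value = {(a0, a1): 0.0}
        for i in range(2, n):
            new_value = {}
            for (prev2, prev), val in value.items():
                for cur in ACTIVITIES:
                    cand = val + profit[i - 1][(prev2, prev, cur)]
                    if cand > new_value.get((prev, cur), float("-inf")):
                        new_value[(prev, cur)] = cand
            value = new_value
        # close the ring with the profits of estates n-1 and 0
        for (prev, last), val in value.items():
            best = max(best, val + profit[n - 1][(prev, last, a0)]
                             + profit[0][(last, a0, a1)])
    return best
```

A brute-force search over all 4^n configurations agrees with the result, which is what the NP-hard general case would require.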
Chapter 6
A maximum entropy method for expected utilities
6.1 Introduction
The notion of maximum entropy (Shannon 1949, Shannon and Weaver 1949, Jaynes 1983) has recently attracted considerable interest in the context of Bayesian probability theory (Good 1983, Jaynes 1998), as it provides a relatively simple, yet principled way to represent incomplete knowledge about uncertain domains. Not only do maximum entropy methods exhibit deep and interesting connections with seemingly unrelated notions such as minimum description length and default reasoning, but – most importantly – they provide intuitively reasonable answers to many statistical problems, and have been successfully applied to such diverse fields as physics, computer science, and even finance (Cover and Thomas 1991).
One way to think about maximum entropy is in terms of “typical” beliefs; in cases
where the available data are compatible with a large class of probabilistic models, it always
attempts to pick the most reasonable one, on the basis of symmetry considerations.
In economic decision theory, (subjective) probabilities are not regarded as a primitive
notion, but rather they are seen as aspects of the agent’s preferences, together with the sub-
jective utilities. While maximum entropy methods permit the characterization of “typical”
subjective beliefs, there is no corresponding notion of “typicality” for the case of utilities.
The aim of the present chapter is to derive a symmetric notion of maximum entropy for utilities, and investigate its connections with the notion of utility independence (u-independence) which we introduced in the context of EU networks. As it turns out, our maximum entropy method for utilities always returns indifference and u-independence whenever possible. Hence, utility entropy provides an indirect justification for our notion of u-independence, although this is not our main motivation in carrying out this program.
Our primary motivation is to provide a method for the revealed-preference estima-
tion of an unknown preference structure, based on observations on the agent’s behavior.
For instance, a current topic of interest in the theory of marketing is the estimation of the
potential demand for goods or services based on observations coming from large, heteroge-
neous databases of consumer data. Typically, the data are compatible with a broad spectrum
of possible consumer types, and hence the densities of the different types in the population
should first be estimated, and then incorporated in the model to produce an estimate of the
aggregate demand. To reduce the complexity of this task one may want to resort to a rep-
resentative agent formulation, in which the set of observed constraints is used to generate a
“typical” preference structure consistent with the observed data. Maximum entropy meth-
ods characterize “typical” belief structures, based on invariance considerations. Our aim is
to extend those methods in a principled way to the case of expected utilities, and use them
to characterize “typical” preference structures in the face of observed decision behavior
(the representative agent’s revealed preferences).
Our secondary motivation is to develop an algorithmically feasible procedure for coordination in game networks. In economics, work on coordination has usually suffered from the difficulty of capturing the fine psychological aspects underlying the phenomenon of coordination in actual social environments. Language, mutual information, and individual characteristics all contribute substantially to the outcome of the coordination process, and hence – given the impossibility of formally and practically accounting for those parameters – a theory of human coordination would have nearly no empirical content.
Yet, in artificial environments, such as those created by the interaction of computerized agents endowed with individual motivations and information, the normative study of coordination becomes a highly relevant issue. Not only can we choose the linguistic and "psychological" characteristics of our agents by design, but in principle we can also endow them with an effective algorithmic toolbox for coordination which would give them a strategic advantage in team games. For instance, in military applications, several divisions of the same army may face a "coordinated attack" dilemma in a situation in which the lines of communication between different divisions are disrupted. Having a common protocol for coordination (say, a computerized unit for each division, running the same coordination software) may reduce the risk of mis-coordination, and at the same time allow the different divisions to improve their joint performance. More generally, since maximum entropy methods provide a way to generate a unique "representative agent" (i.e., a unique expected utility) in the face of convex observed constraints, in principle one could use them to generate efficient reference points in game networks.
6.2 Setup
Let X be a finite set of states, and let A be the Boolean algebra of all the subsets of X.
We assume that the agent's preferences admit a (von Neumann-Morgenstern) expected utility representation (p, u), where p is a strictly positive probability measure defined on A and u is a function associating to each state x a positive real number u(x).
We extend the utility function from states to general events by defining

$$u(E) = \sum_{x \in X} u(x)\, p(x|E),$$

and normalize it by requiring that u(X) = 1.
Let v(E) = u(E)p(E); we say that v is the value of event E. Hence, under the above normalization v is a probability measure, since it is an additive set function, and 0 ≤ v(E) ≤ 1. Moreover, since p is strictly positive, for any nonempty event E we can write

$$u(E) = \frac{v(E)}{p(E)}.$$

Once again, notice the remarkable structure of utilities: utility is simply the ratio of two probability measures, one representing value and the other belief.
The entropy of a probability measure p is defined as

$$H(p) = -\sum_{x \in X} p(x) \ln p(x).$$
Moreover, if p and q are two probability measures, the relative entropy of q with respect to p is defined as

$$D(q\|p) = \sum_{x \in X} q(x) \ln\left(\frac{q(x)}{p(x)}\right).$$
This quantity is always nonnegative, and is equal to zero if and only if q ≡ p.
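Both definitions are direct to compute. The sketch below (an illustration, not part of the text) implements them over a finite state space represented as a tuple of probabilities; p is assumed strictly positive, as in our setup.

```python
import math

def entropy(p):
    """H(p) = -sum_x p(x) ln p(x), with the convention 0 ln 0 = 0."""
    return -sum(px * math.log(px) for px in p if px > 0)

def relative_entropy(q, p):
    """D(q||p) = sum_x q(x) ln(q(x)/p(x)); p is assumed strictly positive."""
    return sum(qx * math.log(qx / px) for qx, px in zip(q, p) if qx > 0)
```

As stated above, D(q||p) is zero exactly when q = p, and positive otherwise.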
6.3 Utility entropy
The relative entropy of value with respect to probability is given by

$$D(v\|p) = \sum_{x \in X} v(x) \ln\left(\frac{v(x)}{p(x)}\right) = \sum_{x \in X} u(x) \ln(u(x))\, p(x) = E_p[u(x) \ln u(x)]. \qquad (6.4)$$
Notice that minimizing the relative entropy of v relative to p is the same as minimizing the expectation of u(x) ln u(x). This expectation is always non-negative, and is equal to zero if and only if v ≡ p, in which case u(x) = 1 for all x, i.e. there is complete indifference among all states.
Minimizing the relative entropy of value with respect to belief provides us with a simple way of generating "typical" utilities, which we shall call maximum entropy utilities. If there is no information regarding the utilities of different states, then the method returns complete indifference, as one would expect. What happens if we incorporate some information about utilities? It turns out that the method returns the most symmetrical utility function, i.e. it assumes indifference and independence whenever possible.
Let us see what happens in the context of two simple examples.
A first example
Let X = {x1, x2, x3}, and let pi = p(xi), ui = u(xi), etc. We assume that the probabilities are given by (p1, p2, p3) = (1/5, 1/3, 7/15), and that u1 = 2, i.e. v1 = 2p1 = 2/5.
The maximum entropy utility can then be obtained by minimizing

$$E_p(u \ln u) = v_1 \ln\left(\frac{v_1}{p_1}\right) + v_2 \ln\left(\frac{v_2}{p_2}\right) + (1 - v_1 - v_2) \ln\left(\frac{1 - v_1 - v_2}{1 - p_1 - p_2}\right)$$

with respect to v, subject to the constraint v1 = 2p1. The unique solution is given by (v1, v2, v3) = (2/5, 1/4, 7/20).
Finally, one can solve for the corresponding utilities, which yields (u1, u2, u3) = (2, 3/4, 3/4). Hence, in the absence of any information regarding the utilities of x2 and x3, their maximum entropy utility turns out to be symmetric, as one would expect.
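The solution can be checked numerically. After substituting the constraint v1 = 2p1, the objective becomes a convex function of v2 alone, which the sketch below (illustrative, standard library only) minimizes by golden-section search.

```python
import math

# data of the example: p = (1/5, 1/3, 7/15), constraint u1 = 2, so v1 = 2/5
p = (1/5, 1/3, 7/15)
v1 = 2 * p[0]
rest = 1 - v1            # v2 + v3 must equal 3/5

def objective(v2):
    """E_p[u ln u] as a function of v2 alone, after the substitution."""
    v3 = rest - v2
    return (v1 * math.log(v1 / p[0])
            + v2 * math.log(v2 / p[1])
            + v3 * math.log(v3 / p[2]))

# golden-section search, valid because the objective is convex in v2
lo, hi = 1e-9, rest - 1e-9
phi = (math.sqrt(5) - 1) / 2
while hi - lo > 1e-12:
    a, b = hi - phi * (hi - lo), lo + phi * (hi - lo)
    if objective(a) < objective(b):
        hi = b
    else:
        lo = a
v2 = (lo + hi) / 2
v3 = rest - v2
u = (v1 / p[0], v2 / p[1], v3 / p[2])
```

The search returns v2 ≈ 1/4, v3 ≈ 7/20 and u ≈ (2, 3/4, 3/4), matching the closed-form solution above.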
A second example
Suppose that there are two Boolean variables (Health and Wealth), and let (x1, x2, x3, x4) = (HW, H−W, −HW, −H−W) be the corresponding four states.
Suppose that the two variables are probabilistically independent, and that the probabilities of H and W are 0.8 and 0.3 respectively.
Clearly, if there are no further constraints, the maximum entropy utility assigns complete indifference to all states, which trivially implies that H and W are u-independent.
If we further assume that u(H) = 3u(−H) and u(W) = 2u(−W), are the two variables still u-independent under the maximum entropy utility? The answer turns out to be yes.
The two constraints can be rewritten as

$$\frac{v_1 + v_2}{p_1 + p_2} = 3\,\frac{v_3 + v_4}{p_3 + p_4}, \qquad \frac{v_1 + v_3}{p_1 + p_3} = 2\,\frac{v_2 + v_4}{p_2 + p_4},$$

where v4 = 1 − v1 − v2 − v3, and p4 = 1 − p1 − p2 − p3.
Solving for v2 and v3, we obtain

$$v_2 = \frac{3p_1 + 3p_2 - v_1 - 2v_1 p_1 - 2v_1 p_2}{1 + 2p_1 + 2p_2}, \qquad v_3 = \frac{2p_1 + 2p_3 - v_1 - v_1 p_1 - v_1 p_3}{1 + p_1 + p_3}.$$
The probabilities are given by (p1, p2, p3) = (24/100, 56/100, 6/100), and we wish to find the value v which minimizes

$$E_p[u \ln u] = v_1 \ln\frac{v_1}{p_1} + v_2 \ln\frac{v_2}{p_2} + v_3 \ln\frac{v_3}{p_3} + (1 - v_1 - v_2 - v_3) \ln\frac{1 - v_1 - v_2 - v_3}{1 - p_1 - p_2 - p_3}$$

subject to the two constraints.
subject to the two constraints.
The unique solution is given by

$$(v_1, v_2, v_3, v_4) = \left(\frac{72}{169}, \frac{84}{169}, \frac{6}{169}, \frac{7}{169}\right),$$

while the marginals are given by v1 + v2 = 12/13, v3 + v4 = 1/13, v1 + v3 = 6/13, and v2 + v4 = 7/13. This information is summarized in the following table, where the last row and column represent the marginals.

          W        −W
H      72/169   84/169   12/13
−H      6/169    7/169    1/13
        6/13     7/13
One can then calculate the corresponding utilities ui = vi/pi:

$$(u_1, u_2, u_3, u_4) = \left(\frac{300}{169}, \frac{150}{169}, \frac{100}{169}, \frac{50}{169}\right).$$
The table below shows the utilities, in multiples of u4 (an arbitrary reference point).

        W   −W
H       6    3
−H      2    1

Notice that H and W are u-independent, as one would expect. This is always true if the probabilities are independent, but does not hold in general.
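The stated solution can be verified exactly in rational arithmetic. The check below (illustrative) confirms the two constraints and the multiplicative form of the utilities across the two variables.

```python
from fractions import Fraction as F

p = (F(24, 100), F(56, 100), F(6, 100), F(14, 100))
v = (F(72, 169), F(84, 169), F(6, 169), F(7, 169))
u = tuple(vi / pi for vi, pi in zip(v, p))   # u_i = v_i / p_i

# conditional utilities of H, -H, W, -W
uH  = (v[0] + v[1]) / (p[0] + p[1])
unH = (v[2] + v[3]) / (p[2] + p[3])
uW  = (v[0] + v[2]) / (p[0] + p[2])
unW = (v[1] + v[3]) / (p[1] + p[3])
```

Both constraints hold exactly (uH = 3·unH, uW = 2·unW), and u1·u4 = u2·u3, i.e. the utility factors multiplicatively over the two variables, in agreement with the 6-3-2-1 table above.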
6.4 Axiomatic justification of our maximum entropy method for expected utilities
So far, we have defined a maximum entropy method which can be used to derive a unique expected utility function satisfying some (convex) observed constraints. Yet, a useful method should not only be simple but also principled; a natural question is hence whether it is possible to provide a rigorous justification for our maximum entropy method for expected utilities. In this section, we show that the answer is yes: the method can be given a rigorous axiomatic foundation, in terms of a set of appealing invariance properties which we present and discuss below.
Suppose that we are trying to estimate an agent's expected utility function over a set of possible states X, whose subjective probabilities are known. Further, suppose that observation of this agent's behavior returns a set of constraints on its expected utilities, of the form EU(E) ≤ α (in fact, our treatment will allow for a more general class of constraints). For instance, we may have observations on the agent's purchasing behavior, or employment decisions, etc.
Our observations in general will be compatible with an infinite number of possible utility functions; among those, we would like to identify a unique utility function representing – in some sense – the least informative preference structure compatible with the observed data. We do so by imposing a set of axiomatic conditions on the selection rule, based on invariance considerations. Our conditions mirror those introduced by Shore and Johnson (Shore and Johnson 1980) for the selection of posterior probabilities, and can be informally stated as follows.
1. Uniqueness: the result should be unique.
2. Invariance: the choice of coordinate system should not matter.
3. System Independence: it should not matter whether one accounts for independent
information about independent systems together or separately.
4. Subset Independence: it should not matter whether one incorporates information on
mutually disjoint subsets of system states in terms of separate conditional utilities or
in terms of the full system utility.
Let U be the set of normalized expected utility functions on a set X, and let C be the set of all closed and convex (with respect to mixtures) subsets of U. Then any element Î of C can be expressed in terms of a (possibly infinite) set of inequalities of the type

$$\sum_{x \in X} u(x) f(x) \geq 0,$$

and conversely any such set of constraints identifies an element of C. We assume that the information on the agent's preferences always corresponds to a closed and convex constraint.
Without loss of generality, we can state our inference problem as follows: find u such that

$$H(u, p) = \min_{(u', p) \in \hat{I}} H(u', p),$$

where H is a suitable functional, p is given, and Î is a closed and convex constraint.
Equivalently, since u = v/p, the problem can be restated as min_{v ∈ I} H(v, p) for a suitable H, where I is the (closed and convex) constraint on v corresponding to the closed and convex constraint Î. We shall work with this equivalent formulation, as it is more convenient for our treatment. We remark that, if probabilities are known (as we assumed throughout), then the observed constraints – which come in the form of inequalities on expected utilities – can be equivalently restated in terms of value inequalities.
We now introduce an inference operator ◦, which associates to any constraint I and probability measure p a value function v = p ◦ I. This operation can be regarded as the outcome of the above minimization for some functional H.
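The restatement in terms of value inequalities is mechanical: since EU(E) = v(E)/p(E) with p(E) > 0, the observation EU(E) ≤ α is the linear constraint v(E) ≤ α·p(E). A small sketch (illustrative; events are encoded as sets of state indices, a representation not used in the text):

```python
def value_constraint(p, event, alpha):
    """Translate EU(E) <= alpha into the linear value inequality
    v(E) - alpha * p(E) <= 0. Returns a function of the value measure v
    that is nonpositive exactly on the feasible value measures."""
    p_E = sum(p[x] for x in event)
    return lambda v: sum(v[x] for x in event) - alpha * p_E
```

Because each such constraint is linear in v, a set of them cuts out a closed and convex region, as required of the constraint I.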
Our assumptions can hence be formally stated as follows.
Assumption 1 (Uniqueness). p ◦ I is unique for any prior p and new information I ∈ C.
Assumption 1 is quite appealing: it says that the outcome of the inference process
should be unambiguously determined.
Assumption 2 (Invariance). Let Γ be a transformation from x ∈ X to y ∈ X′, with (Γv)(y) := v(Γ^{−1}(y)). Let ΓI be the set of value measures on X′ corresponding to the original constraint I. Then, for any p and I,

$$(\Gamma p) \circ (\Gamma I) = \Gamma(p \circ I).$$
Assumption 2 says that coordinate transformations should not matter: the inference
method should be invariant with respect to relabelings.
Now suppose that the state space is composed of two independent subspaces, i.e. X = X1 × X2, and that pi, vi are probability measures on Xi (i = 1, 2). Moreover, let I1 and I2 be two constraints on v1 and v2 respectively. We then require that the following holds.
Assumption 3 (System Independence). (p1 × p2) ◦ (I1 × I2) = (p1 ◦ I1) × (p2 ◦ I2).
The assumption says that it should not matter if the inference rule on the two inde-
pendent subsystems is applied jointly or separately.
Finally, let {X_k} be a partition of X, and let p_{X_k}, v_{X_k} be the corresponding conditional measures. Let I_k be a family of constraints involving only the conditional values v_{X_k}, let I = ∩_k I_k, and let M be a constraint of the form v(X_k) = m_k, where the m_k are known values. We require that the following holds.
Assumption 4 (Subset Independence). Let v = p ◦ (I ∩ M). Then v_{X_k} = p_{X_k} ◦ I_k.
This last assumption states that it should not matter if new information on the conditional values of mutually exclusive subsets is incorporated jointly or separately.
The following result then holds:
Theorem 18 There exists an inference rule which satisfies all the above assumptions.
Moreover, any such rule is equivalent to the minimization of (6.4).
Proof The result is an immediate corollary of Theorem III and Theorem IV in (Shore and
Johnson 1980) for the discrete case.
Even though our theorem is a straightforward consequence of existing results for the
case of posterior probabilities, we submit that it is not without merit. Surprisingly, the
required assumptions are still quite reasonable if we reinterpret the posterior probability
as value and use the axiomatic derivation to characterize “typical” expected utilities via
our notion of utility entropy. We believe that our maximum entropy method for expected
utilities constitutes a valid and principled way to identify a representative agent from a
large population of types in the face of revealed-preference constraints, and could find
application not only to economic theory, but also to problems in marketing and electronic
commerce.
References
Aumann, R. and Brandenburger, A., 1995, "Epistemic Conditions for Nash Equilibrium", Econometrica 63, n. 5, 1161-1180.
Bacchus, F. and Grove, A., 1995, Graphical models for preference and utility. In Proc. 11th Conference on Uncertainty in Artificial Intelligence, 3-10.
Battigalli, P. and Siniscalchi, M., 1999, "Hierarchies of Conditional Beliefs and Interactive Epistemology", Journal of Economic Theory, forthcoming.
Bolker, E. D., 1967, "A Simultaneous Axiomatization of Utility and Subjective Probability", Philosophy of Science 34, 333-340.
Cover, T. and Thomas, J., 1991, Elements of Information Theory, Wiley.
De Finetti, B., 1937, "La prévision: ses lois logiques, ses sources subjectives", Annales de l'Institut Henri Poincaré, n. 7, 1-68.
Domotor, Z., 1978, "Axiomatization of Jeffrey Utilities", Synthese 39, 165-210.
Doyle, J. and Wellman, M. P., 1995, Defining preferences as Ceteris Paribus Comparatives. In Proc. AAAI Spring Symp. on Qualitative Decision Making, 69-75.
Epstein, L. and Wang, T., 1996, "'Beliefs about Beliefs' without Probabilities", Econometrica 64, n. 4, 1343-1374.
Fagin, R., Halpern, J., Moses, Y., and Vardi, M. Y., 1995, Reasoning about Knowledge. MIT Press, Cambridge.
Fishburn, P. C., 1982, The Foundations of Expected Utility, Reidel, Dordrecht, Holland.
Fudenberg, D. and Tirole, J., 1991, Game Theory, MIT Press, Cambridge.
Good, I. J., 1983, Good Thinking: The Foundations of Probability and Its Applications. University of Minnesota Press.
Halpern, J., 1997, "On ambiguities in the interpretation of game trees", Games and Economic Behavior 20, 66-96.
Harsanyi, J. and Selten, R., 1988, A General Theory of Equilibrium Selection in Games. MIT Press, Cambridge.
Hayes-Roth, B., 1995, Agents on Stage: Advancing the State of the Art in AI. In Proc. 14th International Joint Conference on Artificial Intelligence, 967-971.
Heifetz, A. and Samet, D., 1996, "Topology-Free Typology of Beliefs", ewp-game/9609002.
Jaynes, E. T., 1983, Papers on Probability, Statistics, and Statistical Physics. Kluwer, Boston.
Jaynes, E. T., 1998, Probability Theory: The Logic of Science. Available from the ftp site bayes.wustl.edu, subdirectory Jaynes.book.
Jeffrey, R. C., 1965, The Logic of Decision, University of Chicago Press, Chicago.
Koller, D. and Pfeffer, A., 1997, "Representations and Solutions for Game-Theoretic Problems", Artificial Intelligence 94, n. 1, 167-215.
Kripke, S., 1980, Naming and Necessity. Harvard University Press, Cambridge.
La Mura, P. and Shoham, Y., 1998, Conditional, hierarchical, multi-agent preferences. In Proc. of Theoretical Aspects of Rationality and Knowledge VII, 215-224.
MacKie-Mason, J. K. and Varian, H. R., 1994, "Generalized Vickrey Auctions". Ann Arbor, MI, Dept. of Economics, University of Michigan. Available from http://www-personal.umich.edu/~jmm/papers/gva3.pdf.
McCarthy, J., 1959, Programs with Common Sense. In Mechanization of Thought Processes, Proceedings of the Symposium of the National Physical Laboratory, 77-84.
Mertens, J.-F. and Zamir, S., 1985, "Formulation of Bayesian Analysis for Games with Incomplete Information", International Journal of Game Theory 14, 1-29.
Morgan, A., 1987, Solving Polynomial Systems Using Continuation for Engineering and Scientific Problems. Prentice-Hall.
Myerson, R., 1978, "Refinements of the Nash Equilibrium Concept", International Journal of Game Theory 7, 73-80.
Pareto, V., 1906, Manuel d'Economie Politique.
Pearl, J., 1988, Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann.
Pearl, J. and Paz, A., 1989, Graphoids: A Graph-Based Logic for Reasoning About Relevance Relations. In B. Du Boulay (Ed.), Advances in Artificial Intelligence - II, North-Holland.
Samet, D., 1994, "Hypothetical Knowledge and Games of Perfect Information", working paper, Tel Aviv University.
Savage, L. J., 1954, The Foundations of Statistics, Wiley, New York.
Shannon, C., 1949, "A Mathematical Theory of Communication", Bell System Technical Journal 47, 143-157.
Shannon, C. and Weaver, W., 1949, The Mathematical Theory of Communication, Univ. of Illinois Press.
Shoham, Y., 1997, A Symmetric View of Probabilities and Utilities. In Proc. 13th Conference on Uncertainty in Artificial Intelligence, 429-436.
Shore, J. E. and Johnson, R. W., 1980, Axiomatic derivation of the principle of maximum entropy and the principle of minimum cross-entropy. IEEE Transactions on Information Theory, vol. IT-26, no. 1, 26-37.
Shub, M. and Smale, S., 1983, "Complexity of Bezout's theorem II: volumes and probabilities", Computational Algebraic Geometry 109.
Tan, T. and Werlang, S., 1988, "The Bayesian Foundations of Solution Concepts of Games", Journal of Economic Theory 45, 370-391.
Vickrey, W., 1961, "Counterspeculation, auctions, and competitive sealed tenders", Journal of Finance 16, 8-37.
von Neumann, J. and Morgenstern, O., 1944, Theory of Games and Economic Behavior, Princeton University Press, Princeton, New Jersey.