FOUNDATIONS OF MULTI-AGENT SYSTEMS
A dissertation
submitted to the Graduate School of Business
and the Committee on Graduate Studies
of Stanford University
in partial fulfillment of the requirements
for the degree of
Doctor of Philosophy
in the subject of
Economic Analysis and Policy
Pierfrancesco La Mura
July 1999
I certify that I have read this dissertation and that in my
opinion it is fully adequate, in scope and quality, as a
dissertation for the degree of Doctor of Philosophy.
_____________________________________________
Robert Wilson (Principal Adviser)
I certify that I have read this dissertation and that in my
opinion it is fully adequate, in scope and quality, as a
dissertation for the degree of Doctor of Philosophy.
_____________________________________________
Sven Rady
I certify that I have read this dissertation and that in my
opinion it is fully adequate, in scope and quality, as a
dissertation for the degree of Doctor of Philosophy.
_____________________________________________
Yoav Shoham
Approved for the University Committee on Graduate Studies:
_____________________________________________
Abstract
The subject of multi-agent systems is relevant to both economic theory and artificial intelligence. Yet, very little work so far has tried to bridge the gap between the different formalisms, and provide a unified treatment. In this thesis, we approach the subject of multi-agent systems from first principles, and develop a new approach which enjoys foundational as well as computational advantages with respect to other existing treatments.
First we develop a revealed-preference decision theory for many agents. Our
theory has several advantageous features, including the ability to represent small
worlds, conditional and counterfactual preferences and beliefs, and higher-order
utilities and probabilities.
Next, we introduce a novel representation for single-agent decision problems, Expected Utility Networks (EU nets). EU nets generalize probabilistic networks from the AI literature, and provide a modular and compact framework for strategic inference. The representation relies on a novel notion of utility independence, closely related to its probabilistic counterpart, which we present and discuss in the context of other existing notions. Together, probability and utility independence are shown to imply expected utility (EU) independence. Finally, we argue that expected utility independence can be used to decentralize complex single-agent decision problems, in that conditionally EU independent sub-tasks can be allocated to simpler, conditionally independent sub-agents.
We then extend the EU nets formalism to multi-agent decision problems, and introduce a new class of representations, Game Networks (G nets). G nets constitute an alternative to existing game-theoretic representations, with distinctive advantages. In particular, G nets are more structured and compact than extensive forms, as both probabilities and utilities enjoy a modular representation.
Next, we discuss strategic equilibrium in game networks. Existence and convergence results are presented for the special case of simultaneous networks. Although we do not explicitly consider the general case, the proof technique and the main results should carry over to general G nets.
As a concrete application, we then show how G nets can be used to formally represent a second-price auction and solve for the equilibrium. We also show how G nets can be used to reduce the computational complexity of finding an optimal allocation and the corresponding payments in Generalized Vickrey Auctions. In general, the problem is NP-hard; yet, we show that for an important class of problems the solution can be obtained in linear time.
Our notion of utility independence can be given an alternative justification,
based on the notion of maximum entropy. We present a novel entropy method,
which allows for the characterization of “typical” preference structures in the face
of observed constraints (revealed preferences). We discuss potential applications of
the method, and provide an axiomatic justification.
Acknowledgments
First I would like to thank my family, and all my friends – no life, no truth.
Then I would like to thank my academic advisors, to whom goes my unconditional
admiration and respect not only for their contributions and guidance but also for
their human richness. I would also like to thank all the great people in the Graduate
School of Business and the department of Computer Science who helped me and
advised me. Thanks!
The material in the first three chapters was developed jointly with Yoav Shoham, whose supervision has proved invaluable. Even the material in the remaining chapters was strongly influenced by our discussions, and would certainly have been inferior otherwise. The material in chapter 4 was strongly influenced by several discussions with Bob Wilson and Robert Susserland, to whom goes all my gratitude. The material in chapter 5 was presented at Yoav Shoham’s CoABS seminars, and greatly benefited from the feedback offered by the seminar participants. Chapter 6 is part of an ongoing research project with Peter Grunwald, to whom I would also like to express all my gratitude. Finally, I would like to thank Mario Eboli, who provided constant encouragement and invaluable discussions.
Contents
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1
Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
Plan of the work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1 Revealed-preference foundations of multi-agent systems. . . . . . . . . . . . .11
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.2 Hierarchical conditional preferences: the single agent case . . . . . . . . . . . . . . . . . . . 14
1.2.1 Foundation: Jeffrey utility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.2.2 A static expected utility representation of hierarchical preferences . . . . . 17
1.2.3 A dynamic expected utility representation of hierarchical preferences . . 21
1.3 Multi-agent construction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
1.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2 Expected utility networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .32
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.2 Conditional independence of probabilities and utilities . . . . . . . . . . . . . . . . . . . . . . . 34
2.3 Expected utility networks: a formal definition and some structural properties . . 38
2.4 Conditional expected utility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.5 Conditional expected utility independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
2.6 Inference in expected utility networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3 Game networks. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .52
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.2 G nets: a formal definition and some results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.3 Some examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4 Strategic equilibrium in Game Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .61
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.2 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.3 Existence of equilibrium in game networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.4 Global convergence to equilibrium in simultaneous games . . . . . . . . . . . . . . . . . . . . 67
4.5 Computing all the Nash equilibria . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.6 Convergence to an optimal policy in single-agent game networks . . . . . . . . . . . . . 70
5 Application: auctions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .75
5.1 An independent-value, second-price auction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.2 The Generalized Vickrey auction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.2.2 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.2.3 Some results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.2.4 Example: a Coasian economy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
6 A maximum entropy method for expected utilities . . . . . . . . . . . . . . . . . . . .85
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.2 Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.3 Utility entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
6.4 Axiomatic justification of our maximum entropy method for expected utilities . 92
References. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .97
List of Figures
A simple EU net........................................................................................39
The beer-quiche game...............................................................................57
The random order game............................................................................59
An independent-value, second-price auction............................................76
Introduction
Motivation
The formal analysis of multi-agent systems is a topic of interest to both economic theory (von Neumann and Morgenstern 1944) and artificial intelligence (Hayes-Roth 1995). While game-theoretic notions and methodologies have already populated the economic mainstream, only recently have they started to attract interest in the context of artificial intelligence (Halpern 1997, Koller and Pfeffer 1997), where their integration with existing methodologies constitutes one of the most promising areas of new research. Research in artificial intelligence, on the other hand, has traditionally put a great deal of emphasis on logic and epistemic foundations (McCarthy 1959, Fagin et al. 1995), while the importance of such foundations in the context of economic theory (specifically, of game theory) has only recently been recognized (Mertens and Zamir 1985, Tan and Werlang 1988, Aumann and Brandenburger 1995).
Very little work so far has tried to bridge the gap between the different formalisms, and provide a unified treatment. There are several reasons to believe that such integration would be desirable, and fruitful to both disciplines. For instance, several paradoxes and technical difficulties (such as those related to counterfactual reasoning) appear in many different formalisms, suggesting that the demand for more rigorous foundations may be better addressed through a unified treatment. Moreover, the various applications which motivate foundational work in the two fields together impose very strong requirements on what
would constitute a good foundation. Traditionally, and – we submit – for good reasons, economic theory insists that the basic elements of a good framework should be in principle observable, either through interrogation of a prospective decision-maker or through the examination of an agent’s behavior (revealed preferences). Artificial intelligence, on the other hand, insists – for equally good reasons – on a number of other requirements for a good formalism, such as compactness, modularity, and, most importantly, computational tractability. In this respect, the standard game-theoretic framework proves fairly inadequate for many real-world applications, whose complexity is typically so high that not only explicit strategic reasoning, but even the formal specification of the system (for instance, writing down its extensive-form representation) often turns out to be an exceedingly cumbersome task. We believe that, by addressing all these demands together, an integrated formalism would prove beneficial to both disciplines, and open the possibility for a more intense cross-fertilization between the two fields. Besides its potential for the advancement of core theoretical research in the two fields, such cross-fertilization would be, in our opinion, important in view of several new research areas arising at their boundary, such as electronic commerce, electronic marketing, or the economics of computerized networks.
Plan of the work
In this thesis we approach the subject of multi-agent systems from first principles, and
develop a new approach which enjoys foundational as well as computational advantages
with respect to other existing treatments.
In the first chapter we develop a revealed-preference decision theory for many agents. Our theory has several advantageous features, which are never found together in other treatments. For instance, it includes the ability to represent small worlds; a “small world” (Savage 1954) can be thought of as an incomplete, coarse description of the world, and is opposed to a “large world”, which can be regarded as an exhaustive description in which no detail is left unspecified. In Savage’s theory the decision maker’s preferences are expressed on acts, which in turn are defined on a large world of states of Nature. In a chess game, for instance, an act would be a complete conditional plan for the game, a truly overwhelming object; in turn, the practical impossibility of eliciting preferences over such rich objects curtails the applicability of the theory. Our approach, by contrast, uses as its basic building blocks preferences over Boolean algebras of events, and there is no presumption that any such event constitutes an exhaustive description of the world.
Another important feature of our approach is the ability to represent conditional and counterfactual preferences. Other treatments (Samet 1994, Battigalli and Siniscalchi 1999) also represent conditional and counterfactual beliefs, but they do not offer a decision-theoretic foundation of the epistemic notions they use; rather, subjective beliefs are taken as primitive notions. In our treatment we conform to a long-standing perspective in economic theory (Pareto 1906, De Finetti 1937, Savage 1954), and take as our primitive notion the agent’s conditional behavior, while epistemic notions such as the agents’ subjective beliefs are regarded as aspects of the individual preferences.
Finally, our approach includes higher-order utilities and probabilities. Most other
treatments1 generally focus on higher-order probabilities only (Mertens and Zamir 1985,
Heifetz and Samet 1996), and construct universal spaces in which the agents are classified
according to their epistemic type, i.e. according to the nature of their beliefs. By contrast,
we classify agents according to their decision-theoretic types, i.e. their preferences, and
derive epistemic types from decision-theoretic types.
In the second chapter we introduce a novel representation for single-agent decision
problems, Expected Utility Networks (EU nets). EU nets generalize probabilistic networks
from the AI literature (Pearl 1988), and provide a modular and compact framework for
strategic inference. Modularity is a central notion in AI; it allows concise representations
of otherwise quite complex concepts. In probabilistic networks, modularity is achieved by
exploiting the notion of conditional probabilistic independence. In recent years there have
been several attempts to provide modular utility representations of preferences (Bacchus and Grove 1995, Doyle and Wellman 1995, Shoham 1997).
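To make the modularity point concrete, here is a minimal probabilistic network sketch (our example, with made-up numbers, not one from the thesis): two binary variables that are conditionally independent given a third, so the joint distribution over three variables is specified by 1 + 2 + 2 = 5 numbers instead of the 2³ − 1 = 7 a full joint table would require.

```python
# Illustrative sketch (not from the thesis): a Bayesian network
# A -> B, A -> C, with B and C conditionally independent given A.
# The joint factorizes as p(a, b, c) = p(a) p(b|a) p(c|a).
from itertools import product

p_a = {True: 0.3, False: 0.7}
p_b_given_a = {True: 0.9, False: 0.2}   # P(B = true | A = a)
p_c_given_a = {True: 0.6, False: 0.1}   # P(C = true | A = a)

def joint(a, b, c):
    pb = p_b_given_a[a] if b else 1 - p_b_given_a[a]
    pc = p_c_given_a[a] if c else 1 - p_c_given_a[a]
    return p_a[a] * pb * pc

# The factored representation is still a proper distribution:
total = sum(joint(a, b, c) for a, b, c in product([True, False], repeat=3))
print(round(total, 10))   # -> 1.0
```

The savings grow exponentially with the number of variables, which is exactly the compactness the probabilistic-networks literature exploits.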
It has proven difficult to devise a useful representation of utilities; this difficulty
can certainly be ascribed to the different properties of utility and probability functions, but
also, more fundamentally, to the fact that reasoning about probabilities and utilities together
requires more than simply gluing together a representation of utility and one of probability.
In fact, just as probabilistic inference involves the computation of conditional probabilities, strategic inference – the reasoning process which underlies rational decision-making2 – involves the computation of conditional expected utilities for alternative plans of action, which may not have a modular representation even if probabilities and utilities, taken separately, do.
1 The only notable exception is, to the best of our knowledge, (Epstein and Wang 1996).
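In code, single-agent strategic inference in this sense reduces to comparing expected utilities across alternative plans; a minimal sketch with hypothetical numbers (ours, not the thesis's):

```python
# Hypothetical sketch of strategic inference as expected utility
# maximization: choose the action a maximizing EU(a) = sum_s p(s) u(s, a).
p = {"rain": 0.6, "dry": 0.4}                    # beliefs over states
u = {("rain", "umbrella"): 0.8, ("dry", "umbrella"): 0.6,
     ("rain", "none"): 0.0,     ("dry", "none"): 1.0}

def expected_utility(action):
    return sum(p[s] * u[(s, action)] for s in p)

best = max(["umbrella", "none"], key=expected_utility)
print(best, round(expected_utility(best), 3))    # -> umbrella 0.72
```

The difficulty the text points to is that even when p and u each factor nicely, the products p(s) u(s, a) being summed need not inherit either factorization.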
Our representation relies on a novel notion of utility independence, closely related to its probabilistic counterpart, which we present and discuss in the context of other existing notions. Together, probability and utility independence are shown to imply conditional expected utility (EU) independence. In this respect, choosing the “right” notion of conditional utility independence turns out to be crucial.
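The way the two independence notions combine can be sketched in a toy calculation (ours, for illustration only; the thesis's formal definitions appear in chapter 2). If both probability and utility factor multiplicatively over two sub-domains x and y (an assumption made here purely for concreteness), then expected utility factors as well:

```latex
% Toy illustration (not the thesis's formal definition): suppose
% probability and utility both factor multiplicatively over x and y.
\[
  p(xy) = p(x)\,p(y), \qquad u(xy) = u(x)\,u(y)
  \;\Longrightarrow\;
  \mathrm{EU}(xy) = p(xy)\,u(xy)
                  = \bigl[p(x)\,u(x)\bigr]\bigl[p(y)\,u(y)\bigr]
                  = \mathrm{EU}(x)\,\mathrm{EU}(y).
\]
```

Here the two independencies jointly deliver EU independence; the substantive work in chapter 2 is identifying notions of conditional independence for which this survives conditioning.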
What is important about conditionally independent decisions is that they can be effectively decentralized: a single, complicated agent can be replaced by a number of simpler, conditionally independent sub-agents, who can do just as well. This property is of interest not only to artificial intelligence, since it can be exploited to reduce the complexity of planning, but also to economic theory, as it suggests a principled way for the identification of optimal task allocations within economic organizations.
In the third chapter, we then extend the EU nets formalism to multi-agent decision problems, and introduce a new class of representations for mathematical games, Game Networks (G nets). G nets constitute an alternative to standard game-theoretic representations such as normal and extensive forms, but in several respects G nets are more advantageous than the latter.
2 Here, and elsewhere, the term “strategic” is used in the context of individual decision-making, and does not necessarily refer to a multi-agent scenario.
Extensive-form representations, for instance, are generally more compact than normal-form ones, but they are still quite “redundant”: in concrete examples, many branches of the tree often have the same payoffs, but the recognition of these symmetries does not lead to a more parsimonious representation. Furthermore, changing a few details in the setup – for both normal and extensive forms – usually entails rewriting the whole game; in other words, standard representations are not particularly modular.
Compared to standard game-theoretic representations, G nets are more structured and more compact, as both probabilities and utilities enjoy a modular representation. Moreover, one can easily modify or extend an existing G net: for instance, adding more types of a player or an extra round of moves is usually much easier than in extensive forms.
More fundamentally, G nets provide a computationally advantageous framework for
strategic inference, as one can exploit conditional probability and utility independencies to
simplify the inference process.
Finally, we remark that G nets generalize probabilistic networks in AI, for which a
large and sophisticated algorithmic “toolbox” is already available.
In the fourth chapter, we establish existence and convergence results for strategic equilibrium in simultaneous game networks. While existence results are easily obtained, the issue of convergence deserves special attention. Many convergence procedures have already been proposed in the game-theoretic literature, but none of them has proved completely satisfactory. Fictitious play is one of the most popular; it is simple and intuitively appealing, but it may fail to converge to a Nash equilibrium, due to the possible existence of cycles, which may persist forever even though their frequency goes to zero. The tracing procedure (Harsanyi and Selten 1988) is also problematic: it may not converge, and moreover is based on a non-constructive argument which makes it difficult to use in practice.
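The cycling behavior alluded to above can be observed in a small experiment (our sketch, not a construction from the thesis): fictitious play in matching pennies, where each player best responds to the opponent's empirical action frequencies. Actual play keeps cycling between heads and tails in ever longer runs, while the empirical frequencies approach the mixed equilibrium (1/2, 1/2); convergence of the frequencies in zero-sum games is Robinson's 1951 theorem.

```python
# Sketch (ours): fictitious play in matching pennies.  Player 1 wants
# to match, player 2 to mismatch; each best-responds to the opponent's
# empirical frequencies, with ties broken toward H.
def fictitious_play(rounds=100_000):
    h2 = t2 = 0          # player 1's counts of player 2's H and T
    h1 = t1 = 0          # player 2's counts of player 1's H and T
    for _ in range(rounds):
        a1 = "H" if h2 >= t2 else "T"    # matcher copies the likelier action
        a2 = "H" if t1 >= h1 else "T"    # mismatcher avoids the likelier one
        h1, t1 = h1 + (a1 == "H"), t1 + (a1 == "T")
        h2, t2 = h2 + (a2 == "H"), t2 + (a2 == "T")
    return h1 / rounds, h2 / rounds      # empirical frequencies of H

f1, f2 = fictitious_play()
print(round(f1, 2), round(f2, 2))        # both near 0.50
```

In non-zero-sum games (Shapley's 3x3 example) even the frequencies can cycle forever, which is the failure mode motivating the alternative iterative method below.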
We introduce a new, simple iterative method, which always converges to a unique Nash equilibrium in generic n-player normal-form games. The equilibrium is uniquely determined by the payoff structure of the game, can be computed to any level of accuracy, and involves no weakly dominated strategies.
In many cases one is not interested in finding just one equilibrium, but rather would
like to obtain a complete list of all the Nash equilibria. It turns out that an adaptation of the
previous method can be put to such use.
Finally we address a related issue, which becomes significant in the context of some
potential applications of game networks. In principle, conditionally independent sub-tasks
may be allocated to conditionally independent sub-agents, as conditional EU independence
implies that any optimal policy can be reproduced by the joint behavior of such sub-agents.
Yet, before one can decentralize its execution, one first needs to find an optimal policy, a task which may turn out to be computationally very cumbersome.
Alternatively, one can distribute the decision problem among the sub-agents first, and
then let the system converge to some global policy in a decentralized fashion. On the other
hand, if we do so, we have no a priori guarantee that the system will indeed converge to an
optimal policy. We study the convergence properties of such decentralized systems in the
case of simultaneous G nets.
As a concrete application of G nets, we then show (in chapter 5) how they can be used to formally represent a second-price auction and solve for the equilibrium. Our goal will be to show how to use G nets to formally represent an economic mechanism – in our case, an auction – and reason about it. We shall not try to establish new results: rather, our aim is to show how the informal representations and reasoning typical of auction theory can be made completely formal thanks to the G net machinery. Formal specification is a rich subject area in the AI literature, while it is not significantly represented in the economic literature. We conjecture that electronic commerce applications will motivate the investigation of formal specification methods in the context of economic theory as well.
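The mechanics of the auction being represented are simple to state in code (a sketch under our own toy numbers; the thesis's contribution is the formal G-net representation, not the auction itself): the highest bidder wins and pays the second-highest bid, which makes truthful bidding a weakly dominant strategy.

```python
# Sketch (ours): an independent private values second-price auction.
def second_price(bids):
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]          # winner pays the second-highest bid
    return winner, price

values = {"alice": 10.0, "bob": 7.0, "carol": 4.0}
winner, price = second_price(values)   # truthful bids
print(winner, price)                   # -> alice 7.0

# Shading one's bid never helps: if alice bids below bob, she forgoes
# a trade worth 10 - 7 = 3 to her.
assert second_price(dict(values, alice=6.0))[0] == "bob"
```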
We also show how G nets can be used to reduce the computational complexity of finding an optimal allocation and the corresponding payments in Generalized Vickrey Auctions. In general, the problem is computationally very hard; yet, we show that for an important class of problems the solution can be obtained in linear time, and demonstrate our methodology in the context of a simple economic example.
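A brute-force baseline makes the complexity issue concrete (our sketch, with hypothetical valuations; the thesis's point is that G-net structure can avoid this exponential search for an important class of problems). The Generalized Vickrey Auction picks the welfare-maximizing allocation and charges each winner the externality it imposes on the others.

```python
# Sketch (ours): brute-force VCG for a tiny combinatorial auction.
# Exponential in the number of items; shown only to fix ideas.
from itertools import product

def vcg(items, valuations):
    """valuations[i] maps frozenset(bundle) -> bidder i's value
    (missing bundles, including the empty one, count as 0)."""
    def welfare_max(bidders):
        best_w, best_alloc = 0.0, {i: frozenset() for i in bidders}
        for assign in product(list(bidders) + [None], repeat=len(items)):
            bundles = {i: frozenset(it for it, who in zip(items, assign)
                                    if who == i) for i in bidders}
            w = sum(valuations[i].get(bundles[i], 0.0) for i in bidders)
            if w > best_w:
                best_w, best_alloc = w, bundles
        return best_w, best_alloc

    everyone = list(range(len(valuations)))
    w_all, alloc = welfare_max(everyone)
    payments = {}
    for i in everyone:
        w_without_i, _ = welfare_max([j for j in everyone if j != i])
        others_share = w_all - valuations[i].get(alloc[i], 0.0)
        payments[i] = w_without_i - others_share   # Clarke payment
    return alloc, payments

vals = [
    {frozenset("A"): 5.0},                      # bidder 0 wants only A
    {frozenset("B"): 4.0},                      # bidder 1 wants only B
    {frozenset("A"): 3.0, frozenset("B"): 3.0,
     frozenset("AB"): 10.0},                    # bidder 2 sees complements
]
alloc, pay = vcg(["A", "B"], vals)
print(alloc[2], pay)   # bidder 2 wins both items and pays 9.0
```

Bidder 2 wins the pair (welfare 10 beats the 5 + 4 split) and pays 9, the welfare the others would have achieved without her.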
Our notion of utility independence can be given an alternative justification, based on the notion of maximum entropy. Maximum entropy (Shannon 1949, Shannon and Weaver 1949, Jaynes 1983) has recently attracted considerable interest in the context of Bayesian probability theory (Good 1983, Jaynes 1998), as it provides a relatively simple, yet principled way to represent incomplete knowledge about uncertain domains.
One way to think about maximum entropy is in terms of “typical” beliefs; in cases
where the available data are compatible with a large class of probabilistic models, it always
attempts to pick the most reasonable one, on the basis of symmetry considerations.
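The "typical belief" reading can be made concrete with the classic die example (Jaynes's illustration; our sketch, separate from the thesis's utility-entropy method): among all distributions on {1,...,6} with a given mean, the maximum-entropy one lies in an exponential family p_i ∝ exp(λx_i), and λ can be found by a one-dimensional search.

```python
# Sketch (ours): maximum-entropy distribution on given values subject
# to a mean constraint, via bisection on the tilting parameter lambda.
import math

def maxent_mean(values, target_mean, tol=1e-10):
    def mean(lam):
        w = [math.exp(lam * x) for x in values]
        z = sum(w)
        return sum(x * wi for x, wi in zip(values, w)) / z
    lo, hi = -50.0, 50.0            # mean(lam) is increasing in lam
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    lam = (lo + hi) / 2
    w = [math.exp(lam * x) for x in values]
    z = sum(w)
    return [wi / z for wi in w]

p = maxent_mean([1, 2, 3, 4, 5, 6], 4.5)
print([round(pi, 3) for pi in p])   # increasing weights, mean 4.5
```

The data (the mean) are compatible with many distributions; maximum entropy selects the one that assumes nothing else, exactly the "most reasonable" choice the text describes.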
In chapter 6 we derive a symmetric notion of maximum entropy for utilities, and
investigate its connections with the notion of utility independence introduced in chapter 2.
As it turns out, our maximum entropy method for utilities always returns indifference and u-independence whenever possible; hence, utility entropy provides an indirect justification for our notion of u-independence, although this is not our main motivation in carrying out this program.
Our primary motivation is to provide a method for the revealed-preference estimation
of an unknown preference structure, based on observations on the agent’s behavior. For
instance, a current topic of interest in the field of marketing is the estimation of the potential
demand for goods or services based on observations coming from large, heterogeneous
databases of consumer data. Typically, the data are compatible with a broad spectrum of
possible consumer types, and hence the densities of the different types in the population
should first be estimated, and then incorporated in the model to produce an estimate of the
aggregate demand.
To reduce the complexity of this task one may want to resort to a representative agent formulation, in which the set of observed constraints is used to generate a “typical” preference structure consistent with the observed data. Maximum entropy methods characterize
“typical” belief structures, based on invariance considerations. Our aim is to extend those
methods in a principled way to the case of expected utilities, and use them to characterize
“typical” preference structures in the face of observed decision behavior (the representative
agent’s revealed preferences). We do so by imposing a set of axiomatic conditions on the
selection rule, based on invariance considerations. Our conditions mirror those introduced
by Shore and Johnson (Shore and Johnson 1980) for the selection of posterior probabilities.
Even though our result is a straightforward consequence of existing results for the
case of posterior probabilities, we submit that it is not without merit. Surprisingly, the
required assumptions are still quite reasonable if we reinterpret the posterior probability
as value and use the axiomatic derivation to characterize “typical” expected utilities via
our notion of utility entropy. We believe that our maximum entropy method for expected
utilities constitutes a valid and principled way to identify a representative agent from a
large population of types in the face of revealed-preference constraints, and could find
application not only to economic theory, but also to problems in marketing and electronic
commerce.
Chapter 1
Revealed-preference foundations of multi-agent systems
We develop a revealed-preference theory for multiple agents. Some features of our construction, which draws heavily on Jeffrey’s utility theory and on formal constructions by Domotor and Fishburn, are as follows. First, our system enjoys the “small-worlds” property. Second, it represents hierarchical preferences. As a result our expected utility representation is reminiscent of type constructions in game theory, except that our construction features higher-order utilities as well as higher-order probabilities. Finally, our construction includes the representation of conditional preferences, including counterfactual preferences.
1.1 Introduction
Two aspects of game theory are very evident nowadays. The first is that it has become an
indispensable tool, not only in economics but in a variety of other disciplines as well, from
philosophy and psychology to political science and computer science. The other is that
game theory lacks comprehensive foundations of the scope and depth found in single-agent
decision theory. Evidence of this limitation can be found in the many debates concerning
backward induction problems, admissibility, and other examples in which the traditional
game theoretic predictions are either paradoxical or ambiguous. It would be quite handy to
have a Savage-style characterization of game theory, which would clarify the assumptions
underlying different solution concepts, and therefore also the contexts in which each one
applies.
What would such a theory look like? At a minimum it should involve a set of agents and a set of intuitive objects (such as events or acts), individual preferences over these objects for each agent that can plausibly be elicited from people via interrogation or observation, and a representation of each such individual preference ordering in terms of subjective probability and utility particular to the individual agent.
In this chapter we will provide a theory that has these properties. Somewhat paradoxically, while our primary motivation is this multi-agent setting, the bulk of our construction can be explained already in the single-agent setting; the extension to the multi-agent setting then becomes obvious. Indeed, we are critical of existing foundations of decision theory (and in particular of Savage’s framework), and believe that our theory provides better foundations. However, rather than waste our ammunition by attacking Savage’s theory, whose shortcomings can be (and have been) well camouflaged, we simply note that we are not aware of any successful attempt to generalize Savage’s framework to the multi-agent setting, and claim that this is no accident.
After such boasting we must introduce a caveat. One of the attractive features of
Savage’s framework is the treatment of causality, as embodied in the notion of an act.
Although we believe that acts fit in quite naturally in the theory we are about to present and
are working to incorporate them, in this version of our construction acts will play no role.
For this reason we are also not yet in a position to tackle the paradoxes of game theory with
our framework (that too is next on our agenda).
The main ingredients of our construction can be relayed concisely by reference to several existing lines of research in decision theory, on which we draw liberally (the following will not make sense to a reader unfamiliar with the references, but the rest of the chapter is self-contained). From Jeffrey (Jeffrey 1965) we borrow the scalable expected utility construction with the “small worlds” property. We then specialize and extend the framework. We first specialize it by constructing a particular algebra on which to define Jeffrey utilities, one that involves higher-order preferences (that is, preferences over preferences). Applying results due to Domotor (Domotor 1978) we immediately get an expected-utility representation of higher-order preferences, albeit a problematic one. Among its chief deficiencies is the lack of an account of the dynamics of preferences, or how preferences change in the face of new evidence (including counterfactual evidence – this turns out to be an important point). We then exploit the structure of our hierarchical construction, and, adapting and reinterpreting a relatively unknown construction due to Fishburn (Fishburn 1982), we strengthen the expected-utility representation and avoid these deficiencies. In both cases the resulting expected utility representation is in the spirit of existing type constructions in game theory (Mertens and Zamir 1985, Heifetz and Samet 1996), but whereas these nest only probabilities our representation nests both probabilities and utilities. The work which comes closest to our approach is, to the best of our knowledge, (Epstein and Wang 1996). Yet, our representation turns out to be quite different from the one in (Epstein and Wang 1996), and extends to conditional and counterfactual preferences.
The next section contains the bulk of the technical material, and presents our single-
agent construction. In the following section we harvest the fruit of this construction by
easily extending it to the multi-agent case. We conclude with a brief summary.
1.2 Hierarchical conditional preferences: the single agent case
As we have said, most of the work in our construction is done already in the single-agent
case, which we explain in this section. Before we begin the construction, let us point out
three important ingredients in it:
1. “Small worlds” property: One need not be required to express preference only (or
even at all) among entities that depend on objects which are sufficiently rich to resolve
all ambiguity (by way of contrast, Savage’s preferences are defined on acts, which in
turn are defined relative to states which are such rich objects).
2. “Hierarchical preferences”: One can assign preferences over preferences, as in
preferring smoking to not smoking but wishing one did not have that preference.
3. “Conditional preferences” (including counterfactual preferences): One should be
able to specify how one’s preferences change in the face of new evidence, including
evidence given a prior probability of 0.
Although we believe these criteria to be desirable and are proud that our theory meets
them, we do not request the reader to accept this desirability as self evident. We mention
them now because it is helpful to keep them in mind when following the stages of the
construction; in particular, the next three subsections correspond to these three criteria.
1.2.1 Foundation: Jeffrey utility
The technical development in this subsection is due to Jeffrey (Jeffrey 1965) and Domotor
(Domotor 1978). We start with a set of possible worlds W, and a finite Boolean algebra A
of subsets of W. A preference ordering on A is a complete and transitive binary relation ≽
between nonempty pairs E, F ∈ A.
Definition 1 An expected utility representation of an ordering ≽ on a finite algebra
A is a pair (p, u), where p : A → [0, 1] is a probability function and u : A − {∅} → R
is a utility function such that:

1. for all nonempty E, F ∈ A, u(E) ≥ u(F) if and only if E ≽ F

2. for all nonempty E ∈ A, p(E) > 0, u(E) > 0, and u(W) = 1

3. u(E)p(E) = ∑_k u(Ek)p(Ek) for any finite, measurable partition {Ek}, k = 1, ..., K, of E.
Notice the remarkable structure of this definition, as compared to the expected util-
ity representations of von Neumann and Morgenstern, Savage, and others. Here prefer-
ences on events are represented by their relative utilities rather than their relative expected
utilities, and the probabilities only serve to constrain the utilities via the last condition.
One way to think about Jeffrey utilities is as simultaneously playing the role of
traditional (e.g., Savage) utilities and of traditional conditional expected utilities (or
CEUs, where CEU(E) = EU(E|E)). This is a direct reflection of the “small worlds” property; since
the events among which one expresses preference are under-specified states of the world,
the utility of each event has an expectation flavor to it. The value of an event, defined as the
product of its probability and utility, can be thought of as representing the event’s standard
(unconditional) expected utility. Note that in Jeffrey’s framework probability and value are
additive functions, but utility is not.
Remark 1 Since the probability of any nonempty event is positive, p(F|E) = p(E ∩
F)/p(E) is always well defined. Hence, (3) can be equivalently written in the
conditional form u(E) = ∑_k u(Ek)p(Ek|E).
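To make Definition 1 and Remark 1 concrete, here is a minimal numerical sketch (in Python, with invented numbers) of a Jeffrey utility on the power set of a four-element W: atom utilities are chosen so that u(W) = 1, each event receives the value-weighted average of its atoms, and the assertions check condition 3, its conditional form, and the fact that value is additive on disjoint events while utility is not.

```python
# Hypothetical numbers for illustration only: any strictly positive p_atom
# summing to 1, and any positive atom utilities with value(W) = 1, would do.
W = [1, 2, 3, 4]
p_atom = {1: 0.4, 2: 0.3, 3: 0.2, 4: 0.1}
u_atom = {1: 1.25, 2: 0.5, 3: 1.0, 4: 1.5}  # chosen so that u(W) = 1

def p(E):
    """Probability of an event: sum of its atoms' probabilities."""
    return sum(p_atom[w] for w in E)

def value(E):
    """Value of an event: sum over atoms of u_atom(w) * p_atom(w); additive."""
    return sum(u_atom[w] * p_atom[w] for w in E)

def u(E):
    """Jeffrey utility of a nonempty event: value divided by probability."""
    return value(E) / p(E)

assert abs(u(frozenset(W)) - 1.0) < 1e-12       # normalization u(W) = 1
assert all(p(frozenset({w})) > 0 for w in W)    # every nonempty event possible

# Condition 3 of Definition 1 on a partition of an event E:
E = frozenset({1, 2, 3})
parts = [frozenset({1}), frozenset({2, 3})]
assert abs(u(E) * p(E) - sum(u(Ek) * p(Ek) for Ek in parts)) < 1e-12
# Remark 1: the same identity in conditional form, u(E) = sum u(Ek) p(Ek|E).
assert abs(u(E) - sum(u(Ek) * p(Ek) / p(E) for Ek in parts)) < 1e-12

# Value is additive on disjoint events, but utility is not.
A, B = frozenset({1}), frozenset({2})
assert abs(value(A | B) - (value(A) + value(B))) < 1e-12
assert abs(u(A | B) - (u(A) + u(B))) > 1e-9
```

Note that condition 3 holds here by construction: defining u as value/probability is exactly what makes the utility of an event its conditional expected utility given the event.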
The question is whether there exist conditions on the ordering that guarantee the
existence of an expected utility representation. Jeffrey (Jeffrey 1965) gives a number of
such conditions, which constrain not only the ordering but also the algebra itself. Similar
conditions are provided by Bolker (Bolker 1967). These sets of conditions are sufficient
to guarantee the existence of an expected utility representation, and have the additional
advantage of being fairly intuitive. Here, however, we proceed to present an alternative
axiomatization, provided by Domotor (Domotor 1978). Domotor’s axiomatization has two
advantages: it constrains only the ordering but not the algebra, and it is a necessary as well
as sufficient condition for representability via Jeffrey utility (Theorem 1 below). However,
it suffers the disadvantage of being highly technical and unintuitive, and so we introduce
the relevant axiom by reference only.
Definition 2 A preference ordering is regular if and only if it satisfies J2 in (Domotor
1978).
Theorem 1 (Domotor 1978) A preference ordering on a finite algebra is regular if
and only if it admits a (real-valued) expected utility representation.
Remark 2 (Domotor 1978) If the algebra is infinite Theorem 1 continues to hold, the
only difference being that probability and utility must be allowed to take nonstandard
(i.e., infinitesimal) values.
1.2.2 A static expected utility representation of hierarchical preferences
We now proceed to construct a particular Jeffrey/Domotor structure, one that will cap-
ture the hierarchical nature of the agent’s preferences while still enjoying the small-worlds
property.
We first construct a large-worlds ontology, and then use it to define a hierarchical
small-worlds framework. We start with a set W of possible worlds; a possible world is
to be thought of as a rich object that completely captures the truth of all propositions,
including the agent’s preferences. Next, we introduce a function ≽ which associates with
each possible world a regular ordering over 2^W; ≽w is to be thought of as extracting from
each possible world w the agent’s preferences at w.
Remark 3 The reader familiar with modal logic will note the difference between this
construction and Kripke-style possible-worlds semantics (Kripke 1980). From the
conceptual point of view, in the latter a possible world settles on the truth value of
objective facts, whereas here a possible world settles the truth value of all propositions,
including the subjective ones. From the technical standpoint, in a Kripke structure
each world is mapped by the accessibility relation to a set of worlds, whereas here
each world is mapped by % to a total ordering on the power set of worlds.
We are actually not interested in the entire algebra 2^W, but rather in a specific
sub-algebra, A, which is defined as follows. We start with a finite Boolean algebra A0 (a
sub-algebra of 2^W, and ultimately also of A). A0 is thought of as describing the objective
events, ones that do not capture the agent’s mental state. These are the objective events the
agent is aware of, or is capable of imagining. There is no requirement that these “exhaust”
the space of objective events in any sense.

Next, for all nonempty E, F ∈ A0, let [E ≻ F] = {w | E ≻w F}, [E ∼ F] =
{w | E ∼w F}, and [E ≽ F] = [E ≻ F] ∪ [E ∼ F].

Let B0 be the set generated from propositions [E ≻ F], [E ∼ F], where E, F ∈ A0,
by closing off with respect to finite intersections. Then B0 is a π-system³, as it contains
both the empty set ∅ = [E ≻ E] and W = [E ∼ E]. Elements of B0 correspond to partial
orderings on A0; B0 describes the set of zero-order preferences.

³ A π-system on a set X is a family of subsets which contains both X and the empty set, and is closed with respect to finite intersections.
Let A1 = A0 ∪ B0 be the algebra generated by propositions E where E ∈ A0 or
E ∈ B0. Clearly, both A0 and B0 are sub-algebras of A1. Again, let B1 be the π-system of
finite intersections of propositions [E ≻ F], [E ∼ F], where now E, F ∈ A1. Elements of
B1 represent first-order preferences.

Now recursively define the n-th order (n > 1) algebras and preferences as follows:

• An = An−1 ∪ Bn−1,

• Bn is the set of all finite intersections of propositions [E ≻ F], [E ∼ F], where
E, F ∈ An.

Let A = ∪n An be the algebra generated by events E ∈ An, n ≥ 0. Then E ∈ A if
and only if E ∈ An for some n. Let B be the π-system generated by [E ≻ F], [E ∼ F],
where E, F ∈ A. Notice that B ⊂ A, and hence A = A ∪ B. Therefore, further iteration is
superfluous: all the preferences on events in A are already included in A.
Remark 4 Recall that A is the sub-algebra of 2^W in which we are interested. It might
be asked why we do not define the preferences ≽w only on A, and turn the above construction
into a fixpoint definition. It turns out, however, that in a later development in this
chapter (when we define mixture operations) we will need to include events outside
A.
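The first level of the hierarchy above (the objective algebra A0, the bracket events, and the π-system B0) can be sketched computationally. Everything below — the four worlds, the objective algebra, and the per-world utilities inducing each ≻w — is invented for illustration; the point is that bracket events are genuine subsets of W that need not belong to A0.

```python
from itertools import combinations

W = frozenset({0, 1, 2, 3})
# A0: a toy objective algebra, generated by the single event {0, 1}.
A0 = [frozenset(), frozenset({0, 1}), frozenset({2, 3}), W]

# Each world w carries an ordering over A0, here induced by a hypothetical
# world-dependent utility on events: worlds 1 and 2 like {0, 1} extra much.
def u_w(w, E):
    return len(E) + (1.5 if w in (1, 2) and E == frozenset({0, 1}) else 0.0)

def strictly_prefers(E, F):
    """[E > F]: the set of worlds at which E is strictly preferred to F."""
    return frozenset(w for w in W if u_w(w, E) > u_w(w, F))

def indifferent(E, F):
    """[E ~ F]: the set of worlds at which the agent is indifferent."""
    return frozenset(w for w in W if u_w(w, E) == u_w(w, F))

# B0: close the bracket events under finite intersections (a pi-system).
B0 = {strictly_prefers(E, F) for E in A0 for F in A0 if E and F}
B0 |= {indifferent(E, F) for E in A0 for F in A0 if E and F}
changed = True
while changed:
    changed = False
    for X, Y in combinations(list(B0), 2):
        if X & Y not in B0:
            B0.add(X & Y)
            changed = True

# B0 contains the empty set (as [E > E]) and W (as [E ~ E]).
assert frozenset() in B0 and W in B0
# Subjective events need not belong to A0: the worlds preferring {0, 1} to
# {2, 3} form the set {1, 2}, which cuts across the objective algebra.
assert strictly_prefers(frozenset({0, 1}), frozenset({2, 3})) not in A0
```

Iterating this step (taking the algebra generated by A0 ∪ B0 and forming new bracket events) would produce A1, B1, and so on, as in the recursion above.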
We are now close to achieving our first goal, an expected utility representation of
hierarchical preferences. What we are after is, for each w, giving the ordering ≽w on A
an expected utility representation. Our work is done almost automatically by the Jeffrey-Domotor
result. We simply need to note that, if an expected utility representation exists for
preferences on the full algebra 2^W, then it also exists for the restriction of those preferences
to any subalgebra. This fact is recorded in the following lemma.
Lemma 2 If an ordering over an algebra is regular, so is its projection to any sub-
algebra.
So it would seem that we have accomplished our goal, but in fact there are several
interrelated reasons for dissatisfaction:
• The requirement that every possible (i.e., nonempty) event be given a non-zero
probability is conceptually problematic, since it does not allow the agent to recognize
certain events as meaningful (or “possible”) and disbelieve them at the same time.
In particular, in the multi-agent setting, this will prohibit representing dominated
strategies as actual but disbelieved possibilities.
• Beyond the conceptual difficulty, the above requirement has unpleasant technical
ramifications. In particular, since A is in general infinite, the representation we have
uses nonstandard (i.e., infinitesimal) probabilities and utilities.
• The current theory does not account for the way in which preferences (and thus
probabilities and utilities) change in the face of new information; for this reason we
term it “static.” In particular, there is no obvious role for Bayesian conditioning, and
no account of counterfactual conditioning.
• Perhaps most damningly, the current theory makes no real use of the
hierarchical construction, beyond the weak use in Lemma 2. In particular, nothing
in the theory constrains the relationship between preferences at different levels,
contradicting the intuition that such “coherence” constraints ought to exist.
We now proceed to develop a theory that does not have these shortcomings.
1.2.3 A dynamic expected utility representation of hierarchical preferences
Recall that the expected utility representation afforded by the Jeffrey/Domotor construction
involves probabilities and utilities that obey the following equation:

u(E) = ∑_k u(Ek)p(Ek|E)     (1.1)

First on our agenda is to strengthen this property, and ensure that the probability and
utility obey the equation

u(E) = ∑_k u(Ek)pE(Ek)     (1.2)
where p is a conditional probability system (CPS). Recall that, given an algebra A,
a CPS (a.k.a. Popper function) p assigns to every non-empty conditioning event E ∈ A
a probability function over A ∩ E. Furthermore, pE agrees with Bayesian conditioning
whenever possible: for any nonempty E, F, G ∈ A such that G ⊂ F ⊂ E, pE(G) =
pE(F)pF(G). If E = W, and the unconditional probability of F is positive (that is, p(F) =
pW(F) > 0), then the above formula yields

pF(G) = p(G | F) = p(G ∩ F)/p(F).
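A minimal sketch of a CPS, under a hypothetical two-tier lexicographic construction: conditioning is Bayesian when the event has positive prior probability, and falls back to a uniform secondary measure on null events. The numbers are invented (the event {2, 3} is possible but disbelieved), and the exhaustive loop checks the chain-rule property pE(G) = pE(F)pF(G).

```python
from fractions import Fraction
from itertools import combinations

W = frozenset({0, 1, 2, 3})
prior = {0: Fraction(1, 2), 1: Fraction(1, 2), 2: Fraction(0), 3: Fraction(0)}

def p(E):
    return sum(prior[w] for w in E)

def p_cond(G, E):
    """p_E(G) for G a subset of the nonempty conditioning event E."""
    assert E and G <= E
    if p(E) > 0:
        return p(G) / p(E)           # ordinary Bayesian conditioning
    return Fraction(len(G), len(E))  # fallback: uniform on the null event E

subsets = [frozenset(s) for r in range(len(W) + 1)
           for s in combinations(sorted(W), r)]

# Chain rule of a CPS: p_E(G) = p_E(F) * p_F(G) whenever G <= F, F strictly
# inside E (exact arithmetic, so equality is checked exactly).
for E in subsets:
    for F in subsets:
        if not F or not (F < E):
            continue
        for G in subsets:
            if G <= F:
                assert p_cond(G, E) == p_cond(F, E) * p_cond(G, F)

# Conditioning on the disbelieved event {2, 3} falls back to the uniform tier.
assert p(frozenset({2, 3})) == 0
assert p_cond(frozenset({3}), frozenset({2, 3})) == Fraction(1, 2)
```

This illustrates the conceptual point made earlier: an event can be recognized as possible (it receives well-defined conditional probabilities) while being assigned unconditional probability zero.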
If we manage to guarantee (1.2) we will have escaped the first two limitations of the
(1.1)-based representation discussed above. Now let us go a step further. Consider any
E ∈ An and F ∈ Bn. Our claim is that, given the intuition behind our construction, E
ought to be probabilistically independent of F; the lower order events do not determine
the higher order preferences, and vice versa. From this it follows that our expected utility
representation should validate the following property:

u(E ∩ F) = ∑_k u(Ek ∩ F)pE(Ek)     (1.3)

Note that (1.1) is obtained as a special case of (1.3) by selecting p(E) > 0 and
F = W. Note also that we have now escaped the third and fourth limitations of the (1.1)-based
representation.
How do we obtain (1.3)? We do so by leveraging a relatively unknown construction
due to Fishburn (Fishburn 1982). His motivation was different from ours: giving a
conditional version of Savage’s construction. However, we will adapt and reinterpret the
mathematics to fit our intended interpretation. Fishburn starts with preferences defined on
pairs (x, E), where the first argument is an act, and the second an environmental event.⁴

⁴ We don’t discuss Fishburn’s intuition at length here, both because ours is different and because his is perhaps problematic. Briefly, however, the event is taken from an algebra on a set of states of nature, and the act is taken from a mixture set. An act x can be thought of as a probability measure on environmental events, and in this interpretation the representation is an interesting mix between von Neumann-Morgenstern and Savage: the agent chooses objective lotteries, and Nature chooses the context in which the lottery is performed. While the lotteries are objective, the probability of an (external) event in the context of another event is subjective, and can be uniquely derived from the agent’s preferences on pairs (x, E). Unfortunately,
our interpretation, events E ∩ F, where E ∈ An and F ∈ Bn, play the role of (act, event)
pairs (F, E). In other words, acts are viewed as being themselves events: they represent
(generally incomplete) descriptions of the agent’s conditional preferences (and hence
beliefs), and characterize conditional revealed-preference behavior.
We proceed now with the technical construction.
Mixtures

Let R be the set of all expected utility representations (p, u) on some algebra A.
For any two representations (p′, u′), (p′′, u′′) ∈ R, and for any λ ∈ [0, 1], we define
their (λ-)mixture to be a new representation (p, u) = (p′, u′)λ(p′′, u′′) such that:

p(E) = p′λp′′(E) = λp′(E) + (1 − λ)p′′(E)

u(E) = u′λu′′(E) = [λu′(E)p′(E) + (1 − λ)u′′(E)p′′(E)] / p(E).

For any two nonempty subsets of representations x, y ∈ 2^R, we define their λ-mixture
as the (nonempty) subset

xλy = {(p, u) ∈ R | (p, u) = (p′, u′)λ(p′′, u′′), (p′, u′) ∈ x, (p′′, u′′) ∈ y}.

Definition 3 A nonempty subset x ∈ 2^R is (mixture) convex if, for any λ ∈ [0, 1],
x = xλx.
some of the hypotheses introduced by Fishburn in order to derive his representation are quite unappealing in the suggested interpretation, and this is probably why the result never gained much popularity.
Let M ⊂ 2^R be the set of all convex subsets of R. M has the useful property of being
closed with respect to mixtures:

Proposition 3 For any x, y ∈ M, and for any λ ∈ [0, 1], xλy ∈ M.

Proof Let (p′, u′) and (p′′, u′′) be elements of xλy. Then (p′, u′) = (p′x, u′x)λ(p′y, u′y) and
(p′′, u′′) = (p′′x, u′′x)λ(p′′y, u′′y) for some (p′x, u′x), (p′′x, u′′x) ∈ x and (p′y, u′y), (p′′y, u′′y) ∈ y. We
want to show that (p, u) := (p′, u′)μ(p′′, u′′) is also in xλy, where μ ∈ [0, 1]. After some
algebra, one finds that p = p′μp′′ = (p′xμp′′x)λ(p′yμp′′y), and similarly u = (u′xμu′′x)λ(u′yμu′′y).
But (p′xμp′′x, u′xμu′′x) ∈ x and (p′yμp′′y, u′yμu′′y) ∈ y by convexity of x and y, and therefore
(p, u) is in xλy.
Many decision-theoretic treatments postulate that the set of objects on which
preferences are defined is a mixture set:

Definition 4 A set X is a mixture set with respect to a mixture operation xλy if,
for any x, y ∈ X and λ, μ ∈ [0, 1],

1. x1y = x

2. xλy = y(1 − λ)x

3. (xλy)μy = x(λμ)y.
The setM defined above turns out to have the desired structure.
Proposition 4 M is a mixture set.

Proof Properties 1 and 2 in Definition 4 hold trivially, so we shall concentrate on 3. First
we show that (xλy)μy ⊂ x(λμ)y. Let (p, u) be an element of (xλy)μy. Then there exist
(px, ux) ∈ x and (py, uy), (p′y, u′y) ∈ y such that p = (pxλpy)μp′y and u = (uxλuy)μu′y.
Note that p = px(λμ)p′′y, where p′′y = pyγp′y and γ = μ(1 − λ)/(1 − λμ) ∈ [0, 1]. Similarly, one
finds (after some algebra) that u = ux(λμ)u′′y, where u′′y = uyγu′y. But (p′′y, u′′y) is in y
by convexity, and hence (p, u) is in x(λμ)y. Now we show that x(λμ)y ⊂ (xλy)μy. Let
(p, u) be an element of x(λμ)y; then there exist (px, ux) ∈ x and (py, uy) ∈ y such that
p = px(λμ)py and u = ux(λμ)uy. But px(λμ)py = (pxλpy)μpy, and similarly ux(λμ)uy =
(uxλuy)μuy, which implies the result.
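The mixture operation and the mixture-set axioms of Definition 4 can be checked numerically on singleton sets {x}, {y}, for which the set-level mixture reduces to the representation-level one. The two representations below (on a two-atom algebra) are invented for illustration.

```python
def mix(r1, r2, lam):
    """(p, u) = (p', u') lam (p'', u''), per the definition in the text."""
    (p1, u1), (p2, u2) = r1, r2
    p = {E: lam * p1[E] + (1 - lam) * p2[E] for E in p1}
    u = {E: (lam * u1[E] * p1[E] + (1 - lam) * u2[E] * p2[E]) / p[E]
         for E in p1}
    return (p, u)

def close(r1, r2, tol=1e-12):
    """Componentwise comparison of two representations."""
    (p1, u1), (p2, u2) = r1, r2
    return all(abs(p1[E] - p2[E]) < tol and abs(u1[E] - u2[E]) < tol
               for E in p1)

# Two hypothetical representations on the atomic events 'a' and 'b'.
x = ({'a': 0.3, 'b': 0.7}, {'a': 2.0, 'b': 1.0})
y = ({'a': 0.6, 'b': 0.4}, {'a': 0.5, 'b': 3.0})

lam, mu = 0.4, 0.7
# Definition 4, property 1: x 1 y = x.
assert close(mix(x, y, 1.0), x)
# Property 2: x lam y = y (1 - lam) x.
assert close(mix(x, y, lam), mix(y, x, 1 - lam))
# Property 3: (x lam y) mu y = x (lam mu) y.
assert close(mix(mix(x, y, lam), y, mu), mix(x, y, lam * mu))
```

Note how the utility mixture is probability-weighted: it is the values u·p, not the utilities themselves, that mix linearly, which is what makes the mixture of two representations again a representation.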
Axiomatization
As before, let M be the set of all convex subsets of R. For any x ∈ M, let [x] =
{w | there exists (p, u) ∈ x that represents ≽w}.

Remark 5 Note that the slight abuse of notation implicit in our use of the [. . .]
operator is quite helpful. To begin with, [E ≻ F] and [x] are of the same type,
but the relationship is even tighter. Define |E ≻ F| = {(p, u) | u(E) > u(F)} and
|E ∼ F| = {(p, u) | u(E) = u(F)}. Note that both these types of set are convex, as are
their finite intersections. Moreover, since E ≻ (∼)F if and only if u(E) > (=)u(F),
then [E ≻ (∼)F] = [|E ≻ (∼)F|].
With this machinery, given a regular preference ordering ≻ we can define the induced
partial ordering ≻* on pairs (x, E), where x ∈ M and E ∈ A are such that [x] ∩ E is
nonempty.

• (x, E) ≻* (y, F) if and only if [x] ∩ E ≻ [y] ∩ F

• (x, E) ∼* (y, F) if and only if [x] ∩ E ∼ [y] ∩ F

Remark 6 Notice that (x, E) is ranked if and only if [x] ∩ E is nonempty. Hence,
it is immediately verified that ≻* is asymmetric and transitive, and ∼* is symmetric
and transitive.
Next, we introduce five additional axioms. First, we assume that mixtures of possible
preferences are also possible.

A1. If x, y ∈ M correspond to nonempty [x] ∩ E and [y] ∩ E, then [xλy] ∩ E is also
nonempty for all λ ∈ (0, 1).

The second axiom explicitly relates to mixtures.

A2. (Substitution) For all E, F ∈ A and x, y, z, t ∈ M, if (x, E) ∼* (z, F) and
(y, E) ∼* (t, F) then (x½y, E) ∼* (z½t, F).

The third axiom also relates to mixtures, and ensures a classical (i.e., standard)
representation.
A3. (Archimedean) {α : (xαy, E) ≽* (z, F)} and {β : (z, F) ≽* (xβy, E)} are closed
subsets of [0, 1].

We remark that A3 is imposed in order to obtain a classical representation: its main
role is to ensure that probabilities and utilities can be taken to be standard-valued.

The fourth axiom is fairly uninteresting. Its main role is to avoid triviality.

A4. (Relevance) There exist x, y ∈ M such that (x, W) ≻* (y, W).

Finally, the fifth axiom is a consistency requirement.

A5. (Consistency) For all nonempty E, F, G ∈ A with E ∩ F = ∅, [E ≻ F ∼
G] ∩ E ≻ [E ≻ F ∼ G] ∩ F ∼ [E ≻ F ∼ G] ∩ G.

This last axiom says, somewhat tautologically, that in the event that the agent prefers
E to F and is indifferent between F and G (i.e., whenever [E ≻ F ∼ G], where G may or
may not coincide with F), E is indeed preferred to F, and F is indifferent to G.
Theorem 5 Under J2 and A1-A5, there exists a conditional representation (p, u)
such that:

1. p : A × (A − {∅}) → [0, 1]

2. u is a real-valued utility function, defined for all (x, E) ∈ M × A such that
[x] ∩ E ≠ ∅, with the following properties:

(a) u(x, E) > (=)u(y, F) if and only if [x] ∩ E ≻ (∼)[y] ∩ F

(b) u(x, E) = ∑_k u(x, Ek)pE(Ek) for any finite, measurable partition {Ek} of E.
Proof For any nonempty E ∈ A, let M(E) be the set of all x ∈ M such that [x] ∩ E
is nonempty. If A1 holds, then M(E) is a mixture set. Let S be the set of all nonempty
[x] ∩ E, where (x, E) ∈ M × A. Then the following condition holds:

P1. ≻* is an asymmetric weak order on S.

The following condition is a consequence of J2, via Theorem 1:

P5. If (x, E) ≽* (x, F) and E ∩ F = ∅ then (x, E) ≽* (x, E ∪ F) ≽* (x, F).

Moreover, the following two conditions descend from A4:

P6. If E ∩ F = ∅ then (x′, E) ≻* (x′, F) and (y′, F) ≻* (y′, E) for some x′, y′ ∈ M.

P7. If E, F and G are mutually disjoint events in A, and if (x, F) ∼* (x, G) for some
x ∈ M, then there is a y ∈ M at which exactly two of (y, E), (y, F) and (y, G) are
indifferent.

Together with A2-A4, the above conditions validate P1-P7 in (Fishburn 1982, pp.
151-154) on S. The result then follows from the proof of Theorem 2 in (Fishburn 1982, p.
155), replacing the full product space M × A throughout with the subset S ⊂ M × A.
Remark 7 If F is an element of B, the representation specializes to

u′(F ∩ E) = ∑_k u′(F ∩ Ek)pE(Ek),

where u′ is defined by u′([x] ∩ E) = u(x, E), whenever F ∩ Ek ≠ ∅ for all k.
Furthermore, if we take F to be the whole set W, we get u′(E) = ∑_k u′(Ek)pE(Ek).
1.3 Multi-agent construction
The construction introduced in the previous section can be easily generalized to the multi-
agent case. As was mentioned in the introduction, although the multi-agent case is our
primary motivation, this section is brief because it is a straightforward extension of the
single-agent case.
Let I = {1, ..., n} be a set of agents. Agent i ∈ I is assumed to have preferences (and
hence beliefs) not only about the basic events in a finite algebra A0 and its own preferences,
but about other agents’ preferences as well. (In the treatment here we have all agents share the
base algebra A0, though this can be relaxed.)

As in the single-agent case, W represents a set of possible worlds, and ≽iw is a
function associating to each pair (i, w) a regular preference ordering on 2^W.
For any E, F ∈ A0, we denote by [E ≻i F] the proposition {w ∈ W | E ≻iw F}
(“i (strictly) prefers E to F”), and by [E ∼i F] the set {w ∈ W | E ∼iw F} (“i is
indifferent between E and F”). We denote by Bi0 the π-system obtained by taking all finite
intersections of propositions [E ≻i F] and [E ∼i F].

We recursively define n-th order (n > 1) algebras and preferences:

• An = An−1 ∪ (∪i∈I Bi n−1),

• Bin (i ∈ I) is the set of all finite intersections of propositions [E ≻i F], [E ∼i F],
where E, F ∈ An.

• A = ∪n An is the algebra generated by events E ∈ An (n ≥ 0), and Bi is the
π-system generated by [E ≻i F], [E ∼i F], where E, F ∈ A.
Again, we obtain a Jeffrey-style representation of agent i’s preferences, for all An
and i ∈ I:

ui(E)pi(E) = ∑_k ui(Ek)pi(Ek).

Moreover, under the same conditions introduced in the single-agent case, we get a
Fishburn-style representation for pairs (E, F) ∈ A × Bi, which satisfies

ui(E ∩ F) = ∑_k ui(Ek ∩ F)piE(Ek)

whenever E ∩ F ≠ ∅.
1.4 Conclusions
We presented a decision-theoretic approach aimed at overcoming several well-known
limitations of existing constructions, limitations that become particularly apparent, and
disturbing, in multi-agent applications. Our approach enjoys several advantageous features,
including the ability to represent:
including the ability to represent:
• Preferences on incomplete descriptions of the world.
• Conditional behavior, even contingent on disbelieved (counterfactual) events.
• Higher-order preferences (and hence beliefs).
The current limitations include:
• The axioms are not optimized for the proposed interpretation. That is, we glue
together constraints drawn from Domotor (or Jeffrey) and from Fishburn. Together
these are sufficient to guarantee the representation we seek, but there is no reason to
believe that they are necessary.
• We do not account for ‘agent causality’, or actions.
• As a result, we are not yet in a position to apply our construction to the resolution of
most game-theoretic paradoxes.
We view the first limitation as more of a mathematical annoyance than anything else,
and are actively working on removing the second one. We leave the explicit application of
our theory to game theoretic paradoxes to future work.
Chapter 2

Expected utility networks
2.1 Introduction
Modularity is the cornerstone of knowledge representation in artificial intelligence (AI); it
allows concise representations of otherwise quite complex concepts. Logic offers modu-
larity via the compositional nature of the logical connectives, and the property is exploited
by theorem provers. Probability allows this via the notion of probabilistic independence,
a notion fully exploited by Bayesian networks (Pearl 1988). In recent years there have
been several attempts to provide modular utility representations of preferences (Bacchus
and Grove 1995, Doyle and Wellman 1995, Shoham 1997).
It has proven difficult to devise a useful representation of utilities; this difficulty
can certainly be ascribed to the different properties of utility and probability functions, but
also, more fundamentally, to the fact that reasoning about probabilities and utilities together
requires more than simply gluing together a representation of utility and one of probability.
In fact, just as probabilistic inference involves the computation of conditional
probabilities, strategic inference, the reasoning process which underlies rational
decision-making, involves the computation of conditional expected utilities for alternative plans
of action, which may not have a modular representation even if probabilities and utilities,
taken separately, do.
The purpose of this chapter is to introduce a new class of graphical representations,
expected utility networks (EU nets). EU nets are undirected graphs with two types of arc,
representing probability and utility dependencies respectively. In EU nets not only prob-
abilities, but also utilities enjoy a modular representation. The representation of utilities
is based on a novel notion of conditional utility independence, which departs significantly
from other existing proposals, and is defined in close analogy with its probabilistic coun-
terpart.
We also define a novel notion of conditional expected utility (EU) independence, and
show that in EU nets node separation with respect to the probability and utility subgraphs
implies conditional EU independence. In this respect, choosing the “right” notion of con-
ditional utility independence turns out to be crucial.
What is important about conditionally independent decisions is that they can be effec-
tively decentralized: a single, complicated agent can be replaced by simpler, conditionally
independent sub-agents, who can do just as well. This property is of interest not only to ar-
tificial intelligence, since it can be exploited to reduce the complexity of planning, but also
to economic theory, as it suggests a principled way for the identification of optimal task
allocations within economic organizations.
The rest of the chapter is organized as follows. In section 2 we introduce our notion
of conditional utility independence, and discuss it in the context of other recent propos-
als in the literature. In section 3 we formally introduce EU nets, and discuss some of their
structural properties. Next, we extend the utility function from elementary “states” or out-
comes to general events, with the interpretation that the utility of an event is the expected
utility of that event, conditional on it being true. We explain this, and other characteris-
tics of the underlying decision theoretic framework, in section 4. In section 5 we show that
conditional probability and utility independence jointly imply conditional expected utility
independence, and argue that conditionally independent decisions can be effectively de-
centralized. In section 6 we address the issue of probabilistic and strategic inference in
EU nets, and show how conditional probabilities and conditional expected utilities can be
recovered from the structural elements of EU nets.
2.2 Conditional independence of probabilities and utilities
Conditional probability independence is a powerful notion: it incorporates a natural, in-
tuitive notion of relevance, and may dramatically reduce the complexity of probabilistic
inference by allowing a convenient decomposition of the probability function.
In strategic inference, reducing the complexity of the decision problem calls for a
decomposition of utilities along with probabilities. Yet, this is generally not enough: even
if probabilities and utilities are separately decomposable, strategic inference typically
involves computation of the expected utilities for alternative plans of action, and hence what
is really important is the ability to decompose the expected utility function.
Several proposals which recently appeared in the literature (Bacchus and Grove 1995,
Shoham 1997) rely on additive notions of utility independence, while the familiar notion of
probabilistic independence is multiplicative. This difference may account for the difficulty
encountered by these proposals in achieving a convenient decomposition of the expected
utility function, and hence an effective reduction in the complexity of strategic inference.
In this section, we propose a multiplicative notion of conditional utility indepen-
dence, which is a close analogue of its probabilistic counterpart. In the following sections,
we argue that these two notions turn out to play well together, by inducing a modular de-
composition of the expected utility function, and a consequent simplification of the decision
process.
Let {Xi}i∈N (N = {1, ..., n}) be a finite, ordered set of random variables5, and
let x0 = (x01, ..., x
0n) be some arbitrary given realization which will act as the reference
point (we use uppercase letters to denote random variables, and lowercase to denote their
realizations). A joint realizationx = (x1, ..., xn) represents a (global)state, or outcome.
For anyM ⊂ N , we denote byXM the set{Xi}i∈M . Let p be a strictly positive probability
measure defined on the Boolean algebraA generated byXN , and letu be a (utility) function
which assigns to each statex a positive real number. We assume that the decision maker’s
beliefs and preferences are completely characterized by(p, u). Specifically, we assume that
p represents the decision maker’s prior beliefs, and that for any two probability measures
p′ andp′′, p′ Â p′′ (p′ is preferred top′′) if and only if Ep′(u) > Ep′′(u), whereEp(u) =
∑x u(x)p(x). Finally, let
q(xM |xN−M) =p(xM , xN−M)
p(x0M , xN−M)
.
The interpretation ofq is in terms ofceteris paribuscomparisons: it tells us how
the probability changes when the values ofXM are shifted away from the reference point,
while the values ofXN−M are held fixed atxN−M .
^5 To keep the notation simple, we assume that they may take only finitely many values. Yet, the construction is easily extended to more general classes of random variables.
We also define a corresponding ceteris paribus comparison operator for utilities:

    w(x_M | x_{N−M}) = u(x_M, x_{N−M}) / u(x^0_M, x_{N−M}).

One way to interpret w is as a measure of the intensity of preference for x_M (with
respect to the reference point) conditional on x_{N−M}.
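As a toy illustration of the two ceteris paribus operators (our own sketch, not from the text; all numbers are hypothetical), consider a pair of binary variables with reference point x^0 = (0, 0):

```python
# Minimal sketch of the ceteris paribus ratios q and w for two binary
# variables X1, X2, with reference point x0 = (0, 0). Numbers hypothetical.
from itertools import product

p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}  # joint probability
u = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 3.0, (1, 1): 6.0}  # positive utility

def q(x1, x2):
    """q(x1 | x2): probability ratio as X1 shifts away from its reference 0."""
    return p[(x1, x2)] / p[(0, x2)]

def w(x1, x2):
    """w(x1 | x2): utility ratio as X1 shifts away from its reference 0."""
    return u[(x1, x2)] / u[(0, x2)]

for x1, x2 in product((0, 1), repeat=2):
    print(f"q({x1}|{x2}) = {q(x1, x2):.3f}   w({x1}|{x2}) = {w(x1, x2):.3f}")
```

With these numbers w(1|0) = w(1|1) = 3, so the intensity of preference for X1 does not depend on X2, while q(1|0) ≠ q(1|1), so the probability ratio does.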
Suppose that q(x_M | x_{N−M}) only depends on x_K, where K ⊂ N − M, but not on
x_{N−M−K}. It is easily verified that this condition holds for all x_N if and only if X_M is
probabilistically independent of X_{N−M−K} given X_K. We express (and record) this by defining
new quantities q(x_M | x_K) = q(x_M | x_{N−M}), where the conditions x_{N−M−K} are dropped. We
call this notion p-independence: note that it is only defined in terms of states (as opposed
to general events), and corresponds to conditional probability independence whenever it is
defined. Specifically, if A, B and C are three subsets of the set X_N of all random variables,
the statement "A is p-independent of B given C" only makes sense if A, B and C constitute
a partition of X_N, and in that case it is equivalent to the statement that A and B are
probabilistically independent given C.

A corresponding notion of conditional utility independence (u-independence) is defined
accordingly. Suppose that w(x_M | x_{N−M}) depends on x_K, but not on x_{N−M−K}. Hence,
the intensity of preference for the variables in X_M (relative to their reference values)
depends on the values of X_K, but not on those of X_{N−M−K}. In that case, we again define new
quantities w(x_M | x_K) = w(x_M | x_{N−M}), and say that X_M is u-independent of X_{N−M−K}
given X_K.
It is instructive to compare our notion of conditional utility independence with several
other proposals which have appeared in the literature, in the context of an example
adapted from (Bacchus and Grove 1995). Suppose that there are two basic events, H and
W ("health and wealth"), and that the following payoff tables, where payoffs are expressed
as multiples of u(¬H ∩ ¬W) (an arbitrary reference point), represent the decision maker's
preferences over H and W in two different scenarios:

    First scenario:           Second scenario:
           ¬W   W                    ¬W   W
    ¬H      1   2             ¬H      1   2
     H      3   6              H      3   4
Bacchus and Grove's utility independence is, fundamentally, a qualitative notion,
which in the example reduces to payoff dominance. Since H dominates ¬H and W dominates
¬W, utility independence holds in both cases.

Additive utility independence specializes utility independence by requiring that probability
measures with the same marginals be indifferent. In our 2×2 example, this amounts
to the restriction that u(H ∩ W) + u(¬H ∩ ¬W) = u(H ∩ ¬W) + u(¬H ∩ W). Hence,
additive utility independence holds in the second case but not in the first.

Shoham further specializes the notion of additive independence, with the following
intended interpretation: the two Boolean variables in the example are independent if it is
possible to associate to each of them a linear "contribution", such that the utility of a joint
realization is given by the sum of the contributions. In our 2×2 example, this criterion
coincides with additive independence; however, in the general case it is more stringent.
We too introduce quantitative information, but in a different way: the two variables in
the example are independent in our sense (conditional on the empty set) if the increment in
utility relative to the reference point is the product of the increments along each component.
The intended interpretation of utility independence in our case is that the "intensity of
preference" for one variable with respect to its reference value, represented by the ceteris
paribus utility ratio, does not depend on the particular value taken by the other variable.
Hence, in our sense, H and W are independent in the first scenario but not in the second.
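The contrast between the additive and the multiplicative criteria on the two tables above can be made explicit in a few lines. In the sketch below (ours, not from the text), states are encoded as pairs (h, w), with 1 for H or W and 0 for their negations:

```python
# Sketch: testing additive vs. multiplicative utility independence on the two
# health/wealth payoff tables (entries are multiples of u(notH, notW)).
def additive_independent(table):
    # Additive criterion: u(H,W) + u(notH,notW) == u(H,notW) + u(notH,W).
    return abs(table[(1, 1)] + table[(0, 0)]
               - table[(1, 0)] - table[(0, 1)]) < 1e-9

def multiplicative_independent(table):
    # Multiplicative criterion: the increment of the joint realization is the
    # product of the increments along each component, relative to the reference.
    ref = table[(0, 0)]
    return abs(table[(1, 1)] / ref
               - (table[(1, 0)] / ref) * (table[(0, 1)] / ref)) < 1e-9

first = {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 6}   # first scenario
second = {(0, 0): 1, (0, 1): 2, (1, 0): 3, (1, 1): 4}  # second scenario

print(multiplicative_independent(first), additive_independent(first))    # True False
print(multiplicative_independent(second), additive_independent(second))  # False True
```

The output mirrors the discussion: the multiplicative criterion holds only in the first scenario (6 = 3 · 2), the additive criterion only in the second (1 + 4 = 3 + 2).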
We claim that u-independence is a particularly attractive notion for two reasons:
• it is information that is natural to elicit from people, as it purely involves relevance
considerations and order-of-magnitude comparisons between utilities
• it gives rise to a graphical representation and associated inference mechanism,
expected utility networks (defined in the next section), which is simultaneously
modular in probabilities, utilities and expected utilities.
2.3 Expected utility networks: a formal definition and some structural properties
We define an expected utility network as an undirected graph G with two types of arc,
representing probability and utility dependencies respectively. Each node represents a
random variable (say, X_i), and is associated with two positive functions, q(x_i | x_{P(i)}) and
w(x_i | x_{U(i)}), where P(i) denotes the set of nodes directly connected to X_i via probability
arcs,^6 and U(i) the corresponding set of nodes directly connected to X_i via utility arcs.

These quantities are interpreted as the probability and utility ratios (defined in the
previous section) produced by some expected utility representation (p, u), and may be
assessed by the decision maker through ceteris paribus comparisons. Alternatively, the
probability layer of an EU net may be initially specified as a Bayes network, and the probability
ratios q derived from conditional probability tables.
Figure 1 depicts a simple EU net. Although it is possible to present much richer
examples, we select this one because it ties in with an example discussed in the chapter on
auctions.
Figure 1. A simple EU net. Probability arcs are represented by solid lines, and utility arcs by dashed lines.
^6 P(i) corresponds to the Markov mantle of X_i, i.e., the minimal set of variables such that, conditional on those, X_i is (probabilistically) independent of everything else.
If the q and w functions are specified directly, then any arbitrary assignment of positive
functions q(x_i | x_{BP(i)}, x^0_{AP(i)}) for all i (where AP(i) denotes the set of all variables in
P(i) whose index is greater than i, and BP(i) = P(i) − AP(i)) uniquely identifies a
corresponding probability function. Similarly, any arbitrary assignment of positive functions
w(x_i | x_{BU(i)}, x^0_{AU(i)}) (where AU(i) is the set of all variables in U(i) whose index is greater
than i, and BU(i) = U(i) − AU(i)) identifies a utility function, unique up to normalization.^7
We remark that, if q(x_i | x_{−i}) only depends on x_{P(i)}, then fixing x_{P(i)} completely
specifies the behavior of the probability function along the i-th coordinate (up to the
probability of the reference point), and that such behavior does not depend on the particular
values taken by the other variables. The same is true of the utility of x_i with respect to
its reference value, given x_{U(i)}.
It turns out that node separation with respect to the probability and utility subgraphs
characterizes all the implied p- and u-independencies. More precisely, for any probability-utility
pair (p, u) there exists an undirected graph G such that, if A, B and C are three
subsets of variables (each variable being associated with a node in the graph), A is p-
(resp., u-) independent of B given C if and only if C separates A from B with respect to
the probability (resp., utility) subgraph (in the sense that every path from A to B in the
subgraph must pass through C). In the language of (Pearl 1988), G is a perfect map of the
independence structure.
^7 The particular normalization we adopt is discussed in section 4.
Theorem 6 The set of p- and u-independencies generated by any pair (p, u) has a
perfect map.
Proof We follow the methodology of (Bacchus and Grove 1995): we appeal to a
necessary and sufficient condition in (Pearl and Paz 1989) and check that suitable
generalizations of p-independence and u-independence both possess the following five
properties: symmetry, decomposition, intersection, strong union and transitivity. We prove
it in the case of utility; the proof for probability is analogous.
Let A, B, C, D, R, R′, R″ be subsets of random variables, where R, R′ and R″ always
denote the subset of remaining variables in the appropriate context (so, for instance, in the
context of some A, B and C, R = X_N − A − B − C). As elsewhere in this chapter,
we use uppercase/lowercase to denote (subsets of) random variables and their realizations
respectively.

For the purpose of this proof, let us say that A is independent of B given C, and write
I(A, B | C), if and only if w(a | b, c, r) = w(a | b^0, c, r) for all (a, b, c, r). Note that this notion
generalizes u-independence. Then the following properties hold.
Symmetry: I(A, B | C) ⇒ I(B, A | C).

This follows because

    w(b | a, c, r) = u(a, b, c, r) / u(a, b^0, c, r)
                   = [u(a, b, c, r) / u(a^0, b, c, r)] · [u(a^0, b, c, r) / u(a, b^0, c, r)]
                   = [u(a, b^0, c, r) / u(a^0, b^0, c, r)] · [u(a^0, b, c, r) / u(a, b^0, c, r)]
                   = w(b | a^0, c, r),

where the third equality uses the hypothesis I(A, B | C), which states that
u(a, b, c, r)/u(a^0, b, c, r) = u(a, b^0, c, r)/u(a^0, b^0, c, r).
Decomposition: I(A, B ∪ D | C) ⇒ I(A, B | C) ∧ I(A, D | C).
This is equivalent to saying that w(a | b, c, d, r) = w(a | b^0, c, d^0, r) implies
w(a | b, c, r′) = w(a | b^0, c, r′) and w(a | c, d, r″) = w(a | c, d^0, r″). This follows
trivially, because r′ = (d, r) and r″ = (b, r).
Intersection: I(A, B | C ∪ D) ∧ I(A, D | B ∪ C) ⇒ I(A, B ∪ D | C).

Equivalently, w(a | b, c, d, r) = w(a | b^0, c, d, r) and w(a | b, c, d, r) = w(a | b, c, d^0, r)
imply w(a | b, c, d, r) = w(a | b^0, c, d^0, r).
This also follows quite easily by algebraic manipulation.
Strong union: I(A, B | C) ⇒ I(B, A | C ∪ D).

Equivalently, w(a | b, c, r) = w(a | b^0, c, r) implies w(b | a, c, d, r′) = w(b | a^0, c, d, r′).
This follows by symmetry, and the fact that r = (d, r′).
Transitivity: I(A, B | C) ⇒ I(A, V | C) ∨ I(B, V | C), where V is any single variable.
This is equivalent to saying that w(a | b, c, r) = w(a | b^0, c, r) implies
w(a | v, c, r′) = w(a | v^0, c, r′) or w(b | v, c, r″) = w(b | v^0, c, r″), which follows by
observing that either v is in b or else is in r.
2.4 Conditional expected utility
While probability is a set function, defined for general events, utility is so far only defined
for elementary events (states). The notion of p-independence introduced in section 2
precisely corresponds to conditional probability independence whenever it is defined, and it is
only defined in terms of states; in turn, this enabled us to define a new notion of conditional
utility independence, also defined in terms of states, which we named u-independence.
As we have seen, all p- and u-independencies can be immediately recovered from
the graphical structure of EU nets, because they are fully characterized by node separation
with respect to the probability and utility subgraphs. For instance, in the simple EU net
represented in figure 1, V and C are conditionally p- and u-independent of each other,
given A, B and S.
Suppose now that A and B are controllable variables, in the sense that their values
can be freely set by the decision maker. A rational decision maker will want to choose
values of A and B which maximize expected utility; hence, for each assignment (a, b), the
decision maker should compute the corresponding expected utility, and identify an optimal
decision. Clearly, the decision process becomes quite cumbersome when there are many
decision variables; to reduce its complexity, we seek conditions under which the expected
utility calculations can be conveniently decomposed.
The first step will be to extend utility to be a set function as well, with the following
interpretation: the utility of an event is the expected utility of the event, conditional on the
event being true. Formally,

    u(E) = Σ_{x∈E} u(x) p(x | E).
The following important property is an immediate consequence of the definition: for
any nonempty E ∈ A and for any non-empty, finite partition {E_k} of E, where the E_k may
or may not be elementary "states",
    u(E) = Σ_k u(E_k) p(E_k | E).
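The definition and the partition property are easy to check numerically. Below is a minimal sketch of our own, with hypothetical states and numbers:

```python
# Sketch: utility of an event as conditional expected utility,
# u(E) = sum over x in E of u(x) p(x|E), and the partition property
# u(E) = sum_k u(E_k) p(E_k|E). States and numbers are hypothetical.
states = ["s1", "s2", "s3", "s4"]
p = {"s1": 0.1, "s2": 0.2, "s3": 0.3, "s4": 0.4}
u = {"s1": 2.0, "s2": 1.0, "s3": 4.0, "s4": 0.5}

def pE(E):
    return sum(p[x] for x in E)

def uE(E):
    # p(x|E) = p(x)/p(E) for x in E, since p is defined on states
    return sum(u[x] * p[x] / pE(E) for x in E)

E = ["s1", "s2", "s3"]
E1, E2 = ["s1"], ["s2", "s3"]  # a partition of E

lhs = uE(E)
rhs = uE(E1) * pE(E1) / pE(E) + uE(E2) * pE(E2) / pE(E)
print(abs(lhs - rhs) < 1e-12)  # True
```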
The (von Neumann-Morgenstern) utilities we start from^8 are only defined up to
positive affine transformations. It is natural, then, to normalize the utility measure around
certain values, just as probabilities are normalized to lie between zero and one. Hence,
we require that u(True) = 1, where True denotes the tautological event, or the entire
universe.^9
Although it shall not play a direct role in EU nets, in order to facilitate the exposition
we also define the value (or impact) of event E:

    v(E) = u(E) p(E).

Under the above normalization v is a (strictly positive) probability measure, since it
is an additive set function, and

    v(True) = 1 ≥ v(E) > 0
for all nonempty E. Moreover, since p is also strictly positive, we have that

    u(E) = v(E) / p(E).
^8 We start with von Neumann-Morgenstern utilities and a given prior probability, as is customary in economics. Of course, the decision-theoretic representation developed in chapter 1 is also fully adequate (and, we believe, offers better foundations) for our treatment of EUNs, although we decided to start with von Neumann-Morgenstern utilities to emphasize that none of the results in the foregoing treatment depend on our particular decision-theoretic perspective.
^9 This normalization uniquely identifies the expected utility function if the utility of a second event E_0 is also fixed, or, equivalently, if utilities are expressed as multiples of u(x^0), the utility of an arbitrary reference point, as we do in EUNs.
Note the remarkable structure of conditional expected utility: the utility “measure” is
simply the ratio of two probability measures, one representing value, and the other belief.
Besides being important for the practical construction of EU nets, this normalization
of u allows us to speak about "good" and "bad" events. True, the status quo, is neutral:
neither good nor bad. An event E is said to be good (i.e., better than True) if u(E) > 1,
and bad if u(E) < 1.
The conditional versions of the three set functions, probability, utility, and value,
are defined in the natural way: p(E | F) = p(E ∩ F)/p(F), and similarly
u(E | F) = u(E ∩ F)/u(F), and v(E | F) = v(E ∩ F)/v(F).

The three notions of conditioning are related by

    v(E | F) = u(E | F) p(E | F).
Being a probability measure, p obeys Bayes' rule (and, clearly, so does v):

    p(F | E) = p(E | F) p(F) / [p(E | F) p(F) + p(E | ¬F) p(¬F)].

Bayes' rule does not hold for utilities, but a modified version of it does:

    u(F | E) = u(E | F) u(F) / [u(E | F) u(F) p(F | E) + u(E | ¬F) u(¬F) p(¬F | E)].
Note that this is a “hybrid” relationship: conditional utility depends, among other
things, on conditional probabilities. This is another fact which is important to keep in mind
in connection with EU nets.
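The hybrid rule can be verified numerically. In the sketch below (our own toy example; all numbers are hypothetical), the two sides of the modified Bayes rule are computed independently on a four-state space:

```python
# Sketch: the "hybrid" Bayes rule for utilities, checked against direct
# computation on a toy two-variable state space. Numbers hypothetical.
states = [(e, f) for e in (0, 1) for f in (0, 1)]  # (E true?, F true?)
p = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}
u = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 3.0, (1, 1): 5.0}

def pr(ev):
    return sum(p[s] for s in ev)

def ut(ev):  # u(E): expected utility conditional on the event being true
    return sum(u[s] * p[s] for s in ev) / pr(ev)

E = [s for s in states if s[0] == 1]
F = [s for s in states if s[1] == 1]
notF = [s for s in states if s[1] == 0]
EF = [s for s in E if s in F]
EnF = [s for s in E if s in notF]

# conditional versions: u(E|F) = u(E ∩ F)/u(F), p(F|E) = p(E ∩ F)/p(E), etc.
uEgF, uEgnF = ut(EF) / ut(F), ut(EnF) / ut(notF)
pFgE, pnFgE = pr(EF) / pr(E), pr(EnF) / pr(E)

lhs = ut(EF) / ut(E)  # u(F|E) computed directly
rhs = uEgF * ut(F) / (uEgF * ut(F) * pFgE + uEgnF * ut(notF) * pnFgE)
print(abs(lhs - rhs) < 1e-12)  # True
```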
2.5 Conditional expected utility independence
We have now extended the utility function from complete states to arbitrary events, but
this new concept will be useful only insofar as it can be associated with a corresponding
independence notion, and this extended notion is also captured in the structure of the graph.
In this section we show that both these conditions hold. First we shall define a notion of
conditional expected utility independence, and then show that this notion is indeed captured
in the graphical structure of EU nets.
We define conditional expected utility independence (or, more concisely, conditional
EU independence) for general events in analogy with the familiar notion of conditional
probability independence. Two events, E and F, are said to be conditionally EU independent
given a third event G if

    u(E ∩ F | G) = u(E | G) u(F | G).
Conditional expected utility independence generalizes u-independence from states to
general events, much as conditional probability independence generalizes p-independence.
Yet, since expected utilities involve probabilities as well, the relationship between conditional
EU independence and u-independence is mediated by probabilities.
Let us look at the general case first. Consider a partition of the set of all random
variables into three subsets A, B and C. The conditional expected utility of b given a is

    u(b | a) = u(a, b) / u(a) = [Σ_c u(a, b, c) p(c | a, b)] / [Σ_{b,c} u(a, b, c) p(b, c | a)].
Suppose now that a separates b from c with respect to both the probability and utility
subgraphs. Then the following simplification obtains:

    u(b | a) = w(b | a) / Σ_b w(b | a) p(b | a),        p(b | a) = q(b | a) / Σ_b q(b | a).

Hence, the formula for u(b | a) does not involve terms in C, and similarly u(c | a) does
not involve terms in B.

This is not true if B and C are not p-independent, as the following example shows.
Example 1 Consider the special case in which A is empty, and w(b) = 1 (in which
case, we say that B is payoff-irrelevant). Then B and C are u-independent,
although they may not be p-independent. Hence,

    u(b) / u(b′) = [Σ_c w(c) q(c | b)] / [Σ_c w(c) q(c | b′)],

a quantity which is generally different from one. Intuitively, in this case B is purely
instrumental to C: it is irrelevant in itself, but its expected utility reflects the influence
that a particular choice of B has on the probability of C. If B and C are also
p-independent, then the above expression reduces to u(b)/u(b′) = 1 (in which case, we
say that B is strategically irrelevant).
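The formula in Example 1 can be exercised directly. In the sketch below (hypothetical numbers of our own), B has no intrinsic payoff (w(b) = 1) but shifts the distribution of a payoff-relevant C, so its expected utility ratio differs from one:

```python
# Sketch of Example 1: B is payoff-irrelevant (w(b) = 1) yet not strategically
# irrelevant, because it shifts the probability of C. Numbers hypothetical.
w_c = {0: 1.0, 1: 4.0}                   # w(c): utility ratios over C
q_cb = {(0, 0): 1.0, (1, 0): 0.5,        # q(c|b) for b = 0
        (0, 1): 1.0, (1, 1): 3.0}        # q(c|b) for b = 1

def u_ratio(b, b_prime):
    """u(b)/u(b') = sum_c w(c) q(c|b) / sum_c w(c) q(c|b')."""
    num = sum(w_c[c] * q_cb[(c, b)] for c in (0, 1))
    den = sum(w_c[c] * q_cb[(c, b_prime)] for c in (0, 1))
    return num / den

print(u_ratio(1, 0))  # != 1: choosing b = 1 makes the good outcome c = 1 likelier
```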
These observations are central to the following result.
Theorem 7 p- and u-independence jointly imply conditional expected utility
independence.
Proof Let b and c be p- and u-independent given a. We want to show that u(bc | a) =
u(b | a) u(c | a). We have that

    u(bc | a) = w(a, b, c) / Σ_{b,c} w(a, b, c) p(bc | a)
              = w(c | a, b) w(b | a, c^0) w(a | b^0, c^0) / Σ_{b,c} w(a, b, c) p(b | a) p(c | a)
              = w(c | a) w(b | a) / Σ_{b,c} w(c | a) w(b | a) p(b | a) p(c | a)
              = [w(b | a) / Σ_b w(b | a) p(b | a)] · [w(c | a) / Σ_c w(c | a) p(c | a)]
              = u(b | a) u(c | a),

which proves the result.
Hence, the graphical structure of EU nets can be exploited to identify conditional EU
independencies: node separation with respect to both the utility and probability subgraphs
implies conditional EU independence. The upshot is that, conditional on A, decisions
regarding B and C can be effectively decomposed: if both B and C contain variables
which are under the control of the decision maker, it is neither necessary nor useful to possess
information about C in order to decide on B, and vice versa. One way to think about such
decomposability is in terms of strategic decentralization: a single, centralized decision
maker can be replaced by two conditionally independent, simpler agents, who only need to
worry about their own respective domains in order to make jointly optimal decisions.
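To make Theorem 7 concrete, the following sketch (our own construction, with hypothetical numbers) builds a three-variable model in which B and C are p- and u-independent given A, by letting both p and u factor accordingly, and then verifies u(bc|a) = u(b|a)u(c|a) directly from the definition of conditional expected utility:

```python
# Sketch verifying Theorem 7 numerically: if B and C are p- and u-independent
# given A, then u(bc|a) = u(b|a) u(c|a). A, B, C binary; numbers hypothetical.
from itertools import product

pA = {0: 0.5, 1: 0.5}
pBgA = {(0, 0): 0.7, (1, 0): 0.3, (0, 1): 0.4, (1, 1): 0.6}  # p(b|a)
pCgA = {(0, 0): 0.2, (1, 0): 0.8, (0, 1): 0.5, (1, 1): 0.5}  # p(c|a)
f = {0: 1.0, 1: 2.0}                                          # factor for A
g = {(0, 0): 1.0, (1, 0): 3.0, (0, 1): 1.0, (1, 1): 5.0}     # factor for B given A
h = {(0, 0): 1.0, (1, 0): 2.0, (0, 1): 1.0, (1, 1): 4.0}     # factor for C given A

def p(a, b, c):
    return pA[a] * pBgA[(b, a)] * pCgA[(c, a)]   # p-independence of B, C given A

def u(a, b, c):
    return f[a] * g[(b, a)] * h[(c, a)]          # u-independence of B, C given A

def u_cond(fix):
    """Expected utility of the event {variables agree with fix}, given the event."""
    ev = [s for s in product((0, 1), repeat=3)
          if all(s[i] == v for i, v in fix.items())]
    mass = sum(p(*s) for s in ev)
    return sum(u(*s) * p(*s) for s in ev) / mass

for a, b, c in product((0, 1), repeat=3):
    lhs = u_cond({0: a, 1: b, 2: c}) / u_cond({0: a})          # u(bc|a)
    rhs = (u_cond({0: a, 1: b}) / u_cond({0: a})) * \
          (u_cond({0: a, 2: c}) / u_cond({0: a}))              # u(b|a) u(c|a)
    assert abs(lhs - rhs) < 1e-9
print("u(bc|a) = u(b|a)u(c|a) holds for all (a, b, c)")
```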
2.6 Inference in expected utility networks
In EU nets, probabilities and utilities are implicitly described by the q and w functions,
together with the topological structure of the network. Probabilistic inference involves
the computation of conditional probabilities, and strategic inference the computation of
conditional expected utilities; in this section, we show how these quantities can be readily
recovered from the structural elements of EU nets.
The probability layer of an EU net is essentially a Markov network,^10 even though
probabilities are subject to a somewhat unusual normalization, and hence probabilistic
inference can be performed with standard techniques whenever the (Markov) potentials q are
known. In turn, potentials can either be assigned directly by the decision maker in the form
of ceteris paribus comparisons, or derived from conditional probabilities if one starts with
a Bayes network.
The advantage of using utility “potentials” in EU nets is that they are based purely on
utility comparisons between states, which do not involve probabilities: this enables one to
elicit all the relevant preferences from the decision maker without assuming that he or she
already knows the probabilities.
Although we do not tackle here the issue of computational efficiency for probabilistic
and strategic inference in EU nets (a topic which is next on our research agenda), we shall
show how the two fundamental operations of marginalization and conditionalization for
probabilities and expected utilities can be easily reduced to operations on the probability
and utility potentials.
Once the potentials are known, the computation of p(x)/p(x^0) is straightforward:
^10 For a good introduction to Markov networks, we refer to (Pearl 1988, chapter 3).
    p(x)/p(x^0) = [p(x_1, x^0_{−1}) / p(x^0)] · [p(x) / p(x_1, x^0_{−1})]
                = q(x_1 | x^0_{P(1)}) · [p(x_2, x_1, x^0_{−{1,2}}) / p(x^0_2, x_1, x^0_{−{1,2}})] · [p(x) / p(x_2, x_1, x^0_{−{1,2}})]
                = q(x_1 | x^0_{P(1)}) · q(x_2 | x_1, x^0_{AP(2)}) · [p(x) / p(x_2, x_1, x^0_{−{1,2}})]
                = ... = ∏_i q(x_i | x_{BP(i)}, x^0_{AP(i)}).
One can obtain p(x_M)/p(x^0) (where p(x_M) is now the marginal probability function
for a subset of random variables X_M) by summing over the X_{N−M}:

    p(x_M)/p(x^0) = Σ_{x_{N−M}} p(x_M, x_{N−M}) / p(x^0).

One can then use p(x_M)/p(x^0) to compute ratios of marginal probabilities p(x_A)/p(x_B),
and in particular conditional probabilities p(x_A | x_B) = p(x_A, x_B)/p(x_B).
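The telescoping product above can be checked numerically. The following sketch (ours, not from the text; all numbers hypothetical) builds a chain-structured joint distribution over three binary variables, so that each node's probability neighbours are its immediate predecessor and successor, and verifies that the product of the potentials recovers p(x)/p(x^0):

```python
# Sketch: recovering probability ratios from the potentials q, for a chain
# X1 - X2 - X3 with BP(2) = {1}, AP(2) = {3}, BP(3) = {2}. Numbers hypothetical.
from itertools import product

p1 = {0: 0.4, 1: 0.6}
p2g1 = {(0, 0): 0.5, (1, 0): 0.5, (0, 1): 0.2, (1, 1): 0.8}  # p(x2|x1)
p3g2 = {(0, 0): 0.3, (1, 0): 0.7, (0, 1): 0.9, (1, 1): 0.1}  # p(x3|x2)

def p(x):
    x1, x2, x3 = x
    return p1[x1] * p2g1[(x2, x1)] * p3g2[(x3, x2)]

x0 = (0, 0, 0)  # reference point

# Potentials q(xi | x_{BP(i)}, x0_{AP(i)}): earlier neighbours at their actual
# values, later neighbours clamped at the reference point.
def q1(x1):
    return p((x1, 0, 0)) / p(x0)

def q2(x2, x1):
    return p((x1, x2, 0)) / p((x1, 0, 0))

def q3(x3, x2):
    # The ratio does not depend on x1 (p-independence given x2),
    # so we may clamp x1 at its reference value.
    return p((0, x2, x3)) / p((0, x2, 0))

for x in product((0, 1), repeat=3):
    x1, x2, x3 = x
    assert abs(p(x) / p(x0) - q1(x1) * q2(x2, x1) * q3(x3, x2)) < 1e-12
print("p(x)/p(x0) = q1 * q2 * q3 for all states")
```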
To compute u(x)/u(x^0), we use the same decomposition:

    u(x)/u(x^0) = ∏_i w(x_i | x_{BU(i)}, x^0_{AU(i)}).

The marginal (expected) utility of x_M, relative to the reference point, can hence be
computed as

    u(x_M)/u(x^0) = Σ_{x_{N−M}} [u(x_M, x_{N−M}) / u(x^0)] p(x_{N−M} | x_M).

One can then use the ratios u(x_M)/u(x^0) to compute ratios of expected utilities u(x_A)/u(x_B),
and in particular conditional expected utilities u(x_A | x_B) = u(x_A, x_B)/u(x_B).
Notice that marginal utilities generally depend on probabilities. For instance, the utility
of catching a particular cab rather than not is measured by the ratio u(Cab)/u(¬Cab),
and will generally depend on how likely it is that another cab will show up, on the
probability of rain, and so on.
Again, we remark that using utility “potentials” as the initial data in EU nets (rather
than conditional expected utilities) enables the decision maker to specify all the relevant
preferences without any prior knowledge of the probabilities.
Chapter 3
Game networks
3.1 Introduction
There are two standard representations for mathematical games: the normal (or strategic)
form, and the extensive form. The extensive form is more structured than the normal form:
not only does it describe the identities of the players, the strategies available to each player, and
the payoff functions, as the normal form does; it also describes the information held by the agents
at any possible state of the system, and the causal structure of events in the game.
The extensive form is also more general than the normal form: any game in normal
form can be interpreted as a game with simultaneous moves, and represented in the
extensive form. The reverse operation (from the extensive to the normal form) is also possible,
but some information (namely, the structure of conditional independencies) is lost in
the process, and hence not all extensive-form solution concepts have a normal-form
equivalent. Furthermore, the normal-form representation is often so large that resorting to an
extensive-form representation is the only viable way to even write down the game.
Even though extensive-form representations are more compact than normal-form
ones, they are still quite "redundant": in concrete examples, many branches of the tree
often have the same payoffs or conditional probabilities, but the recognition of these
symmetries does not lead to a more parsimonious representation. Furthermore, changing a few
details in the setup usually entails rewriting the whole game; in other words, the
extensive-form representation is not particularly modular.
Finally, for the vast majority of solution concepts, strategic inference is not explicitly
(i.e., algorithmically) supported in either form; rather, it relies on informal reasoning on
the part of the analyst. This lack of explicit decision-theoretic foundations is reflected in the
proliferation of solution concepts in the literature, each of them backed by different, and to
some degree informal, strategic justifications.
We present a new graphical representation for mathematical games: Game Networks
(G nets). Specifically, G nets:
• are more general than extensive-form representations, but are more structured and
more compact
• admit rigorous foundations
• provide a computationally advantageous framework for strategic inference
• generalize probabilistic networks in AI, for which a large algorithmic “toolbox” is
already available.
3.2 G nets: a formal definition and some results
G nets comprise a finite, ordered set of nodes X_N (where N = {1, ..., n}),
corresponding to a set of relevant variables; a partition I of N which determines the identity of
the agent responsible for the decision at each node (including Nature); and two types of
arc, representing causal and preferential (teleological) dependencies. Causal dependencies
are represented by directed (probability) arcs, and preferential dependencies by undirected
(utility) arcs.

Each node is associated with two quantities, w(x_k | x_{U(k)}) and p(x_k | x_{P(k)}), where
U(k) are the nodes directly connected to x_k via utility arcs, and P(k) is the set of
probability parents^11 of x_k (k ∈ N), which we assume to be empty for one or more (initial)
nodes. While p(x_k | x_{P(k)}) is a conditional probability function, the same for all players, w
is a vector of functions, one for each player. For each i ∈ I, p_i(x_k | x_{P(k)}) is a (subjective)
conditional probability, while w_i(x_k | x_{U(k)}) is a ceteris paribus comparison operator:

    w(x_M | x_{N−M}) = u(x_M, x_{N−M}) / u(x^0_M, x_{N−M}),
where x_N = (x_M, x_{N−M}) is a state and x^0 is an arbitrary reference point; moreover,
u is a strictly positive von Neumann-Morgenstern utility function.
Probability loops are allowed, but they must have zero probability (this will ensure
that p(¬E | E) = 0). Furthermore, zero probability is identified with impossibility.
The incoming probability arcs represent those events which the agent who controls
X_k can observe at the moment of decision. A decision maker (including Nature) may
choose any random rule, as long as it depends on the truth values of the P(k) only.
An element of the partition generated by the P(k) is called an information set at k.
^11 The probability parents of a node x_k are those nodes which are immediate predecessors of x_k in the ordering induced by the (directed) probability arcs.
We say that payoffs are regular if the utilities of all states are positive, and are
expressed as multiples of u(x^0) (the arbitrary reference point). Clearly, any game can be
transformed into one with regular payoffs via a positive affine rescaling of the original
payoffs. Hence, without loss of generality, from now on we concentrate on games with
regular payoffs.
A G net in which only the probabilities of Nature's actions are specified is a G frame.
A G frame can be regarded as the set of all G nets which respect the implied independence
structure, and agree in the utility assignments and in the probabilities of Nature's moves.
Finally, a simultaneous G net is one in which there are no probability dependencies.
Theorem 8 Any finite game in extensive form has a G frame representation.
Proof It is well known that any finite game in extensive form has a normal-form
representation (Fudenberg and Tirole 1991, p. 85). But any normal-form game can be represented
as a simultaneous G frame, where each node is identified with the strategy set of some
player, which in turn implies the result.
Let A_i(H) be the set of actions available to player i at an information set H. A_i(H)
can be regarded as a partition of H into possible actions E_H.

Definition 5 A vector of probability measures {p_i}_{i∈N} is said to satisfy individual
rationality if the following condition holds for all i, for all information sets H, and
for all E_H ∈ A_i(H): p_i(E_H | H) > 0 only if u_i(E_H | H) ≥ u_i(F_H | H) for all F_H ∈ A_i(H).
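The condition of Definition 5 can be checked mechanically for a candidate rule at a single information set. The sketch below is purely illustrative; the actions, conditional utilities, and probabilities are all hypothetical:

```python
# Sketch: checking individual rationality at one information set H with
# actions {a1, a2, a3}. All numbers below are hypothetical.
u_H = {"a1": 2.0, "a2": 2.0, "a3": 1.0}   # u_i(E_H | H) for each action
p_H = {"a1": 0.5, "a2": 0.5, "a3": 0.0}   # candidate p_i(E_H | H)

def individually_rational(p_H, u_H):
    """p_i(E_H|H) > 0 only if E_H maximizes u_i(. | H) over the actions."""
    best = max(u_H.values())
    return all(u_H[a] == best for a, prob in p_H.items() if prob > 0)

print(individually_rational(p_H, u_H))                     # True
print(individually_rational({"a1": 0.5, "a3": 0.5}, u_H))  # False: a3 suboptimal
```

Note that positive probability may be spread over several actions, as long as each of them is optimal; this is what makes existence (Corollary 9 below) a mixed-strategy statement.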
Anticipating the discussion of the existence of strategic equilibrium in Game networks
in the next chapter, we state the following important corollary.

Corollary 9 For any finite, simultaneous G frame, there exists a corresponding G
net which satisfies individual rationality.
3.3 Some examples
As a first example of a G net we present the beer/quiche game. In this game, Nature selects
the type of player 1, who may be either strong (S) or weak (¬S). Player 1 (who knows his
type) is in a pub, and has to decide whether to get beer (B) or quiche (¬B). His decisions
are observed by player 2, who is a bully and, as such, enjoys fighting against weak types.
After observing what player 1 orders, player 2 decides whether to start a fight with player
1 (F) or not (¬F). If player 1 is strong he will fight back, and hence player 2 in that case
prefers not to fight. Player 1 always prefers not to get into a fight, but more strongly so if
he is a weak type and hence knows that he will be beaten up. Finally, strong types of player
1 prefer beer to quiche, while weak types have the opposite preference.
A G net (or, more precisely, a G frame) representation of this game is depicted in
Figure 2, along with the corresponding extensive-form representation. The probability
dependencies are represented by solid arrows, while the dashed and dotted lines represent
utility dependencies for players 1 and 2 respectively.

In the informal description above there are several implicit independence assumptions.
For instance, it is implicitly assumed that Nature's choice cannot depend on what
Figure 2. The beer-quiche game.
the two players will do later, or that player 2's decision contingent on the observation of
player 1's behavior is independent of Nature's choice. Moreover, it is implicitly assumed
that player 1 prefers beer to quiche regardless of whether he will have to fight or not, and
that player 2 only cares about 1's type if he chooses to fight, but not otherwise.
In the G net representation of this game, both probability and utility independencies
are captured in the structure of the network. By contrast, the extensive form only captures
probability independencies, while the existence of utility independencies (which induce
symmetries in the payoff structure) does not lead to a more compact representation. In our
example, compactness is also reflected in the number of parameters needed to specify the
game payoffs. In the extensive form, one needs 16 parameters to identify the payoffs. In
the G net representation, instead, one only needs 8 parameters.
Compactness is only one of several advantages of G nets over extensive forms; another
is modularity. Suppose that, in the context of our beer-quiche example, one is
told that Nature can choose not two but three types, corresponding to different strength
levels. If we already have an extensive-form representation for the case of two types we cannot
els. If we already have an extensive form representation for the case of two types we cannot
easily reuse it; in order to incorporate a third type we need a completely new drawing. In
the G net representation, by contrast, introducing a new type in an existing representation
is quite easy, as it amounts to adding some more probabilities and utility potentials. Sim-
ilarly, in a G net one can easily introduce new moves (for instance, the reaction of a third
player to player2’s decision to fight or not), or change the informational assumptions while
at the same time retaining most of the existing structure (for instance, payoffs do not need
to be completely reassessed if the state space is refined).
A third advantageous feature of G nets is that the relevant information about utilities
can generally be introduced more naturally than in extensive forms. In the context of our
example, for instance, going from the informal description to a numeric assessment of the
payoffs is relatively cumbersome; the decision maker needs to report absolute utility values
for all possible outcomes, while the informal description only compares a few different
scenarios. To construct a G net, instead, one only needs information about payoff
dependencies, and order-of-magnitude comparisons between the relative utilities of alternative
scenarios, which is closer to the type of information contained in the informal description.
A second example is the “random order” game, in which Nature selects the order in
which agents play, but each agent only observes whether it is its turn to play or not. In
this game there are three players, plus Nature. Nature decides who plays first; each player
only observes whether it is its turn to play or not, but does not directly observe the order
of play. When it is its turn, each player chooses either out, in which case the game ends,
or across, in which case the move goes to the next player unless two players have already
moved, in which case the game ends. Figure 3 shows the extensive-form and the G net
representation of this game, where the payoffs have been omitted for simplicity. In the G
net representation, A_i represents player i’s action in the event that it is its turn to move (E_i).
Figure 3. The random order game.
Notice that, in the G net representation of this game, the information sets E_1, E_2, E_3
are payoff-irrelevant: only actions and the order in which they are taken matter. In the
extensive-form representation, that is implicit.
This game illustrates why we do not model the probability layer as a Bayes network,
but rather we allow zero-probability directed loops. Even though the game always ends
after a finite number of moves, no information set logically precedes the others. By way
of contrast, Nature’s decision on who moves first logically precedes all other events, in
the sense that the truth value of no other event can be set, if not conditionally on the truth
value of Nature’s choice. If all information sets are singletons (i.e., the game has perfect
information), then it is always possible to ascribe a unilateral direction of causality to the
events in the game (which simply reduce to actions). Yet, if the information states are
not the same as physical events, then it becomes inconvenient to impose that a unilateral
direction of causality must hold among any arbitrary group of events.
Chapter 4. Strategic equilibrium in Game Networks
4.1 Introduction
In this chapter, we establish existence and convergence results for strategic equilibrium in
game networks. While existence results (which we present in section 3 below) are easily
obtained, the issue of convergence deserves special attention. Many convergence proce-
dures have already been proposed in the game-theoretic literature, but none of them has
proved completely satisfactory. Fictitious play is one of the most popular; it is simple
and intuitively appealing, but it has some problems. The main one is that it may fail to
converge to a Nash equilibrium, due to the possible existence of cycles, which may per-
sist forever even though their frequency goes to zero. The tracing procedure (Harsanyi
and Selten 1988) is also problematic: it may not converge, and moreover is based on a
non-constructive argument which makes it difficult to use in practice.
In section 4 we introduce a new, simple iterative method, which always converges
to a unique Nash equilibrium in generic n-player normal-form games. The equilibrium is
uniquely determined by the payoff structure of the game, can be computed with any level
of accuracy, and involves no weakly dominated strategies.
In many cases one is not interested in finding just one equilibrium, but rather would
like to obtain a complete list of all the Nash equilibria. It turns out that an adaptation of the
previous method can be put to such use. We discuss this method in section 5.
Finally we address a related issue, which becomes significant in the context of some
potential applications of game networks. In principle, conditionally independent sub-tasks
may be allocated to conditionally independent sub-agents, as conditional EU independence
implies that any optimal policy can be reproduced by the joint behavior of such sub-agents.
Yet, before one can decentralize its execution, one first needs to find an optimal policy, and
the task may turn out to be very cumbersome from a computational point of view.
Alternatively, one can distribute the decision problem among the sub-agents first, and
then let the system converge to some global policy in a decentralized fashion. On the other
hand, if we do so, we have no a priori guarantee that the system will indeed converge to
an optimal policy. In section 6 we study the convergence properties of such decentralized
systems in the special case of simultaneous G nets.
4.2 Setup
Let X be a finite set of states, and let A be the Boolean algebra of all the subsets of X.

We assume that the agent’s preferences admit a von Neumann-Morgenstern expected
utility representation (p, u), where p is a probability measure defined on A and u is a
function associating to each state x a positive real number u(x).

We extend the utility function¹² from states to general events by defining

$$u(E) = \sum_{x \in X} u(x)\,p(x \mid E).$$

¹² In the foregoing treatment we depart from the notational convention established in the previous chapters, and denote by u the utility function prior to normalization.
Let $v(E) = \frac{u(E)}{u(X)}\,p(E)$; we say that v is the value of event E. Notice that v is a
probability measure, since it is an additive set function and 0 ≤ v(E) ≤ 1. Moreover, for
any nonempty event E whose probability is nonzero,

$$\frac{u(E)}{u(X)} = \frac{v(E)}{p(E)}.$$
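As a quick numerical sanity check, the additivity of the value measure can be verified on a toy example; the three-state space, probabilities, and utilities below are illustrative, not from the text.

```python
# Sketch (illustrative numbers): checking that v(E) = (u(E)/u(X)) p(E)
# is an additive probability measure on a toy three-state space.
X = ["a", "b", "c"]
p = {"a": 0.5, "b": 0.3, "c": 0.2}        # probability measure on states
u_state = {"a": 2.0, "b": 1.0, "c": 4.0}  # positive state utilities

def u(E):
    """Utility of an event: u(E) = sum_x u(x) p(x|E)."""
    pE = sum(p[x] for x in E)
    return sum(u_state[x] * p[x] / pE for x in E)

def v(E):
    """Value of an event: v(E) = (u(E) / u(X)) p(E)."""
    pE = sum(p[x] for x in E)
    return u(E) / u(X) * pE

# v(X) = 1 and v is additive over disjoint events.
assert abs(v(X) - 1.0) < 1e-12
assert abs(v(["a", "b"]) - (v(["a"]) + v(["b"]))) < 1e-12
```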
4.3 Existence of equilibrium in game networks
In this section we present an existence result which only holds for simultaneous games.
Yet, the argument we use can be extended to general game networks, although we shall
concentrate on simultaneous games to keep the notation simple.
Let X = (X_1, ..., X_n) be a matrix of pure strategies, and u_i : X → R_+ be the payoff
matrix for player i (i = 1, ..., n). Let ∆ be the set of all product measures p on X, and let
v : ∆ → ∆ be the function defined by v(p) = ×_i v_i(p), where

$$v_i(p)(x_i) := \frac{\sum_{x_{-i}} p(x_i, x_{-i})\,u_i(x_i, x_{-i})}{\sum_{x} p(x)\,u_i(x)} = p_i(x_i)\,\frac{u_i(p)(x_i)}{u_i(p)(X)}.$$

Then v is a continuous self-map on a convex and compact subset of R^n, and
hence it has a fixed point by Brouwer’s theorem. A fixed point is characterized by the set
of equalities (p_i = v_i(p))_{i∈N}. Let F be the set of such fixed points.
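The map v can be written down directly for a two-player game; the payoff matrices below are illustrative (any strictly positive payoffs work), and `v_map` is a hypothetical helper name.

```python
import numpy as np

# Sketch (illustrative game): the map v(p) for two players with positive
# payoff matrices U1, U2 (rows: player 1's strategies, cols: player 2's).
U1 = np.array([[3.0, 1.0], [2.0, 2.0]])
U2 = np.array([[1.0, 3.0], [2.0, 1.0]])

def v_map(p1, p2):
    """v_i(p)(x_i) = p_i(x_i) * u_i(p)(x_i) / u_i(p)(X)."""
    eu1 = U1 @ p2      # u_1(p)(x_1): expected payoff of each row
    eu2 = U2.T @ p1    # u_2(p)(x_2): expected payoff of each column
    w1 = p1 * eu1      # numerator: sum_{x_-i} p(x_i, x_-i) u_i(x_i, x_-i)
    w2 = p2 * eu2
    return w1 / w1.sum(), w2 / w2.sum()

p1 = p2 = np.array([0.5, 0.5])
q1, q2 = v_map(p1, p2)
# v(p) stays in the product of simplices, as required for Brouwer's theorem.
assert abs(q1.sum() - 1.0) < 1e-12 and (q1 >= 0).all()
```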
A Nash equilibrium is defined as a probability measure p ∈ ∆ such that

$$\sum_{x_i} p_i(x_i)\,u_i(p)(x_i) \;\ge\; \sum_{x_i} q_i(x_i)\,u_i(p)(x_i)$$

for all q_i ∈ ∆_i and for all i ∈ N. Clearly, the set of Nash equilibria E is contained in
F, since equilibrium probabilities trivially satisfy the condition p_i = v_i(p) for all i. This is
an immediate consequence of the following result.
Proposition 10. The following two statements are equivalent:

1. p is a Nash equilibrium;
2. for all i in N and for all x_i ∈ X_i, either p_i(x_i) > 0 and u_i(p)(x_i)/u_i(p)(X) = 1, or u_i(p)(x_i)/u_i(p)(X) ≤ 1.
Proof. Clearly, if (2) holds, there exists no i ∈ N and q_i ∈ ∆_i such that
$\sum_{x_i} q_i(x_i)\,u_i(p)(x_i) > \sum_{x_i} p_i(x_i)\,u_i(p)(x_i)$, so we just need to prove that (1) implies (2). If p is a Nash equilib-
rium, then the expected utility of any two strategies played with positive probability by
player i ∈ N must be the same; otherwise player i can always obtain a higher expected
utility by relocating probability mass from the less profitable to the more profitable strat-
egy. Hence, $u_i(p)(X) = \sum_{x_i} p_i(x_i)\,u_i(p)(x_i) = u_i(p)(x_i)$, which implies that u_i(p)(x_i)/u_i(p)(X) = 1
for all x_i such that p_i(x_i) > 0. Moreover, it must be the case that u_i(p)(x_i) ≤ u_i(p)(X)
for any strategy x_i such that p_i(x_i) = 0; otherwise player i could obtain a higher expected
utility by giving probability 1 to x_i.
How do we know that the set of Nash equilibria is nonempty? Of course we can
appeal to Nash’s theorem, but we can also prove it directly. Since the direct proof motivates
the convergence method we define later on, we present it here.
Let f_ε = ×_i f_i, where f_i = εz_i + (1 − ε)v_i and z_i is the uniform distribution on player
i’s strategies. Brouwer’s theorem guarantees that the set of fixed points of f_ε is not empty.
Fixed points of f_ε have an important property:

Proposition 11. If p is a fixed point of f_ε, then u_i(p)(x_i) > u_i(p)(x'_i) implies p_i(x_i) > p_i(x'_i).

Proof. To prove the claim, it suffices to observe that: p_i(x_i) = ε/k + (1 − ε)v_i(p)(x_i),
where k is the number of pure strategies for player i; v_i(p)(x_i) > 0, and hence p_i(x_i) > ε/k; and

$$\frac{u_i(p)(x_i)}{u_i(p)(x'_i)} = \frac{p_i(x'_i)\,(p_i(x_i) - \varepsilon/k)}{p_i(x_i)\,(p_i(x'_i) - \varepsilon/k)} > 1$$

trivially implies p_i(x_i) > p_i(x'_i).
We define a robust equilibrium as a limit point of a sequence of fixed points of f_ε, as
ε goes to zero. By compactness of the strategy space any such sequence has a limit point,
and hence the set of robust equilibria is nonempty in ∆.
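A small sketch of the perturbed map in action: the dominance-solvable game and the value of ε below are illustrative (for a fixed ε the plain iteration need not converge in every game; here it does, and every application of f_ε keeps each strategy's probability above ε/k, as in the proof of Proposition 11).

```python
import numpy as np

# Illustrative dominance-solvable game: row 1 dominant for player 1,
# column 1 dominant for player 2. Payoffs are strictly positive.
U1 = np.array([[3.0, 3.0], [1.0, 1.0]])
U2 = np.array([[2.0, 1.0], [2.0, 1.0]])
eps = 1e-3

def f_eps(p1, p2):
    """One application of f_eps = eps * uniform + (1 - eps) * v."""
    z = np.full(2, 0.5)
    eu1, eu2 = U1 @ p2, U2.T @ p1
    v1 = p1 * eu1 / (p1 @ eu1)
    v2 = p2 * eu2 / (p2 @ eu2)
    return eps * z + (1 - eps) * v1, eps * z + (1 - eps) * v2

p1 = p2 = np.array([0.5, 0.5])
for _ in range(5000):
    p1, p2 = f_eps(p1, p2)

# Mass concentrates on the dominant strategies, but every strategy keeps
# probability above eps/k thanks to the uniform perturbation.
assert p1[0] > 0.9 and p2[0] > 0.9
assert (p1 > eps / 2).all() and (p2 > eps / 2).all()
```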
Since a robust equilibrium always exists, the following result ensures that the set of
Nash equilibria is not empty.
Proposition 12. Any robust equilibrium is a Nash equilibrium.

Proof. Suppose p is a robust equilibrium. Then v_i(p)(x_i) = p_i(x_i) for all i and x_i, and
therefore u_i(p)(x_i)/u_i(p)(X) = 1 for all x_i which get positive probability. Suppose now that x'_i has
zero probability in equilibrium, but u_i(p)(x'_i) > u_i(p)(x_i) for some x_i which is played with
positive probability. Then, by continuity, u_i(x'_i) > u_i(x_i) also for the perturbed problem
when ε is small, and hence in the limit p_i(x'_i) ≥ p_i(x_i), which contradicts our assump-
tion that p_i(x'_i) is equal to zero. Therefore, a robust equilibrium satisfies condition (2) in
Proposition 10, which proves the claim.
Robust equilibria are similar to proper equilibria (Myerson 1978), in that they are
limits of sequences of strictly positive measures{pn} where strategies with higher utilities
always have higher probabilities. We conjecture that robustness is enough to guarantee that
weakly dominated strategies are not played in equilibrium.
The following result is also worthy of note.
Proposition 13. Any fixed point of f which is the limit of a sequence of strictly positive
probability measures {p^n} such that $(v_i(p^n) - p^n_i)(p^{n+1}_i - p^n_i) \ge 0$ for all i and n is a
Nash equilibrium.
Proof. To see why this holds, suppose that, under the limit measure p, the probability of x_i
is zero but u_i(p)(x_i)/u_i(p)(X) > 1. Then, for large n, u_i(p^n)(x_i)/u_i(p^n)(X) > 1 by continuity, i.e. v_i(p^n) − p^n_i > 0.
But then p^{n+1}_i ≥ p^n_i, and therefore p^n_i cannot go to zero. Hence, the normalized utility
of any strategy which is played with zero probability is less than or equal to one, and the
normalized utility of any strategy which is played with positive probability is equal to 1
because p is a fixed point of f. But then p is a Nash equilibrium by Proposition 10.
4.4 Global convergence to equilibrium in simultaneous games
Let (p, u) be the individual expected utilities in a finite, simultaneous game. We saw that
the equilibria of this game are probability vectors p = (p_1, ..., p_n) which satisfy F(p) = 0,
where F(p) is the vector

$$F(p) = \big(\, u_i(p)(X)\,[\,p_i - v_i(p)\,] \,\big)_{i \in N}, \quad \text{and}$$

$$v_i(p)(x_i) = \frac{\sum_{x_{-i}} p_i(x_i)\,p_{-i}(x_{-i})\,u_i(x_i, x_{-i})}{\sum_{x} p(x)\,u_i(x)} = p_i(x_i)\,\frac{u_i(p)(x_i)}{u_i(p)(X)}.$$
Note that both the numerator and the denominator of v_i(p) are polynomial functions
of p, and hence ( u_i(p)(X)[p_i − v_i(p)] )_{i∈N} is a vector of polynomial functions, whose zeros
include all the Nash equilibria.
We study convergence to a zero of the vectorF under the assumption that the prob-
ability of a strategy increases or decreases in proportion to its relative utility with respect
to the other available strategies. For now, we shall ignore the fact that some fixed points
may fail to be equilibria; in fact, we show below that the method we are presenting will
converge to a Nash equilibrium in generic games.
Consider the perturbed problem F_ε = u(p)(X)[p − f_ε(p)]. Observe that F_ε can be
rewritten as εF^0(p) + (1 − ε)F(p), where F(p) is the target system whose zeros we want
to find and F^0(p) is the trivial system u(p)(X)[p − 1/k], whose unique solution is p = 1/k.

Then F_ε defines a convex-linear homotopy h(p, t) = F_{1−t} (Morgan 1987, p. 135) with
parameter t ∈ [0, 1]. Note that h coincides with the trivial system for t = 0, and with the
target system for t = 1.
In our setting, h is extremely well behaved: it satisfies conditions 1, 2, 3 and 4b in
(Morgan 1987, p. 122) by construction, and moreover it satisfies condition 5 (in R^n) because
F has no solutions at infinity and has at least one real root (since a Nash equilibrium exists).
Therefore, the homotopy continuation method discussed in (Morgan 1987) is guaranteed
to converge for generic games. The end point of the homotopy path is a robust equilibrium,
as it is the limit, asε goes to zero, of a sequence of solutions for the perturbed problem.
To handle degenerate cases, in which the uniform distribution is a bad choice of initial
condition, it suffices to introduce a slight random perturbation to the initial distribution or
to the game payoffs to guarantee convergence.
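The continuation idea can be sketched crudely by following the fixed points of f_ε as ε shrinks from 1 toward 0 (this is not the predictor-corrector path tracking of Morgan 1987, just an illustration; the coordination game and all numbers below are illustrative assumptions).

```python
import numpy as np

# Illustrative coordination game with common payoffs: two pure equilibria
# plus a mixed one; the path from the uniform start singles out one of them.
U1 = np.array([[3.0, 1.0], [1.0, 2.0]])
U2 = U1.copy()

def f_eps(p1, p2, eps):
    """f_eps = eps * uniform + (1 - eps) * v, applied to both players."""
    z = np.full(2, 0.5)
    eu1, eu2 = U1 @ p2, U2.T @ p1
    v1 = p1 * eu1 / (p1 @ eu1)
    v2 = p2 * eu2 / (p2 @ eu2)
    return eps * z + (1 - eps) * v1, eps * z + (1 - eps) * v2

p1 = p2 = np.full(2, 0.5)
for eps in np.linspace(1.0, 1e-4, 200):  # path parameter t = 1 - eps
    for _ in range(50):                  # settle near the fixed point at this eps
        p1, p2 = f_eps(p1, p2, eps)

# The end point of the path: from the uniform start, both players end up
# on their first strategy (the payoff-dominant equilibrium of this game).
assert p1[0] > 0.95 and p2[0] > 0.95
```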
Now we are in the position of defining a new solution concept (the end point of the
homotopy path), which we name first equilibrium, and claim that:
• any generic simultaneous game has a unique first equilibrium
• the first equilibrium is uniquely determined by the payoff structure of the game
• the first equilibrium of a generic game can be approximated using a simple iterative
procedure
• the first equilibrium is a robust equilibrium of the game.
4.5 Computing all the Nash equilibria
The method we discussed in the previous section only tracks a single robust equilibrium. Yet,
in many cases one wants a complete list of all the Nash equilibria. It turns out that an
adaptation of the previous method can be put to such use.
A Nash equilibrium may not have any homotopy path converging to it in R^n. Yet, the
following result ensures that we can get at them in the complex space C^n. Let F(x) be a
polynomial system, whose zeros we want to find, and let G be the initial system defined by

$$G_i(x) = \alpha_i^{d_i} x_i^{d_i} - \beta_i^{d_i},$$

where d_i is the degree of F_i and α_i and β_i are generic complex constants. Then
G(x) = 0 has d = ×_i d_i solutions. Let h(x, t) be the homotopy defined by

$$h_i(x, t) = (1 - t)\,G_i(x) + t\,F_i(x).$$

Then the following result in (Morgan 1987, p. 124) applies.
Theorem 14. Given F, there are sets of measure zero, A_α and A_β, in C^n such that, if
α ∉ A_α and β ∉ A_β, then:

1. the solution set {(x, t) ∈ C^n × [0, 1) : h(x, t) = 0} is a collection of d non-overlapping (smooth) paths;
2. the paths move from t = 0 to t = 1 without backtracking in t;
3. each geometrically isolated solution of F = 0 of multiplicity m has exactly m continuation paths converging to it;
4. a continuation path can diverge to infinity only as t → 1;
5. if F = 0 has no solutions at infinity, all the paths remain bounded. If F = 0 has a solution at infinity, at least one path will diverge to infinity as t → 1. Each geometrically isolated solution at infinity of F = 0 of multiplicity m will generate exactly m diverging continuation paths.
Observe that this method will identify all the zeros of F, and in particular those which
are not Nash equilibria. Yet, discarding the zeros which are not equilibria is relatively
straightforward (one just needs to check if there are any profitable deviations).
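The final filtering step can be sketched as a simple deviation check based on Proposition 10; the helper name `is_nash` and the example game are illustrative, not from the text.

```python
import numpy as np

# Sketch: keep a candidate zero p of F only if no player has a profitable
# pure-strategy deviation (which suffices, since mixed deviations are
# convex combinations of pure ones).
def is_nash(p_list, payoff_fns, tol=1e-9):
    """p_list: mixed strategies; payoff_fns[i](p_list) -> player i's
    expected payoff for each of her pure strategies."""
    for i, p_i in enumerate(p_list):
        eu = payoff_fns[i](p_list)
        if eu.max() > p_i @ eu + tol:  # a pure deviation beats p_i
            return False
    return True

# Illustrative 2x2 game: player 1 wants to match, player 2 to mismatch.
U1 = np.array([[2.0, 1.0], [1.0, 2.0]])
U2 = np.array([[1.0, 2.0], [2.0, 1.0]])
fns = [lambda ps: U1 @ ps[1], lambda ps: U2.T @ ps[0]]

assert is_nash([np.array([0.5, 0.5]), np.array([0.5, 0.5])], fns)
assert not is_nash([np.array([1.0, 0.0]), np.array([1.0, 0.0])], fns)
```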
4.6 Convergence to an optimal policy in single-agent game networks
To conclude our discussion on equilibrium convergence in game networks we would like to
concentrate on a related issue, which becomes significant in the context of some potential
applications of game networks.
Suppose that a large single-agent decision problem is modeled as a G net, and that the
network topology identifies several conditionally EU independent sub-tasks. In principle,
these sub-tasks may be allocated to conditionally independent sub-agents, as conditional
EU independence implies that any optimal policy can be reproduced by the joint behavior
of such sub-agents. Yet, before one can decentralize its execution, one first needs to find
an optimal policy, and the task may turn out to be very cumbersome from a computational
point of view.
Alternatively, one can distribute the decision problem among the sub-agents first, and
then let the system converge to some global policy in a decentralized fashion. On the other
hand, if we do so, we have no a priori guarantee that the system will indeed converge to an
optimal policy. In this section, we study the convergence properties of such decentralized
systems in the special case of simultaneous G nets, leaving the general case to future work.
Let X = (X_i, X_{-i}) be a matrix of pure strategies for player i, and for a non-strategic
opponent −i (Nature), whose policy p_{-i} is fixed. Also, let u_i(X) be the payoff matrix for
player i. We now define the following iterative method. We start with a uniform prior p^0_i
on X_i, and compute a vector of values for player i as follows:

$$v^0_i(x_i) = \frac{\sum_{x_{-i}} p^0_i(x_i)\,p_{-i}(x_{-i})\,u_i(x_i, x_{-i})}{\sum_{x} p^0_i(x_i)\,p_{-i}(x_{-i})\,u_i(x)}.$$

Next, we define a new probability measure

$$p^1_i(x_i) = v^0_i(x_i),$$

and use it to calculate new vectors v^1_i.

The procedure is formally defined as follows, for k ≥ 1:

$$p^0_i = \text{uniform},$$
$$v^k_i(x_i) = \frac{\sum_{x_{-i}} p^{k-1}(x_i, x_{-i})\,u_i(x_i, x_{-i})}{\sum_{x} p^{k-1}(x)\,u_i(x)},$$
$$p^k(x) = v^k_i(x_i) \times p_{-i}(x_{-i}).$$
Essentially, the procedure increases or decreases the probability of each strategy in
proportion to its relative expected utility. The following theorem says that in single-agent
decision problems the method always picks an optimal strategy.
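The procedure admits a compact sketch in the single-agent case; the payoff matrix and Nature's fixed policy below are illustrative assumptions.

```python
import numpy as np

# Sketch of the iterative procedure against a fixed Nature policy:
# probabilities are reweighted by relative expected utility at each step.
u_i = np.array([[4.0, 1.0],   # payoffs u_i(x_i, x_-i); rows: own strategies
                [2.0, 3.0],
                [1.0, 1.0]])
p_nature = np.array([0.5, 0.5])        # fixed policy p_-i of Nature

u_bar = u_i @ p_nature                 # u(x_i) = sum_{x_-i} p_-i(x_-i) u_i(x)
p = np.full(3, 1/3)                    # p^0: uniform prior
for _ in range(200):
    p = p * u_bar / (p @ u_bar)        # p^k(x_i) = v^k_i(x_i)

# Here u_bar = (2.5, 2.5, 1.0): the limit policy concentrates on the
# expected-utility maximizers, splitting mass between the two ties.
assert p[2] < 1e-6 and abs(p[0] + p[1] - 1.0) < 1e-6
```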
Theorem 15. The iterative procedure always converges (in L^1) to an optimal policy.

Proof. Let f^n = v^n_i, and let $u(x_i) = \sum_{x_{-i}} p_{-i}(x_{-i})\,u_i(x)$. L^p is a complete metric space,
and hence all Cauchy sequences converge. We first show that {f^n} is indeed a Cauchy
sequence in L^1. In our single-agent case, f^n is defined by

$$f^0 = \text{uniform}, \qquad f^1 = \frac{f^0 u}{\sum f^0 u},$$

$$f^{k+1} = \frac{f^k u}{\sum f^k u} = \frac{\dfrac{f^{k-1}u}{\sum f^{k-1}u}\,u}{\sum \dfrac{f^{k-1}u}{\sum f^{k-1}u}\,u} = \cdots = \frac{f^0 u^{k+1}}{\sum f^0 u^{k+1}} = \frac{u^{k+1}}{\sum u^{k+1}}.$$

Hence, we need to show that for any ε there exists an N such that

$$\left\| \frac{u^n}{\sum u^n} - \frac{u^m}{\sum u^m} \right\|_1 < \varepsilon$$

for all n, m > N.

Notice that, if u is constant, then f^n = f^0 for all n and we are done. Hence, we shall
assume that u is not constant. First note that

$$\left\| \frac{u^n}{\sum u^n} - \frac{u^m}{\sum u^m} \right\|_1 = \left\| \frac{u^N}{\bar u^N} \left( \frac{u^k \bar u^N}{\sum u^n} - \frac{u^h \bar u^N}{\sum u^m} \right) \right\|_1,$$

where $\bar u$ is the maximum value of u and k = n − N, h = m − N.

By the Schwarz inequality, the above is less than or equal to

$$\left\| \frac{u^N}{\bar u^N} \right\|_2 \left\| \frac{u^k \bar u^N}{\sum u^n} - \frac{u^h \bar u^N}{\sum u^m} \right\|_2,$$

which in turn is less than or equal to

$$\left\| \frac{u^N}{\bar u^N} \right\|_2 \left( \left\| \frac{u^k \bar u^N}{\sum u^n} \right\|_2 + \left\| \frac{u^h \bar u^N}{\sum u^m} \right\|_2 \right)$$

by the Minkowski inequality.

Since u ≤ $\bar u$, we also have that

$$\frac{u^k \bar u^N}{\sum u^n} \le \frac{\bar u^n}{\sum u^n} \le 1,$$

since the sum in the denominator includes a state at which u attains its maximum; the same
holds for $u^h \bar u^N / \sum u^m$. Therefore, the above expression is less than or equal to
$2\left\| u^N / \bar u^N \right\|_2$.

Furthermore, we have that

$$\left\| \frac{u^N}{\bar u^N} \right\|_2 = \left( \frac{1}{s} \sum_{1}^{s} \left( \frac{u}{\bar u} \right)^{2N} \right)^{\frac{1}{2}},$$

which goes to zero as N gets large (unless u is a constant function, but we already
ruled out that case).

Finally, notice that p^{n+1}_i > (<) p^n_i if and only if v^{n+1}_i > (<) v^n_i. But then the limit
policy p*_i is optimal by the argument used in the proof of Proposition 13.
It should be clear that, in the presence of EU independencies, the computation of
the current values only requires “local” information. Therefore, convergence to an optimal
policy can take place even if EU independent sub-tasks are allocated to independent sub-
agents.
Chapter 5. Application: auctions
5.1 An independent-value, second-price auction
In this section, we propose a concrete application of the G net machinery in the context
of an economic example: a second-price (“Vickrey”) auction. Our goal here is to show
how to use G nets to formally represent an economic mechanism – in our case, an auction
– and reason about it. We shall not try to establish new results: rather, our aim is to
show how the informal representations and reasoning typical of auction theory can be made
completely precise thanks to the G net machinery. Formal specification is a rich subject area
in the AI literature, while it is not significantly represented in the economic literature. We
conjecture that electronic commerce applications will motivate the investigation of formal
specification methods also in the context of economic theory.
In a second-price auction the highest bidder gets the auctioned good, but only pays
the second-highest bid. We assume that there are only two bidders¹³, and that the values of
the good to each bidder correspond to the realizations of independent random variables.

Agent 1 privately observes her own value for the good (denoted by V), and then
decides how much to bid (B). Independently, the value of the good for agent 2 (S) is
realized, and contingent on that he decides how much to bid (C). The two bids jointly
determine the final allocation (A), which is a pair a = (g, m) denoting who gets the good
(g = 1, 2) and how much must be paid for it (m).

¹³ This assumption is made for simplicity only: the same methodology and results apply if there are more than two agents.
Figure 4. An independent-value, second-price auction, from the point of view of agent 1.
To remove potential confusion, we emphasize that we only model the game from the
point of view of a single agent, and solve it as an individual decision problem. Yet, in a
second price auction, as well as in other dominance-solvable games, this also suffices to
identify the unique equilibrium.
Figure 4 represents the auction from the point of view of agent 1. The probability
layer is represented as a Bayes network, to emphasize the causal structure of the events in
the game. Once again, we remark that the probability potentials q (and the corresponding
Markov representation) can be readily derived from the conditional probability tables, in
which case the resulting G net looks like the one depicted in Figure 1. Here we omit this
extra step, as we shall not need it in the context of the example (since we will simplify the
expected utility function directly, rather than appealing to Theorem 7). Also, we omit all
the utility potentials w which are identically equal to 1 (corresponding to payoff-irrelevant
variables).
The probability of ending up with a particular allocation (g, m) depends on b and c,
and is given by¹⁴

$$p(a \mid b, c) = \begin{cases} 1 & \text{if } b \ge c,\; c = m,\; g = 1 \\ 1 & \text{if } b < c,\; b = m,\; g = 2 \\ 0 & \text{otherwise.} \end{cases}$$
For the purpose of exposition, we also postulate a specific (multiplicatively separable)
functional form for agent 1’s preferences. We assume that the following condition holds:

$$w(a \mid v) = \begin{cases} \dfrac{1+v}{1+m} & \text{if } g = 1 \\ 1 & \text{otherwise.} \end{cases}$$
Note that agent 1’s preferences on different allocations depend on her realized value
for the good. We also assume that the distribution of agent 2’s bids, from the point of view
of agent 1, has full support.

Agent 1 chooses her bid in order to maximize utility, given her private value for the
good. The expected utility of b conditional on v is given by:
$$u(b \mid v) = \int u(a, b, c, s \mid v)\,p(da, dc, ds \mid b, v)$$
$$= \frac{u(x_0)}{u(v)} \int w(a, b, c, s, v)\,p(da \mid b, c)\,p(dc, ds)$$
$$= \frac{u(x_0)\,w(v \mid a_0)}{u(v)} \left( \int_{-\infty}^{b} \frac{1+v}{1+c}\,p(dc) + \int_{b}^{\infty} p(dc) \right).$$

¹⁴ For definiteness, we assume that in the case of identical bids agent 1 gets the good.
The first-order condition for optimality is given by

$$\left. \frac{1+v}{1+c}\,p(c) \right|_{c=b^*} = \left. p(c) \right|_{c=b^*}$$

and returns b* = v.
Hence, regardless of what her opponent is going to bid (as long as the distribution
has full support), the optimal strategy for agent 1 is to bid her true evaluation: i.e., to bid
exactly the amount of money which keeps her indifferent between getting the good (and
paying for it) or not.
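The first-order condition can also be checked numerically; the discretization, the uniform opponent-bid distribution on [0, 10], and the private value v = 4 below are all illustrative assumptions.

```python
import numpy as np

# Monte Carlo sketch: expected utility of bid b against a full-support
# distribution of the opponent's bid c, using the payoff w = (1+v)/(1+m)
# on a win (the winner pays m = c) and 1 otherwise.
rng = np.random.default_rng(0)
c = rng.uniform(0.0, 10.0, size=200_000)  # opponent bids, full support
v = 4.0                                    # agent 1's private value

def expected_utility(b):
    win = b >= c                           # ties go to agent 1 (footnote 14)
    return np.mean(np.where(win, (1.0 + v) / (1.0 + c), 1.0))

bids = np.linspace(0.0, 10.0, 101)
best = bids[np.argmax([expected_utility(b) for b in bids])]
# The optimum sits at b* = v, up to grid and sampling noise.
assert abs(best - v) < 0.2
```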
5.2 The Generalized Vickrey auction
5.2.1 Introduction
In this section, we show how G nets can be used to reduce the computational complexity
of the generalized Vickrey auction (MacKie-Mason and Varian 1994). To do that we shall
not need the full power of G nets; in particular, we only need information about the agents’
utilities, and not their subjective probabilities. Therefore, we shall only use the utility layer
of G nets (henceforth, U nets).
5.2.2 Setup
As is customary in auction theory, we assume that the agents’ preferences are represented
by quasi-linear utility functions

$$U_i(x, m) = U_i(x) + m.$$
We say that U_i(x) is the agent’s evaluation (in monetary units) of outcome x.
For our purposes, a generalized Vickrey auction is a simultaneous auction which im-
plements the Groves-Clarke mechanism. Running a generalized Vickrey auction involves
three steps:
• collect the agents’ preferences
• identify an efficient allocation
• determine the corresponding monetary transfers.
The first step is to collect all the individual preferences, in the presumption that the
auction mechanism causes the agents to truthfully reveal them (this presumption is essen-
tially justified in the case of the GVA). Even this step can be quite cumbersome, if the
structure of preferences is not known in advance and the space of possible allocations is
large.
Ideally, the agents should submit a complete payoff table representing their evaluations
of the different outcomes. Yet, the space of possible allocations is often far too large
to even write down such a table.
Fortunately, if preferences are well-behaved (i.e., in the presence of u-independencies),
their U net representation can be much more parsimonious than a payoff table. Hence, col-
lecting the agents’ preferences in the form of U nets achieves the double goal of obtaining
the data in a compact and convenient form, without having to impose predefined, arbitrary
restrictions on the structure of the individual utilities. The agents themselves, when they
submit the U net representing their preferences, also identify the relevant independencies,
and those may in turn be exploited by the auctioneer in order to decrease the complexity of
finding an efficient allocation.
In the next subsection, we illustrate the computational advantages of this method in
the context of a stylized example.
To make them suitable for a U net representation, utilities should first be rescaled to
be strictly positive. Let u_i(x, m) = u_i(x)e^m, where u_i(x) = e^{U_i(x)}. Clearly, u_i represents
the same preferences as U_i, since it is obtained by a monotone increasing transformation
of U_i (remember: these are just utility functions, and not expected utilities). Note that
the u_i are strictly positive, and (multiplicatively) separable in x and m. The latter property
implies that m is u-independent of everything else (no income effects).
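A one-line numerical check that the rescaling preserves the preference order; the outcomes and evaluation values below are illustrative.

```python
import math

# Sketch: u_i(x, m) = e^{U_i(x)} e^m = e^{U_i(x) + m} is strictly increasing
# in U_i(x) + m, so it ranks (outcome, money) pairs identically.
U = lambda x: {"beach": 2.0, "fishery": -1.0}[x]   # illustrative evaluations
u = lambda x, m: math.exp(U(x)) * math.exp(m)

pairs = [("beach", -3.0), ("fishery", 1.0), ("beach", 0.0)]
r_quasilinear = sorted(pairs, key=lambda pr: U(pr[0]) + pr[1])
r_rescaled = sorted(pairs, key=lambda pr: u(*pr))
assert r_quasilinear == r_rescaled   # same preference order
```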
The second problem is to find an efficient allocation; this can be achieved by maximizing
the sum of the evaluations U_i. Equivalently, the problem can be restated as max_{x∈X} u,
where u = ×_i u_i(x), and X is some feasible set.
If we were to maximize a single utility function, we could use the U net representation
to identify u-independencies, and, whenever possible, exploit them in order to reduce the
complexity of utility maximization.
What about the aggregate evaluation u? If we wish to represent it as a U net, we need
to construct the utility potentials

$$w(x_M \mid x_{N-M}) = \frac{u(x_M, x_{N-M})}{u(x^0_M, x_{N-M})},$$

and identify u-independencies. We show how to do this momentarily.
5.2.3 Some results
Suppose that, for all agents, X_A and X_B are u-independent given the set of remaining
variables X_C. This is true if and only if X_C separates X_A from X_B. In that case, we have
that

$$w(x_A \mid x_B, x_C) = \frac{\times_i\, u_i(x_A, x_B, x_C)}{\times_i\, u_i(x^0_A, x_B, x_C)} = \times_i\, w_i(x_A \mid x_C),$$

which shows that the same u-independence condition is also satisfied by the aggregate
evaluation. This fact is recorded in the following proposition.
Proposition 16. Preference aggregation preserves unanimous u-independence.
A straightforward consequence of this result¹⁵ is that, if we take the individual U nets
and superimpose them, so that in the resulting graph there is an arc between two nodes if
and only if the same arc appears in at least one of the individual U nets, the resulting graph
is a perfect map of the u-independence structure for the aggregate evaluation. To obtain the
aggregate U net, we just need to associate to each node in the resulting graph the product
of the individual potentials w_i for the same node.

¹⁵ Incidentally, we remark that a similar result applies for unanimous p-independence, in the more general context of EUNs. Combining the two results, one obtains that aggregation preserves unanimous conditional EU independence.

Now we can perform the second step, and maximize the aggregate evaluation; clearly,
if the independence structure of the individual U nets is identical (i.e., the agents care exactly
for the same things), then the same independence relations also hold for the aggregate
evaluation, and therefore maximizing the latter is as hard as maximizing any of the individual
utilities. On the other hand, if everybody is concerned about different (but related)
aspects of the allocation, then maximizing the aggregate evaluation can be much harder,
since the resulting U net will approximate a completely connected graph. In general, the
problem of finding a maximal allocation is NP-hard¹⁶, but, as in the case of Bayes networks,
for special configurations the problem is substantially simpler, as the following
proposition shows.
Proposition 17. In the case of polytrees, an efficient allocation can be found in linear
time.

Proof. We first consider the case of a linear polytree of Boolean variables, defined as follows.
Let X_1, ..., X_n be Boolean variables, such that X_1 is u-independent of everything else
given X_2, X_2 is u-independent of everything else given X_1 and X_3, X_k (k = 3, ..., n − 1)
is u-independent of everything else given X_{k−1} and X_{k+1}, and X_n given X_{n−1}. Suppose,
without loss of generality, that w(X_1 | X_2) ≥ 1. Then, whenever X_2 is chosen, independently
of the values of the remaining variables, it is convenient to choose X_1 rather than
−X_1. Similarly, according to whether w(X_1 | −X_2) is greater than 1 or not, it is convenient
to choose X_1 or −X_1 whenever −X_2 is chosen. Let x_1(x_2) be the optimal value of x_1 as a
function of x_2. Note that identifying x_1(x_2) requires two binary comparisons, according to
the magnitudes of w(x_1 | x_2). To decide the optimal value of x_2 as a function of x_3 one needs
two more comparisons, according to the magnitudes of w(x_1(x_2), x_2 | x_3). Let x_2(x_3) be the
optimal decision; iterating this procedure, one then finds the optimal values of x_k as a function
of x_{k+1} for k = 3, ..., n − 1, each step requiring two comparisons. Finally, one can
choose the optimal value of x_n according to whether w(x_1(x_2), x_2(x_3), ..., x_k(x_{k+1}), ..., x_n)
is greater than 1 or not, which requires two more comparisons. In total, that makes 2n comparisons,
which is linear in n. By way of contrast, if there are no u-independencies one
would need up to 2^n comparisons, which is exponential in n. A similar argument applies
to general polytrees; if each variable has up to h distinct values and k neighbors, one finds
that no more than nh^{k+1} comparisons are needed, a quantity which is again linear in n.

¹⁶ A problem is said to be NP-hard if every problem in NP reduces to it in polynomial time; no polynomial-time algorithm for any NP-hard problem is known.
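The backward-induction argument in the proof is, in effect, max-product dynamic programming on a chain; the sketch below (with illustrative pairwise potentials) verifies the linear-time answer against brute force over all 2^n assignments.

```python
from itertools import product

# Sketch of the chain case: Boolean variables x_0..x_{n-1} whose aggregate
# evaluation factors into pairwise potentials w_k(x_k, x_{k+1}).
# A constant number of comparisons per variable suffices (linear in n).
n = 8
w = [{(a, b): 1.0 + ((a ^ b) + k % 2) for a in (0, 1) for b in (0, 1)}
     for k in range(n - 1)]

# Forward pass: msg[k][b] = best product over x_0..x_k given x_{k+1} = b.
msg = [{b: max(w[0][(a, b)] for a in (0, 1)) for b in (0, 1)}]
back = [{b: max((0, 1), key=lambda a: w[0][(a, b)]) for b in (0, 1)}]
for k in range(1, n - 1):
    msg.append({b: max(msg[k - 1][a] * w[k][(a, b)] for a in (0, 1))
                for b in (0, 1)})
    back.append({b: max((0, 1), key=lambda a: msg[k - 1][a] * w[k][(a, b)])
                 for b in (0, 1)})

# Backward pass: recover an optimal assignment.
x = [0] * n
x[n - 1] = max((0, 1), key=lambda b: msg[n - 2][b])
for k in range(n - 2, -1, -1):
    x[k] = back[k][x[k + 1]]

def total(xs):
    out = 1.0
    for k in range(n - 1):
        out *= w[k][(xs[k], xs[k + 1])]
    return out

# The linear-time answer matches brute force over all 2^n assignments.
assert total(x) == max(total(xs) for xs in product((0, 1), repeat=n))
```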
The third step is to figure out the payments, once an efficient allocation has been
picked. The complexity of this step, in the case of the generalized Vickrey auction, is the
same as the complexity of finding an efficient allocation.
In the next section, we illustrate the U net methodology in the context of a stylized
example.
5.2.4 Example: a Coase-tal economy
The coast of a lake is divided into n equally sized estates. Each estate can be either left undeveloped, or used for one of three possible activities: a beach resort, a fishery, or a chemical plant. The profitability of each of these activities depends on the type of activity practiced in the two neighboring estates, but not on the other ones. For instance, the profitability of a beach resort is highest when the two neighboring estates are left undeveloped, is only mildly affected by the proximity of other beach resorts, is more significantly affected by being contiguous to a fishery, and is dramatically upset by being close to a chemical plant.
In a GVA the owner of each estate should express his evaluations on 2^{2n} possible configurations, but in practice – given the local nature of preferences – it suffices to elicit his evaluations only on the 2^6 triplets of activities in his neighborhood. The agents' preferences are submitted in the form of U nets, one for each land owner. The individual U nets are then combined to form the aggregate evaluation, whose U net will have a "ring" structure. Such a structure is not a polytree, but it becomes a polytree (of triplets) if one of the triplets is instantiated. Hence, the complexity of finding an efficient allocation (and the complexity of calculating the corresponding payments, with the choice of constants described above) is linear in n. Considering that in the general case finding an efficient allocation is NP-hard, this constitutes a substantial improvement when n is a large number.
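The instantiation idea can be sketched as follows (an illustration, not part of the original text): fixing the activities of the first two estates cuts the ring into a chain, which is then solved by elimination. Here `profit` is a hypothetical table giving each owner's profit as a function of the triplet of activities in his neighborhood, and `ring_optimum` returns only the optimal aggregate value, to keep the sketch short.

```python
from itertools import product

ACTIVITIES = (0, 1, 2, 3)  # undeveloped, beach resort, fishery, chemical plant

def ring_optimum(profit, n):
    """profit[i][(left, own, right)] = profit of estate i on a ring of n >= 3
    estates. Fixing (a_0, a_1) turns the ring into a chain, solved by
    elimination; for a fixed activity set the total work is linear in n."""
    best = float("-inf")
    for a0, a1 in product(ACTIVITIES, repeat=2):
        # value[(prev, cur)]: best total profit of estates 1 .. i-1,
        # given that estate i-1 = prev and estate i = cur
        value = {(a0, a1): 0.0}
        for i in range(2, n):
            new_value = {}
            for (prev2, prev), val in value.items():
                for cur in ACTIVITIES:
                    cand = val + profit[i - 1][(prev2, prev, cur)]
                    if cand > new_value.get((prev, cur), float("-inf")):
                        new_value[(prev, cur)] = cand
            value = new_value
        # close the ring with the profits of estates n-1 and 0
        for (prev, last), val in value.items():
            best = max(best, val + profit[n - 1][(prev, last, a0)]
                             + profit[0][(last, a0, a1)])
    return best
```

A brute-force search over all 4^n configurations agrees with the result, which is what the NP-hard general case would require.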
Chapter 6
A maximum entropy method for expected utilities
6.1 Introduction
The notion of maximum entropy (Shannon 1949, Shannon and Weaver 1949, Jaynes 1983) has recently attracted considerable interest in the context of Bayesian probability theory (Good 1983, Jaynes 1998), as it provides a relatively simple, yet principled way to represent incomplete knowledge about uncertain domains. Not only do maximum entropy methods exhibit deep and interesting connections with seemingly unrelated notions such as minimum description length and default reasoning, but – most importantly – they provide intuitively reasonable answers to many statistical problems, and have been successfully applied to such diverse fields as physics, computer science, and even finance (Cover and Thomas 1991).
One way to think about maximum entropy is in terms of “typical” beliefs; in cases
where the available data are compatible with a large class of probabilistic models, it always
attempts to pick the most reasonable one, on the basis of symmetry considerations.
In economic decision theory, (subjective) probabilities are not regarded as a primitive
notion, but rather they are seen as aspects of the agent’s preferences, together with the sub-
jective utilities. While maximum entropy methods permit the characterization of “typical”
subjective beliefs, there is no corresponding notion of “typicality” for the case of utilities.
The aim of the present chapter is to derive a symmetric notion of maximum entropy for utilities, and investigate its connections with the notion of utility independence (u-independence) which we introduced in the context of EU networks. As it turns out, our maximum entropy method for utilities always returns indifference and u-independence whenever possible. Hence, utility entropy provides an indirect justification for our notion of u-independence, although this is not our main motivation in carrying out this program.
Our primary motivation is to provide a method for the revealed-preference estima-
tion of an unknown preference structure, based on observations on the agent’s behavior.
For instance, a current topic of interest in the theory of marketing is the estimation of the
potential demand for goods or services based on observations coming from large, heteroge-
neous databases of consumer data. Typically, the data are compatible with a broad spectrum
of possible consumer types, and hence the densities of the different types in the population
should first be estimated, and then incorporated in the model to produce an estimate of the
aggregate demand. To reduce the complexity of this task one may want to resort to a rep-
resentative agent formulation, in which the set of observed constraints is used to generate a
“typical” preference structure consistent with the observed data. Maximum entropy meth-
ods characterize “typical” belief structures, based on invariance considerations. Our aim is
to extend those methods in a principled way to the case of expected utilities, and use them
to characterize “typical” preference structures in the face of observed decision behavior
(the representative agent’s revealed preferences).
Our secondary motivation is to develop an algorithmically feasible procedure for coordination in game networks. In economics, work on coordination has usually suffered from the difficulty of capturing the fine psychological aspects underlying the phenomenon of coordination in actual social environments. Language, mutual information, and individual characteristics all contribute substantially to the outcome of the coordination process, and hence – given the impossibility of formally and practically accounting for those parameters – a theory of human coordination would have nearly no empirical content.
Yet, in artificial environments, such as those created by the interaction of computerized agents endowed with individual motivations and information, the normative study of coordination becomes a highly relevant issue. Not only can we choose the linguistic and "psychological" characteristics of our agents by design, but in principle we can also endow them with an effective algorithmic toolbox for coordination which would give them a strategic advantage in team games. For instance, in military applications, several divisions of the same army may face a "coordinated attack" dilemma in a situation in which the lines of communication between different divisions are disrupted. Having a common protocol for coordination (say, a computerized unit for each division, running the same coordination software) may reduce the risk of mis-coordination, and at the same time allow the different divisions to improve their joint performance. More generally, since maximum entropy methods provide a way to generate a unique "representative agent" (i.e., a unique expected utility) in the face of convex observed constraints, in principle one could use them to generate efficient reference points in game networks.
6.2 Setup
Let X be a finite set of states, and let A be the Boolean algebra of all the subsets of X.
We assume that the agent's preferences admit a (von Neumann-Morgenstern) expected utility representation (p, u), where p is a strictly positive probability measure defined on A and u is a function associating to each state x a positive real number u(x).
We extend the utility function from states to general events by defining

$$u(E) = \sum_{x \in X} u(x)\, p(x|E),$$

and normalize it by requiring that u(X) = 1.
Let v(E) = u(E)p(E); we say that v is the value of event E. Hence, under the above normalization v is a probability measure, since it is an additive set function, and 0 ≤ v(E) ≤ 1. Moreover, since p is strictly positive, for any nonempty event E we can write

$$u(E) = \frac{v(E)}{p(E)}.$$

Once again, notice the remarkable structure of utilities: utility is simply the ratio of two probability measures, one representing value and the other belief.
The entropy of a probability measure p is defined as

$$H(p) = -\sum_{x \in X} p(x) \ln p(x).$$
Moreover, if p and q are two probability measures, the relative entropy of q with respect to p is defined as

$$D(q\|p) = \sum_{x \in X} q(x) \ln\left(\frac{q(x)}{p(x)}\right).$$
This quantity is always nonnegative, and is equal to zero if and only if q ≡ p.
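Both definitions are direct to compute. The sketch below (an illustration, not part of the text) implements them over a finite state space represented as a tuple of probabilities; p is assumed strictly positive, as in our setup.

```python
import math

def entropy(p):
    """H(p) = -sum_x p(x) ln p(x), with the convention 0 ln 0 = 0."""
    return -sum(px * math.log(px) for px in p if px > 0)

def relative_entropy(q, p):
    """D(q||p) = sum_x q(x) ln(q(x)/p(x)); p is assumed strictly positive."""
    return sum(qx * math.log(qx / px) for qx, px in zip(q, p) if qx > 0)
```

As stated above, D(q||p) is zero exactly when q = p, and positive otherwise.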
6.3 Utility entropy
The relative entropy of value with respect to probability is given by

$$D(v\|p) = \sum_{x \in X} v(x) \ln\left(\frac{v(x)}{p(x)}\right) = \sum_{x \in X} u(x) \ln(u(x))\, p(x) = E_p[u(x) \ln u(x)]. \qquad (6.4)$$
Notice that minimizing the relative entropy of v relative to p is the same as minimizing the expectation of u(x) ln u(x). This expectation is always non-negative, and is equal to zero if and only if v ≡ p, in which case u(x) = 1 for all x, i.e. there is complete indifference among all states.
Minimizing the relative entropy of value with respect to belief provides us with a simple way of generating "typical" utilities, which we shall call maximum entropy utilities. If there is no information regarding the utilities of different states, then the method returns complete indifference, as one would expect. What happens if we incorporate some information about utilities? It turns out that the method returns the most symmetrical utility function, i.e. it assumes indifference and independence whenever possible.
Let us see what happens in the context of two simple examples.
A first example
Let X = {x1, x2, x3}, and let pi = p(xi), ui = u(xi), etc. We assume that the probabilities are given by (p1, p2, p3) = (1/5, 1/3, 7/15), and that u1 = 2, i.e. v1 = 2p1 = 2/5.
The maximum entropy utility can then be obtained by minimizing

$$E_p(u \ln u) = v_1 \ln\left(\frac{v_1}{p_1}\right) + v_2 \ln\left(\frac{v_2}{p_2}\right) + (1 - v_1 - v_2) \ln\left(\frac{1 - v_1 - v_2}{1 - p_1 - p_2}\right)$$

with respect to v, subject to the constraint v1 = 2p1. The unique solution is given by (v1, v2, v3) = (2/5, 1/4, 7/20).
Finally, one can solve for the corresponding utilities, which yields (u1, u2, u3) = (2, 3/4, 3/4). Hence, in the absence of any information regarding the utilities of x2 and x3, their maximum entropy utility turns out to be symmetric, as one would expect.
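The solution can be checked numerically. After substituting the constraint v1 = 2p1, the objective becomes a convex function of v2 alone, which the sketch below (illustrative, standard library only) minimizes by golden-section search.

```python
import math

# data of the example: p = (1/5, 1/3, 7/15), constraint u1 = 2, so v1 = 2/5
p = (1/5, 1/3, 7/15)
v1 = 2 * p[0]
rest = 1 - v1            # v2 + v3 must equal 3/5

def objective(v2):
    """E_p[u ln u] as a function of v2 alone, after the substitution."""
    v3 = rest - v2
    return (v1 * math.log(v1 / p[0])
            + v2 * math.log(v2 / p[1])
            + v3 * math.log(v3 / p[2]))

# golden-section search, valid because the objective is convex in v2
lo, hi = 1e-9, rest - 1e-9
phi = (math.sqrt(5) - 1) / 2
while hi - lo > 1e-12:
    a, b = hi - phi * (hi - lo), lo + phi * (hi - lo)
    if objective(a) < objective(b):
        hi = b
    else:
        lo = a
v2 = (lo + hi) / 2
v3 = rest - v2
u = (v1 / p[0], v2 / p[1], v3 / p[2])
```

The search returns v2 ≈ 1/4, v3 ≈ 7/20 and u ≈ (2, 3/4, 3/4), matching the closed-form solution above.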
A second example
Suppose that there are two Boolean variables (Health and Wealth), and let (x1, x2, x3, x4) = (HW, H−W, −HW, −H−W) be the corresponding four states.
Suppose that the two variables are probabilistically independent, and that the probabilities of H and W are 0.8 and 0.3 respectively.
Clearly, if there are no further constraints, the maximum entropy utility assigns complete indifference to all states, which trivially implies that H and W are u-independent.
If we further assume that u(H) = 3u(−H) and u(W) = 2u(−W), are the two variables still u-independent under the maximum entropy utility? The answer turns out to be yes.
The two constraints can be rewritten as

$$\frac{v_1 + v_2}{p_1 + p_2} = 3\,\frac{v_3 + v_4}{p_3 + p_4}, \qquad \frac{v_1 + v_3}{p_1 + p_3} = 2\,\frac{v_2 + v_4}{p_2 + p_4},$$

where v4 = 1 − v1 − v2 − v3, and p4 = 1 − p1 − p2 − p3.
Solving for v2 and v3, we obtain

$$v_2 = \frac{3p_1 + 3p_2 - v_1 - 2v_1 p_1 - 2v_1 p_2}{1 + 2p_1 + 2p_2}, \qquad v_3 = \frac{2p_1 + 2p_3 - v_1 - v_1 p_1 - v_1 p_3}{1 + p_1 + p_3}.$$
The probabilities are given by (p1, p2, p3) = (24/100, 56/100, 6/100), and we wish to find the value v which minimizes

$$E_p[u \ln u] = v_1 \ln\frac{v_1}{p_1} + v_2 \ln\frac{v_2}{p_2} + v_3 \ln\frac{v_3}{p_3} + (1 - v_1 - v_2 - v_3) \ln\frac{1 - v_1 - v_2 - v_3}{1 - p_1 - p_2 - p_3}$$

subject to the two constraints.
subject to the two constraints.
The unique solution is given by

$$(v_1, v_2, v_3, v_4) = \left(\frac{72}{169}, \frac{84}{169}, \frac{6}{169}, \frac{7}{169}\right),$$

while the marginals are given by v1 + v2 = 12/13, v3 + v4 = 1/13, v1 + v3 = 6/13, and v2 + v4 = 7/13. This information is summarized in the following table, where the last row and column represent the marginals.

          W        −W
H      72/169   84/169   12/13
−H      6/169    7/169    1/13
        6/13     7/13
One can then calculate the corresponding utilities ui = vi/pi:

$$(u_1, u_2, u_3, u_4) = \left(\frac{300}{169}, \frac{150}{169}, \frac{100}{169}, \frac{50}{169}\right).$$
The table below shows the utilities, in multiples of u4 (an arbitrary reference point).

        W   −W
H       6    3
−H      2    1

Notice that H and W are u-independent, as one would expect. This is always true if the probabilities are independent, but does not hold in general.
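The stated solution can be verified exactly in rational arithmetic. The check below (illustrative) confirms the two constraints and the multiplicative form of the utilities across the two variables.

```python
from fractions import Fraction as F

p = (F(24, 100), F(56, 100), F(6, 100), F(14, 100))
v = (F(72, 169), F(84, 169), F(6, 169), F(7, 169))
u = tuple(vi / pi for vi, pi in zip(v, p))   # u_i = v_i / p_i

# conditional utilities of H, -H, W, -W
uH  = (v[0] + v[1]) / (p[0] + p[1])
unH = (v[2] + v[3]) / (p[2] + p[3])
uW  = (v[0] + v[2]) / (p[0] + p[2])
unW = (v[1] + v[3]) / (p[1] + p[3])
```

Both constraints hold exactly (uH = 3·unH, uW = 2·unW), and u1·u4 = u2·u3, i.e. the utility factors multiplicatively over the two variables, in agreement with the 6-3-2-1 table above.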
6.4 Axiomatic justification of our maximum entropy method for expected utilities
So far, we have defined a maximum entropy method which can be used to derive a unique expected utility function satisfying some (convex) observed constraints. Yet, a useful method should not only be simple but also principled; a natural question is hence whether it is possible to provide a rigorous justification for our maximum entropy method for expected utilities. In this section, we show that the answer is yes: the method can be given a rigorous axiomatic foundation, in terms of a set of appealing invariance properties which we present and discuss below.
Suppose that we are trying to estimate an agent's expected utility function over a set of possible states X, whose subjective probabilities are known. Further, suppose that observation of this agent's behavior returns a set of constraints on its expected utilities, of the form EU(E) ≤ α (in fact, our treatment will allow for a more general class of constraints). For instance, we may have observations on the agent's purchasing behavior, or employment decisions, etc.
Our observations in general will be compatible with an infinite number of possible utility functions; among those, we would like to identify a unique utility function representing – in some sense – the least informative preference structure compatible with the observed data. We do so by imposing a set of axiomatic conditions on the selection rule, based on invariance considerations. Our conditions mirror those introduced by Shore and Johnson (Shore and Johnson 1980) for the selection of posterior probabilities, and can be informally stated as follows.
1. Uniqueness: the result should be unique.
2. Invariance: the choice of coordinate system should not matter.
3. System Independence: it should not matter whether one accounts for independent
information about independent systems together or separately.
4. Subset Independence: it should not matter whether one incorporates information on
mutually disjoint subsets of system states in terms of separate conditional utilities or
in terms of the full system utility.
Let U be the set of normalized expected utility functions on a set X, and let C be the set of all closed and convex (with respect to mixtures) subsets of U. Then any element Î of C can be expressed in terms of a (possibly infinite) set of inequalities of the type

$$\sum_{x \in X} u(x) f(x) \geq 0,$$

and conversely any such set of constraints identifies an element of C. We assume that the information on the agent's preferences always corresponds to a closed and convex constraint.
Without loss of generality, we can state our inference problem as follows: find u such that

$$H(u, p) = \min_{(u', p) \in \hat{I}} H(u', p),$$

where H is a suitable functional, p is given, and Î is a closed and convex constraint.
Equivalently, since u = v/p, the problem can be restated as min_{v ∈ I} H(v, p) for a suitable H, where I is the (closed and convex) constraint on v corresponding to the closed and convex constraint Î. We shall work with this equivalent formulation, as it is more convenient for our treatment. We remark that, if probabilities are known (as we assumed throughout), then the observed constraints – which come in the form of inequalities on expected utilities – can be equivalently restated in terms of value inequalities.
We now introduce an inference operator ◦, which associates to any constraint I and probability measure p a value function v = p ◦ I. This operation can be regarded as the outcome of the above minimization for some functional H.
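The restatement in terms of value inequalities is mechanical: since EU(E) = v(E)/p(E) with p(E) > 0, the observation EU(E) ≤ α is the linear constraint v(E) ≤ α·p(E). A small sketch (illustrative; events are encoded as sets of state indices, a representation not used in the text):

```python
def value_constraint(p, event, alpha):
    """Translate EU(E) <= alpha into the linear value inequality
    v(E) - alpha * p(E) <= 0. Returns a function of the value measure v
    that is nonpositive exactly on the feasible value measures."""
    p_E = sum(p[x] for x in event)
    return lambda v: sum(v[x] for x in event) - alpha * p_E
```

Because each such constraint is linear in v, a set of them cuts out a closed and convex region, as required of the constraint I.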
Our assumptions can hence be formally stated as follows.
Assumption 1 (Uniqueness). p ◦ I is unique for any prior p and new information I ∈ C.
Assumption 1 is quite appealing: it says that the outcome of the inference process
should be unambiguously determined.
Assumption 2 (Invariance). Let Γ be a transformation from x ∈ X to y ∈ X′, with (Γv)(y) := v(Γ^{−1}(y)). Let ΓI be the set of value measures on X′ corresponding to the original constraint I. Then, for any p and I,

$$(\Gamma p) \circ (\Gamma I) = \Gamma(p \circ I).$$
Assumption 2 says that coordinate transformations should not matter: the inference
method should be invariant with respect to relabelings.
Now suppose that the state space is composed of two independent subspaces, i.e. X = X1 × X2, and that pi, vi are probability measures on Xi (i = 1, 2). Moreover, let I1 and I2 be two constraints on v1 and v2 respectively. We then require that the following holds.
Assumption 3 (System Independence). (p1 × p2) ◦ (I1 × I2) = (p1 ◦ I1) × (p2 ◦ I2).
The assumption says that it should not matter if the inference rule on the two inde-
pendent subsystems is applied jointly or separately.
Finally, let {X_k} be a partition of X, and let p_{X_k}, v_{X_k} be the corresponding conditional measures. Let I_k be a family of constraints involving only the conditional values v_{X_k}, let I = ∩_k I_k, and let M be a constraint of the form v(X_k) = m_k, where the m_k are known values. We require that the following holds.
Assumption 4 (Subset Independence). Let v = p ◦ (I ∩ M). Then v_{X_k} = p_{X_k} ◦ I_k.
This last assumption states that it should not matter if new information on the conditional values of mutually exclusive subsets is incorporated jointly or separately.
The following result then holds:
Theorem 18 There exists an inference rule which satisfies all the above assumptions.
Moreover, any such rule is equivalent to the minimization of (6.4).
Proof The result is an immediate corollary of Theorem III and Theorem IV in (Shore and
Johnson 1980) for the discrete case.
Even though our theorem is a straightforward consequence of existing results for the
case of posterior probabilities, we submit that it is not without merit. Surprisingly, the
required assumptions are still quite reasonable if we reinterpret the posterior probability
as value and use the axiomatic derivation to characterize “typical” expected utilities via
our notion of utility entropy. We believe that our maximum entropy method for expected
utilities constitutes a valid and principled way to identify a representative agent from a
large population of types in the face of revealed-preference constraints, and could find
application not only to economic theory, but also to problems in marketing and electronic
commerce.
References
Aumann, R. and Brandenburger, A., 1995, "Epistemic Conditions for Nash Equilibrium", Econometrica 63, n. 5, 1161-1180.
Bacchus, F. and Grove, A., 1995, Graphical models for preference and utility. In Proc. 11th Conference on Uncertainty in Artificial Intelligence, 3-10.
Battigalli, P. and Siniscalchi, M., 1999, "Hierarchies of Conditional Beliefs and Interactive Epistemology", Journal of Economic Theory, forthcoming.
Bolker, E. D., 1967, "A Simultaneous Axiomatization of Utility and Subjective Probability", Philosophy of Science 34, 333-340.
Cover, T. and Thomas, J., 1991, Elements of Information Theory, Wiley.
De Finetti, B., 1937, "La prévision: ses lois logiques, ses sources subjectives", Annales de l'Institut Henri Poincaré, n. 7, 1-68.
Domotor, Z., 1978, "Axiomatization of Jeffrey Utilities", Synthese 39, 165-210.
Doyle, J. and Wellman, M. P., 1995, Defining preferences as Ceteris Paribus Comparatives. In Proc. AAAI Spring Symp. on Qualitative Decision Making, 69-75.
Epstein, L. and Wang, T., 1996, "'Beliefs about Beliefs' without Probabilities", Econometrica 64, n. 4, 1343-1374.
Fagin, R., Halpern, J., Moses, Y., and Vardi, M. Y., 1995, Reasoning about Knowledge. MIT Press, Cambridge.
Fishburn, P. C., 1982, The Foundations of Expected Utility, Reidel, Dordrecht, Holland.
Fudenberg, D. and Tirole, J., 1991, Game Theory, MIT Press, Cambridge.
Good, I. J., 1983, Good Thinking: The Foundations of Probability and Its Applications. University of Minnesota Press.
Halpern, J., 1997, "On ambiguities in the interpretation of game trees", Games and Economic Behavior 20, 66-96.
Harsanyi, J. and Selten, R., 1988, A General Theory of Equilibrium Selection in Games. MIT Press, Cambridge.
Hayes-Roth, B., 1995, Agents on Stage: Advancing the State of the Art in AI. In Proc. 14th International Joint Conference on Artificial Intelligence, 967-971.
Heifetz, A. and Samet, D., 1996, "Topology-Free Typology of Beliefs", ewp-game/9609002.
Jaynes, E. T., 1983, Papers on Probability, Statistics, and Statistical Physics. Kluwer, Boston.
Jaynes, E. T., 1998, Probability Theory: The Logic of Science. Available from the ftp site bayes.wustl.edu, subdirectory Jaynes.book.
Jeffrey, R. C., 1965, The Logic of Decision, University of Chicago Press, Chicago.
Koller, D. and Pfeffer, A., 1997, "Representations and Solutions for Game-Theoretic Problems", Artificial Intelligence 94, n. 1, 167-215.
Kripke, S., 1980, Naming and Necessity. Harvard University Press, Cambridge.
La Mura, P. and Shoham, Y., 1998, Conditional, hierarchical, multi-agent preferences. In Proc. of Theoretical Aspects of Rationality and Knowledge VII, 215-224.
MacKie-Mason, J. K. and Varian, H. R., 1994, "Generalized Vickrey Auctions". Ann Arbor, MI, Dept. of Economics, University of Michigan. Available from http://www-personal.umich.edu/~jmm/papers/gva3.pdf.
McCarthy, J., 1959, Programs with Common Sense. In Mechanization of Thought Processes, Proceedings of the Symposium of the National Physical Laboratory, 77-84.
Mertens, J.-F. and Zamir, S., 1985, "Formulation of Bayesian Analysis for Games with Incomplete Information", International Journal of Game Theory 14, 1-29.
Morgan, A., 1987, Solving Polynomial Systems Using Continuation for Engineering and Scientific Problems. Prentice-Hall.
Myerson, R., 1978, "Refinements of the Nash Equilibrium Concept", International Journal of Game Theory 7, 73-80.
Pareto, V., 1906, Manuel d'Economie Politique.
Pearl, J., 1988, Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann.
Pearl, J. and Paz, A., 1989, Graphoids: A Graph-Based Logic for Reasoning About Relevance Relations. In B. Du Boulay (Ed.), Advances in Artificial Intelligence - II, North-Holland.
Samet, D., 1994, "Hypothetical Knowledge and Games of Perfect Information", working paper, Tel Aviv University.
Savage, L. J., 1954, The Foundations of Statistics, Wiley, New York.
Shannon, C., 1949, "A Mathematical Theory of Communication", Bell System Technical Journal 47, 143-157.
Shannon, C. and Weaver, W., 1949, The Mathematical Theory of Communication, Univ. of Illinois Press.
Shoham, Y., 1997, A Symmetric View of Probabilities and Utilities. In Proc. 13th Conference on Uncertainty in Artificial Intelligence, 429-436.
Shore, J. E. and Johnson, R. W., 1980, Axiomatic derivation of the principle of maximum entropy and the principle of minimum cross-entropy. IEEE Transactions on Information Theory, vol. IT-26, no. 1, 26-37.
Shub, M. and Smale, S., 1983, "Complexity of Bezout's theorem II: volumes and probabilities", Computational Algebraic Geometry 109.
Tan, T. and Werlang, S., 1988, "The Bayesian Foundations of Solution Concepts of Games", Journal of Economic Theory 45, 370-391.
Vickrey, W., 1961, "Counterspeculation, auctions, and competitive sealed tenders", Journal of Finance 16, 8-37.
von Neumann, J. and Morgenstern, O., 1944, Theory of Games and Economic Behavior, Princeton University Press, Princeton, New Jersey.