
Andrew McLennan

Advanced Fixed Point Theory for Economics


Andrew McLennan
University of Queensland
Saint Lucia, QLD
Australia

ISBN 978-981-13-0709-6
ISBN 978-981-13-0710-2 (eBook)
https://doi.org/10.1007/978-981-13-0710-2

Library of Congress Control Number: 2018943718

© Springer Nature Singapore Pte Ltd. 2018

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd.
The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Preface

Over two decades ago now I wrote a rather long survey of the mathematical theory of fixed points entitled Selected Topics in the Theory of Fixed Points. It had no content that could not be found elsewhere in the mathematical literature, but nonetheless some economists found it useful. Almost as long ago, I began work on the project of turning it into a proper book, and finally that project is coming to fruition. Various events over the years have reinforced my belief that the mathematics presented here will continue to influence the development of theoretical economics, and have intensified my regret about not having completed it sooner.

This is a book of mathematics for graduate students and researchers in theoretical economics and related disciplines. It is ambitious, seeking to expand the range of tools and results that are commonly employed in economic research, and it targets ambitious readers who wish to develop or expand the sort of powerful technical toolkit that leading scholars deploy in the most creative and original research. It is suitable for self-study and as a reference.

It also has many exercises, so it can serve as the text of a university course. For the most part, the exercises are not just simple examples or illustrative calculations, but instead ask for proofs of important related results. In particular, many economic applications are covered, with the consequence that this book supports a course that can legitimately be regarded as a course in mathematical economics, and not just some mathematics with economic applications. The amount of material seems about right for two semesters, and there is some flexibility insofar as such a course will not be seriously incomplete if it does not reach Chaps. 16 and 17, and Chap. 3 (after Sect. 3.1) can be deferred without loss of logical continuity.

The most likely audience for such a course would be advanced graduate students, but the book provides secure foundations by developing many topics from the very beginning, so its prerequisites (linear algebra, multivariable differential calculus, real analysis, and a bit of point-set topology) are mild enough that students with adequate mathematical background may approach it at an earlier stage.

A distinctive feature of this book is that we do not require or use any algebraic topology. One reason for this is practical: A large amount of quite abstract material must be absorbed at the beginning before the structure, nature, and goals of algebraic topology begin to come into view. Many researchers in economics learn advanced topics in mathematics as a by-product of their research, picking up a bit of infinite dimensional analysis, or the basics of continuous time stochastic processes, because they need it for some project or because it is used in some piece of research they wish to understand. Algebraic topology is unlikely to gradually achieve popularity among economic theorists through such slow diffusion. At present, economic theorists mostly do not know the subject and do not use it, so they do not need to know it, and they do not learn it. Perhaps, this is a lamentable state of affairs, but it is not something that this book can realistically aspire to change.

The avoidance of algebraic topology can also be seen as a feature rather than a bug. Roughly, homology associates an abelian group to each well-enough behaved topological space, and to each continuous function between such spaces it associates a homomorphism between the groups of the domain and range. These objects obey certain rules: The identity homomorphism is associated with the identity function, and the homomorphism associated with a composition of two continuous functions is the composition of the homomorphisms associated with the two functions. In addition, certain geometric settings give rise to derived algebraic structures. This adds up to a powerful and complex computational machine that allows information about certain spaces and maps to be inferred from other such information.

In general, mathematical understanding is enhanced when brute calculations are replaced by logical reasoning based on conceptually meaningful definitions, so not using the machinery of homology forces us to express everything in more direct and intuitive terms. In addition, that the theory can be developed, without homology, to the level of generality seen herein, is itself a fact of considerable mathematical interest. Admittedly, there is a slight loss of generality, because there are acyclic (that is, homologically trivial) spaces that are not contractible, but for us this is unimportant because such spaces are rarely found “in nature” and have never made an appearance in theoretical economics.

There is a vast literature on fixed points, which has influenced me in many ways and which cannot be described in any useful way here. Even so, I should say something about how the present work stands in relation to some other books on fixed points. Fixed Point Theorems with Applications to Economics and Game Theory by Border (1985) and General Equilibrium Analysis: Existence and Optimality Properties of Equilibria by Florenzano (2003) are complements, not substitutes, explaining various forms of the fixed point principle such as the KKMS theorem and some of the many theorems of Ky Fan, along with the concrete details of how they are actually applied in economic theory. Fixed Point Theory by Dugundji and Granas (2003) is, much more than this book, a comprehensive treatment of the topic. Its fundamental point of view (applications to nonlinear functional analysis), audience (professional mathematicians), and technical base (there is extensive use of algebraic topology) are quite different, but it is still a work with much to offer to economics. Particularly notable is the extensive and meticulous information concerning the literature and history of the subject, which is full of affection for the theory and its creators. Topological Fixed Point Theory of Multivalued Mappings by Górniewicz (2006) surveys a wealth of mathematical research, much of it quite recent, on the fixed point theory of correspondences. The book that was, by far, the most useful to me is The Lefschetz Fixed Point Theorem by Brown (1971). Again, his approach and mine have differences rooted in the nature of our audiences and the overall objectives, but at their cores the two books are quite similar, in large part because I borrowed a great deal.

I would like to thank many people who, over the years, have commented favorably on Selected Topics. It is a particular pleasure to acknowledge some very detailed and generous comments by Klaus Ritzberger, Bill Sandholm, Eran Shmaya, Eilon Solan, and Neil Wallace. This work would not have been possible without the support and affection of my families, both present and past, for which I am forever grateful.

Andrew McLennan
Brisbane, Australia, March 2018


Contents

Part I Overview

1 Introduction and Summary
  1.1 The Key Concept
  1.2 Historical Background
  1.3 Chapter Contents
    1.3.1 Chapter 2: Planes, Polyhedra, and Polytopes
    1.3.2 Chapter 3: Computing Fixed Points
    1.3.3 Chapter 4: Topologies on Sets
    1.3.4 Chapter 5: Topologies on Functions and Correspondences
    1.3.5 Chapter 6: Metric Space Theory
    1.3.6 Chapter 7: Essential Sets of Fixed Points
    1.3.7 Chapter 8: Retracts
    1.3.8 Chapter 9: Approximation
    1.3.9 Chapter 10: Manifolds
    1.3.10 Chapter 11: Sard’s Theorem
    1.3.11 Chapter 12: Degree Theory
    1.3.12 Chapter 13: The Fixed Point Index
    1.3.13 Chapter 14: Topological Consequences
    1.3.14 Chapter 15: Dynamical Systems
    1.3.15 Chapter 16: Extensive Form Games
    1.3.16 Chapter 17: Monotone Equilibria

Part II Combinatoric Geometry

2 Planes, Polyhedra, and Polytopes
  2.1 Affine Subspaces
  2.2 Convex Sets and Cones
  2.3 Polyhedra
  2.4 Polytopes and Polyhedral Cones
  2.5 Polyhedral Complexes
  2.6 Simplicial Approximation
  2.7 Graphs
  Exercises

3 Computing Fixed Points
  3.1 The Axiom of Choice, Subsequences, and Computation
  3.2 Sperner’s Lemma
  3.3 The Scarf Algorithm
  3.4 Primitive Sets
  3.5 The Lemke–Howson Algorithm
  3.6 Implementation and Degeneracy Resolution
  3.7 Using Games to Find Fixed Points
  3.8 Homotopy
  3.9 Remarks on Computation
  Exercises

Part III Topological Methods

4 Topologies on Spaces of Sets
  4.1 Topological Terminology
  4.2 Spaces of Closed and Compact Sets
  4.3 Vietoris’ Theorem
  4.4 Hausdorff Distance
  4.5 Basic Operations on Subsets
    4.5.1 Continuity of Union
    4.5.2 Continuity of Intersection
    4.5.3 Singletons
    4.5.4 Continuity of the Cartesian Product
    4.5.5 The Action of a Function
    4.5.6 The Union of the Elements
  Exercises

5 Topologies on Functions and Correspondences
  5.1 Upper and Lower Hemicontinuity
  5.2 The Strong Upper Topology
  5.3 The Weak Upper Topology
  5.4 The Homotopy Principle
  5.5 Continuous Functions
  Exercises

6 Metric Space Theory
  6.1 Paracompactness
  6.2 Partitions of Unity
  6.3 Topological Vector Spaces
  6.4 Banach and Hilbert Spaces
  6.5 Embedding Theorems
  6.6 Dugundji’s Theorem
  Exercises

7 Essential Sets of Fixed Points
  7.1 The Fan–Glicksberg Theorem
  7.2 Convex Valued Correspondences
  7.3 Convex Combinations of Correspondences
  7.4 Kinoshita’s Theorem
  7.5 Minimal Q-Robust Sets
  Exercises

8 Retracts
  8.1 Kinoshita’s Example
  8.2 Retracts
  8.3 Euclidean Neighborhood Retracts
  8.4 Absolute Neighborhood Retracts
  8.5 Absolute Retracts
  8.6 Domination
  Exercises

9 Approximation of Correspondences by Functions
  9.1 The Approximation Result
  9.2 Technical Lemmas
  9.3 Proofs of the Propositions

Part IV Smooth Methods

10 Differentiable Manifolds
  10.1 Review of Multivariate Calculus
  10.2 Smooth Partitions of Unity
  10.3 Manifolds
  10.4 Smooth Maps
  10.5 Tangent Vectors and Derivatives
  10.6 Submanifolds
  10.7 Tubular Neighborhoods
  10.8 Manifolds with Boundary
  10.9 Classification of Compact 1-Manifolds
  Exercises

11 Sard’s Theorem
  11.1 Sets of Measure Zero
  11.2 A Weak Fubini Theorem
  11.3 Sard’s Theorem
  11.4 Measure Zero Subsets of Manifolds
  11.5 Genericity of Transversality
  Exercises

12 Degree Theory
  12.1 Some Geometry
  12.2 Orientation of a Vector Space
  12.3 Orientation of a Manifold
  12.4 Induced Orientation
  12.5 The Degree
  12.6 Composition and Cartesian Product
  Exercises

13 The Fixed Point Index
  13.1 A Euclidean Index
  13.2 Multiplication and Commutativity
  13.3 Germs
  13.4 Extension to ANR’s
  13.5 Extension to Correspondences
  13.6 Uniqueness
  Exercises

Part V Applications

14 Topological Consequences
  14.1 Euler, Lefschetz, and Eilenberg–Montgomery
  14.2 The Hopf Theorem
  14.3 More on Maps Between Spheres
  14.4 Invariance of Domain
  14.5 Essential Sets Revisited
  Exercises

15 Dynamical Systems
  15.1 Euclidean Dynamical Systems
  15.2 Dynamics on a Manifold
  15.3 Diffeoconvex Bodies
  15.4 Flows on Diffeoconvex Bodies
  15.5 The Vector Field Index
  15.6 Dynamic Stability
  15.7 The Converse Lyapunov Problem
  15.8 A Necessary Condition for Stability
  15.9 The Correspondence and Index +1 Principles
  Exercises

16 Extensive Form Games
  16.1 A Signalling Game
  16.2 Extensive Form Games
  16.3 Sequential Equilibrium
  16.4 Conditional Systems
  16.5 Strong Deformation Retracts
  16.6 Conical Decompositions
  16.7 Abstract Consistency
  16.8 Sequential Equilibrium Reformulated
  16.9 Refinements
  Exercises

17 Monotone Equilibria
  17.1 Monotone Comparative Statics
  17.2 Motivation: A Bit of Auction Theory
  17.3 Semilattices
  17.4 Measure and Integration
  17.5 Partially Ordered Probability Spaces
  17.6 Monotone Functions
  17.7 The Game
  17.8 Best Response Sets
  17.9 A Simplicial Characterization of ANR’s
  17.10 More Simplicial Topology
  17.11 Additional Characterizations of ANR’s
  Exercises

References

Index

Symbols

$\langle v, w \rangle$  Inner product of $v$ and $w$, 2.1
$\|v\|$  Norm $\sqrt{\langle v, v \rangle}$ of $v$, 2.1
$\|f\|_\infty$  Supremum of $\{\|f(x)\| : x \in X\}$, 6.5
$\|L\|$  Operator norm of $L$, 11.1
$\partial C$  Topological boundary of $C$, 12.5
$\partial M$  The boundary of the $\partial$-manifold $M$, 10.8
$f \pitchfork_S P$  $f$ is transversal to $P$ along $S$, 10.8
$\perp(V, W)$  The orthogonal complement of $V$ in $W$, 12.1
$\frac{df}{d\phi}(p)$  $\phi$-derivative of $f$ at $p$, 15.6

$Df(x)$  Derivative of $f$ at $x$, 1.1, 1.3.9, 10.1
$\rho \otimes \pi$  Product of interior probability $\rho$ and conditional system $\pi$, 16.4
$\mathcal{S} \otimes \mathcal{T}$  Product of the $\sigma$-algebras $\mathcal{S}$ and $\mathcal{T}$, 17.4
$A + B$  Minkowski sum of $A$ and $B$, 2.2, 7.3
$a_m$  Antipode map $a_m : S^m \to S^m$, 14.3
$A_i$  Set of actions for $i$, 17.7
$\mathrm{aff}(S)$  Affine hull of $S$, 2.1
ANR  Class of locally compact ANR’s, 13.4
$BR(\sigma)$  $\prod_i BR_i(\sigma)$ (the best response correspondence), 15.9
$BR_i(\sigma)$  Set of $i$’s best responses to $\sigma$, 15.9
$B_\varepsilon(S)$  Closed ball of radius $\varepsilon$ centered around a set $S$ in a metric space, 2.6
$B$  A base or subbase of a topology, 4.1
$\beta_P$  Barycenter of the polytope $P$, 2.5
coNP  Class of negations of problems in NP, Ex. 3.8
$C^*$  Dual of $C$, 2.2
$C^r$  $r$ times continuously differentiable, for $1 \le r \le \infty$, 1.1
$C^r(M, N)$  Set of $C^r$ functions from $M$ to $N$, 1.3.9
$C^r_S(M, N)$  $C^r(M, N)$ endowed with the strong topology, 1.3.9
$C(X, Y)$  Set of continuous functions from $X$ to $Y$, 5.5


$C_S(X, Y)$  $C(X, Y)$ endowed with the strong topology, 5.5
$C_W(X, Y)$  $C(X, Y)$ endowed with the weak topology, 5.5
$C_P(x)$  Not outward pointing vectors of $P$ at $x$, 15.3
$C_\Sigma(p)$  Not outward pointing vectors of $\Sigma$ at $p$, 15.3
CLIQUE  Clique (computational problem), 3.9
$\mathrm{conv}(S)$  Convex hull of $S$, 2.2
$C^X$  Set of index admissible germs of correspondences for $X$, 13.5
$C_{K,V}$  Set of $f \in C(X, Y)$ such that $f(K) \subset V$, 5.5
$\mathrm{Con}(X, Y)$  Set of upper hemicontinuous convex-valued correspondences from $X$ to $Y$, 7.2
$\mathrm{Con}_S(X, Y)$  $\mathrm{Con}(X, Y)$ endowed with the strong topology, 7.2
$C, D$  Conical decompositions of a vector space, 16.6
$\chi(X)$  Euler characteristic of $X$, 1.3.13, 14.1
$\deg^\infty_q(f)$  Degree of $f$ over $q$, for $(f, q) \in D^\infty(M, N)$, 12.5
$\deg_q(f)$  Degree of $f$ over $q$, for $(f, q) \in D(M, N)$, 12.5
$\deg^\infty_2(f, q)$  Degree mod 2 of $f$ over $q$, for $(f, q) \in D^\infty(M, N)$, Ex. 12.5
$\mathrm{diam}(S)$  Diameter of the set $S$, 3.2
$\dim(\cdot)$  Dimension of a vector space, polytope, etc.
$D(A)$  Domain of attraction of $A$, 15.6
$D(M, N)$  Set of pairs $(f, q)$ such that $f$ is degree admissible over $q$, 12.5
$D^\infty(M, N)$  Set of $(f, q) \in D(M, N)$ such that $f$ is $C^\infty$ and $q$ is a regular value, 12.5
$\delta_K(K, L)$  Maximum distance from a point in $K$ to the nearest point in $L$, 4.4
$\delta_H(K, L)$  Hausdorff distance between $K$ and $L$, 4.4
$\Delta(X)$  Set of probability measures on a finite set $X$, 3.5
$\Delta^\circ(X)$  Set of interior probability measures on a finite set $X$, 16.2
$\Delta^*(A)$  Set of conditional systems on $A$, 16.4
$e_0, \ldots, e_d$  Standard unit basis vectors of $\mathbb{R}^d$, 3.3
$(E, \pi)$  A vector bundle, 10.7
EOTL  End of the line (computational problem), 3.9
EXP  Class of exponential time problems, 3.9
$f|_C$  Restriction of the function $f$ to the subdomain $C$, 1.1
FNP  Class of function versions of NP problems, 3.9
$F^q$  Set of $q$-frames in a vector space $X$, 12.1
$\mathcal{F}(f), \mathcal{F}(F)$  Set of fixed points of a function $f$ or correspondence $F$
$\Phi$  Flow, 15.2
$\Phi^{q,r}$  $\{(V, W) \in G^q \times G^r : V \subset W\}$, 12.1
$g_A(X, Y)$  Set of germs at $A$ of continuous functions from $X$ to $Y$, 13.3
$G = (V, E)$  A graph, 2.7
$G^q$  Set of $q$-dimensional linear subspaces of a vector space $X$, 12.1
$G_\delta$  A countable intersection of open sets, 8.4
$G_A(X, Y)$  Set of germs at $A$ of upper hemicontinuous correspondences from $X$ to $Y$, 13.5


$GL(X)$  Set of nonsingular linear transformations from $X$ to itself, 12.2
$GL^+(X)$  Set of elements of $GL(X)$ with positive determinant, 12.2
$\mathrm{Gr}(f), \mathrm{Gr}(F)$  Graph of the function $f$ or correspondence $F$, 1.3.8
$G^X$  Set of index admissible germs of functions for $X$, 13.3
$\gamma_A(f)$  Germ of the function $f$ at $A$, 13.3
$\Gamma_q$  Gram–Schmidt process, 12.1
$\Gamma_A(F)$  Germ of the correspondence $F$ at $A$, 13.5
$H$  Hilbert space of square summable sequences, 6.4
$\mathcal{H}(X)$  $\mathcal{U}_X$ with topology $\{\mathcal{U}_U : U \subset X \text{ is open}\} \cup \{\mathcal{V}_U : U \subset X \text{ is open}\}$ (Vietoris topology), 4.2
$h^*(\zeta)$  Vector field induced by $\zeta$ and $h$, 15.2
$\mathcal{H}^0(X)$  $\mathcal{U}^0_X$ with topology $\{\mathcal{U}^0_U : U \subset X \text{ is open}\} \cup \{\mathcal{V}^0_U : U \subset X \text{ is open}\}$ (Vietoris topology), 4.2
$\mathrm{ind}_M(\zeta)$  Vector field index of $\zeta \in V_M$, 15.5
$I^\infty$  Hilbert cube, 6.5
$\mathrm{Id}_X$  Identity function of $X$, 1.1
$I(f, P)$  Oriented intersection number of $f$ and $P$, 12.4
$\mathrm{im}(f), \mathrm{im}(F)$  Image of a function $f$ or correspondence $F$
$I^X_C$  Set of index admissible functions with domain $C$, 13.1
$I^X$  Union, over compact $C \subset X$, of the sets $I^X_C$, 13.1
$J^X_C$  Set of index admissible correspondences with domain $C$, 13.5
$J^X$  Union, over compact $C \subset X$, of the sets $J^X_C$, 13.5
$\ker(\ell)$  Kernel of a linear transformation $\ell$
$K_U$  Canonical map from $X$ to $|N_U|$, 8.6
$\tilde{\mathcal{K}}(X)$  $\tilde{\mathcal{U}}(X)$ with topology with base $\{\tilde{\mathcal{U}}_U : U \subset X \text{ is open}\}$, 4.2
$\mathcal{K}(X)$  $\mathcal{U}(X)$ with topology with base $\{\mathcal{U}_U : U \subset X \text{ is open}\}$, 4.2
$\mathcal{K}^0(X)$  $\mathcal{U}^0_X$ with topology with base $\{\mathcal{U}^0_U : U \subset X \text{ is open}\}$, 4.2
$\kappa_L(t)$  Characteristic polynomial of $L : V \to V$, 13.2
$L_C$  Lineality space of $C$, 2.2
$\lambda$  A Sperner labelling, 3.2
$\Lambda(A)$  Set of logarithmic relative probabilities on $A$, 16.4
$\Lambda^\circ(A)$  Set of interior logarithmic relative probabilities on $A$, 16.4
$\Lambda_X(f)$  Index of $f$ for the space $X$, 13.1
$M$ (also $N$ and $P$)  A manifold, 1.3.9, 10.3
$M_\pi$  Permutation matrix of $\pi$, 3.4
$M(t, S)$  Optimal choices from $S$ for type $t$, 17.1
$\mathcal{M}$  Set of monotone functions from $T$ to $A$, 17.6
$\tilde{\mathcal{M}}$  Set of equivalence classes in $\mathcal{M}$, 17.6
NP  Class of nondeterministic polynomial time problems, 3.9
$N_\Sigma(p)$  Vectors having nonnegative inner product with all elements of $C_\Sigma(p)$, 15.3
$N_U = (U, \Sigma_U)$  Nerve of the open cover $U$, 8.6
$\nu_N$  The normal bundle of $N$ in $M$, 10.7


OEOTL  Other end of the line (computational problem), 3.9
$O^q$  Set of orthonormal $q$-frames in a vector space $X$, 12.1
$\omega_\phi(p)$  $\omega$-limit set of the point $p$ (for $\phi$), 15.6
$\omega_\phi(A)$  $\omega$-limit set of the set $A$ (for $\phi$), 15.6
P  Class of polynomial time problems, 3.9
PLS  Class of polynomial local search problems, 3.9
PPA  Class of polynomial parity argument problems, 3.9
PPAD  Class of polynomial parity argument problems (directed), 3.9
PPP  Class of polynomial pigeonhole principle problems, 3.9
PSPACE  Class of polynomial space problems, 3.9
$P^\Delta$  Dual or polar of the polytope $P$, Ex. 2.5
$P, Q$  Polyhedral (often simplicial) complexes, 2.5
$P(w)$  Subdivision of $P$ relative to $w$, 2.5
$P^{(n)}$  $n$th derived of $P$, 2.5
$P(V, \Sigma)$  Canonical realization of $(V, \Sigma)$, 2.6
$\pi$-system  Collection of sets closed under finite intersection, 17.4
$\pi_V$  Orthogonal projection from a vector space $X$ to a subspace $V$, 12.1
$\Pi_{d-1}(a_1, \ldots, a_d)$  Permutahedron for $a_1 < \cdots < a_d$, 16.9, Ex. 2.4
$\mathbb{Q}$  The set of rational numbers
$r_\Sigma$  Nearest point function of the diffeoconvex body $\Sigma$, 15.3
$R(A)$  Set of relative probabilities on $A$, 16.4
$R^\circ(A)$  Set of interior relative probabilities on $A$, 16.4
$R_C$  Recession cone of $C$, 2.2
$\mathbb{R}$  The set of real numbers
$\mathbb{R}_+$  Set of nonnegative real numbers
$\mathbb{R}^m$  $m$-dimensional Euclidean space
$\mathbb{R}^m_+$  Closed positive orthant of $\mathbb{R}^m$
$R^c(\pi)$  Coarse order induced by the conditional system $\pi$, 16.7
$R^f(\pi)$  Fine order induced by the conditional system $\pi$, 16.7
SAT  Satisfiability (computational problem), Ex. 3.10
SORT  Sorting (computational problem), Ex. 3.3
$S = (V, \Sigma)$  A combinatoric simplicial complex, 2.6
$S_{e+1}$  Symmetric group of $\{0, \ldots, e\}$, 3.4
$S^n = (V, \Sigma^n)$  $n$-skeleton of $S$, 2.6
$S^m$  $m$-dimensional sphere centered at the origin of $\mathbb{R}^{m+1}$, 1.3.13
$(S, T, u, v)$  Two player game with pure strategy sets $S$ and $T$ and payoff functions $u$ and $v$, 3.5
$(S, \mathcal{S})$  A measurable space, 17.4
$(S, \mathcal{S}, \mu)$  A measure space, 17.4
span  Map passing from a frame to its span, 12.1
$\mathrm{st}(x, P)$  Open star of $x$ in $P$, 2.6
$\overline{\mathrm{st}}(x, P)$  Closed star of $x$ in $P$, 2.6
$\mathrm{supp}(\mu)$  Support of $\mu \in \Delta(X)$, 3.5


$\sigma(T)$  $\sigma$-algebra generated by $T$, 17.4
$\sigma$-algebra  17.4
$\Sigma$  A diffeoconvex body, 1.3.14, 15.3
$\Sigma$  Set of chains in a conical decomposition, 16.6
$\partial\Sigma$  $\{\sigma \in \Sigma : \{0\} \notin \sigma\}$, 16.6
TFNP  Class of total problems in FNP, 3.9
$T_1$-space  A topological space in which points are closed, 4.1
$Tf$  For a $C^1$ $f : M \to N$, the tangent map from $TM$ to $TN$, 1.3.9, 10.5
$TM$  Tangent bundle of $M$, 1.3.9, 10.5
$T_pM$  Tangent space of $M$ at $p$, 1.3.9, 10.5
$(T_i, \mathcal{T}_i, \mu_i)$  Ordered probability space of types for $i$, 17.7
$\theta_C$  Map of conditional systems induced by $C$, 16.4
$u_i$  Utility function for $i$, 17.7
$U_\varepsilon(x)$  Open ball of radius $\varepsilon$ centered at a point $x$ in a metric space, 2.6
$\tilde{\mathcal{U}}_U$  Set of compact subsets of $U$, 4.2
$\mathcal{U}_U$  Set of nonempty compact subsets of $U$, 4.2
$U(X, Y)$  Set of upper hemicontinuous correspondences from $X$ to $Y$, 5.2
$U_S(X, Y)$  $U(X, Y)$ endowed with the strong upper topology, 5.2
$U_W(X, Y)$  $U(X, Y)$ endowed with the weak upper topology, 5.3
$\mathcal{U}^0_U$  Set of nonempty closed subsets of $U$, 4.2
$V$  A $d$-dimensional vector space (in Chap. 2), 2.1
$V_\Sigma$  Domain of $r_\Sigma$, 15.3
$V_M$  Set of index admissible vector fields for $M$, 15.5
$\mathcal{V}_U$  Set of compact sets that intersect $U$, 4.2
$\mathcal{V}^0_U$  Set of closed sets that intersect $U$, 4.2
$(X, d)$  A metric space, 1.3.10
$\Xi$  Set of consistent conditional systems, 16.4
$\Xi^\circ$  Set of interior consistent conditional systems, 16.4
$\mathbb{Z}$  The set of integers

Symbols for Extensive Form Game Theory

Precedence relation partially ordering T, 16.2
A := ∏h Ah  Set of pure behavior strategy profiles, 16.3
Ah  Set of actions that may be chosen at h, 16.2
Ai  Set of actions that may be chosen by i, 16.2
a(y)  The last action chosen before y, 16.2
Bh(μ, p)  Set of optimal actions at h, given μ and p, 16.3


c(x, a)  Immediate consequence of choosing a at x, 16.2
Eμ,p(ui|h)  Expected utility of i at h, given μ and p, 16.3
U(μ, p)  Set of consistent assessments that best respond to (μ, p), 16.3
C(n)  Set of consistent conditional systems that best respond to n, 16.8
H  Partition of X into information sets, 16.2
Hi  Set of information sets at which i chooses the action, 16.2
g(x)  Information set containing x, 16.2
I  Set {1, . . . , n} of agents, 16.2
i(h)  Agent who chooses an action at h, 16.2
M  Set of systems of beliefs, 16.3
p(y)  Immediate predecessor of y, 16.2
P(t)  Set of predecessors of t, 16.2
P(t)  P(t) ∪ {t}, 16.2
P(t)  P(t) ∩ Y, 16.2
P  Set of behavior strategy profiles, 16.3
P°  Set of interior behavior profiles, 16.3
Pi  Set of behavior strategies for i, 16.3
Pp(y|x)  Probability of going from x to y, when play is governed by p, 16.3
Pp(t)  Probability that t occurs, when play is governed by p, 16.3
W  Set of consistent assessments, 16.3
W°  Set of interior consistent assessments, 16.3
N  Set of consistent conditional systems, 16.7, 16.8
N°  Set of interior consistent conditional systems, 16.7, 16.8
q  Initial assessment, 16.2
Si  Set of pure strategies of agent i, 16.3
T  Set of nodes in the game tree, 16.2
ui(z)  The utility of agent i at z, 16.2
W  Set of initial nodes, 16.2
X  Set of nonterminal nodes, 16.2
Y  Set of noninitial nodes, 16.2
Z  Set of terminal nodes, 16.2


Part I Overview

Chapter 1 Introduction and Summary

This chapter gives a gentle overview of the book’s contents, sketching the sequence of topics and their interrelationships. It also places the material in historical context, both as it developed in pure mathematics and in how its applications shaped the evolution of theoretical economics. The reader shouldn’t expect to understand it fully (otherwise we wouldn’t need the rest of the book!) and shouldn’t worry if some parts are, at this stage, downright confusing. Hopefully it will give a good sense of what you will learn if you study the book carefully, and how you might benefit.

1.1 The Key Concept

The fixed point index is the central theme of this book. Figure 1.1 shows a differentiable function f : [0, 1] → [0, 1] with three fixed points. In this context we say that a fixed point x is regular if the derivative

D(Id[0,1] − f )(x) = IdR − Df (x) : R → R

is nonsingular. If x is a regular fixed point that is contained in (0, 1), its index is the sign of the determinant |IdR − Df (x)|, so it is +1 if this determinant is positive and −1 if this determinant is negative. If all of the fixed points of f are regular and contained in (0, 1), then the index of f is the sum of the indices of its fixed points. In this figure we see that the number of fixed points with index +1 is one more than the number of fixed points with index −1, so the index of f is +1. This is strictly more information than is provided by a theorem that merely asserts that a fixed point exists. The key features of this example extend to a very high level of generality. Below we describe some key properties of the index, which will eventually become a system of axioms, and give an informal explanation of how this generality will be established.
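The one-dimensional definition can be checked numerically. The following is a minimal sketch, assuming a hypothetical cubic map chosen for illustration (it is not the function drawn in Fig. 1.1) that sends [0, 1] into itself and has the three fixed points 0.2, 0.5, and 0.8; the helper names are likewise illustrative.

```python
# Hypothetical example map (an assumption, not the function of Fig. 1.1):
# f(x) = x - 2(x - 0.2)(x - 0.5)(x - 0.8) maps [0, 1] into itself and has
# exactly three fixed points, all in the open interval (0, 1).
def f(x):
    return x - 2.0 * (x - 0.2) * (x - 0.5) * (x - 0.8)


def derivative(func, x, h=1e-6):
    """Central-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


def index_at(func, x):
    """Index of a regular fixed point x: the sign of 1 - func'(x)."""
    d = 1.0 - derivative(func, x)
    if abs(d) < 1e-8:
        raise ValueError("fixed point is not regular")
    return 1 if d > 0 else -1


fixed_points = [0.2, 0.5, 0.8]
# Confirm that the listed points really are fixed points of f.
assert all(abs(f(x) - x) < 1e-12 for x in fixed_points)

indices = [index_at(f, x) for x in fixed_points]
total_index = sum(indices)
print(indices, total_index)  # [1, -1, 1] 1
```

Two fixed points have index +1 and one has index −1, so the index of this map is +1, matching the count described for the figure.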

© Springer Nature Singapore Pte Ltd. 2018
A. McLennan, Advanced Fixed Point Theory for Economics,
https://doi.org/10.1007/978-981-13-0710-2_1


Fig. 1.1 A function with three fixed points

To start off with we work with a very special class of functions. Let C ⊂ Rm be compact,1 and let ∂C := C ∩ cl(Rm \ C), where cl denotes closure, be its topological boundary. Let f : C → Rm be a C1 function that doesn’t have any fixed points in ∂C, and that has only regular fixed points, so IdRm − Df (x) is nonsingular whenever f (x) = x. The inverse function theorem implies that each fixed point has a neighborhood that contains no other fixed points, so, because C is compact, there are finitely many fixed points. We define the index of f to be the number of fixed points x such that the determinant of IdRm − Df (x) is positive minus the number of fixed points x such that the determinant of IdRm − Df (x) is negative.
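To make the definition concrete in more than one dimension, here is a small sketch under stated assumptions: the map f(x, y) = (x³ + y, y/2) on C = [−2, 2]² is a hypothetical example, its three fixed points are supplied by hand, and the Jacobian is approximated by central differences rather than computed symbolically.

```python
# Hypothetical example: f(x, y) = (x**3 + y, y / 2) on C = [-2, 2]^2.
# Fixed points solve y = y / 2 and x = x**3 + y, giving (-1, 0), (0, 0), (1, 0);
# none of them lies on the boundary of C.
def f(p):
    x, y = p
    return (x ** 3 + y, y / 2.0)


def jacobian(func, p, h=1e-6):
    """Central-difference Jacobian of func : R^n -> R^n at p, as a list of rows."""
    n = len(p)
    cols = []
    for j in range(n):
        up = list(p)
        dn = list(p)
        up[j] += h
        dn[j] -= h
        fu, fd = func(up), func(dn)
        cols.append([(fu[i] - fd[i]) / (2.0 * h) for i in range(n)])
    # cols[j][i] approximates d f_i / d x_j; transpose into rows.
    return [[cols[j][i] for j in range(n)] for i in range(n)]


def index_at(func, p):
    """Sign of det(Id - Df(p)) for a regular fixed point p in R^2."""
    J = jacobian(func, p)
    a = [[(1.0 if i == j else 0.0) - J[i][j] for j in range(2)] for i in range(2)]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if abs(det) < 1e-8:
        raise ValueError("fixed point is not regular")
    return 1 if det > 0 else -1


fixed_points = [(-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)]
assert all(f(p) == p for p in fixed_points)

indices = [index_at(f, p) for p in fixed_points]
index_of_f = sum(indices)
print(indices, index_of_f)  # [-1, 1, -1] -1
```

Because this f does not map C into itself, nothing forces the index to be +1; here it is −1.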

We now point out several properties of this index that will eventually turn into a system of axioms. The first asserts that if c : C → Rm is a constant function whose value is in C \ ∂C, then the index of c is +1. This property of the index is called Normalization because its role is to establish the system of units (economists might use the term “numeraire”) that we will use to count fixed points.

Suppose that C1, . . . , Cr are compact subsets of C that are pairwise disjoint, and that all of the fixed points of f are contained in the interiors of these sets. Then the sum over i of the index of f |Ci is the index of f. This principle is called Additivity. It embeds the principle that the index of f depends only on the restriction of f to arbitrarily small neighborhoods of its set of fixed points. In particular, we can define the index of an isolated fixed point of f to be the index of the restriction of f to small neighborhoods of the fixed point.

1 In this book a topological space is compact if every open cover has a finite subcover. (In the Bourbaki tradition such a space is said to be quasicompact, and a compact space is one that is both quasicompact and Hausdorff.)


Fig. 1.2 Homotopic deformation of the set of fixed points

The next property of the index is invariance under well behaved homotopies. Suppose h : C × [0, 1] → Rm is a homotopy, i.e., a continuous function. Let ht := h(·, t) : C → Rm denote the function “at time t.” We assume that for all t, ht does not have any fixed points in ∂C, that h is C1, and that all of the fixed points of h0 and h1 are regular. We wish to show that the index of h0 agrees with the index of h1. Visually, as time goes from 0 to 1 we imagine pairs of fixed points of opposite index being born and then moving apart, or coming together and then vanishing (Fig. 1.2).
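The birth of a pair of fixed points of opposite index can be seen in a one-dimensional sketch. The homotopy below is a hypothetical example, and the sign-change scan is only a rough numerical device that is adequate for this example; it is not the argument used in the text.

```python
# Hypothetical homotopy on C = [-2, 2]: h_t(x) = x - (x**2 - (2t - 1)).
# The fixed points of h_t are the zeros of g_t(x) = x - h_t(x) = x**2 - (2t - 1).
# g_t is positive at both endpoints of C for every t in [0, 1], so no h_t has a
# fixed point on the boundary. At t = 0.5 the fixed point 0 is not regular,
# so that time is deliberately left out of the sample below.
def g(x, t):
    return x * x - (2.0 * t - 1.0)


def count_and_index(t, lo=-2.0, hi=2.0, n=4000):
    """Return (number of fixed points found, index of h_t).

    The scan uses cell midpoints. A sign change of g from - to + is a zero
    with positive slope (index +1); a change from + to - has index -1.
    """
    step = (hi - lo) / n
    xs = [lo + (k + 0.5) * step for k in range(n)]
    count, index = 0, 0
    for left, right in zip(xs, xs[1:]):
        gl, gr = g(left, t), g(right, t)
        if gl < 0 < gr:
            count += 1
            index += 1
        elif gl > 0 > gr:
            count += 1
            index -= 1
    return count, index


results = {t: count_and_index(t) for t in (0.0, 0.25, 0.75, 1.0)}
for t, (count, index) in sorted(results.items()):
    print(t, count, index)
```

For the early times there are no fixed points, for the late times there are two, and the index is 0 throughout, which is what invariance under homotopy predicts.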

Let g0 : C → Rm be the function x ↦ x − h0(x), let g1 : C → Rm be the function x ↦ x − h1(x), and let g : C × [0, 1] → Rm be the function (x, t) ↦ x − ht (x). Then 0 is a regular value of g0 and g1, which is to say that Dg0(x) is nonsingular whenever g0(x) = 0, and similarly for g1. We also assume that 0 is a regular value of g, which means that Dg(x, t) has rank m whenever g(x, t) = 0.

Suppose that g(x, t) = 0. If 0 < t < 1, then the implicit function theorem implies that a neighborhood of (x, t) in g−1(0) is a smooth curve. If t = 0 and g̃ is a C1 extension of g to a neighborhood of C × [0, 1], then a neighborhood of (x, 0) in g̃−1(0) is a smooth curve, and the line tangent to this curve at (x, 0) is the set of points (x, 0) + αv, α ∈ R, for a nonzero v ∈ Rm+1 that is in the kernel of Dg(x, 0). Since Dg0(x) is nonsingular, the final component of v cannot vanish. Therefore a neighborhood of (x, 0) in g−1(0) is a smooth curve with endpoint, and the tangent line of this curve is not contained in Rm × {0}. The situation is similar when t = 1.


Since g is continuous, g−1(0) is a closed subset of the compact set C × [0, 1], hence compact, and by assumption it is contained in (C \ ∂C) × [0, 1]. Straightforward arguments show that it has finitely many connected components, each of which is a smooth curve that is homeomorphic either to a circle, in which case it is contained in (C \ ∂C) × (0, 1), or to a closed interval, in which case its endpoints are contained in (C \ ∂C) × {0, 1}. A point x ∈ C is a fixed point of h0 (h1) if and only if (x, 0) ((x, 1)) is an endpoint of one of these line segments. In order to show that the index of h0 is the index of h1 it suffices to show that the two endpoints of each line segment make the same total contribution to the two indices. Concretely this means that if the two endpoints are (x0, 0) and (x1, 1), then the determinants of Dg0(x0) and Dg1(x1) have the same sign, and if the two endpoints are (x0, 0) and (x1, 0) ((x0, 1) and (x1, 1)) then the determinants of Dg0(x0) and Dg0(x1) (Dg1(x0) and Dg1(x1)) have opposite signs.

Suppose that a component homeomorphic to a line segment is parameterized by a C1 function γ = (ξ, τ ) : [a, b] → C × [0, 1] whose derivative does not vanish anywhere. Let w1, . . . , wm : [a, b] → Rm+1 be continuous functions such that for each s, the vectors w1(s), . . . , wm(s), γ′(s) are linearly independent, hence a basis of Rm+1, and the final components of w1(a), . . . , wm(a), w1(b), . . . , wm(b) are all 0, so that

w1(a), . . . , wm(a) and w1(b), . . . , wm(b)

may be regarded as bases of Rm.

Two ordered bases b1, . . . , bk and c1, . . . , ck of a k-dimensional vector space have the same (opposite) orientation if the linear transformation taking each bi to ci has a positive (negative) determinant. For each s,

Dg(γ (s))w1(s), . . . , Dg(γ (s))wm(s), Dg(γ (s))γ ′(s)

spans Rm and Dg(γ (s))γ′(s) = 0, so Dg(γ (s))w1(s), . . . , Dg(γ (s))wm(s) is a basis of Rm. By continuity

Dg(γ (a))w1(a), . . . , Dg(γ (a))wm(a) and Dg(γ (b))w1(b), . . . , Dg(γ (b))wm(b)

have the same orientation. Continuity also implies that the bases

w1(a), . . . ,wm(a), γ ′(a) and w1(b), . . . ,wm(b), γ ′(b)

of Rm+1 have the same orientation. Elementary facts concerning determinants now imply that the bases

w1(a), . . . ,wm(a) and w1(b), . . . ,wm(b)

of Rm have the same orientation if and only if the final components of γ′(a) and γ′(b) have the same sign, which is the case if and only if γm+1(a) ≠ γm+1(b). Combining all of this, we conclude that if γm+1(a) ≠ γm+1(b) (γm+1(a) = γm+1(b)) then


w1(a), . . . ,wm(a) and Dg(γ (a))w1(a), . . . , Dg(γ (a))wm(a)

have the same orientation if and only if

w1(b), . . . ,wm(b) and Dg(γ (b))w1(b), . . . , Dg(γ (b))wm(b)

have the same (opposite) orientation, which is what we wished to prove. This concludes the proof that the index of h0 agrees with the index of h1 when h : C × [0, 1] → Rm is C1, each ht has no fixed points in ∂C, and 0 is a regular value of (x, t) ↦ x − h(x, t).

Now suppose that C′ ⊂ Rm′ is compact, g : C → C′ and g′ : C′ → C are C1, and all the fixed points of g′ ◦ g and g ◦ g′ are regular and contained in C \ ∂C and C′ \ ∂C′ respectively. The property of the index known as Commutativity asserts that the index of g′ ◦ g is the index of g ◦ g′. This principle is a consequence of a highly nontrivial fact of linear algebra, so we won’t discuss the proof further at this point. (The statement and proof of Proposition 13.4 are self contained, so the curious reader can look at them right away if she likes.)

Next, suppose that f : C → Rm and f′ : C′ → Rm′ are C1 functions whose fixed points are all regular and contained in C \ ∂C and C′ \ ∂C′ respectively. Let f × f′ : C × C′ → Rm+m′ be the function (x, x′) ↦ ( f (x), f′(x′)). The fixed points of f × f′ are the pairs (x, x′) where x is a fixed point of f and x′ is a fixed point of f′, so they are contained in (C × C′) \ ∂(C × C′). The matrix of IdRm+m′ − D( f × f′)(x, x′) is block diagonal with blocks corresponding to IdRm − Df (x) and IdRm′ − Df′(x′). From this it follows that the fixed points of f × f′ are all regular. Since the determinant of a block diagonal matrix is the product of the determinants of the blocks, the index of a fixed point (x, x′) is the product of the index of x and the index of x′. If we sum over all fixed points of f × f′ and apply the distributive law, we find that the index of f × f′ is the product of the index of f and the index of f′. This principle is called Multiplication.
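The distributive-law argument can be traced on a small example. This sketch assumes two hypothetical one-dimensional maps with hand-picked fixed points; the name f2 stands in for the second map f′, and the derivatives are approximated by central differences.

```python
from itertools import product


# Hypothetical maps with known regular fixed points (assumptions for illustration).
def f(x):   # on C = [0, 1]; fixed points 0.2, 0.5, 0.8
    return x - 2.0 * (x - 0.2) * (x - 0.5) * (x - 0.8)


def f2(y):  # plays the role of f' on C' = [-2, 2]; fixed points -1, 0, 1
    return y ** 3


def deriv(func, x, h=1e-6):
    """Central-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2.0 * h)


def sign(v):
    return 1 if v > 0 else -1


fixed_f = [0.2, 0.5, 0.8]
fixed_f2 = [-1.0, 0.0, 1.0]

index_f = sum(sign(1.0 - deriv(f, x)) for x in fixed_f)
index_f2 = sum(sign(1.0 - deriv(f2, y)) for y in fixed_f2)


def index_of_pair(x, y):
    # Id - D(f x f2)(x, y) is block diagonal with 1 x 1 blocks.
    a = [[1.0 - deriv(f, x), 0.0],
         [0.0, 1.0 - deriv(f2, y)]]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return sign(det)


# Sum the indices of all nine fixed points (x, y) of the product map.
index_product = sum(index_of_pair(x, y) for x, y in product(fixed_f, fixed_f2))
print(index_f, index_f2, index_product)  # 1 -1 -1
```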

We now wish to extend the index to functions that are merely continuous rather than smooth, to more general spaces, and finally to correspondences. We briefly describe the main ideas in each of these steps.

The norm of a continuous function f : C → Rm is

‖ f ‖∞ := maxx∈C ‖ f (x)‖.

The distance between two continuous functions f, f ′ : C → Rm is

d( f, f ′) := ‖ f − f ′‖∞ .

This is easily seen to be a metric. Suppose that f : C → Rm is continuous and has no fixed points in ∂C. If f′ is sufficiently close to f, then f′ has no fixed points in ∂C. In Chap. 10 we will show that every neighborhood of f contains a C∞ function f′. Sard’s theorem (the topic of Chap. 11) implies that there are vectors δ ∈ Rm arbitrarily close to the origin such that all of the fixed points of x ↦ f′(x) + δ are regular. Therefore every neighborhood of f contains a C∞ function whose fixed points are all regular.

We would like to define the index of f to be the index of such approximating functions, which makes sense if two such approximations f0 and f1 that are sufficiently close to f have the same index. There is a smooth homotopy

h(x, t) := (1 − t) f0(x) + t f1(x).

If f0 and f1 are sufficiently close to f, then each ht will have no fixed points in the boundary of C. Using the inverse function theorem, one can easily show that for δ in some neighborhood of the origin, the fixed points of x ↦ f0(x) + δ are in an obvious one-to-one correspondence with the fixed points of f0, so that f0 + δ has the same index as f0. Another application of Sard’s theorem implies that there exist δ arbitrarily close to the origin such that 0 is a regular value of (x, t) ↦ x − h(x, t) − δ. For such a δ in a small enough neighborhood of the origin the index of f0 agrees with the index of f0 + δ, the homotopy property of the smooth index implies that f0 + δ and f1 + δ have the same index, and f1 + δ has the same index as f1.

Thus our definition makes sense. In addition, this construction implies that the index has a property called Continuity: the index of f agrees with the index of functions in a sufficiently small neighborhood of f. Note that if h : C × [0, 1] → Rm is continuous and for all t, ht has no fixed points in ∂C, then t ↦ ht is continuous, so Continuity implies that the index of ht is a (locally constant, hence) constant function of t, and thus the index of h0 agrees with the index of h1. The index for continuous functions satisfies Normalization automatically, and Additivity, Commutativity, and Multiplication are shown to hold by taking suitable smooth approximations of the given functions for which the relevant condition has already been established. In fact we will see that the index is the unique integer valued function on the space of relevant functions that satisfies Normalization, Additivity, and Commutativity.
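A one-dimensional sketch may help show why the index of an approximation does not depend on which approximation is chosen, even when the number of fixed points does. Everything here is an assumption made for illustration: the map f(x) = x − x³ on C = [−1, 1], whose only fixed point 0 is not regular, the two nearby maps f0 and f1, and the midpoint scan used to locate sign changes.

```python
# Hypothetical example on C = [-1, 1]: f(x) = x - x**3 has the single fixed
# point 0, which is not regular because 1 - f'(0) = 0.
def f(x):
    return x - x ** 3


def f0(x, delta=1e-3):
    # A small translate of f: x -> f(x) + delta.
    return f(x) + delta


def f1(x, eps=1e-2):
    # A different smooth map that is also close to f in the sup norm.
    return f(x) + eps * x


def fixed_points_and_index(func, lo=-1.0, hi=1.0, n=2000):
    """Scan cell midpoints for sign changes of g(x) = x - func(x).

    A change from - to + contributes index +1, a change from + to - index -1.
    Returns the approximate fixed points and the total index.
    """
    step = (hi - lo) / n
    xs = [lo + (k + 0.5) * step for k in range(n)]
    points, index = [], 0
    for a, b in zip(xs, xs[1:]):
        ga, gb = a - func(a), b - func(b)
        if ga < 0 < gb:
            points.append(0.5 * (a + b))
            index += 1
        elif ga > 0 > gb:
            points.append(0.5 * (a + b))
            index -= 1
    return points, index


points0, index0 = fixed_points_and_index(f0)
points1, index1 = fixed_points_and_index(f1)
print(len(points0), index0)  # 1 1
print(len(points1), index1)  # 3 1
```

The translate has one regular fixed point and the other approximation has three, yet both have index +1, which is the value one would then assign to f.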

The next level of generalization replaces Rm with a more general space. In Chap. 8 we define and study a class of metric spaces called absolute neighborhood retracts (ANR’s) that is very general, but also well behaved with respect to fixed point theory. Suppose that X is a compact ANR, C ⊂ X is compact, and ε > 0. A key result (Theorem 8.4) states that there is an integer m, an open U ⊂ Rm with compact closure Ū, and continuous functions ϕ : C → U and ψ : U → X such that IdC and ψ ◦ ϕ are ε-homotopic. (That is, there is a continuous η : C × [0, 1] → X such that η0 = IdC, η1 = ψ ◦ ϕ, and for all x ∈ C and t ∈ [0, 1] the distance from x to η(x, t) is less than ε.) Intuitively, a compact subset of an ANR can be approximated by a compact subset of a Euclidean space.

Suppose that f : C → X is a continuous function with no fixed points in the boundary of C. Let D be a compact neighborhood of the set of fixed points of f such that the open ball of radius ε around D is contained in C. If an index at this level of generality satisfied all of our properties, then Additivity would imply that the index of f was the index of f |D, Continuity would imply that the index of f |D was the index of f ◦ ψ ◦ ϕ|D, and Commutativity would imply that the index of f ◦ ψ ◦ ϕ|D was the index of ϕ ◦ f ◦ ψ|ψ−1(D). The latter function maps a compact subset of Rm to Rm, so its index has already been defined. This observation suggests that we define the index of f to be the index of this function. There are of course numerous details, but the bottom line is that this works, and gives a unique index for functions such as f that satisfies Normalization, Additivity, Continuity, and Commutativity. In addition this index satisfies Multiplication.

The final generalization is to correspondences. If X and Y are topological spaces, a correspondence F : X → Y assigns a nonempty set F(x) ⊂ Y to each x ∈ X. If Y = X, then x ∈ X is a fixed point of F if x ∈ F(x). The correspondence F is upper hemicontinuous2 if, for each x, F(x) is compact and for every neighborhood V of F(x) there is a neighborhood U of x such that F(x′) ⊂ V for all x′ ∈ U. (Most authors do not include compact valuedness in the definition of upper hemicontinuity, but we will never be interested in upper hemicontinuous (in the more general sense) correspondences that are not compact valued.) A topological space Z is contractible if there is a continuous function c : Z × [0, 1] → Z such that c0 = IdZ and c1 is a constant function. We say that F is contractible valued if each F(x) is contractible. In Chap. 9 it is shown that if X and Y are absolute neighborhood retracts, F : X → Y is an upper hemicontinuous and contractible valued correspondence, and W ⊂ X × Y is a neighborhood of the graph { (x, y) : y ∈ F(x) } of F, then there is a continuous function f : X → Y whose graph is contained in W. This suggests that when X is an absolute neighborhood retract, C ⊂ X is compact, and F : C → X is an upper hemicontinuous contractible valued correspondence with no fixed points in the boundary of C, then we might define the index of F to be the index of f for continuous f whose graphs are in a sufficiently small W. Again, this works and gives a unique index for correspondences such as F that satisfies Normalization, Additivity, Continuity, and Commutativity (which is only defined for functions). Again, this index also satisfies Multiplication.
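The approximation of a correspondence by continuous functions can be illustrated with a simple step-shaped example. The correspondence F below, the piecewise linear approximations, and the distance helper are all assumptions introduced for this sketch; they are not the construction of Chap. 9.

```python
import math

# Hypothetical correspondence on C = [0, 1]:
#   F(x) = {1} if x < 1/2,   F(1/2) = [0, 1],   F(x) = {0} if x > 1/2.
# It is upper hemicontinuous, its values are points or an interval (hence
# contractible), and its only fixed point is x = 1/2.


def approx(eps):
    """Continuous piecewise linear function whose graph is within eps of the graph of F."""
    def f(x):
        if x <= 0.5 - eps:
            return 1.0
        if x >= 0.5 + eps:
            return 0.0
        return (0.5 + eps - x) / (2.0 * eps)
    return f


def dist_to_graph(x, y):
    """Distance from (x, y), with 0 <= x, y <= 1, to the graph of F."""
    to_vertical = abs(x - 0.5)                           # segment {1/2} x [0, 1]
    to_upper = math.hypot(max(x - 0.5, 0.0), y - 1.0)    # points (x', 1), x' <= 1/2
    to_lower = math.hypot(max(0.5 - x, 0.0), y)          # points (x', 0), x' >= 1/2
    return min(to_vertical, to_upper, to_lower)


def index_at_half(f, eps):
    """Sign of 1 - f'(1/2), using a difference quotient inside the linear piece."""
    h = eps / 10.0
    slope = (f(0.5 + h) - f(0.5 - h)) / (2.0 * h)
    return 1 if 1.0 - slope > 0 else -1


summary = []
for eps in (0.1, 0.01, 0.001):
    f = approx(eps)
    is_fixed = abs(f(0.5) - 0.5) < 1e-9
    worst = max(dist_to_graph(k / 1000.0, f(k / 1000.0)) for k in range(1001))
    summary.append((eps, is_fixed, worst <= eps + 1e-12, index_at_half(f, eps)))

for row in summary:
    print(row)
```

Each approximation has the single fixed point 1/2 with index +1, so under the definition sketched in the text the index of this F would be +1.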

This completes the description of the book’s central core. Much of our work consists of preparations for a completely rigorous rendition of the argument sketched above. Many of the concepts and results (e.g., the separating hyperplane theorem) have considerable independent significance in economic theory. Insofar as this book aims to be a compleat treatment of fixed point theory, as it relates to economics, there is in addition a discussion of computation in Chap. 3. In Chap. 10 the perspective is broadened to include smooth manifolds and differential topology, which are the proper settings of the degree (Chap. 12) and the vector field index (Chap. 15) which are alternative formulations of the index concept. Chapter 14 uses the index to derive some classical results of topology, and the relationship between the fixed point index and dynamic stability is developed in Chap. 15. Finally, Chaps. 16 and 17 are expositions of journal articles that illustrate how the mathematics developed herein has been applied in actual economic research.

2 Although it is not directly relevant, it makes sense to mention that F is lower hemicontinuous if, for each x and open V ⊂ Y such that F(x) ∩ V ≠ ∅, there is a neighborhood U of x such that F(x′) ∩ V ≠ ∅ for all x′ ∈ U.


1.2 Historical Background

Leon Walras is generally credited with initiating the mathematical theory of general economic equilibrium. He pointed out that in an exchange economy in which agents trade endowments of ℓ goods, there are effectively ℓ − 1 equations (if supply equals demand in ℓ − 1 markets, it must also be equal in the last market) and ℓ − 1 prices (supply and demand are unaffected if all prices are multiplied by the same positive scalar). This observation suggests that the system of equations should have solutions, which will “typically” be isolated, but Walras’ attempts to prove rigorous theorems along these lines were unsuccessful, and after his work there was little mathematical progress on this topic for over half a century.

The Brouwer fixed point theorem states that if C is a nonempty compact convex subset of a Euclidean space and f : C → C is continuous, then f has a fixed point. The proof of this by Brouwer (1912) was one of the major events in the history of topology.3 Since then the study of such results, and the methods used to prove them, has flourished, undergone radical transformations, become increasingly general and sophisticated, and extended its influence to diverse areas of mathematics. Of particular note are the Lefschetz (1926) fixed point theorem, which extended the result to nonconvex domains, the Schauder (1930) fixed point theorem and its subsequent generalizations, which extended the result to infinite dimensional domains, and the Kakutani (1941) fixed point theorem for correspondences, which was subsequently generalized by Eilenberg and Montgomery (1946).

Algebraic topology emerged gradually during the quarter century following Brouwer’s work. Although this process was lengthy, and influenced by many undercurrents, one can still clearly see it as an elaboration of the ideas used to establish the fixed point principle. Since that time the methods of algebraic topology have spread to many other fields, and it has become a basic course for graduate students of mathematics.

Around 1950, most notably through the work of Nash (1950, 1951) on noncooperative games, and the work of Arrow and Debreu (1954) on general equilibrium theory, it emerged that in economists’ most fundamental and general models, equilibria are fixed points. The results of Sonnenschein (1973), Debreu (1974), and Mantel (1974) show that the theorem asserting existence of a Walrasian equilibrium price vector is not less general than Brouwer’s fixed point theorem, because any function satisfying certain obvious necessary conditions can be the sum of sufficiently many individual excess demand functions. In Chap. 3 we will see results suggesting that Nash’s existence theorem (even restricted to the case of two players) is not less general than Brouwer’s theorem. Conditions under which there is a unique Walrasian equilibrium were studied by Arrow and Hurwicz (1958) and Arrow et al. (1959). These authors and others also initiated the study of the stability of equilibrium under certain seemingly natural price adjustment processes.

3 Although the result is universally attributed to Brouwer, it seems that it had actually been proved earlier by Bohl (1904).


The other point made by Walras, that “most” economies should have equilibria that are isolated and thus finite in number, was formalized rather later by Debreu (1970). His result showed that the method of comparative statics, which had been emphasized by Samuelson (1947) in his Foundations of Economic Analysis, was, in a precise sense, “almost always” applicable. Shortly thereafter Harsanyi (1973) showed that for “almost all” strategic form games there are finitely many Nash equilibria. This result was extended to extensive form games by Kreps and Wilson (1982). Since Debreu’s seminal contribution a large number of other papers have proved generic finiteness results for a variety of economic models, and his method is now a standard part of the toolkit of theoretical economics.

These results apply Sard’s theorem, which is another major landmark of 20th century mathematics. (This result is sometimes called the Morse-Sard-Brown theorem, acknowledging the contributions of Brown 1935 and Morse 1939, as well as Sard 1942.) Consider a C1 function f : R → R and let C := { t : f′(t) = 0 } be the set of critical points of f. If t ∈ C, then the definition of the derivative implies that for small δ > 0, f ((t − δ, t + δ)) is contained in an interval ( f (t) − ε, f (t) + ε) where ε is much smaller than δ, and in fact we can make the ratio ε/δ arbitrarily small by making δ small. This suggests that f (C) should be a “small” subset of R, but it is not at all difficult to concoct examples for which the closure of f (C) is all of R, so just what we might mean by this is not obvious. In fact it was only with the reformulation of the theories of integration and probability using measure theory, which began during the 1920’s, that adequate concepts became available. In addition to the concept of a set of measure zero, the proof in, for example, Milnor (1965a), uses only the tools of multivariable calculus. The final form of the result for the finite dimensional case is due to Federer, and is stated and proved in Sect. 3.4 of Federer (1969), together with a complete collection of examples showing that no better result is possible. Smale (1965) provided an infinite dimensional version of the result.

Sard’s theorem had important implications for topology. Roughly, it implies that, in a variety of settings, an arbitrary continuous function can be well approximated by a smooth function that is in “general position,” in whatever sense is desired. (Here we have in mind Theorem 11.4.) This provides a method for counting things like points of intersection that was developed into a systematic toolkit, which is now known as the field of differential topology, and which is a fundamental method of this book. The brief monograph by Milnor (1965a) was very influential in theoretical economics, and Guillemin and Pollack (1974) is another broadly accessible introduction, while Hirsch (1976b) is a graduate level text.

The fixed point index (actually its equivalent formulation in terms of degree) was first defined by Leray and Schauder (1934). Notable events in its subsequent development are the extension to absolute neighborhood retracts by Browder (1948) and the axiomatic formulation developed for simplicial complexes by O’Neill (1953), which was extended to absolute neighborhood retracts by Bourgin (1955a, b, 1956). The survey of Mawhin (1999) documents its influence in mathematics, especially in nonlinear functional analysis and in relation to certain types of partial differential equations. He remarks that: “The quick search in the Mathematical Reviews disclosing 591 references to papers that make use of it, mentioned by Peter Lax in deFinetti (1949b), is surely underestimated, and the real figure should be much larger than one thousand.” The books of Dugundji and Granas (2003) and Górniewicz (2006) provide extensive additional documentation of its influence.

At the same time the fixed point index has had very little influence in theoretical economics. A few researchers know of it, and I have seen a handful of applications, including Eraslan and McLennan (2013), but they are not commensurate with the importance and centrality of fixed points in economic theory. Surely one reason for this neglect is that prior to this book there have been no presentations of the theory at a high level of generality that were well suited to economists. In addition, the communities of scholars studying nonlinear functional analysis and partial differential equations seem to have little overlap or contact with economic theory. Needless to say, my hope is that in the future we will see many more applications of index theory in economics. A concrete reason for expecting such developments is laid out in Chap. 15, which develops the relationship between the index and dynamic stability, and explains its relationship to Samuelson’s correspondence principle. More generally, that a powerful theory related to the central mathematics of economic theory might continue to have only slight application to the diverse and prolific economic research being produced these days would seemingly defy reason.

1.3 Chapter Contents

This section gives brief (or in some cases not so brief) synopses of the contents of each chapter. The contents of the various chapters are quite heterogeneous, and, to a greater extent than for many books of mathematics, it is possible at the beginning to give accessible descriptions of the main ideas. Presumably these synopses will make it somewhat easier to use the book as a reference.

Some general remarks concerning the character of the material may be useful. This is a book of mathematics, and it reflects my beliefs concerning how economic theorists should approach mathematics. Compared to the vastness of present day mathematical knowledge, life is woefully brief, which is a strong argument for minimality. Where I have gone a bit deeper than necessary, I feel that the subject matter is sufficiently important, either to fixed point theory or to mathematics in general, that minimal mathematical literacy for an economic theorist should include at least a brief acquaintance with the basic structures and points of view of the topic. This will be seen especially in the exercises, which in many cases sketch proofs of major results that are not closely related to the material in the chapter. Also, there should be some room for appreciating things that are merely beautiful.

I am a stickler for proving everything, both in what I read and what I write. The habit of insisting on a full understanding of proofs of substantial theorems is a form of investment that, in my own experience, pays handsome dividends. (That the vast majority of economists did not see a proof of Brouwer’s fixed point theorem during their education has never ceased to shock me.) Beyond the presumed basic knowledge, this book’s argument is (with one or two small exceptions) entirely self contained, and those who approach it in this spirit should be well rewarded.


1.3.1 Chapter 2: Planes, Polyhedra, and Polytopes

This material is foundational, introducing the simplest geometric objects that are “uncurved.” Much of this is a matter of terminology. For example, an affine combination of points x1, . . . , xr in a vector space is a sum α1x1 + · · · + αr xr where α1, . . . , αr are scalars that sum to one, the affine hull of a set is the set of all affine combinations of its elements, and x1, . . . , xr are affinely independent if each xi is not an affine combination of x1, . . . , xi−1, xi+1, . . . , xr. An affine subspace is a set that is its own affine hull. A convex combination of x1, . . . , xr is a sum α1x1 + · · · + αr xr where α1, . . . , αr are nonnegative scalars that sum to one, the convex hull of a set is the set of all convex combinations of its elements, and a set is convex if it is its own convex hull. The convex hull of an affinely independent set of points is a simplex.
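These definitions can be tried out in the plane. The sketch below is an illustration under stated assumptions: it handles only three points of R², uses exact rational arithmetic, and the function names are hypothetical. For affinely independent p0, p1, p2 it computes the unique scalars summing to one that express a point as an affine combination; the point lies in the simplex exactly when these scalars are all nonnegative.

```python
from fractions import Fraction as Fr


def sub(u, v):
    return (u[0] - v[0], u[1] - v[1])


def det(u, v):
    return u[0] * v[1] - u[1] * v[0]


def affinely_independent(p0, p1, p2):
    """Three points of R^2 are affinely independent iff p1 - p0 and p2 - p0 are linearly independent."""
    return det(sub(p1, p0), sub(p2, p0)) != 0


def affine_coefficients(p, p0, p1, p2):
    """Scalars (a0, a1, a2) summing to one with p = a0*p0 + a1*p1 + a2*p2 (Cramer's rule)."""
    d = det(sub(p1, p0), sub(p2, p0))
    a1 = det(sub(p, p0), sub(p2, p0)) / d
    a2 = det(sub(p1, p0), sub(p, p0)) / d
    return (1 - a1 - a2, a1, a2)


def in_simplex(p, p0, p1, p2):
    """p is a convex combination of the vertices iff all coefficients are nonnegative."""
    return all(a >= 0 for a in affine_coefficients(p, p0, p1, p2))


p0, p1, p2 = (Fr(0), Fr(0)), (Fr(1), Fr(0)), (Fr(0), Fr(1))
inside = (Fr(1, 4), Fr(1, 4))
outside = (Fr(1), Fr(1))

coeffs_inside = affine_coefficients(inside, p0, p1, p2)
coeffs_outside = affine_coefficients(outside, p0, p1, p2)
print(coeffs_inside, in_simplex(inside, p0, p1, p2))
print(coeffs_outside, in_simplex(outside, p0, p1, p2))
```

The point (1, 1) is an affine combination of the vertices with coefficients (−1, 1, 1), but it is not a convex combination, so it lies in the affine hull and outside the simplex.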

A cone is a set that contains any nonnegative scalar multiple of any of its elements, and a polyhedral cone is a finite intersection of closed half spaces, each of which has the origin in its boundary. More generally, a polyhedron is a finite intersection of closed half spaces, and a polyhedron is a polytope if it is bounded and hence compact. A polyhedral complex is a collection of polyhedra that contains all the faces of each of its elements, such that the intersection of any two of its elements is a common face. A polytopal complex is a polyhedral complex whose elements are all polytopes, and a simplicial complex is a polyhedral complex whose elements are all simplices. Simplicial complexes also arise combinatorially: a combinatoric simplicial complex is a pair (V, Σ) where V is a set of “vertices” and Σ is a collection of finite subsets of V that contains all of the subsets of each of its elements. A (simple undirected) graph is a pair (V, E) in which V is a set and E is a collection of two element subsets of V.
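The combinatorial definitions translate directly into a small data structure. In this sketch the function names are hypothetical, and the n-skeleton is taken, as an assumption following the usual convention, to consist of the sets with at most n + 1 elements.

```python
from itertools import combinations


def closure(generators):
    """Smallest family containing the given finite sets and all of their subsets."""
    family = set()
    for s in generators:
        items = sorted(s)
        for k in range(len(items) + 1):
            family.update(frozenset(c) for c in combinations(items, k))
    return family


def is_simplicial_complex(vertices, simplices):
    """Check that every element is a subset of V and that all of its subsets are present."""
    family = {frozenset(s) for s in simplices}
    vertex_set = set(vertices)
    for s in family:
        if not s <= vertex_set:
            return False
        for k in range(len(s) + 1):
            for face in combinations(sorted(s), k):
                if frozenset(face) not in family:
                    return False
    return True


def skeleton(family, n):
    """Sets with at most n + 1 elements (assumed convention for the n-skeleton)."""
    return {s for s in family if len(s) <= n + 1}


V = {"a", "b", "c", "d"}
sigma = closure([{"a", "b", "c"}, {"c", "d"}])   # a triangle with an edge attached
one_skeleton = skeleton(sigma, 1)
edges = {s for s in one_skeleton if len(s) == 2}  # (V, edges) is a simple graph

print(len(sigma), len(one_skeleton), sorted(sorted(e) for e in edges))
```

The two element sets of the 1-skeleton form the edge set of a simple undirected graph on V, which connects the last two definitions in the paragraph.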

The main theorem in this chapter is the separating hyperplane theorem. Roughly, this asserts that for a convex set with nonempty interior and a point outside that set there is a hyperplane (that is, a maximal proper affine subspace) that has the set in one of its associated closed half spaces and the point in the other. Farkas’ lemma, which is the technical linchpin of the theory of linear programming, is the application of this result to polyhedral cones.

1.3.2 Chapter 3: Computing Fixed Points

Chapter 3 provides a proof of Brouwer’s fixed point theorem, but if that was all there was to it, it could be much shorter. Let Δ be a simplex, and let f : Δ → Δ be a continuous function. Roughly speaking, a point x ∈ Δ is an ε-approximate fixed point if f(x) is in the ε-ball around x. If $\{\varepsilon_r\}$ is a sequence of positive numbers converging to 0 and for each r, $x_r$ is $\varepsilon_r$-approximately fixed, then the limit of a convergent subsequence of $\{x_r\}$ is a fixed point, by continuity. It turns out that the existence of a convergent subsequence depends on the axiom of choice, which we explain (along with Zorn’s lemma and the well ordering principle, which are equivalent) in Sect. 3.1.

The hard part of proving the BFPT is showing that for any ε there is an ε-approximate fixed point. The most important proofs do this by providing an algorithm that computes an ε-approximate fixed point. We describe three such algorithms: (a) what is usually called the Scarf algorithm, which is a pivoting procedure that passes between adjacent subsimplices of a simplicial subdivision of Δ; (b) the independent set algorithm, which is what Scarf originally found; (c) a recent algorithm due to McLennan and Tourky, which has the Lemke-Howson algorithm for finding a Nash equilibrium of a two person game as a subroutine. In addition we describe homotopy methods, which are in a technical sense computational procedures, rather than algorithms, because (as they are usually implemented) there is no guarantee of eventually halting with a valid output, but which are important in practical computation. Finally we describe recent work by computer scientists on the computational complexity of finding an approximate fixed point. Chapter 3 is off to the side of the main thrust of our work, and can be bypassed (after Sect. 3.1) without subsequent loss of understanding, but these algorithms and the related computational theory are certainly an important aspect of the theory of fixed points. In addition, it provides an opportunity to learn about the basic concepts of computer science, and to get some sense of why computer scientists are currently doing so much research related to economic theory.
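To make the idea of computing an ε-approximate fixed point concrete, here is a minimal sketch for the one-dimensional simplex Δ = [0, 1]. It is not the Scarf algorithm or any of the other procedures listed above; it is plain bisection, which can be viewed as a one-dimensional analogue in which each point carries a “label” according to whether f(x) ≥ x, and the search keeps an interval whose endpoints have different labels. The function name and the step limit are our own choices.

```python
def approximate_fixed_point(f, eps, max_steps=200):
    """Return x in [0, 1] with |f(x) - x| < eps, for continuous f: [0,1] -> [0,1].

    Invariant: f(a) >= a and f(b) <= b, which holds initially because f maps
    [0, 1] into itself. Continuity forces the displacement at the endpoints to
    shrink as the interval shrinks, so the loop stops for continuous f.
    """
    a, b = 0.0, 1.0
    for _ in range(max_steps):
        if abs(f(a) - a) < eps:
            return a
        if abs(f(b) - b) < eps:
            return b
        mid = (a + b) / 2
        if f(mid) >= mid:
            a = mid  # label "moves right": keep it as the left endpoint
        else:
            b = mid  # label "moves left": keep it as the right endpoint
    raise RuntimeError("no approximate fixed point found; is f continuous?")


# Example: f(x) = (x^2 + 1) / 3 maps [0, 1] into [1/3, 2/3].
x_star = approximate_fixed_point(lambda x: (x * x + 1) / 3, 1e-6)
```

In higher dimensions there is no linear order to bisect along, which is one way to see why the pivoting procedures described in the chapter are needed.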

1.3.3 Chapter 4: Topologies on Sets

Insofar as a correspondence is a set valued function, it makes sense to ask whether notions such as upper hemicontinuity can be understood as continuity in the usual sense if we impose an appropriate topology on the relevant set of subsets of the range space. In addition, a correspondence has an associated set, namely its graph, and we can study the topology on correspondences induced by a topology on the closed or compact subsets of the cartesian product of the domain and range.

Chapter 4 studies topologies on spaces of sets. Section 4.1 introduces the relevant concepts from point set topology, and Sect. 4.2 defines a multitude of topologies on the spaces of closed and compact subsets of a given topological space, the main idea being to specify that a neighborhood of a given set is the set of sets that are contained in some neighborhood of the given set. This can be refined by specifying that a neighborhood consists of those sets that are contained in the neighborhood of the given set and, in addition, have nonempty intersections with each of finitely many open sets. This more refined topology is called the Vietoris topology. A very early result of Vietoris (1923) is that if the given space is compact, then so is the space of compact subsets endowed with the Vietoris topology. For a metric space the Hausdorff distance between two compact sets is the infimum of the set of ε > 0 such that each set is contained in the open ε-ball around the other. This is a metric, and its induced topology is the Vietoris topology. The continuity properties of elementary operations on sets (union, intersection, cartesian product, the function x ↦ {x}, the functions K ↦ f(K) and D ↦ $f^{-1}(D)$ where f is a continuous function) are studied. It is shown that if S is a Vietoris compact set of compact sets, then the union of the elements of S is compact.
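For finite (hence compact) subsets of Euclidean space the Hausdorff distance can be computed directly. The sketch below is illustrative: the function name is ours, and it uses the familiar max-min formula, which for nonempty finite sets gives the same number as the infimum in the definition above.

```python
import math


def hausdorff_distance(A, B, dist=math.dist):
    """Hausdorff distance between nonempty finite sets of points.

    directed(S, T) is the smallest radius r such that every point of S is
    within distance r of some point of T. S lies in the open eps-ball around
    T exactly when eps > directed(S, T), so the infimum of the admissible eps
    is the larger of the two directed distances.
    """
    def directed(S, T):
        return max(min(dist(s, t) for t in T) for s in S)

    return max(directed(A, B), directed(B, A))


A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]
d = hausdorff_distance(A, B)  # the extra point (1, 2) is at distance 2 from A
```

On singletons the function reduces to the underlying metric, which reflects the fact that x ↦ {x} is an isometric embedding of the space into its space of compact subsets.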

1.3.4 Chapter 5: Topologies on Functions and Correspondences

Section 5.1 characterizes upper hemicontinuity and continuity of a correspondence F : X → Y in terms of the continuity of the map x ↦ F(x) when the space of compact subsets of Y is endowed with the corresponding topology. (There is no such characterization of lower hemicontinuity by itself.) In the strong upper topology of such correspondences a neighborhood of a given correspondence is a set of correspondences containing those correspondences whose graphs are contained in some neighborhood of the graph of the given correspondence. The weak upper topology is the quotient topology induced by the maps $F \mapsto F|_K$ where K ⊂ X is compact and the set of correspondences from K to Y has the strong upper topology. (Concretely, the weak upper topology is the coarsest topology such that all such maps are continuous.) We study the continuity properties of restriction to a subdomain, composition, and cartesian products, with respect to these topologies. A conceptually crucial result is that a correspondence H : X × [0, 1] → Y is upper hemicontinuous (i.e., a homotopy of correspondences) if and only if each $H_t$ is upper hemicontinuous and the map $t \mapsto H_t$ is continuous. We also study some properties of continuous functions that do not generalize to correspondences.

1.3.5 Chapter 6: Metric Space Theory

We are assuming that the reader is conversant with the basic concepts and facts concerning metric spaces, but we will need some more advanced results. Fix a topological space X. A family of subsets of X is locally finite if every point in X has a neighborhood that intersects only finitely many members of the family. A refinement of an open cover of X is a second open cover, each of whose elements is contained in some element of the given open cover. The space X is paracompact if each of its open covers is refined by an open cover that is locally finite. A brief and simple argument of Mary Ellen Rudin (1969) proves that every metric space is paracompact. (This was originally shown by Stone 1948.)

A partition of unity subordinate to a locally finite open cover $\{U_\alpha\}$ of X is a collection of functions $\psi_\alpha : X \to [0, 1]$ such that $\sum_\alpha \psi_\alpha(x) = 1$ for all x and $\psi_\alpha(x) = 0$ for all α and $x \notin U_\alpha$. Partitions of unity are used in many constructions, so we study results guaranteeing that they exist.
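A concrete instance may help. The sketch below builds a partition of unity for a finite cover of a subset of the real line by open intervals, using the standard recipe of normalizing the distances to the complements of the cover elements. The function names and the example cover are our own, and the resulting functions are continuous but not smooth.

```python
def distance_to_complement(x, interval):
    """Distance from x to the complement of the open interval (a, b); zero outside it."""
    a, b = interval
    return max(0.0, min(x - a, b - x))


def partition_of_unity(cover):
    """Functions psi_k with psi_k = 0 off cover[k] and sum_k psi_k = 1 on the union."""
    def make(k):
        def psi(x):
            total = sum(distance_to_complement(x, U) for U in cover)
            if total == 0:
                raise ValueError("x is not in the union of the cover")
            return distance_to_complement(x, cover[k]) / total
        return psi

    return [make(k) for k in range(len(cover))]


# Three overlapping open intervals covering [0, 1].
cover = [(-0.1, 0.4), (0.3, 0.7), (0.6, 1.1)]
psis = partition_of_unity(cover)
```

The denominator is positive at every point of the union because each such point lies in some open cover element, and it is a finite sum because the cover is finite; for a general locally finite cover the same formula works since only finitely many terms are nonzero near any given point.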

Section 6.3 introduces the notion of a topological vector space, which is a vector space endowed with a Hausdorff topology such that vector addition and scalar multiplication are continuous. Functional analysis is the (vast) subdiscipline of mathematics that studies such spaces and various types of maps between them. Fortunately we will need to know only the most basic definitions and facts, but the exercises sketch the proofs of several important results. A topological vector space is locally convex if every neighborhood of a point contains a convex neighborhood. A normed space is a topological vector space endowed with a norm and the derived metric. It is easy to see that a normed space is locally convex. A Banach space is a complete (every Cauchy sequence is convergent) normed space, and a Hilbert space is a Banach space whose norm is derived from an inner product. We will need to know that a metric space can be isometrically embedded in a Banach space, and a separable metric space can be isometrically embedded in a separable Hilbert space.

The Tietze extension theorem asserts that if X is a normal topological space, A is a closed subset of X, and f : A → [0, 1] is continuous, then f has a continuous extension to all of X. There is an obvious extension for maps into finite dimensional spaces, but since we will be dealing with spaces that are infinite dimensional, we will need the following result of Dugundji (1951): if X is a metric space, A is a closed subset of X, Y is a locally convex topological vector space, and f : A → Y is continuous, then f has a continuous extension to X such that f(X) is contained in the convex hull of f(A).

1.3.6 Chapter 7: Essential Sets of Fixed Points

Fixed points come in different flavors. Figure 1.3 shows a function f : [0, 1] → [0, 1] with two fixed points, s and t. If we perturb the function slightly by adding a small positive constant, s “disappears” in the sense that the perturbed function does not have a fixed point anywhere near s, but a function close to f has a fixed point near t. More precisely, if X is a topological space and f : X → X is continuous, a fixed point x∗ of f is essential if, for any neighborhood U of x∗, there is a neighborhood V of the graph of f such that any continuous f′ : X → X whose graph is contained in V has a fixed point in U. If a fixed point is not essential, then we say that it is inessential. These concepts were introduced by Fort (1950).
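The phenomenon just described can be reproduced numerically. The sketch below uses a hypothetical cubic of our own choosing, not the function drawn in Fig. 1.3: its graph touches the diagonal at s = 0.3 and crosses it at t = 0.8. After adding the constant 0.01, the displacement |f(x) − x| stays bounded away from zero near s, while a fixed point persists near t.

```python
def f(x):
    """Maps [0, 1] into itself; tangent to the diagonal at 0.3, crosses it at 0.8."""
    return x + (x - 0.3) ** 2 * (0.8 - x)


def shifted(delta):
    """The perturbation of f obtained by adding the constant delta."""
    return lambda x: f(x) + delta


def min_displacement(h, lo, hi, n=2000):
    """Smallest value of |h(x) - x| over an evenly spaced grid on [lo, hi]."""
    return min(abs(h(lo + (hi - lo) * k / n) - lo - (hi - lo) * k / n) for k in range(n + 1))


g = shifted(0.01)
near_s = min_displacement(g, 0.2, 0.4)  # stays at or above 0.01: no fixed point near s
near_t = min_displacement(g, 0.7, 0.9)  # close to zero: a fixed point survives near t
```

Near s the displacement of the shifted function is $(x - 0.3)^2 (0.8 - x) + 0.01 \ge 0.01$, so the fixed point s is inessential, whereas the sign change of the displacement near t cannot be removed by any sufficiently small perturbation.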

There need not be an essential fixed point. The function shown in Fig. 1.4 has an interval of fixed points. If we shift the function down, there will be a fixed point near the lower endpoint of this interval, and if we shift the function up there will be a fixed point near the upper endpoint.

This example suggests that we might do better to work with sets of fixed points. A set K of fixed points of a function f : X → X is essential if it is closed, it has a neighborhood that contains no other fixed points, and for any neighborhood U of K, there is a neighborhood V of the graph of f such that any continuous f′ : X → X whose graph is contained in V has a fixed point in U.

Fig. 1.3 An inessential fixed point

Fig. 1.4 An essential set of fixed points whose elements are inessential

A problem with this concept is that “large” essential sets are not very useful. For example, if X is compact and has the fixed point property, then the set of all fixed points of f is essential. It seems that we should really be interested in sets of fixed points that are either essential and connected⁴ or essential and minimal in the sense of not having a proper subset that is also essential.

The Fan-Glicksberg theorem is the infinite dimensional version of the Kakutani fixed point theorem: if V is a locally convex space, X is a nonempty compact convex subset of V, and F : X → X is an upper hemicontinuous convex valued correspondence, then F has a fixed point.

The central result of Chap. 7, which is due to Kinoshita (1953), states that any essential set of fixed points contains a minimal essential set, and that minimal essential sets are connected. Kinoshita’s argument is by contradiction. Suppose that the set K of fixed points of F is the union of disjoint compact sets $K_1, \dots, K_r$. If no $K_i$ was essential, then for each i there would be a destabilizing perturbation of F, and combining these into a single function or correspondence would give a perturbation of F that had no fixed points at all, contrary to the Fan-Glicksberg theorem. It would be possible to work only with perturbations of F that are functions, but in Sect. 7.3 we explain how to define convex combinations of convex valued correspondences, with weights that vary continuously as we move through X.

The theory of refinements of Nash equilibrium (e.g., Selten 1975; Myerson 1978; Kreps and Wilson 1982; Kohlberg and Mertens 1986; Mertens 1989, 1991; Govindan and Wilson 2008) has many concepts that amount to a weakening of the notion of an essential set, insofar as the set is required to be robust with respect to only certain types of perturbations of the function or correspondence. In particular, Jiang (1963) pioneered the application of the concept to game theory, defining an essential Nash equilibrium and an essential set of Nash equilibria in terms of robustness with respect to perturbations of the best response correspondence induced by perturbations of the payoffs. The mathematical foundations of such concepts are treated in Sect. 7.4.

1.3.7 Chapter 8: Retracts

A topological space X is said to have the fixed point property if every continuous function from X to itself has a fixed point. Whether a compact contractible metric space necessarily has the fixed point property was for many years an open problem, but eventually Kinoshita (1953) produced a lovely example of a compact contractible $X \subset \mathbb{R}^3$ and a continuous f : X → X with no fixed points. In order to make fixed point theory “work,” we have to impose some additional restriction.

⁴ We recall that a subset K of a topological space X is connected if there do not exist two disjoint open sets $U_1$ and $U_2$ with $K \cap U_1 \neq \emptyset \neq K \cap U_2$ and $K \subset U_1 \cup U_2$.


If X is a metric space, A is a subset of X, and r is a continuous function r : X → A with r(a) = a for all a ∈ A, then we say that r is a retraction and A is a retract of X. We say that A is a neighborhood retract in X if it is a retract of some open U ⊂ X that contains A. A metric space A is an absolute neighborhood retract (ANR) if h(A) is a neighborhood retract in X whenever X is a metric space, h : A → X is an embedding, and h(A) is a closed subset of X. This class of spaces was introduced by Borsuk in his Ph.D. thesis, and the book Borsuk (1967) is still a standard reference for the topic.

Being an ANR may sound like a quite stringent condition, but in fact it is quite permissive. A metric space is an ANR if it is (homeomorphic to) a retract of a relatively open subset of a convex subset of a locally convex space. In particular, open and convex subsets of locally convex spaces are ANR’s. A metric space A is an ANR if it has an open cover $\{U_i\}$ such that each $U_i$ is an ANR. Thus a manifold (the subject of Chap. 10) is an ANR. We will also see that finite simplicial complexes are ANR’s.

A metric space A is an absolute retract (AR) if h(A) is a retract of X whenever X is a metric space, h : A → X is an embedding, and h(A) is a closed subset of X. It turns out that an ANR is an AR if and only if it is contractible. Eventually we will see that a nonempty compact AR has the fixed point property.

The domination theorem has already been mentioned. To repeat, it asserts that if X is an ANR, C ⊂ X is compact, and ε > 0, then there is a finite dimensional vector space V, an open U ⊂ V, continuous functions ϕ : C → U and ψ : U → X, and a homotopy η : C × [0, 1] → X, such that $\eta_0 = \mathrm{Id}_C$, $\eta_1 = \psi \circ \varphi$, and for all (x, t) the distance from x to η(x, t) is less than ε.

1.3.8 Chapter 9: Approximation

Suppose that X and Y are ANR’s, X is separable, and C and D are compact subsets of X with C contained in the interior of D. Let F : D → Y be an upper hemicontinuous contractible valued correspondence, and let W ⊂ C × Y be a neighborhood of the graph of the restriction of F to C. Chapter 9 is devoted to the proof that there are:

(a) a continuous f : C → Y with Gr(f) ⊂ W;
(b) a neighborhood W′ of Gr(F) such that, for any two continuous functions $f_0, f_1 : D \to Y$ with $\mathrm{Gr}(f_0), \mathrm{Gr}(f_1) \subset W'$, there is a homotopy h : C × [0, 1] → Y with $h_0 = f_0|_C$, $h_1 = f_1|_C$, and $\mathrm{Gr}(h_t) \subset W$ for all 0 ≤ t ≤ 1.

The proof has three phases, the first two of which are due to Mas-Colell (1974), who established the result when X is a simplicial complex and Y is a locally convex space. The first step is a particularly intricate and ingenious construction. The third phase, which is from McLennan (1991), uses the domination theorem to pass to a setting where X and Y are ANR’s.


1.3.9 Chapter 10: Manifolds

An m-dimensional manifold is a topological space M that is “locally homeomorphic” to $\mathbb{R}^m$. Concretely, this means that there is a collection $\{\varphi_i : U_i \to M\}_{i \in I}$ of embeddings of open sets $U_i \subset \mathbb{R}^m$ such that $\{\varphi_i(U_i)\}$ is an open cover of M. Manifolds appear in many contexts, and have been a major theme of 20th century mathematics and physics.

For each i, j ∈ I there is a transition map

$$\varphi_j^{-1} \circ \varphi_i : \varphi_i^{-1}(\varphi_j(U_j)) \to U_j.$$

If, for some 1 ≤ r ≤ ∞, each of these transition maps is $C^r$, then they give rise to a sense of what it means for a function $f : M \to \mathbb{R}$ to be $C^r$. The manifold (endowed with this sense of what a $C^r$ function is) is a $C^r$ manifold.

Embedding theorems of Whitney imply that no (relevant) generality is lost if we assume that M is a subset of some $\mathbb{R}^k$, and that the $C^r$ structure is obtained from this embedding, in the sense that each $\varphi_i$ is $C^r$ when regarded as a function with range $\mathbb{R}^k$, and at each point $x \in U_i$ the derivative $D\varphi_i(x)$ has full rank. Insofar as many arguments require a rephrasing of the given data in terms of Euclidean spaces, having one close at hand is often convenient.

If N is a second n-dimensional $C^r$ manifold, one may define (in a variety of equivalent ways) what it means for a function f : M → N to be $C^r$. The identity function on M is $C^r$, and a composition of two $C^r$ functions is $C^r$. That is (in the language of contemporary higher mathematics) $C^r$ manifolds and $C^r$ maps between them constitute a category.

Conceptually, differentiable manifolds are the natural setting for differential calculus. To implement this we define the tangent space $T_pM$ at a point p ∈ M to be the image of $D\varphi_i(x)$ for any $\varphi_i$ and $x \in U_i$ such that $\varphi_i(x) = p$. The tangent bundle of M is

$$TM := \bigcup_{p \in M} \{p\} \times T_pM.$$

It is a $C^{r-1}$ manifold contained in $\mathbb{R}^k \times \mathbb{R}^k$. The derivative $Df(p) : T_pM \to T_{f(p)}N$ is defined naturally, and these derivatives combine to give a $C^{r-1}$ function

$$Tf : TM \to TN.$$

We have $T(\mathrm{Id}_M) = \mathrm{Id}_{TM}$, and if P is a p-dimensional $C^r$ manifold and g : N → P is a second $C^r$ map, then the chain rule implies that $T(g \circ f) = Tg \circ Tf$. Categorically speaking, T is a covariant functor from the category of $C^r$ manifolds and $C^r$ functions to the category of $C^{r-1}$ manifolds and $C^{r-1}$ functions.

If $P \subset \mathbb{R}^k$ is a p-dimensional $C^r$ manifold, and P ⊂ M, then P is a submanifold of M. If f : M → N is a $C^r$ map, p ∈ M is an immersion (submersion, diffeomorphism) point of f if Df(p) is injective (surjective, bijective). If every point of M is an immersion (submersion, local diffeomorphism) point, then f is an immersion (submersion, local diffeomorphism). A submersion point of f is also said to be a regular point of f, and q ∈ N is a regular value if every element of $f^{-1}(q)$ is a regular point. If this is the case, the regular value theorem (this is the “translation” of the implicit function theorem in this context) asserts that $f^{-1}(q)$ is an (m − n)-dimensional $C^r$ submanifold of M. More generally, f is transversal to a q-dimensional $C^r$ submanifold Q ⊂ N if, for every $p \in f^{-1}(Q)$,

$$\operatorname{im} Df(p) + T_{f(p)}Q = T_{f(p)}N,$$

and the transversality theorem asserts that if this is the case, then $f^{-1}(Q)$ is an (m − n + q)-dimensional $C^r$ submanifold of M.
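A standard illustration of the regular value theorem, with a map of our own choosing rather than one taken from the text, is the unit sphere as a level set. Take $M = \mathbb{R}^3$ (so m = 3) and $N = \mathbb{R}$ (so n = 1):

```latex
f : \mathbb{R}^3 \to \mathbb{R}, \qquad f(x) = x_1^2 + x_2^2 + x_3^2, \qquad Df(x) = 2\,(x_1,\ x_2,\ x_3).
\\[4pt]
Df(x) \text{ is surjective} \iff x \neq 0, \quad \text{so every } q \neq 0 \text{ is a regular value (vacuously when } q < 0\text{)}.
\\[4pt]
f^{-1}(1) = S^2, \qquad \dim f^{-1}(1) = m - n = 3 - 1 = 2.
```

The value q = 0 is not regular, because the origin lies in $f^{-1}(0)$ and $Df(0) = 0$; correspondingly $f^{-1}(0)$ is a single point rather than a 2-dimensional submanifold.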

If f is a local diffeomorphism and a bijection, then it is a $C^r$ diffeomorphism, and M and N are $C^r$ diffeomorphic. Let P be a p-dimensional $C^r$ submanifold of M. The tubular neighborhood theorem asserts that if r ≥ 2, then there is a neighborhood U ⊂ M of P and a $C^{r-1}$ diffeomorphism $\iota : P \times \mathbb{R}^{m-p} \to U$ such that for each p ∈ P, ι(p, 0) = p and the image of the derivative of ι(p, ·) at the origin is the orthogonal complement of $T_pP$ in $T_pM$ when these are understood as linear subspaces of $\mathbb{R}^k$. Composing $\iota^{-1}$ with the projection $P \times \mathbb{R}^{m-p} \to P$ gives a $C^{r-1}$ projection U → P that restricts to $\mathrm{Id}_P$.

If $\{U_\alpha\}_{\alpha \in A}$ is a collection of open subsets of a finite dimensional vector space X and $U := \bigcup_\alpha U_\alpha$, then there is a partition of unity for U subordinate to $\{U_\alpha\}$ whose elements are all $C^\infty$ functions. If M and N are $C^r$ manifolds and 0 ≤ s ≤ r, let $C^s_S(M, N)$ be the space of $C^s$ functions from M to N endowed with the strong upper topology. Suppose that $M \subset \mathbb{R}^k$ and $N \subset \mathbb{R}^\ell$. Using suitable $C^\infty$ partitions of unity, one can show that $C^r_S(M, \mathbb{R}^\ell)$ is dense in $C_S(M, \mathbb{R}^\ell)$. Let $V \subset \mathbb{R}^\ell$ be a tubular neighborhood of N. A continuous f : M → N can be well approximated by a $C^r$ function from M to V, which may be composed with the projection from V to N, from which it follows that $C^{r-1}_S(M, N)$ is dense in $C_S(M, N)$.

An m-dimensional manifold with boundary, or ∂-manifold, is a topological space M that “looks like” the half space $\{ x \in \mathbb{R}^m : x_1 \ge 0 \}$ in some neighborhood of each of its points. In detail, this means that there is a collection $\{\varphi_i : U_i \to M\}_{i \in I}$ of embeddings of open sets $U_i \subset \{ x \in \mathbb{R}^m : x_1 \ge 0 \}$ such that $\{\varphi_i(U_i)\}$ is an open cover of M. The transversality theorem generalizes naturally to $C^r$ functions whose domains are $C^r$ ∂-manifolds. An obvious fact (which still must be proved) is that a compact 1-dimensional $C^r$ ∂-manifold is $C^r$ diffeomorphic to either the circle or the unit interval.

1.3.10 Chapter 11: Sard’s Theorem

For a metric space (X, d), S ⊂ X, and α > 0, we say that S has α-dimensional Hausdorff measure zero if, for any ε > 0, it is possible to find points $x_1, x_2, \dots$ in X and radii $r_1, r_2, \dots > 0$ such that S is contained in the union of the balls centered at $x_i$ of radius $r_i$, and $\sum_i r_i^\alpha < \varepsilon$. The final (for finite dimensions) version of Sard’s theorem, due to Federer, asserts that if $U \subset \mathbb{R}^m$ is open, $f : U \to \mathbb{R}^n$ is $C^r$, and $R_p$ is the set of points x ∈ U such that the rank of Df(x) is less than or equal to p, then $f(R_p)$ has α-dimensional Hausdorff measure zero for all $\alpha \ge p + \frac{m-p}{r}$.

A set $S \subset \mathbb{R}^m$ has measure zero if it has m-dimensional Hausdorff measure zero. A countable union of sets of measure zero has measure zero. A set of measure zero has empty interior, so the complement is dense. If $U \subset \mathbb{R}^m$ is open, $f : U \to \mathbb{R}^m$ is $C^1$, and S ⊂ U has measure zero, then f(S) has measure zero. For a set $S \subset \mathbb{R}^m$ and $t \in \mathbb{R}$ let $S(t) := \{ (x_2, \dots, x_m) : (t, x_2, \dots, x_m) \in S \}$ be the “slice” of S above t, and let P(S) be the set of $t \in \mathbb{R}$ such that S(t) does not have (m − 1)-dimensional measure zero. Then S has measure zero if and only if P(S) has 1-dimensional measure zero. In the absence of the machinery of measure theory, this special case of Fubini’s theorem is a critical technical tool. The concept of a set of measure zero transfers easily to manifolds.

Suppose that $U \subset \mathbb{R}^m$ is open and $f : U \to \mathbb{R}^n$ is $C^r$. A point $y \in \mathbb{R}^n$ is a critical value of f if there is a point $x \in f^{-1}(y)$ that is not a regular point of f. Sard’s theorem asserts that if r > max{m − n, 0}, then the set of critical values of f has measure zero. (This is the case α = n and p = n − 1 in Federer’s theorem.) We present the proof given by Milnor (1965b) and Sternberg (1983).
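In the simplest case m = n = 1 the critical values can be listed explicitly. The sketch below, with a function name of our own, does this for a real cubic: the points that are not regular are the roots of the derivative, and the critical values are the images of those roots. The resulting set is finite, which is of course far stronger than having measure zero.

```python
import math


def critical_values_of_cubic(a, b, c, d):
    """Critical values of f(x) = a x^3 + b x^2 + c x + d on the real line (a != 0)."""
    def f(x):
        return ((a * x + b) * x + c) * x + d

    # f'(x) = 3a x^2 + 2b x + c; its real roots are the non-regular points.
    disc = (2 * b) ** 2 - 4 * (3 * a) * c
    if disc < 0:
        return []  # f' never vanishes, so every point is regular
    roots = {(-2 * b + sign * math.sqrt(disc)) / (6 * a) for sign in (1, -1)}
    return sorted(f(x) for x in roots)


values = critical_values_of_cubic(1, 0, -3, 0)  # f(x) = x^3 - 3x
```

For $f(x) = x^3 - 3x$ the derivative vanishes at x = ±1, giving the two critical values −2 and 2; every other real number is a regular value.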

Theorem 11.4 is a variant of the Thom transversality theorem. Let L, M, and N be smooth (that is, $C^\infty$) manifolds, let P be a smooth submanifold of N, and let π : N → L be a submersion such that $\pi|_P$ is also a submersion. Theorem 11.4 asserts that if f : M → N is continuous, π ∘ f is smooth, and A ⊂ M × N is an open neighborhood of Gr(f), then there is a smooth function f′ : M → N that is transversal to P with Gr(f′) ⊂ A and π ∘ f′ = π ∘ f. The proof is a rather elaborate construction that repeatedly modifies the function on small sets, with a crucial application of Sard’s theorem.

1.3.11 Chapter 12: Degree Theory

If X is a finite dimensional vector space, a nonsingular linear transformation ℓ : X → X is orientation preserving if its determinant is positive. Two ordered bases $v_1, \dots, v_m$ and $v'_1, \dots, v'_m$ have the same orientation if the linear transformation taking each $v_i$ to $v'_i$ is orientation preserving. Since the determinant of the inverse of a linear transformation is the multiplicative inverse of its determinant, and the determinant of a composition of two linear transformations is the product of their determinants, “have the same orientation” is an equivalence relation with two equivalence classes, which are orientations of X. An oriented vector space is a vector space that has been endowed with an orientation, and an ordered basis of such a space is positively oriented (negatively oriented) if it is (is not) an element of the orientation. If X and Y are oriented vector spaces of the same dimension, a nonsingular linear transformation ℓ : X → Y is orientation preserving (orientation reversing) if ℓ takes positively oriented ordered bases of X to positively (negatively) oriented ordered bases of Y.
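The test for “same orientation” reduces to a sign computation. In the sketch below (our own helper names, with bases given in coordinates as lists of vectors in $\mathbb{R}^m$), the matrix B with rows $v_1, \dots, v_m$ and the matrix B′ with rows $v'_1, \dots, v'_m$ satisfy $\det(\text{map taking } v_i \text{ to } v'_i) = \det B' / \det B$, so the two bases have the same orientation exactly when the two determinants have the same sign.

```python
from fractions import Fraction


def determinant(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for c in range(n):
        pivot = next((i for i in range(c, n) if m[i][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            det = -det  # a row swap flips the sign
        det *= m[c][c]
        for i in range(c + 1, n):
            factor = m[i][c] / m[c][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[c])]
    return det


def same_orientation(basis, other):
    """True iff the linear map taking basis[i] to other[i] has positive determinant."""
    d1, d2 = determinant(basis), determinant(other)
    if d1 == 0 or d2 == 0:
        raise ValueError("both arguments must be bases")
    return (d1 > 0) == (d2 > 0)


standard = [(1, 0), (0, 1)]
swapped = [(0, 1), (1, 0)]    # exchanging two basis vectors reverses orientation
rotated = [(0, 1), (-1, 0)]   # a quarter turn preserves orientation
```

The multiplicativity of the determinant, which the paragraph above uses to show that this is an equivalence relation, is what makes the sign comparison legitimate.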

An orientation of a smooth manifold M is a “continuous” assignment of an orientation to each of the tangent spaces $T_pM$. If such an assignment exists, then M is orientable, and each connected component of M has two orientations. An oriented manifold is a smooth manifold that has been endowed with an orientation. In order to say precisely what this means we need to be able to construct continuous assignments of ordered bases along a path in M. A proper treatment of the geometric issues (the Gram-Schmidt process, Grassmann manifolds of linear subspaces, the projection of a point onto a subspace as a joint function of the point and the subspace, the construction of paths) is given in Sects. 12.1 and 12.2.

If M and N are m-dimensional oriented manifolds, U ⊂ M is open, f : U → N is smooth, and p is a regular point of f, we say that f is orientation preserving (orientation reversing) at p if $Df(p) : T_pM \to T_{f(p)}N$ is orientation preserving (reversing). If C ⊂ M is compact, f : C → N is continuous, and q ∈ N, then f is degree admissible over q if $f^{-1}(q) \cap \partial C = \emptyset$ where $\partial C = C \cap \overline{M \setminus C}$ is the topological boundary of C. If, in addition, f is smooth and q is a regular value of f, then the degree of f over q is the number of $p \in f^{-1}(q)$ at which f is orientation preserving minus the number of $p \in f^{-1}(q)$ at which f is orientation reversing. For a continuous f : C → N that is degree admissible over q the degree of f over q is the degree of f′ over q for nearby f′ that are smooth and have q as a regular value. This concept is uniquely characterized by properties called Normalization, Additivity, and Continuity, which are analogues of the properties with these names that were described in Sect. 1.1. The degree is well behaved with respect to compositions and cartesian products of functions.
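In dimension one the definition of the degree can be carried out by hand. The sketch below is a numerical illustration under assumptions of our own: $M = N = \mathbb{R}$ with the usual orientation, C = [a, b] so that ∂C = {a, b}, q a regular value, and a grid fine enough to separate the preimages of q. An upward crossing of the level q is a preimage at which f is orientation preserving, and a downward crossing is one at which it is orientation reversing.

```python
def oriented_preimage_counts(f, q, a, b, n=4000):
    """Count (orientation preserving, orientation reversing) preimages of q in [a, b]."""
    values = [f(a + (b - a) * k / n) - q for k in range(n + 1)]
    if values[0] == 0 or values[-1] == 0:
        raise ValueError("f is not degree admissible over q on [a, b]")
    preserving = reversing = 0
    previous = values[0]
    for v in values[1:]:
        if v == 0:
            continue  # exact hit on a grid point; the crossing is seen at the next value
        if previous < 0 < v:
            preserving += 1
        elif v < 0 < previous:
            reversing += 1
        previous = v
    return preserving, reversing


def degree(f, q, a, b):
    """Degree of f over q on [a, b] as defined for smooth f with q a regular value."""
    preserving, reversing = oriented_preimage_counts(f, q, a, b)
    return preserving - reversing


cubic = lambda x: x ** 3 - 3 * x
counts = oriented_preimage_counts(cubic, 0.0, -2.0, 2.0)
```

For the cubic there are two preserving preimages (at ±√3) and one reversing preimage (at 0), so the degree over 0 is 1, the same as for the identity map, to which the cubic is homotopic through maps that stay admissible over 0 on [−2, 2].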

1.3.12 Chapter 13: The Fixed Point Index

The contents of this chapter have, for the most part, already been surveyed in Sect. 1.1. While that survey is accurate in a conceptual sense, in the actual execution we use the previously developed theory of the degree as the source of the index for continuous functions f : C → V where V is a finite dimensional vector space, C ⊂ V is compact, and f has no fixed points in ∂C: in this setting the fixed point index of f is just the degree over zero of $\mathrm{Id}_V - f$.

If one is only interested in achieving a complete understanding of the proof of the existence and uniqueness of the index, there is a somewhat simpler path that the reader may follow. Chapter 12 can be read from the perspective of the Euclidean setting, without reference to the material on manifolds from Chap. 10. In the proof of Theorem 12.2, instead of invoking Thom transversality (Theorem 11.4) a simple appeal to Sard’s theorem suffices, so the rather intricate argument of Sect. 11.5 can also be bypassed.
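The description of the index as the degree over zero of $\mathrm{Id}_V - f$ can be tried out when $V = \mathbb{R}$ and C = [0, 1]. In the sketch below, which uses a hypothetical cubic example and helper names of our own, each fixed point at which x − f(x) crosses zero upward contributes +1 and each downward crossing contributes −1.

```python
def fixed_point_indices(f, a=0.0, b=1.0, n=1000):
    """List of (approximate fixed point, index) pairs for f on [a, b].

    Assumes that f has no fixed point at a or b and that x - f(x) changes
    sign at each fixed point (so each one is regular for Id - f).
    """
    previous_x, previous = a, a - f(a)
    if previous == 0 or b - f(b) == 0:
        raise ValueError("fixed point on the boundary of [a, b]")
    result = []
    for k in range(1, n + 1):
        x = a + (b - a) * k / n
        v = x - f(x)
        if v == 0:
            continue  # exact hit; the crossing is recorded at the next grid point
        if previous < 0 < v:
            result.append(((previous_x + x) / 2, 1))
        elif v < 0 < previous:
            result.append(((previous_x + x) / 2, -1))
        previous_x, previous = x, v
    return result


# Hypothetical example: fixed points at 0.2, 0.5 and 0.8, and f maps [0, 1] into itself.
f = lambda x: x - (x - 0.2) * (x - 0.5) * (x - 0.8)
indices = fixed_point_indices(f)
```

The three fixed points have indices +1, −1, +1. Their sum is +1, and in this one dimensional setting that is forced for any f mapping [0, 1] into its interior, since x − f(x) is negative at 0 and positive at 1.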

In the more general versions of the theorem asserting existence and uniqueness of the index, what is unique is the index as a collection of functions defined on some large class of spaces. Because Commutativity is used to transfer the index from one space to another, results asserting that a given space has a unique index do not follow automatically. If one were to use uniqueness in a proof, by constructing a function that had the properties of the index and then invoking uniqueness in order to assert that it is in fact the index, one would need to extend the function to all the spaces in the class. This motivates interest in single index spaces, which are those for which there is only one function, on the set of index admissible functions for that space, that satisfies Normalization, Additivity, and Continuity. We show that manifolds and finite simplicial complexes are single index spaces.

1.3.13 Chapter 14: Topological Consequences

With the point-set nitty-gritty out of the way, it is possible to reap a topological harvest. The Euler characteristic χ(X) of a compact ANR is the index of its identity function. This is shown to coincide with the formula χ(X) = V − E + F when X is a 2-manifold with a triangulation that has V vertices, E edges, and F 2-dimensional simplices. If F : X → X is an upper hemicontinuous correspondence, then the index of F is the Lefschetz number of F. The traditional formulation of the Lefschetz fixed point theorem combines the assertion that there is a fixed point if the Lefschetz number is nonzero with a formula computing the Lefschetz number in terms of homological objects. When X is contractible the Lefschetz number is one, so a fixed point exists: this is the Eilenberg-Montgomery fixed point theorem, with their assumption that F is acyclic valued strengthened slightly.
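The formula χ(X) = V − E + F is easy to evaluate from a list of triangles. The sketch below (function name ours) does this for two familiar triangulated surfaces: the boundary of a tetrahedron, which is a 2-sphere, and the well known 7-vertex triangulation of the torus with triangles {i, i+1, i+3} and {i, i+2, i+3} taken mod 7.

```python
from itertools import combinations


def euler_characteristic(triangles):
    """V - E + F for a 2-dimensional complex given by its list of triangles."""
    faces = {frozenset(t) for t in triangles}
    edges = {frozenset(e) for t in faces for e in combinations(sorted(t), 2)}
    vertices = {v for t in faces for v in t}
    return len(vertices) - len(edges) + len(faces)


# Boundary of a tetrahedron: 4 vertices, 6 edges, 4 triangles.
sphere = list(combinations(range(4), 3))

# Seven-vertex torus: 7 vertices, 21 edges, 14 triangles.
torus = [(i, (i + 1) % 7, (i + 3) % 7) for i in range(7)]
torus += [(i, (i + 2) % 7, (i + 3) % 7) for i in range(7)]

chi_sphere = euler_characteristic(sphere)
chi_torus = euler_characteristic(torus)
```

The values 2 for the sphere and 0 for the torus do not depend on the triangulation chosen, which is one consequence of identifying V − E + F with the index of the identity function.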

For maps f : M → N where M and N are compact m-dimensional manifolds, if two such maps are homotopic, then they have the same degree. Hopf’s theorem asserts that if N is the m-dimensional sphere, and two maps from M to N have the same degree, then they are homotopic. The Borsuk-Ulam theorem asserts that if $f : S^m \to \mathbb{R}^m$ is continuous (where $S^m := \{ x \in \mathbb{R}^{m+1} : \|x\| = 1 \}$) then there is a point $p \in S^m$ such that f(p) = f(−p). This result has several interesting equivalent formulations, and there are several other results concerning maps between spheres. One important consequence is invariance of domain, which asserts that if $U \subset \mathbb{R}^m$ is open and $f : U \to \mathbb{R}^m$ is continuous and injective, then f(U) is open and $f^{-1}$ is continuous.

As we mentioned before, if a set of fixed points has nonzero index, then it is essential. Section 14.5 provides two converse results, for functions from a manifold to itself and for convex valued correspondences respectively. At a technical level, Hopf’s theorem is the key ingredient.
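The case m = 1 of the Borsuk-Ulam theorem stated above can be made constructive with the intermediate value theorem. The sketch below is illustrative and uses names of our own: for a continuous $F : S^1 \to \mathbb{R}$, the function g(θ) = F(p(θ)) − F(−p(θ)), with p(θ) = (cos θ, sin θ), satisfies g(π) = −g(0), so bisection on [0, π] locates a point where F takes the same value at p and −p.

```python
import math


def antipodal_coincidence(F, tol=1e-12):
    """Angle theta with F(p) approximately equal to F(-p), where p = (cos theta, sin theta)."""
    def g(theta):
        x, y = math.cos(theta), math.sin(theta)
        xa, ya = math.cos(theta + math.pi), math.sin(theta + math.pi)
        return F(x, y) - F(xa, ya)

    lo, hi = 0.0, math.pi
    g_lo = g(lo)
    if g_lo == 0:
        return lo
    # Invariant: g(lo) has the sign of g_lo and g(hi) has the opposite sign (or is zero).
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = g(mid)
        if g_mid == 0:
            return mid
        if (g_mid > 0) == (g_lo > 0):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical example function on the circle.
F = lambda x, y: x + 2 * y + x * y
theta = antipodal_coincidence(F)
p = (math.cos(theta), math.sin(theta))
```

For m ≥ 2 no such one dimensional search is available, and the proof relies on the degree theoretic results about maps between spheres described in this chapter.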

1.3.14 Chapter 15: Dynamical Systems

This chapter develops the connection between the index and dynamic stability. A locally Lipschitz vector field on a manifold defines a dynamical system. Section 15.1 reviews the basic existence-uniqueness results for solutions of ordinary differential equations, and Sect. 15.2 transfers these results and related concepts to the setting of a sufficiently smooth manifold M.

Evolutionary game theory studies dynamical systems on spaces that are not manifolds, such as the cartesian product of the simplices of mixed strategies of the agents playing a strategic form game. In order to work at a satisfactory level of generality, Sect. 15.3 introduces the notion of a diffeoconvex body, which is a subset Σ of a smooth manifold M, each of whose points lies in the domain of a smooth coordinate chart that maps the portion of Σ in the domain to an open subset of a closed convex set with nonempty interior. A key technical tool for dealing with such a set is the map taking a point p ∈ M to the nearest point $r_\Sigma(p) \in \Sigma$. This map is unambiguously defined in a neighborhood of the diffeoconvex body, and is Lipschitz on smaller neighborhoods. Section 15.4 extends the existence-uniqueness results to the dynamical system defined by a vector field on Σ that is locally Lipschitz and not outward pointing.

The points where a vector field vanishes are called equilibria. In addition to the fixed point index and the degree, the underlying topological principle has a third embodiment, the vector field index, which (roughly speaking) assigns an integer to each connected component of the set of equilibria of a vector field. Section 15.5 defines the vector field index, characterizes it axiomatically, and relates it to the fixed point index of the displacement map given by following the vector field’s dynamical system for a small amount of time. This material is first developed for vector fields on manifolds, then extended to not outward pointing vector fields on diffeoconvex bodies. The Poincaré-Hopf theorem asserts that if Σ is a compact diffeoconvex body, then the vector field index of a not inward pointing vector field on Σ is the Euler characteristic of Σ.

Section 15.6 presents basic concepts related to dynamic stability. A set A is forward invariant for a dynamical system if trajectories originating in A are defined for all positive times and never leave A. The domain of attraction of A is the set of points p such that the trajectory originating at p is defined for all positive times and converges to A. We say that A is asymptotically stable if it is compact and invariant, its domain of attraction is a neighborhood of A, and for every neighborhood $\tilde{U}$ of A there is a neighborhood U of A such that trajectories originating at points in U are defined for all positive times and never leave $\tilde{U}$. A not outward pointing vector field on a diffeoconvex body Σ ⊂ M has a natural extension to a neighborhood of Σ whose value at p ∈ M is the value of the vector field at $r_\Sigma(p)$ plus $r_\Sigma(p) - p$ (adjusted so as to be a tangent vector at p). We show that if a set A ⊂ Σ is asymptotically stable for the dynamical system defined by the given vector field, then it is also asymptotically stable for the dynamical system defined by the extended vector field.

A Lyapunov function for A is a continuous function from a neighborhood of A to R_+ that takes the value 0 on A, takes positive values outside of A, and decreases strictly (in a differential sense) along trajectories outside of A. The existence of a Lyapunov function is a well known and very intuitive sufficient condition for asymptotic stability. Less well known, and highly nontrivial, is the fact that existence of a Lyapunov function is also a necessary condition for asymptotic stability.

Section 15.8 shows that if A is asymptotically stable for the dynamical system defined by a not outward pointing vector field on Σ, then (−1)^m (m is the dimension of M) times the degree of this vector field at A is the Euler characteristic of A. Economic theory does not provide definite models of adjustment to equilibrium, almost necessarily, because agents who understood such a model would not behave as the model predicts, but would instead take advantage of the predicted departures from equilibrium. On the other hand, because the degree is a homotopy invariant, it will be the same for all dynamics in rather large classes, for instance those in which the rate of adjustment of prices in an exchange economy and the excess demand form an acute angle, or the adjustment of each agent’s mixed strategy in a strategic form game is in a direction that increases expected payoff. If the common degree at A for all of these processes is different from (−1)^m times the Euler characteristic of A, then A is unstable in a strong sense that is independent of the details of any one adjustment process. Section 15.9 argues that in this circumstance a very plausible hypothesis is that the equilibria in A will not be observed as persistent (in the sense of repeatedly being expected and then occurring) outcomes of the market or game, and that this index +1 principle should be regarded as the multidimensional extension of Samuelson’s correspondence principle.

1.3.15 Chapter 16: Extensive Form Games

In an extensive form game the possible states of the game are arranged in a tree. Play starts at an initial node, and progress through the tree is governed by the agents’ action choices. Hidden information is represented by partitioning the nonterminal states of the game into information sets. Payoffs are received at terminal nodes. Elementary examples show that the Nash equilibrium concept is far too permissive because (among other reasons) it only requires rationality from the perspective of the situation before the game begins, and permits irrational behavior at information sets that are never reached when the game is played according to the Nash equilibrium strategies.

A behavior strategy is an assignment, to each information set, of a probability distribution over the actions that can be chosen there. A belief is an assignment, to each information set, of a probability distribution over the information set. An assessment is a behavior strategy-belief pair. An interior consistent assessment is an assessment in which the behavior strategy is interior (each of its probability distributions has full support) and the component probability distributions of the belief are the derived conditional probability distributions. An assessment is consistent if it is in the closure of the set of interior consistent assessments.

The information provided by an assessment allows one to compute a vector of expected payoffs at each information set, for each action that can be chosen there. The assessment is sequentially rational if, at each information set, all probability is assigned to those actions that maximize the expected payoff of the agent who chooses there. A sequential equilibrium is an assessment that is consistent and sequentially rational. When this concept was introduced by Kreps and Wilson (1982) it seemed like a refinement of Nash equilibrium, similar to the notions of perfect equilibrium (Selten 1975) and proper equilibrium (Myerson 1978), because Kreps and Wilson used similar techniques to establish existence. Over time, however, it came to be regarded as the foundational concept for extensive form games, in the same way that Nash equilibrium is the fundamental solution concept for strategic form games. The main point of this chapter (which is based on McLennan 1989a) is to show that this is true in a mathematical sense, because the set of sequential equilibria is the set of fixed points of an upper hemicontinuous contractible valued correspondence whose domain is homeomorphic to a closed ball in a Euclidean space.

Section 16.1 presents an example (from Cho and Kreps 1987) that provides concrete illustrations of many of the concepts described above. Section 16.2 lays out the formal apparatus of extensive form games, and Sect. 16.3 provides the precise definition of sequential equilibrium and related concepts.

The components (strategy and belief) of an assessment at an information set can both be represented as conditional probabilities on subsets of the set of terminal nodes. At the technical level our framework differs from Kreps and Wilson’s insofar as we keep track of all such conditional distributions, and not just those that enter into the computation of the expected payoffs. In general an interior conditional system on a finite set specifies a probability distribution with full support on each of the set’s nonempty subsets, such that the distribution on a set is derived from the distribution on any superset by taking conditional probability. The space of conditional systems is the closure of the set of interior conditional systems. Section 16.4 introduces this concept, and develops alternative “coordinate systems” for the space of conditional systems. Of these, the most useful assigns the number

λ(a, b) = ln p(a|{a, b}) − ln p(b|{a, b})

to each pair (a, b) of elements of the set. (Here p(a|{a, b}) is the probability of a conditional on {a, b}, and the logarithm function is extended to [0, 1] by setting ln 0 := −∞, so λ(a, b) ∈ [−∞, ∞].)
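The coordinate λ(a, b) can be computed directly from the conditional probability p(a|{a, b}). The following is a minimal sketch, not code from the text; the function names `ext_log` and `lam` are hypothetical, and the only assumption is that p(b|{a, b}) = 1 − p(a|{a, b}).

```python
import math


def ext_log(q: float) -> float:
    """Logarithm extended to [0, 1] by the convention ln 0 := -infinity."""
    return -math.inf if q == 0 else math.log(q)


def lam(p_a_given_ab: float) -> float:
    """lambda(a, b) = ln p(a|{a,b}) - ln p(b|{a,b}), a value in [-inf, inf].

    The argument is p(a|{a,b}); p(b|{a,b}) is its complement.
    """
    p_b_given_ab = 1.0 - p_a_given_ab
    return ext_log(p_a_given_ab) - ext_log(p_b_given_ab)


# Equal conditional probabilities give coordinate 0; the boundary cases
# p(a|{a,b}) = 1 and p(a|{a,b}) = 0 give +infinity and -infinity.
print(lam(0.5), lam(1.0), lam(0.0))
```

For an interior conditional system derived from a single full support distribution p, one has λ(a, b) = ln p(a) − ln p(b), so these coordinates are antisymmetric and add along chains: λ(a, c) = λ(a, b) + λ(b, c).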

Since the given distribution on the set of initial nodes is interior, there is a function passing from a conditional system on the space of pure behavior strategies to the relevant conditional system on the set of terminal nodes. Therefore we work in the space of conditional systems on the space of pure behavior strategies. In the coordinate system mentioned above, the requirement that the different information sets’ behaviors are statistically independent identifies a linear subspace of the set of all interior conditional systems. The closure of this linear subspace is the space of consistent conditional systems.

Since a consistent conditional system gives rise to a consistent assessment, there is no difficulty defining a best response correspondence from the space of consistent conditional systems to itself whose set of fixed points maps to the set of sequential equilibria. This correspondence is easily shown to be upper hemicontinuous. Sections 16.6 and 16.7 construct an explicit homeomorphism between the space of consistent conditional systems and a set homeomorphic to the closed unit ball in the space of interior consistent conditional systems. We show that an image of the best response correspondence is contractible because it is a strong deformation retract of an open star shaped cone. (The relevant aspects of the theory of strong deformation retractions are developed in Sect. 16.5. In particular, if A is a strong deformation retract of X, then A is contractible if and only if X is contractible.)

Section 16.9 discusses refinements of sequential equilibrium. Solution concepts for which existence is guaranteed can in principle be defined by requiring robustness with respect to perturbations of the best response correspondence, and index theory can also be applied. However, it turns out that existence results for two previously known refinements are most naturally proved by showing that relevant subcorrespondences of the best response correspondence are upper hemicontinuous and contractible valued. This method was discovered during the final stages of the preparation of this book’s manuscript, and its potential has not yet been explored.

1.3.16 Chapter 17: Monotone Equilibria

In many games considered in economic applications, each agent has some piece of private information, so that a pure strategy is a function from a space of possible signals or types to the set of possible actions. In an auction, for example, the action is the agent’s bid, and the signal may be the agent’s valuation of the object being sold, or it may be that each agent’s valuation depends on the entire vector of signals. In the latter case strategic analysis depends on what your signal implies about the likely values of other agents’ signals, and how these signals are related to the other agents’ bids.

Analysis of such models is generally restricted to pure equilibria, and to strategies that are monotone, so that you bid more when your signal improves. (It is hard to imagine tractable analysis without these restrictions.) Starting with Milgrom and Roberts (1982), various models of this sort have been developed, and this leads to the question of what general features of such models imply the existence of an equilibrium in monotone pure strategies. (It is also natural to investigate assumptions that imply that all equilibria are monotone and pure, but this question is usually quite difficult, and is not considered here.) Athey (2001) developed general conditions implying the existence of monotone equilibrium when the spaces of signals and actions are 1-dimensional, and McAdams (2003) generalized her results to certain multidimensional settings. Their proofs apply the Kakutani fixed point theorem, and much of the effort in their arguments goes into showing that the set of monotone best responses to a profile of monotone strategies is convex. Reny (2011) observed that this work is unnecessary, because it is easy to show that the set of monotone best responses to a profile of monotone pure strategies is contractible, so that the Eilenberg-Montgomery theorem can be applied. Furthermore, this approach allows the assumptions to be weakened in several directions. Chapter 17 exposits his argument, along with relevant mathematical background.

The most important results of economic analysis tend to be qualitative, e.g., if the price of one of a firm’s inputs increases, then the firm will purchase less of it. That is, we can determine the sign of a change of behavior resulting from a change in a parameter, but not much more than that. Section 17.1 studies this issue when the sets of parameters and actions are partially ordered sets, and the set of actions is a lattice, so that any pair of elements has a greatest lower bound and a least upper bound. The fundamental result giving necessary and sufficient conditions for monotonicity is due to Milgrom and Shannon (1994). This section also presents the Tarski (1955) fixed point theorem, which deserves a mention due to its economic applications, even if it is mathematically distant from the topological theory of fixed points. Section 17.2 presents the rudiments of auction theory, in order to pose the issue of monotone equilibrium concretely and in greater detail.

The next four sections develop mathematical background. Section 17.4 is a self contained treatment of the material from the theory of measure and integration that we apply. Section 17.5 considers the structure that will be imposed on the agents’ spaces of types, in which there is both a partial order and a probability measure. Each agent’s space of actions will be a metric space and a semilattice, which is to say that it is partially ordered and any pair of elements has a least upper bound, but not necessarily a greatest lower bound. The precise relationship between the partial order and the metric is studied in Sect. 17.3. Section 17.6 studies the space of monotone functions from a partially ordered probability space to a compact metric semilattice. If we regard two such functions as equivalent if they agree outside a set of measure zero, the space of equivalence classes is compact and contractible. A key issue, to which we return later in the chapter, is to show that this space is an ANR.

The equilibrium existence result is stated and proved in Sect. 17.7, for a version of the model in which each agent’s space of pure strategies is the set of monotone functions from her type space to her action space. Here the main point is to show that the set of best responses to a strategy profile is always contractible. Section 17.8 presents one set of assumptions that imply that each agent has a monotone best response to any profile of monotone strategies.

The final three sections complete the proof that the space of monotone functions is an ANR by laying out a collection of necessary conditions and sufficient conditions for a space to be an ANR. These were developed by Dugundji over the course of about fifteen years, in three separate papers. Insofar as the material is quite technical, it cannot be described in any further detail at this point, beyond saying that the analysis is deeply insightful, with sophisticated, intricate constructions. Overall it is a lovely piece of mathematics, and a fitting capstone to this project.

Part II Combinatoric Geometry

Chapter 2 Planes, Polyhedra, and Polytopes

This chapter studies basic geometric objects defined by linear equations and inequalities. This serves two purposes, the first of which is simply to introduce basic vocabulary. Beginning with affine subspaces and half spaces, we will proceed to (closed) cones, polyhedra, and polytopes, which are polyhedra that are bounded. A rich class of well behaved spaces is obtained by combining polyhedra to form polyhedral complexes. Although this is foundational, there are nonetheless several interesting and very useful results and techniques, notably the separating hyperplane theorem, Farkas’ lemma, barycentric subdivision, and approximation of continuous functions by piecewise linear functions. Required definitions from graph theory are presented.

2.1 Affine Subspaces

In this chapter we work with a fixed d-dimensional real vector space V. (Of course we are really talking about R^d, but a more abstract setting emphasizes the geometric nature of the constructions and arguments.) We assume familiarity with the concepts and results of basic linear algebra, as well as elementary facts concerning open, closed, and compact subsets of metric spaces, and continuous functions between metric spaces.

An affine combination of y0, . . . , yr ∈ V is a point of the form

α0y0 + · · · + αr yr

where α = (α0, . . . , αr) is a vector of real numbers whose components sum to 1. We say that y0, . . . , yr are affinely independent if it is not possible to represent a point as an affine combination of these points in two different ways: that is, if

$$\sum_j \alpha_j = 1 = \sum_j \alpha'_j \quad\text{and}\quad \sum_j \alpha_j y_j = \sum_j \alpha'_j y_j,$$

then α = α′. If y0, . . . , yr are not affinely independent, then they are affinely dependent.

Lemma 2.1 For any y0, . . . , yr ∈ V the following are equivalent:

(a) y0, . . . , yr are affinely independent;
(b) y1 − y0, . . . , yr − y0 are linearly independent;
(c) there do not exist β0, . . . , βr ∈ R, not all of which are zero, with $\sum_j \beta_j = 0$ and $\sum_j \beta_j y_j = 0$.

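Proof Suppose that y0, . . . , yr are affinely dependent, so that some point has two different representations as an affine combination, with coefficients $\alpha_j$ and $\alpha'_j$. If we set $\beta_j := \alpha_j - \alpha'_j$, then the $\beta_j$ are not all zero, $\sum_j \beta_j = 0$, and $\sum_j \beta_j y_j = 0$, so (c) implies (a). In turn, if $\sum_j \beta_j = 0$ and $\sum_j \beta_j y_j = 0$ with β0, . . . , βr not all zero, then

$$\beta_1(y_1 - y_0) + \cdots + \beta_r(y_r - y_0) = -(\beta_1 + \cdots + \beta_r)y_0 + \beta_1 y_1 + \cdots + \beta_r y_r = 0,$$

so y1 − y0, . . . , yr − y0 are linearly dependent. Thus (b) implies (c). If $\beta_1(y_1 - y_0) + \cdots + \beta_r(y_r - y_0) = 0$ with β1, . . . , βr not all zero, then for any α0, . . . , αr with α0 + · · · + αr = 1 we can set $\beta_0 := -(\beta_1 + \cdots + \beta_r)$ and $\alpha'_j := \alpha_j + \beta_j$ for j = 0, . . . , r, thereby showing that y0, . . . , yr are affinely dependent. Thus (a) implies (b). □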
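Criterion (b) of Lemma 2.1 gives a direct computational test for affine independence: the differences y1 − y0, . . . , yr − y0 must have rank r. The following sketch uses exact rational arithmetic and a hand written Gaussian elimination; the helper names `rank` and `affinely_independent` are hypothetical and not part of the text.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (given as a list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        # Find a row at or below position rk with a nonzero entry in this column.
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            f = m[i][col] / m[rk][col]
            m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk


def affinely_independent(points):
    """Lemma 2.1(b): y0..yr are affinely independent iff y1-y0..yr-y0 have rank r."""
    y0, rest = points[0], points[1:]
    diffs = [[a - b for a, b in zip(y, y0)] for y in rest]
    return rank(diffs) == len(diffs)


# Three non-collinear points in the plane versus three collinear points.
print(affinely_independent([(0, 0), (1, 0), (0, 1)]))
print(affinely_independent([(0, 0), (1, 1), (2, 2)]))
```

Exact fractions are used so that the rank computation does not depend on a floating point tolerance.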

The affine hull aff(S) of a set S ⊂ V is the set of all affine combinations of elements of S. The affine hull of S contains S as a subset, and we say that S is an affine subspace if the two sets are equal. That is, S is an affine subspace if it contains all affine combinations of its elements. Note that the intersection of two affine subspaces is an affine subspace. If A ⊂ V is an affine subspace and a0 ∈ A, then { a − a0 : a ∈ A } is a linear subspace, and the dimension dim A of A is, by definition, the dimension of this linear subspace. The codimension of A is d − dim A. A hyperplane is an affine subspace of codimension one.

Throughout we assume that V is endowed with an inner product, which is a function 〈·, ·〉 : V × V → R that is symmetric, bilinear, and positive definite:

(a) 〈v, w〉 = 〈w, v〉 for all v, w ∈ V;
(b) 〈αv + v′, w〉 = α〈v, w〉 + 〈v′, w〉 for all v, v′, w ∈ V and α ∈ R;
(c) 〈v, v〉 ≥ 0 for all v ∈ V, with equality if and only if v = 0.

Such a function exists: if e_1, . . . , e_d is a basis of V, then there is an inner product given by

〈x_1 e_1 + · · · + x_d e_d, y_1 e_1 + · · · + y_d e_d〉 := x_1 y_1 + · · · + x_d y_d.

The norm of v ∈ V is ‖v‖ := √〈v, v〉. Evidently ‖αv‖ = |α|‖v‖ for all v ∈ V and α ∈ R.


Proposition 2.1 (Cauchy-Schwarz Inequality) For all v, w ∈ V,

〈v, w〉 ≤ ‖v‖ ‖w‖ and ‖v + w‖ ≤ ‖v‖ + ‖w‖.

These hold with equality if and only if one of the vectors is a nonnegative scalar multiple of the other.

Proof The computation

0 ≤ ⟨〈v, v〉w − 〈v, w〉v, 〈v, v〉w − 〈v, w〉v⟩ = 〈v, v〉(〈v, v〉〈w, w〉 − 〈v, w〉²)

implies that 〈v, w〉² ≤ ‖v‖²‖w‖², and thus the first inequality, which is known as the Cauchy-Schwarz inequality. The computation holds with equality if v = 0 or 〈v, v〉w − 〈v, w〉v = 0, which is the case if and only if w is a scalar multiple of v, and otherwise the inequality is strict; equality in 〈v, w〉 ≤ ‖v‖ ‖w‖ itself requires in addition that 〈v, w〉 ≥ 0. For the second inequality we compute that

‖v + w‖² = 〈v + w, v + w〉 = ‖v‖² + 2〈v, w〉 + ‖w‖² ≤ (‖v‖ + ‖w‖)².

This holds strictly exactly when the Cauchy-Schwarz inequality holds strictly. □

We now see that there is a metric d(v, w) := ‖v − w‖ on V. Throughout V will be endowed with the associated topology. Vector addition and scalar multiplication are continuous: if vn → v, wn → w, and αn → α, then ‖(vn + wn) − (v + w)‖ ≤ ‖vn − v‖ + ‖wn − w‖ → 0 and ‖αnvn − αv‖ ≤ ‖αn(vn − v)‖ + ‖(αn − α)v‖ → 0.

A (closed) half-space is a set of the form

H := { v ∈ V : 〈v, n〉 ≤ β }

where n is a nonzero element of V, called the normal vector of H, and β ∈ R. Of course H determines n and β only up to multiplication by a positive scalar. We say that

I := { v ∈ V : 〈v, n〉 = β }

is the bounding hyperplane of H. Any hyperplane is the intersection of the two half-spaces that it bounds.

2.2 Convex Sets and Cones

A convex combination of y0, . . . , yr ∈ V is a point of the form α0y0 + · · · + αryr where α = (α0, . . . , αr) is a vector of nonnegative numbers whose components sum to 1. A set C ⊂ V is convex if it contains all convex combinations of its elements, so that (1 − t)x0 + tx1 ∈ C for all x0, x1 ∈ C and 0 ≤ t ≤ 1. For any set S ⊂ V the convex hull conv(S) of S is the smallest convex set containing S. Equivalently, it is the set of all convex combinations of elements of S.

Theorem 2.1 (Carathéodory’s Theorem) If x is an element of the affine (convex) hull of S ⊂ V, then x is an affine (convex) combination of at most d + 1 elements of S.

Proof Let $x = \sum_{i=0}^{k} \alpha_i s_i$ be a minimal (with respect to k) representation of x as an affine (convex) combination of elements of S. If k > d, then s1 − s0, . . . , sk − s0 are linearly dependent, so $\beta_1(s_1 - s_0) + \cdots + \beta_k(s_k - s_0) = 0$ for some β1, . . . , βk, not all of which are zero. Let $\beta_0 := -(\beta_1 + \cdots + \beta_k)$. There is some t such that $\alpha_i + t\beta_i = 0$ for some i, and if all the αi are positive, then there is a smallest such positive t, for which all the coefficients $\alpha_i + t\beta_i$ are nonnegative. Now $x = \sum_{i=0}^{k} (\alpha_i + t\beta_i) s_i$ contradicts minimality. □
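The reduction step in this proof is constructive, and can be sketched as an algorithm for the convex case. The code below is an illustrative sketch rather than the author's method: the names `null_vector` and `caratheodory` are hypothetical, exact rational arithmetic is assumed, and the loop runs until the remaining points are affinely independent (which by Lemma 2.1 leaves at most d + 1 of them).

```python
from fractions import Fraction


def null_vector(A):
    """Return a nonzero x with A x = 0 (exact arithmetic), or None if only x = 0 works."""
    m = [[Fraction(v) for v in row] for row in A]
    ncols = len(m[0])
    pivots = []  # pivots[r] is the pivot column of reduced row r
    r = 0
    for c in range(ncols):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [v / lead for v in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    free = [c for c in range(ncols) if c not in pivots]
    if not free:
        return None
    # Set one free variable to 1, the others to 0, and solve for the pivot variables.
    x = [Fraction(0)] * ncols
    x[free[0]] = Fraction(1)
    for row, c in enumerate(pivots):
        x[c] = -m[row][free[0]]
    return x


def caratheodory(points, weights):
    """Reduce a convex combination to one on affinely independent points."""
    pairs = [(tuple(Fraction(v) for v in p), Fraction(a))
             for p, a in zip(points, weights) if a != 0]
    pts = [p for p, _ in pairs]
    w = [a for _, a in pairs]
    while True:
        d = len(pts[0])
        # beta must satisfy sum_i beta_i s_i = 0 and sum_i beta_i = 0.
        A = [[p[j] for p in pts] for j in range(d)] + [[1] * len(pts)]
        beta = null_vector(A)
        if beta is None:
            return pts, w
        # Smallest positive t at which some coefficient alpha_i + t*beta_i hits zero.
        t = min(a / -b for a, b in zip(w, beta) if b < 0)
        w = [a + t * b for a, b in zip(w, beta)]
        keep = [i for i, a in enumerate(w) if a != 0]
        pts = [pts[i] for i in keep]
        w = [w[i] for i in keep]


# The center of the unit square, first written with all four corners.
corners = [(0, 0), (1, 0), (0, 1), (1, 1)]
pts, w = caratheodory(corners, [Fraction(1, 4)] * 4)
print(pts, w)
```

Because the β_i sum to zero and are not all zero, some β_i is negative, so the minimum defining t is taken over a nonempty set; each pass removes at least one point while preserving both the point x and the sum of the weights.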

We now establish three versions of the separating hyperplane theorem.

Lemma 2.2 If C is a closed convex set that does not contain the origin, then there is an n ∈ V \ {0} and a number c > 0 such that 〈n, v〉 > c for all v ∈ C.

Proof Let n be a point in C that is closer to the origin than any other. (Such an n exists because the intersection of C with a sufficiently large closed ball centered at the origin is compact and nonempty, so it has a point that minimizes the distance to the origin.) Let c := ‖n‖²/2. Fix v ∈ C. The derivative of the function

t ↦ 〈n + t(v − n), n + t(v − n)〉

is 2〈n, v − n〉 + 2t‖v − n‖². Since the line segment between n and v is in C, this must be nonnegative when t = 0, so 〈v, n〉 ≥ ‖n‖² > c > 0. □
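As a concrete check of Lemma 2.2, here is a small worked instance in the plane with the standard inner product. The particular set C is an illustrative assumption and is not an example taken from the text.

```latex
% Illustrative instance of Lemma 2.2 with V = \mathbb{R}^2 (the set C is an assumption).
\[
  C := \{ (x, y) \in \mathbb{R}^2 : x + y \ge 2 \}, \qquad 0 \notin C .
\]
% The point of C nearest the origin is n = (1, 1), and c is half its squared norm.
\[
  n = (1, 1), \qquad c := \|n\|^2 / 2 = 1 .
\]
% Every v = (x, y) in C satisfies the conclusion of the lemma.
\[
  \langle n, v \rangle = x + y \ge 2 = \|n\|^2 > c > 0
  \quad \text{for all } v = (x, y) \in C .
\]
```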

Lemma 2.3 If S is compact, then conv(S) is compact.

Proof For any v, w ∈ S and t ∈ [0, 1] we have ‖(1 − t)v + tw‖ ≤ (1 − t)‖v‖ + t‖w‖ ≤ max{‖v‖, ‖w‖}, so conv(S) is bounded because S is bounded. Suppose that {vn}, {wn}, and {tn} are sequences such that (1 − tn)vn + tnwn → x. After passing to subsequences, vn → v, wn → w, and tn → t. Since S is closed it contains v and w, and continuity gives x = (1 − t)v + tw. Thus conv(S) is closed. □

Lemma 2.4 If C is a convex set that does not contain the origin, then there is an n ∈ V \ {0} such that 〈n, v〉 ≥ 0 for all v ∈ C.

Proof For each compact convex K ⊂ C let

N_K = { n ∈ V : ‖n‖ = 1 and 〈n, v〉 ≥ 0 for all v ∈ K }.

Lemma 2.2 implies that N_K is nonempty, and it is evidently closed and bounded, hence compact. The sets N_K have the finite intersection property because N_{K1} ∩ · · · ∩ N_{Kr} = N_{conv(K1 ∪ · · · ∪ Kr)}, where conv(K1 ∪ · · · ∪ Kr) is compact by Lemma 2.3. Therefore their intersection is nonempty, so there is an n such that ‖n‖ = 1 and 〈n, v〉 ≥ 0 for all compact K ⊂ C and all v ∈ K. For each v ∈ C, {v} is compact, so 〈n, v〉 ≥ 0 for all v ∈ C. □


We will frequently use the notion of Minkowski sum: for A, B ⊂ V, A + B := { v + w : v ∈ A and w ∈ B }. We write A − B in place of A + (−B) where −B := { −w : w ∈ B }.

Theorem 2.2 (Separating Hyperplane Theorem) If A and B are nonempty disjoint convex subsets of V, then there is a nonzero n ∈ V and a number c such that 〈n, v〉 ≥ c ≥ 〈n, w〉 for all v ∈ A and w ∈ B.

Proof Let C := A − B. Since A and B are convex and disjoint, C is convex and does not contain the origin, so the last result gives a nonzero n such that 〈n, v − w〉 ≥ 0 for all v ∈ A and w ∈ B. Let c := sup_{w∈B}〈n, w〉. Since A and B are nonempty, c is well defined and finite. □

Naturally one sometimes wants stronger forms of separation. Our first result in this direction is an obvious consequence of the last result.

Theorem 2.3 (Separating Hyperplane Theorem) If A and B are nonempty disjoint convex subsets of V and A is open, then there is a nonzero n ∈ V and a number c such that 〈n, v〉 > c ≥ 〈n, w〉 for all v ∈ A and w ∈ B.

An example showing that the compactness hypothesis of the next result cannot be relaxed is given by setting A := { (x, y) ∈ R² : y ≤ 0 } and B := { (x, y) ∈ R² : y ≥ e^x }.

Theorem 2.4 (Separating Hyperplane Theorem) If A and B are nonempty disjoint closed convex sets, one of which is compact, then there is a nonzero n ∈ V and a constant c such that 〈n, v〉 > c > 〈n, w〉 for all v ∈ A and w ∈ B.

Proof Without loss of generality assume that A is compact. Let C := A − B. Of course C is convex, and we claim that it is closed. Let {vn} and {wn} be sequences in A and B such that vn − wn → x. After passing to a subsequence, {vn} is convergent, say with limit v, and wn → w := v − x. Since A and B are closed, v ∈ A and w ∈ B, so x = v − w ∈ A − B. Now Lemma 2.2 gives a nonzero n and a constant c′ > 0 such that 〈n, v − w〉 > c′ for all v ∈ A and w ∈ B. Set c := min_{v∈A}〈n, v〉 − c′/2. □

Several sets can be derived from a given convex set C . The dual of C is

C∗ := { n ∈ V : 〈x, n〉 ≥ 0 for all x ∈ C } .

The recession cone of C is

RC := { y ∈ V : x + αy ∈ C for all x ∈ C and α ≥ 0 } .

The lineality space of C is

LC := RC ∩ −RC = { y ∈ V : x + αy ∈ C for all x ∈ C and α ∈ R } .

Obviously C∗, RC, and LC are convex. Since C∗ = ⋂_{x∈C} { n ∈ V : 〈x, n〉 ≥ 0 } is an intersection of closed sets, it is closed. Below we show that RC is closed. The lineality space is closed under addition and scalar multiplication, so it is a linear subspace of V. In fact it is the largest linear subspace of V contained in RC.

Lemma 2.5 Suppose C is nonempty, closed, and convex. Then RC is the set of y ∈ V such that 〈y, n〉 ≤ 0 whenever H = { v ∈ V : 〈v, n〉 ≤ β } is a half space containing C, so RC is closed because it is an intersection of closed half spaces.

Proof Since C ≠ ∅, if y ∈ RC, then 〈y, n〉 ≤ 0 whenever H = { v ∈ V : 〈v, n〉 ≤ β } is a half space containing C. Suppose that y satisfies the latter condition and x ∈ C. Then for all α ≥ 0, x + αy is contained in every half space containing C, and the separating hyperplane theorem implies that the intersection of all such half spaces is C itself. Thus y is in RC. □

Lemma 2.6 If C is nonempty, closed, and convex, then C is bounded if and only if RC = {0}.

Proof If RC has a nonzero element, then of course C is unbounded. Suppose that C is unbounded. Fix a point x ∈ C, and let y1, y2, . . . be a divergent sequence in C. Passing to a subsequence if need be, we can assume that (yj − x)/‖yj − x‖ converges to a unit vector w. To show that w ∈ RC it suffices to observe that if H = { v : 〈v, n〉 ≤ β } is a half space containing C, then 〈w, n〉 ≤ 0 because

$$\Big\langle \frac{y_j - x}{\|y_j - x\|}, n \Big\rangle \le \frac{\beta - \langle x, n \rangle}{\|y_j - x\|} \to 0.$$

□

A convex cone is a convex set C that is nonempty and closed under multiplication by nonnegative scalars, so that αx ∈ C for all x ∈ C and α ≥ 0. Such a cone is closed under addition: if x, y ∈ C, then x + y = 2(½x + ½y) is a positive scalar multiple of a convex combination of x and y. Conversely, if a set is closed under addition and multiplication by positive scalars, then it is a cone.

We have already seen several examples of closed convex cones. Clearly C∗ is a convex cone, and it is closed, regardless of whether C is closed, because C∗ is the intersection of the closed half spaces { n ∈ V : 〈x, n〉 ≥ 0 }. Clearly RC is a convex cone, which was shown to be closed. More generally, any intersection of closed half spaces that have the origin on the boundary is a closed convex cone.

From a technical point of view, the theory of systems of linear inequalities is dominated by the next result because a large fraction of the results about systems of linear inequalities can easily be reduced to applications of it. Logically, it is merely a special case of the separating hyperplane theorem.

Theorem 2.5 (Farkas’ Lemma) If C is a closed convex cone, then for any b ∈ V \ C there is n ∈ C∗ such that 〈n, b〉 < 0.

Proof The separating hyperplane theorem gives n ∈ V and β ∈ R such that 〈n, b〉 < β and 〈n, x〉 > β for all x ∈ C. Since 0 ∈ C, β < 0. There cannot be x ∈ C with 〈n, x〉 < 0 because we would have 〈n, αx〉 < β for sufficiently large α > 0, so n ∈ C∗. □
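A small worked instance may help fix the statement of Farkas’ lemma. The cone and the point b below are illustrative assumptions in the plane with the standard inner product, not an example from the text.

```latex
% Illustrative instance of Farkas' lemma in V = \mathbb{R}^2 (C and b are assumptions).
\[
  C := \{ (x, y) \in \mathbb{R}^2 : x \ge 0,\ y \ge 0 \}, \qquad
  C^* = \{ n \in \mathbb{R}^2 : \langle x, n \rangle \ge 0 \text{ for all } x \in C \} = C .
\]
% A point outside the cone and a certificate n in the dual cone.
\[
  b := (1, -1) \notin C, \qquad n := (0, 1) \in C^*, \qquad
  \langle n, b \rangle = -1 < 0 .
\]
```

Here the vector n certifies that b is not in C: it has nonnegative inner product with every element of the cone, but negative inner product with b.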


Corollary 2.1 If C is a convex cone, RC is the closure of C.

Proof Of course C ⊂ RC, and above we showed that RC is closed. For any b outside the closure of C Farkas’ lemma gives an n such that 〈n, b〉 < 0 and 〈n, x〉 ≥ 0 for all x ∈ C, so b ∉ RC. □

For closed convex cones the most precise separation result is:

Lemma 2.7 If C is a closed convex cone, then there is n ∈ C∗ with 〈n, x〉 > 0 for all x ∈ C \ LC.

Proof For n ∈ C∗ let Z_n := { x ∈ C : 〈x, n〉 = 0 }. Observe that LC ⊂ Z_n, and that Z_{n+n′} = Z_n ∩ Z_{n′} for all n, n′ ∈ C∗. Let n be a point in C∗, and suppose that 0 ≠ x ∈ Z_n \ LC. Then −x ∉ C because x ∉ LC, and Farkas’ lemma gives an n′ ∈ C∗ with 〈−x, n′〉 < 0, so that 〈x, n′〉 > 0. The span of Z_{n+n′} does not contain x, so it is a proper subspace of the span of Z_n. In particular, if n is minimal for the dimension of the span of Z_n, then Z_n = LC. □

A convex cone is said to be pointed if its lineality space is {0}.

Proposition 2.2 A closed convex cone C is pointed if and only if it does not contain a line.

Proof Since RC = C, LC = RC ∩ −RC = C ∩ −RC ⊂ C, and LC is a linear subspace, so if C is not pointed, then it contains a line. If C is pointed, then the last result gives an n such that 〈n, x〉 > 0 for all nonzero x ∈ C, so C cannot contain a line. □

Proposition 2.3 If C is a closed convex cone, W is a linear subspace of V that is complementary to LC (that is, LC + W = V and LC ∩ W = {0}) and C′ := C ∩ W, then C′ is a pointed closed convex cone and C = LC + C′.

Proof As the intersection of two closed cones, C′ is a closed cone. We have LC′ = {0} because the lineality space of C′ is contained in the lineality space of C, so C′ is pointed. Clearly LC + C′ ⊂ C. If x ∈ C, then there is some w ∈ LC such that x − w ∈ W, and x = w + (x − w) ∈ LC + C′. □

2.3 Polyhedra

A polyhedron in V is a subset of a finite dimensional subspace of V that is the intersection of finitely many of its closed half spaces. Any hyperplane in a finite dimensional subspace is the intersection of the two half-spaces it bounds, and any affine subspace is an intersection of hyperplanes, so any finite dimensional affine subspace of V is a polyhedron. The dimension of a polyhedron is the dimension of its affine hull. Fix a polyhedron P.

A face of P is either the empty set, P itself, or the intersection of P with the bounding hyperplane of some half-space that contains P. Evidently any face of P is itself a polyhedron. If F and F′ are faces of P with F′ ⊂ F, then F′ is a face of F, because if F′ = P ∩ I′, where I′ is the bounding hyperplane of a half space containing P, then that half space contains F and F′ = F ∩ I′. A face is proper if it is not P itself. A facet of P is a proper face that is not a proper subset of any other proper face. An edge of P is a one dimensional face, and a vertex of P is a zero dimensional face. Properly speaking, a vertex is a singleton, but we will often blur the distinction between such a singleton and its unique element, so when we refer to the vertices of P, usually we will mean the points themselves.

We say that x ∈ P is an initial point of P if there does not exist x′ ∈ P and a nonzero y ∈ RP such that x = x′ + y. If the lineality space of P has positive dimension, so that RP is not pointed, then there are no initial points.

Proposition 2.4 The set of initial points of P is the union of the bounded faces of P.

Proof Let F be a face of P, so that F = P ∩ I where I is the bounding hyperplane of a half space H containing P. Let x be a point in F.

We first show that if x is noninitial, then F is unbounded. Let x = x′ + y for some x′ ∈ P and nonzero y ∈ RP. Since x − y and x + y are both in H, they must both be in I, so I contains the ray { x + αy : α ≥ 0 }, and this ray is contained in P because y ∈ RP, so F is unbounded.

We now know that the union of the bounded faces is contained in the set of initial points, and we must show that if x is not contained in a bounded face, it is noninitial. We may assume that F is the smallest face containing x. Since F is unbounded there is a nonzero y ∈ RF. The ray { x − αy : α ≥ 0 } leaves P at some α ≥ 0. (Otherwise the lineality space of P has positive dimension and there are no initial points.) If α > 0, then x is noninitial, and α = 0 is impossible because it would imply that x belonged to a proper face of F. □

Proposition 2.5 If RP is pointed, then every point in P is the sum of an initial point and an element of RP.

Proof Lemma 2.7 gives an n ∈ V such that 〈y, n〉 > 0 for all nonzero y ∈ RP. Fix x ∈ P. Clearly K := (x − RP) ∩ P is convex, and it is bounded because its recession cone is contained in −RP ∩ RP = {0}. Lemma 2.5 implies that K is closed, hence compact. Let x′ be a point in K that minimizes 〈x′, n〉. Then x is a sum of x′ and a point in RP, and if x′ was not initial, so that x′ = x′′ + y where x′′ ∈ P and 0 ≠ y ∈ RP, then 〈x′′, n〉 < 〈x′, n〉, which is impossible. □

Any polyhedron has a standard representation, which is a representation of the form

$$P = G \cap \bigcap_{i=1}^{k} H_i$$

where G is the affine hull of P and H1, . . . , Hk are half-spaces. Fix such a representation, with Hi = { v ∈ V : 〈v, ni〉 ≤ αi } and Ii the bounding hyperplane of Hi.


Proposition 2.6 For J ⊂ {1, . . . , k} let $F_J := P \cap \bigcap_{j \in J} I_j$. Then $F_J$ is a face of P, and every nonempty face of P has this form.

Proof If we choose numbers βj > 0 for all j ∈ J, then

$$\Big\langle x, \sum_{j \in J} \beta_j n_j \Big\rangle \le \sum_{j \in J} \beta_j \alpha_j$$

for all x ∈ P, with equality if and only if x ∈ FJ. We have displayed FJ as a face.

Now let F := P ∩ I be a nonempty face, where I is the bounding hyperplane of a half-space H = { v ∈ V : 〈v, n〉 ≤ α } containing P, and let J := { j : F ⊂ Ij }. Of course F ⊂ FJ. Aiming at a contradiction, suppose there is a point x ∈ FJ \ F. Then 〈x, ni〉 ≤ αi for all i ∉ J and 〈x, nj〉 = αj for all j ∈ J. For each i ∉ J there is a yi ∈ F with 〈yi, ni〉 < αi; let y be a strict convex combination of these. Then 〈y, ni〉 < αi for all i ∉ J and 〈y, nj〉 ≤ αj for all j ∈ J. Since x ∉ I and y ∈ I, the ray emanating from x and passing through y leaves H at y, and consequently it must leave P at y, but continuing along this ray from y does not immediately violate any of the inequalities defining P, so this is a contradiction. □
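As a small illustration of Proposition 2.6, the following sketch enumerates the sets FJ for the unit square. The choice of the square, the names `HALF_SPACES`, `face_vertices`, and `nonempty_faces`, and the device of recording a face by the vertices it contains (legitimate here because the square is bounded) are assumptions made for this example only.

```python
from itertools import combinations, product

# Standard representation of the unit square [0, 1]^2 (an illustrative choice):
# half-space i is { v : <v, n_i> <= alpha_i }, stored as (n_i, alpha_i).
HALF_SPACES = [((-1, 0), 0), ((1, 0), 1), ((0, -1), 0), ((0, 1), 1)]
VERTICES = list(product((0, 1), repeat=2))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def face_vertices(J):
    """Vertices of the square lying on every bounding hyperplane I_j, j in J."""
    return frozenset(
        v for v in VERTICES
        if all(dot(v, HALF_SPACES[j][0]) == HALF_SPACES[j][1] for j in J)
    )


def nonempty_faces():
    """Distinct nonempty sets F_J, each recorded by its set of vertices."""
    faces = set()
    indices = range(len(HALF_SPACES))
    for size in range(len(HALF_SPACES) + 1):
        for J in combinations(indices, size):
            vertices = face_vertices(J)
            if vertices:
                faces.add(vertices)
    return faces
```

Under these assumptions the enumeration finds the square itself (J = ∅), its four edges, and its four vertices, while index sets such as the two opposite sides {0, 1} give the empty set.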

This result has many worthwhile corollaries.

Corollary 2.2 P has finitely many faces, and the intersection of any two faces is a face.

Corollary 2.3 If F is a face of P and F ′ is a face of F, then F ′ is a face of P.

Proof If G0 is the affine hull of F, then $F = G_0 \cap \bigcap_i H_i$ is a standard representation of F. Proposition 2.6 implies that $F = P \cap \bigcap_{i \in J} I_i$ for some J, that $F' = F \cap \bigcap_{i \in J'} I_i$ for some J′, and that $F' = P \cap \bigcap_{i \in J \cup J'} I_i$ is a face of P. □

Corollary 2.4 The facets of P are F{1}, . . . , F{k}. The dimension of each F{i} is one less than the dimension of P. The facets are the only faces of P with this dimension.

Proof Minimality implies that each F{i} is a proper face, and Proposition 2.6 implies that F{i} cannot be a proper subset of another proper face. Thus each F{i} is a facet.

For each i minimality implies that for each j ≠ i there is some xj ∈ F{i} \ F{j}. Let x be a convex combination of these with positive weights; then F{i} contains a neighborhood of x in Ii, so the dimension of F{i} is the dimension of G ∩ Ii, which is one less than the dimension of P.

A face F that is not a facet is a proper face of some facet, so its dimension is at most the dimension of P minus two. □

Now suppose that P is bounded. Any point in P that is not a vertex can be written as a convex combination of points in proper faces of P. Induction on the dimension of P proves that:

Proposition 2.7 If P is bounded, then it is the convex hull of its set of vertices.


An extreme point of a convex set is a point that is not a convex combination of other points in the set. This result immediately implies that only vertices of P can be extreme. In fact any vertex v is extreme: if {v} = P ∩ I, where I is the bounding hyperplane of a half-space H containing P, then v cannot be a convex combination of elements of P \ I.

The representation $P = G \cap \bigcap_{i=1}^{k} H_i$ is minimal if it is irredundant, so that for each j, $G \cap \bigcap_{i \ne j} H_i$ is a proper superset of P. Starting with any standard representation of P, we can reduce it to a minimal representation by repeatedly eliminating redundant half-spaces.

Lemma 2.8 If $P = G \cap \bigcap_{i=1}^{k} H_i$ is a minimal standard representation, then P has a nonempty interior in the relative topology of G.

Proof For each i we cannot have P ⊂ Ii because that would imply that G ⊂ Ii, making Hi redundant. Therefore P must contain some xi in the interior of each Hi. If x0 is a convex combination of x1, . . . , xk with positive weights, then x0 is contained in the interior of each Hi. □

2.4 Polytopes and Polyhedral Cones

A polytope in V is the convex hull of a finite set of points. Polytopes were already studied in antiquity, but the subject continues to be an active area of research; Ziegler (1995) is a very accessible introduction. We have just seen that a bounded polyhedron is a polytope. The most important fact about polytopes is the converse:

Theorem 2.6 A polytope is a polyhedron.

Proof Fix P := conv{q1, . . . , qℓ}. The property of being a polyhedron is invariant under translations: for any x ∈ V, P is a polyhedron if and only if x + P is also a polyhedron. It is also invariant under passage to subspaces: P is a polyhedron in V if and only if it is a polyhedron in the span of P, and in any intermediate subspace. The two invariances imply that we may reduce to a situation where the dimension of V is the same as the dimension of P, and from there we may translate to make the origin of V an interior point of P. Assume this is the case.

Let

$$P^* := \{\, v \in V : \langle v, p \rangle \le 1 \text{ for all } p \in P \,\}$$

and

$$P^{**} := \{\, u \in V : \langle u, v \rangle \le 1 \text{ for all } v \in P^* \,\}.$$

Since P is bounded and has the origin as an interior point, P∗ is bounded with the origin in its interior. The formula $P^* = \bigcap_j \{\, v \in V : \langle v, q_j \rangle \le 1 \,\}$ displays P∗ as a polyhedron, hence a polytope. This argument with P∗ in place of P implies that P∗∗ is a bounded polyhedron, so it suffices to show that P∗∗ = P. The definitions immediately imply that P ⊂ P∗∗.

Suppose that z ∉ P. The separating hyperplane theorem gives w ∈ V and β ∈ ℝ such that 〈w, z〉 < β and 〈w, p〉 > β for all p ∈ P. Since the origin is in P, β < 0. Dividing by β reverses both inequalities, so 〈w/β, p〉 < 1 for all p ∈ P and 〈w/β, z〉 > 1. Therefore w/β ∈ P∗, and consequently z ∉ P∗∗. □
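The duality P∗∗ = P used in this proof can be checked on a concrete case. The sketch below takes P to be the square with vertices (±1, ±1) and tests, on a rational grid, that P∗ is the diamond |v1| + |v2| ≤ 1 and that the polar of the diamond is again the square. The example polytope, the grid, and the function names are assumptions of this illustration, and a grid check is of course evidence rather than a proof.

```python
from fractions import Fraction
from itertools import product

SQUARE_VERTICES = [(1, 1), (1, -1), (-1, 1), (-1, -1)]   # the points q_j
DIAMOND_VERTICES = [(1, 0), (-1, 0), (0, 1), (0, -1)]    # candidate vertices of P*


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def in_polar(v, generators):
    """Check <v, q> <= 1 for each generator q.

    The inequality is linear in q, so it then holds on the convex hull
    of the generators as well.
    """
    return all(dot(v, q) <= 1 for q in generators)


def grid(step=Fraction(1, 4), radius=2):
    """Rational grid points in [-radius, radius]^2."""
    count = int(radius / step)
    ticks = [k * step for k in range(-count, count + 1)]
    return list(product(ticks, repeat=2))


def polar_is_diamond():
    return all(
        in_polar(v, SQUARE_VERTICES) == (abs(v[0]) + abs(v[1]) <= 1)
        for v in grid()
    )


def bipolar_is_square():
    return all(
        in_polar(u, DIAMOND_VERTICES) == (max(abs(u[0]), abs(u[1])) <= 1)
        for u in grid()
    )
```

Exact rational arithmetic is used so that boundary points of the square and the diamond are classified without rounding error.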

There is now the following elegant decomposition result:

Proposition 2.8 Any polyhedron P is the sum of a finite dimensional linear subspace, a finite dimensional pointed cone, and a polytope.

Proof Let L be its lineality, and let K be a linear subspace of V that is complementary to L in the sense that K ∩ L = {0} and K + L = V. Let Q := P ∩ K. Then P = Q + L, and the lineality of Q is {0}, so RQ is pointed. Let S be the convex hull of the set of initial points of Q. Above we saw that this is the convex hull of the set of vertices of Q, so S is a polytope. Now Proposition 2.5 gives

$$P = L + R_Q + S.$$

□

A polyhedral cone is a cone that is a polyhedron.

Proposition 2.9 For a cone C ⊂ V the following are equivalent:

(a) C is a polyhedral cone.
(b) C is a finite intersection of closed half-spaces that contain the origin in their boundary.
(c) C is the convex hull of finitely many rays emanating from the origin.

Proof First suppose that C is a polyhedral cone. Let $C = G \cap \bigcap_{i=1}^{k} H_i$ with Hi = { v ∈ V : 〈v, ni〉 ≤ αi } be a standard representation of C. Since 0 ∈ C we have αi ≥ 0 for all i. For each i let $H'_i := \{\, v \in V : \langle v, n_i \rangle \le 0 \,\}$. If there were any v ∈ C such that 〈v, ni〉 > 0, then large scalar multiples of v would not be in Hi. Therefore $C \subset G \cap \bigcap_{i=1}^{k} H'_i \subset G \cap \bigcap_{i=1}^{k} H_i = C$. Since G is a finite intersection of half-spaces containing the origin in the boundary, (b) holds. Thus (a) implies (b), and of course (b) implies (a).

Let W be a subspace of V that is complementary to LC, and let C′ := C ∩ W. Proposition 2.3 implies that C′ is a pointed closed convex cone and LC + C′ = C. Lemma 2.7 gives an n ∈ W such that 〈n, x〉 > 0 for all x ∈ C′ \ {0}. Let A := { x ∈ W : 〈n, x〉 = 1 }, and let P := C ∩ A. To see that P is bounded observe that if this was not the case there would be a sequence {ur} of unit vectors in C′ with 〈n, ur〉 → 0, and any limit point of this sequence would contradict the choice of n. Since P is bounded, it is the convex hull of its vertices if and only if it is the intersection of finitely many closed half-spaces.

In particular, if (b) holds, then P is the intersection of finitely many closed half-spaces, so it is the convex hull of finitely many points, after which it is easy to see that (c) holds. Suppose that (c) holds, so C is the convex hull of R1 ∪ · · · ∪ Rk where each Ri = { αvi : α ≥ 0 }. For each i let $v'_i$ be the point where vi + LC intersects C′, and let $R'_i = \{\, \alpha v'_i : \alpha \ge 0 \,\}$. Then C′ is the convex hull of $R'_1 \cup \cdots \cup R'_k$, and P is the convex hull of the set of points where these rays intersect { v ∈ V : 〈n, v〉 = 1 }. Therefore P is a finite intersection H1 ∩ · · · ∩ Hk of closed half-spaces of W. It is easy to see that for each i there is an $H'_i$ with the origin in its boundary such that $H'_i \cap A = H_i \cap A$. Then $C = (H'_1 + L_C) \cap \cdots \cap (H'_k + L_C)$. Thus (c) implies (a). □

2.5 Polyhedral Complexes

A wide variety of spaces can be created by taking the union of a collection of polyhedra.

Definition 2.1 A polyhedral complex is a set 𝒫 of nonempty polyhedra in V such that:

(a) F ∈ 𝒫 whenever P ∈ 𝒫 and F is a nonempty face of P;
(b) for all P, P′ ∈ 𝒫, P ∩ P′ is a common (possibly empty) face of P and P′.

The underlying space of the complex is

$$|\mathcal{P}| := \bigcup_{P \in \mathcal{P}} P.$$

We say that 𝒫 is a polyhedral subdivision of |𝒫|. The dimension of 𝒫 is the maximum dimension of any of its elements, or ∞ if 𝒫 contains polytopes of all dimensions. For n = 0, 1, 2, . . . the n-skeleton of 𝒫 is the set 𝒫n ⊂ 𝒫 consisting of all elements of 𝒫 of dimension n or less. The complex is finite if 𝒫 is a finite set, and it is locally finite if each element of 𝒫 has a nonempty intersection with only finitely many other elements of 𝒫. A subset 𝒬 ⊂ 𝒫 is a subcomplex if it contains all the faces of its elements.

To illustrate these concepts we mention a structure that was first studied by Descartes, and that has accumulated a huge literature over the centuries. Let x1, . . . , xn be distinct points in V. The Voronoi diagram determined by these points is

$$\mathcal{P} := \{\, P_J : \emptyset \ne J \subset \{1, \ldots, n\} \,\} \cup \{\emptyset\}$$

where

$$P_J := \{\, y \in V : \|y - x_j\| \le \|y - x_i\| \text{ for all } j \in J \text{ and } i = 1, \ldots, n \,\}$$

is the set of points such that the xj for j ∈ J are as close to y as any of the points x1, . . . , xn. From Euclidean geometry we know that the condition ‖y − xj‖ ≤ ‖y − xi‖ determines a half-space in V (a quick calculation shows that ‖y − xj‖² ≤ ‖y − xi‖² if and only if 〈y, xj − xi〉 ≥ ½(‖xj‖² − ‖xi‖²)), so each PJ is a polyhedron, and conditions (a) and (b) are easy consequences of Proposition 2.6.
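The "quick calculation" behind the half-space description can be spot-checked numerically. The sketch below compares the distance condition with the linear inequality, written with squared norms and multiplied by 2 so that integer inputs stay exact, and also computes the largest index set J with y ∈ PJ. The function names and the 0-based indexing of the sites are assumptions of this example.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def closer_by_distance(y, xj, xi):
    """The defining condition ||y - xj|| <= ||y - xi||, compared via squares."""
    return sq_dist(y, xj) <= sq_dist(y, xi)


def closer_by_half_space(y, xj, xi):
    """The linear form: 2 <y, xj - xi> >= ||xj||^2 - ||xi||^2."""
    diff = tuple(a - b for a, b in zip(xj, xi))
    return 2 * dot(y, diff) >= dot(xj, xj) - dot(xi, xi)


def nearest_sites(y, sites):
    """Indices (0-based) of the sites at minimal distance from y.

    This is the largest J such that y belongs to the cell P_J.
    """
    best = min(sq_dist(y, x) for x in sites)
    return {j for j, x in enumerate(sites) if sq_dist(y, x) == best}
```

For instance, with the three sites (0, 0), (2, 0), (0, 2), the point (1, 1) is equidistant from all of them, so it lies in the cell PJ for every nonempty J.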

The notion of a polyhedral complex can be specialized by requiring certain types of polyhedra. We say that 𝒫 is a polytopal complex if each P ∈ 𝒫 is a polytope. A k-dimensional simplex is the convex hull of an affinely independent collection of points x0, . . . , xk. We say that 𝒫 is a simplicial complex, and that 𝒫 is a triangulation of |𝒫|, if each P ∈ 𝒫 is a simplex.

If 𝒫 and 𝒬 are polytopal complexes such that |𝒬| = |𝒫| and each P ∈ 𝒫 is the union of finitely many elements of 𝒬, then 𝒬 is a subdivision of 𝒫. We now describe a general method of subdividing 𝒫. Let 𝒫′ be a subset of 𝒫 such that P ∈ 𝒫′ whenever Q ∈ 𝒫′ and Q is a face of P. Let Σ := Σ0 ∪ Σ1 where:

(a) Σ0 is the set of {P0, P1, . . . , Pk} ⊂ 𝒫 such that P0 ∉ 𝒫′, P1, . . . , Pk ∈ 𝒫′, and P0 ⊂ P1 ⊂ · · · ⊂ Pk.
(b) Σ1 is the set of {P1, . . . , Pk} ⊂ 𝒫′ such that P1 ⊂ · · · ⊂ Pk.

Let $w := (w_P)_{P \in \mathcal{P}'}$ be a specification of a point wP in the relative interior of each P ∈ 𝒫′. For σ = {P0, . . . , Pk} ∈ Σ0 let

$$Q_\sigma(w) := \operatorname{conv}(P_0 \cup \{w_{P_1}, \ldots, w_{P_k}\}),$$

and for σ = {P1, . . . , Pk} ∈ Σ1 let

$$Q_\sigma(w) := \operatorname{conv}(\{w_{P_1}, \ldots, w_{P_k}\}).$$

The subdivision of 𝒫 relative to w is 𝒫(w) := { Qσ(w) : σ ∈ Σ }.

Lemma 2.9 𝒫(w) is a polytopal complex, and a subdivision of 𝒫. If each element of 𝒫 \ 𝒫′ is a simplex, then 𝒫(w) is a simplicial complex.

Proof Fix σ ∈ Σ. First suppose that σ = {P0, P1, . . . , Pk} ∈ Σ0. Since P0 is a polytope, it is the convex hull of finitely many points, so Qσ(w) is the convex hull of finitely many points, i.e., a polytope. If P0 is a simplex, i.e., the convex hull of a finite set of affinely independent points, then for each i, $w_{P_i}$ is not in the affine hull of $P_{i-1}$, so by induction each $\operatorname{conv}(P_0 \cup \{w_{P_1}, \ldots, w_{P_i}\})$ is a simplex, and in particular Qσ(w) is a simplex. The proof that Qσ(w) is a simplex when σ ∈ Σ1 is similar, but simpler.

Consider a second σ′ ∈ Σ. We will show that $Q_\sigma(w) \cap Q_{\sigma'}(w) = Q_{\sigma \cap \sigma'}$, so that $Q_\sigma(w) \cap Q_{\sigma'}(w)$ is a face of both Qσ(w) and Qσ′(w). Clearly $Q_{\sigma \cap \sigma'} \subset Q_\sigma(w) \cap Q_{\sigma'}(w)$. It suffices to show the reverse inclusion with σ and σ′ replaced by any $\tilde\sigma \subset \sigma$ and $\tilde\sigma' \subset \sigma'$ such that $Q_{\tilde\sigma} \cap Q_{\tilde\sigma'} = Q_\sigma(w) \cap Q_{\sigma'}(w)$. Therefore we may assume that $Q_\sigma(w) \cap Q_{\sigma'}(w)$ has points in the interiors of the largest elements $P_k$ and $P'_{k'}$ of σ and σ′, and any convex combination of such points is a point x in both interiors. Since 𝒫 is a polytopal complex, it follows that $P_k = P'_{k'}$. In addition, the ray emanating from $w_{P_k}$ and passing through x leaves Pk at a point $y \in Q_{\sigma \setminus \{P_k\}} \cap Q_{\sigma' \setminus \{P'_{k'}\}}$, and the claim follows by induction on max{k, k′}. We have shown that 𝒫(w) is a polytopal complex.


Evidently |𝒫(w)| ⊂ |𝒫|. Choosing x ∈ |𝒫| arbitrarily, let P be the smallest element of 𝒫 that contains x. If P ∉ 𝒫′, then x ∈ P = Q{P}, so suppose that P ∈ 𝒫′. If x = wP, then x ∈ Q{P}, and when P is 0-dimensional this is the only possibility. Otherwise the ray emanating from wP and passing through x intersects the boundary of P at a point y, and if y ∈ Qσ(w), then x ∈ Qσ∪{P}. By induction on dimension we see that x is contained in some element of 𝒫(w) of the form Qσ∪{P}, so |𝒫(w)| = |𝒫| and P is the union of finitely many elements of 𝒫(w). □

If 𝒫′ includes every element of 𝒫 of dimension two or higher, then every element of 𝒫 \ 𝒫′ is a point or a line segment, hence a simplex, and 𝒫(w) is a simplicial complex, so the underlying space of a polytopal complex is also the underlying space of a simplicial complex. Since each P ∈ 𝒫 is a finite union of elements of 𝒫(w), the CW topologies of |𝒫| induced by 𝒫 and 𝒫(w) coincide. Therefore the topological spaces with polytopal decompositions are not more general than those with simplicial decompositions, and for this reason polytopal complexes that are not simplicial are rarely considered in the topological literature.

If 𝒫′ = 𝒫, then we say that 𝒫(w) is a complete subdivision of 𝒫. In analysis with a geometric aspect we often use the following particular case. If P is a polytope whose vertices are v0, . . . , vm, the barycenter of P is

$$\beta_P := \frac{1}{m+1}(v_0 + \cdots + v_m).$$

If $\beta = (\beta_P)_{P \in \mathcal{P}}$, 𝒫(β) is the barycentric subdivision of 𝒫, or the derived of 𝒫 (Fig. 2.1).

In connection with one of the proofs of Brouwer’s fixed point theorem, we will need to establish that there are simplicial subdivisions of a given simplex with arbitrarily small simplices. The diameter of a polytope is the maximum distance between any two of its points. It is easy to see (and not hard to show formally, using the triangle inequality for the norm) that the diameter of a polytope is the maximal distance between any two of its vertices. The mesh of a finite polytopal complex is the maximum of the diameters of its polytopes. Because we can pass from a given complex to its derived, and then the derived of the derived and so forth, a subdivision of arbitrarily small mesh can be obtained if passing to the derived necessarily reduces the mesh by some fixed factor.

Fig. 2.1 A barycentric subdivision

Consider two vertices of a simplex of the barycentric subdivision, say βP as above and βQ, where P is a face of Q. We can index the vertices so that the vertices of Q are v0, . . . , vk and the vertices of P are v0, . . . , vℓ where ℓ < k. Then

$$\begin{aligned}
\|\beta_Q - \beta_P\|
&= \Big\| \frac{1}{k+1}(v_0 + \cdots + v_k) - \frac{1}{\ell+1}(v_0 + \cdots + v_\ell) \Big\| \\
&= \Big\| \frac{1}{(k+1)(\ell+1)} \Big( (\ell+1) \sum_{i=0}^{k} v_i - (k+1) \sum_{j=0}^{\ell} v_j \Big) \Big\| \\
&\le \frac{1}{(k+1)(\ell+1)} \sum_{0 \le i \le k} \; \sum_{0 \le j \le \ell,\, j \ne i} \|v_i - v_j\|
\le \frac{k}{k+1} \max_{i,j} \|v_i - v_j\| .
\end{aligned}$$

(The last step uses the fact that the double sum has (k + 1)(ℓ + 1) − (ℓ + 1) = k(ℓ + 1) terms.) If 𝒫 is m-dimensional, then k ≤ m, so this inequality implies that the mesh of 𝒫(β) is at most m/(m + 1) times the mesh of 𝒫.

This inequality seems quite crude, but it is enough for our purposes. The barycentric subdivision of a polytopal complex 𝒫 is called the derived of 𝒫, or the first derived of 𝒫, which may be denoted by $\mathcal{P}^{(1)}$. We inductively define the nth derived $\mathcal{P}^{(n)}$ to be the barycentric subdivision of $\mathcal{P}^{(n-1)}$. Since each successive subdivision reduces the mesh by a factor of at most m/(m + 1) we have:

Proposition 2.10 The underlying space of a finite polytopal complex has triangulations of arbitrarily small mesh.
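The shrinking of the mesh under repeated barycentric subdivision can be observed directly for a single simplex. The following sketch is an illustration under stated assumptions, not a general implementation: it handles one simplex given by its vertices, lists only the top-dimensional simplices of the subdivision (one per ordering of the vertices, i.e., per maximal chain of faces), and works with squared distances in exact rational arithmetic. The function names are hypothetical. The test compares the result with the factor m/(m + 1), which follows from the triangle-inequality estimate k/(k + 1) with k ≤ m.

```python
from fractions import Fraction
from itertools import combinations, permutations


def barycenter(points):
    """Average of a nonempty sequence of points, computed exactly."""
    n = len(points)
    return tuple(Fraction(sum(coords), n) for coords in zip(*points))


def barycentric_subdivision(simplex):
    """Top-dimensional simplices of the barycentric subdivision of one simplex.

    Each ordering (v_0, ..., v_m) of the vertices gives the chain of faces
    conv{v_0} < conv{v_0, v_1} < ..., whose barycenters span one simplex.
    """
    return [
        tuple(barycenter(order[: i + 1]) for i in range(len(order)))
        for order in permutations(simplex)
    ]


def derived(simplices):
    """Subdivide every simplex in a list once."""
    return [t for s in simplices for t in barycentric_subdivision(s)]


def sq_diameter(simplex):
    """Squared diameter, which is attained at a pair of vertices."""
    return max(
        (sum((a - b) ** 2 for a, b in zip(u, v))
         for u, v in combinations(simplex, 2)),
        default=Fraction(0),
    )


def sq_mesh(simplices):
    return max(sq_diameter(s) for s in simplices)
```

For the triangle with vertices (0, 0), (1, 0), (0, 1), whose squared diameter is 2, one subdivision produces 6 triangles with squared mesh 5/9, comfortably below the bound (2/3)² · 2 = 8/9.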

2.6 Simplicial Approximation

Barycentric subdivision shows that the underlying space of a polyhedral complex is also the underlying space of a simplicial complex. Furthermore, simplicial complexes can be understood in purely combinatoric terms. For the remainder of the chapter we drop the inner product space V and recycle its symbol. A combinatoric simplicial complex is a pair S = (V, Σ) where V is a set of vertices and Σ is a collection of finite subsets of V called simplices with the property that τ ∈ Σ whenever σ ∈ Σ and τ ⊂ σ. (Note that now the empty set is regarded as a cell, and as a face of every simplex. Of course this is purely a matter of formal convenience.) It might seem natural to insist that $\bigcup_{\sigma \in \Sigma} \sigma = V$ because a vertex that is not in any simplex has no role, but this would cause some inconvenience when we consider subcomplexes. For the most part the definitions above for polyhedral complexes extend naturally to combinatoric simplicial complexes. For example, a subcomplex of (V, Σ) is a pair (V, Σ′) where Σ′ is a subset of Σ that contains the subsets of each of its elements.

The dimension of a simplex is its cardinality minus one, and the n-skeleton of S is Sn := (V, Σn) where Σn is the set of simplices of dimension n or less.
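Because a combinatoric simplicial complex is just a family of finite sets closed under passage to subsets, it is easy to represent on a computer. The sketch below stores Σ as a set of frozensets; the helper names `closure`, `is_simplicial_complex`, `dimension`, and `skeleton` are chosen for this example and are not notation from the text.

```python
from itertools import combinations


def closure(generators):
    """All subsets, including the empty set, of the given generating simplices."""
    simplices = set()
    for sigma in generators:
        sigma = frozenset(sigma)
        for size in range(len(sigma) + 1):
            simplices.update(frozenset(tau) for tau in combinations(sigma, size))
    return simplices


def is_simplicial_complex(simplices):
    """Check that tau is in the family whenever sigma is and tau is a subset of sigma."""
    return all(
        frozenset(tau) in simplices
        for sigma in simplices
        for size in range(len(sigma) + 1)
        for tau in combinations(sigma, size)
    )


def dimension(sigma):
    """Cardinality minus one, so the empty simplex has dimension -1."""
    return len(sigma) - 1


def skeleton(simplices, n):
    """Simplices of dimension n or less."""
    return {sigma for sigma in simplices if dimension(sigma) <= n}
```

For example, the complex generated by a triangle {a, b, c} and an edge {c, d} has 10 simplices counting the empty one, and its 1-skeleton drops only the triangle.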

The geometric interpretation is as follows. We now work in ℝ^V and we let { ev : v ∈ V } be the standard unit basis vectors. That is, ℝ^V is the vector space of finite sums $\alpha_1 e_{v_1} + \cdots + \alpha_k e_{v_k}$. For each nonempty σ ∈ Σ let |σ| be the convex hull of { ev : v ∈ σ }, and let |∅| = ∅. The collection of geometric simplices

$$\mathcal{P}_{(V,\Sigma)} := \{\, |\sigma| : \emptyset \ne \sigma \in \Sigma \,\}$$

is called the canonical realization of S. Usually we will write 𝒫 in place of $\mathcal{P}_{(V,\Sigma)}$ if (V, Σ) is either unimportant or clear from context. A subcomplex of 𝒫 is a 𝒬 ⊂ 𝒫 that contains all the faces of its elements. For n = 0, 1, 2, . . . the n-skeleton of 𝒫 is $\mathcal{P}_n := \mathcal{P}_{(V,\Sigma_n)}$, i.e., the set of elements of 𝒫 of dimension no greater than n.

The space of 𝒫 is $|\mathcal{P}| := \bigcup_{P \in \mathcal{P}} P$. We endow |𝒫| with the CW topology, which is the finest topology that induces the usual topology on each P. Concretely, a set U ⊂ |𝒫| is open if and only if its intersection with each P is open in the usual sense. Note that a function with domain |𝒫| is continuous if and only if its restriction to each P ∈ 𝒫 is continuous.

Suppose that $\tilde{\mathcal{P}}$ is a collection of simplices in a finite dimensional vector space, and the intersection of any two elements of $\tilde{\mathcal{P}}$ is a common face. Let $|\tilde{\mathcal{P}}| := \bigcup_{P \in \tilde{\mathcal{P}}} P$. Let V be the set of vertices of elements of $\tilde{\mathcal{P}}$, let Σ be the set of subsets of V whose convex hulls are elements of $\tilde{\mathcal{P}}$, and let $\mathcal{P} := \mathcal{P}_{(V,\Sigma)}$. There is an obvious bijection $f : |\mathcal{P}| \to |\tilde{\mathcal{P}}|$ that is continuous, but its inverse need not be continuous. For example, if v0 is the origin of the vector space, the other elements of V = {v0, v1, v2, . . .} are distinct unit vectors, and the cells of $\tilde{\mathcal{P}}$ are ∅, the vi, and the line segments from v0 to each vn for n ≥ 1, then $\bigcup_n \{\, (1 - t)v_0 + t v_n : 0 \le t < 1/n \,\}$ is open in the CW topology but not in the topology inherited from the vector space.

Lemma 2.10 If 𝒬 ⊂ 𝒫 is a subcomplex, then the relative topology of |𝒬| induced by the CW topology of |𝒫| is the CW topology of |𝒬|.

Proof It is immediate from the definition that if W ⊂ |𝒫| is open, then W ∩ |𝒬| is open in |𝒬|. Supposing that U ⊂ |𝒬| is open, an open $W = \bigcup_n W_n \subset |\mathcal{P}|$ such that W ∩ |𝒬| = U can be constructed by induction on skeletons. Let $W_0 := U \cap |\mathcal{Q}_0|$. If an open $W_{n-1} \subset |\mathcal{P}_{n-1}|$ with $W_{n-1} \cap |\mathcal{Q}_{n-1}| = U \cap |\mathcal{Q}_{n-1}|$ has already been constructed, let $W_n$ be any open subset of $|\mathcal{P}_n|$ such that $W_n \cap |\mathcal{Q}_n| = U \cap |\mathcal{Q}_n|$ and $W_n \cap |\mathcal{P}_{n-1}| = W_{n-1}$. Such a $W_n$ can be constructed by starting with $W_{n-1} \cup (U \cap |\mathcal{Q}_n|)$ and for each n-dimensional P ∈ 𝒫 \ 𝒬 appending an open subset of P whose intersection with the boundary of P is $P \cap W_{n-1}$. □

Simplicial complexes are very important in topology. On the one hand a wide variety of important spaces have simplicial subdivisions, and certain limiting processes can be expressed using repeated barycentric subdivision. On the other hand, the purely combinatoric nature of an abstract simplicial complex allows combinatoric and algebraic methods to be applied. In addition the requirement that a simplicial subdivision exists rules out spaces exhibiting various sorts of pathologies and infinite complexities. A nice example of a space that does not have a simplicial subdivision is the Hawaiian earring, which is the union over all n = 1, 2, 3, . . . of the circle of radius 1/n centered at (1/n, 0) ∈ ℝ². Section 8.1 presents another important example.

If 𝒫 and 𝒬 are simplicial complexes, a function f : |𝒫| → |𝒬| is simplicial (for 𝒫 and 𝒬) if its restriction to each P ∈ 𝒫 maps P affinely onto some Q ∈ 𝒬. That is, the vertices of 𝒫 are mapped to vertices of 𝒬, the set of vertices of each P ∈ 𝒫 is mapped onto the set of vertices of some Q ∈ 𝒬, and f|P is the linear interpolation of the map on the vertices of P, which means that it is the unique affine extension of this map. Simplicial maps have many nice properties, and are useful in various constructions, so it is good to know that any map between locally finite simplicial complexes can be approximated by a simplicial map, as we now show.

For x ∈ |𝒫| the closed star of x, denoted by $\overline{\mathrm{st}}(x, \mathcal{P})$, is the union of all the P ∈ 𝒫 that contain x, and the open star, denoted by st(x, 𝒫), is the union of the interiors of all the polytopes that contain x. If S ⊂ |𝒫| we let $\overline{\mathrm{st}}(S, \mathcal{P}) := \bigcup_{x \in S} \overline{\mathrm{st}}(x, \mathcal{P})$ and $\mathrm{st}(S, \mathcal{P}) := \bigcup_{x \in S} \mathrm{st}(x, \mathcal{P})$. If the discussion involves only one polytopal subdivision of |𝒫| we will often write $\overline{\mathrm{st}}(S)$ and st(S) in place of $\overline{\mathrm{st}}(S, \mathcal{P})$ and st(S, 𝒫). Since st(S) is a union of CW open sets, it is open, and $\overline{\mathrm{st}}(S)$ is a union of simplices, so it is closed. Evidently the closed star may be regarded as a subcomplex of 𝒫, and the distinction between the subset of 𝒫 and the subspace of |𝒫| is best treated with a degree of informality, so that the reader will usually be expected to infer from context which interpretation of the term ‘star’ is intended. Note that if v is a vertex of 𝒫, then st(v) is precisely the set of x ∈ |𝒫| such that v is one of the vertices of the smallest simplex containing x.
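For a vertex v both stars have simple combinatorial descriptions, which the following sketch records for a complex stored as a set of frozensets. It is only an illustration: the names are hypothetical, the closed star is returned as a subcomplex, and the open star is represented by the simplices whose relative interiors make it up, following the remark that the open star of v consists of the points whose smallest containing simplex has v as a vertex.

```python
from itertools import combinations


def closure(generators):
    """All subsets, including the empty set, of the generating simplices."""
    return {
        frozenset(tau)
        for sigma in generators
        for size in range(len(sigma) + 1)
        for tau in combinations(sorted(sigma), size)
    }


def closed_star(simplices, v):
    """Subcomplex of all faces of simplices that contain the vertex v."""
    containing = [sigma for sigma in simplices if v in sigma]
    return {tau for tau in simplices if any(tau <= sigma for sigma in containing)}


def open_star_cells(simplices, v):
    """Simplices containing v; the open star is the union of their relative interiors."""
    return {sigma for sigma in simplices if v in sigma}
```

In the complex generated by a triangle {a, b, c} and an edge {c, d}, the closed star of c is the whole complex, while the closed star of d consists of the edge {c, d} and its faces.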

Lemma 2.11 Suppose that g : |𝒫| → |𝒬| is a map and ϕ : |𝒫0| → |𝒬0| is a map of vertices such that for each vertex v of 𝒫, g(st(v)) ⊂ st(ϕ(v)). Then ϕ extends by linear interpolation to a simplicial map f : |𝒫| → |𝒬|.

Proof We need to show that the vertices v0, . . . , vk of any P ∈ 𝒫 are mapped into the set of vertices of some Q ∈ 𝒬. If x is in the interior of P, then $g(x) \in g(\bigcap_i \mathrm{st}(v_i)) \subset \bigcap_i g(\mathrm{st}(v_i)) \subset \bigcap_i \mathrm{st}(\phi(v_i))$, and thus ϕ(v0), . . . , ϕ(vk) are vertices of the smallest Q ∈ 𝒬 containing g(x). □

In general Uε(x) := { y ∈ X : d(x, y) < ε } and $U_\varepsilon(S) := \bigcup_{x \in S} U_\varepsilon(x)$ are the open balls around a point x and a set S in a metric space (X, d).

Lemma 2.12 (Lebesgue’s Number Lemma) If U1, . . . , Uk is an open cover of a compact metric space (X, d), then there is some ε > 0 such that for every x ∈ X there is some i such that Uε(x) ⊂ Ui.

Proof The set of balls Uδ(x) such that U2δ(x) is contained in some Ui has a finite subcover $U_{\delta_1}(x_1), \ldots, U_{\delta_k}(x_k)$. Let ε := min δi. □


Theorem 2.7 (Simplicial Approximation Theorem) If 𝒫 and 𝒬 are finite simplicial complexes, g : |𝒫| → |𝒬| is continuous, and W ⊂ |𝒫| × |𝒬| is a neighborhood of Gr(g), then there is a map f : |𝒫| → |𝒬| that is simplicial for $\mathcal{P}^{(m)}$ and $\mathcal{Q}^{(n)}$, for some m and n, such that Gr(f) ⊂ W.

Proof Choose numbers δ, ε > 0 small enough that for all x ∈ |𝒫|, Uδ(x) × Uε(g(x)) ⊂ W. (If there were no such numbers one could take a sequence of points in (|𝒫| × |𝒬|) \ W that converges to a point in Gr(g).) Choose an n such that every simplex of $\mathcal{Q}^{(n)}$ is contained in each ball of radius ε centered at one of its points. The sets g⁻¹(st(w)) with w a vertex of $\mathcal{Q}^{(n)}$ are an open cover of |𝒫|, so the Lebesgue number lemma implies that we may replace δ with a smaller number such that any subset of |𝒫| of diameter < δ/2 is contained in one of these sets. Now choose a number m such that the mesh of $\mathcal{P}^{(m)}$ is less than δ. For each vertex v of $\mathcal{P}^{(m)}$ choose a vertex ϕ(v) of $\mathcal{Q}^{(n)}$ such that g(st(v)) ⊂ st(ϕ(v)). Lemma 2.11 implies that ϕ extends by linear interpolation to a map f : |𝒫| → |𝒬| that is simplicial for $\mathcal{P}^{(m)}$ and $\mathcal{Q}^{(n)}$. Clearly Gr(f) ⊂ W. □

2.7 Graphs

A graph is a one dimensional polytopal complex. That is, it consists of finitely many zero and one dimensional polytopes, with the one dimensional polytopes intersecting at common endpoints, if they intersect at all. A one dimensional polytope is just a line segment, which is a one dimensional simplex, so a graph is necessarily a simplicial complex.

Relative to general simplicial complexes, graphs sound pretty simple, and from the perspective of our work here this is indeed the case, but the reader should be aware that there is much more to graph theory than this. The formal study of graphs in mathematics began around the middle of the 20th century and quickly became an extremely active area of research, with numerous subfields, deep results, and various applications such as the theory of networks in economic theory. Among the numerous excellent texts in this area, Bollobás (1979) can be recommended to the beginner.

This book will use no deep or advanced results about graphs. In fact, about the only “result” we will apply is that a graph of maximal degree two is a disjoint union of isolated points, line segments, and cycles. The main purpose of this section is simply to introduce the basic terminology of the subject, which will be used extensively.

Formally, a graph¹ is a pair G = (V, E) consisting of a finite set V of vertices and a set E of two element subsets of V. An element e = {v, w} of E is called an edge, and v and w are its endpoints. Sometimes one writes vw in place of {v, w}. Two vertices are neighbors if they are the endpoints of an edge. The degree of a vertex is the cardinality of its set of neighbors.

¹ In the context of graph theory the sorts of graphs we describe here are said to be “simple,” to distinguish them from a more complicated class of graphs in which there can be loops (that is, edges whose two endpoints are the same) and multiple edges connecting a single pair of vertices. They are also said to be “undirected” to distinguish them from so-called directed graphs in which each edge is oriented, with a “source” and “target.”

A walk in G is a sequence v0v1 · · · vr of vertices such that vj−1 and vj are neighbors for each j = 1, . . . , r. It is a path if v0, . . . , vr are all distinct. A path is maximal if it is not contained (in the obvious sense) in a longer path. Two vertices are connected if they are the endpoints of a path. This is an equivalence relation, and a component of G is one of the graphs consisting of an equivalence class and the edges in G joining its vertices. We say that G is connected if it has only one component, so that any two vertices are connected. A walk v0v1 · · · vr is a cycle if r ≥ 3, v0, . . . , vr−1 are distinct, and vr = v0. If G has no cycles, then it is said to be acyclic. A connected acyclic graph is a tree.
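The terminology above translates directly into code. The following sketch computes neighbor sets and components by breadth-first search, and labels each component of a graph of maximal degree two as an isolated point, a path, or a cycle. The representation (a list of vertices and a list of two element frozensets) and the function names are assumptions of this example.

```python
from collections import deque


def neighbor_sets(vertices, edges):
    """Map each vertex to its set of neighbors."""
    nbrs = {v: set() for v in vertices}
    for edge in edges:
        v, w = tuple(edge)
        nbrs[v].add(w)
        nbrs[w].add(v)
    return nbrs


def components(vertices, edges):
    """Vertex sets of the components, found by breadth-first search."""
    nbrs = neighbor_sets(vertices, edges)
    seen, result = set(), []
    for start in vertices:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:
            v = queue.popleft()
            for w in nbrs[v] - component:
                component.add(w)
                queue.append(w)
        seen |= component
        result.append(component)
    return result


def classify(vertices, edges):
    """Label each component of a graph whose maximal degree is at most two."""
    nbrs = neighbor_sets(vertices, edges)
    if any(len(n) > 2 for n in nbrs.values()):
        raise ValueError("some vertex has degree greater than two")
    labels = []
    for component in components(vertices, edges):
        if len(component) == 1:
            labels.append("isolated point")
        elif all(len(nbrs[v]) == 2 for v in component):
            labels.append("cycle")
        else:
            labels.append("path")
    return labels
```

A component in which every vertex has degree two must close up into a cycle, while a component with a vertex of degree one is a path; this is the elementary fact about graphs of maximal degree two mentioned earlier in the section.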

Exercises

2.1 We consider an exchange economy with ℓ commodities and m consumers. For each i = 1, . . . , m, agent i’s consumption set is a nonempty Xi ⊂ ℝ^ℓ, and her preference relation is a binary relation ≽i on Xi that is complete (for all x, y ∈ Xi, either x ≽i y or y ≽i x) and transitive. We write x ≻i y to indicate that x ≽i y and not y ≽i x. The strict upper contour set of i at x ∈ Xi is Ui(x) := { y ∈ Xi : y ≻i x }. There is an aggregate endowment ω ∈ ℝ^ℓ. An allocation is an m-tuple (x1, . . . , xm) ∈ X1 × · · · × Xm such that $\sum_i x_i = \omega$. This allocation is weakly Pareto efficient if there is no other allocation (x′1, . . . , x′m) such that x′i ∈ Ui(xi) for all i. The allocation (x1, . . . , xm) is an equilibrium allocation for a price vector p ∈ ℝ^ℓ \ {0} if 〈p, yi〉 > 〈p, xi〉 for all i and yi ∈ Ui(xi).

(a) (First Fundamental Welfare Theorem of Economics) Prove that if (x1, . . . , xm) is an equilibrium allocation for some p, then it is weakly Pareto efficient.

We say that ≽i is convex if Ui(x) is convex for all x ∈ Xi, and that it is locally nonsatiated if, for all x ∈ ℝ^ℓ₊, x is contained in the closure of Ui(x). The allocation (x1, . . . , xm) is a quasiequilibrium allocation for a price vector p ∈ ℝ^ℓ \ {0} if 〈p, yi〉 ≥ 〈p, xi〉 for all i and yi ∈ Ui(xi).

(b) Prove that if each ≽i is convex and locally nonsatiated, and (x1, . . . , xm) is a weakly Pareto efficient allocation, then it is a quasiequilibrium allocation.

(c) (Second Fundamental Welfare Theorem of Economics) Prove that if each ≽i is convex and locally nonsatiated, each Xi is convex, (x1, . . . , xm) is a weakly Pareto efficient allocation, each xi is in the interior of Xi, and each Ui(xi) is open, then (x1, . . . , xm) is an equilibrium allocation.

2.2 This problem covers the key application of Farkas’ lemma to linear programming. Let A be an m × n matrix, let b be an element of ℝ^m, and let c be an element of ℝ^n. The primal problem is to choose x ∈ ℝ^n₊ to maximize cᵀx subject to Ax ≤ b. The dual problem is to choose y ∈ ℝ^m₊ to minimize yᵀb subject to yᵀA ≥ cᵀ.

A point x ∈ ℝ^n₊ is feasible for the primal problem if Ax ≤ b, and the primal problem is feasible if such an x exists. The primal problem is unbounded if there are feasible x for which cᵀx takes on arbitrarily large values, and otherwise it is bounded. Feasibility and boundedness for the dual problem are defined similarly.

(a) Prove that the dual problem for the data (A, b, c) is the primal problem for the data (−Aᵀ, −c, −b).

(b) (Weak Duality Theorem) Prove that if x is feasible for the primal problem and y is feasible for the dual problem, then cᵀx ≤ yᵀb.

(c) Use Farkas’ lemma to prove that if the primal problem is feasible and the dual problem is infeasible, then the primal problem is unbounded.
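To make the definitions of feasibility concrete, here is a small numerical instance; the data A, b, c and the helper names are invented for this illustration. The code only checks feasibility and compares objective values for particular points, so it illustrates the statement of weak duality without proving anything.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mat_vec(A, x):
    """The product A x."""
    return [dot(row, x) for row in A]


def vec_mat(y, A):
    """The row vector y^T A."""
    return [sum(yi * row[j] for yi, row in zip(y, A)) for j in range(len(A[0]))]


def primal_feasible(A, b, x):
    """x >= 0 and A x <= b."""
    return all(xi >= 0 for xi in x) and all(
        lhs <= rhs for lhs, rhs in zip(mat_vec(A, x), b)
    )


def dual_feasible(A, c, y):
    """y >= 0 and y^T A >= c^T."""
    return all(yi >= 0 for yi in y) and all(
        lhs >= rhs for lhs, rhs in zip(vec_mat(y, A), c)
    )


# Hypothetical data for a problem with m = n = 2.
A = [[1, 1], [1, 3]]
b = [4, 6]
c = [1, 2]
```

With these data x = (1, 1) and y = (2, 0) are feasible with objective values 3 and 8, while x = (3, 1) and y = (1/2, 1/2) are feasible with the common value 5.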

The strong duality theorem asserts that if the primal and dual problems are both feasible, then they have the same optimal values. The proof requires some additional ideas, which allow one to reduce to the following situation.

(d) If m = n, A is invertible, x∗ is optimal for the primal problem, and all components of x∗ are positive, then y∗ := (A⁻¹)ᵀc is an optimal solution for the dual. (In order to prove that y∗ ≥ 0, understand y∗i as the change in the value of the primal resulting from marginally relaxing the constraint given by bi.)

2.3 A d × d matrix is a permutation matrix if its entries are all 0 or 1, and each column and each row has exactly one 1. A d × d matrix is bistochastic if its entries are all nonnegative, and the sum of the entries in each column, and in each row, is 1. The set of bistochastic matrices is the Birkhoff polytope.

(a) (Birkhoff-von Neumann theorem) Prove that the Birkhoff polytope is the convex hull of the set of permutation matrices.

(b) Prove that the Birkhoff polytope has d² facets.
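The two definitions in Exercise 2.3 are easy to test mechanically, which can help when experimenting with small cases. The sketch below is an illustration with hypothetical function names; it checks the definitions and forms convex combinations, and does not attempt either part of the exercise.

```python
from fractions import Fraction


def is_bistochastic(M):
    """Square, nonnegative entries, every row sum and column sum equal to 1."""
    d = len(M)
    return (
        all(len(row) == d for row in M)
        and all(entry >= 0 for row in M for entry in row)
        and all(sum(row) == 1 for row in M)
        and all(sum(M[i][j] for i in range(d)) == 1 for j in range(d))
    )


def is_permutation_matrix(M):
    """Entries 0 or 1 with exactly one 1 in each row and each column."""
    return all(entry in (0, 1) for row in M for entry in row) and is_bistochastic(M)


def convex_combination(weights, matrices):
    """Entrywise weighted sum of square matrices of equal size."""
    d = len(matrices[0])
    return [
        [sum(w * M[i][j] for w, M in zip(weights, matrices)) for j in range(d)]
        for i in range(d)
    ]


IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
SHIFT = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
```

For example, the average of `IDENTITY` and `SHIFT` is bistochastic but is not a permutation matrix.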

2.4 Suppose that a1 < · · · < ad. The permutahedron Πd−1(a1, . . . , ad) is the convex hull of all points whose components are a1, . . . , ad in some order.

(a) Prove that Πd−1(a1, . . . , ad) has 2^d − 2 facets by constructing their bounding inequalities.

(b) Give a linear map ℝ^(d²) → ℝ^d mapping the Birkhoff polytope onto Πd−1(a1, . . . , ad).

2.5 Let P ⊂ ℝ^d be a polytope that contains the origin in its interior. The polar or dual of P is

$$P^\Delta := \{\, y \in \mathbb{R}^d : \langle x, y \rangle \le 1 \text{ for all } x \in P \,\}.$$

(a) What is the polar of the cube [−1, 1]³ ⊂ ℝ³?

(b) Prove that (P^Δ)^Δ = P.

(c) For 0 ≤ i ≤ d − 1 describe a bijection between the i-dimensional faces of P and the (d − 1 − i)-dimensional faces of P^Δ.


2.6 Let P ⊂ ℝ^d be a d-dimensional polytope. We say that P is simple if every vertex is contained in exactly d facets, and we say that P is simplicial if each of its proper faces is a simplex. (Roughly, the intersection of a “generic” finite collection of half-spaces is a simple polytope if it is bounded, and the convex hull of a “generic” finite set of points with at least d + 1 elements is a simplicial polytope.)

(a) Prove that if P contains the origin in its interior, then it is simple if and only if P^Δ is simplicial.

(b) Prove that if d ≥ 3 and P is both simple and simplicial, then it is a simplex.

2.7 Let $\mathcal{P}$ be a finite simplicial complex. If $x \in |\mathcal{P}|$, the closed star of $x$ is the union of all the simplices that contain $x$, the link of $x$ is the union of all the simplices that do not contain $x$ but are faces of simplices that do contain $x$, and the open star is the set of points that are in the closed star of $x$ but not in the link of $x$.

(a) Prove that the link of $x$ is closed in $|\mathcal{P}|$ and the open star of $x$ is open in $|\mathcal{P}|$.

Now suppose that the elements of $\mathcal{P}$ are contained in $\mathbb{R}^d$ and $P := |\mathcal{P}|$ is a $d$-dimensional polytope. Let $V$ be the set of vertices of $\mathcal{P}$. For each $v \in V$ let $F_v$ be the smallest face of $P$ that contains $v$, and let $C_v$ be the union of all rays emanating from $v$ and passing through points in $P$.

(b) For $v \in V$ prove that $C_v - v$ is a polyhedral cone, and the span of $F_v - v$ is its lineality.

(c) For a given $v \in V$, let $S$ be the closed star of $v$, and let $L$ be its link. Let $\sigma_1, \ldots, \sigma_k$ be the maximal simplices in $L$. For each $i = 1, \ldots, k$ let $G_i$ be the face of $\sigma_i$ opposite $v$ (that is, $G_i$ is the convex hull of the vertices of $\sigma_i$ other than $v$), let $H_i$ be the affine hull of $G_i$, and let $O_i$ be the connected component of $\mathbb{R}^d \setminus H_i$ that contains $v$. Suppose that $v' \in F_v \cap \bigcap_i O_i$. For each $i = 1, \ldots, k$ let $\sigma'_i$ be the convex hull of $\{v'\} \cup G_i$. Prove that $\sigma'_1, \ldots, \sigma'_k$ and their faces constitute a simplicial complex $\mathcal{S}'$, and $|\mathcal{S}'| = S$.

(d) Prove that there is a collection $\{U_v\}_{v \in V}$, where each $U_v$ is a neighborhood of $v$ in $V$, such that if $\{w_v \in U_v\}_{v \in V}$ is a selection of points, and for each $\tau \in \mathcal{P}$, $\tau'$ is the convex hull of $\{\, w_v : v \in \tau \,\}$, then $\mathcal{P}' := \{\, \tau' : \tau \in \mathcal{P} \,\}$ is a simplicial complex and $|\mathcal{P}'| = P$.

2.8 A bipartite graph is a triple $B = (X, Y, E)$ where $X$ and $Y$ are disjoint finite sets and $E$ is a set of unordered pairs $\{x, y\}$ with $x \in X$ and $y \in Y$. The associated graph is $G = (V, E)$ where $V = X \cup Y$. (We usually think of a bipartite graph as a special type of graph, but it is possible that $G$ can be given more than one bipartite structure.) A match in $B$ is a set $M \subset E$ such that each element of $V$ is an endpoint of at most one element of $M$. Elements of $V$ that are (are not) endpoints of elements of $M$ are said to be matched (unmatched) in $M$. An alternating path for $M$ is a path $v_0 \cdots v_r$ in $G$ such that $\{v_i, v_{i+1}\}$ is in $M$ if $i$ is odd and not if $i$ is even, or vice versa.

(a) We say that a match $M$ is maximal if there is no match $M'$ with $|M'| > |M|$. Prove that if $M$ is not maximal, then there is an alternating path $v_0 \cdots v_r$ such that $v_0$ and $v_r$ are unmatched in $M$.

(b) Describe an algorithm that computes a maximal match in a number of steps that is bounded by a polynomial function of $|V|$.

(c) (Hall’s marriage theorem) For $W \subset X$ let $N(W)$ be the set of $y \in Y$ for which there is some $x \in W$ such that $\{x, y\} \in E$. Prove that there is a match in which all elements of $X$ are matched if and only if, for all $W \subset X$, $|N(W)| \ge |W|$.

Chapter 3
Computing Fixed Points

When it was originally proved, Brouwer’s fixed point theorem was a major breakthrough, providing a resolution of several outstanding problems in topology. Since that time the development of mathematical infrastructure has provided access to various useful techniques, and a number of easier demonstrations have emerged, but there are no proofs that are truly simple.

There are important reasons for this. The most common method of proving that some mathematical object exists is to provide an algorithm that constructs it, or some proxy such as an arbitrarily accurate approximation, but for fixed points this is problematic. As we will see in Sect. 3.1, a crucial step in the proof of existence is to pass from a sequence of points that are “ε-approximately fixed,” for a sequence of ε > 0 that converges to zero, to a convergent subsequence, whose limit is necessarily fixed. This step depends on the axiom of choice, which implies important limitations on what we can hope to compute.

Most of the chapter is concerned with algorithms that, for a given function and ε > 0, yield a point that is ε-approximately fixed. García and Zangwill (1981) is a general reference for the literature on this subject, circa 1980. The best known of these algorithms is an elaboration of Sperner’s lemma, which is the traditional method for proving Brouwer’s fixed point theorem. It is generally known as the Scarf algorithm, although what Scarf came up with first is actually the primitive set method, which is studied in Sect. 3.4. Sections 3.5 and 3.6 develop the Lemke–Howson algorithm for finding a Nash equilibrium of a two person game. Although this problem can be regarded as the simplest nontrivial fixed point problem, Sect. 3.7 explains the McLennan-Tourky algorithm, which turns it into the underlying engine of a general fixed point solver.

Section 3.8 explains the homotopy method, in which a function whose unique fixed point is known is gradually deformed into the function of interest, and the computational procedure follows the path of fixed points for the intermediate functions. Properly speaking, the homotopy method is not an algorithm, because there is no absolute guarantee of convergence, but it is practical for many problems, and it is very widely applied.

In various ways all of these algorithms follow paths that can be understood (perhaps vaguely in some instances) as the set of solutions of a system of equations with one fewer equation than the number of variables. In the final section we will see recent work in computer science that uses the ability to find a solution using path following as the defining feature of a class of computational problems. The main results in this line of research show that even seemingly quite simple problems in this class are (in a precise computational sense) already as hard as any problem in this class. No efficient method for solving problems in this class that does not use path following has ever been found, and if such a method were discovered, there would be many ways to speed things up by combining it with path following, so it seems quite likely that path following is the only efficient method for solving problems in this class. This seems to set some lower bound on the complexity of proofs of the Brouwer fixed point theorem. We will see proofs that are simple once some advanced mathematical result is known, but in such cases the complexity is buried in the result’s proof.

3.1 The Axiom of Choice, Subsequences, and Computation

Probably the greatest watershed in the history of mathematics was the introduction of set theory by Cantor, and in particular the construction of the real numbers from the rationals by Dedekind. Leibniz had dreamed of reducing all of mathematics (as well as science and metaphysics!) to a universal language in which all issues of logic could be reduced to mechanical computation. Set theory held out the promise of a language within which all mathematical concepts could be made precise, and of course this promise has largely been fulfilled, with fantastic consequences for the subsequent development of the mathematical sciences. Cantor’s work also showed that set theory itself was far from trivial, and there arose the issue of developing a system of axioms characterizing the theory of sets. Of the various attempts, the most influential was due to Zermelo and Fraenkel.

As a practical matter, there is no need to remember the Zermelo–Fraenkel axioms, and like almost all mathematicians, we will take the “naive” approach to set theory, accepting set theoretic constructions that obviously make sense. However, there is one axiom that stands out from the rest. The axiom of choice asserts that if X and Y are sets, then any correspondence F : X → Y has a selection, i.e., there is a function f : X → Y with f(x) ∈ F(x) for all x. Initially it seems strange to imagine that this might not be true, but it does not follow from the other Zermelo–Fraenkel axioms, roughly because no finite chain of reasoning can do more than prove that the restriction of F to some finite subset of X has a selection.

While we are on the topic we develop two alternative formulations of the axiom of choice that we will apply later, and which come up frequently in proofs. A partial order on a set $Z$ is a binary relation $\succeq$ that is reflexive ($z \succeq z$), transitive ($z \succeq z'$ and $z' \succeq z''$ imply $z \succeq z''$), and antisymmetric ($z \succeq z'$ and $z' \succeq z$ imply $z = z'$). We write $z \succ z'$ to indicate that $z \succeq z'$ and not $z' \succeq z$, and $z \preceq z'$ and $z \prec z'$ are often written in place of $z' \succeq z$ and $z' \succ z$ respectively. Partial orders arise naturally in almost all branches of mathematics, and there is a wealth of obvious examples such as the coordinatewise partial ordering of $\mathbb{R}^n$ ($z \succeq z'$ if and only if $z_1 \ge z'_1, \ldots, z_n \ge z'_n$) and the ordering of the subsets of any set by containment. A complete ordering is a partial ordering $\succeq$ such that for all $z, z' \in Z$, either $z \succeq z'$ or $z' \succeq z$. A chain in a partially ordered set is a subset that is completely ordered.

Theorem 3.1 (Zorn’s Lemma) If Z is a nonempty partially ordered set and every chain in Z has an upper bound, then Z has a maximal element.

Proof Let $\mathcal{Z}$ be the set of chains in $Z$. This set of sets is partially ordered by containment. It is obvious, but crucial, that for any chain $\mathcal{C} \subset \mathcal{Z}$, $\bigcup_{C \in \mathcal{C}} C \in \mathcal{Z}$. If $C$ is a maximal element of $\mathcal{Z}$ and $b$ is an upper bound of $C$, then $b$ must be maximal, because otherwise we could create a larger chain. Therefore it suffices to show that $\mathcal{Z}$ has a maximal element.

Aiming at a contradiction, assume that this is not the case. Applying the axiom of choice, let $f : \mathcal{Z} \to Z$ be a function such that for each $C \in \mathcal{Z}$, $f(C) \notin C$ and $g(C) := C \cup \{f(C)\}$ is a chain.

A set $\mathcal{T} \subset \mathcal{Z}$ is a tower if:

(a) $\emptyset \in \mathcal{T}$;

(b) for all $C \in \mathcal{T}$, $g(C) \in \mathcal{T}$;

(c) if $\mathcal{C} \subset \mathcal{T}$ is a chain in $\mathcal{T}$, then $\bigcup_{C \in \mathcal{C}} C \in \mathcal{T}$.

Note that $\mathcal{Z}$ is itself a tower, so the set of towers is nonempty. Let $\mathcal{T}_0$ be the intersection of all towers. Evidently $\mathcal{T}_0$ satisfies (a)–(c), so it is itself a tower.

Say that $C \in \mathcal{T}_0$ is comparable if every other element of $\mathcal{T}_0$ is either a subset of $C$ or a superset of $C$. Fixing a comparable $C$, let $\mathcal{U}$ be the set of $C' \in \mathcal{T}_0$ such that $C' \subset C$ or $g(C) \subset C'$. Consider $C' \in \mathcal{U}$. Obviously $g(C') \in \mathcal{U}$ when $C' = C$ or $g(C) \subset C'$. If $C'$ is a proper subset of $C$ then $g(C')$ cannot be a proper superset of $C$ because it is obtained from $C'$ by adding a single element, so (because $C$ is comparable and $g(C') \in \mathcal{T}_0$) $g(C')$ must be a subset of $C$. Thus $g(C') \in \mathcal{U}$.

Evidently $\emptyset \in \mathcal{U}$. We have just shown that $\mathcal{U}$ satisfies (b). The union of the elements of any chain in $\mathcal{U}$ is a chain that is either contained in $C$ or contains $g(C)$, according to whether some element of the chain contains $g(C)$. Thus $\mathcal{U}$ is a tower, but $\mathcal{T}_0$ is the smallest tower, so $\mathcal{U} = \mathcal{T}_0$, and in particular $g(C)$ is comparable.

Let $\mathcal{V}$ be the set of comparable elements of $\mathcal{T}_0$. Evidently $\emptyset \in \mathcal{V}$. We have just shown that $\mathcal{V}$ satisfies (b). The union of the elements of any chain in $\mathcal{V}$ is an element of $\mathcal{T}_0$, and any $C \in \mathcal{T}_0$ is either contained in this union or contains it, because each element of the chain is comparable, so the union is an element of $\mathcal{V}$. Therefore $\mathcal{V}$ satisfies (c), so it is a tower, and minimality gives $\mathcal{V} = \mathcal{T}_0$, so every element of $\mathcal{T}_0$ is comparable.

Therefore $\mathcal{T}_0$ is both a tower and a chain. If $C_0 := \bigcup_{C \in \mathcal{T}_0} C$, then $C_0 \in \mathcal{T}_0$ by (c), and (b) gives $g(C_0) \in \mathcal{T}_0$, so $g(C_0) \subset C_0$, but $g(C_0)$ is a proper superset of $C_0$, so this is impossible. This contradiction completes the proof. □


A well ordering of a set $Z$ is a complete ordering $\preceq$ such that every nonempty subset of $Z$ has a minimal element. Note that if this is the case, then any subset of $Z$ is well ordered by $\preceq$.

Theorem 3.2 (Well Ordering Theorem) Every set Z has a well ordering.

Proof Let $\mathcal{W}$ be the collection of pairs $(W, \preceq)$ such that $W \subset Z$ and $\preceq$ is a well ordering of $W$. We say that $V \subset W$ is an initial segment of $(W, \preceq)$ if $z \in V$ whenever $z, z' \in W$, $z \preceq z'$ and $z' \in V$. For $(W, \preceq), (W', \preceq') \in \mathcal{W}$, we specify that

$$(W, \preceq) \le (W', \preceq')$$

if $W$ is an initial segment of $(W', \preceq')$ and $\preceq$ is the restriction of $\preceq'$ to $W$. This relation is evidently transitive, hence a partial ordering of $\mathcal{W}$.

Let $\{(W_\alpha, \preceq_\alpha)\}_{\alpha \in A}$ be a chain in $\mathcal{W}$. We define $(W, \preceq)$ by setting $W := \bigcup_\alpha W_\alpha$ and specifying that $z \preceq z'$ if there is some $\alpha$ such that $z, z' \in W_\alpha$ and $z \preceq_\alpha z'$. It is easy to see that $\preceq$ is a complete ordering of $W$, and that each $\preceq_\alpha$ is the restriction of $\preceq$ to $W_\alpha$.

If $\emptyset \ne S \subset W$, then there is some $\alpha$ such that $S \cap W_\alpha$ is nonempty, and this set has an element $z$ that is minimal for $\preceq_\alpha$. Aiming at a contradiction, suppose that there is some $z' \in S$ such that $z' \prec z$. Then $z'$ is an element of $W_{\alpha'} \setminus W_\alpha$ for some $\alpha'$. Since $W_{\alpha'}$ is not an initial segment of $W_\alpha$, $W_\alpha$ must be an initial segment of $(W_{\alpha'}, \preceq_{\alpha'})$, in which case $z \prec z'$ because $z' \in W_{\alpha'} \setminus W_\alpha$. Thus $z$ is a minimal element of $S$ for $\preceq$, so $\preceq$ is a well ordering of $W$.

Suppose that $z, z' \in W$, $z \preceq z'$, $z \in W_\alpha$, and $z' \in W_{\alpha'}$. If $W_\alpha$ is an initial segment of $W_{\alpha'}$, then $z \in W_{\alpha'}$, and if $W_{\alpha'}$ is an initial segment of $W_\alpha$, then again $z \in W_{\alpha'}$, as desired. Therefore each $W_\alpha$ is an initial segment of $(W, \preceq)$, and $(W, \preceq)$ is an upper bound of the chain.

We have shown that an arbitrary chain has an upper bound, so we can apply Zorn’s lemma to conclude that $\mathcal{W}$ has a maximal element $(W^*, \preceq^*)$. If $W^*$ did not contain some $z \in Z$ we could construct a larger element of $\mathcal{W}$ by making $z$ either larger than or smaller than every element of $W^*$. This would contradict the maximality of $(W^*, \preceq^*)$, so we conclude that $W^* = Z$. The proof is complete. □

It is easy to see that the well ordering theorem implies the axiom of choice: if F : X → Y is a correspondence and $\preceq$ is a well ordering of Y, we can define a selection f : X → Y from F by letting f(x) be the minimal element of F(x). Thus the axiom of choice, Zorn’s lemma, and the well ordering theorem are equivalent.
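The passage from an ordering of Y to a selection can be illustrated on finite sets, where of course no appeal to the axiom of choice is needed. The following Python sketch is only an illustration: the function name `selection_from_ordering` and the use of a `key` function to encode the ordering are assumptions made for this example.

```python
def selection_from_ordering(F, key=lambda y: y):
    """Return a selection f from the correspondence F.

    F is a dict mapping each x to a nonempty finite set F[x]. The function
    `key` encodes the ordering of Y: a smaller key means earlier in the order.
    """
    return {x: min(values, key=key) for x, values in F.items()}


# A small correspondence from X = {"a", "b", "c"} to Y = {1, ..., 5}.
F = {"a": {3, 1, 2}, "b": {5}, "c": {4, 2}}
f = selection_from_ordering(F)                               # usual order on Y
f_reversed = selection_from_ordering(F, key=lambda y: -y)    # reversed order
```

Changing the ordering changes the selection, but each choice of ordering determines the selection completely, which is the point of the argument.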

We now consider a seemingly quite different topic. Most of the topological spaces that appear in this book are compact metric spaces. For us a topological space is compact if every open cover has a finite subcover, but for metric spaces there is an alternative characterization that is sometimes useful, and which may be more familiar to some readers. Let (X, d) be a metric space. We say that X is sequentially compact if every sequence in X has a convergent subsequence.

Proposition 3.1 X is compact if and only if it is sequentially compact.


Proof Suppose $X$ is compact. If a sequence $\{x_n\}$ had no convergent subsequence, then each $x \in X$ would have a neighborhood that contained $x_n$ for only finitely many $n$, and finitely many of these neighborhoods would cover $X$, which is impossible.

Now suppose that $X$ is sequentially compact. We say that $X$ is totally bounded if, for each $\varepsilon > 0$, $X$ has a finite cover by open balls of radius $\varepsilon$. If this were not the case, then for some $\varepsilon$ we could choose a sequence $\{x_n\}$ with $d(x_m, x_n) \ge \varepsilon$ for all distinct $m$ and $n$. Such a sequence cannot have a convergent subsequence, so $X$ must be totally bounded. For each $k = 1, 2, \ldots$ let $\mathcal{C}_k$ be a finite set of balls of radius $1/k$ that cover $X$.

Aiming at a contradiction, suppose that $\mathcal{U}$ is an open cover of $X$ without a finite subcover. For each $k$ there are finitely many intersections $B_{1k} \cap \cdots \cap B_{kk}$ such that $B_{1k} \in \mathcal{C}_1, \ldots, B_{kk} \in \mathcal{C}_k$, and the union of all such intersections is $X$, so one of them is not covered by finitely many elements of $\mathcal{U}$. Choosing such an intersection for each $k$, some $B_1 \in \mathcal{C}_1$ is $B_{1k}$ for infinitely many $k$, so for all $k = 2, 3, \ldots$ there are $B_{2k} \in \mathcal{C}_2, \ldots, B_{kk} \in \mathcal{C}_k$ such that $B_1 \cap B_{2k} \cap \cdots \cap B_{kk}$ is not covered by finitely many elements of $\mathcal{U}$. Again, some $B_2 \in \mathcal{C}_2$ is $B_{2k}$ for infinitely many $k$, and continuing in this manner leads to the conclusion that we can choose $B_1 \in \mathcal{C}_1, B_2 \in \mathcal{C}_2, \ldots$ such that for each $k$, $B_1 \cap \cdots \cap B_k$ is not covered by finitely many elements of $\mathcal{U}$. For each $k$ choose $x_k \in B_1 \cap \cdots \cap B_k$. If $x$ is a limit of a convergent subsequence of $\{x_k\}$ then it is the limit of this sequence (which is Cauchy). Of course it is an element of some $U \in \mathcal{U}$, and $U_\varepsilon(x) \subset U$ for some $\varepsilon > 0$. For large $k$ we have $2/k < \varepsilon/2$ and $d(x_k, x) < \varepsilon/2$, so $B_1 \cap \cdots \cap B_k \subset B_k \subset U_{2/k}(x_k) \subset U_\varepsilon(x) \subset U$. This contradiction completes the proof. □

Did you notice the axiom of choice being applied? Possibly there are one or more applications in the second part of the proof, but for us a critical application occurs in the proof that compactness implies sequential compactness. What we did there is pass from the assumption that each point has a neighborhood with only finitely many terms in the sequence to an open cover consisting of a choice of such a neighborhood for each point.

At the time Zermelo advanced the axiom of choice early in the 20th century it was rather controversial. Some critics were embarrassed when it was found that they had already applied it in their own proofs, but on the other hand it does have some highly counterintuitive consequences such as the Banach–Tarski paradox.

Brouwer himself founded the school of mathematical philosophy known as intuitionism which, among other things, does not accept the law of the excluded middle (for any proposition P, either P is true or ¬P is true) as an axiom. Nonacceptance of the law of the excluded middle is the defining feature of the school of mathematical philosophy known as constructivism. A simple argument shows that constructivism must also reject the axiom of choice because it implies the law of the excluded middle: if

$$A := \{\, x \in \{0, 1\} : P \lor (x = 0) \,\} \quad \text{and} \quad B := \{\, y \in \{0, 1\} : P \lor (y = 1) \,\},$$

then A and B are nonempty because they contain 0 and 1 respectively, and the axiom of choice gives a function f : {A, B} → {0, 1} with f(A) ∈ A and f(B) ∈ B, but f(A) ≠ f(B) implies A ≠ B, so that P is false, and if f(A) = f(B), then either 0 = f(B) ∈ B = {0, 1} or 1 = f(A) ∈ A = {0, 1}, and thus P is true. (It is admittedly impossible to fully understand this argument without explicitly enumerating the principles of mathematical inference that constructivism accepts.)

Let $f : \Delta \to \Delta$ be a continuous function, where

$$\Delta = \{\, x \in \mathbb{R}^{d+1}_+ : x_0 + \cdots + x_d = 1 \,\}$$

is the standard $d$-dimensional unit simplex. For $\varepsilon > 0$ a finite set $S \subset \Delta$ is an $\varepsilon$-approximate fixed point for $f$ if its diameter $\max_{x, y \in S} \|x - y\|$ is less than $\varepsilon$ and either:

(a) for each $i = 0, \ldots, d$ there is some $x \in S$ such that $f_i(x) < x_i + \varepsilon$, or

(b) for each $i = 0, \ldots, d$ there is some $x \in S$ such that $f_i(x) > x_i - \varepsilon$.
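The definition can be checked mechanically for a given finite set. The following Python sketch does so under stated assumptions: the norm is taken to be Euclidean (the text does not specify one here), and the names `is_approx_fixed` and `shift` are hypothetical, introduced only for this illustration.

```python
from itertools import combinations
from math import dist


def is_approx_fixed(S, f, eps):
    """Check whether the finite set S is an eps-approximate fixed point of f.

    S is a list of points of the simplex (tuples of floats) and f maps such a
    tuple to another one. The Euclidean norm is an assumption of this sketch.
    """
    diameter = max((dist(x, y) for x, y in combinations(S, 2)), default=0.0)
    if diameter >= eps:
        return False
    pairs = [(x, f(x)) for x in S]
    coords = range(len(S[0]))
    # Condition (a): every coordinate has a witness x with f_i(x) < x_i + eps.
    cond_a = all(any(fx[i] < x[i] + eps for x, fx in pairs) for i in coords)
    # Condition (b): every coordinate has a witness x with f_i(x) > x_i - eps.
    cond_b = all(any(fx[i] > x[i] - eps for x, fx in pairs) for i in coords)
    return cond_a or cond_b


def shift(x):
    """Hypothetical test map: cyclic shift of coordinates, fixed at the barycenter."""
    return x[1:] + x[:1]


near_center = [(0.34, 0.33, 0.33), (0.33, 0.34, 0.33), (0.33, 0.33, 0.34)]
```

With these definitions `is_approx_fixed(near_center, shift, 0.05)` holds, while the singleton consisting of a vertex of the simplex fails both (a) and (b) for small ε.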

Proposition 3.2 If, for each ε > 0, there is an ε-approximate fixed point for f, then f has a fixed point.

Proof For each $k = 1, 2, \ldots$ let $S_k$ be an $\varepsilon_k$-approximate fixed point, where $\{\varepsilon_k\}$ is a sequence of positive numbers that converges to zero. After passing to a subsequence we may assume that $\{S_k\}$ converges (in the obvious sense) to some $x^*$, and that either (a) or (b) above holds for all $k$. If (a) holds (the case of (b) is similar) then for each $i$ and $k$ there is $x \in S_k$ such that $f_i(x) < x_i + \varepsilon_k$, and continuity implies that $f_i(x^*) \le x^*_i$. Since $\sum_i x^*_i = 1 = \sum_i f_i(x^*)$, it follows that $f(x^*) = x^*$. □

Without the existence of a convergent subsequence, this argument fails. In fact Brouwer disavowed his own fixed point theorem, and toward the end of his life gave lectures titled “Why the Brouwer Fixed Point Theorem is False.”

As a moral precept concerning how mathematicians should spend their time, intuitionism and constructivism are far too severe, excluding a great deal of mathematics of undoubted scientific interest. In this sense contemporary mathematicians universally accept the axiom of choice. Nevertheless, after an initial rather unproductive period, constructivist mathematics has developed extensively, and continues to attract interest; Bauer (2017) provides a gentle introduction and contemporary perspective. Furthermore, the constructivist attitude does live on in our understanding of the limits of computation. In particular, although we will see algorithms that compute ε-approximate fixed points for any ε, there is an important sense in which we cannot hope for an algorithm that is guaranteed to give us a point that is within ε of an actual fixed point.

To say what we mean by this we need to be a bit more precise. Recall that, by definition, an algorithm is a computational procedure that is guaranteed to halt eventually. Suppose that our algorithm gets all its information about f from an “oracle” that evaluates f at any point that is given to it as an input. Suppose our algorithm halts after sampling the oracle finitely many times, say at $x_1, \ldots, x_n$, with some declaration that such-and-such is some sort of approximation of an actual fixed point. Provided that d > 1, the Devil could now change the function to one that agrees with the original function at every point that was sampled, is continuous, and has no fixed points anywhere near the point designated by the algorithm. (One way to do this is to replace f with $h^{-1} \circ f \circ h$ where h : X → X is a suitable homeomorphism satisfying $h(x_i) = x_i$ and $h(f(x_i)) = f(x_i)$ for all $i = 1, \ldots, n$.) The algorithm would necessarily process the new function in the same way, arriving at the same conclusion, but for the new function that conclusion is wrong!

Sometimes one deals with functions with additional properties beyond continuity, such as computable bounds on second derivatives, that allow for algorithmic proofs that an approximate fixed point is close to an actual fixed point. In practical experience functions with approximate fixed points that are far from any actual fixed point are quite uncommon, and not really worth worrying about. An even more permissive attitude pertains to the homotopy methods we will see toward the end of the chapter, which (as they are usually implemented) are not actually algorithms for computing approximate fixed points because there is no absolute guarantee that they will halt with an acceptable output.

3.2 Sperner’s Lemma

Brouwer’s original proof of his fixed point theorem seems to have been almost forgotten, except perhaps by historians of mathematics. It was an important step in a revolution in topology that led eventually to what is known as algebraic topology, which is now a fairly stable part of mathematics, though still an area of active research which continues to be a source of important methods for other active fields. No doubt the terminology that Brouwer used is obsolete, and his argument would probably seem quite roundabout.

Algebraic topology is a big machine with several sophisticated and powerful general principles. If one learns it systematically, eventually Brouwer’s fixed point theorem falls out “for free,” but that is a lot of work. In order to avoid this chore, but also to get a better view of the theorem’s essence, it would be nice to have a simple result that goes to the heart of the matter. After it emerged in the late 1920s, Sperner’s lemma became the standard device for proving Brouwer’s fixed point theorem without developing the machinery of algebraic topology.

This section gives a geometric proof of Sperner’s lemma based on volume, from McLennan and Tourky (2008). Although it is completely convincing, it will be technically incomplete in two senses. Since the next section will give a second proof, this isn’t a big problem for us.

For $w_0, \ldots, w_d \in \mathbb{R}^d$, let

$$V_d(w_0, \ldots, w_d) := \frac{1}{d!} \det(w_1 - w_0, \ldots, w_d - w_0).$$

If $\sigma$ is the convex hull of $w_0, \ldots, w_d$, let $V_\sigma := |V_d(w_0, \ldots, w_d)|$. Let $P \subset \mathbb{R}^d$ be a nonempty $d$-dimensional polytope.
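The signed volume $V_d$ is easy to evaluate exactly. The following Python sketch uses rational arithmetic and a cofactor expansion, which is adequate only for small d; the names `det` and `signed_volume` are assumptions of this illustration, not notation from the text.

```python
from fractions import Fraction
from math import factorial


def det(rows):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if not rows:
        return Fraction(1)  # convention for the empty (0 x 0) matrix
    total = Fraction(0)
    for j, entry in enumerate(rows[0]):
        minor = [r[:j] + r[j + 1:] for r in rows[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def signed_volume(points):
    """V_d(w_0, ..., w_d) for d + 1 points of R^d, given as tuples of numbers."""
    w0, rest = points[0], points[1:]
    d = len(w0)
    assert len(rest) == d, "need exactly d + 1 points in R^d"
    # The rows are the differences w_k - w_0; transposing does not change det.
    rows = [[Fraction(w[k]) - Fraction(w0[k]) for k in range(d)] for w in rest]
    return det(rows) / factorial(d)


unit = signed_volume([(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)])
swapped = signed_volume([(0, 0, 0), (0, 1, 0), (1, 0, 0), (0, 0, 1)])
```

Exchanging two of the points reverses the sign of the result, so $V_\sigma$ is obtained by taking the absolute value.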

Proposition 3.3 There is a constant $V_P > 0$ such that if $\mathcal{P}$ is a simplicial complex whose simplices are contained in $\mathbb{R}^d$ with $|\mathcal{P}| = P$, then

$$\sum_{\sigma \in \mathcal{P},\ \dim(\sigma) = d} V_\sigma = V_P.$$

Of course this is not something anyone would doubt, and within any theory of volume it should be easy enough to prove, but we don’t want to be bothered with developing such a theory, at least right now. (Section 17.4 develops the rudiments of measure theory.) This is a point of some historical interest. Gauss regarded measure theory (or what passed for it in his time) as “too heavy” a tool, expressing a wish for a more elementary theory of the volume of polytopes. The third of Hilbert’s famous problems asks whether it is possible, for any two polytopes of equal volume, to triangulate the first in such a way that the pieces can be reassembled to give the second. This was resolved negatively by Hilbert’s student Max Dehn within a year of Hilbert’s lecture laying out the problems, and it remains the case today that there is no truly elementary theory of the volumes of polytopes.

Let $V$ be the set of vertices of $P$. Fix a polyhedral subdivision $\mathcal{P}$ of $P$, and let $W$ be the set of vertices of $\mathcal{P}$. Fix a function $\lambda : W \to V$. If $\sigma \in \mathcal{P}$ is $d$-dimensional, and the vertices of $\sigma$ are $w_0, \ldots, w_d$, indexed so that $V_d(w_0, \ldots, w_d) > 0$, let $p_\sigma : \mathbb{R} \to \mathbb{R}$ be the polynomial

$$p_\sigma(t) := V_d\bigl((1 - t)w_0 + t\lambda(w_0), \ldots, (1 - t)w_d + t\lambda(w_d)\bigr).$$

We say that $\lambda$ is a Sperner labelling if, for all $w \in W$, $\lambda(w)$ is contained in the smallest face of $P$ that contains $w$.

Proposition 3.4 If $\lambda$ is a Sperner labelling, then for all $t \in \mathbb{R}$,

$$\sum_{\sigma \in \mathcal{P},\ \dim(\sigma) = d} p_\sigma(t) = V_P.$$

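Proof If $\sigma \in \mathcal{P}$ is the convex hull of $w_0, \ldots, w_k$, for $t \in \mathbb{R}$ let $\sigma(t)$ be the convex hull of $(1 - t)w_0 + t\lambda(w_0), \ldots, (1 - t)w_k + t\lambda(w_k)$. We claim that if $|t|$ is sufficiently small, then $\mathcal{P}(t) := \{\, \sigma(t) : \sigma \in \mathcal{P} \,\}$ is a triangulation of $P$. This is visually obvious (see Fig. 3.1) and Exercise 2.7 outlines a formal proof. For such $t$ each $p_\sigma(t)$ is positive by continuity, hence equal to $V_{\sigma(t)}$, so Proposition 3.3 implies that $\sum_\sigma p_\sigma(t) = V_P$. Since $\sum_\sigma p_\sigma$ is a polynomial that is constant on an open set, it is a constant function. □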
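Proposition 3.4 can be checked by hand or by machine on a small example. The following Python sketch takes P to be the triangle with vertices (0,0), (2,0), (0,2), subdivided by adding the midpoints of its edges, with each midpoint labelled by an endpoint of its edge. The data and the names `area`, `p_sigma`, `triangles`, and `lam` are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction


def area(a, b, c):
    """Signed area V_2(a, b, c) = det(b - a, c - a) / 2."""
    return ((b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])) / Fraction(2)


def p_sigma(tri, lam, t):
    """Evaluate p_sigma(t) for a triangle given as a tuple of three vertices."""
    w = list(tri)
    if area(*w) < 0:          # index the vertices so that V_2(w0, w1, w2) > 0
        w[1], w[2] = w[2], w[1]
    moved = [tuple((1 - t) * wi + t * li for wi, li in zip(v, lam[v])) for v in w]
    return area(*moved)


triangles = [
    ((0, 0), (1, 0), (0, 1)),
    ((1, 0), (2, 0), (1, 1)),
    ((0, 1), (1, 1), (0, 2)),
    ((1, 0), (1, 1), (0, 1)),
]
# A Sperner labelling: vertices of P are fixed, midpoints go to an endpoint.
lam = {(0, 0): (0, 0), (2, 0): (2, 0), (0, 2): (0, 2),
       (1, 0): (2, 0), (1, 1): (0, 2), (0, 1): (0, 0)}

V_P = area((0, 0), (2, 0), (0, 2))
totals = [sum(p_sigma(s, lam, Fraction(t)) for s in triangles)
          for t in (-2, 0, Fraction(1, 3), 1, 5)]
```

In this example the three corner triangles each contribute $(1 - t^2)/2$ and the central triangle contributes $(1 + 3t^2)/2$, so the individual terms vary with t while the sum stays equal to $V_P = 2$.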

Fig. 3.1 Deformation of a triangulation

Now suppose that $P$ is a simplex, so it is the convex hull of the vertices $v_0, \ldots, v_d$, indexed so that $V_d(v_0, \ldots, v_d) = V_P$. As above, suppose that $\sigma \in \mathcal{P}$ is $d$-dimensional, and the vertices of $\sigma$ are $w_0, \ldots, w_d$, indexed so that $V_d(w_0, \ldots, w_d) > 0$. Then

$$p_\sigma(1) = V_d(\lambda(w_0), \ldots, \lambda(w_d)) = \frac{1}{d!} \det(\lambda(w_1) - \lambda(w_0), \ldots, \lambda(w_d) - \lambda(w_0)).$$

The right hand side is zero if $\lambda(w_i) = \lambda(w_j)$ for some distinct $i$ and $j$. We say that $\sigma$ is completely labelled if $\lambda(w_0), \ldots, \lambda(w_d)$ are distinct, in which case the right hand side is $\pm V_P$. We say that $\lambda$ is orientation preserving on $\sigma$ if $p_\sigma(1) = V_P$, and we say that $\lambda$ is orientation reversing on $\sigma$ if $p_\sigma(1) = -V_P$. The last result implies that:

Theorem 3.3 (Sperner’s Lemma) If $P$ is a simplex with vertex set $V$, $\mathcal{P}$ is a triangulation of $P$ with vertex set $W$, and $\lambda : W \to V$ is a Sperner labelling, then the number of completely labelled $\sigma \in \mathcal{P}$ on which $\lambda$ is orientation preserving is one greater than the number of completely labelled $\sigma \in \mathcal{P}$ on which $\lambda$ is orientation reversing. In particular, the number of completely labelled $\sigma \in \mathcal{P}$ is odd, hence nonzero.

Figure 3.2 illustrates this result. In this figure the labels are the indices of the vertices, rather than the vertices themselves, as is more customary because that system of notation is less bulky.
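The signed count in Sperner's lemma can be verified numerically for d = 2. The following Python sketch is an illustration under stated assumptions: it uses a standard subdivision of the triangle into $n^2$ congruent triangles, and a labelling derived from the hypothetical map $f(x) = (x_1, x_2, x_0)$, which is well defined at every grid vertex when n is not divisible by 3. All function names are introduced only for this example.

```python
def grid_triangles(n):
    """Triangles of a standard subdivision of the 2-simplex, with vertices
    given by integer barycentric coordinates (a, b, c), a + b + c = n."""
    tris = []
    for a in range(n):
        for b in range(n - a):
            c = n - 1 - a - b
            tris.append(((a + 1, b, c), (a, b + 1, c), (a, b, c + 1)))
            if c >= 1:
                tris.append(((a, b + 1, c), (a + 1, b, c), (a + 1, b + 1, c - 1)))
    return tris


def label(w):
    """Least i with w[(i + 1) % 3] < w[i]; then w[i] > 0, so this is Sperner."""
    return next(i for i in range(3) if w[(i + 1) % 3] < w[i])


def orientation(p, q, r):
    """Sign of det(q - p, r - p) for three points of the plane."""
    d = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (d > 0) - (d < 0)


def signed_count(n):
    """Number of orientation preserving minus orientation reversing triangles."""
    corners = [(0, 0), (n, 0), (0, n)]     # v0, v1, v2, positively oriented
    total = 0
    for tri in grid_triangles(n):
        points = [(w[1], w[2]) for w in tri]        # barycentric -> planar
        images = [corners[label(w)] for w in tri]
        # The product is 0 unless tri is completely labelled, and otherwise
        # is the sign of p_sigma(1); it does not depend on the vertex order.
        total += orientation(*points) * orientation(*images)
    return total
```

For every admissible n the function `signed_count` returns 1, as the theorem predicts.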

Finally we explain how Sperner’s lemma implies the BFPT. As before let $\Delta$ be the standard $d$-dimensional simplex, and let $e_0 = (1, 0, \ldots, 0), \ldots, e_d = (0, \ldots, 0, 1)$ be the vertices of $\Delta$. Let $f : \Delta \to \Delta$ be a continuous function, and let $\mathcal{P}$ be a triangulation of $\Delta$ with vertex set $W$. For $w \in W$ let $\lambda(w) := e_i$ where $i$ is the least index such that $f_i(w) < w_i$. (If there is no such $i$, then $w$ is a fixed point.) The vertices of the smallest face of $\Delta$ containing $w$ are those $e_i$ such that $w_i > 0$, so this is a Sperner labelling. If $\sigma \in \mathcal{P}$ is completely labelled, then for each $i$ there is some vertex $w$ of $\sigma$ such that $f_i(w) < w_i$, so the set of vertices of $\sigma$ is an $\varepsilon$-approximate fixed point of $f$ for every $\varepsilon > \mathrm{diam}(\sigma)$. Since $\Delta$ has triangulations of arbitrarily small mesh (Proposition 2.10) Sperner’s lemma implies that for each $\varepsilon > 0$, there is an $\varepsilon$-approximate fixed point for $f$, so (Proposition 3.2) $f$ has a fixed point.
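The labelling rule just described is straightforward to implement. In the following Python sketch the map `f` (the average of a point and its cyclic shift) is a hypothetical example, and the label is returned as an index i rather than as the vertex $e_i$; exact rational coordinates avoid rounding issues in the comparison $f_i(w) < w_i$.

```python
from fractions import Fraction


def sperner_label(w, f):
    """Least index i with f_i(w) < w_i, or None if there is none (w is fixed)."""
    for i, (fi, wi) in enumerate(zip(f(w), w)):
        if fi < wi:
            return i
    return None


def f(x):
    """Hypothetical continuous map of the simplex: average of x and its cyclic shift."""
    shifted = x[1:] + x[:1]
    return tuple((a + b) / 2 for a, b in zip(x, shifted))


n = 5
grid = [(Fraction(a, n), Fraction(b, n), Fraction(n - a - b, n))
        for a in range(n + 1) for b in range(n + 1 - a)]
labels = {w: sperner_label(w, f) for w in grid}
```

Since $f_i(w) \ge 0$, a label i can only be assigned when $w_i > 0$, which is the Sperner condition; the only point at which no label exists for this particular map is the barycenter, which is its fixed point.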


Fig. 3.2 A Sperner labelling of a triangulated simplex

3.3 The Scarf Algorithm

One proof of Sperner’s lemma, due to Cohen (1967), is an induction on dimension, using path following in a graph with maximal degree two to show that if the result is true in dimension d − 1, then it is also true in dimension d. The Scarf algorithm (or at least what has come to be known as such) combines the paths in the various dimensions into a single path going from a known starting point to the desired configuration.

As above let $e_0 = (1, 0, \ldots, 0), \ldots, e_d = (0, \ldots, 0, 1)$ be the vertices of the standard $d$-dimensional simplex $\Delta$ in $\mathbb{R}^{d+1}$, which is their convex hull. Let $\mathcal{P}$ be a triangulation of $\Delta$ with vertex set $W$. In this section a labelling is a function $\lambda : W \to \{0, \ldots, d\}$. (As before, the idea is to associate $e_{\lambda(w)}$ with $w$.) Such a function is a Sperner labelling if, for all $w$, the $\lambda(w)$-coordinate of $w$ is positive, and we assume that this is the case. For $\sigma \in \mathcal{P}$ let $\lambda(\sigma) := \{\, \lambda(w) : w \in \sigma \cap W \,\}$. We say that $\sigma$ is a completely labelled simplex if $\lambda(\sigma) = \{0, \ldots, d\}$. Finding such a simplex is the goal of the Scarf algorithm.

All the algorithms we will see in this chapter follow a path in a graph of “almost” satisfactory configurations. For $i = 0, \ldots, d$ let $\Delta_i$ be the convex hull of $e_0, \ldots, e_i$. We say that $\sigma \in \mathcal{P}$ is almost completely labelled if, for some $i$, $\sigma \subset \Delta_i$ and $\{0, \ldots, i - 1\} \subset \lambda(\sigma)$. In this case $\sigma$ is either $(i - 1)$-dimensional or $i$-dimensional. Also, note that a completely labelled simplex is almost completely labelled.

The vertices of the graph are the almost completely labelled simplices. We specify the edges by describing the neighbors of an almost completely labelled simplex $\sigma$, which we assume is contained in $\Delta_i$ but not in $\Delta_{i-1}$. There are three cases:

(a) If $\lambda(\sigma) = \{0, \ldots, i\}$, then the neighbors of $\sigma$ are:

(i) the facet of $\sigma$ whose vertices have the labels $0, \ldots, i - 1$ and

(ii) the unique element of $\mathcal{P}$ that is contained in $\Delta_{i+1}$ and has $\sigma$ as a facet.

The first of these does not exist if $i = 0$, and the second does not exist if $i = d$.

(b) If $\sigma$ is $i$-dimensional and $\lambda(\sigma) = \{0, \ldots, i - 1\}$, then two of its vertices have the same label, and its two neighbors are the two facets that do not contain one of these vertices. Put another way, its neighbors are its two facets whose vertices have all the labels $0, \ldots, i - 1$.

(c) If $\sigma$ is $(i - 1)$-dimensional, then it is not contained in $\Delta_{i-1}$ by assumption, and it is not contained in any other facet of $\Delta_i$ because $\lambda(\sigma) = \{0, \ldots, i - 1\}$, so there are two elements of $\mathcal{P}$ that are contained in $\Delta_i$ that have $\sigma$ as a facet, which are its neighbors.

In order for this definition of the neighbor relation to be coherent, each neighbor must be almost completely labelled. In all cases this is obvious. In addition, if τ is a neighbor of σ, then σ must be one of the neighbors of τ. If τ is given by (i) or (b), then (depending on whether $\tau \subset \Delta_{i-1}$) either (ii) or (c) states that σ is a neighbor of τ. If τ is given by (ii) or (c), then (depending on the label of the vertex of τ that is not in σ) either (i) or (b) states that σ is a neighbor of τ.

Assuming that d > 0, the almost completely labelled simplices that have one neighbor are $\{e_0\}$ and the completely labelled simplices. All other almost completely labelled simplices have two neighbors. A finite graph, each of whose vertices has degree one or two, is, topologically speaking, a disjoint union of loops and paths with two endpoints. Altogether, the paths have an even number of endpoints, so there is an odd (hence nonzero) number of completely labelled simplices. Thus we have proved Sperner’s lemma.

The Scarf algorithm follows the path that begins at $\{e_0\}$ to its other endpoint. In concrete detail, starting at $\{e_0\}$, the algorithm iterates the following process: having reached an almost completely labelled simplex, the algorithm looks at the labels of all of its vertices, using this information to determine the two neighbors, and then goes to the neighbor that is not the one from which it came. Figure 3.3 illustrates the path of the algorithm.

Fig. 3.3 The path of the Scarf algorithm
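The path following procedure can be sketched in a few lines of Python for d = 2. This is a minimal illustration under stated assumptions, not an efficient implementation: simplices are frozensets of vertices with integer barycentric coordinates, neighbors are found by brute-force search over all faces, and the triangulation and the labelling (derived from the hypothetical map $f(x) = (x_1, x_2, x_0)$, with n not divisible by 3) are choices made for this example. The branches of `neighbors` correspond to cases (a), (b), and (c) above.

```python
from itertools import combinations


def grid_triangles(n):
    """Maximal simplices of a standard subdivision of the 2-simplex."""
    tris = []
    for a in range(n):
        for b in range(n - a):
            c = n - 1 - a - b
            tris.append(frozenset({(a + 1, b, c), (a, b + 1, c), (a, b, c + 1)}))
            if c >= 1:
                tris.append(frozenset({(a, b + 1, c), (a + 1, b, c),
                                       (a + 1, b + 1, c - 1)}))
    return tris


def carrier(sigma):
    """Least i such that sigma is contained in Delta_i."""
    return max(j for w in sigma for j, x in enumerate(w) if x > 0)


def neighbors(sigma, faces, lab, d):
    """Neighbors of an almost completely labelled simplex sigma."""
    i = carrier(sigma)
    if {lab[w] for w in sigma} == set(range(i + 1)):              # case (a)
        out = []
        if i > 0:
            out.append(frozenset(w for w in sigma if lab[w] != i))
        if i < d:
            out += [t for t in faces if len(t) == len(sigma) + 1
                    and sigma < t and carrier(t) <= i + 1]
        return out
    if len(sigma) == i + 1:                                       # case (b)
        twins = [w for w in sigma if sum(lab[u] == lab[w] for u in sigma) == 2]
        return [sigma - {w} for w in twins]
    return [t for t in faces if len(t) == len(sigma) + 1          # case (c)
            and sigma < t and carrier(t) <= i]


def scarf_path(maximal, lab, start, d):
    """Follow the path from {start} until no unvisited neighbor remains."""
    faces = {frozenset(c) for s in maximal
             for k in range(1, len(s) + 1) for c in combinations(s, k)}
    prev, cur, path = None, frozenset({start}), []
    while cur is not None:
        path.append(cur)
        onward = [t for t in neighbors(cur, faces, lab, d) if t != prev]
        prev, cur = cur, (onward[0] if onward else None)
    return path


n = 7
maximal = grid_triangles(n)
lab = {w: next(i for i in range(3) if w[(i + 1) % 3] < w[i])
       for tri in maximal for w in tri}
path = scarf_path(maximal, lab, (n, 0, 0), d=2)
```

The last element of `path` is a completely labelled triangle. A practical implementation would replace the search over `faces` with a pivoting rule specific to the triangulation, so that each step takes time independent of the size of the complex.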


Several points are worth mentioning, some of which relate to orientation. Suppose that $\sigma$ is an $i$-dimensional element of $\mathcal{P}$ that is contained in $\Delta_i$, and $\lambda(\sigma) = \{0, \ldots, i\}$. There is an affine map from the affine hull of $e_0, \ldots, e_i$ to itself that takes each $e_j$ to the vertex $w$ of $\sigma$ such that $\lambda(w) = j$. We say that $\sigma$ is positively (negatively) oriented if the linear part of this map has a positive (negative) determinant. Obviously this generalizes the notion of orientation introduced in the last section. If we agree, as a matter of convention, that the determinant of the unique linear map from $\mathbb{R}^0$ to itself is positive, then $\{e_0\}$ is positively oriented.

Although we have described the algorithm as exploiting the memory of its prior journey, this is not necessary. That is, if we start at a given almost completely labelled simplex, “local” information can be used to figure out a direction that necessarily leads to a completely labelled simplex. The way to think about this is to conceive of the algorithm as a path in the set of $\sigma \in \mathcal{P}$ that, for some $i$, are $i$-dimensional and contained in $\Delta_i$, with $\{0, \ldots, i - 1\} \subset \lambda(\sigma)$. That is, we modify the graph described above by eliminating the almost completely labelled simplices to which (c) is applied, replacing the two edges leading away from such a simplex with a single edge connecting its two neighbors.

Suppose that λ(σ) = {0, …, i − 1} and the two neighbors of σ in this compressed graph are both i-dimensional. Then exactly one of these would be positively oriented if its vertex that is not contained in σ had label i. (We will not prove this and similar claims, instead inviting the reader to look for examples of the phenomenon in Fig. 3.3.) If σ has one i-dimensional neighbor and a facet in Δ_{i−1}, then that facet is positively oriented if and only if the i-dimensional neighbor of σ would be negatively oriented if its vertex that is not contained in σ had label i. The rule in this case is to go in the direction of the neighbor that is or would be positively oriented.

If λ(σ) = {0, …, i}, then the rule is to go to the neighbor given by (ii) if σ is positively oriented, and to go to the neighbor given by (i) (possibly followed by (c)) if σ is negatively oriented. To prove that this rule works we should show that if the prescribed neighbor of σ is τ, then the prescribed neighbor of τ is not σ. Especially given our convention that {e_0} is positively oriented, a formal proof of this would need to consider many cases, and would be less to the point than the reader thinking about the various possibilities and looking for examples of them in Fig. 3.3.

Since {e_0} is positively oriented, the prescribed path leads away from it, and it necessarily ends at a positively oriented completely labelled simplex, because the path always goes in a direction that would result in such a simplex if the next simplex was completely labelled. Now observe that there is a different version of the algorithm for each of the (d + 1)! orderings of the indices 0, …, d. Having found a positively oriented completely labelled simplex for one ordering, we can go away from this simplex along the path given by a different ordering. This may lead back to a vertex corresponding to e_0 in the new ordering, but it is also possible that the end of this path is a negatively oriented completely labelled simplex. A completely labelled simplex is accessible if it can be reached by a path that begins at a vertex and combines paths given by various orderings of the indices. In general there is no guarantee that all completely labelled simplices are accessible.


When the mesh of the triangulation is small, the path followed by the Scarf algorithm can be quite lengthy. The computational burden could, potentially, be reduced if one could run the algorithm with a coarse triangulation to develop a rough approximation of the fixed point, then pass to a fine approximation and somehow “restart” the algorithm at the approximation. We will describe the sandwich method discovered independently by Merrill (1972) and MacKinnon, which is presented in Kuhn and MacKinnon (1975). (A “homotopy” method achieving a similar effect was developed by Eaves 1972, Eaves and Saigal 1972, and another method was proposed by Tuy et al. 1978, Tuy 1979.)

Let Δ^+ := { y ∈ R^{d+2}_+ : y_0 + ⋯ + y_{d+1} = 1 } be the (d + 1)-dimensional simplex. We will need a simplicial subdivision of Δ^+ whose vertices are the points (h_0/k, …, h_{d+1}/k) where h_0, …, h_{d+1} are nonnegative integers that sum to k. In addition, it must be the case that for each h = 0, …, d + 1, the subdivision restricts to a simplicial subdivision of { y ∈ Δ^+ : y_{d+1} = h/k }. Exercise 3.5 develops (in a slightly different coordinate system) the regular subdivision of the simplex due to Kuhn (1960, 1968), which has these properties.

Let a continuous f : Δ → Δ be given. We now describe a Sperner labelling of the vertices of the subdivision. The vertices w such that w_{d+1} = 0 are given the labels 0, …, d in such a way that there is a unique d-dimensional simplex σ ⊂ { y ∈ Δ^+ : y_{d+1} = 0 } that has all the labels 0, …, d. There is an obvious identification of Δ with { y ∈ Δ^+ : y_{d+1} = 1/k }, and the labels of the vertices w such that w_{d+1} = 1/k are the ones induced by the given function f : Δ → Δ under this identification. All the vertices w with w_{d+1} ≥ 2/k receive the label d + 1. (See Fig. 3.4.)

Now consider the path of the Scarf algorithm if we start it at one of the vertices of Δ^+ whose last coordinate is 0. It will pivot in { y ∈ Δ^+ : y_{d+1} = 0 }, as if the last dimension didn’t exist, until it arrives at σ. (Of course in practice we can simply start the computation at σ.) From there it will continue pivoting until it finds a simplex with all the labels 0, …, d + 1. Such a simplex is necessarily the convex hull of a single vertex in { y ∈ Δ^+ : y_{d+1} = 2/k } whose label is d + 1 and a d-dimensional simplex in { y ∈ Δ^+ : y_{d+1} = 1/k } that has the labels 0, …, d. If we have reason to believe that σ is close to a fixed point of f, then we can expect, or at least hope, that this computation will not take very long.
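The three-layer labelling can be sketched as a small function. This is only an illustration under stated assumptions: a vertex is represented by its integer numerators h = (h_0, …, h_{d+1}) summing to k ≥ 2, the identification of Δ with the layer y_{d+1} = 1/k is taken to be rescaling of the first d + 1 coordinates, and the callbacks `base_label` (the labelling of the bottom layer, which the text leaves unspecified) and `f_label` (the labelling induced by f) are hypothetical parameters supplied by the user.

```python
from fractions import Fraction

def sandwich_label(h, d, base_label, f_label):
    """Label of the vertex (h_0/k, ..., h_{d+1}/k) in the sandwich labelling."""
    k = sum(h)
    layer = h[d + 1]
    if layer == 0:
        # Bottom layer: artificial labels chosen around the starting simplex.
        return base_label(h[:d + 1])
    if layer == 1:
        # Middle layer: rescale to a point of the simplex and use the label from f.
        x = tuple(Fraction(hi, k - 1) for hi in h[:d + 1])
        return f_label(x)
    # All higher layers receive the extra label d + 1.
    return d + 1
```

For example, with d = 1 and k = 3 the vertex with numerators (1, 1, 1) lies in the middle layer and is labelled according to f at the point (1/2, 1/2).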

Fig. 3.4 The path of the sandwich method for restart


3.4 Primitive Sets

The early history of the so-called Scarf algorithm is a bit confused. The algorithm that was actually proposed by Scarf (1967) is the one described in this section, and in that paper Scarf asserts that “Sperner’s lemma suggests no procedure for the determination of an approximate fixed point other than an exhaustive search of all subsimplices until one is found with all vertices labelled differently.” Kuhn (1968) pointed out that the algorithm Scarf had proposed could be understood as a matter of moving, in a derived simplicial complex, along a path of d-dimensional simplices with adjacent simplices sharing common facets. Kuhn describes a note by Cohen (1967) as an algorithm, even though Cohen describes his work as an inductive proof that (for a Sperner labelling) for each i there are an odd number of i-dimensional simplices in Δ_i with the labels 0, …, i. Probably Kuhn had become aware that the paths in Cohen’s argument could be joined together to give the path from a vertex to a completely labelled simplex that we saw in the last section. In addition, Kuhn presented a different algorithm, and announced plans to publish others. Somehow, out of this ferment, the algorithm described in the last section emerged as a standard presentation, with Scarf’s name attached. Nevertheless Scarf’s original method still has considerable independent interest. It is quite flexible, with various ways that restart can be achieved.

As before, Δ is the d-dimensional simplex, which is the convex hull of e_0 = (1, 0, …, 0), …, e_d = (0, …, 0, 1). Let D := {0, …, d}, and fix a finite set V ⊂ Δ that contains e_0, …, e_d. A primitive set is a set W ⊂ V ∪ D such that |W| = |D| and there is no v ∈ V such that v_i > min_{w ∈ W∩V} w_i for all i ∈ D \ W. This minimum is undefined if W ∩ V = ∅, and we adopt the convention that D is not a primitive set. Note that for each i ∈ D, {e_i} ∪ (D \ {i}) is a primitive set.

Fix a primitive set W. The primitive simplex of W is

$$\sigma_W = \{\, x \in \Delta : x_g \ge \min_{w \in W \cap V} w_g \text{ for all } g \in D \setminus W \,\}.$$

The definition of a primitive set requires that there is no point of V in the interior (relative to Δ) of σ_W. To get a better picture of σ_W let x^W ∈ R^{d+1} be the point such that x^W_i := 0 if i ∈ W and x^W_i := min_{w ∈ W∩V} w_i if i ∈ D \ W. Let α := 1 − Σ_{i∈D} x^W_i. It is not hard to see that

$$\sigma_W = \operatorname{conv}(\{\, x^W + \alpha e_i : i \in D \,\}),$$

so σ_W is a rescaled copy of Δ. The diameter of σ_W is the distance between any two of its vertices, which is √2 α. An easy calculation shows that the distance from the barycenter of σ_W to the barycenter of one of its facets is α/√(d(d + 1)), so σ_W contains the ball of this radius centered at its barycenter. We conclude that if every point in Δ is within ε/√(2d(d + 1)) of some point in V, then diam(σ_W) ≤ ε.
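The description of σ_W as a rescaled copy of Δ is easy to compute with. The sketch below assumes that W is passed as two pieces, the list `points` of elements of W ∩ V (tuples of rationals) and the set `slack` of indices in W ∩ D; these names and this representation are illustrative choices. It only carries out the geometric construction and does not check that W is primitive.

```python
from fractions import Fraction
from math import dist, sqrt

def primitive_simplex(points, slack, d):
    """Return (x^W, alpha, vertices) of the simplex conv{ x^W + alpha * e_i : i in D }."""
    # x^W_i is 0 for slack indices, otherwise the least i-th coordinate in W ∩ V.
    x = [Fraction(0) if i in slack else min(p[i] for p in points)
         for i in range(d + 1)]
    alpha = 1 - sum(x)
    vertices = [tuple(x[g] + (alpha if g == i else 0) for g in range(d + 1))
                for i in range(d + 1)]
    return x, alpha, vertices

# Illustrative data with d = 2: two points of V together with the slack index 2.
pts = [(Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)),
       (Fraction(1, 4), Fraction(1, 2), Fraction(1, 4))]
x_W, alpha, verts = primitive_simplex(pts, {2}, 2)
```

Here x^W = (1/4, 1/4, 0) and α = 1/2, and any two of the three vertices are at distance √2 α from each other.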

Now suppose that f : Δ → Δ is a continuous function. Define λ : V ∪ D → D by letting λ(i) := i and letting λ(v) be the smallest i ∈ D such that f_i(v) ≥ v_i. Then W is completely labelled if λ(W) = D, which is to say that

$$\lambda(W \setminus D) \cup (W \cap D) = D.$$

Supposing this is the case, for each i ∈ D ∩ W choose some y^i ∈ σ_W such that y^i_i = 0. Of course f_i(y^i) ≥ y^i_i, so (W ∩ V) ∪ { y^i : i ∈ D ∩ W } is a diam(σ_W)-approximate fixed point of f.

We will produce an algorithm that finds a completely labelled primitive set. A primitive set W is k-almost completely labelled if D \ {k} ⊂ λ(W). The algorithm will follow a path in the set of k-almost completely labelled primitive sets that begins at {e_k} ∪ (D \ {k}) and terminates at a completely labelled primitive set. According to the logic of arguments we saw earlier, this algorithm provides a proof of the BFPT.

Working with an abstract combinatoric formulation will bring us closer to the essential logic of the subject, and it will also make the procedure much more flexible. For some positive integer e let E := {0, …, e}. Let Q be a finite set. For each i ∈ E let ≺_i be a strict complete order of Q ∪ E such that i ≺_i Q ≺_i E \ {i}. (Nothing will depend on how ≺_i orders E \ {i}.) As usual, x ⪯_i y means that either x ≺_i y or x = y, and we sometimes write y ≻_i x (y ⪰_i x) instead of x ≺_i y (x ⪯_i y).

For any nonempty S ⊂ Q ∪ E let m_i(S) and M_i(S) be the ≺_i-least and ≺_i-greatest elements of S. For w ∈ Q ∪ E let U_i(w) := { x ∈ Q ∪ E : w ≺_i x }. A (generalized) primitive set for ≺_0, …, ≺_e is a set W ⊂ Q ∪ E such that |W| = |E| and

$$Q \cap \bigcap_{i \in E} U_i(m_i(W)) = \emptyset.$$

Lemma 3.1 If W is a primitive set, then m_i(W) ∈ {i} ∪ Q for all i ∈ E, and for each w ∈ W there is a unique i ∈ E such that m_i(W) = w.

Proof The first assertion follows from the observation that |W| = |E| implies that W contains some element of {i} ∪ Q. If w ∈ W ∩ E, then of course m_w(W) = w, and if w ∈ W ∩ Q, then there is some i such that w ∉ U_i(m_i(W)), so that w = m_i(W). Thus the map i ↦ m_i(W) is a surjection, and since |W| = |E| it is a bijection. □

If W is a primitive set and w ∈ W, a replacement for w in W is a w′ ∈ (Q ∪ E) \ W such that (W \ {w}) ∪ {w′} is primitive. The key result is:

Theorem 3.4 (Tuy 1979) Suppose that W is a primitive set and w ∈ W. If W \ {w} ⊂ E, then there is no replacement for w in W. Let j be the element of E such that m_j(W) = w, let w* := m_j(W \ {w}), let j′ be the element of E such that m_{j′}(W) = w*, let

$$R := U_j(w^*) \cap \bigcap_{h \ne j, j'} U_h(m_h(W)),$$

and if R ≠ ∅ let w′ := M_{j′}(R). If (W \ {w}) ∩ Q ≠ ∅, then R ≠ ∅ and w′ is the unique replacement for w in W, and if W′ := (W \ {w}) ∪ {w′}, then m_j(W′) = w*, m_{j′}(W′) = w′, and m_h(W′) = m_h(W) for all h ≠ j, j′.


Proof If R ≠ ∅, then (W \ {w}) ∪ {w′} clearly satisfies the definition of a primitive set. Suppose that (W \ {w}) ∩ Q ≠ ∅. Then w* ∈ {j} ∪ Q, and w* ≠ j because w* ≻_j w, so w* ∈ Q. Therefore j′ ∈ U_j(w*). The last result implies that j′ ∈ U_h(m_h(W)) for all h ≠ j′, so R ≠ ∅ because j′ is an element.

Now suppose that w″ is a replacement for w in W. Let W″ = (W \ {w}) ∪ {w″}, and let i be the element of E such that m_i(W″) = w″. We will show that (W \ {w}) ∩ Q ≠ ∅, i = j′, and w″ = w′. Note that if h ≠ i, j, then m_h(W) ≠ w and m_h(W″) ≠ w″, so

$$m_h(W) = m_h(W \setminus \{w\}) = m_h(W'' \setminus \{w''\}) = m_h(W'').$$

We claim that i ≠ j. Aiming at a contradiction, suppose that i = j and (without loss of generality, by symmetry) w″ ≻_j w. If w″ ∈ Q, then w″ ≻_j w and w″ ≻_h m_h(W) for all h ≠ j, which contradicts the assumption that W is primitive. If w″ ∈ E, then w″ = i = j, but j ≻_j w is impossible. Note in particular that m_j(W″) ≠ w″.

We now have m_j(W) = w, m_i(W″) = w″, and m_h(W) = m_h(W″) for all h ≠ i, j. The last result implies that the remaining element of W \ {w} = W″ \ {w″} is

$$m_i(W) = m_j(W'') = m_j(W'' \setminus \{w''\}) = m_j(W \setminus \{w\}) = w^*.$$

Therefore i = j′. Since m_{j′}(W) = m_j(W″) ≻_{j′} m_{j′}(W″) = w″, m_{j′}(W) ≠ j′ and thus m_{j′}(W) ∈ Q. In particular, (W \ {w}) ∩ Q ≠ ∅, so R ≠ ∅.

Since w* = m_j(W″), R = ⋂_{h ≠ j′} U_h(m_h(W″)). Therefore the last result implies that w″ ∈ R, and the specification of the orderings implies that R ∩ E = {j′}, and that w″ ⪰_{j′} j′. Therefore any x ∈ R such that x ≻_{j′} w″ would be an element of Q and contradict the assumption that W″ is primitive, so w″ = w′. The final assertions have all already been established over the course of the argument. □

We can now describe the algorithm. Let a function λ : Q → E be given. We extend λ to the domain Q ∪ E by setting λ(i) := i. Let q_k := M_k(Q), and let W_0 := {q_k} ∪ (E \ {k}). If λ(q_k) = k, then W_0 is a completely labelled primitive set, and the computation halts. Otherwise let w_0 := λ(q_k). This completes the initialization.

We iterate the following step. Let (W_t, w_t) be given, where W_t is a primitive set with λ(W_t) = E \ {k} and w_t is one of two elements of W_t that have the same label. If W_t \ {w_t} ⊂ E the computation halts. (We will see that this does not happen.) Otherwise let w′_t be the unique replacement for w_t in W_t, and let W_{t+1} := (W_t \ {w_t}) ∪ {w′_t}. If λ(w′_t) = k, then W_{t+1} is completely labelled and the computation halts. Otherwise let w_{t+1} be the element of W_{t+1} \ {w′_t} such that λ(w_{t+1}) = λ(w′_t).

We claim that this computation halts at a completely labelled primitive set. If W_t \ {w_t} ⊂ E, then in fact W_t \ {w_t} = E \ {k}, and the definition of a primitive set then implies that q_k ∈ W_t, so that W_t = W_0. In the graph of k-almost completely labelled primitive sets, W_0 has one neighbor, and every other k-almost completely labelled primitive set that is not completely labelled has two neighbors. Therefore the path of the algorithm cannot double back on itself, so it cannot come back to W_0, and it must halt eventually because there are only finitely many primitive sets, so it must halt at a completely labelled primitive set. Recently Petri and Voorneveld (2017) rediscovered Tuy’s purely combinatoric framework, and this existence result, which deserves to be better known.
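The replacement step of Theorem 3.4 and the resulting path can be sketched in a few lines. The following is a minimal illustration under several assumptions that are not in the text: elements of E are encoded as pairs ("E", i) and elements of Q as pairs ("Q", h), where h is a tuple of nonnegative integers with a fixed sum N (so h/N is a grid point of the simplex); ≺_i places i first, then Q in lexicographic order of the coordinates rotated to start at i, then E \ {i}; and the example labelling comes from the cyclic shift f(x) = (x_1, …, x_d, x_0), chosen only because it is easy to evaluate exactly. All function names are hypothetical.

```python
from itertools import product

def make_key(i):
    """Sort key realizing the order in which i is least, then Q, then E minus {i}."""
    def key(x):
        kind, val = x
        if kind == "E":
            return (0,) if val == i else (2, val)
        return (1,) + val[i:] + val[:i]  # rotated lexicographic order on Q
    return key

def least(i, S, keys):
    return min(S, key=keys[i])

def is_primitive(W, q_elems, keys, E):
    mins = {i: least(i, W, keys) for i in E}
    # W is primitive if no element of Q lies strictly above every m_i(W).
    return len(W) == len(E) and not any(
        all(keys[i](q) > keys[i](mins[i]) for i in E) for q in q_elems)

def replacement(W, w, universe, keys, E):
    """The replacement for w in W computed by the recipe of Theorem 3.4."""
    mins = {i: least(i, W, keys) for i in E}
    j = next(i for i in E if mins[i] == w)
    w_star = least(j, W - {w}, keys)
    j_prime = next(i for i in E if mins[i] == w_star)
    R = [x for x in universe
         if keys[j](x) > keys[j](w_star)
         and all(keys[h](x) > keys[h](mins[h]) for h in E if h not in (j, j_prime))]
    return max(R, key=keys[j_prime])

def primitive_set_path(Q, label, e, k=0):
    """Follow the path of k-almost completely labelled primitive sets from W_0."""
    E = list(range(e + 1))
    keys = {i: make_key(i) for i in E}
    q_elems = [("Q", q) for q in Q]
    universe = [("E", i) for i in E] + q_elems
    lam = lambda x: x[1] if x[0] == "E" else label(x[1])
    new = max(q_elems, key=keys[k])  # q_k, the greatest element of Q for the k-th order
    W = frozenset([new] + [("E", i) for i in E if i != k])
    w = ("E", lam(new))  # the element of W_0 sharing the label of q_k (unused if that label is k)
    while lam(new) != k:
        new = replacement(W, w, universe, keys, E)
        W = (W - {w}) | {new}
        w = next((x for x in W if x != new and lam(x) == lam(new)), None)
    return W, keys, q_elems

# Illustrative run: d = 2, grid denominator N = 6, labels from the cyclic shift.
d, N = 2, 6
Q = [h for h in product(range(N + 1), repeat=d + 1) if sum(h) == N]

def label(h):
    # Smallest i with f_i(h) >= h_i for f(x) = (x_1, ..., x_d, x_0).
    return next(i for i in range(d + 1) if h[(i + 1) % (d + 1)] >= h[i])

W_final, keys, q_elems = primitive_set_path(Q, label, e=d)
```

The lists scanned here make each step linear in |Q|, which is adequate for a small illustration but not for serious computation.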

In the other path following algorithms studied in this chapter, orientation is a geometric phenomenon. In order to develop this concept for the algorithm considered here we need a combinatoric analogue, which comes from group theory.

A bijective function from E to itself is a permutation of this set. Permutations are functions that can be composed with each other, and the composition π′ ∘ π of π and π′ is a third permutation. Composition is associative, but not in general commutative. It has a two sided identity element, namely the identity function. Each permutation has a unique two sided inverse. (Probably most readers will already know that a set with a binary operation that has these properties is called a group.) The symmetric group S_{e+1} is the set of all permutations π : E → E with composition as the group operation.

An (e + 1) × (e + 1) permutation matrix is an (e + 1) × (e + 1) matrix whose entries are all 0 and 1, with exactly one 1 in each row and each column. For any permutation π there is an associated permutation matrix M_π whose (i, j)-entry is 1 if i = π(j) and 0 otherwise. It is easy to check that M_{π′} M_π = M_{π′∘π}. A permutation π is even if |M_π| = 1 and it is odd if |M_π| = −1. A transposition is a permutation t that swaps two of the integers 0, …, e while leaving the others fixed. Evidently |M_t| = −1. From practical experience we know that any permutation can be written as a composition of transpositions; a formal demonstration would be tedious, and is left to the interested reader. Since the determinant of a product of matrices is the product of their determinants, any representation of π as a composition of transpositions has an even (odd) number of terms if π is even (odd). Two permutations have the same (opposite) parity if one can go from one to the other with an even (odd) number of transpositions.
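These facts about permutation matrices can be verified mechanically for small e. The sketch below represents a permutation π as the tuple (π(0), …, π(e)), an encoding chosen for illustration, and computes determinants by cofactor expansion, which is adequate only for tiny matrices. The comparison with the parity of the number of inversions uses a standard fact that is not stated in the text.

```python
from itertools import permutations

def perm_matrix(pi):
    """Matrix whose (i, j)-entry is 1 if i = pi(j) and 0 otherwise."""
    n = len(pi)
    return [[1 if i == pi[j] else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def compose(p2, p1):
    """The composition p2 after p1."""
    return tuple(p2[p1[j]] for j in range(len(p1)))

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def sign_by_inversions(pi):
    inv = sum(1 for a in range(len(pi)) for b in range(a + 1, len(pi)) if pi[a] > pi[b])
    return -1 if inv % 2 else 1
```

Looping over all pairs of elements of S_4 confirms that the matrix of a composition is the product of the matrices, and that a transposition such as (1, 0, 2, 3) has determinant −1.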

For any primitive set W the labelling λ induces a function α_W : E → E given by

$$\alpha_W(j) := \lambda(m_j(W)).$$

If W is completely labelled, then α_W is a permutation, and we say that W is positively (negatively) oriented if α_W is an even (odd) permutation.

For g, h ∈ E let α_{W,g→h} : E → E be given by α_{W,g→h}(g) := h and α_{W,g→h}(ℓ) := α_W(ℓ) if ℓ ≠ g. If W is k-almost completely labelled, but not completely labelled, then there are distinct j_1, j_2 ∈ E such that α_W(j_1) = α_W(j_2), and α_{W,j_1→k} and α_{W,j_2→k} are permutations which are related by composition with a transposition, so α_{W,j_1→k} is even if and only if α_{W,j_2→k} is odd.

Let w := m_{j_2}(W), and suppose that (W \ {w}) ∩ Q ≠ ∅. Let w* := m_{j_2}(W \ {w}), and let j′_1 be the element of E such that m_{j′_1}(W) = w*. Then

$$\alpha_{W, j_2 \to k}(j_2) = k \quad\text{and}\quad \alpha_{W, j_2 \to k}(j'_1) = \lambda(w^*).$$

Let R := U_{j_2}(w*) ∩ ⋂_{ℓ ≠ j_2, j′_1} U_ℓ(m_ℓ(W)), let w′ := M_{j′_1}(R), and let W′ := (W \ {w}) ∪ {w′}. Theorem 3.4 gives m_{j_2}(W′) = w*, m_{j′_1}(W′) = w′, and m_ℓ(W) = m_ℓ(W′) for all ℓ ≠ j_2, j′_1. Therefore

$$\alpha_{W', j'_1 \to k}(j_2) = \lambda(w^*), \qquad \alpha_{W', j'_1 \to k}(j'_1) = k,$$

and α_{W′,j′_1→k}(ℓ) = α_{W,j_2→k}(ℓ) for all ℓ ≠ j_2, j′_1, so α_{W,j_2→k} and α_{W′,j′_1→k} differ by composition with a transposition, and consequently exactly one of them is even.

Now consider the progress of the algorithm. We begin at the primitive set W_0 = {q_k} ∪ (E \ {k}). If λ(q_k) = k, then α_{W_0} = α_{W_0,k→k} = Id_E, so W_0 is completely labelled and positively oriented. Otherwise we let w_0 := λ(q_k) and j_{2,0} := λ(q_k), observing that α_{W_0, j_{2,0}→k} is an odd permutation.

In general suppose (W_t, w_t) is given, where W_t is a primitive set with λ(W_t) = E \ {k} and w_t is one of two elements of W_t that have the same label. Suppose that w_t = m_{j_{2,t}}(W_t), and that α_{W_t, j_{2,t}→k} is an odd permutation. As per Theorem 3.4, let w′_{t+1} be the element of (Q ∪ E) \ W_t such that W_{t+1} := (W_t \ {w_t}) ∪ {w′_{t+1}} is a primitive set. If m_{j_{1,t+1}}(W_{t+1}) = w′_{t+1}, then α_{W_{t+1}, j_{1,t+1}→k} is an even permutation, and if λ(w′_{t+1}) = k, then W_{t+1} is a positively oriented completely labelled primitive set. Otherwise let w_{t+1} be the element of W_{t+1} \ {w′_{t+1}} such that λ(w_{t+1}) = λ(w′_{t+1}). If m_{j_{2,t+1}}(W_{t+1}) = w_{t+1}, then α_{W_{t+1}, j_{2,t+1}→k} is an odd permutation. By induction the algorithm will always find a positively oriented completely labelled primitive set. Note in particular that the path of the graph followed by the algorithm is directed, in the sense that if you start at a given k-almost completely labelled primitive set, purely local information determines which direction does not lead back to W_0.

As with the Scarf algorithm, having found such a primitive set, one may follow the path leading away from there for a different missing label, which may lead to a negatively oriented primitive set, and a completely labelled primitive set is accessible if it can be reached by some sequence of maneuvers of this sort.

One may implement the primitive set method as an algorithm for computing approximate fixed points by setting e = d and Q = V. In computational practice one will usually (perhaps almost inevitably) let V be some subset of the set of points in Δ whose components are integer multiples of 1/N for some large N. There is a complete strict ordering ≺_lex of R^{d+1} given by requiring that x ≺_lex y if and only if there is some k such that x_ℓ = y_ℓ for all ℓ < k and x_k < y_k. It is natural to specify that for v, v′ ∈ V, v ≺_i v′ if and only if

$$(v_i, \ldots, v_d, v_0, \ldots, v_{i-1}) \prec_{\mathrm{lex}} (v'_i, \ldots, v'_d, v'_0, \ldots, v'_{i-1}).$$

Critically, this structure makes it not too hard to compute M_{j′}(R) in the setting of Theorem 3.4.

One way to achieve the effect of restart is simply to run the algorithm many times, each time adding more points to V near the point that was found on the last run, but not elsewhere. In a method proposed by Tuy (1979) and coworkers (see Tuy et al. 1978) using a somewhat more general combinatoric framework, Q is, in effect, a finite subset of the boundary of the (d + 1)-dimensional simplex. Since the framework is very flexible there are certainly many other possibilities.

3.5 The Lemke–Howson Algorithm

A finite two person game is a quadruple (S, T, u, v) where S and T are nonempty finite sets of pure strategies for the two agents, and u, v : S × T → R are payoff functions. Elements of S × T are called pure strategy profiles. This structure is usually interpreted as modelling a situation in which each of the two players is required to choose an element from their set of pure strategies before learning the other player’s choice.

A pure Nash equilibrium is a pure strategy profile (s*, t*) such that u(s, t*) ≤ u(s*, t*) for all s ∈ S and v(s*, t) ≤ v(s*, t*) for all t ∈ T. The simplest examples show that pure Nash equilibria may not exist.

The “mixed extension” is the derived two person game with the same two players in which each player’s set of strategies is the set of probability measures on that player’s set of pure strategies in the original game. Let

$$\Delta(S) := \Bigl\{\, \sigma : S \to [0,1] : \sum_{s \in S} \sigma(s) = 1 \,\Bigr\} \quad\text{and}\quad \Delta(T) := \Bigl\{\, \tau : T \to [0,1] : \sum_{t \in T} \tau(t) = 1 \,\Bigr\}$$

be the sets of mixed strategies for the two players. An element of Δ(S) × Δ(T) is called a mixed strategy profile. We regard S and T as subsets of Δ(S) and Δ(T) by identifying s ∈ S with the σ such that σ(s) = 1 and σ(s′) = 0 for all s′ ≠ s, and similarly for T. The supports of σ ∈ Δ(S) and τ ∈ Δ(T) are

$$\operatorname{supp}(\sigma) := \{\, s \in S : \sigma(s) > 0 \,\} \quad\text{and}\quad \operatorname{supp}(\tau) := \{\, t \in T : \tau(t) > 0 \,\}$$

respectively. For nonempty C ⊂ S and D ⊂ T let

$$\Delta(C) := \{\, \sigma \in \Delta(S) : \operatorname{supp}(\sigma) \subset C \,\} \quad\text{and}\quad \Delta(D) := \{\, \tau \in \Delta(T) : \operatorname{supp}(\tau) \subset D \,\}.$$

Payoffs in the mixed extension are computed by taking expectations. We let u and v also denote the extensions of the given payoff functions to Δ(S) × Δ(T), so the expected payoffs resulting from a mixed strategy profile (σ, τ) ∈ Δ(S) × Δ(T) are

$$u(\sigma, \tau) := \sum_{s \in S} \sum_{t \in T} u(s,t)\,\sigma(s)\,\tau(t) \quad\text{and}\quad v(\sigma, \tau) := \sum_{s \in S} \sum_{t \in T} v(s,t)\,\sigma(s)\,\tau(t).$$

Note that u and v are the restrictions to Δ(S) × Δ(T) of real valued functions on R^S × R^T that are bilinear: for each σ ∈ R^S, u(σ, ·) : R^T → R and v(σ, ·) : R^T → R are linear, and for each τ ∈ R^T, u(·, τ) : R^S → R and v(·, τ) : R^S → R are linear.


A (mixed) Nash equilibrium is a mixed strategy profile (σ*, τ*) ∈ Δ(S) × Δ(T) such that u(σ, τ*) ≤ u(σ*, τ*) for all σ ∈ Δ(S) and v(σ*, τ) ≤ v(σ*, τ*) for all τ ∈ Δ(T). That is, each agent is maximizing her expected payoff, taking the other agent’s mixed strategy as given. For τ ∈ Δ(T) and σ ∈ Δ(S) let

$$B_S(\tau) := \operatorname*{argmax}_{s \in S} u(s, \tau) \quad\text{and}\quad B_T(\sigma) := \operatorname*{argmax}_{t \in T} v(\sigma, t)$$

be the sets of pure best responses to τ and σ respectively. The following is an immediate consequence of the bilinear character of the payoff functions and the fact that each agent’s “budget constraint” is that the probabilities sum to unity.

Lemma 3.2 A mixed strategy profile (σ*, τ*) is a Nash equilibrium if and only if

$$\operatorname{supp}(\sigma^*) \subset B_S(\tau^*) \quad\text{and}\quad \operatorname{supp}(\tau^*) \subset B_T(\sigma^*).$$
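The support condition of Lemma 3.2 gives a finite test for equilibrium. The sketch below assumes that pure strategies are indexed by integers, that payoffs are given as nested lists `u[s][t]` and `v[s][t]`, and that mixed strategies are lists of exact rationals; the function names and the matching pennies example are illustrative and do not come from the text.

```python
from fractions import Fraction as F

def best_responses_S(u, tau):
    """Pure best responses of the first player to the mixed strategy tau."""
    vals = [sum(u[s][t] * tau[t] for t in range(len(tau))) for s in range(len(u))]
    return {s for s, x in enumerate(vals) if x == max(vals)}

def best_responses_T(v, sigma):
    """Pure best responses of the second player to the mixed strategy sigma."""
    vals = [sum(v[s][t] * sigma[s] for s in range(len(sigma)))
            for t in range(len(v[0]))]
    return {t for t, x in enumerate(vals) if x == max(vals)}

def support(p):
    return {i for i, x in enumerate(p) if x > 0}

def is_nash(u, v, sigma, tau):
    # Lemma 3.2: each support must consist of pure best responses.
    return (support(sigma) <= best_responses_S(u, tau)
            and support(tau) <= best_responses_T(v, sigma))

# Matching pennies: the first player wants to match, the second to mismatch.
u = [[1, -1], [-1, 1]]
v = [[-1, 1], [1, -1]]
half = [F(1, 2), F(1, 2)]
```

In this game the uniform profile passes the test, while the pure profile in which both players choose their first strategy fails it, since the second player would rather switch.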

For nonempty C ⊂ S and D ⊂ T let

$$\Delta_D(C) := \{\, \sigma \in \Delta(C) : D \subset B_T(\sigma) \,\} \quad\text{and}\quad \Delta_C(D) := \{\, \tau \in \Delta(D) : C \subset B_S(\tau) \,\}.$$

Evidently the set of Nash equilibria is

$$\bigcup_{\emptyset \ne C \subset S,\ \emptyset \ne D \subset T} \Delta_D(C) \times \Delta_C(D).$$

If a pure strategy is a best response to two different mixed strategies, then it is a best response to any convex combination of them, so Δ_D(C) and Δ_C(D) are convex.

The game is nondegenerate if, for all nonempty C ⊂ S and D ⊂ T, Δ_D(C) is either empty or (|C| − |D|)-dimensional and Δ_C(D) is either empty or (|D| − |C|)-dimensional. (A set of negative dimension is necessarily empty.) In order for a game to be degenerate, some system of linear equations with coefficients given by the numbers u(s, t) and v(s, t) must have a space of solutions with a higher-than-expected dimension. Such an occurrence corresponds to the vanishing of the determinant of a matrix, some of whose entries are numbers u(s, t) and v(s, t), so the set of degenerate games is contained in the union of finitely many sets given by polynomial equations. It follows that any game can be approximated by a nondegenerate game.

Note that if (σ*, τ*) is a Nash equilibrium and C = supp(σ*) and D = supp(τ*), then σ* ∈ Δ_D(C) and τ* ∈ Δ_C(D). If the game is nondegenerate, both |C| − |D| and |D| − |C| are nonnegative, so |C| = |D|, and Δ_D(C) and Δ_C(D) are both singletons, because they are nonempty 0-dimensional convex sets. Consequently a nondegenerate game has finitely many Nash equilibria.

We now fix a pure strategy s_0 ∈ S. A mixed strategy profile (σ, τ) is an s_0-almost equilibrium if supp(σ) \ {s_0} ⊂ B_S(τ) and supp(τ) ⊂ B_T(σ). Let Γ_{s_0} be the set of s_0-almost equilibria. Evidently

$$\Gamma_{s_0} = \bigcup_{s_0 \in C \subset S,\ \emptyset \ne D \subset T} \Delta_D(C) \times \Delta_{C \setminus \{s_0\}}(D).$$

For the remainder of the section we assume that the game is nondegenerate. There is at least one pure best response to s_0, so Δ_{B_T(s_0)}({s_0}) is nonempty, and consequently |B_T(s_0)| = 1, i.e., B_T(s_0) = {t_0} for some t_0 ∈ T. By analyzing the sets of the form Δ_D(C) × Δ_{C\{s_0}}(D) we will show that (s_0, t_0) is either a Nash equilibrium or one endpoint of a path in Γ_{s_0} whose other endpoint is a Nash equilibrium. The Lemke–Howson algorithm follows this path.

Suppose that (σ*, τ*) is a Nash equilibrium that is the unique element of Δ_D(C) × Δ_C(D), and that (σ*, τ*) ∈ Δ_{D′}(C′) × Δ_{C′\{s_0}}(D′) where s_0 ∈ C′ ⊂ S and D′ ⊂ T. Then C ⊂ C′ and D ⊂ D′ because C′ and D′ contain the supports of σ* and τ*, and C′ \ {s_0} ⊂ C and D′ ⊂ D because the elements of these sets are pure best responses to τ* and σ* respectively. Consequently C′ = C ∪ {s_0} if s_0 ∉ C, C′ = C if s_0 ∈ C, and D′ = D. Since Δ_{D′}(C′) × Δ_{C′\{s_0}}(D′) is nonempty, nondegeneracy implies that it is 1-dimensional except when C′ \ {s_0} = ∅, in which case C′ = {s_0}, D′ = {t_0} because Δ_{D′}(C′) ≠ ∅, (σ*, τ*) = (s_0, t_0), and Δ_{D′}(C′) × Δ_{C′\{s_0}}(D′) = {(s_0, t_0)} is 0-dimensional. We have shown that if (s_0, t_0) is a Nash equilibrium, then it is an isolated point in Γ_{s_0}, and every other Nash equilibrium is an endpoint of precisely one 1-dimensional set Δ_D(C) × Δ_{C\{s_0}}(D).

We analyze the intersections of the sets Δ_D(C) × Δ_{C\{s_0}}(D) that are 1-dimensional. Fix C and D such that C is a proper superset of {s_0} and Δ_D(C) × Δ_{C\{s_0}}(D) ≠ ∅. Then |D| ≤ |C| and |C \ {s_0}| ≤ |D|. There are two main cases.

Case I: |C| = |D| + 1.

Nondegeneracy implies that Δ_D(C) is 1-dimensional and Δ_{C\{s_0}}(D) is a singleton, say with unique element τ. It also implies that τ assigns positive probability to every element of D, and that there are no best responses to τ outside of C \ {s_0}. Fix a point σ ∈ Δ_D(C), and suppose that (σ, τ) ∈ Δ_{D′}(C′) × Δ_{C′\{s_0}}(D′) where s_0 ∈ C′ ⊂ S and D′ ⊂ T.

First suppose that σ is in the interior of Δ_D(C). Since Δ_D(C) is 1-dimensional, nondegeneracy implies that it is not contained in any proper face of Δ(C), so supp(σ) = C. We have C ⊂ C′ and D ⊂ D′ because C′ and D′ contain the supports of σ and τ, C′ ⊂ C because there are no best responses to τ outside of C \ {s_0}, and D′ ⊂ D because no t outside of D is a best response to σ. Therefore (C′, D′) = (C, D).

Now suppose that σ is an endpoint of Δ_D(C). Either there is some s ∈ C such that σ(s) = 0 or there is some t ∈ T \ D such that t ∈ B_T(σ). If C̃ is the set of s ∈ C such that σ(s) = 0 and D̃ is the set of t ∈ T \ D such that t ∈ B_T(σ), then σ ∈ Δ_{D∪D̃}(C \ C̃), so |C \ C̃| ≥ |D ∪ D̃|, and consequently |C̃| + |D̃| = 1. There are two subcases.

Subcase A: There is a single s ∈ C such that σ(s) = 0 and no t ∈ T \ D such that t ∈ B_T(σ).

We have C \ {s} ⊂ C′ and D ⊂ D′ because C′ and D′ contain the supports of σ and τ, C′ ⊂ C because s_0 ∈ C and there are no best responses to τ outside of C \ {s_0}, and D′ ⊂ D because no t outside of D is a best response to σ. That is, D′ = D, and if C′ ≠ C, then C′ = C \ {s}, and s ≠ s_0. In this case (σ, τ) is not a Nash equilibrium because supp(σ) = C \ {s} and B_S(τ) ⊂ C \ {s_0}, and there are two remaining possibilities:

(a) If C = {s_0, s}, then C′ = {s_0} and |D| = 1, so D = {t} for some t. Since Δ_D(C′) ≠ ∅, t = t_0. Thus Δ_D(C′) × Δ_{C′\{s_0}}(D) = {(s_0, t_0)} is 0-dimensional.

(b) If s is not the only element of C \ {s_0}, then Δ_D(C′) × Δ_{C′\{s_0}}(D) is nonempty, so Δ_D(C′) is 0-dimensional while Δ_{C′\{s_0}}(D) is 1-dimensional.

That is, if (σ, τ) is either (s_0, t_0) or a Nash equilibrium other than (s_0, t_0), then it is contained in exactly one 1-dimensional set Δ_D(C) × Δ_{C\{s_0}}(D), and otherwise it is contained in precisely two such sets.

Subcase B: There is a single t ∈ T \ D such that t ∈ B_T(σ) and no s ∈ C such that σ(s) = 0.

We have C ⊂ C′ and D ⊂ D′ because C′ and D′ contain the supports of σ and τ, C′ ⊂ C because s_0 ∈ C and there are no best responses to τ outside of C \ {s_0}, and D′ ⊂ D ∪ {t} because no t′ outside of D ∪ {t} is a best response to σ. That is, C′ = C and either D′ = D or D′ = D ∪ {t}. Since it contains (σ, τ), Δ_{D∪{t}}(C) × Δ_{C\{s_0}}(D ∪ {t}) is nonempty, and nondegeneracy implies that Δ_{D∪{t}}(C) is 0-dimensional while Δ_{C\{s_0}}(D ∪ {t}) is 1-dimensional. Thus (σ, τ) is contained in precisely two 1-dimensional sets of the form Δ_D(C) × Δ_{C\{s_0}}(D).

Case II: |C| = |D|.

This case is similar, and a bit simpler. Now Δ_D(C) is a singleton, say with unique element σ, and Δ_{C\{s_0}}(D) is 1-dimensional. Nondegeneracy implies that σ assigns positive probability to every element of C, and that there are no best responses to σ outside of D. Fix a point τ ∈ Δ_{C\{s_0}}(D), and suppose that (σ, τ) ∈ Δ_{D′}(C′) × Δ_{C′\{s_0}}(D′) where s_0 ∈ C′ ⊂ S and D′ ⊂ T.

First suppose that τ is in the interior of Δ_{C\{s_0}}(D). Since Δ_{C\{s_0}}(D) is 1-dimensional, nondegeneracy implies that it is not contained in any proper face of Δ(D), so supp(τ) = D. We have C ⊂ C′ and D ⊂ D′ because C′ and D′ contain the supports of σ and τ, C′ ⊂ C because there are no best responses to τ outside of C \ {s_0}, and D′ ⊂ D because no t outside of D is a best response to σ. Therefore (C′, D′) = (C, D).

Now let τ be an endpoint of Δ_{C\{s_0}}(D). Let C̃ be the set of s ∈ S \ C that are best responses to τ, and let D̃ be the set of t ∈ D such that τ(t) = 0. Then τ ∈ Δ_{(C∪C̃)\{s_0}}(D \ D̃), so |(C ∪ C̃) \ {s_0}| ≤ |D \ D̃|, and thus |C̃| + |D̃| = 1. Again there are two subcases.

Subcase A: There is a single s ∈ S \ C such that s ∈ B_S(τ) and no t ∈ D such that τ(t) = 0.

We have C ⊂ C′ and D ⊂ D′ because C′ and D′ contain the supports of σ and τ, C′ ⊂ C ∪ {s} because s_0 ∈ C and there are no best responses to τ outside of (C ∪ {s}) \ {s_0}, and D′ ⊂ D because no t outside of D is a best response to σ. That is, D′ = D and either C′ = C or C′ = C ∪ {s}. Since it contains (σ, τ), Δ_D(C ∪ {s}) × Δ_{(C∪{s})\{s_0}}(D) is nonempty, and nondegeneracy implies that Δ_D(C ∪ {s}) is 1-dimensional while Δ_{(C∪{s})\{s_0}}(D) is 0-dimensional. Thus (σ, τ) is contained in precisely two 1-dimensional sets of the form Δ_D(C) × Δ_{C\{s_0}}(D).

Subcase B: There is a single t ∈ D such that τ(t) = 0 and no s ∈ S \ C such that s ∈ B_S(τ).

We have C ⊂ C′ and D \ {t} ⊂ D′ because C′ and D′ contain the supports of σ and τ, C′ ⊂ C because s_0 ∈ C and there are no best responses to τ outside of C \ {s_0}, and D′ ⊂ D because no t outside of D is a best response to σ. That is, C′ = C and either D′ = D or D′ = D \ {t}. Since it contains (σ, τ), Δ_{D\{t}}(C) × Δ_{C\{s_0}}(D \ {t}) is nonempty, and nondegeneracy implies that Δ_{D\{t}}(C) is 1-dimensional while Δ_{C\{s_0}}(D \ {t}) is 0-dimensional. Thus (σ, τ) is contained in precisely two 1-dimensional sets of the form Δ_D(C) × Δ_{C\{s_0}}(D).

Despite the proliferation of cases, the results of this analysis can be summarized succinctly. Distinct 1-dimensional sets of the form Δ_D(C) × Δ_{C\{s_0}}(D) intersect at their endpoints, if at all. An endpoint of such a set is contained in no other such set if it is a Nash equilibrium or it is (s_0, t_0) (in which case s_0 is not a best response to t_0) and otherwise it is an endpoint of precisely one other such set.

These results give us a clear picture of the structure of Γ_{s_0}. If (s_0, t_0) is a Nash equilibrium, then it is an isolated point in Γ_{s_0}, and the rest of Γ_{s_0} consists of loops and paths whose endpoints are the other Nash equilibria. If (s_0, t_0) is not a Nash equilibrium then Γ_{s_0} consists of loops, a path from (s_0, t_0) to a Nash equilibrium, and paths whose endpoints are the other Nash equilibria.

The Lemke–Howson algorithm has several features in common with the algo-rithms we saw earlier. First, a nondegenerate game has an odd number of Nash equi-libria. Second, the process can be given an orientation. That is, there is a notion of apositively or negatively oriented equilibrium, the equilibrium found by the Lemke–Howson algorithm is positively oriented, and the two endpoints of any path inΓ haveopposite orientation, so the number of positively oriented equilibria is one more thanthe number of negatively oriented equilibria. In addition, at any point along the pathone can use local information to determine which direction along the path leads to apositively oriented equilibrium. These properties of the Lemke–Howson algorithmwere established by Shapley (1974). Unfortunately his definitions and analysis aretoo cumbersome to be included here, so we refer the interested reader to that article.

Third, after following the path in Γs0 from (s0, t0) to its other endpoint (σ∗, τ∗), for some different s′0 or t′0 one may follow the path in Γs′0 or Γt′0 leading away from (σ∗, τ∗). It is possible that this path leads to a negatively oriented Nash equilibrium. As with the other algorithms, equilibria that can be reached by repeated applications of this maneuver are said to be accessible. A famous example due to Robert Wilson (reported in Shapley 1974) shows that there can be inaccessible equilibria even in games with a surprisingly small number of pure strategies.


A concrete example may give a more vivid impression. For concrete calculations it is convenient to let

S = {s1, . . . , sm} and T = {t1, . . . , tn},

and let A and B be the m × n matrices with entries aij := u(si, tj) and bij := v(si, tj). In our example m = n = 3, and

\[
A = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}
\quad\text{and}\quad
B = \begin{pmatrix} 2 & 3 & 0 \\ 2 & 0 & 3 \\ 3 & 1 & 0 \end{pmatrix}.
\]

These payoffs determine divisions of Δ(S) and Δ(T), according to best responses, as shown in Fig. 3.5.

If we let s1 have the role of s0 in our description above (Exercise 3.2 asks you to work out the paths for each of the other five possibilities) then the Lemke–Howson algorithm follows the sequence of points

(s1, t2) −→ (A, t2) −→ (A, B) −→ (C, B) −→ (C, t1) −→ (D, t1) −→ (D, E) .

This path alternates between the moves in Δ(S) and the moves in Δ(T) shown in Fig. 3.5.

Let’s look at this path in detail. The best response to s1 is t2, so the algorithm begins at (s1, t2). The best response to t2 is s3, so we replace probability assigned to s1 with probability assigned to s3 until we arrive at (A, t2). Here t1 becomes a best response, so it becomes possible to replace probability assigned to t2 with probability assigned to t1 until we arrive at (A, B). Now s2 is a best response, so we move in Δ(S) along the line of indifference between t1 and t2 to (C, B). At C the probability assigned to s3 is zero, so it is no longer necessary that s3 be a best response, and we can replace probability assigned to t2 with probability assigned to t1, to get to (C, t1). Now t2 need not be a best response, so we can move to (D, t1). Here t3 is a best response, so we can move probability from t1 to t3 until we get to (D, E). At this point s1 becomes a best response, so (D, E) should be a Nash equilibrium, which is indeed the case.

Fig. 3.5 The path of the Lemke-Howson algorithm

3.6 Implementation and Degeneracy Resolution

We have described the Lemke–Howson algorithm geometrically, in terms that a human can picture, but that is not quite the same thing as providing a description in terms of concrete, fully elaborated, algebraic operations. This section provides such a description. In addition, our discussion to this point has assumed a nondegenerate game. This assumption simplifies the theoretical analysis, but in computational practice one does not want to assume this, for several reasons. We will develop a version of the algorithm that works for any inputs.

Since our current perspective is numerical, we adopt notation suitable for linear algebra operations. Let S = {s1, . . . , sm} and T = {t1, . . . , tn}, and let A and B be the m × n matrices with entries aij := u(si, tj) and bij := v(si, tj). Treating mixed strategies as column vectors, we have

\[
u(\sigma, \tau) = \sigma^T A \tau \quad\text{and}\quad v(\sigma, \tau) = \sigma^T B \tau,
\]

so that (σ∗, τ∗) is a Nash equilibrium if σ^T Aτ∗ ≤ σ∗^T Aτ∗ for all σ ∈ Δ(S) and σ∗^T Bτ ≤ σ∗^T Bτ∗ for all τ ∈ Δ(T).
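The matrix form of the equilibrium conditions is easy to check mechanically. The following is a minimal sketch, not part of the original text: the function names are ours, and exact rational arithmetic is used so that the weak inequalities can be tested without round-off. Since expected payoffs are linear in a player's own mixed strategy, it suffices to compare the equilibrium payoff with the payoff of each pure strategy. The profile (D, E) below was worked out by hand from the matrices of the example in Sect. 3.5 (the text does not state its coordinates), so treat those numbers as an assumption that the check itself confirms.

```python
from fractions import Fraction as F

def expected_payoffs(M, sigma, tau):
    """Return (sigma^T M tau, the vector M tau, the vector M^T sigma)."""
    m, n = len(sigma), len(tau)
    M_tau = [sum(M[i][j] * tau[j] for j in range(n)) for i in range(m)]
    Mt_sigma = [sum(M[i][j] * sigma[i] for i in range(m)) for j in range(n)]
    value = sum(sigma[i] * M_tau[i] for i in range(m))
    return value, M_tau, Mt_sigma

def is_nash(A, B, sigma, tau):
    """True if no pure strategy of either player beats the profile's payoff."""
    u, A_tau, _ = expected_payoffs(A, sigma, tau)
    v, _, Bt_sigma = expected_payoffs(B, sigma, tau)
    return max(A_tau) <= u and max(Bt_sigma) <= v

# The 3 x 3 example of Sect. 3.5.
A = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
B = [[2, 3, 0], [2, 0, 3], [3, 1, 0]]

# Hand-computed coordinates of the endpoint (D, E) of the path (our assumption).
D = [F(1, 3), F(2, 3), F(0)]
E = [F(1, 2), F(0), F(1, 2)]

found = is_nash(A, B, D, E)                       # the endpoint of the path
start = is_nash(A, B, [1, 0, 0], [0, 1, 0])       # the starting point (s1, t2)
```

The starting point (s1, t2) fails the test because s3, not s1, is the best response to t2.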

The formulation of the Nash equilibrium problem we have been working with so far may be regarded as a matter of finding equilibrium expected payoffs u∗, v∗ ∈ R, equilibrium mixed strategies σ∗ ∈ R^m_+ and τ∗ ∈ R^n_+, and vectors of slack variables s∗ ∈ R^m_+ and t∗ ∈ R^n_+, such that:

\[
A\tau^* + s^* = u^* e_m, \quad B^T\sigma^* + t^* = v^* e_n, \quad \langle s^*, \sigma^* \rangle = 0 = \langle t^*, \tau^* \rangle, \quad \langle \sigma^*, e_m \rangle = 1 = \langle \tau^*, e_n \rangle
\]

where e_m := (1, . . . , 1) ∈ R^m and e_n := (1, . . . , 1) ∈ R^n. The set of Nash equilibria is unaffected if we add a constant to every entry in a column of A, or to every entry of a row of B. Therefore we may assume that all the entries of A and B are positive, and will do so henceforth. Now the equilibrium utilities u∗ and v∗ are necessarily positive, so we can divide in the system above, obtaining the system

\[
A\tau + s = e_m, \quad B^T\sigma + t = e_n, \quad \langle s, \sigma \rangle = 0 = \langle t, \tau \rangle, \quad s, \sigma \in \mathbb{R}^m_+, \quad t, \tau \in \mathbb{R}^n_+
\]

together with the formulas ⟨σ, e_m⟩ = 1/v∗ and ⟨τ, e_n⟩ = 1/u∗ for computing equilibrium expected payoffs.

This new system is not quite equivalent to the one we started with because that system in effect requires that σ and τ each have some positive components. We are now allowing σ = 0 and τ = 0, and in fact the new system has a solution that does not come from a Nash equilibrium, namely σ = 0, τ = 0, s = e_m, and t = e_n. It is called the extraneous solution. (To see that this is the only new solution consider that if σ = 0, then t = e_n, so that ⟨t, τ⟩ = 0 implies τ = 0, and similarly τ = 0 implies that σ = 0.)
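To make the change of variables concrete, here is a small sketch (our own illustration, with hypothetical function and variable names) that maps a Nash equilibrium of a game with positive payoffs to a solution of the rescaled system, dividing τ∗ by u∗ and σ∗ by v∗. The data are the example of Sect. 3.5 with 1 added to every payoff, and the equilibrium coordinates are hand-computed rather than taken from the text.

```python
from fractions import Fraction as F

def to_scaled_solution(A, B, sigma, tau):
    """Map an equilibrium (sigma, tau) of a positive-payoff game to
    (sigma', tau', s, t) with A tau' + s = e_m and B^T sigma' + t = e_n."""
    m, n = len(A), len(A[0])
    A_tau = [sum(A[i][j] * tau[j] for j in range(n)) for i in range(m)]
    Bt_sigma = [sum(B[i][j] * sigma[i] for i in range(m)) for j in range(n)]
    u = sum(sigma[i] * A_tau[i] for i in range(m))     # equilibrium payoff u*
    v = sum(tau[j] * Bt_sigma[j] for j in range(n))    # equilibrium payoff v*
    sigma_s = [p / v for p in sigma]                   # so <sigma', e_m> = 1/v*
    tau_s = [p / u for p in tau]                       # so <tau', e_n> = 1/u*
    s = [1 - a / u for a in A_tau]                     # slack of player 1
    t = [1 - b / v for b in Bt_sigma]                  # slack of player 2
    return sigma_s, tau_s, s, t

# Example of Sect. 3.5 with every payoff increased by 1 (same equilibria).
A1 = [[1, 1, 2], [2, 1, 1], [1, 2, 1]]
B1 = [[3, 4, 1], [3, 1, 4], [4, 2, 1]]
D = [F(1, 3), F(2, 3), F(0)]    # hand-computed equilibrium, our assumption
E = [F(1, 2), F(0), F(1, 2)]

sigma_s, tau_s, s, t = to_scaled_solution(A1, B1, D, E)
```

The slack vectors come out nonnegative and complementary to the rescaled strategies exactly when the input profile is an equilibrium, which gives an independent check on the hand computation.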

We now wish to see the geometry of the Lemke–Howson algorithm in the new coordinate system. Let

\[
\Delta(S)^* := \{\, \sigma \in \mathbb{R}^m_+ : B^T\sigma \le e_n \,\} \quad\text{and}\quad \Delta(T)^* := \{\, \tau \in \mathbb{R}^n_+ : A\tau \le e_m \,\}.
\]

There is a bijection σ ↦ σ/Σ_i σ_i between the points on the upper surface of Δ(S)∗, namely those for which some component of e_n − B^Tσ is zero, and the points of Δ(S), and similarly for Δ(T)∗ and Δ(T). For the game studied in the last section the polytopes Δ(S)∗ and Δ(T)∗ are shown in Fig. 3.6. Note that the best response regions in Fig. 3.5 have become facets.

In this framework nondegeneracy has a geometric consequence. In general, a d-dimensional polytope P is simple (cf. Exercise 2.6) if each vertex is contained in exactly d facets. If this is the case, then every r-dimensional face is contained in exactly d − r facets. (If an r-dimensional face is contained in more than d − r facets, then each of its vertices is contained in more than d facets.) We claim that if the game is nondegenerate, then Δ(S)∗ and Δ(T)∗ are simple. (It can happen that Δ(S)∗ is simple when the game is degenerate: for example, if two of the strategies in T give the same payoffs to the second player, then their corresponding facets of Δ(S)∗ are the same.) Consider a vertex v of Δ(S)∗. If v is the origin, then the m facets that contain it are the portions of Δ(S)∗ lying in (m − 1)-dimensional coordinate subspaces. Otherwise v is in the upper surface of Δ(S)∗, and the facets that contain it are those corresponding to the vanishing of the pure strategies that are not in the support of the corresponding element of Δ(S) and the pure strategies in T that are best responses. Nondegeneracy requires that the number of pure strategies in the support is equal to the number of best responses, so the sum of the number of pure strategies not in the support and the number of best responses is m.

Fig. 3.6 A second geometric presentation of the two person game

We now transport the Lemke–Howson algorithm to this framework. Let M∗ be the set of (σ, τ) ∈ Δ(S)∗ × Δ(T)∗ such that, when we set s := e_m − Aτ and t := e_n − B^Tσ, we have

(a) for each i = 2, . . . , m, either σ_i = 0 or s_i = 0;
(b) for each j = 1, . . . , n, either τ_j = 0 or t_j = 0.

For our running example we can follow a path in M∗ from (0, 0) to the image of the Nash equilibrium, as shown in Fig. 3.7. This path has a couple more edges than the one in Fig. 3.5, but there is the advantage of starting at (0, 0), which is a bit more canonical.

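Let ℓ := m + n, and let e := (1, . . . , 1) ∈ R^ℓ. If we set

\[
C := \begin{bmatrix} 0 & A \\ B^T & 0 \end{bmatrix}, \quad q := e, \quad y := (\sigma, \tau), \quad\text{and}\quad x := (s, t),
\]

the system above is a special case of

\[
Cy + x = q, \quad \langle x, y \rangle = 0, \quad x, y \ge 0 \in \mathbb{R}^\ell. \tag{3.1}
\]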

This is called the linear complementarity problem. It arises in a variety of other settings, and is very extensively studied. The framework of the linear complementarity problem is simpler conceptually and notationally, and it allows somewhat greater generality, so we will work with it for the remainder of this section.

Fig. 3.7 The path of the Lemke-Howson algorithm in the new geometric setting


Let P := { (x, y) ∈ R^ℓ_+ × R^ℓ_+ : Cy + x = q }. We will assume that all the components of q are positive, that all the entries of C are nonnegative, and that each column of C has at least one positive entry, so that P is bounded and thus a polytope. The condition that generalizes the nondegeneracy assumption on A and B is that P is simple. To see this let the projection of P onto the second copy of R^ℓ be Q := { y ∈ R^ℓ_+ : Cy ≤ q }. If the linear complementarity problem is derived from a game, then Q = Δ(S)∗ × Δ(T)∗. In general the faces of the cartesian product of two polytopes are the cartesian products of the two polytopes’ faces. From this observation it follows immediately that the cartesian product of two simple polytopes is simple, so if the linear complementarity problem comes from a nondegenerate game, then Q is simple, and Q is simple if and only if P is simple because the maps (x, y) ↦ y and y ↦ (q − Cy, y) are inverse linear bijections between the two sets.

Our problem is to find an (x, y) ∈ P such that (x, y) ≠ (q, 0) and the “complementary slackness condition” ⟨x, y⟩ = 0 is satisfied. The algorithm follows the path starting at (x, y) = (q, 0) in

M := { (x, y) ∈ P : x_2 y_2 + · · · + x_ℓ y_ℓ = 0 } .

The equation x_2 y_2 + · · · + x_ℓ y_ℓ = 0 encodes the condition that for each j = 2, . . . , ℓ, either x_j = 0 or y_j = 0. Suppose we are at a vertex (x, y) of P satisfying this condition, but not x_1 y_1 = 0. Since P is simple, exactly ℓ of the variables x_2, . . . , x_ℓ, y_2, . . . , y_ℓ vanish, so there is some i such that x_i = 0 = y_i. The portion of P where x_i ≥ 0 and the other ℓ − 1 variables vanish is an edge of P whose other endpoint is the first point where one of the variables that are positive at (x, y) vanishes. Again, since P is simple, precisely one of those variables vanishes there.

How should we describe moving from one vertex to the next algebraically? Consider specifically the move away from (q, 0). Observe that P is the graph of the function y ↦ q − Cy from Q to R^ℓ. We explicitly write out the system of equations describing this function:

\[
\begin{aligned}
x_1 &= q_1 - c_{11} y_1 - \cdots - c_{1\ell} y_\ell, \\
&\;\;\vdots \\
x_i &= q_i - c_{i1} y_1 - \cdots - c_{i\ell} y_\ell, \\
&\;\;\vdots \\
x_\ell &= q_\ell - c_{\ell 1} y_1 - \cdots - c_{\ell\ell} y_\ell.
\end{aligned}
\]

As we increase y_1, holding 0 = y_2 = · · · = y_ℓ, the constraint we bump into first is the one requiring x_i ≥ 0 for the i, among those with c_{i1} > 0, for which q_i/c_{i1} is minimal. If i = 1, then the point we arrived at is a solution and the algorithm halts, so we may suppose that i ≥ 2.

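We now want to describe P as the graph of a function with domain in the x_i, y_2, . . . , y_ℓ coordinate subspace, and x_1, . . . , x_{i−1}, y_1, x_{i+1}, . . . , x_ℓ as the variables parameterizing the range. To this end we rewrite the i-th equation as

\[
y_1 = \frac{1}{c_{i1}} q_i - \frac{1}{c_{i1}} x_i - \frac{c_{i2}}{c_{i1}} y_2 - \cdots - \frac{c_{i\ell}}{c_{i1}} y_\ell .
\]

Replacing the i-th equation above with this, and substituting it into the other equations, gives

\[
\begin{aligned}
x_1 &= \Bigl(q_1 - \frac{c_{11}}{c_{i1}} q_i\Bigr) - \Bigl(-\frac{c_{11}}{c_{i1}}\Bigr) x_i - \Bigl(c_{12} - \frac{c_{11} c_{i2}}{c_{i1}}\Bigr) y_2 - \cdots - \Bigl(c_{1\ell} - \frac{c_{11} c_{i\ell}}{c_{i1}}\Bigr) y_\ell, \\
&\;\;\vdots \\
y_1 &= \frac{1}{c_{i1}} q_i - \frac{1}{c_{i1}} x_i - \frac{c_{i2}}{c_{i1}} y_2 - \cdots - \frac{c_{i\ell}}{c_{i1}} y_\ell, \\
&\;\;\vdots \\
x_\ell &= \Bigl(q_\ell - \frac{c_{\ell 1}}{c_{i1}} q_i\Bigr) - \Bigl(-\frac{c_{\ell 1}}{c_{i1}}\Bigr) x_i - \Bigl(c_{\ell 2} - \frac{c_{\ell 1} c_{i2}}{c_{i1}}\Bigr) y_2 - \cdots - \Bigl(c_{\ell\ell} - \frac{c_{\ell 1} c_{i\ell}}{c_{i1}}\Bigr) y_\ell .
\end{aligned}
\]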

This is not exactly a thing of beauty, but it evidently has the same form as what we started with. The data of the algorithm consists of a tableau [q′, C′], a list describing how the rows and the last ℓ columns of the tableau correspond to the original variables of the problem, and the variable that vanished when we arrived at the corresponding vertex. If this variable is either x_1 or y_1 we are done. Otherwise the data is updated by letting the variable that is complementary to this one increase, finding the next variable that will vanish when we do so, then updating the list and the tableau appropriately. This process is called pivoting.
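The pivoting procedure can be sketched in a few lines of code. What follows is a minimal illustration under stated assumptions, not a reference implementation: it assumes the problem is nondegenerate (so the ratio test never ties) and that P is bounded, it stores the whole system as rows [x-coefficients | y-coefficients | right-hand side] rather than the more compact tableau [q′, C′] described above, and it uses exact rational arithmetic. The function names and the indexing convention (variable k and variable k + ℓ are complementary) are ours.

```python
from fractions import Fraction

def lemke_howson_lcp(C, q):
    """Complementary pivoting for C y + x = q, <x, y> = 0, x, y >= 0.
    Variables 0..l-1 are x_1..x_l and l..2l-1 are y_1..y_l."""
    l = len(q)
    T = [[Fraction(int(i == j)) for j in range(l)]
         + [Fraction(c) for c in C[i]] + [Fraction(q[i])] for i in range(l)]
    basis = list(range(l))            # start at (x, y) = (q, 0): x is basic
    entering = l                      # first increase y_1
    while True:
        rows = [r for r in range(l) if T[r][entering] > 0]
        r = min(rows, key=lambda k: T[k][-1] / T[k][entering])  # ratio test
        pivot = T[r][entering]
        T[r] = [v / pivot for v in T[r]]
        for k in range(l):
            if k != r and T[k][entering] != 0:
                f = T[k][entering]
                T[k] = [a - f * b for a, b in zip(T[k], T[r])]
        leaving, basis[r] = basis[r], entering
        if leaving % l == 0:          # x_1 or y_1 vanished: we are done
            break
        entering = (leaving + l) % (2 * l)   # complement of the leaving variable
    z = [Fraction(0)] * (2 * l)
    for r, k in enumerate(basis):
        z[k] = T[r][-1]
    return z[:l], z[l:]

def lemke_howson(A, B):
    """Build C = [[0, A], [B^T, 0]], q = e, solve, and normalize (sigma, tau)."""
    m, n = len(A), len(A[0])
    C = [[0] * m + list(A[i]) for i in range(m)] + \
        [[B[i][j] for i in range(m)] + [0] * n for j in range(n)]
    x, y = lemke_howson_lcp(C, [1] * (m + n))
    sigma, tau = y[:m], y[m:]
    return [p / sum(sigma) for p in sigma], [p / sum(tau) for p in tau]

# Example of Sect. 3.5 with 1 added to every payoff so that all entries are positive.
A1 = [[1, 1, 2], [2, 1, 1], [1, 2, 1]]
B1 = [[3, 4, 1], [3, 1, 4], [4, 2, 1]]
sigma, tau = lemke_howson(A1, B1)
```

On this input the sketch performs eight pivots and returns σ = (1/3, 2/3, 0) and τ = (1/2, 0, 1/2); the order in which variables leave the basis mirrors the best response changes along the path described in Sect. 3.5.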

We can now describe how the algorithm works in the degenerate case when P is not necessarily simple. From a conceptual point of view, our method of handling degenerate problems is to deform them slightly, so that they become nondegenerate, but in the end we will have only a combinatoric rule for choosing the next pivot variable. Let L := { (x, y) ∈ R^ℓ × R^ℓ : Cy + x = q }, let α_1, . . . , α_ℓ, β_1, . . . , β_ℓ be distinct positive integers, and for ε > 0 let

\[
P_\varepsilon := \{\, (x, y) \in L : x_i \ge -\varepsilon^{\alpha_i} \text{ and } y_i \ge -\varepsilon^{\beta_i} \text{ for all } i = 1, \ldots, \ell \,\}.
\]

If (x, y) is a vertex of P_ε, then there are ℓ variables, which we will describe as “free variables,” whose corresponding equations x_i = −ε^{α_i} and y_i = −ε^{β_i} determine (x, y) as the unique member of L satisfying them. At the point in L where these equations are satisfied, the other variables can be written as linear combinations of the free variables, and thus as polynomial functions of ε. Because the α_i and β_i are all different, there are only finitely many values of ε such that any of the other variables vanish at this vertex. Because there are finitely many ℓ-element subsets of the 2ℓ variables, it follows that P_ε is simple for all but finitely many values of ε.

In particular, for all ε in some interval (0, ε̄) the combinatoric structure of P_ε will be independent of ε. In addition, we do not actually need to work in P_ε because the pivoting procedure, applied to the polytope P_ε for such ε, will follow a well defined path that can be described in terms of a combinatoric procedure for choosing the next pivot variable.


To see what we mean by this consider the problem of finding which x_i first goes below −ε^{α_i} as we go out the line y_1 ≥ −ε^{β_1}, y_2 = −ε^{β_2}, . . . , y_ℓ = −ε^{β_ℓ}. This is basically a process of elimination. If c_{i1} ≤ 0, then increasing y_1 never leads to a violation of the i-th constraint, so we can begin by eliminating all those i for which c_{i1} is not positive. Among the remaining i, the problem is to find the i for which

\[
\frac{1}{c_{i1}} q_i + \frac{1}{c_{i1}} \varepsilon^{\alpha_i} + \frac{c_{i2}}{c_{i1}} \varepsilon^{\beta_2} + \cdots + \frac{c_{i\ell}}{c_{i1}} \varepsilon^{\beta_\ell}
\]

is smallest for small ε > 0. The next step is to eliminate all i for which q_i/c_{i1} is not minimal. For each i that remains the expression

\[
\frac{1}{c_{i1}} \varepsilon^{\alpha_i} + \frac{c_{i2}}{c_{i1}} \varepsilon^{\beta_2} + \cdots + \frac{c_{i\ell}}{c_{i1}} \varepsilon^{\beta_\ell}
\]

has a dominant term, namely the term, among those with nonzero coefficients, whose exponent is smallest. The dominant terms are ordered according to their values for small ε > 0:

(a) terms with positive coefficients are greater than terms with negative coefficients;
(b) among terms with positive coefficients, those with smaller exponents are greater than terms with larger exponents, and if two terms have equal exponents they are ordered according to the coefficients;
(c) among terms with negative coefficients, those with larger exponents are greater than terms with smaller exponents, and if two terms have equal exponents they are ordered according to the coefficients.

We now eliminate all i for which the dominant term is not minimal. All remaining i have the same dominant term, and we continue by subtracting off this term and comparing the resulting expressions in a similar manner, repeating until only one i remains. This process does necessarily continue until only one i remains, because if other terms of the expressions above fail to distinguish between two possibilities, eventually there will be a comparison involving the terms ε^{α_i}/c_{i1}, and the exponents α_1, . . . , α_ℓ, β_1, . . . , β_ℓ are distinct.
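The comparison rule can be mirrored in code by treating each candidate expression as a polynomial in ε. The sketch below is our own illustration with hypothetical names; instead of the term-by-term elimination described above it uses an equivalent shortcut: one polynomial is smaller than another for all sufficiently small ε > 0 exactly when the lowest-degree nonzero coefficient of their difference is negative. It covers only the ratio test for the first pivot (increasing y_1 from the starting vertex), with integer data assumed so that exact fractions can be used.

```python
from fractions import Fraction

def smaller_for_small_eps(p, q):
    """True if sum(c * eps**k) over p is below that over q for all small eps > 0.
    Polynomials are dicts mapping exponents to coefficients."""
    diff = dict(p)
    for k, c in q.items():
        diff[k] = diff.get(k, 0) - c
    for k in sorted(diff):            # lowest-degree nonzero term decides the sign
        if diff[k] != 0:
            return diff[k] < 0
    return False

def perturbed_ratio_test(C, q, alpha, beta):
    """Index i minimizing (q_i + eps^alpha_i + sum_{j>=2} c_ij eps^beta_j) / c_i1
    among rows with c_i1 > 0.  alpha and beta hold distinct positive integers."""
    best = None
    for i, row in enumerate(C):
        if row[0] <= 0:
            continue                  # increasing y_1 never violates constraint i
        poly = {0: Fraction(q[i], row[0]), alpha[i]: Fraction(1, row[0])}
        for j in range(1, len(row)):
            poly[beta[j]] = Fraction(row[j], row[0])
        if best is None or smaller_for_small_eps(poly, best[1]):
            best = (i, poly)
    return best[0]

# A degenerate instance: both rows tie in the unperturbed ratio q_i / c_i1 = 1.
tie_break = perturbed_ratio_test([[1, 2], [1, 1]], [1, 1], alpha=[1, 2], beta=[3, 4])
# A nondegenerate instance: the ordinary ratio test already decides.
ordinary = perturbed_ratio_test([[2, 0], [1, 0]], [1, 1], alpha=[1, 2], beta=[3, 4])
```

In the first instance the two expressions are 1 + ε + 2ε⁴ and 1 + ε² + ε⁴, so the second row (index 1) is selected; because the exponents are distinct, a tie between two different rows cannot survive the comparison.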

Let’s review the situation. We have given an algorithm that finds a solution of the linear complementarity problem (3.1) that is different from (q, 0). The assumptions that ensure that the algorithm works are that q ≥ 0 and that P is a polytope. In particular, these assumptions are satisfied when the linear complementarity problem is derived from a two person game with positive payoffs, in which case any solution other than (q, 0) corresponds to a Nash equilibrium.

There are additional issues that arise in connection with implementing the algorithm, since computers cannot do exact arithmetic on arbitrary real numbers. One possibility is to require that the entries of q and C lie in a set of numbers for which exact arithmetic is possible (usually the rationals, but there are other possibilities, at least theoretically). Alternatively, one may work with floating point numbers, which is more practical, but also more demanding because there are issues associated with round-off error, and in particular its accumulation as the number of pivots increases. The sort of pivoting we have studied here also underlies the simplex algorithm for linear programming, and the same sorts of ideas are applied to resolve degeneracy. Numerical analysis for linear programming has a huge amount of theory, much of which is applicable to the Lemke–Howson algorithm, but it is far beyond our scope.

3.7 Using Games to Find Fixed Points

This section explains the proof of Kakutani’s fixed point theorem in McLennan and Tourky (2005), which passes directly from the existence of equilibrium in two person games to full generality, and the resulting algorithm for computing an approximate fixed point of a continuous function. The key idea has a simple description. Fix a nonempty compact convex X ⊂ R^d, and let F : X → X be a (not necessarily convex valued or upper hemicontinuous) correspondence with compact values. We can define a two person game with strategy sets S = T = X by setting

\[
u(s, t) := -\min_{x \in F(t)} \|s - x\|^2 \quad\text{and}\quad v(s, t) := \begin{cases} 0, & s \ne t, \\ 1, & s = t. \end{cases}
\]

If (s, t) is a Nash equilibrium, then s ∈ F(t) and t = s, so s = t is a fixed point. Conversely, if x is a fixed point, then (x, x) is a Nash equilibrium.

Of course this observation does not prove anything, but it does point in a useful direction. Let x_1, . . . , x_n, y_1, . . . , y_n ∈ X be given. We can define a finite two person game with pure strategy sets S := {x_1, . . . , x_n} and T := {y_1, . . . , y_n} and n × n payoff matrices A = (a_ij) and B = (b_ij) by setting

\[
a_{ij} := -\|x_i - y_j\|^2 \quad\text{and}\quad b_{ij} := \begin{cases} 0, & i \ne j, \\ 1, & i = j. \end{cases}
\]

Let (σ, τ) ∈ Δ(S) × Δ(T) be a mixed strategy profile. Clearly τ is a best response to σ if and only if it assigns all probability to the y_i such that x_i is assigned maximum probability by σ, which is to say that τ(y_j) > 0 implies that σ(x_j) ≥ σ(x_i) for all i.

Understanding when σ is a best response to τ requires a brief calculation. Let z := Σ_{j=1}^n τ(y_j) y_j. The expected payoff of the first player when she chooses x_i is

\[
\begin{aligned}
\sum_j a_{ij}\tau(y_j) &= -\sum_j \tau(y_j)\|x_i - y_j\|^2 = -\sum_j \tau(y_j)\langle x_i - y_j, x_i - y_j \rangle \\
&= -\sum_j \tau(y_j)\langle x_i, x_i \rangle + 2\sum_j \tau(y_j)\langle x_i, y_j \rangle - \sum_j \tau(y_j)\langle y_j, y_j \rangle \\
&= -\langle x_i, x_i \rangle + 2\langle x_i, z \rangle - \langle z, z \rangle + C = -\|x_i - z\|^2 + C
\end{aligned}
\]

where C = ‖z‖² − Σ_{j=1}^n τ(y_j)‖y_j‖² is a quantity that does not depend on i. Therefore σ is a best response to τ if and only if it assigns all probability to those i with x_i as close to z as possible. If y_1 ∈ F(x_1), . . . , y_n ∈ F(x_n), then there is a sense in which a Nash equilibrium may be regarded as an approximate fixed point.
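The identity just derived is easy to confirm numerically. The following sketch is only an illustration (the points, the mixed strategy, and the function names are arbitrary choices of ours): it computes the first player's expected payoffs directly from the definition of a_ij, computes the squared distances from each x_i to z, and exposes the constant C so that the relation payoff = −‖x_i − z‖² + C can be checked exactly with rational arithmetic.

```python
from fractions import Fraction as F

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b))

def payoffs_and_distances(xs, ys, tau):
    """Expected payoffs sum_j a_ij tau_j, squared distances ||x_i - z||^2,
    and the constant C = ||z||^2 - sum_j tau_j ||y_j||^2."""
    d = len(ys[0])
    origin = [0] * d
    z = [sum(t * y[k] for t, y in zip(tau, ys)) for k in range(d)]
    payoffs = [-sum(t * sq_dist(x, y) for t, y in zip(tau, ys)) for x in xs]
    dists = [sq_dist(x, z) for x in xs]
    const = sq_dist(z, origin) - sum(t * sq_dist(y, origin) for t, y in zip(tau, ys))
    return payoffs, dists, const

# Arbitrary illustrative data in the plane.
xs = [(0, 0), (1, 0), (0, 2)]
ys = [(1, 1), (3, 0), (0, 1)]
tau = [F(1, 2), F(1, 4), F(1, 4)]
payoffs, dists, const = payoffs_and_distances(xs, ys, tau)

best_reply = payoffs.index(max(payoffs))   # pure best response of player 1
nearest = dists.index(min(dists))          # index of the x_i closest to z
```

Here z = (5/4, 3/4), and the pure best response coincides with the point nearest to z, as the calculation predicts.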

We are going to make this precise, thereby proving Kakutani’s fixed point theorem. Assume now that F is upper hemicontinuous with convex values. Define sequences x_1, x_2, . . . and y_1, y_2, . . . inductively as follows. Choose x_1 arbitrarily, and let y_1 be an element of F(x_1). Supposing that x_1, . . . , x_n and y_1, . . . , y_n have already been determined, let (σ^n, τ^n) be a Nash equilibrium of the two person game with payoff matrices A^n = (a^n_ij) and B^n = (b^n_ij) where a^n_ij := −‖x_i − y_j‖² and b^n_ij is 1 if i = j and 0 otherwise. Let x_{n+1} := Σ_j τ^n_j y_j, and choose y_{n+1} ∈ F(x_{n+1}).

Let x∗ be an accumulation point of the sequence {x_n}. To show that x∗ is a fixed point of F it suffices to show that it is an element of the closure of any convex neighborhood V of F(x∗). Choose δ > 0 such that F(x) ⊂ V for all x ∈ U_δ(x∗). Consider an n such that x_{n+1} = Σ_j τ^n_j y_j ∈ U_{δ/3}(x∗) and at least one of x_1, . . . , x_n is also in this ball. Then the points in x_1, . . . , x_n that are closest to x_{n+1} are in U_{2δ/3}(x_{n+1}) ⊂ U_δ(x∗). Since τ^n_j > 0 only if x_j is one of these closest points, every such y_j lies in F(x_j) ⊂ V, so x_{n+1} is a convex combination of points in V, and is therefore in V. Therefore x∗ is in the closure of the set of x_n that lie in V, and thus in the closure of V.

In addition to proving the Kakutani fixed point theorem, we have accumulated all the components of an algorithm for computing approximate fixed points of a continuous function f : X → X. Specifically, for any error tolerance ε > 0 we compute the sequences x_1, x_2, . . . and y_1, y_2, . . . with f in place of F, halting when ‖x_{n+1} − f(x_{n+1})‖ < ε. The argument above shows that this is, in fact, an algorithm, in the sense that it is guaranteed to halt eventually. This algorithm is quite new. Code implementing it exists, and the initial impression is that it performs quite well. But it has not been extensively tested.
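The outer loop of this procedure can be sketched as follows. This is a hedged illustration rather than the code referred to above: the equilibrium solver is passed in as a parameter `solve_nash` (a hypothetical interface returning a pair of mixed strategies), and the solver used in the example is a toy stand-in that only looks for pure equilibria on the diagonal. A serious implementation would plug in the Lemke–Howson algorithm with a degeneracy rule. The loop also tests the stopping condition at x_1, a minor variation on the description above.

```python
import math

def imitation_game(xs, ys):
    """Payoffs a_ij = -||x_i - y_j||^2 and b_ij = 1 if i == j else 0."""
    n = len(xs)
    A = [[-sum((p - q) ** 2 for p, q in zip(xs[i], ys[j])) for j in range(n)]
         for i in range(n)]
    B = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    return A, B

def approximate_fixed_point(f, x1, eps, solve_nash, max_iter=1000):
    """Generate x_1, x_2, ... and y_n = f(x_n) until ||x_n - f(x_n)|| < eps."""
    xs, ys = [tuple(x1)], [tuple(f(x1))]
    for _ in range(max_iter):
        if math.dist(xs[-1], ys[-1]) < eps:
            return xs[-1]
        A, B = imitation_game(xs, ys)
        _, tau = solve_nash(A, B)
        # x_{n+1} is the tau-weighted average of y_1, ..., y_n.
        z = tuple(sum(t * y[k] for t, y in zip(tau, ys)) for k in range(len(x1)))
        xs.append(z)
        ys.append(tuple(f(z)))
    raise RuntimeError("no approximate fixed point found within max_iter steps")

def pure_diagonal_equilibrium(A, B):
    """Toy solver (an assumption for this demo): return a pure equilibrium
    (x_i, y_i), which exists iff x_i is a closest point to y_i for some i."""
    n = len(A)
    for i in range(n):
        if A[i][i] == max(A[k][i] for k in range(n)):
            sigma = [1.0 if k == i else 0.0 for k in range(n)]
            return sigma, list(sigma)
    raise ValueError("no pure equilibrium; use a general solver instead")

halve = lambda x: (x[0] / 2,)          # a contraction of [0, 1] with fixed point 0
x_approx = approximate_fixed_point(halve, (1.0,), 1e-3, pure_diagonal_equilibrium)
```

For this contraction every game has a pure equilibrium at the most recent pair, so the procedure reduces to iterating f; in general the equilibria are mixed and x_{n+1} is a genuine average of earlier values of f.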

There is one more idea that may have some algorithmic interest. As before, we consider points x_1, . . . , x_n, y_1, . . . , y_n ∈ R^d. For z ∈ R^d let

\[
J(z) := \operatorname*{argmin}_i \|z - x_i\| .
\]

(Recall that the Voronoi diagram determined by x_1, . . . , x_n is the polyhedral decomposition of R^d whose nonempty polyhedra are the sets P_J := { z ∈ R^d : J ⊂ J(z) }.) Define a correspondence Φ : R^d → R^d by letting Φ(z) be the convex hull of { y_j : j ∈ J(z) }. Clearly Φ is upper hemicontinuous and convex valued.

Suppose that z is a fixed point of Φ. Then z = Σ_j τ(y_j) y_j for some τ ∈ Δ(T) with τ(y_j) = 0 for all j ∉ J(z). If σ(x_i) = 1/|J(z)| when i ∈ J(z) and σ(x_i) = 0 when i ∉ J(z), then (σ, τ) is a Nash equilibrium of the game. Conversely, if (σ, τ) is a Nash equilibrium of this game, then Σ_j τ(y_j) y_j is a fixed point of Φ. In a sense, the algorithm described above approximates the given correspondence F with a correspondence of a particularly simple type.

We may project the path of the Lemke–Howson algorithm, in its application to the game derived from x_1, . . . , x_n, y_1, . . . , y_n, into this setting. Define Φ_1 : R^d → R^d by letting Φ_1(z) be the convex hull of { y_j : j ∈ {1} ∪ J(z) }. Suppose that Γ_{x_1} is the set of pairs (σ, τ) satisfying all the conditions of Nash equilibrium except that it may be the case that σ(x_1) > 0 even if x_1 is not optimal. (This is the set that contains the path of the Lemke–Howson algorithm when x_1 is the distinguished pure strategy.) Suppose that (σ, τ) ∈ Γ_{x_1}. Let J := { j : τ(y_j) > 0 }, and let z := Σ_j τ(y_j) y_j. Then J ⊂ { i : σ(x_i) > 0 } ⊂ {1} ∪ J(z), so z ∈ Φ_1(z). Conversely, suppose z is a fixed point of Φ_1. Then z = Σ_j τ(y_j) y_j for some τ ∈ Δ^{n−1} with τ(y_j) = 0 for all j ∉ {1} ∪ J(z). If we let σ be the element of Δ^{n−1} such that σ(x_i) = 1/|{1} ∪ J| if i ∈ {1} ∪ J and σ(x_i) = 0 if i ∉ {1} ∪ J, then (σ, τ) ∈ Γ_{x_1}. This setup gives a picture of what the Lemke–Howson algorithm is doing that has interesting implications. For example, if there is no point in R^d that is equidistant from more than d + 1 points, as will be the case when the n-tuple x_1, . . . , x_n is “generic,” then there is no point (σ, τ) ∈ Γ_{x_1} with σ(x_i) > 0 for more than d + 2 indices.

3.8 Homotopy

Let X ⊂ R^d be nonempty, compact, and convex, let f : X → X be a continuous function, and let x_0 be an element of X. We let h : X × [0, 1] → X be the homotopy

h(x, t) := (1 − t)x_0 + t f(x) .

Here we think of the variable t as time, and let h_t = h(·, t) : X → X be the function “at time t.” In this way we imagine deforming the constant function with value x_0 at time zero into the function f at time one. (There are of course many additional possibilities, including different choices of h_0.)

Let g : X × [0, 1] → R^d be the function g(x, t) := h(x, t) − x. The idea of the homotopy method is to follow a path in Z := g^{−1}(0) starting at (x_0, 0) until we reach a point of the form (x∗, 1). There is a mathematical guarantee that such a path is well defined if f is C^1, so that h and g are C^1, the derivative of g has full rank at every point of Z, and the derivative of the map x ↦ f(x) − x has full rank at each of the fixed points of f. As we will see later in the book, there is a sense in which this is “typically” the case when f is C^1, so that these assumptions are in some sense mild. With these assumptions Z will be a union of finitely many curves. Some of these curves will be loops, while others will have two endpoints in X × {0, 1}. In particular, the other endpoint of the curve beginning at (x_0, 0) cannot be in X × {0}, because there is only one point in Z ∩ (X × {0}), so it must be (x∗, 1) for some fixed point x∗ of f.

We now have to tell the computer how to follow this path. The standard computational implementation of curve following is called the predictor-corrector method. Suppose we are at a point z_0 = (x, t) ∈ Z. We first need to compute a vector v that is tangent to Z at z_0. Algebraically this amounts to finding a nonzero linear combination of the columns of the matrix of Dg(z_0) that vanishes. For this it suffices to express one of the columns as a linear combination of the others, and, roughly speaking, the Gram–Schmidt (Sect. 12.1) process can be used to do this. We can divide any vector we obtain this way by its norm, so that v becomes a unit vector. There is a parameter of the procedure called the step size that is a number Δ > 0, and the “predictor” part of the process is completed by passing to the point z′_1 = z_0 + Δv.

The “corrector” part of the process uses the Newton method to pass from z′_1 to a new point z_1 in Z, or at least very close to it. In general the Newton method for finding a zero of a C^1 function j : U → R^n, where U ⊂ R^n is open, beginning at an initial point y_0, is the iteration of the computation y_{t+1} := y_t − Dj(y_t)^{−1} j(y_t). In general there is no guarantee that this converges to anything, but if j(y∗) = 0 and Dj(y∗) is nonsingular, then there is a neighborhood of y∗ such that the process converges very rapidly if y_0 is in this neighborhood, roughly doubling the number of significant digits with each iteration. (E.g., Galántai 2000.) For all of the methods for finding approximate fixed points studied here, a final step in which the approximation is improved using the Newton method is a useful piece of software engineering in computational practice.

For the corrector step the Newton method searches for a zero z_1 of g in the hyperplane that contains z′_1 and is orthogonal to v. The net effect of the predictor followed by the corrector is to move us from one point on Z to another a bit further down. By repeating this one can go from one end of the curve to the other.
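To fix ideas, here is a minimal sketch of the predictor-corrector loop in the simplest possible setting, d = 1, so that Z is a curve in the plane and the tangent can be written down directly instead of being computed by Gram–Schmidt. Everything about it is an illustrative assumption: the function name, the fixed step size, the fixed number of Newton iterations, the requirement that the caller supply the derivative `df`, and the absence of any safeguard against jumping between components of Z.

```python
import math

def follow_homotopy(f, df, x0, step=0.05, newton_iters=5):
    """Follow the zero set of g(x, t) = (1 - t)*x0 + t*f(x) - x from (x0, 0)
    until t >= 1, for a scalar map f with derivative df, then polish at t = 1."""
    def g(x, t):
        return (1 - t) * x0 + t * f(x) - x

    def dg(x, t):                       # partial derivatives (dg/dx, dg/dt)
        return t * df(x) - 1.0, f(x) - x0

    x, t = x0, 0.0
    prev = (0.0, 1.0)                   # start by moving toward larger t
    while t < 1.0:
        gx, gt = dg(x, t)
        norm = math.hypot(gx, gt)
        v = (-gt / norm, gx / norm)     # unit vector with Dg(z0) v = 0
        if v[0] * prev[0] + v[1] * prev[1] < 0:
            v = (-v[0], -v[1])          # keep the direction of travel
        xp, tp = x + step * v[0], t + step * v[1]       # predictor
        nx, nt = -v[1], v[0]            # unit normal to v
        s = 0.0
        for _ in range(newton_iters):   # corrector: Newton on the line through
            gx, gt = dg(xp + s * nx, tp + s * nt)       # z'_1 orthogonal to v
            s -= g(xp + s * nx, tp + s * nt) / (gx * nx + gt * nt)
        x, t, prev = xp + s * nx, tp + s * nt, v
    for _ in range(newton_iters):       # final Newton polish of f(x) - x = 0
        x -= (f(x) - x) / (df(x) - 1.0)
    return x

# cos maps [0, 1] into itself; its fixed point is approximately 0.739085.
x_star = follow_homotopy(math.cos, lambda x: -math.sin(x), x0=0.5)
```

The last loop is the final Newton improvement mentioned above; it also absorbs the small overshoot past t = 1 that a fixed step size produces.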

Probably the reader has sensed that the description above is a high level overview that glides past many issues. In fact it is difficult to regard the homotopy method as an actual algorithm, in the sense of having precisely defined inputs and being guaranteed to eventually halt at an output of the promised sort. One issue is that the procedure might accidentally hop from one component of Z to another, particularly if Δ is large. There are various things that might be done about this, for instance trying to detect a likely failure and starting over with a smaller Δ, but these issues, and the details of round off error that are common to all numerical software, are really in the realm of engineering rather than computational theory. Chapter 16 of García and Zangwill (1981) discusses many of these issues.

Relaxing the promise of certain success that comes from the definition of an algorithm has various advantages. Instead of computing the relevant derivatives from closed form expressions for them, one may instead approximate them by computing how the function varies as one takes steps in a spanning set of directions. In this way the software may be applied to inputs that are not guaranteed to be C^1. There is no mathematical guarantee that the computation will succeed, but for certain types of problems it can have a negligible, or at least tolerable, failure rate. As a practical matter, the homotopy method is highly successful, and is used to solve systems of equations from a wide variety of application domains.


3.9 Remarks on Computation

We have now seen several algorithms for computing approximate fixed points. How good are these, practically and theoretically? The Scarf algorithm did not live up to the hopes it raised when it was first developed, and is not used in practical computation. Kuhn and MacKinnon (1975) suggest that the running times of their restart version of the Scarf algorithm should be expected to increase with the cube or the fourth power of the dimension. The general similarities of restart versions of the primitive set method suggest a similar rate of increase should be expected. Code exists for the McLennan-Tourky algorithm, and some preliminary computations suggest that it works quite well for high dimensional problems, but it has not been systematically tested. The rate at which the computational burden of homotopy methods increases with dimension does not seem to have been studied theoretically, as one would expect given the rather vague description of these computational procedures. Since the corrector step inverts a matrix of the same dimension as the problem, and the predictor step has a similar computation, a burden that increases with the cube or fourth power of the dimension would not be surprising.

More generally, what can we reasonably hope for from an algorithm that computes points that are approximately fixed, and what sort of theoretical concepts can we bring to bear on these issues? These questions have been the focus of important recent advances in theoretical computer science, and in this section we give a brief description of these developments. The discussion presumes little in the way of prior background in computer science, and is quite superficial; a full exposition of this material is far beyond our scope. Interested readers can learn much more from the cited references, and from textbooks such as Papadimitriou (1994a), Arora and Boaz (2007).

Theoretical analyses of algorithms must begin with a formal model of computation. The standard model is the Turing machine, which consists of a processor with finitely many states connected by an input-output device to an unbounded one dimensional storage medium that records data in cells, on each of which one can write an element of a finite alphabet that includes a distinguished character ‘blank.’ At the beginning of the computation the processor is in a particular state, the storage medium has finitely many cells that are not blank, and the input-output device is positioned at a particular cell in storage. In each step of the computation the character at the input-output device’s location is read. The Turing machine is essentially defined by functions that take state-datum pairs as their arguments and compute:

1. the next state of the processor,
2. a bit that will be written at the current location of the input-output device (overwriting the bit that was just read) and
3. a motion (forward, back, stay put) of the input-output device.

The computation ends when it reaches a particular state of the machine called “Halt.” Once that happens, the data in the storage device is regarded as the output of the computation.
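To make this description concrete, here is a minimal Python sketch of such a machine. The dictionary encoding of the transition functions, the names run_turing_machine and FLIP, the symbol "_" for the blank character, and the step budget are all assumptions made for illustration; none of them is part of the formal definition.

```python
from collections import defaultdict

BLANK = "_"  # assumed symbol for the distinguished blank character


def run_turing_machine(delta, tape, state="start", head=0, max_steps=10_000):
    """Run a one-tape machine until it reaches the state "Halt".

    delta maps (state, symbol) to (next_state, symbol_to_write, move),
    where move is -1 (back), 0 (stay put) or +1 (forward).
    """
    cells = defaultdict(lambda: BLANK, enumerate(tape))
    steps = 0
    while state != "Halt":
        if steps >= max_steps:
            raise RuntimeError("step budget exhausted before reaching Halt")
        state, symbol, move = delta[(state, cells[head])]
        cells[head] = symbol   # overwrite the symbol that was just read
        head += move
        steps += 1
    used = [i for i, s in cells.items() if s != BLANK]
    if not used:
        return ""
    return "".join(cells[i] for i in range(min(used), max(used) + 1))


# A one-state machine that complements each bit and halts at the first blank.
FLIP = {
    ("start", "0"): ("start", "1", +1),
    ("start", "1"): ("start", "0", +1),
    ("start", BLANK): ("Halt", BLANK, 0),
}

result = run_turing_machine(FLIP, "0110")
```

The step budget is only a practical safeguard for experimentation: as discussed in the text, there is no general procedure for deciding whether a given machine halts.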


As you might imagine, an analysis based on a concrete and detailed description of the operation of a Turing machine can be quite tedious. Fortunately, it is rarely necessary. Historically, other models of computation were proposed, but were subsequently found to be equivalent to the Turing model, and the Church–Turing thesis is the hypothesis that all “reasonable” models of computation are equivalent, in the sense that they all yield the same notion of what it means for something to be “computable.” This is a metamathematical assertion: it can never be proved, and a refutation would not be logical, but would instead be primarily a social phenomenon, consisting of researchers shifting their focus to some inequivalent model.

Once we have the notion of a Turing machine, we can define an algorithm to be a Turing machine that eventually halts, for any input state of the storage device. A subtle distinction is possible here: a Turing machine that always halts is not necessarily the same thing as a Turing machine that can be proved to halt, regardless of the input. In fact one of the most important early theorems of computer science is that there is no algorithm that has, as input, a description of a Turing machine and a particular input, and decides whether the Turing machine with that input will eventually halt. As a practical matter, one almost always works with algorithms that can easily be proved to be such, in the sense that it is obvious that they eventually halt.

A computational problem is a rule that associates a nonempty set of outputs with each input, where the set of possible inputs and outputs is the set of pairs consisting of a position of the input-output device and a state of the storage medium in which there are finitely many nonblank cells. (Almost always the inputs of interest are formatted in some way, and this definition implicitly makes checking the validity of the input part of the problem.)

There are many kinds of computational problems, e.g., sorting, function evaluation, optimization, etc. For us the most important types are decision problems, which require a yes or no answer to a well posed question, and search problems, which require an instance of some sort of object or a verification that no such object exists. An important example of a decision problem is Clique: given a simple undirected graph G and an integer k, determine whether G has a clique with k nodes, where a clique is a collection of vertices such that G has an edge between any two of them. An example of a search problem is to actually find such a clique or to certify that no such clique exists.

A computational problem is computable if there is an algorithm that passes from each input to one of the acceptable outputs. The distinction between computational problems that are computable and those that are not is fundamental, with many interesting and important aspects, but aside from the halting problem mentioned above, in our discussion here we will focus exclusively on problems that are known to be computable.

For us the most important distinction is between those computable computational problems that are “easy” and those that are “hard,” where the definitions of these terms remain to be specified. In order to be theoretically useful, the easiness/hardness distinction should not depend on the architecture of a particular machine or the technology of a particular era. In addition, it should be robust, at least in the sense that a composition of two easy computational problems, where the output of the first is the input of the second, should also be easy, and possibly in other senses as well. For these reasons, looking at the running time of an algorithm on a particular input is not very useful. Instead, it is more informative to think about how the resources (time and memory) consumed by a computation increase as the size of the input grows. In theoretical computer science, the most useful distinction is between algorithms whose worst case running time is bounded by a polynomial function of the size of the input, and algorithms that do not have this property. The class of decision problems that have polynomial time algorithms is denoted by P. If the set of possible inputs of a computational problem is finite, then the problem is trivially in P, and in fact we will only consider computational problems with infinite sets of inputs.

There is a particularly important class of decision problems called NP, which stands for “nondeterministic polynomial time.” Originally NP was thought of as the class of decision problems for which a Turing machine that chooses its next state randomly has a positive probability of showing that the answer is “Yes” in polynomial time when this is the case. For example, if a graph has a k-clique, an algorithm that simply guesses which elements constitute the clique has a positive probability of stumbling onto some k-clique. The more modern way of thinking about NP is that it is the class of decision problems for which a “Yes” answer has a certificate or witness that can be verified in polynomial time. In the case of Clique an actual k-clique is such a witness. Factorization of integers is another algorithmic issue which easily generates decision problems (for example, does a given number have a prime factor whose first digit is 3?) that are in NP because a prime factorization is a witness for them. (One of the historic recent advances in mathematics is the discovery of a polynomial time algorithm for testing whether a number is prime. Thus it is possible to verify the primality of the elements of a factorization in polynomial time.)
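The verification of a Clique witness can be sketched in a few lines of Python. The function name is_clique and the representation of the graph as a list of vertex pairs are assumptions made for illustration. The point is only that checking a proposed witness requires looking at each pair of its elements once, which is polynomial in the size of the input, whereas nothing here says how to find the witness.

```python
from itertools import combinations


def is_clique(edges, candidate, k):
    """Return True if `candidate` is a set of k vertices, any two of which are adjacent."""
    edge_set = {frozenset(e) for e in edges}
    nodes = set(candidate)
    if len(nodes) != k:
        return False
    # One lookup per pair of candidate vertices: k(k-1)/2 lookups in all.
    return all(frozenset(pair) in edge_set for pair in combinations(nodes, 2))


EDGES = [(1, 2), (1, 3), (2, 3), (3, 4)]
accepted = is_clique(EDGES, [1, 2, 3], 3)
rejected = is_clique(EDGES, [2, 3, 4], 3)
```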

An even larger computational class is EXP, which is the class of computational problems that have algorithms with running times that are bounded above by a function of the form exp(p(s)), where s is the size of the problem and p is a polynomial function. Instead of using time to define a computational class, we can also use space, i.e., memory; PSPACE is the class of computational problems that have algorithms that use an amount of memory that is bounded by a polynomial function of the size of the input. The sizes of the certificates for a problem in NP are necessarily bounded by some polynomial function of the size of the input, and the problem can be solved by trying all possible certificates not exceeding this bound, so any problem in NP is also in PSPACE. In turn, the number of processor state-memory state pairs during the run of a program using polynomially bounded memory is bounded by an exponential function of the polynomial, so any problem in PSPACE is also in EXP. Thus

P ⊂ NP ⊂ PSPACE ⊂ EXP .

Computational classes can also be defined in relation to an oracle which is assumed to perform some computation. The example of interest to us is an oracle that evaluates a continuous function f : X → X. How hard is it to find a point that is approximately fixed using such an oracle? Hirsch et al. (1989) showed that any algorithm that does this has an exponential worst case running time, because some functions require exponentially many calls to the oracle. Once you commit to an algorithm, the Devil can devise a function for which your algorithm will make exponentially many calls to the oracle before finding an approximate fixed point.

An important aspect of this result is that the oracle is assumed to be the only source of information about the function. In practice the function is specified by code, and in principle an algorithm could inspect the code and use what it learned to speed things up. For linear functions, and certain other special classes of functions, this is a useful approach, but it seems quite farfetched to imagine that a fully general algorithm could do this fruitfully. At the same time it is hard to imagine how we might prove that this is impossible, so we arrive at the conclusion that even though we do not quite have a theorem, finding fixed points almost certainly has exponential worst case complexity.

Even if finding fixed points is, in full generality, quite hard, it might still be the case that certain types of fixed point problems are easier. Consider, in particular, finding a Nash equilibrium of a two person game. Savani and von Stengel (2006) (see also McLennan and Tourky 2010) showed that the Lemke–Howson algorithm has exponential worst case running time, but the algorithm is in many ways similar to the simplex algorithm for linear programming, not least because both algorithms tend to work rather well in practice. The simplex algorithm was shown by Klee and Minty (1972) to have exponential worst case running time, but later polynomial time algorithms were developed by Khachian (1979) and Karmarkar (1984). Whether or not finding a Nash equilibrium of a two person game is in P was one of the outstanding open problems of computer science for over a decade. Additional concepts are required in order to explain how this issue was resolved.

A technique called reduction can be used to show that some computational problems are at least as hard as others, in a precise sense. Suppose that A and B are two computational problems, and we have two algorithms, guaranteed to run in polynomial time, the first of which converts the input encoding an instance of problem A into the input encoding an instance of problem B, and the second of which converts the desired output for the derived instance of problem B into the desired output for the given instance of problem A. Then problem B is at least as hard as problem A because one can easily turn an algorithm for problem B into an algorithm for problem A that is “as good,” in any sense that is invariant under these sorts of polynomial time transformations.

A problem is complete for a class of computational problems if it is at least as hard, in this sense, as any other member of the class. One of the reasons that NP is so important is that there are numerous NP-complete problems, many of which arise naturally; Clique is one of them. One of the most famous problems in contemporary mathematics is to determine whether NP is contained in P. This question boils down to deciding whether Clique (or any other NP-complete problem) has a polynomial time algorithm. This is thought to be highly unlikely, both because a lot of effort has gone into designing algorithms for these problems, and because the existence of such an algorithm would have remarkable consequences. It should be mentioned that this problem is, to some extent at least, an emblematic representative of numerous open questions in computer science that have a similar character. In fact, one of the implicit conventions of the discipline is to regard a computational problem as hard if, after some considerable effort, people haven’t been able to figure out whether it is hard or easy.

For any decision problem in NP and any system of witnesses there is an associated search problem, namely to find a witness for an affirmative answer or verify that the answer is negative. For Clique this could mean not only showing that a clique of size k exists, but actually producing one. The class of search problems associated with decision problems in NP is called FNP. (The ‘F’ stands for “function.”) For Clique the search problem is not much harder than the decision problem, in the following sense: if we had a polynomial time algorithm for the decision problem, we could apply it to the graph with various vertices removed, repeatedly narrowing the focus until we found the desired clique, thereby solving the search problem in polynomial time.
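The narrowing procedure just described can be sketched as follows. The function has_clique below is a brute force stand-in for the hypothetical polynomial time decision algorithm (no such algorithm is known), and the names find_clique and decide are assumptions made for illustration. What matters is that find_clique consults the decision procedure only once per vertex, plus one initial call.

```python
from itertools import combinations


def has_clique(vertices, edges, k):
    """Stand-in decision procedure: brute force, exponential time."""
    edge_set = {frozenset(e) for e in edges}
    return any(
        all(frozenset(p) in edge_set for p in combinations(c, 2))
        for c in combinations(sorted(vertices), k)
    )


def find_clique(vertices, edges, k, decide=has_clique):
    """Return the vertex set of a k-clique using only yes/no answers, or None."""
    remaining = set(vertices)
    if not decide(remaining, edges, k):
        return None
    for v in sorted(vertices):
        smaller = remaining - {v}
        kept = [e for e in edges if v not in e]
        # Discard v whenever a k-clique survives without it.
        if decide(smaller, kept, k):
            remaining, edges = smaller, kept
    return remaining


VERTICES = [1, 2, 3, 4, 5]
EDGES = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
triangle = find_clique(VERTICES, EDGES, 3)
```

At the end every surviving vertex is indispensable, so the surviving vertices are exactly the vertices of one k-clique.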

However, there is a particular class of problems for which the search problem is potentially quite hard, even though the decision problem is trivial because the answer is known to be yes. This class of search problems is called TFNP. (The ‘T’ stands for “total.”) There are some “trivial” decision problems that give rise to quite famous problems in this class:

1. “Does an integer have a prime factorization?” Testing primality can now be done in polynomial time, but there is still no known polynomial time algorithm for factoring.

2. “Given a set of positive integers $\{a_1, \ldots, a_n\}$ with $\sum_i a_i < 2^n$, do there exist two different subsets with the same sum?” There are $2^n$ different subsets, and the sum of any one of them is less than $2^n - n + 1$, so the pigeonhole principle implies that the answer is certainly yes.

3. “Does a two person game have sets of pure strategies for the agents that are the supports of a Nash equilibrium?” Verifying that a pair of sets are the support of a Nash equilibrium is a computation involving linear algebra and a small number of inequality verifications that can be performed in polynomial time.
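For the second problem, the contrast between knowing that a solution exists and actually finding one can be seen in the following sketch, which simply enumerates subsets until two of them have the same sum. The function name equal_sum_subsets is an assumption made for illustration, and the search is exhaustive, so its running time is exponential in n; it returns None when no pair of subsets with equal sums exists.

```python
from itertools import combinations


def equal_sum_subsets(numbers):
    """Return two different subsets (as tuples) with the same sum, or None."""
    seen = {}  # sum -> first subset found with that sum
    for size in range(len(numbers) + 1):
        for subset in combinations(numbers, size):
            total = sum(subset)
            if total in seen:
                return seen[total], subset
            seen[total] = subset
    return None


pair = equal_sum_subsets([1, 2, 3, 4])
```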

Problems involving a function defined on some large space must be specified with a bit more care, because if the function is given by listing its values, then the problem is easy, relative to the size of the input, because the input is huge. Instead, one takes the input to be a Turing machine that computes (in polynomial time) the value of the function at any point in the space.

• “Given a Turing machine that computes a real valued function at every vertex of a graph, is there a vertex where the function’s value is at least as large as the function’s value at any of the vertex’s neighbors in the graph?” Since the graph is finite, the function has a global maximum and therefore at least one local maximum.

• “Given a Turing machine that computes the value of a Sperner labelling at any vertex in a triangulation of the simplex, does there exist a completely labelled subsimplex?”

Mainly because the class of problems in NP that always have a positive answer is defined in terms of a property of the outputs, rather than a property of the inputs (but also in part because factoring seems so different from the other problems) experts expect that TFNP does not contain any problems that are complete for the class. In view of this, trying to study the class as a whole is unlikely to be very fruitful. Instead, it makes sense to define and study coherent subclasses, and Papadimitriou (1994b) advocates defining subclasses in terms of the proof that a solution exists. Thus PPP (“polynomial pigeonhole principle”) is (roughly) the class of problems for which existence is guaranteed by the pigeonhole principle, and PLS (“polynomial local search”) is (again roughly) the set of problems requesting a local maximum of a real valued function defined on a graph by a Turing machine.

For us the most important subclass of TFNP is PPAD (“polynomial parity argument directed”) which is defined by abstracting certain features of the algorithms we have seen in this chapter. To describe this precisely we need to introduce another concept. A Boolean circuit is a directed graph C = (V, E) satisfying the following description. The vertices are strictly partially ordered by a relation < such that v < w for all (v, w) ∈ E. The indegree of all elements of V is not greater than two. If v has indegree zero, it is an input gate, if it has indegree one it is a not gate, and if it has indegree two it is either an or gate or an and gate. If v has outdegree zero it is an output gate. Given an input vector x of truth values for the input gates, we can (in the obvious way) impute truth values to gates whose input gates have already been assigned truth values, arriving eventually at an output vector C(x) of truth values for the output gates.
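The “obvious way” of imputing truth values can be sketched as follows. The encoding of a circuit as a list of (name, kind, predecessors) triples, listed in an order compatible with <, is an assumption made for illustration, as are the names evaluate_circuit and XOR_GATES. For simplicity the output gates are passed explicitly rather than being detected as the gates of outdegree zero.

```python
def evaluate_circuit(gates, outputs, x):
    """Evaluate a Boolean circuit on the input vector x.

    gates: list of (name, kind, predecessors), listed so that every gate
    appears after its predecessors; kind is "input", "not", "and" or "or".
    """
    value = {}
    inputs = iter(x)
    for name, kind, preds in gates:
        if kind == "input":
            value[name] = bool(next(inputs))
        elif kind == "not":
            value[name] = not value[preds[0]]
        elif kind == "and":
            value[name] = value[preds[0]] and value[preds[1]]
        elif kind == "or":
            value[name] = value[preds[0]] or value[preds[1]]
        else:
            raise ValueError(f"unknown gate kind: {kind}")
    return tuple(value[g] for g in outputs)


# Exclusive or, built as (a or b) and not (a and b).
XOR_GATES = [
    ("a", "input", ()),
    ("b", "input", ()),
    ("either", "or", ("a", "b")),
    ("both", "and", ("a", "b")),
    ("not_both", "not", ("both",)),
    ("out", "and", ("either", "not_both")),
]

xor_table = {(a, b): evaluate_circuit(XOR_GATES, ["out"], (a, b))
             for a in (0, 1) for b in (0, 1)}
```

Each gate is visited once, so the evaluation time is linear in the size of the circuit.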

The computational problem EOTL (“end of the line”) takes as input a pair of Boolean circuits $P$ and $S$, each with $n$ input gates and $n$ output gates, such that $P(0^n) = 0^n \ne S(0^n)$, $P(S(0^n)) = 0^n$, and there is no $x \in \{0,1\}^n$ such that $P(x) = S(x) = x$. (Here we are identifying true and false with 1 and 0 respectively, and $0^n = (0, \ldots, 0) \in \{0,1\}^n$.) There is a directed graph¹ $G_{S,P}$ whose vertices are the $x \in \{0,1\}^n$ such that either $P(S(x)) = x$ or $S(P(x)) = x$, with an edge going from $x$ to $y$ precisely when both $S(x) = y$ and $P(y) = x$. Clearly the indegrees and outdegrees of all vertices are either zero or one, and there are no isolated points. (That is, no vertex has indegree zero and outdegree zero.) Evidently $0^n$ is a source, and an acceptable output is a sink or a second source.

More informally, an instance of EOTL specifies devices that allow one to traverse a directed graph of maximal indegree one and maximal outdegree one, and it specifies a source. There is an obvious algorithm for finding a sink, namely follow the path starting at the given source to its other endpoint, but we are also allowed to perform computations on the Boolean circuits S and P if we like, and we are allowed to return any sink or a second source.
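The obvious algorithm can be sketched as follows, with the circuits S and P replaced, purely for illustration, by Python functions on tuples of bits. The helper names to_bits, from_bits and follow_line, and the toy instance in which the path simply counts upward from 0 to LAST, are assumptions of this sketch. Since the path may visit a large fraction of the 2^n bit strings, the number of steps can be exponential in n.

```python
def to_bits(i, n):
    """Big-endian n-bit tuple representing the integer i."""
    return tuple((i >> (n - 1 - j)) & 1 for j in range(n))


def from_bits(x):
    return int("".join(map(str, x)), 2)


def follow_line(S, P, n):
    """Follow the path from 0^n; return the sink reached and the number of steps."""
    x, steps = (0,) * n, 0
    while True:
        y = S(x)
        # There is an edge from x to y only if S(x) = y and P(y) = x.
        if y == x or P(y) != x:
            return x, steps
        x, steps = y, steps + 1


# Toy instance on 3 bits: the path is 0 -> 1 -> ... -> LAST.
N_BITS, LAST = 3, 5


def S(x):
    i = from_bits(x)
    return to_bits(i + 1, N_BITS) if i < LAST else x


def P(x):
    i = from_bits(x)
    return to_bits(i - 1, N_BITS) if 1 <= i <= LAST else to_bits(0, N_BITS)


sink, steps = follow_line(S, P, N_BITS)
```

In the toy instance the strings representing 6 and 7 are not vertices of the graph, because neither P(S(x)) = x nor S(P(x)) = x holds for them.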

The Scarf algorithm, the no lonely toy algorithm, and the Lemke–Howson algorithm each have this character. (It would be difficult to describe homotopy in exactly these terms, but there is an obvious sense in which it is similar.) Although lots of little details would need to be specified in order to put one of them precisely in the framework of EOTL, for the most part it is easy to see that this is possible. Note in particular that throughout we have been careful to establish (by citation in the case of Lemke–Howson) that each algorithm had an orientation, so that it could be understood as going forward in a directed graph rather than traversing an undirected graph by remembering from whence it came. The one important subtlety is that the function whose approximate fixed points we are trying to find might be specified by a Turing machine, but for a particular n such a specification can be converted, in polynomial time, into a definition in terms of a Boolean circuit whose size is bounded by a polynomial function of the size of the Turing machine. (See Theorem 8.1 and Proposition 11.1 of Papadimitriou (1994a), and their proofs. Conversions of this sort were used to establish the first NP-complete problems.)

¹ A directed graph is a pair G = (V, E) where V is a finite set of vertices and E is a finite set of ordered pairs of distinct elements of V. That is, in a directed graph each edge is an “arrow” going from a “tail” vertex to a “head” vertex. The indegree of v ∈ V is the number of elements of E that have v as head, and its outdegree is the number of elements of E that have v as tail. The vertex v is a source if its indegree is zero, and it is a sink if its outdegree is zero.

By definition, a computational problem is in PPAD if there is a polynomial time reduction of it to EOTL. Thus each of the approximate fixed point problems our algorithms solve is in PPAD. The class of problems that can be reduced to the computational problem that has the same features as EOTL, except that the graph is undirected, is PPA. Despite the close resemblance to PPAD, the theoretical properties of the two classes differ in important ways.

In a series of rapid developments in 2005 and 2006 (Daskalakis et al. 2006; Chen and Deng 2006a, b) it was shown that 2-NASH is PPAD-complete, and also that the two dimensional version of SPERNER is PPAD-complete. Since we expect that the general problem of computing an approximate fixed point is hard, this is regarded as compelling evidence that there is no polynomial time algorithm for 2-NASH. Since this breakthrough many other computational problems have been shown to be PPAD-complete, including finding Walrasian equilibria in seemingly quite simple exchange economies. This line of research has also found that, in various senses, these computational problems do not become more tractable if we relax them, for example asking for a point that is ε-approximately fixed for an ε > 0 that is not much greater than zero.

There is an even stronger negative result. The computational problem OEOTL (other end of the line) has the same given data as EOTL, but now the goal is to find the other end of the path beginning at (0, . . . , 0), and not just any second leaf of the graph. Goldberg et al. (2011) show that OEOTL is PSPACE-complete.

The current state of theory presents a stark contrast between theoretical concepts that classify even quite simple fixed point problems as intractable, and algorithms that often produce useful results in a reasonable amount of time. The argument showing that 2-NASH is PPAD-complete can be used to show that the Scarf algorithm, the no lonely toy algorithm, the Lemke–Howson algorithm, and many specific instances of homotopy procedures can be recrafted as algorithms for OEOTL. Yet each of these algorithms has some practical utility, and some of them frequently succeed with problems that are, in various senses, “large.”

All the algorithms we have discussed follow paths. In view of this fact, and the theoretical results described above, it seems quite likely that all practical algorithms for problems in PPAD will have this character. If there were a fundamentally different approach, it would be possible to develop hybrid algorithms that combined the strengths of the two approaches while using each to minimize the other’s weaknesses. No such approach is currently known, and the rich theoretical possibilities that would result from such a discovery are in sharp contrast with the negative theoretical results described above.

At the same time the path following algorithms described here are, in various practical senses, quite diverse. Homotopy has been applied successfully to numerous application domains, but its practical applications are restricted to problems that do not have fatal failures of smoothness. The McLennan-Tourky method of leveraging the Lemke–Howson algorithm may have the potential to overcome weaknesses of the restart versions of the Scarf algorithm and the primitive set method. The concepts from computer science we have described here are too coarse to distinguish between these approaches. There is at present very little practical experience contrasting these algorithms, and there is evidently ample scope for imaginative software engineering.

Exercises

3.1 Prove that for a metric space X the following are equivalent:

(a) X is compact.
(b) Every collection of closed sets with the finite intersection property has a nonempty intersection.
(c) If $F_1 \supset F_2 \supset \cdots$ is a decreasing sequence of nonempty closed sets, then $\bigcap_n F_n \ne \emptyset$.
(d) X is totally bounded and complete.

3.2 For the example of Fig. 3.5, work out the path of the Lemke–Howson algorithm when the role of $s_0$ is played by $s_2$, and when it is played by each of $s_3$, $t_1$, $t_2$, and $t_3$.

3.3 In the computational problem Sort the input is a list $n_1, \ldots, n_k$ of distinct integers, and the output is a list with the same integers ordered from smallest to largest. The bubble sort algorithm compares the first two elements of the list, swapping them if they are out of order, then compares the second and third elements, again swapping them if they are out of order, and so forth to the end of the list. This process is repeated until there is a pass through the list that does not swap any elements. The merge algorithm has a recursive definition. If the list has more than one element, then it is divided in the middle into two lists whose sizes differ by at most one, and the merge algorithm is applied to each list. The resulting ordered lists are then merged by repeatedly moving the smaller of the two numbers at the heads of these lists to a new list.

(a) What is the maximum number of pairwise comparisons of two integers for bubble sort?

(b) What is the maximum number of pairwise comparisons of two integers for merge?
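For readers who would like to experiment before attempting a proof, the following sketch implements the two procedures as described and counts the pairwise comparisons each one performs. The function names are assumptions made for illustration, and the counts depend on following the descriptions literally (for example, every pass of bubble sort runs to the end of the list).

```python
def bubble_sort_comparisons(items):
    """Sort by repeated passes; return the sorted list and the number of comparisons."""
    a, count = list(items), 0
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(a) - 1):
            count += 1
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
    return a, count


def merge_sort_comparisons(items):
    """Sort by recursive merging; return the sorted list and the number of comparisons."""
    a = list(items)
    if len(a) <= 1:
        return a, 0
    mid = len(a) // 2
    left, count_left = merge_sort_comparisons(a[:mid])
    right, count_right = merge_sort_comparisons(a[mid:])
    merged, count, i, j = [], count_left + count_right, 0, 0
    while i < len(left) and j < len(right):
        count += 1
        if left[i] < right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged += left[i:] + right[j:]   # one list is exhausted; no comparisons needed
    return merged, count


bubble_result = bubble_sort_comparisons([4, 3, 2, 1])
merge_result = merge_sort_comparisons([4, 3, 2, 1])
```

Running both functions on all orderings of a short list is one way to form a conjecture about the maxima.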


3.4 Linear programming is, arguably, by far the most important computational problem. Among other things, a large fraction of the computational problems that have polynomial time algorithms can be expressed as linear programs. This exercise explores methods for solving linear programs.

We consider the linear program $\max c^T x$ subject to $x \ge 0$ and $Ax \le b$ where $A$ is an $m \times n$ matrix, $b \in \mathbb{R}^m$, and $c \in \mathbb{R}^n$. Let $P := \{ x \in \mathbb{R}^n_+ : Ax \le b \}$. This is a polyhedron, which may be empty or unbounded. The geometric idea of the simplex algorithm is to repeatedly increase the value of the objective function by moving along an edge of $P$ from one endpoint to the other. Eventually the algorithm finds a vertex of $P$ at which no further improvement of the objective function is possible, or it finds a ray along which the objective function can be increased without bound. (The problem of determining whether $P$ is empty, and finding an initial vertex if it is not, can be handled by applying this algorithm to a relaxation of the constraints.)

The numerical implementation of the simplex algorithm involves a “tableau” that is a matrix that, in effect, specifies a coordinate system for which the current vertex is the origin, and the values of the currently binding constraints are the coordinates. Passage from one vertex to the next is reduced to a rule for updating this tableau. There are numerous numerical issues related to accumulation of round off error (if floating point arithmetic is used) or growth in the sizes of the numerators and denominators (in the case of rational arithmetic) and an important issue is how to proceed if more than n constraints bind at a vertex. These difficulties are not insuperable, and in practice the simplex algorithm is very successful.

Klee and Minty (1972) presented an example showing that the simplex algorithm could have exponential running time in the worst case. (Spielman and Teng 2004 is currently regarded as the best theoretical explanation of why running times are usually much faster.) So, is there a polynomial time algorithm for linear programming? This question was answered in the affirmative by Khachian (1979). We explain the key idea of his approach.

(a) Taking the results of Exercise 2.2, including the strong duality theorem of linear programming, as known, argue that we can find an optimal $x$ if we have a procedure that finds a feasible $x$ or shows that $P$ is empty.

(b) Suppose we have a procedure that determines whether $P$ is nonempty. Show that if $P$ is nonempty, then by repeatedly applying this procedure to systems obtained by setting some component of $x$ to 0, or requiring that some inequality $\sum_j a_{ij} x_j \le b_i$ hold with equality, we can obtain a system of linear equations whose unique solution is a feasible point.

We now let $P = \{ x \in \mathbb{R}^n : Ax \le b \}$. (Since the nonnegativity constraints can be added to $A$, the problem of determining whether $P$ is nonempty is not less general than what we had before.) We also make our objective easier to attain in two additional ways. First, it is enough to have a procedure that determines whether $P$ is nonempty when it is already known that $P$ is bounded. The justification for this is that if there is any solution at all, then there is one with $-2^D \le x_j \le 2^D$ for all $j$, where $D$ is the total number of digits in the binary representations of the $m(n+1)$ integers $a_{ij}$ and $b_i$. (You may take proving this as a challenge.) Therefore we can add these additional constraints to $A$. Second, we will actually determine whether

$$P_\varepsilon := \Big\{ x \in \mathbb{R}^n : \sum_j a_{ij} x_j < b_i + \varepsilon \ \text{for } i = 1, \ldots, m \Big\}$$

is nonempty, where $\varepsilon > 0$ is small enough that if it is, then $P$ is nonempty. (A second challenge is convincing yourself that such an $\varepsilon$ exists, and can be expressed in terms of $D$.) For a suitable $\varepsilon$ it is possible to develop a lower bound on the volume of $P_\varepsilon$ when it is nonempty.

We now know that $P_\varepsilon$ is contained in a large ball $B$ centered at the origin, and it has a certain minimum volume if it is nonempty. We can check whether the origin is an element of $P_\varepsilon$, and if it is we are done. Otherwise there is some $i$ such that $b_i + \varepsilon < 0$, and we know that $P_\varepsilon \subset \{ x \in B : \sum_j a_{ij} x_j < b_i + \varepsilon \}$.

(c) Prove that $\{ x \in \mathbb{R}^n : \|x\| \le 1 \text{ and } x_1 \ge 0 \} \subset E$ where

$$E := \Big\{ x \in \mathbb{R}^n : \Big(\frac{n+1}{n}\Big)^2 \Big(x_1 - \frac{1}{n+1}\Big)^2 + \frac{n^2-1}{n^2} \sum_{j=2}^n x_j^2 \le 1 \Big\}.$$

(d) Prove that the volume of $E$ is $\frac{n}{n+1} \big(\frac{n^2}{n^2-1}\big)^{(n-1)/2}$ times the volume of the unit ball.

(e) Prove that $\frac{n}{n+1} \big(\frac{n^2}{n^2-1}\big)^{(n-1)/2} < e^{-1/(2n+1)}$.

Therefore $P_\varepsilon$ is contained in an ellipsoid $B'$ (that is, a set like $E$) whose volume is less than $e^{-1/(2n+1)}$ times the volume of $B$. There is an affine transformation of $\mathbb{R}^n$ that takes $B'$ to $B$. After applying this affine transformation to $P_\varepsilon$, we can repeat this calculation. The lower bound on the volume of $P_\varepsilon$ gives an upper bound on the number of times we have to do this before we are able to conclude that $P_\varepsilon = \emptyset$. Of course the affine transformation in question needs to be specified algebraically, and there is then a lot of work to do in showing that the numerical calculations are polynomially bounded, but we have seen the main idea.
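A numerical check is of course no substitute for the proofs requested in (d) and (e), but it can be reassuring. The sketch below evaluates the volume ratio from (d) and compares it with the bound in (e) for a range of dimensions (starting at n = 2, since the expression is not defined at n = 1). It also computes a number of iterations that suffices to drive the volume below a given lower bound, on the assumption that each iteration shrinks the volume by the factor e^{-1/(2n+1)}. The function names and the argument log_volume_gap, which stands for the logarithm of the ratio of the volume of B to the lower bound on the volume of P_ε, are assumptions made for illustration.

```python
import math


def volume_ratio(n):
    """Volume of E divided by the volume of the unit ball in dimension n >= 2."""
    return n / (n + 1) * (n**2 / (n**2 - 1)) ** ((n - 1) / 2)


def shrink_bound(n):
    """The bound e^{-1/(2n+1)} on the per-iteration volume ratio."""
    return math.exp(-1 / (2 * n + 1))


def iterations_needed(n, log_volume_gap):
    """A number of iterations t with t / (2n + 1) >= log_volume_gap."""
    return math.ceil((2 * n + 1) * log_volume_gap)


bound_holds = all(volume_ratio(n) < shrink_bound(n) for n in range(2, 200))
```

Note that the number of iterations grows only linearly in n for a fixed logarithmic volume gap, which is the source of the polynomial bound.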

3.5 Barycentric subdivision is theoretically useful, but for computational purposes it is extremely inefficient. There is a huge literature studying other methods of triangulating simplices, cubes, and polytopal complexes. Here we develop the regular subdivision of the simplex due to Kuhn (1960, 1968), which is simple and computationally practical.

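Fix an integer $d$, and let $O$ be the set of complete orderings $\precsim$ of $\{B\} \cup \{0, \ldots, d\} \cup \{T\}$ such that $B \precsim i \precsim T$ for all $i = 0, \ldots, d$ and $B \prec T$. Let $C := [0,1]^{d+1}$. For $\precsim \in O$ let $\Delta_\precsim$ be the set of $x \in C$ such that $x_i = 0$ if $i \precsim B$, $x_i \le x_j$ for all $i$ and $j$ such that $i \precsim j$, and $x_i = 1$ for all $i$ such that $T \precsim i$. Let $\precsim^*$ be the ordering such that $B \prec^* 0 \prec^* \cdots \prec^* d \prec^* T$. We will see that $\Delta_{\precsim^*}$ is a simplex, and we will triangulate it. Fix $\precsim, \precsim' \in O$.

(a) Prove that the set of vertices of the polytope $\Delta_\precsim$ is $\Delta_\precsim \cap \{0,1\}^{d+1}$.
(b) Prove that the vertices of $\Delta_\precsim$ are affinely independent, so that $\Delta_\precsim$ is a simplex.
(c) What is $\Delta_\precsim \cap \Delta_{\precsim'}$?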


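Fix an integer $k$, and let $L := \{0, \ldots, k-1\}^{d+1}$. For $h \in L$ and $\precsim \in O$ let $\Delta_\precsim(h)$ be the set of $x \in \prod_{i=0}^{d} \big[\frac{h_i}{k}, \frac{h_i+1}{k}\big]$ such that $x_i = h_i/k$ if $i \precsim B$, $x_i - h_i/k \le x_j - h_j/k$ for all $i$ and $j$ such that $i \precsim j$, and $x_i = (h_i + 1)/k$ for all $i$ such that $T \precsim i$.

(d) Prove that $\mathcal{T} := \{\emptyset\} \cup \{ \Delta_\precsim(h) : h \in L \text{ and } \precsim \in O \}$ is a triangulation of $C$.
(e) Prove that $\{ \sigma \in \mathcal{T} : \sigma \subset \Delta_{\precsim^*} \}$ is a triangulation of $\Delta_{\precsim^*}$.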
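The computational convenience of this subdivision comes from the fact that locating a point requires only rounding and sorting. The following sketch, under the assumption that the ordering is strict on the indices (so that the simplex has full dimension), finds an h and a ranking of the indices such that the given point of C lies in the corresponding simplex, and lists that simplex's vertices. The function names locate and simplex_vertices are assumptions made for illustration, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction


def locate(x, k):
    """Return (h, order): a small cube index h and the indices ranked by offset.

    `order` lists the indices from lowest to highest in an ordering for which
    x_i - h_i/k <= x_j - h_j/k whenever i is ranked below j.
    """
    x = [Fraction(v) for v in x]
    # For v >= 0, int() truncation is the floor; points with v = 1 go in the last cube.
    h = [min(int(v * k), k - 1) for v in x]
    offset = [v - Fraction(hi, k) for v, hi in zip(x, h)]
    order = sorted(range(len(x)), key=lambda i: offset[i])
    return h, order


def simplex_vertices(h, order, k):
    """Vertices of the simplex determined by h and a strict ranking of the indices."""
    n = len(h)
    vertices = []
    for j in range(n + 1):
        top = set(order[n - j:])   # the j highest ranked indices sit at the upper end
        vertices.append(tuple(Fraction(h[i] + (i in top), k) for i in range(n)))
    return vertices


h, order = locate([Fraction(1, 4), Fraction(7, 8)], 2)
vertices = simplex_vertices(h, order, 2)
```

Ties in the sort correspond to points lying on a face shared by several simplices, in which case any of the tied rankings gives a simplex containing the point.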

3.6 Let $N = \{1, \ldots, n\}$ be a set of individuals. There is a set $H$ containing $n$ houses. Each individual $i$ has a strict preference ordering $\succ_i$ of $H$. (That is, $\succ_i$ is a complete, transitive, antisymmetric binary relation.) We write $h \succeq_i h'$ to indicate that either $h \succ_i h'$ or $h = h'$. An allocation is a bijection $\alpha : N \to H$. Initially each individual $i$ owns a house $h_i$, and each house is owned by one and only one individual. The allocation $\alpha$ is in the core if there does not exist a nonempty set $C \subset N$ and a bijection $\beta : C \to \{ h_i : i \in C \}$ such that $\beta(i) \succeq_i \alpha(i)$ for all $i \in C$ and $\beta(i) \succ_i \alpha(i)$ for some $i \in C$. A top trading cycle is a list $i_1, \ldots, i_k$ of distinct individuals such that for each $j = 1, \ldots, k-1$, $h_{i_{j+1}}$ is $i_j$’s favorite house, and $h_{i_1}$ is $i_k$’s favorite house.

(a) Prove that if $\alpha$ is a core allocation and $i_1, \ldots, i_k$ is a top trading cycle, then $\alpha(i_j) = h_{i_{j+1}}$ for all $j = 1, \ldots, k-1$ and $\alpha(i_k) = h_{i_1}$.

(b) Prove that there is a unique core allocation.

(c) Describe an algorithm that has initial ownership and preferences as its inputs and has the unique core allocation as its output. Is this a polynomial time procedure?

(The top trading cycle algorithm appeared in Scarf and Shapley 1974, where it is attributed to David Gale.)
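One possible implementation of the top trading cycle idea is sketched below; it is offered as an aid to experimentation, not as a substitute for the arguments the exercise asks for. The function name top_trading_cycles and the dictionary representations of preferences and ownership are assumptions made for illustration. In each round every remaining individual points to the owner of his favorite remaining house, a cycle of pointers is found by walking from an arbitrary individual, and the members of the cycle trade and leave.

```python
def top_trading_cycles(prefs, owner_of):
    """Return a dict assigning a house to each individual.

    prefs[i] lists all houses, most preferred first;
    owner_of[h] is the initial owner of house h.
    """
    allocation = {}
    remaining = set(prefs)

    def favorite(i):
        # Favorite house among those whose owners have not yet traded.
        return next(h for h in prefs[i] if owner_of[h] in remaining)

    while remaining:
        # Follow pointers until an individual repeats; the repeat closes a cycle.
        path, i = [], min(remaining)
        while i not in path:
            path.append(i)
            i = owner_of[favorite(i)]
        cycle = path[path.index(i):]
        for j in cycle:
            allocation[j] = favorite(j)
        remaining -= set(cycle)
    return allocation


PREFS = {1: ["b", "a", "c"], 2: ["a", "b", "c"], 3: ["a", "b", "c"]}
OWNER_OF = {"a": 1, "b": 2, "c": 3}
allocation = top_trading_cycles(PREFS, OWNER_OF)
```

Each round removes at least one individual, and finding a cycle takes at most one pass through the remaining individuals, which suggests how the running time might be bounded.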

3.7 Let $M$ and $W$ be finite sets of men and women. Each $m \in M$ has a strict preference ordering $\succ_m$ of $W \cup \{\emptyset\}$, and each $w \in W$ has a strict preference ordering $\succ_w$ of $M \cup \{\emptyset\}$, where $\emptyset$ represents being unmatched. A match is a function

$$\mu : M \cup W \to M \cup W \cup \{\emptyset\}$$

such that $\mu(M) \subset W \cup \{\emptyset\}$, $\mu(W) \subset M \cup \{\emptyset\}$, $\mu(\mu(m)) = m$ for all $m \in M$ such that $\mu(m) \in W$, and $\mu(\mu(w)) = w$ for all $w \in W$ such that $\mu(w) \in M$. The match $\mu$ is stable if there do not exist $m \in M$ and $w \in W$ such that $w \succ_m \mu(m)$ and $m \succ_w \mu(w)$. The deferred acceptance algorithm (Gale and Shapley 1962) begins with each man sending a proposal to his favorite woman (if there is anyone he prefers to $\emptyset$). Women reject all proposals that are worse than $\emptyset$, and women who receive multiple proposals reject all but their favorite suitor. In each subsequent round each man who received a rejection in the last round sends a proposal to his favorite woman among those who are better for him than $\emptyset$ and have not yet rejected him, and each woman rejects all proposals worse than $\emptyset$ and all proposals other than the best she has received so far. Eventually men run out of women to propose to, at which point each woman holding a proposal is matched to the proposer, and everyone else is matched to $\emptyset$. Let $\mu$ be the resulting match.


(a) Prove that $\mu$ is stable.

(b) Prove $\mu$ is $M$-optimal, in the sense that there is no other stable match that provides any man with a partner he prefers.

(c) By comparing the sets of men and women who are unmatched in the $M$-optimal and $W$-optimal stable matches, show that the sets of unmatched men and women are the same in all stable matches. (The generalization of this to many-to-one matching is known as the rural hospital theorem.)
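The deferred acceptance procedure described above can be sketched in a few lines. The version below is an assumption-laden illustration: it processes one proposal at a time rather than in synchronized rounds (a reordering that is known to produce the same match), and the name `deferred_acceptance` and the list-based encoding of preferences, in which each list contains only the partners preferred to $\emptyset$, are chosen here for convenience.

```python
def deferred_acceptance(men_prefs, women_prefs):
    """Return {man: woman or None} produced by man-proposing deferred acceptance.

    men_prefs[m]:   women whom m prefers to being unmatched, best first.
    women_prefs[w]: men whom w prefers to being unmatched, best first.
    """
    # rank[w][m] is the position of m in w's list (smaller is better).
    rank = {w: {m: r for r, m in enumerate(p)} for w, p in women_prefs.items()}
    next_choice = {m: 0 for m in men_prefs}   # index of m's next proposal
    held = {}                                 # woman -> man whose proposal she holds
    free = list(men_prefs)                    # men not currently held by anyone
    while free:
        m = free.pop()
        if next_choice[m] >= len(men_prefs[m]):
            continue                          # m has run out of women to propose to
        w = men_prefs[m][next_choice[m]]
        next_choice[m] += 1
        if m not in rank.get(w, {}):
            free.append(m)                    # w prefers being unmatched to m
        elif w not in held:
            held[w] = m                       # w holds her first acceptable proposal
        elif rank[w][m] < rank[w][held[w]]:
            free.append(held[w])              # w rejects the proposal she was holding
            held[w] = m
        else:
            free.append(m)                    # w rejects m
    match = {m: None for m in men_prefs}
    match.update({m: w for w, m in held.items()})
    return match
```

Since each man proposes to each woman at most once, the loop performs at most $|M| \cdot |W|$ proposals.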

3.8 The computational class coNP is the class of all decision problems for which a 'No' answer has a certificate that can be verified in polynomial time. It is a perfect mirror of NP, and thus of interest primarily with respect to questions concerning how it stands in relation to NP. For example, it is thought to be quite unlikely that NP = coNP, but we cannot prove this if we cannot prove that NP ≠ P.

A parity game consists of a directed graph $G = (V, E)$ with a partition of $V$ into sets $V_O$ and $V_E$, and a priority function $\lambda : V \to \mathbb{N}$. For $v \in V$, the set $C(v)$ of children of $v$ is the set of $w \in V$ that are targets of elements of $E$ whose source is $v$. We assume that $C(v) \ne \emptyset$ for all $v \in V$. A play is a sequence $v_0 v_1 v_2 \ldots$ such that $v_{i+1} \in C(v_i)$ for all $i \ge 0$. The Odd (Even) player wins $v_0 v_1 v_2 \ldots$ if the largest priority that occurs infinitely often in the sequence $\lambda(v_0), \lambda(v_1), \ldots$ is odd (even). A stationary strategy for the Odd (Even) player is a function $\sigma : V_O \to V$ ($\tau : V_E \to V$) such that $\sigma(v) \in C(v)$ for all $v \in V_O$ ($\tau(v) \in C(v)$ for all $v \in V_E$). A stationary strategy $\sigma$ ($\tau$) for the Odd (Even) player is winning at $v_0 \in V$ if the Odd (Even) player wins every play $v_0 v_1 v_2 \ldots$ such that $v_{i+1} = \sigma(v_i)$ for all $v_i \in V_O$ ($v_{i+1} = \tau(v_i)$ for all $v_i \in V_E$).

(a) Argue that there is a polynomial time algorithm that can verify that $\sigma$ is winning at $v_0$ when this is the case. (It is perhaps simplest to think of this as determining whether Even can win at $v_0$ when $V_O = \emptyset$.)

(b) For $v \in V$ let $p(v) := (-1)^{\lambda(v)} (|V| + 1)^{\lambda(v)}$. We consider the discounted game in which, for a given $\delta \in (0, 1)$, the payoff to the Even player at play $v_0 v_1 v_2 \ldots$ is $\sum_{t=0}^{\infty} \delta^t p(v_t)$ and the payoff to the Odd player is the negation of this. Use the contraction mapping theorem to prove that there is a unique $W_\delta : V \to \mathbb{R}$ such that $W_\delta(v) = p(v) + \delta \max_{w \in C(v)} W_\delta(w)$ if $v \in V_E$ and $W_\delta(v) = p(v) + \delta \min_{w \in C(v)} W_\delta(w)$ if $v \in V_O$.

(c) For each $v \in V_E$ let $\sigma_\delta(v)$ be an element $w \in C(v)$ that maximizes $W_\delta(w)$, and for each $v \in V_O$ let $\tau_\delta(v)$ be an element $w \in C(v)$ that minimizes $W_\delta(w)$. Prove that if $\delta$ is sufficiently close to 1, then there is no $v \in V$ such that $W_\delta(v) = 0$, $\sigma_\delta$ is a winning strategy for the Even player at every $v$ such that $W_\delta(v) > 0$, and $\tau_\delta$ is a winning strategy for the Odd player at every $v$ such that $W_\delta(v) < 0$.

(d) Thus the problem of determining whether the Even player has a winning strategy at $v_0$ is in NP ∩ coNP. In spite of considerable effort there is no known polynomial time algorithm for this problem. Can you see why computing $W_\delta$ for $\delta$ sufficiently close to 1 doesn't work?

Condon (1992) initiated the study of computational complexity for stochastic games.
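The fixed point in part (b) can be approximated by iterating the map whose fixed point is $W_\delta$; because that map is a contraction with modulus $\delta$, the iterates converge geometrically. The following is a minimal sketch under assumptions made here for illustration: the name `discounted_values`, the dictionary encoding of the graph, and the stopping tolerance `tol` are not from the text.

```python
def discounted_values(children, priority, even_nodes, delta, tol=1e-12):
    """Approximate W_delta by successive approximation.

    children[v]: nonempty list of the children C(v) of node v.
    priority[v]: the priority lambda(v), a nonnegative integer.
    even_nodes:  the set V_E; every other node is taken to be in V_O.
    """
    n = len(children)
    # p(v) = (-1)^lambda(v) * (|V| + 1)^lambda(v)
    p = {v: (-1) ** priority[v] * (n + 1) ** priority[v] for v in children}
    W = {v: 0.0 for v in children}
    while True:
        new = {}
        for v in children:
            vals = [W[w] for w in children[v]]
            best = max(vals) if v in even_nodes else min(vals)
            new[v] = p[v] + delta * best
        # Stop once successive iterates agree to within the tolerance.
        if max(abs(new[v] - W[v]) for v in children) < tol:
            return new
        W = new
```

The number of iterations needed for a fixed accuracy grows as $\delta$ approaches 1, and the magnitudes $(|V| + 1)^{\lambda(v)}$ can be very large, which is relevant to the question in part (d).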


3.9 This problem and the next give examples of reductions. A two person game specified by $m \times n$ payoff matrices $A$ and $B$ is a zero sum game if $B = -A$. Let $\Delta^m = \{ x \in \mathbb{R}^m_+ : x_1 + \cdots + x_m = 1 \}$ and $\Delta^n = \{ y \in \mathbb{R}^n_+ : y_1 + \cdots + y_n = 1 \}$.

(a) (von Neumann's Minimax Theorem) Use the fact that a zero sum game has a Nash equilibrium to prove that

\[ \max_{x \in \Delta^m} \min_{y \in \Delta^n} x^T A y = \min_{y \in \Delta^n} \max_{x \in \Delta^m} x^T A y. \]

(b) Reduce the problem of finding a Nash equilibrium strategy of a zero sum game to a linear program in which the variables are $x$ and the first player's equilibrium payoff.
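One standard way to set up such a linear program is shown below, as a sketch rather than as the intended solution. Here $a_{ij}$ denotes the entries of $A$ and $v$ is a scalar variable standing for the payoff the first player can guarantee; both symbols are introduced only for this illustration.

```latex
\begin{aligned}
\text{maximize}\quad & v \\
\text{subject to}\quad & \sum_{i=1}^{m} x_i a_{ij} \ge v \qquad (j = 1, \ldots, n), \\
& x_1 + \cdots + x_m = 1, \qquad x_i \ge 0 \quad (i = 1, \ldots, m).
\end{aligned}
```

The constraints indexed by $j$ suffice because a linear function of $y$ attains its minimum over the simplex at a vertex, that is, at a pure strategy of the second player.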

3.10 An input for the decision problem SAT ("satisfiability") consists of a list $P, Q, \ldots, Z$ of Boolean variables, called literals, and a Boolean formula in conjunctive-disjunctive form, which is an expression of the form

\[ (x_{11} \vee \cdots \vee x_{1 n_1}) \wedge \cdots \wedge (x_{k1} \vee \cdots \vee x_{k n_k}) \]

where each $x_{ij}$ is an element of $\{P, \neg P, Q, \neg Q, \ldots, Z, \neg Z\}$, so it is either a literal or the negation of a literal. The problem is to determine whether there is a vector of truth values for the literals such that the truth value of the formula is 'true.' Historically SAT was the first problem to be shown to be NP-complete, independently by Stephen Cook and Leonid Levin.

An integer linear program is a linear program with the additional constraint that some or all of the solution variables are constrained to be integers. The feasibility version is a decision problem in which the inputs are an $m \times n$ matrix $A$ and a vector $b \in \mathbb{R}^m$, and the problem is to determine whether there is a vector $x = (x_1, \ldots, x_n)$ of nonnegative integers such that $Ax \le b$.

(a) Show that the feasibility version of integer linear programming is in NP.

(b) Give a polynomial time reduction passing from an input to SAT to an equivalent integer linear programming feasibility problem. Conclude that the feasibility version of integer linear programming is NP-complete.
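A reduction of the kind requested in part (b) can be sketched as follows; it is one possible construction, not necessarily the one the exercise has in mind. Each Boolean variable becomes an integer variable bounded above by 1, and each clause becomes one inequality. The names `sat_to_ilp` and `feasible_01` and the signed-integer encoding of literals are assumptions made for this illustration, and `feasible_01` is a brute-force checker included only to test small instances, not part of the reduction.

```python
from itertools import product

def sat_to_ilp(num_vars, clauses):
    """Build (A, b) with: formula satisfiable iff A x <= b for some nonnegative integer x.

    clauses: list of clauses, each a list of nonzero ints; the literal k refers to
    variable |k| (numbered from 1) and is negated when k < 0.
    """
    A, b = [], []
    for v in range(num_vars):
        row = [0] * num_vars
        row[v] = 1
        A.append(row)
        b.append(1)                      # x_v <= 1, so x_v is 0 or 1
    for clause in clauses:
        row = [0] * num_vars
        negated = 0
        for lit in clause:
            v = abs(lit) - 1
            if lit > 0:
                row[v] -= 1              # a positive literal contributes x_v
            else:
                row[v] += 1              # a negated literal contributes 1 - x_v
                negated += 1
        # sum(positive x_v) + sum(1 - negated x_v) >= 1, rewritten in the form row . x <= b
        A.append(row)
        b.append(negated - 1)
    return A, b

def feasible_01(A, b):
    """Exhaustively test whether some 0-1 vector x satisfies A x <= b."""
    n = len(A[0])
    return any(
        all(sum(a * x for a, x in zip(row, xs)) <= bi for row, bi in zip(A, b))
        for xs in product((0, 1), repeat=n)
    )
```

The matrix has one row per variable and one per clause, so it is built in time linear in the size of the formula.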

The importance of NP-completeness became evident when Karp (1972) presented a list of 21 NP-complete problems, including the feasibility version of integer linear programming with the additional requirement that all of the integers are either zero or one.

Part III
Topological Methods

Chapter 4
Topologies on Spaces of Sets

The theories of the degree and the index involve a certain kind of continuity with respect to the function or correspondence in question, so we need to develop topologies on spaces of functions and correspondences. The main idea is that one correspondence is close to another if its graph is close to the graph of the second correspondence, so we need to have topologies on spaces of subsets of a given space. In this chapter we study such spaces of sets, and in the next chapter we apply these results to spaces of functions and correspondences. There are three basic set theoretic operations that are used to construct new functions or correspondences from given ones, namely restriction to a subdomain, cartesian products, and composition, and our agenda here is to develop continuity results for elementary operations on sets that will eventually support continuity results for those operations.

To begin with, Sect. 4.1 reviews some basic properties of topological spaces that hold automatically in the case of metric spaces. In Sect. 4.2 we define topologies on spaces of compact and closed subsets of a general topological space. Section 4.3 presents a nice result due to Vietoris which asserts that for one of these topologies the space of nonempty compact subsets of a compact space is compact. Economists commonly encounter this in the context of a metric space, in which case the topology is induced by the Hausdorff distance; Sect. 4.4 clarifies the connection. In Sect. 4.5 we study the continuity properties of basic operations for these spaces. Our treatment is largely drawn from Michael (1951), which contains a great deal of additional information about these topologies.

4.1 Topological Terminology

Up to this point the only topological spaces we have encountered have been subsets of Euclidean spaces. Now we allow the possibility that X lacks some of the properties of metric spaces, in part because we may ultimately be interested in some spaces that are not metrizable, but also in order to clarify the logic underlying our results.

Throughout this chapter we work with a fixed topological space X. We say that X is:

(a) a T1-space if, for each x ∈ X, {x} is closed;
(b) Hausdorff if any two distinct points have disjoint neighborhoods;
(c) regular if every neighborhood of a point contains a closed neighborhood of that point;
(d) normal if, for any two disjoint closed sets C and D, there are disjoint open sets U and V with C ⊂ U and D ⊂ V.

These conditions are called separation axioms. An equivalent (and more common) definition of a T1-space requires that for any two distinct points, each has a neighborhood that does not contain the other, so a Hausdorff space is T1. An equivalent (and more common) definition of a regular space requires that any closed set and any point not in that set have disjoint open neighborhoods. Therefore a normal T1 space is both Hausdorff and regular. It is an easy exercise to show that a metric space is normal and T1.

A collection $\mathcal{B}$ of subsets of $X$ is a base of the topology if the open sets are precisely the sets that can be expressed as unions of elements of $\mathcal{B}$. That is, $\mathcal{B}$ is a base of a topology if and only if all the elements of $\mathcal{B}$ are open and the open sets are those $U \subset X$ such that for every $x \in U$ there is a $V \in \mathcal{B}$ with $x \in V \subset U$.

We say that $\mathcal{B}$ is a subbase of the topology if the collection of finite intersections of elements of $\mathcal{B}$ (including $\emptyset$, as a matter of convention) is a base. Equivalently, each element of $\mathcal{B}$ is open and for each open $U$ and $x \in U$ there are $V_1, \ldots, V_k \in \mathcal{B}$ such that $x \in V_1 \cap \cdots \cap V_k \subset U$. It is often easy to define or describe a topology by specifying a subbase (in which case we say that the topology of $X$ is generated by $\mathcal{B}$), so we should understand what properties a collection $\mathcal{B}$ of subsets of $X$ has to have in order for this to work.

Lemma 4.1 If $\mathcal{B}$ is a collection of subsets of $X$ such that every point of $X$ is an element of some element of $\mathcal{B}$, then $\mathcal{B}$ is a subbase of a unique topology of $X$.

Proof The open sets of a topology that has $\mathcal{B}$ as a subbase must be precisely the arbitrary unions of finite intersections of elements of $\mathcal{B}$. Evidently the collection of such sets is closed under finite intersection and arbitrary union. It includes $\emptyset$ by convention and $X$ by assumption. □
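For a finite set the construction in this proof can be carried out mechanically, which may help fix ideas. The sketch below is only an illustration on finite sets; the name `topology_from_subbase` and the representation of sets as `frozenset` objects are assumptions made here.

```python
from itertools import combinations

def topology_from_subbase(subbase):
    """Return the open sets of the topology generated by a finite subbase.

    subbase: an iterable of finite sets whose union is the whole space.
    """
    S = [frozenset(s) for s in subbase]
    # The base: all finite intersections of subbasic sets, plus the empty set by convention.
    base = {frozenset()}
    for r in range(1, len(S) + 1):
        for combo in combinations(S, r):
            base.add(frozenset.intersection(*combo))
    # The open sets: close the base under unions (pairwise unions suffice for finite families).
    opens = set(base)
    changed = True
    while changed:
        changed = False
        for a in list(opens):
            for b in list(opens):
                u = a | b
                if u not in opens:
                    opens.add(u)
                    changed = True
    return opens
```

For the subbase $\{\{1,2\},\{2,3\}\}$ on $\{1,2,3\}$ this yields the five open sets $\emptyset$, $\{2\}$, $\{1,2\}$, $\{2,3\}$, and $\{1,2,3\}$; the whole space appears because the subbase covers it, as the lemma requires.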

4.2 Spaces of Closed and Compact Sets

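There will be a number of topologies, and in order to define them we need the corresponding subbases. For each open $U \subset X$ let:

• $\tilde{\mathcal{U}}_U := \{ K \subset U : K \text{ is compact} \}$;
• $\mathcal{U}_U := \tilde{\mathcal{U}}_U \setminus \{\emptyset\}$;
• $\mathcal{U}^0_U := \{ C \subset U : C \text{ is nonempty and closed} \}$.

For any open $U_1, \ldots, U_k$ we have:

\[ \tilde{\mathcal{U}}_{U_1} \cap \cdots \cap \tilde{\mathcal{U}}_{U_k} = \tilde{\mathcal{U}}_{U_1 \cap \cdots \cap U_k}; \qquad \mathcal{U}_{U_1} \cap \cdots \cap \mathcal{U}_{U_k} = \mathcal{U}_{U_1 \cap \cdots \cap U_k}; \qquad \mathcal{U}^0_{U_1} \cap \cdots \cap \mathcal{U}^0_{U_k} = \mathcal{U}^0_{U_1 \cap \cdots \cap U_k}. \]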

We now have the following spaces:

• $\tilde{\mathcal{K}}(X)$ is the space of compact subsets of $X$ endowed with the topology with base $\{ \tilde{\mathcal{U}}_U : U \subset X \text{ is open} \}$.
• $\mathcal{K}(X)$ is the space of nonempty compact subsets of $X$ endowed with the topology with base $\{ \mathcal{U}_U : U \subset X \text{ is open} \}$.
• $\mathcal{K}^0(X)$ is the space of nonempty closed subsets of $X$ endowed with the topology with the base $\{ \mathcal{U}^0_U : U \subset X \text{ is open} \}$.

Of course $\mathcal{K}(X)$ has the subspace topology inherited from $\tilde{\mathcal{K}}(X)$.

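For each open $U \subset X$ let:

• $\mathcal{V}_U := \{ K \subset X : K \text{ is compact and } K \cap U \ne \emptyset \}$;
• $\mathcal{V}^0_U := \{ C \subset X : C \text{ is closed and } C \cap U \ne \emptyset \}$.

These give two additional topologies:

• $\mathcal{H}(X)$ is the space of nonempty compact subsets of $X$ endowed with the topology generated by the subbase

\[ \{ \mathcal{U}_U : U \subset X \text{ is open} \} \cup \{ \mathcal{V}_U : U \subset X \text{ is open} \} . \]

• $\mathcal{H}^0(X)$ is the space of nonempty closed subsets of $X$ endowed with the topology generated by the subbase

\[ \{ \mathcal{U}^0_U : U \subset X \text{ is open} \} \cup \{ \mathcal{V}^0_U : U \subset X \text{ is open} \} . \]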

The topologies of $\mathcal{H}(X)$ and $\mathcal{H}^0(X)$ are both called the Vietoris topology.

Roughly, a neighborhood of $K$ in $\tilde{\mathcal{K}}(X)$ or $\mathcal{K}(X)$ consists of those $K'$ that are close to $K$ in the sense that every point in $K'$ is close to some point of $K$. A neighborhood of $K \in \mathcal{H}(X)$ consists of those $K'$ that are close in this sense, and also in the sense that every point in $K$ is close to some point of $K'$. Similar remarks pertain to $\mathcal{K}^0(X)$ and $\mathcal{H}^0(X)$. Section 4.4 develops these intuitions precisely when $X$ is a metric space.

Compact subsets of Hausdorff spaces are closed¹, so "for practical purposes" (i.e., when $X$ is Hausdorff) every compact set is closed. In this case $\mathcal{K}(X)$ and $\mathcal{H}(X)$ have the subspace topologies induced by the topologies of $\mathcal{K}^0(X)$ and $\mathcal{H}^0(X)$.

4.3 Vietoris’ Theorem

An interesting fact, which was proved very early in the history of topology by Vietoris (1923), and which is applied from time to time in mathematical economics, is that $\mathcal{H}(X)$ is compact whenever $X$ is compact.

Lemma 4.2 (Alexander) If $X$ has a subbase such that any cover of $X$ by elements of the subbase has a finite subcover, then $X$ is compact.

Proof Say that a set is basic if it is a finite intersection of elements of the subbasis. Any open cover is refined by the collection of basic sets that are subsets of its elements. If a refinement of an open cover has a finite subcover, then so does the cover, so it suffices to show that any open cover of $X$ by basic sets has a finite subcover.

A collection of open covers is a chain if it is completely ordered by inclusion: for any two covers in the chain, the first is a subset of the second or vice versa. If each open cover in a chain consists of basic sets, and has no finite subcover, then the union of the elements of the chain also has these properties (any finite subset of the union is contained in some member of the chain), so Zorn's lemma implies that if there is one open cover with these properties, then there is a maximal such cover, say $\mathcal{C}$. We will show that each $U \in \mathcal{C}$ is contained in a subbasic set that is also in the cover, so the subbasic sets in $\mathcal{C}$ cover $X$, and by hypothesis there must be a finite subcover after all.

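Fixing a particular $U \in \mathcal{C}$, suppose that $U = V_1 \cap \cdots \cap V_n$ where $V_1, \ldots, V_n$ are in the subbasis. There must be an $i$ such that $\mathcal{C} \cup \{V_i\}$ has no finite subcover. (Otherwise each $\mathcal{C} \cup \{V_i\}$ has a finite subcover $\mathcal{C}_i$, so that $\mathcal{C}_i \setminus \{V_i\}$ covers $X \setminus V_i$, and $\{U\} \cup \bigcup_i (\mathcal{C}_i \setminus \{V_i\})$ is a finite subcover from $\mathcal{C}$ of $U \cup \bigcup_i (X \setminus V_i) = U \cup (X \setminus \bigcap_i V_i) = X$.) Maximality implies that $V_i$ is already in the cover, and of course $U \subset V_i$. □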

Theorem 4.1 If $X$ is compact, then $\mathcal{H}(X)$ is compact.

Proof Suppose that $\{ \mathcal{U}_{U_\alpha} : \alpha \in A \} \cup \{ \mathcal{V}_{V_\beta} : \beta \in B \}$ is an open cover of $\mathcal{H}(X)$ by subbasic sets. Let $D := X \setminus \bigcup_\beta V_\beta$; since $D$ is closed and $X$ is compact, $D$ is compact. We may assume that $D$ is nonempty because otherwise $X = V_{\beta_1} \cup \cdots \cup V_{\beta_n}$ for some $\beta_1, \ldots, \beta_n$, in which case $\mathcal{H}(X) = \mathcal{V}_{V_{\beta_1}} \cup \cdots \cup \mathcal{V}_{V_{\beta_n}}$. In addition, $D$ must be contained in some $U_\alpha$ because otherwise $D$ would not be an element of any $\mathcal{U}_{U_\alpha}$ or any $\mathcal{V}_{V_\beta}$. But then $\{U_\alpha\} \cup \{ V_\beta : \beta \in B \}$ has a finite subcover, say $U_\alpha, V_{\beta_1}, \ldots, V_{\beta_n}$. Any compact set that does not intersect any of the $V_{\beta_i}$ is contained in $U_\alpha$, so

\[ \mathcal{H}(X) = \mathcal{U}_{U_\alpha} \cup \mathcal{V}_{V_{\beta_1}} \cup \cdots \cup \mathcal{V}_{V_{\beta_n}} . \]

We have shown that any cover by subbasic sets has a finite subcover, so the claim follows from Lemma 4.2. □

¹ Proof: fixing a point $y$ in the complement of the compact set $K$, for each $x \in K$ there are disjoint neighborhoods $U_x$ of $x$ and $V_x$ of $y$, $\{U_x\}$ is an open cover of $K$, and if $U_{x_1}, \ldots, U_{x_n}$ is a finite subcover, then $V_{x_1} \cap \cdots \cap V_{x_n}$ is a neighborhood of $y$ that does not intersect $K$.

4.4 Hausdorff Distance

Economists sometimes encounter spaces of compact subsets of a metric space, which are frequently topologized with the Hausdorff metric. In this section we clarify the relationship between that approach and the spaces introduced above. Suppose that $X$ is a metric space with metric $d$. For nonempty compact sets $K, L \subset X$ let

\[ \delta_K(K, L) := \max_{x \in K} \min_{y \in L} d(x, y) . \]

Then for any $K$ and $\varepsilon > 0$ we have

\[ \{ L : \delta_K(L, K) < \varepsilon \} = \{ L : L \subset U_\varepsilon(K) \} = \mathcal{U}_{U_\varepsilon(K)} . \tag{4.1} \]

On the other hand, whenever $K \subset U$ with $K$ compact and $U$ open there is some $\varepsilon > 0$ such that $U_\varepsilon(K) \subset U$ (otherwise we could take sequences $x_1, x_2, \ldots$ in $K$ and $y_1, y_2, \ldots$ in $X \setminus U$ with $d(x_i, y_i) \to 0$, then take convergent subsequences) so $\{ L : \delta_K(L, K) < \varepsilon \} \subset \mathcal{U}_U$. Thus:

Lemma 4.3 When $X$ is a metric space, the sets of the form $\{ L : \delta_K(L, K) < \varepsilon \}$ constitute a base of the topology of $\mathcal{K}(X)$.

The Hausdorff distance between nonempty compact sets $K, L \subset X$ is

\[ \delta_H(K, L) := \max\{ \delta_K(K, L), \delta_K(L, K) \} . \]

This is a metric. Specifically, it is evident that $\delta_H(K, L) = \delta_H(L, K)$, and that $\delta_H(K, L) = 0$ if and only if $K = L$. If $M$ is a third compact set, then

\[ \delta_K(K, M) \le \delta_K(K, L) + \delta_K(L, M) , \]

from which it follows easily that the Hausdorff distance satisfies the triangle inequality.
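For finite subsets of Euclidean space both quantities can be computed directly from the definitions, which makes the asymmetry of the one-sided distance easy to see. The sketch below assumes points are given as tuples of coordinates and uses the Euclidean metric; the function names `delta_K` and `hausdorff` are chosen here to mirror the notation above.

```python
import math

def delta_K(K, L):
    """One-sided distance: the largest distance from a point of K to the set L."""
    return max(min(math.dist(x, y) for y in L) for x in K)

def hausdorff(K, L):
    """Hausdorff distance between finite nonempty sets K and L."""
    return max(delta_K(K, L), delta_K(L, K))
```

With $K = \{(0,0)\}$ and $L = \{(0,0), (3,4)\}$, every point of $K$ lies in $L$, so the one-sided distance from $K$ to $L$ vanishes, while the point $(3,4)$ of $L$ is at distance 5 from $K$; the Hausdorff distance is the larger of the two.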

There is now an ambiguity in our notation, insofar as $U_\varepsilon(L)$ might refer either to the union of the $\varepsilon$-balls around the various points of $L$ or to the set of compact sets whose Hausdorff distance from $L$ is less than $\varepsilon$. Unless stated otherwise, we will always interpret it in the first way, as a set of points and not as a set of sets.

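Proposition 4.1 The Hausdorff distance induces the Vietoris topology on $\mathcal{H}(X)$.

Proof Fix a nonempty compact $K$. We will show that any neighborhood of $K$ in one topology contains a neighborhood in the other topology.

First consider some $\varepsilon > 0$. Choose $x_1, \ldots, x_n \in K$ such that $K \subset \bigcup_i U_{\varepsilon/2}(x_i)$. If $L \cap U_{\varepsilon/2}(x_i) \ne \emptyset$ for all $i$, then $\delta_K(K, L) < \varepsilon$, so, in view of (4.1),

\[ K \in \mathcal{U}_{U_\varepsilon(K)} \cap \mathcal{V}_{U_{\varepsilon/2}(x_1)} \cap \cdots \cap \mathcal{V}_{U_{\varepsilon/2}(x_n)} \subset \{ L : \delta_H(K, L) < \varepsilon \} . \]

We now show that any element of our subbasis for the Vietoris topology that contains $K$ also contains $\{ L : \delta_H(K, L) < \varepsilon \}$ for some $\varepsilon > 0$. If $U$ is an open set containing $K$, then (as we argued above) $U_\varepsilon(K) \subset U$ for some $\varepsilon > 0$, so that

\[ K \in \{ L : \delta_H(L, K) < \varepsilon \} \subset \{ L : \delta_K(L, K) < \varepsilon \} \subset \mathcal{U}_U . \]

If $V$ is open with $K \cap V \ne \emptyset$, then we can choose $x \in K \cap V$ and $\varepsilon > 0$ small enough that $U_\varepsilon(x) \subset V$. Then

\[ K \in \{ L : \delta_H(K, L) < \varepsilon \} \subset \{ L : \delta_K(K, L) < \varepsilon \} \subset \mathcal{V}_V . \]

□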

Combining this with Theorem 4.1 gives:

Corollary 4.1 If $X$ is a compact metric space, then $\mathcal{H}(X)$ is a compact metric space.

4.5 Basic Operations on Subsets

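In this section we develop certain basic properties of the topologies defined in Sect. 4.2. To achieve a more unified presentation, it will be useful to let $\mathcal{T}$ denote a generic element of $\{ \tilde{\mathcal{K}}, \mathcal{K}, \mathcal{K}^0, \mathcal{H}, \mathcal{H}^0 \}$. That is, $\mathcal{T}(X)$ will denote one of the spaces $\tilde{\mathcal{K}}(X)$, $\mathcal{K}(X)$, $\mathcal{K}^0(X)$, $\mathcal{H}(X)$, and $\mathcal{H}^0(X)$, with the range of allowed interpretations indicated in each context. Similarly, $\mathcal{W}$ will denote a generic element of $\{ \tilde{\mathcal{U}}, \mathcal{U}, \mathcal{U}^0, \mathcal{V}, \mathcal{V}^0 \}$.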

We will frequently apply the following simple fact.

Lemma 4.4 If $Y$ is a second topological space, $f : Y \to X$ is a function, and $\mathcal{B}$ is a subbase for $X$ such that $f^{-1}(V)$ is open for every $V \in \mathcal{B}$, then $f$ is continuous.

Proof For any sets $S_1, \ldots, S_k \subset X$ we have $f^{-1}(\bigcap_i S_i) = \bigcap_i f^{-1}(S_i)$, and for any collection $\{T_i\}_{i \in I}$ of subsets of $X$ we have $f^{-1}(\bigcup_i T_i) = \bigcup_i f^{-1}(T_i)$. Thus the preimage of a union of finite intersections of elements of $\mathcal{B}$ is open, because it is a union of finite intersections of open subsets of $Y$. □

4.5.1 Continuity of Union

The function taking a pair of sets to their union is as well behaved as one might hope.

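Lemma 4.5 For any $\mathcal{T} \in \{ \tilde{\mathcal{K}}, \mathcal{K}, \mathcal{K}^0, \mathcal{H}, \mathcal{H}^0 \}$ the function

\[ \upsilon : (K_1, K_2) \mapsto K_1 \cup K_2 \]

is a continuous function from $\mathcal{T}(X) \times \mathcal{T}(X)$ to $\mathcal{T}(X)$.

Proof Applying Lemma 4.4, it suffices to show that preimages of subbasic open sets are open. For $\mathcal{T} \in \{ \tilde{\mathcal{K}}, \mathcal{K}, \mathcal{K}^0 \}$ it suffices to note that

\[ \upsilon^{-1}(\mathcal{W}_U) = \mathcal{W}_U \times \mathcal{W}_U \]

for all three $\mathcal{W} \in \{ \tilde{\mathcal{U}}, \mathcal{U}, \mathcal{U}^0 \}$. For $\mathcal{T} \in \{ \mathcal{H}, \mathcal{H}^0 \}$ we also need to observe that

\[ \upsilon^{-1}(\mathcal{W}_U) = (\mathcal{W}_U \times \mathcal{T}(X)) \cup (\mathcal{T}(X) \times \mathcal{W}_U) \]

for both $\mathcal{W} \in \{ \mathcal{V}, \mathcal{V}^0 \}$. □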

4.5.2 Continuity of Intersection

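Simple examples show that intersection is not a continuous operation for the topologies $\mathcal{H}$ and $\mathcal{H}^0$, so the only issues here concern $\tilde{\mathcal{K}}$, $\mathcal{K}$, and $\mathcal{K}^0$. For a nonempty closed set $A \subset X$ let $\mathcal{K}_A(X)$ and $\mathcal{K}^0_A(X)$ be the sets of compact and closed subsets of $X$ that have nonempty intersection with $A$.

Lemma 4.6 If $A \subset X$ is closed, the function $K \mapsto K \cap A$ from $\mathcal{K}_A(X)$ to $\mathcal{K}(A)$ and the function $C \mapsto C \cap A$ from $\mathcal{K}^0_A(X)$ to $\mathcal{K}^0(A)$ are continuous.

Proof If $V \subset A$ is open, then the set of compact (closed) $K$ such that $\emptyset \ne K \cap A \subset V$ is $\mathcal{U}_{V \cup (X \setminus A)} \cap \mathcal{K}_A(X)$ ($\mathcal{U}^0_{V \cup (X \setminus A)} \cap \mathcal{K}^0_A(X)$), which is open in the relative topology. □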

Joint continuity of the map $(C, D) \mapsto C \cap D$ requires an additional hypothesis.

Lemma 4.7 If $X$ is a normal space, then $\iota : (C, D) \mapsto C \cap D$ is a continuous function from $\{ (C, D) \in \mathcal{K}^0(X) \times \mathcal{K}^0(X) : C \cap D \ne \emptyset \}$ to $\mathcal{K}^0(X)$.

Proof By Lemma 4.4 it suffices to show that, for any open $U \subset X$, $\iota^{-1}(\mathcal{U}^0_U)$ is open. For any $(C, D)$ in this set normality implies that there are disjoint open sets $V$ and $W$ containing $C \setminus U$ and $D \setminus U$ respectively. Then $(U \cup V) \cap (U \cup W) = U$, so

\[ (C, D) \in \{ (C', D') \in \mathcal{U}^0_{U \cup V} \times \mathcal{U}^0_{U \cup W} : C' \cap D' \ne \emptyset \} \subset \iota^{-1}(\mathcal{U}^0_U) . \]

□

Lemma 4.8 If $X$ is a T1 normal space, then $\iota : \tilde{\mathcal{K}}(X) \times \tilde{\mathcal{K}}(X) \to \tilde{\mathcal{K}}(X)$ is continuous.

Proof By Lemma 4.4 it suffices to show that, for any open $U \subset X$, $\iota^{-1}(\tilde{\mathcal{U}}_U)$ is open. Since $X$ is T1 and normal, it is a Hausdorff space, so compact sets are closed. For any $(K, L)$ in $\iota^{-1}(\tilde{\mathcal{U}}_U)$ normality implies that there are disjoint open sets $V$ and $W$ containing $K \setminus U$ and $L \setminus U$ respectively. Then $(U \cup V) \cap (U \cup W) = U$, so

\[ (K, L) \in \tilde{\mathcal{U}}_{U \cup V} \times \tilde{\mathcal{U}}_{U \cup W} \subset \iota^{-1}(\tilde{\mathcal{U}}_U) . \]

□

The restriction of a continuous function to a subdomain is continuous when the subdomain has the relative topology, and it remains continuous if the range is replaced by any subset that contains the image, again with its relative topology, so:

Lemma 4.9 If $X$ is a T1 normal space, then

\[ \iota : \{ (K, L) \in \mathcal{K}(X) \times \mathcal{K}(X) : K \cap L \ne \emptyset \} \to \mathcal{K}(X) \]

is continuous.

4.5.3 Singletons

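Lemma 4.10 The function $\eta : x \mapsto \{x\}$ is a continuous function from $X$ to $\mathcal{T}(X)$ when $\mathcal{T} \in \{ \mathcal{K}, \mathcal{H} \}$. If, in addition, $X$ is a T1-space, then it is continuous when $\mathcal{T} \in \{ \mathcal{K}^0, \mathcal{H}^0 \}$.

Proof Singletons are always compact, so for any open $U$ we have $\eta^{-1}(\mathcal{U}_U) = \eta^{-1}(\mathcal{V}_U) = U$. If $X$ is T1, then singletons are closed, so $\eta^{-1}(\mathcal{U}^0_U) = \eta^{-1}(\mathcal{V}^0_U) = U$. □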

4.5.4 Continuity of the Cartesian Product

In addition to $X$, we now let $Y$ be another given topological space. A simple example shows that the cartesian product $\pi_0 : (C, D) \mapsto C \times D$ is not a continuous function from $\mathcal{H}^0(X) \times \mathcal{H}^0(Y)$ to $\mathcal{H}^0(X \times Y)$. Suppose $X = Y = \mathbb{R}$, $(C, D) = (X, \{0\})$, and

\[ W = \{ (x, y) : |y| < (1 + x^2)^{-1} \} . \]

It is easy to see that there is no neighborhood $\mathcal{V} \subset \mathcal{H}^0(Y)$ of $D$ such that $\pi_0(C, D') \in \mathcal{U}^0_W$ (that is, $\mathbb{R} \times D' \subset W$) for all $D' \in \mathcal{V}$.
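The omitted step can be spelled out as follows; this is a sketch of one way to see it, and the point $y$ is introduced only for this illustration. A basic neighborhood of $\{0\}$ in the Vietoris topology is determined by finitely many open sets that each contain 0, so their intersection contains some $y \ne 0$, and the singleton $\{y\}$ then belongs to that neighborhood. For such a $y$ the line $\mathbb{R} \times \{y\}$ leaves $W$:

```latex
x := |y|^{-1/2}
\;\Longrightarrow\;
(1 + x^2)^{-1} = \frac{|y|}{1 + |y|} < |y|
\;\Longrightarrow\;
(x, y) \in (\mathbb{R} \times \{y\}) \setminus W .
```

Hence every neighborhood of $D$ contains some $D'$ with $\mathbb{R} \times D' \not\subset W$.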

For compact sets there are positive results. In preparation for them we recall a basic fact about the product topology.

Lemma 4.11 If $K \subset X$ and $L \subset Y$ are compact, and $W \subset X \times Y$ is a neighborhood of $K \times L$, then there are neighborhoods $U$ of $K$ and $V$ of $L$ such that $U \times V \subset W$.

Proof By the definition of the product topology, for each $(x, y) \in K \times L$ there are neighborhoods $U_{(x,y)}$ and $V_{(x,y)}$ of $x$ and $y$ such that $U_{(x,y)} \times V_{(x,y)} \subset W$. For each $x \in K$ we can find $y_1, \ldots, y_n$ such that $L \subset V_x := \bigcup_j V_{(x,y_j)}$, and we can then let $U_x := \bigcap_j U_{(x,y_j)}$. Now choose $x_1, \ldots, x_m$ such that $K \subset U := \bigcup_i U_{x_i}$, and let $V := \bigcap_i V_{x_i}$. □

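Proposition 4.2 For $\mathcal{T} \in \{ \tilde{\mathcal{K}}, \mathcal{K}, \mathcal{H} \}$ the function $\pi : (K, L) \mapsto K \times L$ is a continuous function from $\mathcal{T}(X) \times \mathcal{T}(Y)$ to $\mathcal{T}(X \times Y)$.

Proof Let $K \subset X$ and $L \subset Y$ be compact. If $W$ is a neighborhood of $K \times L$ and $U$ and $V$ are open neighborhoods of $K$ and $L$ with $U \times V \subset W$, then

\[ (K, L) \in \mathcal{U}_U \times \mathcal{U}_V \subset \pi^{-1}(\mathcal{U}_W) . \]

By Lemma 4.4, this establishes the asserted continuity when $\mathcal{T} \in \{ \tilde{\mathcal{K}}, \mathcal{K} \}$.

To demonstrate continuity when $\mathcal{T} = \mathcal{H}$ we must also show that $\pi^{-1}(\mathcal{V}_W)$ is open in $\mathcal{H}(X) \times \mathcal{H}(Y)$ whenever $W \subset X \times Y$ is open. Suppose that $(K \times L) \cap W \ne \emptyset$. Choose $(x, y) \in (K \times L) \cap W$, and choose open neighborhoods $U$ and $V$ of $x$ and $y$ with $U \times V \subset W$. Then

\[ (K, L) \in \mathcal{V}_U \times \mathcal{V}_V \subset \pi^{-1}(\mathcal{V}_W) . \]

□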

4.5.5 The Action of a Function

Now fix a continuous function $f : X \to Y$. Then $f$ maps compact sets to compact sets while $f^{-1}(D)$ is closed whenever $D \subset Y$ is closed. The first of these operations is as well behaved as one might hope.

Lemma 4.12 If $\mathcal{T} \in \{ \tilde{\mathcal{K}}, \mathcal{K}, \mathcal{H} \}$, then $\varphi_f : K \mapsto f(K)$ is a continuous function from $\mathcal{T}(X)$ to $\mathcal{T}(Y)$.

Proof Preimages of subbasic open sets are open: for any open $V \subset Y$ we have $\varphi_f^{-1}(\mathcal{W}_V) = \mathcal{W}_{f^{-1}(V)}$ for all $\mathcal{W} \in \{ \tilde{\mathcal{U}}, \mathcal{U}, \mathcal{V} \}$. □

There is the following consequence for closed sets.

Lemma 4.13 If $X$ is compact, $Y$ is Hausdorff, and $\mathcal{T} \in \{ \mathcal{K}, \mathcal{H} \}$, $\mathcal{T}^0 = \mathcal{K}^0$ if $\mathcal{T} = \mathcal{K}$, and $\mathcal{T}^0 = \mathcal{H}^0$ if $\mathcal{T} = \mathcal{H}$, then $\varphi_f : K \mapsto f(K)$ is a continuous function from $\mathcal{T}^0(X)$ to $\mathcal{T}^0(Y)$.

Proof Recall that a closed subset of a compact space $X$ is compact², so that $\mathcal{T}^0(X) \subset \mathcal{T}(X)$. As we mentioned earlier, $\mathcal{T}^0(X)$ has the relative topology induced by the topology of $\mathcal{T}(X)$, so the last result implies that $\varphi_f$ is a continuous function from $\mathcal{T}^0(X)$ to $\mathcal{T}(Y)$. The proof is completed by recalling that a compact subset of a Hausdorff space $Y$ is closed, so that $\mathcal{T}(Y) \subset \mathcal{T}^0(Y)$. □

If $f$ is surjective there is a well defined function $\psi_f : D \mapsto f^{-1}(D)$ from $\mathcal{K}^0(Y)$ to $\mathcal{K}^0(X)$. We need an additional hypothesis to guarantee that it is continuous. Recall that a function is closed if it is continuous and maps closed sets to closed sets.

Lemma 4.14 If $f$ is a surjective closed map, then $\psi_f : D \mapsto f^{-1}(D)$ is a continuous function from $\mathcal{K}^0(Y)$ to $\mathcal{K}^0(X)$.

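Proof For an open $U \subset X$, we claim that $\psi_f^{-1}(\mathcal{U}^0_U) = \mathcal{U}^0_{Y \setminus f(X \setminus U)}$. First of all, $Y \setminus f(X \setminus U)$ is open because $f$ is a closed map. If $D \subset Y \setminus f(X \setminus U)$ is closed, then $f^{-1}(D)$ is a closed subset of $U$. Thus $\mathcal{U}^0_{Y \setminus f(X \setminus U)} \subset \psi_f^{-1}(\mathcal{U}^0_U)$. On the other hand, if $D \subset Y$ is closed and $f^{-1}(D) \subset U$, then $D \cap f(X \setminus U) = \emptyset$. Thus $\psi_f^{-1}(\mathcal{U}^0_U) \subset \mathcal{U}^0_{Y \setminus f(X \setminus U)}$. □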

When $X$ is compact and $Y$ is Hausdorff, any continuous $f : X \to Y$ is closed, because any closed subset of $X$ is compact, so its image is compact and consequently closed. Here is an example illustrating how the assumption that $f$ is closed is indispensable.

Example 4.1 Suppose $0 < \varepsilon < \pi$, let $X := (-\varepsilon, 2\pi + \varepsilon)$ and $Y := \{ z \in \mathbb{C} : |z| = 1 \}$, and let $f : X \to Y$ be the function $f(t) := e^{it}$. The function $\psi_f : D \mapsto f^{-1}(D)$ is discontinuous at $D_0 = \{ e^{it} : \varepsilon \le t \le 2\pi - \varepsilon \}$ because for any open $V$ containing $D_0$ there are closed $D \subset V$ such that $f^{-1}(D)$ includes points far from $f^{-1}(D_0) = [\varepsilon, 2\pi - \varepsilon]$.

4.5.6 The Union of the Elements

Whenever we have a set of subsets of some space, we can take the union of its elements. For any open $U \subset X$ we have $\bigcup_{K \in \mathcal{U}_U} K = U$ because for each $x \in U$, $\{x\}$ is compact. Since the sets $\mathcal{U}_U$ are a base for the topology of $\mathcal{K}(X)$, it follows that the union of all elements of an open subset of $\mathcal{K}(X)$ is open. If $U$ and $V_1, \ldots, V_k$ are open, then $\mathcal{U}_U \cap \mathcal{V}_{V_1} \cap \cdots \cap \mathcal{V}_{V_k} = \emptyset$ if there is some $j$ with $U \cap V_j = \emptyset$, and otherwise

\[ \{x, y_1, \ldots, y_k\} \in \mathcal{U}_U \cap \mathcal{V}_{V_1} \cap \cdots \cap \mathcal{V}_{V_k} \]

whenever $x \in U$ and $y_1 \in V_1 \cap U, \ldots, y_k \in V_k \cap U$, so the union of all $K \in \mathcal{U}_U \cap \mathcal{V}_{V_1} \cap \cdots \cap \mathcal{V}_{V_k}$ is again $U$. Therefore the union of all the elements of an open subset of $\mathcal{H}(X)$ is open. If $X$ is either T1 or regular, then a similar logic shows that for either $\mathcal{T} \in \{ \mathcal{K}^0, \mathcal{H}^0 \}$ the union of the elements of an open subset of $\mathcal{T}(X)$ is open.

² Proof: an open cover of the subset, together with its complement, is an open cover of the space, any finite subcover of which yields a finite subcover of the subset.

If a subset $\mathcal{C}$ of $\mathcal{H}(X)$ or $\mathcal{H}^0(X)$ is compact, then it is automatically compact in the coarser topology of $\mathcal{K}(X)$ or $\mathcal{K}^0(X)$. Therefore the following two results imply the analogous claims for $\mathcal{H}(X)$ and $\mathcal{H}^0(X)$, which are already interesting.

Lemma 4.15 If $S \subset \mathcal{K}(X)$ is compact, then $L := \bigcup_{K \in S} K$ is compact.

Proof Let $\{ U_\alpha : \alpha \in A \}$ be an open cover of $L$. For each $K \in S$ let $V_K$ be the union of the elements of some finite subcover of $K$. Then $K \in \mathcal{U}_{V_K}$, so $\{ \mathcal{U}_{V_K} : K \in S \}$ is an open cover of $S$; let $\mathcal{U}_{V_{K_1}}, \ldots, \mathcal{U}_{V_{K_r}}$ be a finite subcover. Then $L \subset \bigcup_{i=1}^{r} V_{K_i}$, and the various sets from $\{U_\alpha\}$ that were united to form the $V_{K_i}$ are the desired finite subcover of $L$. □

Lemma 4.16 If $X$ is regular and $S \subset \mathcal{K}^0(X)$ is compact, then $D := \bigcup_{C \in S} C$ is closed.

Proof We will show that $X \setminus D$ is open; let $x$ be a point in this set. Each element of $S$ is a closed set that does not contain $x$, so (since $X$ is regular) it is an element of $\mathcal{U}^0_{X \setminus N}$ for some closed neighborhood $N$ of $x$. Since $S$ is compact we have $S \subset \mathcal{U}^0_{X \setminus N_1} \cup \cdots \cup \mathcal{U}^0_{X \setminus N_k}$ for some $N_1, \ldots, N_k$. Then $N_1 \cap \cdots \cap N_k$ is a neighborhood of $x$ that does not intersect any element of $S$, so $x$ is in the interior of $X \setminus D$ as desired. □

Exercises

4.1 (Tychonoff's Theorem) Let $\{X_i\}_{i \in I}$ be a collection of compact topological spaces, and let $X := \prod_i X_i$. For each $i$ let $\pi_i : X \to X_i$ be the projection $x \mapsto x_i$. The product topology on $X$ is the topology generated by the subbase $\{ \pi_i^{-1}(U_i) : i \in I \text{ and } U_i \subset X_i \text{ is open} \}$. Prove that $X$ with the product topology is compact. (By Lemma 4.2 it suffices to show that if a collection of subbasic sets $\mathcal{C}$ does not have a finite subset that covers $X$, then it does not cover $X$. Note that $\mathcal{C} = \bigcup_i \{ \pi_i^{-1}(U_i) : U_i \in \mathcal{C}_i \}$ where $\mathcal{C}_i$ is the set of $U_i$ such that $\pi_i^{-1}(U_i) \in \mathcal{C}$.)

4.2 Give an example illustrating why the map $(K, L) \mapsto K \cap L$ from $\{ (K, L) \in \mathcal{H}(X) \times \mathcal{H}(X) : K \cap L \ne \emptyset \}$ to $\mathcal{H}(X)$ is not continuous.

4.3 For open $U_1, \ldots, U_n \subset X$ let $\langle U_1, \ldots, U_n \rangle$ be the set of nonempty closed $C \subset X$ such that $C \subset \bigcup_i U_i$ and $C \cap U_i \ne \emptyset$ for all $i$. Prove that the topology on the set of closed nonempty subsets of $X$ generated by the subbasis of sets $\langle U_1, \ldots, U_n \rangle$ coincides with the Vietoris topology.

4.4 We now establish some separation properties:

(a) Prove that $X$ is regular if and only if $\mathcal{H}^0(X)$ is Hausdorff.
(b) Prove that $X$ is Hausdorff if and only if $\mathcal{H}(X)$ is Hausdorff.
(c) Prove that $X$ is regular if and only if $\mathcal{H}(X)$ is regular. (This is more challenging.)

4.5 Following Michael (1951), say that a topology on the set of nonempty closed subsets of $X$ is acceptable if, for every closed $A \subset X$, the set of nonempty closed subsets of $A$ is closed, and for every open $U \subset X$, the set of nonempty closed subsets of $U$ is open.

(a) Prove that the topology of $\mathcal{H}^0(X)$ is the coarsest acceptable topology.
(b) Under what conditions on $X$ is the topology of $\mathcal{H}(X)$ the coarsest such that, for every closed $A \subset X$, the set of nonempty compact subsets of $A$ is closed, and for every open $U \subset X$, the set of nonempty compact subsets of $U$ is open?

4.6 A topological space $X$ is locally compact at $x \in X$ if every neighborhood of $x$ contains a compact neighborhood, and $X$ is locally compact if it is locally compact at each of its points. Prove that if $X$ is locally compact, then $\mathcal{H}(X)$ is open in $\mathcal{H}^0(X)$.

4.7 A topological space $X$ is totally disconnected if, for any distinct $x, y \in X$, there are $A, B \subset X$ that are both open and closed such that $x \in A$, $y \in B$, and $A \cap B = \emptyset$. Prove that $\mathcal{H}(X)$ is totally disconnected if and only if $X$ is totally disconnected.

Chapter 5
Topologies on Functions and Correspondences

We now study correspondences systematically. A compact valued correspondence from $X$ to $Y$ may be viewed as a function from $X$ to $\mathcal{H}(Y)$, and we naturally wish to compare continuity of this function with the usual continuity properties of correspondences. Section 5.1 considers these questions, and also the relationship between continuity concepts and the topological properties of the graph of the correspondence.

In order to study the robustness of fixed points, or sets of fixed points, with respect to perturbations of the function or correspondence, one must specify topologies on the relevant spaces of functions and correspondences. We do this by identifying a function or correspondence with its graph, so that the topologies from the last chapter can be invoked. The definitions of upper and lower hemicontinuity, and their basic properties, are given in Sect. 5.1. There are two topologies on the space of upper hemicontinuous correspondences from $X$ to $Y$. The strong upper topology, which is defined and discussed in Sect. 5.2, turns out to be rather poorly behaved, and the weak upper topology, which is usually at least as coarse, is presented in Sect. 5.3. When $X$ is compact the strong upper topology coincides with the weak upper topology.

We will frequently appeal to a perspective in which a homotopy $h : X \times [0, 1] \to Y$ is understood as a continuous function $t \mapsto h_t$ from $[0, 1]$ to the space of continuous functions from $X$ to $Y$. Section 5.4 presents the underlying principle in full generality for correspondences. The specializations to functions of the strong and weak upper topologies are known as the strong topology and the weak topology respectively. If $X$ is regular, then the weak topology coincides with the compact-open topology, and when $X$ is compact the strong and weak topologies coincide. Section 5.5 discusses these matters, and presents some results for functions that are not consequences of more general results pertaining to correspondences.

The strong upper topology plays an important role in the development of the topic, and its definition provides an important characterization of the weak upper topology when the domain is compact, but it does not have any independent significance. Throughout the rest of the book, barring an explicit counterindication, the space of upper hemicontinuous correspondences from $X$ to $Y$ will be endowed with the weak upper topology, and the space of continuous functions from $X$ to $Y$ will be endowed with the weak topology.

5.1 Upper and Lower Hemicontinuity

Let X and Y be topological spaces. Recall that a correspondence F : X → Y maps each x ∈ X to a nonempty F(x) ⊂ Y. The graph of F is

Gr(F) := { (x, y) ∈ X × Y : y ∈ F(x) } .

If each F(x) is compact (closed, convex, etc.) then F is compact valued (closed valued, convex valued, etc.).

We say that F is upper hemicontinuous if it is compact valued and, for any x ∈ X and open set V ⊂ Y containing F(x), there is a neighborhood U of x such that F(x′) ⊂ V for all x′ ∈ U. We say that F is lower hemicontinuous if, for each x ∈ X, y ∈ F(x), and neighborhood V of y, there is a neighborhood U of x such that F(x′) ∩ V ≠ ∅ for all x′ ∈ U. If F is both upper and lower hemicontinuous, then it is said to be continuous.
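The two definitions can be explored numerically. The following Python sketch is only an illustration under stated assumptions: values F(x) ⊂ R are represented by closed intervals (lo, hi), open sets V by open intervals (a, b), and the names F_upper, F_lower, uhc_witness, and lhc_witness are hypothetical helpers rather than notation from the text. Checking finitely many sample points cannot prove hemicontinuity, but it can exhibit a failure.

```python
# Closed intervals (lo, hi) stand in for compact values F(x) in R.
# All names below are illustrative assumptions, not notation from the text.

def F_upper(x):
    """F(0) = [-1, 1] and F(x) = {0} otherwise."""
    return (-1.0, 1.0) if x == 0 else (0.0, 0.0)

def F_lower(x):
    """F(0) = {0} and F(x) = [-1, 1] otherwise."""
    return (0.0, 0.0) if x == 0 else (-1.0, 1.0)

def inside(interval, V):
    """True if the closed interval lies in the open interval V = (a, b)."""
    lo, hi = interval
    a, b = V
    return a < lo and hi < b

def meets(interval, V):
    """True if the closed interval intersects the open interval V = (a, b)."""
    lo, hi = interval
    a, b = V
    return lo < b and a < hi

def nearby(x0, delta, n=200):
    """Sample points of (x0 - delta, x0 + delta), including x0 itself."""
    return [x0 + delta * k / (n + 1) for k in range(-n, n + 1)]

def uhc_witness(F, x0, V, delta):
    """Does F(x) stay inside V at every sampled x within delta of x0?"""
    return all(inside(F(x), V) for x in nearby(x0, delta))

def lhc_witness(F, x0, V, delta):
    """Does F(x) meet V at every sampled x within delta of x0?"""
    return all(meets(F(x), V) for x in nearby(x0, delta))
```

In this sketch F_upper passes the upper test at 0 for V = (−2, 2) but fails the lower test for the neighborhood (0.5, 1.5) of the point 1 ∈ F_upper(0), while F_lower fails the upper test for V = (−0.5, 0.5) and passes the lower test for the neighborhood (−0.1, 0.1) of 0.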

When F is compact valued, it is upper hemicontinuous if and only if F^{-1}(U_V) is open whenever V ⊂ Y is open. Thus:

Lemma 5.1 A compact valued correspondence F : X → Y is upper hemicontinuous if and only if it is continuous when regarded as a function from X to K(Y).

When F is compact valued, it is lower hemicontinuous if and only if F^{-1}(V_V) is open whenever V ⊂ Y is open. Combining this with Lemma 5.1 gives:

Proposition 5.1 A compact valued correspondence F : X → Y is continuous if and only if it is continuous when regarded as a function from X to H(Y).

In the economics literature the graph being closed in X × Y is sometimes presented as the definition of upper hemicontinuity. Useful intuitions and simple arguments flow from this point of view, so we should understand precisely when it is justified.

Proposition 5.2 If F is upper hemicontinuous and Y is a Hausdorff space, then Gr(F) is closed.

Proof We show that the complement of the graph is open. Suppose (x, y) ∉ Gr(F). Since Y is Hausdorff, y and each point z ∈ F(x) have disjoint neighborhoods V_z and W_z. Since F(x) is compact, F(x) ⊂ W_{z_1} ∪ ⋯ ∪ W_{z_k} for some z_1, …, z_k. Then V := V_{z_1} ∩ ⋯ ∩ V_{z_k} and W := W_{z_1} ∪ ⋯ ∪ W_{z_k} are disjoint neighborhoods of y and F(x) respectively. If U is a neighborhood of x with F(x′) ⊂ W for all x′ ∈ U, then U × V is a neighborhood of (x, y) that does not intersect Gr(F). □


If Y is not compact, then a compact valued correspondence F : X → Y with a closed graph need not be upper hemicontinuous. For example, suppose X = Y = R, F(0) = {0}, and F(t) = {1/t} when t ≠ 0.
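To see the failure concretely: the graph of this F is the union of the point (0, 0) and the hyperbola { (t, 1/t) : t ≠ 0 }, which is closed in R^2, yet V = (−1, 1) is an open set containing F(0) that no neighborhood of 0 respects. The Python sketch below checks this for a few radii; the function names and the default choice of V are assumptions made for illustration.

```python
def F(t):
    """The single point of F(t): F(0) = {0} and F(t) = {1/t} for t != 0."""
    return 0.0 if t == 0 else 1.0 / t

def escapes(delta, V=(-1.0, 1.0)):
    """Return some t with 0 < t < delta and F(t) outside V, or None if this t fails."""
    a, b = V
    t = min(delta, 1.0) / 2   # t <= 1/2, so 1/t >= 2 lies outside (-1, 1)
    return t if not (a < F(t) < b) else None
```

For the default V the function always finds a witness, so no neighborhood U of 0 satisfies F(x′) ⊂ V for all x′ ∈ U.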

Proposition 5.3 If Y is compact and Gr(F) is closed, then F is upper hemicontinuous.

Proof Fix x ∈ X. Since (X × Y) \ Gr(F) is open, for each y ∈ Y \ F(x) we can choose neighborhoods U_y of x and V_y of y such that (U_y × V_y) ∩ Gr(F) = ∅. In particular, Y \ F(x) = ⋃_{y ∈ Y \ F(x)} V_y is open, so F(x) is closed and therefore compact. Thus F is compact valued.

Now fix an open neighborhood V of F(x). Since Y \ V is a closed subset of a compact space, hence compact, there are y_1, …, y_k ∈ Y \ V such that Y \ V ⊂ V_{y_1} ∪ … ∪ V_{y_k}. Then F(x′) ⊂ V for all x′ ∈ U_{y_1} ∩ … ∩ U_{y_k}. □

Proposition 5.4 If F is upper hemicontinuous and X is compact, then Gr(F) is compact.

Proof We have the following implications of earlier results:

• Lemma 4.10 implies that the function x ↦ {x} ∈ K(X) is continuous;
• Lemma 5.1 implies that F is continuous, as a function from X to K(Y);
• Proposition 4.2 states that (K, L) ↦ K × L is a continuous function from K(X) × K(Y) to K(X × Y).

Together these imply that F̃ : x ↦ {x} × F(x) is continuous, as a function from X to K(X × Y). Since X is compact, it follows that F̃(X) is a compact subset of K(X × Y), so Lemma 4.15 implies that Gr(F) = ⋃_{x ∈ X} F̃(x) is compact. □

5.2 The Strong Upper Topology

Let X and Y be topological spaces with Y Hausdorff, and let U(X,Y) be the set of upper hemicontinuous correspondences from X to Y. Proposition 5.2 ensures that the graph of each F ∈ U(X,Y) is closed, so there is an embedding F ↦ Gr(F) of U(X,Y) in K^0(X × Y). The strong upper topology of U(X,Y) is the topology induced by this embedding. Let U_S(X,Y) be U(X,Y) endowed with this topology. Since { U^0_V : V ⊂ X × Y is open } is a subbase for K^0(X × Y), there is a subbase of U_S(X,Y) consisting of the sets of the form { F : Gr(F) ⊂ V }.

Naturally the following result is quite important.

Theorem 5.1 If Y is a Hausdorff space and X is a compact subset of Y, then

ℱ : U_S(X,Y) → K̃(X)

is continuous.


Proof Since Y is Hausdorff, X and Δ := { (x, x) : x ∈ X } are closed subsets of Y and X × Y respectively. For each F ∈ U_S(X,Y), ℱ(F) is the projection of Gr(F) ∩ Δ onto the first coordinate. Since Gr(F) is compact (Proposition 5.4) so is Gr(F) ∩ Δ, and the projection is continuous, so ℱ(F) is compact. The definition of the strong upper topology implies that Gr(F) is a continuous function of F. Since Δ is closed in X × Y, Lemma 4.6 implies that Gr(F) ∩ Δ is a continuous function of F, after which Lemma 4.12 implies that ℱ(F) is a continuous function of F. □

The basic operations for combining given correspondences to create new correspondences are restriction to a subset of the domain, cartesian products, and composition. We now study the continuity of these constructions.

Lemma 5.2 If A is a closed subset of X, then the map F ↦ F|_A is continuous as a function from U_S(X,Y) to U_S(A,Y).

Proof Since A × Y is a closed subset of X × Y, continuity as a function from U_S(X,Y) to U_S(A,Y) (that is, continuity of Gr(F) ↦ Gr(F) ∩ (A × Y)) follows immediately from Lemma 4.6. □

An additional hypothesis is required to obtain continuity of restriction to a compact subset of the domain, but in this case we obtain a kind of joint continuity.

Lemma 5.3 If X is regular, then the map (F, K) ↦ Gr(F|_K) is a continuous function from U_S(X,Y) × K(X) to K(X × Y). In particular, for any fixed K the map F ↦ F|_K is a continuous function from U_S(X,Y) to U_S(K,Y).

Proof Fix F ∈ U_S(X,Y), K ∈ K(X), and an open neighborhood W of Gr(F|_K). For each x ∈ K Lemma 4.11 gives neighborhoods U_x of x and V_x of F(x) with U_x × V_x ⊂ W. Choose x_1, …, x_k such that U := U_{x_1} ∪ … ∪ U_{x_k} contains K. Since X is regular, each point in K has a closed neighborhood contained in U, and the interiors of finitely many of these cover K, so K has a closed neighborhood C contained in U. Let

W′ := (U_{x_1} × V_{x_1}) ∪ … ∪ (U_{x_k} × V_{x_k}) ∪ ((X \ C) × Y).

Then (K, Gr(F)) ∈ U_{int C} × U_{W′}, and whenever (K′, Gr(F′)) ∈ U_{int C} × U_{W′} we have

Gr(F′|_{K′}) ⊂ W′ ∩ (C × Y) ⊂ (U_{x_1} × V_{x_1}) ∪ … ∪ (U_{x_k} × V_{x_k}) ⊂ W. □

Let X′ and Y′ be two other topological spaces with Y′ Hausdorff. Since the map (C, D) ↦ C × D is not a continuous operation on closed sets, we should not expect the function (F, F′) ↦ F × F′ from U_S(X,Y) × U_S(X′,Y′) to U_S(X × X′, Y × Y′) to be continuous, and indeed, after giving the matter a bit of thought, the reader should be able to construct a neighborhood of the graph of the function (x, x′) ↦ (0, 0) that shows that the map (F, F′) ↦ F × F′ from U_S(R,R) × U_S(R,R) to U_S(R^2, R^2) is not continuous.

We now turn our attention to composition. Suppose that, in addition to X and Y, we have a third topological space Z that is Hausdorff. (We continue to assume that Y is Hausdorff.) We can define a composition operation (F, G) ↦ G ◦ F from U(X,Y) × U(Y,Z) to U(X,Z) by letting

G(F(x)) := ⋃_{y ∈ F(x)} G(y).

That is, G(F(x)) is the projection onto Z of Gr(G|_{F(x)}), which is compact by Proposition 5.4, so G(F(x)) is compact. Thus G ◦ F is compact valued. To show that G ◦ F is upper hemicontinuous, consider an x ∈ X, and let W be a neighborhood of G(F(x)). For each y ∈ F(x) there is an open neighborhood V_y of y such that G(y′) ⊂ W for all y′ ∈ V_y. Setting V := ⋃_{y ∈ F(x)} V_y, we have G(y) ⊂ W for all y ∈ V. If U is a neighborhood of x such that F(x′) ⊂ V for all x′ ∈ U, then G(F(x′)) ⊂ W for all x′ ∈ U.

We can also define G ◦ F to be the correspondence whose graph is

π_{X×Z}((Gr(F) × Z) ∩ (X × Gr(G)))

where π_{X×Z} : X × Y × Z → X × Z is the projection. This definition involves set operations that are not continuous, so we should suspect that (F, G) ↦ G ◦ F is not a continuous function from U_S(X,Y) × U_S(Y,Z) to U_S(X,Z). For a concrete example let X = Y = Z := R, and let f and g both be the constant function with value zero. If U and V are neighborhoods of the graphs of f and g, there are δ, ε > 0 such that (−δ, δ) × (−ε, ε) ⊂ V, and consequently the set of g′ ◦ f′ with Gr(f′) ⊂ U and Gr(g′) ⊂ V contains the set of all constant functions with values in (−ε, ε), but of course there are neighborhoods of the graph of g ◦ f that do not contain this set of functions for any ε.

5.3 The Weak Upper Topology

As in the last section, X and Y are topological spaces with Y Hausdorff. There is another topology on U(X,Y) that is in certain ways more natural and better behaved than the strong upper topology. Recall that if {B_i}_{i∈I} is a collection of topological spaces and { f_i : A → B_i }_{i∈I} is a collection of functions, the quotient topology on A induced by this data (elsewhere usually called the initial topology) is the coarsest topology such that each f_i is continuous. The weak upper topology on U(X,Y) is the quotient topology induced by the functions F ↦ F|_K ∈ U_S(K,Y) for compact K ⊂ X. Since a function is continuous if and only if the preimage of every subbasic subset of the range is open, a subbase for the weak upper topology is given by the sets of the form { F : Gr(F|_K) ⊂ V } where K ⊂ X is compact and V is a (relatively) open subset of K × Y.

Let U_W(X,Y) be U(X,Y) endowed with the weak upper topology. As in the last section, we study the continuity of basic operations.

Lemma 5.4 For any S ⊂ X the map F ↦ F|_S is a continuous function from U_W(X,Y) to U_W(S,Y).

Proof For any compact K ⊂ S and relatively open V ⊂ K × Y,

{ F ∈ U(X,Y) : F|_S ∈ { G ∈ U(S,Y) : Gr(G|_K) ⊂ V } } = { F ∈ U(X,Y) : Gr(F|_K) ⊂ V }

is open. □

Lemma 5.5 If every compact set in X is closed (e.g., because X is Hausdorff) then the topology of U_W(X,Y) is at least as coarse as the topology of U_S(X,Y). If, in addition, X is itself compact, then the two topologies coincide.

Proof We need to show that the identity map from U_S(X,Y) to U_W(X,Y) is continuous, which is to say that for any given compact K ⊂ X, the map Gr(F) ↦ Gr(F|_K) = Gr(F) ∩ (K × Y) is continuous. This follows from Lemma 4.6 because K × Y is closed in X × Y whenever K is compact.

If X is compact, the continuity of the identity map from U_W(X,Y) to U_S(X,Y) follows directly from the definition of the weak upper topology. □

There is a useful variant of Lemma 5.3.

Lemma 5.6 If X is normal, Hausdorff, and locally compact, then the function (K, F) ↦ Gr(F|_K) is a continuous function from K(X) × U_W(X,Y) to K(X × Y).

Proof We will demonstrate continuity at a given point (K, F) in the domain. Local compactness implies that there is a compact neighborhood C of K. The map F′ ↦ F′|_C from U_W(X,Y) to U_S(C,Y) is a continuous function by virtue of the definition of the topology of U_W(X,Y). Therefore Lemma 5.3 implies that the composition (K′, F′) ↦ (K′, F′|_C) ↦ Gr(F′|_{K′}) is continuous, and of course it agrees with the function in question on a neighborhood of (K, F). □

In contrast with the strong upper topology, for the weak upper topology cartesian products and composition are well behaved. Let X′ and Y′ be two other spaces with Y′ Hausdorff.

Lemma 5.7 If X and X′ are Hausdorff, then the function (F, F′) ↦ F × F′ from U_W(X,Y) × U_W(X′,Y′) to U_W(X × X′, Y × Y′) is continuous.


Proof First suppose that X and X′ are compact. Then, by Proposition 5.4, the graphs of upper hemicontinuous correspondences with these domains are compact, and continuity of the function (F, F′) ↦ F × F′ from U_S(X,Y) × U_S(X′,Y′) to U_S(X × X′, Y × Y′) follows from Proposition 4.2.

Because U_W(X × X′, Y × Y′) has the quotient topology, to establish the general case we need to show that (F, F′) ↦ (F × F′)|_C is a continuous function from U_W(X,Y) × U_W(X′,Y′) to U_S(C, Y × Y′) whenever C ⊂ X × X′ is compact. Let K and K′ be the projections of C onto X and X′ respectively; of course these sets are compact. The map in question is the composition

(F, F′) ↦ (F|_K, F′|_{K′}) ↦ F|_K × F′|_{K′} ↦ (F|_K × F′|_{K′})|_C.

The continuity of the second map has already been established, and the continuity of the first and third follows from Lemma 5.4, because compact subsets of Hausdorff spaces are closed and products of Hausdorff spaces are Hausdorff. □

Suppose that, in addition to X and Y, we have a third topological space Z that is Hausdorff.

Lemma 5.8 If K ⊂ X is compact, Y is normal and locally compact, and X × Y × Z is normal, then

(F, G) ↦ Gr(G ◦ F|_K)

is a continuous function from U_W(X,Y) × U_W(Y,Z) to K(X × Z).

Proof The map F ↦ Gr(F|_K) is a continuous function from U_W(X,Y) to K(X × Y) by virtue of the definition of the weak upper topology, and the natural projection of X × Y onto Y is continuous, so Lemma 4.12 implies that im(F|_K) is a continuous function of (K, F). Since Y is normal and locally compact, Lemma 5.6 implies that (F, G) ↦ Gr(G|_{im(F|_K)}) is a continuous function from U_W(X,Y) × U_W(Y,Z) to K(Y × Z), and again (F, G) ↦ im(G|_{im(F|_K)}) is also continuous. The continuity of cartesian products of compact sets (Proposition 4.2) now implies that

Gr(F|_K) × im(G|_{im(F|_K)}) and K × Gr(G|_{im(F|_K)})

are continuous functions of (K, F, G). Since X is T_1 while Y and Z are Hausdorff, X × Y × Z is T_1, so Lemma 4.9 implies that the intersection

{ (x, y, z) : x ∈ K, y ∈ F(x), and z ∈ G(y) }

of these two sets is a continuous function of (K, F, G), and Gr(G ◦ F|_K) is the projection of this set onto X × Z, so the claim follows from another application of Lemma 4.12. □

As we explained in the proof of Lemma 5.7, the continuity of (F, G) ↦ G ◦ F|_K for each compact K ⊂ X implies that (F, G) ↦ G ◦ F is continuous when the range has the weak upper topology, so:


Proposition 5.5 If X is T_1, Y is normal and locally compact, and X × Y × Z is normal, then (F, G) ↦ G ◦ F is a continuous function from U_W(X,Y) × U_W(Y,Z) to U_W(X,Z).

5.4 The Homotopy Principle

Let X, Y, and Z be topological spaces with Z Hausdorff, and fix a compact valued correspondence F : X × Y → Z. For each x ∈ X let F_x : Y → Z be the derived correspondence y ↦ F(x, y). Motivated by homotopies, we study the relationship between the following two conditions:

(a) x ↦ F_x is a continuous function from X to U_S(Y,Z);
(b) F is upper hemicontinuous.

If F : X × Y → Z is upper hemicontinuous, then x ↦ F_x will not necessarily be continuous without some additional hypothesis. For example, let X = Y = Z := R, and suppose that F(0, y) = {0} for all y ∈ Y. Without F being in any sense poorly behaved, it can easily happen that for x arbitrarily close to 0 the graph of F_x is not contained in { (y, z) : |z| < (1 + y^2)^{-1} }.

Lemma 5.9 If Y is compact and F is upper hemicontinuous, then x ↦ F_x is a continuous function from X to U_S(Y,Z).

Proof For x ∈ X let F̃_x : Y → Y × Z be the correspondence F̃_x(y) := {y} × F_x(y). Clearly F̃_x is compact valued and continuous as a function from Y to K(Y × Z). Since Y is compact, the image of F̃_x is compact, so Lemma 4.15 implies that Gr(F_x) = ⋃_{y ∈ Y} F̃_x(y) is compact, and Lemma 4.16 implies that it is closed.

Since Z is a Hausdorff space, Proposition 5.2 implies that Gr(F) is closed. Now Proposition 5.3 implies that x ↦ Gr(F_x) is upper hemicontinuous, which is the same (by Lemma 5.1) as it being a continuous function from X to K(Y × Z). But since Gr(F_x) is closed for all x, this is the same as it being a continuous function from X to K^0(Y × Z), and in view of the definition of the topology of U_S(Y,Z), this is the same as x ↦ F_x being continuous. □
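The role of compactness of Y in Lemma 5.9 can be checked on the example that precedes it. The Python sketch below takes one concrete instance, the single valued F(x, y) = {x} on X = Y = Z = R (this particular F and all function names are assumptions made for illustration). The open set W = { (y, z) : |z| < (1 + y^2)^{-1} } contains the graph of F_0, but for every x ≠ 0 the graph of F_x leaves W, so x ↦ F_x is not continuous at 0 in the strong upper topology.

```python
import math

def F(x, y):
    """Single valued F(x, y) = {x}; in particular F(0, y) = {0} for every y."""
    return x

def in_W(y, z):
    """Membership in the open set W = { (y, z) : |z| < 1 / (1 + y^2) }."""
    return abs(z) < 1.0 / (1.0 + y * y)

def escape_point(x):
    """For x != 0, return a y such that (y, F(x, y)) lies outside W."""
    # Any y with y^2 > 1/|x| - 1 works; adding 1 gives a safety margin.
    return math.sqrt(max(1.0 / abs(x) - 1.0, 0.0)) + 1.0
```

The witness y grows without bound as x → 0, which is exactly what a compact Y rules out.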

Lemma 5.10 If Y is regular and x ↦ F_x is a continuous function from X to U_S(Y,Z), then F is upper hemicontinuous.

Proof Fix (x, y) ∈ X × Y and a neighborhood W ⊂ Z of F(x, y). Since F_x is upper hemicontinuous, there is a neighborhood V of y such that F(x, y′) ⊂ W for all y′ ∈ V. Applying the regularity of Y, let V̄ be a closed neighborhood of y contained in V. Since x ↦ F_x is continuous, there is a neighborhood U ⊂ X of x such that

Gr(F_{x′}) ⊂ (V × W) ∪ ((Y \ V̄) × Z)

for all x′ ∈ U. Then F(x′, y′) ⊂ W for all (x′, y′) ∈ U × V̄. □


For the sake of easier reference we combine the last two results.

Theorem 5.2 If Y is regular and compact, then F is upper hemicontinuous if and only if x ↦ F_x is a continuous function from X to U_S(Y,Z).

5.5 Continuous Functions

If X and Y are topological spaces with Y Hausdorff, C_S(X,Y) and C_W(X,Y) will denote the space of continuous functions with the topologies induced by the inclusions of C(X,Y) in U_S(X,Y) and U_W(X,Y). In connection with continuous functions, these topologies are known as the strong topology and weak topology respectively. Most of the properties of interest are automatic corollaries of our earlier work; this section contains a few odds and ends that are specific to functions.

Lemma 5.1 asserts that an upper hemicontinuous correspondence from X to Y is the same thing as a continuous function from X to K(Y). We need to check that this does not introduce new topologies on the space of such objects.

Lemma 5.11 The identity function is a homeomorphism between U_S(X,Y) and C_S(X, K(Y)).

Proof Let U be a neighborhood of the graph of F ∈ U(X,Y). For any (x, K) ∈ X × K(Y) such that {x} × K ⊂ U a familiar compactness argument constructs neighborhoods V_x of x and W_x of K such that V_x × W_x ⊂ U. Therefore { (x, K) ∈ X × K(Y) : {x} × K ⊂ U } is open in X × K(Y), so the identity function C_S(X, K(Y)) → U_S(X,Y) is continuous because (Lemma 4.4) the preimage of any subbasic set is open.

Let Ũ be a neighborhood of the graph of F̃ ∈ C(X, K(Y)). Let F denote F̃ regarded as an element of U(X,Y). For each x ∈ X choose a neighborhood V_x of x and a neighborhood W_x of F(x) such that (x′, K′) ∈ Ũ for all x′ ∈ V_x and K′ ∈ K(Y) such that K′ ⊂ W_x. Let U := ⋃_x V_x × W_x. Then the set of F′ ∈ U(X,Y) such that Gr(F′) ⊂ U is a neighborhood of F that the identity function U_S(X,Y) → C_S(X, K(Y)) maps into the set of functions whose graphs are contained in Ũ, which shows that this map is continuous. □

Lemma 5.12 The identity function is a homeomorphism between U_W(X,Y) and C_W(X, K(Y)).

Proof The topology of U_W(X,Y) is the coarsest such that for each compact K ⊂ X the projection F ↦ F|_K ∈ U_S(K,Y) is continuous, and the topology of C_W(X, K(Y)) is the coarsest such that for each compact K ⊂ X the projection F ↦ F|_K ∈ C_S(K, K(Y)) is continuous, so this follows from the last result. □

If K ⊂ X is compact and V ⊂ Y is open, let C_{K,V} be the set of continuous functions f such that f(K) ⊂ V. The compact-open topology is the topology generated by the subbasis

{ C_{K,V} : K ⊂ X is compact, V ⊂ Y is open },

and C_CO(X,Y) will denote the space of continuous functions from X to Y endowed with this topology. The set of correspondences F : X → Y with Gr(F|_K) ⊂ K × V is open in U_W(X,Y), so the compact-open topology is always at least as coarse as the topology inherited from U_W(X,Y).

Proposition 5.6 Suppose X is regular. Then the compact-open topology coincides with the weak topology.

Proof What this means concretely is that whenever we are given a compact K ⊂ X, an open set W ⊂ K × Y, and a continuous f : X → Y with Gr(f|_K) ⊂ W, we can find a compact-open neighborhood of f whose elements f′ satisfy Gr(f′|_K) ⊂ W. For each x ∈ K the definition of the product topology gives open sets U_x ⊂ K and V_x ⊂ Y such that (x, f(x)) ∈ U_x × V_x ⊂ W. Since f is continuous, by replacing U_x with a smaller open neighborhood if necessary, we may assume that f(U_x) ⊂ V_x. Since X is regular, x has a closed neighborhood C_x ⊂ U_x, and C_x is compact because it is a closed subset of a compact set. Then f ∈ C_{C_x,V_x} for each x. We can find x_1, …, x_n such that K = C_{x_1} ∪ … ∪ C_{x_n}, and clearly Gr(f′|_K) ⊂ W whenever

f′ ∈ C_{C_{x_1},V_{x_1}} ∩ … ∩ C_{C_{x_n},V_{x_n}}. □

We now study the continuity of elementary operations constructing new functions from given functions. There is a special result concerning continuity of composition.

Lemma 5.13 If X is compact and f : X → Y is continuous, then g ↦ g ◦ f is a continuous function from C_S(Y,Z) to C_S(X,Z).

Proof In view of the subbasis for the strong topology, it suffices to show, for a given continuous g : Y → Z and an open V ⊂ X × Z containing the graph of g ◦ f, that

N := { (y, z) ∈ Y × Z : f^{-1}(y) × {z} ⊂ V }

is a neighborhood of the graph of g. If not, then some point (y, g(y)) is an accumulation point of the set of points of the form (f(x′), z) where (x′, z) ∉ V. Since X is compact, it cannot be the case that for each x ∈ X there are neighborhoods A of x and B of (y, g(y)) such that

{ (x′, z) ∈ (A × Z) \ V : (f(x′), z) ∈ B } = ∅.

Therefore there is some x ∈ X such that for any neighborhoods A of x and B of (y, g(y)) there is some x′ ∈ A and z such that (x′, z) ∉ V and (f(x′), z) ∈ B. Evidently f(x) = y. To obtain a contradiction choose neighborhoods A of x and W of g(y) such that A × W ⊂ V, and set B := Y × W. □


The remaining results do not depend on any additional assumptions on the spaces.

Lemma 5.14 If g : Y → Z is continuous, then f ↦ g ◦ f is a continuous function from C_S(X,Y) to C_S(X,Z).

Proof If U ⊂ X × Z is open, then so is (Id_X × g)^{-1}(U). □

Lemma 5.15 The function that takes (f, g) to the function x ↦ (f(x), g(x)) is a continuous function from C_S(X,Y) × C_S(X,Z) to C_S(X, Y × Z).

Proof If U is an open subset of X × Y × Z containing the graph of x ↦ (f(x), g(x)), for each x ∈ X there are neighborhoods V_x of x, S_x of f(x), and T_x of g(x), such that V_x × S_x × T_x ⊂ U. Let A := ⋃_x V_x × S_x and B := ⋃_x V_x × T_x. Then A and B are neighborhoods of Gr(f) and Gr(g) respectively, and the graph of x ↦ (f′(x), g′(x)) is in U whenever Gr(f′) ⊂ A and Gr(g′) ⊂ B. □

The sets { f ∈ C(X,Y) : f|_K ∈ V }, where K ⊂ X is compact and V ⊂ C_S(K,Y) is open, constitute a subbase of C_W(X,Y), and similarly for C_W(X,Z) and C_W(X, Y × Z), so the last two results imply parallel results for the weak topologies.

Lemma 5.16 If g : Y → Z is continuous, then f ↦ g ◦ f is a continuous function from C_W(X,Y) to C_W(X,Z).

Lemma 5.17 The function that takes (f, g) to the function x ↦ (f(x), g(x)) is a continuous function from C_W(X,Y) × C_W(X,Z) to C_W(X, Y × Z).

Exercises

5.1 Let X be a topological space, and let F be a family of functions from X to R. The topology of uniform convergence is the topology generated by the subbase of sets { f′ ∈ F : |f′(x) − f(x)| < ε for all x ∈ X } where f ∈ F and ε > 0. The topology of uniform convergence on compacta is the topology generated by the subbase of sets { f′ ∈ F : |f′(x) − f(x)| < ε for all x ∈ K } where f ∈ F, K ⊂ X is compact, and ε > 0. Prove that if F = C(X, R), then the topology of uniform convergence on compacta is the compact-open topology. (The notion of uniform convergence can be greatly generalized; cf. ch. 6 and 7 of Kelley (1955).)

5.2 Let X and Y be topological spaces, and let F be a family of functions from X to Y. The topology of pointwise convergence is the topology generated by the subbase of sets { f ∈ F : f(x) ∈ V } where x ∈ X and V ⊂ Y is open.

(a) Prove that the topology of pointwise convergence is at least as coarse as the compact-open topology.

(b) Let f : [0, 1] → R be a continuous function such that f(0) = f(1) = 0, and f(t) ≠ 0 for some t. For n = 1, 2, … let g_n be the function g_n(t) = f(t^n). Prove that {g_n} converges to the constant zero function pointwise, but not uniformly.
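A numerical experiment can make the phenomenon in part (b) visible before one attempts the proof. The Python sketch below uses the sample choice f(t) = t(1 − t), which is an assumption for illustration only, and evaluates g_n(t) = f(t^n) on a grid together with the point where t^n = 1/2. It is an illustration, not a proof.

```python
def f(t):
    """A sample f with f(0) = f(1) = 0 and f(1/2) = 1/4 (an illustrative choice)."""
    return t * (1.0 - t)

def g(n, t):
    """g_n(t) = f(t^n)."""
    return f(t ** n)

def sup_on_grid(n, m=10_000):
    """Largest |g_n(t)| over the grid {0, 1/m, ..., 1} plus the point with t^n = 1/2."""
    grid = [k / m for k in range(m + 1)] + [0.5 ** (1.0 / n)]
    return max(abs(g(n, t)) for t in grid)
```

At each fixed t in [0, 1) the values g_n(t) shrink toward 0 as n grows, while the computed supremum stays near f(1/2) = 1/4 for every n, because the point with t^n = 1/2 simply moves toward 1.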


5.3 (Dini’s theorem for correspondences) For nonempty S, S′ ⊂ R we say that S′ dominates S in the strong set order, and write S′ ≥ S, if, for all s ∈ S and s′ ∈ S′, min{s, s′} ∈ S and max{s, s′} ∈ S′. Let X be a topological space, and let {F_n} be a sequence of continuous correspondences F_n : X → R that is increasing, in the sense that F_{n′}(x) ≥ F_n(x) for all x ∈ X and n′ ≥ n, and that converges pointwise to a continuous correspondence F : X → R. (Here pointwise convergence is defined by regarding F_n and F as continuous functions from X to H(R).) Prove that F_n converges to F in the weak upper topology.
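The strong set order is easy to misread, so a small check may help. The Python sketch below tests the definition directly on finite sets of reals; the restriction to finite sets and the function name dominates are assumptions made for illustration.

```python
def dominates(S_prime, S):
    """True if S' >= S in the strong set order, for finite nonempty sets of reals."""
    return all(min(s, t) in S and max(s, t) in S_prime
               for s in S for t in S_prime)
```

For instance, {2, 3} dominates {1, 2}, whereas {1, 3} does not dominate {2}, since min{2, 1} = 1 is not an element of {2}.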

5.4 Let X and Y be topological spaces, and let F be a family of continuous functions from X to Y. A sequence {f_n} in F converges continuously to f ∈ F if, for all sequences {x_n} converging to a point x ∈ X, f_n(x_n) → f(x). Prove that if {f_n} converges to f in the compact-open topology, then it converges continuously to f.

5.5 Let X and Y be topological spaces. Let e : C(X,Y) × X → Y be the function e(f, x) = f(x). A topology on C(X,Y) is jointly continuous if e is continuous when C(X,Y) × X has the associated product topology.

(a) Prove that if X is locally compact, then the compact-open topology is jointly continuous.

(b) Prove that if A and B are topological spaces, K ⊂ A and L ⊂ B are compact, and U ⊂ A × B is a neighborhood of K × L, then there are neighborhoods V of K and W of L such that V × W ⊂ U.

(c) Prove that a jointly continuous topology τ is at least as fine as the compact-open topology. (Concretely, given f ∈ C_{K_1,V_1} ∩ ⋯ ∩ C_{K_n,V_n} we need to construct a τ-open U such that f ∈ U ⊂ C_{K_1,V_1} ∩ ⋯ ∩ C_{K_n,V_n}.)

Chapter 6
Metric Space Theory

In this chapter we develop some advanced results concerning metric spaces.

Partitions of unity, an important tool, exist for locally finite open covers of a normal space: this is shown in Sect. 6.2. But sometimes we will be given an open cover that is not necessarily locally finite, so we need to know that any open cover has a locally finite refinement. A space is paracompact if this is the case. Paracompactness is studied in Sect. 6.1; the fact that metric spaces are paracompact will be quite important.

Section 6.3 describes most of the rather small amount we will need to know about topological vector spaces. Of these, the most important for us are the locally convex spaces, which have many desirable properties. One of the larger themes of this study is that the concepts and results of fixed point theory extend naturally to this level of generality, but not further.

Two important types of topological vector spaces, Banach spaces and Hilbert spaces, are introduced in Sect. 6.4. Results showing that metric spaces can be embedded in such linear spaces are given in Sect. 6.5. Section 6.6 presents an infinite dimensional generalization of the Tietze extension theorem due to Dugundji.

6.1 Paracompactness

Fix a topological space X. A family {S_α}_{α∈A} of subsets of X is locally finite if every x ∈ X has a neighborhood W such that there are only finitely many α with W ∩ S_α ≠ ∅. If {U_α}_{α∈A} is a cover of X, a second cover {V_β}_{β∈B} is a refinement of {U_α}_{α∈A} if each V_β is a subset of some U_α. The space X is paracompact if every open cover is refined by an open cover that is locally finite.



Theorem 6.1 A metric space is paracompact.

This result is due to Stone (1948). At first the proofs were rather complex, but eventually Mary Ellen Rudin (1969) found the following brief and simple argument.

Proof Let {U_α}_{α∈A} be an open cover of X where A is a well ordered set. We define sets V_{αn} for α ∈ A and n = 1, 2, …, inductively (over n) as follows: let V_{αn} be the union of the balls U_{2^{-n}}(x) for those x such that:

(a) α is the least element of A such that x ∈ U_α;
(b) x ∉ ⋃_{j < n, β∈A} V_{βj};
(c) U_{3×2^{-n}}(x) ⊂ U_α.

For each x there is a least α such that x ∈ U_α and an n large enough that (c) holds, so x ∈ V_{αn} unless x ∈ V_{βj} for some β and j < n. Thus {V_{αn}} is a cover of X, and of course each V_{αn} is open and contained in U_α, so it is a refinement of {U_α}.

To prove that the cover is locally finite we fix x, let α be the least element of A such that x ∈ V_{αn} for some n, and choose j such that U_{2^{-j}}(x) ⊂ V_{αn}. We claim that U_{2^{-n-j}}(x) intersects only finitely many V_{βi}.

If i ≥ n + j and y satisfies (a)–(c) with β and i in place of α and n, then U_{2^{-n-j}}(x) ∩ U_{2^{-i}}(y) = ∅ because U_{2^{-j}}(x) ⊂ V_{αn}, y ∉ V_{αn} (by (b), since i > n), and n + j, i ≥ j + 1. Therefore U_{2^{-n-j}}(x) ∩ V_{βi} = ∅.

For i < n + j we will show that there is at most one β such that U_{2^{-n-j}}(x) intersects V_{βi}. Suppose that y and z are points satisfying (a)–(c) for β and γ, with i in place of n. Without loss of generality β precedes γ. Then U_{3×2^{-i}}(y) ⊂ U_β, z ∉ U_β, and n + j > i, so U_{2^{-n-j}}(x) cannot intersect both U_{2^{-i}}(y) and U_{2^{-i}}(z). Since this is the case for all y and z, U_{2^{-n-j}}(x) cannot intersect both V_{βi} and V_{γi}. □

6.2 Partitions of Unity

We continue to work with a fixed topological space X. This section’s central concept is:

Definition 6.1 A partition of unity for X is a collection of continuous functions {ψ_α : X → [0, 1]} such that ∑_{α∈A} ψ_α(x) = 1 for each x. (This is understood as entailing that there are at most countably many α such that ψ_α(x) > 0.) If {U_α}_{α∈A} is an open cover of X, a partition of unity {ψ_α} is subordinate to {U_α} if ψ_α(x) = 0 for all α and x ∉ U_α.

The most common use of a partition of unity is to construct a global function or correspondence with particular properties. Typically locally defined functions or correspondences are given or can be shown to exist, and the global object is constructed by taking a “convex combination” of the local objects, with weights that vary continuously. Of course to apply this method one must have results guaranteeing that suitable partitions of unity exist. Our goal in this section is:


Theorem 6.2 For any locally finite open cover {U_α}_{α∈A} of a normal space X there is a partition of unity subordinate to {U_α}.

A basic tool used in the constructive proof of this result, and many others, is:

Lemma 6.1 (Urysohn’s Lemma) If X is a normal space and C ⊂ U ⊂ X with C closed and U open, then there is a continuous function ϕ : X → [0, 1] with ϕ(x) = 0 for all x ∈ C and ϕ(x) = 1 for all x ∈ X \ U.

Proof Since X is normal, whenever C′ ⊂ U′, with C′ closed and U′ open, there exist a closed C″ and an open U″ such that C′ ⊂ U″, X \ U′ ⊂ X \ C″, and U″ ∩ (X \ C″) = ∅, which is to say that C′ ⊂ U″ ⊂ C″ ⊂ U′. Let C_0 := C and U_1 := U. Choose an open U_{1/2} and a closed C_{1/2} with C_0 ⊂ U_{1/2} ⊂ C_{1/2} ⊂ U_1. Choose an open U_{1/4} and a closed C_{1/4} with C_0 ⊂ U_{1/4} ⊂ C_{1/4} ⊂ U_{1/2}, and choose an open U_{3/4} and a closed C_{3/4} with C_{1/2} ⊂ U_{3/4} ⊂ C_{3/4} ⊂ U_1. Continuing in this fashion, we obtain a system of open sets U_r and a system of closed sets C_r for rationals r ∈ [0, 1] of the form k/2^m (except that C_1 and U_0 are undefined) with U_r ⊂ C_r ⊂ U_s ⊂ C_s whenever r < s.

For x ∈ X let

ϕ(x) := inf{ r : x ∈ C_r } if x ∈ ⋃_r C_r, and ϕ(x) := 1 otherwise.

Clearly ϕ(x) = 0 for all x ∈ C and ϕ(x) = 1 for all x ∈ X \ U. Any open subset of [0, 1] is a union of finite intersections of sets of the form [0, a) and (b, 1], where 0 < a, b < 1, and

ϕ^{-1}([0, a)) = ⋃_{r < a} U_r and ϕ^{-1}((b, 1]) = ⋃_{r > b} (X \ C_r)

are open, so ϕ is continuous. □
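When X happens to be a metric space, a function with the properties asserted by Urysohn’s lemma can be written down directly as ϕ(x) = d(x, C) / (d(x, C) + d(x, X \ U)), a standard formula that sidesteps the dyadic construction. The Python sketch below evaluates it for the illustrative choice X = R, C = [−1, 1], U = (−2, 2); these sets and the function names are assumptions, and the formula is a shortcut available only in the metric case rather than the argument used in the proof above.

```python
def dist_to_C(x):
    """Distance from x to the closed set C = [-1, 1]."""
    return max(abs(x) - 1.0, 0.0)

def dist_to_complement_of_U(x):
    """Distance from x to the complement of the open set U = (-2, 2)."""
    return max(2.0 - abs(x), 0.0)

def phi(x):
    """Urysohn function for C inside U: 0 on C, 1 off U, continuous in between."""
    a, b = dist_to_C(x), dist_to_complement_of_U(x)
    return a / (a + b)   # a + b > 0 because C and the complement of U are disjoint
```

The denominator never vanishes because C and X \ U are disjoint closed sets, so at least one of the two distances is positive at every point.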

Below we will apply Urysohn’s lemma to a closed subset of each element of a locally finite open cover. We will need X to be covered by these closed sets, as per the next result.

Proposition 6.1 If X is a normal space and {U_α}_{α∈A} is a locally finite open cover of X, then there is an open cover {V_α}_{α∈A} such that for each α, the closure of V_α is contained in U_α.

Proof A partial thinning of {U_α}_{α∈A} is a function F from a subset B of A to the open sets of X such that:

(a) for each β ∈ B, the closure of F(β) is contained in U_β;
(b) ⋃_{β∈B} F(β) ∪ ⋃_{α∈A\B} U_α = X.

Our goal is to find such an F with B = A. The partial thinnings can be partially ordered as follows: F ≺ G if the domain of F is a proper subset of the domain of G and F and G agree on this set. We will show that this ordering has maximal elements, and that the domain of a maximal element is all of A.

Let {F_ι}_{ι∈I} be a chain of partial thinnings. That is, for all distinct ι, ι′ ∈ I, either F_ι ≺ F_{ι′} or F_{ι′} ≺ F_ι. Let the domain of each F_ι be B_ι, let B := ⋃_ι B_ι, and for β ∈ B let F(β) be the common value of F_ι(β) for those ι with β ∈ B_ι. For each x ∈ X there is some ι with F_ι(β) = F(β) for all β ∈ B such that x ∈ U_β because there are only finitely many α with x ∈ U_α. Therefore F satisfies (b). We have shown that any chain of partial thinnings has an upper bound, so Zorn’s lemma implies that the set of all partial thinnings has a maximal element.

If F is a partial thinning with domain B and α′ ∈ A \ B, then

\[ X \setminus \Bigl( \bigcup_{\beta \in B} F(\beta) \cup \bigcup_{\alpha \in A \setminus B,\ \alpha \ne \alpha'} U_\alpha \Bigr) \]

is a closed subset of Uα′, so it has an open superset Vα′ whose closure is contained in Uα′. We can define a partial thinning G with domain B ∪ {α′} by setting G(α′) := Vα′ and G(β) := F(β) for β ∈ B. Therefore F cannot be maximal unless its domain is all of A. □

Proof of Theorem 6.2 The result above gives a closed cover {Cα}α∈A of X with Cα ⊂ Uα for each α. For each α let ϕα : X → [0, 1] be continuous with ϕα(x) = 0 for all x ∈ X \ Uα and ϕα(x) = 1 for all x ∈ Cα. Then ∑_α ϕα is well defined and continuous everywhere since {Uα} is locally finite, and it is positive everywhere since {Cα} covers X. For each α ∈ A set

\[ \psi_\alpha := \frac{\varphi_\alpha}{\sum_{\alpha'} \varphi_{\alpha'}} . \]

□
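The construction in the proof of Theorem 6.2 can be tried out numerically. The following is a minimal sketch, not part of the text's argument, under the following assumptions: X is the real line, the cover consists of the open intervals U_k = (k − 1, k + 1) for integers k, and the closed sets are C_k = [k − 1/2, k + 1/2]. In a metric space a Urysohn function can be written explicitly in terms of distances, so the dyadic construction of Urysohn's lemma is replaced here by that formula. The names `urysohn` and `partition_of_unity` are illustrative only.

```python
import math


def dist_to_interval(x, lo, hi):
    """Distance from x to the closed interval [lo, hi]."""
    if x < lo:
        return lo - x
    if x > hi:
        return x - hi
    return 0.0


def dist_to_complement(x, lo, hi):
    """Distance from x to the complement of the open interval (lo, hi)."""
    return max(0.0, min(x - lo, hi - x))


def urysohn(x, closed, open_):
    """Continuous function equal to 1 on `closed` and 0 outside `open_`.

    Assumes closed = [a, b] is contained in open_ = (c, d), so the
    denominator below is never zero.
    """
    c = dist_to_interval(x, *closed)
    u = dist_to_complement(x, *open_)
    return u / (u + c)


def partition_of_unity(x):
    """Return {k: psi_k(x)} for the cover U_k = (k - 1, k + 1), k an integer.

    Only finitely many k can have x in U_k (local finiteness), so it is
    enough to look at a few integers around x.
    """
    candidates = range(math.floor(x) - 1, math.floor(x) + 3)
    phi = {
        k: urysohn(x, (k - 0.5, k + 0.5), (k - 1.0, k + 1.0))
        for k in candidates
    }
    total = sum(phi.values())  # positive because the C_k cover the line
    return {k: value / total for k, value in phi.items() if value > 0}
```

For example, `partition_of_unity(0.25)` assigns positive weight only to k = 0 and k = 1, and the weights sum to one.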

6.3 Topological Vector Spaces

Since we wish to develop fixed point theory in as much generality as is reasonably possible, infinite dimensional vector spaces will inevitably appear at some point. In addition, these spaces will frequently be employed as tools of analysis. The result in the next section refers to such spaces, so this is a good point at which to cover the basic definitions and elementary results.

A topological vector space (TVS) V is a vector space over the real numbers¹ that is endowed with a topology that makes addition and scalar multiplication continuous, and makes {0} a closed set. TVS’s, and maps between them, are the objects studied in functional analysis. Over the last few decades functional analysis has grown into a huge body of mathematics; it is fortunate that our work here does not require much more than the most basic definitions and facts.

¹ Other fields of scalars, in particular the complex numbers, play an important role in functional analysis, but have no applications in this book.

We now lay out elementary properties of V. For any given w ∈ V the maps v ↦ v + w and v ↦ v − w are continuous, hence inverse homeomorphisms. That is, the topology of V is translation invariant. In particular, the topology of V is completely determined by a neighborhood base of the origin, which simplifies many proofs.

The following facts are basic. A set A ⊂ V is balanced (or circled) if tx ∈ A whenever x ∈ A and −1 ≤ t ≤ 1.

Lemma 6.2 If A is a neighborhood of the origin, then there is a closed balanced neighborhood of the origin C such that C + C ⊂ A.

Proof Continuity of addition implies that there are open neighborhoods of the origin B1, B2, B3 with B1 + B2 + B3 ⊂ A, and replacing these with their intersection gives a neighborhood B such that B + B + B ⊂ A. If w ∈ B̄, then w − B intersects any neighborhood of the origin, and in particular (w − B) ∩ B ≠ ∅. Thus B̄ ⊂ B + B, so B̄ + B ⊂ A. Continuity of scalar multiplication gives an open neighborhood U of the origin and ε > 0 such that αU ⊂ B for all α ∈ (−ε, ε). Therefore V := ⋃_{|α|<ε} αU is a balanced open neighborhood of the origin contained in B. Noting that the closure of a balanced set is balanced (this is an easy exercise) let C be the closure of V. □

We can now establish the separation properties of V .

Lemma 6.3 V is a regular T1 space, and consequently a Hausdorff space.

Proof Since {0} is closed, translation invariance implies that V is T1. Translation invariance also implies that to prove regularity, it suffices to show that any neighborhood of the origin, say A, contains a closed neighborhood, and this is part of what the last result asserts. As has been pointed out earlier, a simple and obvious argument shows that a regular T1 space is Hausdorff. □

We can say slightly more in this direction:

Lemma 6.4 If K ⊂ V is compact and U is a neighborhood of K, then there is a closed neighborhood W of the origin such that K + W ⊂ U.

Proof For each v ∈ K Lemma 6.2 gives a closed neighborhood W_v of the origin such that v + W_v + W_v ⊂ U. Then there are v_1, . . . , v_n such that v_1 + W_{v_1}, . . . , v_n + W_{v_n} is a cover of K. Let W := W_{v_1} ∩ . . . ∩ W_{v_n}. For any v ∈ K there is an i such that v ∈ v_i + W_{v_i}, so that

\[ v + W \subset v_i + W_{v_i} + W_{v_i} \subset U . \]

□

Lemma 6.5 If C ⊂ V is convex, then so is its closure C̄.

Proof Aiming at a contradiction, suppose that v = (1 − t)v_0 + tv_1 is not in C̄ even though v_0, v_1 ∈ C̄ and 0 < t < 1. Let U be a neighborhood of v that does not intersect C. The continuity of addition and scalar multiplication implies that there are neighborhoods U_0 and U_1 of v_0 and v_1 such that (1 − t)v′_0 + tv′_1 ∈ U for all v′_0 ∈ U_0 and v′_1 ∈ U_1. Since U_0 and U_1 contain points in C, this contradicts the convexity of C. □

Lemma 6.6 If U is an open convex subset of V, v ∈ U, v̄ ∈ Ū, and 0 < t < 1, then (1 − t)v + tv̄ ∈ U.

Proof Let W be a neighborhood of the origin such that v + W ⊂ U. Since v̄ is in the closure of U there is a v′ ∈ (v̄ − ((1 − t)/t)W) ∩ U. We have

\[ (1-t)v + t\bar v \in (1-t)v + t\bigl(v' + \tfrac{1-t}{t} W\bigr) = (1-t)(v + W) + t v' \subset U . \]

□

A TVS is locally convex if every neighborhood of the origin contains a convex neighborhood. In several ways the theory of fixed points developed in this book depends on local convexity, so for the most part locally convex TVS’s represent the outer limits of generality considered here.

Lemma 6.7 If V is locally convex and A is a neighborhood of the origin, then there is a closed convex neighborhood of the origin C such that C + C ⊂ A.

Proof Lemma 6.2 gives a closed neighborhood B of the origin such that B + B ⊂ A. Continuity of scalar multiplication gives an open neighborhood U of the origin and ε > 0 such that αU ⊂ B for all α ∈ (−ε, ε). Since U contains a convex neighborhood of the origin we may assume that it is convex. Therefore V := ⋃_{|α|<ε} αU is a balanced convex open neighborhood of the origin contained in B. Let C be the closure of V. Then C is balanced, as we noted before, and Lemma 6.5 implies that it is convex. □

6.4 Banach and Hilbert Spaces

We now describe two important types of locally convex spaces. A norm on V is a function ‖ · ‖ : V → R≥ such that:

(a) ‖v‖ = 0 if and only if v = 0;
(b) ‖αv‖ = |α|‖v‖ for all α ∈ R and v ∈ V;
(c) ‖v + w‖ ≤ ‖v‖ + ‖w‖ for all v, w ∈ V.

Condition (c) implies that the function (v, w) ↦ ‖v − w‖ is a metric on V, and we endow V with the associated topology. Condition (a) implies that {0} is closed because every other point has a neighborhood that does not contain the origin. Conditions (b) and (c) give the calculations


\[ \|\alpha' v' - \alpha v\| \le \|\alpha' v' - \alpha' v\| + \|\alpha' v - \alpha v\| = |\alpha'|\,\|v' - v\| + |\alpha' - \alpha|\,\|v\| \]

and

\[ \|(v' + w') - (v + w)\| \le \|v' - v\| + \|w' - w\| , \]

which are easily seen to imply that scalar multiplication and addition are continuous. A vector space endowed with a norm and the associated metric and topology is called a normed space.

For a normed space the calculation

‖(1 − α)v + αw‖ ≤ ‖(1 − α)v‖ + ‖αw‖ = (1 − α)‖v‖ + α‖w‖ ≤ max{‖v‖, ‖w‖}

shows that for any ε > 0, the open ball of radius ε centered at the origin is convex. The open ball of radius ε centered at any other point is the translation of this ball, so a normed space is locally convex.

A sequence {vm} in a TVS V is a Cauchy sequence if, for each neighborhood A of the origin, there is an integer N such that vm − vn ∈ A for all m, n ≥ N. The space V is complete if its Cauchy sequences are all convergent. A Banach space is a complete normed space.

For the most part there is little reason to consider TVS’s that are not complete except insofar as they occur as subspaces of complete spaces. The reason for this is that any TVS V can be embedded in a complete space V̄ whose elements are equivalence classes of Cauchy sequences, where two Cauchy sequences {vm} and {wn} are equivalent if, for each neighborhood A of the origin, there is an integer N such that vm − wn ∈ A for all m, n ≥ N. (This relation is clearly reflexive and symmetric. To see that it is transitive, suppose {uℓ} is equivalent to {vm} which is in turn equivalent to {wn}. For any neighborhood A of the origin the continuity of addition implies that there are neighborhoods B, C of the origin such that B + C ⊂ A. There is N such that uℓ − vm ∈ B and vm − wn ∈ C for all ℓ, m, n ≥ N, whence uℓ − wn ∈ A.) Denote the equivalence class of {vm} by [vm]. The vector operations have the obvious definitions: [vm] + [wn] := [vm + wm] and α[vm] := [αvm]. The open sets of V̄ are the sets of the form

{ [vm] : vm ∈ A for all large m }

where A ⊂ V is open. (It is easy to see that the condition “vm ∈ A for all large m” does not depend on the choice of representative {vm} of [vm].) A complete justification of this definition would require verifications of the vector space axioms, the axioms for a topological space, the continuity of addition and scalar multiplication, and that {0} is a closed set. Instead of elaborating, we simply assert that the reader who treats this as an exercise will find it entirely straightforward. A similar construction can be used to embed any metric space in a “completion” in which all Cauchy sequences (in the metric sense) are convergent.

As in the finite dimensional case, the best behaved normed spaces have norms that are derived from inner products. An inner product on a vector space V is a function ⟨·, ·⟩ : V × V → R that is symmetric, bilinear, and positive definite:

(a) ⟨v, w⟩ = ⟨w, v⟩ for all v, w ∈ V;
(b) ⟨αv + v′, w⟩ = α⟨v, w⟩ + ⟨v′, w⟩ for all v, v′, w ∈ V and α ∈ R;
(c) ⟨v, v⟩ ≥ 0 for all v ∈ V, with equality if and only if v = 0.

We would like to define a norm by setting ‖v‖ := ⟨v, v⟩^{1/2}. This evidently satisfies (a) and (b) of the definition of a norm. The verification of (c) begins with the computation

\[ 0 \le \bigl\langle \langle v,v\rangle w - \langle v,w\rangle v,\ \langle v,v\rangle w - \langle v,w\rangle v \bigr\rangle = \langle v,v\rangle \bigl( \langle v,v\rangle \langle w,w\rangle - \langle v,w\rangle^2 \bigr) , \]

which implies the Cauchy–Schwarz inequality: |⟨v, w⟩| ≤ ‖v‖ × ‖w‖ for all v, w ∈ V. This holds with equality if v = 0 or ⟨v, v⟩w − ⟨v, w⟩v = 0, which is the case if and only if w is a scalar multiple of v, and otherwise the inequality is strict. The Cauchy–Schwarz inequality implies the inequality in the calculation

‖v + w‖2 = 〈v + w, v + w〉 = ‖v‖2 + 2〈v,w〉 + ‖w‖2 ≤ (‖v‖ + ‖w‖)2 ,

which implies (c) and completes the verification that ‖ · ‖ is a norm. A vector space endowed with an inner product and the associated norm and topology is called an inner product space. A Hilbert space is a complete inner product space.
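The identity behind the Cauchy–Schwarz inequality can be checked on concrete vectors. The sketch below is only an illustration, assuming the standard dot product on R^n as the inner product; the helper names `inner`, `norm`, and `cauchy_schwarz_terms` are not from the text.

```python
import math


def inner(v, w):
    """Standard inner product on R^n (an assumed concrete example)."""
    return sum(a * b for a, b in zip(v, w))


def norm(v):
    """Norm derived from the inner product."""
    return math.sqrt(inner(v, v))


def cauchy_schwarz_terms(v, w):
    """Return both sides of the identity used in the verification of (c).

    With u = <v,v> w - <v,w> v, the left side is <u, u> and the right
    side is <v,v> (<v,v><w,w> - <v,w>^2).
    """
    a, b, c = inner(v, v), inner(v, w), inner(w, w)
    u = [a * wi - b * vi for vi, wi in zip(v, w)]
    return inner(u, u), a * (a * c - b * b)
```

With integer vectors such as v = (1, 2, 3) and w = (4, −5, 6) the two sides agree exactly, and since the common value is nonnegative the inequality |⟨v, w⟩| ≤ ‖v‖ ‖w‖ follows whenever v ≠ 0.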

Up to linear isometry there is only one separable² Hilbert space. Let

\[ H := \{\, s = (s_1, s_2, \ldots) \in \mathbb{R}^\infty : s_1^2 + s_2^2 + \cdots < \infty \,\} \]

be the Hilbert space of square summable sequences. Let ⟨s, t⟩ := ∑_i s_i t_i be the usual inner product; the Cauchy–Schwarz inequality implies that this sum is convergent. For any Cauchy sequence in H and for each i, the sequence of i-th components is Cauchy, and the element of R^∞ whose i-th component is the limit of this sequence is easily shown to be the limit in H of the given sequence. Thus H is complete. The set of points with only finitely many nonzero components, all of which are rational, is a countable dense subset, so H is separable.

We wish to show that any separable Hilbert space is linearly isometric to H, so let V be a separable Hilbert space, and let {v1, v2, . . .} be a countable dense subset. The span of this set is also dense, of course. Using the Gram–Schmidt process (Sect. 12.1) we may pass from this set to a countable sequence w1, w2, . . . of orthonormal vectors that has the same span. It is now easy to show that s ↦ s1w1 + s2w2 + · · · is a linear isometry between H and V.
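The passage from a spanning sequence to an orthonormal one can be sketched in finite dimensions. The code below is a minimal illustration of the Gram–Schmidt process, assuming vectors in R^n with the dot product; the tolerance `tol` used to discard linearly dependent vectors is an implementation choice made here, not something specified in the text.

```python
import math


def inner(v, w):
    """Standard inner product on R^n (assumed for this illustration)."""
    return sum(a * b for a, b in zip(v, w))


def norm(v):
    return math.sqrt(inner(v, v))


def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal list with the same span as `vectors`.

    Each vector has its components along the previously accepted basis
    vectors removed; what is left is normalized, or dropped when it is
    numerically zero (the vector was already in the span).
    """
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            coeff = inner(w, e)
            w = [wi - coeff * ei for wi, ei in zip(w, e)]
        length = norm(w)
        if length > tol:
            basis.append([wi / length for wi in w])
    return basis
```

Applied to (1, 1, 0), (1, 0, 1), (2, 1, 1), the third vector is the sum of the first two, so the returned orthonormal list has two elements.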

6.5 Embedding Theorems

An important technique is to endow metric spaces with geometric structures by embedding them in normed spaces. Let (X, d) be a metric space, let V be a normed space, and let C(X, V) be the space of bounded continuous V-valued functions on X. This is, of course, a vector space under pointwise addition and scalar multiplication. We endow C(X, V) with the norm

\[ \|f\|_\infty := \sup_{x \in X} \|f(x)\| . \]

When V = R we write C(X) in place of C(X, R).

² Recall that a metric space is separable if it contains a countable set of points whose closure is the entire space.

Lemma 6.8 C(X, V) is a normed space, and if V is a Banach space, then so is C(X, V).

Proof The verification that ‖ · ‖∞ is actually a norm is elementary and left to the reader. If V is a Banach space and {fn} is a Cauchy sequence, this sequence has a pointwise limit f because each {fn(x)} is Cauchy, and the pointwise limit of a uniformly convergent sequence of continuous functions between two metric spaces is continuous.³ □

Theorem 6.3 (Kuratowski 1935; Wojdyslawski 1939) There is an embedding ι : X → C(X) such that ι(X) is a relatively closed subset of a convex set, and is a closed subset of C(X) if X is complete.

Proof For each x ∈ X let f_x ∈ C(X) be the function f_x(y) := min{1, d(x, y)}; the map ι : x ↦ f_x is evidently an injection from X to C(X). For any x, y ∈ X we have

\[ \|f_x - f_y\|_\infty = \sup_z \bigl| \min\{1, d(x,z)\} - \min\{1, d(y,z)\} \bigr| \le \sup_z |d(x,z) - d(y,z)| \le d(x,y) , \]

so ι is continuous. On the other hand, if {x_n} is a sequence such that f_{x_n} → f_x, then min{1, d(x_n, x)} = |f_{x_n}(x) − f_x(x)| ≤ ‖f_{x_n} − f_x‖∞ → 0, so x_n → x. Thus the inverse of ι is continuous, so ι is an embedding.

Now suppose that f_{x_n} converges to an element f = ∑_{i=1}^{k} λ_i f_{y_i} of the convex hull of ι(X). We have ‖f_{x_n} − f‖∞ → 0 and

\[ \|f_{x_n} - f\|_\infty \ge |f_{x_n}(x_n) - f(x_n)| = |f(x_n)| , \]

so f(x_n) → 0. For each i we have 0 ≤ f_{y_i}(x_n) ≤ f(x_n)/λ_i → 0, which implies that x_n → y_i, whence f = f_{y_1} = · · · = f_{y_k} ∈ ι(X). Thus ι(X) is closed in the relative topology of its convex hull.

Now suppose that X is complete, and that {x_n} is a sequence such that f_{x_n} → f. Then as above, min{1, d(x_m, x_n)} ≤ ‖f_{x_m} − f_{x_n}‖∞, and {f_{x_n}} is a Cauchy sequence, so {x_n} is also Cauchy and has a limit x. Above we saw that f_{x_n} → f_x, so f_x = f. Thus ι(X) is closed in C(X). □
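The embedding x ↦ f_x can be computed explicitly when X is finite, since a bounded function on X is then just a tuple of values. The following sketch assumes a finite set of points in the plane with the Euclidean metric; the names `kuratowski_embedding` and `sup_distance` are hypothetical helpers introduced for the illustration.

```python
import math


def euclid(x, y):
    """Euclidean distance between two points of the plane."""
    return math.hypot(x[0] - y[0], x[1] - y[1])


def kuratowski_embedding(points, d):
    """Map each point x to the tuple of values f_x(y) = min{1, d(x, y)}.

    `points` is a finite list of hashable points; the tuple lists the
    values of f_x at the points in the given order.
    """
    return {x: tuple(min(1.0, d(x, y)) for y in points) for x in points}


def sup_distance(f, g):
    """Sup-norm distance between two functions stored as tuples."""
    return max(abs(a - b) for a, b in zip(f, g))
```

On such a finite space the sup-norm distance between f_x and f_y equals min{1, d(x, y)}: the upper bound is the calculation in the proof, and the value at z = x attains it. This is why distances below one are preserved exactly.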

³ We recall the proof of this basic fact of analysis. Suppose (X, d) and (Y, e) are metric spaces and f is the pointwise limit of a uniformly convergent sequence {fn} of continuous functions from X to Y. Fix x ∈ X and ε > 0. There is an m such that e(fm(x′), fn(x′)) < ε/3 for all n ≥ m and x′ ∈ X, and there is a δ > 0 such that e(fm(x′), fm(x)) < ε/3 for all x′ ∈ Uδ(x). For such x′ we have

\[ e(f(x'), f(x)) \le e(f(x'), f_m(x')) + e(f_m(x'), f_m(x)) + e(f_m(x), f(x)) < \varepsilon . \]


The so-called Hilbert cube is

\[ I^\infty := \{\, s \in H : |s_i| \le 1/i \text{ for all } i = 1, 2, \ldots \,\} . \]

For separable metric spaces we have the following refinement of Theorem 6.3.

Theorem 6.4 (Urysohn) If (X, d) is a separable metric space, there is an embedding ι : X → I^∞.

Proof Let {x_1, x_2, . . .} be a countable dense subset of X. Define ι : X → I^∞ by setting

\[ \iota_i(x) := \min\{ d(x, x_i), 1/i \} . \]

Clearly ι is a continuous injection. To show that the inverse is continuous, suppose that {x_j} is a sequence with ι(x_j) → ι(x). If it is not the case that x_j → x, then there is a neighborhood U of x that (perhaps after passing to a subsequence) does not have any elements of the sequence. Choose x_i in that neighborhood. The sequence of numbers min{d(x_j, x_i), 1/i} is bounded below by a positive number, contrary to the assumption that ι(x_j) → ι(x). □

6.6 Dugundji’s Theorem

The well known Tietze extension theorem asserts that if a topological space X is normal and f : A → [0, 1] is continuous, where A ⊂ X is closed, then f has a continuous extension to all of X. A map into a finite dimensional Euclidean space is continuous if its component functions are each continuous, so Tietze’s theorem is adequate for finite dimensional applications. Mostly, however, we will work with spaces that are potentially infinite dimensional, for which we will need the following variant due to Dugundji (1951).

Theorem 6.5 If A is a closed subset of a metric space (X, d), Y is a locally convex TVS, and f : A → Y is continuous, then there is a continuous extension f̄ : X → Y whose image is contained in the convex hull of f(A).

Proof The sets U_{d(x,A)/2}(x) are open and cover X \ A. Theorem 6.1 implies the existence of an open locally finite refinement {Wα}α∈I. Theorem 6.2 implies the existence of a partition of unity {ϕα}α∈I subordinate to {Wα}α∈I. For each α choose aα ∈ A with d(aα, Wα) < 2d(A, Wα), and define the extension by setting

\[ \bar f(x) := \sum_{\alpha \in I} \varphi_\alpha(x) f(a_\alpha) \qquad (x \in X \setminus A) . \]

Clearly f̄ is continuous at every point of X \ A and at every interior point of A. Let a be a point in the boundary of A, let U be a neighborhood of f(a), which we


may assume to be convex, and choose δ > 0 small enough that f(a′) ∈ U whenever a′ ∈ Uδ(a) ∩ A. Consider x ∈ U_{δ/7}(a) ∩ (X \ A). For any α such that x ∈ Wα and x′ such that Wα ⊂ U_{d(x′,A)/2}(x′) we have

\[ d(a_\alpha, W_\alpha) \ge d(a_\alpha, x') - d(x', A)/2 \ge d(a_\alpha, x') - d(x', a_\alpha)/2 = d(a_\alpha, x')/2 \]

and

\[ d(x', x) \le d(x', A)/2 \le d(W_\alpha, A) \le d(W_\alpha, a_\alpha) , \]

so

\[ d(a_\alpha, x) \le d(a_\alpha, x') + d(x', x) \le 3 d(a_\alpha, W_\alpha) \le 6 d(A, W_\alpha) \le 6 d(a, x) . \]

Thus d(aα, a) ≤ d(aα, x) + d(x, a) ≤ 7d(x, a) < δ whenever x ∈ Wα, so f̄(x) ∈ U. □

Exercises

6.1 Let V be a vector space, and let p : V → R+ be a function such that p(tv) = tp(v) and p(v + w) ≤ p(v) + p(w) for all t ≥ 0 and v, w ∈ V. Let W be a linear subspace of V, and let λ : W → R be a linear functional such that λ(w) ≤ p(w) for all w ∈ W. Observe that λ(w + w′) ≤ p(w − v) + p(w′ + v) for all v ∈ V and w, w′ ∈ W.

(a) For a fixed v ∈ V and c ∈ R such that

\[ \sup_{w \in W} \bigl( \lambda(w) - p(w - v) \bigr) \le c \le \inf_{w' \in W} \bigl( p(w' + v) - \lambda(w') \bigr) , \]

let λ′ : W + Rv → R be the linear functional λ′(w + tv) = λ(w) + ct. Prove that λ′(w + tv) ≤ p(w + tv) for all w ∈ W and t ∈ R.

(b) Using Zorn’s lemma, prove that λ has a linear extension λ′ : V → R such that λ′(v) ≤ p(v) for all v ∈ V.

Now let V be a locally convex topological vector space, let U ⊂ V be an open convex set containing the origin, and let z be a point in V that is not contained in U.

(c) Show that the function p_U(v) := inf{ t > 0 : v ∈ tU } satisfies p_U(tv) = tp_U(v) and p_U(v + w) ≤ p_U(v) + p_U(w) for all t ≥ 0 and v, w ∈ V.

(d) Show that there is a continuous linear functional λ : V → R such that λ(z) = 1 and λ(u) < 1 for all u ∈ U.

(e) (Hahn–Banach Theorem) Prove that if X ⊂ V is nonempty, open, and convex, Y ⊂ V is nonempty and convex, and X ∩ Y = ∅, then there is a continuous linear functional λ : V → R and a constant c such that λ(x) < c ≤ λ(y) for all x ∈ X and y ∈ Y. (This result is applied several times in the remaining exercises.)
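The function p_U of part (c) can be explored numerically before attempting the proof. The sketch below is an illustration only, under these assumptions: V is the plane, U is given by a membership test, and U is open, convex, and contains the origin, so that v ∈ tU holds for every t above p_U(v) and bisection applies. The function `gauge` and the example set `in_ellipse` are hypothetical names.

```python
def in_ellipse(p):
    """Membership test for the open ellipse x^2/4 + y^2 < 1 (example U)."""
    return p[0] ** 2 / 4.0 + p[1] ** 2 < 1.0


def gauge(v, contains, iterations=60):
    """Approximate p_U(v) = inf{ t > 0 : v in tU } by bisection.

    `contains(p)` must decide membership in an open convex set U that
    contains the origin; then v/t lies in U exactly when t > p_U(v).
    """
    if all(c == 0 for c in v):
        return 0.0
    hi = 1.0
    while not contains([c / hi for c in v]):  # grow until v lies in hi*U
        hi *= 2.0
    lo = 0.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if contains([c / mid for c in v]):
            hi = mid
        else:
            lo = mid
    return hi
```

For the ellipse above the exact value is p_U(x, y) = (x²/4 + y²)^{1/2}, so for instance `gauge((0.0, 3.0), in_ellipse)` is approximately 3, and sample vectors can be used to observe the two properties asserted in (c).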


6.2 For a vector space V, V+ is the set of linear functionals v+ : V → R. A subspace Γ ⊂ V+ is total if, for each v ∈ V other than the origin, there is some γ ∈ Γ such that γ(v) ≠ 0. Note that V is automatically a total subspace of (V+)+.

(a) Let V be a topological vector space. The dual space of V is the set V∗ ⊂ V+ of linear functionals that are continuous. Sums and scalar multiples of continuous linear functionals are easily shown to be continuous, so V∗ is a linear subspace of V+. Prove that if V is locally convex, then V∗ is total.

If Γ is a linear subspace of V+, the Γ topology of V is the coarsest topology such that each γ ∈ Γ is continuous. Equivalently, it is the topology generated by the subbase of all sets of the form γ^{−1}(U) where γ ∈ Γ and U ⊂ R is open.

(b) Prove that if Γ is total, then V with the Γ topology is a locally convex topological vector space.

Let c : V → R+ be any function. For each v ∈ V let I_v := [−c(v), c(v)], and let I := ∏_v I_v, endowed with the product topology. Let

K := { v+ ∈ V+ : |v+(v)| ≤ c(v) for all v ∈ V } .

Let τ : K → I be the map with v-coordinate τv(v+) := v+(v).

(c) Prove that τ is an embedding when K has the subspace topology inherited from the V topology of V+.

(d) Prove that τ(K) is a closed subset of I.

(e) Prove that K is compact.

6.3 (Banach–Alaoglu Theorem) If V is a topological vector space, the V topology of V∗ is called the weak∗ topology. Let V be a normed space. The operator norm of V∗ is given by ‖v∗‖ = sup_{‖v‖≤1} |v∗(v)|. Prove that the operator norm is, in fact, a norm. Prove that the unit ball { v∗ ∈ V∗ : ‖v∗‖ ≤ 1 } is compact in the weak∗ topology. (Let c : V → R+ be the function c(v) := ‖v‖, and apply the last exercise.) The Alaoglu–Bourbaki theorem (e.g., p. 80 of Kantorovich and Akilov 1982) is a generalization for locally convex spaces.

6.4 The closed convex hull of a subset A of a topological vector space V is the closure of the convex hull of A. Prove that if C is a convex subset of V, then the closure of C is convex. Conclude that the closed convex hull of A is the smallest superset of A that is both closed and convex.

6.5 If K is a subset of a vector space, a nonempty set A ⊂ K is an extremal set of K if v0, v1 ∈ A whenever v0, v1 ∈ K, 0 < t < 1, and (1 − t)v0 + tv1 ∈ A. If a singleton {v} is an extremal set, then v is an extreme point of K. Let K be a nonempty compact subset of a locally convex topological vector space V.

(a) Use Zorn’s lemma to prove that every compact extremal subset of K contains such a set that is minimal, insofar as it has no proper subset that is a compact extremal subset of K.


(b) Prove that if A is a compact extremal subset of K and λ : V → R is a continuous linear functional, then argmin_{v∈A} λ(v) is a compact extremal subset of K.

(c) Prove that a minimal compact extremal subset of K is a singleton.

(d) (Krein–Milman Theorem) Prove that the closed convex hull of K is contained in the closed convex hull of the set of extreme points of K.

6.6 If X is a topological space and (Y, d) is a metric space, a sequence of functions {fn} from X to Y is uniformly Cauchy if, for every ε > 0, there is an integer N such that d(fm(x), fn(x)) < ε for all x and m, n ≥ N.

(a) Prove that if a uniformly Cauchy sequence of continuous functions converges pointwise to f, then f is continuous.

Now suppose that X is paracompact, let V be a topological vector space, and let Φ : X → V be a lower hemicontinuous correspondence with convex values.

(b) Prove that for any neighborhood U of the origin in V there is a locally finite open cover {Wα}α∈A of X such that ⋂_{x∈Wα} (Φ(x) + U) ≠ ∅ for each α.

A selection from Φ is a function f : X → V such that f (x) ∈ Φ(x) for all x ∈ X .

(c) Prove that for any convex neighborhood U of the origin in V there is a continuous selection from the correspondence x ↦ Φ(x) + U.

(d) Prove that if U is a neighborhood of the origin in V, f : X → V is continuous, and Φ(x) ∩ (f(x) + U) ≠ ∅ for all x ∈ X, then the correspondence x ↦ Φ(x) ∩ (f(x) + U) is lower hemicontinuous.

(e) Suppose that U and U′ are convex neighborhoods of the origin of V, U is symmetric (in the sense that U = −U), and f : X → V is a continuous selection from x ↦ Φ(x) + U. Prove that there is a continuous selection f′ from x ↦ Φ(x) + U′ such that f′(x) ∈ f(x) + U + U′ for all x.

(f) (Michael Selection Theorem) Prove that if V is a Banach space and Φ : X → V is a lower hemicontinuous correspondence with closed convex values, then Φ has a continuous selection.

This version of the result is from Michael (1956). Repovš and Semenov (2014) is a survey of recent contributions to the extensive literature descended from this seminal contribution.

6.7 (Gale and Mas-Colell 1975, 1979) Let X := X1 × · · · × Xn where each Xi is a nonempty compact convex subset of a Banach space. For each i let Ui be an open subset of X, and let ϕi : Ui → Xi be a lower hemicontinuous correspondence with compact convex values. Prove that there is an x ∈ X such that for each i, either x ∉ Ui or xi ∈ ϕi(x). (Hint: use the Michael selection theorem to define an upper hemicontinuous convex valued correspondence Ψi : X → Xi, then apply the Fan–Glicksberg theorem to the correspondence Ψ : X → X given by Ψ(x) := ∏_i Ψi(x).)

6.8 Let V and W be normed spaces. A linear operator λ : V → W is bounded if there is a constant C such that ‖λ(v)‖ ≤ C‖v‖ for all v ∈ V. Prove that λ is continuous if and only if it is bounded.

6.9 In the Hilbert space H of square summable sequences, for n = 1, 2, . . . let u_n = (0, . . . , 0, 1/n, 0, . . .) where the nonzero entry is the n-th component. Let A := {u_1, u_2, . . .} ∪ {0}. Since u_n → 0, this set is compact, hence closed. Construct a sequence in the convex hull of A that converges to a point outside the convex hull of A, thereby showing that the convex hull of A is not closed, hence also not compact.

Chapter 7 Essential Sets of Fixed Points

Figure 7.1 shows a function f : [0, 1] → [0, 1] with two fixed points, s and t. Intuitively, they are qualitatively different, in that a small perturbation of f can result in a function that has no fixed points near s, but this is not the case for t. This distinction was recognized by Fort (1950) who described s as inessential, while t is said to be essential.

In game theory one often deals with correspondences with sets of fixed points that are infinite, and include continua such as submanifolds. As we will see, the definition proposed by Fort can be extended to sets of fixed points rather easily: roughly, a set of fixed points is essential if every neighborhood of it contains fixed points of every “sufficiently close” perturbation of the given correspondence. (Here one needs to be careful, because in the standard terminology of game theory, following Jiang 1963, essential Nash equilibria, and essential sets of Nash equilibria, are defined in terms of perturbations of the payoffs. This is a form of Q-robustness, which is studied in Sect. 7.4.) But it is easy to show that the set of all fixed points is essential, so some additional condition must be imposed before essential sets can be used to distinguish some fixed points from others.

The condition that works well, at least from a mathematical viewpoint, is connectedness. This chapter’s main result, Theorem 7.2, which is due to Kinoshita (1952), asserts that minimal (in the sense of set inclusion) essential sets are connected. The proof has the following outline. Let K be a minimal essential set of fixed points of an upper hemicontinuous convex valued correspondence F : X → X, where X is a compact, convex subset of a locally convex topological vector space. Suppose that K is disconnected, so there are disjoint open sets U1, U2 such that K1 := K ∩ U1 and K2 := K ∩ U2 are nonempty and K1 ∪ K2 = K. Since K is minimal, K1 and K2 are not essential, so there are perturbations F1 and F2 of F such that each Fi has no fixed points near Ki. Let α1, α2 : X → [0, 1] be continuous functions such that each αi vanishes outside Ui and is identically 1 near Ki, and let α : X → [0, 1] be the function α(x) := 1 − α1(x) − α2(x). Then α, α1, α2 is a partition of unity subordinate to the open cover X \ K, U1, U2. The correspondence

\[ x \mapsto \alpha(x)F(x) + \alpha_1(x)F_1(x) + \alpha_2(x)F_2(x) \]

is then a perturbation of F that has no fixed points near K, which contradicts the assumption that K is essential. Much of this chapter is concerned with filling in the technical details of this argument.

Fig. 7.1 A function with an essential fixed point and an inessential fixed point

Section 7.1 gives the Fan–Glicksberg theorem, which is the extension of the Kakutani fixed point theorem to infinite dimensional sets. Section 7.2 shows that convex valued correspondences can be approximated by functions, and defines convex combinations of convex valued correspondences, with continuously varying weights. Section 7.4 then states and proves Kinoshita’s theorem, which asserts that minimal essential sets are connected. There remains the matter of proving that minimal essential sets actually exist, which is also handled in Sect. 7.4.

7.1 The Fan–Glicksberg Theorem

We now extend the Kakutani fixed point theorem to correspondences with infinite dimensional domains. The result below was proved independently by Fan (1952) and Glicksberg (1952) using quite similar methods; our proof is perhaps a bit closer to Fan’s. In a sense the result was already known, since it can be derived from the Eilenberg–Montgomery theorem, but the proof below is much simpler.


Theorem 7.1 (Fan, Glicksberg) If V is a locally convex topological vector space, X ⊂ V is nonempty, convex, and compact, and F : X → X is an upper hemicontinuous convex valued correspondence, then F has a fixed point.

We treat two technical points separately:

Lemma 7.1 If V is a (not necessarily locally convex) topological vector space and K, C ⊂ V with K compact and C closed, then K + C is closed.

Proof We will show that the complement is open. Let y be a point of V that is not in K + C. For each x ∈ K, translation invariance of the topology of V implies that x + C is closed, so Lemma 6.2 gives a neighborhood W_x of the origin such that (y + W_x + W_x) ∩ (x + C) = ∅. Since we can replace W_x with −W_x ∩ W_x, we may assume that −W_x = W_x, so that (y + W_x) ∩ (x + C + W_x) = ∅. Choose x_1, . . . , x_k such that the sets x_i + W_{x_i} cover K, and let W := W_{x_1} ∩ . . . ∩ W_{x_k}. Now

\[ (y + W) \cap (K + C) \subset (y + W) \cap \bigcup_i (x_i + C + W_{x_i}) \subset \bigcup_i (y + W_{x_i}) \cap (x_i + C + W_{x_i}) = \emptyset . \]

□

Lemma 7.2 If V is a (not necessarily locally convex) topological vector space and K, C, U ⊂ V with K compact, C closed, U open, and C ∩ K ⊂ U, then there is a neighborhood of the origin W such that (C + W) ∩ K ⊂ U.

Proof Let L := K \ U. Our goal is to find a neighborhood of the origin W such that (C + W) ∩ L = ∅. Since C is closed, for each x ∈ L there is (by Lemma 6.2) a neighborhood W_x of the origin such that (x + W_x + W_x) ∩ C = ∅. We can replace W_x with −W_x ∩ W_x, so we may insist that −W_x = W_x. As a closed subset of K, L is compact, so there are x_1, . . . , x_k such that the sets x_i + W_{x_i} cover L. Let W := W_{x_1} ∩ . . . ∩ W_{x_k}. Then W = −W, so if (C + W) ∩ L is nonempty, then so is C ∩ (L + W), but

\[ L + W \subset \Bigl( \bigcup_i x_i + W_{x_i} \Bigr) + W \subset \bigcup_i x_i + W_{x_i} + W_{x_i} , \]

and each of the sets x_i + W_{x_i} + W_{x_i} is disjoint from C. □

Proof of Theorem 7.1 Let U be a closed convex neighborhood of the origin. (Lemma 6.7 implies that such a U exists.) Let F_U : X → X be the correspondence F_U(x) := (F(x) + U) ∩ X. Evidently F_U(x) is nonempty and convex, and the first of the two results above implies that it is a closed subset of X, which is compact, so it is also compact.

To show that F_U is upper hemicontinuous we consider a particular x and a neighborhood T of F_U(x). The second of the two results above implies that there is a neighborhood W of the origin such that (F(x) + U + W) ∩ X ⊂ T. Since F is upper hemicontinuous there is a neighborhood A of x such that F(x′) ⊂ F(x) + W for all x′ ∈ A, and for such an x′ we have

\[ F_U(x') = (F(x') + U) \cap X \subset (F(x) + W + U) \cap X \subset T . \]

Since X is compact, there are finitely many points x_1, . . . , x_k ∈ X such that x_1 + U, . . . , x_k + U is a cover of X. Let C be the convex hull of these points. Define G : C → C by setting G(x) := F_U(x) ∩ C; since G(x) contains some x_i, it is nonempty, and of course it is convex. Since C is the image of the continuous function (α_1, . . . , α_k) ↦ α_1x_1 + · · · + α_kx_k from the (k − 1)-dimensional simplex, it is compact, and consequently closed because V is Hausdorff. Since Gr(G) = Gr(F_U) ∩ (C × C) is closed, G is upper hemicontinuous. Therefore G satisfies the hypothesis of the Kakutani fixed point theorem and has a nonempty set of fixed points. Any fixed point of G is a fixed point of F_U, so the set ℱ(F_U) of fixed points of F_U is nonempty. Of course it is also closed in X, hence compact.

The collection of compact sets

{ ℱ(F_U) : U is a closed convex neighborhood of the origin }

has the finite intersection property because

\[ \emptyset \ne \mathcal{F}(F_{U_1 \cap \ldots \cap U_k}) \subset \mathcal{F}(F_{U_1}) \cap \ldots \cap \mathcal{F}(F_{U_k}) . \]

Since X is compact, the intersection of the entire collection is nonempty. Suppose that x∗ is an element of this intersection. If x∗ were not an element of F(x∗) there would be a closed convex neighborhood U of the origin such that (x∗ − U) ∩ F(x∗) = ∅, which contradicts x∗ ∈ ℱ(F_U), so x∗ is a fixed point of F. □

7.2 Convex Valued Correspondences

Let X be a topological space, and let Y be a subset of a topological vector space V. Then Con(X, Y) is the set of upper hemicontinuous convex valued correspondences from X to Y. Let Con_S(X, Y) denote this set endowed with the relative topology inherited from U_S(X, Y), which was defined in Sect. 5.2. This section treats two topological issues that are particular to convex valued correspondences: (a) approximation by continuous functions; (b) the continuity of the process by which they are recombined using convex combinations and partitions of unity.

The following result is a variant, for convex valued correspondences, of the approximation theorem (Theorem 9.1) that is the subject of the next chapter.

Proposition 7.1 If X is a metric space, V is locally convex, and Y is either open or convex, then C(X, Y) is dense in Con_S(X, Y).


Proof Fix $F \in \mathrm{Con}(X, Y)$ and a neighborhood $U \subset X \times Y$ of $\mathrm{Gr}(F)$. Our goal is to produce a continuous function $f : X \to Y$ with $\mathrm{Gr}(f) \subset U$.

Consider a particular $x \in X$. For each $y \in F(x)$ there is a neighborhood $T_{x,y}$ of $x$ and (by Lemma 6.2) a neighborhood $W_{x,y}$ of the origin in $V$ such that

$$T_{x,y} \times (y + W_{x,y} + W_{x,y}) \subset U.$$

If $Y$ is open we can also require that $y + W_{x,y} + W_{x,y} \subset Y$. The compactness of $F(x)$ implies that there are $y_1, \ldots, y_k$ such that the $y_i + W_{x,y_i}$ cover $F(x)$. Setting $T_x := \bigcap_i T_{x,y_i}$ and $W_x := \bigcap_i W_{x,y_i}$, we have $T_x \times (F(x) + W_x) \subset U$, and $F(x) + W_x \subset Y$ if $Y$ is open. Since $V$ is locally convex, we may assume that $W_x$ is convex because we can replace it with a smaller convex neighborhood. Upper hemicontinuity gives a $\delta_x > 0$ such that $U_{\delta_x}(x) \subset T_x$ and $F(x') \subset F(x) + W_x$ for all $x' \in U_{\delta_x}(x)$.

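Since metric spaces are paracompact there is a locally finite open cover $\{T_\alpha\}_{\alpha \in A}$ of $X$ that refines $\{U_{\delta_x/2}(x)\}_{x \in X}$. For each $\alpha \in A$ choose $x_\alpha$ such that $T_\alpha \subset U_{\delta_\alpha/2}(x_\alpha)$, where $\delta_\alpha := \delta_{x_\alpha}$, and choose $y_\alpha \in F(x_\alpha)$. Since metric spaces are normal, Theorem 6.2 gives a partition of unity $\{\psi_\alpha\}$ subordinate to $\{T_\alpha\}_{\alpha \in A}$. Let $f : X \to V$ be the function

$$f(x) := \sum_{\alpha \in A} \psi_\alpha(x)\, y_\alpha.$$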

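Fixing $x \in X$, let $\alpha_1, \ldots, \alpha_n$ be the $\alpha$ such that $\psi_\alpha(x) > 0$. After renumbering we may assume that $\delta_{\alpha_1} \ge \delta_{\alpha_i}$ for all $i = 2, \ldots, n$. For each such $i$ we have $x_{\alpha_i} \in U_{\delta_{\alpha_i}/2}(x) \subset U_{\delta_{\alpha_1}}(x_{\alpha_1})$, so that $y_{\alpha_i} \in F(x_{\alpha_1}) + W_{x_{\alpha_1}}$. Since $F(x_{\alpha_1}) + W_{x_{\alpha_1}}$ is convex we have

$$(x, f(x)) \in U_{\delta_{\alpha_1}}(x_{\alpha_1}) \times (F(x_{\alpha_1}) + W_{x_{\alpha_1}}) \subset U.$$

Note that $f(x)$ is contained in $Y$ either because $Y$ is convex or because $F(x_{\alpha_1}) + W_{x_{\alpha_1}} \subset Y$. Since $x$ was arbitrary, we have shown that $\mathrm{Gr}(f) \subset U$. □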
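The construction in this proof can be tried out numerically in one dimension. The following is a minimal sketch under assumptions that are not in the text: the correspondence `F` on $[-1, 1]$ (a step with an interval value at 0), the uniform grid of spacing `h`, and the hat functions used as the partition of unity are all illustrative choices. The function $f(x) = \sum_k \psi_k(x) y_k$ is built exactly as above, and the script measures how far its graph strays from $\mathrm{Gr}(F)$.

```python
import math

def F(x):
    """Hypothetical correspondence on [-1, 1]; returns the interval F(x) = [lo, hi]."""
    if x < 0:
        return (0.0, 0.0)
    if x > 0:
        return (1.0, 1.0)
    return (0.0, 1.0)

def make_f(h):
    """f(x) = sum_k psi_k(x) * y_k with hat functions psi_k centred at the nodes k*h."""
    n = round(1 / h)
    nodes = [k * h for k in range(-n, n + 1)]
    ys = [sum(F(xk)) / 2 for xk in nodes]        # y_k in F(x_k): the midpoint

    def f(x):
        # psi_k(x) = max(0, 1 - |x - x_k| / h); these sum to 1 on [-1, 1]
        return sum(max(0.0, 1.0 - abs(x - xk) / h) * yk for xk, yk in zip(nodes, ys))
    return f

def dist_to_graph(x, y):
    """Distance from (x, y), with -1 <= x <= 1, to Gr(F), a union of three segments."""
    d_left = math.hypot(max(x, 0.0), y)                    # {(t, 0) : -1 <= t <= 0}
    d_mid = math.hypot(x, y - min(1.0, max(0.0, y)))       # {0} x [0, 1]
    d_right = math.hypot(min(x, 0.0), y - 1.0)             # {(t, 1) : 0 <= t <= 1}
    return min(d_left, d_mid, d_right)

h = 0.125
f = make_f(h)
worst = max(dist_to_graph(x, f(x)) for x in (i / 1000 - 1 for i in range(2001)))
print(worst <= h)   # the sampled graph of f stays within distance h of Gr(F)
```

Shrinking `h` plays the role of shrinking the neighborhood $U$ of $\mathrm{Gr}(F)$ in this example.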

7.3 Convex Combinations of Correspondences

We now study correspondences constructed from given correspondences by taking a convex combination, where the weights are given by a partition of unity. Let $X$ be a topological space, and let $V$ be a topological vector space. Since addition and scalar multiplication are continuous, Proposition 4.2 and Lemma 4.12 imply that the composition

$$(\alpha, K) \mapsto \{\alpha\} \times K \mapsto \alpha K = \{\, \alpha v : v \in K \,\} \tag{7.1}$$

and the Minkowski sum

$$(K, L) \mapsto K \times L \mapsto K + L := \{\, v + w : v \in K,\ w \in L \,\} \tag{7.2}$$

are continuous functions from $\mathbb{R} \times \mathcal{K}(V)$ and $\mathcal{K}(V) \times \mathcal{K}(V)$ to $\mathcal{K}(V)$.

These operations define continuous functions on the corresponding spaces of functions and correspondences. Let $C_S(X)$ and $C_W(X)$ denote the spaces $C_S(X, \mathbb{R})$ and $C_W(X, \mathbb{R})$ defined in Sect. 5.5.

Lemma 7.3 The function $(\psi, F) \mapsto \psi F$ is continuous as a function from $C_S(X) \times U_S(X, V)$ to $U_S(X, V)$, and also as a function from $C_W(X) \times U_W(X, V)$ to $U_W(X, V)$. The function $(F_1, F_2) \mapsto F_1 + F_2$ is continuous as a function from $U_S(X, V) \times U_S(X, V)$ to $U_S(X, V)$, and also as a function from $U_W(X, V) \times U_W(X, V)$ to $U_W(X, V)$.

Proof By Lemmas 5.11 and 5.12, the continuity of $(\psi, F) \mapsto \psi F$ as a function from $C_S(X) \times U_S(X, V)$ to $U_S(X, V)$ is equivalent to its continuity as a function from $C_S(X) \times C_S(X, \mathcal{K}(V))$ to $C_S(X, \mathcal{K}(V))$, and similarly for the other functions. In view of this the claims follow from Lemmas 5.14–5.17, and the continuity of scalar product and Minkowski sum noted above. □

Let $PU^k(X)$ be the space of $k$-element partitions of unity $\psi_1, \ldots, \psi_k$ of $X$. Let $PU^k_S(X)$ and $PU^k_W(X)$ be $PU^k(X)$ endowed with the relative topologies it inherits as subspaces of $C_S(X)^k$ and $C_W(X)^k$. The result above implies:

Proposition 7.2 The function

$$(\psi_1, \ldots, \psi_k, F_1, \ldots, F_k) \mapsto \psi_1 F_1 + \cdots + \psi_k F_k$$

is continuous as a function from $PU^k_S(X) \times U_S(X, V)^k$ to $U_S(X, V)$ and also as a function from $PU^k_W(X) \times U_W(X, V)^k$ to $U_W(X, V)$.
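For a concrete feel for the map in Proposition 7.2, here is a small sketch in the simplest setting $V = \mathbb{R}$, where compact convex sets are closed intervals. Everything in it is an illustrative assumption rather than part of the text: intervals are stored as pairs `(lo, hi)`, and the two weights and two interval valued correspondences on $[0, 1]$ are made up for the example.

```python
def scale(alpha, K):
    """alpha * K = { alpha * v : v in K } for an interval K = (lo, hi)."""
    a, b = alpha * K[0], alpha * K[1]
    return (min(a, b), max(a, b))

def minkowski_sum(K, L):
    """K + L = { v + w : v in K, w in L } for intervals K and L."""
    return (K[0] + L[0], K[1] + L[1])

def convex_combination(psis, Fs):
    """The correspondence x -> psi_1(x) F_1(x) + ... + psi_k(x) F_k(x)."""
    def G(x):
        total = (0.0, 0.0)
        for psi, F in zip(psis, Fs):
            total = minkowski_sum(total, scale(psi(x), F(x)))
        return total
    return G

# Hypothetical data: a two-element partition of unity on [0, 1] and two
# constant interval valued correspondences.
psi1 = lambda x: 1.0 - x
psi2 = lambda x: x
F1 = lambda x: (0.0, 1.0)
F2 = lambda x: (2.0, 3.0)

G = convex_combination([psi1, psi2], [F1, F2])
print(G(0.25))   # 0.75 * [0, 1] + 0.25 * [2, 3]
```

The value of `G` moves continuously from $F_1$ at $x = 0$ to $F_2$ at $x = 1$, which is the kind of recombination used in the proof of Kinoshita's theorem below.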

7.4 Kinoshita’s Theorem

Let $X$ be a compact convex subset of a locally convex topological vector space, and fix a particular $F \in \mathrm{Con}(X, X)$.

Definition 7.1 A set $K \subset \mathcal{F}(F)$ is an essential set of fixed points of $F$ if it is compact and for any open $U \supset K$ there is a neighborhood $V \subset \mathrm{Con}_S(X, X)$ of $F$ such that $\mathcal{F}(F') \cap U \ne \emptyset$ for all $F' \in V$.

The following result from Kinoshita (1952) is a key element of the theory of essential sets.

Theorem 7.2 (Kinoshita) If $K \subset \mathcal{F}(F)$ is essential and $K_1, \ldots, K_k$ is a partition of $K$ into disjoint compact sets, then some $K_j$ is essential.

Proof Suppose that no $K_j$ is essential. Then for each $j = 1, \ldots, k$ there is a neighborhood $U_j$ of $K_j$ such that for every neighborhood $V_j \subset \mathrm{Con}_S(X, X)$ of $F$ there is an $F_j \in V_j$ with no fixed points in $U_j$. Replacing the $U_j$ with smaller neighborhoods if need be, we can assume that they are pairwise disjoint. Let $U$ be a neighborhood of $X \setminus (U_1 \cup \ldots \cup U_k)$ whose closure does not intersect $K$. A compact Hausdorff space is normal, so Theorem 6.2 implies the existence of a partition of unity $\varphi_1, \ldots, \varphi_k, \varphi : X \to [0, 1]$ subordinate to the open cover $U_1, \ldots, U_k, U$. Let $V \subset \mathrm{Con}_S(X, X)$ be a neighborhood of $F$. Proposition 7.2 implies that there are neighborhoods $V_1, \ldots, V_k \subset \mathrm{Con}_S(X, X)$ of $F$ such that $\varphi_1 F_1 + \cdots + \varphi_k F_k + \varphi F \in V$ whenever $F_1 \in V_1, \ldots, F_k \in V_k$. For each $j$ we can choose an $F_j \in V_j$ that has no fixed points in $U_j$. Then $\varphi_1 F_1 + \cdots + \varphi_k F_k + \varphi F$ has no fixed points in $X \setminus U$ because on each $U_j \setminus U$ it agrees with $F_j$. Since $X \setminus U$ is a neighborhood of $K$ and $V$ was arbitrary, this contradicts the assumption that $K$ is essential. □

Recall that a topological space is connected if it is not the union of two disjoint nonempty open sets. A subset of a topological space is connected if the relative topology makes it a connected space.

Corollary 7.1 A minimal essential set is connected.

Proof Let $K$ be an essential set. If $K$ is not connected, then there are disjoint open sets $U_1, U_2$ such that $K \subset U_1 \cup U_2$ and $K_1 := K \cap U_1$ and $K_2 := K \cap U_2$ are both nonempty. Since $K_1$ and $K_2$ are closed subsets of $K$, they are compact, so Kinoshita's theorem implies that either $K_1$ or $K_2$ is essential. Consequently $K$ cannot be minimal. □

7.5 Minimal Q-Robust Sets

Naturally we would like to know whether minimal essential sets exist. Because of important applications in game theory, we will develop the analysis in the context of a slightly more general concept.

Definition 7.2 A pointed space is a pair $(A, a_0)$ where $A$ is a topological space and $a_0 \in A$. A pointed map $f : (A, a_0) \to (B, b_0)$ between pointed spaces is a continuous function $f : A \to B$ with $f(a_0) = b_0$.

Definition 7.3 Suppose $(A, a_0)$ is a pointed space and

$$Q : (A, a_0) \to (\mathrm{Con}_S(X, X), F)$$

is a pointed map. A nonempty compact set $K \subset \mathcal{F}(F)$ is $Q$-robust if, for every neighborhood $V \subset X$ of $K$, there is a neighborhood $U \subset A$ of $a_0$ such that $\mathcal{F}(Q(a)) \cap V \ne \emptyset$ for all $a \in U$.

A set of fixed points is essential if and only if it is $\mathrm{Id}_{(\mathrm{Con}_S(X,X),F)}$-robust. At the other extreme, if $Q$ is a constant function, so that $Q(a) = F$ for all $a$, then any nonempty compact $K \subset \mathcal{F}(F)$ is $Q$-robust. The weakening of the notion of an essential set provided by this definition is useful when certain perturbations of $F$ are thought to be more relevant than others, or when the perturbations of $F$ are derived from perturbations of the parameter $a$ in a neighborhood of $a_0$. Some of the most important refinements of the Nash equilibrium concept have this form. In particular, Jiang (1963) defines essential Nash equilibria, and essential sets of Nash equilibria, in terms of perturbations of the game's payoffs, while Kohlberg and Mertens (1986) define stable sets of Nash equilibria in terms of those perturbations of the payoffs that are induced by the trembles of Selten (1975).

Lemma 7.4 $\mathcal{F}(F)$ is $Q$-robust.

Proof The continuity of $\mathcal{F}$ (Theorem 5.1) implies that for any neighborhood $V \subset X$ of $\mathcal{F}(F)$ there is a neighborhood $U \subset A$ of $a_0$ such that $\mathcal{F}(Q(a)) \subset V$ for all $a \in U$. The Fan–Glicksberg fixed point theorem implies that $\mathcal{F}(Q(a))$ is nonempty. □

This result shows that if our goal is to discriminate between some fixed points and others, these concepts must be strengthened in some way. The two main methods for doing this are to require either connectedness or minimality.

Definition 7.4 A nonempty compact set $K \subset \mathcal{F}(F)$ is a minimal $Q$-robust set if it is $Q$-robust and minimal in the class of such sets: $K$ is $Q$-robust and no proper subset is $Q$-robust. A minimal connected $Q$-robust set is a connected $Q$-robust set that does not contain a proper subset that is connected and $Q$-robust.

In general a minimal $Q$-robust set need not be connected. For example, if $(A, a_0) = ((-1, 1), 0)$ and $Q(a)(t) = \operatorname{argmax}_{s \in [0,1]} as$ (so that $F(t) = [0, 1]$ for all $t$) then $\mathcal{F}(Q(a))$ is $\{0\}$ if $a < 0$ and it is $\{1\}$ if $a > 0$, so the only minimal $Q$-robust set is $\{0, 1\}$. In view of this one must be careful to distinguish between a minimal connected $Q$-robust set and a minimal $Q$-robust set that happens to be connected.
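The example can be checked mechanically. The sketch below is only an illustration: the helper names are hypothetical, neighborhoods are represented as finite unions of open intervals, and robustness is tested on a finite sample of parameters $a$ approaching $a_0 = 0$, which can refute robustness but cannot prove it.

```python
def fixed_points(a):
    """The fixed point set of Q(a), where Q(a)(t) = argmax over s in [0, 1] of a*s.
    Q(a) is a constant correspondence, so its fixed point set equals its value."""
    if a < 0:
        return (0.0, 0.0)
    if a > 0:
        return (1.0, 1.0)
    return (0.0, 1.0)      # a = 0: every t in [0, 1] is a fixed point

def meets(interval, opens):
    """True if the closed interval [lo, hi] meets a union of open intervals."""
    lo, hi = interval
    return any(lo < right and left < hi for left, right in opens)

def robust_on_samples(opens, steps=50):
    """Check that the fixed point set meets the neighborhood for a = +-2**-k and a = 0."""
    samples = [sign * 2.0 ** -k for k in range(1, steps) for sign in (-1, 1)] + [0.0]
    return all(meets(fixed_points(a), opens) for a in samples)

near_0 = [(-0.1, 0.1)]     # a neighborhood of {0}
near_1 = [(0.9, 1.1)]      # a neighborhood of {1}

print(robust_on_samples(near_0))            # False: a > 0 gives the fixed point 1 only
print(robust_on_samples(near_1))            # False: a < 0 gives the fixed point 0 only
print(robust_on_samples(near_0 + near_1))   # True on the sample
```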

Theorem 7.3 If $K \subset \mathcal{F}(F)$ is a $Q$-robust set, then it contains a minimal $Q$-robust set, and if $K$ is a connected $Q$-robust set, then it contains a minimal connected $Q$-robust set.

Proof Let $C$ be the set of $Q$-robust sets that are contained in $K$. We order this set by reverse inclusion, so that our goal is to show that $C$ has a maximal element. This follows from Zorn's lemma if we can show that any completely ordered subset $O$ has an upper bound in $C$. The finite intersection property implies that the intersection of all elements of $O$ is nonempty; let $K_\infty$ be this intersection. If $K_\infty$ is not $Q$-robust, then there is a neighborhood $V$ of $K_\infty$ such that every neighborhood $U$ of $a_0$ contains a point $a$ such that $Q(a)$ has no fixed points in $V$. If $L \in O$, we cannot have $L \subset V$ because $L$ is $Q$-robust, but now $\{\, L \setminus V : L \in O \,\}$ is a collection of compact sets with the finite intersection property, so it has a nonempty intersection that is contained in $K_\infty$ but disjoint from $V$. Of course this is absurd.

The argument for connected $Q$-robust sets follows the same lines, except that in addition to showing that $K_\infty$ is $Q$-robust, we must also show that it is connected. If not there are disjoint open sets $V_1$ and $V_2$ such that $K_\infty \subset V_1 \cup V_2$ and $K_\infty \cap V_1 \ne \emptyset \ne K_\infty \cap V_2$. For each $L \in O$ we have $L \cap V_1 \ne \emptyset \ne L \cap V_2$, so $L \setminus (V_1 \cup V_2)$ must be nonempty because $L$ is connected. As above, $\{\, L \setminus (V_1 \cup V_2) : L \in O \,\}$ has a nonempty intersection that is contained in $K_\infty$ but disjoint from $V_1 \cup V_2$, which is impossible. □

Exercises

We outline certain concepts and results from the theory of refinements of Nash equilibrium. It is assumed that the reader knows the basic elements of the theory of Nash equilibrium as they are laid out in Sect. 15.9. We do not discuss the conceptual significance of these concepts and results; relevant background can be found in, for example, Selten (1975), Myerson (1978), Kohlberg and Mertens (1986), and Myerson (1991).

Let $G = (S_1, \ldots, S_n, u_1, \ldots, u_n)$ be a given strategic form game: $S_1, \ldots, S_n$ are nonempty finite sets of pure strategies, and $u_1, \ldots, u_n : S \to \mathbb{R}$ are functions, where $S = S_1 \times \cdots \times S_n$ is the set of pure strategy profiles. For any nonempty finite set $X$ let

$$\Delta(X) := \{\, \mu : X \to [0, 1] : \sum_{x \in X} \mu(x) = 1 \,\}$$

be the set of probability measures on $X$. For each $i = 1, \ldots, n$ the set of mixed strategies for agent $i$ is $\Sigma_i := \Delta(S_i)$, and the set of totally mixed strategies is

$$\Sigma_i^\circ := \{\, \sigma_i \in \Sigma_i : \sigma_i(s_i) > 0 \text{ for all } s_i \in S_i \,\}.$$

The sets of mixed strategy profiles and totally mixed strategy profiles are $\Sigma := \Sigma_1 \times \cdots \times \Sigma_n$ and $\Sigma^\circ := \Sigma_1^\circ \times \cdots \times \Sigma_n^\circ$ respectively.

The functions $u_i$ are understood to be von Neumann–Morgenstern utility functions, and we extend $u_i$ to $\Sigma$ by taking expectations: $u_i(\sigma) := \sum_{s \in S} \bigl(\prod_j \sigma_j(s_j)\bigr) u_i(s)$. For each $i$ let $BR_i : \Sigma \to \Sigma_i$ be agent $i$'s best response correspondence: $BR_i(\sigma) := \operatorname{argmax}_{\tau_i \in \Sigma_i} u_i(\tau_i, \sigma_{-i})$, and let $BR : \Sigma \to \Sigma$ be the best response correspondence: $BR(\sigma) := BR_1(\sigma) \times \cdots \times BR_n(\sigma)$. A fixed point of $BR$ is a Nash equilibrium.
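These definitions translate directly into code. The sketch below is illustrative only: the data layout (payoffs as dictionaries keyed by pure strategy profiles, mixed strategies as dictionaries of exact fractions) and the function names are assumptions, and matching pennies is used as a made-up test game. Since expected payoffs are linear in each agent's own mixed strategy, it suffices to compare a profile against pure deviations.

```python
from fractions import Fraction
from itertools import product

def expected_payoff(u_i, sigma):
    """u_i(sigma): sum over pure profiles s of (prod_j sigma_j(s_j)) * u_i(s)."""
    total = Fraction(0)
    for s in product(*sigma):              # iterating a dict yields its keys
        prob = Fraction(1)
        for sigma_j, s_j in zip(sigma, s):
            prob *= sigma_j[s_j]
        total += prob * u_i[s]
    return total

def is_nash(u, sigma):
    """True if no agent gains from a pure deviation, i.e. sigma is a fixed point of BR."""
    for i, u_i in enumerate(u):
        current = expected_payoff(u_i, sigma)
        for s_i in sigma[i]:
            pure = {t: Fraction(int(t == s_i)) for t in sigma[i]}
            deviation = sigma[:i] + [pure] + sigma[i + 1:]
            if expected_payoff(u_i, deviation) > current:
                return False
    return True

# Hypothetical test game: matching pennies.
u1 = {("H", "H"): 1, ("H", "T"): -1, ("T", "H"): -1, ("T", "T"): 1}
u2 = {s: -v for s, v in u1.items()}
half = Fraction(1, 2)
uniform = [{"H": half, "T": half}, {"H": half, "T": half}]
pure_HH = [{"H": Fraction(1), "T": Fraction(0)}, {"H": Fraction(1), "T": Fraction(0)}]

print(is_nash([u1, u2], uniform))   # True
print(is_nash([u1, u2], pure_HH))   # False
```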

Let $Q^H$ be the set of possible payoffs $u = (u_1, \ldots, u_n)$ for games with the pure strategy sets $S_1, \ldots, S_n$. We endow $Q^H$ with the Euclidean topology derived from the obvious identification with $(\mathbb{R}^S)^n$. For $u \in Q^H$ and $\sigma \in \Sigma$ let $BR^u_i(\sigma) := \operatorname{argmax}_{\tau_i \in \Sigma_i} u_i(\tau_i, \sigma_{-i})$ for each $i$, and let $BR^u(\sigma) := BR^u_1(\sigma) \times \cdots \times BR^u_n(\sigma)$.

7.1 A Nash equilibrium $\sigma^*$ is essential (Wu and Jiang 1962) if, for every neighborhood $U \subset \Sigma$ of $\sigma^*$, there is a neighborhood $V \subset Q^H$ of $u$ such that for every $u' \in V$, $BR^{u'}$ has a fixed point in $U$. Give an example of a game with one player that does not have an essential Nash equilibrium.

7.2 (E. Solan and O. N. Solan) We study an example of a game with an isolated totally mixed Nash equilibrium that is not essential. In the game below the first player chooses the top or bottom row, the second player chooses the left or right column, and the third player chooses the left or right matrix. Let $S_1 = S_2 = S_3 = \{a, b\}$ where $a$ represents top or left and $b$ represents bottom or right.

Left matrix (third player chooses a):

        a             b
a   (1, 1, 1)     (−5, 0, 3)
b   (0, 3, −5)    (0, 0, 1)

Right matrix (third player chooses b):

        a             b
a   (3, −5, 0)    (1, 0, 0)
b   (0, 1, 0)     (0, 0, 0)

(a) Show that this game is symmetric in the following sense: if $(r, s, t) \in S$ is a pure strategy profile, then the first player's payoff at $(r, s, t)$ is the same as the second player's payoff at $(t, r, s)$ and the same as the third player's payoff at $(s, t, r)$.

(b) Show that this game has a single pure Nash equilibrium and no Nash equilibria in which one player plays a pure strategy and another player plays a mixed strategy.

(c) Show that if the second player plays $a$ with probability $\tfrac12 + y$ and the third player plays $a$ with probability $\tfrac12 + z$, then the first player is indifferent between playing $a$ and $b$ if and only if $y - z + yz = 0$.

(d) Show that $(0, 0, 0)$ is the unique solution of the system of equations $x - y + xy = 0$, $y - z + yz = 0$, and $z - x + zx = 0$.

(e) In this and the next part we consider the perturbed game obtained by adding $4\varepsilon$ to all of the nonzero payoffs of the first player. Show that if the second player plays $a$ with probability $\tfrac12 + y$ and the third player plays $a$ with probability $\tfrac12 + z$, then the first player is indifferent between playing $a$ and $b$ if and only if $\varepsilon + y - z + yz = 0$.

(f) Show that for small $\varepsilon > 0$ the system of equations $x - y + xy = 0$, $\varepsilon + y - z + yz = 0$, and $z - x + zx = 0$ has no solution.
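Before attempting the proofs, the claims in parts (a), (c), and (e) can be sanity checked with exact rational arithmetic. This is a sketch, not a solution: the dictionary `u` transcribes the two payoff matrices above, and the function names are hypothetical. The check computes the first player's payoff difference between $a$ and $b$ (the payoff to $b$ is always zero) and compares it with $4(\varepsilon + y - z + yz)$ at a few sample points.

```python
from fractions import Fraction
from itertools import product

# u[(r, s, t)] = (payoff 1, payoff 2, payoff 3); t selects the matrix.
u = {
    ("a", "a", "a"): (1, 1, 1),  ("a", "b", "a"): (-5, 0, 3),
    ("b", "a", "a"): (0, 3, -5), ("b", "b", "a"): (0, 0, 1),
    ("a", "a", "b"): (3, -5, 0), ("a", "b", "b"): (1, 0, 0),
    ("b", "a", "b"): (0, 1, 0),  ("b", "b", "b"): (0, 0, 0),
}

def symmetric():
    """Part (a): u_1(r, s, t) = u_2(t, r, s) = u_3(s, t, r) for all pure profiles."""
    return all(u[(r, s, t)][0] == u[(t, r, s)][1] == u[(s, t, r)][2]
               for r, s, t in product("ab", repeat=3))

def payoff_gap(y, z, eps=Fraction(0)):
    """u_1(a, .) - u_1(b, .) when player 2 plays a w.p. 1/2 + y and player 3 plays a
    w.p. 1/2 + z, after adding 4*eps to player 1's nonzero payoffs."""
    p = {"a": Fraction(1, 2) + y, "b": Fraction(1, 2) - y}
    q = {"a": Fraction(1, 2) + z, "b": Fraction(1, 2) - z}

    def u1(r, s, t):
        base = u[(r, s, t)][0]
        return base + 4 * eps if base != 0 else Fraction(0)

    return sum(p[s] * q[t] * (u1("a", s, t) - u1("b", s, t)) for s in "ab" for t in "ab")

points = [Fraction(k, 10) for k in range(-5, 6)]
eps = Fraction(1, 100)
identity_holds = all(
    payoff_gap(y, z) == 4 * (y - z + y * z)
    and payoff_gap(y, z, eps) == 4 * (eps + y - z + y * z)
    for y in points for z in points
)
print(symmetric(), identity_holds)
```

Agreement at sample points is of course not a proof of the identity, but a mismatch would indicate a transcription error in the matrices.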

For each $i$ let $Q^F_i$ be the set of nonempty polytopes contained in $\Sigma_i^\circ$, and let $\overline{Q}^F_i := \{\Sigma_i\} \cup Q^F_i$. Let $Q^F := Q^F_1 \times \cdots \times Q^F_n$ and $\overline{Q}^F := \{(\Sigma_1, \ldots, \Sigma_n)\} \cup Q^F$. We endow each $\overline{Q}^F_i$ with the topology induced by the Hausdorff metric, we endow $\overline{Q}^F_1 \times \cdots \times \overline{Q}^F_n$ with the product topology, and we endow $Q^F$ and $\overline{Q}^F$ with the relative topologies induced by their inclusions in this space. For $P \in \overline{Q}^F$ and $\sigma \in \Sigma$ let $BR^P(\sigma) := BR^P_1(\sigma) \times \cdots \times BR^P_n(\sigma)$ where $BR^P_i(\sigma) := \operatorname{argmax}_{\tau_i \in P_i} u_i(\tau_i, \sigma_{-i})$. A set of Nash equilibria is fully stable (Kohlberg and Mertens 1986) if it is minimal in the class of closed sets of Nash equilibria $C$ such that for every neighborhood $U \subset \Sigma$ of $C$ there is a neighborhood $V \subset \overline{Q}^F$ of $(\Sigma_1, \ldots, \Sigma_n)$ such that for every $P \in V$ there is a fixed point of $BR^P$ in $U$.

For each $i$ let $Q^T_i$ be the set of $P_i \in Q^F_i$ such that $P_i = (1 - \varepsilon_i)\Sigma_i + \varepsilon_i \sigma_i$ for some $\varepsilon_i \in (0, 1)$ and $\sigma_i$ in the interior of $\Sigma_i$, and let $Q^T := Q^T_1 \times \cdots \times Q^T_n$ and $\overline{Q}^T := \{(\Sigma_1, \ldots, \Sigma_n)\} \cup Q^T$.

7.3 A mixed strategy profile $\sigma^* \in \Sigma$ is a perfect equilibrium (Selten 1975) if there are sequences $\{P^r\}$ in $Q^T$ and $\{\sigma^r\}$ in $\Sigma^\circ$ such that $P^r \to (\Sigma_1, \ldots, \Sigma_n)$, each $\sigma^r$ is a fixed point of $BR^{P^r}$, and $\sigma^r \to \sigma^*$.

(a) Prove that a perfect equilibrium is a Nash equilibrium.

(b) Prove that the set of perfect equilibria is nonempty.

(c) Prove that the set of perfect equilibria is closed.

7.4 Find a two player game such that there is no Nash equilibrium $\sigma^*$ such that for any neighborhood $U \subset \Sigma$ of $\sigma^*$ there is a neighborhood $V \subset \overline{Q}^T$ of $(\Sigma_1, \Sigma_2)$ such that for every $P \in V$ there is a fixed point of $BR^P$ in $U$. Make sure your example is minimal with respect to the numbers of pure strategies of the two agents, and show that any “smaller” game has no such equilibrium.

For each $i$ let $Q^P_i$ be the set of $P_i \in Q^F_i$ such that for some $\sigma_i \in \Sigma_i^\circ$, $P_i$ is the convex hull of all points obtained by permuting the coordinates of $\sigma_i$. (Recall that if the coordinates of $\sigma_i$ are all different, then $P_i$ is a permutahedron.) Let $Q^P := Q^P_1 \times \cdots \times Q^P_n$ and $\overline{Q}^P := \{(\Sigma_1, \ldots, \Sigma_n)\} \cup Q^P$.

7.5 A mixed strategy profile $\sigma^* \in \Sigma$ is a proper equilibrium (Myerson 1978) if there are sequences $\{\varepsilon_r\}$ in $(0, 1)$ and $\{\sigma^r\}$ in $\Sigma^\circ$ such that $\varepsilon_r \to 0$, $\sigma^r_i(s_i) \le \varepsilon_r \sigma^r_i(t_i)$ for all $r$, $i$, and $s_i, t_i \in S_i$ such that $u_i(s_i, \sigma^r_{-i}) < u_i(t_i, \sigma^r_{-i})$, and $\sigma^r \to \sigma^*$.

(a) Prove that a proper equilibrium is a perfect equilibrium.

(b) Prove that the set of proper equilibria is nonempty.

(c) Prove that the set of proper equilibria is closed.

A set of Nash equilibria is stable (Kohlberg and Mertens 1986) if it is minimal in the class of closed sets of Nash equilibria $C$ such that for every neighborhood $U \subset \Sigma$ of $C$ there is a neighborhood $V \subset \overline{Q}^T$ of $(\Sigma_1, \ldots, \Sigma_n)$ such that for every $P \in V$ there is a fixed point of $BR^P$ in $U$.

7.6 A fact that is beyond our scope is that the set of Nash equilibria has finitely many connected components. Taking this as known, prove that one of these components contains a fully stable set that in turn contains a stable set.

7.7 A pure strategy $s_i \in S_i$ is weakly dominated if there is a $t_i \in S_i$ such that $u_i(s_i, s_{-i}) \le u_i(t_i, s_{-i})$ for all $s_{-i} \in \prod_{j \ne i} S_j$, with strict inequality for some $s_{-i}$.

(a) Prove that if $\sigma^*$ is a perfect equilibrium, then each $\sigma^*_i$ assigns no probability to any weakly dominated pure strategy.

(b) Prove that any stable set is contained in the set of perfect equilibria.

(c) Prove that a fully stable set contains a proper equilibrium.

(d) Find the fully stable sets and the stable sets of the game below (Fig. 7.2).

1\2     L        R
U     (2,1)    (2,1)
D     (0,1)    (1,0)

Fig. 7.2 A two-by-two game


7.8 Let $X$ be a topological space, and let $V$ be a topological vector space. Recall that for an arbitrary index set $I$ a partition of unity for $X$ is a collection of functions $\{\psi_i\}_{i \in I}$ from $X$ to $[0, 1]$ such that each $x$ has a neighborhood on which only finitely many of the functions are nonzero and $\sum_i \psi_i(x) = 1$. Let $PU^I(X)$ be the space of such partitions of unity. Let $PU^I_S(X)$ and $PU^I_W(X)$ be $PU^I(X)$ endowed with the relative topologies it inherits as subspaces of $C_S(X)^I$ and $C_W(X)^I$. Prove that the function $(\{\psi_i\}_{i \in I}, \{F_i\}_{i \in I}) \mapsto \sum_i \psi_i F_i$ is continuous as a function from $PU^I_S(X) \times U_S(X, V)^I$ to $U_S(X, V)$ and also as a function from $PU^I_W(X) \times U_W(X, V)^I$ to $U_W(X, V)$.

Chapter 8 Retracts

The theory of retracts was initiated by Karol Borsuk in his Ph.D. thesis, and soon became one of the central concepts of topology, in no small part due to its relevance to the theory of fixed points. The book The Theory of Retracts (Borsuk 1967) continues to be a key reference for the topic, even though the literature has continued to expand since its publication.

This chapter begins with an example due to Kinoshita (1953) of a compact contractible subset of a Euclidean space that does not have the fixed point property. The example is elegant, but also rather complex, and nothing later depends on it, so it can be postponed until the reader is in the mood for a mathematical treat. The point is that fixed point theory depends on some additional condition over and above compactness and contractibility.

After that we develop the required material from the theory of retracts. We first describe retracts in general, and then briefly discuss Euclidean neighborhood retracts, which are retracts of open subsets of Euclidean spaces. This concept is quite general, encompassing simplicial complexes and (as we will see later) smooth submanifolds of Euclidean spaces.

The central concept of the chapter is the notion of an absolute neighborhood retract (ANR), which is a metrizable space whose image, under any embedding as a closed subset of a metric space, is a retract of some neighborhood of itself. The two key characterization results are that an open subset of a convex subset of a locally convex linear space is an absolute neighborhood retract, and that an ANR can be embedded in a normed linear space as a retract of an open subset of a convex set. We establish several additional properties of these spaces, eventually proving that any locally finite simplicial complex is an ANR. Section 17.11 provides two additional sufficient conditions for a metric space to be an ANR, one of which is also necessary.

An absolute retract (AR) is a space that is a retract of any metric space it is embedded in as a closed subset. It turns out that the ARs are precisely the contractible ANRs.



The extension of fixed point theory to infinite dimensional settings ultimately depends on “approximating” the setting with finite dimensional objects. Section 8.6 provides one of the key results in this direction.

8.1 Kinoshita’s Example

A topological space $X$ is contractible if its identity function is homotopic to a constant function, so there is a continuous function $c : X \times [0, 1] \to X$, called a contraction, such that $c_0 = \mathrm{Id}_X$ and $c_1$ is constant. (As usual with homotopies, $c_t := c(\cdot, t)$ denotes the function “at time $t$.”) A subset $S$ of a TVS is starshaped at $x^*$ if $S$ contains the line segment between each of its points and $x^*$, in which case $c(x, t) := (1 - t)x + t x^*$ is a contraction of $S$. In particular, a set is convex if it is starshaped at each of its points, so convex sets are contractible.

The circle is not contractible, and the proof of this illustrates some ideas that we will see in greater generality later. For this purpose, and also in our discussion of the example below, polar coordinates facilitate the description: $(r, \theta) \in \mathbb{R}_+ \times \mathbb{R}$ is identified with $(r\cos\theta, r\sin\theta) \in \mathbb{R}^2$. The unit circle is $C = \{\, (1, \theta) : \theta \in \mathbb{R} \,\}$. If $f : C \to C$ is continuous, there is a unique continuous $\tilde f : [0, 1] \to \mathbb{R}$ such that $\tilde f(0) \in [0, 1)$ and $f(1, 2\pi t) = (1, 2\pi \tilde f(t))$ for all $t$. (Make sure you could provide a detailed formal proof of this assertion.) Since $f(1, 0) = f(1, 2\pi)$, $\tilde f(1) - \tilde f(0)$ is an integer called the winding number of $f$. If $h : C \times [0, 1] \to C$ is a homotopy, the winding number of $h_t$ is a (locally constant, hence) constant function of $t$ (again, make sure you could prove this), which is to say that the winding number is a homotopy invariant. Since the winding number of $\mathrm{Id}_C$ is one and the winding number of a constant function is zero, they cannot be homotopic, so $C$ is not contractible.
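The lifting argument suggests a simple numerical procedure for the winding number. The sketch below is an illustration under stated assumptions: the map is described by a hypothetical function `f_angle` returning any representative (mod $2\pi$) of the angle of $f(1, \theta)$, and the sampling is assumed fine enough that the angle changes by less than $\pi$ between consecutive sample points. Summing the wrapped increments reconstructs the total change of the lift.

```python
import math

def winding_number(f_angle, n=1000):
    """Total change of a continuous lift of theta -> f_angle(theta) over one turn,
    divided by 2*pi. Each increment is wrapped into [-pi, pi) before summing."""
    total = 0.0
    prev = f_angle(0.0)
    for k in range(1, n + 1):
        cur = f_angle(2 * math.pi * k / n)
        step = (cur - prev + math.pi) % (2 * math.pi) - math.pi
        total += step
        prev = cur
    return round(total / (2 * math.pi))

identity = lambda th: th
constant = lambda th: 0.3
reverse = lambda th: -th
# The doubling map, with the angle reported in (-pi, pi] to exercise the wrapping.
double = lambda th: math.atan2(math.sin(2 * th), math.cos(2 * th))

print(winding_number(identity), winding_number(constant),
      winding_number(reverse), winding_number(double))
```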

Borsuk (1935) presented an example of a compact subset of $\mathbb{R}^3$ that is acyclic (has the homology of a point) but does not have the fixed point property. His example is not contractible, so there arose the question of whether a compact contractible space could fail to have the fixed point property. (Whether a space can fail to have the fixed point property if it is compact, contractible, and locally connected (Sect. 8.4) seems to be a problem that is still open.) Kinoshita (1953) presented the example described below, which came to be known as the “tin can with a roll of toilet paper.” As you will see, this description is apt, but does not do justice to the example's beauty and ingenuity.

We continue to work with polar coordinates. The circle $C$ bounds the open disk $D = \{\, (r, \theta) : r < 1 \,\}$. The “tin can” is

$$(C \times [0, 1]) \cup (D \times \{0\}) \subset \mathbb{R}^3.$$

Let $\rho : \mathbb{R}_+ \to [0, 1)$ be a homeomorphism. Of course $\rho(0) = 0$, $\rho$ is strictly increasing, and $\rho(\tau) \to 1$ as $\tau \to \infty$. Let $s : \mathbb{R}_+ \to [0, 1) \times \mathbb{R}$ be the function $s(\tau) = (\rho(\tau), \tau)$. We interpret $s$ as taking values in the space of polar coordinates, so the image $S = \{\, s(\tau) : \tau \ge 0 \,\}$ of $s$ is a curve that spirals out from the origin, approaching the unit circle asymptotically. Perhaps $S \times [0, 1] \subset \mathbb{R}^3$ doesn't resemble a roll of toilet paper in all respects, but you can see where the name came from. Let

$$X = (C \times [0, 1]) \cup (D \times \{0\}) \cup (S \times [0, 1]).$$

Evidently $X$ is closed, hence compact, and there is an obvious contraction of $X$ that first pushes the cylinder of the tin can and the toilet paper down onto the closed unit disk and then contracts the disk to the origin.

We are now going to define functions

$$f_1 : C \times [0, 1] \to X, \quad f_2 : D \times \{0\} \to X, \quad f_3 : S \times [0, 1] \to X$$

which combine to form a continuous function $f : X \to X$ with no fixed points. Let $\kappa : [0, 1] \to [0, 1]$ be continuous with $\kappa(0) = 0$, $\kappa(z) > z$ for all $0 < z < 1$, and $\kappa(1) = 1$. Fix a number $\varepsilon \in (0, 2\pi)$. The function $f_1 : C \times [0, 1] \to C \times [0, 1]$ is given by the formula

$$f_1(1, \theta, z) = (1, \theta - (1 - 2z)\varepsilon, \kappa(z)).$$

This function raises the unit circle at height $z$ to height $\kappa(z)$ while rotating it by $(1 - 2z)\varepsilon$ radians. Since $\kappa(z) > z$ if $0 < z < 1$, a fixed point must have either $z = 0$ or $z = 1$, but $\varepsilon$ is not a multiple of $2\pi$, so $f_1$ has no fixed points.

We decompose $D = \{\, (\rho(\tau), \theta) : \tau \ge 0, \theta \in \mathbb{R} \,\}$ as the union of

$$A = \{\, (\rho(\tau), \theta) \in D : \tau \ge \varepsilon \,\} \quad\text{and}\quad B = \{\, (\rho(\tau), \theta) \in D : \tau \le \varepsilon \,\}.$$

Let $f_2' : A \to D$ be the function

$$f_2'(\rho(\tau), \theta, 0) = (\rho(\tau - \varepsilon), \theta - \varepsilon, 0).$$

Note that $f_2'(s(\tau), 0) = (s(\tau - \varepsilon), 0)$, so $\{\, s(\tau) : \tau \ge \varepsilon \,\} \subset S$ is mapped onto all of $S$. Let $f_2'' : B \to S \times [0, 1]$ be the function

$$f_2''(\rho(\tau), \theta, 0) = (0, 0, 1 - \tau/\varepsilon).$$

If $(\rho(\varepsilon), \theta, 0) \in A \cap B$, then $f_2'(\rho(\varepsilon), \theta, 0) = f_2''(\rho(\varepsilon), \theta, 0) = (0, 0, 0)$, so $f_2'$ and $f_2''$ combine to form a continuous function $f_2 : D \to X$. The function $f_2'$ stretches $A$ to cover all of $D$, and $f_2''$ maps $A \cap B$ onto the origin while mapping all other points in $B$ to points with positive third coordinates, so $f_2$ has no fixed points.

The formulas defining $f_3$ are less transparent, so we first describe what is going on geometrically. Since $f_3$ will agree with $f_2$ on $S \times \{0\}$ we can already see that if $\tau \ge \varepsilon$, then $f_3(s(\tau), 0) = (s(\tau - \varepsilon), 0)$, and if $\tau \le \varepsilon$, then $f_3(s(\tau), 0) = (0, 0, 1 - \tau/\varepsilon)$. The function will continue in this fashion, mapping $\{\, (0, 0, z) : 0 \le z \le 1 \,\}$ to $\{\, (s(\tau), 1) : \tau \le \varepsilon \,\}$, mapping $\{\, (s(\tau), 1) : \tau \le \varepsilon \,\}$ to $\{\, (s(\tau), 1) : \varepsilon \le \tau \le 2\varepsilon \,\}$, and mapping $\{\, (s(\tau), 1) : \varepsilon \le \tau \,\}$ to $\{\, (s(\tau), 1) : 2\varepsilon \le \tau \,\}$. Fixed points in the interior of $S \times [0, 1]$ will be avoided by increasing the final (“vertical”) component everywhere.

We decompose $S \times [0, 1]$ as the union of

$$E = \{\, (s(\tau), z) : \tau \ge \varepsilon \,\} \quad\text{and}\quad F = \{\, (s(\tau), z) : \tau \le \varepsilon \,\}.$$

Let $f_3' : E \to S \times [0, 1]$ be the function

$$f_3'(s(\tau), z) = (s(\tau - (1 - 2z)\varepsilon), \kappa(z)).$$

Let $f_3'' : F \to S \times [0, 1]$ be the function

$$f_3''(s(\tau), z) = (s((\tau + \varepsilon)z), 1 - (1 - \kappa(z))\tau/\varepsilon).$$

Note that

$$f_3'(s(\varepsilon), z) = (s(2\varepsilon z), \kappa(z)) = f_3''(s(\varepsilon), z),$$

so $f_3'$ and $f_3''$ combine to form a continuous function $f_3 : S \times [0, 1] \to S \times [0, 1]$. It is easy to check that $f_3$ has the formulas we saw above for points in $S \times \{0\}$, so $f_2$ and $f_3$ agree on this set, and consequently they combine to form a continuous function from $D \cup (S \times [0, 1])$ to itself.

The function $f : X \to X$ combines $f_1$, $f_2$, and $f_3$. To complete the verification that $f$ is continuous we consider sequences in $D \cup (S \times [0, 1])$ converging to points in $C \times [0, 1]$. First suppose that $\{(\rho(t_i), \theta_i, 0)\}$ is a sequence in $D \times \{0\}$ converging to $(1, \theta, 0)$. (Convergence of the second component is mod $2\pi$.) We have $t_i \to \infty$ because $\rho(t_i) \to 1$, so $t_i - \varepsilon \to \infty$ and $\rho(t_i - \varepsilon) \to 1$. Therefore

$$f_2(\rho(t_i), \theta_i, 0) = (\rho(t_i - \varepsilon), \theta_i - \varepsilon, 0) \to (1, \theta - \varepsilon, 0) = f_1(1, \theta, 0).$$

If $\{(s(\tau_i), z_i)\}$ is a sequence in $S \times [0, 1]$ converging to $(1, \theta, z)$, then

$$f_3(s(\tau_i), z_i) = (\rho(\tau_i - (1 - 2z_i)\varepsilon), \tau_i - (1 - 2z_i)\varepsilon, \kappa(z_i))$$

and $f_1(1, \theta, z) = (1, \theta - (1 - 2z)\varepsilon, \kappa(z))$. We have $\rho(\tau_i - (1 - 2z_i)\varepsilon) \to 1$ because $\tau_i \to \infty$. Also, $s(\tau_i) = (\rho(\tau_i), \tau_i) \to (1, \theta)$, so $\tau_i \to \theta$ mod $2\pi$, and thus $\tau_i - (1 - 2z_i)\varepsilon \to \theta - (1 - 2z)\varepsilon$ mod $2\pi$. Since $\kappa$ is continuous, $\kappa(z_i) \to \kappa(z)$.

Finally we formally verify that $f_3$ has the qualitative features mentioned above. In addition to the equations we saw before, if $0 \le z \le 1$, then $f_3(0, 0, z) = (s(\varepsilon z), 1)$, if $\tau \le \varepsilon$, then $f_3(s(\tau), 1) = (s(\tau + \varepsilon), 1)$, and if $\tau \ge \varepsilon$, then $f_3(s(\tau), 1) = (s(\tau + \varepsilon), 1)$. If $0 < z < 1$, then $\kappa(z) > z$, and if, in addition, $0 < \tau < \varepsilon$, then $1 - (1 - \kappa(z))\tau/\varepsilon > 1 - (1 - \kappa(z)) = \kappa(z) > z$.
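The two pieces of $f_3$ can be checked numerically in the coordinates $(\tau, z)$, which identify the point $(s(\tau), z)$ of $S \times [0, 1]$ (this works because $\rho$ is strictly increasing, so $s$ is injective). The sketch below makes two assumptions that are not in the text: $\varepsilon = 1$ and $\kappa(z) = \sqrt{z}$, which satisfies the three conditions imposed on $\kappa$. It checks that the pieces agree where $\tau = \varepsilon$ and that no grid point is fixed; it says nothing about $f_1$ or $f_2$.

```python
import math

EPS = 1.0   # assumed value of epsilon; the text allows any number in (0, 2*pi)

def kappa(z):
    """Assumed choice of kappa: kappa(0) = 0, kappa(1) = 1, kappa(z) > z on (0, 1)."""
    return math.sqrt(z)

def f3_prime(tau, z):
    """f3' on E (tau >= EPS), written in (tau, z) coordinates."""
    return (tau - (1 - 2 * z) * EPS, kappa(z))

def f3_double_prime(tau, z):
    """f3'' on F (tau <= EPS), written in (tau, z) coordinates."""
    return ((tau + EPS) * z, 1 - (1 - kappa(z)) * tau / EPS)

def f3(tau, z):
    return f3_prime(tau, z) if tau >= EPS else f3_double_prime(tau, z)

def close(p, q, tol=1e-12):
    return abs(p[0] - q[0]) <= tol and abs(p[1] - q[1]) <= tol

grid_z = [k / 20 for k in range(21)]
grid_tau = [k / 20 for k in range(101)]

# The two formulas agree on the common boundary tau = EPS.
glue_ok = all(close(f3_prime(EPS, z), f3_double_prime(EPS, z)) for z in grid_z)
# Every sampled point is moved, consistent with f3 having no fixed points.
moved = all(not close(f3(tau, z), (tau, z), tol=1e-9) for tau in grid_tau for z in grid_z)
print(glue_ok, moved)
```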


8.2 Retracts

This section prepares for later material by presenting general facts about retractions and retracts. Let $X$ be a metric space, and let $A$ be a subset of $X$ such that there is a continuous function $r : X \to A$ with $r(a) = a$ for all $a \in A$. We say that $A$ is a retract of $X$ and that $r$ is a retraction. Many desirable properties that $X$ might have are inherited by $A$.

Lemma 8.1 If $X$ has the fixed point property, then $A$ has the fixed point property.

Proof If $f : A \to A$ is continuous, then $f \circ r$ necessarily has a fixed point, say $a^*$, which must be in $A$, so that $a^* = f(r(a^*)) = f(a^*)$ is also a fixed point of $f$. □
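The argument can be illustrated in one dimension. In the sketch below, which is only an example under stated assumptions, $X = [-1, 2]$, $A = [0, 1]$, the retraction `r` clamps a point into $A$, and `f` is a made-up continuous self-map of $A$. Bisection on $g(x) - x$ stands in for the fixed point property of the interval $X$; the fixed point it finds for $g = f \circ r$ lies in $A$ and is a fixed point of $f$.

```python
def r(x):
    """Retraction of X = [-1, 2] onto A = [0, 1]: clamp x into [0, 1]."""
    return min(1.0, max(0.0, x))

def f(a):
    """A hypothetical continuous self-map of A = [0, 1]."""
    return 1.0 - a * a

def fixed_point(g, lo, hi, tol=1e-12):
    """Bisection on g(x) - x, assuming g is continuous and maps [lo, hi] into itself,
    so that g(lo) - lo >= 0 and g(hi) - hi <= 0."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) - mid >= 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x_star = fixed_point(lambda x: f(r(x)), -1.0, 2.0)
print(x_star, r(x_star) == x_star, abs(f(x_star) - x_star) < 1e-9)
```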

Lemma 8.2 If $X$ is contractible, then $A$ is contractible.

Proof If $c : X \times [0, 1] \to X$ is a contraction of $X$, then $(a, t) \mapsto r(c(a, t))$ is a contraction of $A$. □

Lemma 8.3 If $X$ is connected, then $A$ is connected.

Proof We show that if $A$ is not connected, then $X$ is not connected. If $U_1$ and $U_2$ are nonempty open subsets of $A$ with $U_1 \cap U_2 = \emptyset$ and $U_1 \cup U_2 = A$, then $r^{-1}(U_1)$ and $r^{-1}(U_2)$ are nonempty open subsets of $X$ with $r^{-1}(U_1) \cap r^{-1}(U_2) = \emptyset$ and $r^{-1}(U_1) \cup r^{-1}(U_2) = X$. □

Here are two basic observations that are too obvious to prove.

Lemma 8.4 If $s : A \to B$ is a second retraction, then $s \circ r : X \to B$ is a retraction, so $B$ is a retract of $X$.

Lemma 8.5 If $A \subset Y \subset X$, then the restriction of $r$ to $Y$ is a retraction, so $A$ is a retract of $Y$.

We say that $A$ is a neighborhood retract in $X$ if $A$ is a retract of an open $U \subset X$. We note two other simple facts, the first of which is an obvious consequence of the last result:

Lemma 8.6 Suppose that $A$ is not connected: there are disjoint open sets $U_1, U_2 \subset X$ such that $A \subset U_1 \cup U_2$ with $A_1 := A \cap U_1$ and $A_2 := A \cap U_2$ both nonempty. Then $A$ is a neighborhood retract in $X$ if and only if both $A_1$ and $A_2$ are neighborhood retracts in $X$.

Lemma 8.7 If $A$ is a neighborhood retract in $X$ and $B$ is a neighborhood retract in $A$, then $B$ is a neighborhood retract in $X$.

Proof Let $r : U \to A$ and $s : V \to B$ be retractions, where $U$ is a neighborhood of $A$ and $V \subset A$ is a neighborhood of $B$ in the relative topology of $A$. The definition of the relative topology implies that there is a neighborhood $W \subset X$ of $B$ such that $V = A \cap W$. Then $U \cap W$ is a neighborhood of $B$ in $X$, and the composition of $s$ with the restriction of $r$ to $U \cap W$ is a retraction onto $B$. □


A set $A \subset X$ is locally closed if it is the intersection of an open set and a closed set. Equivalently, it is an open subset of a closed set, or a closed subset of an open set.

Lemma 8.8 A neighborhood retract is locally closed.

Proof If $U \subset X$ is open and $r : U \to A$ is a retraction, $A$ is a closed subset of $U$ because it is the set of fixed points of $r$. □

This terminology ‘locally closed’ is further explained by:

Lemma 8.9 If $X$ is a topological space and $A \subset X$, then $A$ is locally closed if and only if each point $x \in A$ has a neighborhood $U$ such that $U \cap A$ is closed in $U$.

Proof If $A = U \cap C$ where $U$ is open and $C$ is closed, then $U$ is a neighborhood of each $x \in A$, and $A$ is closed in $U$. On the other hand suppose that each $x \in A$ has a neighborhood $U_x$ such that $U_x \cap A$ is closed in $U_x$, which is to say that $U_x \cap A = U_x \cap \overline{A}$. Then $A = \bigcup_x (U_x \cap A) = \bigcup_x (U_x \cap \overline{A}) = \bigl(\bigcup_x U_x\bigr) \cap \overline{A}$. □

Corollary 8.1 If $X$ is a locally compact Hausdorff space, a set $A \subset X$ is locally closed if and only if $A$ is locally compact.

Proof First suppose that $A = U \cap C$ is the intersection of an open and a closed set. If $x \in A$ and $K$ is a compact neighborhood (in $X$) of $x$ contained in $U$, then $K \cap C$ is a neighborhood (in $A$) of $x$ that is compact because it is a closed subset of a compact set.

Now suppose that $A$ is locally compact. Consider an $x \in A$, and let $K$ be a compact neighborhood (in $A$). Let $U := K \cup (X \setminus A)$. Then $U$ is a neighborhood (in $X$) of $x$, and $U \cap A = K$ is closed in $U$ because it is compact and $X$ is Hausdorff. Since $x$ was arbitrary, we have shown that the condition in the last result holds. □

8.3 Euclidean Neighborhood Retracts

A Euclidean neighborhood retract (ENR) is a topological space that is homeo-morphic to a neighborhood retract of a Euclidean space. If a subset of a Euclideanspace is homeomorphic to an ENR, then it is a neighborhood retract:

Proposition 8.1 Suppose that U ⊂ R^m is open, r : U → A is a retraction, B ⊂ R^n, and h : A → B is a homeomorphism. Then B is a neighborhood retract.

Proof Since A is locally closed and R^m is locally compact, each point in A has a closed neighborhood that contains a compact neighborhood. Having a compact neighborhood is an intrinsic property, so every point in B has such a neighborhood, and Corollary 8.1 implies that B is locally closed. Let V ⊂ R^n be an open set that has B as a closed subset. The Tietze extension theorem gives an extension of h^{-1} to a map j : V → R^m. After replacing V with j^{-1}(U), V is still an open set that contains B, and h ◦ r ◦ j : V → B is a retraction. □


Note that every locally closed set A = U ∩ C ⊂ R^m is homeomorphic to a closed subset of R^{m+1}, by virtue of the embedding x ↦ (x, d(x, R^m \ U)^{-1}), where d(x, R^m \ U) is the distance from x to the nearest point not in U. Thus a sufficient condition for X to be an ENR is that it is homeomorphic to a neighborhood retract of a Euclidean space, but a necessary condition is that it is homeomorphic to a closed neighborhood retract of a Euclidean space.
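The embedding can be tried numerically in the simplest case. The following is a minimal sketch, assuming m = 1, U = (0, 1), and C = R, so that A = U; the function names are illustrative and not from the text. As x approaches an endpoint of U the second coordinate grows without bound, which is why the image is closed in R^2 even though A is not closed in R.

```python
def dist_to_complement(x, a=0.0, b=1.0):
    """Distance from x in the open interval U = (a, b) to the complement of U in R."""
    if not (a < x < b):
        raise ValueError("x must lie in the open interval (a, b)")
    return min(x - a, b - x)


def embed(x, a=0.0, b=1.0):
    """The embedding x -> (x, 1 / d(x, R minus U)) of U = (a, b) into R^2."""
    return (x, 1.0 / dist_to_complement(x, a, b))


if __name__ == "__main__":
    for x in (0.5, 0.25, 0.01, 0.001):
        print(x, embed(x))
```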

In order to expand the scope of fixed point theory, it is desirable to show that many types of spaces are ENR’s. Eventually we will see that a smooth submanifold of a Euclidean space is an ENR. At this point we can show that simplicial complexes have this property.

Lemma 8.10 If K′ = (V′, C′) is a subcomplex of a finite simplicial complex K = (V, C), then |K′| is a neighborhood retract in |K|.

Proof To begin with suppose that there are simplices of positive dimension in K that are not in K′. Let σ be such a simplex of maximal dimension, and let β be the barycenter of |σ|. Then |K| \ {β} is a neighborhood of |K| \ int |σ|, and there is a retraction r of the former set onto the latter that is the identity on the latter, of course, and which maps (1 − t)x + tβ to x whenever x ∈ |∂σ| and 0 < t < 1.

Iterating this construction and applying Lemma 8.7 above, we find that there is a neighborhood retract of |K| consisting of |K′| and finitely many isolated points. Now Lemma 8.6 implies that |K′| is a neighborhood retract in |K|. □
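The radial retraction used in the first step of this proof can be written explicitly in barycentric coordinates. The sketch below is an illustration under stated assumptions: a single simplex with n vertices, a point p given by nonnegative coordinates summing to 1, and barycenter β = (1/n, ..., 1/n). The function name and the scaling formula s = 1/(1 − n·min p_i) are worked out here and are not taken from the text.

```python
def retract_to_boundary(p):
    """Send p != barycenter to the boundary point x with p = (1 - t) x + t * beta.

    p is a list of barycentric coordinates (nonnegative, summing to 1).
    The ray from the barycenter through p leaves the simplex where the
    smallest coordinate reaches zero.
    """
    n = len(p)
    beta = 1.0 / n
    m = min(p)
    if abs(m - beta) < 1e-12:
        # min(p) == 1/n forces every coordinate to equal 1/n.
        raise ValueError("the barycenter has no image under the retraction")
    s = 1.0 / (1.0 - n * m)  # scaling along the ray from the barycenter
    return [beta + s * (pi - beta) for pi in p]


if __name__ == "__main__":
    print(retract_to_boundary([0.5, 0.3, 0.2]))   # an interior point
    print(retract_to_boundary([0.5, 0.5, 0.0]))   # a boundary point is fixed
```

Points already on the boundary have smallest coordinate 0, so s = 1 and they are left fixed, as a retraction requires.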

Proposition 8.2 If K = (V, C) is a finite simplicial complex, then |K| is an ENR.

Proof Let Δ be the convex hull of the set of unit basis vectors in R^{|V|}. After repeated barycentric subdivision of Δ there is a (|V| − 1)-dimensional simplex σ in the interior of Δ. (This is a consequence of Proposition 2.10.) Identifying the vertices of σ with the elements of V leads to an embedding of |K| as a subcomplex of this subdivision, after which we can apply the result above. □

Giving an example of a closed subset of a Euclidean space that is not an ENR is a bit more difficult. Eventually we will see that a contractible ENR has the fixed point property, from which it follows that Kinoshita’s example is not an ENR. A simpler example is the Hawaiian earring H, which is the union over all n = 1, 2, . . . of the circle in R^2 of radius 1/n centered at (1/n, 0). Suppose there was a retraction r : U → H of a neighborhood U ⊂ R^2 of H. Since U is a neighborhood of the origin, for some n the entire disk D of radius 1/n centered at (1/n, 0) would be contained in U. Let C be the boundary of D. The function s : H → C that takes every point in C to itself and every point outside of C to the origin is evidently continuous, and thus a retraction. Therefore s ◦ r|_D is a retraction of D onto C, and since D is convex, hence contractible, it follows (Lemma 8.2) that C is contractible, but in Sect. 8.1 we saw that this is not the case. Thus no such r exists.


8.4 Absolute Neighborhood Retracts

A metric space A is an absolute neighborhood retract (ANR) if h(A) is a neighborhood retract whenever X is a metric space, h : A → X is an embedding, and h(A) is closed. This definition is evidently modelled on the description of ENR’s we arrived at in the last section, with ‘metric space’ in place of ‘Euclidean space.’

We saw above that if A ⊂ R^m is a neighborhood retract, then the image of any embedding of A in another Euclidean space is also a neighborhood retract, and for some embedding the image is a closed subset of the Euclidean space. Thus a natural, and at least potentially more restrictive, extension of the concept is obtained by defining an ANR to be a space A such that h(A) is a neighborhood retract whenever h : A → X is an embedding of A in a metric space X, even if h(A) is not closed.

There is a second sense in which the definition is weaker than it might be. A topological space is completely metrizable if its topology can be induced by a complete metric. Since an ENR is homeomorphic to a closed subset of a Euclidean space, an ENR is completely metrizable. A subset of a topological space is a G_δ if it is the intersection of countably many open sets. Problem 6K of Kelley (1955) shows that a topological space A is completely metrizable if and only if, whenever h : A → X is an embedding of A in a metric space X, h(A) is a G_δ. The set of rational numbers is an example of a space that is metrizable, but not completely metrizable, because it is not a G_δ as a subset of R. To see this observe that the set of irrational numbers is ⋂_{r∈Q} (R \ {r}), so if Q was a countable intersection of open sets, then ∅ would be a countable intersection of open sets, contrary to the Baire category theorem (p. 200 of Kelley 1955). The next result shows that the union of { e^{πir} : r ∈ Q } with the open unit disk in C is an ANR, but this space is not completely metrizable, so it is not an ENR. Thus there are finite dimensional ANR’s that are not ENR’s. By choosing the least restrictive definition we strengthen the various results below. However, these complexities are irrelevant to compact ANR’s, which are the most important ANR’s that will figure in our work going forward.

At first blush being an ANR might sound like a remarkable property that can only be possessed by quite special spaces, but this is not the case at all. Although ANR’s cannot exhibit the “infinitely detailed features” of the tin can with a roll of toilet paper, the concept is not very restrictive, at least in comparison with other concepts that might serve as an hypothesis of a fixed point theorem.

Proposition 8.3 A metric space A is an ANR if it (or its homeomorphic image) is a retract of an open subset of a convex subset of a locally convex linear space.

Proof Let r : U → A be a retraction, where U is an open subset of a convex set C. Suppose h : A → X maps A homeomorphically onto a closed subset h(A) of a metric space X. Dugundji’s theorem implies that h^{-1} : h(A) → U has a continuous extension j : X → C. Then V := j^{-1}(U) is a neighborhood of h(A), and h ◦ r ◦ j|_V : V → h(A) is a retraction. □

Corollary 8.2 An ENR is an ANR.


Corollary 8.3 A finite cartesian product of ANR’s is an ANR.

The proposition above gives a sufficient condition for a space to be an ANR. There is a somewhat stronger necessary condition.

Proposition 8.4 If A is an ANR, then there is a convex subset C of a Banach space such that (a homeomorphic image of) A is both a closed subset of C and a retract of a neighborhood U ⊂ C.

Proof Theorem 6.3 gives a map h : A → Z, where Z is a Banach space, such that h maps A homeomorphically onto h(A) and h(A) is closed in the relative topology of its convex hull C. Since A is an ANR, there is a relatively open U ⊂ C and a retraction r : U → h(A). □

Corollary 8.4 A retract of an open subset of an ANR is an ANR.

Proof Suppose that B is an ANR, U ⊂ B is open, and r : U → A is a retraction. Proposition 8.4 allows us to regard B as a retract of an open subset V of a convex subset C of a Banach space. If ρ : V → B is a retraction, then ρ^{-1}(U) is open and r ◦ ρ|_{ρ^{-1}(U)} is a retraction, so Proposition 8.3 implies that A is an ANR. □

Since compact metric spaces are separable, compact ANR’s satisfy a more demanding embedding condition than the one given by Proposition 8.4.

Proposition 8.5 If A is a compact ANR, then there exists an embedding ι : A → I^∞ such that ι(A) is a neighborhood retract in I^∞.

Proof Urysohn’s Theorem 6.4 guarantees the existence of an embedding ι of A in I^∞. Since A is compact, ι(A) is closed in I^∞, and since A is an ANR, ι(A) is a neighborhood retract in I^∞. □

The simplicity of an open subset of a Banach space is the ultimate source of the utility of ANR’s in the theory of fixed points. To exploit this simplicity we need analytic tools that bring it to the surface. The rest of this section develops relatively simple properties of ANR’s.

A topological space A is an absolute neighborhood extensor (ANE) if, whenever Y is a metric space, X is a closed subset of Y, and f : X → A is a map, there is a continuous extension of f to some neighborhood of X. Note that this definition does not require that A is itself a metric space. Modifying the definition of an ANR to allow for this possibility would make no sense, because any unmetrizable space satisfies the defining condition vacuously, but the possibility that an ANE might not be a metric space will actually be quite important for us. For metric spaces, on the other hand, ANR’s and ANE’s are two sides of a single coin.

Proposition 8.6 A metric space A is an ANR if and only if it is an absolute neighborhood extensor.


Proof Consideration of the possibility that X = A shows that if A is an ANE, then it is an ANR. Suppose that A is an ANR. Theorem 6.3 allows us to regard A as a relatively closed subset of a convex subset C of a Banach space. Let r : U → A be a retraction of a neighborhood U ⊂ C of A. If Y is a metric space, X is a closed subset of Y, and f : X → A is continuous, Dugundji’s theorem (Theorem 6.5) gives a continuous extension $\bar f : Y \to C$ of f. Then $V := \bar f^{-1}(U)$ is a neighborhood of X, and $r \circ \bar f|_V : V \to A$ is a continuous extension of f. □

A topological space X is locally contractible if, for each x0 ∈ X, each neighborhood V of x0 contains a neighborhood W such that there is a continuous c : W × [0, 1] → V such that c(x, 0) = x and c(x, 1) = x0 for all x ∈ W. The space X is locally path connected if, for each x0 ∈ X, each neighborhood V of x0 contains a neighborhood W such that for any x, x′ ∈ W there is a continuous path γ : [0, 1] → V with γ(0) = x and γ(1) = x′. (At first sight local contractibility (local path connectedness) seems less natural than requiring that any neighborhood of x contain a contractible (path connected) neighborhood, but in the current setting and many others the weaker conditions given by the definitions are more easily verified, and they usually have whatever implications are desired.)

Proposition 8.7 An ANR A is locally contractible.

Proof We regard A as a relatively closed subset of a convex subset C of a Banach space. Let r : U → A be a retraction of a neighborhood U ⊂ C of A. Let V ⊂ A be a neighborhood of a point x0. Let $\tilde W \subset r^{-1}(V)$ be a convex neighborhood of x0, and let $W := \tilde W \cap A$. Then c : W × [0, 1] → V given by c(x, t) = r((1 − t)x + t x0) has the required properties. □
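The formula c(x, t) = r((1 − t)x + t x0) can be checked on a concrete ANR. The following is a minimal sketch under assumptions that are not from the text: A is the unit circle in C = R^2, U is the punctured plane, and r is radial projection z ↦ z/‖z‖. For x close enough to x0 the segment from x to x0 avoids the origin, so r is defined along it.

```python
import math


def r(z):
    """Radial retraction of the punctured plane onto the unit circle."""
    norm = math.hypot(z[0], z[1])
    if norm == 0.0:
        raise ValueError("the origin is outside the domain of the retraction")
    return (z[0] / norm, z[1] / norm)


def c(x, t, x0):
    """The local contraction c(x, t) = r((1 - t) x + t x0) from the proof."""
    return r(((1.0 - t) * x[0] + t * x0[0], (1.0 - t) * x[1] + t * x0[1]))


if __name__ == "__main__":
    x0 = (1.0, 0.0)
    x = (math.cos(0.3), math.sin(0.3))
    for t in (0.0, 0.5, 1.0):
        print(t, c(x, t, x0))
```

In this example the path t ↦ c(x, t) slides x along the short arc to x0, and it stays inside any arc around x0 that contains x.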

Corollary 8.5 An ANR A is locally path connected.

We also need to consider a somewhat stronger condition. Let (X, d) be a metric space. We say that X is locally equiconnected (some authors say uniformly locally contractible) if there is a neighborhood W ⊂ X × X of the diagonal Δ := { (x, x) : x ∈ X } and a map λ : W × [0, 1] → X such that:

(a) λ(x, x′, 0) = x′ and λ(x, x′, 1) = x for all (x, x′) ∈ W;
(b) λ(x, x, t) = x for all x ∈ X and t ∈ [0, 1].

We say that λ is an equiconnecting function.

Let 𝒰 be an open covering of X. We say that two functions f, g : Y → X are 𝒰-close if, for every y ∈ Y, there is some U ∈ 𝒰 such that f(y), g(y) ∈ U. When Y is a topological space we say that f and g are 𝒰-homotopic if there is a homotopy h : Y × [0, 1] → X such that h0 = f, h1 = g, and for every y ∈ Y there is some U ∈ 𝒰 such that h(y, [0, 1]) ⊂ U. If, in addition, h(y, ·) is constant whenever f(y) = g(y), then h is stationary, and f and g are stationarily 𝒰-homotopic.

Proposition 8.8 X is locally equiconnected if and only if each open cover 𝒰 of X has a refinement 𝒱 such that any two 𝒱-close maps of any topological space Y into X are stationarily 𝒰-homotopic.


Proof First suppose that X is locally equiconnected. Let λ : W × [0, 1] → X be an equiconnecting function, and let an open cover 𝒰 be given. For each x ∈ X choose some U_x ∈ 𝒰 containing x. Since λ is continuous, {x} × {x} × [0, 1] is covered by open subsets of λ^{-1}(U_x) of the form V_{xt} × V_{xt} × (t − ε, t + ε). This open cover has a finite subcover, so there is an open V_x such that λ(V_x × V_x × [0, 1]) ⊂ U_x. Let 𝒱 := { V_x : x ∈ X }. If f, g : Y → X are 𝒱-close, then h(y, t) = λ(f(y), g(y), t) is a stationary 𝒰-homotopy.

Conversely, if the condition holds, let 𝒰 := {X}, let 𝒱 be a refinement satisfying the condition, and let W := ⋃_{V∈𝒱} V × V. The maps f, g : W → X given by f(x, x′) := x and g(x, x′) := x′ are 𝒱-close, and a stationary 𝒰-homotopy with these endpoints is an equiconnecting function. □

Proposition 8.9 An ANR A is locally equiconnected.

Proof We verify the condition given by the last result. Regard A as a relatively closed subset of a convex subset C of a Banach space. Let r : U → A be a retraction of a neighborhood U ⊂ C of A. Let 𝒰 be an open cover of A, and let 𝒱 be a refinement of 𝒰 consisting of sets of the form $V = \tilde V \cap A$ where $\tilde V \subset U$ is a convex neighborhood of a point in A such that $r(\tilde V)$ is contained in some element of 𝒰. If Y is a topological space and f, g : Y → A are 𝒱-close, then h(y, t) := r((1 − t) f(y) + t g(y)) is a stationary 𝒰-homotopy. □

Mathematicians say that a property of a topological space is local if having that property is the same as each point having a neighborhood that has the property. Corollary 8.4 implies that any open subset of an ANR is an ANR. The next set of results develops the opposite implication, that if every point in A has a neighborhood that is an ANR, then A is an ANR. We start with some special cases.

Lemma 8.11 If A = U_1 ∪ U_2 where U_1 and U_2 are ANR’s that are open in A, then A is an ANR.

Proof As above, it suffices to show that A is an absolute neighborhood extensor. Let Y be a metric space, let X be a closed subset of Y and let f : X → A be continuous. Let C_1 := X \ f^{-1}(U_2) and C_2 := X \ f^{-1}(U_1). These are disjoint closed subsets of Y, and we can take disjoint open neighborhoods V_1 and V_2. For i = 1, 2 let X_i := V_i ∩ X. Let D := Y \ (V_1 ∪ V_2), and let T := D ∩ X.

Corollary 8.4 implies that U_1 ∩ U_2 is an ANR. Since T is closed in D, there is an extension h : W → U_1 ∩ U_2 of f|_T to a neighborhood W of T in D. Let g_0 : W ∪ X → A be the function that agrees with h on W and with f on X. Since W = (W ∪ X) ∩ D, W is closed in W ∪ X, and of course X is closed in this set because it is closed in Y. Therefore g_0 is continuous.

For i = 1, 2 observe that V_i \ X_i = V_i \ X is open in Y. Since (W ∪ V_i) \ (W ∪ X_i) = V_i \ X_i, W ∪ X_i is closed in W ∪ V_i. Since U_i is an absolute neighborhood extensor, there is an extension g_i : Z_i → U_i of g_0|_{W∪X_i} to a neighborhood Z_i of W ∪ X_i in W ∪ V_i.

Let Z := Z_1 ∪ Z_2, and let g : Z → A be the function that agrees with g_1 on Z_1 and with g_2 on Z_2. Since Z_1 ∩ Z_2 = W, g is well defined and continuous. All that remains is to show that Z is open in Y. First note that V_1 ∪ W ∪ V_2 = Y \ (D \ W) is open. The only points of Z_1 that are not in its interior in V_1 ∪ W ∪ V_2 are those in the intersection of W with the closure of V_2, but such points are in the interior of Z_2, and vice versa, so Z is in fact open in V_1 ∪ W ∪ V_2. □

Lemma 8.12 If I is any index set and for each i, A_i is an ANR, then the disjoint union A := ⋃_{i∈I} A_i is an ANR.

Proof Let h : A → X be an embedding of A in a metric space (X, d) such that h(A) is closed. For each i let U_i be the set of points in X that are closer to some point in h(A_i) than they are to any point in h(A) \ h(A_i). In the relative topology of h(A), h(A_i) is both open and closed, so h(A_i) and h(A) \ h(A_i) are both relatively closed. Since h(A) is closed, they are closed and are consequently contained in disjoint open sets. Thus U_i is open in X. Possibly after replacing U_i with a smaller neighborhood of h(A_i), there is a retraction r_i : U_i → h(A_i). The U_i are pairwise disjoint by construction, so the r_i combine to give a retraction r : ⋃_i U_i → h(A). □

Theorem 8.1 (Hanner 1951) If A is separable, then A is an ANR if and only if each point in A has a neighborhood that is an ANR.

Dugundji (1952) and Kodama (1956) provide variants of this result for some nonparacompact spaces.

Proof Corollary 8.4 has already established one of the implications. Let 𝒰 be an open cover of A whose elements are ANR’s. Since metric spaces are paracompact, 𝒰 has a locally finite refinement, whose elements are ANR’s by Corollary 8.4, so we may assume that 𝒰 is locally finite. Since A is separable, 𝒰 is countable. (For a countable dense subset D, if we associate each U ∈ 𝒰 with some element of D that it contains, then each element of D is associated with at most finitely many U.) Let 𝒰 := {U_1, U_2, . . .}.

For each n = 1, 2, . . . let V_n := U_1 ∪ · · · ∪ U_n. By repeated applications of Lemma 8.11, V_n is an ANR. Let W_n be the subset of V_n consisting of those points whose distance to A \ V_n is greater than 1/n. Let Z_1 := W_1 and Z_2 := W_2, and for n ≥ 3 let $Z_n := W_n \setminus \overline{W_{n-2}}$. Now

\[ A = \bigcup_n V_n = \bigcup_n W_n = \bigcup_n Z_n = \bigcup_{n=1}^{\infty} Z_{2n-1} \cup \bigcup_{n=1}^{\infty} Z_{2n}, \]

and $\bigcup_{n=1}^{\infty} Z_{2n-1}$ and $\bigcup_{n=1}^{\infty} Z_{2n}$ are ANR’s because they are disjoint unions of ANR’s (Lemma 8.12), so Lemma 8.11 implies that A is an ANR. □

In Sect. 17.11 it will emerge that locally finite simplicial complexes are an important tool in the characterization of ANR’s. One important consequence of the last result is that they are themselves ANR’s.

Corollary 8.6 A locally finite simplicial complex is an ANR.


Proof A connected locally finite simplicial complex is separable, and is easily covered by open subsets of finite simplicial complexes, so the last result implies that it is an ANR. A general locally finite simplicial complex is the disjoint union of its connected components, so we can apply Lemma 8.12. □

Additional necessary conditions and sufficient conditions for a metric space to be an ANR are presented in Sects. 17.9–17.11. In comparison with the material presented in this chapter, the arguments are quite intricate, but they do not require additional preparation, so the interested reader can study them now.

8.5 Absolute Retracts

A metric space A is an absolute retract (AR) if h(A) is a retract of X whenever X is a metric space, h : A → X is an embedding, and h(A) is closed. Of course an AR is an ANR. Below we will see that an ANR is an AR if and only if it is contractible, so compact convex sets are AR’s. Eventually (Theorem 14.3) we will show that nonempty compact AR’s have the fixed point property. In this sense AR’s fulfill our goal of replacing the assumption of a convex domain in Kakutani’s theorem with a topological condition.

The embedding conditions characterizing AR’s parallel those for ANR’s, with some simplifications.

Proposition 8.10 If a metric space A is a retract of a convex subset C of a locally convex linear space, then it is an AR.

Proof Let r : C → A be a retraction. Suppose h : A → X maps A homeomorphically onto a closed subset h(A) of a metric space X. Dugundji’s theorem implies that h^{-1} : h(A) → C has a continuous extension j : X → C. Then q := h ◦ r ◦ j is a retraction of X onto h(A). □

Proposition 8.11 If A is an AR, there is a convex subset C of a Banach space such that (a homeomorphic image of) A is both a closed subset of C and a retract of C.

Proof Theorem 6.3 gives a map h : A → Z, where Z is a Banach space, such that h maps A homeomorphically onto h(A) and h(A) is closed in the relative topology of its convex hull C. Since A is an AR, there is a retraction r : C → h(A). □

A topological space A is an absolute extensor (AE) if, whenever Y is a metric space, X is a closed subset of Y, and f : X → A is a map, there is a continuous extension of f to Y. As with the comparison of ANR’s and ANE’s, this definition does not require that A is a metric space, and when it is the two concepts coincide.

Proposition 8.12 A metric space A is an AR if and only if it is an AE.


Proof Consideration of the possibility that X = A shows that if A is an AE, then it is an AR. Suppose that A is an AR. Theorem 6.3 allows us to regard A as a relatively closed subset of a convex subset C of a Banach space. Let r : C → A be a retraction. If Y is a metric space, X is a closed subset of Y, and f : X → A is continuous, Dugundji’s theorem (Theorem 6.5) gives a continuous extension $\bar f : Y \to C$ of f. Then $r \circ \bar f$ is a continuous extension of f with values in A. □

The remainder of the section proves:

Theorem 8.2 An ANR is an AR if and only if it is contractible.

In preparation for the proof we introduce a rather specialized concept. (The more important variant appears in Section 14.2.) If X is a topological space and A ⊂ X, the pair (X, A) is said to have the homotopy extension property with respect to ANR’s if, whenever Y is an ANR, Z := (X × {0}) ∪ (A × [0, 1]), and g : Z → Y is continuous, g has a continuous extension h : X × [0, 1] → Y.

Proposition 8.13 (Borsuk 1937) If X is a metric space and A is a closed subset of X, then (X, A) has the homotopy extension property with respect to ANR’s.

We separate out one of the larger steps in the argument.

Lemma 8.13 Let X be a metric space, let A be a closed subset of X, and let Z := (X × {0}) ∪ (A × [0, 1]). Then for every neighborhood V ⊂ X × [0, 1] of Z there is a map j : X × [0, 1] → V that agrees with the identity on Z.

Proof For each (a, t) ∈ A × [0, 1] choose a product neighborhood

\[ U_{(a,t)} \times (t - \varepsilon_{(a,t)},\, t + \varepsilon_{(a,t)}) \subset V \]

where U_{(a,t)} ⊂ X is open and ε_{(a,t)} > 0. For any particular a the cover of {a} × [0, 1] has a finite subcover, and the intersection of its first cartesian factors is a neighborhood U_a of a with U_a × [0, 1] ⊂ V. Let U := ⋃_a U_a. Thus there is a neighborhood U of A such that U × [0, 1] ⊂ V.

Urysohn’s lemma gives a function α : X → [0, 1] with α(x) = 0 for all x ∈ X \ U and α(a) = 1 for all a ∈ A, and the function j(x, t) := (x, α(x)t) satisfies the required conditions. □
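The map j(x, t) = (x, α(x)t) is easy to experiment with once a Urysohn function is fixed. The sketch below is an illustration under assumptions not in the text: X = R, A = {0}, U = (−1, 1), and α is the standard distance quotient d(x, X \ U) / (d(x, A) + d(x, X \ U)), which here simplifies to max(0, 1 − |x|).

```python
def alpha(x):
    """Urysohn function: 1 on A = {0}, 0 outside U = (-1, 1), continuous in between."""
    d_to_A = abs(x)
    d_to_complement = max(0.0, 1.0 - abs(x))
    # The denominator is positive because A and the complement of U are disjoint closed sets.
    return d_to_complement / (d_to_A + d_to_complement)


def j(x, t):
    """The map j(x, t) = (x, alpha(x) * t) from the proof of Lemma 8.13."""
    return (x, alpha(x) * t)


if __name__ == "__main__":
    print(j(0.0, 0.7))   # a point of A x [0, 1] is fixed
    print(j(0.5, 0.0))   # a point of X x {0} is fixed
    print(j(2.0, 0.7))   # outside U the point is pushed down to X x {0}
```

Every value of j lies either in U × [0, 1] or in X × {0}, so it lies in any neighborhood V that contains U × [0, 1] and Z.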

Proof of Proposition 8.13 Let Y be an ANR, let Z := (X × {0}) ∪ (A × [0, 1]), and let g : Z → Y be continuous. By Theorem 6.3 we may assume without loss of generality that Y is contained in a Banach space, and is a relatively closed subset of its convex hull C. Dugundji’s theorem implies that there is a continuous extension $\bar h : X \times [0,1] \to C$ of g. Let W ⊂ C be a neighborhood of Y for which there is a retraction r : W → Y, let $V := \bar h^{-1}(W)$, and let j : X × [0, 1] → V be a continuous map that is the identity on Z, as per the result above. Then $h := r \circ \bar h \circ j$ is a continuous extension of g whose image is contained in Y. □

We can now complete the main argument.


Proof of Theorem 8.2 Let A be an ANR. By Theorem 6.3 we may embed A as a relatively closed subset of a convex subset C of a Banach space.

If A is an AR, then it is a retract of C. A convex set is contractible, and a retract of a contractible set is contractible (Lemma 8.2) so A is contractible.

Suppose that A is contractible. By Proposition 8.10 it suffices to show that A is a retract of C. Let c : A × [0, 1] → A be a contraction, and let a1 be its “final value,” by which we mean that c(a, 1) = a1 for all a ∈ A. Set Z := (C × {0}) ∪ (A × [0, 1]), and define g : Z → A by setting g(x, 0) := a1 for x ∈ C and g(a, t) := c(a, 1 − t) for (a, t) ∈ A × [0, 1]. Proposition 8.13 implies the existence of a continuous extension h : C × [0, 1] → A. Now r := h(·, 1) : C → A is the desired retraction. □
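Two small checks are left implicit in this argument, and it may help to spell them out. The display below assumes, as is standard for a contraction, that c(a, 0) = a for all a ∈ A; the first line confirms that the two clauses defining g agree where their domains overlap, and the second that r restricts to the identity on A.

```latex
\begin{align*}
  g(a,0) &= c(a, 1-0) = c(a,1) = a_1
    && \text{for } a \in A,\ \text{matching } g(x,0) := a_1 \text{ on } C \times \{0\}, \\
  r(a) &= h(a,1) = g(a,1) = c(a, 0) = a
    && \text{for } a \in A.
\end{align*}
```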

8.6 Domination

In our development of the fixed point index an important idea will be to pass from a theory for certain simple or elementary spaces to a theory for more general spaces by showing that every space of the latter type can be “approximated” by a simpler space, in the sense of the following definitions. Fix a metric space (X, d).

Definition 8.1 If Y is a topological space and ε > 0, a homotopy η : Y × [0, 1] → X is an ε-homotopy if d(η(y, t), η(y, t′)) < ε for all y ∈ Y and t, t′ ∈ [0, 1]. We say that η0 and η1 are ε-homotopic.

Definition 8.2 A topological space D ε-dominates C ⊂ X if there are continuous functions ϕ : C → D and ψ : D → X such that ψ ◦ ϕ : C → X is ε-homotopic to Id_C. If P is a simplicial complex we say that P ε-dominates C if |P| ε-dominates C.

In preparation for the argument below we introduce a concept of general importance. If 𝒰 is a locally finite open cover of X, the nerve of 𝒰 is the simplicial complex N_𝒰 = (𝒰, Σ_𝒰) where the elements of Σ_𝒰 are ∅ and those finite σ ⊂ 𝒰 such that ⋂_{V∈σ} V ≠ ∅. For any partition of unity {α_V : X → [0, 1]}_{V∈𝒰} subordinate to 𝒰 there is a continuous function K_𝒰 : X → |N_𝒰| given by

\[ K_{\mathcal{U}}(x) := \sum_{V \in \mathcal{U}} \alpha_V(x)\, e_V \]

where the e_V are the standard unit basis vectors of R^𝒰.
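For a finite cover the nerve and the map into it can be computed directly. The following sketch is illustrative only: it assumes X is an interval of the real line covered by finitely many open intervals (a, b), it lists just the nonempty simplices, and it uses the partition of unity obtained by normalizing the distances d(x, X \ V), which is one convenient choice rather than anything prescribed by the text. The function names are hypothetical.

```python
from itertools import combinations


def nerve(intervals):
    """Nonempty simplices of the nerve of a finite cover by open intervals (a, b).

    A set of indices is a simplex when the corresponding intervals have a
    common point, that is, when the largest left end is below the smallest right end.
    """
    simplices = []
    indices = range(len(intervals))
    for k in range(1, len(intervals) + 1):
        for sigma in combinations(indices, k):
            left = max(intervals[i][0] for i in sigma)
            right = min(intervals[i][1] for i in sigma)
            if left < right:
                simplices.append(frozenset(sigma))
    return simplices


def barycentric_map(x, intervals):
    """Coordinates of K(x) = sum_V alpha_V(x) e_V for the distance-based partition of unity."""
    weights = [max(0.0, min(x - a, b - x)) for a, b in intervals]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("x is not covered by the given intervals")
    return [w / total for w in weights]


if __name__ == "__main__":
    cover = [(0.0, 1.5), (1.0, 2.5), (2.0, 3.5)]
    print(nerve(cover))
    print(barycentric_map(1.25, cover))
```

The indices with positive weight at x are exactly the cover elements containing x, so they always form a simplex of the nerve, which is why the map takes values in |N_𝒰|.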

Lemma 8.14 If e is a metric that is topologically equivalent to d, C ⊂ X is compact, and ε > 0, then there is a δ > 0 such that D ε-dominates C relative to d whenever it δ-dominates C relative to e.

Proof An obvious argument by contradiction shows that there is a δ > 0 such that for all x ∈ C, the δ-ball around x relative to e is contained in the ε/2-ball around x relative to d. If η : C × [0, 1] → X is a δ-homotopy relative to e, η1 = Id_C, x ∈ C, and t, t′ ∈ [0, 1], then e(η(x, t), η(x, 1)) = e(η(x, t), x) < δ and similarly for t′, so

\[ d(\eta(x,t), \eta(x,t')) \le d(\eta(x,t), x) + d(\eta(x,t'), x) < \varepsilon/2 + \varepsilon/2 = \varepsilon. \] □

This section’s main result is:

Theorem 8.3 (Domination Theorem) If X is a separable ANR and C ⊂ X is compact, then for any ε > 0 there is a finite simplicial complex that ε-dominates C.

Proof Proposition 8.4 implies that X may be embedded in a Banach space in such a way that there is a retraction r : U → X where U is a relatively open subset of a convex set. In view of the last result we may assume that d is the metric derived from the norm of the Banach space.

For each x ∈ X choose ρ_x > 0 such that

\[ U_{2\rho_x}(x) \subset U \quad\text{and}\quad r(U_{2\rho_x}(x)) \subset U_{\varepsilon/2}(x). \]

Choose x_1, . . . , x_n such that 𝒰 := {U_{ρ_{x_i}}(x_i)}_{i=1}^n is a cover of C, and for each i let ρ_i := ρ_{x_i} and V_i := U_{ρ_i}(x_i). We set ϕ := K_𝒰 : C → |N_𝒰|, and we define ψ : |N_𝒰| → X by

\[ \psi\Bigl(\sum_{j=1}^{n} \beta_j e_j\Bigr) := r\Bigl(\sum_{j=1}^{n} \beta_j x_j\Bigr). \]

The homotopy η : C × [0, 1] → X is

\[ \eta(x,t) := r\Bigl((1-t) \sum_{j} \alpha_{\mathcal{U},V_j}(x)\, x_j + t x\Bigr). \]

To see that these definitions make sense and have the desired properties consider x ∈ C. Let j_1, . . . , j_k be the indices j such that x ∈ V_j. We may assume that ρ_{j_1} is the largest ρ_{j_i}. Then x_{j_i} ∈ U_{2ρ_{j_1}}(x_{j_1}) ⊂ U for all i = 1, . . . , k. We have ∑_{i=1}^{k} β_{j_i} x_{j_i} ∈ U for all β_{j_1}, . . . , β_{j_k} ≥ 0 that sum to 1, and the line segment between x and ∑_j α_{𝒰,V_j}(x) x_j is contained in U and is mapped by r to U_{ε/2}(x_{j_1}). Since x is an arbitrary point in C, it follows that ψ and η are well defined, and η is an ε-homotopy. Of course η0 = ψ ◦ ϕ and η1 = Id_C. □

Sometimes we will need the following variant.

Theorem 8.4 If X is a locally compact ANR and C ⊂ X is compact, then for any ε > 0 there is an open U ⊂ R^m, for some m, such that $\overline{U}$ is compact and U ε-dominates C by virtue of maps ϕ : C → U and ψ : U → X.


Proof Since X is locally compact, C has a compact neighborhood D. Since D is separable its interior is separable, and Proposition 8.3 implies that the interior of D is an ANR. Since we can replace X with the interior of D, we may assume that X is separable.

Let P be a finite simplicial complex that ε-dominates C by virtue of the maps ϕ : C → |P| and ψ′ : |P| → X. Since |P| is an ENR (Proposition 8.2) we may assume that it is contained in some R^m. Let r : U′ → |P| be a retraction of a neighborhood U′ ⊂ R^m. Since P is finite, |P| is compact, hence bounded, and U′ contains a neighborhood U of |P| that is bounded and whose closure is contained in U′. Let ψ := ψ′ ◦ r|_U. Since ψ ◦ ϕ = ψ′ ◦ ϕ, C is ε-dominated by U. □

Exercises

8.1 Let A be a retract of X .

(a) Prove that if X is locally compact, then A is locally compact.
(b) Prove that if X is locally path connected, then A is locally path connected.

8.2 Let C be the unit circle centered at the origin of R^2. Prove that if f and g are continuous maps from C to itself, and f and g have the same winding number, then f and g are homotopic. (In general a homotopy invariant is complete if two maps with the same domain and range that have the same value of the invariant are necessarily homotopic.) Hopf’s theorem (Theorem 14.4) generalizes this to all dimensions.

8.3 Prove that a retract of an AR is an AR.

8.4 Let X = ∏_{i=1}^∞ X_i be a countable cartesian product of metric spaces, endowed with the product topology.

(a) Prove that if X is an AR, then each X_i is an AR.
(b) Prove that if each X_i is an AR, then X is an AR.
(c) Prove that if X_i = {0, 1} for all i, then X is not locally connected, hence not an ANR.
(d) Prove that if each X_i is an ANR and all but finitely many of the X_i are AR’s, then X is an ANR. (The converse is also true; e.g., p. 93 of Borsuk 1967.)

8.5 Let X be a compact metric space, and let Y be an ANR.

(a) By embedding Y in a suitable Banach space, prove that C(X, Y) is an ANR.
(b) Prove that if Y is an AR, then C(X, Y) is an AR.

8.6 Let D^m := { x ∈ R^m : ‖x‖ ≤ 1 }, let K be a compact subset of int D^m := { x ∈ R^m : ‖x‖ < 1 }, and let $\partial K := K \cap \overline{D^m \setminus K}$. Prove that if f : K → D^m is a continuous function with f|_{∂K} = Id_{∂K}, then K ⊂ f(K). (Extend f to a map $\bar f : D^m \to D^m$ by setting $\bar f(x) := x$ if x ∉ K.)


8.7 Prove that if X ⊂ R^m is a compact AR, then R^m \ X does not have a connected component that is bounded. (A more challenging problem is to prove that if X ⊂ R^m is an ANR, then R^m \ X has finitely many connected components; e.g., p. 193 of Hu 1965.) Conclude that R^m \ X is connected.

Chapter 9 Approximation of Correspondences by Functions

In extending fixed point theory from functions to correspondences, an important method is to show that continuous functions are dense in the space of correspondences, so that any correspondence can be approximated by a function. In the last chapter we saw such a result (Theorem 7.1) for convex valued correspondences, but much greater care and ingenuity is required by the arguments showing that contractible valued correspondences have good approximations. This chapter states and proves the key result in this direction. This result was proved in the Euclidean case by Mas-Colell (1974) and extended to ANR’s by the author in McLennan (1991). This chapter is essentially a long proof that applies earlier concepts and results, but does not develop new ones, so there are no exercises.

9.1 The Approximation Result

Our main result can be stated rather easily. Fix ANR’s X and Y with X separable. Suppose that C ⊂ D ⊂ X where C and D are compact with C ⊂ int D.

Theorem 9.1 (Approximation Theorem) If F : D → Y is an upper hemicontinuous contractible valued correspondence, then for any neighborhood W of Gr(F|_C) there are:

(a) a continuous f : C → Y with Gr(f) ⊂ W;
(b) a neighborhood W′ of Gr(F) such that, for any two continuous functions f0, f1 : D → Y with Gr(f0), Gr(f1) ⊂ W′, there is a homotopy h : C × [0, 1] → Y with h0 = f0|_C, h1 = f1|_C, and Gr(h_t) ⊂ W for all 0 ≤ t ≤ 1.

Roughly, (a) is an existence result, while (b) is uniqueness up to effective equivalence.

Here, and later in the book, things would be much simpler if we could have C = D. More precisely, it would be nice to drop the assumption that C ⊂ int D. This may be possible (that is, I do not know a relevant counterexample) but a proof would certainly involve quite different methods.

The following is an initial indication of the significance of this result.

Theorem 9.2 If X is a compact ANR with the fixed point property, then any upper hemicontinuous contractible valued correspondence F : X → X has a fixed point.

Proof In the last result let $Y := X$ and $C := D := X$. Endow $X$ with a metric $d_X$. For each $j = 1, 2, \ldots$, let

$$W_j := \{\, (x', y') \in X \times X : d_X(x, x') + d_X(y, y') < 1/j \text{ for some } (x, y) \in \mathrm{Gr}(F) \,\},$$

let $f_j : X \to X$ be a continuous function with $\mathrm{Gr}(f_j) \subset W_j$, let $z_j$ be a fixed point of $f_j$, and let $(x'_j, y'_j)$ be a point in $\mathrm{Gr}(F)$ with $d_X(x'_j, z_j) + d_X(y'_j, z_j) < 1/j$. (Such a point exists because $(z_j, z_j) \in \mathrm{Gr}(f_j) \subset W_j$.) Passing to convergent subsequences, we find that the common limit of the sequences $\{x'_j\}$, $\{y'_j\}$, and $\{z_j\}$ is a fixed point of $F$. □

Much later, applying Theorem 9.1, we will show that a nonempty compact contractible ANR has the fixed point property.

In order to indicate the overall nature of the argument, and to create a common framework, we now state the main steps in the proof of Theorem 9.1. We fix a locally convex topological vector space $V$, a convex $Q \subset V$, and a relatively open $Z \subset Q$.

Proposition 9.1 Let $B$ be a convex neighborhood of the origin in $V$. Suppose $S \subset Z$ is compact and contractible. There is a convex neighborhood $A$ of the origin in $V$ such that for any simplex $\Delta$, any continuous $f' : \partial\Delta \to (S + A) \cap Z$ has a continuous extension $f : \Delta \to (S + B) \cap Z$.

Let $K$ be a finite simplicial complex, and let $J$ be a subcomplex. Since there is no risk of confusion our notation does not distinguish between a simplicial complex $K$ and the space $|K| = \bigcup_{\sigma \in K} |\sigma|$.

Proposition 9.2 If $F : K \to Z$ is an upper hemicontinuous contractible valued correspondence, then for any neighborhood $W \subset K \times Z$ of $\mathrm{Gr}(F)$ there is a neighborhood $W' \subset J \times Z$ of $\mathrm{Gr}(F|_J)$ such that any continuous $f' : J \to Z$ with $\mathrm{Gr}(f') \subset W'$ has a continuous extension $f : K \to Z$ with $\mathrm{Gr}(f) \subset W$.

Proposition 9.3 If $F : D \to Z$ is an upper hemicontinuous contractible valued correspondence, then for any neighborhood $W$ of $\mathrm{Gr}(F|_C)$ there exist:

(a) a continuous $f : C \to Z$ with $\mathrm{Gr}(f) \subset W$;
(b) a neighborhood $W'$ of $\mathrm{Gr}(F)$ such that for any two continuous functions $f_0, f_1 : D \to Z$ with $\mathrm{Gr}(f_0), \mathrm{Gr}(f_1) \subset W'$ there is a homotopy $h : C \times [0, 1] \to Z$ with $h_0 = f_0|_C$, $h_1 = f_1|_C$, and $\mathrm{Gr}(h_t) \subset W$ for all $0 \le t \le 1$.

The final step is not difficult. Since an ANR is a retract of a relatively open subset of a convex subset of a locally convex space (Proposition 8.3) we may assume there is a retraction $r : Z \to Y$. Let $i : Y \to Z$ be the inclusion.


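Proof of Theorem 9.1. Let $\tilde W := (\mathrm{Id}_X \times r)^{-1}(W)$. Proposition 9.3(a) implies that there is a continuous $\tilde f : C \to Z$ with $\mathrm{Gr}(\tilde f) \subset \tilde W$, and setting $f := r \circ \tilde f$ verifies (a) of Theorem 9.1.

Let $\tilde W' \subset \tilde W$ be a neighborhood of $\mathrm{Gr}(i \circ F)$ with the property asserted by Proposition 9.3(b). Let $W' := (\mathrm{Id}_X \times i)^{-1}(\tilde W')$. Suppose that $f_0, f_1 : D \to Y$ are continuous with $\mathrm{Gr}(f_0), \mathrm{Gr}(f_1) \subset W'$. Then there is a homotopy $\tilde h : C \times [0, 1] \to Z$ with $\tilde h_0 = i \circ f_0|_C$, $\tilde h_1 = i \circ f_1|_C$, and $\mathrm{Gr}(\tilde h_t) \subset \tilde W$ for all $0 \le t \le 1$. If we set $h := r \circ \tilde h$, then $h_0 = f_0|_C$, $h_1 = f_1|_C$, and $\mathrm{Gr}(h_t) \subset W$ for all $0 \le t \le 1$, as per the assertion of (b) of Theorem 9.1. □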

9.2 Technical Lemmas

The arguments require several technical results. Three of these have a similar character, so it makes sense to present them together. In this section $X$ is just a metric space.

Lemma 9.1 Suppose $X$ is compact and $U_1, \ldots, U_n$ is a cover of $X$ by open sets, none of which is $X$ itself. Then there is a cover $\tilde U_1, \ldots, \tilde U_r$ of $X$ by open sets such that for each $j = 1, \ldots, r$ there is an $i$ such that $U_i$ contains $\tilde U_j$ and every $\tilde U_k$ such that $\tilde U_j \cap \tilde U_k \ne \emptyset$.

Proof Let $\alpha > 0$ be such that $\frac{2\alpha}{1-\alpha} < \frac{1-\alpha}{1+\alpha}$. (That is, $\alpha < \sqrt{5} - 2$.) For each $x \in X$ let $r_x$ be the supremum of the set of $\varepsilon > 0$ such that $U_\varepsilon(x) \subset U_i$ for some $i$, and let $\tilde U_x$ be an open subset of $U_{\alpha r_x}(x)$ that contains $x$. We claim that for all $x, x' \in X$, if $\tilde U_x \cap \tilde U_{x'} \ne \emptyset$, then $\tilde U_{x'} \subset U_{r_x}(x)$. Aiming at a contradiction, assume that $\tilde U_{x'}$ is not contained in $U_{r_x}(x)$. The distance from $x$ to any point in $\tilde U_{x'}$ cannot exceed $\alpha(r_x + 2r_{x'})$, so $\alpha(r_x + 2r_{x'}) > r_x$, which boils down to $2\alpha r_{x'} > (1 - \alpha) r_x$. Since $\frac{2\alpha}{1-\alpha} < \frac{1-\alpha}{1+\alpha} < \frac{1-\alpha}{\alpha}$ it follows that $(1 - \alpha) r_{x'} > \alpha r_x$ and thus $r_{x'} > \alpha(r_x + r_{x'})$, which is not less than the distance from $x$ to $x'$. Therefore $r_x > r_{x'} - \alpha(r_x + r_{x'})$, which reduces to $(1 + \alpha) r_x > (1 - \alpha) r_{x'}$, but now $\frac{2\alpha}{1-\alpha} > \frac{r_x}{r_{x'}} > \frac{1-\alpha}{1+\alpha}$, contrary to the condition on $\alpha$. Since $X$ is compact there is a suitable open cover of the form $\tilde U_{x_1}, \ldots, \tilde U_{x_r}$. □
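As a numerical sanity check on the parenthetical remark (it plays no role in the proof), the condition on α can be tested on either side of √5 − 2. This is only a sketch; the function name `satisfies_condition` is an illustrative assumption.

```python
import math

def satisfies_condition(alpha: float) -> bool:
    """True when 2a/(1-a) < (1-a)/(1+a), the condition imposed on alpha."""
    return 2 * alpha / (1 - alpha) < (1 - alpha) / (1 + alpha)

# For 0 < alpha < 1 the inequality is equivalent to alpha**2 + 4*alpha - 1 < 0,
# whose positive root is sqrt(5) - 2.
threshold = math.sqrt(5) - 2

below = satisfies_condition(threshold - 1e-6)  # slightly smaller alpha
above = satisfies_condition(threshold + 1e-6)  # slightly larger alpha
```

Here `below` should be true and `above` false, which matches the stated bound.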

The remaining three results of this section have the following setting: there is a compact $C \subset X$, a topological space $Y$, an upper hemicontinuous correspondence $F : C \to Y$, and a neighborhood $W \subset X \times Y$ of $\mathrm{Gr}(F)$.

Lemma 9.2 For any $x \in C$ there is a neighborhood $U_x \subset X$ of $x$ and a neighborhood $V_x$ of $F(x)$ such that $U_x \times V_x \subset W$.

Proof By the definition of the product topology, for every $y \in F(x)$ there exist neighborhoods $U_y$ of $x$ and $V_y$ of $y$ such that $U_y \times V_y \subset W$. Since $F(x)$ is compact there are $y_1, \ldots, y_K$ such that $V_{y_1}, \ldots, V_{y_K}$ is a cover of $F(x)$. Let $U_x := \bigcap_j U_{y_j}$ and $V_x := \bigcup_j V_{y_j}$. □


Lemma 9.3 There is an $\varepsilon > 0$ and a neighborhood $\tilde W$ of $\mathrm{Gr}(F)$ such that

$$\bigcup_{(x,y) \in \tilde W} U_\varepsilon(x) \times \{y\} \subset W .$$

Proof For each $x \in C$ Lemma 9.2 allows us to choose $\delta_x > 0$ and a neighborhood $V_x$ of $F(x)$ such that $U_{2\delta_x}(x) \times V_x \subset W$. Replacing $\delta_x$ with a smaller number if need be, we may assume without loss of generality that $F(x') \subset V_x$ for all $x' \in U_{2\delta_x}(x)$. Choose $x_1, \ldots, x_H$ such that $U_{\delta_{x_1}}(x_1), \ldots, U_{\delta_{x_H}}(x_H)$ cover $C$. Let $\varepsilon := \min_i \delta_{x_i}$, and set

$$\tilde W := \bigcup_i U_{\delta_{x_i}}(x_i) \times V_{x_i} .$$

□

Lemma 9.4 Suppose that $f : S \to C$ is a continuous function, where $S$ is a compact metric space. If $\tilde W$ is a neighborhood of $\mathrm{Gr}(F \circ f)$, then there is a neighborhood $W$ of $\mathrm{Gr}(F)$ such that $(f \times \mathrm{Id}_Y)^{-1}(W) \subset \tilde W$.

Proof Consider a particular $x \in C$. Applying Lemma 9.2, for any $s \in f^{-1}(x)$ we can choose a neighborhood $U_s$ of $s$ and a neighborhood $V_s \subset Y$ of $F(x)$ such that $U_s \times V_s \subset \tilde W$. Since $f^{-1}(x)$ is compact, there are $s_1, \ldots, s_\ell$ such that $U_{s_1}, \ldots, U_{s_\ell}$ cover $f^{-1}(x)$. Let $V_x := V_{s_1} \cap \cdots \cap V_{s_\ell}$, and let $U$ be a neighborhood of $x$ small enough that $f^{-1}(U) \subset U_{s_1} \cup \cdots \cup U_{s_\ell}$ and $F(x') \subset V_x$ for all $x' \in U$. (Such a $U$ must exist because $S$ is compact and $F$ is upper hemicontinuous.) Then

$$(f \times \mathrm{Id}_Y)^{-1}(U \times V_x) \subset \bigcup_i U_{s_i} \times V_x \subset \tilde W .$$

Since $x$ was arbitrary, this establishes the claim. □

9.3 Proofs of the Propositions

Proof of Proposition 9.1. We assume without loss of generality that $(S + B) \cap Q \subset Z$. (Since $S$ is compact, there are $s_1, \ldots, s_k \in S$ and neighborhoods of the origin $B_1, \ldots, B_k$ such that $(s_i + 2B_i) \cap Q \subset Z$ for each $i$ and $S \subset \bigcup_i (s_i + B_i)$. We can replace $B$ with $B \cap \bigcap_i B_i$.) Since we can replace $B$ with $B \cap -B$, we may assume that $B = -B$.

Let $c : S \times [0, 1] \to S$ be a contraction. There is a convex neighborhood of the origin $A$ and $\delta > 0$ such that $A = -A$, $2A \subset B$, and $c(s', t') - c(s, t) \in B$ for all $(s, t), (s', t') \in S \times [0, 1]$ with $s - s' \in 3A$ and $|t - t'| < \delta$. (This is also a straightforward consequence of continuity and compactness: there are $(s_1, t_1), \ldots, (s_n, t_n) \in S \times [0, 1]$, neighborhoods of the origin $A_1, \ldots, A_n$, and $\delta_1, \ldots, \delta_n > 0$, such that $S \times [0, 1]$ is covered by the sets $(s_i + 3A_i) \times (t_i - \delta_i, t_i + \delta_i)$ and for each $i$, $c(s, t) - c(s_i, t_i) \in \tfrac12 B$ for all $(s, t) \in S \times [0, 1]$ with $s - s_i \in 6A_i$ and $|t - t_i| < 2\delta_i$. The desired condition holds if we set $A' := \tfrac12 B \cap \bigcap_i A_i$, $A := A' \cap -A'$, and $\delta := \min_i \delta_i$.) Let $\Delta$ be a simplex, and let $f' : \partial\Delta \to (S + A) \cap Q$ be a continuous function.

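Let $\beta$ be the barycenter of $\Delta$. We define “polar coordinate” functions

$$y : \Delta \setminus \{\beta\} \to \partial\Delta \quad\text{and}\quad t : \Delta \setminus \{\beta\} \to (0, 1]$$

implicitly by requiring that $(1 - t(x))\beta + t(x)y(x) = x$, so that $t(x) = 1$ exactly when $x \in \partial\Delta$. Let

$$\Delta_1 := t^{-1}((0, \tfrac13]) \cup \{\beta\}, \quad \Delta_2 := t^{-1}([\tfrac13, \tfrac23]), \quad \Delta_3 := t^{-1}([\tfrac23, 1]) .$$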
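To make the defining relation $(1 - t(x))\beta + t(x)y(x) = x$ concrete, here is a small sketch that computes $t(x)$ and $y(x)$ when points of $\Delta$ are written in barycentric coordinates. Using barycentric coordinates is an assumption of the sketch, not something the argument relies on, and the function names are illustrative.

```python
def polar_coordinates(lam):
    """Given barycentric coordinates lam of a point x other than the barycenter,
    return (t, y) with x = (1 - t) * beta + t * y, where beta is the barycenter
    and y lies on the boundary (at least one coordinate of y is zero)."""
    n = len(lam)
    t = 1.0 - n * min(lam)  # t = 1 on the boundary, t tends to 0 near beta
    if t <= 0.0:
        raise ValueError("the barycenter has no polar coordinates")
    y = [1.0 / n + (l - 1.0 / n) / t for l in lam]
    return t, y

def regions(t):
    """Indices i such that a point with radial coordinate t lies in Delta_i."""
    found = []
    if t <= 1.0 / 3.0:
        found.append(1)
    if 1.0 / 3.0 <= t <= 2.0 / 3.0:
        found.append(2)
    if t >= 2.0 / 3.0:
        found.append(3)
    return found

x = [0.5, 0.3, 0.2]  # a point of a 2-simplex
t, y = polar_coordinates(x)
recombined = [(1 - t) / len(x) + t * yi for yi in y]
```

For this sample point the radial coordinate is about 0.4, so the point lies in $\Delta_2$, and `recombined` reproduces `x` up to rounding.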

We now define $f$ on each of these three sets. Let $z$ be the point $S$ is contracted to by $c$: $c(S, 1) = \{z\}$. We define $f$ on $\Delta_1$ by setting $f(x) := z$.

Let $d$ be the metric on $\Delta$ given by Euclidean distance. Since $f'$, $t(\cdot)$, and $y(\cdot)$ are continuous, and $\Delta_2$ is compact, for some sufficiently small $\lambda > 0$ it is the case that

$$f'(y(x)) - f'(y(x')) \in A \quad\text{and}\quad |t(x) - t(x')| < \tfrac13 \delta$$

for all $x, x' \in \Delta_2$ such that $d(x, x') < \lambda$. There is a polyhedral subdivision of $\Delta_2$ whose cells are the sets

$$y^{-1}(F) \cap t^{-1}(\tfrac13), \quad y^{-1}(F) \cap \Delta_2, \quad y^{-1}(F) \cap t^{-1}(\tfrac23)$$

for the various faces $F$ of $\Delta$. Proposition 2.10 implies that repeated barycentric subdivision of this polyhedral complex results eventually in a simplicial subdivision of $\Delta_2$ whose mesh is less than $\lambda$. For each vertex $v$ of this subdivision choose $s(v) \in (f'(y(v)) + A) \cap S$, and set $f(v) := c(s(v), 2 - 3t(v))$. If $\Delta'$ is a simplex of the subdivision of $\Delta_2$ with vertices $v_1, \ldots, v_r$, and $x = \alpha_1 v_1 + \cdots + \alpha_r v_r \in \Delta'$, then we set $f(x) := \alpha_1 f(v_1) + \cdots + \alpha_r f(v_r)$.

If $v$ is a vertex of the simplicial subdivision of $\Delta_2$ and $t(v) = \tfrac13$, then $f(v) = c(s(v), 1) = z$. The vertices of any simplex of the subdivision of $\Delta_2$ that is contained in $\{\, x \in \Delta : t(x) = \tfrac13 \,\} = \Delta_1 \cap \Delta_2$ are all of this sort, so the two definitions of $f$ on $\Delta_1 \cap \Delta_2$ agree.

We define $f$ on $\Delta_3$ by setting

$$f(x) := (3t(x) - 2) f'(y(x)) + (3 - 3t(x)) f(\tfrac13 \beta + \tfrac23 y(x)) ,$$

where $f(\tfrac13 \beta + \tfrac23 y(x))$ has already been defined because $\tfrac13 \beta + \tfrac23 y(x) \in \Delta_2$. If $t(x) = \tfrac23$, then $x = \tfrac13 \beta + \tfrac23 y(x)$, so the definitions of $f$ on $\Delta_2$ and $\Delta_3$ agree at $x$. If $t(x) = 1$, then $y(x) = x$ and $f(x) = f'(x)$.

Thus $f$ is an unambiguously defined extension of $f'$. Evidently $f$ is continuous on each of $\Delta_1$, $\Delta_2$, and $\Delta_3$, so it is continuous on all of $\Delta$. It remains to show that $f(\Delta) \subset (S + B) \cap Q$.


Of course $f(\Delta_1) = \{z\} \subset S \subset Q$.

To show that $f(\Delta_2) \subset (S + B) \cap Q$ consider a point $x = \alpha_1 v_1 + \cdots + \alpha_r v_r$ in a simplex $\Delta'$ of the subdivision of $\Delta_2$, with vertices $v_1, \ldots, v_r$. We have $f(v_1) = c(s(v_1), 2 - 3t(v_1)) \in S$, and $s(v_j) - f'(y(v_j)) \in A$ for all $j = 1, \ldots, r$. For each $j = 2, \ldots, r$ we have $d(v_j, v_1) < \lambda$, so $f'(y(v_j)) - f'(y(v_1)) \in A$ and $|t(v_j) - t(v_1)| < \tfrac13 \delta$. Therefore $s(v_j) - s(v_1) \in 3A$ and

$$f(v_j) - f(v_1) = c(s(v_j), 2 - 3t(v_j)) - c(s(v_1), 2 - 3t(v_1)) \in B .$$

Thus

$$f(x) = f(v_1) + \sum_{j=2}^{r} \alpha_j (f(v_j) - f(v_1)) \in S + B .$$

Since $f(x)$ is a convex combination of elements of $S$, it is an element of $Q$.

Now consider $x \in \Delta_3$. Let $x' := \tfrac23 y(x) + \tfrac13 \beta$. As above $x' = \alpha_1 v_1 + \cdots + \alpha_r v_r$ where $v_1, \ldots, v_r$ are the vertices of a simplex $\Delta'$ of the triangulation of $\Delta_2$, but now $\Delta' \subset \Delta_2 \cap \Delta_3$, so $t(x'') = \tfrac23$ for all $x'' \in \Delta'$. In particular, $t(v_1) = \tfrac23$ and thus $f(v_1) = c(s(v_1), 0) = s(v_1)$. Since $d(x', v_1) < \lambda$ and $t(x') = t(v_1) = \tfrac23$ we have $f'(y(x)) = f'(y(x')) \in f'(y(v_1)) + A \subset s(v_1) + 2A \subset S + B$. Above we showed that $f(x') \in f(v_1) + B = s(v_1) + B$, so $f(x) \in s(v_1) + B$ because it is a convex combination of $f'(y(x))$ and $f(x')$. Again $f(x)$ is a convex combination of elements of $Q$, so it is an element of $Q$. □

Lemma 9.5 If $F : K \to Z$ is an upper hemicontinuous contractible valued correspondence, and $k$ is the maximal dimension of any simplex in $K$ that is not in $J$, then for any neighborhood $W \subset K \times Z$ of $\mathrm{Gr}(F)$ there is a neighborhood $W' \subset J \times Z$ of $\mathrm{Gr}(F|_H)$ and a subdivision of $K$ such that if $H$ is the union of $J$ and the $(k - 1)$-skeleton of that subdivision, then any continuous $f' : H \to Z$ with $\mathrm{Gr}(f') \subset W'$ has a continuous extension $f : K \to Z$ with $\mathrm{Gr}(f) \subset W$.

Proof We develop two open coverings of $K$. Consider a particular $x \in K$. Lemma 9.2 allows us to choose a neighborhood $U_x$ of $x$ and a convex neighborhood $B_x$ of the origin in $V$ such that

$$U_x \times ((F(x) + B_x) \cap Z) \subset W .$$

Since $F(x)$ is contractible, Proposition 9.1 gives a convex neighborhood $A_x$ of the origin such that for any simplex $\Delta$, any continuous function $f' : \partial\Delta \to (F(x) + A_x) \cap Z$ has a continuous extension $f : \Delta \to (F(x) + B_x) \cap Z$. Choose $x_1, \ldots, x_n$ such that $U_{x_1}, \ldots, U_{x_n}$ is a covering of $K$. Let $A := \bigcap_{i=1}^{n} A_{x_i}$.

Lemma 9.1 gives a covering $\tilde U_1, \ldots, \tilde U_p$ such that for each $j$ there is some $i$ such that $U_{x_i}$ contains all $\tilde U_k$ such that $\tilde U_j \cap \tilde U_k \ne \emptyset$. The upper hemicontinuity of $F$ implies that each $y$ has an open neighborhood $\tilde U_y$ contained in some $\tilde U_j$ such that $F(y') \subset F(y) + \tfrac12 A$ for all $y' \in \tilde U_y$. Choose $y_1, \ldots, y_p \in K$ such that $\tilde U_{y_1}, \ldots, \tilde U_{y_p}$ cover $K$. Set

$$W' := \bigcup_{j=1}^{p} \tilde U_{y_j} \times ((F(y_j) + \tfrac12 A) \cap Z) .$$

Evidently $\mathrm{Gr}(F) \subset W'$. We have $W' \subset W$ because for each $j$ there is some $i$ such that $\tilde U_{y_j} \subset U_{x_i}$ and

$$(F(y_j) + \tfrac12 A) \cap Z \subset ((F(x_i) + \tfrac12 A_{x_i}) + \tfrac12 A) \cap Z \subset (F(x_i) + A_{x_i}) \cap Z .$$

Proposition 2.10 and Lebesgue's number lemma (Lemma 2.12) imply that repeated barycentric subdivision leads eventually to a subdivision of $K$ with each simplex contained in some $\tilde U_{y_j}$. Let $H$ be the union of $J$ with the $(k - 1)$-skeleton of this subdivision, and fix a continuous $f' : H \to Z$ with $\mathrm{Gr}(f') \subset W'$.

Suppose $\Delta$ is a $k$-simplex of $K$ that is not contained in $H$. By construction there is a $j$ such that $\Delta \subset \tilde U_{y_j}$. There is some $x_i$ with $\tilde U_{y_{j'}} \subset U_{x_i}$ for all $j'$ such that $\tilde U_{y_j} \cap \tilde U_{y_{j'}} \ne \emptyset$, either because all of $K$ is contained in a single $U_{x_i}$ or as an application of Lemma 9.1. The conditions imposed on our construction imply that

$$f'(\partial\Delta) \subset \bigcup \big(F(y_{j'}) + \tfrac12 A\big) \subset \bigcup \big(F(y_{j'}) + \tfrac12 A_{x_i}\big)$$

where the unions are over all $j'$ such that $\tilde U_{y_j} \cap \tilde U_{y_{j'}} \ne \emptyset$, so $f'(\partial\Delta) \subset F(x_i) + A_{x_i}$. Therefore $f'|_{\partial\Delta}$ has a continuous extension $f : \Delta \to Z$ with $\mathrm{Gr}(f) \subset W$. Since we can extend $f'$ one $k$-simplex at a time, the proof is complete. □

Proof of Proposition 9.2. Let $m$ be the dimension of $K$. Suppose that for some $k = 0, \ldots, m$ we already have a neighborhood $W_k \subset W$ of $\mathrm{Gr}(F)$ and a subdivision of $K$ such that if $H_k$ is the union of $J$ and the $k$-skeleton of this subdivision, and $f' : H_k \to Z$ is continuous with $\mathrm{Gr}(f') \subset W_k$, then there is an extension $f : K \to Z$ with $\mathrm{Gr}(f) \subset W$. (For $k = m$ we may take $W_m := W$ and the given subdivision of $K$.) The last result implies that there is a neighborhood $W_{k-1} \subset W_k$ of $\mathrm{Gr}(F)$ and a further subdivision of $K$ such that if $H_{k-1}$ is the union of $J$ and the $(k - 1)$-skeleton of this subdivision and $f'' : H_{k-1} \to Z$ is continuous with $\mathrm{Gr}(f'') \subset W_{k-1}$, then there is an extension $f' : H_k \to Z$ with $\mathrm{Gr}(f') \subset W_k$. By induction on $k$ there is a neighborhood $W_0 \subset W$ of $\mathrm{Gr}(F)$ and a subdivision of $K$ such that if $H_0$ is the union of $J$ and the 0-skeleton of this subdivision and $f' : H_0 \to Z$ is continuous with $\mathrm{Gr}(f') \subset W_0$, then there is an extension $f : K \to Z$ with $\mathrm{Gr}(f) \subset W$. Any continuous $f'' : J \to Z$ with $\mathrm{Gr}(f'') \subset W_0$ extends to a continuous $f' : H_0 \to Z$ with $\mathrm{Gr}(f') \subset W_0$ simply because $F$ is nonempty valued, so it suffices to take $W' := W_0$. □

Proof of Proposition 9.3. Lemma 9.3 gives a neighborhood $W''$ of $\mathrm{Gr}(F)$ and $\varepsilon > 0$ such that

$$\bigcup_{(x,z) \in W''} U_\varepsilon(x) \times \{z\} \subset W .$$

Since we can replace $\varepsilon$ with a smaller number, we may assume that $U_\varepsilon(C)$ is contained in the interior of $D$. Because $X$ is a separable ANR, the domination theorem (Theorem 8.3) implies that there is a finite simplicial complex $K$ that $\varepsilon/2$-dominates $D$. Recall that this means that there are maps $\varphi : D \to K$, $\psi : K \to X$, and $\eta : D \times [0, 1] \to X$ such that $\eta_0 = \mathrm{Id}_D$, $\eta_1 = \psi \circ \varphi$, and for each $x \in D$ there is some $x'$ such that $\eta(x, [0, 1]) \subset U_{\varepsilon/2}(x')$. Of course this implies that $\eta(x, [0, 1]) \subset U_\varepsilon(x)$, and in particular we have $\varphi(C) \subset \psi^{-1}(U_\varepsilon(C))$. Since $\varphi(C)$ is compact and $\psi^{-1}(U_\varepsilon(C))$ is open, Proposition 2.10 implies that after repeated subdivisions of $K$, $\psi^{-1}(U_\varepsilon(C))$ contains the subcomplex $H$ consisting of all faces of simplices that intersect $\varphi(C)$.

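Let $\tilde W'' := (\psi \times \mathrm{Id}_Z)^{-1}(W'') \subset K \times Z$. Since $\tilde W''$ is a neighborhood of $\mathrm{Gr}(F \circ \psi|_H)$, Proposition 9.2 implies the existence of a continuous function $\tilde f : H \to Z$ with $\mathrm{Gr}(\tilde f) \subset \tilde W''$. Let $f := \tilde f \circ \varphi|_C$. Then $\mathrm{Gr}(f) \subset W$, which verifies (a), because

$$(\varphi|_C \times \mathrm{Id}_Z)^{-1}(\tilde W'') = ((\psi \circ \varphi|_C) \times \mathrm{Id}_Z)^{-1}(W'') \subset \bigcup_{(x,z) \in W''} U_\varepsilon(x) \times \{z\} \subset W . \qquad (9.1)$$

Turning to (b), let $G : H \times [0, 1] \to Z$ be the correspondence $G(z, t) := F(\psi(z))$. Proposition 9.2 (with $G$, $\tilde W'' \times [0, 1]$, $H \times [0, 1]$, and $H \times \{0, 1\}$ in place of $F$, $W$, $K$, and $J$) gives neighborhoods $\tilde W'_0, \tilde W'_1 \subset \tilde W''$ of $\mathrm{Gr}(F \circ \psi|_H)$ such that any continuous function $\tilde f : H \times \{0, 1\} \to Z$ with $\mathrm{Gr}(\tilde f) \subset (\tilde W'_0 \times \{0\}) \cup (\tilde W'_1 \times \{1\})$ has a continuous extension $\tilde h : H \times [0, 1] \to Z$ with $\mathrm{Gr}(\tilde h) \subset \tilde W'' \times [0, 1]$.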

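Let $\tilde W' := \tilde W'_0 \cap \tilde W'_1$. Lemma 9.4 implies that there is a neighborhood $W'$ of $\mathrm{Gr}(F)$ such that

$$(\psi|_H \times \mathrm{Id}_Z)^{-1}(W') \subset \tilde W' .$$

Replacing $W'$ with $W' \cap W''$ if need be, we may assume that $W' \subset W''$.

Now consider continuous $f_0, f_1 : D \to Z$ with $\mathrm{Gr}(f_0), \mathrm{Gr}(f_1) \subset W'$. Let $\tilde f_0 := f_0 \circ \psi|_H$ and $\tilde f_1 := f_1 \circ \psi|_H$. We have $\mathrm{Gr}(\tilde f_0), \mathrm{Gr}(\tilde f_1) \subset \tilde W'$, so there is a homotopy $\tilde h : H \times [0, 1] \to Z$ with $\tilde h_0 = \tilde f_0$, $\tilde h_1 = \tilde f_1$, and $\mathrm{Gr}(\tilde h_t) \subset \tilde W''$ for all $t$. Let $h : C \times [0, 1] \to Z$ be the homotopy $h(x, t) := \tilde h(\varphi(x), t)$. In view of (9.1) we have

$$\mathrm{Gr}(h_t) \subset (\varphi|_C \times \mathrm{Id}_Z)^{-1}(\tilde W'') \subset W$$

for all $t$. Of course $h_0 = \tilde f_0 \circ \varphi|_C = f_0 \circ \psi \circ \varphi|_C$ and $h_1 = \tilde f_1 \circ \varphi|_C = f_1 \circ \psi \circ \varphi|_C$.

Let $j^0 : C \times [0, 1] \to Z$ be the homotopy $j^0(x, t) := f_0(\eta(x, t))$. Then $j^0_0 = f_0|_C$, $j^0_1 = f_0 \circ \psi \circ \varphi|_C$, and $\mathrm{Gr}(j^0_t) \subset W$ for all $t$ because $d(x, \eta(x, t)) < \varepsilon$ and

$$(\eta(x, t), j^0_t(x)) = (\eta(x, t), f_0(\eta(x, t))) \in W' \subset W'' .$$

Similarly, there is a homotopy $j^1 : C \times [0, 1] \to Z$ with $j^1_0 = f_1 \circ \psi \circ \varphi|_C$, $j^1_1 = f_1|_C$, and $\mathrm{Gr}(j^1_t) \subset W$ for all $t$. Combining $j^0$, $h$, and $j^1$ in the usual manner gives a homotopy between $f_0|_C$ and $f_1|_C$ whose graph is contained in $W \times [0, 1]$. The proof of (b) is complete. □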

Part IV
Smooth Methods

Chapter 10
Differentiable Manifolds

This chapter introduces the basic concepts of differential topology: ‘manifold,’ ‘tangent vector,’ ‘smooth map,’ ‘derivative.’ If these concepts are new to you, you will probably be relieved to learn that these are just the basic concepts of multivariate differential calculus, with a critical difference.

In multivariate calculus you are handed a coordinate system, and a geometry, when you walk in the door, and everything is a calculation within that given Euclidean space. But many of the applications of multivariate calculus take place in spaces like the sphere, or the physical universe, whose geometry is not Euclidean. The theory of manifolds provides a language for the concepts of differential calculus that is in many ways more natural, because it does not presume a Euclidean setting. Roughly, this has two aspects:

• In differential topology spaces that are locally homeomorphic to Euclidean spaces are defined, and we then impose structure that allows us to talk about differentiation of functions between such spaces. The concepts of interest to differential topology per se are those that are “invariant under diffeomorphism,” much as topology is sometimes defined as “rubber sheet geometry,” namely the study of those properties of spaces that don’t change when the space is bent or stretched.

• The second step is to impose local notions of angle and distance at each point of a manifold. With this additional structure the entire range of geometric issues can be addressed. This vast subject is called differential geometry.

For us differential topology will be primarily a tool that we will use to set up an environment in which issues related to fixed points have a particularly simple and tractable structure. We will only scratch its surface, and differential geometry will not figure in our work at all.

The aim of this chapter is to provide only as much information as we will need later, in the simplest and most concrete manner possible. Thus our treatment of the subject is in various ways terse and incomplete, even as an introduction to this topic, which has had an important influence on economic theory. Milnor (1965a) and Guillemin and Pollack (1974) are the traditional entry points to this material for mathematical economists, and at a somewhat higher level Hirsch (1976a) is more comprehensive, but still quite accessible. Lee (2013) is a recent, highly recommended, text.

10.1 Review of Multivariate Calculus

We begin with a quick review of the most important facts of multivariate differential calculus. Let $f : U \to \mathbb{R}^n$ be a function where $U \subset \mathbb{R}^m$ is open. Recall that if $r \ge 1$ is an integer, we say that $f$ is $C^r$ if all partial derivatives of order $\le r$ are defined and continuous. For reasons that will become evident in the next paragraph, it can be useful to extend this notation to include $r = 0$, with $C^0$ interpreted as a synonym for “continuous.” We say that $f$ is $C^\infty$ if it is $C^r$ for all finite $r$. An order of differentiability is either a nonnegative integer $r$ or $\infty$, and we write $2 \le r \le \infty$, for example, to indicate that $r$ is such an object, within the given bounds.

If $f$ is $C^1$, then $f$ is differentiable: for each $x \in U$ and $\varepsilon > 0$ there is $\delta > 0$ such that

$$\| f(x') - f(x) - Df(x)(x' - x) \| \le \varepsilon \| x' - x \|$$

for all $x' \in U$ with $\| x' - x \| < \delta$, where the derivative of $f$ at $x$ is the linear function

$$Df(x) : \mathbb{R}^m \to \mathbb{R}^n$$

given by the matrix of first partial derivatives at $x$. If $f$ is $C^r$, then the function

$$Df : U \to L(\mathbb{R}^m, \mathbb{R}^n)$$

is $C^{r-1}$ if we identify $L(\mathbb{R}^m, \mathbb{R}^n)$ with the space $\mathbb{R}^{n \times m}$ of $n \times m$ matrices. The reader is expected to know the standard facts of elementary calculus, especially that addition and multiplication are $C^\infty$, so that functions built up from these operations (e.g., linear functions and matrix multiplication) are known to be $C^\infty$.

There are three basic operations used to construct new $C^r$ functions from given functions. The first is restriction of the function to an open subset of its domain, which requires no comment because the derivative is unaffected. The second is forming the cartesian product of two functions: if $f_1 : U \to \mathbb{R}^{n_1}$ and $f_2 : U \to \mathbb{R}^{n_2}$ are functions, we define $f_1 \times f_2 : U \to \mathbb{R}^{n_1 + n_2}$ to be the function $x \mapsto (f_1(x), f_2(x))$. Evidently $f_1 \times f_2$ is $C^r$ if and only if $f_1$ and $f_2$ are $C^r$, and when this is the case we have

$$D(f_1 \times f_2) = Df_1 \times Df_2 .$$

The third operation is composition. The most important theorem of multivariate calculus is the chain rule: if $U \subset \mathbb{R}^m$ and $V \subset \mathbb{R}^n$ are open and $f : U \to V$ and $g : V \to \mathbb{R}^p$ are $C^1$, then $g \circ f$ is $C^1$ and

$$D(g \circ f)(x) = Dg(f(x)) \circ Df(x)$$

for all $x \in U$. Of course the composition of two $C^0$ functions is $C^0$. Arguing inductively, suppose we have already shown that the composition of two $C^{r-1}$ functions is $C^{r-1}$. If $f$ and $g$ are $C^r$, then $Dg \circ f$ is $C^{r-1}$, and we can apply the result above about cartesian products, then the chain rule, to the composition

$$x \mapsto (Dg(f(x)), Df(x)) \mapsto Dg(f(x)) \circ Df(x)$$

to show that $D(g \circ f)$ is $C^{r-1}$, so that $g \circ f$ is $C^r$.

Often the domain and range of the pertinent functions are presented to us as vector spaces without a given or preferred coordinate system, so it is important to observe that we can use the chain rule to achieve definitions that are independent of the coordinate systems. Let $X$ and $Y$ be $m$- and $n$-dimensional vector spaces. (In this chapter all vector spaces are finite dimensional, with $\mathbb{R}$ as the field of scalars.) Let $c : X \to \mathbb{R}^m$ and $d : Y \to \mathbb{R}^n$ be linear isomorphisms. If $U \subset X$ is open, we can say that a function $f : U \to Y$ is $C^r$, by definition, if $d \circ f \circ c^{-1} : c(U) \to \mathbb{R}^n$ is $C^r$, and if this is the case and $x \in U$, then we can define the derivative of $f$ at $x$ to be

$$Df(x) := d^{-1} \circ D(d \circ f \circ c^{-1})(c(x)) \circ c \in L(X, Y) .$$

Using the chain rule, one can easily verify that these definitions do not depend on the choice of $c$ and $d$. In addition, the chain rule given above can be used to show that this “coordinate free” definition also satisfies a chain rule. Let $Z$ be a third $p$-dimensional vector space. Then if $V \subset Y$ is open, $g : V \to Z$ is $C^r$, and $f(U) \subset V$, then $g \circ f$ is $C^r$ and $D(g \circ f) = Dg \circ Df$.
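The chain rule can be checked numerically on a small example. The sketch below approximates derivative matrices by central differences and compares $D(g \circ f)(x)$ with the matrix product $Dg(f(x))\,Df(x)$. The particular maps `f` and `g`, the base point, and the step size are arbitrary choices made for illustration only.

```python
import math

def jacobian(func, x, h=1e-6):
    """Central-difference approximation to the derivative matrix of func at x."""
    columns = []
    for j in range(len(x)):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = func(xp), func(xm)
        columns.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    # columns[j][i] approximates d func_i / d x_j; transpose into rows.
    return [[col[i] for col in columns] for i in range(len(columns[0]))]

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def f(x):  # a map R^2 -> R^2
    return [x[0] * x[1], x[0] + x[1] ** 2]

def g(u):  # a map R^2 -> R^1
    return [math.sin(u[0]) + u[0] * u[1]]

def g_after_f(x):
    return g(f(x))

x0 = [0.7, -0.4]
lhs = jacobian(g_after_f, x0)                      # D(g o f)(x0)
rhs = matmul(jacobian(g, f(x0)), jacobian(f, x0))  # Dg(f(x0)) Df(x0)
max_gap = max(abs(a - b) for ra, rb in zip(lhs, rhs) for a, b in zip(ra, rb))
```

The two matrices agree up to discretization and rounding error, so `max_gap` is tiny.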

Sometimes we will deal with functions whose domains are not open, and we need to define what it means for such a function to be $C^r$. Let $S$ be a subset of $X$ of any sort whatsoever. If $Y$ is another vector space and $f : S \to Y$ is a function, then $f$ is $C^r$ by definition if there is an open $U \subset X$ containing $S$ and a $C^r$ function $F : U \to Y$ such that $f = F|_S$. Evidently being $C^r$ isn’t the same thing as having a well defined derivative at each point in the domain!

Note that the identity function on $S$ is always $C^r$, and the chain rule implies that compositions of $C^r$ functions are $C^r$. Those who are familiar with the category concept will recognize that there is a category of subsets of finite dimensional vector spaces and $C^r$ maps between them. (If you haven’t heard of categories it would certainly be a good idea to learn a bit about them, but what happens later won’t depend on this language.)

We now state coordinate free versions of the inverse and implicit function theorems. Since you are expected to know the usual, coordinate dependent, formulations of these results, and it is obvious that these imply the statements below, we give no proofs.

Theorem 10.1 (Inverse Function Theorem) If $n = m$ (that is, $X$ and $Y$ are both $m$-dimensional), $U \subset X$ is open, $f : U \to Y$ is $C^r$, $x \in U$, and $Df(x)$ is nonsingular, then there is an open $V \subset U$ containing $x$ such that $f|_V$ is injective, $f(V)$ is open in $Y$, and $(f|_V)^{-1}$ is $C^r$.


Suppose that $U \subset X \times Y$ is open and $f : U \to Z$ is a function. If $f$ is $C^1$, then, at a point $(x, y) \in U$, we can define “partial derivatives” $D_x f(x, y) \in L(X, Z)$ and $D_y f(x, y) \in L(Y, Z)$ to be the derivatives of the functions

$$f(\cdot, y) : \{\, x \in X : (x, y) \in U \,\} \to Z \quad\text{and}\quad f(x, \cdot) : \{\, y \in Y : (x, y) \in U \,\} \to Z$$

at $x$ and $y$ respectively.

Theorem 10.2 (Implicit Function Theorem) Suppose that $p = n$. (That is, $Y$ and $Z$ have the same dimension.) If $U \subset X \times Y$ is open, $f : U \to Z$ is $C^r$, $(x_0, y_0) \in U$, $f(x_0, y_0) = z_0$, and $D_y f(x_0, y_0)$ is nonsingular, then there is an open $V \subset X$ containing $x_0$, an open $W \subset U$ containing $(x_0, y_0)$, and a $C^r$ function $g : V \to Y$ such that $g(x_0) = y_0$ and

$$\{\, (x, g(x)) : x \in V \,\} = \{\, (x, y) \in W : f(x, y) = z_0 \,\} .$$

In addition

$$Dg(x_0) = -D_y f(x_0, y_0)^{-1} \circ D_x f(x_0, y_0) .$$
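The derivative formula can be illustrated in the simplest case $X = Y = Z = \mathbb{R}$. The sketch below takes $f(x, y) = x^2 + y^2$ near the point $(0.6, 0.8)$, where the level set is locally the graph of an explicit function $g$, and compares the formula with a direct finite-difference derivative of $g$. The example function and the base point are assumptions chosen for illustration.

```python
import math

def f(x, y):
    return x * x + y * y

x0, y0 = 0.6, 0.8
z0 = f(x0, y0)

def g(x):
    """Explicit local solution of f(x, g(x)) = z0 near (x0, y0)."""
    return math.sqrt(z0 - x * x)

h = 1e-6
dx_f = (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h)  # approximates D_x f(x0, y0)
dy_f = (f(x0, y0 + h) - f(x0, y0 - h)) / (2 * h)  # approximates D_y f(x0, y0)

formula = -dx_f / dy_f                      # -D_y f^{-1} o D_x f
direct = (g(x0 + h) - g(x0 - h)) / (2 * h)  # finite-difference Dg(x0)
```

Both quantities approximate $-x_0/y_0 = -0.75$.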

We will sometimes encounter settings in which the decomposition of the domain into a cartesian product is not given. Suppose that $T$ is a fourth vector space, $U \subset T$ is open, $t_0 \in U$, $f : U \to Z$ is $C^r$, and $Df(t_0) : T \to Z$ is surjective. Let $Y$ be a linear subspace of $T$ of the same dimension as $Z$ such that $Df(t_0)|_Y$ is surjective, and let $X$ be a complementary linear subspace: $X \cap Y = \{0\}$ and $X + Y = T$. If we identify $T$ with $X \times Y$, then the assumptions of the result above hold. We will understand the implicit function theorem as extending in the obvious way to this setting.

The next result generalizes the inverse function theorem, and it easily implies the implicit function theorem. It allows us to impose coordinate systems that will be quite convenient in various arguments later.

Theorem 10.3 (Constant Rank Theorem) Suppose that $U \subset \mathbb{R}^m$ is open and $f : U \to \mathbb{R}^n$ is a $C^r$ function such that for all $x \in U$, the rank of $Df(x)$ is $k$. Then for any $p_0 \in U$ there are neighborhoods $V \subset U$ of $p_0$ and $W \subset \mathbb{R}^n$ of $f(p_0)$, and $C^r$ diffeomorphisms $\varphi : V \to \varphi(V) \subset \mathbb{R}^m$ and $\psi : W \to \psi(W) \subset \mathbb{R}^n$, such that

$$\psi \circ f \circ \varphi^{-1} : q \mapsto (q_1, \ldots, q_k, 0, \ldots, 0) .$$

Proof We write $f(p) = (A(y, z), B(y, z))$ where $p = (y, z) \in \mathbb{R}^k \times \mathbb{R}^{m-k}$, $A(y, z) \in \mathbb{R}^k$, and $B(y, z) \in \mathbb{R}^{n-k}$. We may assume that $D_y A(y, z)$ has rank $k$ for all $(y, z) \in U$ because we could have reordered the coordinates and then restricted to some neighborhood of $p_0$ to make this the case. Let $\varphi(y, z) := (A(y, z), z)$. Then

$$D\varphi(y, z) = \begin{pmatrix} D_y A(y, z) & D_z A(y, z) \\ 0 & I_{m-k} \end{pmatrix}$$

is nonsingular, so the inverse function theorem gives a neighborhood $V$ of $p_0$ such that $\varphi|_V$ has a $C^r$ inverse. For $q \in \varphi(V)$ let $q = (x, w) \in \mathbb{R}^k \times \mathbb{R}^{m-k}$. Then $f(\varphi^{-1}(x, w)) = (x, C(x, w)) \in \mathbb{R}^k \times \mathbb{R}^{n-k}$, and $D_w C(x, w) = 0$ because

$$D(f \circ \varphi^{-1})(x, w) = \begin{pmatrix} I_k & 0 \\ D_x C(x, w) & D_w C(x, w) \end{pmatrix}$$

has rank $k$. Possibly after replacing $V$ with a smaller neighborhood of $p_0$, we may treat $C$ as a function of $x$ alone. We define $\psi$ on a suitable neighborhood $W$ of $f(p_0)$ by setting $\psi(u, v) := (u, v - C(u))$ where $u \in \mathbb{R}^k$ and $v \in \mathbb{R}^{n-k}$. Finally replace $V$ with $V \cap f^{-1}(W)$. □

10.2 Smooth Partitions of Unity

A common problem in differential topology is the passage from local to global. That is, one is given or can prove the existence of objects that are defined locally in a neighborhood of each point, and one wishes to construct a global object with the same properties. A common and simple method of doing so is to take convex combinations, where the weights in the convex combination vary smoothly. This section develops the technology underlying this sort of argument, then develops some illustrative and useful applications.

Fix a finite dimensional vector space X and r such that 1 ≤ r ≤ ∞.

Definition 10.1 Suppose that $\{U_\alpha\}_{\alpha \in A}$ is a collection of open subsets of $X$ and $U := \bigcup_\alpha U_\alpha$. A $C^r$ partition of unity for $U$ subordinate to $\{U_\alpha\}$ is a collection $\{\varphi_\beta : X \to [0, 1]\}_{\beta \in B}$ of $C^r$ functions such that:

(a) for each $\beta$ the closure of $V_\beta := \{\, x \in X : \varphi_\beta(x) > 0 \,\}$ is contained in some $U_\alpha$;
(b) $\{V_\beta\}$ is locally finite (as a cover of $U$);
(c) $\sum_\beta \varphi_\beta(x) = 1$ for each $x \in U$.

The first order of business is to show that such partitions of unity exist. The key idea is the following ingenious construction.

Lemma 10.1 There is a $C^\infty$ function $\gamma : \mathbb{R} \to \mathbb{R}$ with $\gamma(t) = 0$ for all $t \le 0$ and $\gamma(t) > 0$ for all $t > 0$.

Proof Let

$$\gamma(t) := \begin{cases} 0, & t \le 0, \\ e^{-1/t}, & t > 0. \end{cases}$$

Standard facts of elementary calculus can be combined inductively to show that for each $r \ge 1$ there is a polynomial $P_r$ such that $\gamma^{(r)}(t)$ is $P_r(1/t) e^{-1/t}$ if $t > 0$. Since the exponential function dominates any polynomial, it follows that $\gamma^{(r)}(t)/t \to 0$ as $t \to 0$, so that each $\gamma^{(r)}$ is differentiable at 0 with $\gamma^{(r+1)}(0) = 0$. Thus $\gamma$ is $C^\infty$. □


Note that for any open rectangle $\prod_{i=1}^{m} (a_i, b_i) \subset \mathbb{R}^m$ the function

$$x \mapsto \prod_i \gamma(x_i - a_i)\,\gamma(b_i - x_i)$$

is $C^\infty$, positive everywhere in the rectangle, and zero everywhere else.
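The function $\gamma$ and the associated rectangle function are easy to write down explicitly. The following sketch evaluates both; the name `rectangle_bump` and the representation of a rectangle by lists of endpoints are illustrative choices.

```python
import math

def gamma(t: float) -> float:
    """The function of Lemma 10.1: zero for t <= 0 and exp(-1/t) for t > 0."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def rectangle_bump(x, a, b):
    """Product over coordinates of gamma(x_i - a_i) * gamma(b_i - x_i).
    Positive inside the open rectangle prod (a_i, b_i), zero elsewhere."""
    value = 1.0
    for xi, ai, bi in zip(x, a, b):
        value *= gamma(xi - ai) * gamma(bi - xi)
    return value

inside = rectangle_bump([0.5, 0.5], [0.0, 0.0], [1.0, 1.0])
on_edge = rectangle_bump([1.0, 0.5], [0.0, 0.0], [1.0, 1.0])
outside = rectangle_bump([1.5, 0.5], [0.0, 0.0], [1.0, 1.0])
```

In floating point the value very close to the boundary can underflow to zero even though the exact function is positive there; this is a numerical artifact and not a feature of $\gamma$.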

Lemma 10.2 If $\{U_\alpha\}$ is a collection of open subsets of $\mathbb{R}^m$ and $U = \bigcup_\alpha U_\alpha$, then $U$ has a locally finite (relative to $U$) covering by open rectangles, each of whose closures is contained in some $U_\alpha$.

Proof For any integer $j \ge 0$ and vector $k = (k_1, \ldots, k_m)$ with integer components let

$$P_{j,k} = \prod_{i=1}^{m} \Big( \frac{k_i - 1}{2^j}, \frac{k_i + 1}{2^j} \Big) \quad\text{and}\quad Q_{j,k} = \prod_{i=1}^{m} \Big( \frac{k_i - 3}{2^j}, \frac{k_i + 3}{2^j} \Big) .$$

The cover consists of those $P_{j,k}$ such that: (a) the closure of $P_{j,k}$ is contained in some $U_\alpha$; (b) either $j = 0$ or there is no $\alpha$ such that the closure of $Q_{j,k}$ is contained in $U_\alpha$.

Consider a point $x \in U$. Evidently (b) implies that $x$ has a neighborhood that intersects the closures of only finitely many cubes in the collection, so the collection is locally finite.

To show that the collection covers $x$ let $j$ be the least integer such that there is some $k$ such that $x \in P_{j,k}$ and the closure of $P_{j,k}$ is contained in some $U_\alpha$. (Obviously sufficiently large $j$ have this property.) For such a $k$ we define $k'$ by letting $k'_i$ be $k_i/2$ or $(k_i + 1)/2$ according to whether $k_i$ is even or odd. If $j \ge 1$, then $P_{j,k} \subset P_{j-1,k'} \subset Q_{j,k}$, so $P_{j,k}$ is in our collection, either because $j = 0$ or because the closure of $P_{j-1,k'}$ is not contained in any $U_\alpha$. □
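The nesting $P_{j,k} \subset P_{j-1,k'} \subset Q_{j,k}$ can be checked with exact rational arithmetic. The sketch below does this in dimension one, which suffices because the sets are products of intervals. The helper names and the range of indices tested are illustrative assumptions.

```python
from fractions import Fraction

def P(j, k):
    """Endpoints of the open interval ((k - 1) / 2^j, (k + 1) / 2^j)."""
    return (Fraction(k - 1, 2 ** j), Fraction(k + 1, 2 ** j))

def Q(j, k):
    """Endpoints of the open interval ((k - 3) / 2^j, (k + 3) / 2^j)."""
    return (Fraction(k - 3, 2 ** j), Fraction(k + 3, 2 ** j))

def parent_index(k):
    """k' as in the proof: k / 2 if k is even, (k + 1) / 2 if k is odd."""
    return k // 2 if k % 2 == 0 else (k + 1) // 2

def contained(inner, outer):
    """Containment of open intervals, tested on their endpoints."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

nesting_holds = all(
    contained(P(j, k), P(j - 1, parent_index(k)))
    and contained(P(j - 1, parent_index(k)), Q(j, k))
    for j in range(1, 6)
    for k in range(-20, 21)
)
```

This is a finite check over a range of indices, not a proof, but it agrees with the case analysis on the parity of $k_i$.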

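Suppose that $\{U_\alpha\}_{\alpha \in A}$ is a collection of open subsets of $X$ and $U := \bigcup_\alpha U_\alpha$. Imposing a coordinate system on $X$, then combining the observations above, gives a collection $\{\psi_\beta\}_{\beta \in B}$ of $C^\infty$ functions $\psi_\beta : X \to \mathbb{R}_+$ such that for each $\beta$ the closure of $V_\beta = \{\, x \in X : \psi_\beta(x) > 0 \,\}$ is contained in some $U_\alpha$, and $\{V_\beta\}$ is a locally finite cover of $U$. If we define $\varphi_\beta : X \to [0, 1]$ by setting $\varphi_\beta(x) = \psi_\beta(x) / \sum_{\beta' \in B} \psi_{\beta'}(x)$, then $\varphi_\beta$ is $C^\infty$ and $\sum_\beta \varphi_\beta(x) = 1$, so: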

Theorem 10.4 For any collection $\{U_\alpha\}_{\alpha \in A}$ of open subsets of $X$ there is a $C^\infty$ partition of unity for $\bigcup_\alpha U_\alpha$ subordinate to $\{U_\alpha\}$.
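The normalization step $\varphi_\beta = \psi_\beta / \sum_{\beta'} \psi_{\beta'}$ can be illustrated in one dimension. The sketch below uses three overlapping intervals as hypothetical supports and only demonstrates the normalization; it does not reproduce the locally finite covering of Lemma 10.2 or the requirement that closures of supports lie in the sets of a given cover.

```python
import math

def gamma(t):
    """Zero for t <= 0 and exp(-1/t) for t > 0."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

# Hypothetical supports whose union is the interval (-1, 3).
INTERVALS = [(-1.0, 1.0), (0.0, 2.0), (1.0, 3.0)]

def psi(beta, x):
    """Unnormalized bump, positive exactly on the interval with index beta."""
    a, b = INTERVALS[beta]
    return gamma(x - a) * gamma(b - x)

def phi(beta, x):
    """Normalized weight psi_beta(x) / sum of all psi(x)."""
    total = sum(psi(b, x) for b in range(len(INTERVALS)))
    if total == 0.0:
        raise ValueError("x is outside the union of the supports, or underflow")
    return psi(beta, x) / total

samples = [-0.5, 0.0, 0.5, 1.0, 1.5, 2.5]
sums = [sum(phi(b, x) for b in range(len(INTERVALS))) for x in samples]
```

Each entry of `sums` equals 1 up to rounding. Sample points are kept away from the endpoints of (-1, 3), where the bumps underflow in floating point.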

For future reference we mention a consequence that comes up frequently:

Corollary 10.1 If $U \subset X$ is open and $C_0$ and $C_1$ are disjoint closed subsets of $U$, then there is a $C^\infty$ function $\alpha : U \to [0, 1]$ with $\alpha(x) = 0$ for all $x \in C_0$ and $\alpha(x) = 1$ for all $x \in C_1$.


Proof Let $\{\varphi_\beta\}$ be a $C^\infty$ partition of unity for $U$ subordinate to $\{U \setminus C_1, U \setminus C_0\}$, and set $\alpha(x) := \sum_{V_\beta \cap C_1 \ne \emptyset} \varphi_\beta(x)$. □

Now let $Y$ be a second vector space. As a first application we consider the problem, which arises in connection with the definition in the last section, of what it means for a function $f : S \to Y$ on a general domain $S \subset X$ to be $C^r$. We say that $f$ is locally $C^r$ if each $x \in S$ has a neighborhood $U_x \subset X$ that is the domain of a $C^r$ function $F_x : U_x \to Y$ with $F_x|_{S \cap U_x} = f|_{S \cap U_x}$. This seems like the “conceptually correct” definition of what it means for a function to be $C^r$, because this should be a local property that can be checked by looking at a neighborhood of an arbitrary point in the function’s domain. A $C^r$ function is locally $C^r$, obviously. Fortunately the converse holds, so that the definition we have given agrees with the one that is conceptually correct. (In addition, it will often be pleasant to apply the given definition because it is simpler!)

Proposition 10.1 If S ⊂ X and f : S → Y is locally Cr , then f is Cr .

Proof Let {Fx : Ux → Y}x∈S be as above. Let {ϕβ}β∈B be a C∞ partition of unity for U := ⋃x Ux subordinate to {Ux}. For each β choose an xβ such that the closure of { x : ϕβ(x) > 0 } is contained in Uxβ, and let F := ∑β ϕβ Fxβ : U → Y. Then F is Cr because each point in U has a neighborhood in which it is a finite sum of Cr functions. For x ∈ S we have

F(x) = ∑β ϕβ(x) Fxβ(x) = ∑β ϕβ(x) f(x) = f(x). □

Here is another useful result applying a partition of unity.

Proposition 10.2 For any S ⊂ X, C∞(S,Y ) is dense in CS(S,Y ).

Proof Fix a continuous f : S → Y and an open W ⊂ S × Y containing the graph of f. Our goal is to find a C∞ function from S to Y whose graph is also contained in W.

For each p ∈ S choose a neighborhood Up of p and εp > 0 small enough that

f (Up ∩ S) ⊂ Uεp ( f (p)) and (Up ∩ S) × U2εp ( f (p)) ⊂ W .

Let U := ⋃p∈S Up. Let {ϕβ}β∈B be a C∞ partition of unity for U subordinate to {Up}p∈S. For each β let Vβ := { x : ϕβ(x) > 0 }, choose some pβ such that Vβ ⊂ Upβ, and let Uβ := Upβ and εβ := εpβ. Let f̃ : U → Y be the function x ↦ ∑β ϕβ(x) f(pβ). Since {Vβ} is locally finite, f̃ : U → Y is C∞, so f̃|S is C∞.

We still need to show that the graph of f̃|S is contained in W. Consider some p ∈ S. Of those β with ϕβ(p) > 0, let α be one of those for which εβ is maximal. Of course p ∈ Upα, and f̃(p) ∈ U2εα(f(pα)) because f̃(p) is a convex combination of the points f(pβ) with ϕβ(p) > 0, and for any other β such that ϕβ(p) > 0 we have

‖f(pβ) − f(pα)‖ ≤ ‖f(pβ) − f(p)‖ + ‖f(p) − f(pα)‖ < 2εα.

Therefore (p, f̃(p)) ∈ Upα × U2εα(f(pα)) ⊂ W. □
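The construction in this proof can be tried numerically. The sketch below is only an illustration under assumed data: S = [0.5, 3.5] ⊂ R, the continuous but non-smooth function f(x) = |x − 2|, and a partition of unity obtained by normalizing bump functions of radius 0.15 centered on a grid of points pβ. None of these choices comes from the text.

```python
import math

def bump(x, center, radius):
    """C-infinity bump: positive exactly on (center - radius, center + radius)."""
    t = (x - center) / radius
    if abs(t) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - t * t))

def f(x):
    """Continuous function to be approximated; it is not differentiable at 2."""
    return abs(x - 2.0)

# Hypothetical sample points p_beta and common support radius.
CENTERS = [0.1 * i for i in range(41)]
RADIUS = 0.15

def f_tilde(x):
    """Smooth approximation: sum over beta of phi_beta(x) * f(p_beta)."""
    weights = [bump(x, c, RADIUS) for c in CENTERS]
    total = sum(weights)  # positive for x in [0.5, 3.5]
    return sum(w * f(c) for w, c in zip(weights, CENTERS)) / total

def max_error(num_samples=301):
    """Largest value of |f_tilde - f| over a sample grid in S = [0.5, 3.5]."""
    xs = [0.5 + 3.0 * i / (num_samples - 1) for i in range(num_samples)]
    return max(abs(f_tilde(x) - f(x)) for x in xs)
```

Because f is 1-Lipschitz and f̃(x) is a convex combination of values f(pβ) with |pβ − x| < 0.15, the error stays below 0.15, mirroring the role of εp in the proof.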

10.3 Manifolds

The maneuver we saw in Sect. 10.1, passing from a calculus of functions between Euclidean spaces to a calculus of functions between vector spaces, was accomplished not by fully “eliminating” the coordinate systems of the domain and range, but instead by showing that the “real” meaning of the derivative would not change if we replaced those coordinate systems by any others. The definition of a Cr manifold, and of a Cr function between such manifolds, is a more radical and far reaching application of this idea.

A manifold is an object like the sphere, the torus, and so forth, that “looks like” a Euclidean space in a neighborhood of any point, but which may have different sorts of large scale structure. We first of all need to specify what “looks like” means, and this will depend on a degree of differentiability. Fix an m-dimensional vector space X, an open U ⊂ X, and a degree of differentiability 0 ≤ r ≤ ∞.

Recall that if A and B are topological spaces, a function e : A → B is an embedding if it is continuous and injective, and its inverse is continuous when e(A) has the subspace topology. Concretely, e is an injection that maps open sets of A to open subsets of e(A). Note that the restriction of an embedding to any open subset of the domain is also an embedding.

Lemma 10.3 If U ⊂ X is open and ϕ : U → Rk is a Cr embedding such that for all x ∈ U the rank of Dϕ(x) is m, then ϕ−1 is a Cr function.

Proof By Proposition 10.1 it suffices to show that ϕ−1 is locally Cr. Fix a point p in the image of ϕ, let x := ϕ−1(p), let X′ be the image of Dϕ(x), and let π : Rk → X′ be the orthogonal projection. Since ϕ is an immersion, X′ is m-dimensional, and the rank of D(π ◦ ϕ)(x) = π ◦ Dϕ(x) is m. The inverse function theorem implies that the restriction of π ◦ ϕ to some open subset Ũ of U containing x has a Cr inverse. Now the chain rule implies that ϕ−1|ϕ(Ũ) = (π ◦ ϕ|Ũ)−1 ◦ π|ϕ(Ũ) is Cr. □

Definition 10.2 A set M ⊂ Rk is an m-dimensional Cr manifold if, for each p ∈ M, there is a Cr embedding ϕ : U → M, where U is an open subset of an m-dimensional vector space, such that for all x ∈ U the rank of Dϕ(x) is m and ϕ(U) is a relatively open subset of M that contains p. We say that ϕ is a Cr parameterization for M and ϕ−1 is a Cr coordinate chart for M. A collection {ϕi}i∈I of Cr parameterizations for M whose images cover M is called a Cr atlas for M.


Although the definition above makes sense when r = 0, we will have no use for this case because there are certain pathologies that we wish to avoid. Among other things, the beautiful example known as the Alexander horned sphere (Alexander 1924) shows that a C0 manifold may have what is known as a wild embedding in a Euclidean space. From this point on we assume that r ≥ 1.

There are many “obvious” examples of Cr manifolds such as spheres, the torus, etc. In analytic work one should bear in mind the most basic examples:

(a) A set S ⊂ Rk is discrete if each p ∈ S has a neighborhood W such that S ∩ W = {p}. A discrete set is a 0-dimensional Cr manifold.
(b) Any open subset (including the empty set) of an m-dimensional affine subspace of Rk is an m-dimensional Cr manifold. More generally, an open subset of an m-dimensional Cr manifold is itself an m-dimensional Cr manifold.
(c) If U ⊂ Rm is open and φ : U → Rk−m is Cr, then the graph

Gr(φ) := { (x, φ(x)) : x ∈ U } ⊂ Rk

of φ is an m-dimensional Cr manifold, because ϕ : x ↦ (x, φ(x)) is a Cr parameterization.
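To see why the graph parameterization in (c) always has full rank, one can look at its Jacobian: the first m rows form the identity matrix. The sketch below checks this numerically for an assumed example, the saddle φ(x1, x2) = x1² − x2² with m = 2 and k = 3. The function names and the finite-difference helper are illustrative, not part of the text.

```python
def phi_graph(x):
    """Hypothetical C-infinity function phi : R^2 -> R (a saddle)."""
    return x[0] ** 2 - x[1] ** 2

def parameterization(x):
    """The map x -> (x, phi(x)) from U = R^2 into R^3."""
    return (x[0], x[1], phi_graph(x))

def jacobian(func, x, h=1e-6):
    """Central-difference Jacobian; J[i][j] approximates d func_i / d x_j."""
    n = len(x)
    cols = []
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = func(xp), func(xm)
        cols.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    return [[cols[j][i] for j in range(n)] for i in range(len(cols[0]))]

def top_minor(J):
    """Determinant of the first two rows, i.e. of the derivative of x -> x."""
    return J[0][0] * J[1][1] - J[0][1] * J[1][0]
```

Since the top 2 × 2 minor equals 1 at every point, the derivative has rank 2 everywhere, regardless of how φ behaves.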

10.4 Smooth Maps

Let M ⊂ Rk be an m-dimensional Cr manifold, and let N ⊂ Rℓ be an n-dimensional Cr manifold. We have already defined what it means for a function f : M → N to be Cr: there is an open W ⊂ Rk that contains M and a Cr function F : W → Rℓ such that F|M = f. The following characterization of this condition is technically useful and conceptually important.

Proposition 10.3 For a function f : M → N the following are equivalent:

(a) f is Cr;
(b) for each p ∈ M there are Cr parameterizations ϕ : U → M and ψ : V → N such that p ∈ ϕ(U), f(ϕ(U)) ⊂ ψ(V), and ψ−1 ◦ f ◦ ϕ is a Cr function;
(c) ψ−1 ◦ f ◦ ϕ is a Cr function whenever ϕ : U → M and ψ : V → N are Cr parameterizations such that f(ϕ(U)) ⊂ ψ(V).

Proof Because compositions of Cr functions are Cr, (a) implies (c), and since each point in a manifold is contained in the image of a Cr parameterization, it is clear that (c) implies (b). Fix a point p ∈ M and Cr parameterizations ϕ : U → M and ψ : V → N with p ∈ ϕ(U) and f(ϕ(U)) ⊂ ψ(V). Lemma 10.3 implies that ϕ−1 and ψ−1 are Cr, so f|ϕ(U) = ψ ◦ (ψ−1 ◦ f ◦ ϕ) ◦ ϕ−1 is Cr on its domain of definition. Since p was arbitrary, we have shown that f is locally Cr, and Proposition 10.1 implies that f is Cr. Thus (b) implies (a). □


There is a more abstract approach to differential topology (which is followed in Hirsch 1976a) in which an m-dimensional Cr manifold is a topological space M together with a collection { ϕα : Uα → M }α∈A, where each ϕα is a homeomorphism between an open subset Uα of an m-dimensional vector space and an open subset of M, ⋃α ϕα(Uα) = M, and for any α, α′ ∈ A, (ϕα′)−1 ◦ ϕα is Cr on its domain of definition. If N with collection { ψβ : Vβ → N } is an n-dimensional Cr manifold, a function f : M → N is Cr by definition if, for all α and β, (ψβ)−1 ◦ f ◦ ϕα is a Cr function on its domain of definition.

The abstract approach is preferable from a conceptual point of view; for example, we can’t see some Rk that contains the physical universe, so our physical theories should avoid reference to such an Rk if possible. (Sometimes Rk is called the ambient space.) However, in the abstract approach there are certain technical difficulties that must be overcome just to get acceptable definitions. In addition, the Whitney embedding theorems (cf. Hirsch 1976a) show that, under assumptions that are satisfied in almost all applications, a manifold satisfying the abstract definition can be embedded in some Rk, so our approach is not less general in any important sense. From a technical point of view, the assumed embedding of M in Rk is extremely useful because it automatically imposes conditions such as metrizability and thus paracompactness, and it allows certain constructions that simplify many proofs.

There is a category of Cr manifolds and Cr maps between them. (This can be proved from the definitions, or we can just observe that this category can be obtained from the category of subsets of finite dimensional vector spaces and Cr maps between them by restricting the objects and morphisms.) The notion of isomorphism for this category is:

Definition 10.3 A function f : M → N is a Cr diffeomorphism if f is a bijection and f and f−1 are both Cr. If such an f exists we say that M and N are Cr diffeomorphic.

If M and N are Cr diffeomorphic we will, for the most part, regard them as two different “realizations” of “the same” object. In this sense the spirit of the definition of a Cr manifold is that the particular embedding of M in Rk is of no importance, and k itself is immaterial.

10.5 Tangent Vectors and Derivatives

There are many notions of “derivative” in mathematics, but invariably the term refers to a linear approximation of a function that is accurate “up to first order.” The first step in defining the derivative of a Cr map between manifolds is to specify the vector spaces that serve as the linear approximation’s domain and range.

Fix an m-dimensional Cr manifold M ⊂ Rk. Throughout this section, when we refer to a Cr parameterization ϕ : U → M, it will be understood that U is an open subset of the m-dimensional vector space X.


Definition 10.4 If ϕ : U → M is a C1 parameterization and p = ϕ(x), then the tangent space TpM of M at p is the image of the linear transformation Dϕ(x) : X → Rk.

We should check that this does not depend on the choice of ϕ. If ϕ′ : U′ → M is a second C1 parameterization with ϕ′(x′) = p, then the chain rule gives Dϕ′(x′) = Dϕ(x) ◦ D(ϕ−1 ◦ ϕ′)(x′), so the image of Dϕ′(x′) is contained in the image of Dϕ(x), and vice versa by symmetry.
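The independence of TpM from the parameterization can be checked numerically in a simple case. The sketch below assumes M is the unit circle in R² and compares two parameterizations near the same point, one by angle and one rational. Both choices, and the finite-difference derivative, are illustrative assumptions.

```python
import math

def phi_angle(t):
    """Parameterization of the unit circle by angle."""
    return (math.cos(t), math.sin(t))

def phi_rational(s):
    """Rational parameterization of the unit circle minus the point (-1, 0)."""
    d = 1.0 + s * s
    return ((1.0 - s * s) / d, 2.0 * s / d)

def derivative(curve, x, h=1e-6):
    """Central-difference derivative of a curve R -> R^2."""
    a, b = curve(x + h), curve(x - h)
    return ((a[0] - b[0]) / (2 * h), (a[1] - b[1]) / (2 * h))

def cross(u, v):
    """Planar cross product; it vanishes exactly when u and v are parallel."""
    return u[0] * v[1] - u[1] * v[0]
```

With s = 0.7 and t = 2·arctan(s) both maps hit the same point p, and the two derivative vectors are parallel, so they span the same line TpM.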

We can combine the tangent spaces at the various points of M :

Definition 10.5 The tangent bundle of M is

TM := ⋃p∈M {p} × TpM ⊂ Rk × Rk.

For a Cr parameterization ϕ : U → M for M we define

Tϕ : U × X → { (p, v) ∈ TM : p ∈ ϕ(U) } ⊂ TM

by setting

Tϕ(x, w) := (ϕ(x), Dϕ(x)w).

Lemma 10.4 If r ≥ 2, then Tϕ is a Cr−1 parameterization for T M.

Proof It is easy to see that Tϕ is a Cr−1 immersion, and that it is injective. The inverse function theorem implies that its inverse is continuous. □

Every p ∈ M is contained in the image of some Cr parameterization ϕ, and for every v ∈ TpM, (p, v) is in the image of Tϕ, so the images of the Tϕ cover TM. Thus:

Proposition 10.4 If r ≥ 2, then T M is a Cr−1 manifold.

Fix a second Cr manifold N ⊂ Rℓ, which we assume to be n-dimensional, and a Cr function f : M → N.

Definition 10.6 If F is a C1 extension of f to a neighborhood of p, the derivative of f at p is the linear function

Df(p) := DF(p)|TpM : TpM → Tf(p)N.

We need to show that this definition does not depend on the choice of extension F. Let ϕ : U → M be a Cr parameterization whose image is a neighborhood of p, let x := ϕ−1(p), and observe that, for any v ∈ TpM, there is some w ∈ Rm such that v = Dϕ(x)w, so that

DF(p)v = DF(p)(Dϕ(x)w) = D(F ◦ ϕ)(x)w = D(f ◦ ϕ)(x)w.


We also need to show that the image of Df(p) is, in fact, contained in Tf(p)N. Let ψ : V → N be a Cr parameterization of a neighborhood of f(p). The last equation shows that the image of Df(p) is contained in the image of

D(f ◦ ϕ)(x) = D(ψ ◦ ψ−1 ◦ f ◦ ϕ)(x) = Dψ(ψ−1(f(p))) ◦ D(ψ−1 ◦ f ◦ ϕ)(x),

so the image of Df(p) is contained in the image of Dψ(ψ−1(f(p))), which is Tf(p)N.

Naturally the chain rule is the most important basic result about the derivative. We expect that many readers have seen the following result, and at worst it is a suitable exercise, following from the chain rule of multivariable calculus without trickery, so we give no proof.

Proposition 10.5 If M ⊂ Rk, N ⊂ Rℓ, and P ⊂ Rm are C1 manifolds, and f : M → N and g : N → P are C1 maps, then, at each p ∈ M,

D(g ◦ f)(p) = Dg(f(p)) ◦ Df(p).

We can combine the derivatives defined at the various points of M :

Definition 10.7 The derivative of f is the function T f : T M → T N given by

T f (p, v) := ( f (p), Df (p)v) .

These objects have the expected properties:

Proposition 10.6 If r ≥ 2, then T f is a Cr−1 function.

Proof Each (p, v) ∈ TM is in the image of Tϕ for some Cr parameterization ϕ whose image contains p. The chain rule implies that

Tf ◦ Tϕ : (x, w) ↦ (f(ϕ(x)), D(f ◦ ϕ)(x)w)

is a Cr−1 function. We have verified that Tf satisfies (c) of Proposition 10.3. □

Proposition 10.7 T IdM = IdTM.

Proof Since IdRk is a C∞ extension of IdM, we clearly have DIdM(p) = IdTpM for each p ∈ M. The claim now follows directly from the definition of T IdM. □

Proposition 10.8 If M, N, and P are Cr manifolds and f : M → N and g : N → P are Cr functions, then T(g ◦ f) = Tg ◦ Tf.

Proof Using Proposition 10.5 we compute that

Tg(Tf(p, v)) = Tg(f(p), Df(p)v) = (g(f(p)), Dg(f(p))Df(p)v)
= (g(f(p)), D(g ◦ f)(p)v) = T(g ◦ f)(p, v). □


For the categorically minded we mention that Proposition 10.4 and the last three results can be summarized very succinctly by saying that if r ≥ 2, then T is a covariant functor from the category of Cr manifolds and Cr maps between them to the category of Cr−1 manifolds and Cr−1 maps between them. Again, we will not use this language later, so in a sense you do not need to know what a functor is, but categorical concepts and terminology are pervasive in modern mathematics, so it would certainly be a good idea to learn the basic definitions.

Let’s relate the definitions above to more elementary notions of differentiation. Consider an open interval (a, b) ⊂ R, a C1 function f : (a, b) → M, and a point t ∈ (a, b). Formally Df(t) is a linear function from Tt(a, b) to Tf(t)M, but thinking about things in this way is usually rather cumbersome. Of course Tt(a, b) is just a copy of R, and we define f′(t) := Df(t)1 ∈ Tf(t)M, where 1 is the element of Tt(a, b) corresponding to 1 ∈ R. When M is an open subset of R we simplify further by treating f′(t) as a number under the identification of Tf(t)M with R. In this way we recover the concept of the derivative as we first learned it in elementary calculus.

10.6 Submanifolds

For almost any kind of mathematical object, we pay special attention to subsets, or perhaps “substructures” of other sorts, that share the structural properties of the object. One only has to imagine a smooth curve on the surface of a sphere to see that such substructures of manifolds arise naturally. Fix a degree of differentiability 1 ≤ r ≤ ∞. If M ⊂ Rk is an m-dimensional Cr manifold, 1 ≤ s ≤ r, N is an n-dimensional Cs manifold that is also embedded in Rk, and N ⊂ M, then N is a Cs submanifold of M. The integer m − n is called the codimension of N in M.

Proposition 10.4 implies that TM is a Cr−1 submanifold of M × Rk, and the reader can certainly imagine a host of other examples. There is one that might easily be overlooked because it is so trivial: any open subset of M is a Cr manifold. Conversely, any codimension zero submanifold of M is just an open subset. Evidently submanifolds of codimension zero are not in themselves particularly interesting, but of course they occur frequently.

Submanifolds arise naturally as images of smooth maps, and as solution sets of systems of equations. We now discuss these two points of view at length, arriving eventually at an important characterization result. Now let M ⊂ Rk and N ⊂ Rℓ be Cr manifolds that are m- and n-dimensional respectively, and let f : M → N be a Cr function. We say that p ∈ M is:

(a) an immersion point of f if Df(p) : TpM → Tf(p)N is injective;
(b) a submersion point of f if Df(p) is surjective;
(c) a diffeomorphism point of f if Df(p) is a bijection.

There are now a number of technical results. Collectively their proofs display the constant rank theorem (Theorem 10.3) as the linchpin of the analysis supporting this subject.


Proposition 10.9 If p is an immersion point of f, then there is a neighborhood V of p such that f(V) is an m-dimensional Cr submanifold of N. In addition Df(p) : TpM → Tf(p)f(V) is a linear isomorphism.

Proof The constant rank theorem gives neighborhoods V ⊂ M of p and W ⊂ N of f(p) and Cr coordinate charts ϕ : V → Rm and ψ : W → Rn such that f(V) ⊂ W and ψ ◦ f ◦ ϕ−1 : x ↦ (x, 0) ∈ Rm × Rn−m. Thus ψ|f(V) is a Cr coordinate chart displaying f(V) as a submanifold of N. The rank of Df(p) is not less than the rank of D(ψ ◦ f ◦ ϕ−1)(ϕ(p)), which is m. □

Proposition 10.10 If p is a submersion point of f and q := f(p), then there is a neighborhood U of p such that f−1(q) ∩ U is an (m − n)-dimensional Cr submanifold of M. In addition Tp f−1(q) = ker Df(p).

Proof The constant rank theorem gives neighborhoods V ⊂ M of p and W ⊂ N of f(p) and Cr coordinate charts ϕ : V → Rm and ψ : W → Rn such that f(V) ⊂ W and ψ ◦ f ◦ ϕ−1 : x ↦ (x1, . . . , xn). If π : Rm → Rm−n is the projection x ↦ (xn+1, . . . , xm), then π ◦ ϕ|f−1(f(p)) displays f−1(f(p)) ∩ U as an (m − n)-dimensional Cr submanifold of M in a neighborhood of p. We obviously have Tp f−1(q) ⊂ ker Df(p), and the two vector spaces have the same dimension. □

Proposition 10.11 If p is a diffeomorphism point of f, then there is a neighborhood V of p such that f(V) is a neighborhood of f(p) and f|V : V → f(V) is a Cr diffeomorphism.

Proof The constant rank theorem gives neighborhoods V ⊂ M of p and W ⊂ N of f(p) and Cr coordinate charts ϕ : V → Rm and ψ : W → Rn such that f(V) ⊂ W and ψ ◦ f ◦ ϕ−1 = Idϕ(V). We have

(f|V)−1 = ϕ−1 ◦ (ψ ◦ f ◦ ϕ−1)−1 ◦ ψ|f(V),

which is Cr. □

Now let P be a p-dimensional Cr submanifold of N. The following is the technical basis of the subsequent characterization theorem.

Lemma 10.5 For any q ∈ P there is a neighborhood Z ⊂ N of q, an (n − p)-dimensional Cr manifold M, and a Cr function f : Z → M such that q is a submersion point of f and f−1(f(q)) = P ∩ Z.

Proof Let ϕ : U → Rp be a Cr coordinate chart for a neighborhood U ⊂ P of q. Let w := ϕ(q). Let ψ : V → Rn be a Cr coordinate chart for a neighborhood V ⊂ N of q that contains U. Then the rank of D(ψ ◦ ϕ−1)(w) is p, so Rn = X ⊕ M where X is the image of D(ψ ◦ ϕ−1)(w) and M is a complementary subspace. Let πX : X ⊕ M → X and πM : X ⊕ M → M be the projections (x, m) ↦ x and (x, m) ↦ m respectively. The inverse function theorem implies that (after replacing U with a smaller neighborhood of q) πX ◦ ψ ◦ ϕ−1 is a Cr diffeomorphism between ϕ(U) and an open W ⊂ X. Let Z := V ∩ (πX ◦ ψ)−1(W). Let

g := (πM − πM ◦ ψ ◦ ϕ−1 ◦ (πX ◦ ψ ◦ ϕ−1)−1 ◦ πX)|ψ(Z) and f := g ◦ ψ|Z.

Evidently every point of ψ(Z) is a submersion point of g, so every point of Z is a submersion point of f. If q′ ∈ P ∩ Z, then q′ = ϕ−1(w′) for some w′ ∈ ϕ(U), so f(q′) = 0. On the other hand, suppose f(q′) = 0, and let q″ be the image of q′ under the map ϕ−1 ◦ (πX ◦ ψ ◦ ϕ−1)−1 ◦ πX ◦ ψ. Then πX(ψ(q′)) = πX(ψ(q″)) and πM(ψ(q′)) = πM(ψ(q″)), so q″ = q′ and thus q′ ∈ P. Thus f−1(f(q)) = P ∩ Z. □

Theorem 10.5 Let N be a Cr manifold. For P ⊂ N the following are equivalent:

(a) P is a p-dimensional Cr submanifold of N.
(b) For every q ∈ P there is a relatively open neighborhood V ⊂ P, a p-dimensional Cr manifold M, a Cr function f : M → P, a p ∈ f−1(q) that is an immersion point of f, and a neighborhood U of p, such that f(U) = V.
(c) For every q ∈ P there is a neighborhood Z ⊂ N of q, an (n − p)-dimensional Cr manifold M, and a Cr function f : Z → M such that q is a submersion point of f and f−1(f(q)) = P ∩ Z.

Proof In view of the definition of a submanifold, (a) implies (b), and the last result establishes that (a) implies (c). Propositions 10.9 and 10.10 give the reverse implications. □

Let M ⊂ Rk and N ⊂ Rℓ be an m-dimensional and an n-dimensional Cr manifold, and let f : M → N be a Cr function. We say that f is an immersion if every p ∈ M is an immersion point of f. It is a submersion if every p ∈ M is a submersion point, and it is a local diffeomorphism if every p ∈ M is a diffeomorphism point. There are now some important results that derive submanifolds from functions.

Theorem 10.6 If f : M → N is a Cr immersion, and an embedding, then f(M) is an m-dimensional Cr submanifold of N.

Proof We need to show that any q ∈ f(M) has a neighborhood in f(M) that is an m-dimensional Cr manifold. Proposition 10.9 implies that any p ∈ M has an open neighborhood V such that f(V) is an m-dimensional Cr submanifold of N. Since f is an embedding, f(V) is a neighborhood of f(p) in f(M). □

A submersion point of f is also said to be a regular point of f. If p is not a regular point of f, then it is a critical point of f. A point q ∈ N is a critical value of f if some preimage of q is a critical point, and if q is not a critical value, then it is a regular value. Note the following paradoxical aspect of this terminology: if q is not a value of f, in the sense that f−1(q) = ∅, then q is automatically a regular value of f.


Theorem 10.7 (Regular Value Theorem) If q is a regular value of f, then f−1(q) is an (m − n)-dimensional submanifold of M.

Proof This is an immediate consequence of Proposition 10.10. □
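A standard illustration, assumed here rather than taken from the text, is the function f(x) = x1² + x2² + x3² on M = R³ with N = R. Every point of f−1(1) is a submersion point because the gradient 2x is nonzero there, so the unit sphere is a 2-dimensional submanifold, whereas 0 is a critical value because the gradient vanishes at the origin. The Python sketch below checks these facts at sample points, along with the inclusion of a tangent vector in ker Df(p) from Proposition 10.10.

```python
import math

def f(x):
    """Squared Euclidean norm on R^3."""
    return sum(c * c for c in x)

def grad_f(x):
    """Gradient of f; Df(x) is the linear map v -> grad_f(x) . v."""
    return [2.0 * c for c in x]

def sphere_point(theta, phi):
    """Point of f^{-1}(1) in spherical coordinates."""
    return [math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta)]

def is_submersion_point(x, tol=1e-12):
    """Df(x) : R^3 -> R is surjective exactly when the gradient is nonzero."""
    return any(abs(g) > tol for g in grad_f(x))

def velocity_along_latitude(theta, phi):
    """Velocity of the curve phi -> sphere_point(theta, phi), a tangent vector."""
    return [-math.sin(theta) * math.sin(phi),
            math.sin(theta) * math.cos(phi),
            0.0]
```

The velocity of a curve lying in the sphere is annihilated by Df(p), which is the inclusion Tp f−1(q) ⊂ ker Df(p) in this example.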

This result has an important generalization. Let P ⊂ N be a p-dimensional Cr submanifold.

Definition 10.8 The function f is transversal to P along S ⊂ M if, for all p ∈ f−1(P) ∩ S,

im Df(p) + Tf(p)P = Tf(p)N.

We write f ⋔S P to indicate that this is the case, and when S = M we simply write f ⋔ P.

Theorem 10.8 (Transversality Theorem) If f ⋔ P, then f−1(P) is an (m − n + p)-dimensional Cr submanifold of M. For each p ∈ f−1(P), Tp f−1(P) = Df(p)−1(Tf(p)P).

Proof Fix p ∈ f−1(P). (If f−1(P) = ∅, then all claims hold trivially.) We use the characterization of a Cr submanifold given by Theorem 10.5: since P is a submanifold of N, there is a neighborhood W ⊂ N of f(p) and a Cr function Ψ : W → Rn−p such that DΨ(f(p)) has rank n − p and P ∩ W = Ψ−1(0).

Let V := f−1(W) and Φ := Ψ ◦ f|V. Of course V is open, Φ is Cr, and f−1(P) ∩ V = Φ−1(0). We compute that

im DΦ(p) = DΨ(f(p))(im Df(p)) = DΨ(f(p))(im Df(p) + ker DΨ(f(p)))
= DΨ(f(p))(im Df(p) + Tf(p)P) = DΨ(f(p))(Tf(p)N) = Rn−p.

(The third equality follows from the final assertion of Proposition 10.10, and the fourth is the transversality assumption.) Thus p is a submersion point of Φ. Since p is an arbitrary point of f−1(P) the claim follows from Theorem 10.5.

We now have

Tp f−1(P) = ker DΦ(p) = ker(DΨ(f(p)) ◦ Df(p))
= Df(p)−1(ker DΨ(f(p))) = Df(p)−1(Tf(p)P),

where the first and last equalities are from Proposition 10.10. □

10.7 Tubular Neighborhoods

Let M ⊂ Rk be an m-dimensional Cr manifold, where 2 ≤ r ≤ ∞. A number of important spaces are constructed by attaching a vector space to each point of M. The formal definition is intuitive but a bit long winded.


Definition 10.9 A Cr vector bundle over M with h-dimensional fibers is a pair (E, π) where E is an (m + h)-dimensional Cr manifold, π : E → M is a Cr function, for each p ∈ M, π−1(p) is an h-dimensional vector space called the fiber above p, and each point of M is contained in an open set U for which there is a diffeomorphism ρ : U × Rh → π−1(U) such that:

(a) π ◦ ρ is the natural projection U × Rh → U, and
(b) for each p ∈ U the map v ↦ ρ(p, v) is a linear isomorphism between Rh and π−1(p).

A Cr section of (E, π) is a Cr function s : M → E such that π ◦ s = IdM, and the zero section is the map that takes each p to the origin of the fiber over p.

The most obvious example is M × Rh. We have already seen another example, namely TM, where the maps Tϕ have the role of the functions ρ in the definition above. It should be evident that this definition could be refined or generalized in many directions, and in fact bundles of this sort are a major theme of topology in the second half of the 20th century.

Let N be an n-dimensional Cs submanifold of M where 1 ≤ s ≤ r. If (E, π) is as above, a set F ⊂ E is a Cs subbundle over N with g-dimensional fibers if F is a Cs submanifold of E, π(F) ⊂ N, and, for each q ∈ N, π−1(q) ∩ F is a g-dimensional linear subspace of π−1(q). Since TM is a Cr−1 submanifold of M × Rk, it is a Cr−1 subbundle. If N is also Cr, then { (q, v) ∈ TM : q ∈ N } is a Cr−1 subbundle of TM, and TN is a Cr−1 subbundle of this bundle.

The normal bundle of N in M is

νN := { (q, v) ∈ TM : q ∈ N and v ⊥ TqN }.

This is also a Cr−1 submanifold of N × Rk (the formal verification is easy but tedious to write out, hence left to the reader) so νN is a Cr−1 subbundle of TM. For each q ∈ N we let νqN := { v : (q, v) ∈ νN } be (in effect) the fiber of νN over q. Note that

T(q,0)νN = TqN ⊕ νqN = TqM.

The main accomplishments of this section are the following result, a variant, and a couple of its many applications. For a continuous λ : N → R++ let

νNλ := { (q, v) ∈ νN : ‖v‖ < λ(q) }.

Theorem 10.9 (Tubular Neighborhood Theorem) There is a continuous λ : N → R++ and a Cr−1 embedding ι : νNλ → M such that ι(q, 0) = q and Dι(q, 0) = IdTqM for all q ∈ N.

The local construction is simple and concrete, corresponding to the intuitive nature of the result. Let

νM := { (p, v) ∈ M × Rk : v ⊥ TpM },

let πνM : νM → M be the projection, and let σνM : νM → Rk and σνN : νN → Rk be the respective maps (p, v) ↦ p + v and (q, v) ↦ q + v. Under the identifications of Rk with TpM ⊕ νpM and TqM with TqN ⊕ νqN we evidently have DσνM(p, 0) = IdRk and DσνN(q, 0) = IdTqM, and DπνM(p, 0) is the projection of TpM ⊕ νpM onto TpM.

For each q ∈ N the inverse function theorem implies that there are neighborhoods U ⊂ νM and V ⊂ νN of (q, 0) such that σνM|U and σνN|V are Cr−1 embeddings. The image of σνM|U is open in Rk, so (by continuity) we may assume that it contains the image of σνN|V. Let

ιV := πνM ◦ (σνM|U)−1 ◦ σνN|V : V → M.

For each q′ such that (q′, 0) ∈ V the chain rule gives

DιV(q′, 0) = DπνM(q′, 0) ◦ DσνM(q′, 0)−1 ◦ DσνN(q′, 0) = IdTq′M.

The inverse function theorem implies that (possibly after replacing V with a smaller neighborhood of (q, 0)) ιV is a Cr−1 embedding.
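The local construction can be made explicit in a simple assumed example: M is the unit sphere in R³ and N is its equator. Near the sphere, (σνM)−1 sends z to (z/‖z‖, z − z/‖z‖), so πνM ◦ (σνM)−1 is the radial projection z ↦ z/‖z‖. For q on the equator the fiber νqN is spanned by e3 = (0, 0, 1), so a point of νN can be written as (q, s·e3). The Python sketch below evaluates ι(q, s·e3) under these assumptions; all names are illustrative.

```python
import math

def normalize(z):
    """Radial projection z -> z / |z|, i.e. pi_{nu M} composed with sigma_{nu M}^{-1}."""
    n = math.sqrt(sum(c * c for c in z))
    return [c / n for c in z]

def equator_point(angle):
    """Point q of N, the equator of the unit sphere."""
    return [math.cos(angle), math.sin(angle), 0.0]

def iota(angle, s):
    """iota(q, v) for q = equator_point(angle) and v = s * e3 in nu_q N."""
    q = equator_point(angle)
    z = [q[0], q[1], q[2] + s]  # sigma_{nu N}(q, v) = q + v
    return normalize(z)         # project back onto the sphere M
```

One can check that `iota(angle, 0.0)` returns q itself and that the derivative in the s direction at s = 0 is e3, in line with ι(q, 0) = q and Dι(q, 0) = IdTqM.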

The more substantial technical difficulties are topological, having to do with passing from the local result to an embedding that is defined everywhere in some neighborhood of the zero section of νN.

Lemma 10.6 If (X, d) and (Y, e) are metric spaces, f : X → Y is continuous, S is a subset of X such that f|S is an embedding, and for each s ∈ S the restriction of f to some neighborhood Ns of s is an embedding, then there is an open U such that S ⊂ U ⊂ ⋃s Ns and f|U is an embedding.

Proof For s ∈ S let δ(s) be one half of the supremum of the set of ε > 0 such that Uε(s) ⊂ Ns and f|Uε(s) is an embedding. The restriction of an embedding to any subset of its domain is an embedding, which implies that δ is continuous.

Since f|S is an embedding, its inverse is continuous. In conjunction with the continuity of δ and d, this implies that for each s ∈ S there is a ζs > 0 such that

d(s, s′) < min{ δ(s) − ½δ(s′), δ(s′) − ½δ(s) } (10.1)

for all s′ ∈ S with e(f(s), f(s′)) ≤ ζs. For each s choose an open Us ⊂ X such that s ∈ Us ⊂ Uδ(s)/2(s) and f(Us) ⊂ Uζs/3(f(s)). Let U := ⋃s∈S Us. We will show that f|U is injective with continuous inverse.

Consider s, s′ ∈ S and y, y′ ∈ Y with e(f(s), y) < ζs/3 and e(f(s′), y′) < ζs′/3. We claim that if y = y′, then (10.1) holds: otherwise e(f(s), f(s′)) > ζs, ζs′, so that

e(y, y′) ≥ e(f(s), f(s′)) − e(f(s), y) − e(f(s′), y′)
> (½e(f(s), f(s′)) − ζs/3) + (½e(f(s), f(s′)) − ζs′/3) ≥ ⅙(ζs + ζs′).

In particular, if f(x) = y = y′ = f(x′) for some x ∈ Us and x′ ∈ Us′, then ½δ(s′) + d(s, s′) ≤ δ(s) and thus

Us′ ⊂ Uδ(s′)/2(s′) ⊂ Uδ(s′)/2+d(s,s′)(s) ⊂ Uδ(s)(s).

We have x ∈ Us, x′ ∈ Us′, and Us, Us′ ⊂ Uδ(s)(s), and f|Uδ(s)(s) is injective, so it follows that x = x′. We have shown that f|U is injective.

We now need to show that the image of any open subset of U is open in the relative topology of f(U). Fix a particular s ∈ S. In view of the definition of U, it suffices to show that if Vs ⊂ Us is open, then f(Vs) is relatively open. The restriction of f to Uδ(s)(s) is an embedding, so there is an open Zs ⊂ Y such that f(Vs) = f(Uδ(s)(s)) ∩ Zs. Since f(Vs) ⊂ f(Us) ⊂ Uζs/3(f(s)) we have

f(Vs) = (f(U) ∩ Uζs/3(f(s)) ∩ Zs) ∩ f(Uδ(s)(s)).

Above we showed that if Uζs/3(f(s)) ∩ Uζs′/3(f(s′)) is nonempty, then (10.1) holds. Therefore f(U) ∩ Uζs/3(f(s)) is contained in the union of the f(Us′) for those s′ such that ½δ(s′) + d(s, s′) < δ(s), and for each such s′ we have Us′ ⊂ Uδ(s′)/2(s′) ⊂ Uδ(s)(s). Therefore f(U) ∩ Uζs/3(f(s)) ⊂ f(Uδ(s)(s)), and consequently

f(Vs) = f(U) ∩ Uζs/3(f(s)) ∩ Zs,

so f(Vs) is relatively open in f(U). □

Lemma 10.7 If (X, d) is a metric space, S ⊂ X, and U is an open set containing S, then there is a continuous δ : S → R++ such that for all s ∈ S, Uδ(s)(s) ⊂ U.

Proof For each s ∈ S let βs := sup{ ε > 0 : Uε(s) ⊂ U }. Since X is paracompact (Theorem 6.1) there is a locally finite refinement {Vα}α∈A of {Uβs(s)}s∈S. Theorem 6.2 gives a partition of unity {ϕα} subordinate to {Vα}. The claim holds trivially if there is some α with Vα = X; otherwise for each α let δα : S → R+ be the function δα(s) := inf x∈X\Vα d(s, x), which is of course continuous, and define δ by setting δ(s) := ∑α ϕα(s)δα(s). If s ∈ S, s ∈ Vα, and δα′(s) ≤ δα(s) for all other α′ such that s ∈ Vα′, then

Uδ(s)(s) ⊂ Uδα(s)(s) ⊂ Vα ⊂ Uβs′(s′) ⊂ U

for some s′, so Uδ(s)(s) ⊂ U. □

The two lemmas above combine to imply the following result which (in conjunction with the observations above) implies the tubular neighborhood theorem.

Proposition 10.12 If (X, d) and (Y, e) are metric spaces, f : X → Y is continuous, S is a subset of X such that f|S is an embedding, and for each s ∈ S the restriction of f to some neighborhood Ns of s is an embedding, then there is a continuous ρ : S → R++ such that Uρ(s)(s) ⊂ Ns for all s and the restriction of f to ⋃s∈S Uρ(s)(s) is an embedding.


The next result applies the methods used to prove the tubular neighborhood theorem to the tangent bundle instead of the normal bundle. For a continuous λ : M → R++ let

TMλ := { (p, v) ∈ TM : ‖v‖ < λ(p) }.

Proposition 10.13 There is a continuous function λ : M → R++ and a Cr−1 function κ : TMλ → M such that κ(p, 0) = p and Dκ(p, ·)(0) = IdTpM for all p ∈ M, and the function κ̃ : (p, v) ↦ (p, κ(p, v)) is a Cr−1 diffeomorphism between TMλ and a neighborhood of the diagonal in M × M.

Proof Let σTM : TM → Rk be the function σTM(p, v) := p + v. Let U ⊂ νM be a neighborhood of some (p, 0) such that σνM|U is a Cr−1 embedding, and let V ⊂ TM be a neighborhood of (p, 0) such that σTM(V) ⊂ σνM(U). Let

κp := πνM ◦ (σνM|U)−1 ◦ σTM|V : V → M.

For p′ such that (p′, 0) ∈ V the chain rule gives

Dκp(p′, ·)(0) = DπνM(p′, 0) ◦ DσνM(p′, 0)−1 ◦ DσTM(p′, ·)(0) = IdTp′M.

If κ̃p(p′, v) := (p′, κp(p′, v)), it is easy to see that Dκ̃p(p, 0) is surjective, so the inverse function theorem implies that (possibly after replacing V with a smaller neighborhood of (p, 0)) κ̃p is a Cr−1 embedding. Now the claim follows from Proposition 10.12. □

The following construction provides a local simulation of convex combination.

Proposition 10.14 There is a neighborhood W of the diagonal in M × M and a continuous function c : W × [0, 1] → M such that:

(a) c(p, p′, 0) = p for all (p, p′) ∈ W;
(b) c(p, p′, 1) = p′ for all (p, p′) ∈ W;
(c) c(p, p, t) = p for all p ∈ M and all t.

Proof Let λ, κ, and κ̃ be as in the last result, and let W := κ̃(TMλ). Let c̃ : TM × [0, 1] → TM be the function c̃(p, v, t) := (p, tv). Evidently c̃(TMλ × [0, 1]) ⊂ TMλ. Clearly c = κ ◦ c̃ ◦ (κ̃−1 × Id[0,1]) has all required properties. □
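For the unit sphere M ⊂ R³ the maps in this proof can be written in closed form, which gives a quick sanity check of properties (a) to (c). The sketch assumes κ(p, v) = (p + v)/‖p + v‖, which is what the construction of Proposition 10.13 yields for the sphere, and it restricts attention to pairs with p · p′ > 0. This restriction and all function names are illustrative assumptions.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(z):
    n = math.sqrt(dot(z, z))
    return [x / n for x in z]

def kappa(p, v):
    """kappa(p, v): add the tangent vector v to p, then project to the sphere."""
    return normalize([a + b for a, b in zip(p, v)])

def kappa_tilde_inverse(p, p2):
    """Tangent vector v at p with kappa(p, v) = p2 (requires p . p2 > 0)."""
    s = dot(p, p2)
    if s <= 0.0:
        raise ValueError("p2 is too far from p for this local construction")
    return [b / s - a for a, b in zip(p, p2)]

def c(p, p2, t):
    """Simulated convex combination c(p, p2, t) = kappa(p, t * v)."""
    v = kappa_tilde_inverse(p, p2)
    return kappa(p, [t * x for x in v])
```

For nearby points p and p′ on the sphere, `c(p, p2, t)` traces a path on the sphere from p at t = 0 to p′ at t = 1, and it stays at p when p′ = p.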

We now establish two results we will need later that illustrate the application of the tubular neighborhood theorem. Let M be an m-dimensional Cr manifold, and let N be an n-dimensional manifold that is no longer a submanifold of M.

Theorem 10.10 For any $S \subset M$, $C^{r-1}(S, N)$ is dense in $C_S(S, N)$.

Proof Proposition 10.2 implies that $C^{r-1}(S, V_\rho)$ is dense in $C_S(S, V_\rho)$, and Lemma 5.14 implies that $f \mapsto \pi_\nu \circ \sigma_\rho^{-1} \circ f$ is continuous. □


Theorem 10.11 Any neighborhood $U \subset C_S(S, N)$ of a continuous $f : S \to N$ contains a neighborhood $U'$ such that for any $f_0, f_1 \in U'$ there is a homotopy $h : S \times [0, 1] \to N$ with $h_t \in U'$ for all $t$, and if $f_0$ and $f_1$ are $C^{r-1}$, then so is $h$.

Proof The definition of the strong topology implies that there is an open $W \subset S \times N$ such that $f \in \{\, f' \in C(S, N) : \mathrm{Gr}(f') \subset W \,\} \subset U$. Lemma 10.7 implies that there is a continuous $\lambda : N \to \mathbb{R}_{++}$ such that $U_{\lambda(y)}(y) \subset V_\rho$ for all $y \in N$ and $(x, \pi_\nu(\sigma_\rho^{-1}(z))) \in W$ for all $x \in S$ and $z \in U_{\lambda(f(x))}(f(x))$. Let $W' := \{\, (x, y) \in W : y \in U_{\lambda(f(x))}(f(x)) \,\}$ and $U' := \{\, f' \in C(S, N) : \mathrm{Gr}(f') \subset W' \,\} \subset U$. For $f_0, f_1 \in U'$ define $h$ by setting

$$h(x, t) := \pi_\nu\bigl(\sigma_\rho^{-1}((1 - t) f_0(x) + t f_1(x))\bigr).$$

If $f_0$ and $f_1$ are $C^{r-1}$, so that they are the restrictions to $S$ of $C^{r-1}$ functions defined on open supersets of $S$, then this formula defines a $C^{r-1}$ extension of $h$ to an open superset of $S \times [0, 1]$, so that $h$ is $C^{r-1}$. □
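The homotopy formula can be tried out numerically in the simplest case. The sketch below assumes that $N$ is the unit circle in $\mathbb{R}^2$, so that $\pi_\nu \circ \sigma_\rho^{-1}$ is the nearest-point retraction $z \mapsto z/\|z\|$ on a tubular neighborhood that excludes the origin. The maps `f0` and `f1` and the helper names are illustrative choices, not part of the text.

```python
import math

def project_to_circle(z):
    """Nearest-point retraction z -> z/|z| onto the unit circle."""
    n = math.hypot(z[0], z[1])
    if n == 0.0:
        raise ValueError("the origin is outside the tubular neighborhood")
    return (z[0] / n, z[1] / n)

def homotopy(f0, f1):
    """Return h with h(x, t) = projection of (1 - t) f0(x) + t f1(x)."""
    def h(x, t):
        a, b = f0(x), f1(x)
        z = ((1 - t) * a[0] + t * b[0], (1 - t) * a[1] + t * b[1])
        return project_to_circle(z)
    return h

# Two nearby maps from the real line into the circle (illustrative assumptions).
f0 = lambda x: (math.cos(x), math.sin(x))
f1 = lambda x: (math.cos(x + 0.2), math.sin(x + 0.2))
h = homotopy(f0, f1)
```

Because `f0(x)` and `f1(x)` are never antipodal, the segment between them avoids the origin, so `h(x, t)` is defined for every `x` and every `t` in `[0, 1]`.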

Recall that a topological space $X$ is locally path connected if, for each $x \in X$, each neighborhood $U$ of $x$ contains a neighborhood $V$ such that for any $x_0, x_1 \in V$ there is a continuous path $\gamma : [0, 1] \to U$ with $\gamma(0) = x_0$ and $\gamma(1) = x_1$. For an open subset of a locally convex topological vector space, local path connectedness is automatic: any neighborhood of a point contains a convex neighborhood. For any subset $S \subset M$ let $C^r_S(S, N)$ be $C^r(S, N)$ with the relative topology inherited from $C_S(S, N)$. In view of Lemma 5.9 we have:

Corollary 10.2 If $S \subset M$ is compact, then $C_S(S, N)$ and $C^{r-1}_S(S, N)$ are locally path connected.

10.8 Manifolds with Boundary

Let $X$ be an $m$-dimensional vector space, and let $H$ be a closed half space of $X$. In the same way that manifolds were “modeled” on open subsets of $X$, manifolds with boundary are “modeled” on open subsets of $H$. Examples of ∂-manifolds include the $m$-dimensional unit disk

$$D^m := \{\, x \in \mathbb{R}^m : \|x\| \le 1 \,\},$$

the annulus $\{\, x \in \mathbb{R}^2 : 1 \le \|x\| \le 2 \,\}$, and of course $H$ itself. Since we will frequently consider homotopies, a particularly important example is $M \times [0, 1]$ where $M$ is a manifold (without boundary). Thus it is not surprising that we need to extend our formalism in this direction. What actually seems more surprising is the infrequency with which one needs to refer to “manifolds with corners,” which are spaces that are “modeled” on the nonnegative orthant of $\mathbb{R}^m$.


There is a technical point that we need to discuss. If $U \subset H$ is open and $f : U \to Y$ is $C^1$, where $Y$ is another vector space, then the derivative $Df(x)$ is defined at any $x \in U$, including those in the boundary of $H$, in the sense that all $C^1$ extensions $\tilde f : \tilde U \to Y$ of $f$ to open (in $X$) sets $\tilde U$ with $\tilde U \cap H = U$ have the same derivative at $x$. This is fairly easy to prove by showing that if $w \in X$ and the ray $r_w = \{\, x + tw : t \ge 0 \,\}$ from $x$ “goes into” $H$, then the derivative of $\tilde f$ along $r_w$ is determined by $f$, and that the set of such $w$ spans $X$. We won’t belabor the point by formalizing this argument.

The following definitions parallel those of the last section. If $U \subset H$ is open and $\varphi : U \to Y$ is a function, we say that $\varphi$ is a $C^r$ ∂-immersion if it is $C^r$ and the rank of $D\varphi(x)$ is $m$ for all $x \in U$. If, in addition, $\varphi$ is a homeomorphism between $U$ and $\varphi(U)$, then we say that $\varphi$ is a $C^r$ ∂-embedding.

Definition 10.10 If $M \subset \mathbb{R}^k$, an $m$-dimensional $C^r$ ∂-parameterization for $M$ is a $C^r$ ∂-embedding $\varphi : U \to M$, where $U \subset H$ is open and $\varphi(U)$ is a relatively open subset of $M$. If each $p \in M$ is contained in the image of a $C^r$ ∂-parameterization for $M$, then $M$ is an $m$-dimensional $C^r$ manifold with boundary.

We will often write “∂-manifold” in place of the cumbersome phrase “manifold with boundary.”

Fix an $m$-dimensional $C^r$ ∂-manifold $M \subset \mathbb{R}^k$. We say that $p \in M$ is a boundary point of $M$ if there is a $C^r$ ∂-parameterization of $M$ that maps a point in the boundary of $H$ to $p$. If any $C^r$ ∂-parameterization of a neighborhood of $p$ has this property, then all do; this is best understood as a consequence of invariance of domain (Theorem 14.11), which is most commonly proved using algebraic topology. Invariance of domain is quite intuitive, and eventually we will be able to establish it, but in the meantime there arises the question of whether our avoidance of results derived from algebraic topology is “pure.” One way of handling this is to read the definition of a ∂-manifold as specifying which points are in the boundary. That is, a ∂-manifold is defined to be a subset of $\mathbb{R}^k$ together with an atlas of $m$-dimensional $C^r$ ∂-parameterizations $\{\varphi_i\}_{i \in I}$ such that each $\varphi_j^{-1} \circ \varphi_i$ maps points in the boundary of $H$ to points in the boundary and points in the interior to points in the interior. In order for this to be rigorous it is necessary to check that all the constructions in our proofs preserve this feature, but this will be clear throughout. With this point cleared up, the boundary of $M$ is well defined; we denote this subset by $\partial M$. Note that $\partial M$ automatically inherits a system of coordinate systems that display it as an $(m - 1)$-dimensional $C^r$ manifold (without boundary).

Naturally our analytic work will be facilitated by characterizations of ∂-manifolds that are somewhat easier to verify than the definition.

Lemma 10.8 For $M \subset \mathbb{R}^k$ the following are equivalent:

(a) $M$ is an $m$-dimensional $C^r$ ∂-manifold;
(b) for each $p \in M$ there is a neighborhood $W \subset M$, an $m$-dimensional $C^r$ manifold (without boundary) $\tilde W$, and a $C^r$ function $h : \tilde W \to \mathbb{R}$ such that $W = h^{-1}(\mathbb{R}_+)$ and $Dh(p) \ne 0$.


Proof Fix $p \in M$. If (a) holds then there is a $C^r$ ∂-embedding $\varphi : U \to M$, where $U \subset H$ is open and $\varphi(U)$ is a relatively open subset of $M$ containing $p$. After composing with an affine function, we may assume that $H = \{\, x \in \mathbb{R}^m : x_m \ge 0 \,\}$. Let $\tilde\varphi : \tilde U \to \mathbb{R}^k$ be a $C^r$ extension of $\varphi$ to an open (in $\mathbb{R}^m$) superset of $U$. After replacing $\tilde U$ with a smaller neighborhood of $\varphi^{-1}(p)$ it will be the case that $\tilde\varphi$ is a $C^r$ embedding, and we may replace $U$ with its intersection with this smaller neighborhood. To verify (b) we set $\tilde W := \tilde\varphi(\tilde U)$ and $W := \varphi(U)$, and we let $h$ be the last component function of $\tilde\varphi^{-1}$.

Now suppose that $W$, $\tilde W$, and $h$ are as in (b). Let $\psi : V \to \tilde W$ be a $C^r$ parameterization for $\tilde W$ whose image contains $p$, and let $x := \psi^{-1}(p)$. Since $Dh(p) \ne 0$ there is some $i$ such that $\frac{\partial (h \circ \psi)}{\partial x_i}(x) \ne 0$; after reindexing we may assume that $i = m$. Let $\eta : V \to \mathbb{R}^m$ be the function

$$\eta(x) := \bigl(x_1, \dots, x_{m-1}, h(\psi(x))\bigr).$$

Examination of the matrix of partial derivatives shows that $D\eta(x)$ is nonsingular, so, by the inverse function theorem, after replacing $V$ with a smaller neighborhood of $x$, we may assume that $\eta$ is a $C^r$ embedding. Let $\tilde U := \eta(V)$, $U := \tilde U \cap H$, $\tilde\varphi := \psi \circ \eta^{-1} : \tilde U \to \tilde W$, and $\varphi := \tilde\varphi|_U : U \to W$. Evidently $\varphi$ is a $C^r$ ∂-parameterization for $M$. □

The following consequence is obvious, but is still worth mentioning because it will have important applications.

Proposition 10.15 If $M$ is an $m$-dimensional $C^r$ manifold, $f : M \to \mathbb{R}$ is $C^r$, and $a$ is a regular value of $f$, then $f^{-1}([a, \infty))$ is an $m$-dimensional $C^r$ ∂-manifold.
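As a quick illustration (a worked example added here, with the particular functions chosen for convenience rather than taken from the text), the unit disk and the annulus mentioned at the start of this section both arise in this way:

```latex
% Unit disk: take M = \mathbb{R}^m and a = 0.
f(x) := 1 - \|x\|^2, \qquad Df(x)v = -2\langle x, v\rangle .
% If f(x) = 0 then \|x\| = 1, so Df(x) \neq 0 and 0 is a regular value. Hence
f^{-1}([0,\infty)) = \{\, x \in \mathbb{R}^m : \|x\| \le 1 \,\} = D^m .

% Annulus: take M = \mathbb{R}^2, a = 0, and g(s) := (s - 1)(4 - s).
f(x) := g(\|x\|^2), \qquad Df(x)v = 2\, g'(\|x\|^2)\, \langle x, v\rangle, \qquad g'(s) = 5 - 2s .
% f(x) = 0 forces \|x\|^2 \in \{1, 4\}, where g' equals 3 or -3 and x \neq 0, so Df(x) \neq 0. Hence
f^{-1}([0,\infty)) = \{\, x \in \mathbb{R}^2 : 1 \le \|x\| \le 2 \,\} .
```

In both cases the boundary of the resulting ∂-manifold is the level set $f^{-1}(0)$.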

The definitions of tangent spaces, tangent manifolds, and derivatives are only slightly different from what we saw earlier. Suppose that $M \subset \mathbb{R}^k$ is an $m$-dimensional $C^r$ ∂-manifold, $\varphi : U \to M$ is a $C^r$ ∂-parameterization, $x \in U$, and $\varphi(x) = p$. The definition of a $C^r$ function gives a $C^r$ extension $\tilde\varphi : \tilde U \to \mathbb{R}^k$ of $\varphi$ to an open (in $\mathbb{R}^m$) superset of $U$, and we define $T_pM$ to be the image of $D\tilde\varphi(x)$. (Of course there is no difficulty showing that $D\tilde\varphi(x)$ does not depend on the choice of extension $\tilde\varphi$.) As before, the tangent manifold of $M$ is

$$TM = \bigcup_{p \in M} \{p\} \times T_pM.$$

Let $\pi_{TM} : TM \to M$ be the natural projection $(p, v) \mapsto p$. We wish to show that $TM$ is a $C^{r-1}$ ∂-manifold. To this end define $T_\varphi : U \times \mathbb{R}^m \to \pi_{TM}^{-1}(\varphi(U))$ by setting $T_\varphi(x, w) := (\varphi(x), D\varphi(x)w)$. If $r \ge 2$, then $T_\varphi$ is an injective $C^{r-1}$ ∂-immersion whose image is open in $TM$, so it is a $C^{r-1}$ ∂-embedding. Since $TM$ is covered by the images of maps such as $T_\varphi$, it is indeed a $C^{r-1}$ ∂-manifold.

If $N \subset \mathbb{R}^\ell$ is an $n$-dimensional $C^r$ ∂-manifold and $f : M \to N$ is a $C^r$ map, then the definitions of $Df(p) : T_pM \to T_{f(p)}N$ for $p \in M$ and $Tf : TM \to TN$, and the main properties, are what we saw earlier, with only technical differences in the explanation. In particular, $T$ extends to a functor from the category of $C^r$ ∂-manifolds and $C^r$ maps to the category of $C^{r-1}$ ∂-manifolds and $C^{r-1}$ maps.

We also need to reconsider the notion of a submanifold. One can of course define a $C^r$ ∂-submanifold of $M$ to be a $C^r$ ∂-manifold that happens to be contained in $M$, but the submanifolds of interest to us satisfy additional conditions. Any point in the submanifold that lies in $\partial M$ should be a boundary point of the submanifold, and we don’t want the submanifold to be tangent to $\partial M$ at such a point.

Definition 10.11 If $M$ is a $C^r$ ∂-manifold, a subset $P$ is a neat $C^r$ ∂-submanifold if it is a $C^r$ ∂-manifold, $\partial P = P \cap \partial M$, and for each $p \in \partial P$ we have $T_pP + T_p\partial M = T_pM$.

The reason this is the relevant notion has to do with transversality. Suppose that $M$ is a $C^r$ ∂-manifold, $N$ is a (boundaryless) $C^r$ manifold, $P$ is a $C^r$ submanifold of $N$, and $f : M \to N$ is $C^r$. We say that $f$ is transversal to $P$ along $S \subset M$, and write $f \pitchfork_S P$, if $f|_{M \setminus \partial M} \pitchfork_{S \setminus \partial M} P$ and $f|_{\partial M} \pitchfork_{S \cap \partial M} P$. As above, when $S = M$ we write $f \pitchfork P$.

The transversality theorem generalizes as follows:

Proposition 10.16 If $f : M \to N$ is a $C^r$ function that is transversal to $P$, then $f^{-1}(P)$ is a neat $C^r$ submanifold of $M$ with $\partial f^{-1}(P) = f^{-1}(P) \cap \partial M$.

Proof We need to show that a neighborhood of a point $p \in f^{-1}(P)$ has the required properties. If $p \in M \setminus \partial M$, this follows from Theorem 10.8, so suppose that $p \in \partial M$. Lemma 10.8 implies that there is a neighborhood $W \subset M$ of $p$, an $m$-dimensional $C^r$ manifold $\tilde W$, and a $C^r$ function $h : \tilde W \to \mathbb{R}$ such that $W = h^{-1}(\mathbb{R}_+)$, $h(p) = 0$, and $Dh(p) \ne 0$. Let $\tilde f : \tilde W \to N$ be a $C^r$ extension of $f|_W$.¹ We may assume that $\tilde f$ is transverse to $P$, so the transversality theorem implies that $\tilde f^{-1}(P)$ is a $C^r$ submanifold of $\tilde W$.

Since $f$ and $f|_{\partial M}$ are both transverse to $P$, there must be a $v \in T_pM \setminus T_p\partial M$ such that $Df(p)v \in T_{f(p)}P$. This implies two things. First, since $v \notin \ker Dh(p) = T_p\partial M$ and $f^{-1}(P) \cap W = \tilde f^{-1}(P) \cap \tilde W \cap h^{-1}(\mathbb{R}_+)$, Lemma 10.8 implies that $f^{-1}(P) \cap W$ is a $C^r$ ∂-manifold in a neighborhood of $p$. Second, the transversality theorem implies that $T_p f^{-1}(P)$ includes $v$, so we have $T_p f^{-1}(P) + T_p\partial M = T_pM$. □

¹ If $\psi : V \to N$ is a $C^r$ parameterization for $N$ whose image contains $f(W)$, then $\psi^{-1} \circ f|_W$ has a $C^r$ extension, because that is what it means for a function on a possibly nonopen domain to be $C^r$, and this extension can be composed with $\psi$ to give $\tilde f$.

10.9 Classification of Compact 1-Manifolds

In order to study the behavior of fixed points under homotopy, we will need to understand the structure of $h^{-1}(q)$ when $M$ and $N$ are manifolds of the same dimension,

$$h : M \times [0, 1] \to N$$

is a $C^r$ homotopy, and $q$ is a regular value of $h$. The transversality theorem implies that $h^{-1}(q)$ is a 1-dimensional $C^r$ ∂-manifold, so our first step is the following result.

Proposition 10.17 A nonempty compact connected 1-dimensional $C^r$ manifold is $C^r$ diffeomorphic to the circle $C := \{\, (x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1 \,\}$. A compact connected 1-dimensional $C^r$ ∂-manifold with nonempty boundary is $C^r$ diffeomorphic to $[0, 1]$.

Of course no one has any doubts about this being true. If there is anything to learn from the following technical lemma and the subsequent argument, it can only concern technique. Readers who skip this will not be at any disadvantage.

Lemma 10.9 Suppose that $a < b$ and $c < d$, and that there is an increasing $C^r$ diffeomorphism $f : (a, b) \to (c, d)$. Then for sufficiently large $Q \in \mathbb{R}$ there is an increasing $C^r$ diffeomorphism $\lambda : (a, b) \to (a - Q, d)$ such that $\lambda(s) = s - Q$ for all $s$ in some interval $(a, a + \delta)$ and $\lambda(s) = f(s)$ for all $s$ in some interval $(b - \varepsilon, b)$.

Proof Lemma 10.1 presented a $C^\infty$ function $\gamma : \mathbb{R} \to [0, \infty)$ with $\gamma(t) = 0$ for all $t \le 0$ and $\gamma'(t) > 0$ for all $t > 0$. Setting

$$\kappa(s) := \frac{\gamma(s - a - \delta)}{\gamma(s - a - \delta) + \gamma(b - \varepsilon - s)}$$

for sufficiently small $\delta, \varepsilon > 0$ gives a $C^\infty$ function $\kappa : (a, b) \to [0, 1]$ with $\kappa(s) = 0$ for all $s \in (a, a + \delta)$, $\kappa(s) = 1$ for all $s \in (b - \varepsilon, b)$, and $\kappa'(s) > 0$ for all $s$ such that $0 < \kappa(s) < 1$. For any real number $Q$ we can define $\lambda : (a, b) \to \mathbb{R}$ by setting

$$\lambda(s) = (1 - \kappa(s))(s - Q) + \kappa(s) f(s).$$

Clearly this will be satisfactory if $\lambda'(s) > 0$ for all $s$. A brief calculation gives

$$\begin{aligned}
\lambda'(s) &= 1 + \kappa(s)(f'(s) - 1) + \kappa'(s)(Q + f(s) - s) \\
&= (1 - \kappa(s))(1 - f'(s)) + f'(s) + \kappa'(s)(Q + f(s) - s).
\end{aligned}$$

If $Q$ is larger than the upper bound for $s - f(s)$, then $\lambda'(s) > 0$ when $\kappa(s)$ is close to 0 or 1. Since those $s$ for which this is not the case will be contained in a compact interval on which $\kappa'$ is positive and continuous, hence bounded below by a positive constant, if $Q$ is sufficiently large then $\lambda'(s) > 0$ for all $s$. □
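The gluing formula for $\lambda$ can be checked numerically on a specific example. The following is a minimal sketch under stated assumptions: $\gamma(t) = e^{-1/t}$ for $t > 0$ is used as one standard choice with the properties attributed to Lemma 10.1, and the diffeomorphism $f(s) = s^3$ on $(0, 1)$ together with the values of $\delta$, $\varepsilon$, and $Q$ are illustrative choices. A grid check of this kind is of course not a proof of monotonicity.

```python
import math

def gamma(t):
    """A C-infinity function that vanishes for t <= 0 and is increasing for t > 0."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def make_lambda(f, a, b, delta, eps, Q):
    """Return lam(s) = (1 - kappa(s)) (s - Q) + kappa(s) f(s) on (a, b)."""
    def kappa(s):
        num = gamma(s - a - delta)
        return num / (num + gamma(b - eps - s))
    def lam(s):
        k = kappa(s)
        return (1.0 - k) * (s - Q) + k * f(s)
    return lam

# Illustrative data (assumptions): f(s) = s**3 is an increasing diffeomorphism of (0, 1).
a, b, delta, eps, Q = 0.0, 1.0, 0.1, 0.1, 5.0
f = lambda s: s ** 3
lam = make_lambda(f, a, b, delta, eps, Q)

# Sample lam on a grid and check that the sampled values are strictly increasing.
grid = [a + (b - a) * i / 2000 for i in range(1, 2000)]
values = [lam(s) for s in grid]
is_increasing = all(x < y for x, y in zip(values, values[1:]))
```

Near the left end `lam(s)` coincides with `s - Q`, and near the right end it coincides with `f(s)`, which is the behavior the lemma requires.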

Proof of Proposition 10.17 Let $M$ be a nonempty compact connected 1-dimensional $C^r$ manifold. We can pass from a $C^r$ atlas for $M$ to a $C^r$ atlas whose elements all have connected domains by taking the restrictions of each element of the atlas to the connected components of its domain. To be concrete, we will assume that the domains of the parameterizations are connected subsets of $\mathbb{R}$, i.e., open intervals. Since we can pass from a parameterization with unbounded domain to a countable collection of restrictions to bounded domains, we may assume that all domains are bounded. Since $M$ is compact, any atlas has a finite subset that is also an atlas. We now have an atlas of the form

$$\{\, \varphi_1 : (a_1, b_1) \to M, \; \dots, \; \varphi_K : (a_K, b_K) \to M \,\}.$$

Finally, we may assume that $K$ is minimal. Since $M$ is compact, $K > 1$.

Let $p$ be a limit point of $\varphi_1(s)$ as $s \to b_1$. If $p$ were in the image of $\varphi_1$, say $p = \varphi_1(s_1)$, then the image of a neighborhood of $s_1$ would be a neighborhood of $p$, and points close to $b_1$ would be mapped to this neighborhood, contradicting the injectivity of $\varphi_1$. Therefore $p$ is not in the image of $\varphi_1$. After reindexing, we may assume that $p$ is in the image of $\varphi_2$, say $p = \varphi_2(t_2)$.

Fix $\varepsilon > 0$ small enough that $[t_2 - \varepsilon, t_2 + \varepsilon] \subset (a_2, b_2)$. Since $\varphi_2((t_2 - \varepsilon, t_2 + \varepsilon))$ and $M \setminus \varphi_2([t_2 - \varepsilon, t_2 + \varepsilon])$ are open and disjoint, and there are at most two $s$ such that $\varphi_1(s) = \varphi_2(t_2 \pm \varepsilon)$, there is some $\delta > 0$ such that $\varphi_1((b_1 - \delta, b_1)) \subset \varphi_2((t_2 - \varepsilon, t_2 + \varepsilon))$. Then $f = \varphi_2^{-1} \circ \varphi_1|_{(b_1 - \delta, b_1)}$ is a $C^r$ diffeomorphism onto its image. The intermediate value theorem implies that it is monotonic. Without loss of generality (we could replace $\varphi_2$ with $t \mapsto \varphi_2(-t)$) we may assume that it is increasing. Of course $\lim_{s \to b_1} f(s) = t_2$.

The last result implies that there is some real number $Q$ and an increasing $C^r$ diffeomorphism $\lambda : (b_1 - \delta, b_1) \to (b_1 - \delta - Q, t_2)$ such that $\lambda(s) = s - Q$ for all $s$ near $b_1 - \delta$ and $\lambda(s) = f(s)$ for all $s$ near $b_1$. We can now define $\varphi : (a_1 - Q, b_2) \to M$ by setting

$$\varphi(s) := \begin{cases}
\varphi_1(s + Q), & s \le b_1 - \delta - Q, \\
\varphi_1(\lambda^{-1}(s)), & b_1 - \delta - Q < s < t_2, \\
\varphi_2(s), & s \ge t_2.
\end{cases}$$

We have $\lambda^{-1}(s) = s + Q$ for all $s$ in a neighborhood of $b_1 - \delta - Q$ and $\varphi(s) = \varphi_2(s)$ for all $s$ close to $t_2$. Therefore $\varphi$ is a $C^r$ function. Each point in its domain has a neighborhood such that the restriction of $\varphi$ to that neighborhood is a $C^r$ parameterization for $M$, which implies that it maps open sets to open sets. If it were injective, it would be a $C^r$ coordinate chart whose image was the union of the images of $\varphi_1$ and $\varphi_2$, which would contradict the minimality of $K$.

Therefore $\varphi$ is not injective. Since $\varphi_1$ and $\varphi_2$ are injective, there must be $s < b_1 - \delta - Q$ such that $\varphi(s) = \varphi(s')$ for some $s' > t_2$. Let $s_0$ be the supremum of such $s$. If $\varphi(s_0) = \varphi(s')$ for some $s' > t_2$, then the restrictions of $\varphi$ to neighborhoods of $s_0$ and $s'$ would both map diffeomorphically onto some neighborhood of this point, which would contradict the definition of $s_0$. Therefore $\varphi(s_0)$ is in the closure of $\varphi((t_2, b_2))$, but is not an element of this set, so it must be $\lim_{s' \to b_2} \varphi(s')$. Arguments similar to those given above imply that there are $\alpha, \beta > 0$ such that the images of $\varphi|_{(b_2 - \alpha, b_2)}$ and $\varphi|_{(s_0 - \beta, s_0)}$ are the same, and the $C^r$ diffeomorphism

$$g = (\varphi|_{(s_0 - \beta, s_0)})^{-1} \circ \varphi|_{(b_2 - \alpha, b_2)}$$

is increasing. Applying the lemma above again, there is a real number $R$ and an increasing $C^r$ diffeomorphism $\lambda : (b_2 - \alpha, b_2) \to (b_2 - \alpha - R, s_0)$ such that $\lambda(s) = s - R$ for $s$ near $b_2 - \alpha$ and $\lambda(s) = g(s)$ for $s$ near $b_2$.

We now define $\psi : [s_0, s_0 + R) \to M$ by setting

$$\psi(s) := \begin{cases}
\varphi(s), & s_0 \le s \le b_2 - \alpha, \\
\varphi(\lambda^{-1}(s - R)), & b_2 - \alpha < s < s_0 + R.
\end{cases}$$

Then $\psi$ agrees with $\varphi$ near $b_2 - \alpha$, so it is $C^r$, and it agrees with $\varphi(s - R)$ near $s_0 + R$, so it can be construed as a $C^r$ function from the circle (thought of as $\mathbb{R}$ modulo $R$) to $M$. This function is easily seen to be injective, and it maps open sets to open sets, so its image is open, but also compact, hence closed. Since $M$ is connected, its image must be all of $M$, so we have constructed the desired $C^r$ diffeomorphism between the circle and $M$.

The argument for a compact connected one dimensional $C^r$ ∂-manifold with nonempty boundary is similar, but somewhat simpler, so we leave it to the reader. □

Although it will not figure in the work here, the reader should certainly be aware that the analogous issues for higher dimensions are extremely important in topology, and mathematical culture more generally. In general, a classification of some type of mathematical object is a description of all the isomorphism classes (for whatever is the appropriate notion of isomorphism) of the object in question. The result above classifies compact connected 1-dimensional $C^r$ manifolds.

The problem of classifying oriented surfaces (2-dimensional manifolds) was first considered in a paper of Möbius in 1870. The classification of all compact connected surfaces was correctly stated by von Dyck in 1888. This result was proved for surfaces that can be triangulated by Dehn and Heegaard in 1907, and in 1925 Radó showed that any surface can be triangulated.

After some missteps, Poincaré formulated a fundamental problem for the classification of 3-manifolds: is a simply connected compact 3-manifold necessarily homeomorphic to $S^3$? (A connected topological space $X$ is simply connected if any continuous function $f : S^1 \to X$ has a continuous extension $F : D^2 \to X$.) Although Poincaré did not express a strong view, this became known as the Poincaré conjecture. As it resisted solution while the four color theorem and Fermat’s last theorem were proved, it became perhaps the most famous open problem in mathematics. Curiously, the analogous theorems for higher dimensions were proved first, by Smale in 1961 for dimensions five and higher, and by Freedman in 1982 for dimension four. Finally, in late 2002 and 2003 Perelman posted three papers that sketched a proof of the original conjecture. Over the next three years three different teams of two mathematicians set about filling in the details of the argument. In the middle of 2006 each of the teams posted a (book length) paper giving a complete argument. Although Perelman’s papers were quite terse, and many details needed to be filled in, all three teams agreed that all gaps in his argument were minor.


Exercises

10.1 Let $M$ and $N$ be $C^\infty$ manifolds. Two $C^\infty$ functions $f_0, f_1 : M \to N$ are $C^\infty$ homotopic if there is a $C^\infty$ homotopy $h : M \times [0, 1] \to N$ with $h_0 = f_0$ and $h_1 = f_1$.

(a) Prove that if $f_0$ and $f_1$ are $C^\infty$ homotopic, then there is a $C^\infty$ homotopy $h : M \times [0, 1] \to N$ with $h_t = f_0$ for all $t \in [0, \tfrac{1}{3}]$ and $h_t = f_1$ for all $t \in [\tfrac{2}{3}, 1]$.
(b) Prove that “are $C^\infty$ homotopic” is an equivalence relation.

10.2 Prove that if $s \ge 1$, $M$ is a $C^s$ manifold, $A \subset M$, and $r : M \to A$ is a $C^s$ retraction, then $A$ is a $C^s$ submanifold.

10.3 Formulate a definition of a $C^r$ fiber bundle that is analogous to Definition 10.9 except that the fiber can be any $C^r$ manifold. Consider the map $\pi : S^3 \to S^2$ given by

$$\pi(x) := \bigl(2x_1x_3 + 2x_2x_4, \; -2x_1x_4 + 2x_2x_3, \; x_1^2 + x_2^2 - x_3^2 - x_4^2\bigr).$$

Verify that the image of $\pi$ is in fact $S^2$. (If $w = x_1 + i x_2$ and $z = x_3 + i x_4$ then the first two coordinates of $\pi(x)$ are the real and imaginary parts of $2w\bar z$ and the third is $|w|^2 - |z|^2$.) Show that $(S^3, \pi)$ is a fiber bundle with fiber $S^1$. It was discovered by Hopf (1931) and is known as the Hopf fibration.
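Before attempting the verification by hand, it can be reassuring to check numerically that the map lands on the unit sphere. The sketch below uses the formula in the form that matches the complex-number hint, so the first coordinate is $2x_1x_3 + 2x_2x_4$, the real part of $2w\bar z$. The function names and the random sampling scheme are illustrative assumptions, and a numerical check of this kind is not a substitute for the proof requested in the exercise.

```python
import math
import random

def hopf(x):
    """Hopf map: with w = x1 + i x2 and z = x3 + i x4, return
    (Re 2 w conj(z), Im 2 w conj(z), |w|^2 - |z|^2)."""
    x1, x2, x3, x4 = x
    return (2 * x1 * x3 + 2 * x2 * x4,
            -2 * x1 * x4 + 2 * x2 * x3,
            x1 ** 2 + x2 ** 2 - x3 ** 2 - x4 ** 2)

def random_point_on_s3(rng):
    """Normalize a Gaussian sample in R^4 to obtain a point of S^3."""
    v = [rng.gauss(0.0, 1.0) for _ in range(4)]
    n = math.sqrt(sum(t * t for t in v))
    return tuple(t / n for t in v)

# Largest deviation of |pi(x)|^2 from 1 over a fixed random sample.
rng = random.Random(0)
max_error = max(abs(sum(t * t for t in hopf(random_point_on_s3(rng))) - 1.0)
                for _ in range(1000))
```

The underlying identity is $|2w\bar z|^2 + (|w|^2 - |z|^2)^2 = (|w|^2 + |z|^2)^2$, which equals 1 on $S^3$.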

Let $n$ be a positive integer, and let $S_1, \dots, S_n$ be nonempty finite sets of pure strategies. For each $i = 1, \dots, n$ let

$$H_i := \Bigl\{\, \sigma_i : S_i \to \mathbb{R} : \sum_{s_i \in S_i} \sigma_i(s_i) = 1 \,\Bigr\}.$$

Let $S := S_1 \times \cdots \times S_n$ and $H := H_1 \times \cdots \times H_n$. A game for $S_1, \dots, S_n$ is an $n$-tuple $u = (u_1, \dots, u_n)$ of functions $u_i : S \to \mathbb{R}$. Let $G$ be the space of such games. We extend $u_i$ to $H$ multilinearly:

$$u_i(\sigma) = \sum_{s \in S} \Bigl( \prod_j \sigma_j(s_j) \Bigr) u_i(s).$$

Let

$$E := \{\, (u, \sigma) \in G \times H : u_i(s_i, \sigma_{-i}) = u_i(t_i, \sigma_{-i}) \text{ for all } i \text{ and all } s_i, t_i \in S_i \,\}.$$

10.4 Prove that $E$ is an $n|S|$-dimensional $C^\infty$ manifold.

Let $\pi : E \to G$ be the natural projection $(u, \sigma) \mapsto u$.

10.5 Prove that if $(u, \sigma)$ is a regular point of $\pi$, then there is a neighborhood $U \subset H$ of $\sigma$ such that $(u, \sigma') \notin E$ for all $\sigma' \in U \setminus \{\sigma\}$.


As in the exercises for Chap. 7, let $\Sigma_i := \Delta(S_i)$ and $\Sigma := \Sigma_1 \times \cdots \times \Sigma_n$. Recall that $\sigma \in \Sigma$ is a Nash equilibrium for $u \in G$ if $u_i(s_i, \sigma_{-i}) \le u_i(\sigma)$ for all $i$ and $s_i \in S_i$. The support of $\sigma_i \in \Sigma_i$ is $\{\, s_i \in S_i : \sigma_i(s_i) > 0 \,\}$. A Nash equilibrium $\sigma$ is totally mixed if the support of each $\sigma_i$ is all of $S_i$, and a totally mixed Nash equilibrium is regular if $(u, \sigma)$ is a regular point of $\pi$. A Nash equilibrium $\sigma$ is strict if $u_i(s_i, \sigma_{-i}) < u_i(\sigma)$ for all $i$ and all $s_i$ that are not in the support of $\sigma_i$, and it is regular if it is strict and is (in the obvious sense) a regular totally mixed Nash equilibrium of the game obtained by eliminating pure strategies for each $i$ that are not in the support of $\sigma_i$.

10.6 Prove that if $\sigma$ is a regular Nash equilibrium for $u$, then there is a neighborhood $V \subset \Sigma$ of $\sigma$ that contains no other Nash equilibria for $u$. Conclude that if all Nash equilibria for $u$ are regular, then there are finitely many equilibria.

10.7 Prove that if $\sigma$ is a regular Nash equilibrium for $u$, then there is a neighborhood $U \subset G$ of $u$ and a $C^\infty$ function $e : U \to \Sigma$ such that $e(u) = \sigma$ and for each $u' \in U$, $e(u')$ is a regular Nash equilibrium for $u'$.

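We now consider applications to a general equilibrium exchange economy. Let

$$\Delta := \{\, p \in \mathbb{R}^\ell_+ : p_1 + \cdots + p_\ell = 1 \,\},$$

and let $\Delta^\circ := \Delta \cap \mathbb{R}^\ell_{++}$ and $\partial\Delta := \Delta \setminus \Delta^\circ$. Demand information is summarized by $C^\infty$ functions $f_1, \dots, f_m : \Delta^\circ \times \mathbb{R}_{++} \to \mathbb{R}^\ell_{++}$ such that for each $i = 1, \dots, m$:

(a) $p \cdot f_i(p, w) = w$ for all $(p, w) \in \Delta^\circ \times \mathbb{R}_{++}$.
(b) $\|f_i(p^k, w^k)\| \to \infty$ whenever $\{(p^k, w^k)\}_{k=1}^\infty$ is a sequence in $\Delta^\circ \times \mathbb{R}_{++}$ such that $p^k \to p \in \partial\Delta$ and $\liminf_k w^k > 0$.

(Additional properties that the $f_i$ possess when they are derived from utility maximization do not figure in our analysis.)

A Walrasian equilibrium for an endowment vector $\omega = (\omega_1, \dots, \omega_m) \in (\mathbb{R}^\ell)^m$ is a pair $(p, x) \in \Delta^\circ \times (\mathbb{R}^\ell_{++})^m$ such that:

(a) $p \cdot \omega_i > 0$ and $x_i = f_i(p, p \cdot \omega_i)$ for each $i = 1, \dots, m$;
(b) $\sum_i x_i = \sum_i \omega_i$.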

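10.8 Let $z : \Delta^\circ \to \mathbb{R}^\ell$ be a continuous function such that:

• $p \cdot z(p) = 0$ for all $p \in \Delta^\circ$;
• there is $x \in \mathbb{R}^\ell$ such that $z(p) \ge x$ (that is, $z_j(p) \ge x_j$ for all $j$) for all $p \in \Delta^\circ$;
• $\|z(p^k)\| \to \infty$ whenever $\{p^k\}$ is a sequence in $\Delta^\circ$ converging to a point $p \in \partial\Delta$.

Let $F : \Delta \to \Delta$ be the correspondence

$$F(p) := \begin{cases}
\operatorname{argmax}_{q \in \Delta} \, q \cdot z(p), & p \in \Delta^\circ, \\
\{\, q \in \Delta : q \cdot p = 0 \,\}, & p \in \partial\Delta.
\end{cases}$$

(a) Prove that if $\{p^k\}$ is a sequence in $\Delta^\circ$ converging to $p \in \partial\Delta$ and $p_j > 0$, then $\limsup_k z_j(p^k) < \infty$. (Observe that $p^k_j z_j(p^k) = -\sum_{h \ne j} p^k_h z_h(p^k) \le |x_1| + \cdots + |x_\ell|$.)
(b) Prove that $F$ is upper hemicontinuous and convex valued.
(c) (Debreu-Gale-Kuhn-Nikaido Lemma) Prove that there is a $p^* \in \Delta^\circ$ such that $z(p^*) = 0$.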

10.9 Prove that if $\omega \in (\mathbb{R}^\ell_{++})^m$, then a Walrasian equilibrium for $\omega$ exists.

10.10 Prove that for each $i$, $B_i := \{\, (p, x_i) \in \Delta^\circ \times \mathbb{R}^\ell_{++} : f_i(p, p \cdot x_i) = x_i \,\}$ is an $\ell$-dimensional $C^\infty$ manifold.

Let $B$ be the set of $(p, x) \in \Delta^\circ \times (\mathbb{R}^\ell_{++})^m$ such that $f_i(p, p \cdot x_i) = x_i$ for all $i$, and let $E$ be the set of $(p, x, \omega) \in B \times (\mathbb{R}^\ell)^m$ such that $(p, x)$ is a Walrasian equilibrium for $\omega$.

10.11 (a) Prove that $B$ is an $(\ell + m - 1)$-dimensional manifold.
(b) Prove that $E$ is an $\ell m$-dimensional manifold.
(c) Display $E$ as a vector bundle with base $B$.

Let $\pi : E \to (\mathbb{R}^\ell)^m$ be the projection $(p, x, \omega) \mapsto \omega$. We say that $(p, x)$ is a regular equilibrium for an endowment vector $\omega$ if $(p, x, \omega)$ is a regular point of $\pi$.

10.12 Prove that if $(p, x)$ is a regular equilibrium for $\omega$, then there is a neighborhood $U \subset \Delta^\circ \times (\mathbb{R}^\ell_{++})^m$ of $(p, x)$ such that $(p', x', \omega) \notin E$ for all $(p', x') \in U \setminus \{(p, x)\}$.

10.13 Prove that if $(p, x)$ is a regular equilibrium for $\omega$, then there is a neighborhood $U \subset (\mathbb{R}^\ell_{++})^m$ of $\omega$ and a $C^\infty$ function $e : U \to \Delta^\circ \times (\mathbb{R}^\ell_{++})^m$ such that for each $\omega' \in U$, $e(\omega')$ is a regular equilibrium for $\omega'$.

A regular economy for $f_1, \dots, f_m$ is an endowment vector $\omega \in (\mathbb{R}^\ell_{++})^m$ that is a regular value of $\pi$.

10.14 Prove that if $\omega$ is a regular economy, then there are finitely many Walrasian equilibria for $\omega$.

Chapter 11
Sard’s Theorem

The results concerning existence and uniqueness of solutions of systems of linear equations have been well established for a long time, of course. In the late 19th century Walras recognized that the system describing economic equilibria had (after recognizing the redundant equation now known as Walras’ law) the same number of equations and free variables, which suggested that “typically” economic equilibria should be isolated and also robust, in the sense that the endogenous variables will vary continuously with the underlying parameters in some neighborhood of the initial point. It was several decades before methods for making these ideas precise were established in mathematics, and then several more decades elapsed before they were imported into theoretical economics.

The original versions of what is now known as Sard’s theorem appeared during the 1930’s. There followed a process of evolution, both in the generality of the result and in the method of proof, that culminated in the version due to Federer. In addition, Smale (1965) provided a version for Banach spaces. Our treatment here is primarily based on Milnor (1965a), fleshed out with some arguments from Sternberg (1983), which (in its first edition) seems to have been Milnor’s primary source. While not completely general, this version of the result is adequate for all of the applications in economic theory to date, many of which are extremely important.

Suppose $1 \le r \le \infty$, and let $f : U \to \mathbb{R}^n$ be a $C^r$ function, where $U \subset \mathbb{R}^m$ is open. If $f(x) = y$ and $Df(x)$ has rank $n$, then the implicit function theorem (Theorem 10.2) implies that, in a neighborhood of $x$, $f^{-1}(y)$ can be thought of as the graph of a $C^r$ function. Intuition developed by looking at low dimensional examples suggests that for “typical” values of $y$ this pleasant situation will prevail at all elements of $f^{-1}(y)$, but even in the case $m = n = 1$ one can see that there can be a countable infinity of exceptional $y$. Thus the difficulty in formulating this idea precisely is that we need a suitable notion of a “small” subset of $\mathbb{R}^n$. This problem was solved by the theory of Lebesgue measure, which explains the relatively late date at which the result first appeared.
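For a concrete instance of the remark about the case $m = n = 1$ (an illustrative example chosen here, not one taken from the text), consider the following:

```latex
f : \mathbb{R} \to \mathbb{R}, \qquad f(x) := x + \sin x, \qquad f'(x) = 1 + \cos x .
% The derivative vanishes exactly where \cos x = -1:
f'(x) = 0 \iff x = (2k + 1)\pi, \quad k \in \mathbb{Z},
% and at such points \sin x = 0, so the exceptional (critical) values are
f\bigl((2k + 1)\pi\bigr) = (2k + 1)\pi, \quad k \in \mathbb{Z} .
```

The set of exceptional values is countably infinite, yet it is "small" in the sense made precise in this chapter.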

Measure theory has rather complex foundations (Sect. 17.4 presents some of the basic elements) so it is preferable that it not be a prerequisite. Thus it is fortunate that only the notion of a set of measure zero is required. Section 11.1 defines this notion and establishes its basic properties. One of the most important results in measure theory is Fubini’s theorem, which, roughly speaking, allows functions to be integrated one variable at a time. Section 11.2 develops a Fubini-like result for sets of measure zero. With these elements in place, it becomes possible to state and prove Sard’s theorem in Sect. 11.3. Section 11.4 explains how to extend the result to maps between sufficiently smooth manifolds.

The application of Sard’s theorem that is most important in the larger scheme of this book is given in Sect. 11.5. The overall idea is to show that any map between manifolds can be approximated by one that is transversal to a given submanifold of the range.

11.1 Sets of Measure Zero

If $(X, d)$ is a metric space, $S$ is a subset of $X$, and $\alpha > 0$, the $\alpha$-dimensional Hausdorff measure of $S$ is the infimum of the set of sums $\sum_i r_i^\alpha$ such that there is a cover of $S$ consisting of balls of radii $r_i$ centered at points $x_i \in X$. The Hausdorff dimension of $S$ is the infimum of the set of $\alpha$ such that the $\alpha$-dimensional Hausdorff measure of $S$ is zero. For example, Shishikura (1994) showed that the boundary of the Mandelbrot set has Hausdorff dimension two, which means roughly that it is as jagged as it could possibly be. A beautiful informal introduction to the circle of ideas surrounding these concepts, which is the branch of analysis called geometric measure theory, is given by Morgan (1988).

The following is the most general version of Sard’s theorem. It is due to Federer, and a proof can be found in Sect. 3.4 of Federer (1969), which also provides a complete set of counterexamples showing it to be best possible.

Theorem 11.1 (Federer) Let $U \subset \mathbb{R}^m$ be open, and let $f : U \to \mathbb{R}^n$ be a $C^r$ function. For $0 \le p < m$ let $R_p$ be the set of points $x \in U$ such that the rank of $Df(x)$ is less than or equal to $p$. Then $f(R_p)$ has $\alpha$-dimensional Hausdorff measure zero for all $\alpha \ge p + \frac{m - p}{r}$.

We’ll say that a set $S \subset \mathbb{R}^m$ has measure zero if it has $m$-dimensional Hausdorff measure zero, so that for any $\varepsilon > 0$, there is a sequence $\{(x_j, r_j)\}_{j=1}^\infty$ in $\mathbb{R}^m \times (0, 1)$ such that

$$S \subset \bigcup_j U_{r_j}(x_j) \quad \text{and} \quad \sum_j r_j^m < \varepsilon.$$

Of course we can use different sets, such as cubes, to test whether a set has measure zero. Specifically, if we can find a covering of $S$ by balls of radius $r_j$ with $\sum_j r_j^m < \varepsilon$, then there is a covering by cubes of side length $2r_j$ with $\sum_j (2r_j)^m < 2^m \varepsilon$, and we can require that these cubes are aligned (in the obvious sense) with the coordinate axes. If we can find a covering of $S$ by cubes of side lengths $2\ell_j$ with $\sum_j (2\ell_j)^m < \varepsilon$, then there is a covering by balls of radius $\sqrt{m}\,\ell_j$ with $\sum_j (\sqrt{m}\,\ell_j)^m < (\sqrt{m}/2)^m \varepsilon$. We can also use rectangles $\prod_{i=1}^m [a_i, b_i]$ because we can cover such a rectangle with a collection of cubes of almost the same total volume. From the point of view of our methodology it is important to recognize that we “know” this as a fact of arithmetic (and in particular the distributive law) rather than as prior knowledge concerning a concept of volume.

The rest of this section develops a few basic facts. The following property of sets of measure zero occurs frequently in proofs.

Lemma 11.1 If $S_1, S_2, \ldots \subset \mathbb{R}^m$ are sets of measure zero, then $S_1 \cup S_2 \cup \ldots$ has measure zero.

Proof For given $\varepsilon$ take the union of a countable cover of $S_1$ by rectangles of total volume $< \varepsilon/2$, a countable cover of $S_2$ by rectangles of total volume $< \varepsilon/4$, etc. □
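The geometric-series device in this proof can be illustrated with exact arithmetic in the case $m = 1$. The sketch below treats each rational in $[0, 1]$ as a set of measure zero and covers the $j$-th one by an interval of radius $\varepsilon / 2^{j+1}$, so the radii of any finite (or countable) family sum to less than $\varepsilon$. The enumeration scheme, the function names, and the truncation to 50 points are illustrative assumptions.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """Enumerate the first `count` distinct rationals p/q in [0, 1], by increasing q."""
    seen, out, q = set(), [], 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out

def cover(points, eps):
    """Cover the j-th point (j = 1, 2, ...) by an interval of radius eps / 2**(j + 1)."""
    return [(x, Fraction(eps) / 2 ** (j + 1)) for j, x in enumerate(points, start=1)]

eps = Fraction(1, 100)
intervals = cover(rationals_in_unit_interval(50), eps)
# Sum of the radii; it stays below eps no matter how many points are included.
total_radius = sum(r for _, r in intervals)
```

Since the radii form a geometric series with sum at most $\varepsilon/2$, extending the enumeration to all rationals keeps the total below $\varepsilon$, which is how the countable set of rationals is seen to have measure zero.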

It is intuitively obvious that a set of measure zero cannot have a nonempty interior, but our methodology requires that we “forget” everything we know about volume, using only arithmetic to prove it.

Lemma 11.2 If $S$ has measure zero, its interior is empty, so its complement is dense.

Proof If not, then $S$ contains a closed cube $C$ aligned with the coordinate axes, say of side length $\ell$. We may assume that the vertices of $C$ are contained in the lattice $L$ of points in $\mathbb{R}^m$ whose coordinates are multiples of $1/n$ for some integer $n$. Suppose that $S$ has a covering by open cubes $C_j$ aligned with the coordinate axes of side lengths $\ell_j$. The Lebesgue number lemma (Lemma 2.12) implies that, after replacing $n$ with some integral multiple of itself, each of the subcubes $c$ of $C$ of side length $1/n$ with vertices in $L$ is contained in some $C_{j_c}$. For each $j$ let $v_j$ be the product of the side lengths of the maximal (generalized) rectangle aligned with the coordinate axes that is contained in $C_j$ and whose vertices have coordinates in $L$. Purely as a matter of arithmetic we have

$$\ell^m = \sum_j |\{\, c : j_c = j \,\}| / n^m \le \sum_j v_j \le \sum_j \ell_j^m,$$

so $\sum_j \ell_j^m$ cannot be made smaller than $\ell^m > 0$, contradicting the assumption that $S$ has measure zero. □

The next result implies that the notion of a set of measure zero is invariant under $C^1$ changes of coordinates. In the proof of Theorem 11.2 we will use this flexibility to choose coordinate systems with useful properties. In addition, this fact is the key to the definition of sets of measure zero in manifolds. In preparation for the proof we mention that if V and W are normed spaces and L : V → W is a continuous linear transformation, then the operator norm of L is

$$\|L\| := \sup_{\|v\| = 1} \|L(v)\| .$$

Lemma 11.3 If $U \subset \mathbb{R}^m$ is open, $f : U \to \mathbb{R}^m$ is $C^1$, and S ⊂ U has measure zero, then f(S) has measure zero.


Proof Let C ⊂ U be a closed cube. Since U can be covered by countably many such cubes (e.g., all cubes contained in U with rational centers and rational side lengths) it suffices to show that f(S ∩ C) has measure zero. Let $B := \max_{x \in C} \|Df(x)\|$. For any x, y ∈ C we have

$$\|f(x) - f(y)\| = \Bigl\| \int_0^1 Df((1-t)x + ty)(y - x)\, dt \Bigr\| \le \int_0^1 \|Df((1-t)x + ty)\| \times \|y - x\|\, dt \le B \|y - x\| .$$

If $\{(x_j, r_j)\}_{j=1}^\infty$ is a sequence such that

$$S \cap C \subset \bigcup_j U_{r_j}(x_j) \quad\text{and}\quad \sum_j r_j^m < \varepsilon ,$$

then

$$f(S \cap C) \subset \bigcup_j U_{B r_j}(f(x_j)) \quad\text{and}\quad \sum_j (B r_j)^m < B^m \varepsilon .$$

□

11.2 A Weak Fubini Theorem

For a set S ⊂ Rm and t ∈ R let

S(t) := { (x2, . . . , xm) ∈ Rm−1 : (t, x2, . . . , xm) ∈ S }

be the t-slice of S. Let P(S) be the set of t such that S(t) does not have (m − 1)-dimensional measure zero. Certainly it seems natural to expect that if S is a set of m-dimensional measure zero, then P(S) should be a set of 1-dimensional measure zero, and conversely. This is true, by virtue of Fubini’s theorem, which is an important theorem of measure theory, but we do not have the means to prove it in full generality. Fortunately we will only need Proposition 11.1 below, which is a special case.
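A small example may help to show why the statement concerns the set P(S) rather than individual slices. The set S below is an assumption chosen for simplicity; it is not taken from the text.

```latex
% Illustrative example (assumed, not from the text): a vertical segment in the plane.
S := \{0\} \times [0,1] \subset \mathbb{R}^2, \qquad
S(t) = \begin{cases} [0,1], & t = 0, \\ \emptyset, & t \neq 0. \end{cases}
% S is covered by the single rectangle (-\varepsilon, \varepsilon) \times (-1, 2),
% whose area is 6\varepsilon, so S has 2-dimensional measure zero.
% The slice S(0) = [0,1] does not have 1-dimensional measure zero (Lemma 11.2),
% but P(S) = \{0\} does.
```

So a set of measure zero may have some "fat" slices, provided that the set of parameters t at which this happens is itself negligible.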

Fix a compact set C, which we assume is contained in the rectangle $\prod_{i=1}^m [a_i, b_i]$. For each δ > 0 let $P_\delta(C)$ be the set of t such that C(t) cannot be covered by finitely many open rectangles whose (m − 1)-dimensional volumes sum to less than δ.

Lemma 11.4 For each δ > 0, Pδ(C) is closed.

Proof If t is in the complement of $P_\delta(C)$, then any finite collection of open rectangles that covers C(t) also covers C(t′) for t′ sufficiently close to t, because C is compact. Thus the complement of $P_\delta(C)$ is open. □

Lemma 11.5 If P(C) has measure zero, then C has measure zero.

Proof Fix ε > 0, and choose $\delta < \varepsilon / (2(b_1 - a_1))$. Since $P_\delta(C) \subset P(C)$, it has one dimensional measure zero, and since it is closed, hence compact, it can be covered by the union J of finitely many open intervals of total length $\varepsilon / (2(b_2 - a_2) \cdots (b_m - a_m))$. In this way $\{\, x \in C : x_1 \in J \,\}$ is covered by a union of open rectangles of total volume ≤ ε/2.

For each t ∉ J we can choose a finite union of rectangles in $\mathbb{R}^{m-1}$ of total volume less than δ that covers C(t), and these will also cover C(t′) for all t′ in some open interval around t. Since $[a_1, b_1] \setminus J$ is compact, it is covered by a finite collection of such intervals, and it is evident that we can construct a cover of $\{\, x \in C : x_1 \notin J \,\}$ of total volume less than ε/2. □

Lemma 11.6 If C has measure zero, then P(C) has measure zero.

Proof Since $P(C) = \bigcup_{n=1,2,\ldots} P_{1/n}(C)$, it suffices to show that $P_\delta(C)$ has measure zero for any δ > 0. For any ε > 0 there is a covering of C by finitely many rectangles of total volume less than ε. For each t there is an induced covering of C(t) by a finite collection of rectangles, and there is an induced covering of $[a_1, b_1]$. The total length of intervals with induced coverings of total volume greater than δ cannot exceed ε/δ. □
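The final counting step can be written out as finite arithmetic. The following is a sketch in LaTeX notation, under the assumption (made here for illustration) that each covering rectangle is written as a product $R_k = I_k \times R'_k$ of an interval and an $(m-1)$-dimensional rectangle; the symbols $g$, $J_i$, and $g_i$ are introduced only for this sketch.

```latex
% Assumed setup: R_k = I_k \times R'_k for k = 1, ..., K, with
% \sum_k |I_k| \, \mathrm{vol}_{m-1}(R'_k) < \varepsilon.
g(t) := \sum_{k \,:\, t \in I_k} \mathrm{vol}_{m-1}(R'_k)
% g is a step function: the endpoints of the I_k cut [a_1, b_1] into finitely many
% intervals J_1, ..., J_N, on each of which g takes a constant value g_i, and
\sum_{i=1}^N g_i \, |J_i| \;\le\; \sum_{k=1}^K |I_k| \, \mathrm{vol}_{m-1}(R'_k) \;<\; \varepsilon .
% If t \in P_\delta(C), then the R'_k with t \in I_k cover C(t), so g(t) \ge \delta. Hence
\delta \sum_{i \,:\, g_i \ge \delta} |J_i| \;\le\; \sum_{i=1}^N g_i \, |J_i| \;<\; \varepsilon
\quad\Longrightarrow\quad
\sum_{i \,:\, g_i \ge \delta} |J_i| \;<\; \varepsilon / \delta .
```

Since δ is fixed and ε is arbitrary, $P_\delta(C)$ is covered by finitely many intervals of arbitrarily small total length; the intervals $J_i$ may be enlarged slightly if open intervals are required.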

Proposition 11.1 If $S \subset \mathbb{R}^m$ is locally closed, then S has measure zero if and only if P(S) has measure zero.

Proof of Proposition 11.1 Suppose that S = C ∩ U where C is closed and U is open. Let $A_1, A_2, \ldots$ be a countable collection of compact rectangles that cover U. Then the following are equivalent:

(a) S has measure zero;
(b) each $C \cap A_j$ has measure zero;
(c) each $P(C \cap A_j)$ has measure zero;
(d) P(S) has measure zero.

Specifically, Lemma 11.1 implies that (a) and (b) are equivalent, and also that $P(S) = \bigcup_j P(C \cap A_j)$, after which the equivalence of (c) and (d) follows from a third application of the result. The equivalence of (b) and (c) follows from the lemmas above. □

11.3 Sard’s Theorem

We now come to this chapter’s central result. Recall that a critical point of a $C^1$ function is a point in the domain at which the rank of the derivative is less than the dimension of the range, and a critical value is a point in the range that is the image of a critical point. The case α = n and p = n − 1 in Federer’s theorem above reduces to:

Theorem 11.2 If $U \subset \mathbb{R}^m$ is open and $f : U \to \mathbb{R}^n$ is a $C^r$ function, where r > max{m − n, 0}, then the set of critical values of f has measure zero.
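For orientation, here is a minimal sketch of the simplest case m = n = 1, where a critical point is just a zero of the derivative. The cubic below is a hypothetical example, not one from the text; its set of critical values is finite, hence of measure zero.

```python
import math


def f(x):
    """Hypothetical example function f(x) = x^3 - 3x on the real line."""
    return x ** 3 - 3 * x


def real_roots_of_quadratic(a, b, c):
    """Real roots of a*x^2 + b*x + c = 0 (a != 0), in increasing order."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []
    root = math.sqrt(disc)
    return sorted({(-b - root) / (2 * a), (-b + root) / (2 * a)})


# f'(x) = 3x^2 - 3, so the critical points are the roots of 3x^2 - 3 = 0.
critical_points = real_roots_of_quadratic(3.0, 0.0, -3.0)
critical_values = sorted(f(x) for x in critical_points)
```

Every y other than the two critical values is a regular value: each preimage of y is a point where the derivative is nonzero.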


Proof If n = 0, then f has no critical points and therefore no critical values. If m = 0, then U is either a single point or the null set, and if n > 0 its image has measure zero. Therefore we may assume that m, n > 0. Since r > m − n implies both r > (m − 1) − (n − 1) and r > (m − 1) − n, by induction we may assume that the claim has been established with (m, n) replaced by either (m − 1, n − 1) or (m − 1, n).

Let C be the set of critical points of f. For i = 1, . . . , r let $C_i$ be the set of points in U at which all partial derivatives of f up to order i vanish. It suffices to show that:

(a) $f(C \setminus C_1)$ has measure zero;
(b) $f(C_i \setminus C_{i+1})$ has measure zero for all i = 1, . . . , r − 1;
(c) $f(C_r)$ has measure zero.

Proof of (a): We will show that each $x \in C \setminus C_1$ has a neighborhood V such that f(V ∩ C) has measure zero. This suffices because $C \setminus C_1$ is an open subset of a closed set, so it is covered by countably many compact sets, each of which is covered by finitely many such neighborhoods, and consequently it has a countable cover by such neighborhoods.

After reindexing we may assume that $\frac{\partial f_1}{\partial x_1}(x) \ne 0$. Let V be a neighborhood of x in which $\frac{\partial f_1}{\partial x_1}$ does not vanish. Let $h : V \to \mathbb{R}^m$ be the function

$$h(x) := (f_1(x), x_2, \ldots, x_m) .$$

The matrix of partial derivatives of h at x is

$$\begin{pmatrix}
\frac{\partial f_1}{\partial x_1}(x) & \frac{\partial f_1}{\partial x_2}(x) & \cdots & \frac{\partial f_1}{\partial x_m}(x) \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix},$$

so the inverse function theorem implies that, after replacing V with a smaller neighborhood of x, h is a diffeomorphism onto its image. The chain rule implies that the critical values of f are the critical values of $g := f \circ h^{-1}$, so we can replace f with g, and g has the additional property that $g_1(z) = z_1$ for all z in its domain. The upshot of this argument is that we may assume without loss of generality that $f_1(x) = x_1$ for all x ∈ V.

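For each $t \in \mathbb{R}$ let $V^t := \{\, w \in \mathbb{R}^{m-1} : (t, w) \in V \,\}$, let $f^t : V^t \to \mathbb{R}^{n-1}$ be the function

$$f^t(w) := (f_2(t, w), \ldots, f_n(t, w)) ,$$

and let $C^t$ be the set of critical points of $f^t$. The matrix of partial derivatives of f at x ∈ V is

$$\begin{pmatrix}
1 & 0 & \cdots & 0 \\
\frac{\partial f_2}{\partial x_1}(x) & \frac{\partial f_2}{\partial x_2}(x) & \cdots & \frac{\partial f_2}{\partial x_m}(x) \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial f_n}{\partial x_1}(x) & \frac{\partial f_n}{\partial x_2}(x) & \cdots & \frac{\partial f_n}{\partial x_m}(x)
\end{pmatrix},$$

so x is a critical point of f if and only if $(x_2, \ldots, x_m)$ is a critical point of $f^{x_1}$, and consequently

$$C \cap V = \bigcup_t \{t\} \times C^t \quad\text{and}\quad f(C \cap V) = \bigcup_t \{t\} \times f^t(C^t) .$$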

Since the result is known to be true with (m, n) replaced by (m − 1, n − 1), each $f^t(C^t)$ has (n − 1)-dimensional measure zero. In addition, the continuity of the relevant partial derivatives implies that $C \setminus C_1$ is locally closed, so Proposition 11.1 implies that f(C ∩ V) has measure zero.

Proof of (b): As above, it is enough to show that an arbitrary $x \in C_i \setminus C_{i+1}$ has a neighborhood V such that $f(C_i \cap V)$ has measure zero. Choose a partial derivative $\frac{\partial^{i+1} f}{\partial x_{s_1} \cdots \partial x_{s_i} \partial x_{s_{i+1}}}$ that does not vanish at x. Define $h : U \to \mathbb{R}^m$ by

$$h(x) := \Bigl( \frac{\partial^i f}{\partial x_{s_1} \cdots \partial x_{s_i}}(x), x_2, \ldots, x_m \Bigr) .$$

After reindexing we may assume that $s_{i+1} = 1$, so that the matrix of partial derivatives of h at x is triangular with nonzero diagonal entries. By the inverse function theorem the restriction of h to some neighborhood V of x is a $C^\infty$ diffeomorphism. Let $g := f \circ (h|_V)^{-1}$. Then $h(V \cap C_i) \subset \{0\} \times \mathbb{R}^{m-1}$. Let

$$g_0 : \{\, y \in \mathbb{R}^{m-1} : (0, y) \in h(V) \,\} \to \mathbb{R}^n$$

be the map $g_0(y) = g(0, y)$. Then $f(V \cap (C_i \setminus C_{i+1}))$ is contained in the set of critical values of $g_0$, and the latter set has measure zero because the result is already known when (m, n) is replaced by (m − 1, n).

Proof of (c): Since U can be covered by countably many compact cubes, it suffices to show that $f(C_r \cap I)$ has measure zero whenever I ⊂ U is a compact cube. Since I is compact and the partials of f of order r are continuous, Taylor’s theorem implies that for every ε > 0 there is δ > 0 such that

$$\|f(x + h) - f(x)\| \le \varepsilon \|h\|^r$$

whenever x, x + h ∈ I with $x \in C_r$ and ‖h‖ < δ. Let L be the side length of I. For each integer d > 0 divide I into $d^m$ subcubes of side length L/d. The diameter of such a subcube is $\sqrt{m} L / d$. If this quantity is less than δ and the subcube contains a point $x \in C_r$, then its image is contained in a cube of side length $2\varepsilon (\sqrt{m} L / d)^r$ centered at f(x). There are $d^m$ subcubes of I, each one of which may or may not contain a point in $C_r$, so for large d, $f(C_r \cap I)$ is contained in a finite union of cubes of total volume at most $\bigl(2(\sqrt{m} L)^r\bigr)^n \varepsilon^n d^{m - nr}$. Now observe that nr ≥ m: either m < n and r ≥ 1, or m ≥ n and

$$nr \ge n(m - n + 1) = (n - 1)(m - n) + m \ge m .$$

Therefore $f(C_r \cap I)$ is contained in a finite union of cubes of total volume at most $\bigl(2(\sqrt{m} L)^r\bigr)^n \varepsilon^n$, and ε may be arbitrarily small. □

Instead of worrying about just which degree of differentiability is the smallest that allows all required applications of Sard’s theorem, in the remainder of the book we will, for the most part, work with objects that are smooth, where smooth is a synonym for $C^\infty$. This will result in no loss of generality, since for the most part the arguments depend on the existence of smooth objects, which will follow from Proposition 10.2. However, in Chap. 15 there will be given objects that may, in applications, be only $C^1$, but Sard’s theorem will be applicable because the domain and range have the same dimension. It is perhaps worth mentioning that for this particular case there is a simpler proof, which can be found on p. 72 of Spivak (1965).

11.4 Measure Zero Subsets of Manifolds

In most books Sard’s theorem is presented as a result concerning maps betweenEuclidean spaces, as in the last section, with relatively little attention to the extensionto maps between manifolds. Certainly this extension is intuitively obvious, and thereare no real surprises or subtleties in the details, which are laid out in this section.

Definition 11.1 If $M \subset \mathbb{R}^k$ is an m-dimensional $C^1$ manifold, then S ⊂ M has m-dimensional measure zero if $\varphi^{-1}(S)$ has measure zero whenever $U \subset \mathbb{R}^m$ is open and ϕ : U → M is a $C^1$ parameterization.

In order for this to be sensible, it should be the case that ϕ(S) has measure zero whenever ϕ : U → M is a $C^1$ parameterization and S ⊂ U has measure zero. That is, it must be the case that if ϕ′ : U′ → M is another $C^1$ parameterization, then $\varphi'^{-1}(\varphi(S))$ has measure zero. This follows from the application of Lemma 11.3 to $\varphi'^{-1} \circ \varphi$.

Clearly the basic properties of sets of measure zero in Euclidean spaces (the complement of a set of measure zero is dense, and countable unions of sets of measure zero have measure zero) extend, by straightforward verifications, to subsets of manifolds of measure zero. Since uncountable unions of sets of measure zero need not have measure zero, the following fact about manifolds (as we have defined them, namely submanifolds of Euclidean spaces) is comforting, even if the definition above makes it superfluous.


Lemma 11.7 If $M \subset \mathbb{R}^k$ is an m-dimensional $C^1$ manifold, then M is covered by the images of a countable system of parameterizations $\{\varphi_j : U_j \to M\}_{j=1,2,\ldots}$.

Proof If p ∈ M and ϕ : U → M is a $C^1$ parameterization with p ∈ ϕ(U), then there is an open set $W \subset \mathbb{R}^k$ such that ϕ(U) = M ∩ W. Of course there is an open ball B of rational radius whose center has rational coordinates with p ∈ B ⊂ W, and we may replace ϕ with its restriction to $\varphi^{-1}(B)$. Now the claim follows from the fact that there are countably many balls in $\mathbb{R}^k$ of rational radii centered at points with rational coordinates. □

The “conceptually correct” version of Sard’s theorem is an easy consequence ofthe Euclidean special case.

Theorem 11.3 (Morse-Sard Theorem) If f : M → N is a smooth map, where M and N are smooth manifolds, then the set of critical values of f has measure zero.

Proof Let C be the set of critical points of f. In view of the last result it suffices to show that f(C ∩ ϕ(U)) has measure zero whenever ϕ : U → M is a parameterization for M. That is, we need to show that $\psi^{-1}(f(C \cap \varphi(U)))$ has measure zero whenever ψ : V → N is a parameterization for N. But $\psi^{-1}(f(C \cap \varphi(U)))$ is the set of critical values of $\psi^{-1} \circ f \circ \varphi$, so this follows from Theorem 11.2. □

11.5 Genericity of Transversality

Let M and N be smooth manifolds, and let P be a smooth submanifold of N. Recall that a smooth function f : M → N is transversal to P if, for every $p \in f^{-1}(P)$,

$$Df(p)(T_p M) + T_{f(p)} P = T_{f(p)} N .$$

Intuitively this seems like the normal state of affairs, and a violation of this condition seems “special” or “unlikely.” Our goal in this section is to prove the following theorem:

Theorem 11.4 Let L, M, and N be smooth manifolds, let π : N → L be a smooth submersion, and let P be a smooth submanifold of N such that $\pi|_P$ is a submersion. If f : M → N is continuous, π ◦ f is smooth, and A ⊂ M × N is an open neighborhood of Gr(f), then there is a smooth function f′ : M → N that is transversal to P with Gr(f′) ⊂ A and π ◦ f′ = π ◦ f.

There is also a sense in which transversality is stable, namely that in a topology onsmooth functions that is finer than the ones we have studied here, because it controlsfirst derivatives, the set of smooth functions from M to N that are transversal to Pis open if P is a closed subset of N , so that if f is transversal to P , then so are allnearby functions. This result can be found in Guillemin and Pollack (1974), Hirsch(1976a).
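The idea that failures of transversality are exceptional can be seen in a one-parameter toy family. The sketch below is an illustration under stated assumptions, not an example from the text: the maps $f_c : \mathbb{R} \to \mathbb{R}^2$, $f_c(t) = (t, t^2 - c)$, and the submanifold P (the horizontal axis) are hypothetical choices. Transversality at a point of $f_c^{-1}(P)$ means that the derivative vector and the tangent direction (1, 0) of P span the plane, which is tested with a 2 × 2 determinant.

```python
import math


def df(t):
    """Derivative of the hypothetical map f_c(t) = (t, t^2 - c); it does not depend on c."""
    return (1.0, 2.0 * t)


def preimages_of_axis(c):
    """Parameters t at which f_c(t) lies on P = {(x, 0)}."""
    if c < 0:
        return []
    if c == 0:
        return [0.0]
    return [-math.sqrt(c), math.sqrt(c)]


def is_transversal(c):
    """True if Df_c(t) and the tangent (1, 0) of P span R^2 at every t in f_c^{-1}(P)."""
    for t in preimages_of_axis(c):
        d1, d2 = df(t)
        # Determinant of the matrix with columns (d1, d2) and (1, 0).
        det = d1 * 0.0 - d2 * 1.0
        if det == 0:
            return False
    return True
```

In this family transversality fails only for c = 0, where the parabola is tangent to the axis; for c < 0 the image misses P and the condition holds vacuously, and an arbitrarily small change of c away from 0 restores it.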


Most sources consider only the special case of Theorem 11.4 without L and π. More precisely, the usual result can be understood as the case in which L is 0-dimensional, so that $\pi|_P$ is automatically a submersion, and the condition π ◦ f′ = π ◦ f is vacuous. We are motivated by a particular application.

A vector field on a set S ⊂ M is a continuous function ζ : S → TM such that $\pi \circ \zeta = \mathrm{Id}_M$, where π : TM → M is the projection. We can write $\zeta(p) = (p, \zeta_p)$ where $\zeta_p \in T_p M$. Thus a vector field on S attaches a tangent vector $\zeta_p$ to each p ∈ S, in a continuous manner. The zero section of TM is M × {0} ⊂ TM. If we like we can think of it as the image of the vector field that is identically zero, and of course it is also an m-dimensional smooth submanifold of TM. Theorem 11.4 implies that:

Proposition 11.2 If ζ is a vector field on M and A ⊂ TM is an open neighborhood of $\{\, (p, \zeta_p) : p \in M \,\}$, then there is a smooth vector field ζ′ such that $\{\, (p, \zeta'_p) : p \in M \,\} \subset A$ and ζ′ is transversal to the zero section of TM.

The proof of Theorem 11.4 is a construction that repeatedly modifies the function on small sets. Over the course of the argument our perspective will shift from linear to local to global. Let U, V, and W be finite dimensional vector spaces whose dimensions are, respectively, the dimension of M, the difference between the dimension of N and the dimension of L, and the dimension of L. Let π : V × W → W be the projection.

Lemma 11.8 Let α : U → V × W be a linear function, let γ be a positive number, and let β : U × V → V × W be the function β(u, v) = α(u) + (γv, 0). Let P be a linear subspace of V × W such that π(P) = W, let $Q = \beta^{-1}(P)$, and let ϖ : Q → V be the restriction of the projection U × V → V to Q. If ϖ(Q) = V, then α(U) + P = V × W.

If ϖ(Q) ≠ V, then, for any v ∈ V \ ϖ(Q), β(·, v) is transversal to P because its image does not intersect P. Thus this result can be regarded as the linear special case of our theorem: some small perturbation of α is transversal to P.

Proof Consider (v, w) ∈ V × W. Since P projects onto W, for any u ∈ U there are p ∈ P and v′ ∈ V such that (v, w) = α(u) + (γv′, 0) + p. Since the projection of Q onto V is surjective, there is a u′ such that (u′, −v′) ∈ Q. Let p′ := β(u′, −v′) = α(u′) − (γv′, 0) ∈ P. Then

$$(v, w) = \alpha(u) + (\gamma v', 0) + p' + (p - p') = \alpha(u + u') + p - p' \in \alpha(U) + P .$$

□

Let OU ⊂ U and OV×W ⊂ V × W be open.

Lemma 11.9 Let B be an open subset of V, and let P be a smooth submanifold of $O_{V \times W}$ such that $\pi|_P$ is a submersion. Let $g : O_U \times B \to O_{V \times W}$ be a smooth function that is transversal to P, such that for each $x \in O_U$ there is a $\gamma_x > 0$ such that $Dg(x, b)(0, v) = (\gamma_x v, 0)$ for all b and v. Let $Q := g^{-1}(P)$. Then Q is a smooth manifold. Let ϖ : Q → B be the restriction of the projection $O_U \times B \to B$ to Q. If b is a regular value of ϖ, then g(·, b) is transversal to P.


Proof The transversality theorem implies that Q is a smooth manifold. Fix $x \in g(\cdot, b)^{-1}(P)$. Let

$$\alpha := Dg(\cdot, b)(x) : U \to V \times W \quad\text{and}\quad \beta := Dg(x, b) : U \times V \to V \times W .$$

Evidently $\beta(u, v) = \alpha(u) + (\gamma_x v, 0)$ for all u and v. The transversality theorem implies that $T_{(x,b)} Q = \beta^{-1}(T_{g(x,b)} P)$. Since b is a regular value of the projection Q → B, the restriction of the projection U × V → V to $T_{(x,b)} Q$ is surjective. Therefore the last result implies that

$$\operatorname{im} Dg(\cdot, b)(x) + T_{g(x,b)} P = \operatorname{im} \alpha + T_{g(x,b)} P = V \times W .$$

□

Lemma 11.10 Let C, K, and Z be subsets of $O_U$ with C closed, K ⊂ Z, K compact, Z open, and the closure $\overline{Z}$ compact and contained in $O_U$. Let $f : O_U \to O_{V \times W}$ be a continuous function such that π ◦ f and the restriction of f to some neighborhood of C are smooth. Let A be a neighborhood of the graph of f. Suppose that P is a smooth submanifold of $O_{V \times W}$ such that $\pi|_P$ is a submersion and f is transversal to P on C. Then there is an $f' : O_U \to O_{V \times W}$ such that:

(a) π ◦ f′ = π ◦ f;
(b) f′ agrees with f on $O_U \setminus Z$;
(c) f′ is smooth on a neighborhood of C ∪ K;
(d) the graph of f′ is contained in A;
(e) f′ is transversal to P on C ∪ K.

Proof Let B be a closed ball centered at the origin of V that is small enough that (x, f(x) + (2v, 0)) ∈ A for all x ∈ Z and v ∈ B. Let $f = (f_V, f_W)$ where $f_V : O_U \to V$ and $f_W : O_U \to W$. Let

$$A_V := \{\, (u, v) \in O_U \times V : (u, v, f_W(u)) \in A \text{ and } (u, v + B, f_W(u)) \subset A \text{ if } u \in Z \,\} .$$

Evidently $A_V$ is open, and it contains the graph of $f_V$. Theorem 10.10 gives a smooth function $\tilde f_V : O_U \to V$ whose graph is contained in $A_V$.

Corollary 10.1 gives a smooth $\gamma : O_U \to [0, 1]$ that is identically one on a neighborhood of K and identically zero on a neighborhood of $O_U \setminus Z$. Let $g : O_U \times B \to O_{V \times W}$ be the function

$$g(x, b) := \bigl( (1 - \gamma(x)) f_V(x) + \gamma(x)(\tilde f_V(x) + b),\ f_W(x) \bigr) .$$

Let Y be the union of a neighborhood of C on which $f_V$ is smooth and a neighborhood of K on which γ is identically one. Then the restriction of g to Y × B is smooth and transversal to P. Let $Q := \{\, (x, b) \in Y \times B : g(x, b) \in P \,\}$, and let ϖ : Q → B be the restriction of the projection $O_U \times B \to B$. The transversality theorem implies that Q is a smooth manifold, and Sard’s theorem guarantees that the interior of B contains a regular value of ϖ, say b. Our construction guarantees that f′ := g(·, b) satisfies (a)–(d), and the last result implies that it is transversal to P on a neighborhood of C ∪ K. □

Proof of Theorem 11.4 For each p ∈ M the constant rank theorem gives open sets $O_{V \times W} \subset V \times W$ and $O_W \subset W$ and smooth parameterizations $\psi : O_{V \times W} \to N$ and $\rho : O_W \to L$ such that f(p) is in the image of ψ and $\rho^{-1} \circ \pi \circ \psi$ agrees with the projection V × W → W. Let $O_U \subset U$ be open, and let $\varphi : O_U \to M$ be a parameterization such that $p \in \varphi(O_U)$ and f maps the closure of $\varphi(O_U)$ to $\psi(O_{V \times W})$. Let K be a compact subset of $\varphi(O_U)$ that contains p in its interior.

We claim that there is a collection $\{(O^i_U, O^i_{V \times W}, O^i_W, \varphi^i, \psi^i, \rho^i, K^i)\}_{i \in I}$ of tuples as above such that the interiors of the $K^i$ cover M and the closures of the $\varphi^i(O^i_U)$ are a locally finite collection of sets. Since M is paracompact, there is no difficulty arranging for such a collection such that $\{K^i\}$ is locally finite. For each i there is an $\varepsilon_i$ such that $U_{\varepsilon_i}(K^i)$ intersects only those $K^j$ that intersect $K^i$. By replacing $O^i_U$ with a smaller neighborhood of $K^i$, we can require that the closure of $\varphi^i(O^i_U)$ is contained in $U_{\varepsilon_i/3}(K^i)$. In this case the closure of $\varphi^i(O^i_U)$ intersects the closure of $\varphi^j(O^j_U)$ only if $K^i$ intersects $K^j$, so the closures of the $\varphi^i(O^i_U)$ are a locally finite collection.

The set of (x, y) ∈ A such that $y \in \psi^i(O^i_{V \times W})$ for all i such that x is in the closure of $\varphi^i(O^i_U)$ is a subset of A that contains the graph of f. It is open because the additional restrictions are imposed on a locally finite collection of closed sets. By replacing A with this set we can ensure that $y \in \psi^i(O^i_{V \times W})$ whenever (x, y) ∈ A and x is in the closure of $\varphi^i(O^i_U)$.

Since M is separable, we may assume that I = {1, 2, . . .}. (In fact there is a variant of the argument that assumes only that I is a well ordered set, that you might like to work out for yourself.) For each i we let $C_i := K^1 \cup \cdots \cup K^{i-1}$. We inductively define a sequence of functions $f^0, f^1, f^2, \ldots$ from M to N, beginning with $f^0 := f$. Assume that we have already defined an $f^{i-1}$ whose graph is contained in A and that is smooth on a neighborhood of $C_i$ and transversal to P on $C_i$. Let $\tilde K^i := (\varphi^i)^{-1}(K^i)$, $\tilde C^i := (\varphi^i)^{-1}(C_i)$, $\tilde P^i := (\psi^i)^{-1}(P)$, $\tilde A^i := (\varphi^i \times \psi^i)^{-1}(A)$, and $\tilde f^{i-1} := (\psi^i)^{-1} \circ f^{i-1} \circ \varphi^i$. (The restriction on A developed above implies that $\tilde f^{i-1}$ is well defined.) If $Z^i$ is a neighborhood of $\tilde K^i$ whose closure is compact and contained in $O^i_U$, the last result gives a continuous $\tilde f^i : O^i_U \to O^i_{V \times W}$ whose graph is contained in $\tilde A^i$, that is smooth on a neighborhood of $\tilde C^i \cup \tilde K^i$ and transversal to $\tilde P^i$ on $\tilde C^i \cup \tilde K^i$, that agrees with $\tilde f^{i-1}$ outside of $Z^i$, and such that $\pi \circ \tilde f^i = \pi \circ \tilde f^{i-1}$. Define $f^i$ by setting $f^i|_{\varphi^i(O^i_U)} := \psi^i \circ \tilde f^i \circ (\varphi^i)^{-1}$ and $f^i|_{M \setminus \varphi^i(O^i_U)} := f^{i-1}|_{M \setminus \varphi^i(O^i_U)}$. Evidently the graph of $f^i$ is contained in A, and it is smooth on a neighborhood of $C_{i+1}$ and transversal to P on $C_{i+1}$.

For each p ∈ M let $f'(p) := \lim_i f^i(p)$. Since, on some neighborhood of p, $f^i$ differs from $f^{i-1}$ only finitely many times, this limit is well defined. Evidently π ◦ f′ = π ◦ f, f′ is smooth and transversal to P, and its graph is contained in A. □


Exercises

11.1 (Hirsch) Use the regular value theorem and the classification of 1-manifolds to prove that a $C^1$ retraction $r : D^m \to S^{m-1}$ cannot have a regular value. Prove that there cannot be a $C^1$ function $f : D^m \to D^m$ that does not have a fixed point because the function $r : D^m \to S^{m-1}$ that maps each x to the point where the ray originating from f(x) and passing through x intersects $S^{m-1}$ would be a $C^1$ retraction. Prove Brouwer’s fixed point theorem by showing that any continuous $f : D^m \to D^m$ can be approximated, in a suitable sense, by a $C^1$ function.

11.2 Let M be a (boundaryless) m-dimensional $C^r$ (1 ≤ r ≤ ∞) manifold, let C ⊂ M be closed, and let U ⊂ M be an open set containing C. Prove that there is an m-dimensional $C^r$ ∂-submanifold P of M with C ⊂ P ⊂ U.

11.3 Apply Sard’s theorem to the map π : E → G of Exercises 10.4–10.7 to prove that the set of u ∈ G whose Nash equilibria are all regular is generic in the sense that it is open and its complement has measure zero.

11.4 Apply Sard’s theorem to the map $\pi : E \to (\mathbb{R}^\ell)^m$ of Exercises 10.8–10.14 to prove that the set of regular economies for the given $f_1, \ldots, f_m$ is generic in the sense that it is open and its complement has measure zero.

Chapter 12 Degree Theory

Orientation is an intuitively familiar phenomenon, modelling, among other things, the fact that there is no way to turn a left shoe into a right shoe by rotating it, but the mirror image of a left shoe is a right shoe. Consider that when you look at a mirror there is a coordinate system in which the map taking each point to its mirror image is the linear transformation $(x_1, x_2, x_3) \mapsto (x_1, -x_2, x_3)$ if the second coordinate axis is the one that recedes into the distance. It turns out that the critical feature of this transformation is that its determinant is negative. After some preliminary geometric material in Sect. 12.1, Sect. 12.2 describes the formalism used to impose an orientation on a vector space, and Sect. 12.3 describes what we mean by an assignment of an orientation to the tangent spaces of the points of a manifold that is “continuous.”
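The determinant claim is easy to verify directly. In the minimal sketch below, the mirror matrix is the one described above, while the quarter-turn rotation about the third axis is an example added here for contrast; the helper name `det3` is hypothetical.

```python
def det3(a):
    """Determinant of a 3 x 3 matrix given as a list of rows (cofactor expansion)."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


# Mirror image: (x1, x2, x3) -> (x1, -x2, x3).
mirror = [[1, 0, 0],
          [0, -1, 0],
          [0, 0, 1]]

# A rotation by a quarter turn about the third coordinate axis, for comparison.
rotation = [[0, -1, 0],
            [1, 0, 0],
            [0, 0, 1]]

mirror_det = det3(mirror)
rotation_det = det3(rotation)
```

The reflection has determinant −1 and the rotation has determinant +1, matching the shoe intuition: rotations preserve handedness and the mirror reverses it.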

Section 12.4 discusses two senses in which an orientation on a given object induces a derived orientation: (a) an orientation on a ∂-manifold induces an orientation of its boundary; (b) given a smooth map between two manifolds of the same dimension, an orientation of the tangent space of a regular point in the domain induces an orientation of the tangent space of that point’s image. If both manifolds are oriented, we can define a sense in which the map is orientation preserving or orientation reversing by comparing the induced orientation of the tangent space of the image point with its given orientation.

In Sect. 12.5 we first define the smooth degree of a smooth (where “smooth” now means $C^\infty$) map over a regular value in the range to be the number of preimages of the point at which the map is orientation preserving minus the number of points at which it is orientation reversing. Although the degree for smooth functions provides the correct geometric intuition, it is insufficiently general. The desired generalization is achieved by approximating a continuous function with smooth functions, and showing that any two sufficiently accurate approximations are homotopic, so that such approximations can be used to define the degree of the given continuous function. However, instead of working directly with such a definition, it turns out that an axiomatic characterization is more useful.
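The signed count of preimages can be tried out on a map of the circle to itself. The sketch below is only an illustration under assumptions of ours: the circle map is described by a hypothetical lift $F(\theta) = \theta + 2\sin\theta$, which satisfies $F(\theta + 2\pi) = F(\theta) + 2\pi$, and preimages are located by sign changes on a fine grid rather than by exact root finding. A preimage counts +1 where F is increasing (orientation preserving) and −1 where it is decreasing.

```python
import math


def lift(theta):
    """Hypothetical lift of a smooth circle map: theta -> theta + 2 sin(theta)."""
    return theta + 2.0 * math.sin(theta)


def smooth_degree(lift_fn, y, samples=100_000, windings=3):
    """Signed count of solutions of lift_fn(theta) = y (mod 2 pi), theta in [0, 2 pi)."""
    two_pi = 2.0 * math.pi
    total = 0
    for k in range(-windings, windings + 1):
        level = y + two_pi * k
        prev = lift_fn(0.0) - level
        for i in range(1, samples + 1):
            cur = lift_fn(two_pi * i / samples) - level
            if prev < 0 <= cur:
                total += 1   # upward crossing: orientation preserving preimage
            elif cur < 0 <= prev:
                total -= 1   # downward crossing: orientation reversing preimage
            prev = cur
    return total


degree_at_3 = smooth_degree(lift, 3.0)
degree_at_1 = smooth_degree(lift, 1.0)
```

The angle 3.0 has three preimages (two preserving, one reversing) while 1.0 has a single preimage, yet the signed count is the same in both cases, which is the invariance that makes the degree well defined.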



Section 12.6 proves two basic properties of the degree. First, the degree of the composition of two maps is the product of their degrees. Second, the degree of the cartesian product of two functions is the product of their degrees.

12.1 Some Geometry

Important arguments later in this chapter will be constructive. In preparation, this section develops certain geometric concepts and constructions which are also quite interesting and important in their own right. Let X be a finite dimensional vector space endowed with an inner product.

A q-frame in X is an ordered q-tuple $(v_1, \ldots, v_q)$ of linearly independent vectors in X. This frame is orthonormal if $\langle v_i, v_j \rangle = \delta_{ij}$ (Kronecker δ) for all i and j. Let $F^q$ be the set of q-frames in X, and let $O^q$ be the set of orthonormal q-frames in X. Of course $F^q$ is an open subset of $X^q$, and $O^q$ is compact.

Let $\Gamma^q : F^q \to O^q$ be the map $\Gamma^q(v_1, \ldots, v_q) := (w_1, \ldots, w_q)$ where $w_1, \ldots, w_q$ are defined inductively by setting

$$u_i := v_i - \sum_{j=1}^{i-1} \langle v_i, w_j \rangle w_j \quad\text{and}\quad w_i := \frac{u_i}{\|u_i\|} .$$

This map is called the Gram–Schmidt process. To see that $w_1, \ldots, w_q$ are orthonormal, observe that if $w_1, \ldots, w_{i-1}$ are orthonormal, then taking the inner product of the first equation with each $w_j$ shows that $u_i$ is orthogonal to $w_1, \ldots, w_{i-1}$. Since $w_1, \ldots, w_{i-1}$ are linear combinations of $v_1, \ldots, v_{i-1}$ and $v_1, \ldots, v_i$ are linearly independent, $u_i \ne 0$. Therefore $w_i$ is a well defined continuous function of $v_1, \ldots, v_i$. If $(w_1, \ldots, w_q) \in O^q$, then an obvious inductive argument gives $\Gamma^q(w_1, \ldots, w_q) = (w_1, \ldots, w_q)$. Thus:

Lemma 12.1 The Gram–Schmidt process is a C∞ retraction.
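The inductive definition of the Gram–Schmidt process translates directly into code. The following is a minimal sketch for $X = \mathbb{R}^n$ with the standard inner product (an assumption, since the text allows any finite dimensional inner product space); the function names are ours. It follows the formula for $u_i$ literally, so it is meant as an illustration rather than a numerically robust implementation.

```python
import math


def inner(x, y):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))


def gram_schmidt(frame):
    """Map a q-frame (v_1, ..., v_q) to the orthonormal q-frame (w_1, ..., w_q)."""
    ws = []
    for v in frame:
        u = list(v)
        for w in ws:
            c = inner(v, w)                       # <v_i, w_j>
            u = [a - c * b for a, b in zip(u, w)]
        norm = math.sqrt(inner(u, u))
        if norm == 0.0:
            # Only exact dependence is detected; near-dependence is not handled.
            raise ValueError("the given vectors are linearly dependent")
        ws.append([a / norm for a in u])
    return ws


frame = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]]
ws = gram_schmidt(frame)
again = gram_schmidt(ws)   # the retraction property: an orthonormal frame is fixed
```

Applying the process twice gives (up to rounding) the same frame as applying it once, which is the retraction property asserted in Lemma 12.1.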

The Grassmann manifold of q-planes in X is the set $G^q$ of all q-dimensional linear subspaces of X. Let

span : Fq → Gq

be the function that takes each $(v_1, \ldots, v_q)$ to the span of $v_1, \ldots, v_q$. Of course the vectors in $\Gamma^q(v_1, \ldots, v_q)$ are linearly independent linear combinations of $v_1, \ldots, v_q$, so $\mathrm{span} \circ \Gamma^q = \mathrm{span}$.

We endow $G^q$ with the quotient topology induced by span, which is the finest topology such that span is continuous. (Exercise 12.2 asks you to show that with this topology, $G^q$ is in fact a manifold.) Concretely, a set in $G^q$ is open if and only if its preimage in $F^q$ is open.

Lemma 12.2 span is an open map.


Proof For a given open $U \subset F^q$ we need to show that $\mathrm{span}^{-1}(\mathrm{span}(U))$ is open. Let $(v_1, \ldots, v_q)$ be an element of the latter set, and let $(u_1, \ldots, u_q)$ be an element of U such that $\mathrm{span}(u_1, \ldots, u_q) = \mathrm{span}(v_1, \ldots, v_q)$. There is a nonsingular q × q matrix M such that

$$M \begin{pmatrix} v_1 \\ \vdots \\ v_q \end{pmatrix} = \begin{pmatrix} u_1 \\ \vdots \\ u_q \end{pmatrix} .$$

For $(v'_1, \ldots, v'_q)$ near $(v_1, \ldots, v_q)$ we can use this equation to define a $(u'_1, \ldots, u'_q) \in U$ such that $\mathrm{span}(u'_1, \ldots, u'_q) = \mathrm{span}(v'_1, \ldots, v'_q)$. □

For $(w_1, \ldots, w_q) \in O^q$ and z ∈ X let

$$\pi_{(w_1, \ldots, w_q)}(z) := \sum_i \langle z, w_i \rangle w_i .$$

For $(v_1, \ldots, v_q) \in F^q$ and z ∈ X let $\pi_{(v_1, \ldots, v_q)}(z) := \pi_{\Gamma^q(v_1, \ldots, v_q)}(z)$. (Since $\Gamma^q$ is a retraction there is no ambiguity when $(v_1, \ldots, v_q) \in O^q$.)

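Lemma 12.3 $\pi_{(v_1, \ldots, v_q)}(z)$ is closer to z than any other point in the span of $v_1, \ldots, v_q$.

Proof Since the Gram–Schmidt process does not change the span, it suffices to prove this for $(w_1, \ldots, w_q) \in O^q$. Setting $\pi := \pi_{(w_1, \ldots, w_q)}(z)$, note that $\langle z - \pi, w_i \rangle = 0$ for all i, so for any $\alpha_1, \ldots, \alpha_q \in \mathbb{R}$ we have

$$\Bigl\langle z - \bigl(\pi + \sum_i \alpha_i w_i\bigr),\ z - \bigl(\pi + \sum_i \alpha_i w_i\bigr) \Bigr\rangle = \|z - \pi\|^2 + \sum_i \alpha_i^2 .$$

The right hand side is minimized precisely when every $\alpha_i$ is zero. □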

If V ∈ Gq and z ∈ X , the projection of z onto V is the point πV (z) in V thatis nearest to z. In view of Lemma 12.3, this definition makes sense because for any(v1, . . . , vq) such that span(v1, . . . , vq) = V we have πV (z) = π(v1,...,vq )(z).

Proposition 12.1 The map $(V, z) \mapsto \pi_V(z)$ is continuous.

Proof For an open W ⊂ X,

$$O_W := \{\, ((w_1, \ldots, w_q), z) \in O^q \times X : \pi_{(w_1, \ldots, w_q)}(z) \in W \,\}$$

is open because $\pi_{(w_1, \ldots, w_q)}(z)$ is defined explicitly by a continuous formula, and $F_W := (\Gamma^q \times \mathrm{Id}_X)^{-1}(O_W)$ is open because $\Gamma^q$ is continuous, so

$$\{\, (V, z) : \pi_V(z) \in W \,\} = (\mathrm{span} \times \mathrm{Id}_X)(F_W)$$

is open because span is an open map. □


Lemma 12.4 For any V ∈ Gq, πV (·) : X → V is linear.

Proof For any $(w_1, \ldots, w_q) \in O^q$, $\pi_{(w_1, \ldots, w_q)}(\cdot)$ is defined by a linear formula, and for some $(w_1, \ldots, w_q)$ we have $\pi_{(w_1, \ldots, w_q)}(\cdot) = \pi_V(\cdot)$. □

Lemma 12.5 Any V ∈ Gq has a neighborhood W such that for all V ′, V ′′ ∈ W ,πV ′(·) : V ′′ → V ′ is nonsingular.

Proof If $v_1, \ldots, v_q$ is a basis of V, then Proposition 12.1 implies that

$$\pi_{V''}(v_1), \ldots, \pi_{V''}(v_q) \quad\text{and}\quad \pi_{V'}(\pi_{V''}(v_1)), \ldots, \pi_{V'}(\pi_{V''}(v_q))$$

are bases of V″ and V′ for all V′ and V″ in some neighborhood of V. □

If $\tau : [0, 1] \to G^k$ is continuous, a selection from τ is a continuous function s : [0, 1] → X such that s(t) ∈ τ(t) for all t. We say that s is nonvanishing if s(t) ≠ 0 for all t, and selections $s_1, \ldots, s_h$ are linearly independent if, for each t, $s_1(t), \ldots, s_h(t)$ are linearly independent.

Proposition 12.2 For any continuous τ : [0, 1] → Gq and any v0 ∈ τ(0) \ {0} thereis a nonvanishing selection s from τ with s(0) = v0.

Proof By Lemma 12.5, each t ∈ [0, 1] is contained in an open interval (a, b) such that $\pi_{\tau(t)}(\cdot) : \tau(t') \to \tau(t)$ is nonsingular for all t, t′ ∈ (a, b) ∩ [0, 1]. After taking a finite subcover of [0, 1], we can find $0 = t_0 < t_1 < \cdots < t_{k-1} < t_k = 1$ such that $\pi_{\tau(t)}(\cdot) : \tau(t_i) \to \tau(t)$ is nonsingular for all i and $t \in [t_i, t_{i+1}]$. We now proceed inductively: set $s(0) = v_0$, and if s has already been defined on $[0, t_i]$, for $t \in (t_i, t_{i+1}]$ let $s(t) := \pi_{\tau(t)}(s(t_i))$. □
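The inductive construction in this proof can be imitated on a grid. The sketch below is an illustration under assumptions of ours: X is the plane, τ(t) is the line spanned by $(\cos \pi t, \sin \pi t)$, and the partition points are taken to be $t_i = i/\text{steps}$, with the selection evaluated only at those points. Each step projects the previous value onto the next line, exactly as in $s(t) := \pi_{\tau(t)}(s(t_i))$.

```python
import math


def tau(t):
    """Unit vector spanning the hypothetical line tau(t) in R^2."""
    return (math.cos(math.pi * t), math.sin(math.pi * t))


def project_onto_line(direction, z):
    """Orthogonal projection of z onto the line spanned by the unit vector `direction`."""
    c = z[0] * direction[0] + z[1] * direction[1]
    return (c * direction[0], c * direction[1])


def selection(v0, steps):
    """Values s(t_0), ..., s(t_steps) obtained by projecting from one line to the next."""
    values = [v0]
    for i in range(1, steps + 1):
        values.append(project_onto_line(tau(i / steps), values[-1]))
    return values


path = selection((1.0, 0.0), 50)
norms = [math.hypot(x, y) for x, y in path]
```

Each projection shrinks the vector by the cosine of the small angle between consecutive lines, so with a fine enough partition the selection stays away from zero; in this example the line makes a half turn, and the selection ends at a negative multiple of its starting vector.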

If q ≤ r, let $\Phi^{q,r} := \{\, (V, W) \in G^q \times G^r : V \subset W \,\}$.

If (V, W ) ∈ Φq,r , the orthogonal complement of V in W is

⊥(V, W ) := { w ∈ W : 〈w, v〉 = 0 for all v ∈ V } .

Lemma 12.6 ⊥ : Φq,r → Gr−q is a continuous function.

Proof Fix $(V, W) \in \Phi^{q,r}$. Let $v_1, \ldots, v_r$ be a basis of W such that $v_1, \ldots, v_q$ is a basis of V. For (V′, W′) in some neighborhood of (V, W), $\pi_{V'}(v_1), \ldots, \pi_{V'}(v_q)$ is a basis of V′,

$$\pi_{V'}(v_1), \ldots, \pi_{V'}(v_q), \pi_{W'}(v_{q+1}), \ldots, \pi_{W'}(v_r)$$

are a basis of W′, and the last r − q components of the application of $\Gamma^r$ to this ordered basis are a basis of ⊥(V′, W′). Therefore the claim follows from the continuity of projection, the Gram–Schmidt procedure, and span. □


Lemma 12.7 If $\varphi = (\varphi_V, \varphi_W) : [0, 1] \to \Phi^{q,r}$ is continuous, there is a continuous $\omega : [0, 1] \to O^r$ such that for each t, $(\omega_1(t), \ldots, \omega_q(t))$ is a basis of $\varphi_V(t)$ and $(\omega_1(t), \ldots, \omega_r(t))$ is a basis of $\varphi_W(t)$.

Proof We can apply Proposition 12.2 repeatedly to choose continuous paths

ν1, . . . , νr : [0, 1] → X

such that for each i and t, $\nu_i(t) \in \varphi_W(t)$ is orthogonal to the span of $\nu_1(t), \ldots, \nu_{i-1}(t)$ and $\nu_i(t) \in \varphi_V(t)$ if i ≤ q. (The continuity of span and the last result imply that $\perp(\mathrm{span}(\nu_1(t), \ldots, \nu_{i-1}(t)), \varphi_V(t))$ (if i ≤ q) and $\perp(\mathrm{span}(\nu_1(t), \ldots, \nu_{i-1}(t)), \varphi_W(t))$ are continuous functions of t.) Let $\omega(t) := \Gamma^r(\nu_1(t), \ldots, \nu_r(t))$. □

Proposition 12.3 Suppose that τ : [0, 1] → Gm is continuous, ν1, . . . , νq are linearly independent selections from τ, vq+1, . . . , vr ∈ τ(0), and

ν1(0), . . . , νq(0), vq+1, . . . , vr

are linearly independent. Then there are selections νq+1, . . . , νr from τ such that νq+1(0) = vq+1, . . . , νr(0) = vr, and ν1, . . . , νr are linearly independent.

Proof For each $t$ let $\varphi_V(t) := \mathrm{span}(\nu_1(t), \ldots, \nu_q(t))$ and $\varphi_W(t) := \tau(t)$. The last result gives selections $\omega_1, \ldots, \omega_r$ from $\tau$ such that for each $t$, $\omega_1(t), \ldots, \omega_r(t)$ is a basis of $\tau(t)$ and $\omega_1(t), \ldots, \omega_q(t)$ is a basis of $\varphi_V(t)$. There is a nonsingular $r \times r$ matrix $M = (a_{ij})$ such that $\nu_i(0) = \sum_j a_{ij}\,\omega_j(0)$ for all $i = 1, \ldots, q$ and $v_i = \sum_j a_{ij}\,\omega_j(0)$ for all $i = q+1, \ldots, r$. For $i = q+1, \ldots, r$ define the selection $\nu_i$ by setting $\nu_i(t) := \sum_j a_{ij}\,\omega_j(t)$. Evidently

$$\sum_j a_{1j}\,\omega_j(t), \ \ldots, \ \sum_j a_{qj}\,\omega_j(t), \ \nu_{q+1}(t), \ \ldots, \ \nu_r(t)$$

are linearly independent, and $\sum_j a_{1j}\,\omega_j(t), \ldots, \sum_j a_{qj}\,\omega_j(t)$ and $\nu_1(t), \ldots, \nu_q(t)$ have the same span. □

12.2 Orientation of a Vector Space

The intuition underlying orientation is simple enough, but the formalism is a bit heavy, with the main definitions expressed as equivalence classes. We now assume that the vector space X is m-dimensional. An orientation of X is a connected component of Fm.

Proposition 12.4 If m > 0, then X has exactly two orientations.

Proof For any $(v_1, \ldots, v_m) \in F^m$ the map $A \mapsto (\sum_j a_{1j} v_j, \ldots, \sum_j a_{mj} v_j)$ is a linear bijection between the set of $m \times m$ matrices $A = (a_{ij})$ and $X^m$. For any path in $F^m$ the corresponding path in the space of matrices cannot encounter any matrix with determinant zero. In particular there cannot be a path between elements of $F^m$ whose corresponding matrices have determinants of opposite signs. Therefore there are at least two components.

Fix an orthonormal basis $e_1, \ldots, e_m$ of $X$. If $(v_1, \ldots, v_m) \in F^m$ and $i \ne j$, then $t \mapsto (v_1, \ldots, v_i + t v_j, \ldots, v_m)$ is a map from $\mathbb{R}$ to $F^m$. Combining such paths, we can find a path from any $(v_1, \ldots, v_m) \in F^m$ to $(w_1, \ldots, w_m)$ where $w_i = \sum_j b_{ij} e_j$ with $b_{ij} \ne 0$ for all $i$ and $j$. Continuing from $(w_1, \ldots, w_m)$, such paths can be combined to eliminate all off diagonal coefficients, arriving at an ordered basis of the form $(c_1 e_1, \ldots, c_m e_m)$. From here we can continuously rescale the coefficients, arriving at an ordered basis $(d_1 e_1, \ldots, d_m e_m)$ with $d_i = \pm 1$ for all $i$. For any $(v_1, \ldots, v_m) \in F^m$ and any $i = 1, \ldots, m-1$ there is a path

$$\theta \mapsto (v_1, \ldots, \cos\theta\, v_i + \sin\theta\, v_{i+1}, \ \cos\theta\, v_{i+1} - \sin\theta\, v_i, \ldots, v_m), \qquad 0 \le \theta \le \pi,$$

from $(v_1, \ldots, v_m)$ to $(v_1, \ldots, -v_i, -v_{i+1}, \ldots, v_m)$. Evidently such paths can be combined to construct a path from $(d_1 e_1, \ldots, d_m e_m)$ to $(\pm e_1, e_2, \ldots, e_m)$. Thus for any $(v_1, \ldots, v_m)$ there is a path in $F^m$ from $(v_1, \ldots, v_m)$ to either $(e_1, \ldots, e_m)$ or $(-e_1, e_2, \ldots, e_m)$, so that there are at most two components. □

Two ordered bases of X are said to have the same orientation if they are in the same component of Fm; otherwise they have the opposite orientation. The result above implies that:

Corollary 12.1 If $(v_1, \ldots, v_m), (v'_1, \ldots, v'_m) \in F^m$ and $M = (a_{ij})$ is the matrix such that $v'_i = \sum_j a_{ij} v_j$, then $(v_1, \ldots, v_m)$ and $(v'_1, \ldots, v'_m)$ have the same orientation if and only if the determinant of $M$ is positive.
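The determinant criterion is easy to experiment with. The sketch below assumes X = R^m with each ordered basis stored as the rows of a matrix; in that case the two basis matrices satisfy V′ = MV, so det M is positive exactly when det V and det V′ have the same sign. The helper names are illustrative, and `rotate_pair` evaluates the rotation path used in the proof of Proposition 12.4.

```python
import math

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def same_orientation(basis_a, basis_b):
    """True if two ordered bases of R^m (rows are vectors) have the same orientation."""
    return det(basis_a) * det(basis_b) > 0

def rotate_pair(basis, i, theta):
    """Rotate the vectors in positions i and i + 1 by the angle theta."""
    out = [list(v) for v in basis]
    vi, vj = basis[i], basis[i + 1]
    out[i] = [math.cos(theta) * a + math.sin(theta) * b for a, b in zip(vi, vj)]
    out[i + 1] = [math.cos(theta) * b - math.sin(theta) * a for a, b in zip(vi, vj)]
    return out

standard = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
swapped = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
end_of_path = rotate_pair(standard, 0, math.pi)   # approximately (-e1, -e2, e3)
```

Along the rotation path the determinant stays equal to 1, so negating two basis vectors never changes the orientation, while swapping two of them does.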

An oriented vector space is a finite dimensional vector space for which one of the two orientations has been specified. An ordered basis of an oriented vector space is said to be positively oriented (negatively oriented) if it is (not) an element of the specified orientation.

The result above has a second interpretation. The general linear group of X is the group GL(X) of all nonsingular linear transformations L : X → X, with composition as the group operation. The identity component of GL(X) is the subgroup GL+(X) of linear transformations with positive determinant; its elements are said to be orientation preserving, and the elements of the other component are said to be orientation reversing. If we fix a particular basis e1, . . . , em there is a bijection L ↔ (Le1, . . . , Lem) between GL(X) and the set of ordered bases of X, which gives the following version of the last result.

Corollary 12.2 GL+(X) is path connected.

Any basis induces an obvious bijection between GL+(X) and the set of m × m matrices with positive determinant, so:

Corollary 12.3 The set of m × m matrices with positive determinant is path connected.

We will need the following application of this fact.


Proposition 12.5 Let $\tau : [0, 1] \to G^m$ be continuous, and let $\nu_1, \ldots, \nu_m$ and $\nu'_1, \ldots, \nu'_m$ be linearly independent $m$-tuples of selections from $\tau$. If $\nu_1(0), \ldots, \nu_m(0)$ and $\nu'_1(0), \ldots, \nu'_m(0)$ have the same orientation, then there is a linearly independent $m$-tuple of selections $\tilde\nu_1, \ldots, \tilde\nu_m$ that agrees with $\nu_1, \ldots, \nu_m$ on some neighborhood of 0 and agrees with $\nu'_1, \ldots, \nu'_m$ on some neighborhood of 1.

Proof For each $t$ let $M(t) = (a_{ij}(t))$ be the matrix such that $\nu'_i(t) = \sum_j a_{ij}(t)\,\nu_j(t)$. The determinant of $M(t)$ is a continuous function of $t$ that is positive at 0 and vanishes nowhere, so it is positive everywhere. Let $\bar t$ be any element of $(0, 1)$, and let $N$ be a continuous function from $[0, 1]$ to the space of nonsingular $m \times m$ matrices such that $N(0) = I$ (the $m \times m$ identity matrix) and $N(1) = M(\bar t)$; such a function exists by Corollary 12.3. Choose $t_0 \in (0, \bar t)$ and $t_1 \in (\bar t, 1)$. Let $\tilde\nu_1, \ldots, \tilde\nu_m$ agree with $\nu_1, \ldots, \nu_m$ on the interval $[0, t_0]$ and with $\nu'_1, \ldots, \nu'_m$ on the interval $[t_1, 1]$. For $t_0 \le t \le t_1$ let $s(t) := \frac{t - t_0}{t_1 - t_0}$ and

$$\begin{pmatrix} \tilde\nu_1(t) \\ \vdots \\ \tilde\nu_m(t) \end{pmatrix} := (1 - s(t))\, N(s(t)) \begin{pmatrix} \nu_1(t) \\ \vdots \\ \nu_m(t) \end{pmatrix} + s(t)\, N(s(t))\, M(\bar t)^{-1} \begin{pmatrix} \nu'_1(t) \\ \vdots \\ \nu'_m(t) \end{pmatrix} .$$

This construction is satisfactory if $\tilde\nu_1(t), \ldots, \tilde\nu_m(t)$ are linearly independent for all $t$, and we claim that this is necessarily the case if $t_0$ and $t_1$ are close enough to $\bar t$. Otherwise there would be a sequence $\{t_r\}$ converging to $\bar t$ and a convergent sequence $\{s_r\}$ in $[0, 1]$, say with limit $s$, such that the right hand side (with $s_r$ in place of $s(t)$ and $t_r$ in place of $t$) was an $m$-tuple of linearly dependent vectors for all $r$. But

$$(1 - s_r)\, N(s_r) \begin{pmatrix} \nu_1(t_r) \\ \vdots \\ \nu_m(t_r) \end{pmatrix} + s_r\, N(s_r)\, M(\bar t)^{-1} \begin{pmatrix} \nu'_1(t_r) \\ \vdots \\ \nu'_m(t_r) \end{pmatrix} \to N(s) \begin{pmatrix} \nu_1(\bar t) \\ \vdots \\ \nu_m(\bar t) \end{pmatrix} ,$$

and the limit is linearly independent. Since linear independence is an open condition, this is impossible. □

12.3 Orientation of a Manifold

Now let M ⊂ Rk be a smooth m-dimensional ∂-manifold. Roughly, an orientation of M is a “continuous” assignment of an orientation to the tangent spaces of points of M. In order to make this precise we prove a result that allows us to continuously transport an orientation along a curve.

The following fact is basic.

Lemma 12.8 The function p ↦ TpM from M to the m-dimensional subspaces of Rk is continuous.

Proof It suffices to show that the function is continuous on the image of a smooth parameterization ϕ : U → M for M, where U is an open subset of a half space of an m-dimensional vector space X. Let b1, . . . , bm be a basis of X. For p ∈ ϕ(U) we have

TpM = span{Dϕ(ϕ−1(p))b1, . . . , Dϕ(ϕ−1(p))bm},

so (Lemma 12.2) p ↦ TpM is a composition of continuous functions. □

If γ : [0, 1] → M is a continuous path, a vector field along γ is a continuous

ν : [0, 1] → Rk

such that ν(t) ∈ Tγ(t)M for all t. Such vector fields ν1, . . . , νq are linearly independent if, for each t, ν1(t), . . . , νq(t) are linearly independent. The comparison of the orientations at the two ends of the path does not depend on the choice of vector fields.

Lemma 12.9 If $\gamma : [0, 1] \to M$ is a path and $\nu_1, \ldots, \nu_m, \tilde\nu_1, \ldots, \tilde\nu_m$ are vector fields along $\gamma$ such that $\nu_1, \ldots, \nu_m$ and $\tilde\nu_1, \ldots, \tilde\nu_m$ are linearly independent, then $(\nu_1(0), \ldots, \nu_m(0))$ and $(\tilde\nu_1(0), \ldots, \tilde\nu_m(0))$ have the same orientation if and only if $(\nu_1(1), \ldots, \nu_m(1))$ and $(\tilde\nu_1(1), \ldots, \tilde\nu_m(1))$ have the same orientation.

Proof For each $t$ let $M(t) = (a_{ij}(t))$ be the matrix such that $\tilde\nu_i(t) = \sum_j a_{ij}(t)\,\nu_j(t)$. This matrix is a continuous function of $t$ and each $M(t)$ is nonsingular, so the signs of $|M(1)|$ and $|M(0)|$ are the same, and the claim follows from Corollary 12.1. □

To validate “transporting” orientation from γ(0) to γ(1) we need to produce at least one system of acceptable vector fields. The following consequence of Proposition 12.3 guarantees the existence of suitable extensions of such collections in a wide variety of circumstances.

Proposition 12.6 Suppose that γ : [0, 1] → M is continuous, ν1, . . . , νq are linearly independent vector fields along γ, vq+1, . . . , vr ∈ Tγ(0)M, and

ν1(0), . . . , νq(0), vq+1, . . . , vr

are linearly independent. Then there are vector fields νq+1, . . . , νr along γ such that νq+1(0) = vq+1, . . . , νr(0) = vr and ν1, . . . , νr are linearly independent.

In particular, there exist linearly independent vector fields ν1, . . . , νm.

In this sense we can speak of the orientation of Tγ(1)M induced by γ and an orientation of Tγ(0)M. We say that γ is a loop if γ(0) = γ(1). In this case γ is orientation reversing if a given orientation of Tγ(0)M differs from the orientation of Tγ(1)M = Tγ(0)M induced by γ and the given orientation of Tγ(0)M. We say that M is unorientable if it has an orientation reversing loop, and otherwise M is orientable.

An orientation of M is a specification of an orientation of each TpM such that for any path γ : [0, 1] → M the specified orientation of Tγ(1)M is the one induced by γ and the specified orientation of Tγ(0)M. If M is unorientable, then it has no orientations. If M is orientable, then each connected component has two orientations, and an orientation of M amounts to a specification of an orientation of each component. An oriented manifold is an orientable manifold together with a specification of an orientation. We say that an ordered basis (v1, . . . , vm) of some TpM is positively oriented if the orientation contains (p, (v1, . . . , vm)), and otherwise it is negatively oriented.

Probably you already know that the Möbius strip is the best known example of a ∂-manifold that is not orientable, while the Klein bottle is the best known example of a compact manifold that is not orientable. From several points of view two dimensional projective space is a more fundamental example of a manifold that is not orientable, but it is more difficult to visualize. (If you are unfamiliar with any of these spaces you should do a quick web search.)
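An orientation reversing loop on the Möbius strip can be exhibited numerically. The sketch below assumes the standard parameterization ϕ(u, v) = ((1 + v cos(u/2)) cos u, (1 + v cos(u/2)) sin u, v sin(u/2)), which is not taken from the text, and follows the frame (∂ϕ/∂u, ∂ϕ/∂v) along the center circle v = 0. These two vector fields are linearly independent along the loop, and the frame at the end of the loop is compared with the frame at the start by the determinant test of Corollary 12.1.

```python
import math

def frame(u):
    """Tangent frame (d/du, d/dv) of the Möbius strip along the center circle v = 0."""
    d_u = (-math.sin(u), math.cos(u), 0.0)
    d_v = (math.cos(u / 2) * math.cos(u),
           math.cos(u / 2) * math.sin(u),
           math.sin(u / 2))
    return d_u, d_v

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def transition_determinant():
    """Determinant of the matrix expressing the frame at u = 2*pi in the frame at u = 0."""
    start = frame(0.0)
    end = frame(2 * math.pi)
    # The start frame is orthonormal, so coordinates are given by dot products.
    m = [[dot(e, s) for s in start] for e in end]
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]
```

The determinant is −1 up to rounding: the transported frame returns with its second vector negated, so the center circle is an orientation reversing loop.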

12.4 Induced Orientation

In this section we first explain how an orientation on a ∂-manifold induces an orientation on the boundary. We then study maps between oriented manifolds, and in particular points where the function hits a submanifold of the range. This is then specialized to the case of the image of a homotopy, giving a result that is the basis for the invariance of the degree and the index under homotopy.

Let M be a smooth m-dimensional ∂-manifold. If p ∈ ∂M and v ∈ TpM \ Tp(∂M), we say that v is inward pointing if there is a smooth path γ : [0, 1] → M with γ(0) = p and γ′(0) = v, and otherwise v is outward pointing. It is obvious that v is inward pointing if and only if −v is outward pointing. For each p ∈ ∂M let np be the outward pointing vector of unit length in TpM that is orthogonal to Tp(∂M). Passing to a coordinate system, one can easily show that np is a continuous function of p.

Proposition 12.7 If M is orientable, then ∂M is orientable. Any orientation of M induces an orientation of ∂M defined by specifying that if p ∈ ∂M and v2, . . . , vm ∈ Tp(∂M), then v2, . . . , vm is a positively oriented ordered basis of Tp(∂M) if and only if np, v2, . . . , vm is a positively oriented ordered basis of TpM.

Proof Fix a path γ : [0, 1] → ∂M. Proposition 12.6 gives linearly independent vector fields ν2, . . . , νm for ∂M along γ. Let n denote the vector field t ↦ nγ(t) for M along γ. Evidently n, ν2, . . . , νm are linearly independent.

Aiming at a contradiction, suppose that γ is orientation reversing for ∂M. Then ν2(0), . . . , νm(0) and ν2(1), . . . , νm(1) are oppositely oriented bases of Tγ(0)(∂M). But n(1) = n(0), so n(0), ν2(0), . . . , νm(0) and n(1), ν2(1), . . . , νm(1) are oppositely oriented bases of Tγ(0)M, which means that γ is orientation reversing for M, contrary to assumption.

The induced orientation of ∂M is, in fact, an orientation, if, for all γ and ν2, . . . , νm, ν2(0), . . . , νm(0) is a positively oriented ordered basis of Tγ(0)(∂M) if and only if ν2(1), . . . , νm(1) is a positively oriented ordered basis of Tγ(1)(∂M). But ν2(0), . . . , νm(0) is a positively oriented ordered basis of Tγ(0)(∂M) if and only if n(0), ν2(0), . . . , νm(0) is a positively oriented ordered basis of Tγ(0)M, and this is true if and only if n(1), ν2(1), . . . , νm(1) is a positively oriented ordered basis of Tγ(1)M, and in turn this is true if and only if ν2(1), . . . , νm(1) is a positively oriented ordered basis of Tγ(1)(∂M). □
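As a small worked instance of the induced orientation, take M to be the closed unit disk in R^2 with the orientation containing the standard basis (this example is an illustration and is not part of the text). At a boundary point p = (cos θ, sin θ) the outward unit normal is p itself, and the counterclockwise tangent vector is positively oriented for ∂M:

```latex
n_p = (\cos\theta, \sin\theta), \qquad
v = (-\sin\theta, \cos\theta) \in T_p(\partial M), \qquad
\det\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}
  = \cos^2\theta + \sin^2\theta = 1 > 0 .
```

By Corollary 12.1 the ordered basis (n_p, v) has the same orientation as the standard basis, so the induced orientation of the unit circle is the counterclockwise one.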

Now suppose that M and N are two m-dimensional oriented smooth manifolds, now without boundary, and that f : M → N is a smooth function. If p is a regular point of f, we say that f is orientation preserving at p if Df(p) maps positively oriented ordered bases of TpM to positively oriented ordered bases of Tf(p)N; otherwise f is orientation reversing at p.

We can generalize this. Suppose that M is an oriented m-dimensional smooth ∂-manifold, N is an oriented n-dimensional boundaryless manifold, P is an oriented (n − m)-dimensional submanifold of N, and f : M → N is a smooth map that is transversal to P. We say that f is positively oriented relative to P at a point p ∈ f−1(P) if

D f (p)v1, . . . , D f (p)vm, wm+1, . . . , wn

is a positively oriented ordered basis of Tf(p)N whenever v1, . . . , vm is a positively oriented ordered basis of TpM and wm+1, . . . , wn is a positively oriented ordered basis of Tf(p)P. It is easily checked that whether or not this is the case does not depend on the choice of positively oriented ordered bases v1, . . . , vm and wm+1, . . . , wn. When this is not the case we say that f is negatively oriented relative to P at p.

Now, in addition, suppose that f−1(P) is finite. The oriented intersection number I(f, P) is the number of points in f−1(P) at which f is positively oriented relative to P minus the number of points at which f is negatively oriented relative to P. An idea of critical importance for us is that, under natural and relevant conditions, this number is a homotopy invariant. This corresponds to the special case of the following result in which M is the cartesian product of an m-dimensional boundaryless manifold and [0, 1].

Theorem 12.1 Suppose that M is an (m + 1)-dimensional oriented smooth ∂-manifold, N is an n-dimensional smooth manifold, P is a compact (n − m)-dimensional smooth submanifold of N and f : M → N is a smooth function such that f and f|∂M are transversal to P, and f−1(P) is compact. Then

I ( f |∂ M , P) = 0 .

Proof Proposition 10.16 implies that f−1(P) is a neat smooth ∂-submanifold of M. Since f−1(P) is compact, it has finitely many connected components, and Proposition 10.17 implies that each of these is either a loop or a line segment. Recalling the definition of neatness, we see that the elements of f−1(P) ∩ ∂M are the endpoints of the line segments. Fix one of the line segments. It suffices to show that f|∂M is positively oriented relative to P at one endpoint and negatively oriented relative to P at the other.


The line segment is a smooth ∂-manifold, and there is a smooth path $\gamma : [0, 1] \to M$, with nonzero derivative everywhere, that traverses it. (Formally, $\gamma$ can be constructed by gluing together smooth parameterizations of open subsets, using a partition of unity.) Let $n(t) := \gamma'(t)$ for all $t$. Let $v_1, \ldots, v_m$ be an ordered basis for $T_{\gamma(0)}(\partial M)$ such that $n(0), v_1, \ldots, v_m$ is a positively oriented ordered basis of $T_{\gamma(0)}M$, and let $v'_1, \ldots, v'_m$ be a basis for $T_{\gamma(1)}(\partial M)$ such that $n(1), v'_1, \ldots, v'_m$ is a positively oriented ordered basis of $T_{\gamma(1)}M$.

Proposition 12.3 implies that there are vector fields $\nu_1, \ldots, \nu_m$ and $\nu'_1, \ldots, \nu'_m$ along $\gamma$ such that $n, \nu_1, \ldots, \nu_m$ and $n, \nu'_1, \ldots, \nu'_m$ are linearly independent, $\nu_1(0) = v_1, \ldots, \nu_m(0) = v_m$, and $\nu'_1(1) = v'_1, \ldots, \nu'_m(1) = v'_m$. Since $n(1), \nu_1(1), \ldots, \nu_m(1)$ and $n(1), v'_1, \ldots, v'_m$ have the same orientation, Proposition 12.5 implies that there are vector fields $\tilde\nu_1, \ldots, \tilde\nu_m$ along $\gamma$ such that $n, \tilde\nu_1, \ldots, \tilde\nu_m$ are linearly independent, $\tilde\nu_1(0) = v_1, \ldots, \tilde\nu_m(0) = v_m$, and $\tilde\nu_1(1) = v'_1, \ldots, \tilde\nu_m(1) = v'_m$. Proposition 12.3 implies that there are linearly independent vector fields $\omega_{m+1}, \ldots, \omega_n$ along $f \circ \gamma$.

For each $t$, $\omega_{m+1}(t), \ldots, \omega_n(t)$ span $T_{f(\gamma(t))}P$ and, by transversality,

$$Df(\gamma(t))n(t), \ Df(\gamma(t))\tilde\nu_1(t), \ldots, Df(\gamma(t))\tilde\nu_m(t), \ \omega_{m+1}(t), \ldots, \omega_n(t)$$

span $T_{f(\gamma(t))}N$, but $Df(\gamma(t))n(t) \in T_{f(\gamma(t))}P$, so

$$Df(\gamma(t))\tilde\nu_1(t), \ldots, Df(\gamma(t))\tilde\nu_m(t), \ \omega_{m+1}(t), \ldots, \omega_n(t)$$

span $T_{f(\gamma(t))}N$. Thus this is a positively oriented basis of $T_{f(\gamma(0))}N$ when $t = 0$ if and only if it is a positively oriented basis of $T_{f(\gamma(1))}N$ when $t = 1$. In addition, $\omega_{m+1}(0), \ldots, \omega_n(0)$ is a positively oriented basis of $T_{f(\gamma(0))}P$ if and only if $\omega_{m+1}(1), \ldots, \omega_n(1)$ is a positively oriented basis of $T_{f(\gamma(1))}P$. Neatness implies that $n(0)$ is inward pointing and $n(1)$ is outward pointing, so, according to the notion of induced orientation of Proposition 12.7, $v_1, \ldots, v_m$ is a negatively oriented basis of $T_{\gamma(0)}(\partial M)$ and $v'_1, \ldots, v'_m$ is a positively oriented basis of $T_{\gamma(1)}(\partial M)$. We conclude that $f|_{\partial M}$ is positively oriented relative to $P$ at $\gamma(0)$ if and only if it is negatively oriented relative to $P$ at $\gamma(1)$. □

12.5 The Degree

Let M and N be smooth m-dimensional oriented manifolds. For a compact C ⊂ M let $\partial C := C \cap \overline{M \setminus C}$ be the topological boundary of C.

Definition 12.1 A continuous function f : C → N with compact domain C ⊂ M is degree admissible over q ∈ N if

f −1(q) ∩ ∂C = ∅ .


If, in addition, f is smooth and q is a regular value of f, then f is smoothly degree admissible over q. Let D(M, N) be the set of pairs (f, q) in which f : C → N is a continuous function with compact domain C ⊂ M that is degree admissible over q ∈ N. Let D∞(M, N) be the set of (f, q) ∈ D(M, N) such that f is smoothly degree admissible over q.

The main idea is that if f is smoothly degree admissible over q, then the degree of f over q is the number of p ∈ f−1(q) at which f is orientation preserving minus the number of p ∈ f−1(q) at which f is orientation reversing. When f is merely degree admissible over q, its degree over q is the degree over q of nearby functions that are smoothly degree admissible over q. We need to require that f have no preimages of q in ∂C because small perturbations of f could either eliminate such preimages or move them into the interior of C.

In order for the definition of the degree over q to make sense, it must be the case that all sufficiently good smoothly degree admissible functions have the same degree over q. We will do this by showing that they are homotopic, and that the degree is preserved by suitable homotopies.

Definition 12.2 If C ⊂ M is compact, a homotopy h : C × [0, 1] → N is degree admissible over q if, for each t, ht is degree admissible over q. We say that h is smoothly degree admissible over q if, in addition, h is smooth and h0 and h1 are smoothly degree admissible over q.

The most useful characterization of the degree is axiomatic. We begin by characterizing the degree for functions that are smoothly degree admissible.

Proposition 12.8 There is a unique function $\deg^\infty : D^\infty(M, N) \to \mathbb{Z}$, taking $(f, q)$ to $\deg^\infty_q(f)$, such that:

(Δ1) $\deg^\infty_q(f) = 1$ for all $(f, q) \in D^\infty(M, N)$ such that $f^{-1}(q)$ is a singleton $\{p\}$ and $f$ is orientation preserving at $p$.

(Δ2) $\deg^\infty_q(f) = \sum_{i=1}^r \deg^\infty_q(f|_{C_i})$ whenever $(f, q) \in D^\infty(M, N)$, the domain of $f$ is $C$, and $C_1, \ldots, C_r$ are pairwise disjoint compact subsets of $C$ such that

$$f^{-1}(q) \subset (C_1 \setminus \partial C_1) \cup \ldots \cup (C_r \setminus \partial C_r) .$$

(Δ3) $\deg^\infty_q(h_0) = \deg^\infty_q(h_1)$ whenever $C \subset M$ is compact and the homotopy $h : C \times [0, 1] \to N$ is smoothly degree admissible over $q$.

Proof For $(f, q) \in D^\infty(M, N)$ the inverse function theorem implies that each $p \in f^{-1}(q)$ has a neighborhood that contains no other element of $f^{-1}(q)$, and since $C$ is compact it follows that $f^{-1}(q)$ is finite. Let $\deg^\infty_q(f)$ be the number of $p \in f^{-1}(q)$ at which $f$ is orientation preserving minus the number of $p \in f^{-1}(q)$ at which $f$ is orientation reversing.


Clearly $\deg^\infty$ satisfies (Δ1) and (Δ2). Suppose that $h : C \times [0, 1] \to N$ is smoothly degree admissible over $q$. Let $U := C \setminus \partial C$, and let $V$ be a neighborhood of $q$ such that for all $q' \in V$:

(a) $h^{-1}(q') \subset U \times [0, 1]$;
(b) $q'$ is a regular value of $h_0$ and $h_1$;
(c) $\deg^\infty_{q'}(h_0) = \deg^\infty_q(h_0)$ and $\deg^\infty_{q'}(h_1) = \deg^\infty_q(h_1)$.

Sard’s theorem implies that some $q' \in V$ is a regular value of $h$. In view of (a) we can apply Theorem 12.1, concluding that the degree of $h|_{\partial(U \times [0,1])} = h|_{U \times \{0,1\}}$ over $q'$ is zero. Since the orientation of $M \times \{0\}$ induced by $M \times [0, 1]$ is the opposite of the induced orientation of $M \times \{1\}$, this implies that $\deg^\infty_{q'}(h_0) - \deg^\infty_{q'}(h_1) = 0$, from which it follows that $\deg^\infty_q(h_0) = \deg^\infty_q(h_1)$. We have verified (Δ3).

It remains to demonstrate uniqueness. In view of (Δ2), this reduces to showing uniqueness for $(f, q) \in D^\infty(M, N)$ such that $f^{-1}(q) = \{p\}$ is a singleton. If $f$ is orientation preserving at $p$, this is a consequence of (Δ1), so we assume that $f$ is orientation reversing at $p$.

The constructions in the remainder of the proof are easy to understand, but tedious to elaborate in detail, so we only explain the main ideas. Using the path connectedness of each orientation (Proposition 12.4) and an obvious homotopy between an $f$ that has $p$ as a regular point and its linear approximation, with respect to some coordinate systems for the domain and range, one can show that (Δ3) implies that $\deg^\infty_q(f)$ does not depend on the particular orientation reversing $f$. Using one of the bump functions constructed after Lemma 10.1, one can easily construct a smooth homotopy $j : M \times [0, 1] \to M$ such that $j_0 = \mathrm{Id}_M$, each $j_t$ is a smooth diffeomorphism, and $j_1(p)$ is any point in some neighborhood of $p$. Applying (Δ3) to $h := f \circ j$, we find that $\deg^\infty_q(f)$ does not depend on which point (within some neighborhood of $p$) is mapped to $q$. The final construction is a homotopy between the given $f$ and a function $f'$ that has three preimages of $q$ near $p$, with $f'$ being orientation reversing at two of them and orientation preserving at the third. In view of the other conclusions we have reached, (Δ3) implies that

$$\deg^\infty_q(f) = 2 \deg^\infty_q(f) + 1 ,$$

which forces $\deg^\infty_q(f) = -1$. □

In preparation for the next result we show that deg∞ is continuous in a rather strong sense.

Proposition 12.9 If $C \subset M$ is compact, $f : C \to N$ is continuous, and $q \in N \setminus f(\partial C)$, then there are neighborhoods $Z \subset C(C, N)$ of $f$ and $V \subset N \setminus f(\partial C)$ of $q$ such that

$$\deg^\infty_{q'}(f') = \deg^\infty_{q''}(f'')$$

whenever $f', f'' \in Z \cap C^\infty(C, N)$, $q', q'' \in V$, $q'$ is a regular value of $f'$, and $q''$ is a regular value of $f''$.


Proof Let $V$ be an open disk in $N$ that contains $q$ with $\overline{V} \subset N \setminus f(\partial C)$. Then

$$Z' := \{\, f' \in C(C, N) : f'(\partial C) \subset N \setminus \overline{V} \,\}$$

is an open subset of $C(C, N)$, and Theorem 10.11 gives an open $Z \subset Z'$ containing $f$ such that for any $f', f'' \in Z \cap C^\infty(C, N)$ there is a smooth homotopy $h : C \times [0, 1] \to N$ with $h_0 = f'$, $h_1 = f''$, and $h_t \in Z'$ for all $t$, which implies that $h$ is a degree admissible homotopy, so (Δ3) implies that $\deg^\infty_{q'''}(f') = \deg^\infty_{q'''}(f'')$ whenever $q''' \in V$ is a regular value of both $f'$ and $f''$.

Since Sard’s theorem implies that such a $q'''$ exists, it now suffices to show that $\deg^\infty_{q'}(f') = \deg^\infty_{q''}(f')$ whenever $f' \in Z \cap C^\infty(C, N)$ and $q', q'' \in V$ are regular values of $f'$. Let $j : N \times [0, 1] \to N$ be a smooth function with the following properties:

(a) $j_0 = \mathrm{Id}_N$;
(b) each $j_t$ is a smooth diffeomorphism;
(c) $j(y, t) = y$ for all $y \in N \setminus V$ and all $t$;
(d) $j_1(q') = q''$.

(Construction of such a $j$, using the techniques of Sect. 10.2, is left as an exercise.) Clearly $j_t(q')$ is a regular value of $j_t \circ f'$ for all $t$, so the concrete characterization of $\deg^\infty$ implies that $\deg^\infty_{j_t(q')}(j_t \circ f')$ is locally constant as a function of $t$. Since the unit interval is connected, it follows that $\deg^\infty_{q'}(f') = \deg^\infty_{q''}(j_1 \circ f')$. On the other hand $j_t \circ f' \in Z'$ for all $t$, so the homotopy $(y, t) \mapsto j(f'(y), t)$ is smoothly degree admissible over $q''$, and (Δ3) implies that $\deg^\infty_{q''}(j_1 \circ f') = \deg^\infty_{q''}(f')$. □

The theory of the degree is completed by extending the degree to continuous functions, dropping the regularity condition.

Theorem 12.2 There is a unique function deg : D(M, N) → Z, taking (f, q) to degq(f), such that:

(D1) $\deg_q(f) = 1$ for all $(f, q) \in D(M, N)$ such that $f$ is smooth, $f^{-1}(q)$ is a singleton $\{p\}$, and $f$ is orientation preserving at $p$.

(D2) $\deg_q(f) = \sum_{i=1}^r \deg_q(f|_{C_i})$ whenever $(f, q) \in D(M, N)$, the domain of $f$ is $C$, and $C_1, \ldots, C_r$ are pairwise disjoint compact subsets of $C$ such that

$$f^{-1}(q) \subset C_1 \cup \ldots \cup C_r \setminus (\partial C_1 \cup \ldots \cup \partial C_r) .$$

(D3) For each compact $C \subset M$ and $q \in N$ the function $f \mapsto \deg_q(f)$ is a continuous (i.e., locally constant) function on $\{\, f \in C(C, N) : (f, q) \in D(M, N) \,\}$.

Proof We claim that if deg : D(M, N) → Z satisfies (D1)–(D3), then its restriction to D∞(M, N) satisfies (Δ1)–(Δ3). For (Δ1) and (Δ2) this is automatic. Suppose that C ⊂ M is compact and h : C × [0, 1] → N is a smoothly degree admissible homotopy over q. Such a homotopy may be regarded as a continuous function from [0, 1] to C(C, N). Therefore (D3) implies that degq(ht) is a locally constant function of t, and since [0, 1] is connected, it must be constant. Thus (Δ3) holds.

Theorem 11.4 implies that for any (f, q) ∈ D(M, N) the set of smooth f′ : M → N that have q as a regular value is dense at f. In conjunction with Proposition 12.9, this implies that the only possibility consistent with (D3) is to set $\deg_q(f) := \deg^\infty_{q'}(f')$ for (f′, q′) ∈ D∞(M, N) with f′ and q′ close to f and q. This establishes uniqueness, and Proposition 12.9 also implies that the definition is unambiguous. It is easy to see that (D1) and (D2) follow from (Δ1) and (Δ2), and (D3) is automatic. □

Since (D2) implies that the degree of f over q is the sum of the degrees of the restrictions of f to the various connected components of the domain of f, it makes sense to study the degree of the restriction of f to a single component. For this reason, when studying the degree one almost always assumes that M is connected. (In applications of the degree this may fail to be the case, of course.) The image of a connected set under a continuous mapping is connected, so if M is connected and f : M → N is continuous, its image is contained in one of the connected components of N. Therefore it also makes sense to assume that N is connected.

Recall that a map f : M → N is proper if f−1(C) is compact whenever C ⊂ N is compact. If this is the case and q ∈ N, then (D2) implies that degq(f|C) is the same for all compact neighborhoods C of f−1(q), and (D3) asserts that degq(f|C) is continuous as a function of q. Since Z has the discrete topology, this means that it is a locally constant function, so if N is connected, then it is in fact constant. When N is connected and f is proper we will simply write deg(f), and speak of the degree of f without any mention of a point in N.
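For merely continuous functions the degree can still be computed in the one dimensional case. The sketch below assumes M = N = R with the standard orientation and C = [a, b]; the straight-line homotopy to the affine map with the same boundary values is degree admissible, so (D1)–(D3) give the endpoint formula used in the code. The function names and the example map are illustrative.

```python
def sign(x):
    return (x > 0) - (x < 0)

def degree_on_interval(f, q, a, b):
    """Degree over q of a continuous f on C = [a, b], using only boundary values."""
    sa, sb = sign(f(a) - q), sign(f(b) - q)
    if sa == 0 or sb == 0:
        raise ValueError("f is not degree admissible over q: a preimage lies in the boundary")
    return (sb - sa) // 2

kinked = lambda x: abs(x) - 1.0   # continuous, but not smooth at 0

whole = degree_on_interval(kinked, 0.0, -2.0, 2.0)
left = degree_on_interval(kinked, 0.0, -2.0, -0.5)
right = degree_on_interval(kinked, 0.0, 0.5, 2.0)
```

The two preimages −1 and 1 of q = 0 contribute −1 and +1, so the degree on [−2, 2] is 0, in agreement with the additivity property (D2) applied to C1 = [−2, −0.5] and C2 = [0.5, 2].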

12.6 Composition and Cartesian Product

In Chap. 5 we emphasized restriction to a subdomain, composition, and cartesian products, as the basic set theoretic methods for constructing new functions from ones that are given. The behavior of the degree under restriction to a subdomain is already expressed by (D2), and in this section we study the behavior of the degree under composition and products. In both cases the result is given by multiplication, reflecting basic properties of the determinant.

Proposition 12.10 If M, N, and P are oriented m-dimensional smooth manifolds, C ⊂ M and D ⊂ N are compact, f : C → N and g : D → P are continuous, g is degree admissible over r ∈ P, and g−1(r) is contained in one of the connected components of N \ f(∂C), then for any q ∈ g−1(r) we have

degr (g ◦ f ) = degq( f ) × degr (g) .

Proof Since C∞(C, N) and C∞(D, P) are dense in C(C, N) and C(D, P) (Theorem 10.10) and composition is a continuous operation (Proposition 5.5) the continuity property (D3) of the degree implies that it suffices to prove the claim when f and g are smooth. Sard’s theorem implies that there are points arbitrarily near r that are regular values of both g and g ◦ f, and Proposition 12.9 implies that the relevant degrees are unaffected if r is replaced by such a point, so we may assume that r has these regularity properties.

For $q \in g^{-1}(r)$ let $s_g(q)$ be 1 or −1 according to whether $g$ is orientation preserving or orientation reversing at $q$. For $p \in (g \circ f)^{-1}(r)$ define $s_f(p)$ and $s_{g \circ f}(p)$ similarly. In view of the chain rule and the definition of orientation preservation and reversal, $s_{g \circ f}(p) = s_g(f(p))\, s_f(p)$. Therefore

$$\deg_r(g \circ f) = \sum_{p \in (g \circ f)^{-1}(r)} s_g(f(p))\, s_f(p) = \sum_{q \in g^{-1}(r)} s_g(q) \Big( \sum_{p \in f^{-1}(q)} s_f(p) \Big) = \sum_{q \in g^{-1}(r)} s_g(q) \deg_q(f) .$$

Since $g^{-1}(r)$ is contained in a single connected component of $N \setminus f(\partial C)$, Proposition 12.9 implies that $\deg_q(f)$ is the same for all $q \in g^{-1}(r)$, and $\sum_{q \in g^{-1}(r)} s_g(q) = \deg_r(g)$. □

The hypotheses of the last result are rather stringent, which makes it rather artificial. For topologists the following special case is the main point of interest.

Corollary 12.4 If M, N, and P are compact oriented m-dimensional smooth manifolds, N is connected, and f : M → N and g : N → P are continuous, then

deg(g ◦ f ) = deg( f ) × deg(g) .
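The multiplication rule can be checked for maps of the circle. In the sketch below, which is an illustration under stated assumptions, the circle is R/Z, a map is represented by a lift F : [0, 1] → R with F(1) − F(0) an integer, and the degree over the point q is the number of upward crossings of the levels q + k (k an integer) minus the number of downward crossings. The particular lifts and the name `circle_degree` are hypothetical choices for the example.

```python
import math

def circle_degree(lift, q=0.3, steps=4000):
    """Signed count of crossings of the level q (mod 1) by a lift of a circle map."""
    total = 0
    prev = math.floor(lift(0.0) - q)
    for k in range(1, steps + 1):
        cur = math.floor(lift(k / steps) - q)
        total += cur - prev   # +1 for an upward crossing, -1 for a downward one
        prev = cur
    return total

f_lift = lambda t: 2 * t + 0.1 * math.sin(2 * math.pi * t)     # wraps twice, forwards
g_lift = lambda x: -3 * x + 0.05 * math.sin(2 * math.pi * x)   # wraps three times, backwards
gf_lift = lambda t: g_lift(f_lift(t))                          # a lift of the composition

deg_f = circle_degree(f_lift)
deg_g = circle_degree(g_lift)
deg_gf = circle_degree(gf_lift)
```

With these lifts the degrees are 2 and −3, and the composition has degree −6.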

For cartesian products the situation is much simpler.

Proposition 12.11 Suppose that M and N are oriented m-dimensional smooth manifolds, M′ and N′ are oriented m′-dimensional smooth manifolds, C ⊂ M and C′ ⊂ M′ are compact, and f : C → N and f′ : C′ → N′ are degree admissible over q and q′ respectively. Then

deg(q,q ′)( f × f ′) = degq( f ) × degq ′( f ′) .

Proof For reasons explained in other proofs above, we may assume that $f$ and $f'$ are smooth and that $q$ and $q'$ are regular values of $f$ and $f'$. For $p \in f^{-1}(q)$ let $s_f(p)$ be 1 or −1 according to whether $f$ is orientation preserving or orientation reversing at $p$, and define $s_{f'}(p')$ for $p' \in f'^{-1}(q')$ similarly. Since the determinant of a block diagonal matrix is the product of the determinants of the blocks, $f \times f'$ is orientation preserving or orientation reversing at $(p, p')$ according to whether $s_f(p)\, s_{f'}(p')$ is positive or negative, so

$$\deg_{(q,q')}(f \times f') = \sum_{(p,p') \in (f \times f')^{-1}(q,q')} s_f(p)\, s_{f'}(p') = \sum_{p \in f^{-1}(q)} s_f(p) \times \sum_{p' \in f'^{-1}(q')} s_{f'}(p') = \deg_q(f) \cdot \deg_{q'}(f') .$$

□
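The bookkeeping in this proof can be mirrored in a few lines. The sketch below uses hypothetical data, namely the derivatives of two one dimensional maps at their preimages of q and q′, and checks that summing the signs of the block diagonal Jacobian determinants over all pairs of preimages gives the product of the two degrees. None of the numbers come from the text.

```python
from itertools import product

def sign(x):
    return (x > 0) - (x < 0)

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Hypothetical derivatives at the preimages of q (first map) and q' (second map).
first_derivatives = [2.0, -1.5, 0.5]
second_derivatives = [-3.0, -0.25]

deg_first = sum(sign(d) for d in first_derivatives)
deg_second = sum(sign(d) for d in second_derivatives)

# The Jacobian of the product map at a pair of preimages is block diagonal.
deg_product = sum(
    sign(det2([[d1, 0.0], [0.0, d2]]))
    for d1, d2 in product(first_derivatives, second_derivatives)
)
```

Here the degrees are 1 and −2, and the signed count over the six pairs of preimages is −2.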

Exercises

12.1 For $(v_1, \ldots, v_q) \in F^q$ and $(w_1, \ldots, w_q) \in O^q$, prove that

$$(w_1, \ldots, w_q) = \Gamma^q(v_1, \ldots, v_q)$$

if and only if there is a $q \times q$ lower triangular matrix $M = (a_{ij})$ with positive entries on the diagonal such that

$$M \begin{pmatrix} w_1 \\ \vdots \\ w_q \end{pmatrix} = \begin{pmatrix} v_1 \\ \vdots \\ v_q \end{pmatrix} .$$

Prove that $\Gamma^q$ is an open map.

12.2 If X is m-dimensional, endow the Grassmann manifold Gq with the structure of a q(m − q)-dimensional smooth manifold.

12.3 Let α : Sm → Sm be the antipodal map α(p) := −p where, as usual, Sm := { p = (p0, . . . , pm) ∈ Rm+1 : ‖p‖ = 1 } is the m-dimensional unit sphere.

(a) Prove that α is orientation preserving if and only if m is odd.

Real m-dimensional projective space is

Pm := { {p, α(p)} : p ∈ Sm }.

For i = 0, . . . , m let Ui := { p ∈ Sm : pi > 0 }, and let ϕi : Ui → Pm be the map ϕi(p) := {p, α(p)}. In the obvious sense these maps may be regarded as a C∞ atlas of parameterizations for Pm.

(b) Prove that Pm is orientable if and only if m is odd.

(c) When m is odd, what is deg(α)?

12.4 The Riemann sphere is the space S := C ∪ {∞} with a differentiable structure defined by the two parameterizations $\mathrm{Id}_{\mathbb{C}}$ and the map z ↦ 1/z (with 0 ↦ ∞). One can define a category of complex manifolds and holomorphic maps between them, for which this is the most elementary example, but we will ignore the complex structure (except that we use complex arithmetic in defining maps) and regard this as a 2-dimensional manifold. A rational polynomial is a ratio r(z) = p(z)/q(z) of two polynomials. Show that r may be regarded as a C∞ function r : S → S. What is its degree?

12.5 Let $\mathbb{Z}_2 := \{0, 1\}$ be the integers mod 2. Let M and N be (not necessarily orientable) smooth m-dimensional manifolds. If $(f, q) \in \mathcal{D}^\infty(M, N)$ and q is a regular value of f, let

\[ \deg^\infty_2(f, q) := |f^{-1}(q)| \bmod 2 . \]

(a) Prove that if h : C × [0, 1] → N is a smooth homotopy that is smoothly degree admissible over q, then $\deg^\infty_2(h_0, q) = \deg^\infty_2(h_1, q)$. (Prove this first with the additional assumption that q is a regular value of h.)

(b) Prove that there is a unique function $\deg_2 : \mathcal{D}(M, N) \to \mathbb{Z}_2$, taking (f, q) to $\deg_{2,q}(f)$, that satisfies (D1) (modified by removing the requirement that f be orientation preserving), (D2), and (D3).

Chapter 13 The Fixed Point Index

We now take up the theory of the fixed point index. Roughly, the index assigns an integer to each continuous function f : C → X where X is a “well enough behaved” space, C ⊂ X is compact, and f has no fixed points on the boundary of C. In a Euclidean setting, when f is smooth and all its fixed points are regular, the index is the number of fixed points at which f is “like” the constant function minus the number of fixed points at which it is not. In order to extend the theory to more general spaces and less well behaved functions, and to correspondences, we take an axiomatic approach.

Section 13.1 presents an axiom system for an index on a single Euclidean space. The Normalization axiom requires that the index of a constant function is 1. The Additivity axiom asserts that if the fixed points of f are contained in a finite union of disjoint compact subsets of C that do not have any fixed points on their boundary, then the index of f is the sum of the indices of the restrictions of f to these subdomains. The Continuity axiom requires that the index is unaffected by sufficiently small perturbations of the function. These axioms uniquely characterize the Euclidean index.

For continuous functions defined on compact subsets of Euclidean spaces this is no more than a different rendering of the theory of the degree. But while the degree is restricted to finite dimensional manifolds, the fixed point index extends to a much higher level of generality. Conceptually, the reason for this seems to be that the domain and the range of the function or correspondence necessarily have the same topology. Concretely, there is a property called Commutativity that equates the indices of the two compositions f′ ◦ f and f ◦ f′ where C ⊂ X and C′ ⊂ X′ are compact, f : C → C′ and f′ : C′ → C are continuous, and f′ ◦ f and f ◦ f′ do not have fixed points on the boundaries of C and C′ respectively. In Sect. 13.2 we show that the index for Euclidean spaces satisfies Commutativity, and that it also has a property called Multiplication which asserts that the index behaves naturally with respect to cartesian products.

The description of Commutativity above is too restrictive, because it is a local property of f and f′ near the fixed points of the two compositions, which does not require that f and f′ be defined on all of C and C′. In a more general formulation Commutativity requires that $f' \circ f|_E$ and $f \circ f'|_{E'}$ have the same index, where E ⊂ D ⊂ X, E′ ⊂ D′ ⊂ X′, f : D → X′ and f′ : D′ → X are continuous functions, f(E) ⊂ D′ and f′(E′) ⊂ D, and certain other conditions are satisfied. Working with such a formulation is possible, but quite cumbersome. A much more elegant approach is to recognize that Additivity implies that the index of a function depends only on the germ of the function at its set of fixed points, where this germ is (by definition) the equivalence class of functions that agree with the given function on some neighborhood of the set of fixed points. Section 13.3 defines the relevant space of germs, explains some basic properties, and uses these to reformulate the Euclidean index.

Section 13.4 extends the index to locally compact ANR’s. The key idea is that if X is such an ANR, C ⊂ X is compact, and f : C → X has no fixed points in the boundary of C, then the domination theorem (Theorem 8.4) can be used to approximate f near its set of fixed points with a composition of two functions, the first of which goes from C to a subset of a Euclidean space, while the second goes from that subset back to X. Commutativity requires that this composition have the same index as the composition in the other order, which is a function from a subset of a Euclidean space to that space. We may hope to use this phenomenon to define an index for ANR’s, and in fact (after many verifications) this does give a well defined index satisfying all the desired properties.

Theorem 9.1 implies that any upper hemicontinuous contractible valued correspondence from a compact subset C of an ANR X to a second ANR Y can be approximated by a continuous function. If an extension of the index to such correspondences satisfies Continuity, the index of the correspondence must agree with the index of a sufficiently nearby approximating function. As above, we take this condition as a definition of the extended index. Once again there are many verifications (the definition must make sense, and all axioms must be checked) but the result (which is given in Sect. 13.5) is our most general and flexible formulation of the fixed point index.

An argument applying the fact that the index is uniquely characterized by the axioms might construct a function, demonstrate that it satisfied all axioms, and conclude that it was in fact the index. But for the general index uniqueness pertains to the entire system of indices on all ANR’s, which makes such an argument quite cumbersome at best. This motivates an interest in results showing that for certain spaces the index is determined by the first three axioms (Normalization, Additivity, and Continuity). Section 13.6 gives such a result. In particular, the first three axioms suffice to determine the index for any finite simplicial complex.

13.1 A Euclidean Index

We develop the index in several phases. The first, carried out in this section, simply translates the degree into a tool for studying fixed points.

Recall that if X is a topological space and C ⊂ X, the topological boundary of C is $\partial C := \overline{C} \cap \overline{X \setminus C}$, and int C := C \ ∂C is the interior of C. An index admissible function for X is a continuous function f : C → X with compact domain C that has no fixed points in its boundary: $\mathcal{F}(f) \cap \partial C = \emptyset$. For each compact C ⊂ X let $\mathcal{I}^X_C$ be the set of such functions, and let $\mathcal{I}^X := \bigcup_C \mathcal{I}^X_C$ be the set of index admissible functions for X.

Proposition 13.1 For each finite dimensional vector space V there is a unique function $\Lambda : \mathcal{I}^V \to \mathbb{Z}$ satisfying

(I1) (Normalization¹) If c : C → V is a constant function whose value is an element of int C, then

\[ \Lambda(c) = 1 . \]

(I2) (Additivity) If f : C → V is an element of $\mathcal{I}^V$, $C_1, \ldots, C_r$ are pairwise disjoint compact subsets of C, and $\mathcal{F}(f) \subset \operatorname{int} C_1 \cup \ldots \cup \operatorname{int} C_r$, then

\[ \Lambda(f) = \sum_i \Lambda(f|_{C_i}) . \]

(I3) (Continuity) For each compact C ⊂ V, $\Lambda|_{C(C,V) \cap \mathcal{I}^V}$ is continuous (i.e., locally constant).

Concretely Λ(f) is given by

\[ \Lambda(f) = \deg_0(\mathrm{Id}_C - f) . \]

Proof Observe that if C ⊂ V is compact, then f : C → V is index admissible if and only if $\mathrm{Id}_C - f$ is degree admissible over the origin. Now (I1)–(I3) follow directly from (D1)–(D3).

To prove uniqueness suppose that $\Lambda_V$ is an index for $\mathcal{I}^V$. For $(g, q) \in \mathcal{D}(V, V)$ let

\[ d_q(g) := \Lambda_V(\mathrm{Id}_C - g + q) , \]

where C is the domain of g, so that the fixed points of $\mathrm{Id}_C - g + q$ are precisely the points of $g^{-1}(q)$. It is straightforward to show that d satisfies (D1)–(D3), so it must be the degree, and consequently $\Lambda_V = \Lambda$. ∎
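The three axioms and the formula Λ(f) = deg_0(Id_C − f) can be illustrated in dimension one. The sketch below assumes V = R and C = [a, b], where the degree of h = Id − f over 0 is determined by the signs of h at the endpoints, and where the index of a smooth f with regular fixed points is the sum of sgn(1 − f′(x)). The test function f(x) = x^3 and the helper names are hypothetical choices made for illustration.

```python
def sign(x):
    """Return 1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)

def index_by_fixed_points(df, fixed_points):
    """Sum of sgn(1 - f'(x)) over the (regular) fixed points x of f."""
    return sum(sign(1 - df(x)) for x in fixed_points)

def index_by_degree(f, a, b):
    """deg_0(Id - f) on [a, b], read off from the endpoint signs of Id - f."""
    h_a, h_b = a - f(a), b - f(b)
    assert h_a != 0 and h_b != 0, "fixed point on the boundary"
    return (sign(h_b) - sign(h_a)) // 2

f = lambda x: x ** 3          # fixed points -1, 0, 1
df = lambda x: 3 * x ** 2

total = index_by_degree(f, -2.0, 2.0)
by_points = index_by_fixed_points(df, [-1.0, 0.0, 1.0])

# Additivity: disjoint compact intervals whose interiors cover the fixed points.
pieces = [(-1.5, -0.5), (-0.4, 0.4), (0.5, 1.5)]
additive = sum(index_by_degree(f, a, b) for a, b in pieces)

# Continuity: a small perturbation leaves the index unchanged.
perturbed = index_by_degree(lambda x: x ** 3 + 0.01, -2.0, 2.0)

# Normalization: a constant function with value in the interior has index 1.
constant = index_by_degree(lambda x: 0.3, -2.0, 2.0)

print(total, by_points, additive, perturbed, constant)
```

Here the fixed points ±1 each contribute −1 and the fixed point 0 contributes +1, so the index of f on [−2, 2] is −1 by either computation.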

13.2 Multiplication and Commutativity

This section establishes two properties of the Euclidean index given in the last section. The first of these is fairly straightforward.

¹In the literature this condition is sometimes described as “Weak Normalization,” in contrast with a stronger condition defined in terms of homology.


Proposition 13.2 If V and V′ are finite dimensional vector spaces and $f \in \mathcal{I}^V$ and $f' \in \mathcal{I}^{V'}$, then $f \times f' \in \mathcal{I}^{V \times V'}$ and

\[ \Lambda(f \times f') = \Lambda(f) \cdot \Lambda(f') . \]
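Before the proof, a small numerical check may help fix ideas. The sketch assumes smooth one-dimensional maps with regular fixed points, for which Proposition 13.1 gives the index as the sum of the signs of |I − Df(x)| over the fixed points. The maps f(x) = x^3 and h(y) = 2y (in the role of f′) and the helper names are illustrative assumptions.

```python
def sign(x):
    """Return 1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)

def det2(m):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# f(x) = x^3 on C = [-2, 2]; h(y) = 2y on C' = [-1, 1] (hypothetical maps).
df = lambda x: 3 * x ** 2
dh = lambda y: 2.0
fixed_f = [-1.0, 0.0, 1.0]
fixed_h = [0.0]

index_f = sum(sign(1 - df(x)) for x in fixed_f)
index_h = sum(sign(1 - dh(y)) for y in fixed_h)

def identity_minus_jacobian(x, y):
    """I - D(f x h)(x, y), which is block diagonal."""
    return [[1 - df(x), 0.0], [0.0, 1 - dh(y)]]

# The fixed points of f x h are the pairs (x, y) of fixed points.
index_product = sum(
    sign(det2(identity_minus_jacobian(x, y)))
    for x in fixed_f
    for y in fixed_h
)

print(index_f, index_h, index_product)
```

Both factors have index −1 in this example, and the product map has index +1, in agreement with the stated formula.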

Proof Let C and C′ be the domains of f and f′. Evidently $\mathcal{F}(f \times f') = \mathcal{F}(f) \times \mathcal{F}(f')$, so if $\mathcal{F}(f)$ and $\mathcal{F}(f')$ are contained in the respective interiors of C and C′, then $\mathcal{F}(f \times f')$ is contained in the interior of C × C′. Thus $f \times f' \in \mathcal{I}^{V \times V'}$.

Lemma 5.7 implies that the function (f, f′) ↦ f × f′ is continuous, and the Euclidean index satisfies (I3) on each of the spaces involved, so it suffices to prove the claim after replacing f and f′ with sufficiently nearby continuous functions. Since the smooth functions are dense in C(C, V) and C(C′, V′) (Proposition 10.2) we may assume that f and f′ are smooth. In addition, Sard’s theorem implies that the regular values of $\mathrm{Id}_V - f$ are dense, so after perturbing f by adding an arbitrarily small constant, we can make it the case that 0 is a regular value. Similarly, we may arrange for 0 to be a regular value of $\mathrm{Id}_{V'} - f'$.

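For $x \in \mathcal{F}(f)$ and $x' \in \mathcal{F}(f')$ elementary properties of the determinant give

\[ |I - D(f \times f')(x, x')| = |I - Df(x)| \cdot |I - Df'(x')| . \]

Therefore

\[
\begin{aligned}
\Lambda(f \times f') &= \sum_{(x,x') \in \mathcal{F}(f \times f')} \operatorname{sgn}(|I - D(f \times f')(x, x')|) \\
&= \sum_{x \in \mathcal{F}(f)} \sum_{x' \in \mathcal{F}(f')} \operatorname{sgn}(|I - Df(x)|) \cdot \operatorname{sgn}(|I - Df'(x')|) \\
&= \Bigl( \sum_{x \in \mathcal{F}(f)} \operatorname{sgn}(|I - Df(x)|) \Bigr) \cdot \Bigl( \sum_{x' \in \mathcal{F}(f')} \operatorname{sgn}(|I - Df'(x')|) \Bigr) \\
&= \Lambda(f) \cdot \Lambda(f') .
\end{aligned}
\]

∎

The second property, which is known as Commutativity, is superficially similar.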

Proposition 13.3 Suppose that V and V′ are finite dimensional vector spaces, B ⊂ V and B′ ⊂ V′ are compact, C ⊂ V and C′ ⊂ V′ are compact with B ⊂ C and B′ ⊂ C′, and $g \in C^\infty(C, V')$ and $g' \in C^\infty(C', V)$. If g(B) ⊂ C′ and g′(B′) ⊂ C, $g' \circ g|_B \in \mathcal{I}^V_\infty$, and $g \circ g'|_{B'} \in \mathcal{I}^{V'}_\infty$, then

\[ \Lambda(g' \circ g|_B) = \Lambda(g \circ g'|_{B'}) . \]

While it might seem natural to guess that this is the case, the proof involves a nontrivial fact of linear algebra that was not known before the subject was developed in the late 1940’s. The doctoral thesis of Browder (1948) showed both that this property of the index is quite general, and that it can be used to extend index theory to very general spaces. While this is, perhaps, the most important consequence of this property of the index, the property also comes up frequently in arguments involving the index. We will first prove this result, then turn to developing the idea more generally, in order to use it to develop the index for ANR’s.

Proposition 13.4 (Jacobson (1953) pp. 103–106) Suppose K : V → W and L : W → V are linear transformations, where V and W are vector spaces of dimensions m and n respectively over an arbitrary field. Suppose m ≤ n. Then the characteristic polynomials² $\kappa_{KL}$ and $\kappa_{LK}$ of KL and LK are related by the equation $\kappa_{KL}(\lambda) = \lambda^{n-m} \kappa_{LK}(\lambda)$. In particular,

\[ \kappa_{LK}(1) = |\mathrm{Id}_V - LK| = |\mathrm{Id}_W - KL| = \kappa_{KL}(1) . \]
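The identity is easy to test on a concrete pair of matrices. The following sketch uses exact integer arithmetic with m = 2 and n = 3. The particular matrices K and L and the helper functions are arbitrary illustrative choices, and the determinant is computed by cofactor expansion, which is adequate only for small matrices.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def char_poly_at(m, lam):
    """Value of the characteristic polynomial |lam * I - m| at lam."""
    n = len(m)
    return det([[(lam if i == j else 0) - m[i][j] for j in range(n)]
                for i in range(n)])

# K : V -> W is 3 x 2 and L : W -> V is 2 x 3, so m = dim V = 2, n = dim W = 3.
K = [[1, 2], [0, 1], [3, -1]]
L = [[2, 0, 1], [-1, 1, 4]]
KL = matmul(K, L)   # acts on W, 3 x 3
LK = matmul(L, K)   # acts on V, 2 x 2
n_minus_m = len(KL) - len(LK)

# kappa_KL(lam) = lam^(n-m) * kappa_LK(lam) at several integer points.
relation_holds = all(
    char_poly_at(KL, lam) == lam ** n_minus_m * char_poly_at(LK, lam)
    for lam in range(-3, 4)
)

print(relation_holds, char_poly_at(LK, 1), char_poly_at(KL, 1))
```

Evaluating at λ = 1 gives the special case |Id_V − LK| = |Id_W − KL| that is used in the proof of Commutativity.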

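Proof We can decompose V and W as direct sums $V = V_1 \oplus V_2 \oplus V_3 \oplus V_4$ and $W = W_1 \oplus W_2 \oplus W_3 \oplus W_4$ where

\[ V_1 = \ker K \cap \operatorname{im} L , \qquad V_1 \oplus V_2 = \operatorname{im} L , \qquad V_1 \oplus V_3 = \ker K , \]

and similarly for W. With suitably chosen bases the matrices of K and L have the forms

\[
\begin{bmatrix}
0 & K_{12} & 0 & K_{14} \\
0 & K_{22} & 0 & K_{24} \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{bmatrix}
\quad \text{and} \quad
\begin{bmatrix}
0 & L_{12} & 0 & L_{14} \\
0 & L_{22} & 0 & L_{24} \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{bmatrix} .
\]

Computing the product of these matrices, we find that

\[
\kappa_{KL}(\lambda) =
\begin{vmatrix}
\lambda I & -K_{12}L_{22} & 0 & -K_{12}L_{24} \\
0 & \lambda I - K_{22}L_{22} & 0 & -K_{22}L_{24} \\
0 & 0 & \lambda I & 0 \\
0 & 0 & 0 & \lambda I
\end{vmatrix} .
\]

Using elementary facts about determinants, this reduces to $\kappa_{KL}(\lambda) = \lambda^{n-k} |\lambda I - K_{22}L_{22}|$, where $k = \dim V_2 = \dim W_2$. In effect this reduces the proof to the special case $V_2 = V$ and $W_2 = W$, i.e. K and L are isomorphisms. But this case follows from the computation

\[ |\lambda \mathrm{Id}_V - LK| = |L^{-1}| \cdot |\lambda \mathrm{Id}_V - LK| \cdot |L| = |L^{-1}(\lambda \mathrm{Id}_V - LK)L| = |\lambda \mathrm{Id}_W - KL| . \]

∎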

²When V is a finite dimensional vector space and L : V → V is a linear transformation, $\kappa_L(t) := |t\,\mathrm{Id}_V - L|$ is the characteristic polynomial of L.


Proof of Proposition 13.3 Proposition 5.5 implies that g′ ◦ g and g ◦ g′ are continuous functions of (g, g′), and the Euclidean indices for V and V′ satisfy (I3), so it suffices to prove the claim after replacing g and g′ with sufficiently nearby continuous functions. Since the smooth functions are dense in C(C, V′) and C(C′, V) (Proposition 10.2) we may assume that g and g′ are smooth. In addition, Sard’s theorem implies that the regular values of $\mathrm{Id}_V - g' \circ g$ are dense, so after perturbing g′ by adding an arbitrarily small constant, we can make it the case that 0 is a regular value. In the same way we can add a small constant to g to make 0 a regular value of $\mathrm{Id}_{V'} - g \circ g'$, and if the constant is small enough it will still be the case that 0 is a regular value of $\mathrm{Id}_V - g' \circ g$.

Evidently $g(\mathcal{F}(g' \circ g|_B)) = \mathcal{F}(g \circ g'|_{B'})$ and $g'(\mathcal{F}(g \circ g'|_{B'})) = \mathcal{F}(g' \circ g|_B)$. The fixed points of $g' \circ g|_B$ are isolated and therefore finite in number. Let them be $x_1, \ldots, x_r$, and for each i let $x'_i := g(x_i)$. Then $x'_1, \ldots, x'_r$ are the fixed points of $g \circ g'|_{B'}$. For each i Proposition 13.4 gives

\[ |I - D(g' \circ g)(x_i)| = |I - Dg'(x'_i)\,Dg(x_i)| = |I - Dg(x_i)\,Dg'(x'_i)| = |I - D(g \circ g')(x'_i)| . \]

Therefore each $x'_i$ is a regular fixed point of $g \circ g'|_{B'}$, and applying Additivity to sum over i gives the desired equality. ∎
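The argument can be traced on an example in which V and V′ have different dimensions. The sketch below assumes V = R and V′ = R², with the hypothetical maps g(x) = (x, x²) and g′(y₁, y₂) = 2y₁ − y₂, chosen so that g′ ∘ g has the regular fixed points 0 and 1. It compares the determinants |I − D(g′ ∘ g)(x_i)| and |I − D(g ∘ g′)(x′_i)| point by point, then sums their signs.

```python
def sign(x):
    """Return 1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)

def det2(m):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def g(x):
    """g : R -> R^2 (hypothetical example)."""
    return (x, x * x)

def g_prime(y):
    """g' : R^2 -> R (hypothetical example)."""
    return 2 * y[0] - y[1]

def d_gpg(x):
    """Derivative of g' o g, which is x -> 2x - x^2."""
    return 2 - 2 * x

def jac_ggp(y):
    """Jacobian of g o g' at y: the product Dg(s) Dg' with s = g'(y)."""
    s = g_prime(y)
    return [[2, -1], [4 * s, -2 * s]]

fixed_V = [0, 1]                          # fixed points of g' o g
fixed_Vp = [g(x) for x in fixed_V]        # their images are fixed by g o g'

dets_V = [1 - d_gpg(x) for x in fixed_V]
dets_Vp = []
for y in fixed_Vp:
    j = jac_ggp(y)
    dets_Vp.append(det2([[1 - j[0][0], -j[0][1]], [-j[1][0], 1 - j[1][1]]]))

index_V = sum(sign(d) for d in dets_V)
index_Vp = sum(sign(d) for d in dets_Vp)

print(dets_V, dets_Vp, index_V, index_Vp)
```

The determinants agree at each pair of corresponding fixed points, as Proposition 13.4 predicts, so the two indices agree on any pair of admissible domains containing corresponding fixed points.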

13.3 Germs

Earlier we mentioned that the index is a local concept, insofar as it depends only on the restriction of the function to an arbitrarily small neighborhood of its set of fixed points. Up to this point this has been only a minor conceptual imperfection, but not recognizing it explicitly would soon lead to unfortunate complications. Some of these relate to Commutativity, which in many texts is expressed less generally, with g(C) ⊂ C′ and g′(C′) ⊂ C. Our formulation is more accurate, but also rather cumbersome. In particular, in extending the index to more general settings it would be quite tedious to show that the desired condition held in every instance of the setup laid out in the hypotheses of Proposition 13.3. To avoid these difficulties we introduce a concept from mathematics that may seem like a bit of abstraction for its own sake. In this case, however, it will be both simplifying and conceptually clarifying.

Let X and Y be topological spaces, and let A ⊂ X be compact. Two continuous functions f : U → Y and f′ : U′ → Y defined on neighborhoods of A have the same germ at A if there is a neighborhood V of A such that V ⊂ U ∩ U′ and $f|_V = f'|_V$. This is easily seen to be an equivalence relation, and its equivalence classes are called germs of continuous functions from X to Y at A. Let $g_A(X, Y)$ be the set of such germs. For f as above let $\gamma_A(f)$ be its equivalence class, which is called the germ of f at A. We say that f is a representative of $\gamma_A(f)$. If $\gamma \in g_A(X, Y)$, then A is the domain of γ.

If Z ⊂ A is compact, then a function f defined in a neighborhood of A is also defined in a neighborhood of Z. Clearly $\gamma_Z(f)$ depends only on $\gamma_A(f)$, so there is a function from $g_A(X, Y)$ to $g_Z(X, Y)$. For $\gamma \in g_A(X, Y)$ let $\gamma|_Z$ denote the associated element of $g_Z(X, Y)$.

For any $\gamma \in g_A(X, Y)$ there is an associated function from A to Y, and we will often abuse notation by letting γ denote this function. Suppose that B ⊂ Y is compact and γ(A) ⊂ B. If Z is a third topological space and $\eta \in g_B(Y, Z)$, the composition η ◦ γ of γ and η is the germ of g ◦ f where f : U → Y and g : V → Z are representatives of γ and η with f(U) ⊂ V. (Proving that this definition is independent of the choice of representatives is an easy exercise.)

Suppose that, in addition to X, Y, and A, X′ and Y′ are topological spaces and A′ ⊂ X′ is compact. For $\gamma \in g_A(X, Y)$ and $\gamma' \in g_{A'}(X', Y')$ with representatives f and f′ we define γ × γ′ to be the germ of the function f × f′, (x, x′) ↦ (f(x), f′(x′)). This definition is unambiguous because any neighborhood of A × A′ contains some U × U′ where U and U′ are neighborhoods of A and A′. Again it is easy to show that this definition is independent of the choice of representatives.

A germ $\gamma \in g_A(X, X)$ is index admissible if there is an $f \in \mathcal{I}^X$ with $\gamma_A(f) = \gamma$ and $\mathcal{F}(f) \subset A$. If this is the case (but not when γ is inadmissible) we let $\mathcal{F}(\gamma) := \mathcal{F}(f)$. Let $\mathcal{G}^X$ be the set of index admissible germs at compact subsets of X. Let $\pi_X : \mathcal{I}^X \to \mathcal{G}^X$ be the function $\pi_X(f) = \gamma_{\mathcal{F}(f)}(f)$.

For each finite dimensional vector space V we define a function $\Lambda : \mathcal{G}^V \to \mathbb{Z}$ implicitly by setting

\[ \Lambda(\pi_V(f)) := \Lambda(f) . \]

Since $\pi_V$ is surjective and $\Lambda : \mathcal{I}^V \to \mathbb{Z}$ satisfies Additivity, this makes sense.

Proposition 13.5 For each finite dimensional vector space V, Λ is the unique function from $\mathcal{G}^V$ to $\mathbb{Z}$ satisfying:

(I1) (Normalization) For all x ∈ V, if $c_x : V \to V$ is the constant function with value x, then

\[ \Lambda(\gamma_{\{x\}}(c_x)) = 1 . \]

(I2) (Additivity) If $\gamma \in \mathcal{G}^V$ has domain A, $A_1, \ldots, A_r$ are pairwise disjoint compact subsets of A, and $\mathcal{F}(\gamma) \subset A_1 \cup \ldots \cup A_r$, then

\[ \Lambda(\gamma) = \sum_i \Lambda(\gamma|_{A_i}) . \]

(I3) (Continuity) For each compact C ⊂ V, $\Lambda \circ \pi_V|_{\mathcal{I}^V_C}$ is continuous.

In addition these functions satisfy:

(I4) (Commutativity) If A ⊂ V and A′ ⊂ V′ are compact, $\gamma \in g_A(V, V')$ and $\gamma' \in g_{A'}(V', V)$ with γ(A) ⊂ A′ and γ′(A′) ⊂ A, and γ′ ◦ γ and γ ◦ γ′ are index admissible, then

\[ \Lambda(\gamma' \circ \gamma) = \Lambda(\gamma \circ \gamma') . \]

(I5) (Multiplication) If $\gamma \in \mathcal{G}^V$ and $\gamma' \in \mathcal{G}^{V'}$, then

\[ \Lambda(\gamma \times \gamma') = \Lambda(\gamma) \cdot \Lambda(\gamma') . \]

Note that our notation does not distinguish between Λ as a function with domain $\mathcal{I}^V$ and as a function with domain $\mathcal{G}^V$. There will be other such abuses of notation below. In each case it will be clear, due to the uniqueness assertions in the various results, that the two indices are related either by virtue of one being the restriction of the other to a subdomain, or due to some relation such as $\Lambda(f) = \Lambda(\gamma_{\mathcal{F}(f)}(f))$ which relates the two indices for V.

Proof That (I1)–(I3) hold for $\Lambda : \mathcal{G}^V \to \mathbb{Z}$ follows immediately from the fact that $\Lambda : \mathcal{I}^V \to \mathbb{Z}$ satisfies the analogous conditions. Furthermore, if $\Lambda_V : \mathcal{G}^V \to \mathbb{Z}$ satisfies (I1)–(I3) and we define $\Lambda_V : \mathcal{I}^V \to \mathbb{Z}$ by setting $\Lambda_V(f) = \Lambda_V(\pi_V(f))$, then this function satisfies (I1)–(I3) of Proposition 13.1. Therefore the uniqueness assertion of that result implies uniqueness here.

Let A, A′, γ, and γ′ satisfy the hypotheses of (I4). Let g and g′ be representatives of γ and γ′. Let C and C′ be the domains of g and g′, and let B and B′ be compact neighborhoods of A and A′ that are contained in $g^{-1}(C')$ and $g'^{-1}(C)$ respectively. Since γ′ ◦ γ and γ ◦ γ′ are index admissible, after replacing B and B′ with smaller compact neighborhoods we have $\mathcal{F}(g' \circ g|_B) \subset A$ and $\mathcal{F}(g \circ g'|_{B'}) \subset A'$, which implies that $g(\mathcal{F}(g' \circ g|_B)) = \mathcal{F}(g \circ g'|_{B'})$ and $g'(\mathcal{F}(g \circ g'|_{B'})) = \mathcal{F}(g' \circ g|_B)$. We now have

\[ \Lambda(\gamma' \circ \gamma) = \Lambda(g' \circ g|_B) \quad \text{and} \quad \Lambda(\gamma \circ \gamma') = \Lambda(g \circ g'|_{B'}) \]

by definition, and Proposition 13.3 implies the desired equality.

To prove Multiplication let γ and γ′ be elements of $\mathcal{G}^V$ and $\mathcal{G}^{V'}$. Since γ and γ′ are index admissible they have representatives g and g′ with $\mathcal{F}(g) = \mathcal{F}(\gamma)$ and $\mathcal{F}(g') = \mathcal{F}(\gamma')$. Now

\[ \Lambda(\gamma \times \gamma') = \Lambda(g \times g') = \Lambda(g) \cdot \Lambda(g') = \Lambda(\gamma) \cdot \Lambda(\gamma') . \]

∎

13.4 Extension to ANR’s

We now extend the index to quite general spaces. Let ANR be the class of locally compact ANR’s. The proof of the following is lengthy, but the basic idea is simple. Given X ∈ ANR and $\gamma \in \mathcal{G}^X$ with domain A, there is a compact neighborhood C ⊂ X of A and an f : C → X such that $\gamma_A(f) = \gamma$. For any ε > 0 the domination theorem (Theorem 8.4) gives an open subset U of a finite dimensional vector space and maps ϕ : C → U and ψ : U → X such that ψ ◦ ϕ and $\mathrm{Id}_C$ are ε-homotopic. For some compact neighborhood B ⊂ C of $\mathcal{F}(\gamma)$, if ε is sufficiently small, then Commutativity forces (13.1), and the bulk of the proof verifies that this equation gives a satisfactory definition of $\Lambda_X$.

Theorem 13.1 There is a unique system of functions $\Lambda_X : \mathcal{G}^X \to \mathbb{Z}$ for X ∈ ANR satisfying:

(I1) (Normalization) For all X ∈ ANR and x ∈ X, if $c_x : X \to X$ is the constant function with value x, then

\[ \Lambda_X(\gamma_{\{x\}}(c_x)) = 1 . \]

(I2) (Additivity) If X ∈ ANR, $\gamma \in \mathcal{G}^X$ has domain A, and $A_1, \ldots, A_r$ are pairwise disjoint compact subsets of A with $\mathcal{F}(\gamma) \subset A_1 \cup \ldots \cup A_r$, then

\[ \Lambda_X(\gamma) = \sum_i \Lambda_X(\gamma|_{A_i}) . \]

(I3) (Continuity) For each X ∈ ANR and compact C ⊂ X, $\Lambda_X \circ \pi_X|_{C(C,X) \cap \mathcal{I}^X}$ is continuous (i.e., locally constant).

(I4) (Commutativity) If X, X′ ∈ ANR, A ⊂ X and A′ ⊂ X′ are compact, $\gamma \in g_A(X, X')$ and $\gamma' \in g_{A'}(X', X)$ with γ(A) ⊂ A′ and γ′(A′) ⊂ A, and γ′ ◦ γ and γ ◦ γ′ are index admissible, then

\[ \Lambda_X(\gamma' \circ \gamma) = \Lambda_{X'}(\gamma \circ \gamma') . \]

(I5) (Multiplication) If X, X′ ∈ ANR, $\gamma \in \mathcal{G}^X$, and $\gamma' \in \mathcal{G}^{X'}$, then

\[ \Lambda_{X \times X'}(\gamma \times \gamma') = \Lambda_X(\gamma) \cdot \Lambda_{X'}(\gamma') . \]

Proof It will be visually clearer if we write Λ for the index for finite dimensional vector spaces. When it is necessary to save space we will write compositions of functions multiplicatively, omitting the symbol ◦.

Consider X ∈ ANR and $\gamma \in \mathcal{G}^X$ with domain A. Since X is locally compact and γ is index admissible, there is an $f \in \mathcal{I}^X$ whose domain C is a compact neighborhood of A such that $\gamma_A(f) = \gamma$ and $\mathcal{F}(f) \subset A$. For any ε > 0 Theorem 8.4 gives a finite dimensional vector space V and an open U ⊂ V whose closure $\overline{U}$ is compact and ε-dominates C by virtue of maps ϕ : C → U and ψ : U → X and an ε-homotopy η : C × [0, 1] → X with $\eta_0 = \mathrm{Id}_C$ and $\eta_1 = \psi \circ \varphi$. Let B be a compact neighborhood of $\mathcal{F}(\gamma)$ such that f(B) is contained in the interior of C. Roughly, the idea is to set

\[ \Lambda_X(\gamma) := \Lambda(\pi_V(\varphi \circ f \circ \psi|_{\psi^{-1}(B)})) . \tag{13.1} \]

In order for this to work we need ε to be small enough that the open ε-ball around B is contained in C, so that $f \circ \eta_t|_B$ is defined for all t, and for all x ∈ ∂B the distance from x to f(x) is at least ε, which implies that f ◦ η is index admissible.


In fact the equation above must be satisfied by any system of functions $\Lambda_X$ satisfying (I1)–(I5). Additivity and Continuity give

\[ \Lambda_X(\gamma) = \Lambda_X(\pi_X(f)) = \Lambda_X(\pi_X(f|_B)) = \Lambda_X(\pi_X(f \circ \psi \circ \varphi|_B)) . \]

Let $A := \mathcal{F}(f \circ \psi \circ \varphi|_B)$ and $A' := \psi^{-1}(A)$. We have

\[ \pi_X(f \circ \psi \circ \varphi|_B) = \gamma_A(f \circ \psi \circ \varphi|_B) = \gamma_{A'}(f \circ \psi|_{\psi^{-1}(C)}) \circ \gamma_A(\varphi|_B) . \]

Therefore Commutativity implies that

\[ \Lambda_X(\gamma) = \Lambda_V(\gamma_A(\varphi|_B) \circ \gamma_{A'}(f \circ \psi|_{\psi^{-1}(C)})) = \Lambda_V(\gamma_{A'}(\varphi \circ f \circ \psi|_{\psi^{-1}(C)})) . \]

Since Λ, in its application to V, is the unique function satisfying (I1)–(I3), we arrive at (13.1). At this point we have shown that there is at most one system of functions $\Lambda_X$ satisfying (I1)–(I5).

The detailed argument has a small wrinkle, arising from the need to show that our definition does not depend on the choices of B, ε, V, U, ϕ, ψ, and η. Our formal definition of the index is (13.1) when, in addition to the conditions described above, ε is small enough that:

(a) $f(U_\varepsilon(B)) \subset C$;
(b) x ≠ y whenever x ∈ ∂B, $x' \in U_\varepsilon(x)$, and $y \in U_\varepsilon(f(x'))$;
(c) $U_\varepsilon(f(B)) \subset C$;
(d) x ≠ y whenever x ∈ ∂B and $y \in U_{2\varepsilon}(f(x))$.

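Holding B and ε fixed, suppose that $V_1$, $U_1$, $\varphi_1$, $\psi_1$, $\eta_1$ and $V_2$, $U_2$, $\varphi_2$, $\psi_2$, $\eta_2$ are two different choices of the data in question, and write $\eta_{i,t}$ for the map $\eta_i(\cdot, t)$. For all t ∈ [0, 1], (a) implies that $f(\eta_{2,t}(B)) \subset C$. Therefore $\varphi_1 f \eta_{2,t} \psi_1|_{\psi_1^{-1}(B)}$ is defined. If z is a fixed point of $\varphi_1 f \eta_{2,t} \psi_1|_{\psi_1^{-1}(B)}$, then $\psi_1(z)$ is a fixed point of $\psi_1 \varphi_1 f \eta_{2,t}|_B$, and (b) (with $x = \psi_1(z)$, $x' = \eta_{2,t}(x)$, and $y = \psi_1(\varphi_1(f(x')))$) implies that $\psi_1(z) \notin \partial B$, so $z \notin \partial \psi_1^{-1}(B)$. Therefore $t \mapsto \varphi_1 f \eta_{2,t} \psi_1|_{\psi_1^{-1}(B)}$ is an index admissible homotopy. Applying Continuity gives

\[ \Lambda(\pi_{V_1}(\varphi_1 f \psi_1|_{\psi_1^{-1}(B)})) = \Lambda(\pi_{V_1}(\varphi_1 f \psi_2 \varphi_2 \psi_1|_{\psi_1^{-1}(B)})) . \]

We have $\psi_2(\varphi_2(B)) \subset U_\varepsilon(B) \subset C$, so $\varphi_2(B) \subset \psi_2^{-1}(C)$ and

\[ \varphi_1 f \psi_2 \varphi_2 \psi_1|_{\psi_1^{-1}(B)} = \varphi_1 f \psi_2|_{\psi_2^{-1}(C)} \circ \varphi_2 \psi_1|_{\psi_1^{-1}(B)} . \]

Let $A := \mathcal{F}(\varphi_1 f \psi_2 \varphi_2 \psi_1|_{\psi_1^{-1}(B)})$ and $A' := \varphi_2(\psi_1(A))$. Since $\varphi_2 \psi_1$ is defined near A and $\varphi_1 f \psi_2$ is defined near A′, we have

\[ \pi_{V_1}(\varphi_1 f \psi_2|_{\psi_2^{-1}(C)} \circ \varphi_2 \psi_1|_{\psi_1^{-1}(B)}) = \gamma_A(\varphi_1 f \psi_2 \circ \varphi_2 \psi_1) = \gamma_{A'}(\varphi_1 f \psi_2) \circ \gamma_A(\varphi_2 \psi_1) . \]

Therefore Commutativity gives

\[ \Lambda(\pi_{V_1}(\varphi_1 f \psi_1|_{\psi_1^{-1}(B)})) = \Lambda(\gamma_A(\varphi_2 \psi_1) \circ \gamma_{A'}(\varphi_1 f \psi_2)) = \Lambda(\gamma_{A'}(\varphi_2 \psi_1 \varphi_1 f \psi_2)) . \]

We have $A' \subset \psi_2^{-1}(B)$ and $\psi_1(\varphi_1(f(B))) \subset \psi_1(\varphi_1(U_\varepsilon(B))) \subset U_{2\varepsilon}(B) \subset C$, so

\[ \gamma_{A'}(\varphi_2 \psi_1 \varphi_1 f \psi_2) = \pi_{V_2}(\varphi_2 \psi_1 \varphi_1 f \psi_2|_{\psi_2^{-1}(B)}) . \]

For all t ∈ [0, 1], (c) implies that $\eta_{1,t}(f(B)) \subset C$, so $\varphi_2 \eta_{1,t} f \psi_2|_{\psi_2^{-1}(B)}$ is defined. If y is a fixed point of $\varphi_2 \eta_{1,t} f \psi_2|_{\psi_2^{-1}(B)}$, then $\psi_2(y)$ is a fixed point of $\psi_2 \varphi_2 \eta_{1,t} f|_B$, and (d) implies that $\psi_2(y) \notin \partial B$, so $y \notin \partial \psi_2^{-1}(B)$. Therefore $t \mapsto \varphi_2 \eta_{1,t} f \psi_2|_{\psi_2^{-1}(B)}$ is an index admissible homotopy. Applying Continuity gives

\[ \Lambda(\pi_{V_2}(\varphi_2 \psi_1 \varphi_1 f \psi_2|_{\psi_2^{-1}(B)})) = \Lambda(\pi_{V_2}(\varphi_2 \circ f \circ \psi_2|_{\psi_2^{-1}(B)})) . \]

Combining the various equations above, we obtain

\[ \Lambda(\pi_{V_1}(\varphi_1 f \psi_1|_{\psi_1^{-1}(B)})) = \Lambda(\pi_{V_2}(\varphi_2 \circ f \circ \psi_2|_{\psi_2^{-1}(B)})) , \]

so, for given B and ε, the proposed definition does not depend on the choice of V, U, ϕ, ψ, and η.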

It is much easier to see that the proposed definition does not depend on B and ε. First suppose that B′ is a compact neighborhood of $\mathcal{F}(\gamma)$ that is contained in B, and (a)–(d) are satisfied by B′ and ε′. If we replace ε with min{ε, ε′}, and V, U, ϕ, ψ, and η satisfy (a)–(d) for B and ε, then they also satisfy these conditions for B′ and ε′, and Additivity gives

\[ \Lambda(\pi_V(\varphi \circ f \circ \psi|_{\psi^{-1}(B)})) = \Lambda(\pi_V(\varphi \circ f \circ \psi|_{\psi^{-1}(B')})) . \]

At this point we have shown that the proposed definition is unambiguous, hence valid, and that it is the only possibility for satisfying (I1)–(I5). We now show that these conditions do in fact hold.

Normalization:

If $\mathcal{F}(\gamma) = \{x\}$ is a singleton and f is constant in a neighborhood V of x, then ϕ(x) is the unique fixed point of ϕ ◦ f ◦ ψ, and this function is locally constant near this point, so Additivity and Normalization for Λ give $\Lambda_X(\gamma) = 1$.

Additivity:

Suppose that $A_1, \ldots, A_r \subset A$ are pairwise disjoint compacta and $\mathcal{F}(\gamma) \subset A_1 \cup \cdots \cup A_r$. Choose pairwise disjoint compact neighborhoods $B_1, \ldots, B_r$ of these sets with $B_1 \cup \cdots \cup B_r \subset B$. If ε is small enough that (a)–(d) are satisfied for B and each of $B_1, \ldots, B_r$, and V, U, ϕ, ψ, and η are as above, then Additivity for Λ gives

\[ \Lambda_X(\gamma) = \Lambda(\pi_V(\varphi \circ f \circ \psi|_{\psi^{-1}(B)})) = \sum_i \Lambda(\pi_V(\varphi \circ f \circ \psi|_{\psi^{-1}(B_i)})) = \sum_i \Lambda_X(\gamma|_{A_i}) . \]

Continuity:

With f and B given, let ε be small enough that (a)–(d) hold, and let V, U, ϕ, ψ, and η satisfy the stated conditions. Since B and C are compact and the inequalities in (a)–(d) are strict, there is a neighborhood W ⊂ C(C, X) of f such that (a)–(d) are satisfied by B, ε, and all f′ ∈ W. Lemmas 5.13 and 5.14 imply that $f \mapsto \varphi \circ f \circ \psi|_{\psi^{-1}(B)}$ is a continuous function from C(C, X) to $C(\psi^{-1}(B), V)$, so the desired continuity follows from the Continuity condition of Proposition 13.1.

Multiplication:

In addition to X, A, C, f, and B, let X′, A′, C′, f′, and B′ be given. Let ε be small enough that (a)–(d) are satisfied by both f and B and f′ and B′. Recalling that X × X′ is endowed with the metric ((x, x′), (y, y′)) ↦ max{d(x, y), d′(x′, y′)}, it is easy to check that these conditions are also satisfied by f × f′ and B × B′. Let V, U, ϕ, ψ, and η and V′, U′, ϕ′, ψ′, and η′ be as above. Then V × V′, U × U′, ϕ × ϕ′, and ψ × ψ′ also satisfy the relevant conditions for the given ε, and in particular $\overline{U} \times \overline{U'}$ ε-dominates C × C′ by virtue of the maps ϕ × ϕ′ and ψ × ψ′ and the ε-homotopy $t \mapsto \eta_t \times \eta'_t$. We now have the computation

\[
\begin{aligned}
\Lambda_{X \times X'}(\gamma \times \gamma') &= \Lambda(\pi_{V \times V'}((\varphi \times \varphi') \circ (f \times f') \circ (\psi \times \psi')|_{(\psi \times \psi')^{-1}(B \times B')})) \\
&= \Lambda(\pi_{V \times V'}(\varphi \circ f \circ \psi|_{\psi^{-1}(B)} \times \varphi' \circ f' \circ \psi'|_{\psi'^{-1}(B')})) \\
&= \Lambda(\pi_V(\varphi \circ f \circ \psi|_{\psi^{-1}(B)}) \times \pi_{V'}(\varphi' \circ f' \circ \psi'|_{\psi'^{-1}(B')})) \\
&= \Lambda(\pi_V(\varphi \circ f \circ \psi|_{\psi^{-1}(B)})) \cdot \Lambda(\pi_{V'}(\varphi' \circ f' \circ \psi'|_{\psi'^{-1}(B')})) = \Lambda_X(\gamma) \cdot \Lambda_{X'}(\gamma') .
\end{aligned}
\]

Here the first and last equality are definitional, the second is the definition of a cartesian product of functions, the third is the definition of a cartesian product of germs, and the fourth is Multiplication for Λ.

Commutativity:

We are given X, X′ ∈ ANR, compact A ⊂ X and A′ ⊂ X′, and $\gamma \in g_A(X, X')$ and $\gamma' \in g_{A'}(X', X)$ with γ(A) ⊂ A′, γ′(A′) ⊂ A, $\mathcal{F}(\gamma' \circ \gamma) \subset A$, and $\mathcal{F}(\gamma \circ \gamma') \subset A'$, so that γ′ ◦ γ and γ ◦ γ′ are index admissible. Let g : C → X′ and g′ : C′ → X be representatives of γ and γ′, where C and C′ are compact neighborhoods of A and A′. Choose compact neighborhoods B and B′ of $\mathcal{F}(\gamma' \circ \gamma)$ and $\mathcal{F}(\gamma \circ \gamma')$ that are contained in the interiors of $C \cap g^{-1}(C')$ and $C' \cap g'^{-1}(C)$ respectively. After replacing B and B′ with smaller neighborhoods, if necessary, we have $\mathcal{F}(g' \circ g|_B) = \mathcal{F}(\gamma' \circ \gamma)$ and $\mathcal{F}(g \circ g'|_{B'}) = \mathcal{F}(\gamma \circ \gamma')$. Choose compact neighborhoods D and D′ of $\mathcal{F}(\gamma' \circ \gamma)$ and $\mathcal{F}(\gamma \circ \gamma')$ that are contained in the interiors of $B \cap g^{-1}(B')$ and $B' \cap g'^{-1}(B)$.

Let ε > 0 be “small enough,” in a sense to be specified below. Theorem 8.4 gives a finite dimensional vector space V and an open U ⊂ V whose closure $\overline{U}$ is compact and ε-dominates C by virtue of maps ϕ : C → U and ψ : U → X and an ε-homotopy η : C × [0, 1] → X between $\mathrm{Id}_C$ and ψ ◦ ϕ. Similarly, there is a finite dimensional vector space V′ and an open U′ ⊂ V′ whose closure $\overline{U'}$ is compact and ε-dominates C′ by virtue of maps ϕ′ : C′ → U′ and ψ′ : U′ → X′ and an ε-homotopy η′ : C′ × [0, 1] → X′ between $\mathrm{Id}_{C'}$ and ψ′ ◦ ϕ′.

The heart of the proof is the calculation

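\[
\begin{aligned}
\Lambda_X(\gamma' \circ \gamma) &= \Lambda(\pi_V(\varphi g' g \psi|_{\psi^{-1}(D)})) = \Lambda(\pi_V(\varphi g' \psi' \varphi' g \psi|_{\psi^{-1}(D)})) \\
&= \Lambda(\pi_V(\varphi g' \psi'|_{\psi'^{-1}(B')}) \circ \pi_{V'}(\varphi' g \psi|_{\psi^{-1}(D)})) \\
&= \Lambda(\pi_{V'}(\varphi' g \psi|_{\psi^{-1}(B)}) \circ \pi_V(\varphi g' \psi'|_{\psi'^{-1}(D')})) \\
&= \Lambda(\pi_{V'}(\varphi' g \psi \varphi g' \psi'|_{\psi'^{-1}(D')})) = \Lambda(\pi_{V'}(\varphi' g g' \psi'|_{\psi'^{-1}(D')})) = \Lambda_{X'}(\gamma \circ \gamma') .
\end{aligned}
\]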

In order for the first and last equalities to be valid, (a)–(d) must be satisfied with g′ ◦ g and D, and g ◦ g′ and D′, in place of f and B. In order for the second and sixth equalities to be valid applications of Continuity we need $U_\varepsilon(g(D)) \subset g'^{-1}(C)$ and $U_\varepsilon(g'(D')) \subset g^{-1}(C')$ respectively, and it needs to be the case that for all t there are no fixed points of $\varphi g' \eta'_t g \psi|_{\psi^{-1}(D)}$ in $\partial \psi^{-1}(D)$ and there are no fixed points of $\varphi' g \eta_t g' \psi'|_{\psi'^{-1}(D')}$ in $\partial \psi'^{-1}(D')$. This will be the case if $\psi \varphi g' \eta'_t g|_D$ and $\psi' \varphi' g \eta_t g'|_{D'}$ have no fixed points in ∂D and ∂D′. In order for the third and fifth equality to be valid we need $g(D) \subset (\psi' \varphi')^{-1}(B')$ and $g'(D') \subset (\psi \varphi)^{-1}(B)$. Of course the central equality is Commutativity for Λ, which is applicable if $\varphi g' \psi' \varphi' g \psi|_{\psi^{-1}(D)}$ and $\varphi' g \psi \varphi g' \psi'|_{\psi'^{-1}(D')}$ are index admissible. It is easy to see that all these conditions hold if ε is small enough. ∎

We will often need to treat the index as a function assigning an integer to each index admissible function. Therefore for each X ∈ ANR we define $\Lambda_X : \mathcal{I}^X \to \mathbb{Z}$ by setting $\Lambda_X(f) := \Lambda_X(\gamma_{\mathcal{F}(f)}(f))$. These functions have the expected properties:

Proposition 13.6 The system of functions $\Lambda_X : \mathcal{I}^X \to \mathbb{Z}$ for X ∈ ANR satisfies:

(I1) (Normalization) If X ∈ ANR, C ⊂ X is compact, x ∈ int C, and $c_x : C \to X$ is the constant function with value x, then

\[ \Lambda_X(c_x) = 1 . \]

(I2) (Additivity) If X ∈ ANR, $f \in \mathcal{I}^X$ has domain C, $C_1, \ldots, C_r$ are pairwise disjoint compact subsets of C, and $\mathcal{F}(f) \subset C_1 \cup \ldots \cup C_r$, then

\[ \Lambda_X(f) = \sum_i \Lambda_X(f|_{C_i}) . \]

(I3) (Continuity) For each X ∈ ANR and each compact C ⊂ X, $\Lambda_X|_{C(C,X) \cap \mathcal{I}^X}$ is continuous.

(I4) (Commutativity) If X, X′ ∈ ANR, D ⊂ C ⊂ X and D′ ⊂ C′ ⊂ X′, C, D, C′, and D′ are compact, g : C → X′ and g′ : C′ → X are continuous, g(D) ⊂ C′ and g′(D′) ⊂ C, and $g' \circ g|_D \in \mathcal{I}^X$ and $g \circ g'|_{D'} \in \mathcal{I}^{X'}$, then

\[ \Lambda_X(g' \circ g|_D) = \Lambda_{X'}(g \circ g'|_{D'}) . \]

(I5) (Multiplication) If X, X′ ∈ ANR, $f \in \mathcal{I}^X$, and $f' \in \mathcal{I}^{X'}$, then

\[ \Lambda_{X \times X'}(f \times f') = \Lambda_X(f) \cdot \Lambda_{X'}(f') . \]

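Proof Clearly Normalization and Continuity are automatic. If $f \in C(C, X) \cap \mathcal{I}^X$ and $C_1, \ldots, C_r$ are pairwise disjoint compact subsets of $C$ that contain $\mathcal{F}(f)$ in the union of their interiors, then the definition and the version of Additivity from the last result give

$$\Lambda_X(f) = \Lambda_X(\gamma_{\mathcal{F}(f)}(f)) = \sum_i \Lambda_X(\gamma_{\mathcal{F}(f|_{C_i})}(f|_{C_i})) = \sum_i \Lambda_X(f|_{C_i}) .$$

Under the hypotheses of (I4), if $F := \mathcal{F}(g' \circ g|_D)$ and $F' := \mathcal{F}(g \circ g'|_{D'})$, then the definitions and Commutativity for germs give

$$\begin{aligned} \Lambda_X(g' \circ g|_D) &= \Lambda_X(\gamma_F(g' \circ g|_D)) = \Lambda_X(\gamma_{F'}(g'|_{D'}) \circ \gamma_F(g|_D)) \\ &= \Lambda_{X'}(\gamma_F(g|_D) \circ \gamma_{F'}(g'|_{D'})) = \Lambda_{X'}(\gamma_{F'}(g \circ g'|_{D'})) = \Lambda_{X'}(g \circ g'|_{D'}) . \end{aligned}$$

Similarly, if $f \in C(C, X) \cap \mathcal{I}^X$ and $f' \in C(C', X') \cap \mathcal{I}^{X'}$, then

$$\begin{aligned} \Lambda_{X \times X'}(f \times f') &= \Lambda_{X \times X'}(\gamma_{\mathcal{F}(f \times f')}(f \times f')) = \Lambda_{X \times X'}(\gamma_{\mathcal{F}(f)}(f) \times \gamma_{\mathcal{F}(f')}(f')) \\ &= \Lambda_X(\gamma_{\mathcal{F}(f)}(f)) \cdot \Lambda_{X'}(\gamma_{\mathcal{F}(f')}(f')) = \Lambda_X(f) \cdot \Lambda_{X'}(f') . \end{aligned}$$

$\square$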
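As an informal sanity check on Commutativity and Multiplication, one can look at linear maps, for which the origin is the only fixed point whenever 1 is not an eigenvalue. The Python sketch below assumes (as in the determinant method mentioned in the exercises at the end of this chapter) that the index of such a map $x \mapsto Ax$ is the sign of $\det(I - A)$. Under that assumption Commutativity reduces to the identity $\det(I - G'G) = \det(I - GG')$ and Multiplication to the fact that the determinant of a block diagonal matrix is the product of the determinants of the blocks. All helper names and sample matrices are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only. Assumption: the index of the linear map x -> A x at
# its fixed point 0 is sign(det(I - A)), provided 1 is not an eigenvalue of A.

def det(m):
    """Determinant by Laplace expansion along the first row (exact for integers)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def block_diag(a, b):
    """Matrix of the product map f x f' when f and f' have matrices a and b."""
    n, m = len(a), len(b)
    return [row + [0] * m for row in a] + [[0] * n + row for row in b]

def index_of_linear_map(a):
    """Sign of det(I - a); raises if the origin is not a regular fixed point."""
    n = len(a)
    d = det([[(1 if i == j else 0) - a[i][j] for j in range(n)] for i in range(n)])
    if d == 0:
        raise ValueError("1 is an eigenvalue, so the fixed point is not regular")
    return 1 if d > 0 else -1

# Commutativity: g : R^2 -> R and g' : R -> R^2 (hypothetical sample maps).
G = [[1, 2]]
G_PRIME = [[3], [1]]
lhs = index_of_linear_map(matmul(G_PRIME, G))   # index of g' o g on R^2
rhs = index_of_linear_map(matmul(G, G_PRIME))   # index of g o g' on R

# Multiplication: f on R^2 and f' on R (hypothetical sample maps).
A = [[2, 1], [0, 3]]
B = [[3]]
product_index = index_of_linear_map(block_diag(A, B))
```

Here `lhs` and `rhs` agree even though the two compositions act on spaces of different dimension, and `product_index` equals `index_of_linear_map(A) * index_of_linear_map(B)`.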

13.5 Extension to Correspondences

We now complete the definition of the index by extending it to contractible valued correspondences. We follow the pattern laid out above, defining germs of correspondences and expressing the properties of the index in terms of them.

Let $X$ and $Y$ be Hausdorff spaces. Recall that $\mathcal{U}(X, Y)$ is the set of upper hemicontinuous correspondences $F : X \to Y$. If $X$ is compact, then the strong and weak topologies coincide, and we endow $\mathcal{U}(X, Y)$ with this topology. An index admissible correspondence for $X$ is a contractible valued $F \in \mathcal{U}(C, X)$ whose domain is compact and contains no fixed points of $F$ in its boundary: $\mathcal{F}(F) \cap \partial C = \emptyset$. Let $\mathcal{J}^X_C$ be the set of such correspondences with domain $C$, and let $\mathcal{J}^X := \bigcup_C \mathcal{J}^X_C$ be the set of all such correspondences for $X$.

Let $A \subset X$ be compact. Two upper hemicontinuous correspondences $F : U \to Y$ and $F' : U' \to Y$ defined on neighborhoods of $A$ have the same germ at $A$ if there is a neighborhood $V$ of $A$ such that $V \subset U \cap U'$ and $F|_V = F'|_V$. As before this is an equivalence relation, and its equivalence classes are called germs of upper hemicontinuous correspondences from $X$ to $Y$ at $A$. Let $G_A(X, Y)$ be the set of such germs. For $F$ as above let $\Gamma_A(F)$ be its equivalence class, which is called the germ of $F$ at $A$. We say that $F$ is a representative of $\Gamma_A(F)$. If $\Gamma \in G_A(X, Y)$, then $A$ is the domain of $\Gamma$.

If $Z \subset A$ is compact, then a correspondence $F$ defined in a neighborhood of $A$ is also defined in a neighborhood of $Z$. Clearly $\Gamma_Z(F)$ depends only on $\Gamma_A(F)$, so there is a function from $G_A(X, Y)$ to $G_Z(X, Y)$. For $\Gamma \in G_A(X, Y)$ let $\Gamma|_Z$ denote the associated element of $G_Z(X, Y)$.

Suppose that, in addition to $X$, $Y$, and $A$, $X'$ and $Y'$ are topological spaces and $A' \subset X'$ is compact. For $\Gamma \in G_A(X, Y)$ and $\Gamma' \in G_{A'}(X', Y')$ with representatives $F$ and $F'$ we define $\Gamma \times \Gamma'$ to be the germ of the correspondence $F \times F'$ taking $(x, x')$ to $F(x) \times F'(x')$. This definition is unambiguous because (Lemma 4.11) any neighborhood of $A \times A'$ contains some $U \times U'$ where $U$ and $U'$ are neighborhoods of $A$ and $A'$. Again it is easy to show that this definition is independent of the choice of representatives.

A germ $\Gamma \in G_A(X, X)$ is index admissible if there is an $F \in \mathcal{J}^X$ such that $\Gamma_A(F) = \Gamma$ and $\mathcal{F}(F) \subset A$. If this is the case (but not when $\Gamma$ is inadmissible) we let $\mathcal{F}(\Gamma) := \mathcal{F}(F)$. Let $\mathcal{C}^X$ be the set of index admissible correspondence germs at compact subsets of $X$. Let $\pi_X : \mathcal{J}^X \to \mathcal{C}^X$ be the function $\pi_X(F) = \Gamma_{\mathcal{F}(F)}(F)$.

The most general formulation of the fixed point index is:

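Theorem 13.2 There is a unique system of functions $\Lambda_X : \mathcal{C}^X \to \mathbb{Z}$ for $X \in \mathrm{ANR}$ satisfying:

(I1) (Normalization) For all $X \in \mathrm{ANR}$ and $x \in X$, if $c_x : X \to X$ is the constant function with value $x$, then

$$\Lambda_X(\gamma_{\{x\}}(c_x)) = 1 .$$

(I2) (Additivity) If $X \in \mathrm{ANR}$, $\Gamma \in \mathcal{C}^X$ has domain $A$, $A_1, \ldots, A_r$ are pairwise disjoint compact subsets of $A$, and $\mathcal{F}(\Gamma) \subset A_1 \cup \ldots \cup A_r$, then

$$\Lambda_X(\Gamma) = \sum_i \Lambda_X(\Gamma|_{A_i}) .$$

(I3) (Continuity) For each $X \in \mathrm{ANR}$ and compact $C \subset X$, $\Lambda_X \circ \pi_X|_{\mathcal{J}^X_C}$ is continuous.

(I4) (Commutativity) If $X, X' \in \mathrm{ANR}$, $A \subset X$ and $A' \subset X'$ are compact, $\gamma \in g_A(X, X')$ and $\gamma' \in g_{A'}(X', X)$ with $\gamma(A) \subset A'$ and $\gamma'(A') \subset A$, and $\gamma' \circ \gamma$ and $\gamma \circ \gamma'$ are index admissible, then

$$\Lambda_X(\gamma' \circ \gamma) = \Lambda_{X'}(\gamma \circ \gamma') .$$

(I5) (Multiplication) If $X, X' \in \mathrm{ANR}$, $\Gamma \in \mathcal{C}^X$, and $\Gamma' \in \mathcal{C}^{X'}$, then

$$\Lambda_{X \times X'}(\Gamma \times \Gamma') = \Lambda_X(\Gamma) \times \Lambda_{X'}(\Gamma') .$$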

Proof Let $\Lambda_X$ denote the index for functions. Fix an $X \in \mathrm{ANR}$ and a germ $\Gamma \in \mathcal{C}^X$. Let $G$ be a representative of $\Gamma$ with $\mathcal{F}(G) = \mathcal{F}(\Gamma)$, and let $C$ be the domain of $G$. Let $W$ be a neighborhood of $\mathrm{Gr}(G)$ that does not intersect $\{ (x, x) : x \in \partial C \}$. Theorem 9.1 implies that there is a neighborhood $V \subset W$ of $\mathrm{Gr}(G)$ such that for any maps $f_0, f_1 : C \to X$ with $\mathrm{Gr}(f_0), \mathrm{Gr}(f_1) \subset V$ there is a homotopy $h : C \times [0, 1] \to X$ with $h_0 = f_0$, $h_1 = f_1$, and $\mathrm{Gr}(h_t) \subset W$ for all $t$. Theorem 9.1 also implies that a continuous $f : C \to X$ with $\mathrm{Gr}(f) \subset V$ exists. We would like to set $\Lambda_X(\Gamma) := \Lambda_X(f)$.

Of course we must first show that this definition does not depend on the various choices we made. Let $G'$, $C'$, $W'$, $V'$, and $f'$ be second versions of the objects above. Then $G$ and $G'$ agree on some compact neighborhood $C'' \subset C \cap C'$ of $\mathcal{F}(\Gamma)$; let $G'' := G|_{C''} = G'|_{C''}$. As above, there is a neighborhood $W''$ of $\mathrm{Gr}(G'')$ that does not intersect $\{ (x, x) : x \in \partial C'' \}$ and a neighborhood $V'' \subset W''$ of $\mathrm{Gr}(G'')$ such that for any maps $f''_0, f''_1 : C'' \to X$ with $\mathrm{Gr}(f''_0), \mathrm{Gr}(f''_1) \subset V''$ there is a homotopy $h'' : C'' \times [0, 1] \to X$ with $h''_0 = f''_0$, $h''_1 = f''_1$, and $\mathrm{Gr}(h''_t) \subset W''$ for all $t$. Finally, there is a continuous $f'' : C'' \to X$ with $\mathrm{Gr}(f'') \subset V''$. Let $U \subset V$ be a neighborhood of $\mathrm{Gr}(G)$ such that $U \cap \{ (x, x) : x \in C \setminus C'' \} = \emptyset$ and $U \cap (C'' \times X) \subset V''$. Theorem 9.1 implies that $U$ contains the graph of some continuous $\tilde{f} : C \to X$. There is an index admissible homotopy between $f$ and $\tilde{f}$, so $\Lambda_X(f) = \Lambda_X(\tilde{f})$. Additivity implies that $\Lambda_X(\tilde{f}) = \Lambda_X(\tilde{f}|_{C''})$. There is an index admissible homotopy between $\tilde{f}|_{C''}$ and $f''$, so $\Lambda_X(\tilde{f}|_{C''}) = \Lambda_X(f'')$. Thus $\Lambda_X(f) = \Lambda_X(f'')$, but of course the same argument shows that $\Lambda_X(f') = \Lambda_X(f'')$, so $\Lambda_X(f) = \Lambda_X(f')$.

We now need to verify that (I1)–(I5) hold. The index we have defined above agrees with the one from the last section when $\Gamma$ is the germ of a continuous function because we can arrange for the function itself to be $f$ in the definition above. Therefore (I1) and (I4) hold automatically.

Suppose that $A_1, \ldots, A_r$ are pairwise disjoint compact subsets of $A$ whose union contains $\mathcal{F}(\Gamma)$. Let $C_1, \ldots, C_r \subset C$ be pairwise disjoint compact neighborhoods of $A_1, \ldots, A_r$, and for each $i = 1, \ldots, r$ let $W_i$, $V_i$, and $f_i$ be as above with respect to $G|_{C_i}$, so that $\Lambda_X(\Gamma|_{A_i}) = \Lambda_X(f_i)$. Since we may replace $V$ with

$$V \cap \Bigl( \bigcup_i V_i \cup \bigl( (C \setminus \bigcup_i C_i) \times X \bigr) \Bigr)$$

we may assume that $V \cap (C_i \times X) \subset V_i$. There is a homotopy $h^i : C_i \times [0, 1] \to X$ with $h^i_0 = f|_{C_i}$, $h^i_1 = f_i$, and $\mathrm{Gr}(h^i_t) \subset W_i$ for all $t$, so Additivity and Continuity give

$$\Lambda_X(\Gamma) = \Lambda_X(f) = \sum_i \Lambda_X(f|_{C_i}) = \sum_i \Lambda_X(f_i) = \sum_i \Lambda_X(\Gamma|_{A_i}) .$$

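Continuity follows more or less directly from the definition. For $G \in \mathcal{J}^X$ with domain $C$ Theorem 9.1 gives a neighborhood $V$ of $\mathrm{Gr}(G)$ such that for any maps $f_0, f_1 : C \to X$ with $\mathrm{Gr}(f_0), \mathrm{Gr}(f_1) \subset V$ there is a homotopy $h : C \times [0, 1] \to X$ with $h_0 = f_0$, $h_1 = f_1$, and $\mathrm{Gr}(h_t) \subset X \times X \setminus \{ (x, x) : x \in \partial C \}$ for all $t$. Our definition gives $\Lambda_X(\Gamma_{\mathcal{F}(G')}(G')) = \Lambda_X(\Gamma_{\mathcal{F}(G)}(G))$ for all $G' \in \mathcal{U}(C, X) \cap \mathcal{J}^X$ with $\mathrm{Gr}(G') \subset V$, so $\Lambda_X \circ \pi_X|_{\mathcal{J}^X_C}$ is constant in a neighborhood of $G$, hence continuous at $G$.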

In addition to $X$, $\Gamma$, $G$, $C$, $W$, and $V$, let $X'$, $\Gamma'$, $G'$, $C'$, $W'$, and $V'$ as above be given. Let $\tilde{W}$ be a neighborhood of $\mathrm{Gr}(G \times G')$ that does not intersect $\{ ((x, x'), (x, x')) : (x, x') \in \partial(C \times C') \}$, and let $\tilde{V}$ be a neighborhood of $\mathrm{Gr}(G \times G')$ such that for any continuous $f_0, f_1 : C \times C' \to X \times X'$ with $\mathrm{Gr}(f_0), \mathrm{Gr}(f_1) \subset \tilde{V}$ there is a homotopy $h : C \times C' \times [0, 1] \to X \times X'$ with $h_0 = f_0$, $h_1 = f_1$, and $\mathrm{Gr}(h_t) \subset \tilde{W}$ for all $t$. By replacing $V$ and $V'$ with their intersections with sufficiently small balls around $\mathrm{Gr}(G)$ and $\mathrm{Gr}(G')$ we can ensure that $V \times V'$ is in an arbitrarily small ball around $\mathrm{Gr}(G \times G')$, so we may assume that $V \times V' \subset \tilde{V}$. Now Multiplication for the functional index gives

$$\Lambda_{X \times X'}(\Gamma \times \Gamma') = \Lambda_{X \times X'}(f \times f') = \Lambda_X(f) \times \Lambda_{X'}(f') = \Lambda_X(\Gamma) \times \Lambda_{X'}(\Gamma') .$$

$\square$

It should be obvious that a more general version of Proposition 13.6 holds: the system of functions $\Lambda_X : \mathcal{J}^X \to \mathbb{Z}$ given by $\Lambda_X(G) := \Lambda_X(\Gamma_{\mathcal{F}(G)}(G))$ satisfies Normalization, Additivity, Continuity, and Multiplication. Since it would be tedious and serve little purpose, we omit the formal statement.

13.6 Uniqueness

The uniqueness of the index is conceptually important, but it is not so easy to bring it to bear in applications. A typical situation is that one has defined a function on the set of index admissible functions on a single space, and it has been established that this function satisfies (I1)–(I3). Thus there arises the question of which spaces have a unique such function. Amann and Weiss (1973) and Nussbaum (1974) establish uniqueness in settings that are (from our point of view) rather special, and a uniqueness result for a differentiable manifold is proved by Furi et al. (2004), but otherwise the issue does not seem to have received much attention. Here we present an elementary result that allows us to easily obtain the Furi et al. result, and we use it to prove a uniqueness result for a finite simplicial complex.

We say that a topological space $X$ is a single index space if there is a single function $\Lambda_X : \mathcal{I}^X \to \mathbb{Z}$ satisfying (I1)–(I3).


Theorem 13.3 Suppose that for every $f \in \mathcal{I}^X$ and every neighborhood $V \subset C(C, X)$ of $f$ there is an $f' \in V$ such that $\mathcal{F}(f')$ has a finite partition $D_1, \ldots, D_k$ into compact sets, and each $D_i$ has a neighborhood $U_i \subset X$ that is a single index space. Then $X$ is a single index space.

Proof By Continuity, any index is determined by its values on functions such as $f'$. By Additivity, the index for $f'$ is determined by its restrictions to arbitrarily small neighborhoods of the $D_i$. For any open $U \subset X$ that is a single index space, the restriction of the index to those index admissible $g : C \to U$ in $\mathcal{I}^X$ with $C \subset U$ is an index for $U$, and is thus uniquely determined. $\square$

Proposition 13.1 asserts that each $V$ is a single index space. Since Theorem 11.4 implies that any function between smooth manifolds can be approximated by a smooth function with a discrete set of fixed points, we have:

Corollary 13.1 Any smooth manifold is a single index space.

The corresponding result for finite simplicial complexes is more sophisticated.

Theorem 13.4 If $\mathcal{P}$ is a finite simplicial complex, then $|\mathcal{P}|$ is a single index space.

Proof Let a compact $C \subset |\mathcal{P}|$, an index admissible $f : C \to |\mathcal{P}|$, and a neighborhood $W \subset C \times |\mathcal{P}|$ of $\mathrm{Gr}(f)$ be given. In view of the result above it suffices to show that $W$ contains the graph of an index admissible $f'' : C \to |\mathcal{P}|$ whose set of fixed points has a finite partition into compact sets, each element of which is contained in the interior of a simplex that is not a proper face of any other simplex of $\mathcal{P}$.

We regard $\mathcal{P}$ as a geometric simplicial complex embedded in a Euclidean space $V$. The Tietze extension theorem allows us to extend $f$ to a continuous function with range $V$ that is defined on a neighborhood of $C$. Since $|\mathcal{P}|$ is an ENR (Proposition 8.2) we can compose with a retraction to obtain an extension of $f$ with range $|\mathcal{P}|$ defined on a neighborhood $U$ of $C$. By replacing $U$ with a smaller neighborhood we can ensure that the extension has no fixed points in $U \setminus C$. After sufficient repeated barycentric subdivision, every simplex that intersects $C$ will be contained in $U$. Let $\mathcal{Q}$ be the subcomplex consisting of all such simplices and all their faces. Then the restriction of the extension of $f$ to $|\mathcal{Q}|$ is index admissible. If we can find an $f' : |\mathcal{Q}| \to |\mathcal{P}|$ with the desired properties whose graph is contained in $W \cup ((|\mathcal{P}| \setminus C) \times |\mathcal{P}|)$, then its restriction to $C$ will be what we are looking for. The point of this part of the argument is that we may assume that $C = |\mathcal{Q}|$ for some subcomplex $\mathcal{Q}$.

The simplicial approximation theorem (Theorem 2.7) implies that (after sufficient repeated barycentric subdivision) there is a simplicial map from $|\mathcal{Q}|$ to $|\mathcal{P}|$ whose graph is contained in $W$, and it suffices to prove the claim with $f$ replaced by this map. Therefore we may assume that $f$ is simplicial for some subdivisions $\mathcal{Q}'$ and $\mathcal{P}'$ of $\mathcal{Q}$ and $\mathcal{P}$. To simplify notation we write $\mathcal{Q}$ and $\mathcal{P}$ in place of $\mathcal{Q}'$ and $\mathcal{P}'$, which will cause no confusion provided one recognizes that now $\mathcal{Q}$ is not typically a subcomplex of $\mathcal{P}$. For each $Q \in \mathcal{Q}$ let $P_Q := f(Q)$. Since $f$ is simplicial, $P_Q \in \mathcal{P}$, and $f|_Q$ is affine and maps the interior of $Q$ into the interior of $P_Q$.


Fix $\varepsilon > 0$ such that if $f' : C \to |\mathcal{P}|$ is continuous and $\|f' - f\| < \varepsilon$, then $f'$ is index admissible and $\mathrm{Gr}(f') \subset W$. For $k = 0, 1, 2, \ldots$ let $\mathcal{Q}^k$ be the $k$-skeleton of $\mathcal{Q}$. We construct maps $f_k : |\mathcal{Q}^k| \to |\mathcal{P}|$ such that:

(a) $\|f_k - f|_{|\mathcal{Q}^k|}\| < \varepsilon$;

(b) for all $Q \in \mathcal{Q}^k$, $f_k(Q) \subset P_Q$;

(c) for each $Q \in \mathcal{Q}^k$ the set of fixed points of $f_k$ in the interior of $Q$ is compact and has a neighborhood $N_Q \subset |\mathcal{Q}^k|$ such that $f_k(N_Q)$ is contained in the interior of $P_Q$.

The construction of the $f_k$ is an induction that begins with letting $f_0 := f|_{|\mathcal{Q}^0|}$. Obviously $f_0$ satisfies (a)–(c). Suppose that we have already defined an $f_{k-1}$ on $|\mathcal{Q}^{k-1}|$ satisfying (a)–(c).

We now explain how $f_{k-1}$ is extended to a given $k$-simplex $Q \in \mathcal{Q}^k$. Fix a point $x$ in the interior of $Q$. Every point in $Q$ is $(1 - t)x + ty$ for some $t \in [0, 1]$ and some $y \in \partial Q$, and $f((1 - t)x + ty) = (1 - t) f(x) + t f(y)$. For some $\alpha_Q > 1$ we define $f_k|_Q : Q \to P_Q$ by setting

$$f_k|_Q((1 - t)x + ty) := \begin{cases} (1 - \alpha_Q t) f(x) + \alpha_Q t f_{k-1}(y), & \alpha_Q t \le 1, \\ f_{k-1}(y), & \alpha_Q t > 1. \end{cases}$$

If $\alpha_Q = 1$, then $\|f_k|_Q - f|_Q\| < \varepsilon$ because $\|f_{k-1}|_{\partial Q} - f|_{\partial Q}\| < \varepsilon$, so we can choose $\alpha_Q$ slightly greater than 1 such that $\|f_k|_Q - f|_Q\| < \varepsilon$.

For each simplex $Q'$ in $\partial Q$ there is an open neighborhood $N_{Q'} \subset |\mathcal{Q}^{k-1}|$ of the set of fixed points of $f_{k-1}$ in the interior of $Q'$ that is mapped by $f_{k-1}$ to the interior of $P_{Q'}$. Evidently $\tilde{N}_{Q'} = \{ (1 - t)x + ty : y \in N_{Q'} \text{ and } \alpha_Q t > 1 \}$ is an open neighborhood of this set of fixed points in $Q$ that is mapped by $f_k|_Q$ to the interior of $P_{Q'}$. The set of fixed points of $f_k|_Q$ that are not contained in the various $\tilde{N}_{Q'}$ is compact and contained in the interior of $Q$, so it is mapped by $f_k|_Q$ to the interior of $P_Q$, and a neighborhood of it is also mapped to the interior of $P_Q$. It is now evident that combining the various $f_k|_Q$ gives a function $f_k : |\mathcal{Q}^k| \to |\mathcal{P}|$ satisfying (a)–(c).

Let $f' : C \to |\mathcal{P}|$ be the final function constructed by this process. For each $Q \in \mathcal{Q}$ let $N_Q$ be an open neighborhood of the set of fixed points in the interior of $Q$ such that $f'(N_Q)$ is contained in the interior of $P_Q$. Let $\psi_Q : C \to [0, 1]$ be a continuous function with $\psi_Q(z) > 0$ for all $z \in N_Q$ and $\psi_Q(y) = 0$ for all $y \notin N_Q$, and let $y_Q$ be an element of the interior of a maximal element of $\mathcal{P}$ that contains $P_Q$. We define $f'' : C \to |\mathcal{P}|$ by setting

$$f''(x) := \begin{cases} (1 - \psi_Q(x)) f'(x) + \psi_Q(x) y_Q, & x \in N_Q, \\ f'(x), & x \in |\mathcal{Q}| \setminus \bigcup_Q N_Q. \end{cases}$$

If each $\psi_Q$ is sufficiently close to the constant zero function, then $\|f'' - f\| < \varepsilon$. The fixed points of $f''$ are contained in the various $N_Q$, and for each $Q$ the fixed points of $f''|_{N_Q}$ are contained in the interior of a maximal element of $\mathcal{P}$ containing $Q$. The set of fixed points of $f''|_{N_Q}$ is a relatively open subset of the set of fixed points of $f''$, so it is compact because it is the complement of the sets of fixed points of the other $f''|_{N_{Q'}}$. $\square$

Exercises

Computing the fixed point index is usually not straightforward. To a rough approximation there are four methods:

(a) If the function has regular fixed points (or can be approximated by such a function) compute signs of the determinants of the relevant matrices of partial derivatives.

(b) Show that a set of fixed points has index zero by presenting a perturbation of the function or correspondence that has no nearby fixed points.

(c) If the domain is a compact AR, and the indices of all but one component of the set of fixed points are known, then the index of the last component can be inferred from the sum of the indices being +1.

(d) Present a homotopy between the given function or correspondence and a function or correspondence for which the index of the relevant set of fixed points is already known.

The following problems present some instances of these methods.
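Method (a) and Additivity can be tried out numerically in one dimension, where the relevant matrix of partial derivatives is the single number $1 - f'(x)$. The Python sketch below is only an illustration under stated assumptions: the fixed points are assumed to be regular and separated by more than the grid spacing, and the sample map $f(x) = 2x - x^3$, the function names, and the numerical tolerances are all hypothetical choices, not taken from the text.

```python
# Illustrative sketch: indices of regular fixed points of a map on an interval.
# Assumptions: every fixed point is regular and is bracketed by a sign change
# of g(x) = x - f(x) on the grid; names and parameters are hypothetical.

def regular_fixed_points(f, a, b, cells=997, h=1e-6):
    """Return a list of (x, index) with index = sign of 1 - f'(x)."""
    g = lambda x: x - f(x)
    grid = [a + (b - a) * k / cells for k in range(cells + 1)]
    found = []
    for lo, hi in zip(grid, grid[1:]):
        if g(lo) * g(hi) < 0:                  # a fixed point lies in (lo, hi)
            for _ in range(60):                # locate it by bisection
                mid = 0.5 * (lo + hi)
                if g(lo) * g(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            x = 0.5 * (lo + hi)
            slope = (g(x + h) - g(x - h)) / (2 * h)   # approximates 1 - f'(x)
            found.append((x, 1 if slope > 0 else -1))
    return found

def index_on(f, a, b):
    """Sum of the indices of the fixed points of f in [a, b]."""
    return sum(sign for _, sign in regular_fixed_points(f, a, b))

f = lambda x: 2 * x - x ** 3    # sample map with fixed points -1, 0, and 1

whole = index_on(f, -2.0, 2.0)
pieces = [index_on(f, -2.0, -0.5), index_on(f, -0.5, 0.5), index_on(f, 0.5, 2.0)]
```

For this sample map the index on the whole interval equals the sum of the indices on the three subintervals, as Additivity requires.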

13.1 Prove that if $a < b$ and $f : [a, b] \to \mathbb{R}$ is index admissible, then $\Lambda_{\mathbb{R}}(f) \in \{-1, 0, 1\}$.

13.2 Prove that if $C \subset \mathbb{R}^2$ is compact, the origin is in its interior, $f : C \to \mathbb{R}^2$ is index admissible, and $f(x) \neq \lambda x$ for all $x \in C \setminus \{0\}$ and $\lambda > 1$, then $f(0) = 0$ and $\Lambda_{\mathbb{R}^2}(f) = 1$. (Hint: consider the homotopy $h(x, t) := t f(x)$.)

13.3 In a two player coordination game the two players have the same finite set $S$ of pure strategies. They each receive a payoff of 1 if they both choose the same pure strategy and a payoff of 0 if they choose different pure strategies.

(a) Describe the set of Nash equilibria.

(b) Find the index of each Nash equilibrium, and prove this result using the index axioms.

(c) Prove the result you found in (b) by approximating the best response correspondence with a suitable function and computing the sign of the determinant of the relevant matrix.

13.4 (Rubinstein 1989) General A and General B are commanders of allied forces. General A has received an order that both should attack. She transmits this to General B using an email system that sends an automatic confirmation-of-receipt email to the sender of each email (including confirmation-of-receipt emails) that it receives. For each email there is a 10% chance that it is not received, so at the end of the communication phase the two generals have sent $s_A$ and $s_B$ emails respectively, where $s_A \ge 1$ and $s_A - 1 \le s_B \le s_A$.
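The communication phase can be simulated directly, which may help in checking one's answers to the parts below. The following Python sketch is one reading of the protocol described above and is offered only as an aid: the function names, the seed, and the number of runs are hypothetical, and the code assumes that losses of different emails are independent.

```python
import random

# Illustrative sketch of the communication phase. Assumptions: each email is
# lost independently with probability `loss`; General A's initial order counts
# as her first sent email; all names here are hypothetical.

def simulate_email_game(rng, loss=0.1):
    """Return (s_A, s_B), the numbers of emails sent by Generals A and B."""
    s_a, s_b = 1, 0              # General A sends the initial order
    last_sender_is_a = True
    while rng.random() >= loss:  # the most recent email is received ...
        if last_sender_is_a:     # ... so the recipient's system confirms it
            s_b += 1
        else:
            s_a += 1
        last_sender_is_a = not last_sender_is_a
    return s_a, s_b

def frequency_a_sent_last(s, runs=100_000, seed=0):
    """Empirical frequency of s_B = s - 1 among the runs with s_A = s."""
    rng = random.Random(seed)
    hits = total = 0
    for _ in range(runs):
        s_a, s_b = simulate_email_game(rng)
        if s_a == s:
            total += 1
            hits += (s_b == s - 1)
    return hits / total if total else float("nan")
```

Calling `frequency_a_sent_last(s)` for small values of `s` gives a Monte Carlo estimate that can be compared with the exact conditional probability requested in part (a).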

(a) Compute the conditional probabilities of having sent the last message, which are $\Pr(s_B = s - 1 \mid s_A = s)$ and $\Pr(s_A = s \mid s_B = s)$ for $s \ge 1$.

For each general the payoff when neither attacks is 0, the payoff when both attack is 10, the payoff when she attacks and the other does not is $-10$, and the payoff when she does not attack and the other does is $-5$. A behavior strategy for $C \in \{A, B\}$ specifies a probability $\pi_C(s) \in [0, 1]$ of attacking conditional on sending $s$ emails for each $s \ge 1$. (We assume that General B cannot attack without receiving at least one email.)

(b) For what value of $\pi_B(1)$ will General A be indifferent about whether to attack or not when she has sent only one email?

(c) What condition must $\pi_B(s - 1)$ and $\pi_B(s)$ satisfy if General A is indifferent between attacking and not attacking when she has sent $s > 1$ emails? What condition must $\pi_A(s)$ and $\pi_A(s + 1)$ satisfy if General B is indifferent between attacking and not attacking when she has sent $s$ emails?

(d) Prove that if $(\pi^*_A, \pi^*_B)$ is a subgame perfect Nash equilibrium and there is some $C \in \{A, B\}$ and $s$ such that $\pi^*_C(s) = 1$, then $\pi^*_A(s) = \pi^*_B(s) = 1$ for all $s$.

(e) Prove that for a given $\pi_B$, if $\pi_B(s - 1) = 0$ and General A is indifferent about whether to attack when she sends $s$ emails, then she strictly prefers to attack when she sends $s + 1$ emails.

(f) Prove that there are two subgame perfect Nash equilibria, one in which neither general ever attacks, and one in which both generals always attack.

(g) Give topologies on the spaces of mixed strategies with respect to which they are convex compact subsets of Banach spaces. Define a best response correspondence that is upper hemicontinuous and convex valued, and whose fixed points are the subgame perfect Nash equilibria. Show that the equilibrium in which neither general ever attacks is an inessential fixed point. Conclude that the equilibrium in which both always attack has index +1.

Part V Applications

Chapter 14 Topological Consequences

This chapter is a relaxing and refreshing change of pace. Instead of working very hard to slowly build up a toolbox of techniques and specific facts, we are going to harvest the fruits of our earlier efforts, using the axiomatic description of the fixed point index, and other major results, to quickly derive a number of quite famous theorems. In Sect. 14.1 we define the Euler characteristic, relate it to the Lefschetz fixed point theorem, and then describe the Eilenberg–Montgomery theorem as a special case.

For two general compact manifolds, the degree of a map from one to the other is a rather crude invariant, in comparison with many others that topologists have defined. Nevertheless, when the range is the $m$-dimensional sphere, the degree is already a “complete” invariant in the sense that it classifies functions up to homotopy: if $M$ is a compact $m$-dimensional manifold that is connected, and $f$ and $f'$ are functions from $M$ to the $m$-sphere of the same degree, then $f$ and $f'$ are homotopic. This famous theorem, due to Hopf, is the subject of Sect. 14.2.

Section 14.3 presents several other results concerning fixed points and antipodal points of a map from a sphere to itself. Some of these are immediate consequences of index theory and the Hopf theorem, but the Borsuk–Ulam theorem requires a substantial proof, so it should be thought of as a significant independent fact of topology. It has many consequences, including the fact that spheres of different dimensions are not homeomorphic.

In Sect. 14.4 we state and prove the theorem known as invariance of domain. It asserts that if $U \subset \mathbb{R}^m$ is open, and $f : U \to \mathbb{R}^m$ is continuous and injective, then the image of $f$ is open, and the inverse is continuous. One may think of this as a purely topological version of the inverse function theorem, but from the technical point of view it is much deeper.



14.1 Euler, Lefschetz, and Eilenberg–Montgomery

The definition of the Euler characteristic, and Euler’s use of it in the analyses of various problems, is often described as the historical starting point of topology as a branch of mathematics. In popular expositions the Euler characteristic of a 2-dimensional manifold $M$ is usually defined by the formula $\chi(M) := V - E + F$ where $V$, $E$, and $F$ are the numbers of vertices, edges, and 2-simplices in a triangulation of $M$. Our definition is:

Definition 14.1 The Euler characteristic $\chi(X)$ of a compact ANR $X$ is $\Lambda_X(\mathrm{Id}_X)$.

Here is a sketch of a proof that our definition of $\chi(M)$ agrees with Euler’s when $M$ is a triangulated compact 2-manifold. We deform the identity function slightly, achieving a function $f : M \to M$ with the following description. Assume that the triangulation is realized geometrically, so that each simplex has a barycenter. The barycenters (including the vertices) are the fixed points of $f$. The map deforms the identity by pushing each other point away from the barycenter of the smallest simplex that contains it. Thus a point on an edge between the edge’s barycenter and a vertex is pushed toward the vertex. Each two dimensional simplex is divided into six subsimplices by barycentric subdivision, and points on the line segments of this subdivision emanating from the barycenter are pushed further down the line segment away from the barycenter. Other points are also moved further away from the barycenter. (This is easy to visualize, but it would be tedious to provide a formula for a continuous function with these properties.) Euler’s formula follows once we show that the index of a vertex is $+1$, the index of the barycenter of an edge is $-1$, and the index of the barycenter of a 2-simplex is $+1$. We will not give a detailed argument to this effect; very roughly it corresponds to the intuition that $f$ is “compressive” at each vertex, “expansive” at the barycenter of each 2-simplex, and expansive in one direction and compressive in another at the barycenter of an edge. Imagining that $f$ is differentiable, one can also compute the sign of the determinant of the derivative of $\mathrm{Id} - f$ at each fixed point.
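The combinatorial side of this sketch is easy to compute. The Python fragment below is an illustration, not part of the argument: it counts vertices, edges, and 2-simplices of a triangulation given as a list of triangles, which is the same as summing the indices $+1$, $-1$, $+1$ described above over the barycenters. The function names and the particular grid triangulation of the torus are hypothetical choices made for this example.

```python
from itertools import combinations

# Illustrative sketch: chi = V - E + F for a triangulated surface, i.e. the sum
# of +1 per vertex, -1 per edge barycenter, and +1 per 2-simplex barycenter.
# Function names and the sample triangulations are hypothetical.

def euler_characteristic(triangles):
    """Compute V - E + F for a surface given by its list of triangles."""
    faces = {frozenset(t) for t in triangles}
    edges = {frozenset(e) for t in faces for e in combinations(tuple(t), 2)}
    vertices = {v for t in faces for v in t}
    return len(vertices) - len(edges) + len(faces)

def tetrahedron_boundary():
    """The boundary of a 3-simplex, which triangulates the 2-sphere."""
    return list(combinations(range(4), 3))

def torus_grid(n=3):
    """An n x n grid with opposite sides identified, each square cut in two (n >= 3)."""
    triangles = []
    for i in range(n):
        for j in range(n):
            a, b = (i, j), ((i + 1) % n, j)
            c, d = ((i + 1) % n, (j + 1) % n), (i, (j + 1) % n)
            triangles.append((a, b, c))
            triangles.append((a, c, d))
    return triangles

sphere_chi = euler_characteristic(tetrahedron_boundary())
torus_chi = euler_characteristic(torus_grid(3))
```

The sphere and the torus give different values, in line with the familiar formula $\chi = 2 - 2g$ for an orientable surface of genus $g$ (a standard fact not proved in this chapter).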

Although Euler could not have expressed the idea in modern language, he certainly understood that the Euler characteristic is important because it is a topological invariant.

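Theorem 14.1 If $X$ and $X'$ are homeomorphic compact ANR’s, then

$$\chi(X) = \chi(X') .$$

Proof For any homeomorphism $h : X \to X'$, Commutativity implies that

$$\chi(X) = \Lambda_X(\mathrm{Id}_X) = \Lambda_X(\mathrm{Id}_X \circ h^{-1} \circ h) = \Lambda_{X'}(h \circ \mathrm{Id}_X \circ h^{-1}) = \Lambda_{X'}(\mathrm{Id}_{X'}) = \chi(X') .$$

$\square$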


The analytic method implicit in Euler’s definition (pass from a topological space, e.g., a compact surface, to a discrete object, in this case a triangulation, that can be analyzed combinatorially and quantitatively) has of course been extremely fruitful. But as a method of proving that the Euler characteristic is a topological invariant, it fails in a spectacular manner. There is first of all the question of whether a triangulation exists. That a two dimensional compact manifold is triangulable was not proved until the 1920s, by Rado. In the 1950s Bing and Moise proved that compact three dimensional manifolds are triangulable, and a stream of research during this same general period showed that smooth manifolds are triangulable, but in general a compact manifold need not have a triangulation. For simplicial complexes topological invariance would follow from invariance under subdivision, which can be proved combinatorially, and the Hauptvermutung, which was the conjecture that any two simplicial complexes that are homeomorphic have subdivisions that are combinatorially isomorphic. This conjecture was formulated by Steinitz and Tietze in 1908, but in 1961 Milnor presented a counterexample, and in the late 1960s it was shown to be false even for triangulable manifolds.

The Lefschetz fixed point theorem is a generalization of Brouwer’s theorem that was developed by Lefschetz for compact manifolds in Lefschetz (1923, 1926) and extended by him to manifolds with boundary in Lefschetz (1927). Using quite different methods, Hopf extended the result to simplicial complexes in Hopf (1928).

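Definition 14.2 If $X$ is a compact ANR and $F : X \to X$ is an upper hemicontinuous contractible valued correspondence, the Lefschetz number of $F$ is $\Lambda_X(F)$.

Theorem 14.2 If $X$ is a compact ANR, $F : X \to X$ is an upper hemicontinuous contractible valued correspondence and $\Lambda_X(F) \neq 0$, then $\mathcal{F}(F) \neq \emptyset$.

Proof When $\mathcal{F}(F) = \emptyset$ two applications of Additivity give

$$\Lambda_X(F|_\emptyset) = \Lambda_X(F) = \Lambda_X(F|_\emptyset) + \Lambda_X(F|_\emptyset) ,$$

so $\Lambda_X(F|_\emptyset) = 0$ and therefore $\Lambda_X(F) = 0$. $\square$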

In Lefschetz’ original formulation the Lefschetz number of a function was defined using algebraic topology. Thus one may view the Lefschetz fixed point theorem as a combination of the result above and a formula expressing the Lefschetz number in terms of homology.

In the Kakutani fixed point theorem, the hypothesis that the correspondence is convex valued cries out for generalization, because convexity is not a topological concept that is preserved by homeomorphisms of the space. The Eilenberg–Montgomery theorem asserts that if $X$ is a compact acyclic ANR, and $F : X \to X$ is an upper hemicontinuous acyclic valued correspondence, then $F$ has a fixed point. Unfortunately it would take many pages to define acyclicity, so we will simply say that acyclicity is a property that is invariant under homeomorphism, and is weaker than contractibility. The known examples of spaces that are acyclic but not contractible are not objects one would expect to encounter “in nature,” so it seems farfetched that the additional strength of the Eilenberg–Montgomery theorem, beyond that of the result below, will ever figure in economic analysis.

Theorem 14.3 If $X$ is a nonempty compact AR and $F : X \to X$ is an upper hemicontinuous contractible valued correspondence, then $F$ has a fixed point.

Proof Recall (Theorem 8.2) that an absolute retract is an ANR that is contractible. Theorem 9.1 implies that $F$ can be approximated in the sense of Continuity by a continuous function, so $\Lambda_X(F) = \Lambda_X(f)$ for some continuous $f : X \to X$. Let $c : X \times [0, 1] \to X$ be a contraction. Then $(x, t) \mapsto c(f(x), t)$ (or $(x, t) \mapsto f(c(x, t))$) is a homotopy between $f$ and a constant function, so Continuity and Normalization imply that $\Lambda_X(f) = 1$. Now the claim follows from the last result. $\square$

14.2 The Hopf Theorem

Two functions that are homotopic may differ in their quantitative features, but from the perspective of topology these differences are uninteresting. Two functions that are not homotopic differ in some qualitative way that one may hope to characterize in terms of discrete objects. A homotopy invariant may be thought of as a function whose domain is the set of homotopy classes; equivalently, it may be thought of as a mapping from a space of functions that is constant on each homotopy class. A fundamental method of topology is to define and study homotopy invariants.

The degree is an example: for compact manifolds $M$ and $N$ of the same dimension it assigns an integer to each continuous $f : M \to N$, and if $f$ and $f'$ are homotopic, then they have the same degree. There are a great many other homotopy invariants, whose systematic study is far beyond our scope. In the study of such invariants, one is naturally interested in settings in which some invariant (or collection of invariants) gives a complete classification, in the sense that if two functions are not homotopic, then the invariant assigns different values to them. The prototypical result of this sort, due to Hopf, asserts that the degree is a complete invariant when $N$ is the $m$-sphere.

Theorem 14.4 (Hopf) If $M$ is an $m$-dimensional compact connected smooth manifold, then two maps $f, f' : M \to S^m$ are homotopic if and only if $\deg(f) = \deg(f')$.

We provide a rather informal sketch of the proof. Since the ideas in the argument are geometric, and easily visualized, this should be completely convincing, and little would be gained by adding more formal details of particular constructions.

We already know that two homotopic functions have the same degree, so our goal is to show that two functions of the same degree are homotopic. Consider a particular $f : M \to S^m$. The results of Sect. 10.7 imply that $C_S(M, S^m)$ is locally path connected, and that $C^\infty(M, S^m)$ is dense in this space, so $f$ is homotopic to a smooth function. Suppose that $f$ is smooth, and that $q$ is a regular value of $f$. (The existence of such a $q$ follows from Sard’s theorem.) The inverse function theorem implies that if $D$ is a sufficiently small disk in $S^m$ centered at $q$, then $f^{-1}(D)$ is a collection of pairwise disjoint disks, each containing one element of $f^{-1}(q)$.

Let $q_-$ be the antipode of $q$ in $S^m$. (This is $-q$ when $S^m$ is the unit sphere centered at the origin in $\mathbb{R}^{m+1}$.) Let $j : S^m \times [0, 1] \to S^m$ be a homotopy with $j_0 = \mathrm{Id}_{S^m}$ that stretches $D$ until it covers $S^m$, so that $j_1$ maps the boundary of $D$ and everything outside $D$ to $q_-$. Then $f = j_0 \circ f$ is homotopic to $j_1 \circ f$.

We have shown that the $f$ we started with is homotopic to a function with the following description: there are finitely many pairwise disjoint disks in $M$, everything outside the interiors of these disks is mapped to $q_-$, and each disk is mapped bijectively (except that all points in the boundary are mapped to $q_-$) to $S^m$. We shall leave the peculiarities of the case $m = 1$ to the reader: when $m \ge 2$, it is visually obvious that homotopies can be used to move these disks around freely, so that two maps satisfying this description are homotopic if they have the same number of disks mapped onto $S^m$ in an orientation preserving manner and the same number of disks in which the mapping is orientation reversing.

The crucial step in the argument is to show that a disk in which the orientation is positive and a disk in which the orientation is negative can be “cancelled,” so that the map is homotopic to a map satisfying the description above, but with one fewer disk of each type. Repeating this cancellation, we eventually arrive at a map in which the mapping is either orientation preserving in all disks or orientation reversing in all disks. Thus any map is homotopic to a map of this form, and any two such maps with the same number of disks of the same orientation are homotopic. Since the number of disks is the absolute value of the degree, and the maps are orientation preserving or orientation reversing according to whether the degree is positive or negative, we conclude that maps of the same degree are homotopic.

For the cancellation step it is best to adopt a concrete model of the domain and range. Let $w : \mathbb{R}^m \to S^m$ be a continuous function that maps the open disk of radius 1 centered at $e_1 = (1, 0, \ldots, 0)$ homeomorphically onto $S^m \setminus \{q_-\}$ while mapping every other point to $q_-$. Let $r : \mathbb{R}^m \to \mathbb{R}^m$ be the map $r(x) = (|x_1|, x_2, \ldots, x_m)$. Then $w \circ r$ maps the closed unit disks centered at $e_1$ and $-e_1$ onto $S^m$, with opposite orientation, and it maps every other point in $\mathbb{R}^m$ to $q_-$. For $0 \le t \le 1$ let $s_t : \mathbb{R}^m \to \mathbb{R}^m$ be the map $s_t(x) := (x_1 + 2t, x_2, \ldots, x_m)$. Then $t \mapsto h_t := w \circ s_t \circ r$ is a homotopy between $w \circ r$ and the constant map with value $q_-$.
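In the simplest case $m = 1$, with $M = S^1$, the degree of a map of the circle to itself is its winding number, and this can be computed numerically. The Python sketch below is an illustration only: it represents a circle map as a function from an angle to a unit complex number, the function name and the sample maps are hypothetical, and the computation assumes the sampling is fine enough that consecutive values are less than half a turn apart.

```python
import cmath
import math

# Illustrative sketch: the degree of a map f : S^1 -> S^1 computed as a winding
# number. Assumption: consecutive sampled values differ by less than half a
# turn. The function name and the sample maps below are hypothetical.

def degree_of_circle_map(f, samples=4096):
    """f takes an angle in [0, 2*pi] and returns a complex number of modulus 1."""
    total = 0.0
    previous = f(0.0)
    for k in range(1, samples + 1):
        current = f(2 * math.pi * k / samples)
        total += cmath.phase(current / previous)   # signed angle increment
        previous = current
    return round(total / (2 * math.pi))

power_map = lambda th: cmath.exp(3j * th)                           # z -> z^3
wobbly_map = lambda th: cmath.exp(1j * (3 * th + 0.8 * math.sin(5 * th)))
reflection = lambda th: cmath.exp(-1j * th)                         # z -> conj(z)
constant = lambda th: 1 + 0j
```

The maps `power_map` and `wobbly_map` are homotopic (shrink the coefficient 0.8 to zero) and have the same degree, while `reflection` and `constant` have different degrees and so cannot be homotopic to them or to each other.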

In preparation for an application of the Hopf theorem, we introduce an important concept of general topology. (A variant made an appearance in Sect. 8.5.) If $X$ is a topological space and $A \subset X$, the pair $(X, A)$ has the homotopy extension property if, for any topological space $Y$ and any continuous function $g : (X \times \{0\}) \cup (A \times [0,1]) \to Y$, there is a homotopy $h : X \times [0,1] \to Y$ that is an extension of $g$: $h(x, 0) = g(x, 0)$ for all $x \in X$ and $h(x, t) = g(x, t)$ for all $(x, t) \in A \times [0,1]$.

Lemma 14.1 The pair $(X, A)$ has the homotopy extension property if and only if $(X \times \{0\}) \cup (A \times [0,1])$ is a retract of $X \times [0,1]$.

Proof If $(X, A)$ has the homotopy extension property, then we can set $Y := (X \times \{0\}) \cup (A \times [0,1])$ and $g := \mathrm{Id}_Y$ in the definition above, in which case a continuous extension of $g$ to $X \times [0,1]$ is a retraction. On the other hand, if $r$ is such a retraction, then for any continuous $g : (X \times \{0\}) \cup (A \times [0,1]) \to Y$ there is a continuous extension $h = g \circ r$. □

The next two results are worth noting, even if they will not be applied later.

Corollary 14.1 If $(X, A)$ and $(A, B)$ have the homotopy extension property, then so does $(X, B)$.

Proof Let $r : X \times [0,1] \to (X \times \{0\}) \cup (A \times [0,1])$ and $s : A \times [0,1] \to (A \times \{0\}) \cup (B \times [0,1])$ be retractions. Let $s' : (X \times \{0\}) \cup (A \times [0,1]) \to (X \times \{0\}) \cup (B \times [0,1])$ be the function that agrees with $s$ on $A \times [0,1]$ and maps $(x, 0)$ to itself when $x \notin A$. Then $s'$ is continuous [just why, exactly?], so it is a retraction, and thus $s' \circ r$ is a retraction. □

Proposition 14.1 If $X$ is a finite simplicial complex and $A$ is a subcomplex, then $(X, A)$ has the homotopy extension property.

Proof Since we can pass from $X$ to $A$ by repeatedly removing maximal simplices of $X$ that are not in $A$, in view of the last result it suffices to show this if there is only one simplex $\sigma$ in $X$ that is not in $A$. But in this case either the boundary of $\sigma$ is contained in $A$, in which case there is an argument like the proof of the following, or it isn't, and another very simple construction works. [Add details] □

For the remainder of the chapter $D^m := \{ x \in \mathbb{R}^m : \|x\| \le 1 \}$ is the unit disk in $\mathbb{R}^m$ and $S^{m-1} := \{ x \in \mathbb{R}^m : \|x\| = 1 \}$ is its boundary.

Lemma 14.2 The pair $(D^m, S^{m-1})$ has the homotopy extension property.

Proof There is an obvious retraction

$$r : D^m \times [0,1] \to (D^m \times \{0\}) \cup (S^{m-1} \times [0,1])$$

defined by projecting radially from $(0, 2) \in \mathbb{R}^m \times \mathbb{R}$. □
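The radial projection can be written out explicitly. The following is a minimal sketch, not taken from the text: the function name `retract` and the case split are our own, obtained by following the ray from $(0, 2)$ through $(x, t)$ until it first meets the bottom face $D^m \times \{0\}$ or the wall $S^{m-1} \times [0,1]$.

```python
import math

def retract(x, t):
    """Project (x, t) in D^m x [0, 1] radially from (0, 2) onto
    (D^m x {0}) union (S^{m-1} x [0, 1]).

    x is a list of floats with Euclidean norm <= 1 and 0 <= t <= 1.
    Returns the pair (point of D^m, height in [0, 1]).
    """
    norm = math.sqrt(sum(c * c for c in x))
    if norm <= 1.0 - t / 2.0:
        # The ray from (0, 2) through (x, t) reaches height 0 while its
        # radius is still at most 1, so it lands on the bottom face.
        scale = 2.0 / (2.0 - t)
        return [scale * c for c in x], 0.0
    # Otherwise the ray reaches radius 1 first and lands on the wall.
    # Here norm > 1 - t/2 >= 1/2, so the division is safe.
    return [c / norm for c in x], 2.0 - (2.0 - t) / norm
```

Points that already lie on the bottom face or on the wall are left fixed, which is what makes the map a retraction, and the two branches agree where $\|x\| = 1 - t/2$, which gives continuity.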

We now relate the degree of a map from $D^m$ to $\mathbb{R}^m$ with what may be thought of as the “winding number” of the restriction of the map to $S^{m-1}$.

Theorem 14.5 If $m \ge 2$, $f : D^m \to \mathbb{R}^m$ is continuous, $0 \notin f(S^{m-1})$, and $\tilde f : S^{m-1} \to S^{m-1}$ is the function $x \mapsto f(x)/\|f(x)\|$, then $\deg_0(f) = \deg(\tilde f)$.

Proof Let $k := \deg(\tilde f)$, let $\rho_k : D^m \to D^m$ be the map

$$(r\cos\theta, r\sin\theta, x_3, \ldots, x_m) \mapsto (r\cos k\theta, r\sin k\theta, x_3, \ldots, x_m),$$

and let $\sigma_k := \rho_k|_{S^{m-1}} : S^{m-1} \to S^{m-1}$. By considering the degree over an $x \in S^{m-1}$ with $(x_1, x_2) \ne (0, 0)$ it is easy to see that $\deg(\sigma_k) = k$.


The Hopf theorem implies that there is a homotopy $h : S^{m-1} \times [0,1] \to S^{m-1}$ with $h_0 = \tilde f$ and $h_1 = \sigma_k$. Let $\tilde h : S^{m-1} \times [0,1] \to \mathbb{R}^m$ be the homotopy

$$\tilde h(x, t) := \bigl((1-t)\|f(x)\| + t\bigr)\, h(x, t).$$

Note that $\tilde h_0 = f|_{S^{m-1}}$, $\tilde h_1 = \sigma_k$, and $0 \notin \tilde h(S^{m-1} \times [0,1])$. Extend this to $g : (D^m \times \{0\}) \cup (S^{m-1} \times [0,1]) \to \mathbb{R}^m$ by setting $g(x, 0) := f(x)$. The last result implies that $g$ extends to a homotopy $j : D^m \times [0,1] \to \mathbb{R}^m$. Since $j$ is degree admissible over 0, $\deg_0(f) = \deg_0(j_1)$.

There is an additional homotopy $\ell : D^m \times [0,1] \to \mathbb{R}^m$ with $\ell_0 = j_1$ given by

$$\ell(x, t) := (1-t)\, j_1(x) + t\rho_k(x).$$

Note that $\ell_t|_{S^{m-1}} = \sigma_k$ for all $t$, so $\ell$ is degree admissible over 0, and thus $\deg_0(j_1) = \deg_0(\rho_k)$. By considering the degree of $\rho_k$ over a regular value near 0, it is easy to see that $\deg_0(\rho_k) = k$. □
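For $m = 2$ the degree of the boundary map is the usual winding number, and the claim that the restriction of $\rho_k$ to the circle has degree $k$ can be checked numerically. The sketch below is only an illustration under our own naming (`rho`, `winding_number`); it accumulates the signed angle swept out by the image curve.

```python
import math

def rho(k, x):
    """The map rho_k on the plane: multiply the polar angle by k, keep the radius."""
    r = math.hypot(x[0], x[1])
    theta = math.atan2(x[1], x[0])
    return (r * math.cos(k * theta), r * math.sin(k * theta))

def winding_number(curve, samples=2000):
    """Winding number about 0 of the closed curve theta -> curve(theta), 0 <= theta <= 2*pi.

    The curve must avoid the origin, and consecutive samples must subtend
    an angle smaller than pi for the accumulated total to be correct.
    """
    total = 0.0
    prev = curve(0.0)
    for i in range(1, samples + 1):
        cur = curve(2.0 * math.pi * i / samples)
        # Signed angle from prev to cur, computed from cross and dot products.
        total += math.atan2(prev[0] * cur[1] - prev[1] * cur[0],
                            prev[0] * cur[0] + prev[1] * cur[1])
        prev = cur
    return round(total / (2.0 * math.pi))

def sigma_degree(k):
    """Winding number of the restriction of rho_k to the unit circle."""
    return winding_number(lambda th: rho(k, (math.cos(th), math.sin(th))))
```

With these definitions `sigma_degree(k)` returns `k` for small integers `k`, in agreement with $\deg(\sigma_k) = k$.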

14.3 More on Maps Between Spheres

Insofar as spheres are the simplest “nontrivial” (where, in effect, this means noncontractible) topological spaces, it is entirely natural that mathematicians would quickly investigate the application of degree and index theory to these spaces, and to maps between them. There are many results coming out of this research, some of which are quite famous.

Some of our arguments involve induction on m, and for this purpose we willregard Sm−1 as a subset of Sm by setting

Sm−1 := { x ∈ Sm : xm+1 = 0 } .

Let am : Sm → Sm be the function

am(x) := −x .

Two points $x, y \in S^m$ are said to be antipodal if $y = a_m(x)$. Regarded topologically, $a_m$ is a fixed point free diffeomorphism whose composition with itself is $\mathrm{Id}_{S^m}$. It is easy to see that the derivative of $a_m$ is orientation preserving if $m$ is odd and orientation reversing if $m$ is even, so $\deg(a_m) = (-1)^{m+1}$.

Let

$$E_m := \{ (x, y) \in S^m \times S^m : y \ne a_m(x) \}.$$

There is a continuous function $r_m : E_m \times [0,1] \to S^m$ given by

$$r_m(x, y, t) := \frac{(1-t)x + ty}{\|(1-t)x + ty\|}.$$

Of course $r_m(x, y, 0) = x$ and $r_m(x, y, 1) = y$.

Proposition 14.2 Suppose $f, f' : S^m \to S^n$ are continuous. If they do not map any point to a pair of antipodal points, that is, $f'(p) \ne a_n(f(p))$ for all $p \in S^m$, then $f$ and $f'$ are homotopic.

Proof Specifically, there is the homotopy $h(x, t) = r_n(f(x), f'(x), t)$. □
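The interpolation used in this homotopy is easy to compute. The following sketch (the name `r_m` mirrors the text, everything else is our own choice) normalizes the straight-line interpolant and raises an error in the one situation where the formula breaks down, namely when the interpolant passes through the origin.

```python
import math

def r_m(x, y, t):
    """Normalized linear interpolation between unit vectors x and y.

    Defined whenever (1 - t) * x + t * y is nonzero, which holds for all
    t in [0, 1] as long as y is not the antipode of x.
    """
    z = [(1.0 - t) * a + t * b for a, b in zip(x, y)]
    norm = math.sqrt(sum(c * c for c in z))
    if norm == 0.0:
        raise ValueError("interpolant vanishes: x and y are antipodal")
    return [c / norm for c in z]
```

For example, with `x = [1, 0, 0]` and `y = [0, 1, 0]` the path `t -> r_m(x, y, t)` starts at `x`, ends at `y`, and stays on the unit sphere.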

Consider a continuous function $f : S^m \to S^n$. If $m < n$, then $f$ is homotopic to a constant map, and thus rather uninteresting. To see this, first note that the smooth functions are dense in $C(S^m, S^n)$, and a sufficiently nearby function does not map any point to the antipode of its image under $f$, so $f$ is homotopic to a smooth function. So, suppose that $f$ is smooth. By Sard's theorem, the regular values of $f$ are dense, and since $n > m$, a regular value is a $y \in S^n$ with $f^{-1}(y) = \emptyset$. We now have the homotopy $h(x, t) = r_n(f(x), a_n(y), t)$.

When $m > n$, on the other hand, the analysis of the homotopy classes of maps from $S^m$ to $S^n$ is a very difficult topic that has been worked out for many specific values of $m$ and $n$, but not in general. We will only discuss the case of $m = n$, for which the most basic question is the relation between the index and the degree.

Theorem 14.6 If f : Sm → Sm is continuous, then

ΛSm ( f ) = 1 + (−1)m deg( f ) .

Proof Hopf's theorem (Theorem 14.4) implies that two maps from $S^m$ to itself are homotopic if they have the same degree, and the index is a homotopy invariant, so it suffices to determine the relationship between the degree and index for a specific instance of a map of each possible degree.

We begin with m = 1. For d ∈ Z let f1,d : S1 → S1 be the function

f1,d(cos θ, sin θ) := (cos dθ, sin dθ) .

If $d > 0$, then $f_{1,d}^{-1}(1, 0)$ consists of $d$ points at which $f_{1,d}$ is orientation preserving, when $d = 0$ there are points in $S^1$ that are not in the image of $f_{1,0}$, and if $d < 0$, then $f_{1,d}^{-1}(1, 0)$ consists of $-d$ points at which $f_{1,d}$ is orientation reversing. Therefore

$$\deg(f_{1,d}) = d.$$

Now observe that $f_{1,1}$ is homotopic to a map without fixed points, while for $d \ne 1$ the fixed points of $f_{1,d}$ are the points

$$\Bigl(\cos\tfrac{2\pi k}{d-1}, \sin\tfrac{2\pi k}{d-1}\Bigr) \qquad (k = 0, \ldots, |d-1| - 1).$$


If $d > 1$, then motion in the domain is translated by $f_{1,d}$ into more rapid motion in the range, so the index of each fixed point is $-1$. When $d < 1$, $f_{1,d}$ translates motion in the domain into motion in the opposite direction in the range (or, when $d = 0$, into no motion at all), so the index of each fixed point is 1. Combining these facts, we conclude that

$$\Lambda_{S^1}(f_{1,d}) = 1 - d,$$

which establishes the result when $m = 1$.

Let $e_{m+1} = (0, \ldots, 0, 1) \in \mathbb{R}^{m+1}$. Then

$$S^m = \{ \alpha x + \beta e_{m+1} : x \in S^{m-1},\ \alpha \ge 0,\ \alpha^2 + \beta^2 = 1 \}.$$

We define $f_{m,d}$ inductively by the formula

$$f_{m,d}(\alpha x + \beta e_{m+1}) := \alpha f_{m-1,-d}(x) - \beta e_{m+1}.$$

If $f_{m-1,-d}$ is orientation preserving (reversing) at $x \in S^{m-1}$, then $f_{m,d}$ is clearly orientation reversing (preserving) at $x$, so $\deg(f_{m,d}) = -\deg(f_{m-1,-d})$. Therefore, by induction, $\deg(f_{m,d}) = d$.

The fixed points of $f_{m,d}$ are evidently the fixed points of $f_{m-1,-d}$. Fix such an $x$. Computing in a local coordinate system, one may easily show that the index of $x$, as a fixed point of $f_{m,d}$, is the same as the index of $x$ as a fixed point of $f_{m-1,-d}$, so $\Lambda_{S^m}(f_{m,d}) = \Lambda_{S^{m-1}}(f_{m-1,-d})$. By induction,

$$\Lambda_{S^m}(f_{m,d}) = \Lambda_{S^{m-1}}(f_{m-1,-d}) = 1 + (-1)^{m-1}\deg(f_{m-1,-d}) = 1 + (-1)^m \deg(f_{m,d}).$$

□
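The base case $m = 1$ of this proof can be checked by direct enumeration. The sketch below is an illustration under our own naming; it lists the fixed points of the map that multiplies the angle by $d$, confirms that they are fixed, and sums their indices, taking the index of a regular fixed point of a circle map to be the sign of one minus the derivative in the angular coordinate.

```python
import math

def fixed_point_angles(d):
    """Angles 2*pi*k/(d - 1), k = 0, ..., |d - 1| - 1, for d != 1."""
    return [2.0 * math.pi * k / (d - 1) for k in range(abs(d - 1))]

def is_fixed(d, theta, tol=1e-9):
    """Check that the point with angle theta is fixed by theta -> d * theta.

    Points on the circle are compared rather than angles, which avoids
    having to reduce modulo 2*pi.
    """
    return (abs(math.cos(d * theta) - math.cos(theta)) < tol
            and abs(math.sin(d * theta) - math.sin(theta)) < tol)

def lefschetz_number(d):
    """Sum of the fixed point indices of the angle-multiplication map of degree d."""
    if d == 1:
        # The identity is homotopic to a small rotation, which has no fixed points.
        return 0
    index = 1 if d < 1 else -1  # sign of 1 - d
    return index * len(fixed_point_angles(d))
```

For every integer `d` this returns `1 - d`, which is the formula of Theorem 14.6 with $m = 1$.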

Corollary 14.2 If a map $f : S^m \to S^m$ has no fixed points, then $\deg(f) = (-1)^{m+1}$. If $f$ does not map any point to its antipode, which is to say that $a_m \circ f$ has no fixed points, then $\deg(f) = 1$. Consequently, if $f$ does not map any point either to itself or its antipode, then $m$ is odd.

Proof The first claim follows from $\Lambda_{S^m}(f) = 0$ and the result above. In particular, $a_m$ has no fixed points, so $\deg(a_m) = (-1)^{m+1}$. The second result now follows from the multiplicative property of the degree of a composition (Corollary 12.4):

$$(-1)^{m+1} = \deg(a_m \circ f) = \deg(a_m) \times \deg(f) = (-1)^{m+1}\deg(f).$$

□

Proposition 14.3 If the map $f : S^m \to S^m$ never maps antipodal points to antipodal points, that is, $a_m(f(p)) \ne f(a_m(p))$ for all $p \in S^m$, then $\deg(f)$ is even. If $m$ is even, then $\deg(f) = 0$.

Proof The homotopy $h : S^m \times [0,1] \to S^m$ given by

$$h(p, t) := r_m(f(p), f(a_m(p)), t)$$

shows that $f$ and $f \circ a_m$ are homotopic, whence $\deg(f) = \deg(f \circ a_m)$. Corollaries 12.4 and 14.2 give

$$\deg(f) = \deg(f \circ a_m) = \deg(f)\deg(a_m) = (-1)^{m+1}\deg(f),$$

and when $m$ is even it follows that $\deg(f) = 0$.

Since $f$ is homotopic to a nearby smooth function, we may assume that it is smooth, in which case each $h_t$ is also smooth. Sard's theorem implies that each $h_t$ has regular values, and since $h_{1/2} = h_{1/2} \circ a_m$, any regular value of $h_{1/2}$ has an even number of preimages. The sum of an even number of elements of $\{1, -1\}$ is even, so it follows that $\deg(f) = \deg(h_{1/2})$ is even. □

Combining this result with the first assertion of Corollary 14.2 gives a result that was actually applied to the theory of general economic equilibrium by Hart and Kuhn (1975):

Corollary 14.3 Any map $f : S^m \to S^m$ either has a fixed point or a point $p$ such that $f(a_m(p)) = a_m(f(p))$.

Of course $a_m$ extends to the map $x \mapsto -x$ from $\mathbb{R}^{m+1}$ to itself, and in appropriate contexts we will understand it in this sense. If $D \subset \mathbb{R}^{m+1}$ satisfies $a_m(D) = D$, a map $f : D \to \mathbb{R}^{n+1}$ is said to be antipodal if

$$f \circ a_m|_D = a_n \circ f.$$

The next result seems to be naturally paired with Proposition 14.3, but it is actually much deeper.

Theorem 14.7 If a map f : Sm → Sm is antipodal, then its degree is odd.

Proof There are smooth maps arbitrarily close to $f$. For such an $f'$ the map

$$p \mapsto r_m\bigl(f'(p), -f'(-p), \tfrac{1}{2}\bigr)$$

is well defined, smooth, antipodal, and close to $f$, so it is homotopic to $f$ and has the same degree. Evidently it suffices to prove the claim with $f$ replaced by this map, so we may assume that $f$ is smooth. Sard's theorem implies that there is a regular value of $f$, say $q$.

After rotating $S^m$ we may assume that $q = (0, \ldots, 0, 1)$ and $-q = (0, \ldots, 0, -1)$ are the North and South poles of $S^m$. We would like to assume that

$$\bigl(f^{-1}(q) \cup f^{-1}(-q)\bigr) \cap S^{m-1} = \emptyset,$$

and we can bring this about by replacing $f$ with $f \circ h$ where $h : S^m \to S^m$ is an antipodal diffeomorphism that perturbs neighborhoods of the points in $f^{-1}(q) \cup f^{-1}(-q)$ while leaving points far away from these points fixed. (Such an $h$ can easily be constructed using the methods of Sect. 10.2.)

Since a sum of numbers drawn from $\{-1, 1\}$ is even or odd according to whether the number of summands is even or odd, our goal reduces to showing that $f^{-1}(q)$ has an odd number of elements. When $m = 0$ this is established by considering the two antipode preserving maps from $S^0$ to itself. Proceeding inductively, suppose the result has been established when $m$ is replaced by $m - 1$.

For $p \in S^m$, $p \in f^{-1}(q)$ if and only if $-p \in f^{-1}(-q)$, because $f$ is antipodal, so the number of elements of $f^{-1}(q) \cup f^{-1}(-q)$ is twice the number of elements of $f^{-1}(q)$. Let

$$S^m_+ := \{ p \in S^m : p_{m+1} \ge 0 \} \quad\text{and}\quad S^m_- := \{ p \in S^m : p_{m+1} \le 0 \}$$

be the Northern and Southern hemispheres of $S^m$. Then $p \in S^m_+$ if and only if $-p \in S^m_-$, so $S^m_+$ contains half the elements of $f^{-1}(q) \cup f^{-1}(-q)$. Thus it suffices to show that $(f^{-1}(q) \cup f^{-1}(-q)) \cap S^m_+$ has an odd number of elements.

For $\varepsilon > 0$ consider the small open and closed disks

$$D_\varepsilon := \{ p \in S^m : p_{m+1} > 1 - \varepsilon \} \quad\text{and}\quad \overline{D}_\varepsilon := \{ p \in S^m : p_{m+1} \ge 1 - \varepsilon \}$$

centered at the North pole. Since $f$ is antipode preserving, $-q$ is also a regular value of $f$. In view of the inverse function theorem, $f^{-1}(D_\varepsilon \cup -D_\varepsilon)$ is a disjoint union of diffeomorphic images of $D_\varepsilon$, and none of these intersect $S^{m-1}$ if $\varepsilon$ is sufficiently small. Concretely, for each $p \in f^{-1}(q) \cup f^{-1}(-q)$ the component $C_p$ of $f^{-1}(D_\varepsilon \cup -D_\varepsilon)$ containing $p$ is mapped diffeomorphically by $f$ to either $D_\varepsilon$ or $-D_\varepsilon$, and the various $C_p$ are disjoint from each other and $S^{m-1}$. Therefore we wish to show that $f^{-1}(D_\varepsilon \cup -D_\varepsilon) \cap S^m_+$ has an odd number of components.

Let $M := S^m_+ \setminus f^{-1}(D_\varepsilon \cup -D_\varepsilon)$. Clearly $M$ is a compact $m$-dimensional smooth $\partial$-manifold. Each point in $S^m \setminus \{q, -q\}$ has a unique representation of the form $\alpha y + \beta q$ where $y \in S^{m-1}$, $0 < \alpha \le 1$, and $\alpha^2 + \beta^2 = 1$. Let $j : S^m \setminus \{q, -q\} \to S^{m-1}$ be the function $j(\alpha y + \beta q) := y$, and let

$$g := j \circ f|_M : M \to S^{m-1}.$$

Sard's theorem implies that some $q^* \in S^{m-1}$ is a regular value of both $g$ and $g|_{\partial M}$. Theorem 12.1 implies that $\deg_{q^*}(g|_{\partial M}) = 0$, so $(g|_{\partial M})^{-1}(q^*)$ has an even number of elements. Evidently $g$ maps the boundary of each $C_p$ diffeomorphically onto $S^{m-1}$, so each such boundary contains exactly one element of $(g|_{\partial M})^{-1}(q^*)$. In addition, $j$ maps antipodal points of $S^m \setminus \{q, -q\}$ to antipodal points of $S^{m-1}$, so $g|_{S^{m-1}}$ is antipodal, and our induction hypothesis implies that $(g|_{\partial M})^{-1}(q^*) \cap S^{m-1}$ has an odd number of elements. Therefore the number of components of $f^{-1}(D_\varepsilon \cup -D_\varepsilon)$ contained in $S^m_+$ is odd, as desired. □

The hypotheses can be weakened:


Corollary 14.4 If the map $f : S^m \to S^m$ satisfies $f(-p) \ne f(p)$ for all $p$, then the degree of $f$ is odd.

Proof This will follow from the last result once we have shown that $f$ is homotopic to an antipodal map. Let $h : S^m \times [0,1] \to S^m$ be the homotopy $h(p, t) := r_m(f(p), -f(-p), t/2)$. The hypothesis implies that this is well defined, $h_0 = f$, and $h_1$ is antipodal. □

Theorem 14.8 (Borsuk–Ulam Theorem) If $f : S^m \to \mathbb{R}^m$ is continuous, then there is a $p \in S^m$ such that $f(p) = f(a_m(p))$.

Proof We think of $\mathbb{R}^m$ as $S^m$ with a point removed, so a continuous $f : S^m \to \mathbb{R}^m$ amounts to a function from $S^m$ to itself that is not surjective, and whose degree is consequently zero. Since zero is not odd, the last result implies that $f(-p) = f(p)$ for some $p$. □
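For $m = 1$ the theorem reduces to the intermediate value theorem, and the point $p$ can be located by bisection. The sketch below is our own illustration, not part of the text: a function on the circle is represented as a $2\pi$-periodic function of the angle, and the names `antipodal_coincidence` and `example` are hypothetical.

```python
import math

def antipodal_coincidence(f, tol=1e-12):
    """Find an angle p with f(p) = f(p + pi) for a continuous 2*pi-periodic f.

    The difference g(theta) = f(theta) - f(theta + pi) satisfies
    g(theta + pi) = -g(theta), so g changes sign on [0, pi] and
    bisection applies.
    """
    def g(theta):
        return f(theta) - f(theta + math.pi)

    lo, hi = 0.0, math.pi
    g_lo = g(lo)
    if g_lo == 0.0:
        return lo
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_mid == 0.0:
            return mid
        if (g_mid > 0.0) == (g_lo > 0.0):
            lo, g_lo = mid, g_mid   # sign change lies in [mid, hi]
        else:
            hi = mid                # sign change lies in [lo, mid]
    return 0.5 * (lo + hi)

def example(theta):
    """An arbitrary smooth 2*pi-periodic test function."""
    return math.cos(theta) + 0.5 * math.sin(2.0 * theta) + 0.3 * math.cos(3.0 * theta + 1.0)
```

Calling `antipodal_coincidence(example)` returns an angle `p` at which `example(p)` and `example(p + math.pi)` agree up to rounding. In higher dimensions no such elementary search is available, which is one way of seeing why the general theorem is deeper.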

This famous result has a wealth of geometric consequences.

Corollary 14.5 If $f : S^m \to \mathbb{R}^m$ is continuous and antipodal, then there is a $p \in S^m$ such that $f(p) = 0$.

Proof The Borsuk–Ulam theorem gives a point $p$ such that $f(p) = f(-p)$. If $f$ is also antipodal, then $f(-p) = -f(p)$ so $f(p) = 0$. □

Regarding Sm−1 as a subset of Rm in the usual way, it follows that:

Corollary 14.6 There is no continuous antipodal f : Sm → Sm−1.

Corollary 14.7 There is no continuous $g : D^m \to S^{m-1}$ such that $g|_{S^{m-1}}$ is antipodal.

Proof As in the proof of Theorem 14.7 let $S^m_+$ and $S^m_-$ be the Northern and Southern hemispheres of $S^m$. There is an obvious homeomorphism of $D^m$ and $S^m_+$ that restricts to $\mathrm{Id}_{S^{m-1}}$. If $g : S^m_+ \to S^{m-1}$ were continuous with $g|_{S^{m-1}}$ antipodal, we could define a continuous and antipodal $f : S^m \to S^{m-1}$ by setting

$$f(p) := \begin{cases} g(p), & p \in S^m_+, \\ a_m(g(a_m(p))), & p \in S^m_-, \end{cases}$$

contradicting Corollary 14.6. □

If $f : S^m \to \mathbb{R}^m$ was continuous, but there was no $p \in S^m$ such that $f(p) = f(-p)$, then the restriction of the map $p \mapsto (f(p) - f(-p))/\|f(p) - f(-p)\|$ to $S^m_+$ could be construed as a continuous map from $D^m$ to $S^{m-1}$ whose restriction to $S^{m-1}$ was antipodal. Thus this last result implies the Borsuk–Ulam theorem, so these three corollaries are equivalent rephrasings of that result.

Corollary 14.8 Any cover $F_1, \ldots, F_{m+1}$ of $S^m$ by $m + 1$ closed sets has at least one set that contains a pair of antipodal points.

Proof Define $f : S^m \to \mathbb{R}^m$ by setting $f(p) := g(p) - g(-p)$ where

$$g(p) := \bigl(d(p, F_1), \ldots, d(p, F_m)\bigr)$$

and $d(p, F_i) := \min_{x \in F_i} \|p - x\|$ is the distance from $p$ to $F_i$ in the usual metric for $\mathbb{R}^{m+1}$. Evidently $f$ is continuous and antipodal, so Corollary 14.5 implies that there is a $p$ such that $g(p) = g(-p)$. If $g_i(p) = 0$ for some $i$, then $g_i(-p) = 0$ as well, so $p, -p \in F_i$, and if all the components of $g(p)$ are nonzero, then neither $p$ nor $-p$ lies in $F_1 \cup \cdots \cup F_m$, so $p, -p \in F_{m+1}$. □

Corollary 14.9 Any cover $U_1, \ldots, U_{m+1}$ of $S^m$ by $m + 1$ open sets has at least one set that contains a pair of antipodal points.

Proof Suppose that $\varepsilon > 0$. For $i = 1, \ldots, m + 1$ set $F_i := \{ p \in S^m : d(p, S^m \setminus U_i) \ge \varepsilon \}$. Then each $F_i$ is a closed subset of $U_i$, and the Lebesgue number lemma implies that these sets cover $S^m$ if $\varepsilon$ is sufficiently small. Now apply Corollary 14.8. □

If a closed subset of $S^m$ does not contain a pair of antipodal points, then a small enough neighborhood of this set also has this property, so Corollary 14.9 implies Corollary 14.8. We now present an argument showing that Corollary 14.8 implies the conclusion of Corollary 14.6, so that these two corollaries are also equivalent rephrasings of the Borsuk–Ulam theorem.

Consider an $m$-simplex in $D^m$ that has the origin in its interior. Let $F_1, \ldots, F_{m+1}$ be the radial projections of the facets of the simplex onto $S^{m-1}$. These sets are closed and cover $S^{m-1}$. If $f : S^m \to S^{m-1}$ is continuous, then $f^{-1}(F_1), \ldots, f^{-1}(F_{m+1})$ are a cover of $S^m$ by closed sets, and Corollary 14.8 implies the existence of a pair $p, -p \in f^{-1}(F_i)$ for some $i$. If $f$ was also antipodal, then $f(-p) = -f(p)$, but $F_i$ is separated from the origin by a hyperplane, so $f(p), -f(p) \in F_i$ is impossible.

Two other consequences of the Borsuk–Ulam theorem, the following “obvious” facts, are actually highly nontrivial.

Theorem 14.9 Spheres of different dimensions are not homeomorphic.

Proof If $k < m$ then, since $S^k$ can be embedded in $\mathbb{R}^m$, the Borsuk–Ulam theorem implies that a continuous function from $S^m$ to $S^k$ cannot be injective. □

Theorem 14.10 Euclidean spaces of different dimensions are not homeomorphic.

Proof If $k \ne m$ and $f : \mathbb{R}^k \to \mathbb{R}^m$ was a homeomorphism, for any sequence $\{x_j\}$ in $\mathbb{R}^k$ with $\|x_j\| \to \infty$ the sequence $\{f(x_j)\}$ could not have a convergent subsequence, so $\|f(x_j)\| \to \infty$. Identifying $\mathbb{R}^k$ and $\mathbb{R}^m$ with $S^k \setminus \{\mathrm{pt}_k\}$ and $S^m \setminus \{\mathrm{pt}_m\}$, where $\mathrm{pt}_k$ and $\mathrm{pt}_m$ are the respective north poles, the extension of $f$ to $S^k$ given by setting $f(\mathrm{pt}_k) := \mathrm{pt}_m$ would be continuous, with a continuous inverse, contrary to the last result. □


14.4 Invariance of Domain

The main result of this section, invariance of domain, is a famous result with numerous applications. It can be thought of as a purely topological version of the inverse function theorem. The next two lemmas prepare the proof.

Lemma 14.3 Suppose $S^m_+$ is the Northern hemisphere of $S^m$, $f : S^m_+ \to S^m$ is a map such that $f|_{S^{m-1}}$ is antipodal, and $p \in S^m_+ \setminus S^{m-1}$ is a point such that $-p \notin f(S^m_+)$ and $p \notin f(S^{m-1})$. Then $\deg_p(f)$ is odd.

Proof Let $\tilde f : S^m \to S^m$ be the extension of $f$ given by setting $\tilde f(x) := -f(-x)$ when $x_{m+1} < 0$. Clearly $\tilde f$ is continuous and antipodal, so its degree is odd. The hypotheses imply that $\tilde f^{-1}(p) \subset S^m_+ \setminus S^{m-1}$, and that $f$ is degree admissible over $p$, so Additivity implies that $\deg_p(f) = \deg_p(\tilde f)$. □

Lemma 14.4 If $f : D^m \to \mathbb{R}^m$ is injective, then $\deg_{f(0)}(f)$ is odd, and $f(D^m)$ includes a neighborhood of $f(0)$.

Proof Replacing $f$ with $x \mapsto f(x) - f(0)$, we may assume that $f(0) = 0$. Let $h : D^m \times [0,1] \to \mathbb{R}^m$ be the homotopy

$$h(x, t) := f\Bigl(\frac{x}{1+t}\Bigr) - f\Bigl(\frac{-tx}{1+t}\Bigr).$$

Of course $h_0 = f$ and $h_1$ is antipodal. If $h_t(x) = 0$ then, because $f$ is injective, $x = -tx$, so that $x = 0$. Therefore $h$ is a degree admissible homotopy over zero, so $\deg_0(h_0) = \deg_0(h_1)$, and the last result implies that $\deg_0(h_1)$ is odd, so $\deg_0(h_0) = \deg_0(f)$ is odd. The Continuity property of the degree implies that $\deg_y(f)$ is odd for all $y$ in some neighborhood of $f(0)$. Since, by Additivity, $\deg_y(f) = 0$ whenever $y \notin f(D^m)$, we conclude that $f(D^m)$ contains a neighborhood of 0. □
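The three properties of the homotopy used in this proof follow directly from its formula. The lines below spell out the computation; they use only the normalization $f(0) = 0$ and the injectivity of $f$, and are offered as a worked check rather than as part of the original argument.

```latex
\begin{align*}
h_0(x) &= f(x) - f(0) = f(x), \\
h_1(x) &= f\bigl(\tfrac{x}{2}\bigr) - f\bigl(-\tfrac{x}{2}\bigr),
  \qquad
  h_1(-x) = f\bigl(-\tfrac{x}{2}\bigr) - f\bigl(\tfrac{x}{2}\bigr) = -h_1(x), \\
h_t(x) = 0
  &\iff f\bigl(\tfrac{x}{1+t}\bigr) = f\bigl(\tfrac{-tx}{1+t}\bigr)
  \iff \tfrac{x}{1+t} = \tfrac{-tx}{1+t}
  \iff (1+t)\,x = 0
  \iff x = 0 .
\end{align*}
```

The second equivalence in the last line is where injectivity enters; without it the homotopy could acquire zeros on the boundary sphere and would no longer be degree admissible over zero.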

The next result is quite famous, being commonly regarded as one of the major accomplishments of algebraic topology. As the elementary nature of the assertion suggests, it is applied quite frequently.

Theorem 14.11 (Invariance of Domain) If $U \subset \mathbb{R}^m$ is open and $f : U \to \mathbb{R}^m$ is continuous and injective, then $f(U)$ is open and $f$ is a homeomorphism onto its image.

Proof The last result can be applied to a closed disk surrounding any point in the domain, so for any open $V \subset U$, $f(V)$ is open. Thus $f^{-1}$ is continuous. □

14.5 Essential Sets Revisited

Let $X$ be a compact ANR, let $C \subset X$ be compact, and let $F : C \to X$ be an index admissible correspondence. If $\Lambda_X(F) \ne 0$, then $\mathcal{F}(F)$ is essential, and of course this robustness is crucial in economic applications. Suppose that $\Lambda_X(F) = 0$. Are there necessarily correspondences near $F$ that have no fixed points?


Suppose that $C_1, \ldots, C_r$ are pairwise disjoint compact subsets of $C$ with $\mathcal{F}(F)$ contained in the interior of $C_1 \cup \ldots \cup C_r$. When $r > 1$ it can easily happen that $\Lambda_X(F) = \sum_i \Lambda_X(F|_{C_i}) = 0$ even though $\Lambda_X(F|_{C_i}) \ne 0$ for some $i$. In addition, $\mathcal{F}(F)$ is inessential if and only if each $\mathcal{F}(F|_{C_i})$ is inessential. These considerations suggest that we should study the problem when $\mathcal{F}(F)$ is connected, and we will assume that this is the case. Since we may replace $C$ with the connected component that contains $\mathcal{F}(F)$, we can and will also assume that $C$ is connected.

Without additional assumptions, there is little hope of achieving positive results. When $X$ is an ANR, it is a retract of an open subset of a Banach space. Pursuing the techniques we develop below in that context would lead eventually to composing a perturbation with a retraction, and it is difficult to prevent the retraction from introducing undesired fixed points. An approach to this issue for simplicial complexes is developed in Chap. VIII of Brown (1971).

In this section we develop two contexts in which, if ΛX (F) = 0 and F(F) isconnected, then F(F) is inessential. Specifically, our attention will be restricted tothe following settings: a) X is a “well behaved” subset of a smooth manifold; b) Xis a compact convex subset of a Euclidean space.

The gist of the argument used to prove these results is to first approximate with asmooth function that has only regular fixed points, which are necessarily finite andcan be organized in pairs of opposite index, then perturb to eliminate each pair, asper the following result. As before let

Dm := { x ∈ Rm : ‖x‖ ≤ 1 } and Sm−1 := { x ∈ Dm : ‖x‖ = 1 }.

Proposition 14.4 If $g : D^m \to \mathbb{R}^m$ is continuous, $0 \notin g(S^{m-1})$, and $\deg_0(g) = 0$, then there is a continuous $\tilde g : D^m \to \mathbb{R}^m \setminus \{0\}$ with $\tilde g|_{S^{m-1}} = g|_{S^{m-1}}$ and $\max_{x \in D^m} \|\tilde g(x)\| = \max_{x \in S^{m-1}} \|g(x)\|$.

Proof Let $\hat g : S^{m-1} \to S^{m-1}$ be the function $\hat g(x) = g(x)/\|g(x)\|$. Theorem 14.5 implies that $\deg(\hat g) = 0$, so the Hopf theorem implies that there is a homotopy $h : S^{m-1} \times [0,1] \to S^{m-1}$ with $h_0 = \hat g$ and $h_1$ a constant function. Let $\gamma := \max_{x \in S^{m-1}} \|g(x)\|$. For $(x, t) \in S^{m-1} \times [0,1]$ we set

$$\tilde g(tx) = \bigl(t\|g(x)\| + (1-t)\gamma\bigr)\, h_{1-t}(x).$$

This is defined and continuous at 0 because $h_1$ is constant. Evidently $\|\tilde g(tx)\| = t\|g(x)\| + (1-t)\gamma \le \gamma$. Since the origin is not in the image of $h$ and $t\|g(x)\| + (1-t)\gamma > 0$, it is not in the image of $\tilde g$. For $x \in S^{m-1}$ we have $\tilde g(x) = \|g(x)\| h_0(x) = \|g(x)\| \hat g(x) = g(x)$. □
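The construction in this proof can be carried out by hand in a simple planar case. The sketch below is an illustration under our own assumptions: the map `g` is a hypothetical example with two zeros of opposite index, and the null homotopy of its normalized boundary values is written down explicitly by shrinking the polar angle, which is possible here because the boundary values stay in the right half plane.

```python
import math

GAMMA = 1.25  # max of |g| on the unit circle: |(3/4, x2)| <= sqrt(9/16 + 1) = 5/4

def g(x1, x2):
    """Example map on the disk with zeros at (1/2, 0) and (-1/2, 0).

    The Jacobian determinants there are +1 and -1, so the degree over 0 is zero.
    On the unit circle g(x) = (3/4, x2), which never vanishes.
    """
    return (x1 * x1 + x2 * x2 - 0.25, x2)

def g_tilde(x1, x2):
    """Zero-free replacement for g on the unit disk, equal to g on the unit circle.

    Writing a point as t * u with u on the circle, the value is
    (t * |g(u)| + (1 - t) * GAMMA) * h_{1-t}(u), where h_s(u) has polar
    angle (1 - s) * phi(u) and phi(u) is the polar angle of g(u).
    """
    t = math.hypot(x1, x2)
    if t == 0.0:
        return (GAMMA, 0.0)          # h_1 is the constant map with value (1, 0)
    u1, u2 = x1 / t, x2 / t          # radial projection to the circle
    b1, b2 = g(u1, u2)
    phi = math.atan2(b2, b1)         # lies in (-pi/2, pi/2) because b1 = 3/4 > 0
    radius = t * math.hypot(b1, b2) + (1.0 - t) * GAMMA
    return (radius * math.cos(t * phi), radius * math.sin(t * phi))
```

In this example `g_tilde` agrees with `g` on the boundary, its norm never exceeds `GAMMA`, and its norm is bounded below by 3/4, so the pair of zeros of `g` has been removed without changing the boundary values.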

We first apply this result to vector fields.

Proposition 14.5 If ζ ∈ VM has domain C, E(ζ) is connected, ind(ζ) = 0, W0 ⊂ C is an open neighborhood of E(ζ), and Z ⊂ TM is a neighborhood of ζ(C), then there is a vector field ζ∗ on C with ζ∗(C) ⊂ Z, ζ∗|C\W0 = ζ|C\W0, and E(ζ∗) = ∅.


Proof Since we may replace $W_0$ with a smaller neighborhood of $E(\zeta)$, we may assume that $(p, \alpha\zeta_p) \in Z$ for all $p \in W_0$ and all $\alpha \in [0,1]$. Let $W_1$ and $W_2$ be open subsets of $C$ with $E(\zeta) \subset W_2$, $\overline{W}_2 \subset W_1$, and $\overline{W}_1 \subset W_0$. We may assume that $W_2$ is path connected. (For example it could be a finite union of disks.)

Proposition 11.2 and Corollary 10.1 combine to imply that there is a vector field $\tilde\zeta$ on $C$ that agrees with $\zeta$ on $C \setminus W_1$, is $C^{r-1}$ on $W_2$, and has only regular equilibria, all of which are in $W_2$. The number of equilibria is necessarily finite, and we may assume that, among all the vector fields on $C$ that agree with $\zeta$ on $C \setminus W_1$ and have only isolated equilibria in $W_2$, $\tilde\zeta$ minimizes this number. Aiming at a contradiction, suppose that this number is positive.

Since the index of $\tilde\zeta$ is zero, there must be two equilibria of opposite index. There is$^1$ a $C^r$ embedding $\gamma : (-\varepsilon, 1 + \varepsilon) \to W_2$ with $\gamma(0) = p_0$ and $\gamma(1) = p_1$, where $p_0$ and $p_1$ are equilibria of $\tilde\zeta$ of opposite index that are the only equilibria in the image of $\gamma$. Applying the tubular neighborhood theorem, this path can be used to construct a $C^r$ parameterization $\varphi : V \to U$ where $U \subset M$ is open, $V \subset \mathbb{R}^m$ is a neighborhood of $D^m$, and $\gamma([0,1]) \subset \varphi(D^m \setminus S^{m-1})$. Let $g : V \to \mathbb{R}^m$ be defined by setting $g(x) := D\varphi(x)^{-1}\tilde\zeta_{\varphi(x)}$. Proposition 14.4 gives a continuous function $\tilde g : D^m \to \mathbb{R}^m \setminus \{0\}$ that agrees with $g$ on $S^{m-1}$. We extend $\tilde g$ to all of $V$ by setting $\tilde g(x) := g(x)$ if $x \notin D^m$. Define a new vector field $\hat\zeta$ on $\varphi(V)$ by setting

$$\hat\zeta(p) := D\varphi(\varphi^{-1}(p))\,\tilde g(\varphi^{-1}(p)).$$

Since $\hat\zeta$ has fewer equilibria than $\tilde\zeta$, this contradicts minimality.

Thus $\tilde\zeta$ has no equilibria. Since $(p, 0) \in Z$ for all $p \in W_0$ there is some $\delta > 0$ such that $(p, \delta\tilde\zeta_p) \in Z$ for all $p \in \overline{W}_1$. Let $\alpha : C \to [0,1]$ be a continuous function that is identically zero on $C \setminus W_0$ and is identically one on $\overline{W}_1$. For $p \in C$ let

$$\zeta^*(p) := \bigl(p, (1 - \alpha(p))\zeta_p + \alpha(p)\delta\tilde\zeta_p\bigr).$$

Evidently $\zeta^*$ satisfies all required properties. □

There is an obvious similarity between a neighborhood of the zero section in the tangent space and a neighborhood of the diagonal in $M \times M$. The possibility of passing from one point of view to the other will be exploited on several occasions, so we begin by introducing some machinery that formalizes it. Let $\lambda : M \to \mathbb{R}_{++}$ be a continuous function, let $TM_\lambda = \{ (p, v) \in TM : \|v\| < \lambda(p) \}$, and let $\kappa : TM_\lambda \to M$ be a $C^{r-1}$ function such that:

(a) $\kappa(p, 0) = p$ and $D\kappa(p, \cdot)(0) = \mathrm{Id}_{T_pM}$ for all $p \in M$;

(b) $\tilde\kappa = \pi \times \kappa : TM_\lambda \to M \times M$ is a $C^{r-1}$ embedding.

$^1$ A formal verification of this obvious existence claim is rather tedious. First of all the case $m = 1$ must be handled separately, and we leave the details to the reader. When $m > 1$ we may fix some equilibrium $p_0$ of $\zeta$ and let $S$ be the set of $p \in W_2$ such that there is a $C^r$ embedding $\gamma : (-\varepsilon, 1 + \varepsilon) \to W_2$ with $\gamma(0) = p_0$, $\gamma(1) = p$, and no equilibria other than these points in its image. It is not hard to show that $S$ is both open and closed, hence all of $W_2 \setminus \{p_0\}$.


Proposition 10.13 guarantees the existence of $\lambda$ and $\kappa$ with these properties.

This section's first principal result is:

Theorem 14.12 (O'Neill 1953) If $C \subset M$ is compact, $f : C \to M$ is index admissible, $\mathcal{F}(f)$ is connected, and $\Lambda_M(f) = 0$, then $\mathcal{F}(f)$ is inessential.

Proof For a given open neighborhood $W \subset C \times M$ of the graph of $f$ we need to find a $\tilde f : C \to M$ with $\mathrm{Gr}(\tilde f) \subset W$ that has no fixed points. Let $W_\lambda := \tilde\kappa(TM_\lambda)$, where $\lambda$, $\kappa$, and $\tilde\kappa$ are as above. Let $U$ be a neighborhood of $\mathcal{F}(f)$ such that $\mathrm{Gr}(f|_U) \subset W_\lambda$, and let $U_0$ be a neighborhood of $\mathcal{F}(f)$ with $\overline{U}_0 \subset U$. For $p \in U$ let $\zeta(p) := \tilde\kappa^{-1}(p, f(p))$. Then $\zeta$ is a vector field on $U$ whose set of equilibria is $\mathcal{F}(f)$. The relationship between the fixed point index and the vector field index given by Theorem 15.5 implies that $\mathrm{ind}(\zeta) = 0$. The last result implies that there is a vector field $\zeta^*$ on $U$ whose image is contained in $\tilde\kappa^{-1}(W)$ that has no equilibria and agrees with $\zeta$ on $U \setminus U_0$. An $\tilde f : C \to M$ with the desired properties is the function that agrees with $f$ on $C \setminus U$ and is given by $\tilde f(p) = \rho(\tilde\kappa(\zeta^*(p)))$ when $p \in U$, where $\rho : M \times M \to M$ is the projection on the second component. □

Economic applications call for a version of the result for correspondences. Ideally one would like to encompass contractible valued correspondences in the setting of a manifold, but the methods used here are not suitable. Instead we are restricted to convex valued correspondences, and thus to settings where convexity is defined.

Theorem 14.13 If X ⊂ Rm is convex, C ⊂ X is compact, F : C → X is an index admissible upper hemicontinuous convex valued correspondence, ΛX(F) = 0, and F(F) is connected, then F is inessential.

Caution: The analogous result does not hold for essential sets of Nash equilibria, which are defined by Jiang (1963) in terms of perturbations of the game's payoffs. Hauk and Hurkens (2002) give an example of a game with a component of the set of Nash equilibria that has index zero but is robust with respect to perturbations of payoffs.

There is now a technical preparation for the proof.

Lemma 14.5 If X ⊂ Rm is convex, C ⊂ X is compact, F : C → X is an index admissible upper hemicontinuous convex valued correspondence, and W ⊂ C × X is a neighborhood of Gr(F), there is a β > 0 and a neighborhood W′ ⊂ W of Gr(F) such that for all x ∈ Uβ(F(F)), W′(x) := { w ∈ X : (x, w) ∈ W′ } contains Uβ(x) and has x as a star. (That is, W′(x) contains the line segment between x and any w ∈ W′(x).)

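Proof For each $x \in \mathcal{F}(F)$ choose a convex neighborhood $Z_x \subset X$ of $F(x)$ and a closed neighborhood $D_x \subset C$ of $x$ such that $D_x \times Z_x \subset W$, and choose $\alpha_x > 0$ such that $U_{\alpha_x}(D_x) \subset Z_x$. Choose $x_1, \ldots, x_k$ such that the interiors of $D_{x_1}, \ldots, D_{x_k}$ cover $\mathcal{F}(F)$. Let $\beta > 0$ be small enough that $\beta < \alpha_{x_j}$ for all $j = 1, \ldots, k$ and $U_\beta(\mathcal{F}(F)) \subset D_{x_1} \cup \cdots \cup D_{x_k}$. Let $D := D_{x_1} \cup \ldots \cup D_{x_k}$ and

$$W' := (D_{x_1} \times Z_{x_1}) \cup \ldots \cup (D_{x_k} \times Z_{x_k}) \cup (W \setminus (D \times X)).$$

If $x \in U_\beta(\mathcal{F}(F))$, then $x \in D_{x_j}$ for some $j$, so $U_\beta(x) \subset U_{\alpha_{x_j}}(D_{x_j}) \subset Z_{x_j} \subset W'(x)$. In addition, $x$ is a star of $W'(x)$ because it is a star of each $Z_{x_j}$ such that $x \in D_{x_j}$. □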

Proof of Theorem 14.13 For a given open neighborhood $W \subset C \times X$ of the graph of $F$ we need to produce a continuous $f : C \to X$ with $\mathrm{Gr}(f) \subset W$ and $\mathcal{F}(f) = \emptyset$. We may replace $W$ by a smaller neighborhood, so by Continuity we can require that any continuous $f : C \to X$ with $\mathrm{Gr}(f) \subset W$ has $\Lambda_X(f) = \Lambda_X(F) = 0$, and the lemma above allows us to assume that there is some $\beta > 0$ such that for all $x \in U_\beta(\mathcal{F}(F))$, $W(x) := \{ w \in X : (x, w) \in W \}$ contains $U_\beta(x)$ and has $x$ as a star.

Let $U \subset C \cap U_\beta(\mathcal{F}(F))$ be an open path connected neighborhood of $\mathcal{F}(F)$. Let $W' := W \setminus \{ (x, x) : x \in C \setminus U \}$. Proposition 10.2 gives a $C^\infty$ function $f_0 : C \to X$ with $\mathrm{Gr}(f_0) \subset W'$ that has only regular fixed points. Let $f_0$ be minimal for the number of fixed points among all the continuous functions $f : C \to X$ with $\mathrm{Gr}(f) \subset W'$ that have finitely many regular fixed points. (That is, $f$ is differentiable at each fixed point $x$, and $|\mathrm{Id}_{\mathbb{R}^m} - Df(x)| \ne 0$.) Aiming at a contradiction, we assume that $\mathcal{F}(f_0) \ne \emptyset$. Since $\Lambda_X(f_0) = 0$, $f_0$ must have two fixed points $x_0$ and $x_1$ of opposite index.

As in the proof of Proposition 14.5, there is a $C^\infty$ embedding $\gamma : (-\varepsilon, 1 + \varepsilon) \to U$ with $\gamma(0) = x_0$ and $\gamma(1) = x_1$ whose image contains no other fixed points of $f_0$. Applying the tubular neighborhood theorem, this path can be used to construct a neighborhood $V$ of $\gamma([0,1])$ with $V \subset U$ that contains no other fixed points of $f_0$, and a $C^\infty$ coordinate chart $\varphi : V \to \mathbb{R}^m$ with $D^m \subset \varphi(V)$ and $\varphi(\gamma([0,1]))$ contained in the interior of $D^m$.

We now modify $f_0$, creating a function that is close to the identity on $V$. Let $\delta = \max_{x \in V} \|f_0(x) - x\|$. If $\delta \le \beta$ let $f_1 := f_0$. Otherwise let $\alpha : C \to [\beta/\delta, 1]$ be a continuous function with $\alpha(x) = \beta/\delta$ for all $x \in V$ and $\alpha(x) = 1$ for all $x \in C \setminus U$, and define $f_1 : C \to X$ by setting $f_1(x) := x + \alpha(x)(f_0(x) - x)$. For each $x \in U$, $x$ is a star of $W(x)$, so $\mathrm{Gr}(f_1) \subset W$. In addition, the fixed points of $f_1$ in $U$ are $x_0$ and $x_1$, and these are regular for $f_1$ and have the same indices for $f_0$ and $f_1$.

Let $\zeta : U \to \mathbb{R}^m$ be the function $\zeta(x) := f_1(x) - x$ and let $g := \zeta \circ \varphi^{-1} : \varphi(V) \to \mathbb{R}^m$. The fixed point index of $x_0$ is its vector field index as an equilibrium of the vector field $x \mapsto (x, \zeta(x))$, and similarly for $x_1$. Therefore $\deg_0(g) = 0$. Proposition 14.4 gives a continuous function $\tilde g : \varphi(V) \to \mathbb{R}^m \setminus \{0\}$ that agrees with $g$ on $\varphi(V) \setminus D^m$ and has $\|\tilde g(y)\| \le \beta$ for all $y \in D^m$. We let $\tilde\zeta := \tilde g \circ \varphi : V \to \mathbb{R}^m$ and define $\tilde f : C \to X$ by setting $\tilde f(x) := x + \tilde\zeta(x)$ if $x \in V$ and $\tilde f(x) := f_1(x)$ otherwise. Of course $\tilde f$ is continuous, and $\|\tilde\zeta(x)\| \le \beta$ for all $x \in V$, so $\mathrm{Gr}(\tilde f) \subset W$. The fixed points of $\tilde f$ are the fixed points of $f_0$ other than $x_0$ and $x_1$, and $\tilde f$ agrees with $f_0$ in a neighborhood of each of them. Since $\tilde f$ has fewer fixed points than $f_0$ we have contradicted the minimality of $f_0$. □


Exercises

14.1 Prove that there is no injective continuous function f : S^n → R^n, where S^n := { x ∈ R^{n+1} : ‖x‖ = 1 }.

14.2 Suppose that X is a topological space and A ⊂ X .

(a) Prove that if (X, A) has the homotopy extension property, then A is a closed neighborhood retract in X.

(b) Use Proposition 8.13 to show that if X is an ANR and A is a closed subset of X, then (X, A) has the homotopy extension property.

14.3 (This problem presumes elementary concepts of group theory.) Let I^n := [0, 1]^n, and let ∂I^n be its boundary. We think of I^n with ∂I^n collapsed to a point as a representation of the n-sphere S^n := { x ∈ R^{n+1} : ‖x‖ = 1 }. For pairs (X, A) and (Y, B), where X and Y are topological spaces and A ⊂ X and B ⊂ Y, a continuous function from (X, A) to (Y, B) is a continuous f : X → Y such that f(A) ⊂ B, and a homotopy of such functions is a continuous h : X × [0, 1] → Y such that h(A × [0, 1]) ⊂ B. For a space X with a basepoint x0 we write (X, x0) rather than (X, {x0}). Let πn(X, x0) be the set of homotopy classes [f] of maps f : (I^n, ∂I^n) → (X, x0). For [f], [g] ∈ πn(X, x0) let [f] ∗ [g] be the homotopy class of the map

\[
(t_1, \ldots, t_n) \mapsto
\begin{cases}
f(2t_1, t_2, \ldots, t_n), & t_1 \le 1/2, \\
g(2t_1 - 1, t_2, \ldots, t_n), & 1/2 \le t_1 .
\end{cases}
\]

(a) Prove that [ f ] ∗ [g] is well defined in the sense of not depending on the choiceof representatives f and g.

(b) Prove that ∗ is a group operation:

(i) ∗ is associative.
(ii) If e : I^n → X is the constant map with value x0, then [e] ∗ [f] = [f] = [f] ∗ [e].
(iii) If f^{−1} is the map (t1, ..., tn) ↦ f(1 − t1, t2, ..., tn) then [f] ∗ [f^{−1}] = [e] = [f^{−1}] ∗ [f].

(c) Explain why πn(X, x0) is abelian if n ≥ 2. (Instead of trying to define a particular homotopy, draw some pictures.)

The groups πn(X, x0) are called the homotopy groups of (X, x0), and π1(X, x0) is the fundamental group of (X, x0). Let m : (X, x0) → (Y, y0) be continuous, and let πn(m) : πn(X, x0) → πn(Y, y0) be the function [f] ↦ [m ∘ f]. (This is evidently well defined, in the sense of independence of the choice of representative.) Often we write m∗ rather than πn(m).

(d) Prove that πn(m) is a homomorphism.

(e) Prove that πn is a covariant functor from the category of pointed spaces and continuous maps to the category of groups and homomorphisms.


(f) If (M, p0) is an n-dimensional oriented manifold with basepoint p0, interpret the degree of maps S^n → M as a homomorphism from πn(M, p0) to Z. What does Hopf’s theorem say about this homomorphism when M = S^n?

We identify I^{n−1} with { t ∈ I^n : tn = 0 }, and we let J^{n−1} be the closure of ∂I^n \ I^{n−1}. We think of (I^n, ∂I^n) with J^{n−1} collapsed to a point as a representation of the pair (D^n, S^{n−1}) where D^n is the n-disk D^n := { x ∈ R^n : ‖x‖ ≤ 1 }. A topological triple is a triple (X, A, B) where X is a topological space and B ⊂ A ⊂ X. We define continuous maps between triples and homotopies of such maps as above. If X is a topological space and x0 ∈ A ⊂ X, for n ≥ 2 let πn(X, A, x0) be the set of homotopy classes of maps f : (I^n, ∂I^n, J^{n−1}) → (X, A, x0).

(g) Discuss how to modify the arguments above to show that the binary operation ∗ (defined just as above) makes πn(X, A, x0) into a group, and that a continuous map m : (X, A, x0) → (Y, B, y0) induces a homomorphism m∗ : πn(X, A, x0) → πn(Y, B, y0).

(h) In the sequence

\[
\cdots \to \pi_2(A, x_0) \to \pi_2(X, x_0) \to \pi_2(X, A, x_0) \to \pi_1(A, x_0) \to \pi_1(X, x_0)
\]

the homomorphisms i∗ : πn(A, x0) → πn(X, x0) and j∗ : πn(X, x0) → πn(X, A, x0) are induced by the inclusions i : (A, x0) → (X, x0) and j : (X, x0, x0) → (X, A, x0), and the homomorphism ∂ : πn(X, A, x0) → πn−1(A, x0) is [f] ↦ [f|_{I^{n−1}}]. (You should convince yourself that ∂ is a homomorphism.) Prove that the sequence is exact: the image of each homomorphism (except the last) is the kernel of its successor.

(i) Prove that for all d ≥ 1 and n ≥ 2, ∂ : πn(D^d, S^{d−1}, p0) → πn−1(S^{d−1}, p0) is an isomorphism. (Of course p0 is an arbitrary point in S^{d−1}.)

14.4 Recall that for n ≥ 0, n-dimensional (real) projective space P^n is the set of 1-dimensional subspaces of R^{n+1}, topologized in the “obvious” way. Alternatively, we may regard P^n as the space obtained from S^n by identifying antipodal points, so there is a canonical map bn : S^n → P^n taking x to {x, −x}.

(a) Show that P^n is orientable if and only if n is odd.

(b) Show that Id_{P^n} is not homotopic to a constant map.

(c) Prove that if γ : [0, 1] → P^n is continuous and bn(x0) = γ(0), then there is a unique continuous γ̃ : [0, 1] → S^n such that γ̃(0) = x0 and bn ∘ γ̃ = γ.

If γ̃ : [0, 1] → S^n is continuous, γ = bn ∘ γ̃, γ(0) = γ(1), and γ̃(1) = an(γ̃(0)), then we say that γ is an antipodal loop.

(d) For a continuous f : P^n → P^n, prove that there is a continuous f̃ : S^n → S^n such that f ∘ bn = bn ∘ f̃ if and only if f ∘ γ is an antipodal loop whenever γ : [0, 1] → P^n is an antipodal loop.

(e) For an arbitrary p0 ∈ Pn , prove that π1(Pn, p0) is the group of integers mod 2.

Chapter 15 Dynamical Systems

Unlike physics and chemistry, economics does not have definite dynamic laws of motion that govern the process that leads to equilibrium. Nevertheless there are equilibria, for example the mixed equilibrium of the battle of the sexes, that are not observed persistently, apparently because they are unstable. Thus stability is an important issue, but instead of focusing on particular processes, we are most interested in the relationship between dynamic stability and coarse or qualitative features of the dynamical system. This chapter’s central result asserts that a component of the set of equilibria of a dynamical system is unstable if its Euler characteristic does not agree with the vector field index of the negation of the vector field defining the system. Since the vector field index is a homotopy invariant, this result implies instability for a wide range of vector fields, which in applications often includes all systems that are economically natural.

Section 15.1 reviews the basic results concerning existence and uniqueness of solutions of ordinary differential equations. These results are transferred to dynamical systems defined by vector fields on smooth manifolds in Sect. 15.2.

The space of mixed strategy profiles of a strategic form game is not a manifold, but dynamical systems in this set are studied extensively in evolutionary game theory. Therefore, at least ideally, we should work in a framework that is general enough to encompass such examples, and also respects the principle that the theory of dynamical systems is not dependent on the geometry of Euclidean space. Section 15.3 introduces the notion of a diffeoconvex body, which is a subset Σ of a smooth m-dimensional manifold M, each of whose points has a coordinate chart that maps the portion of Σ in its domain to an open subset of a closed convex subset of R^m with nonempty interior. The set of points in M that have a unique nearest point in Σ is a neighborhood of Σ, and the map taking each such point p to the nearest point rΣ(p) is a key technical tool. In particular, it is important that rΣ is Lipschitz on smaller neighborhoods of Σ.

Section 15.4 extends the basic existence-uniqueness results for dynamical systems to not outward pointing vector fields on Σ. Roughly, the method is to extend the vector field by defining the vector at a point p to be the projection onto TpM of the vector at rΣ(p) plus rΣ(p) − p. We show that the flow of the extended vector field moves toward Σ at points near Σ, so that trajectories beginning at points in Σ can never leave. Consequently the basic results concerning existence and uniqueness transfer to this setting.

In addition to the degree and the fixed point index, there is a third expression of the underlying mathematical principle for vector fields. In Sect. 15.5 we present an axiomatic description of the vector field index, paralleling our axiom systems for the degree and fixed point index, and establish existence and uniqueness. Insofar as a vector field on a manifold can be identified with a function from the manifold to itself that is near the identity, this index is naturally related to the fixed point index of this derived map, and also to the flow of the vector field for small times. All these results extend to not outward pointing vector fields on diffeoconvex bodies.

In the remainder of the chapter we develop the relationship between the fixed point index and the stability of equilibria, and sets of equilibria, of such a dynamical system. The notion of stability discussed in Sect. 15.6, namely asymptotic stability, has a rather complicated definition, but the intuition is simple: a compact set A is asymptotically stable if the trajectory of each point in some neighborhood of A is eventually drawn into, and remains inside, arbitrarily small neighborhoods of A. A well known and very intuitive sufficient condition for asymptotic stability is the existence of a Lyapunov function, which may be thought of as a rough measure of the distance from A along the path of the system.

A less well known and much deeper result is the converse Lyapunov theorem, which asserts that existence of a Lyapunov function is also a necessary condition for asymptotic stability. For a dynamical system on a manifold without boundary this result achieved its final form in a paper by Wilson (1969). Section 15.7 extends the converse Lyapunov theorem to a dynamical system on a diffeoconvex body by showing that if a compact A ⊂ Σ is asymptotically stable for a not outward pointing vector field on Σ, then it is also asymptotically stable for the extension of the vector field described above. The converse Lyapunov theorem gives a Lyapunov function for the extended vector field, and its restriction to Σ is a Lyapunov function for the given vector field.

Once all this background material is in place, it will not take long to prove the chapter’s culminating result, which asserts that if A is asymptotically stable for the dynamical system defined by a vector field ζ, and A is an ANR, then the vector field index of −ζ is the Euler characteristic of A. The method of proof (from Demichelis and Ritzberger 2003) is to construct a homotopy between the identity function on some neighborhood of A and a retraction of that neighborhood onto A. Until close to the end, the homotopy follows the forward flow, and a key difficulty is to find a neighborhood that is mapped into itself by Φ(·, t) for small positive t. The converse Lyapunov theorem provides a Lyapunov function, and for sufficiently small ε > 0 the preimage of [0, ε] is a suitable neighborhood.

Paul Samuelson (1941, 1942, 1947) advocated a “correspondence principle” in two papers and his famous book Foundations of Economic Analysis. The idea is that the stability of an economic equilibrium, with respect to natural dynamics of adjustment to equilibrium, implies certain qualitative properties of the equilibrium’s comparative statics. There are 1-dimensional settings in which this idea is regarded as natural and compelling, but Samuelson’s writings do not formulate it as a general theorem, and its nature and status in higher dimensions has not been well understood; Echenique (2008) provides a concise summary of the state of knowledge and related literature. Because the relationship between the stability of a set of equilibria of a vector field and its Euler characteristic depends only on economically natural qualitative properties of the vector field, the consequence for economics (e.g., isolated equilibria of index −1 should not be observed) is a natural and compelling extension of the correspondence principle to multiple dimensions.

Finally we settle an issue related to essential sets of fixed points. As has been mentioned previously, Additivity implies that if a set of fixed points has nonzero index, then it is essential. Section 14.5 presents two contexts in which one can prove the converse, that if a connected set of fixed points has index zero, then it is inessential.

15.1 Euclidean Dynamical Systems

In this section we review the required elements of the theory of ordinary differential equations in Euclidean space. Let U ⊂ R^m be open. A vector field on U is a function z : U → U × R^m whose first component is IdU. We write z(x) = (x, zx). (We really only need the function x ↦ zx, but this setup is consistent with what comes later.) A finite trajectory of z is a C1 function γ : [a, b] → U such that

\[
\gamma'(s) = z_{\gamma(s)} \tag{15.1}
\]

for all s. A C1 function γ : I → U, where I ⊂ R is a (closed, open, half open, bounded, or unbounded) interval, is a trajectory of z if, for each compact subinterval [a, b] ⊂ I, γ|[a,b] is a finite trajectory.

Without assumptions beyond continuity the dynamics associated with z need not be uniquely determined: there can be more than one trajectory satisfying an initial condition that specifies the position of the trajectory at a particular moment. For example, suppose that m = 1, U = R, and

\[
z_t =
\begin{cases}
0, & t \le 0, \\
2\sqrt{t}, & t > 0.
\end{cases}
\]

Then for any s0 there is a trajectory γ_{s0} : R → R given by

\[
\gamma_{s_0}(s) :=
\begin{cases}
0, & s \le s_0, \\
(s - s_0)^2, & s > s_0.
\end{cases}
\]
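The non-uniqueness can be checked numerically. The following is a minimal sketch, not part of the original argument: the helper names `z`, `gamma`, `gamma_prime` and `max_residual` are introduced here for illustration only. It evaluates the residual of Eq. (15.1) along the curves γ_{s0} for several delays s0 ≥ 0, all of which pass through 0 at time 0.

```python
import math

def z(x):
    """Vector field of the example: 0 for x <= 0, 2*sqrt(x) for x > 0."""
    return 0.0 if x <= 0 else 2.0 * math.sqrt(x)

def gamma(s, s0):
    """Candidate trajectory: rests at 0 until time s0, then follows (s - s0)^2."""
    return 0.0 if s <= s0 else (s - s0) ** 2

def gamma_prime(s, s0):
    """Exact derivative of gamma; the curve is C^1 with derivative 0 at s = s0."""
    return 0.0 if s <= s0 else 2.0 * (s - s0)

def max_residual(s0, times):
    """Largest violation of gamma'(s) = z(gamma(s)) over the sample times."""
    return max(abs(gamma_prime(s, s0) - z(gamma(s, s0))) for s in times)

times = [k / 10.0 for k in range(-20, 41)]
delays = (0.0, 0.5, 1.0, 2.0)
residuals = {s0: max_residual(s0, times) for s0 in delays}
```

Every delay gives a residual at the level of rounding error, and all of these curves satisfy the same initial condition γ(0) = 0 while disagreeing at later times. The failure of uniqueness is tied to the fact that x ↦ 2√x is not Lipschitz near 0.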

For most purposes this sort of indeterminacy is unsatisfactory, so we need to impose a condition that implies that for any initial condition there is a unique trajectory. Let (X, d) and (Y, e) be metric spaces, and let f : X → Y be a function. For L > 0, f is L-Lipschitz if

e(f(x), f(x′)) ≤ L d(x, x′)

for all x, x′ ∈ X, and f is Lipschitz if it is L-Lipschitz for some L. We say that f is locally Lipschitz if each x ∈ X has a neighborhood U such that f|U is Lipschitz. Note that the vector field z is (locally) Lipschitz (as a function from U to U × R^m) if and only if x ↦ zx is (locally) Lipschitz.

The basic existence-uniqueness result for ordinary differential equations is:

Theorem 15.1 (Picard–Lindelöf Theorem) Suppose that U ⊂ R^m is open and z is a locally Lipschitz vector field on U. For any compact C ⊂ U there is an ε > 0 such that for each x ∈ C there is a unique trajectory F(x, ·) : (−ε, ε) → U of z such that F(x, 0) = x. In addition F : C × (−ε, ε) → U is continuous, and if z is Cs (1 ≤ s ≤ ∞) then so is F.

Because the result is so fundamental, a detailed proof would be out of place here, but we will briefly describe the central ideas of two methods. First, for any Δ > 0 one can define a piecewise linear approximate solution going forward in time by setting FΔ(x, 0) := x and inductively applying the equation

\[
F_\Delta(x, t) = F_\Delta(x, k\Delta) + (t - k\Delta)\, z_{F_\Delta(x, k\Delta)} \quad \text{for } k\Delta < t \le (k+1)\Delta .
\]

Concrete calculations show that this collection of functions has a limit as Δ → 0, that this limit is continuous and satisfies the differential equation (15.1), and also that any solution of (15.1) is a limit of this collection. These calculations give precise information concerning the accuracy of the numerical scheme for computing approximate solutions described by this approach.
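The piecewise linear scheme is short enough to write down directly. The sketch below is illustrative only: the function name `euler_polygon` and the test field z_x = −x (whose exact solution from x = 1 is e^{−t}) are assumptions made here, not taken from the text.

```python
import math

def euler_polygon(z, x, t, delta):
    """Evaluate F_delta(x, t) for t >= 0 by following z along segments of duration delta."""
    point, elapsed = x, 0.0
    # Take full steps while more than one step length of time remains.
    while t - elapsed > delta:
        point += delta * z(point)
        elapsed += delta
    # Final partial segment: F(x, k*delta) + (t - k*delta) * z at F(x, k*delta).
    return point + (t - elapsed) * z(point)

def field(x):
    """Illustrative Lipschitz field z_x = -x."""
    return -x

exact = math.exp(-1.0)
errors = [abs(euler_polygon(field, 1.0, 1.0, 2.0 ** -k) - exact) for k in (4, 6, 8)]
```

For this field the error at t = 1 shrinks as Δ is reduced from 2^{−4} to 2^{−8}, consistent with convergence of the collection FΔ as Δ → 0. Powers of two are used for Δ only so that the step arithmetic is exact in floating point.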

The second proof scheme uses a fixed point theorem. It considers the mapping F ↦ F̃ given by the equation

\[
\tilde F(x, t) := x + \int_0^t z_{F(x,s)}\, ds .
\]

This defines a function from C(C × [−ε, ε], U) to C(C × [−ε, ε], R^m). As usual, the range is endowed with the supremum norm. A calculation shows that if ε is sufficiently small, then the restriction of this function to a certain neighborhood of the function (x, t) ↦ x is actually a contraction. Since C(C × [−ε, ε], R^m) is a complete metric space, the contraction mapping theorem gives a unique fixed point. Additional details can be found in Chap. 5 of Spivak (1979) and Chap. 8 of Hirsch and Smale (1974).
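The contraction argument can be watched in action on a time grid. The following sketch makes several simplifying assumptions that are not in the text: a single initial point x0 = 1, the scalar field z_x = x (Lipschitz constant 1), forward time only, ε = 0.5, and the trapezoid rule in place of the exact integral. The names `picard_step` and `sup_distance` are hypothetical.

```python
import math

def picard_step(z, x0, values, h):
    """One application of F -> F~ on a grid: F~(t_j) = x0 + integral_0^{t_j} z(F(s)) ds."""
    out = [x0]
    integral = 0.0
    for j in range(1, len(values)):
        # Trapezoid rule on the subinterval [t_{j-1}, t_j].
        integral += 0.5 * h * (z(values[j - 1]) + z(values[j]))
        out.append(x0 + integral)
    return out

def sup_distance(a, b):
    """Supremum norm distance between two grid functions."""
    return max(abs(u - v) for u, v in zip(a, b))

def field(x):
    """Illustrative field z_x = x, whose trajectory from 1 is exp(t)."""
    return x

x0, eps, n = 1.0, 0.5, 200
h = eps / n
current = [x0] * (n + 1)   # start from the constant function t -> x0
gaps = []
for _ in range(8):
    nxt = picard_step(field, x0, current, h)
    gaps.append(sup_distance(nxt, current))
    current = nxt
```

Successive iterates approach each other at least geometrically with ratio ε · L = 0.5, as the contraction estimate predicts, and the final value at t = ε agrees with e^{0.5} up to the discretization error of the trapezoid rule.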


15.2 Dynamics on a Manifold

A manifold is the natural setting for the study of dynamical systems. Throughout the remainder of this chapter we will work with a fixed order of differentiability r, where 3 ≤ r ≤ ∞, and a given m-dimensional Cr manifold M ⊂ R^k. Let TM be the tangent space of M, and let π : TM → M be the projection (p, v) ↦ p. Recall that a vector field on S ⊂ M is a continuous ζ : S → TM such that π ∘ ζ = IdS. We write ζ(p) = (p, ζp) where ζp ∈ TpM.

A first challenge is to define what it means for a vector field on a subset of M to be locally Lipschitz. Now the function p ↦ ζp maps a subset of R^k to R^k, and it turns out that it would not be wrong to say that ζ is locally Lipschitz if this function is, but in principle we are primarily interested in the “pictures” of the manifold given by coordinate charts. If N ⊂ R^l is a second m-dimensional Cr manifold, U ⊂ M and V ⊂ N are open, h : U → V is a Cr diffeomorphism, SU := S ∩ U, and SV := h(SU), then we define h∗(ζ) to be the vector field on SV given by

h∗(ζ)(q) = (q, Dh(h^{−1}(q))ζ_{h^{−1}(q)}) .

Recall that if X and Y are normed spaces, the operator norm of a continuous linear transformation ℓ : X → Y is ‖ℓ‖ := sup_{‖x‖=1} ‖ℓ(x)‖.

Lemma 15.1 Let zU : SU → R^k be the function p ↦ ζp, and let zV : SV → R^l be the function q ↦ Dh(h^{−1}(q))ζ_{h^{−1}(q)}. Then zU is locally Lipschitz if and only if zV is locally Lipschitz.

Proof Fixing q0 ∈ SV let p0 := h^{−1}(q0). There is a neighborhood W ⊂ SU of p0 in which ‖Dh(p)‖ and ‖zU(p)‖ are bounded, and ‖Dh(p) − Dh(p′)‖ is bounded by some multiple of ‖p − p′‖ for all p, p′ ∈ W. (This can be proved by applying elementary facts concerning derivatives to the matrix of partial derivatives of h.) Since h^{−1} is C1, q0 has a neighborhood W′ ⊂ h(W) such that ‖h^{−1}(q) − h^{−1}(q′)‖ is bounded by a multiple of ‖q − q′‖ for all q, q′ ∈ W′. For such q and q′, if we let p := h^{−1}(q) and p′ := h^{−1}(q′), then

\[
\begin{aligned}
\|z_V(q) - z_V(q')\| &= \|Dh(p)(z_U(p) - z_U(p')) + (Dh(p) - Dh(p'))z_U(p')\| \\
&\le \|Dh(p)\| \times \|z_U(p) - z_U(p')\| + \|Dh(p) - Dh(p')\| \times \|z_U(p')\| .
\end{aligned}
\]

The facts laid out above combine to imply that zV is locally Lipschitz if zU is locally Lipschitz. Since

zU(p) = Dh(p)^{−1} zV(h(p)) = Dh^{−1}(h(p)) zV(h(p))

the other implication follows by symmetry. □


We are now justified in making the following definition: ζ is locally Lipschitz if, for any Cr coordinate chart ϕ : U → R^m on an open U ⊂ M, ϕ∗(ζ) is locally Lipschitz.

We now establish the basic results concerning dynamical systems on manifolds. We first consider the case S = M, so that ζ is a vector field on all of M. A C1 function γ : [a, b] → M (where a < b) is a finite trajectory of ζ if γ′(s) = ζ_{γ(s)} for all s. If I ⊂ R is a (closed, open, half open, bounded, or unbounded) interval, a C1 function γ : I → M is a trajectory of ζ if its restriction to each compact subinterval of I is a finite trajectory.

As always, we should make sure that nothing depends on the frame of reference. Let N be a second Cr manifold, and let h : M → N be a Cr diffeomorphism. If γ : (a, b) → M is C1, then the chain rule gives

(h ∘ γ)′(s) = Dh(γ(s))γ′(s) ,

so γ′(s) = ζ_{γ(s)} if and only if (h ∘ γ)′(s) = h∗(ζ)_{h(γ(s))}. Therefore:

Lemma 15.2 A C1 curve γ : (a, b) → M is a trajectory of ζ if and only if h ◦ γ isa trajectory of h∗(ζ ).

We can now bring the Picard–Lindelöf theorem to M .

Proposition 15.1 Suppose that ζ is locally Lipschitz and C ⊂ M is compact. Thenthere is an ε > 0 such that for each p ∈ C there is a unique trajectory Φ(p, ·) :(−ε, ε) → M for ζ such that Φ(p, 0) = p. Furthermore, Φ : C × (−ε, ε) → M iscontinuous, and if ζ is Cs (1 ≤ s ≤ r) then so is Φ.

Proof First suppose that C is contained in the domain of a Cr coordinate chart ϕ : U → V ⊂ R^m. Let F : ϕ(C) × (−ε, ε) → V be the function given by Theorem 15.1 for the vector field ϕ∗(ζ). Then the function Φ : C × (−ε, ε) → M given by

Φ(p, t) := ϕ^{−1}(F(ϕ(p), t))

inherits the continuity and smoothness properties of F, and for each p ∈ C, Φ(p, 0) = p and Φ(p, ·) is a trajectory for ζ. The composition of ϕ with a trajectory for ζ is a trajectory for ϕ∗(ζ), and there is a unique such trajectory, so Φ(p, ·) must be unique.

For the general case we can cover C with the interiors of a finite collection K1, ..., Kr of compact subsets, each of which is contained in the image of some Cr parameterization ϕi : Ui → M. For each i let Φi : Ki × (−εi, εi) → M be as above. Because the trajectories are unique these combine to give a unique satisfactory function Φ : C × (−ε, ε) → M where ε := min_i εi. □

The flow domain of ζ is the set W of pairs (p, t) ∈ M × R such that if t ≤ 0(t ≥ 0) there is a trajectory γ : [t, 0] → M (γ : [0, t] → M) of ζ with γ (0) = p.

Theorem 15.2 The flow domain of ζ is an open subset of M × R that contains M × {0}. There is a unique function Φ : W → M such that for each p ∈ M, Φ(p, ·) is a trajectory of ζ with Φ(p, 0) = p. If (p, s) ∈ W and (Φ(p, s), t) ∈ W, then (p, s + t) ∈ W and

Φ(p, s + t) = Φ(Φ(p, s), t) .

In addition Φ is continuous, and if ζ is Cs (1 ≤ s ≤ r) then so is Φ. If S ⊂ M and S × {t} ⊂ W, then Φ(·, t)|S is an embedding.

Proof Consider (p, t) ∈ W with t ≥ 0. (The argument when t ≤ 0 is similar.) First suppose there are two distinct trajectories γ1, γ2 : [0, t] → M with γ1(0) = γ2(0) = p. Let t0 := inf{ s ∈ [0, t] : γ1(s) ≠ γ2(s) }. By applying the last result to a compact C ⊂ M containing γ1(t0) = γ2(t0), we find that γ1 and γ2 must agree on some interval (t0 − ε, t0 + ε), which is a contradiction. Thus Φ is unique. It is easy to construct trajectories that show that (p, s + t) ∈ W and Φ(p, s + t) = Φ(Φ(p, s), t) when (p, s) ∈ W and (Φ(p, s), t) ∈ W.

Let γ : [0, t] → M be the unique trajectory with γ(0) = p. We can cover {p} × [0, t] with finitely many sets C1 × I1, ..., Ck × Ik where, for each i, Ii := (ti − εi, ti + εi), Ci is a compact neighborhood of γ(ti), and the conclusions of the last result are satisfied. It is easy to see that if (γ(t), t) ∈ Ck × Ik, then (⋂_i Ci) × Ik is a neighborhood of (p, t) that is contained in W. Thus W is open. The continuity and smoothness properties of Φ on (⋂_i Ci) × Ik are inherited from the corresponding properties given by the result above. If S × {t} ⊂ W, then Φ(·, t)|S and its inverse Φ(·, −t)|Φ(S,t) are continuous. □
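A one-dimensional example makes the flow domain and the composition law concrete. The sketch below uses an illustrative field chosen here, the logistic field ζx = x(1 − x) on M = R, whose flow has the closed form Φ(x, t) = x e^t / (1 + x(e^t − 1)). The function names are hypothetical.

```python
import math

def denom(x, t):
    """Denominator 1 + x*(exp(t) - 1) of the closed-form logistic flow."""
    return 1.0 + x * math.expm1(t)

def in_flow_domain(x, t):
    """(x, t) lies in the flow domain iff the solution has not blown up between 0 and t.

    The denominator is monotone in t and equals 1 at t = 0, so it suffices
    to check that it is still positive at time t.
    """
    return denom(x, t) > 0.0

def flow(x, t):
    """Closed-form flow of the field x -> x*(1 - x)."""
    if not in_flow_domain(x, t):
        raise ValueError("(x, t) is outside the flow domain")
    return x * math.exp(t) / denom(x, t)

# Composition law: Phi(Phi(x, s), t) = Phi(x, s + t).
composed = flow(flow(0.3, 0.7), -1.2)
direct = flow(0.3, -0.5)
```

Here the flow domain is a proper open subset of M × R: the trajectory through x = 2 blows up in backward time at t = −ln 2, so (2, −0.5) is in the flow domain and (2, −1) is not. The points 0 and 1 are equilibria and are fixed by every Φt.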

As with homotopies, it is conventional to write the time argument of Φ as a subscript, and we shall usually do so. We will frequently be interested in the images of sets, so, for example, Φ_{R+}(A) is { Φ(a, t) : a ∈ A and t ≥ 0 }. There are additional abbreviations, such as Φt(A) in place of Φ{t}(A), that should cause no confusion.

The vector field ζ is said to be complete if W = M × R. When this is the case each Φt : M → M is a homeomorphism (or Cs diffeomorphism if ζ is Cs) with inverse Φ−t, and t ↦ Φt is a homomorphism from R (thought of as a group) to the space of homeomorphisms (or Cs diffeomorphisms) between M and itself.

An equilibrium of ζ is a point p ∈ M such that ζp = 0. If p is an equilibrium, then Φt(p) = p for all t such that (p, t) ∈ W. It is possible that Φt has fixed points that are not equilibria; a trajectory of ζ is a cycle if it is not constant but it does take on the same value at two different times. We will be interested in the fixed point index of Φt|C for small positive values of t, so we might wonder whether Φt can have fixed points that are not equilibria when t is arbitrarily small. If ζ is Lipschitz, this is not possible. Actually, nothing below will depend on this, so we only sketch the main ideas.

Let γ : [a, b] → R^k be a C2 curve. We assume that there is a constant L > 0 such that ‖γ′′(t)‖ ≤ L‖γ′(t)‖ for all t. (This is the case if γ is a trajectory of an L-Lipschitz vector field.) We say that γ is parameterized by arc length if ‖γ′(t)‖ = 1 for all t. If this is the case, the curvature of γ at time t is ‖γ′′(t)‖. Fenchel’s theorem (e.g., do Carmo 1976 or Tapp 2016) states that if γ is nonconstant and closed, which is to say that γ(a) = γ(b), then the total curvature ∫_a^b ‖γ′′(t)‖ dt is at least 2π. (In addition, the minimum is attained if and only if the image of γ lies in a 2-dimensional plane, and circumscribes a convex subset of this plane.) Thus 2π ≤ L(b − a) when γ is closed. More generally, if ‖γ′(t)‖ is a nonzero constant, the curvature at time t is ‖γ′′(t)‖/‖γ′(t)‖, and the total curvature of the portion of the curve traversed during a small interval [t, t + Δt] is bounded by LΔt because it is roughly this curvature times the length ‖γ′(t)‖Δt of the portion of the curve traversed during this interval. When γ may not have constant speed, the curvature at time t is the norm of the projection of γ′′(t) onto the linear subspace orthogonal to γ′(t), divided by ‖γ′(t)‖, so the curvature is bounded above by ‖γ′′(t)‖/‖γ′(t)‖, and it is still the case that the amount of time required to traverse a closed curve is bounded below by 2π/L.
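The lower bound 2π/L on the period is attained by a rotation. As an illustration (the field and all names below are chosen here, not taken from the text), the linear field ζ(x1, x2) = L · (−x2, x1) on R^2 is L-Lipschitz, and its nonconstant trajectories are circles traversed in time exactly 2π/L.

```python
import math

def flow(p, t, L):
    """Closed-form flow of the field (x1, x2) -> L * (-x2, x1): rotation by angle L*t."""
    c, s = math.cos(L * t), math.sin(L * t)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])

def dist(p, q):
    """Euclidean distance in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

L = 3.0
p = (1.0, 0.0)
period = 2.0 * math.pi / L

# Distance from the start after exactly one period.
return_gap = dist(flow(p, period, L), p)

# Smallest distance back to p on a grid of times between 5% and 95% of the period.
closest_before_period = min(
    dist(flow(p, k * period / 100.0, L), p) for k in range(5, 96)
)
```

The trajectory returns to p at time 2π/L and stays well away from p at the sampled earlier times, so for this field Φt has no fixed points other than the equilibrium at the origin when 0 < t < 2π/L.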

15.3 Diffeoconvex Bodies

In evolutionary game theory one studies dynamical systems defined by vector fields on a set such as the cartesian product of the simplices of mixed strategies for the agents in a strategic form game. In order for this to be sensible, the vector field should not point out of the set at any point on the boundary. The natural way to bring the Picard–Lindelöf theorem to bear is to extend the vector field to an open neighborhood of the set, then show that trajectories for the extended vector field that begin in the set stay in the set, and do not depend on the choice of extension. Later we will define a vector field index for vector fields defined on subsets of the set, which may have zeros on the boundary of the set. Again we define the index to be the index of a suitable extension, but in this case the extension should not introduce new zeros of the vector field.

Thus there arises the question of how to bring these methods to general manifolds in sufficient generality to cover the example above. In this section we study the relevant class of subsets of M, and develop their key properties.

A closed set Σ ⊂ M is a diffeoconvex body if each p ∈ Σ is an element of the domain U of a Cr coordinate chart ϕ : U → R^m such that ϕ(Σ ∩ U) = P ∩ ϕ(U) for some closed convex P ⊂ R^m with nonempty interior. Fix such a Σ.

If P ⊂ R^m is a convex set with nonempty interior and x ∈ R^m, let

CP(x) := { v ∈ R^m : 〈v, y − x〉 ≥ 0 for all y ∈ P } .

This is a closed convex cone. Recall that a cone is pointed if it does not contain a line. If CP(x) contained a line, then P would be contained in the hyperplane orthogonal to this line that contains x, contradicting the assumption that P has an interior point. Therefore CP(x) is pointed.
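For a polytope the cone CP(x) can be tested by hand. The sketch below takes P to be the unit square in R^2, an example chosen here for illustration, and uses the fact that a linear function on a polytope attains its minimum at a vertex, so the condition 〈v, y − x〉 ≥ 0 only needs to be checked at the four vertices. The names `VERTICES` and `in_cone` are hypothetical.

```python
# Vertices of the illustrative polytope P = [0, 1]^2.
VERTICES = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

def in_cone(v, x, vertices=VERTICES):
    """Test v in C_P(x), i.e. <v, y - x> >= 0 for all y in P.

    For a polytope it is enough to test the vertices y.
    """
    return all(
        v[0] * (y[0] - x[0]) + v[1] * (y[1] - x[1]) >= 0.0 for y in vertices
    )

corner, edge_point, interior = (0.0, 0.0), (0.0, 0.5), (0.5, 0.5)
```

At the corner (0, 0) the cone is the nonnegative orthant, at the edge point (0, 0.5) it is the ray spanned by (1, 0), and at an interior point it is {0}. In each case v and −v both belong to the cone only when v = 0, which is the pointedness established above.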

We now need a concept from linear algebra. If V and W are finite dimensional inner product spaces of the same dimension and ℓ : V → W is a nonsingular linear transformation, the adjoint of ℓ is the linear transformation ℓ∗ : W → V such that 〈v, ℓ∗(w)〉 = 〈ℓ(v), w〉 for all v ∈ V and w ∈ W. (It is an easy exercise to show that there is a unique function ℓ∗ determined by this condition, and that ℓ∗ is a nonsingular linear transformation.)

Lemma 15.3 Suppose that p ∈ Σ, U ⊂ M is an open set containing p, and ϕ : U → R^m and ϕ′ : U → R^m are Cr coordinate charts such that ϕ(Σ ∩ U) = P ∩ ϕ(U) and ϕ′(Σ ∩ U) = P′ ∩ ϕ′(U) for closed convex sets P, P′ ⊂ R^m with nonempty interiors. Let ℓ := Dϕ(p) and ℓ′ := Dϕ′(p). Then

ℓ∗(CP(ϕ(p))) = ℓ′∗(CP′(ϕ′(p))) .

Proof Suppose that w ∈ S^{m−1} is not in CP(ϕ(p)). Let w′ be the element of R^m such that ℓ∗(w) = ℓ′∗(w′). There is a y ∈ P such that 〈w, y − ϕ(p)〉 < 0. For sufficiently small ε > 0 let γ : (−ε, ε) → U be the function such that ϕ(γ(t)) = ϕ(p) + t(y − ϕ(p)). We compute that

〈w, y − ϕ(p)〉 = 〈w, ℓ(γ′(0))〉 = 〈ℓ∗(w), γ′(0)〉 = 〈ℓ′∗(w′), γ′(0)〉 = 〈w′, ℓ′(γ′(0))〉 .

Therefore 〈w′, ℓ′(γ′(0))〉 < 0, so 〈w′, ϕ′(γ(t)) − ϕ′(p)〉 < 0 for sufficiently small t > 0, and consequently w′ ∉ CP′(ϕ′(p)). By symmetry this suffices to establish the result. □

In view of this result we may define CΣ(p) to be Dϕ(p)∗(CP(ϕ(p))) for any Cr coordinate chart ϕ : U → R^m on an open set U containing p such that there is a convex set P ⊂ R^m with nonempty interior such that ϕ(Σ ∩ U) = P ∩ ϕ(U). Since it is the linear image of a closed pointed convex cone, CΣ(p) is a closed pointed convex cone. For p ∈ Σ let SΣ(p) := { v ∈ CΣ(p) : ‖v‖ = 1 } be the intersection of CΣ(p) with the unit sphere, and let

NΣ(p) := { v ∈ TpM : 〈v, w〉 ≥ 0 for all w ∈ SΣ(p) }

and

N°Σ(p) := { v ∈ TpM : 〈v, w〉 > 0 for all w ∈ SΣ(p) } .

Lemma 2.7 implies that N°Σ(p) is nonempty. Since SΣ(p) is compact, N°Σ(p) is an open subset of TpM, and NΣ(p) is its closure. (It is obvious that NΣ(p) contains the closure of N°Σ(p), and any point of NΣ(p) is in this closure because it is an endpoint of the line segment between itself and a point of N°Σ(p).)

Since SΣ(p) may be empty, SΣ is not a correspondence, so it is not quite correct to say that it is upper hemicontinuous.

Lemma 15.4 { (p, z) : p ∈ Σ and z ∈ SΣ(p) } is a closed subset of TΣ .

Proof Suppose that {pn} is a sequence in Σ converging to p ∈ Σ, zn ∈ SΣ(pn) for each n, and zn → z. There is a Cr coordinate chart ϕ : U → R^m such that ϕ(Σ ∩ U) = P ∩ ϕ(U) for some closed convex P ⊂ R^m with nonempty interior. Let x := ϕ(p), xn := ϕ(pn), ℓ := Dϕ(p)∗, ℓn := Dϕ(pn)∗, v := ℓ^{−1}(z), and vn := ℓn^{−1}(zn). If z ∉ SΣ(p), then v ∉ CP(x), so there is a y ∈ P such that 〈v, y − x〉 < 0. By continuity, 〈vn, y − xn〉 < 0 for large n, which implies that vn ∉ CP(xn) and zn ∉ CΣ(pn). This contradiction completes the proof. □

Corollary 15.1 The graph of N°Σ is an open subset of TΣ, and consequently N°Σ is lower hemicontinuous.

The map that takes a point near Σ to the nearest point in Σ will be an important technical device. We now show that this map is defined on a neighborhood of Σ, and well behaved on smaller neighborhoods.

Lemma 15.5 For every p0 ∈ Σ there is a neighborhood V ⊂ M and a constant λ > 0 such that if p, p′ ∈ V, q ∈ Σ is a nearest point for p, and q′ ∈ Σ is a nearest point for p′, then

‖p − p′‖ ≥ (1 − λ(‖p − q‖ + ‖p′ − q ′‖))‖q − q ′‖ .

Proof Let U ⊂ M be a neighborhood of p0 for which there is a Cr coordinate chart ϕ : U → R^m such that ϕ(Σ ∩ U) = P ∩ ϕ(U) for some closed convex P ⊂ R^m with nonempty interior. For any unit vector u ∈ R^m and any x ∈ ϕ(U) the second derivative of the function s ↦ ϕ^{−1}(x + su) at s = 0 can be expressed in terms of the second partials of ϕ^{−1}, so it is bounded for x in some neighborhood of ϕ(p0). In addition, for some neighborhood of p0 there is an upper bound on the ratio of ‖ϕ(q) − ϕ(q′)‖ to ‖q − q′‖ for all q, q′ in this neighborhood. Combining these, we find that there is a neighborhood U′ and a λ > 0 such that the convex hull of ϕ(U′) is contained in ϕ(U) and for all q, q′ ∈ U′, if ρ : [0, 1] → Σ is the path given by

ρ(t) := ϕ^{−1}((1 − t)ϕ(q) + tϕ(q′)) ,

then ‖ρ′(t′) − ρ′(t)‖ ≤ λ‖q′ − q‖² × |t′ − t| for all t, t′ ∈ [0, 1].

Let V be a neighborhood of p0 such that if p ∈ V and q is a nearest point in Σ, then q ∈ U′. Suppose that p, p′ ∈ V, q is a nearest point in Σ for p, and q′ is a nearest point in Σ for p′. We have 〈p − q, ρ′(0)〉 ≤ 0 because q is nearer to p than any other point in the image of ρ. Since q′ − q is the average of the ρ′(t), ‖q′ − q − ρ′(0)‖ ≤ λ‖q′ − q‖². Therefore

\[
\begin{aligned}
\langle p - q, q - q' \rangle &= \langle p - q, q - q' + \rho'(0) \rangle - \langle p - q, \rho'(0) \rangle \\
&\ge -\|p - q\| \cdot \|q - q' + \rho'(0)\| \ge -\lambda \|p - q\| \times \|q - q'\|^2 .
\end{aligned}
\]

There is a similar inequality with p′ and q′ in place of p and q, so we have the computation

\[
\begin{aligned}
\|p - p'\| \times \|q - q'\| &\ge \langle p - p', q - q' \rangle = \langle p - q, q - q' \rangle + \|q - q'\|^2 + \langle p' - q', q' - q \rangle \\
&\ge \bigl(1 - \lambda(\|p - q\| + \|p' - q'\|)\bigr)\|q - q'\|^2 .
\end{aligned}
\]

Dividing by ‖q − q′‖ gives the asserted inequality. □

Let VΣ be the set of p ∈ M such that there is a unique nearest point rΣ(p) ∈ Σ .

Proposition 15.2 VΣ is a neighborhood of Σ. For any ε > 0 there is a neighborhood V ⊂ VΣ of Σ such that rΣ|V is Lipschitz with Lipschitz constant 1 + ε.

Proof Let p0 and V be as in the last result. If p = p′ ∈ V and q and q′ are distinct nearest points, then ‖p − q‖ = ‖p − q′‖ ≥ 1/(2λ). If p, p′ ∈ V, q and q′ are nearest points in Σ, and ‖p − q‖, ‖p′ − q′‖ ≤ ε/(2λ(1 + ε)), then ‖q′ − q‖ ≤ (1 + ε)‖p′ − p‖. Thus each p0 ∈ Σ has a neighborhood such that the restriction of rΣ to this neighborhood is Lipschitz with Lipschitz constant 1 + ε. The second assertion is now a consequence of the following general result. □

Lemma 15.6 If X is a metric space, A ⊂ X, r : X → A is a retraction, ε > 0, and every a ∈ A has a neighborhood Ua such that r|Ua is Lipschitz with Lipschitz constant 1 + ε, then there is a neighborhood U of A such that r|U is Lipschitz with Lipschitz constant 1 + ε.

Proof For each a ∈ A let Va be a neighborhood of a such that d(x, r(x)) < (ε/2) d(x, X \ Ux) for all x ∈ Va, and set U := ⋃_a Va. For any x, y ∈ U, if x ∉ Uy and y ∉ Ux, then

d(r(x), r(y)) ≤ d(r(x), x) + d(x, y) + d(y, r(y)) ≤ (1 + ε)d(x, y),

and of course this inequality also holds if x ∈ Uy or y ∈ Ux. □
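The Lipschitz estimate of Proposition 15.2 can be checked numerically in a simple case. The following is a minimal sketch, not part of the text's argument: we assume M = R² and take Σ to be the annulus {1 ≤ ‖q‖ ≤ 2}, for which the nearest point map is radial clipping. The names `nearest_point` and `worst_ratio` are hypothetical helpers introduced only for this illustration. On the set of points within δ of the annulus the radial map has Lipschitz constant 1/(1 − δ), so δ = ε/(1 + ε) gives the constant 1 + ε.

```python
import math
import random


def nearest_point(p, inner=1.0, outer=2.0):
    """Nearest point of the annulus {inner <= |q| <= outer} to p (requires p != 0)."""
    norm = math.hypot(p[0], p[1])
    if norm == 0.0:
        raise ValueError("the origin has no unique nearest point in the annulus")
    scale = min(max(norm, inner), outer) / norm  # clip the radius, keep the direction
    return (p[0] * scale, p[1] * scale)


def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


def worst_ratio(delta, samples=20000, seed=0):
    """Largest sampled |r(p) - r(p')| / |p - p'| over points within delta of the annulus."""
    rng = random.Random(seed)

    def sample():
        radius = rng.uniform(1.0 - delta, 2.0 + delta)
        angle = rng.uniform(0.0, 2.0 * math.pi)
        return (radius * math.cos(angle), radius * math.sin(angle))

    worst = 0.0
    for _ in range(samples):
        p, q = sample(), sample()
        gap = dist(p, q)
        if gap > 1e-6:  # skip nearly coincident pairs to avoid rounding noise
            worst = max(worst, dist(nearest_point(p), nearest_point(q)) / gap)
    return worst


eps = 0.25
delta = eps / (1.0 + eps)  # chosen so that 1 / (1 - delta) = 1 + eps
print(worst_ratio(delta) <= 1.0 + eps + 1e-9)
```

Random sampling only gives a lower estimate of the true Lipschitz constant, so the check confirms consistency with the bound rather than proving it.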

We will need the following fact.

Lemma 15.7 For each p ∈ VΣ , πrΣ(p)(rΣ(p) − p) ∈ CΣ(rΣ(p)).

Proof Let ϕ : U → R^m be a C^r coordinate chart whose domain contains p such that there is a closed convex P ⊂ R^m with nonempty interior such that ϕ(Σ ∩ U) = P ∩ ϕ(U). Setting v := π_{rΣ(p)}(rΣ(p) − p), let v′ ∈ R^m be the vector such that v = Dϕ(rΣ(p))*v′. If v ∉ CΣ(rΣ(p)), then there is a y ∈ P such that ⟨v′, y − ϕ(rΣ(p))⟩ < 0. For a sufficiently small ε > 0 let ρ : [0, ε) → M be the path such that ϕ(ρ(t)) = (1 − t)ϕ(rΣ(p)) + ty. Then

$$\frac{d}{dt}\,\|\rho(t) - p\|^2\Big|_{t=0} = 2\langle r_\Sigma(p) - p, \rho'(0)\rangle = 2\langle v, \rho'(0)\rangle = 2\langle D\varphi(r_\Sigma(p))^* v', \rho'(0)\rangle$$

$$= 2\langle v', D\varphi(r_\Sigma(p))\rho'(0)\rangle = 2\langle v', (\varphi \circ \rho)'(0)\rangle = 2\langle v', y - \varphi(r_\Sigma(p))\rangle < 0 .$$

This contradicts the definition of rΣ(p). □


15.4 Flows on Diffeoconvex Bodies

Fix a diffeoconvex body Σ ⊂ M. Suppose we are given a vector field ζ on some S ⊂ Σ. We say that ζ is not outward pointing if ζ(p) ∈ NΣ(p) for all p ∈ S, and it is inward pointing if ζ(p) ∈ N°Σ(p) for all p ∈ S. We will extend the Picard–Lindelöf theorem to this setting when S is open and ζ is locally Lipschitz and not outward pointing. In order to do this we will extend ζ to a neighborhood of S in M, then show that the trajectories of the extension that begin in S stay in S. More specifically, we will show that there is a neighborhood of S in which trajectories move closer to S.

For p ∈ M let πp : R^k → TpM and νp := Id_{R^k} − πp be the orthogonal projections of R^k onto TpM and its orthogonal complement. The canonical extension of ζ is the vector field ζ̃ on VΣ given by

ζ̃p = πp(ζ_{rΣ(p)} + rΣ(p) − p).

If p ∈ Σ, then rΣ(p) = p, ζp ∈ TpM, and ζ̃p = πp(ζp) = ζp, so this is indeed an extension of ζ. We first show that if ζ is locally Lipschitz, then the restriction of ζ̃ to some neighborhood of VΣ is Lipschitz.

Lemma 15.8 The functions (p, v) ↦ πp(v) and (p, v) ↦ νp(v) from M × R^k to R^k are C^{r−1}. For each p0 ∈ M there is a constant C > 0 and a neighborhood W ⊂ M such that ‖νp|Tp′M‖ ≤ C‖p − p′‖ for all p, p′ ∈ W.

Proof Let U ⊂ R^m be open, and let ψ : U → M be a C^r parameterization. Let w1, . . . , wm be a basis of R^m, and suppose that v_{m+1}, . . . , vk ∈ R^k are such that for all x ∈ U,

Dψ(x)w1, . . . , Dψ(x)wm, v_{m+1}, . . . , vk

are linearly independent. For p = ψ(x) let b1(p), . . . , bk(p) be the result of applying the Gram–Schmidt process to these vectors. We have

πp(v) = ⟨v, b1(p)⟩b1(p) + · · · + ⟨v, bm(p)⟩bm(p).

Since ψ⁻¹ is C^r, Dψ(x)w is a C^{r−1} function of (x, w), and the Gram–Schmidt process is C^∞, b1(p), . . . , bk(p) are C^{r−1} functions of p. Thus (p, v) ↦ πp(v) is C^{r−1}, and of course νp(v) = v − πp(v).

We have

νp(v) = ⟨v, b_{m+1}(p)⟩b_{m+1}(p) + · · · + ⟨v, bk(p)⟩bk(p),

and if v ∈ Tp′M, then v = ∑_{i=1}^m ⟨v, bi(p′)⟩bi(p′), so the norm of νp|Tp′M is bounded by m(k − m) times

max_{i=1,...,m, j=m+1,...,k} |⟨bi(p′), bj(p)⟩|.

Each ⟨bi(p′), bj(p)⟩ is a differentiable function that vanishes when p = p′. This implies the second claim. □

For the remainder of this section we assume that ζ is locally Lipschitz. Fix an open V ⊂ VΣ that contains Σ such that rΣ|V is locally Lipschitz. Henceforth we regard the canonical extension ζ̃ as a vector field defined on V. Then ζ̃ is locally Lipschitz because its defining formula displays it as a composition of locally Lipschitz functions. Let W ⊂ V × R be the flow domain of ζ̃, and let Φ : W → M be the flow.

Our next concern is to show that trajectories that start sufficiently close to Σ are drawn toward it. We first discuss the technical basis for the analysis. A function f : [a, b] → R is absolutely continuous if, for any ε > 0, there is some δ > 0 such that ∑_{i=1}^n |f(bi) − f(ai)| < ε whenever [a1, b1], . . . , [an, bn] is a finite collection of pairwise disjoint subintervals of [a, b] that satisfy ∑_{i=1}^n (bi − ai) < δ. Every Lipschitz function is absolutely continuous: if L > 0 is a Lipschitz constant, take δ := ε/L.

Theorem 15.3 A function f : [a, b] → R is absolutely continuous if and only if it is differentiable almost everywhere and f(t) = f(a) + ∫_a^t f′(s) ds for all t ∈ [a, b].

Note that we are presuming that the reader knows that “almost everywhere” means “except on a set of measure zero.” This concept was introduced in Chap. 11. The integral here is the Lebesgue integral; concepts from measure theory that are required to understand the statement of this result are covered in Sect. 17.4. For a proof we recommend the treatment in Royden and Fitzpatrick (2010), which culminates in Sect. 6.5. (Not including a proof here is regrettable, but the argument is actually quite lengthy and intricate, drawing on concepts and results from several topics in analysis.)

Proposition 15.3 If {p} × [0, T] ⊂ W, τ : [0, T] → M is the trajectory τ(t) := Φ(p, t), and there is a constant K such that

⟨τ(t) − rΣ(τ(t)), ζ̃_{τ(t)}⟩ ≤ K‖τ(t) − rΣ(τ(t))‖²

for all t, then

‖τ(T) − rΣ(τ(T))‖ ≤ e^{KT}‖τ(0) − rΣ(τ(0))‖.

Proof For the sake of more compact notation we set w(t) := τ(t) − rΣ(τ(t)). Let f : [0, T] → R+ be the function f(t) := ‖w(t)‖². Since f is a composition of Lipschitz functions, it is Lipschitz, so Theorem 15.3 implies that f is almost everywhere differentiable and f(T) = f(0) + ∫_0^T f′(t) dt. Since τ is a trajectory, it is locally Lipschitz, and rΣ is Lipschitz, so rΣ(τ(t)) is Lipschitz. Applying Theorem 15.3 to each of its component functions, we find that for almost all t, (d/dt) rΣ(τ(t)) is defined, in which case

$$f'(t) = 2\Big\langle w(t),\ \tilde\zeta_{\tau(t)} - \frac{d}{dt}\, r_\Sigma(\tau(t)) \Big\rangle .$$

The first order condition for the minimization problem that rΣ(τ(t)) solves gives

$$\Big\langle w(t),\ \frac{d}{dt}\, r_\Sigma(\tau(t)) \Big\rangle = 0 .$$

Therefore f′(t) ≤ 2K f(t). The function g(t) := e^{−2Kt} f(t)/f(0) is absolutely continuous, and an elementary calculus computation gives g′(t) = e^{−2Kt}(f′(t) − 2K f(t))/f(0) ≤ 0 when g′(t) is defined, so g(T) = g(0) + ∫_0^T g′(t) dt ≤ g(0) = 1, from which we obtain the desired inequality √f(T) ≤ e^{KT}√f(0). □

Achieving K < 0 in the last result requires some technical inequalities. Evidently νp − νp′ is a C^{r−1} function of (p, p′) which vanishes when p = p′. Therefore for p0 ∈ M there is a constant α > 0 and a neighborhood W such that ‖νp − νp′‖ ≤ α‖p − p′‖ for all p, p′ in W. The vector νp(p − p′) is a C^{r−1} function of (p, p′), and the derivative with respect to p′ at (p0, p0) is zero. But we also have

‖νp(p − p′)‖ ≤ ‖νp′(p − p′)‖ + ‖νp − νp′‖ × ‖p − p′‖,

so the derivative with respect to p is also zero at (p0, p0). Since r ≥ 3 we conclude that:

Lemma 15.9 For each p0 ∈ M there is a constant D > 0 and a neighborhood W ⊂ M such that ‖νp(p − p′)‖ ≤ D‖p − p′‖² for all p, p′ ∈ W.
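For a concrete instance of Lemmas 15.8 and 15.9, consider the unit sphere M = S² ⊂ R³, where νp(v) = ⟨v, p⟩p and πp(v) = v − ⟨v, p⟩p. A direct computation gives ‖νp(v)‖ = |⟨v, p − p′⟩| ≤ ‖p − p′‖ ‖v‖ for v ∈ Tp′M, so C = 1 works, and ‖νp(p − p′)‖ = 1 − ⟨p, p′⟩ = ‖p − p′‖²/2, so D = 1/2 works. The sketch below checks both facts on random samples; the choice of the sphere and all function names are assumptions made for this illustration only.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def nu(p, v):
    """Normal component of v at the point p of the unit sphere: <v, p> p."""
    c = dot(v, p)
    return tuple(c * x for x in p)


def pi(p, v):
    """Tangential component of v at p, that is, v - nu_p(v)."""
    return sub(v, nu(p, v))


def random_unit(rng):
    """A random point of the unit sphere (normalized Gaussian vector)."""
    while True:
        v = tuple(rng.gauss(0.0, 1.0) for _ in range(3))
        n = norm(v)
        if n > 1e-3:
            return tuple(x / n for x in v)


def check(samples=2000, seed=1):
    """Return (max |nu_p(v)| / (|p - p'| |v|), max | |nu_p(p - p')| - |p - p'|^2 / 2 |)."""
    rng = random.Random(seed)
    worst_ratio, worst_gap = 0.0, 0.0
    for _ in range(samples):
        p, q = random_unit(rng), random_unit(rng)   # q plays the role of p'
        d = norm(sub(p, q))
        v = pi(q, random_unit(rng))                 # a tangent vector at q
        if d > 1e-3 and norm(v) > 1e-3:
            worst_ratio = max(worst_ratio, norm(nu(p, v)) / (d * norm(v)))
        worst_gap = max(worst_gap, abs(norm(nu(p, sub(p, q))) - d * d / 2.0))
    return worst_ratio, worst_gap


ratio, gap = check()
print(ratio <= 1.0 + 1e-9, gap < 1e-9)
```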

Proposition 15.4 Suppose p ∈ VΣ, z ∈ NΣ(rΣ(p)), and z̃ := πp(z + rΣ(p) − p). Let w := p − rΣ(p). If there are positive constants B, C, and D such that

‖z‖ ≤ B, ‖νp|T_{rΣ(p)}M‖ ≤ C‖w‖, and ‖νp(w)‖ ≤ D‖w‖²,

then

⟨w, z̃⟩ ≤ (−1 + BCD‖w‖ + D²‖w‖²)‖w‖².

Proof We have

z̃ = z − νp(z) − w + νp(w).

Lemma 15.7 implies that

⟨w, z⟩ = ⟨π_{rΣ(p)}(w), z⟩ ≤ 0.

The Cauchy–Schwarz inequality gives

|⟨w, νp(z)⟩| = |⟨νp(w), νp(z)⟩| ≤ (D‖w‖²) × (C‖w‖ × ‖z‖) ≤ BCD‖w‖³

and

⟨w, νp(w)⟩ = ⟨νp(w), νp(w)⟩ = ‖νp(w)‖² ≤ D²‖w‖⁴.

Combining these with ⟨w, z̃⟩ = ⟨w, z⟩ − ⟨w, νp(z)⟩ − ‖w‖² + ⟨w, νp(w)⟩ gives the asserted inequality. □


Proposition 15.5 There is a neighborhood V of Σ such that for every p ∈ V the distance to Σ is strictly decreasing along the trajectory of the canonical extension ζ̃ starting at p.

Proof Fix p0 ∈ Σ. Since rΣ and ζ are continuous, there is a neighborhood of p0 and a B > 0 such that ‖ζ_{rΣ(p)}‖ ≤ B for all p in this neighborhood. Lemmas 15.8 and 15.9 give constants C, D > 0 such that their asserted conclusions hold in some neighborhood of p0. Let W be a neighborhood satisfying all these conditions and also ‖p − rΣ(p)‖ < ε for all p ∈ W, where ε > 0 is small enough that −1 + BCDε + D²ε² < 0. Proposition 15.4 implies that ⟨p − rΣ(p), ζ̃p⟩ < 0 for all p ∈ W \ Σ, so Proposition 15.3 implies that trajectories in W move closer to Σ. □
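The behavior described in Propositions 15.3 and 15.5 can be observed numerically in the simplest setting. The sketch below assumes M = R² (so every πp is the identity), takes Σ to be the closed unit disc, and takes ζ to be the rotation field ζ(x, y) = (−y, x), which is tangent to the boundary circle and hence not outward pointing. Outside the disc the canonical extension has radial component −(‖p‖ − 1), so ⟨w, ζ̃p⟩ = −‖w‖² for w = p − rΣ(p), which is the case K = −1 of Proposition 15.3, and the distance to Σ along a trajectory should equal e^{−t} times its initial value. The integrator and all names are illustrative assumptions, not constructions from the text.

```python
import math


def nearest_in_disc(p):
    """r_Sigma for Sigma = closed unit disc in M = R^2."""
    n = math.hypot(p[0], p[1])
    return p if n <= 1.0 else (p[0] / n, p[1] / n)


def zeta(q):
    """Rotation field on the disc; tangent to the boundary, so not outward pointing."""
    return (-q[1], q[0])


def canonical_extension(p):
    """zeta_{r(p)} + r(p) - p; the projection pi_p is the identity because M = R^2."""
    q = nearest_in_disc(p)
    z = zeta(q)
    return (z[0] + q[0] - p[0], z[1] + q[1] - p[1])


def rk4_step(field, p, h):
    """One classical Runge-Kutta step for the equation p' = field(p)."""
    def shifted(k, s):
        return (p[0] + s * k[0], p[1] + s * k[1])
    k1 = field(p)
    k2 = field(shifted(k1, h / 2.0))
    k3 = field(shifted(k2, h / 2.0))
    k4 = field(shifted(k3, h))
    return (p[0] + h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0,
            p[1] + h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0)


def distances_along_trajectory(p0, t_end=2.0, steps=2000):
    """Distances to the disc at the grid times 0, h, 2h, ..., t_end."""
    h = t_end / steps
    p = p0
    out = [max(0.0, math.hypot(p[0], p[1]) - 1.0)]
    for _ in range(steps):
        p = rk4_step(canonical_extension, p, h)
        out.append(max(0.0, math.hypot(p[0], p[1]) - 1.0))
    return out


d = distances_along_trajectory((1.3, 0.0))
print(d[0], d[-1], 0.3 * math.exp(-2.0))
```

In this example the bound of Proposition 15.3 holds with equality, which makes it a convenient test case for any numerical implementation of the canonical extension.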

For us the final form of the Picard–Lindelöf theorem is:

Proposition 15.6 Let ζ be a locally Lipschitz vector field on Σ that is not outward pointing. If C ⊂ Σ is compact, then there is an ε > 0 such that for each p ∈ C there is a unique trajectory Φ(p, ·) : [0, ε) → Σ for ζ such that Φ0(p) = p. In addition Φ is continuous, and if ζ is C^s (1 ≤ s ≤ r − 1) then so is Φ.

Proof Proposition 15.1 implies that there is an ε > 0 such that for each p ∈ C there is a unique trajectory Φ̃(p, ·) : (−ε, ε) → M for the canonical extension ζ̃ such that Φ̃0(p) = p. In addition Φ̃ is continuous. It follows immediately that Φ is unique and continuous if it exists. Furthermore, the existence of Φ follows if we can show that trajectories of ζ̃ that begin in Σ stay in Σ. If a trajectory leaves Σ there must be an interval of time during which the distance from Σ goes from 0 to some positive quantity, but the last result implies that this cannot happen.

If ζ is C^s, then it has a C^s extension to a neighborhood of its domain in M. (The function p ↦ ζp has a C^s extension with range R^k, and we can compose this with the function (p, v) ↦ πp(v) to obtain an extended vector field.) Applying Proposition 15.1 to this extension gives a C^s flow that must agree with Φ on C × [0, ε), so Φ is C^s. □

The forward flow domain of ζ is the set W of pairs (p, t) ∈ U × R+ such that there is a trajectory γ : [0, t] → Σ of ζ with γ(0) = p. We say that ζ is forward complete if W = U × R+. With obvious modifications the arguments from earlier can be used to prove:

Theorem 15.4 The forward flow domain of ζ is an open W ⊂ U × R+ that contains U × {0}. There is a unique function Φ : W → Σ such that for each p ∈ U, Φ(p, ·) is a trajectory such that Φ(p, 0) = p. If (p, s) ∈ W and (Φ(p, s), t) ∈ W, then (p, s + t) ∈ W and

Φ(p, s + t) = Φ(Φ(p, s), t).

In addition Φ is continuous, and if ζ is C^s (1 ≤ s ≤ r) then so is Φ. If S ⊂ U and S × {t} ⊂ W, then Φ(·, t)|S is an embedding.


15.5 The Vector Field Index

Along with the degree and the fixed point index, the vector field index is a third major manifestation of the fixed point principle. In the simplest settings it coincides with the fixed point index of the vector field’s flow for small negative times, but it is well defined for vector fields that are not locally Lipschitz. We extend it to not outward pointing vector fields on diffeoconvex bodies, and then to contractible valued correspondences. This allows a correspondingly general form of the famous Poincaré–Hopf theorem.

Let ζ be a vector field on M. Recall that an equilibrium of ζ is a point p ∈ M such that ζp = 0 ∈ TpM. Let E(ζ) be the set of equilibria of ζ. As before, for a compact C ⊂ M the topological boundary of C is $\partial C := C \cap \overline{M \setminus C}$, and int C := C \ ∂C is its topological interior. A continuous vector field ζ on C is index admissible if E(ζ) ⊂ int C. Let VM be the set of index admissible vector fields ζ : C → TM where C ⊂ M is compact. The vector field index will be an integer valued function on VM.

At several points in the remainder of this chapter we will take advantage of a technical device that was introduced in Sect. 14.5. As we did there, let λ : M → R++ be a continuous function, let TMλ = { (p, v) ∈ TM : ‖v‖ < λ(p) }, and let κ : TMλ → M be a C^{r−1} function such that:

(a) κ(p, 0) = p and Dκ(p, ·)(0) = Id_{TpM} for all p ∈ M;
(b) κ̃ = π × κ : TMλ → M × M is a C^{r−1} embedding.

Theorem 15.5 There is a unique function indM : VM → Z satisfying:

(V1) (Normalization) indM(ζ) = 1 for all ζ ∈ VM with domain C such that there is a C^r parameterization ϕ : V → M with C ⊂ ϕ(V), ϕ⁻¹(C) = D^m, and Dϕ(x)⁻¹ζ_{ϕ(x)} = x for all x ∈ D^m.

(V2) (Additivity) indM(ζ) = ∑_{i=1}^s indM(ζ|Ci) whenever ζ ∈ VM with domain C and C1, . . . , Cs are pairwise disjoint compact subsets of C such that E(ζ) ⊂ int C1 ∪ . . . ∪ int Cs.

(V3) (Continuity) For each ζ ∈ VM with domain C there is a neighborhood U ⊂ TM of ζ(C) such that indM(ζ′) = indM(ζ) for all vector fields ζ′ on C with ζ′(C) ⊂ U.

If ζ ∈ VM with domain C and ζ(C) ⊂ TMλ, then

indM(ζ) = (−1)^m ΛM(κ ◦ ζ).

Proof For ζ ∈ VM and ε > 0 let εζ be the vector field p ↦ (p, εζp). There is some ε̄ > 0 such that εζ(C) ⊂ TMλ for all ε ∈ (0, ε̄), and we define the vector field index by setting indM(ζ) := (−1)^m ΛM(κ ◦ εζ) for such ε. (Of course κ ◦ εζ is (fixed point) index admissible, and by Continuity this definition does not depend on ε.)

That the index so defined satisfies (V1) is easily shown by a concrete comparison with the fixed point index in the C^∞ case. Evidently (V2) and (V3) are immediate consequences of (I2) and (I3).

To prove uniqueness suppose that a vector field index indM is given. If C ⊂ M is compact and f : C → M is index admissible, let

Λ′M(f) := indM(κ̃⁻¹ ◦ (IdD × f|D))

where D ⊂ C is a compact neighborhood of F(f) that is small enough that { (p, f(p)) : p ∈ D } ⊂ κ̃(TMλ). Clearly Λ′M satisfies (I1)–(I3). If there were two vector field indices, then there would be two fixed point indices, which is precluded by Corollary 13.1. □

While (V1) seems natural at first glance, it gives rise to some unfortunate signs because the origin is a repeller of the dynamical system given by the vector field x ↦ (x, x) on R^m. For the theory of dynamical systems the prototypical stable equilibrium is the dynamical system coming from x ↦ (x, −x). At least from our point of view, it would be preferable to have the vector field index normalized by requiring that the index of this vector field is +1.

As with the degree and index, (V3) implies the homotopy principle. A vector field homotopy on S is a continuous function η : S × [0, 1] → TM such that π(η(p, t)) = p for all (p, t), which is to say that each ηt = η(·, t) : S → TM is a vector field on S. A vector field homotopy η on C is index admissible if each ηt is index admissible. If indM(·) is a vector field index, then indM(ηt) is a (locally constant hence) constant function of t, so indM(η0) = indM(η1). Let −ζ denote the vector field p ↦ (p, −ζp).

Corollary 15.2 If ζ ∈ VM, then indM(−ζ) = (−1)^m indM(ζ).

Proof The function ζ ↦ (−1)^m indM(−ζ) clearly satisfies (V1)–(V3), so the claim follows from the uniqueness assertion of the last result. □
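In the plane the sign conventions can be made concrete. For a vector field on R² with an isolated equilibrium at the origin, the classical winding number of the field around a small circle is an index that assigns +1 to x ↦ x, in agreement with (V1); we take it as a working assumption that it coincides with indM in this setting. The sketch below (function names are hypothetical) computes this winding number and illustrates Corollary 15.2 for m = 2, where negating the field leaves the index unchanged.

```python
import math


def planar_index(field, radius=1.0, samples=3600):
    """Winding number of field(x, y) along the circle of the given radius about 0."""
    total = 0.0
    previous = None
    for k in range(samples + 1):
        theta = 2.0 * math.pi * k / samples
        vx, vy = field(radius * math.cos(theta), radius * math.sin(theta))
        angle = math.atan2(vy, vx)
        if previous is not None:
            step = angle - previous
            # Bring the increment back into (-pi, pi] so branch jumps of atan2 cancel.
            if step > math.pi:
                step -= 2.0 * math.pi
            elif step <= -math.pi:
                step += 2.0 * math.pi
            total += step
        previous = angle
    return round(total / (2.0 * math.pi))


def source(x, y):
    """The repeller x -> x used in the normalization (V1)."""
    return (x, y)


def saddle(x, y):
    return (x, -y)


def negate(field):
    """The vector field -zeta."""
    return lambda x, y: tuple(-c for c in field(x, y))


for name, field in (("source", source), ("saddle", saddle)):
    print(name, planar_index(field), planar_index(negate(field)))
```

The source and the sink x ↦ −x receive the same index when m = 2; in odd dimensions they differ by a sign, which is the source of the unfortunate signs discussed above.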

Combining the last two results gives:

Corollary 15.3 If ζ ∈ VM with domain C and ζ(C) ⊂ T Mλ, then

indM(−ζ ) = ΛM(κ ◦ ζ ) .

There is a natural relationship between the vector field index and the index of the forward flow for small times.

Theorem 15.6 Let ζ be a locally Lipschitz vector field on M, let W be the flow domain of ζ, and let Φ be the flow. If C ⊂ M is compact, ζ|C is index admissible, and U ⊂ C is an open neighborhood of E(ζ), then F(Φt|C) ⊂ U and

ΛM(Φt|C) = (−1)^m indM(ζ|C)

for all sufficiently small positive t.


Proof For t ∈ R let tζ be the vector field p ↦ (p, tζp). When t > 0 is sufficiently small we may define a vector field ζ^t on C by setting

ζ^t(p) = (p, ζ^t_p) := κ̃⁻¹(p, Φt(p)).

The partial of ζ^t_p with respect to t is Dκ⁻¹(p, ·)|_{Φt(p)} ζ_{Φt(p)}. Since this is a continuous function of (p, t), and equal to ζp when t = 0, for each p ∈ C \ U there is a neighborhood Wp ⊂ C and εp > 0 such that ‖tζp′ − ζ^t_{p′}‖ < ‖tζp′‖ for all p′ ∈ Wp and t ∈ (0, εp). Since C \ U is compact, it follows that there is ε > 0 such that ‖tζp − ζ^t_p‖ < ‖tζp‖ for all p ∈ C \ U and t ∈ (0, ε). For such a t let η : C × [0, 1] → TM be the vector field homotopy η(p, s) := (p, (1 − s)tζp + sζ^t_p). This is index admissible, and consequently κ ◦ η is an index admissible homotopy between κ ◦ tζ|C and Φt|C. Therefore F(Φt|C) ⊂ U and

ΛM(Φt|C) = ΛM(κ ◦ tζ|C) = (−1)^m indM(tζ|C) = (−1)^m indM(ζ|C),

where the second equality is from Theorem 15.5 and the third comes from the fact that s ↦ sζ|C is an index admissible homotopy between tζ|C and ζ|C. □
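The sign (−1)^m in Theorem 15.6 is easy to verify for a diagonal linear vector field ζ(x) = (a1x1, . . . , amxm) on R^m with all ai ≠ 0, whose flow is Φt(x) = (e^{a1t}x1, . . . , e^{amt}xm). The sketch below relies on two standard formulas that we treat as assumptions here rather than results of this chapter: the vector field index of a nondegenerate equilibrium is the sign of det Dζ(0), and the fixed point index of a regular fixed point of f is the sign of det(I − Df). The function names are hypothetical.

```python
import math


def sign(x):
    return (x > 0) - (x < 0)


def vector_field_index(rates):
    """Sign of det(diag(rates)): the assumed index at 0 of x -> (a_1 x_1, ..., a_m x_m)."""
    result = 1
    for a in rates:
        result *= sign(a)
    return result


def flow_fixed_point_index(rates, t):
    """Sign of det(I - D Phi_t(0)) for the flow Phi_t(x) = (exp(a_i t) x_i)."""
    result = 1
    for a in rates:
        result *= sign(1.0 - math.exp(a * t))
    return result


def agrees_with_theorem(rates, t=0.01):
    """Check Lambda(Phi_t) = (-1)^m ind(zeta) for one small positive t."""
    m = len(rates)
    return flow_fixed_point_index(rates, t) == (-1) ** m * vector_field_index(rates)


print(agrees_with_theorem((1.0,)), agrees_with_theorem((-1.0, 2.0, 0.5)))
```

Each factor 1 − e^{ai t} has the sign of −ai when t > 0, which is exactly where the factor (−1)^m comes from in this special case.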

Now let Σ be a diffeoconvex body. A vector field ζ on a compact C ⊂ Σ is index admissible for Σ if it is not outward pointing and it has no equilibria in the topological boundary $\partial C = C \cap \overline{\Sigma \setminus C}$ of C relative to Σ. Let VΣ be the set of such vector fields. We would like to define a vector field index for VΣ by having the index of ζ be the index of an extension ζ̃ of ζ to a compact C̃ ⊂ M that does not have any additional equilibria, and which does not have any equilibria on its boundary relative to M. In order to do this we need to show that such extensions exist.

A standard extension of ζ ∈ VΣ with domain C is the restriction of the canonical extension ζ̃ to some compact C̃ ⊂ M that contains E(ζ) in its interior, such that C ⊂ C̃ and ζ̃p ≠ 0 for all p ∈ C̃ \ C.

Lemma 15.10 If ζ ∈ VΣ has domain C, then for any neighborhood W ⊂ M of C there is a standard extension of ζ whose domain is contained in W.

Proof In view of the last result it is easy to construct a compact subset C′ of the interior of rΣ⁻¹(C) ∩ W that contains E(ζ) in its interior, such that ζ̃p ≠ 0 for all p ∈ C′ \ Σ. We can then set C̃ = C ∪ C′. □

Theorem 15.7 If we set indΣ(ζ) := indM(ζ̃) whenever ζ̃ is a standard extension of ζ, then indΣ : VΣ → Z is a well defined function that satisfies:

(V1) indΣ(ζ) = 1 for all ζ ∈ VΣ with domain C such that there is a C^r parameterization ϕ : V → M with C ⊂ ϕ(V), ϕ⁻¹(C) = D^m, and Dϕ(x)⁻¹ζ_{ϕ(x)} = x for all x ∈ D^m.

(V2) indΣ(ζ) = ∑_{i=1}^s indΣ(ζ|Ci) whenever ζ ∈ VΣ with domain C and C1, . . . , Cs are pairwise disjoint compact subsets of C such that E(ζ) ⊂ int C1 ∪ . . . ∪ int Cs.

(V3) For each ζ ∈ VΣ with domain C there is a neighborhood U ⊂ TM of ζ(C) such that indΣ(ζ′) = indΣ(ζ) for all vector fields ζ′ on C with ζ′(C) ⊂ U.

Proof Additivity implies that indM(ζ̃|C̃) does not depend on the choice of the domain C̃ of the standard extension, so indΣ is well defined. That indΣ satisfies Normalization and Additivity follows from the corresponding properties of indM. To prove Continuity we observe that if ζ′ is in a sufficiently small neighborhood of ζ, then the convex combination homotopy is index admissible. Applying the standard extension to this homotopy allows the result to be derived from Continuity for indM. □

It seems quite likely that indΣ is uniquely determined by (V1)–(V3), but it is not obvious how one might prove this. Fortunately we do not need such a result.

For future reference we mention the following obvious consequence of Corollary 15.2.

Corollary 15.4 If ζ ∈ VΣ, then indΣ(−ζ) = (−1)^m indΣ(ζ).

The relationship between the vector field index and the fixed point index given by Theorem 15.5 extends to Σ:

Theorem 15.8 If ζ ∈ VΣ and κ(p, tζp) ∈ Σ for all p ∈ C and all t ∈ [0, 1], then

indΣ(−ζ) = ΛΣ(κ ◦ ζ).

Proof Let νM ⊂ TR^k be the normal bundle of M. Recall that the tubular neighborhood theorem gives a neighborhood W ⊂ νM of the zero section of νM and a C^{r−1} embedding ι : W → R^k such that ι(p, 0) = p for all p ∈ M. Let ρ : ι(W) → M be the composition of ι⁻¹ with the projection νM → M. Let V′Σ be the set of p ∈ VΣ such that the line segment between p and rΣ(p) is contained in ι(W), and let h : V′Σ × [0, 1] → M be the homotopy

h(p, t) = ρ((1 − t)p + t rΣ(p)).

Let ζ̃ be a standard extension of ζ whose domain C̃ is contained in V′Σ. Let i : Σ → M be the inclusion.

The definition and Corollary 15.3 give indΣ(−ζ) = indM(−ζ̃) = ΛM(κ ◦ ζ̃). By Homotopy, ΛM(κ ◦ ζ̃) = ΛM(rΣ ◦ κ ◦ ζ̃). Of course rΣ ◦ κ ◦ ζ̃ = i ◦ rΣ ◦ κ ◦ ζ̃. There are neighborhoods of E(ζ) in C̃ and C that have no fixed points of i ◦ rΣ ◦ κ ◦ ζ̃ and rΣ ◦ κ ◦ ζ̃ ◦ i|C, so γ_{E(ζ)}(i ◦ rΣ ◦ κ ◦ ζ̃) and γ_{E(ζ)}(rΣ ◦ κ ◦ ζ̃ ◦ i|C) are index admissible, and Commutativity gives

ΛM(i ◦ rΣ ◦ κ ◦ ζ̃) = ΛM(γ_{E(ζ)}(i) ◦ γ_{E(ζ)}(rΣ ◦ κ ◦ ζ̃))
= ΛΣ(γ_{E(ζ)}(rΣ ◦ κ ◦ ζ̃) ◦ γ_{E(ζ)}(i)) = ΛΣ(rΣ ◦ κ ◦ ζ̃ ◦ i|C).

Finally the hypothesis gives rΣ ◦ κ ◦ ζ̃ ◦ i|C = rΣ ◦ κ ◦ ζ = κ ◦ ζ. □


The relationship between the vector field index and the index of the forward flow for small times extends to diffeoconvex bodies.

Theorem 15.9 Let ζ be a locally Lipschitz vector field on Σ that is not outward pointing. Let W be the forward flow domain of ζ, and let Φ be the forward flow. If C ⊂ Σ is compact and ζ|C is index admissible for Σ, then

ΛΣ(Φt|C) = (−1)^m indΣ(ζ|C)

for all sufficiently small positive t.

Proof Fix an ε ∈ (0, 1/2). We fix a neighborhood V ⊂ VΣ of Σ such that the canonical extension ζ̃|V is locally Lipschitz. We require that V is small enough that trajectories of ζ̃ starting at points in V move closer to Σ, as per Proposition 15.5. We also require that (rΣ(p), p) ∈ κ̃(TMλ) for all p ∈ V, and that if κ(rΣ(p), v) = p, then ‖Dκ(rΣ(p), ·)(sv) − Id_{T_{rΣ(p)}M}‖ < ε for all s ∈ [0, 1]. Let Φ̃ be the flow of ζ̃|V.

Let C′ be a compact neighborhood (in M) of E(ζ) that is contained in rΣ⁻¹(C) ∩ V, and let C̃ := C ∪ C′. Let D be a compact neighborhood (in Σ) of E(ζ) that is contained in the interior of C, and let D̃ be a compact neighborhood (in M) of E(ζ) that is contained in the interior of C̃. Let i : Σ → V be the inclusion. We claim that for all sufficiently small positive t,

indΣ(ζ|C) = indM(ζ̃|C̃) = (−1)^m ΛM(Φ̃t|C̃)

and

ΛM(Φ̃t|C̃) = ΛM(rΣ ◦ Φ̃t|C̃) = ΛM(i ◦ rΣ ◦ Φ̃t|D̃) = ΛΣ(rΣ ◦ Φ̃t ◦ i|D) = ΛΣ(Φt|D) = ΛΣ(Φt|C).

Since ζ̃|C̃ is a standard extension of ζ|C, the first asserted equality is simply the definition of indΣ. Theorem 15.6 implies that for all sufficiently small t > 0, all of the fixed points of Φt|C and Φ̃t|C̃ are contained in the interiors of D and D̃ respectively, and that the second equality holds. Below we will construct an index admissible homotopy between Φ̃t|C̃ and rΣ ◦ Φ̃t|C̃, which gives the third equality. If t is sufficiently small, then the fixed points of rΣ ◦ Φ̃t|C̃ are contained in D̃, so the fourth equality follows from Additivity. Since D and D̃ are compact and contained in the interiors of C and C̃ (and C̃ ⊂ rΣ⁻¹(C)), for sufficiently small t we have Φt(D) ⊂ C and rΣ(Φ̃t(D̃)) ⊂ C, in which case the fifth equality follows from the version of Commutativity given by Proposition 13.6. The sixth equality is simply rΣ ◦ Φ̃t ◦ i|D = Φt|D. (Trajectories do not leave Σ.) For small t the last equality is Additivity.

It remains to construct the desired homotopy. For p ∈ C̃ let g(p) := rΣ(Φ̃t(p)) and vp := κ(g(p), ·)⁻¹(Φ̃t(p)). Let h : C̃ × [0, 1] → M be the homotopy

h(p, τ) := κ(g(p), τvp).

Evidently h0 = rΣ ◦ Φ̃t|C̃ and h1 = Φ̃t|C̃, so it remains to show that h is index admissible. For this it suffices to show that for all τ, hτ has no fixed points in C̃ \ Σ.

Fix p ∈ C̃ \ Σ. To simplify notation let δ := Φ̃t(p) − g(p) and f := κ(g(p), ·) : T_{g(p)}M → M, so that h(p, τ) = f(τvp). Noting that δ = ∫_0^1 Df(svp)vp ds, we have

$$\delta = v_p + \int_0^1 \big(Df(sv_p) - \mathrm{Id}_{T_{g(p)}M}\big)v_p \, ds ,$$

so ‖δ‖ ≥ (1 − ε)‖vp‖, and

$$\frac{d}{d\tau}\langle f(\tau v_p), \delta\rangle = \langle Df(\tau v_p)v_p, \delta\rangle = \|\delta\|^2 + \Big\langle \int_0^1 \big(Df(\tau v_p) - Df(sv_p)\big)v_p \, ds,\ \delta \Big\rangle$$

$$\ge \|\delta\|^2 - \varepsilon\|v_p\| \times \|\delta\| \ge \frac{1 - 2\varepsilon}{1 - \varepsilon}\,\|\delta\|^2 > 0 .$$

Since Φ̃t(p) is strictly closer to Σ than p, p and g(p) are on opposite sides of the hyperplane containing Φ̃t(p) that is orthogonal to δ. Since ⟨f(τvp), δ⟩ is an increasing function of τ, f(τvp) is on the same side of this hyperplane as g(p), and thus is not equal to p for any τ. □

A vector field correspondence on a set S ⊂ Σ is a correspondence Z : S → R^k such that Z(p) ⊂ TpM for all p ∈ S. For such a Z let E(Z) := { p ∈ S : 0 ∈ Z(p) }. We say that Z is not outward pointing if Z(p) ⊂ NΣ(p) for all p ∈ S. If, for all p ∈ S, Z(p) is contained in the interior of NΣ(p), then Z is inward pointing. A vector field correspondence Z on a compact C ⊂ Σ is index admissible for Σ if it is not outward pointing and E(Z) ∩ ∂C = ∅. Let VΣ be the set of index admissible upper hemicontinuous contractible valued vector field correspondences. Following the methods used to extend the fixed point index to correspondences, in order to extend the vector field index to vector field correspondences we show that an upper hemicontinuous contractible valued vector field correspondence can be approximated by a vector field, and that sufficiently close approximations are homotopic.

Lemma 15.11 There is a locally Lipschitz inward pointing vector field on Σ.

Proof For each p ∈ Σ choose vp ∈ N°Σ(p) and a neighborhood Up ⊂ Σ of p such that πp′(vp) ∈ N°Σ(p′) for all p′ ∈ Up. (Corollary 15.1 implies that this is possible.) Since Σ is paracompact there is an index set I, a function ρ : I → Σ, and an open Vi ⊂ U_{ρ(i)} for each i, such that { Vi : i ∈ I } is a locally finite cover of Σ. Let {ϕi} be a C^∞ partition of unity subordinate to this cover. The desired vector field ν is defined by setting νp := ∑_i ϕi(p)πp(v_{ρ(i)}). This is locally Lipschitz because (p, y) ↦ πp(y) is C^{r−1}, and it is inward pointing because each N°Σ(p) is convex. □

Proposition 15.7 If C ⊂ Σ is a compact ANR and Z is a not outward pointing upper hemicontinuous contractible valued correspondence on C, then for any neighborhood U ⊂ TC of Gr(Z), there are:

(a) an inward pointing vector field ζ on C with (p, ζp) ∈ U for all p ∈ C;
(b) a neighborhood U′ ⊂ U of Gr(Z) such that for any not outward pointing vector fields ζ, ζ′ on C with (p, ζp), (p, ζ′p) ∈ U′ for all p ∈ C there is a vector field homotopy η on C with η0 = ζ, η1 = ζ′, ηt not outward pointing for all t, and (p, ηt(p)) ∈ U for all p and t.

Proof The last result gives an inward pointing vector field ν on C. Choose ε > 0 such that (p, z + tεν(p)) ∈ U for all (p, z) ∈ Gr(Z) and all 0 ≤ t ≤ 1. Since { (p, z + tεν(p)) : (p, z) ∈ Gr(Z) and 0 ≤ t ≤ 1 } is compact and the graph of N°Σ is open (Corollary 15.1) there is a neighborhood Ũ ⊂ U of Gr(Z) such that for all (p, z) ∈ Ũ, z + εν(p) ∈ N°Σ(p) and (p, z + tεν(p)) ∈ U for all 0 ≤ t ≤ 1. Let V := { (p, v) ∈ C × R^k : (p, πp(v)) ∈ Ũ }. Proposition 9.3 gives a continuous f : C → R^k such that Gr(f) ⊂ V. Let ζ be the vector field p ↦ (p, πp(f(p)) + εν(p)).

Proposition 9.3 gives a neighborhood V′ ⊂ V of Gr(Z) such that for any continuous f0, f1 : C → R^k with Gr(f0), Gr(f1) ⊂ V′ there is a homotopy h : C × [0, 1] → R^k with h0 = f0, h1 = f1, and Gr(ht) ⊂ V for all t. Let U′ := V′ ∩ TC, and let V′′ := { (p, v) ∈ V′ : (p, πp(v)) ∈ U′ }. If ζ, ζ′ are not outward pointing vector fields whose images are contained in V′′, Proposition 9.3 gives a homotopy h : C × [0, 1] → R^k with h0(p) = ζp and h1(p) = ζ′p for all p and Gr(ht) ⊂ V′′ for all t. We define a vector field homotopy η by setting

$$(\eta_t)_p := \begin{cases} (p,\ \zeta_p + 3t\varepsilon\nu(p)), & 0 \le t \le \tfrac13, \\ (p,\ \pi_p(h_{3t-1}(p)) + \varepsilon\nu(p)), & \tfrac13 \le t \le \tfrac23, \\ (p,\ \zeta'_p + 3(1 - t)\varepsilon\nu(p)), & \tfrac23 \le t \le 1. \end{cases}$$

□

Theorem 15.10 There is a function indΣ : VΣ → Z that extends the previously defined index on VΣ and satisfies Additivity and Continuity:

(a) indΣ(Z) = ∑_{i=1}^s indΣ(Z|Ci) whenever Z ∈ VΣ with domain C and C1, . . . , Cs are pairwise disjoint compact subsets of C such that E(Z) ⊂ int C1 ∪ . . . ∪ int Cs.

(b) For each Z ∈ VΣ with domain C there is a neighborhood U ⊂ TM of { (p, z) : p ∈ C and z ∈ Z(p) } such that indΣ(Z′) = indΣ(Z) for all Z′ ∈ VΣ with domain C such that { (p, z) : p ∈ C and z ∈ Z′(p) } ⊂ U.

Proof We define indΣ(Z) to be indΣ(ζ) for vector fields that are index admissible with images contained in a sufficiently small neighborhood of Gr(Z), where “sufficiently small” means that any two such vector fields are index admissible (for Σ) homotopic. The last result implies that this definition is meaningful. Of course this definition agrees with the previously defined index on vector fields, and (b) is automatic. The proof of (a) is familiar: observe that if the image of an approximating vector field for Z is required to lie in a suitably small neighborhood of Gr(Z), then the restriction of this vector field to each Ci is an approximating vector field for Z|Ci. □


We now have a quite general version of a very famous result.

Theorem 15.11 (Poincaré–Hopf Theorem) If Σ is compact and Z is an index admissible upper hemicontinuous contractible valued vector field correspondence on Σ, then

χ(Σ) = (−1)^m indΣ(Z).

Proof Condition (b) of the last result guarantees that there is a neighborhood U ⊂ TM of Gr(Z) such that indΣ(Z′) = indΣ(Z) for all Z′ ∈ VΣ with domain Σ such that Gr(Z′) ⊂ U. Proposition 15.7 guarantees that there is a not outward pointing vector field ζ on Σ with { (p, ζp) : p ∈ Σ } ⊂ U, so indΣ(Z) = indΣ(ζ). Lemma 15.11 implies that there is a locally Lipschitz inward pointing vector field ν, and since each NΣ(p) is convex, convex combination gives an index admissible homotopy between ζ and ν, so indΣ(ζ) = indΣ(ν). Let Φ be the forward flow for ν. Theorem 15.9 implies that indΣ(ν) = (−1)^m ΛΣ(Φt) for sufficiently small t > 0. Homotopy implies that ΛΣ(Φt) = ΛΣ(IdΣ) = χ(Σ). □

In the customary statement of this result Σ is a ∂-manifold and the vector field is required to be outward pointing, which explains why our formulation has the factor (−1)^m. Our result is more general because Σ can be a diffeoconvex body, we allow vector field correspondences, and we do not require that Z is inward pointing. In particular, our formulation allows equilibria on the boundary of Σ. Exercise 15.2 presents a special case of a recent economic application (McLennan 2018) of the Poincaré–Hopf theorem that takes advantage of this additional generality.
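A small sanity check of the formula χ(Σ) = (−1)^m indΣ(Z) is possible when Σ is the closed unit disc in R² (so m = 2 and χ(Σ) = 1) and Z is the single valued spiral sink ζ(x, y) = (−x − y, x − y). We assume, as an illustration only, that the index of a planar field with one interior equilibrium is its winding number around that equilibrium, and that a field is inward pointing on the boundary circle when its inner product with the outward unit normal is negative. All function names below are hypothetical.

```python
import math


def planar_index(field, radius=0.5, samples=3600):
    """Winding number of field(x, y) along the circle of the given radius about 0."""
    total = 0.0
    previous = None
    for k in range(samples + 1):
        theta = 2.0 * math.pi * k / samples
        vx, vy = field(radius * math.cos(theta), radius * math.sin(theta))
        angle = math.atan2(vy, vx)
        if previous is not None:
            step = angle - previous
            # Undo the branch jumps of atan2.
            if step > math.pi:
                step -= 2.0 * math.pi
            elif step <= -math.pi:
                step += 2.0 * math.pi
            total += step
        previous = angle
    return round(total / (2.0 * math.pi))


def spiral_sink(x, y):
    """Vanishes only at the origin; on the unit circle <zeta(p), p> = -1."""
    return (-x - y, x - y)


def inward_on_unit_circle(field, samples=360):
    """True if <field(p), p> < 0 at every sampled boundary point p."""
    for k in range(samples):
        a = 2.0 * math.pi * k / samples
        x, y = math.cos(a), math.sin(a)
        vx, vy = field(x, y)
        if vx * x + vy * y >= 0.0:
            return False
    return True


m = 2
euler_characteristic_of_disc = 1  # the disc is contractible
index = planar_index(spiral_sink)
print(inward_on_unit_circle(spiral_sink), (-1) ** m * index == euler_characteristic_of_disc)
```

In one dimension the same bookkeeping for Σ = [−1, 1] and ζ(x) = −x gives index −1 and (−1)¹ × (−1) = 1 = χ(Σ), again under the assumption that the index of a nondegenerate equilibrium is the sign of the derivative.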

15.6 Dynamic Stability

In this section we study basic stability notions, focusing on the notion of uniform asymptotic stability. The first main result is that a sufficient condition for this version of stability is the existence of a Lyapunov function. We also show that if a compact set is uniformly asymptotically stable for a dynamical system defined by a vector field on a diffeoconvex body Σ, then it is uniformly asymptotically stable for the dynamical system defined by the canonical extension of the vector field.

Many of the concepts below do not depend on the dynamical system being the solution of a differential equation, so we will study a dynamical system on a locally compact metric space X. Let W be an open subset of X × R+ such that for each p ∈ X, { t : (p, t) ∈ W } is an interval [0, T) for some T ∈ (0, ∞]. Let φ : W → X be a continuous function, which we will refer to as the forward flow, such that φ(p, 0) = p for all p ∈ X and φ(φ(p, t), s) = φ(p, s + t) for all p ∈ X and s, t ≥ 0 such that (p, s + t) ∈ W. We often write φt in place of φ(·, t).

The ω-limit set of a point p ∈ X is

$$\omega_\phi(p) := \bigcap_{t \ge 0} \overline{\{\, \phi_s(p) : s \ge t \text{ and } (p, s) \in W \,\}} .$$

More generally, the ω-limit set of a set A ⊂ X is

$$\omega_\phi(A) := \bigcap_{t \ge 0} \overline{\{\, \phi_s(p) : p \in A,\ s \ge t \text{ and } (p, s) \in W \,\}} .$$

Note that since it is an intersection of closed sets, ωφ(A) is closed.

The domain of attraction of A is

D(A) = { p ∈ X : ∅ ≠ ωφ(p) ⊂ A }.

A fundamental neighborhood of A is a neighborhood U such that U × R+ ⊂ W and for every neighborhood U′ ⊂ X of A there is a T > 0 such that φ[T,∞)(U) ⊂ U′. The set A is:

(a) forward invariant if A × R+ ⊂ W and φR+(A) ⊂ A;
(b) Lyapunov stable if, for every neighborhood U of A there is a neighborhood U′ such that U′ × R+ ⊂ W and φR+(U′) ⊂ U;
(c) attractive if D(A) is a neighborhood of A;
(d) asymptotically stable if it is Lyapunov stable and attractive;
(e) uniformly attractive if it has a fundamental neighborhood;
(f) uniformly asymptotically stable if it is Lyapunov stable and uniformly attractive.

We now study the relationships between these concepts. First note that if A is Lyapunov stable, then it is forward invariant: if p ∈ A, then {p} × R+ ⊂ W and φR+(p) is contained in the intersection of all neighborhoods of A, which is A itself.

Lemma 15.12 If A is compact and asymptotically stable, then any compact neighborhood U of A that is contained in D(A) is a fundamental neighborhood.

Proof Since U ⊂ D(A), U × R+ ⊂ W. Let U′ be any neighborhood of A. Since A is Lyapunov stable there is an open neighborhood U′′ of A such that U′′ × R+ ⊂ W and φR+(U′′) ⊂ U′. The trajectory of each p ∈ U eventually hits U′′, and in fact (because φ is continuous) there is some t such that φt maps a neighborhood of p to U′′. Since U is covered by finitely many such neighborhoods, there is a T such that φ[T,∞)(U) ⊂ U′. □

Corollary 15.5 If A is compact and asymptotically stable, then it is uniformly asymptotically stable.

Proof Since A is attractive, D(A) is a neighborhood of A. Since A is compact and X is locally compact, D(A) contains a compact neighborhood of A, which is fundamental, according to the last result. □
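The uniform entry time T in the definition of a fundamental neighborhood can be computed explicitly in simple cases. The sketch below does this for an assumed example, the flow of dx/dt = −x³ on R with A = {0}, U = [−1, 1] and U′ = [−δ, δ]; the system and the names `flow` and `entry_time` are illustrative assumptions, not part of the text.

```python
import math


def flow(p: float, t: float) -> float:
    """Closed-form forward flow of dx/dt = -x**3 (an assumed example system)."""
    return p / math.sqrt(1.0 + 2.0 * t * p * p)


def entry_time(delta: float) -> float:
    """A time T with |flow(p, t)| <= delta for all |p| <= 1 and all t >= T.

    On U = [-1, 1] the largest value of |flow(p, t)| is attained at |p| = 1
    and equals 1 / sqrt(1 + 2 t), which is <= delta once
    t >= (delta**-2 - 1) / 2.  Any larger T works as well.
    """
    return max(0.0, (1.0 / (delta * delta) - 1.0) / 2.0)
```

The point of the example is that a single T serves every starting point in U at once, which is what distinguishes uniform attractivity from attractivity.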

One of the earliest and most useful tools for understanding stability was introduced by Lyapunov toward the end of the 19th century. A function f : X → R is φ-differentiable if the φ-derivative

\[ \frac{df}{d\varphi}(p) := \frac{d}{dt} f(\varphi_t(p)) \Big|_{t=0} \]

is defined for every p ∈ X. A continuous function L : X → R+ is a Lyapunov function for a nonempty A ⊂ X if:

(a) L−1(0) = A;
(b) L is φ-differentiable with (dL/dφ)(p) < 0 for all p ∈ X \ A;
(c) for every neighborhood U of A there is an ε > 0 such that L−1([0, ε]) ⊂ U.
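As an illustration of conditions (a) and (b), the following sketch approximates the φ-derivative by a forward difference and checks its sign. The example system dx/dt = −x³ with A = {0}, the candidate L(x) = x², and all function names are assumptions made for this sketch; for this choice the exact φ-derivative is −2x⁴.

```python
import math


def flow(p: float, t: float) -> float:
    """Closed-form forward flow of dx/dt = -x**3 (an assumed example system)."""
    return p / math.sqrt(1.0 + 2.0 * t * p * p)


def lyapunov_candidate(p: float) -> float:
    """Candidate Lyapunov function L(x) = x**2 for A = {0}."""
    return p * p


def phi_derivative(f, p: float, h: float = 1e-6) -> float:
    """Forward-difference approximation of d/dt f(flow(p, t)) at t = 0."""
    return (f(flow(p, h)) - f(p)) / h
```

Condition (c) also holds here, because the sublevel set of L at level ε is the interval [−√ε, √ε].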

Theorem 15.12 (Lyapunov 1992) If A is nonempty and compact, and L is a Lyapunov function for A, then A is uniformly asymptotically stable.

Proof Since X is locally compact, A has a compact neighborhood U. Let ε > 0 be such that L−1([0, ε]) ⊂ U. As a closed subset of U, L−1([0, ε]) is compact. For each p ∈ L−1([0, ε]), (b) implies that L(φt(p)) is a strictly decreasing function of t, so

φ(W ∩ (L−1([0, ε]) × R+)) ⊂ L−1([0, ε]) .

Since W is open and contains X × {0}, and L−1([0, ε]) is compact, there is some t > 0 such that L−1([0, ε]) × [0, t] ⊂ W. It follows that L−1([0, ε]) × R+ ⊂ W, so L−1([0, ε]) is forward invariant. Since U was arbitrary, we have shown that A is Lyapunov stable.

Consider δ ∈ (0, ε]. For each p ∈ L−1([0, δ]) and tp > 0, L(φtp(p)) < L(p) ≤ δ, and continuity gives a neighborhood Up of p and a γp > 0 such that L(φtp(p′)) ≤ δ − γp for all p′ ∈ Up. Since L−1([0, δ]) is compact, it follows that there are t > 0, γ > 0, and a neighborhood U ⊂ L−1([0, ε]) of L−1([0, δ]) such that L(φt(p)) ≤ δ − γ for all p ∈ U. For some δ′ > δ we have L−1([0, δ′]) ⊂ U. If δ was the smallest number such that ωφ(L−1([0, ε])) ⊂ L−1([0, δ]), then φ[T,∞)(L−1([0, ε])) ⊂ L−1([0, δ′]) for sufficiently large T, but then φ[T+t,∞)(L−1([0, ε])) ⊂ L−1([0, δ − γ]), which is a contradiction. Therefore ωφ(L−1([0, δ])) ⊂ L−1(0) = A, so A is attractive.

We have shown that A is asymptotically stable, hence uniformly asymptotically stable by Corollary 15.5. □

We now develop a useful characterization of asymptotic stability. We say that S ⊂ X is forward precompact if it is nonempty, S × R+ ⊂ W, and there is a T ≥ 0 such that φ[T,∞)(S) is compact. If S is forward precompact, then ωφ(S) is the intersection of a nested family of nonempty compact sets, so it is nonempty and compact.

Lemma 15.13 If A ⊂ Σ is nonempty and compact, U ⊂ Σ is a forward precompact neighborhood of A, and ωφ(U) ⊂ A, then U is a fundamental neighborhood, so A is uniformly attractive.

Proof The sets Ct := φ[t,∞)(U) are compact for large t. For any open neighborhood W of A, if Ct \ W was nonempty for all t, then (because the Ct are compact)

\[ \bigcap_t C_t \setminus W \ne \emptyset , \]

which is impossible because ωφ(U) ⊂ A. Therefore the Ct are eventually inside any neighborhood of A, so U is a fundamental neighborhood for A. □

Lemma 15.14 If A is compact, forward invariant, and uniformly attractive, then it is asymptotically stable.

Proof Let U be a fundamental neighborhood for A, let V be any other neighborhood, and let T > 0 be such that φ[T,∞)(U) ⊂ V. Since A is compact, continuity of φ gives a neighborhood W such that φ[0,T](W) ⊂ V. This continues to hold if W is replaced by W ∩ U, in which case φR+(W) ⊂ V. Thus A is Lyapunov stable. □

We now consider a diffeoconvex body Σ ⊂ M and a locally Lipschitz vector field ζ on Σ that is not outward pointing. Let W ⊂ Σ × R+ be the forward flow domain of ζ, and let Φ : W → Σ be the forward flow. Let ζ̃ be the canonical extension of ζ, let W̃ be the flow domain of ζ̃, and let Φ̃ : W̃ → V be the flow. In the next section the following result will be used to extend the converse Lyapunov theorem from dynamical systems defined on all of M to systems defined on Σ.

Theorem 15.13 If A ⊂ Σ is compact, it is asymptotically stable for Φ if and only if it is asymptotically stable for Φ̃.

Proof Since Σ is forward invariant, if A is asymptotically stable for Φ̃, then its asymptotic stability for Φ is a more or less automatic consequence of the definitions. Details are left to the reader.

Suppose that A is asymptotically stable for Φ. Let U ⊂ Σ be a compact fundamental neighborhood of A for Φ, and let U′ be a compact neighborhood of A that is contained in the interior of U. Choose a T such that Φ[T,∞)(U) ⊂ U′.

Since A is compact and forward invariant for Φ̃, by Lemma 15.14 it suffices to show that it is uniformly attractive, and this will follow from Lemma 15.13 if we can find a forward precompact neighborhood Ũ ⊂ M such that ωΦ̃(Ũ) ⊂ A.

Let V be a neighborhood of Σ such that the restriction of ζ̃ to V is locally Lipschitz and, for all p ∈ V, ‖rΣ(Φ̃t(p)) − Φ̃t(p)‖ is a strictly decreasing function of t. (Proposition 15.5.) Since rΣ ◦ Φ̃T is continuous, it maps some neighborhood of U to U, so for sufficiently small α > 0, if we set

\[ \tilde U := \{\, p \in r_\Sigma^{-1}(U) : \| r_\Sigma(p) - p \| \le \alpha \,\} , \]

then Ũ ⊂ V is compact, Ũ × [0, T] ⊂ W̃, and rΣ(Φ̃T(Ũ)) ⊂ U. We have Φ̃T(Ũ) ⊂ Ũ, so Ũ × R+ ⊂ W̃. Since the distance to Σ decreases along trajectories, any neighborhood of U contains Φ̃nT(Ũ) for sufficiently large n, so there is some T̃ such that Φ̃[T̃,∞)(Ũ) ⊂ Ũ, because otherwise there would be a discontinuity of Φ̃ at some point of U. A variant of this argument shows that ωΦ̃(Ũ) ⊂ ωΦ(U), so ωΦ̃(Ũ) ⊂ A. Since Φ̃[T̃,∞)(Ũ) is a closed subset of Ũ, it is compact, and thus Ũ is precompact. □

Theorem 15.13 is related to what Oyama et al. (2015) describe as the transitivity theorem, which asserts that (under certain hypotheses) if C ⊂ B ⊂ A are compact sets, B is asymptotically stable in A, and C is asymptotically stable in B, then C is asymptotically stable in A. Their Theorem 3 extends the classic transitivity theorem of Conley (1978, Theorem 5.3.D on p. 36).


15.7 The Converse Lyapunov Problem

A converse Lyapunov theorem is a result asserting that if a set is asymptotically stable, then there is a Lyapunov function defined on a neighborhood of the set. The history of converse Lyapunov theorems is sketched by Nadzieja (1990). Briefly, after several partial results, the problem was completely solved for a dynamical system on a manifold by Wilson (1969), who showed that one could require the Lyapunov function to be C∞ when the given manifold is C∞. Since we do not need such a refined result, we will follow the simpler treatment given by Nadzieja.

Before proceeding we should mention another result, due to Conley (1978), that also yields something that may be regarded as a Lyapunov function, insofar as it is decreasing along trajectories away from certain invariant sets. It is now known as the fundamental theorem of dynamical systems (Norton 1995; Robinson 1999, Chap. X). Exercise 15.3 gives the statement and outlines the proof.

Let Σ ⊂ M, ζ, W, Φ, ζ̃, W̃, and Φ̃ be as in earlier sections. This section’s goal is:

Theorem 15.14 If A is asymptotically stable for Φ, then (after replacing M with a suitable neighborhood of A) there is a Lyapunov function for A.

In the last section we showed that if A is asymptotically stable for Φ, then it is asymptotically stable for Φ̃. Suppose we can establish Theorem 15.14 with Φ̃ in place of Φ. Then there is a neighborhood Ṽ ⊂ M of A that is forward invariant for Φ̃ and a Lyapunov function L̃ : Ṽ → R+ for A and Φ̃. Let V := Ṽ ∩ Σ and L := L̃|V. Since Σ is forward invariant, L is a Lyapunov function for A and Φ. This means that it suffices to prove Theorem 15.14 when Σ = M, and we shall assume that this is the case throughout this section. It will be convenient to let W and Φ be the flow domain and flow (as opposed to the forward flow domain and forward flow) of ζ.

Let U ⊂ D(A) be an open neighborhood of A. Since D(A) is forward invariant, Φt|U and Φ−t|Φt(U) are inverse homeomorphisms when t ≥ 0, so ΦR+(U) = ⋃t≥0 Φt(U) is open, and if we replace U with this set, then U is forward invariant. The Lyapunov stability of A implies that any neighborhood of A contains such a U, so we may require that U is a bounded subset of Rk and its closure (as a subset of Rk) is contained in M. We begin by explaining how the vector field on U can be modified so that a Lyapunov function for the modified vector field is also a Lyapunov function for the given vector field, but the modified vector field is complete, and certain other conditions hold.

For the metric d on M induced by the inclusion in Rk, the infimum of the distance from a point p ∈ U to a point in M \ U is a positive continuous function of p, so Proposition 10.2 implies that there is a Cr function α : U → R++ such that for each p ∈ U, 1/α(p) is less than the distance from p to any point in M \ U. Let M̃ be the graph of α:

\[ \tilde M := \{\, (p, \alpha(p)) : p \in U \,\} \subset U \times \mathbb{R} \subset \mathbb{R}^{k+1} . \]

Evidently M̃ is a closed subset of Rk+1: if a sequence {(pn, hn)} in M̃ converges to (p, h), then p ∈ M, p must be in U because otherwise hn = α(pn) → ∞, and continuity implies that h = α(p), so (p, h) ∈ M̃. Of course p ↦ (p, α(p)) and (p, α(p)) ↦ p are inverse Cr diffeomorphisms. Below we will usually write p̃ in place of (p, α(p)).

For p ∈ U let

\[ \tilde\zeta_{\tilde p} := D(\mathrm{Id}_M \times \alpha)(p)\,\zeta_p \in T_{\tilde p}\tilde M \quad\text{and}\quad \tilde\zeta(\tilde p) := (\tilde p, \tilde\zeta_{\tilde p}) . \]

Since IdM × α is Cr, ζ̃ is a locally Lipschitz vector field on M̃. Let Φ̃ be the flow of ζ̃. Using the chain rule, it is easy to show that

\[ \tilde\Phi_t(\tilde p) = \bigl(\Phi_t(p), \alpha(\Phi_t(p))\bigr) \]

for all (p, t) ∈ W. Evidently Ã := { p̃ : p ∈ A } is asymptotically stable for ζ̃.

We now wish to slow the dynamics, to prevent trajectories from going to ∞ in finite time. Another application of Proposition 10.2 gives a Cr function β : M̃ → R++ with β(p̃) < 1/‖ζ̃_p̃‖ for all p̃ ∈ M̃. Define a vector field ζ∗ on M̃ by setting

\[ \zeta^*_{\tilde p} := \beta(\tilde p)\,\tilde\zeta_{\tilde p} , \]

and let Φ∗ be the flow of ζ∗. For (p, t) such that (p̃, t) is in the flow domain of ζ∗ let

\[ B(p, t) := \int_0^t \beta(\Phi^*_s(\tilde p))\, ds . \]

The chain rule computation

\[ \frac{d}{dt}\bigl[\tilde\Phi_{B(p,t)}(\tilde p)\bigr] = \beta(\Phi^*(\tilde p, t))\,\tilde\zeta_{\tilde\Phi_{B(p,t)}(\tilde p)} \]

shows that t ↦ Φ̃B(p,t)(p̃) is a trajectory for ζ∗, so Φ∗t(p̃) = Φ̃B(p,t)(p̃).
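The effect of rescaling a vector field by a positive function can be seen in a one-dimensional sketch. Everything in it is an assumption made for illustration: the field x² on (0, ∞), whose trajectory from x(0) = 1 is 1/(1 − t) and blows up at t = 1, the factor β(x) = 1/(1 + x²) < 1/x², and the function names. The slowed field βx² has speed below one, and its trajectory from 1 is x(t) = (t + √(t² + 4))/2.

```python
import math


def beta(x: float) -> float:
    """Slowing factor; beta(x) * x**2 < 1, so the slowed speed is below one."""
    return 1.0 / (1.0 + x * x)


def original_flow(t: float) -> float:
    """Solution of dx/dt = x**2 with x(0) = 1; blows up as t -> 1."""
    return 1.0 / (1.0 - t)


def slowed_flow(t: float) -> float:
    """Closed-form solution of dx/dt = beta(x) * x**2 with x(0) = 1."""
    return 0.5 * (t + math.sqrt(t * t + 4.0))


def elapsed_original_time(t: float, steps: int = 2000) -> float:
    """B(t) = integral over [0, t] of beta(slowed_flow(s)) ds, by Simpson's rule.

    steps must be even.
    """
    h = t / steps
    total = beta(slowed_flow(0.0)) + beta(slowed_flow(t))
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * beta(slowed_flow(k * h))
    return total * h / 3.0
```

In this example B(t) stays below 1 for every t, so the slowed trajectory is defined for all forward time while tracing the same curve as the original one.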

This has two important consequences. The first is that the speed of a trajectory of ζ∗ is never greater than one, so the final component of Φ∗(p, α(p), t) cannot go to ∞ in finite (forward or backward) time. Since M̃ is closed in Rk+1, ζ∗ is complete. The second point is that since β is bounded below on any compact set, if { Φ̃(p, α(p), t) : t ≥ 0 } is bounded, then Φ∗(p̃, ·) traverses the entire trajectory of ζ̃ beginning at (p, α(p)). It follows that Ã is asymptotically stable for ζ∗. Note that if L̃ is a Lyapunov function for ζ∗ and Ã, then it is also a Lyapunov function for ζ̃ and Ã, and setting L(p) := L̃(p, α(p)) gives a Lyapunov function for ζ|U and A. Therefore it suffices to establish the claim with M and ζ replaced by M̃ and ζ∗.

The upshot of the discussion to this point is as follows. We may assume that ζ is complete, i.e., W = M × R, and that the domain of attraction of A is all of M. We may also assume that the metric d on M induced by its inclusion in Rk is complete (that is, any Cauchy sequence converges), so a sequence {pn} that is eventually outside of each compact subset of M diverges in the sense that d(p, pn) → ∞ for any p ∈ M.


The next three results are technical preparations for the main argument.

Lemma 15.15 If U is a neighborhood of A, {pn} is a sequence in M \ U, and {tn} is a sequence in R such that {Φtn(pn)} is bounded, then {tn} is bounded below. For any B > 0 there is a T such that d(Φt(p), A) > B for all p ∈ M \ U and t ≤ T.

Proof Since {Φtn(pn)} is bounded, it is contained in a compact neighborhood of A, so Lemma 15.12 gives a T such that Φt(Φtn(pn)) = Φt+tn(pn) ∈ U for all t ≥ T. Since pn = Φ0(pn) ∉ U, tn > −T. This establishes the first assertion, and the second assertion follows automatically. □

Let ℓ : M → R+ be the function

\[ \ell(p) := \inf_{t \le 0} d(\Phi_t(p), A) . \]

If p ∈ A, then ℓ(p) = 0. If p ∉ A, then Φt(p) ∉ A for all t ≤ 0 because A is forward invariant, so the last result implies that ℓ(p) > 0.

Lemma 15.16 ℓ is continuous.

Proof Since 0 ≤ ℓ(p) ≤ d(p, A), ℓ is continuous at points in A. Suppose that p ∉ A. If ℓ(p) < β, then there is a t ≤ 0 such that d(Φt(p), A) < β, and continuity implies that ℓ(p′) ≤ d(Φt(p′), A) < β for all p′ in some neighborhood of p. If ℓ(p) > α > 0 and C ⊂ M \ A is a closed neighborhood of p, the last result gives a T such that d(Φt(p′), A) > 2α for all p′ ∈ C and t ≤ T, and continuity gives a neighborhood C′ ⊂ C of p such that d(Φt(p′), A) > α for all p′ ∈ C′ and t ∈ [T, 0]. Thus ℓ is continuous at p. □

Lemma 15.17 If {(pn, tn)} is a sequence such that d(pn, A) → ∞ and there is a number T such that tn < T for all n, then d(Φtn(pn), A) → ∞.

Proof Suppose not. After passing to a subsequence there is a B > 0 such that d(Φtn(pn), A) < B for all n, so the sequence {Φtn(pn)} is contained in a compact set K. Since the domain of attraction of A is all of M, Φ is continuous, and K is compact, for any ε > 0 there is some S such that d(Φt(p), A) < ε whenever p ∈ K and t > S. The function (p, t) ↦ d(Φt(p), A) is continuous, hence bounded on the compact set K × [−T, S], so it is bounded on all of K × [−T, ∞). But this is impossible because −tn > −T and d(Φ−tn(Φtn(pn)), A) = d(pn, A) → ∞. □

We are now ready for the main construction. We will show that L : M → R+ defined by

\[ L(p) := \int_0^\infty \ell(\Phi_s(p)) \exp(-s)\, ds \]

is a Lyapunov function. If p ∈ A, then ℓ(Φt(p)) = 0 for all t ≥ 0, so L(p) = 0. If p ∉ A, then L(p) > 0 because ℓ(p) > 0. By construction ℓ(Φt(p)) is a decreasing function of t, so the identity Φs(Φt(p)) = Φs+t(p) implies that L(Φt(p)) is also a decreasing function of t.

To show that L is continuous at an arbitrary p ∈ M we observe that for any ε > 0 there is a T such that ℓ(ΦT(p)) < ε/4. Since Φ is continuous we have ℓ(ΦT(p′)) < ε/4 and |ℓ(Φt(p′)) − ℓ(Φt(p))| < ε/4 for all p′ in some neighborhood of p and all t ∈ [0, T], so that

\[
\begin{aligned}
|L(p') - L(p)| \le{} & \int_0^T \bigl| \ell(\Phi_s(p')) - \ell(\Phi_s(p)) \bigr| \exp(-s)\, ds \\
& + \Bigl| \int_T^\infty \ell(\Phi_s(p')) \exp(-s)\, ds \Bigr| + \Bigl| \int_T^\infty \ell(\Phi_s(p)) \exp(-s)\, ds \Bigr| < \varepsilon/4 + \varepsilon/2 + \varepsilon/4 = \varepsilon
\end{aligned}
\]

for all p′ in this neighborhood.

To show that L is ζ-differentiable, and to compute its ζ-derivative, we observe that

\[ L(\Phi_t(p)) = \int_0^\infty \ell(\Phi_{t+s}(p)) \exp(-s)\, ds = \exp(t) \int_t^\infty \ell(\Phi_s(p)) \exp(-s)\, ds , \]

so that

\[ L(\Phi_t(p)) - L(p) = (\exp(t) - 1) \int_t^\infty \ell(\Phi_s(p)) \exp(-s)\, ds - \int_0^t \ell(\Phi_s(p)) \exp(-s)\, ds . \]

Dividing by t and taking the limit as t → 0 gives

\[ \zeta L(p) = L(p) - \ell(p) . \]

Note that

\[ L(p) < \ell(p) \int_0^\infty \exp(-s)\, ds = \ell(p) \]

because ℓ(Φ(p, ·)) is weakly decreasing with limt→∞ ℓ(Φt(p)) = 0. Therefore ζL(p) < 0 when p ∉ A.
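The construction can be checked by hand in the simplest case. The sketch below assumes the linear flow Φt(p) = p·exp(−t) on M = R with A = {0}, which is not taken from the text. For this flow the backward infimum of the distance to A is attained at t = 0, so the function called `ell` below is |p|, the integral evaluates to L(p) = |p|/2, and the derivative along the flow should equal L(p) − |p| = −|p|/2. The names and the truncation of the integral at a finite horizon are further assumptions of the sketch.

```python
import math


def flow(p: float, t: float) -> float:
    """Flow of dx/dt = -x (an assumed example), defined for all real t."""
    return p * math.exp(-t)


def ell(p: float) -> float:
    """inf over t <= 0 of |flow(p, t)|; for this flow it is attained at t = 0."""
    return abs(p)


def lyapunov(p: float, horizon: float = 40.0, steps: int = 4000) -> float:
    """Simpson approximation of the integral of ell(flow(p, s)) * exp(-s)
    over [0, horizon]; the tail beyond the horizon is negligible here."""
    h = horizon / steps

    def integrand(s: float) -> float:
        return ell(flow(p, s)) * math.exp(-s)

    total = integrand(0.0) + integrand(horizon)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return total * h / 3.0


def derivative_along_flow(p: float, h: float = 1e-4) -> float:
    """Forward-difference approximation of d/dt lyapunov(flow(p, t)) at t = 0."""
    return (lyapunov(flow(p, h)) - lyapunov(p)) / h
```

For p = 2 this gives L(p) close to 1 and a derivative close to −1, in agreement with the formula for the derivative of L along the flow.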

It remains to show that if U is open and contains A, then there is an ε > 0 such that L−1([0, ε]) ⊂ U. The alternative is that there is some sequence {pn} in M \ U with L(pn) → 0. Since ℓ(Φs(pn)) is a decreasing function of s,

\[ L(p_n) \ge \int_0^1 \ell(\Phi_s(p_n)) \exp(-s)\, ds \ge (1 - e^{-1})\, \ell(\Phi_1(p_n)) , \]

so ℓ(Φ1(pn)) → 0. For each n Lemma 15.15 implies that d(Φt(pn), A) → ∞ as t → −∞, so there is a tn ≤ 1 such that ℓ(Φ1(pn)) = d(Φtn(pn), A). Since L is continuous and positive away from A, {pn} must eventually be outside any compact set, but now Lemma 15.17 implies that d(Φtn(pn), A) → ∞. This contradiction completes the proof that L is a Lyapunov function, and also the proof of Theorem 15.14.

15.8 A Necessary Condition for Stability

This section establishes the chapter’s culminating result, which is the relationship between asymptotic stability and the vector field index. As before, let Σ ⊂ M be a diffeoconvex set, let ζ be a locally Lipschitz vector field on Σ that is not outward pointing, let W ⊂ Σ × R+ be the forward flow domain of ζ, and let Φ be the forward flow. The argument below is one of the ones given in Demichelis and Ritzberger (2003) in a game theoretic context.

Theorem 15.15 If A is asymptotically stable for ζ, and an ANR, and C ⊂ Σ is a compact neighborhood of A such that E(ζ|C) = A, then

indΣ(−ζ |C) = χ(A) .

Proof Since A is an ANR, it is a retract of some neighborhood of itself. The assertion is unaffected if we replace C with a compact neighborhood contained in the domain of the retraction, so we may assume that there is a retraction r : C → A. From the last section we know that (after restricting to some neighborhood of A) there is a Lyapunov function L for ζ. If ε > 0 is sufficiently small, Aε := L−1([0, ε]) is contained in C, and is consequently compact. For p ∈ M let

τ(p) := inf{ t ≥ 0 : Φt(p) ∈ Aε } .

Since Φ is continuous, for any c ∈ R the sets τ−1((−∞, c)) and τ−1((c, ∞)) are open, so τ is continuous. Therefore ρ : p ↦ Φτ(p)(p) is continuous and thus a retraction of C onto Aε. The tubular neighborhood theorem implies that M is an ENR, hence an ANR, so Corollary 8.4 implies that Aε is an ANR.

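For small t > 0 we claim that

\[
\begin{aligned}
\chi(A) &= \Lambda_A(A) = \Lambda_\Sigma(r) = \Lambda_\Sigma(\rho) = \Lambda_\Sigma(\Phi_t \circ \rho) = \Lambda_\Sigma(\Phi_t|_{A_\varepsilon}) \\
&= (-1)^m \operatorname{ind}_\Sigma(\zeta|_{A_\varepsilon}) = \operatorname{ind}_\Sigma(-\zeta|_{A_\varepsilon}) = \operatorname{ind}_\Sigma(-\zeta|_C) .
\end{aligned}
\]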

The first asserted equality is just the definition of χ(A). If i : A → C is the inclusion, the version of Commutativity from Proposition 13.6 gives χ(A) = ΛA(r ◦ i) = ΛΣ(i ◦ r) = ΛΣ(r). Below we will construct an index admissible homotopy between r and ρ, so that the third equality follows from Continuity. For any t ≥ 0 the homotopy s ↦ Φs|Aε ◦ ρ is an index admissible homotopy between ρ and Φt|Aε ◦ ρ, so the fourth equality also follows from Continuity. Since all of the fixed points of Φt ◦ ρ are contained in A, Additivity gives the fifth equality. For small t > 0 Theorem 15.9 gives the sixth equality, and Corollary 15.4 gives the seventh. Finally, Additivity for the vector field index gives the last equality.

It remains to construct the desired index admissible homotopy. The composition r ◦ ρ is a retraction of C onto A, and the argument to this point is unaffected if we replace r with this, so we may assume that r = r ◦ ρ. It suffices to construct a homotopy h : Aε × [0, 1] → Aε with h0 = IdAε and h1 = r|Aε, because then (p, t) ↦ h(ρ(p), t) is an index admissible homotopy between ρ and r.

Let V ⊂ M × M be a neighborhood of the diagonal for which there is a continuous “convex combination” function c : V × [0, 1] → M as per Proposition 10.14. That is, c(p, p′, 0) = p and c(p, p′, 1) = p′ for all (p, p′) ∈ V, and c(p, p, t) = p for all p ∈ M and t ∈ [0, 1]. Let Ṽ be the set of (p, p′) ∈ V ∩ (Σ × Σ) such that c(p, p′, t) ∈ VΣ for all t ∈ [0, 1], and let c̃ : Ṽ × [0, 1] → Σ be the function c̃(p, p′, t) := rΣ(c(p, p′, t)). Then Ṽ is a neighborhood of the diagonal in Σ × Σ, c̃ is continuous, c̃(p, p′, 0) = p and c̃(p, p′, 1) = p′ for all (p, p′) ∈ Ṽ, and c̃(p, p, t) = p for all p ∈ M and t ∈ [0, 1]. Since A is compact and r is a retraction, there is a neighborhood U ⊂ Aε of A such that c̃({ (p, r(p)) : p ∈ U } × [0, 1]) is contained in the interior of Aε. Let T be large enough that ΦT(Aε) ⊂ U. Let h : Aε × [0, 1] → Aε be the homotopy

\[
h(p, t) :=
\begin{cases}
\Phi_{3tT}(p), & 0 \le t \le \tfrac13, \\
\tilde c\bigl(\Phi_T(p), r(\Phi_T(p)), 3(t - \tfrac13)\bigr), & \tfrac13 \le t \le \tfrac23, \\
r(\Phi_{3(1-t)T}(p)), & \tfrac23 \le t \le 1.
\end{cases}
\]

As desired, h0 = IdAε and h1 = r|Aε. □

The special case of this result when Σ = M and A is a singleton is a prominent result in the theory of dynamical systems (e.g., Krasnosel’ski and Zabreiko 1984). But that literature does not seem to have generalized the result to more general sets of equilibria, even though one can imagine physical applications.

15.9 The Correspondence and Index +1 Principles

This section discusses the consequences of this chapter’s results for our understanding of which equilibria of games, markets, and other economic models are empirically plausible. We begin with a prototypical example, the Battle of the Sexes shown below (Fig. 15.1). There are two players, 1 and 2, with respective sets of pure strategies S1 = {U, D} and S2 = {L, R}. For each pair of pure strategies the table below gives a pair of payoffs for the two agents. For example, if 1 chooses U and 2 chooses L, then 1 receives 2 utils and 2 receives 1 util.

Fig. 15.1 The battle of the sexes

A mixed strategy for 1 is a probability distribution on S1, which is represented by a function σ1 : S1 → [0, 1] such that σ1(U) + σ1(D) = 1. Mixed strategies for 2 are defined similarly. For a profile σ = (σ1, σ2) of mixed strategies, 1’s expected utility is the average utility when the two pure strategies are statistically independent events. Concretely this expected utility is 2σ1(U)σ2(L) + σ1(D)σ2(R). Of course 2’s expected utility is defined similarly. A Nash equilibrium is a profile σ∗ such that each agent is maximizing her expected utility, taking the other’s strategy as given. The pure strategy profiles (U, L) and (D, R) are Nash equilibria.

In addition there is a mixed Nash equilibrium ((2/3)U + (1/3)D, (1/3)L + (2/3)R). Economists generally agree that this equilibrium is implausible and irrelevant in almost all applications, because it seems dynamically unstable, but precisely what do we mean by this, and how can we understand it as a consequence of a general principle?
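The three equilibria can be verified by exact arithmetic. The payoff table itself is not reproduced here, so the sketch below assumes one: the text states only that (U, L) pays (2, 1), while the payoff (1, 2) at (D, R) and the zero payoffs off the diagonal are inferred from player 1's expected utility formula and from the mixed equilibrium. The names `PAYOFFS`, `expected` and `is_nash` are hypothetical.

```python
from fractions import Fraction as F

# Assumed payoff table: PAYOFFS[(row, column)] = (payoff to 1, payoff to 2).
PAYOFFS = {
    ("U", "L"): (2, 1), ("U", "R"): (0, 0),
    ("D", "L"): (0, 0), ("D", "R"): (1, 2),
}


def expected(i, s1, s2):
    """Expected payoff of player i (0 or 1) when the two mixed strategies
    s1 and s2 (dicts from pure strategies to probabilities) are independent."""
    return sum(s1[a] * s2[b] * PAYOFFS[(a, b)][i] for a in s1 for b in s2)


def is_nash(s1, s2):
    """Expected utility is linear in a player's own mixed strategy, so it
    suffices to rule out profitable deviations to pure strategies."""
    ok1 = all(expected(0, {a: F(1)}, s2) <= expected(0, s1, s2) for a in ("U", "D"))
    ok2 = all(expected(1, s1, {b: F(1)}) <= expected(1, s1, s2) for b in ("L", "R"))
    return ok1 and ok2
```

With these payoffs each player's expected utility in the mixed equilibrium is 2/3, which is less than either player receives in either pure equilibrium.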

Game theory and other economic models can be applied in many ways, but here we will focus on an understanding of an equilibrium as a pattern of behavior that might be self-reproducing, in the sense that it is expected to occur and then does occur, repeatedly. Concretely, imagine a society in which the Battle of the Sexes is played frequently, and at the beginning of each day everyone recalls the empirical distribution of behavior from the day before. If they expect that their opponents today will play in the same way, and they play a best response, we obtain a dynamical system, and indeed small disturbances of the mixed equilibrium result in rapid movement away from it, followed by convergence to a pure equilibrium.

But why should anyone regard yesterday’s behavior as an accurate predictor of today’s behavior if the agents today are all best responding to this expectation? An economic model has rational expectations if the beliefs of the agents in the model concerning relevant uncertainty (in this case the behavior of other agents) agree with the model’s predictions, and the agents behave rationally given those beliefs. In each period of a model of strategic adjustment satisfying rational expectations, each agent has accurate beliefs concerning the behavior of others, and each agent responds rationally to those beliefs. This is just another way of saying that the profile of mixed strategies embodying those beliefs is a Nash equilibrium. Thus a model of strategic adjustment satisfying rational expectations can only predict that each period’s behavior will be a Nash equilibrium. There is no nontrivial adjustment dynamics of the sort that might explain why the mixed equilibrium of the Battle of the Sexes is unstable.

Thus the sort of strategic adjustment that might explain why the mixed equilibrium of the Battle of the Sexes is unstable necessarily has less than fully rational behavior. In addition, in principle there are limits to how well social scientists can understand this process, because if it was very well understood, then the agents playing the game would be able to take advantage of this understanding. It seems that such processes must have some minimal murky complexity.

In spite of all this, an explanation of the instability of the mixed equilibrium of the Battle of the Sexes can be compelling if, instead of depending on the idiosyncratic details of some particular adjustment process, it is a consequence of a simple and intuitively justified property that is shared by a wide range of such processes. Specifically, we will see that the mixed equilibrium of the Battle of the Sexes is unstable for any process of adjustment in which the agents are adjusting their mixed strategies in directions that are utility improving, relative to the current mixed strategy profile.

Moreover, this is an instance of a quite general phenomenon. In order to explain this we now introduce the general model. Let the natural number n be the number of agents, let N := {1, …, n}, and let

G = (S1, …, Sn, u1, …, un)

be a strategic form game, as per the following description. For each i ∈ N, Si is a nonempty finite set of pure strategies. Let S := ∏i∈N Si be the set of pure strategy profiles. Each ui is a real valued function with domain S.

For any nonempty finite set X let

\[ \Delta(X) := \{\, \mu : X \to [0, 1] : \sum_{x \in X} \mu(x) = 1 \,\} \]

be the set of probability measures on X. The set of mixed strategies for agent i is Σi := Δ(Si). Let Σ := ∏i∈N Σi be the set of mixed strategy profiles. Abusing notation, let ui also denote the multilinear extension of ui to Σ:

\[ u_i(\sigma) := \sum_{s \in S} \Bigl( \prod_{h \in N} \sigma_h(s_h) \Bigr) u_i(s) . \]

A mixed strategy profile σ∗ is a Nash equilibrium if ui(σ∗) ≥ ui(τi, σ∗−i) for all i and τi ∈ Σi. (As usual (τi, σ∗−i) denotes the mixed strategy profile obtained from σ∗ by replacing σ∗i with τi.)
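The multilinear extension is a finite sum over pure strategy profiles, so it can be computed directly for small games. The following is a minimal sketch for any number of agents; the representation of a mixed strategy as a dict from pure strategies to probabilities, and the names `multilinear_payoff` and `replace_strategy`, are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def multilinear_payoff(u, sigma):
    """Sum over pure profiles s of (product over h of sigma[h][s[h]]) * u(s).

    u maps a tuple of pure strategies (one per agent) to a payoff;
    sigma is a list with one dict per agent, mapping pure strategies
    to probabilities.
    """
    total = 0
    for profile in product(*(list(s_h) for s_h in sigma)):
        weight = 1
        for h, pure in enumerate(profile):
            weight *= sigma[h][pure]
        total += weight * u(profile)
    return total


def replace_strategy(sigma, i, tau_i):
    """The profile obtained from sigma by replacing agent i's strategy with tau_i."""
    return [tau_i if h == i else s_h for h, s_h in enumerate(sigma)]
```

For example, in a three-agent game in which each agent chooses 0 or 1 uniformly at random and the payoff is 1 when all three choices agree, the value is 1/4. The Nash condition can be checked by comparing `multilinear_payoff(u_i, sigma)` with its value at `replace_strategy(sigma, i, tau_i)` for each pure τi, since the payoff is linear in agent i's own strategy.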

For each i ∈ N agent i’s set of best responses to σ ∈ Σ is

BRi(σ) := { τi ∈ Σi : ui(τi, σ−i) ≥ ui(τ′i, σ−i) for all τ′i ∈ Σi }.

The best response correspondence is the correspondence BR : Σ → Σ given by

BR(σ) := BR1(σ) × · · · × BRn(σ) .

Of course BR is an upper hemicontinuous convex valued correspondence, and its fixed points are indeed the Nash equilibria of G.

As a matter of convention, Nash equilibria are usually thought of as fixed points of the best response correspondence, but although using the best response correspondence is in some sense standard, or at least a deeply ingrained tradition, Nash equilibrium can also be defined as the set of vector field equilibria of various vector fields. For example, for σ ∈ Σ and i ∈ N let γi(σ) ∈ RSi be given by

γi(σ; si) := max{ui(si, σ−i) − ui(σ), 0},

and let b(σ) = (b1(σ), …, bn(σ)) ∈ Σ be given by

\[ b_i(\sigma; s_i) := \frac{\sigma_i(s_i) + \gamma_i(\sigma; s_i)}{\sum_{t_i \in S_i} \bigl( \sigma_i(t_i) + \gamma_i(\sigma; t_i) \bigr)} . \]

Let β be the vector field σ ↦ (σ, b(σ) − σ), which is obviously not outward pointing. (These functions were introduced by Nash 1950.) Then the set of Nash equilibria is the set of fixed points of the function b, and it is the set of vector field equilibria of β.
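For a two-player game the function b can be evaluated exactly. The sketch below does so for the Battle of the Sexes under an assumed payoff table: only the payoffs (2, 1) at (U, L) are stated in the text, and the payoffs (1, 2) at (D, R) and zero elsewhere are inferred. The names `U1`, `U2` and `nash_map` are hypothetical.

```python
from fractions import Fraction as F

# Assumed payoffs: U1[(a, b)] for player 1 and U2[(a, b)] for player 2.
U1 = {("U", "L"): 2, ("U", "R"): 0, ("D", "L"): 0, ("D", "R"): 1}
U2 = {("U", "L"): 1, ("U", "R"): 0, ("D", "L"): 0, ("D", "R"): 2}


def nash_map(s1, s2):
    """Return b(sigma) = (b_1(sigma), b_2(sigma)) for sigma = (s1, s2).

    s1 and s2 are dicts giving a probability to every pure strategy.
    """
    u1 = sum(s1[a] * s2[b] * U1[a, b] for a in s1 for b in s2)
    u2 = sum(s1[a] * s2[b] * U2[a, b] for a in s1 for b in s2)
    # gamma_i(sigma; s_i): gain from switching to the pure strategy, if positive.
    g1 = {a: max(sum(s2[b] * U1[a, b] for b in s2) - u1, 0) for a in s1}
    g2 = {b: max(sum(s1[a] * U2[a, b] for a in s1) - u2, 0) for b in s2}
    # The denominator is 1 + (sum of gains) because probabilities sum to 1.
    z1 = 1 + sum(g1.values())
    z2 = 1 + sum(g2.values())
    b1 = {a: (s1[a] + g1[a]) / z1 for a in s1}
    b2 = {b: (s2[b] + g2[b]) / z2 for b in s2}
    return b1, b2
```

Under these assumed payoffs the three equilibria are fixed points of `nash_map`. At a profile in which U and L are each slightly more likely than in the mixed equilibrium, b(σ) − σ raises both probabilities further, so this particular vector field points away from the mixed equilibrium there.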

We now generalize this construction. For each i let

\[ H_i := \{\, \tau_i \in \mathbb{R}^{S_i} : \sum_{s_i \in S_i} \tau_i(s_i) = 1 \,\} \quad\text{and}\quad V_i := \{\, \tau_i \in \mathbb{R}^{S_i} : \sum_{s_i \in S_i} \tau_i(s_i) = 0 \,\} , \]

and let H := ∏i∈N Hi and V := ∏i∈N Vi. Then TσiΣi = Vi for all i and σi ∈ Σi, and TσΣ = V for all σ ∈ Σ. Let ζ be a vector field on Σ. The vector field ζ is a payoff consistent selection dynamics if ζ is not outward pointing and Dui(σ)ζσi ≥ 0 for all σ ∈ Σ and i ∈ N. (Here we are abusing notation slightly by identifying the i-th component ζσi ∈ Vi with the element of V with this i-component and all other components zero.) It is a Nash dynamics if, in addition, ζσ = 0 if and only if σ is a Nash equilibrium.

Proposition 15.8 If ζ is a Nash dynamics and A is a set of Nash equilibria that is both closed and open in the relative topology of the set of Nash equilibria, then

ind−ζ(A) = ΛBR(A) .

Proof There is a compact C ⊂ Σ with A = F(BR|C) that has no Nash equilibria in its topological boundary. By definition ind−ζ(A) = indΣ(ζ|C). For all σ ∈ Σ and all i, Dui(σ)ζσi > 0 unless σi is a best response to σ, so for all t ∈ [0, 1] the vector field (1 − t)ζ + tβ is a Nash dynamics, and therefore t ↦ (1 − t)ζ + tβ is an index admissible homotopy of vector fields. Therefore indΣ(ζ|C) = indΣ(β|C).

Let κ : H × V → H be the function κ(σ, ν) := σ + ν. Of course κ(σ, 0) = σ and Dκ(σ, ·)(0) = IdV for all σ ∈ H, and π × κ (where π : TH → H is the projection) is a diffeomorphism. In addition, κ(σ, βσ) = b(σ) for all σ ∈ Σ, so Theorem 15.8 gives indΣ(β|C) = indΣ(κ ◦ β|C) = indΣ(b|C).

The homotopy J : C × [0, 1] → H given by

J(σ, t) := { (1 − t)b(σ) + tτ : τ ∈ BR(σ) }

is clearly upper hemicontinuous and convex valued. For any σ ∈ ∂C there is some i such that σi ∉ BRi(σ), so that ui(τi, σ−i) > ui(σ) for every τi ∈ BRi(σ). Since ζ is a Nash dynamics, ui(bi(σ), σ−i) ≥ ui(σ), so ui(τi, σ−i) ≥ ui(σ) for all t and τ ∈ Jt(σ), with strict inequality if t > 0. Therefore Jt does not have any fixed point in ∂C if t > 0, and J0 does not have any fixed point in ∂C because all the equilibria of ζ are Nash equilibria. Thus J is index admissible, so Homotopy implies that ΛΣ(b|C) = ΛΣ(BR|C). By definition ΛBR(A) = ΛΣ(BR|C). □

Combining this result with Theorem15.15 gives:

Theorem 15.16 (Demichelis and Ritzberger 2003) If A is a set of Nash equilibria that is both closed and open in the relative topology of the set of Nash equilibria, A is an ANR,1 and A is asymptotically stable for some Nash dynamic ζ, then

ΛBR(A) = ind−ζ(A) = χ(A) .

This result provides a criterion for regarding a connected set of equilibria A as unstable and thus implausible, namely that ΛBR(A) ≠ χ(A). This criterion is robust, in the sense that it does not depend on which particular dynamic system we consider, within some large class that is well motivated by individual incentives. The criterion is expressed in terms of the fixed point index for the best response correspondence, which is, in a sense, canonical, and in any event independent of the consideration of any particular dynamical system. But this result also shows that the definition of the index of A is independent of the defining function or correspondence, again across some broad class of possibilities.

We now develop an analogous result for general equilibrium theory. Fix a natural number ℓ > 1 of goods. We take the strictly positive orthant P := S^{ℓ−1} ∩ R^ℓ_{++} of the unit sphere S^{ℓ−1} := { x ∈ R^ℓ : ‖x‖ = 1 } as the space of prices. The economy’s given data is summarized by an aggregate excess demand, which is a vector field ζ on P such that:

(a) There is some b ∈ R^ℓ such that ζp ≥ b for all p ∈ P.
(b) ⟨p, ζp⟩ = 0 for all p.
(c) For each i = 1, …, ℓ, if pn → p ∈ P̄ \ P (where P̄ is the closure of P) and pi = 0, then the i-th component of ζpn goes to ∞.

In this context an equilibrium of ζ is called a Walrasian equilibrium.

The underlying idea is that there is a set of consumers, each of whom has an initial endowment, a set of feasible consumptions, and preferences. To keep our description simple and concrete we abstract away from production, so economic activity consists only of trading the endowments. In addition we assume that the set of consumers is finite, that each consumer’s set of feasible consumptions is the interior R^ℓ_{++} of the positive orthant, and that each consumer’s endowment is an element of R^ℓ_{++}, so she starts with a positive quantity of each good. Each consumer i has preferences over consumption bundles that are represented by a utility function ui : R^ℓ_{++} → R that is continuous, strictly increasing in all goods, and strictly convex. To insure that demand is well defined we assume that for each x ∈ R^ℓ_{++}, the set { y ∈ R^ℓ_{++} : ui(y) ≥ ui(x) } of bundles at least as good as x is a closed subset of R^ℓ, so it is not possible to have ui(xn) increasing along a sequence {xn} of consumption bundles converging to a point in R^ℓ_+ \ R^ℓ_{++}. Needless to say these assumptions are quite strong, and there is an extensive body of literature studying how they might be relaxed.

1 Actually, because the set of Nash equilibria is defined by finitely many equations and inequalities, it has the topology of a finite simplicial complex, so it has finitely many connected components, and any union of connected components is an ANR. This follows from results of Whitney (1957), but semi-algebraic geometry (e.g., Benedetti and Risler 1990; Blume and Zame 1994 provide a succinct summary of foundational material) has more refined versions of this result.

When prices are given by the price vector p ∈ P, consumer i’s wealth is the value ⟨p, ωi⟩ of her initial endowment ωi, and her demand is the point di(p) in her budget set Bi(p) := { x ∈ R^ℓ_{++} : ⟨p, x⟩ ≤ ⟨p, ωi⟩ } that maximizes ui. The aggregate excess demand is ζp := ∑i (di(p) − ωi). It is bounded below by −∑i ωi. Condition (b) above is known as Walras’ law. It is satisfied (and thus ζp is an element of TpP) because the value of each consumer’s demand is equal to the value of her endowment. Condition (c) is of course quite natural, but we will not discuss how it might be derived from assumptions on the utility functions.
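Walras’ law and the lower bound can be checked exactly in a small example. The sketch below assumes Cobb-Douglas consumers, for whom demand for good j is the expenditure share on j times wealth divided by the price of j; this functional form, the two-consumer data, and the names are illustrative assumptions, not part of the model above. Because such demand is unchanged when all prices are scaled by a common factor, the sketch does not normalize prices to the unit sphere.

```python
from fractions import Fraction as F


def excess_demand(p, shares, endowments):
    """Aggregate excess demand, the sum over consumers of demand minus endowment.

    shares[i][j] is consumer i's expenditure share on good j (each row sums
    to 1), so her Cobb-Douglas demand for good j is shares[i][j] * wealth / p[j].
    """
    z = [F(0)] * len(p)
    for a, w in zip(shares, endowments):
        wealth = sum(pj * wj for pj, wj in zip(p, w))
        for j, pj in enumerate(p):
            z[j] += a[j] * wealth / pj - w[j]
    return z


def value(p, z):
    """The inner product of the price vector p with the vector z."""
    return sum(pj * zj for pj, zj in zip(p, z))


# Hypothetical data: two consumers, two goods, strictly positive endowments.
SHARES = [[F(1, 3), F(2, 3)], [F(1, 2), F(1, 2)]]
ENDOWMENTS = [[F(2), F(1)], [F(1), F(2)]]
```

At prices (1, 1) the excess demand in this example is (−1/2, 1/2), whose value is zero as Walras’ law requires, and at prices proportional to (8, 11) it vanishes, so that price ray is a Walrasian equilibrium of the example.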

The flow of ζ is a version of tatonnement: the price of each good adjusts at a rate that is proportional to the difference between supply and demand. As a model of adjustment to equilibrium, tatonnement has several problems. It is not invariant with respect to changes in the units used to measure the goods. For example, if we go from measuring milk in quarts to measuring it in pints, the price is cut in half, and the excess demand is doubled, so in effect the price adjusts four times as fast. In any event there seem to be no theoretical principles governing the ratios of the rates of adjustment in different markets. Furthermore, tatonnement does not satisfy rational expectations, because if it accurately described the adjustment process, and the consumers knew this, they could make speculative profits by buying and then reselling (selling and then buying back) goods whose prices were rising (falling). (Actually, tatonnement is often described as an entirely hypothetical process, in the sense that the prices that are adjusting are not ones at which any trade is taking place.) As with strategic adjustment in games, it seems that the process by which market prices are achieved necessarily has some sort of opaque complexity.
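The units example can be written out. The following is a sketch, assuming the adjustment rule $\dot p = \zeta$ is applied to the milk price in whichever units are in use; the symbols $p_{\mathrm{qt}}$, $p_{\mathrm{pt}}$, $z_{\mathrm{qt}}$, $z_{\mathrm{pt}}$ are introduced here for illustration and do not appear in the text.

```latex
% One quart is two pints, so the per-pint price is half the per-quart price,
% and excess demand counted in pints is twice excess demand counted in quarts.
\begin{align*}
p_{\mathrm{pt}} &= \tfrac{1}{2}\, p_{\mathrm{qt}}, &
z_{\mathrm{pt}} &= 2\, z_{\mathrm{qt}}. \\
\intertext{Tatonnement run in pint units sets $\dot p_{\mathrm{pt}} = z_{\mathrm{pt}}$, so in quart units}
\dot p_{\mathrm{qt}} &= 2\, \dot p_{\mathrm{pt}} = 2\, z_{\mathrm{pt}} = 4\, z_{\mathrm{qt}},
\end{align*}
% compared with \dot p_{qt} = z_{qt} when tatonnement is run in quart units.
```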

Evidently these objections pertain equally to any version of tatonnement. Nevertheless we will study the implications of stability with respect to some tatonnement-like process. For ε > 0 let Pε := { p ∈ P : pi ≥ ε for all i }. It is easy to see that Pε is diffeoconvex. It is also easy to see that for sufficiently small ε, ζ|Pε is not outward pointing: this is the case if ζpi ≥ 0 for all p ∈ Pε and all i such that pi = ε, and if this were false for arbitrarily small ε one could easily generate a violation of (c).

An ε-natural price dynamics is a Lipschitz vector field $\tilde{\zeta}$ on Pε such that:

(a) $\tilde{\zeta}$ is not outward pointing.
(b) For all p ∈ Pε, $\tilde{\zeta}(p) = 0$ if and only if $\zeta(p) = 0$.
(c) For all p ∈ Pε, $\langle \tilde{\zeta}_p, \zeta_p \rangle \ge 0$.

That is, $\tilde{\zeta}$ and ζ have the same equilibria, and $\tilde{\zeta}$ always adjusts prices in a direction that is not perverse, in the sense of diminishing the value of excess demand. (Actually, from a purely mathematical point of view the required condition is that $\tilde{\zeta}$ and ζ are index admissible homotopic, which is weaker still.)

Theorem 15.17 If $\tilde{\zeta}$ is an ε-natural price dynamics, and A is an asymptotically stable set for $\tilde{\zeta}$ that is also an ANR, then $\operatorname{ind}_{-\zeta}(A) = \chi(A)$.

Proof Theorem 15.15 implies that $\operatorname{ind}_{-\tilde{\zeta}}(A) = \chi(A)$. Define the vector field homotopy $(p, t) \mapsto (p, \eta_p(t))$ by setting $\eta_p(t) := (1 - t)\zeta_p + t\tilde{\zeta}_p$. In view of (b) and (c), this homotopy is index admissible, as is −η. Therefore $\operatorname{ind}_{-\zeta}(A) = \operatorname{ind}_{-\eta_0}(A) = \operatorname{ind}_{-\eta_1}(A) = \operatorname{ind}_{-\tilde{\zeta}}(A)$.
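The admissibility claim in the proof can be checked directly, at least as far as zeros are concerned. The following sketch writes $\tilde{\zeta}$ for the ε-natural price dynamics and ζ for aggregate excess demand (the accent is a notational assumption), and shows that every $\eta(t)$ has exactly the zeros of ζ, so no zero enters or leaves a neighborhood of A during the homotopy. The same computation applies to −η.

```latex
\begin{align*}
\eta_p(t) = 0
 &\implies 0 = \langle \eta_p(t), \zeta_p \rangle
   = (1-t)\,\|\zeta_p\|^2 + t\,\langle \tilde{\zeta}_p, \zeta_p \rangle
   \ge (1-t)\,\|\zeta_p\|^2 && \text{by (c)} \\
 &\implies \zeta_p = 0 \quad \text{if } t < 1 . \\
\intertext{If $t = 1$ then $\eta_p(1) = \tilde{\zeta}_p = 0$, so $\zeta_p = 0$ by (b). Conversely,}
\zeta_p = 0 &\implies \tilde{\zeta}_p = 0 \ \text{by (b)}
 \implies \eta_p(t) = (1-t)\,\zeta_p + t\,\tilde{\zeta}_p = 0
 \quad \text{for all } t \in [0,1] .
\end{align*}
```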

Once again, although a wide variety of vector fields could be used to define the notion of Walrasian equilibrium, one of these is canonical. This one is used to define the index of a set of equilibria, but once again we have a result that relates this index to dynamic stability, and which also shows that the index would be the same if a wide range of other vector fields were used to define it.

The applications to game theory and to general equilibrium have common elements. In both cases we have a definition of equilibrium that requires simultaneous maximization by all agents. Moreover, this definition can be rephrased as equilibrium of a vector field that expresses some concept of agents adjusting in a direction that increases their utilities. As was the case in the game theory application, there is no reason to expect tatonnement to be an accurate model of adjustment to equilibrium, and in fact there are good reasons to think that there are limits to how accurate any such model could be. But asymptotic stability of a set of equilibria with respect to any dynamic in a wide range of such dynamics implies that the index of the set agrees with the Euler characteristic. For these models, and also for other economic models in which an equilibrium is a topological fixed point, we describe the hypothesis that a set of equilibria will not be self-reproducing if its index and Euler characteristic are different as the index +1 principle. (This terminology is accurate only in connection with isolated equilibria, but this is the case of greatest interest in applications, it is generic for many models, and the phrase “index equals Euler characteristic principle” is too cumbersome.)

Demichelis and Germano (2002a, b) provide a similar and complementary set of assumptions leading to the conclusion that a regular Walrasian or Nash equilibrium will be dynamically unstable if the condition of Theorem 15.17 does not hold. They assume a dynamical adjustment process that is defined for all parameters and vectors of endogenous variables, which vanishes at equilibria and nowhere else. They require it to have the expected qualitative behavior toward the boundary of the space of endogenous variables, so that, for example, prices that are very small should be increasing. They show that the set of such dynamical adjustment processes is path connected. Since the index and the degree are homotopy invariants, they will have the same signs for any such dynamics, and consequently an index of −1 implies a sign for the degree that is inconsistent with dynamic stability.

It is interesting to contrast the index +1 principle with the role of dynamic stability in disciplines such as physics and chemistry, where the given theory is a particular dynamic system. For such disciplines the stability or instability of an equilibrium is a purely mathematical issue. In contrast, although the index +1 principle is motivated by dynamic intuitions, it does not assert that adjustment to equilibrium is governed by one of the dynamic systems from which it is derived. It does not exclude the possibility that adjustment to equilibrium is influenced by aspects of the world that are not explicitly modelled. Instead of being a theorem, it is an hypothesis. While it may have more or less compelling motivations in different applications, in the end it either does or does not agree with experience.

It should also be noted that there are consequences of dynamic stability that are not expressed by the index +1 principle. Consider a “coordination” game with two agents, who have the same set of three pure strategies, and each agent’s payoff is 1 if the two agents choose the same pure strategy and 0 otherwise. This game has three pure Nash equilibria, three Nash equilibria in which the two agents mix equally over two of the pure strategies, and a Nash equilibrium in which the agents mix equally over all three pure strategies. Each pure equilibrium has index +1 and each partially mixed equilibrium has index −1, so the totally mixed equilibrium has index +1, but it is obviously unstable with respect to any plausible dynamics.
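The count of equilibria and the index arithmetic in this example can be checked mechanically. The following Python sketch is illustrative only: it tests just the symmetric uniform mixtures described above, takes the indices from the text rather than computing them, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

N = 3  # number of pure strategies shared by both agents


def payoffs_against(opponent):
    """Expected payoff of each pure strategy: 1 on a match, 0 otherwise,
    so the payoff of pure strategy k is the opponent's probability of k."""
    return list(opponent)


def is_symmetric_nash(mix):
    """Check that `mix` is a best response to itself."""
    payoff = payoffs_against(mix)
    best = max(payoff)
    return all(payoff[k] == best for k in range(N) if mix[k] > 0)


def uniform_on(support):
    """Mixed strategy putting equal weight on the strategies in `support`."""
    return [Fraction(1, len(support)) if k in support else Fraction(0)
            for k in range(N)]


# Test the uniform mixture over every nonempty support.
equilibria = []
for size in range(1, N + 1):
    for support in combinations(range(N), size):
        if is_symmetric_nash(uniform_on(support)):
            equilibria.append(support)

# Indices as stated in the text (not computed here): +1 for pure equilibria,
# -1 for two-strategy mixtures, +1 for the totally mixed equilibrium.
index_by_support_size = {1: +1, 2: -1, 3: +1}
index_sum = sum(index_by_support_size[len(s)] for s in equilibria)
print(len(equilibria), index_sum)
```

The seven equilibria have indices summing to +1, which is how the index of the totally mixed equilibrium is inferred from the other six.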

Paul Samuelson (1941, 1942, 1947) advocated a correspondence principle, according to which dynamical stability of an equilibrium has implications for the qualitative properties of the equilibrium’s comparative statics. Samuelson’s writings consider many particular models, but he never formulated the correspondence principle as a precise and general theorem. The economics profession’s understanding of it has languished, being largely restricted to 1-dimensional cases; see Echenique (2008) for a succinct summary.

The idea can be illustrated in a two good exchange economy. Figure 15.2a shows the excess demand for the second good as a function of the second good’s price, when the first good is the numeraire. There are three equilibria, two of which are stable relative to price dynamics that increase (decrease) the price of the second good when it is in excess demand (supply).

Figure 15.2b shows the effect of changing a parameter in a way that increases demand for the second good. This has the expected effect of increasing the second good’s equilibrium price for the two stable equilibria, but it leads to a price decrease in the unstable equilibrium. In a nutshell, Samuelson’s understanding of the correspondence principle was that dynamic stability had implications for comparative statics.

Fig. 15.2 Excess demand and comparative statics, panels (a) and (b)

In this example the correspondence principle combines three elements: (a) equilibria that are unstable with respect to natural dynamics will not be observed; (b) therefore excess demand is downward sloping at the equilibria that are empirically relevant; (c) this allows us to sign certain comparative statics. The first two of these are the 1-dimensional case of the index +1 principle. (The third quickly becomes problematic as the dimension of the model increases.) In this sense the index +1 principle can be understood as an extension of the correspondence principle to multidimensional settings.
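The one-dimensional picture can be reproduced numerically. The following Python sketch uses a hypothetical cubic excess demand function (it is not the function drawn in Fig. 15.2) with equilibria at prices 1, 2, and 3, and a parameter `shift` that raises demand uniformly. It reports, for each equilibrium, whether excess demand is downward sloping there and whether the equilibrium price rises when demand increases.

```python
def excess_demand(p, shift=0.0):
    """Hypothetical excess demand for good 2 with equilibria at p = 1, 2, 3
    when shift = 0; `shift` raises demand uniformly."""
    return -(p - 1.0) * (p - 2.0) * (p - 3.0) + shift


def bisect_root(f, lo, hi, tol=1e-12):
    """Find a root of f in [lo, hi], assuming f changes sign on the interval."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def slope(f, p, h=1e-6):
    """Central-difference estimate of f'(p)."""
    return (f(p + h) - f(p - h)) / (2 * h)


# One bracket around each equilibrium of the unshifted economy.
brackets = [(0.5, 1.5), (1.5, 2.5), (2.5, 3.5)]
report = []
for lo, hi in brackets:
    before = bisect_root(lambda p: excess_demand(p, 0.0), lo, hi)
    after = bisect_root(lambda p: excess_demand(p, 0.1), lo, hi)
    stable = slope(excess_demand, before) < 0  # downward sloping excess demand
    report.append((round(before, 6), stable, after > before))

for row in report:
    print(row)  # (equilibrium price, stable?, price rises with demand?)
```

In this sketch the two columns of booleans coincide: the price rises with demand exactly at the equilibria where excess demand slopes downward.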

Exercises

15.1 Prove that a vector field ζ on a subset S of a $C^r$ ($r \ge 1$) manifold is locally Lipschitz if and only if it is locally Lipschitz when regarded as a function from $S \subset \mathbb{R}^k$ to $S \times \mathbb{R}^k$.

15.2 Let I = {1, . . . , n} be a set of individuals, and let J be a set of positions with n elements. Each individual i has a von Neumann–Morgenstern utility function $u_i : J \to \mathbb{R}$. Let $\Delta^J := \{\, p \in \mathbb{R}^J_+ : \sum_j p_j = 1 \,\}$ be the set of probability distributions on J, and let c be the barycenter of $\Delta^J$. For $q \in \mathbb{R}^J$ let $B(q) = \{\, p \in \Delta^J : \langle q, p \rangle \le \langle q, c \rangle \,\}$, and for each i let $D_i(q) := \operatorname{argmax}_{p_i \in B(q)} u_i(p_i)$ where $u_i(p_i) := \sum_j p_{ij} u_{ij}$ is the expected utility of $p_i$. A pair $(q, p) \in \mathbb{R}^J \times (\Delta^J)^I$ is a Walrasian equilibrium from proportional endowments (Hylland and Zeckhauser 1979; McLennan 2018) if $p_i \in D_i(q)$ for all i and $\sum_i p_i = nc$. (The Birkhoff–von Neumann theorem implies that any such p can be realized as a lottery over deterministic assignments.) A position j is a favorite of i if $u_{ij} \ge u_{ij'}$ for all j′, and it is i’s unique favorite if $u_{ij} > u_{ij'}$ for all $j' \ne j$. Let F be the set of elements of J that are favorites of some individual. Throughout we assume that F is a proper subset of J. Let

\[ S_F := \{\, q \in \mathbb{R}^J : \langle q, c \rangle = 0,\ \|q\| = 1, \text{ and } q_j \ge 0 \text{ for all } j \in F \,\} . \]

(a) Prove that (as a subset of the sphere) $S_F$ is a diffeoconvex body.
(b) Prove that $S_F$ is contractible.
(c) For $q \in S_F$, what are $T_q S_F$, $S_{S_F}(q)$, and $N_{S_F}(q)$?
(d) Verify that if each element of F is some individual’s unique favorite, then the vector field correspondence $Z : S_F \to \mathbb{R}^J$ given by $Z(q) := -nc + \sum_i D_i(q)$ satisfies the hypotheses of the Poincaré–Hopf theorem. Conclude that a Walrasian equilibrium from proportional endowments exists.


15.3 We prove Conley’s fundamental theorem of dynamical systems. Let (X, d) be a compact metric space, and let f : X → X be a homeomorphism. A set A ⊂ X is an attractor if there is an open U ⊂ X such that $f(\overline{U}) \subset U$ and $A = \bigcap_{t \ge 0} f^t(U)$. We say that U is a basin of attraction for A. A repeller is an attractor of $f^{-1}$. Let $U^* := X \setminus \overline{U}$. Then $\overline{U^*} \subset X \setminus U$, so

\[ f^{-1}(\overline{U^*}) \subset f^{-1}(X \setminus U) = X \setminus f^{-1}(U) \subset X \setminus \overline{U} = U^* . \]

Let $A^* = \bigcap_{t \le 0} f^t(U^*)$.

(a) Prove that for all t, $f^{t+1}(U^*) \cup f^t(U) = X$. Conclude that

\[ A^* \cup \bigcup_{t \in \mathbb{Z}} f^t(U) = X = A \cup \bigcup_{t \in \mathbb{Z}} f^t(U^*) . \]

(b) Prove that if U′ is another basin of attraction for A, then $\bigcap_{t \ge 0} f^t(X \setminus U') = A^*$. Thus each attractor has an associated repeller.

(c) Prove that there are at most countably many attractor-repeller pairs. (Hint: X has a countable basis, and if $f(\overline{U}) \subset U$, then $f(\overline{U})$ is covered by finitely many elements of this basis that are contained in U.)

Let the attractor-repeller pairs be $\{(A_n, A_n^*)\}_{n \in \mathbb{N}}$.

For p, q ∈ X and ε > 0, an ε-chain from p to q is a sequence $x_0, \dots, x_T$ such that $d(p, x_0) < \varepsilon$, $d(f(x_{t-1}), x_t) < \varepsilon$ for all t = 1, . . . , T, and $d(x_T, q) < \varepsilon$. A point p is chain recurrent if, for any ε > 0, there is an ε-chain from p to itself. Let R(f) be the set of chain recurrent points.

(d) Prove that R( f ) is closed.

(e) Prove that if A, A∗ is an attractor-repeller pair and x ∉ A ∪ A∗, then x is not chain recurrent.

(f) For any x and ε let V be the set of points that can be reached by ε-chains from x. Prove that f(V) ⊂ V.

(g) Suppose that x ∉ R(f). There is some ε > 0 such that there is no ε-chain from x to itself. Let V be the set of points that can be reached by ε-chains from x, and let A be the attractor with basin V. Prove that x ∉ A, but also that any limit point of $\{f^t(x)\}_{t \ge 0}$ is in V, so x ∉ A∗. Conclude that $R(f) = \bigcap_n (A_n \cup A_n^*)$.

For p, q ∈ R(f) we write p ∼ q if, for every ε > 0, there is an ε-chain from p to q and an ε-chain from q to p. The restriction to points in R(f) implies that this relation is reflexive, it is evidently symmetric, and it is easy to see that it is transitive, hence an equivalence relation. The equivalence classes are called chain transitive components.

(h) Prove that a chain transitive component is closed.

(i) For p, q ∈ R(f) prove that p ∼ q if and only if for all n either $p, q \in A_n$ or $p, q \in A_n^*$.


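For each n let $\varphi_n : X \to [0, 1]$, $\overline{\varphi}_n : X \to [0, 1]$, and $g_n : X \to [0, 1]$ be the functions

\[ \varphi_n(x) := \frac{d(x, A_n)}{d(x, A_n) + d(x, A_n^*)}, \qquad \overline{\varphi}_n(x) := \sup_{t \ge 0} \varphi_n(f^t(x)), \qquad g_n(x) := \sum_{t \ge 0} 2^{-(t+1)}\, \overline{\varphi}_n(f^t(x)) . \]

(j) Prove that $\varphi_n^{-1}(0) = A_n$ and $\varphi_n^{-1}(1) = A_n^*$.

(k) Prove that $\overline{\varphi}_n$ is continuous.

(l) Prove that $g_n$ is continuous, $g_n^{-1}(0) = A_n$, $g_n^{-1}(1) = A_n^*$, and $g_n$ is strictly decreasing along orbits outside of $A_n \cup A_n^*$.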

A continuous function $L : X \to \mathbb{R}$ is a complete Lyapunov function for f if

(i) $L(f(p)) < L(p)$ for all p ∉ R(f);
(ii) for all p, q ∈ R(f), $L(p) = L(q)$ if and only if p ∼ q;
(iii) $L(R(f))$ is compact and nowhere dense.

Let $L : X \to [0, 1]$ be the function $L(x) = 2 \sum_{n \in \mathbb{N}} 3^{-n} g_n(x)$.

(m) Prove that $L$ is a complete Lyapunov function for f, and that $L(R(f))$ is contained in the Cantor set.

A flow is a continuous function φ : X × R → X such that $\varphi(\cdot, 0) = \mathrm{Id}_X$ and $\varphi(\cdot, t) \circ \varphi(\cdot, s) = \varphi(\cdot, s + t)$ for all s, t ∈ R. A set A ⊂ X is an attractor (repeller) for φ if it is an attractor (repeller) of φ(·, δ) for some δ > 0. For δ, ε > 0 and p, q ∈ X, a (δ, ε)-chain from p to q is a sequence of times $t_0 = 0, t_1, \dots, t_T$ and points $x_0, \dots, x_T$ such that $d(p, x_0) < \varepsilon$, $t_{i+1} > t_i + \delta$ and $d(\varphi(x_{i-1}, t_i - t_{i-1}), x_i) < \varepsilon$ for all i = 1, . . . , T, and $d(x_T, q) < \varepsilon$. We say that x is chain recurrent if there is a δ > 0 such that for all ε > 0 there is a (δ, ε)-chain from x to itself. We write p ∼ q if there is a δ > 0 such that for all ε > 0 there is a (δ, ε)-chain from p to q and a (δ, ε)-chain from q to p.

(n) Does the analysis above translate to this continuous time setting? Discuss the arguments that need to be modified.
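When experimenting with Exercise 15.3 it can help to test candidate ε-chains mechanically. The following Python sketch checks the discrete-time definition literally. The function name, the requirement that the chain contain at least one step, and the example map f(x) = x² on [0, 1] are assumptions made for illustration.

```python
def is_epsilon_chain(f, d, p, q, xs, eps):
    """Return True if xs = [x_0, ..., x_T] is an eps-chain from p to q for f:
    d(p, x_0) < eps, d(f(x_{t-1}), x_t) < eps for t = 1..T, d(x_T, q) < eps."""
    if len(xs) < 2:
        return False  # assumption: T >= 1, so the chain takes at least one step
    if d(p, xs[0]) >= eps or d(xs[-1], q) >= eps:
        return False
    return all(d(f(prev), cur) < eps for prev, cur in zip(xs, xs[1:]))


# Hypothetical example: X = [0, 1], f(x) = x^2, with fixed points 0 and 1.
f = lambda x: x * x
d = lambda a, b: abs(a - b)

orbit = [0.5]
for _ in range(5):
    orbit.append(f(orbit[-1]))

print(is_epsilon_chain(f, d, 0.5, 0.0, orbit, 0.01))       # follows the orbit toward 0
print(is_epsilon_chain(f, d, 0.5, 0.5, [0.5, 0.5], 0.01))  # jump of 0.25 is too large
print(is_epsilon_chain(f, d, 0.5, 0.5, [0.5, 0.5], 0.3))   # allowed for a coarse eps
```

The last two calls show why chain recurrence quantifies over all ε: a return that is possible with coarse jumps may fail once ε is small.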

Chapter 16 Extensive Form Games

The last two chapters describe concrete economic applications of contractible valued correspondences. These provide tangible evidence that the book’s concepts are useful, but they are also interesting in themselves, both economically and mathematically, and they allow us to develop some useful related material. (Another example in which this book’s concepts have found concrete application is Eraslan and McLennan (2013), which proves a uniqueness result by showing that each connected component of the set of equilibria has index +1, so that there must be exactly one component. Solan (2017) is a recent paper applying the Eilenberg–Montgomery theorem.) It seems quite likely that the literature would now have other examples if the techniques described here were more widely known.

Sequential equilibrium is an equilibrium concept for extensive form games that was introduced by Kreps and Wilson (1982). At the time many new solution concepts were being introduced, and its status was not immediately evident, but since then it has gradually come to be regarded as the foundational solution concept for extensive form games, in much the same way that Nash equilibrium is regarded as the central or basic concept for normal form games, even if certain refinements may seem preferable from various points of view. The work described here (from McLennan (1989a), McLennan (1989b)) shows that from a mathematical point of view as well, sequential equilibrium is the natural analogue of Nash equilibrium. Specifically (after a slight modification of Kreps and Wilson’s definition) the set of sequential equilibria is the set of fixed points of the natural best response correspondence, which is upper hemicontinuous and contractible valued, and whose domain is homeomorphic to the unit ball in a Euclidean space. Most obviously, this allows the application of index theory. We will also explain how this perspective is a starting point for the definition and analysis of refinements, which can be defined as the sets of fixed points of subcorrespondences of the best response correspondence.

Although our treatment of extensive form game theory is logically self contained, it does not present more than a little bit of the conceptual background, which is treated in considerable detail in van Damme (1987), as well as in the game theory texts Myerson (1991), Osborne and Rubinstein (1994). Less detailed treatments are given in graduate microeconomics texts such as Mas-Colell et al. (1995), Jehle and Reny (2011), and many other sources.

The primary references for the material related to conditional systems are Myerson (1986), McLennan (1989b), McLennan (1989a), Vieille (1996). A discussion of conceptual problems related to their application to extensive games is given by Kohlberg and Reny (1997), which also has an extensive collection of references to applications in noncooperative game theory. The concept has been applied in cooperative game theory by Monderer et al. (1992). It should also be mentioned that conditional systems had been studied earlier by the statisticians de Finetti (1936, 1949a, b), Rényi (1955, 1956, 1970).

16.1 A Signalling Game

In comparison with most objects one encounters in mathematics, extensive form games have many aspects, and are formally rather cumbersome. They can be used to tell stories that are quite intuitive, and for this reason it seems simplest to begin with an example.

Figure 16.1 shows an example of a signalling game. (It is taken from Cho and Kreps (1987), with some modification of the story.) There are two agents, the Sender and the Receiver. At the beginning of the game the Sender learns a piece of private information, which is called her type. She then chooses a message. The Receiver has prior beliefs concerning the type, but does not observe it directly. She does see the message, after which she chooses an action. This concludes the game, and the two agents receive payoffs that are numerical functions of the type, message, and action.

Fig. 16.1 A signalling game

In Fig. 16.1 the Sender’s possible types are W and S, which stand for “weak” and “strong.” The Receiver’s prior belief is that with probability 1/10 the Sender is weak, and with probability 9/10 she is strong. The two messages are B and C, which stand for “listening to the blues” and “listening to classical music.” The Receiver’s actions are A and W, which stand for “attack” and “withdraw.” Regardless of the choice of music, the Receiver would like to attack if the Sender is weak and withdraw if the Sender is strong, and she is indifferent when she thinks the two types are equally likely. The Sender’s payoff is the sum of 0 or 2, according to whether there is a fight, and 1 or 0, according to whether she listens to her favorite music, with the Sender preferring classical music if she is weak and the blues if she is strong.

A sequential equilibrium of this game specifies a strategy for each of the Sender types, a posterior belief for the Receiver after each of the two messages, and a strategy for the Receiver after each of the two messages. The strategy of each Sender type must be optimal taking the strategies of the Receiver as given, the belief after each message must be given by Bayesian updating (conditional probability) if that is well defined, and the strategy of the Receiver after each message must be optimal taking the posterior belief as given.

Eventually we will consider the possibility that the agents may play mixed (probabilistic) strategies, but first let’s analyze the sequential equilibria with pure strategies. The first point is that in equilibrium the two types of Sender must listen to the same type of music. Otherwise the choice of music would reveal the Sender’s type, and the Receiver would attack if the Sender is weak and withdraw if the Sender is strong, but in this case the weak Sender could do better by switching, thereby avoiding a fight. There are equilibria in which the weak and strong Sender both listen to the blues. Less intuitively, there are equilibria in which they both listen to classical music because that is the way to signal strength. In all of these equilibria the Receiver must attack after the unexpected message, and this is consistent with equilibrium if, after this message, she believes that the Sender is most likely weak. Since the unexpected message is never chosen, such a belief is consistent with Bayesian updating.

Allowing mixed strategies refines the picture a bit, without adding very much that is new. A basic property of Bayesian updating (the expectation of the posterior is the prior) implies that after one of the two signals the Receiver believes that the Sender is strong with probability at least 0.9, so that the Receiver withdraws. The Sender type that prefers this type of music will certainly assign all probability to this message. There cannot be an equilibrium in which the other Sender type assigns positive probability to the other message, because that message would then reveal the type, and the Receiver’s best response to this would induce a deviation by one of the two Sender types. In fact the only type of mixing that is possible is that in response to the unexpected message, the Receiver may mix if her posterior belief is that the two Sender types are equally likely. If she mixes, she must deter deviations by attacking with sufficiently high probability.
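The pure-strategy analysis can be confirmed by brute force. The following Python sketch enumerates the pure strategy profiles of the game as described in the text. The Receiver’s payoff normalization (1 for the “right” action, 0 otherwise) is an assumption chosen to match her stated preferences, and all names are hypothetical. Off the equilibrium path the sketch uses the fact that either action is optimal for some belief.

```python
from itertools import product

TYPES = {"weak": 0.1, "strong": 0.9}  # prior from the text
MESSAGES = ("blues", "classical")
ACTIONS = ("attack", "withdraw")
FAVORITE = {"weak": "classical", "strong": "blues"}


def sender_payoff(t, m, a):
    """2 for avoiding a fight plus 1 for listening to the favorite music."""
    return (2 if a == "withdraw" else 0) + (1 if m == FAVORITE[t] else 0)


def receiver_payoff(t, a):
    """Assumed normalization: 1 for attacking the weak type or withdrawing
    from the strong type, 0 otherwise (indifferent at equal beliefs)."""
    return 1 if (t, a) in {("weak", "attack"), ("strong", "withdraw")} else 0


def receiver_optimal(a, prob_weak):
    """Is action a a best response when the Sender is weak w.p. prob_weak?"""
    value = {b: prob_weak * receiver_payoff("weak", b)
                + (1 - prob_weak) * receiver_payoff("strong", b)
             for b in ACTIONS}
    return value[a] >= max(value.values())


def pure_sequential_equilibria():
    found = []
    for sender in product(MESSAGES, repeat=2):       # (weak's, strong's) message
        plan = dict(zip(("weak", "strong"), sender))
        for receiver in product(ACTIONS, repeat=2):  # action after each message
            reply = dict(zip(MESSAGES, receiver))
            ok = True
            for m in MESSAGES:
                mass = {t: TYPES[t] for t in TYPES if plan[t] == m}
                if mass:  # on path: Bayesian updating pins down the belief
                    belief = mass.get("weak", 0.0) / sum(mass.values())
                    ok = ok and receiver_optimal(reply[m], belief)
                # off path: some belief in [0, 1] rationalizes either action
            for t in TYPES:
                current = sender_payoff(t, plan[t], reply[plan[t]])
                ok = ok and all(sender_payoff(t, m, reply[m]) <= current
                                for m in MESSAGES)
            if ok:
                found.append((plan, reply))
    return found


for plan, reply in pure_sequential_equilibria():
    print(plan, reply)
```

Under these assumptions the enumeration returns the two pooling profiles described above, each with the Receiver attacking after the unexpected message.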


16.2 Extensive Form Games

This section lays out the formalism of finite extensive form games. As we can see already from the example of Sect. 16.1, there are many types of objects. We need to describe the possible states of the game, and which of these precede which others. The role of chance (which is sometimes described in terms of a mythical agent called Nature or Chance) will be represented by specifying a probability distribution over the set of initial states. There will be a payoff for each agent at each terminal state. When a player chooses an action, there can be incomplete information concerning the state that has occurred. This is represented by gathering the states that are possible into an “information set.” We need to specify how the choice of an action at an information set leads to a new state. All of this adds up to a rather heavy notational burden.

The set of possible states of the game is a finite set T, whose elements are called nodes. There is a strict partial order ≺ of T denoting precedence. For t ∈ T let

\[ P(t) := \{\, x \in T : x \prec t \,\} \]

be the set of predecessors of t. Let the sets of initial, nonterminal, noninitial, and terminal nodes be, respectively,

\[ W := \{\, w \in T : P(w) = \emptyset \,\}, \quad X := T \setminus Z, \quad Y := T \setminus W, \quad \text{and} \quad Z := \{\, z \in T : P^{-1}(z) = \emptyset \,\} . \]

(Presenting these definitions in lexicographic order is admittedly a bit illogical.) For t ∈ T let

\[ \overline{P}(t) := P(t) \cup \{t\} \quad \text{and} \quad \tilde{P}(t) := \overline{P}(t) \cap Y . \]

The game begins at a node in W, actions are chosen at nodes in X, such action choices result in nodes in Y, and the game ends when play arrives at a node in Z.

We assume that for each t ∈ T, P(t) is completely ordered by ≺. Kreps and Wilson use the term arborescence to describe a pair (T, ≺) satisfying this condition, the idea being that for each w ∈ W, P−1(w) is a tree. The intuitive meaning of this assumption is that a node is reached by only one sequence of action choices, so that, for example, two different move orders in chess lead to distinct nodes even if they result in the same position on the board. For y ∈ Y let p(y) := max P(y) be the immediate predecessor of y.

There is a partition H of X whose elements are called information sets. That is, elements of H are nonempty subsets of X, and each element of X is contained in precisely one element of H, which we denote by η(x). For each h ∈ H there is a nonempty set $A_h$ of actions that may be chosen at h. We assume that the sets $A_h$ are pairwise disjoint, and for an arbitrary action a, $h_a$ denotes the information set such that $a \in A_{h_a}$. In order for this structure to make sense, for each x and $a \in A_{\eta(x)}$ there must be a unique consequence of choosing a at x, so there is a bijection

\[ c : \bigcup_{h \in H} h \times A_h \to Y \]

such that p(c(x, a)) = x for all (x, a) in the domain. We say that c(x, a) is the immediate consequence of choosing a at x. Let $\alpha : Y \to \bigcup_h A_h$ be the function defined implicitly by requiring that c(p(y), α(y)) = y, so α(y) is the last action prior to y.

The set of agents or players is I := {1, . . . , n}. There is a function ι : H → I that indicates which agent chooses the action at each information set. For each player i let

\[ H_i := \iota^{-1}(i) \quad \text{and} \quad A_i := \bigcup_{h \in H_i} A_h \]

be the set of information sets at which she chooses and the set of actions she might choose.

For any finite set X let Δ◦(X) = { μ ∈ Δ(X) : μ(x) > 0 for all x ∈ X } be the set of interior probability measures on X. The role of chance is modelled by ρ ∈ Δ◦(W), which is called the initial assessment. (A slightly more general setup has some decisions during the course of play controlled by Nature, whose choice probabilities are part of the exogenously given data describing the game.) The requirement that all initial nodes have positive probability will amount to an assumption that when an “impossible” event occurs, the players only consider explanations that involve some players deviating from their strategies.

The assumption that the agents share a common prior belief concerning the probabilities of elements of W is known as the Harsanyi doctrine. This may seem unduly restrictive, since we can easily think of examples of two people maintaining divergent beliefs, even after extensive exchange of views and evidence. On the other hand, an economic model that purported to explain some phenomenon as a consequence of different agents living in different probabilistic worlds would be suspicious, if not downright bizarre. From the point of view of our agenda, allowing different agents to have different prior beliefs would result only in a slightly more complicated formalism, without affecting the mathematical analysis in any substantive respect.

We can now say how the game is played. An initial node is selected randomly, according to the distribution ρ. Whenever the game arrives at a nonterminal node x the agent ι(η(x)) who controls the information set η(x) containing x chooses an action $a \in A_{\eta(x)}$, which results in a new node c(x, a). This process continues until a terminal node is reached.

There is a utility or payoff function $u = (u_1, \dots, u_n) : Z \to \mathbb{R}^I$ specifying the payoff that each player receives at each terminal node. These payoffs are understood to be von Neumann–Morgenstern utilities, in the sense that each agent i strives to maximize the expectation of the utility $u_i(z)$ of the realized terminal node z.

Summarizing, an extensive form game is a tuple $G = (T, \prec, H, (A_h)_{h \in H}, I, \iota, \rho, u)$ specifying all the objects listed above, with the assumed properties. Throughout the remainder of the chapter we will assume that such a game is given.

We regard each agent as a single person, capable (in principle) of remembering everything that happened to her previously. (A game such as contract bridge, in which a single agent might be identified with a pair consisting of two people who see different information, raises conceptual issues that will not be addressed here.) Our notion of strategy, which specifies a (possibly random) action choice at each information set, does not make much sense if the agent can condition her choices on recollections that provide information concerning which nodes in the information set are possible.

For i ∈ I and t ∈ T, player i’s personal history at t is the pair $(H_i(t), A_i(t))$ where $H_i(t) := \eta(P(t)) \cap H_i$ and $A_i(t) := \alpha\bigl((P(t) \cup \{t\}) \cap Y\bigr) \cap A_i$ are the information sets at which i chose and the actions that she chose on the way to t. We will always assume that the game satisfies perfect recall: for all h ∈ H and x, x′ ∈ h, the personal histories of ι(h) at x and x′ are the same. Note in particular that if x′ ≺ x, then the action chosen at x′ on the way to x would also be chosen at an infinite sequence of predecessors of x′, which is impossible because T is finite. Thus perfect recall implies that the elements of each information set are unrelated by precedence.

16.3 Sequential Equilibrium

Let $A := \prod_h A_h$ be the set of pure behavior strategy profiles. We may think of this set as a cartesian product $A = \prod_i S_i$ where $S_i := \prod_{h \in H_i} A_h$ is agent i’s set of pure strategies. That is, a pure behavior strategy for agent i is an assignment of an element of $A_h$ to each $h \in H_i$. A mixed strategy for i is a probability distribution on $S_i$. These notions express a point of view in which agent i thinks about the game before it begins and formulates a complete plan for how to play the game. Any randomization is over complete plans for how to play the game, and takes place before the game begins.

Insofar as a poker player makes decisions one at a time, as various situations arise, the notion of a pure strategy is psychologically unnatural. Let

\[ \Pi := \prod_{h \in H} \Delta(A_h) \]

be the set of behavior strategy profiles. Again, we may think of this set as a cartesian product $\Pi = \prod_i \Pi_i$ where $\Pi_i := \prod_{h \in \iota^{-1}(i)} \Delta(A_h)$ is agent i’s set of behavior strategies. Elements of $\Pi_i$ are behavior strategies for agent i.

In order for our use of behavior strategies to be strategically valid, a conceptual issue must be dealt with. Any behavior strategy for i generates a canonical mixed strategy in which the probability of each pure strategy is the product of the probabilities of its components (i.e., the agent’s choices at the various information sets are statistically independent). However, there are mixed strategies that do not come from behavior strategies in this way, and we must consider the possibility that they provide useful additional strategic flexibility. A theorem of Kuhn asserts that they do not: for any mixed strategy there is a behavior strategy that is “realization equivalent” in the sense that for any strategies (behavior or mixed) of the other agents, the mixed strategy and the behavior strategy induce the same probability distribution on Z. As a matter of algebra this is a rather bulky calculation, in part because perfect recall plays an important part, so Exercise 16.1 asks you to provide the formal details. Nevertheless the main idea is simple. Given a mixed strategy for i, the realization equivalent behavior strategy is constructed as follows: for $h \in H_i$ and $a \in A_h$, the probability of choosing a is the probability the mixed strategy assigns to pure strategies that allow h to occur and choose a divided by the probability the mixed strategy assigns to pure strategies that allow h to occur. (If the mixed strategy does not allow h to occur, there is no restriction on behavior there.)
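The construction in Kuhn’s theorem can be tried on a small case. The following Python sketch uses a hypothetical single-agent tree (not one from the text): information set h1 offers L or R, and information set h2, reached only after L, offers l or r. Since there are no other agents, realization equivalence reduces to equality of the induced distributions over the three terminal outcomes. The mixed strategy is deliberately chosen so that it is not a product of independent choices.

```python
from fractions import Fraction as F

# Hypothetical single-agent tree: h1 offers L or R; h2 is reached only after L
# and offers l or r. Terminal outcomes are "R", "Ll", and "Lr".
MIXED = {
    ("L", "l"): F(1, 2),
    ("L", "r"): F(0),
    ("R", "l"): F(1, 4),
    ("R", "r"): F(1, 4),
}


def behavior_from_mixed(mixed):
    """At each information set, condition on the pure strategies that allow it."""
    at_h1 = {a: sum(pr for (a1, _), pr in mixed.items() if a1 == a)
             for a in ("L", "R")}
    allow_h2 = at_h1["L"]  # h2 occurs exactly when the plan chooses L at h1
    if allow_h2 > 0:
        at_h2 = {b: mixed[("L", b)] / allow_h2 for b in ("l", "r")}
    else:
        at_h2 = {"l": F(1, 2), "r": F(1, 2)}  # no restriction; arbitrary choice
    return at_h1, at_h2


def outcomes_from_mixed(mixed):
    """Distribution on terminal outcomes induced by the mixed strategy."""
    return {
        "R": mixed[("R", "l")] + mixed[("R", "r")],
        "Ll": mixed[("L", "l")],
        "Lr": mixed[("L", "r")],
    }


def outcomes_from_behavior(at_h1, at_h2):
    """Distribution on terminal outcomes induced by the behavior strategy."""
    return {
        "R": at_h1["R"],
        "Ll": at_h1["L"] * at_h2["l"],
        "Lr": at_h1["L"] * at_h2["r"],
    }


at_h1, at_h2 = behavior_from_mixed(MIXED)
print(outcomes_from_mixed(MIXED) == outcomes_from_behavior(at_h1, at_h2))
```

Note that the behavior strategy chooses l with probability 1 at h2, whereas the marginal probability of l under the mixed strategy is 3/4. Conditioning on the plans that allow h2 is what makes the two strategies realization equivalent.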

We now define the transition probabilities induced by a behavior strategy profile. For x, y ∈ T with x ≺ y, a pure behavior strategy profile a ∈ A “takes the play from x to y” if a_{η(p(y′))} = α(y′) for all y′ such that x ≺ y′ ⪯ y. That is, at each predecessor y′ of y that has x as a predecessor, and also when y′ = y, the action specified by a at η(p(y′)) is α(y′). The probability of going from x to y when play is governed by a behavior strategy profile π is the product

$$P^\pi(y|x) := \prod_{x \prec y' \preceq y} \pi_{\eta(p(y'))}(\alpha(y'))$$

of the probabilities of the action choices that lead from x to y. We set P^π(t|t) := 1 and P^π(y|x) := 0 if x is neither a predecessor of y nor y itself. For t ∈ T, if w is the initial predecessor of t and play is governed by π, then the probability that t occurs is the probability ρ(w) that w occurs times the probability of going from w to t:

$$P^\pi(t) := \rho(w) \times P^\pi(t|w). \tag{16.1}$$
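The product formula and (16.1) can be illustrated with a minimal sketch. The two-stage tree, the dictionaries `parent`, `info_set`, `action`, `rho`, `pi`, and the function names are all hypothetical illustrations and are not taken from the text; they play the roles of p, η, α, ρ, and π.

```python
from math import prod

# Hypothetical tree: initial node w, decision nodes x1 and x2, terminal nodes z1..z4.
parent = {"x1": "w", "x2": "w", "z1": "x1", "z2": "x1", "z3": "x2", "z4": "x2"}  # p
info_set = {"w": "h0", "x1": "h1", "x2": "h1"}                                   # eta
action = {"x1": "L", "x2": "R", "z1": "l", "z2": "r", "z3": "l", "z4": "r"}      # alpha
rho = {"w": 1.0}                                   # distribution on initial nodes
pi = {"h0": {"L": 0.25, "R": 0.75}, "h1": {"l": 0.4, "r": 0.6}}  # behavior profile


def transition_prob(pi, x, y):
    """P^pi(y|x): product of pi_{eta(p(y'))}(alpha(y')) over x < y' <= y."""
    factors = []
    node = y
    while node != x:
        if node not in parent:  # reached an initial node: x is not a predecessor of y
            return 0.0
        factors.append(pi[info_set[parent[node]]][action[node]])
        node = parent[node]
    return prod(factors)        # the empty product is 1, so P^pi(t|t) = 1


def node_prob(pi, t):
    """P^pi(t) = rho(w) * P^pi(t|w), where w is the initial predecessor of t."""
    w = t
    while w in parent:
        w = parent[w]
    return rho[w] * transition_prob(pi, w, t)
```

In this toy tree `node_prob(pi, "z3")` multiplies the probability of R at h0 by the probability of l at h1, and the probabilities of the four terminal nodes sum to one.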

The space of belief profiles is

$$M := \prod_{h \in H} \Delta(h).$$

As above, M = ∏_{i∈I} Mi where Mi := ∏_{h∈ι⁻¹(i)} Δ(h) is the space of systems of beliefs for agent i. Elements of M × Π are called assessments. We are interested in assessments in which the behavior strategy profile and the belief profile are related by a generalized form of Bayesian updating. We first explain Bayesian beliefs when the behavior strategy profile is interior, and then we take the closure of the resulting relation. The space of interior behavior strategy profiles is

$$\Pi^\circ := \prod_{h \in H} \Delta^\circ(A_h).$$


Since ρ assigns positive probability to all initial nodes, if π ∈ Π◦, then P^π(t) > 0 for all t ∈ T. For π ∈ Π◦ let μ^π ∈ M be the system of beliefs given by Bayesian updating (conditional probability): for each h ∈ H and x ∈ h,

$$\mu^\pi_h(x) := \frac{P^\pi(x)}{\sum_{x' \in h} P^\pi(x')}. \tag{16.2}$$
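A small sketch may help to see how (16.2) behaves as an interior profile approaches the boundary. The node names and the one-parameter family `beliefs_along` are hypothetical; exact rational arithmetic is used so that the values can be read off directly.

```python
from fractions import Fraction


def bayes_beliefs(node_prob, h):
    """mu_h(x) = P(x) / sum of P(x') over x' in h, as in (16.2)."""
    total = sum(node_prob[x] for x in h)
    if total == 0:
        raise ValueError("h has probability zero, so (16.2) does not apply")
    return {x: node_prob[x] / total for x in h}


def beliefs_along(eps):
    # Hypothetical interior profile under which x1 is reached with
    # probability eps and x2 with probability eps**2.
    return bayes_beliefs({"x1": eps, "x2": eps * eps}, ["x1", "x2"])
```

As eps tends to zero the information set itself becomes a probability zero event, yet the beliefs converge (here to probability one on x1). Limits of this kind are what the closure operation below retains.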

The space of interior consistent assessments is

Ψ◦ = { (μ^π, π) : π ∈ Π◦ } .

Let Ψ be the closure of Ψ◦ in M × Π. Elements of Ψ are called consistent assessments. Consistency of (μ, π) may seem like a strong condition to impose on beliefs at parts of the tree that have zero probability for π, but it may also be viewed as simply requiring that when something strange happens, you continue to believe that the other players’ behaviors at the various action sets are statistically independent.

Given an assessment (μ, π), an agent i , and an information set h ∈ ι−1(i),

$$E^{\mu,\pi}(u_i|h) := \sum_{x \in h} \mu_h(x) \sum_{z \in Z} P^\pi(z|x)\,u_i(z)$$

is i’s expected payoff conditional on arriving at h. For π ∈ Π, h ∈ H, and a ∈ Ah, let π|a denote the behavior strategy profile that agrees with π at all information sets other than h and assigns probability one to playing a at h. For an assessment (μ, π) and an information set h let

$$B_h(\mu,\pi) := \operatorname*{argmax}_{a \in A_h} E^{\mu,\pi|a}(u_{\iota(h)}|h)$$

be the set of optimal actions at h. An assessment (μ, π) is myopically rational if πh ∈ Δ(Bh(μ, π)) for all h ∈ H. An assessment is a sequential equilibrium if it is both consistent and myopically rational.

Myopic rationality is a weak concept, insofar as it asks only whether the behavior at each information set is optimal, taking the behavior at all other information sets as given. The conceptually correct notion of rationality, called sequential rationality, requires that there is no information set at which the agent in control can increase her conditional expected payoff by changing her behavior at that information set and/or other information sets occurring later in the game. Exercise 16.2 asks you to provide an inductive calculation showing that a sequential equilibrium is rational in this conceptually correct sense. Roughly, the combination of perfect recall and consistency implies that the beliefs at information sets lower down in the game tree are the “correct” ones from the point of view of the computation of expected utility at the given information set, so that the agent at later information sets has correct (from the point of view of the given information set) incentives.


There is a direct description of the set of sequential equilibria as a set of fixed points. For an assessment (μ, π) let

Φ(μ, π) = { (μ′, π′) ∈ Ψ : π′h ∈ Δ(Bh(μ, π)) for all h } .

It is easy to show that Φ(μ, π) ≠ ∅, which is to say that Φ : M × Π → Ψ is in fact a correspondence. A fixed point of Φ is an element of Ψ, hence consistent, and the definition of Φ entails that a fixed point of Φ is sequentially rational. Conversely, a sequential equilibrium is evidently a fixed point of Φ, so F(Φ) is the set of sequential equilibria. How well behaved is Φ? If (μn, πn) → (μ, π), then, because the relevant expected payoffs are continuous functions of μ and π, for sufficiently large n we have Bh(μn, πn) ⊂ Bh(μ, π) for all h and thus Φ(μn, πn) ⊂ Φ(μ, π), so Φ is upper hemicontinuous. If Φ were contractible valued, then Ψ would be contractible (because it is a possible value of Φ) and our correspondence would be well behaved. But whether Φ is contractible valued is unknown.

16.4 Conditional Systems

The beliefs in a sequential equilibrium are generalized conditional probabilities. As Kreps and Wilson define the sequential equilibrium concept, they consider only the minimal amount of information required to compute expected payoffs, conditional on reaching an information set and choosing an action there. The beliefs can be understood as conditional probabilities on the set A of pure behavior strategy profiles, and we modify their definition by keeping track of all conditional probabilities on this set. The mathematical effect of this will be to “unfold” the best response correspondence.

Definition 16.1 A conditional system p on a finite set A is an assignment of a probability measure p(·|E) ∈ Δ(E) to each nonempty E ⊂ A such that

$$p(C|E) = p(C|D) \times p(D|E) \tag{16.3}$$

whenever C ⊂ D ⊂ E ⊂ A with D ≠ ∅. The set of conditional systems on A is denoted by Δ∗(A), and is endowed with the relative topology inherited from ∏_{∅≠E⊂A} Δ(E).

As we mentioned at the beginning of the chapter, this space was first studied by the statisticians de Finetti and Rényi, introduced in economics by Myerson (1986), and then studied by McLennan (1989a, b) and Vieille (1996).

Below we develop several equivalent formal descriptions or “coordinate systems” for Δ∗(A), but before proceeding to these it is perhaps best to provide a more intuitive (albeit mathematically cumbersome) description of the concept. A lexicographic probability system (LPS) is a sequence of probability measures p0, . . . , pk ∈ Δ(A). The application of these objects in decision theory and in game theory was studied extensively by Blume et al. (1991a, b), and in many subsequent papers. We might think of p0 as an initial theory that one adheres to as the basis of Bayesian updating of beliefs unless one learns some fact (that presumably had prior probability zero) that contradicts it, in which case one switches to p1 until it, too, is contradicted, and so forth. For each i let Ai be the support of pi. We say that our LPS is complete if ⋃_i Ai = A, and it is a lexicographic conditional probability system (LCPS) if A0, . . . , Ak is a partition of A. In this case there is a conditional system p given by setting p(D|E) = pi(D ∩ Ai)/pi(E ∩ Ai) whenever D ⊂ E ⊂ A with E nonempty and i is the least index such that E ∩ Ai ≠ ∅. Conversely, for a conditional system p there is an associated LCPS given by setting p0 := p(·|A), letting A0 be the support of p0, setting p1 := p(·|A \ A0), letting A1 be the support of p1, and so forth. If ai ∈ Ai, aj ∈ Aj, and i < j, then we say that ai is infinitely more probable than aj.
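The passage from an LCPS to a conditional system can be sketched in a few lines. The representation of each pi as a dictionary from elements of its support to exact rational probabilities, and the names `conditional_from_lcps` and `satisfies_16_3`, are assumptions made for this illustration only.

```python
from fractions import Fraction
from itertools import combinations


def conditional_from_lcps(lcps):
    """lcps: list of dicts p_0, ..., p_k whose supports partition A."""
    def p(D, E):
        D, E = set(D), set(E)
        if not E or not D <= E:
            raise ValueError("need D to be a subset of a nonempty E")
        for p_i in lcps:  # the first level whose support meets E is used
            mass_E = sum(q for a, q in p_i.items() if a in E)
            if mass_E > 0:
                return sum(q for a, q in p_i.items() if a in D) / mass_E
        raise ValueError("the supports do not cover E")
    return p


def satisfies_16_3(p, A):
    """Brute-force check of p(C|E) = p(C|D) p(D|E) for C in D in E, D nonempty."""
    subsets = [set(s) for k in range(len(A) + 1) for s in combinations(A, k)]
    return all(p(C, E) == p(C, D) * p(D, E)
               for E in subsets for D in subsets for C in subsets
               if D and C <= D <= E)


half = Fraction(1, 2)
p = conditional_from_lcps([{"a": half, "b": half}, {"c": Fraction(1)}])
```

Here a and b are each infinitely more probable than c, so `p({"c"}, {"b", "c"})` is zero while `p({"c"}, {"c"})` is one.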

For any element of Δ◦(A) there is a conditional system consisting of the induced conditional probability distributions on the various nonempty subsets of A, so we have a function Δ◦(A) → Δ∗(A). If p is a conditional system with p(·|A) ∈ Δ◦(A), then p is the image of p(·|A). In this way we can identify Δ◦(A) with { p ∈ Δ∗(A) : p(·|A) ∈ Δ◦(A) }, and our discussion will treat Δ◦(A) as a subset of Δ∗(A) whenever that is convenient.

Lemma 16.1 Δ∗(A) is the closure of Δ◦(A), and is compact.

Proof By continuity (16.3) is satisfied at any point in the closure of Δ∗(A), so Δ∗(A) is closed in ∏_{∅≠E⊂A} Δ(E), hence compact, and it contains the closure of Δ◦(A). To complete the proof we will show that an arbitrary p ∈ Δ∗(A) is in the closure of Δ◦(A). Let A0, . . . , Ak and p(·|A0), . . . , p(·|Ak) be the lexicographic conditional probability system of p. For a ∈ A let i_a be the index such that a ∈ A_{i_a}. For ε > 0 let p_ε ∈ Δ◦(A) be given by

$$p_\varepsilon(D|E) := \frac{\sum_{a \in D} p(a|A_{i_a})\,\varepsilon^{i_a}}{\sum_{a \in E} p(a|A_{i_a})\,\varepsilon^{i_a}}.$$

Evidently p_ε → p as ε → 0. □
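The approximating sequence in this proof is easy to compute. The following sketch assumes the same hypothetical dictionary representation of an LCPS as a list of levels; `perturbed` is an illustrative name, not notation from the text.

```python
from fractions import Fraction


def perturbed(lcps, eps):
    """p_eps(D|E): weight p(a|A_i) * eps**i for a in the i-th support A_i."""
    weight = {a: q * eps ** i
              for i, p_i in enumerate(lcps)
              for a, q in p_i.items()}

    def p_eps(D, E):
        return sum(weight[a] for a in D) / sum(weight[a] for a in E)
    return p_eps


half = Fraction(1, 2)
lcps = [{"a": half, "b": half}, {"c": Fraction(1)}]
```

With these data `perturbed(lcps, eps)({"c"}, {"b", "c"})` equals eps/(1/2 + eps), which tends to the limiting value 0 as eps tends to 0, while the conditional probability of a given {a, b} is 1/2 for every eps.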

(Note that, as above, we always write p(a|E) in place of p({a}|E).)

Since a conditional system specifies a conditional probability on every subset of A, it contains an exponential (in |A|) amount of data, much of which is redundant. Our first task is to make the concept more tractable by showing that a conditional system p is completely determined by the pairwise probabilities p(a|{a, b}).

Lemma 16.2 If p ∈ Δ∗(A) and a, b, and c are distinct elements of A, then

$$p(a|\{a,b\}) \times p(b|\{b,c\}) \times p(c|\{c,a\}) = p(b|\{a,b\}) \times p(c|\{b,c\}) \times p(a|\{c,a\}). \tag{16.4}$$


Proof After multiplying both sides of this equation by

p({a, b}|{a, b, c}) × p({b, c}|{a, b, c}) × p({c, a}|{a, b, c})

Equation (16.3) can be used to reduce both sides to the same quantity. This observation constitutes a proof unless the quantity above is zero, so the equation is satisfied by all elements of Δ◦(A), and thus, by continuity, by all elements of its closure. □
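The reduction can be written out explicitly. In the sketch below T := {a, b, c} is shorthand introduced only for this display. Applying (16.3) with C = {a}, D = {a, b}, E = T gives p(a|{a, b}) × p({a, b}|T) = p(a|T), and similarly for the other five factors, so after multiplication the two sides of (16.4) become:

```latex
\begin{align*}
&\bigl[p(a|\{a,b\})\,p(\{a,b\}|T)\bigr]\,
 \bigl[p(b|\{b,c\})\,p(\{b,c\}|T)\bigr]\,
 \bigl[p(c|\{c,a\})\,p(\{c,a\}|T)\bigr]
 = p(a|T)\,p(b|T)\,p(c|T), \\
&\bigl[p(b|\{a,b\})\,p(\{a,b\}|T)\bigr]\,
 \bigl[p(c|\{b,c\})\,p(\{b,c\}|T)\bigr]\,
 \bigl[p(a|\{c,a\})\,p(\{c,a\}|T)\bigr]
 = p(b|T)\,p(c|T)\,p(a|T).
\end{align*}
```

The two right hand sides agree, and the common multiplier can be cancelled whenever it is nonzero.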

Lemma 16.3 If p(·|{a, b}), p(·|{a, c}), and p(·|{b, c}) satisfy (16.4), p(a|{a, b}) > 0, and p(b|{b, c}) > 0, then p(a|{a, c}) > 0.

Proof If p(a|{a, c}) = 0, then p(c|{a, c}) = 1, so every term on the left hand side of (16.4) would be positive, but the right hand side would vanish. □

Lemma 16.4 If p ∈ Δ∗(A) and a and b are distinct elements of E ⊂ A, then

p(a|{a, b}) × p(b|E) = p(b|{a, b}) × p(a|E) .

Proof We compute, with the second equality coming from Eq. (16.3):

p(a|{a, b}) × p(b|E) = p(b|E) − p(b|{a, b}) × p(b|E)

= p(b|{a, b}) × p({a, b}|E) − p(b|{a, b}) × p(b|E)

= p(b|{a, b})(p({a, b}|E) − p(b|E)) = p(b|{a, b}) × p(a|E).

□

Proposition 16.1 Any system of probability distributions p(·|{a, b}) for a, b ∈ A that satisfies (16.4) has a unique extension to a conditional system, and the map from the system to its extension is continuous.

Proof We define a binary relation ⪰ on A by specifying that a ⪰ b if and only if p(a|{a, b}) > 0. This relation is complete because probabilities sum to unity, and Lemma 16.3 implies that it is transitive.

Suppose that the given data extends to p ∈ Δ∗(A). Consider a nonempty E ⊂ A, and let a be an element of E that is maximal for ⪰. For any b ∈ E such that p(b|E) > 0 we have p(a|E) = p(a|{a, b}) × p({a, b}|E) > 0, so the last result implies that p(b|E)/p(a|E) = p(b|{a, b})/p(a|{a, b}). For any D ⊂ E we have

$$p(D|E) = \frac{\sum_{b \in D} p(b|E)}{\sum_{b \in E} p(b|E)} = \frac{\sum_{b \in D} p(b|E)/p(a|E)}{\sum_{b \in E} p(b|E)/p(a|E)} = \frac{\sum_{b \in D} p(b|\{a,b\})/p(a|\{a,b\})}{\sum_{b \in E} p(b|\{a,b\})/p(a|\{a,b\})}.$$

Any extension must satisfy this, so there is at most one extension. In addition, the right hand side is a continuous function of the given data, so if every system of given data has an extension, then the extension function is continuous.


We now show that the last equation does indeed define a conditional system. We first show that the right hand side does not depend on the choice of a if more than one choice is possible. Suppose that a′ is a second element of E that is maximal for ⪰. Equation (16.4) implies that

$$\frac{p(b|\{a,b\})}{p(a|\{a,b\})} \times \frac{p(a|\{a,a'\})}{p(a'|\{a,a'\})} = \frac{p(b|\{a',b\})}{p(a'|\{a',b\})}.$$

(All denominators are positive.) Therefore

$$\frac{\sum_{b \in D} \frac{p(b|\{a,b\})}{p(a|\{a,b\})}}{\sum_{b \in E} \frac{p(b|\{a,b\})}{p(a|\{a,b\})}} = \frac{\sum_{b \in D} \frac{p(b|\{a,b\})}{p(a|\{a,b\})} \times \frac{p(a|\{a,a'\})}{p(a'|\{a,a'\})}}{\sum_{b \in E} \frac{p(b|\{a,b\})}{p(a|\{a,b\})} \times \frac{p(a|\{a,a'\})}{p(a'|\{a,a'\})}} = \frac{\sum_{b \in D} \frac{p(b|\{a',b\})}{p(a'|\{a',b\})}}{\sum_{b \in E} \frac{p(b|\{a',b\})}{p(a'|\{a',b\})}}.$$

Finally we show that the conditional probabilities defined by the equation above satisfy (16.3). Suppose that C ⊂ D ⊂ E ⊂ A with D ≠ ∅. If there is an a ∈ D that is maximal in E for ⪰, then

$$p(C|E) = \frac{\sum_{b \in C} \frac{p(b|\{a,b\})}{p(a|\{a,b\})}}{\sum_{b \in E} \frac{p(b|\{a,b\})}{p(a|\{a,b\})}} = \frac{\sum_{b \in C} \frac{p(b|\{a,b\})}{p(a|\{a,b\})}}{\sum_{b \in D} \frac{p(b|\{a,b\})}{p(a|\{a,b\})}} \times \frac{\sum_{b \in D} \frac{p(b|\{a,b\})}{p(a|\{a,b\})}}{\sum_{b \in E} \frac{p(b|\{a,b\})}{p(a|\{a,b\})}} = p(C|D) \times p(D|E).$$

If there is no such a, then p(C|E) = p(D|E) = 0, so this equation holds in that case as well. □
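The extension formula in this proof translates directly into code. The sketch below assumes the pairwise data are stored in a dictionary `pair` with `pair[(a, b)]` standing for p(a|{a, b}); the names `extend` and `pairwise_from_weights` are illustrative and do not come from the text, and the code does not itself verify that the data satisfy (16.4).

```python
from fractions import Fraction


def extend(pair):
    """pair[(a, b)] = p(a|{a,b}) for distinct a, b. Returns (D, E) -> p(D|E)."""
    def weakly_above(a, b):
        return a == b or pair[(a, b)] > 0

    def p(D, E):
        # A maximal element exists when the relation is complete and transitive.
        a = next(x for x in E if all(weakly_above(x, y) for y in E))

        def ratio(b):  # p(b|{a,b}) / p(a|{a,b}), read as 1 when b = a
            return Fraction(1) if b == a else Fraction(pair[(b, a)]) / pair[(a, b)]

        return sum(ratio(b) for b in D) / sum(ratio(b) for b in E)
    return p


def pairwise_from_weights(w):
    """Pairwise probabilities induced by positive integer weights (interior case)."""
    return {(a, b): Fraction(w[a], w[a] + w[b]) for a in w for b in w if a != b}


interior = extend(pairwise_from_weights({"a": 1, "b": 2, "c": 3}))

half, one, zero = Fraction(1, 2), Fraction(1), Fraction(0)
boundary = extend({("a", "b"): half, ("b", "a"): half,
                   ("a", "c"): one, ("c", "a"): zero,
                   ("b", "c"): one, ("c", "b"): zero})
```

In the interior example the extension recovers the distribution proportional to the weights, and in the boundary example c receives probability zero in any set that also contains a or b.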

We now develop two different representations of the space of conditional systems. The following definition uses the natural extension of multiplication and multiplicative inverses to numbers in [0,∞], leaving 0 × ∞ and ∞ × 0 undefined.

Definition 16.2 A relative probability r on A is a system of numbers r(a, b) ∈ [0,∞] (a, b ∈ A), that is, r ∈ [0,∞]^{A×A}, such that

r(b, a) = r(a, b)⁻¹ and r(a, b) × r(b, c) = r(a, c)

for all a, b, c ∈ A other than those for which the product is undefined. Let R(A) be the set of relative probabilities on A, and let

R◦(A) = R(A) ∩ (0,∞)^{A×A} .

Proposition 16.2 For a conditional system p define r_p : A × A → [0,∞] by setting

$$r_p(a,b) := \frac{p(a|\{a,b\})}{p(b|\{a,b\})}.$$

Then r_p ∈ R(A). Conversely, if r ∈ R(A), for any distinct a, b ∈ A let

$$p_r(a|\{a,b\}) := \frac{r(a,b)}{r(a,b) + 1}.$$

These probabilities satisfy (16.4), so they extend to a p_r ∈ Δ∗(A). The functions p ↦ r_p and r ↦ p_r are inverse homeomorphisms that restrict to inverse homeomorphisms between Δ◦(A) and R◦(A).

Proof It is easily checked that p(a|{a, b}) + p(b|{a, b}) = 1 implies r_p(b, a) = r_p(a, b)⁻¹ and Eq. (16.4) implies r_p(a, b) × r_p(b, c) = r_p(a, c). Similarly, r(b, a) = r(a, b)⁻¹ implies p_r(a|{a, b}) + p_r(b|{a, b}) = 1 and r(a, b) × r(b, c) = r(a, c) implies that p_r satisfies (16.4). Thus there is a homeomorphism between R(A) and the set of systems of pairwise probabilities satisfying (16.4). The last result implies that the set of such systems is homeomorphic to Δ∗(A) because the extension in Δ∗(A) is a continuous function of the pairwise system, and of course the projection from Δ∗(A) to the set of pairwise systems is also continuous. The composition of the two homeomorphisms above is a homeomorphism between Δ∗(A) and R(A). From the equations in the last proof we see that a conditional system p has 0 < p(D|E) < 1 whenever D is a nonempty proper subset of E if and only if 0 < p(a|{a, b}) < 1 whenever a and b are distinct elements of A, and in turn this is the case if and only if 0 < r(a, b) < ∞ for all such a and b. Thus the homeomorphism between Δ∗(A) and R(A) restricts to a homeomorphism between Δ◦(A) and R◦(A). □
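On a single pair {a, b} the two maps in Proposition 16.2 reduce to a pair of scalar functions. The following sketch uses floating point numbers with `math.inf` standing for ∞; the function names are hypothetical, and reading r/(r + 1) as 1 when r = ∞ is the natural convention rather than something stated explicitly above.

```python
import math


def r_from_p(p_ab):
    """r_p(a,b) = p(a|{a,b}) / p(b|{a,b}), equal to +inf when p(b|{a,b}) = 0."""
    return math.inf if p_ab == 1 else p_ab / (1 - p_ab)


def p_from_r(r):
    """p_r(a|{a,b}) = r / (r + 1), read as 1 when r = +inf."""
    return 1.0 if math.isinf(r) else r / (r + 1)
```

Composing the two functions in either order returns the starting value (up to rounding), including at the endpoints 0 and 1 of the probability scale, and `r_from_p(q) * r_from_p(1 - q)` is 1 for interior q, which is the identity r(b, a) = r(a, b)⁻¹.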

A slightly different representation of the space of conditional systems will be even more useful, because it allows applications of linear algebra. In the next definition and result we use the natural extension of addition and negation to [−∞,∞] in which the sum x + y is defined except when x = −∞ and y = ∞ or x = ∞ and y = −∞. We also use the natural extensions of the exponential function to [−∞,∞] and the natural logarithm function to [0,∞], which are of course inverses.

Definition 16.3 A logarithmic relative probability λ on A is a system of numbers λ(a, b) ∈ [−∞,∞] (a, b ∈ A), that is, λ ∈ [−∞,∞]^{A×A}, such that

λ(b, a) = −λ(a, b) and λ(a, b) + λ(b, c) = λ(a, c)

for all a, b, c ∈ A other than those for which the left hand side of the second equation is undefined. Let Λ(A) be the set of logarithmic relative probabilities on A, and let

Λ◦(A) = Λ(A) ∩ (−∞,∞)^{A×A} .

Proposition 16.3 For r ∈ R(A) and a, b ∈ A let

λ_r(a, b) := ln r(a, b) .

Then λ_r ∈ Λ(A). Conversely, if λ ∈ Λ(A) and a, b ∈ A let

r_λ(a, b) := exp λ(a, b) .

Then r_λ ∈ R(A). The functions r ↦ λ_r and λ ↦ r_λ are inverse homeomorphisms that restrict to inverse homeomorphisms between R◦(A) and Λ◦(A).

Proof It is obvious that if r ∈ R(A), then λ_r ∈ Λ(A), and if λ ∈ Λ(A), then r_λ ∈ R(A). Furthermore the maps r ↦ λ_r and λ ↦ r_λ are continuous and inverses. □
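The extended arithmetic used in the definition and proposition can be mimicked with IEEE floating point infinities. In this sketch the helper names `ext_log`, `ext_exp`, and `ext_add` are hypothetical, and returning `None` for an undefined sum is a choice made for the illustration.

```python
import math


def ext_log(r):
    """Natural logarithm extended to [0, inf], with ln 0 = -inf."""
    return -math.inf if r == 0 else math.log(r)


def ext_exp(lam):
    """Exponential extended to [-inf, inf]."""
    return math.exp(lam)


def ext_add(x, y):
    """Extended sum; None marks the undefined cases inf + (-inf) and (-inf) + inf."""
    if math.isinf(x) and math.isinf(y) and x != y:
        return None
    return x + y
```

Under these conventions the multiplicative identity r(a, b) × r(b, c) = r(a, c) turns into the additive identity λ(a, b) + λ(b, c) = λ(a, c), and the undefined products 0 × ∞ correspond exactly to the undefined sums.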

For p ∈ Δ∗(A), let λ_p := λ_{r_p}, and for λ ∈ Λ(A) let p_λ := p_{r_λ}. Evidently p ↦ λ_p and λ ↦ p_λ are inverse homeomorphisms between Δ∗(A) and Λ(A). Throughout most of the remainder of this chapter we will regard Δ∗(A), R(A), and Λ(A) as “the same” space, for which there are three different presentations or coordinate systems. Of these, Λ(A) will be the most important.

16.5 Strong Deformation Retracts

This and the following three sections are primarily mathematical, and rather abstract. The agenda of this section is topological. The following notion is a strong version of concepts that were studied in Chap. 8.

Definition 16.4 Suppose that X is a topological space and A ⊂ X. A strong deformation retraction of X onto A is a continuous function ρ : X × [0, 1] → X satisfying:

(a) ρ(x, 0) = x for all x ∈ X;
(b) ρ(a, t) = a for all (a, t) ∈ A × [0, 1];
(c) ρ(x, 1) ∈ A for all x ∈ X.

If such a function exists we say that A is a strong deformation retract (SDR) of X .

Ultimately the importance of the SDR concept will be to infer that one of the two spaces is contractible once we know the other is.

Lemma 16.5 If A is an SDR of X, then A is contractible if and only if X is contractible.

Proof If A is contractible we can contract X by following a strong deformation retraction of X onto A at double speed between 0 and 1/2, then following the contraction of A at double speed. On the other hand, if X is contractible, then (Lemma 8.2) A is contractible because it is a retract of X. □

In our application the two sets of interest will be related by a sequence of strong deformation retractions. Fortunately the “is an SDR of” relation is transitive.

Lemma 16.6 If A is an SDR of X and B is an SDR of A, then B is an SDR of X.

Proof A strong deformation retraction of X onto B is given by following the strong deformation retraction of X onto A at double speed between 0 and 1/2, then following the strong deformation retraction of A onto B at double speed. □
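The concatenation in this proof can be sketched with homotopies represented as Python functions of a point and a time. The example spaces (the plane, a coordinate axis, and the origin) and all function names are hypothetical choices made for illustration.

```python
def concatenate(rho_XA, rho_AB):
    """Run rho_XA at double speed on [0, 1/2], then rho_AB at double speed."""
    def rho(x, t):
        if t <= 0.5:
            return rho_XA(x, 2 * t)
        return rho_AB(rho_XA(x, 1.0), 2 * t - 1)
    return rho


def plane_to_axis(point, t):
    """Retracts the plane X onto the horizontal axis A."""
    x, y = point
    return (x, (1 - t) * y)


def axis_to_origin(point, t):
    """Retracts the horizontal axis A onto the origin B."""
    x, y = point
    return ((1 - t) * x, y)


rho = concatenate(plane_to_axis, axis_to_origin)
```

The two branches agree at t = 1/2 because the second homotopy starts at the identity on A, which is what makes the concatenation continuous, and points of B never move because each of the two given homotopies fixes them.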


Provided that appropriate topological conditions are satisfied, strong deformation retractions can be embedded in a larger space. A topological fact prepares the argument.

Lemma 16.7 Suppose that {Cα}α∈A is a locally finite cover of X whose elements are closed. Then U ⊂ X is open if and only if U ∩ Cα is open in Cα for all α.

Proof If U is open, then automatically each U ∩ Cα is open in Cα. Suppose that each U ∩ Cα is open in Cα. Fixing x ∈ U, it suffices to show that U is a neighborhood of x. Let V be a neighborhood of x that intersects only finitely many elements of the cover, say Cα1, . . . , Cαk. If x was not an element of Cαi we could replace V with V \ Cαi, so we may assume that x is an element of each Cαi. For each i let Wi be an open subset of X such that U ∩ Cαi = Wi ∩ Cαi. Then V ∩ ⋂_i Wi is a neighborhood of x, and

$$V \cap \bigcap_i W_i = V \cap \bigcup_\alpha \Bigl(\bigcap_i W_i\Bigr) \cap C_\alpha \subset \bigcup_i W_i \cap C_{\alpha_i} = \bigcup_i U \cap C_{\alpha_i} \subset U.$$

□

Lemma 16.8 Suppose that {Cα}α∈A is a locally finite collection of closed subsets of X. For each α let $\partial C_\alpha := C_\alpha \cap \overline{X \setminus C_\alpha}$ and Uα := Cα \ ∂Cα. Assume that the sets Uα are pairwise disjoint. If, for each α, there is a strong deformation retraction ρα : Cα × [0, 1] → Cα of Cα onto ∂Cα, then the homotopy ρ : X × [0, 1] → X given by

$$\rho(x,t) := \begin{cases} \rho_\alpha(x,t), & x \in U_\alpha, \\ x, & \text{otherwise,} \end{cases}$$

is a strong deformation retraction of X onto X \ ⋃_α Uα.

Proof Clearly ρ is well defined and satisfies (a)–(c) of Definition 16.4. The sets Cα × [0, 1] cover X × [0, 1], and each (x, t) has a neighborhood U that intersects only finitely many Cα × [0, 1], so ρ⁻¹(U) is open because its intersection with each of these sets and with (X \ ⋃_α Uα) × [0, 1] is open in that set. □

The retraction that will be performed repeatedly maps a simplex minus one face to its boundary. To justify this we show that a polytope minus a facet can be retracted onto its boundary, then show that a simplex minus a face is homeomorphic to a polytope minus a facet.

Lemma 16.9 If Q is a polytope, ∂Q is the union of the facets of Q, and F is a facet of Q, then ∂Q \ F is an SDR of Q \ F.

Proof Let x be a point in the interior of F. The map h : (∂Q \ F) × (0, 1] → Q \ F given by h(y, s) = (1 − s)x + sy is a homeomorphism with h(y, 1) = y. The function ρ : (Q \ F) × [0, 1] → Q \ F given by ρ(h(y, s), t) = h(y, s + t − st) is a suitable strong deformation retraction. □
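For readers who want the verification spelled out, the conditions of Definition 16.4 can be checked directly from the formula for ρ. The following display is a sketch of that routine check, using only the identity s + t − st = 1 − (1 − s)(1 − t) and the fact that h(y, 1) = y:

```latex
\begin{align*}
s + t - st &= 1 - (1-s)(1-t) \in (0,1]
  && \text{for } s \in (0,1],\ t \in [0,1], \\
\rho(h(y,s),0) &= h(y,s)
  && \text{(condition (a))}, \\
\rho(h(y,1),t) &= h(y,1) = y
  && \text{(condition (b), since } h(y,1) = y \in \partial Q \setminus F\text{)}, \\
\rho(h(y,s),1) &= h(y,1) = y \in \partial Q \setminus F
  && \text{(condition (c))}.
\end{align*}
```

The first line shows that the second argument of h stays in its domain (0, 1], so ρ is well defined.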


Lemma 16.10 If Δ is a simplex, ∂Δ is the union of the proper faces of Δ, and Δ′ isa nonempty proper face of Δ, then ∂Δ \ Δ′ is an SDR of Δ \ Δ′.

Proof We may suppose that Δ is the convex hull of the standard unit basis vectors e1, . . . , em of R^m, and that Δ′ is the convex hull of e1, . . . , ek where 1 ≤ k < m. Let Δ′′ be the convex hull of ek+1, . . . , em. Each point in Δ \ Δ′ is (1 − s)x + sy for a unique s ∈ (0, 1], an x ∈ Δ′ that is unique if s < 1, and a unique y ∈ Δ′′. Therefore there is a well defined map f : Δ \ Δ′ → R^m given by f((1 − s)x + sy) = (1 − s)x + y. Given a point in the image, we can identify x, y, and s by projecting onto the coordinate subspaces, so this map is invertible, and its inverse is obviously continuous. Let Δ′′′ be the convex hull of Δ′ and the origin. Then the image of f is homeomorphic to (Δ′′′ \ Δ′) × Δ′′, and Δ′ × Δ′′ is a facet of Δ′′′ × Δ′′ because Δ′ is a facet of Δ′′′. Therefore the last result is applicable. □

Let 𝒫 be a simplicial complex. We say that 𝒫 is locally finite if each of its vertices is contained in only finitely many simplices. Let 𝒬 be a subcomplex of 𝒫. We say that 𝒬 is normal in 𝒫 if 𝒬 contains every simplex of 𝒫 whose vertices are all in 𝒬.

Recall (Sect. 2.6) that the closed star of x ∈ |𝒫|, denoted by $\overline{\mathrm{st}}(x, \mathcal{P})$ (or $\overline{\mathrm{st}}(x)$ if there is no ambiguity), is the union of all the simplices that contain x, and the open star, denoted by st(x, 𝒫), is the union of the relative interiors of all the simplices that contain x. For any S ⊂ |𝒫| we let

$$\overline{\mathrm{st}}(S,\mathcal{P}) := \bigcup_{x \in S} \overline{\mathrm{st}}(x,\mathcal{P}) \quad \text{and} \quad \mathrm{st}(S,\mathcal{P}) := \bigcup_{x \in S} \mathrm{st}(x,\mathcal{P}).$$

Proposition 16.4 If 𝒫 is locally finite and 𝒬 is normal in 𝒫, then there is a strong deformation retraction ρ : st(|𝒬|, 𝒫) × [0, 1] → |𝒬| such that ρ(P ∩ st(|𝒬|, 𝒫), t) ⊂ P for all P ∈ 𝒫 such that P ∩ |𝒬| ≠ ∅ and all t ∈ [0, 1].

Proof Consider a point x ∈ st(|𝒬|) \ |𝒬|, and let P be a simplex of 𝒫 containing x. Then Q := P ∩ |𝒬| is contained in the convex hull of the vertices of P that are in |𝒬|, and this simplex is in 𝒬, so Q is this simplex. Since 𝒬 is normal in 𝒫 and x ∉ |𝒬|, there are vertices of P that are not in |𝒬|. These vertices span a face R of P, and the last result gives a strong deformation retraction of P \ R onto ∂P \ R.

For each n = 0, 1, 2, . . . let Gn := |𝒬| ∪ (st(|𝒬|) ∩ |𝒫^n|). For each n = 1, 2, . . . Lemma 16.8 allows the strong deformation retractions described above to be combined to give a strong deformation retraction ρn : Gn × [0, 1] → Gn of Gn onto Gn−1.

Let ρ̃1 : G1 × [1/2, 1] → G1 be the function ρ̃1(x, t) := ρ1(x, 2t − 1). Supposing that ρ̃n−1 : Gn−1 × [2^{−(n−1)}, 1] → Gn−1 has already been defined, for x ∈ Gn and t ∈ [2^{−n}, 1] let

$$\tilde\rho_n(x,t) := \begin{cases} \rho_n(x, 2^n t - 1), & t \le 2^{-(n-1)}, \\ \tilde\rho_{n-1}(\rho_n(x,1), t), & \text{otherwise.} \end{cases}$$

Finally define ρ : st(|𝒬|) × [0, 1] → st(|𝒬|) by setting ρ(x, 0) := x and letting ρ(x, t) := lim_{n→∞} ρ̃n(x, t) if t > 0. This is continuous because its restriction to each Gn × [0, 1] is continuous and (since 𝒫 is locally finite) each point in st(|𝒬|) has a neighborhood that is contained in some Gn.

Since each step of the construction retracts part of a simplex onto its boundary, ρ(P ∩ st(|𝒬|, 𝒫), t) ⊂ P for all P ∈ 𝒫 such that P ∩ |𝒬| ≠ ∅ and all t ∈ [0, 1]. □

Corollary 16.1 If 𝒫 is locally finite and 𝒬 is normal in 𝒫, then |𝒬| is a neighborhood retract in |𝒫|.

We now describe the particular setting in which Proposition 16.4 will be applied in this chapter. Let V be a finite poset, which is to say that V is a finite set endowed with a partial order ⪯. A chain in V is a set σ ⊂ V that is completely ordered by ⪯. Let Σ be the set of chains in V. A subset of a chain is a chain, so Σ contains all subsets of its elements, and consequently (V, Σ) is a combinatoric simplicial complex that is called the order complex of V.
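The set of chains of a small poset can be enumerated by brute force. The sketch below uses the divisors of 6 ordered by divisibility as a hypothetical example; the function name `chains` and the variable `Sigma` are illustrative only, and the enumeration is exponential in the size of V.

```python
from itertools import combinations


def chains(V, leq):
    """All subsets of V (including the empty set) that are totally ordered by leq."""
    V = list(V)
    return [frozenset(s)
            for k in range(len(V) + 1)
            for s in combinations(V, k)
            if all(leq(a, b) or leq(b, a) for a, b in combinations(s, 2))]


def divides(a, b):
    return b % a == 0


Sigma = chains([1, 2, 3, 6], divides)
```

Here {1, 2, 6} and {1, 3, 6} are the maximal chains, {2, 3} is not a chain, and every subset of a chain is again in `Sigma`, which is the property that makes the pair of the poset and its chains a combinatoric simplicial complex.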

We fix the usual geometric realization of this complex. For each a ∈ V let ea be the associated standard unit basis vector in R^V. For σ ∈ Σ let |σ| be the convex hull of { ea : a ∈ σ }, let ∂|σ| := ⋃_{τ⊂σ, τ≠σ} |τ| be the boundary of |σ|, and let |σ|◦ := |σ| \ ∂|σ| be its interior.

We say that S ⊂ V is closed if b ∈ S whenever a ∈ S and b ⪯ a. For such an S let

Σ_S := { σ ∈ Σ : σ ⊂ S } and Σ^S := { σ ∈ Σ : σ ∩ S ≠ ∅ } .

Evidently (S, Σ_S) is a subcomplex of (V, Σ) that is normal in (V, Σ). Let NS := ⋃_{σ∈Σ_S} |σ|◦ and JS := ⋃_{σ∈Σ^S} |σ|◦. Then NS = |(S, Σ_S)|, and JS = st(NS), so:

Proposition 16.5 If S ⊂ V is closed, then NS is a closed subset of |(V, Σ)|, JS is an open subset of |(V, Σ)|, and NS is an SDR of JS.

16.6 Conical Decompositions

In this section V is a finite dimensional vector space endowed with an inner product. A conical decomposition of V is a finite collection 𝒞 of polyhedral cones such that:

(a) any nonempty face of an element of 𝒞 is also an element of 𝒞;
(b) the intersection of any two elements of 𝒞 is a common face;
(c) ⋃_{C∈𝒞} C = V.

(Since, by definition, a convex cone is nonempty, ∅ ∉ 𝒞.) For C ∈ 𝒞 let ∂C be the union of the proper faces of C, and let C◦ := C \ ∂C. In view of (b), for each w ∈ V there is an element of 𝒞 that contains it that is minimal, in the sense that it is contained in any other element of 𝒞 that contains w. Therefore { C◦ : C ∈ 𝒞 } is a partition of V. For each C ∈ 𝒞 fix an arbitrary point wC ∈ C◦.


As a collection of subsets of V, 𝒞 is partially ordered by inclusion. Let Σ be the set of chains in 𝒞, and let ∂Σ := { σ ∈ Σ : {0} ∉ σ }.

Lemma 16.11 If σ ∈ ∂Σ, then { wC : C ∈ σ } is a linearly independent set.

Proof The claim is trivial when σ = ∅, and if σ has a single element C, then wC ≠ 0 because C ≠ {0}. If C is the maximal element of σ, then all other elements are contained in some proper face of C, and the span of this face does not contain wC. Therefore the claim follows by induction on the number of elements of σ. □

Corollary 16.2 If σ ∈ Σ, then { wC : C ∈ σ } is an affinely independent set.

For each σ ∈ Σ \ ∂Σ let

$$D_\sigma := \Bigl\{ \sum_{C \in \sigma} \alpha_C w_C : \alpha_C \ge 0 \text{ for all } C \Bigr\}$$

be the closed cone generated by { wC : C ∈ σ }.

Lemma 16.12 For all σ, σ′ ∈ Σ \ ∂Σ, Dσ ∩ Dσ′ = Dσ∩σ′.

Proof That Dσ∩σ′ ⊂ Dσ ∩ Dσ′ is an automatic consequence of the definitions. Suppose that w ∈ Dσ ∩ Dσ′, so that w = ∑_{C∈σ} α_C w_C = ∑_{C′∈σ′} α′_{C′} w_{C′} for nonnegative α_C, α′_{C′}. Let C̄ be the element of 𝒞 such that w ∈ C̄◦. We argue by induction on the dimension of C̄. If this dimension is zero, then w is the origin, which is a member of Dσ∩σ′. Clearly C̄ is the maximal C ∈ σ such that α_C > 0 and the maximal C′ ∈ σ′ such that α′_{C′} > 0. Furthermore, if α_{C̄} < α′_{C̄} then

$$w - \alpha_{\bar C} w_{\bar C} = \sum_{C \in \sigma \setminus \{\bar C\}} \alpha_C w_C = (\alpha'_{\bar C} - \alpha_{\bar C}) w_{\bar C} + \sum_{C' \in \sigma' \setminus \{\bar C\}} \alpha'_{C'} w_{C'}$$

would be both an element of the boundary of C̄ and an element of its interior, so α_{C̄} = α′_{C̄}. Now w − α_{C̄} w_{C̄} ∈ D_{σ\{C̄}} ∩ D_{σ′\{C̄}} = D_{(σ∩σ′)\{C̄}}, so w ∈ Dσ∩σ′. □

In Sect. 2.2 we defined a notion of pointedness for cones and showed (Proposition 2.2) that a closed convex cone C ⊂ V is pointed if and only if it contains no line. We say that 𝒞 is pointed if each of its elements is pointed. Henceforth we assume that this is the case. Let

𝒟 := { Dσ : σ ∈ Σ \ ∂Σ } .

The passage from 𝒞 to 𝒟 is a conical analogue of the passage from a polytopal complex to its barycentric subdivision.

Proposition 16.6 𝒟 is a pointed conical decomposition of V.


Proof Consider σ ∈ Σ \ ∂Σ. Proposition 2.9 implies that Dσ is a polyhedral cone. If C is the maximal element of σ, then Dσ ⊂ C, so Dσ is pointed because C is pointed.

We verify (a)–(c). If H = { v ∈ V : ⟨n, v⟩ ≤ α } is a halfspace containing Dσ, then the intersection of Dσ with the boundary of H is Dσ′ where σ′ = { C ∈ σ : ⟨n, wC⟩ = α }. Thus 𝒟 satisfies (a).

Since { wC : C ∈ σ } is a linearly independent set, the cone generated by any subset of { wC : C ∈ σ } is a face of Dσ, so, in view of the last result, the intersection of any two elements of 𝒟 is a common face. Thus 𝒟 satisfies (b).

It remains to show that an arbitrary w ∈ V is an element of some element of 𝒟. Let C be the minimal element of 𝒞 that contains w. We argue by induction on the dimension of C. If this dimension is 0, then C = {0}, so w = 0 ∈ {0} = D_{{{0}}} ∈ 𝒟. Therefore we may assume that the dimension of C is positive, so that wC ≠ 0, and that every element of ∂C is contained in some Dσ. Since w + αwC ∈ C for all α ≥ 0, and C does not contain a line, there is some α > 0 such that w − αwC ∈ ∂C. There is a τ ∈ Σ \ ∂Σ such that w − αwC ∈ Dτ, so that w ∈ D_{τ∪{C}}. □

We can now define ∂Dσ and D◦σ as we defined ∂C and C◦ above. Evidently

$$D^\circ_\sigma = \Bigl\{ \sum_{C \in \sigma} \alpha_C w_C : \alpha_C > 0 \text{ for all } C \Bigr\}.$$

Lemma 16.13 For each C ∈ 𝒞, C◦ is the union of the D◦σ such that C is the maximal element of σ.

Proof That the union is contained in C◦ follows directly from the definition. We argue by induction on the dimension of C. Consider a point w ∈ C◦. If the dimension of C is zero, then C = {0}, the unique element of Σ \ ∂Σ that has C as its maximal element is {{0}}, and D◦_{{{0}}} = {0}. Therefore we may suppose that wC ≠ 0. Since w + αwC ∈ C for all α ≥ 0, and C does not contain a line, there is an α > 0 such that w′ := w − αwC ∈ ∂C. If w′ ∈ D◦σ, then w ∈ D◦_{σ∪{C}}. □

Corollary 16.3 For each C ∈ 𝒞, C is the union of the Dσ such that each element of σ is a subset of C.

Proof Since C is the union of the C′◦ for those C′ ∈ 𝒞 that are contained in C, this follows from the last result. □

Let 𝒯 := (𝒞, Σ) and ∂𝒯 := (𝒞 \ {{0}}, ∂Σ).

For each nonempty σ ∈ Σ let |σ| be the convex hull of { wC : C ∈ σ }. Corollary 16.2 implies that |σ| is a simplex. Let ∂|σ| be the union of all |τ| such that τ is a proper subset of σ, and let |σ|◦ := |σ| \ ∂|σ|. Of course ∅ ∈ Σ, and we set |∅| := ∂|∅| := |∅|◦ := ∅.

Lemma 16.14 If τ ⊂ σ ∈ ∂Σ, then

|σ| ∩ Dτ = |τ| and |σ ∪ {{0}}| ∩ Dτ = |τ ∪ {{0}}| .

Proof Observe that |σ| ∩ Dτ is contained in the intersection of |σ| with the span of Dτ, which (by Lemma 16.11) is |τ|. The second claim follows immediately. □

Lemma 16.15 For all σ, σ′ ∈ Σ, |σ| ∩ |σ′| = |σ ∩ σ′|.

Proof Suppose that σ, σ′ ∈ ∂Σ. Then Lemma 16.12 gives

|σ| ∩ |σ′| = (|σ| ∩ Dσ) ∩ (|σ′| ∩ Dσ′) = |σ| ∩ |σ′| ∩ Dσ∩σ′ = |σ ∩ σ′| .

Since |σ ∪ {{0}}| = { tv : v ∈ |σ| and 0 ≤ t ≤ 1 }, the claim also holds when σ ∈ Σ \ ∂Σ and/or σ′ ∈ Σ \ ∂Σ. □

Lemma 16.16 { |σ| : σ ∈ Σ } is a geometric realization of 𝒯.

Proof We have already observed that each |σ| is a simplex, and since 𝒯 is a combinatoric simplicial complex, each face of |σ| is |τ| for some τ ∈ Σ. The last result states that the intersection of two elements of { |σ| : σ ∈ Σ } is a common face. □

Proposition 16.7 |𝒯| is homeomorphic to the closed unit ball in V, and |∂𝒯| = ∂|𝒯|.

Proof For each σ ∈ Σ \ ∂Σ, each v ∈ Dσ \ {0} is ∑_{C∈σ} α_C w_C for some nonnegative numbers α_C, not all of which vanish, so there is a unique rσ(v) > 0 such that rσ(v)v ∈ |σ|. This function is continuous because it solves a nonsingular linear algebra problem. The various functions rσ agree on the overlaps of their domains, which are a finite system of relatively closed sets that cover V \ {0}, so there is a unique continuous function r : V \ {0} → R₊₊ such that r(v)v ∈ |∂𝒯| for all v. It is now clear that |𝒯| = { sv : v ∈ |∂𝒯| and 0 ≤ s ≤ 1 }, so

$$\partial|\mathcal{T}| = |\mathcal{T}| \cap \overline{V \setminus |\mathcal{T}|} = |\partial\mathcal{T}|.$$

Let D be the closed unit disk in V. The function h : ∂D → |∂𝒯| given by h(v) := r(v)v is a continuous bijection, so its inverse is also continuous. (If C ⊂ ∂D is closed, then it is compact, so h(C) is compact, hence closed.) The function v ↦ ‖v‖r(v)v is evidently a homeomorphism between D and |𝒯|. □

A set S ⊂ 𝒞 is closed if C′ ∈ S whenever C ∈ S, C′ ∈ 𝒞, and C′ ⊂ C. For such an S let:

$$N_S := \bigcup_{\sigma \in \Sigma,\ \sigma \subset S} |\sigma|^\circ; \qquad J_S := \bigcup_{\sigma \in \Sigma,\ \sigma \cap S \ne \emptyset} |\sigma|^\circ; \qquad K_S := \bigcup_{\sigma \in \Sigma,\ \sigma \cap S \ne \emptyset} D^\circ_{\sigma \cup \{\{0\}\}}.$$

Evidently 𝒯 is the order complex of the partially ordered (by inclusion) set 𝒞, so (Proposition 16.5) NS is closed in |𝒯|, JS is open in |𝒯|, and NS is an SDR of JS. For v ∈ V let pv be the point in |𝒯| on the line segment between the origin and v that is nearest to v. The homotopy (v, t) ↦ (1 − t)v + t pv is a strong deformation retraction of V onto |𝒯|. Since KS contains the interior of the line segment between any of its points and the origin, the restriction of this homotopy to KS is a strong deformation retraction of KS onto JS. Therefore either NS, JS, and KS are all contractible, or none of them are.

Lemma 16.13 implies that KS = ⋃_{C∈S} C◦. Therefore KS does not depend on the choice of the points wC. For this reason it will be much more susceptible to analysis.

16.7 Abstract Consistency

In this section A is simply a nonempty finite set, without any other structure. We fixa linear subspace �◦ of ◦(A) and let � be the closure of �◦ in (A).

We adopt the usual notational conventions concerning complete orders of A,which are denoted by �, �′, etc. That is, � and �′ are the asymmetric parts of � and�′, so that a � b if a � b but not b � a, and ∼ and ∼′ are the symmetric parts of� and �′, so that a ∼ b if and only if both a � b and b � a. We say that �′ refines�, or is a refinement of �, if a � b for all a, b ∈ A such that a �′ b. (Equivalently,a � b implies a �′ b.) If, in addition, �′ �= �, then �′ is a strict refinement of �.Let �◦ be the complete order with a ∼ b for all a, b ∈ A. Of course any completeorder refines �◦.

For $\lambda \in \Lambda(A)$ we define binary relations $R_c(\lambda)$ and $R_f(\lambda)$ on $A$, called the coarse order and fine order induced by $\lambda$, by specifying that $a\,R_c(\lambda)\,b$ if and only if $\lambda(a, b) > -\infty$ and $a\,R_f(\lambda)\,b$ if and only if $\lambda(a, b) \ge 0$. These relations are complete because $\lambda(b, a) = -\lambda(a, b)$, and they are transitive because $\lambda(a, b) + \lambda(b, c) = \lambda(a, c)$. Of course $R_f(\lambda)$ is a refinement of $R_c(\lambda)$.
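The two induced orders can be illustrated with a small computation. The following is a minimal sketch, not taken from the text: it assumes a hypothetical encoding in which each element has an integer `tier` (elements in a higher tier are infinitely more likely) and a finite `weight` used within a tier, so that the resulting function is antisymmetric and takes values in $[-\infty, \infty]$.

```python
import math

def make_lambda(tier, weight):
    """Return lam(a, b) for a hypothetical tiered description of A.

    tier[a]   -- integer; elements in a higher tier are infinitely more likely
    weight[a] -- finite log-weight comparing elements within the same tier
    """
    def lam(a, b):
        if tier[a] != tier[b]:
            return math.inf if tier[a] > tier[b] else -math.inf
        return weight[a] - weight[b]
    return lam

def coarse(lam, a, b):
    """a R_c(lam) b  iff  lam(a, b) > -infinity."""
    return lam(a, b) > -math.inf

def fine(lam, a, b):
    """a R_f(lam) b  iff  lam(a, b) >= 0."""
    return lam(a, b) >= 0

A = ["x", "y", "z"]
lam = make_lambda(tier={"x": 1, "y": 1, "z": 0},
                  weight={"x": 0.5, "y": 2.0, "z": 0.0})
```

Here `x` and `y` are coarsely indifferent, the fine order ranks `y` strictly above `x`, and both orders rank `z` strictly below the other two, so the fine order refines the coarse one.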

We say that $\succeq$ is coarsely consistent if there is a $\xi \in \Xi$ such that $R_c(\xi) = {\succeq}$, and it is finely consistent if there is a $\xi \in \Xi^\circ$ such that $R_f(\xi) = {\succeq}$. Since the coarse and fine orderings of the origin in $\Xi^\circ$ are both $\succeq^\circ$, this ordering is both coarsely and finely consistent.

Lemma 16.17 A complete order $\succeq$ is coarsely consistent if and only if it is finely consistent.

Proof First suppose that $\succeq$ is finely consistent, so $R_f(\xi) = {\succeq}$ for some $\xi \in \Xi^\circ$. For any $\xi' \in \Xi^\circ$, $R_c(\lim_{\alpha \to \infty} \alpha\xi + \xi') = {\succeq}$, so $\succeq$ is coarsely consistent.

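Now suppose that $\succeq$ is coarsely consistent. Let

$$L_\succeq := \{\, \xi \in \Xi^\circ : \xi(a, b) = 0 \text{ for all } a, b \in A \text{ such that } a \sim b \,\}.$$

Fix a linear subspace $\Theta_\succeq$ of $\Xi^\circ$ that is complementary to $L_\succeq$ in the sense that $L_\succeq \cap \Theta_\succeq = \{0\}$ and $L_\succeq + \Theta_\succeq = \Xi^\circ$, and let $\ell_\succeq : \Xi^\circ \to L_\succeq$ and $\theta_\succeq : \Xi^\circ \to \Theta_\succeq$ be the linear functions such that $\ell_\succeq(\xi) + \theta_\succeq(\xi) = \xi$ for all $\xi \in \Xi^\circ$.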

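Let $\xi_1, \xi_2, \ldots$ be a sequence in $\Xi^\circ$ converging to a point $\xi \in \Xi$ such that $R_c(\xi) = {\succeq}$. Since $\succeq^\circ$ is finely consistent, we may assume that ${\succeq} \ne {\succeq^\circ}$. Therefore there are $a, b$ such that $a \succ b$, so $\xi_r(a, b) \to \infty$ and $\|\xi_r\| \to \infty$. For each $a, b \in A$ such that $a \sim b$ we have $\ell_\succeq(\xi_r)(a, b) = 0$, so

$$|\theta_\succeq(\xi_r)(a, b)| = |\xi_r(a, b)| \to |\xi(a, b)| < \infty.$$

If the sequence $\{\theta_\succeq(\xi_r)\}$ were unbounded, we could pass to a subsequence with $\|\theta_\succeq(\xi_r)\| \to \infty$, and the sequence $\{\theta_\succeq(\xi_r)/\|\theta_\succeq(\xi_r)\|\}$ would necessarily have a limit point $\xi^*$ in the unit sphere of $\Theta_\succeq$. But then $\xi^*(a, b) = 0$ for each $a, b$ such that $a \sim b$, which is to say that $\xi^* \in L_\succeq$, so this is impossible. Therefore the sequence $\{\theta_\succeq(\xi_r)\}$ is bounded. For each $a, b \in A$ with $a \succ b$ we now have

$$\ell_\succeq(\xi_r)(a, b) = \xi_r(a, b) - \theta_\succeq(\xi_r)(a, b) \to \infty.$$

Since $\ell_\succeq(\xi_r) \in L_\succeq$, $R_f(\ell_\succeq(\xi_r)) = {\succeq}$ for large $r$. □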

Since coarse consistency and fine consistency are the same thing, henceforth we will simply use the term consistency. For a consistent $\succeq$ let

$$\Xi(\succeq) := \{\, \xi \in \Xi : R_c(\xi) = {\succeq} \,\}.$$

Evidently $\Xi(\succeq^\circ) = \Xi^\circ$.

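Definition 16.5 For a complete order $\succeq$ of $A$ let $G_\succeq : \Xi^\circ \to [-\infty, \infty]^{A \times A}$ be the map

$$G_\succeq(\xi)(a, b) := \begin{cases} \infty, & a \succ b, \\ \xi(a, b), & a \sim b, \\ -\infty, & b \succ a. \end{cases}$$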
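The definition is a three-way case split, which can be sketched directly. This is an illustrative sketch only: it assumes the complete order is encoded by a hypothetical `rank` dictionary (a larger rank means strictly higher in the order, equal ranks mean indifference) and that `xi` is a finite-valued function given as a difference of potentials.

```python
import math

def G(rank, xi):
    """Return G_order(xi) as a function of (a, b).

    rank[a] > rank[b] encodes a strictly above b; equal ranks encode a ~ b.
    """
    def g(a, b):
        if rank[a] > rank[b]:
            return math.inf
        if rank[a] < rank[b]:
            return -math.inf
        return xi(a, b)  # a ~ b: the finite component is kept unchanged
    return g

potential = {"x": 0.3, "y": -1.0, "z": 2.0}
xi = lambda a, b: potential[a] - potential[b]
rank = {"x": 1, "y": 1, "z": 0}
g = G(rank, xi)
```

Only the components with $a \sim b$ carry finite information after applying the map, which is the sense in which infinite components are disregarded below.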

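Proposition 16.8 If $\succeq$ is consistent, then $G_\succeq(\Xi^\circ) = \Xi(\succeq)$, so (disregarding infinite components) $\Xi(\succeq)$ is a vector space.

Proof We have $G_\succeq(\xi) = \lim_{\alpha \to \infty} \alpha\xi' + \xi \in \Xi(\succeq)$ for any $\xi \in \Xi^\circ$ and any $\xi' \in \Xi^\circ$ such that $R_f(\xi') = {\succeq}$, so $G_\succeq(\Xi^\circ) \subset \Xi(\succeq)$.

As the image of a linear transformation (in the obvious sense) $G_\succeq(\Xi^\circ)$ is a linear subspace of $\Xi(\succeq)$, and thus it is a closed subset of $\Xi(\succeq)$. For any $\xi \in \Xi(\succeq)$ there is a sequence $\xi_1, \xi_2, \ldots$ in $\Xi^\circ$ converging to $\xi$, and clearly $G_\succeq(\xi_r) \to \xi$. Therefore $G_\succeq(\Xi^\circ)$ is dense in $\Xi(\succeq)$, so it must be all of $\Xi(\succeq)$. □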

For the time being fix a consistent $\succeq_0$. We say that a complete order $\succeq$ is $\succeq_0$-consistent if there is a $\xi \in \Xi(\succeq_0)$ such that $R_f(\xi) = {\succeq}$. Obviously a $\succeq_0$-consistent order refines $\succeq_0$.

Lemma 16.18 If $\xi \in \Xi^\circ$ and $R_f(\xi)$ refines $\succeq_0$, then $R_f(G_{\succeq_0}(\xi)) = R_f(\xi)$.

Proof If $a \sim_0 b$, then $G_{\succeq_0}(\xi)(a, b) = \xi(a, b)$, so $a\,R_f(G_{\succeq_0}(\xi))\,b$ if and only if $a\,R_f(\xi)\,b$. If $a \succ_0 b$, then $G_{\succeq_0}(\xi)(a, b) = \infty$ and $\xi(a, b) > 0$ because $R_f(\xi)$ refines $\succeq_0$, so $a\,R_f(G_{\succeq_0}(\xi))\,b$ and $a\,R_f(\xi)\,b$. □

Proposition 16.9 A complete order $\succeq$ is $\succeq_0$-consistent if and only if it is consistent and a refinement of $\succeq_0$.

Proof We have just shown that a consistent refinement of $\succeq_0$ is $\succeq_0$-consistent. Suppose that $\succeq$ is $\succeq_0$-consistent. Of course $\succeq$ refines $\succeq_0$, so it remains to show that $\succeq$ is consistent.

Let $\pi_\succeq : \Xi^\circ \to \mathbb{R}^{\{(a,b) \in A \times A \,:\, a \sim b\}}$ be the projection. The unit disk in $\Xi^\circ$ maps onto a neighborhood of the origin in $\pi_\succeq(\Xi^\circ)$. If $1/\alpha$ is the radius of the largest disk contained in its image, then the image of the disk of radius $\alpha$ contains the disk of radius one, so for any $\xi \in \Xi^\circ$ there is a $\chi \in \Xi^\circ$ such that $\pi_\succeq(\chi) = \pi_\succeq(\xi)$ and $\|\chi\| \le \alpha\|\pi_\succeq(\xi)\|$.

Fix a $\xi \in \Xi(\succeq_0)$ such that $R_f(\xi) = {\succeq}$ and a sequence $\{\xi_r\}$ in $\Xi^\circ$ that converges to $\xi$. For each $r$ choose $\chi_r \in \Xi^\circ$ such that $\pi_\succeq(\chi_r) = \pi_\succeq(\xi_r)$ and $\|\chi_r\| \le \alpha\|\pi_\succeq(\xi_r)\|$. We have $\pi_\succeq(\xi_r) \to 0$, so $\|\chi_r\| \to 0$, and for all $a$ and $b$ such that $a \succ b$ we have $(\xi_r - \chi_r)(a, b) > 0$ for sufficiently large $r$. If $a \sim b$, then $(\xi_r - \chi_r)(a, b) = 0$, so $R_f(\xi_r - \chi_r) = {\succeq}$ for sufficiently large $r$. □

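Let $\mathcal{O}$ be the set of consistent orders of $A$. For each $\succeq_0 \in \mathcal{O}$ let $\mathcal{O}(\succeq_0)$ be the set of consistent orders that refine $\succeq_0$, and for each $\succeq \in \mathcal{O}(\succeq_0)$ let

$$C_\succeq(\succeq_0) := \{\, \xi \in \Xi(\succeq_0) : {\succeq} \text{ refines } R_f(\xi) \,\}.$$

Let $\mathcal{C}(\succeq_0) := \{\, C_\succeq(\succeq_0) : {\succeq} \in \mathcal{O}(\succeq_0) \,\}$.

Proposition 16.10 $\mathcal{C}(\succeq_0)$ is a pointed conical decomposition of $\Xi(\succeq_0)$.

Proof Observe that

$$C_\succeq(\succeq_0) = \{\, \xi \in \Xi(\succeq_0) : \xi(a, b) \ge 0 \text{ for all } a, b \in A \text{ such that } a \succeq b \,\}. \tag{16.5}$$

Since it is a subset of a Euclidean space defined by a finite conjunction of linear inequalities, $C_\succeq(\succeq_0)$ is a polyhedral cone. Recall (Proposition 2.6) that any nonempty set obtained by replacing some of these inequalities with equalities is a face of $C_\succeq(\succeq_0)$, and any face of $C_\succeq(\succeq_0)$ has this form. Therefore $\mathcal{C}(\succeq_0)$ contains the faces of its elements, and the intersection of any two elements of $\mathcal{C}(\succeq_0)$ is a common face. Each $\xi \in \Xi(\succeq_0)$ is an element of $C_{R_f(\xi)}(\succeq_0)$, so

$$\bigcup_{\succeq \in \mathcal{O}(\succeq_0)} C_\succeq(\succeq_0) = \Xi(\succeq_0).$$

To see that $C_\succeq(\succeq_0)$ is pointed consider that if $R_f(\xi_0 + \alpha\xi)$ is refined by $\succeq$ for all $\alpha \in \mathbb{R}$, then $\xi(a, b) = 0$ for all $a$ and $b$. □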

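As before, let $\partial C_\succeq(\succeq_0)$ be the union of the proper faces of $C_\succeq(\succeq_0)$, and let $C^\circ_\succeq(\succeq_0) := C_\succeq(\succeq_0) \setminus \partial C_\succeq(\succeq_0)$. The set $\mathcal{O}$ is partially ordered by increasing refinement. If $\succeq_0, \succeq, \succeq' \in \mathcal{O}$ with $\succeq'$ a refinement of $\succeq$ and $\succeq$ a refinement of $\succeq_0$, then $C_\succeq(\succeq_0) \subset C_{\succeq'}(\succeq_0)$. For each $\succeq \in \mathcal{O}$ choose a point $w_\succeq \in C^\circ_\succeq(\succeq^\circ)$.

Let $\Sigma$ be the set of chains in $\mathcal{O}$, and let $\partial\Sigma$ be the set of $\sigma \in \Sigma$ such that ${\succeq^\circ} \notin \sigma$. These are the simplices of the combinatoric simplicial complexes $T := (\mathcal{O}, \Sigma)$ and $\partial T := (\mathcal{O} \setminus \{\succeq^\circ\}, \partial\Sigma)$. When we consider a $\sigma = \{\succeq_0, \succeq_1, \ldots, \succeq_d\} \in \Sigma$ the relations will always be ordered by increasing refinement. For such a $\sigma$ let $|\sigma|$ be the convex hull of $\{w_{\succeq_0}, w_{\succeq_1}, \ldots, w_{\succeq_d}\}$. We also have $\emptyset \in \Sigma$, and we let $|\emptyset| := \emptyset$. Let $|T| := \bigcup_{\sigma \in \Sigma} |\sigma|$ and $|\partial T| := \bigcup_{\sigma \in \partial\Sigma} |\sigma|$. We have achieved the situation considered in the last section: according to Proposition 16.7, $\{\, |\sigma| : \sigma \in \Sigma \,\}$ is a simplicial subdivision of $|T|$, $|T|$ is homeomorphic to the closed unit ball in $\Xi^\circ$, and $|\partial T| = \partial|T|$.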

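Our main objective now is to construct a homeomorphism between $|T|$ and $\Xi$. For $\sigma = \{\succeq_0, \succeq_1, \ldots, \succeq_d\} \in \Sigma$ let $|\sigma|^\dagger := |\sigma| \setminus |\{\succeq_1, \ldots, \succeq_d\}|$, so that

$$|\sigma|^\dagger = \Bigl\{\, \alpha_0 w_{\succeq_0} + \cdots + \alpha_d w_{\succeq_d} : \alpha_0 > 0,\ \alpha_1, \ldots, \alpha_d \ge 0, \text{ and } \sum_i \alpha_i = 1 \,\Bigr\}.$$

Any point $\xi \in |\sigma|^\dagger$ has a unique representation of the form

$$\xi = (\gamma_0 - \gamma_1) w_{\succeq_0} + \cdots + (\gamma_d - \gamma_{d+1}) w_{\succeq_d}$$

where $1 = \gamma_0 > \gamma_1 \ge \cdots \ge \gamma_d \ge \gamma_{d+1} = 0$.

If $\succeq$ refines $\succeq_0$ let $w_\succeq(\succeq_0) := G_{\succeq_0}(w_\succeq) \in C^\circ_\succeq(\succeq_0)$. Note that $w_{\succeq_0}(\succeq_0)$ is the origin of the vector space $\Xi(\succeq_0)$. For $\sigma = \{\succeq_0, \succeq_1, \ldots, \succeq_d\} \in \Sigma(\succeq_0)$ let

$$D_\sigma := \{\, \alpha_1 w_{\succeq_1}(\succeq_0) + \cdots + \alpha_d w_{\succeq_d}(\succeq_0) : \alpha_1, \ldots, \alpha_d \ge 0 \,\} \subset \Xi(\succeq_0).$$

It is possible that $d = 0$, in which case $D_\sigma := \{w_{\succeq_0}(\succeq_0)\}$. Any $\xi \in D_\sigma$ has a unique representation of the form

$$\xi = (\delta_1 - \delta_2) w_{\succeq_1}(\succeq_0) + \cdots + (\delta_d - \delta_{d+1}) w_{\succeq_d}(\succeq_0)$$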

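where $\infty = \delta_0 > \delta_1 \ge \delta_2 \ge \cdots \ge \delta_d \ge \delta_{d+1} = 0$.

Let $h : [0, 1] \to [0, \infty]$ be a homeomorphism with $h(0) = 0$ and $h(1) = \infty$. Let $H_\sigma : |\sigma|^\dagger \to D_\sigma$ be the map

$$H_\sigma\Bigl( \sum_{j=0}^{d} (\gamma_j - \gamma_{j+1}) w_{\succeq_j} \Bigr) := \sum_{j=1}^{d} \bigl( h(\gamma_j) - h(\gamma_{j+1}) \bigr) w_{\succeq_j}(\succeq_0). \tag{16.6}$$

The inverse of $H_\sigma$ is

$$\sum_{j=1}^{d} (\delta_j - \delta_{j+1}) w_{\succeq_j}(\succeq_0) \mapsto \sum_{j=0}^{d} \bigl( h^{-1}(\delta_j) - h^{-1}(\delta_{j+1}) \bigr) w_{\succeq_j}.$$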

Evidently $H_\sigma$ and its inverse are continuous, so $H_\sigma$ is a homeomorphism.

Suppose that $\sigma' = \{\succeq_0, \succeq'_1, \ldots, \succeq'_{d'}\}$ is a subset of $\sigma$ with the same minimally refined element. Then $H_{\sigma'}$ is the restriction of $H_\sigma$ to $|\sigma'|^\dagger$ because if $\gamma_j = \gamma_{j+1}$ in the representation above, then $h(\gamma_j) = h(\gamma_{j+1})$, and the corresponding terms vanish from both sides of (16.6). Similarly, $H^{-1}_{\sigma'}$ is the restriction of $H^{-1}_\sigma$ to $D_{\sigma'}$.
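The coefficient transformation in (16.6) and its inverse can be checked numerically. The sketch below is illustrative only: it assumes the particular choice $h(t) = t/(1-t)$, which is one homeomorphism of $[0,1]$ onto $[0,\infty]$ with the required endpoint values (the text does not single out any particular $h$), and it works with the coefficient sequences alone rather than with the points $w_{\succeq_j}$.

```python
import math

def h(t):
    """One homeomorphism [0, 1] -> [0, inf] with h(0) = 0 and h(1) = inf."""
    return math.inf if t >= 1.0 else t / (1.0 - t)

def h_inv(s):
    """Inverse of h, sending inf back to 1."""
    return 1.0 if math.isinf(s) else s / (1.0 + s)

def forward_coeffs(gamma):
    """gamma = [g_0, ..., g_{d+1}] with 1 = g_0 > g_1 >= ... >= g_{d+1} = 0.
    Returns h(g_j) - h(g_{j+1}) for j = 1, ..., d (the weights in D_sigma)."""
    return [h(gamma[j]) - h(gamma[j + 1]) for j in range(1, len(gamma) - 1)]

def backward_coeffs(delta):
    """delta = [d_0, ..., d_{d+1}] with inf = d_0 > d_1 >= ... >= d_{d+1} = 0.
    Returns h_inv(d_j) - h_inv(d_{j+1}) for j = 0, ..., d (barycentric weights)."""
    return [h_inv(delta[j]) - h_inv(delta[j + 1]) for j in range(len(delta) - 1)]

gamma = [1.0, 0.5, 0.25, 0.0]
delta = [math.inf] + [h(g) for g in gamma[1:]]
cone_weights = forward_coeffs(gamma)
simplex_weights = backward_coeffs(delta)
```

With these inputs the recovered simplex weights are the original differences $\gamma_j - \gamma_{j+1}$, and a coefficient vanishes on one side exactly when it vanishes on the other, which is the mechanism behind the restriction property just described.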

Let $\Sigma(\succeq_0)$ be the set of elements of $\Sigma$ whose least refined element is $\succeq_0$. Let

$$\mathcal{D}(\succeq_0) = \{\, D_\sigma : \sigma \in \Sigma(\succeq_0) \,\}.$$

Proposition 16.6 and Lemma 16.12 imply that $\mathcal{D}(\succeq_0)$ is a pointed conical decomposition of $\Xi(\succeq_0)$ and $D_\sigma \cap D_{\sigma'} = D_{\sigma \cap \sigma'}$ for all $\sigma, \sigma' \in \Sigma(\succeq_0)$.

For $\sigma, \sigma' \in \Sigma(\succeq_0)$, $|\sigma|^\dagger \cap |\sigma'|^\dagger = |\sigma \cap \sigma'|^\dagger$ is easily seen to be a consequence of Lemma 16.15. Therefore the various maps $H_\sigma$ agree on the overlaps of their domains since these overlaps are common faces, and their inverses also agree on the overlaps of their domains. Let

$$N(\succeq_0) := \bigcup_{\sigma \in \Sigma(\succeq_0)} |\sigma|^\dagger,$$

and let $H(\succeq_0) : N(\succeq_0) \to \Xi(\succeq_0)$ be the function that agrees with $H_\sigma$ on each $|\sigma|^\dagger$. This is well defined and continuous on each of finitely many relatively closed sets that cover its domain, so it is continuous. The function that agrees with $H^{-1}_\sigma$ on each $D_\sigma$ is similarly well defined and continuous, and of course it is the inverse of $H(\succeq_0)$. Summing up:

Lemma 16.19 $H(\succeq_0) : N(\succeq_0) \to \Xi(\succeq_0)$ is a homeomorphism.

The collections of sets $\{\, N(\succeq_0) : {\succeq_0} \in \mathcal{O} \,\}$ and $\{\, \Xi(\succeq_0) : {\succeq_0} \in \mathcal{O} \,\}$ are partitions of $|T|$ and $\Xi$, so the maps $H(\succeq_0)$ combine to form a bijection

$$H : |T| \to \Xi.$$

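We now show that $H$ is continuous. Fix $\sigma = \{\succeq_0, \succeq_1, \ldots, \succeq_d\} \in \Sigma$. For each $i = 0, \ldots, d$ let $\sigma_i := \{\succeq_i, \ldots, \succeq_d\}$, and observe that $|\sigma| = \bigcup_{i=0}^{d} |\sigma_i|^\dagger$. Let $\{\xi_r\}$ be a sequence in $|\sigma|^\dagger$ converging to $\xi \in |\sigma_i|^\dagger$. We have

$$\xi_r = \sum_{j=0}^{d} (\gamma_{j,r} - \gamma_{j+1,r}) w_{\succeq_j} \quad\text{and}\quad \xi = \sum_{j=i}^{d} (\gamma_j - \gamma_{j+1}) w_{\succeq_j}$$

where $1 = \gamma_{0,r} > \gamma_{1,r} \ge \cdots \ge \gamma_{d+1,r} = 0$ and $1 = \gamma_i > \gamma_{i+1} \ge \cdots \ge \gamma_{d+1} = 0$. For these variables the meaning of $\xi_r \to \xi$ is that $\gamma_{j,r} \to 1$ for all $j \le i$ and $\gamma_{j,r} \to \gamma_j$ for all $j \ge i$.

We need to show that $H_\sigma(\xi_r)(a, b) \to H_{\sigma_i}(\xi)(a, b)$ for any $a, b \in A$. Without loss of generality suppose that $a \succeq_0 b$. If $a \succ_0 b$, then $H_\sigma(\xi_r)(a, b) = \infty$ for all $r$ and $H_{\sigma_i}(\xi)(a, b) = \infty$. Therefore suppose that $a \sim_0 b$. There are two cases. If $a \succ_i b$, then $a \succ_j b$ for all $j \ge i$, so

$$H_\sigma(\xi_r)(a, b) = \sum_{j=1}^{d} \bigl( h(\gamma_{j,r}) - h(\gamma_{j+1,r}) \bigr) w_{\succeq_j}(a, b) \ge \bigl( h(\gamma_{i,r}) - h(\gamma_{d+1,r}) \bigr) \min_{j=i,\ldots,d} w_{\succeq_j}(a, b) \to \infty = H_{\sigma_i}(\xi)(a, b).$$

If $a \sim_i b$, then $a \sim_j b$ for all $j \le i$, so

$$H_\sigma(\xi_r)(a, b) = \sum_{j=i+1}^{d} \bigl( h(\gamma_{j,r}) - h(\gamma_{j+1,r}) \bigr) w_{\succeq_j}(a, b) \to \sum_{j=i+1}^{d} \bigl( h(\gamma_j) - h(\gamma_{j+1}) \bigr) w_{\succeq_j}(a, b) = H_{\sigma_i}(\xi)(a, b).$$

Thus $H$ is indeed continuous.

Recall that a continuous bijection mapping a compact space onto a Hausdorff space is a homeomorphism. (A closed subset of the domain is compact, so its image is compact, hence closed because the range is Hausdorff.) Thus we have shown that: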

Theorem 16.1 $|T|$ and $\Xi$ are homeomorphic.

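We say that $S \subset \mathcal{O}$ is closed if ${\succeq'} \in S$ whenever ${\succeq}, {\succeq'} \in \mathcal{O}$, ${\succeq} \in S$, and $\succeq'$ refines $\succeq$. For such an $S$ let

$$\Xi_S := \bigcup_{\succeq \in S} \Xi(\succeq) \quad\text{and}\quad N_S := \bigcup_{\sigma \in \partial\Sigma,\ \sigma \subset S} |\sigma|^\circ = \bigcup_{\succeq \in S} N(\succeq).$$

Of course $H$ restricts to a homeomorphism between $\Xi_S$ and $N_S$. For a simplex $\sigma = \{\succeq_0, \succeq_1, \ldots, \succeq_d\} \in \Sigma$ let

$$D^\circ_\sigma := \{\, \alpha_1 w_{\succeq_1}(\succeq_0) + \cdots + \alpha_d w_{\succeq_d}(\succeq_0) : \alpha_1, \ldots, \alpha_d > 0 \,\}.$$

Let

$$J_S := \bigcup_{\sigma \in \Sigma,\ \sigma \cap S \ne \emptyset} |\sigma|^\circ \quad\text{and}\quad K_S := \bigcup_{\sigma \in \Sigma,\ \sigma \cap S \ne \emptyset} D^\circ_{\sigma \cup \{\{0\}\}}.$$

As at the end of the last section, $K_S$ is open, $J_S$ is an open subset of $|T|$ that is an SDR of $K_S$, and $N_S$ is a closed subset of $|T|$ that is an SDR of $J_S$. Either $\Xi_S$, $N_S$, $J_S$, and $K_S$ are all contractible or none of them are. In addition, $K_S = \bigcup_{\succeq \in S} C^\circ_\succeq$. In particular:

Proposition 16.11 If $S \subset \mathcal{O}$ is closed, then $\Xi_S$ is a closed subset of $\Xi$ that is contractible if and only if $K_S$ is contractible.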

16.8 Sequential Equilibrium Reformulated

In this section we describe sequential equilibrium in terms of conditional systems. The description utilizes some of the natural constructions with conditional systems that are analogous to familiar constructions in probability theory. We will not attempt

a systematic or exhaustive treatment of such operations, instead defining only thosewe need.

Let A be a finite set. There is a natural map

δA : Δ∗(A) → Δ(A)

given by $\delta_A(p) := p(\cdot|A)$. Since an interior probability measure determines all conditional probabilities, the restriction of $\delta_A$ to $\delta_A^{-1}(\Delta^\circ(A))$ is a bijection.

Let $B$ be a second finite set. If $\sigma \in \Delta(A)$ and $\tau \in \Delta(B)$, let $\sigma \otimes \tau \in \Delta(A \times B)$ be the product measure: $(\sigma \otimes \tau)(a, b) = \sigma(a) \times \tau(b)$. This operation does not extend to all pairs of conditional systems. The best we can do in this direction is to define a product of an interior probability measure and a conditional system. Suppose that $\rho \in \Delta^\circ(B)$, $p \in \Delta^*(A)$, and $D \subset E \subset B \times A$. Let $E_A := \{\, a' : (b', a') \in E \,\}$, and let

$$(\rho \otimes p)(D|E) := \frac{\sum_{(b,a) \in D} \rho(b) \times p(a|E_A)}{\sum_{(b,a) \in E} \rho(b) \times p(a|E_A)}.$$

Now let $C : A \to B$ be a correspondence that is injective in the sense that the various sets $C(a)$ are pairwise disjoint. There is an induced map $\theta_C : \Delta^*(B) \to \Delta^*(A)$ given by

$$\theta_C(q)(D|E) := q\Bigl( \bigcup_{a \in D} C(a) \,\Big|\, \bigcup_{a \in E} C(a) \Bigr).$$

Of course $\delta$ and $\theta_C$ are continuous, and $\rho \otimes p$ is a continuous function of $(\rho, p) \in \Delta^\circ(B) \times \Delta^*(A)$.

We now return to the extensive game setting: let $G = (T, \prec, H, (A_h)_{h \in H}, I, \iota, \rho, u)$ be an extensive game of perfect recall. Let $A = \prod_h A_h$ be the set of pure strategy profiles, and let $\Xi^\circ$ be the set of $\xi \in \Lambda^\circ(A)$ such that there is a vector $(\xi_h)_{h \in H} \in \prod_h \Lambda^\circ(A_h)$ such that

$$\xi(a, b) = \sum_{h \in H} \xi_h(a_h, b_h)$$

for all $a, b \in A$. Note that $\Xi^\circ$ is a linear subspace of $\mathbb{R}^{A \times A}$. Let $\Xi$ be the closure of $\Xi^\circ$ in $\Lambda(A)$. Elements of $\Xi$ are called consistent conditional systems.

We now show how to pass from a consistent conditional system to a consistent assessment. For all $\xi \in \Xi^\circ$ we have

$$\xi((a_h, a_{-h}), (b_h, a_{-h})) = \xi((a_h, b_{-h}), (b_h, b_{-h}))$$

for all $h$, $a_h, b_h \in A_h$, and $a_{-h}, b_{-h} \in A_{-h}$. Since $\Xi$ is the closure of $\Xi^\circ$, elements of $\Xi$ also satisfy this condition, so for each $h$ there is a function $m_h : \Xi \to \Lambda(A_h)$ defined by setting

$$m_h(\xi)(a_h, b_h) := \xi((a_h, a_{-h}), (b_h, a_{-h}))$$

where $a_{-h}$ may be any element of $A_{-h}$. Let $\pi_h := \delta_{A_h} \circ m_h : \Xi \to \Delta(A_h)$.

For $t \in T$ let $C(t) := \{w\} \times A(t)$ where $w$ is the initial predecessor of $t$ and $A(t)$ is the set of $a \in A$ that take the play from $w$ to $t$. Since the nodes in an information set are unrelated by precedence, the restriction of the correspondence $C : T \to W \times A$ to each information set is injective in the sense described above, so for $\xi \in \Xi$ we can let $\mu(\xi) \in M$ be given by

$$\mu_h(\xi) := \delta_h(\theta_{C|_h}(\rho \otimes \xi)).$$

Let $\gamma : \Xi \to M \times \Pi$ be the function $\gamma(\xi) := (\mu(\xi), \pi(\xi))$. Evidently the components of $\gamma$ are compositions of continuous functions, so $\gamma$ is continuous.

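Lemma 16.20 $\gamma(\Xi) = \Psi$.

Proof Since $\Xi$ and $\Psi$ are the closures of $\Xi^\circ$ and $\Psi^\circ$, and $\gamma$ is continuous, it suffices to show that $\gamma(\Xi^\circ) = \Psi^\circ$. We can rewrite Eq. (16.1) as

$$P^\pi(t) = \Bigl( \rho \otimes \bigotimes_h \pi_h \Bigr)(C(t)),$$

so Eq. (16.2) can be rewritten as

$$\mu^\pi_h(x) = \theta_{C|_h}\Bigl( \rho \otimes \bigotimes_h \pi_h \Bigr)(x|h) = \delta_h\Bigl( \theta_{C|_h}\Bigl( \rho \otimes \bigotimes_h \pi_h \Bigr) \Bigr)(x).$$

If $\xi \in \Xi^\circ$, then $\xi = \bigotimes_j \pi_j(\xi)$, so for any $h$ we have

$$\mu_h(\xi) = \delta_h\Bigl( \theta_{C|_h}\Bigl( \rho \otimes \bigotimes_j \pi_j(\xi) \Bigr) \Bigr) = \mu^{\pi(\xi)}_h,$$

and thus $\gamma(\xi) = (\mu^{\pi(\xi)}, \pi(\xi))$. Since $\pi(\Xi^\circ) = \Pi^\circ$, $\gamma(\Xi^\circ) = \Psi^\circ$, as desired. □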

We now define a correspondence $\Gamma : \Xi \to \Xi$ by setting $\Gamma(\xi) := \gamma^{-1}(\Phi(\gamma(\xi)))$. Concretely

$$\Gamma(\xi) := \{\, \xi' \in \Xi : \pi_h(\xi') \in \Delta(B_h(\gamma(\xi))) \text{ for all } h \in H \,\}.$$

Since $\gamma$ is a surjection, it follows immediately that

$$\mathcal{F}(\Gamma) = \gamma^{-1}(\mathcal{F}(\Phi)) \quad\text{and}\quad \mathcal{F}(\Phi) = \gamma(\mathcal{F}(\Gamma)).$$

As in the last section let $\mathcal{O}$ be the set of consistent orders of $A$. The next result describes the common features of the correspondence $\Gamma$ and the subcorrespondences described in the next section.

Proposition 16.12 Suppose that $Q : \Xi \to \mathcal{O}$ is an upper hemicontinuous correspondence whose values are closed, and each $K_{Q(\xi)}$ is contractible. Then the correspondence $\Gamma_Q : \Xi \to \Xi$ given by $\Gamma_Q(\xi) := \Xi_{Q(\xi)}$ is upper hemicontinuous and contractible valued.

Proof Since the values of $Q$ are nonempty and closed, the values of $\Gamma_Q$ are nonempty and closed, hence compact. If $\{\xi_r\}$ is a sequence in $\Xi$ converging to $\xi$, $Q(\xi_r) \subset Q(\xi)$ for large $r$, so $\Gamma_Q(\xi_r) \subset \Gamma_Q(\xi)$ for large $r$. Thus $\Gamma_Q$ is upper hemicontinuous. For each $\xi$, $\Gamma_Q(\xi)$ is contractible because $K_{Q(\xi)}$ is contractible. □

For a complete partial order �h of Ah let

�(�h) = { ah ∈ Ah : ah �h bh for all bh ∈ Ah } .

Suppose that � ∈ O . Insofar as � is consistent, for each h there is a complete partialorder oh(�) of Ah such that for all ah, bh ∈ Ah and a−h ∈ A−h , ahoh(�)bh if andonly if (ah, a−h) � (bh, a−h). For ξ ∈ � let

Q(ξ) := { � ∈ O : �(oh(�)) ⊂ Bh(γ (ξ))for all h ∈ H }.

Evidently $\Gamma = \Gamma_Q$. Of course if ${\succeq'} \in \mathcal{O}$ refines $\succeq$, then $o_h(\succeq')$ refines $o_h(\succeq)$, so $Q(\xi)$ is closed. Suppose that $\xi_n \to \xi$. Since $\gamma$ and the conditional expected payoffs are continuous functions, $B_h(\gamma(\xi_n)) \subset B_h(\gamma(\xi))$ and thus $Q(\xi_n) \subset Q(\xi)$ for sufficiently large $n$, so $Q$ is upper hemicontinuous.

Lemma 16.21 Suppose that for each $h \in H$, $\emptyset \ne B_h \subset A_h$. Let

S := { � ∈ O : �(oh(�)) ⊂ Bh for all h ∈ H }.

For each $h$ let $\xi^*_h$ be an element of $\Lambda^\circ(A_h)$ such that $\xi^*_h(b_h, a_h) > 0$ if $b_h \in B_h$ and $a_h \in A_h \setminus B_h$, and $\xi^*_h(a_h, b_h) = 0$ if $a_h, b_h \in B_h$ or $a_h, b_h \in A_h \setminus B_h$. Let $\xi^*$ be the element of $\Xi^\circ$ given by $\xi^*(a, b) = \sum_h \xi^*_h(a_h, b_h)$. Then $\xi^*$ is a star for $K_S$.

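Proof Let $\lambda$ be an element of $\Xi^\circ$, and let $\lambda_h$ for $h \in H$ be such that $\lambda(a, b) = \sum_h \lambda_h(a_h, b_h)$ for all $a, b \in A$. If the ray $\{\, \xi^* + \alpha\lambda : \alpha \in \mathbb{R}_+ \,\}$ leaves $K_S$, it first does so at an $\alpha$ such that for some $h$ there is an $a_h \in A_h \setminus B_h$ such that $\xi^*_h(a_h, b_h) + \alpha\lambda_h(a_h, b_h) \ge 0$ for all $b_h \in B_h$. Since each $\xi^*_h(a, b) + \alpha\lambda_h(a, b)$ is an affine function of $\alpha$, $\xi^* + \beta\lambda \notin K_S$ for all $\beta \ge \alpha$. □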

Theorem 16.1 and Proposition 16.12 now imply that:

Theorem 16.2 $\Xi$ is homeomorphic to the closed unit ball in $\mathbb{R}^{\sum_{h \in H}(|A_h| - 1)}$, and $\Gamma$ is upper hemicontinuous and contractible valued.


16.9 Refinements

From the early days of game theory it has been recognized that the Nash equilibriumconcept is too permissive. An extensive literature considers how one might “refine”the concept, ideally getting rid of the equilibria that are conceptually problematic,while leaving a nonempty set of equilibria for every game. The main method in thisliterature is to consider robustness (as studied in Chap.7) with respect to carefullychosen perturbations of the game, or of the best response correspondence. One mayalso use index theory to eliminate components of the set of equilibria whose indexis zero (or even, or negative). Of course these methods can in principle be appliedto the best response correspondence for sequential equilibria, but it turns out that asomewhat simpler and more direct method is to work with subcorrespondences ofthe best response correspondence.

A simple example illustrates many key points. Suppose that you are going to arestaurant that features valet parking. You can either park your car on a nearby street,or you can pull up to the front door and hand your keys to the valet, after which thevalet can either park your car in the usual way or go on a joy ride, cruising aroundthe city until the early morning hours. Of course that would be a ridiculous thing todo—the valet would lose his job and possibly go to jail—so in real life you wouldnever worry about it. We’ll assume that you prefer using the valet if you expectnormal behavior, but you much prefer parking yourself to having your car stolen.

In the modelling praxis of game theory, we might start out by representing this asan extensive form game in which, at the initial node, you choose between P (parkingyourself) and V (using the valet). Choosing P ends the game, and V leads to a nodewhere the valet chooses between N (normal) and J (joyride). We could then passfrom this to a strategic form game in which each of you chooses a pure strategy.Since this is quite a simple game, each of you has only two pure strategies. A Nashequilibrium for this normal form game is a pair of (possibly probabilistically mixed)strategies such that each of you prefer your strategy to any alternative if you believethat the other agent’s strategy describes what you can expect. In this game (V, N ) isa Nash equilibrium, but so is (P, J ): if you expect J you should play P , and if youchoose P the valet’s payoff is unaffected by what he intended to do in a situationthat doesn’t arise.
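The claim that both (V, N) and (P, J) are Nash equilibria can be checked by brute force. The sketch below is illustrative: the numerical payoffs are assumptions chosen only to respect the ordering described in the story (the valet prefers normal parking to a joyride, and you prefer the valet to self-parking unless the car is taken), and only pure strategies are examined.

```python
from itertools import product

# Hypothetical payoffs (driver, valet); only their ordering matters.
PAYOFFS = {
    ("P", "N"): (1, 0),
    ("P", "J"): (1, 0),
    ("V", "N"): (2, 1),
    ("V", "J"): (-10, -5),
}
DRIVER, VALET = ("P", "V"), ("N", "J")

def is_pure_nash(d, v):
    """Neither player can gain by a unilateral change of pure strategy."""
    driver_ok = all(PAYOFFS[(d, v)][0] >= PAYOFFS[(d2, v)][0] for d2 in DRIVER)
    valet_ok = all(PAYOFFS[(d, v)][1] >= PAYOFFS[(d, v2)][1] for v2 in VALET)
    return driver_ok and valet_ok

pure_nash = [(d, v) for d, v in product(DRIVER, VALET) if is_pure_nash(d, v)]
```

With these payoffs the list contains exactly (P, J) and (V, N). The profile (P, J) survives only because the valet is indifferent when his node is never reached.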

The most famous and historically important paper in the refinements literature was due to Selten (1975), who proposed the notion of perfect equilibrium. A perfect equilibrium for our two person normal form game is a pair of strategies that are optimal against each other, and also against some arbitrarily nearby totally mixed (all pure strategies are assigned positive probability) strategies. More precisely, for some "test sequence" of totally mixed strategy profiles that converges to the equilibrium profile, each component of the profile is a best response to each term of the sequence. For our game the only perfect equilibrium is (V, N).
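The test-sequence condition can be probed numerically in the valet game. The following sketch rests on assumed payoffs (the same hypothetical ordering as in the story, not values from the text) and checks one particular test sequence for each candidate profile; for (P, J) the failure does not depend on the sequence chosen, since J is strictly worse than N whenever V has positive probability.

```python
# Hypothetical payoffs, chosen only to match the ordering in the story.
DRIVER_PAYOFF = {("P", "N"): 1, ("P", "J"): 1, ("V", "N"): 2, ("V", "J"): -10}
VALET_PAYOFF = {("P", "N"): 0, ("P", "J"): 0, ("V", "N"): 1, ("V", "J"): -5}

def driver_value(d, prob_J):
    """Driver's expected payoff from d when the valet plays J with prob_J."""
    return (1 - prob_J) * DRIVER_PAYOFF[(d, "N")] + prob_J * DRIVER_PAYOFF[(d, "J")]

def valet_value(v, prob_V):
    """Valet's expected payoff from v when the driver plays V with prob_V."""
    return (1 - prob_V) * VALET_PAYOFF[("P", v)] + prob_V * VALET_PAYOFF[("V", v)]

def survives_test_sequence(d, v, prob_V_seq, prob_J_seq):
    """True if d and v are best responses to every term of the test sequence."""
    for prob_V, prob_J in zip(prob_V_seq, prob_J_seq):
        if driver_value(d, prob_J) < max(driver_value(x, prob_J) for x in "PV"):
            return False
        if valet_value(v, prob_V) < max(valet_value(x, prob_V) for x in "NJ"):
            return False
    return True

eps = [10.0 ** -k for k in range(2, 8)]
# Test sequence converging to (V, N): V and N each get probability 1 - eps.
vn_ok = survives_test_sequence("V", "N", [1 - e for e in eps], eps)
# Test sequence converging to (P, J): V gets eps and J gets 1 - eps.
pj_ok = survives_test_sequence("P", "J", eps, [1 - e for e in eps])
```

Here `vn_ok` is true and `pj_ok` is false: as soon as the valet's node is reached with positive probability, J is no longer a best response.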

A tremble for a strategic form game is an assignment of positive probabilities to the various pure strategies such that for each agent, the sum of the tremble probabilities assigned to her pure strategies is less than one. A tremble gives rise to a perturbed best response correspondence in which each agent's perturbed best responses are the expected utility maximizing mixed strategies in the set of mixed strategies that assign at least the tremble probability to each pure strategy. This perturbed best response correspondence satisfies the hypotheses of Kakutani's fixed point theorem, and its fixed points are called trembling hand perfect equilibria for the tremble. To prove the existence of a perfect equilibrium Selten took a sequence of trembles with maximum probability converging to zero, chose a fixed point of each perturbed best response correspondence, and then passed to a subsequence along which this sequence of mixed strategies converges. The limiting mixed strategy assigns positive probability only to those pure strategies that are best responses to all sufficiently nearby terms of the sequence, so it is a perfect equilibrium.
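A perturbed best response for a single agent is easy to compute once the expected payoffs of her pure strategies are known. The sketch below is a minimal illustration under stated assumptions: `payoffs` and `tremble` are hypothetical dictionaries keyed by pure strategy, and when several pure strategies tie the function returns only one extreme point of the (convex) set of perturbed best responses.

```python
def perturbed_best_response(payoffs, tremble):
    """Return one perturbed best response for a single agent.

    payoffs[s] -- expected payoff of pure strategy s against the others
    tremble[s] -- minimum probability that must be placed on s
    """
    slack = 1.0 - sum(tremble.values())
    if slack <= 0 or min(tremble.values()) <= 0:
        raise ValueError("tremble must be positive and sum to less than one")
    best = max(payoffs, key=payoffs.get)
    mixed = dict(tremble)   # every pure strategy gets its tremble probability
    mixed[best] += slack    # all remaining mass goes to a best pure strategy
    return mixed

response = perturbed_best_response(
    payoffs={"a": 1.0, "b": 3.0, "c": 2.0},
    tremble={"a": 0.01, "b": 0.02, "c": 0.03},
)
```

In the example the inferior strategies `a` and `c` receive exactly their tremble probabilities and `b` receives the rest, which is the feature the limiting argument exploits.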

There are at least two conceptually distinct ways to understand how the perfectequilibrium concept eliminates (P, J ) in our simple game. First, J is a weaklydominated strategy: it is never better than N , and sometimes it is worse. By requiringstrategies that are best responses to profiles of totally mixed strategies, perfectioneliminates all Nash equilibria in which positive probability is assigned to a weaklydominated strategy.

Second, in a somewhat indirect fashion, perfection enforces rationality not just inthe a priori sense of maximizing the expected payoff prior to the start of the game,but also in the contingent sense of requiring that behavior at each information setis rational, from the point of view of the agents’ beliefs and expectations when thatevent occurs.

What this means is clear in our example, and more generally in games of perfect information, which are extensive form games in which all information sets are singletons (there is never any hidden information). In more general extensive form games the situation is less clear. Selten gave examples suggesting that perfection should be applied to the agent normal form, which is the normal form game obtained by regarding each information set as a separate agent, rather than the classical normal form. Practical experience with simple examples quickly led practitioners into the habit of thinking in terms of beliefs when analyzing perfect equilibrium, and Kreps and Wilson (1982) introduced the notion of sequential equilibrium, which formalizes and slightly generalizes this mode of analysis.

Kreps and Wilson proved existence of a sequential equilibrium by observing that if $\pi$ is a perfect equilibrium of the agent normal form, then it is the limit of a sequence $\{\pi_r\}$ of trembling hand perfect equilibria for a sequence of trembles whose maximal probabilities go to zero. By compactness there is a subsequence such that $(\mu^{\pi_r}, \pi_r)$ converges to some $(\mu, \pi) \in \Psi$, and it is not hard to show that $(\mu, \pi)$ is sequentially rational. This argument led researchers to regard sequential equilibrium as a refinement of agent normal form Nash equilibrium that was somewhat less demanding than agent normal form perfection. While certain intuitive doubts concerning its conceptual validity were raised by Kreps and Wilson (1982), these were dispelled, and over time sequential equilibrium has come to be regarded as the leading solution concept for extensive form games, and less a refinement than something that could itself be refined.


By displaying the set of sequential equilibria as (the projection onto $\Psi$ of) the set of fixed points of $\Gamma$, we have shown that sequential equilibrium is also the analogue of Nash equilibrium for extensive games in a mathematical sense. As we will see, refinements of Nash equilibrium are typically defined by considering a class of perturbations of the best response correspondence. However, some existing refinements of sequential equilibrium can be understood as the fixed points of subcorrespondences of $\Gamma$.

A number of papers have defined refinements of perfect equilibrium by placing alternative requirements on the test sequence. Historically the first refinement of this sort is the notion of proper equilibrium proposed by Myerson (1978), which requires that for some sequence of positive numbers $\{\varepsilon_r\}$ converging to 0 there is a sequence of totally mixed strategy profiles converging to the equilibrium, and the $r$th term in the sequence is $\varepsilon_r$-proper. This means that for each agent and any two of that agent's pure strategies, if the first strategy has a higher expected payoff than the second, then the probability assigned to the second strategy is not greater than $\varepsilon_r$ times the probability assigned to the first strategy. That a given strategic form game has an $\varepsilon$-proper equilibrium for any $\varepsilon \in (0, 1)$ can be proved by considering a perturbed best response correspondence in which each agent $i$ chooses optimally from the convex hull of the set of mixed strategies that assign the probabilities $(1 - \varepsilon)/(1 - \varepsilon^{|S_i|}), (1 - \varepsilon)\varepsilon/(1 - \varepsilon^{|S_i|}), \ldots, (1 - \varepsilon)\varepsilon^{|S_i| - 1}/(1 - \varepsilon^{|S_i|})$ (the numbers $1, \varepsilon, \ldots, \varepsilon^{|S_i| - 1}$ normalized to sum to one), in some order, to the various pure strategies. (A polytope of this combinatoric type is a permutahedron; cf. Exercise 2.4 and Ziegler 1995.) To prove that the game has a proper equilibrium we choose some sequence $\{\varepsilon_r\}$ converging to 0, choose an $\varepsilon_r$-proper equilibrium for each $r$, and then pass to a convergent subsequence.
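The geometric structure of the constraint set can be made concrete. The sketch below is illustrative and uses hypothetical names: it builds the vertices of the permutahedron by assigning probabilities proportional to $1, \varepsilon, \ldots, \varepsilon^{n-1}$ (normalized to sum to one) to the pure strategies in every possible order, and it tests the $\varepsilon$-proper inequality for one agent given her expected payoffs.

```python
from itertools import permutations

def geometric_weights(n, eps):
    """Probabilities proportional to 1, eps, ..., eps**(n-1), summing to one."""
    total = sum(eps ** k for k in range(n))
    return [eps ** k / total for k in range(n)]

def permutahedron_vertices(strategies, eps):
    """All assignments of the geometric weights to the pure strategies."""
    weights = geometric_weights(len(strategies), eps)
    return [dict(zip(order, weights)) for order in permutations(strategies)]

def is_eps_proper_for_agent(mixed, payoffs, eps, tol=1e-12):
    """If s has a higher payoff than s2, then mixed[s2] <= eps * mixed[s]."""
    return all(mixed[s2] <= eps * mixed[s] + tol
               for s in mixed for s2 in mixed if payoffs[s] > payoffs[s2])

vertices = permutahedron_vertices(["a", "b", "c"], eps=0.1)
payoffs = {"a": 3.0, "b": 2.0, "c": 1.0}
```

With strictly ranked payoffs, the vertex that orders the weights the same way as the payoffs satisfies the inequality (with equality between consecutive strategies), while the vertex with the reversed order does not.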

By passing to a further subsequence, we can insist that each agent’s sequence ofstrategies has a limiting conditional system. In general a mixed strategy for an agentis probably best thought of not as that agent’s plan or intention—after all, the agentcan choose freely from the pure strategies that have highest expected utility—butrather as the common belief of the other agents concerning that agent’s behavior.Roughly speaking, properness imposes the following requirement on this commonbelief: more costly mistakes are infinitely less likely than less costly mistakes.

The notion of justifiability introduced by McLennan (1985) has a similar motivation, namely that out-of-equilibrium action choices that can be explained by confusion concerning which equilibrium is "in effect" should be much more likely than suboptimal action choices that have no such explanation. Say that an action is useless if, in every sequential equilibrium, the expected payoff for the agent who might choose it, conditional on it being chosen at its information set, is less than the equilibrium expected payoff conditional on reaching that information set. A system of beliefs $\mu$ is said to be first order justifiable if each $\mu_h$ assigns positive probability only to the nodes in $h$ that are reached with the minimum number of useless actions.

Theorem 16.3 There is a sequential equilibrium with first order justifiable beliefs.

The original proof used carefully chosen trembles. Let $L$ be the maximum number of actions on the path going from an initial node to a terminal node. Consider a sequence $\{\varepsilon_r\}$ of positive numbers converging to 0. For each $r$ let $\pi_r$ be a trembling hand perfect equilibrium for the tremble (of the agent normal form) that assigns probability $\varepsilon_r^L$ to useless actions and probability $\varepsilon_r$ to other actions. Passing to a subsequence, we may assume that $(\mu^{\pi_r}, \pi_r) \to (\mu, \pi)$. Then $(\mu, \pi)$ is a sequential equilibrium, by the argument that Kreps and Wilson used to prove existence. Due to the continuity of expected payoffs, for sufficiently large $r$ all useless actions have inferior expected payoffs and are assigned probability $\varepsilon_r^L$ by $\pi_r$. Consider an information set $h$ and nodes $x, x' \in h$, and suppose that $k$ and $k'$ are the number of useless actions required to reach $x$ and $x'$ respectively, where $k < k'$. Let $w$ and $w'$ be the initial predecessors of $x$ and $x'$. For large $r$ we have $P^{\pi_r}(x) \ge \rho(w) \times \varepsilon_r^{kL + L - k - 1}$ because at most $L - 1$ actions are chosen before $x$ occurs, and we have $P^{\pi_r}(x') \le \rho(w') \times \varepsilon_r^{k'L}$. Since $k' \ge k + 1$ implies $k'L > kL + L - k - 1$, the ratio $P^{\pi_r}(x')/P^{\pi_r}(x)$ converges to 0. Passing to the limit, we conclude that $\mu_h(x') = 0$.

Here we present an alternative argument that displays the set of sequential equilibria with first order justifiable beliefs as (the projection onto $\Psi$ of) the set of fixed points of a subcorrespondence of $\Gamma$ that is upper hemicontinuous and contractible valued. There are two preparatory results, the first of which is a point that is conceptually obvious, and whose proof (which is left to the reader) is a straightforward matter of working through the definitions.

Lemma 16.22 For $\xi \in \Xi$, $h \in H$, and $x \in h$, $\mu_h(\xi)(x) = 0$ if and only if for each $a \in A(x)$ there is an $x' \in h$ and an $a' \in A(x') \setminus A(x)$ such that $\xi(a', a) = \infty$.

Lemma 16.23 Suppose that for each $h \in H$, $\emptyset \ne B_h \subset A_h$ and $U_h \subset A_h$. Let $S$ be the set of ${\succeq} \in \mathcal{O}$ such that:

(a) �$(o_h(\succeq)) \subset B_h$ for all $h$;
(b) $a \succ b$ whenever the number of $h$ such that $b_h \in U_h \setminus B_h$ is greater than the number of $h$ such that $a_h \in U_h$.

For each $h$ and $a_h \in A_h$ let $c_h(a_h)$ be $1$ if $a_h \in B_h$, $-|H|$ if $a_h \in U_h \setminus B_h$, and $0$ otherwise. For each $h$ let $\xi^*_h$ be the element of $\Lambda^\circ(A_h)$ such that $\xi^*_h(a_h, b_h) = c_h(a_h) - c_h(b_h)$ for all $a_h, b_h \in A_h$. Let $\xi^*$ be the element of $\Xi^\circ$ such that $\xi^*(a, b) = \sum_h \xi^*_h(a_h, b_h)$ for all $a, b \in A$. Then $\xi^*$ is a star for $K_S$.

Proof Clearly $\xi^* \in K_S$. Let $\lambda$ be an element of $\Xi^\circ$, and let $\lambda_h$ for $h \in H$ be such that $\lambda(a, b) = \sum_h \lambda_h(a_h, b_h)$ for all $a, b \in A$. If the ray $\{\, \xi^* + \alpha\lambda : \alpha \in \mathbb{R}_+ \,\}$ leaves $K_S$, it first does so at an $\alpha$ such that either:

(a) for some $h$ there is an $a_h \in A_h \setminus B_h$ such that $\xi^*_h(a_h, b_h) + \alpha\lambda_h(a_h, b_h) \ge 0$ for all $b_h \in B_h$, or
(b) $\sum_h \bigl( \xi^*_h(a_h, b_h) + \alpha\lambda_h(a_h, b_h) \bigr) \le 0$ for some $a, b \in A$ such that the number of $h$ such that $b_h \in U_h \setminus B_h$ is greater than the number of $h$ such that $a_h \in U_h$.

Each $\xi^*_h(a, b) + \alpha\lambda_h(a, b)$ is an affine function of $\alpha$, so $\xi^* + \beta\lambda \notin K_S$ for all $\beta \ge \alpha$. □


Proof of Theorem 16.3 For each $h$ let $U_h$ be the set of useless elements of $A_h$. For $\xi \in \Xi$ let $Q(\xi)$ be the set of ${\succeq} \in \mathcal{O}$ such that:

(a) �$(o_h(\succeq)) \subset B_h(\gamma(\xi))$ for all $h$;
(b) $a \succ b$ whenever the number of $h$ such that $b_h \in U_h \setminus B_h(\gamma(\xi))$ is greater than the number of $h$ such that $a_h \in U_h$.

Clearly $Q(\xi)$ is closed. If $\xi_r \to \xi$, then $B_h(\gamma(\xi_r)) \subset B_h(\gamma(\xi))$ for large $r$, and actions that are inferior at $\xi$ are inferior at $\xi_r$ for large $r$, so $Q(\xi_r) \subset Q(\xi)$. Thus $Q$ is upper hemicontinuous. The last result implies that $K_{Q(\xi)}$ is star shaped.

Now Proposition 16.12 gives a $\xi \in \Xi$ such that $\xi \in \Xi_{Q(\xi)}$. Of course $\gamma(\xi)$ is a sequential equilibrium, so $B_h(\gamma(\xi)) \cap U_h = \emptyset$ for all $h$. Let ${\succeq} := R_c(\xi)$. For each information set $h$ and $x, x' \in h$, if the number of useless actions required to reach $x$ is greater than the number of useless actions required to reach $x'$, then there is an $a' \in A(x')$ such that $a' \succ a$ for all $a \in A(x)$, which implies (Lemma 16.22) that $\mu_h(\xi)(x) = 0$. Thus $\mu(\xi)$ is first order justifiable. □

An action is second order useless if it is not useless, but it gives a suboptimal expected payoff in every sequential equilibrium with first order justifiable beliefs. A system of beliefs $\mu$ is second order justifiable if each $\mu_h$ assigns positive probability only to those elements of $h$ that are reached with a minimal number of first order useless actions and, within the set of such nodes, only those that are reached with a minimal number of second order useless actions. Obviously we can continue this sequence of definitions, and since the tree is finite it will eventually cease to reduce the set of equilibria. A justifiable equilibrium is a sequential equilibrium with beliefs that satisfy the entire hierarchy of conditions. The proof in McLennan (1985) that justifiable equilibria exist is an elaboration of the argument above: we consider trembles that assign extraordinarily small probability to useless actions, extremely low probability to second order useless actions, very low probability to third order useless actions, and so forth. Exercise 16.6 asks you to define a correspondence $\Gamma_2 : \Xi \to \Xi$ whose fixed points are sequential equilibria with second order justifiable beliefs, and to prove that it is an upper hemicontinuous contractible valued correspondence.

The refinements proposed by Cho and Kreps (1987) and Banks and Sobel (1987) for signalling games are similar to justifiability insofar as they impose restrictions on beliefs that are motivated by intuitions concerning which sorts of deviations from an equilibrium are more likely. As we explained in Sect. 16.1, a signalling game has finite sets T and M of types and messages. For each type t there is a nonempty set M(t) ⊂ M of possible messages. For each m ∈ M there is a nonempty finite set R(m) of possible responses. For t ∈ T, m ∈ M(t), and r ∈ R(m), the payoffs of the sender and receiver are u(t, m, r) and v(t, m, r), respectively. There is a probability distribution ρ ∈ Δ◦(T) that represents the receiver’s prior beliefs concerning the type.

For each message m let

T (m) := { t ∈ T : m ∈ M(t) }


be the set of types that can choose m. We assume that every T(m) is nonempty. A belief is a function μ that assigns a probability measure μm ∈ Δ(T(m)) to each m. A sender strategy is a function σ that assigns a probability measure σt ∈ Δ(M(t)) to each t. We say that m is unused by σ if σt(m) = 0 for all t ∈ T(m). We say that μ is Bayes-consistent with ρ and σ if, for every m that is not unused and every t ∈ T(m),

\[
\mu_m(t) = \frac{\rho(t) \times \sigma_t(m)}{\sum_{t' \in T(m)} \rho(t') \times \sigma_{t'}(m)} .
\]
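As a concrete illustration of Bayes-consistency, the following Python sketch computes μm from ρ and σ with exact rational arithmetic. The function name `bayes_belief`, the dictionary encoding of ρ, M, and σ, and the two-type example are assumptions made for illustration only; none of them comes from the text.

```python
from fractions import Fraction as Fr

def bayes_belief(rho, M, sigma, m):
    """Return the posterior mu_m over T(m), or None when m is unused."""
    types_m = [t for t in rho if m in M[t]]  # T(m): types that can choose m
    # Numerators rho(t) * sigma_t(m) of the Bayes formula.
    weights = {t: rho[t] * sigma[t].get(m, Fr(0)) for t in types_m}
    total = sum(weights.values())
    if total == 0:
        return None  # m is unused, so Bayes-consistency imposes no restriction
    return {t: w / total for t, w in weights.items()}

# Hypothetical example: two types, two messages available to both.
rho = {"t1": Fr(1, 10), "t2": Fr(9, 10)}
M = {"t1": {"m1", "m2"}, "t2": {"m1", "m2"}}
sigma = {"t1": {"m1": Fr(1, 2), "m2": Fr(1, 2)}, "t2": {"m1": Fr(1)}}

posterior_m1 = bayes_belief(rho, M, sigma, "m1")
posterior_m2 = bayes_belief(rho, M, sigma, "m2")
```

In this example only type t1 ever sends m2, so the posterior after m2 puts all weight on t1, while the posterior after m1 is tilted toward t2 relative to the prior.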

We define expected utilities in the obvious way. If t ∈ T, m ∈ M(t), and β ∈ Δ(R(m)), then

\[
u(t, m, \beta) := \sum_{r \in R(m)} \beta(r)\, u(t, m, r),
\]

and if m ∈ M, r ∈ R(m), and α ∈ Δ(T(m)), then

\[
v(\alpha, m, r) := \sum_{t} \alpha(t)\, v(t, m, r).
\]

A receiver strategy is a function τ that assigns a probability measure τm ∈ Δ(R(m)) to each m. For a type t, a receiver strategy τ, a message m, and α ∈ Δ(T(m)) the sets of pure best responses are

\[
A_t(\tau) := \operatorname*{argmax}_{m \in M(t)} u(t, m, \tau_m)
\quad\text{and}\quad
B_m(\alpha) := \operatorname*{argmax}_{r \in R(m)} v(\alpha, m, r) .
\]

In this context a sequential equilibrium is a triple (μ, σ, τ ) such that:

(a) the belief μ is Bayes-consistent with ρ and σ;

(b) for each t ∈ T, σt ∈ Δ(At(τ));

(c) for each m ∈ M, τm ∈ Δ(Bm(μm)).

For a receiver strategy τ and a message m let

\[
T^*_m(\tau) := \{\, t \in T(m) : m \in A_t(\tau) \,\} .
\]

Fix a specification of a probability measure νm ∈ Δ(T (m)) for each m.

Proposition 16.13 There is a sequential equilibrium (μ, σ, τ) such that for each unused m, μm is in the convex hull of {νm} ∪ Δ(T*_m(τ)).

This can be proved by taking a sequence of approximate equilibria for an appropriate sequence of trembles. For each n = 1, 2, ... we take positive numbers εn(t, m) for all t and m such that m ∈ M(t). For each t we require that

\[
\sum_{m \in M(t)} \varepsilon_n(t, m) < 1 \ \text{ for all } n,
\quad\text{and}\quad
\sum_{m \in M(t)} \varepsilon_n(t, m) \to 0 .
\]

For each m and t ∈ T(m) we require that

\[
\frac{\rho(t)\,\varepsilon_n(t, m)}{\sum_{t' \in T(m)} \rho(t')\,\varepsilon_n(t', m)} \to \nu_m(t) .
\]

For each n let (μ^n, σ^n, τ^n) be a fixed point of the perturbed best response correspondence in which each t is required to assign at least probability εn(t, m) to each m. After passing to a subsequence, we may assume that (μ^n, σ^n, τ^n) converges to a sequential equilibrium (μ, σ, τ), and it is easy to see that the condition in question is satisfied.
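The requirements on the trembles can be met in many ways. The sketch below shows one possible choice, which is our own and not taken from the text: set εn(t, m) = δn (νm(t) + δn) / ρ(t), where δn shrinks like 1/n. The names `trembles` and `induced_belief`, the scaling constant in δn, and the example data are all hypothetical.

```python
from fractions import Fraction as Fr

def trembles(rho, M, nu, n):
    """One admissible choice of eps_n(t, m); delta shrinks like 1/n."""
    messages = set().union(*M.values())
    # Scaling chosen so that each type's trembles sum to at most 3/(4n) < 1.
    delta = min(rho.values()) / (2 * len(messages) * n)
    # rho(t) * eps_n(t, m) = delta * (nu_m(t) + delta), positive even if nu_m(t) = 0.
    return {(t, m): delta * (nu[m].get(t, Fr(0)) + delta) / rho[t]
            for t in rho for m in M[t]}

def induced_belief(rho, M, eps, m):
    """The ratio rho(t) eps(t, m) / sum over t' of rho(t') eps(t', m), for t in T(m)."""
    types_m = [t for t in rho if m in M[t]]
    total = sum(rho[t] * eps[(t, m)] for t in types_m)
    return {t: rho[t] * eps[(t, m)] / total for t in types_m}

# Hypothetical data: the target belief nu at m1 puts all weight on t1.
rho = {"t1": Fr(1, 4), "t2": Fr(3, 4)}
M = {"t1": {"m1", "m2"}, "t2": {"m1", "m2"}}
nu = {"m1": {"t1": Fr(1)}, "m2": {"t1": Fr(1, 2), "t2": Fr(1, 2)}}

eps = trembles(rho, M, nu, 1000)
belief_m1 = induced_belief(rho, M, eps, "m1")
```

With this choice the induced belief at m is (νm(t) + δn) / (1 + |T(m)| δn), which converges to νm(t) as δn → 0 while every tremble stays strictly positive.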


Possibly consistent conditional systems offer an alternative method of proof. We adopt the following notational convention: a consistent conditional system ξ determines a belief μ and strategies σ and τ, a conditional system ξ′ determines a belief μ′ and strategies σ′ and τ′, etc. Note that the belief μ of a consistent conditional system is Bayes-consistent with ρ and the system’s sender strategy σ. For a conditional system ξ let Γν(ξ) be the set of ξ′ ∈ � such that:

(a) for each m, μ′m is in the convex hull of {νm} ∪ Δ(T*_m(τ));

(b) for each t, σ′t ∈ Δ(At(τ));

(c) for each m, τ′m ∈ Δ(Bm(μm)).

Since μ, σ, and τ are continuous functions of ξ, it is obvious that Γν is an upper hemicontinuous correspondence. Any fixed point gives a sequential equilibrium satisfying the condition in the last result. If Γν were contractible valued, the Eilenberg–Montgomery theorem would imply the existence of a fixed point. While it seems reasonable to conjecture that Γν(ξ) is necessarily contractible, my attempts to prove this have not been successful.

We now briefly explain the set valued solution concepts first introduced by Kohlberg and Mertens (1986). Working in the context of strategic form games, they defined a stable set to be a closed subset of the set of Nash equilibria such that:

(a) for any neighborhood U of the set there is an ε > 0 such that for any tremble that assigns probability less than ε to every pure strategy, the perturbed best response correspondence has a fixed point in U, and

(b) no closed proper subset satisfies (a).

Many of the variants of this concept that have appeared in the literature are described, in relation to each other, by Hillas et al. (2001). For our purposes it is most convenient to work with the version in which “closed subset” is replaced by “closed and connected” subset, so by a stable set we will mean a closed and connected set of Nash equilibria that is robust with respect to trembles, as described in (a), and which does not have a closed and connected proper subset that also satisfies (a). Section 7.5 and the problems for Chap. 7 outline the basic theory concerning why such sets exist.

In general it is difficult to interpret a set of equilibria as a “theory” of how the game is played, but for extensive form games there is a result of Kreps and Wilson (1982) that dovetails nicely with the stability concept. The path of a behavior strategy profile π is the distribution it induces on Z. Evidently the path is determined by the components πh for the information sets h that occur with positive probability when play is governed by π. In particular, if (μ, π) is a sequential equilibrium, then the restriction of π to the tree obtained by eliminating all actions that are assigned probability zero, and all nodes that follow such actions, is a totally mixed Nash equilibrium (in behavior strategies) of the truncated game. Exercise 16.3 asks you to prove that for generic payoffs, all totally mixed Nash equilibria are regular, and to argue that this implies that for generic payoffs there are finitely many paths of Nash equilibria of the agent normal form, and consequently there are finitely many paths induced by sequential equilibria.


The map from behavior strategy profiles to paths is continuous, so the collection of preimages of the various Nash equilibrium paths is a partition of the set of agent normal form Nash equilibria into closed sets. For generic payoffs this partition is finite, so Kinoshita’s theorem (Theorem 7.2) implies that one of these sets, say N, is essential. An argument that we have seen several times (make sure you can reconstruct it) shows that an essential set of Nash equilibria is robust with respect to trembles in the sense of (a) above.

If there are finitely many sequential equilibrium paths, the preimages of these paths are a finite partition of the set of fixed points of Γ, and at least one of these preimages has nonzero index. If each Γν were contractible valued, we could conclude that there is some equilibrium path such that for every ν, there is a fixed point of Γν with that path.

We now assume that our signalling game has finitely many equilibrium paths, and we fix one whose associated set of agent normal form Nash equilibria satisfies (a) above. This path consists of a sender strategy σ and a τm ∈ Δ(R(m)) for each m that is not unused. For each t ∈ T let u*_t be t’s equilibrium expected payoff. We have u(t, m, τm) ≤ u*_t for all m ∈ M(t), and this inequality holds with equality for all m such that σt(m) > 0. For each m that is not unused, σ induces a Bayesian posterior belief μm ∈ Δ(T(m)), and τm ∈ Δ(Bm(μm)). For any specification of a νm ∈ Δ(T(m)) for each m, we may take sequences of positive numbers εn(t, m) as above, and for any neighborhood of the set of sequential equilibria with the given path, for large enough n there will be a fixed point (μ^n, σ^n, τ^n) of the perturbed best response correspondence in that neighborhood. A convergent subsequence converges to a sequential equilibrium (μ, σ, τ) with the given path such that for each unused m, μm is in the convex hull of {νm} ∪ Δ(T*_m(τ)), where now T*_m(τ) = { t ∈ T(m) : u(t, m, τm) = u*_t }.

It turns out that this condition exhausts the consequences of stability for generic signalling games. The precise statement of this result (which was established both by Cho and Kreps and by Banks and Sobel) is as follows. There is an open dense set of payoffs such that if the game’s vector of payoffs is in this set and σ together with τm for m that are not unused is an equilibrium path, then the set of equilibria with this path satisfies (a) above if the following holds: for each unused m and each νm ∈ Δ(T(m)) there is a T* ⊂ T(m), a μm in the convex hull of {νm} ∪ Δ(T*), and a τm ∈ Δ(Bm(μm)) such that u(t, m, τm) ≤ u*_t for all t ∈ T(m) and u(t, m, τm) = u*_t for all t ∈ T*.

We now describe how these results can be used to prove the existence of sequential equilibria satisfying strategically motivated restrictions on the receiver’s beliefs at unused messages. For m ∈ M let A^1_m(τ) be the set of t ∈ T(m) such that there is some α ∈ Δ(T(m)) and r ∈ Bm(α) such that u(t, m, r) ≥ u*(t, τ). Cho and Kreps (1987) say that the sequential equilibrium path fails the intuitive criterion if there is a message m and a t ∈ A^1_m(τ) such that u(t, m, r) > u*(t, τ) for all α ∈ Δ(A^1_m(τ)) and r ∈ Bm(α). They argue that choosing m can, in effect, be regarded as a speech: “Any type outside of A^1_m(τ) will certainly lose by choosing m, and since at least one type in A^1_m(τ) will certainly gain if you believe that m was chosen by a type in that set, your belief should assign all probability to that set.” In particular, the equilibrium of the game in Fig. 16.1 in which both the weak and the strong sender listen to classical music fails the intuitive criterion because the weak sender cannot possibly gain by switching to the blues, but the strong sender can if the receiver accepts this logic.

One can go further in this direction. In addition to the type t above, suppose there is a t′ ∈ A^1_m(τ) that will get a worse-than-equilibrium expected payoff from choosing m so long as the receiver best responds to some belief in Δ(A^1_m(τ)). The logic of the speech suggests that the receiver’s beliefs should assign no probability to this type. Continuing this reasoning leads to the following iterative elimination procedure. Let

\[
A^0_m(\tau) := T(m)
\quad\text{and}\quad
B^0_m(\tau) := \bigcup_{\alpha \in \Delta(A^0_m(\tau))} B_m(\alpha) .
\]

For a positive integer n, if A^{n−1}_m(τ) and B^{n−1}_m(τ) have already been defined, let

\[
\tilde{A}^n_m(\tau) := \{\, t \in A^{n-1}_m(\tau) : u(t, m, r) \ge u^*(t, \tau) \text{ for some } r \in B^{n-1}_m(\tau) \,\} ,
\]

let A^n_m(τ) := Ã^n_m(τ) if Ã^n_m(τ) ≠ ∅, and otherwise let A^n_m(τ) := A^{n−1}_m(τ), and let

\[
B^n_m(\tau) := \bigcup_{\alpha \in \Delta(A^n_m(\tau))} B_m(\alpha) .
\]

Since there are finitely many types and responses, this process stabilizes in the sense that there are A*_m(τ) and B*_m(τ) such that A^n_m(τ) = A*_m(τ) and B^n_m(τ) = B*_m(τ) for all sufficiently large n. Consider a νm ∈ Δ(A*_m(τ)). If the set of Nash equilibria mapping to the path satisfies (a) above, then there exists a sequential equilibrium with the given path such that μm is in the convex hull of {νm} ∪ Δ(T*_m(τ)). We have τm ∈ Δ(B^0_m(τ)), so T*_m(τ) ⊂ A^1_m(τ). Therefore μm ∈ Δ(A^1_m(τ)) and consequently τm ∈ Δ(B^1_m(τ)). But now we see that T*_m(τ) ⊂ A^2_m(τ), so μm ∈ Δ(A^2_m(τ)) and τm ∈ Δ(B^2_m(τ)). Continuing inductively proves that:

Theorem 16.4 If there are finitely many sequential equilibrium paths, then there is a sequential equilibrium (μ, σ, τ) such that for all unused m, μm ∈ Δ(A*_m(τ)).

Since the requirement that μm ∈ Δ(A*_m(τ)) is much stronger than the intuitive criterion, it is to at least some extent less well justified by strategic intuitions. Cho and Kreps (1987), Banks and Sobel (1987), and other sources give careful and detailed consideration to the question of whether various similar conditions on beliefs are intuitively compelling.
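The iterative elimination procedure for a single unused message m can be sketched in Python as follows. This is only an approximation under stated assumptions: the union of Bm(α) over all beliefs α is replaced by a union over a finite grid of beliefs with denominators k, so a response that is a best response only at off-grid beliefs would be missed (an exact version would solve a linear feasibility problem for each response). The function names, the grid, and the two-type example with its payoffs are hypothetical and are not the game of Fig. 16.1.

```python
from fractions import Fraction as Fr
from itertools import product

def simplex_grid(types, k):
    """Beliefs on `types` whose probabilities are multiples of 1/k."""
    types = list(types)
    for counts in product(range(k + 1), repeat=len(types)):
        if sum(counts) == k:
            yield {t: Fr(c, k) for t, c in zip(types, counts)}

def best_responses(v, responses, alpha):
    """Pure best responses B_m(alpha) of the receiver to the belief alpha."""
    value = {r: sum(p * v[(t, r)] for t, p in alpha.items()) for r in responses}
    top = max(value.values())
    return {r for r in responses if value[r] == top}

def eliminate(types, responses, u, v, u_star, k=20):
    """Grid approximation of (A*_m, B*_m) for one fixed message m."""
    A = set(types)  # A^0_m = T(m)
    while True:
        # B^{n-1}_m: responses that are best against some grid belief on A^{n-1}_m.
        B = set().union(*(best_responses(v, responses, alpha)
                          for alpha in simplex_grid(sorted(A), k)))
        # Types that could do at least as well as in equilibrium.
        A_tilde = {t for t in A if any(u[(t, r)] >= u_star[t] for r in B)}
        A_next = A_tilde if A_tilde else A
        if A_next == A:
            return A, B
        A = A_next

# Hypothetical payoffs at the unused message (sender u, receiver v).
types, responses = ["weak", "strong"], ["fight", "yield"]
u = {("weak", "fight"): 0, ("weak", "yield"): 2,
     ("strong", "fight"): 1, ("strong", "yield"): 3}
v = {("weak", "fight"): 1, ("weak", "yield"): 0,
     ("strong", "fight"): 0, ("strong", "yield"): 1}
u_star = {"weak": 3, "strong": 2}  # assumed equilibrium payoffs

A_star, B_star = eliminate(types, responses, u, v, u_star)
```

In this example the weak type cannot reach its equilibrium payoff at the unused message, so it is eliminated in the first round, after which the receiver’s only remaining best response is to yield.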

The application of consistent conditional systems to the theory of refinements was first considered as this book was being completed. Already we have seen an interesting open problem, and the general possibilities and limitations of this approach are, at this point, almost completely unexplored.


Exercises

16.1 Let G = (T, ≺, H, (Ah)h∈H, I, ι, ρ, u) be an extensive form game. Fixing t ∈ T and a mixed strategy profile σ, write the probability that t occurs, when play is governed by σ, as the probability of the initial predecessor of t times the product over agents i of the probability that i plays a pure strategy that allows t to occur. Write the probability that i plays a pure strategy that allows t as the product, over predecessors t′ of t in information sets at which i chooses the action, of the probability, conditional on allowing t′, of choosing the action that leads toward t. Argue that if the game satisfies perfect recall, then this conditional probability is the same as the probability of choosing the action that leads to t conditional on allowing η(t′) to occur. Prove Kuhn’s theorem, as asserted in Sect. 16.3.

16.2 Let G = (T, ≺, H, (Ah)h∈H, I, ι, ρ, u) be an extensive form game with perfect recall, and fix i ∈ I. The personal decision tree of i is the pair (Ti, ≺i) with the following description. The set of nodes is Ti := {oi} ∪ Hi ∪ Ai ∪ Z where oi is an artificial initial node. The personal precedence relation ≺i has the following cumbersome but natural description:

• oi precedes all other elements of Ti;
• h ≺i h′ if P(x′) ∩ h ≠ ∅ for some (hence all, by perfect recall) x′ ∈ h′;
• h ≺i a if h = ha or h ≺i ha;
• a ≺i h if a ∈ α(P(x)) for some (hence all, by perfect recall) x ∈ h;
• a ≺i a′ if a ≺i ha′;
• z ≺i h if P(z) ∩ h ≠ ∅;
• z ≺i a if a ∈ α(P(z)).

Let (μ, π) be an interior consistent assessment.

(a) Show that the expected payoff of i, conditional on any ti ∈ Ti, is the sum, over immediate successors of ti in (Ti, ≺i), of the probability of transitioning to the successor times the expected payoff conditional on the successor.

(b) Show that the probability of transitioning from a ∈ Ai to an immediate successor in (Ti, ≺i) does not depend on πi.

(c) Prove that if a (not necessarily interior) consistent assessment is myopically rational, then it is sequentially rational.

16.3 (This continues from the last exercise.)

(a) Let (μ, π) be an interior consistent assessment. Using i’s personal decision tree, show that the set of ui ∈ R^Z such that E_{μ,π|a}(ui | h) = E_{μ,π|a′}(ui | h) for all h ∈ Hi and a, a′ ∈ Ah is a linear subspace of dimension |Z| + |Hi| − |Ai|.

(b) Let M be the set of triples (u, μ, π) such that (μ, π) is an interior consistent assessment that is a sequential equilibrium for the payoff profile u ∈ (R^Z)^I. Prove that M is an (|I| × |Z|)-dimensional C∞ manifold.

(c) Applying Sard’s theorem to the projection (u, μ, π) ↦ u of M onto U, prove that for generic payoff profiles u there are finitely many interior sequential equilibria.


(d) The path of a behavior strategy is the induced distribution on Z: the path assigns probability Pπ(z) to each z. Prove that a sequential equilibrium (μ, π) “projects” to an interior sequential equilibrium of the truncated extensive game obtained by eliminating all parts of the extensive form that have zero probability under the path of π.

(e) Prove that for generic payoff profiles u there are finitely many distributions on Z that are paths of sequential equilibria.

16.4 Holding the combinatoric data (T, ≺, H, (Ah)h∈H, I, ι) of an extensive game of perfect recall fixed, prove that the correspondence that takes each initial assessment-payoff pair (ρ, u) to the set of sequential equilibria of the extensive game (T, ≺, H, (Ah)h∈H, I, ι, ρ, u) is upper hemicontinuous.

16.5 (McLennan (1985)) Find the set of sequential equilibria of the game below. Which have beliefs that are first order justifiable? Which have beliefs that are second order justifiable?

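[Figure: extensive form game tree for Exercise 16.5. Labels recovered from the figure: players 1 and 2; action labels N, E, S, W and m, r; terminal payoff labels 22, 33, 00, 00, 00, 00, 11, 00, 04, 40.]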

16.6 Sketch two different proofs that our given extensive form game has a sequential equilibrium with second order justifiable beliefs.

(a) Give a sequence of trembles εr for the agent normal form such that if, for each r, π^r is an εr-perfect equilibrium and (μ^{π^r}, π^r) → (μ, π), then μ is second order justifiable.

(b) As in Lemma 16.23, define an appropriate set of orderings S, and show that KS is star-shaped.

16.7 Devise a signalling game with two messages, one of which has only one response while the other has three responses, such that all sequential equilibria satisfy the intuitive criterion, but some equilibrium does not survive the iterative elimination procedure described in Sect. 16.9.

Chapter 17
Monotone Equilibria

Our second economic application is the work of Reny (2011) on the existence of monotone pure strategy equilibria in Bayesian games, which generalizes earlier results of Athey (2001) and McAdams (2003). It was selected primarily because it applies the Eilenberg-Montgomery fixed point theorem to correspondences that are contractible valued, but it is appropriate in other ways as well. We will see a number of mathematical structures that play important roles in contemporary mathematical economics. In addition, Reny applies a theorem of Dugundji (1965) giving sufficient conditions for a space to be an ANR, and the development of this result will greatly deepen our knowledge of ANR’s.

17.1 Monotone Comparative Statics

Many of the most important assertions in economics concern how choices change when underlying parameters change. In this chapter we are going to be concerned with conditions that imply that a bidder in an auction increases her bid if her information improves, or if the set of available choices increases in a certain sense. Both for the set of available choices, and for the subset of optimal choices, we will consider a particular ordering of the set of subsets of a lattice.

Let X be a set that is partially ordered by a relation ≽. (Recall that this means that ≽ is transitive and antisymmetric: x ≽ y and y ≽ x if and only if x = y.) For any nonempty set S ⊂ X, a point x is an upper bound of S if x ≽ s for all s ∈ S, and it is a least upper bound if it is an upper bound and x′ ≽ x whenever x′ is also an upper bound. Similarly, a point x is a lower bound of S if x ≼ s for all s ∈ S, and it is a greatest lower bound if it is a lower bound and for any other lower bound x′, x′ ≼ x. Because ≽ is antisymmetric there is at most one least upper bound of S, which is denoted by ∨S if it exists, and at most one greatest lower bound, denoted by ∧S if it exists.



For x, y ∈ X, the least upper bound and greatest lower bound of {x, y} are (if they exist) called the join and meet of x and y, and are denoted by x ∨ y and x ∧ y. If x ∨ y and x ∧ y are defined for all x, y ∈ X, then X is a lattice. We assume that this is the case going forward. If every nonempty subset of X has a least upper bound and a greatest lower bound, then X is a complete lattice.

Lattices occur frequently in many areas of mathematics, but there are two examples that stand out as prototypes for the concept. The set of subsets of any set ordered by containment, and [0, 1]^N with the coordinatewise partial ordering, are both complete lattices. Topological concepts give rise to many examples (e.g., the set of open subsets of a topological space) of lattices that are not complete.

Evidently ∨ and ∧ are associative and commutative. They also satisfy the absorption laws:

x ∨ (x ∧ y) = x and x ∧ (x ∨ y) = x .

To see the first of these note that since x ∧ y ≼ x, x is an upper bound of {x, x ∧ y}, and if z is another upper bound, then x ≼ z, so x is the least upper bound. The proof of the second absorption law is similar. Each of the idempotent laws is proved by two applications of absorption:

x ∨ x = x ∨ (x ∧ (x ∨ x)) = x and x ∧ x = x ∧ (x ∨ (x ∧ x)) = x .

Associativity, commutativity, and absorption provide an axiomatic characterization of lattices: given binary operations ∨ and ∧ satisfying these conditions, one can define a partial order for which ∨ and ∧ are the least upper bound and greatest lower bound operators.¹

For S, S′ ⊂ X, S ≤ S′ if x ∧ x′ ∈ S and x ∨ x′ ∈ S′ for all x ∈ S and x′ ∈ S′. This relation is called the strong set ordering. Insofar as we wish to compare sets of best responses, this relation will play an important role in our work, and we now establish its main properties.

Lemma 17.1 Let S1, S2, and S3 be nonempty subsets of X. If S1 ≤ S2 and S2 ≤ S3, then S1 ≤ S3.

Proof Choose x1 ∈ S1, x2 ∈ S2, and x3 ∈ S3. By absorption x3 = (x2 ∧ x3) ∨ x3. The hypotheses give x2 ∧ x3 ∈ S2, then x1 ∨ (x2 ∧ x3) ∈ S2, and finally x1 ∨ x3 = x1 ∨ (x2 ∧ x3) ∨ x3 ∈ S3. The proof that x1 ∧ x3 ∈ S1 is similar. □

Lemma 17.2 If S1, S2 ⊂ X, S1 ≠ ∅ ≠ S2, S1 ≤ S2, and S2 ≤ S1, then S1 = S2.

Proof Choose x1 ∈ S1 and x2 ∈ S2. We have x1 ∧ x2 ∈ S2 because S2 ≤ S1 and x1 ∨ (x1 ∧ x2) ∈ S2 because S1 ≤ S2, so x1 ∈ S2 by absorption. The proof that x2 ∈ S1 is similar. □

¹ Suppose that ∨ and ∧ are binary operations on X that satisfy associativity, commutativity, and absorption. We specify that x ≽ y if x ∨ y = x. If x ∨ y = x, then y = y ∧ (y ∨ x) = y ∧ x, and if x ∧ y = y, then x = x ∨ (x ∧ y) = x ∨ y, so x ≽ y if and only if x ∧ y = y. If x ≽ y and y ≽ z, then x ∨ z = (x ∨ y) ∨ z = x ∨ (y ∨ z) = x ∨ y = x, so this relation is transitive. If x ≽ y and y ≽ x, then x = x ∨ y = y. On the other hand the idempotent laws imply that x ≽ x. Thus ≽ is a partial order. Associativity and idempotence imply that x ∨ (x ∨ y) = x ∨ y, so x ∨ y ≽ x, and x ∨ y ≽ y by symmetry, so x ∨ y is an upper bound of {x, y}. If z is another upper bound of this set, then (x ∨ y) ∨ z = x ∨ (y ∨ z) = x ∨ z = z, so z ≽ x ∨ y. Thus x ∨ y is the least upper bound of {x, y}. The proof that x ∧ y is the greatest lower bound of {x, y} is similar.

A subset S ⊂ X is a sublattice if, for all x, y ∈ S, x ∧ y and x ∨ y are elements of S. If S is nonempty, S ≤ S if and only if S is a sublattice. Therefore the strong set ordering is a partial order of the nonempty sublattices of X, but it is not a partial ordering of the set of all nonempty subsets of X. Also, S ≤ ∅ ≤ S for any S ⊂ X.
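The strong set ordering is easy to experiment with on a small lattice. The sketch below takes X to be pairs of integers with the coordinatewise order, so that the meet and join are the coordinatewise minimum and maximum; the function names and the example sets are our own choices for illustration.

```python
def meet(x, y):
    """Coordinatewise minimum: the greatest lower bound of {x, y}."""
    return tuple(map(min, x, y))

def join(x, y):
    """Coordinatewise maximum: the least upper bound of {x, y}."""
    return tuple(map(max, x, y))

def strong_set_leq(S, S2):
    """S <= S2 in the strong set ordering."""
    return all(meet(x, y) in S and join(x, y) in S2 for x in S for y in S2)

def is_sublattice(S):
    return all(meet(x, y) in S and join(x, y) in S for x in S for y in S)

square = {(0, 0), (1, 0), (0, 1), (1, 1)}   # a sublattice
antichain = {(1, 0), (0, 1)}                # not a sublattice
low, high = {(0, 0)}, {(1, 1)}
```

Here `strong_set_leq(square, square)` holds while `strong_set_leq(antichain, antichain)` fails, matching the observation that S ≤ S exactly when S is a sublattice, and the empty set is comparable to everything in both directions because the defining condition is vacuous.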

Now let A be a lattice of actions, let T be a partially ordered set of types, and let u : A × T → R be a function. For t ∈ T and S ⊂ A let

\[
M(t, S) := \operatorname*{argmax}_{a \in S} u(a, t) .
\]

We say that M(t, S) is monotone (nondecreasing) if M(t, S) ≤ M(t′, S′) whenever t ≼ t′, S ≤ S′, and M(t, S) ≠ ∅ ≠ M(t′, S′). There are two economically natural conditions whose conjunction is equivalent to monotonicity.

We say that u satisfies the weak single crossing property (WSCP) if, for all a, a′ ∈ A with a ≽ a′ and all t, t′ ∈ T with t ≽ t′,

u(a, t′) ≥ u(a′, t′) implies that u(a, t) ≥ u(a′, t) .

That is, if a is already at least as good as a′ at t′, it remains at least as good when the parameter increases to t. We say that u satisfies the single crossing property (SCP) if it satisfies the WSCP and, for all a, a′ ∈ A with a ≽ a′ and all t, t′ ∈ T with t ≽ t′,

u(a′, t) ≥ u(a, t) implies that u(a′, t′) ≥ u(a, t′) .

Insofar as P ⇒ Q is equivalent to ¬Q ⇒ ¬P, this is the same as u(a, t′) > u(a′, t′) implying u(a, t) > u(a′, t). That is, if a is already better than a′ at t′, it remains better when the parameter increases to t.

We say that u satisfies increasing differences if

u(a, t) − u(a′, t) ≥ u(a, t ′) − u(a′, t ′)

for all a, a′ ∈ A with a ≽ a′ and all t, t′ ∈ T with t ≽ t′. Evidently increasing differences implies the SCP, and this is typically the way the SCP arises in economic models.

If X is a lattice, a function v : X → R is quasisupermodular if, for all x, y ∈ X :

(a) v(x) ≥ v(x ∧ y) implies that v(x ∨ y) ≥ v(y), and

(b) v(y) ≥ v(x ∨ y) implies that v(x ∧ y) ≥ v(x).

Note that if X is completely ordered, so that x ∨ y = y and x ∧ y = x whenever x ≼ y, then v is automatically quasisupermodular.

The following result subsumes a great many comparative statics results that had appeared in earlier literature.


Theorem 17.1 (Milgrom and Shannon 1994) M(t, S) is monotone if and only if u satisfies the SCP and, for each t ∈ T, u(·, t) is quasisupermodular.

Proof First suppose that u satisfies the SCP and each u(·, t) is quasisupermodular. Suppose that t ≼ t′, S ≤ S′, a ∈ M(t, S), and a′ ∈ M(t′, S′). Since S ≤ S′, a ∧ a′ ∈ S and a ∨ a′ ∈ S′. Since u(a, t) ≥ u(a ∧ a′, t), quasisupermodularity implies that u(a ∨ a′, t) ≥ u(a′, t), and the SCP gives u(a ∨ a′, t′) ≥ u(a′, t′), so a ∨ a′ ∈ M(t′, S′) because a′ ∈ M(t′, S′). Since u(a′, t′) ≥ u(a ∨ a′, t′), quasisupermodularity implies that u(a ∧ a′, t′) ≥ u(a, t′), and the SCP gives u(a ∧ a′, t) ≥ u(a, t), so a ∧ a′ ∈ M(t, S) because a ∈ M(t, S). Thus M(t, S) is monotone.

Now suppose that M(t, S) is monotone. Consider a, a′ ∈ A with a ≽ a′ and t, t′ ∈ T with t ≽ t′. Let S := {a, a′}. Suppose that u(a, t′) ≥ u(a′, t′), so that a ∈ M(t′, S). It cannot be the case that M(t, S) = {a′} because monotonicity would imply that a = a ∨ a′ ∈ M(t, S). Therefore a ∈ M(t, S), so u(a, t) ≥ u(a′, t). Thus u satisfies the WSCP. Now suppose that u(a, t′) > u(a′, t′), so that M(t′, S) = {a}. It cannot be the case that a′ ∈ M(t, S) because monotonicity would imply that a′ = a ∧ a′ ∈ M(t′, S). Therefore M(t, S) = {a}, so u(a, t) > u(a′, t). Thus u satisfies the SCP.

Now fix t ∈ T. For any a, a′ ∈ A consider S := {a, a ∧ a′} and S′ := {a ∨ a′, a′}. Note that (a ∧ a′) ∨ (a ∨ a′) = ((a ∧ a′) ∨ a) ∨ a′ = a ∨ a′ and (a ∧ a′) ∧ (a ∨ a′) = a ∧ (a′ ∧ (a ∨ a′)) = a ∧ a′. We can verify that S ≤ S′ by checking each operation for each pair of elements, finding in each case that the desired inclusion is automatic or follows from absorption or idempotence. Suppose that u(a, t) ≥ u(a ∧ a′, t), so a ∈ M(t, S). Either a ∨ a′ ∈ M(t, S′) or a′ ∈ M(t, S′), and a ∨ (a ∨ a′) = a ∨ a′, so in either case a ∨ a′ ∈ M(t, S′) because M(t, S) ≤ M(t, S′). Therefore u(a ∨ a′, t) ≥ u(a′, t). Now suppose that u(a′, t) ≥ u(a ∨ a′, t), so a′ ∈ M(t, S′). Either a ∈ M(t, S) or a ∧ a′ ∈ M(t, S), and a ∧ (a ∧ a′) = a ∧ a′, so in either case a ∧ a′ ∈ M(t, S) because M(t, S) ≤ M(t, S′). Therefore u(a ∧ a′, t) ≥ u(a, t). We have verified that u(·, t) is quasisupermodular. □
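A small numerical check may help fix ideas. In the sketch below the action set is a chain of integers, so every u(·, t) is automatically quasisupermodular and the meet and join are min and max. The payoff u(a, t) = at − a² has increasing differences, hence satisfies the SCP, while u_bad(a, t) = −at does not. Both payoff functions, the sets S and S′, and the function names are hypothetical examples, and the check covers only one pair S ≤ S′ rather than all such pairs.

```python
def argmax(u, t, S):
    """M(t, S): the maximizers of u(., t) on the finite set S."""
    best = max(u(a, t) for a in S)
    return {a for a in S if u(a, t) == best}

def strong_leq(S, S2):
    """Strong set ordering on a chain, where meet = min and join = max."""
    return all(min(x, y) in S and max(x, y) in S2 for x in S for y in S2)

def is_monotone(u, T, S, S2):
    """Check M(t, S) <= M(t2, S2) for all t <= t2 in T, for this one pair S <= S2."""
    return all(strong_leq(argmax(u, t, S), argmax(u, t2, S2))
               for t in T for t2 in T if t <= t2)

def u(a, t):
    return a * t - a * a      # increasing differences in (a, t)

def u_bad(a, t):
    return -a * t             # differences decrease in t, so the SCP fails

T = range(9)
S, S2 = {0, 1, 2}, {1, 2, 3, 4}   # S <= S2 in the strong set ordering

monotone = is_monotone(u, T, S, S2)
monotone_bad = is_monotone(u_bad, T, S, S2)
```

For u_bad the failure already appears at t = 0 and t′ = 1: every action in S is optimal at t = 0, but only the smallest action in S′ is optimal at t′ = 1, so the join of 2 and 1 is not in M(t′, S′).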

We are now going to indulge in a brief detour from this chapter’s agenda in order to present the Tarski (1955) fixed point theorem. The underlying mathematics is quite different from the topological fixed point principle, so in one sense this is also a detour from the book’s larger agenda, but this fixed point theorem has important applications in economic theory, so from a different point of view it is a very appropriate topic.

If X is a partially ordered set, for x, x ′ ∈ X let:

(−∞, x′] := { x′′ ∈ X : x′′ ≼ x′ },   [x, ∞) := { x′′ ∈ X : x ≼ x′′ },

[x, x′] := (−∞, x′] ∩ [x, ∞) .

These sets are called order intervals. If X is a complete lattice, then any order interval contains the greatest lower bound and least upper bound of each of its nonempty subsets, so it (with the restricted partial order) is also a complete lattice.

If Y and Z are partially ordered sets, a function f : Y → Z is monotone if f(y) ≼ f(y′) for all y, y′ ∈ Y such that y ≼ y′.


Theorem 17.2 (Tarski 1955) (Fixed Point Theorem) If X is a complete lattice and f : X → X is monotone, then the set F(f) of fixed points of f is nonempty and is (with the restricted partial order) also a complete lattice.

Proof Let D := { x ∈ X : x ≼ f(x) }. Since ∧X ≼ f(∧X), D is nonempty. Let u := ∨D. For all x ∈ D we have x ≼ u and thus x ≼ f(x) ≼ f(u), which is to say that f(u) is an upper bound for D. But u is the least upper bound, so u ≼ f(u). On the other hand monotonicity gives f(u) ≼ f(f(u)), so f(u) ∈ D and thus f(u) ≼ u. Therefore u ∈ F(f). Since F(f) ⊂ D, u is the greatest fixed point of f. A symmetric argument shows that there is also a least fixed point.

To show that F(f) is a complete lattice consider a nonempty S ⊂ F(f). Let b := ∨S. For any x ∈ S we have x = f(x) ≼ f(b), so f(b) is an upper bound of S and therefore b ≼ f(b). For any x ∈ [b, ∞) we have b ≼ f(b) ≼ f(x), so f maps [b, ∞) to itself. Since [b, ∞) is a complete lattice, the first part of the proof implies that the restriction of f to this set has a least fixed point. This is an element of F(f) that is an upper bound of S, and is less than any other element of F(f) that is an upper bound of S, which is exactly what we need. A symmetric argument shows that S has a greatest lower bound in F(f). □

There is an important subtlety here. The Tarski fixed point theorem does not assert that F(f) is a sublattice of X. Any nonempty subset of F(f) has a least upper bound and a greatest lower bound for the restriction of the ordering to F(f), but these may not be the least upper bound and greatest lower bound in X.

Zhou (1994) provides a generalization of Tarski’s theorem for monotonic (in an appropriate sense) correspondences. For a simple proof and additional details see Echenique (2005).
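Both Tarski’s theorem and the subtlety noted above can be seen in a small finite example. The sketch below is not the argument of the proof: it uses the standard fact that on a finite lattice, iterating a monotone map from the bottom element reaches the least fixed point and iterating from the top reaches the greatest. The lattice {0, 1, 2, 3}² with the coordinatewise order, the set C, and the map f (which sends x to the least element of C above it) are our own hypothetical choices.

```python
from itertools import product

X = list(product(range(4), repeat=2))   # the lattice {0,1,2,3}^2

def leq(x, y):
    """Coordinatewise order."""
    return all(a <= b for a, b in zip(x, y))

def join(x, y):
    """Least upper bound in X."""
    return tuple(map(max, x, y))

C = [(1, 1), (2, 1), (1, 2), (3, 3)]

def f(x):
    """Least element of C that lies above x (a monotone map with F(f) = C)."""
    above = [c for c in C if leq(x, c)]
    return next(c for c in above if all(leq(c, d) for d in above))

def iterate(g, x):
    """Apply g repeatedly until a fixed point is reached."""
    while g(x) != x:
        x = g(x)
    return x

is_monotone = all(leq(f(x), f(y)) for x in X for y in X if leq(x, y))
fixed = [x for x in X if f(x) == x]
least = iterate(f, (0, 0))        # start from the bottom of X
greatest = iterate(f, (3, 3))     # start from the top of X

# Upper bounds of {(2,1), (1,2)} among the fixed points, and the least of them.
ubs = [z for z in fixed if leq((2, 1), z) and leq((1, 2), z)]
join_in_F = next(z for z in ubs if all(leq(z, w) for w in ubs))
join_in_X = join((2, 1), (1, 2))
```

The fixed points (2, 1) and (1, 2) have join (2, 2) in X, which is not a fixed point, while their least upper bound within F(f) is (3, 3). So F(f) is a complete lattice in its own right without being a sublattice of X.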

17.2 Motivation: A Bit of Auction Theory

We set the stage by briefly describing the simplest possible first price auction. A single object is to be sold. There are N bidders i = 1, ..., N. A random process generates a vector t = (t1, ..., tN) of types. Each bidder observes her own type, but does not observe the type of any other bidder. They simultaneously submit sealed bids a1, ..., aN ∈ [0, 1]. The high bidder wins the object and pays her bid.

At first we will assume that each bidder’s type is the value of the object to the bidder, so the winning bidder i receives a net profit of ti − ai and all other bidders receive 0. We will also assume that the types are independently and identically distributed, with a distribution described by a cumulative distribution function² F : [0, 1] → [0, 1]. We assume that F(0) = 0 (there is no mass point at 0) and F is C¹ with probability density function f(t) := F′(t) > 0 for all t.

A bidding strategy is a function s : [0, 1] → [0, 1] that specifies how much to bid as a function of one’s value. At this point we’ll be content to look for an equilibrium that is symmetric, in the sense that all bidders are following the same strategy s, which we will assume is C¹ with a positive first derivative everywhere. Suppose that your signal is t and you bid a. The probability that you win the auction is F(s⁻¹(a))^{N−1}, which is the probability that each other bidder has a signal that leads them to bid less than a. Your expected surplus is your surplus when you win times the probability that you win, which is (t − a) F(s⁻¹(a))^{N−1}.

² That is, F(t) is the probability that ti ≤ t.

The first order condition for optimization is

\[
0 = \frac{(N-1)(t-a)\,F(s^{-1}(a))^{N-2}\, f(s^{-1}(a))}{s'(s^{-1}(a))} - F(s^{-1}(a))^{N-1} ,
\]

which simplifies to

\[
s'(s^{-1}(a)) = \frac{(N-1)(t-a)\, f(s^{-1}(a))}{F(s^{-1}(a))} .
\]

In a symmetric equilibrium it is optimal for you to follow the same strategy that everyone else is using, so this condition should hold when a = s(t), which gives

\[
s'(t) = \frac{(N-1)(t - s(t))\, f(t)}{F(t)} . \tag{17.1}
\]

Fundamental results concerning ordinary differential equations imply that if f is Lipschitz, then there is a unique solution of this equation that also satisfies the initial condition s(0) = 0, which is clearly a property of any optimal strategy. (You never want to bid more than your value, but when your value is very small you should still bid something, if only because everybody else’s value might be below your bid.) Very often this will have no closed form solution, but for the uniform distribution (F(t) = t and f(t) = 1) one can easily check that the solution is

\[
s(t) = \frac{N-1}{N}\, t .
\]
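For the uniform case the check is short: s′(t) = (N − 1)/N, and (N − 1)(t − s(t))/t = (N − 1)(t/N)/t = (N − 1)/N, so (17.1) holds. The same conclusion can be confirmed numerically by a crude grid search over bids, as in the following sketch. The function names, the grid resolution, and the values N = 4 and t = 0.8 are assumptions chosen for illustration.

```python
def expected_surplus(t, a, N):
    """Surplus from bidding a with value t when the other N-1 bidders use
    s(t') = (N-1) t'/N and values are uniform on [0, 1]."""
    # s^{-1}(a) = N a / (N-1), capped at 1 because values never exceed 1.
    win_prob = min(1.0, N * a / (N - 1)) ** (N - 1)
    return (t - a) * win_prob

def best_bid_on_grid(t, N, steps=10000):
    """Grid search for the surplus-maximizing bid in [0, 1]."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda a: expected_surplus(t, a, N))

N, t = 4, 0.8
a_star = best_bid_on_grid(t, N)
predicted = (N - 1) / N * t   # the bid prescribed by s
```

Up to the grid resolution, `a_star` agrees with `predicted`, which is consistent with s being a best response to itself.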

This very special case already yields many important insights concerning, for example, efficiency (the auction is efficient because the object always goes to the agent who values it the most) and the auctioneer’s revenue as a function of the number of bidders. It is the beginning of a huge body of theory that is surveyed in the books Krishna (2010), Menezes and Monteiro (2005), Milgrom (2004), among many other places. Here we will focus on the extreme simplification that was involved in making the analysis work, and the paths that subsequent research has found that allow for greater generality.

The assumption that the value of the object to an agent depends only on that agent’s signal is described by the phrase “private values.” It seems reasonable for consumer goods that will not be resold, for example wine, but there are clearly a great many settings where it is inappropriate. The phrase “common value” describes a situation in which the value of the object is the same for all agents, and depends


jointly on the profile of types. For example, the value of the right to explore a certain tract of land for oil depends only on how much oil there is and how easy it is to extract (all companies have pretty much the same technology) but different companies may have different information concerning that amount. There are also many settings that mix private and common values. For instance, someone bidding on a Rembrandt may be concerned both about how much it appeals to their personal taste and the amount they can hope to get when it is resold later.

The assumption that the agents’ types are statistically independent is obviously unrealistic in many settings, and it is extreme in the sense of suppressing any information the types might provide concerning other agents’ types and the likely intensity of competition. How can we relax this assumption without destroying the model’s tractability?

It is almost inconceivable that there could be a tractable model in which the agents’ equilibrium strategies were not injective, because the “effective competition” that a bid of $a$ faces would change discontinuously at those $a$ that were local maxima or local minima of other agents’ bidding functions. So, we must do whatever is necessary to allow equilibrium with monotonic strategies. When the agents’ types are correlated, an increase in an agent’s type affects her beliefs concerning the other agents’ types, which may influence her estimate of the object’s value to her, and her type also affects her expectations concerning what other agents are likely to bid. These considerations are echoed in her estimation of what others are likely to be thinking about these questions, what others are thinking about what others are thinking, and so forth.

Milgrom and Roberts (1982) propose a model that is well behaved because all these repercussions of an increase in an agent’s type act in the same direction. We now distinguish between an agent’s type $t_i$ and her utility of winning the object, which is a function $u_i(t)$ of the entire vector of types $t = (t_1, \ldots, t_N)$. Milgrom and Roberts assume that there is a function $u : [0,1]^N \to \mathbb{R}$ that is symmetric in its last $N-1$ arguments such that for each agent $i$, $u_i(t) = u(t_i, \{t_j\}_{j \ne i})$. That is, all agents have the same utility function, which depends on their own type and the types of the other agents, but not on which other agent has which other type.

For the sake of simplicity let’s suppose that the distribution of $t$ is characterized by a continuous density function$^3$ $f : [0,1]^N \to \mathbb{R}_+$. Then the distribution of $t$ is symmetric if $f(t) = f(t_{\sigma(1)}, \ldots, t_{\sigma(N)})$ for all $t \in [0,1]^N$ and all permutations $\sigma$ of $\{1, \ldots, N\}$. For $t, t' \in [0,1]^N$, let

$$t \vee t' := (\max\{t_1, t'_1\}, \ldots, \max\{t_N, t'_N\}) \quad\text{and}\quad t \wedge t' := (\min\{t_1, t'_1\}, \ldots, \min\{t_N, t'_N\}).$$

The distribution of $t$ is affiliated if, for all $t, t' \in [0,1]^N$,

$$f(t \vee t')\,f(t \wedge t') \ge f(t)\,f(t').$$

3 That is, for any Borel set $E \subset [0,1]^N$, the probability that $t \in E$ is $\int_E f(t)\,dt$. (Borel sets and integration are defined precisely in the next section.)


To develop intuition we consider the case $N = 2$. Suppose that $t_1 < t'_1$ and $t_2 < t'_2$. We can rewrite the inequality above as

$$\frac{f(t'_1, t'_2)}{f(t'_1, t_2)} \ge \frac{f(t_1, t'_2)}{f(t_1, t_2)}.$$

Letting $f(t_2|t_1) := f(t_1, t_2)/\int f(t_1, t_2)\,dt_2$ be the density of the probability distribution of $t_2$ conditional on $t_1$, this becomes

$$\frac{f(t'_2|t'_1)}{f(t_2|t'_1)} \ge \frac{f(t'_2|t_1)}{f(t_2|t_1)}.$$

That is, as $t_1$ increases, agent 1’s beliefs about $t_2$ increase in a very strong sense. In particular, it implies that the distribution of $t_2$ conditional on $t'_1$ first order stochastically dominates the distribution of $t_2$ conditional on $t_1$: for any $\bar t_2$ the probability conditional on $t'_1$ that $t_2 \ge \bar t_2$ is at least as large as the probability conditional on $t_1$ that $t_2 \ge \bar t_2$.
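To make the definition concrete, here is a minimal Python sketch that checks the affiliation inequality on a grid for one hypothetical density on $[0,1]^2$, proportional to $1 + t_1 t_2$, and compares conditional tail probabilities of $t_2$ for a low and a high value of $t_1$. The density, the grid, and all function names are assumptions chosen for illustration, and a grid check is only a numerical spot check, not a proof.

```python
import itertools

def density(t1, t2):
    """Hypothetical density on [0,1]^2: (1 + t1*t2) times the normalizing constant 4/5."""
    return (1.0 + t1 * t2) * 4.0 / 5.0

def is_affiliated(f, grid, tol=1e-12):
    """Check f(t v t') f(t ^ t') >= f(t) f(t') for all pairs of grid points."""
    points = list(itertools.product(grid, repeat=2))
    for t, u in itertools.product(points, repeat=2):
        join = tuple(max(a, b) for a, b in zip(t, u))
        meet = tuple(min(a, b) for a, b in zip(t, u))
        if f(*join) * f(*meet) < f(*t) * f(*u) - tol:
            return False
    return True

def conditional_tail(f, t1, cutoff, steps=2000):
    """Approximate the probability that t2 >= cutoff given t1, using midpoint sums."""
    h = 1.0 / steps
    mids = [(k + 0.5) * h for k in range(steps)]
    total = sum(f(t1, m) for m in mids)
    tail = sum(f(t1, m) for m in mids if m >= cutoff)
    return tail / total

grid = [k / 10 for k in range(11)]
affiliated = is_affiliated(density, grid)
tails_low = [conditional_tail(density, 0.2, c) for c in grid]   # conditioning on t1 = 0.2
tails_high = [conditional_tail(density, 0.8, c) for c in grid]  # conditioning on t1 = 0.8
```

For this density the grid check succeeds and each tail probability conditional on the higher $t_1$ is at least as large as the corresponding one conditional on the lower $t_1$, as first order stochastic dominance requires. Replacing the density by one proportional to $2 - t_1 t_2$ makes the check fail, for instance at $t = (0,1)$ and $t' = (1,0)$.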

Milgrom and Roberts gave an explicit construction of a symmetric equilibrium bidding strategy. Naturally, we are interested in whether the existence of such an equilibrium can be understood as a consequence of a fixed point theorem, and this proved to be a critical issue for the subsequent development of the literature. A key question is whether each agent has a monotone best response when all other agents are following a monotone strategy, and this raises the more general question of when an optimizing choice varies monotonically when a parameter of the choice problem varies. As we have seen, Milgrom and Shannon (1994) give a unified and general treatment of this question.

Athey (2001) proved the existence of monotone equilibrium when each agent’s sets of types and actions are subsets of $\mathbb{R}$ and a suitable version of the single crossing property is satisfied. (There are of course various other technical hypotheses.) In particular, she does not assume that the game is symmetric, and consequently in equilibrium the agents will in general be using different monotone strategies.

In many auction settings multiple units are sold. The goods might be homogeneous, like treasury bills, or heterogeneous, such as in auctions for the right to use certain parts of the electromagnetic spectrum, in certain locations. In such settings the types can easily be multidimensional, if only because a bidder may attach different values to different quantities of a homogeneous good, or different packages of heterogeneous goods. The bids will also typically be multidimensional. In view of these examples (and there are many other settings of interest to economists) it would be desirable to extend the result to a setting in which the types and actions are multidimensional. This was accomplished by McAdams (2003). Again, a key step in the argument is to show that there are monotone best responses to monotone strategies.

In Athey and McAdams a great deal of effort goes into showing that the best response correspondence is convex valued. Reny points out that this effort is unnecessary, because the best response correspondence is easily shown to be contractible


valued. This observation allows significant generalization, since we can work with quite general spaces of actions and types, which may be infinite dimensional. In addition, the required assumptions on the distribution of types are less restrictive, there is some flexibility concerning the choice of the orderings of the types and actions, and the space of actions need not have all the properties of a lattice.

Of course this additional generality creates its own technical burden. Section 17.3 introduces semilattices and develops the required theory. After presenting required material concerning measure and integration in the subsequent section, Sect. 17.5 describes the interaction between the partial ordering of the space of types and its measure theoretic structure. The space of monotone pure strategies is studied in Sect. 17.6. The game and the existence theorem are presented in Sect. 17.7, and Sect. 17.8 gives conditions on the game under which the hypotheses of the existence theorem hold.

In order to apply the Eilenberg-Montgomery fixed point theorem one must show that the space is an AR. It is quite easy to show that the space of monotone pure strategies satisfies the hypotheses of a result of Dugundji that implies that this is the case. However, this result is a consequence of other results of Dugundji that are quite deep, and quite significant because they provide concrete characterizations of ANRs. The final three sections of the chapter develop this material.

17.3 Semilattices

A pure strategy is a function from the space of types to the space of actions. We will need to be able to form a new strategy from two given strategies by taking the pointwise least upper bound, so this operation needs to be defined. We do not need greatest lower bounds, so we are able to work with a more general structure than a lattice.

Let $A$ be a set that is partially ordered by $\precsim$. We say that $A$ is a semilattice if every pair of points $a, b \in A$ has a least upper bound $a \vee b$. If this is the case, then for any $a, b, c \in A$, $(a \vee b) \vee c = a \vee (b \vee c)$ because both are least upper bounds of $\{a, b, c\}$. Throughout this section we assume that $A$ is a semilattice. A semilattice is complete if every nonempty subset has a least upper bound. A subset $S \subset A$ is a subsemilattice if $a \vee b \in S$ for all $a, b \in S$. We assume that $A$ is also a metric space, and that the partial order is closed, which is to say that $\{ (a, b) \in A \times A : a \precsim b \}$ is closed.

We say that $A$ is a metric semilattice if $(a, b) \mapsto a \vee b$ is continuous. Since $\{ (a, b) : a \succsim b \} = \{ (a, b) : a \vee b = a \}$, this condition implies that the partial order is closed. But a semilattice can have a metric with respect to which the partial order is closed and not be a metric semilattice. For example, if

$$A = \{ a \in \mathbb{R}^2_+ : a_1 + a_2 = 1 \} \cup \{(1, 1)\}$$


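with the coordinatewise partial order, then $a \vee b$ is either $a$ or $(1, 1)$ according to whether $a = b$. In particular, $a \vee b = (1, 1)$ for points $b \ne a$ arbitrarily close to $a$, so the join is not continuous.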
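The failure of continuity in this example can be seen in a minimal Python sketch, given here under the assumption that points of $A$ are represented as pairs of exact rationals. The helper names `in_A`, `leq`, and `join` are illustrative and do not come from the text.

```python
from fractions import Fraction

TOP = (Fraction(1), Fraction(1))

def in_A(a):
    """Membership in A: the segment a1 + a2 = 1 in the nonnegative quadrant, plus (1, 1)."""
    return a == TOP or (a[0] >= 0 and a[1] >= 0 and a[0] + a[1] == 1)

def leq(a, b):
    """Coordinatewise partial order."""
    return a[0] <= b[0] and a[1] <= b[1]

def join(a, b):
    """Least upper bound in A. Distinct points of the segment are incomparable,
    so the only element of A above both of them is TOP."""
    assert in_A(a) and in_A(b)
    if leq(a, b):
        return b
    if leq(b, a):
        return a
    return TOP

a = (Fraction(1, 2), Fraction(1, 2))
# A sequence b_n in A converging to a, with b_n != a for every n.
sequence = [(Fraction(1, 2) + Fraction(1, n + 2), Fraction(1, 2) - Fraction(1, n + 2))
            for n in range(1, 50)]
joins = [join(a, b) for b in sequence]
```

Every entry of `joins` is $(1, 1)$ even though the sequence converges to $a$ and $a \vee a = a$, so $(a, b) \mapsto a \vee b$ is discontinuous at $(a, a)$.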

We say that $A$ is locally complete if, for every $a \in A$ and every neighborhood $U$ of $a$, there is a neighborhood $W$ of $a$ such that every nonempty $S \subset W$ has a least upper bound and $\vee S \in U$. The example considered above is complete but not locally complete.

The remainder of the section develops a number of basic technical results. For the next three results let $\{a_n\}$, $\{b_n\}$, and $\{c_n\}$ be sequences in $A$.

Lemma 17.3 If $A$ is compact, $a_n \precsim b_n \precsim c_n$ for all $n$, and $a_n \to z$ and $c_n \to z$, then $b_n \to z$.

Proof If not, then $\{b_n\}$ has a subsequence that stays outside some neighborhood of $z$ and a further subsequence that converges to some $z' \ne z$. But then $z \precsim z' \precsim z$ because $a_n \precsim b_n \precsim c_n$ for all $n$ and $\precsim$ is closed, so that $z' = z$ after all. □

Lemma 17.4 If $A$ is compact and $a_n \precsim a_{n+1}$ for all $n$ or $a_n \succsim a_{n+1}$ for all $n$, then $\{a_n\}$ is convergent.

Proof If $a$ and $a'$ are limit points, then there are subsequences $\{a_{m_k}\}$ and $\{a_{n_k}\}$ with $a_{m_k} \to a$, $a_{n_k} \to a'$, and $a_{m_k} \precsim a_{n_k}$ for all $k$, so $a \precsim a'$ because $\precsim$ is closed. Symmetrically $a' \precsim a$, so $a = a'$. □

Lemma 17.5 If $A$ is compact, $S \subset A$, and $\{a_1, a_2, \ldots\}$ is a countable dense subset of $S$, then $a_1 \vee \cdots \vee a_n \to \vee S$.

Proof The last result implies that the sequence $\{a_1 \vee \cdots \vee a_n\}$ converges to some point $a$. Since $\precsim$ is closed, $a \succsim a_m$ for every $m$. In addition, $a \succsim b$ for every $b \in S$ because $b$ is the limit of a sequence in $\{a_1, a_2, \ldots\}$. Finally, if $c$ is any upper bound of $S$, then $c \succsim a_1 \vee \cdots \vee a_n$ for all $n$, so $c \succsim a$. □

Corollary 17.1 If the partial order is closed and $S$ is a compact subsemilattice, then it is a complete semilattice.

Proof Any nonempty subset of $S$ has a countable dense subset. □

Lemma 17.6 If $A$ is a locally complete metric semilattice and $\{a_k\}$ is a sequence in $A$ converging to $a$ and $\bar a_n := a_n \vee a_{n+1} \vee \cdots$, then $\bar a_n \to a$.

Proof For any neighborhood $U$ of $a$ local completeness gives a neighborhood $W$ such that every nonempty subset of $W$ has a least upper bound in $U$. Since $\{a_k\}$ is eventually in $W$, $\{\bar a_n\}$ is eventually in $U$. □

Proposition 17.1 If $A$ is a locally complete metric semilattice, $C_1 \supset C_2 \supset \ldots$ is a descending sequence of nonempty compact subsets of $A$, $C = \bigcap C_n$, and $a := \vee C$, then $\lim_n \vee C_n = a$.


Proof Let $\{U_m\}$ be a sequence of neighborhoods of $a$ that is eventually inside any other neighborhood. For each $m$ let $W_m$ be a neighborhood of $a$ such that for every nonempty $S \subset W_m$, $\vee S \in U_m$, and let $b_m := \vee W_m$. Let $\{c_1, c_2, \ldots\}$ be a countable dense subset of $C$.

We claim that for each $c \in C$ and $m$ there is a neighborhood $V_{m,c}$ of $c$ such that $\vee V_{m,c} \precsim b_m$. Lemma 17.5 implies that $c \vee c_1 \vee \cdots \vee c_k \to a$. Therefore, for each $m$, $c \vee c_1 \vee \cdots \vee c_k \in W_m$ for large $k$. Continuity of $\vee$ implies that $c' \vee c_1 \vee \cdots \vee c_k \in W_m$ when $c'$ is sufficiently close to $c$.

For each $m$ let $V_m := \bigcup_{c \in C} V_{m,c}$. This is a neighborhood of $C$, so $C_{n_m} \subset V_m$ for sufficiently large $n_m$, and $a \precsim \vee C_{n_m} \precsim b_m$. Since $b_m \to a$, Lemma 17.3 implies that $\vee C_{n_m} \to a$. Lemma 17.4 implies that $\{\vee C_n\}$ is convergent, so $\vee C_n \to a$. □

17.4 Measure and Integration

This section presents a brief exposition of precisely that part of the theory of measure and integration that will be needed in the rest of the chapter. It is included primarily in order to have certain basic concepts and results at hand. For readers who have not studied this material in other contexts it may serve some expositional purpose, providing a first taste of the main concepts in a somewhat simplified context, which may help prepare for a more systematic study of this large and important subject. Most readers will have already learned this topic, and can skip over this section initially, referring to it later as needed.

A set $\mathcal{S}$ of subsets of a set $S$ is a σ-algebra for $S$ if it contains $S$ itself, the complement of any of its elements, and the union $\bigcup_n E_n$ of the elements of any countable collection $\{E_1, E_2, \ldots\} \subset \mathcal{S}$. Let $\mathcal{Z}$ be a collection of subsets of $S$. The smallest σ-algebra containing $\mathcal{Z}$ is the σ-algebra generated by $\mathcal{Z}$, which is denoted by $\sigma(\mathcal{Z})$. (The collection of all subsets of $S$ is a σ-algebra, and the intersection of any collection of σ-algebras is a σ-algebra, so this notion is well defined.) The Borel σ-algebra of a topological space is the σ-algebra generated by the open sets. Elements of the Borel σ-algebra are called Borel sets. Unless some other possibility is mentioned explicitly, topological spaces, especially including Euclidean spaces and $[0, \infty]$, will be automatically endowed with their Borel σ-algebras.

There is a technical device that can facilitate the verification that a collection of subsets of $S$ is a σ-algebra. We say that a collection $\mathcal{D}$ of subsets of $S$ is a Dynkin system if:

(a) $S \in \mathcal{D}$;

(b) $A \setminus B \in \mathcal{D}$ whenever $A, B \in \mathcal{D}$ and $B \subset A$;

(c) $\bigcup_n A_n \in \mathcal{D}$ whenever $A_1, A_2, \ldots$ are elements of $\mathcal{D}$ with $A_1 \subset A_2 \subset \cdots$.

The collection of all subsets of $S$ is obviously a Dynkin system, and it is easy to verify that the intersection of any collection of Dynkin systems is a Dynkin system, so we may define the Dynkin system generated by $\mathcal{Z}$ to be the smallest Dynkin system containing $\mathcal{Z}$.


We say that $\mathcal{Z}$ is a π-system if it contains all finite intersections of its elements.

Proposition 17.2 (Dynkin’s Lemma) If $\mathcal{Z}$ is a π-system, then the Dynkin system $\mathcal{D}$ generated by $\mathcal{Z}$ is $\sigma(\mathcal{Z})$.

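Proof Since a σ-algebra is a Dynkin system, $\mathcal{D} \subset \sigma(\mathcal{Z})$. The claim will follow if we show that $\mathcal{D}$ is a σ-algebra. By (a) and (b), $\mathcal{D}$ contains $S$ and the complement of each of its elements, so we need to show that $\mathcal{D}$ contains countable unions of its elements. For this it suffices to show that $\mathcal{D}$ is a π-system, because it then contains finite unions of its elements (a finite union of sets is the complement of the intersection of the complements) and for any $A_1, A_2, \ldots \in \mathcal{D}$ it contains each $A_1 \cup \cdots \cup A_n$ and thus $\bigcup_n A_n$ by (c).

Our objective is to show that

$$\mathcal{D}' := \{ A \in \mathcal{D} : A \cap D \in \mathcal{D} \text{ for all } D \in \mathcal{D} \}$$

is all of $\mathcal{D}$. A fortiori $S \in \mathcal{D}'$. In view of the identities $(A \setminus B) \cap D = (A \cap D) \setminus (B \cap D)$ and $(\bigcup A_n) \cap D = \bigcup (A_n \cap D)$, $\mathcal{D}'$ is a Dynkin system. Since $\mathcal{D}$ is the minimal Dynkin system containing $\mathcal{Z}$, it suffices to show that $\mathcal{Z} \subset \mathcal{D}'$, which is to say that

$$\mathcal{D}'' := \{ A \in \mathcal{D} : A \cap Z \in \mathcal{D} \text{ for all } Z \in \mathcal{Z} \}$$

is all of $\mathcal{D}$. Since $S \in \mathcal{D}$ and $\mathcal{Z} \subset \mathcal{D}$, $S \in \mathcal{D}''$. In view of the identities $(A \setminus B) \cap Z = (A \cap Z) \setminus (B \cap Z)$ and $(\bigcup A_n) \cap Z = \bigcup (A_n \cap Z)$, $\mathcal{D}''$ is a Dynkin system. Since $\mathcal{Z}$ is a π-system contained in $\mathcal{D}$, $\mathcal{Z} \subset \mathcal{D}''$, so it is indeed the case that $\mathcal{D}'' = \mathcal{D}$. □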
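Dynkin’s lemma can be explored by brute force on a small finite set. The following minimal Python sketch computes the σ-algebra and the Dynkin system generated by a family of subsets of $\{1, 2, 3, 4\}$. The function names and the two example families are assumptions made for illustration. On a finite set every increasing sequence of sets is eventually constant, so condition (c) imposes nothing and only (a) and (b) need to be enforced.

```python
from itertools import combinations

def sigma_algebra(universe, generators):
    """Smallest family containing the generators that is closed under complements and unions."""
    family = {frozenset(universe)} | {frozenset(g) for g in generators}
    while True:
        new = {frozenset(universe) - a for a in family}
        new |= {a | b for a, b in combinations(family, 2)}
        if new <= family:
            return family
        family |= new

def dynkin_system(universe, generators):
    """Smallest family containing the universe and the generators that is closed
    under differences A minus B with B a subset of A (conditions (a) and (b))."""
    family = {frozenset(universe)} | {frozenset(g) for g in generators}
    while True:
        new = {a - b for a in family for b in family if b <= a}
        if new <= family:
            return family
        family |= new

universe = {1, 2, 3, 4}
pi_system = [{1, 2}, {2, 3}, {2}]   # closed under intersection
not_pi_system = [{1, 2}, {2, 3}]    # the intersection {2} is missing
```

For the π-system the two generated families coincide, as the proposition asserts. For the second family they do not: the Dynkin system has only six elements, while the σ-algebra consists of all sixteen subsets, which shows that the π-system hypothesis cannot simply be dropped.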

A measurable space is a pair $(S, \mathcal{S})$ where $S$ is a set and $\mathcal{S}$ is a σ-algebra for $S$. Let $(S, \mathcal{S})$ and $(T, \mathcal{T})$ be measurable spaces, and let $f : S \to T$ be a function. This function is measurable if $f^{-1}(E) \in \mathcal{S}$ for all $E \in \mathcal{T}$. Evidently compositions of measurable functions are measurable, and the identity function on a measurable space is measurable, so measurable spaces and measurable functions constitute a category.

If the defining condition is satisfied by the elements of a generating set, then $f$ is measurable.

Lemma 17.7 If $\sigma(\mathcal{W}) = \mathcal{T}$ and $f^{-1}(W) \in \mathcal{S}$ for all $W \in \mathcal{W}$, then $f$ is measurable.

Proof The set of $E \subset T$ such that $f^{-1}(E) \in \mathcal{S}$ contains $T$ and all complements and countable unions of its elements. (For any $E, E_1, E_2, \ldots \subset T$, $f^{-1}(E^c) = f^{-1}(E)^c$ and $f^{-1}(\bigcup_n E_n) = \bigcup_n f^{-1}(E_n)$.) Thus it is a σ-algebra that contains all elements of $\mathcal{W}$, so it contains $\sigma(\mathcal{W}) = \mathcal{T}$. □

Corollary 17.2 A continuous function is measurable with respect to the Borel σ-algebras of the domain and range.

Proposition 17.3 If $(A, d)$ is a metric space, $g_1, g_2, \ldots : S \to A$ is a sequence of measurable functions that converges pointwise, and $g$ is the pointwise limit of $\{g_m\}$, then $g$ is measurable.


Proof Let $\mathcal{A}$ be the set of $E \subset A$ such that $g^{-1}(E) \in \mathcal{S}$. Clearly $\mathcal{A}$ contains $A$ itself and complements and countable unions of its elements. Since the open balls around points of $A$ generate the Borel σ-algebra, it suffices to show that they are in $\mathcal{A}$. For any $a \in A$ and $\varepsilon > 0$ the elements of $g^{-1}(U_\varepsilon(a))$ are those $s$ such that $\{g_m(s)\}$ is eventually in some smaller ball centered at $a$, so

$$g^{-1}(U_\varepsilon(a)) = \bigcup_{r \in (0,\varepsilon) \cap \mathbb{Q}} \; \bigcup_{M=1}^{\infty} \; \bigcap_{m \ge M} g_m^{-1}(U_r(a)),$$

which is in the σ-algebra generated by the $g_m^{-1}(U_r(a))$, and thus in $\mathcal{S}$. □

The product σ-algebra $\mathcal{S} \times \mathcal{T}$ is the smallest σ-algebra for $S \times T$ that contains all products $E \times F$ of sets $E \in \mathcal{S}$ and $F \in \mathcal{T}$.

Lemma 17.8 If $(R, \mathcal{R})$ is a third measurable space and $f : R \to S$ and $g : R \to T$ are measurable, then $r \mapsto (f(r), g(r))$ is measurable.

Proof For any $E \in \mathcal{S}$ and $F \in \mathcal{T}$,

$$\{ r \in R : (f(r), g(r)) \in E \times F \} = f^{-1}(E) \cap g^{-1}(F),$$

so this is a consequence of Lemma 17.7. □

The fact that compositions of measurable functions are measurable, the last result, and Corollary 17.2 imply that sums, products, and other continuous combinations of real valued measurable functions are measurable.

A nonnegative countably additive measure on $(S, \mathcal{S})$ is a function $\mu : \mathcal{S} \to [0, \infty]$ such that $\mu(\bigcup_k E_k) = \sum_k \mu(E_k)$ whenever $\{E_1, E_2, \ldots\} \subset \mathcal{S}$ is countable and the $E_k$ are pairwise disjoint. (Addition is extended to $[0, \infty]$ in the obvious manner.) Since we will never consider any other sort of measure, henceforth the term “measure” will mean a nonnegative countably additive measure. A measure space is a triple $(S, \mathcal{S}, \mu)$ in which $(S, \mathcal{S})$ is a measurable space and $\mu$ is a measure on $(S, \mathcal{S})$.

One often wishes to ignore phenomena that are restricted to a set of measure zero. If $(S, \mathcal{S}, \mu)$ is a measure space, a property of points of $S$ that holds at every point in the complement of some $E \in \mathcal{S}$ with $\mu(E) = 0$ is said to hold almost everywhere. This is often abbreviated to a.e. or μ-a.e. if the discussion involves multiple measures.

The measure space $(S, \mathcal{S}, \mu)$ is complete if every subset of a set of measure zero is an element of $\mathcal{S}$. It turns out that $\mu$ has a unique extension to the σ-algebra generated by $\mathcal{S}$ and the sets of μ-measure zero. (E.g., Sect. 3.3 of Dudley 1989.) For most purposes little is lost if one replaces $\mu$ with this extension. Therefore the simplifying assumption of completeness is common.

Before moving on we should mention some types of measures that will be important in what happens later, even though for some of them there are no theoretical points we wish to make now. The measure $\mu$ is atomless if, for every $E \in \mathcal{S}$ with $\mu(E) > 0$, there is an $F \in \mathcal{S}$ with $F \subset E$ and $0 < \mu(F) < \mu(E)$. It is finite if $\mu(S) < \infty$, and it is a probability measure if $\mu(S) = 1$, in which case the measure space $(S, \mathcal{S}, \mu)$ is a probability space. In this case a property of points of $S$ is said to hold almost surely (often abbreviated as a.s. or μ-a.s.) if it holds almost everywhere.

We now introduce integration. Fix a measure space $(S, \mathcal{S}, \mu)$. A function $g : S \to \mathbb{R}_+$ is simple if it is measurable and it takes on finitely many values. The integral of such a function is

$$\int g\,d\mu := \sum_{v \in \mathbb{R}_+} v \times \mu(g^{-1}(v)).$$

The integral of a measurable $f : S \to [0, \infty]$ is

$$\int f\,d\mu := \sup\Big\{ \int g\,d\mu : g : S \to \mathbb{R}_+ \text{ is simple and } g \le f \Big\}.$$

The function $f$ is integrable if $\int f\,d\mu < \infty$. For $E \in \mathcal{S}$ let $1_E$ be the function that is 1 on $E$ and 0 elsewhere. We write $\int_E f\,d\mu$ in place of $\int 1_E f\,d\mu$.

There are now some very basic results.

Lemma 17.9 For any measurable $f : S \to [0, \infty]$ there is a sequence $f_1, f_2, \ldots$ of simple functions such that for each $s \in S$, $\{f_n(s)\}$ is an increasing sequence that converges to $f(s)$. For any such $\{f_n\}$, $\int f_n\,d\mu \to \int f\,d\mu$.

Proof A satisfactory sequence is given by setting

$$f_n(s) := \begin{cases} j/2^n, & j/2^n \le f(s) < (j+1)/2^n,\ j = 0, \ldots, 2^{2n} - 1, \\ 2^n, & 2^n \le f(s). \end{cases}$$

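Fix such a sequence. The inequality $\lim \int f_n\,d\mu \le \int f\,d\mu$ is an automatic consequence of the definitions. Let $g$ be a simple function such that $g \le f$. Fix an $\varepsilon > 0$, and for each $n$ let $\tilde f_n$ be the simple function

$$\tilde f_n(s) := \begin{cases} (1 - \varepsilon) g(s), & f_n(s) \ge (1 - \varepsilon) g(s), \\ 0, & \text{otherwise.} \end{cases}$$

If $g$ takes on the values $v_1, \ldots, v_k$, for each $i = 1, \ldots, k$ let $E_i := g^{-1}(v_i)$, and for each $i$ and $n$ let $E_{in} := \{ s \in E_i : f_n(s) \ge (1 - \varepsilon) v_i \}$. Then $E_i = \bigcup_n E_{in}$, so countable additivity gives

$$\int f_n\,d\mu \ge \int \tilde f_n\,d\mu = \sum_i (1 - \varepsilon) v_i \times \mu(E_{in}) \to (1 - \varepsilon) \sum_i v_i \times \mu(E_i) = (1 - \varepsilon) \int g\,d\mu.$$

This is true for arbitrary $g$ and $\varepsilon$, so $\lim \int f_n\,d\mu \ge \int f\,d\mu$. □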
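The dyadic construction in this proof is easy to experiment with. The following minimal Python sketch implements the approximating values $f_n(s)$ as a function of the number $f(s)$, and, as a hypothetical example, computes $\int f_n\,d\mu$ for $f(s) = s^2$ on $[0, 1]$ with $\mu$ taken to be Lebesgue measure, summing value times measure over the level sets. The function names and the choice of $f$ are assumptions made for illustration.

```python
import math

def dyadic_approximation(value, n):
    """Value of f_n at a point where f equals `value`: round down to a
    multiple of 2**-n, capped at 2**n."""
    if value >= 2 ** n:
        return float(2 ** n)
    return math.floor(value * 2 ** n) / 2 ** n

def integral_of_approximation(n):
    """Integral of f_n for f(s) = s**2 on [0, 1] with Lebesgue measure."""
    total = 0.0
    for j in range(2 ** n):  # f takes values in [0, 1], so only levels j < 2**n have positive measure
        lo, hi = j / 2 ** n, (j + 1) / 2 ** n
        # The level set {lo <= s**2 < hi} is the interval [sqrt(lo), sqrt(hi)).
        total += lo * (math.sqrt(hi) - math.sqrt(lo))
    return total

integrals = [integral_of_approximation(n) for n in range(1, 11)]
```

The list `integrals` is increasing and approaches $1/3$ from below, with an error of at most $2^{-n}$ at stage $n$, in line with the lemma.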


Lemma 17.10 If $f, g : S \to [0, 1]$ are measurable and $c \ge 0$, then

$$\int c f\,d\mu = c \int f\,d\mu \quad\text{and}\quad \int (f + g)\,d\mu = \int f\,d\mu + \int g\,d\mu.$$

Proof If $\{f_n\}$ and $\{g_n\}$ are increasing sequences of simple functions converging pointwise to $f$ and $g$, then $\{c f_n\}$ and $\{f_n + g_n\}$ are increasing sequences of simple functions converging pointwise to $c f$ and $f + g$. Therefore the claims follow from the last result and the special case of simple functions, for which these equations easily reduce to matters of simple arithmetic. □

We now come to three of the best known and most frequently cited results of real analysis. (More precisely, what we present here are the special cases of these results for nonnegative valued functions.) For these results we fix a sequence of measurable functions $f_1, f_2, \ldots : S \to [0, \infty]$. First of all recall that if $\{f_n\}$ converges pointwise to $f$, then (Proposition 17.3) $f$ is measurable.

Theorem 17.3 (Monotone Convergence) If $\{f_n\}$ is an increasing sequence and $f$ is its pointwise limit, then $\int f_n\,d\mu \to \int f\,d\mu$.

Proof First, for each $n$, let $g_{n1}, g_{n2}, \ldots$ be an increasing sequence of simple functions converging pointwise to $f_n$. Now, for each $n$, let $h_n = \max\{g_{1n}, \ldots, g_{nn}\}$. Then $h_n \le f_n \le f$, and $\{h_n\}$ is an increasing sequence of simple functions that converges pointwise to $f$, so $\int h_n\,d\mu \to \int f\,d\mu$, and the claim follows. □

Theorem 17.4 (Fatou’s Lemma) The function $\liminf f_n$ is measurable and

$$\int \liminf f_n\,d\mu \le \liminf \int f_n\,d\mu.$$

Proof To see that $\liminf f_n$ is measurable observe that for any $\alpha \ge 0$,

$$(\liminf f_n)^{-1}([0, \alpha)) = \bigcap_{N=1}^{\infty} \bigcup_{n \ge N} f_n^{-1}([0, \alpha)) \in \mathcal{S}.$$

For each $n$ let $g_n = \inf_{m \ge n} f_m$. Then $g_n \le f_m$ and thus $\int g_n\,d\mu \le \int f_m\,d\mu$ for all $m \ge n$, and $\{g_n\}$ is an increasing sequence of functions converging pointwise to $\liminf f_n$, so monotone convergence implies that

$$\int \liminf f_n\,d\mu = \lim \int g_n\,d\mu \le \liminf \int f_n\,d\mu.$$

□

Corollary 17.3 (Reverse Fatou’s Lemma) If there is an integrable $g : S \to [0, \infty]$ such that $f_n \le g$ for all $n$, then

$$\int \limsup f_n\,d\mu \ge \limsup \int f_n\,d\mu.$$

Proof Fatou’s lemma gives

$$\int g\,d\mu - \int \limsup f_n\,d\mu = \int \liminf (g - f_n)\,d\mu \le \liminf \int (g - f_n)\,d\mu = \int g\,d\mu - \limsup \int f_n\,d\mu.$$

□

For this and the next result the assumption of an integrable bounding function is indispensable. An example illustrating this is given by $S = [0, 1]$ (with $\mu$ Lebesgue measure), $f_n(s) = n$ if $s \le 1/n$, and $f_n(s) = 0$ if $s > 1/n$, in which case $\int f_n\,d\mu = 1$ for all $n$, but $\limsup f_n = 0$ almost everywhere, so $\int \limsup f_n\,d\mu = \int 0\,d\mu = 0$.
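The example can be checked with exact arithmetic. The minimal Python sketch below assumes Lebesgue measure on $[0, 1]$, and its function names are illustrative. It also computes partial integrals of the pointwise supremum $\sup_n f_n(s)$, which equals $k$ on $(1/(k+1), 1/k]$, to indicate why no integrable bounding function exists here: any $g$ with $f_n \le g$ for all $n$ must dominate this supremum.

```python
from fractions import Fraction

def f(n, s):
    """f_n(s) = n on [0, 1/n] and 0 on (1/n, 1]."""
    return n if s <= Fraction(1, n) else 0

def integral_f(n):
    """Exact integral of f_n: height n times the length 1/n of [0, 1/n]."""
    return n * Fraction(1, n)

def envelope_integral(K):
    """Integral over (1/(K+1), 1] of sup_n f_n, which equals k on (1/(k+1), 1/k]."""
    return sum(k * (Fraction(1, k) - Fraction(1, k + 1)) for k in range(1, K + 1))

integrals = [integral_f(n) for n in range(1, 50)]
values_at_one_tenth = [f(n, Fraction(1, 10)) for n in range(1, 50)]
envelope = [envelope_integral(K) for K in (10, 100, 1000)]
```

Each integral equals 1, while at the fixed point $s = 1/10$ the values $f_n(s)$ vanish once $n > 10$. The partial integrals of the supremum are partial sums of the harmonic series (minus its first term), so they grow without bound.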

Theorem 17.5 (Lebesgue’s Dominated Convergence Theorem) If $\{f_n\}$ converges pointwise to $f$ and $g : S \to [0, \infty]$ is an integrable function such that $f_n(s) \le g(s)$ for all $n$ and $s$, then $\int |f_n - f|\,d\mu \to 0$ and $\int f_n\,d\mu \to \int f\,d\mu$.

Proof Since $|f - f_n| \le |f| + |f_n| \le 2g$ we can apply the reverse Fatou’s lemma to obtain

$$\limsup \int |f - f_n|\,d\mu \le \int \limsup |f - f_n|\,d\mu = 0.$$

Therefore $|\int f\,d\mu - \int f_n\,d\mu| \le \int |f - f_n|\,d\mu \to 0$. □

We conclude the section with two specific results that will be applied later.

Proposition 17.4 Let $G$ be an element of $\mathcal{S} \times \mathcal{T}$. For each $s \in S$ the slice

$$G_s := \{ t \in T : (s, t) \in G \}$$

is in $\mathcal{T}$. If $\nu$ is a measure on $T$, then the function $s \mapsto \nu(G_s)$ is measurable.

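Proof Let $\mathcal{Z}$ be the set of products $E \times F$ with $E \in \mathcal{S}$ and $F \in \mathcal{T}$. This is a π-system because

$$(E_1 \times F_1) \cap (E_2 \times F_2) = (E_1 \cap E_2) \times (F_1 \cap F_2).$$

Let $\mathcal{D}$ be the set of $G \subset S \times T$ such that $G_s \in \mathcal{T}$ for all $s$ and $s \mapsto \nu(G_s)$ is measurable. Evidently $\mathcal{Z} \subset \mathcal{D}$, and in particular $S \times T \in \mathcal{D}$. For each $G, H \in \mathcal{D}$ with $H \subset G$ and each $s \in S$, $(G \setminus H)_s = G_s \setminus H_s \in \mathcal{T}$ and $\nu((G \setminus H)_s) = \nu(G_s) - \nu(H_s)$, so $G \setminus H \in \mathcal{D}$. If $G_1, G_2, \ldots$ is an increasing sequence of elements of $\mathcal{D}$ and $G = \bigcup_n G_n$, then $G_s = \bigcup_n (G_n)_s \in \mathcal{T}$ and $\nu(G_s) = \lim_n \nu((G_n)_s)$ for each $s \in S$, so Proposition 17.3 (with $A = [0, \infty]$, endowed with a suitable metric) implies that $G \in \mathcal{D}$. Thus $\mathcal{D}$ is a Dynkin system, and Dynkin’s lemma implies that it contains $\sigma(\mathcal{Z}) = \mathcal{S} \times \mathcal{T}$. □

Proposition 17.5 Suppose that $A$ is a separable metric space, $(S, \mathcal{S}, \mu)$ is a probability space, $f : A \times S \to \mathbb{R}$ is measurable and bounded, and for each $s \in S$, $f(\cdot, s)$ is continuous. For $a \in A$ let $F(a) := \int_S f(a, s)\,d\mu(s)$. Then $F$ is continuous.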

Proof Without loss of generality assume that $f$ takes values in $[0, 1]$. We will show that $F$ is continuous at a given $a_0 \in A$. Fix $\varepsilon > 0$. Let $D$ be a countable dense subset of $A$. For $\delta > 0$ let

$$B_\delta := \{ s : |f(d, s) - f(a_0, s)| \ge \varepsilon/2 \text{ for some } d \in U_\delta(a_0) \cap D \}.$$

Then $B_\delta \in \mathcal{S}$ because it is a countable union of the sets $S_d := \{ s : |f(d, s) - f(a_0, s)| \ge \varepsilon/2 \}$. Since each $f(\cdot, s)$ is continuous, $\bigcap_{\delta > 0} B_\delta = \emptyset$, so we can choose a $\delta > 0$ such that $\mu(B_\delta) < \varepsilon/2$.

For each $s$, since $f(\cdot, s)$ is continuous and $D$ is dense, $|f(a, s) - f(a_0, s)| < \varepsilon/2$ for all $a \in U_\delta(a_0)$ if and only if this inequality holds for all $d \in U_\delta(a_0) \cap D$. Therefore

$$\{ s : |f(a, s) - f(a_0, s)| < \varepsilon/2 \text{ for all } a \in U_\delta(a_0) \} = S \setminus B_\delta,$$

so for all $a \in U_\delta(a_0)$,

$$\Big| \int_S f(a, s)\,d\mu(s) - \int_S f(a_0, s)\,d\mu(s) \Big| \le \int_S |f(a, s) - f(a_0, s)|\,d\mu(s) \le \mu(B_\delta) + \int_{S \setminus B_\delta} |f(a, s) - f(a_0, s)|\,d\mu(s) < \varepsilon.$$

□

17.5 Partially Ordered Probability Spaces

Let $(T, \mathcal{T}, \mu)$ be a partially ordered probability space, by which we mean that it is a probability space endowed with a partial order $\precsim$ such that

$$E_{\precsim} := \{ (t, t') : t \precsim t' \} \in \mathcal{T} \times \mathcal{T}.$$

Lemma 17.11 All order intervals $(-\infty, t]$ and $[t, \infty)$ are in $\mathcal{T}$, $\mu((-\infty, t])$ and $\mu([t, \infty))$ are measurable functions of $t$, and $\mu([t, t'])$ is a measurable function of $(t, t')$.

Proof Since $(-\infty, t]$ and $[t', \infty)$ are slices of $E_{\precsim}$ and $[t, t']$ is a slice of $\{ (t, t', t'') : (t, t'') \in E_{\precsim} \text{ and } (t'', t') \in E_{\precsim} \}$, all claims follow from Proposition 17.4. □


We say that $T$ is separable if there is a countable separating set $T^0 \subset T$ such that every $E \in \mathcal{T}$ with $\mu(E) > 0$ contains points $t$ and $t'$ such that there is a point $t_0 \in T^0$ with $t \precsim t_0 \precsim t'$. For the remainder of the section we assume that $T$ is separable with separating set $T^0$.

Lemma 17.12 If $E \in \mathcal{T}$ and $\mu(E) > 0$, then there are $t, t' \in E$ and $t_0 \in T^0 \cap [t, t']$ such that $\mu(E \cap (-\infty, t_0]) > 0$ and $\mu(E \cap [t_0, \infty)) > 0$.

Proof Let

$$E' := E \setminus \Big( \bigcup (-\infty, t_0] \cup \bigcup [t'_0, \infty) \Big)$$

where the first union is over all $t_0 \in T^0$ such that $\mu(E \cap (-\infty, t_0]) = 0$ and the second union is over all $t'_0 \in T^0$ such that $\mu(E \cap [t'_0, \infty)) = 0$. Removing countably many sets of measure 0 does not change the measure, so $\mu(E') = \mu(E) > 0$. Since $T$ is separable there are $t, t' \in E'$ such that $[t, t']$ contains some $t_0 \in T^0$. We have $\mu(E \cap (-\infty, t_0]) > 0$ because otherwise $t$ would not be in $E'$. Similarly $\mu(E \cap [t_0, \infty)) > 0$ because $t' \in E'$. □

Lemma 17.13 If $E \in \mathcal{T}$ and $\mu(E) > 0$, then there are sequences $\{t_n\}$ in $T^0$ and $\{t'_n\}$ in $E$ such that $\mu(E \cap [t_n, t'_n]) > 0$ and $\mu(E \cap [t'_n, t_{n+1}]) > 0$ for all $n$.

Proof Let $E_0 := E$. The last result gives $t, t' \in E_0$ and $t_0 \in T^0 \cap [t, t']$ such that $\mu(E_0 \cap (-\infty, t_0]) > 0$ and $\mu(E_0 \cap [t_0, \infty)) > 0$. Let $E_1 := E_0 \cap [t_0, \infty)$. Repeating this construction gives sequences $\{t_k\}$ in $T^0$ and $\{E_k\}$ in $\mathcal{T}$ such that $\mu(E_k \cap (-\infty, t_k]) > 0$, $\mu(E_k \cap [t_k, \infty)) > 0$, and $E_{k+1} := E_k \cap [t_k, \infty)$ for all $k$. Of course $t_k \succsim t_{k-1}$ because $E_k \cap (-\infty, t_k] \subset [t_{k-1}, \infty) \cap (-\infty, t_k]$. We have

$$E \cap [t_k, t_{k+1}] \supset (E_k \cap [t_k, \infty)) \cap (-\infty, t_{k+1}] = E_{k+1} \cap (-\infty, t_{k+1}].$$

Therefore, $\mu(E \cap [t_k, t_{k+1}]) > 0$, so $E \cap [t_k, t_{k+1}] \ne \emptyset$. For each $n = 1, 2, \ldots$ let $t_n := t_{3n}$, and choose $t'_n \in E \cap [t_{3n+1}, t_{3n+2}]$. □

We say that T is atomless if μ is an atomless measure.

Lemma 17.14 If $T$ is atomless, then there is a monotone and measurable function $\Phi : T \to [0, 1]$ such that $\mu(\Phi^{-1}(\alpha)) = 0$ for every $\alpha \in [0, 1]$.

Proof Let $T^0 := \{t_1, t_2, \ldots\}$, for each $k$ let $\chi_k(t)$ be 1 if $t \succsim t_k$ and 0 otherwise, and let $\Phi(t) := \sum_k 2^{-k} \chi_k(t)$. Proposition 17.4 implies that each $\chi_k$ is measurable, so $\Phi$ is a pointwise convergent sum of monotone measurable functions, and is consequently monotone and measurable.

Aiming at a contradiction, suppose that $\mu(\Phi^{-1}(\alpha)) > 0$. Since $T$ is atomless, $\mu(T^0) = 0$, so separability implies that there are $t, t' \in \Phi^{-1}(\alpha) \setminus T^0$ such that $t \precsim t_k \precsim t'$ for some $k$. Since $\precsim$ is antisymmetric, $t \prec t_k \prec t'$ and thus $\Phi(t') \ge \Phi(t) + 2^{-k}$, which is a contradiction. □
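The construction of $\Phi$ can be illustrated in the simplest case. The minimal Python sketch below assumes that $T = [0, 1]$ with its usual order, takes the separating set to be the rationals in $(0, 1)$ listed by increasing denominator, and truncates the sum after finitely many terms. The enumeration, the truncation, and the function names are all assumptions made for illustration.

```python
from fractions import Fraction
from math import gcd

def separating_points(count):
    """First `count` rationals in (0, 1) in lowest terms, ordered by denominator
    (a hypothetical enumeration t_1, t_2, ... of the separating set)."""
    points = []
    q = 2
    while len(points) < count:
        for p in range(1, q):
            if gcd(p, q) == 1 and len(points) < count:
                points.append(Fraction(p, q))
        q += 1
    return points

def phi(t, points):
    """Truncated version of Phi: the sum of 2**-k over those k with t >= t_k."""
    return sum(Fraction(1, 2 ** k) for k, tk in enumerate(points, start=1) if t >= tk)

points = separating_points(20)
grid = [Fraction(k, 200) for k in range(201)]
values = [phi(t, points) for t in grid]
```

The computed values are nondecreasing along the grid and lie in $[0, 1]$, and passing from just below $t_1 = 1/2$ to $t_1$ itself raises the value by at least $2^{-1}$, which is the jump used in the proof to derive the contradiction.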


17.6 Monotone Functions

In this section $(T, \mathcal{T}, \mu)$ is a complete atomless partially ordered probability space that is separable, with separating set $T^0$, and $A$ is a compact locally complete metric semilattice. A function $f : T \to A$ is monotone if $f(t) \precsim f(t')$ whenever $t \precsim t'$. Let $\mathcal{M}$ be the set of such functions.

In some applications we wish to impose additional restrictions on the functions under consideration. For example, in a second price auction for a single object, the agents simultaneously submit bids, the object is awarded to the high bidder, and that bidder pays the second highest bid. There can be equilibria in which one agent always bids more than the object could possibly be worth to any other bidder and all other agents bid zero; the high bidder always wins the object for free, and none of the other agents has a deviation that gives a positive surplus. To rule out such equilibria we can require that each agent plays a strategy that never bids more than the object is worth to that agent.

Let $\mathcal{C}$ be a set of measurable functions $f : T \to A$. We assume that $\mathcal{C}$ satisfies three conditions:

(a) $\mathcal{C}$ is pointwise-limit-closed: if $f_1, f_2, \ldots$ is a sequence of elements of $\mathcal{C}$, $f : T \to A$ is measurable, and $f_k(t) \to f(t)$ for μ-almost all $t$, then $f \in \mathcal{C}$.

(b) $\mathcal{C}$ is piecewise-closed: if $f, f' \in \mathcal{C}$ and $g : T \to A$ is a measurable function such that $g(t) \in \{f(t), f'(t)\}$ for all $t$, then $g \in \mathcal{C}$.

(c) $\mathcal{C}$ is join-closed: $g \in \mathcal{C}$ whenever $f, f' \in \mathcal{C}$ and $g(t) = f(t) \vee f'(t)$ for all $t$.

Let $\mathcal{M}^{\mathcal{C}} := \mathcal{M} \cap \mathcal{C}$.

We note that the set of all measurable functions from $T$ to $A$ satisfies these conditions: (a) and (b) are obvious, and if $f, f' : T \to A$ are measurable, then the composition $t \mapsto (f(t), f'(t)) \mapsto f(t) \vee f'(t)$ is measurable because $(a, b) \mapsto a \vee b$ is continuous. Consequently results in which $\mathcal{M}^{\mathcal{C}}$ appears pertain equally to $\mathcal{M}$.

We now describe the construction that will be used to demonstrate contractibility. Let $\Phi : T \to [0, 1]$ be a monotone and measurable function such that $\mu(\Phi^{-1}(c)) = 0$ for all $c \in [0, 1]$, as per Lemma 17.14. If $f, g : T \to A$ are measurable and $0 \le \tau \le 1$, let $h(f, g, \tau) : T \to A$ be the function

$$h(f, g, \tau)(t) := \begin{cases} f(t), & \Phi(t) \le |1 - 2\tau| \text{ and } \tau < 1/2, \\ g(t), & \Phi(t) \le |1 - 2\tau| \text{ and } \tau \ge 1/2, \\ f(t) \vee g(t), & \Phi(t) > |1 - 2\tau|. \end{cases}$$

To better understand this construction, first observe that $h(f, g, 0) = f$. If $\tau \in (0, 1/2)$, then $h(f, g, \tau)(t)$ is $f(t)$ if $\Phi(t) \le |1 - 2\tau|$ and otherwise it is $f(t) \vee g(t)$. If $\Phi(t) > 0$, then $h(f, g, 1/2)(t) = f(t) \vee g(t)$, and $h(f, g, 1/2)(t) = g(t)$ if $\Phi(t) = 0$, so $h(f, g, 1/2)$ agrees with $f \vee g$ μ-almost everywhere. If $\tau \in (1/2, 1)$, then $h(f, g, \tau)(t)$ is $g(t)$ if $\Phi(t) \le |1 - 2\tau|$ and otherwise it is $f(t) \vee g(t)$. Finally $h(f, g, 1) = g$. In sum, as $\tau$ goes from 0 to 1/2 and then 1, $h(f, g, \tau)$ deforms from $f$ to $f \vee g$, and then to $g$.
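A minimal Python sketch of this deformation, in a deliberately simple setting, may help fix ideas. It assumes that $T = [0, 1]$ with $\Phi(t) = t$ and that $A$ is an interval of real numbers, so that the join is the maximum. These choices, the two sample strategies, and the function names are assumptions made for illustration.

```python
def h(f, g, tau, phi):
    """Return the function t -> h(f, g, tau)(t), with the join taken to be max."""
    threshold = abs(1 - 2 * tau)
    def deformed(t):
        if phi(t) <= threshold:
            # Below the threshold we see f in the first half of the deformation, g in the second.
            return f(t) if tau < 0.5 else g(t)
        return max(f(t), g(t))
    return deformed

phi = lambda t: t                 # monotone, and every level set has Lebesgue measure zero
f = lambda t: t                   # a monotone strategy
g = lambda t: 0.5 + 0.25 * t      # another monotone strategy
grid = [k / 20 for k in range(21)]
```

On the grid, `h(f, g, 0.0, phi)` reproduces `f`, `h(f, g, 1.0, phi)` reproduces `g`, and `h(f, g, 0.5, phi)` agrees with the pointwise maximum except at $t = 0$, where $\Phi$ vanishes. For each of the intermediate values of $\tau$ tried, the resulting function is again monotone, as Lemma 17.15 asserts in general.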


In view of this description of h, it is easy to see that if f and g are monotone, then so is h(f, g, τ):

Lemma 17.15 h(M × M × [0, 1]) ⊂ M .

Evidently (b) and (c) imply that:

Lemma 17.16 h(C × C × [0, 1]) ⊂ C.

Of course it follows that h(M^C × M^C × [0, 1]) ⊂ M^C.

For measurable functions f, f′ : T → A let

\[
\delta(f, f') := \int d(f(t), f'(t)) \, d\mu .
\]

This function is a pseudometric: it is symmetric and satisfies the triangle inequality, but δ(f, f′) vanishes whenever f and f′ agree μ-almost everywhere, even if f ≠ f′. We endow the space of measurable functions from T to A with the induced (non-Hausdorff) topology.
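A small numerical sketch may help fix ideas. It assumes T = [0, 1] with the uniform measure and A = [0, 1] with d(a, b) = |a − b|, and it replaces the integral by a midpoint rule with exact rational arithmetic. The function `delta` and the test functions are hypothetical; the midpoint grid never samples t = 1/2, which mirrors (but does not prove) the fact that a change on a null set leaves the integral unaffected.

```python
from fractions import Fraction


def delta(f, g, n=200):
    """Midpoint-rule value of the integral of |f(t) - g(t)| over [0, 1]."""
    midpoints = [Fraction(2 * k + 1, 2 * n) for k in range(n)]
    return sum(abs(f(t) - g(t)) for t in midpoints) / n


f = lambda t: t
g = lambda t: t * t
k = lambda t: Fraction(1, 2)
# f_alt differs from f only at the single point t = 1/2, a set of measure zero.
f_alt = lambda t: Fraction(1) if t == Fraction(1, 2) else t

print(delta(f, g) == delta(g, f))                 # symmetry
print(delta(f, k) <= delta(f, g) + delta(g, k))   # triangle inequality
print(delta(f, f_alt) == 0)                       # distinct functions at distance zero
```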

Let hM := h|M×M×[0,1] denote the restriction of h to M × M × [0, 1]. An important objective of the following analysis is to show that hM is continuous.

Lemma 17.17 If f1, f2, . . . is a sequence of measurable functions from T to A, f : T → A is a function, and fk(t) → f(t) for μ-almost every t, then f is measurable and fk → f.

Proof Let E be the set of t such that {fk(t)} is not convergent, and let a0 be some element of A. If we modify each fk by having it take the value a0 everywhere in E, then each modified function still agrees with the original fk μ-a.e., and the modified sequence converges at every t, to a limit that agrees with f(t) for μ-almost every t. Proposition 17.3 implies that this limit is measurable, so f is measurable because (T, 𝒯, μ) is complete. Since A is compact, its metric is bounded, so the integrands d(fk(t), f(t)) are bounded, and Lebesgue's dominated convergence theorem implies that fk → f. □

A pair of sequences {tn} and {t′n} in T approach a function f : T → A at t ∈ T if μ([tn, t]) > 0 and μ([t, t′n]) > 0 for all n and f(tn), f(t′n) → f(t). We say that f is approachable at t ∈ T if such sequences exist. For n = 1, 2, . . . and tn, t′n ∈ T let

\[
T^n_{t_n, t'_n}(f) := \{\, t \in T : \mu([t_n, t]), \mu([t, t'_n]) > 0 \text{ and } f(t_n), f(t'_n) \in U_{1/n}(f(t)) \,\} .
\]

Lemma 17.18 If f : T → A is measurable, then T^n_{tn,t′n}(f) ∈ 𝒯 for all n and tn, t′n ∈ T. The set of t at which f is approachable is measurable.

Proof Let d be the metric of A. Lemma 17.11 implies that μ([tn, t]) and μ([t, t′n]) are measurable functions of t, and d(f(tn), f(t)) and d(f(t), f(t′n)) are measurable functions of t because f is measurable and d is continuous. Thus T^n_{tn,t′n}(f) ∈ 𝒯.

If {tn} and {t′n} approach f at t, for each n Lemma 17.12 gives t̃n ∈ [tn, t] ∩ T^0 such that μ([t̃n, t]) > 0 and t̃′n ∈ [t, t′n] ∩ T^0 such that μ([t, t̃′n]) > 0, and monotonicity and Lemma 17.3 imply that f(t̃n) → f(t) and f(t̃′n) → f(t). That is, if f is approachable at t, then there is a pair of approaching sequences in T^0. Therefore the set of t at which f is approachable is ⋂_n ⋃_{tn,t′n ∈ T^0} T^n_{tn,t′n}(f), which is measurable because T^0 is countable. □

Lemma 17.19 If f ∈ M is measurable, then f is approachable at μ-a.e. t ∈ T .

Proof The set of t at which f is not approachable is ⋃_N D_N where

\[
D_N := \bigcap_{t_N, t'_N \in T^0} T^N_{t_N, t'_N}(f)^c .
\]

Aiming at a contradiction, suppose that μ(D_N) > 0 for some N. Lemma 17.13 gives a sequence {tn} in T^0 and a sequence {t′n} in D_N such that μ(D_N ∩ [tn, t′n]) > 0 and μ(D_N ∩ [t′n, tn+1]) > 0 for all n. Since the intervals [tn, t′n] and [t′n, tn+1] are nonempty we have t1 ≾ t′1 ≾ t2 ≾ t′2 ≾ · · ·, and the monotonicity of f implies that f(t1) ≾ f(t′1) ≾ f(t2) ≾ f(t′2) ≾ · · · is a monotone sequence in A that must converge by Lemma 17.4, so for large n we have d(f(tn), f(t′n)), d(f(t′n), f(tn+1)) < 1/N and thus t′n ∈ T^N_{tn,tn+1}(f), contradicting t′n ∈ D_N. □

Proposition 17.6 For any sequence {fn} in M^C there is a subsequence {fnk} and a measurable f ∈ M^C such that fnk(t) → f(t) for μ-a.e. t.

Proof Suppose that T^0 = {t1, t2, . . .}. Compactness allows us to choose a subsequence {fnk} such that {fnk(t1)} converges, a subsequence of this subsequence such that {fnk(t2)} also converges, and so forth. Taking the first function in the first subsequence, the second function in the second subsequence, and so forth, gives a subsequence such that for all t ∈ T^0, {fnk(t)} converges to some point f′(t). Since each fnk is monotonic and the partial order of A is closed, f′ : T^0 → A is monotone.

For t ∈ T, if there is no t0 ∈ T^0 such that t ≾ t0, let f(t) be the least upper bound of A. Otherwise let f(t) be the least upper bound of

\[
F(t) := \bigcap_{t_0 \in T^0 \cap [t, \infty)} (-\infty, f'(t_0)] .
\]

Since the partial order of A is closed, F(t) contains all of the limit points of {fnk(t)}, and is consequently nonempty, so f(t) is well defined. As an intersection of closed subsets of A, F(t) is compact, and it is evidently a subsemilattice, so Corollary 17.1 implies that f(t) ∈ F(t). Obviously f is monotonic. Since f′ is monotonic, for all t0 ∈ T^0 we have F(t0) = (−∞, f′(t0)] and f(t0) = f′(t0).

For each m = 1, 2, . . . define gm : T → A by letting gm(t) be the least upper bound of the intersection of the order intervals (−∞, f(ti)] for those i = 1, . . . , m such that t ≾ ti. Lemma 17.11 implies that each (−∞, ti] is measurable, so each gm is a measurable simple function. For each t Proposition 17.1 implies that gm(t) → f(t), so (Proposition 17.3) f is measurable.

Fix a t at which f is approachable. Passing to a further subsequence, we may assume that {fnk(t)} converges to some a ∈ A. As we explained at the beginning of the proof of the last result, there are sequences {tj} and {t′j} in T^0 such that lim f(tj) = lim f(t′j) = f(t) and μ([tj, t]) > 0 and μ([t, t′j]) > 0 for all j. In particular [tj, t] ≠ ∅ ≠ [t, t′j], so tj ≾ t ≾ t′j, and the monotonicity of each fnk implies that fnk(tj) ≾ fnk(t) ≾ fnk(t′j). Taking the limit with respect to k yields f(tj) ≾ a ≾ f(t′j), and taking the limit with respect to j gives f(t) ≾ a ≾ f(t), so f(t) = a.

We have shown that fnk(t) → f(t) at every t at which f is approachable, so Lemma 17.19 implies that fnk(t) → f(t) for μ-almost every t. Since C is pointwise-limit-closed, f ∈ C. □

Since (T, 𝒯, μ) is complete, applying this to a constant sequence f, f, . . . gives:

Corollary 17.4 Each f ∈ M is measurable.

Lemma 17.20 If a sequence f1, f2, . . . in M converges to f, then fk(t) → f(t) for μ-a.e. t.

Proof Corollary 17.4 allows us to assume that f is measurable, so, in view of Lemma 17.19, it suffices to show that fk(t) → f(t) if t is a point at which f is approachable. Since A is compact, it suffices to show that an arbitrary convergent subsequence of {fk(t)} has f(t) as its limit, so (replacing the sequence with this subsequence) suppose that fk(t) → a.

Proposition 17.6 implies that after passing to a further subsequence there is a measurable g ∈ M such that fk(t′) → g(t′) for μ-a.e. t′. Since the metric of A is bounded, dominated convergence implies that δ(fk, g) → 0, so δ(f, g) = 0 and f and g agree almost everywhere, so fk(t′) → f(t′) for μ-a.e. t′.

Since f is approachable at t there are sequences {tn} and {t′n} such that μ([tn, t]) > 0 and μ([t, t′n]) > 0 for all n and f(tn), f(t′n) → f(t). Since fk(t′) → f(t′) for μ-a.e. t′, for each n there are t̃n ∈ [tn, t] and t̃′n ∈ [t, t′n] such that fk(t̃n) → f(t̃n) and fk(t̃′n) → f(t̃′n). Since fk(t̃n) ≾ fk(t) ≾ fk(t̃′n), taking limits gives f(t̃n) ≾ a ≾ f(t̃′n). Now f(tn) ≾ f(t̃n) ≾ a ≾ f(t̃′n) ≾ f(t′n). Since A is a metric semilattice, its partial order is closed, so f(t) ≾ a ≾ f(t) and thus f(t) = a as desired. □

Corollary 17.5 If a sequence f1, f2, . . . in M converges to f and, for each n, ϕn := fn ∨ fn+1 ∨ fn+2 ∨ · · ·, then ϕn → f.

Proof The last result implies that fk(t) → f(t) for μ-a.e. t. For each such t, ϕn(t) → f(t) (Lemma 17.6). Therefore Lemma 17.17 implies that ϕn → f. □

Proposition 17.7 hM is continuous.

Proof Suppose that (fk, gk, τk) → (f, g, τ). Lemma 17.20 gives a D ∈ 𝒯 such that μ(D) = 1 and fk(t) → f(t) and gk(t) → g(t) for all t ∈ D. There are three cases, according to τ.

If τ < 1/2, then τk < 1/2 for large k, so that

\[
h(f_k, g_k, \tau_k)(t) =
\begin{cases}
f_k(t), & \Phi(t) \le |1 - 2\tau_k|, \\
f_k(t) \vee g_k(t), & \Phi(t) > |1 - 2\tau_k|.
\end{cases}
\]

Since τk → τ and pairwise ∨ is continuous, for t ∈ D we have h(fk, gk, τk)(t) → h(f, g, τ)(t) if Φ(t) < |1 − 2τ| or if Φ(t) > |1 − 2τ|. Since μ(Φ⁻¹(|1 − 2τ|)) = 0, Lebesgue's dominated convergence theorem implies that δ(h(fk, gk, τk), h(f, g, τ)) → 0. The proof when τ > 1/2 is similar, and adjusting the details is left to the reader.

If τ = 1/2, then h(fk, gk, τk)(t) → f(t) ∨ g(t) for all t ∈ D such that Φ(t) > 0. Since μ(Φ⁻¹(0)) = 0, Lebesgue's dominated convergence theorem implies that δ(h(fk, gk, τk), h(f, g, τ)) → 0. □

The triangle inequality implies that being separated by zero distance is an equivalence relation. Let [f] denote the equivalence class of f, and for f and f′ let δ̄([f], [f′]) := δ(f, f′). Of course δ̄ is a metric, and the map f ↦ [f] is easily seen to be continuous and open. Let M̄ := { [f] : f ∈ M } be the resulting space of equivalence classes.

If f, f′, g, g′ ∈ M, f and f′ agree almost everywhere, and g and g′ agree almost everywhere, then for any τ, h(f, g, τ) and h(f′, g′, τ) agree almost everywhere, so we can define a function h̄ : M̄ × M̄ × [0, 1] → M̄ by setting h̄([f], [g], τ) := [h(f, g, τ)]. If U ⊂ M̄ is open, h̄⁻¹(U) = π(h⁻¹(π⁻¹(U))) where π : M → M̄ is the map f ↦ [f], and π is an open map, so h̄ is continuous.

Lemma 17.21 M̄^C := { [f] : f ∈ M^C } is contractible.

Proof For any [f∗] ∈ M̄^C the function c([f], t) := h̄([f], [f∗], t) is a contraction. □

Proposition 17.8 M̄^C is compact.

Proof If {[fn]} is a sequence in M̄^C, Proposition 17.6 gives a subsequence {fnk} of {fn} and an f ∈ M^C such that fnk(t) → f(t) for μ-a.e. t. Lemma 17.17 implies that fnk → f and [fnk] → [f]. Thus M̄^C is sequentially compact, hence compact (Proposition 3.1). □

We will apply the following result, whose proof will be an adventure unto itself, undertaken in the last three sections of this chapter. Let X be a metric space. Let W ⊂ X × X be a neighborhood of the diagonal, and let λ : W × [0, 1] → X be an equiconnecting function. Suppose that U ⊂ X is open, V ⊂ U, and V × V ⊂ W. Let V^1 := λ(V × V × [0, 1]). Proceeding inductively, if V^n has been defined and V × V^n ⊂ W, let V^{n+1} := λ(V × V^n × [0, 1]). If this process does not come to an end and V^n ⊂ U for all n, then we say that V is λ-stable in U.

Theorem 17.6 (Dugundji 1965) If X is locally equiconnected and there is an equiconnecting function λ such that for each x ∈ X and each neighborhood U of x there is a neighborhood V ⊂ U that is λ-stable in U, then X is an ANR.

Proposition 17.9 M̄^C is an AR.

Proof Fix [f] ∈ M̄^C and a neighborhood Ū of [f]. We will show that there is a neighborhood V̄ of [f] that is h̄-stable in Ū. Since [f] is arbitrary, the last result will then imply that M̄^C is an ANR, hence (Theorem 8.2) an AR because it is contractible. Let U := { f′ ∈ M^C : [f′] ∈ Ū }. Since f′ ↦ [f′] is continuous, U is open. Let V be a neighborhood of f, and define V^1, V^2, . . . inductively by setting V^1 := h(V, V, [0, 1]) and V^{n+1} := h(V, V^n, [0, 1]).

Suppose that g ∈ V^1. Then g = h(f0, f1, τ) for some f0, f1 ∈ V and 0 ≤ τ ≤ 1. The definition of h gives f0 ≾ g if τ < 1/2, f1 ≾ g if τ ≥ 1/2, and g ≾ f0 ∨ f1. Now suppose that g ∈ V^n. Then g = h(fn, g′, τ) for some fn ∈ V, g′ ∈ V^{n−1} and 0 ≤ τ ≤ 1. Proceeding inductively, suppose that there are f0, . . . , fn−1 ∈ V such that f0 ≾ g′ ≾ f0 ∨ · · · ∨ fn−1. As above, either fn ≾ g or f0 ≾ g′ ≾ g, and g ≾ fn ∨ g′ ≾ f0 ∨ · · · ∨ fn. Possibly after putting fn in place of f0, we have shown that there are f0, . . . , fn ∈ V such that f0 ≾ g ≾ f0 ∨ · · · ∨ fn.

We wish to show that for some natural number k, if we set V := U1/k(f), then V^n ⊂ U for all n. If not, then for each k = 1, 2, . . . there is an nk and a gk ∈ V^{nk} \ U. Choose f^k_0, . . . , f^k_{nk} ∈ U1/k(f) such that f^k_0 ≾ gk ≾ f^k_0 ∨ · · · ∨ f^k_{nk}. Let ϕ1, ϕ2, . . . be the sequence f^1_0, . . . , f^1_{n1}, f^2_0, . . . , f^2_{n2}, . . .. For every k there is an mk such that

\[
f^k_0 \precsim g_k \precsim f^k_0 \vee \cdots \vee f^k_{n_k} \precsim \varphi_{m_k} \vee \varphi_{m_k+1} \vee \cdots ,
\]

and the mk can be chosen so that mk → ∞ as k → ∞. Evidently f^k_0 → f and ϕm → f, so ϕm ∨ ϕm+1 ∨ · · · → f (Corollary 17.5), whence f^k_0(t) → f(t) and ϕm(t) ∨ ϕm+1(t) ∨ · · · → f(t) for μ-a.e. t (Lemma 17.20). For such a t we have

\[
f^k_0(t) \precsim g_k(t) \precsim \varphi_{m_k}(t) \vee \varphi_{m_k+1}(t) \vee \cdots ,
\]

so gk(t) → f(t) (Lemma 17.3). Now Lemma 17.17 implies that gk → f, contradicting the hypothesis that no k is satisfactory.

If we now define V̄ := { [g] : g ∈ V } and V̄^n := { [g] : g ∈ V^n } we find that V̄ ⊂ Ū is open (because passage to equivalence classes is an open map), V̄^1 = h̄(V̄, V̄, [0, 1]) and V̄^{n+1} = h̄(V̄, V̄^n, [0, 1]) for all n, and V̄^n ⊂ Ū for all n. □

There is one more result that will be needed in the next section.

Proposition 17.10 If f ∈ M, Z is the set of g ∈ M such that g(t) ≾ f(t) for a.e. t, and f̄ : T → A is defined by setting f̄(t) := ⋁_{g∈Z} g(t), then f̄ is monotone and f̄(t) = f(t) for a.e. t.

Proof Since Z is nonempty (f is an element) and A is a complete semilattice (Corollary 17.1) f̄ is well defined. It is monotone because it is the pointwise join of monotone functions.

Corollary 17.4 gives a measurable f̃ ∈ M that agrees with f almost everywhere. Note that Z and f̄ are unchanged if we replace f with f̃, and f̄(t) = f(t) for a.e. t if and only if f̄(t) = f̃(t) for a.e. t. Thus it suffices to prove the result with f̃ in place of f, so we may assume that f is measurable. This implies that the set E of t at which f is approachable is measurable and μ(E) = 1 (Lemma 17.19).

It now suffices to show, for a given t ∈ E, that f(t) ≿ g(t) for all g ∈ Z, since then f(t) ≿ f̄(t) ≿ f(t) (because f ∈ Z) and thus f̄(t) = f(t). Fix g ∈ Z. There is a D ∈ 𝒯 with μ(D) = 1 such that g(t) ≾ f(t) for all t ∈ D. Since f is approachable at t there are sequences {tn} and {t′n} such that {f(tn)} and {f(t′n)} converge to f(t) and μ([tn, t]), μ([t, t′n]) > 0 for all n. For each n choose t̃n ∈ [t, t′n] ∩ D. Then f(t′n) ≿ f(t̃n) ≿ g(t̃n) ≿ g(t) for all n, so (since the order is closed) f(t) ≿ g(t). The proof is complete. □

17.7 The Game

Leaving the proof of Theorem 17.6 aside temporarily, we turn to Reny's equilibrium existence result. We begin by describing the model, which is a Bayesian game with finitely many agents i = 1, . . . , N. For each i there are:

(a) a compact locally complete metric semilattice⁴ Ai of actions;

(b) a partially ordered probability space (Ti, 𝒯i, μi) of types that is separable, with separating set T^0_i;

(c) a utility function ui : A × T → R.

Here A := ∏_i Ai and T := ∏_i Ti. We endow T with the product σ-algebra 𝒯 := ∏_i 𝒯i. We assume that there is a common prior μ, which is a probability measure on (T, 𝒯), such that for each i the marginal of μ on Ti is μi.

A pure strategy for i is a μi-almost measurable function si : Ti → Ai. Let Si denote the set of pure strategies for i, endowed with the topology derived from the pseudometric of the last section. Let S := ∏_i Si be the space of pure strategy profiles, and for each i let S−i := ∏_{j≠i} Sj. We assume that each ui is bounded and jointly measurable. For s ∈ S let

\[
U_i(s) := \int_T u_i(s(t), t) \, d\mu(t) .
\]

Throughout we assume that each Ui is continuous. This can be derived from a more primitive assumption.

Lemma 17.22 If, for each t ∈ T, ui(·, t) is continuous, then Ui is continuous.

Proof Let {s^n} be a sequence in S converging to s. Lemma 17.20 implies that for each i, s^n_i(t) → si(t) for μi-a.e. t. It follows that s^n(t) → s(t) for μ-a.e. t. (If the set of t where this was not the case had positive measure, the projection of this set on each Ti would have positive μi-measure.) Since ui is bounded, Lebesgue's dominated convergence theorem implies that Ui(s^n) → Ui(s). □

Since auctions typically have discontinuous payoffs at the boundaries between winning and losing, we should consider ways that Ui (or its restriction to some relevant set of pure strategies) may be continuous even though ui(·, t) typically is not. In some settings it may make sense to only consider pure strategies that are “minimally responsive” to the type. If other players are playing such strategies, then your expected payoff is continuous because tying with another agent's bid is a measure zero event. Reny (2011) considers examples in which the spaces of actions are finite, so that continuity is automatic. There is also an extensive literature on existence-of-pure-equilibrium results for games with discontinuities, for which a different paper by Reny (1999) is seminal. (McLennan et al. 2011 and Barelli and Meneghel 2013 are more recent contributions.)

⁴ Reny also proves a variant of the result in which each Ai is a convex subset of a locally convex topological vector space and the partial order on Ai is convex in the sense that { (ai, bi) : ai ≺ bi } is convex.

The pure strategy si is a best reply to s−i ∈ S−i if Ui(si, s−i) ≥ Ui(s′i, s−i) for all s′i ∈ Si. The profile s is a Nash equilibrium if each si is a best reply to s−i. For each i let Ci be a pointwise-limit-closed, piecewise-closed, and join-closed set of pure strategies that contains at least one monotone element. Let C := ∏_i Ci.

Theorem 17.7 Suppose that for all profiles s of monotone pure strategies in C and all i, the intersection of Ci with the set of i's monotone pure best replies is nonempty and join-closed. Then C contains a monotone pure-strategy equilibrium.

Proof For each i let M^{Ci}_i be player i's set of monotone pure strategies in Ci, and let M̄^{Ci}_i be the space of equivalence classes of elements of M^{Ci}_i. Let

\[
M^C := \prod_i M_i^{C_i}, \quad
M^C_{-i} := \prod_{j \ne i} M_j^{C_j}, \quad
\overline{M}^C := \prod_i \overline{M}_i^{C_i}, \quad
\overline{M}^C_{-i} := \prod_{j \ne i} \overline{M}_j^{C_j} .
\]

For s−i ∈ M^C_{−i} let Bi(s−i) be the set of best replies to s−i that are in M^{Ci}_i. By assumption Bi(s−i) is nonempty. Define B̄i : M̄^C_{−i} → M̄^{Ci}_i by setting

\[
\overline{B}_i([s_{-i}]) := \{\, [s_i] : s_{-i} \in [s_{-i}], \ s_i \in B_i(s_{-i}) \,\} .
\]

Define B̄ : M̄^C → M̄^C by setting B̄([s]) := ∏_i B̄i([s−i]).

It now suffices to show that the correspondence B̄ taking [s] to ∏_i B̄i([s−i]) satisfies the hypotheses of the Eilenberg-Montgomery fixed point theorem. Each M̄^{Ci}_i is a nonempty compact AR, so (by Corollary 8.3) M̄^C is an AR. For each i and [s] ∈ M̄^C let Ūi([s]) := Ui(s). (This definition clearly does not depend on the choice of representatives.) Since Ui is continuous and the map s ↦ [s] is open, Ūi is continuous. Since M̄^{Ci}_i is compact and B̄i([s−i]) is the set of maximizers of Ūi(·, [s−i]), B̄i is upper hemicontinuous. It only remains to show that B̄ is contractible valued, and for this it suffices to show that for a given i and s−i ∈ S−i, B̄i([s−i]) is contractible.

As the set of maximizers of a continuous function on a compact set, B̄i([s−i]) is compact. There is a partial ordering of B̄i([s−i]) defined by [si] ≾ [s′i] if and only if si(ti) ≾ s′i(ti) for μi-a.e. ti. Since Bi(s−i) is join-closed, this partial ordering makes B̄i([s−i]) into a semilattice. To see that this partial order is closed consider that if [s_{ir}] → [si], [s′_{ir}] → [s′i], and [s_{ir}] ≾ [s′_{ir}] for all r, then Lemma 17.20 implies that s_{ir}(ti) → si(ti) and s′_{ir}(ti) → s′i(ti) for μi-a.e. ti, so (because Ai is a metric semilattice) si(ti) ≾ s′i(ti) for μi-a.e. ti. Now Corollary 17.1 implies that B̄i([s−i]) is a complete semilattice, so that [s̃i] = ⋁B̄i([s−i]) is a well defined member of B̄i([s−i]). Proposition 17.10 implies that there is an s̄i ∈ M^{Ci}_i such that s̄i(ti) = s̃i(ti) for μi-a.e. ti and si(ti) ≾ s̄i(ti) for every ti and every si that is μi-a.e. less than or equal to s̃i.

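Let Φi : Ti → [0, 1] be an increasing function such that μi(Φi⁻¹(r)) = 0 for all r. Writing s̄i for the pointwise largest element of Bi(s−i), define j : Bi(s−i) × [0, 1] → Bi(s−i) by letting

\[
j(s_i, \tau)(t_i) :=
\begin{cases}
s_i(t_i), & \Phi_i(t_i) \le 1 - \tau \text{ and } \tau < 1, \\
\overline{s}_i(t_i), & \text{otherwise.}
\end{cases}
\]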

If j(si, τ) was not a best reply, the integral giving the expected payoff could be improved on the set of ti where j(si, τ) agrees with si or on the set of ti where j(si, τ) agrees with the largest best reply s̄i, leading to a contradiction of the optimality of si or s̄i respectively. Consequently j(si, τ) is a best reply to s−i. Since Φi is monotone, si and s̄i are monotone, and si(ti) ≾ s̄i(ti) for all ti, j(si, τ) is monotone, so it is an element of Bi(s−i).

Suppose that (s^n_i, τn) is a sequence in Bi(s−i) × [0, 1] that converges to (si, τ). Lemma 17.20 implies that there is a set D ⊂ Ti with μi(D) = 1 such that s^n_i(ti) → si(ti) for all ti ∈ D. Consider ti ∈ D. If Φi(ti) < 1 − τ, then τ < 1, so Φi(ti) < 1 − τn and τn < 1 for large n, and j(s^n_i, τn)(ti) = s^n_i(ti) → si(ti) = j(si, τ)(ti). If Φi(ti) > 1 − τ, then Φi(ti) > 1 − τn for large n, so j(s^n_i, τn)(ti) = s̄i(ti) = j(si, τ)(ti), where s̄i is the largest best reply. Since μi({ ti : Φi(ti) = 1 − τ }) = 0, we have shown that j(s^n_i, τn)(ti) → j(si, τ)(ti) for μi-a.e. ti. Therefore Lemma 17.17 implies that j(s^n_i, τn) → j(si, τ). We have shown that j is continuous. Since j(si, 1) = s̄i for every si, j is a contraction of Bi(s−i).

If si and s′i agree almost everywhere, then for any τ ∈ [0, 1], j(si, τ) and j(s′i, τ) agree almost everywhere. Therefore we can define a map j̄ on the space of equivalence classes of elements of Bi(s−i), with values in that space, by setting j̄([si], τ) := [j(si, τ)]. Since si ↦ [si] is both continuous and open, j̄ is continuous, hence a contraction, so the proof is complete. □

17.8 Best Response Sets

The key assumption of Theorem 17.7 (that a player's set of monotone best responses to a profile of monotone pure strategies is nonempty and join-closed) may hold for a variety of reasons. In this section we give one set of assumptions on the “primitives” (that is, the information structure and utilities) that has this consequence.

We first introduce some general concepts related to probability. Let (Z, 𝒵, λ) be a probability space. Let 𝒵′ be a sub-σ-algebra of 𝒵; that is, 𝒵′ ⊂ 𝒵 and 𝒵′ is itself a σ-algebra. Let λ_{𝒵′} denote the restriction of λ to 𝒵′. If f : Z → R+ is measurable, a conditional expectation of f given 𝒵′ is a function g : Z → R+ that is measurable with respect to 𝒵′, such that

\[
\int_E g \, d\lambda_{\mathcal{Z}'} = \int_E f \, d\lambda
\]

for all E ∈ 𝒵′. Roughly 𝒵′ can be understood as describing information, and this equation requires that g specifies expectations conditional on this information that are correct when averaged over events in 𝒵′ that have positive probability.

A function π(·|𝒵′)(·) : 𝒵 × Z → [0, 1] is a regular conditional probability if:

(a) For each E ∈ 𝒵, π(E|𝒵′)(·) is a conditional expectation of 1_E given 𝒵′.

(b) For almost every s ∈ Z, π(·|𝒵′)(s) is a probability measure on (Z, 𝒵).

Lemma 17.23 If π(·|𝒵′)(·) is a regular conditional probability, then for any integrable function f : Z → R+, ∫ f dπ(·|𝒵′)(s) is a 𝒵′-measurable function of s, and

\[
\int f \, d\lambda = \int \Big( \int f \, d\pi(\cdot|\mathcal{Z}')(s) \Big) \, d\lambda_{\mathcal{Z}'}(s) .
\]

Proof In view of the definitions above, for any E ∈ 𝒵, π(E|𝒵′)(s) = ∫ 1_E dπ(·|𝒵′)(s) is a 𝒵′-measurable function of s, and

\[
\int 1_E \, d\lambda = \int \Big( \int 1_E \, d\pi(\cdot|\mathcal{Z}')(s) \Big) \, d\lambda_{\mathcal{Z}'}(s) .
\]

These assertions extend easily to step functions, and then by monotone convergence to arbitrary integrable functions. □
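On a finite probability space these objects can be computed exactly. The sketch below assumes a four-point space, a sub-σ-algebra generated by a two-block partition, and a hypothetical function `f`; the names `kernel` and `cond_exp` are illustrative. It checks the defining equation of a conditional expectation on each block and the iterated-integral identity of the lemma.

```python
from fractions import Fraction

# Hypothetical finite probability space Z = {0, 1, 2, 3} with weights lambda.
prob = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(1, 4), 3: Fraction(1, 4)}
blocks = [{0, 1}, {2, 3}]        # partition generating the sub-sigma-algebra
f = {0: 4, 1: 0, 2: 2, 3: 6}     # a nonnegative function on Z


def kernel(s):
    """pi(.|Z')(s): lambda conditioned on the block of the partition containing s."""
    block = next(b for b in blocks if s in b)
    mass = sum(prob[z] for z in block)
    return {z: (prob[z] / mass if z in block else Fraction(0)) for z in prob}


def cond_exp(s):
    """Integral of f against pi(.|Z')(s); constant on each block."""
    k = kernel(s)
    return sum(f[z] * k[z] for z in prob)


total = sum(f[z] * prob[z] for z in prob)
iterated = sum(cond_exp(s) * prob[s] for s in prob)
print(total, iterated)
```

Here `cond_exp` is measurable with respect to the partition because it is constant on blocks, and averaging it over all of Z returns the unconditional integral of `f`.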

It can be shown that for any f ∈ L^1(Z, 𝒵, λ) a conditional expectation exists. In particular, for each E ∈ 𝒵, a conditional expectation for the function 1_E exists, so functions π(E|𝒵′)(·) satisfying (a) exist. Whether all of these can be chosen so that (b) holds is a subtle issue. There are positive results for metric spaces. However, the proof of existence of a conditional expectation would already take several pages, and the argument for the existence of a regular conditional probability in the metric case is even more involved, so instead of treating this material here it is more appropriate to refer the reader to a real analysis text such as Dudley (1989).

We shall therefore simply assume that for each agent i = 1, . . . , N there is a regular conditional probability μi(·|𝒯i)(·) for μ conditional on 𝒯i, which will be written μi(·|ti) for ti ∈ Ti. Whether there is always a regular conditional probability in the general case (each Ti is a separable partially ordered probability space, and we are conditioning on one of the factors) is not clear, and may never have been investigated. In practice the models that economists might study almost always have metric structures, and in fact there is often enough structure to impose additional conditions such as continuity that identify a canonical regular conditional probability.

Given a profile of pure strategies s−i, for each ti ∈ Ti and ai ∈ Ai there is an interim expected payoff

\[
V_i(a_i, t_i, s_{-i}) := \int_{T_{-i}} u_i(a_i, s_{-i}(t_{-i}), t) \, d\mu_i(t_{-i}|t_i) .
\]

(Here and below we write t in place of (ti, t−i) even though ti is given and t−i is the variable of integration.) For any profile s of pure strategies, Lemma 17.23 implies that

\[
\int_T u_i(s(t), t) \, d\mu = \int_{T_i} V_i(s_i(t_i), t_i, s_{-i}) \, d\mu_i(t_i) .
\]

Therefore s is an equilibrium if and only if for each i and μi-a.e. ti ∈ Ti,

\[
V_i(s_i(t_i), t_i, s_{-i}) \ge V_i(a_i, t_i, s_{-i})
\]

for all ai ∈ Ai.

For a profile s−i of pure strategies and ti ∈ Ti let B_{s−i}(ti) be the set of maximizers of Vi(·, ti, s−i). In view of Proposition 17.5 B_{s−i}(ti) is nonempty. The correspondence B_{s−i} : Ti → Ai is the interim best response correspondence for s−i. We say that it is semimonotone if, for all ti, t′i ∈ Ti such that t′i ≿ ti, if ai ∈ B_{s−i}(ti) and a′i ∈ B_{s−i}(t′i), then ai ∨ a′i ∈ B_{s−i}(t′i).

We now assume that each Ai is a lattice and not just a semilattice. Player i's interim payoff Vi is weakly quasisupermodular if

\[
V_i(a_i, t_i, s_{-i}) \ge V_i(a_i \wedge a'_i, t_i, s_{-i}) \ \text{ implies } \ V_i(a_i \vee a'_i, t_i, s_{-i}) \ge V_i(a'_i, t_i, s_{-i})
\]

for all profiles s−i of monotone pure strategies, all ai, a′i ∈ Ai, and all ti ∈ Ti. (This is part (a) of the definition of quasisupermodularity.) It satisfies weak single crossing if

\[
V_i(a'_i, t_i, s_{-i}) \ge V_i(a_i, t_i, s_{-i}) \ \text{ implies } \ V_i(a'_i, t'_i, s_{-i}) \ge V_i(a_i, t'_i, s_{-i})
\]

for all profiles s−i of monotone pure strategies, all ai, a′i ∈ Ai such that a′i ≿ ai, and all ti, t′i ∈ Ti such that t′i ≿ ti.

Obviously Vi is weakly quasisupermodular if ui is weakly quasisupermodular in the sense that

\[
u_i(a_i, a_{-i}, t) \ge u_i(a_i \wedge a'_i, a_{-i}, t) \ \text{ implies } \ u_i(a_i \vee a'_i, a_{-i}, t) \ge u_i(a'_i, a_{-i}, t)
\]

for all ai, a′i ∈ Ai, a−i ∈ A−i, and t ∈ T. Similarly, Vi satisfies weak single crossing if ui satisfies weak single crossing in the sense that

\[
u_i(a'_i, a_{-i}, t_i, t_{-i}) \ge u_i(a_i, a_{-i}, t_i, t_{-i}) \ \text{ implies } \ u_i(a'_i, a_{-i}, t'_i, t_{-i}) \ge u_i(a_i, a_{-i}, t'_i, t_{-i})
\]

for all a−i ∈ A−i, t−i ∈ T−i, ai, a′i ∈ Ai such that a′i ≿ ai, and ti, t′i ∈ Ti such that t′i ≿ ti.

Theorem 17.8 If Ai is a lattice and Vi is weakly quasisupermodular and satisfies weak single crossing, then player i's set of monotone best responses to any profile s−i of monotone pure strategies is nonempty and join-closed.


Proof As we mentioned above, Proposition 17.5 implies that B_{s−i}(ti) is always nonempty. We now show that B_{s−i} is semimonotone. Suppose that ti, t′i ∈ Ti, t′i ≿ ti, ai ∈ B_{s−i}(ti), and a′i ∈ B_{s−i}(t′i). Then Vi(ai ∧ a′i, ti, s−i) ≤ Vi(ai, ti, s−i), so weak quasisupermodularity implies that Vi(a′i, ti, s−i) ≤ Vi(ai ∨ a′i, ti, s−i), and then weak single crossing implies that Vi(a′i, t′i, s−i) ≤ Vi(ai ∨ a′i, t′i, s−i). Since B_{s−i}(t′i) contains a′i, it must also contain ai ∨ a′i.

Semimonotonicity (with ti = t′i) implies that for all ai, a′i ∈ B_{s−i}(ti), ai ∨ a′i ∈ B_{s−i}(ti). Therefore B_{s−i}(ti) is a subsemilattice of Ai. The pointwise join of two monotone pure best responses is evidently monotone, and it assigns an interim optimal action to μi-a.e. ti ∈ Ti, so it is a pure best response.

It remains to show that the set of monotone best replies to s−i is nonempty. For each ti let āi(ti) := ⋁B_{s−i}(ti). Since B_{s−i}(ti) is compact and a semilattice, Corollary 17.1 implies that āi(ti) ∈ B_{s−i}(ti). To see that āi is monotone, suppose that t′i ≿ ti. Since āi(ti) ∈ B_{s−i}(ti) and āi(t′i) ∈ B_{s−i}(t′i), the semimonotonicity of B_{s−i} implies that āi(ti) ∨ āi(t′i) ∈ B_{s−i}(t′i). Since āi(t′i) is the largest element of B_{s−i}(t′i) we have āi(t′i) ≿ āi(ti) ∨ āi(t′i) ≿ āi(ti), as desired. □
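The steps of this argument can be traced on a small finite example. The sketch below assumes a three-point chain of actions (so join is `max` and meet is `min`, and weak quasisupermodularity holds trivially), three ordered types, and a hypothetical interim payoff `V` with the opponents' strategies held fixed. It brute-forces the two hypotheses, the semimonotonicity of the best response correspondence, and the monotonicity of the selection of largest best responses.

```python
actions = [0, 1, 2]   # a chain, so join = max and meet = min
types = [0, 1, 2]


def V(a, t):
    """Hypothetical interim payoff with the opponents' strategies held fixed."""
    return a * (2 * t + 1) - a * a


def best(t):
    top = max(V(a, t) for a in actions)
    return {a for a in actions if V(a, t) == top}


weakly_qsm = all(
    V(max(a, b), t) >= V(b, t)
    for t in types for a in actions for b in actions
    if V(a, t) >= V(min(a, b), t)
)
single_crossing = all(
    V(hi, t2) >= V(lo, t2)
    for lo in actions for hi in actions if hi >= lo
    for t1 in types for t2 in types if t2 >= t1
    if V(hi, t1) >= V(lo, t1)
)
semimonotone = all(
    max(a, b) in best(t2)
    for t1 in types for t2 in types if t2 >= t1
    for a in best(t1) for b in best(t2)
)
largest = {t: max(best(t)) for t in types}
print(weakly_qsm, single_crossing, semimonotone, largest)
```

In this example the best response sets are {0, 1}, {1, 2} and {2}, so they are not singletons, yet the largest best response is nondecreasing in the type, as the last paragraph of the proof predicts.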

17.9 A Simplicial Characterization of ANR’s

Our remaining task is the proof of Theorem 17.6, which turns out to be a long and winding road. It was developed by Dugundji, not all at once, but in phases, over the course of about fifteen years, in Dugundji (1952), Dugundji (1957), and Dugundji (1965). Except for a result of Whitehead, all the material below comes from these three papers. There will be several sets of conditions that imply that a metric space is an ANR, of which those given by Theorem 17.6 are perhaps the simplest. The other sets of conditions are both necessary and sufficient. This section gives the one whose proof is most intricate, because its proof uses an elaborate and ingenious construction.

Throughout this section we work with a fixed metric space (X, d). Theorem 6.3 allows us to regard X as a relatively closed subset of a convex subset C of a Banach space. If X is a retract of a neighborhood W ⊂ C, then (Proposition 8.3) X is an ANR. Conversely, if X is not a retract of any neighborhood W ⊂ C, then X does not satisfy the definition of an ANR.

We recall the constructions associated with a combinatorial simplicial complex (V, Σ). Let { ev : v ∈ V } be the standard unit basis vectors, for each nonempty σ ∈ Σ let |σ| be the convex hull of { ev : v ∈ σ }, let 𝒫 := { |σ| : σ ∈ Σ }, and let |𝒫| := ⋃_{∅≠σ∈Σ} |σ|. We endow |𝒫| with the CW topology: a set U ⊂ |𝒫| is open if and only if its intersection with each |σ| is open in the usual sense. In particular a function with domain |𝒫| is continuous if and only if its restriction to each |σ| is continuous.

Let 𝒮 be a collection of subsets of X that cover X. A map f : |𝒫| → X is a realization of 𝒫 relative to 𝒮 if, for each P ∈ 𝒫, there is some S ∈ 𝒮 such that f(P) ⊂ S. A partial realization of 𝒫 relative to 𝒮 is a map f′ : |𝒫′| → X, where 𝒫′ is a subcomplex of 𝒫 that contains every vertex of 𝒫, such that for all P ∈ 𝒫 there is some S ∈ 𝒮 such that f′(P ∩ |𝒫′|) ⊂ S.

Proposition 17.11 If X is an ANR, then any open cover 𝒱 of X has a refinement 𝒱′ such that every partial realization of a simplicial complex relative to 𝒱′ extends to a realization relative to 𝒱.

Proof Let r : W → X be a retraction of a neighborhood W ⊂ C of X, and let an open cover 𝒱 of X be given. For each x ∈ X choose a Vx ∈ 𝒱 that contains x, and choose a convex neighborhood V′x ⊂ W of x such that r(V′x) ⊂ Vx. Let 𝒱′ := {V′x ∩ X}x∈X.

Let 𝒫 be a simplicial complex, let 𝒫′ be a subcomplex that contains all the vertices of 𝒫, and let f′ : |𝒫′| → X be a partial realization of 𝒫 relative to 𝒱′. For each P ∈ 𝒫 let Z_P be the convex hull of f′(P ∩ |𝒫′|). We first construct an extension f̃ : |𝒫| → W of f′ such that f̃(P) ⊂ Z_P for all P ∈ 𝒫 \ 𝒫′. At the outset f̃ is already defined on the set |𝒫^0| of all vertices. Proceeding by induction on dimension, suppose that we have already defined f̃ on the (n − 1)-skeleton |𝒫^{n−1}|. If P ∈ 𝒫 \ 𝒫′ is n-dimensional, we already have f̃(∂P) ⊂ Z_P. If β_P is the barycenter of P, every point in P is (1 − t)β_P + ty for some y ∈ ∂P and some t ∈ [0, 1]. After choosing an arbitrary point f̃(β_P) ∈ Z_P we extend f̃ to P by setting

\[
\tilde{f}((1 - t)\beta_P + t y) := (1 - t) \tilde{f}(\beta_P) + t \tilde{f}(y) .
\]

Let f := r ∘ f̃ : |𝒫| → X. For each P ∈ 𝒫 there is an x ∈ X such that f′(P ∩ |𝒫′|) ⊂ V′x ∩ X. Since V′x is convex and contains Z_P, it contains f̃(P), so f(P) ⊂ Vx. □
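The inductive cone construction in this proof can be sketched in code. The version below makes simplifying assumptions that are not in the text: the subcomplex on which the map is already given consists of the vertices only, the target convex set is the real line (so images are rational numbers), and points are given by barycentric coordinates. The names `cone_extend`, `choose_center`, `images` and `top` are hypothetical.

```python
from fractions import Fraction


def cone_extend(bary, vertex_images, choose_center):
    """Value of the extension at a point given by barycentric coordinates.

    bary: dict vertex -> nonnegative Fraction weight, weights summing to 1.
    vertex_images: dict vertex -> image of that vertex (a Fraction here).
    choose_center: maps the vertex list of a face to the value chosen at
        that face's barycenter (any point of the relevant convex hull).
    """
    support = [v for v, w in bary.items() if w > 0]
    if len(support) == 1:
        return vertex_images[support[0]]
    k = len(support)
    t = 1 - k * min(bary[v] for v in support)   # t = 0 at the barycenter
    center = choose_center(support)
    if t == 0:
        return center
    # y is the boundary point on the ray from the barycenter through the point.
    y = {v: Fraction(1, k) + (bary[v] - Fraction(1, k)) / t for v in support}
    return (1 - t) * center + t * cone_extend(y, vertex_images, choose_center)


images = {"a": Fraction(0), "b": Fraction(1), "c": Fraction(3)}
top = lambda face: max(images[v] for v in face)   # a deliberately non-affine choice

p = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
value = cone_extend(p, images, top)
print(value)
```

The recursion mirrors the induction on dimension: the value on a face is obtained from the value at its barycenter and the values already defined on its boundary. Every value produced lies in the convex hull of the vertex images, which is the property the proof uses when composing with the retraction.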

This section's main result asserts that a weakening of the condition in the result above is sufficient as well as necessary for X to be an ANR. In preparation we introduce some related concepts. Let ∂X := X ∩ cl(C \ X), where cl denotes closure. For a neighborhood W ⊂ C of X, an open cover 𝒰 of W \ X is canonical if:

(a) Each neighborhood V of a point in ∂X contains a neighborhood V′ such that V contains every U ∈ 𝒰 that has a nonempty intersection with V′.

(b) For each U ∈ 𝒰, d(X, U) > 0.

Lemma 17.24 Any refinement of a canonical cover is canonical. There is a locally finite canonical cover of C \ X.

Proof If 𝒰 is a canonical cover and 𝒱 is a refinement, then 𝒱 obviously satisfies (a) and (b). The open cover { U_{d(y,X)/2}(y) : y ∈ C \ X } is easily seen to be canonical. Since metric spaces are paracompact, any canonical cover has a locally finite refinement. □

Let U be an open cover of X. An open cover V is a star refinement of U if, for each V ∈ V, some U ∈ U contains the union of all the V′ ∈ V that have a nonempty intersection with V.

Lemma 17.25 Every open cover U of X has a star refinement.


Proof Since metric spaces are paracompact, we can first of all replace U with a locally finite refinement. For each x ∈ X let δ_x be the supremum of the set of δ such that U_δ(x) ⊂ U for all U ∈ U such that x ∈ U, and let V_x := U_{δ_x/2}(x). Then {V_x}_{x∈X} is satisfactory because for each x and x′ ∈ V_x, V_{x′} is contained in each U ∈ U that contains x because U contains x′. □

For an open set U ⊂ X the extension of U is

E(U) = { y ∈ C : for some r > 0, ∅ ≠ U_r(y) ∩ X ⊂ U }.

Clearly E(U ) is open in C and contains U .

Lemma 17.26 For all open U, V ⊂ X, E(U ∩ V ) = E(U ) ∩ E(V ).

Proof If y ∈ E(U) ∩ E(V) because ∅ ≠ U_{r_U}(y) ∩ X ⊂ U and ∅ ≠ U_{r_V}(y) ∩ X ⊂ V, and r := min{r_U, r_V}, then ∅ ≠ U_r(y) ∩ X ⊂ U ∩ V. If y ∈ E(U ∩ V) because ∅ ≠ U_r(y) ∩ X ⊂ U ∩ V, then ∅ ≠ U_r(y) ∩ X ⊂ U and ∅ ≠ U_r(y) ∩ X ⊂ V. □

Theorem 17.9 If every open cover V of X has a refinement V′ such that every partial realization of a locally finite simplicial complex relative to V′ extends to a realization relative to V, then X is an ANR.

Proof We first construct a sequence V_0, V_1, V_2, . . . of open covers of X inductively. Let V_0 := {X}. If V_{n−1} has already been constructed, the following steps are used to construct V_n:

(a) Let V^a_n be a refinement of V_{n−1} of mesh < 1/n.

(b) Let V^b_n be a refinement of V^a_n such that every partial realization of a simplicial complex relative to V^b_n extends to a realization relative to V^a_n.

(c) Let V^c_n be a refinement of V^b_n such that every partial realization of a simplicial complex relative to V^c_n extends to a realization relative to V^b_n.

(d) Let V_n be a star refinement of V^c_n.

Let W := ⋃_{V∈V_1} E(V). We will construct a retraction r : W → X. Lemma 17.24 gives a locally finite canonical cover U of W \ X that refines { E(V) : V ∈ V_1 }. Let N_U := (U, Σ_U) be the nerve of this cover.

Let W_1 := W. Proceeding inductively, if W_{n−1} has already been defined, for each x ∈ X choose an open neighborhood W_n(x) ⊂ U_{1/n}(x) such that there is some V_n ∈ V_n such that E(V_n) ∩ W_{n−1} contains W_n(x) and every U ∈ U such that U ∩ W_n(x) ≠ ∅. Let W_n := ⋃_x W_n(x). Since X ⊂ W_n ⊂ U_{1/n}(X) we have ⋂_{n=1}^∞ W_n = X.

For U ∈ U, let n_U be the largest n such that U ∩ W_n ≠ ∅. For each n let U_n := { U ∈ U : n_U = n }. For each U ∈ U, if U ∈ U_n choose a V_U ∈ V_n such that U ⊂ E(V_U). This inclusion implies that we can choose a point η_0(U) ∈ V_U with d(z, η_0(U)) < 2d(z, X) for some z ∈ U. Then η_0 maps the zero skeleton |N^0_U| of N_U into X.


Let U′_n := U_n ∪ U_{n+1}, and define the simplicial complexes (U_n, Σ_n) and (U′_n, Σ′_n) by setting:

Σ_n := { σ ∈ Σ_U : σ ⊂ U_n } and Σ′_n := { σ ∈ Σ_U : σ ⊂ U′_n }.

If U ∈ U_m, U′ ∈ U_n, and n > m + 1, then U ∩ W_{m+1} = ∅, U′ ∩ W_{n+1} = ∅, and U′ ⊂ W_n, so U ∩ U′ = ∅. Therefore ⋃_n Σ′_n = Σ_U, Σ′_n ∩ Σ′_{n+1} = Σ_{n+1}, and Σ′_m ∩ Σ′_n = ∅ if n > m + 1.

Consider σ = {U_1, . . . , U_k} ∈ Σ′_n. Since ∅ ≠ U_1 ∩ · · · ∩ U_k ⊂ E(V_{U_1}) ∩ · · · ∩ E(V_{U_k}), Lemma 17.26 implies that V_{U_1} ∩ · · · ∩ V_{U_k} ≠ ∅. Each V_{U_i} is an element of V_n or of V_{n+1}, which is a refinement of V_n. Since V_n is a star refinement of V^c_n, there is some V^c_n ∈ V^c_n that contains η_0(U_1), . . . , η_0(U_k) because it contains V_{U_1} ∪ · · · ∪ V_{U_k}. Therefore the restriction of η_0 to the 0-skeleton of (U′_n, Σ′_n) is a partial realization of this simplicial complex relative to V^c_n.

Consequently the restriction of η_0 to the 0-skeleton of (U_n, Σ_n) is a partial realization of (U_n, Σ_n) relative to V^c_n, which extends to a realization η_n : |(U_n, Σ_n)| → X relative to V^b_n. Now η_n and η_{n+1} together constitute a partial realization of (U′_n, Σ′_n) relative to V^b_n, which extends to a realization η′_n : |(U′_n, Σ′_n)| → X relative to V^a_n. The sets Σ_n are disjoint, and if σ ∈ Σ_U is not in one of these sets, then it is in precisely one Σ′_n, so these realizations combine to form an unambiguously defined continuous function η : |N_U| → X.

Let K_U : W \ X → |N_U| be the function defined in Sect. 8.6, for some partition of unity subordinate to U. There is a function r : W → X given by

\[ r(w) := \begin{cases} w, & w \in X, \\ \eta(K_U(w)), & w \in W \setminus X. \end{cases} \]

Consider a sequence {w_k} in W \ X that converges to x ∈ X. Fixing a w_k, let σ := { U ∈ U : w_k ∈ U }, let n be an integer such that σ ∈ Σ′_n, and let V be an element of V^a_n that contains η(|σ|). The distance from x to r(w_k) is bounded by the sum of: a) the distance from x to w_k; b) the maximum diameter of any U ∈ σ; c) twice the maximum distance of any U ∈ σ to X; d) the diameter of V. Since U is canonical, the first three quantities can be made arbitrarily small by taking w_k sufficiently close to x, and the last one can be made arbitrarily small by making n sufficiently large, which also results from w_k being sufficiently close to x. Thus r(w_k) → x. The restrictions of r to X and W \ X are continuous, and X is closed in W, so this shows that r is continuous. □

17.10 More Simplicial Topology

Let P be a simplicial complex. This section’s goal is:


Theorem 17.10 Each open cover U of |P| has a refinement U′ such that every partial realization of a simplicial complex relative to U′ extends to a realization relative to U.

In view of Theorem 17.9, it follows that:

Corollary 17.6 A metrizable simplicial complex is an ANR.

We will prove Theorem 17.10 in stages, first giving additional hypotheses under which the conclusion would follow, then showing how these hypotheses can be realized. Let P = P(V, Σ). For each v ∈ V let e_v be the corresponding unit basis vector of R^V, so that for each nonempty σ ∈ Σ, |σ| is the convex hull of { e_v : v ∈ σ }.

Proposition 17.12 Suppose that W ⊂ R^V is a superset of |P| for which there is a retraction r : W → |P| such that each point x ∈ |P| has a convex neighborhood T_x ⊂ W such that r(T_x) ⊂ T_x ∩ |P|. Then every partial realization of a simplicial complex relative to {T_x ∩ |P|}_{x∈|P|} extends to a realization relative to {T_x ∩ |P|}_{x∈|P|}.

Proof The argument is essentially the same as the proof of Proposition 17.11. Let Q be a simplicial complex, let Q′ be a subcomplex that contains all the vertices of Q, and let f′ : |Q′| → |P| be a partial realization of Q relative to {T_x ∩ |P|}_{x∈|P|}. For each Q ∈ Q let Z_Q be the convex hull of f′(Q ∩ |Q′|).

Let f_0 := f′. Obviously f_0(Q) ⊂ Z_Q for all Q ∈ Q^0. Proceeding inductively, suppose we have already constructed f_{n−1} : |Q^{n−1} ∪ Q′| → W such that f_{n−1}(Q) ⊂ Z_Q for all Q. To define f_n, consider a Q ∈ Q^n \ Q′. If β_Q is the barycenter of Q, every point in Q is (1 − t)β_Q + t y for some y ∈ ∂Q and some t ∈ [0, 1]. After choosing an arbitrary point f_n(β_Q) ∈ Z_Q we extend f_{n−1} to Q by setting

f_n((1 − t)β_Q + t y) := (1 − t) f_n(β_Q) + t f_{n−1}(y).

In this way we define a sequence of functions f_0, f_1, f_2, . . ., each of which extends its predecessor. These combine to give a function f̃ : |Q| → W that is continuous because its restriction to each simplex is continuous. Let f := r ∘ f̃. For each Q ∈ Q there is an x such that f′(Q ∩ |Q′|) ⊂ T_x. Evidently Z_Q ⊂ T_x, so f̃(Q) ⊂ T_x and f(Q) ⊂ r(T_x) ⊂ T_x ∩ |P|. □

The next part of the argument depends on explicit geometric considerations, so we introduce concrete specifics. Let K be the set of all convex hulls of finitely many of the e_v, and let Q be a subcomplex of P. Let P′, Q′, and K′ be the barycentric subdivisions of P, Q, and K respectively.

Lemma 17.27 Q′ is normal in P′.

Proof Each element of P′ is the convex hull P′ = conv({β_{P_1}, . . . , β_{P_k}}) of the barycenters of simplices P_1, . . . , P_k ∈ P with P_1 ⊂ · · · ⊂ P_k. If the vertices β_{P_1}, . . . , β_{P_k} are all in Q′, then P_1, . . . , P_k ∈ Q, so P′ ∈ Q′. □


We assume that the barycenter β_P of P = |σ| is ∑_{v∈σ} e_v times the inverse of the number of elements of σ. Of course a generic point in |K| is ∑_{v∈σ} α_v e_v for some nonempty finite σ ⊂ V and positive numbers α_v such that ∑_{v∈σ} α_v = 1.

Lemma 17.28 For each P = |σ| ∈ K, st(β_P, K) is the set of ∑_{v∈σ′} α_v e_v such that σ ∩ σ′ = ∅ and α_v > α_{v′} for all v ∈ σ and v′ ∈ σ′. Consequently st(β_P, K) is convex.

Proof Insofar as a point in st(β_P, K) is ∑_{σ′} γ_{σ′} β_{|σ′|} for some finite collection of σ′, each of which is either a subset or a superset of σ, it is clear that every point in st(β_P, K) has the asserted form. Conversely, suppose that a point has the asserted form. We can view it as a convex combination of β_{|σ′|} and a sum of the asserted form with fewer terms. Repeating this reduction eventually achieves a representation of the point as a strict convex combination of the barycenters of some P_1, . . . , P_k ∈ P with P_1 ⊂ · · · ⊂ P_k and P = P_i for some i. □

Proposition 17.13 Let W := st(|P|, K′). The sets st(β_P, K′) for P ∈ P are a system of convex neighborhoods (in W) of the points of |P|. There is a retraction r : W → |P| such that r(st(β_P, K′)) ⊂ st(β_P, P′) for all P ∈ P.

Proof The first assertion follows from the last result and the fact that for any simplicial complex the stars of the vertices are an open cover. Since P′ is normal in K′, Proposition 16.4 gives a retraction r : W → |P| such that r(P ∩ W) ⊂ P for all P ∈ K′. □

Recall that a subdivision of P is a simplicial complex P′ such that |P′| = |P| and each P ∈ P is the union of finitely many elements of P′. The assertion of Theorem 17.10 depends only on |P|, so we are free to replace P with a subdivision. Proposition 17.12 and the last result imply that if each st(β_P, P′) is contained in some U ∈ U, then we can set U′ := { st(β_P, P′) : P ∈ P }. If P is finite, then repeated barycentric subdivision suffices to bring this situation about, but in general the number of barycentric subdivisions required to do the job need not be bounded. Instead we inductively subdivide each skeleton P^n in a way that does not modify the subdivision of P^{n−1}.

Let Δ be the standard n-dimensional unit simplex in R^{n+1}, regarded as a simplicial complex whose cells are the various faces, and let ∂Δ be the subcomplex consisting of all proper faces.

Lemma 17.29 Suppose that U is an open cover of |Δ|, and Q is a simplicial subdivision of |∂Δ| such that for each vertex v of Q there is some U_v ∈ U such that st(v, Q) ⊂ U_v. Then there is a simplicial subdivision P of |Δ| that restricts to Q on |∂Δ| such that for each vertex v of Q, st(v, P) ⊂ U_v, and for each vertex v′ of P there is some U_{v′} ∈ U such that st(v′, P) ⊂ U_{v′}.

Proof We construct a sequence P_0, P_1, P_2, . . . of triangulations of Δ as follows. Let β_Δ be the barycenter of Δ. Let P_0 be the simplicial complex consisting of Q together with all simplices of the form conv({β_Δ} ∪ Q) where Q ∈ Q. Supposing that P_{k−1} has already been constructed, we construct P_k as follows. For each P ∈ P_{k−1} that is not contained in |∂Δ| let β_P be the barycenter of P. The 0-skeleton P^0_k of P_k is the union of P^0_{k−1} and the set of such barycenters. If P^{r−1}_k has already been defined, P^r_k is formed by appending all r-dimensional simplices in Q and all simplices of the form conv({β_P} ∪ R) where P ∈ P_{k−1} \ Q and R is an (r − 1)-dimensional element of P^{r−1}_k that is contained in the boundary of P. Figure 17.1 shows P_0 (solid lines) and P_1 (dotted lines). Note that (by induction on r) if Q ∈ Q, P ∈ P_k \ Q, and P ∩ ∂Δ = Q, then P is a convex combination of Q and barycenters of simplices P′ ∈ P_{k−1} \ Q such that P′ ∩ ∂Δ = Q.

Fig. 17.1 P_0 (solid) and P_1 (dotted)

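Let d_k be the maximum distance from β_Q for some Q ∈ Q to a vertex (other than the vertices of Q) of a P ∈ P_k \ Q such that P ∩ ∂Δ = Q. If P is a simplex in P_{k−1}, ∅ ≠ Q := P ∩ ∂Δ ≠ P, the vertices of P are v_1, . . . , v_k, and the vertices of Q are v_1, . . . , v_ℓ, then

\[
\begin{aligned}
\|\beta_P - \beta_Q\|
&= \Bigl\| \frac{1}{k}\sum_{i=1}^{k} v_i - \frac{1}{\ell}\sum_{j=1}^{\ell} v_j \Bigr\|
 = \Bigl\| \frac{1}{k}\sum_{i=\ell+1}^{k} v_i - \Bigl(\frac{1}{\ell} - \frac{1}{k}\Bigr)\sum_{j=1}^{\ell} v_j \Bigr\| \\
&= \frac{k-\ell}{k}\,\Bigl\| \frac{1}{k-\ell}\sum_{i=\ell+1}^{k} v_i - \frac{1}{\ell}\sum_{j=1}^{\ell} v_j \Bigr\|
 \le \frac{k-\ell}{k}\,\max_{i=\ell+1,\dots,k} \|v_i - \beta_Q\| .
\end{aligned}
\]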

Therefore d_k ≤ (n/(n+1)) d_{k−1}. In particular, if v is a vertex of Q and st(v, Q) ⊂ U ∈ U, then st(v, P_k) ⊂ U for large k.

Suppose that k is large enough that for each vertex v of Q there is some U ∈ U such that st(v, P_k) ⊂ U. If Q ∈ Q, P′ ∈ P_{k+1}, and P′ ∩ ∂Δ = Q, then every point of P′ is contained in the interior of some P ∈ P_k that has Q as a face. Therefore


st(∂Δ, P_{k+1}) ⊂ st(∂Δ, P_k). The Lebesgue number lemma gives an ε > 0 such that for every x ∈ Δ \ st(∂Δ, P_k) there is some U ∈ U that contains the ball of radius ε centered at x. For each k′ ≥ k + 1 the restriction to Δ \ st(∂Δ, P_{k+1}) of the passage from P_{k′} to P_{k′+1} is barycentric subdivision, so the fact (Sect. 2.5) that each round of barycentric subdivision reduces the maximum diameter of the simplices by a fixed multiplicative factor implies that if k′ is sufficiently large, then the maximum diameter is less than ε. For such a k′ let w be a vertex of P_{k′}. If w ∈ Δ \ st(∂Δ, P_k), then the diameter of st(w, P_{k′}) is less than ε, so st(w, P_{k′}) is contained in some U ∈ U. If w ∈ st(∂Δ, P_k), then there is some vertex v of Q such that w ∈ st(v, P_k), so st(w, P_{k′}) ⊂ st(v, P_k), which is contained in some U ∈ U. □
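As a numerical sanity check of the barycenter estimate used in the proof above, the following Python sketch evaluates the three quantities in the displayed chain for a given list of vertices v_1, . . . , v_k, of which the first ℓ are the vertices of Q. The function names (`mean`, `dist`, `barycenter_bound`) and the sample points are illustrative assumptions, not notation from the text.

```python
import math


def mean(points):
    """Coordinatewise average of a nonempty list of points."""
    dim = len(points[0])
    return [sum(p[j] for p in points) / len(points) for j in range(dim)]


def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def barycenter_bound(vertices, ell):
    """Return (lhs, middle, rhs) for the chain
    |beta_P - beta_Q| = ((k-ell)/k) |mean(v_{ell+1..k}) - beta_Q|
                     <= ((k-ell)/k) max_i |v_i - beta_Q|,
    where Q is spanned by the first `ell` vertices (0 < ell < k)."""
    k = len(vertices)
    beta_p = mean(vertices)
    beta_q = mean(vertices[:ell])
    factor = (k - ell) / k
    lhs = dist(beta_p, beta_q)
    middle = factor * dist(mean(vertices[ell:]), beta_q)
    rhs = factor * max(dist(v, beta_q) for v in vertices[ell:])
    return lhs, middle, rhs


# Hypothetical sample: a tetrahedron whose first two vertices span Q.
sample = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.5, 0.5, 3.0]]
lhs, middle, rhs = barycenter_bound(sample, ell=2)
```

The first two values should agree up to rounding and the third should dominate them; since k − ℓ ≤ n and k ≤ n + 1 for simplices in an n-dimensional Δ, the factor (k − ℓ)/k is at most n/(n + 1), which is the contraction used in the next step of the proof.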

Theorem 17.11 (Whitehead 1939) If U is an open cover of |P|, there is a subdivision P′ such that for each vertex v′ of P′ there is some U_{v′} ∈ U such that st(v′, P′) ⊂ U_{v′}.

Proof For each vertex v of P choose a U_v ∈ U containing v. Proceeding inductively, suppose that we have already constructed a subdivision P_{k−1} of the (k − 1)-skeleton P^{k−1} and for each vertex v of P_{k−1} we have chosen a U_v ∈ U such that st(v, P_{k−1}) ⊂ U_v. We construct P_k by applying the last result to each k-simplex, obtaining an extension such that for all vertices of P_{k−1}, st(v, P_{k−1}) ⊂ U_v, and for each vertex v of P_k that is not a vertex of P_{k−1} there is some U_v ∈ U such that this inclusion holds. Evidently ⋃_k P_k is satisfactory. □

Clearly Theorem 17.10 follows from Propositions 17.12 and 17.13 and Whitehead’s theorem.

17.11 Additional Characterizations of ANR’s

Fix a metric space (X, d). Recall that X is locally equiconnected if there is a neighborhood W ⊂ X × X of the diagonal and a map λ : W × [0, 1] → X such that

λ(x, x′, 0) = x′, λ(x, x′, 1) = x, and λ(x, x, t) = x

for all (x, x′) ∈ W and t ∈ [0, 1]. Such a λ is called an equiconnecting function. Proposition 8.9 asserts that an ANR is locally equiconnected. If it were true, the converse of this would be a strengthening and simplification of Theorem 17.6. In fact the converse holds for spaces that are (in an appropriate sense) finite dimensional, and whether it is true in general was unknown for many years. Eventually, however, the question was resolved by a counterexample of Cauty (1994). Thus we are led to consider strengthenings of local equiconnectedness. We will arrive at the relevant strengthening via a sequence of propositions, each of which implies the hypotheses of its predecessor.


Proposition 17.14 If, for each open cover U of X, there is a simplicial complex Q, and maps ϕ : X → |Q| and ψ : |Q| → X such that ψ ∘ ϕ and Id_X are U-homotopic, then every open cover V of X has a refinement V′ such that every partial realization of a simplicial complex relative to V′ extends to a realization relative to V.

Proof Let Q be a simplicial complex for which there are maps ϕ : X → |Q| and ψ : |Q| → X and a V-homotopy h : X × [0, 1] → X with h_0 = ψ ∘ ϕ and h_1 = Id_X. For each x there is a V ∈ V such that h(x, t) ∈ V for all t, and for each t continuity gives a neighborhood W_t of x and ε_t > 0 such that h(W_t × (t − ε_t, t + ε_t)) ⊂ V. If W_{t_1} × (t_1 − ε_{t_1}, t_1 + ε_{t_1}), . . . , W_{t_n} × (t_n − ε_{t_n}, t_n + ε_{t_n}) cover {x} × [0, 1] and W_x := ⋂_i W_{t_i}, then h(W_x × [0, 1]) ⊂ V. Therefore there is an open cover W of X such that for each W ∈ W there is some V ∈ V such that h(W × [0, 1]) ⊂ V.

Now W̃ := {ψ^{−1}(W)}_{W∈W} is an open cover of |Q|, and Theorem 17.10 implies that there is a refinement Z̃ such that every partial realization of a simplicial complex relative to Z̃ extends to a realization relative to W̃. Let V′ be a common refinement of W and {ϕ^{−1}(Z)}_{Z∈Z̃}.

Let P be a simplicial complex, and let P′ be a subcomplex that contains every vertex of P. Let A := (|P| × {0}) ∪ (|P′| × [0, 1]). The proof of Proposition 16.4 (adjustment of details is left to the reader) gives a retraction r : |P| × [0, 1] → A such that r(P × [0, 1]) ⊂ P × [0, 1] for all P ∈ P.

Let f′ : |P′| → X be a partial realization of P′ relative to V′. Of course g′ := ϕ ∘ f′ is a partial realization relative to Z̃, which extends to a realization g : |P| → |Q| relative to W̃. Let δ : A → X be the map

\[ \delta(y,t) := \begin{cases} \psi(g(y)), & t = 0, \\ h(\psi(g'(y)), t), & (y,t) \in |P'| \times [0,1] . \end{cases} \]

Let f : |P| → X be the map f(y) := δ(r(y, 1)). If y ∈ |P′|, then

f(y) = δ(r(y, 1)) = δ(y, 1) = h(ψ(ϕ(f′(y))), 1) = f′(y).

For each P ∈ P there is some W ∈ W such that g(P) ⊂ ψ^{−1}(W), so

f(P) ⊂ δ((P × [0, 1]) ∩ A) ⊂ h(ψ(g(P)) × [0, 1]) ⊂ h(W × [0, 1]),

and consequently there is some V ∈ V such that f(P) ⊂ V. Thus f′ extends to a realization relative to V. □

Proposition 17.15 Suppose X is locally equiconnected and each open cover V ofX has a refinement V ′ such that every partial realization of the 0-skeleton of asimplicial complex relative to V ′ extends to a realization relative to V . Then foreach open cover U of X there is a simplicial complex Q and maps ϕ : X → |Q|and ψ : |Q| → X, such that ψ ◦ ϕ and IdX are U -homotopic.


Proof Let U be an open cover of X. Proposition 8.8 gives a refinement V such that any two V-close maps of a topological space Y into X are (stationarily) U-homotopic. Let V∗ be a star refinement of V. Let W be a refinement of V∗ such that every partial realization of the 0-skeleton of a simplicial complex relative to W extends to a realization relative to V∗. Finally (per Lemmas 17.24 and 17.25) let W∗ be a locally finite star refinement of W, let Q := N_{W∗} = (W∗, Σ_{W∗}) be the nerve of W∗, and let ϕ := K_{W∗} : X → |Q| be the function defined in Sect. 8.6, for some partition of unity subordinate to W∗.

Let ψ_0 : W∗ → X be a function assigning a point in W∗ to each W∗ ∈ W∗. This is a partial realization of Q relative to W because for any {W∗_1, . . . , W∗_n} ∈ Σ_{W∗}, W∗_1 ∩ · · · ∩ W∗_n ≠ ∅ and consequently W∗_1 ∪ · · · ∪ W∗_n is contained in some W ∈ W. Therefore ψ_0 extends to a full realization ψ : |Q| → X relative to V∗.

For given x ∈ X let W∗_1, . . . , W∗_n be the elements of W∗ that contain x. Since x ∈ W∗_1 ∩ · · · ∩ W∗_n, there is a V∗_0 ∈ V∗ that contains W∗_1 ∪ · · · ∪ W∗_n. Since ψ is a realization relative to V∗ there is V∗_1 ∈ V∗ that contains ψ(|{W∗_1, . . . , W∗_n}|). Each ψ_0(W∗_i) is in W∗_i ⊂ V∗_0, and it is in V∗_1, so V∗_0 ∩ V∗_1 ≠ ∅. Consequently there is a V ∈ V that contains V∗_0 ∪ V∗_1. Now x ∈ W∗_1 ∩ · · · ∩ W∗_n ⊂ V∗_0 ⊂ V and ψ(ϕ(x)) ∈ ψ(|{W∗_1, . . . , W∗_n}|) ⊂ V∗_1 ⊂ V. Since x was arbitrary, ψ ∘ ϕ and Id_X are V-close, hence U-homotopic. □
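The proof above is built on the nerve of a cover. For a finite cover by finite sets the nerve can be enumerated directly, which may help in visualizing the complex Q. The sketch below assumes the usual definition (a finite subfamily is a simplex exactly when its members have a common point); the function name `nerve` and the toy cover are hypothetical, and the brute-force enumeration is only sensible for very small covers.

```python
from itertools import combinations


def nerve(cover):
    """Nerve of a finite cover.

    `cover` maps a label to a finite set of points. The result is the set
    of simplices, each a frozenset of labels whose sets share a point."""
    labels = sorted(cover)
    simplices = set()
    for size in range(1, len(labels) + 1):
        for combo in combinations(labels, size):
            common = set.intersection(*(set(cover[name]) for name in combo))
            if common:  # nonempty common intersection
                simplices.add(frozenset(combo))
    return simplices


# Hypothetical toy cover of {0, ..., 5} by three overlapping "intervals".
toy_cover = {"A": {0, 1, 2}, "B": {2, 3}, "C": {3, 4, 5}}
toy_nerve = nerve(toy_cover)
```

In this toy case the nerve consists of three vertices and the two edges {A, B} and {B, C}; there is no edge {A, C} because A and C are disjoint.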

Proposition 17.16 Let W ⊂ X × X be a neighborhood of the diagonal, and let λ : W × [0, 1] → X be an equiconnecting function. If, for each x ∈ X and each neighborhood U of x, there is a neighborhood V ⊂ U that is λ-stable in U, then each open cover V of X has a refinement V′ such that every partial realization of the 0-skeleton of a simplicial complex relative to V′ extends to a realization relative to V.

Proof Let an open cover V of X be given. Let V′ be an open cover such that for each V′ ∈ V′ there is some V ∈ V such that V′ is λ-stable in V. Let P be a simplicial complex, for each n = 0, 1, 2, . . . let P^n be the n-skeleton of P, and let f^0 : |P^0| → X be a partial realization of the 0-skeleton of P relative to V′.

The well ordering theorem gives a complete strict ordering of P^0. We construct a sequence of extensions f^n : |P^n| → X as follows. Suppose that f^{n−1} has already been constructed. Let P ∈ P^n be an n-simplex with vertices v_0, . . . , v_n, where the ordering of the indices agrees with the ordering of the vertices. If P′ is the convex hull of v_0, . . . , v_{n−1}, then each y ∈ P is t y′ + (1 − t)v_n for some y′ ∈ P′. Here t is uniquely determined by y, and y′ is uniquely determined unless t = 0, so we can set

f^n(y) := λ(f^0(v_n), f^{n−1}(y′), t).

Note that this definition does not disagree with the one already given by f^{n−1} if y ∈ P′, and it also does not disagree with the one given by f^{n−1} if y is in any other facet of P, so f^n is well defined and continuous. Let f : |P| → X be the function whose graph is the union of the graphs of the f^n. Since its restriction to each P ∈ P is continuous, f is continuous. If f^0(v_0), . . . , f^0(v_n) ∈ V′ ∈ V′, then induction evidently gives f^n(P) ⊂ λ(V′ × V′_{n−1} × [0, 1]) = V′_n. Therefore the function f is a realization of P relative to V. □

This completes the proof of Theorem 17.6: by assumption X is locally equiconnected, and Proposition 17.16 implies that each open cover V of X has a refinement V′ such that every partial realization of the 0-skeleton of a simplicial complex relative to V′ extends to a realization relative to V, after which we follow the chain of implications (d) ⇒ (e) ⇒ (f) ⇒ (g) ⇒ (a) in the next result. It seems unlikely that an ANR necessarily has an equiconnecting function λ such that for each x ∈ X and each neighborhood U of x there is a neighborhood V ⊂ U that is λ-stable in U. However, if X is an ANR, then it is locally equiconnected and satisfies a strengthening of (d) below. Combining this fact, the various results above, and relevant results from Chap. 8, yields the following omnibus result.

Theorem 17.12 For a metric space X the following are equivalent:

(a) X is an ANR.

(b) There is a convex subset C of Banach space such that (a homeomorphic image of) X is both a closed subset of C and a retract of a neighborhood U ⊂ C.

(c) X (or its homeomorphic image) is a retract of an open subset of a convex subset of a locally convex linear space.

(d) X is locally equiconnected and each open cover V of X has a refinement V′ such that every partial realization of the 0-skeleton of a simplicial complex relative to V′ extends to a realization relative to V.

(e) For each open cover U of X there is a simplicial complex Q, and maps ϕ : X → |Q| and ψ : |Q| → X, such that ψ ∘ ϕ and Id_X are U-homotopic.

(f) Every open cover V of X has a refinement V′ such that every partial realization of a simplicial complex relative to V′ extends to a realization relative to V.

(g) Every open cover V of X has a refinement V′ such that every partial realization of a locally finite simplicial complex relative to V′ extends to a realization relative to V.

Proof
(a) ⇒ (b) Proposition 8.4.
(b) ⇒ (c) Automatic.
(c) ⇒ (a) Proposition 8.3.
(a) ⇒ (d) Proposition 8.9 and Theorem 17.9.
(d) ⇒ (e) Proposition 17.15.
(e) ⇒ (f) Proposition 17.14.
(f) ⇒ (g) Automatic.
(g) ⇒ (a) Theorem 17.9. □


Exercises

17.1 Recall the setting of Exercise 3.7. There are finite sets M and W of men and women. Each m ∈ M has a strict preference ordering ≻_m of W ∪ {∅}, and each w ∈ W has a strict preference ordering ≻_w of M ∪ {∅}, where ∅ represents being unmatched. A match is a function

μ : M ∪ W → M ∪ W ∪ {∅}

such that μ(M) ⊂ W ∪ {∅}, μ(W) ⊂ M ∪ {∅}, μ(μ(m)) = m for all m ∈ M such that μ(m) ∈ W, and μ(μ(w)) = w for all w ∈ W such that μ(w) ∈ M. The match μ is stable if no one would prefer being unmatched to their assigned partner and there do not exist m ∈ M and w ∈ W such that w ≻_m μ(m) and m ≻_w μ(w). If μ and μ′ are matches, define μ ∨ μ′ : M ∪ W → M ∪ W ∪ {∅} by letting (μ ∨ μ′)(m) be the ≻_m-best element of {μ(m), μ′(m)} and letting (μ ∨ μ′)(w) be the ≻_w-worst element of {μ(w), μ′(w)}, and define μ ∧ μ′ : M ∪ W → M ∪ W ∪ {∅} by letting (μ ∧ μ′)(m) be the ≻_m-worst element of {μ(m), μ′(m)} and letting (μ ∧ μ′)(w) be the ≻_w-best element of {μ(w), μ′(w)}.

(a) Prove that if μ and μ′ are stable matchings, then μ ∨ μ′ and μ ∧ μ′ are stable matchings.

(b) Prove that the set of stable matchings (with these operations) is a lattice.
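Before attempting the proofs it can help to experiment with small instances. The following Python sketch encodes the definitions of stability and of the operations ∨ and ∧ from Exercise 17.1. The data layout (preference lists ordered from best to worst, with `None` standing for ∅) and all names are assumptions made for illustration; the sketch checks examples and proves nothing.

```python
def prefers(prefs, person, a, b):
    """True if `person` strictly prefers a to b (earlier in the list is better)."""
    order = prefs[person]
    return order.index(a) < order.index(b)


def is_stable(men, women, prefs, mu):
    """Check individual rationality and the absence of blocking pairs."""
    for x in list(men) + list(women):
        if prefers(prefs, x, None, mu[x]):  # x would rather be unmatched
            return False
    for m in men:
        for w in women:
            if prefers(prefs, m, w, mu[m]) and prefers(prefs, w, m, mu[w]):
                return False  # (m, w) is a blocking pair
    return True


def join(men, women, prefs, mu1, mu2):
    """mu1 v mu2: each man gets his better partner, each woman her worse one."""
    out = {}
    for m in men:
        a, b = mu1[m], mu2[m]
        out[m] = b if prefers(prefs, m, b, a) else a
    for w in women:
        a, b = mu1[w], mu2[w]
        out[w] = b if prefers(prefs, w, a, b) else a
    return out


def meet(men, women, prefs, mu1, mu2):
    """mu1 ^ mu2: the same construction with the two sides exchanged."""
    return join(women, men, prefs, mu1, mu2)


# Hypothetical 2 x 2 market with opposed preferences.
men, women = ["m1", "m2"], ["w1", "w2"]
prefs = {
    "m1": ["w1", "w2", None], "m2": ["w2", "w1", None],
    "w1": ["m2", "m1", None], "w2": ["m1", "m2", None],
}
mu_men = {"m1": "w1", "m2": "w2", "w1": "m1", "w2": "m2"}
mu_women = {"m1": "w2", "m2": "w1", "w1": "m2", "w2": "m1"}
```

In this market both matchings are stable, and one can check that `join` returns the matching the men prefer while `meet` returns the one the women prefer, consistent with part (a).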

17.2 We study some different auction forms for the setting of Sect. 17.2: there are N bidders, their types (which are their values for the object being auctioned) are independent identically distributed random variables t_1, . . . , t_N whose common cumulative distribution function F : [0, 1] → [0, 1] (with F(0) = 0 and F(1) = 1) is C^1 with probability density function f(t) := F′(t) > 0.

(a) In the second price auction (also known as the Vickrey auction) each bidder i submits a bid a_i ∈ [0, 1], the object is awarded to the agent whose bid is highest, that agent pays the second highest bid, and everyone else pays zero. Prove that the unique equilibrium of the second price auction is for each agent to bid her value, i.e., a_i = t_i.

(b) In the all pay auction each bidder i submits a bid a_i ∈ [0, 1], the object is awarded to the agent whose bid is highest, and each agent pays her bid. Derive the differential equation analogous to Eq. (17.1) that is satisfied by the bidding strategy s : [0, 1] → [0, 1] of a symmetric equilibrium.

(c) For the case of uniformly distributed values (F(t) = t and f(t) = 1) compute the auctioneer’s expected revenues for the first price, second price, and all pay auctions.
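A closed-form answer to part (c) can be checked against simulation. The sketch below estimates the auctioneer's expected revenue in the second price auction under truthful bidding with uniformly distributed values, by averaging the second highest of N independent draws. The function name, the sample size, and the seed are arbitrary illustrative choices, and the estimate carries Monte Carlo error.

```python
import random


def second_price_revenue(num_bidders, num_draws, seed=0):
    """Monte Carlo estimate of expected revenue in a second price auction
    when each of `num_bidders` (at least 2) bids her value, drawn
    independently and uniformly from [0, 1]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_draws):
        values = sorted(rng.random() for _ in range(num_bidders))
        total += values[-2]  # the winner pays the second highest bid
    return total / num_draws


estimate = second_price_revenue(num_bidders=3, num_draws=100_000)
```

Analogous simulations for the first price and all pay auctions require the equilibrium bidding strategies, which are the subject of Sect. 17.2 and part (b).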

17.3 (This and the following three problems are based on Myerson (1981).) We generalize the setting of the last problem, letting bidder i’s space of types (valuations) be the interval T_i = [α_i, ω_i]. We continue to assume that types are independently distributed, but we no longer assume that they are identically distributed: for each i, the cumulative distribution function F_i : T_i → [0, 1] is C^1, F_i(α_i) = 0, F_i(ω_i) = 1, and f_i(t_i) := F′_i(t_i) > 0 for all t_i. A mechanism for selling the object consists of nonempty measurable spaces B_1, . . . , B_N of messages for the buyers, a measurable allocation rule q = (q_1, . . . , q_N) : B → [0, 1]^N (where B := B_1 × · · · × B_N) such that ∑_i q_i(b) ≤ 1 for all b, and a measurable payment rule p = (p_1, . . . , p_N) : B → R^N. Here q_i(b) is the probability that agent i receives the object and p_i(b) is agent i’s payment to the auctioneer. Note that we allow the auctioneer to sometimes not sell the object. An equilibrium for the mechanism is an N-tuple β = (β_1, . . . , β_N) of measurable functions β_i : T_i → B_i such that for each i and t_i, the expected surplus

\[ \int_{T_{-i}} \bigl( q_i(\beta_i(t_i), \beta_{-i}(t_{-i}))\, t_i - p_i(\beta_i(t_i), \beta_{-i}(t_{-i})) \bigr)\, f_{-i}(t_{-i})\, dt_{-i} \]

(the “−i” notation has the usual and obvious interpretation) resulting from playing β_i(t_i) is at least as large as the expected surplus for t_i resulting from playing any other b′_i ∈ B_i. A mechanism is direct if B_i = T_i for all i. Let T := T_1 × · · · × T_N. Prove the revelation principle: if β is an equilibrium, then there is a direct mechanism q′ = (q′_1, . . . , q′_N) : T → [0, 1]^N and p′ = (p′_1, . . . , p′_N) : T → R^N such that truth telling (that is, the profile of strategies (Id_{T_1}, . . . , Id_{T_N})) is an equilibrium and q(β_1(t_1), . . . , β_N(t_N)) = q′(t) and p(β_1(t_1), . . . , β_N(t_N)) = p′(t) for all t ∈ T.

17.4 Continuing with the framework of the last problem, let q : T → [0, 1]^N and p : T → R^N be a direct mechanism. For each i let Q_i : T_i → [0, 1] and P_i : T_i → R be the functions

\[ Q_i(t_i) := \int_{T_{-i}} q_i(t_i, t_{-i})\, f_{-i}(t_{-i})\, dt_{-i} \quad\text{and}\quad P_i(t_i) := \int_{T_{-i}} p_i(t_i, t_{-i})\, f_{-i}(t_{-i})\, dt_{-i} . \]

Let U_i : T_i → R and V_i : T_i → R be the functions U_i(t_i) := Q_i(t_i)t_i − P_i(t_i) and V_i(t_i) := max_{z_i∈T_i} Q_i(z_i)t_i − P_i(z_i). That is, Q_i(t_i) is the probability of winning the object and P_i(t_i) is the expected payment when agent i reports t_i and the other agents are following their truth telling strategies, U_i(t_i) is the expected surplus when i has type t_i and reports it truthfully, and V_i(t_i) is the maximal expected surplus.

(a) Observing that for each z_i, t_i ↦ Q_i(z_i)t_i − P_i(z_i) is an affine function, prove that V_i is convex.

(b) Prove that a convex function is absolutely continuous. Use Theorem 15.3 to show that V_i is differentiable almost everywhere and V_i(t_i) = V_i(α_i) + ∫_{α_i}^{t_i} V′_i(s_i) ds_i for all t_i ∈ T_i.

17.5 Continuing the last exercise, we say that truth telling is incentive compatible for i if U_i(t_i) = V_i(t_i) for all t_i ∈ T_i. We say that truth telling is incentive compatible if this is the case for all i.

(a) Observing that V_i(z_i) ≥ Q_i(t_i)z_i − P_i(t_i) = U_i(t_i) + Q_i(t_i)(z_i − t_i), show that if truth telling is incentive compatible for i, then U′_i = Q_i is nondecreasing.


(b) Prove the converse: if Q_i is nondecreasing, then truth telling is incentive compatible for i.

(c) Observe that U_i(t_i) = U_i(α_i) + ∫_{α_i}^{t_i} Q_i(s_i) ds_i when truth telling is incentive compatible for i, so that

\[ P_i(t_i) = -U_i(\alpha_i) + Q_i(t_i)\, t_i - \int_{\alpha_i}^{t_i} Q_i(s_i)\, ds_i . \tag{17.2} \]

The last result is the (generalized) revenue equivalence theorem. It implies that two direct revelation mechanisms for which truth telling is incentive compatible have the same expected revenue for the auctioneer if they have the same allocation rule and the same expected utilities for the lowest types of each bidder. We say that the mechanism is individually rational if U_i(t_i) ≥ 0 for all i and t_i ∈ T_i.

(d) Prove that if truth telling is incentive compatible, then the mechanism is individually rational if and only if U_i(α_i) ≥ 0 for all i.

17.6 Again we continue from the last problem.

(a) Compute bidder i’s expected payment by integrating (17.2), reversing the order of integration in the double integral, arriving at the following expression

\[ -U_i(\alpha_i) + \int_{\alpha_i}^{\omega_i} \Bigl( t_i - \frac{1 - F_i(t_i)}{f_i(t_i)} \Bigr)\, Q_i(t_i)\, f_i(t_i)\, dt_i . \]

The expression t_i − (1 − F_i(t_i))/f_i(t_i) is called the virtual valuation of buyer i with type t_i. Suppose that Eq. (17.2) holds, for each i the virtual valuation of t_i is a nondecreasing function of t_i, and for each t ∈ T, q(t) assigns all probability to those i for which the virtual valuation is maximal if the maximal virtual valuation is positive and no probability to any i with a negative virtual valuation.

(b) Prove that the mechanism is incentive compatible.

(c) Prove that if U_i(α_i) = 0 for all i, then there is no other individually rational incentive compatible direct revelation mechanism that gives the auctioneer a higher expected revenue.

(d) What is the revenue maximizing auction when each agent’s valuation is uniformly distributed on [0, 1]?

It can easily happen that the virtual valuation is not nondecreasing, in which case finding the optimal mechanism requires additional analysis. However, there is a natural condition that implies that the virtual valuation is nondecreasing. The function H_i(t_i) := f_i(t_i)/(1 − F_i(t_i)) is called the hazard rate of the cumulative distribution function F_i. (In applications in which t_i represents time, H_i(t_i)δ approximates the probability of the event falling in the interval [t_i, t_i + δ) conditional on it not having already happened.) If H_i(t_i) is an increasing function of t_i, then the virtual valuation is an increasing function.
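The relation between the hazard rate and the virtual valuation, namely that the virtual valuation equals t_i − 1/H_i(t_i), is easy to explore numerically. The sketch below evaluates both quantities for a distribution supplied as a pair of Python callables; the function names and the choice of the uniform distribution as an example are illustrative assumptions. Both functions are meant for types strictly below ω_i, since the hazard rate is undefined where F_i = 1.

```python
def virtual_valuation(t, cdf, pdf):
    """t - (1 - F(t)) / f(t), the virtual valuation of type t."""
    return t - (1.0 - cdf(t)) / pdf(t)


def hazard_rate(t, cdf, pdf):
    """f(t) / (1 - F(t)); only defined where F(t) < 1."""
    return pdf(t) / (1.0 - cdf(t))


def is_nondecreasing(values):
    """True if the finite sequence `values` never decreases."""
    return all(a <= b for a, b in zip(values, values[1:]))


# Illustrative example: the uniform distribution on [0, 1].
uniform_cdf = lambda t: t
uniform_pdf = lambda t: 1.0

grid = [i / 100 for i in range(100)]  # stops short of t = 1
vv = [virtual_valuation(t, uniform_cdf, uniform_pdf) for t in grid]
hz = [hazard_rate(t, uniform_cdf, uniform_pdf) for t in grid]
```

On this grid both `hz` and `vv` are increasing, as the remark above predicts; substituting a distribution with a decreasing hazard rate is a quick way to look for examples in which monotonicity of the virtual valuation fails.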


17.7 Mass transportation is a topic in mathematics originated by Gaspard Monge(1746–1818). The leader of contemporary research on this topic is Cedric Villani,who was awarded the Fields Medal in 2010. We sketch a connection between masstransportation and mechanism design (e.g., Ekeland2010) that has emerged recently.Let T and A be finite sets of types and actions, and let u : T × A → R be a function.Initially we assume that T and A have the same cardinality. A bijection ξ : T → Ais optimal if it maximizes

∑t u(t, ξ(t)) among all bijections ξ ′ : T → A. (As mass

transportation is usually described, T and A are locations, and the problem is tominimize the cost of moving unit masses initially located at the elements of T to theelements of A, where−u(t, a) is the cost of moving a unit mass from t to a.) A chainis a sequence t0, . . . , tN of elements of T , and such a chain is a cycle if tN = t0.

(a) Prove that ξ is optimal if and only if

\[ \sum_{n=0}^{N-1} \bigl[ u(t_n, \xi(t_n)) - u(t_n, \xi(t_{n+1})) \bigr] \ge 0 \tag{17.3} \]

for all cycles t_0, …, t_N = t_0.
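The equivalence in (a) can be explored by brute force on a small example. The sketch below is an illustration under stated assumptions, not a proof: types and actions are both indexed 0, …, n − 1, u is a matrix with u[t][a] the payoff, the function names are hypothetical, and only cycles of distinct types are enumerated (a cycle with repeated types splits into such cycles, so nothing is lost).

```python
from itertools import permutations

def total_value(u, xi):
    """Sum of u(t, xi(t)) over all types t."""
    return sum(u[t][xi[t]] for t in range(len(xi)))

def optimal_bijections(u):
    """All bijections (as tuples) attaining the maximal total value."""
    n = len(u)
    best = max(total_value(u, p) for p in permutations(range(n)))
    return [p for p in permutations(range(n)) if total_value(u, p) == best]

def cycle_sum(u, xi, cycle):
    """Left side of (17.3) for the cycle t_0, ..., t_{N-1}, t_N = t_0.

    `cycle` lists t_0, ..., t_{N-1}; the successor of the last entry is t_0.
    """
    n = len(cycle)
    return sum(
        u[t][xi[t]] - u[t][xi[cycle[(k + 1) % n]]]
        for k, t in enumerate(cycle)
    )

def satisfies_cycle_condition(u, xi):
    """Check (17.3) on every cycle of distinct types of length at least 2."""
    n = len(xi)
    for length in range(2, n + 1):
        for cycle in permutations(range(n), length):
            if cycle_sum(u, xi, cycle) < 0:
                return False
    return True

# Integer payoffs (an arbitrary example) avoid floating point comparisons.
u_example = [[3, 1, 0],
             [2, 4, 1],
             [0, 2, 5]]
```

In this example the identity assignment is the unique optimum, and swapping the actions of the first two types produces a two-element cycle whose sum in (17.3) is −4.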

We now drop the assumption that T and A have the same cardinality, and let ξ : T → A be an arbitrary function. A function f : T → R is a potential for ξ and t̄ ∈ T if f(t̄) = 0 and

\[ f(t') \ge f(t) + u(t', \xi(t)) - u(t, \xi(t)) \]

for all t, t′ ∈ T. For a given ξ and t̄ define f_{ξ,t̄} : T → [−∞, ∞) by setting

\[ f_{\xi,\bar t}(t) := \inf \sum_{n=0}^{N-1} \bigl[ u(t_n, \xi(t_n)) - u(t_n, \xi(t_{n+1})) \bigr] \]

where the infimum is over all chains t_0, …, t_N with t_0 = t̄ and t_N = t.

(b) Prove that if ξ satisfies (17.3), then f_{ξ,t̄} is a potential for ξ and t̄.

The function ξ is incentive compatible if there is a salary function s : T → R such that

\[ u(t, \xi(t)) + s(t) \ge u(t, \xi(t')) + s(t') \tag{17.4} \]

for all t, t′ ∈ T.

(c) Prove that if ξ is incentive compatible, then it satisfies (17.3).

(d) Prove that if f is a potential for ξ and some t, and s : T → R is the function s(t) := f(t) − u(t, ξ(t)), then (17.4) holds. Conclude that ξ is incentive compatible if and only if (17.3) holds.

(e) Observe that if (17.4) holds, then there is a function s̃ : ξ(T) → R such that s = s̃ ∘ ξ, so that u(t, ξ(t)) + s̃(ξ(t)) ≥ u(t, a) + s̃(a) for all t and a ∈ ξ(T).
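The link between the cycle condition (17.3) and the salary inequalities (17.4) can also be examined computationally. The sketch below rests on an assumption that is not the route laid out in parts (b) to (d): it treats the sum over a chain as a path length with edge weight u(t, ξ(t)) − u(t, ξ(t′)) from t to t′, computes the infimum over chains from a base type with the Bellman–Ford method, and then uses those infima directly as salaries, which is the standard difference-constraints observation. All function names are hypothetical.

```python
import math

def edge_weight(u, xi, t, t_next):
    """One summand of the chain sum: u(t, xi(t)) - u(t, xi(t_next))."""
    return u[t][xi[t]] - u[t][xi[t_next]]

def chain_infimum(u, xi, base):
    """Infimum of chain sums from `base` to each type, via Bellman-Ford.

    Returns None when some cycle has a negative sum, i.e. (17.3) fails
    and the infimum is minus infinity.
    """
    n = len(xi)
    dist = [math.inf] * n
    dist[base] = 0
    for _ in range(n - 1):
        for t in range(n):
            for t_next in range(n):
                candidate = dist[t] + edge_weight(u, xi, t, t_next)
                if candidate < dist[t_next]:
                    dist[t_next] = candidate
    # One further pass: any remaining improvement reveals a negative cycle.
    for t in range(n):
        for t_next in range(n):
            if dist[t] + edge_weight(u, xi, t, t_next) < dist[t_next]:
                return None
    return dist

def is_incentive_compatible(u, xi, s):
    """Check inequality (17.4) for every pair of types."""
    n = len(xi)
    return all(
        u[t][xi[t]] + s[t] >= u[t][xi[tp]] + s[tp]
        for t in range(n)
        for tp in range(n)
    )

# Three types, three actions (arbitrary integer payoffs).
u_square = [[3, 1, 0],
            [2, 4, 1],
            [0, 2, 5]]

# Three types, two actions: xi need not be a bijection here.
u_wide = [[2, 0],
          [1, 1],
          [0, 2]]
```

With these inputs the identity assignment on `u_square` yields the salaries 0, 2, 3, while swapping the actions of the first two types creates a negative cycle, so no salary function exists for that assignment.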


17.8 (Reny 2011) We consider an auction with n bidders and m homogeneous units of a single good for sale. The agents simultaneously submit bids, where a bid for agent i is a vector a_i = (a_i1, …, a_im) such that a_i1 ≥ · · · ≥ a_im. The resulting price is the largest number p such that there are at least m + 1 distinct pairs (i, j) such that a_ij ≥ p. For each a_ij that is greater than p, agent i receives one unit. The tie breaking rule is that the agents are ordered randomly, with each ordering having probability 1/n!, the first agent is awarded a unit for each bid equal to p, then the second agent is awarded a unit for each bid equal to p, and so forth until the supply is exhausted. Each agent pays p for each unit received. (This is called a uniform-price multiunit auction.)
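The allocation rule just described can be written out directly for a fixed tie breaking order. This is a minimal sketch with hypothetical names (`uniform_price_outcome`, `bids`, `order`); it assumes there are at least m + 1 bids in total, so that the price, which is the (m + 1)-th highest bid, is well defined.

```python
def uniform_price_outcome(bids, order, m):
    """Price and allocation of the uniform-price auction for one agent order.

    bids  -- list of bid vectors, one per agent, each of length m
    order -- tie breaking order, a list of agent indices
    m     -- number of units for sale
    Returns (price, units), where units[i] is the number of units agent i wins.
    """
    all_bids = sorted((b for agent in bids for b in agent), reverse=True)
    if len(all_bids) < m + 1:
        raise ValueError("need at least m + 1 bids to determine the price")
    # Largest p with at least m + 1 bids >= p: the (m + 1)-th highest bid.
    price = all_bids[m]
    # Every bid strictly above the price wins a unit.
    units = [sum(1 for b in agent if b > price) for agent in bids]
    remaining = m - sum(units)
    # Bids equal to the price are served in the given order of agents.
    for i in order:
        if remaining == 0:
            break
        tied = sum(1 for b in bids[i] if b == price)
        award = min(tied, remaining)
        units[i] += award
        remaining -= award
    return price, units
```

For example, with m = 2 and bids (0.9, 0.1) and (0.6, 0.3), the price is 0.3 and each agent wins one unit. With bids (0.5, 0.5) and (0.5, 0.2) the price is 0.5, no bid exceeds it, and the allocation depends entirely on the order of the agents. Averaging the outcomes over all n! orders would give the random outcome of the tie breaking rule in the text.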

For each i let A_i := { a_i ∈ [0, 1]^m : a_i1 ≥ · · · ≥ a_im }, and let A := ∏_i A_i. We endow A_i with the usual metric and the componentwise partial order: a′_i ⪰ a_i if and only if a′_ij ≥ a_ij for all j.

(a) Verify that A_i is a compact locally complete metric semilattice.

(b) Fixing an order of the agents for tie breaking purposes, show that for any i, a_i, a′_i ∈ A_i, and a_−i ∈ A_−i, the pair of outcomes resulting from (a_i, a_−i) and (a′_i, a_−i) is the same as the pair of outcomes resulting from (a_i ∨ a′_i, a_−i) and (a_i ∧ a′_i, a_−i). By averaging over all orders, show that for the given tie breaking rule, the pair of random outcomes resulting from (a_i, a_−i) and (a′_i, a_−i) is the same as the pair of random outcomes resulting from (a_i ∨ a′_i, a_−i) and (a_i ∧ a′_i, a_−i).

For each i let T_i := { t_i ∈ [0, 1]^m : t_i1 ≥ · · · ≥ t_im }, and let T := ∏_i T_i. For each i there is a C^1 increasing concave utility-of-money function v_i : R → R. If the bid profile a results in i winning k units at price p, and the type profile is t, then i’s utility is

\[ u_i(a, t) := v_i\Bigl( \sum_{j=1}^{k} t_{ij} - kp \Bigr). \]

(Reny allows the monetary values of the units to depend on the entire profile t.) More generally, u_i(a, t) is the average of this quantity over the distribution of pairs (k, p) induced by a.

(c) Show that ui is weakly quasisupermodular.

Let α_i = v′_i(−m)/v′_i(m) − 1. We endow T_i with the Borel σ-algebra, which we denote by 𝒯_i, and the partial order ≥_i given by t′_i ≥_i t_i if and only if

\[ t'_{ik} - \alpha_i \sum_{j=0}^{k-1} t'_{ij} \ge t_{ik} - \alpha_i \sum_{j=0}^{k-1} t_{ij} \]

for all k = 1, …, m. Let μ_i be a probability measure on T_i that is given by a continuous density f_i, which may vanish on some parts of T_i.


(d) Show that (T_i, 𝒯_i, μ_i) is a partially ordered probability space (hint: ≥_i is closed) that is atomless and separable.

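(e) Suppose that t′_i ≥_i t_i, that k′ > k, and that p′ ≥ p. Justify each inequality in the following calculation:

\[
\begin{aligned}
v_i\Bigl( \sum_{j=1}^{k'} t'_{ij} - k'p' \Bigr) - v_i\Bigl( \sum_{j=1}^{k'} t_{ij} - k'p' \Bigr)
&\ge v'_i(m) \times \sum_{j=1}^{k'} (t'_{ij} - t_{ij})
\ge v'_i(m) \times \sum_{j=1}^{k+1} (t'_{ij} - t_{ij}) \\
&\ge v'_i(-m) \times \sum_{j=1}^{k} (t'_{ij} - t_{ij})
\ge v_i\Bigl( \sum_{j=1}^{k} t'_{ij} - kp \Bigr) - v_i\Bigl( \sum_{j=1}^{k} t_{ij} - kp \Bigr).
\end{aligned}
\]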

(f) Prove that ui satisfies weak single crossing.

For each i let C_i be the set of measurable s_i : T_i → A_i such that s_ij(t_i) ≤ t_ij for all t_i and all j = 1, …, m. Let C := ∏_i C_i.

(g) Prove that C_i is pointwise-limit-closed, piecewise-limit closed, and join-closed.

(h) Explain how Theorems 17.7 and 17.8 can be used to show that there is a monotone pure strategy equilibrium in C.

References

Alexander, J. W. (1924). An example of a simply-connected surface bounding a region which is not simply-connected. Proceedings of the National Academy of Sciences of the United States of America, 10, 8–10.

Amann, H., & Weiss, S. A. (1973). On the uniqueness of the topological degree. Mathematische Zeitschrift, 130, 39–54.

Arora, S., & Boaz, B. (2007). Computational complexity: A modern approach. Cambridge: Cambridge University Press.

Arrow, K. J., & Debreu, G. (1954). Existence of an equilibrium for a competitive economy. Econometrica, 22, 265–290.

Arrow, K. J., & Hurwicz, L. (1958). On the stability of the competitive equilibrium, I. Econometrica, 26, 522–552.

Arrow, K. J., Block, H. D., & Hurwicz, L. (1959). On the stability of the competitive equilibrium, II. Econometrica, 27, 82–109.

Athey, S. (2001). Single crossing properties and the existence of pure strategy equilibria in games of incomplete information. Econometrica, 69, 861–889.

Banks, J. S., & Sobel, J. (1987). Equilibrium selection in signalling games. Econometrica, 55, 647–661.

Barelli, P., & Meneghel, I. (2013). A note on the equilibrium existence problem in discontinuous games. Econometrica, 81, 813–824.

Bauer, A. (2017). Five stages of accepting constructive mathematics. The Bulletin of the American Mathematical Society, 51, 481–498.

Benedetti, R., & Risler, J. J. (1990). Real algebraic and semi-algebraic sets. Paris: Hermann.

Blume, L., & Zame, W. (1994). The algebraic geometry of perfect and sequential equilibrium. Econometrica, 62, 783–794.

Blume, L., Brandenburger, A., & Dekel, E. (1991a). Lexicographical probabilities and choice under uncertainty. Econometrica, 59, 61–79.

Blume, L., Brandenburger, A., & Dekel, E. (1991b). Lexicographical probabilities and equilibrium refinements. Econometrica, 59, 81–98.

Bohl, P. (1904). Über die Bewegung eines mechanischen Systems in der Nähe einer Gleichgewichtslage. J Reine Angew Math, 127, 179–276.

Bollobás, B. (1979). Graph theory: An introductory course. New York: Springer.

Border, K. C. (1985). Fixed point theorems with applications to economics and game theory. Cambridge: Cambridge University Press.


Borsuk, K. (1935). Sur un continu acyclique qui se laisse transformer topologiquement en lui-même sans points invariants. Fundamenta Mathematicae, 24, 51–58.

Borsuk, K. (1937). Sur les prolongements des transformations continues. Fundamenta Mathematicae, 28, 99–110.

Borsuk, K. (1967). Theory of retracts. Warsaw: Polish Scientific Publishers.

Bourgin, D. G. (1955a). Un indice dei punti uniti i. Atti Accad Naz Lincei, 19, 435–440.

Bourgin, D. G. (1955b). Un indice dei punti uniti ii. Atti Accad Naz Lincei, 20, 43–48.

Bourgin, D. G. (1956). Un indice dei punti uniti iii. Atti Accad Naz Lincei, 21, 395–400.

Brouwer, L. E. J. (1912). Über Abbildung von Mannigfaltigkeiten. Mathematische Annalen, 71, 97–115.

Browder, F. (1948). The topological fixed point theory and its applications to functional analysis. PhD thesis, Princeton University.

Brown, A. (1935). Functional dependence. The Transactions of the American Mathematical Society, 38, 379–394.

Brown, R. (1971). The Lefschetz Fixed Point Theorem. Glenview, IL: Scott Foresman and Co.

do Carmo, M. P. (1976). Differential geometry of curves and surfaces. Englewood Cliffs: Prentice-Hall.

Cauty, R. (1994). Un espace métrique linéaire qui n’est pas un rétract absolu. Fundamenta Mathematicae, 146, 85–99.

Chen, X., & Deng, X. (2006a). On the complexity of 2D discrete fixed point problem. In Proceedings of the 33rd International Colloquium on Automata, Languages, and Programming (pp. 489–500).

Chen, X., & Deng, X. (2006b). Settling the complexity of two-player Nash equilibrium. In Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science (pp. 261–272).

Cho, I. K., & Kreps, D. M. (1987). Signalling games and stable equilibria. The Quarterly Journal of Economics, 102, 179–221.

Cohen, D. I. A. (1967). On the Sperner lemma. Journal of Combinatorial Theory, 2, 765–771.

Condon, A. (1992). The complexity of stochastic games. Information and Computation, 96, 203–224.

Conley, C. (1978). Isolated invariant sets and the Morse index. Providence: American Mathematical Society.

van Damme, E. (1987). Stability and perfection of Nash equilibria. Berlin: Springer.

Daskalakis, C., Goldberg, P., & Papadimitriou, C. (2006). The complexity of computing a Nash equilibrium. In Proceedings of the 38th ACM Symposium on the Theory of Computing.

Debreu, G. (1970). Economies with a finite set of equilibria. Econometrica, 38, 387–392.

Debreu, G. (1974). Excess demand functions. Journal of Mathematical Economics, 1, 15–21.

deFinetti, B. (1936). Les probabilités nulles. Bulletin des Sciences Mathématiques, 60, 275–288.

deFinetti, B. (1949a). On the axiomatization of probability. Probability, induction, statistics: the art of guessing, chap. 5 (pp. 67–113). New York: Wiley.

deFinetti, B. (1949b). Sull’impostazione assiomatica del calcolo della probabilità. Annali Triestini Univ Trieste, 19, 29–81.

Demichelis, S., & Germano, F. (2002a). On (un)knots and dynamics in games. Games and Economic Behavior, 41, 46–60.

Demichelis, S., & Germano, F. (2002b). Some consequences of the unknottedness of the Walras correspondence. Journal of Mathematical Economics, 34, 537–545.

Demichelis, S., & Ritzberger, K. (2003). From evolutionary to strategic stability. Journal of Economic Theory, 113, 51–75.

Dudley, R. M. (1989). Real analysis and probability. Cambridge: Cambridge University Press.

Dugundji, J. (1951). An extension of Tietze’s theorem. Pacific Journal of Mathematics, 1, 353–367.

Dugundji, J. (1952). Note on CW polytopes. Port Math, 11, 7–10.

Dugundji, J. (1957). Absolute neighborhood retracts and local connectedness in arbitrary metric spaces. Compositio Mathematica, 13, 229–246.

Dugundji, J. (1965). Locally equiconnected spaces and absolute neighborhood retracts. Fundamenta Mathematicae, 52, 187–193.

Dugundji, J., & Granas, A. (2003). Fixed Point Theory. New York: Springer.


Eaves, B. C. (1972). Homotopies for computation of fixed points. Mathematical Programming, 3, 1–22.

Eaves, B. C., & Saigal, R. (1972). Homotopies for computation of fixed points on unbounded regions. Mathematical Programming, 3, 225–237.

Echenique, F. (2005). A short and constructive proof of Tarski’s fixed point theorem. International Journal of Game Theory, 33, 215–218.

Echenique, F. (2008). The correspondence principle. In S. Durlauf & L. Blume (Eds.), The new Palgrave dictionary of economics (2nd ed.). New York: Palgrave Macmillan.

Eilenberg, S., & Montgomery, D. (1946). Fixed-point theorems for multivalued transformations. American Journal of Mathematics, 68, 214–222.

Ekeland, I. (2010). Notes on optimal transportation. Economic Theory, 42, 437–459.

Eraslan, H., & McLennan, A. (2013). Uniqueness of stationary equilibrium payoffs in coalitional bargaining. Journal of Economic Theory, 148, 2195–2222.

Fan, K. (1952). Fixed point and minimax theorems in locally convex linear spaces. Proceedings of the National Academy of Sciences of the United States of America, 38, 121–126.

Federer, H. (1969). Geometric measure theory. New York: Springer.

Florenzano, M. (2003). General equilibrium analysis: Existence and optimality properties of equilibria. Boston: Kluwer Academic.

Fort, M. (1950). Essential and nonessential fixed points. American Journal of Mathematics, 72, 315–322.

Furi, M., Pera, M. P., & Spadini, M. (2004). On the uniqueness of the fixed point index on differentiable manifolds. Fixed Point Theory and Applications, 4, 251–259.

Galántai, A. (2000). The theory of Newton’s method. Journal of Computational and Applied Mathematics, 124, 25–44.

Gale, D., & Mas-Colell, A. (1975). An equilibrium existence theorem for a general model without ordered preferences. Journal of Mathematical Economics, 2, 9–15.

Gale, D., & Mas-Colell, A. (1979). Correction to an equilibrium existence theorem for a general model without ordered preferences. Journal of Mathematical Economics, 6, 297–298.

Gale, D., & Shapley, L. S. (1962). College admissions and the stability of marriage. American Mathematical Monthly, 69, 9–14.

García, C. B., & Zangwill, W. I. (1981). Pathways to solutions, fixed points, and equilibria. Englewood Cliffs: Prentice-Hall.

Glicksberg, I. (1952). A further generalization of the Kakutani fixed point theorem with applications to Nash equilibrium. Proceedings of the American Mathematical Society, 3, 170–174.

Goldberg, P., Papadimitriou, C., & Savani, R. (2011). The complexity of the homotopy method, equilibrium selection, and Lemke-Howson solutions. In Proceedings of the 52nd Annual IEEE Symposium on the Foundations of Computer Science.

Górniewicz, L. (2006). Topological fixed point theory of multivalued mappings (2nd ed.). The Netherlands: Springer.

Govindan, S., & Wilson, R. (2008). Nash equilibrium, refinements of. In S. Durlauf & L. Blume (Eds.), The new Palgrave dictionary of economics (2nd ed.). New York: Palgrave Macmillan.

Guillemin, V., & Pollack, A. (1974). Differential topology. New York: Springer.

Hanner, O. (1951). Some theorems on absolute neighborhood retracts. Arkiv för Matematik, 1, 315–360.

Harsanyi, J. C. (1973). Oddness of the number of equilibrium points: A new proof. International Journal of Game Theory, 2, 235–250.

Hart, O., & Kuhn, H. (1975). A proof of the existence of equilibrium without the free disposal assumption. Journal of Mathematical Economics, 2, 335–343.

Hauk, E., & Hurkens, S. (2002). On forward induction and evolutionary and strategic stability. Journal of Economic Theory, 106, 66–90.


Hillas, J., Jansen, M., Potters, J., & Vermeulen, D. (2001). On the relations among some definitions of strategic stability. Mathematics of Operations Research, 26, 611–635.

Hirsch, M. (1976a). Differential topology. New York: Springer.

Hirsch, M., & Smale, S. (1974). Differential equations, dynamical systems, and linear algebra. Orlando: Academic Press.

Hirsch, M., Papadimitriou, C., & Vavasis, S. (1989). Exponential lower bounds for finding Brouwer fixed points. Journal of Complexity, 5, 379–416.

Hirsch, M. W. (1976b). Differential topology. Graduate Texts in Mathematics (p. 33). New York: Springer.

Hopf, H. (1928). A new proof of the Lefschetz formula on invariant points. Proceedings of the National Academy of Sciences of the United States of America, 14, 149–153.

Hopf, H. (1931). Über die Abbildungen der dreidimensionalen Sphäre auf die Kugelfläche. Mathematische Annalen, 104, 637–665.

Hu, S. T. (1965). Theory of retracts. Detroit: Wayne State University Press.

Hylland, A., & Zeckhauser, R. (1979). The efficient allocation of individuals to positions. Journal of Political Economy, 87, 293–314.

Jacobson, N. (1953). Lectures in abstract algebra. Princeton: D. Van Nostrand Inc.

Jehle, G. A., & Reny, P. J. (2011). Advanced microeconomic theory. New York: Prentice Hall.

Jiang, J. H. (1963). Essential component of the set of fixed points of the multivalued mappings and its application to the theory of games. Scientia Sinica, 12, 951–964.

Kakutani, S. (1941). A generalization of Brouwer’s fixed point theorem. Duke Mathematical Journal, 8, 457–459.

Kantorovich, L., & Akilov, G. (1982). Functional analysis (2nd ed.). New York: Pergamon Press.

Karmarkar, N. (1984). A new polynomial-time algorithm for linear programming. In Proceedings of the 16th ACM Symposium on Theory of Computing, ACM, New York, NY, USA, STOC ’84 (pp. 302–311).

Karp, R. M. (1972). Reducibility among combinatorial problems. In R. E. Miller & J. W. Thatcher (Eds.), Complexity of computer computations. New York: Plenum.

Kelley, J. (1955). General topology. New York: Springer.

Khachian, L. (1979). A polynomial algorithm in linear programming. Soviet Mathematics Doklady, 20, 191–194.

Kinoshita, S. (1952). On essential components of the set of fixed points. Osaka Mathematical Journal, 4, 19–22.

Kinoshita, S. (1953). On some contractible continua without the fixed point property. Fundamenta Mathematicae, 40, 96–98.

Klee, V., & Minty, G. (1972). How good is the simplex algorithm? In O. Shisha (Ed.), Inequalities III. New York: Academic Press.

Kodama, Y. (1956). Note on an absolute neighborhood extensor for metric spaces. Journal of the Mathematical Society of Japan, 8, 206–215.

Kohlberg, E., & Mertens, J. F. (1986). On the strategic stability of equilibria. Econometrica, 54, 1003–1038.

Kohlberg, E., & Reny, P. (1997). Independence on relative probability spaces and consistent assessments in game trees. Journal of Economic Theory, 75, 280–313.

Krasnosel’ski, M. A., & Zabreiko, P. P. (1984). Geometric methods of nonlinear analysis. Berlin: Springer.

Kreps, D., & Wilson, R. (1982). Sequential equilibrium. Econometrica, 50, 863–894.

Krishna, V. (2010). Auction theory (2nd ed.). London: Academic Press.

Kuhn, H., & MacKinnon, J. (1975). Sandwich method for finding fixed points. Journal of Optimization Theory and Applications, 17, 189–204.

Kuhn, H. W. (1960). Some combinatorial lemmas in topology. IBM Journal of Research and Development, 4, 508–524.

Kuhn, H. W. (1968). Simplicial approximation of fixed points. Proceedings of the National Academy of Sciences of the United States of America, 61, 1238–1242.


Kuratowski, K. (1935). Quelques problèmes concernant les espaces métriques non-séparables. Fundamenta Mathematicae, 25, 534–545.

Lee, J. M. (2013). Introduction to smooth manifolds (2nd ed.). New York: Springer.

Lefschetz, S. (1923). Continuous transformations of manifolds. Proceedings of the National Academy of Sciences of the United States of America, 9, 90–93.

Lefschetz, S. (1926). Intersections and transformations of complexes and manifolds. Transactions of the American Mathematical Society, 28, 1–49.

Lefschetz, S. (1927). Manifolds with a boundary and their transformations. Transactions of the American Mathematical Society, 29, 429–462.

Leray, J., & Schauder, J. (1934). Topologie et équations fonctionnelles. Annales scientifiques de l’École normale supérieure, 51, 45–78.

Lyapunov, A. (1992). The general problem of the stability of motion. London: Taylor and Francis.

Mantel, R. (1974). On the characterization of aggregate excess demand. Journal of Economic Theory, 7, 348–353.

Mas-Colell, A. (1974). A note on a theorem of F. Browder. Programs in Mathematics, 6, 229–233.

Mas-Colell, A., Whinston, M. D., & Green, J. R. (1995). Microeconomic theory. Oxford: Oxford University Press.

Mawhin, J. (1999). Leray-Schauder degree: A half century of extensions and applications. Topological Methods in Nonlinear Analysis, 14, 195–228.

McAdams, D. (2003). Isotone equilibrium in games of incomplete information. Econometrica, 71, 1191–1214.

McLennan, A. (1985). Justifiable beliefs in sequential equilibrium. Econometrica, 53, 889–904.

McLennan, A. (1989a). Consistent conditional systems in noncooperative game theory. International Journal of Game Theory, 18, 141–174.

McLennan, A. (1989b). The space of conditional systems is a ball. International Journal of Game Theory, 18, 125–139.

McLennan, A. (1991). Approximation of contractible valued correspondences by functions. Journal of Mathematical Economics, 20, 591–598.

McLennan, A. (2018). Efficient disposal equilibria of pseudomarkets, working paper, University of Queensland.

McLennan, A., & Tourky, R. (2005). From imitation games to Kakutani, unpublished.

McLennan, A., & Tourky, R. (2008). Using volume to prove Sperner’s lemma. Economic Theory, 35, 593–597.

McLennan, A., & Tourky, R. (2010). Imitation games and computation. Games and Economic Behavior, 70, 4–11.

McLennan, A., Monteiro, P. K., & Tourky, R. (2011). Games with discontinuous payoffs: A strengthening of Reny’s theorem. Econometrica, 79, 1643–1664.

Menezes, F. M., & Monteiro, P. K. (2005). An introduction to auction theory. Oxford: Oxford University Press.

Merrill, O. H. (1972). A summary of techniques for computing fixed points of continuous mappings. In R. H. Day & S. M. Robinson (Eds.), Mathematical topics in economic theory and computation. Philadelphia: Society for Industrial and Applied Mathematics.

Mertens, J. F. (1989). Stable equilibria-a reformulation, part I: Definition and basic properties. Mathematics of Operations Research, 14, 575–625.

Mertens, J. F. (1991). Stable equilibria-a reformulation, part II: Discussion of the definition and further results. Mathematics of Operations Research, 16, 694–753.

Michael, E. (1951). Topologies on spaces of subsets. Transactions of the American Mathematical Society, 71, 152–182.

Michael, E. (1956). Continuous selections. I. Annals of Mathematics, 2(63), 361–382.

Milgrom, P. (2004). Putting auction theory to work. Cambridge: Cambridge University Press.

Milgrom, P., & Roberts, J. (1982). A theory of auctions and competitive bidding. Econometrica, 50, 1089–1122.

Milgrom, P., & Shannon, C. (1994). Monotone comparative statics. Econometrica, 62, 157–180.


Milnor, J. (1965a). Topology from the differentiable viewpoint. Charlottesville: University Press of Virginia.

Milnor, J. (1965b). Topology from the differentiable viewpoint. Charlottesville: University Press of Virginia.

Monderer, D., Samet, D., & Shapley, L. S. (1992). Weighted values and the core. International Journal of Game Theory, 21, 27–39.

Morgan, F. (1988). Geometric measure theory: A beginner’s guide. New York: Academic Press.

Morse, A. (1939). The behavior of a function on its critical set. Annals of Mathematics, 40, 62–70.

Myerson, R. (1978). Refinements of the Nash equilibrium concept. International Journal of Game Theory, 7, 73–80.

Myerson, R. (1981). Optimal auction design. Mathematics of Operations Research, 6, 58–73.

Myerson, R. (1986). Multistage games with communication. Econometrica, 54, 323–358.

Myerson, R. (1991). Game theory: Analysis of conflict. Cambridge: Harvard University Press.

Nadzieja, T. (1990). Construction of a smooth Lyapunov function for an asymptotically stable set. Czechoslovak Mathematical Journal, 40, 195–199.

Nash, J. (1950). Non-cooperative games. PhD thesis, Mathematics Department, Princeton University.

Nash, J. (1951). Non-cooperative games. Annals of Mathematics, 54, 286–295.

Norton, D. E. (1995). The fundamental theorem of dynamical systems. Commentationes Mathematicae Universitatis Carolinae, 36, 585–597.

Nussbaum, R. D. (1974). On the uniqueness of the topological degree for k-set-contractions. Mathematische Zeitschrift, 137, 1–6.

O’Neill, B. (1953). Essential sets and fixed points. American Journal of Mathematics, 75, 497–509.

Osborne, M. J., & Rubinstein, A. (1994). A course in game theory. Cambridge: Cambridge University Press.

Oyama, D., Sandholm, W. H., & Tercieux, O. (2015). Sampling best response dynamics and deterministic equilibrium selection. Theoretical Economics, 10, 243–281.

Papadimitriou, C. H. (1994a). Computational complexity. New York: Addison Wesley Longman.

Papadimitriou, C. H. (1994b). On the complexity of the parity argument and other inefficient proofs of existence. Journal of Computer and System Sciences, 48, 498–532.

Petri, H., & Voorneveld, M. (2017). No bullying! A playful proof of Brouwer’s fixed-point theorem, working paper, Sweden School of Economics.

Reny, P. (1999). On the existence of pure and mixed strategy Nash equilibria in discontinuous games. Econometrica, 67(5), 1029–1056.

Reny, P. (2011). On the existence of monotone pure-strategy equilibria in Bayesian games. Econometrica, 79(2), 499–553.

Rényi, A. (1955). On a new axiomatic theory of probability. Acta Mathematica Hungarica, 6, 285–335.

Rényi, A. (1956). On conditional probability spaces generated by a dimensionally ordered set of measures. Theory of Probability and Its Applications, 1, 61–71.

Rényi, A. (1970). Foundations of probability. San Francisco: Holden-Day.

Repovš, D., & Semenov, P. V. (2014). Continuous selections of multivalued mappings. In K. Hart, J. van Mill, & P. Simon (Eds.), Recent progress in general topology III (pp. 711–749). Berlin: Springer.

Robinson, C. (1999). Dynamical systems: Stability, symbolic dynamics, and chaos (2nd ed.). Boca Raton: CRC Press.

Royden, H., & Fitzpatrick, P. (2010). Real analysis (4th ed.). Upper Saddle River: Prentice Hall.

Rubinstein, A. (1989). The electronic mail game: Strategic behavior under “almost common knowledge”. American Economic Review, 79, 385–391.

Rudin, M. E. (1969). A new proof that metric spaces are paracompact. Proceedings of the American Mathematical Society, 20, 603.

Samuelson, P. (1947). Foundations of economic analysis. Harvard University Press.


Samuelson, P. A. (1941). The stability of equilibrium: Comparative statics and dynamics. Econometrica, 9, 97–120.

Samuelson, P. A. (1942). The stability of equilibrium: Linear and nonlinear systems. Econometrica, 10, 1–25.

Sard, A. (1942). The measure of the critical values of differentiable maps. Bulletin of the American Mathematical Society, 48, 883–897.

Savani, R., & von Stengel, B. (2006). Hard-to-solve bimatrix games. Econometrica, 74, 397–429.

Scarf, H. E. (1967). The approximation of fixed points of a continuous mapping. SIAM Journal of Applied Mathematics, 15, 1328–1343.

Scarf, H. E., & Shapley, L. S. (1974). On cores and indivisibility. Journal of Mathematical Economics, 1, 23–37.

Schauder, J. (1930). Der Fixpunktsatz in Funktionalräumen. Studia Mathematica, 2, 171–180.

Selten, R. (1975). Re-examination of the perfectness concept for equilibrium points of extensive games. International Journal of Game Theory, 4, 25–55.

Shapley, L. S. (1974). A note on the Lemke-Howson algorithm. Mathematical Programming Study, 1, 175–189.

Shishikura, M. (1994). The boundary of the Mandelbrot set has Hausdorff dimension two. Astérisque, 7(222), 389–405.

Smale, S. (1965). An infinite dimensional version of Sard’s theorem. American Journal of Mathematics, 87, 861–866.

Solan, E. (2017). The modified stochastic game, working paper, School of Mathematical Sciences, Tel Aviv University.

Sonnenschein, H. (1973). Do Walras’ identity and continuity characterize the class of community excess demand functions? Journal of Economic Theory, 6, 345–354.

Spielman, D., & Teng, S. H. (2004). Smoothed analysis of algorithms: Why the simplex algorithm usually takes polynomial time. Journal of Association of Computing Machinery, 51, 385–463.

Spivak, M. (1965). Calculus on manifolds: A modern approach to classical theorems of advanced calculus. New York: Benjamin.

Spivak, M. (1979). A comprehensive introduction to differential geometry (Vol. 1, 2nd ed.). Publish or Perish.

Sternberg, S. (1983). Lectures on differential geometry (2nd ed.). New York: Chelsea Publishing Company.

Stone, A. H. (1948). Paracompactness and product spaces. Bulletin of the American Mathematical Society, 54, 977–982.

Tapp, K. (2016). Differential geometry of curves and surfaces. New York: Springer.

Tarski, A. (1955). A lattice theoretical fixed point theorem and its applications. Pacific Journal of Mathematics, 40, 285–309.

Tuy, H. (1979). Pivotal methods for computing equilibrium points: Unified approach and new restart algorithm. Programs in Mathematics, 16, 210–227.

Tuy, H., van Thoai, N., & Muu, L. D. (1978). A modification of Scarf’s algorithm allowing restarting. Math Operationsforsch Statist Ser Optimization, 9, 357–372.

Vieille, N. (1996). Conditional systems revisited. International Journal of Game Theory, 25, 207–217.

Vietoris, L. (1923). Bereiche Zweiter Ordnung. Monatshefte für Mathematik, 32, 258–280.

Whitehead, J. (1939). Simplicial spaces, nuclei, and m-groups. Proceedings London Mathematical Society, 45, 243–327.

Whitney, H. (1957). Elementary structure of real algebraic varieties. Annals of Mathematics, 66, 545–556.

Wilson, F. W. (1969). Smoothing derivatives of functions and applications. Transactions of the American Mathematical Society, 139, 413–428.

Wojdyslawski, M. (1939). Rétractes absolus et hyperspaces des continus. Fundamenta Mathematicae, 32, 184–192.


Wu, W. T., & Jiang, J. H. (1962). Essential equilibrium points of n-person non-cooperative games. Scientia Sinica, 5, 1307–1322.

Zhou, L. (1994). The set of Nash equilibria of a supermodular game is a complete lattice. Games and Economic Behavior, 7, 295–300.

Ziegler, G. M. (1995). Lectures on polytopes. New York: Springer.

Index

A
Absolute continuity, 301
Absolute extensor, 167
Absolute neighborhood extensor, 163
Absolute neighborhood retract, 19, 162
Absolute retract, 19, 167
Absorption laws, 372
Actions, 334, 395
Acyclic, 51
Additivity, 4
Adjoint, 296
Affiliated, 377
Affine
    combination, 33
    dependence, 33
    hull, 34
    independence, 34
    subspace, 34
Agent normal form, 361
Aggregate endowment, 51
Aggregate excess demand, 324
Alexander horned sphere, 191
Algorithm, 60, 90
Allocation, 51, 99
All pay auction, 411
Almost completely labelled, 64
Almost everywhere, 383
Almost surely, 384
Ambient space, 192
Annulus, 203
Antipodal function, 278
Antipodal map, 243
Antipodal points, 275
Antisymmetric, 371
Approach, 390
Approachable, 390
Approximate fixed point, 60
Arborescence, 334
Arrow, Kenneth, 10
Assessment, 337
    consistent, 338
    interior, 338
Asymptotically stable set, 312
Atlas, 190
Atomless, 384, 388
Attractive set, 312
Attractor, 329, 330
Axiom of choice, 56

B
Balanced set, 133
Banach space, 135
Barycenter, 46
Barycentric subdivision, 46
Base of a topology, 106
Basin of attraction, 329
Bayes-consistent, 365
Behavior strategies, 336
Behavior strategy profiles, 336
Belief profiles, 337
Best reply, 396
Best response, 322
Best response correspondence, 151, 322
Bilinear function, 73
Bing, R. H., 271
Bipartite graph, 53
Birkhoff polytope, 52
Birkhoff-von Neumann theorem, 52
Bistochastic matrix, 52
Boolean circuit, 94
Boolean formula, 101
Border, Kim, vi
Borel σ-algebra, 381


Borel sets, 381Borsuk, Karol, 155Borsuk–Ulam theorem, 280Bounded operator, 141Bounding hyperplane, 35Brouwer, Luitzen, 10Brown, Robert, viiBubble sort, 96

CCanonical cover, 401Canonical extension, 300Category, 20, 195Cauchy-Schwartz inequality, 35, 136Cauchy sequence, 135Certificate, 91Chain, 57, 329, 347Chain recurrent, 329, 330Chain transitive component, 329Characteristic polynomial, 249Church–Turing thesis, 90Circled set, 133Classification, 209Closed convex hull, 140Closed function, 114Closed partial order, 379Closed star, 49, 53, 346Coarsely consistent, 351Coarse order, 351Codimension, 34, 195Common prior, 335Commutativity, 7Compact, 4Compact-open topology, 125Complete, 379Complete invariant, 269Complete lattice, 372Complete Lyapunov function, 330Completely labelled simplex, 64Completely metrizable, 162Complete metric space, 135Complete ordering, 57Complete subdivision, 46Component of a graph, 51Computational problem, 90

    complete for a class, 92
    computable, 90
    decision, 90
    search, 90
Conditional expectation, 397
Conditional system, 339
Cone
    polyhedral, 347
Conical decomposition, 347
Connected
    graph, 51
    space, 18, 149
Consistency, 352
Consistent conditional systems, 357
Constant rank theorem, 186
Constructivism, 59
Continuity, 8
Continuous, 118
Continuous convergence, 128
Contractible, 156
Contraction, 156
Converse Lyapunov theorem, 315
Convex, 35, 395
    combination, 35
    cone, 38
    hull, 36
Coordinate chart, 190
Coordinatewise partial ordering, 57
Coordination game, 264
Core, 99
Correspondence, 118
    closed valued, 118
    convex valued, 118
    graph of, 118
Correspondence principle, 290
Covariant functor, 20
Critical point, 197, 217
Critical value, 197, 217
Curvature, 295
CW topology, 48
Cycle, 51, 295

D
Debreu-Gale-Kuhn-Nikaido lemma, 212
Debreu, Gerard, 10
Deferred acceptance algorithm, 99
Degree, 51, 241
Degree admissible
    function, 237
    homotopy, 238
Dehn, Max, 209
Derivative, 184, 193, 194
Derived, 46, 47
Descartes, René, 44
Diameter, 46
Diffeoconvex body, 289, 296
Diffeomorphism, 192
Diffeomorphism point, 195
Differentiation along a vector field, 312

Dimension, 48
    of a polyhedral complex, 44
    of a polyhedron, 39
    of an affine subspace, 34
Directed graph, 94
Discrete set, 191
Domain of attraction, 312
Domination, 169
Dual, 37
Dual of a polytope, 52
Dual space, 140
Dugundji, James, vi, 138
Dynkin system, 381

E
Edge, 40, 51
Eilenberg–Montgomery theorem, 271
Electronic mail game, 264
Embedding, 190, 204
Endpoint, 51
Equiconnecting function, 164, 407
    stability for, 393
Equilibrium, 295
Equilibrium of a vector field, 304
Essential
    fixed point, 16
    Nash equilibrium, 18
    set of fixed points, 16, 148
    set of Nash equilibria, 18
Essential Nash equilibrium, 151
Euclidean neighborhood retract, 160
Euler characteristic, 270
Exact sequence, 288
Expected payoff, 73, 338
Expected utility, 321
Extraneous solution, 80
Extremal set, 140
Extreme point, 42, 140

F
Face, 39
    proper, 40
Facet, 40
Family of sets
    locally finite, 129
    refinement of, 129
Federer, Herbert, 213
Fermat’s last theorem, 209
Fiber, 199
Fiber bundle, 210
Finely consistent, 351
Fine order, 351
Finite, 384
Finite trajectory, 291, 294
First order stochastically dominates, 378
Fixed point, 9
Flow, 330
Flow domain, 294
Fort, M. K., 143
Forward flow, 311
Forward flow domain, 303
Forward invariant, 312
Forward precompact, 313
Four color theorem, 209
Frame, 228
Freedman, Michael, 209
Fubini’s theorem, 214
Fully stable set, 152
Functor, 195
Fundamental group, 287
Fundamental neighborhood, 312

G
General linear group, 232
Generic, 225
Germ, 250, 259
    composition, 251
    domain, 250, 259
    index admissible, 251, 259

Gram–Schmidt process, 228
Granas, Andrzej, vi
Graph, 50
Grassmann manifold, 228
Greatest lower bound, 371
Group, 71

H
Half-space, 35
Hall’s marriage theorem, 54
Harsanyi doctrine, 335
Hauptvermutung, 271
Hausdorff distance, 109
Hausdorff space, 106
Hawaiian earring, 49, 161
Hazard rate, 413
Heegaard, Poul, 209
Hilbert cube, 138
Hilbert space, 136
Homology, 247, 271
Homotopy, 87–88
    extension property, 168, 273
    invariant, 272

Homotopy groups, 287
Homotopy invariant, 156
    complete, 171
Hopf fibration, 210
Hopf, Heinz, 271
Hopf’s theorem, 272–273
Hyperplane, 34

I
Idempotent laws, 372
Identity component, 232
Immediate consequence, 335
Immediate predecessor, 334
Immersion, 197, 204
Immersion point, 195
Implicit function theorem, 185
Incentive compatible, 414
Increasing differences, 373
Indegree, 94
Index, 4, 326
Index admissible, 306, 309
    correspondence, 258
    function, 247
index +1 principle, 326
Individual rationality, 413
Inessential fixed point, 16
Infinitely more probable, 340
Information sets, 334
Initial assessment, 335
Initial point, 40
Inner product, 34, 135
Inner product space, 136
Integer linear program, 101
Integrable function, 384
Integral, 384
Interim best response correspondence, 399
Interim expected payoff, 398
Interior probability measures, 335
Intuitionism, 59
Intuitive criterion, 367
Invariance of domain, 282
Inverse function theorem, 185
Inward pointing, 235, 300

J
Join, 372
Join-closed, 389
Justifiable equilibrium, 364

K
Kinoshita, Shin’ichi, 143, 155

L
Lattice, 372
Least upper bound, 371
Lebesgue measure, 213
Lefschetz fixed point theorem, 271
Lefschetz number, 271
Lefschetz, Solomon, 271
Lemke–Howson algorithm, 73–85, 92
Lexicographic probability system, 339
Lineality space, 37
Linear complementarity problem, 81
Linear interpolation, 49
Linear programming, 51
Linear transformation
    orientation preserving, 232
    orientation reversing, 232
Link, 53
Lipschitz, 292
Local diffeomorphism, 197
Local property, 165
Locally closed set, 160
Locally compact, 116
Locally complete, 380
Locally contractible space, 164
Locally equiconnected metric space, 164, 407
Locally finite
    simplicial complex, 346
Locally Lipschitz, 292
Locally nonsatiated, 51
Locally path connected space, 164, 203
Logarithmic relative probability, 343
Lower bound, 371
Lower hemicontinuous, 118
Lyapunov function, 313
Lyapunov stability, 312
Lyapunov theorem, 313

M
Manifold
    orientable, 234
    smooth, 190
    unorientable, 234
Manifold with boundary, 204
Mas-Colell, Andreu, 173
Mass transportation, 414
Match, 53
Matching, 411
Maximal, 51
Measurable, 382
Measurable space, 382
Measure, 383

Measure space, 383
    complete, 383
Measure theory, 213
Measure zero, 22, 214, 220
Mechanism, 412
Meet, 372
Merge algorithm, 96
Mesh, 46
Metric semilattice, 379
Milnor, John, 213, 271
Minkowski sum, 37, 147
Mixed strategy, 151, 321, 336
Mixed strategy profile, 151
Möbius, August Ferdinand, 209
Moise, Edwin E., 271
Monotone, 374
Monotone function, 389
Morse-Sard theorem, 221
Multiplication, 7

N
Nash dynamics, 323
Nash equilibrium, 151, 321, 322, 360, 396
    accessible, 77
    mixed, 74
    pure, 73
    refinements of, 18
    regular, 211
    strict, 211
    totally mixed, 211
Nash, John, 10
Negatively oriented relative to P, 236
Neighborhood retract, 159
Neighbors, 51
Nerve of an open cover, 169
Node, 334
    initial, 334
    noninitial, 334
    nonterminal, 334
    terminal, 334
Nondegenerate game, 74
Norm, 34, 134
Normal bundle, 199
Normalization, 4
Normal space, 106
Normal subcomplex, 346
Normal vector, 35
Normed space, 135
Not outward pointing, 300

O
1 principle, 326
Open star, 49, 53, 346
Operator norm, 140, 215, 293
Oracle, 91
Order complex, 347
Order intervals, 374
Order of differentiability, 184
Orientable, 234
Orientation, 6, 227, 231–237
    induced, 234
Orientation preserving, 63, 236
Orientation reversing, 63, 236
Orientation reversing loop, 234
Oriented intersection number, 236
Oriented manifold, 235
Oriented vector space, 232
Orthogonal complement, 230
Orthonormal, 228
Outdegree, 94
Outward pointing, 235

P
Paracompact space, 129
Parameterization, 190, 204
Pareto efficiency, 51
Parity game, 100
Partially ordered probability space, 387
Partial order, 56
Partial realization, 400
Partition of unity, 130, 187
Path, 51, 366, 370
Payoff consistent selection dynamics, 323
Payoff function, 73, 335
Perelman, Grigori, 209
Perfect equilibrium, 152, 360
Perfect recall, 336
Permutahedron, 52, 362
Permutation, 71
    even, 71
    odd, 71
Permutation matrix, 52, 71
Personal decision tree, 369
Personal history, 336
Picard–Lindelöf theorem, 292, 294, 303
Piecewise-closed, 389
Pivoting, 83
Play, 100
Poincaré conjecture, 209
Poincaré, Henri, 209
Pointed cone, 39, 348
Pointed map, 149
Pointed space, 149
Pointwise-limit-closed, 389

Polar of a polytope, 52
Polyhedral complex, 44
Polyhedral complex: finite, 44
Polyhedral complex: locally finite, 44
Polyhedral cone, 43
Polyhedral subdivision, 44
Polyhedron, 39
    minimal representation of, 42
    standard representation of, 40
Polytopal complex, 45
Polytope, 42
    simple, 53, 80
    simplicial, 53
Poset, 347
Positively oriented relative to P, 236
Potential, 414
Precedence, 334
Predecessors, 334
Predictor-corrector method, 87
Prime factorization, 93
Primitive set, 68
    completely labelled, 69
Primitive simplex, 68
Probability density function, 375
Probability measure, 384
Probability space, 384
Product σ-algebra, 383
Product topology, 115
Projection, 229
Projective space, 243
Proper equilibrium, 153, 362
Pseudometric, 390
Pure behavior strategy profiles, 336
Pure strategy, 151, 336, 395
Pure strategy profile, 151

Q
Quasisupermodular, 373
Quotient topology, 121

R
Rado, Tibor, 209, 271
Rational expectations, 321
Rational polynomial, 244
Realization, 400
Recession cone, 37
Reduction, 92
Refinement, 351
Refines, 351
Regular conditional probability, 398
Regular economy, 212
Regular point, 197
Regular space, 106
Regular subdivision, 98
Regular value, 5, 197
Relative probability, 342
Repeller, 329, 330
Restart, 67
Retract, 159
Retraction, 159
Revelation principle, 412
Revenue equivalence theorem, 413
Riemann sphere, 243
Ritzberger, Klaus, vii
Robust set, 149
    minimal, 150
    minimal connected, 150
Rubinstein, A., 264
Rural hospital theorem, 100

S
Sandholm, Bill, vii
Sandwich method, 67
Sard’s theorem, 248, 250
Second price auction, 411
Section, 199
Selection, 141, 230
Semilattice, 379
Semimonotone, 399
Separable, 388
Separable metric space, 136
Separation axioms, 106
Sequential equilibrium, 333, 338
Sequentially compact, 58
Sequential rationality, 338
Shmaya, Eran, vii
Signalling game, 332, 364
Simple polytope, 53, 80
Simplex, 45, 47
Simplex algorithm, 97
Simplicial, 49
Simplicial complex, 45
    canonical realization, 48
    combinatoric, 47
Simply connected, 209
Single crossing property, 373
Single index space, 261
Sink, 94
Skeleton, 44, 48
Slack variables, 79
Slice, 386
Slice of a set, 216
Smale, Stephen, 209

Smooth, 220
Solan, Eilon, vii
Source, 94
Sperner labelling, 62
Stable, 411
Stable matching, 99
Stable set, 153, 366
Standard extension, 306
Star refinement, 401
Starshaped, 156
Stationary homotopy, 164
Steinitz, Ernst, 271
Step size, 88
Sternberg, Shlomo, 213
Strategic form game, 151, 322
Strategy
    mixed, 73, 322
    pure, 73, 322
Strategy profile, 322
    mixed, 73
    pure, 73
Strict refinement, 351
Strong deformation retract, 344
Strong deformation retraction, 344
Strong set order, 128
Strong set ordering, 372
Strong topology, 125
Strong upper topology, 119
Subbase of a topology, 106
Subbundle, 199
Subcomplex, 44, 47, 48
Subdivision, 45
Sublattice, 373
Submanifold, 195
    neat, 206
Submersion, 197
Submersion point, 195
Subsemilattice, 379
Support, 73, 211
Symmetric group, 71
Systems of beliefs, 337

T
Tableau, 83
Tangent bundle, 20, 193
Tangent space, 193
Tarski fixed point theorem, 375
Tatonnement, 325
Tietze, Heinrich, 271
Topological space, 106
Topological vector space, 132
    locally convex, 134
Topology
    jointly continuous, 128
    pointwise convergence, 127
    uniform convergence, 127
    uniform convergence on compacta, 127
Top trading cycle, 99
Totally bounded, 59
Totally mixed strategy, 151
Total subspace, 140
Trajectory, 291
Translation invariant topology, 133
Transposition, 71
Transversal, 198, 206, 221
Tree, 51
Tremble, 360
Triangulation, 45
Tubular neighborhood theorem, 199
Turing machine, 89
Two person game, 73
Types, 395

U
Uniformly asymptotically stable set, 312
Uniformly attractive set, 312
Uniformly Cauchy, 141
Uniformly locally contractible space, 164
Unorientable, 234
Unused message, 365
Upper bound, 371
Upper contour set, 51
Urysohn’s lemma, 131
Useless action, 362

V
Van Dyke, Walther, 209
Vector bundle, 199
Vector field, 222
    index admissible, 304
Vector field along γ, 234
Vector field: complete, 295
Vector field correspondence, 309
Vector field: forward complete, 303
Vector field homotopy, 305
    index admissible, 305
Vertex, 40, 47, 50
    connected, 51
Vickrey auction, 411
Vietoris, Leopold, 105
Vietoris topology, 107
Virtual valuation, 413
Voronoi diagram, 44

W
Walk, 51
Wallace, Neil, vii
Walrasian equilibrium, 211, 324, 328
    regular, 212
Walras’ law, 325
Weakly dominated strategy, 153
Weakly quasisupermodular, 399
Weak single crossing, 399
Weak single crossing property, 373
Weak topology, 125
Weak∗ topology, 140
Weak upper topology, 121
Well ordering, 58
Well ordering theorem, 58
Whitney embedding theorems, 192
Wild embedding, 191
Winding number, 156
Witness, 91

Z
Zero section, 199, 222
Zero sum game, 101
