
SET THEORY (MTH-3E22)(SPRING 2015)

DAVID ASPERÓ

Contents

1. Introduction
1.1. Some elementary facts about sets
1.2. You may want to read this subsection again later
2. The axiomatic method: A crash course in first order logic
3. Axiomatic set theory: ZFC
3.1. The axioms
3.2. ZFC vs. PA
3.3. The consistency question
4. Ordinals
5. Cardinals
5.1. The Cantor–Bernstein–Schröder Theorem
5.2. More on cardinals
5.3. Countable and uncountable sets
5.4. Almost disjoint families
6. Foundation, recursion and induction. The cumulative hierarchy
7. Inner models and relativization
7.1. A relative consistency proof: Con(ZF \ {Foundation}) implies Con(ZF)
7.2. Basics of ordinal arithmetic
8. The Axiom of Choice
9. Basics of cardinal arithmetic
10. Filters and ultrafilters
10.1. Clubs and stationary sets
11. Infinite Ramsey theory
12. Some countable combinatorics
12.1. Countable linear orders
12.2. Countable graphs

1. Introduction

Set theory plays a dual role.

(1) It provides a foundation for mathematics.
(2) It is itself a branch of mathematics, with non-trivial – and sometimes surprising – consequences at the level of the properties of (infinite) sets, and with deep applications to other areas of mathematics (we will see examples of both).

Reducing everything to sets: Set theory was developed / discovered / instigated by Georg Cantor in the second half of the 19th century, as a result of his investigations of trigonometric series rather than out of foundational considerations.¹ However, set theory would soon become the prevalent foundation of mathematics. In fact, it was born at a time when mathematicians saw the need to define things carefully (i.e., define the object of their study in a mathematical language referring to reasonably ‘simple’ and well–understood entities) and set theory provided the means to do exactly that.²

Example: What is a differentiable function? What is a continuous function? What is a function? What is a real number? What is a natural number?

Well, we can start approaching the above questions in the following way: We can define a relation to be simply a set of ordered pairs $(a, b)$. And we can define a function $f$ to be a functional relation (i.e., a relation $f$ such that $(a, b), (a, b') \in f$ implies $b = b'$).

¹ Cantor was certainly not the first to come across the notion of set, it being indeed an extremely natural notion both within and outside of mathematics, but he was probably the first to carry out a systematic study of the notion of abstract set (in very much the same way that group-theorists study the notion of abstract group) and the first to prove non-trivial things about infinite sets in general.

² Most comments of historical flavour in these notes, like this one, are meant to be read with caution. History is of course often a messy affair, and the history of mathematics is no exception. Fortunately this is not a course on the history of mathematics.

What is an ordered pair $(a, b)$? Well, note that

(i) it should have a ‘left component’ and a ‘right component’, and that
(ii) it should be completely determined by its left component and its right component together

(in particular, whatever an ordered pair is, it has to be such that if $a \neq b$, then $(a, b) \neq (b, a)$). Now, given $a$, $b$, we can define

$(a, b) = \{\{a\}, \{a, b\}\}$

(this definition is due to Kuratowski).

Fact 1.1. Given ordered pairs $(a, b)$, $(a', b')$, the following are equivalent.

(1) $(a, b) = (a', b')$
(2) $a = a'$ and $b = b'$.

Proof. It is clear that (2) implies (1), so we just need to prove that (1) implies (2). Suppose $(a, b) = (a', b')$.

Case 1: $a = b$. In that case, $(a, b) = \{\{a\}, \{a, b\}\} = \{\{a\}\} = (a', b') = \{\{a'\}, \{a', b'\}\}$, which implies that $\{\{a'\}, \{a', b'\}\}$ has exactly one element, i.e., $\{a'\} = \{a', b'\}$, and that $\{a'\} = \{a\}$. But then $b' = a' = a = b$.

Case 2: $a \neq b$. In that case, $\{a\} \neq \{a, b\}$, and $(a, b) = \{\{a\}, \{a, b\}\}$ has two elements, one of them with exactly one element (namely $a$) and the other one with exactly two elements (namely $a$ and $b$). Hence the same holds for $\{\{a'\}, \{a', b'\}\}$. $\{a'\}$ has exactly one element, and therefore this set is equal to the unique member of $\{\{a\}, \{a, b\}\}$ with exactly one element, which means that $\{a'\} = \{a\}$, and therefore that $a' = a$. Similarly, the two–membered member of $\{\{a'\}, \{a', b'\}\}$, namely $\{a', b'\}$, has to be equal to the unique two–membered member of $\{\{a\}, \{a, b\}\}$, namely $\{a, b\}$, and since $a' = a$ and $b \neq a$, this implies that $b' = b$. $\square$

So, Fact 1.1 shows precisely that the above definition does the intended job; it satisfies both of (i) and (ii), and so it is a satisfactory definition of the (informal) notion of ordered pair in terms only of sets and the membership ($\in$) relation.
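As an informal illustration (not part of the formal development), Kuratowski's definition can be tried out in Python, with `frozenset` standing in for sets. The function names `kpair` and `components` are hypothetical, chosen only for this sketch; `components` follows the two cases of the proof of Fact 1.1.

```python
def kpair(a, b):
    """Kuratowski ordered pair (a, b) = {{a}, {a, b}} as nested frozensets."""
    return frozenset({frozenset({a}), frozenset({a, b})})


def components(p):
    """Recover (a, b) from a Kuratowski pair, following the proof of Fact 1.1."""
    if len(p) == 1:
        # Case a = b: the pair collapses to {{a}}.
        (only,) = p
        (a,) = only
        return a, a
    # Case a != b: one member has one element, the other has two.
    small, big = sorted(p, key=len)
    (a,) = small
    (b,) = big - small
    return a, b


# The order of the components matters, as required by (ii).
assert kpair(1, 2) != kpair(2, 1)
# When a = b the pair collapses to {{a}}.
assert kpair(1, 1) == frozenset({frozenset({1})})
# Both components can be read off from the set alone.
assert components(kpair(1, 2)) == (1, 2)
assert components(kpair(3, 3)) == (3, 3)
```

The point of `components` is that nothing beyond membership and equality of sets is used to recover the left and right components.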

Similarly, we can define tuples of any finite length recursively: $(a_0, \ldots, a_n, a_{n+1}) = ((a_0, \ldots, a_n), a_{n+1})$.

So we can successfully define the notion of function from the notion of set (and the membership relation $\in$, of course). And the notion of set is presumably easier to grasp than the notion of function.

What about natural numbers, integers, rationals, reals and so on?

We can define $0 = \emptyset$ (the empty set, the unique set with no elements). The set $\emptyset$ has 0 members.

We can define $1 = \{0\} = \{\emptyset\}$. The set $\{\emptyset\}$ has 1 member.

We can define $2 = \{0, 1\} = \{\emptyset, \{\emptyset\}\}$. The set $\{\emptyset, \{\emptyset\}\}$ has 2 members.

In general, we can define $n + 1 = n \cup \{n\}$. With this definition $n$ is a set with exactly $n$ many members, namely all natural numbers $m$ such that $m < n$.
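The first few natural numbers defined this way can be built explicitly. The following is a small sketch in Python, again using `frozenset` for sets; the names `successor` and `von_neumann` are hypothetical helpers introduced only here.

```python
def successor(n):
    """Return n + 1 = n ∪ {n}."""
    return n | frozenset({n})


def von_neumann(k):
    """Return the set representing the natural number k."""
    n = frozenset()  # 0 is the empty set
    for _ in range(k):
        n = successor(n)
    return n


three = von_neumann(3)
# The set representing 3 has exactly 3 members ...
assert len(three) == 3
# ... namely the sets representing 0, 1 and 2.
assert three == frozenset({von_neumann(0), von_neumann(1), von_neumann(2)})
# m < n corresponds to membership of m in n.
assert von_neumann(2) in three
assert von_neumann(3) not in three
```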

With this definition, each natural number $n$ is an ordinal $\beta$ which is either $\emptyset$ or of the form $\alpha \cup \{\alpha\}$ for some ordinal $\alpha$ and all of whose members are either the empty set or of the form $\alpha \cup \{\alpha\}$ for some ordinal $\alpha$, and every ordinal which is either $\emptyset$ or of the form $\alpha \cup \{\alpha\}$ and all of whose members are either the empty set or of the form $\alpha \cup \{\alpha\}$ for some ordinal $\alpha$ is a natural number (the notion of ordinal, which we will see later on, is defined only in terms of sets and the membership relation).

What is nice about this is that it gives a definition of the set $\mathbb{N}$ of natural numbers involving only the notion of set (and the membership relation):

$\mathbb{N}$ is the set of all those ordinals $\alpha$ such that $\alpha$ is either the empty set or of the form $\beta \cup \{\beta\}$ for some ordinal $\beta$ and such that each of its members is either the empty set or of the form $\beta \cup \{\beta\}$ for some ordinal $\beta$.

In particular, we may want to say that a set $x$ is finite iff there is a bijection between $x$ and some member of $\mathbb{N}$. By the above definition of $\mathbb{N}$ we thus have a definition of finiteness purely in terms of sets and the membership relation.

$+$ and $\cdot$ on $\mathbb{N}$ can be defined also in a satisfactory way using the notion of set. Then we can define $\mathbb{Z}$ in the usual way as the set of equivalence classes of the equivalence relation $\sim$ on $\mathbb{N} \times \mathbb{N}$ defined by $(a, b) \sim (a', b')$ if and only if $a + b' = a' + b$, and can define operations of addition and multiplication on $\mathbb{Z}$ also in a natural way. Then we can define also $\mathbb{Q}$, together with the natural operations on $\mathbb{Q}$, from $\mathbb{Z}$ and the operations on $\mathbb{Z}$, in the usual way. Specifically, $\mathbb{Q}$ can be defined as $\{(a, b) \in \mathbb{Z} \times \mathbb{Z} : b \neq 0\}/\sim$, where now $(a, b) \sim (a', b')$ iff $a \cdot b' = a' \cdot b$, where $\cdot$ denotes multiplication in $\mathbb{Z}$, and the operations of addition and multiplication on $\mathbb{Q}$ can also be defined in a natural way. We can define $\mathbb{R}$ as the set of equivalence classes of the equivalence relation $\sim$ on the set of Cauchy sequences $f : \mathbb{N} \to \mathbb{Q}$ where $f \sim g$ if and only if $\lim_{n \to \infty} h(n) = 0$, where $h(n) = f(n) - g(n)$ (Weierstrass’ construction) or, equivalently,³ as the set of Dedekind cuts of rationals (Dedekind’s construction).⁴ And so on.⁵
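To make the first of these constructions concrete, here is a minimal Python sketch of $\mathbb{Z}$ as pairs of natural numbers, where the pair $(a, b)$ is thought of as standing for $a - b$. The function names are hypothetical, and picking a canonical representative instead of working with whole equivalence classes is a simplification made only for this sketch.

```python
def equivalent(p, q):
    """(a, b) ~ (a', b') iff a + b' = a' + b, using only addition on N."""
    (a, b), (a2, b2) = p, q
    return a + b2 == a2 + b


def canonical(p):
    """Pick the representative of the class of (a, b) with a zero component."""
    a, b = p
    m = min(a, b)
    return (a - m, b - m)


def add(p, q):
    """(a, b) + (c, d) = (a + c, b + d)."""
    return canonical((p[0] + q[0], p[1] + q[1]))


def mul(p, q):
    """(a, b) * (c, d) = (ac + bd, ad + bc)."""
    (a, b), (c, d) = p, q
    return canonical((a * c + b * d, a * d + b * c))


minus_three = (2, 5)   # stands for 2 - 5
three = (4, 1)         # stands for 4 - 1
assert equivalent(minus_three, (0, 3))
assert add(minus_three, three) == (0, 0)      # -3 + 3 = 0
assert mul(minus_three, (0, 2)) == (6, 0)     # (-3) * (-2) = 6
```

The definitions of `add` and `mul` respect $\sim$: replacing an argument by an equivalent pair gives an equivalent result, which is what makes the operations well defined on equivalence classes.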

All these constructions involve only notions previously defined together with the notion of set and the membership relation. So they ultimately involve only the notion of set and the membership relation.

If there is nothing fishy with the notion of set and the operations we have used to build more complicated sets out of simpler ones, then there cannot be anything fishy with these higher level objects.

Similarly: We feel confident with the existence of $\mathbb{C}$ (which, by the way, contains “imaginary numbers” like $i$) once we become confident with the existence of $\mathbb{R}$ and know how to build $\mathbb{C}$ from $\mathbb{R}$ in a very simple set–theoretic way.

Also: We can derive everything we know about the higher level objects (like, say, the fact that $\pi$ is transcendental, or the fact that the determinant of the product of two square matrices with real entries is the product of the determinants of the two matrices) from facts about sets. This is true since the higher level objects (like $\pi$ and all other relevant objects, and so on) are ultimately definable only in terms of sets and the membership relation. For the same reason, every question we might be interested in as mathematicians (is $e + \pi$ transcendental?, Goldbach’s conjecture, ...) is ultimately reducible to a question about properties of sets.

Hopefully we can get to know the elementary facts about sets from which – again, hopefully – we will be able to derive (using logic) all other relevant facts about sets that will enable us to answer all interesting questions. If we can do this, we will have managed to completely reduce mathematics to considerations about sets and their (elementary) properties (plus logic).

So let’s get started.

³ In the sense that there is an isomorphism between the structures resulting from these two constructions.

⁴ A Dedekind cut of rationals is a bounded nonempty initial segment of rationals, with the usual ordering $<$ on the rationals, without a maximum (where $<$ can also be defined in a natural way from previously defined objects). For example, $\sqrt{2} = \{q \in \mathbb{Q} : q < 0\} \cup \{q \in \mathbb{Q} : 0 \leq q,\ q^2 < 2\}$.

⁵ All these constructions (and the corresponding proofs) are standard and can be easily found in the literature.


1.1. Some elementary facts about sets. Given sets $A$, $B$, we say that $A$ is of cardinality at most that of $B$, and write

$|A| \leq |B|$,

if there is an injective (or one–to–one) function $f : A \to B$ (remember, a function is a special kind of set!).

We say that $A$ and $B$ have the same cardinality, and write

$|A| = |B|$,

if and only if there is a bijection $f : A \to B$.

We say that $A$ has cardinality strictly less than $B$, and write

$|A| < |B|$,

if and only if there is an injective function $f : A \to B$ but there is no bijection $f : A \to B$.

Clearly $|A| \leq |B|$ and $|B| \leq |C|$ together imply $|A| \leq |C|$. Also, it is true, but not a trivial fact, that $|A| = |B|$ holds if and only if both $|A| \leq |B|$ and $|B| \leq |A|$ hold (Cantor–Bernstein–Schröder theorem; we will see a proof of this theorem later on).

The notion of cardinality captures the notion of “size” of a set. (Example: $|5| < |6|$).

Notation 1.2. Given a set $X$, $\mathcal{P}(X)$ is the set of all sets $Y$ such that $Y \subseteq X$. ($\mathcal{P}(X)$ is the power set of $X$).

The following theorem arguably marks the beginning of set theory.

Theorem 1.3. (Cantor, December 1873)

(1) $|\mathbb{N}| < |\{f \mid f : \mathbb{N} \to \{0, 1\}\}|$
(2) Given any set $X$, $|X| < |\mathcal{P}(X)|$.

Proof. (1) Let ${}^{\mathbb{N}}2 = \{f \mid f : \mathbb{N} \to \{0, 1\}\}$ (this notation is actually standard and we will use it again later on). There is clearly an injection $\psi : \mathbb{N} \to {}^{\mathbb{N}}2$. For example we can let $\psi(n) = f_n$, where $f_n : \mathbb{N} \to 2 = \{0, 1\}$ is given by $f_n(m) = 1$ iff $m = n$.

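To see that, on the other hand, there is no bijection between $\mathbb{N}$ and ${}^{\mathbb{N}}2$, suppose $\phi : \mathbb{N} \to {}^{\mathbb{N}}2$ is a function. For each $n$ let $\phi(n) = (a^n_i)_{i \in \mathbb{N}}$. Now we consider the $\mathbb{N} \times \mathbb{N}$ matrix whose first row is $\phi(0) = (a^0_i)_{i \in \mathbb{N}}$, whose second row is $\phi(1) = (a^1_i)_{i \in \mathbb{N}}$, whose third row is $\phi(2) = (a^2_i)_{i \in \mathbb{N}}$, and so on. Now we define a sequence $(b_n)_{n \in \mathbb{N}}$ of 0’s and 1’s by looking at the diagonal of this matrix and making sure that, for each $n$, $b_n$ is different from the $n$-th entry in the diagonal; in other words, $b_n = 1 - a^n_n$. Then $(b_i)_{i \in \mathbb{N}} \in {}^{\mathbb{N}}2$ and yet $(b_i)_{i \in \mathbb{N}} \notin \mathrm{range}(\phi)$. Indeed, for every $n$, $(b_i)_{i \in \mathbb{N}} \neq \phi(n)$ since the $n$-th entry of $(b_i)_{i \in \mathbb{N}}$, namely $b_n$, is different from the $n$-th entry of $\phi(n)$, namely $a^n_n$. We have seen that $\phi$ is not onto ${}^{\mathbb{N}}2$. In particular this means that there is no bijection $\phi : \mathbb{N} \to {}^{\mathbb{N}}2$.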

(2) There is clearly an injection $f : X \to \mathcal{P}(X)$; for example, take $f$ to be the function sending $x$ to the singleton of $x$, i.e., to $\{x\}$.

Now suppose $f : X \to \mathcal{P}(X)$ is a function. Let us see that $f$ cannot be a surjection: Let

$Y = \{a \in X : a \notin f(a)\}$.

Then $Y \in \mathcal{P}(X)$. But if $a \in X$ is such that $f(a) = Y$, then $a \in Y$ if and only if $a \notin f(a) = Y$. This is a logical impossibility, so there is no such $a$. $\square$
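For a small finite set $X$, the argument for part (2) can be checked exhaustively. The Python sketch below (the names `power_set` and `diagonal_set` are hypothetical, introduced only here) runs through every function $f : X \to \mathcal{P}(X)$ for a three-element $X$ and confirms that the set $Y$ from the proof is never in the range of $f$. This is only a sanity check on one finite case, not a substitute for the proof.

```python
from itertools import chain, combinations, product


def power_set(xs):
    """Return the list of all subsets of xs, each as a frozenset."""
    xs = list(xs)
    return [
        frozenset(c)
        for c in chain.from_iterable(
            combinations(xs, r) for r in range(len(xs) + 1)
        )
    ]


def diagonal_set(X, f):
    """Y = {a in X : a not in f(a)}, as in the proof of Theorem 1.3 (2)."""
    return frozenset(a for a in X if a not in f[a])


X = [0, 1, 2]
subsets = power_set(X)

# Enumerate every function f : X -> P(X) as a dict and test it.
for values in product(subsets, repeat=len(X)):
    f = dict(zip(X, values))
    Y = diagonal_set(X, f)
    assert Y not in f.values()  # f misses Y, so f is not a surjection
```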

Part (2) of this theorem immediately yields that not all infinite sets are of the same size, and in fact there is a whole hierarchy of infinities! (which was not known at the time):⁶

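$|\mathbb{N}| < |\mathcal{P}(\mathbb{N})| < |\mathcal{P}(\mathcal{P}(\mathbb{N}))| = |\mathcal{P}^2(\mathbb{N})| < \ldots$
$\ldots < |\mathcal{P}^n(\mathbb{N})| < |\mathcal{P}^{n+1}(\mathbb{N})| < \ldots$
$\ldots < |\bigcup_{n \in \mathbb{N}} \mathcal{P}^n(\mathbb{N})| < |\mathcal{P}(\bigcup_{n \in \mathbb{N}} \mathcal{P}^n(\mathbb{N}))| < \ldots$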

More elementary facts about sets.

There are many sets whose members are sets. For example, given the constructions of the number systems ($\mathbb{N}$, $\mathbb{Q}$, $\mathbb{R}$, and so on) that we have sketched, every subset of each of these number systems (every set of reals, of rationals, and so on) is a set whose members are sets. In fact, if mathematics is to be reducible to sets, then every set of relevance to mathematics should be a set of sets. Hence we may define $\mathcal{R}$ to be the set of all those sets $X$ such that

$X \notin X$.

$\mathcal{R}$ is indeed a collection of objects – namely, all sets $X$ such that $X \notin X$ –, and so (we would naturally say that) it is therefore a set.

⁶ As an aside, an easy consequence of Cantor’s theorem is that $|\{x \in \mathbb{R} : x \text{ is algebraic}\}| < |\{x \in \mathbb{R} : x \text{ is transcendental}\}|$ ($x \in \mathbb{C}$ is transcendental iff it is not algebraic, and $x$ is algebraic iff it is a root of a nonzero polynomial with rational coefficients). Curiously, 1873 is also the year when Charles Hermite proved that $e$ is transcendental; this was the first number known to be transcendental and which had not been constructed specifically for this purpose. To sum up, even if most reals are transcendental, it is not easy to show that naturally occurring reals are transcendental.


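$\mathcal{R}$ contains many sets. For instance, $\emptyset \in \mathcal{R}$, $1 \in \mathcal{R}$, every natural number is in $\mathcal{R}$, $\mathbb{N} \in \mathcal{R}$, $\mathbb{R} \in \mathcal{R}$, etc.

Question 1.4. Does $\mathcal{R}$ belong to $\mathcal{R}$?

Well, note that $\mathcal{R} \in \mathcal{R}$ if and only if $\mathcal{R} \notin \mathcal{R}$, which is a contradiction (!) In fact, if $\mathcal{R} \in \mathcal{R}$, then $\mathcal{R}$ satisfies the definition of $\mathcal{R}$ and therefore $\mathcal{R} \notin \mathcal{R}$. And if $\mathcal{R} \notin \mathcal{R}$, then $\mathcal{R}$ does not satisfy the definition of $\mathcal{R}$ and therefore, since it is a set, $\mathcal{R} \in \mathcal{R}$.

So $\mathcal{R}$ cannot possibly be a set!! This contradiction is known as Russell’s paradox.⁷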

Thinking exercise 1.1. Notice the resemblance between the proof of Cantor’s theorem (part (2)) and the argument behind Russell’s paradox. Note also that, whereas Russell’s paradox certainly shows that something goes wrong, there is nothing obviously contradictory about Cantor’s theorem.

So, our naive “theory” of sets is inconsistent and maybe it’s not so good a foundation of mathematics after all...

Is this the end of the story for set theory?

Well, we like to think in terms of objects built out of sets and like the simplicity of the foundations set theory was intending to provide. Also, we find the multiplicity of infinities predicted by set theory an exciting possibility, and there was nothing obviously contradictory in Cantor’s theorem.

A retreat: A valid move at this point would be to retreat to a more modest theory $T$ such that

(1) $T$ should express true facts about sets (or should we say plausible, desirable instead of true?),
(2) $T$ enables us to carry out enough constructions so as to build all usual mathematical objects (real numbers, spaces of functions, etc.),
(3) $T$ gives us an interesting theory of the infinite ($|\mathbb{N}| < |\mathcal{P}(\mathbb{N})|$, etc.), and such that
(4) we can prove that $T$ is consistent; or, if we cannot prove that, such that we have good reasons to believe that $T$ is consistent.

First questions:

(A) What is a theory?
(B) Which should be our guiding principles for designing $T$?

⁷ After Bertrand Russell, the logician who discovered it (in 1901).

We will answer (A) first. We will then set up a theory, called ZFC, and will see that it addresses (1)–(4) in a presumably satisfactory manner.⁸ ZFC has indeed become the standard foundation of mathematics.

1.2. You may want to read this subsection again later. Before jumping to the next section, let us look a bit more closely at how we got to the uncomfortable point of deriving Russell’s paradox. Our starting premises were:

(i) Logic: By this I mean completely universal true assertions like, for example, “either $P$ is true or $P$ is false, and it cannot be that $P$ is both true and false” for every statement $P$.
(ii) The assumption that every specifiable (definable) collection of objects is a set. Let us call this assumption the (naive) Comprehension principle.

Comprehension seemed to be universal enough that some (clever) people would even have agreed to call it a law of logic.⁹ In any case, the comprehension principle certainly was intuitively true to many (clever) people. It follows from the comprehension principle of course that the collection

$\mathcal{R} = \{X : X \text{ a set},\ X \notin X\}$

is a set. We then saw that if $\mathcal{R}$ is not a member of $\mathcal{R}$, then it is a member of $\mathcal{R}$, and that if $\mathcal{R}$ is a member of $\mathcal{R}$, then it is not. Hence we have both that $\mathcal{R} \in \mathcal{R}$ and that $\mathcal{R} \notin \mathcal{R}$, or neither of $\mathcal{R} \in \mathcal{R}$ and $\mathcal{R} \notin \mathcal{R}$, and this is impossible given our logic. So our starting assumption is simply inconsistent, and therefore either (i) or (ii) has to go.¹⁰ The standard verdict is to say: We perceive our logic to be so universal and so deeply ingrained in the way we actually think, that we would rather keep it as it is. Therefore we will have to conclude that Comprehension – that very intuitive principle – is false and that therefore we will have to revise our initial intuitions. More specifically, we will have to conclude that the collection $\mathcal{R}$ we defined above cannot possibly be a set.

⁸ Incidentally, ZFC will avoid Russell’s paradox simply because it will prove that $\mathcal{R}$ does not exist (even if it is a definable class of sets). As we will see, ZFC will prove that $\mathcal{R}$ is too big (in a precise sense) for it to be a set.

⁹ Even if it talks specifically about sets and is therefore not ‘completely universal.’

¹⁰ This came as quite a big shock when it happened.

What then about a collection like $\{0, 1, 2\}$? Is it a set or not? We could cleverly say that if, as we have seen, not every collection of objects can possibly be a set, then ‘collection of objects’ is not a definition of ‘set’. We thought we had a definition of set before, but it turned out we were wrong. Unless we can again define the notion of ‘set’ in terms of other entities we cannot answer the question whether $\{0, 1, 2\}$ is a set. It is not clear how we would do that and, even if we could do that, we would be left with the question of showing that the theory of those other entities is a good one.

In our next move, we will in fact not define what a set is, but will instead specify a list of properties of the universe of sets. This is exactly what we do when we do abstract group theory. We don’t say what the elements of a group are (they could be anything), but we know that they relate to each other in certain ways. Similarly, physical theories don’t say what physical things are; rather, they study the properties of physical things, in other words, how they relate to each other. There are several good reasons for treating sets axiomatically, more precisely for setting up an axiomatic theory of sets. One of them is that this way we will be able to actually look at the theory of sets as a mathematical object in itself and will thus be able to address questions like “Is this theory consistent at all?” Another reason is that we want to be certain that we are playing by the rules of the game when we prove things about sets, like Cantor’s theorem (which we have already seen). We are not asking for a plausible argument to convince ourselves of some assertion, but for an actual mathematical proof. For this – and in the best tradition of modern rigour – we need a complete and precise specification of the principles (axioms) that we are assuming.¹¹

However, at the present point, even if we certainly don’t have a definition of ‘set’, and even if we haven’t yet agreed on the promised list of properties of the universe of sets, we still feel that there is something fundamentally different between collections like $\mathcal{R}$, or like $V = \{X : X \text{ a set}\}$, on the one hand, and collections like $\emptyset$ or $\{0, 1, 2\}$ on the other hand. The point is not that $\mathcal{R}$ and $V = \{X : X \text{ a set}\}$ are infinite whereas $\emptyset$ and $\{0, 1, 2\}$ are finite and therefore we can effectively list their elements; after all, there are finite sets which are so large that we will never be able to actually list their elements. Also, $\mathbb{N}$ is infinite and we don’t see anything dangerous with the idea of $\mathbb{N}$ being a set.¹²

¹¹ After all we are interested in set theory as a stable foundation for mathematics.

¹² I hope.

The point is rather that the definitions of $\mathcal{R}$ and $V$ refer to a certain totality of objects – namely the totality of all sets – to which $\mathcal{R}$ and $V$ themselves belong if they are to be sets (!) This is not the case for $\emptyset$ or $\{0, 1, 2\}$: the definition of $\emptyset$ does not refer to anything, and the definition of $\{0, 1, 2\}$ refers only to 0, 1 and 2, and none of 0, 1 and 2 is $\{0, 1, 2\}$. We say that the definitions of $\mathcal{R}$ and $V$ are impredicative. A definition is impredicative if it refers to a totality of objects to which the thing purportedly being defined belongs. Impredicativity, like other forms of self–reference,¹³ is often problematic.¹⁴

In which way does ZFC block impredicativity? Well, as we will see, it is a theorem of ZFC – and in fact of the weaker theory ZF (ZF is ZFC without the Axiom of Choice) – that all sets are generated in stages. ZF proves that every set comes into existence, in a precise mathematical sense, at a later stage than all its members.¹⁵ Neither $\mathcal{R}$ nor $V$, the way we defined them, could be of this type. ZF will in fact prove that the universe of sets can be represented in a very ‘natural’ and appealing way. Once we prove this theorem you will probably say “yes, yes, of course that, and nothing else, is the right notion of set, not the notion that emerged from the Comprehension principle!” We will have thereby revised our initial intuitions about sets. Is there an element of indoctrination here? Well, I would say that yes, of course there is, like in so many aspects of life.¹⁶

¹³ For another example of self–reference imagine the sentence ‘The statement written on this blackboard is false’ written on a blackboard, and suppose there is nothing else written on that blackboard. In the lecture we saw yet another example in terms of games between two players.

¹⁴ Self–reference cannot be the only source of all evil, though. More precisely, not every instance of self–reference is necessarily problematic (or so it seems). Consider for example the statement ‘all sentences are false’. It refers to all sentences, in particular to itself. However, there is nothing problematic about this sentence. It is simply false: it is false that all sentences are false, since some are true, like ‘2 + 2 = 4’ (on the other hand, the sentence ‘The statement written on this blackboard is false’, if it is the only sentence written on a blackboard, can be neither true nor false). For a non–problematic example of impredicative definition in mathematics, consider the definition of the least upper bound $\mathrm{lub}(X)$ of a bounded set of reals $X$: $\mathrm{lub}(X) = \min\{b : b \text{ is an upper bound of } X\}$. Now, $\mathrm{lub}(X)$ is itself an upper bound of $X$, so it is in the set $\{b : b \text{ is an upper bound of } X\}$ that we are referring to in the very definition of $\mathrm{lub}(X)$ (as the minimum of this set). It is not a trivial task to separate non–problematic instances of self–reference from those which are problematic; in fact there has been disagreement throughout history as to how to distinguish them.

¹⁵ These stages are indexed by the so–called ordinals. We will see precisely what all this means.

¹⁶ On the other hand, it has been argued that this ‘revised intuition’ was actually Cantor’s original intuition about sets.


2. The axiomatic method: A crash course in first order logic

For us a theory will be a first order theory.¹⁷ A theory $T$ will always be a theory in a given formal language $L$.¹⁸ It will be a set (!) of $L$–formulas expressing facts about some intended domain of discourse.

Talk of “sets” of $L$–sentences before we have even defined $T$ (which might end up being an intended theory of sets)? Well, those sets of sentences, as well as the sentences, the language $L$, etc., are objects in our meta–theory. Presumably they will obey laws expressible in some meta–meta–theory (perhaps the same laws that the theory $T$ itself is, in our understanding, trying to express!).

A language L consists of

• a (possibly empty) set of constant symbols $c$, $d$, ...
• a (possibly empty) set of functionals (or function symbols) $f$, $g$, ..., together with their arities (this arity is a natural number; if $f$ is meant to represent a function $f^{\mathcal{M}} : M \to M$ it has arity 1, if it is meant to represent a function $f^{\mathcal{M}} : M \times M \to M$ it has arity 2, etc.)
• a (possibly empty) set of relationals (or relation symbols) $R$, $S$, ..., together with their arities (this arity is again a natural number; if $R$ is meant to represent a subset $R^{\mathcal{M}} \subseteq M$, then it has arity 1, if it is meant to express a binary relation $R^{\mathcal{M}} \subseteq M \times M$, then it has arity 2, etc.)

These are the non-logical symbols and completely determine $L$.

We also have logical symbols, which are independent from $L$:

• $\wedge$, $\vee$, $\neg$, $\to$, $\leftrightarrow$ (connectives)
• $\forall$, $\exists$ (quantifiers)
• $=$ (equality symbol)
• ( , ) and , (the two parentheses and the comma)

The parentheses and the comma are not really needed, but they make the presentation simpler. Also, some of the connectives and quantifiers are not actually necessary; we could do with just $\neg$, $\vee$ and $\exists$, for example.

Finally, we have a sufficiently large supply of variables: $\mathrm{Var} = \{v_0, v_1, \ldots, v_n, \ldots\}$. For most uses it is enough to take the set of variables to have the same size as the natural numbers.

¹⁷ You are not asked to learn these things by heart, but you are supposed to understand what is going on.

¹⁸ Rather than in a natural language like, say, English.


Examples: The language of set theory has only one non–logical symbol, namely a relation symbol $\in$ of arity 2, to be interpreted (we will define this in a moment) as the membership relation. The language of group theory has, typically, only a function symbol of arity 2, to be interpreted as the group operation.

The collection $\mathrm{Term}_L$ of terms of $L$ is defined in the following way.

(1) Every variable is a term.
(2) Every constant symbol is a term.
(3) For every $n$, if $t_1, \ldots, t_n$ are terms and $f$ is a function symbol of arity $n$, then $f(t_1, \ldots, t_n)$ is a term.
(4) All terms are of the form (1), (2), or (3).

We will typically use natural abbreviations to refer to terms; for example, if $+$ and $\cdot$ are functionals of arity 2, we may refer to a term like $\cdot(v_1, +(v_3, +(v_2, v_1)))$ by the expression $v_1 \cdot (v_3 + (v_2 + v_1))$, which is easier to read.
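The recursive definition of terms translates directly into a small data type. The sketch below is one possible Python encoding; the class names `Var`, `Const`, `App` and the functions `official` and `abbreviated` are hypothetical, chosen for this illustration. It prints the term from the example above both in official notation and in the abbreviated infix notation.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Var:
    index: int  # the variable v_index


@dataclass(frozen=True)
class Const:
    name: str  # a constant symbol


@dataclass(frozen=True)
class App:
    symbol: str  # a function symbol whose arity is len(args)
    args: Tuple["Term", ...]


Term = Union[Var, Const, App]

INFIX = {"+", "·"}  # binary functionals that we choose to abbreviate infix


def official(t):
    """Print a term exactly as built by clauses (1)-(3)."""
    if isinstance(t, Var):
        return f"v{t.index}"
    if isinstance(t, Const):
        return t.name
    return f"{t.symbol}({', '.join(official(a) for a in t.args)})"


def abbreviated(t, top=True):
    """Print binary functionals in INFIX between their arguments."""
    if isinstance(t, App) and t.symbol in INFIX and len(t.args) == 2:
        left, right = (abbreviated(a, top=False) for a in t.args)
        s = f"{left} {t.symbol} {right}"
        return s if top else f"({s})"
    return official(t)


term = App("·", (Var(1), App("+", (Var(3), App("+", (Var(2), Var(1)))))))
print(official(term))     # ·(v1, +(v3, +(v2, v1)))
print(abbreviated(term))  # v1 · (v3 + (v2 + v1))
```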

Now we are ready to define the collection $\mathrm{Fml}_L$ of $L$–formulas.

(1) (atomic formulas)¹⁹
    (a) $v_i = v_j$ for any, not necessarily distinct, variables $v_i$, $v_j$.
    (b) $R(t_1, \ldots, t_n)$ for any $n$, any relational $R$ of arity $n$, and every sequence $t_1, \ldots, t_n$ of terms (there could be repetitions among $t_1, \ldots, t_n$).
(2) If $\varphi$ and $\psi$ are formulas, then $(\neg\varphi)$, $(\varphi \vee \psi)$, $(\varphi \wedge \psi)$, $(\varphi \to \psi)$, $(\varphi \leftrightarrow \psi)$ are formulas. Also, if $v$ is a variable, then $(\forall v \varphi)$ and $(\exists v \varphi)$ are formulas.
(3) Something is a formula if and only if it is an atomic formula or is obtained from formulas as in (2).

When referring to a formula, we often omit parentheses to improve readability (these expressions are not actual official formulas but refer to them in a clear way).

A sentence is a formula $\varphi$ without free variables, i.e., such that every occurrence of a variable $v$ in $\varphi$ lies within some subformula of the form $(\forall v \psi)$ or of the form $(\exists v \psi)$.

Example: The atomic formulas in the language for set theory are the formulas of the form $v_i = v_j$ and $v_i \in v_j$, where $v_i$ and $v_j$ are variables (they could of course be the same variable).

¹⁹ They are called atomic because they cannot be split into simpler formulas.


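Examples of formulas in the language of set theory are the formulas abbreviated as:

$\forall x \forall y (x = y \leftrightarrow \forall z (z \in x \leftrightarrow z \in y))$

(The axiom of Extensionality)

$\forall x \forall y \exists z \forall w (w \in z \leftrightarrow (w = x \vee w = y))$

or, even more abbreviated,

“for all $x$, $y$, $\{x, y\}$ exists”

(Axiom of unordered pairs).

Another example:

$\exists a \exists b \forall y (y \in x \leftrightarrow ((\forall w (w \in y \leftrightarrow (w = a \vee w = b))) \vee (\forall w (w \in y \leftrightarrow w = a))))$

($x$ is an ordered pair)

The first two formulas are sentences. The third one is not.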

An example of a formula in the language for group theory is the sentence abbreviated by

$\neg \forall x, y, z\, ((x \cdot y) \cdot z = x \cdot (y \cdot z))$

Group theory implies that the above sentence is false. Next we will see what this means.

Satisfaction

Let us fix a language $L$.

An $L$–structure is an ordered pair of the form

$\mathcal{M} = (M, c^{\mathcal{M}}, f^{\mathcal{M}}, R^{\mathcal{M}})_{c, f, R \in L}$

where:

• $M$ is a nonempty set.²⁰
• For every constant symbol $c$, $c^{\mathcal{M}} \in M$.
• For every $n$ and every functional $f$ of arity $n$, $f^{\mathcal{M}}$ is a function, $f^{\mathcal{M}} : {}^{n}M \to M$.
• For every $n$ and every relational $R$ of arity $n$, $R^{\mathcal{M}} \subseteq {}^{n}M$.

In the above, ${}^{n}M$ denotes $\{(a_1, \ldots, a_n) : a_i \in M \text{ for all } i\}$, and we say that $M$ is the universe of $\mathcal{M}$.

²⁰ It is important that $M$ be a set.

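Given an $L$–structure $\mathcal{M}$ with universe $M$, an assignment in $\mathcal{M}$ is a function $\vec{a} : \mathrm{Var} \to M$.

Given an $L$–structure $\mathcal{M}$ with universe $M$ and an assignment $\vec{a} : \mathrm{Var} \to M$, we define $t^{\mathcal{M}}[\vec{a}]$, the interpretation of $t$ in $\mathcal{M}$ by $\vec{a}$, for terms $t \in \mathrm{Term}_L$, as follows (by recursion on the complexity of the terms, with the obvious meaning of ‘complexity’):

(1) If $t$ is a variable $v_i$, then $t^{\mathcal{M}}[\vec{a}] = \vec{a}(v_i)$.
(2) If $t$ is a constant $c$, then $t^{\mathcal{M}}[\vec{a}] = c^{\mathcal{M}}$.
(3) For all $n$, if $f$ is a functional of arity $n$, $t_1, \ldots, t_n$ are terms, and $t$ is $f(t_1, \ldots, t_n)$, then $t^{\mathcal{M}}[\vec{a}] = f^{\mathcal{M}}(t_1^{\mathcal{M}}[\vec{a}], \ldots, t_n^{\mathcal{M}}[\vec{a}])$.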

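Given an $L$–structure $\mathcal{M}$ with universe $M$, we define the relation $\mathcal{M} \models \Theta[\vec{a}]$ between formulas $\Theta$ and assignments $\vec{a} : \mathrm{Var} \to M$, in the following way (by recursion on the complexity of $\Theta$, with the obvious meaning of ‘complexity’):

(1) (For atomic formulas)
    (a) $\mathcal{M} \models (v_i = v_j)[\vec{a}]$ if and only if $\vec{a}(v_i) = \vec{a}(v_j)$.
    (b) For every $n$, every relation symbol $R$ of arity $n$, and every sequence $t_1, \ldots, t_n$ of terms, $\mathcal{M} \models R(t_1, \ldots, t_n)[\vec{a}]$ if and only if $(t_1^{\mathcal{M}}[\vec{a}], \ldots, t_n^{\mathcal{M}}[\vec{a}]) \in R^{\mathcal{M}}$.
(2) $\mathcal{M} \models (\neg\varphi)[\vec{a}]$ if and only if $\mathcal{M} \models \varphi[\vec{a}]$ does not hold.
(3) $\mathcal{M} \models (\varphi_0 \vee \varphi_1)[\vec{a}]$ if and only if $\mathcal{M} \models \varphi_0[\vec{a}]$ or $\mathcal{M} \models \varphi_1[\vec{a}]$ or both.
(4) $\mathcal{M} \models (\varphi_0 \wedge \varphi_1)[\vec{a}]$ if and only if $\mathcal{M} \models \varphi_0[\vec{a}]$ and $\mathcal{M} \models \varphi_1[\vec{a}]$.
(5) $\mathcal{M} \models (\varphi \to \psi)[\vec{a}]$ if and only if the following is true: If $\mathcal{M} \models \varphi[\vec{a}]$, then $\mathcal{M} \models \psi[\vec{a}]$.
(6) $\mathcal{M} \models (\varphi \leftrightarrow \psi)[\vec{a}]$ if and only if the following is true: $\mathcal{M} \models \varphi[\vec{a}]$ if and only if $\mathcal{M} \models \psi[\vec{a}]$.
(7) $\mathcal{M} \models (\exists v \varphi)[\vec{a}]$ if and only if there is some $b \in M$ such that $\mathcal{M} \models \varphi[\vec{a}(v/b)]$, where $\vec{a}(v/b)$ is the assignment $\vec{b}$ such that $\vec{b}(v_i) = \vec{a}(v_i)$ if $v \neq v_i$ and $\vec{b}(v) = b$.
(8) $\mathcal{M} \models (\forall v \varphi)[\vec{a}]$ if and only if for every $b \in M$, $\mathcal{M} \models \varphi[\vec{a}(v/b)]$.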

We say that $\mathcal{M}$ satisfies $\varphi$ with the assignment $\vec{a}$ iff $\mathcal{M} \models \varphi[\vec{a}]$.
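For a finite structure, the recursive clauses above can be run as a program. The following Python sketch handles only the language of set theory (one binary relation symbol, no constants or function symbols) and encodes formulas as nested tuples; this encoding, the tag names such as `"in"` and `"forall"`, and the function `satisfies` are assumptions made for the illustration. A dictionary listing only the variables that matter plays the role of the assignment $\vec{a}$.

```python
def satisfies(M, E, phi, a):
    """Decide M |= phi[a] for the finite structure with universe M and
    binary relation E (a set of pairs) interpreting the symbol 'in'."""
    op = phi[0]
    if op == "eq":        # clause (1)(a): v = w
        return a[phi[1]] == a[phi[2]]
    if op == "in":        # clause (1)(b): v in w
        return (a[phi[1]], a[phi[2]]) in E
    if op == "not":       # clause (2)
        return not satisfies(M, E, phi[1], a)
    if op == "or":        # clause (3)
        return satisfies(M, E, phi[1], a) or satisfies(M, E, phi[2], a)
    if op == "and":       # clause (4)
        return satisfies(M, E, phi[1], a) and satisfies(M, E, phi[2], a)
    if op == "imp":       # clause (5)
        return (not satisfies(M, E, phi[1], a)) or satisfies(M, E, phi[2], a)
    if op == "iff":       # clause (6)
        return satisfies(M, E, phi[1], a) == satisfies(M, E, phi[2], a)
    if op == "exists":    # clause (7): try every b in M for the variable
        return any(satisfies(M, E, phi[2], {**a, phi[1]: b}) for b in M)
    if op == "forall":    # clause (8)
        return all(satisfies(M, E, phi[2], {**a, phi[1]: b}) for b in M)
    raise ValueError(f"unknown connective: {op}")


# The axiom of Extensionality as a nested tuple.
EXT = ("forall", "x", ("forall", "y",
       ("iff", ("eq", "x", "y"),
               ("forall", "z", ("iff", ("in", "z", "x"), ("in", "z", "y"))))))

# Universe {0, 1, 2} with 0 in 1, 0 in 2, 1 in 2: Extensionality holds.
M3, E3 = [0, 1, 2], {(0, 1), (0, 2), (1, 2)}
assert satisfies(M3, E3, EXT, {})

# Two distinct objects with no members at all: Extensionality fails.
assert not satisfies([0, 1], set(), EXT, {})
```

Since `EXT` is a sentence, the starting assignment is irrelevant, which is why the empty dictionary suffices here.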

Easy fact: If $\sigma$ is a sentence, then $\mathcal{M} \models \sigma[\vec{a}]$ for some assignment $\vec{a}$ if and only if $\mathcal{M} \models \sigma[\vec{a}]$ for every assignment $\vec{a}$. In that case we say that $\mathcal{M}$ is a model of $\sigma$.

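Definition 2.1. Given a set $T$ of formulas and a formula $\varphi$, we write

$T \models \varphi$

if and only if for every $L$–structure $\mathcal{M}$ with universe $M$ and every assignment $\vec{a} : \mathrm{Var} \to M$, IF $\mathcal{M} \models \sigma[\vec{a}]$ for every $\sigma \in T$, THEN $\mathcal{M} \models \varphi[\vec{a}]$.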

The relation $\models$ aims at capturing the notion of ‘logical consequence’: $\varphi$ follows logically from $T$ if and only if $\varphi$ is true in every world in which $T$ is true. $\models$ is often called the relation of logical consequence.

‘First order’ in ‘first order logic’ refers to the fact that variables range in the above definition only over the individuals of the universe of the relevant $L$–structures $\mathcal{M}$. In second order logic we can have variables that range over (arbitrary) subsets of the universe of the relevant $L$–structures $\mathcal{M}$. Etc.

Syntactical deduction

Let $T$ be a set of formulas. We will view $T$ as a set of axioms and deduce theorems from $T$: A theorem of $T$ will be the final member $\sigma_n$ of a deduction

$$\sigma = (\sigma_0, \sigma_1, \ldots, \sigma_n)$$

from $T$, where we say that $\sigma = (\sigma_0, \sigma_1, \ldots, \sigma_n)$ is a deduction from $T$ if it is a finite sequence of $\mathcal{L}$-formulas and for every $i$,

• $\sigma_i$ is either in $T$, or
• $\sigma_i$ is a logical axiom of first order logic, or
• $\sigma_i$ is obtained from $\sigma_j$ and $\sigma_k$, for some $j, k < i$, by the rule of Modus Ponens “If $\varphi \to \psi$ is true and $\varphi$ is true, then $\psi$ is true” (for all $\mathcal{L}$-formulas $\varphi$, $\psi$). In other words, there are $j, k < i$ and an $\mathcal{L}$-formula $\varphi$ such that $\sigma_j$ is $\varphi$ and $\sigma_k$ is $\varphi \to \sigma_i$.
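The three conditions on a deduction are simple enough to be checked by a program. Below is a minimal sketch under explicit assumptions: formulas are encoded as strings (atoms) and tuples `("imp", p, q)`, and the only logical axioms recognised are instances of the single schema "p implies (q implies p)". Both the encoding and the names `is_axiom_k` and `is_deduction` are hypothetical; a full system would recognise the whole list of logical axioms.

```python
def is_axiom_k(phi):
    """Recognise instances of the schema  p -> (q -> p)  (one sample logical axiom)."""
    return (
        isinstance(phi, tuple) and len(phi) == 3 and phi[0] == "imp"
        and isinstance(phi[2], tuple) and len(phi[2]) == 3 and phi[2][0] == "imp"
        and phi[2][2] == phi[1]
    )


def is_deduction(seq, theory, is_logical_axiom=is_axiom_k):
    """Check that every member of seq is in the theory, is a logical axiom,
    or follows from two earlier members by Modus Ponens."""
    for i, s in enumerate(seq):
        earlier = seq[:i]
        # Modus Ponens: some earlier p together with the earlier formula p -> s.
        by_modus_ponens = any(("imp", p, s) in earlier for p in earlier)
        if not (s in theory or is_logical_axiom(s) or by_modus_ponens):
            return False
    return True


# From the theory {p} we deduce q -> p in three steps.
theory = {"p"}
deduction = [
    "p",                                  # member of the theory
    ("imp", "p", ("imp", "q", "p")),      # logical axiom
    ("imp", "q", "p"),                    # Modus Ponens from the two lines above
]
```

Here `is_deduction(deduction, theory)` holds, so the last formula is a theorem of the theory in the sense just defined, while the one-element sequence `["q"]` is not a deduction from it.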

Here, a logical axiom is a member of a certain infinite, easily specifiable list, independent of the theory, consisting of formulas that express logical / completely universal truths. Typical members of this list are, for example, all formulas of the form $\varphi \to (\psi \to \varphi)$, for formulas $\varphi$, $\psi$. Indeed, we see it as a general truth that if $\varphi$ is true, then it is true that if $\psi$ is true then $\varphi$ is true. Other typical members of this list are all instances of the schema $\varphi \vee \neg\varphi$, for $\varphi$ being any formula. Again, we see it as a general truth that for every $\varphi$ either $\varphi$ is true or $\neg\varphi$ is true.[21]

This list of axioms is not unique: Many different lists of axioms give rise to the same system of logic.

If $\varphi$ is a theorem from $T$, we write

$$T \vdash \varphi$$

$\vdash$ is often called the relation of logical derivability.

On the face of their definition, $\models$ and $\vdash$ are quite different relations, aimed at capturing two apparently different notions: The notion of logical (semantical) consequence and the notion of deducibility in a reasonable calculus. However, we do have the following remarkable fact, proved by the logician Kurt Gödel in his PhD thesis.

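Theorem 2.2. (Completeness theorem for first order logic) (K. Gödel, 1930’s) $\models \;=\; \vdash$. That is, for every set $T$ of formulas and every formula $\varphi$, $T \models \varphi$ if and only if $T \vdash \varphi$.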

A theory $T$ is consistent if no contradiction (say, $\exists x \neg(x = x)$) can be derived from it:

$$T \nvdash \exists x \neg(x = x)$$

Otherwise, it is inconsistent. In classical first order logic, a theory is inconsistent if and only if it is trivial, in the sense that it proves everything.[22]

By the completeness theorem the following are equivalent:

• $T$ is consistent.
• There is an $\mathcal{L}$-structure $\mathcal{M}$ such that $\mathcal{M} \models T$ ($T$ is true in some world).

An immediate consequence of the above is that a theory $T$ has a model $\mathcal{M}$ (i.e., $\mathcal{M} \models T$) if and only if every finite $T' \subseteq T$ has a model (since a contradiction from $T$ is a contradiction from some finite $T' \subseteq T$, as every derivation from $T$ involves only finitely many formulas). This is known as the Compactness Theorem for first order logic. The Compactness Theorem has many interesting uses [we’ll hopefully see an example in the lecture].

[21] If we are classical logicians. There are weakenings / versions of classical first order logic in which $\varphi \vee \neg\varphi$, also known as the Law of Excluded Middle, is not true for some choices of $\varphi$.

[22] There are other logics, so-called paraconsistent logics, which may allow the presence of contradictions but which nevertheless may not be trivial, i.e., which may not prove everything to be a theorem.


We will be interested in whether or not $T \vdash \sigma$ for various choices of theories $T$ and sentences $\sigma$. The following are equivalent, again by (the contrapositive of) the completeness theorem:

• $T \nvdash \sigma$
• There is an $\mathcal{L}$-structure $\mathcal{M}$ such that $\mathcal{M} \models T$ but $\mathcal{M} \models \neg\sigma$.

3. Axiomatic set theory: ZFC

Z is for Ernst Zermelo, F is for Abraham Fraenkel, C is for the Axiom of Choice.

The objects of set theory are sets. As in any axiomatic theory, they are not defined (they are feature-less objects; in the context of the theory there is nothing to them apart from what the theory says).

ZFC expresses facts about sets expressible in the first order language of set theory. The same is true for any other first order theory in the language of set theory. Some such theories, which are actually studied by set-theorists, are the following: ZF, ZF+AD, ZFC+“There is a supercompact cardinal”, ZFC+GCH, ZFC+$V = L$, ZFC+PFA, ...

Most ZFC axioms will be axioms saying that certain “classes” built out of given sets are actual sets (they are objects in the set-theoretic universe): Axiom 0, the Axiom of unordered pairs, the Union set Axiom, the Power set Axiom, the Axiom Scheme of Separation, the Axiom Scheme of Replacement and the Axiom of Infinity will be of this kind. Here, a class is any collection of objects, where this collection is definable, possibly with parameters. Examples are the class of all sets (which I referred to as $V$), the class $R$ of all sets $X$ such that $X \notin X$, and the class $\mathbb{N}$ of all natural numbers. A proper class will be a class which is not a set (for example $V$ and $R$).

ZFC will also have one axiom guaranteeing the existence of sets with a given property, even if these sets are not definable: The Axiom of Choice.[23] We will also have two “structural” axioms, namely the Axiom of Extensionality and the Axiom of Foundation.

A classification of the ZFC axioms.

(1) Structural axioms: Axiom of Extensionality, Axiom of Foundation.

(2) Constructive set-existence axioms: Axiom 0, the Axiom of unordered pairs, Union set Axiom, Power set Axiom, Axiom Scheme of Separation, Axiom Scheme of Replacement and Axiom of Infinity.

(3) Non-constructive set-existence axiom: Axiom of Choice.

[23] There are strengthenings of ZFC incorporating additional non-constructive set existence axioms.

3.1. The axioms. The following is the list of the ZFC axioms.

Axiom of Extensionality: Two sets are equal if and only if they have the same elements:

$$\forall x \forall y (x = y \leftrightarrow \forall z (z \in x \leftrightarrow z \in y))$$

In other words: the identity of a set is completely determined by its members.

Example: The sets

• $\emptyset$
• $\{(a, b, c, n) : a^n + b^n = c^n,\ a, b, c, n \in \mathbb{N},\ a, b, c \geq 2,\ n \geq 3\}$

are the same set.

Axiom 0: $\emptyset$ exists.

$$\exists x \forall y (y \in x \leftrightarrow y \neq y)$$

(of course $y \neq y$ abbreviates $\neg(y = y)$).

In the theory given by the Axiom of Extensionality together with Axiom 0 we can only prove the existence of one set:

$$\emptyset$$

So this theory is not so interesting yet. The theory $T$ = {Axiom 0, Axiom of Extensionality} surely is consistent: For any set $a$,

$$(\{a\}, \emptyset) \models T$$

On the other hand, note that $(\{a, b\}, \emptyset) \not\models T$ if $a \neq b$.
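Both claims about small structures can be confirmed by brute force. The sketch below assumes a structure is given as a finite universe together with a set `E` of pairs `(y, x)` read as "y is a member of x"; the function name `models_T` is chosen here for illustration.

```python
def models_T(universe, E):
    """Check Extensionality and Axiom 0 in the finite structure (universe, E)."""
    def member(y, x):
        return (y, x) in E

    # Extensionality: x = y  iff  x and y have the same members.
    extensionality = all(
        (x == y) == all(member(z, x) == member(z, y) for z in universe)
        for x in universe for y in universe
    )
    # Axiom 0: some x has no members (y != y never holds).
    axiom_0 = any(all(not member(y, x) for y in universe) for x in universe)
    return extensionality and axiom_0
```

With the empty membership relation, a one-element universe is a model, while in a two-element universe both elements are "empty" yet distinct, so Extensionality fails. A two-element universe does become a model once one element is declared a member of the other.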

Axiom of unordered pairs: For any sets $x$, $y$ there is a set whose members are exactly $x$ and $y$; in other words, $\{x, y\}$ exists.

$$\forall x \forall y \exists z \forall w (w \in z \leftrightarrow (w = x \vee w = y))$$

Exercise 3.1. Prove, using the Axiom of Extensionality, that if $x = y$, then $\{x, y\} = \{x\}$.


Recall that we defined the ordered pair $(x, y)$ as the set $\{\{x\}, \{x, y\}\}$.

The theory laid down so far gives us already the existence of infinitely many sets! For example $\emptyset$, $\{\emptyset\}$, $\{\{\emptyset\}\}$, $\{\{\{\emptyset\}\}\}$, $\{\{\{\{\emptyset\}\}\}\}$, $\{\emptyset, \{\emptyset\}\}$, $\{\emptyset, \{\emptyset, \{\emptyset\}\}\}$, $\{\{\emptyset\}, \{\emptyset, \{\emptyset\}\}\}$, $\{\emptyset, \{\emptyset, \{\emptyset, \{\emptyset\}\}\}\}$, ... With the definition of the natural numbers we have adopted these sets are: $0$, $1$, $\{1\} = (0, 0)$, $\{\{1\}\} = \{(0, 0)\}$, $((0, 0), (0, 0))$, $2 = \{0, 1\}$, $\{0, 2\}$, $\{1, 2\} = (0, 1)$, $\{0, \{0, 2\}\}$, ...

All sets whose existence is proved by the theory given so far have at most two elements (!).

Exercise 3.2. The theory laid down so far proves the existence of $(a, b)$ for all $a$, $b$.
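The identities between small sets and ordered pairs listed above can be checked with Python's immutable `frozenset`, which plays the role of a hereditarily finite set. This is only an illustrative sketch; the helper names `pair` and `opair` are assumptions, not notation from the notes.

```python
def pair(x, y):
    """Unordered pair {x, y}; note that pair(x, x) is the singleton {x}."""
    return frozenset({x, y})


def opair(x, y):
    """Ordered pair (x, y) = {{x}, {x, y}}."""
    return pair(pair(x, x), pair(x, y))


zero = frozenset()        # 0 is the empty set
one = pair(zero, zero)    # 1 = {0}
two = pair(zero, one)     # 2 = {0, 1}
```

With these definitions, `pair(one, one) == opair(zero, zero)` expresses $\{1\} = (0, 0)$ and `pair(one, two) == opair(zero, one)` expresses $\{1, 2\} = (0, 1)$. Every set built by `pair` has at most two elements, in line with the remark above.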

Union set Axiom: For every set $x$,

$$\bigcup x = \{y : (\exists w)(w \in x \wedge y \in w)\}$$

exists:

$$\forall x \exists v \forall y (y \in v \leftrightarrow (\exists w)(w \in x \wedge y \in w))$$

$\bigcup x$ is the set consisting of all the members of members of $x$, $\bigcup\bigcup x$ is the set of all the members of members of members of $x$, etc.

Notation 3.1. Given sets $x$, $y$, $x \cup y = \{a : a \in x \vee a \in y\} = \bigcup\{x, y\}$.

Note: Given sets $x$, $y$, $x \cup y$ exists (by the Axiom of unordered pairs and the Union set Axiom).

With the theory given so far we can prove the existence of: $\{0\} \cup \{1, 2\} = \{0, 1, 2\} = 3$, $\{0, 1, 2\} \cup \{3\} = \{0, 1, 2, 3\} = 4$, $\{0, 1, 2, 3\} \cup \{4\} = \{0, 1, 2, 3, 4\} = 5$, ....

So we can prove the existence of every individual natural number! Similarly, we can prove the existence of every finite set of natural numbers, every ordered pair of natural numbers, every tuple of natural numbers, every finite set of tuples of natural numbers, ... However, all particular sets proved to exist by the theory given so far are finite.

Notation 3.2. $z \subseteq x$ means: Every member of $z$ is a member of $x$.

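Power set Axiom: For every $x$ there is $y$ whose elements are exactly those $z$ which are a subset of $x$:

$$\forall x \exists y \forall z (z \in y \leftrightarrow (\forall w)(w \in z \to w \in x))$$

Notation 3.3. For every $a$, $\mathcal{P}(a) = \{z : z \subseteq a\}$.

The Power set Axiom says that $\mathcal{P}(a)$ is a set whenever $a$ is a set.

With the theory $T$ we have so far we can prove the existence of $\mathcal{P}(n)$ for any particular $n \in \mathbb{N}$.

For example:

• $\mathcal{P}(0) = \{\emptyset\} = 1$
• $\mathcal{P}(1) = \{\emptyset, \{\emptyset\}\} = 2$
• $\mathcal{P}(2) = \{\emptyset, \{\emptyset\}, \{\{\emptyset\}\}, \{\emptyset, \{\emptyset\}\}\} \neq 4$
• ...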
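These small computations can be replayed with `frozenset`. In the sketch below, `nat(n)` builds the natural number $n$ by repeatedly passing from $k$ to $k \cup \{k\}$, and `powerset` lists all subsets; both helper names are hypothetical.

```python
from itertools import combinations


def nat(n):
    """The natural number n as a set: 0 is the empty set and k + 1 = k ∪ {k}."""
    x = frozenset()
    for _ in range(n):
        x = x | frozenset({x})
    return x


def powerset(x):
    """P(x): the set of all subsets of the finite set x."""
    elems = list(x)
    return frozenset(
        frozenset(c) for r in range(len(elems) + 1) for c in combinations(elems, r)
    )
```

One can then check that `powerset(nat(0)) == nat(1)` and `powerset(nat(1)) == nat(2)`, whereas `powerset(nat(2))` has four elements but differs from `nat(4)`, since it contains $\{1\}$ rather than $3$.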

$T$ is consistent:[24] Let $(X_n)_{n \in \mathbb{N}}$ be defined recursively by

• $X_0 = \{\emptyset\}$
• $X_{n+1} = X_n \cup \{\{a, b\} : a, b \in X_n\} \cup \{\bigcup a : a \in X_n\} \cup \{\mathcal{P}(a) : a \in X_n\}$

Then $(\bigcup_{n \in \mathbb{N}} X_n, \in) \models T$.

Actually it would be enough to start with $\emptyset$ and take $X_{n+1} = \mathcal{P}(X_n)$ at each stage $n + 1$.

Note: All particular sets proved to exist by $T$ are still finite.

Axiom Scheme of Separation: Given any set $X$ and any first order property $P$,

$$\{y \in X : P(y)\}$$

exists; in other words: any definable subclass of a set exists as a set.

$$\forall x \forall v_0, \ldots, v_n \exists y \forall z (z \in y \leftrightarrow (z \in x \wedge \varphi(x, z, v_0, \ldots, v_n)))$$

for every $\mathcal{L}$-formula $\varphi(x, z, v_0, \ldots, v_n)$ such that $y$ does not occur as bound (i.e., non-free) variable in it, and where $x, y, z, v_0, \ldots, v_n$ are distinct variables.

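In the theory laid down so far we can prove the existence, for all $x$, $y$, of

$$x \times y = \{(a, b) : a \in x,\ b \in y\},$$

and much more.

Fact 3.4. The theory we have so far proves the existence of $x \times y$ for all $x$, $y$.

Proof. Work in the theory. Let $x$ and $y$ be given. Let $z = x \cup y$, which we know exists in our theory. For $a \in x$ and $b \in y$, both $\{a\}$ and $\{a, b\}$ are subsets of $z$, so $(a, b) = \{\{a\}, \{a, b\}\}$ is a member of $\mathcal{P}(\mathcal{P}(z))$. Hence $x \times y$ is a definable sub-collection of $\mathcal{P}(\mathcal{P}(x \cup y))$, and so $x \times y$ exists using Power set twice and Separation once. □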
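The proof of Fact 3.4 can be mirrored literally for small finite sets: form $\mathcal{P}(\mathcal{P}(x \cup y))$ and then separate out the ordered pairs. The following is a minimal sketch with hypothetical helper names; it is of course far less efficient than building the pairs directly.

```python
from itertools import combinations


def powerset(x):
    """P(x) for a finite set x."""
    elems = list(x)
    return frozenset(
        frozenset(c) for r in range(len(elems) + 1) for c in combinations(elems, r)
    )


def opair(a, b):
    """Ordered pair (a, b) = {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})


def cartesian(x, y):
    """x × y obtained by Separation from P(P(x ∪ y)), as in the proof of Fact 3.4."""
    candidates = powerset(powerset(x | y))          # Power set, twice
    return frozenset(                               # Separation, once
        p for p in candidates if any(p == opair(a, b) for a in x for b in y)
    )


zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
x = frozenset({zero, one})
y = frozenset({one, two})
```

For these `x` and `y`, `cartesian(x, y)` consists of exactly the four pairs $(0, 1)$, $(0, 2)$, $(1, 1)$ and $(1, 2)$.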

[24] Isn’t it?

For a formula $\varphi(v_0, \ldots, v_n, u, v)$, ‘$\varphi(v_0, \ldots, v_n, u, v)$ is functional’ is an abbreviation of the formula expressing “for all $u$ there is at most one $v$ such that $\varphi(v_0, \ldots, v_n, u, v)$”.[25]

Axiom Scheme of Replacement: Given any set $X$ and any definable (class)-function $F$, $\mathrm{range}(F \restriction X)$ is a set:

“For all $x, v_0, \ldots, v_n$, if $\varphi(v_0, \ldots, v_n, x, u, v)$ is functional, then there is $y$ such that for all $v$, $v \in y$ if and only if there is some $u \in x$ such that $\varphi(v_0, \ldots, v_n, x, u, v)$,”

for every formula $\varphi(v_0, \ldots, v_n, x, u, v)$ such that $y$ does not occur as bound variable, and where $x, y, u, v, v_0, \ldots, v_n$ are distinct variables.

Caution: The Axiom Schemes of Separation and Replacement are not axioms but infinite sets of axioms (!). However, it is obviously possible to write down a computer program which, given a sentence $\sigma$, recognises whether or not $\sigma$ belongs to either of these schemes.

Exercise 3.3. Prove that the Axiom Scheme of Separation is not needed and that in fact every instance of this axiom scheme follows from some instance of the Axiom Scheme of Replacement. More precisely, prove that if $T$ is the theory we have laid down so far, without any instances of the Axiom Scheme of Separation, then $T$ proves all instances of this axiom scheme.

Given a set $X$ such that $a \neq \emptyset$ for all $a \in X$, a choice function for $X$ is a function $f$ with $\mathrm{dom}(f) = X$ and such that $f(a) \in a$ for all $a \in X$.

Axiom of Choice (AC): Every set consisting of nonempty sets has a choice function.

Exercise 3.4. Write down a sentence expressing the Axiom of Choice.

AC is needed in a lot of mathematics. For example, to prove that every vector space has a basis, that there are sets of reals which are not Lebesgue measurable, etc. Nevertheless, historically AC has been seen with suspicion: Finite sets clearly have choice functions,[26] but if $X$ is infinite, where did the choice function for $X$ come from? Also, AC has “strange consequences”: For example, it is possible to decompose a sphere $S$ into finitely many pieces and rearrange them, without changing their volumes (in fact by moving them around and rotating them, and without running into one another), in such a way that we obtain two spheres with the same volume as $S$![27] This result is known as the Banach-Tarski paradox.[28]

As we will see later, AC has interesting equivalent formulations (modulo the rest of ZFC). For example AC is equivalent to “For every two nonempty sets $A$, $B$, either $|A| \leq |B|$ or $|B| < |A|$”. AC is also equivalent to “Every product of compact topological spaces is compact”.

[25] Which can be expressed in our language since it has $=$.

[26] Try to see why.

The Axiom of Foundation: If $X \neq \emptyset$ is a set, there is some $a \in X$ such that $b \notin a$ for every $b \in X$.

In other words: Every nonempty set has some $\in$-minimal element.

Modulo the other axioms (in particular AC), the following are equivalent:

• Foundation
• There are no $x_0, x_1, \ldots, x_n, x_{n+1}, \ldots$ such that $\ldots \in x_{n+1} \in x_n \in \ldots \in x_1 \in x_0$.
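For hereditarily finite sets modelled as Python `frozenset` objects, an $\in$-minimal element can be found by direct search. This is only a sketch: the name `epsilon_minimal` is hypothetical, and Python's `frozenset` values are automatically well-founded (a `frozenset` cannot be constructed so as to contain itself), so the search always succeeds on a nonempty input.

```python
def epsilon_minimal(X):
    """Return some a in X such that no b in X is a member of a, or None if X is empty."""
    for a in X:
        if not any(b in a for b in X):
            return a
    return None


zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
X = frozenset({one, two, frozenset({one})})   # X = {1, 2, {1}}
```

In this example $1 = \{0\}$ is the only $\in$-minimal element of $X$, because $0 \notin X$, while both $2$ and $\{1\}$ contain $1 \in X$.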

The idea behind Foundation is that sets are generated at different stages. If a set $X$ is generated at stage $\alpha$, then all members of $X$ have been generated at some stage before $\alpha$.

Foundation, together with Extensionality, of course, is perhaps the most fundamental axiom in set theory.

As with AC, one could perhaps also complain: Where did the $\in$-minimal element $a$ of $X$ come from? But wait. $a$ was already in $X$. If you remove $a$ from $X$, what you get is no longer $X$!

In fact, most people like Foundation: It says that the universe is generated in an orderly fashion. And it provides a very convenient tool to use in proofs, which we will be using all the time: Induction.

[27] The pieces are not Lebesgue measurable, though.

[28] The Banach-Tarski paradox is not an actual paradox, in the sense that Russell’s paradox is, but a counterintuitive fact.

Let $(V_n)_{n \in \mathbb{N}}$ be defined by recursion as follows.

• $V_0 = \emptyset$
• $V_{n+1} = \mathcal{P}(V_n)$

The theory laid down so far, $T$ = Ax0 + Extensionality + Unordered Pairs + Union + Power Set + Separation + Replacement + AC + Foundation, is consistent.[29] In fact

$$(\bigcup_n V_n, \in) \models T$$

Still, all sets proved by $T$ to exist are finite. In fact,

$$(\bigcup_n V_n, \in) \models \text{“Every set is finite”}$$

What do we mean by finite? For the moment let us say that a set $X$ is finite if and only if for every $a \in X$, $|X \setminus \{a\}| < |X|$. Correspondingly, let us say that a set is infinite if and only if it is not finite. This is not the official definition of ‘finite’ but is equivalent to the official definition. For now it is easier to work with the above ‘definition’, which does not involve the notion of ordinal (which we haven’t defined yet). In any case, $(\bigcup_n V_n, \in)$ thinks that every set is finite in this sense.
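The first few stages $V_n$ are small enough to compute explicitly. The sketch below uses hypothetical helper names and shows how quickly the stages grow; it stops at $V_5$, since $V_6$ already has $2^{65536}$ elements.

```python
from itertools import combinations


def powerset(x):
    """P(x) for a finite set x."""
    elems = list(x)
    return frozenset(
        frozenset(c) for r in range(len(elems) + 1) for c in combinations(elems, r)
    )


def V(n):
    """The n-th stage: V(0) is the empty set and V(n + 1) = P(V(n))."""
    stage = frozenset()
    for _ in range(n):
        stage = powerset(stage)
    return stage


sizes = [len(V(n)) for n in range(6)]   # number of sets in V_0, ..., V_5
```

Each stage is a finite set all of whose members are finite, each stage is contained in the next, and each stage is transitive (every member is also a subset).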

Axiom of Infinity: There is an infinite set.

Definition 3.5. Given a set $x$,

$$S(x) = x \cup \{x\}$$

(the successor of $x$).

So, $S(0) = 1$, $S(1) = 2$, ... $S(n) = n + 1$.

The Axiom of Infinity is equivalent to:

$$(\exists x)(\emptyset \in x \wedge (\forall y)(y \in x \to S(y) \in x))$$

This is also phrased as: There is an inductive set.

Proposition 3.6. Every inductive set is infinite (in our present sense).

Proof. Suppose $X$ is inductive, let $a \in X$, and let $f : X \setminus \{a\} \longrightarrow X$ be the function sending every set of the form $S^n(a)$ (for $n \in \mathbb{N}$, $n > 0$) to $S^{n-1}(a)$, and every set $x \in X$ which is not of the above form to $x$ itself. It is easily checked that this function $f$ is a bijection. Hence $|X \setminus \{a\}| = |X|$, and so $X$ is not finite. □

[29] Isn’t it?

One could also define “$\alpha$ is an ordinal” (which we will do soon). Then we would define a natural number as an ordinal $\alpha$ such that

(1) $\alpha$ is either $\emptyset$ or of the form $S(y)$ for some ordinal $y$ and
(2) for every $x \in \alpha$, $x$ is either $\emptyset$ or of the form $S(y)$ for some ordinal $y$.

The Axiom of Infinity is then equivalent to:

Axiom of Infinity’: The class of all natural numbers is a set.

In other words: The Axiom of Infinity’ says that there is some $x$ such that for all $y$, $y \in x$ if and only if $y$ is a natural number.

Remark 3.7. Note that Axiom of Infinity’ is a constructive set-existence axiom, whereas Axiom of Infinity was not, strictly speaking (it just says that there is an infinite set, without defining it). However, Axiom of Infinity’ and Axiom of Infinity are equivalent modulo the other axioms (and we don’t need many of them for this; in particular, we don’t need the Axiom of Choice). This observation shows that one should be careful when specifying what a classification of set-existence axioms into constructive set-existence axioms and non-constructive set-existence axioms would be. Indeed, depending on the context, two axioms which apparently belong to different classes can actually be equivalent.

Another observation that shows that one should be careful about the above classification is the following: Given any sentence $\sigma$, $\sigma$ is equivalent to a (seemingly) constructive set-existence axiom over any theory that proves, say, that $\emptyset$ exists as a set and that $R = \{y : y \notin y\}$ does not exist as a set. This axiom is $\exists x \forall y (y \in x \leftrightarrow (y \notin y \wedge \neg\sigma))$. The point of course is that if $\sigma$ is true then the above set $x$ is the empty set, and if $\sigma$ is false, then the above set $x$ is $R$.

The Axiom of Infinity’ completes the list of ZFC axioms.[30]

Notice the big leap when adding Infinity to the list of axioms. ZFC certainly proves the existence of infinite sets, by design! Before adding Infinity we had a theory $T$ which ‘surely’ was consistent.[31] Now, with the addition of Infinity, it’s not so obvious that ZFC is consistent...

Challenge 3.1. Construct a model of ZFC.

[30] As we noted above, we could have adopted Axiom of Infinity’ instead of Axiom of Infinity. For expository purposes (see Section 4), it is better to take Axiom of Infinity’ as our official axiom.

[31] Since $(\bigcup_{n \in \mathbb{N}} V_n, \in) \models T$.

3.2. ZFC vs. PA. [You may skip this subsection]

Peano Arithmetic, also known as PA, is the following first order theory for $(\mathbb{N}, S, +, \cdot, 0)$, where $S(n) = n + 1$ (in the language of arithmetic, i.e., the language with $S$, $+$, $\cdot$, $0$):

• $\forall x (S(x) \neq 0)$
• $\forall x, y (S(x) = S(y) \leftrightarrow x = y)$
• $\forall x (x + 0 = x)$
• $\forall x, y (x + S(y) = S(x + y))$
• $\forall x (x \cdot 0 = 0)$
• $\forall x, y (x \cdot S(y) = x \cdot y + x)$
• $\forall y ((\varphi(0, y) \wedge \forall x (\varphi(x, y) \to \varphi(S(x), y))) \to \forall x \varphi(x, y))$ for every first order formula $\varphi(x, y)$ in the language of arithmetic

(First order Induction Axiom Scheme)
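The axioms for $+$ and $\cdot$ are recursive definitions in disguise: each operation is determined by its value at $0$ and by how it behaves at a successor. A minimal sketch, computing on Python's built-in integers and using only the successor function (the names `add` and `mul` are chosen here for illustration):

```python
def S(n):
    """Successor."""
    return n + 1


def add(x, y):
    """x + 0 = x  and  x + S(y) = S(x + y)."""
    return x if y == 0 else S(add(x, y - 1))


def mul(x, y):
    """x · 0 = 0  and  x · S(y) = x · y + x."""
    return 0 if y == 0 else add(mul(x, y - 1), x)
```

On small inputs these agree with ordinary addition and multiplication, and one can spot-check consequences of the induction scheme such as distributivity, `mul(x, add(y, z)) == add(mul(x, y), mul(x, z))`.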

First order arithmetical facts can be expressed in this language, like for example “$\cdot$ is distributive with respect to $+$”, Fermat’s last theorem, Goldbach’s conjecture, ...

PA does prove many facts about $(\mathbb{N}, S, +, \cdot, 0)$. But it does not prove everything!

Theorem 3.8. (Gödel, 1930’s, Incompleteness Theorem (special case)) If PA is consistent then there is a sentence $\sigma$ in the language of arithmetic such that

• $\mathrm{PA} \nvdash \sigma$ and
• $\mathrm{PA} \nvdash \neg\sigma$

Gödel’s Incompleteness theorem(s), in their general formulation, are very profound facts that we will look back into in a moment.

The sentence $\sigma$ in the Incompleteness Theorem does not express any fact that mathematicians would have looked into prior to proving the incompleteness theorem. $\sigma$ is designed for the purpose of the proof only.[32]

[32] In its intended interpretation, $\sigma$ says about itself that it cannot be proved in PA. This type of self-reference may sound strange at first; in particular, it may seem doubtful that it even makes sense. However, there is a perfectly sound mathematical way to make sense of this, and in fact such a sentence can be written down in the language of arithmetic.

Notation 3.9. Given a set $X$ and $n \in \mathbb{N}$, let

$$[X]^n = \{a \subseteq X : |a| = |n|\}$$

Consider the following statement HP:

“For all $n$, $k$, $m$ there is some $N$ such that for every colouring $f$ of $[N]^n$ into $k$ colours there is some $Y \subseteq N$ such that $Y$ has at least $m$ many members and at least $\min(Y)$ many members and such that all members of $[Y]^n$ have the same colour under $f$.”

Here, $n$, $k$, $m$ and $N$ range over natural numbers.

HP can be easily expressed by a sentence, which I will also call HP, in the language of arithmetic.

ZFC proves that $(\mathbb{N}, S, +, \cdot, 0) \models \mathrm{HP}$. On the other hand:

Theorem 3.10. (L. Harrington and J. Paris, 1977): If PA is consistent, then

$$\mathrm{PA} \nvdash \mathrm{HP}$$

Consider the theory $T = (\mathrm{ZFC} \setminus \{\text{Infinity}\}) \cup \{\neg\text{Infinity}\}$. It turns out that $T$ and PA are essentially the same theory: There are effective translation procedures

$$\varphi \longmapsto \pi(\varphi)$$

between the sentences in the language of set theory and the sentences in the language of arithmetic and

$$\psi \longmapsto \rho(\psi)$$

between the sentences in the language of arithmetic and the sentences in the language of set theory such that for all $\varphi$, $\psi$,

• $T \vdash \varphi$ if and only if $\mathrm{PA} \vdash \pi(\varphi)$
• $\mathrm{PA} \vdash \psi$ if and only if $T \vdash \rho(\psi)$

The Harrington-Paris theorem gives an example of a simple “natural” (purely combinatorial) statement $\sigma$ talking only about finite sets which is true if there is an infinite set but need not be true if there are no infinite sets (!). Other examples have been found since then.
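The notes do not spell out the translations. One standard ingredient of such translations, not taken from the text above, is Ackermann's coding of hereditarily finite sets by natural numbers: $k$ is declared a "member" of $n$ exactly when the $k$-th binary digit of $n$ is 1. A minimal sketch, with hypothetical function names:

```python
def decode(n):
    """The hereditarily finite set coded by n: its members are the sets coded
    by those k for which bit k of n is 1."""
    return frozenset(decode(k) for k in range(n.bit_length()) if (n >> k) & 1)


def encode(x):
    """Inverse of decode: the sum of 2 ** encode(y) over the members y of x."""
    return sum(2 ** encode(y) for y in x)
```

For instance, `decode(0)` is the empty set, `decode(1)` is $\{\emptyset\}$ and `decode(3)` is $\{\emptyset, \{\emptyset\}\}$, and `encode(decode(n)) == n` for every natural number `n` one cares to try. Under this coding, statements about finite sets turn into statements about the binary digits of numbers.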

3.3. The consistency question. We pointed out that the theory $T = \mathrm{ZFC} \setminus \{\text{Infinity}\}$ was ‘surely’ consistent, based on the fact that $(\bigcup_{n \in \mathbb{N}} V_n, \in) \models T$ (assuming, in our metatheory, that $\mathcal{P}(a)$ exists for every $a$, that $\mathbb{N}$ exists, that the recursive construction of $F = (V_n)_{n \in \mathbb{N}}$ is a well-defined class-function, and that $\bigcup \mathrm{range}(F)$ exists, i.e., assuming something like ZFC in our metatheory!).

Question 3.11. Can we prove, in $T$ (equivalently, in PA), that $T$ is consistent? Can we prove, in ZFC, that ZFC is consistent?

The above questions do make sense: Both $T$ and PA have enough expressive power to make “$T$ is consistent”, “PA is consistent”, etc., expressible in the theory: For example, we can code formulas, proofs, and other syntactical notions as natural numbers and reduce a statement like “PA is consistent” to an arithmetical statement (some specific, but extremely complex, diophantine equation $p(x) = 0$ does not have solutions). It then makes sense to ask whether $T$ proves that $p(x) = 0$ does not have solutions.

Theorem 3.12. (Gödel’s Incompleteness Theorems) Suppose $T$ is a first order theory such that

• $T$ is computable (in the sense that there is an algorithm deciding, for any given sentence $\sigma$, whether or not $\sigma \in T$),
• $T$ interprets PA and
• $T$ is consistent.

Then:

(1) There is a sentence $\sigma$ such that
• $T \nvdash \sigma$ and
• $T \nvdash \neg\sigma$
(First Incompleteness Theorem)

(2) $T$ does not prove that $T$ is consistent ($T \nvdash \mathrm{Con}(T)$)
(Second Incompleteness Theorem)

A theory $T$ as in (1) is said to be incomplete.

Note: Both ZFC and PA are computable (in the above sense). Hence, IF they are consistent, THEN they are incomplete and they cannot prove their own consistency. It follows that if we adopt, say, ZFC as our meta-theory, we won’t be able to prove any statement of the form “ZFC+$\sigma$ is consistent”. What we can at most do is prove relative consistency statements of the form “If ZFC is consistent, then ZFC+$\sigma$ is consistent” ($\mathrm{Con}(\mathrm{ZFC}) \to \mathrm{Con}(\mathrm{ZFC}+\sigma)$).

On the other hand, note that $\mathrm{ZFC} \vdash \mathrm{Con}(\mathrm{ZFC} \setminus \{\text{Infinity}\})$ (equivalently, $\mathrm{ZFC} \vdash \mathrm{Con}(\mathrm{PA})$): Working within ZFC we can build the set $\bigcup_{n \in \mathbb{N}} V_n$ and we can prove

$$(\bigcup_{n \in \mathbb{N}} V_n, \in) \models \mathrm{ZFC} \setminus \{\text{Infinity}\}$$

We express the above fact by saying that ZFC has consistency strength strictly larger than $\mathrm{ZFC} \setminus \{\text{Infinity}\}$.

In general, we say that $T_1$ has consistency strength at least that of $T_0$ if and only if we can prove that if $T_1$ is consistent then $T_0$ is consistent. $T_1$ has consistency strength strictly larger than $T_0$ if and only if we can prove “if $T_1$ is consistent, then $T_0$ is consistent”, but we cannot prove “if $T_0$ is consistent, then $T_1$ is consistent” unless we can prove “$T_0$ is inconsistent.” And, similarly, we define “$T_0$ and $T_1$ are equiconsistent.”


It is important to bear in mind that we need to be careful with what we understand by the informal ‘proving’ in the above definition or otherwise we might render the notion of consistency strength uninteresting. In fact, if we identify ‘proving’ with ‘being true’, then all consistent theories would have the same consistency strength, and all inconsistent theories would have the same consistency strength too. This is not so interesting, so we instead interpret ‘proving’ as ‘proving within some reasonable theory, like PA or ZFC.’ For example, we should understand a statement of the form “$T_1$ has consistency strength strictly larger than $T_0$” as meaning, for example, that

(1) we can prove in ZFC the arithmetical statement $\mathrm{Con}(T_1) \to \mathrm{Con}(T_0)$, and that

(2) we can prove in ZFC that if $T_0$ and ZFC are both consistent, then ZFC does not prove the arithmetical statement $\mathrm{Con}(T_0) \to \mathrm{Con}(T_1)$.

For example, although ZFC does not prove Con(ZFC), if ZFC is consistent, it proves that ZFC+$\sigma$ and ZFC+$\neg\sigma$ are equiconsistent (i.e., it proves $\mathrm{Con}(\mathrm{ZFC}+\sigma) \leftrightarrow \mathrm{Con}(\mathrm{ZFC}+\neg\sigma)$) for many interesting choices of $\sigma$.

If $T_1$ has consistency strength strictly larger than $T_0$, then $T_1$ is more “daring” than $T_0$. There is a whole natural hierarchy of theories ordered by consistency strength:

• ZFC is equiconsistent with ZF ($= \mathrm{ZFC} \setminus \{\mathrm{AC}\}$) and is strictly stronger than $\mathrm{ZFC} \setminus \{\text{Infinity}\}$.
• ZFC + “There is an inaccessible cardinal” is strictly stronger than ZFC.
• ZFC + “There is a weakly compact cardinal” is strictly stronger than ZFC + “There is an inaccessible cardinal”.
• ZFC + “There is a measurable cardinal” is strictly stronger than ZFC + “There is a weakly compact cardinal”.
• ZFC + “There is a Woodin cardinal” is strictly stronger than ZFC + “There is a measurable cardinal”.
• ZFC + “There is a supercompact cardinal” is strictly stronger than ZFC + “There is a Woodin cardinal”.
• ZFC + “There is a huge cardinal” is strictly stronger than ZFC + “There is a supercompact cardinal”.
• ...

Let the Axiom Scheme of Comprehension be: For every formula $\varphi(y)$ in the language of set theory in which $x$ does not occur free,

$$\exists x \forall y (y \in x \leftrightarrow \varphi(y))$$


Our naive set theory $T$ we initially considered consists of the Axiom of Extensionality together with all instances of the Axiom Scheme of Comprehension. This was (essentially) Frege’s bold attempt to reduce all of mathematics to logic.

$T$ is of course inconsistent by Russell’s paradox.

In which way does ZFC (or ZF) neutralise Russell’s paradox? Well, the answer is simple: ZF proves that there is no $R$ such that for every $x$, $x \in R$ if and only if $x \notin x$. Working in an arbitrary model of ZF, we note that if there was such an $R$, then $R \in R$ if and only if $R \notin R$, which is impossible. Therefore we conclude that there is no such $R$.

In other words, $R = \{x : x \notin x\}$ is, in ZF, a proper class but not a set.
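The argument that no such $R$ exists uses nothing about ZF: it applies to any structure with a binary relation. As an illustration only, the sketch below searches every binary relation `E` on a small finite universe for an element playing the role of Russell's $R$; the function names are hypothetical.

```python
from itertools import product


def has_russell_element(universe, E):
    """Is there r such that, for every x, (x E r) holds iff (x E x) fails?"""
    return any(
        all(((x, r) in E) == ((x, x) not in E) for x in universe)
        for r in universe
    )


def search(n):
    """Try all 2 ** (n * n) binary relations on {0, ..., n - 1}; return one that
    has a Russell element, or None if there is no such relation."""
    universe = range(n)
    pairs = [(x, y) for x in universe for y in universe]
    for bits in product([False, True], repeat=len(pairs)):
        E = {p for p, keep in zip(pairs, bits) if keep}
        if has_russell_element(universe, E):
            return E
    return None
```

The search comes back empty for every size, because taking $x = r$ in the defining condition would require $(r, r) \in E$ to hold exactly when it fails.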

ZFC is not the only theory of sets that people have considered as a foundation for mathematics and which neutralises Russell’s paradox (and other related paradoxes). There are also: Type theories, Quine’s New Foundations (NF), etc. However, ZFC is the most well-suited for developing mathematics. Incidentally, it is worth pointing out that NF is not known to be consistent relative to any natural extension of ZFC.[33]

We cannot prove that ZFC is consistent. So why should we feel confident about its consistency?

The first observation is that the question on the consistency of ZFC is reducible to the question on the consistency of the smaller theory ZF:

• ZFC is equiconsistent with ZF: Given any $(M, R) \models \mathrm{ZF}$ there is an $L^M \subseteq M$ such that $(L^M, R \cap (L^M \times L^M)) \models \mathrm{ZFC}$ (Gödel).

OK, why should we trust ZF then? I will give two reasons next.

• All axioms of ZF are “reasonable” assertions about sets: ZF says that the set-theoretic universe is exactly the “cumulative hierarchy”, which provides a very appealing and very natural picture of the ‘generation of sets from previously generated sets’ (see later).[34] This is perhaps the best intrinsic justification for ZF and, as a by-product, speaks in favour of its consistency: The cumulative hierarchy looks so natural that it should be a “real object”. It satisfies the axioms of ZF. Therefore, ZF should not be inconsistent.
• History: No inconsistency has ever been detected within ZF.

[33] Placing this comment here is, admittedly, a bit ZFC-centric. One could believe in NF instead and, working within NF, could try to prove whether or not ZFC, or some extension of ZFC by large cardinals, say, is consistent.

[34] In the cumulative hierarchy there is no place for such large classes as Russell’s class $R$.

We will be working in ZF \{Foundation} until further notice.

4. Ordinals

Definition 4.1. A partial order is an ordered pair $(X, R)$ such that

• $R \subseteq X \times X$,
• for every $x \in X$, $(x, x) \in R$ ($R$ is reflexive),
• for all $x, y \in X$, $(x, y) \in R$ and $(y, x) \in R$ together imply $x = y$ ($R$ is anti-symmetric), and
• for all $x, y, z \in X$, $(x, y) \in R$ and $(y, z) \in R$ together imply $(x, z) \in R$ ($R$ is transitive).

We say that $R$ is a partial order on $X$. Also, we often write $xRy$ for $(x, y) \in R$.

$(X, R)$ is a total ordering (or linear order) if for all $x, y \in X$, either $xRy$ or $yRx$.

For example, $(\mathbb{N}, <)$, $(\mathbb{Q}, <)$ and $\mathbb{Q} \times \mathbb{Q}$ (with the product order) are partial orders, and $(\mathbb{N}, <)$ and $(\mathbb{Q}, <)$ are linear but $\mathbb{Q} \times \mathbb{Q}$ is not.
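On finite sets the conditions of Definition 4.1 can be tested directly. The sketch below uses hypothetical function names and a finite grid in place of $\mathbb{Q} \times \mathbb{Q}$; it checks a reflexive order on $\{0, 1, 2\}$ and the product order on $\{0, 1\} \times \{0, 1\}$.

```python
def is_partial_order(X, R):
    """Check the four clauses of Definition 4.1 for a finite set X."""
    inside = R <= {(x, y) for x in X for y in X}
    reflexive = all((x, x) in R for x in X)
    antisymmetric = all(
        x == y for x in X for y in X if (x, y) in R and (y, x) in R
    )
    transitive = all(
        (x, z) in R
        for x in X for y in X for z in X
        if (x, y) in R and (y, z) in R
    )
    return inside and reflexive and antisymmetric and transitive


def is_linear_order(X, R):
    """A partial order in which any two elements are comparable."""
    return is_partial_order(X, R) and all(
        (x, y) in R or (y, x) in R for x in X for y in X
    )


X = {0, 1, 2}
leq = {(x, y) for x in X for y in X if x <= y}
strict = {(x, y) for x in X for y in X if x < y}

grid = {(i, j) for i in (0, 1) for j in (0, 1)}
product_order = {(p, q) for p in grid for q in grid if p[0] <= q[0] and p[1] <= q[1]}
```

The product order is a partial order that is not linear, since $(0, 1)$ and $(1, 0)$ are incomparable. Note that the strict relation `strict` fails the reflexivity clause, so it only counts as an order in the relaxed sense in which these notes sometimes speak of a pair $(L, <)$.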

Definition 4.2. A binary relation $R \subseteq X \times X$ is well-founded if and only if for every nonempty $Y \subseteq X$ there is some $a \in Y$ which is $R$-minimal, i.e., such that $(b, a) \notin R$ for every $b \in Y$, $b \neq a$.

Definition 4.3. A well–order is a well–founded linear order.

For example, (5, <) and (N, <) are well–orders, but (R, <) is not.

Notation 4.4. Given a partial order $(L, \leq)$, $<$ is the relation on $L$ given by $y < x$ if and only if $y \leq x$ and $y \neq x$.

We will sometimes abuse language and say that a pair $(L, <)$ is a well-order if

• $<$ is transitive,
• $<$ is irreflexive (i.e., $x < x$ fails for all $x \in \mathrm{dom}(<)$) (i.e., $<$ is a strict order),
• $<$ is total (i.e., for all $x$, $y$, $x < y$, $y < x$ or $x = y$), and
• every nonempty subset of $\mathrm{dom}(<)$ has a $<$-minimal member.

We will often use the following fact, the proof of which I leave as an easy exercise.


Fact 4.5. Suppose $(L, \leq)$ is a well-order and $A \subseteq L$ is nonempty. Then $\min(A)$ exists.

Given a partial order $(L, \leq)$ and $x \in L$,

$$\mathrm{pred}((L, \leq), x) = \{y \in L : y < x\}$$

$A \subseteq L$ is an initial segment of $(L, \leq)$ iff for all $x < y \in L$, if $y \in A$, then $x \in A$. $A \subseteq L$ is a proper initial segment of $(L, \leq)$ if $A = \mathrm{pred}((L, \leq), a)$ for some $a \in L$.

The following fact is immediate.

Fact 4.6. If $(L, <)$ is a well-order and $A \subseteq L$, then $(A, < \restriction A)$ is a well-order.

Given two partial orders $(X_0, \leq_0)$, $(X_1, \leq_1)$, a bijection $f : X_0 \longrightarrow X_1$ is an order-isomorphism if and only if for all $x, y \in X_0$,

$$x \leq_0 y \text{ if and only if } f(x) \leq_1 f(y).$$

If there is such an order-isomorphism we write

$$(X_0, \leq_0) \cong (X_1, \leq_1)$$

Proposition 4.7. (Trichotomy) Let $(L_0, \leq_0)$ and $(L_1, \leq_1)$ be two well-orders. Then exactly one of the following holds.

(1) $(L_0, \leq_0) \cong (L_1, \leq_1)$
(2) $(L_0, \leq_0)$ is order-isomorphic to some proper initial segment of $(L_1, \leq_1)$.
(3) $(L_1, \leq_1)$ is order-isomorphic to some proper initial segment of $(L_0, \leq_0)$.

Proof. Let

$$f = \{(u, v) \in L_0 \times L_1 : \mathrm{pred}((L_0, \leq_0), u) \cong \mathrm{pred}((L_1, \leq_1), v)\}$$

$f$ is a function: Suppose $(u, v), (u, v') \in f$, $v \neq v'$. Wlog $v <_1 v'$. Since $\mathrm{pred}((L_0, \leq_0), u) \cong \mathrm{pred}((L_1, \leq_1), v)$ and $\mathrm{pred}((L_0, \leq_0), u) \cong \mathrm{pred}((L_1, \leq_1), v')$, by composing these order-isomorphisms we obtain an order-isomorphism

$$g : \mathrm{pred}((L_1, \leq_1), v) \longrightarrow \mathrm{pred}((L_1, \leq_1), v')$$

Let $\bar{v} <_1 v$ be such that $g(\bar{v}) = v$. The existence of $\bar{v}$ shows that

$$A = \{z \in L_1 : z <_1 g(z)\} \neq \emptyset$$

Let $z^* = \min(A)$. Let $z$ be such that $g(z) = z^*$. Then $g(z) = z^* <_1 g(z^*)$ implies $z <_1 z^*$ and therefore $z \notin A$. Hence, $z^* = g(z) \leq_1 z <_1 z^*$ and therefore $z^* <_1 z^*$. Contradiction.

Similarly one can show that for all $u$, $u'$, $u \leq_0 u'$ iff $f(u) \leq_1 f(u')$. In particular, $f$ is injective.

$\mathrm{dom}(f)$ is an initial segment of $(L_0, \leq_0)$: Let $u \in \mathrm{dom}(f)$ and let $u' <_0 u$. There is some $v \in L_1$ such that there is an order-isomorphism

$$h : \mathrm{pred}((L_0, \leq_0), u) \longrightarrow \mathrm{pred}((L_1, \leq_1), v)$$

Let $v' = h(u')$. Then $h \restriction \mathrm{pred}((L_0, \leq_0), u')$ is an order-isomorphism between $\mathrm{pred}((L_0, \leq_0), u')$ and $\mathrm{pred}((L_1, \leq_1), v')$. This shows $(u', v') \in f$.

Similarly one shows $\mathrm{range}(f)$ is an initial segment of $(L_1, \leq_1)$.

If either $\mathrm{dom}(f) = L_0$ or $\mathrm{range}(f) = L_1$, then we are done: In the first case, either $\mathrm{range}(f) = L_1$ and therefore $f$ is an order-isomorphism between $(L_0, \leq_0)$ and $(L_1, \leq_1)$, or else $\min(L_1 \setminus \mathrm{range}(f)) = v$ exists and $f$ is an order-isomorphism between $(L_0, \leq_0)$ and $\mathrm{pred}((L_1, \leq_1), v)$. In the second case one proceeds similarly.

Suppose towards a contradiction that $L_0 \setminus \mathrm{dom}(f)$ and $L_1 \setminus \mathrm{range}(f)$ are both nonempty. Let $u = \min(L_0 \setminus \mathrm{dom}(f))$ and $v = \min(L_1 \setminus \mathrm{range}(f))$. Then $f$ is an order-isomorphism between $\mathrm{pred}((L_0, \leq_0), u)$ and $\mathrm{pred}((L_1, \leq_1), v)$ and therefore $(u, v) \in f$. But then $u \in \mathrm{dom}(f)$ and $v \in \mathrm{range}(f)$. A contradiction.

Finally: It is easy to check that (1)-(3) are mutually exclusive. □

Definition 4.8. A set $x$ is transitive iff $y \subseteq x$ for every $y \in x$. In other words, $x$ is transitive iff $y \in x$ and $z \in y$ imply $z \in x$.

Examples:

• Every natural number is transitive.
• $\mathbb{N}$ is transitive.
• $\mathcal{P}(\mathbb{N})$ is transitive.
• $\{1\}$ is not transitive.

Definition 4.9. A set $\alpha$ is an ordinal if and only if $\alpha$ is transitive and well-ordered under $\in$. In other words, letting $\in|\alpha$ be the restriction of $\in$ to $\alpha \times \alpha$, i.e., the relation on $\alpha$ given by $x \in|\alpha\ y$ iff $x \in y$, $(\alpha, \in|\alpha)$ is a well-order.
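For hereditarily finite sets, both transitivity and the ordinal property can be tested by brute force. The sketch below uses `frozenset` and hypothetical helper names. It relies on two simplifying facts: for a finite set, a strict total order is automatically well-founded, and Python's `frozenset` values never contain themselves, so irreflexivity of $\in$ comes for free. The infinite examples $\mathbb{N}$ and $\mathcal{P}(\mathbb{N})$ are out of reach of such a check.

```python
def nat(n):
    """The natural number n as a set: 0 is the empty set and k + 1 = k ∪ {k}."""
    x = frozenset()
    for _ in range(n):
        x = x | frozenset({x})
    return x


def is_transitive(x):
    """Every member of x is also a subset of x."""
    return all(y <= x for y in x)


def is_ordinal(x):
    """x is transitive and membership is a strict total order on x (x finite)."""
    total = all(a == b or a in b or b in a for a in x for b in x)
    transitive_relation = all(
        a in c for a in x for b in x for c in x if a in b and b in c
    )
    return is_transitive(x) and total and transitive_relation


singleton_one = frozenset({nat(1)})                           # {1}
three_sets = frozenset({nat(0), nat(1), frozenset({nat(1)})})  # {0, 1, {1}}
```

Every `nat(n)` is an ordinal, `singleton_one` is not even transitive, and `three_sets` is transitive without being an ordinal, because $0$ and $\{1\}$ are not comparable under $\in$.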

Fact 4.10. If $\alpha$ is an ordinal and $x \in \alpha$, then $x$ is an ordinal and

$$x = \mathrm{pred}((\alpha, \in|\alpha), x)$$

Proof. Let $\alpha$ be an ordinal and $x \in \alpha$.


$x$ is transitive: Let $z \in y \in x$. Using the transitivity of $\alpha$ twice we have that $z \in y \in \alpha$ and therefore $z \in \alpha$. Since $\in|\alpha$ is a transitive relation (as $\alpha$ is an ordinal), $x$, $y$ and $z$ are in $\alpha$, and both $y \in x$ and $z \in y$ hold, we must have that $z \in x$.

The proof that $x = \mathrm{pred}((\alpha, \in|\alpha), x)$ is then trivial.

Since both $x$ and $\alpha$ are transitive, $(\in|\alpha) \cap (x \times x) = \in|x$. To see this, note that the following are equivalent for all sets $y$, $z$:

• $z \in y \in x$
• $y \in \alpha$ and $z \in \alpha$ and $z \in y \in x$.

But then $\in|x$ is a well-order on $x$ since it is the restriction of the well-order $\in|\alpha$ to $x$. □

Fact 4.11. If $\alpha$ and $\beta$ are ordinals and $f : (\alpha, \in) \longrightarrow (\beta, \in)$ is an order-isomorphism, then $f$ is the identity on $\alpha$. In particular, $\alpha = \beta$.

Proof. Suppose towards a contradiction that there is a minimal $\xi \in \alpha$ such that $f(\xi) \neq \xi$. Since $f \restriction \xi$ is the identity on $\xi$,

$f(\xi) = \{f(\xi') : \xi' \in \xi\} = \{\xi' : \xi' \in \xi\} = \xi$

which is a contradiction, where the first equality holds since the function $f : (\alpha, \in) \longrightarrow (\beta, \in)$ is an isomorphism. $\square$

Corollary 4.12. (Trichotomy for ordinals) Suppose $\alpha$ and $\beta$ are ordinals. Then exactly one of the following holds.

(1) $\alpha = \beta$
(2) $\alpha \in \beta$
(3) $\beta \in \alpha$

Corollary 4.13. For every ordinal $\alpha$, $\alpha \notin \alpha$.$^{35}$

Corollary 4.14. If $A$ is a nonempty set of ordinals, then $A$ has an $\in$–minimal element.

Proof. Let $\alpha \in A$. If $\alpha$ is not $\in$–minimal, then $A \cap \alpha \neq \emptyset$. But then $\beta = \min(A \cap \alpha)$ exists, and then $\beta$ is the $\in$–minimum of $A$. $\square$

We also have the following fact, which follows immediately from the transitivity of ordinals.

Fact 4.15. For all ordinals $\alpha$, $\beta$, $\gamma$, if $\alpha \in \beta$ and $\beta \in \gamma$, then $\alpha \in \gamma$.

$^{35}$ It follows from the Axiom of Foundation that $X \notin X$ for every set $X$. Indeed, if $X \in X$, then $A = \{X\}$ is a nonempty set such that there is no $a \in A$ such that $b \notin A$ for every $b \in a$, which violates the Axiom of Foundation. However, we are not assuming Foundation here. The point of Corollary 4.13 is that, as we have seen, it is true for ordinals, even in the absence of the Axiom of Foundation.


Notation 4.16. Ord denotes the class of all ordinals.

Corollaries 4.13, 4.12 and 4.14, together with Fact 4.15, yield that the relation $\in$ well–orders $\mathrm{Ord}$. So, if $\mathrm{Ord}$ were a set, it would be an ordinal (it is transitive by Fact 4.10). But then $\mathrm{Ord} \in \mathrm{Ord}$, which is a contradiction since we have seen that $\alpha \notin \alpha$ for every ordinal $\alpha$ (Corollary 4.13). Hence we have the following.

Theorem 4.17. Ord is not a set. (Burali–Forti Paradox)

On the other hand:

Fact 4.18. Every transitive set of ordinals is an ordinal.

Exercise 4.1. Prove Fact 4.18.

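Notation 4.19. In the context of ordinals, we will often use $<$ to denote $\in$. In other words, if $\alpha$ and $\beta$ are ordinals, $\alpha < \beta$ means $\alpha \in \beta$. Also, we will write $\alpha \le \beta$ to denote that either $\alpha = \beta$ or else $\alpha \in \beta$.

Fact 4.20. For any ordinals $\alpha$, $\beta$, $\alpha \le \beta$ if and only if $\alpha \subseteq \beta$.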

Exercise 4.2. Prove Fact 4.20.

The Burali–Forti Paradox indicates that there should be many ordinals. Here is one:

Fact 4.21. $\emptyset$ is an ordinal.

Notation 4.22. If $\alpha$ is an ordinal, then $\alpha + 1$ denotes $S(\alpha) = \alpha \cup \{\alpha\}$.

The following fact shows how to generate the least ordinal bigger than a given ordinal.

Fact 4.23. If $\alpha$ is an ordinal, then $\alpha + 1$ is an ordinal. In fact, $\alpha + 1 = \min\{\beta \in \mathrm{Ord} : \alpha < \beta\}$.

Proof. If $y \in S(\alpha)$, then either $y \in \alpha$ or $y = \alpha$. In the first case, $y \subseteq \alpha \subseteq S(\alpha)$ by the transitivity of $\alpha$. In the second case, $y = \alpha \subseteq \alpha \cup \{\alpha\}$. Hence $S(\alpha)$ is transitive.

Every member of $S(\alpha)$ is either a member of $\alpha$ or is $\alpha$, and hence is an ordinal and therefore transitive. It follows that $\in|S(\alpha)$ is a transitive relation, and it can be shown similarly that it is linear.

Finally, if $X \subseteq \alpha \cup \{\alpha\}$ is nonempty and $X \cap \alpha \neq \emptyset$, then an $\in$–minimal member of $X \cap \alpha$ (which exists since $\alpha$ is an ordinal) is $\in$–minimal in $X$. The other case is when $X = \{\alpha\}$. Then $\alpha$ is $\in$–minimal. This concludes the proof of the first part.

For the second part, suppose $\beta$ is an ordinal and $\alpha < \beta$. We want to see that $\alpha + 1 = \beta$ or else $\alpha + 1 < \beta$. If neither of these two things holds, then $\beta \in \alpha + 1$ by trichotomy. But then either $\beta \in \alpha$ or $\beta = \alpha$. In either case, $\alpha \notin \beta$, which is a contradiction. $\square$


Definition 4.24. An ordinal is a successor ordinal if and only if it is of the form $S(x)$. It is a limit ordinal if and only if it is not a successor ordinal (so, $\emptyset$ is a limit ordinal).

Definition 4.25. A natural number is an ordinal which is either $\emptyset$ or a successor ordinal and such that all its members are either $\emptyset$ or a successor ordinal.

Definition 4.26. A set is finite if and only if it is bijective with a natural number, and it is infinite if and only if it is not finite.

The set of all natural numbers is denoted by $\omega$. $\omega$ exists by the Axiom of Infinity.

$\omega$ is an ordinal since it is a transitive set of ordinals. It is the least nonzero limit ordinal.

$\omega + 1 = S(\omega) = \omega \cup \{\omega\}$, $(\omega + 1) + 1 = S(\omega + 1) = (\omega \cup \{\omega\}) \cup \{\omega \cup \{\omega\}\}$, etc. are successor ordinals.

We have seen in Fact 4.23 that the operation $\alpha \mapsto S(\alpha)$ generates an ordinal whenever it is applied to an ordinal. The following lemma highlights another operation for building ordinals out of certain other given ordinals (namely, the operation of taking the union of a set of ordinals).$^{36}$

Lemma 4.27. If $X$ is a set of ordinals, then $\gamma = \bigcup X$ is an ordinal. In fact

$\gamma = \sup(X) = \min\{\beta \in \mathrm{Ord} : \alpha \le \beta \text{ for all } \alpha \in X\}$

Proof. First of all note that $\bigcup X$ is a set by the Union set Axiom. Also, $\bigcup X$ consists of ordinals since every member of an ordinal is itself an ordinal, and it is transitive since $\alpha \in \beta \in \bigcup X$ implies that there is some $\delta \in X$ such that $\beta \in \delta$, and then $\alpha \in \delta$ by the transitivity of $\delta$, which gives $\alpha \in \bigcup X$. Hence, $\gamma = \bigcup X$ is an ordinal by Fact 4.18.

It is easy to see that $\alpha \le \gamma$ for all $\alpha \in X$. In fact, for every such $\alpha$ we have that $\alpha \subseteq \gamma$ by definition of $\bigcup X$, and therefore $\alpha \le \gamma$ by Fact 4.20. Also, if $\beta$ is an ordinal such that $\alpha \le \beta$ for all $\alpha \in X$ yet $\beta < \gamma$, then $\beta \in \gamma$, which means that there is some $\alpha \in X$ such that $\beta \in \alpha \le \beta$ and therefore $\beta \in \alpha \subseteq \beta$ by Fact 4.20. But then $\beta \in \beta$, which we know is impossible for ordinals. $\square$
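The two operations for building ordinals, the successor of Fact 4.23 and the union of Lemma 4.27, can be tried out on finite ordinals. This is a small sketch under the assumption that ordinals are modelled as nested `frozenset`s; the names `successor`, `union` and `nat` are chosen here for illustration.

```python
def successor(a):
    """S(a) = a ∪ {a}."""
    return a | frozenset({a})

def union(xs):
    """The union of a collection xs of sets."""
    out = frozenset()
    for x in xs:
        out = out | x
    return out

def nat(n):
    """The finite ordinal n = {0, ..., n-1}, built by iterating the successor."""
    a = frozenset()
    for _ in range(n):
        a = successor(a)
    return a

# The union of a set of finite ordinals is its supremum.
X = {nat(2), nat(5), nat(3)}
print(union(X) == nat(5))
```

Note that `union(successor(a)) == a` for every finite ordinal `a`, in line with the fact that the supremum of the members of $\alpha + 1$ is $\alpha$.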

We will automatically view ordinals $\alpha$ as endowed with the relation $\in|\alpha$ well–ordering them.

$^{36}$ All ordinals can in fact be thought of as being generated in either of these two ways.


Next we aim to show that ordinals are canonical representatives of well–orders, in the sense that every well–order is order–isomorphic, in a unique way, to a unique ordinal.

Lemma 4.28. Let $(L, \le)$ be a well–order and $\alpha$ an ordinal. Then there is at most one order–isomorphism $f : (L, \le) \longrightarrow (\alpha, \in)$.

This lemma is immediate since the composition of order–isomorphisms is an order–isomorphism, the inverse of an order–isomorphism is an order–isomorphism, and since the identity is the only order–isomorphism between $(\alpha, \in)$ and itself.

Theorem 4.29. Every well–order $(L, \le)$ is order–isomorphic to a unique ordinal.

Proof. By what we have seen it suffices to prove that $(L, \le)$ is order–isomorphic to some ordinal.

Suppose, for a contradiction, that

$\{y \in L : \mathrm{pred}((L, \le), y) \not\cong (\alpha, \in) \text{ for any } \alpha \in \mathrm{Ord}\} \neq \emptyset$

and let

$x = \min\{y \in L : \mathrm{pred}((L, \le), y) \not\cong (\alpha, \in) \text{ for any } \alpha \in \mathrm{Ord}\}$

By the lemma, for all $z < x$ let $\alpha_z$ be the unique ordinal such that $(\mathrm{pred}((L, \le), z), \le) \cong (\alpha_z, \in)$ and let

$f_z : (\mathrm{pred}((L, \le), z), \le) \longrightarrow (\alpha_z, \in)$

be the corresponding unique order–isomorphism. Then, again by the lemma, if $z < z' < x$, then $\alpha_z \in \alpha_{z'}$ and $f_z = f_{z'} \restriction \mathrm{pred}((L, \le), z)$.

Assume $\max(\mathrm{pred}((L, \le), x))$ does not exist (the proof in the other case is similar [Exercise]). Let now

$f : (\mathrm{pred}((L, \le), x), \le) \longrightarrow \mathrm{Ord}$

be given by $f(y) = f_{y'}(y)$ for any $y'$ such that $y < y' < x$. This function is well–defined by the above and it is easy to see that it is an order–isomorphism between $\mathrm{pred}((L, \le), x)$ and $(\mathrm{range}(f), \in)$. But $\mathrm{range}(f)$ is a set, by Replacement, and it is transitive. Therefore it is an ordinal. This contradicts the choice of $x$. We thus have that for every $x \in L$ there is a unique ordinal $\alpha_x$ such that there is an order–isomorphism

$f_x : \mathrm{pred}((L, \le), x) \longrightarrow (\alpha_x, \in)$,

and this isomorphism is unique.

Now, arguing as above, we can glue together all these isomorphisms into an isomorphism $f : (L, \le) \longrightarrow (X, \in)$, where $X$ is a transitive set of ordinals and therefore an ordinal. $\square$


Exercise 4.3. Complete the proof of Theorem 4.29.

Given a well–order $(L, \le)$, the unique ordinal $\alpha$ such that $(L, \le) \cong (\alpha, \in|\alpha)$ is the order type of $(L, \le)$, denoted $\mathrm{ot}((L, \le))$. If $X$ is a set of ordinals, then $\mathrm{ot}(X)$ denotes the order type of $(X, \in|X)$.
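For a finite well–order the isomorphism of Theorem 4.29 can be computed directly, since it satisfies $f(y) = \{f(z) : z < y\}$. The sketch below assumes the order is passed as a Python predicate `less` for the strict order; `collapse`, `nat` and `zero_on_top` are hypothetical names used only for this illustration.

```python
def nat(n):
    """The finite ordinal n = {0, ..., n-1} as a nested frozenset."""
    a = frozenset()
    for _ in range(n):
        a = a | {a}
    return a

def collapse(elements, less):
    """Return (f, ot) for a finite strict well-order: f(y) = {f(z) : z < y}, ot = range(f)."""
    f = {}
    remaining = set(elements)
    while remaining:
        # The least remaining element has no remaining element below it.
        y = next(u for u in remaining if not any(less(v, u) for v in remaining))
        f[y] = frozenset(f[z] for z in elements if less(z, y))
        remaining.remove(y)
    return f, frozenset(f.values())

def zero_on_top(a, b):
    """Strict order on natural numbers placing 0 above every positive number."""
    return a != 0 and (b == 0 or a < b)

f, order_type = collapse(range(5), zero_on_top)
print(order_type == nat(5), f[0] == nat(4))
```

Here the five elements are ordered as $1 < 2 < 3 < 4 < 0$, so the top element $0$ is sent to the ordinal $4$.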

Exercise 4.4. Prove that if $\alpha$ is an ordinal and $X \subseteq \alpha$, then $\mathrm{ot}(X) \le \alpha$.

Many sets can be well–ordered in different ways (so that the corresponding well–orders have different order types). For example, $\omega$ can be well–ordered by $\in$ in order type $\omega$. And it can be well–ordered by putting $0$ on top of every $n > 0$ and well–ordering $\omega \setminus \{0\}$ according to $\in$. This well–order has order type $\omega + 1$.

5. Cardinals

We have seen the ordinals $\omega + 1$, $(\omega + 1) + 1$, $((\omega + 1) + 1) + 1$, etc., aka $\omega + 1$, $\omega + 2$, $\omega + 3$, etc. The set consisting of all natural numbers and $\omega + n$ for every $n < \omega$ is a transitive set of ordinals and therefore also an ordinal. It is called $\omega + \omega$. We can then build $(\omega + \omega) + 1$, and so on. All these ordinals are countable in the sense of the following definition.

Definition 5.1. A set $X$ is countable iff there is a bijection $f : \omega \longrightarrow X$.

Question 5.2. Is there an infinite ordinal which is not bijective with $\omega$?

Definition 5.3. A cardinal is an ordinal $\kappa$ such that $\kappa$ is not bijective with any ordinal $\alpha < \kappa$.

So, each natural number is a cardinal, $\omega$ is a cardinal, but no $\omega + n$ with $0 < n < \omega$ is a cardinal. And the same goes for $\omega + \omega$, $(\omega + \omega) + 1$, etc.

When regarded as a cardinal, $\omega$ is also denoted $\aleph_0$ (so $\mathbb{N} = \omega = \aleph_0$).

Notation 5.4. If $X$ is bijective with a cardinal $\kappa$, we say that $\kappa$ is the cardinality of $X$ and write $|X| = \kappa$.

We will see that ZFC proves that every set is bijective with an ordinal. Caveat: In a context without AC one can extend the notion of cardinals to things that are not ordinals in a perfectly meaningful way. We don't need to do that for the moment. So, for us, at least for the moment, cardinals are ordinals. Cardinals, in our sense, are sometimes also called 'alephs'.


Definition 5.5. $\omega_1$, also denoted $\aleph_1$, is the first uncountable cardinal (in other words, the first infinite cardinal not bijective with $\omega$).

Proposition 5.6. $\omega_1$ exists.

Proof. Say that $X \subseteq \omega$ codes a well–order if

$\{(n, m) \in \omega \times \omega : 2^{n+1} 3^{m+1} \in X\}$

is a well–order.

It is easy to see that every initial segment of a well–order coded by a subset of $\omega$ can itself be coded by a subset of $\omega$. Hence

$\gamma = \{\alpha : \alpha = \mathrm{ot}(\preceq) \text{ for some } \preceq \text{ coded by some } X \subseteq \omega\}$

is transitive and is a set, thanks to the Power set Axiom together with a suitable instance of Replacement, since it is $\mathrm{range}(F)$, where $F : \mathcal{P}(\omega) \longrightarrow \mathrm{Ord}$ is the function sending $X$ to $\mathrm{ot}(\preceq)$ if $X$ codes $\preceq$ and to $0$ otherwise. Hence $\gamma$ is an ordinal.

$\gamma$ is not countable: If $f : \gamma \longrightarrow \omega$ were a bijection,

$\{2^{n+1} 3^{m+1} : f^{-1}(n) \in f^{-1}(m)\}$

would be a subset of $\omega$ coding a well–order of order type $\gamma$. But then $\gamma \in \gamma$, which is impossible for ordinals. $\square$

Remark 5.7. The use of the Power set Axiom in the above proof is crucial. Indeed, there are models of ZFC $\setminus$ {Power set Axiom} in which $\omega_1$ does not exist; in other words, these models think that every infinite ordinal is countable.

Exercise 5.1. Prove that the ordinal $\gamma$ in the above proof is precisely $\omega_1$; that is,

$\omega_1 = \{\alpha : \alpha = \mathrm{ot}(\preceq) \text{ for some } \preceq \text{ coded by some } X \subseteq \omega\}$

Similarly, one can prove in ZF that there is a least cardinal strictly bigger than $\omega_1$. It is called $\omega_2$, or $\aleph_2$. In general, we define:

Definition 5.8. Given an ordinal $\alpha$, $\aleph_\alpha$, also denoted $\omega_\alpha$, is the $\alpha$–th infinite cardinal.

Definition 5.9. Given a cardinal $\kappa$, $\kappa^+$, the successor of $\kappa$, is the least cardinal strictly bigger than $\kappa$.

Hence, $(\aleph_0)^+ = \aleph_1$, $(\aleph_1)^+ = \aleph_2$, and in general, $(\aleph_\alpha)^+ = \aleph_{\alpha+1}$.

Definition 5.10. A cardinal $\lambda$ is a successor cardinal iff $\lambda = \kappa^+$ for some cardinal $\kappa$. A cardinal $\lambda$ is a limit cardinal iff $\lambda$ is not a successor cardinal.

Proposition 5.11. (ZF) For every infinite cardinal $\kappa$, $\kappa^+$ exists.


Proof. Similar to the proof that $\aleph_1$ exists:

Every initial segment $R'$ of a well–order $R \subseteq \kappa \times \kappa$ is a well–order and of course $R' \subseteq \kappa \times \kappa$. Hence,

$\gamma = \{\alpha : \alpha = \mathrm{ot}(R) \text{ for some well–order } R \subseteq \kappa \times \kappa\}$

is transitive and is a set since it is $\mathrm{range}(F)$, where $F : \mathcal{P}(\kappa \times \kappa) \longrightarrow \mathrm{Ord}$ is the function sending $X$ to $\mathrm{ot}(X)$ if $X$ is a well–order on $\kappa$, and to $0$ otherwise. Hence $\gamma$ is an ordinal.

$\gamma$ is not bijective with $\kappa$: If $f : \kappa \longrightarrow \gamma$ were a bijection,

$\{(\alpha_0, \alpha_1) \in \kappa \times \kappa : f(\alpha_0) \in f(\alpha_1)\}$

would be a well–order of order type $\gamma$. But then $\gamma \in \gamma$, which is impossible for ordinals. $\square$

Exercise 5.2. Prove that

$\kappa^+ = \{\alpha : \alpha = \mathrm{ot}(R) \text{ for some well–order } R \subseteq \kappa \times \kappa\}$

We will need the following notion later on.

Regular cardinals: Let $(P, \le)$ be a partial order and let $X \subseteq P$. $X$ is cofinal iff for every $a \in P$ there is some $b \in X$ such that $a \le b$.

Definition 5.12. Given an ordinal $\alpha$, the cofinality of $\alpha$, $\mathrm{cf}(\alpha)$, is the least ordinal $\kappa$ such that there is a set $X \subseteq \alpha$, $X$ cofinal in $\alpha$, such that $\mathrm{ot}(X) = \kappa$.

Definition 5.13. An ordinal $\kappa$ is regular if and only if there is no $\alpha < \kappa$ for which there is a function $f : \alpha \longrightarrow \kappa$ with range cofinal in $\kappa$ (in other words, iff $\mathrm{cf}(\kappa) = \kappa$). An ordinal is singular iff it is a limit ordinal and it is not regular.

Exercise 5.3. Prove that:

(1) Every regular ordinal is a cardinal.
(2) $0$ and $1$ are the only regular natural numbers.
(3) $\omega$ is regular.
(4) $\omega_1$ is regular in ZFC.

5.1. The Cantor–Bernstein–Schroder Theorem.

Theorem 5.14. (Cantor–Bernstein–Schroder Theorem) (ZF) For all sets $X$ and $Y$, the following are equivalent.

(1) $|X| \le |Y|$ and $|Y| \le |X|$.
(2) $|X| = |Y|$

Proof. The implication from (2) to (1) is of course trivial, so we only need to prove that (1) implies (2). For this, let $f : X \longrightarrow Y$ and $g : Y \longrightarrow X$ be injective functions. By replacing, if necessary, $X$ and $Y$ by, for example, $X \times \{0\}$ and $Y \times \{1\}$, respectively, we may assume that $X$ and $Y$ are disjoint in the first place. Given $c \in X \cup Y$, let $\sigma_c$ be the $\subseteq$–maximal sequence with domain included in $\mathbb{Z}$ such that $\sigma_c(0) = c$, $\sigma_c(z + 1) = f(\sigma_c(z))$ or $\sigma_c(z + 1) = g(\sigma_c(z))$ depending on whether $\sigma_c(z) \in X$ or $\sigma_c(z) \in Y$, such that $\sigma_c(z - 1) = \bar{c}$ if $\bar{c} \in X$ is such that $f(\bar{c}) = \sigma_c(z)$ (if $\sigma_c(z) \in Y$ and if there is such a $\bar{c}$), and such that $\sigma_c(z - 1) = \bar{c}$ if $\bar{c} \in Y$ is such that $g(\bar{c}) = \sigma_c(z)$ (if $\sigma_c(z) \in X$ and if there is such a $\bar{c}$). We will call $\sigma_c$ the orbit of $c$. We say that $\sigma_c$ starts in $X$ if there is some $z \in \mathbb{Z}$ in the domain of $\sigma_c$ such that $\sigma_c(z) \in X$ and such that there is no $\bar{c} \in Y$ with $g(\bar{c}) = \sigma_c(z)$ (so $\sigma_c(z)$ is the first member of $\sigma_c$). Similarly we define '$\sigma_c$ starts in $Y$'. And we say that $\sigma_c$ does not start in the remaining case (i.e., if and only if $\mathrm{dom}(\sigma_c) = \mathbb{Z}$). We also say that a set $\sigma$ is an orbit if $\sigma$ is (the range of) the orbit of some $c \in X \cup Y$ in the above sense.

The first observation is that every two distinct orbits are disjoint and that the orbits partition $X \cup Y$. The second observation is that if $\sigma$ is an orbit, then

• $f \restriction \sigma$ is a bijection between $\sigma \cap X$ and $\sigma \cap Y$ if $\sigma$ starts in $X$,
• $g \restriction \sigma$ is a bijection between $\sigma \cap Y$ and $\sigma \cap X$ if $\sigma$ starts in $Y$, and
• $f \restriction \sigma$ is a bijection between $\sigma \cap X$ and $\sigma \cap Y$ and $g \restriction \sigma$ is a bijection between $\sigma \cap Y$ and $\sigma \cap X$ if $\sigma$ does not start.

Using these two observations we can now define a bijection $h : X \longrightarrow Y$ by 'gluing together' suitable restrictions of $f$ and/or of the inverse of $g$: Given $a \in X$, if the unique orbit to which $a$ belongs starts in $X$ or does not start, then $h(a) = f(a)$. And if this orbit starts in $Y$, let $h(a)$ be the unique $b \in Y$ such that $g(b) = a$. $\square$

As we have seen in this proof, if $f : X \longrightarrow Y$ and $g : Y \longrightarrow X$ are injective functions, then there is a bijection $h : X \longrightarrow Y$ that can be effectively constructed from $f$ and $g$. For example, let $f$ be the identity on $\{2n : n \in \omega\}$ and let $g : \omega \longrightarrow \{2n : n \in \omega\}$ be given by $g(n) = 4n$. These are injective non–surjective functions between $\omega$ and $\{2n : n \in \omega\}$, and the above proof produces a bijection $h : \{2n : n \in \omega\} \longrightarrow \omega$ that can be effectively constructed from $f$ and $g$.
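The bijection for this example can be computed by tracing orbits backwards, as in the proof of Theorem 5.14. The following is a sketch only: it takes $X = \{2n : n \in \omega\}$ and $Y = \omega$, tags elements with their side to keep the two sets disjoint, and assumes that every backward trace either stops or runs into a cycle (true for this $f$ and $g$, but not for arbitrary injections). The names `orbit_start` and `cbs_bijection` are illustrative.

```python
def orbit_start(a, f_inv, g_inv):
    """Trace the orbit of a in X backwards.

    Returns "X" or "Y" if the orbit starts on that side, and None if the
    trace runs into a cycle (the orbit does not start).
    """
    side, current, seen = "X", a, set()
    while (side, current) not in seen:
        seen.add((side, current))
        previous = g_inv(current) if side == "X" else f_inv(current)
        if previous is None:
            return side
        side, current = ("Y" if side == "X" else "X"), previous
    return None

def cbs_bijection(f, f_inv, g_inv):
    """h(a) = f(a) unless the orbit of a starts in Y, in which case h(a) = g^{-1}(a)."""
    def h(a):
        if orbit_start(a, f_inv, g_inv) == "Y":
            return g_inv(a)
        return f(a)
    return h

# f is the identity on the even numbers; g(n) = 4n maps omega into the even numbers.
f = lambda a: a
f_inv = lambda b: b if b % 2 == 0 else None   # preimage under f, if any
g_inv = lambda a: a // 4 if a % 4 == 0 else None  # preimage under g, if any

h = cbs_bijection(f, f_inv, g_inv)
print([(a, h(a)) for a in range(0, 18, 2)])
```

For instance, the orbit of $4$ starts in $Y$ at $1$ (since $1$ is odd, it has no preimage under $f$), so $h(4) = 1$, whereas the orbit of $8$ starts in $X$ at $2$, so $h(8) = 8$.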

Corollary 5.15 follows immediately from Theorem 5.14.


Corollary 5.15. The following are equivalent for any two cardinals $\kappa$, $\lambda$.

(1) $|\kappa| \le |\lambda|$ (in other words, there is an injective function $f : \kappa \longrightarrow \lambda$).
(2) $\kappa \le \lambda$.

Let us now consider the following statement:

Dual C–B–S: For all sets $X$, $Y$, the following are equivalent:

(1) $|X| = |Y|$
(2) There is a surjection $f : X \longrightarrow Y$ and there is a surjection $g : Y \longrightarrow X$.

Proposition 5.16. (ZFC) Dual C–B–S is true.

Proof. That (1) implies (2) is trivial. Suppose (2) holds. Using AC we find functions $\bar{f} : Y \longrightarrow X$ and $\bar{g} : X \longrightarrow Y$ as follows:

• For every $b \in Y$, $\bar{f}(b)$ is some $a \in X$ such that $f(a) = b$.
• For every $a \in X$, $\bar{g}(a)$ is some $b \in Y$ such that $g(b) = a$.

Then $\bar{f}$ and $\bar{g}$ are injective functions, so by C–B–S, $|X| = |Y|$. $\square$

The following question is apparently open.

Question 5.17. It is not known whether or not, modulo ZF, Dual C–B–S is equivalent to the Axiom of Choice.

5.2. More on cardinals. We have seen that $\aleph_0$ exists (by the Axiom of Infinity), and so does $\aleph_1$ and, in general, $\aleph_n$ for every $n$ (this follows immediately from Proposition 5.11). It turns out that

$\bigcup\{\aleph_n : n < \omega\}$

is the least cardinal bigger than $\aleph_n$ for all $n < \omega$. Hence, since of course $\omega = \sup\{n : n \in \omega\}$, we have that $\aleph_\omega = \bigcup\{\aleph_n : n < \omega\}$. This follows immediately from the following general fact.

Fact 5.18. If $X$ is a set of cardinals, then $\lambda := \bigcup X$ is a cardinal and is the least cardinal $\mu$ such that $|\kappa| \le |\mu|$ for every $\kappa \in X$.

Proof. We know that $\lambda$ is a union of a set of ordinals, so it is an ordinal by Lemma 4.27, and in fact $\lambda = \min\{\beta \in \mathrm{Ord} : \kappa \le \beta \text{ for all } \kappa \in X\}$. Suppose, towards a contradiction, that there is an ordinal $\alpha < \lambda$ and a bijection $f : \alpha \longrightarrow \lambda$. By definition of $\lambda = \bigcup X$ there is some $\kappa \in X$ such that $\alpha < \kappa$. But then, noting that $\kappa \subseteq \bigcup X = \lambda$, $f^{-1} \restriction \kappa$ is a bijection between $\kappa$ and a subset $Y$ of $\alpha$. By Exercise 4.4 we have that the order type of $Y$ is such that $\mathrm{ot}(Y) \le \alpha$. Let $g : Y \longrightarrow \mathrm{ot}(Y)$ be a bijection. Then $g \circ (f^{-1} \restriction \kappa) : \kappa \longrightarrow \mathrm{ot}(Y)$ is a bijection, which is impossible since $\kappa$ is a cardinal and $\mathrm{ot}(Y) \le \alpha < \kappa$. Hence $\lambda$ is a cardinal.

Finally, the fact that $\lambda$ is the least cardinal $\mu$ such that $|\kappa| \le |\mu|$ for all $\kappa \in X$ follows immediately from the fact that $\lambda = \min\{\beta \in \mathrm{Ord} : \kappa \le \beta \text{ for all } \kappa \in X\}$ together with Corollary 5.15. $\square$

Exercise 5.4. Prove that $\aleph_\omega$ is a singular cardinal.

The following fundamental theorem, which fits well here, has of course appeared already (see Theorem 1.3 (2)).

Theorem 5.19. (Cantor's Theorem) Suppose $X$ is a set. Then there is no surjective function $f : X \longrightarrow \mathcal{P}(X)$.

The proof of this theorem was done, back in Section 1, when we gave Theorem 1.3. Notice that the argument given there can indeed be implemented in ZF.

5.3. Countable and uncountable sets. As we know (cf. Definition 5.1), a set $X$ is countable iff there is a bijection between $\aleph_0$ and $X$. Let us see some examples of countable sets.

Proposition 5.20. (ZF) The following sets are countable.

• $\omega \times \omega$; in general, ${}^{n}\omega := \{s : s : n \longrightarrow \omega\}$ for any $n \in \omega$, $n \ge 1$. (The set ${}^{n}\omega$ is sometimes denoted also $\omega^n$.)
• ${}^{<\omega}\omega := \bigcup_{n \in \omega} {}^{n}\omega$. (The set ${}^{<\omega}\omega$ is also denoted $\omega^{<\omega}$.)
• $[\omega]^n := \{X \subseteq \omega : |X| = n\}$ for any $n \in \omega$, $n \ge 1$.
• $[\omega]^{<\omega} = \{X \subseteq \omega : X \text{ finite}\}$
• $\mathbb{Z}$
• $\mathbb{Q}$
• The set of algebraic numbers ($x \in \mathbb{C}$ is algebraic if and only if it is a root of a polynomial with rational coefficients).

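Proof. We can produce the corresponding bijections $h : X \longrightarrow \omega$ (or $h : \omega \longrightarrow X$) by showing that there are one–to–one functions $f : X \longrightarrow \omega$ and $g : \omega \longrightarrow X$ and then appealing to C–B–S. In many cases the existence of at least one of these one–to–one functions is immediate.

An injective $f : \omega \times \omega \longrightarrow \omega$ is given for example by $f(n, m) = 2^{n+1} 3^{m+1}$ (we used this coding in the proof of Proposition 5.6). The existence of a bijection between ${}^{n}\omega$ and $\omega$ can be proved by induction on $n$ since $|{}^{n+1}\omega| = |({}^{n}\omega) \times \omega|$. This gives a definable sequence $(f_n)_{1 \le n < \omega}$ where $f_n : \omega \longrightarrow {}^{n}\omega$ is a bijection for all $n$. We can then find a bijection $f : \omega \times \omega \longrightarrow {}^{<\omega}\omega$ by, for example, sending $(0, 0)$ to $\emptyset$, sending $(0, m + 1)$ to $f_1(m)$, and sending $(n, m)$ with $n \ge 1$ to $f_{n+1}(m)$. Since there is a bijection $g : \omega \longrightarrow \omega \times \omega$, the composition $f \circ g : \omega \longrightarrow {}^{<\omega}\omega$ is a bijection. To see that there is an injection $h_n : [\omega]^n \longrightarrow \omega$ for every $n \ge 1$, send $x \in [\omega]^n$ to $f_n^{-1}((x_0, \ldots, x_{n-1}))$, where $(x_0, \ldots, x_{n-1})$ is the strictly increasing enumeration of $x$. We can also code all these injections together into an injection $h : [\omega]^{<\omega} \longrightarrow \omega$ much as in the proof of $|{}^{<\omega}\omega| = |\omega|$. Since there are obvious injections in the other direction, C–B–S applies.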
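The coding $f(n, m) = 2^{n+1} 3^{m+1}$ is easy to experiment with. The sketch below, with illustrative names `encode`, `decode` and `coded_relation`, decodes a number by stripping factors of 2 and 3, and reads off the relation coded by a finite set $X \subseteq \omega$ in the sense of the proof of Proposition 5.6.

```python
def encode(n, m):
    """f(n, m) = 2^(n+1) * 3^(m+1)."""
    return 2 ** (n + 1) * 3 ** (m + 1)

def decode(k):
    """Return (n, m) if k = 2^(n+1) * 3^(m+1) for some n, m, and None otherwise."""
    if k <= 0:
        return None
    n = m = 0
    while k % 2 == 0:
        k //= 2
        n += 1
    while k % 3 == 0:
        k //= 3
        m += 1
    if k != 1 or n == 0 or m == 0:
        return None
    return (n - 1, m - 1)

def coded_relation(X):
    """The set of pairs (n, m) with 2^(n+1) 3^(m+1) in X."""
    return {decode(k) for k in X if decode(k) is not None}

# The strict order 0 < 1 < 2 on {0, 1, 2}, coded as a set of natural numbers.
X = {encode(0, 1), encode(0, 2), encode(1, 2)}
print(sorted(X), sorted(coded_relation(X)))
```

Injectivity of the coding is just uniqueness of prime factorisation; numbers that are not of the form $2^{n+1} 3^{m+1}$, such as $5$ or $2$, are simply ignored by `coded_relation`.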

Using the above bijections and any of the usual representations of $\mathbb{Z}$ and $\mathbb{Q}$ (as, say, pairs of natural numbers and pairs of integers, respectively), we can easily build bijections between $\omega$ and $\mathbb{Z}$ and between $\omega$ and $\mathbb{Q}$. Using also the above bijections, we can well–order all polynomials with coefficients in $\mathbb{Q}$ in length $\omega$. Once this is done, we can easily find a bijection between a subset of $\omega \times \omega$ and the set of algebraic numbers (which gives what we want by C–B–S): Given $(n, k)$, if $p(x)$ is the $n$–th polynomial with rational coefficients and $p(x)$ has at least $k$ distinct roots, then we send $(n, k)$ to the $k$–th root of $p(x)$ in (say) the linear order $<_{lex}$ of $\mathbb{C}$ given by $a_0 + ib_0 <_{lex} a_1 + ib_1$ iff either $a_0 < a_1$ or else $a_0 = a_1$ and $b_0 < b_1$ (where $<$ refers to the usual order on the real line). $\square$

Proposition 5.21. (ZFC) The union of every countable collection of countable sets is countable: If $(X_n)_{n \in \omega}$ is such that each $X_n$ is countable, then $\bigcup_{n \in \omega} X_n$ is countable.

Proof. $\aleph_0 \le |\bigcup_{n \in \omega} X_n|$ is clear: There is a bijection $f : \omega \longrightarrow X_0$, and $f : \omega \longrightarrow \bigcup_n X_n$ is an injection.

$|\bigcup_n X_n| \le \aleph_0$: For every $n < \omega$ pick, using the Axiom of Choice, a bijection $f_n : X_n \longrightarrow \omega$ (i.e., let $X = \{F_n : n \in \omega\}$ where for each $n$, $F_n$ is the set of all pairs $(n, f)$, where $f : X_n \longrightarrow \omega$ is a bijection, let $G$ be a choice function for $X$, and let $f_n = f$ if $G(F_n) = (n, f)$).

Now let $F : \bigcup_{n \in \omega} X_n \longrightarrow \omega \times \omega$ be the function sending $x$ to $(n, f_n(x))$ if $n$ is the first $k < \omega$ such that $x \in X_k$. $F$ is an injection, and if $g : \omega \times \omega \longrightarrow \omega$ is a bijection (which exists since $|\omega \times \omega| = \aleph_0$), then $g \circ F : \bigcup_n X_n \longrightarrow \omega$ is an injection.

Since $|\bigcup_n X_n| \le \aleph_0$ and $\aleph_0 \le |\bigcup_n X_n|$, by C–B–S we get $|\bigcup_n X_n| = \aleph_0$. $\square$

Some form of Choice is necessary in the above proposition. In fact,this proposition is not necessarily true without the Axiom of Choice: IfZF is consistent, then there are models of ZF in which !1 is a countableunion of countable sets (!)

Definition 5.22. A set is uncountable if and only if it is not finite or countable.

The following are some examples of uncountable sets.

• $\omega_1$, and in fact all ordinals $\alpha \ge \omega_1$.
• $\mathcal{P}(\omega)$ (by Cantor's Theorem 5.19).


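Notation 5.23. Given a set $X$, we write $|X| = 2^{\aleph_0}$ iff there is a bijection $f : X \longrightarrow \mathcal{P}(\omega)$. In general, if $\kappa$ is a cardinal and $X$ is a set, we write $|X| = 2^{\kappa}$ iff there is a bijection $f : X \longrightarrow \mathcal{P}(\kappa)$.

Proposition 5.24. $|\mathbb{R}| = 2^{\aleph_0}$. In particular, $\mathbb{R}$ is uncountable.

Proof. Let $I$ be the closed–open interval $[0, 1) \subseteq \mathbb{R}$. Since of course $|[0, 1)| \le |\mathbb{R}|$, by C–B–S it suffices to show $|\mathcal{P}(\omega)| \le |[0, 1)|$ and $|\mathbb{R}| \le |\mathcal{P}(\omega)|$.

Let $f : \mathcal{P}(\omega) \longrightarrow [0, 1)$ send $X \subseteq \omega$ to $\sum_{n \in \omega} \frac{\epsilon_n}{2^{2n+2}}$, where $\epsilon_n = 0$ if $n \notin X$ and $\epsilon_n = 1$ if $n \in X$ (i.e., $(\epsilon_n)_{n \in \omega}$ is the characteristic function of $X$). This sum is at most $1/3$, and distinct sets yield distinct base–4 expansions using only the digits $0$ and $1$.

Let $h : \mathbb{Q} \longrightarrow \omega$ be a bijection and let $g : \mathbb{R} \longrightarrow \mathcal{P}(\omega)$ send $x \in \mathbb{R}$ to $\{h(q) : q < x\}$ (this $<$ is of course the natural order on $\mathbb{R}$).

$f$ and $g$ are injective functions, so by C–B–S, $|[0, 1)| = |\mathbb{R}| = |\mathcal{P}(\omega)|$. $\square$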

Remark: Even if $|\mathbb{Q}| = \aleph_0 < |\mathcal{P}(\omega)| = |\mathbb{R}|$, the rationals are dense in the reals, i.e., between every two reals there is some rational (in fact, infinitely many) (!)

Exercise 5.5. Prove that $|\mathbb{C}| = 2^{\aleph_0}$.

Hence, since the set of algebraic numbers is countable, most complex numbers are transcendental (i.e., non–algebraic). In fact

$|\{x \in \mathbb{C} : x \text{ transcendental}\}| = |\mathbb{C}| = |\mathbb{R}| = |\mathcal{P}(\omega)|$

5.4. Almost disjoint families. Note that a collection of pairwise disjoint subsets of $\omega$ has to be finite or countable. We even have the following.

Exercise 5.6. Let $A \subseteq \mathcal{P}(\omega)$ and suppose $n < \omega$ is such that $|a \cap b| \le n$ for all distinct $a, b \in A$. Prove that $A$ is finite or countable.

Definition 5.25. Two sets $X$, $Y$ are almost disjoint if $X \cap Y$ is finite. A set $A$ is an almost disjoint family of sets iff any two distinct members of $A$ are almost disjoint.

We have seen that $|\omega| < |\mathcal{P}(\omega)|$. Therefore, the following might look surprising.

Theorem 5.26. (ZF) There is an almost disjoint family $A$ of subsets of $\omega$ with $|A| = 2^{\aleph_0}$.

Proof. Let ${}^{<\omega}2$ be the complete binary tree of height $\omega$, that is, the tree of $n$–sequences of 0's and 1's, for $n < \omega$. We have seen that ${}^{<\omega}2$ is countable. In fact, there is a simple enumeration of the nodes of ${}^{<\omega}2$, where the first member is $\emptyset$, the next two members are $\langle 0 \rangle$ and $\langle 1 \rangle$, the next four members are $\langle 0, 0 \rangle$, $\langle 0, 1 \rangle$, $\langle 1, 0 \rangle$ and $\langle 1, 1 \rangle$, the next eight members are the sequences with exactly three members, and so on. Let $f : \omega \longrightarrow {}^{<\omega}2$ be such a bijection.

Now consider any two distinct infinite branches $b$, $b'$ through ${}^{<\omega}2$ and note that $b \cap b'$ is finite; in fact, if $n$ is the first position where they disagree, then they have all nodes $b \restriction k$, for $k \le n$, in common, but have no other nodes in common. Also, there are as many infinite branches through ${}^{<\omega}2$ as there are subsets of $\omega$. In fact, there is obviously a bijection $g$ from $\mathcal{P}(\omega)$ onto the set of such branches sending $X \subseteq \omega$ to the characteristic function $\chi_X$ (i.e., the function sending $n$ to $1$ if $n \in X$ and to $0$ if $n \notin X$).

It follows that $A = \{f^{-1}[b] : b \text{ an infinite branch through } {}^{<\omega}2\}$ is a subset of $\mathcal{P}(\omega)$ consisting of $2^{\aleph_0}$–many pairwise almost disjoint sets. $\square$

Definition 5.27. A collection $A \subseteq \mathcal{P}(\omega)$ is a maximal almost disjoint family (a mad family) iff $A$ is an almost disjoint family and there is no almost disjoint family $B \subseteq \mathcal{P}(\omega)$ such that $A \subseteq B$ but $A \neq B$.

It is not difficult to see that the almost disjoint family $A$ constructed in the proof of Theorem 5.26 is not mad.
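The construction in the proof of Theorem 5.26 can be checked on finite truncations of branches. In this sketch a subset of $\omega$ is given by its characteristic function, nodes are numbered level by level as in the proof, and `node_index` and `branch_indices` are names chosen here for illustration. Only the first `depth` nodes of each branch are computed, so this illustrates rather than proves almost disjointness.

```python
def node_index(s):
    """Position of the finite 0/1 sequence s in the level-by-level enumeration of the tree."""
    value = 0
    for bit in s:
        value = 2 * value + bit
    return 2 ** len(s) - 1 + value

def branch_indices(chi, depth):
    """Indices of the nodes chi|k for k < depth, where chi is a characteristic function."""
    return {node_index([chi(i) for i in range(k)]) for k in range(depth)}

evens = lambda i: 1 if i % 2 == 0 else 0
multiples_of_3 = lambda i: 1 if i % 3 == 0 else 0

# The two branches first disagree at position 2, so they share exactly 3 nodes.
common = branch_indices(evens, 40) & branch_indices(multiples_of_3, 40)
print(sorted(common))
```

Increasing `depth` makes both sets larger but leaves their intersection unchanged, which is the finite shadow of the fact that the two infinite sets are almost disjoint.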

6. Foundation, recursion and induction. The cumulative hierarchy

We have seen recursive definitions, for example when we proved in Proposition 5.20 that $|{}^{n}\omega| = |\omega|$ for all $n \in \omega$, $n \ge 1$ (this was a recursion on $\omega$). Also, many familiar definitions in mathematics are by recursion: For example $n!$ is defined by

• $0! = 1$
• $(n + 1)! = n!\,(n + 1)$

Another example: For a given $n \in \omega$, we can define the function $f : \omega \longrightarrow \omega$ given by $f(x) = n + x$ (in other words, we can define $n + m$) as follows:

• $n + 0 = n$
• $n + (m + 1) = (n + m) + 1$

In fact, for a given ordinal $\alpha$, we define $\alpha + \beta$ by recursion on the ordinals by:

(1) $\alpha + 0 = \alpha$
(2) $\alpha + S(\beta) = S(\alpha + \beta) = (\alpha + \beta) + 1$ (recall: $\beta + 1 = S(\beta) = \beta \cup \{\beta\}$ by definition).
(3) $\alpha + \lambda = \bigcup\{\alpha + \beta : \beta < \lambda\}$ if $\lambda$ is a nonzero limit ordinal.

Note: $+$ is not commutative: $\omega + 1 \neq \omega = 1 + \omega$!

[We will see this operation $\alpha + \beta$, and other arithmetical operations, with more detail later on.]
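The recursions above are easy to run. The first function below is the recursion for $n + m$ on natural numbers, verbatim. The second is a sketch for ordinals below $\omega \cdot \omega$ under an assumption not made in the notes: the pair `(a, b)` stands for the ordinal $\omega \cdot a + b$, and the closed form used for addition is what the recursive definition works out to in this range.

```python
def add_nat(n, m):
    """n + 0 = n and n + (m + 1) = (n + m) + 1."""
    if m == 0:
        return n
    return add_nat(n, m - 1) + 1

def add_small_ordinals(x, y):
    """Addition of ordinals below omega * omega, with (a, b) standing for omega*a + b."""
    a, b = x
    c, d = y
    if c > 0:
        # A finite tail b is absorbed by the following copy of omega.
        return (a + c, d)
    return (a, b + d)

ONE = (0, 1)
OMEGA = (1, 0)
print(add_small_ordinals(ONE, OMEGA))   # 1 + omega
print(add_small_ordinals(OMEGA, ONE))   # omega + 1
```

The two printed pairs differ, reflecting that $1 + \omega = \omega$ while $\omega + 1$ is strictly bigger.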

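Two more examples: $\aleph_\alpha$ can be defined, by recursion on the ordinals, by

(1) $\aleph_0 = \omega$
(2) $\aleph_{S(\alpha)} (= \aleph_{\alpha+1}) = (\aleph_\alpha)^+$
(3) $\aleph_\lambda = \bigcup\{\aleph_\beta : \beta < \lambda\}$ if $\lambda$ is a nonzero limit ordinal.

Also:

Definition 6.1. In ZFC, for every cardinal $\kappa$, $\beth_\alpha(\kappa)$ is defined, by recursion on the ordinals, by

(1) $\beth_0(\kappa) = \kappa$
(2) $\beth_{\alpha+1}(\kappa)$ is the unique cardinal $\mu$ such that $|\mathcal{P}(\beth_\alpha(\kappa))| = \mu$.$^{37}$ This cardinal is also denoted by $2^{\beth_\alpha(\kappa)}$.
(3) $\beth_\lambda(\kappa) = \bigcup\{\beth_\beta(\kappa) : \beta < \lambda\}$ for every nonzero limit ordinal $\lambda$.

Notation 6.2. If $\kappa = \omega$, we write $\beth_\alpha$ for $\beth_\alpha(\kappa)$.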

One last example:

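Definition 6.3. We define $(V_\alpha : \alpha \in \mathrm{Ord})$ as follows:

(1) $V_0 = \emptyset$
(2) $V_{\alpha+1} = \mathcal{P}(V_\alpha)$
(3) $V_\lambda = \bigcup\{V_\beta : \beta < \lambda\}$ if $\lambda$ is a nonzero limit ordinal.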

(V↵ : ↵ 2 Ord) is called the cumulative hierarchy.

Let us have a look at the first levels of this hierarchy.

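• $V_0 = \emptyset$
• $V_1 = \{\emptyset\} = 1$
• $V_2 = \{\emptyset, \{\emptyset\}\} = 2$
• $V_3 = \{\emptyset, \{\emptyset\}, \{\{\emptyset\}\}, \{\emptyset, \{\emptyset\}\}\}$
• $|V_4| = 2^4 = 16$
• $|V_5| = 2^{16} = 65536$
• $|V_6| = 2^{65536}$ (which, according to Wikipedia, is much bigger than the number of atoms of the observable universe!)
• $|V_7| = 2^{(2^{65536})}$
• ...
• $|V_\omega| = \aleph_0$
• $|V_{\omega+1}| = 2^{\aleph_0} = \beth_1$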

$^{37}$ When we talk about the Axiom of Choice, we will see that this cardinal indeed exists.


• $|V_{\omega+2}| = 2^{\beth_1} = \beth_2$
• For every ordinal $\alpha$, $|V_{\omega+\alpha}| = \beth_\alpha$.
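The first finite levels of the cumulative hierarchy are small enough to build explicitly. The following sketch models sets as nested `frozenset`s and iterates the power set operation; `powerset` and `V` are illustrative helper names, and going beyond $V_5$ is not feasible this way since $|V_6| = 2^{65536}$.

```python
from itertools import combinations

def powerset(x):
    """The set of all subsets of the finite set x."""
    items = list(x)
    return frozenset(
        frozenset(c)
        for r in range(len(items) + 1)
        for c in combinations(items, r)
    )

def V(n):
    """The level V_n for a natural number n: V_0 is empty and V_(k+1) = P(V_k)."""
    level = frozenset()
    for _ in range(n):
        level = powerset(level)
    return level

print([len(V(n)) for n in range(6)])
```

One can also confirm on these levels that each $V_n$ is transitive and that $V_n \subseteq V_{n+1}$, for example with `all(x <= V(4) for x in V(4))` and `V(3) <= V(4)`.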

Also: We prove things by induction on the ordinals: Let $P(x)$ be a first–order property. Suppose the following.

(1) $P(0)$ holds.
(2) For every ordinal $\alpha > 0$, if $P(\beta)$ holds for every ordinal $\beta < \alpha$, then $P(\alpha)$ holds.

Then $P(\alpha)$ holds for every ordinal $\alpha$.

Example: $\sum_{k \le n} k = \frac{n(n+1)}{2}$ for every $n < \omega$.

Another example:

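Proposition 6.4. (ZF) For every ordinal $\alpha$, $V_\alpha$ is transitive.

Proof. $V_0 = \emptyset$ is transitive.

Let $\alpha > 0$ be an ordinal and suppose $V_\beta$ is transitive for every $\beta < \alpha$.

Suppose $\alpha$ is a successor ordinal, $\alpha = \beta + 1$. Then $V_\alpha = V_{\beta+1} = \mathcal{P}(V_\beta)$. Let $x \in V_\alpha$ and let $y \in x$. Then $x \subseteq V_\beta$. It follows that $y \in x \subseteq V_\beta$ and therefore $y \in V_\beta$. Since $V_\beta$ is transitive by induction hypothesis, $y \subseteq V_\beta$. But then $y \in \mathcal{P}(V_\beta) = V_\alpha$.

Finally suppose $\alpha > 0$ is a limit ordinal. Then $V_\alpha = \bigcup_{\beta < \alpha} V_\beta$. Let $y \in x \in V_\alpha$. Then there is some $\beta < \alpha$ such that $x \in V_\beta$. Since $V_\beta$ is transitive by induction hypothesis, $y \in V_\beta$. But then $y \in \bigcup_{\beta < \alpha} V_\beta = V_\alpha$. $\square$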

Why is it ok to make definitions by recursion on the ordinals and to prove things by induction on the ordinals?

Induction is easy:

Theorem 6.5. (Induction Scheme) Let $\varphi(x, y_0, \ldots, y_m)$ be a formula, let $a_0, \ldots, a_m$ be sets, and let $P = \{\alpha \in \mathrm{Ord} : \varphi(\alpha, a_0, \ldots, a_m)\}$. Suppose

(1) $0 \in P$, and
(2) for every ordinal $\alpha$, if $\beta \in P$ for all $\beta < \alpha$, then $\alpha \in P$.

Then $P = \mathrm{Ord}$.

Proof. Suppose towards a contradiction that

$X = \mathrm{Ord} \setminus P \neq \emptyset$

Let $\alpha = \min(X)$. For every ordinal $\beta < \alpha$, $\beta \notin X$ by definition of $\min(X)$. But then $\beta \in P$. Hence we have that $\beta \in P$ for all $\beta \in \alpha$. Therefore $\alpha \in P$ by (2). So $\alpha \notin X$. This is a contradiction since $\alpha = \min(X) \in X$. $\square$


Theorem 6.5 is of course a theorem scheme; we have a theorem for each instance of $\varphi$.

It is sometimes useful to use Theorem 6.5 in the following form. Indeed, note that the usual proofs by induction on $\omega$ one sees in mathematics are of this form.

Corollary 6.6. Let $\varphi(x, y_0, \ldots, y_m)$ be a formula, let $a_0, \ldots, a_m$ be sets, and let $P = \{\alpha \in \mathrm{Ord} : \varphi(\alpha, a_0, \ldots, a_m)\}$. Suppose the following holds.

(1) $0 \in P$
(2) For every ordinal $\alpha$, if $\alpha \in P$, then $\alpha + 1 \in P$.
(3) For every nonzero limit ordinal $\alpha$, if $\beta \in P$ for all $\beta < \alpha$, then $\alpha \in P$.

Then $P = \mathrm{Ord}$.

Exercise 6.1. Prove Corollary 6.6 using Theorem 6.5 together with Fact 4.23.

What about definitions by recursion?

Theorem 6.7. (ZF) (Recursion Theorem Scheme)

Let $G(x, y)$ be a class–function. Then there is a unique class–function $F$ defined on $\mathrm{Ord}$ such that for every ordinal $\alpha$,

$F(\alpha) = G(\alpha, F \restriction \alpha)$

Proof. We prove, by induction on the ordinals, that for every ordinal $\alpha$ there is a unique function $f$ with domain $\alpha$ such that for every $\beta < \alpha$,

$f(\beta) = G(\beta, f \restriction \beta)$

We show uniqueness first and then existence.

Uniqueness: Suppose $f_0$ and $f_1$ are distinct functions with $\mathrm{dom}(f_0) = \mathrm{dom}(f_1) = \alpha$ such that $f_0(\beta) = G(\beta, f_0 \restriction \beta)$ and $f_1(\beta) = G(\beta, f_1 \restriction \beta)$ for every $\beta < \alpha$. Since $f_0 \neq f_1$, let

$\beta = \min\{\gamma \in \alpha : f_0(\gamma) \neq f_1(\gamma)\}$

Then $f_0 \restriction \beta = f_1 \restriction \beta$. But then

$f_0(\beta) = G(\beta, f_0 \restriction \beta) = G(\beta, f_1 \restriction \beta) = f_1(\beta)$

Contradiction.

Existence: Let $\alpha$ be an ordinal. For every $\beta < \alpha$ let $f_\beta$ be the unique function $h$ with domain $\beta$ such that $h(\gamma) = G(\gamma, h \restriction \gamma)$ for all $\gamma < \beta$ (which exists by induction hypothesis).


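One can easily prove by induction on $\beta < \alpha$ that if $\beta' < \beta$, then $f_{\beta'} = f_\beta \restriction \beta'$ ([Exercise]).

If $\alpha$ is a limit ordinal, then $f_\alpha = \bigcup\{f_\beta : \beta < \alpha\}$ is as desired by the previous line.

If $\alpha = \bar{\alpha} + 1$, let

$f = f_{\bar{\alpha}} \cup \{(\bar{\alpha}, G(\bar{\alpha}, f_{\bar{\alpha}}))\}$

Let now $\beta < \alpha$. If $\beta < \bar{\alpha}$, then

$f(\beta) = f_{\bar{\alpha}}(\beta) = G(\beta, f_{\bar{\alpha}} \restriction \beta) = G(\beta, f \restriction \beta)$

If $\beta = \bar{\alpha}$, then

$f(\bar{\alpha}) = G(\bar{\alpha}, f_{\bar{\alpha}}) = G(\bar{\alpha}, f \restriction \bar{\alpha})$

since $f_{\bar{\alpha}} = f \restriction \bar{\alpha}$ by definition of $f$. $\square$

Exercise 6.2. Complete the proof of Theorem 6.7.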

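Example: The class–function $F$ sending $\alpha \in \mathrm{Ord}$ to $V_\alpha$ is such that $F(\alpha) = G(\alpha, F \restriction \alpha)$, where $G(x, y)$ is:

• $\emptyset$ if $x = 0$.
• $\mathcal{P}(y(\bar{x}))$ if $x$ is the successor ordinal $\bar{x} + 1$ and $y$ is a function such that $\bar{x} \in \mathrm{dom}(y)$.
• $\bigcup \mathrm{range}(y)$ if $x$ is a nonzero limit ordinal and $y$ is a function.
• $\emptyset$ in all other cases.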
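On the natural numbers, the scheme $F(\alpha) = G(\alpha, F \restriction \alpha)$ can be run directly. The sketch below represents a function with domain $k = \{0, \ldots, k-1\}$ as a Python dictionary; `recurse`, `G_factorial` and `G_sum` are hypothetical names for this illustration, and only finitely many values are computed.

```python
def recurse(G, n):
    """Return the function f with domain {0, ..., n-1} such that f(k) = G(k, f|k)."""
    f = {}
    for k in range(n):
        # At this point f has domain {0, ..., k-1}, so dict(f) is a copy of f restricted to k.
        f[k] = G(k, dict(f))
    return f

def G_factorial(k, restriction):
    """0! = 1 and (m + 1)! = m! * (m + 1): only the last value of the restriction is used."""
    return 1 if k == 0 else restriction[k - 1] * k

def G_sum(k, restriction):
    """A recursion that uses all earlier values: f(k) = 1 + f(0) + ... + f(k-1)."""
    return 1 + sum(restriction.values())

print(recurse(G_factorial, 6))
print(recurse(G_sum, 5))
```

The second example shows why $G$ is given the whole restriction $F \restriction \alpha$ rather than just the previous value: at limit stages, and in course–of–values recursions like `G_sum`, there is no single previous value to use.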

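We have seen, by induction on the ordinals, that $V_\alpha$ is transitive for every ordinal $\alpha$.

Proposition 6.8. For all $\alpha < \beta$, $V_\alpha \subseteq V_\beta$.

Proof. Again by induction on $\beta$. This is vacuously true for $\beta = 0$. For $\beta$ a nonzero limit ordinal, $V_\beta \supseteq V_\alpha$ by definition of $V_\beta$. For $\beta = \gamma + 1$, $V_\beta = \mathcal{P}(V_\gamma)$. If $\alpha = \gamma$, then we are done since every member of $V_\gamma$ is a subset of $V_\gamma$ (as $V_\gamma$ is transitive) and therefore $V_\gamma \subseteq \mathcal{P}(V_\gamma)$. If $\alpha < \gamma$, then $V_\alpha \subseteq V_\gamma$ by induction hypothesis. But $V_\gamma \subseteq \mathcal{P}(V_\gamma)$ by the previous case, and hence $V_\alpha \subseteq V_\gamma \subseteq \mathcal{P}(V_\gamma) = V_\beta$. $\square$

Definition 6.9. For every $x \in \bigcup_{\alpha \in \mathrm{Ord}} V_\alpha$,

$\mathrm{rank}(x) = \min\{\alpha \in \mathrm{Ord} : x \in V_{\alpha+1}\}$

Definition 6.10. For every set $x$, the transitive closure of $x$, denoted by $\mathrm{TC}(x)$, is $\bigcup\{X_n : n < \omega\}$ where

• $X_0 = x$
• $X_{n+1} = \bigcup X_n$

So $\mathrm{TC}(x) = x \cup \bigcup x \cup \bigcup\bigcup x \cup \bigcup\bigcup\bigcup x \cup \ldots$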
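For hereditarily finite sets the sequence $X_0, X_1, \ldots$ of Definition 6.10 eventually becomes empty, so the transitive closure can be computed by a terminating loop. This is a sketch with sets modelled as nested `frozenset`s; `big_union` and `transitive_closure` are names chosen here.

```python
def big_union(x):
    """The union of all members of x."""
    out = frozenset()
    for y in x:
        out = out | y
    return out

def transitive_closure(x):
    """TC(x) as the union of X_0 = x, X_1 = U X_0, X_2 = U X_1, ..."""
    closure, level = frozenset(), x
    while level:
        closure = closure | level
        level = big_union(level)
    return closure

e = frozenset()
a = frozenset({e})       # {0}
b = frozenset({a})       # {{0}}
x = frozenset({b})       # {{{0}}}
tc = transitive_closure(x)
print(len(tc), all(y <= tc for y in tc))
```

Here $X_0 = \{b\}$, $X_1 = \{a\}$, $X_2 = \{\emptyset\}$ and $X_3 = \emptyset$, so $\mathrm{TC}(x) = \{b, a, \emptyset\}$, which is transitive and contains $x$ as a subset.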


Exercise 6.3. Prove that $\mathrm{TC}(x)$ is the $\subseteq$–least transitive set $y$ such that $x \subseteq y$. In other words, $\mathrm{TC}(x) = \bigcap\{y : y \text{ transitive}, x \subseteq y\}$.

Let us fix some notation now for the set–theoretic universe and for $\bigcup_{\alpha \in \mathrm{Ord}} V_\alpha$.

Definition 6.11. $\mathbf{V}$ denotes the class of all sets; that is,

$\mathbf{V} = \{x : x = x\}$

[Remember that we have already seen this notation.]

Definition 6.12. $\mathrm{WF} = \bigcup\{V_\alpha : \alpha \in \mathrm{Ord}\}$: The class of all $x$ such that $x \in V_\alpha$ for some ordinal $\alpha$.

In the above definition, WF stands for "well–founded".

Note: WF is a transitive class: $y \in x \in V_\alpha$ implies $y \in V_\alpha$ since $V_\alpha$ is transitive.

From now on, and until further notice, we will add Foundation tothe axioms we are assuming; in other words, we will be working in fullZF.

Theorem 6.13. (ZF) V = WF

Proof. Suppose, towards a contradiction, that there is some set x suchthat x /2 WF. Let y = TC(x).

y /2 WF: Suppose y 2 V↵. Since x ✓ y ✓ V↵ (where y ✓ V↵ is trueby transitivity of V↵), x 2 P(V↵) = V↵+1. Contradiction.

By Foundation we may find a 2 y [ {y}, a 2–minimal in y [ {y},such that a /2 WF. For every z 2 a, it follows that z 2 y [ {y} (bytransitivity of y [ {y}) and therefore z 2 V↵ for some ↵ 2 Ord by 2–minimality of a among {w 2 y [ {y} : w /2 WF}. Hence, the functionrank � a sending z 2 a to rank(z) is defined for all z 2 a. But thenrange(rank � a) has to be a set by Replacement and therefore there issome ordinal ↵ such that ↵ > rank(z) for every z 2 a [No set X ofordinals can be cofinal in Ord (i.e., such that for every ↵ 2 Ord thereis some � 2 X with ↵ < �). Why? Otherwise

SX = Ord, which is

not a set (Burali–Forti), butSX is a set if X is a set by Union Axiom.

Contradiction.]

It follows that for every z 2 a there is some � < ↵ such that z 2V� ✓ V↵. Hence a ✓ V↵ and therefore a 2 P(V↵) = V↵+1. Contradic-tion with a /2 WF. ⇤


The fact that V = WF realises the idea that a set is any collection built out of sets already built. This is known as the iterative conception of sets. Note that this conception of sets rules out such “sets” as V or the Russell class $\{x : x \notin x\}$: They couldn’t possibly be sets since one needs to refer to the totality of sets for their definition, a totality to which they would belong if they were sets. Take for example V. Certainly, if V is a set, then $\mathbf{V} \in \mathbf{V}$. But this goes against the iterative conception of set, whereby a set is built up out of previously built sets.

The picture of the universe provided by V = WF is a very appealing and very natural one (once one has come across it, at least). This picture of the universe of all sets, and the fact that ZF implies V = WF, is the main source of intrinsic justifications of the ZF axioms.

7. Inner models and relativization

Let $(M, \in_M)$ be a submodel, or inner model, defined by a formula, possibly with parameters. By this we mean that there is a formula $\Theta(x, x_0, \ldots, x_n)$ and sets $p_0, \ldots, p_n$ such that
$$M = \{a : \Theta(a, p_0, \ldots, p_n)\}$$
and, for all $a, b \in M$, $a \in_M b$ if and only if $a \in b$ (we usually leave out $\in_M$ and write $M$ instead of $(M, \in_M)$). (Examples: V, WF, L, HOD, ...).

We define the relativization to $M$ of a formula $\varphi(\vec{x})$, to be denoted $\varphi^M(\vec{x})$, in the following manner.

• $(x \in y)^M$ is $x \in y$.
• $(x = y)^M$ is $x = y$.
• $(\varphi_0 \vee \varphi_1)^M$ is $\varphi_0^M \vee \varphi_1^M$.
• $(\neg\varphi)^M$ is $\neg\varphi^M$.
• $((\forall x)\varphi(x))^M$ is $\forall x(\Theta(x, p_0, \ldots, p_n) \to \varphi^M(x))$. We may also write something like $(\forall x \in M)\varphi^M(x)$.

Note: Suppose $T$ is a theory in the language of set theory. Suppose $(N, E)$ is a structure in the language of set theory, and suppose $M$ is an inner model in $N$. Then $(N, E) \models \sigma^M$ for every $\sigma \in T$ if and only if $(M, E) \models T$.

Notation 7.1. If $(N, E)$ is a structure in the language of set theory, $M$ is an inner model defined by a formula $\Theta(x)$ possibly with parameters (i.e., $M = (M, E \restriction (M \times M))$, where $M = \{a \in N : (N, E) \models \Theta(a)\}$), and we want / need to emphasise that $M$ is the inner model defined by $\Theta(x)$ as defined within $(N, E)$, then we often write $M^N$ instead of $M$.

Example: $\mathrm{WF}^M$

Note: For every ordinal $\alpha$, $V_\alpha^{\mathrm{WF}} = V_\alpha$ (here $V_\alpha$ refers to the set, definable from the parameter $\alpha$, with the definition that we have seen).

Many facts about the universe V are inherited by reasonable sub-models, in particular by transitive ones. For example:

Lemma 7.2. Suppose $M$ is a transitive set or a transitive proper class. Then $M \models$ Axiom of Extensionality.

Proof. Let $a, b \in M$ and suppose $M \models (\forall x)(x \in a \leftrightarrow x \in b)$ (this of course is shorthand for
$$M \models (\forall x)(x \in y \leftrightarrow x \in z)[\vec{a}]$$
where $\vec{a}$ is any assignment sending the variable $y$ to $a$ and the variable $z$ to $b$).

This means that $a \cap M = b \cap M$. Since $M$ is transitive (in V), every member of $a$ or of $b$ is a member of $M$. It follows that $a \cap M = a$ and $b \cap M = b$ and therefore $a = b$. Hence $M \models a = b$. In sum, $M$ thinks that for all $y$, $z$, if $y$ and $z$ have the same elements, then they are equal. In other words, $M \models$ Axiom of Extensionality. □

Also:

Lemma 7.3. Suppose $M$ is a transitive set or a transitive proper class which is closed under unordered pairs (meaning that for all $a, b \in M$, $\{a, b\} \in M$). Then $M \models$ Axiom of Unordered pairs.

Proof. Let $c = \{a, b\} \in M$. Check, as in the previous proof, that $M \models (\forall x)(x \in c \leftrightarrow x = a \vee x = b)$. □

Similarly:

Lemma 7.4. Suppose $M$ is a transitive set or a transitive proper class. Suppose $\bigcup a \in M$ for every $a \in M$. Then $M \models$ Union set Axiom.

Lemma 7.5. Suppose $M$ is a transitive set or a transitive proper class. Suppose for every $a \in M$ there is some $b \in M$ such that $b = \mathcal{P}(a) \cap M$. Then $M \models$ Power set axiom.

Exercise 7.1. Prove Lemmas 7.4 and 7.5.

Note: There are situations in which there are transitive models $M$ of fragments of ZFC, or even of all of ZFC, and some $a \in M$ such that $\mathcal{P}(a)^M$ is strictly included in $\mathcal{P}(a)$ (i.e., there are subsets $b$ of $a$ such that $b \notin M$).


Lemma 7.6. Suppose $M$ is a transitive set or a transitive proper class. If $\omega \in M$, then $M \models$ Infinity.

Proof idea: As in the previous proofs. The point is that $M$ recognises $\emptyset$ correctly, recognises correctly that something is an ordinal, and recognises correctly that something is the successor of an ordinal.

We say that the notion of ordinal is absolute with respect to transitive models. It is possible to identify large families of properties that are absolute with respect to transitive models by virtue of their being definable by syntactically ‘simple’ formulas (from the point of view of their quantifiers). We don’t need this kind of general analysis at the moment so we won’t go into that now.

Note: The notion of finiteness is also absolute with respect to transitive models but, on the other hand, the notion of countability is highly non–absolute with respect to transitive models: There are transitive models $M$ and $a \in M$ such that
$$M \models a \text{ is uncountable}$$
but there is a bijection $f : \omega \to a$, so $a$ is countable in V. The problem of course is that $f$ is not in $M$. There are even transitive models of (fragments of) ZFC such that all their sets are countable in V. And even the whole model can be countable in V.

The notion of choice function is also absolute with respect to transitive models: If $M$ is transitive, $a \in M$ consists of nonempty sets, $f \in M$, and $f$ is a choice function for $a$, then we have that $M \models$ “$f$ is a choice function for $a$”. Hence:

Lemma 7.7. Let $M$ be a transitive set or a transitive proper class. Suppose for every $a \in M$ consisting of nonempty sets there is a choice function $f$ for $a$, $f \in M$. Then $M \models$ AC.

Lemma 7.8. Let $M$ be a transitive set or a transitive proper class. Suppose $b \in M$ whenever $a \in M$ and $b \subseteq a$ is definable over $M$, possibly from parameters (in other words, $b = \{c : c \in a, M \models \varphi(c, \vec{p})\}$ for some parameters $\vec{p} \in M$). Then $M \models$ Separation.

Lemma 7.9. Let $M$ be a transitive set or a transitive proper class. Suppose $F[a] \in M$ whenever $a \in M$, and $F$ is a class–function over $M$ (in other words, if $F$ is definable by a formula $\varphi(x, y, \vec{z})$ which, over $M$, is functional, $\vec{p} \in M$, and $a \in M$, then $\{c : (\exists b \in a)\, M \models \varphi(b, c, \vec{p})\} \in M$). Then $M \models$ Replacement.

7.1. A relative consistency proof: Con(ZF \ {Foundation}) implies Con(ZF). We will need the following lemma.

Lemma 7.10. For all $a$, $b$ in WF, if $a \in b$, then $\mathrm{rank}(a) < \mathrm{rank}(b)$.

Proof. Let $a, b \in \mathrm{WF}$ be such that $a \in b$, and let $\alpha$ and $\beta$ be minimal such that $a \in V_{\alpha+1}$ and $b \in V_{\beta+1}$. We want to see that $\alpha < \beta$. Now, $a \in b \subseteq V_\beta$ by definition of $V_{\beta+1}$, and so $a \in V_\beta$; in particular $\beta \neq 0$. If $\beta$ is a limit ordinal, then there is some $\gamma < \beta$ such that $a \in V_\gamma \subseteq V_{\gamma+1} \subseteq V_\beta$ by definition of $V_\beta$, and so $\alpha \leq \gamma < \beta$. If $\beta$ is a successor ordinal, $\beta = \beta' + 1$, then $\alpha \leq \beta'$ since $a \in V_{\beta'+1}$, and so $\alpha \leq \beta' < \beta' + 1 = \beta$. So in all cases we obtain $\alpha < \beta$ as desired. □

Theorem 7.11. Let $M \models$ ZF \ {Foundation}. Then $M \models \sigma^{\mathrm{WF}^M}$ for every $\sigma \in$ ZF. Hence, $\mathrm{WF}^M \models$ ZF.

Proof. By the previous lemmas and the construction of $(V_\alpha : \alpha \in \mathrm{Ord})$, $\mathrm{WF}^M \models \sigma$ for every axiom $\sigma$ of ZF \ {Foundation} [go through these axioms one by one and check that WF is closed under the relevant operation, then apply the relevant lemma].

To see that $M \models \mathrm{Foundation}^{\mathrm{WF}}$ holds, let us work in $M$: Let $a \in \mathrm{WF}$ be nonempty and let $b \in a$ be such that $\mathrm{rank}(b) = \min\{\mathrm{rank}(y) : y \in a\}$. Then, for every $y \in a$, $\mathrm{rank}(y) \geq \mathrm{rank}(b)$, and therefore $y \notin b$ by Lemma 7.10. Hence $a \cap b = \emptyset$, and so WF thinks that the Axiom of Foundation is true. □

Corollary 7.12. If ZF \ {Foundation} is consistent, then ZF is consistent.

Proof. Suppose ZF \ {Foundation} is consistent. By the completeness theorem we may find a model $M \models$ ZF \ {Foundation}. Let $M' = \mathrm{WF}^M$. By the theorem, $M' \models$ ZF. Hence, ZF has a model and therefore it is consistent. □

Remark 7.13. By exactly the same argument, if $M$ is a model of ZFC \ {Foundation}, then $M \models \sigma^{\mathrm{WF}^M}$ for every $\sigma \in$ ZFC. Hence, Con(ZFC \ {Foundation}) implies Con(ZFC).

Similar relative consistency results: One can define “the constructible universe” L:

• $L_0 = \emptyset$
• $L_{\alpha+1} = \mathrm{Def}(L_\alpha)$, where $\mathrm{Def}(L_\alpha)$ is the set of all subsets of $L_\alpha$ definable over $L_\alpha$ possibly with parameters, i.e., the collection of all sets of the form
$$\{b \in L_\alpha : L_\alpha \models \varphi(b, a_0, \ldots, a_{n-1})\}$$
for some formula $\varphi(x, \vec{x})$ and $a_0, \ldots, a_{n-1} \in L_\alpha$.
• $L_\delta = \bigcup_{\alpha < \delta} L_\alpha$ if $\delta > 0$ is a limit ordinal.

$L = \bigcup_{\alpha \in \mathrm{Ord}} L_\alpha$.

This construction is due to Gödel. He proved that if we do this construction in ZF, then $L \models$ ZF but also $L \models$ AC.38

The above results imply that if ZF is consistent, then ZFC is alsoconsistent.39 Linking this to the implication we have seen we thus havethat if ZF \{Foundation} is consistent, then so is ZFC.

These relative consistency proofs proceed by building suitable innermodels.40

Most relative consistency proofs proceed, on the other hand, bybuilding suitable outer models of some given ground model. The con-struction of these outer models is done with the forcing method. Thisis an extremely powerful method in set theory.

7.2. Basics of ordinal arithmetic. We are going to take a brief lookat ordinal arithmetic. The arithmetical operations on the ordinals arenaturally defined by recursion and, as will be obvious from the defini-tion, they generalise (and extend) the usual arithmetical operations onthe natural numbers.

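Definition 7.14. Let $\alpha$ be an ordinal. We define $\alpha + \beta$, $\alpha \cdot \beta$ and $\alpha^\beta$, by recursion on $\beta$, as follows.

(1) (a) $\alpha + 0 = \alpha$
    (b) $\alpha + S(\beta) = S(\alpha + \beta)$
    (c) $\alpha + \beta = \sup\{\alpha + \gamma : \gamma < \beta\}$ if $\beta$ is a nonzero limit ordinal.

(2) (a) $\alpha \cdot 0 = 0$
    (b) $\alpha \cdot S(\beta) = (\alpha \cdot \beta) + \alpha$
    (c) $\alpha \cdot \beta = \sup\{\alpha \cdot \gamma : \gamma < \beta\}$ if $\beta$ is a nonzero limit ordinal.

(3) (a) $\alpha^0 = 1$
    (b) $\alpha^{S(\beta)} = (\alpha^\beta) \cdot \alpha$
    (c) $\alpha^\beta = \sup\{\alpha^\gamma : \gamma < \beta\}$ if $\beta$ is a nonzero limit ordinal.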

Let us make a couple of computations:

38 He proved also that $L \models$ GCH and $L \models$ V = L. We will see the definition of GCH later, when we see some cardinal arithmetic.

39 And in fact also ZFC + GCH + V = L, etc.

40 What set–theorists understand by “inner models” are usually much more complicated than WF or L. However, the construction of L is in fact the paradigm for most of these more complicated constructions.


$\omega + 1 = S(\omega) = \omega \cup \{\omega\}$. On the other hand, we have that $1 + \omega = \sup\{1 + n : n < \omega\}$ since $\omega$ is a limit ordinal. But note that $\sup\{1 + n : n < \omega\} = \omega$ and $\omega \in S(\omega) = \omega \cup \{\omega\}$. In other words, we have proved that $1 + \omega = \omega < \omega + 1$. In particular, + is not commutative.41

Also, $\omega \cdot 1 = \omega$ and $\omega \cdot 2 = \omega \cdot S(1) = \omega + \omega > \omega$, whereas $2 \cdot \omega = \sup\{2 \cdot n : n < \omega\} = \omega$. So, similarly to the previous example, $2 \cdot \omega = \omega < \omega + \omega = \omega \cdot 2$. In particular, ordinal multiplication is not commutative.42
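These computations can be replayed mechanically for small ordinals. The sketch below is only an illustration under a simplifying assumption: it handles ordinals below $\omega^2$, each written as $\omega \cdot a + b$ with $a$, $b$ natural numbers, and multiplication only by a natural number on either side. The names `Ord`, `add`, `times_nat` and `nat_times` are hypothetical helpers, not notation from these notes.

```python
from typing import NamedTuple

class Ord(NamedTuple):
    """The ordinal omega*a + b for natural numbers a, b (so an ordinal below omega^2).

    Tuple comparison is lexicographic, which agrees with the ordinal order here.
    """
    a: int
    b: int

def add(x, y):
    """Ordinal sum x + y: a nonzero omega-part of y absorbs the finite tail of x."""
    if y.a > 0:
        return Ord(x.a + y.a, y.b)
    return Ord(x.a, x.b + y.b)

def times_nat(x, n):
    """Ordinal product x * n (n a natural number on the right): x added to itself n times."""
    result = Ord(0, 0)
    for _ in range(n):
        result = add(result, x)
    return result

def nat_times(n, y):
    """Ordinal product n * y (n a natural number on the left), using n * omega = omega for n > 0."""
    if n == 0:
        return Ord(0, 0)
    return Ord(y.a, n * y.b)

omega = Ord(1, 0)
one = Ord(0, 1)
```

With these definitions, `add(one, omega)` equals `omega` while `add(omega, one)` is strictly larger, and `nat_times(2, omega)` equals `omega` while `times_nat(omega, 2)` equals `add(omega, omega)`, matching the two failures of commutativity computed above.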

On the other hand, general ordinal arithmetic does share some properties with its restriction to $\omega$. For example, the following facts are not too hard to prove:

• Both addition and multiplication are associative.
• For all ordinals $\alpha \leq \beta$ there is a unique ordinal $\gamma$ such that $\alpha + \gamma = \beta$.
• $\alpha \cdot (\beta + \gamma) = (\alpha \cdot \beta) + (\alpha \cdot \gamma)$ for all ordinals $\alpha$, $\beta$, $\gamma$.

These facts are naturally proved by induction.

8. The Axiom of Choice

Theorem 8.1. (ZF) The following are equivalent.

(1) AC
(2) The Well–ordering Principle: Every set can be well–ordered.

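Proof. Suppose AC holds. Let $X$ be a set. Let $f$ be a choice function for $\mathcal{P}(X) \setminus \{\emptyset\}$. We define one–to–one enumerations $\vec{x}_\beta = (x_\alpha : \alpha < \beta)$ of subsets of $X$ by recursion on the ordinals in such a way that $\vec{x}_\gamma = \vec{x}_\beta \restriction \gamma$ for all $\gamma < \beta$, as follows: Let $\beta$ be an ordinal and suppose $(x_\alpha : \alpha < \beta)$ has been defined. If $\{x_\alpha : \alpha < \beta\} = X$, then we are done. Otherwise $X \setminus \{x_\alpha : \alpha < \beta\} \neq \emptyset$. Set
$$x_\beta = f(X \setminus \{x_\alpha : \alpha < \beta\})$$
This defines $\vec{x}_{\beta+1}$. If $\beta$ is a limit ordinal we of course let $\vec{x}_\beta = \bigcup_{\gamma < \beta} \vec{x}_\gamma$.

This gives a class–function $F$ from $\mathcal{P}(X)$ to the ordinals, sending $Y \subseteq X$ to $\beta$ if $Y = X \setminus \{x_\alpha : \alpha < \beta\}$. Since $\mathcal{P}(X)$ is a set, by Replacement $\mathrm{range}(F)$ is a set of ordinals so it cannot be all of Ord. Hence this construction has to stop at some point (there must be $\beta$ such that $X \setminus \{x_\alpha : \alpha < \beta\} = \emptyset$). But then $\{(x_\alpha, x_{\alpha'}) : \alpha \in \alpha' \in \beta\}$ is a well–order of $X$.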

41 On the other hand, the restriction of + to the natural numbers is, as everybody knows, commutative.

42 Again, this is in contrast with the commutativity of the multiplication of natural numbers.


Now assume the Well–ordering Principle. Let $X$ be a set consisting of nonempty sets and let $\leq$ be a well–order of $\bigcup X$. Now, given $a \in X$ let $f(a)$ be the $\leq$–minimal element of $a$. Then $f$ is a choice function for $X$. □
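For a finite set the recursion in the first half of the proof is just a loop, and it can be instructive to run it. The following is a minimal sketch under that finiteness assumption; `well_order_from_choice` and its parameter `choose` (standing in for the choice function $f$ on nonempty subsets) are illustrative names, not part of these notes.

```python
def well_order_from_choice(X, choose):
    """Enumerate the finite set X as x_0, x_1, ... where each x_beta is
    choose(X minus the elements already enumerated), as in the proof above."""
    remaining = set(X)
    enumeration = []
    while remaining:  # stop once the enumerated elements exhaust X
        x = choose(frozenset(remaining))
        enumeration.append(x)
        remaining.remove(x)
    return enumeration

# Example: on finite sets of integers, `max` is a choice function
# for the nonempty subsets.
order = well_order_from_choice({3, 1, 4}, max)
```

Here `order` lists the elements of the set in the order in which the choice function picks them; the position of an element in the list plays the role of its index $\alpha$ in the well–order.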

Note that we have used recursion on the ordinals to define ~x� in thefirst part of the above proof.

Corollary 8.2. (ZF) The following are equivalent:

(1) AC
(2) Every set is bijective with a unique cardinal.

The following is another consequence of Theorem 8.1 (the proof that (1) implies (2) is straightforward by Theorem 8.1, the proof that (2) implies (1) needs a bit of argument but not too much).

Proposition 8.3. (ZF) The following are equivalent:

(1) AC
(2) (Trichotomy for sets) Given any two sets $X$, $Y$, exactly one of the following holds.
    • $|X| = |Y|$
    • $|X| < |Y|$
    • $|Y| < |X|$

Given a partial order $\mathbb{P} = (P, \leq)$ and $C \subseteq P$, we say that $C$ is a chain of $\mathbb{P}$ iff $C$ is linearly ordered by $\leq$, i.e., if for all $x, y \in C$, $x \leq y$ or $y \leq x$.

The following is a characterisation of the Axiom of Choice which is often convenient to work with.

Definition 8.4. (Zorn’s Lemma) For every partial order $\mathbb{P} = (P, \leq)$, if every chain in $\mathbb{P}$ has an upper bound in $\mathbb{P}$, then there is a $\leq$–maximal $a \in P$, i.e., there is some $a \in P$ such that there is no $b \in P$ such that $a < b$.43

Let us see a standard application of Zorn’s Lemma.

Example 8.1. Let $V$ be a vector space, let
$$\mathcal{X} = \{X \subseteq V : X \text{ is linearly independent}\}$$
and let $\mathbb{P} = (\mathcal{X}, \subseteq)$. Note that every chain in $\mathbb{P}$ has an upper bound. Indeed, suppose $C$ is a chain in $\mathcal{X}$ (in other words, for all $X, X' \in C$, either $X \subseteq X'$ or $X' \subseteq X$). Then it is easily seen that $\bigcup C$ is also linearly independent, in other words, that $\bigcup C \in \mathcal{X}$. But $X \subseteq \bigcup C$ for every $X \in C$. By Zorn’s Lemma it follows that there is $B \in \mathcal{X}$ which is $\subseteq$–maximal in $\mathbb{P}$, in other words, such that there is no linearly independent set $X$ properly extending $B$. Then a standard argument from linear algebra shows that $B$ generates $V$ and so is a basis of $V$.

43 $\mathbb{P}$ could have more than one maximal element. Zorn’s Lemma says that $\mathbb{P}$ has at least one maximal element.

Theorem 8.5. (ZF) The following are equivalent.

(1) AC
(2) Zorn’s Lemma

Proof. Let us first prove that (1) implies (2). Let $\mathbb{P} = (P, \leq)$ be a partial order as in the hypothesis of Zorn’s Lemma. Let $(p_\xi)_{\xi < \kappa}$, for some cardinal $\kappa$, be an enumeration of $P$. This enumeration exists by the Well–ordering Principle, which we know is equivalent to AC. Now we build by recursion a certain sequence $\vec{C} = (a_i)_{i < \lambda + 1}$ of members of $P$, for a suitable ordinal $\lambda$, such that $a_i < a_{i'}$ for all $i < i'$. Let $a_0 \in P$ be arbitrary. Given an ordinal $\mu$, if $(a_i)_{i < \mu}$ has been defined, then:

(a) If there is some $\xi$ such that $a_i < p_\xi$ for all $i < \mu$, then let $a_\mu = p_{\xi_0}$, where $\xi_0 = \min\{\xi : a_i < p_\xi \text{ for all } i < \mu\}$.

(b) If there is no $\xi$ such that $a_i < p_\xi$ for all $i < \mu$, then $\mu = \mu_0 + 1$ for some ordinal $\mu_0$. (This is true since, by hypothesis, every chain in $\mathbb{P}$ has an upper bound in $\mathbb{P}$ and since $(a_i)_{i < \mu}$ is the enumeration of a strictly increasing chain.) In that case we let $\lambda = \mu_0$ and stop the construction.

During the construction we must eventually find ourselves in case (b), as otherwise we would have a one–to–one function of the ordinals into the set $P$, from which we would derive the usual contradiction that Ord is a set. But then $a_\lambda$ is a $\leq$–maximal element of $P$ since there is no $p \in P$ such that $a_\lambda < p$.

Let us now prove that (2) implies (1). Let $X$ be a set consisting of nonempty sets and let $\mathbb{P} = (P, \subseteq)$, where $P$ is the collection of all partial choice functions for $X$, in other words, all functions $p \subseteq X \times \bigcup X$ such that, for all $a \in \mathrm{dom}(p)$, $p(a) \in a$. Note that if $C$ is a chain in $\mathbb{P}$, then $\bigcup C$ is a partial choice function for $X$ and therefore an upper bound in $\mathbb{P}$ of $C$. Hence $\mathbb{P}$ satisfies the hypothesis of Zorn’s Lemma. But then, by Zorn’s Lemma there is a $\subseteq$–maximal member $p$ of $P$. Now it is enough to check that $\mathrm{dom}(p) = X$. But if $\mathrm{dom}(p) \neq X$, $a \in X \setminus \mathrm{dom}(p)$, and $x \in a$, then $p \cup \{(a, x)\}$ would be a proper extension of $p$ in $P$ and so $p$ would fail to be $\subseteq$–maximal. This finishes the proof. □

As we have seen, the Axiom of Choice has the following counterintuitive consequence: $\mathbb{R}$, and even $\mathbb{C}$, can be well–ordered (of course in length $2^{\aleph_0}$). This well–order has to be highly non–constructive. In fact there are models of ZF in which such a well–order does not exist. The fact that $\mathbb{R}$ can be well–ordered enables one to construct rather pathological objects (Banach–Tarski decompositions, etc.).

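Example: A non–Lebesgue measurable set: Let $\equiv$ be the equivalence relation on $(0, 1) \subseteq \mathbb{R}$ given by $x \equiv y$ if and only if $x - y \in \mathbb{Q}$. Let $f$ be a choice function for the quotient set $(0, 1)/\equiv$ (i.e., $f$ picks an element out of each equivalence class of $\equiv$). Then $\mathrm{range}(f)$ is not Lebesgue measurable. It is called a Vitali set. The reason why $X := \mathrm{range}(f)$ is not Lebesgue measurable is the following. Suppose $X$ were measurable and let $r = \mu(X)$. For every $q \in \mathbb{Q}$ let $q + X = \{q + x : x \in X\}$. Then $(q + X) \cap (q' + X) = \emptyset$ whenever $q \neq q'$ are in $\mathbb{Q}$. Hence
$$\mu\Big(\bigcup_{q \in \mathbb{Q} \cap (0,1)} (q + X)\Big) = \sum_{q \in \mathbb{Q} \cap (0,1)} \mu(X)$$
since

(1) $\mu$ is translation invariant (i.e., $\mu(x + Y) = \mu(Y)$ for every measurable $Y \subseteq \mathbb{R}$ and every $x \in \mathbb{R}$), and

(2) $\mu$ is $\sigma$–additive, meaning that $\mu(\bigcup_{n < \omega} X_n) = \sum_{n < \omega} \mu(X_n)$ whenever $(X_n)_{n < \omega}$ is a countable sequence of pairwise disjoint measurable sets.

Suppose first that $r > 0$. It follows that $\mu(\bigcup_{q \in \mathbb{Q} \cap (0,1)} (q + X)) = \sum_{q \in \mathbb{Q} \cap (0,1)} r = \infty$. On the other hand, $\mu(\bigcup_{q \in \mathbb{Q} \cap (0,1)} (q + X)) < \infty$ since $\bigcup_{q \in \mathbb{Q} \cap (0,1)} (q + X) \subseteq (0, 2)$ and $\mu((0, 2)) = 2$. That is a contradiction, so $\mu(X) = 0$. But $(0, 1) \subseteq \bigcup_{q \in \mathbb{Q} \cap (-1,1)} (q + X)$ [check: if $y \in (0, 1)$ and $x \in X$ is the element picked from the equivalence class of $y$, then $y - x \in \mathbb{Q} \cap (-1, 1)$]. If $\mu(X) = 0$, then $\mu((0, 1)) \leq \sum_{q \in \mathbb{Q} \cap (-1,1)} \mu(q + X) = \sum_{q \in \mathbb{Q} \cap (-1,1)} 0 = 0$, which is also a contradiction since $\mu((0, 1)) = 1$. It follows that $X$ cannot be measurable.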

There are extensions of ZF incompatible with AC and which rule out such pathological consequences of AC as the Banach–Tarski “paradox”, non–Lebesgue measurable sets, etc. These extensions of ZF say that all sets of reals have nice regularity properties and therefore seem to reflect better our intuitions about such sets than ZFC. One such extension is ZF + “The Axiom of Determinacy”. Sufficiently strong large cardinal axioms (these are natural axioms extending ZFC, and in fact the Axiom of Infinity can be regarded as one such axiom) actually imply
$$L(\mathbb{R}) \models \text{ZF} + \text{“The Axiom of Determinacy”},$$
where $L(\mathbb{R})$ is the minimal inner model of ZF containing all the ordinals and all the reals.

9. Basics of cardinal arithmetic

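Definition 9.1. (ZF) Given cardinals $\kappa$, $\lambda$,

• $\kappa + \lambda = |(\kappa \times \{0\}) \cup (\lambda \times \{1\})|$, and
• $\kappa \cdot \lambda = |\kappa \times \lambda|$

Note: For any two cardinals $\kappa$, $\lambda$, $\kappa + \lambda$ is the cardinality of $X \cup Y$ for any two disjoint sets $X$, $Y$ such that $|X| = \kappa$ and $|Y| = \lambda$. Thus, 5 + 8 = 13, for example.

We need to prove that the above definitions make sense in general in ZF; in other words, that for all cardinals $\kappa$, $\lambda$ there are cardinals $\mu_0$, $\mu_1$ such that $|(\kappa \times \{0\}) \cup (\lambda \times \{1\})| = |\mu_0|$ and $|\kappa \times \lambda| = |\mu_1|$. This will follow immediately from Proposition 9.3.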

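Definition 9.2. (Gödel’s pairing function) Let us define the following relation $\lhd \subseteq (\mathrm{Ord} \times \mathrm{Ord}) \times (\mathrm{Ord} \times \mathrm{Ord})$: Given pairs of ordinals $(\alpha_0, \beta_0)$, $(\alpha_1, \beta_1)$, let $(\alpha_0, \beta_0) \lhd (\alpha_1, \beta_1)$ iff

• either $\max\{\alpha_0, \beta_0\} < \max\{\alpha_1, \beta_1\}$, or
• $\max\{\alpha_0, \beta_0\} = \max\{\alpha_1, \beta_1\}$ and $\alpha_0 < \alpha_1$, or else
• $\max\{\alpha_0, \beta_0\} = \max\{\alpha_1, \beta_1\}$, $\alpha_0 = \alpha_1$, and $\beta_0 < \beta_1$.

It is not difficult to see that $\lhd$ is a well–order on $\mathrm{Ord} \times \mathrm{Ord}$ [Exercise]. Now, let $\Gamma : \mathrm{Ord} \times \mathrm{Ord} \to \mathrm{Ord}$ be the class–function given by
$$\Gamma(\alpha, \beta) = \mathrm{ot}(\{(\gamma, \delta) \in \mathrm{Ord} \times \mathrm{Ord} : (\gamma, \delta) \lhd (\alpha, \beta)\}, \lhd)$$
$\Gamma$ is called Gödel’s pairing function.

Exercise 9.1. Prove that for all ordinals $\alpha_0$, $\beta_0$, $\alpha_1$ and $\beta_1$, $\Gamma(\alpha_0, \beta_0) < \Gamma(\alpha_1, \beta_1)$ iff $(\alpha_0, \beta_0) \lhd (\alpha_1, \beta_1)$.

Exercise 9.2. Prove that for every ordinal $\alpha$,
$$\{(\gamma, \delta) \in \mathrm{Ord} \times \mathrm{Ord} : (\gamma, \delta) \lhd (0, \alpha)\} = \alpha \times \alpha$$

For every ordinal $\alpha$ let $\Gamma(\alpha) = \Gamma[\alpha \times \alpha]$. It follows from the above that for every ordinal $\alpha$,

• $\Gamma(\alpha)$ is an ordinal,
• $\alpha \leq \Gamma(\alpha) \leq \Gamma(\beta)$ for all $\alpha \leq \beta$, and
• if $\alpha$ is a nonzero limit ordinal, then $\Gamma(\alpha) = \sup\{\Gamma(\beta) : \beta < \alpha\}$.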

Exercise 9.3. Prove that every infinite cardinal is a limit ordinal.
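Restricted to natural numbers, Gödel’s pairing function can be computed by brute force, since the order type of a finite well–ordered set is simply its number of elements. The sketch below does exactly that; `precedes` and `gamma` are illustrative names, and the restriction to finite ordinals is an assumption made only so that the example is runnable.

```python
def precedes(p, q):
    """The order on pairs from Definition 9.2: compare the maxima first,
    then the first coordinates, then the second coordinates."""
    return (max(p), p[0], p[1]) < (max(q), q[0], q[1])

def gamma(a, b):
    """Pairing function on natural numbers: the number of pairs strictly
    below (a, b). Every such pair has both coordinates at most max(a, b)."""
    m = max(a, b)
    return sum(
        1
        for c in range(m + 1)
        for d in range(m + 1)
        if precedes((c, d), (a, b))
    )

# The image of n x n under gamma, for n = 3.
image = sorted(gamma(a, b) for a in range(3) for b in range(3))
```

For `n = 3` the list `image` is `[0, 1, ..., 8]`, a finite instance of the fact that the image of $\alpha \times \alpha$ under the pairing function is an ordinal, and `gamma(0, n)` equals `n * n`, in line with Exercise 9.2.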

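Proposition 9.3. (ZF) For every infinite cardinal $\kappa$, $\Gamma[\kappa \times \kappa] = \kappa$.

Proof. It suffices to prove by induction on $\alpha$ that $\Gamma(\aleph_\alpha) = \Gamma[\aleph_\alpha \times \aleph_\alpha] = \aleph_\alpha$ and therefore $|\aleph_\alpha \times \aleph_\alpha| = \aleph_\alpha$. For $\alpha = 0$ this follows immediately from the above.

Now suppose $\alpha > 0$ and suppose, towards a contradiction, that there are $\gamma, \delta < \aleph_\alpha$ such that $\Gamma(\gamma, \delta) = \aleph_\alpha$. Since $\aleph_\alpha$ is a limit ordinal we may find an infinite ordinal $\eta < \aleph_\alpha$ such that $\gamma < \eta$ and $\delta < \eta$, i.e. $(\gamma, \delta) \in \eta \times \eta$. Let $\alpha' < \alpha$ be such that $|\eta| = \aleph_{\alpha'}$. $\alpha'$ exists since $\eta$ is infinite. Then
$$|\Gamma(\eta)| = |\Gamma[\eta \times \eta]| = |\eta \times \eta| = |\aleph_{\alpha'} \times \aleph_{\alpha'}| = \aleph_{\alpha'}$$
by induction hypothesis. But this is a contradiction since $\aleph_\alpha < \Gamma(\eta)$ and hence $|\Gamma(\eta)| \geq \aleph_\alpha > \aleph_{\alpha'}$. □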

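Corollary 9.4. (ZF) For all infinite cardinals $\kappa$, $\lambda$, $\kappa + \lambda = \kappa \times \lambda = \max\{\kappa, \lambda\}$.

Proof. Let us prove that $\kappa + \lambda = \max\{\kappa, \lambda\}$. Assume, without loss of generality, that $\lambda = \max\{\kappa, \lambda\}$. Then clearly $|\lambda| \leq |(\kappa \times \{0\}) \cup (\lambda \times \{1\})|$ (as witnessed by the function sending $\alpha \in \lambda$ to $(\alpha, 1)$). On the other hand, since $\kappa \leq \lambda$, the identity on $(\kappa \times \{0\}) \cup (\lambda \times \{1\})$ is a one–to–one function from $(\kappa \times \{0\}) \cup (\lambda \times \{1\})$ into $(\lambda \times \{0\}) \cup (\lambda \times \{1\})$. Also, $|(\lambda \times \{0\}) \cup (\lambda \times \{1\})| = |\lambda \times 2|$, and of course $|\lambda \times 2| \leq |\lambda \times \lambda|$. But, by Proposition 9.3, $|\lambda \times \lambda| = |\lambda|$. Putting everything together we have that $|\lambda| \leq |(\kappa \times \{0\}) \cup (\lambda \times \{1\})|$ and that $|(\kappa \times \{0\}) \cup (\lambda \times \{1\})| \leq |\lambda|$. By Cantor–Bernstein–Schröder we thus have $|\lambda| = |(\kappa \times \{0\}) \cup (\lambda \times \{1\})|$, as was to be proved.

Similarly one can prove that $\kappa \times \lambda = \max\{\kappa, \lambda\}$. □

Exercise 9.4. Prove that $\kappa \cdot \lambda = \max\{\kappa, \lambda\}$ for all infinite cardinals $\kappa$, $\lambda$.

Corollary 9.4 shows that addition and multiplication of infinite cardinals are trivial. Also, arithmetic of infinite cardinals behaves differently from arithmetic of finite cardinals: For example $5 + 7 = 12 \neq 7 = \max\{5, 7\}$, whereas $\aleph_2 + \aleph_2 = \aleph_2$.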

Note: It is a bit unfortunate that we use the same notation when referring to ordinal addition (and / or multiplication) and cardinal addition (and / or multiplication). These operations are defined in completely different ways and have nothing to do with each other; in fact they behave differently.44 Exactly the same comments apply to ordinal exponentiation vs. cardinal exponentiation (see below).45 In any case, whether one is referring to operations of ordinal arithmetic or operations of cardinal arithmetic should be clear from the context.

Definition 9.5. (ZFC) If $\kappa$ is a cardinal, $|\mathcal{P}(\kappa)|$ is denoted by $2^\kappa$.

In particular, $|\mathbb{R}| = 2^{\aleph_0}$. We have seen, by Cantor’s Theorem, that $\aleph_0 < 2^{\aleph_0}$, and therefore $\aleph_1 \leq 2^{\aleph_0}$ by definition of $\aleph_1$ as the least uncountable cardinal (we need the Axiom of Choice to conclude that there is an injection from $\omega_1$ into $\mathbb{R}$; without AC this is not true in general!). The following is therefore a very natural question.

44 For example, the ordinal $\aleph_1 + \aleph_0$, when + is taken to be ordinal addition, is $\sup\{\aleph_1 + n : n < \omega\}$, which is an ordinal strictly above $\aleph_1$. But $\aleph_1 + \aleph_0$, when now + is taken to be cardinal addition, is $\aleph_1$.

45 For example, as will be clear from Definition 9.5, $2^\kappa > \kappa$ for every cardinal $\kappa$. In particular $2^\omega > \omega$. On the other hand, if we regard exponentiation as ordinal exponentiation, then $2^\omega = \sup\{2^n : n < \omega\} = \omega$.

Question 9.6. Is $\aleph_1 = 2^{\aleph_0}$? In other words, if $X \subseteq \mathbb{R}$ is uncountable, does it follow that $|X| = |\mathbb{R}|$?

This is perhaps the most famous question in set theory. Georg Cantor was obsessed with it, and it is the first question on the famous list of problems that David Hilbert presented at his address at the International Congress of Mathematicians in 1900 in Paris.

Definition 9.7. Cantor’s Continuum Hypothesis (CH): $2^{\aleph_0} = \aleph_1$.

Definition 9.8. The Generalized Continuum Hypothesis (GCH): For every ordinal $\alpha$, $2^{\aleph_\alpha} = \aleph_{\alpha+1}$.

Again, Cantor’s Theorem shows that $2^{\aleph_\alpha} \geq \aleph_{\alpha+1}$, so GCH says that the power set of every infinite set has the least possible size.

It turns out that ZFC proves neither CH nor $\neg$CH. In fact, if ZFC is consistent, then the following theories are also consistent:

• ZFC + $2^{\aleph_0} = \aleph_1$
• ZFC + $2^{\aleph_0} = \aleph_2$
• ZFC + $2^{\aleph_0} = \aleph_3$
• ZFC + $2^{\aleph_0} = \aleph_{471}$
• ZFC + $2^{\aleph_0} = \aleph_{\omega_1}$
• ZFC + $2^{\aleph_0} = \aleph_{\omega_2 + \omega_1 + 5678}$
• ...

Similarly, if ZFC is consistent, then the following theories are also consistent:

• ZFC + GCH
• ZFC + $2^{\aleph_5} = \aleph_6$
• ZFC + $2^{\aleph_5} = \aleph_8$
• ZFC + $2^{\aleph_0} = \aleph_1$ + $2^{\aleph_5} = \aleph_8$
• ZFC + $2^{\aleph_0} = 2^{\aleph_1} = 2^{\aleph_2} = \aleph_3$
• ...

All these relative consistency results are proved using the method of forcing.

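On the other hand, ZFC $\vdash 2^{\aleph_0} \neq \aleph_\omega$, ZFC $\vdash 2^{\aleph_0} \neq \aleph_{\omega^2}$, ZFC $\vdash 2^{\aleph_1} \neq \aleph_{\omega_1}$, ... In fact:

Theorem 9.9. (König’s Theorem) (ZFC) $\mathrm{cf}(2^\kappa) > \kappa$ for every infinite cardinal $\kappa$.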


Proposition 9.10. (ZFC) For every infinite cardinal $\kappa$, $\kappa^+$ is regular.

Proof. Suppose, towards a contradiction, that $\kappa^+$ is singular. Hence there is some $\alpha < \kappa^+$ for which there is a function $f : \alpha \to \kappa^+$ with range cofinal in $\kappa^+$. Since $|\alpha| \leq \kappa$, we may of course assume that in fact $\alpha = \kappa$.

Now, using AC, choose for every $\xi < \kappa$ a surjection $g_\xi : \kappa \to f(\xi)$, which is possible since $f(\xi) \in \kappa^+$ and therefore $|f(\xi)| \leq \kappa$ (in other words, consider a choice function of the set $\{X_\xi : \xi < \kappa\}$ where, for each $\xi$, $X_\xi = \{h : h : \kappa \to f(\xi) \text{ onto}\}$). Let $g : \kappa \times \kappa \to \kappa^+$ be such that $g((\xi_0, \xi_1)) = g_{\xi_0}(\xi_1)$. Since $f$ has range cofinal in $\kappa^+$, $g$ is a surjection onto $\kappa^+$. But, as we know by Corollary 9.4, $|\kappa \times \kappa| = \kappa$, so there is a surjection $g^* : \kappa \to \kappa^+$. Since of course there is a surjection $h : \kappa^+ \to \kappa$ (as $\kappa < \kappa^+$), by Dual Cantor–Bernstein–Schröder (which is true since we are assuming AC) we have that $|\kappa^+| = \kappa$. Contradiction. □

Corollary 9.11. (ZFC) There is no largest regular cardinal.

Proof. Suppose $\kappa$ were the largest regular cardinal. By the above proposition, $\kappa^+$ is also regular. But $\kappa^+ > \kappa$. □

Remark 9.12. AC is needed in the above. In fact, if some (natural) extension of ZFC with large cardinal axioms is consistent, then there are models of ZF in which all infinite cardinals have cofinality $\omega$.

It is also easy to show that there is no largest singular cardinal (and for this AC is not needed):

Proposition 9.13. For every infinite cardinal $\kappa$ there is a cardinal $\lambda > \kappa$ such that $\mathrm{cf}(\lambda) = \omega$ (and in particular $\lambda$ is singular).

Proof. Let $(\kappa_n)_{n < \omega}$ be the sequence defined as $\kappa_0 = \kappa$ and $\kappa_{n+1} = (\kappa_n)^+$. Let $\lambda = \sup\{\kappa_n : n < \omega\} = \bigcup\{\kappa_n : n < \omega\}$. Then for every $\alpha \in \lambda$ there is of course some $n$ such that $\alpha < \kappa_n$ and so $\{\kappa_n : n < \omega\}$ is cofinal in $\lambda$. Also, $\lambda$ is a cardinal and therefore a limit ordinal and therefore $\mathrm{cf}(\lambda) \neq 1$. Hence $\mathrm{cf}(\lambda) = \omega$. □

Given a cardinal $\kappa$ and $n < \omega$, the cardinal $\kappa_n$ defined in the above proof is often denoted $\kappa^{+n}$. $\kappa^{+n}$ is of course the $n$-th cardinal above $\kappa$.

Exercise 9.5. (ZF) Prove that for every infinite regular cardinal $\mu$ and every cardinal $\kappa$ there is some cardinal $\lambda > \kappa$ such that $\mathrm{cf}(\lambda) = \mu$.

10. Filters and ultrafilters

Definition 10.1. Let $X$ be a set. $F \subseteq \mathcal{P}(X)$ is a filter on $X$ iff:

(1) $X \in F$ and $\emptyset \notin F$.
(2) For all $Y \subseteq Z \subseteq X$, if $Y \in F$, then $Z \in F$ ($F$ is closed under supersets).
(3) For all $Y_0, Y_1 \in F$, $Y_0 \cap Y_1 \in F$ ($F$ is closed under finite intersections).

A filter $F$ is an ultrafilter iff:

(4) For all $Y \subseteq X$, either $Y \in F$ or else $X \setminus Y \in F$.

Examples:

• If $a \in X$, $U_a^X = \{Y \subseteq X : a \in Y\}$ is an ultrafilter [Exercise: Prove that this is true.] $U_a^X$ is called the principal ultrafilter on $X$ generated by $a$.
• $F = \{Y \subseteq \omega : |\omega \setminus Y| < \omega\}$ is a filter on $\omega$ ($F$ is called the Fréchet filter).
• $F = \{Y \subseteq (0, 1) : \mu(Y) = 1\}$, where $(0, 1)$ denotes the unit interval, $(0, 1) \subseteq \mathbb{R}$, and $\mu$ denotes Lebesgue measure.
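On a small finite set the conditions of Definition 10.1 can be checked exhaustively. The following is a minimal sketch for that finite case only, with families of subsets represented as Python sets of frozensets; the helper names (`subsets`, `is_filter`, `is_ultrafilter`, `principal`) are illustrative and not notation from these notes.

```python
from itertools import combinations

def subsets(X):
    """All subsets of the finite set X, as a list of frozensets."""
    items = list(X)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def is_filter(F, X):
    """Check conditions (1)-(3) of Definition 10.1 for a family F of subsets of X."""
    X = frozenset(X)
    if X not in F or frozenset() in F:
        return False
    if not all(Y <= X for Y in F):
        return False
    upward_closed = all(Z in F for Y in F for Z in subsets(X) if Y <= Z)
    closed_under_meets = all((Y0 & Y1) in F for Y0 in F for Y1 in F)
    return upward_closed and closed_under_meets

def is_ultrafilter(F, X):
    """A filter such that every subset of X or its complement belongs to it."""
    X = frozenset(X)
    return is_filter(F, X) and all(Y in F or (X - Y) in F for Y in subsets(X))

def principal(a, X):
    """The principal ultrafilter on X generated by a: all subsets of X containing a."""
    return {Y for Y in subsets(X) if a in Y}
```

For example, with `X = {0, 1, 2}`, `is_ultrafilter(principal(0, X), X)` holds, while the family of all supersets of `{0, 1}` passes `is_filter` but fails `is_ultrafilter`, since neither `{0}` nor its complement belongs to it.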

Definition 10.2. A filter on $X$ is said to be principal iff it is $U_a^X$ for some $a \in X$. A filter on $X$ which is not of the form $U_a^X$ for any $a \in X$ is said to be non–principal.

Fact 10.3. The Fréchet filter is non–principal.

Proof. Let $F$ be the Fréchet filter and suppose, towards a contradiction, that $F = U_n^\omega = \{Y \subseteq \omega : n \in Y\}$ for some $n \in \omega$. Let $X = \{n\}$. Then $X \in U_n^\omega = F$. But $\omega \setminus X \in F$. Hence $X \cap (\omega \setminus X) = \emptyset \in F$, which is a contradiction. □

A filter F on a set X corresponds to some notion of “largeness” forsubsets of X: Being in F means ‘being almost everything (except forsome negligible set)’. Thus, being in the Frechet filter means beingalmost all of !, except for, possibly, a finite set.

Let us now introduce a notion of ‘dual of a filter’. As we will see,members of an ideal are to be seen as ‘small’ or ‘negligible’ (for somesuitable notion of “smallness”).

Definition 10.4. Given a set $X$, $I \subseteq \mathcal{P}(X)$ is an ideal on $X$ if and only if $F_I = \{X \setminus Y : Y \in I\}$ is a filter on $X$. We call $F_I$ the dual filter of $I$.

Remark 10.5. Clearly, $F \subseteq \mathcal{P}(X)$ is a filter on $X$ if and only if $I_F = \{X \setminus Y : Y \in F\}$ is an ideal on $X$. We also call $I_F$ the dual ideal of $F$.

Exercise 10.1. Let $I \subseteq \mathcal{P}(X)$. Prove that the following are equivalent.

(1) $I$ is an ideal on $X$.
(2) The following holds.
    • $\emptyset \in I$ and $X \notin I$.
    • If $Y \in I$ and $Z \subseteq Y$, then $Z \in I$.
    • If $Y_0$ and $Y_1$ are both in $I$, then $Y_0 \cup Y_1 \in I$.

Given a filter $F$ on $X$ and a cardinal $\kappa$, $F$ is $\kappa$–complete iff $\bigcap_{i < \lambda} Y_i \in F$ whenever $\lambda < \kappa$, $\lambda \neq 0$, and $\{Y_i : i < \lambda\} \subseteq F$.

Exercise 10.2.
(1) Every filter is $\aleph_0$–complete.
(2) The Fréchet filter is not $\aleph_1$–complete.
(3) Every principal filter is $\kappa$–complete for every cardinal $\kappa$.

As we have seen, every principal filter is an ultrafilter. Let us considerthe following questions now:

Question 10.6. Is there any non-principal ultrafilter? Is there anynon-principal ultrafilter on !?

Proposition 10.7. (ZF) Let $X$ be a set. The following are equivalent for every filter $F$ on $X$.

(1) $F$ is a $\subseteq$–maximal filter on $X$ (i.e., there is no filter $F' \subseteq \mathcal{P}(X)$ such that $F \subseteq F'$ and $F \neq F'$).

(2) $F$ is an ultrafilter.

Proof. (1) implies (2): Let $Y \subseteq X$ and suppose none of $Y$, $X \setminus Y$ is in $F$. Suppose

• there is $Z \in F$ such that $Z \cap Y = \emptyset$, and
• there is $Z' \in F$ such that $Z' \cap (X \setminus Y) = \emptyset$.

Then $Z \cap Z' = \emptyset \in F$. Contradiction. Hence, either

(a) for all $Z \in F$, $Z \cap Y \neq \emptyset$, or else
(b) for all $Z \in F$, $Z \cap (X \setminus Y) \neq \emptyset$.

In case (a),
$$F' = \{W \subseteq X : Z \cap Y \subseteq W \text{ for some } Z \in F\}$$
is a filter on $X$ properly extending $F$ and such that $Y \in F'$. In case (b),
$$F' = \{W \subseteq X : Z \setminus Y \subseteq W \text{ for some } Z \in F\}$$
is a filter on $X$ properly extending $F$ and such that $X \setminus Y \in F'$. Hence, in both cases we have that $F$ is not $\subseteq$–maximal.

(2) implies (1): Suppose $F' \subseteq \mathcal{P}(X)$ is a filter such that $F \subseteq F'$ and $F \neq F'$. Let $Y \in F' \setminus F$. Since $F$ is an ultrafilter, $X \setminus Y \in F$. But then $\emptyset = Y \cap (X \setminus Y) \in F'$ and so $F'$ is not a filter. Contradiction. □

As we will see next, ZFC proves that the answer to the above questions (Question 10.6) is Yes.

Theorem 10.8. (ZFC) For every set $X$ and every filter $F$ on $X$ there is an ultrafilter $U$ on $X$ such that $F \subseteq U$.

Proof. The proof is naturally given using Zorn’s Lemma, which we know is equivalent to AC over ZF. Let
$$\mathcal{F}_F = \{F' : F' \text{ a filter on } X, F \subseteq F'\}$$
and let $\mathbb{P} = (\mathcal{F}_F, \subseteq)$. Then the partial order $\mathbb{P}$ satisfies the hypothesis of Zorn’s Lemma. Indeed, if $C$ is a chain in $\mathbb{P}$, then $\bigcup C$ is a filter on $X$, and it of course extends $F$. Hence, by Zorn’s Lemma there is some $U \in \mathbb{P}$ which is $\subseteq$–maximal in $\mathbb{P}$, and therefore $U$ is an ultrafilter on $X$ extending $F$ by Proposition 10.7. □

Exercise 10.3. Prove that every ultrafilter on $\omega$ extending the Fréchet filter is non–principal.

Remark 10.9. ZF does not suffice to prove the existence of non–principal ultrafilters on $\omega$ (if ZF is consistent). For example, ZF + Axiom of Determinacy implies that there are no non–principal ultrafilters on $\omega$.

10.1. Clubs and stationary sets. The following is a prominent ex-ample of filter on an uncountable regular cardinal.

Definition 10.10. Let $\kappa \geq \omega_1$ be a regular cardinal. $C \subseteq \kappa$ is a closed and unbounded subset of $\kappa$ (a club of $\kappa$) iff the following holds.

(1) For every nonzero limit ordinal $\alpha < \kappa$, if $\alpha = \sup(C \cap \alpha)$, then $\alpha \in C$. ($C$ is closed in $\kappa$)

(2) For every $\alpha < \kappa$ there is some $\beta \in C$ such that $\alpha < \beta$. ($C$ is unbounded in $\kappa$)

Thus, $\kappa$ is a club of $\kappa$, but so is the set of limit ordinals of $\kappa$, and the set of all nonzero limit ordinals $\alpha$ such that $\alpha$ is the supremum of the set of limit ordinals below $\alpha$. And so on. In general, if $C \subseteq \kappa$ is a club, then $C' = \{\alpha \in C : \alpha = \sup(C \cap \alpha)\}$ is also a club.

As we will see next, the intersection of two clubs is again a club.

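Proposition 10.11. Let $\kappa$ be an uncountable regular cardinal and let $C_0, C_1 \subseteq \kappa$ be clubs of $\kappa$. Then $C_0 \cap C_1$ is a club of $\kappa$.

Proof. It is straightforward to check that $C = C_0 \cap C_1$ is closed in $\kappa$. Indeed, if $\alpha < \kappa$ is a nonzero limit ordinal and $\alpha = \sup(C \cap \alpha)$, then of course $\alpha = \sup(C_0 \cap \alpha)$ (since $C \subseteq C_0$), and therefore $\alpha \in C_0$, since $C_0$ is closed in $\kappa$, and similarly $\alpha = \sup(C_1 \cap \alpha)$ and therefore $\alpha \in C_1$.

To see that $C$ is unbounded in $\kappa$, let $\alpha \in \kappa$. It suffices to show that there is some $\beta \in C$ such that $\beta > \alpha$. For this, using that $C_0$ and $C_1$ are unbounded in $\kappa$, we build strictly increasing sequences $(\alpha_n)_{n<\omega}$, $(\beta_n)_{n<\omega}$ in $\kappa$ such that

(1) $\alpha_0 = \beta_0 = \alpha$,
(2) for all $n$, $\beta_n < \alpha_{n+1}$ and $\alpha_{n+1} \in C_0$, and
(3) for all $n$, $\alpha_{n+1} < \beta_{n+1}$ and $\beta_{n+1} \in C_1$.

Let $\beta = \sup\{\alpha_n : n < \omega\}$ and note that also $\beta = \sup\{\beta_n : n < \omega\}$ since $(\alpha_n)_{n<\omega}$ and $(\beta_n)_{n<\omega}$ are interleaving. Then $\alpha < \beta$ by (2), $\beta < \kappa$ since $\mathrm{cf}(\kappa) \geq \omega_1$, $\beta \in C_0$ since $C_0$ is closed and $\beta = \sup\{\alpha_n : n < \omega\}$ with $\alpha_n \in C_0$ for all $n \geq 1$, and $\beta \in C_1$ since $C_1$ is closed and $\beta = \sup\{\beta_n : n < \omega\}$ with $\beta_n \in C_1$ for all $n \geq 1$. □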

By a similar argument one can prove the following strengthening of Proposition 10.11.

Proposition 10.12. Let $\kappa$ be an uncountable regular cardinal, let $\lambda < \kappa$, $\lambda \neq 0$, and let $(C_\alpha)_{\alpha<\lambda}$ be such that $C_\alpha$ is a club of $\kappa$ for every $\alpha$. Then $\bigcap_{\alpha<\lambda} C_\alpha$ is also a club of $\kappa$.

Exercise 10.4. Prove Proposition 10.12.

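Given an uncountable regular cardinal $\kappa$, let $\mathcal{C}_\kappa$ be the filter on $\kappa$ generated by the collection of clubs of $\kappa$, i.e.,

$$\mathcal{C}_\kappa = \{X \subseteq \kappa : \text{there is some club } C \text{ of } \kappa \text{ such that } C \subseteq X\}.$$

$\mathcal{C}_\kappa$ is known as the club filter on $\kappa$.

Theorem 10.13. (ZFC) Let $\kappa$ be an uncountable regular cardinal. Then $\mathcal{C}_\kappa$ is a $\kappa$-complete filter on $\kappa$.

Proof. $\mathcal{C}_\kappa$ is clearly closed under taking supersets (within $\kappa$), and of course $\emptyset \notin \mathcal{C}_\kappa$ and $\kappa \in \mathcal{C}_\kappa$. To see that $\mathcal{C}_\kappa$ is closed under intersections of size $\lambda < \kappa$, $\lambda \neq 0$, let $\lambda$ be such an ordinal, and let $(X_i)_{i<\lambda}$ be a sequence of members of $\mathcal{C}_\kappa$. Using AC we may find a sequence $(C_i)_{i<\lambda}$ of clubs of $\kappa$ such that $C_i \subseteq X_i$ for all $i$. But then, by Proposition 10.12, $C = \bigcap_{i<\lambda} C_i$ is a club of $\kappa$ such that $C \subseteq \bigcap_{i<\lambda} X_i$. □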

By the above theorem, containing a club of $\kappa$ is a suitable notion of 'being almost everything' for subsets of $\kappa$. Next we consider the corresponding notion of 'largeness' (intersecting every club) and of 'smallness' (being disjoint from a club):

Definition 10.14. Given a regular cardinal $\kappa \geq \omega_1$, $S \subseteq \kappa$ is a stationary subset of $\kappa$ iff $S \cap C \neq \emptyset$ for every club $C$ of $\kappa$. Also, $S \subseteq \kappa$ is a non-stationary subset of $\kappa$ iff $S$ is not a stationary subset of $\kappa$; in other words, if and only if there is a club $C$ of $\kappa$ such that $S \cap C = \emptyset$.

By Proposition 10.11, every club of $\kappa$ is of course stationary. We will see next that there are stationary subsets of $\kappa$ which are not of this form.

Definition 10.15. Given a set of ordinals $X$ and a function $f$ with $\mathrm{dom}(f) = X$, $f$ is regressive iff $f(\alpha) \in \alpha$ for every $\alpha \in X$.

Definition 10.16. Given a cardinal $\kappa$ and a sequence $(C_\alpha)_{\alpha<\kappa}$ of subsets of $\kappa$, let

$$\Delta_{\alpha<\kappa} C_\alpha = \{\beta \in \kappa : \beta \in C_\alpha \text{ for all } \alpha < \beta\}.$$

$\Delta_{\alpha<\kappa} C_\alpha$ is the diagonal intersection of $(C_\alpha)_{\alpha<\kappa}$.

Lemma 10.17. Given an uncountable regular cardinal $\kappa$, $\mathcal{C}_\kappa$ is closed under diagonal intersections; in fact, if $(C_\alpha)_{\alpha<\kappa}$ is a sequence of clubs of $\kappa$, then $\Delta_{\alpha<\kappa} C_\alpha$ is also a club of $\kappa$.

Exercise 10.5. Prove Lemma 10.17. [Hint: This is very much like the proof of Proposition 10.12].

Lemma 10.18. (ZFC) (Fodor's lemma) Let $\kappa$ be an uncountable regular cardinal. For every stationary subset $S$ of $\kappa$ and every regressive function $f$ on $S$, there is some $\alpha < \kappa$ such that $f^{-1}(\alpha) = \{\beta \in S : f(\beta) = \alpha\}$ is stationary.

Proof. Suppose otherwise. Then for every $\alpha < \kappa$ we may pick a club $C_\alpha$ such that for every $\beta \in C_\alpha \cap S$, $f(\beta) \neq \alpha$. By Lemma 10.17, $C = \Delta_{\alpha<\kappa} C_\alpha$ is a club of $\kappa$. Since $S$ is stationary, we may pick $\beta \in S \cap C$. Let $\alpha = f(\beta) < \beta$. Then $\beta \in C_\alpha$ by the definition of $C$. But then $f(\beta) \neq \alpha$ by the choice of $C_\alpha$ and since $\beta \in C_\alpha$, which is a contradiction. □

In the following, let $\mathrm{Lim}(\omega_1)$ denote the set of nonzero limit ordinals in $\omega_1$.

Definition 10.19. $\vec{C} = (C_\alpha : \alpha \in \mathrm{Lim}(\omega_1))$ is a ladder system iff for every $\alpha \in \mathrm{Lim}(\omega_1)$,

(1) $C_\alpha \subseteq \alpha$ is cofinal in $\alpha$ and
(2) $\mathrm{ot}(C_\alpha) = \omega$.

The following is immediate since every nonzero limit ordinal in $\omega_1$ is countable and therefore has cofinality $\omega$.

Lemma 10.20. (ZFC) There is a ladder system.

The following theorem shows in particular that $\omega_1$ can be split into $\aleph_1$-many disjoint stationary sets. These stationary sets can of course not contain any club of $\omega_1$. This theorem is in fact true, and by very much the same proof, for any uncountable regular cardinal $\kappa$.

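Theorem 10.21. (ZFC) Let $S$ be a stationary subset of $\omega_1$. Then there is a sequence $(S_\alpha)_{\alpha<\omega_1}$ such that

(1) $S_\alpha \cap S_{\alpha'} = \emptyset$ for all $\alpha \neq \alpha'$ in $\omega_1$ and
(2) $S_\alpha \subseteq S$ is a stationary subset of $\omega_1$ for all $\alpha$.

Proof. Since $\mathrm{Lim}(\omega_1)$ is a club of $\omega_1$, the set $S \cap \mathrm{Lim}(\omega_1)$ is stationary by Proposition 10.11, so we may assume $S \subseteq \mathrm{Lim}(\omega_1)$. Let $\vec{C} = (C_\alpha : \alpha \in \mathrm{Lim}(\omega_1))$ be a ladder system (we can indeed pick one by Lemma 10.20).

Claim 10.22. There is some $n_0 < \omega$ such that for every $\beta < \omega_1$, $S_\beta := \{\alpha \in S : C_\alpha(n_0) > \beta\}$ is stationary, where $(C_\alpha(n))_{n<\omega}$ denotes the strictly increasing enumeration of $C_\alpha$.

Proof. If this fails, then for every $n < \omega$ there is some club $C_n \subseteq \omega_1$ and some $\beta_n < \omega_1$ such that for every $\alpha \in C_n \cap S$, $C_\alpha(n) \leq \beta_n$. Let $C = \bigcap_{n<\omega} C_n$, which is a club of $\omega_1$, let $\beta = \sup\{\beta_n : n < \omega\}$, and note that $\beta < \omega_1$ since $\mathrm{cf}(\omega_1) = \omega_1$. But now, if $\alpha \in C \cap S$, $\alpha > \beta + 1$ (such an $\alpha$ exists since $S$ is stationary and $C \setminus (\beta + 2)$ is a club), then $C_\alpha(n) \leq \beta_n \leq \beta$ for all $n$, and therefore $\alpha = \sup(C_\alpha) \leq \beta$, which is a contradiction. □

Let us fix now $n_0$ as in the Claim. Let $f$ be the function with domain $S$ such that $f(\alpha) = C_\alpha(n_0)$. Since $f$ is regressive, by Fodor's Lemma there is some $\beta_0$ such that $S_0 := \{\alpha \in S : C_\alpha(n_0) = \beta_0\}$ is stationary. By the choice of $n_0$, again by Fodor's Lemma we can then find $\beta_1 > \beta_0$ such that $S_1 := \{\alpha \in S : C_\alpha(n_0) = \beta_1\}$ is stationary. In general, given $\nu < \omega_1$ such that $\beta_{\nu'}$ has been defined for all $\nu' < \nu$, let $\beta_\nu > \sup\{\beta_{\nu'} : \nu' < \nu\}$ be such that $S_\nu = \{\alpha \in S : C_\alpha(n_0) = \beta_\nu\}$ is stationary (this $\beta_\nu$ exists: the supremum is below $\omega_1$ since $\mathrm{cf}(\omega_1) = \omega_1$, and Fodor's Lemma applies to the restriction of $f$ to the stationary set of all $\alpha \in S$ such that $C_\alpha(n_0)$ is above this supremum). But now we are easily done since $S_\nu \cap S_{\nu'} = \emptyset$ for all $\nu \neq \nu'$ in $\omega_1$. □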

11. Infinite Ramsey theory

In this section we address questions of the following sort, for some set $X$: Suppose $c$ is some colouring of $[X]^\mu$, the set of all subsets of $X$ of cardinality $\mu$, into $\nu$-many colours. Must there be a large subset $H$ of $X$ (where 'large' will typically mean $|H| = |X|$) which is homogeneous for $c$? Here, $H$ being homogeneous for $c$ means that there is a colour $i \in \nu$ such that $c(s) = i$ for all $s \in [H]^\mu$.

It will be convenient to adopt the following notation.46

Notation 11.1. Given cardinals $\kappa$, $\lambda$, $\mu$ and $\nu$,47

$$\kappa \to (\lambda)^\mu_\nu$$

denotes the following statement: Suppose $c : [\kappa]^\mu \to \nu$. Then there are $H \subseteq \kappa$ and $i \in \nu$ such that $|H| = \lambda$ and $c(s) = i$ for all $s \in [H]^\mu$.

46 This is sometimes known as Hungarian arrow notation.
47 These could be finite or infinite cardinals.

Note that with the above notation, if we increase the parameter $\kappa$ on the left-hand side of the expression, then we obtain a statement implied by $\kappa \to (\lambda)^\mu_\nu$, whereas if we increase either of the parameters $\lambda$ or $\nu$ on the right-hand side of the expression, then we obtain a statement implying $\kappa \to (\lambda)^\mu_\nu$ (and the same holds for increasing a finite exponent $\mu$ when $\lambda$ is infinite).

A well-known, and easily checked, instance of a Ramsey-type statement is the following: $6 \to (3)^2_2$ but $5 \nrightarrow (3)^2_2$ (i.e., in any party of 6 people there are at least 3 people any two of whom have met before, or at least 3 people no two of whom have ever met before, and this is not true if we replace 6 by 5).
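The instance $6 \to (3)^2_2$, $5 \nrightarrow (3)^2_2$ is small enough to verify exhaustively. The following Python sketch (the function names are chosen for this illustration only) runs through all colourings of the 2-element subsets of $n = \{0, \dots, n-1\}$ with 2 colours and looks for a homogeneous set of size 3.

```python
from itertools import combinations, product

def has_homogeneous_triple(n, colour):
    """colour maps each 2-element frozenset of range(n) to 0 or 1."""
    for a, b, c in combinations(range(n), 3):
        if (colour[frozenset((a, b))]
                == colour[frozenset((a, c))]
                == colour[frozenset((b, c))]):
            return True
    return False

def arrows_3_2_2(n):
    """Decide n -> (3)^2_2 by brute force over all 2-colourings of [n]^2."""
    pairs = [frozenset(p) for p in combinations(range(n), 2)]
    for bits in product((0, 1), repeat=len(pairs)):
        colour = dict(zip(pairs, bits))
        if not has_homogeneous_triple(n, colour):
            return False  # found a colouring with no homogeneous triple
    return True
```

For $n = 6$ there are $2^{15}$ colourings to inspect, which is immediate; the same brute-force approach is hopeless for the later values mentioned in the text, since the number of colourings grows like $2^{n(n-1)/2}$.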

It is also known that $18 \to (4)^2_2$ but $17 \nrightarrow (4)^2_2$. It is known that the least $N < \omega$ such that $N \to (5)^2_2$ is such that $43 \leq N \leq 49$. However, the exact value of $N$ is not known!

The starting point of infinite Ramsey theory is the following classic theorem.

Theorem 11.2. (Ramsey) $\omega \to (\omega)^n_m$ for all nonzero $n, m < \omega$.

Proof. The proof is by induction on $n$. For $n = 1$ the conclusion holds since every finite union of finite sets is finite. Let us give the proof for the general case $n > 1$.

Let $c : [\omega]^n \to m$. We define the following sequences $(a_k)_{k<\omega}$, $(H_k)_{k<\omega}$, $(i_k)_{k<\omega}$ of integers, infinite sets of integers, and colours in $m$, respectively: $a_0 = 0$ and $H_0$ is some infinite subset of $\omega \setminus 1$ such that $c(\{0\} \cup s) = i_0$ for all $s \in [H_0]^{n-1}$. The existence of such $H_0$ and $i_0$ follows of course by the induction hypothesis applied to the colouring $c_0 : [\omega \setminus 1]^{n-1} \to m$ given by $c_0(s) = c(\{0\} \cup s)$. Let $a_1 = \min(H_0)$ and let $i_1 \in m$ and $H_1$ be such that $H_1$ is an infinite subset of $H_0 \setminus (a_1 + 1)$ such that $c(\{a_1\} \cup s) = i_1$ for all $s \in [H_1]^{n-1}$. These $i_1$ and $H_1$ exist for the same reason as before.

In general, if $a_{k'}$ and $H_{k'}$ have been defined for all $k' \leq k$, let $a_{k+1} = \min(H_k)$ and let $i_{k+1} \in m$ and $H_{k+1} \in [H_k \setminus (a_{k+1} + 1)]^{\aleph_0}$ be such that $c(\{a_{k+1}\} \cup s) = i_{k+1}$ for all $s \in [H_{k+1}]^{n-1}$.

Finally, let $i \in m$ be such that $\{k < \omega : i_k = i\}$ is infinite. It is now easy to check that $c(s) = i$ for all $s \in [\{a_k : i_k = i\}]^n$. Indeed, suppose $s = \{a_{k_0}, \ldots, a_{k_{n-1}}\}$, where $k_0 < \ldots < k_{n-1}$ are such that $i_{k_j} = i$ for all $j$. Then $c(s) = c(\{a_{k_0}\} \cup \{a_{k_1}, \ldots, a_{k_{n-1}}\}) = i_{k_0} = i$ since $\{a_{k_1}, \ldots, a_{k_{n-1}}\} \in [H_{k_0}]^{n-1}$ as $(H_k)_{k<\omega}$ is $\subseteq$-decreasing and $a_{k+1} = \min(H_k)$ for all $k$. It follows that $H = \{a_k : i_k = i\}$ is as desired. □

There is a nice argument which combines Theorem 11.2 with the Compactness Theorem of first order logic and which yields the following: For all $n, m, r \in \omega$ there is some $N \in \omega$ such that $N \to (n)^m_r$.

It is natural to ask whether $\omega$ in Ramsey's theorem can be replaced by $\omega_1$ or by some higher cardinal. Before answering the first question, let us prove the following important facts, which are of independent interest.48

Theorem 11.3. There is no strictly increasing or strictly decreasing $\omega_1$-sequence of reals.49

Proof. Let us prove that there is no strictly increasing sequence $(x_\xi)_{\xi<\omega_1}$ of reals (the proof that there is no strictly decreasing $\omega_1$-sequence of reals is completely symmetrical). Since $\mathbb{Q}$ is dense in $\mathbb{R}$, for every $\xi < \omega_1$ we may pick a rational $q_\xi$ such that $x_\xi < q_\xi < x_{\xi+1}$. But we know that $\mathbb{Q}$ is countable. Hence there must be $q \in \mathbb{Q}$ such that $I = \{\xi \in \omega_1 : q_\xi = q\}$ is uncountable. But this is impossible since all $q_\xi$'s have to be distinct! Indeed, if $\xi < \xi' < \omega_1$, then $q_\xi < x_{\xi+1} \leq x_{\xi'} < q_{\xi'}$. This contradiction concludes the proof. □

Theorem 11.4. For every countable ordinal $\alpha$ and for all $x < y$ in $\mathbb{R}$ there is a strictly increasing sequence $(x_\xi)_{\xi<\alpha}$ of reals such that $x < x_\xi < y$ for all $\xi$.

Proof. We prove, by induction on $\alpha$, that for all $x < y$ in $\mathbb{R}$ the following statement $(*)^{x,y}_\alpha$ holds: There is a strictly increasing sequence $(x_\xi)_{\xi<\alpha}$ such that $x < x_\xi < y$ for all $\xi$.

For $\alpha = 0, 1$, the conclusion is trivial. For $\alpha = \alpha_0 + 1$, simply take any real $z$ such that $x < z < y$, take any sequence $(y_\xi)_{\xi<\alpha_0}$ witnessing $(*)^{x,z}_{\alpha_0}$, and notice that the concatenation of $(y_\xi)_{\xi<\alpha_0}$ with $z$, $(y_\xi)_{\xi<\alpha_0} {}^\frown \langle z \rangle$, witnesses $(*)^{x,y}_\alpha$.

Finally, suppose $\alpha$ is a nonzero limit ordinal and let $(\alpha_n)_{n<\omega}$ be a strictly increasing sequence of ordinals such that $\sup_n \alpha_n = \alpha$. Let also $(x_n)_{n<\omega}$ be a strictly increasing sequence of reals converging to $y$ and such that $x < x_0$. By induction hypothesis we may fix a sequence $(x^0_\xi)_{\xi<\alpha_0}$ of reals witnessing $(*)^{x,x_0}_{\alpha_0}$, a sequence $(x^1_\xi)_{\xi<\alpha_1}$ of reals witnessing $(*)^{x_0,x_1}_{\alpha_1}$ and, in general, for every $n$ a sequence $(x^{n+1}_\xi)_{\xi<\alpha_{n+1}}$ of reals witnessing $(*)^{x_n,x_{n+1}}_{\alpha_{n+1}}$. It follows now that the concatenation

$$(x^0_\xi)_{\xi<\alpha_0} {}^\frown (x^1_\xi)_{\alpha_0 \leq \xi < \alpha_1} {}^\frown \ldots {}^\frown (x^{n+1}_\xi)_{\alpha_n \leq \xi < \alpha_{n+1}} {}^\frown \ldots$$

is a sequence of reals as desired. □

48 We will use the first of these facts in the proof of Theorem 11.5.
49 Here, and in the next theorem, we are of course referring to the real line endowed with the natural order relation.


By a symmetrical argument we can of course prove that for every countable ordinal $\alpha$ and for all $x < y$ in $\mathbb{R}$ there is a strictly decreasing sequence $(x_\xi)_{\xi<\alpha}$ such that $x < x_\xi < y$ for all $\xi$.

We can now answer the above question for $\omega_1$. As we will see, the answer is No.

Theorem 11.5. (ZFC) (Sierpiński) $\omega_1 \nrightarrow (\omega_1)^2_2$

Proof. Let us start by fixing a sequence of length $\omega_1$, $(x_\alpha)_{\alpha<\omega_1}$, consisting of distinct reals. Such a sequence exists since we know that $2^{\aleph_0} \geq \aleph_1$ by Cantor's Theorem. Let $c : [\omega_1]^2 \to 2$ be given by: Given $\alpha < \beta < \omega_1$, $c(\{\alpha, \beta\}) = 1$ iff $x_\alpha < x_\beta$ (in the natural order on $\mathbb{R}$).

Now, if $H$ is a homogeneous subset of $\omega_1$ for this colouring with $|H| = \aleph_1$, then $(x_\alpha)_{\alpha \in H}$ is a strictly increasing sequence of reals if $c(s) = 1$ for all $s \in [H]^2$, and a strictly decreasing sequence of reals if $c(s) = 0$ for all $s \in [H]^2$. Since $H$ has order type $\omega_1$, in both cases this is a contradiction by Theorem 11.3. □

It is also true, by a similar argument, that $\omega_2 \nrightarrow (\omega_2)^2_2$, $\omega_3 \nrightarrow (\omega_3)^2_2$, and in general $\kappa^+ \nrightarrow (\kappa^+)^2_2$ for every infinite cardinal $\kappa$. Also, $\kappa \nrightarrow (\kappa)^2_2$ whenever $\kappa$ is a singular cardinal.

Definition 11.6. A cardinal $\kappa$ is weakly compact iff $\kappa$ is uncountable and $\kappa \to (\kappa)^2_3$.

The question naturally arises then whether there is any weakly compact cardinal at all.

Fact 11.7. (ZFC) If $\kappa$ is weakly compact, then $V_\kappa \models \mathrm{ZFC}$.

It follows from the above fact, together with the 2nd Incompleteness Theorem, that not only can we not prove in ZFC that there are weakly compact cardinals, but we cannot even prove

$$\mathrm{Con}(\mathrm{ZFC}) \longrightarrow \mathrm{Con}(\mathrm{ZFC} + \text{There is a weakly compact cardinal})$$

unless we can prove that ZFC is inconsistent! On the other hand, any proof of $\neg\mathrm{Con}(\mathrm{ZFC} + \text{There is a weakly compact cardinal})$ would radically sever the large cardinal hierarchy as we know it. In particular, it is extremely unlikely that there is such a proof.

12. Some countable combinatorics

We will briefly look at countable linear orders and countable graphs.

12.1. Countable linear orders. We start out with the following observation: Every finite linear order is order-isomorphic to a natural number. In fact, for every $n \in \omega$, every linear order of cardinality $n$ is order-isomorphic to $(n, \in)$. This can be easily proved by induction on $n$. Hence, the theory of finite linear orders is simply the theory of finite ordinals. What about countable linear orders?

Certainly not all countable linear orders are order-isomorphic to ordinals (i.e., are well-orders). For example, $\mathbb{Q}$, the set of the rational numbers, with the usual order relation, is not. In particular, not all countable linear orders are order-isomorphic ($\mathbb{Q}$ is not order-isomorphic to $\omega$). Let us look for some feature of $\mathbb{Q}$ such that we can prove that all countable linear orders with that feature are order-isomorphic.

The first thing that comes to mind is perhaps the property of density:

Definition 12.1. A linear order $(L, \leq)$ is dense iff for all $x, y \in L$ such that $x < y$ there is some $z \in L$ such that $x < z < y$.

Thus, $\mathbb{Q}$ is dense whereas $\omega$ is not.

Are all countable dense linear orders order-isomorphic? No: $\mathbb{Q}_{\geq 0} = \{q \in \mathbb{Q} : q \geq 0\}$ is also dense, but $\mathbb{Q}_{\geq 0}$ has a minimum, namely 0, whereas $\mathbb{Q}$ does not. Hence $\mathbb{Q}_{\geq 0}$ and $\mathbb{Q}$ cannot be order-isomorphic. Let us try with a further property:

Definition 12.2. A linear order $(L, \leq)$ is without end-points iff for every $x \in L$ there are $y$ and $z$ in $L$ such that $y < x$ and $x < z$.50

50 In other words, a linear order is without end-points if it has neither minimum nor maximum.

Are all countable dense linear orders without end-points order-isomorphic? The answer to this question, as we see next, is yes. It follows that all countable dense linear orders without end-points are order-isomorphic to $\mathbb{Q}$ since $\mathbb{Q}$ is such a linear order.

Theorem 12.3. (ZF) (Cantor) All countable dense linear orders without end-points are order-isomorphic.

Proof. Let $(L_0, \leq_0)$ and $(L_1, \leq_1)$ be countable dense linear orders without end-points. Let $(a_n)_{n<\omega}$ and $(b_n)_{n<\omega}$ be one-to-one enumerations of $L_0$ and $L_1$, respectively. We are going to build a sequence $(f_n)_{n<\omega}$ of finite partial order-isomorphisms between $(L_0, \leq_0)$ and $(L_1, \leq_1)$, i.e., each $f_n \subseteq L_0 \times L_1$ will be a one-to-one order-preserving function between $\mathrm{dom}(f_n)$ (with the restriction of $\leq_0$ to $\mathrm{dom}(f_n)$) and $\mathrm{range}(f_n)$ (with the restriction of $\leq_1$ to $\mathrm{range}(f_n)$). Moreover, we will ensure that $f_n \subseteq f_{n+1}$ for all $n$ and that $\mathrm{dom}(\bigcup_{n<\omega} f_n) = L_0$ and $\mathrm{range}(\bigcup_{n<\omega} f_n) = L_1$. It will then follow that $f = \bigcup_{n<\omega} f_n$ is an order-isomorphism between $(L_0, \leq_0)$ and $(L_1, \leq_1)$. The construction of $(f_n)_{n<\omega}$ will take place, together with the construction of auxiliary sequences $(c_n)_{n<\omega}$ and $(d_n)_{n<\omega}$, by a back-and-forth argument.

To start with, we let $c_0 = a_0$ and $d_0 = b_0$ and let $f_0 = \{(a_0, d_0)\}$. Let us look at whether $b_1 \in L_1$ is such that $b_1 <_1 d_0$ or $d_0 <_1 b_1$. In the first case we pick $c_1 <_0 a_0$ (which exists since $(L_0, \leq_0)$ does not have a minimum) and set $f_1 = f_0 \cup \{(c_1, b_1)\}$. In the second case we pick $c_1$ such that $a_0 <_0 c_1$ (which exists since $(L_0, \leq_0)$ does not have a maximum) and again set $f_1 = f_0 \cup \{(c_1, b_1)\}$. Then $f_1$ is a one-to-one order-preserving function extending $f_0$. Now we look at $a_1 \in L_0$. If $a_1$ happens to be $c_1$, then we set $d_1 = b_1$. If $a_1 <_0 c_0$ and $a_1 <_0 c_1$, then pick $d_1 \in L_1$ such that $d_1 <_1 d_0$ and $d_1 <_1 b_1$. This $d_1$ can be found again since $L_1$ does not have a minimum. If $c_0 <_0 a_1$ and $c_1 <_0 a_1$, then pick $d_1 \in L_1$ such that $d_0 <_1 d_1$ and $b_1 <_1 d_1$. This $d_1$ can be found since $L_1$ does not have a maximum. Finally, if either $c_0 <_0 a_1 <_0 c_1$ or $c_1 <_0 a_1 <_0 c_0$, then pick $d_1 \in L_1$ strictly $<_1$-between $d_0$ and $b_1$. This time $d_1$ can be found since $(L_1, \leq_1)$ is dense. In all cases we set $f_2 = f_1 \cup \{(a_1, d_1)\}$. Then $f_2$ extends $f_1$ and again is order-preserving. Now we look at $b_2 \in L_1$ and pick $c_2 \in L_0$ in such a way that $f_3 = f_2 \cup \{(c_2, b_2)\}$ is again an order-preserving map.

In general, if $f_{2n}$ has been defined, we pick $c_{n+1} \in L_0$ in such a way that $f_{2n+1} = f_{2n} \cup \{(c_{n+1}, b_{n+1})\}$ is an order-preserving map, and if $f_{2n+1}$ has been defined, then we pick $d_{n+1} \in L_1$ in such a way that $f_{2n+2} = f_{2n+1} \cup \{(a_{n+1}, d_{n+1})\}$ is an order-preserving map. In all cases, $d_n$ (resp. $c_n$) can be found given that $(L_1, \leq_1)$ (resp. $(L_0, \leq_0)$) is a dense linear order without end-points. By proceeding in this back-and-forth manner we ensure that $f = \bigcup_n f_n$ is such that $\mathrm{dom}(f) = L_0$ and $\mathrm{range}(f) = L_1$. Since each $f_n$ was order-preserving, it then follows that $f$ is an order-isomorphism between $(L_0, \leq_0)$ and $(L_1, \leq_1)$. □

Definition 12.4. Given a class $\Gamma$ of linear orders, we say that a linear order $(L^*, \leq^*)$ is universal for $\Gamma$ iff for every linear order $(L, \leq)$ in $\Gamma$ there is a one-to-one order-preserving map $f : (L, \leq) \to (L^*, \leq^*)$. In particular, $(L^*, \leq^*)$ is universal for countable linear orders iff for every countable linear order $(L, \leq)$ there is a one-to-one order-preserving function $f : (L, \leq) \to (L^*, \leq^*)$.

Thus, a linear order $(L^*, \leq^*)$ being universal for a class $\Gamma$ of linear orders amounts to $(L^*, \leq^*)$ containing a copy of every linear order in $\Gamma$.

The following is a consequence of the proof of Theorem 12.3.

Corollary 12.5. Q is universal for countable linear orders.

Exercise 12.1. Prove Corollary 12.5

The relevance of the above corollary is that there is a linear order, namely $\mathbb{Q}$, which is 'small' (it is countable) and which nevertheless contains a copy of every countable linear order.
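The back-and-forth construction from the proof of Theorem 12.3 can be run for finitely many stages on two concrete orders. The Python sketch below is an illustration under assumptions not made in the notes: it takes $L_0 = \mathbb{Q}$ and $L_1$ the dyadic rationals, uses two particular enumerations (`rationals` and `dyadics`, names made up here), and always picks the first element of the enumeration that fits, whereas the proof allows any fitting element.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd

def rationals():
    """A one-to-one enumeration of Q: 0, then +-p/q in lowest terms, ordered by p + q."""
    yield Fraction(0)
    for s in count(2):
        for p in range(1, s):
            if gcd(p, s - p) == 1:
                yield Fraction(p, s - p)
                yield Fraction(-p, s - p)

def dyadics():
    """A one-to-one enumeration of the dyadic rationals +-k / 2^j (k odd, or j = 0)."""
    yield Fraction(0)
    for s in count(1):
        for j in range(s):
            k = s - j
            if j == 0 or k % 2 == 1:
                yield Fraction(k, 2 ** j)
                yield Fraction(-k, 2 ** j)

def back_and_forth(enum0, enum1, stages):
    """Finite partial isomorphism (a dict from L0 to L1) whose domain contains the
    first `stages` elements of enum0 and whose range contains those of enum1."""
    a = list(islice(enum0(), stages))
    b = list(islice(enum1(), stages))
    f = {}
    for n in range(stages):
        # "Back": put b[n] into the range, if it is not there yet.
        if b[n] not in f.values():
            c = next(c for c in enum0()
                     if all(c != c2 and (c < c2) == (b[n] < d2) for c2, d2 in f.items()))
            f[c] = b[n]
        # "Forth": put a[n] into the domain, if it is not there yet.
        if a[n] not in f:
            d = next(d for d in enum1()
                     if all(d != d2 and (a[n] < c2) == (d < d2) for c2, d2 in f.items()))
            f[a[n]] = d
    return f

f = back_and_forth(rationals, dyadics, 12)
```

Each search with `next` terminates because both orders are dense and without end-points, which is exactly the point at which the proof uses these hypotheses. Dropping the "back" step gives only an order-preserving embedding of $L_0$ into $L_1$.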

12.2. Countable graphs. For us, graphs will consist of a certain baseset V (the set of vertices of the graph) together with a set E of 2–membered subsets of V (the edges of the graph):

Definition 12.6. A graph is an ordered pair $G = (V, E)$, where $V$ is a set and $E \subseteq [V]^2$.

We often write $v E v'$ to indicate that $\{v, v'\} \in E$. Of course, $v E v'$ if and only if $v' E v$.

The corresponding notion of structure–preserving function is givenin the following definition.

Definition 12.7. Given graphs $G_0 = (V_0, E_0)$, $G_1 = (V_1, E_1)$, a function $f : V_0 \to V_1$ is a graph homomorphism iff $f$ is one-to-one and for all distinct $v, v' \in V_0$, $v E_0 v'$ if and only if $f(v) E_1 f(v')$. And a graph homomorphism $f$ is a graph isomorphism iff $f$ is also a bijection between the corresponding sets of vertices.

The following definition is also natural.

Definition 12.8. Given a graph $G = (V, E)$ and $V' \subseteq V$, the subgraph of $G$ induced by $V'$ is $(V', E')$, where $E' = E \cap [V']^2$ (in other words, given two vertices $v_0, v_1$ in $V'$, $v_0 E' v_1$ if and only if $v_0 E v_1$).

Graphs can be more complex than linear orders. In fact, already finite graphs are much richer than finite linear orders. For example, we have seen that, up to order-isomorphism, there is only one linear order with exactly 5 members. On the other hand, there are certainly many different non-isomorphic graphs with exactly 5 vertices.
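To make the contrast concrete, the number of graphs on a small vertex set up to isomorphism can be counted by brute force. The Python sketch below is an illustration only (the function name is made up here): it enumerates all edge sets on $\{0, \dots, n-1\}$ and marks, for each new edge set, all of its images under permutations of the vertices.

```python
from itertools import combinations, permutations

def count_graphs_up_to_iso(n):
    """Number of isomorphism classes of graphs with vertex set range(n)."""
    pairs = list(combinations(range(n), 2))
    perms = list(permutations(range(n)))
    seen = set()
    classes = 0
    for mask in range(1 << len(pairs)):
        edges = frozenset(p for i, p in enumerate(pairs) if (mask >> i) & 1)
        if edges in seen:
            continue  # isomorphic to a graph counted earlier
        classes += 1
        # Mark the whole isomorphism class of this graph as seen.
        for pi in perms:
            seen.add(frozenset(tuple(sorted((pi[u], pi[v]))) for u, v in edges))
    return classes
```

For $n = 5$ this returns 34, against a single linear order with 5 members.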

We will be focusing on the following remarkable graph.

Definition 12.9. A countable graph $G = (V, E)$ is a random graph (also known as a Rado graph) iff for any two disjoint finite sets $F_0, F_1 \subseteq V$ there is some $v^* \in V$, $v^* \notin F_0 \cup F_1$, such that

(1) $\{v^*, v\} \in E$ for all $v \in F_0$ and
(2) $\{v^*, v\} \notin E$ for all $v \in F_1$.

Random graphs can be characterized in the following way.

Proposition 12.10. The following are equivalent for any graph $G = (V, E)$.

(1) $G$ is a random graph.
(2) For every finite graph $H = (V', E')$, every $v \in V'$ and every graph homomorphism $f : (V' \setminus \{v\}, E' \cap [V' \setminus \{v\}]^2) \to G$,51 $f$ can be extended to a graph homomorphism $g : H \to G$.

Exercise 12.2. Prove Proposition 12.10.

Does there exist a Rado graph? As we will see next, the answer is Yes.

Theorem 12.11. (ZF) Given $i < j < \omega$, let $\{i, j\} \in E$ iff the $i$-th digit in the binary expression for $j$ is 1 (in other words, if $j = \sum_{k<N} \epsilon_k 2^k$ for some $N < \omega$ and $\epsilon_k \in \{0, 1\}$ for all $k < N$, then $\epsilon_i = 1$). Then $G = (\omega, E)$ is a Rado graph.

Proof. Let $F_0$ and $F_1$ be finite subsets of $\omega$ such that $F_0 \cap F_1 = \emptyset$. We need to find some $n \in \omega$, $n \notin F_0 \cup F_1$, such that $\{v, n\} \in E$ for all $v \in F_0$ and $\{v, n\} \notin E$ for all $v \in F_1$. Let $N < \omega$ be large enough (e.g., $N > v$ for every $v \in F_0 \cup F_1$). Let $n = (\sum_{v \in F_0} 2^v) + 2^N$. Then

(1) $n > v$ for every $v \in F_0 \cup F_1$ (by the choice of $N$),
(2) for every $v \in F_0$, the $v$-th digit in the binary expansion of $n$ is 1, and
(3) for every $v \in F_1$, the $v$-th digit in the binary expansion of $n$ is 0.

It follows from (1)–(3) that $\{v, n\} \in E$ for all $v \in F_0$ and $\{v, n\} \notin E$ for all $v \in F_1$. □
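The edge relation of Theorem 12.11 and the witness $n$ from its proof are easy to compute. The following Python sketch is an illustration only; the names `adjacent` and `witness` are made up here, and $N$ is taken to be one more than the largest element of $F_0 \cup F_1$, which is one admissible choice of a "large enough" $N$.

```python
def adjacent(i, j):
    """Edge relation of Theorem 12.11 on natural numbers: for i < j, {i, j} is
    an edge iff the i-th binary digit of j is 1."""
    if i == j:
        return False
    i, j = min(i, j), max(i, j)
    return (j >> i) & 1 == 1

def witness(F0, F1):
    """The vertex n from the proof: adjacent to every element of F0 and to no
    element of F1. Assumes F0 and F1 are disjoint finite sets of naturals."""
    assert not set(F0) & set(F1)
    N = max(set(F0) | set(F1), default=0) + 1  # N > v for every v in F0 and F1
    return sum(2 ** v for v in F0) + 2 ** N

n = witness({0, 2}, {1, 3})  # 2^0 + 2^2 + 2^4 = 21, binary 10101
```

The extra summand $2^N$ only serves to push $n$ above all elements of $F_0 \cup F_1$, so that the digits of $n$ (rather than the digits of the elements of $F_0 \cup F_1$) decide adjacency.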

Next we prove the most remarkable property of random graphs. As we will see, the proof is very similar to the proof that any two countable dense linear orders without end-points are order-isomorphic.

Theorem 12.12. (ZF)

(1) Any two Rado graphs are graph-isomorphic.
(2) Given any countable graph $H$ and any Rado graph $G$, there is a graph homomorphism $f : H \to G$.

Proof. We sketch the proof of (1) (the proof of (2) is similar and in fact simpler). Let $G_0 = (V_0, E_0)$ and $G_1 = (V_1, E_1)$ be Rado graphs and let $(a_n)_{n<\omega}$ and $(b_n)_{n<\omega}$ be enumerations of $V_0$ and $V_1$, respectively. We build a $\subseteq$-increasing sequence $(f_n)_{n<\omega}$ of finite functions, $f_n \subseteq V_0 \times V_1$, such that each $f_n$ is a graph-isomorphism between the subgraph of $G_0$ induced by $\mathrm{dom}(f_n)$ and the subgraph of $G_1$ induced by $\mathrm{range}(f_n)$. We make sure that, letting $f = \bigcup_{n<\omega} f_n$, $\mathrm{dom}(f) = V_0$ and $\mathrm{range}(f) = V_1$ ($f$ will then be a graph-isomorphism between $G_0$ and $G_1$). We build $(f_n)_{n<\omega}$, together with auxiliary sequences $(c_n)_{n<\omega}$ and $(d_n)_{n<\omega}$ of, respectively, members of $V_0$ and members of $V_1$, by means of a natural back-and-forth construction.

To start with we may set $c_0 = a_0$, $d_0 = b_0$ and $f_0 = \{(c_0, d_0)\}$. Now, since $G_0$ is a Rado graph, we may find $c_1 \in V_0$, $c_1 \neq c_0$, such that $c_1 E_0 c_0$ if and only if $b_1 E_1 b_0$. Let $f_1 = f_0 \cup \{(c_1, b_1)\}$. Then we look at $a_1$. If $a_1 = c_1$, then we let $d_1 = b_1$. If $a_1 \neq c_1$, then we may find $d_1 \in V_1$ such that $d_0 E_1 d_1$ if and only if $c_0 E_0 a_1$ and $b_1 E_1 d_1$ if and only if $c_1 E_0 a_1$. This $d_1$ can be found, by a suitable choice of $F_0, F_1 \subseteq \{d_0, b_1\}$ in the definition of random graph, since $G_1$ is such a graph. We proceed in this way for $\omega$ many stages, taking care at alternate stages of the members of $V_0$ and of the members of $V_1$. Each $f_n$ is by construction a function as required and in the end $f = \bigcup_n f_n$ is such that $\mathrm{dom}(f) = V_0$ and $\mathrm{range}(f) = V_1$, and therefore $f$ is a graph-isomorphism between $G_0$ and $G_1$. □

51 $(V' \setminus \{v\}, E' \cap [V' \setminus \{v\}]^2)$ is of course the subgraph of $H$ induced by $V' \setminus \{v\}$.

One often talks of the random (or Rado) graph. The reason is of course that, by Theorem 12.11 and Theorem 12.12 (1), there is exactly one such graph modulo graph-isomorphism. Theorem 12.12 (2) says of course that the random graph is universal for the class of countable graphs, where the notion of being a universal graph for a class $\Gamma$ of graphs is defined in the natural way; in other words, the random graph contains a copy (as an induced subgraph) of every countable graph whatsoever.

Here it is important to note that universal objects are by no means unique. Certainly if $G$ is a universal graph for some class $\Gamma$ of graphs, then any 'larger' graph, i.e., any graph $G'$ such that $G$ is an induced subgraph of $G'$, is of course also universal for $\Gamma$. Exactly the same observation applies of course in the case of universal linear orders as in the previous subsection.

David Aspero, School of Mathematics, University of East Anglia, Norwich NR4 7TJ, UK

E-mail address: [email protected]
