
De Gruyter Textbook

Kheyfits · A Primer in Combinatorics

Alexander Kheyfits

A Primer in Combinatorics

De Gruyter

Mathematics Subject Classification 2010: Primary: 05-01; Secondary: 97K, 62H30, 91C20.

ISBN 978-3-11-022673-7
e-ISBN 978-3-11-022674-4

Library of Congress Cataloging-in-Publication Data

Kheyfits, Alexander.
A primer in combinatorics / by Alexander Kheyfits.
p. cm.
Includes bibliographical references and index.
ISBN 978-3-11-022673-7 (alk. paper)
1. Combinatorial analysis -- Textbooks. 2. Graph theory -- Textbooks. I. Title.
QA164.K48 2010
511'.6--dc22
2010011761

Bibliographic information published by the Deutsche Nationalbibliothek

The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.d-nb.de.

© 2010 Walter de Gruyter GmbH & Co. KG, 10785 Berlin/New York

Typesetting: Da-TeX Gerd Blumenstein, Leipzig, www.da-tex.de
Printing and binding: AZ Druck und Datentechnik GmbH, Kempten
∞ Printed on acid-free paper

Printed in Germany

www.degruyter.com

Preface

Combinatorial Analysis, or Combinatorics for short, deals with enumerative problems where one must answer the question “How many?” or “In how many ways?”. Other problems are concerned with the existence of certain combinatorial objects subject to various constraints. These kinds of problems are considered in this book.

Combinatorial problems, methods and graphical models are abundant in many areas ranging from engineering and financial science to humanitarian disciplines like sociology, psychology, medicine and social sciences, not to mention mathematics and computer science. As parts of discrete mathematics, combinatorics and graph theory have become indispensable parts of introductory and advanced mathematical training for everyone dealing not only with quantitative but also with qualitative data.

Moreover, combinatorics and graph theory have a remarkable and uncommon feature: to begin their study, one needs no background but elementary algebra and common sense. Even simple combinatorial problems often lead to interesting, sometimes difficult questions and allow an instructor to introduce various important mathematical ideas and concepts and to show the nature of mathematical reasoning and proof. These qualities make combinatorics and graph theory an excellent choice for an introductory mathematical class for students of any age, level and major.

This is a text for a one-semester course in combinatorics with elements of graph theory. It can be used in two modes. The first three chapters cover introductory material and can be (and have actually been) used for an undergraduate class in combinatorics and/or discrete mathematics, as well as for a problem-solving seminar aimed at undergraduate and even motivated high-school students.

Chapters 4 and 5 are at a more advanced level, and the whole book includes enough material for an entry-level graduate course in combinatorics. For the mathematically inclined reader, the material has been developed systematically and includes all the proofs. After this book, the reader can study more advanced courses, e.g. [1, 8, 9, 19, 45]. At the same time, the reader who is primarily interested in applying combinatorial methods can skip (most of) the proofs and concentrate on problems and methods of their solution.

In Chapter 1 we introduce basic combinatorial concepts, such as the Sum and Product Rules, combinations, permutations, and arrangements with and without repetition. Various particular elementary methods of solving combinatorial problems are also considered throughout the book, such as, for instance, the trajectory method in Section 1.4 or Ferrers diagrams in Section 4.4. In Section 1.6 we apply the methods of Sections 1.1–1.5 to develop the elementary probability theory for random experiments with finite sample spaces. Our goal in this section is not to give a systematic exposition of probability theory, but rather to show some meaningful applications of the combinatorial methods developed earlier.

Chapter 2 contains an introduction to graph theory. After setting up the basic vocabulary in Sections 2.1–2.2, in the next three sections we study properties of trees, Eulerian and planar graphs, and some problems of graph coloring and graphical enumeration. Many other graph theory problems appear in Chapters 3–5. As an application of the methods developed in Chapters 1–2, in Chapter 3 we give an elementary introduction to hierarchical clustering algorithms. This topic has likely never appeared in textbooks before.

Chapter 4 is devoted to more advanced methods of enumerative combinatorics. Sections 4.1–4.2 cover inversion formulas, including the Möbius inversion, and the Principle of Inclusion-Exclusion. The method of generating functions is developed in Section 4.3. Generating functions are introduced as analytical objects, the sums of converging power series. In Section 4.4 we consider several applications of the method of generating functions, in particular partitions and compositions of integer numbers and linear recurrence relations (difference equations) with constant coefficients. The Pólya–Redfield enumeration theory is considered in Section 4.5.

The last chapter of the book is concerned with combinatorial existence problems. The Ramsey theorem and its applications are considered in Section 5.1. The Dirichlet (pigeonhole) principle follows immediately. Section 5.2 treats Hall’s theorem on systems of distinct representatives (the marriage problem) and some of its equivalent statements, namely, König’s theorem on zero-one matrices and Dilworth’s theorem on chains in partially ordered sets. An example of an extremal combinatorial problem (the assignment problem) is also considered here. Section 5.3 contains an introduction to the theory of balanced block designs. We consider only recursive methods of construction of block designs since deep algebraic results are beyond the scope of this book. Finally, Section 5.4 is devoted to systems of triples, concluding with the proof, due to Hilton [27], of the necessary and sufficient conditions for the existence of Steiner triple systems.

The author’s credo in teaching mathematics involves advancing from examples and model problems to theory and then back to problem solving. This approach works especially well in combinatorics. Every section of the book starts with simple model problems. Discussing and solving these problems, we derive the basic concepts and definitions. Then, we study essential properties of the concepts developed and again solve problems to illustrate the ideas, methods, and their applications. In particular, some parts of proofs are left as problems to be solved by the reader. Studying the solutions of typical problems in the book, the reader can quickly grasp the methods of solving various combinatorial problems and apply these methods to a range of similar problems in any subject. Thus the book can be used as a self-study guide by the reader interested in solving combinatorial problems.


More than 800 problems constitute an integral part of the text. Many problems are drawn from the literature, some are folklore, and some may be original. Many problems are solved in the text; scores of other problems and exercises, marked by EP, are at the end of each section. Solutions, answers or hints to selected problems and exercises are given at the end of the book.

Additional problems can be found in the books cited in the list of references, specifically, in [10, 11, 26, 33, 34, 47]. Interesting topics for further reading and individual projects can be found in [4].

Combinatorial problems often provide natural intuitive motivation and models for important mathematical ideas and concepts, such as operations on sets, various classes of functions, classes of binary relations, and many others. Primary combinatorial concepts, such as permutations, combinations and the like, can be naturally defined in terms of set theory operations and functions. In the text, we systematically use this approach, which can be traced (at least) as far back as C. Berge’s monograph [7]. Not to mention its conciseness and theoretical merits, this set-theory based approach is often advantageous in problem solving, and we demonstrate this in the text using many examples. This approach removes the ambiguity that is often present in combinatorial problems, especially when different objects must be identified, and significantly reduces the number of student errors.

The author’s experience shows that freshmen usually master this approach with ease and successfully apply it to problem solving. For readers unfamiliar with the language and basics of set theory, Section 1.1 systematically develops some standard terminology, which is used in the following sections. The reader familiar with naive set theory can skip Section 1.1 and refer back to it as needed.

Very few non-elementary concepts are included in the text. No concept beyond the precalculus level appears before Section 4.3. Two calculus-level concepts, those of derivatives of elementary functions and of converging series, appear in Section 4.3 on generating functions. From this point on the book can be subtitled “Combinatorics through the eyes of an analyst”. Even the notion of a converging series can be eliminated and replaced by the finitary concept of generating polynomials, that is, truncated power series, and we solve a few problems to demonstrate the method. This approach makes the method of generating functions accessible to readers without any calculus background at all, though calculations become more tedious.

It should be noted that these days many college students take at least one calculus class, but afterwards they see no actual application of calculus. Therefore, some nontrivial examples of applications of calculus ideas and methods are appropriate. The same can be said of the few elementary algebraic concepts (groups, rings) appearing in Chapters 4 and 5.

The book is self-contained; all the concepts and definitions used are defined and explained by examples. The detailed Index includes references to important groups of problems and specific methods of their solution, such as “coloring problems” or “method of generating functions”. Throughout the text, we use several abbreviations: GF stands for generating function(s), EGF for exponential generating function(s), SDR for system(s) of distinct representatives, and BIBD for balanced incomplete block design(s). Theorems, lemmas, problems, etc., have three-digit numbering; thus, Problem 1.2.3 refers to the third problem in the text of Section 1.2 of Chapter 1, while EP 1.2.3 means the third problem in Exercises and Problems 1.2 at the end of Section 1.2. Figures have two-digit numbering; thus Fig. 2.3 refers to the third figure in Chapter 2. A special symbol indicates the end of a proof of a statement or of a solution of a problem.

Combinatorial problems and graphical models have been studied by many outstanding scientists for thousands of years. The web site www.degruyter.com/primer-in-combinatorics contains many interesting links describing the history of these developments and the lives of the people involved. The coffee cup icon indicates that there is information available at the web site. This site also provides the author’s email address, to which the reader can send comments, remarks and corrections on the book.

Chapter 3 is a revised version of Module 03-1 in the DIMACS series of educational modules, written when the author participated in the Reconnect 1998 and Reconnect 1999 conferences at the DIMACS Center at Rutgers University of New Jersey. The author is grateful to the DIMACS Center, its Director Professor Fred Roberts and Professor Melvin Janowitz for their hospitality and the kind permission to include Module 03-1 in this text, and to Professor Catherine McGeoch for her generous help.

It is finally the author’s great pleasure to thank Simon Albroscheit, Robert Plato, Friederike Dittberner and the staff of De Gruyter for their friendly and highly professional handling of the whole publishing process.

New York, February 2010 Alexander Kheyfits

Contents

Preface

I Introductory Combinatorics and Graph Theory

1 Basic Counting
1.1 Combinatorics of Finite Sets
1.2 The Sum and Product Rules
1.3 Arrangements and Permutations
1.4 Combinations
1.5 Permutations with Identified Elements
1.6 Probability Theory on Finite Sets

2 Basic Graph Theory
2.1 Vocabulary
2.2 Connectivity in Graphs
2.3 Trees
2.4 Eulerian Graphs
2.5 Planarity

3 Hierarchical Clustering and Graphs
3.1 Introduction
3.2 Model Example
3.3 Hubert’s Single-Link Algorithm
3.4 Hubert’s Complete-Link Algorithm
3.5 Case Study

II Combinatorial Analysis

4 Enumerative Combinatorics
4.1 The Inclusion-Exclusion Principle
4.2 Inversion Formulas
4.3 Generating Functions I. Introduction
4.4 Generating Functions II. Applications
4.5 Enumeration of Equivalence Classes

5 Existence Theorems in Combinatorics
5.1 Ramsey’s Theorem
5.2 Systems of Distinct Representatives
5.3 Block Designs
5.4 Systems of Triples

Answers to Selected Problems

Bibliography

Index

Part I

Introductory Combinatorics and Graph Theory

Chapter 1

Basic Counting

In this chapter we introduce some basic concepts of enumerative combinatorics. In Section 1.1 we prepare the language: we discuss the Axiom of Mathematical Induction, operations on sets, binary relations, and important classes of functions (mappings). This language of sets and mappings is systematically used in Sections 1.2–1.5 to introduce the sum and product rules, arrangements, permutations, and combinations with and without repetition. As an important application of the methods developed, in Section 1.6 we consider some basic notions of probability theory in the case of finite sample spaces.

1.1 Combinatorics of Finite Sets

In this introductory section we review a few fundamental set-theory notions and calculate cardinalities of basic set theory objects: the unions, intersections, and Cartesian products of sets. All proofs are based on the Principle of Mathematical Induction.

Cantor’s biography · Deduction & Induction · Nicomachus’ biography · Al-Kashi’s biography · Abel’s biography · Stirling’s approximation for factorials · Factorials · Boole’s biography · DIMACS center · Hypsicles’ biography

Throughout we mostly deal with finite sets; thus we accept a naive point of view: we do not introduce axioms of set theory and do not distinguish sets, classes, etc. Any collection of different elements is called a set and is denoted by braces, $\{x_1, x_2, \ldots\}$, where $x_1, x_2, \ldots$ are the elements of this set. A set $X$ can also be introduced by the defining property of its elements, that is, the property $P$ such that every element of $X$ has this property, but no other element possesses it. In this case we write $X = \{x \mid P(x)\}$. If $x$ is an element of a set $X$, we write $x \in X$, otherwise $x \notin X$. A set that contains no element is called the empty set and is denoted by $\emptyset$. A detailed exposition of the naive set theory can be found, for example, in [22].

Example 1.1.1. The set $D = \{d \mid d \text{ is a Hindu-Arabic digit}\}$ consists of ten elements $0, 1, 2, \ldots, 9$; thus, $0 \in D$, $5 \in D$, but $10 \notin D$. The set $T = \{1, 2, 3\}$ consists of the first three positive integer numbers; $3 \in T$ but $0 \notin T$.


It is important that sets are unordered collections, that is, $\{a, b\} = \{b, a\}$, $\{a, b, c\} = \{b, a, c\} = \{a, c, b\}$, and similar statements hold true for any number of elements. Moreover, a set cannot contain repeating elements, that is, $\{a, b, a\} = \{a, b\}$.

Thus, sets are primary, undefined objects. Another major undefined object is the set of all natural numbers¹ $\mathbb{N} = \{1, 2, \ldots, n, \ldots\}$. The set of the first $n$ natural numbers is denoted by $\mathbb{N}_n = \{1, 2, \ldots, n\}$ and for brevity is called a natural segment, or more specifically, a natural $n$-segment; thus, $\mathbb{N}_3 = \{1, 2, 3\}$ and $\mathbb{N}_1 = \{1\}$. The whole numbers $\mathbb{W}$ include all natural numbers and zero, that is, $\mathbb{W} = \{0, 1, 2, \ldots, n, \ldots\}$. The set of all integer numbers, positive, negative, and zero, is denoted by $\mathbb{Z}$. Denote also $\mathbb{Z}_p = \{0, 1, \ldots, p - 1\}$ for any natural $p$; in particular, $\mathbb{Z}_2 = \{0, 1\}$. The set of all real numbers is denoted by $\mathbb{R}$, and $\mathbb{R}_+$ stands for the set of all nonnegative (including 0) real numbers.

We notice that the two words “for all” often appear in mathematical texts. As an abbreviation for this expression, a special symbol $\forall$ is used, called the universal quantifier. Thus, a sentence

“A property $P(x)$ holds true for all the elements of a set $X$”

can be shortened to

$$(\forall x \in X)\ P(x).$$

Likewise, the symbol $\exists$, called the existential quantifier, serves as an abbreviation for the expression “there exists”. For example, the expression

$$(\exists x \in X)\ P(x)$$

means that there exists at least one element in the set $X$ that possesses the property $P$. These expressions are often shortened to $\forall x\, P(x)$, $\exists x\, P(x)$, if it is clear what set $X$ is referred to.

Definition 1.1.2. A set $X$ is called a subset of a set $Y$ if $x \in Y$ whenever $x \in X$, that is,² $\forall x\, (x \in X \Rightarrow x \in Y)$; this is denoted by $X \subseteq Y$.

Example 1.1.3. 1) $\mathbb{N}_1 \subseteq \mathbb{N}_3$ but not vice versa.

2) $\mathbb{N} \subseteq \mathbb{W} \subseteq \mathbb{Z} \subseteq \mathbb{R}$.

Basic combinatorial concepts are defined in this text in terms of functions (mappings) and equivalence classes. The concept of a mapping itself is also a primary, undefined notion; the following paragraph is not a mathematical definition; rather, it is an intuitive description of mappings (functions).

¹ It is customary in computer science and mathematical logic to treat 0 as a natural number, that is, to define $\mathbb{N} = \{0, 1, 2, \ldots\}$. For our goals, however, it is more convenient to assume that $0 \notin \mathbb{N}$.

² In definitions, $\Rightarrow$ stands for the implication “if ... then”.


Let $X$ and $Y$ be two sets; if to each element $x \in X$ there corresponds a uniquely defined element $y \in Y$, denoted by $y = f(x)$, then it is said that a mapping (or a function or a transformation) $f$ is given with the domain $X = \operatorname{dom}(f)$ and the codomain $Y = \operatorname{codom}(f)$; it is denoted by $f \colon X \to Y$ or $x \overset{f}{\longmapsto} y$.

Now we can define the new concepts in terms of mappings.

Definition 1.1.4. Given a map $f \colon X \to Y$ and an element $x \in X$, the element $y = f(x) \in Y$ is called the image of the element $x$ with respect to the mapping $f$; in turn, $x$ is called a preimage or an inverse image of $y$. Denote the total preimage of an element $y \in Y$, that is, the set of all its preimages, by

$$f^{-1}(y) = \{x \in X \mid f(x) = y\}.$$

The set of all images is called the range of a function (mapping) $f$ and is denoted by $\operatorname{Ran}(f)$ or $f(X)$; thus,

$$\operatorname{Ran}(f) = \{y \in Y \mid \exists x \in X \text{ such that } f(x) = y\};$$

in particular, $\operatorname{Ran}(f) \subseteq Y$.

Definition 1.1.5. Two mappings, $f \colon X \to Y$ and $g \colon X_1 \to Y_1$, are called equal if³ $X = X_1$, $Y = Y_1$, and $f(x) = g(x)$, $\forall x \in X = X_1$.

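Example 1.1.6. Consider the mappings

$$f \colon \mathbb{R} \to \mathbb{R}, \quad g \colon \mathbb{R}_+ \to \mathbb{R}, \quad h \colon \mathbb{R} \to \mathbb{R}_+, \quad k \colon \mathbb{R}_+ \to \mathbb{R}_+,$$

all four given by the same formula, $f(x) = x^2$, $g(x) = x^2$, $h(x) = x^2$, $k(x) = x^2$, but with different domains or codomains. These four mappings are pairwise different.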

Definition 1.1.7. 1) A mapping $f \colon X \to Y$ is called injective (or univalent) if no element of $Y$ has more than one preimage.

2) A mapping $f \colon X \to Y$ is called onto or surjective if each element of $Y$ has at least one preimage.

3) A mapping $f \colon X \to Y$ is called bijective or a one-to-one correspondence if it is both injective and surjective.
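For mappings between finite sets, the three properties in Definition 1.1.7 can be tested directly. The following Python sketch is an illustration only: it assumes that a mapping is stored as a dictionary sending each element of the domain to its image, and the helper names `is_injective`, `is_surjective`, `is_bijective`, as well as the sample mapping, are hypothetical and not part of the text.

```python
def is_injective(f: dict) -> bool:
    """No element of the codomain has more than one preimage."""
    images = list(f.values())
    return len(images) == len(set(images))


def is_surjective(f: dict, codomain: set) -> bool:
    """Every element of the codomain has at least one preimage."""
    return set(codomain) <= set(f.values())


def is_bijective(f: dict, codomain: set) -> bool:
    """Both injective and surjective."""
    return is_injective(f) and is_surjective(f, codomain)


# Hypothetical sample mapping from {1, 2, 3} to {"a", "b", "c", "d"}.
f = {1: "a", 2: "b", 3: "c"}
Y = {"a", "b", "c", "d"}

print(is_injective(f))      # True: the three images are different
print(is_surjective(f, Y))  # False: "d" has no preimage
print(is_bijective(f, Y))   # False
```

Note that surjectivity depends on the declared codomain, in agreement with Definition 1.1.5: the same dictionary is surjective onto `{"a", "b", "c"}` but not onto `Y`.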

Problem 1.1.1. Which mappings in Example 1.1.6 are injective? Surjective? Bijective? Neither?

³ In definitions “if” always means “if and only if”.


Definition 1.1.8. A set $X$ is called finite if it is the empty set $\emptyset$ or if it can be put in a one-to-one correspondence with a set $\mathbb{N}_k$ for some $k = 1, 2, \ldots$; the quantity $k \in \mathbb{N}$ is called the number of elements or the cardinality of $X$ and is denoted by $|X| = k$. Otherwise, the set is called infinite. We set $|\emptyset| = 0$ by definition. The set of natural numbers $\mathbb{N}$, as well as any set that can be put in a one-to-one correspondence with $\mathbb{N}$, is called countable.

Problem 1.1.2. Are natural segments finite? Explain why the set of natural numbers $\mathbb{N}$ is infinite. Prove that the sets of even positive integers $\{2, 4, 6, \ldots\}$, odd positive integers $\{1, 3, 5, \ldots\}$, and prime numbers $\{2, 3, 5, 7, 11, \ldots\}$ are countable.

Problem 1.1.3. Prove that the set of integers $\mathbb{Z}$ is infinite. Is it countable? Explain why the set of real numbers $\mathbb{R}$ is infinite. Is it countable?

To introduce our next topic, the Axiom of Mathematical Induction, we first discuss an example. We want to find an explicit formula for the sum

$$1 + 3 + 5 + \cdots + (2n - 1)$$

of $n$ consecutive odd numbers, which is valid for every $n = 1, 2, 3, \ldots$. To guess the formula, we consider the three sums, $1 + 3 = 4$, $1 + 3 + 5 = 9$, $1 + 3 + 5 + 7 = 16$. We notice that all these sums are squares of integer numbers: if $n = 2$ then $4 = 2^2$, for $n = 3$, $9 = 3^2$, and for $n = 4$, $16 = 4^2$. It is natural now to guess that all such sums are squares. We can check a few more cases, for example, $1 + 3 + 5 + 7 + 9 = 5^2$, $1 + 3 + 5 + 7 + 9 + 11 = 6^2$, $1 + 3 + 5 + 7 + 9 + 11 + 13 = 7^2$; the shortest sum, comprising one addend, $1 = 1^2$, also supports the guess.
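The numerical experiment above can be repeated mechanically for many more values of $n$. The Python sketch below is only an illustration; the function names `odd_sum` and `first_counterexample` and the bound 1000 are assumptions made here. It supports the guess for finitely many cases, but it is not a proof.

```python
def odd_sum(n: int) -> int:
    """Return 1 + 3 + ... + (2n - 1), the sum of the first n odd numbers."""
    return sum(2 * k - 1 for k in range(1, n + 1))


def first_counterexample(limit: int):
    """Return the smallest n <= limit with odd_sum(n) != n**2, or None."""
    for n in range(1, limit + 1):
        if odd_sum(n) != n * n:
            return n
    return None


print([odd_sum(n) for n in range(1, 8)])  # [1, 4, 9, 16, 25, 36, 49]
print(first_counterexample(1000))         # None: no counterexample up to 1000
```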

Thus, we claim that the equation

$$1 + 3 + \cdots + (2n - 1) = n^2$$

holds true for all natural $n = 1, 2, 3, \ldots$. We have checked this equation for several values of $n$; however, by no means can we verify infinitely many numerical equations for infinitely many natural numbers. Therefore, we have to develop a new method capable of solving similar problems, that is, problems involving a parameter that can take on infinitely many integer values. Thus, this method must reflect some fundamental properties of the infinite set of natural numbers.

This method is called the Principle (or the Axiom or the Postulate) of mathematical induction.

The Axiom of Mathematical Induction

Consider a set of statements or formulas $S_{n_1}, S_{n_1+1}, S_{n_1+2}, \ldots$, numbered by all integer numbers $n \ge n_1$. Usually $n_1 = 1$ or $n_1 = 0$, but it can be any integer number.


1) Firstly, suppose the statement $S_{n_1}$, called the basis step of induction, is valid. In applications of the method of mathematical induction the verification of $S_{n_1}$ is an independent problem. This step may sometimes be trivial, but it cannot be skipped.

2) Secondly, suppose that for each integer $n \ge n_1$ we can prove the conditional statement $S_n \Rightarrow S_{n+1}$, that is, we can prove the validity of $S_{n+1}$ for each specified $n \ge n_1$ assuming the validity of $S_n$, and this conditional statement is valid for all $n \ge n_1$. This part of the method is called the inductive step. The statement $S_n$ is called the inductive hypothesis or inductive assumption.

3) If we can independently show these two steps, then the principle of mathematical induction claims that all of the statements $S_n$, for all $n \ge n_1$, are valid.

This method of proof is accepted as an axiom, because nobody can actually verify infinitely many statements $S_n$, $n \ge n_1$; the method cannot be justified without using some other, maybe even less obvious, properties of the set of natural numbers. Mathematicians have been using this principle for centuries and never arrived at a contradiction. Therefore, we accept the method of mathematical induction without a proof, as a postulate, and believe that this principle properly reflects certain fundamental properties of the infinite set $\mathbb{N}$ of natural numbers. We will apply the method (Axiom) of mathematical induction many times in the subsequent chapters. The method will be employed in each proof in this chapter; however, sometimes it presents itself only implicitly, through some known results that have already been proved by using mathematical induction. In the following problem we give a detailed example of an application of the method of mathematical induction.

Problem 1.1.4. Show that for every natural $n$, $n = 1, 2, 3, \ldots$,

$$1^2 + 2^2 + \cdots + n^2 = \frac{1}{6}\, n(n + 1)(2n + 1).$$

Solution. Here $n_1 = 1$ and $S_n$ stands for the equation above; thus, $S_1$ denotes the equation $1^2 = \frac{1}{6} \cdot 1 \cdot (1 + 1)(2 \cdot 1 + 1)$, which is certainly true; $S_2$ denotes $1^2 + 2^2 = \frac{1}{6} \cdot 2 \cdot (2 + 1)(2 \cdot 2 + 1)$, which is true as well; $S_3$ is also a valid statement, $1^2 + 2^2 + 3^2 = \frac{1}{6} \cdot 3 \cdot (3 + 1)(2 \cdot 3 + 1)$. Therefore, we have the basis of induction (of course, it was enough to verify only one statement $S_1$) and we have to validate the inductive step.

To do that, we have to prove $S_{n+1}$, assuming that $S_n$ is valid for some unspecified but fixed natural $n = n'$. In this problem we must prove the statement (the equation) $S_{n'+1}$, which reads

$$S_{n'+1}\colon\quad 1^2 + 2^2 + \cdots + (n')^2 + (n' + 1)^2 = \frac{1}{6}\,(n' + 1)(n' + 2)(2n' + 3),$$

assuming that $S_{n'}$ is valid, that is, using the equation

$$S_{n'}\colon\quad 1^2 + 2^2 + \cdots + (n')^2 = \frac{1}{6}\, n'(n' + 1)(2n' + 1)$$

as if it were correct. Its validity in general is not yet known; however, in the procedure we suppose it to be true. It is worth repeating that our reasoning must be valid for any natural number $n$, that is, the reasoning can use only properties common to all natural numbers. For instance, we cannot assume that $n$ is an odd number.

To simplify notation, in the sequel we drop the prime and write $n$ in all formulas, assuming it to be fixed. We observe that the left-hand side of

$$S_{n+1}\colon\quad 1^2 + 2^2 + \cdots + n^2 + (n + 1)^2 = \frac{1}{6}\,(n + 1)(n + 2)(2n + 3)$$

contains the left-hand side of $S_n$, and the latter in our inductive reasoning is considered to be known. This observation gives us the idea of the proof. Since we assume that the equation

$$S_n\colon\quad 1^2 + 2^2 + \cdots + n^2 = \frac{1}{6}\, n(n + 1)(2n + 1)$$

holds true, we employ $S_n$ to transform the left-hand side of $S_{n+1}$ as follows:

$$\left(1^2 + 2^2 + \cdots + n^2\right) + (n + 1)^2 = \frac{1}{6}\, n(n + 1)(2n + 1) + (n + 1)^2 = \frac{1}{6}\,(n + 1)\left(2n^2 + 7n + 6\right) = \frac{1}{6}\,(n + 1)(n + 2)(2n + 3).$$

Thus we have derived the statement $S_{n+1}$ from $S_n$ for an arbitrary fixed natural $n$. Since we have completed both steps of the principle of mathematical induction, we claim that $S_n$ is valid for all natural $n$.

Next we introduce a useful notation. In the preceding problem we had to deal with sums with variable limits. To simplify many formulas, it is convenient to use the summation or sigma notation. The sum $a_1 + a_2 + \cdots + a_n$ is denoted by $\sum_{k=1}^{k=n} a_k$. Here $k$ is called the summation index, and 1 and $n$ are the lower and upper limits of summation. Usually we simplify the upper index and write the sums as $\sum_{k=1}^{n} a_k$.

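For example, $\sum_{k=1}^{n} \frac{1}{k}$ means

$$\sum_{k=1}^{n} \frac{1}{k} = \frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{n}.$$

If $n = 1$, this is just

$$\sum_{k=1}^{1} \frac{1}{k} = \frac{1}{1};$$

if $n = 2$, that becomes

$$\sum_{k=1}^{2} \frac{1}{k} = \frac{1}{1} + \frac{1}{2};$$

if $n = 3$, that means

$$\sum_{k=1}^{3} \frac{1}{k} = \frac{1}{1} + \frac{1}{2} + \frac{1}{3}.$$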

Using the sigma notation, Problem 1.1.4 can be stated as

$$\sum_{k=1}^{n} k^2 = \frac{1}{6}\, n(n + 1)(2n + 1), \quad n = 1, 2, 3, \ldots.$$
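The sigma notation corresponds closely to a loop over the summation index. As a minimal sketch (the function names `harmonic` and `sum_of_squares` are chosen here for illustration and do not come from the text), the sums above can be computed in Python with exact fractions:

```python
from fractions import Fraction


def harmonic(n: int) -> Fraction:
    """Return the sum of 1/k for k = 1, ..., n as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


def sum_of_squares(n: int) -> int:
    """Return the sum of k**2 for k = 1, ..., n."""
    return sum(k * k for k in range(1, n + 1))


for n in (1, 2, 3):
    print(n, harmonic(n))  # prints 1 1, then 2 3/2, then 3 11/6

# Spot check of the formula of Problem 1.1.4 for n = 10.
print(sum_of_squares(10), 10 * 11 * 21 // 6)  # 385 385
```

Here `range(1, n + 1)` plays the role of the limits $k = 1$ and $k = n$; the upper limit is included, as in the sigma notation.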

The summation index is often called a dummy index, for it can be replaced by any character that collides with no other indeterminate in the formula. For example, we can write

$$\sum_{k=1}^{n} \frac{1}{k} = \sum_{l=1}^{n} \frac{1}{l} = \sum_{i=1}^{n} \frac{1}{i},$$

but it is ambiguous to write $\sum_{n=1}^{n} \frac{1}{n}$.

Similarly to the summation notation, we can abbreviate any other operation with several operands. For instance, the product $a_1 \cdot a_2 \cdots a_n$ can be written as

$$\prod_{k=1}^{n} a_k = a_1 \cdot a_2 \cdots a_n;$$

for example, $\prod_{k=1}^{4} k^2 = 576$.

The following problem shows some useful properties of the sigma notation. At the end of this section the reader finds other problems concerning this symbol.

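Problem 1.1.5. Prove that

1) $\sum_{k=m}^{n} (-a_k) = -\sum_{k=m}^{n} a_k$

2) $\sum_{k=m}^{n} (a_k + b_k) = \sum_{k=m}^{n} a_k + \sum_{k=m}^{n} b_k$

3) $\sum_{k=m}^{n} (b a_k) = b \sum_{k=m}^{n} a_k$ for any constant $b$

4) $\sum_{k=m}^{n} a_{k+l} = \sum_{k=m+l}^{n+l} a_k$.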
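Before proving these four properties, it may help to see them on a concrete instance. The Python sketch below is a numerical sanity check, not a proof; the helper `sigma`, the sample sequences `a` and `b`, and the parameter values are all hypothetical choices made for this illustration.

```python
def sigma(term, m: int, n: int):
    """Return term(m) + term(m + 1) + ... + term(n); an empty range gives 0."""
    return sum(term(k) for k in range(m, n + 1))


# Hypothetical sample sequences a_k and b_k.
def a(k: int) -> int:
    return k * k + 1


def b(k: int) -> int:
    return 3 * k - 2


m, n, shift, c = 2, 7, 3, 5  # limits, index shift l, and constant factor

checks = {
    "1) negation": sigma(lambda k: -a(k), m, n) == -sigma(a, m, n),
    "2) additivity": sigma(lambda k: a(k) + b(k), m, n) == sigma(a, m, n) + sigma(b, m, n),
    "3) constant factor": sigma(lambda k: c * a(k), m, n) == c * sigma(a, m, n),
    "4) index shift": sigma(lambda k: a(k + shift), m, n) == sigma(a, m + shift, n + shift),
}
for name, holds in checks.items():
    print(name, holds)  # each line ends with True
```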

Problem 1.1.6. Prove by mathematical induction that for every natural $n$,

$$1 + 2 + 3 + \cdots + n = \frac{n(n + 1)}{2},$$

or, using the summation notation, $\sum_{k=1}^{n} k = \frac{n(n+1)}{2}$.


As another example, we consider an ancient Greek problem.

Problem 1.1.7 (Nicomachus). Partition all odd numbers into groups consisting of $1, 2, 3, \ldots, n, \ldots$ consecutive odd numbers, namely,

$$N = \{1\} \cup \{3, 5\} \cup \{7, 9, 11\} \cup \{13, 15, 17, 19\} \cup \cdots.$$

If we add up the numbers within each group, we discover (cf. the discussion after Problem 1.1.3) the equations $3 + 5 = 8 = 2^3$, $7 + 9 + 11 = 27 = 3^3$, $13 + 15 + 17 + 19 = 64 = 4^3$, etc.; certainly $1 = 1^3$. Show that this is a general pattern, that is, demonstrate that the sum of odd numbers in the $n$th group is $n^3$ for any natural $n$.

Solution. It is convenient here to denote odd numbers by $2k - 1$, $k = 1, 2, 3, \ldots$. Since the $n$th group contains $n$ numbers, the preceding $n - 1$ groups altogether contain, by Problem 1.1.6, $1 + 2 + \cdots + (n - 1) = \frac{1}{2}(n - 1)n$ odd numbers. Thus the problem reduces to proving the equation

$$\sum_{k=k_1}^{k_2} (2k - 1) = n^3,$$

where we must determine the indices $k_1$ and $k_2$ so that $2k_1 - 1$ is the smallest odd number in the $n$th group and $2k_2 - 1$ is the largest one.

We notice that in the equation $m = 2k - 1$ the number $k$ means the “serial number” of the odd number $m$ in the sequence of all odd numbers. Indeed,

if $1 = 2k - 1$, then $k = 1$, that is, 1 is the first odd number;

if $3 = 2k - 1$, then $k = 2$, which means 3 is the second odd number;

if $5 = 2k - 1$, then $k = 3$, and 5 is the third odd number, etc.

Therefore, since the first $n - 1$ groups contain the first $(n - 1)n/2$ odd numbers, the first odd number in the $n$th group is the $\left(\frac{(n-1)n}{2} + 1\right)$st odd number, which implies $k_1 = \frac{(n-1)n}{2} + 1$. By the same token, $k_2 = \frac{n(n+1)}{2}$; thus to solve the problem we have to prove that

$$\sum_{k=\frac{(n-1)n}{2}+1}^{\frac{n(n+1)}{2}} (2k - 1) = n^3.$$

It is not hard to prove this by straightforward mathematical induction, but it is simpler to use the properties mentioned in Problem 1.1.5 and to transform the sum on the left side as

$$2 \sum_{k=\frac{(n-1)n}{2}+1}^{\frac{n(n+1)}{2}} k \;-\; \sum_{k=\frac{(n-1)n}{2}+1}^{\frac{n(n+1)}{2}} 1 = 2 \left\{ \sum_{k=1}^{\frac{n(n+1)}{2}} k - \sum_{k=1}^{\frac{(n-1)n}{2}} k \right\} - \left( \frac{n(n + 1)}{2} - \frac{(n - 1)n}{2} \right),$$

apply Problem 1.1.6 twice to the sums in braces, and simplify the resulting expression.
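The grouping and the index bounds $k_1$, $k_2$ found in the solution can be checked numerically. In the Python sketch below, the function names `odd_groups` and `group_sum_by_index` are hypothetical; the first builds the groups directly, and the second sums $2k - 1$ between the limits $k_1 = \frac{(n-1)n}{2} + 1$ and $k_2 = \frac{n(n+1)}{2}$. This is an illustration for small $n$, not a replacement for the proof.

```python
from itertools import count, islice


def odd_groups(num_groups: int) -> list:
    """Split 1, 3, 5, ... into consecutive groups of sizes 1, 2, ..., num_groups."""
    odds = count(1, 2)  # the infinite sequence 1, 3, 5, ...
    return [list(islice(odds, size)) for size in range(1, num_groups + 1)]


def group_sum_by_index(n: int) -> int:
    """Sum 2k - 1 for k from k1 = (n - 1)n/2 + 1 to k2 = n(n + 1)/2."""
    k1 = (n - 1) * n // 2 + 1
    k2 = n * (n + 1) // 2
    return sum(2 * k - 1 for k in range(k1, k2 + 1))


for n, group in enumerate(odd_groups(4), start=1):
    # Columns: n, the nth group, its sum, n cubed, the sum via k1 and k2.
    print(n, group, sum(group), n ** 3, group_sum_by_index(n))
```

For $n = 4$ the loop prints the group `[13, 15, 17, 19]` with both sums equal to 64.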

Problem 1.1.8. Where in the proof did we use the mathematical induction?

Mathematical induction is a very powerful method of proof; however, sometimes we can find an easier approach.

Problem 1.1.9. Show that for every natural $n$

$$\frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \cdots + \frac{1}{n \cdot (n + 1)} = \frac{n}{n + 1}.$$

Give two proofs, by mathematical induction and by making use of telescoping sums. Which method is simpler in this problem?

Solution. We do only the second proof. A sum $\sum_{k=1}^{2n} a_k$ is said to be telescoping if $a_3 = -a_2$, $a_5 = -a_4$, $\ldots$, $a_{2n-1} = -a_{2n-2}$; thus all the addends but the first and the last one cancel out and the sum is $a_1 + a_{2n}$. We remark that

$$\frac{1}{k(k + 1)} = \frac{1}{k} - \frac{1}{k + 1};$$

thus the sum in the problem is telescoping and we get

$$\frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \cdots + \frac{1}{n \cdot (n + 1)} = 1 - \frac{1}{n + 1} = \frac{n}{n + 1},$$

thus proving the claim.
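The closed form can be confirmed for particular values of $n$ with exact rational arithmetic. The following Python sketch is illustrative only; the name `partial_sum` is an assumption made here.

```python
from fractions import Fraction


def partial_sum(n: int) -> Fraction:
    """Return 1/(1*2) + 1/(2*3) + ... + 1/(n*(n+1)) as an exact fraction."""
    return sum(Fraction(1, k * (k + 1)) for k in range(1, n + 1))


for n in (1, 2, 3, 10):
    print(n, partial_sum(n), Fraction(n, n + 1))  # the two fractions agree
```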

Remark 1.1.9. Did we really avoid the mathematical induction?

It is essential that neither the first nor the second step in an inductive proof can be omitted. For instance, consider the polynomial $P(x) = x^2 + x + 41$. Computing $P(1) = 43$, $P(2) = 47$, $P(3) = 53$, we observe that all these values are prime numbers; $P(0) = 41$ is also prime. A reasonable hypothesis springs up that the value $P(n)$ is prime for any whole $n$. Such a conclusion is called incomplete induction, since it is based on a finite set of observations and has not been confirmed by the inductive step. Without this validation the incomplete induction can lead to false conclusions. Indeed, if we continue the numerical experiment with the polynomial above, we discover that all numbers $P(0), \dots, P(39)$ are prime, but $P(40) = 1681 = 41^2$ is a composite number, thus invalidating our guess.
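The numerical experiment described above is easy to repeat. The sketch below uses a naive trial-division primality test; the names `is_prime` and `P` are chosen here for illustration only.

```python
def is_prime(m):
    """Trial division; adequate for the small values used here."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def P(n):
    return n * n + n + 41

# Smallest whole n for which P(n) fails to be prime.
first_composite = next(n for n in range(0, 100) if not is_prime(P(n)))
print(first_composite, P(first_composite))
```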

The following result may look simple, although it is fundamental in solving combinatorial problems. It is this property that underlies, for instance, the following trivial fact: if 25 students attend a class, and there are only 24 chairs in the classroom, then either two students will have to share a chair or one student will have to stand. The latter is obvious, but there are many non-obvious problems where this result is useful. Even if this theorem is not mentioned explicitly, it is present in any enumerative problem. We prove it only for finite sets.

Theorem 1.1.10. For any finite sets $X$ and $Y$,

1) $|X| \le |Y|$ if and only if there exists an injective mapping $f: X \to Y$

2) $|X| \ge |Y|$ if and only if there exists a surjective mapping $f: X \to Y$

3) $|X| = |Y|$ if and only if there exists a bijective mapping $f: X \to Y$.

Proof. Denote $X = \{x_1, \dots, x_n\}$ and $Y = \{y_1, \dots, y_m\}$. If there exists an injective mapping $f: X \to Y$ such that $f(x_i) = y_{j_i}$, $1 \le i \le n$, then all images $y_{j_i}$, $1 \le i \le n$, must be different, for $f$ is injective; thus, there are at least as many $y_j$'s as $x_i$'s, that is, $n \le m$. On the other hand, if $n = |X| \le m = |Y|$, we can straightforwardly construct a required injective mapping $f: X \to Y$, for instance as $f(x_i) = y_i$, $1 \le i \le n$, which proves part 1) of the theorem. Part 2) can be proved likewise and part 3) follows from parts 1) and 2).

Problem 1.1.10. To facilitate memorization of telephone numbers, they can be expressed as certain combinations of digits and relevant words; for example, it is easier to remember 1-800-333-TOLL than 1-800-333-8655. To this end, the dialing keys on telephone handsets are marked by both digits and letters. What relationship between the set of digits $\{0, 1, \dots, 9\}$ and the English alphabet allows us to use this approach?

Next we introduce operations on sets. Working on any problem involving sets, we always assume, explicitly or implicitly, that all sets under consideration are subsets of a certain ambient totality, called the universal set $U$. This is our universe and nothing exists in the problem beyond $U$. This remark is important when we compute the complement of a set.

Definition 1.1.11. 1) If $\{X_l\}_{l=1}^{\omega}$ is a family of sets $X_l$, then the collection of all elements $x$ belonging to at least one of the sets $X_l$, $l = 1, 2, \dots$, is called the union of the sets $X_l$ and is denoted by

$$\bigcup_{l=1}^{\omega} X_l = \{x \mid \exists\, l \ge 1,\ x \in X_l\}.$$

2) The collection of all elements $x$ belonging to each one of the sets $X_l$, $l = 1, 2, \dots, \omega$, is called the intersection of the sets $X_l$ and is denoted by

$$\bigcap_{l=1}^{\omega} X_l = \{x \mid x \in X_l,\ \forall\, l \ge 1\}.$$

If $X \cap Y = \emptyset$, the sets $X$ and $Y$ are called disjoint.

3) The difference of sets $X$ and $Y$, denoted by $X \setminus Y$, is the set of all those elements of $X$ which do not belong to $Y$,

$$X \setminus Y = \{x \mid x \in X \text{ and } x \notin Y\}.$$

4) The complement of a set $X$, denoted by $\overline{X}$ or $X^c$, is the set of all the elements of the universal set $U$ that do not belong to $X$,

$$\overline{X} = \{x \in U \mid x \notin X\};$$

it is obvious that $\overline{X} = U \setminus X$.

Problem 1.1.11. Find $\mathbb{N}_1 \cup \mathbb{N}_3$, $\mathbb{N}_1 \cap \mathbb{N}_3$, $\mathbb{N}_1 \setminus \mathbb{N}_3$, $\mathbb{N}_3 \setminus \mathbb{N}_1$; the natural segments $\mathbb{N}_l$ were introduced at the beginning of this section.

The following problem lists important properties of these operations; some of them are similar to the familiar properties of the addition and multiplication of numbers. These properties are valid not only for two, but for any finite collection of sets.

Problem 1.1.12. 1) The union and intersection of sets are commutative,

$$X \cup Y = Y \cup X$$

$$X \cap Y = Y \cap X$$

and associative operations,

$$X \cup (Y \cup Z) = (X \cup Y) \cup Z = X \cup Y \cup Z$$

$$X \cap (Y \cap Z) = (X \cap Y) \cap Z = X \cap Y \cap Z;$$

they satisfy two distributive laws,

$$X \cup (Y_1 \cap Y_2) = (X \cup Y_1) \cap (X \cup Y_2)$$

$$X \cap (Y_1 \cup Y_2) = (X \cap Y_1) \cup (X \cap Y_2).$$

2) The complement is connected with the union and intersection by the de Morgan laws,

$$\overline{X \cap Y} = \overline{X} \cup \overline{Y},$$

$$\overline{X \cup Y} = \overline{X} \cap \overline{Y}.$$
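Before proving the laws of Problem 1.1.12, one can test them exhaustively on a small universal set. The following Python sketch assumes a four-element universe `U`; the helpers `subsets`, `complement` and `laws_hold` are illustrative names. Passing such a test does not replace the proof.

```python
from itertools import combinations

U = frozenset(range(4))  # a small universal set, chosen only for the experiment

def subsets(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def complement(x):
    """Complement with respect to the universal set U."""
    return U - x

def laws_hold(x, y, z):
    """Both distributive laws and both de Morgan laws for given subsets."""
    return (
        x | (y & z) == (x | y) & (x | z)
        and x & (y | z) == (x & y) | (x & z)
        and complement(x & y) == complement(x) | complement(y)
        and complement(x | y) == complement(x) & complement(y)
    )

all_subsets = subsets(U)
all_ok = all(laws_hold(x, y, z)
             for x in all_subsets for y in all_subsets for z in all_subsets)
print(all_ok)
```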

The properties we have already considered are useful in many enumerative problems. For instance, the definition of the union of two sets directly implies the following statement.

Lemma 1.1.12. If $X$ and $Y$ are finite disjoint sets, that is, $|X| < \infty$, $|Y| < \infty$, and $X \cap Y = \emptyset$, then $|X \cup Y| = |X| + |Y|$.

By the Axiom of Mathematical Induction, this lemma immediately extends to any finite collection of sets.

Lemma 1.1.13. If $|X_i| < \infty$, $i = 1, 2, \dots, m$, and $X_i \cap X_j = \emptyset$ for all $1 \le i, j \le m$, $i \ne j$, then $|X_1 \cup X_2 \cup \cdots \cup X_m| = |X_1| + |X_2| + \cdots + |X_m|$.

We will often use the following notion.

Definition 1.1.14. It is said that nonempty and mutually disjoint sets

$$X_\alpha,\ X_\beta,\ X_\gamma,\ \dots$$

make a partition of a set $X$, if $X = X_\alpha \cup X_\beta \cup X_\gamma \cup \cdots$, where the order of sets is immaterial. The number of all partitions of an $n$-element set is called the Bell number $B_n$, see Problem 1.1.19.

Example 1.1.15. Thus, the set $\mathbb{Z}_e = \{\dots, -4, -2, 0, 2, \dots\}$ of all even numbers including zero, and the set $\mathbb{Z}_o = \{\dots, -3, -1, 1, 3, \dots\}$ of all odd numbers form a partition of the set of integers, $\mathbb{Z} = \mathbb{Z}_e \cup \mathbb{Z}_o$ and $\mathbb{Z}_e \cap \mathbb{Z}_o = \emptyset$, while $\mathbb{Z}'_e = \{\dots, -4, -2, 2, \dots\}$ and $\mathbb{Z}_o = \{\dots, -3, -1, 1, 3, \dots\}$ do not.

Problem 1.1.13. Find a set $\mathbb{Z}''$ such that $\{\mathbb{Z}'_e, \mathbb{Z}_o, \mathbb{Z}''\}$ is a partition of $\mathbb{Z}$; the set $\mathbb{Z}'_e$ was introduced in Example 1.1.15.

Problem 1.1.14. Prove that the total preimages of all the elements in the range of any mapping make a partition of the domain of this mapping.

The result of this problem implies immediately


Lemma 1.1.16. If $X$ and $Y$ are finite sets and $f: X \to Y$ is a surjective mapping, then

$$|X| = \sum_{y \in Y} |f^{-1}(y)|.$$

In particular, if all total preimages have the same cardinality $n_0$, then

$$|X| = n_0 |Y|.$$

Example 1.1.17. Let a mapping $f: X \to Y$, where $X = \{-3, -2, -1, 0, 1, 2, 3\}$ and $Y = \{0, 1, 4, 9\}$, be given by $f(x) = x^2$. Then $f^{-1}(\{0\}) = \{0\}$, $f^{-1}(\{1\}) = \{-1, 1\}$, $f^{-1}(\{4\}) = \{-2, 2\}$, $f^{-1}(\{9\}) = \{-3, 3\}$. Here

$$|X| = \sum_{y \in Y} |f^{-1}(y)| = 7.$$

If $f_1: X_1 \to Y_1$, $f_1(x) = x^2$, where $X_1 = \{-3, -2, -1, 1, 2, 3\}$ and $Y_1 = \{1, 4, 9\}$, then $|Y_1| = 3$, $n_0 = 2$ and $|X_1| = 6 = 2 \cdot 3$.
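The computation in Example 1.1.17 can be reproduced by grouping the elements of the domain according to their images. In the Python sketch below, the function name `preimages` and the dictionary representation of total preimages are assumptions made for illustration.

```python
from collections import defaultdict

def preimages(f, domain):
    """Map each value y in the range of f to its total preimage in the domain."""
    classes = defaultdict(set)
    for x in domain:
        classes[f(x)].add(x)
    return dict(classes)

def square(x):
    return x * x

fibers = preimages(square, range(-3, 4))                  # X = {-3, ..., 3}
fibers1 = preimages(square, [-3, -2, -1, 1, 2, 3])        # X1, without 0

print(sum(len(part) for part in fibers.values()))         # |X| as a sum over Y
print(len(fibers1), {len(part) for part in fibers1.values()})
```

The total preimages are pairwise disjoint and cover the domain, which is the content of Problem 1.1.14; the sketch only displays this for one mapping.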

In many problems we have to distinguish ordered and unordered totalities. The latter are sets and as such, are denoted by braces, $\{a\}$, $\{a, b\} = \{b, a\}$, $\dots$. However, unlike the two-element set $\{a, b\}$, ordered pairs, denoted by parentheses, $(a, b)$, are characterized by the profound property

$$(a, b) = (b, a) \text{ if and only if } a = b,$$

and a definition must preserve this property. Such a definition can be given in terms of mappings.

Definition 1.1.18. An ordered pair with the first element $a$ and the second element $b$ is a mapping $f: \{1, 2\} \to \{a, b\}$, where $a = f(1)$ and $b = f(2)$. This pair is denoted by $(a, b)$.

The next definition introduces a useful mathematical model dealing with ordered totalities.

Definition 1.1.19. Given two sets $X$ and $Y$, the set of all ordered pairs $(x, y)$ with $x \in X$, $y \in Y$ is called the Cartesian or direct product of these sets in this order and is denoted by

$$X \times Y = \{(x, y) \mid x \in X \text{ and } y \in Y\}.$$

Problem 1.1.15. Compute $\mathbb{N}_1 \times \mathbb{N}_3$, $\mathbb{N}_3 \times \mathbb{N}_1$, $\mathbb{N}_2 \times \mathbb{N}_3$, $\mathbb{N}_3 \times \mathbb{N}_2$, and find the cardinal numbers of these sets.


An ordered totality of $n$ elements $a_1, a_2, \dots, a_n$ is called an $n$-tuple or $n$-vector and is denoted by $(a_1, a_2, \dots, a_n)$; thus, 2-tuples are ordered pairs. To avoid confusion with unordered sets, ordered totalities are denoted by parentheses.

Problem 1.1.16. Define $n$-tuples in terms of mappings. Give a definition of the Cartesian product of three or more sets.

In many problems it is necessary to consider not the entire Cartesian product but only its subsets.

Definition 1.1.20. Given two sets $X$ and $Y$, any subset $\rho$ of their Cartesian product $X \times Y$ is called a binary relation between elements of $X$ and $Y$. If $Y = X$, that is, $\rho \subset X \times X$, $\rho$ is called a (binary) relation on the set $X$.

Example 1.1.21. For instance, if $X \times Y = \mathbb{N} \times \mathbb{N}$, we can consider $\rho_0 = \emptyset$, or $\rho_1 = \{(1, 1), (1, 2)\}$, or $\rho_2 = \{(1, 3)\}$, or $\rho_3 = \{(3, 1)\}$; it is worth repeating that $\rho_2 \ne \rho_3$. We say that $1 \in X = \mathbb{N}$ is in the relation $\rho_2$ with $3 \in Y = \mathbb{N}$ but not vice versa, that is, $3 \in Y = \mathbb{N}$ is not in the relation $\rho_2$ with $1 \in X = \mathbb{N}$.

Problem 1.1.17. How many binary relations exist between the natural segments $\mathbb{N}_1$ and $\mathbb{N}_3$?

Definition 1.1.20 of binary relations is very general. In applications we are usually interested in more specific classes of binary relations. In the following definitions we consider only relations on a set $X$.

Definition 1.1.22. 1) A binary relation $\rho \subset X \times X$ is called reflexive if each element of $X$ is in this relation with itself, that is,

$$(\forall x \in X)\,((x, x) \in \rho).$$

2) A binary relation $\rho \subset X \times X$ is called symmetric if for all $x, y \in X$, the element $y$ is in the relation $\rho$ with $x$ whenever the element $x$ is in the relation $\rho$ with $y$, that is,

$$(\forall x, y \in X)\,((x, y) \in \rho \Rightarrow (y, x) \in \rho).$$

3) A binary relation $\rho \subset X \times X$ is called transitive if for all $x, y, z \in X$, the element $x$ is in the relation $\rho$ with $z$ whenever the element $x$ is in the relation $\rho$ with $y$ and $y$ is in this relation with $z$, that is,

$$(\forall x, y, z \in X)\,(((x, y) \in \rho \ \&\ (y, z) \in \rho) \Rightarrow ((x, z) \in \rho)).$$

4) A binary relation $\rho \subset X \times X$ is called antisymmetric if for all $x, y \in X$, the elements $x$ and $y$ cannot simultaneously be in the relation $\rho$ with one another unless $x = y$, that is,

$$(\forall x, y \in X)\,(((x, y) \in \rho \ \&\ (y, x) \in \rho) \Rightarrow (y = x)).$$
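For a relation on a finite set, the four properties of Definition 1.1.22 can be tested directly from the quantified formulas. The sketch below represents a relation as a Python set of ordered pairs; this representation and the function names are assumptions made for illustration, and the two sample relations (divisibility and equal parity on $\{1, 2, 3, 4\}$) are not taken from the text.

```python
def is_reflexive(rel, X):
    return all((x, x) in rel for x in X)

def is_symmetric(rel):
    return all((y, x) in rel for (x, y) in rel)

def is_transitive(rel):
    return all((x, w) in rel
               for (x, y) in rel for (z, w) in rel if y == z)

def is_antisymmetric(rel):
    return all(x == y for (x, y) in rel if (y, x) in rel)

X = {1, 2, 3, 4}
divides = {(a, b) for a in X for b in X if b % a == 0}
same_parity = {(a, b) for a in X for b in X if (a - b) % 2 == 0}

print(is_reflexive(divides, X), is_symmetric(divides),
      is_transitive(divides), is_antisymmetric(divides))
print(is_reflexive(same_parity, X), is_symmetric(same_parity),
      is_transitive(same_parity), is_antisymmetric(same_parity))
```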


An important class of binary relations is introduced in the following definition.

Definition 1.1.23. A reflexive, symmetric, and transitive binary relation $\rho \subset X \times X$ is called an equivalence relation on the set $X$. If $\rho \subset X \times X$ is an equivalence relation on $X$ and an ordered pair $(x, y) \in \rho$, then the elements $x$ and $y$ are called equivalent (with respect to $\rho$); this equivalence is denoted by $x \overset{\rho}{\sim} y$ or simply $x \sim y$.

If $\rho$ is an equivalence relation on $X$, then a subset $X_0 \subset X$ consisting of all pairwise equivalent elements of $X$ is called an equivalence class.

The family of all equivalence classes with respect to an equivalence relation $\rho$ on a set $X$ is called the factor-set of $X$ with respect to this equivalence relation $\rho$ and is denoted by $X/\rho$ or $X/\!\sim$. Examples of equivalence relations are considered at the end of this section.

Problem 1.1.18. Prove that any equivalence class is nonempty, any two different equivalence classes are disjoint, and the union of all the equivalence classes with respect to an equivalence relation on a set $X$ is equal to $X$. Thus, the equivalence classes make up a partition of $X$.

The converse assertion is also true.

Problem 1.1.19. Prove that any partition of a set generates an equivalence relation on this set, such that the factor-set of this equivalence relation is precisely the family of all the parts of the partition. Therefore, there is a one-to-one correspondence between the partitions of a set and the equivalence relations on the set, and the number of the equivalence relations on an $n$-set $X$ is equal to the Bell number $B_n$, see Definition 1.1.14.
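The correspondence between partitions and equivalence relations can be explored on small sets. The following Python sketch lists all partitions of a finite set recursively and turns each one into the relation "belong to the same part"; the names `partitions` and `relation_from_partition` are illustrative assumptions, and the experiment does not replace the proof requested in Problem 1.1.19.

```python
def partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    items = list(items)
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        # Put `first` into one of the existing blocks ...
        for i, block in enumerate(smaller):
            yield smaller[:i] + [[first] + block] + smaller[i + 1:]
        # ... or into a new block of its own.
        yield [[first]] + smaller

def relation_from_partition(blocks):
    """Pairs (x, y) such that x and y lie in the same block."""
    return frozenset((x, y) for block in blocks for x in block for y in block)

# Numbers of partitions of n-element sets for n = 0, 1, 2, 3, 4.
bell = [sum(1 for _ in partitions(range(n))) for n in range(5)]
# Different partitions give different relations.
relations = {relation_from_partition(p) for p in partitions(range(4))}
print(bell, len(relations))
```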

The following class of binary relations also often occurs in applications.

Definition 1.1.24. A reflexive, antisymmetric, and transitive binary relation $\rho \subset X \times X$ is called a relation of partial order or just a partial order on the set $X$. If $\rho$ is a partial order and $(x, y) \in \rho$, then we write $x \preceq y$. If $x \preceq y$ or $y \preceq x$, the elements $x$ and $y$ are called comparable (with respect to the order $\rho$). A set with a relation of partial order on it is called a partially ordered set (poset). If any two elements of a poset are comparable, that is, either $x \preceq y$ or $y \preceq x$ for all $x, y \in X$, then the set is called a chain or a linearly (sometimes totally) ordered set.

In the following statements we create our “combinatorial toolkit”: we compute cardinal numbers of major set-theory constructions. The next statement directly follows from Lemma 1.1.16 and Problem 1.1.18.

Lemma 1.1.25. Let an equivalence relation be given on a finite set $X$, such that all the equivalence classes have the same cardinality $k$. Then the cardinality of the factor-set, that is, the number of the equivalence classes, is

$$\frac{|X|}{k}. \tag{1.1.1}$$

Next we calculate the cardinality of the union of finite sets. We will need the following properties, whose proofs are left to the reader.

Problem 1.1.20. For any (not necessarily finite) sets $X$ and $Y$,

$$\text{1)}\quad X \cup Y = X \cup (Y \setminus X) \tag{1.1.2}$$

$$\text{2)}\quad Y = (Y \cap X) \cup (Y \setminus X), \tag{1.1.3}$$

where the sets on the right in both (1.1.2) and (1.1.3) are disjoint, that is, $X \cap (Y \setminus X) = \emptyset$ and $(Y \cap X) \cap (Y \setminus X) = \emptyset$.

Theorem 1.1.26. If $|X| < \infty$ and $|Y| < \infty$, then

$$|X \cup Y| = |X| + |Y| - |X \cap Y|.$$

Proof. It is sufficient to apply Lemma 1.1.12 to identities (1.1.2)–(1.1.3).

We extend this statement to any finite family of sets.

Theorem 1.1.27. If $|X_i| < \infty$, $1 \le i \le k$, then

$$\begin{aligned}
|X_1 \cup X_2 \cup \cdots \cup X_k| ={}& |X_1| + |X_2| + \cdots + |X_k| - |X_1 \cap X_2| - \cdots - |X_{k-1} \cap X_k| \\
&+ |X_1 \cap X_2 \cap X_3| + \cdots + (-1)^{k-1} |X_1 \cap X_2 \cap \cdots \cap X_k|.
\end{aligned} \tag{1.1.4}$$

Proof. To prove (1.1.4) for any $k$, we use mathematical induction on the number $k$ of sets. If $k = 1$, then formula (1.1.4) is obvious, which already makes the basis of induction. Moreover, if $k = 2$, (1.1.4) reduces to Theorem 1.1.26. Suppose the statement is valid for any union of $k$ sets, and consider a union of $k + 1$ sets $X_1 \cup X_2 \cup \cdots \cup X_k \cup X_{k+1}$. Now, Theorem 1.1.26 with $X = X_1 \cup X_2 \cup \cdots \cup X_k$ and $Y = X_{k+1}$ implies the equation

$$|X_1 \cup \cdots \cup X_k \cup X_{k+1}| = |X_1 \cup X_2 \cup \cdots \cup X_k| + |X_{k+1}| - |(X_1 \cup X_2 \cup \cdots \cup X_k) \cap X_{k+1}|.$$

By the distributive law (Problem 1.1.12)

$$(X_1 \cup X_2 \cup \cdots \cup X_k) \cap X_{k+1} = (X_1 \cap X_{k+1}) \cup \cdots \cup (X_k \cap X_{k+1}).$$

Applying the inductive hypothesis to the unions $X_1 \cup X_2 \cup \cdots \cup X_k$ and $(X_1 \cap X_{k+1}) \cup \cdots \cup (X_k \cap X_{k+1})$, we get the result.
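Formula (1.1.4) translates directly into a computation: intersections of $r$ sets enter with the sign $(-1)^{r-1}$. The Python sketch below is one way to write this down; the function name `union_size` and the sample sets are assumptions made for illustration.

```python
from itertools import combinations

def union_size(sets):
    """Cardinality of the union, computed by formula (1.1.4)."""
    total = 0
    for r in range(1, len(sets) + 1):
        sign = (-1) ** (r - 1)
        for group in combinations(sets, r):
            # Intersection of the r chosen sets.
            common = set(group[0]).intersection(*group[1:])
            total += sign * len(common)
    return total

sample = [{1, 2, 3, 4}, {3, 4, 5}, {4, 5, 6, 7}]
print(union_size(sample), len(set().union(*sample)))
```

The number of terms grows as $2^k - 1$, so the sketch is meant for small families of sets only.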


Consider now Cartesian products. A proof of the following proposition is left as an exercise to the reader.

Lemma 1.1.28. For any, not necessarily finite, sets $X, Y_1, Y_2$,

$$X \times (Y_1 \cup Y_2) = (X \times Y_1) \cup (X \times Y_2).$$

Moreover, if $Y_1 \cap Y_2 = \emptyset$, then $(X \times Y_1) \cap (X \times Y_2) = \emptyset$.

Theorem 1.1.29. If $|X| < \infty$ and $|Y| < \infty$, then

$$|X \times Y| = |X| \cdot |Y|. \tag{1.1.5}$$

Proof. It is worth mentioning that the symbol $\times$ in (1.1.5) on the left means the set-theory operation, the Cartesian product of two sets, while the symbol $\cdot$ on the right indicates the usual arithmetic multiplication of whole numbers. We customarily omit the symbol $\cdot$ and write $|X||Y|$.

To prove the assertion, we carry out mathematical induction on the cardinal number $k = |Y|$. If $k = 1$, then $Y$ is a one-element set, $Y = \{y\}$. Denoting $X = \{x_1, x_2, \dots, x_n\}$, we have $X \times Y = \{(x_1, y), (x_2, y), \dots, (x_n, y)\}$, hence, $|X \times Y| = |X| = |X| \cdot |Y|$ and the basis of induction is valid.

To make the inductive step, we fix a set $X$ with $|X| = n$, assume that (1.1.5) holds for all $k$-element sets with a fixed natural $k$, and consider an arbitrary $(k+1)$-element set $Y$. Choose an element $y \in Y$ and consider two subsets of $Y$, $Y_1 = \{y\}$ and $Y_2 = Y \setminus Y_1$; it is clear that $|Y_1| = 1$, $|Y_2| = k$, $Y = Y_1 \cup Y_2$, and $Y_1 \cap Y_2 = \emptyset$. Due to the inductive assumption, $|X \times Y_2| = |X||Y_2|$; moreover, we have seen at the basis step that $|X \times Y_1| = |X|$. By making use of Lemmas 1.1.28 and 1.1.12 we derive the equation

$$|X \times Y| = |X \times Y_1| + |X \times Y_2| = |X|(1 + |Y_2|) = |X||Y|,$$

which completes the inductive step of the proof. The statement follows by the Axiom of Mathematical Induction.

Theorem 1.1.29 and the Axiom of Mathematical Induction imply immediately

Theorem 1.1.30. If $|X_i| < \infty$, $1 \le i \le k$, then

$$|X_1 \times X_2 \times \cdots \times X_k| = |X_1| \cdot |X_2| \cdots |X_k|.$$

Definition 1.1.31. 1) The class of all mappings with the domain $X$ and codomain $Y$ is called the power set⁴ and is denoted by $Y^X$.

2) The class of all injective mappings with the domain $X$ and codomain $Y$ is denoted by $\mathrm{Inj}(Y^X)$.

3) The class of all surjective mappings with the domain $X$ and codomain $Y$ is denoted by $\mathrm{Surj}(Y^X)$.

4) The class of all bijective mappings with the domain $X$ and codomain $Y$ is denoted by $\mathrm{Bij}(Y^X)$.

5) The set of all subsets of a set $X$, including the empty set $\emptyset$ and the set $X$ itself, is called the set of subsets or the Boolean of $X$ and is denoted by $2^X$. The set of all $k$-element subsets of $X$ is denoted by $2^X_k$.

⁴ The set of all subsets of $X$ is sometimes also called the power set.

Let us stipulate that if $X = \emptyset$, then there is only one “empty” mapping, belonging to $Y^{\emptyset}$, that is, $|Y^{\emptyset}| = 1$. Also, it is obvious that $2^X_k = \emptyset$ whenever $k < 0$ or $k > |X|$.

Example 1.1.32. Let $X = \{a, b, c\}$ and $Y = \{1, 2\}$. Then $Y^X = \{f_1, f_2, \dots, f_8\}$, where the mappings $f_i$, $1 \le i \le 8$, are given by the following chart, in which the column headed $f_i$ lists the values $f_i(a)$, $f_i(b)$, $f_i(c)$:

$$\begin{array}{c|cccccccc}
 & f_1 & f_2 & f_3 & f_4 & f_5 & f_6 & f_7 & f_8 \\ \hline
a & 1 & 1 & 1 & 2 & 1 & 2 & 2 & 2 \\
b & 1 & 1 & 2 & 1 & 2 & 1 & 2 & 2 \\
c & 1 & 2 & 1 & 1 & 2 & 2 & 1 & 2
\end{array}$$

Notice that in this example there are $8 = 2^3$ different mappings, that is, $|Y^X| = |Y|^{|X|}$. This is a particular case of the subsequent Theorem 1.1.35. First we introduce a convenient notation and prove a lemma.
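The eight mappings of Example 1.1.32 can also be generated mechanically: a mapping from $X$ to $Y$ is determined by a choice of an image for every element of $X$. The Python sketch below stores each mapping as a dictionary; the function name `all_mappings` is an illustrative assumption, and the order in which the mappings are produced need not coincide with the numbering $f_1, \dots, f_8$ above.

```python
from itertools import product

def all_mappings(X, Y):
    """Every mapping from X to Y, each stored as a dict {x: f(x)}."""
    X = list(X)
    return [dict(zip(X, images)) for images in product(Y, repeat=len(X))]

maps = all_mappings("abc", [1, 2])
print(len(maps))                       # compare with |Y| ** |X|

# With an empty domain there is exactly one (empty) mapping.
print(all_mappings("", [1, 2]))
```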

Definition 1.1.33. Given a mapping $f: X \to Y$ and a subset $Z \subset X$, the mapping $f|_Z: Z \to Y$ such that $f|_Z(x) = f(x)$, $\forall x \in Z$, is called the restriction of $f$ onto $Z$.

Lemma 1.1.34. If $X_1$, $X_2$, and $Y$ are finite sets and $X_1 \cap X_2 = \emptyset$, then

$$|Y^{X_1 \cup X_2}| = |Y^{X_1}| \cdot |Y^{X_2}|.$$

Proof. We establish a one-to-one correspondence between the power set $Y^{X_1 \cup X_2}$ and the Cartesian product $Y^{X_1} \times Y^{X_2}$. Consider a mapping $f \in Y^{X_1 \cup X_2}$ and denote its restrictions $f|_{X_i}$ onto $X_i$ by $f_i$, $i = 1, 2$. Introduce a mapping

$$H: Y^{X_1 \cup X_2} \to Y^{X_1} \times Y^{X_2}$$

by the rule $H(f) = (f_1, f_2)$; here on the right we have an ordered pair of two restrictions of the mapping $f$. We prove that $H$ is a bijection, that is, $H$ is the one-to-one correspondence we are looking for.

We have to prove that $H$ is both injective and onto. To prove the former, we consider two different mappings $f, g \in Y^{X_1 \cup X_2}$. Since $f \ne g$, there exists an element $x_0 \in X = X_1 \cup X_2$ such that $f(x_0) \ne g(x_0)$. If $x_0 \in X_1$, then by the definition of a restriction, $f_1(x_0) \ne g_1(x_0)$ at the point $x_0$, thus, the restrictions are different maps, $f_1 \ne g_1$. If $x_0 \in X_2$, then $f_2 \ne g_2$ on the same basis. In both cases $H(f) = (f_1, f_2) \ne (g_1, g_2) = H(g)$, which proves that $H$ is injective.

To prove that $H$ is surjective, we pick an arbitrary ordered pair

$$(f'_1, f'_2) \in Y^{X_1} \times Y^{X_2}$$

and find its preimage with respect to $H$. In order for the mapping $H$ to be onto, there must exist a mapping $f' \in Y^{X_1 \cup X_2}$ such that $H(f') = (f'_1, f'_2)$. We define this mapping $f'$ explicitly,

$$f'(x) = \begin{cases} f'_1(x) & \text{if } x \in X_1 \\ f'_2(x) & \text{if } x \in X_2. \end{cases}$$

The mapping $f'$ is well-defined since $X_1$ and $X_2$ are disjoint sets by the assumption. Obviously, $H(f') = (f'_1, f'_2)$, thus, $H$ is a surjective mapping. Since all sets here are finite, by Theorems 1.1.10 and 1.1.29 we get the equation

$$|Y^{X_1 \cup X_2}| = |Y^{X_1} \times Y^{X_2}| = |Y^{X_1}||Y^{X_2}|.$$

The next two statements explain the choice of notation for the power set $Y^X$ and for the Boolean $2^X$.

Theorem 1.1.35. If $X$ and $Y$ are finite non-empty sets, then

$$|Y^X| = |Y|^{|X|}.$$

Proof. The conclusion follows immediately if we set $X_2 = \emptyset$ in Lemma 1.1.34. However, it is useful to give here another proof by mathematical induction on the cardinality of $X$. If $|X| = 1$, say $X = \{x\}$, then the statement is clear, for $Y^{\{x\}}$ contains exactly as many mappings as there are elements in $Y$. Indeed, an image for the unique element $x \in X$ can be chosen in $|Y|$ ways, and each choice generates exactly one mapping from $X$ to $Y$, so that $|Y^X| = |Y| = |Y|^{|X|}$.

Suppose now that the statement is valid for all $k$-element sets, and consider a set $X$ with $k + 1$ elements. Select an element $x_1 \in X$ and introduce two sets, $X_1 = \{x_1\}$ and $X_2 = X \setminus X_1$. Since $1 + |X_2| = |X|$, we have by Lemma 1.1.34 and the inductive assumption

$$|Y^X| = |Y^{X_1}| \cdot |Y^{X_2}| = |Y| \cdot |Y|^{|X_2|} = |Y|^{|X|},$$

which proves the theorem.


Definition 1.1.36. Let $A$ be an arbitrary subset of a set $X$, $A \subset X$, thus $0 \le |A| \le |X|$. A function $f_A \in Y^X$ given by

$$f_A(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \in X \setminus A \end{cases}$$

is called the characteristic function of a subset $A \subset X$.

Theorem 1.1.37. If $X$ is a finite set, then $|2^X| = 2^{|X|}$.

Proof. We reduce the statement to Theorem 1.1.35. Consider the two-element set $Y = \{0, 1\}$ and the power set $Y^X$. Since $|Y| = 2$, to prove the theorem it is sufficient to set up a one-to-one correspondence between the two sets $Y^X$ and $2^X$. As in Lemma 1.1.34, we will prove that the mapping

$$H: 2^X \to Y^X, \quad H(A) = f_A, \quad \forall A \subset X,$$

is bijective. To prove that $H$ is injective, we select two different subsets $A, B \subset X$, $A \ne B$. Therefore, there exists an element $x_0 \in (A \setminus B) \cup (B \setminus A)$. If $x_0 \in A \setminus B$, then $f_A(x_0) = 1 \ne 0 = f_B(x_0)$ and the mappings $f_A$ and $f_B$ are different. The same conclusion, $f_A \ne f_B$, follows if $x_0 \in B \setminus A$. Thus, $H(A) = f_A \ne H(B) = f_B$, and $H$ is an injective mapping.

To prove that $H$ is onto, we consider a mapping $f' \in Y^X$ and the subset $A' = (f')^{-1}(\{1\}) \subset X$. We immediately see that $H(A') = f'$, which proves that $H$ is surjective and, together with the preceding part, proves that $H$ is bijective. Now Theorem 1.1.37 follows straightforwardly from Theorem 1.1.35.
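The bijection $H$ between subsets and characteristic functions can be made concrete for a small set. In the Python sketch below a function from $X$ to $\{0, 1\}$ is stored as a tuple of bits in a fixed order of the elements of $X$; this encoding and the helper names are assumptions made for illustration.

```python
from itertools import product

def characteristic_function(A, X):
    """Values of f_A on the elements of X, listed in the order of X."""
    return tuple(1 if x in A else 0 for x in X)

def subset_from_function(f, X):
    """The total preimage of 1, that is, the subset whose characteristic function is f."""
    return frozenset(x for x, bit in zip(X, f) if bit == 1)

X = ("a", "b", "c")
functions = list(product((0, 1), repeat=len(X)))   # all of Y^X for Y = {0, 1}
subsets = {subset_from_function(f, X) for f in functions}

print(len(functions), len(subsets))
print(characteristic_function({"a", "c"}, X))
```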

Example 1.1.38. Let $X = \{a, b, c\}$, $|X| = 3$. Then

$$2^X_0 = \{\emptyset\}, \quad |2^X_0| = 1;$$

$$2^X_1 = \{\{a\}, \{b\}, \{c\}\}, \quad |2^X_1| = 3;$$

$$2^X_2 = \{\{a, b\}, \{a, c\}, \{b, c\}\}, \quad |2^X_2| = 3;$$

$$2^X_3 = \{X\}, \quad |2^X_3| = 1;$$

and $|2^X| = 1 + 3 + 3 + 1 = 8 = 2^3$.

Definition 1.1.39. We recall that $n$-factorial, denoted by $n!$, is the function defined for all natural numbers $n \in \mathbb{N}$ as the product of the first $n$ natural numbers,

$$n! = 1 \cdot 2 \cdots (n-1) \cdot n.$$

We also define $0! = 1$.


Example 1.1.40. $1! = 1$; $2! = 1 \cdot 2 = 2$; $3! = 1 \cdot 2 \cdot 3 = 6$; $4! = 1 \cdot 2 \cdot 3 \cdot 4 = 24$.

Problem 1.1.21. 1) Compute $11!$.

2) Compute $\dfrac{201!}{199!}$.

3) Simplify $n \cdot (n-1)!$; $\dfrac{(n+2)!}{(n+2)(n+1)}$.

Remark 1.1.41. Using some calculus, we can prove the Stirling asymptotic formula

$$n! \sim \sqrt{2\pi n}\left(\frac{n}{e}\right)^n, \quad n \to \infty. \tag{1.1.6}$$

Here $e \approx 2.7182818$ is the base of natural logarithms; “asymptotic” means that the ratio of the left-hand side and the right-hand side of (1.1.6) tends to 1 as $n \to \infty$. For example, when $n = 7$, formula (1.1.6) computes $7!$ with a relative error slightly more than 1%.
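The quality of the approximation (1.1.6) is easy to examine numerically. This short Python sketch uses only the standard `math` module; the function names `stirling` and `relative_error` are illustrative.

```python
import math

def stirling(n):
    """Right-hand side of the Stirling formula (1.1.6)."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n

def relative_error(n):
    """Relative error of the approximation with respect to the exact n!."""
    exact = math.factorial(n)
    return (exact - stirling(n)) / exact

for n in (1, 7, 20, 50):
    print(n, relative_error(n))
```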

We remind that $2^X_k$ is a set, not the cardinality of this set.

Theorem 1.1.42. If $|X| = n < \infty$, then for $0 \le k \le n$

$$|2^X_k| = \frac{n!}{(n-k)!\,k!}. \tag{1.1.7}$$

Proof. We will carry out mathematical induction on $n = |X|$. Since the claim contains two natural parameters, $n$ and $k$, we reformulate the statement of the theorem by binding one of them.

Theorem 1.1.43. Equation (1.1.7) holds true for every nonnegative integer $n$ and for all integers $k$, $0 \le k \le n$.

Proof. It is convenient in this proof to use $n = 0$ as the basis of induction. Since $0 \le k \le n$, for $n = 0$ there is only one value of $k$, $k = 0$. Hence $X = \emptyset$, $2^X_0 = \{\emptyset\}$, $|2^X_0| = 1 = 2^0$, and (1.1.7) in the case $n = 0$ follows.

To make the inductive step, we choose an $n \ge 1$ and assume that equation (1.1.7) is valid for any $n$-element set. Consider a set $X$ such that $|X| = n + 1$. If $k = n + 1$, then $2^X_k = \{X\}$, $|2^X_k| = 1$, and (1.1.7) is valid. To verify (1.1.7) when $k \le n$, we pick an element $x_0 \in X$ and split $2^X_k$ in two subsets, $2^X_k = A \cup B$, where $A$ consists of all $k$-element subsets of $X$ containing $x_0$ and $B = 2^X_k \setminus A$, thus, subsets in $B$ do not contain $x_0$. Therefore, these subsets, which are elements of $B$, can be considered as $k$-element subsets of the set $X \setminus \{x_0\}$. Since $|X \setminus \{x_0\}| = n$, the inductive assumption is applicable to $B$, and we have $|B| = n!/((n-k)!\,k!)$.

On the other hand, if $\alpha \in A$, that is, $\alpha$ is a $k$-element subset of $X$, then $\alpha \in 2^X_k$, $|\alpha| = k$, and by definition of $A$, $\alpha \ni x_0$. Therefore, $\alpha \setminus \{x_0\}$ is a $(k-1)$-element subset of the set $X \setminus \{x_0\}$. Hence the elements of $A$ can be put in a one-to-one correspondence with the $(k-1)$-element subsets of $X \setminus \{x_0\}$, thus $|A| = |2^{X \setminus \{x_0\}}_{k-1}|$. By the inductive assumption, $|A| = n!/((n-k+1)!\,(k-1)!)$. Since $A \cap B = \emptyset$, Lemma 1.1.12 implies

$$|2^X_k| = \frac{n!}{(n-k+1)!\,(k-1)!} + \frac{n!}{(n-k)!\,k!} = \frac{n!}{(n-k+1)!\,k!}\,(k + n - k + 1) = \frac{(n+1)!}{(n-k+1)!\,k!}.$$

The proof of Theorem 1.1.43 and that of Theorem 1.1.42 are complete.
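Both formula (1.1.7) and the splitting $2^X_k = A \cup B$ used in the inductive step can be checked by listing subsets explicitly. The Python sketch below is a finite experiment under the assumption $X = \{0, 1, \dots, n-1\}$; the function names are illustrative.

```python
from itertools import combinations
from math import factorial

def count_k_subsets(X, k):
    """Number of k-element subsets of X, found by listing them."""
    return sum(1 for _ in combinations(X, k))

def formula(n, k):
    """Right-hand side of (1.1.7)."""
    return factorial(n) // (factorial(n - k) * factorial(k))

# Direct count against the formula.
counts_agree = all(count_k_subsets(range(n), k) == formula(n, k)
                   for n in range(8) for k in range(n + 1))

# The inductive step: subsets containing x0 plus subsets avoiding x0.
step_agrees = all(formula(n + 1, k) == formula(n, k - 1) + formula(n, k)
                  for n in range(1, 8) for k in range(1, n + 1))
print(counts_agree, step_agrees)
```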

Corollary 1.1.44. Consider an $n$-element set $X$. Applying Theorems 1.1.37, 1.1.42, and Lemma 1.1.13 to the Boolean $2^X$, we deduce the equation

$$1 + n + \frac{n(n-1)}{2} + \frac{n(n-1)(n-2)}{3!} + \cdots + \frac{n!}{(n-k)!\,k!} + \cdots + n + 1 = 2^n.$$

Next we calculate the number of injective mappings, $|\mathrm{Inj}(Y^X)|$, for finite sets $X$ and $Y$.

Theorem 1.1.45. Let $X$ and $Y$ be two finite sets and $0 < n = |X| \le m = |Y|$. Then

$$|\mathrm{Inj}(Y^X)| = \frac{m!}{(m-n)!}. \tag{1.1.8}$$

Remark 1.1.46. If $n = 0$, then in agreement with (1.1.8) we define

$$|\mathrm{Inj}(Y^X)| = 1,$$

assuming that there exists a unique “empty mapping” with the empty domain.

To prove (1.1.8), we first consider the special case $m = n$.

Lemma 1.1.47. If $|X| = |Y| = n$, $0 < n < \infty$, then $|\mathrm{Inj}(Y^X)| = n!$.

Proof. We again use mathematical induction. The conclusion is clear if $n = 1$, because in this case there is a unique mapping from $X = \{x\}$ to $Y = \{y\}$: $x \mapsto y$, and this mapping is certainly injective (as well as surjective and hence bijective).

Now we assume the statement to be valid for all $n$-element sets, and select two $(n+1)$-element sets $X$ and $Y = \{y_1, y_2, \dots, y_{n+1}\}$. Pick an element $x_0 \in X$. The set $\mathrm{Inj}(Y^X)$ breaks down into the union of $n + 1$ disjoint sets $A_1, A_2, \dots, A_{n+1}$, such that an injective mapping $f: X \to Y$ belongs to the set $A_i$, $1 \le i \le n + 1$, if and only if $f(x_0) = y_i$. For a fixed image $f(x_0) = \hat{y} \in Y$, the set $X \setminus \{x_0\}$ can be injectively mapped into the set $Y \setminus \{f(x_0)\}$ in $n!$ ways due to the inductive assumption. Altogether, we have $|\mathrm{Inj}(Y^X)| = (n+1) \cdot n! = (n+1)!$.


Define now the following equivalence relation on the set $\mathrm{Inj}(Y^X)$. Two mappings $f, g \in \mathrm{Inj}(Y^X)$ are equivalent if and only if they have the same range, that is, $f(X) = g(X)$.

Problem 1.1.22. Verify that this is an equivalence relation in the sense of Definition 1.1.23 such that $|f(X)| = |X|$.

End of Proof of Theorem 1.1.45. All mappings in any equivalence class have the same range $f(X)$. This range is an $n$-element subset of $Y$. Hence, there exists a one-to-one correspondence between the factor-set and the set of all $n$-element subsets of $Y$, which is denoted by $2^Y_n$. Lemma 1.1.47 implies that the cardinality of each equivalence class is $n!$. Now by Lemma 1.1.25, $|2^Y_n| = |\mathrm{Inj}(Y^X)|/n!$, and Theorem 1.1.42 yields equation (1.1.8), $|\mathrm{Inj}(Y^X)| = m!/(m-n)!$.
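The count (1.1.8) and the grouping of injective mappings by their ranges can be observed on small sets. In the Python sketch below an injective mapping from an $n$-element set into $\{0, \dots, m-1\}$ is encoded as an $n$-tuple of distinct images, which is what `itertools.permutations` produces; this encoding and the function names are assumptions made for illustration.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def injections(n, m):
    """All injective mappings {0..n-1} -> {0..m-1}, as tuples of images."""
    return list(permutations(range(m), n))

def classes_by_range(n, m):
    """Sizes of the equivalence classes 'same range', keyed by the range."""
    return Counter(frozenset(f) for f in injections(n, m))

n, m = 3, 5
print(len(injections(n, m)), factorial(m) // factorial(m - n))
sizes = classes_by_range(n, m)
print(len(sizes), set(sizes.values()))   # number of classes and their common size
```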

Remark 1.1.48. Thus, for any finite sets $X$ and $Y$ we have found the numbers of injective, bijective and arbitrary mappings from $X$ to $Y$. There is no such simple formula for the number of surjective mappings. We will find that number in Section 4.1.

Several statements have already been proved by making use of a simple and powerful method: establishing a one-to-one correspondence between the set in question and another set whose cardinality can be found more easily. We will use this approach again and again; see, for instance, the solution of Problem 1.4.16.

We end this section with a notation which is convenient in many instances. Let the symbol $b\ (\mathrm{mod}\ p)$ denote the remainder after dividing $b$ by $p$.

Definition 1.1.49. For integer numbers $a, b$ and a natural $p$, we write $a \equiv b\ (\mathrm{mod}\ p)$ if $p$ divides the difference $a - b$; in other words, the difference $a - b = kp$ with an integer $k$, or $a$ and $b$ leave the same remainder when divided by $p$. In this case the numbers $a$ and $b$ are called congruent modulo $p$.

For example, $7\ (\mathrm{mod}\ 3) = 7\ (\mathrm{mod}\ 2) = 1$, while $7\ (\mathrm{mod}\ 4) = 3$; 5 and 11 are congruent modulo 2, $5 \equiv 11\ (\mathrm{mod}\ 2)$, but 5 and 4 are not.
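These examples can be replayed in Python, whose `%` operator returns the remainder in the range $0, \dots, p-1$ for a positive modulus $p$. The helper `congruent` below is an illustrative name that restates Definition 1.1.49.

```python
def congruent(a, b, p):
    """True if p divides a - b, that is, a and b are congruent modulo p."""
    return (a - b) % p == 0

print(7 % 3, 7 % 2, 7 % 4)
print(congruent(5, 11, 2), congruent(5, 4, 2))

# The two descriptions in the definition agree: equal remainders
# if and only if p divides the difference.
same = all(congruent(a, b, p) == (a % p == b % p)
           for a in range(-10, 11) for b in range(-10, 11) for p in range(1, 6))
print(same)
```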

Problem 1.1.23. Prove that the congruence is an equivalence relation on the set $\mathbb{Z}$ of integer numbers and describe its factor sets. Does this statement remain true on the set of natural numbers $\mathbb{N}$? The same question regarding the set of whole numbers $\{0, 1, 2, \dots\}$?

Exercises and Problems 1.1

EP 1.1.1. Compute the sums

1) $\sum_{k=0}^{5} \frac{k}{k+2}$

2) $\sum_{k=1}^{5} \frac{1}{k}$

3) $\sum_{k=5}^{1} \frac{1}{k}$ (here the summation index is decreasing)

4) $\sum_{m,n=1}^{\infty} \frac{1}{(n+1)^{m+1}}$

5) The following transformation, called the Abel transformation or discrete summation by parts, is useful in many problems involving sums. Consider two finite or infinite sequences $\{a_k\}$ and $\{b_k\}$, $k = 1, 2, \dots$, and the sequence of their pairwise products $\{a_k \cdot b_k\}$, $k = 1, 2, \dots$. Introduce the partial sums of these sequences $B_n = \sum_{k=1}^{n} b_k$ and $S_n = \sum_{k=1}^{n} a_k \cdot b_k$, $n = 1, 2, \dots$. Prove that for all $n \ge 2$

$$S_n = \sum_{k=1}^{n-1} (a_k - a_{k+1}) B_k + a_n B_n. \tag{1.1.9}$$

Use (1.1.9) to find the sums

6) $\sum_{k=1}^{n} q^k$; $q$ is a constant

7) $\sum_{k=1}^{n} k q^k$

8) $\sum_{k=1}^{n} k \cos(kx)$ for a fixed number $x$.

EP 1.1.2. Prove the following statements by mathematical induction.

1) $1^3 + 2^3 + \cdots + n^3 = \left(\frac{1}{2} n(n+1)\right)^2$, $\forall n \in \mathbb{N}$

2) (Al-Kashi) $1^4 + 2^4 + \cdots + n^4 = \frac{1}{30}\left(6n^5 + 15n^4 + 10n^3 - n\right)$, $\forall n \in \mathbb{N}$

3) $2^n < n!$ for any natural $n \ge 4$.

EP 1.1.3. Find $\sum_{k=1}^{n} (2k-1)^3$.

EP 1.1.4. Prove by mathematical induction that for any natural $n$

$$\sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} \frac{1}{i_1 i_2 \cdots i_k} = n,$$

where the sum runs over all $k$-tuples of natural numbers $i_1 < i_2 < \cdots < i_k$ for each $k = 1, 2, \dots, n$.

EP 1.1.5. A sequence $\{a_1, a_2, \dots, a_n, \dots\}$ is called an arithmetic progression or an arithmetic sequence, if $a_{j+1} = a_j + d$ for each $j \ge 1$, where the constant $d$ is called the common difference of the progression and $a_1$ is its first term. Find by mathematical induction an explicit formula for the general term $a_n$ of an arithmetic progression and for the sum $\sum_{n=k}^{l} a_n$ of its $l - k + 1$ consecutive terms. In particular, find the sum of the first $l$ terms of an arithmetic sequence.


EP 1.1.6. Prove that a sequence $\{a_n\}$, $n \ge 1$, is an arithmetic progression if and only if $a_{n+1} + a_{n-1} = 2a_n$, $\forall n \ge 2$.

EP 1.1.7 (Hypsicle from Alexandria). Let $\{a_1, \dots, a_n, a_{n+1}, \dots, a_{2n}\}$ be an arithmetic progression with an even number of terms. Prove that $\sum_{k=n+1}^{2n} a_k - \sum_{k=1}^{n} a_k = bn^2$, where $b$ is an integer number.

EP 1.1.8. A sequence $\{a_1, a_2, \dots, a_n, \dots\}$ is called a geometric progression or a geometric sequence if $a_{j+1} = q \cdot a_j$ for each $j \ge 1$, where $q$ is called the common ratio of the progression and $a_1$ is its first term. Find an explicit formula for the general term $a_n$ of the geometric progression and for the sum $\sum_{n=k}^{l} a_n$ of its $l - k + 1$ consecutive terms.

EP 1.1.9. Prove that a sequence $\{a_n\}$, $n \ge 1$, is a geometric progression if and only if $a_{n+1} \cdot a_{n-1} = a_n^2$, $\forall n \ge 2$.

EP 1.1.10. Find a closed-form expression for the sum $\sum_{k=1}^{n} (k^2 + k)$.

EP 1.1.11. How many zeros are at the end of the number $5!$? $53!$? $100!$?

EP 1.1.12. What is bigger, $300!$ or $100^{300}$?

EP 1.1.13. Use the mathematical induction to prove the Fundamental Theorem ofArithmetic:

Any natural number n > 1 can be uniquely, up to the ordering of factors, written asa product of prime numbers. If n is prime, then the product contains only one factor.

EP 1.1.14. Find a flaw in the following “inductive proof” of the claim that all girlshave sky-blue eyes:

The reader definitely knows at least one such a girl, which establishes the basis ofinduction. Suppose now that in any group of n girls all the girls have sky-blue eyesand deduce that if so, then any group G of n C 1 girls possesses the same property.Indeed, let g be any girl in G. Consider an n-element group G1 D G n ¹gº consistingof n girls. By the inductive assumption, all girls in G1 have sky-blue eyes. Choosea girl g1 in G1; it is obvious that g and g1 are two different girls. Next we removeg1 from G1 and replace her with g, that is, consider a set G2 D .G1 n ¹g1º/ [ ¹gº.The set G2 also consists of n elements, hence by the inductive assumption, all girlsin G2 have sky-blue eyes. In particular, g1 2 G2, therefore, she also has sky-blueeyes, which in turn means that all girls inG D G1[¹gº have sky-blue eyes. Now theprinciple of mathematical induction implies the claim.

EP 1.1.15. Compare the sequences an D 2n and bn D n2; n D 1; 2; : : : . Weimmediately verify that a1 D 2 > b1 D 1, while a2 D b2, a3 < b3, and a6 > b6.


Determine, what inequality, an � bn or an � bn, is valid for all n � n0, that is, forall but finitely many subscripts n. Find the smallest such n0 and prove the correctinequality, an � bn or an � bn, for all n � n0.

EP 1.1.16. Prove the following modification of the Axiom of Mathematical Induction: If the statements $S_1$ and $S_2$ are valid and the statements $S_n$ and $S_{n+1}$ together imply $S_{n+2}$ for all natural $n$, then all the statements $S_n$, $n = 1, 2, \ldots$, are valid.

EP 1.1.17. Does the pair of sets $Z_e = \{\ldots, -4, -2, 0, 2, \ldots\}$ and $Z'_o = \{\ldots, -3, -1, 0, 1, 3, \ldots\}$ make up a partition of $\mathbb{Z}$?

EP 1.1.18. Prove that

$$\frac{1}{2}\left(\frac{1}{2} - 1\right)\left(\frac{1}{2} - 2\right) \cdots \left(\frac{1}{2} - n + 1\right) = \frac{(-1)^{n-1}(2n-2)!}{2^{2n-1}(n-1)!}. \qquad (1.1.10)$$

EP 1.1.19. Let $|X| = |Y| < \infty$. Prove that in this case $f \in Y^X$ is injective if and only if it is surjective; thus, in the case $|X| = |Y| < \infty$ the three properties (to be injective, to be surjective, and to be bijective) are equivalent.

EP 1.1.20. The binary relations below are given as sets of ordered pairs on appropriate sets. Are they reflexive, symmetric, antisymmetric, transitive, or none of these? For the equivalence relations, describe their factor-sets.

A) The relations on the set $\{a, b, c, d\}$:

1) $\varrho_1 = \{(a,a), (b,b), (c,c), (d,d)\}$

2) $\varrho_2 = \{(a,a)\}$

3) $\varrho_3 = \{(b,b), (c,c), (d,d)\}$

4) $\varrho_4 = \{(a,b), (b,a), (d,d)\}$

5) $\varrho_5 = \{(a,b), (b,c), (a,c), (d,d)\}$

6) $\varrho_6 = \{(a,a), (a,b), (b,b), (b,c), (c,c), (c,d), (d,d)\}$

7) $\varrho_7 = \{(a,a), (b,b), (c,c), (d,d), (a,b), (a,c), (a,d), (b,c), (b,d), (c,d)\}$.

B) The relations on the set of real numbers $\mathbb{R}$:

1) $\varrho_8 = \{(x, y) \mid x, y \in \mathbb{R},\ x + y = 0\}$

2) $\varrho_9 = \{(x, y) \mid x, y \in \mathbb{R},\ x + y = 0 \text{ or } x - y = 0\}$

C) The relations on the set of integer numbers $\mathbb{Z}$:

1) $\varrho_{10} = \{(m, n) \mid m, n \in \mathbb{Z},\ m = 2n\}$.

2) $\varrho_p = \{(m, n) \mid m, n \in \mathbb{Z},\ p \text{ divides } m - n\}$, where $p$ is a given prime number.


EP 1.1.21. Find the flaw in the following “proof” of the claim: A symmetric and transitive binary relation $\varrho$ on a set $X$ is an equivalence relation.

Let $a, b \in X$ and $(a, b) \in \varrho$. Due to the symmetry, $(b, a) \in \varrho$, and by virtue of the transitivity, $(a, b) \in \varrho$ and $(b, a) \in \varrho$ together imply $(a, a) \in \varrho$. Thus, $\varrho$ is reflexive.

Find a counter-example to the claim, that is, construct a symmetric and transitive but not reflexive binary relation.

EP 1.1.22. Let $P^{\alpha}$ stand either for the property $P$ or for its negation. Prove that the three properties, reflexivity (R), symmetry (S), and transitivity (T), are independent in totality, that is, for any triple of properties $(R^{\alpha_1}, S^{\alpha_2}, T^{\alpha_3})$ there exists a binary relation possessing this set of properties. By Theorem 1.1.37, to prove the claim it is enough to provide $2^3 = 8$ examples of binary relations.

EP 1.1.23. Prove that the three properties, reflexivity (R), antisymmetry (AS), and transitivity (T), are independent in totality.

EP 1.1.24. How many binary relations are there on the set $\{1, 2, 3, 4, 5\}$? How many among them are reflexive? Symmetric? Antisymmetric? Transitive? How many possess any two or any three of these properties?

EP 1.1.25. By definition, binary relations are sets; therefore, one can form their unions, intersections, etc.

1) Let $\rho$ and $\sigma$ be two reflexive binary relations. Is any of the relations $\rho \cap \sigma$ or $\rho \cup \sigma$ reflexive?

2) Let $\rho$ and $\sigma$ be two symmetric relations. Is any of the relations $\rho \cap \sigma$ or $\rho \cup \sigma$ symmetric?

3) Let $\rho$ and $\sigma$ be two transitive relations. Is any of the relations $\rho \cap \sigma$ or $\rho \cup \sigma$ transitive?

EP 1.1.26. Prove Lemma 1.1.16.

EP 1.1.27. Is it true that $7 \equiv -8 \pmod{4}$?

EP 1.1.28. Suppose that the binary relation of acquaintanceship on a set of people is symmetric. Prove that in a party of $n \ge 2$ people, at least two people have an equal number of acquaintances.

EP 1.1.29. Give an example of a binary relation $\varrho$ in a Cartesian product $X \times Y$ which is not a mapping from $X$ to $Y$. What restrictions should be imposed on $\varrho$ to make it a mapping?

Give a definition of a mapping $f : X \to Y$ as a binary relation $\varrho$ in the Cartesian product $X \times Y$.


EP 1.1.30. Let $Z_2 = \{0, 1\}$ and let $Z_2^n$ be the Cartesian product of $n$ copies of $Z_2$. A Boolean function of $n$ variables is a mapping $f : Z_2^n \to Z_2$. How many different Boolean functions of $n$ variables are there?

EP 1.1.31. Prove the equation

$$n! = \int_0^{\infty} e^{-t} t^n \, dt, \qquad n = 0, 1, 2, \ldots.$$

1.2 The Sum and Product Rules

In this section we study two important results called the Sum Rule and the Product Rule, which appear in many combinatorial problems. We will see that they are nothing but the formulas for calculating the cardinalities of the union and the Cartesian product of finite sets. We introduce these rules by considering simple model problems.

Descartes' biography · Sum and Product Rules

Problem 1.2.1. In a group of students, each person studies one and only one of three foreign languages: six people take French, eight take German, and nine students take Spanish. How many students are there in the group?

Solution. Denote the set of all students in the group by $X$, the subset of students studying French by $X_F$, the subset of students studying German by $X_G$, and the subset of students studying Spanish by $X_S$. Since each student studies at least one language, we can represent $X$ as the union

$$X = X_F \cup X_G \cup X_S.$$

Moreover, these subsets are pairwise disjoint,

$$X_F \cap X_G = X_F \cap X_S = X_G \cap X_S = \emptyset,$$

for no student studies two languages. Thus, by Lemma 1.1.13 with $m = 3$, $|X| = 6 + 8 + 9 = 23$.
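The reasoning of Problem 1.2.1 can be mirrored in a few lines of Python. This is only an illustrative sketch: the student labels (`F1`, `G1`, ...) and the helper names `pairwise_disjoint` and `sum_rule` are hypothetical; only the group sizes 6, 8, and 9 come from the problem.

```python
# Hypothetical roster: the labels are placeholders, only the group
# sizes (6, 8, 9) are taken from Problem 1.2.1.
french = {f"F{i}" for i in range(1, 7)}
german = {f"G{i}" for i in range(1, 9)}
spanish = {f"S{i}" for i in range(1, 10)}

groups = [french, german, spanish]


def pairwise_disjoint(sets):
    """Return True if no two of the given sets share an element."""
    return all(a.isdisjoint(b) for i, a in enumerate(sets) for b in sets[i + 1:])


def sum_rule(sets):
    """Cardinality of a union of pairwise disjoint finite sets."""
    if not pairwise_disjoint(sets):
        raise ValueError("the Sum Rule needs pairwise disjoint sets")
    return sum(len(s) for s in sets)


total = sum_rule(groups)
print(total)
```

The explicit disjointness check is the point of the sketch: the sum of the cardinalities equals the cardinality of the union only when the subsets do not overlap.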

There are many similar problems where the set in question is the union of several disjoint subsets, or where this set can be put in a one-to-one correspondence with a union of disjoint sets. Consequently, the cardinality of the set can be calculated by making use of Lemmas 1.1.12 or 1.1.13. In such a situation it is said that the solution was derived by the Sum Rule; both these lemmas are referred to as the Sum Rule as well. Without using set-theoretic terminology, the rule can be stated as follows.

If one task can be performed in $k$ ways and another task in $l$ ways, and these tasks cannot be done simultaneously, then one of the two tasks can be done in $k + l$ ways. It is clear after our analysis of Problem 1.2.1 that the latter statement is just a descriptive formulation of Lemma 1.1.12, where $|X| = k$ and $|Y| = l$.

The Sum Rule can also be stated in other terms.

The Sum Rule. If finite sets $X_1, X_2, \ldots, X_m$ form a partition of a set $X$, then

$$|X| = |X_1| + |X_2| + \cdots + |X_m|. \qquad (1.2.1)$$

Evidently, (1.2.1) is equivalent to Lemma 1.1.13.

If there is a Sum Rule, then there should likely be a Product Rule as well. To introduce it, we again analyze a model problem.

Problem 1.2.2. Identification cards on Small Planet contain two characters, one capital Latin letter and one Hindu-Arabic digit, for example, “S-8”. How many different cards are there, if we can use all 26 letters and 10 digits?

Solution. First of all, we have to state unequivocally which cards must be considered identical, and which cards are different. Since we consider a mathematical problem, we do not take into consideration size, color, font, etc. Two cards are considered different if they have different pairs of symbols, that is, if at least one symbol on either card is distinct from the corresponding symbol on the other card. In other words, to say that two cards are identical is just to say that they have both the same letter and the same digit. We reiterate this statement here, because a clear specification of which objects are distinct in a combinatorial problem and which ones are the same (are identical) is a crucial step in solving the problem; otherwise, two people can read the same words but solve two different problems.

Another important issue is the ordering of characters. In this problem, should we count the cards “S-8” and “8-S” as different or identical?

As a matter of fact, these are two different problems. Combinatorics itself does not know whether or not the order of elements is essential; combinatorics only provides the necessary means for solving both problems. It is the solver's task to clarify the problem and choose the right approach. The distinction between problems where the order of elements is or is not essential will be discussed in more detail later on in this chapter.

In Problem 1.2.2 we assume that the first character on the card is always a letter and the second one is a digit. Thus from our standpoint, each card is an ordered pair of symbols $(\lambda, \delta)$, where $\lambda$ may be any of 26 letters and $\delta$ any of ten digits. Having said the key words “ordered pair”, we immediately recognize that these objects make up the Cartesian product of two sets, and we can use the latter as a mathematical model in our problem. Denote the set of all characters of the English alphabet by $\Lambda = \{A, B, C, \ldots, Y, Z\}$, $|\Lambda| = 26$, and the set of digits by $\Delta = \{0, 1, \ldots, 8, 9\}$, $|\Delta| = 10$. Our discussion implies that there is a one-to-one correspondence between the set of different identification cards we are looking for and the Cartesian product $\Lambda \times \Delta$. Thus, by Theorem 1.1.29 the number of different cards is equal to $|\Lambda \times \Delta| = |\Lambda| \cdot |\Delta| = 260$.
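The model of Problem 1.2.2 can be checked by listing the Cartesian product explicitly. The following is a minimal sketch using the standard `itertools.product`; the variable names `letters`, `digits`, and `cards` are our own and stand for the two sets of the solution and their Cartesian product.

```python
from itertools import product
import string

letters = string.ascii_uppercase   # the 26 capital Latin letters
digits = string.digits             # the 10 digits 0..9

# Each card is an ordered pair (letter, digit), that is, an element
# of the Cartesian product of the two sets.
cards = list(product(letters, digits))

print(len(cards))
print(("S", "8") in cards, ("8", "S") in cards)
```

Because the pairs are ordered with the letter first, the pair `("8", "S")` is not a card in this model, which reflects the convention adopted in the solution.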

Henceforth, we say that a solution was derived by making use of the Product Rule if we have established a one-to-one correspondence between a set under consideration and some Cartesian product, and computed the cardinality of this product by making use of Theorems 1.1.29–1.1.30. The latter theorems are also referred to as the Product Rule.

The subsequent problems illustrate the Sum and Product Rules.

Problem 1.2.3. Find the number of car license plates containing four letters and three digits (as in the previous problem, there are 26 letters and 10 digits available).

Solution. Using the same notation and reasoning as in the preceding problem and applying the Product Rule, we get the answer: there are

$$|\Lambda \times \Lambda \times \Lambda \times \Lambda \times \Delta \times \Delta \times \Delta| = |\Lambda|^4 \cdot |\Delta|^3 = 26^4 \cdot 10^3 = 456\,976\,000$$

license plates.

Problem 1.2.4. Find the number of license plates containing four letters and either one, or two, or three digits.

Solution. The desired set of license plates $\Pi$ comprises objects of three types: with one, with two, or with three digits. Thus we can set up the equation

$$\Pi = \Pi_1 \cup \Pi_2 \cup \Pi_3,$$

where $\Pi_i$ denotes the set of plates containing $i$ digits, $i = 1, 2, 3$. Moreover, the claim that a plate contains one digit clearly distinguishes such a plate from those with two or three digits, whence the subsets $\Pi_i$, $i = 1, 2, 3$, are pairwise disjoint. Hence we can apply equation (1.2.1) and conclude that $|\Pi| = |\Pi_1| + |\Pi_2| + |\Pi_3|$.

The cardinal number $|\Pi_3| = 456\,976\,000$ was found in Problem 1.2.3. In the same way we calculate the cardinal numbers $|\Pi_1|$ and $|\Pi_2|$,

$$|\Pi_1| = |\Lambda \times \Lambda \times \Lambda \times \Lambda \times \Delta| = |\Lambda|^4 \cdot |\Delta| = 26^4 \cdot 10 = 4\,569\,760,$$

$$|\Pi_2| = |\Lambda \times \Lambda \times \Lambda \times \Lambda \times \Delta \times \Delta| = |\Lambda|^4 \cdot |\Delta|^2 = 26^4 \cdot 10^2 = 45\,697\,600,$$

and the total number of plates is

$$|\Pi| = |\Pi_1| + |\Pi_2| + |\Pi_3| = 4\,569\,760 + 45\,697\,600 + 456\,976\,000 = 507\,243\,360.$$
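The arithmetic of Problems 1.2.3 and 1.2.4 combines the Product Rule inside each class of plates with the Sum Rule across the classes. A small sketch follows; the function name `plates` and its parameters are hypothetical, introduced only for this illustration.

```python
def plates(num_letters, num_digits, alphabet_size=26, digit_count=10):
    """Product Rule: plates with the given numbers of letters and digits."""
    return alphabet_size ** num_letters * digit_count ** num_digits


# Sum Rule over the pairwise disjoint classes with 1, 2, or 3 digits.
by_digits = {i: plates(4, i) for i in (1, 2, 3)}
total_plates = sum(by_digits.values())

print(by_digits)
print(total_plates)
```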


Problem 1.2.5. Three polyhedron-shaped beads having, respectively, six faces (cube), eight faces (octahedron), and ten faces (decahedron) are rolled simultaneously. Their faces are numbered, respectively, from 1 through 6, from 1 through 8, and from 1 through 10. After each roll we write down the numbers on the faces they landed on.

1) In how many different ways can these beads land?

2) In how many different ways can these beads land, if at least two of them fall on the faces marked with a 1?

Solution. 1) The result of each roll can be written as an ordered (since, for instance, a 7 cannot occur in the first position) triple $(a, b, c)$, where $1 \le a \le 6$, $1 \le b \le 8$ and $1 \le c \le 10$. Thus, we can directly use the Product Rule, implying that there are $6 \cdot 8 \cdot 10 = 480$ variants of landing these beads.

2) Let $P_{cu}$ be the set of all possible results of the landing such that the octahedron and the decahedron read a 1, and the cube shows any face; obviously (or by the Product Rule again), $|P_{cu}| = 6 \cdot 1 \cdot 1 = 6$. The sets $P_{oc}$ and $P_{de}$ are defined in a similar way, $|P_{oc}| = 8$ and $|P_{de}| = 10$. After that we are tempted to apply the Sum Rule and to compute the “answer”: $6 + 8 + 10 = 24$.

However, the Sum Rule does not apply here and this “answer” is wrong, since the three sets $P_{cu}$, $P_{oc}$ and $P_{de}$ are not mutually exclusive: they have a non-empty intersection containing one element, namely the triple $(1, 1, 1)$. To take this intersection into account, it is convenient to introduce three other sets, $\widehat{P}_{cu}$, $\widehat{P}_{oc}$ and $\widehat{P}_{de}$, where $\widehat{P}_{cu}$ stands for the set of all possible results of landing of the beads such that the octahedron and the decahedron read a 1, but the cube shows any face but a 1; $\widehat{P}_{oc}$ and $\widehat{P}_{de}$ are defined similarly. It is clear now that $|\widehat{P}_{cu}| = 5$, since one of the six faces of the cube is now excluded, and $|\widehat{P}_{oc}| = 7$, $|\widehat{P}_{de}| = 9$.

The three “hatted” sets are disjoint, but now another obstacle appears: these sets do not exhaust all the ordered triples in the problem. Introduce the set $P_1 = \{(1, 1, 1)\}$ corresponding to the case when all three beads land on a 1; thus, $|P_1| = 1$. These four sets, $\widehat{P}_{cu}$, $\widehat{P}_{oc}$, $\widehat{P}_{de}$, and $P_1$, partition the set of all the possible outcomes in the problem, and by the Sum Rule we have the answer, $5 + 7 + 9 + 1 = 22$.
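Since the sample space of Problem 1.2.5 is small, both parts can be verified by brute force. The sketch below enumerates all ordered triples and then rebuilds the four-block partition used in the solution; the variable names (`hat_cu`, `hat_oc`, `hat_de`, `all_ones`) are our own labels for the "hatted" sets and the singleton set.

```python
from itertools import product

cube, octahedron, decahedron = range(1, 7), range(1, 9), range(1, 11)

# Part 1: every outcome is an ordered triple (a, b, c).
outcomes = list(product(cube, octahedron, decahedron))

# Part 2: outcomes in which at least two beads show a 1.
at_least_two_ones = [t for t in outcomes if t.count(1) >= 2]

# The four blocks of the partition used in the solution.
hat_cu = [t for t in outcomes if t[0] != 1 and t[1] == 1 and t[2] == 1]
hat_oc = [t for t in outcomes if t[0] == 1 and t[1] != 1 and t[2] == 1]
hat_de = [t for t in outcomes if t[0] == 1 and t[1] == 1 and t[2] != 1]
all_ones = [(1, 1, 1)]

print(len(outcomes), len(at_least_two_ones))
print(len(hat_cu), len(hat_oc), len(hat_de), len(all_ones))
```

The naive count $6 + 8 + 10 = 24$ overshoots the enumerated value by 2 because the triple $(1, 1, 1)$ is counted three times instead of once.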

Problem 1.2.6. In how many ways can one choose two movies in different genres out of five different comedies, seven different thrillers, and ten different dramas?

Solution. Combining the Sum and Product Rules, we arrive at the answer: $5 \cdot 7 + 5 \cdot 10 + 7 \cdot 10 = 155$ variants.
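The answer to Problem 1.2.6 can be cross-checked by enumeration. In this sketch each movie is modelled as a hypothetical pair `(genre, index)`; we list all unordered pairs of movies and keep those whose genres differ.

```python
from itertools import combinations

# Hypothetical catalogue: only the genre sizes 5, 7, 10 come from the problem.
movies = (
    [("comedy", i) for i in range(5)]
    + [("thriller", i) for i in range(7)]
    + [("drama", i) for i in range(10)]
)

# Unordered pairs of movies belonging to different genres.
mixed_pairs = [(x, y) for x, y in combinations(movies, 2) if x[0] != y[0]]

by_formula = 5 * 7 + 5 * 10 + 7 * 10
print(len(mixed_pairs), by_formula)
```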

Problem 1.2.7. A Combi Club has 18 members. In how many ways can the members elect the President and the Treasurer of the Club?

Solution. Let $S$ be the set of the Club members, $|S| = 18$. If a student $s_1$ was elected the Club President, then there are only 17 candidates the Treasurer can be chosen from. Thus there are 17 ways to elect the President and the Treasurer given that the student $s_1$ is to be the President. If the student $s_2$ is to be the President, we also have 17 possibilities, etc. Since these 18 options do not intersect, we can apply the Sum Rule and get

$$\underbrace{17 + 17 + \cdots + 17}_{18 \text{ addends}} = 17 \cdot 18 = 306$$

different results of the elections.
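The case analysis of this solution, one case per possible President, can be sketched as follows. The member labels `s1`, ..., `s18` follow the text, while the dictionary `treasurers` is a hypothetical helper introduced for this illustration.

```python
members = [f"s{i}" for i in range(1, 19)]

# For each possible President, the list of admissible Treasurers.
treasurers = {p: [t for t in members if t != p] for p in members}

# Sum Rule over the 18 disjoint cases "p is the President".
total_results = sum(len(ts) for ts in treasurers.values())

print(total_results)
# The candidate sets have equal sizes but are different sets.
print(set(treasurers["s1"]) == set(treasurers["s18"]))
```

The second printed value illustrates why this is not a literal Cartesian product: the set of candidates for Treasurer depends on who was elected President, even though its size is always 17.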

Remark 1.2.1. We notice in this problem another issue, important in many combinatorial problems. In our solution we implicitly assume that one student cannot serve simultaneously as the President and the Treasurer. In Problem 1.2.7 such repeated choices were not allowed, but it may be otherwise elsewhere. Indeed, in the business world we often see a person who is simultaneously the President and the CEO of a company. To distinguish these two kinds of problems, we say that a problem allows or does not allow repetition. It is worth emphasizing that, like the order of elements, the assumption of (non)repetition depends on a particular problem. Combinatorics only provides the means for solving both kinds of problems.

The end of the solution of this problem is similar to applying the Product Rule. However, the number 17 here is not the cardinality of a specific set: the sets of candidates for the Treasurer are different for different elected Presidents, even though they all have the same cardinality. It may be convenient to model this and similar problems by a special drawing, a tree of alternatives, similar to the one in Fig. 1.1. We study such drawings in more detail in Chapter 2.

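[Figure: a tree with the root $O$; the first-level branches labelled $s_1, \ldots, s_{18}$ lead to the vertices $P_1, \ldots, P_{18}$; the second-level branches, labelled $s_2, \ldots, s_{18}$ at $P_1$ and $s_1, \ldots, s_{17}$ at $P_{18}$, lead to the pendant vertices $T$.]

Figure 1.1. The tree of alternatives in Problem 1.2.7.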

This tree represents all possible outcomes of the voting. Since any one of the 18 students $s_1, s_2, \ldots, s_{18}$ can be the Club President, 18 first-level branches, incident to the root $O$ and labelled by $s_1, s_2, \ldots, s_{18}$, represent 18 possible results of the President election; Fig. 1.1 displays only a few branches corresponding to $s_1$ and $s_{18}$. If the President has been elected, there are only 17 candidates for the Treasurer; however, the sets of candidates are all different. Indeed, if the student $s_1$ has been elected as the President (this case is depicted by the subtree at the vertex $P_1$ in Fig. 1.1), then only the students $s_2, s_3, \ldots, s_{18}$ may run for the Treasurer, hence there are 17 second-level branches incident to the left vertex $P$ and labelled by $s_2, \ldots, s_{18}$. If the student $s_{18}$ has been elected as the President (this case is depicted by the subtree at the vertex $P_{18}$), then only the students $s_1, s_2, \ldots, s_{17}$ may run for the Treasurer, hence there are 17 second-level branches incident to the right vertex $P$ and labelled by $s_1, \ldots, s_{17}$; likewise, the tree has 16 intermediate branches between $s_1$ and $s_{18}$. The tree has $18 \cdot 17 = 306$ pendant vertices representing all possible results of the voting.

The tree in Fig. 1.1 is regular, that is, every vertex, except for the pendant ones, has the same number of incident second-level branches. In other problems these quantities can be different. To solve such problems, the Sum Rule may be of use.

[Figure: a tree of alternatives. First level, above the broken line $\alpha$: A is 1st, A is 2nd, A is 3rd. Second level, above the broken line $\beta$: the admissible places of B (2nd, 3rd, or 4th if A is 1st; 3rd or 4th if A is 2nd; 4th if A is 3rd). The lower levels show the places of C and D; the tree has nine pendant vertices.]

Figure 1.2. The tree of alternatives in Problem 1.2.8.

Problem 1.2.8. Four people, $A$, $B$, $C$, and $D$, took part in a car race. A student has only partial information on the results of the race. It is known that $B$ lost to $A$, $C$ was not the last one, and there were no ties. How many different results are possible in the race?

Solution. The tree of alternatives for the race is drawn in Fig. 1.2. Since $B$ finished after $A$, $A$ can finish either 1st, or 2nd, or 3rd. The first level of the tree, above the broken line $\alpha$, has three branches representing these alternatives. Next, if $A$ is 1st, then $B$ can be either 2nd, or 3rd, or 4th; if $A$ is 2nd, $B$ can be either 3rd or 4th; and if $A$ is 3rd, $B$ can be only the last one. The second level of the tree, above the broken line $\beta$, represents these alternatives. The entire tree (Fig. 1.2) has nine pendant vertices corresponding to nine possible results of the race.
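The count of pendant vertices can be confirmed without drawing the tree, by filtering all finishing orders. This is a brute-force sketch rather than the tree method of the text; the grouping by the place of $A$ corresponds to the three first-level branches of the tree.

```python
from collections import Counter
from itertools import permutations

racers = "ABCD"
valid = []
for order in permutations(racers):       # order[0] is 1st, order[3] is 4th
    place = {r: i + 1 for i, r in enumerate(order)}
    # B lost to A, and C was not the last one.
    if place["A"] < place["B"] and place["C"] != 4:
        valid.append(order)

# Number of admissible results under each first-level branch (place of A).
by_place_of_A = Counter(order.index("A") + 1 for order in valid)

print(len(valid))
print(dict(by_place_of_A))
```

The subtrees have different sizes, so the total is obtained by the Sum Rule over the three branches rather than by a single product.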


EP 1.2.1. Bob participates in two sweepstakes simultaneously. In the first one he can win one out of four books, in the other, one out of five tapes. How many different pairs of prizes can he bring home?

EP 1.2.2. Betty takes part in two book raffles. In the first one she can win one of four different books. In the second raffle she can win one of five different books, but one of them is the same as in the first drawing. If she wins the same book twice, she may exchange one copy for another book distinct from any book drawn. How many pairs of different books can Betty win?

EP 1.2.3. A dog and a cat can peacefully sit and dine side by side, but if two dogs or two cats are sitting alongside, they start fighting. In how many ways can $n$ dogs and $n$ cats be peacefully seated at a round table?

EP 1.2.4. How many licence plates consisting of one letter and two digits are there if at least one of these digits is to be 9?

EP 1.2.5. How many licence plates consisting of three letters and four digits are there if at least one of these digits is 9?

EP 1.2.6. Four cars take part in a race. How many ways are there to finish the race if ties between the second and the next places are allowed but the winner cannot make a tie?

EP 1.2.7. Among the following nine sets, what combinations of them make up partitions of the set of natural numbers $\mathbb{N}$?

1) $N_1 = \{1\}$

2) $N_2 = \{1, 2\}$

3) $T_0 = \{n = 3k \mid k \in \mathbb{N}\}$

4) $T_1 = \{n = 3k + 1 \mid k \in \mathbb{N}\}$

5) $T_2 = \{n = 3k + 2 \mid k \in \mathbb{N}\}$

6) $P$ – the set of all prime numbers

7) $P^c$ – the set of all composite numbers greater than 1

8) $\mathbb{N}$

9) $\emptyset$


EP 1.2.8. A gentleman has eight shirts and five ties. In how many ways can he choose a shirt and a tie to go out, if he cannot combine a shirt $S_1$ with ties $T_1$ and $T_2$, and also a shirt $S_2$ with ties $T_1$ and $T_3$?

EP 1.2.9. A family consisting of mother, father, four daughters, and two sons participates in a mixed doubles badminton tournament, where each team consists of a female and a male player. How many different family teams are possible if the youngest daughter does not want to be on a team with her elder brother?

EP 1.2.10. Prove the following combination of the Sum and Product Rules, when subsets may have different cardinalities. Let $\varrho$ be a subset of a direct product $X \times Y$ of finite sets $X = \{a_1, \ldots, a_m\}$ and $Y = \{b_1, \ldots, b_n\}$. Then

$$|\varrho| = \sum_{k=1}^{m} \operatorname{card}(a_k, \cdot) = \sum_{l=1}^{n} \operatorname{card}(\cdot, b_l), \qquad (1.2.2)$$

where $\operatorname{card}(a_k, \cdot)$ or $\operatorname{card}(\cdot, b_l)$ is, respectively, the number of ordered pairs in $\varrho$ with the first element $a_k$ or with the second element $b_l$.

Thus, $\varrho$ is a binary relation between $X$ and $Y$, and equation (1.2.2) allows us to compute the cardinality of an arbitrary binary relation.

EP 1.2.11. How many divisors does the number $2^3 3^4 5^5 7^6$ have? Find the sum of all divisors.

EP 1.2.12. A student put 5 sheets of paper in a shredder. The shredder cut some of these sheets into 5 parts, then cut some of these pieces into 5 parts, and so on. When the shredder stopped, the student found 2006 small pieces of paper in the shredder. Is this count correct?

EP 1.2.13. In how many ways is it possible to place three rooks on the $8 \times 8$ chessboard so that no two of them can attack one another?

EP 1.2.14. In the year 2006 there were 2006 meetings of student clubs in Big Club College, each meeting attended by 40 students. For any two meetings, exactly one student attended both of them. Prove that there was a student who attended all 2006 meetings.

EP 1.2.15. How many four-digit natural numbers are multiples of 7?

EP 1.2.16. The vertices of a triangle belong to the set of the vertices of a given convex $n$-gon, but no side of the triangle is an entire side of the $n$-gon. How many such triangles are there?

EP 1.2.17. All integer numbers from 1 through 2 222 222 are written in a row. How many times does each of the digits 0, 1, 2 appear in this series of digits?


1.3 Arrangements and Permutations

In this section we deal with ordered totalities of objects, called here arrangements. To introduce them, we consider a model problem.

Listing permutations

Problem 1.3.1 (Problem 1.2.7 revisited). The Combi Club has 18 members. In how many ways can the members elect the President and the Treasurer of the Club?

Solution. The following solution is similar to the solution of this problem in Section 1.2, but we put it in different terms. Suppose that the Election Board reports the results of the voting, using the form

P = ______    T = ______

and fills in the two blank spaces with the names of the students elected. To convert this form to standard mathematical notation, we introduce two sets, the two-element set $C = \{P, T\}$ symbolizing the positions to be filled in and the set $S = \{s_1, \ldots, s_{18}\}$ of the Club members. Denote the result of an election by $(s(P), s(T))$, where $s(P)$ signifies the student having been elected to preside, and $s(T)$ stands for the student having been elected to count money.

We see that the result of every voting can be described as a mapping $v$ with the domain $C$ and the codomain $S$. Choosing the President of the Club, we associate with the element $P \in C$ an element $s' = s(P) \in S$; choosing the Treasurer, we associate an element $s'' = s(T) \in S$ with the element $T \in C$.

Suppose that a student cannot simultaneously serve as the President and the Treasurer, that is, different persons have to be elected for these two positions; in our notation it must be $s' = s(P) \ne s'' = s(T)$. Thus the mapping $v$ is to be injective, $v \in \mathrm{Inj}(S^C)$. Vice versa, each injective mapping $v \in \mathrm{Inj}(S^C)$ can be interpreted as the result of some voting in this Club. Hence we see that there is a one-to-one correspondence between the set of all possible results of the election and the set $\mathrm{Inj}(S^C)$ of all injective mappings from $C$ into $S$. Now Theorem 1.1.45 with $n = 2$ and $m = 18$ implies that there are $18!/16! = 18 \cdot 17 = 306$ different outcomes of the election, as we have already found in Problem 1.2.7.

Suppose now that one person may be elected for both positions. In this case we have to take into account not only injective but all mappings from $C$ to $S$, that is, the entire power set $S^C$. By Theorem 1.1.35, we get $|S^C| = 18^2 = 324$ different results of this voting.
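The two counts can be reproduced by listing the mappings from $C$ to $S$ directly. In this sketch a mapping is represented as a Python dictionary from positions to students; this representation, and the names `all_mappings` and `injective`, are assumptions made for the illustration.

```python
from itertools import product

C = ["P", "T"]                            # positions to be filled
S = [f"s{i}" for i in range(1, 19)]       # the 18 Club members

# Every mapping v: C -> S is determined by the pair (v(P), v(T)).
all_mappings = [dict(zip(C, images)) for images in product(S, repeat=len(C))]

# A mapping is injective when distinct positions get distinct students.
injective = [v for v in all_mappings if len(set(v.values())) == len(C)]

print(len(all_mappings), len(injective))
print(len(all_mappings) - len(injective))
```

The difference of the two counts is the number of mappings that send both positions to the same student.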


Remark 1.3.1. The difference $324 - 306 = 18$ gives the number of possible outcomes of the voting when one student is elected for both offices.

Considering this problem as a model, we give the following definitions. It is clear that the answer does not depend on particular sets, like $C$ and $S$ in Problem 1.3.1; it depends only upon their cardinalities. Therefore, for the domains of mappings in these definitions we always use natural segments $\mathbb{N}_n$ with various $n$. For instance, in Problem 1.3.1 $n = 2$.

Definition 1.3.2. Let $A$ be a finite set, $|A| = m \in \mathbb{N}$. An arbitrary mapping $f : \mathbb{N}_n \to A$ is called an $n$-arrangement with repetition of the elements of the set $A$, or more precisely, an arrangement of $m$ elements taken $n$ at a time.

Let the element $a_i \in A$ be the image of the element $i \in \mathbb{N}_n$ under the mapping $f$, $a_i = f(i)$. Since arrangements are ordered totalities, we denote the arrangements with repetition by $(a_1, a_2, \ldots, a_n)$, using the same notation as for $n$-tuples. If $|A| = m$, then the number of $n$-arrangements with repetition is denoted by $A_{rep}(m, n)$.

Theorem 1.3.3. By Theorem 1.1.35, the number of $n$-arrangements with repetition is

$$A_{rep}(m, n) = |A^{\mathbb{N}_n}| = |A|^n = m^n. \qquad (1.3.1)$$

This number certainly depends upon the cardinality $m$ of the set $A$, but not on the specific nature of its elements.
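Formula (1.3.1) is easy to test on small sets. In the sketch below an $n$-arrangement with repetition is represented by the tuple $(a_1, \ldots, a_n)$ of its values, and the arrangements are generated with `itertools.product`; the function name `arrangements_with_repetition` and the sample sets are hypothetical.

```python
from itertools import product


def arrangements_with_repetition(A, n):
    """All n-arrangements with repetition of the elements of A, as n-tuples."""
    return list(product(sorted(A), repeat=n))


A = {"a", "b", "c"}
B = {"x", "y", "z"}        # a different set of the same cardinality

counts_A = [len(arrangements_with_repetition(A, n)) for n in range(1, 5)]
counts_B = [len(arrangements_with_repetition(B, n)) for n in range(1, 5)]

print(counts_A)
print(arrangements_with_repetition({"a", "b"}, 2))
```

The two lists of counts coincide, which illustrates that the number depends only on the cardinality of the set and not on the nature of its elements.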

Definition 1.3.4. Let $A$ be a finite set, $|A| = m \in \mathbb{N}$. Any injective mapping $f : \mathbb{N}_n \to A$ is called an $n$-arrangement without repetition of the elements of $A$, or more precisely, an arrangement of $m$ elements without repetition, taken $n$ at a time.

We often omit the specification “without repetition”, assuming that an “$n$-arrangement” always means an arrangement without repetition, but “with repetition” must be specified. Arrangements with and without repetition are denoted by the same symbol $(a_1, a_2, \ldots, a_n)$. If $|A| = m$, the number of $n$-arrangements without repetition is denoted$^5$ by $A(m, n)$.

Theorem 1.3.5. By Theorem 1.1.45,

$$A(m, n) = |\mathrm{Inj}(A^{\mathbb{N}_n})| = \frac{m!}{(m - n)!}, \qquad 1 \le n \le m. \qquad (1.3.2)$$

We also set by definition

$$A(m, 0) = 1 \quad \text{and} \quad A(m, n) = 0 \ \text{ if } n < 0 \text{ or } n > m. \qquad (1.3.3)$$
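Formulas (1.3.2) and (1.3.3) translate into a short function. This is a minimal sketch: the Python name `A` simply mirrors the notation $A(m, n)$, and the comparison with `itertools.permutations`, which lists the injective $n$-tuples, is our own sanity check.

```python
from itertools import permutations
from math import factorial


def A(m, n):
    """Number of n-arrangements without repetition of m elements."""
    if n < 0 or n > m:
        return 0                      # convention (1.3.3)
    return factorial(m) // factorial(m - n)   # (1.3.2); equals 1 when n == 0


# Compare the formula with an explicit listing for small m and n.
agrees = all(
    A(m, n) == len(list(permutations(range(m), n)))
    for m in range(1, 7)
    for n in range(0, m + 2)
)

print(A(18, 2), A(7, 7), A(5, 0), A(5, 6), agrees)
```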

$^5$ Sometimes the notations $P(m, n)$ and ${}_mP_n$ are used, and these arrangements are called $n$-permutations.


Remark 1.3.6. In other words, arrangements without repetition of the elements of a set $A$ can be considered as ordered $n$-subsets of $A$. Thus, to introduce the arrangements in a proper way, we have to either accept ordered sets as a primary, undefined concept, or define them through another notion. At the same time the arrangements with repetition can contain several copies of the same element, although no set can contain repeating elements. Therefore, the arrangements cannot be defined as sets. To unify definitions, it is convenient to introduce arrangements both with and without repetition as mappings, as it has been done above.

Definition 1.3.7. In the case $m = n$ the arrangements (without repetition) are called permutations (of $n$ elements) or $n$-permutations; their number is denoted by $P(n)$.

Theorem 1.3.8. By Lemma 1.1.47, the number of $n$-permutations is

$$P(n) = A(n, n) = n!, \qquad n > 0. \qquad (1.3.4)$$

Remark 1.3.9. Therefore, the permutations of a set $A$ are bijective mappings. If the elements of $A$ are ordered, for instance, if they are numbered by natural numbers, like $A = \{a_1, a_2, \ldots, a_m\}$, then any permutation gives a reordering of $A$, for example, $(a_{i_1}, a_{i_2}, \ldots, a_{i_m})$. This sequence of elements, that is, the ordered image under the original bijection, is also often called a permutation of the set $A$. We return to permutations in Section 4.5.

Example 1.3.10. Thus, if a Board consists of seven members, they can be seated in a row in $P(7) = 7! = 5040$ ways.

Problem 1.3.2. Prove the recurrence relation

$$P(m) = A(m, m) = A(m, n) \cdot A(m - n, m - n), \qquad 0 \le n \le m.$$

Remark 1.3.11. If formula (1.3.4) has been proven independently, say by mathematical induction, then we can deduce (1.3.2) from (1.3.4) and Problem 1.3.2.
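A numerical check is not a proof, but it is a convenient way to see the relation of Problem 1.3.2 and the deduction mentioned in Remark 1.3.11 at work. The sketch below assumes Python 3.8 or later, where the standard function `math.perm(m, n)` returns $m!/(m-n)!$; the helper names are hypothetical.

```python
from math import factorial, perm


def recurrence_holds(m, n):
    """Check P(m) = A(m, n) * A(m - n, m - n) for one pair 0 <= n <= m."""
    return factorial(m) == perm(m, n) * perm(m - n, m - n)


checked = all(recurrence_holds(m, n) for m in range(0, 13) for n in range(0, m + 1))


def A_from_permutations(m, n):
    """Solve the recurrence for A(m, n), using only P(k) = k!."""
    return factorial(m) // factorial(m - n)


print(checked, A_from_permutations(18, 2))
```

Solving the recurrence for $A(m, n)$ gives $A(m, n) = P(m)/P(m - n)$, which is exactly the right-hand side of (1.3.2) once $P(k) = k!$ is known.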

Problem 1.3.3. A bus route has nine stops, excluding the departure stop; there are 23 passengers in the bus. In how many ways can they get off the bus?

Solution. First of all we have to clarify which runs of the bus we treat as different. We consider two runs to be different if there is at least one stop such that the sets (not the quantities!) of passengers leaving the bus at this stop in the first run and in the second run differ. Denote the set of all passengers by $P$, $|P| = 23$, and the set of the stops by $S$, $|S| = 9$. Now we can associate a mapping $r : P \to S$ with every run of the bus. Namely, if a passenger $p$ gets off the bus at a stop $s$, then $r(p) = s$. Next we notice that there is a one-to-one correspondence between the bus runs and these mappings. By Theorem 1.3.3, there are $9^{23}$ different runs of the bus.
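The number $9^{23}$ is far too large to enumerate, but the same model can be tested on small instances. In this sketch a run is a tuple assigning a stop to every passenger; the function name `count_runs` and the small test sizes are our own choices.

```python
from itertools import product


def count_runs(passengers, stops):
    """Brute-force count of the mappings r: P -> S for small sets."""
    return sum(1 for _ in product(range(stops), repeat=passengers))


# Small instances agree with the formula stops ** passengers.
small_cases = {(p, s): count_runs(p, s) for p in range(1, 5) for s in range(1, 4)}

# The actual problem: 23 passengers, 9 stops.
runs = 9 ** 23
print(small_cases[(3, 2)], small_cases[(2, 3)])
print(runs)
```

Note that the roles of the two sets matter: the passengers form the domain and the stops the codomain, so the count is $9^{23}$, not $23^9$.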



EP 1.3.1. Compute $A(m, n)$ for all $m, n$, $0 \le n \le m \le 5$, and $P(n)$ for all $n$, $0 \le n \le 10$.

EP 1.3.2. (Problem 1.2.7 revisited again.) Suppose that for certain personal reasons, Ann and Alex cannot serve as officers together and Bob cannot be the Treasurer. In how many ways can the officers of the Combi Club be elected?

EP 1.3.3. Given 6 different balls and 4 different urns, in how many ways can we place 4 balls in 4 urns, one ball in an urn?

EP 1.3.4. We have 7 tasks to do. In how many ways can we choose 5 of them to perform one task a day during 5 consecutive weekdays?

EP 1.3.5. How many 9-digit natural numbers containing every digit $1, 2, \ldots, 9$ once are there?

EP 1.3.6. How many 10-digit natural numbers containing each digit $0, 1, 2, \ldots, 9$ once are there?

EP 1.3.7. How many 10-digit numbers with the sum of digits equal to 4 are there?

EP 1.3.8. Find the sum of all integer numbers containing the digits $1, 2, 3, 4$, such that any digit occurs in each number once.

EP 1.3.9. Find the sum of all 4-digit integers containing the digits $1, 2, \ldots, 9$, such that any digit occurs in each number no more than once.

EP 1.3.10. Town Infiniburg occupies the entire plane. It has $s$ straight parallel streets. In addition, it has $t$ more straight streets such that none among them is parallel to any one among the other $s + t - 1$ streets. Moreover, no three streets have a common intersection. Into how many blocks have the streets split the town?

EP 1.3.11. 1) Consider the first 1 000 000 natural numbers. What numbers make the majority among them: those whose decimal representation contains a 1, or those without a 1?

2) Solve the same problem for the first 10 000 000 natural numbers.

3) How many among the first 1 000 000 natural numbers contain exactly one of the digits 2, 3, and 4?

4) How many among the first 1 000 000 natural numbers contain exactly one digit 2 and two digits 3?


EP 1.3.12. How many of each of the digits 0; 1; 2; : : : ; 9must be used to represent allinteger numbers from 1 through 9 999 inclusive? From 1 through 10k � 1?

EP 1.3.13. How many are there 6-digit odd numbers without repeating digits? Howmany such numbers begin with a 1?

EP 1.3.14. How many permutations of the 10 digits 0; 1; : : : ; 9 contain either thesequence 246 or the sequence 578, but not both?

EP 1.3.15. How many permutations of the 10 digits 0; 1; : : : ; 9 contain the sequence246 or the sequence 680, but not both?

EP 1.3.16. How many permutations of the 10 digits contain either the sequence 246,or the sequence 680, or both?

EP 1.3.17. How many are there different 10-digit natural numbers consisting only ofdigits 1, 2, and 3, if a 3 appears precisely two times?

EP 1.3.18. A combination lock has 5 disks with 12 different symbols on each. Onlyone combination opens the lock. Assuming that it takes 10 seconds to change a com-bination, what is the maximum time necessary to open the lock at random?

EP 1.3.19. How many different pairs of disjoint subsets does an n-element set have?

EP 1.3.20. Solve the equations for integer n and k.

1) A.n; 2/ D 20

2) P.n/ D 5P.k/.

EP 1.3.21. There are n traffic lights in Lighttown, each with three standard colors –green, yellow, and red.

1) How many different combinations of signals can they show?

2) Answer the same question, if the lights TL1 and TL2 can only be either bothyellow or in opposite green-red state, that is, if one of them is green, then anothermust be red and vice versa.

EP 1.3.22. How many ways are there to assign 12 players to 5 coaches for practice?

EP 1.3.23. How many 10-digit phone numbers are there such that 0 and 9 do not appear among the first four digits?

EP 1.3.24. How many 3-digit natural multiples of 3 contain a digit 9 in their decimal representation? Recall that an integer is divisible by 3 if and only if 3 divides the sum of all its digits.


EP 1.3.25. How many 6-digit natural numbers divisible by 9 have 9 as their last digit? Recall that an integer is divisible by 9 if and only if 9 divides the sum of all its digits.

EP 1.3.26. Consider all $10^5$ whole 5-digit numbers, attaching, if necessary, a few zeros in front of such a number, like 00236. How many of them contain exactly one digit 0, one 1, one 2, and one 3?

EP 1.3.27. Show that the elements of an $n$-element set can be ordered in $n!$ ways.

EP 1.3.28. How many 4-arrangements of the letters $a, b, c, d, e, f$ are there, if they

1) begin with an a?

2) contain the letter a?

3) contain the two letters $a, b$?

4) contain the letters $a, b$ in this order?

EP 1.3.29. Find the number of arrangements of $n$ different objects taken $r$ at a time, if each arrangement must contain $p$ specified objects from the given $n$. When do such arrangements exist?

EP 1.3.30. Find the number of arrangements of $n$ different objects taken $r$ at a time, if each arrangement must contain $p$ specified objects from the given $n$, but cannot contain any of the other $q$ specified objects (assuming $p + q \le n$).

EP 1.3.31. 1) A college prepares three-student teams for a tournament. How many such teams can be made, if the students can be distinguished only by their standing (freshmen, sophomores, juniors, seniors)?

2) To get the Mass Award, the college must have at least 25 teams. Is it possible to get this award if this year the school has no seniors? If the school has only freshmen and sophomores?


1.4 Combinations

This section deals with unordered totalities of objects. The binomial coefficients and Catalan numbers inevitably make their appearance here. We also consider the trajectory method.

Pascal triangle • Pascal’s biography • Catalan’s biography • von Dyck’s biography • Dyck language • Kronecker’s biography • Vandermonde’s biography • Kaplansky’s biography

Problem 1.4.1. The same Combi Club with 18 members (see Problems 1.2.7 and 1.3.1) has to send two of its members to a meeting. How many ways are there to select these two delegates, assuming that both have the same rights and responsibilities?

Solution. As in Problem 1.2.7, we have to choose two different people. However, unlike Problem 1.3.1, this problem emphasizes that the order of the members chosen makes no difference; only the two selected names matter. This immediately brings to mind sets, where the order of elements does not count. Thus, any 2-member delegation can be viewed as a 2-element subset of the same set $S$ of the Club members, $|S| = 18$, and we have to compute the number of 2-element subsets of an 18-element set. Theorem 1.1.42 with $n = 18$ and $k = 2$ yields $|2^S_2| = 18!/(2! \cdot 16!) = 153$ delegations.
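The count can be confirmed by listing the delegations directly. The following is a minimal sketch; the labels 1 through 18 for the Club members are an assumption made only for illustration.

```python
from itertools import combinations
from math import comb, factorial

members = range(1, 19)  # hypothetical labels 1..18 for the Club members

# Every unordered pair of distinct members is one delegation.
delegations = list(combinations(members, 2))

by_formula = factorial(18) // (factorial(2) * factorial(16))

print(len(delegations))  # 153
print(by_formula)        # 153
print(comb(18, 2))       # 153
```

The enumeration, the factorial formula and the library function `math.comb` agree.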

Considering this analysis, we give the following definition.

Definition 1.4.1. Given a set $X$, any $k$-element subset of $X$ is called a combination (a $k$-combination without repetition) of the elements of $X$ taken $k$ at a time. The number of $k$-combinations of the elements of an $n$-element set $X$ is hereafter denoted by $C(n, k)$; sometimes the symbols ${}_nC_k$ and $C^k_n$ are also used. These quantities are also called binomial coefficients and denoted by $\binom{n}{k}$. For integer $n \ge 0$ we use both symbols $C(n, k) = \binom{n}{k}$ interchangeably; for other $n$ we will write only $\binom{n}{k}$.

In Section 1.1, the set of all such subsets, that is, the set of $k$-combinations, was denoted by $2^X_k$. By Theorem 1.1.42, if $|X| = n$, then for any $0 \le k \le n$,

$$|2^X_k| = \frac{n!}{(n-k)!\,k!}.$$

Clearly, this number does not depend on a particular set $X$, so long as its cardinal number is $|X| = n$.


Theorem 1.4.2. We immediately deduce from the latter formula the number of combinations (binomial coefficients),

$$C(n, k) = \frac{n!}{(n-k)!\,k!}, \quad 0 \le k \le n. \tag{1.4.1}$$

An $n$-element set cannot contain $k$-element subsets with $k > n$. Thus, for $k > n$ and $k < 0$ we set $C(n, k) = 0$.

Corollary 1.4.3. Now Corollary 1.1.44 can be stated as

$$C(n, 0) + C(n, 1) + \cdots + C(n, k) + \cdots + C(n, n) = 2^n.$$

The binomial coefficients with a negative upper index $-n < 0$, that is, $n > 0$, are defined as

$$\binom{-n}{k} = \frac{(-n)(-n-1)\cdots(-n-k+1)}{k!} = (-1)^k\,\frac{n(n+1)\cdots(n+k-1)}{k!} = (-1)^k\binom{n+k-1}{k} = (-1)^k C(n+k-1, k).$$

Some important properties of the binomial coefficients are discussed in the problems that follow, including the problems at the end of this section. Many more problems are scattered throughout the literature; in addition to the references mentioned above, see, for example, [31, Sect. 1.2.6].
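The definition of the binomial coefficients with a negative upper index can be checked numerically. The sketch below computes the product $a(a-1)\cdots(a-k+1)/k!$ exactly with rational arithmetic; the helper name `gen_binom` is hypothetical and not taken from the text.

```python
from fractions import Fraction
from math import comb

def gen_binom(a, k):
    """a(a-1)...(a-k+1) / k! for an integer upper index a and k >= 0."""
    result = Fraction(1)
    for j in range(k):
        result *= Fraction(a - j, j + 1)
    return result

# Compare with (-1)^k * C(n + k - 1, k) for small n and k.
for n in range(1, 6):
    for k in range(0, 6):
        assert gen_binom(-n, k) == (-1) ** k * comb(n + k - 1, k)

print(gen_binom(-3, 2))  # 6, that is, (-3)(-4)/2! = C(4, 2)
```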

Problem 1.4.2. Show that for $0 \le k \le n$,

$$C(n, k) = C(n-1, k-1) + C(n-1, k). \tag{1.4.2}$$

Solution. The equation easily follows from (1.4.1), but we shall prove it using specifically combinatorial reasoning useful in many instances (cf. the proof of Theorem 1.1.43). Choose any $n$-element set $X$ and an element $a \in X$. A $k$-element subset $Y \subseteq X$ either contains this $a$ or does not. If $a \in Y$, then $Y \setminus \{a\}$ is a $(k-1)$-element subset of the $(n-1)$-element set $X \setminus \{a\}$; otherwise, $Y$ itself is a $k$-element subset of $X \setminus \{a\}$. The sets of subsets $2^{X \setminus \{a\}}_{k-1}$ and $2^{X \setminus \{a\}}_{k}$ are disjoint, since no set can consist of $k$ elements and $k-1$ elements simultaneously; thus, by the definition of combinations and the Sum Rule, we get (1.4.2).


In the following chart, called Pascal’s triangle, every number, except for the unities at the boundary, is equal to the sum of its two upper neighbors: $2 = 1 + 1$, $3 = 1 + 2 = 2 + 1, \ldots$

                1
              1   1
            1   2   1
          1   3   3   1
        1   4   6   4   1
      1   5  10  10   5   1
              . . .

Since $C(n, 0) = C(n, n) = 1$, Problem 1.4.2 implies that all the entries in this numerical triangle are consecutive binomial coefficients. Indeed, the uppermost entry is $1 = C(0, 0)$; let us call this row the zeroth row. The next, first row contains $1 = C(1, 0)$ and $1 = C(1, 1)$; after that we have $1 = C(2, 0)$, $2 = C(2, 1)$, $1 = C(2, 2)$; the third row starts with $1 = C(3, 0)$, followed by $3 = C(3, 1)$, and so on. The sum of the entries in the $n$th row is $2^n$ by Corollary 1.1.44. Pascal’s triangle often appears in various problems.
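The rule (1.4.2) is all that is needed to generate Pascal’s triangle row by row. The following sketch, with the hypothetical helper `pascal_rows`, builds the first six rows and checks each against the binomial coefficients and the row sum $2^n$.

```python
from math import comb

def pascal_rows(n_rows):
    """Rows 0 .. n_rows-1 of Pascal's triangle, built only from rule (1.4.2)."""
    rows = [[1]]
    for n in range(1, n_rows):
        prev = rows[-1]
        # Interior entries are sums of the two upper neighbors.
        row = [1] + [prev[k - 1] + prev[k] for k in range(1, n)] + [1]
        rows.append(row)
    return rows

rows = pascal_rows(6)
for n, row in enumerate(rows):
    assert row == [comb(n, k) for k in range(n + 1)]
    assert sum(row) == 2 ** n
    print(row)
```

The last row printed is `[1, 5, 10, 10, 5, 1]`, the bottom row of the chart above.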

The following properties of the binomial coefficients are often helpful. The solutions of the next two problems are left to the reader.

Problem 1.4.3. Prove that

$$C(n, k) = C(n, n-k). \tag{1.4.3}$$

Problem 1.4.4. Use the combinatorial interpretation of binomial coefficients to prove the binomial formula, or binomial theorem,

$$(a+b)^n = a^n + na^{n-1}b + C(n, 2)a^{n-2}b^2 + \cdots + C(n, k)a^{n-k}b^k + \cdots + nab^{n-1} + b^n. \tag{1.4.4}$$

Evidently, the coefficients of $a^n$ and $b^n$ here can be written as $C(n, 0) = 1$ and those of $a^{n-1}b$ and $ab^{n-1}$ as $C(n, 1) = n$.

Problem 1.4.5. No three diagonals of a convex decagon (a polygon with 10 sides and 10 vertices) intersect at one point. Into how many segments are the diagonals split by the intersection points?


Solution. First, we find the number of the points of intersection. Any such point comes from two intersecting diagonals connecting four vertices of the decagon. Thus, each 4-element subset of the set of vertices generates exactly one intersection point, and we obtain $C(10, 4) = 10!/(4!\,(10-4)!) = 210$ intersection points. Every intersection point is an endpoint of exactly four of the segments we seek. Some of these segments join two intersection points, whereas others join an intersection point to a vertex of the decagon. Therefore, if we multiply 210 by 4 (because four segments meet at an interior intersection point), we count the former segments twice, but the latter segments only once. To overcome this discrepancy, we notice that each vertex has seven incident diagonals; therefore, there are $7 \cdot 10 = 70$ segments incident to the vertices of the decagon. Finally, the total number of segments is $\frac{1}{2}(4 \cdot 210 + 70) = 455$. The factor $\frac{1}{2}$ occurs here because a segment has two endpoints and the expression $4 \cdot 210 + 70$ counts them separately.
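The answer 455 can be cross-checked by a different count, diagonal by diagonal. This is a sketch under one stated assumption: a diagonal with $a$ vertices on one side and $b$ on the other is crossed by exactly $a \cdot b$ other diagonals, and since no three diagonals meet at one interior point, these crossings split it into $a \cdot b + 1$ segments. The function name is hypothetical.

```python
from math import comb

def segments_by_diagonals(n):
    """Count the segments of a convex n-gon one diagonal at a time."""
    total = 0
    for i in range(n):
        for j in range(i + 2, n):
            if i == 0 and j == n - 1:
                continue           # vertices 0 and n-1 are adjacent: a side
            a = j - i - 1          # vertices strictly between i and j
            b = n - 2 - a          # vertices on the other side
            total += a * b + 1     # a*b crossings give a*b + 1 pieces
    return total

print(segments_by_diagonals(10))  # 455

# The same count agrees with 2C(n, 4) + n(n - 3)/2 for other polygons.
for n in range(4, 13):
    assert segments_by_diagonals(n) == 2 * comb(n, 4) + n * (n - 3) // 2
```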

Problem 1.4.6. Where in the solution did we use the condition that three diagonals cannot intersect at a point?

Problem 1.4.7. Prove that in the case of an $n$-gon there are

$$2C(n, 4) + \frac{n(n-3)}{2}$$

such segments.

Problem 1.4.8. In how many ways can one choose three different numbers in the set $N_{300} = \{1, 2, \ldots, 299, 300\}$, so that 3 divides their sum?

Solution. Since the three numbers chosen must be different (we will do without this assumption in Problem 1.4.12) and their order makes no difference, each triple is a 3-element subset of the given set. But we cannot immediately apply 3-combinations, for not every triple satisfies the condition of the problem. We notice that when we divide an integer by 3, there are exactly three possible remainders, 0, 1, and 2, and to satisfy the condition, the remainders for each triple either must be the same or must be pairwise different. If the three remainders are different, that is, they are 0, 1, 2, then the numbers themselves are also different, and we have to select one number out of the 100 numbers $\{1, 4, 7, \ldots, 295, 298\}$, another number from the set $\{2, 5, 8, \ldots, 296, 299\}$, and the third number from the set $\{3, 6, 9, \ldots, 297, 300\}$; hence, we have $100^3$ such triples. The reader can put this result in the formal framework of the 3-arrangements with repetition.

Next, if each of the three remainders is 0, then there are $C(100, 3)$ such triples. The cases when the remainder is 1 or 2 give in addition $2C(100, 3)$ choices. Altogether, we get by the Sum Rule $100^3 + 3C(100, 3) = 1\,485\,100$ triples.
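A direct enumeration confirms this number. The sketch below walks through all $C(300, 3) = 4\,455\,100$ three-element subsets, so it is a brute-force check rather than an efficient method; the function name is hypothetical.

```python
from itertools import combinations
from math import comb

def count_triples(limit):
    """Number of 3-element subsets of {1, ..., limit} whose sum is divisible by 3."""
    return sum(1 for t in combinations(range(1, limit + 1), 3)
               if sum(t) % 3 == 0)

brute = count_triples(300)
formula = 100 ** 3 + 3 * comb(100, 3)
print(brute, formula)  # 1485100 1485100
```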


To introduce combinations with repetition, we analyze a sweet model problem.

Problem 1.4.9. A college cafeteria sells four kinds of pastries: biscuits (B), doughnuts (D), muffins (M), and napoleons (N). In how many ways can a student buy seven pastries?

Solution. A crucial point in this problem is to clarify in what way two purchases of pastries can be distinct from one another. Certainly, they can contain different quantities of similar items. For instance, one student bought three Bs and four Ms while another student bought four Bs and three Ms; of course, we consider these two purchases as different. Now, what if each of these two students bought, say, seven Bs? As physical objects, all these pastries are different, but once again we do not consider physical entities, but rather the corresponding mathematical symbols. If we think this way, both purchases of seven biscuits have the same notation (B, B, B, B, B, B, B). Therefore, in this problem any two symbols B are indistinguishable, and we must identify them. The same applies to the symbols D, M, and N.

Moreover, suppose a student bought seven pastries, put them on a tray and then shuffled them on the tray. It is natural not to consider this new ordering of the same seven pastries as a new buy; this is exactly the same purchase. Thus, the ordering does not count, and yet we cannot consider the strings (B, B, B, B, B, B, B), (B, B, B, B, D, D, D), etc., as subsets of some set, for no set can contain the same element twice. This is a typical problem about combinations with repetition, where one has to count the number of families of the same cardinality, containing elements of different types, provided that two families are considered to be different if and only if there is at least one type of elements which is represented in these two families by different quantities of elements. At the same time, neither the order of the elements nor which particular elements of any type are included matters.

This heuristic description is actually an informal definition of the combinations with repetition, and the reader can skip the following formal definition, which translates the description into the formal language of set theory. Before deriving the formula for the number of combinations with repetition in Theorem 1.4.5, we illustrate the proof by applying the method to solve Problem 1.4.9.

Solution of Problem 1.4.9 (continued). Since the order of pastries (objects) is immaterial, we fix any order; let it be, say, B, D, M, N. Suppose we bought 3 biscuits, 2 doughnuts, 1 muffin and 1 napoleon. If we write B, B, B, D, D, M, N, this string represents the buy but does not help us, and we want to develop a better way to represent the outcomes. If we write just seven zeros, 0, 0, 0, 0, 0, 0, 0, this is much simpler but does not represent the buy, since we do not know which zeros represent biscuits, etc. But since we know that the left-most zeros represent biscuits, we can insert a separator, say 1, which separates the zeros representing biscuits from the zeros representing doughnuts, etc.; therefore the string of 10 zeros and ones, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, represents the buy above in a unique way. For instance, the string 0, 0, 0, 0, 0, 1, 1, 1, 0, 0 means that the student bought 5 biscuits and 2 napoleons. We immediately see that there is a one-to-one correspondence between our buys and the set of all strings containing 7 zeros and 3 ones in any order. The number of the latter is easily found to be $C(10, 3) = C(10, 7) = 120$.
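The encoding of purchases by strings of zeros and ones can be tried out directly. This is a minimal sketch in which the function name `encode` and the representation of a purchase as a string of letters are assumptions made for illustration.

```python
from itertools import combinations_with_replacement
from math import comb

KINDS = "BDMN"  # the fixed order: biscuits, doughnuts, muffins, napoleons

def encode(purchase):
    """Zeros stand for pastries; a single 1 separates consecutive kinds."""
    groups = ["0" * sum(1 for item in purchase if item == kind)
              for kind in KINDS]
    return "1".join(groups)

# All purchases of seven pastries, each listed once regardless of order.
purchases = list(combinations_with_replacement(KINDS, 7))
codes = {encode(p) for p in purchases}

print(encode("BBBDDMN"))  # 0001001010
print(encode("BBBBBNN"))  # 0000011100
print(len(purchases), len(codes), comb(10, 3))  # 120 120 120
```

Distinct purchases receive distinct strings, and every string has 7 zeros and 3 ones, which is the one-to-one correspondence used in the solution.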

Now we give a formal definition of combinations with repetition.

Definition 1.4.4. Consider a set $X$, any $n$-partition of it,

$$X = X_1 \cup X_2 \cup \cdots \cup X_n,$$

and a natural number $r$. On the set $2^X_r$ of all $r$-element subsets of $X$ we introduce an equivalence relation (see Problem 1.4.10) as follows:

Two subsets $A, B \in 2^X_r$ are said to be equivalent, $A \sim B$, if

$$\begin{aligned} |A \cap X_1| &= |B \cap X_1| \\ |A \cap X_2| &= |B \cap X_2| \\ &\;\;\vdots \\ |A \cap X_n| &= |B \cap X_n|, \end{aligned}$$

that is, the sets $A$ and $B$ are equivalent if and only if they contain an equal number of elements of the subset $X_1$, and an equal number of elements of the subset $X_2, \ldots$, and an equal number of elements of the subset $X_n$. The equivalence relation partitions the set $2^X_r$ into disjoint equivalence classes. These equivalence classes, that is, the elements of the factor-set $2^X_r/\!\sim$, are called $r$-combinations with repetition or with identified elements from elements of $n$ types, or more precisely, combinations with identified elements of the subsets $X_i$, $1 \le i \le n$, taken $r$ at a time.

The number of $r$-combinations with repetition depends on $n$ and $r$, but not upon a specific set $X$, so that we denote this quantity by $C_{\mathrm{rep}}(n, r)$.

Problem 1.4.10. Verify that the binary relation in Definition 1.4.4 is an equivalence relation in the sense of Definition 1.1.23.

Theorem 1.4.5. If

$$1 \le r \le \min_{1 \le i \le n} |X_i|, \tag{1.4.5}$$

then

$$C_{\mathrm{rep}}(n, r) = C(n+r-1, r) = C(n+r-1, n-1) = \frac{(n+r-1)!}{(n-1)!\,r!}. \tag{1.4.6}$$


Proof. Consider the equivalence relation in Definition 1.4.4 and choose a representative in every equivalence class. Each representative determines an $r$-combination with repetition of the elements of $n$ types. Associate with this $r$-combination a sequence of $r$ symbols 0 and $n-1$ symbols 1 as follows. First, write down as many 0s as there are elements of the first type, that is, elements of the subset $X_1$, in this $r$-combination; if there is no element of the first type, we do not write a 0. After that write a 1, which separates two groups of 0s corresponding to different types of elements. Then write as many 0s as there are elements of the second type (from the subset $X_2$) in this $r$-combination and again write a separator 1, and so on; but we do not write a 1 after the very last, $n$th group of 0s.

In this way, we have constructed a one-to-one correspondence between all $r$-combinations with repetition from elements of $n$ types and the sequences of $r$ 0s and $n-1$ 1s. This one-to-one correspondence is useful, because we can easily find the number of the latter sequences. Indeed, this number is equal to the number of ways to choose, without ordering, $r$ places for 0s among the given $n+r-1$ places and fill the remaining $(n+r-1) - r = n-1$ places with 1s; or, which is the same, to select $n-1$ places for 1s. Now formula (1.4.6) follows immediately.
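The correspondence in the proof can also be run in the opposite direction, from separator positions to the numbers of elements of each type. The sketch below does this for small $n$ and $r$; the generator name `strings_to_counts` is hypothetical.

```python
from itertools import combinations
from math import comb

def strings_to_counts(n, r):
    """For every placement of n-1 ones among n+r-1 places, yield the tuple
    (c_1, ..., c_n) of sizes of the n groups of zeros."""
    total = n + r - 1
    for ones in combinations(range(total), n - 1):
        bounds = (-1,) + ones + (total,)
        yield tuple(bounds[i + 1] - bounds[i] - 1 for i in range(n))

print(list(strings_to_counts(2, 2)))  # [(0, 2), (1, 1), (2, 0)]

for n in range(1, 6):
    for r in range(1, 6):
        counts = list(strings_to_counts(n, r))
        assert all(sum(c) == r for c in counts)          # r elements in total
        assert len(set(counts)) == len(counts)           # no tuple repeats
        assert len(counts) == comb(n + r - 1, n - 1)     # formula (1.4.6)
```

Each tuple of group sizes occurs exactly once, so the number of $r$-combinations with repetition equals the number of separator placements.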

Problem 1.4.11. Where in the proof was the condition (1.4.5) used?

Second solution of Problem 1.4.9. We apply (1.4.6) with $n = 4$ and $r = 7$ and, as before, we compute $C_{\mathrm{rep}}(4, 7) = C(10, 3) = C(10, 7) = 120$ ways to buy seven pastries.

Problem 1.4.12. Solve Problem 1.4.8 again, now allowing triples with two or all three equal numbers.

Solution. This provision does not change the number of triples whose elements have different remainders after dividing by 3; there are still $100^3$ such triples. Consider now the numbers with the remainder 1, that is, the elements of the set $\{1, 4, \ldots, 298\}$. The cases of numbers with the remainders 2 or 0 are similar. Since the ordering is immaterial, the triples $\{1, 4, 7\}$ and $\{1, 7, 4\}$ must be identified; however, now we should also count triples with repeating elements, like $\{1, 4, 4\}$ or $\{4, 4, 4\}$. This is again a typical problem concerning combinations with repetition. Actually, we have in the problem not 100 different elements, but 100 various types of elements, and we have to select three elements of these types, which can be done in $C_{\mathrm{rep}}(100, 3) = C(102, 3)$ ways. All in all, there are $100^3 + 3C_{\mathrm{rep}}(100, 3) = 1\,515\,100$ such triples.
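As with Problem 1.4.8, the result can be confirmed by brute force. The sketch below enumerates all unordered triples with repetition allowed; the function name is hypothetical.

```python
from itertools import combinations_with_replacement
from math import comb

def count_triples_rep(limit):
    """Unordered triples from {1, ..., limit}, repetition allowed,
    whose sum is divisible by 3."""
    return sum(1 for t in combinations_with_replacement(range(1, limit + 1), 3)
               if sum(t) % 3 == 0)

brute_rep = count_triples_rep(300)
formula_rep = 100 ** 3 + 3 * comb(102, 3)
print(brute_rep, formula_rep)  # 1515100 1515100
```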

If the restriction (1.4.5), which guarantees that the entire combination can consist of identical elements, fails, the scheme is not immediately applicable. Nonetheless, problems with $r > \min_{1 \le i \le n} |X_i|$ can be solved using Theorem 1.4.5 and the Sum Rule. Consider the following modification of Problem 1.4.9.


Problem 1.4.13. A college cafeteria sells the same four kinds of pastries: biscuits (B), doughnuts (D), muffins (M), and napoleons (N); however, only three muffins remain in stock. In how many ways can a student buy seven pastries?

Solution. Since now there are only three objects of the M type and $3 < 7$, the condition (1.4.5) fails and we cannot immediately apply formula (1.4.6). Nevertheless, we can use it if we partition the set of all possible purchases into four disjoint subsets.

0) No muffin was bought.

1) One muffin was bought.

2) Two muffins were bought.

3) Three muffins were bought.

By making use of Theorem 1.4.5, in case 0) we have $C_{\mathrm{rep}}(3, 7)$ purchases, since we have to buy seven items of three types.

In case 1), we have $C_{\mathrm{rep}}(3, 6)$ purchases, since now, in addition to the one muffin bought, we have to buy six more pastries of the three other types. In case 2), there are $C_{\mathrm{rep}}(3, 5)$ purchases, because we buy two muffins and five pastries of the other three types. Finally, in case 3) there are $C_{\mathrm{rep}}(3, 4)$ purchases.

By the Sum Rule, we have

$$C_{\mathrm{rep}}(3, 7) + C_{\mathrm{rep}}(3, 6) + C_{\mathrm{rep}}(3, 5) + C_{\mathrm{rep}}(3, 4) = C(9, 7) + C(8, 6) + C(7, 5) + C(6, 4) = 100$$

purchases.
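The case analysis can be compared with a direct enumeration that simply discards every purchase containing more than three muffins. This is a small sketch; representing a purchase as a tuple of letters is an assumption for illustration.

```python
from itertools import combinations_with_replacement
from math import comb

# All purchases of seven pastries with at most three muffins.
limited = [p for p in combinations_with_replacement("BDMN", 7)
           if p.count("M") <= 3]

# The four cases: 0, 1, 2, 3 muffins, the rest chosen from three kinds.
by_cases = sum(comb(3 + r - 1, r) for r in (7, 6, 5, 4))

print(len(limited), by_cases)  # 100 100
```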

Problem 1.4.14. Show that if instead of (1.4.5) we have

$$r_1 = |X_1| < r \le \min_{2 \le i \le n} |X_i|, \tag{1.4.7}$$

then the number of $r$-combinations with repetition of elements of $n$ types is

$$C_{\mathrm{rep}}(n, r) = C(n+r-1, n-1) - C(n+r-r_1-2, n-1). \tag{1.4.8}$$

Solution. Arguing as in Problem 1.4.13, we represent the number sought as

$$C_{\mathrm{rep}}(n-1, r) + C_{\mathrm{rep}}(n-1, r-1) + \cdots + C_{\mathrm{rep}}(n-1, r-r_1) = C(n+r-2, n-2) + \cdots + C(n+r-r_1-2, n-2).$$

To obtain (1.4.8), we rewrite each addend here by formula (1.4.2) as

$$C(m, l) = C(m+1, l+1) - C(m, l+1)$$

and then combine like terms.
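Formula (1.4.8) can be tested against a brute-force count for small parameters. In the sketch below a combination is modeled as a tuple $(c_1, \ldots, c_n)$ of nonnegative integers with sum $r$ and $c_1 \le r_1$; this modeling and both function names are assumptions made for illustration.

```python
from itertools import product
from math import comb

def brute(n, r, r1):
    """Tuples (c_1, ..., c_n), c_i >= 0, with sum r and c_1 <= r1."""
    return sum(1 for c in product(range(r + 1), repeat=n)
               if sum(c) == r and c[0] <= r1)

def formula(n, r, r1):
    """Right-hand side of (1.4.8)."""
    return comb(n + r - 1, n - 1) - comb(n + r - r1 - 2, n - 1)

for n in range(2, 5):
    for r in range(2, 7):
        for r1 in range(1, r):
            assert brute(n, r, r1) == formula(n, r, r1)

print(formula(4, 7, 3))  # 100, the answer of Problem 1.4.13
```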

In the rest of this section we solve various enumerative problems.


Problem 1.4.15. How many whole-number solutions (that is, consisting of nonnegative integer numbers) does the equation

$$x_1 + x_2 + \cdots + x_k = n \tag{1.4.9}$$

have?

Solution. Introducing new unknowns $y_i = x_i + 1$, $1 \le i \le k$, we will look for positive integer solutions of the equivalent equation $y_1 + y_2 + \cdots + y_k = n + k$. If we represent the number $n + k$ on the right-hand side of the latter as the sum of $n + k$ unities, we immediately realize that solving the problem is equivalent to splitting $n + k$ identical items (in our case, 1s) into $k$ non-empty groups such that the $i$th group contains $y_i \ge 1$ 1s. To this end, we arrange these $n + k$ 1s in a row and observe that there are $n + k - 1$ spaces (gaps) between these 1s. To split the 1s into $k$ groups, we choose $k - 1$ places among these $n + k - 1$ gaps and insert some separators; for example, we can insert 0s in these gaps. This insertion can be done in $C(n+k-1, k-1) = C_{\mathrm{rep}}(k, n)$ ways, which is the number of solutions of equation (1.4.9).
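For small $k$ and $n$ the number of solutions of (1.4.9) can be counted by trying every candidate tuple. The sketch below is such a brute-force check, with a hypothetical function name.

```python
from itertools import product
from math import comb

def count_solutions(k, n):
    """Nonnegative integer solutions of x_1 + ... + x_k = n, by enumeration."""
    return sum(1 for x in product(range(n + 1), repeat=k) if sum(x) == n)

for k in range(1, 5):
    for n in range(0, 7):
        assert count_solutions(k, n) == comb(n + k - 1, k - 1)

print(count_solutions(3, 5), comb(7, 2))  # 21 21
```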

It is worth repeating that we have established a one-to-one correspondence between the set of solutions of (1.4.9) and a set with known cardinality, namely, the set of all $n$-combinations with repetition of the elements of $k$ types. In the following problem we systematically exploit the same approach: reducing the set in question to a set with a simpler structure, whose cardinality is known or can be found more easily.

Problem 1.4.16. Compute the sum of all natural numbers whose digits go either in increasing order or in decreasing order.

Solution. Let us denote a $k$-digit natural number $a$ with digits (from left to right) $a_1, a_2, \ldots, a_k$ by an overline, $a = \overline{a_1 a_2 \ldots a_k}$. The set of all natural numbers with strictly increasing digits is denoted by $\mathrm{INC}$ and the set of numbers with strictly decreasing digits is denoted, respectively, by $\mathrm{DEC}$; the sum of all numbers in a set $X$ is denoted by $\mathrm{SUM}(X)$.

Denote by $\mathrm{DEC}_0$ the set of all numbers with decreasing digits whose last digit is zero, and let $\mathrm{DEC}_1 = \mathrm{DEC} \setminus \mathrm{DEC}_0$; we have $\mathrm{DEC}_1 \cap \mathrm{DEC}_0 = \emptyset$, so that $\mathrm{SUM}(\mathrm{DEC}) = \mathrm{SUM}(\mathrm{DEC}_0) + \mathrm{SUM}(\mathrm{DEC}_1)$. We immediately observe that there is a one-to-one correspondence between $\mathrm{DEC}_0$ and $\mathrm{DEC}_1$, given by

$$b \in \mathrm{DEC}_1 \iff 10b \in \mathrm{DEC}_0.$$

Thus, $\mathrm{SUM}(\mathrm{DEC}_0) = 10\,\mathrm{SUM}(\mathrm{DEC}_1)$ and $\mathrm{SUM}(\mathrm{DEC}) = 11\,\mathrm{SUM}(\mathrm{DEC}_1)$. Pick a number $a'$ whose digits go in increasing order, say

$$a' = \overline{a_1 a_2 \ldots a_k} \in \mathrm{INC}, \quad a_1 < a_2 < \cdots < a_k,$$

and consider the number

$$b' = \overline{b_1 b_2 \ldots b_k}, \tag{1.4.10}$$

where $b_j = 10 - a_j$. The left-most digit of $a'$ cannot be 0, $a_1 \ne 0$, while all other digits must be bigger than $a_1$; thus, $1 \le a_j \le 9$ for $1 \le j \le k$. In turn, this implies

$$1 \le b_j = 10 - a_j \le 9, \quad 1 \le j \le k;$$

therefore, $b' \in \mathrm{DEC}_1$. For example, if $k = 3$ and $a' = 139$, then $b' = 971$; we observe that $a' + b' = 1110 = (10/9)(10^3 - 1)$. We generalize this observation in the following problem.

Problem 1.4.17. Prove that this observation is not a coincidence, that is, if $a'$ is a $k$-digit number with digits going in increasing order and $b'$ is defined by (1.4.10), then $a' + b' = (10/9)(10^k - 1)$.

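Next we notice that the pairing $a' \iff b'$ establishes a one-to-one correspondence between the sets $\mathrm{INC}$ and $\mathrm{DEC}_1$. For each $k = 1, 2, \ldots, 9$, the set $\mathrm{INC}$ contains $C(9, k)$ $k$-digit numbers, and every number $a \in \mathrm{INC}$ can be derived by removal of certain digits from the string 123456789. Thus,

$$\mathrm{SUM}(\mathrm{INC}) + \mathrm{SUM}(\mathrm{DEC}_1) = \sum_{k=1}^{9} C(9, k)\,(10/9)(10^k - 1) = (10/9)\left((1 + 10)^9 - (1 + 1)^9\right) = (10/9)(11^9 - 2^9).$$

Denote the latter number by $x = (10/9)(11^9 - 2^9)$ and let $9\mathrm{DEC}$ be the subset of the set $\mathrm{DEC}$ consisting of the numbers whose left-most digit is a 9. The sets $9\mathrm{DEC}$ and $\mathrm{DEC}' = \mathrm{DEC} \setminus 9\mathrm{DEC}$ make a partition of $\mathrm{DEC}$,

$$\mathrm{DEC} = 9\mathrm{DEC} \cup \mathrm{DEC}'.$$

We also notice that $\mathrm{DEC}'$ consists of all numbers with decreasing digits, including 0, whose first digit is not 9, and

$$\mathrm{SUM}(\mathrm{DEC}) = \mathrm{SUM}(9\mathrm{DEC}) + \mathrm{SUM}(\mathrm{DEC}').$$

Another one-to-one correspondence, now between the sets $\mathrm{INC}$ and $\mathrm{DEC}'$, is established by

$$a'' = \overline{a_1 a_2 \ldots a_k} \in \mathrm{INC} \iff b'' = \overline{(9 - a_1)(9 - a_2) \ldots (9 - a_k)} \in \mathrm{DEC}';$$

we immediately see that $a'' + b'' = 10^k - 1$. Thus, denoting $11^9 - 2^9 = y$, we have

$$\mathrm{SUM}(\mathrm{INC}) + \mathrm{SUM}(\mathrm{DEC}') = \sum_{k=1}^{9} C(9, k)(10^k - 1) = 11^9 - 2^9 = y.$$


Between the sets $\mathrm{INC} \cup \{0\}$ and $9\mathrm{DEC}$ there also exists a one-to-one correspondence by virtue of the pairing

$$a''' = \overline{(9 - b_1)(9 - b_2) \ldots (9 - b_k)} \in \mathrm{INC} \iff b''' = \overline{9 b_1 b_2 \ldots b_k} \in 9\mathrm{DEC}$$

for $k \ge 1$; if $k = 0$, we set $9 \iff 0$; therefore $a''' + b''' = 10^{k+1} - 1$. Denoting $10 \cdot 11^9 - 2^9 = z$, we derive from here that

$$\mathrm{SUM}(\mathrm{INC}) + \mathrm{SUM}(9\mathrm{DEC}) = \sum_{k=0}^{9} C(9, k)(10^{k+1} - 1) = 10 \cdot 11^9 - 2^9 = z$$

and

$$y + z = \mathrm{SUM}(\mathrm{INC}) + \mathrm{SUM}(\mathrm{DEC}') + \mathrm{SUM}(\mathrm{INC}) + \mathrm{SUM}(9\mathrm{DEC}) = 2\,\mathrm{SUM}(\mathrm{INC}) + \mathrm{SUM}(\mathrm{DEC}).$$

Combining these linear equations for $\mathrm{SUM}(\mathrm{INC})$ and $\mathrm{SUM}(\mathrm{DEC})$, we find

$$\mathrm{SUM}(\mathrm{INC}) = (1/9)(11x - y - z) \quad \text{and} \quad \mathrm{SUM}(\mathrm{DEC}) = (11/9)(y + z - 2x).$$

However, the sets $\mathrm{INC}$ and $\mathrm{DEC}$ are not disjoint; their intersection consists of the 9 one-digit numbers with the total sum of 45. Thus, the sum we look for is

$$\mathrm{SUM}(\mathrm{INC}) + \mathrm{SUM}(\mathrm{DEC}) - 45 = (80/81)\,11^{10} - (35/81)\,2^{10} - 45 = 25\,617\,208\,995.$$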
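Since a number with strictly monotone digits is determined by its set of digits, the whole computation can be verified by enumerating the subsets of the ten digits. The sketch below does so; the variable names are hypothetical.

```python
from itertools import combinations

DIGITS = "0123456789"
inc, dec = set(), set()

for k in range(1, 11):
    for subset in combinations(DIGITS, k):   # digits come in increasing order
        if subset[0] != "0":                  # an increasing number cannot start with 0
            inc.add(int("".join(subset)))
        rev = subset[::-1]                    # the same digits in decreasing order
        if rev[0] != "0":                     # skips only the single digit 0
            dec.add(int("".join(rev)))

total = sum(inc | dec)                        # the union counts 1..9 only once
closed_form = (80 * 11 ** 10 - 35 * 2 ** 10) // 81 - 45

print(total, closed_form)  # 25617208995 25617208995
```

The set union removes the double counting of the nine one-digit numbers, which is the role of the term 45 in the solution.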

Definition 1.4.6. For a real number $x$, let $[x]$ denote its integer part, that is, the largest integer not exceeding $x$; it is also called the floor function and is denoted by $\lfloor x \rfloor$. For example, $[3.14] = 3$, $[-3.14] = -4$, $[3] = 3$.

Problem 1.4.18. Find the number of $n$-arrangements with repetition from the set $A = \{0, 1\}$, containing an even number of 0s.

Solution. Since we have defined an arrangement as a mapping, to specify such an arrangement (that is, a mapping) we have to choose preimages for 0s, and the number of preimages must be an even number $2k$, $0 \le 2k \le n$. We suppose that the arrangement $(1, 1, \ldots, 1)$ without 0s satisfies the condition; this corresponds to the case $k = 0$. Hence, by the Sum Rule there are

$$S = C(n, 0) + C(n, 2) + \cdots + C(n, 2[n/2])$$

such arrangements. Setting $a = 1$ and $b = 1$ in the binomial expansion (1.4.4) yields

$$2^n = (1 + 1)^n = C(n, 0) + C(n, 1) + C(n, 2) + \cdots + C(n, n),$$


and setting $a = 1$ and $b = -1$ in (1.4.4) yields

$$0 = (1 - 1)^n = C(n, 0) - C(n, 1) + C(n, 2) - \cdots + (-1)^n C(n, n).$$

Adding these two equations gives $2^n = 2S$, thus $S = 2^{n-1}$.

Remark 1.4.7. We know from (1.3.1) that without any parity restriction there are $2^n$ $n$-arrangements from a two-element set $A = \{0, 1\}$. Hence, among them there are $2^{n-1}$ arrangements with an even number of 0s and $2^n - 2^{n-1} = 2^{n-1}$ arrangements with an odd number of 0s.

Another solution of Problem 1.4.18 is of interest. Let us denote the number of arrangements sought by $S_n$. All these arrangements fall into two disjoint classes: those beginning with a 1, $(1, a_2, \ldots, a_n)$, and those beginning with a 0, $(0, a_2, \ldots, a_n)$. In the first case the $(n-1)$-arrangement $(a_2, \ldots, a_n)$ contains an even number of 0s, therefore there are $S_{n-1}$ such arrangements. In the second case $a_1 = 0$, thus the $(n-1)$-arrangement $(a_2, \ldots, a_n)$ contains an odd number of 0s, and the number of such arrangements is the total number $2^{n-1}$ of $(n-1)$-arrangements less $S_{n-1}$. By the Sum Rule,

$$S_n = S_{n-1} + (2^{n-1} - S_{n-1}) = 2^{n-1}.$$
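Both solutions are easy to check by machine for small $n$. The sketch below iterates the recurrence for $S_n$ and compares it with a brute-force count over all $2^n$ arrangements; the function name is hypothetical.

```python
from itertools import product

def even_zero_count(n):
    """Arrangements of length n over {0, 1} with an even number of 0s."""
    return sum(1 for w in product((0, 1), repeat=n) if w.count(0) % 2 == 0)

s = 1  # S_1 = 1: the only admissible 1-arrangement is (1)
for n in range(2, 11):
    s = s + (2 ** (n - 1) - s)   # S_n = S_{n-1} + (2^{n-1} - S_{n-1})
    assert s == 2 ** (n - 1)
    assert s == even_zero_count(n)

print(even_zero_count(4))  # 8
```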

Problem 1.4.19. Find the number of $n$-arrangements with repetition from the set $A = \{0, 1, 2\}$, containing an even number of 0s.

Solution. Hereafter we refer to the solution of Problem 1.4.18. If $2k$ preimages of 0 have been chosen, then by virtue of (1.3.1) the images for the remaining $n - 2k$ preimages can be assigned in $2^{n-2k}$ ways, and these images are either 1 or 2. Using the sum and product rules, as in Problem 1.4.18, we get the formula

$$2^n C(n, 0) + 2^{n-2} C(n, 2) + \cdots + 2^{n-q} C(n, q), \quad \text{where } q = 2[n/2].$$

To compute this sum explicitly, we add the equations (cf. Problem 1.4.18)

$$3^n = (2 + 1)^n = 2^n C(n, 0) + 2^{n-1} C(n, 1) + \cdots + 2^0 C(n, n)$$

and

$$1 = (2 - 1)^n = 2^n C(n, 0) - 2^{n-1} C(n, 1) + \cdots + (-1)^n 2^0 C(n, n),$$

which gives $(1/2)(3^n + 1)$.

Problem 1.4.20. Find the number of $n$-arrangements with repetition from the set $A = \{0, 1, 2, 3\}$, containing an even number of 0s and an even number of 1s.


Solution. Problem 1.4.19 readily implies that there are $(1/2)(3^n + 1)$ arrangements without 0s. If an arrangement contains two 0s, then their preimages can be chosen in $C(n, 2)$ ways. For the remaining $n - 2$ preimages, their $n - 2$ images, containing an even number of 1s and any numbers of 2s and 3s, can be chosen in $(1/2)(3^{n-2} + 1)$ ways; we again use here the result of Problem 1.4.19, with $n - 2$ instead of $n$. Continuing in the same way and using the sum and product rules, we derive

$$\frac{1}{2}(3^n + 1)C(n, 0) + \frac{1}{2}(3^{n-2} + 1)C(n, 2) + \cdots + \frac{1}{2}(3^{n-q} + 1)C(n, q) = \frac{1}{2}\left(C(n, 0) + \cdots + C(n, q)\right) + \frac{1}{2}\left(3^n C(n, 0) + \cdots + 3^{n-q} C(n, q)\right),$$

where $q = 2[n/2]$ and both sums on the right-hand side run over the even lower indices $0, 2, \ldots, q$. The first sum on the right-hand side of this equation was found in Problem 1.4.18. To find the second sum, we proceed similarly, using the expansions of $(3 \pm 1)^n$. Finally, we get the answer $4^{n-1} + 2^{n-1}$.
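The closed forms $(1/2)(3^n + 1)$ and $4^{n-1} + 2^{n-1}$ can be compared with exhaustive counts for small $n$. The following is a brute-force sketch; the function name and its parameters are hypothetical.

```python
from itertools import product

def count_even(n, alphabet, even_symbols):
    """Arrangements of length n over `alphabet` in which every symbol
    listed in `even_symbols` occurs an even number of times."""
    return sum(1 for w in product(alphabet, repeat=n)
               if all(w.count(s) % 2 == 0 for s in even_symbols))

for n in range(1, 9):
    # Problem 1.4.19: alphabet {0, 1, 2}, even number of 0s.
    assert 2 * count_even(n, (0, 1, 2), (0,)) == 3 ** n + 1
    # Problem 1.4.20: alphabet {0, 1, 2, 3}, even numbers of 0s and of 1s.
    assert count_even(n, (0, 1, 2, 3), (0, 1)) == 4 ** (n - 1) + 2 ** (n - 1)

print(count_even(3, (0, 1, 2), (0,)), count_even(3, (0, 1, 2, 3), (0, 1)))  # 14 20
```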

We solve these problems again in Section 4.3 (Problem 4.3.17) using the method of GF.

Problem 1.4.21. Consider the $10^n$ $n$-digit nonnegative integers. Two numbers are said to be equivalent if one can be derived from the other by permuting some digits. For example, the four-digit numbers 3213 and 3231 are equivalent. If after permuting the left-most digit is 0, we still consider the number as having $n$ digits.

1) How many equivalence classes, that is, pairwise nonequivalent numbers, are there?

2) The same question if a number can contain neither more than one digit 0 nor more than one digit 9.

Solution. 1) If all digits in any number are different, then every equivalence class contains $n!$ numbers; obviously, in this case $n \le 10$ and there are $C(10, n)$ equivalence classes. But digits may repeat, and we have to use combinations with repetition: two numbers are nonequivalent if and only if there is at least one digit occurring a different number of times in these two numbers. Therefore, we have $n$ objects of 10 types, that is, there are $C_{\mathrm{rep}}(10, n)$ equivalence classes.

2) In this case the factor-set splits into four disjoint subsets:

a) Numbers containing neither 0 nor 9.

b) Numbers containing one 9 and no 0.

c) Numbers containing one 0 and no 9.

d) Numbers containing one 0 and one 9.


In case a) we have $C_{\mathrm{rep}}(8, n)$ equivalence classes, in each of the cases b) and c) there are $C_{\mathrm{rep}}(8, n-1)$ classes, and in case d) there are $C_{\mathrm{rep}}(8, n-2)$ classes. Altogether we have

$$C_{\mathrm{rep}}(8, n) + 2C_{\mathrm{rep}}(8, n-1) + C_{\mathrm{rep}}(8, n-2) = \frac{2(n+5)!\,(2n^2 + 12n + 21)}{7!\,n!}$$

nonequivalent numbers.

Problem 1.4.22. Let $(a_1, a_2, \ldots, a_{n+p})$ denote $(n+p)$-arrangements with repetition from the elements of the set $A = \{-1, 1\}$ containing $n$ numbers $-1$ and $p$ numbers $1$. Denote $f(k) = \sum_{l=1}^{k} a_l$. Find the number of these arrangements such that $f(k) \ge 0$ for each $k = 1, 2, \ldots, n+p$.

Solution. In this problem we use the trajectory method ([16, Chap. 3], see also [19, p. 127, No. 2.7.13]), which sometimes gives an easy and very transparent solution. Introduce an orthogonal coordinate system in the plane and consider the points

$$Z_0 = (0, 0), \quad Z_k = (k, f(k)), \quad 1 \le k \le n+p.$$

A broken line consisting of $n+p$ segments consecutively connecting the points $Z_0$ and $Z_1$, $Z_1$ and $Z_2$, $Z_2$ and $Z_3, \ldots, Z_{n+p-1}$ and $Z_{n+p}$, is called the trajectory or Dyck path corresponding to the arrangement $(a_1, a_2, \ldots, a_{n+p})$. Among these $n+p$ segments, $p$ are directed upward and have the slope $+1$, and $n$ are directed downward and have the slope $-1$; hence it is easy to find the coordinates of the point $Z_{n+p}$, namely, $Z_{n+p} = (n+p, p-n)$. To determine a particular trajectory, it suffices to choose $p$ places for the upward segments among the given $n+p$ places or, which is the same, $n$ places for the downward segments. Therefore, the total number of the trajectories is $C(n+p, p) = C(n+p, n)$. We have to find how many of them do not drop below the $x$-axis, but it is easier to compute the number of trajectories that do drop below it, that is, which have common points with the horizontal line $y = -1$. Let $T$ be such a trajectory, and let $k_0$ be the abscissa of the left-most common point of $T$ and the line $y = -1$.

Consider another trajectory $T'$ that coincides with $T$ from 0 to $k_0$, and is the mirror reflection of $T$ in the line $y = -1$ to the right of $k_0$. This procedure sets up a one-to-one correspondence between the set of all trajectories joining the points $Z_0$ and $Z_{n+p}$ and having common points with the line $y = -1$, on the one hand, and the set of trajectories joining the points $Z_0$ and $Z'_{n+p} = (n+p, n-p-2)$, on the other hand.

If a trajectory connecting $Z_0$ and $Z'_{n+p}$ has $u$ upward and $d$ downward segments, then

$$\begin{cases} u + d = n + p \\ u - d = n - p - 2. \end{cases}$$

Solving this system of linear equations we find $d = p + 1$. Hence the number of trajectories having common points with the line $y = -1$ is $C(n+p, p+1)$, and the number of trajectories in question is

$$C(n+p, p) - C(n+p, p+1) = \frac{p + 1 - n}{p + 1}\,C(n+p, p). \tag{1.4.11}$$

This implies, in particular, that the trajectories we looked for exist only if $p \ge n$, though this is clear from the problem without calculations.

Remark 1.4.8. If $p = n$, then (1.4.11) becomes $\frac{1}{n+1}C(2n, n)$; these numbers, called the Catalan numbers, occur in many combinatorial and other problems, see, e.g., [2, 45] and Problem 4.4.10; we denote them $\mathrm{Cat}_n$.
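Formula (1.4.11) and the Catalan numbers can be tested by enumerating all arrangements with $n$ numbers $-1$ and $p$ numbers $1$ for small $n$ and $p$. The sketch below is such a brute-force check; the function names `nonneg_paths` and `ballot` are hypothetical.

```python
from itertools import combinations
from math import comb

def nonneg_paths(n, p):
    """Arrangements of n (-1)s and p (+1)s whose partial sums never drop below 0."""
    count = 0
    for ups in combinations(range(n + p), p):   # places of the upward segments
        up_places = set(ups)
        height = 0
        for pos in range(n + p):
            height += 1 if pos in up_places else -1
            if height < 0:
                break                            # the trajectory reached y = -1
        else:
            count += 1
    return count

def ballot(n, p):
    """Right-hand side of (1.4.11); zero when p < n."""
    return (p + 1 - n) * comb(n + p, p) // (p + 1) if p >= n else 0

for n in range(0, 7):
    for p in range(0, 7):
        assert nonneg_paths(n, p) == ballot(n, p)

print([ballot(n, n) for n in range(6)])  # [1, 1, 2, 5, 14, 42]
```

The printed values for $p = n$ are the first Catalan numbers $\mathrm{Cat}_0, \ldots, \mathrm{Cat}_5$.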


EP 1.4.1. 1) An urn contains 12 different balls. In how many ways is it possible to draw 8 of them without return?

2) With return?

3) An urn contains 12 identical balls. In how many ways is it possible to draw 8 of them without ordering and without return?

4) Without ordering but with return?

EP 1.4.2. Calculate the binomial coefficients $C(m, n)$ for all $-2 \le n \le m \le 6$.

EP 1.4.3. Prove that $k!$ divides the product of any natural number $n$ and its $k-1$ successors. For example, for $k = 5$ and $n = 3$, $5! = 120$ divides the product $3 \cdot 4 \cdot 5 \cdot 6 \cdot 7 = 2\,520$.

EP 1.4.4. Prove the following properties of the binomial coefficients for any natural $n$ and appropriate values of all other parameters.

1) $\sum_{k=m}^{n} C(k, m) = C(n+1, m+1)$

2) $\sum_{k=1}^{n} k\, C(n, k) = n\, 2^{n-1}$

3) $\sum_{k=2}^{n} k(k-1)\, C(n, k) = n(n-1)\, 2^{n-2}$, $n \ge 2$.

4) Extend the two preceding equations, so that the left-hand side reads $\sum_{n_0}^{n}$ with $1 \le n_0 \le n$.

5) $\sum_{k=0}^{n} (2k+1)\, C(n, k) = (n+1)\, 2^{n}$

6) $\sum_{k=0}^{n} \frac{1}{k+1}\, C(n, k) = \frac{1}{n+1} \left( 2^{n+1} - 1 \right)$

7) $\sum_{k=0}^{n} \frac{(-1)^k}{k+1}\, C(n, k) = \frac{1}{n+1}$

8) $\sum_{k=1}^{n} (n+1-k)\, k^2 = \frac{1}{3}\, C(n+1, 2)\, C(n+2, 2)$.


9) (Vandermonde's identity)

$$\sum_{k=0}^{n} C(m, k)\, C(l, n-k) = C(m+l, n), \qquad n \le \min\{m, l\}$$

10) Use Vandermonde's identity above to prove the formula

$$\sum_{k+l=m} \frac{l}{m} \binom{s-i}{k} \binom{i}{l} = \frac{i}{s} \binom{s}{m}.$$

11) $\sum_{k=l}^{m} (-1)^k\, C(m, k)\, C(k, l) = (-1)^m \delta_{ml}$, where the Kronecker symbol $\delta_{ml}$ (Kronecker's delta) is defined by $\delta_{ml} = 0$ for all non-equal integers $m$ and $l$ and by $\delta_{ll} = 1$ for $m = l$.

12) For natural $m$ and $n$, prove the identity

$$(-1)^n\, C(-n, m-1) = (-1)^m\, C(-m, n-1).$$

13) For $0 \le k \le n$, find the maximum value of the binomial coefficients $C(n, k)$. For each $n$, how many binomial coefficients $C(n, k)$, $k = 0, 1, \ldots, n$, are equal to this maximum value?

14) Prove that⁷ $\sum_{k=1}^{m} \frac{(-1)^{k-1}}{k}\, C(m, k) = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{m}$.

15) Prove the identity for the harmonic numbers $H_m$,

$$H_{[n/2]} - H_n = \sum_{k=1}^{n} \frac{(-1)^k}{k}.$$

EP 1.4.5. The binary, ternary, ..., decimal, ... numerical systems represent any natural number by making use of a fixed number of digits: for instance, the two digits 0 and 1 in the binary system, the ten digits $0, 1, \ldots, 8, 9$ in the decimal system, etc. Another representation of the integers, called the combinatorial representation, uses binomial coefficients to write any natural number as a sum of a fixed number of addends.

1) Prove that, given an integer $k \ge 1$, any natural number $n$ can be written as

$$n = C(d_1, 1) + C(d_2, 2) + \cdots + C(d_k, k)$$

and this representation is unique if we require, in addition, that $0 \le d_1 < d_2 < \cdots < d_k$.

⁷ The numbers $H_m = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{m}$, $m = 1, 2, \ldots$, are called harmonic numbers.


2) Find the combinatorial representations of $n = 1\,000$ and $n = 1\,000\,000$ with $k = 5, 8, 10$.

3) Given $n$ and $k$, estimate $d_k$ in the combinatorial representation of $n$.
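One way to experiment with the combinatorial representation is a greedy computation, sketched below. This is an illustration rather than the proof asked for in the exercise: the function name `combinatorial_representation` is hypothetical, and the greedy rule (for $i = k, k-1, \ldots, 1$ take the largest $d_i$ with $C(d_i, i)$ not exceeding what remains) is an assumption about how the digits may be found.

```python
from math import comb

def combinatorial_representation(n, k):
    """Return [d_1, ..., d_k] with n = C(d_1,1) + ... + C(d_k,k) and 0 <= d_1 < ... < d_k."""
    digits = []
    remaining = n
    for i in range(k, 0, -1):
        d = i - 1                       # C(i-1, i) = 0, so this choice is always admissible
        while comb(d + 1, i) <= remaining:
            d += 1                      # largest d with C(d, i) <= remaining
        digits.append(d)
        remaining -= comb(d, i)
    return digits[::-1]

d = combinatorial_representation(1000, 5)
print(d, sum(comb(di, i) for i, di in enumerate(d, start=1)))
```

The printed sum lets one confirm that the digits found really add up to the given number.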

EP 1.4.6. Yet another useful representation of integers, called the factorial representation, uses factorials instead of powers or binomial coefficients.

1) Prove that any whole number $n$ can be written as

$$n = f_1 \cdot 1! + f_2 \cdot 2! + f_3 \cdot 3! + \cdots,$$

and this representation is unique if we also assume that $0 \le f_i \le i$.

2) Find the factorial representations of $n = 1\,000$ and $n = 1\,000\,000$.

3) Given $n$, estimate the number of addends in the factorial representation of $n$.
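The factorial representation can be explored numerically with repeated division, as in the following sketch. The function name `factorial_representation` is hypothetical, and the method (divide by 2, then by 3, then by 4, and so on, keeping the remainders) is an assumption chosen for illustration; it relies on the nested form $n = f_1 + 2\,(f_2 + 3\,(f_3 + \cdots))$.

```python
from math import factorial

def factorial_representation(n):
    """Return [f_1, f_2, ...] with n = f_1*1! + f_2*2! + ... and 0 <= f_i <= i."""
    digits = []
    base = 2                        # f_i is a remainder modulo i + 1
    while n > 0:
        n, f = divmod(n, base)      # quotient goes on, remainder is the next digit
        digits.append(f)
        base += 1
    return digits

f = factorial_representation(1000)
print(f, sum(fi * factorial(i) for i, fi in enumerate(f, start=1)))
```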

EP 1.4.7. Consider families with 5 children, without twins. If we assume that the family composition depends on the order in which the kids were born, then among these families there is one family with all 5 girls, 5 families with one boy and 4 girls, etc. List all families with 5 kids. The answer is $32 = 2^5$. Explain this answer, using a combinatorial argument.

EP 1.4.8. Assuming that boys and girls have equal chances to be born, what part of all families with 6 children have 4 girls and 2 boys?

EP 1.4.9. $2^n$ people depart from the upper point of Pascal's triangle, which was defined after Problem 1.4.2. At each point, including the uppermost one, half of them move to the left and the other half to the right. How many people arrive at each point of the $n$th row?

EP 1.4.10. How many functions

$$f \colon \{1, 2, \ldots, 2006\} \to \{2005, 2006, 2007\}$$

are there such that the number $f(1) + f(2) + \cdots + f(2006)$ is even? Is odd?

EP 1.4.11. The following two identities connect the Catalan numbers $\mathrm{Cat}_n$, defined in Remark 1.4.8, and the binomial coefficients $C(m, n)$.

1) $\sum_{k=0}^{n} \left( C(n, k) \right)^2 = (n+1)\, \mathrm{Cat}_n$

2) $\mathrm{Cat}_n = C(2n, n) - C(2n, n-1)$.

EP 1.4.12. Find the coefficients of $x^{19}$ and $x^{21}$ after expanding the polynomial $(x^8 + x^5 + 1)^{20}$ by the binomial formula (1.4.4) and combining like terms.

EP 1.4.13. Find the number of $2n$-dimensional vectors $(\alpha_1, \alpha_2, \ldots, \alpha_{2n})$ such that $\alpha_i = \pm 1$, $1 \le i \le 2n$, $\sum_{i=1}^{k} \alpha_i \ge 0$ for $k = 1, 2, \ldots, 2n-1$, and $\sum_{i=1}^{2n} \alpha_i = 0$.

EP 1.4.14. Given $n$ points in the plane, how many different lines connecting them pairwise can be drawn if no three of the points are collinear, that is, lie on the same line?

EP 1.4.15. Among $k$ points in the plane, $l$ lie on the same line, while no three points among the others lie on the same line.

1) How many lines are necessary in order to connect all these points pairwise?

2) How many triangles are there with vertices at these points?

EP 1.4.16. Find the largest number of parts into which a plane can be divided by

1) 7 lines

2) $l$ lines

3) 3 circles

4) $m$ circles.

EP 1.4.17. 13 resorts are located on the shore of a convex lake. For every 2, every 3, ..., and for all 13 resorts there is a route connecting them. Each route is a convex polygon (or a line segment for two resorts) with vertices at the resorts and is served by a separate ferry. How many ferries are necessary for all these routes?

EP 1.4.18. Among 15 given points in a plane, 6 lie on a line; however, no other 3 points are collinear. How many lines are there containing at least two given points?

EP 1.4.19. Given 15 points in a plane, 6 of them lie on a circle; however, no other 4 points belong to a circle. How many circles are there containing at least three given points?

EP 1.4.20. At a meeting of the Combi Club, if two attending students know each other, they have no mutual acquaintances. At the same time, if two participants do not know each other, then they have exactly two common acquaintances at the meeting. Prove that every participant is familiar with the same number of attendees.

EP 1.4.21. There are 10 mutually intersecting lines such that no three of them have a common point of intersection. How many circles are there tangent to any three lines among the given 10?


EP 1.4.22. Three points are said to be collinear if they lie on a line. It is known that for any three non-collinear points in three-dimensional space there exists a unique plane containing these three points. Suppose that among $k$ points in space, $l$ are coplanar, that is, lie in the same plane, while no four points among the others are coplanar. How many different planes exist such that each plane contains a triple of the given points?

EP 1.4.23. Consider three non-collinear points in a plane and draw $p$ lines through the first point, $q$ lines through the second one, and $r$ lines through the third one, such that no three among these $p + q + r$ lines intersect at a common point and no two are parallel. How many triangles are made up by the intersections of these lines?

EP 1.4.24. A family of $l$ parallel lines is crossed by another family of $k$ parallel lines, making several parallelograms. How many different parallelograms are there in this figure?

EP 1.4.25. A beetle, moving in the horizontal or vertical direction, can visit only points with integer coordinates. It starts at the origin and must return to the origin after travelling $2m$ units. How many different routes does the beetle have?

EP 1.4.26. $P$ parents and $S$ students attend a school meeting. In how many ways can they be seated in a row if at least one parent must sit between any two students? The same question if they sit at a round table?

EP 1.4.27. Prove that for any natural $p$,

$$\sum_{r \ge 1} \frac{p}{r(r+p)} = \sum_{r \ge 1} \frac{(-1)^{r-1}}{r}\, C(p, r).$$

EP 1.4.28. Find $n$ and $m$ such that

$$C(n, m) : C(n, m+1) : C(n, m+2) = 22 : 20 : 15,$$

where $a : b$ stands for the ratio of $a$ to $b$.

EP 1.4.29. (Compare with Problem 1.4.18.) Consider the polynomial

$$(1 + x + x^2)^n = a_0 + a_1 x + a_2 x^2 + \cdots + a_{2n} x^{2n}.$$

Prove that

$$a_0 + a_3 + a_6 + \cdots = a_1 + a_4 + a_7 + \cdots = a_2 + a_5 + a_8 + \cdots = 3^{n-1}.$$


EP 1.4.30. The numbers $T_n \equiv C(n+1, 2) = \frac{n(n+1)}{2}$ (Fig. 1.3) are called triangular numbers. Prove by mathematical induction that

$$(-1)^{n+1}\, T_n = \sum_{k=1}^{n} (-1)^{k+1} k^2.$$

[Figure 1.3. Triangular numbers $T_1$ to $T_4$.]

EP 1.4.31. There are $n$ identical black balls and $n$ identical white balls. In how many ways is it possible to choose $n$ balls containing at least one ball of each color? Extend the problem to 3, 4 and more colors.

EP 1.4.32. In how many ways can $m + n$ balls be chosen among $2m$ identical white and $3n$ identical black balls?

EP 1.4.33. No two students at the Even College have the same performance, that is, every two students get different grades on at least one test. Moreover, no student performs better than any other one, that is, for every two students $s_1$ and $s_2$, $s_1$ performs better than $s_2$ on some test but worse on some other test. This semester, every student takes $2n$ classes. Prove that there are at most $C(2n, n)$ students at the school.

EP 1.4.34. Prove that for a prime $p$, $C(p, n) \equiv 0 \pmod{p}$ for $1 \le n \le p-1$, and $C(p-1, n) \equiv (-1)^n \pmod{p}$ for $0 \le n \le p-1$.

EP 1.4.35. In how many ways is it possible to choose six different numbers from the set $N_{49} = \{1, 2, \ldots, 49\}$ so that the difference of two of them is 1? Such a pair of numbers does not have to be unique.

EP 1.4.36. How many divisors does the number $2^5 \cdot 3^4 \cdot 5^3 \cdot 7^2 \cdot 11$ have?

EP 1.4.37. For how many integers from 1 to 9 999 is the sum of their digits equal to 9?


EP 1.4.38. In how many ways can one choose 4 colors from seven given colors? Assuming that one of the seven given colors is red, what is the answer if red must enter the chosen combination? What is the answer to the latter question if red is not among the given colors?

EP 1.4.39. A high school offers classes in English, French, German, Italian, and Spanish. How many bilingual dictionaries must the school library buy for the students?

EP 1.4.40. 1) In how many ways is it possible to split a 20-element set into ten 2-element sets?

2) In how many ways is it possible to split a 21-element set into ten 2-element sets and one 1-element set?

3) In how many ways is it possible to split a 21-element set into seven 3-element sets?

EP 1.4.41. At a grocery store, there are five identical bottles of Coca-Cola, seven bottles of Pepsi-Cola, and eight bottles of Sprite. In how many ways can a student buy three bottles of soft drinks for a party?

EP 1.4.42. The following statement is called the Kaplansky lemma: $n$ different books are ordered on a shelf. Prove that for $k \le n/2$, there are $\frac{n}{n-k}\, C(n-k, k)$ ways to choose $k$ books so that no two neighboring books are chosen.

EP 1.4.43. A department store has 12 kinds of shoes in Kate's size. In how many ways can she buy 4 pairs of different shoes? What if the shoes bought can repeat?

EP 1.4.44. The city of Oldnewburg has the shape of a rectangle, and all its streets are parallel to the sides of this rectangle. The City Hall is located in the South-West corner of the city. $2^x$ sheriffs leave the City Hall, half of them heading due East and the other half due North. Officers who reach any street crossing do the same: half of them go East and the other half go North. Eventually $m$ sheriffs arrive at the crossing of the $k$th and $l$th streets. Is there any relation between the numbers $k$, $l$, $m$, and $x$? Compute $x$ in terms of $k$, $l$, and $m$.

EP 1.4.45. 16 scouts are searching for their friend who got lost in the woods. Among them there are only 4 boys who know the area. In how many ways can they make two equal groups for the search if each group must have two guides who know the area?

EP 1.4.46. 16 friends reserved 8 identical double cabins for a cruise. In how many ways can they occupy the cabins?


EP 1.4.47. 1) How many pairwise products can be made from the numbers $1, 2, \ldots, 100$?

2) How many among them are multiples of 3?

EP 1.4.48. How many 7-digit phone numbers are there with the same last four digits?

EP 1.4.49. There are 8 banks of lights in a school hall controlled by 8 different switches. Students decided that at a graduation dance no more than two banks of lights are to be on. In how many ways is it possible to set these 8 switches?

EP 1.4.50. Draw $k$ lines through each of 3 given points in the plane.

1) At how many points do these $3k$ lines intersect if no two of the lines are parallel and no three intersect at a point (the intersections at the given 3 points do not count)?

2) Answer the same question if there are four points in the plane.

EP 1.4.51. There are $l$ lines and $p$ points on each of these lines, such that no three points on different lines are collinear. How many triangles are there with vertices at these points?

EP 1.4.52. $n$ rays in a plane have a common vertex. How many angles do they make?

EP 1.4.53. The cafeteria in John's school has a very stable menu: every day they offer the same 13 tasty meals. During a day John can consume any number, from 0 to 13, of these dishes. For how many days can he buy meals at the cafeteria without repetition, that is, so that no two days have the same selection of dishes? How many dishes will he eat during this time?

EP 1.4.54. John's friends Nancy and Kate also decided to have a new menu every day, but Nancy would eat an even number of dishes every day, while Kate would eat an odd number. Who will have to repeat her menu sooner, Nancy or Kate?

EP 1.4.55. Generalize EPs 1.4.53–1.4.54 if the cafeteria has $n$ meals instead of 13.

EP 1.4.56. Solve Problem 1.4.9 again under the additional assumption that, among the seven pastries, the student must buy at least four donuts. Compare the answer with the answer to Problem 1.4.13.

EP 1.4.57. Let $n = p_1^{k_1} \cdots p_l^{k_l}$ be the prime factorization of a natural number $n > 1$, that is, $1 < p_1 < \cdots < p_l$ are distinct primes and $k_1, \ldots, k_l$ are arbitrary natural numbers. Find the number and the sum of all natural divisors of $n$. First solve the problem for $l = 1, 2$ and $3$.


EP 1.4.58. 1) What is the smallest natural number with exactly 6 divisors?

2) With no more than 6 divisors?

EP 1.4.59. Given a finite set $X$, $|X| = k$, find the number of pairs of subsets $A, B$ of $X$ such that $A \cup B = X$.

EP 1.4.60. A standard deck of playing cards consists of 52 cards of 4 suits; spades and clubs are black, diamonds and hearts are red. Each suit contains cards of 13 denominations: 9 numbered cards $2, 3, 4, \ldots, 9, 10$, and 4 face cards: J (a Jack), Q (a Queen), K (a King), and A (an Ace). Find in how many ways it is possible to draw five cards from a standard deck, so that among these five cards there are

1) 10, J, Q, K, and A of the same suit (a royal flush)

2) Five adjacent cards of the same suit not starting at 10 (a straight flush)

3) Five (not necessarily adjacent) cards of the same suit (a flush)

4) Four cards of the same denomination (four of a kind)

5) Three cards of the same denomination and two cards of two other different values (three of a kind)

6) Three cards of the same denomination and two cards of another denomination (full house)

7) Four cards of four different denominations.

EP 1.4.61. In how many ways can $k$ cards, comprising cards of all 4 suits, be dealt from a standard deck of cards if

1) $k = 4$?

2) $k = 5$?

3) $k = 6$?

EP 1.4.62. Solve the previous problem if a deck consists of $4n$ cards of four different suits and the cards of each suit are numbered consecutively from 1 through $n$.

EP 1.4.63. How many monic (that is, with coefficient 1) monomials of degree $k$ in $l$ variables are there?

EP 1.4.64. In how many ways is it possible to distribute 25 identical coins among 4 students?

EP 1.4.65. How many solutions in whole numbers does the equation $x_1 + x_2 + x_3 + x_4 = 15$ have?


EP 1.4.66. How many solutions in the positive integer numbers does the equationx1 C x2 C x3 C x4 D 15 have if x2 � 2 and x3 � 5?

EP 1.4.67. How many solutions in the natural numbers does the equation x1 C x2 Cx3 C x4 D 15 have if, in addition, x2 � 2 and 1 � x3 � 5?

EP 1.4.68. How many solutions in whole numbers does the inequality $x_1 + x_2 + x_3 + x_4 \le 15$ have?

EP 1.4.69. How many solutions in integers does the equation $x_1 + x_2 + \cdots + x_k = n$ have subject to the restrictions $x_1 > n_1,\ x_2 > n_2,\ \ldots,\ x_k > n_k$?

EP 1.4.70. How many solutions in integers does the inequality $|x_1| + |x_2| \le 100$ have?

EP 1.4.71. How many natural numbers not exceeding 10 000 000 are there with the sum of their digits equal to 9?

EP 1.4.72. In how many ways can six coins be chosen from an ample supply of pennies, nickels, dimes, and quarters?

EP 1.4.73. A raffle ticket at the Combi Club party costs \$5. In the line to the counter, each member has either a \$5 or a \$10 bill. Therefore, the line would stop if the next student has a \$10 bill but the treasurer has no change. To avoid such halts, the treasurer prepared $t$ \$5 bills to give change. If $p$ students have \$10 bills and $q$ have \$5 bills, in how many ways can the students make up a line to buy the tickets without interruptions?

EP 1.4.74. How many diagonals does a convex 30-gon have?

EP 1.4.75. Find n if a convex n-gon has 35 diagonals.

EP 1.4.76. No three diagonals of a convex $n$-gon have a point in common. Into how many regions is the $n$-gon divided by its diagonals?

EP 1.4.77. This problem refers to the binomial formula (1.4.4).

1) Compute $(x \pm y)^n$ for $n = 1, \ldots, 5$. Determine the largest coefficient(s) in these expansions.

2) Find the coefficient of $x^8 y^5$ in $(x - 2y)^{10}$.


EP 1.4.78. In how many ways can you read the word MAGIC in the following diagram?

$$\begin{array}{ccccc} & & M & & \\ & A & & A & \\ G & & G & & G \\ & I & & I & \\ & & C & & \end{array}$$

EP 1.4.79. A binary string is a 0-1-sequence; for instance, 01100011 is a binary string of length 8. A binary code, that is, a set of binary strings, is designed to represent a set of 35 objects, each object being coded by a string. Every string of the code contains $k$ zeros and $l$ ones, and $k + l = n$. Find $k$, $l$, $n$ such that $n$ has the smallest possible value.

EP 1.4.80. Let us call two real numbers equivalent if they have the same integer part.

1) Prove that this is an equivalence relation.

2) What is the cardinality of the factor-set of this equivalence relation considered on the set of all positive real numbers less than 10?

Boolean functions were defined in EP 1.1.30, where the reader computed that there are $2^{2^n}$ Boolean functions of $n$ variables. However, some of these functions actually depend on fewer than $n$ arguments in the following sense.

Definition 1.4.9. Given a Boolean function $f(z_1, \ldots, z_n)$, a variable $z_i$, $1 \le i \le n$, is called essential if there are values

$$z_1^0, \ldots, z_{i-1}^0, z_{i+1}^0, \ldots, z_n^0 \in \mathbb{Z}_2 = \{0, 1\},$$

such that

$$f(z_1^0, \ldots, z_{i-1}^0, 0, z_{i+1}^0, \ldots, z_n^0) \ne f(z_1^0, \ldots, z_{i-1}^0, 1, z_{i+1}^0, \ldots, z_n^0).$$

Otherwise the variable is called unessential or fictitious. For example, for the Boolean function $f(z_1, z_2) = z_1 \wedge (z_2 \vee \bar{z}_2)$, where $\bar{z}$ denotes the negation of $z$, $z_1$ is an essential variable, while $z_2$ is a fictitious one.

Denote the number of Boolean functions with precisely $n$ essential variables by $B_{\mathrm{ess}}(n)$.

EP 1.4.81. 1) Verify that $B_{\mathrm{ess}}(0) = B_{\mathrm{ess}}(1) = 2$, $B_{\mathrm{ess}}(2) = 10$ and compare these numbers with $2^{2^n}$, $n = 0, 1, 2$. Find all Boolean functions with no more than 2 essential variables.


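2) Prove that

$$B_{\mathrm{ess}}(n) = 2^{2^n} - C(n, n-1)\, B_{\mathrm{ess}}(n-1) - C(n, n-2)\, B_{\mathrm{ess}}(n-2) - \cdots - C(n, 1)\, B_{\mathrm{ess}}(1) - B_{\mathrm{ess}}(0).$$

3) (G. Krylov) Prove that $B_{\mathrm{ess}}(n) = \sum_{k=0}^{n} (-1)^k\, C(n, k)\, 2^{2^{n-k}}$.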
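For small $n$ the numbers in EP 1.4.81 can be explored by enumerating truth tables directly. The sketch below is only an illustration under stated assumptions: the names `count_by_essential` and `krylov` are hypothetical, a Boolean function is modelled as a dictionary from 0-1 tuples to values, and the enumeration is feasible only for very small $n$ because there are $2^{2^n}$ functions.

```python
from itertools import product
from math import comb

def count_by_essential(n):
    """counts[j] = number of Boolean functions of n variables with exactly j essential variables."""
    points = list(product((0, 1), repeat=n))
    counts = [0] * (n + 1)
    for table in product((0, 1), repeat=len(points)):
        f = dict(zip(points, table))
        essential = 0
        for i in range(n):
            # z_i is essential if changing it alone changes the value of f at some point
            if any(f[z] != f[z[:i] + (1 - z[i],) + z[i + 1:]] for z in points):
                essential += 1
        counts[essential] += 1
    return counts

def krylov(n):
    """The alternating sum from item 3) of EP 1.4.81."""
    return sum((-1) ** k * comb(n, k) * 2 ** (2 ** (n - k)) for k in range(n + 1))

for n in range(4):
    print(n, count_by_essential(n), krylov(n))
```

The last entry of each list is the brute-force value of $B_{\mathrm{ess}}(n)$, printed next to the value of the alternating sum for comparison.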

EP 1.4.82. How many Boolean functions of three variables satisfy the equationf .z1; z2; z3/ D f .z1; z2; z3/?

EP 1.4.83. Consider a $k$-gon spanned by $k$ vertices of a convex $n$-gon, $k \le n$. How many such $k$-gons exist if at least $s$ vertices of the $n$-gon lie between every two vertices of the $k$-gon?

EP 1.4.84. Ms. Matrix and Mr. Radical ran for President of the Combi Club. After each ballot was cast, Matrix was never behind Radical. Prove that if each candidate received exactly $n$ votes, then there are $\mathrm{Cat}_n$ ways to count the votes, where $\mathrm{Cat}_n$ is the $n$th Catalan number.

1.5 Permutations with Identified Elements

Objects considered in this section resemble combinations with repetition: they involve indistinguishable elements, which must be identified; however, unlike combinations, the ordering of the elements is also important. We again begin with a model problem.

Multinomial theorem · Bose's biography · Dirac's biography · Einstein's biography · Fermi's biography · Maxwell's biography · Boltzmann's biography

Problem 1.5.1. In how many ways is it possible to order the letters of the word DAD? The same question about the words ARMADA and LETTER?

Solution. In this and similar problems "words" like DDA, which we cannot find in a dictionary, are also acceptable sequences of characters, called strings. The difficulty of this problem is due to the presence of two identical characters D, for transpositions of these symbols do not generate a new string. Moreover, since a set cannot contain two repeating elements, the three characters D, A, and D of the given word do not constitute a set. To overcome this obstacle, we make the two repeating letters distinguishable by supplying subscripts and introducing the set $X = \{A, D_1, D_2\}$. Now we can consider all $3! = 6$ permutations of the elements of this new set,

$$\begin{array}{ccc} (A, D_1, D_2) & (D_1, A, D_2) & (D_1, D_2, A) \\ (A, D_2, D_1) & (D_2, A, D_1) & (D_2, D_1, A). \end{array}$$


If we remove all the subscripts here, then the two permutations in each of the three columns become indistinguishable and have to be identified. Thus, we break down the $P(3) = 6$ permutations of the elements of the set $X$, taken all three at a time, into three disjoint subsets of pairs of permutations. Each subset consists of two permutations, because the two elements $D_1$ and $D_2$ can be transposed in $P(2) = 2! = 2$ ways; hence the set of 6 permutations is split into three pairs. These three pairs of permutations generate three different strings ADD, DAD and DDA. Therefore, in the problem there are $3!/2! = 3$ essentially different permutations.

Similarly, the letters of the word ARMADA can be rearranged in $6!/3! = 120$ ways; therefore, there are 120 strings made from the letters of the word ARMADA. Now, the characters of the word LETTER can be reordered in $6!/(2! \cdot 2!) = 180$ ways; here we have to make two independent identifications in the set of all permutations of the elements of the set $\{E_1, E_2, L, R, T_1, T_2\}$: we have to identify permutations that can be derived from one another by transposing the symbols $E_1$ and $E_2$, and also permutations that can be derived from one another by transposing the symbols $T_1$ and $T_2$.
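The identification of permutations described in this solution can be imitated on a computer: generate all orderings of the letters and keep only the different strings. This is a sketch for checking small cases; the helper name `distinct_strings` is hypothetical.

```python
from itertools import permutations

def distinct_strings(word):
    """Number of different strings obtained by reordering the letters of `word`."""
    # permutations() treats equal letters in different positions as different objects,
    # exactly like D_1 and D_2 above; the set performs the identification.
    return len(set(permutations(word)))

for word in ("DAD", "ARMADA", "LETTER"):
    print(word, distinct_strings(word))
```

The three printed counts can be compared with the values 3, 120 and 180 obtained in the solution.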

It should be noticed that in the solution we implicitly used Lemma 1.1.25. The method of solution presented above is quite transparent and sufficient in most applications. However, it is also necessary to have a formal definition of permutations with identified elements.

Definition 1.5.1. Given a $k$-partition of a set $X = X_1 \cup X_2 \cup \cdots \cup X_k$, consider the following equivalence relation on the set of all permutations of the elements of $X$:

Two permutations are called equivalent if one of them can be derived from the other by transposing only the elements of the subset $X_1$, or only the elements of $X_2$, ..., or only the elements of $X_k$. The elements of the resulting factor-set are called permutations of the elements of the set $X$ with identified elements of the subsets $X_1, X_2, \ldots, X_k$; we call them permutations with repetition if it is clear what partition of the set $X$ generates them.

Let $|X_i| = n_i$, $1 \le i \le k$, and $|X| = n = \sum_{i=1}^{k} n_i$. The number of permutations with repetition is denoted by $C(n; n_1, \ldots, n_k)$; these numbers are also called multinomial coefficients.

Theorem 1.5.2. The following equation holds for the multinomial coefficients,

$$C(n; n_1, \ldots, n_k) = \frac{n!}{n_1! \cdot n_2! \cdots n_k!}. \tag{1.5.1}$$

Proof. The result follows immediately from (1.3.4) after a $k$-fold application of Lemma 1.1.25.
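Formula (1.5.1) is easy to evaluate with exact integer arithmetic. The sketch below uses two hypothetical helper functions: `multinomial` follows (1.5.1) literally, while `multinomial_via_binomials` computes the same number as a product of binomial coefficients (choose the places for $X_1$, then the places for $X_2$ among those left, and so on). The second route is an assumption added here for comparison and is not part of the proof above.

```python
from math import comb, factorial

def multinomial(*parts):
    """C(n; n_1, ..., n_k) = n! / (n_1! ... n_k!) with n = n_1 + ... + n_k."""
    result = factorial(sum(parts))
    for part in parts:
        result //= factorial(part)      # every intermediate quotient is an integer
    return result

def multinomial_via_binomials(*parts):
    """The same number as a product of binomial coefficients."""
    remaining, result = sum(parts), 1
    for part in parts:
        result *= comb(remaining, part)  # places for the next subset among those left
        remaining -= part
    return result

print(multinomial(1, 2), multinomial(3, 1, 1, 1), multinomial(2, 2, 1, 1))
```

The three arguments lists correspond to the letter multiplicities of DAD, ARMADA and LETTER from Problem 1.5.1.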


Problem 1.5.2. Show that for any natural $l$ the number $(l!)! \cdot (l!)^{-(l-1)!}$ is an integer.

Solution. The following is a purely combinatorial proof. If one considers objects of $(l-1)!$ types, $l$ items of each type, that is, $l \cdot (l-1)! = l!$ items in total, then the expression in the problem is exactly the number of permutations with repetition of this set, given by equation (1.5.1) with $k = (l-1)!$, $n = l!$, and $n_1 = n_2 = \cdots = n_k = l$, so that each $n_i! = l!$.
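The divisibility claimed in Problem 1.5.2 can also be observed numerically for the first few values of $l$. This is merely a sanity check, not a proof; the function name `problem_1_5_2_quotient` is hypothetical.

```python
from math import factorial

def problem_1_5_2_quotient(l):
    """Return (l!)! / (l!)^((l-1)!), raising an error if the division is not exact."""
    numerator = factorial(factorial(l))
    denominator = factorial(l) ** factorial(l - 1)
    quotient, remainder = divmod(numerator, denominator)
    if remainder:
        raise ArithmeticError(f"not an integer for l = {l}")
    return quotient

for l in range(1, 5):
    print(l, problem_1_5_2_quotient(l))
```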

Problem 1.5.3. 1) Small College runs four mathematical courses for the Liberal Arts students: five sections of The History of Mathematics, four sections of Mathematics in the Arts, three sections of Introductory Statistics, and two sections of Elementary Combinatorics. Each section of these courses has, respectively, 28, 25, 15, and 12 seats. All 14 sections are taught by 14 different professors, and $5 \cdot 28 + 4 \cdot 25 + 3 \cdot 15 + 2 \cdot 12 = 309$ students satisfy the prerequisites and want to take one class each. In how many ways can these students register for the classes?

2) Solve the same problem if one professor teaches all sections of The History of Mathematics, another professor teaches all sections of Mathematics in the Arts, yet another one teaches all sections of Introductory Statistics, and another professor teaches both sections of Elementary Combinatorics.

Solution. 1) Since all the professors are different, we have 309 objects of 14 types, and by (1.5.1) there are

$$\frac{309!}{(28!)^5\, (25!)^4\, (15!)^3\, (12!)^2}$$

ways these students can register for the classes.

2) However, if one professor teaches all sections of a course, say The History of Mathematics, it makes no difference to the student which particular section of the class to register for, and the answer is now

$$\frac{309!}{(28!)^5\, (25!)^4\, (15!)^3\, (12!)^2\, 5!\, 4!\, 3!\, 2!}.$$
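Both answers are far too large to write out by hand, but they can be evaluated exactly with integer arithmetic. The following sketch uses hypothetical variable names and only restates the two fractions above; it also confirms that the two answers differ by the factor $5!\,4!\,3!\,2!$.

```python
from math import factorial, prod

# seats per section -> number of sections of that course
seats = {28: 5, 25: 4, 15: 3, 12: 2}
students = sum(size * count for size, count in seats.items())

# part 1): all 14 sections are distinguishable
distinct_professors = factorial(students) // prod(
    factorial(size) ** count for size, count in seats.items()
)

# part 2): sections of the same course are not distinguished
same_professor_factor = prod(factorial(count) for count in seats.values())
one_professor_per_course = distinct_professors // same_professor_factor

print(students, same_professor_factor)
print(len(str(distinct_professors)), "decimal digits in the answer to part 1)")
```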

These problems and many others can be conveniently stated by using a general model of objects, say balls, placed in urns. Both urns and balls can be either distinguishable or identical. Therefore, there are in general four possible cases. Two of these cases are treated in the following theorem; we omit its proof, which is similar to the reasoning in the solution of Problem 1.5.3. Two other cases will be considered later.

Theorem 1.5.3. Given $k$ distinguishable groups of urns, $p_1$ urns of one type, $p_2$ urns of another type, etc., where every urn of the $i$th type must receive exactly $n_i$ objects, there are

$$\frac{n!}{(n_1!)^{p_1} \cdots (n_k!)^{p_k}}$$

ways to place $n = n_1 p_1 + \cdots + n_k p_k$ different objects into these $p_1 + \cdots + p_k$ urns if all the urns are different, and

$$\frac{n!}{(n_1!)^{p_1} \cdots (n_k!)^{p_k}\, (p_1)! \cdots (p_k)!}$$

ways to place $n = n_1 p_1 + \cdots + n_k p_k$ different objects into $p_1 + \cdots + p_k$ urns if the urns within each group are indistinguishable.
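Since the proof of Theorem 1.5.3 is omitted, a brute-force comparison on a tiny example may be reassuring. The sketch below rests on assumptions made only for illustration: the names `brute_force` and `theorem_counts` are hypothetical, every urn is taken to hold a prescribed positive number of objects, and the example has two urns holding one object each plus one urn holding two objects.

```python
from itertools import product
from math import factorial, prod

def brute_force(capacities, group_of):
    """Count placements of distinct objects when urn u must hold capacities[u] objects.

    Returns (count with all urns different, count with urns of one group identified).
    """
    n = sum(capacities)
    urns = range(len(capacities))
    distinct, merged = 0, set()
    for assignment in product(urns, repeat=n):          # assignment[o] = urn of object o
        contents = [frozenset(o for o in range(n) if assignment[o] == u) for u in urns]
        if [len(c) for c in contents] != list(capacities):
            continue
        distinct += 1
        # forget which urn of a group holds which content
        merged.add(tuple(
            frozenset(contents[u] for u in urns if group_of[u] == g)
            for g in sorted(set(group_of))
        ))
    return distinct, len(merged)

def theorem_counts(sizes, counts):
    """The two fractions of Theorem 1.5.3 for n_i = sizes[i], p_i = counts[i]."""
    n = sum(s * p for s, p in zip(sizes, counts))
    distinct = factorial(n) // prod(factorial(s) ** p for s, p in zip(sizes, counts))
    return distinct, distinct // prod(factorial(p) for p in counts)

print(brute_force([1, 1, 2], [0, 0, 1]), theorem_counts([1, 2], [2, 1]))
```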


EP 1.5.1. In how many ways is it possible to place 6 identical balls into 4 different urns, so that:

1) No urn is empty?

2) Exactly 2 urns are empty?

3) At most 3 urns are empty?

4) At least 3 urns are empty?

EP 1.5.2. A bus with 35 passengers makes 7 stops. In how many ways can the passengers leave the bus so that exactly 5 of them get off at each stop?

EP 1.5.3. Prove that $(2n)! \cdot 2^{-n}$ and $(3n)! \cdot 6^{-n}$ are integers.

EP 1.5.4. Prove that the fraction $\frac{n!}{n_1!\, n_2! \cdots n_k!}$ is an integer whenever $n_1 + n_2 + \cdots + n_k \le n$.

EP 1.5.5. A student is preparing for a Spelling Bee contest. She looks for an 11-character word containing four letters s, four letters i, two letters p, and one more consonant. How many dictionary entries should she browse at most?

EP 1.5.6. How many four-digit integers can be composed from the digits of the number 12 553 322?

EP 1.5.7. In how many ways can the letters of the word ARMADA be rearranged so that the letters R and M remain together

1) in the same (RM) order?

2) in any order?

EP 1.5.8. In how many ways can the letters of the word MISSISSIPPI be rearranged so that the first occurrence of the letter I precedes the first letter S?

EP 1.5.9. How many natural numbers less than one million contain only the digits 7 and 8?

EP 1.5.10. How many 4-arrangements of 4 red, 1 green, 1 blue, 1 black, and 1 white balls are there?

EP 1.5.11. In how many ways can 30 boy scouts be split into 10 equal groups of 3? Into 3 equal groups of 10?

EP 1.5.12. In how many ways can the letters $a, e, i, o, u, z$ be arranged so that $a$ and $z$ are adjacent?

EP 1.5.13. In how many ways can 13 balls be placed into 6 urns, so that urn 1 contains 3 balls, urn 2 also contains 3 balls, urn 3 contains 1 ball, urn 4 contains 2 balls, urn 5 contains 4 balls, and urn 6 is empty?

EP 1.5.14. How many 27-digit natural numbers are there containing the digits $1, 2, \ldots, 9$ if each digit appears three times?

EP 1.5.15. In how many ways can we partition a $k$-element set $X$ into $l$ parts if the first part contains $k_1$ elements, the second part contains $k_2$ elements, ..., the $l$th part contains $k_l$ elements, thus $k_1 + \cdots + k_l = k$?

EP 1.5.16. How many $r$-combinations with repetition are there from $k$ letters $A$, $l$ letters $B$, and $m$ other different characters, if each combination contains all symbols $A$ and $B$ (and maybe some other symbols)?

EP 1.5.17. There are 15 students in the Combi Club who play ice hockey. In how many ways can their coach make up three sets of five field players? Consider two different cases: when the ordering of the selected five players within the mini-teams of 5 does or does not make a difference.

EP 1.5.18. There are 18 students in the Combi Club. In how many ways can their ice hockey coach assign three goalies and make up three sets of five field players?

EP 1.5.19. Prove the multinomial theorem,

$$(t_1 + t_2 + \cdots + t_k)^n = \sum C(n; n_1, n_2, \ldots, n_k)\, t_1^{n_1} \cdots t_k^{n_k},$$

where $C(n; n_1, n_2, \ldots, n_k)$ are the multinomial coefficients (1.5.1); the sum is taken over all sets of whole numbers $n_i$ such that $n_1 + n_2 + \cdots + n_k = n$.
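Before proving the multinomial theorem, one may wish to test it numerically at a few integer points. The sketch below evaluates the right-hand side term by term; the function name `multinomial_expansion` is hypothetical, and the check is of course no substitute for a proof.

```python
from itertools import product
from math import factorial

def multinomial_expansion(values, n):
    """Right-hand side of the multinomial theorem evaluated at t_i = values[i]."""
    k = len(values)
    total = 0
    for exps in product(range(n + 1), repeat=k):
        if sum(exps) != n:              # keep only n_1 + ... + n_k = n
            continue
        term = factorial(n)
        for e in exps:
            term //= factorial(e)       # multinomial coefficient (1.5.1)
        for t, e in zip(values, exps):
            term *= t ** e
        total += term
    return total

print(multinomial_expansion((2, 3, 5), 4), (2 + 3 + 5) ** 4)
```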

EP 1.5.20. Use the multinomial theorem to find the expansion of $(x_1 + x_2 + x_3)^4$.

EP 1.5.21. 1) How many terms does the expansion of $(x_1 + x_2 + x_3)^8$ have?

2) Use the multinomial theorem to find the coefficient of $x_1^2 x_2 x_3^5$ in $(x_1 + x_2 + x_3)^8$.

3) What is the constant term (not containing $x$) in the expansion of $\left( x + \frac{1}{x} - 3 \right)^7$ in powers of $x$?

EP 1.5.22. How many four-digit multiples of 4 are there composed of the digits $1, 2, 3, 4$, and $5$?

EP 1.5.23. How many permutations with repetition of $b$ identical balls and $c$ identical cubes are there?

EP 1.5.24. (Cf. Theorem 1.5.3.) In how many ways can $n$ balls be distributed in $k$ different urns if

1) All balls are different and any urn can contain any number of balls (Maxwell–Boltzmann statistics)?

2) The balls are indistinguishable and any urn can contain any number of balls (Bose–Einstein statistics)?

3) The balls are indistinguishable and any urn can contain no more than one ball (Fermi–Dirac statistics)?

1.6 Probability Theory on Finite Sets

In this section we consider probabilistic problems with finite sample spaces. If, in addition, we assume the hypothesis of equally likely outcomes, then these problems can be straightforwardly translated into combinatorial ones and vice versa. Therefore, these probabilistic problems provide an ample field for applications of the methods we have developed in the preceding sections. In particular, we consider applications of these results to calculating the outcomes of lotteries and other games of chance.

Abacus · Genetics and Probability Theory · Bayes' biography · Bernoulli family · De Mere's paradox · Gender's ratio

Our world is random and often unpredictable, meaning that the results, or outcomes, of many of our actions cannot be predicted in advance. When a kid starts elementary school, her parents have certain expectations, but they cannot predict her college GPA⁸ for sure. Another simple and popular example of randomness is tossing a coin. Probability theory studies (some of) such random events by mathematical methods. First we introduce some terminology.

⁸ Grade Point Average


Any operation, procedure, experiment with results that cannot be predicted in ad-vance, like tossing a coin, or rolling a die, or drawing a card from a deck, is referredto as a random experiment. This is not a definition, here we just introduce a primarynotion like the concepts of a set and a function introduced in Section 1.1. When anexperiment has the only possible result, the outcome is certainly known in advanceand this experiment is not random.

Definition 1.6.1. All the possible results of a random experiment are called its outcomes. The totality of all possible outcomes is called the sample space $S$ of the experiment. Points of the sample space, that is, outcomes of a random experiment, are also called elementary events. Any set $E$ of outcomes, that is, a subset $E \subseteq S$ of the sample space, is called an event. The empty event $E = \emptyset$ is also called impossible or improbable; the universal event $E = S$ is called certain.

The outcomes belonging to a given event are sometimes called outcomes favorable to this event. The sample space depends upon the problem. For instance, when we toss a coin, then in addition to the two typical outcomes, heads and tails, the coin might rest on its edge, even though this phenomenon is not easy to observe, or it can roll away and disappear; but the latter two possibilities are practically improbable and negligible. Thus, discussing experiments with flipping a coin, we always consider the sample space consisting of only two points, a head $H$ and a tail $T$, in symbols $S = \{H, T\}$. If we roll a die (a six-faced cube) with faces marked by the digits 1, 2, 3, 4, 5, 6, or by dots, then the sample space of this random experiment is $S = \{1, 2, \ldots, 6\}$. However, if the faces of a die are marked 1, 2, 3, 4, 5, 5, then the sample space of the experiment is $S = \{1, 2, 3, 4, 5\}$.

As another example, we consider a lottery with prize levels $1, $5, $100, and $10 000. The drawing is a random experiment and if we have only one ticket, the sample space consists of five points, $S = \{\$0, \$1, \$5, \$100, \$10\,000\}$. However, in some cases we may only be interested in the very fact of winning (W) or losing (L) the game and can choose another sample space $S_1 = \{W, L\}$. Depending on the issue we are interested in, there are also other possible choices for the sample space in this problem.

To solve a problem in probability theory correctly, we must explicitly specify the sample space of the problem; otherwise different people can read the same words in different ways and arrive at different conclusions. Hereafter we consider only random experiments with finite sample spaces.

Problem 1.6.1. Define the sample space in the last example if

1) We have two tickets

2) We have one ticket that costs $1 and are interested in the net income.

In many problems it is necessary to consider composite events, consisting of simple ones, and to combine simple sample spaces into more complex spaces. For example, if we toss two distinguishable coins, the sample space consists of ordered pairs of the symbols $H$ and $T$; by the Product Rule, the new sample space contains 4 points, namely, $S = \{(H, H), (H, T), (T, H), (T, T)\}$. If we simultaneously roll 3 different dice, then the sample space consists of $6^3 = 216$ ordered triples, $S = \{(1, 1, 1), (1, 1, 2), \ldots, (6, 6, 6)\}$.
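The composite sample spaces just described can be enumerated mechanically. The following is a minimal Python sketch, not part of the original text; the names `coin`, `die`, `two_coins`, and `three_dice` are chosen here only for illustration.

```python
from itertools import product

# Sample spaces of the simple experiments.
coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]

# Ordered pairs for two distinguishable coins (Product Rule: 2 * 2 = 4 points).
two_coins = list(product(coin, repeat=2))

# Ordered triples for three different dice (6**3 = 216 points).
three_dice = list(product(die, repeat=3))

print(two_coins)        # [('H', 'H'), ('H', 'T'), ('T', 'H'), ('T', 'T')]
print(len(three_dice))  # 216
```

Because `product` returns ordered tuples, the pairs `('H', 'T')` and `('T', 'H')` are counted as different points, which matches the assumption that the coins are distinguishable.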

Up to this point we have discussed only sample spaces. Probability theory originates when a certain specific number $p(s)$, called the probability of the outcome $s$⁹, is assigned to each point $s$ of the sample space $S$. The set of these values is called a probability distribution on the sample space $S$, because we distribute a certain given “supply” of probability among the points of $S$. These values cannot be assigned arbitrarily; they must satisfy certain assumptions, the axioms of probability theory; for more on that see, for example, [16]. We consider the following system of axioms:

PA1) $p(s) \ge 0$ for any point $s \in S$

PA2) if $E = \{s_1, s_2, \ldots, s_k\} \subseteq S$, then $p(E) = p(s_1) + p(s_2) + \cdots + p(s_k)$

PA3) $p(S) = 1$.

Therefore, we have assumed that probability values are nonnegative, the probability of any event $E$ is the sum of the probabilities of the elementary events composing $E$, and the total probability is 1. These axioms immediately imply that if $E_1, \ldots, E_k$ are any pairwise disjoint events, that is, $E_1, \ldots, E_k \subseteq S$ and $E_i \cap E_j = \emptyset$ for $1 \le i, j \le k$, $i \ne j$, then $p(E_1 \cup \cdots \cup E_k) = p(E_1) + \cdots + p(E_k)$, that is, the probability is finitely additive. Moreover, for any event $E$ we have $p(E) = p(E \cup \emptyset) = p(E) + p(\emptyset)$, thus $p(\emptyset) = 0$: the empty event must have zero probability.
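As an illustration of the axioms, here is a small Python sketch that stores a distribution as a dictionary and checks PA1) and PA3) directly, while PA2) becomes the rule for computing the probability of an event. The helper names `is_distribution` and `prob` and the fair-die example are assumptions made for this sketch only.

```python
from fractions import Fraction

def is_distribution(p):
    """Check PA1 and PA3 for a dict mapping outcomes to probabilities."""
    nonnegative = all(value >= 0 for value in p.values())   # PA1
    total_is_one = sum(p.values()) == 1                      # PA3
    return nonnegative and total_is_one

def prob(p, event):
    """PA2: the probability of an event is the sum over its outcomes."""
    return sum((p[s] for s in event), Fraction(0))

# A fair die as an illustrative distribution.
fair_die = {s: Fraction(1, 6) for s in range(1, 7)}

odd, even = {1, 3, 5}, {2, 4, 6}           # two disjoint events
assert is_distribution(fair_die)
assert prob(fair_die, odd | even) == prob(fair_die, odd) + prob(fair_die, even)
assert prob(fair_die, set()) == 0          # the empty event has probability 0
```

Exact fractions are used instead of floating-point numbers so that the equalities hold without rounding error.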

In some cases we can conduct a random experiment in reality; for instance, we can toss a coin many times and record the numbers of heads, $n(H)$, and tails, $n(T)$, that occurred. If the experiment was repeated $n$ times and the outcomes favorable to an event $E$ were observed $k(E)$ times among the $n$ outcomes, then the frequency ratio $f(E) = k(E)/n$ is called the experimental or frequency probability of the event $E$.

Clearly, the frequency $f(E)$ depends, among other things, on the length $n$ of the experiment. If, as $n$ increases, $f(E)$ stabilizes near a number $p(E)$, we can use $f(E)$ as an estimate of the probability $p(E)$ of the event $E$, but this is only a plausible approximation. For example, there is nothing unusual in getting two heads in a row; in this series $n = 2$, $f(H) = 2/2 = 1$, and $f(T) = 0/2 = 0$. However, if we use this very short series to estimate the probability of getting a tail, we have $p(T) \approx f(T) = 0$, which obviously makes no sense. More advanced courses in probability theory treat in more detail the question of what the appropriate length of an experiment is.
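The dependence of the frequency on the length of the experiment can be observed in a simulation. The sketch below replaces a physical coin with Python's pseudo-random generator, which is an assumption of this illustration; the function name `heads_frequency` and the seed value are likewise arbitrary.

```python
import random

def heads_frequency(n, seed=0):
    """Toss a simulated fair coin n times and return f(H) = k(H) / n."""
    rng = random.Random(seed)              # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n))
    return heads / n

# Short series can be far from 1/2; long series settle close to it.
for n in (2, 10, 100, 10_000, 100_000):
    print(n, heads_frequency(n))
```

For very short series the printed frequency can take extreme values such as 0 or 1, while for long series it stays close to 1/2.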

⁹ One can often hear in everyday talk, “It’s probable” or “That’s unlikely.” Based on such individual judgment, some people play lotteries while others do not, because the latter do not believe that there are reasonable chances to win. Any discussion of such subjective probabilities is beyond the scope of this book.


Any collection of numbers satisfying axioms PA1)–PA3) can be used as a probability distribution. For example, experimenting with a coin and choosing the sample space $S = \{H, T\}$, we can assign $p(H) = 1/3$ and $p(T) = 2/3$. However, unless we have a specifically tailored (very biased) coin, the results of our physical experiments will likely be essentially different from the results predicted by the mathematical model. Therefore, to assign a probability distribution, we use either some previous experience (the results of real experiments) or a theory, if it exists.

It is probably physically impossible to make a perfect coin; however, real experiments have confirmed that if a coin is chosen at random, then as a first approximation it is quite realistic to assign the probabilities $p(H) = p(T) = 1/2$. On the other hand, the same experiments show that no real coin satisfies this probability distribution precisely, but exhibits slight deviations from the theoretical probability 1/2. Nevertheless, it is customary in theoretical studies to accept the hypothesis of equally likely probabilities or equal chances¹⁰, that is, to assign equal probabilities to each point of the sample space.

Definition 1.6.2. It is said that the assumption of equally likely probabilities is valid for a given problem with the sample space $S = \{s_1, s_2, \ldots, s_n\}$ if the probability distribution on $S$ is given by

$$p(s_1) = p(s_2) = \cdots = p(s_n) = \frac{1}{n}.$$

Whether or not this assumption holds true in any particular case should be verified by comparing our calculations with experiments. The following well-known example is illuminating. Our intuition might tell us that the number of girls born must on average be the same as the number of boys, and many computations using the equal probabilities 1/2 as a first approximation give good results. However, observations over many years have shown that in reality the probability for a newborn baby to be a boy is slightly bigger, namely 0.51, versus 0.49¹¹ for a girl.

From now on we always suppose the hypothesis of equally likely probabilities to be valid, unless the opposite is explicitly stated.

The goal of this section is to show applications of the developed combinatorial methods and results to probability theory. First we translate a few basic set-theory notions into probabilistic language. Recall that an event is just a subset of the basic (universal) set, which is called here the sample space. All the events under consideration are subsets of some fixed sample space $S$. Therefore, we can define the following operations with events through their set-theory counterparts.

¹⁰ The terms probability or chance should not be confused with the term odds. The expression “odds in favor of an event $E$” means the ratio $p(E)/p(\overline{E})$, while “odds against an event $E$” means the reciprocal ratio $p(\overline{E})/p(E)$.

¹¹ There are data indicating that this gap may be shrinking.


Definition 1.6.3. 1) The event $\overline{E} = S \setminus E$ is called complementary to an event $E$.

2) Two events are called disjoint or mutually exclusive if their set-theory intersection is empty, that is, if they have no common favorable outcomes. Thus, if $E_1$ and $E_2$ are disjoint events, then $p(E_1 \cap E_2) = 0$.

3) A system of events $\{E_1, \ldots, E_k\}$ is called exhaustive if $\bigcup_{i=1}^{k} E_i = S$.

Example 1.6.4. Let us toss a coin and choose the sample space $\{H, T\}$. Then the events “To get an H” and “To get a T” are disjoint, mutually complementary, and together exhaust the sample space. The events “To get an odd number” and “To get a number less than 3” in one roll of a die are not mutually exclusive. The event complementary to “To get a number less than 3” is “To get a number greater than or equal to 3”. Any event and its complement make up an exhaustive system.

The following properties are immediate consequences of the definitions, axioms PA1)–PA3), and the results of Section 1.1. It is critical that any probability distribution is finitely additive.

Theorem 1.6.5. 1) For any event $E$,

$$p(\overline{E}) = 1 - p(E). \tag{1.6.1}$$

2) For any events $E_1$ and $E_2$,

$$p(E_1 \cup E_2) = p(E_1) + p(E_2) - p(E_1 \cap E_2). \tag{1.6.2}$$

In particular, if $E_1$ and $E_2$ are disjoint, that is, $E_1 \cap E_2 = \emptyset$, then $p(E_1 \cap E_2) = 0$, and $p(E_1 \cup E_2) = p(E_1) + p(E_2)$.
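Both parts of the theorem can be checked on a small example. The sketch below assumes one roll of a fair die and reuses the two events of Example 1.6.4; the helper `p` is a name introduced here and simply counts favorable outcomes.

```python
from fractions import Fraction

S = set(range(1, 7))                       # one roll of a fair die

def p(event):
    """Probability under equally likely outcomes: |E| / |S|."""
    return Fraction(len(event), len(S))

E1 = {s for s in S if s % 2 == 1}          # "to get an odd number"
E2 = {s for s in S if s < 3}               # "to get a number less than 3"

# (1.6.1): complement rule
assert p(S - E1) == 1 - p(E1)
# (1.6.2): the common outcome 1 must not be counted twice
assert p(E1 | E2) == p(E1) + p(E2) - p(E1 & E2)
print(p(E1 | E2))                          # 2/3
```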

Problem 1.6.2. Two fair right tetrahedrons, a green one and a blue one, with faces marked 1 through 4, were tossed. We record the numbers on the faces on which they landed.

1) What is the probability that the sum of these numbers is 7?

2) What is the probability that the sum of these numbers is greater than or equal to 7?

3) What is the probability that the sum of these numbers is greater than 7?

Solution. In this and similar problems “fair” means that we accept the hypothesis of equally likely outcomes. Since the tetrahedrons are different, the sample space consists of $4 \times 4 = 16$ ordered pairs, $S = \{(1, 1), (1, 2), \ldots, (4, 4)\}$, where the pairs $(1, 2)$ and $(2, 1)$ are different. Hence, the probability of any outcome is 1/16. The sum of 7 can occur as either $3 + 4$ or $4 + 3$, and these outcomes are disjoint, since in a single throw a tetrahedron cannot show both a 3 and a 4. Hence, the answer to part 1) is $\frac{1}{16} + \frac{1}{16} = 2 \cdot \frac{1}{16} = \frac{1}{8}$.

2) Since the largest number on a face is 4, a sum of 7 or more means either 7 or 8. Therefore, compared with part 1) of the problem, there is one more favorable outcome, the pair $(4, 4)$, which is also disjoint from the preceding ones, and the answer to part 2) is $3 \cdot \frac{1}{16} = \frac{3}{16}$.

In part 3), the only favorable outcome is the pair $(4, 4)$, thus the answer is $p((4, 4)) = 1/16$. The events in parts 1) and 3) are disjoint and their union is the event in part 2); that is why the answer in part 2) is the sum of those in parts 1) and 3).
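The three answers can be confirmed by listing the sample space and counting favorable outcomes. This is only a brute-force sketch; the helper `p`, which takes a condition on an ordered pair, is an assumption of the illustration.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 5)
S = list(product(faces, repeat=2))         # 16 ordered pairs (green, blue)

def p(condition):
    """Probability of the event {s in S : condition(s)}, outcomes equally likely."""
    favorable = [s for s in S if condition(s)]
    return Fraction(len(favorable), len(S))

p_eq7 = p(lambda s: sum(s) == 7)           # part 1
p_ge7 = p(lambda s: sum(s) >= 7)           # part 2
p_gt7 = p(lambda s: sum(s) > 7)            # part 3
print(p_eq7, p_ge7, p_gt7)                 # 1/8 3/16 1/16
```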

Problem 1.6.3. This weekend Kathy either goes to the movies, with the probability of this event 0.7, or to the restaurant with the probability 0.5.

1) Given this information, is it possible to conclude that these two events are mutually exclusive?

2) What is the smallest and the largest possible probability that this weekend Kathy will have both these pleasures?

3) How can we change the problem to be able to determine precisely the probability that this weekend Kathy gets at least one of these pleasures? Both these pleasures?

Problem 1.6.4. Two dice are rolled simultaneously. What is the probability to get at least one number greater than 4?

Consider again Problem 1.4.18, where we found the number of $n$-arrangements with repetition from the set $A = \{0, 1\}$ containing an even number of 0s, but now we state the question in probabilistic terms.

Problem 1.6.5. Given all $2^n$ $n$-arrangements with repetition from the set $A = \{0, 1\}$, we choose one of them at random, assuming that every arrangement has an equal chance to occur. What is the probability to pick an arrangement containing an even number of 0s?

Solution. The sample space consists of $2^n$ arrangements. According to the solution of Problem 1.4.18, $2^{n-1}$ of them (exactly half of the sample space) are favorable outcomes for our problem. Therefore, the probability we seek is $p = 2^{n-1}/2^n = 1/2$.
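For small $n$ the answer can be confirmed by listing all arrangements. The sketch below is a brute-force check rather than a proof; the function name `prob_even_zeros` is introduced only for this illustration, and $n \ge 1$ is assumed.

```python
from fractions import Fraction
from itertools import product

def prob_even_zeros(n):
    """Probability that a random 0/1 string of length n has an even number of 0s."""
    arrangements = list(product("01", repeat=n))             # 2**n strings
    favorable = [a for a in arrangements if a.count("0") % 2 == 0]
    return Fraction(len(favorable), len(arrangements))

for n in range(1, 9):
    assert prob_even_zeros(n) == Fraction(1, 2)
```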

Analyzing the solution, we observe an important feature of all similar problems: To solve a probabilistic problem with a finite sample space, we have to solve two enumerative combinatorial problems.

Problem 1.6.6. All permutations of the letters of the word MISSISSIPPI are written on balls, and one of these balls is chosen at random. What is the probability that we pick the ball with the word MISSISSIPPI?

Solution. The sample space consists of all permutations with repetition of the letters of the word MISSISSIPPI and by Theorem 1.5.2 contains $C(11; 1, 4, 4, 2)$ elements. Among them there is only one favorable outcome; thus, the probability is $1/C(11; 1, 4, 4, 2) = \frac{4!\,4!\,2!}{11!} \approx 0.000029$.
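The size of this sample space is easy to compute from the letter multiplicities. The following Python sketch is an illustration only; the function name `distinct_permutations` is an assumption, and the count it returns is the quantity written above as $C(11; 1, 4, 4, 2)$.

```python
from collections import Counter
from fractions import Fraction
from math import factorial

def distinct_permutations(word):
    """Number of permutations with repetition of the letters of word."""
    count = factorial(len(word))
    for multiplicity in Counter(word).values():
        count //= factorial(multiplicity)   # each division is exact
    return count

n = distinct_permutations("MISSISSIPPI")
print(n)                        # 34650
print(float(Fraction(1, n)))    # roughly 2.9e-05
```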

Problem 1.6.7. Among all permutations with repetition of the letters of the word DAD, one is chosen at random. What is the probability to find the chosen combination of letters in an English dictionary?

Solution. The sample space consists of $3!/2! = 3$ permutations with repetition, ADD, DAD, DDA, but only the first two strings are meaningful English words, that is, are favorable outcomes in our problem. Therefore, the probability we seek is $p = 2/3$.

Any probability distribution on a sample space $S$ puts a number $p(s) \in [0, 1]$ into correspondence with each point $s$ of the sample space; therefore, this distribution constitutes a function $f : S \to \mathbb{R}$. Since the domain of this function consists of the outcomes of a random experiment, the values of the function are also random. Such functions are called random variables. An initial probability distribution is also a random variable. In many problems it may be advantageous to change the values of a given probability distribution, as long as we preserve axioms PA1)–PA3).

Definition 1.6.6. Given a random experiment with a sample space $S$, any real-valued function

$$f : S \to \mathbb{R}$$

with the domain $S$ is called a random variable or a random function whenever it satisfies three properties similar to the probabilistic axioms PA1)–PA3):

RA1) $f(s) \ge 0$ for any point $s \in S$

RA2) If $E = \{s_1, s_2, \ldots, s_k\} \subseteq S$, then $f(E) = f(s_1) + f(s_2) + \cdots + f(s_k)$

RA3) $f(S) = 1$.

In particular, any probability distribution is a random function.

Problem 1.6.8. Consider a sample space $S = \{1, 2, \ldots, n\}$, where $n$ is a given natural number. Let $f$ be a linear function, $f(s) = cs$, $c$ being a real constant. Find the coefficient $c$ so that the function $f$ is a random variable on $S$.

Solution. We must verify the properties RA1)–RA3). RA1) is clear if $c \ge 0$, RA2) is a rule for computing $f(E)$ through the values $f(s)$, $s \in S$, and we only have to compute the normalization constant $c$ by making use of RA3). We have

$$1 = f(S) = f(1) + \cdots + f(n) = c \cdot 1 + c \cdot 2 + \cdots + c \cdot n = c\,\frac{n(n+1)}{2}$$

by Problem 1.1.6; thus for $f$ to be a random variable, we must have $c = \frac{2}{n(n+1)}$.
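The normalization can be verified numerically for any particular $n$. The sketch below uses exact fractions; the function name `linear_weights` and the choice $n = 4$ are assumptions of this illustration.

```python
from fractions import Fraction

def linear_weights(n):
    """Return c and the values f(s) = c*s on S = {1, ..., n} with f(S) = 1."""
    c = Fraction(2, n * (n + 1))
    return c, {s: c * s for s in range(1, n + 1)}

c, f = linear_weights(4)
print(c)                          # 1/10
print([str(v) for v in f.values()])   # ['1/10', '1/5', '3/10', '2/5']
assert sum(f.values()) == 1       # RA3 holds
```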


The equation $p(E_1 \cup E_2) = p(E_1) + p(E_2)$ tells us that two events are mutually exclusive. Another important mutual characteristic of a pair of events $E_1, E_2$ is their (stochastic) dependence or independence. It turns out that this property is connected with the equation $p(E_1 \cap E_2) = p(E_1) \cdot p(E_2)$, which is not always valid. Intuitively,

Two events are independent if the occurrence or non-occurrence of either of them does not affect the probability that the other event occurs. (1.6.3)

To define dependence and independence in more precise analytic terms, it is convenient to connect them with another important concept, namely, the conditional probability of an event. First, we again model this notion by using an example.

Example 1.6.7. Based on statistics collected over many years, the probability for a freshman to graduate in four years from The Liberal College is 0.85, while for a freshman majoring in sciences this probability is only 0.70. The sample space in this problem consists of all students who ever graduated from the college. In the problem we have two probabilities: for all students the probability is 0.85, while for the science majors it is 0.70. The second number is different from the first one because in computing it we have used some additional information on the students’ majors; in effect, we reduced the sample space by removing all non-science majors. Since the second probability was computed under an extra condition, it is called a conditional probability.

To arrive at a definition, we sketch a computation of the conditional probability of an event $E$, given another event (a condition) $C$, in terms of favorable outcomes. Computing the probability $p(E)$, we have to take into account all outcomes favorable to $E$ and relate them to the whole sample space $S$. However, when computing the conditional probability we know for certain that the event $C$ has occurred; thus, now we consider only those outcomes favorable to $E$ that are favorable to $C$ as well. Moreover, we must relate them not to the entire original sample space $S$, but only to the subset of outcomes favorable to $C$; hence, we must reduce the original sample space. If we express all these quantities in terms of the size $|S|$ of the sample space and of the probabilities $p(C)$ and $p(E \cap C)$, we derive formula (1.6.4): under equally likely outcomes the ratio $|E \cap C|/|C|$ equals $\frac{|E \cap C|/|S|}{|C|/|S|} = p(E \cap C)/p(C)$. It is convenient to reverse this reasoning and use (1.6.4) as the definition of the conditional probability.

Definition 1.6.8. Consider a random experiment with the sample space $S$, a generic event $E$, and a specified event (condition) $C$ such that $p(C) > 0$. The conditional probability $p(E \mid C)$ of an event $E$ given the event $C$ is defined by

$$p(E \mid C) = \frac{p(E \cap C)}{p(C)}. \tag{1.6.4}$$

It is often convenient to rewrite this formula as

$$p(E \mid C)\, p(C) = p(E \cap C).$$


Problem 1.6.9. What is the probability to get a 3 in one roll of a die given that the outcome is odd?

Solution. Introduce the event $E_3 = \{x = 3\}$ and the condition $C = \{x \text{ is odd}\}$; we know that $p(E_3) = 1/6$ and $p(C) = 1/2$. The intersection of these events is $E_3 \cap C = E_3$, thus, $p(E_3 \cap C) = 1/6$. By (1.6.4), the conditional probability is $p(E_3 \mid C) = (1/6)/(1/2) = 1/3$.

Problem 1.6.10. What is the probability to get a 2 in one roll of a die given that the outcome is odd?

Solution. It is clear without computations that if the outcome is odd, it cannot be 2, but let us formally compute the result. Let $E_2 = \{x = 2\}$ and $C = \{x \text{ is odd}\}$; $p(C) = 1/2$. The intersection of the two events is empty, $E_2 \cap C = \{2\} \cap \{1, 3, 5\} = \emptyset$, thus, $p(E_2 \cap C) = 0$ and the conditional probability is $p(E_2 \mid C) = 0/(1/2) = 0$.
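The two computations above follow the same pattern, which can be captured in a few lines of Python. This is a sketch for a fair die with events stored as sets; the helper names `p` and `p_given` are assumptions of the illustration.

```python
from fractions import Fraction

S = set(range(1, 7))                       # one roll of a fair die

def p(event):
    """Probability under equally likely outcomes."""
    return Fraction(len(event), len(S))

def p_given(event, condition):
    """Conditional probability (1.6.4); requires p(condition) > 0."""
    return p(event & condition) / p(condition)

odd = {1, 3, 5}
print(p_given({3}, odd))                   # 1/3
print(p_given({2}, odd))                   # 0
```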

Now we can define the independence of two events in terms of conditional probability.

Definition 1.6.9. Two events $E$ and $C$ are called (stochastically) independent if

$$p(E \mid C) = p(E); \tag{1.6.5}$$

otherwise the events are called dependent.

Comparing (1.6.4) with (1.6.5), we see that two events are independent if

$$p(E \cap C) = p(E)\, p(C); \tag{1.6.6}$$

thus, equation (1.6.6) formalizes our “intuitive” definition (1.6.3). It is worth noting that independence is a symmetric property, which is obvious from (1.6.6), but not from (1.6.3).

Problem 1.6.11. A card is drawn at random from a regular deck containing 52 cards. Are the events $A$, “To pick an Ace”, and $C$, “To pick a club”, dependent or independent?

Solution. The deck contains 4 Aces, so that $p(A) = 4/52 = 1/13$. Calculate the conditional probability $p(A \mid C)$. Obviously, $p(A \cap C) = 1/52$ and $p(C) = 13/52$, therefore $p(A \mid C) = \frac{p(A \cap C)}{p(C)} = 1/13$. Since $p(A \mid C) = p(A)$, we conclude that these events are independent.
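The same conclusion can be reached by building the deck explicitly and testing the product equation (1.6.6). In the sketch below the encoding of cards as (rank, suit) pairs and the helper `p` are assumptions of the illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = set(product(ranks, suits))          # 52 cards

def p(event):
    """Probability under equally likely outcomes."""
    return Fraction(len(event), len(deck))

A = {card for card in deck if card[0] == "A"}        # "to pick an Ace"
C = {card for card in deck if card[1] == "clubs"}    # "to pick a club"

# Independence test (1.6.6): p(A and C) == p(A) * p(C)
print(p(A & C), p(A) * p(C))               # 1/52 1/52
assert p(A & C) == p(A) * p(C)
```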

Problem 1.6.12. To win the jackpot in the New York Lottery Mega Millions game, one must guess correctly 5 numbers among $1, \ldots, 56$ and one more number from $1, \ldots, 46$. What is the probability to win the jackpot if you have one ticket?

Solution. Since the order is not important, there are $C(56, 5) = 3\,819\,816$ ways to choose five numbers and $C(46, 1) = 46$ ways to select the Mega Ball number. Since the last choice is independent of the first five numbers (Why?), there are $C(56, 5) \cdot C(46, 1) = 175\,711\,536$ different tickets, and this is the cardinality of the sample space. Therefore, the probability we look for is $1/175\,711\,536 \approx 5.69 \times 10^{-9}$.
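The arithmetic in this solution can be reproduced with the standard library. The sketch below assumes Python 3.8 or later, where `math.comb` is available; the variable name `tickets` is chosen for the illustration.

```python
from math import comb

tickets = comb(56, 5) * comb(46, 1)        # Product Rule
print(comb(56, 5))                         # 3819816
print(tickets)                             # 175711536
print(1 / tickets)                         # about 5.69e-09
```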

Thus, if we have enough funds and time to buy 175 711 536 $1-tickets, we definitely get the jackpot. Considering the appropriate taxes, not to mention the slight possibility that someone else also has a winning ticket, we can estimate how large the jackpot must be to pay off such an expense.

If we occasionally buy a lottery ticket, we cannot predict the future: the chances are slim, but who knows... Sometimes people win the jackpot. However, if we play any game of chance systematically, we may want to estimate our chances in the long run. The mathematical instrument for such estimations is called the mathematical expectation or the expected value of a random variable.

Definition 1.6.10. Consider a probability distribution $p(s)$ on the sample space $S$ and a random function $f(s)$, $s \in S$. The mathematical expectation or the expected value of the random variable $f$ is the sum

$$E(f) = \sum_{s \in S} p(s) f(s). \tag{1.6.7}$$

Problem 1.6.13. Find the expected value of the net gain in the preceding problem if the jackpot was $10 000 000.

Solution. The sample space has only two points, $s_1 = W$ with the probability $p(s_1) = 1/175\,711\,536$ and $s_2 = L$ with $p(s_2) = 1 - 1/175\,711\,536$. The corresponding gains are $f(s_1) = \$10\,000\,000 - \$1$ and $f(s_2) = -\$1$. Therefore,

$$E(f) = \$9\,999\,999 \cdot \frac{1}{175\,711\,536} - \$1 \cdot \left(1 - \frac{1}{175\,711\,536}\right) \approx -\$0.94,$$

and in the long run we should expect to lose about 94 cents from each dollar spent.
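Formula (1.6.7) translates directly into code. The following sketch recomputes the expected net gain with exact fractions; the function name `expectation` and the dictionary encoding of the two outcomes W and L are assumptions of this illustration.

```python
from fractions import Fraction

def expectation(p, f):
    """Expected value (1.6.7): sum of p(s) * f(s) over the sample space."""
    return sum(p[s] * f[s] for s in p)

tickets = 175_711_536
p = {"W": Fraction(1, tickets), "L": 1 - Fraction(1, tickets)}
f = {"W": 10_000_000 - 1, "L": -1}          # net gain in dollars

print(float(expectation(p, f)))             # about -0.943
```

The same function works for any finite sample space, as long as `p` and `f` are given on the same set of outcomes.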

Problem 1.6.14. Find the expected value of the net gain in the preceding problem if the jackpot was $100 000 000.

Problem 1.6.13 gives an example of a binomial distribution, that is, a random experiment with exactly two outcomes, usually called a success and a failure, whose probabilities do not change in time. We follow the tradition and denote the probability of success by $p = p(\text{success})$ and the probability of failure by $q = p(\text{failure})$; thus, $0 \le p, q \le 1$ and $p + q = 1$. If we repeat a binomial experiment $n$ times, assuming all the outcomes to be independent (such a series is called Bernoulli trials), then a typical problem is to compute the probability of getting $r$ successes in these $n$ trials. Since there are $C(n, r)$ ways to select $r$ “successful” trials among $n$, the probability of getting $r$ successes is, by the Product Rule,

$$p(r, n) = C(n, r)\, p^r q^{n-r}. \tag{1.6.8}$$
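Formula (1.6.8) is short enough to be coded directly. The sketch below assumes Python 3.8 or later for `math.comb`; the function name `binomial_probability` and the sample values of $n$, $r$, and $p$ are chosen only for illustration.

```python
from fractions import Fraction
from math import comb

def binomial_probability(r, n, p):
    """Formula (1.6.8): probability of exactly r successes in n Bernoulli trials."""
    q = 1 - p
    return comb(n, r) * p**r * q**(n - r)

half = Fraction(1, 2)
print(binomial_probability(2, 4, half))     # 3/8

# The probabilities of r = 0, 1, ..., n successes exhaust all possibilities.
assert sum(binomial_probability(r, 10, Fraction(1, 3)) for r in range(11)) == 1
```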

We used (1.6.8) in the solution of Problem 1.6.13 with $n = 1$, $r = 1$, $p = 1/175\,711\,536$ and $q = 1 - 1/175\,711\,536$.

Consider again the definition of the conditional probability (1.6.4). Since the operation of intersection of two sets is commutative, it implies the following property:

$$p(E \mid C)\, p(C) = p(C \mid E)\, p(E). \tag{1.6.9}$$

Thus, we can express the conditional probability of two events through their conditional probability in reversed order, that is, as

$$p(C \mid E) = \frac{p(E \mid C)\, p(C)}{p(E)}$$

if $p(E) > 0$. This property can easily be extended to the case of several conditions.

Theorem 1.6.11. Let events $C_1, C_2, \ldots, C_k$ make a partition of the sample space $S$, that is, all $C_j$ are non-empty, pairwise disjoint, and exhaust the sample space $S$. Then the following equation, called Bayes’s formula, is valid for any event $E$ with $p(E) > 0$ and any $j$, $1 \le j \le k$:

$$p(C_j \mid E) = \frac{p(E \mid C_j)\, p(C_j)}{\sum_{i=1}^{k} p(E \mid C_i)\, p(C_i)}. \tag{1.6.10}$$

Proof. By Problem 1.1.12, we have

$$E = E \cap S = E \cap \Big( \bigcup_{j=1}^{k} C_j \Big) = \bigcup_{j=1}^{k} (E \cap C_j).$$

The intersections $E \cap C_j$, $1 \le j \le k$, are also mutually exclusive, thus $p(E) = \sum_{j=1}^{k} p(E \cap C_j)$, where $p(E \cap C_j) = p(E \mid C_j)\, p(C_j)$ by (1.6.4). Combining the latter with the formula for the conditional probability $p(C_j \mid E) = p(E \cap C_j)/p(E)$, we deduce (1.6.10).

Problem 1.6.15. Let us note that all $p(C_j) \ne 0$, since we have assumed $C_j \ne \emptyset$. Is it possible that the denominator in (1.6.10) is zero?

Problem 1.6.16. A die was randomly selected from a set containing 999 999 regular dice and one die with all faces marked by 1, and was rolled 10 times. What is the probability that the fake die was chosen, given that a 1 was observed in all 10 trials?


Solution. Consider three events,

$C = \{\text{Observe a 1 in 10 consecutive trials}\}$

$D = \{\text{Choose a fair die}\}$ with $p(D) = 1 - 10^{-6}$

$F = \{\text{Choose a fake die}\}$ with $p(F) = 10^{-6}$.

By Bayes’s formula,

$$p(F \mid C) = \frac{p(C \mid F)\, p(F)}{p(C \mid D)\, p(D) + p(C \mid F)\, p(F)} = \frac{1 \cdot 10^{-6}}{(1/6)^{10} \cdot (1 - 10^{-6}) + 1 \cdot 10^{-6}} \approx 0.98.$$

The result is so close to 1 that it may look counterintuitive, and it is useful to compare it with the negligible probability to observe 10 consecutive 1s in rolling a fair die, which is $(1/6)^{10} \approx 1.65 \times 10^{-8}$. Compare it also with the result of EP 1.6.35.
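The computation by Bayes’s formula can be repeated with exact fractions. In the following sketch the function name `posterior` and the dictionary keys "fair" and "fake" are assumptions made for the illustration; the priors and likelihoods are the ones stated in the solution.

```python
from fractions import Fraction

def posterior(priors, likelihoods, j):
    """Bayes's formula for a partition C_1, ..., C_k.

    priors[i] = p(C_i), likelihoods[i] = p(E | C_i); returns p(C_j | E).
    """
    total = sum(likelihoods[i] * priors[i] for i in priors)   # equals p(E)
    return likelihoods[j] * priors[j] / total

priors = {"fair": 1 - Fraction(1, 10**6), "fake": Fraction(1, 10**6)}
likelihoods = {"fair": Fraction(1, 6) ** 10, "fake": Fraction(1)}

print(float(posterior(priors, likelihoods, "fake")))          # about 0.984
```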

Now we consider a classical birthday problem.

Problem 1.6.17. What is the probability that among $s$ members of the Combi Club at least two have the same birthday?

Solution. To simplify computations, we consider a non-leap year with 365 days and assume that for any day of the year the probability that someone was born on this day is the same; thus, this probability is 1/365. Moreover, as we always do in problems involving people, we suppose that the birthdays of all the people involved are independent; in particular, any day of the year has equal probability to be someone’s birthday.

It is easier in this problem to compute the probability of the complementary event, that is, the probability that no two members have the same birthday. First of all we notice that for any member there are 365 options to fix the birthday, hence the sample space contains $365^s$ points. To find the number of favorable outcomes, we choose $s$ days from 365 for $s$ birthdays; this can be done in $C(365, s)$ ways. However, when we distribute the members’ birthdays among these $s$ days, we can permute them in $s!$ ways, generating different favorable outcomes. Hence, there are $s!\, C(365, s) = P(365, s)$ favorable outcomes, so that the probability of the complementary event is $P(365, s)/365^s$, and the probability that at least two members have the same birthday is $1 - P(365, s)/365^s$. We see that if $s \ge 366$, then this probability is 1, which is obvious. An easy numerical experiment shows that this probability increases with $s$ and first exceeds 1/2 at $s = 23$.
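The numerical experiment mentioned in the solution takes only a few lines. The sketch below assumes Python 3.8 or later, where `math.perm` computes $P(n, k)$ and returns 0 when $k > n$; the function name `shared_birthday_probability` is chosen for the illustration.

```python
from math import perm

def shared_birthday_probability(s, days=365):
    """1 - P(days, s) / days**s: at least two of s people share a birthday."""
    return 1 - perm(days, s) / days**s

# Smallest club size for which the probability exceeds 1/2.
smallest = next(s for s in range(1, 367) if shared_birthday_probability(s) > 0.5)
print(smallest)                                   # 23
print(shared_birthday_probability(366))           # 1.0
```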

Remark 1.6.12. It is instructive to rephrase this problem in terms of placing balls in urns; see EP 1.5.24.



EP 1.6.1. Describe the sample space if we simultaneously toss two indistinguishable coins, that is, the outcomes $(H, T)$ and $(T, H)$ must be identified. What is the sample space if we simultaneously roll 3 identical dice?

EP 1.6.2. Describe the sample space if we simultaneously flip a coin and roll a die.

EP 1.6.3. Describe the sample space if we flip a coin six times. What is the probability that at least one head and at least two tails will appear in the six tosses? What is the probability that a streak of at least four consecutive tails will appear in the six tosses?

EP 1.6.4. Simultaneously toss a coin and roll a die. What is the probability to get a head and a multiple of 3?

EP 1.6.5. Simultaneously roll a die and draw a card from a standard deck of 52 cards. What is the probability to get an even number and a red face card?

EP 1.6.6. A lottery ticket contains six boxes: two for letters followed by four for digits. If there is only one winning ticket, what is the probability to win the lottery?

EP 1.6.7. A 9-digit natural number is chosen at random. What is the probability that all its digits are different?

EP 1.6.8. A woman can give birth to a girl, a boy, two girls, a girl and a boy, two boys, etc. Consider this as a random experiment whose outcomes are the number of children and the gender composition of the children born. Describe the sample space if the order at birth is important, that is, we consider the pairs “boy-girl” and “girl-boy” as different. What is the sample space if the order does not count?

EP 1.6.9. Assuming that a newborn baby has equal chances to be a girl or a boy, what are the probabilities to have no boy, one boy, two boys, three boys, . . . , $n$ boys in a family with $n$ children? Compare with EP 1.4.9.

EP 1.6.10. A die is rolled once. Find the complementary event to the following combined events.

1) “To get an odd number AND To get a number less than 3”.

2) “To get an odd number OR To get a number less than 3”.

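EP 1.6.11. Let $E$ be any event, $\overline{E}$ its complement, and $p = p(E)$. What are the events $E \cap \overline{E}$, $E \cup \overline{E}$, $\overline{E \cap \overline{E}}$, $\overline{E \cup \overline{E}}$, and what are their probabilities, in terms of $p$?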


EP 1.6.12. 7 people get in an elevator on the first floor of an 11-story building. What is the probability that no two of them get out of the elevator at the same floor?

EP 1.6.13. A $5 \times 5 \times 10$ wooden parallelepiped with red sides is cut into 250 unit cubes. What is the probability that a randomly chosen unit cube has no red face? One red face? Two red faces? Three red faces? Four or more red faces?

EP 1.6.14. A die has 4 blue and 2 red faces. What is the probability that the two red faces have a common edge? What is the probability that the two blue faces have a common edge?

EP 1.6.15. Six faces of a regular die are marked by letters A, B, A, C, U, S. Find the probability that on six rolls of the die, the letters shown can be rearranged to spell “ABACUS”.

EP 1.6.16. There are $d$ dolphins in the ocean. $d_1$ of them were caught, marked, and released back. Next time, $d_2$ of them were caught. Assuming independence, compute the probability that $m$ marked dolphins were caught the second time.

EP 1.6.17. A gentleman has 10 dress shirts and 10 ties, one matching tie to every shirt. Preparing for a long meeting, he selects at random 2 shirts and 2 ties. What is the probability that he gets exactly one matching pair of a shirt and a tie? At least one matching pair? Two matching pairs? If he selects at random 5 shirts and 5 ties, what is the probability to have exactly 2 matching pairs?

EP 1.6.18. What is the probability to get (at least) two consecutive tails if a fair coin is tossed 12 times?

EP 1.6.19. What is the probability that in a random permutation of the numbers $1, 2, 3, \ldots, 1000$ at least one number occupies its own place (for example, the 5 is the 5th number in the permutation)?

EP 1.6.20. Among the whole numbers, some can be written down without the digit 1, like 527, while the others contain a 1, like 21 345. If you randomly pick a whole number from 0 through 999 999 inclusive, what is more probable, to pick a number with or without a 1 in its decimal representation? Does the probability change if we consider numbers from 1 to 999, or from 1 through 999 999 999?

EP 1.6.21. The U.S. Senate consists of 100 Senators, two from each state. If 10 senators are chosen at random, what is the probability that this cohort contains a senator from New York State?


EP 1.6.22. An urn contains 3 black, 3 white, and 3 yellow balls. n balls are taken atrandom without replacement. For each n; n D 1; 2; : : : ; 9, find the probability thatamong the n selected balls there are balls of all three colors?

EP 1.6.23. An urn contains 8 balls with the letters of the word STALLION. If 4 ballsare chosen at random without replacement, what is the probability that either the wordTOLL or the word LION can be composed of these balls?

EP 1.6.24. Two fair right indistinguishable tetrahedrons, with faces marked 1 through4, are tossed.

1) What is the probability that the sum of the numbers on the bottom faces is 7?

2) What is the probability that the sum of these numbers is greater than or equalto 7?

EP 1.6.25. Let m and n be natural numbers. Consider four points O.0; 0/, A.m; 0/,B.m; n/, and C.0; n/ in the coordinate plane, and choose a random rectangle R withsides parallel to the coordinate axes and with vertices at points with integer coordi-nates inside or at the boundary of the rectangle OABC . What is the probability thatR is a square?

EP 1.6.26. Let $S = \{1, 2, \ldots, n\}$, where $n$ can be any natural number, and $f(s) = cs^2$, where $c$ is a constant. Find $c$ so that $f$ is a random variable on $S$.

EP 1.6.27. Let $S = \{1, 2, \ldots, n\}$, where $n$ can be any natural number, and $f(s) = c/s$, where $c$ is a constant. Find $c$ so that $f$ is a random variable on $S$.

EP 1.6.28. Let $S = \{1, 2, 3, \ldots\}$, that is, in this problem the sample space is infinite, and $f(s) = c/s^2$, where $c$ is a constant. Find $c$ so that $f$ is a random variable on $S$.

EP 1.6.29. What is the probability of getting at least one 3 in a roll of two dice if the sum is odd?

EP 1.6.30. Two hunters simultaneously shoot at a wolf. Under the given conditions, the probability of killing the animal is 1/3 for each of them. What is the probability that the wolf survives?

EP 1.6.31. Six cards are drawn at random from a regular deck of 52 cards. What is the probability that

1) The Queen of spades was chosen among these 6 cards?

2) All 4 suits will appear among these 6 cards?

EP 1.6.32. Several cards are drawn at random from a regular deck of 52 cards. We want to guarantee with probability more than 1/2 that at least 2 cards of the same kind appear among the cards chosen. What is the smallest number of cards that must be drawn for that?

EP 1.6.33. If $p(A) = 0.55$, $p(B) = 0.75$, and the events $A$ and $B$ are independent, what are the conditional probabilities $p(A|B)$ and $p(B|A)$?

EP 1.6.34. In a certain population, 30% of men and 35% of women have a college degree; it is also known that 52% of the population are women. If a person chosen at random in this population has a college degree, what is the probability that the person is a woman?

EP 1.6.35. The Student Government at The Game College sold 250 lottery tickets worth $1 each. There is one $100 prize, one $50 prize, and three $10 prizes. If a student bought 2 tickets, what is the expected value of her net gain?

EP 1.6.36. Among 10 000 coins all but one are fair, and one has tails on both sides. A randomly chosen coin was thrown 12 times.

1) What is the probability that this coin was false, if a tail was observed in all 12 trials?

2) What is the probability that this coin was fair, if a tail was observed in all 12 trials?

EP 1.6.37. Every juror makes the right decision with probability $p$. In a jury of three people two jurors follow their instincts, but the third juror flips a fair coin, and then the verdict follows the majority of jurors. What is the probability that the jury makes the right decision? Does this probability change if the jury consists of four jurors, and only one among them flips a coin?

EP 1.6.38. Assuming independence, what is the probability that at least two of the first 43 Presidents of the USA have the same birthday?

EP 1.6.39. We roll a fair die 6 times. If a 1 occurs first, or if a 2 occurs second, or if a 3 occurs third, . . . , or if a 6 occurs sixth, we get $1. What is the expected value of our gain?

Remark 1.6.13. This result illustrates an important theorem that the expected value of a finite sum of random variables is equal to the sum of their expected values.
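The theorem mentioned in Remark 1.6.13 can be checked on a small sample space. The following Python sketch is only an illustration under our own assumptions: the sample space (two fair dice) and the two random variables `first` and `larger` are chosen here for the example and do not come from any problem above. Note that the two variables are not independent, and the additivity still holds.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 ordered outcomes of two fair dice, equally likely.
outcomes = list(product(range(1, 7), repeat=2))
prob = Fraction(1, len(outcomes))

def expected(rv):
    """Expected value of a random variable given as a function of an outcome."""
    return sum(prob * rv(w) for w in outcomes)

def first(w):
    return w[0]          # number shown by the first die

def larger(w):
    return max(w)        # larger of the two numbers (depends on both dice)

def total(w):
    return first(w) + larger(w)

lhs = expected(total)
rhs = expected(first) + expected(larger)
print(lhs, rhs, lhs == rhs)   # 287/36 287/36 True
```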


EP 1.6.40. The President of the Combi Club introduced the following game. A participant pays $1 and selects at random an integer between 1 and 1 000 000 inclusive. If the decimal representation of the number contains a 1, the participant gets $2, otherwise the participant loses the game. What is the expected value of the game?

EP 1.6.41. State the inclusion-exclusion formula (1.1.4) in the probabilistic terms of this section.

EP 1.6.42. On four class tests, a student scored 76, 81, 89, and 92. On the fifth test, she is equally likely to get any score from 75 through 95 inclusive. What is the probability that her average will be 85? At least 85? What is the probability that her average will be 85 if her fifth score is 85?

EP 1.6.43. In a class of 20 students, each two have a common grandfather. Prove that among them at least 14 students have the same grandfather.

EP 1.6.44. The spring in an old-fashioned watch breaks at a random moment. What is the probability that the hour hand will show a time after 2 a.m. but before 4 a.m.?

EP 1.6.45. The Combi Club has a round roulette table with a rotating pointer. The table is divided into three sectors, one half-circle marked by the digit 3, and two quarter-circles marked by 1 and by 2, respectively. You pay $1 to make a single spin and get back a reward in dollars equal to the number in the sector where the pointer stops. Using negative numbers for a loss, find a probability distribution for the net gain if the game was played once. What is the average (expected) gain for each play? What would be the fair price of the game? Solve this problem if you pay $2 for a spin; $1.50 for a spin. Explore the similar problem if the table is divided into 6 equal sectors marked by the digits 1, 2, 3, 4, 5, 6.

EP 1.6.46. A multiple-choice exam consists of 20 questions, with 4 possible answers for every question. If a student randomly guesses the answer to each question, find the probabilities that she gets zero, one, two, three, four, . . . , 20 correct answers. What is the expected number of questions guessed correctly? Solve the same problem if each question has 5 possible answers.

EP 1.6.47. Assuming that any day has equal chances to be the birthday of a person chosen at random from a very big population, find the probability that two randomly selected people both have birthdays on Sunday; find the probability that at least one of two randomly selected people will have a birthday on Sunday; find the probability that two randomly selected people have birthdays on the same day of the week.

EP 1.6.48. Each member of the Combi Club is required to take a course in Combinatorial Analysis (course C) and in Probability Theory (course P). The Registrar's Office reports that 80% pass course C, 75% pass course P, and 90% pass at least one of the courses. Find the probability of passing both courses. What is the probability that a person who passes course C will also pass course P? Are passing course C and passing course P independent events?

Chapter 2

Basic Graph Theory

Graphs and, in particular, trees are graphical mathematical models useful in many problems. This chapter is devoted to a brief introduction to graph theory. The first two sections introduce the vocabulary. Graphs are defined in terms of sets and mappings in the spirit of Section 1.1, but we soon resort to the more intuitive and transparent language of geometric diagrams. In the next three sections we study important special classes of graphs, namely trees, planar graphs, and Eulerian graphs, and use the methods and results of Chapter 1 to solve some problems of graphical enumeration. More such problems are considered in Chapters 4 and 5. Our exposition was strongly influenced by Wilson’s beautiful book [48]. For a deeper study of the subject we recommend the course of B. Bollobás [8]; we mostly follow the latter in terminology.

2.1 Vocabulary

In this section we create the basic vocabulary of graph theory and consider elementary properties of graphs. Recall that $2^V_2$ denotes the set of all two-element subsets of a set $V$.

Drawing graphs · Euclid’s biography

Definition 2.1.1. A graph $G$ is a triple $G = (V, E, f)$, where $V \neq \emptyset$ is a non-empty set, whose elements are called vertices of the graph $G$, $E$ is a set, maybe empty, whose elements are called edges of $G$, and the mapping

$$f : E \to 2^V_2 \cup V$$

is called the incidence function of the graph. We write here $2^V_2 \cup V$ instead of $2^V_2 \cup 2^V_1$ to simplify notation. If $E = \emptyset$, all vertices are said to be isolated; in this case $f$ is the empty mapping. The number of vertices $p = |V|$ is called the order of the graph $G$ and the number of edges $q = |E|$ is called the size of $G$.

Thus, for any edge $e \in E$, its image $f(e)$ contains either one vertex $v$ if $f(e) \in V$, or two vertices $v_1, v_2$ if $f(e) = \{v_1, v_2\} \in 2^V_2$; these vertices are called the end-vertices of the edge $e$. If a vertex $v$ is an end-vertex of an edge $e$, the vertex and the edge are called incident to one another. If two edges $e_k$ and $e_l$ have the same pair of end-vertices, $f(e_k) = f(e_l) = \{v_i, v_j\}$, these edges are called parallel or multiple.

If $f(e) \in V$, that is, the edge $e$ has only one end-vertex, $e$ is called a loop. Vertices having only one incident edge, which is not a loop, are called pendant vertices or leaves.

Example 2.1.2. Consider a graph $G = (V, E, f)$ (Fig. 2.1), where $V = \{v_1, \ldots, v_5\}$, $E = \{e_1, \ldots, e_6\}$, and

$$f(e_1) = \{v_1, v_2\}, \quad f(e_2) = f(e_3) = \{v_1, v_3\},$$
$$f(e_4) = \{v_2, v_4\}, \quad f(e_5) = \{v_3, v_4\}, \quad f(e_6) = v_4. \tag{2.1.1}$$

This graph has no pendant vertex, the edges $e_2$ and $e_3$ are parallel, the edge $e_6$ is a loop, and the vertex $v_5$ is isolated.

Figure 2.1. Diagram $g$ represents the graph $G$ in Example 2.1.2.
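For experiments, the incidence function (2.1.1) can be stored as a dictionary. The sketch below is one possible encoding, not a canonical one: as an assumption of the sketch, an ordinary edge is mapped to a two-element `frozenset` and a loop to a one-element `frozenset`. It recovers the observations made in Example 2.1.2.

```python
# Vertices and incidence function of the graph G from Example 2.1.2.
V = {"v1", "v2", "v3", "v4", "v5"}
f = {
    "e1": frozenset({"v1", "v2"}),
    "e2": frozenset({"v1", "v3"}),
    "e3": frozenset({"v1", "v3"}),
    "e4": frozenset({"v2", "v4"}),
    "e5": frozenset({"v3", "v4"}),
    "e6": frozenset({"v4"}),          # a loop has a single end-vertex
}

order, size = len(V), len(f)

# Loops are the edges with exactly one end-vertex.
loops = sorted(e for e, ends in f.items() if len(ends) == 1)

# Parallel edges are distinct edges with the same pair of end-vertices.
parallel = sorted(
    (a, b) for a in f for b in f
    if a < b and len(f[a]) == 2 and f[a] == f[b]
)

# Isolated vertices are incident to no edge at all.
isolated = sorted(v for v in V if all(v not in ends for ends in f.values()))

print(order, size, loops, parallel, isolated)
# 5 6 ['e6'] [('e2', 'e3')] ['v5']
```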

We often suppress the symbol $f$ and denote graphs by $G = (V, E)$. In this text we mostly consider undirected graphs, which means that the two end-vertices of an edge are not ordered. Moreover, we deal only with finite graphs, always assuming $|V| < \infty$ and $|E| < \infty$.

We consider graphs with parallel (multiple) edges, sometimes called multigraphs, and/or with loops, called pseudographs. Some authors use the term “graph” only for the so-called simple graphs, that is, for graphs without parallel edges and loops. Hereafter, we call a graph $G$ simple if $G$ has neither parallel edges nor loops. To define simple graphs in set-theoretic terms, it suffices to consider the set of edges $E$ as a subset of $2^V_2$.

It is useful to visualize abstract concepts, and this is all the more true in the case of graphs. We call the graphs in the sense of Definition 2.1.1 abstract graphs and represent them by drawing geometric graphs or diagrams. These are sets of points in the plane (one can even imagine points in $\mathbb{R}^n$) representing the vertices of a graph, some of them connected by smooth arcs, maybe line segments, representing the edges of the graph. For each edge only its end-points belong to $V$; in other words, no interior point of any edge belongs to $V$. It should be noted that the actual shape of an edge makes no difference in our considerations. For any such diagram, we can always easily restore the incidence function of the graph presented. Such diagrams are called embeddings of a graph into the Euclidean plane $\mathbb{R}^2$. For example, Fig. 2.1 depicts the diagram $g$ corresponding to the graph $G$ defined by (2.1.1).

Now we define directed graphs.

Definition 2.1.3. A directed graph (digraph) $G$ is a triple $G = (V, E, \bar{f})$, where $V$ and $E$ have the same meaning as above and the incidence function is now the mapping

$$\bar{f} : E \to V \times V.$$

If $\bar{f}(e) = (v_1, v_2)$ for an edge $e \in E$, then $v_1$ is called the initial vertex of $e$ and $v_2$ the end vertex of $e$.

Let a digraph $G$ have vertices $v_1$ and $v_2$ and edges $e_1$ and $e_2$ such that $\bar{f}(e_1) = (v_1, v_2)$ and $\bar{f}(e_2) = (v_2, v_1)$. Since the ordered pairs $(v_1, v_2) \neq (v_2, v_1)$, this implies that the oriented edges $e_1 \neq e_2$.

Example 2.1.4. Consider a digraph $G = (V, E, \bar{f})$ shown in Fig. 2.2, where $V = \{v_1, \ldots, v_5\}$, $E = \{e_1, \ldots, e_5\}$, and

$$\bar{f}(e_1) = (v_1, v_2), \quad \bar{f}(e_2) = (v_3, v_1), \quad \bar{f}(e_3) = (v_1, v_3),$$
$$\bar{f}(e_4) = (v_2, v_4), \quad \bar{f}(e_5) = (v_3, v_4).$$

Figure 2.2. Digraph $G$.

We slightly abuse the language and do not distinguish between an abstract graph and its geometric realization, like the graph $G$ in (2.1.1) and the corresponding diagram $g$ in Fig. 2.1. Usually this does not lead to any misunderstanding. If we need to emphasize this distinction, we reserve capital letters ($G$) for graphs and small ones ($g$) for the corresponding geometric diagrams. The same graph (incidence relation (2.1.1)) can be drawn in infinitely many other ways; compare, for instance, the diagram $g$ (Fig. 2.1) and the diagram $g'$ (Fig. 2.3).

Figure 2.3. Diagram $g'$.

Definition 2.1.5. A graph $G = (V, E)$ is said to be regularly embedded in $\mathbb{R}^n$ if its edges have no common points except for the end-vertices. A graph is called planar if it can be regularly embedded in $\mathbb{R}^2$.

Example 2.1.6. The diagram (graph) $g$ (Fig. 2.1) is regularly embedded in $\mathbb{R}^2$, but $g'$ (Fig. 2.3) is not, since the edges $\varepsilon_2$ and $\varepsilon_6$ intersect at a point which is not a vertex of the graph. However, this graph is planar.

Problem 2.1.1. Draw a regular plane embedding of the graph $g'$ (Fig. 2.3), that is, its regular embedding into $\mathbb{R}^2$.

Lemma 2.1.7. Every finite (and even countable) graph can be regularly embedded in $\mathbb{R}^3$.

Proof. Indeed, let a graph $G$ be of order $p$ and size $q$. Consider a line $L$ in $\mathbb{R}^3$ and a bundle of $q$ different half-planes bounded by $L$; each half-plane corresponds to one and only one edge (Fig. 2.4). Select $p$ different points on $L$, one for each vertex. If two vertices $v_1$ and $v_2$ of the graph are incident to an edge $e_1$, we connect the corresponding points on $L$ by a half-circle located in the half-plane corresponding to $e_1$. If $e_2$ is a loop at a vertex $v_3$, we draw a circle in the half-plane corresponding to $e_2$, such that this circle is tangent to $L$ at $v_3$. It is obvious that these $q$ circles and half-circles have no points in common except, maybe, their end-vertices.

Figure 2.4. Regular embedding of a graph in $\mathbb{R}^3$.

Definition 2.1.8. The degree, $\deg(v)$, of a vertex $v \in V$ is the total number of edges incident to this vertex; by definition, every loop must be counted twice¹. A vertex of an even (odd) degree is for short called an even (odd) vertex. If $V = \{v_1, \ldots, v_p\}$, then the sequence $(\deg(v_1), \ldots, \deg(v_p))$ is called the degree sequence of a graph; unlike the set $\{\deg(v_1), \ldots, \deg(v_p)\}$, this sequence depends on the numbering of the vertices.

Example 2.1.9. In diagram $g$ (Fig. 2.1), $\deg(v_1) = \deg(v_3) = 3$, $\deg(v_2) = 2$, $\deg(v_4) = 4$ since $v_4$ is incident to the edges $e_4$, $e_5$ and to the loop $e_6$, and $\deg(v_5) = 0$ since $v_5$ is an isolated vertex. The degree sequence is $(3, 2, 3, 4, 0)$.
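The degrees in Example 2.1.9 can be recomputed directly from the incidence function (2.1.1). In this short sketch the dictionary encoding, with a loop stored as a one-element set, is an assumption made for the illustration.

```python
# Incidence function (2.1.1); a loop is stored as a one-element set.
f = {
    "e1": {"v1", "v2"}, "e2": {"v1", "v3"}, "e3": {"v1", "v3"},
    "e4": {"v2", "v4"}, "e5": {"v3", "v4"}, "e6": {"v4"},
}
V = ["v1", "v2", "v3", "v4", "v5"]   # the list order fixes the numbering

def degree(v):
    """Number of edges incident to v, every loop being counted twice."""
    return sum(2 if len(ends) == 1 else 1 for ends in f.values() if v in ends)

degree_sequence = [degree(v) for v in V]
print(degree_sequence)   # [3, 2, 3, 4, 0]
```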

Definition 2.1.10. If any two vertices of a simple graph are adjacent, the graph is called complete. A complete graph of order $p$ is denoted by $K_p$.

Problem 2.1.2. What is the degree of any vertex in $K_p$? What is the degree sequence of $K_p$?

Definition 2.1.11. A simple graph $G = (V, E)$ is called bipartite if $V = V_1 \cup V_2$ with $V_1 \cap V_2 = \emptyset$, and each edge connects a vertex in $V_1$ with a vertex in $V_2$. A simple bipartite graph $G = (V_1 \cup V_2, E)$ is called complete if each vertex in $V_1$ is connected with every vertex in $V_2$. If $|V_1| = m$ and $|V_2| = n$, the complete bipartite graph is denoted by $K_{m,n}$.

Problem 2.1.3. What are the degrees of the vertices of $K_{m,n}$?

Lemma 2.1.12. In any graph of size $q$,

$$\sum_{v \in V} \deg(v) = 2q. \tag{2.1.2}$$

¹ However, there are problems where it is more convenient to count a loop just once.


Proof. We again do double counting, computing twice the total number of the end-vertices. First, we just sum up over all the end-vertices and second, we take into consideration that each edge has two ends. This reasoning shows, in particular, why it is often convenient to consider loops as having two ends.

Lemma 2.1.12 is called the handshaking lemma, because if we depict the participants of a party by vertices of a graph such that two vertices of the graph are connected by an edge if and only if the two corresponding people exchanged a handshake, then the lemma just states that the total number of shaken hands is even.

Corollary 2.1.13. In any graph the number of odd vertices is even.

Proof. Indeed, if we split the left-hand side of (2.1.2) as

$$\sum_{v \in V} \deg(v) = \sum_{v \in V:\ \deg(v) \text{ is even}} \deg(v) + \sum_{v \in V:\ \deg(v) \text{ is odd}} \deg(v),$$

then the first sum on the right is even and the total sum is even, thus the second sum on the right must be even. But this sum contains only odd addends, so there must be an even number of such vertices.
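Formula (2.1.2) and Corollary 2.1.13 are easy to test numerically. The sketch below makes its own assumptions: vertices are numbered from 0, an edge is a pair of end-vertices, and the random generator `random_graph` is a helper invented for this illustration. Loops and parallel edges are allowed, and a loop contributes 2 to the degree of its vertex.

```python
import random

def random_graph(p, q, rng):
    """Random graph with p vertices and q edges; loops and parallel edges may occur."""
    return [(rng.randrange(p), rng.randrange(p)) for _ in range(q)]

def degrees(p, edges):
    """Degree of every vertex 0..p-1; each edge contributes once per end."""
    deg = [0] * p
    for u, w in edges:
        deg[u] += 1      # a loop (u == w) is counted twice,
        deg[w] += 1      # once for each of its two ends
    return deg

rng = random.Random(0)
for _ in range(1000):
    p, q = rng.randint(1, 8), rng.randint(0, 15)
    deg = degrees(p, random_graph(p, q, rng))
    assert sum(deg) == 2 * q                    # formula (2.1.2)
    assert sum(d % 2 for d in deg) % 2 == 0     # Corollary 2.1.13
```

Such a test proves nothing, of course, but a single failing graph would expose an error in the counting convention for loops.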

Definition 2.1.1 of abstract graphs is convenient, for it includes graphs with loops and parallel edges. However, it implies, for example, that graphs $G = (V, E, f)$ and $G_1 = (V_1, E, f_1)$ with $V \neq V_1$ are different even if $|V| = |V_1|$, since the incidence functions have different codomains. Moreover, diagrams $g$ and $g'$ (Figs. 2.1 and 2.3) have different appearance, despite the fact that they realize the same incidence relationship among five points. In many problems it is natural to identify such graphs. To this end the following definition is useful.

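Definition 2.1.14. Two graphs $G = (V, E, f)$ and $G_1 = (V_1, E_1, f_1)$ are called isomorphic, denoted by $G \cong G_1$, if there are two one-to-one correspondences,

$$\varphi : V \to V_1 \quad \text{and} \quad \psi : E \to E_1,$$

compatible with the incidence functions in the sense that

$$f_1(\psi(e)) = \varphi(f(e)), \quad \forall e \in E.$$

Example 2.1.15. Diagrams (graphs) $g$ and $g'$ (Figs. 2.1 and 2.3) are isomorphic to one another; the one-to-one correspondences between them can be established as follows:

$$\varphi(v_1) = w_1, \quad \varphi(v_2) = w_3, \quad \varphi(v_3) = w_2, \quad \varphi(v_4) = w_5, \quad \varphi(v_5) = w_4,$$
$$\psi(e_1) = \varepsilon_3, \quad \psi(e_2) = \varepsilon_1, \quad \psi(e_3) = \varepsilon_2, \quad \psi(e_4) = \varepsilon_4, \quad \psi(e_5) = \varepsilon_6, \quad \psi(e_6) = \varepsilon_5.$$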


Definition 2.1.16. A graph of order $p$ is called labelled if its vertices are labelled by the first $p$ natural numbers, thus $V = \{1, 2, \ldots, p\}$. Labelled graphs are called isomorphic if the bijection $\varphi$ in Definition 2.1.14 preserves not only the incidence relation but also the labelling, that is, $\varphi(i) = i$, $1 \le i \le p$.

For example, if we identify $v_i \equiv i$ and $w_i \equiv i$ for $1 \le i \le 5$, then it turns out that the diagrams $g$ and $g'$ (Figs. 2.1 and 2.3), which are isomorphic in the sense of Definition 2.1.14, are not isomorphic as labelled graphs.

We end this section by solving a few problems of enumeration of labelled graphs. Fix a set $E$ and consider all labelled graphs $G = (V, E, f)$, where $V = \{1, 2, \ldots, p\}$, of order $p$ and size $q = |E| \ge 0$. To count all different graphs, we have to enumerate the various incidence functions $E \to 2^V_2 \cup V$; by Lemma 1.1.13, $|2^V_2 \cup V| = \frac{p(p-1)}{2} + p = \frac{p(p+1)}{2}$. Then by Theorem 1.1.35, there are

$$\left(\frac{p(p+1)}{2}\right)^q \tag{2.1.3}$$

such graphs.

If we want to count the graphs without loops, but with parallel edges, it is enough to consider incidence functions $f : E \to 2^V_2$; in this case there are $\left(\frac{p(p-1)}{2}\right)^q$ (possibly pairwise isomorphic) graphs. If we want to exclude both loops and parallel edges, we should consider only injective mappings $f : E \to 2^V_2$; to ensure their existence, we assume $p(p-1)/2 \ge q$. By Theorem 1.1.37 and Corollary 1.1.44, there are $2^{p(p-1)/2}$ simple graphs of order $p$ with any $q$, $0 \le q \le p(p-1)/2$, and by Theorem 1.1.45 among them there are

$$\frac{\left(\frac{p(p-1)}{2}\right)!}{\left(\frac{p(p-1)}{2} - q\right)!} \tag{2.1.4}$$

(possibly isomorphic) simple graphs of order $p$ and size $q$.

However, this number counts separately graphs with different labelling of edges. If we want to identify such graphs, that is, to make the edges indistinguishable, we must identify incidence functions corresponding to different orderings of edges. Therefore, instead of injective mappings we have to consider combinations of $p(p-1)/2$ elements taken $q$ at a time; by (1.4.1), there are

$$\frac{\left(\frac{p(p-1)}{2}\right)!}{q!\,\left(\frac{p(p-1)}{2} - q\right)!} \tag{2.1.5}$$

such graphs.

Consider, for instance, simple graphs of order $p = 3$ and size $q = 2$; if $V = \{v_1, v_2, v_3\}$, $E = \{e_1, e_2\}$, then by (2.1.4) there are $3!/1! = 6$ such abstract graphs $G_i(V, E, f_i)$, $1 \le i \le 6$, whose incidence functions are

$G_1$: $f_1(e_1) = \{v_1, v_2\}$, $f_1(e_2) = \{v_2, v_3\}$

$G_2$: $f_2(e_1) = \{v_2, v_3\}$, $f_2(e_2) = \{v_1, v_2\}$

$G_3$: $f_3(e_1) = \{v_1, v_3\}$, $f_3(e_2) = \{v_1, v_2\}$

$G_4$: $f_4(e_1) = \{v_1, v_2\}$, $f_4(e_2) = \{v_1, v_3\}$

$G_5$: $f_5(e_1) = \{v_2, v_3\}$, $f_5(e_2) = \{v_1, v_3\}$

$G_6$: $f_6(e_1) = \{v_1, v_3\}$, $f_6(e_2) = \{v_2, v_3\}$.
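For small $p$ and $q$ the counts (2.1.3), (2.1.4) and (2.1.5) can be confirmed by listing all incidence functions explicitly. The following brute-force sketch is only a check for $p = 3$, $q = 2$; the tuple encoding of the possible images (a pair for an ordinary edge, a one-element tuple for a loop) is an assumption of the sketch.

```python
from itertools import combinations, permutations, product
from math import comb, factorial

p, q = 3, 2
V = range(1, p + 1)
pairs = list(combinations(V, 2))       # the two-element subsets of V
targets = pairs + [(v,) for v in V]    # possible images of an edge, loops included
N = len(pairs)                         # p(p-1)/2

all_maps = list(product(targets, repeat=q))   # arbitrary incidence functions
no_loops = list(product(pairs, repeat=q))     # parallel edges allowed, no loops
injective = list(permutations(pairs, q))      # simple graphs, labelled edges
edge_sets = list(combinations(pairs, q))      # simple graphs, unlabelled edges

assert len(all_maps) == (p * (p + 1) // 2) ** q              # (2.1.3)
assert len(no_loops) == N ** q
assert len(injective) == factorial(N) // factorial(N - q)    # (2.1.4)
assert len(edge_sets) == comb(N, q)                          # (2.1.5)
print(len(all_maps), len(no_loops), len(injective), len(edge_sets))
# 36 9 6 3
```

The six injective mappings found here are exactly the incidence functions $f_1, \ldots, f_6$ listed above.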

Now, if we draw the corresponding diagrams, omitting edge labels and using the standard vertex labels $v_j = j$, $j = 1, 2, 3$, we get only three different diagrams (Fig. 2.5), as given by (2.1.5) with $p = 3$ and $q = 2$ (cf. also Theorem 2.3.7). These labelled graphs are nonisomorphic since the vertices of degree 2 have different labels. If in any of these graphs we switch the labels of the two pendant vertices, we get a labelled graph isomorphic to the initial one.

Figure 2.5. Nonisomorphic labelled simple graphs with three vertices and two edges: the paths 1–2–3, 3–1–2, and 2–3–1.

However, if we erase the vertex labels, thus considering non-labelled graphs, all three diagrams in Fig. 2.5 become identical; thus, there is only one non-labelled simple graph of order $p = 3$ and size $q = 2$. We solve more problems of graphical enumeration in the subsequent sections.
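The difference between labelled and non-labelled isomorphism can be seen in a brute-force sketch. It is restricted to simple graphs, where an edge is determined by its two end-vertices, so that the edge bijection of Definition 2.1.14 is induced by the vertex bijection. The helper names `edge_set`, `relabel` and `isomorphic` are ours, and trying all $p!$ bijections is feasible only for very small graphs.

```python
from itertools import permutations

def edge_set(*pairs):
    """A simple graph on a fixed vertex set, given by its set of edges."""
    return frozenset(frozenset(pair) for pair in pairs)

def relabel(edges, phi):
    """Image of an edge set under a vertex bijection phi (a dict)."""
    return frozenset(frozenset(phi[v] for v in e) for e in edges)

def isomorphic(E1, E2, vertices):
    """Search all vertex bijections for one that maps E1 onto E2."""
    return any(
        relabel(E1, dict(zip(vertices, image))) == E2
        for image in permutations(vertices)
    )

V = (1, 2, 3)
# The three labelled graphs with three vertices and two edges (Fig. 2.5).
graphs = [
    edge_set((1, 2), (2, 3)),
    edge_set((1, 3), (1, 2)),
    edge_set((2, 3), (1, 3)),
]

# Labelled graphs: the bijection must be the identity, so edge sets must coincide.
labelled_classes = len(set(graphs))
# Non-labelled graphs: any bijection of the vertices is allowed.
unlabelled_same = all(isomorphic(graphs[0], g, V) for g in graphs[1:])
print(labelled_classes, unlabelled_same)   # 3 True
```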


EP 2.1.1. Draw all simple graphs with $1 \le p \le 5$ vertices and $0 \le q \le p(p-1)/2$ edges.

EP 2.1.2. Draw the complete graphs $K_1$ through $K_6$.

EP 2.1.3. 1) Is it possible to organize a tournament with 40 participating teams, if each team must play precisely three games?

2) Answer the same question if there are 13 teams and every team must play 5 games.

EP 2.1.4. Are there graphs of order $p = 6$ with the following degree sequences? Draw, if any, all nonisomorphic diagrams with the given degree sequence.

1) $(2, 2, 3, 4, 6, 7)$

2) $(2, 2, 3, 4, 6, 8)$.

EP 2.1.5. Is there a graph of order 6 with each vertex of degree 3?

EP 2.1.6. Prove that if $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$ are two isomorphic graphs, then $|V_1| = |V_2|$, $|E_1| = |E_2|$, and the degrees of the corresponding vertices are equal. Are these necessary conditions also sufficient for two graphs to be isomorphic?

EP 2.1.7. Which graphs in Fig. 2.6 are pairwise isomorphic?

Figure 2.6. Are there isomorphic graphs here? (Cf. EPs 2.1.6–2.1.7.)

EP 2.1.8. Draw all nonisomorphic simple graphs of orders 1 through 4.

EP 2.1.9. 30 teams compete in a tournament, where each team must play every other team exactly once. Prove that at any time there are two teams that have played the same number of games.

EP 2.1.10. How many nonisomorphic planar graphs are there with $2n$ vertices and $n$ edges if all edges are segments of straight lines and do not have common end points?

EP 2.1.11. 1) Every person who now lives on Earth had pairwise discussions with several other people. Prove that the total number of people having an odd number of conversations is even.

2) Prove that any polyhedron has an even number of faces with an odd number of edges.

3) Prove that any polyhedron has an even number of vertices where an odd number of edges meet.

EP 2.1.12. Given a simple graph with $v$ vertices each of degree $d$, what are the restrictions on $d$ and on the parity of the product $d \cdot v$?

EP 2.1.13. At how many points do all edges of the complete bipartite graph $K_{m,n}$ intersect, if no three edges intersect at a point?

EP 2.1.14. No three edges of the complete graph $K_n$ intersect at a point, except maybe at a vertex of the graph. Into how many parts do all the edges split the interior of the graph? Think of cutting a birthday cake.

EP 2.1.15. At a meeting of the Combi Club some members are friends and some are not. Assuming that the binary relation of being friends is symmetric (Definition 1.1.22, Part 2)), prove that there are at least two people at the meeting who have the same number of friends in the audience.

EP 2.1.16. The inhabitants of planet Triplan exchange handshakes only in triples, that is, by simultaneously connecting three limbs of three inhabitants. State and prove an analogue of the handshaking lemma (Lemma 2.1.12) for the Triplan world.

EP 2.1.17. Prove that there are $C\left(\frac{p(p-1)}{2} + q - 1,\ q\right)$ labelled graphs of order $p$ and size $q$ with parallel edges but without loops, and there are $C\left(\frac{p(p+1)}{2} + q - 1,\ q\right)$ labelled graphs of order $p$ and size $q$ with parallel edges and loops.

2.2 Connectivity in Graphs

In this section we study graphs as mathematical models for problems concerned with connectivity between different objects.

Definition 2.2.1. Consider a graph $G = (V, E, f)$, a subset $V_1 \subset V$, $V_1 \neq \emptyset$, of its set of vertices, and a subset $E_1 \subset E$ of its set of edges, such that all end-vertices of edges in $E_1$ belong to $V_1$. The graph $G_1 = (V_1, E_1, f_1)$ is called a subgraph of $G$ if its incidence function $f_1$ is the restriction (see Definition 1.1.33) of the incidence function $f$, $f_1 = f|_{E_1}$.

The definition means that we pick one or more vertices of the given graph and several, maybe none, edges of the given graph, connected with the selected vertices. We consider examples after the next definition.

Definition 2.2.2. Any subgraph $G_1$ of $G$ with $V_1 = V$ is called a spanning subgraph or a factor of $G$.

A spanning subgraph always exists, for instance, the graph itself is its own spanning subgraph, but in general it is not unique. Subgraphs can be thought of as derived from the given graph by removing some edges and vertices; after removing a vertex we must remove all edges incident to this vertex.


Example 2.2.3. This example again refers to the graph $G = (V, E, f)$ (Example 2.1.2 and Fig. 2.1). Consider the subset

$$V_1 = \{v_1, v_2, v_3, v_4\} \subset V$$

and the set

$$E_1 = \{e_1, e_2, e_3, e_4, e_5, e_6\}$$

of all edges of $g$ whose end-vertices belong to $V_1$. Thus, in this example $E_1 = E$, and the graph $G_1 = (V_1, E_1, f_1)$, with $f_1$ being the restriction of $f$ onto $E_1$, is a subgraph of $G$. If we consider the same subset $V_2 = V_1 = \{v_1, v_2, v_3, v_4\} \subset V$ and the empty set of edges $E_2 = \emptyset$, the graph $G_2 = (V_2, E_2, f_2)$ with the “empty” incidence function $f_2$ is another subgraph of $G$ with the same set of vertices as $G_1$. If we choose the same subset of vertices $V_3 = V_1$ and the set of edges $E_3 = \{e_1, e_2\}$, we get yet another subgraph of $G$ with the same set of vertices.

If we start with another subset of vertices, say $V_4 = \{v_5\} \subset V$, the corresponding subgraph is $G_4 = (V_4, \emptyset, f_4)$, where $f_4 = f|_{\emptyset}$ is the empty mapping. The subset $V_5 = \{v_2, v_4\} \subset V$ generates a subgraph $G_5 = (V_5, E_5, f_5)$, where $E_5 = \{e_4, e_6\}$. The graph $G_6 = (V_6, E_6, f_6)$, where $V_6 = V$, $E_6 = \{e_1, e_2, e_4, e_5\}$, and $f_6 = f|_{E_6}$, is a spanning subgraph of $G$.
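The conditions of Definitions 2.2.1 and 2.2.2 translate into a short test. In the sketch below, the dictionary encoding of (2.1.1) and the function names `is_subgraph` and `is_spanning` are assumptions made for the illustration. Since the incidence function of a subgraph is always the restriction of $f$, it is enough to check the sets $V_1$ and $E_1$.

```python
V = {"v1", "v2", "v3", "v4", "v5"}
f = {
    "e1": frozenset({"v1", "v2"}), "e2": frozenset({"v1", "v3"}),
    "e3": frozenset({"v1", "v3"}), "e4": frozenset({"v2", "v4"}),
    "e5": frozenset({"v3", "v4"}), "e6": frozenset({"v4"}),
}

def is_subgraph(V1, E1):
    """V1 is a non-empty subset of V, E1 is a subset of E,
    and all end-vertices of the edges in E1 belong to V1."""
    return bool(V1) and V1 <= V and all(e in f and f[e] <= V1 for e in E1)

def is_spanning(V1, E1):
    """A spanning subgraph keeps every vertex of the graph."""
    return is_subgraph(V1, E1) and V1 == V

V1 = {"v1", "v2", "v3", "v4"}
print(is_subgraph(V1, set(f)))                         # G1: True
print(is_subgraph(V1, set()))                          # G2: True
print(is_subgraph({"v5"}, set()))                      # G4: True
print(is_subgraph({"v2", "v4"}, {"e4", "e6"}))         # G5: True
print(is_subgraph({"v2", "v4"}, {"e1"}))               # False: e1 ends at v1
print(is_spanning(V, {"e1", "e2", "e4", "e5"}))        # G6: True
```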

Graph theory provides a convenient language for formalizing the concept of connectivity of different objects.

Definition 2.2.4. A sequence of alternating vertices $v_{i_l}$ and edges $e_{j_k}$,

$$(v_{i_0}, e_{j_1}, v_{i_1}, e_{j_2}, v_{i_2}, \ldots, e_{j_k}, v_{i_k}),$$

is called a walk of length $k$ connecting its end-vertices $v_{i_0}$ and $v_{i_k}$. A walk is called a trail if all its edges are different. If all vertices of a trail, except maybe for its end-vertices, are different, the trail is called a path. A trail is called a circuit if $v_{i_0} = v_{i_k}$. If the end-vertices of a path coincide, the path is called a cycle. Thus, any loop $(v_0, e_1, v_0)$ is a cycle.

The corresponding objects in a digraph are called a directed walk, directed trail, etc. A directed cycle is called a contour. It is important that all edges in a directed walk must have the same orientation, that is, the end vertex of a preceding edge must be the initial vertex of the subsequent edge.

Hence, a trail can contain repeating vertices, that is, have self-crossings; a circuit is a closed trail, and a cycle is a closed path. If $k = 0$, the walk consists of one vertex and has zero length. The edge $e_{j_m}$ is obviously a loop if $v_{i_{m-1}} = v_{i_m}$. If $e_{j_{n-1}} = e_{j_n}$, the edge $e_{j_n}$ is passed twice. Since a circuit is closed, we do not single out any of its vertices as an end-vertex. It follows from these definitions that any open path in a simple graph of order $n$ contains at most $n - 1$ edges. A path in a graph can be considered as a subgraph of the graph.


Example 2.2.5. Consider the walk $(v_1, e_2, v_3, e_3, v_1, e_1, v_2, e_1, v_1, e_3, v_3)$ in the graph $G = (V, E, f)$ of Fig. 2.1. This walk is not a trail since it contains repeating edges, say, $e_1$. However, the walk $(v_1, e_2, v_3, e_3, v_1, e_1, v_2)$ is a trail but not a path, because it has a repeating vertex $v_1$. The walk $(v_4, e_5, v_3, e_3, v_1, e_1, v_2, e_4, v_4, e_6, v_4)$ is a circuit, and the path $(v_4, e_5, v_3, e_3, v_1, e_1, v_2, e_4, v_4)$ is a cycle. It is often possible, without any ambiguity, to write down walks with only edges and skip some vertices. For instance, the latter cycle can be represented as $(e_5, e_3, e_1, e_4)$.
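The distinctions of Definition 2.2.4 can be made mechanical. The function `classify` below is a sketch under our own conventions: a walk is a Python list alternating vertex and edge names, the graph is the one given by (2.1.1), and the function returns the most specific of the five names. A walk of zero length is reported as a path.

```python
incidence = {
    "e1": frozenset({"v1", "v2"}), "e2": frozenset({"v1", "v3"}),
    "e3": frozenset({"v1", "v3"}), "e4": frozenset({"v2", "v4"}),
    "e5": frozenset({"v3", "v4"}), "e6": frozenset({"v4"}),
}

def classify(walk):
    """Most specific name of a walk written as [v0, e1, v1, ..., ek, vk]."""
    vertices, edges = walk[0::2], walk[1::2]
    for i, e in enumerate(edges):
        # every edge must join the vertices written before and after it
        if incidence[e] != frozenset({vertices[i], vertices[i + 1]}):
            raise ValueError(f"{e} does not join {vertices[i]} and {vertices[i + 1]}")
    closed = len(edges) > 0 and vertices[0] == vertices[-1]
    trail = len(set(edges)) == len(edges)
    # for a closed walk the coinciding end-vertices are counted once
    inner = vertices[:-1] if closed else vertices
    path = trail and len(set(inner)) == len(inner)
    if path:
        return "cycle" if closed else "path"
    if trail:
        return "circuit" if closed else "trail"
    return "walk"

w1 = "v1 e2 v3 e3 v1 e1 v2 e1 v1 e3 v3".split()
w2 = "v1 e2 v3 e3 v1 e1 v2".split()
w3 = "v4 e5 v3 e3 v1 e1 v2 e4 v4 e6 v4".split()
w4 = "v4 e5 v3 e3 v1 e1 v2 e4 v4".split()
print([classify(w) for w in (w1, w2, w3, w4)])
# ['walk', 'trail', 'circuit', 'cycle']
```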

The following simple properties of graphs are useful in many instances.

Lemma 2.2.6. 1) If two vertices of a graph are connected by a walk $W$ of length $n$, then they are connected by a path of length at most $n$.

2) Each circuit contains a cycle.

Proof. If the walk $W$ is not a path, then while traversing it, we must eventually arrive at some vertex $v_0$ the second time, otherwise the graph would be infinite. Thus, $W$ contains a circuit starting and ending at $v_0$. Removing from $W$ all elements of this circuit, except for the vertex $v_0$, we get a walk $W'$, which must be shorter than $W$, otherwise $W$ would be a path. Since $W$ is finite, repeating this process several times we remove from $W$ all repeating edges and vertices and derive the path we look for. The same reasoning yields the second statement of the lemma.
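The shortening process used in the proof can be carried out in a single pass over the walk. The function `walk_to_path` is a sketch of this idea rather than a transcription of the proof: whenever a vertex is met for the second time, the closed sub-walk between its two occurrences is discarded.

```python
def walk_to_path(walk):
    """Shorten a walk [v0, e1, v1, ..., ek, vk] to a path with the same end-vertices."""
    path = [walk[0]]
    position = {walk[0]: 0}       # index in `path` of every vertex currently on it
    for i in range(1, len(walk), 2):
        edge, vertex = walk[i], walk[i + 1]
        if vertex in position:
            # Second visit: drop the closed sub-walk that started at `vertex`.
            cut = position[vertex]
            for dropped in path[cut + 2::2]:
                del position[dropped]
            del path[cut + 1:]
        else:
            path += [edge, vertex]
            position[vertex] = len(path) - 1
    return path

# The first walk of Example 2.2.5 connects v1 and v3 and has length 5.
walk = "v1 e2 v3 e3 v1 e1 v2 e1 v1 e3 v3".split()
print(walk_to_path(walk))   # ['v1', 'e3', 'v3']
```

If the walk is closed, the result is the path of zero length consisting of its single end-vertex.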

Consider again the diagram $g$ in Fig. 2.1. If we depart from vertex $v_1$ and travel through $g$ using its edges as roadways, we can reach any other vertex except for the isolated vertex $v_5$, and the same is true if we depart from any of the vertices $v_2$ through $v_4$. However, from $v_5$ we can reach only $v_5$ itself, using the path of zero length. This observation leads to the next definition.

Definition 2.2.7. Given a vertex $v \in V$, denote by $C(v)$ the set of all vertices of a graph $G = (V, E)$ connected with $v$ by a path² in $G$. The subgraph $G'$ of $G$, generated by $C(v)$, is called a connected component of $G$; that is, a connected component consists of all vertices in $C(v)$ and all edges of $G$ whose end-vertices also are in $C(v)$.

A graph consisting of only one connected component is called a connected graph; thus, every connected component is a connected graph.

An edge $e$ in a graph $G$ is called a cut-edge or a bridge if its removal increases the number of connected components in the graph.

We denote the number of connected components in a graph $G$ by $\mathrm{cc}(G)$.

Example 2.2.8. The diagram $g$ (Fig. 2.1) has two connected components, namely, $g_1$ and $g_2 = (V_2, \emptyset, \emptyset)$, where $V_2 = \{v_5\}$ (see Fig. 2.7).

² Or, which is equivalent due to Lemma 2.2.6, by a walk.


Figure 2.7. Diagrams $g_1$ and $g_2$ – see graph $G_1$.
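The sets $C(v)$ of Definition 2.2.7 can be computed by a breadth-first search. The sketch below again uses our dictionary encoding of (2.1.1); breadth-first search is a standard technique chosen here for the illustration, not a method taken from the text.

```python
from collections import deque

V = {"v1", "v2", "v3", "v4", "v5"}
f = {
    "e1": frozenset({"v1", "v2"}), "e2": frozenset({"v1", "v3"}),
    "e3": frozenset({"v1", "v3"}), "e4": frozenset({"v2", "v4"}),
    "e5": frozenset({"v3", "v4"}), "e6": frozenset({"v4"}),
}

def components(V, f):
    """Vertex sets C(v) of the connected components, found by breadth-first search."""
    neighbours = {v: set() for v in V}
    for ends in f.values():
        for u in ends:
            neighbours[u] |= ends        # u itself is included, which is harmless
    seen, result = set(), []
    for start in sorted(V):
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for w in neighbours[u] - component:
                component.add(w)
                queue.append(w)
        seen |= component
        result.append(component)
    return result

parts = components(V, f)
print(len(parts), [sorted(c) for c in parts])
# 2 [['v1', 'v2', 'v3', 'v4'], ['v5']]
```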

Lemma 2.2.9. If the degree of every vertex of a finite graph is at least 2, the graph contains a cycle.

Proof. Without loss of generality, we can consider a simple connected graph. Pick any vertex $v$ and any edge $e$ incident to $v$. The other end-vertex of $e$ also has degree at least 2, so that it has at least one incident edge distinct from $e$. Continuing this way, after several steps we must meet some vertex the second time, since the graph is finite. The part of our walk which starts and ends at this repeating vertex is the cycle we sought.
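The walk described in the proof is easy to imitate. In the sketch below a simple graph is given by its adjacency sets; the graph `adj` is a made-up example, and the rule of always leaving by the smallest admissible neighbour is only a device to make the run reproducible.

```python
def find_cycle(adj):
    """Vertices of a cycle in a simple graph in which every degree is at least 2.

    adj maps every vertex to the set of its neighbours.
    """
    start = next(iter(adj))
    walk, index = [start], {start: 0}
    previous = None
    while True:
        current = walk[-1]
        # leave the current vertex by an edge other than the one we arrived by
        nxt = next(w for w in sorted(adj[current]) if w != previous)
        if nxt in index:
            # the walk has closed up: cut off the initial part before nxt
            return walk[index[nxt]:] + [nxt]
        index[nxt] = len(walk)
        walk.append(nxt)
        previous = current

# A made-up example: a square 1-2-4-3-1 with the diagonal 2-3.
adj = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3}}
cycle = find_cycle(adj)
print(cycle)
```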

Problem 2.2.1. Why is the other end-vertex of e distinct from v?

Lemma 2.2.10. Consider a graph $G = (V, E)$ and let $G' = (V, E \setminus \{e\})$ be a subgraph of $G$ derived by removal of an edge $e \in E$. If the edge $e$ is a bridge in $G$, then $\mathrm{cc}(G') = \mathrm{cc}(G) + 1$.

Proof. The removal of a bridge $e$ can affect only the connected component containing $e$. Thus without loss of generality, we can consider a connected graph $G$, $\mathrm{cc}(G) = 1$, and prove that $\mathrm{cc}(G') = 2$. Now, if we assume that $G'$ has at least three connected components, $\mathrm{cc}(G') \ge 3$, then there are three vertices, say $v_1, v_2, v_3 \in V$, belonging to these three different connected components. Since $G$ is connected, there are three paths connecting $v_1$, $v_2$, and $v_3$ pairwise. Moreover, since e is a bridge, each of these three paths must contain the edge $e$. Then it is easy to see that we can always rearrange these paths and construct a new path $\gamma$, which connects certain two among the three vertices $v_1, v_2, v_3$ and does not contain the bridge $e$, contrary to our assumption on the existence of three connected components in $G'$.

Problem 2.2.2. There is a missing step at the end of the latter proof; restore it, namely, give a more precise construction of the path $\gamma$. Hint: consider several possible cases.

Lemma 2.2.11. For any finite graph $G = (V, E)$ of order $p$ and size $q$, the inequality $p - \mathrm{cc}(G) \le q$ is valid.

Proof. We prove the inequality by mathematical induction on the size q of the graph.If q D 0, then each vertex is isolated, so that cc.G/ D p and the conclusion is clear.Suppose now that the inequality holds for any graph of size less than q; q > 0, andprove the statement for graphs of size q.

Consider separately two possible cases: either each edge of $G$ is a bridge or there is an edge $e \in E$ which is not a bridge. In the latter case, we remove $e$ and get a graph $G'$ with $q - 1$ edges; by the inductive assumption $p - cc(G') \le q - 1$, and since $cc(G) = cc(G')$ ($e$ is not a bridge!) we immediately derive that

$$p - cc(G) = p - cc(G') \le q - 1 < q.$$

In the former case, we remove an arbitrary edge $e_0 \in E$ and denote the remaining subgraph by $G'$. By Lemma 2.2.10, $cc(G') = cc(G) + 1$. Now by the inductive assumption $p - cc(G') \le q - 1$, or $p - cc(G) - 1 \le q - 1$, whence $p - cc(G) \le q$.
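The two lemmas above can be checked numerically on small graphs. The following is only a sketch under our own assumptions: the vertices are numbered 0, ..., p-1, the function name count_components is hypothetical, and the sample graph is not taken from the book. It counts connected components with a simple union-find structure.

```python
def count_components(p, edges):
    """Count connected components of a graph with vertices 0..p-1.

    `edges` is a list of (u, v) pairs; loops and parallel edges are allowed.
    """
    parent = list(range(p))

    def find(x):
        # Follow parent links to the representative, shortening the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = p
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:          # the edge joins two different components
            parent[ru] = rv
            components -= 1
    return components


# Hypothetical example: the path 0-1-2 plus the isolated vertex 3.
p, edges = 4, [(0, 1), (1, 2)]
cc = count_components(p, edges)
print(cc, p - cc <= len(edges))        # 2 True   (Lemma 2.2.11)

# Removing the bridge (1, 2) raises the number of components by one.
print(count_components(p, [(0, 1)]))   # 3        (Lemma 2.2.10)
```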

Geometrical diagrams visualize graphs; however, on many occasions, for instance to store graphs in computer memory, it is more convenient to represent them analytically rather than geometrically. First we recall a few standard definitions.

Definition 2.2.12. An $m \times n$ matrix $M = (a_{i,j})$, $1 \le i \le m$, $1 \le j \le n$, is a rectangular array of $m$ rows and $n$ columns, where $a_{i,j}$ stands for the entry at the crossing of the $i$th row and $j$th column. If $m = 1$, the matrix consists of one row and can be considered as a row vector ($n$-tuple) of length $n$. A matrix with $m = n$ is called a square matrix. A square matrix is called symmetric if $a_{i,j} = a_{j,i}$ for all $i, j$.

The $n \times m$ matrix $M^{T} = (b_{i,j})$, $1 \le i \le n$, $1 \le j \le m$, with $b_{i,j} = a_{j,i}$, is called the transpose of the matrix $M$.

Definition 2.2.13. The dot product of two $n$-vectors $v_1 = (a_{1,1}, a_{1,2}, \ldots, a_{1,n})$ and $v_2 = (a_{2,1}, a_{2,2}, \ldots, a_{2,n})$ is the sum of pairwise products of their corresponding components,

$$v_1 \cdot v_2 = a_{1,1} a_{2,1} + a_{1,2} a_{2,2} + \cdots + a_{1,n} a_{2,n}.$$

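Definition 2.2.14. Consider a $p \times q$ matrix $A = (a_{i,j})$ and a $q \times r$ matrix $B = (b_{i,j})$. Their product is the $p \times r$ matrix $C = A \cdot B = (c_{i,j})$, where each entry is the dot product of the $q$-vector $a_i = (a_{i,1}, a_{i,2}, \ldots, a_{i,q})$ (the $i$th row of $A$) by the $q$-vector $b_j = (b_{1,j}, b_{2,j}, \ldots, b_{q,j})$ (the $j$th column of $B$), that is,

$$c_{i,j} = a_i \cdot b_j = \sum_{k=1}^{q} a_{i,k} b_{k,j}.$$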
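Definitions 2.2.12 to 2.2.14 translate almost literally into code. The sketch below stores a matrix as a list of rows; the function names mat_mul and transpose and the matrices X and Y are our own illustrative choices, not part of the text.

```python
def mat_mul(A, B):
    """Product of a p x q matrix A and a q x r matrix B (lists of rows)."""
    q = len(B)
    if any(len(row) != q for row in A):
        raise ValueError("the number of columns of A must equal the number of rows of B")
    r = len(B[0])
    # c[i][j] is the dot product of row i of A and column j of B.
    return [[sum(A[i][k] * B[k][j] for k in range(q)) for j in range(r)]
            for i in range(len(A))]


def transpose(M):
    """Transpose: entry (i, j) of the result is entry (j, i) of M."""
    return [list(column) for column in zip(*M)]


# Hypothetical matrices, chosen only for illustration.
X = [[1, 2, 0],
     [0, 1, 3]]            # 2 x 3
Y = [[1, 0],
     [2, 1],
     [0, 4]]               # 3 x 2

print(mat_mul(X, Y))       # [[5, 2], [2, 13]]
print(mat_mul(Y, X))       # [[1, 2, 0], [2, 5, 3], [0, 4, 12]]
print(transpose(X))        # [[1, 0], [2, 1], [0, 3]]
```

The two products have different sizes here, which already shows that the order of the factors matters.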


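Problem 2.2.3. Let

$$A = \begin{pmatrix} 0 & 1 & 2 \\ 0 & 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 0 & 1 \\ 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad C = \begin{pmatrix} 0 & 1 & 2 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

1) Find the transposes $A^{T}$, $B^{T}$, $C^{T}$.

2) Calculate the products $A \cdot B$, $B \cdot A$, and $A \cdot C$. Does the matrix product $C \cdot A$ exist? Is the matrix multiplication commutative?

3) Find the transposes $(A \cdot B)^{T}$ and $(B \cdot A)^{T}$. Is there any relationship between $A^{T}$, $B^{T}$, and $(A \cdot B)^{T}$ or $(B \cdot A)^{T}$? Prove this relationship.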

Definition 2.2.15. Let $G = (V, E)$ be a graph of order $p$ with the set of vertices $V = \{v_1, \ldots, v_p\}$. The adjacency matrix of $G$, corresponding to the given numbering of the vertices, is the $p \times p$ square matrix

$$A(G) = (a_{i,j}), \quad 1 \le i, j \le p,$$

where $a_{i,j}$ is the number of edges connecting the vertices $v_i$ and $v_j$. Thus, $a_{i,i}$ is the number of loops at the vertex $v_i$. This matrix is symmetric as long as we consider only undirected graphs.

If we compute the column sum $\sum_{i=1}^{p} a_{i,j}$ or the row sum $\sum_{i=1}^{p} a_{j,i}$, $1 \le j \le p$, in the adjacency matrix, which are the same due to the symmetry of $A(G)$, then we find the number of edges incident to $v_j$; but unlike in the degree of the vertex $v_j$, each loop in this sum is counted only once. Thus,

$$\deg(v_j) = \Big( \sum_{i=1,\, i \ne j}^{p} a_{i,j} \Big) + 2 a_{j,j} = \Big( \sum_{i=1}^{p} a_{i,j} \Big) + a_{j,j}.$$
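As a quick illustration of the degree formula, the following sketch computes all vertex degrees from an adjacency matrix stored as a list of rows. The function name degrees is our own; the matrix is the one given for the graph G in Example 2.2.16.

```python
def degrees(A):
    """Vertex degrees from a symmetric adjacency matrix.

    The row sum counts every loop once, so the diagonal entry is added
    a second time, exactly as in the formula for deg(v_j).
    """
    return [sum(row) + row[j] for j, row in enumerate(A)]


# Adjacency matrix of the graph G from Example 2.2.16.
A = [[0, 1, 2, 0, 0],
     [1, 0, 0, 1, 0],
     [2, 0, 0, 1, 0],
     [0, 1, 1, 1, 0],
     [0, 0, 0, 0, 0]]

d = degrees(A)
print(d)            # [3, 2, 3, 4, 0]
print(sum(d) // 2)  # 6, the number of edges
```

The vertex v4 has row sum 3 but degree 4, because its loop is counted twice in the degree.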

Example 2.2.16. The adjacency matrix of the graph $G$ (Fig. 2.1) is

$$A(G) = \begin{pmatrix} 0 & 1 & 2 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 \\ 2 & 0 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}.$$

The matrix $A(G)$ has a $2 \times 2$ block structure,

$$A(G) = \begin{pmatrix} A_1 & 0 \\ 0 & A_2 \end{pmatrix},$$

where $A_1 = A(g_1)$ is the $4 \times 4$ adjacency matrix of the graph $g_1$ and $A_2 = A(g_2)$ is the $1 \times 1$ adjacency matrix of the graph $g_2 = (\{v_5\}, \emptyset, \emptyset)$ (Fig. 2.7); the latter graph consists of one isolated vertex $v_5$, whence $A_2$ is the $1 \times 1$ matrix $A_2 = (0)$. All other elements of $A(G)$ are zeros situated in two rectangular blocks. The two graphs $g_1$ and $g_2$ are the two connected components of the graph $G$. With a suitable numbering of the vertices, any adjacency matrix has such a block structure.

Problem 2.2.4. Prove that if a graph has $k$ connected components, then its adjacency matrix is a $k \times k$ block matrix; these $k$ blocks are the adjacency matrices of the connected components of the graph and are situated along the main diagonal, all the other elements of the adjacency matrix being zeros. The size of each block is equal to the order of the corresponding connected component of the graph.

Let us take another look at the graph $G$ (Fig. 2.1): the vertices $v_1$ and $v_3$ are connected by two edges, $e_2$ and $e_3$. Hence there are two walks of length 1 from $v_1$ to $v_3$, and we can read this off from the adjacency matrix $A(G)$, since $a_{1,3} = a_{3,1} = 2$. All other entries of $A(G)$ can be interpreted the same way. For instance, we see that $v_4$ is connected by walks of length 1 with $v_2$, $v_3$, and with itself, thus there is a loop at $v_4$. In many instances it is necessary to count walks of length bigger than 1. To approach this problem, we compute the square $A^2(G) = A(G) \cdot A(G)$ of the adjacency matrix $A(G)$,

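$$A^2(G) = \begin{pmatrix} 5 & 0 & 0 & 3 & 0 \\ 0 & 2 & 3 & 1 & 0 \\ 0 & 3 & 5 & 1 & 0 \\ 3 & 1 & 1 & 3 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}.$$

Consider here the very first entry of $A^2(G)$, $a_{1,1} = 5$. By Definition 2.2.13,

$$a_{1,1} = (0, 1, 2, 0, 0) \cdot (0, 1, 2, 0, 0) = 1 \cdot 1 + 2 \cdot 2 = 5. \qquad (2.2.1)$$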

This shows that there are five walks of length 2 starting and ending at $v_1$. Indeed, we can go from $v_1$ to $v_2$ and come back using the edge $e_1$, which gives $1 \cdot 1 = 1$ walk; or go from $v_1$ to $v_3$ using either $e_2$ or $e_3$ and come back using again either $e_2$ or $e_3$, which gives $2 \cdot 2 = 4$ more walks. Thus each term in (2.2.1) counts the walks of length 2 from $v_1$ to itself through one particular intermediate vertex. The five closed walks of length 2 starting at $v_1$ and returning to $v_1$ are

$(v_1, e_1, v_2, e_1, v_1)$, $(v_1, e_2, v_3, e_2, v_1)$, $(v_1, e_2, v_3, e_3, v_1)$, $(v_1, e_3, v_3, e_2, v_1)$, and $(v_1, e_3, v_3, e_3, v_1)$.

The other entries of $A^2(G)$ have the same meaning. Obviously, this reasoning holds true in general and leads to the following statement.

Theorem 2.2.17. The number of walks of length $r$ from $v_i$ to $v_j$ in a graph $G = (V, E)$, $V = \{v_1, \ldots, v_p\}$, is equal to the element $a^{[r]}_{i,j}$ of the matrix $(A(G))^r$, where $A(G)$ is the adjacency matrix of $G$. The matrices $(A(G))^r$, $r = 2, 3, \ldots$, exist since $A(G)$ and all its powers are square matrices.
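The statement of Theorem 2.2.17 is easy to try out by machine. The sketch below, with our own helper names mat_mul and mat_pow, raises the adjacency matrix of the graph G (Fig. 2.1) to a power by repeated multiplication and reproduces the matrix of walks of length 2 computed above.

```python
def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def mat_pow(A, r):
    """r-th power (r >= 0) of a square matrix by repeated multiplication."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    for _ in range(r):
        result = mat_mul(result, A)
    return result


# Adjacency matrix of the graph G (Fig. 2.1), see Example 2.2.16.
A = [[0, 1, 2, 0, 0],
     [1, 0, 0, 1, 0],
     [2, 0, 0, 1, 0],
     [0, 1, 1, 1, 0],
     [0, 0, 0, 0, 0]]

A2 = mat_pow(A, 2)
for row in A2:
    print(row)
# [5, 0, 0, 3, 0]
# [0, 2, 3, 1, 0]
# [0, 3, 5, 1, 0]
# [3, 1, 1, 3, 0]
# [0, 0, 0, 0, 0]

# Entry (1, 4) of the cube: walks of length 3 from v1 to v4.
print(mat_pow(A, 3)[0][3])   # 3
```

The three walks of length 3 from v1 to v4 go through v2 or v3 (the latter in two ways) and then once around the loop at v4.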


Problem 2.2.5. Prove this theorem by mathematical induction.

The language of graph theory is often useful in solving various problems that seem unrelated to graphs. For instance, the information in the following problem can be conveniently represented by a graph.

Problem 2.2.6. The organizing committee of The Combi Club Annual Meeting observed that among each four participants there is a person who knows the three other people in this quartet. Show that every quartet of the participants contains someone who is familiar with all the other participants of the meeting.

Solution. Consider a graph with vertices corresponding to the participants, where two vertices are adjacent if and only if the corresponding participants do not know one another. We can clearly assume that the graph has at least 4 vertices. This graph has neither parallel edges nor loops. We have to prove that if the graph contains no quadruple of vertices such that each of these four vertices is adjacent to at least one of the three other vertices in the quadruple, then each quadruple of vertices contains an isolated vertex.

If the graph contains no edges or only one edge, the conclusion is obvious. So we suppose that there are at least two edges, say $a$ and $b$. If they are not adjacent, the quadruple of their end-vertices contradicts our assumption; thus, $a$ and $b$ have a common vertex $v_0$. Similar reasoning yields that, except for $a$ and $b$, the graph can have at most one edge $c$, and this edge must have a common vertex with each of the edges $a$ and $b$. Now, $c$ cannot be incident to $v_0$, since that would imply $\deg(v_0) = 3$, and $v_0$ together with its three neighbours would form a forbidden quadruple. Therefore, the edges $a$, $b$, and $c$ make a triangle, and all the other vertices are isolated.


EP 2.2.1. Find all subgraphs of the graph G, given by the incidence function (2.1.1), or which is the same, by the diagram g in Fig. 2.1. Are there isomorphic graphs among them?

EP 2.2.2. Are any two of the three diagrams in Fig. 2.8 isomorphic?

EP 2.2.3. Find all connected components of the graphs in Fig. 2.3, 2.7 and Example 2.2.3.

EP 2.2.4. Does there exist a connected graph with 5 vertices, such that each of its edges is a bridge, and moreover, removal of every edge generates exactly two connected components?

EP 2.2.5. List all walks of length 3 or less in the graphs in Fig. 2.7 and Example 2.2.3. Classify them as trails, paths, circuits, or cycles.


Figure 2.8. Diagrams to EP 2.2.2.

EP 2.2.6. Is it true or false that a graph of size at least 1 has walks of any finite length?

EP 2.2.7. Prove that connectivity is an equivalence relation on the set $V$ of the vertices of a graph. What are its equivalence classes?

EP 2.2.8. 1) How many cycles are in the complete graphs $K_4$, $K_5$, $K_n$?

2) How many paths of length 3 are in $K_5$?

EP 2.2.9. Prove that a graph $G = (V, E)$ of order $p$ and size $q$ satisfying the inequality $q + 2 \le p$ cannot be connected, that is, $cc(G) \ge 2$.

EP 2.2.10. Prove that the size of a connected graph of order $p$ is at least $p - 1$.

EP 2.2.11. Find the adjacency matrices of the graphs in Fig. 2.1, 2.3, and 2.7.

EP 2.2.12. 1) How many walks of length 2 connect any two vertices in the complete graph $K_n$?

2) In the graph $K_{3,3}$?

EP 2.2.13. For the incidence matrix of the graph $G$ (Fig. 2.1) compute $\sum_{i=1}^{5} \big( \sum_{j=1}^{5} a_{i,j} \big)$ and $\sum_{j=1}^{5} \big( \sum_{i=1}^{5} a_{i,j} \big)$.

EP 2.2.14. Prove that the vertices $v_i$ and $v_j$ in a graph $G$ of order $p$ are connected if and only if the $(i, j)$th element of the matrix

$$A(G) + A^2(G) + \cdots + A^{p-1}(G)$$

is not zero.

EP 2.2.15. Prove that the vertices of a bipartite graph can be renumbered so that its adjacency matrix is a $2 \times 2$ block matrix

$$\begin{pmatrix} 0 & A_1 \\ A_2 & 0 \end{pmatrix},$$

where $A_1$, $A_2$ are rectangular matrices and both zero blocks are square matrices with all zero entries. What are the sizes of the blocks $A_1$ and $A_2$?


EP 2.2.16. Prove that if two graphs have the same adjacency matrix, they are isomorphic. Is the converse statement true?

EP 2.2.17. Prove that two graphs $G = (V, E)$ and $G' = (V', E')$ with the adjacency matrices $(a_{i,j})$ and $(a'_{i,j})$ are isomorphic if and only if they have the same order $p$ and there exists a $p$-permutation $\sigma$ such that $a_{i,j} = a'_{\sigma(i),\sigma(j)}$.

EP 2.2.18. Each edge of the complete graph $K_n$ is colored in one of $n - 1$ colors, such that for every vertex all its incident edges have different colors. For what $n$ is it possible?

EP 2.2.19. Each student club on a campus has a two-color flag. For these flags, the college bought fabrics of 8 different colors. It is known that every color meets on the flags with at least four other colors. Prove that the college can select no more than four clubs whose flags represent all the 8 colors.

EP 2.2.20. To connect every pair among 25 cities by a direct flight, it is necessary to have $C(25, 2) = 300$ flights. Suppose now that 25 cities are connected by only 277 direct flights with at most one direct flight between any two cities. Prove that if transfers are allowed, then any city can be reached by air from any other city with at most one transfer. Prove that if a graph with $v$ vertices has at least $C(v - 1, 2) + 1$ edges, then the graph is connected. This bound is very simple and crude; how far can you improve this estimate?

EP 2.2.21. Prove that if a graph of order $p$ has a directed path of length at least $p$, this path must pass through some vertex at least twice; therefore, the graph contains a circuit.

2.3 Trees

In this section we study properties of a special kind of graphs important in many applications.

That’s the Tree! · Joseph Kruskal’s biography · Another Kruskal · Kruskal’s algorithm · Algorithms · Cayley’s biography · Prüfer’s biography

Definition 2.3.1. A graph without cycles, or acyclic graph, is called a forest. A connected forest is called a tree (Fig. 2.9). A rooted tree is a tree which has a singled-out vertex, called the root.


Figure 2.9. An example of a tree.

Thus, a forest is a family of trees and a tree is a connected graph without cycles. The following theorem lists several important equivalent properties of trees.

Theorem 2.3.2. Let $G = (V, E)$ be a finite graph of order $|V| = p$. Then the following statements are equivalent.

1) $G$ is a tree.

2) $G$ is a connected graph and each of its edges is a bridge.

3) $G$ is an acyclic graph and its size is $|E| = p - 1$.

4) $G$ is a connected graph and $|E| = p - 1$.

5) For any pair of vertices of $G$ there is a unique path connecting them.

6) $G$ is acyclic, but any new edge added to $G$ generates precisely one cycle.

Proof. We establish the following chain of implications,

1) ⇒ 2);  1) & 2) ⇒ 3) ⇒ 4) ⇒ 5) ⇒ 6) ⇒ 1).

First we prove that 1) implies 2). Since a tree is connected by definition, it suffices to prove that every edge is a bridge. On the contrary, if we assume that some edge $e \in E$ is not a bridge, then we can remove it and still get a connected graph $G'$. Hence the end-vertices of $e$ are connected in $G'$ by a path, which clearly does not contain $e$. Now, the addition of $e$ to this path would generate a cycle in $G$, which is impossible since $G$ is a tree and cannot contain cycles.

Next we prove that 1) and 2) together imply 3). Indeed, if we remove any edge from a cycle, the remaining graph is still connected, so no edge of a cycle is a bridge, implying that $G$ must be acyclic. To prove that $G$ has $p - 1$ edges, we apply mathematical induction on $p = |V|$.

If $p = 1$, the single vertex must be isolated and the assertion is trivial. Suppose the assertion holds true for all trees of order less than some $p > 1$ and consider a tree $G = (V, E)$ of order $p$. Let $e \in E$; then $e$ is a bridge by the assumption. Thus, if we remove $e$ and denote the remaining graph by $G'$, then by Lemma 2.2.10, $cc(G') = cc(G) + 1 = 2$. Let $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$ be the two connected components of $G'$. They cannot be empty, cannot have cycles, and by the inductive assumption, $|E_1| = |V_1| - 1$, $|E_2| = |V_2| - 1$. Adding up these equations and considering that $E = E_1 \cup E_2 \cup \{e\}$, $V = V_1 \cup V_2$, where both unions are disjoint, we arrive at the conclusion: $|E| = |E_1| + |E_2| + 1 = (|V_1| - 1) + (|V_2| - 1) + 1 = p - 1$.

To prove that 3) implies 4), we again assume that, on the contrary, $G$ is not connected, that is, it consists of $k \ge 2$ connected components. Each of these components is a tree, and by what we have already proved, for each of them $|E'| = |V'| - 1$. Adding up $k \ge 2$ such equations leads to $|E| = |V| - k = p - k < p - 1$, which contradicts the premise.

The implication 4) ⇒ 5) follows readily if we notice that if there are two vertices connected by two different paths, then these paths together make up a cycle. Removing any edge $e$ from this cycle, we get a connected graph $G' = (V, E')$, where $E' = E \setminus \{e\}$, such that $|V| = p$ and $|E'| = p - 2$, in contradiction with Lemma 2.2.11.

Next we prove the implication 5) ⇒ 6). If we could add an edge and generate two cycles, this would mean that the end-vertices of the new edge were connected by two paths in the original graph, which is impossible.

Finally, to prove that 6) ⇒ 1), we have to prove that the graph is connected, which is obvious; indeed, a new cycle connects any two of its vertices twice, thus one connection must have existed before we added the edge.
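Characterization 4) of Theorem 2.3.2 gives a convenient computational test for trees: count the edges and check connectedness by a depth-first search. The following is a minimal sketch under our own conventions (vertices numbered 0, ..., p-1, a hypothetical function name is_tree, and small sample graphs that are not taken from the book).

```python
def is_tree(p, edges):
    """Test characterization 4): connected and exactly p - 1 edges.

    Vertices are 0..p-1; `edges` is a list of (u, v) pairs.
    """
    if p < 1 or len(edges) != p - 1:
        return False
    adj = [[] for _ in range(p)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # Depth-first search from vertex 0 to collect its connected component.
    seen = {0}
    stack = [0]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == p


print(is_tree(4, [(0, 1), (0, 2), (2, 3)]))   # True
print(is_tree(4, [(0, 1), (1, 2), (0, 2)]))   # False: a triangle and an isolated vertex
print(is_tree(3, [(0, 1), (1, 2), (0, 2)]))   # False: too many edges
```

The second example has the right number of edges but is not connected, which shows that neither condition in 4) can be dropped.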

We now study some other properties of trees. Compare the graphs $g_1$ (Fig. 2.7) and $T = (\{v_1, v_2, v_3, v_4\}, \{e_1, e_2, e_5\}, f_T)$ (Fig. 2.10).

Figure 2.10. Graph T.

Problem 2.3.1. Write down explicitly the incidence function $f_T$ of the graph $T$ in Fig. 2.10.

Solution. From Fig. 2.10 we observe immediately that $f_T(e_1) = \{v_1, v_2\}$, $f_T(e_2) = \{v_1, v_3\}$, and $f_T(e_5) = \{v_3, v_4\}$.

The graph $T$ is a connected spanning subgraph of the graph $g_1$ in Fig. 2.7: it contains all the vertices of $g_1$ and some of its edges.

Definition 2.3.3. If a spanning subgraph of a graph $G$ is a tree, this tree is called a spanning tree of $G$.

Thus, the tree $T$ (Fig. 2.10) is a spanning tree of the graph $g_1$. A graph may have several spanning trees. The next statement follows immediately from Theorem 2.3.2.

Corollary 2.3.4. Every graph has a spanning forest. Every connected graph has a spanning tree.

In many applications it is useful to supply edges of a graph with an additional piece of information, usually called the weight of the edge. The weight can be a number like the length of an edge, or a symbol like a traffic sign indicating whether this is a one-way or two-way street. If every edge of a graph carries a weight, the graph is called weighted. Weighted graphs have weighted spanning trees.

Example 2.3.5. Consider a connected weighted graph $G_7$ (Fig. 2.11), where the weights are $w_1 = 2$, $w_2 = 5$, $w_3 = 1$, $w_4 = 3$. This graph has three different spanning trees shown in Fig. 2.12. These trees have different weights, namely $W(T_1) = w_2 + w_3 + w_4 = 9$, $W(T_2) = w_1 + w_3 + w_4 = 6$, and $W(T_3) = w_1 + w_2 + w_4 = 10$; among them the tree $T_2$ has the smallest weight, so it is the minimum spanning tree of the graph $G_7$.

Figure 2.11. Graph G7.

There are several algorithms for finding a minimum spanning tree in a graph. We present the well-known algorithm of Kruskal. Connectedness of a graph is, obviously, a necessary condition for the existence of a spanning tree.


Figure 2.12. Spanning trees T1, T2, T3.

Kruskal’s Algorithm for Finding a Minimum Spanning Tree

Given a connected weighted graph $G = (V, E)$ with $n$ vertices, find a minimum spanning tree in $G$. We assume that all weights are nonnegative numbers.

1. Select an edge $e$ with the smallest weight. If the graph has several edges with the same minimum weight, we can choose any of them. The edge $e$ and its end-vertices form the initial subgraph (subtree) $T_1$ of $G$.

2. For $m = 1, 2, \ldots, n - 2$, select an unused edge with the smallest weight such that this edge does not make a cycle with the edges selected earlier. In particular, we can use an edge with the same weight as the one in the previous step. Append the chosen edge and, if necessary, its end-vertices to the subgraph $T_m$ generated at the previous step, to build the next subgraph $T_{m+1}$.

3. After Step 2 has been repeated $n - 2$ times, the subgraph $T_{n-1}$, where $n$ is the order of the given graph $G$, is a minimum spanning tree of $G$.

Remark 2.3.6. Not every graph among $T_2, \ldots, T_{n-2}$ need be a tree; some of them can be forests, but $T_{n-1}$ is a tree.

Problem 2.3.2. Prove that Kruskal’s algorithm generates a minimum spanning tree in any connected graph.

Problem 2.3.3. Find a minimum spanning tree for the complete graph $K_8$, where the weights of the edges are given in the following symmetric table.

Solution. In this problem, the vertices are denoted by $c_i$, $c_j$, etc.; the $(i, j)$-entry of the table is the weight, $w_{i,j} = w_{j,i}$, of the edge incident to the vertices $c_i$ and $c_j$. The reader can notice that the weights are all the integers from 1 through 28 inclusive.

      c1  c2  c3  c4  c5  c6  c7  c8
c1     0   5  10   7  22  27  25  13
c2     5   0   8  12  28  23  17   6
c3    10   8   0   1   9  19   3  26
c4     7  12   1   0   4  14   2  21
c5    22  28   9   4   0  11  16  18
c6    27  23  19  14  11   0  15  20
c7    25  17   3   2  16  15   0  24
c8    13   6  26  21  18  20  24   0

Table 2.1. The weights of the complete graph K8 in Problem 2.3.3.

The following figures exhibit all the consecutive steps of Kruskal’s algorithm applied to Problem 2.3.3. The smallest weight is $w_{3,4} = 1$; thus we start with the graph with 8 isolated vertices $c_1, \ldots, c_8$ (Fig. 2.13) and first connect the vertices $c_3$ and $c_4$ by an edge of weight 1 (Fig. 2.14). The second smallest weight is $w_{4,7} = 2$. Adding an edge of weight 2 connecting the vertices $c_4$ and $c_7$, we get the graph shown in Fig. 2.15.

The next smallest weight is $w_{3,7} = 3$. However, we cannot connect the vertices $c_3$ and $c_7$, because such an edge would form a cycle with the two previously included edges (Fig. 2.16), which is forbidden by Step 2 of the algorithm.

Thus, we look for the next smallest weight, $w_{4,5} = 4$, and at the next step we connect the vertices $c_4$ and $c_5$ by the edge of weight 4 (Fig. 2.17). Figures 2.18–2.21 show the subsequent subgraphs leading to a minimum spanning tree of weight $1 + 2 + 4 + 5 + 6 + 7 + 11 = 36$ (Fig. 2.21).
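The steps of Kruskal's algorithm can be sketched in a few lines of code. The version below is only an illustration under our own assumptions: vertices are numbered from 0 (so c1 is vertex 0), the function name kruskal is ours, and the cycle test of Step 2 is done with a union-find structure, which is one common way to implement it and is not prescribed by the text. The weights are those of Table 2.1.

```python
def kruskal(n, weighted_edges):
    """Return the edges (u, v, w) of a minimum spanning tree.

    `weighted_edges` is a list of (w, u, v) triples on vertices 0..n-1;
    the graph is assumed to be connected.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(weighted_edges):   # edges by increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                         # the edge closes no cycle
            parent[ru] = rv
            tree.append((u, v, w))
            if len(tree) == n - 1:
                break
    return tree


# Table 2.1: weights of the complete graph K8.
W = [[0, 5, 10, 7, 22, 27, 25, 13],
     [5, 0, 8, 12, 28, 23, 17, 6],
     [10, 8, 0, 1, 9, 19, 3, 26],
     [7, 12, 1, 0, 4, 14, 2, 21],
     [22, 28, 9, 4, 0, 11, 16, 18],
     [27, 23, 19, 14, 11, 0, 15, 20],
     [25, 17, 3, 2, 16, 15, 0, 24],
     [13, 6, 26, 21, 18, 20, 24, 0]]

edges = [(W[i][j], i, j) for i in range(8) for j in range(i + 1, 8)]
tree = kruskal(8, edges)
for u, v, w in tree:
    print(f"c{u + 1} - c{v + 1}: weight {w}")
print("total weight:", sum(w for _, _, w in tree))   # total weight: 36
```

The edges are selected in the order of weights 1, 2, 4, 5, 6, 7, 11; the edge of weight 3 is skipped because it would close a cycle, and the total agrees with the caption of Fig. 2.21.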

At the end of this section we again consider a problem of graph enumeration and prove Cayley’s formula on the enumeration of labelled trees.

Theorem 2.3.7. There are $p^{p-2}$ nonisomorphic labelled trees with $p \ge 1$ vertices.

Proof. If $p = 1$, then $p^{p-2} = 1$, and the statement is obvious, since the unique labelled tree with one vertex is this isolated vertex labelled by 1. Now, let $p \ge 2$ and $T$ be a labelled tree of order $p$. Delete the end-vertex (pendant vertex) with the smallest label, record the label of the adjacent vertex, and repeat this step until only two vertices remain. This procedure generates a sequence of $p - 2$ natural numbers ranging from 1 through $p$ with possible repetitions. This sequence is called the Prüfer code of the tree $T$. By Theorem 1.1.35, there are $p^{p-2}$ such sequences, and since there is an obvious one-to-one correspondence between such codes and the labelled trees of order $p$, the proof is complete.

Figure 2.13. The initial graph without edges. All vertices are isolated.

Figure 2.14. First step of Kruskal’s algorithm. The first (non-spanning) subgraph with only one edge is formed.

Figure 2.15. A subtree with two edges.

Figure 2.16. This subgraph with three edges is not a tree.

Figure 2.17. The subgraph with three edges.

Example 2.3.8. For instance, if $p = 3$, then $p^{p-2} = 3$. All nonisomorphic labelled trees with 3 vertices are shown in Fig. 2.5.
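For small orders, Cayley's formula can be confirmed by brute force. The sketch below is an assumption-laden illustration rather than part of the proof: it enumerates all sets of p - 1 edges of the complete graph on p labelled vertices and keeps the acyclic ones, which are exactly the labelled trees by statement 3) of Theorem 2.3.2. The function name count_labelled_trees is ours.

```python
from itertools import combinations


def count_labelled_trees(p):
    """Count labelled trees on p >= 2 vertices by exhaustive search."""
    all_edges = list(combinations(range(p), 2))
    count = 0
    for subset in combinations(all_edges, p - 1):
        parent = list(range(p))

        def find(x):
            while parent[x] != x:
                x = parent[x]
            return x

        acyclic = True
        for u, v in subset:
            ru, rv = find(u), find(v)
            if ru == rv:          # u and v already connected: a cycle
                acyclic = False
                break
            parent[ru] = rv
        if acyclic:               # acyclic with p - 1 edges: a tree
            count += 1
    return count


for p in range(2, 6):
    print(p, count_labelled_trees(p), p ** (p - 2))
# 2 1 1
# 3 3 3
# 4 16 16
# 5 125 125
```

The search grows very quickly with p, so this check is practical only for the first few orders.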


Figure 2.18. This subgraph is not a tree, since it is not connected.

Figure 2.19. This subgraph with 5 edges is also a forest, not a tree.

Figure 2.20. Second-to-last step of the algorithm. Two subtrees merge into a tree with 6 edges. This subtree is not yet a spanning tree, since the vertex c6 is still isolated.

Figure 2.21. The minimum spanning tree of the initial graph in Problem 2.3.3; its weight is $w(T) = 36$.

Problem 2.3.4. Find a minimum spanning tree in the graph $G_8$ (Fig. 2.22).

Figure 2.22. Graph G8.

Corollary 2.3.9. Let $1 \le d_1 \le d_2 \le \cdots \le d_p$ be the degree sequence of a tree of order $p$. The number of labelled trees of order $p$ with this degree sequence is given by the multinomial coefficient (1.5.1),

$$C(p - 2;\, d_1 - 1, d_2 - 1, \ldots, d_p - 1) = \frac{(p - 2)!}{(d_1 - 1)! \cdots (d_p - 1)!}.$$

Proof. Indeed, $\sum_{i=1}^{p} d_i = 2p - 2$ by Lemma 2.1.12. If $v_i$ is a pendant vertex, then $d_i = 1$, and it is clear from the proof of Theorem 2.3.7 that its label does not appear in the Prüfer code at all. If $d_i \ge 2$, then before the vertex $v_i$ itself can be removed we must remove $d_i - 1$ of its adjacent vertices, hence $v_i$ appears in the Prüfer code $d_i - 1$ times. The same is true for any other vertex, so the trees in question correspond to the sequences of length $p - 2$ in which the label of $v_i$ occurs $d_i - 1$ times, which proves the corollary.
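A small worked instance may help to see how the formula of Corollary 2.3.9 is used. The numbers below are our own illustration for $p = 4$, with the vertices $v_1, \ldots, v_4$ carrying the prescribed degrees $d_1, \ldots, d_4$.

```latex
% Degrees (1, 1, 2, 2): v_1, v_2 are pendant, v_3, v_4 have degree 2.
C(2;\, 0, 0, 1, 1) = \frac{2!}{0!\, 0!\, 1!\, 1!} = 2 .
% The two trees are the paths v_1 v_3 v_4 v_2 and v_1 v_4 v_3 v_2,
% with Pruefer codes 34 and 43.

% Degrees (1, 1, 1, 3): a star centred at v_4.
C(2;\, 0, 0, 0, 2) = \frac{2!}{0!\, 0!\, 0!\, 2!} = 1 .
% The only such tree has the Pruefer code 44.
```

Summing over all ways to choose which labels receive which degrees gives 12 paths and 4 stars, that is, $4^{2} = 16$ labelled trees of order 4, in agreement with Theorem 2.3.7.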

Problem 2.3.5. Label the tree in Fig. 2.9 and compute its Prüfer code. Repeat this with another labelling of the same tree and compare their Prüfer codes.

Problem 2.3.6. Restore a labelled tree if its Prüfer code is 133132.

Solution. We do not distinguish vertices and their labels. From the proof of Theorem 2.3.7, we see that since the length of the code is 6, the tree must have $6 + 2 = 8$ vertices. The vertices (labels) 1, 2, and 3 are present in the code, thus they were not removed at the first deletion step; hence the very first vertex removed was 4, and this vertex was connected to 1. The next smallest vertex was 5 and it was connected to 3. The next vertex, which is 6, was connected to 3 again. The vertex removed after that was 7, and it must have been connected to 1. The vertex 1 does not appear in the code after that, thus now it is the smallest pendant vertex and we remove it, keeping in mind that it is adjacent to 3. At this stage only three vertices, 2, 3, and 8, remain, but we cannot remove 2 now, since it still appears in the code; hence we have to remove 3, which is adjacent to 2. Finally, the two remaining vertices 8 and 2 are connected, see Fig. 2.23.

Figure 2.23. This tree has the Prüfer code 133132 – Problem 2.3.6.
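The encoding procedure from the proof of Theorem 2.3.7 and the decoding used in the solution of Problem 2.3.6 can both be sketched in code. The function names prufer_encode and prufer_decode are our own, and the use of a heap to find the smallest pendant vertex is an implementation choice, not something the text prescribes. Vertices are labelled 1, ..., p as in the book.

```python
import heapq


def prufer_encode(p, edges):
    """Prüfer code of a labelled tree on the vertices 1..p (p >= 2)."""
    adj = {v: set() for v in range(1, p + 1)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    leaves = [v for v in adj if len(adj[v]) == 1]
    heapq.heapify(leaves)
    code = []
    for _ in range(p - 2):
        leaf = heapq.heappop(leaves)       # smallest pendant vertex
        (neighbour,) = adj[leaf]           # its unique neighbour
        code.append(neighbour)
        adj[neighbour].remove(leaf)
        if len(adj[neighbour]) == 1:       # the neighbour became pendant
            heapq.heappush(leaves, neighbour)
    return code


def prufer_decode(code):
    """Edges of the labelled tree on 1..len(code)+2 with the given code."""
    p = len(code) + 2
    degree = [1] * (p + 1)                 # index 0 is unused
    for v in code:
        degree[v] += 1                     # v occurs deg(v) - 1 times
    leaves = [v for v in range(1, p + 1) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for v in code:
        leaf = heapq.heappop(leaves)
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1:
            heapq.heappush(leaves, v)
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges


edges = prufer_decode([1, 3, 3, 1, 3, 2])
print(edges)
# [(4, 1), (5, 3), (6, 3), (7, 1), (1, 3), (3, 2), (2, 8)]
print(prufer_encode(8, edges))
# [1, 3, 3, 1, 3, 2]
```

The decoded edges appear in the same order as in the solution of Problem 2.3.6, and encoding the resulting tree returns the original code.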


EP 2.3.1. Draw all non-isomorphic trees with 5 vertices and those with 6 vertices.

EP 2.3.2. What graphs coincide with their spanning trees?

EP 2.3.3. Draw a diagram having only one spanning tree.


EP 2.3.4. 41 points in the plane are connected by straight segments, such that any two points are connected by either a segment or a broken line, and for any two points this broken line is unique. Prove that there are precisely 40 segments connecting the points.

EP 2.3.5. Find all spanning trees of the graph $g_1$ (Fig. 2.7) and those of the tree $T$ (Fig. 2.10).

EP 2.3.6. How many non-isomorphic spanning trees does the bipartite graph $K_{3,3}$ have?

EP 2.3.7. $n$ towns are connected by highways without intersections, such that from each town a driver can reach every other town, and there is only one route between any two towns. Prove that the number of highways is $n - 1$.

EP 2.3.8. How many non-isomorphic trees with $n$ vertices are there if the degree of any vertex is no more than 2?

EP 2.3.9. Prove that in any simple finite graph $G = (V, E, f)$

$$2q \le (p - cc(G))(p - cc(G) + 1).$$

EP 2.3.10. Draw the labelled trees with the Prüfer codes 234, 3123, 4444, 7485553. Is there a labelled tree with the Prüfer code 126?

EP 2.3.11. Prove that there is a one-to-one correspondence between the non-isomorphic labelled trees and Prüfer codes, see Theorem 2.3.7.

EP 2.3.12. A tree has $p$ vertices. What is the largest possible number of its pendant vertices?

EP 2.3.13. Prove that in any tree of order $p \ge 2$ there are at least two pendant vertices. Moreover, a stronger statement holds true: any acyclic graph of order $p \ge 2$ has at least two pendant vertices.

EP 2.3.14. Prove that a graph is a forest if and only if for any two distinct vertices there is at most one path connecting them.

EP 2.3.15. Generalize Theorem 2.3.2 to forests: If a forest of $t$ trees has $v$ vertices and $d$ edges, then $v = d + t$.

EP 2.3.16. A forest has 67 vertices and 35 edges. How many connected components does it have?


EP 2.3.17. Cayley’s second formula. For $1 \le k \le n$, prove that there are $k(n + k - 1)^{n-2}$ labelled forests with $n + k - 1$ vertices and $k$ connected components, such that $k$ distinguished vertices belong to different connected components.

EP 2.3.18. Prove that $1 \le d_1 \le d_2 \le \cdots \le d_p$ is the degree sequence of a tree of order $p$ if and only if $\sum_{i=1}^{p} d_i = 2p - 2$.

EP 2.3.19. Prove that every sequence of integer numbers

$$1 \le d_1 \le d_2 \le \cdots \le d_p,$$

such that $\sum_{i=1}^{p} d_i = 2p - 2k$, $k \ge 1$, is the degree sequence of a forest with $k$ connected components.

EP 2.3.20. Prove that $F(n)$, the number of labelled forests of order $n$, satisfies the recurrence relation

$$F(n) = \sum_{k=1}^{n} C(n - 1, k - 1)\, k^{k-2} F(n - k).$$

EP 2.3.21. Theorem 2.3.2 claims that a tree of order $p$ has $p - 1$ edges. For non-acyclic graphs this conclusion clearly fails. Nevertheless, prove that a connected graph of order $p$ must have at least $p - 1$ edges.

EP 2.3.22. How many edges are to be removed from a connected graph with 12 vertices and 15 edges to generate a spanning tree of the graph? Does this number depend upon the order in which the edges are being removed?

EP 2.3.23. There are 300 cities in a state and 3000 highways connecting them, such that each city is connected with at least one other city. How many of the highways can simultaneously be closed for repair if no city should be completely isolated from the others?

EP 2.3.24. Show that a graph is connected if and only if it has a spanning tree.

2.4 Eulerian Graphs

This section is concerned with edge traversal problems.

Euler’s biography · The Seven Bridges of Königsberg · Fleury’s algorithm · Who is Fleury? · Eulerian graphs · Hamilton’s biography

The next problem should remind the reader of an old puzzle.


Problem 2.4.1. Can you draw either of the two graphs in Fig. 2.24 without traversing an edge twice and without interrupting the drawing (that is, the pencil must not leave the paper)?

Definition 2.4.1. A circuit (a trail) in a graph is called Eulerian if it contains all edges of the graph. A graph is called Eulerian if it contains an Eulerian circuit. A graph is called semi-Eulerian if it contains an Eulerian trail.

Figure 2.24. Is any of these “envelopes” Eulerian? Semi-Eulerian?

The results of this section essentially depend on the parity of the vertex degrees of a graph, that is, whether the degree is even or odd. We call a vertex even (odd) if its degree is even (odd).

Problem 2.4.2. Is there a graph with just one odd vertex?

Theorem 2.4.2. A connected graph is Eulerian if and only if it has only even vertices. A connected graph is semi-Eulerian if and only if it contains exactly two odd vertices.

Problem 2.4.3. Is any of the graphs in Fig. 2.24 Eulerian? Is any of them semi-Eulerian?

Proof of Theorem 2.4.2. The necessity of these conditions, including the connectedness, is obvious. Indeed, if we traverse an Eulerian circuit and remove every edge traversed, then after passing through any vertex its degree decreases by 2, so that the parity of any vertex’s degree does not change. After completing the route we must arrive at the initial vertex, having traversed and removed behind us every edge. Thus, the degree of each vertex gradually reduces to zero, implying that initially the degree was even. In the case of semi-Eulerian graphs the same argument works if we begin at either one of the two odd vertices; we must finally arrive at the other odd vertex.

We prove the sufficiency by induction on the size $q$ of the graph. Begin again with the Eulerian case. If $q = 1$, the graph consists of one vertex with an attached loop, so the statement is obvious. Suppose now that the statement is valid for all connected graphs of size $|E| < q$, $q \ge 2$, with all even vertices, and consider a connected graph $G = (V, E)$, $|E| = q$, with all even vertices. By Lemma 2.2.9, the graph $G$ contains a cycle $C$. If this cycle includes all the edges of $G$, there is nothing more to prove. Otherwise, we remove all edges of the cycle $C$ from $G$, which can result in decomposing $G$ into several connected components, $G_1, \ldots, G_l$, of smaller sizes.

Since $G$ is connected, every component $G_i$ contains some vertex $v_i \in C$, whose degree in $G$ must be at least 4, so that its degree in $G_i$ is at least 2. By the inductive assumption, each of $G_i$, $1 \le i \le l$, has an Eulerian circuit $C_i$, and we conclude that $v_i \in C_i$ as well. It is now obvious how to assemble all the circuits $C, C_1, \ldots, C_l$ into an Eulerian circuit in the graph $G$.

To consider the case of semi-Eulerian graphs, we connect the two existing odd vertices by an additional edge, thus making the graph Eulerian, and apply the statement we have just proved.
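Theorem 2.4.2 reduces the question to counting odd vertices, which is straightforward to do from an adjacency matrix. The following sketch assumes that the graph is already known to be connected; the function name euler_type is our own, and the first sample matrix is the 4 x 4 block $A_1$ of Example 2.2.16 (the component $g_1$).

```python
def euler_type(A):
    """Classify a CONNECTED graph given by its adjacency matrix.

    Connectedness is assumed, not checked. A loop contributes 2 to the
    degree, so the diagonal entry is added to the row sum.
    """
    odd = [j for j, row in enumerate(A) if (sum(row) + row[j]) % 2 == 1]
    if len(odd) == 0:
        return "Eulerian"
    if len(odd) == 2:
        return "semi-Eulerian"
    return "neither"


# The component g1 from Example 2.2.16: degrees 3, 2, 3, 4.
A1 = [[0, 1, 2, 0],
      [1, 0, 0, 1],
      [2, 0, 0, 1],
      [0, 1, 1, 1]]
print(euler_type(A1))        # semi-Eulerian

triangle = [[0, 1, 1],
            [1, 0, 1],
            [1, 1, 0]]
print(euler_type(triangle))  # Eulerian

K4 = [[0, 1, 1, 1],
      [1, 0, 1, 1],
      [1, 1, 0, 1],
      [1, 1, 1, 0]]
print(euler_type(K4))        # neither
```

For $g_1$ the two odd vertices are $v_1$ and $v_3$, so any Eulerian trail must start at one of them and end at the other.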


EP 2.4.1. Examine the graphs in Figures 2.1–2.23: which of them are Eulerian? Semi-Eulerian? Find, if any, semi-Eulerian trails or Eulerian circuits in these graphs.

EP 2.4.2. In the proof of Theorem 2.4.2 we claim that each component $G_i$ has a vertex such that its degree in $G$ is at least 4. Prove this claim.

EP 2.4.3. Draw the floor plans of the buildings on your campus where you are (were) taking your classes. Draw a graph representing each room with a vertex, such that two vertices are connected by an edge if the corresponding rooms have a common wall. Are these graphs Eulerian? Semi-Eulerian? Find, if any, semi-Eulerian trails or Eulerian circuits in these graphs.

EP 2.4.4. Prove that the following procedure, called Fleury’s algorithm, returns an Eulerian circuit in any Eulerian graph:

Start at any vertex and pass any edge incident to this vertex. Remove the edge passed and go through any other edge incident to the vertex reached, subject to the only restriction: a bridge can be used only if there is no other edge available.

EP 2.4.5. Apply Fleury’s algorithm to those graphs in Figures 2.1–2.24 which are Eulerian or semi-Eulerian, and find semi-Eulerian trails or Eulerian circuits in those graphs.

To consider vertex traversal problems, we introduce Hamiltonian graphs.


Definition 2.4.3. A path (circuit) without repeating vertices in a graph $G$ is called Hamiltonian if it contains every vertex of $G$. A graph is called Hamiltonian if it has a Hamiltonian path.

EP 2.4.6. Prove the following necessary condition for the existence of a Hamiltonian circuit: If a graph $G$ contains a Hamiltonian circuit, then it contains a connected spanning subgraph $H$ whose order and size are equal and in which the degree of every vertex is 2.

In the opposite direction, prove the following sufficient condition: If in a simple graph $G = (V, E)$ of order $p \ge 3$,

$$\deg(v) \ge p/2, \quad \forall v \in V,$$

then $G$ has a Hamiltonian circuit.

EP 2.4.7. Find a Hamiltonian circuit in the complete graph K5.

EP 2.4.8. Is any of the graphs in Fig. 2.1–2.24 Hamiltonian?

EP 2.4.9. Extend Definition 2.4.1 and Theorem 2.4.2 to digraphs. Is the digraph $G$ (Fig. 2.2) Eulerian or semi-Eulerian? Find, if any, the corresponding directed paths.

2.5 Planarity

We prove in this section a remarkable property of planar graphs, also related to the name of Euler, called Euler’s formula or Euler’s polyhedron theorem.

Jordan’s biography · Kuratowski’s biography

Definition 2.5.1. A regular embedding of a planar graph in $\mathbb{R}^2$ is called a plane graph. Therefore, the edges of a plane diagram cannot have common points, except possibly at the vertices of the graph.

For example, the diagram $g_0$ (Fig. 2.3) is not plane, since the edges �2 and �6 intersect at a point which is not a vertex of the graph, but $g_0$ is planar because it can easily be redrawn without this intersection.

Consider a connected plane graph consisting of two vertices connected by an edge. If we choose two arbitrary points in the plane outside of this graph, it is possible to connect them with a smooth curve having no common point with the graph. Consider now a connected graph with cycles, say $g_1$ (Fig. 2.7). The edges of $g_1$ split the entire plane into four separate regions: one unbounded external domain and three bounded ones, namely one interior to the loop $e_6$, one between the parallel edges $e_2$ and $e_3$, and the last one bounded by the edges $e_1, e_3, e_4, e_5$. These regions are called faces of the graph $g_1$. Any two points inside a face can be connected by a smooth curve lying completely inside the face, and it is (almost) obvious³ that if two points lie in different faces, they cannot be connected by a smooth plane curve unless this curve intersects at least one edge of the graph.

Theorem 2.5.2 (Euler). Let $G$ be a connected plane graph of order $p$ and size $q$. If $G$ has $f$ faces, including an unbounded face, then

$$p - q + f = 2. \tag{2.5.1}$$

The expression $p - q + f$ is called the Eulerian characteristic of a graph.

Proof. We carry out mathematical induction on the number of faces $f$. If the graph $G$ has at least one bounded face, then the boundary of the latter consists of edges of $G$, and these edges form a cycle; cf. Fig. 2.7. Thus, if $f = 1$, then this face must be unbounded, therefore $G$ is a tree, hence $p = q + 1$ by Theorem 2.3.2 and the conclusion follows straightforwardly as $p - q + f = (q + 1) - q + 1 = 2$.

Now fix $f \ge 2$, assume that (2.5.1) is valid for all plane connected graphs with fewer than $f$ faces, and consider a graph $G$ with $f$ faces. At least one of these faces is bounded, thus the boundary of this face is a cycle, so that the edges making up this cycle are not bridges. Moreover, by virtue of planarity such an edge must belong to the boundaries of two faces. The deletion of any such edge generates a plane connected graph $G'$ of the same order $p$, of size $q - 1$, and with $f - 1$ faces. Applying the inductive assumption to the graph $G'$, we get the equation $p - (q - 1) + (f - 1) = 2$, which immediately reduces to (2.5.1).

For example, if $G$ is a tree with $p$ vertices, then $q = p - 1$, $f = 1$, thus $p - (p - 1) + 1 = 2$, in agreement with (2.5.1).
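Formula (2.5.1) can also be read as a recipe for counting faces, $f = 2 - p + q$. The short Python sketch below only performs this arithmetic; the function name `faces` and the sample graphs (a tree on seven vertices, a 5-cycle, and the graph of a cube) are illustrations chosen here, not examples taken from the text.

```python
def faces(p, q):
    """Faces of a connected plane graph with p vertices and q edges, by (2.5.1)."""
    return 2 - p + q


# A tree on 7 vertices has 6 edges and only the unbounded face.
print(faces(7, 6))    # 1
# A single cycle on 5 vertices has an inside and an outside.
print(faces(5, 5))    # 2
# The graph of a cube drawn in the plane: 8 vertices, 12 edges.
print(faces(8, 12))   # 6

# The inductive step of the proof: deleting an edge of a cycle lowers
# both q and f by one, so p - q + f does not change.
p, q = 8, 12
f = faces(p, q)
assert p - (q - 1) + (f - 1) == p - q + f == 2
```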

Corollary 2.5.3. In a connected simple plane graph $G$ of order $p \ge 3$ and size $q$,

$$q \le 3p - 6. \tag{2.5.2}$$

Proof. Since $G$ is simple, it cannot have parallel edges, thus every face is bounded by at least three edges, and every edge adjoins at most two faces, hence $3f \le 2q$. Combining this inequality with (2.5.1) we deduce (2.5.2): substituting $f = 2 - p + q$ gives $3(2 - p + q) \le 2q$, that is, $q \le 3p - 6$.

Problem 2.5.1. Prove that any cycle in a bipartite graph has an even length, that is, it consists of an even number of edges.

³ This statement is a deep topological theorem by C. Jordan.


Corollary 2.5.4. The complete graph $K_5$ and the complete bipartite graph (the Thomsen graph) $K_{3,3}$ shown in Fig. 2.25 are not planar.

Proof. If the graph $K_5$ were planar, (2.5.2) would immediately lead to a contradiction: here $p = 5$ and $q = 10$, while $3p - 6 = 9$. As for $K_{3,3}$, every cycle in it has an even length by Problem 2.5.1, thus its length is at least 4, therefore the inequality $3f \le 2q$ of the preceding corollary can be strengthened to $4f \le 2q$, which together with (2.5.1) gives, exactly as in the proof of (2.5.2), $q \le 2p - 4$ for any connected simple plane bipartite graph of order $p \ge 3$. In the case of $K_{3,3}$, $p = 6$, $q = 9$, and the latter becomes $9 \le 8$. This contradiction shows that $K_{3,3}$ cannot be planar.

Figure 2.25. Graphs $K_5$ and $K_{3,3}$.
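The arithmetic behind Corollary 2.5.4 is easy to replay. In the following Python sketch the two function names are hypothetical and simply encode the bounds $q \le 3p - 6$ and $q \le 2p - 4$. Note that these bounds are only necessary conditions: a graph that satisfies them may still fail to be planar.

```python
from math import comb


def exceeds_simple_bound(p, q):
    """True if q > 3p - 6, so that (2.5.2) rules out planarity (p >= 3)."""
    return q > 3 * p - 6


def exceeds_bipartite_bound(p, q):
    """True if q > 2p - 4, the bound used in the proof for bipartite graphs."""
    return q > 2 * p - 4


k5 = (5, comb(5, 2))   # (order, size) = (5, 10)
k33 = (6, 3 * 3)       # (order, size) = (6, 9)

print(exceeds_simple_bound(*k5))       # True:  10 > 9
print(exceeds_simple_bound(*k33))      # False: 9 <= 12, so (2.5.2) alone does not help
print(exceeds_bipartite_bound(*k33))   # True:  9 > 8
```

The second line of output shows why the proof needs the sharper bipartite bound for $K_{3,3}$.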

It turns out that all the non-planarity in our world is due to these two simple non-planar graphs, as follows from the following result. We state it without a proof, which can be found, for example, in [8, p. 24].

Theorem 2.5.5 (Pontryagin–Kuratowski⁴). A graph $G$ is planar if and only if it does not contain a subgraph that can be derived from $K_{3,3}$ or $K_5$ by subdividing some of their edges by inserting additional vertices.


EP 2.5.1. Is there a planar graph with 6 vertices, each of them of degree 3? Of degree 4? Of degree 5?

EP 2.5.2. A diagram with 9 vertices and 9 edges is embedded into a plane. Each vertex has the degree of 3 and every edge is incident to 3 vertices. Draw the diagram.

EP 2.5.3. A connected planar graph of order 6 has 5 vertices of degree 3 and a vertex of degree 1. In how many regions does the graph divide the plane? Draw the corresponding diagram.

⁴ According to some sources, this theorem was proved by Pontryagin a few years before Kuratowski, but that proof was not published.

EP 2.5.4. Five neighboring cities decided to build highways connecting each of them with every other city. Is it possible to build this road network without intersections or over- and underpasses?

EP 2.5.5. Give an example of a connected but not complete graph.

EP 2.5.6. Draw complete bipartite graphs $K_{1,1}$ through $K_{3,4}$.

EP 2.5.7. Prove that any tree of order $p \ge 2$ is bipartite.

EP 2.5.8. The converse of the preceding problem is false: not every bipartite graph is a tree. Find an example of such a bipartite graph. For which $m$ and $n$ are the complete bipartite graphs $K_{m,n}$ trees?

EP 2.5.9. Prove that a simple graph $G$ is bipartite if and only if it does not contain an odd circuit. Moreover, $G$ is bipartite if and only if it does not contain an odd cycle.

EP 2.5.10. Let $\delta$ be the least vertex degree in a simple graph of order $p$. Prove that if $\delta \ge (p - 1)/2$, then the graph is connected. Compare this conclusion with Corollary 2.5.3.

EP 2.5.11. Is either of the graphs in Fig. 2.26 planar?

Figure 2.26. Is either of these graphs planar?

EP 2.5.12. Prove that if several lines in a plane divide the plane in $p$ parts, then $t - l + p = 1$, where $t$ is the number of intersection points of these lines, and $l$ is the number of pieces the lines are split by the points of intersection.

EP 2.5.13. Give examples of graphs whose Eulerian characteristic is not 2.

Chapter 3

Hierarchical Clustering and Graphs

Given a set of objects, cluster analysis aims at splitting this set into separate groups according to a certain prescribed measure of the proximity of the given objects. In this chapter we apply graph theory to develop simple clustering algorithms. These algorithms essentially use the notion of a spanning tree, which was developed in Section 2.3.

3.1 Introduction

In this section we introduce basic terminology of hierarchical cluster analysis.

Clustering

A student wants to put some money in mutual funds. To make the right choice, she considers many different funds, analyzing their characteristics such as long-term and short-term performance, the manager’s philosophy of investing, administrative costs, and other features. Comparing various funds, she can pick a few funds that look suitable for her goals. The things under consideration, like mutual funds, are called objects or entities. Certain properties of objects, like performance or attitude to risk, are called features, or variables, or attributes. However, if every object is characterized by several variables, it is difficult to compare different objects, and we want to have a kind of “common denominator” to be able to measure the similarity of the objects.

We can separate all mutual funds under consideration into several groups containing similar funds. Such a classification is useful on many occasions. For instance, if the investor learns of a new fund within a short time after its inception, it is difficult, without any information, to make a prediction of the fund’s future performance based on its own history. However, if the student can include the fund in a group of similar funds, she can apply the information on the whole group to each element of the group and make more reliable predictions. Furthermore, if we have many similar objects, it is often just impossible to study every one of them separately, but we can study a representative of each group of similar objects and apply the information found to every item¹.

To perform such analysis, we must first separate the objects into smaller groups, called clusters (overlapping groups are sometimes called clumps). This process, called clustering, is an essential part of cluster analysis. In this chapter we discuss some basic concepts and algorithms of this subject. For more on cluster analysis the reader can consult, for example, [12, 15, 18, 25, 29, 30, 36].

Obviously, the objects combined in a group should be similar, that is, should have some common features. In cluster analysis, however, it is often more convenient to measure the dissimilarity rather than the similarity of various objects. The more two objects have in common, the less is their dissimilarity. Ultimately, the similarity of identical objects is infinite and their dissimilarity is zero. We do not discuss here how to assign the (dis)similarity values to multivariate objects, because it essentially depends upon particularities of a specific problem. We assume that the dissimilarities are assigned in advance: given a set of objects to be explored, we are provided with a table of their dissimilarities, called the dissimilarity table or dissimilarity matrix.

Definition 3.1.1. A square symmetric matrix (table) with non-negative elements,whose main diagonal contains only zeros, is called a dissimilarity matrix (table).

Example 3.1.2. Table 3.1 contains the average altitudes above sea level of fifteen states in the U.S. If we are interested in the altitudes only, we can consider the absolute values of the differences between the altitudes as a measure of the dissimilarity between two states. Even though this difference is not the real geographical distance, similar quantities, subject to certain conditions, are called in mathematics distances or metrics. In this sense, the dissimilarity between Alabama and Delaware is $50 - 6 = 44$, the dissimilarity between Florida and Georgia is $|10 - 60| = 50$, and the dissimilarity between Florida and Louisiana is 0; unlike a mathematical metric, the dissimilarity of two different objects can be 0. Table 3.1 can be transformed into the Dissimilarity Table 3.2, where the main diagonal contains only zeros since each object is absolutely similar to itself. We have completed only the upper triangle, because the table is symmetrical with respect to the main diagonal.

State      AL   DE   FL   GA   KY   LA   MD   MO   MS   NC   SC   TN   TX   VA   WV
Altitude   50    6   10   60   75   10   35   80   30   70   35   90  170   95  150

Table 3.1. The rounded average altitudes above sea level of fifteen southern states in the U.S. (in tens of feet).

¹ This is similar to partitioning a set into the disjoint equivalence classes and studying the representatives of these classes instead of the entire original set.


     AL   DE   FL   GA   KY   LA   MD   MO   MS   NC   SC   TN   TX   VA   WV
AL    0   44   40   10   25   40   15   30   20   20   15   40  120   45  100
DE         0    4   54   69    4   29   74   24   64   29   84  164   89  144
FL              0   50   65    0   25   70   20   60   25   80  160   85  140
GA                   0   15   50   25   20   30   10   25   30  110   35   90
KY                        0   65   40    5   45    5   40   15   95   20   75
LA                             0   25   70   20   60   25   80  160   85  140
MD                                  0   45    5   35    0   55  135   60  115
MO                                       0   50   10   45   10   90   15   70
MS                                            0   40    5   60  140   65  120
NC                                                 0   35   20  100   25   80
SC                                                      0   55  135   60  115
TN                                                           0   80    5   60
TX                                                                0   75   20
VA                                                                     0   55
WV                                                                          0

Table 3.2. Dissimilarity table for the average altitudes.
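The passage from Table 3.1 to Table 3.2 is mechanical, and a few lines of Python can reproduce it. This is only a sketch: the dictionary `ALTITUDES` copies Table 3.1, while the function name `dissimilarity_matrix` and the helper `idx` are introduced here for illustration.

```python
# Rounded average altitudes from Table 3.1 (in tens of feet).
ALTITUDES = {
    "AL": 50, "DE": 6, "FL": 10, "GA": 60, "KY": 75,
    "LA": 10, "MD": 35, "MO": 80, "MS": 30, "NC": 70,
    "SC": 35, "TN": 90, "TX": 170, "VA": 95, "WV": 150,
}


def dissimilarity_matrix(values):
    """Return the labels and the full matrix of absolute differences."""
    names = list(values)
    matrix = [[abs(values[a] - values[b]) for b in names] for a in names]
    return names, matrix


names, D = dissimilarity_matrix(ALTITUDES)
idx = {name: k for k, name in enumerate(names)}

print(D[idx["AL"]][idx["DE"]])  # 44
print(D[idx["FL"]][idx["GA"]])  # 50
print(D[idx["FL"]][idx["LA"]])  # 0

# The requirements of Definition 3.1.1: zero diagonal, symmetry, non-negativity.
n = len(names)
assert all(D[i][i] == 0 for i in range(n))
assert all(D[i][j] == D[j][i] >= 0 for i in range(n) for j in range(n))
```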

We can construct different clusterings depending upon the level of dissimilarity we are willing to accept; this level is called a threshold value or just a threshold. That is, we can form different partitions (clusterings) of the fifteen states. For instance, the following is a partition of these states into ten clusters with a threshold value of 5, that is, the maximum distance between any two objects in each of the following clusters does not exceed 5:

{DE, FL, LA}, {MD, MS, SC}, {AL}, {GA},
{NC}, {KY, MO}, {TN}, {VA}, {TX}, {WV}.

We can also set up another clustering with the same dissimilarity level of 5 but now with nine clusters,

{DE, FL, LA}, {MD, MS, SC}, {AL}, {GA},
{NC, KY}, {MO}, {TN, VA}, {TX}, {WV}.

We see that in general this procedure is not unique. If we select a bigger threshold level of 10, then the corresponding clustering may be the following one, containing just eight clusters,

{DE, FL, LA}, {MD, MS, SC}, {AL, GA},
{NC, KY, MO}, {TN}, {VA}, {TX}, {WV}.


It is also clear, and we see it in the example above, that if we increase the threshold value, some clusters may merge (amalgamate) into bigger ones.
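The three clusterings above can be checked against their thresholds directly. In this Python sketch the altitudes are copied from Table 3.1; the names `diameter` and `respects`, and the labels `first`, `second`, `third`, are assumptions made for the illustration.

```python
from itertools import combinations

ALTITUDES = {
    "AL": 50, "DE": 6, "FL": 10, "GA": 60, "KY": 75,
    "LA": 10, "MD": 35, "MO": 80, "MS": 30, "NC": 70,
    "SC": 35, "TN": 90, "TX": 170, "VA": 95, "WV": 150,
}


def diameter(cluster):
    """Largest dissimilarity between two members (0 for a single object)."""
    return max(
        (abs(ALTITUDES[a] - ALTITUDES[b]) for a, b in combinations(cluster, 2)),
        default=0,
    )


def respects(clustering, threshold):
    """True if no cluster contains two objects further apart than the threshold."""
    return all(diameter(c) <= threshold for c in clustering)


first = [["DE", "FL", "LA"], ["MD", "MS", "SC"], ["AL"], ["GA"], ["NC"],
         ["KY", "MO"], ["TN"], ["VA"], ["TX"], ["WV"]]
second = [["DE", "FL", "LA"], ["MD", "MS", "SC"], ["AL"], ["GA"],
          ["NC", "KY"], ["MO"], ["TN", "VA"], ["TX"], ["WV"]]
third = [["DE", "FL", "LA"], ["MD", "MS", "SC"], ["AL", "GA"],
         ["NC", "KY", "MO"], ["TN"], ["VA"], ["TX"], ["WV"]]

print(len(first), respects(first, 5))     # 10 True
print(len(second), respects(second, 5))   # 9 True
print(len(third), respects(third, 10))    # 8 True
print(respects(third, 5))                 # False: AL and GA differ by 10
```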

Compare these clusterings. While deriving the second clustering, we had to relocate some objects, and the group {TN, VA} of the second clustering does not belong completely to any cluster in the third clustering. On the other hand, every cluster of the first clustering is contained completely in some cluster of the third one. A process that makes a series of consecutive clusterings such that every cluster of the preceding level is a subset of a cluster on the next level is called hierarchical clustering. We begin with the completely disjoint clustering, where every object forms its own single-element cluster. Then, step by step, we merge (amalgamate) two or more clusters with the smallest dissimilarity into larger ones, until we reach a threshold value. Such algorithms are called agglomerative.
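The nesting property that defines a hierarchy is simple to test. The sketch below uses a hypothetical helper `is_refinement` and the three clusterings of the fifteen states written as Python sets.

```python
def is_refinement(fine, coarse):
    """True if every cluster of `fine` lies inside some cluster of `coarse`."""
    return all(any(c <= big for big in coarse) for c in fine)


first = [{"DE", "FL", "LA"}, {"MD", "MS", "SC"}, {"AL"}, {"GA"}, {"NC"},
         {"KY", "MO"}, {"TN"}, {"VA"}, {"TX"}, {"WV"}]
second = [{"DE", "FL", "LA"}, {"MD", "MS", "SC"}, {"AL"}, {"GA"},
          {"NC", "KY"}, {"MO"}, {"TN", "VA"}, {"TX"}, {"WV"}]
third = [{"DE", "FL", "LA"}, {"MD", "MS", "SC"}, {"AL", "GA"},
         {"NC", "KY", "MO"}, {"TN"}, {"VA"}, {"TX"}, {"WV"}]

print(is_refinement(first, third))    # True
print(is_refinement(second, third))   # False: {TN, VA} is split in the third clustering
```

Only the pair (first, third) can therefore belong to one hierarchy.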

We can also proceed in the opposite direction. Namely, we can depart from a conjoint clustering, when the initial cluster contains all the objects under consideration, and split it repeatedly into smaller groups, until we reach either the threshold value or the completely disjoint clustering. Such algorithms are called divisive.

Problems involving classification of real data cannot be reduced only to applying a mathematical clustering algorithm. Before that, the data must be collected and consistently presented, and the dissimilarity values must be assigned. After building the clusters, they must be validated and assessed. The results have to be properly interpreted. All these are crucial issues, because any algorithm generates some clustering, but without further considerations we cannot conclude whether the clusters derived reflect the real structure of data or are just an artifact of the algorithm. We leave out all these issues along with the problem of computer implementation.

In this chapter we consider only agglomerative hierarchical algorithms for clustering discrete sets of data. These algorithms are based on the properties of the graph representing the initial collection of objects. In Section 3.2, using a small model example, we develop a simple single-link hierarchical clustering algorithm. It is called single-link, because at every step we connect two existing clusters by a single edge (link) of the underlying graph. Section 3.3 is devoted to Hubert’s single-link algorithm. We discuss relations of the single-link hierarchical clustering algorithm with minimum spanning trees. Section 3.4 is devoted to another hierarchical clustering algorithm, Hubert’s complete-link algorithm. In Section 3.5 we apply the single-link algorithm to a more realistic problem. We also validate the clustering derived in this example, by making use of Pearson’s coefficient of correlation, thus demonstrating the quality of the clustering developed.


EP 3.1.1. Construct other clusterings of these fifteen states (Table 3.1) with the same threshold levels of 10 or 5.


EP 3.1.2. Find a clustering of these fifteen states corresponding to the threshold value 1; corresponding to the threshold value of 164.

EP 3.1.3. Construct clusterings of these fifteen states consisting of four or five clusters. Find corresponding threshold values.

EP 3.1.4. Find a dissimilarity level that generates the unique clustering in this problem.

3.2 Model Example

Here we consider a simple but not simplistic model example to introduce a classical agglomerative algorithm of hierarchical clustering.

Cluster Analysis

There are eight cities, $c_1, c_2, \ldots, c_8$, in a region. It is necessary to connect all of them by highways. It is possible to build a highway connecting every pair of cities. In graph theory terms, such a road network can be described as the complete graph $K_8$. This network contains $C(8, 2) = 28$ highways and is rather expensive. On the other hand, it is possible to link every city with only one of the other cities, thus having a minimal number of roads built. Even though it is less expensive to construct, this network, which can be modelled by a spanning tree with 8 vertices and, therefore, with 7 edges, may be inconvenient for the commuters, who will have to waste their time and fuel, because many pairs of the cities do not have a direct connection.

Thus, a local mathematician has offered an intermediate approach, namely, to split all the cities into several groups, or clusters. The cities within each cluster are to be connected completely, but any two different clusters should be connected by only one road. A cluster should, obviously, include the cities that are close to each other. However, the closeness can be measured in many different ways. A reasonable way to measure the closeness is to use the number of commuters between the cities.

The information about the average number of commuters in both directions between the cities, in thousands of people per day, is summarized, in the form of dissimilarities, in Table 3.3. For instance, the numbers of commuters are 24 between the cities $c_1$ and $c_2$, 2 between $c_1$ and $c_6$, and 6 between $c_2$ and $c_6$. Thus, there is a large flow of commuters between $c_1$ and $c_2$; in this sense these two cities are close, even though geographically they may be located far away from one another. Therefore, they are to be considered similar and should be placed in one cluster. Yet, $c_6$ is distant from them. However, if we used these quantities (24, 2, 6, etc.) as a measure of closeness (a generalized distance), then the “distance” between close cities would be greater than that between distant ones. In this problem and, as we have already mentioned, in cluster analysis generally, it is often more suitable to use the dissimilarities of objects rather than their similarities. We can always convert the commuter data into dissimilarity values, for example, by taking inverses or subtracting from some maximum value. The three figures quoted above agree with the entries of Table 3.3 if each number of commuters is subtracted from 29: $29 - 24 = 5$, $29 - 2 = 27$, and $29 - 6 = 23$.

     c1   c2   c3   c4   c5   c6   c7   c8
c1    0    5   10    7   22   27   25   13
c2         0    8   12   28   23   17    6
c3              0    1    9   19    3   26
c4                   0    4   14    2   21
c5                        0   11   16   18
c6                             0   15   20
c7                                  0   24
c8                                       0

Table 3.3. The dissimilarity table for the model example.

We consider the total number of commuters in both directions, so Table 3.3 is symmetrical with respect to the main diagonal and we filled in only the upper triangle of the table. Moreover, since we want to start with a simple example, all the entries are different (the table contains no ties) and they are all natural numbers from 1 to $C(8, 2) = 28$. The mathematician must now solve the problem of combining the cities into clusters according to this dissimilarity matrix.

We first develop a simple intuitive algorithm, which starts with one-element clusters and step by step combines them until we reach some goal, which should be set in advance. At the initial step, the algorithm treats each given object as a single-element cluster. The set of these clusters is called the clustering of level zero. If we use the graph-theory language, we can depict this clustering as a graph having only isolated vertices, with no edge. Then, at every step, the algorithm uses only one edge to merge (agglomerate) the two closest clusters, that is, those with the smallest dissimilarity, into a new cluster of the next level. Such an edge connecting two clusters of the same level into a cluster of the next level is called a link. That is why this and similar procedures are called single-link algorithms or single linkage.

We begin with a descriptive version of an agglomerative single-link clustering algorithm, then apply it to the dissimilarity table above and give a more formal treatment of the algorithm. In Section 3.3 we present a version of the algorithm known as Hubert’s single-link algorithm [28]. This algorithm is based on the notion of spanning trees.

Denote the consecutive clusterings by boldface capital letters with one index, $\mathbf{C}_0$, $\mathbf{C}_1$, $\mathbf{C}_2$, and so forth. The italic capital letters with double indices, $C_{k,l}$, denote clusters: the first index, $k$, indicates the level of clustering and the second index, $l$, stands for the number of this particular cluster in the clustering of the $k$th level. Thus, $C_{3,4}$ denotes the fourth cluster in the third-level clustering $\mathbf{C}_3$. Now we build clusterings for the model example. In our notation, $\{c_i, c_j\}$ is a pair (two-element set) comprising the $i$th city $c_i$ and the $j$th city $c_j$, and the number $d(c_i, c_j)$, or $d(i, j)$ for short, at the crossing of the $i$th row and $j$th column of the dissimilarity matrix, stands for the dissimilarity of these two cities; due to the symmetry, $d(i, j) = d(j, i)$. First, we rearrange all pairs of the cities in the ascending order of their dissimilarities; see Table 3.4.

Pair $\{c_i, c_j\}$    Dissimilarity $d(i, j) = S_0(i, j)$
$\{c_3, c_4\}$         $d(3, 4) = 1$
$\{c_4, c_7\}$         $d(4, 7) = 2$
$\{c_3, c_7\}$         $d(3, 7) = 3$
$\{c_4, c_5\}$         $d(4, 5) = 4$
$\{c_1, c_2\}$         $d(1, 2) = 5$
$\{c_2, c_8\}$         $d(2, 8) = 6$
$\{c_1, c_4\}$         $d(1, 4) = 7$
$\{c_2, c_3\}$         $d(2, 3) = 8$
$\{c_3, c_5\}$         $d(3, 5) = 9$
$\{c_1, c_3\}$         $d(1, 3) = 10$
$\{c_5, c_6\}$         $d(5, 6) = 11$
$\{c_2, c_4\}$         $d(2, 4) = 12$
$\{c_1, c_8\}$         $d(1, 8) = 13$
$\{c_4, c_6\}$         $d(4, 6) = 14$
$\{c_6, c_7\}$         $d(6, 7) = 15$
$\{c_5, c_7\}$         $d(5, 7) = 16$
$\{c_2, c_7\}$         $d(2, 7) = 17$
$\{c_5, c_8\}$         $d(5, 8) = 18$
$\{c_3, c_6\}$         $d(3, 6) = 19$
$\{c_6, c_8\}$         $d(6, 8) = 20$
$\{c_4, c_8\}$         $d(4, 8) = 21$
$\{c_1, c_5\}$         $d(1, 5) = 22$
$\{c_2, c_6\}$         $d(2, 6) = 23$
$\{c_7, c_8\}$         $d(7, 8) = 24$
$\{c_1, c_7\}$         $d(1, 7) = 25$
$\{c_3, c_8\}$         $d(3, 8) = 26$
$\{c_1, c_6\}$         $d(1, 6) = 27$
$\{c_2, c_5\}$         $d(2, 5) = 28$

Table 3.4. Dissimilarity Table 3.3 for the model example rearranged in the ascending order of the dissimilarities.

Figure 3.1. The threshold graph $G(0)$ for the model example. It corresponds to the $\mathbf{C}_0$-clustering.
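The rearrangement of Table 3.3 into Table 3.4 is just a sort of the 28 pairs by their dissimilarities. The Python sketch below assumes that the upper triangle of Table 3.3 is typed in row by row; the names `ROWS`, `d` and `ranked` are chosen here for illustration.

```python
# Upper triangle of Table 3.3: row i lists d(i, i+1), ..., d(i, 8).
ROWS = [
    [5, 10, 7, 22, 27, 25, 13],
    [8, 12, 28, 23, 17, 6],
    [1, 9, 19, 3, 26],
    [4, 14, 2, 21],
    [11, 16, 18],
    [15, 20],
    [24],
]

# d[(i, j)] is the dissimilarity of the cities c_i and c_j for i < j.
d = {(i, j): ROWS[i - 1][j - i - 1] for i in range(1, 8) for j in range(i + 1, 9)}

# Pairs in the ascending order of their dissimilarities, as in Table 3.4.
ranked = sorted(d, key=d.get)

print(ranked[:4])                  # [(3, 4), (4, 7), (3, 7), (4, 5)]
print(ranked[-1], d[ranked[-1]])   # (2, 5) 28
```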

At the initial step, we derive a disjoint clustering, such that each city forms its own cluster containing only one element. Thus, the clustering of level zero is

$$\mathbf{C}_0 = \{C_{0,1}, C_{0,2}, C_{0,3}, C_{0,4}, C_{0,5}, C_{0,6}, C_{0,7}, C_{0,8}\}$$

where $C_{0,i} = \{c_i\}$, $i = 1, 2, \ldots, 8$. Then at every step we must determine the dissimilarities between all the existing clusters, both new and old.

Definition 3.2.1. The dissimilarity $\mathrm{diss}(C_{0,i}, C_{0,j})$ between two clusters of level zero is defined as the dissimilarity between the corresponding objects, that is,

$$\mathrm{diss}(C_{0,i}, C_{0,j}) = d(i, j).$$

It is helpful to visualize the process of clustering by drawing graphs of a special kind, called threshold graphs.

Definition 3.2.2. Given a dissimilarity $n \times n$ matrix and a threshold value $\tau$, the threshold graph $G(\tau)$ is a simple weighted graph with $n$ vertices corresponding to $n$ entities under consideration, such that two vertices $v_i$ and $v_j$ are adjacent if and only if $d(i, j) \le \tau$. The weight of the edge $e_{i,j}$ connecting two vertices $v_i$ and $v_j$ is the given dissimilarity $d(i, j)$.

Thus if $\tau = 0$, two vertices are adjacent in $G(0)$ if and only if their dissimilarity is zero; if there is no such pair of vertices, $G(0)$ contains only $n$ isolated vertices and no edge. On the other hand, if the threshold value $\tau$ is greater than or equal to the largest entry of the dissimilarity matrix, then the threshold graph is (isomorphic to) the complete graph $K_n$, and we denote it by $G(\infty)$.

Figure 3.2. The complete threshold graph $G(\infty)$ for the model example; only a few edges and dissimilarities are shown (the labels visible in the figure are $d(3, 4) = 1$, $d(7, 8) = 24$, and $d(2, 3) = 8$).
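A threshold graph in the sense of Definition 3.2.2 is easy to build from the dissimilarity table, and its connected components can then be listed. The following Python lines are a sketch for the model example; the data are those of Table 3.3, and the function names `threshold_graph` and `components` are assumptions made here.

```python
ROWS = [
    [5, 10, 7, 22, 27, 25, 13],
    [8, 12, 28, 23, 17, 6],
    [1, 9, 19, 3, 26],
    [4, 14, 2, 21],
    [11, 16, 18],
    [15, 20],
    [24],
]
d = {(i, j): ROWS[i - 1][j - i - 1] for i in range(1, 8) for j in range(i + 1, 9)}


def threshold_graph(d, threshold):
    """Edges of the threshold graph: all pairs with d(i, j) <= threshold."""
    return sorted(pair for pair, w in d.items() if w <= threshold)


def components(n, edges):
    """Connected components of the graph on the vertices 1..n, as sorted lists."""
    adj = {v: [] for v in range(1, n + 1)}
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    seen, parts = set(), []
    for s in range(1, n + 1):
        if s in seen:
            continue
        seen.add(s)
        stack, part = [s], []
        while stack:
            u = stack.pop()
            part.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        parts.append(sorted(part))
    return parts


print(threshold_graph(d, 0))                  # []  (eight isolated vertices)
print(threshold_graph(d, 2))                  # [(3, 4), (4, 7)]
print(components(8, threshold_graph(d, 2)))   # [[1], [2], [3, 4, 7], [5], [6], [8]]
print(len(threshold_graph(d, 28)))            # 28  (the complete graph K_8)
```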

The smallest dissimilarity in the problem is $d(3, 4) = 1$. Therefore, if the threshold value (an acceptable level of dissimilarity) is less than 1, we cannot combine any two cities in one cluster and have to stop here. In terms of our model example, that means that no cluster has an infrastructure, and we have to build a road between all the pairs of cities (Fig. 3.2).

Suppose next that the threshold is at least 1, $\tau \ge 1$. Then we have to consider all 28 pairwise unions

$$C_{0,1} \cup C_{0,2},\ C_{0,1} \cup C_{0,3},\ \ldots,\ C_{0,1} \cup C_{0,8},$$
$$C_{0,2} \cup C_{0,3},\ \ldots,\ C_{0,2} \cup C_{0,8},\ \ldots,$$
$$C_{0,7} \cup C_{0,8}.$$

In the corresponding complete graph (the same Figure 3.2), its 28 edges, having the weights $d(1, 2), \ldots, d(7, 8)$ and connecting the pairs of vertices, respectively,

$$\{c_1, c_2\},\ \{c_1, c_3\},\ \ldots,\ \{c_1, c_8\},\ \{c_2, c_3\},\ \ldots,\ \{c_7, c_8\},$$

correspond to these 28 pairwise unions.

Figure 3.3. The threshold graph $G(1)$ corresponds to the threshold level $\tau = 1$ and the clustering $\mathbf{C}_1$; only two vertices are connected.

Since the lowest weight is $d(c_3, c_4) = 1$, the clusters $C_{0,3}$ and $C_{0,4}$ have the smallest dissimilarity $\mathrm{diss}(C_{0,3}, C_{0,4}) = d(c_3, c_4) = 1$. In terms of our problem they have the largest flow of commuters between them. Thus, we have to connect them first, and we amalgamate these two clusters of level zero in the cluster $C_{1,1}$ of the first level. All the other clusters of level zero automatically become clusters of the first level. This way, we get the first-level clustering

$$\mathbf{C}_1 = \{C_{1,1}, C_{1,2}, C_{1,3}, C_{1,4}, C_{1,5}, C_{1,6}, C_{1,7}\}$$

consisting of one two-element cluster $C_{1,1} = C_{0,3} \cup C_{0,4} = \{c_3, c_4\}$ and six one-element clusters $C_{1,i} = C_{0,i-1} = \{c_{i-1}\}$ for $i = 2, 3$ and $C_{1,i} = C_{0,i+1} = \{c_{i+1}\}$ for $i = 4, 5, 6, 7$. This clustering is shown in Fig. 3.3, where the connected component with the vertices $\{c_3, c_4\}$ corresponds to the cluster $C_{1,1}$.

To complete this step of the algorithm, we must define the dissimilarities between new clusters. The dissimilarities between the clusters $C_{1,2}, \ldots, C_{1,7}$ are the same as those between the corresponding “old” clusters of level zero. The dissimilarity between $C_{1,1}$ and any cluster $\{c_i\}$, $i = 1, 2, 5, 6, 7, 8$, is, extending Definition 3.2.1, the smaller of $d(c_3, c_i)$ and $d(c_4, c_i)$. For instance,

$$\mathrm{diss}(C_{1,1}, C_{1,4}) = \min\{d(c_3, c_5);\ d(c_4, c_5)\} = \min\{9;\ 4\} = 4.$$
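The rule just used, taking the smallest pairwise dissimilarity between members of two clusters, can be written as a one-line function. In the Python sketch below the data are those of Table 3.3; the names `dist` and `diss` are chosen here to mirror $d$ and diss in the text.

```python
ROWS = [
    [5, 10, 7, 22, 27, 25, 13],
    [8, 12, 28, 23, 17, 6],
    [1, 9, 19, 3, 26],
    [4, 14, 2, 21],
    [11, 16, 18],
    [15, 20],
    [24],
]
TABLE = {(i, j): ROWS[i - 1][j - i - 1] for i in range(1, 8) for j in range(i + 1, 9)}


def dist(i, j):
    """Symmetric lookup d(i, j) = d(j, i) in Table 3.3, for i != j."""
    return TABLE[(min(i, j), max(i, j))]


def diss(a, b):
    """Single-link dissimilarity of two disjoint clusters: the smallest d(i, j)."""
    return min(dist(i, j) for i in a for j in b)


print(diss({3, 4}, {5}))   # 4, that is, min{9; 4}
print(diss({3, 4}, {1}))   # 7, that is, min{10; 7}
```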

It should be repeated that while deriving $\mathbf{C}_1$ from $\mathbf{C}_0$, we have used only one link: the threshold graph corresponding to $\mathbf{C}_1$ contains just one more edge than the graph corresponding to $\mathbf{C}_0$. All the 28 pairs of vertices

$$\{C_{0,1}, C_{0,2}\},\ \ldots,\ \{C_{0,7}, C_{0,8}\},$$

each pair taken together with the incident edge, represent connected two-vertex subgraphs of the graph in Fig. 3.2; we have selected among them a subgraph with the minimal weight and linked the two vertices of this subgraph into a cluster. Again in terms of our model example, this means that we have to build a road between $c_3$ and $c_4$.

Figure 3.4. A road map corresponding to the first-level clustering $\mathbf{C}_1 = \{C_{1,1}, C_{1,2}, C_{1,3}, C_{1,4}, C_{1,5}, C_{1,6}, C_{1,7}\}$.

Then we have to connect each other city with either $c_3$ or $c_4$, but not with both, using roads between clusters. Given two clusters, $C_{1,1} = \{c_3, c_4\}$ and $C_{1,i} = \{c_{i-1}\}$, $i = 2, 3$, or $C_{1,i} = \{c_{i+1}\}$, $i = 4, 5, 6, 7$, the decision regarding which city, $c_3$ or $c_4$, should be connected with the city of $C_{1,i}$ is based on the dissimilarity between the clusters $C_{1,1}$ and $C_{1,i}$. For example, since

$$\mathrm{diss}(C_{1,1}, C_{1,4}) = \min\{d(c_3, c_5);\ d(c_4, c_5)\} = d(c_4, c_5) = 4,$$

the cluster $\{c_5\}$ must be connected with the vertex $c_4$. The corresponding road map may look like the one in Fig. 3.4.
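The same decision can be made for every remaining city at once. The Python sketch below applies the rule of the text to Table 3.3; the names `dist`, `nearest` and `roads` are illustrative, and the resulting list is what the rule yields, offered here without a claim that it coincides edge by edge with the drawing in Fig. 3.4.

```python
ROWS = [
    [5, 10, 7, 22, 27, 25, 13],
    [8, 12, 28, 23, 17, 6],
    [1, 9, 19, 3, 26],
    [4, 14, 2, 21],
    [11, 16, 18],
    [15, 20],
    [24],
]
TABLE = {(i, j): ROWS[i - 1][j - i - 1] for i in range(1, 8) for j in range(i + 1, 9)}


def dist(i, j):
    """Symmetric lookup d(i, j) = d(j, i) in Table 3.3, for i != j."""
    return TABLE[(min(i, j), max(i, j))]


cluster = (3, 4)   # the two-element cluster C_{1,1}
# For every other city, pick the member of the cluster with the smaller dissimilarity.
nearest = {city: min(cluster, key=lambda m: dist(m, city)) for city in (1, 2, 5, 6, 7, 8)}
print(nearest)     # {1: 4, 2: 3, 5: 4, 6: 4, 7: 4, 8: 4}

# One road inside the cluster plus one road per remaining city.
roads = [cluster] + [tuple(sorted((city, m))) for city, m in nearest.items()]
print(len(roads))  # 7, the size of a spanning tree on 8 vertices
```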

If the threshold $\tau = 1$, we should stop here and the road map is given by the spanning tree in Fig. 3.4. However, if we can accept a larger threshold, we are to continue. To build a second-level clustering, we proceed in the same way. Namely, we consider all pairs of the first-level clusters and look for a connecting link with the smallest dissimilarity.

The edge $\{c_3, c_4\}$, which has already been utilized, cannot be used again. Among the unused edges, the smallest dissimilarity is $d(c_4, c_7) = 2$, and we form the second-level cluster $C_{2,1}$ as the union of the two first-level clusters containing the cities $c_4$ and $c_7$. To make this cluster, we have again used a single link, the edge $\{c_4, c_7\}$. All the other first-level clusters move into the second-level clustering $\mathbf{C}_2$ unchanged, just being renumbered (Fig. 3.5):

$$\mathbf{C}_2 = \{C_{2,1}, C_{2,2}, C_{2,3}, C_{2,4}, C_{2,5}, C_{2,6}\}$$

where $C_{2,1} = C_{1,1} \cup C_{1,6} = \{c_3, c_4, c_7\}$, $C_{2,i} = C_{1,i}$, $i = 2, 3, 4, 5$, and $C_{2,6} = C_{1,7}$.


Figure 3.5. The threshold graph $G(2)$ corresponds to the second-level clustering $\mathbf{C}_2$; one more edge is added to $G(1)$. $\mathbf{C}_2$ consists of one three-element cluster $\{c_3, c_4, c_7\}$ and five one-element clusters $\{c_1\}$, $\{c_2\}$, $\{c_5\}$, $\{c_6\}$, and $\{c_8\}$.

We can express this in terms of connected subgraphs. In addition to the same two-vertex subgraphs with vertices other than $c_3$ and $c_4$, which were considered before, we have to look for connected subgraphs with three vertices. Namely, we consider the subgraphs which contain the two vertices $c_3$ and $c_4$, the incident edge of these two vertices, another vertex, and an edge connecting the latter with either $c_3$ or $c_4$. The minimal dissimilarity is now $S_1(4, 7) = 2$ and we have to connect clusters $\{c_7\}$ and $\{c_3, c_4\}$ in a cluster of the second level.

This clustering corresponds to the threshold value $\tau = 2$ and is shown in Fig. 3.5. It is worth noting that the dissimilarity $d(3, 7)$ between the objects $c_3$ and $c_7$ in the cluster $C_{2,1}$ is greater than 2, but these vertices can be connected within the cluster by the edges $\{c_3, c_4\}$ and $\{c_4, c_7\}$, such that their weights do not exceed the threshold value. This is an important feature of the single-link algorithms: for any two objects $x$ and $y$ in a cluster there always exists a sequence of objects in this cluster connecting $x$ and $y$, such that the dissimilarity of any two neighbors in this sequence does not exceed the threshold value, even though the dissimilarity of $x$ and $y$ may be greater than the threshold.

We continue the construction of the hierarchical clustering for the model example. Suppose that we can accept a value of the threshold greater than 2. The next unused dissimilarity $d(c_3, c_7)$ gives nothing new, because the cities $c_3$ and $c_7$ have already been linked in a cluster. Therefore, $d(c_3, c_7)$ does not generate the next clustering (Fig. 3.6).

Thus, we skip d.3; 7/ and use the next bigger dissimilarity d.4; 5/ D 4, generatingthe next clustering

C3 D ¹¹c3; c4; c5; c7º; ¹c1º; ¹c2º; ¹c6º; ¹c8ºº


s

s

s

s

s

s

s

sc1 c2

c3

c4

c5c6

c7

c8

!!!!!!!!

!!!!!!!!

!

Figure 3.6. The threshold graph G.3/ does not generate a new clustering.

Figure 3.7. The threshold graph $G(4)$ contains one new edge $\{c_4, c_5\}$. It corresponds to the clustering $C_3$, containing one four-element cluster $\{c_3, c_4, c_5, c_7\}$ and four one-element clusters $\{c_1\}$, $\{c_2\}$, $\{c_6\}$, $\{c_8\}$.

which corresponds to the threshold value $\theta = 4$. The five sets in $C_3$ represent all five clusters of the third level (Fig. 3.7). Again, the dissimilarity between some vertices in the first cluster $C_{3,1}$ is greater than 4, but for any two vertices there exists a connecting path such that every edge in the path has a weight (dissimilarity) of 4 or less. In formal terms, we consider all the unions $C_{2,a} \cup C_{2,b}$ formed by a single edge and look for the link with the smallest weight that generates a new cluster.


Figure 3.8. The threshold graph $G(5)$ generates the fourth-level clustering $C_4$ consisting of one four-element cluster $\{c_3, c_4, c_5, c_7\}$, one two-element cluster $\{c_1, c_2\}$, and two one-element clusters $\{c_6\}$ and $\{c_8\}$.

The next smallest weight to use is $d(1, 2) = 5$, and if we are willing to continue and use the threshold value $\theta = 5$, we have to merge the cities $\{c_1\}$ and $\{c_2\}$ into a cluster. Thus, we derive the next clustering (Fig. 3.8)

$$C_4 = \{C_{4,1}, C_{4,2}, C_{4,3}, C_{4,4}\} = \{\{c_1, c_2\}, \{c_3, c_4, c_5, c_7\}, \{c_6\}, \{c_8\}\}.$$

A road map corresponding to the clustering $C_4$ is shown in Fig. 3.9.

In this way we construct the hierarchy of consecutive clusterings corresponding to increasing values of the threshold. Now it is the turn of $d(2, 8) = 6$, and the fifth-level clustering is (Fig. 3.10)

$$C_5 = \{\{c_1, c_2, c_8\}, \{c_3, c_4, c_5, c_7\}, \{c_6\}\}.$$

The next unused edge with the lowest weight is $\{c_1, c_4\}$ with $d(1, 4) = 7$, and we come up with the clustering (Fig. 3.11)

$$C_6 = \{\{c_1, c_2, c_3, c_4, c_5, c_7, c_8\}, \{c_6\}\}.$$

The edges with weights 8, 9, and 10 do not generate new clusters. Finally, by making use of the edge $\{c_5, c_6\}$ with the weight $d(5, 6) = 11$, we get the one-cluster clustering $C_7 = \{C_{7,1}\}$, where $C_{7,1} = \{c_1, c_2, c_3, c_4, c_5, c_6, c_7, c_8\}$; see Fig. 3.12. If all the objects are merged in one cluster, the clustering is called conjoint.

It is worth noting that in terms of our model, both $C_0$ and $C_7$ result in the same road network shown in Fig. 3.2.


Figure 3.9. The road map corresponding to the clustering $C_4$. The clusters $C_{4,1}$ and $C_{4,2}$ are connected by the edge $\{c_1, c_4\}$, for this edge has the smallest dissimilarity among all the edges connecting the two clusters.

Figure 3.10. The threshold graph $G(6)$ generates the fifth-level clustering $C_5$, which contains one four-element cluster $\{c_3, c_4, c_5, c_7\}$, one three-element cluster $\{c_1, c_2, c_8\}$, and a one-element cluster $\{c_6\}$.

Problem 3.2.1. Draw road maps corresponding to all other levels of clustering, $C_2$, $C_3$, $C_5$, $C_6$, in the model example.

Analyzing our discussion of the model example, we see that the algorithm above can be stated in the following more formal way suitable for computer realization;


Figure 3.11. The threshold graph $G(7)$ generates the sixth-level clustering $C_6$, which contains a seven-element cluster $\{c_1, c_2, c_3, c_4, c_5, c_7, c_8\}$ and a one-element cluster $\{c_6\}$.

Figure 3.12. The threshold graph $G(11)$; it generates the conjoint clustering $C_7 = \{C_{7,1}\}$.

we have presented it in pseudocode form. In what follows we use the notation

$$a := b,$$

which means that the value $b$ must be assigned to the variable $a$, or to put it another way, the current value of the variable $a$ must be replaced by the value of $b$. For example, if $a = 5$ and $b = 0$, then after the command $a := b$ is executed, $a = 0$, while the value of $b$ remains unchanged, $b = 0$; the initial value $a = 5$ is deleted.

Problem 3.2.2. Starting with the initial value $m = 2$, find the value of the variable $m$ after repeating twice the command $m := m - 1$.

Agglomerative Single-Link Algorithm

Given a set of $n$ objects $X = \{x_1, x_2, \ldots, x_n\}$, their dissimilarity table, and a threshold value $\theta \ge 0$.

1. Rearrange the dissimilarity table in ascending order.

2. Set $m = 0$ and make a completely disjoint clustering of zero level $C_0 = \{C_{0,1}, C_{0,2}, \ldots, C_{0,n}\}$, with one-element clusters $C_{0,i} = \{x_i\}$, $i = 1, \ldots, n$.

3. Set $m := m + 1$ and consider the first unused entry, say $d(x_k, x_l)$, in the dissimilarity table. If $d(x_k, x_l) > \theta$, then stop and return the last derived clustering. Otherwise, there are two possibilities.

3(A). The two-element set $\{x_k, x_l\}$ is a subset of an existing cluster. Then skip $d(x_k, x_l)$ and return to step 3, that is, increase $m$ by 1.

3(B). The objects $x_k$ and $x_l$ belong to different existing clusters, say $x_k \in C_{m-1,a}$ and $x_l \in C_{m-1,b}$, $a \ne b$. Form a cluster of the $m$th level as the union $C_{m,1} = C_{m-1,a} \cup C_{m-1,b}$, renumber all the other clusters of the $(m-1)$th level to the $m$th level without changing their elements, and return to step 3.
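The steps above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a definitive implementation: the name `single_link`, the parameter `theta`, and the four-object dissimilarity table are hypothetical, and the sketch counts a new level only when a merge actually happens, so skipped entries do not advance the level.

```python
def single_link(objects, d, theta):
    """Return the clusterings C_0, C_1, ... obtained for thresholds up to theta."""
    clusters = [frozenset([x]) for x in objects]        # step 2: clustering C_0
    history = [list(clusters)]
    # Step 1: scan the dissimilarity table in ascending order.
    for (u, v), weight in sorted(d.items(), key=lambda entry: entry[1]):
        if weight > theta:                               # step 3: threshold exceeded
            break
        cu = next(c for c in clusters if u in c)         # cluster containing u
        cv = next(c for c in clusters if v in c)         # cluster containing v
        if cu == cv:                                     # step 3(A): skip this entry
            continue
        # Step 3(B): the union becomes the first cluster of the new level.
        clusters = [cu | cv] + [c for c in clusters if c not in (cu, cv)]
        history.append(list(clusters))
    return history


# Hypothetical dissimilarity table on four objects, without ties.
d = {(1, 2): 1, (2, 3): 2, (3, 4): 3, (1, 4): 4, (1, 3): 5, (2, 4): 6}

for level, clustering in enumerate(single_link([1, 2, 3, 4], d, theta=2)):
    print(level, [sorted(c) for c in clustering])
```

With the threshold 2 the sketch stops after two merges, at the clustering consisting of the clusters {1, 2, 3} and {4}.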

Remark 3.2.3. The conjoint clustering can occur before we achieve the threshold level, and, as we have seen in the example, not every threshold graph generates a new clustering.

Remark 3.2.4. Since we look only for disjoint clusters, a clustering of any level is just a partition of the initial set of objects. Therefore, our algorithm generates a family of nested partitions of the given set. Moreover, we know (see Problems 1.1.18–1.1.19) that every partition of a set generates an equivalence relation on this set and vice versa. This relationship is dealt with in EP 3.2.3.

Remark 3.2.5. Part 3(A) of this algorithm is quite analogous to the condition of not forming cycles in Part 2 of Kruskal's algorithm (Section 2.3) for constructing minimum spanning trees.



EP 3.2.1. Given the initial value of the variable $k = 1$, what is the value of $k$ after the command $k := (-1)k$ is executed 3 times? 4 times?

EP 3.2.2. Prove that given $n$ objects, $n$ levels of clustering $C_0, \ldots, C_{n-1}$ exist, where the last one is the conjoint clustering. Moreover, since the dissimilarity table contains $n(n-1)/2$ entries, there are no more than $1 + n(n-1)/2$ threshold graphs (exactly $1 + n(n-1)/2$ if there are no ties).

EP 3.2.3. Describe explicitly the equivalence relations corresponding to the partitions of the set $C = \{c_1, c_2, c_3, c_4, c_5, c_6, c_7, c_8\}$ generated by the clusterings $C_0, \ldots, C_7$ in the model example.

EP 3.2.4. Construct dissimilarity tables for a set with $n$ elements such that there are exactly 2, or exactly 3, or exactly $1 + n(n-1)/2$ threshold graphs.

EP 3.2.5. Change the $\{c_3, c_7\}$ entry in Table 3.3 to 2 and the $\{c_2, c_8\}$ entry to 7, respectively, so that the new table contains ties. Apply the algorithm of this section to this new dissimilarity table and compare the resulting clusterings with the ones derived above.

EP 3.2.6. Using the algorithm of this section, construct all consecutive clusterings of the set $X = \{x_1, x_2, \ldots, x_6\}$ with the Dissimilarity Table 3.5. What level of clustering corresponds to the threshold level of 3? Of 2?

      x1  x2  x3  x4  x5  x6
x1     0   6   8   3   4   8
x2         0   2   4   1   5
x3             0   6   2   3
x4                 0   9   2
x5                     0   4
x6                         0

Table 3.5. The dissimilarity table for EP 3.2.6.


3.3 Hubert’s Single-Link Algorithm

In this and the following sections we consider two well-known algorithms by Hubert: the single-link and complete-link agglomerative clustering algorithms.

L. Hubert

In the preceding section we represented the objects and their dissimilarities by weighted graphs. Since the dissimilarity was defined for each pair of objects, these graphs are complete. Therefore each cluster, as a set of vertices, is represented by a subgraph of the complete graph corresponding to the initial set of objects. Vice versa, each connected subgraph of this complete graph can be viewed as a cluster consisting of the vertices of this subgraph. Therefore, we will freely interchange the language of objects and their collections (clusters) on the one hand, and the language of vertices, graphs, and subgraphs on the other.

Hubert [28] gave versions of a single-link algorithm and a complete-link algorithm based on the concept of a threshold graph. Hubert's single-link algorithm leads to the same clustering as the agglomerative single-link algorithm of the preceding section. In this section we present Hubert's single-link algorithm in more formal pseudocode form. First of all, more notation is in order.

As before, we denote the clustering of the $m$th level by

$$C_m = \{C_{m,1}, C_{m,2}, \ldots, C_{m,n_m}\}, \quad m = 0, 1, 2, \ldots,$$

where $n_m$ stands for the number of clusters contained in the clustering $C_m$ of the $m$th level. In particular, $n_0 = n$. After $C_m$ has been derived, we consider all $\frac{1}{2} n_m (n_m - 1)$ pairwise unions of these clusters

$$C_{m,a} \cup C_{m,b},$$

where $1 \le a, b \le n_m$, $a \ne b$. The union $C_{m,a} \cup C_{m,b}$ contains certain objects, say, the elements $x_i, \ldots, x_j$. Given the union $C_{m,a} \cup C_{m,b}$ for fixed $a$ and $b$, $a \ne b$, we can form several connected subgraphs of the threshold graph $G(\theta)$ spanned by these vertices $x_i, \ldots, x_j$. Namely, to derive such a subgraph from two clusters $C_{m,a}$ and $C_{m,b}$, we consider all possible connections of a vertex from $C_{m,a}$ with a vertex from $C_{m,b}$ using only one edge, called a single link.

For the clusters $C_{m,a}$ and $C_{m,b}$ of the $m$th level, denote the smallest dissimilarity between a vertex in $C_{m,a}$ and a vertex in $C_{m,b}$ by

$$S_m(a, b) = \min\{d(x_i, x_j) \mid x_i \in C_{m,a},\ x_j \in C_{m,b}\};$$

clearly, this function is symmetric, that is, $S_m(a, b) = S_m(b, a)$. If the initial dissimilarity matrix contains ties, there may be several edges with the minimal weight, and any one of them can be selected. At every step we merge two existing clusters, thus decreasing the number of clusters by 1.

The function $S_m = S_m(a, b)$ is defined on all pairs of clusters $\{C_{m,a}, C_{m,b}\}$ of the $m$th level. Since we only consider finite sets, this function attains its minimum value on a certain pair of clusters, say, $C_{m,p}$ and $C_{m,q}$. Let us denote this minimum value of the function $S_m(a, b)$ over all pairs of indices $\{a, b\}$ by $\min_{a,b}\{S_m(a, b)\} = S_m(p, q)$. The subscripts $p = p(m)$ and $q = q(m)$ depend on $m$, but suppressing this dependence in the notation does not lead to any ambiguity. We use the function $S_m(a, b)$ to present Hubert's single-link algorithm.

Hubert's Single-Link Algorithm

Given a set of $n$ objects $X = \{x_1, x_2, \ldots, x_n\}$, the dissimilarity table, and a threshold value $\theta$.

1. Set $m = 0$ and form the disjoint clustering of zero level,

$$C_0 = \{C_{0,1}, C_{0,2}, \ldots, C_{0,n}\},$$

consisting of $n$ one-element clusters $C_{0,k} = \{x_k\}$, $k = 1, \ldots, n$. Define the function $S_0$ and the dissimilarities between the clusters of level zero by

$$S_0(a, b) = \mathrm{diss}(C_{0,a}, C_{0,b}) = d(x_a, x_b).$$

Find the minimum value $S_0^{\min}$ of the function $S_0(a, b)$ over all the pairs $(a, b)$,

$$S_0^{\min} = \min_{a,b} S_0(a, b) = S_0(p, q),$$

attained at the pair $(p, q)$. This pair indicates the clusters of zero level, $C_{0,p}$ and $C_{0,q}$, to be merged into a cluster of the 1st level,

$$C_{1,1} = C_{0,p} \cup C_{0,q}.$$

All the other zero-level clusters remain the same; we only have to renumber them,

$$C_{1,r} = C_{0,s}, \quad r \ge 2,\ s \ne p,\ s \ne q.$$

2. Set $m := m + 1$, calculate the values

$$S_{m-1}(a, b) = \min\{d(x_i, x_j) \mid x_i \in C_{m-1,a},\ x_j \in C_{m-1,b}\}$$

for each pair of indices $\{a, b\}$, $a \ne b$, and find the minimum value

$$S_{m-1}^{\min} = \min_{a,b}\{S_{m-1}(a, b)\} = S_{m-1}(p, q),$$

where $(p, q)$ is a pair $(a, b)$ at which the minimum is attained. To build the next clustering $C_m$, we merge those two clusters $C_{m-1,p}$ and $C_{m-1,q}$, whose second indices are $p$ and $q$ from above, into the cluster

$$C_{m,1} = C_{m-1,p} \cup C_{m-1,q}$$

by making use of an edge with the weight $S_{m-1}^{\min} = S_{m-1}(p, q)$. If there are ties, that is, there exist several edges with the same weight $d_{\min}(p, q)$, we can use any of them. All the other clusters of the level $m - 1$ become the clusters of level $m$ without any change, after just renumbering.

3. Update the dissimilarity table as follows. The dissimilarity between every two "old" clusters (promoted from the preceding level) remains the same. The dissimilarity between $C_{m,1}$ and any cluster $C_{m,r} = C_{m-1,s}$, $s \ne p$, $s \ne q$, is the smaller of

$$S_{m-1}(p, s) = \mathrm{diss}(C_{m-1,p}, C_{m-1,s})$$

and

$$S_{m-1}(q, s) = \mathrm{diss}(C_{m-1,q}, C_{m-1,s});$$

thus for any $r > 1$,

$$\mathrm{diss}(C_{m,1}, C_{m,r}) = \min\{S_{m-1}(p, s),\ S_{m-1}(q, s)\}.$$

4. Return to step 2 and continue until we reach the threshold value $\theta$ or all the objects are merged into one conjoint cluster, whichever occurs first.
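A table-driven version of these four steps might look like the sketch below. It is an assumption-laden illustration: the function name `hubert_single_link`, the use of frozensets as table keys, and the four-object dissimilarity table are hypothetical, and ties are not handled in any special way.

```python
def hubert_single_link(objects, d, theta):
    """Single linkage driven only by the table of cluster dissimilarities.

    d maps frozenset({x, y}) to the dissimilarity of the objects x and y.
    Returns a list of (clustering, merge weight) pairs, starting with C_0.
    """
    clusters = [frozenset([x]) for x in objects]
    # Step 1: at level zero, diss(C_{0,a}, C_{0,b}) = d(x_a, x_b).
    diss = {}
    for i, a in enumerate(clusters):
        for b in clusters[i + 1:]:
            (x,), (y,) = a, b
            diss[frozenset([a, b])] = d[frozenset([x, y])]
    levels = [(list(clusters), 0)]
    while len(clusters) > 1:
        # Step 2: the pair of clusters with the smallest dissimilarity.
        pair, s_min = min(diss.items(), key=lambda item: item[1])
        if s_min > theta:                 # step 4: threshold reached
            break
        p, q = tuple(pair)
        merged = p | q
        rest = [c for c in clusters if c not in (p, q)]
        # Step 3: old entries are kept, new ones are the smaller of the two.
        updated = {k: v for k, v in diss.items() if p not in k and q not in k}
        for r in rest:
            updated[frozenset([merged, r])] = min(diss[frozenset([p, r])],
                                                  diss[frozenset([q, r])])
        clusters, diss = [merged] + rest, updated
        levels.append((list(clusters), s_min))
    return levels


# Hypothetical dissimilarity table on four objects, without ties.
raw = {(1, 2): 1, (2, 3): 2, (3, 4): 3, (1, 4): 4, (1, 3): 5, (2, 4): 6}
d = {frozenset(pair): weight for pair, weight in raw.items()}

for clustering, weight in hubert_single_link([1, 2, 3, 4], d, theta=10):
    print(weight, [sorted(c) for c in clustering])
```

Note that the sketch never consults the original table after level zero; each new row is computed from the two rows of the merged clusters, which is the point of step 3.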

Remark 3.3.1. Thus to find the next clustering, it is necessary to calculate the double minimum

$$S_m^{\min}(p, q) = \min_{a,b}\{S_m(a, b)\} = \min_{a,b}\{\min\{d(x_i, x_j) \mid x_i \in C_{m,a},\ x_j \in C_{m,b}\}\}.$$

We illustrate this algorithm using the model example from the preceding section; thus in the rest of this section we denote the objects by $c_i$. The algorithm starts with single-element clusters corresponding to each city $c_1, \ldots, c_8$. That is, we set $m = 0$ and form the disjoint clustering

$$C_0 = \{C_{0,1}, C_{0,2}, \ldots, C_{0,n}\}$$

where $C_{0,k} = \{c_k\}$, $k = 1, \ldots, 8$. This clustering corresponds to the subgraph of the graph $G(1)$ with no edge: every vertex is isolated. The function $S_0(a, b)$ is shown in the right column of Table 3.4; its minimum value is $S_0^{\min} = S_0(3, 4) = 1$.

Now, set $m = 0 + 1 = 1$. To every union $C_{0,a} \cup C_{0,b}$ there corresponds a unique connected subgraph of $G(1)$; this subgraph contains two vertices and the edge joining them. Therefore, at this step $d_{\min} = d(c_3, c_4)$, $p = 3$, $q = 4$, $S_0^{\min} = S_0(p, q) = 1$, and we have to join the clusters $C_{0,3}$ and $C_{0,4}$ into a cluster $C_{1,1}$ of the first level. Then we upgrade all other zero-level clusters to the first level and update the dissimilarity table. For example, since $C_{1,2} = \{c_1\}$, we get

$$\begin{aligned}
\mathrm{diss}(C_{1,1}, C_{1,2}) &= \min\{S_0(1, 3),\ S_0(1, 4)\} \\
&= \min\{\mathrm{diss}(C_{0,1}, C_{0,3}),\ \mathrm{diss}(C_{0,1}, C_{0,4})\} \\
&= \min\{d(c_1, c_3),\ d(c_1, c_4)\} = \min\{10, 7\} = 7.
\end{aligned}$$

These computations lead to the updated dissimilarity table of the 1st level (Table 3.6) and to the same 1st-level clustering as in Section 3.2, see Fig. 3.3,

$$C_1 = \{C_{1,1}, C_{1,2}, C_{1,3}, C_{1,4}, C_{1,5}, C_{1,6}, C_{1,7}\}$$

where $C_{1,1} = C_{0,3} \cup C_{0,4} = \{c_3, c_4\}$, $C_{1,i} = C_{0,i-1} = \{c_{i-1}\}$ for $i = 2, 3$, and $C_{1,i} = C_{0,i+1} = \{c_{i+1}\}$ for $i = 4, 5, 6, 7$.

It is worth noting that after all these discussions, we certainly have a clear geometrical picture of this procedure, but we do not need it for the computations; Hubert's algorithm works analytically, without any appeal to graphs.

Now, set $m = 1 + 1 = 2$. From Table 3.6, $S_1^{\min} = S_1(1, 6) = 2$; therefore, at this level $p = 1$, $q = 6$, and we have to merge the clusters $C_{1,1} = \{c_3, c_4\}$ and $C_{1,6} = \{c_7\}$ into the 1st cluster of the 2nd level,

$$C_{2,1} = C_{1,1} \cup C_{1,6} = \{c_3, c_4, c_7\}.$$

We renumber all the other 1st-level clusters as clusters of the 2nd level and use the same algorithm to calculate the dissimilarities between the new clusters, see Table 3.7. We reiterate that all considerations based on graph theory, in particular on spanning trees, were again left behind the scenes; see Section 3.2. The whole procedure is based completely on the dissimilarity tables and is convenient for a computer realization.

At the next step we set $m = 3$. In the graph theory terms of the previous section, we looked at the threshold graph $G(2)$, which contained one three-element and five one-element clusters. The smallest unused dissimilarity was $d(3, 7) = 3$, but adding the corresponding edge to $G(2)$ did not create a new cluster. Therefore, we had to leave out $d(3, 7)$ and proceed to $d(4, 5) = 4$. However, now we are using the purely analytical Hubert's algorithm and are to browse Table 3.7. From that table, $S_2^{\min} = S_2(1, 4) = 4$; therefore, at this level $p = 1$, $q = 4$, and we have to merge the clusters $C_{2,1} = \{c_3, c_4, c_7\}$ and $C_{2,4} = \{c_5\}$ into the 1st cluster of the 3rd level

$$C_{3,1} = C_{2,1} \cup C_{2,4} = \{c_3, c_4, c_5, c_7\}.$$

We renumber all the other 2nd-level clusters to the 3rd level and use the same algorithm to calculate the dissimilarities between the new clusters, see Table 3.8.


(a, b)    diss(C_{1,a}, C_{1,b})
(1, 6)     2
(1, 4)     4
(2, 3)     5
(3, 7)     6
(1, 2)     7
(1, 3)     8
(4, 5)    11
(2, 7)    13
(1, 5)    14
(5, 6)    15
(4, 6)    16
(3, 6)    17
(4, 7)    18
(5, 7)    20
(1, 7)    21
(2, 4)    22
(3, 5)    23
(6, 7)    24
(2, 6)    25
(2, 5)    27
(3, 4)    28

Table 3.6. The updated dissimilarity table of the 1st level.

Comparing with the algorithm of Section 3.2, we see that Hubert's algorithm at every step leads directly to the next-level clustering without pausing at intermediate threshold graphs that do not generate the next level of clustering. This way, we build up the single-link clusterings of all higher levels, which are, of course, the same as in Section 3.2, up to the conjoint clustering $C_7$. We show here only the updated dissimilarity tables of the subsequent levels.

When amalgamating the clusters step by step, we are increasing the threshold value and, respectively, generating the threshold graphs. They are the same as before and are shown in Fig. 3.1–3.3, 3.5–3.8, 3.10–3.12.

If we compare these figures with Fig. 2.13–2.21 in Section 2.3, we recognize similar graphs and easily convince ourselves that the steps of the agglomerative clustering algorithms of this and the previous sections correspond to the steps of Kruskal's algorithm for constructing a minimum spanning tree.

To visualize the process of clustering, a special kind of tree-like graph is useful. These graphs are called dendrograms. Below we build the single-link dendrogram


(a, b)    diss(C_{2,a}, C_{2,b})
(1, 4)     4
(2, 3)     5
(3, 6)     6
(1, 2)     7
(1, 3)     8
(4, 5)    11
(2, 6)    13
(1, 5)    14
(4, 6)    18
(5, 6)    20
(1, 6)    21
(2, 4)    22
(3, 5)    23
(2, 5)    27
(3, 4)    28

Table 3.7. The updated dissimilarity table of the 2nd level.

(a, b)    diss(C_{3,a}, C_{3,b})
(2, 3)     5
(3, 5)     6
(1, 2)     7
(1, 3)     8
(1, 4)    11
(2, 5)    13
(1, 5)    18
(4, 5)    20
(3, 4)    23
(2, 4)    27

Table 3.8. The updated dissimilarity table of the 3rd level.

corresponding to our model problem, see Fig. 3.13. It is clear from this example how to build a dendrogram for any problem. Different horizontal levels of the dendrogram, from top to bottom, correspond to consecutive clusterings in the problem. Thus, the level $A \leftrightarrow A$ gives the clustering

$$C_3 = \{\{c_1\}, \{c_2\}, \{c_8\}, \{c_3, c_4, c_5, c_7\}, \{c_6\}\},$$



Table 3.9. The updated dissimilarity table of the 4th level.

(a, b)    diss(C_{5,a}, C_{5,b})
(1, 2)     7
(2, 3)    11
(1, 3)    20




X     x1  x2  x3  x4  x5
x1     0   4   1   3   8
x2         0   2   5  10
x3             0   6   7
x4                 0   9
x5                     0

Table 3.12. The dissimilarity table for EP 3.3.3 and EP 3.4.2.

the level $B \leftrightarrow B$ in Fig. 3.13 generates the clustering

$$C_5 = \{\{c_1, c_2, c_8\}, \{c_3, c_4, c_5, c_7\}, \{c_6\}\}.$$

EP 3.3.1. Using Hubert's single-link algorithm, build all threshold graphs and clusterings of the set $X = \{x_1, x_2, \ldots, x_6\}$, given Dissimilarity Table 3.5. What level of clustering corresponds to the threshold levels of 2? Of 3? Of 4?


Figure 3.13. The dendrogram for the model example.

EP 3.3.2. Draw the dendrogram for Dissimilarity Table 3.5.

EP 3.3.3. Using Hubert's single-link algorithm, derive a conjoint clustering of the set $X$ with the dissimilarities given in Table 3.12. Draw the corresponding dendrogram.

3.4 Hubert’s Complete-Link Algorithm

In this section we consider a different approach to amalgamated clustering, called complete-link clustering. An essential distinction between the single-link and complete-link algorithms is the rule for merging two existing clusters into one of a higher level. Instead of connected subgraphs of the threshold graph $G(1)$ used in the single linkage, now we consider the maximum complete subgraphs of $G(1)$. Examples show that the single linkage and the complete linkage may result in different clusterings.

Clustering algorithms

We are concerned with another of Hubert's clustering algorithms, called complete-link clustering [28]. We use the same notation as in the previous sections, but consider only dissimilarity matrices without ties². We again start with an informal description of the algorithm and then write down its pseudocode.

Like the single linkage, the complete linkage uses the same sequence of threshold graphs. To avoid any ambiguity, we denote complete-link clusterings by $C_m^{\mathrm{comp}}$. Given a clustering

$$C_m^{\mathrm{comp}} = \{C_{m,1}, C_{m,2}, \ldots, C_{m,n_m}\}$$

of the $m$th level, $m = 0, 1, 2, \ldots$, we consider all pairwise unions

$$C_{m,a} \cup C_{m,b}, \quad a, b = 1, 2, \ldots, n_m,\ a \ne b.$$

Let the union $C_{m,a} \cup C_{m,b}$ contain objects $x_i, \ldots, x_j$. While building the single linkage, we looked for an edge (a single link) with the smallest dissimilarity. Now we are adding edges connecting a vertex in $C_{m,a}$ with a vertex in $C_{m,b}$, in increasing order of their dissimilarities, until we reach a complete subgraph of the threshold graph $G(1)$ spanned by all vertices $x_i, \ldots, x_j$. Only at that point does the union $C_{m,a} \cup C_{m,b}$ become a cluster of the next, $(m+1)$st level.

To formalize this procedure, let $T_m(a, b)$ stand for the maximal dissimilarity over all edges used in this construction, that is,

$$T_m(a, b) = \max\{d(x_i, x_j) \mid x_i \in C_{m,a},\ x_j \in C_{m,b}\}.$$

Similarly to $S_m$, $T_m$ is a symmetric function on pairs of clusters of the $m$th level, but unlike $S_m$, $T_m$ is the maximal, not minimal dissimilarity. Let $C_{m,p}$, $C_{m,q}$ be a pair of clusters where this function attains its minimum value over all the pairs of clusters of the $m$th level. Denote this minimum value by

$$T_m^{\min} = \min_{a,b}\{T_m(a, b)\} = T_m(p, q).$$

To build the next clustering $C_{m+1}$, we merge these two clusters $C_{m,p}$ and $C_{m,q}$ and update the dissimilarity table. We give a pseudocode of this algorithm.

² Clustering in the presence of ties is discussed, for example, in [29, p. 76].

Hubert's Complete-Link Algorithm

Given a set $X = \{x_1, x_2, \ldots, x_n\}$, the dissimilarity table, and the threshold value $\theta$.

1. Set $m = 0$ and form the disjoint clustering of level zero,

$$C_0^{\mathrm{comp}} = \{C_{0,1}, C_{0,2}, \ldots, C_{0,n}\}$$

consisting of $n$ one-element clusters $C_{0,k} = \{x_k\}$, $k = 1, \ldots, n$. Define the function $T_0$ and the dissimilarities between the clusters of level zero by

$$T_0(a, b) = \mathrm{diss}(C_{0,a}, C_{0,b}) = d(x_a, x_b).$$

Find the minimum value $T_0^{\min}$ of the function $T_0(a, b)$ over all the pairs $(a, b)$,

$$T_0^{\min} = \min_{a,b} T_0(a, b) = T_0(p, q),$$

attained at the pair $(p, q)$. This pair determines the clusters of zero level, $C_{0,p}$ and $C_{0,q}$, to be merged into a cluster of the 1st level

$$C_{1,1} = C_{0,p} \cup C_{0,q}.$$

The other zero-level clusters remain the same; we only have to renumber them,

$$C_{1,r} = C_{0,s}, \quad r \ge 2,\ s \ne p,\ s \ne q.$$

2. Set $m := m + 1$, calculate the values

$$T_m(a, b) = \max\{d(x_i, x_j) \mid x_i \in C_{m,a},\ x_j \in C_{m,b}\}$$

for all pairs of clusters of the $m$th level, and find their minimum value

$$T_m^{\min} = T_m(p, q) = \min_{a,b} T_m(a, b).$$

To form the next clustering $C_{m+1}^{\mathrm{comp}}$, we define

$$C_{m+1,1} = C_{m,p} \cup C_{m,q}.$$

All the other clusters of the $m$th level become, after renumbering, the clusters of level $m + 1$ without changes.

3. Update the dissimilarity table as follows. The dissimilarity between every two "old" clusters (promoted from the preceding level) remains the same. The dissimilarity between the "new" cluster $C_{m+1,1}$ and any "old" cluster $C_{m,r}$ with $r \ne p$ and $r \ne q$ is the larger of the two dissimilarities $\mathrm{diss}(C_{m,p}, C_{m,r})$ and $\mathrm{diss}(C_{m,q}, C_{m,r})$.

4. Continue until we reach the threshold value or all the objects are merged into one conjoint cluster, whichever occurs first.
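The following sketch illustrates the complete-link rule on a small hypothetical table without ties. Instead of updating the table as in step 3, it recomputes the function $T$ directly from its definition at every level; the two approaches agree because the maximum over a union of clusters is the larger of the two maxima. The names `T`, `complete_link`, and `theta` and the data are illustrative assumptions.

```python
from itertools import combinations


def T(a, b, d):
    """Largest dissimilarity between an object of cluster a and an object of cluster b."""
    return max(d[frozenset([x, y])] for x in a for y in b)


def complete_link(objects, d, theta):
    """Return a list of (clustering, merge weight) pairs, starting with level zero."""
    clusters = [frozenset([x]) for x in objects]
    levels = [(list(clusters), 0)]
    while len(clusters) > 1:
        # The minimum over all pairs of clusters of the maximal dissimilarity.
        p, q = min(combinations(clusters, 2), key=lambda pair: T(*pair, d))
        t_min = T(p, q, d)
        if t_min > theta:               # threshold reached
            break
        clusters = [p | q] + [c for c in clusters if c not in (p, q)]
        levels.append((list(clusters), t_min))
    return levels


# Hypothetical dissimilarity table on four objects, without ties.
raw = {(1, 2): 1, (2, 3): 2, (3, 4): 3, (1, 4): 4, (1, 3): 5, (2, 4): 6}
d = {frozenset(pair): weight for pair, weight in raw.items()}

for clustering, weight in complete_link([1, 2, 3, 4], d, theta=10):
    print(weight, [sorted(c) for c in clustering])
```

On this table the second merge joins the objects 3 and 4 at the weight 3, whereas a single-link run on the same data would attach the object 3 to the cluster {1, 2} at the weight 2, so the two linkages already diverge at the second level.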

Remark 3.4.1. In Step 2 we combine two clusters into a new one only when we reach an edge with the maximal dissimilarity between the entities in the two clusters; so to say, we link them completely. In terms of graphs, we merge two complete subgraphs $G_p$ and $G_q$ by using all edges with one end in $G_p$ and another end in $G_q$.

Remark 3.4.2. Using the single linkage, we calculate a double minimum of the dissimilarities, first over a fixed pair of clusters and then over all pairs of clusters; see Remark 3.3.1. Unlike that, in the complete linkage we calculate the minimum of maximal values: first we calculate the maximal dissimilarity of the objects over a fixed pair of clusters and then the minimum of these maximums over all pairs of clusters.

We apply this algorithm to our model example with Dissimilarity Table 3.4. The threshold graphs do not depend on the method used, whether it is the single or the complete linkage. If an edge in the subsequent figures is dashed, this means that this new edge does not generate a new cluster. The subgraph of the threshold graph $G(1)$ generated by the two vertices $c_3$ and $c_4$ is a complete graph isomorphic to the complete graph $K_2$. This is the same threshold graph $G(1)$ as in the single linkage; see Fig. 3.3. Therefore, the first two clusterings are the same as in the single linkage,

$$C_0^{\mathrm{comp}} = \{C_{0,1}, C_{0,2}, C_{0,3}, C_{0,4}, C_{0,5}, C_{0,6}, C_{0,7}, C_{0,8}\}$$

where $C_{0,i} = \{c_i\}$, $i = 1, \ldots, 8$, and

$$C_1^{\mathrm{comp}} = \{C_{1,1}, C_{1,2}, C_{1,3}, C_{1,4}, C_{1,5}, C_{1,6}, C_{1,7}\}$$

where $C_{1,1} = C_{0,3} \cup C_{0,4} = \{c_3, c_4\}$, $C_{1,i} = C_{0,i-1} = \{c_{i-1}\}$ for $i = 2, 3$, and $C_{1,i} = C_{0,i+1} = \{c_{i+1}\}$ for $i = 4, \ldots, 7$.

Figure 3.14. The threshold graph $G(2)$. The edge $\{c_4, c_7\}$ is dashed (cf. Fig. 3.5), for it does not generate the next-level complete-link clustering.

However, the threshold graph $G(2)$ (Fig. 3.14) does not contain a new complete subgraph: its subgraph spanned by the vertices $c_3$, $c_4$, and $c_7$ is not a complete graph, since the vertices $c_3$ and $c_7$ are not adjacent. Thus, even though $G(2)$ generates a single-link clustering (cf. Sections 3.2–3.3), it does not generate a complete-link clustering.
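The completeness test used here is easy to state in code. The sketch below uses only the three dissimilarities quoted in the text for $c_3$, $c_4$, $c_7$; the function name `spans_complete_subgraph` and the parameter `theta` are illustrative assumptions.

```python
from itertools import combinations

# Dissimilarities quoted in the text for the cities c3, c4, c7.
d = {frozenset([3, 4]): 1, frozenset([4, 7]): 2, frozenset([3, 7]): 3}


def spans_complete_subgraph(vertices, d, theta):
    """True if every pair of the given vertices is joined by an edge of weight <= theta."""
    return all(d[frozenset(pair)] <= theta for pair in combinations(vertices, 2))


print(spans_complete_subgraph([3, 4, 7], d, theta=2))  # False: c3 and c7 are not adjacent
print(spans_complete_subgraph([3, 4, 7], d, theta=3))  # True: the three vertices span K_3
```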


Figure 3.15. The threshold graph $G(3)$ is the same as in Fig. 3.6. The subgraph spanned by the vertices $\{c_3, c_4, c_7\}$ is complete.

Now, the threshold graph $G(3)$ (Fig. 3.15) contains a complete subgraph, isomorphic to $K_3$, spanned by the vertices $c_3$, $c_4$, $c_7$. Therefore, we merge these three vertices into a cluster $C_{2,1}$, and the next complete-link clustering is

$$C_2^{\mathrm{comp}} = \{C_{2,1}, C_{2,2}, C_{2,3}, C_{2,4}, C_{2,5}, C_{2,6}\}$$

where $C_{2,1} = C_{1,1} \cup C_{1,6} = \{c_3, c_4, c_7\}$. The five other clusters contain only one vertex each. We notice that only three edges (three links) have been used here. At this step the single-link and the complete-link clusterings still coincide.

The next threshold graph $G(4)$ (Fig. 3.16) is generated by the edge $\{c_4, c_5\}$. However, this edge does not generate a new complete subgraph, and the threshold graph $G(4)$ does not generate the next level of complete-link clustering.

The threshold graph $G(5)$ (Fig. 3.17) contains a $K_2$-isomorphic subgraph with the vertices $c_1$ and $c_2$. Hence, it generates a new complete-link clustering

$$C_3^{\mathrm{comp}} = \{C_{3,1}, C_{3,2}, C_{3,3}, C_{3,4}, C_{3,5}\}$$

where $C_{3,1} = \{c_1, c_2\}$, $C_{3,2} = \{c_3, c_4, c_7\}$, $C_{3,3} = \{c_5\}$, $C_{3,4} = \{c_6\}$, and $C_{3,5} = \{c_8\}$. Starting at this step, Hubert's complete-link algorithm generates clusterings distinct from the single linkage. The threshold graph $G(6)$ (Fig. 3.18), generated by the edge $\{c_2, c_8\}$ with the dissimilarity $d(2, 8) = 6$, also does not contain a new complete subgraph. The next four threshold graphs, $G(7)$–$G(10)$ (Fig. 3.19–3.20), also do not contain new complete subgraphs and generate no new clustering.

However, in the threshold graph $G(11)$ (Fig. 3.21) the vertices $c_5$ and $c_6$ become connected, and since they belong to no existing cluster, we have to merge them into a


Figure 3.16. Complete linkage: threshold graph $G(4)$, cf. Fig. 3.7. The subgraph spanned by the vertices $\{c_3, c_4, c_5, c_7\}$ is not complete.

Figure 3.17. Complete linkage: threshold graph $G(5)$, cf. Fig. 3.8.

cluster of the next level. Thus, we derive the 4th clustering,

$$C_4^{\mathrm{comp}} = \{C_{4,1}, C_{4,2}, C_{4,3}, C_{4,4}\}$$

where

$$C_{4,1} = \{c_5, c_6\}, \quad C_{4,2} = \{c_1, c_2\}, \quad C_{4,3} = \{c_3, c_4, c_7\}, \quad C_{4,4} = \{c_8\}.$$


Figure 3.18. Complete linkage: threshold graph $G(6)$, cf. Fig. 3.10.

The next complete-link clustering is generated by the edge $\{c_1, c_8\}$ with the dissimilarity $d(1, 8) = 13$ (Fig. 3.21),

$$C_5^{\mathrm{comp}} = \{C_{5,1}, C_{5,2}, C_{5,3}\}$$

where

$$C_{5,1} = \{c_1, c_2, c_8\}, \quad C_{5,2} = \{c_3, c_4, c_7\}, \quad C_{5,3} = \{c_5, c_6\}.$$

The threshold graphs $G(15)$–$G(18)$ (Fig. 3.23–3.24) also do not generate a new clustering.

Only the threshold graph $G(19)$ (Fig. 3.24) generates the second-to-last complete-link clustering

$$C_6^{\mathrm{comp}} = \{C_{6,1}, C_{6,2}\}$$

with two clusters $C_{6,1} = \{c_3, c_4, c_5, c_6, c_7\}$ and $C_{6,2} = \{c_1, c_2, c_8\}$. The final conjoint complete-link clustering $C_7^{\mathrm{comp}}$ is generated by the threshold graph $G(28)$. We remark that in this example only the first three levels of the single linkage and the complete linkage coincide. From the fourth level on, the clusters are different.

Now we translate this construction into formal analytic language and derive the complete-link clustering by making use of the dissimilarity tables. Start with the same Dissimilarity Table 3.4 of zero level. Using the function $T_m$ instead of $S_m$, we compute the following tables.

From Table 3.13, we see that $T_1^{\min} = T_1(1, 6) = 3$; thus we have the same complete-link clustering of the 1st level

$$C_1^{\mathrm{comp}} = \{C_{1,1}, C_{1,2}, C_{1,3}, C_{1,4}, C_{1,5}, C_{1,6}, C_{1,7}\}$$


(a, b)    diss(C_{1,a}, C_{1,b})
(1, 6)     3
(2, 3)     5
(3, 7)     6
(1, 4)     9
(1, 2)    10
(4, 5)    11
(1, 3)    12
(2, 7)    13
(5, 6)    15
(4, 6)    16
(3, 6)    17
(4, 7)    18
(1, 5)    19
(5, 7)    20
(2, 4)    22
(3, 5)    23
(2, 6)    25
(1, 7)    26
(2, 5)    27
(3, 4)    28

Table 3.13. Complete linkage: the updated dissimilarity table of the 1st level.

where $C_{1,1} = C_{0,3} \cup C_{0,4} = \{c_3, c_4\}$, $C_{1,i} = C_{0,i-1} = \{c_{i-1}\}$ for $i = 2, 3$, and $C_{1,i} = C_{0,i+1} = \{c_{i+1}\}$ for $i = 4, \ldots, 7$.

The next dissimilarity table is Table 3.14; thus, $T_2^{\min} = T_2(2, 3) = 5$, and we derive the same complete-link clustering of the 2nd level

$$C_2^{\mathrm{comp}} = \{C_{2,1}, C_{2,2}, C_{2,3}, C_{2,4}, C_{2,5}, C_{2,6}\}$$

where $C_{2,1} = C_{1,1} \cup C_{1,6} = \{c_3, c_4, c_7\}$.

From the following Dissimilarity Tables 3.15–3.18 we observe the corresponding values of the function $T_m$: $T_3^{\min} = T_3(3, 4) = 11$, $T_4^{\min} = T_4(2, 4) = 13$, $T_5^{\min} = T_5(2, 3) = 19$, and $T_6^{\min} = T_6(1, 3) = 28$.

Finally, we draw the dendrogram (Fig. 3.25) for the complete-link clustering in this example; compare it with the one in Fig. 3.13. Are these dendrograms identical?


(a, b)    diss(C_{2,a}, C_{2,b})
(2, 3)     5
(3, 6)     6
(4, 5)    11
(2, 6)    13
(1, 4)    16
(1, 3)    17
(4, 6)    18
(1, 5)    19
(5, 6)    20
(2, 4)    22
(3, 5)    23
(1, 2)    25
(1, 6)    26
(2, 5)    27
(3, 4)    28

Table 3.14. Complete linkage: the updated dissimilarity table of the 2nd level.

(a, b)    diss(C_{3,a}, C_{3,b})
(3, 4)    11
(1, 5)    13
(2, 3)    16
(3, 5)    18
(2, 4)    19
(4, 5)    20
(1, 2)    25
(2, 5)    26
(1, 4)    27
(1, 3)    28

Table 3.15. Complete linkage: the updated dissimilarity table of the 3rd level.


Figure 3.19. Complete linkage: threshold graphs $G(7)$–$G(9)$, cf. Fig. 3.11.


Figure 3.20. Complete linkage: threshold graph $G(10)$.


Table 3.16. Complete linkage: the updated dissimilarity table of the 4th level.

Pair (a, b)    $\mathrm{diss}(C_{5,a}, C_{5,b})$
(2, 3)         19
(1, 2)         26
(1, 3)         28





Figure 3.21. Complete linkage: threshold graphs $G(11)$–$G(13)$ on the vertices $c_1, \ldots, c_8$, cf. Fig. 3.12.


Figure 3.22. Complete linkage: threshold graph $G(14)$ on the vertices $c_1, \ldots, c_8$.

Figure 3.23. Complete linkage: threshold graphs $G(15)$–$G(16)$ on the vertices $c_1, \ldots, c_8$.


Figure 3.24. Complete linkage: threshold graphs $G(17)$–$G(19)$ on the vertices $c_1, \ldots, c_8$.


Figure 3.25. Complete linkage: the dendrogram for the model example (leaves $c_1, \ldots, c_8$; levels $C_0^{\mathrm{comp}}$ through $C_7^{\mathrm{comp}}$).


EP 3.4.1. Using Hubert’s complete-link algorithm, build all consecutive threshold graphs and clusterings of the set $X = \{x_1, x_2, \ldots, x_6\}$, given Dissimilarity Table 3.5. What level of clustering corresponds to the threshold levels of 2? Of 3? Of 4?

EP 3.4.2. Using Hubert’s complete-link algorithm, build the conjoint clustering of the set $X$ with the dissimilarities given in Table 3.12. Draw the corresponding dendrogram.

EP 3.4.3. Give an example of a 4-element set with different single-link and complete-link clusterings.

3.5 Case Study

In this section we apply the single-link algorithm developed in Sections 3.2–3.3 to a set of real data and use Pearson’s correlation coefficient to assess the quality of the derived clustering.

Carl Pearson’s biography and work · What is correlation?

Table 3.19 contains the final grades and GPA scores of 15 students in an Introductory Statistics class. The students $s_1$–$s_{15}$ are listed in alphabetical order. The GPA scores were calculated earlier, so that they do not reflect the grades in this class. Using the final grades, we build single-link clusterings of this 15-element set and compare the results with the students’ GPA scores. Our goal in doing this comparison is to assess the validity of the presented clustering algorithm. As a measure of dissimilarity, we have chosen the absolute value of the difference between the final grades, and used this measure to complete the Dissimilarity Table 3.20.

Student       s1     s2     s3     s4     s5     s6     s7
Final Grade   62     54     71     60     36     81     84
GPA           1.808  2.369  3.058  2.825  2.460  3.681  3.508

Student       s8     s9     s10    s11    s12    s13    s14    s15
Final Grade   69     55     70     58     61     60     40     75
GPA           2.793  2.738  3.123  3.100  2.197  2.285  2.113  2.703

Table 3.19. The final grades and GPA scores.

      s1  s2  s3  s4  s5  s6  s7  s8  s9 s10 s11 s12 s13 s14 s15
s1     0   8   9   2  26  19  22   7   7   8   4   1   2  22  13
s2         0  17   6  18  27  30  15   1  16   4   7   6  14  21
s3             0  11  35  10  13   2  16   1  13  10  11  31   4
s4                 0  24  21  24   9   5  10   2   1   0  20  15
s5                     0  45  48  33  19  34  22  25  24   4  39
s6                         0   3  12  26  11  23  20  21  41   6
s7                             0  15  29  14  26  23  24  44   9
s8                                 0  14   1  11   8   9  29   6
s9                                     0  15   3   6   5  15  20
s10                                        0  12   9  10  30   5
s11                                            0   3   2  18  17
s12                                                0   1  21  14
s13                                                    0  20  15
s14                                                        0  35

Table 3.20. The dissimilarity table for the Statistics Class grades.

In this problem we have many ties; therefore, some intermediate steps are not unique, but this does not affect our conclusions. The agglomerative single-link algorithm (Sect. 3.2) gives the following results. There are two elements whose dissimilarity is zero, thus the first-level clustering $C_1$ contains one two-element cluster $\{s_4, s_{13}\}$ and thirteen one-element clusters. Next, if the threshold level does not exceed 1, we derive nine clusters of the second level,

\[ C_2 = \{\{s_1, s_4, s_{12}, s_{13}\}, \{s_3, s_8, s_{10}\}, \{s_2, s_9\}, \{s_5\}, \{s_6\}, \{s_7\}, \{s_{11}\}, \{s_{14}\}, \{s_{15}\}\}. \]


$C_{6,1}$   Student   s1     s2     s4     s9     s11    s12    s13
            GPA       1.808  2.369  2.825  2.738  3.100  2.197  2.285

$C_{6,2}$   Student   s3     s6     s7     s8     s10    s15
            GPA       3.058  3.681  3.508  2.793  3.123  2.703

$C_{6,3}$   Student   s5     s14
            GPA       2.460  2.113

Table 3.21. C6-clustering.

There are eight clusters at the next level,

\[ C_3 = \{\{s_1, s_4, s_{11}, s_{12}, s_{13}\}, \{s_3, s_8, s_{10}\}, \{s_2, s_9\}, \{s_5\}, \{s_6\}, \{s_7\}, \{s_{14}\}, \{s_{15}\}\}. \]

Next we have six clusters,

\[ C_4 = \{\{s_1, s_2, s_4, s_9, s_{11}, s_{12}, s_{13}\}, \{s_3, s_8, s_{10}\}, \{s_6, s_7\}, \{s_5\}, \{s_{14}\}, \{s_{15}\}\}. \]

At the dissimilarity level of 4, there are only four clusters,

\[ C_5 = \{\{s_1, s_2, s_4, s_9, s_{11}, s_{12}, s_{13}\}, \{s_3, s_8, s_{10}, s_{15}\}, \{s_6, s_7\}, \{s_5, s_{14}\}\}. \]

There is no merger at the level 5; however, two of these clusters amalgamate at the dissimilarity level of 6,

\[ C_6 = \{\{s_1, s_2, s_4, s_9, s_{11}, s_{12}, s_{13}\}, \{s_3, s_6, s_7, s_8, s_{10}, s_{15}\}, \{s_5, s_{14}\}\}. \]

At the level of 7 only two clusters remain,

\[ C_7 = \{\{s_1, s_2, s_3, s_4, s_6, s_7, s_8, s_9, s_{10}, s_{11}, s_{12}, s_{13}, s_{15}\}, \{s_5, s_{14}\}\}. \]

Ultimately, these two clusters amalgamate into the conjoint clustering at the dissimilarity level of 14.

Now we want to assess the derived clustering. Table 3.21 represents the three clusters in $C_6$. Every chart contains the GPA scores of the students in the corresponding cluster.
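The clusterings listed above can be checked mechanically. The following Python sketch is not the book's procedure verbatim: it assumes that the single-link clustering at a threshold level h consists of the connected components of the threshold graph in which two students are joined whenever their final grades differ by at most h. The names `GRADES` and `single_link` are illustrative only.

```python
# Final grades of students s1..s15 from Table 3.19.
GRADES = [62, 54, 71, 60, 36, 81, 84, 69, 55, 70, 58, 61, 60, 40, 75]


def single_link(grades, threshold):
    """Return the clusters (sets of 1-based student numbers) at a threshold.

    Two students are linked when |grade difference| <= threshold; the
    clusters are the connected components of the resulting threshold graph.
    """
    parent = list(range(len(grades)))

    def find(i):
        # Follow parent links to the representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(grades)):
        for j in range(i + 1, len(grades)):
            if abs(grades[i] - grades[j]) <= threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(grades)):
        clusters.setdefault(find(i), set()).add(i + 1)
    return sorted(clusters.values(), key=min)


# Number of clusters at each dissimilarity level discussed in the text.
for h in (0, 1, 2, 3, 4, 5, 6, 7, 14):
    print(h, len(single_link(GRADES, h)))
```

Because ties only affect the order of intermediate mergers, the components at a given threshold do not depend on how the ties are broken.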

Real data always have significant variability, thus there is no perfect match. However, we see that at this threshold level the clusters $C_{6,2}$ and $C_{6,3}$ demonstrate good uniformity of the GPA scores they contain, while $C_{6,1}$ shows a larger variety of scores.

Next, consider the clustering $C_5$, shown in Table 3.22. Again, we see that there is a noticeable closeness of the GPA scores within the clusters $C_{5,2}$, $C_{5,3}$, and $C_{5,4}$. In particular, cluster $C_{5,3}$ contains the two highest GPA scores.

$C_{5,1}$   Student   s1     s2     s4     s9     s11    s12    s13
            GPA       1.808  2.369  2.825  2.738  3.100  2.197  2.285

$C_{5,2}$   Student   s3     s8     s10    s15
            GPA       3.058  2.793  3.123  2.703

$C_{5,3}$   Student   s6     s7
            GPA       3.681  3.508

$C_{5,4}$   Student   s5     s14
            GPA       2.460  2.113

Table 3.22. C5-clustering.

To give a quantifiable assessment of the clusterings derived, we do some statistics. Tables 3.23–3.29 contain the averaged grades and the averaged GPA scores for each cluster at all levels. Finally, Table 3.30 contains Pearson’s correlation coefficients for the GPA scores and averaged grades for every level of clustering. We see that every next level of clustering, except for $C_4$, increases the correlation of the final grades and the GPA scores. This observation supports the validity of the clustering algorithm of Sections 3.2–3.3.

Clustering C0    Average grade   Average GPA score
Cluster C0,1     62.4            2.717

Table 3.23. The grades and GPA scores over the entire class – C0-clustering.

Clustering C1    Average grade   Average GPA score
Cluster C1,1     60              2.555
Cluster C1,2     62              1.808
Cluster C1,3     54              2.369
Cluster C1,4     71              3.058
Cluster C1,5     36              2.460
Cluster C1,6     81              3.681
Cluster C1,7     84              3.508
Cluster C1,8     69              2.793
Cluster C1,9     55              2.738
Cluster C1,10    70              3.123
Cluster C1,11    58              3.100
Cluster C1,12    61              2.197
Cluster C1,13    40              2.113
Cluster C1,14    75              2.703

Clustering C2    Average grade   Average GPA score
Cluster C2,1     60.75           2.279
Cluster C2,2     70              2.991
Cluster C2,3     54.5            2.554
Cluster C2,4     36              2.460
Cluster C2,5     81              3.681
Cluster C2,6     84              3.508
Cluster C2,7     58              3.100
Cluster C2,8     40              2.113
Cluster C2,9     75              2.703

Clustering C3    Average grade   Average GPA score
Cluster C3,1     60.2            2.443
Cluster C3,2     70              2.991
Cluster C3,3     54.5            2.554
Cluster C3,4     36              2.460
Cluster C3,5     81              3.681
Cluster C3,6     84              3.508
Cluster C3,7     40              2.113
Cluster C3,8     75              2.703

Clustering C4    Average grade   Average GPA score
Cluster C4,1     58.57           2.475
Cluster C4,2     70              2.991
Cluster C4,3     82.5            3.594
Cluster C4,4     36              2.460
Cluster C4,5     40              2.113
Cluster C4,6     75              2.703

Clustering C5    Average grade   Average GPA score
Cluster C5,1     58.57           2.475
Cluster C5,2     71.25           2.919
Cluster C5,3     82.5            3.595
Cluster C5,4     38              2.287

Clustering C6    Average grade   Average GPA score
Cluster C6,1     58.57           2.475
Cluster C6,2     75              3.144
Cluster C6,3     38              2.287

Clustering level          C0     C1     C2     C3     C4     C5     C6
Correlation coefficient   0.659  0.671  0.788  0.850  0.832  0.925  0.929

Table 3.30. Pearson’s coefficient of correlation.
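A minimal Python sketch of the computation behind Table 3.30 follows. It rests on an assumption that the text does not spell out: the coefficient for a clustering is taken to be the ordinary, unweighted Pearson correlation between the cluster-average grades and the cluster-average GPA scores. Under that assumption the values obtained for $C_5$ and $C_6$ agree with the tabulated 0.925 and 0.929 to about three decimals. All function and variable names are illustrative.

```python
from math import sqrt

# Data from Table 3.19 (students s1..s15).
GRADES = [62, 54, 71, 60, 36, 81, 84, 69, 55, 70, 58, 61, 60, 40, 75]
GPA = [1.808, 2.369, 3.058, 2.825, 2.460, 3.681, 3.508, 2.793,
       2.738, 3.123, 3.100, 2.197, 2.285, 2.113, 2.703]

# Clusterings C5 and C6 from the text, as lists of 1-based student numbers.
C5 = [[1, 2, 4, 9, 11, 12, 13], [3, 8, 10, 15], [6, 7], [5, 14]]
C6 = [[1, 2, 4, 9, 11, 12, 13], [3, 6, 7, 8, 10, 15], [5, 14]]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def cluster_correlation(clustering):
    """Correlate cluster-average grades with cluster-average GPA scores."""
    avg_grade = [sum(GRADES[s - 1] for s in c) / len(c) for c in clustering]
    avg_gpa = [sum(GPA[s - 1] for s in c) / len(c) for c in clustering]
    return pearson(avg_grade, avg_gpa)


print(round(cluster_correlation(C5), 3), round(cluster_correlation(C6), 3))
```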

Part II

Combinatorial Analysis

Chapter 4

Enumerative Combinatorics

The methods developed in this chapter allow us to solve more advanced problems with the same question “How many?”. Section 4.1 treats the inclusion-exclusion principle. Inversion formulas, including the Möbius inversion and their applications, are studied in Section 4.2. Generating functions are considered in Sections 4.3–4.4, and Section 4.5 is devoted to the Pólya–Redfield enumeration theory.

4.1 The Inclusion-Exclusion Principle

Inclusion-Exclusion Principle · Eratosthenes’ biography · Sieve of Eratosthenes · Stirling’s biography · Derangements · Bell’s biography · Bell numbers · Bonferroni · Bonferroni inequalities · Napier (Neper) biography · What is neper? · Maclaurin’s biography · Totient numbers

Problem 4.1.1. Each member of the Combi Club plays at least one game; 5 students go to football, 12 to basketball, and 8 to volleyball. How many members are there in the club?

Solution. Denote by $S_f$, $S_b$, and $S_v$ the sets of students who play, respectively, football, basketball, and volleyball, and by $S$ the entire membership of the club. Obviously, $|S_f| = 5$, $|S_b| = 12$, $|S_v| = 8$, and $S = S_f \cup S_b \cup S_v$. But we cannot apply the Sum Rule, since a member can play two or three games and the subsets $S_f$, $S_b$, $S_v$ do not have to be disjoint.

Unless we have some additional information, this problem has several solutions. For instance, if each student participates in one and only one sport, then the three subsets are mutually disjoint and by the Sum Rule we have $|S| = 5 + 12 + 8 = 25$. However, if all five football players and all eight volleyball players also play basketball, then the entire membership consists of only 12 students. Since $5 + 8 > 12$, the latter option would necessarily imply that at least one club member plays all three games. Thus, if we have no more information, we must conclude that the number of club members satisfies the two-sided inequality $12 \le |S| \le 25$, and we cannot say anything more.


Hence, if we want to make a more specific conclusion, we need certain additional information about intersections of the given sets. The following assertion, which involves these quantities, is equation (1.1.4), which was proved in Section 1.1. We state it here again. Hereafter it is referred to as the Inclusion-Exclusion Principle. It is also called the (Eratosthenes) Sieve Formula.

Theorem 4.1.1. If $X_i$, $1 \le i \le k$, are finite sets, then

\[ |X_1 \cup X_2 \cup \cdots \cup X_k| = |X_1| + |X_2| + \cdots + |X_k| - |X_1 \cap X_2| - \cdots - |X_{k-1} \cap X_k| + |X_1 \cap X_2 \cap X_3| + \cdots + (-1)^{k-1} |X_1 \cap X_2 \cap \cdots \cap X_k|. \tag{4.1.1} \]

The right-hand side of this equation consists of $k$ groups of terms. The first group contains the cardinalities of the $k$ given sets. The second group contains $C(k,2)$ cardinalities of their pairwise intersections, and all the terms in this group have negative signs. The third group contains $C(k,3)$ cardinalities of the triple intersections of the sets $X_i$ with the plus sign, and so forth. Since $C(k,k) = 1$, the last group contains one term $(-1)^{k-1} |X_1 \cap X_2 \cap \cdots \cap X_k|$. If all sets $X_i$ are pairwise disjoint, then the cardinal numbers of all intersections are zero and (4.1.1) reduces to the Sum Rule (1.2.1). Now let us modify Problem 4.1.1.

Problem 4.1.2. Each member of the Combi Club plays at least one game; 5 people play football, 12 play basketball, and 8 play volleyball. In addition, this time we know that two of them are devoted to both football and volleyball, three members go to football and basketball, and four play basketball and volleyball. The best mathematician in the club plays all three sports. How many members are there in the club? How many among them play only volleyball?

Solution. As before, denote by $S_f$, $S_b$, and $S_v$ the sets of students who play, respectively, football, basketball, and volleyball, so that $|S_f| = 5$, $|S_b| = 12$, $|S_v| = 8$, and again $S = S_f \cup S_b \cup S_v$. However, now we are given the cardinalities of all the terms in formula (4.1.1) with $k = 3$, and we can apply the Inclusion-Exclusion Principle directly:

\[ |S| = |S_f| + |S_b| + |S_v| - |S_b \cap S_f| - |S_b \cap S_v| - |S_f \cap S_v| + |S_b \cap S_f \cap S_v| = 5 + 12 + 8 - 3 - 4 - 2 + 1 = 17. \]

Actually Theorem 4.1.1 contains more information. Applying the same formula (4.1.1) with $k = 2$ to the sets $S_b \cap S_v$ and $S_f \cap S_v$, whose intersection is $S_b \cap S_f \cap S_v$, we have

\[ |(S_b \cap S_v) \cup (S_f \cap S_v)| = |S_b \cap S_v| + |S_f \cap S_v| - |S_b \cap S_f \cap S_v| = 4 + 2 - 1 = 5. \]

Therefore, among the eight volleyball players five also play either basketball or football, so $8 - 5 = 3$ students play only volleyball and no other game.
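The computation in this solution can be replayed on a concrete model. The Python sketch below builds one hypothetical membership list that is consistent with the data of Problem 4.1.2 (the member labels 0 to 16 are arbitrary and not part of the problem) and evaluates the alternating sum (4.1.1) for any finite family of sets; the function name `union_size` is illustrative.

```python
from itertools import combinations


def union_size(sets):
    """Size of the union computed by the alternating sum (4.1.1)."""
    total = 0
    for r in range(1, len(sets) + 1):
        sign = (-1) ** (r - 1)
        for family in combinations(sets, r):
            total += sign * len(set.intersection(*family))
    return total


# One hypothetical membership list consistent with Problem 4.1.2:
# member 0 plays all three games, members 14-16 play only volleyball.
football = {0, 1, 2, 3, 4}
basketball = {0, 1, 2, 5, 6, 7, 8, 9, 10, 11, 12, 13}
volleyball = {0, 3, 5, 6, 7, 14, 15, 16}

club = [football, basketball, volleyball]
only_volleyball = volleyball - football - basketball

# The sieve value and the directly computed union size must coincide.
print(union_size(club), len(football | basketball | volleyball))
print(len(only_volleyball))
```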

We leave it to the reader to solve the following two problems.

Problem 4.1.3. In the same club, how many members play only football? Only basketball?

Problem 4.1.4. Each member of the Combi Club participates in at least one sport; 5 students go to football, 12 to basketball, and 8 to volleyball. What additional information should we have to ensure that the club has precisely 13 members? Or exactly 14, or 15, . . . , or 24 members?

Problem 4.1.5. How many $n$-arrangements with repetition from the elements of the set $A = \{0, 1, 2, 3\}$ contain at least one digit 1, at least one digit 2, and at least one digit 3?

Solution. With no restriction, there are $4^n$ $n$-arrangements with repetition. For each of the three digits 1, 2, 3 there are $3^n$ arrangements that certainly do not contain this digit and maybe do not contain some other digits either; these are the arrangements from the three remaining elements of $A$. Similarly, for each of the three pairs of these digits there are $2^n$ arrangements containing neither digit of the pair, and there is $1^n = 1$ arrangement consisting only of 0s. Now, by (4.1.1) there are $4^n - 3 \cdot 3^n + 3 \cdot 2^n - 1$ $n$-arrangements satisfying the problem. For example, if $n = 3$ then there are $4^3 - 3 \cdot 3^3 + 3 \cdot 2^3 - 1 = 6 = 3!$ arrangements, which is clear since all eligible 3-arrangements are precisely the permutations of the three-element set $A' = \{1, 2, 3\}$. If $n = 2$ or $n = 1$, then $4^n - 3 \cdot 3^n + 3 \cdot 2^n - 1 = 0$, which is also obvious since any arrangement in question must contain at least the three numbers 1, 2, 3.
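As a sanity check of this count, the short Python sketch below compares the closed formula with a direct enumeration of all $4^n$ arrangements for small $n$. The function names are illustrative, and the enumeration is only practical for small $n$.

```python
from itertools import product


def by_formula(n):
    """4^n - 3*3^n + 3*2^n - 1, the count obtained from (4.1.1)."""
    return 4**n - 3 * 3**n + 3 * 2**n - 1


def by_enumeration(n):
    """Count n-arrangements over {0,1,2,3} that contain 1, 2 and 3."""
    return sum(1 for a in product(range(4), repeat=n) if {1, 2, 3} <= set(a))


for n in range(1, 7):
    print(n, by_formula(n), by_enumeration(n))
```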

Theorem 4.1.1 can be stated in other terms; see, for instance, [23, p. 18]. First we introduce some notation. Let $q$ properties $P_i$, $1 \le i \le q$, be defined for the elements of a finite set $X$, $|X| = n < \infty$; that is, each element $x \in X$ either possesses or does not possess the property $P_i$ for every $i$, $1 \le i \le q$. If an element $x$ possesses the property $P_i$, we denote this by $P_i(x) = 1$; otherwise, if $x$ does not possess this property, we write $P_i(x) = 0$. Therefore, these properties are mappings $P_i : X \to \{0, 1\}$. Let $X_i$ be the subset of $X$ whose elements have the property $P_i$, thus these sets are the total preimages of 1, $X_i = P_i^{-1}(\{1\})$, and let $n_i = |X_i|$ be its cardinal number. Let $X_{i,j}$ be the subset of $X$ whose elements possess both properties $P_i$ and $P_j$, and $n_{i,j} = |X_{i,j}|$ be its cardinal number, etc. Let $X_0 = X \setminus (X_1 \cup \cdots \cup X_q)$ be the subset of all elements of $X$ possessing no property $P_i$, $1 \le i \le q$, and $n_0 = |X_0|$.

The following equation is also called the Sieve Formula.


Theorem 4.1.2.

\[ n_0 = n - \sum_{i=1}^{q} n_i + \sum_{i_1 < i_2} n_{i_1,i_2} - \cdots + (-1)^s \sum_{i_1 < i_2 < \cdots < i_s} n_{i_1,i_2,\ldots,i_s} + \cdots + (-1)^q n_{1,2,\ldots,q}. \tag{4.1.2} \]

Proof. The conclusion follows immediately from Theorem 4.1.1 applied to the set $X = X_0 \cup X_1 \cup \cdots \cup X_q$, if we notice that $X_0$ is disjoint from every set $X_i$, $1 \le i \le q$.

If all sets $X_i$ have equal cardinalities, all sets $X_{i,j}$ also have equal cardinalities, all sets $X_{i,j,k}$ have equal cardinalities, and so forth, then (4.1.2) can be simplified.

Corollary 4.1.3. If $n_i = n^{(1)}$, $\forall i$, $n_{i,j} = n^{(2)}$, $\forall i, j$, . . . , $n_{i_1,i_2,\ldots,i_s} = n^{(s)}$ for all $s$-tuples of subscripts, and so forth for $1 \le s \le q$, then

\[ n_0 = n - q n^{(1)} + C(q,2) n^{(2)} - \cdots + (-1)^s C(q,s) n^{(s)} + \cdots + (-1)^q n^{(q)}. \]

This corollary immediately implies the next one.

Corollary 4.1.4. If $|X| = n \ge |Y| = m$, then there are

\[ |\mathrm{Sur}(Y^X)| = m^n - C(m,1)(m-1)^n + \cdots + (-1)^{m-1} C(m, m-1) \]

surjective mappings from $X$ to $Y$. In particular, if $m = n$, then a surjective mapping is simultaneously injective and therefore bijective, thus $|\mathrm{Sur}(Y^X)| = n!$ and the latter equation becomes

\[ n! = \sum_{k=0}^{n} (-1)^k C(n,k) (n-k)^n. \]
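The formula of Corollary 4.1.4 is easy to test numerically. The following Python sketch evaluates the alternating sum and compares it with a brute-force enumeration of all $m^n$ mappings; the function names are illustrative. The sum in the code runs up to $k = m$, which only adds a zero term $(m-m)^n$ when $n \ge 1$.

```python
from itertools import product
from math import comb, factorial


def surjections(n, m):
    """Number of surjections from an n-set onto an m-set (n >= 1)."""
    return sum((-1) ** k * comb(m, k) * (m - k) ** n for k in range(m + 1))


def surjections_brute(n, m):
    """Enumerate all m^n mappings and keep those that hit every element."""
    return sum(1 for f in product(range(m), repeat=n) if len(set(f)) == m)


for n in range(1, 6):
    for m in range(1, n + 1):
        print(n, m, surjections(n, m), surjections_brute(n, m))

# For m = n every surjection is a bijection, so the count is n!.
print(surjections(5, 5), factorial(5))
```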

Now we can easily solve the following important problem.

Problem 4.1.6. In how many ways is it possible to place $n$ different balls into $m$ different urns with no urn left empty?

Solution. Considering Definition 1.1.4 of the preimage of an element, the answer is given by the number $|\mathrm{Sur}(Y^X)|$ with $|X| = n \ge |Y| = m$. On the other hand, if all urns are indistinguishable, then any permutation of the urns that does not change their contents leads to the same placement of the balls, hence there are

\[ S_2(n,m) = \frac{1}{m!} |\mathrm{Sur}(Y^X)| \tag{4.1.3} \]

ways to put $n$ different balls into $m$ identical urns with no empty urns.


Definition 4.1.5. The numbers $S_2(n,m)$ are called Stirling numbers of the second kind. These numbers also count partitions of sets; namely, the number $S_2(n,m)$ is equal to the number of partitions of an $n$-element set into $m$ nonempty parts, if the order of parts is immaterial.

Properties of the Stirling numbers of the second kind are discussed in EPs 4.1.22, 4.1.23, 4.2.3; more properties of these numbers can be found in [31]. The Stirling numbers of the first kind, $S_1(n,m)$, are defined at the end of this section.
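To illustrate Definition 4.1.5, the Python sketch below computes $S_2(n,m)$ from formula (4.1.3) and, independently, counts the partitions of an $n$-element set into $m$ nonempty parts by assigning the elements one at a time either to an already opened part or to a new one. The helper names are illustrative, and the direct count is only meant for small $n$.

```python
from math import comb, factorial


def stirling2(n, m):
    """S2(n, m) from (4.1.3): surjections onto an m-set divided by m!."""
    onto = sum((-1) ** k * comb(m, k) * (m - k) ** n for k in range(m + 1))
    return onto // factorial(m)


def count_partitions(n, m):
    """Count partitions of an n-set into m nonempty parts directly."""
    def extend(placed, parts_used):
        if placed == n:
            return 1 if parts_used == m else 0
        # The next element joins part b; b == parts_used opens a new part.
        return sum(extend(placed + 1, max(parts_used, b + 1))
                   for b in range(parts_used + 1))
    return extend(0, 0)


for n in range(1, 6):
    print(n, [stirling2(n, m) for m in range(1, n + 1)])
```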

Problem 4.1.7. In how many ways is it possible to paint four walls of a room in three colors, so that any two adjacent walls have different colors?

Solution. There are $A_{\mathrm{rep}}(3,4) = 3^4$ ways to paint the walls without any restriction. Let us enumerate the corners of the room by digits from 1 through 4 consecutively, starting at any fixed corner in any direction, and consider the following properties $P_i$, $i = 1, 2, 3, 4$, on the set of all possible colorings:

A coloring has a property $P_i$, $1 \le i \le 4$, if the two walls adjacent at the $i$th corner are of the same color.

Then $n_1 = n_2 = n_3 = n_4 = 3 \cdot 3^2$, $n_{1,3} = n_{2,4} = 3 \cdot 3$, $n_{1,2} = n_{2,3} = n_{3,4} = n_{1,4} = 3 \cdot 3$, $n_{1,2,3} = n_{1,2,4} = n_{1,3,4} = n_{2,3,4} = 3$, and $n_{1,2,3,4} = 3$; these parameters were defined before Theorem 4.1.2. Due to (4.1.2) we get

\[ 3^4 - 4 \cdot 3 \cdot 3^2 + (4 \cdot 3 \cdot 3 + 2 \cdot 3 \cdot 3) - 4 \cdot 3 + 3 = 18 \]

different colorings. Similarly, if only two colors are available, there are $16 - 32 + 24 - 8 + 2 = 2$ colorings, though this is clear without calculations.
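Both answers can be confirmed by listing all colorings. The Python sketch below treats the four walls as positions on a cycle and counts the colorings in which cyclically adjacent walls differ; the function name and the encoding of colors as integers are illustrative choices.

```python
from itertools import product


def wall_colorings(colors, walls=4):
    """Count colorings of walls on a cycle with adjacent walls distinct."""
    return sum(
        1
        for c in product(range(colors), repeat=walls)
        if all(c[i] != c[(i + 1) % walls] for i in range(walls))
    )


print(wall_colorings(3), wall_colorings(2))

# The same value for three colors, term by term as in (4.1.2).
print(3**4 - 4 * 3 * 3**2 + (4 * 3 * 3 + 2 * 3 * 3) - 4 * 3 + 3)
```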

Remark 4.1.6. Any coloring in this problem can be represented by a plane graph with four vertices, corresponding to the four walls, such that two vertices are adjacent if and only if they correspond to neighboring walls. Now Problem 4.1.7 can be stated as a graph coloring problem:

In how many ways is it possible to color the vertices of a cycle of length four in three colors so that any pair of adjacent vertices has different colors?

The Inclusion-Exclusion Principle leads to useful inequalities involving the cardinal numbers of subsets. Consider again equation (4.1.1) with $k = 3$ and $X = X_1 \cup X_2 \cup X_3$. In this case (4.1.1) becomes

\[ |X| = |X_1| + |X_2| + |X_3| - |X_1 \cap X_2| - |X_2 \cap X_3| - |X_1 \cap X_3| + |X_1 \cap X_2 \cap X_3|. \]

It is obvious from here that

\[ |X| \ge |X_1| + |X_2| + |X_3| - |X_1 \cap X_2| - |X_2 \cap X_3| - |X_1 \cap X_3|. \]

Moreover, since

\[ |X_1 \cap X_2| + |X_2 \cap X_3| + |X_1 \cap X_3| \ge |X_1 \cap X_2 \cap X_3|, \]

we clearly have an opposite bound

\[ |X| \le |X_1| + |X_2| + |X_3|. \]

Similar inequalities can be derived for any $k$; therefore, the Inclusion-Exclusion Principle produces a series of alternating upper and lower bounds for $|X|$.

Problem 4.1.8 (Bonferroni’s inequalities). Let $X_1, \ldots, X_n$ be nonempty subsets of a finite set $X$, and $T$ be any subset of $\{1, 2, \ldots, n\}$. Denote

\[ \tau = \max \Bigl\{ |T| : T \subset \{1, 2, \ldots, n\} \text{ such that } \bigcap_{i \in T} X_i \ne \emptyset \Bigr\} \]

and for $l = 1, 2, \ldots, \tau - 1$,

\[ \delta_l = \Bigl| \bigcup_{i=1}^{n} X_i \Bigr| - \sum_{j=1}^{l} (-1)^{j-1} \Bigl( \sum_{T \subset \{1,\ldots,n\},\, |T| = j} \Bigl| \bigcap_{i \in T} X_i \Bigr| \Bigr). \]

Then $(-1)^l \delta_l \ge 0$.

It is instructive to verify these inequalities in some simple case, for example, if $n = 3$, $X = \{1, 2, 3\}$, $X_1 = \{1\}$, $X_2 = \{1, 2\}$, and $X_3 = \{1, 2, 3\}$.
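The suggested verification can be carried out with a few lines of Python. The sketch below computes the largest number of sets with a common element and the remainders left after truncating the inclusion-exclusion sum, following the statement of Problem 4.1.8; the function name `bonferroni_deltas` and its return format are illustrative choices.

```python
from itertools import combinations


def bonferroni_deltas(sets):
    """Return the remainders for l = 1, ..., tau - 1 as a list."""
    n = len(sets)
    # level[j] = sum of |intersection| over all j-element subfamilies.
    level = [0] * (n + 1)
    for j in range(1, n + 1):
        level[j] = sum(len(set.intersection(*fam))
                       for fam in combinations(sets, j))
    # tau = largest j for which some j of the sets have a common element.
    tau = max(j for j in range(1, n + 1) if level[j] > 0)
    union = len(set().union(*sets))
    deltas, partial = [], 0
    for l in range(1, tau):
        partial += (-1) ** (l - 1) * level[l]
        deltas.append(union - partial)
    return deltas


example = [{1}, {1, 2}, {1, 2, 3}]
deltas = bonferroni_deltas(example)
print(deltas)
print([(-1) ** l * d >= 0 for l, d in enumerate(deltas, start=1)])
```

For this example the union has 3 elements, the cardinalities sum to 6, and the pairwise intersections sum to 4, so the two remainders are $3 - 6 = -3$ and $3 - (6 - 4) = 1$, with the signs required by the inequalities.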

Theorem 4.1.2 can be further generalized if we supply the elements of the set $X$ with weights. Consider a mapping $w : X \to W$, where a set $W$ can be specified in some convenient way. The image $w(x) \in W$ of an element $x \in X$ is called the weight of $x$. To give an example of weights, let us suppose that $X$ is the inventory of all items in a store. Then the price of any item $x \in X$ can be viewed as the weight of $x$, and the mapping $w$ assigns to any item $x$ in stock its price $w(x)$. For the time being we do not need any rich algebraic structure on $W$; it is enough to assume that we can add elements of $W$, and the operation of addition on $W$ is commutative and associative. We define also the quantities

\[ w(P_{i_1}, P_{i_2}, \ldots, P_{i_r}) = \sum_{x \in X_{i_1} \cap X_{i_2} \cap \cdots \cap X_{i_r}} w(x), \]

where the properties $P_i$ and the sets $X_i$ were defined after Problem 4.1.5. Set also

\[ w(r) = \sum w(P_{i_1}, P_{i_2}, \ldots, P_{i_r}), \]

where the sum runs over all $r$-element subsets of the set of properties $\{P_1, \ldots, P_q\}$; that is, $w(r)$ is the sum of the weights of elements possessing at least $r$ properties, where each such element is counted once for every $r$-element subset of its properties. Moreover, let $w(0) = \sum_{x \in X} w(x)$ and $E(r)$ be the sum of weights of those elements of $X$ that have exactly $r$ properties.


Theorem 4.1.7. In these notations

\[ E(r) = w(r) - C(r+1, r) w(r+1) + \cdots + (-1)^{q-r} C(q, r) w(q) \]

for $0 \le r \le q$. In particular, if $r = 0$, then

\[ E(0) = w(0) - w(1) + \cdots + (-1)^q w(q), \]

which implies (4.1.2) if we choose here all weights $w(x) \equiv 1$.
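Theorem 4.1.7 can be tried out on a small weighted set. In the Python sketch below the set, the three divisibility properties, and the weight function are all hypothetical choices made only for illustration: $X = \{1, \ldots, 30\}$, the properties are divisibility by 2, 3, and 5, and the weight of a number is the number itself.

```python
from itertools import combinations
from math import comb

X = range(1, 31)
PROPERTIES = [lambda x: x % 2 == 0, lambda x: x % 3 == 0, lambda x: x % 5 == 0]


def weight(x):
    """Illustrative weight: the number itself."""
    return x


def w(r):
    """Sum of w(P_i1, ..., P_ir) over all r-element sets of properties."""
    return sum(
        weight(x)
        for props in combinations(PROPERTIES, r)
        for x in X
        if all(p(x) for p in props)
    )


def exactly_by_formula(r):
    """E(r) as the alternating sum of Theorem 4.1.7."""
    q = len(PROPERTIES)
    return sum((-1) ** (s - r) * comb(s, r) * w(s) for s in range(r, q + 1))


def exactly_direct(r):
    """Sum of weights of the elements having exactly r of the properties."""
    return sum(weight(x) for x in X if sum(p(x) for p in PROPERTIES) == r)


for r in range(len(PROPERTIES) + 1):
    print(r, exactly_by_formula(r), exactly_direct(r))
```

For $r = 0$ the empty set of properties imposes no condition, so `w(0)` is the total weight of $X$, in agreement with the definition of $w(0)$.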

Next we consider a classical application of (4.1.2), called the derangement problem. For many other applications of these results see, for instance, [43].

Definition 4.1.8. A permutation $(a_1, \ldots, a_n)$ of the natural segment $N_n = \{1, 2, \ldots, n\}$ is called a derangement if $a_i \ne i$ for all $i = 1, 2, \ldots, n$.

Problem 4.1.9. How many derangements are there among all $n!$ permutations of the natural segment $N_n = \{1, 2, \ldots, n\}$?

Solution. Denote the number of derangements by $D_n$ and let $P_i$ be the property $a_i = i$, $1 \le i \le n$, defined on all $n$-permutations. By Theorem 4.1.2, we get immediately

\[ D_n = n! - C(n,1)(n-1)! + C(n,2)(n-2)! - \cdots + (-1)^n C(n,n) = n! \left( 1 - 1 + \frac{1}{2!} - \frac{1}{3!} + \cdots + \frac{(-1)^n}{n!} \right). \tag{4.1.4} \]

Quite similarly, the number $D_n(r)$ of $n$-permutations possessing exactly $r$ out of $n$ properties $P_1, \ldots, P_n$ is equal to $D_n(r) = C(n,r) D_{n-r}$, thus

\[ D_n(r) = \frac{n!}{r!} \left( 1 - 1 + \frac{1}{2!} - \frac{1}{3!} + \cdots + \frac{(-1)^{n-r}}{(n-r)!} \right). \]

It is obvious that $D_n = D_n(0)$.
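Formula (4.1.4) and the relation $D_n(r) = C(n,r) D_{n-r}$ can be checked against a direct enumeration. The Python sketch below uses integer arithmetic only and, for $n = 6$, classifies all 720 permutations by their number of fixed points; the function names are illustrative.

```python
from itertools import permutations
from math import comb, factorial


def derangements(n):
    """D_n from the first expression in (4.1.4)."""
    return sum((-1) ** k * comb(n, k) * factorial(n - k) for k in range(n + 1))


def fixed_point_counts(n):
    """counts[r] = number of permutations of n items with r fixed points."""
    counts = [0] * (n + 1)
    for perm in permutations(range(n)):
        counts[sum(1 for i, a in enumerate(perm) if a == i)] += 1
    return counts


n = 6
counts = fixed_point_counts(n)
print(derangements(n), counts[0])
print([comb(n, r) * derangements(n - r) for r in range(n + 1)])
print(counts)
```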

Remark 4.1.9. The factorials satisfy the recurrence relations $n! = n (n-1)!$ and $n! = (n-1)[(n-1)! + (n-2)!]$. Since the derangement numbers $D_n$ satisfy similar recurrence relations

\[ D_n = n D_{n-1} + (-1)^n = (n-1)[D_{n-1} + D_{n-2}], \tag{4.1.5} \]

they are called subfactorials.

Problem 4.1.10. Prove the recurrence relations (4.1.5).


Remark 4.1.10. Expanding¹ $e^{-1}$ in the Maclaurin series and using the property of alternating series with monotone decreasing terms, well known from calculus [46, p. 607], we get the estimate $|D_n - n!/e| < \frac{1}{n+1}$; that is, for any $n \ge 2$ the derangement number $D_n$ can be defined as the nearest integer to $n!/e$, since $\frac{1}{n+1} < \frac{1}{2}$ whenever $n \ge 2$.
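The recurrences (4.1.5) and the nearest-integer description of $D_n$ are easy to test for moderate $n$. The Python sketch below is only a numerical check, not a proof; it uses floating-point division for $n!/e$, which is accurate enough in the range shown, and the function name is illustrative.

```python
from math import comb, e, factorial


def derangements(n):
    """D_n from (4.1.4), computed with integers."""
    return sum((-1) ** k * comb(n, k) * factorial(n - k) for k in range(n + 1))


for n in range(2, 13):
    d = derangements(n)
    # Both recurrences in (4.1.5).
    assert d == n * derangements(n - 1) + (-1) ** n
    assert d == (n - 1) * (derangements(n - 1) + derangements(n - 2))
    # D_n is the integer nearest to n!/e.
    assert d == round(factorial(n) / e)
    print(n, d, factorial(n) / e)
```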

Problem 4.1.11. In how many ways is it possible to place eight rooks on the chessboard, so that none of them could attack another and none of them are on the main white diagonal?

Solution. Formula (4.1.4) immediately gives the answer, $D_8 = 14\,833$; we note that $8!/e \approx 14\,832.9$.

Problem 4.1.12. The Combi Club bought $2n$ tickets and reserved $n$ seats for two ball games. The $n$ tickets to the first game were distributed at random among $n$ students. Then the $n$ tickets to the second game were distributed, also at random, among the same $n$ students. In how many ways is it possible to distribute these $2n$ tickets, so that no student gets the same seat twice?

Solution. Without any restriction, the tickets can be distributed in $(n!)^2$ ways. However, if we want to avoid repetitions of same-seat tickets, the second distribution can be done in $D_n$ ways. Formula (4.1.4) and the Product Rule result in

\[ n! \cdot D_n = n! \cdot \left\{ n! \left( 1 - \frac{1}{1!} + \frac{1}{2!} - \frac{1}{3!} + \cdots + \frac{(-1)^n}{n!} \right) \right\} \approx \frac{1}{e} (n!)^2 \]

different ways to distribute the tickets.

As another application of Theorem 4.1.2, we again derive two formulas that were proven in Problems 1.4.18–1.4.19.

Problem 4.1.13. Demonstrate the formulas

\[ 2^n = C(n,0) + C(n,1) + \cdots + C(n,n) \]

and

\[ 1 = C(n,0) 2^n - C(n,1) 2^{n-1} + \cdots + (-1)^n C(n,n) 2^0. \]

Solution. Let us paint a ball using $n$ different colors, not all of which have to be used. Let a property $P_i$ mean that the $i$th color is applied to the ball. Then $n_0 = 1$, since there is only one way not to color the ball at all. Moreover, $n_{i_1,i_2,\ldots,i_k} = 2^{n-k}$, $1 \le k \le n$, and the total number of colorings is $2^n$. Now the second formula follows directly from (4.1.2).

However, if $P_i$ means the property that a coloring contains precisely $i$ colors, then $n_i = C(n,i)$, $n_{i,j} = n_{i,j,k} = \cdots = 0$, and (4.1.2) implies the first formula in the problem.

¹ Here $e \approx 2.718281828$ is Napier’s number, the base of natural logarithms.

Next we apply Theorem 4.1.2 to another graph coloring problem. Given a graph $G = (V, E)$ of order $p$, we want to paint its $p$ vertices in a given number of colors.

Definition 4.1.11. A coloring is called regular if the end-vertices of every edge have different colors. A graph $G$ is called $k$-chromatic if its vertices can be regularly colored in $k$ colors. The smallest such number is called the chromatic number $\chi = \chi(G)$ of the graph $G$. The number of various regular colorings of a graph $G$ in $k$ colors is denoted by $\chi(G, k)$.

Theorem 1.1.35 immediately implies that for a graph of order $p$ there are $k^p$, not necessarily regular, $k$-colorings. To exclude non-regular colorings from this number, we denote by $\chi(e_\alpha, e_\beta, \ldots, e_\delta)$ the number of colorings such that the end-vertices of the edge $e_\alpha$ have the same color, the end-vertices of $e_\beta$ also have the same color (maybe different from the color of $e_\alpha$), etc. Now Theorem 4.1.2 immediately yields the first statement of the following result.

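Theorem 4.1.12. 1) For any simple graph $G$ of order $p$

\[ \chi(G, k) = k^p - \sum_{\alpha} \chi(e_\alpha) + \sum_{\alpha \ne \beta} \chi(e_\alpha, e_\beta) - \sum_{\alpha \ne \beta \ne \gamma \ne \alpha} \chi(e_\alpha, e_\beta, e_\gamma) + \cdots. \tag{4.1.6} \]

2) If $G$ is a tree, then $\chi(G, k) = k (k-1)^{p-1}$.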

Proof. To prove 2), we notice that the size of the tree $G$ is $p - 1$, thus there are $p - 1$ ways to select an edge whose end-vertices have the same color, and this color can be chosen in $k$ ways. After that we can paint the $p - 2$ remaining vertices in any of $k$ colors, hence $\sum_{\alpha} \chi(e_\alpha) = k^{p-2} k (p-1)$. Similarly, $\sum_{\alpha \ne \beta} \chi(e_\alpha, e_\beta) = k^{p-2} (p-1)(p-2)/2$, etc. Substituting these expressions in (4.1.6) and using the binomial expansion (1.4.4), we complete the proof.
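Part 2) of Theorem 4.1.12 can be checked by brute force on small trees. The Python sketch below counts the regular colorings of two hypothetical trees of order 5, a path and a star, and compares the counts with $k(k-1)^{p-1}$; the vertex labels, edge lists, and function name are illustrative.

```python
from itertools import product


def regular_colorings(p, edges, k):
    """Count k-colorings of vertices 0..p-1 in which every edge has
    differently colored end-vertices."""
    return sum(
        1
        for c in product(range(k), repeat=p)
        if all(c[u] != c[v] for u, v in edges)
    )


# Two hypothetical trees of order 5: a path and a star.
path = [(0, 1), (1, 2), (2, 3), (3, 4)]
star = [(0, 1), (0, 2), (0, 3), (0, 4)]

for k in range(1, 5):
    expected = k * (k - 1) ** 4
    print(k, regular_colorings(5, path, k), regular_colorings(5, star, k), expected)
```

The same function applied to the cycle of length four reproduces the answers of Problem 4.1.7.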


EP 4.1.1. Consider the following properties on the set of the first 13 whole numbers $S = \{0, 1, 2, \ldots, 12\}$.

P1: a number $x \in S$ is a multiple of 5 or $x = 0$



P4: a number $x \in S$ and $x^2 + x > 5$.


How many elements of the set $S$ satisfy the following properties?

1) $P_1 \vee P_2 \vee P_3$

2) $P_1 \wedge P_2 \wedge P_3$

3) P1 ^ P2 ^ P3

4) $P_1 \vee (P_2 \wedge P_3)$.

EP 4.1.2. Prove the following modifications of the results of this section.

1) If $X_1, \ldots, X_n$ are subsets of a finite set $X$ and $\overline{Y} = X \setminus Y$ is the complement of $Y$ with respect to $X$, then

\[ \Bigl| \bigcap_{k=1}^{n} \overline{X_k} \Bigr| = |X| + \sum_{\emptyset \ne I \subset \{1, 2, \ldots, n\}} (-1)^{|I|} \Bigl| \bigcap_{k \in I} X_k \Bigr|, \]

where the index $I$ runs over all the nonempty subsets of the set $\{1, 2, \ldots, n\}$.

2) Consider finite sets $X_1, \ldots, X_n$, their union $X = \bigcup_{k=1}^{n} X_k$, and a function $f : X \to \mathbb{R}$. For any set $Y \subset X$ define $f(Y) = \sum_{x \in Y} f(x)$ and $f(\emptyset) = 0$.

Prove the equation

\[ f(X) = \sum_{I \ne \emptyset} (-1)^{|I|+1} f\Bigl( \bigcap_{k \in I} X_k \Bigr). \]

EP 4.1.3. 1) If the largest among 66 consecutive odd integers is 213, what is the smallest?

2) If the largest among several consecutive positive odd integers is 213, what is the largest possible length (that is, the number of elements) of this sequence? Answer the same question if the sequence can contain negative numbers.

EP 4.1.4. What is the cardinality of the union of five sets if their cardinalities are, respectively, 17, 23, 41, 45, and 56, each pair of the sets contains six elements, every triple of the sets contains four elements, and any four sets are mutually disjoint?

EP 4.1.5. At The Top-Rate College, $a_1$ students received at least one F grade during a semester, $a_2$ students received at least two F grades, . . . , $a_l$ students received at least $l$ F grades during this semester, while no student had more than $l$ F grades. How many F grades have all the students received during this semester?

EP 4.1.6. How many prime numbers do not exceed 300?

EP 4.1.7. Prove that there are $[a/n]$ natural numbers not exceeding $a$ and divisible by $n$, where $[x]$ is the integer part of the number $x$.


EP 4.1.8. How many natural numbers less than 777 are not divisible by 3, by 7, and by 11? How many are not divisible by 4 and by 6?

EP 4.1.9. How many seven-digit telephone numbers contain each of the digits 1 and 9 at least once?

EP 4.1.10. A paper reports that among 1000 people surveyed, 800 have driver licenses, 750 are from 20 through 30 years old, and 500 have never had a ticket for speeding, while 450 are from 20 through 30 years old and have never had a ticket for speeding. Are these data consistent?

EP 4.1.11. Consider three $n$-families of sets

\[ \{A_1, \ldots, A_n\}, \quad \{B_1, \ldots, B_n\}, \quad \{C_1, \ldots, C_n\} \]

such that

\[ A_1 \cup \cdots \cup A_n = B_1 \cup \cdots \cup B_n = C_1 \cup \cdots \cup C_n \stackrel{\mathrm{def}}{=} M \]

and

\[ |A_i \cap B_j| + |A_i \cap C_k| + |B_j \cap C_k| \ge n, \quad \forall i, j, k. \]

Prove that $|M| \ge \frac{n^3}{3}$.

EP 4.1.12. Let $F$ be a forest of order $p$ and size $q$. Prove that

\[ \chi(F, k) = k^{p-q} (k-1)^q. \]

EP 4.1.13. A gentleman had 11 daughters. If any girl got married while at least one of her older sisters remained unwed, then these still unmarried older sisters approached their father crying and complaining so bitterly that he had to double their dowry. In how many ways could these sisters arrange their weddings if the gentleman remarked that when the last of his daughters got married, he had had to double the dowry 11 times?

EP 4.1.14. There are 5 people and $n \ge 5$ different pairs of gloves. In how many ways can each of these people choose a right glove and a left one so that no one gets a complete pair of gloves?

EP 4.1.15. Ten couples are dining at a round table. In how many ways can they be seated so that no two males, no two females, and no two spouses are sitting alongside?

EP 4.1.16. In how many ways can we roll a fair die 12 times, so that a 1 never appears after another 1?


EP 4.1.17. The membership of the Combi Club comprises 30 students of five majors. Together they composed 40 problems for the Math Fair. Any two students of the same major composed an equal number of problems, while any two students majoring in different subjects composed different numbers of problems. How many students composed only one problem?

EP 4.1.18. Solve Problem 4.1.7 again if we have to paint not only the walls, but also the floor and the ceiling of the room; consider two different cases, with three or with only two paints available.

EP 4.1.19. Prove that for any $k = 1, 2, \ldots, 9$, there are

\[ k^n - C(k,1)(k-1)^n + C(k,2)(k-2)^n - \cdots + (-1)^{k-1} C(k, k-1) \]

$n$-digit numbers consisting only of the digits $1, 2, \ldots, k$.

EP 4.1.20. The quantity of natural numbers that do not exceed a natural number $n$ and are mutually prime with $n$ is denoted by $\varphi(n)$ and is called the Euler (totient) function; $\varphi(1) = 1$ by definition.

1) Evaluate $\varphi(2)$, $\varphi(3)$, $\varphi(4)$, $\varphi(5)$, $\varphi(6)$, $\varphi(7)$.

2) Prove that

\[ \varphi(n) = n \prod_{k=1}^{m} \left( 1 - \frac{1}{p_k} \right) \]

where $p_1, \ldots, p_m$ are all the distinct prime factors of $n$. Other properties of the totient function are considered in EP 4.2.2.

EP 4.1.21. Use Corollary 4.1.4 to prove that the Stirling numbers of the second kind $S_2(n, m)$ give the number of ways an $n$-element set can be partitioned into $m$ subsets.

EP 4.1.22. 1) Prove that the Stirling numbers of the second kind satisfy the recurrence relation

\[ S_2(n, m) = S_2(n-1, m-1) + m\, S_2(n-1, m) \]

assuming $S_2(n-1, n) = 0$ and the initial conditions $S_2(n, 0) = 0$, $\forall n$.

2) Compute $S_2(n, m)$ for $1 \le m \le n \le 4$.

3) Prove that

\[ S_2(n, m) = \frac{1}{m!} \sum \frac{n!}{k_1!\, k_2! \cdots k_m!} \]

where the sum runs over all positive integer solutions of the equation $k_1 + k_2 + \cdots + k_m = n$.

4) Verify that $S_2(n, n-1) = C(n, 2)$.

5) Prove that $k^n = \sum_{m=1}^{n} S_2(n, m)\,(k)_m$, where

\[ (k)_m = k(k-1)(k-2) \cdots (k-m+1) = \frac{k!}{(k-m)!}, \quad k \ge n. \]

The latter equation can be written as

\[ k^n = \sum_{m=1}^{n} S_2(n, m)\, C(k, m)\, m!. \tag{4.1.7} \]

Formula (4.1.7) represents powers of natural numbers through the binomial coefficients and the Stirling numbers of the second kind.
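The recurrence of part 1) and formula (4.1.7) are easy to check numerically. The following Python sketch is only an illustration; the function names and the convention $S_2(0,0) = 1$ are assumptions made here, not taken from the text.

```python
from functools import lru_cache
from math import comb, factorial


@lru_cache(maxsize=None)
def stirling2(n: int, m: int) -> int:
    """S2(n, m) from the recurrence S2(n, m) = S2(n-1, m-1) + m * S2(n-1, m)."""
    if n == 0 and m == 0:
        return 1  # assumed convention that starts the recurrence
    if n == 0 or m == 0 or m > n:
        return 0
    return stirling2(n - 1, m - 1) + m * stirling2(n - 1, m)


def power_via_stirling(k: int, n: int) -> int:
    """Right-hand side of (4.1.7): sum over m of S2(n, m) * C(k, m) * m!."""
    return sum(stirling2(n, m) * comb(k, m) * factorial(m) for m in range(1, n + 1))


# Compare both sides of (4.1.7) for a few small values of k and n.
checks = [(k, n, k ** n == power_via_stirling(k, n))
          for k in range(1, 8) for n in range(1, 8)]
```

Every triple in `checks` should end with `True`; the comparison also goes through for $k < n$ because `math.comb(k, m)` returns 0 when $m > k$.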

EP 4.1.23. 1) Let $|X| = 3$ and $|Y| = 4$. How many functions $f : X \to Y$ are injective but not surjective? Surjective but not injective? Neither injective nor surjective?

2) Answer the same question if $|X| = 4$ and $|Y| = 3$.

3) Answer the same question if $|X| = |Y| = 4$.

Definition 4.1.13. The Stirling numbers of the first kind $S_1(n, m)$ are the coefficients in the inversion of formula (4.1.7), which represents the binomial coefficients through the powers:

\[ n!\, C(k, n) = \sum_{m=1}^{n} (-1)^{n-m} S_1(n, m)\, k^m. \]

EP 4.1.24. Prove that the number $B(n, m) = \sum_{k=0}^{m} S_2(n, k)$ is the number of placements of $n$ different balls into $m$ indistinguishable urns with empty urns allowed.

EP 4.1.25. Compute $\chi(K_2)$, $\chi(K_3)$, $\chi(K_4)$, $\chi(K_{1,2})$, $\chi(K_{2,3})$, $\chi(K_{3,3})$, where $\chi(G)$ is the chromatic number of a graph $G$ (see Definition 4.1.11).


4.2 Inversion Formulas

In this section we derive inversion formulas such as the Möbius inversion, and apply these results to the enumeration of cyclic sequences and bracelets. Other families of inversion formulas are considered in Theorem 4.3.13 and the problems thereafter.

Möbius' biography · Möbius strip · Fermat's biography · Fermat's Last Theorem · Carl Gauss, Prince of Mathematics

Let us revisit Theorem 4.1.1, denoting, for the sake of brevity,

\[ X_{1,2} = X_1 \cap X_2, \quad X_{1,2,3} = X_1 \cap X_2 \cap X_3, \]

etc. Let $X^*_i$ denote the subset of elements of $X$ possessing only the property $P_i$, $1 \le i \le q$, let $X^*_{i,j}$ denote the set of elements having exactly the two properties $P_i, P_j$, and so on. It is clear that we can represent $X_1$ as

\[ X_1 = X^*_1 \cup X^*_{1,2} \cup \cdots \cup X^*_{1,q} \cup X^*_{1,2,3} \cup \cdots \cup X^*_{1,2,\ldots,q} \]

where all sets on the right are pairwise disjoint. Consequently,

\[ |X_1| = |X^*_1| + |X^*_{1,2}| + \cdots + |X^*_{1,q}| + |X^*_{1,2,3}| + \cdots + |X^*_{1,2,\ldots,q}|. \tag{4.2.1} \]

At the same time we can write

\[ X_1 = X^*_1 \cup X_{1,2} \cup \cdots \cup X_{1,q} \cup X_{1,2,3} \cup \cdots \cup X_{1,2,\ldots,q}. \]

Applying Theorem 4.1.1 to the latter, we have

\[ |X^*_1| = |X_1| - |X_{1,2}| - \cdots - |X_{1,q}| + |X_{1,2,3}| + \cdots + (-1)^{q-1} |X_{1,2,\ldots,q}|. \tag{4.2.2} \]

Equations (4.2.1)–(4.2.2) are inverse to one another. Indeed, we can consider (4.2.1) as the equation for the unknown cardinality $|X^*_1|$. Then formula (4.2.2) solves equation (4.2.1) for $|X^*_1|$, that is, expresses $|X^*_1|$ through $|X_{i,\ldots,j}|$. Vice versa, (4.2.1) represents $|X_1|$ in terms of $|X^*_1|$, etc. Such transformations are useful in many problems. We now consider the method, called the Möbius inversion, of inverting finite sums similar to (4.2.1)–(4.2.2). In what follows, we customarily write $d \mid n$ if $d$ is a natural divisor of an integer $n$; $\sum_{d \mid n}$ means that the summation index $d$ runs over all divisors of $n$.
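The inverse pair (4.2.1)–(4.2.2) can be tried out on a small concrete set. In the sketch below the set $X = \{0, 1, \ldots, 59\}$ and the three divisibility properties are arbitrary choices made for illustration, and the helper names are hypothetical.

```python
from itertools import combinations

X = range(60)
# Three illustrative properties P1, P2, P3 (assumed for this demo only).
props = {1: lambda x: x % 2 == 0, 2: lambda x: x % 3 == 0, 3: lambda x: x % 5 == 0}
q = len(props)


def having_all(idx):
    """X_{i,...,j}: elements with at least the listed properties."""
    return {x for x in X if all(props[i](x) for i in idx)}


def having_exactly(idx):
    """X*_{i,...,j}: elements with exactly the listed properties."""
    return {x for x in X if {i for i in props if props[i](x)} == set(idx)}


others = [i for i in props if i != 1]
extras = [e for r in range(q) for e in combinations(others, r)]

# Right-hand side of (4.2.1): sizes of the disjoint "exactly" classes containing P1.
rhs_421 = sum(len(having_exactly((1,) + e)) for e in extras)
# Right-hand side of (4.2.2): alternating sum over the "at least" classes containing P1.
rhs_422 = sum((-1) ** len(e) * len(having_all((1,) + e)) for e in extras)
```

Here `rhs_421` should reproduce $|X_1|$ and `rhs_422` should reproduce $|X^*_1|$.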

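Definition 4.2.1. The Möbius function $\mu : \mathbb{N} \to \{-1, 0, 1\}$ is defined as

\[ \mu(n) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n > 1 \text{ has a factor } p^{\alpha} \text{ with a prime } p \text{ and an integer } \alpha \ge 2, \\ (-1)^r & \text{if } n > 1 \text{ has no such factor and has } r \text{ different prime factors.} \end{cases} \]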


Example 4.2.2. By the definition, $\mu(1) = 1$. Since 2, 3, and 5 are primes, we have $\mu(2) = \mu(3) = \mu(5) = -1$. Next, $4 = 2^2$, thus $\mu(4) = 0$. From these equations we have

\[ \sum_{d \mid 2} \mu(d) = \mu(1) + \mu(2) = 0, \quad \sum_{d \mid 3} \mu(d) = \sum_{d \mid 5} \mu(d) = 0 \]

and

\[ \sum_{d \mid 4} \mu(d) = \mu(1) + \mu(2) + \mu(4) = 0. \]

In the next lemma we prove that these zeros persist.

Lemma 4.2.3.

\[ \sum_{d \mid n} \mu(d) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } n > 1. \end{cases} \]

Proof. If $n = 1$, the statement is obvious. Otherwise, for $n > 1$, let

\[ n = p_1^{\alpha_1} \cdot p_2^{\alpha_2} \cdots p_r^{\alpha_r} \]

be its prime factorization. Set $n^* = p_1 \cdot p_2 \cdots p_r$; thus $n^*$ contains all the different prime factors of $n$, but every factor appears in $n^*$ only once. If $d$ divides $n$ but does not divide $n^*$, then $d$ contains a factor $p^{\alpha}$ with a prime $p$ and an integer $\alpha \ge 2$. Therefore $\mu(d) = 0$ for such a $d$, and $\sum_{d \mid n} \mu(d) = \sum_{d \mid n^*} \mu(d)$.

However, for any $k$, $0 \le k \le r$, the number $n^*$ has $C(r, k)$ divisors $d$ that can be written as the product of $k$ different prime factors; if $k = 0$, we set $d = 1$. Thus by Definition 4.2.1, for these $d$, $\mu(d) = (-1)^k$ and

\[ \sum_{d \mid n} \mu(d) = \sum_{d \mid n^*} \mu(d) = C(r, 0) - C(r, 1) + \cdots + (-1)^r C(r, r) = 0 \]

(see Problem 1.4.18).
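Definition 4.2.1 and Lemma 4.2.3 can be checked numerically. The sketch below computes $\mu(n)$ by trial division, which is adequate only for small $n$; the function names are assumptions of this illustration.

```python
def mobius(n: int) -> int:
    """Moebius function of Definition 4.2.1, computed by trial division."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    sign = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:      # p**2 divides the original n
                return 0
            sign = -sign        # one more distinct prime factor
        p += 1
    if n > 1:                   # a remaining prime factor larger than sqrt(n)
        sign = -sign
    return sign


def divisor_sum_of_mobius(n: int) -> int:
    """Left-hand side of Lemma 4.2.3: the sum of mu(d) over all divisors d of n."""
    return sum(mobius(d) for d in range(1, n + 1) if n % d == 0)


values = [mobius(n) for n in range(1, 11)]
sums = [divisor_sum_of_mobius(n) for n in range(1, 200)]
```

The list `sums` should start with 1 and contain only zeros afterwards, as the lemma asserts.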

The Möbius function appears in the following equations called the Möbius inversion formulas.

Theorem 4.2.4. Let two infinite sequences $\{f(m)\}_{m=1}^{\infty}$ and $\{g(m)\}_{m=1}^{\infty}$ satisfy countably many equations

\[ f(m) = \sum_{d \mid m} g(d), \quad m = 1, 2, \ldots. \tag{4.2.3} \]

The system of equations (4.2.3) has a solution

\[ g(m) = \sum_{d \mid m} \mu(d)\, f\!\left( \frac{m}{d} \right), \quad m = 1, 2, \ldots \tag{4.2.4} \]

where $\mu$ is the Möbius function. Vice versa, the simultaneous equations (4.2.4), $m = 1, 2, \ldots$, imply (4.2.3) for all $m = 1, 2, \ldots$. In other words, the infinite set of simultaneous equations (4.2.3) is equivalent to the infinite set of simultaneous equations (4.2.4).

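Proof. If $d$ divides $m$, then by (4.2.3),

\[ f\!\left( \frac{m}{d} \right) = \sum_{\delta \mid (m/d)} g(\delta), \]

hence

\[ \sum_{d \mid m} \mu(d)\, f\!\left( \frac{m}{d} \right) = \sum_{d \mid m} \mu(d) \Big( \sum_{\delta \mid (m/d)} g(\delta) \Big). \]

If $m = d \cdot \delta \cdot m_1$, then for a fixed $\delta$, $d$ runs over the set of divisors of the integer $m/\delta$. Since all sums are finite, we can change the order of summation and get

\[ \sum_{d \mid m} \mu(d) \Big( \sum_{\delta \mid (m/d)} g(\delta) \Big) = \sum_{\delta \mid m} g(\delta) \Big( \sum_{d \mid (m/\delta)} \mu(d) \Big). \]

If $\delta \ne m$, that is, $m/\delta \ne 1$, then $\sum_{d \mid (m/\delta)} \mu(d) = 0$ by Lemma 4.2.3, so that

\[ \sum_{\delta \mid m} g(\delta) \Big( \sum_{d \mid (m/\delta)} \mu(d) \Big) = g(m). \]

The second part of the theorem can be proved similarly.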
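A quick numerical experiment illustrates the theorem. In this sketch the test sequence $g(d) = d^2 + 3$ is an arbitrary assumption, and $\mu$ is computed recursively from Lemma 4.2.3 rather than from the definition; all names are hypothetical.

```python
from functools import lru_cache


def divisors(m: int) -> list:
    """All natural divisors of m (naive scan, fine for small m)."""
    return [d for d in range(1, m + 1) if m % d == 0]


@lru_cache(maxsize=None)
def mobius(n: int) -> int:
    """mu(n) via Lemma 4.2.3: mu(1) = 1, and the divisor sum vanishes for n > 1."""
    if n == 1:
        return 1
    return -sum(mobius(d) for d in divisors(n) if d < n)


def g(d: int) -> int:
    """An arbitrary test sequence chosen for the demonstration."""
    return d * d + 3


def f(m: int) -> int:
    """f defined from g by (4.2.3)."""
    return sum(g(d) for d in divisors(m))


def g_recovered(m: int) -> int:
    """g reconstructed from f by the inversion formula (4.2.4)."""
    return sum(mobius(d) * f(m // d) for d in divisors(m))


mismatches = [m for m in range(1, 101) if g_recovered(m) != g(m)]
```

If (4.2.4) inverts (4.2.3), the list `mismatches` is empty.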

We apply this theorem to calculate the number of special arrangements called cyclic sequences. To define them, we consider a set $A = \{a_1, a_2, \ldots, a_n\}$ and all $n^m$ $m$-arrangements with repetition of its elements. Let

\[ \alpha_1 = (a_{i_1}, a_{i_2}, \ldots, a_{i_m}) \]

be any of them. Together with $\alpha_1$ we consider its circular shifts, that is, the $m$-arrangements

\[ \begin{aligned} \alpha_2 &= (a_{i_m}, a_{i_1}, a_{i_2}, \ldots, a_{i_{m-1}}) \\ \alpha_3 &= (a_{i_{m-1}}, a_{i_m}, a_{i_1}, a_{i_2}, \ldots, a_{i_{m-2}}) \\ &\ \ \vdots \\ \alpha_m &= (a_{i_2}, a_{i_3}, \ldots, a_{i_m}, a_{i_1}). \end{aligned} \]

Two arrangements $\alpha_i, \alpha_j$, $i \ne j$, can coincide termwise; however, we distinguish them since they bear different indices $i$ and $j$.


Definition 4.2.5. The set $\alpha = \{\alpha_1, \alpha_2, \ldots, \alpha_m\}$ is called a cyclic sequence of length $m$ corresponding to the arrangement $\alpha_1$; of course, it also corresponds to any of the arrangements $\alpha_2, \ldots, \alpha_m$.

The problem of enumeration of the cyclic sequences is complicated by their periodicity, since it may happen that $\alpha_1 = \alpha_{d+1}$, $\alpha_2 = \alpha_{d+2}, \ldots, \alpha_{d-1} = \alpha_{2d-1}$, $\alpha_d = \alpha_{2d}$, etc.; in this case we say that $d$ is a period of the cyclic sequence $\alpha$. A cyclic sequence can have several periods. If $d$ is the smallest period of a cyclic sequence $\alpha = \{\alpha_1, \alpha_2, \ldots, \alpha_m\}$, then among the $m$ arrangements $\alpha_1, \alpha_2, \ldots, \alpha_m$ there are only $d$ different $m$-arrangements. Thus, $d$ must divide $m$, and each $\alpha_i$, $1 \le i \le m$, consists of $m/d$ consecutive copies of a $d$-arrangement with repetition, which generates a cyclic sequence of length $d$ with the minimal period equal to $d$.

Definition 4.2.6. Given $n$ elements, the number of cyclic sequences of length $d$ from these elements with the minimal period $d$ is denoted by $\mathrm{cycper}(n, d)$. The number of cyclic sequences of length $m$ of any period is denoted by $\mathrm{CYC}(n, m)$.

Theorem 4.2.7.

\[ \mathrm{CYC}(n, m) = \sum_{d \mid m} \frac{1}{d} \Big( \sum_{\delta \mid d} \mu(\delta)\, n^{d/\delta} \Big). \tag{4.2.5} \]

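Proof. Each of the $n^m$ $m$-arrangements with repetition belongs to exactly one cyclic sequence of length $m$, whose minimal period $d$ divides $m$; such a cyclic sequence contains exactly $d$ different arrangements and corresponds to a cyclic sequence of length $d$ with the minimal period $d$. Hence $n^m = \sum_{d \mid m} d \cdot \mathrm{cycper}(n, d)$. Applying (4.2.4) with $f(m) = n^m$ and $g(d) = d \cdot \mathrm{cycper}(n, d)$, we have

\[ \mathrm{cycper}(n, d) = \frac{1}{d} \sum_{\delta \mid d} \mu(\delta)\, n^{d/\delta} \]

and (4.2.5) follows, since $\mathrm{CYC}(n, m) = \sum_{d \mid m} \mathrm{cycper}(n, d)$.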

Problem 4.2.1. A bracelet consists of four geometrically identical beads of two colors. Two bracelets are considered identical if they can be superposed (made indistinguishable) by rotating them on the wrist without flipping, that is, without taking them off. How many different bracelets are there?

Solution. Formula (4.2.5) with $n = 2$ and $m = 4$ gives $\mathrm{CYC}(2, 4) = 6$: the divisors $d = 1, 2, 4$ of $m = 4$ contribute $2 + 1 + 3$, respectively. These six different bracelets are shown in Fig. 4.1.

Figure 4.1. Six different bracelets in Problem 4.2.1.
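Formula (4.2.5) can be compared with a brute-force count of rotation classes. The sketch below is an illustration under stated assumptions: beads are encoded as integers $0, \ldots, n-1$, each class is represented by its lexicographically smallest rotation, and the function names are hypothetical.

```python
from itertools import product


def divisors(m: int) -> list:
    return [d for d in range(1, m + 1) if m % d == 0]


def mobius(n: int) -> int:
    """Moebius function by trial division."""
    sign, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            sign = -sign
        p += 1
    return -sign if n > 1 else sign


def cycper(n: int, d: int) -> int:
    """Cyclic sequences of length d with minimal period d (proof of Theorem 4.2.7)."""
    return sum(mobius(e) * n ** (d // e) for e in divisors(d)) // d


def cyc(n: int, m: int) -> int:
    """CYC(n, m) by formula (4.2.5)."""
    return sum(cycper(n, d) for d in divisors(m))


def cyc_brute(n: int, m: int) -> int:
    """Count rotation classes directly via a canonical (smallest) rotation."""
    classes = set()
    for word in product(range(n), repeat=m):
        classes.add(min(word[i:] + word[:i] for i in range(m)))
    return len(classes)
```

For example, `cyc(2, 4)` and `cyc_brute(2, 4)` should both give the six bracelets of Problem 4.2.1. The brute-force count grows like $n^m$, so it is useful only as a check for small parameters.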


Problem 4.2.2. 1) How many bracelets are there consisting of six beads ($m = 6$) of three colors ($n = 3$)?

2) How many geometrically distinguishable bracelets with $m = 6$ and $n = 3$ exist if one can not only rotate them but also flip them over?

Problem 4.2.3 (The little Fermat theorem). Show that for any prime $d$ and natural $n$,

\[ d \mid (n^d - n). \]

Solution. If $d$ is prime, then in (4.2.5) either $\delta = 1$, leading to $\mu(\delta) = 1$, or else $\delta = d$, resulting in $\mu(\delta) = -1$. Therefore,

\[ \sum_{\delta \mid d} \mu(\delta)\, n^{d/\delta} = n^d - n. \]

Since the number $\mathrm{cycper}(n, d)$ is an integer, each addend in (4.2.5) must be an integer; thus $d$ divides the expression in parentheses in (4.2.5), that is, $d \mid \sum_{\delta \mid d} \mu(\delta)\, n^{d/\delta}$, or $d \mid (n^d - n)$.
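The divisibility $d \mid (n^d - n)$ is easy to test by machine for small primes. This is only a sanity check under assumed ranges, not a proof; the names below are hypothetical.

```python
def is_prime(d: int) -> bool:
    """Naive primality test by trial division."""
    return d >= 2 and all(d % q for q in range(2, int(d ** 0.5) + 1))


# Pairs (d, n) with d prime for which d fails to divide n**d - n.
violations = [(d, n)
              for d in range(2, 60) if is_prime(d)
              for n in range(1, 40)
              if (n ** d - n) % d != 0]

# For a composite exponent the statement may fail, e.g. d = 4, n = 2.
composite_remainder = (2 ** 4 - 2) % 4
```

The list `violations` should be empty, while `composite_remainder` is nonzero, which shows that primality of $d$ matters.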


EP 4.2.1. Solve Problem 4.2.1 again, assuming that two bracelets are indistinguishable if they can be superposed with one another by rotation or by reflection in the bracelet plane (flipping).

EP 4.2.2. Euler's totient function $\varphi(n)$ was defined in EP 4.1.20.

1) Prove that $\varphi(n)$ is multiplicative, that is, $\varphi(m \cdot n) = \varphi(m) \cdot \varphi(n)$ if $m$ and $n$ are mutually prime integers.

2) Let $d_1, d_2, \ldots, d_k$ be all divisors of $n$. Prove the Gauss formula

\[ \sum_{j=1}^{k} \varphi(d_j) = n. \]

3) Prove that $\varphi(n) = \sum_{d \mid n} \mu\!\left( \frac{n}{d} \right) d$.

4) Use the latter formula and (4.2.5) to prove the equation

\[ \mathrm{CYC}(n, m) = \frac{1}{m} \sum_{d \mid m} \varphi\!\left( \frac{m}{d} \right) n^d. \]


EP 4.2.3. Prove the inversion formulas for the Stirling numbers

\[ \sum_{k} (-1)^k S_1(n, k)\, S_2(k, m) = (-1)^n \delta_{mn} \]

and

\[ \sum_{k} (-1)^k S_2(n, k)\, S_1(k, m) = (-1)^n \delta_{mn} \]

where the Kronecker delta was defined in EP 1.4.4, 11).

EP 4.2.4. Consider two infinite sequences of polynomials $\{P_n(t),\ n = 0, 1, 2, \ldots\}$ and $\{Q_n(t),\ n = 0, 1, 2, \ldots\}$ connected by the two sets of equations

\[ P_n(t) = \sum_{m=0}^{n} \alpha_{n,m} Q_m(t), \quad n = 0, 1, \ldots \]

and

\[ Q_n(t) = \sum_{k=0}^{n} \beta_{n,k} P_k(t), \quad n = 0, 1, \ldots. \]

For any two sequences $\{u_n\}_{n \ge 0}$ and $\{v_n\}_{n \ge 0}$ of real numbers, prove the inversion formulas

\[ u_n = \sum_{m=0}^{n} \alpha_{n,m} v_m \ (\forall n \ge 0) \iff v_n = \sum_{k=0}^{n} \beta_{n,k} u_k \ (\forall n \ge 0). \]

EP 4.2.5. Deduce from EP 4.2.4 the inversion formulas

\[ u_n = \sum_{m=0}^{n} C(n, m)\, v_m \ (\forall n \ge 0) \iff v_n = \sum_{k=0}^{n} \delta_{n,k} u_k \ (\forall n \ge 0). \]

EP 4.2.6. Prove the following pairs of inversion formulas.

1) The equations $a_n = \sum_{k=0}^{n} (-1)^k C(n, k)\, b_{n-k}$, $\forall n = 0, 1, 2, \ldots$, are equivalent to the equations $b_n = \sum_{k=0}^{n} C(n, k)\, a_{n-k}$, $\forall n = 0, 1, 2, \ldots$.


kC.n; k/bn�k; 8n D 0; 1; 2; : : : , are equiva-

lent to the equations bn DPnkD0.�1/

kC.n; k/ak; 8n D 0; 1; 2; : : : .

3) The equations $a_n = \sum_{k=0}^{n} C(n+p, k+p)\, b_k$, $\forall n = 0, 1, 2, \ldots$, are equivalent to the equations $b_n = \sum_{k=0}^{n} (-1)^{n-k} C(n+p, k+p)\, a_k$, $\forall n = 0, 1, 2, \ldots$.

4) Apply the inversion formulas above to derive formula (4.1.7).


4.3 Generating Functions I. Introduction

In many problems we have to deal with number sequences, for instance, with the combinations $C(m, n)$ or the cyclic sequences $\mathrm{CYC}(m, n)$, whose terms, in turn, depend on one or several integer parameters $m, n, \ldots$. We have to manipulate these sequences, which may result in cumbersome calculations. The method of generating functions (GF) is a general way to work out such problems. This method replaces operations on sequences with corresponding operations on certain functions or power series, called the GF of these sequences, which can be simpler and allows us to invoke powerful techniques of algebra and calculus. In this section we develop the method of GF and show in many examples how to derive GF and use them to solve various problems. More applications of the method are considered in the subsequent sections.

Polya's biography · How to Solve It? · Redfield's biography · Taylor's biography · Cauchy's biography · Hadamard's biography · Abel's biography · Abel Prize · Lambert's biography · Lambert W function

In the first example of this section we use the Taylor series of the exponential function. The reader unfamiliar with calculus can interpret the following equation as the statement that the exponential function $e^z$ can be represented for small $|z|$ and for any $n = 1, 2, 3, \ldots$ as $e^z \approx 1 + z + z^2/2! + z^3/3! + \cdots + z^n/n!$ + terms that are smaller than $z^n/n!$. For instance, $e^z \approx 1$ + terms which are much smaller than 1; or, if we need better accuracy, $e^z \approx 1 + z$ + terms which are much smaller than $|z|$; or $e^z \approx 1 + z + z^2/2!$ + terms which are much smaller than $|z|^2/2$, and so forth. Such understanding is quite adequate for all our purposes. We start with a problem that shows why the method of generating functions (GF) is useful.

Example 4.3.1. Consider the obvious equation $e^x \cdot e^x = e^{2x}$ and expand the exponential functions on both sides in the Taylor series²

\[ e^x = 1 + x + \frac{1}{2!} x^2 + \cdots + \frac{1}{n!} x^n + \cdots \]

and

\[ e^{2x} = 1 + 2x + \frac{1}{2!} (2x)^2 + \cdots + \frac{1}{n!} (2x)^n + \cdots, \]

deriving the equation

\[ \Big( 1 + x + \frac{1}{2!} x^2 + \cdots + \frac{1}{n!} x^n + \cdots \Big) \Big( 1 + x + \frac{1}{2!} x^2 + \cdots + \frac{1}{n!} x^n + \cdots \Big) = 1 + 2x + \frac{2^2}{2!} x^2 + \cdots + \frac{2^n}{n!} x^n + \cdots. \]

² To justify these manipulations, some elementary calculus is needed.


Multiplying out termwise the two series on the left, combining like terms and equating the coefficients of $\frac{x^n}{n!}$ on both sides of the equation, we easily verify the equation (see the solution of Problem 1.4.18)

\[ 2^n = C(n, 0) + C(n, 1) + C(n, 2) + \cdots + C(n, n). \]
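The coefficient comparison of Example 4.3.1 can be reproduced with exact rational arithmetic on truncated series. The truncation order `N` and the variable names in this sketch are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb, factorial

N = 8  # truncation order, chosen arbitrarily for the demo

# Taylor coefficients 1/n! of e^x up to order N.
exp_coeffs = [Fraction(1, factorial(n)) for n in range(N + 1)]

# Termwise product of the series with itself: coefficient of x^n in e^x * e^x.
product_coeffs = [sum(exp_coeffs[i] * exp_coeffs[n - i] for i in range(n + 1))
                  for n in range(N + 1)]

# Coefficients of x^n / n!, to be compared with 2^n and with the binomial sum.
scaled = [product_coeffs[n] * factorial(n) for n in range(N + 1)]
binomial_sums = [sum(comb(n, k) for k in range(n + 1)) for n in range(N + 1)]
```

Both `scaled` and `binomial_sums` should coincide with the powers $2^n$, $0 \le n \le N$; truncation does not affect coefficients of order at most `N`.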

This example demonstrates the essence of the method of GF: direct manipulations with sequences are replaced by transformations of certain functions or corresponding power series. Certainly, in Problem 1.4.18 we derived the latter formula in a more intuitive way. However, the method of GF gives us a powerful technique for solving various combinatorial, probabilistic, and many other essentially more involved problems, where elementary approaches may not work.

To introduce the method, we have to discuss some preliminaries. For an infinite sequence (a finite sequence can always be augmented by infinitely many zeros on the right)

\[ a = \{a_0, a_1, a_2, \ldots, a_n, \ldots\} = \{a_n\}_{n=0}^{\infty} \]

its GF is a formal power series

\[ f_a(t) \sim \sum_{n=0}^{\infty} a_n t^n, \tag{4.3.1} \]

where $t$ is an indeterminate³ or a variable. The series is called formal because we do not discuss its convergence at all; the powers $t^n$, $n = 0, 1, 2, \ldots$, here are just labels, which distinguish different terms of the sequence $a$. When we expand, step by step, the sum in (4.3.1) as

\[ a_0 + \sum_{n=1}^{\infty} a_n t^n \]

\[ a_0 + a_1 t + \sum_{n=2}^{\infty} a_n t^n \]

\[ a_0 + a_1 t + a_2 t^2 + \sum_{n=3}^{\infty} a_n t^n \]

etc., (4.3.1) generates, one after another, consecutive terms of the sequence $a$. At this point, the noun “function” in the term “GF” does not signify a function (mapping) in the sense of Section 1.1. For this reason, we used the tilde sign $\sim$ instead of the equality sign in (4.3.1).

³ Some authors denote the indeterminate by $z$ and call the GF of a sequence $\{a_n\}_{n=0}^{\infty}$ its $z$-transformation.


To apply the formal power series, one has to develop some algebraic concepts⁴. However, in this book we prefer to avoid the formal algebraic approach and justify the method by making use of convergent power series only. Hereafter we suppose that the series in (4.3.1) has a positive radius of convergence $R_a > 0$; therefore the sum of the series $\sum_{n=0}^{\infty} a_n t^n$ exists in the disk $|t| < R_a$ and is a function $f_a(t)$ defined (and holomorphic) in the disk. All GF appearing in this book are represented by convergent power series. Hereafter we write

\[ f_a(t) = \sum_{n=0}^{\infty} a_n t^n \tag{4.3.2} \]

where $f_a(t)$ is a function of $t$ in some neighborhood of the point $t = 0$. The actual value of $R_a$ is incidental for our purpose; any positive radius will do. The choice of convergent power series narrows down the class of admissible sequences; nevertheless, this class is broad enough for all our applications.

The reader unfamiliar with calculus can safely skip our discussion of power series and consider the conclusions as operational rules for solving corresponding problems. Moreover, we can arrive at the same results by considering terminating series instead of the infinite ones, that is, by making use of polynomials; see Problems 4.3.11 and 4.4.5 below, where we work out this approach in detail. If we use these generating polynomials, we do not have to deal with the convergence issue at all, though computations may be lengthier and more cumbersome, as can be seen in the examples below. That is why hereafter we use the convergent series.

Given a sequence $a = \{a_n\}_{n=0}^{\infty}$, we need to know whether the corresponding power series (4.3.1) is convergent or divergent. According to the Cauchy–Hadamard criterion [44, p. 195], the series in (4.3.1)–(4.3.2) has a positive radius of convergence⁵ $R_a > 0$ if and only if the next quantity is finite,

\[ \frac{1}{R_a} = \limsup_{n \to \infty} \sqrt[n]{|a_n|} < \infty \tag{4.3.3} \]

which essentially means that $|a_n|$ has at most exponential growth as $n \to \infty$. Here $|a_n|$ stands for the absolute value (the modulus) of the real or complex numbers $a_n$. For example, the sequences $\{a + b n^k\}_{n=0}^{\infty}$ and $\{a + b k^n\}_{n=0}^{\infty}$ satisfy (4.3.3) for any parameters $a, b, k$. However, faster growing sequences, such as $\{n!\}_{n=0}^{\infty}$, may have $R_a = 0$. A simple sufficient condition for the series (4.3.2) to have a positive radius of convergence is

\[ |a_n| \le A_1 + A_2^n \quad \text{for all } n = 0, 1, 2, \ldots \tag{4.3.4} \]

with some positive constants $A_1 > 0$ and $A_2 > 0$.

⁴ See, for example, [1, 19, 42].

⁵ The meaning of this was explained at the very beginning of this section.


Recall that the first derivative and the indefinite integral of the power functions are given by the formulas $\frac{d}{dt}(t^p) = p\,t^{p-1}$ and $\int t^p\,dt = \frac{t^{p+1}}{p+1} + \text{const}$; the latter is valid if $p \ne -1$. Inside the disk of convergence, a convergent series can be differentiated and integrated term by term, that is, if $|t| < R_a$, then

\[ \frac{d}{dt} \sum_{n=0}^{\infty} a_n t^n = \sum_{n=0}^{\infty} n\, a_n t^{n-1} \]

and

\[ \int \Big( \sum_{n=0}^{\infty} a_n t^n \Big) dt = \sum_{n=0}^{\infty} \frac{a_n}{n+1}\, t^{n+1} + \text{const}. \]

The termwise differentiability and integrability of convergent power series are the only properties beyond the precalculus level that we use hereafter in applications of the method of GF.

Throughout we deal mainly with two well-known infinite series. These are the power series of the exponential function

\[ e^t = 1 + t + \frac{t^2}{2!} + \cdots + \frac{t^n}{n!} + \cdots = \sum_{n=0}^{\infty} \frac{t^n}{n!} \tag{4.3.5} \]

which is convergent for all (complex) $t$, and the geometric series

\[ \frac{1}{1-t} = 1 + t + t^2 + \cdots + t^n + \cdots = \sum_{n=0}^{\infty} t^n \tag{4.3.6} \]

which is convergent for $|t| < 1$. Differentiating (4.3.6) termwise $p - 1$ times and dividing by $(p-1)!$, we derive the formula

\[ \frac{1}{(1-t)^p} = 1 + \frac{p}{1!}\, t + \frac{p(p+1)}{2!}\, t^2 + \frac{p(p+1)(p+2)}{3!}\, t^3 + \cdots + \frac{p(p+1) \cdots (p+n-1)}{n!}\, t^n + \cdots = \sum_{n=0}^{\infty} \frac{(p+n-1)!}{n!\,(p-1)!}\, t^n; \tag{4.3.7} \]

the coefficient of $t^n$ in (4.3.7) is

\[ C(p+n-1, n) = C_{\mathrm{rep}}(p, n). \]

The same result can be derived without referring to infinite series. Indeed, let us consider a truncated series (4.3.6), that is, a polynomial

\[ P_n(t) = 1 + t + t^2 + \cdots + t^n. \]

By the formula for the sum of a finite geometric progression (see EP 1.1.8), we derive

\[ P_n(t) = \frac{1 - t^{n+1}}{1 - t}. \tag{4.3.8} \]


Problem 4.3.1. Prove by mathematical induction that for $p = 1, 2, \ldots$, the coefficient of $t^k$, $k \le n$, in the polynomial $(P_n(t))^p$, with $P_n(t)$ given by (4.3.8), is $C(p+k-1, k)$.

Solution. Each coefficient of $P_n(t)$ is 1, and also $C(1+k-1, k) = 1$, which establishes the basis of induction. Now suppose that the conclusion is valid for all exponents not exceeding some $p$ and consider

\[ (P_n(t))^{p+1} = (P_n(t))^p \cdot P_n(t). \]

Both factors on the right are polynomials. When we multiply them out, the power $t^k$ occurs $k + 1$ times: if we multiply the term $t^k$ from $(P_n(t))^p$ by $t^0 = 1$ from $P_n(t)$, or if we multiply $t^{k-1}$ from $(P_n(t))^p$ by $t^1$ from $P_n(t)$, ..., or if we multiply $t^0 = 1$ from $(P_n(t))^p$ by $t^k$ from $P_n(t)$. Since the coefficient of $t^j$ in $(P_n(t))^p$ is $C(p+j-1, j)$ by the inductive assumption, the coefficient in question is

\[ C(p+0-1, 0) + \cdots + C(p+j-1, j) + \cdots + C(p+k-1, k) = C(p+k, k) \]

due to equation (1.4.3) and EP 1.4.4, 1), thus proving the claim.
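The claim of Problem 4.3.1 can be checked by multiplying polynomials explicitly. In the sketch below a polynomial is stored as a list of coefficients; the values $n = 6$, $p = 4$ and the helper names are illustrative assumptions.

```python
from math import comb


def poly_mul(a: list, b: list) -> list:
    """Product of two polynomials given as coefficient lists (index = power of t)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_pow(a: list, p: int) -> list:
    """p-th power of a polynomial by repeated multiplication."""
    result = [1]
    for _ in range(p):
        result = poly_mul(result, a)
    return result


n, p = 6, 4
Pn = [1] * (n + 1)            # P_n(t) = 1 + t + ... + t^n
coeffs = poly_pow(Pn, p)      # coefficients of (P_n(t))^p

# Compare the coefficients of t^k, k <= n, with C(p + k - 1, k).
low_order_ok = all(coeffs[k] == comb(p + k - 1, k) for k in range(n + 1))
```

The check covers only $k \le n$; for larger $k$ the truncation of the series changes the coefficients, which is the subject of Problem 4.3.2.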

Problem 4.3.2. Find the coefficient of $t^k$ in the polynomial $(P_n(t))^p$ for $n < k \le 2n$ and $p \ge 2$, if the polynomial $P_n(t)$ is given by (4.3.8).

To proceed with the method of GF, we introduce some operations on sequences and the corresponding operations on their GF. Linear combinations of sequences, that is, the multiplication of a sequence by a number (a scalar) and the addition of sequences, are defined straightforwardly, termwise.

Example 4.3.2. Consider the sequences

\[ a = \{1, 0, 1, 0, 1, \ldots\} \]

and

\[ b = \{0, 1, 0, 1, 0, \ldots\} \]

that is, $a_n = (1 + (-1)^n)/2$ and $b_n = (1 - (-1)^n)/2$, $n \ge 0$, and their GF $f_a(t)$ and $f_b(t)$. The linear combination of $a$ and $b$ with coefficients $\alpha$ and $\beta$ is defined as $\alpha a + \beta b = \{\alpha, \beta, \alpha, \beta, \alpha, \beta, \ldots\}$, and the corresponding GF is the linear combination of the GF with the same coefficients $\alpha$ and $\beta$, that is, $f_{\alpha a + \beta b}(t) = \alpha f_a(t) + \beta f_b(t)$.

Problem 4.3.3. Use formula (4.3.6) to show that in Example 4.3.2

\[ \alpha f_a(t) + \beta f_b(t) = \frac{\alpha + \beta t}{1 - t^2}. \]

Problem 4.3.4. What properties (commutativity, associativity, etc.) do these operations on sequences possess? Notice that the sequence $\{0, 0, \ldots\}$ is the neutral element for the termwise addition of sequences.


To define a “multiplication” of sequences, we consider two sequences $a = \{a_n\}_{n=0}^{\infty}$ and $b = \{b_n\}_{n=0}^{\infty}$, and polynomials $P_a(t) = \sum_{n=0}^{p} a_n t^n$ and $Q_b(t) = \sum_{n=0}^{q} b_n t^n$, whose coefficients are initial terms of the sequences $a$ and $b$, respectively. Let

\[ R(t) = P_a(t)\, Q_b(t) = \sum_{n=0}^{p+q} c_n t^n. \]

Problem 4.3.5. Show that the coefficients $c_n$ of the polynomial $R(t)$ for $n \le \min\{p, q\}$ are given by

\[ c_n = a_0 b_n + a_1 b_{n-1} + \cdots + a_{n-1} b_1 + a_n b_0. \]

In particular, $c_0 = a_0 b_0$, $c_1 = a_0 b_1 + a_1 b_0$, $c_2 = a_0 b_2 + a_1 b_1 + a_2 b_0, \ldots$.

Taking into account the latter equation and looking for an operation on sequences that corresponds to the multiplication of polynomials or power series, we arrive at the following definition, which mimics the Cauchy rule of multiplication of power series.

Definition 4.3.3. Given two sequences $a$ and $b$, the sequence $c = \{c_n\}_{n=0}^{\infty}$, where $c_n = a_0 b_n + a_1 b_{n-1} + \cdots + a_{n-1} b_1 + a_n b_0$, $n = 0, 1, 2, \ldots$, is called their convolution and is denoted by $c = a * b$.
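The convolution is straightforward to compute for finite initial segments. In this sketch finite lists are treated as sequences padded with zeros on the right; the function name and the parameter `terms` are assumptions of the illustration.

```python
def convolve(a: list, b: list, terms: int) -> list:
    """First `terms` entries of the convolution a * b of Definition 4.3.3.

    Finite lists are treated as sequences continued by zeros.
    """
    def term(seq: list, i: int):
        return seq[i] if i < len(seq) else 0

    return [sum(term(a, i) * term(b, n - i) for i in range(n + 1))
            for n in range(terms)]


# (1 + 2t + 3t^2)(4 + 5t) = 4 + 13t + 22t^2 + 15t^3
c = convolve([1, 2, 3], [4, 5], 5)
```

The entries of `c` are exactly the coefficients obtained by multiplying out the two polynomials, followed by a zero.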

Now we give the major definition of this section.

Definition 4.3.4. Let a sequence $a$ satisfy property (4.3.3). The function $f_a(t)$ in (4.3.2) is called the Generating Function (GF) of the sequence $a$.

Example 4.3.5. The GF of the finite sequence $\{1, 0, 1, 0, 1, 0, 1\}$ is

\[ f(t) = 1 + 0 \cdot t + 1 \cdot t^2 + 0 \cdot t^3 + 1 \cdot t^4 + 0 \cdot t^5 + 1 \cdot t^6 = 1 + t^2 + t^4 + t^6 = \frac{1 - t^8}{1 - t^2}; \]

the GF of the infinite sequence $\{1, 0, 1, 0, 1, 0, \ldots\}$ is

\[ f(t) = 1 + t^2 + t^4 + \cdots = \frac{1}{1 - t^2}. \]

When we employ the method of GF and work, instead of sequences, with their GF, at the last step we must return from the derived GF to its sequence, and we want to be certain that this sequence is the one we looked for. The method of GF is based on the following statement.

Theorem 4.3.6. There exists a one-to-one correspondence between the set of sequences satisfying (4.3.3) and the set of power series $G$ with a positive radius of convergence. This correspondence preserves algebraic operations, which means that a linear combination of sequences corresponds to a linear combination, with the same coefficients, of their GF, and the convolution of sequences corresponds to the product of their GF.


Proof. It should be mentioned that we have always considered the largest possible value of the radius of convergence, that is, if $R_a$ is the radius of convergence of the series $a$, then there is no $R > R_a$ such that the series $\sum_{n=0}^{\infty} a_n t^n$ converges in the disk $|t| < R$. The statement on the one-to-one correspondence follows immediately from the uniqueness of the Taylor series [46, p. 651–652]. The correspondence of linear combinations is obvious. The conclusion regarding convolution follows from Definition 4.3.3 (the Cauchy rule of multiplication of power series).

The set of GF has the algebraic structure of a ring; the definition can be found, for example, in [32]. For us that means only that we can add and multiply sequences and their corresponding GF using the standard commutative, associative and distributive rules, keeping in mind that by the product of two sequences we understand their convolution.

Proposition 4.3.7. Prove that the set of sequences satisfying (4.3.4) is a commutative ring with the unity element $1 = \{1, 0, \ldots\}$ with respect to the following operations:

1) the usual termwise multiplication by real numbers,

2) the usual termwise addition of sequences as addition, and

3) the convolution of sequences as multiplication.

Proof. Let $|a_n| \le A_1 + (A_2)^n$ and $|b_n| \le B_1 + (B_2)^n$ for all $n \ge 0$. Since $A_2 > 0$, $B_2 > 0$, we have

\[ (A_2 + B_2)^n = \sum_{k=0}^{n} C(n, k)\, A_2^k B_2^{n-k} \ge A_2^n + B_2^n \]

so that $|a_n + b_n| \le A_1 + A_2^n + B_1 + B_2^n \le (A_1 + B_1) + (A_2 + B_2)^n$. Thus, the sum $a + b$ also satisfies (4.3.4).

It remains to prove that the convolution $c = a * b$ satisfies (4.3.4), since the verification of the other ring axioms is straightforward and we leave it to the reader. Thus, let $a, b \in H$, so that

\[ |a_i b_{n-i}| \le A_1 B_1 + A_1 B_2^{n-i} + A_2^i B_1 + A_2^i B_2^{n-i}. \]

From here

\[ |c_n| \le (n+1) A_1 B_1 + A_1 \sum_{i=0}^{n} B_2^{n-i} + B_1 \sum_{i=0}^{n} A_2^i + B_2^n \sum_{i=0}^{n} (A_2/B_2)^i. \]

Set $\delta = \max\{1; A_1; B_1; A_2; B_2\}$; thus $|c_n| \le 4(n+1)\,\delta^{n+1}$, and since $n + 1 \le 2^n$, we get $|c_n| \le C_1 + C_2^n$ with constants $C_1$ and $C_2$ for all $n = 0, 1, \ldots$.


Problem 4.3.6. Prove the first part of Proposition 4.3.7 by making use of the inequality $|a_n + b_n| \le 2 \max\{|a_n|; |b_n|\}$.

Problem 4.3.7. Prove that the set of power series with a nonzero radius of convergence is a commutative ring with the usual addition and multiplication. The unity of this ring is the constant function $f(t) = 1 = 1 + 0 \cdot t + 0 \cdot t^2 + \cdots$.

Theorem 4.3.6 explains why the method of GF is useful: instead of performing tedious calculations with sequences, we work with (analytic) functions and have available powerful techniques of algebra and analysis. At the end we return to the sequence we sought. At that point we can use the following known formulas expressing the Taylor coefficients of a function $f$ through its derivatives [46, p. 654] or through contour integrals [44, p. 174],

\[ a_n = \frac{1}{n!}\, f_a^{(n)}(0) = \frac{1}{2\pi i} \oint_{|z| = \rho} z^{-n-1} f_a(z)\, dz, \quad n = 0, 1, \ldots \]

where $\rho$ is so small that the circumference $|z| = \rho$ lies inside the circle of convergence of $f_a$. If we know $f_a$ exactly or approximately, these formulas allow us to find the numbers $a_n$ or to estimate their asymptotic behavior.

Depending upon a particular problem, it may be suitable to use other systems of linearly independent functions instead of the powers $t^n$. In particular, we will see that in problems where the ordering of elements must be taken into account, it is useful to employ exponential generating functions (EGF) based on the system $\{t^n/n!\}_{n=0}^{\infty}$. General methods of constructing GF are discussed in detail in [19].

Definition 4.3.8. A function

\[ e_a(t) = \sum_{n=0}^{\infty} a_n \frac{t^n}{n!} \tag{4.3.9} \]

is called the Exponential Generating Function (EGF) of a sequence $a$.

Example 4.3.9. For the sequence $s = \{1, 1, \ldots\}$, the GF is

\[ f_s(t) = 1 + 1 \cdot t + 1 \cdot t^2 + \cdots + 1 \cdot t^n + \cdots = \frac{1}{1 - t}, \]

while its EGF is

\[ e_s(t) = 1 + 1 \cdot \frac{t}{1!} + 1 \cdot \frac{t^2}{2!} + \cdots + 1 \cdot \frac{t^n}{n!} + \cdots = e^t. \]

For the sequence $b = \{1,\ 1 \cdot 3,\ 1 \cdot 3 \cdot 5,\ \ldots\}$, the EGF is $e_b(t) = (1 - 2t)^{-3/2}$.

In the case of EGF, the definition of the convolution must be modified.


Definition 4.3.10. The sequence $d = \{d_n\}_{n=0}^{\infty}$, where

\[ d_n = \sum_{i=0}^{n} C(n, i)\, a_i b_{n-i} \]

is called the binomial convolution (or the Hurwitz composition) of the sequences $a$ and $b$.
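A small computation shows the binomial convolution at work. Since the EGF of $s = \{1, 1, \ldots\}$ is $e^t$ (Example 4.3.9) and $e^t \cdot e^t = e^{2t}$, the binomial convolution of $s$ with itself should be the sequence $\{2^n\}$. The function name below is an assumption of this sketch.

```python
from math import comb


def binomial_convolve(a: list, b: list) -> list:
    """Binomial convolution of Definition 4.3.10 on common initial segments."""
    n_terms = min(len(a), len(b))
    return [sum(comb(n, i) * a[i] * b[n - i] for i in range(n + 1))
            for n in range(n_terms)]


s = [1] * 8
d = binomial_convolve(s, s)          # expected to be the powers of 2
powers_of_three = binomial_convolve(d, s)
```

Similarly, convolving $\{2^n\}$ with $s$ corresponds to $e^{2t} \cdot e^t = e^{3t}$ and should give $\{3^n\}$.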

Problem 4.3.8. State and prove an analogue of Theorem 4.3.6 for EGF.

In addition to the algebraic operations considered in Theorem 4.3.6, some other operations on sequences and the corresponding transformations of their GF are useful. For a sequence $a = \{a_n\}_{n=0}^{\infty}$ and a fixed natural number $k$, we consider a sequence $b = \{b_n\}_{n=0}^{\infty}$, where $b_0 = b_1 = \cdots = b_{k-1} = 0$ and $b_n = a_{n-k}$ for $n \ge k$. Then clearly,

\[ f_b(t) = t^k f_a(t). \tag{4.3.10} \]

On the other hand, if $b_n = a_{n+k}$ for all $n \ge 0$, then

\[ f_b(t) = t^{-k} \big( f_a(t) - a_0 - a_1 t - \cdots - a_{k-1} t^{k-1} \big). \]

The sequences $\{b_n\}_{n=0}^{\infty}$ are called shifts of the sequence $a$.

The termwise differentiation and integration of GF imply the following results.

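Theorem 4.3.11. Let $a = \{a_n\}_{n=0}^{\infty}$ be a given sequence.

1) If $b = \{b_n\}_{n=0}^{\infty}$, $b_n = (n+1)\, a_{n+1}$, then $f_b(t) = \frac{d}{dt} f_a(t)$.

2) If $c = \{c_n\}_{n=0}^{\infty}$, $c_n = n\, a_n$, then $f_c(t) = t \frac{d}{dt} f_a(t)$.

3) If $d = \{d_n\}_{n=0}^{\infty}$, $d_n = \frac{a_{n-1}}{n}$, $n \ge 1$, and $d_0 = 0$, then $f_d(t) = \int_0^t f_a(x)\, dx$.

4) If $l = \{l_n\}_{n=0}^{\infty}$, $l_n = \frac{a_n}{n+1}$, then $f_l(t) = \frac{1}{t} \int_0^t f_a(x)\, dx$.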

The proof of the following lemma is immediate.

Lemma 4.3.12. If $s = \{1, 1, \ldots\}$, that is, $s_n = 1$, $\forall n \ge 0$, then for any sequence $a = \{a_n\}_{n=0}^{\infty}$

\[ \begin{aligned} a_0 \cdot s_0 &= a_0 \\ a_0 \cdot s_1 + a_1 \cdot s_0 &= a_0 + a_1 \\ a_0 \cdot s_2 + a_1 \cdot s_1 + a_2 \cdot s_0 &= a_0 + a_1 + a_2 \end{aligned} \]

etc. Therefore, the convolution of sequences

\[ a * s = \{a_0,\ a_0 + a_1,\ a_0 + a_1 + a_2,\ \ldots,\ a_0 + a_1 + \cdots + a_n,\ \ldots\} \]

is the sequence of consecutive partial sums of the sequence $a$.
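The lemma can be confirmed on any sample sequence by comparing the convolution with running totals. The sample values and names in this sketch are arbitrary assumptions.

```python
from itertools import accumulate


def convolve(a: list, b: list) -> list:
    """Convolution of two sequences on their common initial segment."""
    n_terms = min(len(a), len(b))
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(n_terms)]


a = [3, 1, 4, 1, 5, 9, 2, 6]      # arbitrary sample sequence
s = [1] * len(a)                  # initial segment of s = {1, 1, ...}

partial_sums = convolve(a, s)
running_totals = list(accumulate(a))
```

Both lists should coincide, term by term.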


Problem 4.3.9. Prove that the sequence $I = \{1, 0, 0, \ldots\}$ is the unit element for the convolution, that is, $a * I = I * a = a$ for any sequence $a$.

Due to Lemma 4.3.12, the sequence $s = \{1, 1, \ldots\}$ is called the summator. We know from Theorem 4.3.6 and Example 4.3.9 that the GF for the sequence $a * s$ is $(1 - t)^{-1} f_a(t)$. From this observation we can, for instance, immediately conclude that the coefficient of, say, $t^{37}$ in the Taylor series of the rational function

\[ (1 - 3t^2 - 4t^7 + 12t^{21} - 5t^{45})(1 - t)^{-1} \tag{4.3.11} \]

is $1 - 3 - 4 + 12 = 6$. We keep the notation $s = \{1, 1, \ldots\}$ for the summing sequence for the rest of this chapter.
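The same observation turns into a short computation: the coefficient of $t^n$ in (4.3.11) is the sum of the numerator coefficients at powers not exceeding $n$. The dictionary encoding of the numerator and the function name are assumptions of this sketch.

```python
# Numerator of (4.3.11) stored sparsely as {exponent: coefficient}.
numerator = {0: 1, 2: -3, 7: -4, 21: 12, 45: -5}


def coefficient(n: int) -> int:
    """Coefficient of t^n in numerator(t) / (1 - t): a partial sum of the numerator."""
    return sum(c for exponent, c in numerator.items() if exponent <= n)


c37 = coefficient(37)
```

The value `c37` reproduces the number computed in the text.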

Problem 4.3.10. 1) Find the coefficients of $t^n$, $n \ge 0$, in the Taylor series of the rational function (4.3.11).

2) Why are there different formulas for the coefficients for $n < 45$ and for $n \ge 45$?

The same argument, based on the summing property of the sequence $s = \{1, 1, \ldots\}$ and the identity

\[ \frac{1}{1 - t} = (1 + t + t^2 + \cdots + t^9) \cdot (1 + t^{10} + t^{20} + \cdots + t^{90}) \cdot (1 + t^{100} + t^{200} + \cdots + t^{900}) \cdots, \quad |t| < 1 \tag{4.3.12} \]

whose proof we leave to the reader, immediately implies that any natural number has a base 10 representation, and this representation is unique.
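A finite version of identity (4.3.12) can be verified directly: the product of the first three factors should be $1 + t + t^2 + \cdots + t^{999}$, each coefficient 1 reflecting the unique decimal representation of a number below 1000. The helper names in this sketch are assumptions.

```python
def poly_mul(a: list, b: list) -> list:
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        if x:
            for j, y in enumerate(b):
                out[i + j] += x * y
    return out


def factor(j: int) -> list:
    """Coefficients of 1 + t^(10^j) + t^(2*10^j) + ... + t^(9*10^j)."""
    step = 10 ** j
    return [1 if e % step == 0 else 0 for e in range(9 * step + 1)]


product = [1]
for j in range(3):                # units, tens and hundreds digits
    product = poly_mul(product, factor(j))
```

Every entry of `product` counts the ways to write the corresponding exponent with three decimal digits, so all 1000 entries should equal 1.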

In the rest of this section and in the next one we consider various applications of the method of GF and solve problems.

Problem 4.3.11. Compute the sum $1^2 + 2^2 + \cdots + n^2$ for any natural $n$.

Solution. There are different ways to approach this problem. If we know the value of this sum, we can carry out a simple inductive proof, as we have done in Problem 1.1.4. However, we are going to apply the method of GF to demonstrate the essential ingredients of the method. We don't even have to know the sum in advance: the method allows us to find the sum explicitly.

Let us then introduce the sequence $a = \{a_n\}_{n=0}^{\infty}$, where $a_n = 1^2 + 2^2 + \cdots + n^2$. We immediately observe that this is the sequence of partial sums of a simpler sequence $b = \{b_n\}_{n=0}^{\infty}$ with $b_n = n^2$, $n \ge 0$. We also know from Lemma 4.3.12 that to find the sequence $a$ explicitly, we can convolve the sequence $b$ and the summator $s$. Hence we conclude that $a = s * b$, which is equivalent to $f_a(t) = (1 - t)^{-1} f_b(t)$. This equation tells us that if we find the GF $f_b$ explicitly, we will be able to calculate $f_a$ and then solve the problem by computing its coefficients. Let us try to simplify the problem even more and reduce the sequence of squares $b$ to the sequence of natural numbers themselves. From calculus we remember that $\frac{d}{dx}\left(x^2\right) = 2x$. This observation gives us a plan of the solution.

We begin with the summator $s = \{1, 1, 1, \ldots\}$ and its GF

\[ f_s(t) = \frac{1}{1 - t} = 1 + t + t^2 + \cdots + t^n + t^{n+1} + \cdots; \]

notice that the coefficient of $t^n$ here is 1. Differentiating both sides of this equation (if $|t| < 1$, we can differentiate the series termwise), we again get (4.3.7) with $p = 2$,

\[ \frac{1}{(1 - t)^2} = 1 + 2t + 3t^2 + 4t^3 + \cdots + n t^{n-1} + (n+1) t^n + \cdots \tag{4.3.13} \]

hence the function $\frac{1}{(1-t)^2}$ is the GF of the sequence $d = \{1, 2, 3, \ldots\}$.

We should be careful here, since the indices start at zero, so that $d_n = n + 1$. Thus, we have to shift this sequence, which in terms of GF corresponds to multiplication by $t$; see Theorem 4.3.11, 2). Therefore, we introduce a shifted sequence $c = \{c_n\}$ with $c_n = n$, $n \ge 0$, and derive

\[ f_c(t) = t f_d(t) = \frac{t}{(1 - t)^2}. \]

Then the coefficient of $t^n$ in the series $f_c(t)$ is $c_n = n$. Repeating this step, that is, differentiating $f_c(t)$ and multiplying by $t$, we get the GF for the sequence of squares $b = \{n^2\}_{n=0}^{\infty}$,

\[
f_b(t) = t \frac{d}{dt}\left\{ \frac{t}{(1-t)^2} \right\};
\]

thus we know without any calculation that the coefficient of $t^n$ in the Taylor series of the above function $f_b(t)$ is $b_n = n^2$. Finally, the GF for the sequence we want in this problem is derived if we multiply the function $f_b$ by the GF of the summator, that is, we have to consider

\[
f_a(t) = (1-t)^{-1}\, t \frac{d}{dt}\left\{ \frac{t}{(1-t)^2} \right\} = t(1+t)(1-t)^{-4}.
\]

From (4.3.7) with $p = 4$ we get the equation

\[
(1-t)^{-4} = \sum_{n \ge 0} \frac{(n+1)(n+2)(n+3)}{6}\, t^n;
\]

thus,

\[
t(1+t)(1-t)^{-4} = \sum_{n \ge 0} \frac{(n+1)(n+2)(n+3)}{6}\, t^{n+1} + \sum_{n \ge 0} \frac{(n+1)(n+2)(n+3)}{6}\, t^{n+2} = \sum_{n \ge 0} \frac{1}{6}\, n(n+1)(2n+1)\, t^n,
\]

and the coefficient of $t^n$ here gives the required formula

\[
a_n = 1^2 + 2^2 + \cdots + n^2 = \frac{1}{6}\, n(n+1)(2n+1).
\]
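The coefficient extraction above can be replayed numerically. The following sketch assumes only the expansion of $(1-t)^{-p}$ with coefficients $C(p+n-1, n)$; the function names are hypothetical. It reads off the coefficient of $t^n$ in $t(1+t)(1-t)^{-4}$ as a sum of two shifted coefficients of $(1-t)^{-4}$.

```python
from math import comb


def coeff_inv_power(p, n):
    """Coefficient of t^n in (1 - t)^(-p), that is C(p + n - 1, n); zero for n < 0."""
    return comb(p + n - 1, n) if n >= 0 else 0


def sum_squares_via_gf(n):
    """Coefficient of t^n in t(1 + t)(1 - t)^(-4) = (t + t^2)(1 - t)^(-4)."""
    # The factor t shifts the index by 1, the factor t^2 shifts it by 2.
    return coeff_inv_power(4, n - 1) + coeff_inv_power(4, n - 2)
```

For example, `sum_squares_via_gf(3)` adds $C(5,2) = 10$ and $C(4,1) = 4$, giving $14 = 1 + 4 + 9$.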

We will solve this problem one more time, now using generating polynomials, that is, truncated power series instead of infinite power series. Indeed, if we want to find $a_n$, it suffices to consider polynomials of $n$th degree and truncate all powers greater than $n$ in all computations. This approach can be traced back at least to Niven [38, Chap. 7]. We demonstrate the method in detail here and also in Problem 4.4.5 in Section 4.4.

Consider the polynomial (4.3.8)

\[
P_n(t) = 1 + t + t^2 + \cdots + t^n = \frac{1 - t^{n+1}}{1-t}.
\]

Repeating the same steps as above, we compute the functions $\frac{d}{dt}P_n(t)$, $t\frac{d}{dt}P_n(t)$, $t\frac{d}{dt}\left\{ t\frac{d}{dt}P_n(t) \right\}$, and finally

\[
P_n(t)\left( t\frac{d}{dt}\left\{ t\frac{d}{dt}P_n(t) \right\} \right) = \frac{t\left(1 - t^{n+1}\right)\left(1 + t - (n+1)^2 t^n + (2n^2 + 2n - 1) t^{n+1} - n^2 t^{n+2}\right)}{(1-t)^4}.
\]

Since we are interested in the coefficient of $t^n$, only two terms in the numerator of the latter fraction, namely $t$ and $t^2$, can contribute to this coefficient. Using Problem 4.3.1 to compute the coefficients of $t$ and $t^2$, we again find the same expression $a_n = \frac{1}{6}\, n(n+1)(2n+1)$ as in Problem 1.1.4.
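The generating-polynomial version of the argument can be sketched with plain coefficient lists. This is an illustration under the assumption that a polynomial is stored as the list of its coefficients by increasing degree; `theta`, `mul_trunc` and `sum_squares_trunc` are hypothetical names. The operator $t\frac{d}{dt}$ multiplies the coefficient of $t^k$ by $k$, and all products are truncated at degree $n$.

```python
def theta(p):
    """Apply the operator t * d/dt: the coefficient of t^k is multiplied by k."""
    return [k * c for k, c in enumerate(p)]


def mul_trunc(p, q, n):
    """Product of two polynomials with all powers above t^n discarded."""
    out = [0] * (n + 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j <= n:
                out[i + j] += a * b
    return out


def sum_squares_trunc(n):
    """Coefficient of t^n in P_n(t) * (t d/dt)^2 P_n(t), where P_n = 1 + t + ... + t^n."""
    P = [1] * (n + 1)
    squares = theta(theta(P))      # coefficients 0, 1, 4, 9, ..., n^2
    return mul_trunc(P, squares, n)[n]
```

Multiplying by $P_n$ plays the role of the summator here: the coefficient of $t^n$ in the product is the sum of all coefficients of the second factor up to degree $n$.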

In this problem we have used the operator $t\frac{d}{dt}$ and its square $\left(t\frac{d}{dt}\right)^2$. The following problem$^6$ treats arbitrary natural powers of this operator.

Problem 4.3.12. 1) For any $n = 1, 2, 3, \ldots$ prove that

\[
\left(t\frac{d}{dt}\right)^n \frac{1}{1-t} = 1^n t + 2^n t^2 + 3^n t^3 + \cdots.
\]

2) Prove that

\[
1^n t + 2^n t^2 + 3^n t^3 + \cdots = \frac{P_n(t)}{(1-t)^{n+1}},
\]

where $P_n$ is a polynomial of degree $n$ with $P_n(0) = 0$ and all the other coefficients positive; moreover, $P_n(1) = n!$.
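Before proving part 2), one may want to look at the first few polynomials $P_n$. The sketch below is only a numerical experiment, not a proof: it takes the claimed degree bound for granted and computes the coefficients of $(1-t)^{n+1}\sum_{k \ge 1} k^n t^k$ up to degree $n$ by expanding $(1-t)^{n+1}$ with the binomial formula. The name `numerator_poly` is hypothetical.

```python
from math import comb, factorial


def numerator_poly(n):
    """Coefficients (degrees 0..n) of (1 - t)^(n+1) * sum_{k>=1} k^n t^k, for n >= 1."""
    coeffs = []
    for d in range(n + 1):
        # Coefficient of t^d: sum over j of (-1)^j C(n+1, j) * (d - j)^n.
        # The term with d - j = 0 vanishes because n >= 1.
        coeffs.append(sum((-1) ** j * comb(n + 1, j) * (d - j) ** n
                          for j in range(d + 1)))
    return coeffs
```

For $n = 3$ this gives the coefficients $0, 1, 4, 1$, whose sum is $3! = 6$, in agreement with the statement of the problem.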

6 See [40, Problem I.45], where this and more general problems are considered.


Using GF we can derive new inversion formulas, distinct from the Möbius inversion in Theorem 4.2.4.

Theorem 4.3.13. Consider two sequences $a = \{a_n\}_{n=0}^{\infty}$ and $b = \{b_n\}_{n=0}^{\infty}$ related by the infinite set of equations

\[
a_n = \sum_{k=0}^{n} (-1)^k C(m, k)\, b_{n-k}, \quad \forall n = 0, 1, 2, \ldots, \tag{4.3.14}
\]

where a natural number $m$ is fixed. Then

\[
b_n = \sum_{k=0}^{n} C(m + k - 1, k)\, a_{n-k}, \quad \forall n = 0, 1, 2, \ldots. \tag{4.3.15}
\]

Vice versa, equations (4.3.15) imply (4.3.14).

Proof. It suffices to notice that by the binomial formula (1.4.4) with $a = 1$ and $b = -t$, $f_a(t) = (1-t)^m f_b(t)$; thus, $f_b(t) = (1-t)^{-m} f_a(t)$, which by (4.3.7) implies (4.3.15).
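The pair (4.3.14), (4.3.15) is easy to test on a finite initial segment of a sequence, since the $n$th equation involves only the terms with indices up to $n$. The following sketch is an illustration; `forward` and `inverse` are hypothetical names for the two transformations.

```python
from math import comb


def forward(b, m):
    """a_n = sum_{k=0}^{n} (-1)^k C(m, k) b_{n-k}, as in (4.3.14)."""
    return [sum((-1) ** k * comb(m, k) * b[n - k] for k in range(n + 1))
            for n in range(len(b))]


def inverse(a, m):
    """b_n = sum_{k=0}^{n} C(m + k - 1, k) a_{n-k}, as in (4.3.15)."""
    return [sum(comb(m + k - 1, k) * a[n - k] for k in range(n + 1))
            for n in range(len(a))]
```

Applying `inverse(forward(b, m), m)` to any list `b` should return `b` itself, which corresponds to multiplying a GF by $(1-t)^m$ and then by $(1-t)^{-m}$.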

The next problem demonstrates a useful method of construction of GF.

Problem 4.3.13. Find a GF for the sequence

\[
\{C(m, 0), C(m, 1), \ldots, C(m, m), 0, 0, \ldots\}
\]

and use it to calculate again the binomial coefficients $C(m, k)$.

Solution. Consider a polynomial in $m + 1$ variables (indeterminates) $t, x_1, x_2, \ldots, x_m$ and expand it in powers of $t$:

\[
(1 + x_1 t)(1 + x_2 t) \cdots (1 + x_m t) = 1 + (x_1 + x_2 + \cdots + x_m)\, t + (x_1 x_2 + \cdots + x_{m-1} x_m)\, t^2 + \cdots + (x_1 x_2 \cdots x_m)\, t^m. \tag{4.3.16}
\]

The coefficient of $t^k$ here is the sum of all $k$-element products of the indeterminates $x_1, x_2, \ldots, x_m$. There is a one-to-one correspondence between these products and $k$-element subsets of the set $X = \{x_1, x_2, \ldots, x_m\}$. By definition of combinations without repetition, the number of such subsets is $C(m, k)$. Thus, equation (4.3.16) generates an explicit roster of all $k$-combinations of the elements of an $m$-element set, $0 \le k \le m$.


If we do not need the complete list of combinations, it is convenient to put $x_1 = x_2 = \cdots = x_m = 1$ in (4.3.16), thus reducing (4.3.16) to the binomial formula (1.4.4)

\[
(1 + t)^m = \sum_{k=0}^{m} C(m, k)\, t^k. \tag{4.3.17}
\]

Differentiating (4.3.17) $k$ times and substituting $t = 0$, we again derive formula (1.4.1), $C(m, k) = m!/((m-k)!\,k!)$.
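The way (4.3.16) produces a roster of combinations can be imitated by multiplying the factors $(1 + x_i t)$ one at a time. In the sketch below a monomial is stored as a tuple of symbol names and the coefficient of $t^k$ as a list of such tuples; this representation and the name `roster` are assumptions made for the illustration.

```python
def roster(symbols):
    """Expand prod (1 + x*t) over the symbols; return {k: list of k-element products}."""
    terms = {0: [()]}                 # the coefficient of t^0 is the empty product
    for x in symbols:
        # Multiplying by (1 + x*t): keep every old term, and add a copy of it
        # multiplied by x, which moves it from t^k to t^(k+1).
        new = {k: list(v) for k, v in terms.items()}
        for k, monomials in terms.items():
            new.setdefault(k + 1, []).extend(m + (x,) for m in monomials)
        terms = new
    return terms
```

With four symbols the lists have lengths 1, 4, 6, 4, 1, that is, $C(4, k)$ for $k = 0, \ldots, 4$; setting every symbol equal to 1 amounts to counting the entries, which is the passage from (4.3.16) to (4.3.17).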

In the same fashion we can construct the GF

\[
\sum_{r=0}^{\infty} C_{rep}(n, r)\, t^r
\]

for the number of combinations with unlimited repetition from elements of $n$ types. Again, we introduce symbols for these elements and list, at least potentially, all of these combinations:

\[
f(t) = (1 + x_1 t + x_1^2 t^2 + \cdots)(1 + x_2 t + x_2^2 t^2 + \cdots) \cdots (1 + x_n t + x_n^2 t^2 + \cdots) = 1 + (x_1 + x_2 + \cdots + x_n)\, t + (x_1^2 + \cdots + x_n^2 + x_1 x_2 + \cdots + x_1 x_n + x_2 x_3 + \cdots + x_{n-1} x_n)\, t^2 + \cdots. \tag{4.3.18}
\]

Setting $x_1 = \cdots = x_n = 1$, (4.3.18) becomes

\[
f(t) = (1 + t + t^2 + \cdots)^n = (1 - t)^{-n} = \sum_{r=0}^{\infty} C_{rep}(n, r)\, t^r.
\]

Differentiating this series $r$ times and substituting $t = 0$, we again derive equation (1.4.6), $C_{rep}(n, r) = C(n + r - 1, r)$.

Here, as well as in the preceding problems, we can avoid operations with infinite series by considering, instead of (4.3.18), the generating polynomials

\[
f_r(t) = (1 + x_1 t + x_1^2 t^2 + \cdots + x_1^r t^r) \cdots (1 + x_n t + x_n^2 t^2 + \cdots + x_n^r t^r).
\]

Problem 4.3.14. Find the generating polynomials for the sequences

$I = \{1, 0, 0, \ldots\}$

$S = \{1, 1, 1, \ldots\}$

Problem 4.3.15. Under what condition is a sequence of polynomials $P_0(t), P_1(t), \ldots$ the sequence of generating polynomials for a numerical sequence $\{a_n\}_{n=0}^{\infty}$?


In the same way we can explicitly construct GF for combinations satisfying arbitrary restrictions on any of their elements. Moreover, since we usually do not need an explicit list of all combinations or other combinatorial objects in terms of the indeterminates $x_1, x_2, \ldots$, there may be no need to employ these parameters. For example, if we want to find the quantity considered in Problem 1.4.14, we can immediately write down its GF as $(1 + t + \cdots + t^{r_1})(1 + t + t^2 + \cdots)^{n-1}$, which by differentiation returns the same formula (1.4.8).

As another example, we compute the number $\widehat{C}(n, r)$ of combinations of elements of $n$ types taken $r$ at a time with unlimited repetition, under the additional restriction that every combination must contain at least one element of each type. This number is also equal to the number of ways to place $r$ identical balls in $n \le r$ different urns so that no urn is empty. Comparing with (4.3.18), we see that to satisfy this requirement, it is necessary to delete the term 1 from each infinite series in the product. Thus, we immediately construct the GF as $f(t) = (t + t^2 + \cdots)^n$. From here we get $f(t) = t^n (1-t)^{-n} = \sum_{r=n}^{\infty} C(r-1, r-n)\, t^r$, and finally,

\[
\widehat{C}(n, r) = \begin{cases} 0, & r < n, \\ C(r-1, r-n), & r \ge n. \end{cases}
\]
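The formula for $\widehat{C}(n, r)$ can be compared with the coefficient of $t^r$ in a truncated power of $t + t^2 + \cdots$ and with a direct enumeration. The sketch below is illustrative; the helper names are hypothetical, and the enumeration uses `itertools.combinations_with_replacement` from the standard library.

```python
from itertools import combinations_with_replacement
from math import comb


def poly_pow_trunc(p, n, limit):
    """n-th power of the polynomial p, keeping only powers up to t^limit."""
    result = [1] + [0] * limit
    for _ in range(n):
        nxt = [0] * (limit + 1)
        for i, a in enumerate(result):
            if a:
                for j, b in enumerate(p):
                    if i + j <= limit:
                        nxt[i + j] += a * b
        result = nxt
    return result


def count_all_types_present(n, r):
    """Coefficient of t^r in (t + t^2 + ... + t^r)^n."""
    factor = [0] + [1] * r          # the term 1 = t^0 is deleted from each series
    return poly_pow_trunc(factor, n, r)[r]


def brute_force(n, r):
    """Count r-combinations with repetition from n types that use every type."""
    return sum(1 for c in combinations_with_replacement(range(n), r)
               if len(set(c)) == n)
```

Truncating each factor at $t^r$ is harmless, because higher powers cannot contribute to the coefficient of $t^r$.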

Next we consider the simplest ordered combinatorial objects, the arrangements. If we try to find a GF for arrangements as a polynomial similar to (4.3.16), we fail, and the reason for the failure can be easily seen. We cannot derive the complete list of all arrangements, because multiplication is commutative, $(x_1 \cdot x_2 + x_2 \cdot x_1)\, t^2 = 2 x_1 x_2 t^2$, etc. One way to overcome this obstacle is to use noncommutative variables to avoid the appearance of like terms as in the example above. However, we can proceed in a more conventional way.

Notice that formula (4.3.17) can be rewritten as

\[
(1 + t)^m = \sum_{r=0}^{m} A(m, r)\, \frac{t^r}{r!},
\]

which is the EGF for the number of arrangements, so that in the case of ordered totalities the EGF may work better.

Indeed, given $p$ identical objects $x$, there exists a unique arrangement $(x, x, \ldots, x)$ containing all of these objects; therefore, the EGF for this arrangement is

\[
e(t) = 1 \cdot \frac{t^p}{p!}.
\]

If an arrangement contains $k < p$ objects, the EGF can also be formed in a unique way as $1 \cdot \frac{t^k}{k!}$, since the objects are indistinguishable. Then the EGF for arrangements containing any number $k$, $0 \le k \le p$, of these objects is

\[
e(t) = 1 + \frac{t}{1!} + \frac{t^2}{2!} + \cdots + \frac{t^p}{p!},
\]

and the EGF for arrangements that may contain any finite number of indistinguishable objects now appears as an infinite series

\[
e(t) = 1 + \frac{t}{1!} + \frac{t^2}{2!} + \frac{t^3}{3!} + \cdots.
\]

We recognize here the power series for the exponential function $e^t$, which has an infinite radius of convergence.

It is easily seen now that if there are $p$ indistinguishable (identical) objects of one type and $q$ indistinguishable objects of another type, then the EGF for arrangements with repetition, containing any number of elements of the two types, is

\[
e(t) = \left(1 + \frac{t}{1!} + \frac{t^2}{2!} + \cdots + \frac{t^p}{p!}\right)\left(1 + \frac{t}{1!} + \frac{t^2}{2!} + \cdots + \frac{t^q}{q!}\right).
\]

It is clear now that the EGF for arrangements with repetition of elements of $m$ types without any restrictions on their repetitions is

\[
e(t) = \left(1 + \frac{t}{1!} + \frac{t^2}{2!} + \cdots\right)^m = e^{mt}.
\]

The Taylor series $e^{mt} = 1 + \frac{mt}{1!} + \frac{m^2 t^2}{2!} + \cdots + \frac{m^n t^n}{n!} + \cdots$ again recovers formula (1.3.1), $A_{rep}(m, n) = m^n$.

Problem 4.3.16 (Problem 4.1.5 revisited). Among $n$-arrangements with repetition from the set $A = \{0, 1, 2, 3\}$, how many contain at least one digit 1, at least one 2, and at least one 3?

Solution. Since there are no restrictions on the symbol 0, the corresponding factor in the EGF is the same series $1 + \frac{t}{1!} + \frac{t^2}{2!} + \cdots = e^t$. However, to ensure the presence of each of the three other symbols 1, 2, 3, we must delete the term $1 = \frac{t^0}{0!}$ from the corresponding series, and these series become $\frac{t}{1!} + \frac{t^2}{2!} + \cdots = e^t - 1$. Thus, the EGF for the quantities in this problem is

\[
e^t (e^t - 1)^3 = \sum_{n \ge 0} \left(4^n - 3^{n+1} + 3 \cdot 2^n - 1\right) \frac{t^n}{n!}.
\]
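The coefficients $4^n - 3^{n+1} + 3 \cdot 2^n - 1$ come from expanding $e^t(e^t - 1)^3 = e^{4t} - 3e^{3t} + 3e^{2t} - e^t$. For small $n$ they can be confirmed by listing all words; the sketch below does this with `itertools.product`, and the function names are hypothetical.

```python
from itertools import product


def count_formula(n):
    """Coefficient of t^n / n! in e^t (e^t - 1)^3."""
    return 4 ** n - 3 ** (n + 1) + 3 * 2 ** n - 1


def count_brute(n):
    """Count words of length n over {0, 1, 2, 3} containing each of 1, 2, 3."""
    return sum(1 for w in product(range(4), repeat=n) if {1, 2, 3} <= set(w))
```

For $n = 3$ both functions return 6, the number of permutations of the digits 1, 2, 3.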

Problem 4.3.17. Solve again Problems 1.4.18–1.4.20 by making use of EGF. For the reader's convenience we recall these problems.

Problem 1.4.18. Find the number of $n$-arrangements with repetition from the set $A = \{0, 1\}$, containing an even number of 0s.

Problem 1.4.19. Find the number of $n$-arrangements with repetition from the set $A = \{0, 1, 2\}$, containing an even number of 0s.

Problem 1.4.20. Find the number of $n$-arrangements with repetition from the set $A = \{0, 1, 2, 3\}$, containing an even number of 0s and an even number of 1s.

Solution. Since a 0 can appear only an even number of times, the corresponding factor in the EGF is $1 + \frac{t^2}{2!} + \frac{t^4}{4!} + \cdots = (1/2)(e^t + e^{-t})$; in Problem 1.4.20 the symbol 1 contributes the same factor. All other factors are the same as in the preceding problem, that is, these are $e^t$. Thus, we get the following EGF.

In Problem 1.4.18: $(1/2)(e^t + e^{-t})\, e^t = 1 + \sum_{n \ge 1} 2^{n-1} \frac{t^n}{n!}$.

In Problem 1.4.19: $(1/2)(e^t + e^{-t})\, e^{2t} = \sum_{n \ge 0} \frac{1}{2}(3^n + 1) \frac{t^n}{n!}$.

In Problem 1.4.20: $\frac{1}{4}(e^t + e^{-t})^2 e^{2t} = 1 + \sum_{n \ge 1} (4^{n-1} + 2^{n-1}) \frac{t^n}{n!}$.
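The three answers can be cross-checked by enumeration for small $n$. The sketch below is an illustration with a hypothetical helper `count_even`; it counts the words over a given alphabet in which each of the listed symbols occurs an even number of times (zero occurrences included).

```python
from itertools import product


def count_even(symbols, even_symbols, n):
    """Count words of length n over `symbols` with an even count of each symbol in `even_symbols`."""
    return sum(1 for w in product(symbols, repeat=n)
               if all(w.count(s) % 2 == 0 for s in even_symbols))
```

In all three problems the empty word ($n = 0$) is counted once, which is why the constant term of each EGF equals 1.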

Problem 4.3.18. Compute the sum $F(j, k, n) = \sum_{i=1}^{k} C(n - i, j)$, where $j, k, n$ are given natural numbers and $n \ge j + k$.

Solution. Set $a_i(j) = C(n - i, j)$; hence, $a_i(j) = 0$ for $i > n - j$. Let $f_i(t)$ be the GF of the sequence $\{a_i(j)\}_{j=0}^{\infty}$; we are to find partial sums of this sequence. We will compute them by making use of the binomial formula (4.3.17), namely,

\[
f_i(t) = \sum_{j=0}^{\infty} a_i(j)\, t^j = \sum_{j=0}^{n-i} C(n - i, j)\, t^j = (1 + t)^{n-i}.
\]

Let $f_F$ be the GF of the sequence $F(j, k, n)$, where we consider $n$ and $k$ as fixed parameters and $j$ as a variable:

\[
f_F(t) = \sum_{j=0}^{\infty} F(j, k, n)\, t^j = \sum_{j=0}^{\infty} \left\{ \sum_{i=1}^{k} a_i(j) \right\} t^j = \sum_{i=1}^{k} \left\{ \sum_{j=0}^{\infty} a_i(j)\, t^j \right\} = \sum_{i=1}^{k} (1 + t)^{n-i} = \frac{1}{t}(1 + t)^n - \frac{1}{t}(1 + t)^{n-k}.
\]

We notice that the series here are actually finite sums, and at the very last step we used the formula for the sum of a geometric progression,

\[
\sum_{i=1}^{k} q^{n-i} = q^{n-k} \sum_{l=0}^{k-1} q^l = q^{n-k}\, \frac{q^k - 1}{q - 1}
\]

with $q = 1 + t \ne 1$. Applying (4.3.17) twice, we get

\[
F(j, k, n) = C(n, j + 1) - C(n - k, j + 1),
\]

assuming that $C(j, j + 1) = 0$.
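A direct comparison of the sum with the closed form is a quick sanity check of the last formula. The two function names below are hypothetical; `math.comb` returns 0 when the lower index exceeds the upper one, which matches the convention $C(j, j+1) = 0$.

```python
from math import comb


def F_direct(j, k, n):
    """F(j, k, n) computed as the sum of C(n - i, j) for i = 1..k."""
    return sum(comb(n - i, j) for i in range(1, k + 1))


def F_closed(j, k, n):
    """F(j, k, n) from the GF: C(n, j + 1) - C(n - k, j + 1)."""
    return comb(n, j + 1) - comb(n - k, j + 1)
```

For instance, with $j = 1$, $k = 2$, $n = 4$ the sum is $C(3,1) + C(2,1) = 5$ and the closed form is $C(4,2) - C(2,2) = 5$.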



EP 4.3.1. Find explicitly the GF for the following finite or infinite sequences.

1) $a_1 = \{1, 1, 1, \ldots\}$

2) $a_2 = \{0, 0, 0, 1, 1, 1\}$

3) $a_3 = \{0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, \ldots\}$

4) $a_4 = \{1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0, \ldots\}$

5) $a_5 = \{a_n\}_{n=0}^{\infty}$, $a_n = C(5, n)$

6) $a_6 = \{a_n\}_{n=0}^{\infty}$, $a_n = (-1)^n$

7) $a_7 = \{a_n\}_{n=0}^{\infty}$, $a_n = \cos(\alpha n)$

8) $a_8 = \{a_n\}_{n=0}^{\infty}$, $a_n = \sin(\alpha n)$.

EP 4.3.2. This problem refers to EP 4.1.1. Let $a_n$, $n = 0, 1, 2, \ldots$, be the number of elements in the set $S$ possessing exactly $n$ of the properties $P_j$, $j = 1, 2, 3, 4$. Write down explicitly the GF of the sequence $\{a_n\}_{n=0}^{\infty}$.

EP 4.3.3. Using the result of EP 1.1.31 prove that the GF $f_a$ and the EGF $e_a$ of any sequence $a$ are connected by the equation

\[
f_a(t) = \int_0^{\infty} e^{-x} e_a(xt)\, dx.
\]

EP 4.3.4. Introduce a sequence of complementary partial sums (tails) of the sequence $a$,

\[
c_n = a_{n+1} + a_{n+2} + \cdots, \quad n = 0, 1, 2, \ldots.
\]

Prove that if the series $f_a(1)$ is convergent, then

\[
(1 - t) f_c(t) = f_a(1) - f_a(t).
\]

EP 4.3.5. Prove (4.3.7) by mathematical induction.

EP 4.3.6. Prove identity (4.3.12).

EP 4.3.7. Find the coefficients of $t^7$ and $t^{11}$ in the Taylor series of the fraction

\[
\frac{1 - 2t^2 + 3t^5 - t^8 + 10 t^{10}}{2 - 3t + t^2}.
\]

EP 4.3.8. Compute the sum $\sum_{k=0}^{m} C(n, k)\, C(m, k)$, $n \ge m$.

EP 4.3.9. Compute the sum $S(m, p) = \sum_{k=0}^{m} (-1)^k C(p, k)$, $p \ge m$.


EP 4.3.10. There are 10 married couples living at a townhouse. In how many ways is it possible to select a committee out of these people, consisting of 2 men and 3 women?

EP 4.3.11. Solve the preceding problem if no person can serve on the committee together with her/his spouse.

EP 4.3.12. In how many ways can one buy 2 different books if a bookstore has 3 bestsellers?

EP 4.3.13. Solve the preceding problem if one can also buy two copies of the same title.

EP 4.3.14. How many ways are there to buy for gifts 20 copies of these 3 bestsellers? We can buy 20 copies of the same title.

EP 4.3.15. How many ways are there to buy for gifts 20 copies of these 3 bestsellers if we want to buy at least one and no more than 10 copies of the same title?

EP 4.3.16. How many 10-letter words are there composed of 5 vowels and the letter “z”, which contain each vowel at least once?

EP 4.3.17. Use the identity $(1 + t)^m \cdot (1 + t)^p = (1 + t)^{m+p}$ and GF to prove the equation

\[
C(m + p, n) = \sum_{k=0}^{m} C(m, k)\, C(p, n - k).
\]

EP 4.3.18. Use GF and the identity

\[
(1 - t)^{-1-n} (1 + t)^{-1-n} = (1 - t^2)^{-1-n}
\]

to prove the equation

\[
\sum_{j=0}^{2m} (-1)^j C(n + j, n)\, C(n + 2m - j, n) = C(n + m, m).
\]

EP 4.3.19. Find the sum of the third powers and the sum of the fourth powers of the first $n$ natural numbers.

EP 4.3.20. A bookstore has four novels, in English, French, German, and Russian, 100 copies of each. In how many ways is it possible to buy 50 books so as to have even numbers of English, French, and German books and an odd number of Russian books? Answer the same question if the number of English and French titles together does not exceed 4. Answer the same question if the quantity of German books is at least twice that of Russian books.


EP 4.3.21. Let $f$ be the GF of a sequence $\{a_0, a_1, \ldots\}$. What sequence is generated by the function $f^2$?

EP 4.3.22. Find appropriate analogues of (4.3.12) and use them to prove that each natural number has a unique binary (that is, base 2), ternary (base 3), quaternary (base 4), etc., representation.

EP 4.3.23. Use GF or EGF to prove, similarly to Theorem 4.3.13, the following pairs of inversion formulas.


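1) The equations $a_n = \sum_{k=0}^{n} (-1)^k C(n, k)\, b_{n-k}$, $\forall n = 0, 1, 2, \ldots$, are equivalent to $b_n = \sum_{k=0}^{n} C(n, k)\, a_{n-k}$, $\forall n = 0, 1, 2, \ldots$.

2) The equations $a_n = \sum_{k=0}^{n} C(n + p, k + p)\, b_k$, $\forall n = 0, 1, 2, \ldots$, are equivalent to $b_n = \sum_{k=0}^{n} (-1)^{n-k} C(n + p, k + p)\, a_k$, $\forall n = 0, 1, 2, \ldots$.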

EP 4.3.24. Does Theorem 4.3.13 remain valid if the equation

\[
a_{n_0} = \sum_{k=0}^{n_0} (-1)^k C(m, k)\, b_{n_0 - k}
\]

fails for only one natural $n_0$? The same question with regard to EP 4.3.23.

EP 4.3.25. Show that the assertion $u_n = \sum_{m=0}^{n} C(n, m)\, v_m$ $(\forall n \ge 0)$ of EP 4.2.5 is equivalent to the equation $f_u(t) = e^t f_v(t)$ between the EGF of the sequences $u = \{u_n\}$ and $v = \{v_n\}$.

EP 4.3.26. Restore all computations in Problem 4.3.12.

EP 4.3.27. Find again the number of combinations with and without repetition using generating polynomials, that is, truncated GF.

EP 4.3.28. To facilitate performance of the participants of a contest, the Combi Club bought 15 identical chocolate bars. In how many ways is it possible to distribute them among five participants of the contest, so that each contestant receives at least two but no more than four chocolates?

EP 4.3.29. How many three-term geometric progressions $(a, aq, aq^2)$ like 2, 6, 18, and four-term geometric progressions $(a, aq, aq^2, aq^3)$ like 3, 9, 27, 81, are there in the set $\{1, 2, 3, \ldots, 99, 100\}$?

EP 4.3.30. Prove that if $f_a, f_b, f_c, f_d$ are the GF of sequences $a, b, c, d$, respectively, and $f_d = f_a \cdot f_b \cdot f_c$, then

\[
d_n = \sum_{i + j + k = n} a_i b_j c_k,
\]

where the sum runs over all nonnegative integer solutions of the equation $i + j + k = n$.


EP 4.3.31. Use equations (2.1.3)–(2.1.5) and the result of EP 2.1.17 to derive the following GF:

1) $(1 + t)^{p(p-1)/2} = \sum_{q \ge 0} C(p(p-1)/2, q)\, t^q$ for the number of simple labelled graphs of order $p$ and size $q$

2) $(1 + t)^{p(p+1)/2} = \sum_{q \ge 0} C(p(p+1)/2, q)\, t^q$ for the number of labelled graphs of order $p$ and size $q$ with loops but without multiple edges

3) $(1 - t)^{-p(p-1)/2}$ for the number of labelled graphs of order $p$ and size $q$ without loops but with multiple edges

4) $(1 - t)^{-p(p+1)/2}$ for the number of labelled graphs of order $p$ and size $q$ with loops and multiple edges.

EP 4.3.32. 1) Use the method of undetermined coefficients and the equation

\[
(1 + x)^{1/2} (1 + x)^{1/2} = 1 + x
\]

to compute the coefficients of

\[
(1 + x)^{1/2} = a_0 + a_1 x + a_2 x^2 + \cdots,
\]

namely,

\[
(1 + x)^{1/2} = 1 + \frac{1}{2} x + \cdots + \frac{(1/2)(1/2 - 1)(1/2 - 2) \cdots (1/2 - n + 1)}{n!}\, x^n + \cdots.
\]

2) Use the same method to derive the equation

\[
(1 + x)^{-1/2} = 1 - \frac{1}{2^2} C(2, 1)\, x + \frac{1}{2^4} C(4, 2)\, x^2 + \cdots + \frac{(-1)^n}{2^{2n}} C(2n, n)\, x^n + \cdots.
\]

3) What property of the binomial coefficients can be deduced from the equation $\left[(1 + x)^{-1/2}\right]^2 = \frac{1}{1 + x}$?

EP 4.3.33. Let �.p/ be the number of rooted labelled trees of order p and

#.t/ D

1XpD1

�.p/tp

pŠ

be the corresponding EGF. Prove that

#.t/ D �W.t/;

where $W(t)$ is the Lambert W function, that is, the (many-valued) solution of the transcendental equation $W e^{W} = -t$. A rooted tree is a tree with a singled out vertex.


EP 4.3.34. Prove that the convolution of sequences is commutative and associative, that is, $a * b = b * a$ and $a * (b * c) = (a * b) * c$.

EP 4.3.35. Prove that the binomial convolution of sequences is commutative and associative.

EP 4.3.36. Derive the EGF for the Stirling numbers of the second kind $S_2(k, n)$,

\[
\frac{1}{n!} (e^t - 1)^n = \sum_{k \ge n} S_2(k, n)\, \frac{t^k}{k!}.
\]

EP 4.3.37. Use (4.3.13) to prove the equation

\[
\sum_{k=1}^{\infty} \frac{k}{2^k} = 2.
\]

The latter implies immediately that $\sum_{k=1}^{\infty} \frac{k - 1}{2^k} = 1$.

EP 4.3.38. What is the probability of getting heads at the first flip of a fair coin? At the second flip? At the third? ... At the $n$th flip? Use the previous problem to find the expected value of the number of flips before the first head occurs.

4.4 Generating Functions II. Applications

In this section we employ the method of GF to study partitions and compositions of natural numbers. Then we take up linear difference equations (or recurrence relations) with constant coefficients and solve more problems.

N. M. Ferrers · Partitions in Number Theory · Young Tableaus (diagrams) · Fibonacci summary · Fibonacci numbers · Hardy's biography · Lucas' biography · Jacobi's biography · Jacobians

Problem 4.4.1. The postage for a letter is 84 cents. In how many ways can one buy stamps to send the letter if the post office has only 42 cent and 1 cent stamps?

Solution. The simplest way is to buy two 42 cent stamps. However, it is also possible to buy one 42 cent stamp and 42 penny stamps, or to buy 84 penny stamps. Therefore, there are three ways to pay the postage, namely,

\[
84\ (\text{cents}) = 1\ (\text{cent}) \cdot 84 = 1\ (\text{cent}) \cdot 42 + 42\ (\text{cents}) \cdot 1 = 42\ (\text{cents}) \cdot 2.
\]

In this problem we represent the integer 84 as a sum of integers in several ways. It should also be noticed that stamps can be put in any order, while in other problems the order of addends can be essential. Similar problems$^7$ occur in many applications. In this section we study them using the method of GF. We start with a formal definition.

Definition 4.4.1. A set of ordered pairs of natural numbers

\[
\Pi(n, k) = \{(n_1, k_1), (n_2, k_2), \ldots, (n_l, k_l)\},
\]

where $k_1 + \cdots + k_l = k$ and $n_1 k_1 + \cdots + n_l k_l = n$, is called a partition of a natural number $n$ in $k$ terms (or addends) $n_1, n_2, \ldots, n_l$. Since addition is commutative, without loss of generality we will always list the terms of partitions in increasing order, that is, we always assume that $1 \le n_1 < n_2 < \cdots < n_l$.

For instance, the partition $\Pi(15, 8) = \{(1, 4), (2, 3), (5, 1)\}$, where $l = 3$, $n_1 = 1$, $n_2 = 2$, $n_3 = 5$, $k_1 = 4$, $k_2 = 3$, $k_3 = 1$, $4 + 3 + 1 = 8$ and $1 \cdot 4 + 2 \cdot 3 + 5 \cdot 1 = 15$, corresponds to the following representation of 15 as a sum of 8 terms: $15 = (1 + 1 + 1 + 1) + (2 + 2 + 2) + (5)$. In Problem 4.4.1 we found three partitions of the number 84: $\Pi(84, 84) = \{(1, 84)\}$, that is, $84 = 1 \cdot 84$; $\Pi(84, 43) = \{(1, 42), (42, 1)\}$, that is, $84 = 1 \cdot 42 + 42 \cdot 1$, a sum of $42 + 1 = 43$ terms; and $\Pi(84, 2) = \{(42, 2)\}$, that is, $84 = 42 \cdot 2$.

Theorem 4.4.2. The GF for the number of partitions is

\[
(1 + t + t^2 + t^3 + \cdots)(1 + t^2 + t^4 + t^6 + \cdots) \cdots (1 + t^k + t^{2k} + t^{3k} + \cdots) \cdots = \left\{(1 - t)(1 - t^2) \cdots (1 - t^k) \cdots\right\}^{-1}. \tag{4.4.1}
\]

Proof. If we multiply out all the series in (4.4.1), the power $t^n$ appears as the product

\[
t^n = t^{1 \cdot k'_1} \cdot t^{2 \cdot k'_2} \cdots t^{i \cdot k'_i} \cdots.
\]

Here the exponent $1 \cdot k'_1$ indicates that the number $n$ contains $k'_1$ units, that is, the first series in (4.4.1) contributed the term $t^{k'_1}$ as a factor to $t^n$. Next, the exponent $2 \cdot k'_2$ shows that $n$ contains $k'_2$ twos, etc.; some $k'_i$ may be equal to zero, in which case the corresponding series contributed the term $1 = t^0$. Leaving out zero exponents, denoting nonzero terms by $k_i$, and using the geometric series (4.3.6), we deduce (4.4.1).
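The product in (4.4.1) can be expanded numerically one factor at a time. The sketch below is an illustration with a hypothetical function name; it multiplies a truncated series by $1/(1 - t^k)$ for each allowed part $k$, which amounts to the update `coeffs[n] += coeffs[n - k]` for increasing `n`.

```python
def partition_counts(parts, limit):
    """Coefficients of the product of 1 / (1 - t^k) over k in `parts`, up to t^limit."""
    coeffs = [1] + [0] * limit
    for k in parts:
        # Multiplying by 1 + t^k + t^(2k) + ... allows any number of parts equal to k.
        for n in range(k, limit + 1):
            coeffs[n] += coeffs[n - k]
    return coeffs
```

With `parts = [1, 42]` the coefficient of $t^{84}$ is 3, the three ways of paying the postage found in Problem 4.4.1; with all parts from 1 to 10 the list starts 1, 1, 2, 3, 5, 7, 11, ...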

Problem 4.4.2. Solve again Problem 4.4.1 by making use of GF (4.4.1).

7 These problems are discussed in detail by G. Andrews [3].


Problem 4.4.3. Show that the GF for the number of partitions with different addends, that is, with $k_1 = k_2 = \cdots = k_l = 1$ (no term repeats), is $(1 + t)(1 + t^2) \cdots (1 + t^k) \cdots$. The GF of partitions with all terms not exceeding a given number $q$, that is, with $n_i \le q$ for every $i$, is

\[
(1 + t + t^2 + t^3 + \cdots)(1 + t^2 + t^4 + t^6 + \cdots) \cdots (1 + t^q + t^{2q} + t^{3q} + \cdots) = \left\{(1 - t)(1 - t^2) \cdots (1 - t^q)\right\}^{-1}. \tag{4.4.2}
\]

The coefficient of $t^n$ in (4.4.1) depends only on $n$ and counts all the partitions of $n$ with any $k = 1, 2, \ldots$. However, if we consider the number of partitions of $n$ containing precisely $k$ parts, this quantity depends on two integer parameters $n$ and $k$. If the quantities we seek depend on several parameters, it may be useful to employ GF of two or more variables. In the next theorem we find a GF for partitions of integers, taking into account both $n$ and the number $k$ of terms in a partition.

Theorem 4.4.3. The GF of the partitions containing exactly $k$ terms is

\[
P(t, k) = t^k \left\{(1 - t)(1 - t^2) \cdots (1 - t^k)\right\}^{-1}. \tag{4.4.3}
\]

Proof. To consider both parameters, $n$ and $k$, of $\Pi(n, k)$, we introduce a function of two variables

\[
F(t, u) = (1 + ut + u^2 t^2 + u^3 t^3 + \cdots)(1 + ut^2 + u^2 t^4 + u^3 t^6 + \cdots) \cdots (1 + ut^i + u^2 t^{2i} + u^3 t^{3i} + \cdots) \cdots = \left\{(1 - ut)(1 - ut^2) \cdots\right\}^{-1}. \tag{4.4.4}
\]

Multiplying out the series in (4.4.4), we get the addends (cf. the proof of Theorem 4.4.2)

\[
u^{k'_1} t^{k'_1} \cdot u^{k'_2} t^{2k'_2} \cdots u^{k'_i} t^{i k'_i} \cdots = u^{k'_1 + k'_2 + \cdots}\, t^{k'_1 + 2k'_2 + \cdots},
\]

that is, the exponent of the factor $t$ contains $k'_1$ units, $k'_2$ twos, and so on, totaling $k'_1 + k'_2 + \cdots$ addends. Therefore, expanding $F(t, u)$ in powers of $u$, we derive

\[
F(t, u) = \sum_{k=0}^{\infty} P(t, k)\, u^k, \tag{4.4.5}
\]

where the coefficient $P(t, k)$ includes only powers $t^n$ such that the corresponding partition of $n$ contains exactly $k$ terms. Hence, if we expand $P(t, k)$ in powers of $t$, the coefficient of $t^n$ will give the number of partitions of $n$ consisting of exactly $k$ parts. Thus, $P(t, k)$ is the GF we are looking for, since it lists partitions with precisely $k \ge 1$ parts.


Now we find $P(t, k)$ explicitly. First we remark that by (4.4.4) and (4.4.5), $P(t, 0) = F(t, 0) = 1$. Moreover, one can directly verify the equation

\[
(1 - ut) F(t, u) = F(t, ut).
\]

Inserting series (4.4.5) in this equation, we get

\[
P(t, k) - t P(t, k - 1) = t^k P(t, k), \quad k = 1, 2, \ldots,
\]

so that $(1 - t^k) P(t, k) = t P(t, k - 1)$, $k \ge 1$. If we replace here $k$ with $k - 1$, we get $(1 - t^{k-1}) P(t, k - 1) = t P(t, k - 2)$. From these two equations

\[
(1 - t^k)(1 - t^{k-1}) P(t, k) = t^2 P(t, k - 2), \quad k \ge 2.
\]

Repeating this process, that is, reducing the latter to $P(t, k - 3)$, then to $P(t, k - 4), \ldots$, and eventually to $P(t, 0)$, we get (4.4.3).
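Formula (4.4.3) can be compared with a direct count of partitions into exactly $k$ parts. The sketch below is illustrative and both function names are hypothetical: the first reads off the coefficient of $t^{n-k}$ in $\{(1-t)\cdots(1-t^k)\}^{-1}$, the second counts partitions recursively by their largest part.

```python
def exact_parts_gf(n, k):
    """Coefficient of t^n in t^k / ((1 - t)(1 - t^2)...(1 - t^k))."""
    if n < k:
        return 0
    m = n - k                        # the factor t^k shifts the index by k
    coeffs = [1] + [0] * m
    for i in range(1, k + 1):
        for j in range(i, m + 1):
            coeffs[j] += coeffs[j - i]
    return coeffs[m]


def exact_parts_brute(n, k, largest=None):
    """Count partitions of n into exactly k positive parts, each at most `largest`."""
    if largest is None:
        largest = n
    if k == 0:
        return 1 if n == 0 else 0
    # Choose the largest part p first; the remaining k - 1 parts do not exceed p.
    return sum(exact_parts_brute(n - p, k - 1, p)
               for p in range(1, min(n, largest) + 1))
```

For example, 7 has four partitions into exactly three parts: $5+1+1$, $4+2+1$, $3+3+1$, $3+2+2$.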

Problem 4.4.4. Show that the number of partitions of the number $2r + k$ in $r + k$ parts does not depend on $k$.

Solution. By Theorem 4.4.3, this number is the coefficient of $t^{2r+k}$ in $P(t, r + k)$. Since $(2r + k) - (r + k) = r$, this coefficient is equal to the coefficient of $t^r$ in the Taylor series of the function $\left\{(1 - t)(1 - t^2) \cdots (1 - t^{r+k})\right\}^{-1}$. However, the latter coefficient depends only on the first $r$ factors

\[
\left\{(1 - t)(1 - t^2) \cdots (1 - t^r)\right\}^{-1}.
\]

Then, due to (4.4.2), this coefficient is equal to the number of partitions of $r$ into addends not exceeding $r$. Since a partition of $r$ cannot contain a term bigger than $r$ itself, the latter condition (“addends not exceeding $r$”) imposes no restriction on partitions; thus the quantity we want is the total number of the partitions of $r$, which does not depend on $k$.

This problem and some others can be easily solved by making use of special diagrams.

Definition 4.4.4. Let $k_1 + \cdots + k_l = k$ and $n_1 k_1 + \cdots + n_l k_l = n$. The Ferrers (or Young) diagram of a partition

\[
\Pi(n, k) = \{(n_1, k_1), (n_2, k_2), \ldots, (n_l, k_l)\}
\]

is a set of $n = n_1 k_1 + \cdots + n_l k_l$ dots in the plane, situated in $k = k_1 + \cdots + k_l$ rows such that for any $i = 1, 2, \ldots, l$ there are $k_i$ rows containing $n_i$ dots each.


We always consider normalized diagrams, such that the left-most dots of all rows form a vertical column and the numbers of dots in consecutive rows, from top to bottom, do not increase. Thus, diagrams explicitly display the terms of a partition as horizontal rows of dots, from largest to smallest. For example, the Ferrers diagram in Fig. 4.2 depicts the partition $\Pi(13, 6) = \{(1, 3), (3, 2), (4, 1)\}$.

• • • •
• • •
• • •
•
•
•

Figure 4.2. The Ferrers diagram of the partition $\Pi(13, 6)$.

Another solution of Problem 4.4.4. Since a partition $\Pi(2r + k, r + k)$ contains $r + k$ parts, the left-most column of its normalized Ferrers diagram consists of $r + k$ dots. Hence, the complementary part of the diagram contains $(2r + k) - (r + k) = r$ dots, so that this part corresponds to a partition of the number $r$. Vice versa, if we append a column consisting of $r + k$ dots on the left to the normalized diagram of any partition of a number $r$, we derive the diagram of some partition $\Pi(2r + k, r + k)$, which proves the statement.

Next we consider compositions, or ordered partitions of integer numbers. To formalize ordering, we again use the language of mappings. We recall the notation of a natural segment $N_m = \{1, 2, \ldots, m\}$.

Definition 4.4.5. A composition of a natural number $n \in N$, containing $m$ parts, is a mapping $f : N_m \to N$ such that $f(1) + f(2) + \cdots + f(m) = n$.

Example 4.4.6. The partition (Fig. 4.2)

\[
\Pi(13, 6) = \{(1, 3), (3, 2), (4, 1)\},
\]

that is, $13 = (1 + 1 + 1) + (3 + 3) + (4)$, generates a composition $13 = \{1, 1, 1, 3, 3, 4\}$. This composition can be realized as a mapping

\[
f_1 : \{1, 2, 3, 4, 5, 6\} \to N
\]

such that $f_1(1) = f_1(2) = f_1(3) = 1$, $f_1(4) = f_1(5) = 3$, and $f_1(6) = 4$. However, the mapping with the same domain and codomain,

\[
f_2 : \{1, 2, 3, 4, 5, 6\} \to N,
\]

such that $f_2(1) = f_2(2) = f_2(5) = 1$, $f_2(4) = f_2(6) = 3$, and $f_2(3) = 4$, corresponds to another composition with the same parameters $n = 13$ and $m = 6$, namely, $13 = \{1, 1, 4, 3, 1, 3\}$.


Using the same reasoning as before, we see that the GF for the compositions of a number $n$ consisting of $m$ parts is

\[
(t + t^2 + \cdots)^m = t^m (1 - t)^{-m}.
\]

To find the coefficient of $t^n$ in this series, we have to compute the coefficient of $t^{n-m}$ in the series $(1 - t)^{-m}$; see (4.3.7). Hence, we have proved the next statement.

Theorem 4.4.7. There are

\[
C(n - 1, m - 1) \tag{4.4.6}
\]

compositions of a number $n$ containing $m$ parts.

Remark 4.4.8. Formula (4.4.6) simultaneously counts combinations with unlimited repetitions from elements of $m$ types containing at least one element of each type.

Corollary 4.4.9. There are $C(n + m - 1, m - 1)$ compositions of $n$ with $m$ parts if a composition can contain zero terms. To prove this, we can just replace each term $x_j$ of a composition with $x_j + 1$, thus increasing $n$ by $m$, and apply (4.4.6) to the number $n + m$.
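Both counts are small enough to verify by brute force. The sketch below is an illustration with a hypothetical function name; it enumerates all $m$-tuples with entries up to $n$ and keeps those that sum to $n$, with or without zero terms allowed.

```python
from itertools import product
from math import comb


def compositions(n, m, allow_zero=False):
    """Count ordered m-tuples of integers summing to n (parts >= 1, or >= 0 if allow_zero)."""
    low = 0 if allow_zero else 1
    return sum(1 for parts in product(range(low, n + 1), repeat=m)
               if sum(parts) == n)
```

For $n = 4$ and $m = 2$ the function finds the 3 compositions $1+3$, $2+2$, $3+1$, and 5 compositions when zero terms are allowed, matching $C(3, 1)$ and $C(5, 1)$.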

Problem 4.4.5. In how many ways is it possible to get the sum of $n$ after rolling a fair die several times, if after each roll we add up the numbers it landed on?

Solution. The question, as it is stated, is ambiguous because the answer depends on whether or not we consider the order in which the outcomes occur. For instance, we can get the sum of 4 as the result of rolling a 3 followed by a 1, or as the result of rolling a 1 followed by a 3. Are these two outcomes different, or do we consider them the same and identify such results?

These are two different problems. First we assume that the result depends on the order of addends. Thus, we have to compute the number of compositions of a natural number $n$ such that no part of a composition exceeds 6, and there is no restriction on the number of parts. If the number $m$ of rollings, that is, the number of parts of a composition, is given, we have a one-to-one correspondence between the compositions we seek and mappings

\[
f : \{1, 2, \ldots, m\} \to \{1, 2, 3, 4, 5, 6\}
\]

such that $f(1) + f(2) + \cdots + f(m) = n$. Since any composition has at least one part, such mappings are listed by the polynomial

\[
(t + t^2 + \cdots + t^6)^m = t^m (1 + t + t^2 + \cdots + t^5)^m
\]

and the GF for such mappings is

\[
\sum_{m=1}^{\infty} t^m (1 + t + t^2 + \cdots + t^5)^m = \frac{t (1 + t + t^2 + \cdots + t^5)}{1 - t - t^2 - \cdots - t^6}. \tag{4.4.7}
\]


If we want to avoid operations with infinite series, we can do that by specifying an $n$ and truncating all series, keeping in mind that $m$ cannot exceed $n$: even if a die shows a 1 every time, it is sufficient to roll the die $n$ times to accumulate the sum of $n$.

For example, if $n = 4$, we can consider the polynomial

\[
\sum_{m=1}^{4} (t + t^2 + \cdots + t^6)^m.
\]

Moreover, if $n = 4$, the faces with 5 and 6 are irrelevant for the problem, and we can even consider a polynomial of a smaller degree,

\[
\sum_{m=1}^{4} (t + t^2 + t^3 + t^4)^m.
\]

After a simple calculation we find the coefficient of $t^4$ in the latter polynomial to be 8, that is, the sum of 4 can be obtained in 8 ways shown in equation (4.4.8):

\[
4 = \begin{cases}
4 \\
1 + 3 \\
3 + 1 \\
2 + 2 \\
1 + 1 + 2 \\
1 + 2 + 1 \\
2 + 1 + 1 \\
1 + 1 + 1 + 1.
\end{cases} \tag{4.4.8}
\]

Let us return to an arbitrary $n$. If we allow $m$ to be equal to 0, assuming by definition that there exists a unique composition consisting of zero parts, then the GF is simpler:

\[
\sum_{m=0}^{\infty} t^m (1 + t + t^2 + \cdots + t^5)^m = \frac{1}{1 - t - t^2 - \cdots - t^6},
\]

but obviously the latter has the same coefficient of $t^n$, $\forall n \ge 1$, as the former one given by (4.4.7).

If we specify the number $m$ of rollings, then the GF is $(t + t^2 + \cdots + t^6)^m$, where by the multinomial theorem (EP 1.5.19) the coefficient of $t^n$ is

\[
\sum_{\substack{k_1 + k_2 + \cdots + k_6 = m \\ k_1 + 2k_2 + \cdots + 6k_6 = n}} C(m; k_1, k_2, \ldots, k_6),
\]

and $C(m; k_1, k_2, \ldots, k_6)$ are multinomial coefficients (1.5.1). Here $k_i$ can be any whole numbers including zero. For example, if $n = 4$ and $m = 3$, the latter sum contains the only addend with $k_1 = 2$ and $k_2 = 1$ and reduces to $C(3; 2, 1) = 3$. Indeed, among the eight terms in (4.4.8) exactly three terms, $1 + 1 + 2$, $1 + 2 + 1$, and $2 + 1 + 1$, contain 3 addends.

Now we solve Problem 4.4.5 assuming that the result does not depend on the or-der of faces. Therefore, we have to find the number of partitions with all parts notexceeding 6, and the GF is given by (4.4.2) with q D 6,

.1C t C t2 C � � � / � .1C t2 C t4 C t6 C � � � / � � � .1C t6 C t12 C t18 C � � � /:

In particular, the coefficient of $t^4$ here is 5; consequently, an unordered sum of 4 can be obtained in five ways, including 4 itself, namely,

\[ 4 = \begin{cases} 4 \\ 3 + 1 \\ 2 + 2 \\ 2 + 1 + 1 \\ 1 + 1 + 1 + 1. \end{cases} \]
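The product of geometric series above can be expanded one factor at a time. The sketch below is illustrative only; the function name `partitions_bounded` is an assumption. It multiplies in the factor $1 + t^{j} + t^{2j} + \cdots$ for each allowed part $j$ and reads off the coefficient of $t^n$.

```python
def partitions_bounded(n, q=6):
    """Number of partitions of n with every part at most q."""
    ways = [1] + [0] * n                 # coefficients of the empty product, 1
    for part in range(1, q + 1):         # multiply by 1/(1 - t^part)
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
    return ways[n]


print(partitions_bounded(4))   # unordered ways to obtain the sum 4
```

Processing the parts one by one is what makes the count unordered: each partition is built with its parts in nondecreasing order exactly once.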

In the remainder of this section we apply the method of GF to solving linear recurrence relations (also called linear difference equations) with constant coefficients.

Definition 4.4.10. A sequence $\{a_n\}_{n=0}^{\infty}$ satisfies a linear (non-homogeneous) difference equation of order $r$ if for all $n = 0, 1, 2, \ldots$ the equations

\[ a_{n+r} = c_1 a_{n+r-1} + c_2 a_{n+r-2} + \cdots + c_r a_n + d_{n+r} \tag{4.4.9} \]

hold, where $c_1, \ldots, c_r$ are given constant coefficients, $c_r \ne 0$, and $\{d_n\}_{n=0}^{\infty}$ is a given sequence. If all $d_n = 0$, then (4.4.9) is called homogeneous.

A well-known example of such a sequence is the sequence of the Fibonacci numbers

\[ \{1, 1, 2, 3, 5, 8, \ldots\} \]

satisfying the difference equation $a_{n+2} = a_{n+1} + a_n$ for all $n \ge 0$. The theory of linear difference equations is in many instances similar to the theory of linear ordinary differential equations. In particular, the reader can readily prove the following superposition principle for linear difference equations.

Proposition 4.4.11. 1) If two sequences satisfy a linear homogeneous difference equation, then any linear combination of these sequences also satisfies this equation.

2) A linear difference equation of order $r$ has $r$ linearly independent solutions; to specify a solution, one must assign $r$ additional conditions.

To develop the theory of linear difference equations, we prove in Theorem 4.4.13 that if a sequence $\{a_n\}_{n=0}^{\infty}$ satisfies a difference equation (4.4.9), then its GF is a rational function. First we give one more definition.


Definition 4.4.12. The polynomial

\[ g(t) = t^r - c_1 t^{r-1} - c_2 t^{r-2} - \cdots - c_{r-1} t - c_r \]

is called the characteristic polynomial of difference equation (4.4.9).

Let the characteristic polynomial $g(t)$ have the roots $\alpha_1, \alpha_2, \ldots, \alpha_s$ with multiplicities $l_1, l_2, \ldots, l_s$, $l_1 + l_2 + \cdots + l_s = r$. Introduce another polynomial

\[ k(t) = t^r g(1/t). \]

It is known that if all coefficients $c_i$ in (4.4.9) are real numbers (we consider only this case), then complex roots of $g$, if there are any, must appear in pairs of complex conjugate numbers, that is, if $\alpha = a + bi$ is a root, where $a, b$ are real numbers and $i$ is the imaginary unit, $i^2 = -1$, then $\bar{\alpha} = a - bi$ also must be a root of the same multiplicity [32]. Moreover, $g$ can be factored as

\[ g(t) = (t - \alpha_1)^{l_1} \cdots (t - \alpha_s)^{l_s}, \]

therefore

\[ k(t) = (1 - \alpha_1 t)^{l_1} \cdots (1 - \alpha_s t)^{l_s}. \tag{4.4.10} \]

We use these observations to find the general solution of any linear homogeneous difference equation with constant coefficients. As before, the GF of the sequence $a = \{a_n\}_{n=0}^{\infty}$ is denoted by $f_a(t) = \sum_{n=0}^{\infty} a_n t^n$.

Theorem 4.4.13. The GF $f_a$ of a sequence $a$ satisfying a homogeneous equation (4.4.9) is a rational function,

\[ f_a(t) = \frac{h(t)}{k(t)} \tag{4.4.11} \]

where the polynomial $k$ of degree $r$ was defined in (4.4.10) and $h$ is a polynomial of degree at most $r - 1$. Moreover,

\[ f_a(t) = \sum_{n=0}^{\infty} \Bigl\{ \sum_{i=1}^{s} P_i(n) \alpha_i^n \Bigr\} t^n. \]

Thus, the general term of the sequence $a$ is given by the expression

\[ a_n = \sum_{i=1}^{s} P_i(n) \alpha_i^n, \quad n \ge 0 \tag{4.4.12} \]

where each polynomial $P_i$ has degree at most $l_i - 1$; in particular, $P_i = \mathrm{const}$ whenever $\alpha_i$ is a simple root.

Vice versa, if the $a_n$ are given by (4.4.12), then the sequence $\{a_n\}_{n=0}^{\infty}$ satisfies a homogeneous equation (4.4.9).


Proof. Multiplying the GF $f_a(t) = \sum_{n=0}^{\infty} a_n t^n$ by the polynomial $k(t) = t^r g(1/t)$, and making use of equation (4.4.9), we readily verify that $h(t) = f_a(t) \cdot k(t)$ is a polynomial of degree at most $r - 1$, which immediately implies (4.4.11).

To prove (4.4.12), we decompose the rational function (4.4.11) into the sum of partial fractions,

\[ f_a(t) = \sum_{i=1}^{s} \sum_{j=1}^{l_i} \beta_{ij} (1 - \alpha_i t)^{-j} \]

where the $\beta_{ij}$ are constants, and (4.3.7) implies the expansion

\[ (1 - \alpha t)^{-j} = \sum_{n=0}^{\infty} C(n + j - 1, j - 1)\, \alpha^n t^n. \]

Note also that

\[ \sum_{j=1}^{l_i} \beta_{ij}\, C(n + j - 1, j - 1)\, \alpha_i^n = P_i(n)\, \alpha_i^n \]

where $P_i(n)$ is a polynomial of degree at most $l_i - 1$; moreover, any such polynomial can be obtained by an appropriate choice of the constants $\beta_{ij}$. This proves (4.4.12).

Each polynomial $P_i(n)$ has at most $l_i$ nonzero coefficients, since its degree does not exceed $l_i - 1$. Hence, these polynomials altogether have $l_1 + \cdots + l_s = r$ coefficients, and to find them we need $r$ additional conditions; for example, we can assign the first $r$ terms $\{a_0, a_1, \ldots, a_{r-1}\}$ of the sequence $a$.
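The first step of the proof can be observed numerically. The sketch below is illustrative and not from the text; the function names are assumptions. Writing $k(t) = 1 - c_1 t - \cdots - c_r t^r$, it computes the leading coefficients of the product $f_a(t) \cdot k(t)$ for a sequence generated by a homogeneous recurrence; all coefficients from degree $r$ on vanish, so the product is a polynomial $h$ of degree at most $r - 1$.

```python
def sequence(coeffs, initial, length):
    """Terms a_0, ..., a_{length-1} of a_{n+r} = c_1 a_{n+r-1} + ... + c_r a_n."""
    a = list(initial)
    while len(a) < length:
        # coeffs[0] = c_1 multiplies the most recent term, coeffs[1] = c_2 the one before, ...
        a.append(sum(c * a[-i - 1] for i, c in enumerate(coeffs)))
    return a[:length]


def numerator_coefficients(coeffs, initial, length):
    """First `length` coefficients of f_a(t) * k(t), with k(t) = 1 - c_1 t - ... - c_r t^r."""
    a = sequence(coeffs, initial, length)
    k = [1] + [-c for c in coeffs]
    return [
        sum(k[i] * a[n - i] for i in range(len(k)) if n - i >= 0)
        for n in range(length)
    ]


# Fibonacci numbers 1, 1, 2, 3, ...: c_1 = c_2 = 1
print(numerator_coefficients([1, 1], [1, 1], 8))
# a_{n+2} = 2 a_{n+1} - a_n with a_0 = 2, a_1 = 3
print(numerator_coefficients([2, -1], [2, 3], 8))
```

Only the first $r$ entries of each printed list can be nonzero; they are the coefficients of $h(t)$.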

Remark 4.4.14. We immediately see that if a root is simple, then the corresponding polynomial is a constant; hence it follows from (4.4.12) that if all roots are simple, then every $a_n$ is a linear combination of the $n$th powers of the $s$ roots of the characteristic polynomial.

Problem 4.4.6. How many $n$-arrangements with repetition from the two-element set $A = \{a, b\}$ are there such that no two symbols $a$ are situated next to one another?

Solution. Denote the number of these arrangements by $f(n)$. We define $f(0) = 1$, for there is the unique empty arrangement; it is also clear that $f(1) = 2$ since there are two such one-element arrangements, namely $(a)$ and $(b)$. If $n \ge 2$, we can split all such arrangements into two disjoint subsets: those beginning with an $a$ and those beginning with a $b$. The second subset contains $f(n-1)$ arrangements because the first character $b$ puts no restriction on the second symbol. However, the first subset contains $f(n-2)$ arrangements, since the first $a$ must be followed by a $b$ to avoid two adjacent symbols $a$. By the Sum Rule

\[ f(n) = f(n-1) + f(n-2), \quad n \ge 2. \tag{4.4.13} \]


Consequently, the sequence $f(n)$ satisfies the homogeneous second order difference equation (4.4.13) with the characteristic polynomial $g(t) = t^2 - t - 1$. The quadratic equation $g(t) = 0$ has two different roots $\alpha_1 = \frac{1}{2}(1 + \sqrt{5})$ and $\alpha_2 = \frac{1}{2}(1 - \sqrt{5})$, hence their multiplicities are $l_1 = l_2 = 1$. Thus, $P_i$, $i = 1, 2$, are zero-degree polynomials, that is, constants. Denoting $P_1 = p$ and $P_2 = q$, we get by (4.4.12)

\[ f(n) = p \left( \frac{1 + \sqrt{5}}{2} \right)^{n} + q \left( \frac{1 - \sqrt{5}}{2} \right)^{n}, \quad n \ge 0. \]

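The initial conditions $f(0) = 1$ and $f(1) = 2$ give the system of two linear algebraic equations

\[ \begin{cases} p + q = 1 \\ \frac{1}{2}(1 + \sqrt{5})\,p + \frac{1}{2}(1 - \sqrt{5})\,q = 2 \end{cases} \]

which results in $p = \frac{\sqrt{5} + 3}{2\sqrt{5}}$, $q = \frac{\sqrt{5} - 3}{2\sqrt{5}}$, and finally

\[ f(n) = \frac{\sqrt{5} + 3}{2\sqrt{5}} \left( \frac{1 + \sqrt{5}}{2} \right)^{n} + \frac{\sqrt{5} - 3}{2\sqrt{5}} \left( \frac{1 - \sqrt{5}}{2} \right)^{n}, \quad n \ge 0. \tag{4.4.14} \]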

Terms of the sequence $\{1, 2, 3, 5, 8, 13, \ldots\}$ are called the Fibonacci numbers; they can also be defined by the same difference equation but with the initial conditions $f(0) = f(1) = 1$, leading to the sequence $\{1, 1, 2, 3, 5, 8, 13, \ldots\}$; see EP 4.4.2.
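The solution of Problem 4.4.6 can be cross-checked in three independent ways. The following sketch is an illustration, with all function names chosen here rather than taken from the text: it enumerates the arrangements directly, iterates recurrence (4.4.13), and evaluates the closed form (4.4.14) in floating point, rounding to the nearest integer.

```python
from itertools import product
from math import sqrt


def brute_force(n):
    """Count words of length n over {a, b} that do not contain 'aa'."""
    return sum(1 for w in product("ab", repeat=n) if "aa" not in "".join(w))


def by_recurrence(n):
    """Iterate f(n) = f(n-1) + f(n-2) starting from f(0) = 1, f(1) = 2."""
    prev, cur = 1, 2
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, prev + cur
    return cur


def closed_form(n):
    """Formula (4.4.14); floating point, so only reliable for moderate n."""
    s5 = sqrt(5)
    p = (s5 + 3) / (2 * s5)
    q = (s5 - 3) / (2 * s5)
    return round(p * ((1 + s5) / 2) ** n + q * ((1 - s5) / 2) ** n)


for n in range(8):
    print(n, brute_force(n), by_recurrence(n), closed_form(n))
```

The brute-force count grows like $2^n$ and is meant only as a check for small $n$; the recurrence is the practical way to compute $f(n)$ exactly.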

Problem 4.4.7. How many directed paths are there in the directed graph in Fig. 4.3, which start either at the vertex $A$ or at $B$ and arrive at the vertex $C_n$?

[Figure 4.3. A graph in Problem 4.4.7: a directed graph on the vertices $A$, $B$, $C_1$, $C_2$, $C_3$, \ldots, $C_{n-2}$, $C_{n-1}$, $C_n$.]

Solution. Denote the number of paths from $A$ to $C_n$ by $a_n$ and from $B$ to $C_n$ by $b_n$. It is clear that whether we start at $A$ or at $B$, there are two ways to reach the vertex $C_n$: either through $C_{n-1}$ and then down to $C_n$, or through $C_{n-2}$ and then directly to $C_n$ by the horizontal edge. Thus, we immediately derive the system of two decoupled linear difference equations

\[ \begin{cases} a_n = a_{n-1} + a_{n-2} \\ b_n = b_{n-1} + b_{n-2} \end{cases} \]

with the initial conditions $a_1 = a_2 = 1$ and $b_1 = 1$, $b_2 = 2$. Therefore, $a_n$ are the Fibonacci numbers, $a_n = f(n)$, and $b_n$ are the shifted Fibonacci numbers, $b_n = f(n+1)$, and the total number of paths is (cf. EP 4.4.2)

\[ a_n + b_n = f(n) + f(n+1) = f(n+2) = \frac{1}{\sqrt{5}} \left[ \left( \frac{1 + \sqrt{5}}{2} \right)^{n+2} - \left( \frac{1 - \sqrt{5}}{2} \right)^{n+2} \right]. \]

One of many applications of difference equations is the evaluation of determinants. We recall here a few definitions, also needed in subsequent sections. Consider a permutation$^{8}$ $g = (x_1, x_2, \ldots, x_m)$ of the first $m$ natural numbers. The permutation $g$ is said to have an inversion if there are indices $i < j$ such that $x_i > x_j$. A permutation is called even (odd) if it has an even (odd) number of inversions.

Definition 4.4.15. The determinant of an $n \times n$ matrix

\[ M = \begin{pmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n,1} & a_{n,2} & \cdots & a_{n,n} \end{pmatrix} \]

is the alternating sum of $n!$ products, such that each product contains one element from every row and from each column of the matrix. Here "alternating" means that the product $a_{1,i_1} a_{2,i_2} \cdots a_{n,i_n}$ has the sign $(-1)^{\sigma(i_1, i_2, \ldots, i_n)}$, where $\sigma(i_1, i_2, \ldots, i_n)$ is the number of inversions in the permutation $(i_1, i_2, \ldots, i_n)$. The determinant of the matrix $M$ is denoted by

\[ \det(M) = \begin{vmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n,1} & a_{n,2} & \cdots & a_{n,n} \end{vmatrix}. \]

For instance, if $M = (a)$ is a $1 \times 1$ matrix, its determinant is the unique entry of the matrix $M$, $\det(M) = a$; if $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is a $2 \times 2$ matrix, then $\det(M) = ad - bc$.

The determinant of an $n \times n$ matrix can be computed by using the expansion across the 1st row,

\[ \det(M) = a_{1,1} \det(M_{1,1}) - a_{1,2} \det(M_{1,2}) + a_{1,3} \det(M_{1,3}) - \cdots + (-1)^{n-1} a_{1,n} \det(M_{1,n}) \]

where $M_{1,i}$, $1 \le i \le n$, are $(n-1) \times (n-1)$ matrices obtained from $M$ by deleting its 1st row and $i$th column. The determinant $\det(M_{1,i})$ is called the minor of the matrix element $a_{1,i}$. Quite similarly, one can expand a determinant along any row or column. This is a recursive procedure: we reduce, step by step, the order of the determinant to be computed, until we reach $2 \times 2$ determinants that can be calculated straightforwardly, and then we work backward, computing minors of third, fourth, etc., orders. The procedure described is lengthy and there are different ways to speed up the computations, see, e.g., [37]. In the following problem we show that some determinants can be efficiently computed by making use of difference equations.

$^{8}$ Permutations are studied in more detail in Section 4.5.
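The recursive first-row expansion described above translates almost literally into code. This is a minimal sketch with an assumed function name `det`, written for clarity rather than speed: its running time grows like $n!$, which is exactly why faster methods are preferred in practice. The recursion here bottoms out at $1 \times 1$ matrices rather than $2 \times 2$ ones.

```python
def det(matrix):
    """Determinant by expansion along the first row (matrix is a list of rows)."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0
    for i, entry in enumerate(matrix[0]):
        # M_{1,i}: delete the first row and the column with index i
        minor = [row[:i] + row[i + 1:] for row in matrix[1:]]
        total += (-1) ** i * entry * det(minor)
    return total


print(det([[1, 2], [3, 4]]))                    # ad - bc
print(det([[2, 1, 0], [1, 2, 1], [0, 1, 2]]))   # a small three-diagonal example
```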

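Problem 4.4.8. Compute a three-diagonal determinant (a special case of the Jacobi determinant) of order $n$,

\[ d_n = \begin{vmatrix} 1 & 1 & 0 & 0 & \cdots & 0 \\ 1 & 1 & 1 & 0 & \cdots & 0 \\ 0 & 1 & 1 & 1 & \cdots & 0 \\ \vdots & & & & & \vdots \\ 0 & \cdots & 0 & 0 & 1 & 1 \end{vmatrix}. \]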

Solution. Expanding $d_n$ along the first row (or the first column), we immediately derive a recurrence relation $d_n = d_{n-1} - d_{n-2}$. Its characteristic polynomial $g(t) = t^2 - t + 1$ has simple complex roots $\exp\{\pm \frac{\pi i}{3}\}$, so $d_n = p\,e^{n \pi i/3} + q\,e^{-n \pi i/3}$, where $p$ and $q$ are constants, that is, polynomials of zero degree. The values of the determinants $d_1 = |1| = 1$ and $d_2 = \begin{vmatrix} 1 & 1 \\ 1 & 1 \end{vmatrix} = 0$ are immediate, and we use them to set up a system of linear algebraic equations

\[ \begin{cases} p\,e^{i\pi/3} + q\,e^{-i\pi/3} = 1 \\ p\,e^{i 2\pi/3} + q\,e^{-i 2\pi/3} = 0. \end{cases} \]

Solving this system, we find $p = (e^{i\pi/3} + 1)^{-1}$, $q = (e^{-i\pi/3} + 1)^{-1}$, and finally $d_n = \frac{2}{\sqrt{3}} \sin \frac{(n+1)\pi}{3}$.

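Similarly, a determinant of order $n$,

\[ d_n = \begin{vmatrix} 2 & 1 & 0 & 0 & \cdots & 0 \\ 1 & 2 & 1 & 0 & \cdots & 0 \\ 0 & 1 & 2 & 1 & \cdots & 0 \\ \vdots & & & & & \vdots \\ 0 & \cdots & 0 & 0 & 1 & 2 \end{vmatrix} \]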

satisfies the equation $d_n = 2d_{n-1} - d_{n-2}$. However, in this case the characteristic equation $t^2 - 2t + 1 = (t - 1)^2 = 0$ has a multiple root $\alpha = 1$ of multiplicity $l = 2$; therefore, the general solution of the equation $d_n = 2d_{n-1} - d_{n-2}$ should be sought in the form $d_n = (p \cdot n + q)\alpha^n = p \cdot n + q$. Again, we immediately compute the determinants of orders 1 and 2, $d_1 = |2| = 2$ and $d_2 = \begin{vmatrix} 2 & 1 \\ 1 & 2 \end{vmatrix} = 3$, and use these values to set up an algebraic system of two linear equations, $p + q = 2$ and $2p + q = 3$. From here $p = q = 1$, and we find $d_n = n + 1$.
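Both closed forms can be compared with the recurrences they were derived from. In the sketch below the function name `tridiagonal_det` is illustrative, and the starting value $d_0 = 1$ is a convention assumed here (it is consistent with $d_2 = \text{diag} \cdot d_1 - d_0$ in both examples).

```python
from math import pi, sin, sqrt


def tridiagonal_det(diag, n):
    """d_n for the n x n matrix with `diag` on the main diagonal and 1 on
    the two neighbouring diagonals, via d_n = diag * d_{n-1} - d_{n-2}, n >= 1."""
    d_prev, d_cur = 1, diag        # assumed convention d_0 = 1, and d_1 = diag
    for _ in range(n - 1):
        d_prev, d_cur = d_cur, diag * d_cur - d_prev
    return d_cur


for n in range(1, 9):
    trig = round(2 / sqrt(3) * sin((n + 1) * pi / 3))
    print(n, tridiagonal_det(1, n), trig, tridiagonal_det(2, n), n + 1)
```

For the diagonal of ones the sequence is periodic with period 6, as the sine formula suggests; for the diagonal of twos it grows linearly.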

Problem 4.4.9. In how many parts do $n$ convex closed curves divide the plane if any two curves have two common points, but no three have a common point?

Solution. Denote by $a_n$ the number of parts generated by $n$ curves. If we add one more, $(n+1)$st, curve to the existing $n$ curves, it has two intersection points with each of the initial $n$ curves, totalling $2n$ points. These $2n$ points split the new, $(n+1)$st, curve into $2n$ pieces, and each of these pieces divides exactly one of the initial $a_n$ parts of the plane into two pieces. Therefore, the new curve increases by $2n$ the number of parts into which the plane was decomposed by the $n$ curves, resulting in the difference equation

\[ a_{n+1} - a_n = 2n. \tag{4.4.15} \]

Unlike the preceding ones, this difference equation is nonhomogeneous; therefore, its general solution is the sum of the general solution of the corresponding homogeneous equation and a particular solution of the nonhomogeneous equation (4.4.15). The homogeneous equation $a_{n+1} - a_n = 0$ has the linear characteristic polynomial $g(t) = t - 1$ with one simple root $\alpha = 1$, thus the general solution of the homogeneous equation is $a_n^{\mathrm{hom}} = p\alpha^n = p$, where $p$ is an arbitrary constant.

The right-hand side of (4.4.15) is $2n$, a polynomial in $n$; therefore we look for a particular solution of the nonhomogeneous equation as a polynomial as well. However, $2n = 2n \cdot 1^n$ is a first-degree polynomial in $n$ multiplied by $1^n$, and the base 1 is a simple root of the characteristic polynomial $g(t)$, so that we must raise the degree by one and look for a particular solution as a second-degree polynomial $a_n^{\mathrm{nonhom}} = qn^2 + rn + s$. Inserting this into (4.4.15) we get $q = 1$, $r = -1$; $s$ may be any number, and we choose $s = 0$. Thus, $a_n^{\mathrm{nonhom}} = n^2 - n$ and $a_n = a_n^{\mathrm{nonhom}} + a_n^{\mathrm{hom}} = n^2 - n + p$. Next, $a_1 = 2$ because one convex closed curve divides the plane in two parts, cf. Theorem 2.5.2. Using this initial condition, we find $p = 2$ and $a_n = n^2 - n + 2$.

The next problem deals with a sequence $\{c_n\}_{n=0}^{\infty}$ satisfying a nonlinear difference equation. Such equations, like their differential counterparts, can be explicitly solved only on rare occasions. In this problem we are able to solve a nonlinear equation by using the GF of the sequence sought.

Problem 4.4.10. In how many ways can one compute the product of $n + 1$ quantities $\alpha_1, \alpha_2, \ldots, \alpha_{n+1}$ taken in this fixed order, if the multiplication is non-associative?

Solution. Associativity means that $(ab)c = a(bc)$; thus, the question is, in how many ways can we insert parentheses among $n + 1$ factors $\alpha_1, \alpha_2, \ldots, \alpha_{n+1}$ taken in this order? Denote the number of possible products by $c_n$. For example, if $n = 1$, there is only one way to multiply two elements $\alpha_1, \alpha_2$, namely, $\alpha_1 \cdot \alpha_2$, hence $c_1 = 1$. However, for $n = 2$ there are two possibilities, $(\alpha_1 \cdot \alpha_2) \cdot \alpha_3$ and $\alpha_1 \cdot (\alpha_2 \cdot \alpha_3)$, thus $c_2 = 2$. We leave it to the reader to verify that $c_3 = 5$. It is convenient to define $c_0 = 1$.

To derive the GF

\[ f_c(t) = c_0 + c_1 t + \cdots + c_n t^n + \cdots \]

let us notice that it is always possible to determine the position of the very last multiplication, that is, we can find an element $\alpha_r$ such that in order to compute the entire product, we multiply the leftmost $r$ elements, independently multiply the other rightmost $n + 1 - r$ elements, and only after that multiply the two partial products; in the example with two factors above $r = 1$, and in the example with three factors either $r = 2$ or $r = 1$. Now let us notice that there are $c_{r-1}$ ways to multiply the leftmost $r$ elements and $c_{n-r}$ ways to multiply the other $n + 1 - r$ elements. By the Product Rule, there are $c_{r-1} \cdot c_{n-r}$ ways to calculate this product with the last multiplication after $\alpha_r$. Since $r$ runs from 1 through $n$, we have by the Sum Rule

\[ c_n = c_0 c_{n-1} + c_1 c_{n-2} + \cdots + c_{r-1} c_{n-r} + \cdots + c_{n-1} c_0, \quad n \ge 1. \tag{4.4.16} \]

Comparing (4.4.16) with the convolution $c * c$, we see that the right-hand side of (4.4.16) is the coefficient of $t^{n-1}$ in the power series of $(f_c(t))^2$, that is, the coefficient of $t^n$ in $t\,(f_c(t))^2$. Computing $f_c^2(t)$ and using the condition $c_0 = 1$, we derive a quadratic equation for $f_c(t)$,

\[ t f_c^2(t) - f_c(t) + 1 = 0 \]

which has roots $\frac{1}{2t}(1 \pm \sqrt{1 - 4t})$. Since $f_c(0) = c_0 = 1$, we see that

\[ \frac{1 + \sqrt{1 - 4t}}{2t} \]

is an extraneous root, so that we must set

\[ f_c(t) = \frac{1 - \sqrt{1 - 4t}}{2t}. \]

By making use of the result of EP 4.3.32,

\[ (1 + x)^{1/2} = 1 + \frac{1}{2} x + \cdots + \frac{\frac{1}{2} (\frac{1}{2} - 1)(\frac{1}{2} - 2) \cdots (\frac{1}{2} - n + 1)}{n!}\, x^n + \cdots \]

with $x = -4t$, we expand $(1 - 4t)^{1/2}$ in the power series and get again the Catalan numbers (Remark 1.4.8) $\mathrm{Cat}_n = \frac{1}{n+1} C(2n, n)$.
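The nonlinear recurrence (4.4.16) is easy to iterate directly, which gives an independent check of the closed form. The sketch below is illustrative; the function names are chosen here.

```python
from math import comb


def catalan_by_convolution(count):
    """First `count` values c_0, c_1, ... computed from (4.4.16) with c_0 = 1."""
    c = [1]
    for n in range(1, count):
        c.append(sum(c[r - 1] * c[n - r] for r in range(1, n + 1)))
    return c


def catalan_closed_form(n):
    """Cat_n = C(2n, n) / (n + 1); the division is always exact."""
    return comb(2 * n, n) // (n + 1)


print(catalan_by_convolution(8))
print([catalan_closed_form(n) for n in range(8)])
```

The values $c_1 = 1$, $c_2 = 2$, $c_3 = 5$ counted by hand in the solution appear at the start of both lists.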


Recurrence relations allow us to find certain sums explicitly, in closed form. Let us compute again the sum found in Problems 1.1.4 and 4.3.11.

Problem 4.4.11. Evaluate anew the sum $s(n) = \sum_{k=1}^{n} k^2$, now by making use of a recurrence relation.

Solution. The equation $s(n+1) - s(n) = (n+1)^2$ is obvious. This is a first-order linear nonhomogeneous recurrence equation with the characteristic equation $t - 1 = 0$ and with a quadratic polynomial on the right. Hence, we look for a particular solution of the nonhomogeneous equation as a polynomial of 3rd degree, $s(n) = an^3 + bn^2 + cn + d$. Inserting this polynomial in the equation and equating the coefficients of $n^2$, $n$, and the constant terms, we find $a = 1/3$, $b = 1/2$, $c = 1/6$; $d$ remains undetermined. We set $d = 0$, since after all we have to add the general solution of the corresponding homogeneous equation, which is $p \cdot 1^n = p$, a constant as well. Satisfying the obvious initial condition $s(1) = 1$, we find that $p = 0$ and finally we again derive the formula

\[ s(n) = \frac{1}{3} n^3 + \frac{1}{2} n^2 + \frac{1}{6} n = \frac{1}{6} n(n+1)(2n+1). \]
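The coefficient matching in this solution can be spelled out. Since $s(n+1) - s(n) = a(3n^2 + 3n + 1) + b(2n + 1) + c$, comparing with $n^2 + 2n + 1$ gives $3a = 1$, $3a + 2b = 2$, $a + b + c = 1$. The sketch below (illustrative, using exact rational arithmetic) solves this triangular system and checks the resulting formula against the direct sum.

```python
from fractions import Fraction

# Triangular system obtained by equating coefficients of n^2, n, and 1
a = Fraction(1, 3)            # from 3a = 1
b = (2 - 3 * a) / 2           # from 3a + 2b = 2
c = 1 - a - b                 # from a + b + c = 1


def s_closed(n):
    """Particular solution with d = 0 and homogeneous constant p = 0."""
    return a * n**3 + b * n**2 + c * n


print(a, b, c)
print([int(s_closed(n)) for n in range(1, 7)])
```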

The last problem of this section employs some elementary complex analysis. The GF used in this text have converging Taylor series; therefore their sums are analytic functions within the circle of convergence, and the powerful techniques of complex analysis may be used in applications of this method. We consider one typical example, referring the reader to [13] for a detailed treatment of the topic.

Problem 4.4.12 (Hardy). Compute the sum

\[ H(m) = \sum_{k=0}^{[m/2]} \frac{(-1)^k}{m - k}\, C(m - k, k), \quad m = 1, 2, \ldots \]

where $[m/2]$ is the integer part of $m/2$.

Solution. Applying (1.4.2) we readily verify the identity

\[ \frac{1}{m - k}\, C(m - k, k) = \frac{1}{m} \{ C(m - k, k) + C(m - k - 1, k - 1) \} \]

or

\[ \frac{1}{m - k}\, C(m - k, k) = \frac{1}{m} \{ C(m - k, m - 2k) + C(m - k - 1, m - 2k) \}. \tag{4.4.17} \]

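Consider the contour integral

\[ I(m, n) = \frac{1}{2\pi i} \oint_{|w| = \frac{1}{2}} w^{-n-1} (1 + w)^m \, dw. \]

The integrand has two singular points, $w = 0$ and $w = -1$, but only the first one lies inside the contour $|w| = \frac{1}{2}$. Computing the residue at the $(n+1)$-fold pole $w = 0$, we deduce the formula $I(m, n) = C(m, n)$. From here and (4.4.17), since $(1 + w)^{m-k} + (1 + w)^{m-k-1} = (2 + w)(1 + w)^{m-k-1}$,

\[ \frac{1}{m - k}\, C(m - k, k) = \frac{1}{2\pi i m} \oint_{|w| = \frac{1}{2}} w^{-m+2k-1} (2 + w)(1 + w)^{m-k-1} \, dw. \]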

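Moreover, for $k > [m/2]$ the integrand is a holomorphic function in the disk $|w| < 1$, so the corresponding integrals vanish, and

\[ H(m) = \sum_{k=0}^{[m/2]} \frac{(-1)^k}{2\pi i m} \oint_{|w| = \frac{1}{2}} w^{-m+2k-1} (2 + w)(1 + w)^{m-k-1} \, dw = \sum_{k=0}^{\infty} \frac{(-1)^k}{2\pi i m} \oint_{|w| = \frac{1}{2}} w^{-m+2k-1} (2 + w)(1 + w)^{m-k-1} \, dw. \]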

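The geometric series $\sum_{k=0}^{\infty} \left( \frac{-w^2}{1 + w} \right)^{k}$ converges uniformly for $|w| \le q < 1$; hence, the order of summation and integration can be interchanged, yielding

\[ H(m) = \frac{1}{2\pi i m} \oint_{|w| = \frac{1}{2}} w^{-m-1} (2 + w)(1 + w)^{m-1} \Bigl\{ \sum_{k=0}^{\infty} \bigl( -w^2/(1 + w) \bigr)^{k} \Bigr\} \, dw = \frac{1}{2\pi i m} \oint_{|w| = \frac{1}{2}} \frac{w^{-m-1} (2 + w)(1 + w)^{m}}{1 + w + w^2} \, dw \]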

where we have used the formula for the sum of a geometric series. Instead of calculating the residue at the multiple pole $w = 0$, it is more convenient to change the direction in which we traverse the contour and consider this integral over the boundary of the exterior domain $|w| > \frac{1}{2}$, where the integrand has only two simple poles at $w = -\frac{1}{2} \pm i \frac{\sqrt{3}}{2}$, since the residue at infinity is equal to zero. Computing residues at these points, we get

\[ H(m) = \frac{2(-1)^m}{m} \cos\left( \frac{2}{3} m \pi \right) = \begin{cases} (-1)^m \frac{2}{m} & \text{if } m \equiv 0 \pmod 3 \\ (-1)^{m-1} \frac{1}{m} & \text{if } m \equiv \pm 1 \pmod 3. \end{cases} \]
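The final formula can be tested without any complex analysis by evaluating the original sum in exact rational arithmetic. The following sketch is an illustration only; the function names are assumptions.

```python
from fractions import Fraction
from math import comb


def hardy_sum(m):
    """H(m) = sum_{k=0}^{[m/2]} (-1)^k C(m-k, k) / (m-k), computed exactly."""
    return sum(
        Fraction((-1) ** k * comb(m - k, k), m - k) for k in range(m // 2 + 1)
    )


def hardy_closed_form(m):
    """The case formula obtained above from the residue computation."""
    if m % 3 == 0:
        return Fraction(2 * (-1) ** m, m)
    return Fraction((-1) ** (m - 1), m)


for m in range(1, 10):
    print(m, hardy_sum(m), hardy_closed_form(m))
```

Since $m - k \ge m - [m/2] > 0$ for $m \ge 1$, no denominator in the sum vanishes.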


EP 4.4.1. Prove Proposition 4.4.11.

EP 4.4.2. Solve again equation (4.4.13) for the Fibonacci numbers,

\[ f(n) = f(n-1) + f(n-2), \]

now with the initial conditions $f(0) = f(1) = 1$, and derive the formula

\[ f(n) = \frac{1}{\sqrt{5}} \left[ \left( \frac{1 + \sqrt{5}}{2} \right)^{n} - \left( \frac{1 - \sqrt{5}}{2} \right)^{n} \right]. \]

Calculate the first six Fibonacci numbers by making use of this formula. The number $\frac{1}{2}(1 + \sqrt{5})$ is called the golden ratio.

EP 4.4.3. Prove that the Fibonacci numbers $f(n)$ satisfy the equations

1) $f(m + n) = f(m) f(n) + f(m - 1) f(n - 1)$

2) $f(1) + f(3) + \cdots + f(2n + 1) = f(2n + 2)$

3) $1 + f(2) + f(4) + \cdots + f(2n) = f(2n + 1)$

4) $f(n + 1) = C(n, 0) + C(n - 1, 1) + \cdots + C(n - k, k)$, $k = [n/2]$

5) $f(n + 2) - 1 = f(0) + f(1) + \cdots + f(n)$

6) The latter equation is a special instance of the identity

\[ \sum_{k=0}^{m} C(m + k - 1, k) f(n - k) + \sum_{k=1}^{m} C(n + k - 1, n) f(2m + 1 - 2k) = f(n + 2m). \]

EP 4.4.4. Prove that the sum of any 8 consecutive Fibonacci numbers is not a Fibonacci number.

EP 4.4.5. Let a sequence $\{g(n)\}_{n=0}^{\infty}$ satisfy the difference equation

\[ g(n) = a\,g(n - 1) \]

where $a$ is a constant number and $g(0) = \alpha$. Prove that the EGF of this sequence, $g(t)$, satisfies the functional equation

\[ e^{-at} g(t) - e^{at} g(-t) = 0. \]

EP 4.4.6. Prove that for any solution of the difference equation

\[ a_{n+2} = a_{n+1} + a_n, \]

regardless of the initial data, both absolute values $|a_{n+1} a_{n-1} - a_n^2|$ and $|a_{n+2} a_{n-1} - a_{n+1} a_n|$ do not depend on $n$. For the Fibonacci numbers, each of these values is 1.


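EP 4.4.7. Prove that the $n$th Fibonacci number $f(n)$ is equal to the continuant (a special three-diagonal determinant)

\[ f(n) = \begin{vmatrix} 1 & 1 & 0 & 0 & \cdots & 0 \\ -1 & 1 & 1 & 0 & \cdots & 0 \\ 0 & -1 & 1 & 1 & \cdots & 0 \\ \vdots & & & & & \vdots \\ 0 & \cdots & 0 & 0 & -1 & 1 \end{vmatrix}. \]

EP 4.4.8. Prove that for every natural $n$ the number

\[ a_n = \left( \frac{3 + \sqrt{5}}{2} \right)^{n} + \left( \frac{3 - \sqrt{5}}{2} \right)^{n} - 2 \]

is equal to $m^2$ for some natural $m$ if $n$ is odd, or else it is equal to $5m^2$ for some natural $m$ if $n$ is even.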

EP 4.4.9. A frog sits initially at the point of the number line marked by 1. From any point $k$, $k = 1, 2, \ldots$, it can jump one or two steps to the right, either to $k + 1$ or to $k + 2$. In how many ways can the frog reach the point $n$ from its initial location? Two ways are identical if the frog visits the same points.

EP 4.4.10. How many 12-digit natural numbers are there that contain only the digits 3 and 9 and in which no two digits 3 are adjacent?

EP 4.4.11. Consider all bit strings of length $n$, that is, $n$-permutations

\[ \beta = (\beta_1, \beta_2, \ldots, \beta_n) \]

where each $\beta_i$ is either a 0 or a 1. How many of these permutations are there such that $\beta_i = \beta_{i \,(\mathrm{mod}\, n) + 1} = 0$ for every $i$, $1 \le i \le n$?

EP 4.4.12. Prove that the GF for the Fibonacci numbers $f(n)$ is

\[ f(1) t + f(2) t^2 + \cdots + f(n) t^n + \cdots = \frac{t}{1 - t - t^2}. \]

EP 4.4.13. Prove that the $n$th Fibonacci number $f(n)$ is the closest integer to the power $\frac{1}{\sqrt{5}} \left( \frac{1 + \sqrt{5}}{2} \right)^{n}$.

EP 4.4.14. The Lucas numbers $L_n$ satisfy the same difference equation (4.4.13) as the Fibonacci numbers $f(n)$, $L_n = L_{n-1} + L_{n-2}$, however, with the initial conditions $L_0 = 2$ and $L_1 = 1$. Find an explicit formula for the Lucas numbers. Compare the first six Lucas and Fibonacci numbers.


EP 4.4.15. Prove that $L_n = f(n - 1) + f(n + 1)$.

EP 4.4.16. Try to find a particular solution of the nonhomogeneous difference equation (4.4.15) using a first-degree polynomial $pn + q$ and see for yourself why this approach does not work and we had to use a second-degree polynomial.

EP 4.4.17. Find the general solutions of the difference equations

1) $x_{n+2} + x_{n+1} - 2x_n = 0$

2) $x_{n+2} + 4x_n = 0$

3) $x_{n+3} - 3x_{n+2} + 3x_{n+1} - x_n = 0$

4) $x_{n+2} + 2x_{n+1} - 3x_n = 5 \cdot 2^n$; $x_0 = 0$, $x_1 = 1$

5) $x_{n+2} + 2x_{n+1} - 3x_n = 5$; $x_0 = 0$, $x_1 = 1$.

EP 4.4.18. Prove that if $a_n = b_n + b_{n-1}$, $\forall n \ge 1$, and $a_0 = b_0$, then $f_a(t) = (1 + t) f_b(t)$.

EP 4.4.19. An arithmetic progression can be defined as a solution of the recurrence relation $a_{n+1} - a_n = d$. Find the general term $a_n$ of the arithmetic progression as a function of $d$ and $a_1$ by solving this recurrence relation.

EP 4.4.20. A geometric progression can be defined as a solution of the recurrence relation $a_{n+1} = q \cdot a_n$. Find the general term $a_n$ of the geometric progression by solving this recurrence relation.

EP 4.4.21. Kate invested \$1 200 at a 6% interest rate compounded monthly. If after every month she withdraws \$50, find the balance in her account after one year.

EP 4.4.22. Prove that the Catalan numbers $\mathrm{Cat}_n$ satisfy the recurrence relation $\mathrm{Cat}_{n+1} = \frac{4(4n^2 - 1)}{(n+1)(n+2)} \mathrm{Cat}_{n-1}$, $n \ge 1$.

EP 4.4.23. Find a sequence $a$ such that its GF is

1) $f_a(t) = \sqrt{2 - t}$

2) $f_a(t) = \log(1 + t)$.

EP 4.4.24. The characteristic equation of a linear homogeneous recurrence relation has roots $0, 1, -1, 3$. Find the general solution of this recurrence relation and write the relation explicitly.

EP 4.4.25. The ratio of a geometric progression is $\frac{1 + \sqrt{5}}{2}$. Prove that each term of the sequence, starting from the second one, is the difference of its two neighbors.


EP 4.4.26. Solve the following systems of difference equations

1) \[ \begin{cases} x_{n+1} = y_n - 2 \\ y_{n+1} = x_n + 3 \end{cases} \]

2) \[ \begin{cases} x_n = y_{n-1} - y_{n-2} + 4 \\ y_n = y_{n-1} + x_{n-1}, \end{cases} \quad x_1 = 3,\ x_2 = 5,\ y_1 = 1. \]

EP 4.4.27. Find the GF for the number $a_n$ of whole (nonnegative integer) solutions $\{x, y, z, t\}$ of the equation

\[ x + 2y + 5z + 7t = n. \]

EP 4.4.28. In how many ways can a natural number $n$ be written as a sum of three natural addends?

EP 4.4.29. In how many ways can a natural number be represented as a sum of certain natural addends?

EP 4.4.30. Use formula (1.2.2) in EP 1.2.10 to prove that the number of partitions of $n$ with $k$ terms is the same as the number of partitions of $n$ with each term not exceeding $k$.

EP 4.4.31. Denote by $\mathrm{Comp}(m, n; k)$ the number of compositions of a natural number $m$ with $n$ parts not exceeding $k$. Prove the recurrence relations

\[ \mathrm{Comp}(m, n; k) = \mathrm{Comp}(m - 1, n; k) + \mathrm{Comp}(m - 1, n - 1; k) - \mathrm{Comp}(m - k - 1, n - 1; k) \]

and

\[ \mathrm{Comp}(m, n; k) = \sum_{j=0}^{n} C(n, j)\, \mathrm{Comp}(m - jk, n - k; k - 1). \]

EP 4.4.32. How many ways are there to pay 90 cents using quarters, dimes, and nickels? First set up the GF for this quantity.

EP 4.4.33. Is it possible to change a silver dollar using exactly 50 coins? If yes, in how many ways?

EP 4.4.34. The postage for a letter is 97 cents. In how many ways is it possible to buy stamps if the post office has 42 cent, 20 cent, 3 cent, and 1 cent stamps? Consider two cases, when the order of stamps does or does not matter.

EP 4.4.35. Together, 30 members of the Combi Club have composed 40 problems for the Club contest. Among the members there are freshmen, sophomores, juniors, seniors, and graduate students. Any two students of the same rank composed the same number of problems, while any two students of different ranks composed different numbers of problems. How many students composed one problem?

EP 4.4.36. In how many ways can the number 1 000 000 be represented as a product of three natural numbers, if the order of factors does not count?

EP 4.4.37. Find the number of terms in the expansion of $(x + y + z)^n$ after combining like terms. For example, the expansion $(x + y)^2 = x^2 + 2xy + y^2$ contains three terms.

EP 4.4.38. What is larger, the number of all partitions of a natural number $n$ or the number of partitions of $2n$ in $n$ parts?

EP 4.4.39. In how many parts is a sphere divided by $n$ planes containing the center of the sphere, if no three planes contain the same diameter of the sphere?

EP 4.4.40. At a hot dog eating contest, each of the $n$ participants ate no more than $m$ hot dogs. Denote by $c_i$, $1 \le i \le n$, the number of frankfurters consumed by the $i$th contestant, and by $d_k$, $0 \le d_k \le m$, the number of contestants who consumed at least $k$ hot dogs. Prove that

\[ c_1 + c_2 + \cdots + c_n = d_1 + d_2 + \cdots + d_m. \]

EP 4.4.41. Find closed-form formulas for the following sums

1) $\sum_{k=1}^{n} k\,2^k$

2) $\sum_{k=1}^{n} k^2\,2^k$

3) $\sum_{k=1}^{n} k^2\,2^{-k}$.

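EP 4.4.42. Compute a three-diagonal determinant of order $n$

\[ d_n = \begin{vmatrix} 1 & -1 & 0 & 0 & \cdots & 0 & 0 \\ 1 & 1 & -1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 1 & -1 & 0 & \cdots & 0 \\ \vdots & & & & & & \vdots \\ 0 & 0 & \cdots & 0 & 0 & 1 & -1 \\ 0 & 0 & \cdots & 0 & 0 & 1 & 1 \end{vmatrix}. \]

EP 4.4.43. For every integer $n \ge 0$, compute the determinant of order $k + 1$

\[ d_{n,k} = \begin{vmatrix} C(n, 0) & C(n, 1) & \cdots & C(n, k) \\ C(n + 1, 0) & C(n + 1, 1) & \cdots & C(n + 1, k) \\ \vdots & \vdots & \ddots & \vdots \\ C(n + k, 0) & C(n + k, 1) & \cdots & C(n + k, k) \end{vmatrix}. \]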

Definition 4.4.16. If we drop the sign $(-1)^{\sigma}$ of each term in Definition 4.4.15 of the determinant of a matrix, the resulting number is called the permanent of this matrix.

EP 4.4.44. Compute the permanents of the square matrices leading to the determinants in Problem 4.4.8 and EPs 4.4.7, 4.4.8, 4.4.42, 4.4.43.

EP 4.4.45. Prove that if $a_n = c_n a_{n-1} + d_n$, where $\{c_n\}$ and $\{d_n\}$ are given sequences, $c_0 = 0$, and $a_0 = d_0$, then

\[ a_n = \sum_{k=0}^{n} \Bigl( \prod_{j=k+1}^{n} c_j \Bigr) d_k. \]

Assume that $\prod_{j=n+1}^{n} = 1$.

EP 4.4.46. In how many parts do $n$ lines split a plane, if no two lines are parallel and no three of them intersect at a point?

EP 4.4.47. In how many ways can a convex $n$-gon be split in $n - 2$ triangles by $n - 3$ nonintersecting diagonals? Derive the GF and compare it with that for the Catalan numbers $\mathrm{Cat}_n$.

EP 4.4.48. Recall that the Bell numbers $B_n$ (Definition 1.1.14) count the number of partitions of an $n$-element set. Derive a difference equation for the Bell numbers, $B_n = \sum_{k=1}^{n} C(n - 1, k - 1) B_{n-k}$. This is a linear difference equation of variable order.

EP 4.4.49. Let $\mathrm{Cat}(t)$ be the GF for the Catalan numbers $\mathrm{Cat}_n$ and $a$ be a whole number. Prove that $\frac{(\mathrm{Cat}(t))^a}{\sqrt{1 - 4t}}$ is the GF for the sequence of binomial coefficients $\{C(2k + a, k)\}$ and $(\mathrm{Cat}(t))^a$ for $\{\frac{a}{a + 2k} C(2k + a, k)\}$ [35].

EP 4.4.50. Use EGF to prove the inversion formulas

\[ a_n = \sum_{k} C(n, k) (x + k)^{n-k} b_k, \quad \forall n \]

\[ \Updownarrow \]

\[ b_n = \sum_{k} C(n, k) (-1)^{n-k} (x + n)^{n-k-1} (x + k)\, a_k, \quad \forall n. \]


4.5 Enumeration of Equivalence Classes

This section is devoted to the Pólya–Redfield enumeration theory, which gives a general method of deriving GF in various problems, in particular, in the problems where one has to find the number of equivalence classes. We have already solved such problems when all the equivalence classes had the same cardinality and it was enough to apply Lemma 1.1.25 or some equivalent statements. However, simple examples like Problem 4.5.1 below show that equivalence classes can have different cardinalities.

Moreover, not only can equivalence classes have different cardinalities, but the elements of sets under consideration may have various weights. Pólya's theory applies to such problems as well. The subsequent exposition of the main Pólya theorem [39] follows N.G. De Bruijn [5, pp. 144–184]. Other approaches to this theory can be found, for example, in [41] or in [6, Appendix by J. Riget].


The next problem illustrates some basic ideas of the theory.

Problem 4.5.1. Consider the ten Hindu-Arabic numerals $0, 1, \ldots, 9$. Some of them, like 7, become meaningless symbols after being rotated upside down through $180^\circ$ in their plane. Some others interchange with another digit after this rotation, for example, 6 becomes 9 and vice versa; moreover, the digits 0, 1, 8 do not change at all.

Denote by $D$ the set of the whole numbers with five digits; if a number consists of fewer than five digits, we add zeros in front of it, as in 00236; thus $|D| = 10^5$. Two such five-digit numbers are said to be equivalent if one of them can be transformed into the other by a rotation through the angle of $180^\circ$ or $0^\circ$ without being removed from the plane. How many nonequivalent numbers are there?

Solution. To solve the problem, we consider two mappings $g_i : D \to D$, $i = 0, 1$, where $g_0 : D \to D$ is the identity mapping of $D$, which clearly does not change any number, while the mapping $g_1 : D \to D$ rotates a number upside down if the result is a number in $D$, and leaves the number unchanged if the number cannot be rotated. For example, $g_1(19\,806) = 90\,861$ and $g_1(12\,880) = 12\,880$. The reader can readily verify that the binary relation on the set of all whole five-digit numbers described in this problem is an equivalence relation. Thus, Problem 4.5.1 can be stated as follows.

Consider the following binary relation $\varrho$ on the set $D$, which is easily seen to be an equivalence relation:

Two numbers $d_1, d_2 \in D$ are in the binary relation $\varrho$ if and only if either $d_1 = d_2$, that is, $d_2 = g_0(d_1)$, or $d_2 = g_1(d_1)$. How many equivalence classes exist in $D$ with respect to this equivalence relation?
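Because $|D| = 10^5$ is small, the equivalence classes can simply be enumerated by machine. The sketch below is a brute-force illustration and not the method developed in this section; the names `ROTATED`, `g1`, and `count_classes` are assumptions. It represents each number as a zero-padded string, applies the mapping $g_1$, and counts one canonical representative per class.

```python
# Digits that remain digits after a 180-degree rotation, with their images
ROTATED = {"0": "0", "1": "1", "6": "9", "8": "8", "9": "6"}


def g1(s):
    """Rotate the digit string upside down if every digit allows it;
    otherwise leave it unchanged, as in the definition of g_1."""
    if all(ch in ROTATED for ch in s):
        # the rotation reverses the order of the digits and maps each digit
        return "".join(ROTATED[ch] for ch in reversed(s))
    return s


def count_classes(length=5):
    """Number of equivalence classes of digit strings of the given length."""
    representatives = set()
    for value in range(10 ** length):
        s = str(value).zfill(length)
        # g1 is an involution, so min(s, g1(s)) identifies the class of s
        representatives.add(min(s, g1(s)))
    return len(representatives)


print(g1("19806"), g1("12880"))
print(count_classes())
```

The first line of output reproduces the two examples from the solution; the second is the number of classes for five digits.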


We finish the solution of Problem 4.5.1 after the proof of Lemma 4.5.8.

Hereafter, bijections of a set $D$ onto itself are called substitutions or permutations of the elements of $D$. The discussion in Problem 4.5.1 suggests that two elements $d_1$ and $d_2$ of a given set $D$ should be considered indistinguishable (identical, equivalent) if there exists a substitution $g$ of the elements of $D$ such that $g(d_1) = d_2$. Any set $G$ of substitutions generates the following binary relation $\varrho$ on $D$:

Two elements $d_1, d_2 \in D$ are $\varrho$-equivalent if and only if there exists a substitution $g \in G$ such that $g(d_1) = d_2$.

We want to find conditions that guarantee that this binary relation $\varrho$ is an equivalence relation on $D$, that is, it is reflexive, symmetric, and transitive. It is clear that in order for the binary relation $\varrho$ to be reflexive it is sufficient that $G$ contain the identity substitution. For $\varrho$ to be symmetric, it is enough that along with each substitution $g$, its inverse $g^{-1}$ also belong to $G$. To guarantee the transitivity of the binary relation $\varrho$, it suffices that the set of substitutions $G$ contain the superposition $g_1 \circ g_2$ of any two of its elements $g_1, g_2 \in G$. Thus, to generate an equivalence relation on $D$, it suffices for $G$ to have a special algebraic structure, namely, to be a group of substitutions with the superposition of substitutions as the group operation. We recall the definition.

Definition 4.5.1. A set $X$ with a binary operation $\circ$ defined on the Cartesian product $X \times X$ is called a group if

1) $X$ has a neutral element $e$ such that $e \circ x = x \circ e = x$, $\forall x \in X$

2) the operation $\circ$ is associative, that is, $x \circ (y \circ z) = (x \circ y) \circ z$, $\forall x, y, z \in X$

3) each element $x \in X$ is invertible, that is, for any $x \in X$ there exists an element $x^{-1} \in X$ such that $x \circ x^{-1} = x^{-1} \circ x = e$.

The cardinality of a group is called the order of the group.

Example 4.5.2. For instance, all $n!$ substitutions of an $n$-element set $D$ make a group, called the symmetric group $\mathrm{Sym} = \mathrm{Sym}_n$; it follows from Lemma 1.1.47 that the order of this group is $n!$.

Thus, in this section we consider an $m$-element set $D = \{d_1, d_2, \ldots, d_m\}$ and a group of substitutions $G$ acting on $D$. The group operation is the superposition of substitutions. This group may be the entire symmetric group $\mathrm{Sym}_m$ of all substitutions of $D$, in which case $|G| = m!$, or it may be any subgroup of $\mathrm{Sym}_m$, for example, the trivial group $\{e\}$ of order 1 consisting only of the neutral element (substitution) $e$.
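The conditions above can be tested mechanically for a finite set of substitutions. The following Python sketch is only an illustration under our own assumptions: a substitution of $\{0, 1, \ldots, m-1\}$ is encoded as a tuple whose $i$th entry is the image of $i$, and the names `compose`, `is_group`, and `shift` are hypothetical, not notation of the text. It uses the standard fact that a finite set of substitutions containing $e$ and closed under superposition automatically contains all inverses.

```python
from itertools import product

def compose(g, h):
    """Superposition g∘h of two substitutions: (g∘h)(i) = g[h[i]]."""
    return tuple(g[h[i]] for i in range(len(g)))

def is_group(perms):
    """Check whether a finite set of substitutions is a group under superposition."""
    perms = set(perms)
    if not perms:
        return False
    m = len(next(iter(perms)))
    identity = tuple(range(m))
    if identity not in perms:
        return False
    # Closure under superposition; for a finite set containing e this
    # already forces every inverse to belong to the set.
    return all(compose(g, h) in perms for g, h in product(perms, repeat=2))

e = (0, 1, 2, 3)
shift = (1, 2, 3, 0)          # cyclic shift of four elements (an assumed example)

# Collect all powers of the shift until we return to the identity.
powers = {e}
g = shift
while g != e:
    powers.add(g)
    g = compose(shift, g)

print(is_group({e}))          # the trivial group
print(is_group({shift}))      # no neutral element
print(is_group({e, shift}))   # not closed under superposition
print(is_group(powers))       # the cyclic group of order 4
```

For four elements the sketch reports that the trivial set and the set of all powers of the shift are groups, while the other two sets are not.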

We study certain properties of substitutions. Any substitution $g \in G$ splits $D$ into disjoint subsets called cycles or orbits. Namely, fix an element $d \in D$ and consider the sequence of elements

$$g(d),\quad g(g(d)) = g^2(d),\quad g(g(g(d))) = g^3(d),\ \ldots$$

where we have used the standard notation $g(g^k) = g^{k+1}$, $g^1 = g$. Since the set $D$ is finite, after several steps some element in this sequence must repeat. Suppose that for a given $d \in D$ all $k$ elements $d, g(d), g^2(d), \ldots, g^{k-1}(d)$ are different, but $g^k(d) = d$. Then these $k$ elements

$$D_1 = \{d, g(d), g^2(d), \ldots, g^{k-1}(d)\}$$

are said to form a cycle of length $k$.

If a cycle $D_1 \neq D$, that is, $k < m$, then there exists an element $d_1 \in D \setminus D_1$, which generates another cycle starting with $d_1$ and disjoint from $D_1$, and so on. After a finite number of steps each element of $D$ will get into one and only one cycle. It is possible that each element forms its own one-element cycle; this is the case for the neutral (identity) substitution $e$. As another extreme, it is possible that all elements of $D$ belong to one cycle.

Definition 4.5.3. If a substitution $g$ splits $D$ into $b_1$ cycles of length 1 (one-element cycles), $b_2$ cycles of length 2, $b_3$ cycles of length 3, etc., then $g$ is said to have the cycle type $(b_1, b_2, b_3, \ldots)$.

Since $D$ is finite, all but a finite number of entries of the cycle type sequence are zeros; more specifically, $b_{m+1} = b_{m+2} = \cdots = 0$. Since every element of $D$ belongs to exactly one cycle, it is also clear that $1 \cdot b_1 + 2 b_2 + 3 b_3 + \cdots + m b_m = |D|$. Thus we write cycle type sequences as $m$-element sequences $(b_1, b_2, b_3, \ldots, b_m)$. Consequently, any substitution gives rise to a partition of the integer $|D|$; however, different substitutions can yield the same partition.
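The splitting of $D$ into cycles is an algorithm, and it can be sketched in a few lines of Python. The encoding is our assumption: a substitution of $\{0, 1, \ldots, m-1\}$ is a tuple `g` with `g[i]` the image of `i`, and the function name `cycle_type` is hypothetical.

```python
def cycle_type(g):
    """Return the cycle type (b_1, ..., b_m) of a substitution g of {0, ..., m-1}."""
    m = len(g)
    b = [0] * m
    seen = [False] * m
    for start in range(m):
        if seen[start]:
            continue
        # Follow start, g(start), g^2(start), ... until the cycle closes.
        length, j = 0, start
        while not seen[j]:
            seen[j] = True
            j = g[j]
            length += 1
        b[length - 1] += 1
    return tuple(b)

identity = (0, 1, 2, 3)
one_cycle = (1, 2, 3, 0)      # a single cycle of length 4
two_swaps = (2, 3, 0, 1)      # two cycles of length 2

for g in (identity, one_cycle, two_swaps):
    b = cycle_type(g)
    # The cycle lengths always add up to |D|.
    assert sum((i + 1) * b[i] for i in range(len(g))) == len(g)
    print(g, b)
```

The assertion inside the loop checks that the cycle lengths, counted with their multiplicities $b_i$, add up to $|D|$.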

Substitutions acting on an $m$-element set $D = \{x_1, \ldots, x_m\}$ can be conveniently represented by $2 \times m$ matrices (we use the same symbol for a substitution and for its matrix)

$$g = \begin{pmatrix} x_1 & x_2 & \ldots & x_m \\ x_{j_1} & x_{j_2} & \ldots & x_{j_m} \end{pmatrix}$$

where $x_{j_i} = g(x_i)$. For example, the identity substitution $e$ is given by the matrix

$$g_0 = \begin{pmatrix} x_1 & x_2 & \ldots & x_m \\ x_1 & x_2 & \ldots & x_m \end{pmatrix}$$

clearly exhibiting its structure: every element remains unmoved and makes its own cycle, so the cycle type of $e$ is $(m, 0, 0, \ldots, 0)$. On the other hand, the substitution $g_c$ given by the matrix

$$g_c = \begin{pmatrix} x_1 & x_2 & \ldots & x_m \\ x_2 & x_3 & \ldots & x_1 \end{pmatrix} \tag{4.5.1}$$

consists of a single cycle of length $m$, because $g_c$ moves $x_1$ to $x_2$, then $x_2$ to $x_3$, $\ldots$, $x_m$ to $x_1$; hence it has the cycle type $(0, \ldots, 0, 1)$, where all entries but the 1 at the $m$th place are zeros.


Definition 4.5.4. Let a group of substitutions $G$ act on an $m$-element set $D$ and a substitution $g \in G$ have the cycle type $(b_1, b_2, b_3, \ldots, b_m)$. The monomial

$$p_g(x_1, x_2, \ldots, x_m) = x_1^{b_1} x_2^{b_2} \cdots x_m^{b_m}$$

where $x_1, x_2, \ldots, x_m$ are indeterminates, is called the cycle index of the substitution $g$. The polynomial (the group average of the cycle indices of the substitutions)

$$P_G(x_1, x_2, \ldots, x_m) = \frac{1}{|G|} \sum_{g \in G} p_g(x_1, x_2, \ldots, x_m) = \frac{1}{|G|} \sum_{g \in G} x_1^{b_1} x_2^{b_2} \cdots x_m^{b_m} \tag{4.5.2}$$

is called the cycle index of the group $G$ acting on the set $D$.

Example 4.5.5. The identity (neutral) substitution $e$ has the cycle type $(m, 0, 0, \ldots, 0)$, hence

$$p_e(x_1, x_2, \ldots, x_m) = x_1^m.$$

If $G_e = \{e\}$, $|G_e| = 1$, then

$$P_{G_e}(x_1, x_2, \ldots, x_m) = x_1^m. \tag{4.5.3}$$

Remark 4.5.6. In polynomial (4.5.3) only $x_1$ is an essential variable; all the others are fictitious.

Problem 4.5.2. Consider three sets of substitutions $G_e = \{e\}$, $G_c = \{g_c\}$, and $G_2 = \{e, g_c\}$, where the substitution $g_c$ was defined by matrix (4.5.1). Does any of $G_e$, $G_c$, and $G_2$ make up a group?

Problem 4.5.3. Find the cycle index of the group of rotations of a square (see EP 4.5.10) in the plane, when this group acts on the set $D$ of the vertices of the square.

Solution. Here $|D| = 4$ and $G = \{g_0, g_1, g_2, g_3\}$, where $g_k$ is the rotation of the square through the angle $k\pi/2$, $k = 0, 1, 2, 3$; the identity substitution is $g_e = g_0$. By (4.5.3), the cycle index of the identity rotation $g_0$ is $x_1^4$. Under each of the substitutions $g_1$ and $g_3$, all the vertices make one cycle of length 4, hence their cycle type is $(0, 0, 0, 1)$ and the corresponding monomials are $x_4$. In particular,

$$g_1 = \begin{pmatrix} x_1 & x_2 & x_3 & x_4 \\ x_2 & x_3 & x_4 & x_1 \end{pmatrix} \quad\text{and}\quad g_3 = \begin{pmatrix} x_1 & x_2 & x_3 & x_4 \\ x_4 & x_1 & x_2 & x_3 \end{pmatrix}.$$

This is an example of two different substitutions with the same cycle structure. Under the action of $g_2$ the set of vertices breaks up into two cycles, each consisting of two nonadjacent vertices, hence

$$g_2 = \begin{pmatrix} x_1 & x_2 & x_3 & x_4 \\ x_3 & x_4 & x_1 & x_2 \end{pmatrix}$$

has the type $(0, 2, 0, 0)$ and its index is $x_2^2$. Averaging these four monomials, we have by (4.5.2)

$$P_G(x_1, x_2, x_3, x_4) = \frac{1}{4}\left(x_1^4 + x_2^2 + 2x_4\right). \tag{4.5.4}$$

Problem 4.5.4. Find the cycle index of the group of rotations of a right tetrahedron (see EP 4.5.10) with an equilateral base, when this group acts on the set $D$ of faces of the tetrahedron.

Solution. Here $|D| = 4$ and the group $G$ contains only three substitutions: the identity substitution $e$ and two rotations through the angles $\pm 120^\circ$ about the height of the pyramid, that is, about the axis perpendicular to the base. Each of these two rotations leaves the base in place and cyclically permutes the three lateral faces. Thus,

$$P_G(x_1, x_2, x_3, x_4) = \frac{1}{3}\left(x_1^4 + 2x_1x_3\right). \tag{4.5.5}$$

Problem 4.5.5. A tetrahedron is called regular if all its faces are equilateral triangles. Find the cycle index of the group of rotations of a regular tetrahedron (see EP 4.5.10), when this group acts on the set $D$ of faces (or vertices, which is equivalent) of the tetrahedron.

Solution. Now we can rotate the tetrahedron about the axes perpendicular to each face, thus we have eight rotations through the angles $\pm 120^\circ$ about these axes. However, for a regular tetrahedron there is another kind of rotation: after rotating such a tetrahedron through the angle of $180^\circ$ about the axis connecting the midpoints of two skew edges, the tetrahedron coincides with itself. Since the six edges form three pairs of skew edges, there are three such substitutions. In total,

$$P_G(x_1, x_2, x_3, x_4) = \frac{1}{12}\left(x_1^4 + 8x_1x_3 + 3x_2^2\right). \tag{4.5.6}$$

Problem 4.5.6. Find the cycle index of the group of rotations of a cube (see EP 4.5.10) acting on the set of

1) vertices

2) edges

3) faces of the cube.

Solution. It is readily seen that there are 24 different rotations of a cube, splitting into the following five types.

(A) The identity rotation (the neutral substitution) $e$

(B) Three $180^\circ$ rotations about the lines connecting the centers of opposite faces

(C) Six $\pm 90^\circ$ rotations about the same lines as in (B)

(D) Six $180^\circ$ rotations about the lines connecting the midpoints of opposite edges

(E) Eight $\pm 120^\circ$ rotations about the lines connecting opposite vertices.

1) A cube has 8 vertices, thus in this case $|D| = 8$ and the (A)-type substitution has the cycle type $(8, 0, 0, \ldots)$, (B)-type substitutions have the cycle type $(0, 4, 0, \ldots)$, (C)-type substitutions have the cycle type $(0, 0, 0, 2, 0, \ldots)$, (D)-type substitutions have the same cycle type $(0, 4, 0, \ldots)$ as the (B)-type ones, and (E)-type substitutions have the cycle type $(2, 0, 2, 0, \ldots)$. Therefore,

$$P_G(x_1, x_2, \ldots, x_8) = \frac{1}{24}\left(x_1^8 + 9x_2^4 + 6x_4^2 + 8x_1^2x_3^2\right). \tag{4.5.7}$$

2) A cube has 12 edges, therefore now $|D| = 12$ and the substitutions of the same five kinds have cycle types, respectively, $(12, 0, 0, \ldots)$, $(0, 6, 0, \ldots)$, $(0, 0, 0, 3, 0, \ldots)$, $(2, 5, 0, \ldots)$, and $(0, 0, 4, 0, \ldots)$. Hence,

$$P_G(x_1, x_2, \ldots, x_{12}) = \frac{1}{24}\left(x_1^{12} + 3x_2^6 + 6x_4^3 + 6x_1^2x_2^5 + 8x_3^4\right).$$

3) A cube has 6 faces, so now $|D| = 6$ and we deduce in the same way as before,

$$P_G(x_1, x_2, \ldots, x_6) = \frac{1}{24}\left(x_1^6 + 3x_1^2x_2^2 + 6x_1^2x_4 + 6x_2^3 + 8x_3^2\right). \tag{4.5.8}$$
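The three cycle indices of the cube can be double-checked by a direct computation. The sketch below is an illustration under our own assumptions, not the method of the text: the cube is placed as $[-1, 1]^3$, its rotations are modelled as the 24 signed permutations of the coordinates with determinant $+1$ (a standard description of the rotation group of the cube), and vertices, edges, and faces are represented by corners, edge midpoints, and face centers. All function and variable names are hypothetical.

```python
from collections import Counter
from itertools import permutations, product

def cube_rotations():
    """The 24 rotations of the cube [-1, 1]^3 as signed coordinate permutations."""
    rotations = []
    for perm in permutations(range(3)):
        inversions = sum(perm[i] > perm[j] for i in range(3) for j in range(i + 1, 3))
        for signs in product((1, -1), repeat=3):
            # Keep only matrices with determinant +1 (proper rotations).
            if (-1) ** inversions * signs[0] * signs[1] * signs[2] == 1:
                rotations.append((perm, signs))
    return rotations

def rotate(rotation, point):
    perm, signs = rotation
    return tuple(signs[i] * point[perm[i]] for i in range(3))

def cycle_type(rotation, points):
    """Sorted (cycle length, number of cycles) pairs for the action on points."""
    seen, lengths = set(), Counter()
    for start in points:
        if start in seen:
            continue
        length, p = 0, start
        while p not in seen:
            seen.add(p)
            p = rotate(rotation, p)
            length += 1
        lengths[length] += 1
    return tuple(sorted(lengths.items()))

def cycle_index(points):
    """Count how many of the 24 rotations have each cycle type."""
    return Counter(cycle_type(rot, points) for rot in cube_rotations())

LATTICE = list(product((-1, 0, 1), repeat=3))
VERTICES = [p for p in LATTICE if p.count(0) == 0]  # 8 corners
EDGES = [p for p in LATTICE if p.count(0) == 1]     # 12 edge midpoints
FACES = [p for p in LATTICE if p.count(0) == 2]     # 6 face centers

for name, points in (("vertices", VERTICES), ("edges", EDGES), ("faces", FACES)):
    print(name, dict(cycle_index(points)))
```

Each printed dictionary maps a cycle type, written as pairs (length, multiplicity), to the number of rotations having it; dividing by 24 gives the coefficients of the corresponding cycle index.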

Consider again a finite set $D$ and a group $G$ of substitutions acting on the elements of $D$. This group generates the following equivalence relation on $D$: two elements $d_1, d_2 \in D$ are said to be equivalent if there exists a substitution $g \in G$ such that $g(d_1) = d_2$. As we have already noted, the group axioms of $G$ imply that this binary relation is an equivalence relation. Next we derive an important formula for the number of equivalence classes, which is traditionally called the Burnside lemma. To state it, we need a definition.

Definition 4.5.7. By $\psi(g)$ we denote the number of fixed elements of the substitution $g \in G$, that is, the number of elements $d \in D$ such that $g(d) = d$.

Lemma 4.5.8 (Burnside or Cauchy–Frobenius lemma [8, p. 278]). If a group of substitutions $G$ generates an equivalence relation on a finite set $D$, then the number of the equivalence classes is

$$n = \frac{1}{|G|} \sum_{g \in G} \psi(g). \tag{4.5.9}$$


Remark 4.5.9. Therefore, the number of the equivalence classes is the average over the group $G$ of the numbers of fixed elements of the substitutions $g \in G$.

Proof. We calculate in two ways the cardinality of the set of ordered pairs of substitutions and their fixed elements,

$$X = \{(g, d) \mid g \in G,\ d \in D,\ g(d) = d\}.$$

On the one hand, if a substitution $g \in G$ is fixed, then the number of these ordered pairs is $\psi(g)$; summing up over all $g \in G$, we have

$$|X| = \sum_{g \in G} \psi(g).$$

On the other hand, let $\eta(d)$ be the number of substitutions $g \in G$ such that $g(d) = d$ for a particular element $d$. Thus,

$$|X| = \sum_{d \in D} \eta(d)$$

and we derive the equation

$$\sum_{d \in D} \eta(d) = \sum_{g \in G} \psi(g).$$

For a fixed $d \in D$ let us consider the set $G_d = \{g \in G \mid g(d) = d\}$. This is a subgroup (see EP 4.5.11) of $G$ of order $|G_d| = \eta(d)$.

Let $d_1 \in D$ be equivalent to $d$ in the above sense, hence there exists a substitution $h \in G$ such that $h(d_1) = d$. If $g(d) = d_1$, then $h(g(d)) = h(d_1) = d$ and $h \circ g \in G_d$. Therefore, to each substitution $g \in G$ with the property $g(d) = d_1$ there corresponds a substitution $g_1 = h \circ g \in G_d$. Vice versa, if $g_1 \in G_d$, then $g(d) = d_1$, where $g = h^{-1} \circ g_1$. Hence the number of elements $g$ such that $g(d) = d_1$ is equal to $|G_d|$.

Now, let $K(d)$ denote the equivalence class containing an element $d$. Any substitution $g \in G$ shifts $d$ to an element of the same equivalence class. We have also shown that for every element $d_1$ equivalent to $d$, $d_1 \sim d$, the number of substitutions $g$ such that $g(d) = d_1$ is the same and is equal to $|G_d|$. Thus, $|G| = |K(d)| \cdot |G_d|$, that is,

$$\eta(d) = |G_d| = |G| / |K(d)|.$$

Summing up these equations over all $d' \in K(d)$ yields the equation

$$\sum_{d' \in K(d)} \eta(d') = |G|.$$

To complete the proof of the Burnside lemma, we add up these equations over all $n$ equivalence classes. This gives $\sum_{d \in D} \eta(d) = n|G|$, and since the left-hand side is equal to the sum over $g \in G$ of the numbers of fixed elements, formula (4.5.9) follows.


The Burnside lemma is used in an essential way in the proof of Pólya’s theorem; however, we can immediately apply it to solve Problem 4.5.1.

Solution of Problem 4.5.1 (continued). Since the identity substitution does not move any element of $D$, $\psi(g_0) = 10^5$. To compute $\psi(g_1)$, we notice that $10^5 - 5^5$ numbers contain a digit that cannot be turned over, these digits being 2, 3, 4, 5, 7; by the definition of $g_1$, all these numbers are fixed. Moreover, there are $3 \cdot 5^2$ “symmetric” numbers, which do not change after the rotation. Indeed, to determine such a number, one has to select the middle digit, which is either 0, or 1, or 8, and then to choose the first two digits of the number from the set $\{0, 1, 6, 8, 9\}$; these three digits determine the number completely. Therefore, $\psi(g_1) = 10^5 - 5^5 + 3 \cdot 5^2$. By formula (4.5.9) there are

$$\frac{1}{2}\left(10^5 + 10^5 - 5^5 + 3 \cdot 5^2\right) = 98\,475$$

non-equivalent numbers.
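This count is small enough to be confirmed by brute force. The following Python sketch makes two assumptions of ours: the elements of $D$ are written as five-character strings with leading zeros allowed (which is what $|D| = 10^5$ suggests), and the names `FLIP`, `g1`, `count_by_burnside`, and `count_classes` are hypothetical. It counts the classes twice, once by the Burnside lemma and once by choosing one representative in every class.

```python
# Digits that survive a half-turn, and what they become.
FLIP = {"0": "0", "1": "1", "6": "9", "8": "8", "9": "6"}

def g1(number):
    """Turn a five-digit string upside down, or return it unchanged if impossible."""
    if any(ch not in FLIP for ch in number):
        return number
    return "".join(FLIP[ch] for ch in reversed(number))

D = [f"{i:05d}" for i in range(10 ** 5)]

def count_by_burnside():
    fixed_by_g0 = len(D)                              # the identity fixes everything
    fixed_by_g1 = sum(1 for d in D if g1(d) == d)     # psi(g1)
    return (fixed_by_g0 + fixed_by_g1) // 2

def count_classes():
    # Each class is {d, g1(d)}; its smaller element serves as the representative.
    return len({min(d, g1(d)) for d in D})

print(g1("19806"), g1("12880"))
print(count_by_burnside(), count_classes())
```

Both counts agree with the value 98 475 obtained above.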

Using the Burnside lemma, we can straightforwardly solve more problems on bracelets and similar things.

Problem 4.5.7. A bracelet consists of five beads of the same size and shape, but of three different colors. Two bracelets are considered to be identical (equivalent) if we cannot distinguish them after rotating about the wrist without flipping (not taking them off the wrist). How many different bracelets are there?

Solution. Let $D$ be the set of all possible placements of these five beads in the vertices of a regular pentagon, and let $G = \{g_0, g_1, g_2, g_3, g_4\}$ be the group of rotations of this pentagon; here $g_k$ is the rotation of the pentagon through the angle of $\frac{2}{5}k\pi$ radians, $k = 0, 1, 2, 3, 4$, $g_0 = e$ being the identity rotation. Assuming all five beads to be different physical entities, we identify the elements of $D$ with arrangements with repetition, that is, $|D| = 3^5 = 243$. Thus, $\psi(g_0) = 243$ and $\psi(g_1) = \psi(g_2) = \psi(g_3) = \psi(g_4) = 3$, since a bracelet is a fixed element with respect to a nontrivial rotation only if all beads are of the same color. By (4.5.9), the number of the equivalence classes, that is, the number of different bracelets, is $\frac{1}{5}(243 + 3 \cdot 4) = 51$.

If a bracelet consists of $n$ distinguishable beads, the answer is $\frac{1}{n}(n! + 0 + \cdots + 0) = (n-1)!$. Similarly, $n$ people can be arranged in a dancing circle in $\frac{1}{n}(n! + 0 + \cdots + 0) = (n-1)!$ ways. Here we assume that a bracelet (or a dancing circle) is located in a plane and is rotated in this plane about the axis perpendicular to the plane. However, if we can take the bracelet off the wrist, turn it over, and then put it back on the wrist (imagine all dancers standing upside down), then for $n \geq 3$ there are only $\frac{1}{2}(n-1)!$ distinguishable bracelets.
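The same computation works for any number of beads and colors. In the Python sketch below, which is an illustration and not part of the original solution, the function names are hypothetical, and `count_by_burnside` relies on a standard fact not stated in the text: the rotation of $n$ beads by $j$ positions splits them into $\gcd(j, n)$ cycles, so it fixes exactly $k^{\gcd(j, n)}$ colorings in $k$ colors. The second function counts the classes directly and serves as a check.

```python
from itertools import product
from math import gcd

def count_by_burnside(n, k):
    """Average over the n rotations of the number of colorings each one fixes."""
    # Rotation by j positions has gcd(j, n) cycles (gcd(0, n) = n for the identity),
    # and a fixed coloring must be constant on each cycle.
    return sum(k ** gcd(j, n) for j in range(n)) // n

def count_by_enumeration(n, k):
    """Count the classes directly, one canonical representative per class."""
    classes = set()
    for coloring in product(range(k), repeat=n):
        rotations = [coloring[j:] + coloring[:j] for j in range(n)]
        classes.add(min(rotations))
    return len(classes)

print(count_by_burnside(5, 3), count_by_enumeration(5, 3))
```

For five beads and three colors both functions return 51, in agreement with the solution above; flipping the bracelet is not allowed in this sketch.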

Remark 4.5.10. Compare this problem with Problem 4.2.1.


Problem 4.5.8. Solve Problem 4.5.7 for other numbers of beads and colors, for example, if there are 6 beads and 3 or 4 colors.

To develop a more powerful theory, we introduce another set $R$ and consider the power set $R^D$ (Definition 1.1.31). The group $G$ generates a certain equivalence relation on the power set $R^D$.

Definition 4.5.11. Two mappings $f_1 : D \to R$ and $f_2 : D \to R$ are called equivalent if there exists a substitution $g \in G$ such that $f_1 \circ g = f_2$; the equivalence of mappings is denoted by $f_1 \sim f_2$. Since $G$ is a group, the group axioms imply that this is an equivalence relation.

Problem 4.5.9. Verify that the binary relation described in this definition is an equivalence relation.

Problem 4.5.10. 1) Given six different colors and assuming that all of them must be used, so that every face gets its own color, in how many geometrically distinct ways is it possible to paint the faces of a cube?

Two colorings are called geometrically distinct if it is impossible to transform one coloring into the other by rotating the cube; otherwise they are called geometrically identical.

2) In how many geometrically distinct ways is it possible to paint the faces of a cube in two colors, blue and green?

Solution. 1) Altogether there are $P(6) = 6! = 720$ colorings, but many of them should be identified. We know (Problem 4.5.6) that there are 24 rotations of the cube, and since all six faces have different colors, no rotation except the identity maps a coloring onto itself. Thus any equivalence class in this problem consists of 24 elements, and the number of nonequivalent colorings is $720/24 = 30$.

2) Now the equivalence classes have different cardinalities. First we solve the second part of the problem by a direct enumeration. Later on we solve it by making use of Theorem 4.5.19. Let $D$ be the set of faces, $|D| = 6$, and $R = \{\text{blue}, \text{green}\}$, $|R| = 2$. When we paint a cube, we assign a color to each face of the cube, thus each coloring $c$ can be viewed as a mapping $c : D \to R$, that is, an element of the power set $R^D$. By Theorem 1.1.35, $|R^D| = 2^6 = 64$, but some of these functions are equivalent and must be identified.

Namely, there is a unique coloring in which all faces are blue. If there are five blue faces and one green face, then any of the six faces can be chosen as this unique green face and all of these six colorings are equivalent. In the case of four blue faces and two green ones, the green faces can be opposite to one another, which gives three equivalent colorings, or the two green faces can be adjacent, that is, incident to the same edge, which gives 12 equivalent colorings. If there are three blue and three green faces, then there are eight equivalent colorings in which the three faces of the same color are incident to the same vertex. In addition, there are 12 more equivalent colorings, in which two faces of the same color are parallel (opposite to one another) and the third face of the same color is adjacent to both of them; indeed, there are three ways to pick an axis perpendicular to two parallel faces, and after that there are four ways to place a connecting face, thus, $3 \cdot 4 = 12$. Similarly, one can count colorings containing two blue faces, one blue face, and no blue face. All in all, we have $1 + 6 + 3 + 12 + 8 + 12 + 12 + 3 + 6 + 1 = 64$ colorings splitting into 10 different equivalence classes.

It is useful to endow the elements of $R$ with weights by considering one more set $W$ and a mapping $w : R \to W$. We will have to multiply the weights, and for this purpose we assume that $W$ is a commutative ring as discussed before Proposition 4.3.7.

Definition 4.5.12. The image $w(r)$ of an element $r \in R$ is called the weight of $r$ and the product

$$w(f) = \prod_{d \in D} w(f(d))$$

is called the weight of a function $f \in R^D$.

Lemma 4.5.13. Equivalent functions have the same weight.

Proof. If $f_1 \sim f_2$, then there exists a substitution $g \in G$ such that $f_1 \circ g = f_2$. Therefore,

$$w(f_2) = \prod_{d \in D} w(f_2(d)) = \prod_{d \in D} w(f_1(g(d))) = \prod_{d' \in D} w(f_1(d')) = w(f_1),$$

since the product over $d \in D$ can be replaced by the product over $d' = g(d) \in D$; this holds because the substitution $g$ is bijective.

Hence the following definition is well-posed.

Definition 4.5.14. The weight $W(F)$ of an equivalence class $F \subset R^D$ is the weight $w(f)$ of any mapping $f$ in the equivalence class $F$.

In particular, if $w(r) = 1$ for every $r \in R$, then also $w(f) = 1$ for any function $f \in R^D$, so that $W(F) = 1$ for each equivalence class $F$.

Definition 4.5.15. The set $R$ here is called a reserve. The sum of weights of all elements of the reserve $R$ is called the inventory and is denoted by

$$\mathrm{Inv}(R) = \sum_{r \in R} w(r).$$

Depending on our choice of weights, the inventory gives a more or less detailed description of a reserve. For instance, if a student has three books and the weight of each of them includes its title and the author’s name, the inventory is a formal sum, nothing but a list of three entries, giving a certain description of the student’s books. If she has two books in mathematics and one in physics and we assign weights $M$ and $P$, respectively, to these books, the inventory becomes $2M + P$; from this inventory we see only the subjects but not the titles of the books. If these books have 342, 229, and 400 pages respectively, totalling 971 pages, and we use these numbers as weights, then the inventory becomes 971, giving us only an idea of the total thickness of these books. Next, if we use the prices of the books, say, $99.95, $125, and $129, then the inventory is $353.95, representing only the total price of these books and nothing else.

Lemma 4.5.16. Let $D$ and $R$ be finite sets and $D = D_1 \cup \cdots \cup D_k$ be a partition of $D$. If a subset $S \subset R^D$ consists of all mappings that are constant on each set $D_j$, $1 \le j \le k$, then

$$\mathrm{Inv}(S) = \prod_{j=1}^{k} \Big\{ \sum_{r \in R} (w(r))^{|D_j|} \Big\}. \tag{4.5.10}$$

Proof. Define a function $\psi : D \to \{1, 2, \ldots, k\}$ by the equation $\psi(d) = j$ whenever $d \in D_j \subset D$, that is, $\psi(d)$ is the index of the subset $D_j$ of the partition that contains the element $d$; obviously, $d \in D_{\psi(d)}$. Any mapping $f \in S$ can be represented as a composite function $f = \varphi \circ \psi$, where $\varphi : \{1, 2, \ldots, k\} \to R$. The mapping $\varphi = \varphi_f$ is uniquely defined by $f$ because the latter is constant on all subsets $D_j$, $1 \le j \le k$. Vice versa, given any function $\varphi : \{1, 2, \ldots, k\} \to R$, the superposition $f = \varphi \circ \psi$ is constant on each $D_j$, therefore this function $f = f_\varphi \in S$. Hence, there exists a one-to-one correspondence between $S$ and the power set $R^{\{1, 2, \ldots, k\}}$, which proves the equation $|S| = |R|^k$.

Now expand the right-hand side of (4.5.10),

$$\Big\{ \sum_{r \in R} (w(r))^{|D_1|} \Big\} \cdot \Big\{ \sum_{r \in R} (w(r))^{|D_2|} \Big\} \cdots \Big\{ \sum_{r \in R} (w(r))^{|D_k|} \Big\}.$$

Multiplying these sums out, we derive a sum of products, each containing one addend from every sum. There are $|R|^k = |S|$ such products, a generic one being

$$(w(r_{i_1}))^{|D_1|} \cdot (w(r_{i_2}))^{|D_2|} \cdots (w(r_{i_k}))^{|D_k|}. \tag{4.5.11}$$

Considering product (4.5.11), it is natural to introduce a function

$$\tilde{\varphi} : \{1, 2, \ldots, k\} \to R$$

as follows: $\tilde{\varphi}(j) = r_{i_j}$, $1 \le j \le k$. This shows that product (4.5.11) is equal to the weight $w(\tilde{f})$, where the function $\tilde{f} = \tilde{\varphi} \circ \psi$ and $\psi(d)$ is the index of the subset $D_j$ containing $d$. Indeed,

$$w(\tilde{f}) = \prod_{d \in D} w(\tilde{f}(d)) = \prod_{j=1}^{k} \prod_{d \in D_j} w(\tilde{f}(d)) = \prod_{j=1}^{k} \{ w(r_{i_j}) \}^{|D_j|}$$

because $\tilde{f}$ is constant on every $D_j$, where it takes the value $r_{i_j}$. It is easily seen that different products (4.5.11) lead to different functions $\tilde{\varphi}$, and hence to different functions $\tilde{f}$. Since there are as many products (4.5.11) as functions $\tilde{\varphi}$, we conclude that by multiplying out all terms in (4.5.10) we get the sum of the weights of all functions in $S$. However, this sum is precisely the inventory $\mathrm{Inv}(S)$, which completes the proof of Lemma 4.5.16.

Corollary 4.5.17. If a partition of $D$ contains only one-element sets, then $S$ is the power set $R^D$ and (4.5.10) simplifies to

$$\mathrm{Inv}(R^D) = (\mathrm{Inv}(R))^{|D|}. \tag{4.5.12}$$
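Lemma 4.5.16 and its corollary can be checked numerically on a small example. The Python sketch below is an illustration under our own assumptions: the weights are taken to be ordinary integers, the reserve has three elements with the arbitrarily chosen weights 2, 3, 5, and all names (`inventory_bruteforce`, `inventory_formula`, `blocks`, `weights`) are hypothetical. The first function sums the weights of all mappings that are constant on every block of the partition; the second evaluates the product on the right-hand side of (4.5.10).

```python
from itertools import product
from math import prod

def inventory_bruteforce(blocks, weights):
    """Sum of w(f) over all f: D -> R that are constant on every block."""
    D = [d for block in blocks for d in block]
    R = list(weights)
    total = 0
    for values in product(R, repeat=len(D)):
        f = dict(zip(D, values))
        if all(len({f[d] for d in block}) == 1 for block in blocks):
            total += prod(weights[f[d]] for d in D)
    return total

def inventory_formula(blocks, weights):
    """Product over the blocks D_j of the sums of w(r)^|D_j|."""
    return prod(sum(w ** len(block) for w in weights.values()) for block in blocks)

weights = {"r1": 2, "r2": 3, "r3": 5}       # assumed numeric weights
blocks = [("a", "b"), ("c",)]               # D = {a, b} ∪ {c}
singletons = [("a",), ("b",), ("c",)]       # the partition of the corollary

print(inventory_bruteforce(blocks, weights), inventory_formula(blocks, weights))
print(inventory_bruteforce(singletons, weights), sum(weights.values()) ** 3)
```

In the first line both numbers equal $(2^2 + 3^2 + 5^2)(2 + 3 + 5) = 380$; in the second line both equal $(2 + 3 + 5)^3 = 1000$, the cube of the inventory of the reserve.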

Problem 4.5.11. In how many ways can three people distribute $m$ tokens among themselves, so that the first and the second persons get an equal number of tokens?

Solution. We give two solutions of this problem, one straightforward and elementary, and another based on Lemma 4.5.16.

First solution of Problem 4.5.11. We have to find in how many ways it is possible to split the number $m$ into two whole addends, the first of which is even, since it is the total number of tokens received by the first two persons. Any addend may be 0, because some of these people may get nothing. If $m$ is even, this can be done in $\frac{m}{2} + 1$ ways, and if $m$ is odd, then there are $\frac{m+1}{2}$ ways to split the number.

Second solution of Problem 4.5.11. We use the problem to demonstrate the machinery of applying Lemma 4.5.16. In this particular problem the following solution is obviously longer than the first one; however, it introduces an important technique, which has much broader applications.

Consider two sets, $D = \{p_1, p_2, p_3\}$ and $R = \{0, 1, 2, \ldots, m\}$, and the partition $D = D_1 \cup D_2$, where $D_1 = \{p_1, p_2\}$ and $D_2 = \{p_3\}$. Let $S$ have the same meaning as in Lemma 4.5.16, that is, a function $f \in S$ if and only if the images of $p_1$ and $p_2$ are the same, $f \in S \subset R^D \iff f(p_1) = f(p_2)$. To solve the problem, we have to find the number of functions $f \in S$ satisfying the additional restriction

$$f(p_1) + f(p_2) + f(p_3) = m.$$

Assign the weights $w(i) = x^i$, $i = 0, 1, \ldots, m$, where $x$ is an indeterminate, to the elements of $R$. If $f \in R^D$, then

$$w(f) = \prod_{d \in D} w(f(d)) = w(f(p_1)) \cdot w(f(p_2)) \cdot w(f(p_3)) = x^{f(p_1) + f(p_2) + f(p_3)}.$$


Therefore, the functions we are looking for have the weight $x^m$, and the number of such functions is the coefficient of $x^m$ in $\mathrm{Inv}(S)$. It should be noted that in this case the inventory is the generating function (GF) of the reserve. By (4.5.10) with $k = 2$,

$$\begin{aligned}
\mathrm{Inv}(S) &= (1 + x^2 + x^4 + \cdots + x^{2m})(1 + x + x^2 + \cdots + x^m) \\
&= \frac{1 - x^{2m+2}}{1 - x^2} \cdot \frac{1 - x^{m+1}}{1 - x} = (1 - x^2)^{-1}(1 - x)^{-1} + g(x) \\
&= \frac{1}{4}(1 + x)^{-1} + \frac{1}{2}(1 - x)^{-2} + \frac{1}{4}(1 - x)^{-1} + g(x).
\end{aligned}$$

Here we separated the function $g$, because in the problem we are interested in the coefficient of $x^m$, and the power series of $g$ begins with the term $x^{m+1}$; therefore, $g$ contributes nothing to the coefficient of $x^m$. Using equation (4.3.7), we straightforwardly verify that the latter expression leads to the same answer as the first solution, namely, $\frac{m+1}{2} + \frac{1 + (-1)^m}{4}$.

Remark 4.5.18. We have simultaneously found in this problem the number of compositions consisting of two parts, such that one part is even.
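The agreement of the two solutions can also be confirmed numerically. The Python sketch below is an illustration with hypothetical function names; polynomials are assumed to be stored as lists of coefficients, lowest degree first. It extracts the coefficient of $x^m$ from the product $(1 + x^2 + \cdots + x^{2m})(1 + x + \cdots + x^m)$ and compares it with the closed formula and with a direct enumeration of the distributions.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def ways_by_inventory(m):
    """Coefficient of x^m in (1 + x^2 + ... + x^(2m)) (1 + x + ... + x^m)."""
    even_part = [1 if i % 2 == 0 else 0 for i in range(2 * m + 1)]
    free_part = [1] * (m + 1)
    return poly_mul(even_part, free_part)[m]

def ways_by_formula(m):
    """The closed form (m + 1)/2 + (1 + (-1)^m)/4, computed in integers."""
    return (2 * (m + 1) + 1 + (-1) ** m) // 4

def ways_by_enumeration(m):
    """Count pairs (a, c): the first two persons get a tokens each, the third gets c."""
    return sum(1 for a in range(m + 1) for c in range(m + 1) if 2 * a + c == m)

for m in range(8):
    print(m, ways_by_inventory(m), ways_by_formula(m), ways_by_enumeration(m))
```

For every $m$ the three columns coincide and equal $\frac{m}{2} + 1$ for even $m$ and $\frac{m+1}{2}$ for odd $m$.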

The proof of Lemma 4.5.16 carries over without any change to the case $|R| = \infty$, as long as all occurring series are convergent. In this case (4.5.10) and (4.5.12) state that the corresponding series are equal. This remark often allows us to simplify computations. For instance, in Problem 4.5.11 it is convenient to use $R = \{0, 1, 2, \ldots\}$; then

$$\mathrm{Inv}(S) = (1 + x^2 + x^4 + \cdots)(1 + x + x^2 + \cdots) = \frac{1}{1 - x^2} \cdot \frac{1}{1 - x},$$

and the latter expression certainly has the same coefficient of $x^m$ as the preceding one.

Now we state the main result of this section.

Theorem 4.5.19 (Redfield, Pólya, de Bruijn). Let $D$ and $R$ be finite sets, $|D| < \infty$, $|R| < \infty$. Let a group of permutations $G$ act on $D$ and $P_G(x_1, x_2, \ldots)$ be the cycle index of $G$. Then the inventory of the equivalence classes generated in the power set $R^D$ by the group $G$, that is, the complete list of the weights of all these classes, is

$$\sum_{F} W(F) = P_G\Big( \sum_{r \in R} w(r),\ \sum_{r \in R} (w(r))^2,\ \sum_{r \in R} (w(r))^3,\ \ldots \Big). \tag{4.5.13}$$

Corollary 4.5.20. In particular, if $w(r) = 1$, $\forall r \in R$, then, as was mentioned, $W(F) = 1$ for every equivalence class $F \subset R^D$, and the left-hand side of (4.5.13) is equal to the number of the equivalence classes, which is, therefore,

$$n = P_G(|R|, |R|, \ldots). \tag{4.5.14}$$


Before proving Theorem 4.5.19, we apply it to solve Problem 4.5.10.

Solution of Problem 4.5.10. In this problem $D$ is the set of faces of a cube, $|D| = 6$, $G$ is the group of rotations of a cube, whose cycle index was found in Problem 4.5.6, 3), and $R = \{\text{blue}, \text{green}\}$, $|R| = 2$. Inserting $x_1 = x_2 = \cdots = x_6 = 2$ into formula (4.5.8), we find anew

$$n = \frac{1}{24}\left(2^6 + 3 \cdot 2^4 + 6 \cdot 2^3 + 6 \cdot 2^3 + 8 \cdot 2^2\right) = 10.$$
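The value $n = 10$ can be cross-checked by listing the equivalence classes of colorings directly. The Python sketch below is an illustration under our own assumptions: faces are represented by their centers in the cube $[-1, 1]^3$, rotations are modelled as signed coordinate permutations with determinant $+1$ and stored as permutations of face indices, and all names are hypothetical. The function `by_cycle_index` evaluates $P_G(k, k, \ldots)$ as in (4.5.14), using the fact that a substitution with $c$ cycles contributes $k^c$; the function `by_enumeration` counts the classes of Definition 4.5.11 one by one.

```python
from itertools import permutations, product

FACES = [v for v in product((-1, 0, 1), repeat=3) if v.count(0) == 2]

def rotations():
    """The 24 rotations of the cube as permutations of the indices of FACES."""
    result = []
    for perm in permutations(range(3)):
        parity = sum(perm[i] > perm[j] for i in range(3) for j in range(i + 1, 3))
        for signs in product((1, -1), repeat=3):
            if (-1) ** parity * signs[0] * signs[1] * signs[2] == 1:
                image = [tuple(signs[i] * v[perm[i]] for i in range(3)) for v in FACES]
                result.append(tuple(FACES.index(u) for u in image))
    return result

def count_cycles(g):
    """Number of cycles of a permutation given as a tuple of images."""
    seen, cycles = set(), 0
    for start in range(len(g)):
        if start not in seen:
            cycles += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = g[j]
    return cycles

def by_cycle_index(k):
    """P_G(k, k, ..., k): every cycle of g contributes a factor k."""
    G = rotations()
    return sum(k ** count_cycles(g) for g in G) // len(G)

def by_enumeration(k):
    """Count classes of colorings f: faces -> {0, ..., k-1}; f and f∘g are equivalent."""
    G = rotations()
    classes = set()
    for f in product(range(k), repeat=len(FACES)):
        classes.add(min(tuple(f[g[i]] for i in range(len(FACES))) for g in G))
    return len(classes)

print(by_cycle_index(2), by_enumeration(2))
```

For two colors both functions return 10; the same functions can be called with other values of `k`.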

Proof of Theorem 4.5.19. Let $\omega$ be an arbitrary possible value of the weight and $S_\omega = \{f \in R^D : w(f) = \omega\}$. Lemma 4.5.13 implies that if an equivalence class intersects $S_\omega$, then this entire class is a subset of $S_\omega$. Thus, we can restrict the action of the group $G$ to $S_\omega$; let $n_\omega$ denote the number of equivalence classes belonging to $S_\omega$. Applying Lemma 4.5.8 to $S_\omega$, we conclude

$$n_\omega = \frac{1}{|G|} \sum_{g \in G} \psi_\omega(g) \tag{4.5.15}$$

where $\psi_\omega(g)$ is the number of functions $f \in R^D$ such that $f = f \circ g$ and $w(f) = \omega$. The quantity

$$\omega \cdot n_\omega = \underbrace{\omega + \omega + \cdots + \omega}_{n_\omega \text{ addends}}$$

is the inventory of all equivalence classes having the weight $\omega$. Therefore, if we multiply (4.5.15) by $\omega$ and sum up these equations over all possible values of $\omega$, we get the inventory of all the equivalence classes,

$$\sum_{F} W(F) = \frac{1}{|G|} \sum_{\omega} \sum_{g \in G} \omega\, \psi_\omega(g)$$

where the order of summation can be interchanged since all sums are finite. However, $\sum_{\omega} \omega\, \psi_\omega(g) = \sum_{f \in R^D :\, f = f \circ g} w(f)$, hence

$$\sum_{F} W(F) = \frac{1}{|G|} \sum_{g \in G} \Big( \sum_{f \in R^D :\, f = f \circ g} w(f) \Big).$$

Considering now the definition of the cycle index (4.5.2), we see that to prove the theorem, it remains to prove the following claim.

Proposition 4.5.21. Given a permutation $g$ with the cycle type $(b_1, b_2, \ldots, b_m)$, the expression

$$\sum_{f \in R^D :\, f = f \circ g} w(f)$$

comes out of the monomial

$$x_1^{b_1} \cdot x_2^{b_2} \cdots x_m^{b_m}$$

after replacing the indeterminate $x_1$ with $\sum_{r \in R} w(r)$, then replacing $x_2$ with $\sum_{r \in R} (w(r))^2$, then $x_3$ with $\sum_{r \in R} (w(r))^3$, $\ldots$, $x_k$ with $\sum_{r \in R} (w(r))^k$, and so forth.

Proof. We observe that when $g$ acts on $D$, the latter breaks down into disjoint cycles $D_1, D_2, \ldots, D_k$, and the condition $f = f \circ g$ implies

$$f(d) = f(g(d)) = f(g^2(d)) = \cdots$$

therefore, $f$ is constant on every cycle $D_j$, $1 \le j \le k$.

Vice versa, if $f$ is constant on every cycle of $g$, then $f \circ g = f$, because $g(d)$ always belongs to the same cycle as $d$. Thus, one can apply Lemma 4.5.16, yielding

$$\sum_{f \in R^D :\, f = f \circ g} w(f) = \prod_{j=1}^{k} \sum_{r \in R} (w(r))^{|D_j|}. \tag{4.5.16}$$

Given a substitution $g$ with the cycle type $(b_1, b_2, \ldots, b_m)$, a 1 occurs $b_1$ times among the cardinalities $|D_1|, |D_2|, \ldots, |D_k|$, a 2 appears $b_2$ times, etc. Hence (4.5.16) can be written as

$$\sum_{f \in R^D :\, f = f \circ g} w(f) = \Big\{ \sum_{r \in R} w(r) \Big\}^{b_1} \cdot \Big\{ \sum_{r \in R} (w(r))^2 \Big\}^{b_2} \cdots = x_1^{b_1} \cdot x_2^{b_2} \cdots x_m^{b_m} \Big|_{x_i = \sum_{r \in R} (w(r))^i,\ 1 \le i \le m}$$

which proves Proposition 4.5.21 and, consequently, Theorem 4.5.19.

In the rest of the section we consider various applications of the Pólya–Redfield enumeration theory.

Problem 4.5.12. In how many geometrically distinct ways is it possible to paint the vertices of a square in blue and green colors? Two colorings are said to be geometrically identical if they can be made indistinguishable by rotating the square in the plane about its center.

Solution. Given a vertex, we choose a color for that vertex; therefore, a coloring is a mapping from the set $D = \{v_1, v_2, v_3, v_4\}$ of the vertices of the square to the set $R = \{\text{blue}, \text{green}\}$. The equivalence relation on the set of colorings is generated by the group of rotations of a square acting on the vertices $D$ of the square. The cycle index of this group is given by equation (4.5.4),

$$P_G(x_1, x_2, x_3, x_4) = \frac{1}{4}\left(x_1^4 + x_2^2 + 2x_4\right).$$

Choosing all weights to be 1 and setting in (4.5.14) $x_1 = x_2 = x_4 = |R| = 2$, we get $n = P_G(2, 2, 2, 2) = 6$.

Problem 4.5.13. In how many geometrically distinct ways can one paint the faces of a regular tetrahedron in two colors? Two colorings are said to be geometrically identical (equivalent) if they can be made indistinguishable by rotating the tetrahedron in space about its center.

Solution. The cycle index of the group of rotations is given by equation (4.5.6). Combining it with (4.5.14) and substituting there $x_i = |R| = 2$, we compute $n = \frac{1}{12}(2^4 + 8 \cdot 2^2 + 3 \cdot 2^2) = 5$, that is, there are five different ways to color the regular tetrahedron, which can be easily verified by inspection.

Problem 4.5.14. Prove that for any natural $n$, the number $\frac{1}{24}(n^8 + 17n^4 + 6n^2)$ is an integer.

Solution. Formulas (4.5.7) and (4.5.14) imply that this is the number of geometrically different colorings of the vertices of a cube in $n$ colors.

The next problem was solved by Pólya in his original article [39].

Problem 4.5.15. In how many geometrically distinct ways can we place three blue balls, two green balls, and a pink ball at the vertices of a regular octahedron?

Solution. The group of rotations of a regular octahedron acts on the set $D$ of its vertices, $|D| = 6$. It is clear that the cycle index of this group is the same as that of the group of rotations of a cube acting on the set of the faces of the latter. The cycle index of the latter group of substitutions is given by (4.5.8). Since we have three different colors, we select the reserve $R = \{\text{blue}, \text{green}, \text{pink}\}$.

To distinguish different colors, we introduce weights on $R$ as indeterminates $w(\text{blue}) = b$, $w(\text{green}) = g$ and $w(\text{pink}) = p$, hence

$$\sum_{r \in R} w(r) = b + g + p, \qquad \sum_{r \in R} w^2(r) = b^2 + g^2 + p^2, \qquad \sum_{r \in R} w^3(r) = b^3 + g^3 + p^3,$$

etc. Then by (4.5.13) and (4.5.8), we derive the complete list of all possible colorings

$$\begin{aligned}
\sum_{F} W(F) &= P_G\Big( \sum_{r \in R} w(r),\ \sum_{r \in R} (w(r))^2,\ \sum_{r \in R} (w(r))^3,\ \ldots \Big) \\
&= \frac{1}{24}\big\{ (b + g + p)^6 + 3(b + g + p)^2(b^2 + g^2 + p^2)^2 \\
&\qquad + 6(b + g + p)^2(b^4 + g^4 + p^4) + 6(b^2 + g^2 + p^2)^3 \\
&\qquad + 8(b^3 + g^3 + p^3)^2 \big\}.
\end{aligned}$$

The equivalence classes we are looking for have weight $b^3g^2p$. Multiplying out all factors in the expression for $P_G$, we see that this monomial enters $(b + g + p)^6$ with the coefficient $\frac{6!}{3!\,2!\,1!} = 60$ and $3(b + g + p)^2(b^2 + g^2 + p^2)^2$ with the coefficient $3 \cdot 2 \cdot 2 = 12$, while the other three terms do not contain it. Hence its coefficient in the sum is $\frac{1}{24}(60 + 12) = 3$, so that there are three geometrically different colorings in this problem.
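The answer of Problem 4.5.15 can be confirmed without expanding any polynomial, by listing the placements and grouping them into equivalence classes. The Python sketch below is an illustration under our own assumptions: the octahedron is taken with its vertices at $\pm e_1, \pm e_2, \pm e_3$, its rotations are modelled as signed coordinate permutations with determinant $+1$, and all names are hypothetical.

```python
from itertools import permutations, product

# Vertices of the octahedron: the six points ±e_1, ±e_2, ±e_3.
VERTICES = [v for v in product((-1, 0, 1), repeat=3) if v.count(0) == 2]

def rotations():
    """The 24 rotations of the octahedron as permutations of vertex indices."""
    result = []
    for perm in permutations(range(3)):
        parity = sum(perm[i] > perm[j] for i in range(3) for j in range(i + 1, 3))
        for signs in product((1, -1), repeat=3):
            if (-1) ** parity * signs[0] * signs[1] * signs[2] == 1:
                image = [tuple(signs[i] * v[perm[i]] for i in range(3)) for v in VERTICES]
                result.append(tuple(VERTICES.index(u) for u in image))
    return result

def count_placements(balls):
    """Number of geometrically distinct placements of the given balls at the vertices."""
    G = rotations()
    classes = set()
    for f in set(permutations(balls)):      # distinct assignments of colors to vertices
        classes.add(min(tuple(f[g[i]] for i in range(len(VERTICES))) for g in G))
    return len(classes)

# Three blue (B), two green (G), and one pink (P) ball.
print(count_placements("BBBGGP"))
```

The sketch reports three classes, in agreement with the coefficient of $b^3g^2p$ found above.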

Problem 4.5.16. How many different molecules are there (see Fig. 4.4), which contain a four-valent atom of carbon $C$ in the center with four endings $X$, where $X$ may be either hydrogen $H$, or chlorine $Cl$, or the methyl group $CH_3$, or the ethyl group $C_2H_5$?

[Figure 4.4. Molecules in Problem 4.5.16: an atom $C$ in the center with four endings $X$.]

Solution. To each molecule we put into correspondence a regular tetrahedron whose vertices are labelled by the symbols $X$. We have to calculate the number of equivalence classes in the power set $R^D$, where $D$ is the set of vertices of the tetrahedron, $|D| = 4$, and $R = \{H, Cl, CH_3, C_2H_5\}$, with respect to the group of rotations of the tetrahedron. The cycle index of this group of rotations is given by (4.5.6). Thus, the number of molecules is equal to $P_G(4, 4, 4, 4) = 36$.

If we want to list these molecules with regard to the number of hydrogen atoms $H$ among the endings, it is convenient to choose weights $w(H) = h$ and

$$w(Cl) = w(CH_3) = w(C_2H_5) = 1.$$

Then the sum of the weights is $h + 3$, the sum of the squares of the weights is $h^2 + 1^2 + 1^2 + 1^2 = h^2 + 3$, etc., and we deduce from (4.5.13)

$$P_G(h + 3, h^2 + 3, h^3 + 3, h^4 + 3) = h^4 + 3h^3 + 6h^2 + 11h + 15.$$

The coefficients in the latter tell us that there is a unique molecule with four hydrogen endings, there are three molecules with three hydrogen endings, six molecules with two hydrogen endings, 11 molecules with one hydrogen ending, and 15 molecules with no hydrogen ending at all. We note that $15 = P_G(3, 3, 3, 3)$, since the latter molecules can have only three possible endings.
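The substitution of $h^i + 3$ for $x_i$ in the cycle index (4.5.6) is a routine polynomial computation, which can be sketched as follows. This is an illustration with hypothetical helper names; polynomials in $h$ are assumed to be stored as lists of integer coefficients, lowest degree first.

```python
def poly_add(p, q):
    """Add two polynomials given as coefficient lists (lowest degree first)."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_scale(p, c):
    return [c * a for a in p]

def power_sum(i):
    """The polynomial h^i + 3, the sum of the i-th powers of the weights h, 1, 1, 1."""
    return [3] + [0] * (i - 1) + [1]

x1, x2, x3 = power_sum(1), power_sum(2), power_sum(3)

# 12 * P_G = x1^4 + 8 x1 x3 + 3 x2^2 for the rotations of a regular tetrahedron.
x1_squared = poly_mul(x1, x1)
x1_fourth = poly_mul(x1_squared, x1_squared)
numerator = poly_add(
    poly_add(x1_fourth, poly_scale(poly_mul(x1, x3), 8)),
    poly_scale(poly_mul(x2, x2), 3),
)
coefficients = [c // 12 for c in numerator]

print(coefficients)        # coefficients of h^0, h^1, ..., h^4
print(sum(coefficients))   # total number of molecules
```

The list of coefficients, read from $h^0$ to $h^4$, is 15, 11, 6, 3, 1, and their sum is 36, the total number of molecules.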

In the following problems we apply the techniques of this section to compute once again the number of permutations and combinations.

Problem 4.5.17. Find the number of n-permutations with repetition from an m-element set A.

Solution. Since the permutations were defined as special mappings, we set $D = \{1, 2, \ldots, n\}$ and $R = A$. We do not have to identify any mappings; therefore, we can use the trivial one-element group of substitutions $G = \{g_0\}$, where $g_0$ is the identical substitution, and all weights $w = 1$. Combining (4.5.3) and (4.5.14), we arrive again at (1.3.1),

$$A_{rep}(m, n) = P_G(|R|, |R|, \ldots) = m^n.$$

Problem 4.5.18. Find the number of r-combinations with repetition from elements of n types.

Solution. These combinations can be put in a one-to-one correspondence¹⁴ with mappings $f\colon \{1, 2, \ldots, n\} \to \{0, 1, \ldots, r\}$ such that

$$f(1) + f(2) + \cdots + f(n) = r,$$

where $f(i)$, $1 \le i \le n$, stands for the number of elements of the $i$th type in this combination. Introduce weights on $R = \{0, 1, \ldots, r\}$ as in the second solution of Problem 4.5.11, by $w(i) = x^i$, and set $D = \{1, 2, \ldots, n\}$. The mappings we are looking for are counted by the coefficient of $x^r$ in the inventory $\mathrm{Inv}(R^D)$, hence by means of (4.5.12) we get

$$\mathrm{Inv}(R^D) = \Bigl\{\sum_{i\in R} w(i)\Bigr\}^{|D|} = (1 + x + x^2 + \cdots + x^r)^n,$$

which yields again formula (1.4.6), $C_{rep}(n, r) = \dfrac{(n+r-1)!}{(n-1)!\,r!}$.

¹⁴ This means, in particular, that the combinations with repetition can be defined as such mappings.

Calculations are simpler if we take the infinite reserve $R = \{0, 1, 2, \ldots\}$, leading to

$$\mathrm{Inv}(R^D) = (1 + x + x^2 + \cdots)^n.$$

Similarly, by making use of (4.5.12) one can find the formula for the number of compositions of integers (cf. Section 4.4). The distinction between the latter and the combinations with repetition is that a combination may omit elements of certain types, but a composition cannot contain zero elements. Thus, in the case of compositions we must use the reserve $R = \{1, 2, 3, \ldots\}$.
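The inventory argument can be checked numerically for small parameters. The following sketch expands the power of the truncated series by plain polynomial multiplication and reads off the coefficient of $x^r$. The function names are illustrative, and the closed form $\binom{r-1}{n-1}$ used for compositions of $r$ into $n$ positive parts is the standard one, quoted here as an assumption rather than taken from this section.

```python
from math import comb


def poly_mul(a, b, max_deg):
    """Multiply two coefficient lists, discarding terms above max_deg."""
    out = [0] * (max_deg + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > max_deg:
                break
            out[i + j] += ai * bj
    return out


def inventory_coefficient(n, r, lowest=0):
    """Coefficient of x^r in (x^lowest + x^(lowest+1) + ... + x^r)^n."""
    factor = [0] * lowest + [1] * (r - lowest + 1)
    result = [1]
    for _ in range(n):
        result = poly_mul(result, factor, r)
    return result[r] if len(result) > r else 0


# Combinations with repetition: reserve starts at 0.
print(inventory_coefficient(3, 4), comb(3 + 4 - 1, 4))      # 15 15
# Compositions into positive parts: reserve starts at 1.
print(inventory_coefficient(3, 5, lowest=1), comb(4, 2))    # 6 6
```

Truncating the reserve at $x^r$ is harmless, since higher powers cannot contribute to the coefficient of $x^r$.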

The last problem in this section deals with colorings of binary trees.

Problem 4.5.19. Consider a binary tree with seven vertices (Fig. 4.5), which have to be painted in two colors. Two colorings are called equivalent if one can be derived from the other by rotating either the entire tree through 180° about the vertical symmetry axis or any of its subtrees through 180° about the horizontal symmetry axis. For example, the coloring in Fig. 4.5(a) is equivalent to that in Fig. 4.5(b), but is not equivalent to the one in Fig. 4.5(c). How many nonequivalent colorings are there?

[Figure: three colored binary trees, panels (a), (b), (c), each with vertices labelled v1, v2, v3.]

Figure 4.5. Three binary trees in Problem 4.5.19.

Solution. Let $D$ be the set of the vertices of the tree, $|D| = 7$, and $R$ the set of available colors, $|R| = 2$. To each coloring there corresponds a mapping $f \in R^D$, and this is easily seen to be a one-to-one correspondence. Let $G = \{g_0, g_1, \ldots, g_7\}$ be the group of substitutions acting on $D$, where $g_0$ is the identical rotation, $g_1$ is the rotation about the vertex $v_1$, $g_2$ is the rotation about the vertex $v_2$, $g_3$ is the rotation about the vertex $v_3$, and $g_4 = g_2 \circ g_3$, $g_5 = g_1 \circ g_2$, $g_6 = g_1 \circ g_3$, $g_7 = g_1 \circ g_2 \circ g_3$. It is readily verified that the cycle indices of the permutations $g_0, \ldots, g_7$ are, respectively,

$$x_1^7,\quad x_1x_2^3,\quad x_1^5x_2,\quad x_1^5x_2,\quad x_1^3x_2^2,\quad x_1x_2x_4,\quad x_1x_2x_4,\quad x_1x_2^3,$$

thus the cycle index of the group $G$ is

$$P_G(x_1, x_2, \ldots, x_7) = \frac{1}{8}\bigl(x_1^7 + 2x_1x_2^3 + 2x_1^5x_2 + x_1^3x_2^2 + 2x_1x_2x_4\bigr).$$

By (4.5.14), the number of colorings is $P_G(2, 2, 2, 2, 2, 2, 2) = 42$.
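As a sanity check, the group and the number of colorings can be recomputed by machine. The sketch below assumes a hypothetical vertex numbering (0 is the root $v_1$, 1 and 2 are $v_2$ and $v_3$, 3 and 4 are the leaves under $v_2$, 5 and 6 the leaves under $v_3$); the three generators encode the rotations $g_1$, $g_2$, $g_3$ under that numbering.

```python
from itertools import product


def compose(p, q):
    """Permutation obtained by applying q first, then p."""
    return tuple(p[q[i]] for i in range(len(p)))


def generate_group(generators):
    """Close a set of permutations under composition."""
    identity = tuple(range(len(generators[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def cycle_count(p):
    """Number of cycles of a permutation, fixed points included."""
    seen, count = set(), 0
    for i in range(len(p)):
        if i not in seen:
            count += 1
            j = i
            while j not in seen:
                seen.add(j)
                j = p[j]
    return count


# Assumed numbering: 0 = v1 (root), 1 = v2, 2 = v3, leaves 3,4 under v2, 5,6 under v3.
G1 = (0, 2, 1, 6, 5, 4, 3)   # rotate the whole tree about v1
G2 = (0, 1, 2, 4, 3, 5, 6)   # rotate the subtree at v2
G3 = (0, 1, 2, 3, 4, 6, 5)   # rotate the subtree at v3
group = generate_group([G1, G2, G3])

# Count orbits directly, then via the cycle index evaluated at 2.
orbits = {
    min(tuple(c[g[i]] for i in range(7)) for g in group)
    for c in product(range(2), repeat=7)
}
via_cycle_index = sum(2 ** cycle_count(g) for g in group) // len(group)

print(len(group), len(orbits), via_cycle_index)   # 8 42 42
```

Both counts agree with the value 42 obtained from the cycle index above.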


EP 4.5.1. Verify that the binary relation in Problem 4.5.1 satisfies the axioms of the binary relation.

EP 4.5.2. 1) Extend Problem 4.5.1, computing now the number of nonequivalent 6-digit and 7-digit numbers.

2) What is the sum of all these numbers?

3) How many of them are multiples of 4?

EP 4.5.3. Find the cycle type and cycle index of the substitution

$$g = \begin{pmatrix} x_1 & x_2 & x_3 & x_4 & x_5 & x_6 & x_7 & x_8 & x_9 & x_{10} \\ x_2 & x_3 & x_1 & x_4 & x_6 & x_5 & x_7 & x_8 & x_{10} & x_9 \end{pmatrix}.$$

EP 4.5.4. Find the coefficient of $t^5$ in the expansion of $\prod_{k=1}^{8}(t + a_k)$.

EP 4.5.5. A disk is divided into p equal sectors by p radii, where p is a prime number. In how many different ways is it possible to paint the disk in n colors, if two colors cannot be used on the same sector? Two colorings are to be identified if they can be made indistinguishable by a rotation of the disk in the plane about its center.

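EP 4.5.6. A family of substitutions $F = \{g_0, g_1, g_2\}$, where

$$g_0 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{pmatrix},\quad g_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 4 & 3 \end{pmatrix},\quad g_2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 1 \end{pmatrix},$$

acts on a set $X = \{1, 2, 3, 4\}$. Can we apply Lemma 4.5.8 to the family $F$?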

EP 4.5.7. Prove that the number of substitutions without fixed elements acting on an n-element set is $n!\sum_{k=0}^{n}(-1)^k/k!$. Compare with Problem 4.1.9.

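EP 4.5.8. Find the number of equivalence classes induced on a set $X = \{1, 2, 3, 4\}$ by the group of substitutions $\{g_0, g_1, g_2, g_3\}$, where

$$g_0 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{pmatrix},\quad g_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 4 & 3 \end{pmatrix},\quad g_2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 3 & 4 \end{pmatrix},\quad g_3 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 4 & 3 \end{pmatrix}.$$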


EP 4.5.9. 1) A 2 × 2 square is divided into four 1 × 1 squares. Using 6 colors, in how many geometrically distinct ways is it possible to paint the big square so that neighboring (having a common side) small squares have different colors?

2) Solve the same problem for a 3 × 3 square split into nine 1 × 1 squares.

EP 4.5.10. Verify that rotations of a cube, acting on its vertices or on its sides, or faces, form groups with the superposition of rotations as the group operation. Answer the same question regarding the rotations of a square or a regular tetrahedron. The cycle indices of these groups were found in Problems 4.5.3–4.5.6.

EP 4.5.11. Prove that the set Gd in the proof of Lemma 4.5.8 is a subgroup of order�.d/ of the original group of permutations G.

EP 4.5.12. Draw explicitly all six different colorings of the vertices of a square in two colors; see Problem 4.5.12.

EP 4.5.13. In how many geometrically distinct ways can we paint the 2 × 2 checker-board in two colors? The same question for the 3 × 3 checker-board.

EP 4.5.14. 1) In how many geometrically distinct ways can we paint the faces of a cube in three colors? In no more than six colors?

2) Solve the same problems for edge coloring. Answer the same question if there are four or five colors available.

3) Solve the same problems for a right tetrahedron.

EP 4.5.15. Using the six digits 1, 2, ..., 6, the faces of a die can be marked in geometrically different ways. For instance, 1 and 2 can be on two opposite or on two adjacent faces. How many differently labelled dice are there?

EP 4.5.16. In how many geometrically distinct ways can we paint the edges or faces of a regular tetrahedron in two colors? The same question if there are three colors available.

EP 4.5.17. In how many geometrically distinct ways can 12 friends, 6 girls and 6 boys, ride a carousel with 12 seats, if all boys are considered to be indistinguishable and all girls are indistinguishable as well?

EP 4.5.18. There are 999 students in the Small College and each of them has recently passed four tests with scores 7, 8, 9, or 10. What is the largest possible number of students such that any two of them have different sets of scores and the sum of the four grades is odd? Is an even number?

EP 4.5.19. Represent the substitutions defined in EPs 4.5.6 and 4.5.8 as products of transpositions; which of them are odd and which of them are even?

EP 4.5.20. Compute the cycle index of the symmetric group $\mathrm{Sym}_5$ (see Example 4.5.2) acting on 2-element subsets of the set $\{1, 2, 3, 4, 5\}$.

EP 4.5.21. Compare the parity (odd/even) of a substitution (permutation) with the parity of the number of inversions in the substitution (Definition 4.4.15 and before). Is it the same?

Chapter 5

Existence Theorems in Combinatorics

Three topics considered in this chapter have one essential feature in common: the existence of the combinatorial configurations in question is not obvious at all and must be proved. In Section 5.1 we prove Ramsey's theorem, a far-reaching extension of the Dirichlet or pigeonhole principle. Section 5.2 is devoted to Philip Hall's famous marriage theorem. Its quantitative version, a lower estimate of the number of systems of distinct representatives, is given, as well as a few equivalent statements, in particular Dénes König's and Dilworth's theorems. Section 5.3 gives an introduction to the theory of combinatorial block designs. Finally, in Section 5.4 we consider in more detail the systems of triples, including the proof, due to Hilton [27], of the necessary and sufficient conditions for the existence of Steiner triple systems.

5.1 Ramsey's Theorem

Ramsey life and work · Applications of Ramsey theory · Dirichlet life and work · Schur's biography · Erdős' biography · What is Erdős number? · George and Esther Szekeres summary · Erdős–Szekeres theorem

Problem 5.1.1. There are six pairs of socks of six different colors in a drawer. What is the smallest number of separate socks that must be drawn at random to ensure that the owner gets at least one complete pair?

Solution. Let us consider the worst-case scenario. This case occurs if one gets six socks of six different colors. After that, any seventh sock makes a complete pair of socks. It suffices, therefore, to draw seven socks, and moreover, the number seven cannot be reduced to six.

The problem can be stated in set-theory terms as follows.

A set $A$ is partitioned into six subsets, $A = A_1 \cup \cdots \cup A_6$, with every $|A_i| \ge 2$, $1 \le i \le 6$. What is the smallest cardinality of a subset $B \subset A$ such that the intersection of $B$ with at least one of the subsets $A_i$ contains two or more elements?

It is said that these two or more elements represent $A_i$ in $B$.

The existence of a solution to Problem 5.1.1 is clear. Ramsey's theorem treats significantly more general situations, where even the existence of a solution is far from obvious, and the cardinality of the solution is in most cases still unknown.


To state the theorem, we have to formalize the concept of a collection of elements containing several identical copies of the same element, for such a collection¹ is not a set in the standard set-theory meaning. In particular, we must consider families of subsets containing several copies of the same subset.

Definition 5.1.1. For a set $S$, a mapping

$$U\colon \{1, 2, \ldots, t\} \to 2^S$$

is called a family of subsets of $S$ containing $t$ terms, or just a $t$-family of subsets. The family $U$ is denoted by $U = (S_1, S_2, \ldots, S_t)$, where $S_i = U(i)$, $1 \le i \le t$. The mapping $U$ does not have to be injective, thus some of the sets $S_i$ can coincide with one another.

We again use mappings to distinguish certain objects. Even if two terms, $S_i$ and $S_j$, of a family are equal as sets, we consider them as different terms of the family, because they have different subscripts, that is, different preimages with respect to the mapping $U$.

Example 5.1.2. Let $t = 3$, $S = \{1, 2, 3\}$, $S_1 = S_2 = \{1, 2\}$, and $S_3 = \{2, 3\}$. Then $U = (S_1, S_2, S_3)$ is an example of a 3-family.

Definition 5.1.3. If $S = S_1 \cup S_2 \cup \cdots \cup S_t$, where $S_i \cap S_j = \emptyset$, $i \ne j$, $1 \le i, j \le t$, then the $t$-family $U = (S_1, S_2, \ldots, S_t)$ is called an (improper) ordered partition of the set $S$ in $t$ parts, or a $t$-partition of $S$. It is called improper because $\emptyset \subset S$, so that $U$ can contain empty terms.

Example 5.1.4. Let again $t = 3$, $S = \{1, 2, 3\}$, but now $S_1 = \emptyset$, $S_2 = \{1, 2\}$, and $S_3 = \{3\}$. Then $U = (S_1, S_2, S_3)$ is an example of an improper 3-partition.

Theorem 5.1.5 (Ramsey). Consider a set $X$, natural numbers $p$ and $t$, and any ordered $t$-partition of the set $2^X_p$ of all $p$-subsets of $X$,

$$2^X_p = A_1 \cup A_2 \cup \cdots \cup A_t, \qquad (5.1.1)$$

where the $t$ subsets $A_i$ are the parts of this partition. Then for arbitrary natural numbers $p, q_1, q_2, \ldots, q_t$ such that $1 \le p \le q_i$ for all $1 \le i \le t$, there exists the smallest natural number $R = R_p(q_1, q_2, \ldots, q_t)$ with the following property:

If $|X| \ge R$, then there exists an index $i$, $1 \le i \le t$, and a subset $B \subset X$ such that $|B| = q_i$ and $2^B_p \subset A_i$.

The numbers $R_p(q_1, q_2, \ldots, q_t)$ are called Ramsey numbers.

¹ "Sets" with repeating objects are sometimes called multisets.


It is useful to restate Theorem 5.1.5 in terms of colorings of graphs.

Theorem 5.1.6. For arbitrary natural numbers $p, q_1, q_2, \ldots, q_t$ such that $1 \le p \le q_i$ for $1 \le i \le t$, there exists the smallest natural number $R = R_p(q_1, q_2, \ldots, q_t)$ with the following property.

Let $G$ be a graph of size $q = q_G \ge R$, that is, with $q$ edges. Consider all subgraphs of $G$ of size $p$ and color each of their edges in one of the given $t$ colors. Then for some $i$, $1 \le i \le t$, $G$ has a monochromatic subgraph $G'$ of size $q_i$, that is, all subgraphs of $G'$ of size $p$ have the same color.

Ramsey's theorem gives a precise meaning to the following intuitively clear statement:

For an arbitrary subdivision of a set in a prescribed number of parts, all these parts cannot simultaneously be small if the set is sufficiently large.

First we consider the special case $p = 1$ of Theorem 5.1.5. In this case, the set $2^X_p = 2^X_1$ of the one-element subsets of $X$ can be identified with the set $X$ itself and (5.1.1) can be thought of as a $t$-partition of $X$. Now the theorem claims that for any $t$-partition $X = A_1 \cup \cdots \cup A_t$ of the set $X$ and for any integers $q_1 \ge 1, \ldots, q_t \ge 1$, there is an index $i$, $1 \le i \le t$, and a subset $B \subset X$ such that $|B| = q_i$ and $B \subset A_i$, whenever $|X|$ is large enough. Similarly to Problem 5.1.1, in the case $p = 1$ we have

$$R_1(q_1, q_2, \ldots, q_t) = (q_1 - 1) + \cdots + (q_t - 1) + 1 = q_1 + \cdots + q_t - t + 1. \qquad (5.1.2)$$

In Problem 5.1.1, $t = 6$, $q_1 = \cdots = q_6 = 2$, $p = 1$ and $R_1(2, 2, 2, 2, 2, 2) = 7$.
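Formula (5.1.2) is easy to confirm experimentally for small parameters. In the sketch below only the sizes of the parts matter, so a $t$-partition of an $n$-element set is represented by a tuple of $t$ nonnegative sizes summing to $n$; the function names are illustrative and the search is a naive one intended for tiny inputs only.

```python
from itertools import product


def r1_formula(qs):
    """Right-hand side of (5.1.2): q1 + ... + qt - t + 1."""
    return sum(qs) - len(qs) + 1


def r1_brute_force(qs):
    """Smallest n such that every split of n elements into len(qs) ordered
    parts has some part i with at least qs[i] elements."""
    t = len(qs)
    n = 1
    while True:
        forced = all(
            any(size >= q for size, q in zip(sizes, qs))
            for sizes in product(range(n + 1), repeat=t)
            if sum(sizes) == n
        )
        if forced:
            return n
        n += 1


print(r1_formula((2,) * 6))                          # 7, as in Problem 5.1.1
print(r1_brute_force((2, 2)), r1_formula((2, 2)))    # 3 3
print(r1_brute_force((3, 4)), r1_formula((3, 4)))    # 6 6
```

The extremal split takes exactly $q_i - 1$ elements in the $i$th part, which is why one more element forces some part to reach its threshold.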

If $p = 1$ and $q_i = 2$, $1 \le i \le t$, we derive the following Dirichlet or pigeonhole principle.

Proposition 5.1.7. If $R$ objects (for example, pigeons) are placed in $t$ boxes (cages) and $R \ge t + 1$, then at least one box (cage) must contain two or more of these objects (pigeons). Moreover, if $R$ objects are placed in $t$ boxes, then there is a box containing at least $\left\lfloor \frac{R-1}{t} \right\rfloor + 1$ objects.

Before taking up the proof, we solve a few problems by making use of the Dirichlet principle to show a variety of its applications.

Problem 5.1.2. How many numbers should be chosen from the set $\{1, 2, 3, 4\}$ to ensure that at least one pair of these numbers adds up to 5?

Solution. Among all $6 = C(4, 2)$ 2-element subsets of the given set, only two pairs, $\{1, 4\}$ and $\{2, 3\}$, satisfy the condition. Thus, if we select only two numbers, $a$ and $b$, it may happen that $a \in \{1, 4\}$ and $b \in \{2, 3\}$, therefore, $a + b \ne 5$. However, any third number chosen completes either the pair $\{1, 4\}$ or the pair $\{2, 3\}$. Consequently, it is enough to choose three numbers. In the more formal language of Theorem 5.1.5, we set $t = 2$ (two pairs-cages), $q_1 = q_2 = 2$, and by (5.1.2) we have $R_1(2, 2) = 3$.


Problem 5.1.3. How many numbers are to be chosen from the set $\{1, 2, 3, 4, 5\}$ to ensure that at least one pair of these numbers adds up to 5?

Solution. We still have two favorable pairs $\{1, 4\}$ and $\{2, 3\}$. However, if we select two numbers representing these two pairs, then the third number may happen to be 5. Consequently, now we have to select at least 4 numbers. In the formal language, in this problem $t = 3$ (two pairs and a singleton 5), $q_1 = q_2 = q_3 = 2$, so that by (5.1.2), $R_1(2, 2, 2) = 4$.
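Problems 5.1.2 and 5.1.3 are small enough to confirm by exhaustive search. The sketch below looks for the smallest $k$ such that every $k$-element subset contains a pair with the prescribed sum; `smallest_forcing_size` is an illustrative name, not notation from the text.

```python
from itertools import combinations


def smallest_forcing_size(numbers, target):
    """Smallest k such that every k-subset of `numbers` contains two
    elements adding up to `target`; None if no such k exists."""
    for k in range(2, len(numbers) + 1):
        if all(
            any(a + b == target for a, b in combinations(subset, 2))
            for subset in combinations(numbers, k)
        ):
            return k
    return None


print(smallest_forcing_size(range(1, 5), 5))   # 3, Problem 5.1.2
print(smallest_forcing_size(range(1, 6), 5))   # 4, Problem 5.1.3
```

The extra element 5 acts as a third "cage" that can absorb one choice without completing a pair, which raises the answer from 3 to 4.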

Problem 5.1.4. What is the smallest number of integers that one has to choose from the set $T = \{1, 2, \ldots, 30\}$ in order to ensure that there are three numbers among those chosen whose sum is a multiple of 3?

Solution. The sum of three integers is a multiple of 3 if and only if 3 divides the sum of the remainders after dividing these numbers by 3. Consider three subsets of the set $T$,

$$T_0 = \{3, 6, 9, \ldots, 30\},\quad T_1 = \{1, 4, 7, \ldots, 28\},\quad T_2 = \{2, 5, 8, \ldots, 29\}.$$

Obviously we can select three numbers, say one number in $T_1$ and two in $T_2$, whose sum is not divisible by 3. Moreover, we can choose two integers in $T_0$ and two in $T_1$ (or in $T_2$), and this quadruple also does not satisfy the problem. However, any other number from $T$, combined with these four numbers, solves the problem. Thus, we have to choose at least five numbers.
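Since only remainders modulo 3 matter and each of the classes $T_0$, $T_1$, $T_2$ has ten elements, the answer can be checked on multisets of residues. The following sketch does this under that reduction; the function names are illustrative.

```python
from itertools import combinations, combinations_with_replacement


def has_zero_triple(residues):
    """True if some three of the chosen residues sum to a multiple of 3."""
    return any(sum(triple) % 3 == 0 for triple in combinations(residues, 3))


def every_choice_works(k):
    """True if every multiset of k residues mod 3 contains such a triple.
    Valid for T = {1, ..., 30} as long as k <= 10 (each class has 10 elements)."""
    return all(
        has_zero_triple(choice)
        for choice in combinations_with_replacement(range(3), k)
    )


smallest = next(k for k in range(3, 11) if every_choice_works(k))
print(smallest)                       # 5
print(has_zero_triple((0, 0, 1, 1)))  # False: the quadruple from the solution
```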

Remark 5.1.8. It is instructive to translate this solution to the language of the Dirichlet principle.

Problem 5.1.5. A box has the shape of a cube with the side of one meter. There are 2001 flies in the box. Prove that at least three of them are in a ball of radius $5\sqrt{3}$ cm.

Solution. Divide the box into $10^3 = 1000$ small cubes of side 10 cm. A diagonal of each such cube is $10\sqrt{3}$ cm. Since $2 \cdot 1000 < 2001$, there is at least one small cube $C$ with three flies inside; in the problem we suppose that a fly is a mathematical dimensionless point and exclude surfaces of the small cubes from consideration. The cube $C$ together with its three flies lies completely in the ball of radius $5\sqrt{3}$ cm circumscribed about $C$.

In the following lemma we consider another special case of Theorem 5.1.5 with $p = t = 2$.

Lemma 5.1.9. For any integer $q \ge 2$,

$$R_2(q, 2) = R_2(2, q) = q.$$


Proof. The equation $R_2(q, p) = R_2(p, q)$ is obvious due to symmetry, therefore, it is enough to prove that $R_2(2, q) = q$. This equation says that for any set $X$ with $|X| \ge q$ and for any partition of its 2-element subsets into two groups, $A_1$ and $A_2$, one of which may be empty, either there exists a 2-element subset $A \subset X$, $|A| = 2$, such that $A \in A_1$, or there exists a $q$-element subset $A \subset X$, $|A| = q$, such that $2^A_2 \subset A_2$. Moreover, $q$ is the smallest cardinal number with this property. It is important to keep in mind that we consider ordered partitions of $2^X_2$.

Thus, to proceed with the proof, let $X$ be an arbitrary $q$-element set and $2^X_2 = A_1 \cup A_2$ be any 2-partition of the set of the 2-element subsets of $X$. If $A_1 \ne \emptyset$, which means that $A_1$ contains at least one pair $\{a, b\}$, then we can set $A = \{a, b\}$; this 2-element set $A \subset X$ solves the problem. Otherwise, that is, if $A_1 = \emptyset$, we set $A = X$. That establishes the inequality $R_2(2, q) \le q$.

To prove that the strict inequality $R_2(2, q) < q$ cannot hold, we consider a set $X$ with $|X| \le q - 1$. Then for the ordered 2-partition $2^X_2 = \emptyset \cup A_2$ there is no subset $A \subset X$ with the properties we sought. Indeed, here $A_1 = \emptyset$, hence there is no 2-element subset of $A$ belonging to $A_1$, and also $A$ cannot contain a $q$-element subset since $|X| \le q - 1$. This yields the equation $R_2(2, q) = q$.

We shall prove Theorem 5.1.5 only in the case $p = t = 2$ following [14]. A proof of the general case can be found, for example, in [21, p. 73–74]. Restate the theorem in the case $p = t = 2$.

Theorem 5.1.10 (Erdős–Szekeres). For arbitrary integers $k, l \ge 2$ there exists the smallest number $R = R_2(k, l) = R_2(l, k)$ such that for any set $X$ with $|X| \ge R$ and for any ordered 2-partition $2^X_2 = A_1 \cup A_2$ of all 2-element subsets of $X$ either there exists a subset $T \subset X$ such that $|T| = k$ and $2^T_2 \subset A_1$, or there exists a subset $U \subset X$ such that $|U| = l$ and $2^U_2 \subset A_2$.

Proof. We will prove the theorem by mathematical induction on both $k$ and $l$, using Lemma 5.1.9 as the basis of induction. Let $k, l > 2$. By the inductive assumption, there exist numbers $R_2(k-1, l)$ and $R_2(k, l-1)$ defined in the statement of Theorem 5.1.10. We will prove that there exists a number $R_2(k, l)$ with the required property and this number satisfies the inequality

$$R_2(k, l) \le R_2(k-1, l) + R_2(k, l-1).$$

Denote

$$p = R_2(k-1, l) + R_2(k, l-1)$$

and consider a set $X$ with $|X| \ge p$ and any partition $2^X_2 = A_1 \cup A_2$. Choose an arbitrary $x \in X$ and introduce two sets, $A = \{y \in X \mid \{x, y\} \in A_1\}$ and $B = \{y \in X \mid \{x, y\} \in A_2\}$.

Since $|A| + |B| = |X| - 1 \ge p - 1$, either $|A| \ge R_2(k-1, l)$ or $|B| \ge R_2(k, l-1)$, otherwise we would have $|A| + |B| \le p - 2$. These two cases are symmetric and it is sufficient to consider only one of them; suppose that $|A| \ge R_2(k-1, l)$. Assuming this inequality, we construct a 2-partition $2^A_2 = \hat{A}_1 \cup \hat{A}_2$, where $\hat{A}_i = A_i \cap 2^A_2$, $i = 1, 2$. By the inductive assumption, either there exists $\hat{T} \in 2^A_{k-1}$ such that $2^{\hat{T}}_2 \subset \hat{A}_1$, or else there exists $\hat{U} \in 2^A_l$ such that $2^{\hat{U}}_2 \subset \hat{A}_2$. However, $A \subset X \setminus \{x\}$ and $\hat{A}_i \subset A_i$, $i = 1, 2$. In the first case, since the set $\hat{T}$ exists, we let $T = \hat{T} \cup \{x\} \in 2^X_k$; the pairs inside $\hat{T}$ lie in $\hat{A}_1$ and each pair $\{x, y\}$ with $y \in \hat{T} \subset A$ lies in $A_1$ by the definition of $A$, therefore $2^T_2 \subset \hat{A}_1 \cup A_1 = A_1$. In the second case, when a nonempty set $\hat{U}$ exists, we set $U = \hat{U}$, and it is immediately clear that $U \in 2^X_l$ and $2^U_2 \subset A_2$.

Corollary 5.1.11. The numbers $R_2(k, l)$ are finite for all $k, l \ge 2$.
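The inequality established in the proof, combined with the base values of Lemma 5.1.9, gives an explicit finite upper bound for every $R_2(k, l)$. The sketch below merely iterates that recurrence; it computes upper bounds, not the Ramsey numbers themselves, and the name `ramsey_upper_bound` is illustrative. By Pascal's rule the resulting table coincides with the binomial coefficients $\binom{k+l-2}{k-1}$.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def ramsey_upper_bound(k, l):
    """Upper bound for R_2(k, l) from R_2(2, q) = R_2(q, 2) = q and
    R_2(k, l) <= R_2(k-1, l) + R_2(k, l-1); requires k, l >= 2."""
    if k == 2:
        return l
    if l == 2:
        return k
    return ramsey_upper_bound(k - 1, l) + ramsey_upper_bound(k, l - 1)


for k, l in [(3, 3), (3, 4), (4, 4)]:
    print(k, l, ramsey_upper_bound(k, l), comb(k + l - 2, k - 1))
# 3 3 6 6
# 3 4 10 10
# 4 4 20 20
```

The bound is not always sharp: EP 5.1.18 and EP 5.1.21 state that $R_2(3, 4) = 9$ and $R_2(4, 4) = 18$, while the recurrence alone gives 10 and 20.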

Corollary 5.1.12.

$$R_2(k, l) \le \binom{k+l-2}{k-1}.$$

Consider two applications of Ramsey's theorem.

Theorem 5.1.13. Let $k, l$ be two arbitrary natural numbers and $x_1, \ldots, x_n$ be any sequence of distinct real numbers. There exists a number $R = R(k, l)$ such that if $n \ge R$ then in the sequence $x_1, \ldots, x_n$ there is either an increasing subsequence of length $k$, or a decreasing subsequence of length $l$.

Proof. Introduce the set of indices $X = \{1, 2, \ldots, n\}$ and split all pairs $(i, j)$ of indices from $X$ with $i < j$ as $2^X_2 = A_{inc} \cup A_{dec}$, where $(i, j) \in A_{inc}$ if $x_i < x_j$ and $(i, j) \in A_{dec}$ if $x_i > x_j$. It is clear now that the statement follows from Theorem 5.1.5.
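For small $k$ and $l$ the statement of Theorem 5.1.13 can be tested exhaustively, because only the relative order of the distinct numbers matters and it suffices to run over permutations. The sketch below is such a test; it also illustrates the classical sharp value $(k-1)(l-1)+1$, which is a known result quoted here as an assumption and is not derived in the text.

```python
from itertools import permutations


def longest_monotone(seq, increasing=True):
    """Length of the longest strictly increasing (or decreasing) subsequence."""
    best = [1] * len(seq)
    for j in range(len(seq)):
        for i in range(j):
            ordered = seq[i] < seq[j] if increasing else seq[i] > seq[j]
            if ordered:
                best[j] = max(best[j], best[i] + 1)
    return max(best, default=0)


def forced(n, k, l):
    """True if every sequence of n distinct numbers has an increasing
    subsequence of length k or a decreasing subsequence of length l."""
    return all(
        longest_monotone(p, True) >= k or longest_monotone(p, False) >= l
        for p in permutations(range(n))
    )


print(forced(4, 3, 3), forced(5, 3, 3))   # False True
print(forced(6, 3, 4), forced(7, 3, 4))   # False True
```

For instance, the sequence 1, 0, 3, 2 of length 4 has no monotone subsequence of length 3.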

The next statement can be proved quite similarly.

Corollary 5.1.14 (Schur). For every positive integer $k$ there is a (large enough) natural number $n = n(k)$ such that for any partition of the set $\{1, 2, \ldots, n\}$ in $k$ subsets at least one of these subsets contains three numbers $x, y, z$ such that $z = x + y$.
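For two subsets the threshold in Schur's statement is small enough to find by exhaustive search. The sketch below assumes that $x = y$ is allowed in the equation $z = x + y$ (the corollary does not say whether the three numbers must be distinct); under that reading the smallest suitable $n$ for $k = 2$ is 5, in agreement with the value $n(2) = 5$ quoted in EP 5.1.41. Function names are illustrative.

```python
from itertools import product


def has_mono_sum(colors):
    """colors[i] is the class of the number i + 1. True if some class
    contains x, y, z with x + y = z (x = y is allowed by assumption)."""
    n = len(colors)
    for x in range(1, n + 1):
        for y in range(x, n - x + 1):
            if colors[x - 1] == colors[y - 1] == colors[x + y - 1]:
                return True
    return False


def schur_forced(n, k):
    """True if every partition of {1, ..., n} into k classes has a class
    containing a solution of x + y = z."""
    return all(has_mono_sum(c) for c in product(range(k), repeat=n))


print(schur_forced(4, 2), schur_forced(5, 2))   # False True
print(has_mono_sum((0, 1, 1, 0)))               # False: classes {1, 4} and {2, 3}
```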

Problem 5.1.6. Assuming that any two people either are familiar with one another or are not, prove that among any six people either there are three who are familiar with one another or there are three who do not know each other.

Proof. Let $X$ be any 6-element set. Consider a 2-partition $2^X_2 = A_1 \cup A_2$, assigning to $A_1$ all pairs of people familiar with one another and to $A_2$ all remaining pairs. If $T$ is a set of pairwise familiar people, then $2^T_2 \subset A_1$; on the other hand, if $U$ is a set of pairwise unfamiliar people, then $2^U_2 \subset A_2$. We have to prove that we can find either a set $T$ such that $|T| = 3$, or a set $U$ such that $|U| = 3$.

Hence, to solve the problem it is sufficient to prove that $R_2(3, 3) \le 6$. We actually prove the stronger statement $R_2(3, 3) = 6$, which in particular solves Problem 5.1.6.


Lemma 5.1.15. $R_2(3, 3) = 6$.

Proof. The following well-known proof uses the graph-theory language. Consider a complete graph $K_6$ modelling this party of six. If two people are familiar with one another, then the edge connecting the vertices corresponding to these two people is marked by Y; all other edges are marked by N. In the example in Fig. 5.1 some labels Y and N that are irrelevant for the proof are omitted.

[Figure: the complete graph $K_6$ on the vertices A, B, C, D, E, F, with some edges labelled Y or N.]

Figure 5.1. Proof of Lemma 5.1.15.

Consider any vertex of the graph, say, the vertex F in Fig. 5.1. Among the five edges incident to F, at least three edges must have the same label; in Fig. 5.1 three edges are labelled by Y. The three end-vertices of these three edges, which are different from F, make a triangle; in Fig. 5.1 this is $\triangle ABC$. If all three sides of this triangle bear the same label, then these vertices form a triple of elements we look for. If these sides have different labels, then there exists a side $s$ of this triangle with the same label as that marking the three initial edges (incident to F); the side $s$ and the two edges connecting its end-vertices with the initial vertex form the triangle we seek. In Fig. 5.1, the edge AC is labelled by Y, as well as AF and CF, hence, the triangle $\triangle ACF$ has the same marks on all three of its sides.

We have proved that $R_2(3, 3) \le 6$. The following example shows that the right-hand side 6 in this inequality cannot be decreased to 5, that is, the conclusion cannot be claimed for all 5-element sets. Indeed, let $X = \{a, b, c, d, e\}$, thus,

$$2^X_2 = \{\{a,b\}, \{a,c\}, \{a,d\}, \{a,e\}, \{b,c\}, \{b,d\}, \{b,e\}, \{c,d\}, \{c,e\}, \{d,e\}\}.$$

Consider two sets of pairs,

$$A_1 = \{\{a,c\}, \{a,e\}, \{b,d\}, \{b,e\}, \{c,d\}\}$$

and

$$A_2 = \{\{a,b\}, \{a,d\}, \{b,c\}, \{c,e\}, \{d,e\}\}.$$

It is obvious that there is no 3-element set $Y \subset X$ such that either $2^Y_2 \subset A_1$ or $2^Y_2 \subset A_2$.
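Both halves of Lemma 5.1.15 are small enough to verify by machine. The sketch below checks all $2^{15}$ two-colorings of the edges of $K_6$ and then the specific partition $A_1$, $A_2$ of the pairs of $\{a, b, c, d, e\}$ given above; the vertices are renamed 0, 1, ..., and the function names are illustrative.

```python
from itertools import combinations, product


def has_mono_triangle(n, color):
    """color maps each pair (i, j), i < j, of vertices 0..n-1 to 0 or 1."""
    return any(
        color[(a, b)] == color[(a, c)] == color[(b, c)]
        for a, b, c in combinations(range(n), 3)
    )


def every_coloring_has_mono_triangle(n):
    """Exhaustive check over all two-colorings of the edges of K_n."""
    edges = list(combinations(range(n), 2))
    return all(
        has_mono_triangle(n, dict(zip(edges, bits)))
        for bits in product((0, 1), repeat=len(edges))
    )


# The partition from the text, with a, b, c, d, e renamed 0, 1, 2, 3, 4.
A1 = {(0, 2), (0, 4), (1, 3), (1, 4), (2, 3)}
text_coloring = {e: (0 if e in A1 else 1) for e in combinations(range(5), 2)}

print(every_coloring_has_mono_triangle(6))    # True:  R_2(3, 3) <= 6
print(has_mono_triangle(5, text_coloring))    # False: 5 vertices are not enough
```

In the example both $A_1$ and $A_2$ form 5-cycles on the five vertices, which is why neither contains a triangle.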

Problem 5.1.7. Where in the proof was the pigeonhole principle with $t = 2$ used?

Remark 5.1.16. Lemma 5.1.15 can be stated as follows: If edges of the complete graph $K_6$ are colored in two colors, then the graph contains a triangle with edges of the same color. EP 5.1.19 can be restated similarly if one considers a three-coloring of the edges of $K_{17}$.

Problem 5.1.8. Is it true or false that among any six natural numbers either there are three pairwise mutually prime or there are three numbers whose common divisor is greater than one?

Problem 5.1.9. (Cf. EP 1.1.28.) Consider a simple graph of order $k \ge 2$. Show that this graph has at least two vertices of the same degree.

Solution. If the graph has two isolated vertices, they have the same (zero) degree. Otherwise, let $X$ be the set of non-isolated vertices and $|X| = t + 1$, $t \ge 0$. Simple graphs have neither parallel edges nor loops, thus, a vertex in $X$ can have any degree from 1 through $t$; we denote by $A_i$ the subset of vertices of degree $i$, $1 \le i \le t$. Since $|X| > t$, by (5.1.2) with $q_1 = \cdots = q_t = 2$ we have $R_1(2, \ldots, 2) = t + 1$. Therefore, there exists an $i$ such that $1 \le i \le t$ and $|A_i| \ge 2$.

Problem 5.1.10. Consider the complete graph $K_n$ and an arbitrary coloring of its edges in two colors A and B. Show that for any natural numbers $p$ and $q$ there is a natural number $n_0(p, q)$ such that for any $n \ge n_0(p, q)$ the graph either contains an A-colored subgraph of order $p$ or a B-colored subgraph of order $q$.

Solution. It suffices to apply Theorem 5.1.6 to the set of vertices of the graph, decomposing all pairs of vertices in two subsets depending upon the color of the edge connecting these two vertices.

The following, almost obvious, statement is equivalent to the Dirichlet principle.

Problem 5.1.11. Consider two finite sets $X$ and $Y$ and a function $f\colon X \to Y$. If for any $y \in Y$ its preimage $f^{-1}(\{y\})$ contains at most $k$ elements, then $|X| \le k|Y|$.



EP 5.1.1. A family has eight siblings. Prove that at least two of them were born on the same day of the week.

EP 5.1.2. There are 12 red, 10 blue, 10 green, and 8 yellow pencils. What is the smallest number of pencils that we must pick at random if we need

1) at least 6 pencils of the same color?

2) at least 6 green pencils?

3) at least 1 pencil of each color?

4) at least 4 pencils of the same color?

EP 5.1.3. 1) In a class of 37 students, are there 4 of them who celebrate their birthday in the same month?

2) Answer the same question for a class with 36 students.

EP 5.1.4. A student claims that at least 4 people in her class were born in the same month. What is the smallest size of the class?

EP 5.1.5. Among 11 given lines in a plane no two are parallel. Prove that we can find two lines among them such that the angle between them is less than 17°.

EP 5.1.6. How many key chains are there with 7 identical apartment keys and 3 identical lobby keys?

EP 5.1.7. Prove that among any 6 integers there are two numbers such that 5 divides their difference.

EP 5.1.8. What is the largest cardinality of a set of natural numbers not exceeding 10, if no number among them is twice another number?

EP 5.1.9. There are three identical pairs of black socks and three pairs of white socks in a drawer. What is the smallest number of separate socks that must be drawn at random to ensure that one gets at least one complete pair of black socks?

EP 5.1.10. There are 70 balls of the same size but of different colors in a box, among them 20 red, 20 green and 20 yellow balls; the others are black and white balls. What is the smallest number of balls to be chosen at random from the box to ensure that at least 10 same-color balls are selected?

EP 5.1.11. Consider n-digit natural numbers with $n \ge 3$, whose decimal representations contain only the three digits 1, 2, 3. How many such numbers contain each of these digits at least once?


EP 5.1.12. How many natural numbers are there that are less than 84 900 000 and mutually prime with that number?

EP 5.1.13. A high school rented 11 buses for the senior prom. The maximal load of every bus is 40 students. What are the smallest and the largest number of seniors in the school this year, if at least three buses carry the same number of students?

EP 5.1.14. 1) Prove that any six-element sequence of natural numbers contains either three numbers going in increasing order or three numbers going in decreasing order.

2) Is this conclusion true for five-element sequences?

3) Is this conclusion true for nine-element sequences and four-element subsequences?

EP 5.1.15. 1) Arrange the integers from 1 through 100 inclusive so that this ordering contains neither an increasing subsequence of length 11 nor a decreasing subsequence of length 11.

2) Prove that no such arrangement is possible for the first 101 natural numbers, that is, prove that any permutation of the integers from 1 through 101 inclusive contains either an increasing subsequence of length 11 or a decreasing subsequence of length 11.

EP 5.1.16. A test consists of five problems. Five students took the test and each of them solved at least two problems. Prove that at least two students solved the same number of problems.

EP 5.1.17. The conclusion of Problem 5.1.8 is false. How should its statement be changed to make it true?

EP 5.1.18. Prove that among any nine people there are either three pairwise familiar with one another or four pairwise unfamiliar, that is, prove that $R_2(3, 4) \le 9$. Moreover, in a party of eight this property fails. In other words, prove that the Ramsey number $R_2(3, 4) = 9$.

EP 5.1.19. Among 17 students there are people collecting stamps, postcards, and coins. Each pair of students has one and only one common hobby. Prove that there are at least three students with a mutual hobby. Is it always possible to find four people with a mutual hobby among these 17 students? Rephrase this problem in terms of the Ramsey numbers and in terms of colorings of the complete graph $K_{17}$.


EP 5.1.20. In Theorem 5.1.10 we have proved that

$$R_2(k, l) \le R_2(k-1, l) + R_2(k, l-1).$$

Show that if both $R_2(k-1, l)$ and $R_2(k, l-1)$ are even, then this inequality is strict, that is, the equality case cannot occur here.

EP 5.1.21. Prove that $R_2(3, 5) = 14$ and $R_2(4, 4) = R_2(3, 6) = 18$.

EP 5.1.22. Prove that $R_2(k, l) \le C(k + l - 2, k - 1)$, $k, l \ge 2$.

EP 5.1.23. Given 20 pairwise distinct natural numbers less than 65, prove that among the pairwise differences of these numbers there are at least four equal numbers.

EP 5.1.24. There are white, black, and brown gloves in a drawer, at least two pairs of each color. What is the smallest number of gloves that one has to pick at random in order to get two pairs (four gloves) of the same color?

EP 5.1.25. What is the smallest number of integers to be chosen from the set $T = \{1, 2, \ldots, 15\}$ so that the difference of two of the numbers chosen is 6?

EP 5.1.26. Prove that among any 101 integers there are at least two numbers such that 100 divides their difference.

EP 5.1.27. In a small town there are 10 000 cars, whose licence plates are numbered by 4-digit numbers. If a number has fewer than 4 digits, we prepend a few zeros, as in 0012. More than half of the cars are registered in the central district of the town. Prove that there is a car in the central district whose number is the sum of the numbers of two other cars from this district.

EP 5.1.28. Show that if 6 points are selected at random inside a square of side 1 cm, then at least two of them are less than 0.5 cm apart.

EP 5.1.29. All edges of a complete graph with 17 vertices are colored in three colors. Prove that there is a triangle in the graph whose edges have the same color.

EP 5.1.30. The 6-element set $T = \{1, 12, 23, 34, 45, 56\}$ possesses the following property: for any two numbers in $T$ the last digits of their sum and of their difference are not 0. Prove that 6 is the biggest number with this property, that is, prove that any 7-element set of integers contains a pair of numbers such that 10 divides either their sum or their difference.

EP 5.1.31. If six different numbers are chosen from the set $T = \{1, 2, \ldots, 10\}$, then there are at least two consecutive numbers among the six.


EP 5.1.32. Eight numbers are chosen from the set T D ¹1; 2; : : : ; 10º. Show thatthere are at least three pairs of these numbers with the sum 11. Is this conclusion trueif only seven numbers are chosen?

EP 5.1.33. 22 people gathered at the alumni reunion at a Small College, among themengineers, chemists, and business people. Show that at least one major was repre-sented by eight or more alumni.

EP 5.1.34. A set of integers contains at least two numbers congruent modulo 11 (Def-inition 1.1.49). What is the smallest cardinality of such a set of integers?

EP 5.1.35. Find a coloring of the edges of the complete graph $K_{13}$ in two colors, blue and green, so that no subgraph of order 3 has only blue edges and no subgraph of order 5 has only green edges.

EP 5.1.36. Find a coloring of the edges of the complete graph $K_{17}$ in two colors, so that no subgraph of order 4 is monochromatic.

EP 5.1.37. Prove that for any two mutually prime integers $m$ and $n$ there is a natural number $k$ such that $n$ divides $m^k - 1$.

EP 5.1.38. The sum of all entries of a $10 \times 10$ zero-one matrix is 81. Prove that the matrix contains a row and a column such that the sum of elements in these two lines is at least 17.

EP 5.1.39. A township of 51 houses occupies a square of 1 mile side. Prove that at least three houses are inside a circle of radius $\frac{1}{7}$ mi.

EP 5.1.40. Prove Corollaries 5.1.12, 5.1.14.

EP 5.1.41. Prove that for every integer $l \ge 1$ there is the smallest natural number $n = n(l)$ such that for any partition of the set $\{1, 2, \ldots, n\}$ in two subsets at least one of these subsets contains $l + 1$ numbers $x_1, \ldots, x_l, x_{l+1}$ satisfying the equation $x_1 + \cdots + x_l = x_{l+1}$. Prove in particular that $n(2) = 5$.

EP 5.1.42. Prove that a decimal expansion of any non-zero rational number is a periodic (repeating) decimal; it can start with a finite pre-period. For example, $2/15 = 0.133\ldots$; here the period is 3 and the pre-period is 1.


5.2 Systems of Distinct Representatives

Several statements considered in this section are equivalent to each other as well as to some other results such as Menger's theorem on disjoint chains in graphs or the maximal flow theorem; see, for instance, [17, p. 11 and p. 55]. They have numerous applications. Hereafter we systematically use ordered families of sets $U = (S_1, \ldots, S_n)$ in the sense of Definitions 5.1.1 and 5.1.3. At the end of the section we consider matchings in bipartite graphs.

Halmos' biography · I Want To Be A Mathematician · Karl Menger · Ph. Hall's life and work · M. Hall's life and work · Equivalence of Seven Major Combinatorial Theorems · König's biography · Dilworth's biography · Dilworth's Theorem · More on the Marriage Theorem

Definition 5.2.1. Given a set $S$ and a family $U = (S_1, \ldots, S_n)$ of its subsets, a set $D = \{a_1, \ldots, a_n\} \subset S$ is called a system of distinct representatives (SDR) or a transversal of the family $U$, if there exists a permutation $(j_1, j_2, \ldots, j_n)$ of indices $1, 2, \ldots, n$ such that $a_{j_i} \in S_i$, $1 \le i \le n$.

This definition becomes transparent from the following example.

Example 5.2.2. Let $S = \{1, 2\}$. The 3-family $(\{1\}, \{2\}, \{1, 2\})$ cannot have a SDR, since the two available elements, 1 and 2, cannot represent the three subsets $\{1\}$, $\{2\}$, and $\{1, 2\}$ of the family. However, the 2-family $(\{1\}, \{1, 2\})$ has one SDR, $D = \{1, 2\}$.
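The two families of this example are small enough to be checked by direct enumeration. The following minimal sketch lists all choices of distinct representatives $(a_1, \ldots, a_n)$ with $a_i \in S_i$; the function name `all_sdr` is introduced here only for illustration.

```python
from itertools import permutations

def all_sdr(family, universe):
    """Return every tuple (a_1, ..., a_n) of distinct elements with a_i in S_i."""
    n = len(family)
    return [choice for choice in permutations(sorted(universe), n)
            if all(a in s for a, s in zip(choice, family))]

S = {1, 2}
print(all_sdr([{1}, {2}, {1, 2}], S))   # []  (three sets, only two elements)
print(all_sdr([{1}, {1, 2}], S))        # [(1, 2)]
```

The enumeration tries every ordered selection of $n$ distinct elements, so it is practical only for very small sets.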

It is clear from this example that in order to have a SDR, a family of subsets cannot contain more terms than the cardinality of the union of these subsets, and the same must be true for every sub-family. It turns out that this simple necessary condition is also sufficient for the existence of a SDR.

Definition 5.2.3. A family $U = (S_1, \ldots, S_n)$ of subsets of a set $S$ satisfies Hall's condition (H) if the inequality

$$|S_{i_1} \cup S_{i_2} \cup \cdots \cup S_{i_k}| \ge k \qquad (5.2.1)$$

holds for each $k$, $1 \le k \le n$, and for any set of indices $(i_1, i_2, \ldots, i_k)$.
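For a small family the condition (H) can be verified literally, by running over all sub-families. The sketch below does this; the name `hall_violation` is hypothetical and the indices it reports are 0-based.

```python
from itertools import combinations

def hall_violation(family):
    """Return the (0-based) indices of a sub-family violating (5.2.1), or None."""
    n = len(family)
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            union = set().union(*(family[i] for i in idx))
            if len(union) < k:
                return idx
    return None

print(hall_violation([{1}, {2}, {1, 2}]))  # (0, 1, 2): three sets, union of size 2
print(hall_violation([{1}, {1, 2}]))       # None: the condition (H) holds
```

Since all $2^n - 1$ non-empty sub-families are inspected, this check is meant only as an illustration of the definition.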

Theorem 5.2.4. An ordered family $U = (S_1, \ldots, S_n)$ of subsets of a finite set $S$ has a SDR if and only if $U$ satisfies the condition (H).

Proof. The necessity of the condition is clear. Indeed, if the family $U$ has a SDR $\{a_1, \ldots, a_n\}$, then for any set of indices $(i_1, i_2, \ldots, i_k)$ the $k$-element set $\{a_{i_1}, \ldots, a_{i_k}\}$ is a SDR for the sub-family $(S_{i_1}, \ldots, S_{i_k})$, thus the union $S_{i_1} \cup \cdots \cup S_{i_k}$ contains at least these $k$ elements $a_{i_1}, \ldots, a_{i_k}$, and the condition (H) is valid.


The sufficiency of the condition (H) follows from the next theorem of M. Hall, which also gives a lower bound for the number of SDR. It is convenient to introduce the following notation.

For a family $U = (S_1, \ldots, S_n)$ of subsets of a finite set $S$ denote

$$\mu = \mu_U = \min_{1 \le i \le n} |S_i|.$$

Theorem 5.2.5. If the family $U = (S_1, \ldots, S_n)$ of subsets of a finite set $S$ satisfies the condition (H), then $U$ has at least $\mu!$ SDR if $\mu \le n$ and at least $\frac{\mu!}{(\mu - n)!}$ SDR if $\mu \ge n$.

Proof. We will prove the statement by mathematical induction on the number of terms $n$ of the family. If $n = 1$, then the family $U$ consists of one set $S_1$, and the inequality $|S_1| = \mu \ge n = 1$ is true for any natural $\mu$. Since in this case any element of $S_1$ makes a SDR, there are exactly

$$\frac{\mu!}{(\mu - n)!} = \frac{\mu!}{(\mu - 1)!} = \mu$$

SDR, and this establishes the basis of induction.

To make an inductive step, we pick a natural $n$ and assume that the conclusion of the theorem is valid for all families with less than $n$ terms, that is, under the condition (H) any family of less than $n$ terms has at least the above-mentioned number of SDR. To prove the same conclusion for any $n$-family of subsets, we proceed in two steps.

First, introduce a strengthened condition $(\tilde{H})$. Say that a family $U$ satisfies the condition $(\tilde{H})$ if the inequality

$$|S_{i_1} \cup S_{i_2} \cup \cdots \cup S_{i_k}| \ge k + 1$$

holds for all $k$ and for all sets of indices $(i_1, i_2, \ldots, i_k)$ with $1 \le k \le n - 1$, while for $k = n$ we still assume (5.2.1) as in (H).

Consider a family $U$ satisfying the strengthened condition $(\tilde{H})$. Choose an element $a \in S_1$ and consider sets $S'_i = S_i \setminus \{a\}$, $2 \le i \le n$. A new family $U' = (S'_2, \ldots, S'_n)$ satisfies the original condition (H), since it consists of less than $n$ subsets, and by the strengthened condition $(\tilde{H})$

$$|S'_{i_1} \cup S'_{i_2} \cup \cdots \cup S'_{i_k}| \ge |S_{i_1} \cup S_{i_2} \cup \cdots \cup S_{i_k}| - 1 \ge k.$$

It is obvious that for any $i$, $|S'_i| \ge \mu - 1$. If $\mu \le n$, then $\mu - 1 \le n - 1$. However, the size of the new family $U'$ is $n - 1$, and by the inductive assumption, the new family $U'$ has at least $(\mu - 1)!$ SDR. If $\mu > n$, then $\mu - 1 > n - 1$ and $U'$ has at least $\frac{(\mu - 1)!}{(\mu - n)!}$ SDR. Since the element $a \in S_1$ was excluded from all of the sets in the new family $U'$, this element, being appended to any SDR for $U'$, makes up a SDR for the original family $U$. Now, the element $a$ can be chosen in at least $\mu$ ways, thus, multiplying the assumed number of SDR for the new family $U'$ by $\mu$, we arrive at the conclusion of Theorem 5.2.5 under the strengthened condition $(\tilde{H})$.

Suppose now that the condition $(\tilde{H})$ fails but the original condition (H) holds true. Hence, for some $k$, $1 \le k \le n - 1$, and for some set of indices $(i_1, i_2, \ldots, i_k)$ the equality

$$|S_{i_1} \cup S_{i_2} \cup \cdots \cup S_{i_k}| = k$$

holds. Without loss of generality we assume that $i_j = j$, that is,

$$|S_1 \cup S_2 \cup \cdots \cup S_k| = k.$$

We notice that now the parameter $\mu$ satisfies

$$\mu = \min |S_i| \le |S_1| \le |S_1 \cup S_2 \cup \cdots \cup S_k| = k,$$

where $k$ is the size of the family $U_k = (S_1, S_2, \ldots, S_k)$. It follows by the inductive assumption that this family $U_k$ has at least $\mu!$ SDR; let $D = \{a_1, \ldots, a_k\}$ be any of them. To complete the proof, we now show that the shortened family $U'' = (S''_{k+1}, \ldots, S''_n)$, where $S''_j = S_j \setminus D$, also satisfies the condition (H).

Suppose on the contrary, that the family $U''$ does not satisfy the condition (H). Then one can find a sub-family $(S''_{j_1}, \ldots, S''_{j_l})$, $k + 1 \le j_i \le n$, such that

$$|S''_{j_1} \cup \cdots \cup S''_{j_l}| = l'' < l.$$

However, the families $U_k$ and $U''$ were constructed mutually disjoint, so that

$$|S_1 \cup S_2 \cup \cdots \cup S_k \cup S''_{j_1} \cup \cdots \cup S''_{j_l}| = |S_1 \cup S_2 \cup \cdots \cup S_k \cup S_{j_1} \cup \cdots \cup S_{j_l}| = k + l'' < k + l.$$

Thus, the condition (5.2.1) fails also for the original family $U$, which contradicts the premise. Hence, the family $U''$ satisfies the condition (H) and by the inductive assumption has at least one SDR. Combining this SDR with any of the $\mu!$ SDR for the family $U_k$, as we did with the element $a$ above, we derive $\mu!$ SDR for the original family $U$. The proof of Theorem 5.2.5 is now complete. Simultaneously we proved Theorem 5.2.4.
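The lower bound of Theorem 5.2.5 can be compared with the exact number of SDR for a few small families. In the sketch below, two SDR are counted as different whenever some set $S_i$ receives a different representative, which is the way of counting used in the proof above. The helper names and the three sample families are assumptions made for illustration; each sample family satisfies (H).

```python
from itertools import permutations
from math import factorial

def count_sdr(family):
    """Count tuples (a_1, ..., a_n) of distinct elements with a_i in S_i."""
    universe = sorted(set().union(*family))
    return sum(all(a in s for a, s in zip(choice, family))
               for choice in permutations(universe, len(family)))

def hall_lower_bound(family):
    """Bound of Theorem 5.2.5; meaningful only if the family satisfies (H)."""
    n = len(family)
    smallest = min(len(s) for s in family)      # the minimal cardinality of S_i
    if smallest <= n:
        return factorial(smallest)
    return factorial(smallest) // factorial(smallest - n)

for fam in ([{1, 2}, {1, 2}], [{1, 2, 3}, {2, 3, 4}], [{1}, {1, 2}, {1, 2, 3}]):
    print(count_sdr(fam), ">=", hall_lower_bound(fam))
```

For the first and the third family the bound is attained, so it cannot be improved in general.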

A constructive proof of the existence of SDR can be found, for instance, in [21, Section 5.1]. Theorem 5.2.4 is sometimes called the marriage theorem or the theorem on village weddings due to the following reformulation (Halmos).

Problem 5.2.1. Among young people attending a party, each boy is familiar with at least $m$ of the attending girls, however, every girl knows no more than $m$ boys. Demonstrate that every boy can marry a girl he has been familiar with.


Solution. Denote the number of boys by $p$ and let $G_i$ be the set of girls familiar with the $i$th boy. We prove that the family $(G_1, \ldots, G_p)$ satisfies the condition (H).

Otherwise, there would exist a set of indices $i_1, i_2, \ldots, i_k$, $k \le p$, such that $|G_{i_1} \cup \cdots \cup G_{i_k}| \le k - 1$, which means that $k$ boys, say, $b_{i_1}, b_{i_2}, \ldots, b_{i_k}$, together have at most $k - 1$ familiar girls; let us denote these girls by $g_{j_1}, g_{j_2}, \ldots, g_{j_l}$, where $l \le k - 1$.

Consider all pairs $(b_\alpha, g_\beta)$, $\alpha \in \{i_1, \ldots, i_k\}$, $\beta \in \{j_1, \ldots, j_l\}$, such that the boy $b_\alpha$ is familiar with the girl $g_\beta$; obviously, this is a symmetric binary relation. Denote by $\Upsilon$ the total number of such pairs. By assumption, for a fixed $\beta$ there are no more than $m$ such pairs, and since $l \le k - 1$, all in all we have $\Upsilon \le m(k - 1)$. On the other hand, for a fixed $\alpha$ there are no less than $m$ such pairs, thus, $\Upsilon \ge mk$, implying that $mk \le m(k - 1)$. This contradiction proves that the family $(G_1, \ldots, G_p)$ satisfies the Hall condition, consequently, it has a SDR, say, $\{g_{i_1}, g_{i_2}, \ldots, g_{i_p}\}$. The latter means exactly that the boy $b_j$ is familiar with the girl $g_{i_j}$, $1 \le j \le p$.

Analyzing this solution, we immediately derive the following sufficient condition for the existence of a SDR.

Problem 5.2.2. Let $U = (S_1, \ldots, S_n)$ be a family of subsets of a finite set $S$. Prove that if all the subsets $S_i$ have the same cardinality $k$, $|S_i| = k$, $1 \le i \le n$, and each element of the set $S$ belongs to exactly $k$ of the subsets $S_i$, then the family $U$ has a SDR.

SDR have many applications. Let $(A)$ and $(B)$ denote two $m$-partitions, in the sense of Definition 1.1.14, of a finite set $T$,

$$T = A_1 \cup A_2 \cup \cdots \cup A_m = B_1 \cup B_2 \cup \cdots \cup B_m.$$

Consider an $m$-element subset $E \subset T$, $|E| = m$, such that $A_i \cap E \ne \emptyset$ and $B_i \cap E \ne \emptyset$ for each $i$, $1 \le i \le m$. Then obviously

$$|A_i \cap E| = |B_i \cap E| = 1, \quad 1 \le i \le m.$$

Definition 5.2.6. The set $E$ is called a system of mutual representatives of the partitions $(A)$ and $(B)$.

Clearly, a system of mutual representatives exists if and only if one can renumber sets in the partitions so that $A_i \cap B_i \ne \emptyset$, $1 \le i \le m$. We prove a criterion for the existence of a system of mutual representatives analogous to the condition (H).

Theorem 5.2.7. Two $m$-partitions $(A)$ and $(B)$ have a system of mutual representatives if and only if for every $k$, $1 \le k \le m$, and for any set of indices $i_1, i_2, \ldots, i_k$ the union $A_{i_1} \cup \cdots \cup A_{i_k}$ contains no more than $k$ of the sets $B_1, B_2, \ldots, B_m$.


Proof. The necessity is obvious, as in Theorem 5.2.4. For, if

$$A_1 \cup \cdots \cup A_k \supset B_1 \cup B_2 \cup \cdots \cup B_{k+1}$$

then $k$ elements $a_1, \ldots, a_k$, $a_i \in A_i$, $1 \le i \le k$, cannot represent $k + 1$ sets $B_1, B_2, \ldots, B_{k+1}$.

To establish the sufficiency, we consider a set $S = \{A_1, \ldots, A_m\}$ and introduce the family $U = (S_1, \ldots, S_m)$, where $S_i$ is the totality of sets $A_j$ such that $A_j \cap B_i \ne \emptyset$, $1 \le i \le m$. We prove that the family $U$ satisfies the Hall condition (H). On the contrary, if for some $k$ the union $S_1 \cup \cdots \cup S_{k+1}$ contained at most $k$ elements (sets) $A_{i_1}, \ldots, A_{i_k}$, then we would have

$$A_{i_1} \cup \cdots \cup A_{i_k} \supset B_1 \cup B_2 \cup \cdots \cup B_{k+1},$$

contrary to the assumption. Hence due to Theorem 5.2.4, the family $U$ has a SDR. Now we can renumber the components of the partition $(A)$ so that this SDR becomes $D = \{A_1, \ldots, A_m\}$ and arrive at the conclusion by making use of the remark before the theorem.

Theorem 5.2.7 can be reformulated as follows:

Theorem 5.2.8. Two $m$-partitions $(A)$ and $(B)$ have a system of mutual representatives if and only if for any $k = 1, 2, \ldots, m$ no $k$ of the sets $A_i$ are contained in the union of less than $k$ of the sets $B_j$.
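For small partitions a system of mutual representatives can be found by trying all renumberings, exactly as in the remark preceding Theorem 5.2.7. The following sketch does so; the function name and the two sample pairs of partitions are hypothetical.

```python
from itertools import permutations

def mutual_representatives(A, B):
    """Return a list with one element in every A_i and every B_j, or None."""
    m = len(A)
    for perm in permutations(range(m)):
        # Pair A_i with B_perm[i]; every such pair must share an element.
        common = [A[i] & B[perm[i]] for i in range(m)]
        if all(common):
            return [min(c) for c in common]
    return None

A = [{1, 2}, {3, 4}, {5, 6}]
B = [{1, 3}, {2, 5}, {4, 6}]
print(mutual_representatives(A, B))    # [1, 4, 5]

A2 = [{1, 2}, {3}, {4}]
B2 = [{1}, {2}, {3, 4}]
print(mutual_representatives(A2, B2))  # None: {3} and {4} both lie in {3, 4}
```

In the second pair the two sets $\{3\}$ and $\{4\}$ are contained in the single set $\{3, 4\}$, so the condition of Theorem 5.2.8 fails for $k = 2$.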

Problem 5.2.3. $m \cdot p$ couples attend a party. The gentlemen belong to $m$ professions, $p$ men in each trade. The attending ladies belong to $m$ clubs, $p$ women in each club. Show that it is possible to select $m$ pairs for a dance representing all clubs and all professions.

Solution. Introduce a set $C$, whose elements are the $m \cdot p$ couples at the party, and consider two of its partitions:

$$C = A_1 \cup A_2 \cup \cdots \cup A_m = B_1 \cup B_2 \cup \cdots \cup B_m.$$

Here each $A_i$, $i = 1, 2, \ldots, m$, stands for the set of all pairs where gentlemen have the same profession, and every $B_j$ stands for the set of all pairs where the ladies belong to the same club. Since

$$|A_1| = \cdots = |A_m| = |B_1| = \cdots = |B_m| = p,$$

any $k$ of the sets $A_i$ contain $kp$ couples altogether, while fewer than $k$ of the sets $B_j$ contain fewer than $kp$ couples. Hence Theorem 5.2.8 applies, and the partitions $(A)$ and $(B)$ have a system of mutual representatives, which forms the set of $m$ pairs we need.


As another demonstration of the power of Hall's theorem, we consider its application to counting bases in finite-dimensional vector spaces. Recall that a basis in a vector space $V$ is a linearly independent set of vectors which spans the whole space, meaning that any vector of the space can be expanded through the basis vectors.

Theorem 5.2.9. Any two bases of a finite-dimensional vector space consist of the same number of vectors.

Proof. Let $\{x_1, x_2, \ldots, x_n\}$ and $\{y_1, y_2, \ldots, y_m\}$ be two bases of a vector space $V$. Expand the vectors $x_i$ against $y_j$,

$$x_i = \sum_{k=1}^{m_i} f_{ik} y_{j_k}, \quad \text{where } m_i \le m \text{ and all } f_{ik} \ne 0,$$

and consider the family $U = (S_1, \ldots, S_n)$ of sets $S_i = \{y_{j_1}, \ldots, y_{j_{m_i}}\} \subset V$. Thus, for each $i$, $1 \le i \le n$, the set $S_i$ consists of the basis vectors $y_{j_k}$ spanning the vector $x_i$.

We claim that this family satisfies the condition (H). Otherwise, we would be able to express certain $k$ basis vectors $x_i$ through less than $k$ vectors $y_j$, which means that these basis vectors $x_i$ are linearly dependent. This contradiction implies that the $n$-family $U$ has a SDR. Since these $n$ distinct representatives belong to the set $S = \{y_1, y_2, \ldots, y_m\}$, we must have $n \le m$. The reversed inequality follows the same lines due to symmetry, so that $m = n$.

Problem 5.2.4. For the latter theorem to hold true, it is sufficient that the vectors $\{y_j\}$ span the entire space and the vectors $\{x_i\}$ are linearly independent. Then the invariance of the dimension of the vector space follows.

The next application of Hall's theorem deals with extremal combinatorial problems. For a detailed exposition of this topic see, for example, [20]; here we consider only the assignment problem. Suppose that there are $n$ jobs that must be assigned to $n$ employees on a one-to-one basis. The utility (usefulness or uselessness) of the $i$th worker at the $j$th position is measured by the entry $t_{i,j}$ of the utility matrix $T$. Any assignment is given by a permutation

$$\Pi = \begin{pmatrix} 1 & 2 & \ldots & n \\ j_1 & j_2 & \ldots & j_n \end{pmatrix},$$

where the first row lists the employees and $j_i = \Pi(i)$ denotes the job the $i$th worker is assigned to. To improve performance, we have to maximize the sum

$$\sum_{i=1}^{n} t_{i,\Pi(i)}$$

over all permutations $\Pi$. A direct solution of the problem by the brute force enumeration of all $n!$ permutations is infeasible even for moderate values of $n$. However, the problem has an effective algorithmic solution.
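For very small $n$ the brute force enumeration is nevertheless a convenient reference against which other methods can be checked. A minimal sketch follows; the function name and the $2 \times 2$ matrix are hypothetical, and the indices are 0-based.

```python
from itertools import permutations

def best_assignment(T):
    """Try all n! permutations; return (maximal total utility, a best permutation)."""
    n = len(T)
    return max((sum(T[i][p[i]] for i in range(n)), p)
               for p in permutations(range(n)))

T = [[3, 1],
     [2, 4]]   # a hypothetical 2 x 2 utility matrix
print(best_assignment(T))  # (7, (0, 1)): worker 1 takes job 1, worker 2 takes job 2
```

The number of permutations examined is $n!$, which already exceeds $3.6 \cdot 10^6$ for $n = 10$.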

Theorem 5.2.10. Let $T = (t_{i,j})$ be an $n \times n$ matrix with real entries. Then

$$\max_{\Pi} \sum_{i=1}^{n} t_{i,\Pi(i)} = \min \Big( \sum_{i=1}^{n} w_i + \sum_{j=1}^{n} v_j \Big) \qquad (5.2.2)$$

where the maximum on the left is taken over all $n$-permutations of the set $S = \{1, 2, \ldots, n\}$ and the minimum on the right is taken over all numbers $w_i$ and $v_j$, such that

$$w_i + v_j \ge t_{i,j} \quad \text{for all } 1 \le i, j \le n.$$

This common extremal value of (5.2.2) is attained for certain values $j_i = \Pi^*(i)$ such that

$$w_i + v_{\Pi^*(i)} = t_{i,\Pi^*(i)}, \quad i = 1, \ldots, n,$$

and the permutation $\Pi^*$ solves the assignment problem.

Proof. We prove the theorem only for integer-valued utilities $t_{i,j}$. The general case can be found, for example, in [21, Sect. 7.1].

For a given integer-valued matrix $T$ we can always find integer numbers $w_i, v_j$ such that $w_i + v_j \ge t_{i,j}$, $1 \le i, j \le n$. It suffices, for example, to set all $v_j = 0$ and $w_i = \max_{1 \le j \le n} t_{i,j}$. Then $w_i + v_{\Pi(i)} \ge t_{i,\Pi(i)}$ for any permutation $\Pi$, and summing up over $i = 1, 2, \ldots, n$, we deduce

$$\sum_{i=1}^{n} w_i + \sum_{j=1}^{n} v_j \ge \sum_{i=1}^{n} t_{i,\Pi(i)}.$$

This readily yields the inequality $m \ge M$, where $m$ and $M$ are, respectively, the minimum and the maximum appearing in (5.2.2). We use Hall's theorem to prove that actually $m = M$. The subsequent proof is constructive, that is, it gives an algorithmic solution of the assignment problem.

For the given entries $t_{i,j}$, we have already shown the existence of integers $w_i$ and $v_j$ such that $w_i + v_j \ge t_{i,j}$. Fix a subscript $i$ and, if it is possible, decrease the number $w_i$ as long as the latter inequality holds true for all $j$. This way, for each $i$ we find at least one $j$ such that $w_i + v_j = t_{i,j}$. Denote by $S_i$, $1 \le i \le n$, the set of subscripts $j$ such that $w_i + v_j = t_{i,j}$, where $w_i$ and $v_j$ were chosen as described above. Introduce also the $n$-family $U = (S_1, \ldots, S_n)$. If the family $U$ has a SDR $\{j_1, \ldots, j_n\}$, then $w_i + v_{j_i} = t_{i,j_i}$ and the permutation $\Pi^*$, such that $\Pi^*(i) = j_i$, solves the problem. Therefore, to complete the proof of Theorem 5.2.10, we have to construct a SDR for $U$.


Suppose that the family $U$ does not have a SDR. Then by Theorem 5.2.4, the condition (H) fails for $U$. In turn, this means that there are subscripts $i_1, \ldots, i_k$, where $1 \le k \le n$, such that the union $S_{i_1} \cup \cdots \cup S_{i_k}$ contains at most $k - 1$ different subscripts $j$. Denote $K = \{i_1, \ldots, i_k\}$ and $S_K = S_{i_1} \cup \cdots \cup S_{i_k}$; by the assumption, $|S_K| = l \le k - 1$. Denote also

$$\hat{w}_i = \begin{cases} w_i - 1 & \text{if } i \in K \\ w_i & \text{if } i \notin K \end{cases} \qquad \hat{v}_j = \begin{cases} v_j + 1 & \text{if } j \in S_K \\ v_j & \text{if } j \notin S_K \end{cases} \qquad (5.2.3)$$

Clearly, if $i \notin K$, then $\hat{w}_i + \hat{v}_j \ge t_{i,j}$. If $i \in K$ and $j \in S_K$, then $\hat{w}_i + \hat{v}_j = (w_i - 1) + (v_j + 1) = w_i + v_j \ge t_{i,j}$ as well. Finally, if $i \in K$ and $j \notin S_K$, then due to the definition of $S_i$, $w_i + v_j \ne t_{i,j}$, that is, $w_i + v_j \ge t_{i,j} + 1$; hence $\hat{w}_i + \hat{v}_j = w_i + v_j - 1 \ge t_{i,j}$. Thus, we have proved that $\hat{w}_i + \hat{v}_j \ge t_{i,j}$, $1 \le i, j \le n$, in all possible cases. The following equation is also obvious,

$$\sum_{i=1}^{n} \hat{w}_i + \sum_{j=1}^{n} \hat{v}_j = \sum_{i=1}^{n} w_i + \sum_{j=1}^{n} v_j - k + l,$$

that is, when we replace $w_i$ and $v_j$ with, respectively, $\hat{w}_i$ and $\hat{v}_j$, this sum decreases by an integer $k - l > 0$.

However, this sum is bounded from below by $M$. Therefore, after finitely many such steps the sum $\sum_{i=1}^{n} w_i + \sum_{j=1}^{n} v_j$ cannot be decreased any more, which means that after a finite number of steps the modified family $U$ satisfies the condition (H), and so it has a SDR. The corresponding permutation, as at the beginning of the proof, solves the assignment problem.

Problem 5.2.5. Solve the assignment problem for the $4 \times 4$ utility matrix

$$T = \begin{pmatrix} 1 & 2 & 4 & 5 \\ 3 & 3 & 2 & 6 \\ 4 & 1 & 2 & 3 \\ 2 & 1 & 3 & 4 \end{pmatrix}$$

Solution. We illustrate here the algorithm of Theorem 5.2.10. Set initially $v_1 = v_2 = v_3 = v_4 = 0$. Since we need

$$w_1 + 0 = \max\{t_{1,j} \mid 1 \le j \le 4\}$$

we choose $w_1 = 5$. Similarly we find $w_2 = 6$, $w_3 = 4$, $w_4 = 4$. Now we observe that for these $w_i$ and the corresponding $v_j$ the equation $w_i + v_j = t_{i,j}$ holds for the pairs of indices $(1, 4), (2, 4), (3, 1), (4, 4)$, leading to the first set $K = \{1, 2, 3, 4\}$ and the corresponding set $S_K = \{1, 4\}$; let us recall that $S_K$ comprises all the second indices from the index pairs above. The family of four sets $(\{4\}, \{4\}, \{1\}, \{4\})$ contains only two distinct elements and obviously does not have a SDR, therefore, we have to change $K$ and $S_K$. For each index $i \in K$ we decrease $w_i$ by 1 and for each index $j \in S_K$ we increase $v_j$ by 1. The new values are $w_1 = 4$, $w_2 = 5$, $w_3 = 3$, $w_4 = 3$ and $v_1 = v_4 = 1$, $v_2 = v_3 = 0$.

We are to recalculate $K$ and $S_K$ with these new $w_i$ and $v_j$, so that $w_i + v_j = t_{i,j}$. Now we have the pairs of indices $(1, 3), (1, 4), (2, 4), (3, 1), (4, 3), (4, 4)$ and the corresponding family of four sets $(\{3, 4\}, \{4\}, \{1\}, \{3, 4\})$, comprising only three elements. This family also does not have a SDR and we have to repeat the basic step of the algorithm. It is clear that $i = 3$ can only be represented by $j = 1$, thus, we define the second set $K = \{1, 2, 4\}$ and the corresponding $S_K = \{3, 4\}$. These $K$ and $S_K$ tell us to decrease $w_1, w_2, w_4$ by 1 and to increase $v_3, v_4$ by 1, leading to the new values $w_1 = 3$, $w_2 = 4$, $w_3 = 3$, $w_4 = 2$ and $v_1 = 1$, $v_2 = 0$, $v_3 = 1$, $v_4 = 2$. These values result in the same $K = \{1, 2, 4\}$ and $S_K = \{3, 4\}$, thus, we have to repeat the basic step one more time. After this step, the family of sets (of subscripts) derived is $(S_1, S_2, S_3, S_4)$, where

$$S_1 = \{2, 3, 4\}, \quad S_2 = \{2, 4\}, \quad S_3 = \{1\}, \quad S_4 = \{1, 2, 3, 4\},$$

which obviously has SDR, for example, $(3, 2, 1, 4)$. Thus, the permutation

$$\Pi^* = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 2 & 1 & 4 \end{pmatrix}$$

solves the problem and gives the maximum value $t_{1,3} + t_{2,2} + t_{3,1} + t_{4,4} = 15$. To compute the extreme value of the utility, we also can, due to the duality relation (5.2.2), use the last values of $w_i$ and $v_j$; these values give (check that!) $(2 + 3 + 3 + 1) + (1 + 0 + 2 + 3) = 15$.
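The basic step of the algorithm can be mirrored in a short program. The sketch below is only an illustration for small integer matrices: it looks for a SDR and for a violating set $K$ by plain enumeration, all function names are hypothetical, and the indices are 0-based. It takes the first violating set $K$ it finds, which need not be the set chosen in the solution above, so the intermediate values of $w_i$ and $v_j$ may differ while the final total is the same.

```python
from itertools import combinations, permutations

def find_sdr(S):
    """Return a permutation p with p[i] in S[i] for all i, or None."""
    n = len(S)
    for p in permutations(range(n)):
        if all(p[i] in S[i] for i in range(n)):
            return p
    return None

def hall_violator(S):
    """Return (K, S_K) with |S_K| < |K|; exists whenever there is no SDR."""
    n = len(S)
    for k in range(1, n + 1):
        for K in combinations(range(n), k):
            SK = set().union(*(S[i] for i in K))
            if len(SK) < k:
                return K, SK
    return None

def assignment_by_duality(T):
    """Integer-valued T only; returns (w, v, perm) as in Theorem 5.2.10."""
    n = len(T)
    w = [max(row) for row in T]      # initial feasible values, all v_j = 0
    v = [0] * n
    while True:
        # S[i] consists of the subscripts j with w_i + v_j = t_ij
        S = [{j for j in range(n) if w[i] + v[j] == T[i][j]} for i in range(n)]
        perm = find_sdr(S)
        if perm is not None:
            return w, v, perm
        K, SK = hall_violator(S)     # the step (5.2.3)
        for i in K:
            w[i] -= 1
        for j in SK:
            v[j] += 1

T = [[1, 2, 4, 5],
     [3, 3, 2, 6],
     [4, 1, 2, 3],
     [2, 1, 3, 4]]
w, v, perm = assignment_by_duality(T)
print(sum(w) + sum(v), sum(T[i][perm[i]] for i in range(4)))  # 15 15
```

Both printed numbers coincide with the value 15 found in the solution, in agreement with the duality relation (5.2.2).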

The next application of Hall's theorem is concerned with zero-one matrices, that is, matrices consisting of 0s and 1s. In what follows a line means either a row or a column of a matrix. Evidently, we can consider not only zero-one matrices but those with elements of arbitrary nature and separate all the elements into two disjoint classes. Moreover, zeros and ones in this theory are symmetric as well as rows and columns.

Definition 5.2.11. A set of lines containing all 1s in a zero-one matrix is called a covering of the matrix. A collection of 1s in a zero-one matrix, such that no two 1s among them belong to the same line, is called an independent set of 1s.

Theorem 5.2.12 (König). The minimum number of lines in any covering of a zero-one matrix is equal to the maximum number of independent 1s in the matrix.


Proof. Let $m$ be the cardinality of a minimal covering and $M$ be the maximum number of independent 1s in a zero-one matrix $A = (a_{ij})$. To cover all 1s, one should at the very least cover these $M$ independent 1s. By virtue of the independence condition, no pair of 1s among these $M$ 1s can be covered by a line, hence at least $M$ lines are needed and $m \ge M$.

We use Theorem 5.2.4 to prove the opposite inequality. Let the minimum covering consist of $r$ rows and $s$ columns, $r + s = m$. The numbers $m$ and $M$ certainly do not change when we rearrange rows or columns. Therefore, by changing the order of rows and columns, we can assume that the $r$ rows appearing in the covering are the $r$ uppermost rows of the matrix, and similarly the $s$ columns appearing in the covering are the $s$ leftmost columns.

Introduce sets $S_i = \{j \mid a_{i,j} = 1,\ j > s\}$, $i = 1, \ldots, r$, that is, $S_i$ contains the numbers of columns to the right of the $s$th column, such that the element at the intersection of such a column and the $i$th row is a 1. We show that the family $U = (S_1, \ldots, S_r)$ satisfies the condition (H). Otherwise, it would be possible to find $k$ sets among these $S_i$ such that their union contains at most $k - 1$ elements. Considering the construction of these sets, this would mean that in the corresponding $k$ rows, the 1s to the right of the $s$th column are located in at most $k - 1$ columns altogether.

But the 1s located at the intersection of these $k$ rows with the leftmost $s$ columns are covered by these columns; recall that the first $s$ columns belong to the covering. Hence, if we remove these $k$ rows from the covering, the only 1s we uncover are those in these rows to the right of the $s$th column, and they lie in at most $k - 1$ columns. Now, to cover these 1s, we need at most $k - 1$ columns, and by adding such $k - 1$ columns to the covering instead of the $k$ rows removed, we derive a new covering consisting of at most $m - 1$ lines, which contradicts the minimality of $m$. By Theorem 5.2.4, the family $U$ has a SDR, namely, $r$ 1s in the upper $r$ rows, such that no two of them are in the same line and all these 1s are in the columns with numbers greater than $s$.

In exactly the same way, we can choose $s$ 1s in the leftmost $s$ columns, such that no two 1s among them are covered by one line and all are in the rows with indices greater than $r$. Obviously, no two 1s among the chosen $r + s = m$ 1s are in the same line, thus, $m \le M$.
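König's theorem is easy to test on a small matrix by computing both quantities independently. The sketch below uses plain enumeration; the function names and the sample matrix are hypothetical.

```python
from itertools import combinations

def min_cover(A):
    """Smallest number of lines (rows or columns) containing all 1s of A."""
    rows, cols = len(A), len(A[0])
    ones = [(i, j) for i in range(rows) for j in range(cols) if A[i][j] == 1]
    lines = [("row", i) for i in range(rows)] + [("col", j) for j in range(cols)]
    for size in range(len(lines) + 1):
        for chosen in combinations(lines, size):
            if all(("row", i) in chosen or ("col", j) in chosen for i, j in ones):
                return size

def max_independent(A):
    """Largest number of 1s, no two of which are in the same line."""
    ones = [(i, j) for i, row in enumerate(A) for j, x in enumerate(row) if x == 1]
    for size in range(len(ones), 0, -1):
        for chosen in combinations(ones, size):
            if (len({i for i, _ in chosen}) == size
                    and len({j for _, j in chosen}) == size):
                return size
    return 0

A = [[1, 1, 0],
     [1, 0, 0],
     [1, 0, 0]]   # a hypothetical example matrix
print(min_cover(A), max_independent(A))  # 2 2
```

Here the first column together with the first row covers all 1s, while the 1s in positions $(1, 2)$ and $(2, 1)$ are independent.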

Many other applications of Hall's theorem, such as, for instance, the calculation of permanents or the construction of Latin squares, can be found in [21] or [43]. We only prove the following beautiful theorem of Frobenius dealing with determinants. The latter were introduced in Section 4.4, Definition 4.4.15. We recall here that the determinant of an $n \times n$ matrix is an alternating sum of $n!$ products, called here the terms of the determinant.

Theorem 5.2.13. In order for each of the $n!$ terms of the determinant of an $n \times n$ matrix $A$ to be equal to zero it is necessary and sufficient that there are $k$ rows and $n - k + 1$ columns in $A$, with $1 \le k \le n$, not necessarily in succession, such that all entries $a_{i,j}$ at the intersection of these rows and columns are zeros.

Proof. Since a product (in our case a term of the determinant) vanishes if and only if it contains at least one vanishing factor, we can consider only zero-one matrices, replacing all non-zero elements of $A$ with 1s. Let $m$ and $M$ have the same meaning as in the preceding theorem. If $M \ge n$, then $A$ contains at least $n$ independent (in the sense of Definition 5.2.11) non-zero elements, that is, independent 1s. Thus they form a non-zero term of the determinant $\det(A)$, and if all terms of $\det(A)$ vanish, then $M < n$. By Theorem 5.2.12, $m = M < n$, and if this $m$-covering of all 1s in $A$ consists of $r$ rows and $s$ columns, then all elements situated at the intersection of the complementary $n - r$ rows and $n - s$ columns are 0s. The intersection of these lines is a zero $(n - r) \times (n - s)$ matrix, and since $s = m - r$, we have $n - r + n - s = 2n - m > n$. Thus, with $k = n - r$, there are $k$ rows and at least $n - k + 1$ columns intersecting in zeros.

To complete the proof, it suffices now to notice that this reasoning is word-by-word reversible.
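Both sides of the criterion of Theorem 5.2.13 can be computed directly for a small matrix. In the sketch below the function names and the matrices are hypothetical and the indices are 0-based.

```python
from itertools import combinations, permutations

def all_terms_vanish(A):
    """True if each of the n! products a[0][p(0)] * ... * a[n-1][p(n-1)] is 0."""
    n = len(A)
    return all(any(A[i][p[i]] == 0 for i in range(n))
               for p in permutations(range(n)))

def has_zero_block(A):
    """True if some k rows and n - k + 1 columns intersect only in zeros."""
    n = len(A)
    for k in range(1, n + 1):
        for rows in combinations(range(n), k):
            for cols in combinations(range(n), n - k + 1):
                if all(A[i][j] == 0 for i in rows for j in cols):
                    return True
    return False

A = [[1, 2, 3],
     [0, 0, 4],
     [0, 0, 5]]   # the last two rows and the first two columns meet in zeros
B = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 1]]
print(all_terms_vanish(A), has_zero_block(A))  # True True
print(all_terms_vanish(B), has_zero_block(B))  # False False
```

For the matrix `A` we have $k = 2$ rows and $n - k + 1 = 2$ columns, so the zero block has the size required by the theorem.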

The next statement, Dilworth's theorem, is concerned with properties of partially ordered sets (posets); see Definition 1.1.24. Notice that if a poset is not a chain, that is, not all of its elements are pairwise comparable, then it can be decomposed into a union of disjoint chains, and this decomposition generally is not unique.

Example 5.2.14. Consider a poset $X = \{a, b, c, d, f\}$, where a binary relation of partial order is given by

$$\varrho = \{(a, c), (a, d), (b, d), (b, f)\},$$

that is, $a \le c$, $a \le d$, $b \le d$, and $b \le f$. Then we can represent $X$ as the union of chains in several ways, for instance, as

$$X = \{a, c\} \cup \{b, d\} \cup \{f\}$$

or as

$$X = \{a, d\} \cup \{b, f\} \cup \{c\}$$

or else as

$$X = \{a, c\} \cup \{b\} \cup \{d\} \cup \{f\}.$$

Theorem 5.2.15 (Dilworth). Let $X = \{x_1, x_2, \ldots, x_n\}$ be a finite poset. The minimum number of disjoint chains containing all elements of $X$ is equal to the maximum number of pairwise noncomparable elements of $X$.

Proof. Let $m$ be the minimal number of chains covering the set $X$ and $M$ be the maximal number of pairwise noncomparable elements in $X$. Since a chain cannot contain two noncomparable elements, it is obvious that $m \ge M$. We use König's Theorem 5.2.12 to prove the opposite inequality. Consider a matrix $A = (a_{i,j})$, where $a_{i,j} = 1$ if and only if $x_i \le x_j$ and $i \ne j$; otherwise $a_{i,j} = 0$. First we prove two lemmas.

Lemma 5.2.16. For any independent set $F$ of 1s in the matrix $A$ there exists a partition $\Delta$ of the $n$-element set $X$ into disjoint chains such that

$$|F| + |\Delta| = n.$$

Proof. Let

$$F = \{a_{i_1,i_2}, a_{i_3,i_4}, \ldots, a_{i_{2k-1},i_{2k}}\},$$

which means that $x_{i_1} \le x_{i_2}, \ldots, x_{i_{2k-1}} \le x_{i_{2k}}$. Thus, the elements $x_{i_1}, \ldots, x_{i_{2k}}$ can be grouped in chains containing two or more elements each. These chains are mutually disjoint due to the independence of $F$. If in addition to these chains, we consider all other elements of $X$ as one-element chains, we derive a partition of $X$ into disjoint chains; call this partition $\Delta$. Denote by $l_j$ the number of elements in the $j$th chain. Since these chains contain all elements of $X$ and are disjoint, we have

$$n = \sum_{j=1}^{|\Delta|} l_j = \sum_{j=1}^{|\Delta|} (l_j - 1) + |\Delta| = |F| + |\Delta|$$

because the $l_j - 1$ 1s in $F$ correspond to $l_j$ elements of $X$, which make the $j$th chain in $\Delta$.

Definition 5.2.17. A covering of a matrix is called irreducible if it fails to be a covering after removal of any line from it.

Lemma 5.2.18. Let a zero-one matrix $A$ correspond to a poset $X$, $|X| = n$, and $T$ be an irreducible covering of 1s in $A$. Then there exists a subset $U \subset X$ such that

$$|U| + |T| = n$$

and $U$ consists of pairwise incomparable elements.

Proof. Let the covering $T$ consist of rows $i_1, \ldots, i_k$ and columns $j_1, \ldots, j_m$. First we prove that all these indices are different.

On the contrary, if $i_1 = j_1$, then due to the irreducibility of the covering $T$ there is an element $a_{r,j_1} = 1$ such that the $r$th row does not belong to $T$, and also there is an element $a_{i_1,s} = 1$ such that the $s$th column does not belong to $T$. The transitivity of a partial order and the equation $i_1 = j_1$ imply $x_r \le x_s$. Suppose that $r = s$. Then we have $x_{i_1} \le x_s = x_r \le x_{j_1} = x_{i_1}$ and the antisymmetry of a partial order leads to the equation $x_{i_1} = x_s$, meaning that the element $a_{i_1,s} = 1$ is located on the principal diagonal, contrary to the definition of $A$. Thus, $r \ne s$ and $x_r \le x_s$, implying $a_{r,s} = 1$. We have noticed, however, that no line of the covering $T$ can cover this unity. This contradiction shows that all indices in $T$ are distinct.

We denote $U = X \setminus \{x_{i_1}, \ldots, x_{i_k}, x_{j_1}, \ldots, x_{j_m}\}$. Since $T$ is a covering, the elements of $U$ are pairwise incomparable and $n = |U| + |T|$.

Completion of the proof of the Dilworth theorem. We have to establish the inequality $m \le M$. Let $\hat{F}$ be a maximal set of independent 1s in $A$. By Lemma 5.2.16, there exists the corresponding partition $\hat{\Delta}$; clearly, $m \le |\hat{\Delta}|$. On the other hand, let $\hat{T}$ be a minimal covering of 1s in $A$. By Lemma 5.2.18, there is a subset $\hat{U} \subset X$ corresponding to $\hat{T}$; clearly, $|\hat{U}| \le M$. Theorem 5.2.12 implies the equation $|\hat{F}| = |\hat{T}|$. Thus $|\hat{\Delta}| = |\hat{U}|$ and finally $|\hat{U}| \le M \le m \le |\hat{\Delta}| = |\hat{U}|$, that is, $m = M$.
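The argument can be traced on the poset of Example 5.2.14. The sketch below computes the maximal number of pairwise noncomparable elements by enumeration and, following Lemma 5.2.16, the number of chains obtained from a maximal independent set of 1s in the matrix $A$. The function names are hypothetical, and the set `less` is assumed to list all pairs $x \le y$ with $x \ne y$ (for this poset no further pairs arise by transitivity).

```python
from itertools import combinations

X = ["a", "b", "c", "d", "f"]
less = {("a", "c"), ("a", "d"), ("b", "d"), ("b", "f")}   # Example 5.2.14

def max_antichain(X, less):
    """Largest number of pairwise noncomparable elements (brute force)."""
    for size in range(len(X), 0, -1):
        for sub in combinations(X, size):
            if not any((x, y) in less or (y, x) in less
                       for x, y in combinations(sub, 2)):
                return size
    return 0

def max_independent_ones(less):
    """Largest set of 1s of A with pairwise distinct rows and distinct columns."""
    pairs = sorted(less)
    for size in range(len(pairs), 0, -1):
        for sub in combinations(pairs, size):
            if (len({x for x, _ in sub}) == size
                    and len({y for _, y in sub}) == size):
                return size
    return 0

M = max_antichain(X, less)
chains = len(X) - max_independent_ones(less)   # |F| + (number of chains) = n
print(M, chains)  # 3 3
```

The three pairwise noncomparable elements are $c, d, f$, and a decomposition into three chains is given in Example 5.2.14.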

Now we establish the equivalence of the three major results of this section.

Theorem 5.2.19. The theorems of Hall, König, and Dilworth are equivalent.

Proof. We only have to deduce Hall's theorem 5.2.4 from Dilworth's theorem 5.2.15. Given a set $S = \{x_1, \ldots, x_m\}$, let us consider a family $U = (S_1, \ldots, S_n)$ of its subsets. Hall's theorem asserts that (H) is a necessary and sufficient condition for the existence of a SDR. Since the necessity is immediate (see the proof of Theorem 5.2.4), we assume that the family $U$ satisfies the condition (H) and shall prove that $U$ has a SDR.

Introduce a poset $X = \{R_1, \ldots, R_m, C_1, \ldots, C_n\}$, where the partial ordering is defined by the following three conditions:

1) $R_i \le C_j$ if and only if $x_i \in S_j$

2) $C_j \le C_j$, $\forall j$, $1 \le j \le n$

3) $R_i \le R_i$, $\forall i$, $1 \le i \le m$.

If all elements of a subset $\{R_{i_1}, \ldots, R_{i_l}, C_{j_1}, \ldots, C_{j_k}\}$ are pairwise incomparable, this signifies that no element among $x_{i_1}, \ldots, x_{i_l}$ belongs to any of the subsets $S_{j_1}, \ldots, S_{j_k}$. Hence, the condition (H) yields the inequality $k + l \le m$. Therefore, no subset in $X$, consisting of pairwise incomparable elements, can have the cardinality greater than $m$. This can be rephrased as follows: The maximum number of pairwise incomparable elements in $X$ does not exceed $m$. Moreover, there is a subset in $X$, namely $\{R_1, \ldots, R_m\}$, consisting of exactly $m$ pairwise incomparable elements, thus this maximal number is $m$. By Theorem 5.2.15, the set $X$ can be decomposed in $m$ disjoint chains. There may be three kinds of chains: two-element chains $\{R_{i_p}, C_{j_q}\}$, one-element chains $\{R_{i_p}\}$, and one-element chains $\{C_{j_q}\}$. After some renumbering, this decomposition can be written as

$$\{R_1, C_1\}, \ldots, \{R_t, C_t\}, \{R_{t+1}\}, \ldots, \{R_m\}, \{C_{t+1}\}, \ldots, \{C_n\}.$$


Since any chain contains at most one element $R_i$, every $R_i$ must belong to a chain, and there are exactly as many chains as there are elements $R_i$, $1 \le i \le m$. Thus, no one-element chain $\{C_j\}$ can exist and we must have $t = n \le m$. Therefore, the above chain decomposition actually is

$$\{R_1, C_1\}, \ldots, \{R_t, C_t\}, \{R_{t+1}\}, \ldots, \{R_m\}.$$

This shows that $x_1 \in S_1, \ldots, x_n \in S_n$, and we constructed a SDR $\{x_1, \ldots, x_n\}$ for the family $U$.

At the end of this section we apply Hall’s theorem to study matchings in bipartite graphs.

Definition 5.2.20. Consider a bipartite graph $G = (V_1 \cup V_2, E)$. Any set of its edges is called a matching from $V_1$ to $V_2$. A matching $M$ in a bipartite graph $G$ is called complete if there is a one-to-one correspondence between $V_1$ and a subset of $V_2$ such that the corresponding vertices are connected by the $M$-edges. A matching $M$ in a bipartite graph $G$ is called maximal if no other matching in $G$ contains more edges than $M$.

A matching can be described as a (nonsymmetric) binary relation between the sets $V_1$ and $V_2$. If $X \subseteq V$ is a set of vertices, we denote by $\Gamma(X)$ the set of all vertices adjacent to some vertex in $X$. We again immediately observe a necessary condition for the existence of a complete matching in a bipartite graph $G = (V_1 \cup V_2, E)$, namely, the inequality $|\Gamma(X)| \ge |X|$ for any subset $X \subseteq V_1$. Similarly to other results in this section, the following theorem asserts that this natural necessary condition is also sufficient.

Theorem 5.2.21. A complete matching in a bipartite graph $G = (V_1 \cup V_2, E)$ exists if and only if $|\Gamma(X)| \ge |X|$ for every subset $X \subseteq V_1$.

Proof. We prove that the statement is equivalent to Hall’s Theorem 5.2.4. Let $S$ be a finite set. To any family of sets $U = (S_1, \dots, S_n)$, $S_j \subseteq S$, $1 \le j \le n$, there corresponds its bipartite incidence graph $G = (V_1 \cup V_2, E)$, where $V_1$ and $V_2$ are arbitrary sets with $|V_1| = n$, $|V_2| = |S|$, and a point in $V_1$ is connected with a point in $V_2$ if and only if the corresponding subset in $U$ contains the corresponding element in $S$. Vice versa, to any bipartite graph we can quite similarly associate a family of sets $U$. It is obvious that a complete matching in $G$ exists if and only if the family $U$ has an SDR.
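The correspondence between SDRs and complete matchings can be tried out on small families. The following Python sketch is an illustration only, not part of the text: the function names `find_sdr` and `hall_condition` are hypothetical, the search uses the standard augmenting-path method for bipartite matching, and Hall’s condition (H) is checked by brute force over all subfamilies, which is feasible only for small $n$.

```python
from itertools import combinations

def find_sdr(family):
    """Return a list reps with reps[j] in family[j] and all reps distinct,
    or None if the family has no SDR (augmenting-path search)."""
    owner = {}  # element -> index of the set it currently represents

    def try_assign(j, seen):
        for x in family[j]:
            if x in seen:
                continue
            seen.add(x)
            # x is free, or the set that owns x can move to another element
            if x not in owner or try_assign(owner[x], seen):
                owner[x] = j
                return True
        return False

    for j in range(len(family)):
        if not try_assign(j, set()):
            return None
    reps = [None] * len(family)
    for x, j in owner.items():
        reps[j] = x
    return reps

def hall_condition(family):
    """Brute-force check of (H): any k sets together contain at least k elements."""
    n = len(family)
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            if len(set().union(*(family[j] for j in idx))) < k:
                return False
    return True

good = [{1, 2}, {2, 3}, {1, 3}]
bad = [{1}, {1}, {1, 2}]
```

On these two sample families, `find_sdr` succeeds exactly when `hall_condition` holds, in agreement with Theorem 5.2.21.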


EP 5.2.1. 1) A clothing store has suits of two designs and two colors. Is it possible to choose two suits for display representing both designs and both colors?

2) If there are suits of three designs and three colors, can the display show all three designs and all three colors using only two suits?

EP 5.2.2. Prove Theorem 5.2.10 for non-integer utilities $t_{i,j}$.

EP 5.2.3. Revisit Problem 5.2.5 and find all other possible SDRs and all other solutions of the assignment problem. Make sure that they return the same maximum value as in the solution above.

EP 5.2.4. Solve the assignment problem with the utility matrix

$$T = \begin{pmatrix} 3 & 2 & 2 & 5 & 4 \\ 5 & 6 & 2 & 6 & 1 \\ 3 & 1 & 2 & 4 & 6 \\ 2 & 3 & 3 & 5 & 4 \\ 4 & 2 & 1 & 1 & 3 \end{pmatrix}.$$

EP 5.2.5. In addition to the three chain decompositions presented in Example 5.2.14, find all other possible decompositions of the set $X$ in the example.

EP 5.2.6. Consider all 3-element families of subsets of the set $X = \{a, b, c, d\}$ without repeating subsets. Which of them have an SDR? Find them.

EP 5.2.7. Prove that if a family $U = (S_1, \dots, S_n)$ of subsets of a finite set $S = \{x_1, \dots, x_m\}$ satisfies Hall’s condition (H), then $U$ has a unique SDR if and only if $|S_1 \cup \dots \cup S_n| = n$.

EP 5.2.8. How many $n \times n$ zero-one matrices are there that have exactly one 1 in each line (row or column)?

EP 5.2.9. How many $m \times n$ zero-one matrices are there such that the sum of all their elements is $k$?

EP 5.2.10. Prove that if a $2n \times 2n$ zero-one matrix contains $3n$ 1s, then it is possible to find $n$ rows and $n$ columns which cover all 1s in the matrix. However, there exists a zero-one $2n \times 2n$ matrix with $3n + 1$ 1s such that no set of $n$ rows and $n$ columns covers all of the 1s.

EP 5.2.11. An edge-cover in a graph is a set $S$ of vertices such that every edge is incident to a vertex in $S$. Prove that König’s theorem can be stated as follows: The maximum size of a matching in a bipartite graph is equal to the minimum size of an edge-cover in the graph.


EP 5.2.12. The next problem represents another kind of problem on systems of (not necessarily distinct) representatives.

In Really Fraternal College there are 2006 fraternities and sororities, each of which comprises more than half of all college students. Many of the students belong to several sororities or fraternities. Prove that it is possible to find at most 10 students at the college that represent every sorority and fraternity, that is, for each sorority and fraternity there is someone among these 10 students who belongs to this sorority or fraternity.

5.3 Block Designs

In this section we are concerned with methods of selecting subsets in a given set subject to various restrictions on the elements of these subsets. Such methods are important in scheduling, planning of experiments, and many other problems.


Problem 5.3.1. Organizers of a football tournament invited nine teams and rented three stadiums. Each team must play every other team exactly once. How should the organizers schedule the games to finish the tournament as soon as possible?

Solution. All in all, $C(9, 2) = 36$ games are necessary. Therefore, each stadium should host $36/3 = 12$ games, because if one field hosts fewer than 12 games, then another field must host more than 12, which would make the tournament longer. The organizers can split all nine teams into groups of three, called hereafter blocks, assign each block to a stadium and schedule a mini-series within each group. Any such mini-tournament consists of $C(3, 2) = 3$ games. When the first series of the three simultaneous mini-tournaments within blocks is over, the organizers reshuffle all teams into new blocks of three, making sure that no pair of teams meets again, and repeat this procedure until each team has played all the others. Since it is necessary to have 12 games at each field and any mini-series consists of three games, we expect at least $12/3 = 4$ shuffles of three blocks, with each block consisting of three teams.

It is not at all clear that this procedure works, and to finish the solution, we have to present all blocks explicitly. Denoting the participating teams by $t_1, \dots, t_9$, we arrange them in blocks as follows. The first shuffle is

$$B_1 = \{t_1, t_2, t_3\}, \quad B_2 = \{t_4, t_5, t_6\}, \quad B_3 = \{t_7, t_8, t_9\}.$$

The second shuffle is



The third shuffle is


And the last, fourth shuffle is


We observe that for each pair $\{t_i, t_j\}$, $1 \le i, j \le 9$, $i \ne j$, the teams $t_i$ and $t_j$ play each other exactly once.
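One way to generate four shuffles with this property can be sketched in Python. This is an illustration under an assumption of ours, not necessarily the arrangement used in the solution: team $t_{3a+c+1}$ is identified with the pair $(a, c)$, $a, c \in \{0, 1, 2\}$, and the blocks are the "lines" of the $3 \times 3$ grid taken modulo 3. The function name `nine_team_schedule` is hypothetical.

```python
from itertools import combinations

def nine_team_schedule():
    """Return 4 rounds, each a list of 3 blocks of team numbers 1..9.

    Team 3*a + c + 1 stands for the point (a, c) with a, c in {0, 1, 2}.
    Round 0 groups the points by a; the other three rounds group them
    by the value of (c + m*a) mod 3 for m = 0, 1, 2.
    """
    def team(a, c):
        return 3 * a + c + 1

    rounds = [[{team(a, c) for c in range(3)} for a in range(3)]]
    for m in range(3):
        rounds.append([
            {team(a, c) for a in range(3) for c in range(3)
             if (c + m * a) % 3 == d}
            for d in range(3)
        ])
    return rounds

schedule = nine_team_schedule()
# every unordered pair of teams that shares a block, over all rounds
pairs = [frozenset(p) for rnd in schedule for block in rnd
         for p in combinations(sorted(block), 2)]
```

The first round coincides with the first shuffle $B_1, B_2, B_3$ above, and the list `pairs` contains each of the 36 pairs of teams exactly once.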

Solving this problem, we selected 12 subsets (blocks) of the given set. In the example, these 12 blocks, $B_1, \dots, B_{12}$, satisfy the following obvious conditions:

• Each block consists of three elements.

• Each element of the given set appears in exactly four blocks.

• Each pair of elements meets in precisely one block.

Such configurations are called (combinatorial) block designs. They are useful in many problems, such as scheduling and experiment planning. Formalizing the conditions above, we arrive at the following definition.

Definition 5.3.1. Let $X = \{x_1, \dots, x_v\}$ be a finite set, whose subsets are hereafter called blocks. A family (in the sense of Definition 5.1.1) of blocks $B = (B_1, \dots, B_b)$ is called a balanced incomplete block design (BIBD) with parameters $(v, b, k, r, \lambda)$, denoted by $S(v, b, k, r, \lambda)$, if

• Each block contains $k$ elements, $|B_1| = \dots = |B_b| = k$.

• Each element of the set $X$ belongs to exactly $r$ blocks.

• Each pair of elements of $X$ appears in precisely $\lambda$ blocks.

These configurations are called balanced because of the uniformity of the preceding conditions. They are called incomplete because a block design does not necessarily contain all the $\binom{v}{k}$ $k$-element subsets of $X$. It is worth recalling that by Definition 5.1.1 of a family of subsets, some or even all blocks $B_i$ can coincide. The solution of Problem 5.3.1 gives an example of the BIBD $S(9, 12, 3, 4, 1)$. Here are two more examples of BIBD.

BIBD $S(7, 7, 3, 3, 1)$:

$B_1 = \{1, 3, 7\}$, $B_2 = \{1, 2, 4\}$, $B_3 = \{2, 3, 5\}$, $B_4 = \{3, 4, 6\}$,
$B_5 = \{4, 5, 7\}$, $B_6 = \{1, 5, 6\}$, $B_7 = \{2, 6, 7\}$

BIBD $S(13, 26, 3, 6, 1)$:

$B_1 = \{1, 2, 5\}$, $B_2 = \{2, 3, 6\}$, $B_3 = \{3, 4, 7\}$, $B_4 = \{4, 5, 8\}$,
$B_5 = \{5, 6, 9\}$, $B_6 = \{6, 7, 10\}$, $B_7 = \{7, 8, 11\}$, $B_8 = \{8, 9, 12\}$,
$B_9 = \{9, 10, 13\}$, $B_{10} = \{1, 10, 11\}$, $B_{11} = \{2, 11, 12\}$, $B_{12} = \{3, 12, 13\}$,
$B_{13} = \{1, 4, 13\}$, $B_{14} = \{1, 3, 8\}$, $B_{15} = \{2, 4, 9\}$, $B_{16} = \{3, 5, 10\}$,
$B_{17} = \{4, 6, 11\}$, $B_{18} = \{5, 7, 12\}$, $B_{19} = \{6, 8, 13\}$, $B_{20} = \{1, 7, 9\}$,
$B_{21} = \{2, 8, 10\}$, $B_{22} = \{3, 9, 11\}$, $B_{23} = \{4, 10, 12\}$, $B_{24} = \{5, 11, 13\}$,
$B_{25} = \{1, 6, 12\}$, $B_{26} = \{2, 7, 13\}$
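Checking the three defining conditions by hand is tedious for 26 blocks, so a short Python sketch may help. It is only an illustration: the names `bibd_parameters` and `shifts` are hypothetical. The sketch also relies on an observation of ours, which the reader can confirm against the list above: the 26 blocks of the second example are exactly the 13 cyclic shifts modulo 13 of $\{1, 2, 5\}$ together with the 13 cyclic shifts of $\{1, 3, 8\}$.

```python
from itertools import combinations
from collections import Counter

def bibd_parameters(points, blocks):
    """Return (v, b, k, r, lam) if the blocks form a BIBD on points, else None."""
    points = list(points)
    sizes = {len(B) for B in blocks}
    point_count = Counter(x for B in blocks for x in B)
    pair_count = Counter(frozenset(p) for B in blocks
                         for p in combinations(sorted(B), 2))
    reps = {point_count[x] for x in points}
    lams = {pair_count[frozenset(p)] for p in combinations(points, 2)}
    if len(sizes) != 1 or len(reps) != 1 or len(lams) != 1:
        return None
    return (len(points), len(blocks), sizes.pop(), reps.pop(), lams.pop())

def shifts(base, v):
    """All v cyclic shifts of a base block on the points 1..v."""
    return [{(x - 1 + s) % v + 1 for x in base} for s in range(v)]

fano = [{1, 3, 7}, {1, 2, 4}, {2, 3, 5}, {3, 4, 6},
        {4, 5, 7}, {1, 5, 6}, {2, 6, 7}]
s13 = shifts({1, 2, 5}, 13) + shifts({1, 3, 8}, 13)
```

Calling `bibd_parameters(range(1, 8), fano)` and `bibd_parameters(range(1, 14), s13)` recovers the parameter lists of the two examples.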

As we will see, the existence of a BIBD $S(v, b, k, r, \lambda)$ imposes certain restrictions on the parameters $v, b, k, r, \lambda$. To this end it is convenient to introduce the incidence matrix of a block design.

Definition 5.3.2. Consider a BIBD $S(v, b, k, r, \lambda)$ built on a $v$-element set $X$ and consisting of $b$ blocks. The incidence matrix of $S(v, b, k, r, \lambda)$ is a zero-one $v \times b$ matrix $M = (m_{i,j})$, whose rows correspond to the elements and whose columns correspond to the blocks, with the entries

$$m_{i,j} = \begin{cases} 1 & \text{if } x_i \in B_j \\ 0 & \text{if } x_i \notin B_j. \end{cases} \tag{5.3.1}$$

To find necessary conditions that the parameters of a BIBD $S(v, b, k, r, \lambda)$ must satisfy, we calculate the number of 1s in $M$ in two different ways. On the one hand, each element belongs to $r$ blocks, hence each row in the matrix contains $r$ 1s; thus, the matrix contains in total $r \cdot v$ 1s. On the other hand, each block consists of $k$ elements; therefore, each of the $b$ columns, representing the $b$ blocks, contains $k$ 1s, totalling $b \cdot k$. Thus, we get a necessary condition for a BIBD $S(v, b, k, r, \lambda)$ to exist,

$$b \cdot k = r \cdot v. \tag{5.3.2}$$

To derive another necessary condition for the existence of a BIBD $S(v, b, k, r, \lambda)$, we choose an element, say $x_1$, and compute how many times all the ordered pairs $(x_1, x_i)$, $i \ne 1$, appear in all blocks. The element $x_1$ appears in $r$ blocks and in each of the blocks it makes up pairs with $k - 1$ other elements, altogether generating $r(k - 1)$ such pairs. On the other hand, since $|X| = v$, there are $v - 1$ different pairs $(x_1, x_i)$ with $i \ne 1$ and each of them appears $\lambda$ times, adding up to $\lambda(v - 1)$. Thus we get another necessary condition,

$$r(k - 1) = \lambda(v - 1). \tag{5.3.3}$$

Conditions (5.3.2) and (5.3.3) are necessary but, as we will see, are not sufficient for the existence of a BIBD $S(v, b, k, r, \lambda)$.
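The two counting conditions are easy to test mechanically. In the following Python sketch (an illustration with hypothetical function names), the first function checks (5.3.2) and (5.3.3) for a full parameter list, and the second computes the values of $b$ and $r$ forced by $v$, $k$, $\lambda$, returning `None` when they are not integers.

```python
def satisfies_counting_conditions(v, b, k, r, lam):
    """Check b*k == r*v (5.3.2) and r*(k-1) == lam*(v-1) (5.3.3)."""
    return b * k == r * v and r * (k - 1) == lam * (v - 1)

def complete_parameters(v, k, lam):
    """Return the pair (b, r) determined by v, k, lam, or None if no
    integers b, r satisfy both counting conditions."""
    if (lam * (v - 1)) % (k - 1) != 0:
        return None
    r = lam * (v - 1) // (k - 1)
    if (r * v) % k != 0:
        return None
    return (r * v // k, r)
```

For instance, $v = 13$, $k = 3$, $\lambda = 1$ force $b = 26$ and $r = 6$, while $v = 8$, $k = 3$, $\lambda = 1$ admit no integer $r$. Passing this test does not guarantee that a design exists.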


To study BIBD, we need a few simple properties of their incidence matrices. Let $M$ be the incidence matrix of the BIBD $S(v, b, k, r, \lambda)$. Introduce a $v \times v$ matrix $N = M \cdot M^T$, where $M^T$ is the transpose of $M$, that is, the matrix $M$ flipped about its main diagonal. We compute the matrix $N$ in the next lemma. Hereafter, subscripts indicate the dimensions of a matrix or a vector.

Lemma 5.3.3.

$$N = \begin{pmatrix} r & \lambda & \cdots & \lambda \\ \lambda & r & \cdots & \lambda \\ \vdots & \vdots & \ddots & \vdots \\ \lambda & \lambda & \cdots & r \end{pmatrix} = (r - \lambda) I_v + \lambda J_v. \tag{5.3.4}$$

Moreover, $w_v M = k w_b$, which is equivalent to

$$JM = kJ \tag{5.3.5}$$

where $I$ is the unit matrix, that is, all of its diagonal elements are 1s and all off-diagonal elements are 0s, $J$ is a $v \times v$ matrix and $w$ is a vector all of whose elements are 1s.

Proof. An element $n_{i,j}$ of the matrix $N$ is the dot product of the $i$th and $j$th rows of the incidence matrix $M$. Therefore, $n_{i,i}$ is equal to the number of 1s in the $i$th row of $M$, which is $r$. If $i \ne j$, then $n_{i,j} = m_{i,1}m_{j,1} + \dots + m_{i,b}m_{j,b}$, and the addend $m_{i,s}m_{j,s} = 1$ if and only if $m_{i,s} = m_{j,s} = 1$, which means that $x_i \in B_s$ and $x_j \in B_s$. However, each pair $x_i, x_j$ meets in $\lambda$ blocks, thus each pair contributes $\lambda$ ones to $n_{i,j}$. This proves (5.3.4).

To prove the second statement of the lemma, it suffices to notice that each column of $M$ contains exactly $k$ 1s, which is expressed by (5.3.5).
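The lemma can be checked numerically on the design $S(7, 7, 3, 3, 1)$ listed earlier in this section. The Python sketch below is an illustration only; the helper names `incidence_matrix` and `gram` are hypothetical. The rows of `M` correspond to elements and the columns to blocks, as in (5.3.1).

```python
def incidence_matrix(points, blocks):
    """Zero-one matrix with entry 1 in row i, column j when point i lies in block j."""
    return [[1 if x in B else 0 for B in blocks] for x in points]

def gram(M):
    """Return N = M * M^T as a list of lists."""
    return [[sum(a * b for a, b in zip(row_i, row_j)) for row_j in M]
            for row_i in M]

fano = [{1, 3, 7}, {1, 2, 4}, {2, 3, 5}, {3, 4, 6},
        {4, 5, 7}, {1, 5, 6}, {2, 6, 7}]
M = incidence_matrix(range(1, 8), fano)
N = gram(M)
column_sums = [sum(row[j] for row in M) for j in range(len(fano))]
```

Here `N` has $r = 3$ on the diagonal and $\lambda = 1$ elsewhere, and every entry of `column_sums` equals $k = 3$, as (5.3.4) and (5.3.5) predict.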

The converse of Lemma 5.3.3 is also true (see EP 5.3.5).

We will also use equation (5.3.4) rewritten in other terms. Introduce linear forms

$$L_j(x_1, \dots, x_v) = \sum_{i=1}^{v} m_{i,j} x_i, \quad 1 \le j \le b, \tag{5.3.6}$$

where $m_{i,j}$ are elements of the matrix $M$. Then (5.3.4) can be written as

$$L_1^2 + \dots + L_b^2 = (r - \lambda)(x_1^2 + \dots + x_v^2) + \lambda(x_1 + \dots + x_v)^2. \tag{5.3.7}$$

To compute the determinant of the matrix $N = M \cdot M^T$, we subtract its first column from all the subsequent columns and then add the 2nd, 3rd, ..., $v$th rows to the first one. The resulting matrix is triangular, hence its determinant is the product of the diagonal elements,

$$\det(N) = (r - \lambda)^{v-1}(v\lambda - \lambda + r). \tag{5.3.8}$$


If $r = \lambda$, then $\lambda(k - 1) = \lambda(v - 1)$ by (5.3.3), thus $v = k$ and the block design is trivial in the sense that it consists of several identical repeating blocks, which are copies of the basic set $X$. The strict inequality $r < \lambda$ is impossible, for it would imply $r(k - 1) < \lambda(k - 1)$, hence $\lambda(v - 1) < \lambda(k - 1)$ by (5.3.3), therefore $v < k$. The latter would mean that a block contains more elements than the entire basic set.

Thus, hereafter we assume that $r > \lambda$. Hence

$$v\lambda - \lambda + r = v\lambda + (r - \lambda) > 0$$

and (5.3.8) implies the inequality $\det(N) > 0$. We have proved that $N$ is a non-singular matrix, and since $N$ is a $v \times v$ matrix, its rank is $v$.

The rank of a product of matrices cannot exceed the rank of any factor in the product. In addition, the rank of $M$ cannot exceed the number of its columns, which is $b$. Hence, we deduce the Fisher inequality

$$v \le b \tag{5.3.9}$$

valid for any BIBD $S(v, b, k, r, \lambda)$. Moreover, (5.3.9) and (5.3.2) imply the inequality $k \le r$ for any such BIBD.
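Formula (5.3.8) can be compared with a direct computation. The Python sketch below is an illustration with hypothetical helper names: it builds the matrix $(r - \lambda)I_v + \lambda J_v$ and evaluates its determinant exactly by Gaussian elimination over the rationals, which is a general-purpose method and not the column and row operations described in the text.

```python
from fractions import Fraction

def determinant(A):
    """Exact determinant by Gaussian elimination with rational arithmetic."""
    A = [[Fraction(x) for x in row] for row in A]
    n, det = len(A), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if A[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            det = -det  # a row swap changes the sign
        det *= A[c][c]
        for r in range(c + 1, n):
            factor = A[r][c] / A[c][c]
            for j in range(c, n):
                A[r][j] -= factor * A[c][j]
    return det

def n_matrix(v, r, lam):
    """The v x v matrix with r on the diagonal and lam elsewhere."""
    return [[r if i == j else lam for j in range(v)] for i in range(v)]

def det_formula(v, r, lam):
    """Right-hand side of (5.3.8)."""
    return (r - lam) ** (v - 1) * (v * lam - lam + r)
```

For the parameters of the three designs met so far, $(v, r, \lambda) = (7, 3, 1)$, $(9, 4, 1)$ and $(13, 6, 1)$, the two computations agree and the determinant is positive.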

Definition 5.3.4. A BIBD $S(v, b, k, r, \lambda)$ is called symmetric if $v = b$; in this case equation (5.3.2) implies also that $k = r$. Therefore, the symmetric block designs have only three independent parameters and will be denoted by $S(v, k, \lambda)$.

For example, the BIBD $S(7, 7, 3, 3, 1) = S(7, 3, 1)$ presented above is symmetric. Symmetric block designs are dealt with in the following statement.

Theorem 5.3.5. The incidence matrix of a symmetric block design $S(v, k, \lambda)$ satisfies the relations

$$MM^T = (k - \lambda)I + \lambda J = M^TM \tag{5.3.10}$$

and

$$JM = kJ = MJ. \tag{5.3.11}$$

Proof. Only the right equations require proofs, since the left ones are, respectively, (5.3.4) and (5.3.5) rewritten for the symmetric case $v = b$ and $k = r$. The right equation in (5.3.11) follows immediately from symmetry, since it says that every row in $M$ contains $k = r$ 1s, that is, each element belongs to $r = k$ blocks. Thus, we have to prove only the right equation in (5.3.10).

To prove it, we multiply (5.3.4) on the left by the inverse matrix $M^{-1}$, which exists due to the non-singularity of $N = MM^T$, and get the equation

$$M^T = (k - \lambda)M^{-1} + \lambda M^{-1}J. \tag{5.3.12}$$

Similarly, the equation $kJ = MJ$ implies $kM^{-1}J = J$, which together with (5.3.12) gives $M^T = (k - \lambda)M^{-1} + (\lambda/k)J$. Multiplying the latter on the right by $M$ and using the equation $JM = kJ$ from (5.3.11), we obtain $M^TM = (k - \lambda)I + \lambda J$, which completes the proof.

Remark 5.3.6. This theorem is a particular case of a more general theorem by Ryser [21, p. 130].

Studying the combinatorial block designs, we are concerned with two problems.

• Does there exist a design $S(v, b, k, r, \lambda)$ with the given parameters $v, b, k, r, \lambda$?

• If a design $S(v, b, k, r, \lambda)$ does exist, how can it be constructed explicitly?

The following theorem gives necessary conditions for the existence of symmetric block designs.

Theorem 5.3.7 (Bruck–Ryser–Chowla). Let there exist a symmetric BIBD $S(v, k, \lambda)$.

1) If the number of elements $v$ is even, then the difference $\alpha = k - \lambda$ is a perfect square.

2) If $v$ is odd, then the Diophantine equation

$$z^2 = (k - \lambda)x^2 + (-1)^{\frac{v-1}{2}} \lambda y^2$$

has a non-trivial solution in integers $x, y, z$. The triple $x = y = z = 0$ obviously satisfies this equation; non-trivial means that at least one of the numbers $x, y, z$ is non-zero, in other words $x^2 + y^2 + z^2 > 0$.

Proof. 1) It follows from (5.3.3) that in the symmetric case $k(k - 1) = \lambda(v - 1)$, hence $v\lambda - \lambda + k = k^2$. Therefore, we deduce from (5.3.8) the equation

$$(\det(M))^2 = \det(N) = (k - \lambda)^{v-1}(v\lambda - \lambda + k) = (k - \lambda)^{v-1}k^2.$$

The latter implies that $(k - \lambda)^{v-1}$ must be a square, which is impossible for an odd number $v - 1$ unless the base $k - \lambda$ is a square.

2) To prove the theorem in the case of odd $v$, we need the following lemma. We leave it to the reader to verify this claim by a direct calculation.


Lemma 5.3.8. If

$$\begin{cases} y_1 = b_1x_1 - b_2x_2 - b_3x_3 - b_4x_4 \\ y_2 = b_2x_1 + b_1x_2 - b_4x_3 + b_3x_4 \\ y_3 = b_3x_1 + b_4x_2 + b_1x_3 - b_2x_4 \\ y_4 = b_4x_1 - b_3x_2 + b_2x_3 + b_1x_4 \end{cases} \tag{5.3.13}$$

then

$$y_1^2 + y_2^2 + y_3^2 + y_4^2 = \alpha(x_1^2 + x_2^2 + x_3^2 + x_4^2) \tag{5.3.14}$$

where $\alpha = b_1^2 + b_2^2 + b_3^2 + b_4^2$.
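Before carrying out the direct calculation, the identity can be spot-checked on random integers. The following Python sketch is only a numerical sanity check, not a proof, and the names `transform` and `norm` are hypothetical.

```python
import random

def transform(b, x):
    """Apply system (5.3.13): return (y1, y2, y3, y4) for given b and x."""
    b1, b2, b3, b4 = b
    x1, x2, x3, x4 = x
    return (
        b1 * x1 - b2 * x2 - b3 * x3 - b4 * x4,
        b2 * x1 + b1 * x2 - b4 * x3 + b3 * x4,
        b3 * x1 + b4 * x2 + b1 * x3 - b2 * x4,
        b4 * x1 - b3 * x2 + b2 * x3 + b1 * x4,
    )

def norm(t):
    """Sum of the squares of the components."""
    return sum(c * c for c in t)

rng = random.Random(0)
samples = [(tuple(rng.randint(-9, 9) for _ in range(4)),
            tuple(rng.randint(-9, 9) for _ in range(4)))
           for _ in range(100)]
identity_holds = all(norm(transform(b, x)) == norm(b) * norm(x)
                     for b, x in samples)
```

With $b = (1, 1, 1, 1)$, so that $\alpha = 4$, and $x = (1, 0, 0, 0)$, the system gives $y = (1, 1, 1, 1)$ and both sides of (5.3.14) equal 4.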

Let us notice that the determinant of system (5.3.13) is equal to $\alpha^2$. Thus, if $b_1, b_2, b_3, b_4$ are integers, then the solutions $x_1, \dots, x_4$ of (5.3.13) can be expressed as linear forms with rational coefficients in $y_1, \dots, y_4$, and the common denominator of all these coefficients is $\alpha^2$. Moreover, in the symmetric case equation (5.3.7) becomes

$$L_1^2 + \dots + L_v^2 = \alpha(x_1^2 + \dots + x_v^2) + \lambda(x_1 + \dots + x_v)^2. \tag{5.3.15}$$

We will use the following theorem (see, for example, [24, p. 302]).

Lagrange’s Theorem on Four Squares. If zero addends are allowed, then every natural number can be written as a sum of four squares.

For example, $9 = 3^2 + 0^2 + 0^2 + 0^2$, $10 = 3^2 + 1^2 + 0^2 + 0^2$, $11 = 3^2 + 1^2 + 1^2 + 0^2$, $12 = 3^2 + 1^2 + 1^2 + 1^2$.
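For small numbers a representation can be found by exhaustive search. The Python sketch below is an illustration only: the function name `four_squares` is hypothetical, and the brute-force search is merely a convenient way to exhibit a representation, unrelated to any proof of Lagrange’s theorem.

```python
from math import isqrt

def four_squares(n):
    """Return (a, b, c, d) with a >= b >= c >= d >= 0 and
    a*a + b*b + c*c + d*d == n, trying the largest a first."""
    for a in range(isqrt(n), -1, -1):
        for b in range(min(a, isqrt(n - a * a)), -1, -1):
            rest_b = n - a * a - b * b
            for c in range(min(b, isqrt(rest_b)), -1, -1):
                rest = rest_b - c * c
                d = isqrt(rest)
                if d <= c and d * d == rest:
                    return (a, b, c, d)
    return None
```

For $n = 9, 10, 11, 12$ the search returns the four representations listed above.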

Proof of Theorem 5.3.7 (continued) when $v$ is odd. Let $v \equiv 1 \pmod 4$, that is, $v - 1$ is a multiple of 4. Applying Lagrange’s theorem to the number $\alpha = k - \lambda$, we can write

$$\alpha = b_1^2 + b_2^2 + b_3^2 + b_4^2. \tag{5.3.16}$$

Next, we split the variables $x_1, \dots, x_{v-1}$ into quadruples

$$(x_1, x_2, x_3, x_4), \dots, (x_{v-4}, x_{v-3}, x_{v-2}, x_{v-1}).$$

Considering (5.3.16), we apply formula (5.3.14) to each quadruple,

$$\alpha(x_i^2 + x_{i+1}^2 + x_{i+2}^2 + x_{i+3}^2) = y_i^2 + y_{i+1}^2 + y_{i+2}^2 + y_{i+3}^2,$$

thus (5.3.15) becomes

$$L_1^2 + \dots + L_v^2 = y_1^2 + \dots + y_{v-1}^2 + \alpha x_v^2 + \lambda(x_1 + \dots + x_v)^2. \tag{5.3.17}$$

The rest of the proof is based on the observation that (5.3.17) is an identity with rational coefficients in the indeterminates $x_1, \dots, x_v$ or, which is equivalent, in $y_1, \dots, y_v$, and we use these indeterminates to derive the Diophantine equation we seek.

First, we set $y_v = x_v$ and eliminate all $x_i$, $1 \le i \le v$, from (5.3.17) by considering system (5.3.13) for each quadruple $x_i, x_{i+1}, x_{i+2}, x_{i+3}$ with the same coefficients $b_1, b_2, b_3, b_4$, and solving all these systems for $x_i$, $1 \le i \le v$. After this elimination (5.3.17) becomes

$$L_1^2 + \dots + L_v^2 = y_1^2 + \dots + y_{v-1}^2 + \alpha y_v^2 + \lambda w^2 \tag{5.3.18}$$

where $L_1, \dots, L_v$ and $w = x_1 + \dots + x_v$ are linear forms in the indeterminates $y_1, \dots, y_v$ with rational coefficients.

Let $L_1 = c_1y_1 + \dots + c_vy_v$. If $c_1 \ne 1$, then the equation $L_1 = y_1$ allows us to express $y_1$ through $y_2, \dots, y_v$ as a linear form with rational coefficients. Otherwise, that is, if $c_1 = 1$, we consider the equation $L_1 = -y_1$. In either case $L_1^2 = y_1^2$, thus (5.3.18) becomes

$$L_2^2 + \dots + L_v^2 = y_2^2 + \dots + y_{v-1}^2 + \alpha y_v^2 + \lambda w^2.$$

The latter identity depends only on $y_2, \dots, y_v$. Continuing in the same fashion and eliminating, one by one, the indeterminates $y_2, \dots, y_{v-1}$, we end up with the equation

$$L_v^2 = \alpha y_v^2 + \lambda w^2$$

where $L_v$ and $w$ are rational multiples of the last remaining variable $y_v$. Setting here $y_v$ to be equal to an integer multiple $x \ne 0$ of the denominators of $L_v$ and $w$, we derive a relation between three integers $x \ne 0$, $y$, $z$:

$$z^2 = \alpha x^2 + \lambda y^2. \tag{5.3.19}$$

Therefore, we have proved the theorem in the case $v \equiv 1 \pmod 4$. The case $v \equiv 3 \pmod 4$ is treated similarly, but to apply Lagrange’s theorem in this case, we have to introduce a new variable $x_{v+1}$ and add the term $\alpha x_{v+1}^2$ to both sides of (5.3.15). Similar calculations lead to the equations $\alpha x^2 = y_{v+1}^2 + \lambda w^2$ and

$$z^2 = \alpha x^2 - \lambda y^2. \tag{5.3.20}$$

The Diophantine equations (5.3.19) and (5.3.20) together complete the proof of Theorem 5.3.7.

Theorem 5.3.7 implies, in particular, that the necessary conditions (5.3.2)–(5.3.3), which in the symmetric case reduce to the single equation

$$k(k - 1) = \lambda(v - 1),$$

are not sufficient. For example, it is readily verified that the values $v = 43$, $k = 7$, $\lambda = 1$ satisfy the latter equation. However, Theorem 5.3.7 gives in this case the equation $z^2 = 6x^2 - y^2$, which has no non-trivial solution.
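The equation attached to given parameters, and a search for small solutions, can be sketched in Python. This is an illustration with hypothetical function names. A bounded search can exhibit a solution when one exists, but finding none below a bound proves nothing; the non-existence claim for $z^2 = 6x^2 - y^2$ needs the argument asked for in EP 5.3.11.

```python
from math import isqrt

def brc_equation(v, k, lam):
    """For odd v return (A, B) such that the equation of Theorem 5.3.7
    reads z^2 = A*x^2 + B*y^2."""
    if v % 2 == 0:
        raise ValueError("v must be odd")
    sign = 1 if ((v - 1) // 2) % 2 == 0 else -1
    return (k - lam, sign * lam)

def small_solution(A, B, bound):
    """Search 0 <= x, y <= bound, (x, y) != (0, 0), for integers with
    z^2 = A*x^2 + B*y^2; return (x, y, z) or None if none is found."""
    for x in range(bound + 1):
        for y in range(bound + 1):
            if x == 0 and y == 0:
                continue
            rhs = A * x * x + B * y * y
            if rhs >= 0:
                z = isqrt(rhs)
                if z * z == rhs:
                    return (x, y, z)
    return None
```

For the symmetric design $S(7, 3, 1)$ the equation is $z^2 = 2x^2 - y^2$, with the solution $x = y = z = 1$; for $v = 43$, $k = 7$, $\lambda = 1$ the search with bound 60 finds nothing.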



EP 5.3.1. A teacher arranges her 4 first-graders in a $2 \times 2$ square. For how many days can she make these arrangements so that every child has a new neighbor in her row?

EP 5.3.2. Solve the previous problem if 40 kids must be arranged in a $10 \times 4$ rectangle.

EP 5.3.3. There are 20 students in a class. In the classroom, there are 10 desks with two seats each. On the first day of each week a teacher rearranges the students, so that any two students sit at the same desk if and only if they have never sat together before. For how many weeks can the teacher do that?

EP 5.3.4. Arrange several pennies, nickels, dimes, quarters, and half-dollars in a $4 \times 4$ square, so that each row, each column, and each of the two diagonals consists of different coins and the total sum of all 16 coins is the largest possible.

EP 5.3.5. Prove the converse of Lemma 5.3.3, that is, prove that given a zero-one matrix $M$ whose elements satisfy (5.3.4)–(5.3.5), there exists a BIBD $S(v, b, k, r, \lambda)$ with the incidence matrix $M$.

EP 5.3.6. Deduce equation (5.3.7) from (5.3.4).

EP 5.3.7. Restore the details of the calculation of the determinant leading to equation (5.3.8).

EP 5.3.8. Compute the determinant of system (5.3.13).

EP 5.3.9. Prove Lemma 5.3.8.

EP 5.3.10. Derive equation (5.3.20) in detail.

EP 5.3.11. Prove that the equation $z^2 = 6x^2 - y^2$ has no non-trivial solution.

EP 5.3.12. Do there exist BIBD $S(43, 43, 7, 7, 1)$ and $S(15, 21, 5, 7, 2)$?


5.4 Systems of Triples

In the last section we study systems of triples, that is, the block designs with 3-element blocks. In particular, we find for what values of $v$ the systems of triples exist.


If $k = 3$, equations (5.3.2) and (5.3.3) read

$$3b = rv, \quad 2r = \lambda(v - 1),$$

leading to

$$r = \frac{1}{2}\lambda(v - 1), \quad b = \frac{1}{6}\lambda v(v - 1). \tag{5.4.1}$$

Since $r$ and $b$ are integers, equations (5.4.1) give the following necessary conditions for a system of triples to exist.

Proposition 5.4.1. If a system of triples $S(v, b, 3, r, \lambda)$ exists, then the product $\lambda(v - 1)$ is even and the product $\lambda v(v - 1)$ is divisible by 6, that is,

$$\lambda(v - 1) \equiv 0 \pmod 2, \quad \lambda v(v - 1) \equiv 0 \pmod 6. \tag{5.4.2}$$

It turns out that these necessary conditions are also sufficient for the existence of systems of triples. Moreover, similar conditions, which follow from (5.3.2)–(5.3.3), are also necessary and sufficient for the existence of block designs with $k = 4$ and any $\lambda$, but are not sufficient if $k \ge 5$ [21, Chap. 15]. It is, however, known [49] that for any given $k$ and $\lambda$ there exists a number $v_0$ such that for all $v \ge v_0$ conditions (5.3.2)–(5.3.3) are not only necessary but also sufficient for the existence of a block design $S(v, b, k, r, \lambda)$.

We consider in more detail the systems of triples with $\lambda = 1$. They are called Steiner triple systems. If $k = 3$ and $\lambda = 1$, then for a given $v$ the two other parameters, $b$ and $r$, are uniquely determined from (5.4.1); thus, we denote the Steiner triple systems by $S(v)$ and call $v$ the order of the system.

When $\lambda = 1$, (5.4.2) implies $v - 1 \equiv 0 \pmod 2$ and $v(v - 1) \equiv 0 \pmod 6$. Therefore, $v$ is odd and $v(v - 1)$ is a multiple of 6; since $v$ is odd, there are only three possible cases,

$$v = 6t + 1, \quad v = 6t + 3, \quad v = 6t + 5,$$

where $t$ is an integer.

However, if $v = 6t + 5$, then $v(v - 1) = (6t + 5)(6t + 4) = 6(6t^2 + 9t) + 20$, which is not divisible by 6; hence for $v \equiv 5 \pmod 6$ systems $S(v)$ do not exist, and we have only two possibilities left, $v \equiv 1 \pmod 6$ and $v \equiv 3 \pmod 6$. Such values of $v$ are called admissible. It turns out that for each admissible $v$ Steiner triple systems $S(v)$ do exist. The proof below follows Hilton [27] and is recursive. First we prove two theorems of Moore, which give algorithms for constructing a system $S(v)$ from given systems with smaller values of the parameter $v$, and then we prove that each admissible value $v = 6t + 1$ or $v = 6t + 3$ can be expressed through smaller admissible values of $v$, such that those algorithms can be applied.
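The admissibility test and the parameters forced by (5.4.1) with $\lambda = 1$ take only a few lines of Python. The sketch is an illustration, and the function names are hypothetical.

```python
def is_admissible(v):
    """Necessary conditions (5.4.2) with lam = 1, for orders v >= 3."""
    return v >= 3 and (v - 1) % 2 == 0 and (v * (v - 1)) % 6 == 0

def steiner_parameters(v):
    """Return (b, r) from (5.4.1) with lam = 1, for an admissible v."""
    if not is_admissible(v):
        raise ValueError("v is not admissible")
    return (v * (v - 1) // 6, (v - 1) // 2)

small_orders = [v for v in range(3, 37) if is_admissible(v)]
```

The list `small_orders` consists of 3, 7, 9, 13, 15, 19, 21, 25, 27, 31, 33, all of which are congruent to 1 or 3 modulo 6.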

Definition 5.4.2. Let two block designs, $S'$ and $S''$, be built on the sets $X'$ and $X''$, respectively, and let $B'$, $B''$ stand for the families of their blocks. The designs $S'$ and $S''$ are called isomorphic if there exist two one-to-one correspondences,

$$\varphi : X' \to X''$$

and

$$\psi : B' \to B''$$

compatible with the incidence relations in these designs. The latter means that for any blocks $B_1 \in B'$ and $B_2 \in B''$, and for any elements $x_1 \in B_1$ and $x_2 \in B_2$, the equality $x_2 = \varphi(x_1)$ holds if and only if $B_2 = \psi(B_1)$. If $S' = S'' = S$, then an isomorphism is called an automorphism of $S$.

Since superposition of mappings is associative, it is (almost) obvious that all the automorphisms of a block design form a multiplicative group with respect to the superposition.

Problem 5.4.1. Prove this statement.

Theorem 5.4.3 (Moore). If there are Steiner triple systems $S(v_1)$ and $S(v_2)$, then there also exists a Steiner triple system $S(v)$ with $v = v_1 \cdot v_2$, containing a subsystem isomorphic to $S(v_1)$ and a subsystem isomorphic to $S(v_2)$.

Proof. Consider any two sets $X = \{x_1, \dots, x_{v_1}\}$ and $Y = \{y_1, \dots, y_{v_2}\}$, such that $|X| = v_1$ and $|Y| = v_2$. To simplify notation, we consider, without loss of generality, the sets $X = N_{v_1} = \{1, \dots, v_1\}$ and $Y = N_{v_2} = \{1, \dots, v_2\}$. By the assumption, there exist Steiner triple systems $S(v_1)$ and $S(v_2)$ on these sets, respectively. We construct a system $S(v_1 \cdot v_2)$ on the Cartesian product $Z = X \times Y$, $|Z| = v_1 \cdot v_2$. The set $Z$ consists of ordered pairs of natural numbers, and the following algorithm determines which triples of the elements of $Z$, that is, which triples of these ordered pairs, make blocks in $S(v)$.

Let $z_{i,j} = (x_i, y_j) \in Z$. A triple $\{z_{i,r}, z_{j,s}, z_{k,t}\}$ is a block in $S(v)$ if and only if one of the following three mutually exclusive conditions holds true.

1) The triple $\{x_i, x_j, x_k\}$ is a block in $S(v_1)$ and $r = s = t$.

2) The triple $\{y_r, y_s, y_t\}$ is a block in $S(v_2)$ and $i = j = k$.

3) The triple $\{x_i, x_j, x_k\}$ is a block in $S(v_1)$ and the triple $\{y_r, y_s, y_t\}$ is a block in $S(v_2)$.

We have to verify that this system of triples satisfies the definition of a Steiner triple system $S(v)$ with $v = v_1 \cdot v_2$. First we check that each pair of elements meets in exactly one block. Let $\{z_{i,r}, z_{j,s}\}$ be an arbitrary pair in $Z$. If $i = j$, then the pair $\{y_r, y_s\}$ meets in the unique block $\{y_r, y_s, y_u\}$ of $S(v_2)$, since $S(v_2)$ is a Steiner triple system. Thus, the pair $\{z_{i,r}, z_{j,s}\}$ uniquely determines the block $\{z_{i,r}, z_{i,s}, z_{i,u}\}$. Moreover, the pair $\{z_{i,r}, z_{j,s}\}$ does not appear in the two other parts of the algorithm and cannot generate any more triples. The same argument works if $r = s$.

Next, if $i \ne j$ and $r \ne s$, then the pair $\{x_i, x_j\}$ uniquely determines the block $\{x_i, x_j, x_k\}$ in $S(v_1)$, that is, we have found the index $k$. Similarly, the pair $\{y_r, y_s\}$ uniquely determines the block $\{y_r, y_s, y_u\}$ in $S(v_2)$, thus we have found the index $u$. If a pair $\{z_{i,r}, z_{j,s}\}$ were to meet in two blocks $\{z_{i,r}, z_{j,s}, z_{k,u}\}$ and $\{z_{i,r}, z_{j,s}, z_{l,t}\}$, this would mean that the system $S(v_1)$ contained two different blocks $\{x_i, x_j, x_p\}$ and $\{x_i, x_j, x_q\}$ with $p \ne q$, contrary to the definition.

We still have to verify that any element $z_{i,r}$ of $Z$ belongs to $r = \frac{v - 1}{2}$ blocks. The element $z_{i,r}$ enters $r_2 = \frac{v_2 - 1}{2}$ blocks together with $x_i$, and it appears in $r_1 = \frac{v_1 - 1}{2}$ blocks together with $y_r$. We compute now how many blocks in $S(v)$ contain $z_{i,r}$ and do not contain $x_i$ or $y_r$. We have just shown that two elements $z_{i,r}$ and $z_{j,s}$ determine the block uniquely, thus it suffices to calculate in how many ways it is possible to find a pair $z_{j,s}$ for a given element $z_{i,r}$ with $j \ne i$ and $s \ne r$. To this end we compute the total number of elements $v_1 \cdot v_2$ in $Z$ less the number of elements containing $x_i$ or $y_r$, including $z_{i,r}$ itself, and take a half of that amount, since the order of elements in a pair does not matter. This calculation yields $\frac{1}{2}(v_1v_2 - v_1 - v_2 + 1)$. Thus, the total number of blocks containing $z_{i,r}$ is

$$r = \frac{v_1 - 1}{2} + \frac{v_2 - 1}{2} + \frac{1}{2}(v_1v_2 - v_1 - v_2 + 1) = \frac{1}{2}(v_1v_2 - 1) = \frac{1}{2}(v - 1).$$

We have proved that the algorithm returns the Steiner triple system we sought. The triples with $r = s = t = 1$ form a subsystem isomorphic to $S(v_1)$, and the triples with $i = j = k = 1$ form a subsystem isomorphic to $S(v_2)$.
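The three rules of the algorithm translate directly into code. The Python sketch below is an illustration with hypothetical function names; it reads rule 3) as taking, for each block of $S(v_1)$ and each block of $S(v_2)$, all six ways of pairing their elements, which is our interpretation of the condition.

```python
from itertools import combinations, permutations

def is_steiner_triple_system(points, blocks):
    """Check that all blocks are triples of points and every pair of
    points lies in exactly one block."""
    points = set(points)
    seen = set()
    for B in blocks:
        if len(B) != 3 or not B <= points:
            return False
        for p in combinations(sorted(B), 2):
            if p in seen:
                return False
            seen.add(p)
    n = len(points)
    return len(seen) == n * (n - 1) // 2

def moore_product(points1, blocks1, points2, blocks2):
    """Build the product system on pairs (x, y) following rules 1)-3)."""
    blocks = []
    for A in blocks1:                      # rule 1: common second coordinate
        for y in points2:
            blocks.append(frozenset((x, y) for x in A))
    for B in blocks2:                      # rule 2: common first coordinate
        for x in points1:
            blocks.append(frozenset((x, y) for y in B))
    for A in blocks1:                      # rule 3: both coordinates vary
        a = sorted(A)
        for B in blocks2:
            for p in permutations(sorted(B)):
                blocks.append(frozenset(zip(a, p)))
    points = [(x, y) for x in points1 for y in points2]
    return points, blocks

s3 = ([1, 2, 3], [{1, 2, 3}])
fano = (list(range(1, 8)),
        [{1, 3, 7}, {1, 2, 4}, {2, 3, 5}, {3, 4, 6},
         {4, 5, 7}, {1, 5, 6}, {2, 6, 7}])
points9, blocks9 = moore_product(*s3, *s3)
points21, blocks21 = moore_product(*fano, *s3)
```

From two copies of $S(3)$ this produces a system with 9 points and 12 blocks, and from $S(7)$ and $S(3)$ a system with 21 points and 70 blocks, in agreement with (5.4.1).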

Theorem 5.4.4 (Moore). Let three natural numbers $v_1, v_2, v_3$ be given. If there exist systems $S(v_1)$ and $S(v_2)$, and either $v_3 = 1$ or there exists a system $S(v_3)$ such that the system $S(v_2)$ contains a subsystem isomorphic to $S(v_3)$, then there exists a system $S(v)$ of order $v = v_3 + v_1(v_2 - v_3)$, containing a subsystem of order $v_1$, a subsystem of order $v_3$, and $v_1$ subsystems of order $v_2$.

Proof. If $v_2 = v_3$, then $v = v_2$ and there is nothing to prove, for the given system $S(v_2)$ contains a subsystem isomorphic to $S(v_3)$. Thus, we assume $v_2 - v_3 \ge 1$, set $s = v_2 - v_3$, and use the union of the following $v_1 + 1$ sets,

$$X = \{x_1, x_2, \dots, x_{v_3}\}$$
$$Y_1 = \{y_{1,1}, y_{1,2}, \dots, y_{1,s}\}$$
$$\vdots$$
$$Y_{v_1} = \{y_{v_1,1}, y_{v_1,2}, \dots, y_{v_1,s}\}$$

as the set of elements of the system $S(v)$ under construction. We describe now an algorithm generating the system $S(v)$. A triple of elements from the union $X \cup Y_1 \cup \dots \cup Y_{v_1}$ is a block in $S(v)$ if and only if one of the following three mutually exclusive conditions holds true.

1) A triple $\{x_i, x_j, x_k\}$ is a block in $S(v)$ if this triple is a block in a system $S(v_3)$ derived from the base set $X$. If $v_3 = 1$, then this case is vacuous.

2) For each $i$, $1 \le i \le v_1$, we construct a system $S(v_2)$ from the elements of the set $X \cup Y_i$; this system exists by the assumption and contains a system $S(v_3)$ built from $X$. The system $S(v)$ will include all blocks of $S(v_2)$ except for those that belong to $S(v_3)$ and are listed in step 1) of the algorithm. These blocks contain no more than one element $x_j \in X$ and are either $\{x_j, y_{i,k}, y_{i,l}\}$ or $\{y_{i,k}, y_{i,l}, y_{i,m}\}$.

3) Finally, we construct a system $S(v_1)$ on the set of numbers $\{1, 2, \dots, v_1\}$. If $\{i, j, k\}$ is a block of this system, we include in $S(v)$ all triples $\{y_{i,x}, y_{j,y}, y_{k,t}\}$ such that the second subscripts satisfy the congruence $x + y + t \equiv 0 \pmod s$.

To prove that this system is a Steiner triple system $S(v)$, the reader can carry out an argument similar to the proof of Theorem 5.4.3.
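The special case $v_3 = 1$ of this algorithm can be sketched in Python. The sketch is an illustration under assumptions of ours: the function names are hypothetical, the system $S(v_1)$ is given on the points $1, \dots, v_1$, the system $S(v_2)$ is given on the points $0, \dots, v_2 - 1$ with point 0 playing the role of the single element $x$, and step 1) is skipped because it is vacuous.

```python
from itertools import combinations

def is_sts(points, blocks):
    """Every block is a triple and every pair of points lies in exactly one block."""
    pairs = set()
    for B in blocks:
        if len(B) != 3:
            return False
        for p in combinations(B, 2):
            key = frozenset(p)
            if key in pairs:
                return False
            pairs.add(key)
    n = len(points)
    return len(pairs) == n * (n - 1) // 2

def moore_sum_v3_is_1(blocks1, v1, blocks2, v2):
    """Build a system of order 1 + v1*(v2 - 1) on 'x' and pairs (i, a),
    1 <= i <= v1, 1 <= a <= s = v2 - 1."""
    s = v2 - 1
    blocks = []
    # step 2: a copy of S(v2) on {x} united with Y_i, for every i
    for i in range(1, v1 + 1):
        for B in blocks2:
            blocks.append(frozenset('x' if a == 0 else (i, a) for a in B))
    # step 3: for each block {i, j, k} of S(v1), all second subscripts
    # a, b, c in 1..s with a + b + c divisible by s
    for A in blocks1:
        i, j, k = sorted(A)
        for a in range(1, s + 1):
            for b in range(1, s + 1):
                c = (-(a + b)) % s or s
                blocks.append(frozenset({(i, a), (j, b), (k, c)}))
    points = ['x'] + [(i, a) for i in range(1, v1 + 1) for a in range(1, s + 1)]
    return points, blocks

s3_from_1 = [{1, 2, 3}]          # S(3) on the points 1, 2, 3
s3_from_0 = [{0, 1, 2}]          # S(3) on the points 0, 1, 2
fano = [{1, 3, 7}, {1, 2, 4}, {2, 3, 5}, {3, 4, 6},
        {4, 5, 7}, {1, 5, 6}, {2, 6, 7}]
fano_from_0 = [{x - 1 for x in B} for B in fano]

p7, b7 = moore_sum_v3_is_1(s3_from_1, 3, s3_from_0, 3)      # 7 = 1 + 3(3 - 1)
p15, b15 = moore_sum_v3_is_1(fano, 7, s3_from_0, 3)         # 15 = 1 + 7(3 - 1)
p19, b19 = moore_sum_v3_is_1(s3_from_1, 3, fano_from_0, 7)  # 19 = 1 + 3(7 - 1)
```

The three calls produce systems of orders 7, 15 and 19 with 7, 35 and 57 blocks, respectively.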

Now we can prove a criterion for the existence of Steiner triple systems.

Theorem 5.4.5. A Steiner triple system $S(v)$, $v \in \mathbb{N}$, exists if and only if $v \ge 3$ is admissible, that is, $v = 3$, or $v = 6t + 1$, or $v = 6t + 3$ with any $t \in \mathbb{N}$.

Proof. The necessity of these conditions has already been proven. To prove their sufficiency, we follow the recurrent argument of A. Hilton [27] and start by constructing $S(v)$ with all admissible $v \le 36$, that is, with $v$ = 3, 7, 9, 13, 15, 19, 21, 25, 27, 31, 33. Indeed, $S(3)$ is a trivial system with one block, the systems $S(7)$, $S(9)$, and $S(13)$ were presented in Section 5.3, and the existence of systems $S(15)$, $S(19)$, $S(21)$, $S(25)$, $S(27)$, $S(31)$, and $S(33)$ follows from Theorems 5.4.3–5.4.4 by virtue of the equations

$$15 = 1 + 7(3 - 1)$$
$$19 = 1 + 9(3 - 1)$$
$$21 = 7 \cdot 3$$
$$25 = 1 + 3(9 - 1)$$
$$27 = 3 \cdot 9$$
$$31 = 1 + 15(3 - 1)$$
$$33 = 3 + 3(13 - 3)$$

if one takes into consideration the demonstrated existence of $S(3)$, $S(7)$, $S(9)$, $S(13)$.

Next we present formulas expressing all bigger admissible values of $v$, that is, $v = 6t + 1$ and $v = 6t + 3$ with $v > 36$, through smaller admissible values of $v$, hence one can straightforwardly apply Theorem 5.4.4. If an admissible $v \ne 36t + 13$, then we have the following 11 cases,

$$v = 36t + 1 = 1 + 3((12t + 1) - 1)$$
$$v = 36t + 3 = 1 + (18t + 1)(3 - 1)$$
$$v = 36t + 7 = 1 + (6t + 1)(7 - 1)$$
$$v = 36t + 9 = 3 + (6t + 1)(9 - 3)$$
$$v = 36t + 15 = 1 + (18t + 7)(3 - 1)$$
$$v = 36t + 19 = 1 + (6t + 3)(7 - 1)$$
$$v = 36t + 21 = 3 + (6t + 3)(9 - 3)$$
$$v = 36t + 25 = 1 + 3((12t + 9) - 1)$$
$$v = 36t + 27 = 1 + (18t + 13)(3 - 1)$$
$$v = 36t + 31 = 1 + (18t + 15)(3 - 1)$$
$$v = 36t + 33 = 3 + 3((12t + 13) - 3).$$
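These eleven identities, and the fact that the orders appearing in them are admissible and smaller than $v$, can be checked mechanically. In the Python sketch below (an illustration with hypothetical names), each identity is read as $v = v_3 + v_1(v_2 - v_3)$ in the notation of Theorem 5.4.4; the assignment of the roles $v_1, v_2, v_3$ to the factors is our reading of the formulas.

```python
def admissible(v):
    """Admissible orders: v >= 3 and v congruent to 1 or 3 modulo 6."""
    return v >= 3 and v % 6 in (1, 3)

# remainder of v modulo 36 -> function of t returning (v1, v2, v3)
CASES = {
    1:  lambda t: (3, 12 * t + 1, 1),
    3:  lambda t: (18 * t + 1, 3, 1),
    7:  lambda t: (6 * t + 1, 7, 1),
    9:  lambda t: (6 * t + 1, 9, 3),
    15: lambda t: (18 * t + 7, 3, 1),
    19: lambda t: (6 * t + 3, 7, 1),
    21: lambda t: (6 * t + 3, 9, 3),
    25: lambda t: (3, 12 * t + 9, 1),
    27: lambda t: (18 * t + 13, 3, 1),
    31: lambda t: (18 * t + 15, 3, 1),
    33: lambda t: (3, 12 * t + 13, 3),
}

def check_case(rem, t):
    """Verify the identity for v = 36t + rem and that v1, v2 are
    admissible orders smaller than v."""
    v1, v2, v3 = CASES[rem](t)
    v = 36 * t + rem
    return (v3 + v1 * (v2 - v3) == v
            and admissible(v1) and admissible(v2)
            and v1 < v and v2 < v)

all_cases_hold = all(check_case(rem, t) for rem in CASES for t in range(1, 200))
```

The check covers $t = 1, \dots, 199$; the orders $v \le 36$ (the case $t = 0$) are treated separately at the beginning of the proof.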

If $v = 36t + 13$, a tempting simple approach would be to write $v = 7 + (6t + 1)(13 - 7)$; however, this does not work since, by EP 5.4.18, the system $S(13)$ cannot contain a subsystem of order 7, thus we cannot apply Theorem 5.4.4. Therefore, the case $v = 36t + 13$ must be split into several subcases. If $t$ is even, say $t = 2k$, then

$$v = 36t + 13 = 1 + (6k + 1)(13 - 1)$$

and Theorem 5.4.4 is again applicable. Suppose now that $t$ is odd and there exists $r \ge 1$ such that

$$t = 2^{2r-2} + 2^{2r-4} + \dots + 2^2 + 2^0.$$

If here $r = 1$, then $v = 49 = 7 \cdot 7$ and Theorem 5.4.3 works. If $r > 1$ is odd, that is, $r = 2s + 3$, $s \ge 0$, then

$$v = 36t + 13 = 9 + (18(2^{4s} + \dots + 2^0) + 1)(49 - 9)$$

while if $r$ is even, that is, $r = 2s + 2$, $s \ge 0$, then

$$v = 36t + 13 = 3 + (18(2^{4s} + \dots + 2^0) + 1)(13 - 3).$$

Therefore, in all these cases $v$ can be expressed through smaller admissible values of $v$, so that Moore’s theorems can be applied.

Finally, we have to consider the case of an odd $t$ with a representation

$$t = 2^{2r+\alpha_s} + \cdots + 2^{2r+\alpha_0} + 2^{2r-2} + 2^{2r-4} + \cdots + 2^2 + 1,$$

where $r \ge 1$, $0 < \alpha_0 < \alpha_1 < \cdots < \alpha_s$; among the terms $2^{2r+\alpha_1}, \dots, 2^{2r+\alpha_s}$ there also may be powers of 4, but $2^{2r+\alpha_0}$ is not such a power. Thus,

$$v = 36t + 13 = 1 + (3t + 1) \cdot 3 \cdot 2^2 = 1 + (3x + 1) \cdot 3 \cdot 2^{2r+2},$$

where $x = 2^{\alpha_0} + \cdots + 2^{\alpha_s}$. Since $\alpha_0 > 0$, $x$ must be an even number, hence the remainder after dividing $x$ by 4 is either 0 or 2. Therefore, if $t = 4n + 1$, then

$$v = 36t + 13 = 1 + (6n + 1)\bigl((3 \cdot 2^{2r+2} + 1) - 1\bigr),$$

and if $t = 4n + 3$, then

$$v = 36t + 13 = 1 + (18n + 15)\bigl((2^{2r+3} + 1) - 1\bigr).$$

To complete the proof, it only remains to notice that in the latter case

$$2^{2r+3} + 1 = 6 \cdot 4^r + 2(3 + 1)^r + 1 = 6l + 3.$$

Thus, any admissible $v$ can be expressed through smaller admissible values of $v$, and Theorems 5.4.3–5.4.4 apply.
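The closed-form identities used for $v = 36t + 13$ are easy to mistype, so a numerical check may be helpful. The sketch below is only an illustration under our own reading of the formulas: the helper names `t_of_r`, `sixteen_sum` and `check_subcases` are hypothetical, and the sum $2^{4s} + \cdots + 2^0$ is assumed to run over powers of 16.

```python
def t_of_r(r):
    """t = 2^(2r-2) + 2^(2r-4) + ... + 2^2 + 2^0, a sum of r powers of 4."""
    return sum(4 ** i for i in range(r))


def sixteen_sum(s):
    """2^(4s) + ... + 2^4 + 2^0, assumed to be a sum of powers of 16."""
    return sum(16 ** i for i in range(s + 1))


def check_subcases(limit=12):
    # t even, t = 2k
    for k in range(1, limit):
        assert 36 * (2 * k) + 13 == 1 + (6 * k + 1) * (13 - 1)
    # r = 1 gives v = 49 = 7 * 7
    assert 36 * t_of_r(1) + 13 == 7 * 7
    for s in range(limit):
        # r odd, r = 2s + 3
        v = 36 * t_of_r(2 * s + 3) + 13
        assert v == 9 + (18 * sixteen_sum(s) + 1) * (49 - 9)
        # r even, r = 2s + 2
        v = 36 * t_of_r(2 * s + 2) + 13
        assert v == 3 + (18 * sixteen_sum(s) + 1) * (13 - 3)
    for r in range(1, limit):
        # general odd t: t = 4^r * x + (sum of the r lowest powers of 4), x even
        for x in range(2, 2 * limit, 2):
            t = 4 ** r * x + t_of_r(r)
            assert 36 * t + 13 == 1 + (3 * x + 1) * 3 * 2 ** (2 * r + 2)
        # 2^(2r+3) + 1 has the form 6l + 3
        assert (2 ** (2 * r + 3) + 1) % 6 == 3
    return True
```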

Problem 5.4.2. Construct a system $S(7)$ by making use of Moore’s theorems.

Solution. If we divide the order $v = 7$ by 36, the remainder is 7, so that by the general algorithm of Theorem 5.4.5 we can use the representation

$$v = 36t + 7 = 1 + (6t + 1)(7 - 1)$$

with $t = 0$. However, it is more instructive here to write $7 = 1 + 3(3 - 1)$ and use the algorithm of Theorem 5.4.4 with $v_1 = v_2 = 3$ and $v_3 = 1$.

As in the proof of this theorem, we introduce four sets

$$X = \{x\},\quad Y_1 = \{y_{1,1}, y_{1,2}\},\quad Y_2 = \{y_{2,1}, y_{2,2}\},\quad Y_3 = \{y_{3,1}, y_{3,2}\}.$$

The set $X$ itself does not make a three-element block; however, the unions $X \cup Y_1$, $X \cup Y_2$ and $X \cup Y_3$ generate three blocks $\{x, y_{1,1}, y_{1,2}\}$, $\{x, y_{2,1}, y_{2,2}\}$ and $\{x, y_{3,1}, y_{3,2}\}$. To work out the third part of the algorithm of Theorem 5.4.4, one must consider the set $\{1, 2, \dots, v_1\} = \{1, 2, 3\}$. This set generates the only three-element set $\{1, 2, 3\}$. Thus, we must consider all possible triples $\{y_{1,k}, y_{2,l}, y_{3,m}\}$ and solve the congruence $k + l + m \equiv 0 \pmod 2$. This is equivalent to solving a Diophantine equation $k + l + m = 2s$ in integers $k, l, m$, where $s$ is any integer and $1 \le k, l, m \le 2$. Subject to the latter restrictions, the equation can be readily solved; it has four solutions, $(1, 1, 2)$, $(1, 2, 1)$, $(2, 1, 1)$, $(2, 2, 2)$. These four triples generate four more blocks, in addition to the initial blocks, namely,

$$\{y_{1,1}, y_{2,1}, y_{3,2}\},\quad \{y_{1,1}, y_{2,2}, y_{3,1}\},\quad \{y_{1,2}, y_{2,1}, y_{3,1}\},\quad \{y_{1,2}, y_{2,2}, y_{3,2}\}.$$

Setting here $x = 1$, $y_{1,1} = 2$, $y_{1,2} = 4$, $y_{2,1} = 3$, $y_{2,2} = 7$, $y_{3,1} = 6$, and $y_{3,2} = 5$, we arrive at the system $S(7, 7, 3, 3, 1)$ considered above. Other choices of the parameters give other, but isomorphic, block designs.
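The construction just described is small enough to replay by computer. The following sketch is ours and only illustrative: the function names `build_s7` and `is_steiner_triple_system` are hypothetical. It forms the three blocks $X \cup Y_i$ and the four blocks coming from the congruence $k + l + m \equiv 0 \pmod 2$, and then checks that every pair of the seven points lies in exactly one block.

```python
from itertools import combinations, product


def build_s7():
    """Blocks of S(7) obtained from 7 = 1 + 3(3 - 1), with the labels of the text."""
    x = 1
    # y[i, j] stands for y_{i,j}; the numeric labels are those chosen in the solution.
    y = {(1, 1): 2, (1, 2): 4, (2, 1): 3, (2, 2): 7, (3, 1): 6, (3, 2): 5}
    blocks = []
    # Blocks generated by the unions of X with Y_1, Y_2, Y_3.
    for i in (1, 2, 3):
        blocks.append(frozenset({x, y[i, 1], y[i, 2]}))
    # Blocks {y_{1,k}, y_{2,l}, y_{3,m}} with k + l + m even.
    for k, l, m in product((1, 2), repeat=3):
        if (k + l + m) % 2 == 0:
            blocks.append(frozenset({y[1, k], y[2, l], y[3, m]}))
    return blocks


def is_steiner_triple_system(points, blocks):
    """Check that every pair of points belongs to exactly one three-element block."""
    seen = set()
    for block in blocks:
        if len(block) != 3:
            return False
        for pair in combinations(sorted(block), 2):
            if pair in seen:        # some pair would be covered twice
                return False
            seen.add(pair)
    return seen == set(combinations(sorted(points), 2))
```

The same checker can be applied to any candidate list of triples, for example when experimenting with other labelings of $x$ and $y_{i,j}$.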

Problem 5.4.3. Construct a Steiner system $S(631)$.

Solution. Since $v = 631 = 6t + 1$ with $t = 105$, the value $v = 631$ is admissible and the system $S(631)$ exists. First we find what smaller admissible values are required by the algorithm. The algorithm depends upon the divisibility by 36; therefore, we divide $v = 631$ by 36, $631 = 17 \cdot 36 + 19$ with the remainder equal to 19, and use the appropriate representation from the chart, $v = 36t + 19 = 1 + (6t + 3)(7 - 1)$, with $t = 17$; thus, $v = 1 + 105 \cdot (7 - 1)$. We now want to apply Theorem 5.4.4 with $v = 631$, $v_3 = 1$, $v_2 = 7$, and $v_1 = 105$. The case $v_3 = 1$ is trivial, and the system $S(7)$ was presented above; thus, to apply Theorem 5.4.4, we need a system $S(105)$. By the same token, $105 = 36 \cdot 2 + 33$ with the remainder 33, that is, we have to use the representation $105 = 3 + 3(37 - 3)$. Keeping in mind the existence of the trivial system $S(3)$, to apply Theorem 5.4.4 we need to construct a system $S(37)$. However, $37 = 36 \cdot 1 + 1$ with the remainder 1, hence to get a system $S(37)$, we use the representation $37 = 1 + 3(13 - 1)$, where the components $S(3)$ and $S(13)$ have already been proven to exist.

To construct $S(631)$, we now work backward. First we use the systems $S(3)$ and $S(13)$ to construct $S(37)$ by Theorem 5.4.4 (cf. the preceding problem). Using $S(3)$ and $S(37)$, we construct $S(105)$, and finally from $S(7)$ and $S(105)$ we derive $S(631)$.
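The search for the required smaller orders can be automated. The sketch below is only an illustration with hypothetical names (`chart`, `plan`): it reads each chart entry as $v = v_3 + v_1(v_2 - v_3)$, treats every order $v \le 36$ as already available, and deliberately does not handle the remainder 13, which needs the separate subcases of the proof.

```python
def chart(v):
    """Return (v3, v1, v2) with v = v3 + v1 * (v2 - v3) for admissible v > 36."""
    t, residue = divmod(v, 36)
    table = {
        1: (1, 3, 12 * t + 1),
        3: (1, 18 * t + 1, 3),
        7: (1, 6 * t + 1, 7),
        9: (3, 6 * t + 1, 9),
        15: (1, 18 * t + 7, 3),
        19: (1, 6 * t + 3, 7),
        21: (3, 6 * t + 3, 9),
        25: (1, 3, 12 * t + 9),
        27: (1, 18 * t + 13, 3),
        31: (1, 18 * t + 15, 3),
        33: (3, 3, 12 * t + 13),
    }
    if residue not in table:
        # Remainder 13 (or a non-admissible v) is outside this sketch.
        raise ValueError(f"no chart entry for v = {v}")
    return table[residue]


def plan(v, steps=None):
    """List the steps (v, v3, v1, v2) needed to reach orders not exceeding 36."""
    if steps is None:
        steps = []
    if v <= 36:
        return steps              # small systems are assumed to be known
    v3, v1, v2 = chart(v)
    steps.append((v, v3, v1, v2))
    for part in (v1, v2):
        plan(part, steps)
    return steps
```

Reading the list returned by `plan(631)` from the end to the beginning gives the order in which the systems are built in the solution.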



EP 5.4.1. 1) Write down all ordered triples from eight digits $1, 2, \dots, 7, 8$.

2) Find the number of triples consisting of three different digits (without repetition).

3) Find the number of triples consisting of two different digits, like (1, 2, 1).

4) Find the number of triples consisting of the same digit, similar to (3, 3, 3).

5) What is the largest set of triples such that any pair of digits belongs to at most one triple?

6) What is the smallest set of triples such that any pair of digits belongs to at least one triple?

EP 5.4.2. Write down all triples of the elements of the set $X = \{1, 2, 3, 4, 5\}$

1) Such that each pair of elements of $X$ belongs to at most one triple and the number of triples is the largest.

2) Such that each pair of elements of $X$ belongs to at least one triple and the number of triples is the smallest.

EP 5.4.3. Consider a system of triples

$$S = \{\{1, 2, 3\}, \{1, 4, 5\}, \{1, 6, 7\}, \{2, 4, 6\}, \{2, 5, 7\}\}.$$

What triples from the list

$$\{2, 3, 4\},\ \{5, 6, 7\},\ \{3, 4, 6\},\ \{3, 5, 7\},\ \{3, 4, 7\},\ \{3, 6, 7\},\ \{4, 5, 6\},\ \{1, 2, 3\}$$

should be added to $S$ to make it a Steiner system $S(7)$?

EP 5.4.4. Complete the solution of Problem 5.4.3 by constructing explicitly all blocks of $S(631)$. How many blocks does this BIBD contain?

EP 5.4.5. The following figure, called the Fano plane, has occurred in various areas of mathematics. How does it represent the Steiner triple system $S(7, 7, 1)$?

[Figure 5.2. The Fano plane.]


EP 5.4.6. Show that the Steiner triple systems can be generated by decompositions of a complete graph $K_n$ into triangles without common edges.

EP 5.4.7. There are 100 professors at a college. Every day three of them have lunch together at the college cafeteria. Is it possible to schedule their visits during some period of time so that every two of them lunch together exactly once?

EP 5.4.8. Let $S(w)$ be a Steiner subsystem of the Steiner system of triples $S(v)$. Prove that $w \le \frac{1}{2}(v - 1)$.

EP 5.4.9. Prove that if two Steiner subsystems of a Steiner system of triples have a nonempty intersection, then the intersection also is a Steiner system of triples.

EP 5.4.10. Find the necessary conditions, similar to (5.4.2), of the existence of systems of quadruples $S(v, b, 4, r, \lambda)$. Specialize these conditions when $\lambda = 1$.

EP 5.4.11. Do the BIBDs $S(7, 7, 4, 4, 1)$ and $S(13, 13, 4, 4, 1)$ exist? If either of them exists, construct it.

EP 5.4.12. Nine professors must proctor 12 exams in 4 days, so that each test is observed by a committee of 3 professors. Compose a schedule of the exams such that no pair and no triple of the professors meet more than once during the exams.

EP 5.4.13. At the Test College students have four exams every day during seven days in a row. Is it possible to arrange eight professors to proctor these exams in pairs so that the same pair of professors does not proctor two exams?

EP 5.4.14. An ice hockey team has nine forwards. The team plays four games in a row. Prove that it is possible to set up the triples of field players so that no two forwards play twice in the same triple.

EP 5.4.15. Every year the Combi Club holds a meeting where each member of the Club must present his or her results for the past year to every other member of the Club. However, the Club has a very small classroom, where only 3 people can be present at a time. If every such small meeting of 3 members lasts 30 minutes without a break between 3-party meetings, and this year there are 15 club members, then how many such meetings are necessary and for how long will the room be occupied?

EP 5.4.16. Solve the previous problem if the club has a) 14, b) 16 members.

Answers to Selected Problems

Section 1.1

1.1.1. f is neither, g injective, h surjective, k bijective.

1.1.9. Hint: $\frac{1}{k(k+1)} = \frac{1}{k} - \frac{1}{k+1}$.

1.1.10. Hint: There are fewer digits than English characters.

1.1.11. $N_1 \cup N_3 = N_3$, $N_1 \cap N_3 = N_1$, $N_1 \setminus N_3 = \emptyset$, $N_3 \setminus N_1 = \{2, 3\}$.

1.1.13. Any set $Z'$ containing zero satisfies $Z'_e \cup Z_o \cup Z' = Z$, but only the set $Z'' = \{0\}$ makes $Z'_e \cup Z_o \cup Z''$ a partition of $Z$.

1.1.17. 8.

1.1.21. 1) 39 916 800, 2) 40 200.


1.1.1. 1) 25770

, 2) 21760

, 4) 1.

1.1.11. One zero; 12 zeros; 23 zeros.

1.1.27. N0.

1.1.30. 22n

.

1.1.31. Hint: Integrate by parts.

Section 1.2

1.2.1. 20.

Exercises and Problems 1.2.

1.2.2. 3 � t C 1 � 4 D 19.

1.2.3. $(n!)^2$, assuming that circular shifts do not generate a new seating.

1.2.6. $4 \cdot (3! + 3 + 3 + 1) = 52$.

1.2.7. a) $T_0 \cup T_1 \cup T_2$; b) $N_1 \cup P \cup P^c$.

1.2.11. The number of divisors is $4 \cdot 5 \cdot 6 \cdot 7 = 840$; their sum is $\frac{(2^4 - 1)(3^5 - 1)(5^6 - 1)(7^7 - 1)}{1 \cdot 2 \cdot 4 \cdot 6}$.

1.2.15. [(9 999-1 000)/7]+1=1 286.

1.2.16. 2n.n�5/2C 2n.n�6/

2D 2.2n � 11/.


1.3.3. $A(6, 4) = 360$.

1.3.5. $9!$.

1.3.6. $10! - 9! = 9 \cdot 9!$.


1.3.7. Hint: Such numbers must contain either one 4, or one 3 and one 1, or two 2, or one 2 and two 1, or four 1, and a complementary number of zeros; the answer is 220.

1.3.8. Thus, the numbers are 24 permutations of these four digits, and each digit appears at every position 6 times; the answer is 66660.

1.3.10. $s$ parallel streets divide the town into $s + 1$ infinite strips. The first slanted street adds $s + 1$ blocks, the second one adds $s + 2$, etc. In general, we have $(t + 1)(s + 1) + \frac{1}{2}t(t - 1)$ blocks.

1.3.14. 2 � 8Š � 6Š.

1.3.15. 2 � 8Š � 6Š.

1.3.19. Hint: In how many ways can you place $n$ different objects, without ordering, in either two or three different boxes?

1.3.20. 2) After cancelling by $P(k)$, the equation becomes $n \cdots (k + 1) = 5$, thus $k = 4$ and $n = 5$.


1.4.1. 1) $A(12, 8)$; 2) $\bar{A}(12, 8)$; 3) 1; 4) 1.

1.4.3. Hint: The ratio $\frac{n(n+1)(n+2)\cdots(n+k-1)}{k!} = C(n + k - 1, k)$.

1.4.5. Hint: Set $d_k$ to be the largest integer such that $C(d_k, k) \le n$.

1.4.8. Hint: Find the largest integer $i$ such that $i! \le n$.

1.4.10. $\frac{C(6, 4)}{2^6} = \frac{15}{64}$.

1.4.12. Let 2 005 have $k_1$ preimages, 2 006 have $k_2$, and 2 007 have $k_3$; then $k_1 + k_2 + k_3 = 2\,006$, and the number $2\,005k_1 + 2\,006k_2 + 2\,007k_3$ must be even. The latter implies that $k_1$ and $k_3$ must have the same parity, while the parity of $k_2$ is immaterial. Therefore, we have to find how many ways there exist to partition the difference $2\,006 - k_2$, $k_2 = 0, 1, \dots, 2\,006$, into two addends, which are, necessarily, of the same parity. The answer is $1\,004 \cdot 1\,003$. The number of such functions with an odd sum is $3^{2006} - 1\,004 \cdot 1\,003$.

1.4.14. $C(n, 2)$.

1.4.23. $C(10, 3)$.

1.4.28. $P!\,C(P + 1, S)$, thus $P + 1 \ge S$; or $(P - 1)!\,C(P, S)$, thus $P \ge S$.

1.4.43. 1) $C(12, 4)$; 2) $\bar{C}(12, 4)$.

1.4.52. $C(n, 2)$.

1.4.74. $\frac{1}{2} \cdot 30 \cdot 27 = 405$.

1.4.75. The equation $\frac{1}{2}n(n - 3) = 35$ gives $n = 10$.

1.4.78. 6.


1.5.1. 1) Place a ball in every urn (there is only one way to do that), and then put the two remaining balls in any way, thus $C(4, 2) + C(4, 1) = 10$.

2) $5C(4, 2) = 30$.

3) $4 + 5C(4, 2) + 4 \cdot 10 + 10 = 84$.

4) 4.

1.5.2. $35!/(5!)^7$.

1.5.7. 1) $5!/3!$; 2) $2 \cdot 5!/3!$.

1.5.8. $\frac{1}{2} \cdot \frac{11!}{1!\,(4!)^2\,2!}$.

1.5.11. $\frac{30!}{(3!)^{10}\,10!}$; $\frac{30!}{(10!)^3\,3!}$.

1.5.21. $\frac{8!}{2! \cdot 1! \cdot 5!}$.

1.5.24. The last two digits must be 12, 24, 32, 44, 52, 64, 72, 84, 92; for each of these pairs the first two digits can be chosen in $P(3, 2) = 6$ ways. Thus, there are $9 \cdot 6 = 54$ numbers.

1.5.26. 1) $k^n$; 2) $C(n + k - 1, n)$; 3) $C(k, n)$ (if $k \ge n$).

Section 1.6

1.6.1. 2) $S = \{-\$1\ (\text{Loss}),\ 0,\ \dots,\ \$9\,999\}$.

1.6.3. 1) These events are not disjoint.

2) The largest probability is 1 if P(Movie AND Restaurant) = 0.3, the smallest probability is 0.7 if P(Movie AND Restaurant) = 0.6.

3) We must assign the probability P(Movie AND Restaurant).

Exercises and Problems 1.6.

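1.6.4. $\frac{1}{2} \cdot \frac{2}{3} = \frac{1}{3}$.

1.6.6. $\frac{1}{26^2 \cdot 10^4}$.

1.6.7. $P(9) + 8 \cdot A(9, 8) = 9 \cdot 9!$.

1.6.30. $2 \cdot \frac{1}{3} \cdot \frac{2}{3} + \left(\frac{1}{3}\right)^2 = \frac{5}{9}$.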

1.6.42. To make the average 85, the fifth score must be 87. Thus, the first probability is 1/21,the second one is 9/21, and the third one is 0.


2.1.3. 1) Yes, it is possible to partition the graph into 10 disjoint subgraphs isomorphic to $K_4$.

2) No, due to Lemma 2.1.12.

2.1.4. 1) Yes. 2) No, again by Lemma 2.1.12.

2.1.5. Yes.

2.1.10. Hint: In how many ways is it possible to split 2n vertices into n unordered pairs?

2.1.11. Hint: Use again Lemma 2.1.12.

2.1.12. $d \le v - 1$ and $d \cdot v$ must be an even number.

2.1.16. If two Triplanians exchange a handshake, there must be a third Triplanian exchanging handshakes with both of them, thus we have a triangle of vertices with the sum of degrees equal to 6. Therefore, the handshaking lemma says that the sum of degrees of all the vertices for any graph on Triplan is a multiple of 6.


Section 2.2

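2.2.3. 1) $A^{\top} = \begin{pmatrix} 0 & 0 \\ 1 & 0 \\ 2 & 0 \end{pmatrix}$

2) $A \cdot C = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$. The product $C \cdot A$ is undefined.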


2.2.1. Hint: The five-element set of vertices of $g$ has $2^5 - 1 = 31$ non-empty subsets.

2.2.2. The first and the second diagrams are isomorphic.

2.2.3. G0 consists of two connected components, G1 and G2.

2.2.8. 1) $C(n, 3) + C(n, 4) + \cdots + C(n, n) = 2^n - 1 - n - n(n - 1)/2$

2) $C(5, 3) + C(5, 4) = 15$

2.2.12. 1) n � 2

2) 3 walks if the vertices are in the opposite components of the graph, 0 otherwise.

2.2.21. Hint: split the graph in two or more connected components.

Section 2.3

2.3.2. Hint: You can argue by contradiction and use Lemma 2.2.10.

2.3.4. Since the order of G0 is 5, its spanning tree must have 4 edges. We start with any edge of weight 1 but cannot include all four such edges, for three of them make a cycle. Hence we select any three edges of weight 1 and append any of the two edges of weight 2. There are $(C(4, 3) - 1) \cdot 2 = 6$ minimum spanning trees, each of weight 5.


2.3.2. The trees.

2.3.4. Hint: Apply Theorem 2.3.2.

2.3.5. $\{e_1, e_4, e_5\}$, $\{e_1, e_2, e_5\}$, $\{e_1, e_3, e_5\}$, $\{e_2, e_5, e_4\}$, $\{e_3, e_5, e_4\}$.

2.3.8. 1.

2.3.12. $p - 1$.

2.3.16. By EP 2.3.15, $67 - 35 = 32$ trees.

2.3.23. $3\,000 - (300 - 1) = 2\,701$.

Section 2.4

2.4.1. Such a graph does not exist by Corollary 2.1.13.

2.4.2. The graph on the right (“an open envelope”) is semi-Eulerian but not Eulerian, the left graph is neither.



2.4.2. Hint: Count passes through a vertex.

2.4.7. If we number the vertices of $K_5$ consecutively in either order by $v_1, v_2, v_3, v_4, v_5$, then a possible Hamiltonian circuit goes consecutively through the vertices $v_1, v_3, v_5, v_2, v_4, v_1$.


2.5.1. The answer to all three questions is positive.

2.5.3. Here $p = 6$, thus $q = \frac{1}{2}(5 \cdot 3 + 1) = 8$. By (2.5.1), $f = 2 + 8 - 6 = 4$ including the unbounded component.

2.5.4. No, the road map for this area must be isomorphic to the complete graph $K_5$, which is not planar.

2.5.10. Any connected component of the graph must have at least $(p - 1)/2 + 1 = (p + 1)/2$ edges, thus if the graph has at least two components, then the complement cannot have more than $p - (p + 1)/2 = (p - 1)/2$ edges, a contradiction.


3.1.3. Four clusters: DE, FL, LA, MD, MI, SC, AL, GA, KY, MO, NC, TN, VA, WV, TX.Five clusters: DE, FL, LA, AL, MD, MI, SC, GA, KY, MO, NC, TN, VA, WV, TX.

3.1.4. This dissimilarity value is 3.


3.2.6. The problem contains ties starting from the dissimilarity of 2. Hence, the first-level clustering is unique: $\{\{x_2, x_5\}, \{x_1\}, \{x_3\}, \{x_4\}, \{x_6\}\}$; however, the second-level clustering depends upon which edge with the dissimilarity of 2 is chosen first. This clustering may be $\{\{x_2, x_3, x_5\}, \{x_1\}, \{x_4\}, \{x_6\}\}$ or $\{\{x_2, x_5\}, \{x_4, x_6\}, \{x_1\}, \{x_3\}\}$. In turn, these clusterings lead to different clusterings of the next level. After that, the coming conjoint clustering is, of course, unique.


4.1.3. 1) The smallest number is 83.

2) The longest such sequence of positive numbers consists of 107 numbers, and is infinite if negative numbers are allowed.

4.1.4. By Theorem 4.1.1, the union contains

$$17 + 23 + 41 + 45 + 56 - 6 \cdot C(5, 2) + 4 \cdot C(5, 3) - 0 \cdot C(5, 4) = 162 \text{ elements.}$$

4.1.24. 1) The power-set contains $4^3 = 64$ maps, among them there are $A(4, 3) = 24$ injective and (since $3 < 4$) no surjective or bijective maps; thus $64 - 24 = 40$ are neither injective nor surjective.
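These counts are small enough to confirm by listing all maps. The following sketch is ours (the function name `classify_maps` is hypothetical); it assumes, as the numbers $4^3$ and $A(4, 3)$ suggest, that the maps go from a three-element set to a four-element set.

```python
from itertools import product


def classify_maps(domain_size=3, codomain_size=4):
    """Count all, injective and surjective maps between two finite sets."""
    total = injective = surjective = 0
    # A map is encoded by the tuple of images of the domain elements.
    for images in product(range(codomain_size), repeat=domain_size):
        total += 1
        distinct = len(set(images))
        if distinct == domain_size:      # no two elements share an image
            injective += 1
        if distinct == codomain_size:    # every codomain element is hit
            surjective += 1
    return total, injective, surjective
```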



4.2.2. 4) Hint: Insert the latter formula in (4.2.5) and change the order of summation.

4.2.4. Hint: straightforward substitution.

4.2.5. Hint: Choose in EP 4.2.4 $Q_m(t) = t^m$, $m = 0, 1, \dots$, compute the corresponding polynomials $P_m(t)$, and use the Binomial Theorem.

Section 4.3

4.3.2. $p\{C(p + k - n - 2, p - 2) + C(p + k - n - 1, p - 2) + \cdots + C(p + 2n - k - 2, p - 2)\}$

4.3.8. Hint: Use the binomial convolution (Definition 4.3.3) and replace $\sqrt[n]{|a_n|}$ in (4.3.3) by $\sqrt[n]{|a_n|/n!}$.

4.3.9. a0 D a1 D 1, a2 D � � � D a6 D �2, a7 D � � � D a20 D �6, a21 D � � � D a44 D 6,a45 D � � � D 1

4.3.14. $P_0(I, t) = P_1(I, t) = \cdots = 1$; $P_k(S, t) = 1 + t + t^2 + \cdots + t^k = (t^{k+1} - 1)/(t - 1)$, $k = 0, 1, \dots$

4.3.16. We have to compute the coefficient of $t^{10}$ in the series

$$(t + t^2 + \cdots)^5 (1 + t + t^2 + \cdots) = t^5 (1 + t + t^2 + \cdots)^6,$$

that is, the coefficient of $t^5$ in $(1 + t + t^2 + \cdots)^6$, which is $C(10, 5)$.
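The coefficient can also be obtained by multiplying truncated series directly. The sketch below is only a check of the arithmetic, with hypothetical helper names `poly_mul` and `coefficient_t10`; series are stored as coefficient lists cut off at degree 10, which is enough because higher terms cannot influence the coefficient of $t^{10}$.

```python
from math import comb


def poly_mul(a, b, max_deg):
    """Multiply two coefficient lists, discarding terms above max_deg."""
    out = [0] * (max_deg + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > max_deg:
                break
            out[i + j] += ai * bj
    return out


def coefficient_t10():
    max_deg = 10
    positive = [0] + [1] * max_deg        # t + t^2 + ... (truncated)
    nonnegative = [1] * (max_deg + 1)     # 1 + t + t^2 + ... (truncated)
    result = [1] + [0] * max_deg          # the constant series 1
    for _ in range(5):
        result = poly_mul(result, positive, max_deg)
    result = poly_mul(result, nonnegative, max_deg)
    return result[10]
```

Comparing `coefficient_t10()` with `comb(10, 5)` confirms the stated value.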


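4.3.1. 1) $\frac{1}{1-t}$; 2) $t^3(1 + t + t^2)$; 3) $\frac{t^3(1 + t + t^2)}{1 - t^6}$; 4) $\frac{1 + t + t^2}{1 - t^6}$; 5) $1 + 5t + 10t^2 + 10t^3 + 5t^4 + t^5$; 6) $\frac{1}{1+t}$; 7) $\frac{1 - (\cos\alpha)t}{1 - 2(\cos\alpha)t + t^2}$; 8) $\frac{(\sin\alpha)t}{1 - 2(\cos\alpha)t + t^2}$.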

Section 4.4

4.4.2. .1 � t /�1.1 � t /�1 D 1C � � � C 3t78 C � � � .


4.4.2. The solution of equation (4.4.13) with the initial values f .0/ D f .1/ D 1 is f .2/ D1; f .3/ D 2; f .4/ D 3; f .5/ D 5; f .6/ D 8; : : : .

4.4.3. These equations follow either from equation (4.4.13) or from the preceding problem.For example, to derive the equation in 5), we can rewrite its left-hand side as f .n/C2f .n � 1/C f .n � 2/ � 1. Comparing the latter with the right-hand side, we get anequation

f .n/ � 1 D f .n � 1/C f .n � 2/ � 1 D f .0/C � � � C f .n � 2/;

which is the equation in 5) with n instead of nC 2.


4.4.5. By induction, $g(n) = \alpha a^n$, $n = 0, 1, 2, \dots$, thus $g(t) = \alpha e^{at}$, and the conclusion follows.

4.4.7. Hint: Expand the determinant over the first column.

4.4.9. f .n/.

4.4.10. Hint: Compose a recurrent equation for the number in question.

4.4.13. Hint: Estimate the number $\frac{\sqrt{5} - 1}{2\sqrt{5}}$.

4.4.14. $L_n = \left(\frac{1 + \sqrt{5}}{2}\right)^n + \left(\frac{1 - \sqrt{5}}{2}\right)^n$, $L_0 = 2$, $L_1 = 1$, $L_2 = 3$, $L_3 = 4$, $L_4 = 7$, $L_5 = 11$.
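The listed values can be compared with the closed formula numerically. The sketch below is ours, with hypothetical function names; it assumes that the Lucas numbers satisfy the same recurrence $L_n = L_{n-1} + L_{n-2}$ as the Fibonacci numbers, with $L_0 = 2$ and $L_1 = 1$, and it uses floating-point arithmetic, so the comparison is meaningful only for moderate $n$.

```python
from math import sqrt


def lucas_recurrence(n):
    """L_n computed from L_0 = 2, L_1 = 1 and L_n = L_(n-1) + L_(n-2)."""
    a, b = 2, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def lucas_closed_form(n):
    """L_n from the formula with the two roots (1 + sqrt 5)/2 and (1 - sqrt 5)/2."""
    phi = (1 + sqrt(5)) / 2
    psi = (1 - sqrt(5)) / 2
    return round(phi ** n + psi ** n)   # rounding removes floating-point noise
```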

4.4.15. Follows from EPs 4.4.2 and 4.4.14.

4.4.17. 4) x.n/ D aC b.�3/n C 532n.

4.4.25. Hint: Notice that the sequence satisfies equation (4.4.13).

4.4.41. 1) $n 2^{n+2} - (n + 1)2^{n+1} + 2$.

Section 4.5

4.5.2. G2 if m D 2 and Ge .


4.5.2. $(3, 2, 1, 0, 0, 0, 0, 0, 0, 0)$, $x_1^3 x_2^2 x_3$.

4.5.7. No, since F is not a group.

4.5.16. The ride is unique.

4.5.17. In both cases, $2 \cdot 4 \cdot 2 \cdot 2^3 = 128$.

4.5.21. $g_1 = (1, 2)(3, 4)$, $g_2 = (1, 2)(1, 3)(1, 4)$, $g_3 = (1, 2)(3, 3)(4, 4)$, $g_4 = (1, 1)(2, 2)(3, 4)$.

Section 5.1

5.1.7. Since $\deg F = 5$, at least three incident vertices must have the same label.

5.1.11. $|X| = \sum_{y \in Y} |f^{-1}(\{y\})| \le k|Y|$, since for different $y$ the preimages $f^{-1}(\{y\})$ are disjoint.


5.1.1. A week consists of 7 days.

5.1.2. 1) 21; 2) 36; 3) 33; 4) 13.

5.1.3. 1) Yes, since $37 = 3 \cdot 12 + 1$. 2) No.

5.1.5. Parallel translation does not change the angles between the lines, hence we can assume that these lines intersect at a point. Thus, 11 lines make 22 angles, and if each angle is at least $17^\circ$, then the total angle would be at least $22 \cdot 17^\circ = 374^\circ > 360^\circ$.

5.1.11. 6 � 3n�3.

5.1.16. There are only four different outcomes of the test.


5.1.25. There are 14 differences, 1; 2; : : : ; 14, and the 9-element set 1, 2, 3, 4, 5, 6, 13, 14,15 generates all the differences but the 6. Thus we need to select at least 10 numbers.

5.1.34. There are only 11 different remainders after dividing any integer by 11.


5.2.1. 1) Yes, by Theorem 5.2.7 with m D 2.

5.2.8. nŠ.

5.2.9. C.m � n; k/ if m � n � k and 0 otherwise.

5.2.10. The simplest example is the case $n = 1$ and the $2 \times 2$ matrix containing only 1s; these four 1s cannot be covered by one row and one column.

5.3.3. We must build the block design $S(20, b, 2, 19, 1)$, which clearly exists by (5.3.2) with $b = 190$ blocks.

5.3.5. Hint: Reconstruct a BIBD from $M$.

5.3.8. Hint: Multiply the $i$th row of the determinant by $b_i$ and add the rows.

5.3.9. Hint: Prove that $y$ and $z$ must have the same parity, that is, either both are odd or both are even, and consider these cases separately.

5.3.10. $S(43, 43, 7, 7, 1)$ does not exist due to Theorem 5.3.7 and the previous problem; for $S(15, 21, 5, 7, 2)$, condition (5.3.2) fails.


5.4.3. Triples $\{3, 4, 7\}$ and $\{3, 5, 6\}$.

5.4.4. The six segments and the circumference represent seven blocks of $S(7)$.

5.4.7. 100 is not an admissible value for $S(v)$.

5.4.10. $$\begin{cases} \lambda(v - 1) \equiv 0 \pmod 3 \\ \lambda v(v - 1) \equiv 0 \pmod{12} \end{cases}$$

If $\lambda = 1$, then $v \equiv 1 \pmod{12}$ or $v \equiv 4 \pmod{12}$.
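The specialization to $\lambda = 1$ can be verified by running through the residues modulo 12, since both conditions depend only on $v \bmod 12$. The following sketch is ours and only illustrative; the function name `admissible_residues` is hypothetical.

```python
def admissible_residues(lam=1):
    """Residues v mod 12 satisfying both necessary conditions for the given lambda."""
    return [
        v for v in range(12)
        if (lam * (v - 1)) % 3 == 0 and (lam * v * (v - 1)) % 12 == 0
    ]
```

For $\lambda = 1$ the function returns the residues 1 and 4, in agreement with the answer.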

5.4.11. $S(13, 13, 4, 4, 1)$ exists; however, $S(7, 7, 4, 4, 1)$ does not, since for these values of parameters the necessary condition (5.3.3) fails.

5.4.12. Hint: Consider $S(9, 12, 3, 4, 1)$.

5.4.17. Hint: Fix a point outside $S(v)$ and count all the triples where this point meets elements of $S(v)$ and also other elements.

Bibliography

[1] M. Aigner, Combinatorial Theory. Springer-Verlag, Berlin, New York, 1979.

[2] M. Aigner, Catalan and other numbers: a recurrent theme. In Algebraic Combinatorics and Computer Science, 347–390. Springer Italia, Milan, 2001.

[3] G.E. Andrews, The Theory of Partitions. Cambridge Univ. Press, Cambridge, 1998.

[4] Applications of Discrete Mathematics, J.G. Michaels and K.H. Rosen, Editors. McGraw-Hill, Inc., New York, 1991.

[5] N.G. De Bruijn, Pólya’s theory of counting. In E. Beckenbach, Ed. Applied Combinatorial Mathematics, 144–184. John Wiley and Sons, New York, 1964.

[6] C. Berge, The Theory of Graphs and its Applications. Wiley, New York, 1964.

[7] C. Berge, Principles of Combinatorics. Academic Press, New York, 1971.

[8] B. Bollobás, Modern Graph Theory. Springer Verlag, 1998.

[9] P.J. Cameron, Combinatorics: Topics, Techniques, Algorithms. Cambridge Univ. Press, Cambridge, 1994.

[10] C.C. Chen, K.M. Koh, Principles and Techniques in Combinatorics. World Sci. Publ. Co., Inc., Singapore, 1992.

[11] Combinatorial Analysis. Problems and Exercises. K.A. Rybnikov, Ed., Nauka, Moscow, 1982 (Russian).

[12] C. Cox, P. Hansen, B. Julesh, Eds. Partitioning Data Sets. AMS, 1995.

[13] G.P. Egorychev, Integral Representation and the Computation of Combinatorial Sums. AMS, Providence, RI, 1984.

[14] P. Erdős, J. Spencer, Probabilistic Methods in Combinatorics. Acad. Press, New York, 1974.

[15] B. Everitt, Cluster Analysis. Heinemann Educational Books, Portsmouth, NH, 1974.

[16] W. Feller, An Introduction to Probability Theory and its Applications, Vol. 1. John Wiley & Sons, Inc., New York, 1967; Vol. 2, 1971.

[17] L.R. Ford, Jr., D.R. Fulkerson, Flows in Networks. Princeton Univ. Press, Princeton, NJ, 1962.

[18] A.D. Gordon, Classification. Chapman and Hall, 1981.

[19] I.P. Goulden, D.M. Jackson, Combinatorial Enumeration. John Wiley & Sons, New York, 1983.

[20] R.L. Graham, M. Grötschel, L. Lovász (eds), Handbook of Combinatorics, Vol. 1–2. Elsevier, Amsterdam, 1995.


[21] M. Hall Jr., Combinatorial Theory, 2nd Ed. John Wiley & Sons, Inc., New York, 1986.

[22] P.R. Halmos, Naive Set Theory. Van Nostrand, Princeton, N.J., 1960.

[23] F. Harary, E.M. Palmer, Graphical Enumeration. Academic Press, New York, 1973.

[24] G.H. Hardy, E.M. Wright, An Introduction to the Theory of Numbers, 4th Ed. Oxford Clarendon Press, Oxford, 1960.

[25] J.A. Hartigan, Clustering Algorithms. John Wiley & Sons, Inc., New York, 1975.

[26] J. Herman, R. Kucera, J. Šimša, Problems in Combinatorics, Arithmetic, and Geometry. Springer-Verlag, New York, 2003.

[27] A.J.W. Hilton, A simplification of Moore’s proof of the existence of Steiner triple systems. J. of Combinat. Theory (A), Vol. 13 (1972), 422–425.

[28] L.J. Hubert, Some applications of graph theory to clustering. Psychometrika, Vol. 39 (1974), 283–309.

[29] A.K. Jain, R.C. Dubes, Algorithms for Clustering Data. Prentice Hall, 1988.

[30] L. Kaufman, P.J. Rousseeuw, Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley & Sons, Inc., New York, 1990.

[31] D.E. Knuth, The Art of Computer Programming, Vol. 1. Addison-Wesley Publ. Company, Reading, MA, 1973.

[32] S. Lang, Undergraduate Algebra, 2nd Ed. Springer Verlag, 1990.

[33] L. Lovász, Combinatorial Problems and Exercises. North Holland, Amsterdam, 1993.

[34] G.E. Martin, Counting: The Art of Enumerative Combinatorics. Springer, New York, 2001.

[35] D. Merlini, R. Sprugnoli, M.C. Verri, Lagrange inversion: when and how. Acta Appl. Math., Vol. 94 (2006), 233–249.

[36] B. Mirkin, Mathematical Classification and Clustering. Kluwer, Dordrecht, 1996.

[37] W.K. Nicholson, Elementary Linear Algebra, 2nd Ed. McGraw-Hill Ryerson, Toronto, 2004.

[38] I. Niven, Mathematics of Choice. MAA, Washington, DC, 1965.

[39] G. Pólya, Kombinatorische Anzahlbestimmungen für Gruppen, Graphen und chemische Verbindungen. Acta Math., Vol. 68 (1937), 145–254.

[40] G. Pólya, G. Szegö, Problems and Theorems from Analysis, Vol. 1, 3rd Ed. Springer, Berlin, 1964.

[41] J. Riordan, An Introduction to Combinatorial Analysis. John Wiley & Sons, Inc., New York, 1958.

[42] G.-C. Rota, Finite Operator Calculus. Academic Press, New York, 1975.

[43] H.J. Ryser, Combinatorial Mathematics. John Wiley & Sons, Inc., New York, 1963.

[44] R.A. Silverman, Introductory Complex Analysis. Dover, NY, 1972.


[45] R.P. Stanley, Enumerative Combinatorics, Vol. 1–2, 2nd Ed. Cambridge Univ. Press, Cambridge, MA, 1997.

[46] S.K. Stein, A. Barcellos, Calculus and Analytic Geometry, 5th Ed. McGraw-Hill, New York, 1992.

[47] N.Ya. Vilenkin, Combinatorics. Acad. Press, New York-London, 1971.

[48] R.J. Wilson, Introduction to Graph Theory. Acad. Press, New York, 1972.

[49] R.M. Wilson, An existence theory for pairwise balanced designs, I, II, III. J. of Combinat. Theory (A), Vol. 13 (1972), 220–273, Vol. 18 (1975), 71–79.

Index

AAbel transformation, 26antisymmetric, see binary relationsarrangements

with repetition, 39without repetition, 39

assignment operator, 144assignment problem, 279

Bballs in urns model, 41, 58, 71, 73, 88,

180, 189, 210Bayes’s formula, 84Bell numbers, 14

difference equation, 239Bernoulli’s trials, 84bijective, see mappingsbinary relations, 16

antisymmetric, 16equivalence relations, 17

equivalence classes, 17factor-set, 17number of equivalence classes, 18

partial order, 17chain, 17

reflexive, 16symmetric, 16transitive, 16

binomial coefficients, 44, 58, 60, 62, 188,216, 234, 239

binomial convolution (Hurwitz composi-tion), 204

binomial distribution, 83binomial formula, 46birthday problem, 85bit string, 235block designs, 290

incidence matrix, 291incomplete, balanced (BIBD), 290isomorphism, 299

automorphism, 299

necessary conditions of existence, 291symmetric, 293

Bonferroni inequalities, 182Boolean, see setsBoolean functions, 68Bose–Einstein statistics, 74bracelet, 247Bruck–Ryser–Chowla theorem, 294Burnside (Cauchy–Frobenius) lemma, 245

Ccardinality of unions, see setsCartesian (direct) product, see setsCartesian product, cardinality, 19Catalan numbers, 58, 60, 69, 231, 239

recurrence relation, 236Cauchy rule, 201Cauchy–Hadamard criterion, 198Cayley

first formula, 115second formula, 122

chain (linearly or totally ordered set), seeposet

characteristic function, see setschromatic number, see graphscircuit, see graphsclustering, 129

(dis)similarity, 130algorithms

agglomerative, 132agglomerative single-link, 134, 145divisive, 132hierarchical, 132Hubert’s complete-link, 154, 155Hubert’s single-link, 147, 148single-link, 132

amalgamated, 132clumps, 130clusters, 130completely disjoint, 132dendrogram, 152


dissimilarity matrix (table), 130link, 134threshold, 131threshold graph, 136

coloring problems, 181, 248, 254, 255, 258,260, 264, 269

combinationswith repetition, 48, 49, 257without repetition, 44

complement, see setscompositions, 221, 237, 252, 258

generating function, 222conditional probability, 81congruence modulo p, see natural num-

bersconnected graph, component, see graphscycle, see graphscyclic sequences, 192

Dde Morgan laws, 14dendrogram, see clusteringderangement, 183determinant, 228

continuant, 235expansion across a line, 228Jacobi determinant, 229minor, 229

difference, see setsdifference equations, 224

characteristic polynomial, 225generating function, 225superposition principle, 224

digraph, see graphsDilworth theorem, 284Diophantine equations, 52, 66, 215, 237,

294, 296Dirichlet principle, 264, 269dot product, 105Dyck path, see trajectory method

EEGF, see exponential generating functionequivalence relations, see binary relationsErdos–Szekeres theorem, 266Euler (totient) function, 188Euler’s theorem, 126

Eulerian circuit (trail), 123Eulerian graph, 123events, 75

complementary, 78disjoint (mutually exclusive), 78elementary, 75exhaustive, 78independent, 81, 82

expected value, see mathematical expecta-tion

exponential generating function (EGF), seemethod of generating functions

Ffactorial, 22

Stirling asymptotic formula, 23family of subsets, 263Fano plane, 305favorable outcome, 75Fermat little theorem, 194Fermi–Dirac statistics, 74Ferrers diagram, 220

normalized, 220Fibonacci numbers, 227, 228, 233

generating function, 235Fisher’s inequality, 293Fleury’s algorithm, 124floor function, see integer partforest, see graphsFrobenius theorem, 283function, see mappingsFundamental Theorem of Arithmetic, 27

GGauss formula, 194generating polynomials, see method of gen-

erating functionsgeometrically identical colorings, 248GF, see method of generating functionsgolden ratio, 234graphs, 92

acyclic, 110adjacency matrix, 106bipartite, 96bipartite graphs

matching, 287chromatic number, 185, 189


coloring problems, 110, 181, 185, 216,264, 269, 273

complete graph Kp , 96connected, 103connected component, 103contour, 102cut-edge (bridge), 103degree sequence, 96diagram, 93directed, 93edge, 92

initial (end) vertex, 94loop, 93

edge-cover, 288embedding, 95

regular, 95embedding in Rn, 94enumeration problems, 98Eulerian characteristic, 126forest, 110incidence function, 92incidence of edges and vertices, 92isomorphism, 97labelled, 98, 216order, 92planar, 95plane, 125regular coloring, 185simple, 93size, 92spanning subgraph (factor), 101subgraph, 101Thomsen graph, 127tree, 110

rooted, 110spanning tree, 113

vertex, 92isolated, 92odd (even), 96pendant (leaf), 93

vertex degree, 96walks, trails, paths, circuits, cycles,

102weighted, 113

group, 241cycle index, 243order, 241

symmetric, 241

HHall’s condition, 274

strengthehed, 275Hall’s theorem, 274Hamiltonian graph, 125Hamiltonian path, 125handshaking lemma, 96harmonic numbers, 59hypothesis of equally likely probabilities,

77

Iimage, see mappingsInclusion-Exclusion Principle, 178injective, see mappingsinteger part, 54intersection, see setsinventory, 249inverse image, total preimage, see preim-

ageinversion formulas, 190, 195, 208, 215, 239

Möbius inversion, 190, 191inversions in permutations, 228

even (odd), 228

JJordan’s theorem, 126

KKönig’s theorem, 282Kaplansky lemma, 64Kronecker delta, 59Kruskal’s algorithm, 114, 145

LLagrange’s theorem on four squares, 295Lambert W function, 216loop, see graphsLucas numbers, 235

generating function, 235

MMöbius function, 190mappings

bijective, 5codomain, 5


domain, 5equal, 5equivalent, 248image, 5injective, 5number of

arbitrary mappings, 25bijective mappings, 25injective mappings, 25

preimage, 5range, 5restriction, 20surjective

cardinality, 180surjective (onto), 5weight, 249

mathematical expectation, 83matrix, 105

product, 105symmetric, 105transpose, 105zero-one matrix, 282

covering, 282independent set of entries, 282irreducible covering, 285

Maxwell–Boltzmann statistics, 74method of generating functions, 197, 201

convolution of sequences, 201examples, 196exponential generating function, 203generating polynomials, 198, 199, 207,

209, 215, 223problems, 205, 208, 210–212, 234shifts of sequences, 204

Moore’s theorems, 299, 300multigraph, 93multinomial

coefficients, 70theorem, 73

multinomial coefficients, 223multiset, 263

N
natural numbers N, 4
    combinatorial representation, 59
    congruence, 25
    factorial representation, 60
natural segment N_n, 4
number of equivalence classes, see binary relations

O
ordered pairs, 15

P
partial order, see binary relations
partitions
    of integers, 218, 237
        generating function, 218
    of sets, 14
        number of, 181
        ordered, 263
        Schur’s lemma, 267
Pascal’s triangle, 46
path, see graphs
permanent, 239
permutations, 39
    with identified elements (with repetition), 70
pigeonhole principle, see Dirichlet principle
planar, plane graph, see graphs
Pólya–Redfield theorem, 252
Pontryagin–Kuratowski theorem, 127
poset, see partial order
power set, see sets
Prüfer code, 117
preimage, see mappings
Principle of Mathematical Induction, 6
probability axioms, 76
probability distribution, 76
probability experimental (frequency), 76
product notation, 9
Product Rule, 32
progression
    arithmetic, 26
    geometric, 27
pseudograph, 93

Q
quantifier
    existential, 4
    universal, 4

R
Ramsey
    numbers, 263
    theorem, 263
random experiment, 75
random variables (functions), 80
range, see mappings
recurrence relations, see difference equations
reflexive, see binary relations
reserve, 249

S
sample space, 75
semi-Eulerian graph, 123
sets
    Boolean, 20
    Cartesian (direct) product, 15
    characteristic function, 22
    complement, 13
    countable, 6
    difference, 13
    disjoint, 13
    empty set, 3
    finite, 6
    intersection, 13
    power set, 19
    subset, 4
    union, 13
        cardinality, 18
    universal set, 12
Sieve Formula, see Inclusion-Exclusion Principle
sigma (summation) notation, 8
spanning tree, see graphs
    minimum, 113
    weighted, 113
Stirling numbers, 195
    of the first kind, 189
    of the second kind, 181, 188, 217
subfactorial, 183
substitutions, 241
    cycle (orbit), 242
    cycle index, 243
    cycle type, 242
    fixed elements, 245
    matrix representation, 242
Sum Rule, 31
    modified, 37
summing sequence (summator), 205
surjective, see mappings
symmetric, see binary relations
system of mutual representatives, 277
systems of distinct representatives (SDR), 274
systems of quadruples, 306
systems of triples, 298
    admissible values, 299
    Steiner systems, 298

T
telescoping sums, 11
totient function, see Euler (totient) function
trails, see graphs
trajectory method, 57
transitive, see binary relations
transversal, see systems of distinct representatives
tree, see graphs
tree of alternatives, 34
triangular numbers, 63
tuple (vector), 16

U
union, see sets

V
Vandermonde’s identity, 59
village weddings (marriage) theorem, see Hall’s theorem

W
walks, see graphs
weight
    of a function, 249
    of an element, 249
    of an equivalence class, 249
whole numbers W, 4
