Chapter 6 Buckets & Balls

Suppose you walk into a room with people. How many

people do there need to be in the room before you bet that

there are two people with the same birthday? Of course if there

were a thousand people, you would bet. It is a sure thing. But

probably nobody would bet against you. How about if there

are only 100 people? 50? Before we find the surprising answer,

let's look at a simple principle from probability. Since the size

of a set plus the size of its complement is always the size of the

universe (first counting principle), or equivalently, in symbols,

\[ \frac{|A|}{|U|} + \frac{|\overline{A}|}{|U|} = \frac{|U|}{|U|} = 1, \]

we have

the probability for a given event to occur equals 1 minus the probability that the event will not occur.

Example 1. Let A be the set of people in the room. We want to compute the probability

that at least two of them have the same birthday. What is the experiment? We go around

the room asking for people's birthdays. If \(|A| = m\), then there are \(365^m\) outcomes to our experiment (we have m decisions or stages and 365 choices for each one of them). It

seems tricky to compute the size of the outcomes that give at least two with the same

birthday, but the complement of this set is that no two of them have the same birthday.

How many ways can that occur? For the first person we ask we have 365 choices, but for

the second one we have only 364, and for the third one 363, etc. and this kind of

reasoning should sound familiar.

So we can build a table of probabilities. Let's say \(p_n\) denotes the probability of no two persons having the same birthday when there are n people in the room. We can write down an answer by simple counting, and we obtain

\[ p_n = \frac{365!}{365^n\,(365-n)!}. \]

Unfortunately, this may not be easy to compute since 365! has more than 500 digits. A

much better way to do the computation is recursively. Namely, at stage 1 we have \(\frac{365}{365}\), for the next one we have our previous result times \(\frac{364}{365}\), and for the next, the previous result times \(\frac{363}{365}\), etc. In short,

\[ p_n = \frac{365-(n-1)}{365}\,p_{n-1}, \qquad p_1 = 1. \]

The table below gives the values recursively computed.

Thus, amazingly, you should be ready to bet when there

are only 23 people in the room!

In a typical classroom of 35 students, the odds that at least two have the same birthday are better than 4 to 1; and at 40 people they are better than 8 to 1. Thus we should not have been surprised when two Presidents had the same birthday.

With 50 people the odds are better than 32 to 1, and if

there are as many as 100 people in the room your odds are

astronomical, better than 3,000,000 to 1. In our own

Mathematics Department of 40 faculty members, the odds were close to 10-to-1 that 2 of us would have the same birthday. And that is the case. What was not likely is that

we would be next to each other alphabetically, Mena and

Merryfield. At one time, there were 3 of us with the same

birthday! And that was not very likely either.
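The recursion for the birthday probabilities can be sketched in a few lines of Python. This is only an illustration: the function name `no_shared_birthday` and the `days` parameter are our own choices, and like the text it assumes 365 equally likely birthdays.

```python
def no_shared_birthday(max_people, days=365):
    """Return a list p where p[n] is the probability that n people
    all have different birthdays (assuming `days` equally likely dates)."""
    p = [1.0]  # p[0] = 1: with nobody in the room there is no coincidence
    for n in range(1, max_people + 1):
        # The n-th person must avoid the n - 1 birthdays already taken.
        free_days = max(days - (n - 1), 0)
        p.append(p[-1] * free_days / days)
    return p


p = no_shared_birthday(100)
for n in (22, 23, 35, 40, 50):
    # Columns: number of people, all different, not all different
    print(n, round(p[n], 5), round(1 - p[n], 5))
```

Each step multiplies the previous value by one fraction, so no large factorial is ever formed.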

A more sophisticated, but not any more elucidating, way

to view the previous problem is to realize that all we are

asking for is the probability of the birthday function from

the set A of people to the set of dates of the year to be

one-to-one. This way of viewing leads to many other

interesting counting problems some of which we will

tackle in the next examples.

Table of birthday probabilities: for each number of people, the probability that all birthdays are different, and the probability that at least two coincide.

# of people   All different   Not all different
 1            1.00000         0.00000
 2            0.99726         0.00274
 3            0.99180         0.00820
 4            0.98364         0.01636
 5            0.97286         0.02714
 6            0.95954         0.04046
 7            0.94376         0.05624
 8            0.92566         0.07434
 9            0.90538         0.09462
10            0.88305         0.11695
11            0.85886         0.14114
12            0.83298         0.16702
13            0.80559         0.19441
14            0.77690         0.22310
15            0.74710         0.25290
16            0.71640         0.28360
17            0.68499         0.31501
18            0.65309         0.34691
19            0.62088         0.37912
20            0.58856         0.41144
21            0.55631         0.44369
22            0.52430         0.47570
23            0.49270         0.50730
24            0.46166         0.53834
25            0.43130         0.56870
26            0.40176         0.59824
27            0.37314         0.62686
28            0.34554         0.65446
29            0.31903         0.68097
30            0.29368         0.70632
31            0.26955         0.73045
32            0.24665         0.75335
33            0.22503         0.77497
34            0.20468         0.79532
35            0.18562         0.81438
40            0.10877         0.89123
50            0.02963         0.97037
60            0.00588         0.99412
70            0.00084         0.99916
80            0.00009         0.99991
90            0.00001         0.99999

Example 2. Suppose you are to distribute some money among your favorite relatives: Alphonse, Bertrand and Constance. In the first situation we are going to look at, you have 5 bills: $1, $5, $10, $20 and $50. In how many ways can you distribute the money? This is very easy: we are looking at functions from the set {1, 5, 10, 20, 50} into the set {a, b, c}. The answer to the question is then \(3^5 = 243\).

Suppose we vary the question a little bit, and we ask how many ways can we distribute the money so that everybody gets something. Equivalently, we could have started by asking what is the probability that everybody will receive something given that the money will be distributed at random. What we are counting now is the number of onto functions from {1, 5, 10, 20, 50} to the set {a, b, c}. There are several ways to accomplish this counting.

One of them is to divide the bills into three piles and then assign the piles to the three people. To count the last stage is easy: since we have three piles of money and we are to distribute them to three people, there are \(3! = 6\) ways to do that part. But how can we divide

the bills into three nonempty piles? Doing it by brute force, we see from the table below

that there are 25 ways to break 5 things into 3 nonempty piles.

Since each set of three piles gives rise to 6 onto functions (as we remarked in the


previous paragraph), we have a total of 150 onto functions from a 5-set to a 3-set. So the

probability of everybody getting something is \(\frac{150}{243} \approx 62\%\) (whether Alphonse should be paranoid if he receives nothing is a question for moralists to decide).

Observe also that we have grouped these 25 objects in the table in special ways, first having grouped 7 of them, and then the remaining 18 were separated into 6 groups of 3. First we listed the 7 ways in which the fifth bill was by itself. Where is the number 7 coming from? It is the number of ways of dividing 4 objects into two piles, and we will call this number \(S(4,2)\). It appears here because every time we break up 4 objects into two piles we can break up 5 into 3 piles by making the new, fifth object into a pile of its own.

What about the remaining 18 ways? By necessity, the fifth bill was not by itself. So what we did is take any of the \(S(4,3) = 6\) ways of dividing 4 objects into 3 piles, and then added the fifth bill to any of the 3 piles, giving us a total of \(3 \times 6 = 18\) (see the vertical groupings in the list above). In other words, we have derived the fact that

\[ S(5,3) = S(4,2) + 3\,S(4,3). \]

Suppose you had now ten different bills and six relatives. As usual we could get a

machine to do the work, but we can perhaps see a recursive relation coming on. It is clear

that if we count the number of ways of breaking up the ten bills into six (undifferentiated)

piles, then every set of piles gives rise to 6!=720 onto functions, so the total number of

onto functions will be 720 times the number of ways of breaking up 10 things into 6

undifferentiated piles. As above, let's call this number \(S(10,6)\).

What about \(S(10,6)\)? Take the tenth bill. It is either by itself in a pile or not (we are partitioning the collection of piles into two pieces). How many sets of piles are there where it is by itself? A little reflection shows there are \(S(9,5)\). If the tenth bill is not by itself, what must we have done? We took any of the \(S(9,6)\) sets of piles, and added the tenth bill to one of the six piles (and there are 6 ways to do this), so we have \(6\,S(9,6)\) sets of piles. In short, we have

\[ S(10,6) = S(9,5) + 6\,S(9,6). \]

Note that this recursion is somewhat similar to Pascal's recursion, and the time has come

to get general.

The number of ways of dividing m different objects into exactly n (nonempty)

undifferentiated piles is called a Stirling number of the second kind, and is denoted by

\(S(m, n)\).

We will have no use for Stirling numbers of the first kind (they are similar). These numbers are named after James Stirling, a British mathematician of the 18th century (a student and colleague of Newton's). If we just abstract the argument for 10 and 6 above for any m and n, we get

Theorem. Stirling Recursion. Let \(m \ge n \ge 2\). Then

\[ S(m,n) = S(m-1,n-1) + n\,S(m-1,n). \]

We should remark that clearly \(S(m,n) = 0\) if \(m < n\), and also in order to start building our table, we need \(S(m,1) = 1\) and \(S(m,m) = 1\) (are these clear?).

In the table we compute the first twelve rows of the Stirling numbers of the second kind

using the recursion:

Hence \(S(10,6) = 22827\). Multiplying by 720, we get that there are 16,435,440 onto

functions from a 10-set to a 6-set, and since there are 60,466,176 functions, the

probability of a function being onto is roughly 27% (Alphonse shouldn't get paranoid

under these circumstances.)

Corollary. Onto Functions. Let A have m elements and let B have n elements. Then the number of onto functions from A to B is \(n!\,S(m,n)\).
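As a sketch of how a machine could do this work, the Stirling recursion and the corollary translate directly into Python. The names `stirling2` and `onto_functions` are our own, and the extra base case for n = 0 is an added convention, not something the text states.

```python
from functools import lru_cache
from math import factorial


@lru_cache(maxsize=None)
def stirling2(m, n):
    """Number of ways to split m different objects into n nonempty,
    undifferentiated piles, computed with the Stirling recursion."""
    if n == 0:
        return 1 if m == 0 else 0  # added convention for the empty case
    if n > m:
        return 0                   # more piles than objects is impossible
    if n == 1 or n == m:
        return 1                   # one big pile, or every object alone
    # Last object alone in its own pile, or added to one of n existing piles.
    return stirling2(m - 1, n - 1) + n * stirling2(m - 1, n)


def onto_functions(m, n):
    """Number of onto functions from an m-set to an n-set: n! * S(m, n)."""
    return factorial(n) * stirling2(m, n)


print(stirling2(5, 3), onto_functions(5, 3))    # piles and onto functions for 5 bills, 3 relatives
print(stirling2(10, 6), onto_functions(10, 6))  # the ten-bill, six-relative case
```

The cache keeps each S(m, n) from being recomputed, which mirrors filling in the table row by row.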

But we can count the number of onto functions in another way. Let's go back to our three

favorite relatives and our 5 different bills. Let's consider our universe of 243 functions,

and let A be the set of them in which Alphonse receives nothing, B the ones in which

Bertrand receives nothing and the same with C and Constance. Then what we are trying

to do is find \(|\overline{A \cup B \cup C}|\), and we can use the inclusion-exclusion principle. We need to count some things. First \(|A| = 32\), since it is the number of functions from a 5-set to a 2-set. Similarly, \(|B| = |C| = 32\). Trivially, \(|A \cap B| = 1\) (if neither Alphonse receives money, nor Bertrand..., or equivalently, how many functions from a 5-set to a 1-set), and \(|A \cap C| = |B \cap C| = 1\) also. Finally, \(|A \cap B \cap C| = 0\), so

\[ |A \cup B \cup C| = 32 + 32 + 32 - 1 - 1 - 1 + 0 = 93. \]

Stirling Numbers of the Second Kind \(S(m,n)\)

m/n       1       2       3       4       5       6       7       8       9      10      11      12
  1       1
  2       1       1
  3       1       3       1
  4       1       7       6       1
  5       1      15      25      10       1
  6       1      31      90      65      15       1
  7       1      63     301     350     140      21       1
  8       1     127     966    1701    1050     266      28       1
  9       1     255    3025    7770    6951    2646     462      36       1
 10       1     511    9330   34105   42525   22827    5880     750      45       1
 11       1    1023   28501  145750  246730  179487   63987   11880    1155      55       1
 12       1    2047   86526  611501 1379400 1323652  627396  159027   22275    1705      66       1

Hence, \(|\overline{A \cup B \cup C}| = 243 - 93 = 150\). Just as before. This type of reasoning can obviously

be extended, and we will ask you to do some of that in the homework. Let's vary

Example 2 a little bit.

Example 3. Onto Functions Again. Suppose we wanted to count the number of onto functions from an 8-set to a 5-set. The answer is of course \(5!\,S(8,5) = 120 \times 1050 = 126{,}000\). But attacking the problem from the inclusion-exclusion point of view, we let the codomain be {1,2,3,4,5}, and for any subset S of this set, we let \(A_S\) be the set of functions whose range contains no element of S. Our universe consists of all functions from an 8-set to a 5-set, so it has \(5^8 = 390{,}625\) elements. Clearly, the size of \(A_S\) is \((5 - |S|)^8\). We are interested in counting \(|A_1 \cup A_2 \cup A_3 \cup A_4 \cup A_5|\), where \(A_i\) is short for \(A_{\{i\}}\). By inclusion-exclusion:

\[ |A_1 \cup A_2 \cup A_3 \cup A_4 \cup A_5| = 5(4^8) - 10(3^8) + 10(2^8) - 5 = 264{,}625, \]

so our answer is the size of the complement, \(390{,}625 - 264{,}625 = 126{,}000\), as promised.
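The inclusion-exclusion count in Example 3 can be checked with a short Python sketch. The function name `onto_by_inclusion_exclusion` is our own, and the code simply generalizes the computation above from (8, 5) to any m and n.

```python
from math import comb


def onto_by_inclusion_exclusion(m, n):
    """Count onto functions from an m-set to an n-set by removing
    the functions that miss at least one element of the codomain."""
    # There are comb(n, k) subsets S of size k, and (n - k) ** m functions
    # whose range avoids a given S; the signs alternate as usual.
    missing_something = sum(
        (-1) ** (k + 1) * comb(n, k) * (n - k) ** m
        for k in range(1, n + 1)
    )
    return n ** m - missing_something


print(onto_by_inclusion_exclusion(8, 5))  # the count from Example 3
print(onto_by_inclusion_exclusion(5, 3))  # the three relatives and five bills
```

Both values agree with the counts obtained from the Stirling numbers.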

Example 4. Suppose now we are still going to distribute some money to Alphonse,

Bertrand and Constance. But now what we have is 5 crisp, new $20 bills. How many

ways do we have of doing this? As usual, first brute force. Here what matters is how

many bills each of Al, Bert and Connie receives. We are going to let \((x, y, z)\) stand for a

distribution where x is the number of bills Al received, y the number Bert got and z how

many Connie received. So x, y and z are nonnegative whole numbers that add up to 5.

The possibilities are then

(5,0,0) (4,1,0) (4,0,1) (3,2,0) (3,0,2) (3,1,1) (2,2,1)

(0,5,0) (0,4,1) (1,4,0) (0,3,2) (2,3,0) (1,3,1) (1,2,2)

(0,0,5) (1,0,4) (0,1,4) (2,0,3) (0,2,3) (1,1,3) (2,1,2)

So there are 21 ways. If we just try to build a tree we will soon notice that the number of

branches at a stage does depend on previous choices.

However, when viewed the right way, this problem is not very hard. Think of leaving

instructions for somebody to deliver your presents. One easy way is having two pieces of

white paper and putting them among your bills. Then you simply leave the instructions:

Al gets all the bills until the first piece of paper, Bert gets the next group and finally

Connie gets the remaining bills. Think of the pieces of white paper as separators, so in

total we have seven pieces of paper, and of the seven positions where one can put the

separators, one has to choose two for them, so the answer is \(\binom{7}{2} = 21\). Which is correct!

So simple it is deceptive.

Let's look at a few examples. Of the numbers {1,2,3,4,5,6,7} we are going to choose two

to correspond to the positions for the separators (think of the bills and separators as

occupying those 7 positions.) Thus, {1,2} corresponds to the first two positions, hence Al

gets 0, Bert gets 0 and Connie gets 5. What about the subset {3,6}? Here the first two


positions in your stack are occupied by bills, so Al is going to get 2, the next stack also

contains 2 bills and that is Bert's share and finally Connie gets 1. Going in reverse,

suppose you wanted Al to get 1, Bert to get 1 and Connie to get 3. Where are the

separators? {2,4}. And so we have a correspondence between the places for the

separators and the distributions.

Suppose that it is required that everybody gets at least one bill. How many ways can we

do it now? This is easy. Give everybody a bill. We have two left. Use the same scheme

with the separators (we still have two), so now we have a total of four positions (2 for

bills & 2 for separators), and we are to choose 2 for the separators, so we can do it in \(\binom{4}{2} = 6\) ways: (3,1,1), (1,3,1), (1,1,3), (2,2,1), (2,1,2) and (1,2,2).
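The correspondence between separator positions and distributions can be made concrete with a small Python sketch. The function name `distributions` and its parameters are our own; positions are numbered from 1 as in the text.

```python
from itertools import combinations
from math import comb


def distributions(bills, people):
    """List every way to hand out identical bills to differentiated people,
    one distribution for each choice of separator positions."""
    positions = bills + people - 1  # bills and separators share these slots
    result = []
    for separators in combinations(range(1, positions + 1), people - 1):
        shares = []
        previous = 0
        for s in separators:
            shares.append(s - previous - 1)  # bills strictly between separators
            previous = s
        shares.append(positions - previous)  # bills after the last separator
        result.append(tuple(shares))
    return result


all_ways = distributions(5, 3)
print(len(all_ways), comb(7, 2))

# Everybody gets at least one bill: hand out one each, then distribute the rest.
at_least_one = [tuple(x + 1 for x in d) for d in distributions(2, 3)]
print(sorted(at_least_one))
```

For instance, the separator choice (3, 6) produces the shares (2, 2, 1), matching the example worked out above.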

What the previous examples all have in common is the idea of distributing balls in buckets. In some situations, the balls all have different colors, so they are differentiated (Examples 1 and 2); in others they are all the same color (Example 4).

The same applies for the buckets where, for example, for the Stirling numbers they were

not differentiated. Finally, sometimes we wanted at most one ball in any bucket

(Example 1) and sometimes we wanted at least one ball in every bucket (Example

2). This leads to a potential 16 different situations, not all of which are interesting. Below

we will build a table highlighting the most important ones.

We will be using the following notation: d or D for differentiated, u or U for

undifferentiated. We will be using a capital letter for the balls when we want at most one

ball in any bucket, and a capital letter for the buckets means we want at least one ball in

any bucket. But before we go to our table, and before we look at the (u,u) situation, we

should encourage the reader to realize that this consideration of distributing balls into

buckets is more than just a gimmick, it is an actual physical model of the situations

we will encounter, and that two essential ingredients in the physical model should be kept

in mind at all times: first, that a ball cannot go into two buckets (but a bucket can

receive two balls), and second, that one must distribute all the balls (but that a bucket

may remain empty).

Example 5. Suppose we have 5 black balls, and we want to distribute them in 3 gray

buckets. Our acronym hence would be (u,u) where \(m = 5\) and \(n = 3\). How many ways can this be done? Only 5 ways: 5 in one bucket; 4 in one, 1 in another; 3 and 2; 3 and 1 and 1; 2 and 2 and 1. It is clear that what we have here are the partitions of 5 in at most 3 pieces. Suppose we had been after (u,U), so that every bucket had to have at least one ball? Then the answer would have been 2: 3+1+1 and 2+2+1. The answer to (U,u) is 0 since there is no way to distribute 5 balls into 3 buckets with at most one ball per bucket.

We will return to partitions in a later section, but in a way, they are behind a lot of the

counting, and one can always use them. However, they often lead to intractable ways of counting due to their large number. Hence we suggest you don't get too fascinated

by this way of counting, but sometimes it is unavoidable. Let's revisit the examples of

this section by viewing them in terms of partitions.

In Example 5, we had undifferentiated balls and undifferentiated buckets. Each partition just gave us one arrangement. Acronym (u,u):

5 = 1
4+1 = 1
3+2 = 1
3+1+1 = 1
2+2+1 = 1
Total: 5

In Example 4, we had undifferentiated balls, but differentiated buckets. Hence a partition of the balls, depending on its nature, will give a different number of arrangements. For example, the partition 2+2+1 gives 3 arrangements since there are 3 ways of choosing the bucket that gets only one ball. Acronym (u,d):

5 = 3
4+1 = 6
3+2 = 6
3+1+1 = 3
2+2+1 = 3
Total: 21

In the latter part of Example 2, the balls were differentiated but the buckets weren't. In addition we required that every bucket have at least one ball. Thus only the partitions into exactly three parts are relevant: 3+1+1 and 2+2+1. Each partition, again, by its nature, gives us a different number of arrangements. For example, 2+2+1 gives 15 different arrangements: \(5 \times 3\), where 5 is the number of ways of choosing the ball that is going to be alone, and 3 is the number of ways of dividing the remaining 4 balls into two groups of 2. Acronym (d,U):

3+1+1 = 10
2+2+1 = 15
Total: 25

Finally in the beginning of Example 2 we were interested in a situation where the balls and the buckets were differentiated (this is the only case when you are dealing with functions). And again every partition gives us a different number. For example, 3+1+1 gives \(10 \times 3 \times 2 = 60\) arrangements where 10 is the number of ways of choosing the 3 balls that will be together, 3 is the number of ways of choosing the bucket they will go into, and 2 is the number of ways of distributing the other 2 balls. Acronym (d,d):

5 = 3
4+1 = 30
3+2 = 60
3+1+1 = 60
2+2+1 = 90
Total: 243
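The (d,d) breakdown by partition can be confirmed by brute force. The following Python sketch, with names of our own choosing, walks through all 243 functions from 5 balls to 3 buckets and tallies them by the shape of the resulting partition.

```python
from collections import Counter
from itertools import product


def count_by_partition(balls=5, buckets=3):
    """Tally every assignment of differentiated balls to differentiated
    buckets according to the partition formed by the bucket sizes."""
    tally = Counter()
    for assignment in product(range(buckets), repeat=balls):
        occupancy = Counter(assignment)  # bucket -> number of balls in it
        shape = tuple(sorted(occupancy.values(), reverse=True))
        tally[shape] += 1
    return tally


tally = count_by_partition()
for shape, count in sorted(tally.items(), reverse=True):
    print("+".join(map(str, shape)), "=", count)
print("Total:", sum(tally.values()))
```

Empty buckets simply do not appear in a shape, so (5,) stands for the partition 5 and (3, 1, 1) for 3+1+1.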


Summary Table

We will assume we have m balls to be distributed among n buckets.

Acronym   Objects that we are counting                                       Count
(d,d)     Functions from the set of balls to the set of buckets              \(n^m\)
(D,d)     One-to-one functions from the balls to the buckets                 \(\frac{n!}{(n-m)!}\)
(d,U)     Ways of dividing m balls into n nonempty undifferentiated piles    \(S(m,n)\)
(d,D)     Onto functions                                                     \(n!\,S(m,n)\)
(u,d)     Sometimes called selections: ways of distributing m identical
          objects into (at most) n differentiated pieces                     \(\binom{m+n-1}{n-1} = \binom{m+n-1}{m}\)

The only one we should clarify further is the last one. It is Example 4. We have identical bills to be distributed among relatives. In the general situation we have m bills and n relatives. Using separators, the only question is how many of them do we need? A little thought will convince you that we need one less separator than relatives, therefore \(n - 1\). So in total we have \(m + n - 1\) positions for all the pieces of paper, and we have to choose \(n - 1\) of them for the separators. Equivalently we could have chosen the m positions for the bills.

This table should not be memorized. What is important is the reasoning behind it, and

what is sometimes hard is to decide which are the balls and which are the buckets, and

whether they are differentiated, or something else. We will fill some of the holes in the

table in the exercises.

In a previous chapter we studied the binomial theorem.

The Binomial Theorem.

\[ (x+y)^n = \sum_{i=0}^{n} \binom{n}{i} x^{n-i} y^{i} \]

Let us revisit it. The following special case of the theorem deserves recognition:

Corollary.

\[ (1+x)^n = \sum_{i=0}^{n} \binom{n}{i} x^{i}. \]

This equation is, again, an algebraic identity so it lends itself to all kinds of manipulations

and substitutions. For example, we can differentiate (with respect to x) to obtain a new

identity:

\[ n(1+x)^{n-1} = \sum_{i=0}^{n} i \binom{n}{i} x^{i-1}, \]

thus if we let \(n = 6\), we have

\[ \binom{6}{1} + 2\binom{6}{2}x + 3\binom{6}{3}x^2 + 4\binom{6}{4}x^3 + 5\binom{6}{5}x^4 + 6\binom{6}{6}x^5 = 6 + 30x + 60x^2 + 60x^3 + 30x^4 + 6x^5 = 6(1+x)^5. \]
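A quick numerical check of the differentiated identity is easy to write. In this Python sketch the function names are our own, and the sum starts at i = 1 because the i = 0 term contributes nothing.

```python
from math import comb


def derivative_sum(n, x):
    """Right-hand side: the sum of i * C(n, i) * x**(i - 1) for i = 1..n."""
    return sum(i * comb(n, i) * x ** (i - 1) for i in range(1, n + 1))


def closed_form(n, x):
    """Left-hand side: n * (1 + x)**(n - 1)."""
    return n * (1 + x) ** (n - 1)


coefficients = [i * comb(6, i) for i in range(1, 7)]
print(coefficients)  # coefficients of 1, x, ..., x^5 when n = 6
for x in range(-3, 4):
    print(x, derivative_sum(6, x), closed_form(6, x))
```

With integer inputs the two sides can be compared exactly, with no rounding involved.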

We will be seeing some more of these substitutions and manipulations in the exercises.

Suppose now we want to compute \((x+y+z)^6\) instead. So we would be looking at

\[ (x+y+z)(x+y+z)(x+y+z)(x+y+z)(x+y+z)(x+y+z). \]

Here the expansion would be really tedious. But the reasoning we did in the binomial theorem is still valid. Namely, as we expand this product, in order to build a term we get one summand out of each of the factors, thus our terms are of the form \(x^i y^j z^k\) where i, j and k are nonnegative integers and \(i + j + k = 6\). For example, \(x^2 y^3 z\) and \(x^4 z^2\) are both terms. How many terms will we have? 6 stages, 3 options for each decision give us a total of \(3^6 = 729\) terms!

On the other hand, there are very few types of terms: \(x^6\), \(y^6\), \(z^6\), \(x^5 y\), \(x^5 z\), …. How many types are there? Let's count them smartly: how many ways can we partition 6 into at most 3 pieces? 6+0+0, 5+1+0, 4+2+0, 4+1+1, 3+3+0, 3+2+1, and 2+2+2. The partition 6+0+0 gives rise to 3 types: \(x^6\), \(y^6\), and \(z^6\). In contrast, 5+1+0 gives rise to 6 types: \(x^5 y\), \(x^5 z\), \(x y^5\), \(x z^5\), \(y^5 z\) and \(y z^5\). The difference stems from the fact that for the first partition all we had to do is assign a variable to the 6 and then the other two variables would get a 0 (1 decision, 3 options for that decision). But for 5+1+0, we had two decisions: which variable gets the 5 and which variable gets the 1: 3 options for the first one, 2 for the second one give us 6 types. Since this line of reasoning has become routine, we just proceed by listing the number of types each of the partitions will give us: 4+2+0, 6 types; 4+1+1, 3 types; 3+3+0, 3 types; 3+2+1, 6 types; 2+2+2, 1 type. For a total of 28 types, 3+6+6+3+3+6+1.

All we need now is the coefficient for each of the types. Take for example, \(x^3 y^2 z\). How can such a term arise? It must have been that three of the factors contributed x's, two contributed y's and the remaining factor contributed a z. We have three decisions: first to choose the three factors that contribute x's: we have \(\binom{6}{3} = 20\) ways of doing that stage. Then we have to pick the y's: there are three factors remaining: \(\binom{3}{2} = 3\), and finally the last stage is for the z, but there is only one option left. So we have \(20 \times 3 \times 1 = 60\), so that is the coefficient of \(x^3 y^2 z\). It should be clear that \(x^2 y z^3\) also has 60 for its coefficient, as does any of the types stemming from the partition 3+2+1.

Let's reason it in general: take \(x^i y^j z^k\) where \(i + j + k = 6\). Then i factors have to contribute x's and we have \(\binom{6}{i}\) ways of doing that choosing. Then from the remaining factors we have to decide for the y's: \(\binom{6-i}{j}\) ways, and finally, for the z's: \(\binom{6-i-j}{k} = 1\) since \(i + j + k = 6\). Hence the coefficient of \(x^i y^j z^k\) is

\[ \binom{6}{i}\binom{6-i}{j}\binom{6-i-j}{k} = \frac{6!}{i!\,(6-i)!}\cdot\frac{(6-i)!}{j!\,(6-i-j)!}\cdot\frac{(6-i-j)!}{k!\,0!} = \frac{6!}{i!\,j!\,k!} \]

by canceling. This is a very satisfying expression. These coefficients are called multinomial coefficients.

And we immediately test whether our ideas generalize to any expansion. Suppose we are doing \((x+y+z+w)^{11}\). What is the coefficient of \(x^i y^j z^k w^m\)? If the exponents do not add up to 11, the coefficient is 0. If \(i + j + k + m = 11\), then we have \(\binom{11}{i}\) ways of choosing the factors that will contribute the x's, then \(\binom{11-i}{j}\) ways for the y's, and \(\binom{11-i-j}{k}\) for the z's, and the remaining factors all have to contribute w's. So in total we have

\[ \binom{11}{i}\binom{11-i}{j}\binom{11-i-j}{k}\binom{11-i-j-k}{m} = \frac{11!}{i!\,(11-i)!}\cdot\frac{(11-i)!}{j!\,(11-i-j)!}\cdot\frac{(11-i-j)!}{k!\,(11-i-j-k)!}\cdot\frac{(11-i-j-k)!}{m!} \]

which reduces nicely to

\[ \frac{11!}{i!\,j!\,k!\,m!} \]

by cancellation and the fact that \(i + j + k + m = 11\).

As an illustration we finish computing \((x+y+z)^6\):

\[ (x+y+z)^6 = x^6 + y^6 + z^6 + 6x^5y + 6x^5z + 6xy^5 + 6xz^5 + 6y^5z + 6yz^5 \]
\[ {} + 15x^4y^2 + 15x^4z^2 + 15x^2y^4 + 15x^2z^4 + 15y^4z^2 + 15y^2z^4 \]
\[ {} + 30x^4yz + 30xy^4z + 30xyz^4 + 20x^3y^3 + 20x^3z^3 + 20y^3z^3 \]
\[ {} + 60x^3y^2z + 60x^3yz^2 + 60x^2y^3z + 60x^2yz^3 + 60xy^3z^2 + 60xy^2z^3 + 90x^2y^2z^2. \]

With exactly 3+36+90+90+60+360+90=729 terms as expected.

In general, if \(m_1 + m_2 + \cdots + m_t = n\) is any partition of n, then we can define the multinomial coefficient of that partition by

\[ \binom{n}{m_1, m_2, \ldots, m_t} = \frac{n!}{m_1!\,m_2!\cdots m_t!}. \]

These are natural generalizations of the binomial coefficients, and indeed if we let \(t = 2\), so \(m_1 + m_2 = n\), then

\[ \binom{n}{m_1, m_2} = \frac{n!}{m_1!\,m_2!} = \binom{n}{m_1} = \binom{n}{m_2}. \]

And combinatorially, they are also very close. We developed the binomial coefficients by

choosing a given number of friends for either of two different activities: coming to the

party or not coming to the party. What we are doing in the multinomial coefficients is choosing specific numbers of friends for each of several, different activities.

We have motivated (although not formally proved—too boring)

Theorem. Multinomial Theorem.

\[ (x_1 + x_2 + \cdots + x_t)^n = \sum_{m_1 + m_2 + \cdots + m_t = n} \binom{n}{m_1, m_2, \ldots, m_t}\, x_1^{m_1} x_2^{m_2} \cdots x_t^{m_t}. \]

As before, the notation can be intimidating, but we can always just write one term at a

time, as we did above.
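Writing one term at a time is also something a short program can do. The Python sketch below uses names of our own (`multinomial`, `trinomial_terms`) to compute multinomial coefficients and to list the types of terms in the expansion of (x + y + z) to the n-th power.

```python
from math import factorial


def multinomial(*parts):
    """n! / (m_1! m_2! ... m_t!) where n is the sum of the parts."""
    result = factorial(sum(parts))
    for p in parts:
        result //= factorial(p)  # each division is exact
    return result


def trinomial_terms(n):
    """Map each exponent triple (i, j, k) with i + j + k = n to the
    coefficient of x^i y^j z^k in the expansion of (x + y + z)^n."""
    return {
        (i, j, n - i - j): multinomial(i, j, n - i - j)
        for i in range(n + 1)
        for j in range(n + 1 - i)
    }


terms = trinomial_terms(6)
print(len(terms))           # number of types of terms
print(sum(terms.values()))  # number of terms counted with multiplicity
print(terms[(3, 2, 1)], terms[(2, 2, 2)])
```

The number of keys is the number of types, and the coefficients add up to the total number of terms.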

Example 6. An anagram of a word is a rearrangement of its letters. Thus, my name

MENA has 24 anagrams since there are 4! arrangements of 4 objects. Not all are

meaningful words in English, but that is irrelevant. BOB on the other hand does not have

3!=6 anagrams, only 3: OBB, BOB and BBO. The reason being that when we compute

the 6 ways we are counting the B's as different but they are not. Similarly, my first name

ROBERT does not have 6! anagrams because of double counting: there are two R's that

are being counted as different in the 6!, but do not give different words. Let's do it

another way.

Think of the six blanks that we are going to fill in with the letters {R, O, B, E, R, T}: _ _ _ _ _ _. Of those 6 blanks, 2 have to go for R's. Choose those 2, for which we have \(\binom{6}{2} = 15\) ways. Then choose the spot for the O: any of 4 ways. Then we have 3 spots left to put the B, 2 for the E and 1 for the T. Thus the number of anagrams is \(15 \times 4 \times 3 \times 2 \times 1 = 360\). You can think of obtaining the result in terms of activities: you have six blanks: 2 of them are going for R, 1 for O, 1 for B, 1 for E and 1 for T. By the same reckoning, MISSISSIPPI has \(\binom{11}{4,4,2,1} = 34{,}650\) anagrams.

Of course, one can have variations on the questions. How many anagrams of ROBERT

are there where the vowels are together? Put the vowels together: there are 2 ways of

doing that, and then think of them as one letter: V and we have anagrams of RBRTV, of

which there are 60, so our answer is 120.

Another variation: how many anagrams are there of ROBERT where the vowels come in

alphabetical order (not necessarily together)? There are 360 anagrams, and they come in pairs where the vowels occupy the same locations; only one of each pair has the vowels in order, so the answer is \(\frac{360}{2} = 180\).
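The anagram counts in Example 6 are small enough to verify by listing. In this Python sketch the helper names are our own: `anagram_count` applies the multinomial formula, while `brute_force_anagrams` generates the distinct rearrangements directly (practical only for short words).

```python
from collections import Counter
from itertools import permutations
from math import factorial


def anagram_count(word):
    """Number of distinct rearrangements: n! divided by the factorial
    of the number of repeats of each letter."""
    count = factorial(len(word))
    for repeats in Counter(word).values():
        count //= factorial(repeats)
    return count


def brute_force_anagrams(word):
    """Set of all distinct rearrangements (only sensible for short words)."""
    return {"".join(p) for p in permutations(word)}


robert = brute_force_anagrams("ROBERT")
vowels_together = sum(1 for w in robert if "OE" in w or "EO" in w)
vowels_in_order = sum(1 for w in robert if w.index("E") < w.index("O"))

print(anagram_count("ROBERT"), len(robert))
print(anagram_count("MISSISSIPPI"))
print(vowels_together, vowels_in_order)
```

MISSISSIPPI is counted by formula only, since listing all 11! orderings would be wasteful.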

There is yet another important way to think of the multinomial coefficients. We have

been discussing the number of ways of distributing balls in buckets under different

circumstances depending on whether the balls (or the buckets) were all different colors or

all the same. What we have now is the possibility of repeated colors. Namely, for

anagrams, think of the letters as the balls, the alphabet being the colors, and the blanks

are differentiated buckets. So we are distributing balls (some the same, some not) into

buckets of different colors, with at most one ball per bucket. In this situation, the

multinomial coefficients are relevant.

Example 7. In how many ways can we break a group of 10 people into two groups of 2 and two groups of 3? We can use the anagram model to start this problem. We have 2 A's, 2 E's, 3 B's and 3 C's. So we could impulsively answer \(\binom{10}{2,2,3,3} = \frac{10!}{2!\,2!\,3!\,3!} = 25{,}200\). But this is not quite correct because the groups of two are not distinguishable from each other, and neither are the groups of three, so we have to divide our answer by \(2 \times 2 = 4\), so the eventual answer is 6,300.

To divide 12 people into four groups of 3, we have \(\frac{12!}{3!\,3!\,3!\,3!\,4!} = 15{,}400\) ways.
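The correction for indistinguishable groups can be folded into one small function. This Python sketch, with a name of our own choosing, divides the multinomial coefficient by k! for every group size that occurs k times.

```python
from collections import Counter
from math import factorial


def unlabeled_groupings(sizes):
    """Ways to split sum(sizes) people into unlabeled groups of the given sizes."""
    count = factorial(sum(sizes))
    for s in sizes:
        count //= factorial(s)        # multinomial coefficient: labeled groups
    for repeats in Counter(sizes).values():
        count //= factorial(repeats)  # groups of equal size are interchangeable
    return count


print(unlabeled_groupings([2, 2, 3, 3]))  # two pairs and two triples from 10 people
print(unlabeled_groupings([3, 3, 3, 3]))  # four triples from 12 people
```

When all the sizes are different, the second loop divides by 1 and the result is just the multinomial coefficient.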