Post on 21-Mar-2020

Chapter 3 Discrete Random Variable and Probability Distributions

Seungchul Baek

STAT 355 Introduction to Probability and Statistics for Scientists and Engineers

Seungchul Baek STAT 355 Introduction to Probability and Statistics for Scientists and Engineers 1

Random Variable

Definition: A random variable is a function from a sample space S into the real numbers. We usually denote random variables with uppercase letters, e.g., X, Y, . . .

$X : S \to \mathbb{R}$.

Examples:

Experiment: toss a coin 2 times. Random variable: X = number of heads in 2 tosses. Here X can be 0, 1, 2.

Experiment: toss 10 dice. Random variable: X = sum of the numbers. Here X = 10, . . . , 60.

Experiment: apply different amounts of fertilizer to corn plants. Random variable: X = yield/acre.

Sample Space of Random Variable

In defining a random variable, we have also defined a new sample space. For example, in the experiment of tossing a coin 2 times, the original sample space is

S = {HH, HT, TH, TT}

We define a random variable X = number of heads in 2 tosses. The new sample space is

X = {0, 1, 2}

The new sample space X is called the range of the random variable X .


Types of Random Variables

A discrete random variable can take one of a countable list of distinct values. Its sample space has finitely or countably many outcomes.

A continuous random variable can take any value in an interval of the real number line. Its sample space has uncountably many outcomes.

Examples:

Time until a projectile returns to earth.

The number of times a transistor in computer memory changes state in one operation.

The volume of gasoline that is lost to evaporation during the filling of a gas tank.


Example

Consider tossing a fair coin 10 times. The sample space S contains a total of $2^{10} = 1024$ elements and has the form

S = {TTTTTTTTTT, . . . , HHHHHHHHHH}

Define the random variable Y as the number of tails out of 10 trials. Remember that a random variable is a map from the sample space to the real numbers. For instance,

Y (TTTTTTTTTT ) = 10.

The range (all possible values) of Y is

{0, 1, 2, . . . , 10}.
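The idea that a random variable is a map from outcomes to numbers can be checked by brute force. The following is a minimal Python sketch, not part of the original slides; the names `sample_space`, `Y`, and `range_of_Y` are chosen here for illustration only.

```python
from itertools import product

# Sample space for 10 tosses: every length-10 string over {H, T}.
sample_space = ["".join(seq) for seq in product("HT", repeat=10)]

def Y(outcome: str) -> int:
    """Random variable: number of tails in one outcome."""
    return outcome.count("T")

# The range of Y is the set of values Y takes over the whole sample space.
range_of_Y = sorted({Y(s) for s in sample_space})

print(len(sample_space))   # size of S
print(Y("TTTTTTTTTT"))     # value of Y at the all-tails outcome
print(range_of_Y)          # all possible values of Y
```

Enumerating all outcomes is only feasible for small experiments, but it makes the distinction between the original sample space and the range of Y concrete.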

Example: Mechanical Components

An assembly consists of three mechanical components. Suppose that the probabilities that the first, second, and third components meet specifications are 0.90, 0.95 and 0.99, respectively. Assume the components are independent.

We define the event $S_i$ = the ith component is within specification and $\bar{S}_i$ = the ith component is not within specification, where i = 1, 2, 3.

One can calculate

$P(S_1 S_2 S_3) = P(S_1) P(S_2) P(S_3) = 0.9 \times 0.95 \times 0.99 = 0.84645$

$P(S_1 \bar{S}_2 \bar{S}_3) = 0.9 \times (1 - 0.95) \times (1 - 0.99) = 0.00045$

. . .

Example: Mechanical Components

Possible outcomes for one assembly are the eight combinations of $S_i$ or $\bar{S}_i$ for i = 1, 2, 3.

Let Y = number of components within specification in a randomly chosen assembly.

Remark: We usually denote the realized value of a random variable by the corresponding lowercase letter.

Probability Mass Functions of Discrete Variables

Definition: Let X be a discrete random variable defined on some sample space S. The probability mass function (pmf) associated with X is defined to be

$p_X(x) = P(X = x)$.

A pmf p(x) for a discrete random variable X satisfies the following:

$0 \le p(x) \le 1$, for all possible values of x.

The sum of the probabilities, taken over all possible values of X, must equal 1; i.e., $\sum_{\text{all } x} p_X(x) = 1$.

Cumulative Distribution Function

Definition: Let Y be a random variable. The cumulative distribution function (cdf) of Y is defined as

$F_Y(y) = P(Y \le y)$.

$F_Y(y) = P(Y \le y)$ is read, ‘the probability that the random variable Y is less than or equal to the value y.’

Properties of the cumulative distribution function:

$\lim_{y \to -\infty} F_Y(y) = 0$, $\lim_{y \to \infty} F_Y(y) = 1$.

Right continuous, i.e., $\lim_{y \to a^+} F_Y(y) = F_Y(a)$.

Non-decreasing, i.e., $y_1 \le y_2 \Rightarrow F_Y(y_1) \le F_Y(y_2)$.

Example: Mechanical Components Revisited

In the Mechanical Components example, Y = number of components within specification in a randomly chosen assembly. We have the following pmf:

And the cdf is?
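The pmf and cdf for this example can be obtained by enumerating the eight outcomes and multiplying probabilities under the independence assumption. The sketch below is one possible Python illustration, not the slide's own table; the names `p_spec`, `pmf`, and `cdf` are hypothetical.

```python
from itertools import product

# Probabilities that components 1, 2, 3 meet specification (from the example).
p_spec = [0.90, 0.95, 0.99]

# Enumerate the 8 outcomes; 1 = within specification, 0 = not.
pmf = {y: 0.0 for y in range(4)}
for outcome in product([0, 1], repeat=3):
    prob = 1.0
    for ok, p in zip(outcome, p_spec):
        prob *= p if ok else (1 - p)   # independence: multiply
    pmf[sum(outcome)] += prob          # Y = number within specification

def cdf(y: float) -> float:
    """F_Y(y) = P(Y <= y): add the pmf over all values not exceeding y."""
    return sum(prob for value, prob in pmf.items() if value <= y)

for y in range(4):
    print(y, round(pmf[y], 5), round(cdf(y), 5))
```

Because Y is discrete, the cdf is a step function: `cdf(1.5)` equals `cdf(1)`, and the jumps occur exactly at y = 0, 1, 2, 3.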


Example: pmf plot of Mechanical Components

[Figure: pmf plot for the Mechanical Components example; horizontal axis y (0.0 to 3.0), vertical axis pmf (0.0 to 0.8).]

Example: cdf plot of Mechanical Components

[Figure: cdf plot for the Mechanical Components example; horizontal axis y (−1 to 4), vertical axis cdf (0.0 to 1.0).]

Example 3.9

Consider whether the next person buying a computer at a certain electronics store buys a laptop or a desktop model. We denote X = 1 if the customer purchases a desktop and X = 0 if the customer purchases a laptop. Suppose that 20% of all purchasers during that week select a desktop.

Specify the pmf and cdf of X.

Expected Value of a Random Variable

Definition: Let Y be a discrete random variable with pmf $p_Y(y)$. The expected value or expectation of Y is defined as

$\mu = E(Y) = \sum_{\text{all } y} y\, p_Y(y)$.

The expected value for a discrete random variable Y is simply a weighted average of the possible values of Y. Each value y is weighted by its probability $p_Y(y)$.

In statistical applications, $\mu = E(Y)$ is commonly called the population mean.

Expectation and Mean

Expectation

$\mu = E(Y) = \sum_{\text{all } y} y\, p_Y(y)$

Arithmetic mean when there are n observations, $x_1, \ldots, x_n$

$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} x_i$

Why do we say ‘mean’ for expectation of a random variable?

The expected value can be interpreted as the long-run average.

Expected Value: Mechanical Components

The pmf of Y in the mechanical components example is

The expected value of Y is

$\mu = E(Y) = \sum_{\text{all } y} y\, p_Y(y) = 2.84$

Interpretation: On average, we would expect 2.84 components within specification in a randomly chosen assembly.
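The value 2.84 can be reproduced as a weighted average. This is a minimal Python sketch under one assumption: the pmf values below are computed from the component probabilities 0.90, 0.95, 0.99 with independence, since the pmf table itself does not survive in the transcript. The function name `expectation` is hypothetical.

```python
# pmf of Y (number of components within specification), derived from
# 0.90, 0.95, 0.99 under independence; an assumption of this sketch.
pmf = {0: 0.00005, 1: 0.00635, 2: 0.14715, 3: 0.84645}

def expectation(pmf: dict) -> float:
    """E(Y) = sum over all y of y * p_Y(y)."""
    return sum(y * p for y, p in pmf.items())

mu = expectation(pmf)
print(round(mu, 2))
```

The same number also equals 0.90 + 0.95 + 0.99, since Y is a sum of three indicator variables.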

Example 3.9: with p(0) = 0.8 and p(1) = 0.2, $E(X) = 0 \times 0.8 + 1 \times 0.2 = 0.2$.

Expected Value of Functions of Y

Theorem: Let Y be a discrete random variable with pmf $p_Y(y)$. Let g be a real-valued function defined on the range of Y. The expected value or expectation of g(Y) is defined as

$E\{g(Y)\} = \sum_{\text{all } y} g(y)\, p_Y(y)$.

Interpretation: The expectation $E\{g(Y)\}$ can be viewed as the weighted average of the function g(y) when evaluated at the random variable Y.

Example 3.9 with $g(X) = X^2$, p(0) = 0.8 and p(1) = 0.2:

$E(X) = 0 \times 0.8 + 1 \times 0.2 = 0.2$

$E(X^2) = 0^2 \times 0.8 + 1^2 \times 0.2 = 0.2$

Properties of Expectation

Let Y be a discrete random variable with pmf $p_Y(y)$. Suppose that $g_1, g_2, \ldots, g_k$ are real-valued functions, and let c be a real constant. The expectation of Y satisfies the following properties:

1 $E(c) = c$.

2 $E(cY) = cE(Y)$.

3 Linearity: $E\left\{\sum_{j=1}^{k} g_j(Y)\right\} = \sum_{j=1}^{k} E\{g_j(Y)\}$.

Remark: The 2nd and 3rd rules together imply that

$E\left\{\sum_{j=1}^{k} c_j g_j(Y)\right\} = \sum_{j=1}^{k} c_j E\{g_j(Y)\}$,

for constants $c_j$.

$E(c) = \sum_{\text{all } y} c\, p_Y(y) = c \sum_{\text{all } y} p_Y(y) = c$.

$E(cY) = \sum_{\text{all } y} c\, y\, p_Y(y) = c \sum_{\text{all } y} y\, p_Y(y) = cE(Y)$.

The sum in the remark is a linear combination of $g_1, \ldots, g_k$.

Example: Mechanical Components

In the Mechanical Components example, Y = number of components within specification in a randomly chosen assembly. We found E(Y) = 2.84. Suppose that the cost (in dollars) to repair the assembly is given by the cost function $g(Y) = (3 - Y)^2$. What is the expected cost to repair an assembly?

Interpretation:
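One way to evaluate the expected repair cost is to apply $E\{g(Y)\} = \sum g(y)\, p_Y(y)$ directly. The following Python sketch is an illustration only: the pmf values are computed from the component probabilities 0.90, 0.95, 0.99 under independence (they are not printed in the surviving slide text), and the helper name `expect_g` is hypothetical.

```python
# pmf of Y derived from the component probabilities; an assumption of this sketch.
pmf = {0: 0.00005, 1: 0.00635, 2: 0.14715, 3: 0.84645}

def expect_g(pmf: dict, g) -> float:
    """E{g(Y)} = sum over all y of g(y) * p_Y(y)."""
    return sum(g(y) * p for y, p in pmf.items())

def cost(y: int) -> float:
    """Repair cost in dollars, g(y) = (3 - y)^2."""
    return (3 - y) ** 2

expected_cost = expect_g(pmf, cost)
cost_at_mean = cost(expect_g(pmf, lambda y: y))  # g(E(Y)), for comparison

print(round(expected_cost, 4), round(cost_at_mean, 4))
```

Under these assumptions the expected cost is about 0.173 dollars per assembly in the long run, which differs from the cost evaluated at the mean, $(3 - 2.84)^2 = 0.0256$. In general $E\{g(Y)\}$ is not $g(E(Y))$.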


Example: Project Management

A project manager for an engineering firm submitted bids on three projects. The following table summarizes the firm’s chances of having the three projects accepted.

Project                A     B     C
Prob. of Acceptance    0.3   0.8   0.1

Assuming the projects are independent of one another, what is the probability that the firm will have all three projects accepted?

What is the probability of having at least one project accepted?

$P(A \cap B \cap C) = P(A)P(B)P(C) = 0.3 \times 0.8 \times 0.1 = 0.024$

$P(\text{at least one accepted}) = 1 - P(\text{all rejected}) = 1 - P(A' \cap B' \cap C') = 1 - 0.7 \times 0.2 \times 0.9 = 0.874$

Example: Project Management

Let Y = number of projects accepted. Fill in the following table and calculate the expected number of projects accepted.

For example, the outcome in which only project C is accepted has probability $(1 - 0.3) \times (1 - 0.8) \times 0.1 = 0.014$, and summing the three outcomes with exactly one acceptance gives $P(Y = 1) = 0.572$.
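The table for Y can be filled in mechanically by listing all accept/reject combinations. The sketch below is a hedged Python illustration of that bookkeeping; the names `p_accept`, `pmf`, and `expected_accepted` are assumptions of this example, not notation from the slides.

```python
from itertools import product

# Acceptance probabilities for projects A, B, C (from the table in the example).
p_accept = [0.3, 0.8, 0.1]

# pmf of Y = number of projects accepted, assuming independence.
pmf = {y: 0.0 for y in range(4)}
for outcome in product([0, 1], repeat=3):      # 1 = accepted, 0 = rejected
    prob = 1.0
    for accepted, p in zip(outcome, p_accept):
        prob *= p if accepted else (1 - p)
    pmf[sum(outcome)] += prob

expected_accepted = sum(y * p for y, p in pmf.items())
at_least_one = 1 - pmf[0]

for y in range(4):
    print(y, round(pmf[y], 3))
print(round(expected_accepted, 3), round(at_least_one, 3))
```

The expected number of accepted projects agrees with 0.3 + 0.8 + 0.1, which follows from linearity of expectation.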

Variance of a Random Variable

Definition: Let Y be a discrete random variable with pmf $p_Y(y)$ and expected value $E(Y) = \mu$. The variance of Y is defined as

$\sigma^2 \equiv \mathrm{var}(Y) \equiv E\left[(Y - \mu)^2\right] = \sum_{\text{all } y} (y - \mu)^2\, p_Y(y)$.

Warning: Variance is always non-negative!

Definition: The standard deviation of Y is the positive square root of the variance:

$\sigma = \sqrt{\sigma^2} = \sqrt{\mathrm{var}(Y)}$.

Remarks on Variance ($\sigma^2$)

Suppose Y is a random variable with mean $\mu$ and variance $\sigma^2$.

The variance is the average squared distance from the mean $\mu$.

The variance of a random variable Y is a measure of dispersion or scatter in the possible values for Y.

The larger (smaller) $\sigma^2$ is, the more (less) spread there is in the possible values of Y about the population mean E(Y).

Computing Formula:

$\mathrm{var}(Y) = E[(Y - \mu)^2] = E[Y^2] - [E(Y)]^2$.

$E[(Y - \mu)^2] = E(Y^2 - 2\mu Y + \mu^2) = E(Y^2) - 2\mu E(Y) + \mu^2 = E(Y^2) - \mu^2$, since $E(Y) = \mu$ is a constant.

Properties of Variance ($\sigma^2$)

Suppose X and Y are random variables with finite variance. Let c be a constant.

$\mathrm{var}(c) = 0$.

$\mathrm{var}(cY) = c^2\, \mathrm{var}(Y)$.

Suppose X and Y are independent. Then

$\mathrm{var}(X + Y) = \mathrm{var}(X) + \mathrm{var}(Y)$.

Question: $\mathrm{var}(X - Y) = \mathrm{var}(X) - \mathrm{var}(Y)$. True or false?

Recall $E(cY) = cE(Y)$ and $E(c) = c$.

$\mathrm{var}(c) = E[(c - E(c))^2] = 0$.

$\mathrm{var}(cY) = E[(cY - E(cY))^2] = E[c^2 (Y - E(Y))^2] = c^2\, \mathrm{var}(Y)$.

If X and Y are independent, $\mathrm{var}(X - Y) = \mathrm{var}(X) + \mathrm{var}(-Y) = \mathrm{var}(X) + \mathrm{var}(Y)$, so the statement in the question is false.

Example 3.25

The pmf of the number X of DVDs checked out is given by p(X = 1) = 0.3, p(X = 2) = 0.25, p(X = 3) = 0.15, p(X = 4) = 0.05, p(X = 5) = 0.1, and p(X = 6) = 0.15.

Compute E (X ) and var(X ).

$E(X) = 1 \times 0.3 + 2 \times 0.25 + \cdots + 6 \times 0.15 = 2.85$

$E(X^2) = 1^2 \times 0.3 + 2^2 \times 0.25 + \cdots + 6^2 \times 0.15 = 11.35$

$\mathrm{var}(X) = E(X^2) - \{E(X)\}^2 = 11.35 - 2.85^2 = 3.2275$
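The computing formula and the definition of variance should give the same number, which is easy to confirm numerically. This is a small Python sketch using the pmf stated in Example 3.25; the variable names are chosen here for illustration.

```python
# pmf of X = number of DVDs checked out (values from Example 3.25).
pmf = {1: 0.3, 2: 0.25, 3: 0.15, 4: 0.05, 5: 0.1, 6: 0.15}

mean = sum(x * p for x, p in pmf.items())               # E(X)
second_moment = sum(x ** 2 * p for x, p in pmf.items()) # E(X^2)

# Computing formula: var(X) = E(X^2) - [E(X)]^2
var_shortcut = second_moment - mean ** 2

# Definition: var(X) = sum of (x - mu)^2 * p(x)
var_definition = sum((x - mean) ** 2 * p for x, p in pmf.items())

print(round(mean, 4), round(second_moment, 4), round(var_shortcut, 4))
```

Both routes agree up to floating-point rounding; the computing formula usually needs fewer arithmetic steps by hand.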

Bernoulli Trial

Definition: Many experiments can be envisioned as consisting of a sequence of “trials.” A trial is called a Bernoulli trial if

1 a trial results in a “success” or “failure”

2 trials are independent

3 the probability of a “success” in each trial, denoted as p, 0 < p < 1, is the same on every trial, i.e., identical.

Example

When circuit boards used in the manufacture of Blu-ray players are tested, the long-run percentage of defective boards is 5 percent.

trial = to test a circuit board

success = the board is defective

p = 0.05

Ninety-eight percent of all air traffic radar signals are correctly interpreted the first time they are transmitted.

trial = to interpret a radar signal

success = the signal is correctly interpreted

p = 0.98

Expectation and Variance of Bernoulli Trial

Usually, a success is recorded as “1” and a failure as “0”.

The probability of success is the probability of observing “1”.

Let p = P(X = 1) and q = 1 − p = P(X = 0).

We have $E(X) = p$ and $\mathrm{var}(X) = p(1 - p) = pq$.

This random variable X is called a Bernoulli random variable.

Notation: $X \sim \mathrm{Bernoulli}(p)$.

Binomial Distribution

Suppose that n Bernoulli trials are performed. Define

Y = the number of successes (out of n trials performed).

The random variable Y has a binomial distribution with number of trials n and success probability p. It is written as $Y \sim B(n, p)$. Here n and p are the parameters, and $\sim$ reads “is distributed as.”

Suppose $X_1, \ldots, X_n$ are i.i.d. random variables with $X_i \sim \mathrm{Bernoulli}(p)$. Then $Y = X_1 + \cdots + X_n \sim B(n, p)$.

Example: Water Filters

A manufacturer of water filters for refrigerators monitors the process for defective filters. Historically, this process averages 5% defective filters. Five filters are randomly selected.

Find the probability that no filters are defective.

Find the probability that exactly 1 filter is defective.

Find the probability that exactly 2 filters are defective.

Can you see the pattern?

D = {defective}, N = {not defective}.

No defective filters (NNNNN): $\binom{5}{0} \times 0.95^5 \approx 0.774$.

Exactly 1 defective filter: $\binom{5}{1} \times 0.05 \times 0.95^4 = 5 \times 0.05 \times 0.95^4 \approx 0.204$.

Exactly 2 defective filters: $\binom{5}{2} \times 0.05^2 \times 0.95^3 \approx 0.021$.
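The pattern in the water filter probabilities is "number of arrangements, times the probability of one arrangement." A minimal Python sketch of that pattern follows; `binom_pmf` is a helper name introduced here, built on the standard library's `math.comb`.

```python
from math import comb

def binom_pmf(y: int, n: int, p: float) -> float:
    """P(Y = y) for Y ~ B(n, p): (n choose y) * p^y * (1 - p)^(n - y)."""
    if y < 0 or y > n:
        return 0.0
    return comb(n, y) * p ** y * (1 - p) ** (n - y)

n, p = 5, 0.05   # five filters, 5% defective
probs = [binom_pmf(y, n, p) for y in range(n + 1)]

for y, prob in enumerate(probs):
    print(y, round(prob, 6))
```

The factor `comb(n, y)` counts the positions in which the y defective filters can occur, and the remaining factors give the probability of any single such arrangement.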

The pmf of Binomial Random Variable

Suppose $Y \sim B(n, p)$. The pmf of Y is given by

$p(y) = \begin{cases} \binom{n}{y} p^y (1 - p)^{n - y}, & \text{for } y = 0, 1, 2, \ldots, n, \\ 0, & \text{otherwise.} \end{cases}$

Recall that $\binom{n}{r}$ is the number of ways to choose r distinct unordered objects from n distinct objects:

$\binom{n}{r} = \frac{n!}{r!\,(n - r)!}$.

Expectation and Variance of Binomial Random Variable

If $Y \sim B(n, p)$, then

$E(Y) = np$

$\mathrm{var}(Y) = np(1 - p)$.

Directly from the pmf, $E(Y) = \sum_{\text{all } y} y\, p_Y(y) = \sum_{y=0}^{n} y \binom{n}{y} p^y (1 - p)^{n - y} = np$.

Alternatively, write $Y = X_1 + \cdots + X_n$ with $X_1, \ldots, X_n$ i.i.d. $\mathrm{Bernoulli}(p)$. Then $E(Y) = E(X_1) + E(X_2) + \cdots + E(X_n) = p + p + \cdots + p = np$.

Example 3.31 (Using Binomial Tables)

Suppose that 20% of all copies of a particular textbook fail a certain binding strength test. Let X denote the number among 15 randomly selected copies that fail the test.

The probability that at most 8 fail the test is

The probability that exactly 8 fail is

The probability that between 4 and 7, inclusive, fail is

$X \sim B(n = 15, p = 0.2)$, so $P(X \le 8) = F_X(8)$.
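In place of a printed binomial table, the three probabilities in Example 3.31 can be computed from the pmf and cdf. The Python sketch below is an illustration under the example's stated values n = 15 and p = 0.2; `binom_pmf` and `binom_cdf` are helper names introduced here, not library functions.

```python
from math import comb

def binom_pmf(y: int, n: int, p: float) -> float:
    """P(X = y) for X ~ B(n, p)."""
    if y < 0 or y > n:
        return 0.0
    return comb(n, y) * p ** y * (1 - p) ** (n - y)

def binom_cdf(y: int, n: int, p: float) -> float:
    """P(X <= y): sum of the pmf from 0 up to y."""
    return sum(binom_pmf(k, n, p) for k in range(0, y + 1))

n, p = 15, 0.2

at_most_8 = binom_cdf(8, n, p)                           # P(X <= 8)
exactly_8 = binom_cdf(8, n, p) - binom_cdf(7, n, p)      # P(X = 8)
between_4_and_7 = binom_cdf(7, n, p) - binom_cdf(3, n, p)  # P(4 <= X <= 7)

print(round(at_most_8, 3), round(exactly_8, 3), round(between_4_and_7, 3))
```

Note the off-by-one in the last line: "between 4 and 7, inclusive" subtracts the cdf at 3, not at 4.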

Binomial R function

We can use R to calculate the pmf and cdf of the binomial distribution directly by using

$p_Y(y) = P(Y = y)$ = dbinom(y, n, p)

and

$F_Y(y) = P(Y \le y)$ = pbinom(y, n, p).

Example: Radon Levels

Historically, 10% of homes in Florida have radon levels higher than that recommended by the EPA. Assume that the radon level for each home is independent of the others. 20 homes are randomly selected.

Find the probability that exactly 3 have radon levels higher than the EPA recommendation.

What is the probability that no more than 5 homes out of the sample have a higher radon level than recommended?

Example: Radon Levels

What is the probability that at least 5 homes out of the sample have a higher radon level than recommended?

What is the probability that 2 to 8 homes out of the sample have a higher radon level than recommended?

What is the probability that fewer than 2 or more than 8 homes out of the sample have a higher radon level than recommended?

$P(Y < 2 \text{ or } Y > 8) = 1 - P(2 \le Y \le 8) \approx 0.392$
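Each radon question reduces to a combination of pmf and cdf values for $Y \sim B(20, 0.1)$. The following Python sketch mirrors what `dbinom` and `pbinom` would return in R, using helper functions written for this example; the helper and variable names are assumptions, not part of the slides.

```python
from math import comb

def binom_pmf(y: int, n: int, p: float) -> float:
    """P(Y = y) for Y ~ B(n, p); plays the role of dbinom(y, n, p)."""
    if y < 0 or y > n:
        return 0.0
    return comb(n, y) * p ** y * (1 - p) ** (n - y)

def binom_cdf(y: int, n: int, p: float) -> float:
    """P(Y <= y) for Y ~ B(n, p); plays the role of pbinom(y, n, p)."""
    return sum(binom_pmf(k, n, p) for k in range(0, y + 1))

n, p = 20, 0.1

exactly_3 = binom_pmf(3, n, p)                          # P(Y = 3)
no_more_than_5 = binom_cdf(5, n, p)                     # P(Y <= 5)
at_least_5 = 1 - binom_cdf(4, n, p)                     # P(Y >= 5)
from_2_to_8 = binom_cdf(8, n, p) - binom_cdf(1, n, p)   # P(2 <= Y <= 8)
outside_2_to_8 = 1 - from_2_to_8                        # P(Y < 2 or Y > 8)

for name, value in [("exactly 3", exactly_3), ("<= 5", no_more_than_5),
                    (">= 5", at_least_5), ("2 to 8", from_2_to_8),
                    ("<2 or >8", outside_2_to_8)]:
    print(name, round(value, 4))
```

The recurring step is translating words into cdf arguments: "at least 5" is the complement of "at most 4", and "2 to 8" subtracts the cdf at 1.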

Geometric Distribution

The geometric distribution also arises in experiments involving Bernoulli trials:

each trial results in a “success” or “failure”

the trials are independent

the probability of a “success” in each trial, denoted as p, 0 < p < 1, is the same on every trial.

Definition: Suppose that Bernoulli trials are continually observed. Define

Y = the number of trials to observe the first success.

We say that Y has a geometric distribution with success probability p. Notation: $Y \sim \mathrm{geom}(p)$.

The pmf of Geometric Distribution

If $Y \sim \mathrm{geom}(p)$, then the probability mass function (pmf) of Y is given by

$p_Y(y) = \begin{cases} (1 - p)^{y - 1} p, & y = 1, 2, 3, \ldots \\ 0, & \text{otherwise.} \end{cases}$

Here y counts the number of trials: the first y − 1 trials are failures (F) and the last one is a success, with p fixed.

The pmf sums to 1: $\sum_{y=1}^{\infty} (1 - p)^{y - 1} p = p \sum_{y=1}^{\infty} (1 - p)^{y - 1} = p \cdot \frac{1}{1 - (1 - p)} = 1$, because $0 \le 1 - p < 1$.

Mean and Variance of Geometric Distribution

Suppose $Y \sim \mathrm{geom}(p)$. The expected value of Y is

$E(Y) = \frac{1}{p}$.

The variance of Y is

$\mathrm{var}(Y) = \frac{1 - p}{p^2}$.

Example: Wafer

The probability that a wafer contains a large particle of contamination is 0.01. It is assumed that the wafers are independent.

What is the probability that exactly 125 wafers need to be analyzed until a large particle is detected?

How many wafers need to be analyzed until a large particle is detected on average?

$X \sim \mathrm{Geom}(0.01)$

$P(X = 125) = 0.01 \times (1 - 0.01)^{125 - 1} \approx 0.0029$

$E(X) = \frac{1}{0.01} = 100$
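The wafer calculation uses the "number of trials" form of the geometric pmf. A short Python sketch, with the helper name `geom_pmf` and the truncation point 5000 chosen here purely for illustration:

```python
def geom_pmf(y: int, p: float) -> float:
    """P(Y = y) when Y counts trials up to and including the first success."""
    if y < 1:
        return 0.0
    return (1 - p) ** (y - 1) * p

p = 0.01   # probability a wafer contains a large particle

prob_125 = geom_pmf(125, p)   # first detection on wafer 125
mean_trials = 1 / p           # E(Y) = 1/p

# Numerical check of E(Y): truncate the infinite sum where the tail is negligible.
approx_mean = sum(y * geom_pmf(y, p) for y in range(1, 5001))

print(round(prob_125, 4), mean_trials, round(approx_mean, 4))
```

The truncated sum is only a sanity check on the formula E(Y) = 1/p; the closed form is what one would normally use.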

Example: Fruit Fly

Biology students are checking the eye color of fruit flies. For each fly, the probability of observing white eyes is p = 0.25. We interpret

trial: to identify the eye color of a fruit fly

success: the fly has white eyes

p = P(white eyes) = 0.25

If the Bernoulli trial assumptions hold (independent and identical), then

Y = the number of flies needed to find the first white-eyed fly $\sim \mathrm{geom}(p = 0.25)$

Example: Fruit Fly

What is the probability that the first white-eyed fly is observed on the fifth trial?

$P(Y = 5) = 0.25 \times (1 - 0.25)^{5 - 1} \approx 0.079$

What is the probability that the first white-eyed fly is observed before the fourth fly is examined?

$P(Y < 4) = P(Y \le 3) = P(Y = 1) + P(Y = 2) + P(Y = 3) \approx 0.578$

Geometric Distribution in a Different View

We consider a random variable X as the number of failures until the first success. Then, the pmf of X is

$p_X(x) = \begin{cases} p(1 - p)^x, & x = 0, 1, 2, \ldots \\ 0, & \text{otherwise.} \end{cases}$

Mean and Variance

$E(X) = \frac{1 - p}{p}$.

$\mathrm{var}(X) = \frac{1 - p}{p^2}$.

What is the relationship between X and Y?

$X = Y - 1$, so $E(X) = E(Y) - 1 = \frac{1 - p}{p}$ and $\mathrm{var}(X) = \mathrm{var}(Y + c) = \mathrm{var}(Y)$ with $c = -1$.

Geometric Distribution pmf & cdf R Code

We can use R to calculate the pmf and cdf of the geometric distribution directly by using

$p_Y(y) = P(Y = y)$ = dgeom(y-1, p)

and

$F_Y(y) = P(Y \le y)$ = pgeom(y-1, p),

i.e., the first argument is “the number of failures.”

Negative Binomial Distribution

The negative binomial distribution also arises in experiments involving Bernoulli trials:

each trial results in a “success” or “failure”;

the trials are independent;

the probability of a “success” in each trial, denoted as p, 0 < p < 1, is the same on every trial.

Definition: Suppose that Bernoulli trials are continually observed. Define

Y = the number of trials to observe the r-th success.

We say that Y has a negative binomial distribution with waiting parameter r and success probability p. Notation: $Y \sim NB(r, p)$.

Negative Binomial Distribution

The negative binomial distribution is a generalization of the geometric distribution. If r = 1, then NB(1, p) = geom(p).

If $Y \sim NB(r, p)$, then the probability mass function of Y is given by

$p_Y(y) = \begin{cases} \binom{y - 1}{r - 1} p^r (1 - p)^{y - r}, & y = r, r + 1, r + 2, \ldots \\ 0, & \text{otherwise.} \end{cases}$

Mean and Variance:

$E(Y) = \frac{r}{p}$

$\mathrm{var}(Y) = \frac{r(1 - p)}{p^2}$

Example Revisited: Fruit Fly

Biology students are checking the eye color of fruit flies. For each fly, the probability of observing white eyes is p = 0.25. What is the probability that the third white-eyed fly is observed on the tenth fly checked?

Y = number of trials, $Y \sim NB(r = 3, p = 0.25)$.

$P(Y = 10) = \binom{9}{2} \times 0.25^3 \times (1 - 0.25)^7 \approx 0.0751$
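The fruit fly probability follows from the "number of trials" form of the negative binomial pmf. Below is a minimal Python sketch; `nbinom_pmf` is a hypothetical helper written for this example, not a library call.

```python
from math import comb

def nbinom_pmf(y: int, r: int, p: float) -> float:
    """P(Y = y) when Y counts trials needed to observe the r-th success."""
    if y < r:
        return 0.0
    # r - 1 successes among the first y - 1 trials, then a success on trial y.
    return comb(y - 1, r - 1) * p ** r * (1 - p) ** (y - r)

p = 0.25
third_on_tenth = nbinom_pmf(10, 3, p)

# With r = 1 the formula reduces to the geometric pmf (1 - p)^(y - 1) * p.
geometric_case = nbinom_pmf(5, 1, p)

print(round(third_on_tenth, 4), round(geometric_case, 4))
```

The binomial coefficient uses y − 1 rather than y because the last trial is forced to be the r-th success.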

Negative Binomial Distribution (Textbook Version)

The alternative formulation of the NB distribution is to count the number of failures until the r-th success.

X = the number of failures to observe the r-th success.

If $X \sim nb(r, p)$, then the probability mass function of X is given by

$p_X(x) = \begin{cases} \binom{x + r - 1}{r - 1} p^r (1 - p)^x, & x = 0, 1, 2, \ldots \\ 0, & \text{otherwise.} \end{cases}$

Mean and Variance:

$E(X) = \frac{r(1 - p)}{p}$

$\mathrm{var}(X) = \frac{r(1 - p)}{p^2}$

With Y = number of trials, $E(X) = E(Y) - r = \frac{r}{p} - r = \frac{r(1 - p)}{p}$ and $\mathrm{var}(X) = \mathrm{var}(Y)$.

Negative Binomial Distribution (Textbook Version)

Is there any relationship between X and Y ?

$Y = X + r$.

Negative Binomial Distribution pmf & cdf R Code

We can use R to calculate the pmf and cdf of the negative binomial distribution directly by using

$p_Y(y) = P(Y = y)$ = dnbinom(y-r, r, p)

and

$F_Y(y) = P(Y \le y)$ = pnbinom(y-r, r, p),

i.e., the first argument is “the number of failures.”

Example Re-Re-visited: Fruit Fly

Biology students are checking the eye color of fruit flies. For each fly, the probability of observing white eyes is p = 0.25. What is the probability that the third white-eyed fly is observed on the tenth fly checked? We denote by X the number of failures before the third success in 10 trials.

(1) Y = number of trials, r = 3: the event is {Y = 10}.

(2) X = number of failures to get 3 successes: the event is {X = 7}, and $P(X = 7) = \binom{7 + 3 - 1}{3 - 1} \times 0.25^3 \times 0.75^7 = P(Y = 10)$.

Hypergeometric Distribution

Suppose we have a bowl containing N balls, r of which are red and N − r of which are black. We draw n (< N) balls out (without replacement). Our r.v. of interest Y is the number of sampled balls that are red. Then Y has a hypergeometric distribution. Define

Y = the number of successes (out of the n selected).

We say that Y has a hypergeometric distribution and write $Y \sim \mathrm{hyper}(N, n, r)$. The pmf of Y is given by

$p(y) = \begin{cases} \dfrac{\binom{r}{y}\binom{N - r}{n - y}}{\binom{N}{n}}, & y \le r \text{ and } n - y \le N - r \\ 0, & \text{otherwise,} \end{cases}$

where y = 0, 1, 2, . . . , n.

Mean and Variance of Hypergeometric Distribution

If $Y \sim \mathrm{hyper}(N, n, r)$, then

$E(Y) = n\left(\frac{r}{N}\right)$

$\mathrm{var}(Y) = n\left(\frac{r}{N}\right)\left(\frac{N - r}{N}\right)\left(\frac{N - n}{N - 1}\right)$.

Example: Parts

A batch of parts contains 100 parts from a local supplier of tubing and 200 parts from a supplier of tubing in the next state. If four parts are selected randomly and without replacement, what is the probability they are all from the local supplier?

What is the probability that two or more parts in the sample are from the local supplier?

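With Y = number of sampled parts from the local supplier, $Y \sim \mathrm{hyper}(N = 300, n = 4, r = 100)$.

$P(Y = 4) = \binom{100}{4}\binom{200}{0} \Big/ \binom{300}{4} \approx 0.0119$

$P(Y \ge 2) = 1 - P(Y \le 1) = 1 - P(Y = 0) - P(Y = 1) \approx 0.407$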
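The parts example can be evaluated exactly with binomial coefficients. The Python sketch below is an illustration; `hyper_pmf` is a helper name introduced here, with arguments ordered (y, N, n, r) to match the hyper(N, n, r) notation.

```python
from math import comb

def hyper_pmf(y: int, N: int, n: int, r: int) -> float:
    """P(Y = y): y successes in a sample of n drawn without replacement
    from N items of which r are successes."""
    if y < 0 or y > n or y > r or n - y > N - r:
        return 0.0
    return comb(r, y) * comb(N - r, n - y) / comb(N, n)

N, n, r = 300, 4, 100   # 300 parts in total, sample of 4, 100 from the local supplier

all_local = hyper_pmf(4, N, n, r)
two_or_more = 1 - hyper_pmf(0, N, n, r) - hyper_pmf(1, N, n, r)
mean = sum(y * hyper_pmf(y, N, n, r) for y in range(n + 1))

print(round(all_local, 4), round(two_or_more, 4), round(mean, 4))
```

The computed mean matches n(r/N) = 4 × 100/300, which is a quick consistency check on the pmf.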

Example: Generator

If a shipment of 100 generators contains 5 faulty generators, what is the probability that we select 10 generators from the shipment and get a faulty one?

Example 3.35

Five individuals from an animal population thought to be near extinction in a certain region have been caught, tagged, and released to mix into the population. After they have had an opportunity to mix, a random sample of 10 of these animals is selected. Let X be the number of tagged animals in the second sample. Suppose there are actually 25 animals of this type in the region.

What is the probability that exactly two of the animals in the second sample are tagged?

Poisson Distribution

The Poisson distribution is commonly used to model counts, such as

the number of customers entering a post office in a given hour

the number of α-particles discharged from a radioactive substance in one second

the number of machine breakdowns per month

the number of insurance claims received per day

the number of defects on a piece of raw material.

Poisson Process

Let the number of occurrences in a given continuous interval of time or space be counted. A Poisson process enjoys the following properties:

The numbers of occurrences in non-overlapping intervals are independent random variables.

The probability of an occurrence in a sufficiently short interval is proportional to the length of the interval.

The probability of two or more occurrences in a sufficiently short interval is essentially zero.

For example, suppose that on average there are two calls in an hour. Then on average there is one call in 30 minutes.

Poisson Distribution

The Poisson distribution can be used to model the number of events occurring in continuous time or space.

Let $\lambda$ be the average number of occurrences per unit interval. Let Y = the number of “occurrences” in a unit interval of time (or space). Suppose the Poisson distribution is adequate to describe Y. Then, the pmf of Y is given by

$p_Y(y) = \begin{cases} \dfrac{\lambda^y e^{-\lambda}}{y!}, & y = 0, 1, 2, \ldots \\ 0, & \text{otherwise.} \end{cases}$

The shorthand notation is $Y \sim \mathrm{Poisson}(\lambda)$.

Mean and Variance:

$E(Y) = \lambda$

$\mathrm{var}(Y) = \lambda$

$E(Y) = \sum_{y=0}^{\infty} y\, \frac{\lambda^y e^{-\lambda}}{y!} = \lambda e^{-\lambda} \sum_{y=1}^{\infty} \frac{\lambda^{y-1}}{(y-1)!} = \lambda e^{-\lambda} \sum_{t=0}^{\infty} \frac{\lambda^{t}}{t!} = \lambda e^{-\lambda} e^{\lambda} = \lambda$, using the substitution $t = y - 1$.

Example: Insulated Wire

“Historically the process has averaged 2.6 breaks in the insulation per 1000 meters of wire” implies that the average number of occurrences is $\lambda = 2.6$, and the base unit is 1000 meters. Let X = number of breaks in 1000 meters of wire. So $X \sim \mathrm{Poisson}(2.6)$. We want to find the probability that 1000 meters of wire will have 1 or fewer breaks in insulation.

$P(X \le 1) = P(X = 0) + P(X = 1) = \frac{e^{-2.6} \times 2.6^0}{0!} + \frac{e^{-2.6} \times 2.6^1}{1!} \approx 0.267$
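The insulated wire probability, and the effect of changing the length of wire inspected, can be checked with a few lines of Python. This is a sketch under the example's rate of 2.6 breaks per 1000 meters; `poisson_pmf` and `poisson_cdf` are helper names introduced here.

```python
from math import exp, factorial

def poisson_pmf(y: int, lam: float) -> float:
    """P(Y = y) for Y ~ Poisson(lam)."""
    if y < 0:
        return 0.0
    return lam ** y * exp(-lam) / factorial(y)

def poisson_cdf(y: int, lam: float) -> float:
    """P(Y <= y): sum of the pmf from 0 up to y."""
    return sum(poisson_pmf(k, lam) for k in range(0, y + 1))

rate_per_1000m = 2.6

# 1000 meters of wire: lambda = 2.6
at_most_1_break = poisson_cdf(1, rate_per_1000m)

# The rate scales with the length inspected.
lam_2000m = rate_per_1000m * 2      # 2000 meters
lam_500m = rate_per_1000m * 0.5     # 500 meters

print(round(at_most_1_break, 3), lam_2000m, lam_500m)
```

Scaling the rate with the interval length is what the Poisson process assumptions justify: the expected count is proportional to the length of wire.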

Example: Insulated Wire

$X \sim \mathrm{Poisson}(\lambda = 2.6)$ per 1000 meters of wire.

If we were inspecting 2000 meters of wire, what distribution does Y follow?

$Y \sim \mathrm{Poisson}(\lambda = 2.6 \times 2 = 5.2)$

If we were inspecting 500 meters of wire, what distribution does Y follow?

$Y \sim \mathrm{Poisson}(\lambda = 2.6 \times \frac{1}{2} = 1.3)$

Example

Suppose we average 5 radioactive particles passing a counter in 1 millisecond. What is the probability that exactly 3 particles will pass in one millisecond?

$Y \sim \mathrm{Poisson}(\lambda = 5)$, so $P(Y = 3) = \frac{e^{-5} \times 5^3}{3!} \approx 0.140$.

Suppose we average 5 radioactive particles passing a counter in 1 millisecond. What is the probability that exactly 10 particles will pass in three milliseconds?

$X \sim \mathrm{Poisson}(\lambda = 5 \times 3 = 15)$, so $P(X = 10) = \frac{e^{-15} \times 15^{10}}{10!} \approx 0.0486$.

Example 3.38

Let X denote the number of traps in a particular type of metal oxide semiconductor transistor, and suppose it has a Poisson distribution with $\lambda = 2$.

What is the probability that there are exactly three traps?

$X \sim \mathrm{Poisson}(\lambda = 2)$, so $P(X = 3) = \frac{e^{-2} \times 2^3}{3!} \approx 0.180$.

What is the probability that there are at most three traps?

$P(X \le 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) \approx 0.857$.

Relationship Between Binomial and Poisson

For some fixed $\lambda$, let $p = \lambda/n$. Then for large n, B(n, p) is approximately Poisson($\lambda = np$).

We can show that

$\lim_{n \to \infty} \binom{n}{y} p^y (1 - p)^{n - y} = \frac{e^{-\lambda} \lambda^y}{y!}$.
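The limit can be observed numerically by holding $\lambda$ fixed and letting n grow. The following Python sketch is an illustration with the arbitrary choices $\lambda = 2$ and y = 3; the helper names and the list of n values are assumptions of this example.

```python
from math import comb, exp, factorial

def binom_pmf(y: int, n: int, p: float) -> float:
    """P(Y = y) for Y ~ B(n, p)."""
    return comb(n, y) * p ** y * (1 - p) ** (n - y)

def poisson_pmf(y: int, lam: float) -> float:
    """P(Y = y) for Y ~ Poisson(lam)."""
    return lam ** y * exp(-lam) / factorial(y)

lam, y = 2.0, 3
target = poisson_pmf(y, lam)

sizes = [10, 100, 1000, 10000]
errors = []
for n in sizes:
    approx = binom_pmf(y, n, lam / n)   # p = lambda / n keeps np fixed
    errors.append(abs(approx - target))
    print(n, round(approx, 6), round(target, 6))
```

In this run the gap between the binomial and Poisson values shrinks roughly in proportion to 1/n, which is why the approximation is used for large n and small p.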
