Bayes’ Theorem, Independence, and
Discrete Random Variables
Bruce A Craig
Department of Statistics
Purdue University
STAT 511 Feb 15 1
Outline
Monty Hall Problem
Independence
Random Variables
Probability Distributions
Monty Hall Problem
Suppose you’re on “Let’s Make a Deal” and you’re given the choice of three curtains: behind one is a new car and behind each of the other two is a goat. You pick curtain #3 and then the host (who knows where the car is) reveals the goat behind curtain #1. He then asks if you’d like to switch to curtain #2 or stay with #3. What should you do?
Question made famous after posting in the “Ask Marilyn” column in Parade magazine (1990). She correctly answered the question but many PhD mathematicians wrote in telling her she was wrong.
Correct answer: Switch to #2
Solution Approach #1
Interested in comparing the probabilities that curtain #3 is correct (A) versus curtain #2 is correct (A′) given curtain #1 is shown (B)
Prior to event B, we know P(A) = P(A′) = 1/3
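\[
\begin{aligned}
P(A \mid B) &= \frac{P(A \cap B)}{P(B)} \\
&= \frac{P(A)\,P(B \mid A)}{P(A)\,P(B \mid A) + P(A')\,P(B \mid A')} \\
&= \frac{\frac{1}{3} \cdot \frac{1}{2}}{\frac{1}{3} \cdot \frac{1}{2} + \frac{1}{3} \cdot 1} \\
&= \frac{1}{3} \\
\Longrightarrow \quad P(A' \mid B) &= \frac{2}{3}
\end{aligned}
\]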
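Because Monty only shows a curtain with a goat, the conditional probabilities above differ: P(B|A) = 1/2 (car behind #3, so he opens #1 or #2 with equal probability), whereas P(B|A′) = 1 (car behind #2, so he must open #1)
The remaining case, car behind #1, has P(B | car behind #1) = 0 and so adds nothing to the denominator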
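The Bayes calculation can be checked with exact fractions. This is a minimal sketch, not part of the original slides; it assumes the host opens #1 or #2 with equal probability when the car is behind the contestant's pick (#3), and the names `prior`, `likelihood`, and `posterior` are illustrative.

```python
from fractions import Fraction

# Hypotheses: which curtain hides the car (each has prior probability 1/3).
prior = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

# Likelihood of B = "host opens curtain #1" given the car's location,
# when the contestant has picked curtain #3.
# Assumption: if the car is behind #3, the host opens #1 or #2 with
# equal probability; he never opens the curtain hiding the car.
likelihood = {1: Fraction(0), 2: Fraction(1), 3: Fraction(1, 2)}

# P(B) by the law of total probability, then Bayes' theorem.
evidence = sum(prior[c] * likelihood[c] for c in prior)
posterior = {c: prior[c] * likelihood[c] / evidence for c in prior}

print("P(car behind #3 | B) =", posterior[3])  # stay
print("P(car behind #2 | B) =", posterior[2])  # switch
```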
Solution Approach #2
Consider the following table of possible scenarios:
Initial choice          Action   Result
Curtain with a goat     Stay     Lose
                        Switch   Win
Curtain with a goat     Stay     Lose
                        Switch   Win
Curtain with the car    Stay     Win
                        Switch   Lose
Win 2/3 of time with switch, 1/3 of time without
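The 2/3 versus 1/3 split can also be seen empirically. The sketch below is a simulation under the usual assumptions (car placed uniformly at random, host always opens a goat curtain other than the contestant's pick); the function names `play` and `win_rate` are illustrative and not from the slides.

```python
import random

def play(switch, rng):
    """Play one round; return True if the contestant wins the car."""
    car = rng.randrange(3)
    pick = rng.randrange(3)
    # Host opens a curtain that is neither the contestant's pick nor the car.
    opened = rng.choice([c for c in range(3) if c != pick and c != car])
    if switch:
        # Move to the one curtain that is neither picked nor opened.
        pick = next(c for c in range(3) if c != pick and c != opened)
    return pick == car

def win_rate(switch, trials=100_000, seed=0):
    """Fraction of simulated rounds won under the given strategy."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

print("stay:  ", win_rate(switch=False))
print("switch:", win_rate(switch=True))
```

With enough trials the two printed rates should settle near 1/3 and 2/3 respectively.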
Independence
If A and B are independent, then P(A|B) = P(A)
This says that the occurrence of one event does not affect the probability of the other event occurring
If A and B are independent, then
P(A ∩ B) = P(A)P(B) (multiplication rule and definition)
A and B′ are independent
A′ and B are independent
A′ and B′ are independent
Independence is a common assumption in statistical tests and probability models
Should not be blindly assumed though
Example: Determining independence
Consider a gas station with six pumps numbered 1, 2, ..., 6, and let Ei denote the event that pump i is in use at a randomly chosen time. Suppose that
P(E1) = P(E6) = .10, P(E2) = P(E5) = .15, P(E3) = P(E4) = .25
Define events A, B, C by
A = {E2, E4, E6}, B = {E1, E2, E3}, C = {E2, E3, E4, E5}
We then have P(A) = .50, P(A|B) = P(A ∩ B)/P(B) = .15/.50 = .30, and P(A|C) = P(A ∩ C)/P(C) = .40/.80 = .50. That is, events A and B are dependent, whereas events A and C are independent.
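The conditional probabilities in this example can be recomputed directly. The sketch below assumes the Ei are treated as mutually exclusive outcomes (their probabilities sum to 1); the helpers `prob` and `independent` are illustrative names, not from the slides.

```python
from fractions import Fraction

# Probabilities of the six outcomes E1..E6 from the example.
# Assumption: the Ei are mutually exclusive and exhaustive.
p = {1: Fraction(10, 100), 2: Fraction(15, 100), 3: Fraction(25, 100),
     4: Fraction(25, 100), 5: Fraction(15, 100), 6: Fraction(10, 100)}

def prob(event):
    """Probability of an event given as a set of pump numbers."""
    return sum(p[i] for i in event)

def independent(e1, e2):
    """Check the multiplication rule P(e1 and e2) = P(e1) P(e2)."""
    return prob(e1 & e2) == prob(e1) * prob(e2)

A, B, C = {2, 4, 6}, {1, 2, 3}, {2, 3, 4, 5}

print("P(A)   =", prob(A))
print("P(A|B) =", prob(A & B) / prob(B))
print("P(A|C) =", prob(A & C) / prob(C))
print("A, B independent?", independent(A, B))
print("A, C independent?", independent(A, C))
```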
Example: Using multiplication rule
Suppose I roll seven six-sided dice. What is the probabilitythat I get at least one 6?
For a single die, P(rolling a six) = 1/6
For a single die, P(not rolling a six) = 5/6
Complement of “at least one” is “none”
P(no sixes) = 5/6 × 5/6 × 5/6 × 5/6 × 5/6 × 5/6 × 5/6
= (5/6)^7
= 0.279
P(at least one six) = 1 − P(no sixes)
= 1 − 0.279
= 0.721
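The complement calculation is easy to evaluate exactly. This is a small sketch assuming independent rolls of fair dice; `p_at_least_one` is an illustrative helper name.

```python
from fractions import Fraction

def p_at_least_one(n_dice, p_success=Fraction(1, 6)):
    """P(at least one success in n independent trials), via the complement."""
    p_none = (1 - p_success) ** n_dice  # multiplication rule for independent rolls
    return 1 - p_none

p = p_at_least_one(7)
print(p, "=", round(float(p), 3))
```

The same function answers the question for any number of dice, for example `p_at_least_one(4)`.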
Random Variables (RVs)
Definition: For a given sample space S, a random variable (RV) is any rule that associates a number with each outcome in S.
In mathematical language, a random variable is a function whose domain is the sample space and whose range is the set of real numbers.
Conventions:
Upper case letters: X, Y, Z for RVs
Lower case letters: x, y, and z for specific values of an RV
X(E) = x means that the outcome E is associated with the value x by the RV X.
Example I
When a student calls a university help desk for technical support, he/she will either immediately (S) be able to speak to someone or (F) be placed on hold.
With S = {S, F}, define RV X by
X(S) = 1, X(F) = 0
The RV X indicates whether or not the student can immediately speak to someone.
NOTE: Any random variable whose only possible values are 0 and 1 is called a Bernoulli random variable.
Example II
Let’s expand on the earlier gas pump example and consider that there are two six-pump gas stations. Define the following RVs:
X = the total number of pumps in use at the two stations
Y = the difference between the number of pumps in use at station 1 and the number in use at station 2
U = the maximum of the numbers of pumps in use at the two stations
What are the corresponding values when E = (3, 2)?
X = 3 + 2 = 5, Y = 3 − 2 = 1, and U = max(3, 2) = 3
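Since a random variable is just a function on the sample space, the three RVs here can be written as ordinary functions of an outcome. A minimal sketch, assuming an outcome is represented as a pair (pumps in use at station 1, pumps in use at station 2):

```python
# An outcome E is a pair: (pumps in use at station 1, pumps in use at station 2).

def X(e):
    """Total number of pumps in use at the two stations."""
    return e[0] + e[1]

def Y(e):
    """Pumps in use at station 1 minus pumps in use at station 2."""
    return e[0] - e[1]

def U(e):
    """Larger of the two counts."""
    return max(e)

E = (3, 2)
print(X(E), Y(E), U(E))

# The sample space has 7 x 7 outcomes (0 to 6 pumps in use at each station);
# the set of possible values of each RV is the image of that space.
sample_space = [(a, b) for a in range(7) for b in range(7)]
possible_X = sorted({X(e) for e in sample_space})
```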
Types of Random Variables
Definition
A discrete random variable is an rv whose possible values either constitute a finite set or else can be listed in an infinite sequence in which there is a first element, a second element, and so on (“countably” infinite).
Definition
A random variable is continuous if both of the following apply:
1. Its set of possible values consists either of all numbers in a single interval or all numbers in a disjoint union of such intervals (e.g., [0, 10] ∪ [20, 30]).
2. No possible value of the variable has positive probability, that is, P(X = c) = 0 for any possible value c.
Probability Distribution
Describes a discrete random variable
Associates each numeric value with a probability
Describes how the total probability of 1 is distributed among the possible numeric values
P(X = x) = P(all s ∈ S : X(s) = x)
Example: Roll a fair die
X = numeric value on face of die
x = 1, 2, 3, 4, 5, 6
The probability distribution is P(X = x) = 1/6
Example: A business has just purchased four laser printers; let X be the number among these that require service during the warranty period.
x = 0, 1, 2, 3, 4
Distribution could be
P(X = x) = 1/4 for x = 0, 1, 2
P(X = x) = 1/8 for x = 3, 4
Probability Distribution
Definition
The probability distribution or probability mass function (pmf) of a discrete rv is defined for every number x by p(x) = P(X = x) = P(all s ∈ S : X(s) = x).
For every possible value x of the random variable, the pmf specifies the probability of observing that value when the experiment is performed.
The conditions p(x) ≥ 0 and Σ_{all possible x} p(x) = 1 are required of any pmf.
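The two pmf conditions can be checked mechanically. The sketch below stores a pmf as a dictionary from values to exact fractions and tests it against the fair-die and laser-printer distributions from the earlier slide; `is_valid_pmf` and the `not_a_pmf` counterexample are illustrative additions.

```python
from fractions import Fraction

def is_valid_pmf(pmf):
    """Check p(x) >= 0 for every x and that the probabilities sum to 1."""
    return all(p >= 0 for p in pmf.values()) and sum(pmf.values()) == 1

die = {x: Fraction(1, 6) for x in range(1, 7)}
printers = {0: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 4),
            3: Fraction(1, 8), 4: Fraction(1, 8)}
# Hypothetical counterexample: probabilities sum to 3/4, so not a pmf.
not_a_pmf = {0: Fraction(1, 2), 1: Fraction(1, 4)}

print(is_valid_pmf(die), is_valid_pmf(printers), is_valid_pmf(not_a_pmf))
```

Exact fractions avoid the rounding issues that a floating-point comparison with 1 would raise.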
Graphical Representation
Will often display distribution graphically
[Figure: pmf plots. Left: roll of a fair die. Right: printers needing repair.]
Example: Probability Distribution
The Cal Poly Department of Statistics has a lab with six computers reserved for statistics majors. Let X denote the number of these computers that are in use at a particular time of day.
Suppose that the probability distribution of X is as given in the following table; the first row of the table lists the possible X values and the second row gives the probability of each such value.
x      0    1    2    3    4    5    6
p(x)  .05  .10  .15  .25  .20  .15  .10
Example: Probability Distribution
What is the probability that at most 2 computers are in use?
P(X ≤ 2) = p(0) + p(1) + p(2) = 0.30
What is the probability that at least 3 computers are in use?
P(X ≥ 3) = 1 − 0.30 = 0.70 or p(3) + p(4) + p(5) + p(6) = 0.70
What is the probability that between 2 and 5 computers (inclusive) are in use?
P(2 ≤ X ≤ 5) = p(2) + p(3) + p(4) + p(5) = 0.75
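Each of these answers is a sum of pmf values over the x values that satisfy a condition. A short sketch using the table for the computer lab; the helper `prob` is an illustrative name, and "between 2 and 5" is read as inclusive.

```python
from fractions import Fraction

# pmf of X = number of lab computers in use, from the table (in hundredths).
pmf = {x: Fraction(n, 100)
       for x, n in zip(range(7), [5, 10, 15, 25, 20, 15, 10])}

def prob(condition):
    """P(X satisfies condition): sum p(x) over the matching values x."""
    return sum(p for x, p in pmf.items() if condition(x))

at_most_2 = prob(lambda x: x <= 2)
at_least_3 = prob(lambda x: x >= 3)
between_2_and_5 = prob(lambda x: 2 <= x <= 5)  # inclusive at both ends

print(float(at_most_2), float(at_least_3), float(between_2_and_5))
```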
Another PMF Example
Y = # of rolls until a 6 is obtained
The outcomes are y = 1, 2, 3, 4, ...
To determine distribution, consider sequence of rolls and use product rule/independence
P(Y = 1) = P(roll a six)
= 1/6
P(Y = 2) = P(don’t roll a six)P(roll a six)
= 5/6 × 1/6
P(Y = 3) = P(don’t roll a six)P(don’t roll a six)P(roll a six)
= 5/6 × 5/6 × 1/6
...
\[
P(Y = y) = \left(\frac{5}{6}\right)^{y-1} \times \frac{1}{6}, \qquad y = 1, 2, 3, \ldots
\]
PMF Parameter
In previous example, assumed the die is fair so p(6)=1/6
Suppose instead that p(6) = α where 0 < α < 1
The pmf (family form) of this RV is then
\[
P(Y = y) = \alpha (1 - \alpha)^{y-1} \quad \text{for } y = 1, 2, 3, \ldots
\]
A different α results in a different probability distribution
The α is called a parameter
This distribution family is called a geometric distribution
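The role of the parameter can be seen by coding the pmf once and evaluating it for different α. This is a minimal sketch; `geometric_pmf` is an illustrative name and y is assumed to be an integer.

```python
def geometric_pmf(y, alpha):
    """P(Y = y): first success on trial y, with success probability alpha."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    if y < 1:
        return 0.0
    # y - 1 failures, each with probability 1 - alpha, then one success.
    return alpha * (1 - alpha) ** (y - 1)

# Fair die: alpha = 1/6 gives the distribution from the previous example.
for y in (1, 2, 3):
    print(y, geometric_pmf(y, 1 / 6))

# A different alpha gives a different member of the same family.
print(geometric_pmf(3, 0.5))

# The probabilities over y = 1, 2, 3, ... should sum to 1 (truncated here).
total = sum(geometric_pmf(y, 1 / 6) for y in range(1, 2001))
```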
Bernoulli Distribution
Suppose that 20% of the customers coming to your computer store buy a desktop computer. Let X indicate whether a customer buys a desktop computer.
The pmf of this Bernoulli rv X is p(0) = .8 and p(1) = .2
At another store, it may be the case that p(0) = .9 and p(1) = .1.
More generally, the pmf of any Bernoulli rv can be expressed in the form p(1) = α and p(0) = 1 − α, where 0 < α < 1. Because the pmf depends on the particular value of α we often write p(x; α) rather than just p(x):
\[
p(x; \alpha) =
\begin{cases}
1 - \alpha & \text{if } x = 0 \\
\alpha & \text{if } x = 1 \\
0 & \text{otherwise}
\end{cases}
\]
The Cumulative Distribution Function
For some fixed value x, we often wish to compute the probability that the observed value of X will be at most x.
For example, suppose X has pmf
\[
p(x) =
\begin{cases}
0.500 & x = 0 \\
0.167 & x = 1 \\
0.333 & x = 2 \\
0 & \text{otherwise}
\end{cases}
\]
The probability that X is at most 1 is then
P(X ≤ 1) = p(0) + p(1) = .500 + .167 = .667
What about P(X ≤ 1.5), P(X ≤ 0), P(X ≤ 2), P(X ≤ 3.7), and P(X ≤ 20.5)?
Note that P(X < x) ≤ P(X ≤ x).
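These probabilities all come from summing the pmf over the possible values that do not exceed the cutoff. A small sketch using the pmf on this slide; `prob_at_most` and `prob_less_than` are illustrative names.

```python
# pmf from the slide: possible values 0, 1, 2.
pmf = {0: 0.500, 1: 0.167, 2: 0.333}

def prob_at_most(x):
    """P(X <= x): add p(v) over possible values v with v <= x."""
    return sum(p for v, p in pmf.items() if v <= x)

def prob_less_than(x):
    """P(X < x): same sum, but excluding v = x itself."""
    return sum(p for v, p in pmf.items() if v < x)

for x in (1.5, 0, 2, 3.7, 20.5):
    print(f"P(X <= {x}) = {prob_at_most(x):.3f}")

# P(X < x) never exceeds P(X <= x); they differ only when x is a possible value.
print(f"P(X < 1) = {prob_less_than(1):.3f}, P(X <= 1) = {prob_at_most(1):.3f}")
```

Note that a cutoff between possible values (such as 1.5) gives the same answer as the possible value just below it.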
The Cumulative Distribution Function
Definition
The cumulative distribution function (cdf) F(x) of a discrete rv X with pmf p(x) is defined for every number x by
\[
F(x) = P(X \le x) = \sum_{y \,:\, y \le x} p(y)
\]
For any number x, F(x) is the probability that the observed value of X will be at most x.
Example
A store carries flash drives with either 1 GB, 2 GB, 4 GB, 8 GB, or 16 GB of memory. The accompanying table gives the distribution of Y = the amount of memory in a purchased drive:
y      1    2    4    8    16
p(y)  .05  .10  .35  .40  .10
Calculate F(4), F(8), and F(16). What about F(2.7) and F(7.999)?
Example
For any number y, F(y) will equal the value of F at the closest possible value of Y at or to the left of y. The cumulative distribution function in this example is
\[
F(y) =
\begin{cases}
0 & y < 1 \\
.05 & 1 \le y < 2 \\
.15 & 2 \le y < 4 \\
.50 & 4 \le y < 8 \\
.90 & 8 \le y < 16 \\
1 & 16 \le y
\end{cases}
\]
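The step-function form of the cdf maps naturally onto a sorted lookup: find how many possible values lie at or below y, then read off the running total. A sketch using Python's standard `bisect` and `itertools.accumulate`; the names `values`, `cum`, and `F` are illustrative.

```python
from bisect import bisect_right
from itertools import accumulate

# Possible values of Y (GB of memory) and their probabilities, from the table.
values = [1, 2, 4, 8, 16]
probs = [0.05, 0.10, 0.35, 0.40, 0.10]

# Running totals: cum[i] is F at values[i].
cum = list(accumulate(probs))

def F(y):
    """cdf of Y: the running total at the largest possible value <= y."""
    i = bisect_right(values, y)  # number of possible values that are <= y
    return 0.0 if i == 0 else cum[i - 1]

for y in (4, 8, 16, 2.7, 7.999, 0.5):
    print(f"F({y}) = {F(y):.2f}")
```

The lookup returns the same value for every y inside one step of the piecewise definition, for example F(4) and F(7.999).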
Graphical Representation