7/30/2019 Stat 491 Chapter 3--Probability
1/34
Probability and Inference
Definitions and Properties
Event Relations
Laws of Probability
Conditional Probability
Bayes' Rule and Screening Tests
Stat 491: Biostatistics
Chapter 3: Probability
Solomon W. Harrar
The University of Montana
Fall 2012
Chapter 3: Probability Stat 491: Biostatistics
Example: Verifying a Claim
A manufacturer of a pregnancy test kit claims that the accuracy (true positive rate) of its kit is over 75%.
We conducted a clinical trial and, out of 100 pregnant women, 77 tested positive.
Can we have faith in the sample evidence?
There is enough evidence to back the claim if
Probability(Evidence given that the Claim is FALSE) = Small,
say less than 0.05.
This is one instance where probability and probability models come in handy.
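As a rough illustration of the criterion above, the sketch below computes the chance of seeing 77 or more positives among 100 pregnant women when the claim is false. It rests on two assumptions that the slide does not state: the 100 results are independent (a binomial model), and "claim is false" is taken at its boundary, a true positive rate of exactly 0.75. The function name `binomial_upper_tail` is made up for this example.

```python
from math import comb

def binomial_upper_tail(k_min, n, p):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

# Assumption: the claim is "false" at its boundary, i.e. the true rate is exactly 0.75.
p_evidence = binomial_upper_tail(77, 100, 0.75)

print(f"P(77 or more positives | rate = 0.75) = {p_evidence:.3f}")
if p_evidence < 0.05:
    print("The evidence backs the claim.")
else:
    print("The evidence is not strong enough to back the claim.")
```

Under these assumptions the probability is well above 0.05, so 77 positives out of 100 would not, by this criterion, be enough to support the manufacturer's claim.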
Definitions
Sample Space: the set of all possible outcomes of a trial.
Event (A, B, C, ...): any set of outcomes of interest.
The probability of an event is the relative frequency of occurrence of this set of outcomes over an indefinitely large (or infinite) number of trials.
That is,
$P(E) = \frac{n_E}{n}$
where $n_E$ is the number of outcomes in favor of E in n (large) repetitions of the trial.
Interpretation of Probability
Suppose it is known that
The probability of developing breast cancer over 30 years in 40-year-old women who have never had cancer is 1/11.
This probability means
over a large sample of 40-year-old women who have never had breast cancer, approximately 1 in 11 will develop the disease by age 70.
This proportion becomes increasingly close to 1 in 11 as the number of women sampled increases.
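The relative-frequency interpretation can be explored with a small simulation. The sketch below is only an illustration: it assumes each woman develops the disease independently with probability 1/11, and the helper name `relative_frequency` and the fixed seed are choices made for this example.

```python
import random

def relative_frequency(p, n_trials, seed=0):
    """Fraction of n_trials simulated women who develop the disease."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = sum(rng.random() < p for _ in range(n_trials))
    return hits / n_trials

p_true = 1 / 11
for n in (100, 10_000, 200_000):
    freq = relative_frequency(p_true, n)
    print(f"n = {n:>7}: relative frequency = {freq:.4f} (target {p_true:.4f})")
```

With larger sample sizes the printed relative frequency should settle near 1/11, which is the behaviour the interpretation describes.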
Properties of Probability
Prop. 1 For any event A,
$0 \le P(A) \le 1$.
Prop. 2 If A and B are two events that cannot happen at the same time, then
P(A or B occurs) = P(A) + P(B).
Definition Two events A and B are said to be mutually exclusive if they cannot happen at the same time.
Example: Properties of Probability
Let X be the diastolic blood pressure of a person. Let $A = \{X < 90\}$ and $B = \{90 \le X < 95\}$. Suppose P(A) = 0.7 and P(B) = 0.1. Let $C = \{X < 95\}$. Then, because A and B are mutually exclusive,
P(C) = P(A or B) = P(A) + P(B) = 0.7 + 0.1 = 0.8.
The Complement of an Event
The complement of an event A, written as $\bar{A}$, is defined as
$\bar{A}$ = the event that A does not occur
= the collection of all outcomes in the sample space that are not in A
$P(\bar{A}) = 1 - P(A)$.
Using a Venn diagram.
Example: For the diastolic blood pressure example where $A = \{X < 90\}$,
$P(\bar{A}) = 1 - P(A) = 1 - 0.7 = 0.3$.
The Union of Events
The union of events A and B, written as $A \cup B$, is defined as
$A \cup B$ = the event that either A or B or both occur
= the collection of all outcomes in the sample space that are either in A or B or both
Using a Venn diagram.
Example: For the hypertension example, let $A = \{X \ge 90\}$ and $B = \{75 \le X \le 100\}$. Then
$A \cup B = \{X \ge 75\}$.
The Intersection of Events
The intersection of events A and B, written as $A \cap B$, is defined as
$A \cap B$ = the event that both A and B occur
= the collection of all outcomes in the sample space that are both in A and B
Using a Venn diagram.
For the hypertension example, let $A = \{X \ge 90\}$ and $B = \{75 \le X \le 100\}$. Then
$A \cap B = \{90 \le X \le 100\}$.
If A and B are mutually exclusive, then $A \cap B = \emptyset$ and $P(A \cap B) = 0$.
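Complement, union and intersection map directly onto set operations. The sketch below uses a hypothetical, discretized sample space (whole-number diastolic readings from 40 to 140 mmHg, a range chosen only for illustration) to reproduce the hypertension events with Python sets.

```python
# Hypothetical sample space: whole-number DBP readings from 40 to 140 mmHg.
sample_space = set(range(40, 141))

A = {x for x in sample_space if x >= 90}         # A = {X >= 90}
B = {x for x in sample_space if 75 <= x <= 100}  # B = {75 <= X <= 100}

union = A | B                    # outcomes in A or B or both
intersection = A & B             # outcomes in both A and B
complement_A = sample_space - A  # outcomes not in A

print("A or B covers", min(union), "to", max(union))
print("A and B covers", min(intersection), "to", max(intersection))
print("not A covers", min(complement_A), "to", max(complement_A))
print("A and not-A share outcomes:", bool(A & complement_A))
```

An event and its complement have an empty intersection, so they are mutually exclusive.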
The Intersection of Events Contd ...
Two events A and B are said to be independent if
$P(A \cap B) = P(A) \times P(B)$.
Otherwise, A and B are said to be dependent.
Example: Let A = {Wife's DBP > 95}, B = {Husband's DBP > 95} and C = {first-born child's DBP > 80}, where DBP stands for diastolic blood pressure. Assume that P(A) = 0.1 and P(B) = 0.2. What can we say about $P(A \cap B)$?
If we are willing to assume that the wife's hypertensive status does not depend on the husband's hypertensive status, then
$P(A \cap B) = 0.1 \times 0.2 = 0.02$.
If P(C) = 0.2 and $P(B \cap C) = 0.05$, are B and C independent? Is the result unexpected? Explain.
The Intersection of Events: Example
Suppose two doctors, A and B, test all patients coming into a clinic for syphilis.
$A^+$ = {Dr. A makes a positive diagnosis}
$B^+$ = {Dr. B makes a positive diagnosis}
Suppose $P(A^+) = 0.1$, $P(B^+) = 0.17$ and $P(A^+ \cap B^+) = 0.08$. Are the events $A^+$ and $B^+$ independent?
Now
$P(A^+ \cap B^+) = 0.08 > P(A^+) \times P(B^+) = 0.017$.
Thus the two events are dependent.
This result is NOT unexpected. Why?
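The independence check is a single comparison of the joint probability with the product of the two marginal probabilities. A minimal sketch follows; the function name `are_independent` and the tolerance are choices made for this example.

```python
import math

def are_independent(p_a, p_b, p_a_and_b, tol=1e-9):
    """True when P(A and B) equals P(A) * P(B), up to a small tolerance."""
    return math.isclose(p_a_and_b, p_a * p_b, abs_tol=tol)

# Doctors example: P(A+) = 0.1, P(B+) = 0.17, P(A+ and B+) = 0.08.
doctors_independent = are_independent(0.1, 0.17, 0.08)

# Wife/husband example: 0.02 was obtained by assuming independence.
spouses_independent = are_independent(0.1, 0.2, 0.02)

print("Doctors' diagnoses independent:", doctors_independent)
print("Spouses' statuses independent:", spouses_independent)
```

A tolerance is used instead of `==` because products of decimal probabilities are rarely exact in floating point.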
Multiplication and Addition Laws
Multiplication Law: For mutually independent events $A_1, A_2, \ldots, A_m$,
$P(A_1 \cap A_2 \cap \cdots \cap A_m) = P(A_1) \times P(A_2) \times \cdots \times P(A_m)$.
Addition Law: For any two events A and B,
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
Use a Venn diagram.
For the STD example, suppose a patient will be referred for further lab tests if at least one of the doctors makes a positive diagnosis. Then
$P(A^+ \cup B^+) = P(A^+) + P(B^+) - P(A^+ \cap B^+) = 0.1 + 0.17 - 0.08 = 0.19$.
Thus 19% of all patients will be referred for further lab tests.
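The referral calculation can be written as a one-line function. This is a minimal sketch of the addition law using the numbers from the doctors example; the name `prob_union` is made up here.

```python
def prob_union(p_a, p_b, p_a_and_b):
    """Addition law: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b

# P(A+) = 0.1, P(B+) = 0.17, P(A+ and B+) = 0.08
p_referred = prob_union(0.1, 0.17, 0.08)
print(f"P(at least one positive diagnosis) = {p_referred:.2f}")
```

The joint probability is subtracted once because patients diagnosed positive by both doctors would otherwise be counted twice.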
Addition Law for Independent Events
If A and B are independent events, then
$P(A \cup B) = P(A) + P(B)[1 - P(A)]$.
Example: Let A = {Wife's DBP > 95} and B = {Husband's DBP > 95}, where DBP stands for diastolic blood pressure. Assume that P(A) = 0.1 and P(B) = 0.2, and that A and B are independent. Then the probability of a hypertensive household (at least one spouse hypertensive) is
$P(A \cup B) = P(A) + P(B)[1 - P(A)] = 0.1 + 0.2[1 - 0.1] = 0.28$.
Therefore, 28% of all households will be hypertensive.
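The formula for independent events follows in one step from the general addition law together with the definition of independence. The short derivation below is a sketch that uses only those two facts:

```latex
\begin{align*}
P(A \cup B) &= P(A) + P(B) - P(A \cap B)   && \text{(addition law)} \\
            &= P(A) + P(B) - P(A)\,P(B)    && \text{(independence)} \\
            &= P(A) + P(B)\,[1 - P(A)].
\end{align*}
```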
Addition Law for More Than Two Events
For any events A, B and C,
$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$
The addition law also generalizes to an arbitrary number of events, although that is beyond the scope of this course.
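The three-event addition law can be checked numerically on a small sample space. The sketch below uses a hypothetical example that is not from the course (one roll of a fair die, with three events chosen arbitrarily) and exact fractions so that both sides can be compared without rounding.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair six-sided die.
outcomes = set(range(1, 7))

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event & outcomes), len(outcomes))

A = {2, 4, 6}  # even number
B = {4, 5, 6}  # greater than 3
C = {3, 4}     # three or four

lhs = prob(A | B | C)
rhs = (prob(A) + prob(B) + prob(C)
       - prob(A & B) - prob(A & C) - prob(B & C)
       + prob(A & B & C))

print("P(A or B or C) directly:  ", lhs)
print("P(A or B or C) by the law:", rhs)
```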
Conditional Probability
Let A and B be two events with P(B) > 0. Then the conditional probability of event A given event B, written as P(A|B), is defined as
$P(A|B) = \frac{P(A \cap B)}{P(B)}$
and the conditional probability of event B given event A is defined in a similar way, assuming that P(A) > 0.
Using a Venn diagram.
If A and B are independent, then
$P(B|A) = P(B) = P(B|\bar{A})$.
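To make the definition concrete, the sketch below applies it to the two-doctor example (P(A+) = 0.1, P(B+) = 0.17, P(A+ and B+) = 0.08). It relies on one fact not shown on this slide, namely that P(B and not A) = P(B) - P(A and B); the function name `conditional` is made up for the example.

```python
def conditional(p_joint, p_given):
    """P(A | B) = P(A and B) / P(B); requires P(B) > 0."""
    if p_given <= 0:
        raise ValueError("the conditioning event must have positive probability")
    return p_joint / p_given

p_a, p_b, p_ab = 0.1, 0.17, 0.08

# P(B+ | A+): chance Dr. B is positive when Dr. A is positive.
p_b_given_a = conditional(p_ab, p_a)

# P(B+ | not A+), using P(B+ and not A+) = P(B+) - P(A+ and B+).
p_b_given_not_a = conditional(p_b - p_ab, 1 - p_a)

print(f"P(B+ | A+)     = {p_b_given_a:.2f}")
print(f"P(B+ | not A+) = {p_b_given_not_a:.2f}")
```

The two conditional probabilities differ from each other and from P(B+) = 0.17, which is another way of seeing that the two diagnoses are dependent.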
Relative Risk
The relative risk (RR) of B given A is
$RR = \frac{P(B|A)}{P(B|\bar{A})}$.
Obviously, $RR \ge 0$.
Relative Risk: Example
Suppose that 1500 smokers in 10,000 develop lung cancer in 20 years and 50 non-smokers in 5000 develop lung cancer in 20 years. What is the relative risk of lung cancer in 20 years given that a person smokes?
Let A = {Smoker} and B = {Develop lung cancer}. Then $P(B|A) = 1500/10000 = 0.15$ and $P(B|\bar{A}) = 50/5000 = 0.01$, so
$RR = \frac{P(B|A)}{P(B|\bar{A})} = \frac{0.15}{0.01} = 15$.
Therefore, smokers are 15 times as likely as nonsmokers to develop lung cancer in 20 years.
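The same calculation can be carried out directly from the counts. The sketch below is illustrative; the function and parameter names are invented for this example.

```python
def relative_risk(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """RR = P(disease | exposed) / P(disease | unexposed), from counts."""
    risk_exposed = cases_exposed / n_exposed
    risk_unexposed = cases_unexposed / n_unexposed
    return risk_exposed / risk_unexposed

# 1500 of 10,000 smokers and 50 of 5000 non-smokers develop lung cancer.
rr = relative_risk(1500, 10_000, 50, 5_000)
print(f"Relative risk of lung cancer for smokers: {rr:.1f}")
```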
Total Probability Rule
Let A and B be any two events.
Clearly, $P(B) = P(B \cap A) + P(B \cap \bar{A})$.
The above relation implies
$P(B) = P(B|A) \times P(A) + P(B|\bar{A}) \times P(\bar{A})$
which is known as the Total Probability Rule.
Generalization of the total probability rule: Let $A_1, A_2, \ldots, A_m$ be a set of mutually exclusive and exhaustive events. Then,
$P(B) = \sum_{i=1}^{m} P(B|A_i) \times P(A_i)$.
Total Probability Rule: Example
Suppose the rate of type II diabetes mellitus (DM) in 40- to 59-year-olds is 7% among Caucasians, 10% among African Americans, 12% among Hispanics and 5% among Asian Americans. Suppose the ethnic distribution in Houston, TX is 30% Caucasian, 25% African American, 40% Hispanic and 5% Asian American. What is the overall probability of type II DM among 40- to 59-year-olds in Houston?
Total Probability Rule: Example contd...
Let $A_1$ = {Caucasian}, $A_2$ = {African American}, $A_3$ = {Hispanic}, $A_4$ = {Asian American} and B = {Having Type II DM}. Then,
$P(B) = P(B|A_1) P(A_1) + P(B|A_2) P(A_2) + P(B|A_3) P(A_3) + P(B|A_4) P(A_4)$
$= (0.07)(0.3) + (0.1)(0.25) + (0.12)(0.4) + (0.05)(0.05) = 0.0965$.
Therefore, the overall probability of type II DM among 40- to 59-year-olds in Houston is 0.0965.
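The total probability rule is a weighted average of the group-specific rates, with the group shares as weights. A minimal sketch using the Houston figures follows; the function name `total_probability` and the sum-to-one check are additions made for this example.

```python
def total_probability(conditionals, priors):
    """P(B) = sum of P(B | A_i) * P(A_i) over mutually exclusive, exhaustive A_i."""
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("group probabilities must sum to 1")
    return sum(c * p for c, p in zip(conditionals, priors))

# Order: Caucasian, African American, Hispanic, Asian American
dm_rate_by_group = [0.07, 0.10, 0.12, 0.05]  # P(B | A_i)
group_share = [0.30, 0.25, 0.40, 0.05]       # P(A_i)

p_dm = total_probability(dm_rate_by_group, group_share)
print(f"Overall P(type II DM) = {p_dm:.4f}")
```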
Generalized Multiplication Rule
Let $A_1, A_2, \ldots, A_m$ be an arbitrary set of events, not necessarily mutually independent. Then
$P(A_1 \cap A_2 \cap \cdots \cap A_m) = P(A_1) \times P(A_2|A_1) \times P(A_3|A_2 \cap A_1) \times \cdots \times P(A_m|A_1 \cap A_2 \cap \cdots \cap A_{m-1})$
This is a direct consequence of the definition of conditional probability.
The rule reduces to the multiplication law discussed before if the events $A_1, A_2, \ldots, A_m$ are mutually independent.
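To see why the rule follows from the definition of conditional probability, it may help to write out the case m = 3. The sketch below assumes $P(A_1 \cap A_2) > 0$ so that every conditional probability is defined. Each conditional probability is replaced by its defining ratio and the intermediate terms cancel:

```latex
\begin{align*}
P(A_1)\, P(A_2 \mid A_1)\, P(A_3 \mid A_2 \cap A_1)
  &= P(A_1) \cdot \frac{P(A_1 \cap A_2)}{P(A_1)}
            \cdot \frac{P(A_1 \cap A_2 \cap A_3)}{P(A_1 \cap A_2)} \\
  &= P(A_1 \cap A_2 \cap A_3).
\end{align*}
```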
Predictive Value, Sensitivity and Specificity
Predictive Value Positive of a screening test
PV+ = P(disease | test+)
Predictive Value Negative of a screening test
PV- = P(no disease | test-)
Sensitivity of a screening test
Sensitivity = P(test+ | disease)
Specificity of a screening test
Specificity = P(test- | no disease)
Question: Which of the above quantities do you think can be directly measured by clinicians?
Symptoms as a Screening Test
A symptom or set of symptoms can be used as a screening test.
Ideally, we would like to find a set of symptoms such that both PV+ and PV- are 1.
For a symptom to be effective in predicting disease, it is important that both sensitivity and specificity be high.
Please know, though, that high sensitivity and specificity are not the whole story.
False Positive and False Negative
When do you think results from a screening test are classified as False Positive and when are they classified as False Negative?
The probability of a False Positive
P(False Positive) = 1 - Specificity
The probability of a False Negative
P(False Negative) = 1 - Sensitivity
Example: Suppose 5% of women with breast cancer have a family history of breast cancer but only 2% of women without breast cancer have such a history. Then what are the False Positive and False Negative rates of family history as a screening test?
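One way to work the family-history example is to treat "has a family history" as a positive test. This reading is an assumption about how the example is meant to be set up. Under it, the two rates follow directly from the definitions above:

```python
# Treat "family history of breast cancer" as a positive screening result.
sensitivity = 0.05      # P(family history | breast cancer)
specificity = 1 - 0.02  # P(no family history | no breast cancer)

false_positive_rate = 1 - specificity  # P(family history | no breast cancer)
false_negative_rate = 1 - sensitivity  # P(no family history | breast cancer)

print(f"P(False Positive) = {false_positive_rate:.2f}")
print(f"P(False Negative) = {false_negative_rate:.2f}")
```

The false negative rate is very high under this reading: family history alone would miss most women who have breast cancer.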
Example: Association Between PSA and Prostate Cancer
The PSA status (PSA+ or PSA-) of each participant in a study was evaluated and the following data were obtained.

PSA Test Result    Prostate Cancer    Frequency
+                  +                  92
+                  -                  27
-                  +                  46
-                  -                  72

Calculate the PV+, PV-, Sensitivity and Specificity.
Why is this type of sample, in general, unrealistic?
Typically, case-control studies are used, which allow estimation of sensitivity and specificity but not of the predictive values.
Example: Association Between PSA and Prostate Cancer
Contd...
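The four quantities for the PSA data can be read off the frequency table by conditioning on the appropriate row or column total. The sketch below is one way to organize the arithmetic; the variable names are chosen for this example, and the results are sample estimates from this particular study.

```python
# (PSA result, prostate cancer status) -> frequency, copied from the table.
counts = {("+", "+"): 92, ("+", "-"): 27, ("-", "+"): 46, ("-", "-"): 72}

tp = counts[("+", "+")]  # test positive, cancer
fp = counts[("+", "-")]  # test positive, no cancer
fn = counts[("-", "+")]  # test negative, cancer
tn = counts[("-", "-")]  # test negative, no cancer

pv_pos = tp / (tp + fp)       # P(cancer | PSA+)
pv_neg = tn / (tn + fn)       # P(no cancer | PSA-)
sensitivity = tp / (tp + fn)  # P(PSA+ | cancer)
specificity = tn / (tn + fp)  # P(PSA- | no cancer)

print(f"PV+ = {pv_pos:.4f}, PV- = {pv_neg:.4f}")
print(f"Sensitivity = {sensitivity:.4f}, Specificity = {specificity:.4f}")
```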
Bayes' Rule
Let A = symptom and B = disease.
P(B) is the probability of the disease in the reference population.
P(B) is also known as the prevalence of the disease in the reference population.
The prevalence information can be combined with the specificity and sensitivity information to get PV+ and PV-.
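Bayes' Rule Contd...
Bayes' Rule says that the Predictive Value Positive is given by
$PV^+ = \frac{P(A|B) \times P(B)}{P(A|B) \times P(B) + P(A|\bar{B}) \times P(\bar{B})}$
$= \frac{\text{Sensitivity} \times \text{Prevalence}}{\text{Sensitivity} \times \text{Prevalence} + (1 - \text{Specificity}) \times (1 - \text{Prevalence})}$
Similarly, the Predictive Value Negative is given by
$PV^- = \frac{P(\bar{A}|\bar{B}) \times P(\bar{B})}{P(\bar{A}|\bar{B}) \times P(\bar{B}) + P(\bar{A}|B) \times P(B)}$
$= \frac{\text{Specificity} \times (1 - \text{Prevalence})}{\text{Specificity} \times (1 - \text{Prevalence}) + (1 - \text{Sensitivity}) \times \text{Prevalence}}$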
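The two formulas translate directly into a small function. The sketch below is illustrative (the name `predictive_values` is invented here). As a consistency check it feeds in the sensitivity and specificity from the PSA frequency table (92, 27, 46, 72), with the sample proportion with cancer (138 of 237) standing in for prevalence. That choice is made only so the output can be compared with the predictive values obtained directly from the counts.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PV+, PV-) computed with Bayes' rule."""
    pv_pos = (sensitivity * prevalence) / (
        sensitivity * prevalence + (1 - specificity) * (1 - prevalence))
    pv_neg = (specificity * (1 - prevalence)) / (
        specificity * (1 - prevalence) + (1 - sensitivity) * prevalence)
    return pv_pos, pv_neg

# PSA table: 92 (+,+), 27 (+,-), 46 (-,+), 72 (-,-); 237 participants in total.
sens = 92 / 138   # P(PSA+ | cancer)
spec = 72 / 99    # P(PSA- | no cancer)
prev = 138 / 237  # sample proportion with cancer, used as the prevalence

pv_pos, pv_neg = predictive_values(sens, spec, prev)
print(f"PV+ = {pv_pos:.4f} (from counts: {92 / 119:.4f})")
print(f"PV- = {pv_neg:.4f} (from counts: {72 / 118:.4f})")
```

When the prevalence matches the sampled proportion, Bayes' rule reproduces the predictive values from the table. With a different prevalence, the same sensitivity and specificity give different predictive values.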
Bayes' Rule: Example
It is known that the Enzyme-Linked Immunosorbent Assay (ELISA) test for HIV infection has 99.7% sensitivity and 98.5% specificity.
What is the likelihood that a person has been infected with HIV if he or she had a positive test readout, given that
1 the prevalence of HIV infection in the region where the patient is from is 0.2%?
2 the patient's decision to come for a test resulted from his or her concern about high-risk behavior, and the prevalence of HIV infection among high-risk groups in the person's region is 20%?
Bayes' Rule: Example Contd...
Why would the probability be so low in (1) while the test is very accurate?
This is easy to explain. The test has a high false positive rate (1.5%) compared to the prevalence of HIV (0.2%).
Of 1000 people in the region, roughly 2 people [$0.002 \times 0.997 \times 1000 = 1.994$] would have HIV and test positive.
In total, the test gives positive results for roughly 17 people [$(0.002 \times 0.997 + 0.015 \times 0.998) \times 1000 = 16.964$].
That is, about 15 of the positive results come from people who do not have HIV.
Hence, the probability that the person has HIV given a positive test result is 1.994/16.964 = 11.7543%.
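The head-count argument can be repeated for both prevalence scenarios in the example. The sketch below is illustrative only; the function name and the population size of 1000 simply mirror the reasoning on this slide.

```python
def positives_per_population(sensitivity, specificity, prevalence, population=1000):
    """Expected numbers of true and false positives in a population."""
    true_pos = prevalence * sensitivity * population
    false_pos = (1 - prevalence) * (1 - specificity) * population
    return true_pos, false_pos

for prevalence in (0.002, 0.20):  # general region vs. high-risk group
    tp, fp = positives_per_population(0.997, 0.985, prevalence)
    pv_pos = tp / (tp + fp)
    print(f"prevalence {prevalence:.1%}: true positives = {tp:.3f}, "
          f"false positives = {fp:.3f}, PV+ = {pv_pos:.4f}")
```

With the same test, raising the prevalence from 0.2% to 20% moves the probability of infection given a positive readout from about 12% to about 94%.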
Bayes' Rule: Example contd...
The two lessons to be learnt from this example are
1 The patient must be told his or her likelihood of infection in addition to the specificity and sensitivity of the test, NOT just the readout.
2 The prevalence information is very important in determining the person's likelihood.
The best course of action may be to do a more accurate test (the Western Blot procedure) if a positive readout is met. The combination of these two procedures is highly accurate.
In the previous calculation, we updated the person's prior probability of infection in light of the test result to get the person's posterior probability of infection.
This is a special case of the general class of inference known as Bayesian Inference.
Generalization of Bayes' Rule
Let $B_1, B_2, \ldots, B_m$ be a set of mutually exclusive and exhaustive disease states. Let A be the presence of a symptom or set of symptoms. Then
$P(B_i|A) = \frac{P(A|B_i) \times P(B_i)}{\sum_{j=1}^{m} P(A|B_j) \times P(B_j)}$.
Read Example 3.27 in the textbook.
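The generalized rule works for any partition, not only disease states. As a sketch, the code below reuses the Houston diabetes figures from the total probability example, with the ethnic groups playing the role of the $B_i$ and "has type II DM" playing the role of A. That swap of roles is made only to have real numbers from these slides to work with. The function name `posterior` is invented for the example.

```python
def posterior(likelihoods, priors):
    """P(B_i | A) for every i, given P(A | B_i) and P(B_i)."""
    joint = [lik * pri for lik, pri in zip(likelihoods, priors)]
    total = sum(joint)  # P(A), by the total probability rule
    return [j / total for j in joint]

groups = ["Caucasian", "African American", "Hispanic", "Asian American"]
p_dm_given_group = [0.07, 0.10, 0.12, 0.05]  # P(A | B_i)
p_group = [0.30, 0.25, 0.40, 0.05]           # P(B_i)

post = posterior(p_dm_given_group, p_group)
for name, value in zip(groups, post):
    print(f"P({name} | type II DM) = {value:.4f}")
```

The denominator is the overall probability 0.0965 obtained from the total probability rule, and the posterior probabilities sum to 1.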
Summary
Defined Probability
Addition and multiplication laws
Independent and dependent events
Conditional Probability and Relative Risks to quantify the dependence between events.
Accuracy of screening tests defined as an application of conditional probabilities.
Application of Bayes' rule for computing PV+ and PV- when only sensitivity, specificity and prevalence are known.
This is a special case of Bayesian Inference.
Homework
Problems 3.49, 3.50, 3.75, 3.83, 3.84