Date post: | 14-Jul-2015 |
Category: | Education |
Upload: | eugene-yan |
Statistical Inference
Weeks 1 & 2: Probability and Distribution
Types of Variables
All Variables
- Categorical: May be represented by numbers, but it does not make sense to add, subtract, average, etc.
  - Categorical (nominal): Have no inherent ordering (e.g., single, married, divorced)
  - Ordinal: Have ordered levels (e.g., primary, secondary, JC, university)
- Numerical: It makes sense to add, subtract, average, etc. (i.e., perform math operations)
  - Discrete: Are counted and can only take on non-negative whole numbers
  - Continuous: Are measured and can take on any real number (i.e., can have decimal places)
Probability
P(A) = Probability of event A happening
0 ≤ P(A) ≤ 1
Disjoint (mutually exclusive) events: Cannot happen at the same time
- A card drawn from a deck cannot be both a spade and a heart
- P(Spade and Heart) = 0
Non-disjoint events: Can happen at the same time
- A card drawn from a deck can be both a spade and an ace
- P(Spade and Ace) = 1/52
[Figure: Venn diagrams of Spade vs. Heart and Spade vs. Ace]
Disjoint and non-disjoint events
Union of disjoint events
- Probability of drawing a Spade or a Heart from a deck of cards
P(Spade or Heart)
= P(Spade) + P(Heart)
= 13/52 + 13/52
= 26/52
Union of non-disjoint events
- Probability of drawing a Spade or an Ace from a deck of cards
P(Spade or Ace)
= P(Spade) + P(Ace) - P(Spade and Ace)
= 13/52 + 4/52 - 1/52
= 16/52
General Addition Rule: P(A or B) = P(A) + P(B) - P(A and B)
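The two unions above can be checked by brute force. The following is a minimal Python sketch, assuming a standard 52-card deck drawn uniformly at random; the helper names (`prob`, `is_spade`, `is_heart`, `is_ace`) are illustrative and not from the slides.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["Spade", "Heart", "Diamond", "Club"]
DECK = list(product(RANKS, SUITS))  # 52 (rank, suit) pairs


def prob(event):
    """Probability that a uniformly drawn card satisfies `event`."""
    favourable = sum(1 for card in DECK if event(card))
    return Fraction(favourable, len(DECK))


def is_spade(card):
    return card[1] == "Spade"


def is_heart(card):
    return card[1] == "Heart"


def is_ace(card):
    return card[0] == "A"


# Disjoint events: no card is both, so the probabilities simply add.
p_spade_or_heart = prob(lambda c: is_spade(c) or is_heart(c))

# Non-disjoint events: subtract the overlap so the ace of spades is counted once.
p_spade_or_ace = prob(lambda c: is_spade(c) or is_ace(c))
by_rule = prob(is_spade) + prob(is_ace) - prob(lambda c: is_spade(c) and is_ace(c))

print(p_spade_or_heart)         # 1/2  (= 26/52)
print(p_spade_or_ace, by_rule)  # 4/13 4/13  (= 16/52)
```

Counting cards directly and applying the rule give the same answer, which is the point of subtracting P(A and B).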
Marginal, Joint, and Conditional Probability
Marginal probability: Probability based on a single variable
P(Student = uses)
= 219/445 = 0.49
Joint probability: Probability based on two or more variables
P(Student = uses and Parents = used)
= 125/445 = 0.28
Conditional probability: Probability of one event conditional upon another event
P(Student = uses | Parents = used)
= 125/210 = 0.60
                        Parents: Used   Parents: Did not use   Total
Student: Uses                     125                     94     219
Student: Does not use              85                    141     226
Total                             210                    235     445
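The three probabilities can be read off the table programmatically. A small Python sketch, assuming the four cell counts above; the dictionary layout and variable names are illustrative choices, not part of the slides.

```python
# Cell counts from the table, keyed by (student, parents).
counts = {
    ("uses", "used"): 125,
    ("uses", "did not use"): 94,
    ("does not use", "used"): 85,
    ("does not use", "did not use"): 141,
}
total = sum(counts.values())  # 445

# Marginal: sum over the other variable (parents).
student_uses = sum(n for (student, _), n in counts.items() if student == "uses")
p_marginal = student_uses / total

# Joint: a single cell divided by the grand total.
p_joint = counts[("uses", "used")] / total

# Conditional: restrict the denominator to the "parents used" column.
parents_used = sum(n for (_, parents), n in counts.items() if parents == "used")
p_conditional = counts[("uses", "used")] / parents_used

print(round(p_marginal, 2), round(p_joint, 2), round(p_conditional, 2))
# 0.49 0.28 0.6
```

The only difference between the joint and the conditional probability is the denominator: 445 (everyone) versus 210 (only students whose parents used).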
Bayes' Theorem
Bayes' theorem: $P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$
Probability that the children use given that the parents also used:
$P(\text{children} = \text{uses} \mid \text{parents} = \text{used})$
$= \frac{P(\text{children} = \text{uses and parents} = \text{used})}{P(\text{parents} = \text{used})}$
$= \frac{125/445}{210/445}$
$= 0.60$
                         Parents: Used   Parents: Did not use   Total
Children: Uses                     125                     94     219
Children: Does not use              85                    141     226
Total                              210                    235     445
General Product Rule: P(A and B) = P(A | B) × P(B)
Bayes' Theorem expanded
Probability of women with breast cancer in the general population
- P(breast cancer) = 0.017
Probability of a true positive from a mammogram
- P(positive | breast cancer) = 0.78
- i.e., sensitivity
Probability of a false positive from a mammogram
- P(positive | no breast cancer) = 0.10
- i.e., 1 - specificity
What is the probability that the patient has breast cancer given a positive mammogram?
$P(\text{cancer} \mid \text{positive})$
$= \frac{P(\text{positive} \mid \text{cancer}) \, P(\text{cancer})}{P(\text{positive} \mid \text{cancer}) \, P(\text{cancer}) + P(\text{positive} \mid \text{no cancer}) \, P(\text{no cancer})}$
$= \frac{0.78 \times 0.017}{0.78 \times 0.017 + 0.10 \times 0.983}$
$= 0.119$
Bayes' theorem
$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$
$= \frac{P(B \mid A) \, P(A)}{P(B)}$
$= \frac{P(B \mid A) \, P(A)}{P(B \mid A) \, P(A) + P(B \mid A^c) \, P(A^c)}$
Probability Tree
What is the probability that the patient has breast cancer given a positive mammogram?
Cancer: P(cancer) = 0.017
- Positive: P(positive | cancer) = 0.78
  - P(cancer and positive) = 0.017 × 0.78 = 0.01326
- Negative: P(negative | cancer) = 0.22
No cancer: P(no cancer) = 0.983
- Positive: P(positive | no cancer) = 0.10
  - P(no cancer and positive) = 0.983 × 0.10 = 0.0983
- Negative: P(negative | no cancer) = 0.90
$P(\text{cancer} \mid \text{positive})$
$= \frac{P(\text{cancer and positive})}{P(\text{positive})}$
$= \frac{0.01326}{0.01326 + 0.0983}$
$= 0.119$
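The mammogram calculation can be sketched in a few lines of Python. The function name `posterior` and its parameter names are illustrative assumptions; the three input numbers are the ones given in the example.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) via Bayes' theorem.

    prior               -- P(condition)
    sensitivity         -- P(positive | condition)
    false_positive_rate -- P(positive | no condition), i.e. 1 - specificity
    """
    true_positive = prior * sensitivity                 # P(condition and positive)
    false_positive = (1 - prior) * false_positive_rate  # P(no condition and positive)
    return true_positive / (true_positive + false_positive)


p_cancer_given_positive = posterior(prior=0.017, sensitivity=0.78,
                                    false_positive_rate=0.10)
print(round(p_cancer_given_positive, 3))  # 0.119
```

The two products inside the function are the two "positive" branches of the tree, and the denominator is their sum, P(positive).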
Expected Mean
Expected mean: $E[X] = \sum x \, P(x)$  # sum of each value of x multiplied by its probability
What is the expected value of a die roll?
$E[X] = 1 \times \frac{1}{6} + 2 \times \frac{1}{6} + 3 \times \frac{1}{6} + 4 \times \frac{1}{6} + 5 \times \frac{1}{6} + 6 \times \frac{1}{6}$
$= 3.5$
Notation: $\bar{x}$: sample mean; $\mu$: population mean
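As a quick check of the die example, here is a minimal Python sketch that represents a distribution as a value-to-probability mapping. The `expected_value` helper and the dictionary representation are illustrative assumptions.

```python
from fractions import Fraction


def expected_value(dist):
    """Sum of each value multiplied by its probability."""
    return sum(x * p for x, p in dist.items())


# A fair six-sided die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

e_die = expected_value(die)
print(e_die, float(e_die))  # 7/2 3.5
```

Using `Fraction` keeps the arithmetic exact, so the result is exactly 7/2 rather than a rounded float.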
Mean
Mean: $\text{mean} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$
What is the mean number of dots on each die face?
$\text{mean} = \frac{1 + 2 + 3 + 4 + 5 + 6}{6}$
$= 3.5$
Expected Variance
Expected variance: $\mathrm{Var}(X) = E[(X - \mu)^2]$  # expected squared difference between each value and the mean
$= E[X^2] - E[X]^2$
What is the variance of a die roll?
From the previous slide, mean $E[X] = 3.5$
$E[X^2] = 1^2 \times \frac{1}{6} + 2^2 \times \frac{1}{6} + 3^2 \times \frac{1}{6} + 4^2 \times \frac{1}{6} + 5^2 \times \frac{1}{6} + 6^2 \times \frac{1}{6} = 15.17$
$\mathrm{Var}(X) = E[X^2] - E[X]^2 = 15.17 - 3.5^2 \approx 2.9$
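Both forms of the variance can be computed for the die and compared. A minimal Python sketch; the `expectation` helper, which takes an optional function `g` applied to each value, is a hypothetical name used only for illustration.

```python
from fractions import Fraction

# A fair six-sided die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}


def expectation(dist, g=lambda x: x):
    """E[g(X)]: sum of g(value) multiplied by the value's probability."""
    return sum(g(x) * p for x, p in dist.items())


mean = expectation(die)                                # E[X]   = 7/2
mean_of_squares = expectation(die, lambda x: x ** 2)   # E[X^2] = 91/6

var_shortcut = mean_of_squares - mean ** 2             # E[X^2] - E[X]^2
var_definition = expectation(die, lambda x: (x - mean) ** 2)  # E[(X - mu)^2]

print(var_shortcut, var_definition, round(float(var_shortcut), 2))
# 35/12 35/12 2.92
```

The exact value is 35/12, which rounds to the 2.9 shown in the worked example.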
Notation: $s^2$: sample variance; $\sigma^2$: population variance
$s$: sample standard deviation; $\sigma$: population standard deviation
Population Variance
Population variance: $\sigma^2 = \frac{1}{n} \sum (x_i - \mu)^2$
What is the variance of dots on die faces?
Given $\mu = 3.5$
$\sigma^2 = \frac{1}{6} [(1 - 3.5)^2 + (2 - 3.5)^2 + \dots + (6 - 3.5)^2]$
$\approx 2.9$
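Sample Variance
Sample variance: $s^2 = \frac{1}{n - 1} \sum (x_i - \bar{x})^2$
Why n - 1?
- Deviations are measured from the sample mean $\bar{x}$, which sits closer to the sample's own values than the population mean does, so dividing by n tends to underestimate the population variance. Dividing by n - 1 is an "adjustment" that gives a slightly bigger variance, one that more closely approximates the population variance
- i.e., think of it as a "correction" used on samples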
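One way to see the effect of the correction is to enumerate every possible sample of two die rolls and average the two versions of the variance. This is a hedged Python sketch; the `variance` helper and its `ddof` parameter (the amount subtracted from n in the divisor) are illustrative names, not from the slides.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)


def variance(values, ddof):
    """Sum of squared deviations from the mean, divided by (n - ddof)."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((x - mean) ** 2 for x in values) / (n - ddof)


# Variance of the whole population of die faces (divide by n).
population_variance = variance(list(faces), ddof=0)

# All 36 equally likely samples of size 2, drawn with replacement.
samples = list(product(faces, repeat=2))
avg_divide_by_n = sum(variance(s, ddof=0) for s in samples) / len(samples)
avg_divide_by_n_minus_1 = sum(variance(s, ddof=1) for s in samples) / len(samples)

print(population_variance)      # 35/12
print(avg_divide_by_n)          # 35/24
print(avg_divide_by_n_minus_1)  # 35/12
```

Averaged over all samples, dividing by n gives only half the population variance here, while dividing by n - 1 recovers it exactly. Individual samples can still land above or below the population value.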
Bernoulli Distribution
Where an individual trial has only two possible outcomes
Assuming a fair coin, what is the probability of it landing on heads (i.e., success)?
$P(\text{success}) = p(\text{heads})^1 \, p(\text{tails})^0 = 0.5$
Assuming an unfair coin (i.e., $p(\text{heads}) = 0.25$), what is the probability of it landing on tails (i.e., failure)?
$P(\text{failure}) = p(\text{heads})^0 \, p(\text{tails})^1 = 0.75$
Binomial Distribution
Probability of k successes in n trials:
$P(k \text{ successes in } n \text{ trials}) = \binom{n}{k} p^k (1 - p)^{n - k}$
where $\binom{n}{k} = \frac{n!}{k! \, (n - k)!}$
Given 7 trials, how many scenarios can have 2 successes?
$\binom{7}{2} = \frac{7!}{2! \, 5!} = \frac{7 \times 6 \times 5!}{2 \times 1 \times 5!} = 21$
If you toss the unfair coin 7 times, what's the probability of 2 heads (i.e., successes)?
Given $p(\text{heads}) = 0.25$:
$P(k = 2) = \binom{7}{2} \times 0.25^2 \times 0.75^5$
$= \frac{7 \times 6 \times 5!}{2 \times 1 \times 5!} \times 0.25^2 \times 0.75^5$
$= 0.311$
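The coin example can be reproduced with the standard library. A minimal Python sketch, assuming Python 3.8 or later for `math.comb`; the function name `binomial_pmf` is an illustrative choice.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


scenarios = comb(7, 2)                       # ways to place 2 successes in 7 trials
p_two_heads = binomial_pmf(k=2, n=7, p=0.25)

print(scenarios)              # 21
print(round(p_two_heads, 3))  # 0.311
```

Summing `binomial_pmf(k, 7, 0.25)` over k = 0 to 7 gives 1, a useful sanity check that the formula describes a full distribution.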
Normal Distribution
Unimodal (only one peak) and symmetric
68-95-99.7% rule
- About 68% of values fall within 1 SD of the mean
- About 95% of values fall within 2 SD of the mean
- About 99.7% of values fall within 3 SD of the mean
Represented as $N(\mu, \sigma)$
Normal Distribution
You want to compare two cousins and determine who fared better. Xiao Ming scored 1800 on his SAT and Muthu scored 24 on his ACT. Who did better?
- SAT scores ~ N(mean = 1500, SD = 300)
- ACT scores ~ N(mean = 21, SD = 6)
Xiao Ming: $\frac{1800 - 1500}{300} = 1$ SD above the mean
Muthu: $\frac{24 - 21}{6} = 0.5$ SD above the mean
Xiao Ming's score is further above the mean in SD terms, so he did better.
Normal Distribution (Z scores)
Standardization with Z scores (normalization)
$Z = \frac{\text{observation} - \mu}{SD}$
The standardized (Z) score of a value is the number of standard deviations it falls above or below the mean
Z score of the mean = 0
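Applied to the two cousins' scores from the SAT/ACT comparison, the Z score formula looks like this in Python. The function name `z_score` is an illustrative choice.

```python
def z_score(observation, mean, sd):
    """Number of standard deviations the observation lies from the mean."""
    return (observation - mean) / sd


xiao_ming = z_score(1800, mean=1500, sd=300)  # SAT
muthu = z_score(24, mean=21, sd=6)            # ACT

print(xiao_ming, muthu)               # 1.0 0.5
print(z_score(21, mean=21, sd=6))     # 0.0, the mean itself has Z = 0
```

Because both scores are now on the same scale, they can be compared directly even though the SAT and ACT use different units.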
Normal Distribution
Suppose that your company's ad campaign receives daily ad clicks that are (approximately) normally distributed with mean = 1,020 and standard deviation = 50. What's the probability of getting more than 1,160 clicks in a day?
$Z = \frac{\text{observation} - \mu}{SD} = \frac{1{,}160 - 1{,}020}{50} = 2.8$
$P(Z > 2.8) = 1 - 0.9974 = 0.0026$
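Instead of a Z table, the tail probability can be computed directly. A minimal Python sketch using `statistics.NormalDist` from the standard library (Python 3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

# Daily ad clicks, assumed normal with mean 1,020 and SD 50.
clicks = NormalDist(mu=1020, sigma=50)

z = (1160 - clicks.mean) / clicks.stdev   # standardize the observation
p_more_than_1160 = 1 - clicks.cdf(1160)   # upper tail: P(X > 1160)

print(round(z, 1), round(p_more_than_1160, 4))  # 2.8 0.0026
```

`cdf` gives the area to the left of the observation, so subtracting it from 1 gives the area to the right.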
Normal Distribution
Your friend boasts that his ad is in the top 25% of the company's ad campaign. What is the lowest number of ad clicks his ad could have received?
- Ad clicks ~ N(1020, 50)
The top 25% begins at the 75th percentile, where Z ≈ 0.67:
$Z = 0.67 = \frac{x - 1{,}020}{50}$
$x = 0.67 \times 50 + 1{,}020 = 1053.5$
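The same cutoff can be found by inverting the normal CDF. A minimal Python sketch using `statistics.NormalDist.inv_cdf` (Python 3.8 or later); the variable names are illustrative.

```python
from statistics import NormalDist

clicks = NormalDist(mu=1020, sigma=50)

# Top 25% means at or above the 75th percentile.
z_75 = NormalDist().inv_cdf(0.75)   # Z score of the 75th percentile
cutoff = clicks.inv_cdf(0.75)       # same percentile in clicks

print(round(z_75, 2), round(cutoff, 1))  # 0.67 1053.7
```

The result differs slightly from 1053.5 because the hand calculation rounds Z to 0.67 before converting back to clicks.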
Poisson Distribution
$P(x) = \frac{e^{-\lambda} \lambda^x}{x!}$
- e = base of the natural logarithm, 2.71828...
- λ = mean number of successes in a given time interval
On average, 2.5 people show up at a bus stop every hour. What is the probability that 3 or fewer people show up in 4 hours?
$\lambda = 2.5 \times 4 = 10$
$P(X \le 3) = \frac{e^{-10} 10^0}{0!} + \frac{e^{-10} 10^1}{1!} + \frac{e^{-10} 10^2}{2!} + \frac{e^{-10} 10^3}{3!} = 0.010336$
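The bus stop sum can be evaluated numerically. A minimal Python sketch, assuming the rate scales linearly with time so that λ = 2.5 × 4 = 10 arrivals over the 4 hours; the function name `poisson_pmf` is an illustrative choice.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """Probability of exactly x events when the mean count is lam."""
    return exp(-lam) * lam ** x / factorial(x)


lam = 2.5 * 4  # mean arrivals over 4 hours
p_at_most_3 = sum(poisson_pmf(x, lam) for x in range(4))  # x = 0, 1, 2, 3

print(round(p_at_most_3, 5))  # 0.01034
```

With a mean of 10 arrivals, seeing 3 or fewer is rare: roughly a 1% chance.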
Thank you for your attention!
Eugene Yan