  • Lecture 17: One-Way ANOVA

    API-201Z

    Maya Sen

    Harvard Kennedy School
    http://scholar.harvard.edu/msen

  • Announcements

    I Midterm #2 one week from Thursday

    I Review session next Tuesday afternoon will be taped

    I Because of Veterans Day, shifting my OH to Tuesday noon to 2pm (Taubman 356)

    I Have posted readings for Thursday – Oregon health care case study

  • Roadmap

    I Finish up paired tests

    I One-Way Analysis of Variance (ANOVA)

    I Multiple comparisons and Bonferroni corrections

    I Multiple comparisons corrections will be the last topic covered on Midterm #2

    I Leaves one common type of test (chi-square tests) for the final, along with regression

  • Paired Tests for Proportions

    I For paired data, we have to take into account the fact that we have dependence between groups

    I For sample means, this is straightforward → take the difference between groups as a new quantity, use it to re-calculate the standard deviation, and conduct the hypothesis test

    I However, for proportions we sometimes don't have the entire table of individual observations

    I Usually only have a contingency table

    I So we estimate the standard error slightly differently

  • Proportion example from public opinion

    I Ex) Public opinion example (difference in proportions) from GSS data on government oversight given suspected terrorist activity

    I Each person asked 2 questions: (1) OK for gov't to tap phones, and (2) OK for gov't to conduct random stops

    I Results as follows:

                                Q2: Random Stop on Street
                                   Yes       No
    Q1: Tap Phone      Yes         494      335
                       No          126      537

    I Question: Does the true proportion answering yes to the first question differ significantly from the second question?

  • Proportion example from public opinion

    I Calculate some basic point estimates of “yes” answers:

    I Question #1: Proportion of people in sample who believe authorities should be able to tap phones:

      π̂1 = (494 + 335)/1492 = 0.556

    I Question #2: Proportion of people in sample who believe authorities should be able to randomly stop and search people on street:

      π̂2 = (494 + 126)/1492 = 0.416
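
A minimal Python sketch of this arithmetic, computing both marginal proportions straight from the 2×2 table above:

```python
# Cell counts from the GSS contingency table
n11, n12 = 494, 335   # row "yes to tap phone": columns yes / no to random stop
n21, n22 = 126, 537   # row "no to tap phone":  columns yes / no to random stop
n = n11 + n12 + n21 + n22             # 1492 respondents

pi1_hat = (n11 + n12) / n             # yes to Q1 (tap phone)   -> ~0.556
pi2_hat = (n11 + n21) / n             # yes to Q2 (random stop) -> ~0.416
print(pi1_hat, pi2_hat)
```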

  • Proportion example from public opinion

    I Steps 1 & 2 of hypothesis test the same as non-paired proportions test

    I Step 3: This is where paired data differ from independent data

    I Independent data: Assume no covariance between groups

    I Paired (non-independent) data: Must adjust standard error to accommodate covariance

    I In means case → Just looked at difference (X̄d)

    I In proportions case → Oftentimes don't have that data

  • Proportion example from public opinion

    I Interested in distribution of π̂1 − π̂2

    I By CLT, this should be normally distributed

    I From earlier lecture, if independent, then

      π̂1 − π̂2 ∼ N( π1 − π2, π1(1 − π1)/n1 + π2(1 − π2)/n2 )

    I However, here dependent, so

      π̂1 − π̂2 ∼ N( π1 − π2, [π12 + π21 − (π12 − π21)²]/n )

    I where these refer to cell proportions (not conditional proportions)

    I (Proof in appendix)
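
The appendix works out the variance in full; a short sketch of the key step, using a helper variable D defined only for this sketch (D = 1 if a respondent falls in the yes/no cell, D = −1 in the no/yes cell, D = 0 otherwise, so that π̂1 − π̂2 is the sample mean of D):

```latex
% D = 1 w.p. \pi_{12}, D = -1 w.p. \pi_{21}, D = 0 otherwise,
% and \hat\pi_1 - \hat\pi_2 = \bar{D}
\begin{align*}
E[D]   &= \pi_{12} - \pi_{21} = \pi_1 - \pi_2, \\
E[D^2] &= \pi_{12} + \pi_{21}, \\
\operatorname{Var}(D) &= \pi_{12} + \pi_{21} - (\pi_{12} - \pi_{21})^2, \\
\operatorname{Var}(\hat\pi_1 - \hat\pi_2) &= \frac{\operatorname{Var}(D)}{n}
  = \frac{\pi_{12} + \pi_{21} - (\pi_{12} - \pi_{21})^2}{n}.
\end{align*}
```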

  • Proportion example from public opinion

    I This gives us a test statistic:

      z = [π̂1 − π̂2 − (π1 − π2)] / √( [π̂12 + π̂21 − (π̂12 − π̂21)²] / n )

    I When the null is true, π1 − π2 = 0, and we have some options

    I 1) Use this expression for the standard error

    I 2) Simplify test using McNemar's Test for comparing dependent proportions (common in medicine/public health)

      z = (n12 − n21) / √(n12 + n21)

    I where this approximately follows a standard normal distribution

    I Intuition borrows from Binomial distribution – explanation of McNemar's test statistic in Appendix
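
A minimal Python sketch (standard library only) of both versions of the test statistic for the table above:

```python
from math import sqrt

n12, n21, n = 335, 126, 1492       # discordant cell counts and total sample size

# Option 1: full paired standard error
p12, p21 = n12 / n, n21 / n
se = sqrt((p12 + p21 - (p12 - p21) ** 2) / n)
z_full = (p12 - p21) / se          # pi1_hat - pi2_hat equals p12 - p21; roughly 10.1

# Option 2: McNemar's simplification
z_mcnemar = (n12 - n21) / sqrt(n12 + n21)   # roughly 9.73
print(z_full, z_mcnemar)
```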

  • Proportion example from public opinion

    I Calculating McNemar’s test statistic:

      z = (n12 − n21) / √(n12 + n21) = (335 − 126) / √(335 + 126) = 9.7341

    I where this approximately comes from a standard normal distribution

  • Proportion example from public opinion

    I Step 4: Calculate p-value (using two-tailed test)

    I p-value = 2 × P(Z ≤ −9.7341)

    I p-value < 0.01

    I Step 5: Decide whether or not to reject the null hypothesis and interpret results

    I Question: What is your conclusion?

    I (People seem to have different tolerance for gov't tapping phones vs. random stops on street)
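
A small Python sketch of Step 4; scipy is not required for the lecture, but its normal CDF reproduces the table lookup:

```python
from scipy.stats import norm

z = 9.7341
p_value = 2 * norm.cdf(-abs(z))    # two-tailed p-value; far below 0.01
print(p_value)
```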

  • Proportion example from public opinion

    I To calculate a confidence interval, follow the same formula

      π̂1 − π̂2 ± zα/2 × SE[π̂1 − π̂2]

    I Using the full form of the standard error

      π̂1 − π̂2 ± zα/2 × √( [π̂12 + π̂21 − (π̂12 − π̂21)²] / n )

  • Proportion example from public opinion

    I Using our example

    I For 95% CI:

      0.556 − 0.416 ± 1.96 × √( [335/1492 + 126/1492 − (335/1492 − 126/1492)²] / 1492 )

      0.140 ± 0.027

    I So the 95% CI is (0.113, 0.167)

    I Does it include 0?

    I What does this mean substantively?
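
A short Python check of this interval (standard library only); the ±0.027 margin above comes out of this calculation:

```python
from math import sqrt

n12, n21, n = 335, 126, 1492
p12, p21 = n12 / n, n21 / n

diff = p12 - p21                                   # pi1_hat - pi2_hat, about 0.140
se = sqrt((p12 + p21 - (p12 - p21) ** 2) / n)
lower, upper = diff - 1.96 * se, diff + 1.96 * se  # roughly (0.113, 0.167)
print(round(diff, 3), round(1.96 * se, 3), round(lower, 3), round(upper, 3))
```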

  • Switching to multiple comparisons

    I We have spent the last few classes looking at tests for:

    I one and two means (independent, paired, pooled)

    I one and two proportions (independent or paired)

    I What happens if we want to compare observations from 3 or more independent populations?

  • Some examples

    I Economics: Does mean consumer debt differ meaningfully between five different countries?

    I Medicine: Does a medical treatment help Black, Latino, and Asian American patients differently?

    I Education: Is there a difference in the average SAT scores across 4 high schools in Boston?

    I Health: Does mean weight loss differ over 6 months between subjects following 5 different diets?

    In all of these → want to compare population means across more than two groups

  • Life Expectancy Example

    I Ex) Life expectancies from 193 countries around the world

    I Data based on World Bank data for 6 different continents

    I We can assume different continents (groups) are independent (no country is in more than one continent)

  • Life Expectancy Example

    Country   Average Life Expectancy   Continent
    1         73.1                      Africa
    2         48.1                      Asia
    3         81.8                      Oceania
    4         77                        Europe
    5         75.1                      North America
    6         73.1                      Africa
    7         74.3                      South America
    ...       ...                       ...

  • Life Expectancy Example

          Africa   Asia    Europe   Oceania   North Am   South Am
    X̄i    57.54    72.02   78.11    72.69     74.93      73.81
    si     7.97     6.33    3.93     5.37      4.13       3.26
    ni    52       50      42       13        25         11

  • Life Expectancy Example

    I We could compare each possible pair using a difference-in-means t-test at the α = 0.05 level

    I Africa to Oceania, Africa to Europe, Oceania to Europe, etc.

    I Problem → Each hypothesis test has P(Type I Error) of 0.05

    I What is the probability of a Type I error if we test all pairwise combinations of means?

    I 15 possible combinations of tests →

    I Pr(none of them has a Type I error) = 0.95¹⁵

    I So Pr(at least one has a Type I error) = 1 − 0.95¹⁵, or around 54%

    I As number of groups compared increases → P(at least one Type I error) also increases
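
A two-line Python check of these numbers, treating the 15 tests as independent (the same simplification the slide makes):

```python
from math import comb

pairs = comb(6, 2)                   # 15 pairwise comparisons among 6 continents
p_at_least_one = 1 - 0.95 ** pairs   # about 0.54
print(pairs, p_at_least_one)
```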

  • ANOVA

    I Instead use one-way ANOVA (Analysis of Variance)

    I Type of test frequently used in psychology, epidemiology, and other fields that rely on experiments

    I "one-way" → Exploring one characteristic (life expectancy)

    I Could explore two characteristics (life expectancy, weight) w/ "two-way ANOVA" (more complicated)

    I Here, use one-way ANOVA as a global test, which tests the null that the population means are all equal

    I Null hypothesis for this ANOVA test:

    I µ1 = µ2 = µ3 = ... = µk

    I Alternative hypothesis:

    I At least two of the population means are unequal

    I Note: Can be all population means, some population means, or just two that differ

    I → Null hypothesis generally pretty strong for global tests
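
As a concrete illustration of the global test, here is a sketch that simulates country-level data with roughly the summary statistics from the table earlier (the raw World Bank values are not reproduced here) and runs scipy's one-way ANOVA:

```python
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(0)

# (mean, sd, n) per continent, taken from the summary table above
summary = {
    "Africa":   (57.54, 7.97, 52),
    "Asia":     (72.02, 6.33, 50),
    "Europe":   (78.11, 3.93, 42),
    "Oceania":  (72.69, 5.37, 13),
    "North Am": (74.93, 4.13, 25),
    "South Am": (73.81, 3.26, 11),
}

# Simulated stand-in for the real country-level life expectancies
groups = [rng.normal(m, s, size) for m, s, size in summary.values()]

f_stat, p_value = f_oneway(*groups)   # tests H0: all population means equal
print(f_stat, p_value)
```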

  • ANOVA

    I Different from hypothesis tests (which rely on CLT, comparing test statistics to a standard normal)

    I ANOVA: Compare between-group and within-group variation

    I ANOVA tests rely on the fact that total variability is composed of:

    1. Variability between groups

    I Ex) Compare mean life expectancy for each continent to global mean life expectancy

    2. Variability within groups

    I Ex) Compare life expectancy of individual countries to their continent's mean life expectancy

    I However: Both ANOVA and hypothesis tests rely on calculating a test statistic and using it to reject or not reject the null hypothesis

  • Variability between groups

    I The variability of each continent’s mean around the global mean is the between-group sum of squares

    I Adds squared differences of (a) each group mean from (b) the global (“grand”) mean

    I The between-group sum of squares for k groups is:

      \sum_{i=1}^{k} n_i (\bar{X}_i - \bar{X})^2

    I Where:
      I i is an index representing the group (here, six continents)
      I k = # of groups (six)
      I n_i = # of observations in group i
      I X̄i is the mean of group i
      I X̄ is the global (“grand”) mean

    I Intuition: If the group means are close to each other (and therefore to the grand mean), this will be small
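    As a concrete illustration, here is a minimal Python sketch that computes this between-group sum of squares on a small made-up life-expectancy table (the group names and numbers are illustrative only, not the lecture’s dataset):

      # Between-group sum of squares: sum over groups of n_i * (group mean - grand mean)^2
      groups = {
          "Africa":   [54.0, 61.2, 58.3],
          "Europe":   [80.1, 82.3, 81.0],
          "Americas": [75.2, 77.8, 73.9],
      }

      all_values = [x for xs in groups.values() for x in xs]
      grand_mean = sum(all_values) / len(all_values)            # X-bar, the grand mean

      ss_between = sum(
          len(xs) * (sum(xs) / len(xs) - grand_mean) ** 2       # n_i * (X-bar_i - X-bar)^2
          for xs in groups.values()
      )
      print(round(ss_between, 2))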


  • Variability within groups

    I The variability of individual countries around their own continent’s mean is the within-group sum of squares

    I Adds squared differences of (a) each observation from (b) its group’s mean

      \sum_{i=1}^{k} \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2

    I Where:
      I Xij is an individual observation j in group i
      I ni = # of observations in group i
      I k = # of groups (continents)

    I Also called the within-group (error) sum of squares; dividing it by its degrees of freedom gives the Mean Squared Error (MSE)
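    A matching Python sketch for the within-group sum of squares, using the same made-up groups as above (illustrative numbers only):

      # Within-group sum of squares: sum over groups of sum over j of (X_ij - group mean)^2
      groups = {
          "Africa":   [54.0, 61.2, 58.3],
          "Europe":   [80.1, 82.3, 81.0],
          "Americas": [75.2, 77.8, 73.9],
      }

      ss_within = 0.0
      for xs in groups.values():
          group_mean = sum(xs) / len(xs)                        # X-bar_i
          ss_within += sum((x - group_mean) ** 2 for x in xs)   # this group's squared deviations
      print(round(ss_within, 2))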


  • Overall variability

    I A measure of overall variability in the dataset is called the total Sum of Squares (SS)

    I Adds squared differences of all individual observations, across all groups, from the global (“grand”) mean

      \sum_{i=1}^{k} \sum_{j=1}^{n_i} (X_{ij} - \bar{X})^2
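    Since total variability is composed of the between-group and within-group pieces, the three sums of squares satisfy total = between + within. A self-contained Python sketch checking this identity on the same made-up data (illustrative only):

      groups = {
          "Africa":   [54.0, 61.2, 58.3],
          "Europe":   [80.1, 82.3, 81.0],
          "Americas": [75.2, 77.8, 73.9],
      }
      all_values = [x for xs in groups.values() for x in xs]
      grand_mean = sum(all_values) / len(all_values)

      ss_between = sum(len(xs) * (sum(xs) / len(xs) - grand_mean) ** 2 for xs in groups.values())
      ss_within  = sum((x - sum(xs) / len(xs)) ** 2 for xs in groups.values() for x in xs)
      ss_total   = sum((x - grand_mean) ** 2 for x in all_values)

      print(abs(ss_total - (ss_between + ss_within)) < 1e-9)    # True: total SS = between SS + within SS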


  • ANOVA Table

    Common to organize this info in an ANOVA table, which includes:

    I Source of variation: (1) Between Group, (2) Within Group, or (3) Total

    I Sum of Squares value

    I Degrees of Freedom

    I Mean Sum of Squares, which for each row equals

      Sum of Squares / Degrees of Freedom

    I ANOVA F-statistic (will explain)

    I A p-value (will explain)


  • ANOVA Table

    Source     Sum of Squares      df       Mean SS    F-stat    p-value
    Between    ∑ ni (X̄i − X̄)²      k − 1
    Within     ∑∑ (Xij − X̄i)²      n − k
    Total      ∑∑ (Xij − X̄)²       n − 1

    (k = # of groups, n = total # of observations)

  • ANOVA Table

    Source     Sum of Squares     df      Mean SS    F-stat    p-value
    Between         11891.39       5      2378.28
    Within           6701.02     187        35.83
    Total           18592.41     192
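    A quick arithmetic check on the Mean SS column (Sum of Squares divided by its Degrees of Freedom, as defined above), using the values in the table; here the df column implies n = 193 countries and k = 6 continents:

      \begin{aligned}
      \text{Mean SS}_{\text{Between}} &= \frac{11891.39}{5} \approx 2378.28 \\
      \text{Mean SS}_{\text{Within}}  &= \frac{6701.02}{187} \approx 35.83 \\
      \text{Consistency: } 11891.39 + 6701.02 &= 18592.41, \qquad 5 + 187 = 192
      \end{aligned}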

  • To Conduct ANOVA Test

    I Remember the null hypothesis: µ1 = µ2 = ... = µk

    I And let’s further assume the groups have the same population standard deviation, σ

    I If the null is true → every group’s Xs come from the same distribution:

      X_{ij} \sim (\mu, \sigma^2)

    I But if the null is not true → each group’s Xs come from different distributions (both cases are simulated in the sketch below):

      X_{ij} \sim (\mu_i, \sigma^2)
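    A minimal Python simulation of these two scenarios (all numbers are made up for illustration; Python is not necessarily the course’s software):

      import numpy as np

      rng = np.random.default_rng(0)
      sigma, n_per_group = 6.0, 30

      # Null true: all six groups drawn from the same (mu, sigma^2)
      null_groups = [rng.normal(70, sigma, n_per_group) for _ in range(6)]

      # Null false: each group has its own mu_i
      alt_groups = [rng.normal(m, sigma, n_per_group) for m in (55, 62, 68, 72, 76, 80)]

      print([round(g.mean(), 1) for g in null_groups])   # group means all cluster near 70
      print([round(g.mean(), 1) for g in alt_groups])    # group means spread far apart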


  • To Conduct ANOVA Test

    Here is the key intuition:

    I (1) Between-group variability:
      I If the null is true → the only source of variance is population variability (so σ²)
      I If the null is false → the X̄i come from different groups, so variation comes both from differences in means (because the µi vary) and from population variability (σ²)
      I If the null is false → the between-group error is larger than it would be if the null were true

    I (2) Within-group variability:
      I Unaffected by whether the null is true or false
      I Should be around σ² (if the same across groups)

    I → If the null is true, the within-group error and the between-group error should be close together

    I → If the null is false, the between-group error > the within-group error, reflecting the fact that the µi vary (the sketch below simulates both cases)
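    A short Python sketch of this intuition, regenerating groups like those simulated above (the mean_squares helper is mine, defined here for illustration; numbers are made up):

      import numpy as np

      def mean_squares(groups):
          """Return (between-group mean square, within-group mean square)."""
          k = len(groups)
          n = sum(len(g) for g in groups)
          grand_mean = np.concatenate(groups).mean()
          ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
          ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
          return ss_between / (k - 1), ss_within / (n - k)

      rng = np.random.default_rng(1)
      sigma, n_per_group = 6.0, 30

      null_groups = [rng.normal(70, sigma, n_per_group) for _ in range(6)]                 # null true
      alt_groups  = [rng.normal(m, sigma, n_per_group) for m in (55, 62, 68, 72, 76, 80)]  # null false

      print(mean_squares(null_groups))   # the two mean squares are roughly comparable (both near sigma^2)
      print(mean_squares(alt_groups))    # the between-group mean square is far larger than the within-group one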


  • To Conduct ANOVA Test

    I This intuition gives us the ANOVA test (sometimes called the ANOVA F-test)
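    If you are working in Python (again, not necessarily the course’s software), scipy.stats.f_oneway runs this one-way test directly and returns the F-statistic and p-value that fill the last two columns of the ANOVA table above. A minimal usage sketch with made-up groups:

      # One-way ANOVA with SciPy on three small made-up samples (not the lecture's data)
      from scipy.stats import f_oneway

      africa   = [54.0, 61.2, 58.3, 63.1, 59.8]
      europe   = [80.1, 82.3, 81.0, 79.4, 83.2]
      americas = [75.2, 77.8, 73.9, 76.4, 74.7]

      result = f_oneway(africa, europe, americas)
      print(result.statistic, result.pvalue)   # F-statistic and its p-value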

