Statistics for Social and Behavioral Sciences
Part IV: Causality
Comparison of two groups
Chapter 7
Prof. Amine Ouazad
Statistics Course Outline
PART I. INTRODUCTION AND RESEARCH DESIGN (Week 1)
Four Steps of “Thinking Like a Statistician”. Study Design: Simple Random Sampling, Cluster Sampling, Stratified Sampling. Biases: Nonresponse bias, Response bias, Sampling bias.
PART II. DESCRIBING DATA (Weeks 2-4)
Sample statistics: Mean, Median, SD, Variance, Percentiles, IQR, Empirical Rule. Bivariate sample statistics: Correlation, Slope.
PART III. DRAWING CONCLUSIONS FROM DATA: INFERENTIAL STATISTICS (Weeks 5-9)
Estimating a parameter using sample statistics. Confidence Interval at 90%, 95%, 99%. Testing a hypothesis using the CI method and the t method.
PART IV. CORRELATION AND CAUSATION: TWO GROUPS, REGRESSION ANALYSIS (Weeks 10-14)
This is where we talk about Zmapp and Ebola!
Coming up
• “Comparison of Two Groups”: this session.
• “Univariate Regression Analysis”: next session, Saturday.
• “Association and Causality”: Tuesday, Thursday and extra session.
• “Randomized Experiments (continued), ANOVA”: last Tuesday and extra session.
• “Robustness Checks and Wrap Up”: last Thursday.
Outline
1. Randomized controlled trials
2. t test for equality of means
Next time: Inference in Univariate Regressions
Do U.S. Employers Discriminate?
• Employers post job ads.
• Sometimes mentioning they are an “Equal Opportunity Employer.”
• Some employers are federal contractors.
• Lots of anecdotal evidence…
– “In hiring, racial bias is still a problem.” Forbes.
– “Protesters allege hiring discrimination by Ferrara Candy”, Chicago Tribune, October 28, 2014.
• But we can’t trust stories…
• Very tough question. We should be extra careful.
• What about causal evidence from statistical data?
Difference of means
• Two groups: White and African American.
– m1: sample mean in the first group. X1i: observation of individual i in group 1.
– m2: sample mean in the second group. X2i: observation of individual i in group 2.
• The expected value of the difference m1-m2?
• Sampling distribution of the difference m1-m2?
• Standard error of m1-m2: standard deviation of the sampling distribution of m1-m2.
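One way to answer the questions above, written out as a sketch. It assumes the two samples are independent. The symbols \mu_1, \mu_2 (true group means) and s_1, s_2 (sample standard deviations) are notation introduced here and do not appear on the slide; n_1 and n_2 are the group sizes.

```latex
\mathbb{E}[m_1 - m_2] = \mu_1 - \mu_2,
\qquad
\mathrm{se}(m_1 - m_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
```

In large samples the sampling distribution of m_1 - m_2 is approximately normal around \mu_1 - \mu_2, by the same argument as in the one-group case.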
Confidence Interval for the Difference of Means
• Very similar to the one-group case.
• t is also chosen from Table 5.1.
• Degrees of freedom df given either by the Welch approximation or the Satterthwaite approximation (see end of handout).
In general:
• The use of the t distribution assumes that X is normally distributed.
• The method is robust to violations of the normality assumption, especially for proportions.
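By analogy with the one-group case, the interval presumably has the usual form below. This is a sketch: se(m_1 - m_2) denotes the standard error of the difference, and t_{df} is the t score read from Table 5.1 for the chosen confidence level.

```latex
(m_1 - m_2) \;\pm\; t_{df} \times \mathrm{se}(m_1 - m_2)
```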
t statistic for the difference of means
• Built similarly to the one-group case.
• Can also subtract v from the numerator when testing whether the difference of means equals a number v.
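Written out, the statistic described above would look as follows. This is a sketch by analogy with the one-group case, where se(m_1 - m_2) is the standard error of the difference and v = 0 when testing the equality of the two means.

```latex
t = \frac{(m_1 - m_2) - v}{\mathrm{se}(m_1 - m_2)}
```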
Methods for hypothesis testing
H0: m1 = m2
Ha: m1 different from m2.
Reject the null hypothesis if either:
• The confidence interval does not include 0.
• The absolute value of the t statistic is above the t score.
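The two decision rules can be checked side by side with a short Python sketch. The function name `compare_two_groups` and the 0/1 callback data are hypothetical, and the critical value 1.96 is an assumption (95% level, large df); in practice the t score comes from Table 5.1.

```python
import math
from statistics import mean, variance

def compare_two_groups(x1, x2, t_crit):
    """Test H0: equal means, using both the CI method and the t method."""
    n1, n2 = len(x1), len(x2)
    diff = mean(x1) - mean(x2)
    # Standard error of the difference, assuming independent samples.
    se = math.sqrt(variance(x1) / n1 + variance(x2) / n2)
    t_stat = diff / se
    ci = (diff - t_crit * se, diff + t_crit * se)
    return {
        "diff": diff,
        "se": se,
        "t": t_stat,
        "ci": ci,
        "reject_ci": not (ci[0] <= 0 <= ci[1]),  # CI method
        "reject_t": abs(t_stat) > t_crit,        # t method
    }

# Made-up data: 1 = callback, 0 = no callback.
group1 = [1] * 10 + [0] * 90
group2 = [1] * 5 + [0] * 95
result = compare_two_groups(group1, group2, t_crit=1.96)
print(result)
```

Both rules use the same t score and the same standard error, so they lead to the same decision.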
Application
• Compute the t statistic of the difference for each city.
• Can you recover the p values using Stata?
– Use display 2*ttail(df, abs(t)) in Stata.
– With df approximately n1 + n2 – 2.
• Can you reject the null hypothesis that the callback rates for White and African American names are equal?
• Fill in the t statistics here. Can we reject the null hypothesis? (Table columns: t statistic, Reject?)
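For readers without Stata, the two-sided p value can be approximated in plain Python. This is a sketch, not Stata's implementation: it integrates the t density numerically with Simpson's rule, and the function names `t_pdf` and `two_sided_p_value` are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p_value(t_stat, df, steps=2000):
    """Approximate 2 * P(T > |t|); steps must be even (Simpson's rule)."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 2 * (0.5 - area))

# With df = 10, a t statistic of 2.228 sits at the usual 5% two-sided threshold.
print(two_sided_p_value(2.228, 10))
```

For very small p values the subtraction `0.5 - area` loses precision, so treat tiny results as "very close to zero" and not as exact figures.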
Resume Quality And Callback Rates
A refinement for sample proportions
• When X1i and X2i are variables that take only two values, 0 or 1, m1 and m2 are sample proportions p1 and p2.
• Group 1 size: n1. Group 2 size: n2.
• H0: “p1 = p2”
• Under the null, the two groups share a common proportion p, so the standard deviations of the two samples are equal.
• df = n1 + n2 – 2.
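A common textbook way to use the shared proportion is to pool the two samples when computing the standard error. The sketch below assumes that pooled form; the slide's exact formula is not shown here, so treat the details as an assumption. The function name and the counts are hypothetical.

```python
import math

def two_proportion_t(successes1, n1, successes2, n2):
    """t statistic and df for H0: p1 = p2, using a pooled proportion."""
    p1 = successes1 / n1
    p2 = successes2 / n2
    # Under the null both groups share one proportion: pool the samples.
    p_pooled = (successes1 + successes2) / (n1 + n2)
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    t_stat = (p1 - p2) / se
    df = n1 + n2 - 2
    return t_stat, df

# Made-up counts: 10 callbacks out of 100 vs 5 callbacks out of 100.
t_stat, df = two_proportion_t(10, 100, 5, 100)
print(t_stat, df)
```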
Back to Café Firenze?
• What we did before:
– Confidence interval around the mean m1 of Café Firenze and the mean m2 of Lebanese Express.
– We showed that:
• Exercise at home:
– Can we reject the null hypothesis that the true mean rating m1 of Café Firenze is equal to the true mean rating m2 of Lebanese Express?
• “In many studies, one group of volunteers will be given an experimental or "test" drug or treatment, while the control group is given either a standard treatment for the illness or an inactive pill, liquid, or powder that has no treatment value (placebo). This control group provides a basis for comparison for assessing effects of the test treatment. In some studies, the control group will receive a placebo instead of an active drug or treatment. In other cases, it is considered unethical to use placebos, particularly if an effective treatment is available. Withholding treatment (even for a short time) would subject research participants to unreasonable risks.”
Another application of t tests for the equality of means
Coming up:
• Reading: chapter on “Comparing Two Groups”.
• Next: chapter 9, with t tests for slope coefficients.
• Online quiz this weekend on this material.
• Session on Saturday at 12.45 in the same room (catch-up for National Day).
• Make sure you come to sessions and recitations.
For help:
• Amine Ouazad
– Office 1135, Social Science.
– Office hour: Tuesday from 5 to 6.30pm.
• GAF: Irene
– Recitations: at the Academic Resource Center, Monday from 2 to 4pm.
Read only if interested: degrees of freedom for two groups x and y
• Satterthwaite’s approximate formula:
• Welch’s approximate formula:
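The two approximations can be sketched in Python as below. The formulas follow the versions documented for Stata's `ttest` command, which is assumed (not confirmed) to be what the handout shows. The function names are hypothetical; `s_x` and `s_y` are the sample standard deviations, `n_x` and `n_y` the group sizes.

```python
def satterthwaite_df(s_x, n_x, s_y, n_y):
    """Satterthwaite's approximate degrees of freedom."""
    a = s_x ** 2 / n_x
    b = s_y ** 2 / n_y
    return (a + b) ** 2 / (a ** 2 / (n_x - 1) + b ** 2 / (n_y - 1))

def welch_df(s_x, n_x, s_y, n_y):
    """Welch's approximate degrees of freedom."""
    a = s_x ** 2 / n_x
    b = s_y ** 2 / n_y
    return -2 + (a + b) ** 2 / (a ** 2 / (n_x + 1) + b ** 2 / (n_y + 1))

# With equal spreads and equal group sizes, both land close to n_x + n_y - 2.
print(satterthwaite_df(2.0, 10, 2.0, 10))
print(welch_df(2.0, 10, 2.0, 10))
```

When the two groups have similar standard deviations and sizes, both values are near n1 + n2 – 2, which is why that simpler df is a reasonable shortcut in the application above.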