Statistics for Social and Behavioral Sciences Part IV: Causality Comparison of two groups Chapter 7...

Statistics for Social and Behavioral Sciences

Part IV: Causality
Comparison of two groups

Chapter 7
Prof. Amine Ouazad

Statistics Course Outline

PART I. INTRODUCTION AND RESEARCH DESIGN (Week 1)
Four Steps of “Thinking Like a Statistician”. Study Design: Simple Random Sampling, Cluster Sampling, Stratified Sampling. Biases: Nonresponse bias, Response bias, Sampling bias.

PART II. DESCRIBING DATA (Weeks 2-4)
Sample statistics: Mean, Median, SD, Variance, Percentiles, IQR, Empirical Rule. Bivariate sample statistics: Correlation, Slope.

PART III. DRAWING CONCLUSIONS FROM DATA: INFERENTIAL STATISTICS (Weeks 5-9)
Estimating a parameter using sample statistics. Confidence Interval at 90%, 95%, 99%. Testing a hypothesis using the CI method and the t method.

PART IV. CORRELATION AND CAUSATION: TWO GROUPS, REGRESSION ANALYSIS (Weeks 10-14)
This is where we talk about Zmapp and Ebola!

Coming up

• “Comparison of Two Groups”: This Session.
• “Univariate Regression Analysis”: Next Session, Saturday.
• “Association and Causality”: Tuesday, Thursday and Extra Session.
• “Randomized Experiments (Continued), ANOVA”: Last Tuesday and Extra Session.
• “Robustness Checks and Wrap Up”: Last Thursday.

Outline

1. Randomized controlled trials

2. t test for equality of means

Next time: Inference in Univariate Regressions

Do U.S. Employers Discriminate?

• Employers post job ads.
• Sometimes mentioning they are an “Equal Opportunity Employer.”
• Some employers are federal contractors.
• Lots of anecdotal evidence…
– “In hiring, racial bias is still a problem.” Forbes.
– “Protesters allege hiring discrimination by Ferrara Candy”, Chicago Tribune, October 28, 2014.
• But we can’t trust stories…
• Very very tough question. Should be extra careful.
• What about causal evidence from statistical data?

Outline

1. Randomized controlled trials

2. t test for equality of means

Next time: Inference in Univariate Regressions

Difference of means

• Two groups: White and African American.
– m1: sample mean in the first group. X1i: observation of individual i in group 1.
– m2: sample mean in the second group. X2i: observation of individual i in group 2.
• The expected value of the difference m1-m2?
• Sampling distribution of the difference m1-m2?
• Standard error of m1-m2: standard deviation of the sampling distribution of m1-m2.
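The questions on this slide have standard textbook answers when the two samples are independent; a sketch follows. The symbols \mu_1, \mu_2 (true group means) and s_1, s_2 (sample standard deviations) are assumed notation that does not appear on the slide, and n_1, n_2 are the group sizes.

```latex
\[
E[m_1 - m_2] = \mu_1 - \mu_2,
\qquad
\mathrm{se}(m_1 - m_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
\]
```

For large samples, the sampling distribution of m1-m2 is approximately normal and centered at \mu_1 - \mu_2.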

Confidence Interval for the Difference of Means

• Very similar to the one-group case.
• t is also chosen from Table 5.1.
• Degrees of freedom df given either by the Welch approximation or the Satterthwaite approximation (see end of handout).

In general:
• Use of the t distribution makes a normality assumption on X.
• The method is robust to violations of the normality assumption, esp. for proportions.
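As a sketch of the interval this slide describes, in its usual unequal-variance form: s_1 and s_2 (sample standard deviations) are assumed notation, and t is the t score for the chosen confidence level and degrees of freedom.

```latex
\[
(m_1 - m_2) \;\pm\; t \times \mathrm{se}(m_1 - m_2),
\qquad
\mathrm{se}(m_1 - m_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
\]
```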

t statistic for the difference of means

• Built similarly to the one-group case.
• Can also subtract v from the numerator when testing whether the difference of means equals a number v.
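A minimal Python sketch of the t statistic described above, assuming the unequal-variance standard error and the Satterthwaite degrees of freedom. The function name welch_t and the two small data sets are hypothetical, chosen only for illustration.

```python
import math
from statistics import mean, variance


def welch_t(x1, x2, v=0.0):
    """Return (t, se, df) for H0: true mean difference equals v.

    Assumes two independent samples with possibly unequal variances.
    """
    n1, n2 = len(x1), len(x2)
    m1, m2 = mean(x1), mean(x2)
    # Each group's contribution to the squared standard error: s^2 / n
    q1 = variance(x1) / n1
    q2 = variance(x2) / n2
    se = math.sqrt(q1 + q2)
    # Subtract v from the numerator when testing "difference equals v"
    t = (m1 - m2 - v) / se
    # Satterthwaite approximation for the degrees of freedom
    df = (q1 + q2) ** 2 / (q1 ** 2 / (n1 - 1) + q2 ** 2 / (n2 - 1))
    return t, se, df


# Hypothetical observations, not data from the course
group1 = [1, 2, 3, 4, 5]
group2 = [2, 4, 6, 8, 10]
t, se, df = welch_t(group1, group2)
print(f"t = {t:.2f}, se = {se:.2f}, df = {df:.2f}")
# t is about -1.90 with df about 5.88
```

Passing v=-3.0 for the same data gives t = 0, because the observed difference m1-m2 is exactly -3.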

Methods for hypothesis testing

H0: m1 = m2
Ha: m1 different from m2.

Reject the null hypothesis if either:
• The confidence interval for m1-m2 does not include 0.
• The t statistic is above the t score in absolute value.

Application

• Compute the t statistic of the difference for each city.
• Can you recover the p values using Stata?
• Can you reject the null hypothesis that the callback rates for White and African American names are equal?

Can you recover the p values?
• Using display 2*ttail in Stata.
• And df approximately n1 + n2 – 2.

Fill in the t statistics here (columns: t statistic, Reject?). Can we reject the null hypothesis?
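In Stata, ttail(df, t) returns the upper-tail probability of the t distribution, so the two-sided p value is display 2*ttail(df, abs(t)). The Python sketch below mimics that calculation using only the standard library, integrating the t density numerically with Simpson's rule. The function names and the example values of t and df are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def ttail(df, t, steps=2000):
    """Upper-tail probability P(T > t); steps must be even (Simpson's rule)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    central = total * h / 3  # P(0 < T < |t|)
    upper = 0.5 - central    # P(T > |t|), by symmetry around 0
    return upper if t >= 0 else 1.0 - upper


def two_sided_p(t, df):
    """Two-sided p value, the analogue of 2*ttail(df, abs(t))."""
    return 2 * ttail(df, abs(t))


# Hypothetical t statistic and group sizes, with df approximately n1 + n2 - 2
n1, n2 = 100, 102
p_value = two_sided_p(2.5, n1 + n2 - 2)
print(f"p = {p_value:.4f}")
```

Reject the null hypothesis at the 5% level when the printed p value is below 0.05.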

Resume Quality And Callback Rates

A refinement for sample proportions

• When X1i and X2i are variables that take only two values, 0 or 1.
• m1 and m2 are sample proportions p1 and p2.
• Group 1 size: n1. Group 2 size: n2.
• H0: “p1 = p2”
• Under the null, the two groups share a common proportion p, so the standard deviations of the two samples are equal, and thus:
• df = n1 + n2 – 2 and
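A sketch of the usual pooled-proportion form of this test, which may differ in detail from the formula shown in class. The pooled proportion p and the symbol se_0 for the standard error under the null are assumed notation.

```latex
\[
p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2},
\qquad
\mathrm{se}_0 = \sqrt{p(1-p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)},
\qquad
t = \frac{p_1 - p_2}{\mathrm{se}_0}
\]
```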

Back to Café Firenze?

• What we did before:
– Confidence interval around the mean m1 of Café Firenze and the mean m2 of Lebanese Express.
– We showed that:
• Exercise at home:
– Can we reject the null hypothesis that the true mean rating m1 of Café Firenze is equal to the true mean rating m2 of Lebanese Express?

Another application of t tests for the equality of means

• “In many studies, one group of volunteers will be given an experimental or "test" drug or treatment, while the control group is given either a standard treatment for the illness or an inactive pill, liquid, or powder that has no treatment value (placebo). This control group provides a basis for comparison for assessing effects of the test treatment. In some studies, the control group will receive a placebo instead of an active drug or treatment. In other cases, it is considered unethical to use placebos, particularly if an effective treatment is available. Withholding treatment (even for a short time) would subject research participants to unreasonable risks.”

Coming up:

• Reading: Chapter on “Comparing Two Groups”.
• Next: Chapter 9, with t tests for slope coefficients.
• Online quiz this weekend on this material.
• Session on Saturday at 12.45 in the same room -> catch up for National Day.
• Make sure you come to sessions and recitations.

For help:

• Amine Ouazad, Office 1135, Social Science. Office hour: Tuesday from 5 to 6.30pm.

• GAF: Irene. Recitations at the Academic Resource Center, Monday from 2 to 4pm.



Read only if interested:
Degrees of freedom for two groups x and y

• Satterthwaite’s approximate formula:

• Welch’s approximate formula:
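For reference, a sketch of both approximations in the form commonly given for Stata's ttest command, which may differ in notation from the handout. Here s_x, s_y are the sample standard deviations and n_x, n_y the sizes of groups x and y (assumed notation). The first line is Satterthwaite's formula and the second is Welch's.

```latex
\[
df_{\text{Satterthwaite}} =
\frac{\left(\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}\right)^{2}}
     {\dfrac{(s_x^2/n_x)^2}{n_x - 1} + \dfrac{(s_y^2/n_y)^2}{n_y - 1}}
\]
\[
df_{\text{Welch}} = -2 +
\frac{\left(\dfrac{s_x^2}{n_x} + \dfrac{s_y^2}{n_y}\right)^{2}}
     {\dfrac{(s_x^2/n_x)^2}{n_x + 1} + \dfrac{(s_y^2/n_y)^2}{n_y + 1}}
\]
```

Both give a non-integer df between the smaller of n_x - 1 and n_y - 1 and roughly n_x + n_y - 2, which is why df approximately n1 + n2 – 2 works well when the two groups have similar sizes and variances.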
