Simple Group Comparisons

Comparison of a single sample (group) to the population

Ho: Sample Statistic - Population Parameter = 0

Comparison of two independent samples from the same population

Ho: Sample 1 Statistic - Sample 2 Statistic = 0

Comparison of two related samples from the same population

Ho: Difference between related scores = 0

In each case, the statistical test compares an observed difference (based on the data collected)

to an expected difference (expected when the Ho is true)

Research question

Did students admitted in 1982 score better than the national norm on the GRE tests?

GRE Verbal

GRE Quantitative

Interval or Ratio Data – consider these data first

Sample compared to Population (like previous example of class to standardized test)

Ho : Sample Mean - Population Mean = 0

Is the observed variability (of the sample mean from the population mean) more than would be expected by chance in this population?

Is the observed variability (potential systematic variability) ‘greater than’ the unsystematic variability (the expected variability of means across equivalent samples)?

How likely is the observed Mean in the assumed Population?

limited to cases where a single difference is being evaluated

and the Population Mean is ‘known’ (or assumed)

One sample t-test

Is the mean for the sample ‘typical’ of sample means drawn from the population? Is the GRE mean for MA students different from the national “Mean”?

sample mean is compared to population mean to see if there is a systematic difference

(more difference than would be expected if Ho true and only error variance is present)

Assumptions

interval/ratio data (at least being treated as interval/ratio data)

sample was based on independent observations from the population

normal distribution – data were drawn from one, not that the data will appear to be normal

- if sample is 50+, normality is less important

- only serious variations from normal need be concerns

standard deviation of sample is equivalent to population value (usually cannot ‘know’, since t is used when the population SD is unknown)

t-test for sample to population comparison is equivalent to a z-test, except you know the population mean but not the population standard deviation, so it must be estimated using the sample information.

M – μ (observed deviation = systematic + error)

SEmean (typical deviation when Ho true = error only), estimated from the sample SD and sample size

What is the difference between z and t?

So, what value for t is likely to lead you to reject the Ho? Assuming typical p = .05, two-tailed test

What is the likely ‘critical t value’ on the Ho distribution?

Implications of sample size (degrees of freedom) for the critical t; refresh the notion of degrees of freedom

t = (M – μ) / SEmean, where SEmean = SD / √N and df = N – 1
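The ratio of observed deviation to SEmean can be sketched in a few lines of Python. This is a minimal illustration, not the course's SPSS procedure: the function name `one_sample_t`, the five scores, and the population mean of 500 are all hypothetical and are not the class data.

```python
import math
import statistics

def one_sample_t(scores, pop_mean):
    """Return (t, df) for a sample mean compared to a known population mean."""
    n = len(scores)
    sample_mean = statistics.mean(scores)
    sample_sd = statistics.stdev(scores)     # uses n - 1 in the denominator
    se_mean = sample_sd / math.sqrt(n)       # expected deviation when Ho is true
    t = (sample_mean - pop_mean) / se_mean   # observed deviation / expected deviation
    return t, n - 1

# Hypothetical scores and norm, for illustration only.
scores = [520, 560, 600, 640, 680]
t, df = one_sample_t(scores, pop_mean=500)
print(f"t({df}) = {t:.3f}")
```

The observed t is then compared with the critical t for the same df. With smaller samples the critical value is larger, so the same mean difference is harder to call systematic.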

Research question 1

Was the Class of 1982, first Psychology Masters students, above average on the GRE test scores?

GRE – Verbal

GRE - Quantitative

Go back to the Steps…….

Characteristics of the Data

Measurement type – nominal, ordinal, interval/ratio

Independent or related observations

Steps in Statistical Evaluation and Decision-making

Pre Data Collection What are the Research Questions and Specific Hypotheses?

Descriptive research – variability or structure of a variable

Covariation – simple or complex

Group Differences – simple or complex

Select appropriate statistical test - t-test

Establish criteria for decision to be made based on test

- one vs. two tailed test

- type one error level to be used (.05)

- experiment-wise (family-wise) alpha (could use .025/test), since 2 means will be compared

Power Analysis

- to determine sample size goals

- to determine power for the sample available (n = 23) and for the Type 1 error level selected (power = .63 for moderate effect size, d = .5)

- what if wanted to detect large/small effects?
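The power figure for n = 23 can be roughly checked without GPower. The sketch below uses a normal approximation to the noncentral t distribution, which is an assumption of this example and not GPower's exact method, so values should be read as estimates. The function name `approx_power` is hypothetical, and 2.074 is the tabled two-tailed critical t for df = 22 at alpha = .05.

```python
import math
from statistics import NormalDist

def approx_power(d, n, t_crit):
    """Approximate two-tailed power of a one-sample t-test.

    Normal approximation to the noncentral t distribution:
    a rough estimate, not an exact power calculation.
    """
    ncp = d * math.sqrt(n)   # expected t when the true effect size is d
    z = NormalDist()
    # Probability of landing beyond either critical value.
    return z.cdf(ncp - t_crit) + z.cdf(-ncp - t_crit)

T_CRIT_DF22 = 2.074   # tabled critical t, df = 22, alpha = .05 two-tailed

for label, d in [("small", 0.2), ("moderate", 0.5), ("large", 0.8)]:
    power = approx_power(d, 23, T_CRIT_DF22)
    print(f"{label:8s} d = {d}: power ~ {power:.2f}")
```

For d = .5 the approximation comes out close to the .63 quoted above. The small and large effect rows give a sense of how much power changes with effect size at this sample size.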

Steps in Statistical Evaluation and Decision-making

Post Data Collection

Check Data to See if Assumptions were Met

- Exploratory Data Analysis

- Data Clean-up – Transformations, if needed

- Alter Choice of Statistical test if needed

Run Statistical Test and Compute Appropriate Descriptive Statistics

Interpret Results of Statistical Test

Can the Null Hypothesis be Rejected?

- If Reject – consider type I error level, confidence intervals, effect size, power

- If Fail to Reject – consider power (type 2 error), significance, confidence intervals

Always must interpret by reference to the descriptive statistics

Interpret Results

Give meaning to the results in the context of the design and statistical tests used

Consider limits of conclusions supported by the statistical tests

Consider Design strengths and Weaknesses

Generalize, as appropriate

Go to GPower to estimate power

Go to the handout pack

Explore the data to check data and assumptions

- any missing values

- are scores reasonable

- any outliers

- will Mean capture ‘typical’

- could data be from a normal distribution?

t-test is robust, so potential deviations must be clear

Calculate the observed t value

Decide

Interpret

Research question 2

Do students in the two programs differ on GPA’s as undergraduates?

GPA Total

GPA JrSr Years

Situation 2

Two Independent Means – are they drawn from the same population?

Like the Comparison of Chanters to Non-Chanters

Ho: Sample 1 Mean - Sample 2 Mean = 0

C/C students Mean – I/O students Mean = 0

t-test for independent samples

What makes them ‘independent’?

Situation 2

Two Independent Means – are they drawn from the same population?

Are C/C students and I/O students drawn from the same population?

Ho: Sample 1 Mean - Sample 2 Mean = 0

Assumptions

interval/ratio data (or at least assumed)

independent observations from the population

homogeneity of variances (Tabachnick & Fidell rule of thumb)

- if sample sizes same, or close (4:1 ratio or less)

- and ratio of variances 10:1 or less

- probably OK

if not in these limits, then use separate variances t test and adjusted df

normal distribution – drawn from one, not is one; robust unless dramatic departure

Most likely nonparametric alternative – Mann-Whitney U

M1 – M2: Observed deviation (error + systematic)

SEm-m: Expected deviation when Ho true (error only)

the major issue now is how to estimate the SEm-m (standard error of the sampling distribution based on the differences between two means from the same population)

use group variances to estimate variability in distribution of differences between pairs of means drawn from the same population

you have two samples, not just one, so how should they be combined

if the samples have similar variances, can pool the two estimates into one (SPSS uses a weighted average, weighted by df in group)

implications of pooling 2 dissimilar variances?

if dissimilar, may want to keep them separate, especially if sample sizes differ dramatically

t = (M1 – M2) / SEm-m

Go to the handout pack – Explore the data to check data and assumptions

- any missing values

- are scores reasonable

- any outliers

- will Mean capture ‘typical’

- could data be from a normal distribution?

t-test is robust, so potential deviations must be clear

Calculate the observed t value

Decide

Interpret

How would results be changed if the 2 “outliers” were dropped?

GPA = 2.75

GPA = 2.98

Both are in the C/C program

Group Statistics: Undergrad GPA Jr Sr Years

program    N     Mean      Std. Deviation    Std. Error Mean
C/C         9    3.6544    .31193            .10398
I/O        12    3.6708    .16914            .04883

Independent Samples Test: Undergrad GPA Jr Sr Years

Levene's Test for Equality of Variances: F = 8.157, Sig. = .010

t-test for Equality of Means

                               t       df       Sig.         Mean         Std. Error    95% Confidence Interval
                                                (2-tailed)   Difference   Difference    of the Difference
                                                                                        Lower       Upper
Equal variances assumed        -.155   19       .878         -.01639      .10577        -.23776     .20499
Equal variances not assumed    -.143   11.510   .889         -.01639      .11487        -.26786     .23508

SDs are now more similar, as are the group Means, but you have lost df’s
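The two rows of the SPSS output (equal variances assumed vs. not assumed) can be reproduced from the group statistics alone. The sketch below is illustrative: the function name `independent_t` is hypothetical, and the adjusted df uses the Welch-Satterthwaite formula, which is assumed here because it matches the df reported in the output.

```python
import math

def independent_t(n1, m1, sd1, n2, m2, sd2):
    """Pooled and separate-variance t from group summary statistics."""
    diff = m1 - m2
    # Pooled variance: the two group variances averaged, weighted by df.
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    se_pooled = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    # Separate variances: each group keeps its own error estimate.
    v1, v2 = sd1**2 / n1, sd2**2 / n2
    se_separate = math.sqrt(v1 + v2)
    df_adjusted = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return {
        "pooled": (diff / se_pooled, n1 + n2 - 2, se_pooled),
        "separate": (diff / se_separate, df_adjusted, se_separate),
    }

# Group statistics for Undergrad GPA Jr Sr Years (C/C, then I/O).
result = independent_t(9, 3.6544, 0.31193, 12, 3.6708, 0.16914)
for name, (t, df, se) in result.items():
    print(f"{name:9s} t = {t:.3f}, df = {df:.3f}, SE = {se:.5f}")
```

Keeping the variances separate gives a larger standard error and fewer df here, which is the cost of not pooling when the group variances differ.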

Research Situation 3

What if the two means are from the same samples, or non-independent samples

repeated measures approach (within subjects)

matching in the design as a control strategy

unit of analysis becomes the difference between paired scores (a transformation to restore independence)

so each pair of scores contributes only a single observation to the calculation

two related scores yield a single measure, without relatedness

t test for paired samples

Low Noise (Group 1)           High Noise (Group 2)          Sample of Difference Scores

 S#    X   X-M1  (X-M1)²       S#    X   X-M2  (X-M2)²        D   D-MD  (D-MD)²
  1    7    -3      9          11    6    -1      1           1    -2      4
  2    8    -2      4          12    5    -2      4           3     0      0
  3    9    -1      1          13    6    -1      1           3     0      0
  4    9    -1      1          14    6    -1      1           3     0      0
  5   10     0      0          15    7     0      0           3     0      0
  6   10     0      0          16    7     0      0           3     0      0
  7   11    +1      1          17    8    +1      1           3     0      0
  8   11    +1      1          18    8    +1      1           3     0      0
  9   12    +2      4          19    9    +2      4           3     0      0
 10   13    +3      9          20    8    +1      1           5    +2      4
 ________________________      ________________________      ________________
Sum  100     0     30         Sum   70     0     14          30     0      8

Example with “Matched” Groups

The independent variable was the level of background noise present while participants studied the list of 15 verbs for two minutes. Participants were matched on GPA.

The dependent variable was the number of verbs correctly recalled after a two minute delay.
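To see what matching buys, the same recall scores can be analyzed two ways: as paired difference scores, and as if the two groups were independent. This is a sketch for comparison only; the variable names are hypothetical, and the independent analysis is shown purely to contrast the error terms, since the matched design calls for the paired test.

```python
import math
import statistics

low_noise = [7, 8, 9, 9, 10, 10, 11, 11, 12, 13]   # Group 1
high_noise = [6, 5, 6, 6, 7, 7, 8, 8, 9, 8]        # Group 2 (matched partners)
n = len(low_noise)

# Paired approach: each matched pair contributes one difference score.
diffs = [x1 - x2 for x1, x2 in zip(low_noise, high_noise)]
mean_diff = statistics.mean(diffs)
se_diff = statistics.stdev(diffs) / math.sqrt(n)
t_paired = mean_diff / se_diff                      # df = n - 1

# Independent approach (ignores the matching): pooled variance, equal n.
pooled_var = (statistics.variance(low_noise) + statistics.variance(high_noise)) / 2
se_indep = math.sqrt(pooled_var * (2 / n))
t_indep = mean_diff / se_indep                      # df = 2n - 2

print(f"paired:      t({n - 1}) = {t_paired:.2f}, SE = {se_diff:.3f}")
print(f"independent: t({2 * n - 2}) = {t_indep:.2f}, SE = {se_indep:.3f}")
```

The mean difference is the same in both analyses. Only the error term changes, because the difference scores vary much less than the raw scores do.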

Research question

Is there a difference between performance on the GREV and the GREQ for the 1982 class

Ho: GREV – GREQ = 0

t test for paired samples

Assumptions

interval/ratio data (or treated as such)

independent observations

normal distribution (drawn from)

(note no need for homogeneity of variances)

Most likely nonparametric alternatives – Wilcoxon or Sign test

Mean Difference between paired scores

SE diff - how variable are difference scores (less variable than raw scores usually)

The t-test for paired samples is similar to the t-test comparing a sample mean to the population mean

When the Ho is true, assume that the population Mean Difference = 0

Ho: Mean difference between related scores = 0

t = Mean Difference / SEdiff, where SEdiff = SD of difference scores / √N and df = N – 1

Research question

Is there a difference between performance on the GREV and the GREQ for the 1982 class

Convert each student’s two scores into a single difference score (GREV – GREQ = difference)

Go to the handouts …

N = 23

Mean Difference = 37.39 (GRE V = 554.35; GRE Q = 516.96)

SD of difference scores = 81.42

SE(GREV – GREQ) = 81.42 / √23 = 16.9773
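The remaining step, dividing the mean difference by its standard error, can be sketched from the summary values on this slide. The critical value 2.074 is the tabled two-tailed critical t for df = 22 at alpha = .05; using that criterion is an assumption of this example, and a stricter family-wise alpha would change the decision rule.

```python
import math

n = 23
mean_diff = 37.39    # GREV - GREQ
sd_diff = 81.42      # SD of the difference scores

se_diff = sd_diff / math.sqrt(n)   # standard error of the mean difference
t = mean_diff / se_diff
df = n - 1

T_CRIT = 2.074   # tabled critical t, df = 22, alpha = .05 two-tailed

print(f"SE = {se_diff:.4f}, t({df}) = {t:.3f}")
print("reject Ho" if abs(t) > T_CRIT else "fail to reject Ho")
```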

Simple Group Differences – categorical data

When data are nominal or ordinal in categories – questions deal with frequencies in categories

So comparison involves the distribution of observations across categories

• Sample to Population

Ho: Differences between sample frequencies in categories and population frequencies in categories will = 0

• Two Independent Samples

Ho: Differences between the two independent samples’ frequencies in categories will = 0

• Two Related Samples

Ho: Differences between paired categorizations will = 0

Simple Group Differences – categorical data

Chi-square Test

Assumptions

1. Nominal data, or Ordinal in categories (frequencies)

2. Mutually exclusive categories

3. Independent Observations

4. No more than 20% of categories have expected frequencies < 5

5. No categories have expected frequencies < 1

To assess fit of Sample to population, must “know” population frequencies (expected frequencies)

A Known Population – gender distribution in adults is 50-50

Have a sample that includes 30 females, and 20 males

An Assumed Population – when Ho true

Have a sample (n = 90) of cola preferences (blind taste test)

                Pepsi    Coke    RC Cola
Sample            45      30       15
Assumed Pop       30      30       30     (if no real differences in preferences in the population)

To assess where differences occur, examine ‘residuals’

differences between observed and expected frequencies

Single Sample - test of proportions or fit

• Sample Frequencies compared to Population Expected Frequencies

• have a sample set of category frequencies, want to know if different from frequencies expected in Population

• χ² = Σ [ (Obs freq – Exp freq)² / Exp freq ]

• still like a ratio of (Systematic + Error) / Error

• where ‘error’ or unsystematic would be the expected frequencies when chance (Ho) operates

                  Pepsi    Coke    RC Cola
Sample (obs)        45      30       15
Pop (expected)      30      30       30

χ² = (45 – 30)²/30 + (30 – 30)²/30 + (15 – 30)²/30 = 15

χ²(df, N = sample size) = 15, with what df?

Is there evidence of ‘systematic’ variability between sample and pop?

Conclusion?
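The cola calculation can be checked with a short sketch. The function name `chi_square_fit` is hypothetical. The df is taken as the number of categories minus one, and the p-value uses the closed form exp(–χ²/2), which holds only for df = 2.

```python
import math

def chi_square_fit(observed, expected):
    """Goodness-of-fit chi-square: returns (chi2, df, residuals)."""
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    residuals = [o - e for o, e in zip(observed, expected)]
    return chi2, len(observed) - 1, residuals

observed = [45, 30, 15]   # Pepsi, Coke, RC Cola
expected = [30, 30, 30]   # equal preference if Ho is true

chi2, df, residuals = chi_square_fit(observed, expected)

# Upper-tail probability; this closed form is valid only when df = 2.
p = math.exp(-chi2 / 2)

print(f"chi-square({df}, N = {sum(observed)}) = {chi2:.2f}, p = {p:.4f}")
print("residuals (obs - exp):", residuals)
```

The residuals show where the departure from the expected frequencies occurs: more Pepsi choices and fewer RC Cola choices than expected, with Coke exactly as expected.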

Look at example in CoursePack – is gender representation in MA program representative of the population

Situation 2

Comparing two independent samples

TEST OF INDEPENDENCE Chi Square

Like t test, can compare two samples assumed to be from the same population to see if they are different from one another when the data are categorical

assume IV / DV situation, want to know if levels of IV are associated with different relative frequencies

Two Independent Samples

Ho: Differences between the two samples’ frequencies in categories will = 0

Same formula as used for Sample to Population

have a sample set of category frequencies, want to know if different from frequencies expected in Population (when Ho true)

χ² = Σ [ (Obs freq – Exp freq)² / Exp freq ]

again like a ratio of (Error + Systematic) / Error

where ‘error’ is expected frequencies when chance operates

In this case, do not ‘know’ the population expected frequencies

must estimate them from data (like finding SE of differences in t test)

what would be the expected frequencies in the categories if two samples were from same population

(no systematic differences)?

Contingency Table

Assume you are interested in the effects of request style on willingness to help

Numbers represent observed frequencies of responses

                        Response
Request Style      help    No help    Total
Rude                10       30         40
Polite              20       20         40
Total               30       50         80

Numbers represent expected frequencies of responses if Ho true

                        Response
Request Style      help    No help    Total
Rude                15       25         40
Polite              15       25         40
Total               30       50         80

Go to the calculations……

χ² = Σ [ (Obs freq – Exp freq)² / Exp freq ]

χ² = (10 – 15)²/15 + (30 – 25)²/25 + (20 – 15)²/15 + (20 – 25)²/25 = 5.33

χ²(df, N = 80) = 5.33, with what df?
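Because the population frequencies are not known here, the expected frequencies are estimated from the row and column totals (row total × column total / N). The sketch below shows that step for the request style data. The function name `chi_square_independence` is hypothetical, and the p-value uses the closed form erfc(√(χ²/2)), which holds only for df = 1.

```python
import math

def chi_square_independence(table):
    """Chi-square test of independence: returns (chi2, df, expected)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected frequency when Ho is true: row total * column total / N.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df, expected

# Rows: Rude, Polite. Columns: help, No help.
observed = [[10, 30],
            [20, 20]]
chi2, df, expected = chi_square_independence(observed)

# Upper-tail probability; this closed form is valid only when df = 1.
p = math.erfc(math.sqrt(chi2 / 2))

print("expected:", expected)
print(f"chi-square({df}, N = 80) = {chi2:.2f}, p = {p:.3f}")
```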

Go to the example in the handouts…….

Is gender distribution same for both programs (C/C & I/O)

Power and effect size with Chi-square

‘w’ as measure of differences in proportions

phi² and Cramer’s V² as effect size measures, in % of variance

w = phi or Cramer’s V
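These effect sizes can be computed directly from a chi-square value and the sample size. The sketch below assumes the usual definitions, w = √(χ²/N) and Cramer's V = √(χ²/(N × (smaller of rows, columns – 1))). The function names are hypothetical. For a 2x2 table, w, phi, and V are the same number.

```python
import math

def cohen_w(chi2, n):
    """Effect size w (equals phi for a 2x2 table)."""
    return math.sqrt(chi2 / n)

def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V for a table with n_rows rows and n_cols columns."""
    return math.sqrt(chi2 / (n * (min(n_rows, n_cols) - 1)))

# Request style example: chi-square = 5.33, N = 80, 2x2 table.
phi = cohen_w(5.33, 80)
v = cramers_v(5.33, 80, 2, 2)

# Cola preference example: chi-square = 15, N = 90.
w_cola = cohen_w(15, 90)

print(f"request style: phi = {phi:.3f}, V = {v:.3f}, phi squared = {phi**2:.3f}")
print(f"cola fit:      w = {w_cola:.3f}")
```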

Situation 3

Two Related Samples

Ho: Differences between paired categorizations will = 0

McNemar Test – variation of chi-square, like the Paired Samples t-test

Must, as before, create independent observations

look at categories that indicate paired responses

Research question:

Will people be more likely to help a child in need or an adult in need?

Assume you are interested in the effect of the status of the person in need on the willingness to help – observe 30 individuals’ responses to two possible situations

                 Response
Status       Help    No help    Total
child         18       12         30
adult         14       16         30
Total         32       28         60

Numbers are not independent – categories are not mutually exclusive

So, need to ‘transform’ the data to create independent frequencies

60 observed behaviors become

30 individuals’ paired behaviors

Help a Child * Help an Adult Crosstabulation

Count

                             Help an Adult
                             No Help    Help    Total
Help a Child    No Help         6         6       12
                Help           10         8       18
Total                          16        14       30

Chi-Square Tests

                      Value    Exact Sig.    Exact Sig.    Point
                               (2-sided)     (1-sided)     Probability
McNemar Test                   .454(a)       .227(a)       .122(a)
N of Valid Cases      30

a. Binomial distribution used.
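The McNemar values in this output depend only on the two discordant cells: the people who helped one target but not the other. The sketch below computes the exact binomial probabilities under Ho, where each discordant pair is equally likely to go either way. The function name `mcnemar_exact` is hypothetical, and doubling the one-sided value for the two-sided test is an assumption that matches the SPSS figures here.

```python
from math import comb

def mcnemar_exact(b, c):
    """Exact McNemar test from the two discordant cell counts.

    Returns (two_sided_p, one_sided_p, point_probability).
    """
    n = b + c                 # number of discordant pairs
    k = min(b, c)
    # Under Ho, each discordant pair falls in either cell with p = .5.
    one_sided = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    point = comb(n, k) / 2 ** n
    return min(1.0, 2 * one_sided), one_sided, point

# Discordant cells from the crosstabulation:
# helped the child but not the adult = 10, helped the adult but not the child = 6.
two_sided, one_sided, point = mcnemar_exact(10, 6)
print(f"2-sided = {two_sided:.3f}, 1-sided = {one_sided:.3f}, point = {point:.3f}")
```

The 14 people who gave the same response in both situations (6 never helped, 8 always helped) carry no information about a difference between the two situations, so they do not enter the test.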


Help a Child * Help an Adult Crosstabulation

                                            Help an Adult
                                            No Help    Help     Total
Help a Child    No Help    Count               6         6        12
                           Expected Count     6.4       5.6      12.0
                Help       Count              10         8        18
                           Expected Count     9.6       8.4      18.0
Total                      Count              16        14        30
                           Expected Count    16.0      14.0      30.0

Chi-Square Tests

                                Value      df    Asymp. Sig.    Exact Sig.    Exact Sig.    Point
                                                 (2-sided)      (2-sided)     (1-sided)     Probability
Pearson Chi-Square              .089(b)    1     .765           1.000         .529
Continuity Correction(a)        .000       1     1.000
Likelihood Ratio                .089       1     .765           1.000         .529
Fisher's Exact Test                                             1.000         .529
Linear-by-Linear Association    .086(c)    1     .769           1.000         .529          .278
McNemar Test                                                    .454(d)       .227(d)       .122(d)
N of Valid Cases                30

a. Computed only for a 2x2 table

b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 5.60.

c. The standardized statistic is -.294.

d. Binomial distribution used.

Where is the evidence for or against ‘independence’?

What cell frequencies?


Help a Child * Help an Adult Crosstabulation

                                            Help an Adult
                                            No Help    Help     Total
Help a Child    No Help    Count               2        12        14
                           Expected Count     7.5       6.5      14.0
                Help       Count              30        16        46
                           Expected Count    24.5      21.5      46.0
Total                      Count              32        28        60
                           Expected Count    32.0      28.0      60.0

Chi-Square Tests

                                Value        df    Asymp. Sig.    Exact Sig.    Exact Sig.    Point
                                                   (2-sided)      (2-sided)     (1-sided)     Probability
Pearson Chi-Square              11.187(b)    1     .001           .002          .001
Continuity Correction(a)        9.234        1     .002
Likelihood Ratio                11.987       1     .001           .002          .001
Fisher's Exact Test                                               .002          .001
Linear-by-Linear Association    11.000(c)    1     .001           .002          .001          .001
McNemar Test                                                      .008(d)       .004(d)       .003(d)
N of Valid Cases                60

a. Computed only for a 2x2 table

b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 6.53.

c. The standardized statistic is -3.317.

d. Binomial distribution used.

Where is the evidence for or against ‘independence’?

What cell frequencies?

Same Question – with larger sample and more extreme differences

• Asymptotic Significance: The significance level based on the asymptotic distribution of a test statistic. Typically, a value of less than 0.05 is considered significant. The asymptotic significance is based on the assumption that the data set is large. If the data set is small or poorly distributed, this may not be a good indication of significance.

Exact Tests option not provided with SPSS 19 – will get with 2x2 Tables

Date post:	21-Jan-2016
Category:	Documents
Upload:	nam