
Post on 13-Dec-2015


Chi square goodness of fit test

• Tests the goodness of fit of supposed frequencies to sample data.

© V. Čekanavičius, G. Murauskas

goodness of fit

• Data. One categorical (nominal) sample.

• All data is divided into k categories.

• At least 5 respondents expected in each category.

• We make a conjecture about ratios between categories.


goodness of fit

• Statistical hypothesis

H0 : Conjecture is correct.

H1 : Conjecture is incorrect.


Conclusion based on p-value

H0 is rejected (data contradicts conjecture), if $p < 0.05$.

H0 is accepted (data does not contradict conjecture), if $p \ge 0.05$.

Here $\alpha = 0.05$ is the level of significance.

SPSS goodness of fit test

• Is the ratio between national majority and national minority 7:2?

SPSS

[Screenshots: the data; where the test is found in the menu; the dialog with the test variable and the supposed ratio.]

SPSS

Frequencies: minority (Minority Classification)

        Category   Observed N   Expected N   Residual
1       0 No       276          282.3        -6.3
2       1 Yes      87           80.7         6.3
Total              363

Residual = Observed - Expected (the difference).

SPSS

Test Statistics: minority (Minority Classification)

Chi-Square (a)   .639   (test statistic)
df               1
Asymp. Sig.      .424   (p-value)

a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 80.7.

Data does not contradict the ratio 7:2.

Conclusion

• Application of the goodness of fit test showed that there is no statistically significant difference between the supposed ratio of national majority/minority and sample data.

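The 7:2 example can be cross-checked outside SPSS. The following is a minimal sketch in Python, assuming SciPy is installed; the variable names are illustrative and do not come from the slides. The expected counts are obtained by splitting the 363 respondents in the supposed ratio 7:2.

```python
from scipy.stats import chisquare

# Observed counts from the Frequencies table: majority (No), minority (Yes).
observed = [276, 87]

# Supposed ratio between the categories (7:2).
ratio = [7, 2]

# Expected counts: total sample size split according to the ratio.
n = sum(observed)
expected = [n * r / sum(ratio) for r in ratio]

# Chi-square goodness of fit test; df = number of categories - 1.
result = chisquare(f_obs=observed, f_exp=expected)

print("expected counts:", [round(e, 1) for e in expected])
print("chi-square:", round(result.statistic, 3))
print("p-value:", round(result.pvalue, 3))
```

The printed statistic and p-value should agree with the SPSS Test Statistics table (chi-square about 0.639, p about 0.424), so H0 is not rejected at the 0.05 level.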

SPSS Special case

A marketing analyst claims that 25% of the customers will buy a certain type of sweets packed in large boxes, 25% in medium boxes, 30% in small boxes and 20% in very small boxes.

Data: 50 bought large boxes, 40 medium, 72 small and 19 very small.

Does the data contradict the analyst's claim statistically significantly?

SPSS

[Screenshots: the data is numeric; cases are weighted by the count variable (Weight by); in the test dialog the supposed ratio is entered and the weight variable is left alone.]

SPSS

Frequencies: RUSIS

        Observed N   Expected N   Residual
1.00    50           45.3         4.8
2.00    40           45.3         -5.3
3.00    72           54.3         17.7
4.00    19           36.2         -17.2
Total   181

SPSS

Test Statistics: RUSIS

Chi-Square (a)   15.050
df               3
Asymp. Sig.      .002

a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 36.2.

Data statistically significantly contradicts the supposed ratio.
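When the conjecture is given as percentages rather than as a ratio, the expected counts are the sample size times each claimed proportion. A minimal sketch in Python, assuming SciPy is installed (the names are illustrative, not taken from the slides):

```python
from scipy.stats import chisquare

# Observed purchases: large, medium, small, very small boxes.
observed = [50, 40, 72, 19]

# The analyst's claimed proportions for the same categories.
claimed = [0.25, 0.25, 0.30, 0.20]

# Expected counts under H0: sample size times claimed proportion.
n = sum(observed)
expected = [n * p for p in claimed]

# Chi-square goodness of fit test with df = 4 - 1 = 3.
stat, p_value = chisquare(observed, f_exp=expected)

print("expected counts:", [round(e, 1) for e in expected])
print("chi-square:", round(stat, 3))
print("p-value:", round(p_value, 3))
```

The statistic should match the SPSS value of about 15.050, and the p-value is below 0.05, so the data contradicts the analyst's claim.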

CHI SQUARE TEST FOR INDEPENDENCE

Test of association for categorical data

Chi-square test

• Two categorical (nominal) variables.

• We test if those categorical variables are dependent.

Examples

• Does smoking depend on the respondent's religion?

• Do men and women vote similarly?

• Is the percentage of male students the same in all courses?

Data

All data is organized in cells according to two categorical variables.


Statistical hypothesis

H0 : variables are independent.

H1 : variables are dependent.


Conclusion based on p-value

H0 is rejected (variables are dependent), if $p < 0.05$.

H0 is accepted (variables are independent), if $p \ge 0.05$.

Example

• Is the percentage of female employees the same for clerks and managers?

SPSS

[Screenshots: the data, which may be numeric or string; where the procedure is found in the menu; one variable is placed in the row and the other in the column; the required boxes are checked in the further dialogs.]

SPSS

JOBCAT Employment Category * GENDER Gender Crosstabulation

                                                   f Female   m Male   Total
1 Clerical   Count                                 206        157      363
             % within JOBCAT Employment Category   56.7%      43.3%    100.0%
             % within GENDER Gender                95.4%      68.0%    81.2%
3 Manager    Count                                 10         74       84
             % within JOBCAT Employment Category   11.9%      88.1%    100.0%
             % within GENDER Gender                4.6%       32.0%    18.8%
Total        Count                                 216        231      447
             % within JOBCAT Employment Category   48.3%      51.7%    100.0%
             % within GENDER Gender                100.0%     100.0%   100.0%

SPSS

Chi-Square Tests

                            Value        df   Asymp. Sig. (2-sided)   Exact Sig. (2-sided)   Exact Sig. (1-sided)
Pearson Chi-Square          54.935 (b)   1    .000
Continuity Correction (a)   53.154       1    .000
Likelihood Ratio            61.256       1    .000
Fisher's Exact Test                                                   .000                   .000
N of Valid Cases            447

a. Computed only for a 2x2 table.
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 40.59.

The Sig. columns give the p-value.

p < 0.05; therefore, the corresponding percentages differ statistically significantly.

Conclusion

• Applying the chi-square test, we found that the percentage of women among clerks (56.7%) is statistically significantly greater than among managers (11.9%), p < 0.01.
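The clerks and managers crosstabulation can be re-checked with a short script. This is a minimal sketch in Python, assuming NumPy and SciPy are installed; the array layout (rows are job categories, columns are genders) and the variable names are choices made here, not part of the slides.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Rows: Clerical, Manager. Columns: Female, Male (counts from the crosstabulation).
table = np.array([[206, 157],
                  [10, 74]])

# Pearson chi-square without Yates' continuity correction.
chi2, p, dof, expected = chi2_contingency(table, correction=False)

# The same test with the continuity correction (only meaningful for 2x2 tables).
chi2_c, p_c, _, _ = chi2_contingency(table, correction=True)

# Row percentages, i.e. "% within JOBCAT".
row_pct = table / table.sum(axis=1, keepdims=True) * 100

print("Pearson chi-square:", round(chi2, 3), "df:", dof, "p:", p)
print("Continuity-corrected chi-square:", round(chi2_c, 3))
print("Minimum expected count:", round(expected.min(), 2))
print("Percent female among clerks:", round(row_pct[0, 0], 1))
print("Percent female among managers:", round(row_pct[1, 0], 1))
```

The two statistics should reproduce the Pearson Chi-Square and Continuity Correction rows of the SPSS table, and the minimum expected count should match footnote b.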

SPSS Special case

• One hundred children watched violence-prone shows and 100 watched nonviolent programs. After two weeks of observation each child was classified as either aggressive or nonaggressive. 63 watched violent shows and were aggressive, 37 watched violent shows and were nonaggressive, 30 watched nonviolent programs and were aggressive, and 70 watched nonviolent programs and were nonaggressive.

• Are TV and behavior related?

SPSS

[Screenshots: the variables may be numeric or string; cases are weighted by the count variable 'kiek' (Weight by); in the crosstabulation dialog the weight variable is left alone.]

Statistics and Cells are dealt with in the same way as before.

SPSS

ELGESYS * TV Crosstabulation

                             TV: nesmurt   TV: smurt   Total
agres   Count                30            63          93
        % within ELGESYS     32.3%         67.7%       100.0%
        % within TV          30.0%         63.0%       46.5%
neagr   Count                70            37          107
        % within ELGESYS     65.4%         34.6%       100.0%
        % within TV          70.0%         37.0%       53.5%
Total   Count                100           100         200
        % within ELGESYS     50.0%         50.0%       100.0%
        % within TV          100.0%        100.0%      100.0%

(ELGESYS = behavior: agres = aggressive, neagr = nonaggressive. TV: nesmurt = nonviolent, smurt = violent.)

Violent TV watchers are more aggressive.

SPSS

Chi-Square Tests

                            Value        df   Asymp. Sig. (2-sided)   Exact Sig. (2-sided)   Exact Sig. (1-sided)
Pearson Chi-Square          21.887 (b)   1    .000
Continuity Correction (a)   20.581       1    .000
Likelihood Ratio            22.314       1    .000
Fisher's Exact Test                                                   .000                   .000
N of Valid Cases            200

a. Computed only for a 2x2 table.
b. 0 cells (.0%) have expected count less than 5. The minimum expected count is 46.50.

The relation is statistically significant.

Conclusion

• Applying the chi-square test, we established that the percentage of aggressive children is greater among watchers of violent TV (63%) than among non-watchers (30%), p < 0.01.
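In this special case the data arrive as four summarized counts rather than one row per child, which is why SPSS needs the Weight by step. In a script the counts can be placed directly into a 2x2 table. A minimal sketch in Python, assuming NumPy and SciPy are installed; the table orientation and names are choices made here for illustration.

```python
import numpy as np
from scipy.stats import chi2_contingency, fisher_exact

# Rows: aggressive, nonaggressive. Columns: nonviolent TV, violent TV.
table = np.array([[30, 63],
                  [70, 37]])

# Pearson chi-square (no continuity correction) and the corrected version.
chi2, p, dof, expected = chi2_contingency(table, correction=False)
chi2_c, p_c, _, _ = chi2_contingency(table, correction=True)

# Fisher's exact test, available for 2x2 tables.
odds_ratio, p_fisher = fisher_exact(table)

# Column percentages, i.e. "% within TV".
col_pct = table / table.sum(axis=0, keepdims=True) * 100

print("Pearson chi-square:", round(chi2, 3), "p:", p)
print("Continuity-corrected chi-square:", round(chi2_c, 3))
print("Fisher exact p:", p_fisher)
print("Percent aggressive, nonviolent vs violent TV:", col_pct[0])
```

The statistics should agree with the SPSS Chi-Square Tests table (21.887 and 20.581), and the column percentages reproduce the 30% and 63% used in the conclusion.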

McNemar test

McNemar test

• Most frequently applied when we have dichotomous data for the same respondents.

• Buyers and non-buyers before and after advertisement.

• Voters and non-voters before and after TV debates.

Data

• One two-valued categorical variable observed in two related populations, or in one population twice.

Data

             After
Before       a     b
             c     d

Statistical hypothesis

H0 : no impact of advertisement

H1 : significant impact

Conclusion based on p-value

H0 is rejected (impact is statistically significant), if $p < 0.05$.

H0 is accepted (impact is not significant), if $p \ge 0.05$.

Here 0.05 is the level of significance.

SPSS

• Voters were asked twice about their support for a candidate, before and after TV debates:

• Supported the candidate both before and after the debates: 200

• Supported before the debates, but not after: 30

• Did not support before, supported after: 60

• Did not support before or after: 100

• Did the debates influence the voters' preferences?

SPSS

[Screenshots: the data; cases are weighted (Weight by); where the procedure is found in the menu; the two variables are entered; the required box is checked.]

SPSS

pries * po Crosstabulation (pries = before, po = after; Ne = no, už = for)

Count

              po: Ne   po: už   Total
pries: Ne     100      60       160
pries: už     30       200      230
Total         130      260      390

SPSS

Chi-Square Tests

                     Value   Exact Sig. (2-sided)
McNemar Test                 .002 (a)   (p-value)
N of Valid Cases     390

a. Binomial distribution used.

Number of supporters increased statistically significantly.
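The exact McNemar test uses only the respondents who changed their opinion: under H0 a change in either direction is equally likely, so the number of changes in one direction follows a binomial distribution with p = 0.5. A minimal sketch in Python, assuming SciPy 1.7 or later (for `binomtest`); the variable names are illustrative.

```python
from scipy.stats import binomtest

# Discordant pairs from the crosstabulation.
gained = 60   # did not support before, supported after
lost = 30     # supported before, did not support after

# Concordant pairs (200 and 100) do not enter the test.
n_changed = gained + lost

# Exact two-sided binomial test of "changes are equally likely in both directions".
result = binomtest(min(gained, lost), n=n_changed, p=0.5, alternative="two-sided")
p_value = result.pvalue

print("respondents who changed opinion:", n_changed)
print("exact two-sided p-value:", round(p_value, 4))
```

The p-value should round to the .002 reported by SPSS. Since more respondents gained support than lost it (60 versus 30), the significant result is read as an increase in the number of supporters.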

Nonparametric tests

• Are also called rank tests;

• Normality of variables is not required;

• Fit small samples;

• More difficult to interpret;

• The chi-square test is a nonparametric test but not a rank test.

Typical hypothesis

• H0 : distributions of X and Y are equal

• H1 : distributions of X and Y differ

Mann - Whitney test

Mann-Whitney test

1. Analogue of Student's t-test for independent samples;

2. Means are not compared;

3. Compares distributions;

4. The larger mean rank shows which variable is stochastically larger.

Data

1. Two independent interval or rank samples.

2. Sample sizes can be different.

3. Rank variable has at least 5 different outcomes.

Statistical hypothesis

H0 : distributions are equal,

H1 : distributions differ.


Conclusion based on p-value

H0 is rejected (distributions differ) if $p < \alpha$.

H0 is accepted (distributions do not differ) if $p \ge \alpha$.

Here $\alpha$ is the level of significance.

Example

• We investigate respondents who are older than 40 years.

• Is classical music equally appreciated by men and women?

• Values: 1 - like it very much, 2 - like it, ..., 5 - hate it.

SPSS

• After a suitable Select Cases (age > 40)

• Analyze -> Nonparametric Tests -> Legacy Dialogs -> 2 Independent Samples


SPSS

• Males chose greater marks -> they like classical music less.

Ranks

classicl Classical Music, by sex (Respondent's Sex)

            N     Mean Rank   Sum of Ranks
1 Male      321   412.07      132273.50
2 Female    462   378.06      174662.50
Total       783

SPSS

• The difference is statistically significant, p = 0.033 < 0.05.

Test Statistics (a)

                         classicl Classical Music
Mann-Whitney U           67709.500
Wilcoxon W               174662.500
Z                        -2.134
Asymp. Sig. (2-tailed)   .033

a. Grouping Variable: sex Respondent's Sex
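The Ranks table and the Test Statistics table are linked by a simple identity: for a group of size n with rank sum W, the Mann-Whitney statistic is U = W - n(n + 1)/2. The sketch below, in plain Python with illustrative names, recomputes U and the mean ranks from the rank sums shown in the SPSS output. The raw ratings are not available in the slides, so the test itself is not rerun here.

```python
# Group sizes and rank sums taken from the SPSS Ranks table.
n_male, n_female = 321, 462
sum_ranks_male, sum_ranks_female = 132273.5, 174662.5

# U statistic of each group: rank sum minus the smallest possible rank sum.
u_male = sum_ranks_male - n_male * (n_male + 1) / 2
u_female = sum_ranks_female - n_female * (n_female + 1) / 2

# The smaller of the two U values is the one compared with the SPSS output.
u_reported = min(u_male, u_female)

# Mean rank of each group: rank sum divided by group size.
mean_rank_male = sum_ranks_male / n_male
mean_rank_female = sum_ranks_female / n_female

print("U (male), U (female):", u_male, u_female)
print("smaller U:", u_reported)
print("mean ranks:", round(mean_rank_male, 2), round(mean_rank_female, 2))
```

The smaller U equals the Mann-Whitney U of 67709.5 in the table, and the two U values add up to the product of the group sizes. With case-level data, `scipy.stats.mannwhitneyu` could be applied to the two samples directly.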

Wilcoxon test

Wilcoxon test

1. Analogue of Student's paired-samples t-test;

2. Means are not compared;

3. Compares distributions;

4. The larger mean difference rank shows which variable is stochastically larger.

Data

1. Two dependent (paired) interval or rank samples.

2. Rank variable has at least 5 different outcomes.

3. Usually the same respondent measured twice.

Statistical hypothesis

H0 : distributions are equal,

H1 : distributions differ.


Conclusion based on p-value

H0 is rejected (distributions differ) if $p < \alpha$.

H0 is accepted (distributions do not differ) if $p \ge \alpha$.

Here $\alpha$ is the level of significance.

Example

• Do respondents older than 50 years like classical music more than jazz?

• Each respondent rated both music styles using the following scale: 1 - like it very much, ..., 7 - hate it very much.

SPSS

• After a suitable Select Cases (age > 50)

• Analyze -> Nonparametric Tests -> Legacy Dialogs -> 2 Related Samples


SPSS

Ranks (ranks for differences)

JAZZ - CLASSICL      N         Mean Rank   Sum of Ranks
Negative Ranks       138 (a)   157.43      21725.00
Positive Ranks       198 (b)   176.22      34891.00
Ties                 161 (c)
Total                497

a. JAZZ Jazz Music < CLASSICL Classical Music
b. JAZZ Jazz Music > CLASSICL Classical Music
c. CLASSICL Classical Music = JAZZ Jazz Music

SPSS

Test Statistics (b)

                         JAZZ Jazz Music - CLASSICL Classical Music
Z                        -3.782 (a)
Asymp. Sig. (2-tailed)   .000   (p-value)

a. Based on negative ranks.
b. Wilcoxon Signed Ranks Test

Difference is statistically significant.
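The Z value can be related to the Ranks table through the normal approximation of the signed-rank statistic. The sketch below, in plain Python with illustrative names, uses only the rank sums from the SPSS output and applies the textbook approximation without a correction for tied ranks. It is assumed here, not stated in the slides, that the SPSS figure of -3.782 includes such a correction, so the two values are expected to be close but not identical.

```python
import math

# Counts and rank sums from the SPSS Ranks table (ties are dropped by the test).
n_neg, n_pos = 138, 198
w_neg, w_pos = 21725.0, 34891.0

# Number of non-tied pairs; their ranks 1..n must add up to n(n + 1)/2.
n = n_neg + n_pos
ranks_consistent = (w_neg + w_pos == n * (n + 1) / 2)

# Mean and standard deviation of the rank sum under H0 (no tie correction).
mean_w = n * (n + 1) / 4
sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)

# Z based on the smaller rank sum, here the sum of negative ranks.
z = (min(w_neg, w_pos) - mean_w) / sd_w

# Two-sided p-value from the standard normal distribution.
p_value = math.erfc(abs(z) / math.sqrt(2))

print("rank sums consistent:", ranks_consistent)
print("approximate Z:", round(z, 3))
print("approximate two-sided p:", p_value)
```

The approximation gives a Z of roughly -3.7 and a p-value well below 0.05, which leads to the same conclusion as the SPSS output. With the paired ratings themselves, `scipy.stats.wilcoxon` could be used instead.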

Spearman correlation


Spearman correlation test

1. Analogue of Pearson’s correlation.

2. Has the same interpretation.

3. Calculates Pearson’s correlation between ranks;

4. Can be used for already ranked data.


Data

1. Two dependent interval or ranked variables.

2. Rank variable has at least 5 different outcomes.

3. In a special case data can be ranked.

Statistical hypothesis

H0 : variables do not correlate.

H1 : variables correlate.


Conclusion based on p-value

H0 is rejected (variables correlate statistically significantly) if $p < \alpha$.

H0 is accepted (variables do not correlate) if $p \ge \alpha$.

Here $\alpha$ is the level of significance.

Example

• Respondents older than 50 years.

• Do the data support the statement that the more a respondent likes musicals, the more he/she likes classical music?

• Analyze -> Correlate -> Bivariate

[Screenshot of the dialog: one option is un-checked and another is checked.]

SPSS

Correlations

                                                      classicl   jazz
Spearman's rho   classicl   Correlation Coefficient   1.000      .205**
                            Sig. (2-tailed)           .          .000
                            N                         504        497
                 jazz       Correlation Coefficient   .205**     1.000
                            Sig. (2-tailed)           .000       .
                            N                         497        514

**. Correlation is significant at the 0.01 level (2-tailed).

Variables correlate statistically significantly.

Correlation is positive, but weak.

Spearman correlation test for ranked data

1. Two teachers ranked their students:

2. First teacher: A, B, C, D, E, F, G, H, I, J, K, L.

3. Second teacher: B, C, A, D, H, E, F, G, K, I, J, L.

4. Do their rankings correlate?

Statistical hypothesis

H0 : variables do not correlate.

H1 : variables correlate.


SPSS

• First: A, B, C, D, E, F, G, H, I, J, K, L.

• Second: B, C, A, D, H, E, F, G, K, I, J, L.

[Screenshot of the data; one variable is marked "This variable is auxiliary".]

SPSS

Correlations

                                                    MOKYT1   MOKYT2
Spearman's rho   MOKYT1   Correlation Coefficient   1.000    .916**
                          Sig. (2-tailed)           .        .000
                          N                         12       12
                 MOKYT2   Correlation Coefficient   .916**   1.000
                          Sig. (2-tailed)           .000     .
                          N                         12       12

**. Correlation is significant at the .01 level (2-tailed).

Correlation is very strong, significant and positive.
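For data that are already rankings, each student receives two ranks: the position in the first teacher's list and the position in the second teacher's list. A minimal sketch in Python, assuming SciPy is installed; the way the rankings are encoded as strings is a convenience chosen here, not something taken from the slides.

```python
from scipy.stats import spearmanr

# Students listed from best to worst by each teacher.
first = list("ABCDEFGHIJKL")
second = list("BCADHEFGKIJL")

# For every student (in the first teacher's order), the rank given by each teacher.
rank_first = [first.index(s) + 1 for s in first]
rank_second = [second.index(s) + 1 for s in first]

# Spearman's rho is Pearson's correlation computed on these ranks.
rho, p_value = spearmanr(rank_first, rank_second)

print("ranks by first teacher: ", rank_first)
print("ranks by second teacher:", rank_second)
print("Spearman's rho:", round(rho, 3))
print("p-value:", p_value)
```

The coefficient should match the .916 in the SPSS Correlations table, with a p-value below 0.01.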

Kruskal - Wallis test

Kruskal-Wallis test

1. Mann-Whitney test extended to more than 2 samples.

2. Interpretation is the same as for the M-W test.

3. The larger mean rank corresponds to larger scores.

4. Gives no information on which samples differ.

5. Is also called ANOVA for rank data.

Data

1. Two or more independent interval or rank samples.

2. Each rank variable has at least 5 different outcomes.

Statistical hypothesis

H0 : all distributions are the same

H1 : some distributions differ.


Conclusion with p-value

H0 is rejected (some distributions differ statistically significantly), if $p < \alpha$.

H0 is accepted (all distributions are equal), if $p \ge \alpha$.

Here $\alpha$ is the level of significance.

Example

• We investigate respondents with at least 13 years of formal education.

• Do all races like rap music equally?

• Rank variable rap: 1 - like it very much, ..., 5 - hate it.

SPSS

• After: Select Cases -> If -> educ > 13

• Analyze -> Nonparametric Tests -> Legacy Dialogs -> K Independent Samples


SPSS

Ranks

RAP Rap Music, by RACE (Race of Respondent)

           N     Mean Rank
1 white    617   372.20
2 black    65    254.05
3 other    34    309.59
Total      716

Black respondents like rap music best: they have the lowest mean rank, and under the coding lower scores mean greater liking.

SPSS

Test Statistics (a, b)

              RAP Rap Music
Chi-Square    23.311
df            2
Asymp. Sig.   .000   (p-value)

a. Kruskal Wallis Test
b. Grouping Variable: RACE Race of Respondent

The scores statistically significantly depend on the respondent's race.
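The Kruskal-Wallis statistic measures how far the group mean ranks lie from the overall mean rank (N + 1)/2. The sketch below, in plain Python with illustrative names, applies the textbook formula without a tie correction to the group sizes and mean ranks from the Ranks table. It is assumed here, not stated in the slides, that the SPSS value of 23.311 is corrected for ties, which on a 5-point scale are numerous, so the uncorrected value comes out somewhat smaller.

```python
import math

# Group size and mean rank for each race, from the SPSS Ranks table.
groups = {
    "white": (617, 372.20),
    "black": (65, 254.05),
    "other": (34, 309.59),
}

# Total sample size and the overall mean rank.
n_total = sum(n for n, _ in groups.values())
grand_mean_rank = (n_total + 1) / 2

# Uncorrected Kruskal-Wallis H: weighted squared deviations of the mean ranks.
h = 12 / (n_total * (n_total + 1)) * sum(
    n * (mean_rank - grand_mean_rank) ** 2 for n, mean_rank in groups.values()
)

# H is compared with a chi-square distribution with (groups - 1) degrees of freedom.
df = len(groups) - 1

# For df = 2 the chi-square upper tail probability is exactly exp(-h / 2).
p_value = math.exp(-h / 2)

print("N:", n_total, "overall mean rank:", grand_mean_rank)
print("uncorrected H:", round(h, 3), "df:", df)
print("p-value:", p_value)
```

The uncorrected H is about 21.2, and the p-value is far below 0.05, in line with the SPSS conclusion. With case-level data, `scipy.stats.kruskal` could be applied to the three samples directly.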

Friedman test

Friedman test

1. Generalization of Wilcoxon's test for more than 2 samples.

2. For 2 samples, Wilcoxon's test is more powerful.

3. Easy to interpret.

Interpretation of ranks

1. Let us assume that a respondent evaluated the performances of three actors (larger score means better performance): 10 for actor A, 6 for actor B, 8 for actor C.

2. Scores are ranked. Ranks: 3 for A, 1 for B, 2 for C.

Data

1. Two or more dependent interval or rank samples.

2. Each rank variable has at least 5 different outcomes.

Statistical hypothesis

H0 : all distributions are the same

H1 : some distributions differ.


Conclusion with p-value

H0 is rejected (some distributions differ statistically significantly), if $p < \alpha$.

H0 is accepted (all distributions are equal), if $p \ge \alpha$.

Here $\alpha$ is the level of significance.

Example

• We investigate respondents with formal education longer than 15 years.

• Are musicals, classical music and rap music equally popular?

• Rank variable rap: 1 - like it very much, ..., 5 - hate it.

SPSS

• After: Select Cases -> If -> educ > 15

• Analyze -> Nonparametric Tests -> Legacy Dialogs -> K Related Samples


SPSS

Ranks

                               Mean Rank
CLASSICL Classical Music       1.87
MUSICALS Broadway Musicals     2.05
BIGBAND Bigband Music          2.08

Classical music got the lowest scores.

SPSS

Test Statistics (a)

N             343
Chi-Square    14.286
df            2
Asymp. Sig.   .001   (p-value)

a. Friedman Test

Not all styles are equally popular.

Friedman‘s test special case

• Five experts ranked three sorts of beer: A, B and C.

• First: B, C, A (i.e. the best is beer B)

• Second: B, C, A

• Third: A or C, B

• Fourth: A, B, C

• Fifth: B, A, C

• Are all sorts equally popular?

SPSS

[Screenshot of the data: the entered values are ranks, with one variable per sort.]

SPSS

Ranks

     Mean Rank
A    2.10
B    1.60
C    2.30

Most popular is sort B.

SPSS

Test Statistics (a)

N             5
Chi-Square    1.368
df            2
Asymp. Sig.   .504

a. Friedman Test

Differences are statistically insignificant.
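The expert rankings can be entered as ranks (1 = best) and passed to a Friedman test. A minimal sketch in Python, assuming SciPy is installed; the names are illustrative, and the third expert's tie between A and C is encoded here as the average rank 1.5 for both, which is an assumption about how the tie was entered.

```python
from scipy.stats import friedmanchisquare

# Ranks given by experts 1..5 to each sort (1 = best, 3 = worst).
# Expert 3 could not separate A and C, so both get the average rank 1.5.
sort_a = [3, 3, 1.5, 1, 2]
sort_b = [1, 1, 3, 2, 1]
sort_c = [2, 2, 1.5, 3, 3]

# Mean rank of each sort across the five experts.
mean_ranks = {
    "A": sum(sort_a) / len(sort_a),
    "B": sum(sort_b) / len(sort_b),
    "C": sum(sort_c) / len(sort_c),
}

# Friedman test: one argument per sort, one position per expert.
stat, p_value = friedmanchisquare(sort_a, sort_b, sort_c)

print("mean ranks:", mean_ranks)
print("chi-square:", round(stat, 3))
print("p-value:", round(p_value, 3))
```

The mean ranks should reproduce the SPSS Ranks table (2.10, 1.60, 2.30), and the statistic and p-value should match 1.368 and .504, so the differences between the sorts are not statistically significant.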