psyc3010 lecture 13
This week:
(1) PSYC3010 - overview
(2) SECATs
(3) Discussion of exam and distribution of practice exam
(4) A bit on logistic regression
(5) Interconnections between ANOVA and regression
Howell ch 16, pp. 604-617
last week: mixed anova
Exam consult
Please post and answer ?s in the discussion forum (which I will also monitor periodically).
Consult times for me for the exam will be:
- Monday 20 June 4-5pm
- Friday 17 June 8-10am
- Monday 13 June 1-3pm
- Monday 6 June 3-4pm
Or by appointment
(1) what have we "added" in 3010 from 2010?
course overview
PSYC2010: designs involving one factor or one predictor
PSYC3010: designs involving multiple factors, predictors, or categorical variables
PSYC2010
- one-way between-subjects ANOVA
- one-way within-subjects ANOVA
PSYC3010
factorial ANOVA
- between-subjects: 2-way, 3-way
- within-subjects
- mixed
- blocking (and ANCOVA)
PSYC2010
- bivariate correlation and regression
PSYC3010
multiple regression
- standard
- hierarchical
  - as control technique
  - assessing mediation
  - assessing moderation
PSYC3010 learning objectives
1. Generate research designs for questions involving multiple IVs / predictors, based on methodological and practical considerations.
2. Identify the statistical analyses that are appropriate for research designs involving multiple IVs / predictors.
3. Identify the key terms and conceptual principles relevant to statistical techniques involving multiple IVs / predictors.
4. Plan and execute (omnibus and follow-up) tests in statistical analyses involving multiple IVs / predictors.
5. Interpret results from these statistical analyses, identifying the implications of the results for hypotheses and research questions.
6. Report and discuss the results of these analyses, following standard conventions in Psychology.
7. Use your statistics knowledge to develop and enrich your work as a psychologist.
The purpose of statistics
To understand the shape of the data.
To understand meaningful questions and assess meaningless ranting:
- "Women mature faster than men"; "Men are stronger"
  - What's the standard deviation? Is the difference reliable? Is it even going to be significant in the population?
  - What's the effect size? What portion of the variance in the data does gender account for?
  - What are other factors associated with gender to control for (e.g., via ANCOVA)? [ANOVA is not causation!]
  - What other factors might moderate this effect? (interactions!)
The purpose of statisticsThe purpose of statisticsMeaningful ?s and meaningless rantingMeaningful ?s and meaningless ranting–– The wealthier you are, the happier you are!The wealthier you are, the happier you are!
•• Is that relationship reliable, is it significant in the Is that relationship reliable, is it significant in the population?population?
•• What is the effect size?What is the effect size?•• What other factors might need to be controlled for? What other factors might need to be controlled for?
[Correlation is not causation!][Correlation is not causation!]•• What other factors might moderate this effect? What other factors might moderate this effect?
(interactions!)(interactions!)•• Is this really a linear effect ?Is this really a linear effect ?
To read psych articles, need to know statistics To read psych articles, need to know statistics ––now you can read most & understand themnow you can read most & understand themmore broadly, it’s difficult to understand human more broadly, it’s difficult to understand human variability meaningfully without understanding variability meaningfully without understanding what variability and differences are and are not.what variability and differences are and are not.
when do you use which analysis?
Need to consider the type of variables: independent (predictor) and dependent (criterion).

Predictors                  Criterion     Method
Categorical                 Continuous    ANOVA; MR
Categorical & Continuous    Continuous    MR
Continuous                  Continuous    MR
Continuous                  Categorical   Logistic Regression
Categorical                 Categorical   Log-linear Analysis
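The decision table above is simple enough to encode directly. A toy lookup in Python (the dictionary keys and the helper name are my own, purely illustrative):

```python
# Map (predictor type, criterion type) to the analysis from the table above.
# Labels paraphrase the slide; this is just a lookup, not a statistics tool.
ANALYSIS_TABLE = {
    ("categorical", "continuous"): "ANOVA; MR",
    ("categorical & continuous", "continuous"): "MR",
    ("continuous", "continuous"): "MR",
    ("continuous", "categorical"): "Logistic Regression",
    ("categorical", "categorical"): "Log-linear Analysis",
}

def choose_analysis(predictors, criterion):
    """Return the method the decision table suggests."""
    return ANALYSIS_TABLE[(predictors, criterion)]
```

For example, continuous predictors with a dichotomous criterion point to logistic regression, which is exactly the case taken up in section (4).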
The multivariate universe:
Before 3010:
- Single explanations
- Barely grasp difference between correlations and group differences
- Tendency to rely too much on p-values
After 3010:
- Multiple explanations
- Explanations that interact, or are inter-related
- Variables considered jointly, so you can see interactions and inter-relationships, and explain more than considering each alone
- Strong understanding of correlations and group differences
- Understanding the key idea of effect sizes
(2) SECATs

Knowing artists, you think you know all about Prima Donnas: boy!, just wait till you hear scientists get up and sing.
- W. H. Auden
(3) exam review: content, structure, & study tips
content of exam
primarily assesses conceptual material from lectures:
- moving between research questions, design, hypotheses, and analytic choices
- partitioning variance (systematic and error)
- structural models (Xij = ...)
- steps in analysis (omnibus and follow-up tests)
- key terms and principles in analysis
- interpretation of statistics
  - description of results (e.g., F and p values provided)
  - no SPSS output
- calculating degrees of freedom
structure of exam
50 multiple-choice questions
- 1 mark each
- content spread across Lectures 1-12
10 mins perusal + 2 hours working time
formula sheet included
- does not include df calculations
- posted on Blackboard now
study resources on Blackboard
"Lecture Materials"
- recordings from all lectures
- slides from all lectures
- review notes for all lectures
  - key concepts and principles that you should know from each lecture
"Practice Materials"
- practice exam questions
- tips for answering multiple-choice questions on a closed-book test
how to study for the exam I
revise lecture content (strategically)
- go over lecture notes and listen to lecture recordings
- use the Review Notes to work out which principles and concepts you must understand and memorise
  - dot-points in Review Notes are listed in the same order as the concepts and principles in the lecture
- tutorial notes / textbook readings may clarify things, but if you understand everything in the lecture slides, don't worry about the tutorial / textbook content
how to study for the exam II
be prepared for the exam questions
- the exam questions will ask you to apply your knowledge from the lectures
- it is very important to complete the practice questions
  - PDFs for ANOVA and MR and ANCOVA
  - PDF for practice exam
  - practice quizzes reopened online; why not keep going until you get them all right?
- it may also be useful to look at the tips for multiple-choice questions on a closed-book exam
important logistical details
what you are allowed to bring to the exam:
- non-programmable calculator (be aware of need to have approved model / sticker)
- unmarked non-electronic dictionary (you know, a book)
check with UQ Central Examinations for the list of things you are not allowed to bring in
double-check the exam date / time / venue
arrive at least 15 minutes before the exam
be sure to have your ID card at the exam
Practice exam
More practice MC questions.
Answers may be discussed in the PSYC3010 forum.
You are also welcome to attend Winnifred's consult (times listed on an earlier slide).
(4) A bit on logistic regression
[Figure: scatterplot of Life Satisfaction (0-10) against Number of Social Events Attended (0-10), with a fitted straight line]

Multiple regression = continuous IVs and DVs, each normally distributed. We fit the data with a linear model: the straight line minimising the discrepancy between Y and Y hat.
Logistic regression = continuous IVs and a categorical (0, 1) DV. Obviously (a) Y is not normally distributed and (b) a straight line fits these data poorly.

[Figure: Mortality within 5 years (1 = dead) plotted against Number of Social Events Attended, with a poorly fitting straight line]
Accordingly we fit the data with a logistic model: the S-shaped curve (a.k.a. sigmoidal curve) that best predicts whether an observation will be in one group (0) versus another (1).

[Figure: the same mortality data with a fitted S-shaped logistic curve]
Conceptual similarities: interpreting logistic R2 and R2 change
In SPSS for logistic regression, you get R2 estimates labelled Cox & Snell R2 and Nagelkerke R2.
- These are two ways of understanding the "variance" in dichotomous (0, 1) DVs.
- No convention exists regarding which to report. C&S is the more conservative one and Nagelkerke the more liberal; at the moment Nagelkerke R2 is more common.
Hierarchical logistic regression can be performed.
- SPSS will output C&S and N R2 for each model, but you need to subtract the earlier R2 from the later one to get the R2 change per block.
R2 and R2 change are tested with chi-square (χ2) tests, not F-tests, but the structure of the write-up is identical. Both χ2 for the model and for the block are reported; R2 change must be calculated by hand from the output.
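Both pseudo-R2s can be computed directly from the -2 log-likelihoods that SPSS prints in the Model Summary. A sketch of the standard formulas (the function name is my own, and the example values are round numbers rather than the output shown later):

```python
import math

def pseudo_r2(neg2ll_null, neg2ll_model, n):
    """Cox & Snell and Nagelkerke pseudo-R^2 from the -2 log-likelihoods
    of an intercept-only (null) model and a fitted logistic model."""
    model_chi2 = neg2ll_null - neg2ll_model           # the omnibus chi-square
    cox_snell = 1.0 - math.exp(-model_chi2 / n)
    max_cox_snell = 1.0 - math.exp(-neg2ll_null / n)  # C&S upper bound (< 1)
    nagelkerke = cox_snell / max_cox_snell            # rescaled toward a 0-1 range
    return cox_snell, nagelkerke

cs, nk = pseudo_r2(neg2ll_null=270.0, neg2ll_model=255.0, n=200)
```

Nagelkerke is always at least as large as Cox & Snell, which is why it reads as the more "liberal" estimate.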
E.g. output and write-up
Logistic Regression
Block 0: Beginning Block

Variables in the Equation
                     B     S.E.   Wald   df  Sig.  Exp(B)
Step 0  Constant   -.235   .143  2.673    1  .102   .791

Block 1: Method = Enter

Omnibus Tests of Model Coefficients
                Chi-square  df  Sig.
Step 1  Step        .856     2  .652
        Block       .856     2  .652
        Model       .856     2  .652

Model Summary
        -2 Log        Cox & Snell   Nagelkerke
Step    likelihood    R Square      R Square
1       269.553a        .004          .006
a. Estimation terminated at iteration number 3 because parameter estimates changed by less than .001.

Variables in the Equation
                        B     S.E.   Wald   df  Sig.  Exp(B)
Step 1a  c_age        -.026   .034   .613    1  .434   .974
         ec_women(1)  -.171   .313   .298    1  .585   .843
         Constant     -.118   .260   .204    1  .651   .889
a. Variable(s) entered on step 1: c_age, ec_women.
"A hierarchical logistic regression was conducted predicting whether or not participants took political action from demographic factors (Block 1) and attitude strength (Block 2). Table 1 describes the means, standard deviations, and intercorrelations. The entry of the demographics did not increase the variance accounted for, Nagelkerke R2 = .01, χ2 (2) = 0.86, p = .652 [snip]"
Block 2: Method = Enter

Omnibus Tests of Model Coefficients
                Chi-square  df  Sig.
Step 1  Step      14.475     1  .000
        Block     14.475     1  .000
        Model     15.331     3  .002

Model Summary
        -2 Log        Cox & Snell   Nagelkerke
Step    likelihood    R Square      R Square
1       255.078a        .075          .100
a. Estimation terminated at iteration number 4 because parameter estimates changed by less than .001.

Variables in the Equation
                        B      S.E.    Wald   df  Sig.  Exp(B)
Step 1a  c_age        -.045   .035    1.622    1  .203   .956
         ec_women(1)  -.054   .327     .028    1  .868   .947
         atstr_sc      .404   .110   13.439    1  .000  1.498
         Constant    -1.073   .379    8.015    1  .005   .342
a. Variable(s) entered on step 1: atstr_sc.
E.g. output and write-up

"However, the entry of attitude strength in Block 2 significantly increased the variance accounted for, Nagelkerke R2 change = .09, χ2 (1) = 14.48, p < .001. [snip] The final model accounted for only 10% of the variance in action, however, χ2 (3) = 15.33, p = .002."

Note: the difference between the -2LL in this model (255.078) and the first model (269.553) equals the chi-square value (14.475). Some reviewers prefer reporting -2LL over R2.
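That note can be checked with a few lines of Python; for df = 1 the chi-square p-value follows from the standard error function, so no stats library is needed:

```python
import math

def chi2_sf_df1(x):
    """P(chi-square with df = 1 exceeds x), via erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))

# -2 log-likelihood values from the two Model Summary tables
neg2ll_block1 = 269.553
neg2ll_block2 = 255.078
chi2_change = neg2ll_block1 - neg2ll_block2   # 14.475, the Block chi-square
p_value = chi2_sf_df1(chi2_change)            # p < .001, as reported
```

(The reported Block test has df = 1 because only one predictor, attitude strength, entered at Block 2.)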
Return to the data

[Figure: the mortality-by-social-events data again, with the fitted logistic curve]
Interpreting logistic coefficients
Error = still deviations from the (S-shaped) line, but now involves misclassification (e.g., predicted dead when in fact alive). Instead of being normally distributed, errors also trend towards a 0, 1 distribution.
Instead of describing and reporting unstandardised coefficients, report Exp(B). This coefficient is tested with a Wald test, not a t-test, but the structure of the write-up is the same.
Exp(B) coefficients don't describe the change in the DV for a 1-unit change in the IV; they describe the change in the odds of being (1) compared to (0) for every unit increase in the IV:
- Exp(B) = 1.00: no change in the odds of being dead within 5 years for every 1 more social event
- Exp(B) = 2.50: odds of being dead within 5 years are multiplied by 2.5 (an increase of 150%) for every 1 more social event attended
- Exp(B) = .80: odds of death within 5 years are multiplied by .8 (much more useful to say they decrease by 20% [1 - .8 = .2]) for every 1 more social event attended
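The arithmetic behind those bullet points, sketched in Python (the helper names are my own):

```python
import math

def odds_ratio(b):
    """Exp(B): the multiplicative change in the odds of being in
    group 1 for every 1-unit increase in the predictor."""
    return math.exp(b)

def percent_change_in_odds(b):
    """The same quantity expressed as a percentage change in the odds."""
    return (math.exp(b) - 1.0) * 100.0
```

So a B of 0 gives Exp(B) = 1.00 (no change), a positive B gives Exp(B) above 1 (odds increase), and a negative B gives Exp(B) below 1 (odds decrease).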
E.g. output and write-up

(Block 0 and Block 1 output as shown earlier.)
"A hierarchical logistic regression was conducted predicting whether or not participants took political action from demographic factors (Block 1) and attitude strength (Block 2). Table 1 describes the means, standard deviations, and intercorrelations. The entry of the demographics did not increase the variance accounted for, Nagelkerke R2 = .01, χ2 (2) = 0.86, p = .652, and inspection of the coefficients revealed that neither age nor gender was significantly linked to action, Wald tests < .30, ps > .584."
(Block 2 output as shown earlier.)
E.g. output and write-up

"However, the entry of attitude strength in Block 2 significantly increased the variance accounted for, Nagelkerke R2 change = .09, χ2 (1) = 14.48, p < .001. Specifically, on a scale from 0 to 5, every additional unit of attitude strength multiplied the odds of political action by 1.5 (a 50% increase), Exp(B) = 1.50, Wald = 13.44, p < .001. The final model accounted for only 10% of the variance in action, however, χ2 (3) = 15.33, p = .002."
Logistic regression is seen quite often, e.g.:
- clinical psychology (what factors predict becoming schizophrenic, recurrence of depression?)
- social (predict attending a rally, getting divorced?)
- org psych (predict quitting the firm / being promoted?)
Occasionally other statistics are reported, but the above would serve in a journal article at the moment.
Also can have multiple categories on the DV: use multinomial logistic regression.
So worth knowing.
Field spells it all out rather nicely and goes through SPSS.
Covered in Howell section 15.14 (5th & 6th ed).
But not assessed on the exam!
Also note: log-linear analysis is in Howell ch 17, but we won't get around to covering this (as psychs you will come across logistic regression far more frequently).
(5) Interconnections between ANOVA and regression
[Diagram:
ANOVA & t-tests (between/within)  ->  Factorial ANOVA (between/within & mixed)
Bivariate (simple) correlation    ->  Multiple Regression
...multivariate methods...]
experimental vs. correlational research
this is what many will tell you about the differences between anova vs correlational designs:
anova designs
- the only research strategy in which causation can be inferred: the factor can be said to "cause" changes in the DV
- this is because the IV is manipulated
correlational research
- cannot be used to infer causality
- this is because variables are not manipulated, just measured
experimental vs. correlational research
this is misleading because:
it confuses research methodology (PSYC3042) with statistical methodology (PSYC3010), and it assumes that the benefits of experimental research transfer automatically to anova
- the differences between experimental and correlational research involve random assignment to levels of the IV vs observation of natural / measured levels of the IV
- these have NOTHING to do with the differences between anova and regression, which involve partitioning variance between factors and within, versus between a regression line and observations
- ANOVA can be carried out statistically with regression analyses; t-tests can be carried out with correlations
- all of these statistical techniques are generalisations of one underlying model, the general linear model (GLM)
The General Linear Model
What is it?

X_ijk = μ + α_j + β_k + αβ_jk + e_ijk
X_ij = μ + α_j + π_i + e_ij
Y = b1X + b2Z + b3XZ + c + e
The General Linear Model
What is it?
a system of linear equations which can be used to model data; quite similar to the T1000:
- powerful!
- versatile!
- can execute a range of operations!
- can take on a variety of appearances!
- provides the basis for just about every parametric statistical test we know (OK, weak link there)
Read Cronbach, 1968 for more
magic tricks!
it is fairly easy to show that:
1. a t-test is a correlation
2. factorial anova is a standard regression problem
3. ancova is a hierarchical regression problem
4. interactions in anova are identical to those in MMR
correlation and the t-test
you may have heard of a point-biserial correlation (Howell pp. 297-305)
- this is a special case of correlation where one of the variables is dichotomous (e.g., gender) and the other is continuous (e.g., height)
- the other name for a point-biserial correlation is an independent samples t-test
Heights of males and females: this is how we are used to seeing the data laid out when we are doing hand calculations for a t-test:

Females   Males
150       165
160       170
165       180
155       175

but we know that SPSS would prefer that we lay the data out like this (hmmm, looks familiar):

Gender   Height
1        150
1        160
1        165
1        155
2        165
2        170
2        180
2        175
so let's run our t-test

Independent Samples Test (HEIGHT)
                              Levene's Test         t-test for Equality of Means
                              F       Sig.       t       df      Sig. (2-tailed)
Equal variances assumed      .000    1.000    -3.286      6          .017
Equal variances not assumed                   -3.286    6.000        .017

t(6) = 3.29, p = .017
now run as a correlation (just as if we had two continuous variables)

Correlations
                               GENDER    HEIGHT
GENDER  Pearson Correlation      1        .802*
        Sig. (2-tailed)          .         .017
        N                        8         8
HEIGHT  Pearson Correlation     .802*      1
        Sig. (2-tailed)         .017       .
        N                        8         8
*. Correlation is significant at the 0.05 level (2-tailed).

r = .802, p = .017, r2 = .643
p value is the same as in the t-test
re-run as an anova (to get estimates of effect size)

Tests of Between-Subjects Effects
Dependent Variable: HEIGHT
Source             Type III SS    df   Mean Square       F       Sig.   Partial Eta Squared
Corrected Model      450.000a      1      450.000      10.800    .017        .643
Intercept         217800.000      1   217800.000    5227.200    .000        .999
GENDER               450.000      1      450.000      10.800    .017        .643
Error                250.000      6       41.667
Total             218500.000      8
Corrected Total      700.000      7
a. R Squared = .643 (Adjusted R Squared = .583)

F(1,6) = 10.8, p = .017, η2 = .643
p value is again the same
partial η2 = r2 (from previous slide)
F (i.e., 10.8) = t2 (i.e., 3.29²)
now run as a regression (just for the sake of comparison)

Model Summary
Model     R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .802a     .643           .583                6.45497
a. Predictors: (Constant), GENDER

ANOVAb
Model 1        Sum of Squares   df   Mean Square      F      Sig.
Regression        450.000        1     450.000      10.800   .017a
Residual          250.000        6      41.667
Total             700.000        7
a. Predictors: (Constant), GENDER
b. Dependent Variable: HEIGHT

R2 = .643, F(1,6) = 10.8, p = .017
R2 = partial η2 = r2; F and p are the same
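The whole equivalence can be verified from the raw heights with a few lines of Python; this is just the textbook formulas, no stats package needed:

```python
import math

# Height data from the slides: females coded 1, males coded 2.
gender = [1, 1, 1, 1, 2, 2, 2, 2]
height = [150, 160, 165, 155, 165, 170, 180, 175]
n = len(height)

# Point-biserial correlation = plain Pearson r on the coded variable.
mx, my = sum(gender) / n, sum(height) / n
sxy = sum((x - mx) * (y - my) for x, y in zip(gender, height))
sxx = sum((x - mx) ** 2 for x in gender)
syy = sum((y - my) ** 2 for y in height)
r = sxy / math.sqrt(sxx * syy)           # .802

# The equivalences shown across the last few slides:
df = n - 2
t = r * math.sqrt(df / (1 - r ** 2))     # 3.286, the independent-samples t
F = t ** 2                               # 10.8, the one-way ANOVA F
r_squared = r ** 2                       # .643 = R^2 = partial eta^2
```

Same data, same numbers, four different output formats.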
an additional slide to consolidate: structural models
First, to help interpretation, re-run the MR using dummy coding (female = 1, male = 0). Can use the structural model to calc means:

From the t-test:
Group Statistics (height)
gender   N     Mean      Std. Deviation   Std. Error Mean
male     4   172.5000       6.45497           3.22749
female   4   157.5000       6.45497           3.22749

From the regression:
Coefficientsa
Model 1        Unstandardized B   Std. Error   Standardized Beta      t      Sig.
(Constant)        172.500           3.227                           53.447   .000
gender            -15.000           4.564           -.802           -3.286   .017
a. Dependent Variable: height

Y hat = a + B1X1
So, for men (coded as zero), Y hat = 172.50 - (15.00*0) = 172.50
And for women (coded as one), Y hat = 172.50 - (15.00*1) = 157.50
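Those two predictions are just the structural model evaluated at the two dummy codes; a trivial sketch:

```python
# Coefficients from the regression output above: intercept a = 172.50,
# slope B1 = -15.00, with dummy coding female = 1, male = 0.
a, b1 = 172.50, -15.00

def y_hat(dummy_code):
    """Predicted height from the dummy-coded gender regression."""
    return a + b1 * dummy_code

male_mean = y_hat(0)     # 172.50, the male group mean
female_mean = y_hat(1)   # 157.50, the female group mean
```

With dummy coding, the intercept is the mean of the group coded 0, and the slope is the difference between the two group means.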
explanation
a t-test, or an anova between two groups, is just a special case of correlation,
- which in turn is just a special case of regression,
- which is a representation of the General Linear Model
SPSS did the same* thing in all four analyses; it just presented the output in different ways
*(strictly speaking, bivariate correlations and t-tests are not executions of the GLM; they are calculated using 'shortcuts' that achieve the same basic results)
hierarchical regression and ancova
in ancova our goal was to remove the effects of a covariate before examining our treatment effect
in hierarchical regression, the idea was to examine the contribution of a set of variables at step 2 after accounting for prediction at step 1
- as it turns out, both are basically doing the same thing!
let’s go back to our height data – and include age as a covariate:
the data are laid out as we would for an ancova or a hierarchical regression
  Sex   Age   Height
  1     16    150
  1     18    160
  1     17    165
  1     17    155
  2     16    165
  2     17    170
  2     18    180
  2     17    175
Tests of Between-Subjects Effects
Dependent Variable: HEIGHT

  Source            Type III SS   df   Mean Square   F        Sig.   Partial Eta Squared
  Corrected Model   606.250(a)    2    303.125       16.167   .007   .866
  Intercept         47.690        1    47.690        2.543    .172   .337
  AGE               156.250       1    156.250       8.333    .034   .625
  GENDER            450.000       1    450.000       24.000   .004   .828
  Error             93.750        5    18.750
  Total             218500.000    8
  Corrected Total   700.000       7

a. R Squared = .866 (Adjusted R Squared = .812)
first run as an ancova
for gender, F(1,5) = 24.00, p = .004
this is the effect after controlling for age
Model Summary

  Model   R       R Square   R Square Change   F Change   df1   df2   Sig. F Change
  1       .472a   .223       .223              1.724      1     6     .237
  2       .931b   .866       .643              24.000     1     5     .004

a. Predictors: (Constant), AGE
b. Predictors: (Constant), AGE, GENDER
now run as a hierarchical regression
Fch(1,5) = 24.00, p = .004
this is the effect after controlling for age
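The F-change can be reproduced by hand from the two R² values in the model summary – a sketch; the small discrepancy from 24.00 is just rounding in the reported R²s:

```python
# F-change for adding gender at step 2, from the R-squared values on the slide
r2_step1 = 0.223           # step 1: AGE only
r2_step2 = 0.866           # step 2: AGE + GENDER
n, k_full = 8, 2           # 8 cases, 2 predictors in the full model

df1 = 1                    # one predictor added at step 2
df2 = n - k_full - 1       # = 5
f_change = ((r2_step2 - r2_step1) / df1) / ((1 - r2_step2) / df2)
print(round(f_change, 2))  # 23.99 -- SPSS's 24.00, within rounding of the R-squares
```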
Minor diffs in output
there are some minor differences in presentation:
– in our ancova we are given η²p = .828, but in regression the R²ch was .643
– η²p actually corresponds to the squared partial correlation for gender: .91² = .828
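A quick check of that correspondence, using the SS values from the ANCOVA table and the partial correlation from the coefficients table:

```python
# Partial eta-squared for gender from the ANCOVA table on the slide...
ss_gender, ss_error = 450.0, 93.75
eta2_p = ss_gender / (ss_gender + ss_error)

# ...and the squared partial correlation for gender from the regression output
partial_r = 0.910

print(round(eta2_p, 3))          # 0.828
print(round(partial_r ** 2, 3))  # 0.828 -- the same quantity
```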
Coefficients

  Model   Predictor    Standardized Beta   t       Sig.   Partial   Part
  1       (Constant)                       .725    .496
          AGE          .472                1.313   .237   .472      .472
  2       (Constant)                       .976    .374
          AGE          .472                2.887   .034   .791      .472
          GENDER       .802                4.899   .004   .910      .802

a. Dependent Variable: HEIGHT
Minor diffs in output
– in our ancova the test for age is given as F(1,5) = 8.33, p = .034
– this actually corresponds to the test of the coefficient for age in the full model at step 2:
  • remember t² = F (2.887² = 8.33)
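A one-line check of the t² = F identity, using the values from the two output tables:

```python
# The age test in the ANCOVA is the squared t for age at step 2 of the regression
t_age = 2.887                # t for AGE in the full (step 2) model
print(round(t_age ** 2, 2))  # 8.33 -- the ANCOVA's F(1,5) for age
```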
explanation
ancova and hierarchical regression achieve the same broad purpose
some minor differences in the output simply reflect defaults which have been programmed into SPSS
– e.g., as effect sizes have only recently become emphasised for anova, these don’t line up as you would expect with the ones for regression, but the link is in there somewhere!
interactions – MMR vs anova
testing interactions in anova and MMR look incredibly different
– this is just because they have different histories
– essentially they are doing the same thing
2 categorical variables
going back to our height data, let’s say we wanted to examine the interaction between maternal diet and gender in the prediction of height
– factor A is gender (M/F)
– factor B is maternal diet (healthy, unhealthy)
(N = 16)
Tests of Between-Subjects Effects
Dependent Variable: HEIGHT

  Source            Type III SS   df   Mean Square   F           Sig.
  Corrected Model   950.000(a)    3    316.667       8.444       .003
  Intercept         435600.000    1    435600.000    11616.000   .000
  GENDER            625.000       1    625.000       16.667      .002
  DIET              100.000       1    100.000       2.667       .128
  GENDER * DIET     225.000       1    225.000       6.000       .031
  Error             450.000       12   37.500
  Total             437000.000    16
  Corrected Total   1400.000      15

a. R Squared = .679 (Adjusted R Squared = .598)
anova – the way we know
F(1,12) = 6.00, p = .031
MMR
in our MMR lecture we talked briefly about categorical variables in MMR – they can get a bit tricky
but with dichotomous variables it is dead easy
– enter additive effects (gender and diet) at step 1
– interaction term (gender*diet) at step 2
MMR
Model Summary

  Model   R       R Square   Adjusted R Square   Std. Error of the Estimate   R Square Change   F Change   df1   df2   Sig. F Change
  1       .720a   .518       .444                7.20577                      .518              6.981      2     13    .009
  2       .824b   .679       .598                6.12372                      .161              6.000      1     12    .031

a. Predictors: (Constant), DIET, GENDER
b. Predictors: (Constant), DIET, GENDER, INT
Fch(1,12) = 6.00, p = .031
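A sketch of the MMR setup and the F-change arithmetic. The 0/1 coding shown is a hypothetical layout (the slides don’t list the 16 raw scores); the R² values come from the model summary:

```python
import numpy as np

# Hypothetical 0/1 layout for the MMR run on 16 cases
gender = np.repeat([0, 1], 8)              # 0 = male, 1 = female
diet = np.tile(np.repeat([0, 1], 4), 2)    # 0 = healthy, 1 = unhealthy
interaction = gender * diet                # product term entered at step 2

# F-change for the interaction, from the R-squared values in the model summary
r2_step1, r2_step2, df2 = 0.518, 0.679, 12
f_change = (r2_step2 - r2_step1) / ((1 - r2_step2) / df2)

print(int(interaction.sum()))  # 4 -- only one cell (female, unhealthy) carries 1s
print(round(f_change, 2))      # 6.02 -- SPSS's 6.00, within rounding of the R-squares
```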
implications
the GLM has been behind the scenes for just about all of the statistical methods examined in PSYC3010
we stick to a lot of these conventions about when to use ANOVA instead of regression for practical reasons
by understanding the common links through all these analyses we can be less rigid in our use of these tools
here are some of the comparisons we can make
hypothesis testing
in anova we test the hypothesis that our manipulations have had a significant effect on our DV
H0: μ1 = μ2 = μ3
– the null hypothesis – no differences among treatment means
H1: the null hypothesis is false
– the alternative hypothesis – there is at least one difference among treatment means
in regression we test the hypothesis that our predictors are accounting for a significant amount of variance in our criterion
H0: the relationship between the criterion and the set of predictors is zero
H1: the relationship between the criterion and the set of predictors is not zero
variance partitioning
in anova we want to partition the total variance out into effects and error terms
– main effects and interactions compared to error
– the goal is to attribute a significant and substantial proportion of variance in our DV to our effects
in regression we want to model our data by finding the line/plane of best fit, i.e., the one that minimises errors of prediction
– the model can then be described in terms of additive effects and interactions, which are compared to error
– the goal is to explain as significant and substantial a proportion of variance in our criterion as possible
effect size
in anova we can quantify the amount of the total variance which each effect accounts for
– eta-squared (sample estimate)
– omega-squared (population estimate)
in regression we can quantify the amount of variance that our model accounts for
– R² (sample estimate)
– R² adjusted (population estimate)
– sr² (importance of individual predictor)
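These estimates are easy to compute by hand – e.g., for the gender × diet interaction, using the SS values from the earlier 2×2 ANOVA table:

```python
# Effect sizes for the gender x diet interaction, from the slide's ANOVA table
ss_effect, ss_total, ms_error, df_effect = 225.0, 1400.0, 37.5, 1

eta2 = ss_effect / ss_total                                          # sample estimate
omega2 = (ss_effect - df_effect * ms_error) / (ss_total + ms_error)  # population estimate

print(round(eta2, 3))    # 0.161 -- matches the R-square change in the MMR run
print(round(omega2, 3))  # 0.13  -- smaller, as population estimates always are
```

Note that eta-squared for the interaction equals the R² change when the product term is added in MMR, which is the GLM link again.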
complex relationships
in anova we can test for 2-way or 3-way interactions (and beyond!)
– the effect of factor A on the DV changes over levels of factor B
– follow these up with simple effects – i.e., examine the effect of A on the DV at each level of B
in regression we can test for 2-way or 3-way interactions (and beyond!)
– the relationship between X and Y varies over values of Z
– follow these up with simple slopes – i.e., examine the relationship between X and Y at high and low conditional values of Z
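A minimal sketch of simple slopes, using hypothetical noise-free data with a built-in interaction (all names and values here are illustrative, not from the slides):

```python
import numpy as np

# Hypothetical data constructed with a known X*Z interaction, to show that a
# simple slope is just b1 + b3*z0 at a chosen conditional value z0 of Z
x, z = np.meshgrid(np.arange(-2.0, 3.0), np.arange(-2.0, 3.0))
x, z = x.ravel(), z.ravel()
y = 2 + 0.5 * x + 0.3 * z + 0.8 * x * z   # the true X slope varies with Z

# Fit y = b0 + b1*x + b2*z + b3*(x*z)
X = np.column_stack([np.ones(x.size), x, z, x * z])
b0, b1, b2, b3 = np.linalg.lstsq(X, y, rcond=None)[0]

print(round(b1 + b3 * -1.0, 2))  # -0.3 -> slope of X on Y at low Z (z0 = -1)
print(round(b1 + b3 * +1.0, 2))  # 1.3  -> slope of X on Y at high Z (z0 = +1)
```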
increasing power
in anova we can employ a number of statistical and methodological techniques:
– blocking on a concomitant factor
– remove individual differences (i.e., use a within-subjects design)
– include a covariate (i.e., use ancova)
in regression we also have some similar techniques at our disposal:
– partial the effect of another variable out first (i.e., use hierarchical regression – similar to ancova)
– improve measurement (e.g., measure subjects with the most reliable measures – i.e., higher alpha)
The multivariate universe:
Before 3010:
– Single explanations
– Barely grasp the difference between correlations and group differences
– Tendency to rely too much on p-values
After 3010:
– Multiple explanations
– Explanations that interact, or are inter-related
– Variables considered jointly, so you can see interactions and inter-relationships explain more than considering each alone
– Strong understanding of correlations and group differences
– Understanding the key idea of effect sizes
In the tutes: No tutes!
In future:
Consult times for me for the exam will be:
Monday 20 June 4–5pm
Friday 17 June 8–10am
Monday 13 June 1–3pm
Monday 6 June 3–4pm
Or by appointment
Every effort will be made to post the A2 marks online by Friday 18 June, 5pm, although this cannot be guaranteed
Assignment feedback sheets can be picked up from Winnifred by appointment
Thank you!
Good luck on the exam