
Multilevel modelling short course Mark Tranmer, CCSR.

Date post: 28-Mar-2015

Multilevel modelling short course

Mark Tranmer, CCSR

What is multilevel analysis?

• Many populations have a group structure of some kind: hierarchical or non-hierarchical.

• For example, pupils can be grouped into schools.

• Individuals can be grouped into areas.

• Pupils can be grouped by school, and by neighbourhood.

• Suppose we wish to assess area variations in income, possibly with respect to other factors.

What is multilevel analysis?

• If we have district level data we can estimate a district level relationship.

• E.g. average income and average age in each district

• If we have individual level data we can estimate an individual level relationship

• E.g. we can relate a person’s income to a person’s age.

What is multilevel analysis?

• But how do we assess the relationships at the district level and the individual level at the same time?

• We can do this with a multilevel model.

• We can fit this kind of model with specialist software such as MLwiN, which we will use today.

The ecological fallacy

• We could assume that an equation we estimate at the district level also holds at the individual level; that is, we could make a cross-level inference.

• But this is generally not sensible: individuals vary within each district with respect to the variables we wish to relate.

• Hence we could well make invalid inferences about the relationship at the individual level.

• This phenomenon is referred to as ‘the ecological fallacy’.
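The ecological fallacy can be illustrated with a small sketch. The data below are invented purely for illustration (three hypothetical districts with three people each); they are built so that the relationship between district means is positive while the relationship between individuals within each district is negative.

```python
# Toy data (hypothetical): three districts, three people each.
# Each pair is (x, y), e.g. an age-like predictor and an income-like response.
districts = {
    "A": [(-1, 2), (0, 0), (1, -2)],
    "B": [(9, 12), (10, 10), (11, 8)],
    "C": [(19, 22), (20, 20), (21, 18)],
}


def mean(values):
    return sum(values) / len(values)


def ols_slope(points):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    mx = mean([x for x, _ in points])
    my = mean([y for _, y in points])
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    return sxy / sxx


# District level: regress the district mean of y on the district mean of x.
district_means = [
    (mean([x for x, _ in people]), mean([y for _, y in people]))
    for people in districts.values()
]
district_slope = ols_slope(district_means)

# Individual level within districts: centre each person on their own
# district means, then pool the centred points across districts.
within_points = []
for people in districts.values():
    mx = mean([x for x, _ in people])
    my = mean([y for _, y in people])
    within_points.extend((x - mx, y - my) for x, y in people)
within_slope = ols_slope(within_points)

print("district-level slope:", district_slope)
print("within-district individual slope:", within_slope)
```

In this constructed example the two slopes have opposite signs, so reading the district level equation as a statement about individuals would be misleading.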

Problems of ignoring population structure

• If we carry out the analysis at the individual level, we do not recognise that ‘similar’ individuals tend to live within the same small sub-areas of our population.

• That is, ‘clustering’ occurs.

• Ignoring this clustering may lead to biased estimates of summary statistics, especially variances, standard deviations and standard errors.

• Hence we might falsely attribute statistical significance (or non-significance) to results if we ignore the clustering.

Examples of multilevel relationships

Some substantive multilevel examples

• Schools. Variations in exam performance.

 

Level 3: school

Level 2: class

Level 1: pupils

Variations in exam score. ‘School effectiveness’

Some substantive multilevel examples

• Areas: Variations in health

Level 3: Counties

Level 2: Districts

Level 1: People

• People: Dental data

Level 2: People’s mouths

Level 1: Teeth

Some substantive multilevel examples

• Time as a level.

Level 2: Person

Level 1: Occasion

• Multivariate.

Level 2: Pupil

Level 1: Subject of exam score

Terminology

• Nesting. Level k-1 units are contained in level k units. E.g. classes at level 2 nested in schools at level 3: classes are the level 2 units, schools are the level 3 units.

• Cross-classification. Higher level units that are not nested within one another, e.g. school and neighbourhood at level 2, pupil at level 1.

Continuous and Binary Response variables

• For a continuous response we use a multilevel model that is an extension of the standard multiple regression model – as we will see this morning.

• For a binary response we use a multilevel model that is an extension of the logistic regression model – as we will see this afternoon.

Data requirements

• What are the data requirements for multilevel modelling?

• The standard requirement is a dataset that includes indicators of the group to which each individual unit belongs.

• For example information for a sample of pupils that includes an indicator of the school that they attend.

• Another example is a sample of individuals that includes an indicator of the area in which they live.
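As a rough illustration of this data layout, the sketch below holds one record per pupil, each carrying a school identifier. All names and values are invented for the example and do not come from the course data.

```python
from collections import Counter

# Hypothetical records: one row per pupil (level 1), each carrying the
# identifier of the school attended (level 2). All values are invented.
pupils = [
    {"pupil": 1, "school": "S1", "score": 0.26},
    {"pupil": 2, "school": "S1", "score": -0.41},
    {"pupil": 3, "school": "S2", "score": 1.10},
    {"pupil": 4, "school": "S2", "score": 0.05},
    {"pupil": 5, "school": "S2", "score": -0.73},
    {"pupil": 6, "school": "S3", "score": 0.58},
]

# The group indicator lets us count level 1 units within each level 2 unit.
pupils_per_school = Counter(row["school"] for row in pupils)
number_of_schools = len(pupils_per_school)

print(dict(pupils_per_school), number_of_schools)
```

The same layout applies to individuals in areas: replace the school identifier with an area identifier.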

Fixed effects

• What about fixed effects analysis?

• If we had information on pupils that attended three schools, we could carry out a fixed effects analysis to compare the three schools based on these sample data.

• We would do this with an analysis that includes two dummy variables, which allow us to compare the schools.

• We could make inferences from our results about how the three schools compare, but we would not want to make wider inferences about ‘all schools’ based on information on only 3 schools.
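The dummy coding for three schools can be sketched as follows. The school labels and the choice of reference school are assumptions made for illustration; any of the three schools could serve as the reference.

```python
# Hypothetical school labels; "S1" is treated as the reference school.
SCHOOLS = ("S1", "S2", "S3")


def dummy_variables(school, reference="S1"):
    """Return one 0/1 indicator for every school except the reference."""
    if school not in SCHOOLS:
        raise ValueError(f"unknown school: {school}")
    return [1 if school == other else 0
            for other in SCHOOLS if other != reference]


# Three schools need only two dummy variables: the reference school is
# identified by both indicators being zero.
coded = {school: dummy_variables(school) for school in SCHOOLS}
print(coded)
```

Each dummy coefficient in the fitted model then compares one school with the reference school.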

Multilevel modelling

• For multilevel modelling we would have information on a ‘reasonable number of higher level units’

• What is ‘reasonable’? Snijders and Bosker (1999) recommend at least 10 groups. 20 or more is better.

• We essentially assume we have a representative sample of higher level units in multilevel modelling, so 30 is a good number to have in mind.

Multilevel modelling

• Suppose we had data for pupils based on 30 schools.

• We could carry out a fixed effects analysis on these data by using 29 dummy variables.

• Or we could use multilevel modelling, which assumes the schools are themselves a sample. Hence we do not need to estimate so many model parameters, which makes multilevel modelling desirable in this situation.

• Multilevel modelling also takes group size into account in estimation: estimates of residuals for groups with few units (e.g. a school with 2 pupils) are ‘shrunken’ towards the mean.
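A minimal sketch of this shrinkage, assuming a two-level random intercept model in which the estimated school residual is the raw mean residual for the school multiplied by the factor n_j σ²u / (n_j σ²u + σ²e), where n_j is the number of pupils in school j. The variance values used below are hypothetical, chosen only to show the effect of group size.

```python
def shrinkage_factor(n_j, sigma2_u, sigma2_e):
    """Multiplier applied to a group's raw mean residual.

    n_j      : number of level 1 units (pupils) in group j
    sigma2_u : between-group variance
    sigma2_e : within-group (level 1) variance
    """
    return (n_j * sigma2_u) / (n_j * sigma2_u + sigma2_e)


# Hypothetical variances, for illustration only.
SIGMA2_U = 0.1
SIGMA2_E = 0.5

small_school = shrinkage_factor(2, SIGMA2_U, SIGMA2_E)
large_school = shrinkage_factor(100, SIGMA2_U, SIGMA2_E)

print("factor for a school with 2 pupils:", round(small_school, 3))
print("factor for a school with 100 pupils:", round(large_school, 3))
```

The factor is always between 0 and 1 and approaches 1 as the group grows, so a school with only 2 pupils is pulled much more strongly towards the overall mean than a school with 100 pupils.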

Theory: Single level models

• Suppose we have data for 4059 pupils in 65 schools.

• How could we model the data?

• Model 1: pupil level model based on the 4059 pupils

$$y_i = \beta_0 + \beta_1 x_i + e_i$$

$$\operatorname{Var}(y_i \mid x_i) = \sigma^2$$

Single level models

• Model 2: Or a school level model based on aggregate data for the 65 schools; that is, the school means.

$$\bar{y}_j = \beta_0 + \beta_1 \bar{x}_j + e_j$$

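Multilevel models: model 3 ‘variance components’ model

$$y_{ij} = \beta_0 + u_j + e_{ij}$$

$$\operatorname{Var}(y_{ij}) = \sigma_u^2 + \sigma_e^2 = \sigma^2$$

i is the pupil subscript; j is the school subscript.

$\sigma_u^2$ measures variation between schools.

$\sigma_e^2$ measures variation between pupils within schools.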

Intra-‘class’ correlation

$$\frac{\sigma_u^2}{\sigma^2} = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}$$

= the intra-class correlation: the proportion of the overall variation in exam score attributable to schools, i.e. how similar exam scores are within schools.
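The calculation can be sketched in a few lines. The function below divides the between-school variance by the total variance (between-school plus within-school); the two variance values are hypothetical and are not estimates from the course data.

```python
def intra_class_correlation(sigma2_u, sigma2_e):
    """Share of total variance that lies between groups (schools)."""
    return sigma2_u / (sigma2_u + sigma2_e)


# Hypothetical variance components, for illustration only.
between_school_variance = 0.2
within_school_variance = 0.8

icc = intra_class_correlation(between_school_variance, within_school_variance)
print("intra-class correlation:", round(icc, 3))
```

With these invented values, one fifth of the variation in exam score would be attributable to schools.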

Random intercepts model

Model 4: 2 level model: pupils in schools, with an explanatory variable.

$$y_{ij} = \beta_0 + \beta_1 x_{ij} + u_j + e_{ij}$$

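Random slopes model

Model 5: random slopes

$$y_{ij} = \beta_0 + \beta_{1j} x_{ij} + u_{0j} + e_{ij}$$

where the ‘random slopes’ coefficient is:

$$\beta_{1j} = \beta_1 + u_{1j}$$

Or alternatively, but equivalently, we can write the model as:

$$y_{ij} = \beta_0 + \beta_1 x_{ij} + u_{1j} x_{ij} + u_{0j} + e_{ij}$$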

Group level variables

• We can also add group level variables to the model, e.g. the type of school (mixed or single sex), or the percentage of pupils taking free school meals in the school.

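$$y_{ij} = \beta_0 + \beta_{1j} x_{ij} + \beta_2 w_j + \beta_3 z_j + u_{0j} + e_{ij}$$

where $w_j$ and $z_j$ are school level variables.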

Binary response variables

• Many response variables are ‘binary’ ‘0/1’ ‘dichotomous’.

• E.g. whether or not a person is unemployed or has a limiting long term illness.

• Risk of unemployment may be associated with personal characteristics and/or where people live. We can use multilevel logistic models to investigate these issues.

Binary response variables

• Let’s suppose we are looking at the risk of people being unemployed given some demographic characteristics, and also given some information about the area in which they live.

• We can look at this problem using multilevel logistic regression models

Multilevel logistic regression models

Model 6: The basic (two level) multilevel model for a binary response is written as follows.

$$y_{ij} = p_{ij} + e_{ij}$$

where $y_{ij}$ takes the value 0 or 1 for each individual i in group j (0 = not unemployed, 1 = unemployed),

$p_{ij}$ is the predicted probability of unemployment for individual i in area j,

$e_{ij}$ is an individual level error.

Multilevel logistic regression models

$$\operatorname{Logit}(p_{ij}) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + u_j$$

where $\beta_0$ is the ‘intercept’ and $\beta_1$ to $\beta_p$ are the coefficients of the p explanatory variables.
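To show how the logit scale maps back to a probability, the sketch below inverts the logit: if Logit(p) = η then p = 1 / (1 + exp(−η)). The intercept, coefficients, covariate values and area residuals are all hypothetical, chosen only to illustrate the calculation.

```python
import math


def inv_logit(eta):
    """Convert a value on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def predicted_probability(beta0, betas, xs, u_j):
    """Probability implied by intercept + sum(beta_k * x_k) + area residual."""
    eta = beta0 + sum(b * x for b, x in zip(betas, xs)) + u_j
    return inv_logit(eta)


# Hypothetical values, for illustration only.
beta0 = -2.0          # intercept
betas = [0.5, -0.3]   # coefficients of two explanatory variables
xs = [1.0, 2.0]       # one individual's values on those variables

p_average_area = predicted_probability(beta0, betas, xs, u_j=0.0)
p_high_risk_area = predicted_probability(beta0, betas, xs, u_j=0.5)

print(round(p_average_area, 3), round(p_high_risk_area, 3))
```

Two individuals with identical characteristics can therefore have different predicted probabilities if they live in areas with different residuals.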

MLwiN for binary response variables

• MLwiN could be used to fit a multilevel model based on the example of unemployment as a response variable and some demographic information as explanatory variables.

• For this analysis we could use 1991 UK Census data from the Samples of Anonymised Records (SARs).

• The MLwiN procedure for binary response variables is slightly more involved than that for continuous response variables.

• See chapter 9 of the MLwiN user guide: www.cmm.bristol.ac.uk/MLwiN/download/userman_2005.pdf

SPSS for multilevel modelling

• In SPSS version 11.5 and later it is possible to fit multilevel models for dependent variables with an interval (continuous) response.

• The syntax on the next slide shows how variance components, random intercepts and random intercepts/slopes models can be fitted for a 2-level example - pupils in schools.

SPSS for multilevel modelling

Random intercepts and slopes (on standlrt) model for pupils in schools (normexam is the continuous response; standlrt is a continuous explanatory variable). The syntax is as follows.

mixed normexam with standlrt
  / print = solution
  / fixed standlrt
  / random intercept standlrt | subject(school) covtype(UN).

[ To access via SPSS menus: Analyze > Mixed Models ]

Model Dimension (b)

| Effect type | Effect | Number of Levels | Covariance Structure | Number of Parameters | Subject Variables |
|---|---|---|---|---|---|
| Fixed Effects | Intercept | 1 | | 1 | |
| Fixed Effects | STANDLRT | 1 | | 1 | |
| Random Effects | Intercept + STANDLRT (a) | 2 | Unstructured | 3 | SCHOOL |
| Residual | | | | 1 | |
| Total | | 4 | | 6 | |

a. As of version 11.5, the syntax rules for the RANDOM subcommand have changed. Your command syntax may yield results that differ from those produced by prior versions. If you are using SPSS 11 syntax, please consult the current syntax reference guide for more information.

b. Dependent Variable: NORMEXAM.

Estimates of Fixed Effects (a)

| Parameter | Estimate | Std. Error | df | t | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|---|---|
| Intercept | -.0116529 | .0401111 | 60.653 | -.291 | .772 | -.0918693 | .0685635 |
| STANDLRT | .5565333 | .0201139 | 56.343 | 27.669 | .000 | .5162458 | .5968209 |

a. Dependent Variable: NORMEXAM.
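As a reading aid for this output, the sketch below uses the two fixed effect estimates to compute a predicted normexam score for a pupil in an 'average' school (school residuals set to zero, which is an assumption of the sketch), and recomputes each t statistic as estimate divided by standard error.

```python
# Fixed effect estimates and standard errors copied from the SPSS output.
INTERCEPT, INTERCEPT_SE = -0.0116529, 0.0401111
SLOPE, SLOPE_SE = 0.5565333, 0.0201139


def predict_normexam(standlrt):
    """Fixed part of the model only: school residuals are taken as zero."""
    return INTERCEPT + SLOPE * standlrt


def t_statistic(estimate, std_error):
    return estimate / std_error


prediction_at_one = predict_normexam(1.0)
t_intercept = t_statistic(INTERCEPT, INTERCEPT_SE)
t_slope = t_statistic(SLOPE, SLOPE_SE)

print("predicted normexam at standlrt = 1:", round(prediction_at_one, 4))
print("t for intercept:", round(t_intercept, 3))
print("t for standlrt:", round(t_slope, 3))
```

The recomputed t values agree with the t column of the output to the three decimals shown.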

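Estimates of Covariance Parameters (a)

| Parameter | Estimate | Std. Error |
|---|---|---|
| Residual | .5536372 | .0124922 |
| Intercept + STANDLRT [subject = SCHOOL], UN (1,1) | .0921177 | .0187573 |
| Intercept + STANDLRT [subject = SCHOOL], UN (2,1) | .0183415 | .0070894 |
| Intercept + STANDLRT [subject = SCHOOL], UN (2,2) | .0149670 | .0046961 |

a. Dependent Variable: NORMEXAM.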
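One way to read these estimates, sketched below: taking UN (1,1), UN (2,1) and UN (2,2) as the intercept variance, the intercept-slope covariance and the slope variance (the order follows `random intercept standlrt` in the syntax), the between-school variance in a random slopes model depends on the value of standlrt as Var(u0j + u1j x) = UN(1,1) + 2 UN(2,1) x + UN(2,2) x². The function name and the chosen values of x are illustrative.

```python
# Covariance parameter estimates copied from the SPSS output.
INTERCEPT_VARIANCE = 0.0921177    # UN (1,1)
INTERCEPT_SLOPE_COV = 0.0183415   # UN (2,1)
SLOPE_VARIANCE = 0.0149670        # UN (2,2)


def between_school_variance(standlrt):
    """Variance of (u0j + u1j * standlrt) across schools."""
    return (INTERCEPT_VARIANCE
            + 2.0 * INTERCEPT_SLOPE_COV * standlrt
            + SLOPE_VARIANCE * standlrt ** 2)


variance_by_standlrt = {x: between_school_variance(x) for x in (-1.0, 0.0, 1.0)}
for x, variance in variance_by_standlrt.items():
    print(f"standlrt = {x:+.0f}: between-school variance = {variance:.4f}")
```

Because the covariance estimate is positive, the schools differ more from one another for pupils with high standlrt than for pupils with low standlrt over this range.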

variance components model only

mixed normexam / print = solution / random intercept | subject(school) covtype(UN).

random intercepts model only

mixed normexam with standlrt / print = solution / fixed standlrt / random intercept | subject(school) covtype(UN).

Reading list

Books:

• Plewis, I. (1997) ‘Statistics in Education’. Edward Arnold.

• Snijders, T. and Bosker, R. (1999) ‘Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling’. Sage Publications.

• Goldstein, H. (1995) ‘Multilevel Statistical Models’. Edward Arnold.

Web:

• http://www.cmm.bristol.ac.uk

• NB: a new version of MLwiN (2.10) has just been released: see the website.

