Correction for measurement error in survey research using SQP

Willem E. Saris

RECSM 2013

Introduction

• All researchers agree that survey data contain measurement errors

• Procedures for correcting measurement errors have been known since 1971 (Duncan and Goldberger)

• However, very few researchers try to correct for these errors

Attention to measurement problems in social science journals of 2011

Journal   Year   No. of papers   Survey research used   Errors mentioned   Errors corrected
ESR       2011   48              41                     9                  1
EJPR      2011   32              20                     4                  1
POQ       2011   33              32                     4                  1
AJPS      2011   54              23                     3                  0
JM        2011   47              27                     11                 8

ESR = European Sociological Review, EJPR = European Journal of Political Research, POQ = Public Opinion Quarterly, AJPS = American Journal of Political Science, JM = Journal of Marketing

Why does this happen?

1. because the effect of measurement errors is very small? or

2. because it is very difficult to correct for measurement error? or

3. because the information about the size of the measurement errors is not available?

1. Is the effect of the measurement errors very small?

• Around 1971, Alwin, Andrews and I detected, using LISREL, that the errors in survey questions are very large

• All three of us have spent our academic lives on estimating and correcting for measurement error

• Duane Alwin (2007) concentrated on the Quasi Simplex approach

• Frank Andrews (1984) and I used the MTMM approach

The size of the error variance

• Our estimate was that on average 50% of the variance of responses to survey questions is error

• So there is a considerable difference between the variable one would like to measure and the observed variable

Consequences of measurement error

• The consequences will be discussed for:
– the observed correlations
– regression analysis
– comparative research

The consequences for the correlation

• Imagine that we are interested in the correlation between
– f1 = job satisfaction
– f2 = life satisfaction

• We ask: "How satisfied are you with your job?" and "How satisfied are you with your life?"

• The responses are represented by y1 and y2

• We know that there is quite a difference between f1 and y1 and between f2 and y2

A very simple model

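If the variables $f_i$ and $y_i$ are standardized:

• $q_i^2$ = the quality of indicator i for latent variable i

• $1 - q_i^2$ = the error variance of indicator i for latent variable i

• It can be proven that: $\rho(y_1, y_2) = \rho(f_1, f_2)\, q_1 q_2$

[Path diagram: f1 and f2 are correlated with ρ(f1,f2); f1 affects y1 with quality coefficient q1 and f2 affects y2 with quality coefficient q2; e1 and e2 are the errors in y1 and y2]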
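The attenuation result for this simple model follows in a few lines. As a sketch, assume that each standardized response can be written as $y_i = q_i f_i + e_i$, and that the errors are uncorrelated with the latent variables and with each other. These assumptions are implied by the path diagram but are spelled out here only for illustration.

```latex
\begin{align*}
\rho(y_1, y_2) &= \operatorname{Cov}(q_1 f_1 + e_1,\; q_2 f_2 + e_2) \\
  &= q_1 q_2 \operatorname{Cov}(f_1, f_2)
     + q_1 \operatorname{Cov}(f_1, e_2)
     + q_2 \operatorname{Cov}(e_1, f_2)
     + \operatorname{Cov}(e_1, e_2) \\
  &= q_1 q_2\, \rho(f_1, f_2)
\end{align*}
```

Under the same assumptions, $\operatorname{Var}(y_i) = q_i^2 + \operatorname{Var}(e_i) = 1$, which is why the error variance equals $1 - q_i^2$.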

Consequences for correlations

• If the correlation between the latent variables is $\rho(f_1, f_2) = .9$, the correlation between the observed variables will be as follows:

Quality coefficients   Observed correlation
q1     q2              ρ(y1, y2)
1.0    1.0             .90
.9     .9              .73
.8     .8              .58
.7     .7              .44
.6     .6              .32
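The attenuation in this table can be recomputed directly from the relation between observed and latent correlations in the simple model. The following is a minimal sketch; the function name `observed_correlation` is chosen here for illustration and is not part of SQP or any other tool mentioned in the talk.

```python
def observed_correlation(rho_latent: float, q1: float, q2: float) -> float:
    """Observed correlation implied by the simple model: rho(f1,f2) * q1 * q2."""
    return rho_latent * q1 * q2


rho_latent = 0.9  # correlation between the latent variables, as in the table

# Equal quality coefficients for both indicators, as in the table rows.
attenuation = {
    q: observed_correlation(rho_latent, q, q) for q in (1.0, 0.9, 0.8, 0.7, 0.6)
}

for q, r in attenuation.items():
    print(f"q1 = q2 = {q:.1f}  ->  observed correlation = {r:.3f}")
```

The loop makes the pattern visible: because the quality coefficients enter as a product, the observed correlation drops quickly as quality declines.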

Consequences for correlations and regressions

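Correlations between latent variables:

        JS*    Age*   LS*
JS*     1.0
Age*    0.0    1.0
LS*     .4     .4     1.0

Correlations between observed variables:

        JS     Age    LS
JS      1.0
Age     0.0    1.0
LS      .13    .24    1.0

Regression on the latent variables: LS* = .4 JS* + .4 Age* + u3

Regression on the observed variables: LS = .13 JS + .24 Age + e3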

Consequences for cross cultural comparison

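[Two path diagrams: in both countries f1 and f2 have the same correlation of .8; the quality coefficients of y1 and y2 are .9 in Country A and .7 in Country B]

Country A: Corr(Y1,Y2) = .8 × .9 × .9 ≈ .65

Country B: Corr(Y1,Y2) = .8 × .7 × .7 ≈ .4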
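A quick check that the difference between the two countries is produced by the quality coefficients alone: dividing each observed correlation by $q_1 q_2$, which follows from $\rho(y_1, y_2) = \rho(f_1, f_2)\, q_1 q_2$, recovers the same latent correlation in both countries. The worked arithmetic below uses only the numbers given above.

```latex
\begin{align*}
\text{Country A:}\quad \rho(f_1, f_2) &= \frac{.8 \times .9 \times .9}{.9 \times .9} = \frac{.648}{.81} = .8 \\
\text{Country B:}\quad \rho(f_1, f_2) &= \frac{.8 \times .7 \times .7}{.7 \times .7} = \frac{.392}{.49} = .8
\end{align*}
```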

Conclusions

• Research by Andrews, Alwin, myself and others shows that the error variance in survey data is rather large

• Because of these errors, correlations and regression coefficients between observed variables can be very different from those between latent variables

• Differences in error variances across countries will make comparisons across countries impossible

2. Is correction for measurement errors very difficult?

The standard SEM approach

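[Path diagram: a structural equation model in which latent variables (labelled Environmental values, Perception of environmental damage, Environment-friendly behavior, Influence and Understanding of politics) are each measured by multiple indicators (y1 to y6, x1 to x4) with error terms e1 to e6]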

Is this the approach to use?

• In principle this approach is correct but in reality it leads to a lot of complications and errors

• This may be one reason why researchers don't correct for measurement errors

• There should be simpler procedures

2. Is correction for measurement errors very difficult?

If this model holds: $\rho(y_1, y_2) = \rho(f_1, f_2)\, q_1 q_2$

Then it also holds that: $\rho(f_1, f_2) = \rho(y_1, y_2) / (q_1 q_2)$

So correction for measurement error is very simple

This holds for single questions as well as composite scores

[Path diagram: the simple model with f1, f2, ρ(f1,f2), q1, q2, y1, y2, e1 and e2]

Quality estimates of two scales in the last Pilot of the ESS

• Two scales were constructed:
– one based on opinions about liberal rights, called "liberal democracy"
– one based on opinions about electoral requirements, called "electoral democracy"

• The quality of the scales is:
– for liberal democracy .79
– for electoral democracy .77

Correction for measurement error

• The observed correlation between the two scales is $\rho(y_1, y_2) = .638$

• So $\rho(f_1, f_2) = .638 / \sqrt{.79 \times .77} = .82$ (the reported qualities are $q^2$, so $q_1 q_2 = \sqrt{.79 \times .77}$)

• So while the observed correlation is not very high, the correlation corrected for measurement error indicates quite a strong relationship between the two scales
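The same correction can be written as a two-line function. This is a minimal sketch; the name `correct_correlation` is introduced here for illustration, and the inputs are assumed to be qualities ($q^2$), as reported for the two scales.

```python
import math


def correct_correlation(r_observed: float, quality_1: float, quality_2: float) -> float:
    """Divide the observed correlation by q1*q2, where quality_i is q_i squared."""
    q1q2 = math.sqrt(quality_1 * quality_2)
    return r_observed / q1q2


# Liberal democracy (quality .79) and electoral democracy (quality .77).
rho_corrected = correct_correlation(0.638, 0.79, 0.77)
print(f"corrected correlation: {rho_corrected:.2f}")  # prints 0.82
```

Passing qualities rather than quality coefficients keeps the inputs in the form in which they are reported, at the cost of one square root inside the function.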

Relationships with other variables

• We expect that the scale of liberal democracy should correlate with the variables:
– Just (no poverty), quality = .51
– Direct (referenda), quality = .62
– Income (household), quality = .92

• We will now show how simply regression analysis can be done with and without correction for measurement errors

Procedure to correct for measurement error using LISREL

Without correction for measurement error

Effects on liberal democracy in the UK
da ni=4 no=378 ma=km
km
1.0
.495 1.0
.401 .413 1.0
.210 -.053 -.116 1.0
labels
liberal just direct income
model ny=1 nx=3
out

Note: here the diagonal contains 1

With correction for measurement error

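Effects on liberal democracy in the UK
da ni=4 no=378 ma=km
cm
.79
.495 .51
.401 .413 .62
.210 -.053 -.116 .92
labels
liberal just direct income
model ny=1 nx=3
out

Note: here the diagonal contains the quality of each variable. Because the matrix is read as a covariance matrix (cm) but analyzed as a correlation matrix (ma=km), each observed correlation is divided by the square root of the product of the two qualities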

The correlations and regression

Without correction for measurement errors

Correlations
            liberal   just     direct   income
liberal     1.00
just        0.50      1.00
direct      0.40      0.41     1.00
income      0.21      -0.05    -0.12    1.00

Regression (36% explained)
            just     direct   income
liberal     0.40     0.27     0.26
s.e.        (0.05)   (0.05)   (0.04)
t-value     8.77     5.84     6.29

With correction for measurement errors

Correlations
            liberal   just     direct   income
liberal     1.00
just        0.78      1.00
direct      0.57      0.73     1.00
income      0.25      -0.08    -0.15    1.00

Regression (70% explained)
            just     direct   income
liberal     0.76     0.07     0.32
s.e.        (0.04)   (0.04)   (0.03)
t-value     18.22    1.59     11.06
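The point estimates from the two LISREL runs above can be approximated without LISREL. The sketch below divides each observed correlation by the square root of the product of the two qualities and then solves the normal equations for the standardized regression. It is an illustration under stated assumptions, not the LISREL algorithm: it uses a small hand-written linear solver, it does not produce standard errors or t-values, and all function names are introduced here.

```python
import math

names = ["liberal", "just", "direct", "income"]

# Observed correlations and qualities, copied from the LISREL input.
observed = {
    ("liberal", "just"): 0.495, ("liberal", "direct"): 0.401,
    ("just", "direct"): 0.413, ("liberal", "income"): 0.210,
    ("just", "income"): -0.053, ("direct", "income"): -0.116,
}
quality = {"liberal": 0.79, "just": 0.51, "direct": 0.62, "income": 0.92}


def corr(a, b, corrected):
    """Observed or corrected correlation between variables a and b."""
    if a == b:
        return 1.0
    r = observed.get((a, b), observed.get((b, a)))
    return r / math.sqrt(quality[a] * quality[b]) if corrected else r


def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def standardized_regression(corrected):
    """Return (coefficients, R-squared) for liberal on just, direct, income."""
    predictors = names[1:]
    rxx = [[corr(a, b, corrected) for b in predictors] for a in predictors]
    rxy = [corr("liberal", p, corrected) for p in predictors]
    betas = solve(rxx, rxy)
    r_squared = sum(b * r for b, r in zip(betas, rxy))
    return betas, r_squared


for corrected in (False, True):
    betas, r2 = standardized_regression(corrected)
    label = "with" if corrected else "without"
    print(label, "correction:", [round(b, 3) for b in betas], "R2 =", round(r2, 3))
```

The coefficients and explained variance come out close to the values reported in the two tables, which suggests that the correction amounts to nothing more than rescaling the correlation matrix before the regression is estimated.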

Generalization

• The same can be done for causal models with several variables and composite scores

• It can be done for standardized and unstandardized coefficients

• Stata also offers possibilities for correcting for measurement error, but they are less general

Procedure to correct for measurement error using Stata

Limitations:

• It can be applied only to regression, not to causal models in general

• Only correction for measurement error in the independent variables

• Only unstandardized analysis

Regression without correction in STATA

regress liberal socjustice direct income if cntry==1

The procedure for correction in STATA

eivreg liberal socjustice direct income if cntry==1, r(socjustice .51 direct .62 income .92)
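For readers who want to see what a correction restricted to the independent variables amounts to, the textbook errors-in-variables estimator can be written as below. This is a sketch under the assumption that eivreg follows this standard form; the symbols $S_{xx}$, $s_{xy}$, $r_j$ and $s_j^2$ are introduced here for illustration and do not appear in the original slides.

```latex
\hat{\beta} = \left(S_{xx} - D\right)^{-1} s_{xy},
\qquad
D = \operatorname{diag}\!\left((1 - r_j)\, s_j^2\right)
```

Here $S_{xx}$ is the covariance matrix of the observed independent variables, $s_{xy}$ the vector of their covariances with the dependent variable, $r_j$ the reliability supplied for variable $j$ and $s_j^2$ its observed variance. With all $r_j = 1$ the matrix $D$ vanishes and the estimator reduces to ordinary least squares, which is consistent with the limitation that the dependent variable is left uncorrected.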

Conclusions

• Correction for measurement errors is nowadays very simple

• Correction for measurement errors is also necessary

3. Is it difficult to estimate the quality of questions and composite scores?

• There are a lot of different procedures

• They all require at least 2 questions for each concept and the estimates are specific for the formulations of these questions

• That means that the questionnaires become twice as long and more expensive

The Multitrait-Multimethod (MTMM) approach

• Many procedures have been developed to obtain estimates of the quality of questions and composite scores (Saris & Gallhofer 2007)

• We have chosen the MTMM design
– proposed by Campbell and Fiske (1959)
– further developed by Andrews (1984), Saris and Andrews (1991), Saris, Satorra and Coenders (2004)

An example

• Three ESS questions about satisfaction:

– On the whole, how satisfied are you with the present state of the economy in Britain?

– Now think about the national government. How satisfied are you with the way it is doing its job?

– And on the whole, how satisfied are you with the way democracy works in Britain?

Three alternative response scales

The first (M1): 1) very satisfied, 2) fairly satisfied, 3) fairly dissatisfied or 4) very dissatisfied

The second (M2): a numeric scale 0 1 2 3 4 5 6 7 8 9, with the end points labelled "very dissatisfied" and "very satisfied"

The third (M3): 1) not at all satisfied, 2) satisfied, 3) rather satisfied, 4) very satisfied

Estimation

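• Asking each of the 3 questions with each of the 3 response scales gives 9 observed variables, and therefore 9 × 10 / 2 = 45 variances and covariances

• Using these data the quality coefficients for these 9 questions can be estimated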

Limitation of these experiments

• In the ESS, 3,000 questions have been evaluated with respect to quality up to now

• However, over the same period 60,000 questions have been asked

• One can never evaluate all questions

• So an alternative procedure is necessary

An alternative procedure

• Frank Andrews already studied the relationship between the characteristics of the questions and the quality of questions

• My idea was that if these relationships are strong one can use them for the prediction of the quality of new questions

• I also thought of creating a program that could make these quality predictions

MTMM experiments in IRMCS, 1990-2000

• 87 MTMM experiments, containing 1,023 questions, were collected in the US (Andrews), the Netherlands (Scherpenzeel), Belgium (Billiet) and Austria (Költringer)

• A first meta-analysis was done to see whether the quality of the questions could be explained by question characteristics

• The results were very promising: the explained variance was .50 for reliability and .60 for validity (Saris & Gallhofer 2007)

MTMM experiments in the ESS, 2000-2012

• In each round of the European Social Survey, 4 to 6 experiments were carried out in each country

• That means that in each round 1,000 questions in more than 20 different European languages were evaluated

• After 3 rounds, we had information about the quality of 3,000 questions

• We expected to be able to predict the quality of the questions from the question characteristics

The long way to the solution: SQP

• We coded the question characteristics of the MTMM questions

• And we estimated the relationship between these characteristics and the quality of the questions

• Without going into details (Oberski et al. 2012), we could predict reliability with an $R^2$ of .8 and validity with an $R^2$ of .9 for the present 3,700 MTMM questions

• The prediction procedure was implemented in the program SQP 2.0

The quality predictions of SQP 2.0

• So we are quite confident that SQP can make rather good predictions of the quality of new questions on the basis of the characteristics of the question

Let us have a look

Available here: http://sqp.upf.edu/

Can be used free of charge!

You just need to register and then you can use it directly online

Conclusions

• It seems that it is easy to get information about the quality of questions

• For a lot of questions, SQP gives information about their quality based on research

• SQP can also be used to predict the quality of questions that have not been studied

• Users can bring in their own questions and by coding the question obtain a prediction of the quality

• If the qualities of single questions are known, the quality of composite scores can also be derived

Conclusions

• The program SQP is an internet application

• So all users that are coding questions add information about quality of new questions to the database

• In this way, one gets a growing database of questions with their quality: a Wikipedia for questions

Conclusions

Is there any reason not to correct for measurement error ?

1. Is the effect of measurement errors very small? NO!

2. Is it very difficult to correct for measurement error? NO!

3. Is the information about the size of the measurement errors missing? NO!

Conclusions

• There is no reason anymore to analyze data without correction for measurement error

• If one takes research seriously, one has to make the correction for measurement errors

• Otherwise one cannot trust the results from the research

Summary

• A summary of all details and problems of this approach, using ESS data, will be provided in the second edition of Saris and Gallhofer, Design, Evaluation and Analysis of Questionnaires for Survey Research. Hoboken: Wiley

• The book will appear in 2014

A FINAL ILLUSTRATION OF CORRECTION FOR A MORE COMPLEX CASE

• A very popular topic of research is the explanation of opinions about immigration of people from outside Europe

[Path diagram: Economic threat, Better life and Culture threat are the explanatory variables for Allow more people from outside Europe]

Summary of the predicted values of the quality indicators in Ireland

Variable   Method    r²     v²     m²     q²
Allow      SQP 2.0   .826   .906   .094   .747
Economy    SQP 2.0   .770   .780   .220   .601
Culture    SQP 2.0   .761   .705   .295   .537
Better     SQP 2.0   .748   .725   .275   .543
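The columns of this table are linked. In the MTMM true-score model the quality is the product of reliability and validity, q² = r² × v², and the method effect is m² = 1 − v². These relations are not stated on the slide; they are assumed here from the standard model and can be checked against the predicted values, as in the following sketch (the function names are illustrative).

```python
# Predicted values for Ireland: (r2, v2, m2, q2) per variable.
indicators = {
    "Allow":   (0.826, 0.906, 0.094, 0.747),
    "Economy": (0.770, 0.780, 0.220, 0.601),
    "Culture": (0.761, 0.705, 0.295, 0.537),
    "Better":  (0.748, 0.725, 0.275, 0.543),
}


def quality(r2: float, v2: float) -> float:
    """Quality as the product of reliability and validity (assumed relation)."""
    return r2 * v2


def method_effect(v2: float) -> float:
    """Method effect as the complement of validity (assumed relation)."""
    return 1.0 - v2


for name, (r2, v2, m2, q2) in indicators.items():
    print(
        f"{name:8s} r2*v2 = {quality(r2, v2):.3f} (table q2 = {q2:.3f}), "
        f"1 - v2 = {method_effect(v2):.3f} (table m2 = {m2:.3f})"
    )
```

Small differences in the third decimal are to be expected, since the table reports rounded values.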

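Correction for errors taking CMV into account

[Path diagram: f1 and f2 are correlated with ρ(f1,f2); each fi, together with the method factor Mj, determines the true score tij, which together with the random error eij determines the observed variable yij]

• fi = the ith variable of interest
• vij = validity coefficient for variable i
• Mj = method factor for both variables
• mij = method effect on variable i
• tij = true score for yij
• rij = reliability coefficient
• yij = the observed variable
• eij = the random error in variable yij

$\rho(y_{1j}, y_{2j}) = \rho(f_1, f_2)\, q_{1j} q_{2j} + \mathrm{cmv}$

$\rho(f_1, f_2) = [\rho(y_{1j}, y_{2j}) - \mathrm{cmv}] / (q_{1j} q_{2j})$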

Correction of the correlations for random errors and CMV
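A correction of this kind can be sketched in a few lines. Two things in the example are assumptions rather than content of the slides: the common method variance is taken to be cmv = r1j × m1j × m2j × r2j (the path between the two observed variables through the shared method factor in the true-score model), and the observed correlation of .50 is hypothetical, since the original table of correlations is not reproduced here. The quality indicators are the SQP 2.0 predictions for "Allow" and "Economy" in Ireland.

```python
import math


def cmv(r2_1: float, m2_1: float, r2_2: float, m2_2: float) -> float:
    """Common method variance r1*m1*m2*r2, computed from squared coefficients.

    Taking the square root of the product assumes all four coefficients
    are positive (an assumption made for this illustration).
    """
    return math.sqrt(r2_1 * m2_1 * m2_2 * r2_2)


def corrected_correlation(r_observed: float, q2_1: float, q2_2: float,
                          cmv_value: float) -> float:
    """Subtract the cmv and divide by q1*q2, where q2_i is q_i squared."""
    return (r_observed - cmv_value) / math.sqrt(q2_1 * q2_2)


allow = {"r2": 0.826, "m2": 0.094, "q2": 0.747}
economy = {"r2": 0.770, "m2": 0.220, "q2": 0.601}

r_observed = 0.50  # hypothetical observed correlation, for illustration only

shared = cmv(allow["r2"], allow["m2"], economy["r2"], economy["m2"])
rho = corrected_correlation(r_observed, allow["q2"], economy["q2"], shared)

print(f"cmv = {shared:.3f}, corrected correlation = {rho:.3f}")
```

The two steps pull in opposite directions: removing the cmv lowers the correlation, while dividing by the quality coefficients raises it, so the net effect depends on the questions involved.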

Estimates of the parameters with and without correction

Conclusion

• This example shows again that correction for measurement error is necessary

• Now there is also no excuse anymore.

• The procedures to correct are simple

• And SQP provides information about the quality of questions even without collecting extra new data

We did not do this work alone

• Hubert Blalock, Karl Jöreskog, Frank Andrews, Albert Satorra, Marius de Pijper, Anuska Ferligoj, Roger Jowell, Joan Manuel Batista

• Past students: Annette Scherpenzeel, Richard Költringer, Germà Coenders, Chris Aalbers, Irmgard Corten, William van der Veld, Luis Coromina, Laura Guillen, Desiree Knoppen

• The new generation: Melanie Revilla, Diana Zavala, Laur Lilleoja, Wiebke Weber

• Special group: Daniel Oberski and Tom Gruner

The Future

• Of course the predictions are not perfect

• Improvement is always possible
– Alternative quality estimation procedures can be developed

• Extension is necessary for
– new question forms
– other languages

• But…

Future

• I leave this task to the RECSM researchers:
– Wiebke Weber, Melanie Revilla, Diana Zavala, Anna de Castellarnau, Lydia Repke, Jennifer Neumann, Bruno Arpino, Paolo Moncagatta and André Pirralha

• I have a lot of confidence that they will make the right decisions in the future to maintain and improve the present tool

• So that I can concentrate on other things…

Club Pati Barcelona

www.upf.edu/survey

recsm@upf.edu
