Post on 21-Dec-2014
Suppressor and distort variables
WANG Chengjun, City University of Hong Kong
20110304
Suppressor and Suppression
A suppressor is a variable that weakens a relationship and so conceals its true strength.
Context: zero-order correlation. This is the relationship between two variables, ignoring the influence of other variables.
The general idea: there is some kind of noise (error) in X1 that is not correlated with Y but is correlated with X2.
By including X2 we suppress this noise and leave X1 as an improved predictor of Y.
Suppressor Variable
Normal situation
Because the predictors share variance and influence,
each semi-partial correlation, and the corresponding beta, will be less than the simple correlation between Xi and Y.
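The normal situation can be checked with a small simulation. This is only a sketch under assumed values: the sample size, the coefficients and the helper z() are illustrative choices, not taken from the slides.

```r
# Sketch: two overlapping predictors that both predict y (assumed setup).
set.seed(1)
n  <- 5000
x1 <- rnorm(n)
x2 <- 0.5 * x1 + rnorm(n)      # x2 shares variance with x1
y  <- x1 + x2 + rnorm(n)       # both predictors influence y

# Zero-order (simple) correlation between y and x1
r_y1 <- cor(y, x1)

# Semi-partial correlation: y against the part of x1 not explained by x2
x1_resid <- resid(lm(x1 ~ x2))
sr_1 <- cor(y, x1_resid)

# Standardized beta for x1; z() is a hypothetical helper for standardizing
z <- function(v) (v - mean(v)) / sd(v)
zy <- z(y); zx1 <- z(x1); zx2 <- z(x2)
beta_1 <- coef(lm(zy ~ zx1 + zx2))[["zx1"]]

print(c(r_y1 = r_y1, sr_1 = sr_1, beta_1 = beta_1))
```

With this setup both the semi-partial correlation and the beta should come out smaller than the simple correlation, because x2 takes over the variance it shares with x1.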
[Diagram: X1 and Y]
Classical suppression: rY2 = 0
The presence of X2 will increase the multiple correlation, even though X2 is not correlated with Y.
X2 suppresses some of the error variance in X1.
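Classical suppression can be sketched the same way. The names signal and noise, the sample size and the error standard deviation are assumptions made for illustration only.

```r
# Sketch: x2 is uncorrelated with y but measures the error part of x1.
set.seed(2)
n      <- 5000
signal <- rnorm(n)                 # the part of x1 that is related to y
noise  <- rnorm(n)                 # the part of x1 that is unrelated to y
x1 <- signal + noise
x2 <- noise                        # x2 captures only the error in x1
y  <- signal + rnorm(n, sd = 0.5)

r_y2    <- cor(y, x2)                              # close to zero
r2_x1   <- summary(lm(y ~ x1))$r.squared           # x1 alone
r2_both <- summary(lm(y ~ x1 + x2))$r.squared      # x1 plus the suppressor
b2      <- coef(lm(y ~ x1 + x2))[["x2"]]           # negative: removes the noise

print(c(r_y2 = r_y2, r2_x1 = r2_x1, r2_both = r2_both, b2 = b2))
```

Adding x2 should raise R-squared noticeably even though x2 and y are nearly uncorrelated, and x2 receives a negative weight because it subtracts the error from x1.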
Suicide rate and religion
Durkheim argues that if we control for education, the suicide rate of Jewish people will be even lower.
Jews are assumed to be a more integrated group.
               Catholic   Protestant   Jewish
Suicide rate       33.8         64.9      1.3
Distort variables
A distort variable converts a positive relationship into a negative relationship.
[Diagram: X1 and Y joined by a path marked +; then X1, Y and X2 with two paths marked -. Caption: Distort variable]
Distort variables
The zero-order correlation between marriage and suicide rates suggests that marriage makes people more prone to suicide.
Yet marriage makes people more integrated, so married people should have a lower suicide rate.
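A sign reversal of this kind can be reproduced with simulated data. This is a sketch only: x2 stands for a hypothetical omitted third variable, and all coefficients are assumed values, not estimates from Durkheim's data.

```r
# Sketch: zero-order relation of x1 and y is positive,
# but the effect of x1 is negative once x2 is controlled.
set.seed(3)
n  <- 5000
x2 <- rnorm(n)                          # hypothetical omitted variable
x1 <- x2 + rnorm(n)                     # x1 is positively related to x2
y  <- -0.5 * x1 + 1.5 * x2 + rnorm(n)   # direct effect of x1 on y is negative

r_y1 <- cor(y, x1)                      # zero-order: positive
b1   <- coef(lm(y ~ x1 + x2))[["x1"]]   # controlling for x2: negative

print(c(r_y1 = r_y1, b1 = b1))
```

The zero-order correlation is positive only because x1 carries x2 along with it; including x2 in the model recovers the negative effect.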
Keep model complete
For the regression y=b1*x1+b2*x2, with b1 and b2 as standardized coefficients:
if |ry1|<|b1|, x2 acts as a suppressor;
if ry1*b1<0, x2 acts as a distort variable.
Suppressor and distort variables remind us to keep the model complete.
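The two conditions can be wrapped in a small helper. The function name check_third_variable and its return format are hypothetical, and the sketch assumes the comparison is made on standardized coefficients.

```r
# Sketch: flag suppression and distortion for x1 when x2 is added.
# check_third_variable() is a hypothetical helper, not a library function.
check_third_variable <- function(y, x1, x2) {
  z <- function(v) (v - mean(v)) / sd(v)   # standardize a vector
  zy <- z(y); zx1 <- z(x1); zx2 <- z(x2)
  r_y1   <- cor(y, x1)                               # zero-order correlation
  beta_1 <- coef(lm(zy ~ zx1 + zx2))[["zx1"]]        # standardized b1
  list(r_y1 = r_y1,
       beta_1 = beta_1,
       suppression = abs(r_y1) < abs(beta_1),        # |ry1| < |b1|
       distortion  = r_y1 * beta_1 < 0)              # ry1 * b1 < 0
}

# Usage on simulated data (assumed values): x2 measures the error in x1
set.seed(4)
n      <- 2000
signal <- rnorm(n)
noise  <- rnorm(n)
res <- check_third_variable(y  = signal + rnorm(n, sd = 0.5),
                            x1 = signal + noise,
                            x2 = noise)
print(res)
```

The same function can be applied to the vectors y, x1 and x2 defined in Appendix 1.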
Appendix 1 A simulation of suppression
################ Jonathan's example ############################
# see http://zjz06.spaces.live.com/blog/cns!3F49BBFB6C5A1D86!341.entry
y <- c(1, 2, 3, 4, 5)
x1 <- c(2, 3, 4, 5, 1)
x2 <- c(3, 2, 1, 4, 5)
cor(y, x1); cor(y, x2); cor(x1, x2)
################## partial correlations ######################
library(ggm)
data <- cbind(y, x1, x2)
# partial correlation between y and x1 controlling for x2
pcor(c('y', 'x1', 'x2'), var(data))
# partial correlation between y and x2 controlling for x1
pcor(c('y', 'x2', 'x1'), var(data))
# regression partials out the effect of suppression
fit1 <- lm(y ~ x1)    # simple regressions, for comparison with the full model
fit2 <- lm(y ~ x2)
fit12 <- lm(y ~ x1 + x2)
summary(fit1); summary(fit2); summary(fit12)
library(QuantPsyc)
lm.beta(fit12)        # standardized coefficients
######### visualize the data in correlation matrices ###########
library(corrgram)     # install.packages('corrgram')
corrgram(data, order = TRUE, lower.panel = panel.shade,
         upper.panel = panel.pie, text.panel = panel.txt,
         main = "Suppression in zero order correlation")
Appendix 2 A simulation of distortion
set.seed(20110303)
y <- rnorm(10000)
x2 <- -0.1 * y + 0.01 * rnorm(10000)
x1 <- 0.8 * x2 + 0.01 * rnorm(10000)
cor(x1, x2); cor(x1, y); cor(x2, y)
plot(data.frame(cbind(y, x1, x2)), col = '3')
f1 <- lm(y ~ x1)
f2 <- lm(y ~ x2)
f3 <- lm(y ~ x1 + x2)
f4 <- lm(x1 ~ x2)
summary(f1); summary(f2); summary(f3); summary(f4)