Post on 10-Dec-2014
Introduction of Mixed effect model
Learning by simulation
Supstat Inc.
Introduction of Mixed effect model http://nycdatascience.com/mixed_effect_model_supstat/index.html#1
1 of 34 1/29/14, 10:51 PM
Outline
· What is mixed effect model
· Fixed effect model
· Mixed effect model
  - Random Intercept model
  - Random Intercept and Slope Model
· General Mixed effect model
· Case study
What is mixed effect model
Classical normal linear model
Formulation:
Yi = b0 + b1*Xi + ei
· Yi is the response from subject i.
· Xi are the covariates.
· b0 and b1 are the parameters that we want to estimate.
· ei are the random terms in the model, assumed to be independently and identically distributed as Normal(0, sigma^2). It is very important that there is no structure in ei: it represents the variation that could not be controlled in our study.
Violation of the independence assumption
In many cases, responses are not independent of each other. Such data usually have some cluster structure:
· Repeated measures, where measurements are taken multiple times from the same subjects. (clustered by subject)
· A survey of all the members of a family. (clustered by family)
· A survey of students from 20 classrooms in a high school. (clustered by classroom)
· Longitudinal data, also known as panel data, where several responses are collected from the same subjects over time. (clustered by subject)
We need a new tool: the mixed effect model.
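To see what clustering does to the independence assumption, here is a minimal simulation sketch. The number of classrooms, the effect sizes, and all variable names are illustrative assumptions, not values from the slides. It fits a model that ignores the clusters and then checks how much of the residual variance is explained by cluster membership alone; for truly independent residuals that share should be close to zero.

```r
# Simulate clustered data: 20 classrooms (clusters) with 30 students each.
set.seed(42)
n_clusters <- 20
n_per <- 30
cluster <- rep(seq_len(n_clusters), each = n_per)

# Every student in a classroom shares that classroom's effect.
cluster_effect <- rnorm(n_clusters, mean = 0, sd = 2)
y <- 10 + cluster_effect[cluster] + rnorm(n_clusters * n_per, mean = 0, sd = 1)

# Residuals from a model that ignores the clustering.
res <- resid(lm(y ~ 1))

# Mean residual per classroom: these would hover near zero without clustering.
cluster_means <- tapply(res, cluster, mean)
print(round(cluster_means, 2))

# Share of residual variance explained by classroom membership.
r2 <- summary(lm(res ~ factor(cluster)))$r.squared
print(round(r2, 2))
```

A large share means the residuals carry cluster structure, which is exactly what the classical linear model rules out.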
Mixed effect model
Mixed effect model = Fixed effect + Random effect
· Fixed effects
  - are expected to have a systematic and predictable influence on your data.
  - exhaust “the levels of a factor”. Think of sex (male/female).
· Random effects
  - are expected to have a non-systematic, unpredictable, or “random” influence on your data.
  - have factor levels that are drawn from a large population, but we do not know exactly how or why they differ.
Example of Fixed effects and Random effects
FIXED EFFECTS                 RANDOM EFFECTS
Male or female                Individuals with repeated measures
Insecticide sprayed or not    Block within a field
Upland or lowland             Brood
One country versus another    Split plot within a plot
Wet versus dry                Family
Fixed effect model
Fixed effect model
The fixed effect model is just the linear model that you may already know.
Yi = b0 + b1*Xi + ei
i = 1, ..., n, where n is the number of samples
· Yi: response variable
· b0: fixed intercept
· b1: fixed slope
· Xi: explanatory variable (fixed effect)
· ei: noise (error)
Data generation of fixed effect model
    set.seed(1)
    # generate x
    x <- seq(1, 5, length.out = 100)
    # generate error
    noise <- rnorm(n = 100, mean = 0, sd = 1)
    b0 <- 1
    b1 <- 2
    # generate y
    y <- b0 + b1*x + noise
Data generation of fixed effect model
    plot(y~x)
Coefficient estimation of fixed effect model
    model <- lm(y~x)
    summary(model)

    Call:
    lm(formula = y ~ x)

    Residuals:
        Min      1Q  Median      3Q     Max
    -2.3401 -0.6058  0.0155  0.5851  2.2975

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   1.1424     0.2491    4.59  1.3e-05 ***
    x             1.9888     0.0774   25.70  < 2e-16 ***
    ---
    Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

    Residual standard error: 0.903 on 98 degrees of freedom
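Because the data are simulated, the true values b0 = 1 and b1 = 2 are known, so the fit can be checked against them. The sketch below regenerates the same data and asks whether the 95% confidence intervals contain the true parameters. The names ci, covers_b0, and covers_b1 are introduced here for illustration only.

```r
# Regenerate the simulated data with known parameters b0 = 1 and b1 = 2.
set.seed(1)
x <- seq(1, 5, length.out = 100)
noise <- rnorm(n = 100, mean = 0, sd = 1)
y <- 1 + 2 * x + noise

model <- lm(y ~ x)

# 95% confidence intervals for the intercept and the slope.
ci <- confint(model, level = 0.95)
print(ci)

# Does each interval contain the value used to generate the data?
covers_b0 <- ci["(Intercept)", 1] <= 1 && 1 <= ci["(Intercept)", 2]
covers_b1 <- ci["x", 1] <= 2 && 2 <= ci["x", 2]
print(c(b0 = covers_b0, b1 = covers_b1))
```

This is the main benefit of learning by simulation: an estimate can be compared with the truth, which is never possible with real data.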
Plot of fixed effect model
    plot(y~x)
    abline(model)
Mixed effect model
Random Intercept model
There are several people, indexed by i, and each person is measured repeatedly, with measurements indexed by j. The people differ from one another in ways we do not know, so there is a random effect caused by the person, plus a separate random noise caused by each measurement on that person.
Yij = b0 + b1*Xij + bi + eij
· b0: fixed intercept
· b1: fixed slope
· Xij: fixed effect
· bi: random effect (influences the intercept)
· eij: noise
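The reason a shared bi breaks independence can be written out in a short derivation. It assumes, as the simulation in these slides does but the model statement does not spell out, that bi and eij are independent normal draws with variances written here as sigma_b^2 and sigma_e^2, and it treats Xij as fixed.

```latex
\begin{align*}
Y_{ij} &= b_0 + b_1 X_{ij} + b_i + e_{ij},
  \qquad b_i \sim N(0, \sigma_b^2), \quad e_{ij} \sim N(0, \sigma_e^2) \\
\operatorname{Var}(Y_{ij}) &= \sigma_b^2 + \sigma_e^2 \\
\operatorname{Cov}(Y_{ij}, Y_{ik}) &= \sigma_b^2 \qquad (j \neq k,\ \text{same person}) \\
\operatorname{Cov}(Y_{ij}, Y_{i'k}) &= 0 \qquad (i \neq i',\ \text{different people}) \\
\rho &= \frac{\sigma_b^2}{\sigma_b^2 + \sigma_e^2}
\end{align*}
```

Two measurements on the same person are correlated with coefficient rho, while measurements on different people stay independent. For example, standard deviations of 10 for bi and 2 for eij give rho = 100/104, about 0.96.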
Data generation of Random Intercept model
    b0 <- 9.9
    b1 <- 2
    # number of repeated measurements for each of the 6 people
    n <- c(13, 14, 14, 15, 12, 13)
    npeople <- length(n)
    set.seed(1)
    # generate x (fixed effect)
    x <- matrix(rep(0, length = max(n) * npeople), ncol = npeople)
    for (i in 1:npeople){
      x[1:n[i], i] <- runif(n[i], min = 1, max = 5)
      x[1:n[i], i] <- sort(x[1:n[i], i])
    }
    # random effect
    bi <- rnorm(npeople, mean = 0, sd = 10)
Data generation of Random Intercept model
    xall <- NULL
    yall <- NULL
    peopleall <- NULL
    for (i in 1:npeople){
      xall <- c(xall, x[1:n[i], i]) # combine x
      # generate y
      y <- rep(b0 + bi[i], length = n[i]) +
        b1 * x[1:n[i], i] +
        rnorm(n[i], mean = 0, sd = 2) # noise
      yall <- c(yall, y) # combine y
      people <- rep(i, length = n[i])
      peopleall <- c(peopleall, people)
    }
    # final dataset
    data1 <- data.frame(yall, peopleall, xall)
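The loops above can also be written without the padded matrix and without growing vectors. The following is an alternative sketch of the same data-generating process, not the slides' own code; the name data1_vec is hypothetical. The key idea is a person index vector that maps each row to its person, so that bi[people] repeats each random intercept the right number of times.

```r
set.seed(1)
b0 <- 9.9
b1 <- 2
n <- c(13, 14, 14, 15, 12, 13)   # measurements per person
npeople <- length(n)

# Person index: 1 repeated n[1] times, 2 repeated n[2] times, and so on.
people <- rep(seq_len(npeople), times = n)

# x values, drawn and sorted separately within each person.
x <- unlist(lapply(n, function(k) sort(runif(k, min = 1, max = 5))))

# One random intercept per person, expanded to rows through the index.
bi <- rnorm(npeople, mean = 0, sd = 10)
y <- b0 + bi[people] + b1 * x + rnorm(sum(n), mean = 0, sd = 2)

data1_vec <- data.frame(yall = y, peopleall = people, xall = x)
str(data1_vec)
```

Indexing with bi[people] makes the structure of the model visible in a single line: the fixed part is shared by everyone and the intercept shift depends only on the person.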
Coefficient estimation of Random Intercept model
    library(nlme)
    # xall is the fixed effect
    # bi influences the intercept of the model
    lme1 <- lme(yall~xall,random=~1|peopleall,data=data1)
    summary(lme1)

    Linear mixed-effects model fit by REML
     Data: data1
      AIC BIC logLik
      358 368   -175

    Random effects:
     Formula: ~1 | peopleall
            (Intercept) Residual
    StdDev:         7.3     1.77

    Fixed effects: yall ~ xall
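The pieces of a fitted lme object can be pulled out one by one instead of reading them from the summary. The sketch below rebuilds a data set shaped like data1 in compact form (the data frame d and the other object names are assumptions made for this example) and uses the nlme extractors fixef, ranef, and VarCorr.

```r
library(nlme)

# Rebuild a data set shaped like data1: one random intercept per person.
set.seed(1)
n <- c(13, 14, 14, 15, 12, 13)
people <- rep(seq_along(n), times = n)
x <- unlist(lapply(n, function(k) sort(runif(k, min = 1, max = 5))))
bi <- rnorm(length(n), mean = 0, sd = 10)
y <- 9.9 + bi[people] + 2 * x + rnorm(sum(n), mean = 0, sd = 2)
d <- data.frame(yall = y, peopleall = people, xall = x)

fit <- lme(yall ~ xall, random = ~ 1 | peopleall, data = d)

fixed_est <- fixef(fit)    # estimates of b0 and b1
person_dev <- ranef(fit)   # predicted bi, one row per person
print(fixed_est)
print(person_dev)
print(VarCorr(fit))        # variance and sd of the intercept and the residual
```

Comparing person_dev with the simulated bi shows how well the model recovers each person's shift. With only 6 people, the estimated standard deviation of the intercepts is itself quite uncertain.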
Plot of Random Intercept model
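The plotting code for this slide is not part of the transcript. One way such a plot could be drawn, as a sketch under assumptions rather than the original figure, is to colour the points by person, add one fitted line per person from coef(), and overlay the population line from fixef(). The data frame d is rebuilt here so the example is self-contained.

```r
library(nlme)

# Rebuild a data set shaped like data1.
set.seed(1)
n <- c(13, 14, 14, 15, 12, 13)
people <- rep(seq_along(n), times = n)
x <- unlist(lapply(n, function(k) sort(runif(k, min = 1, max = 5))))
bi <- rnorm(length(n), mean = 0, sd = 10)
y <- 9.9 + bi[people] + 2 * x + rnorm(sum(n), mean = 0, sd = 2)
d <- data.frame(yall = y, peopleall = people, xall = x)

fit <- lme(yall ~ xall, random = ~ 1 | peopleall, data = d)

# Per-person coefficients: fixed effect plus that person's random effect.
person_lines <- coef(fit)

plot(yall ~ xall, data = d, col = d$peopleall, pch = 19)
for (id in rownames(person_lines)) {
  abline(a = person_lines[id, "(Intercept)"],
         b = person_lines[id, "xall"],
         col = as.integer(id))
}
# Population-level line, drawn thick and dashed.
abline(a = fixef(fit)[["(Intercept)"]], b = fixef(fit)[["xall"]],
       lwd = 3, lty = 2)
```

In a random intercept model all per-person lines are parallel: they share the slope and differ only in height.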
Random Intercept and slope model
Yij = b0 + (b1+si)*Xij + bi + eij
· b0: fixed intercept
· b1: fixed slope
· Xij: fixed effect
· bi: random effect (influences the intercept)
· eij: noise
· si: random effect (influences the slope)
Data generation of Random Intercept and slope model
    a0 <- 9.9
    a1 <- 2
    n <- c(12, 13, 14, 15, 16, 13)
    npeople <- length(n)
    set.seed(1)
    si <- rnorm(npeople, mean = 0, sd = 0.5) # random slope
    x <- matrix(rep(0, length = max(n) * npeople), ncol = npeople)
    for (i in 1:npeople){
      x[1:n[i], i] <- runif(n[i], min = 1, max = 5)
      x[1:n[i], i] <- sort(x[1:n[i], i])
    }
Data generation of Random Intercept and slope model
    bi <- rnorm(npeople, mean = 0, sd = 10) # random intercept
    xall <- NULL
    yall <- NULL
    peopleall <- NULL
    for (i in 1:npeople){
      xall <- c(xall, x[1:n[i], i])
      y <- rep(a0 + bi[i], length = n[i]) +
        (a1 + si[i]) * x[1:n[i], i] +
        rnorm(n[i], mean = 0, sd = 0.5)
      yall <- c(yall, y)
      people <- rep(i, length = n[i])
      peopleall <- c(peopleall, people)
    }
    # generate final dataset
    data2 <- data.frame(yall, peopleall, xall)
Coefficient estimation of Random Intercept and slope model
    # bi influences the intercept and si influences the slope of the model
    lme2 <- lme(yall~xall,random=~1+xall|peopleall,data=data2)
    print(summary(lme2))

    Linear mixed-effects model fit by REML
     Data: data2
      AIC BIC logLik
      179 194  -83.6

    Random effects:
     Formula: ~1 + xall | peopleall
     Structure: General positive-definite, Log-Cholesky parametrization
                StdDev Corr
    (Intercept) 11.593 (Intr)
    xall         0.464 0.044
    Residual     0.445
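A natural follow-up question is whether the random slope is needed at all. One common approach, sketched here and not shown in the slides, is to fit the random intercept model and the random intercept and slope model to the same data and compare them with anova(). The data frame d is a compact rebuild of a data set shaped like data2; the object names are assumptions for this example.

```r
library(nlme)

# Rebuild a data set shaped like data2: random intercept and random slope.
set.seed(1)
n <- c(12, 13, 14, 15, 16, 13)
people <- rep(seq_along(n), times = n)
si <- rnorm(length(n), mean = 0, sd = 0.5)            # random slopes
x <- unlist(lapply(n, function(k) sort(runif(k, min = 1, max = 5))))
bi <- rnorm(length(n), mean = 0, sd = 10)             # random intercepts
y <- 9.9 + bi[people] + (2 + si[people]) * x +
  rnorm(sum(n), mean = 0, sd = 0.5)
d <- data.frame(yall = y, peopleall = people, xall = x)

# Same fixed effects, different random-effect structures.
fit_int <- lme(yall ~ xall, random = ~ 1 | peopleall, data = d)
fit_slope <- lme(yall ~ xall, random = ~ 1 + xall | peopleall, data = d)

# Likelihood ratio comparison of the two fits.
cmp <- anova(fit_int, fit_slope)
print(cmp)
```

Both fits use REML with identical fixed effects, so their likelihoods are comparable. Because a variance is being tested at the boundary of its range, the reported p-value tends to be conservative and is best read as a rough guide.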
Plot of Random Intercept and slope model
What if we just use linear model
· complete pooling
    # wrong estimation
    lm1 <- lm(yall~xall,data=data2)
    summary(lm1)

    Call:
    lm(formula = yall ~ xall, data = data2)

    Residuals:
       Min     1Q Median     3Q    Max
    -17.80  -6.27  -3.67   2.19  24.33

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)     6.86       3.72    1.84  0.06874 .
    xall            4.31       1.15    3.76  0.00032 ***
    ---
What if we just use linear model
· no pooling
    # wrong estimation: it wastes too many degrees of freedom, and we don't care about the exact differences between people. we jus
    lm2 <- lm(yall~xall+factor(peopleall)+xall*factor(peopleall),data=data1)
    summary(lm2)

    Call:
    lm(formula = yall ~ xall + factor(peopleall) + xall * factor(peopleall), data = data1)

    Residuals:
       Min     1Q Median     3Q    Max
    -2.983 -1.194  0.054  1.092  4.238

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept)   18.818      1.342   14.02  < 2e-16 ***
    xall           0.929      0.413    2.25    0.028 *
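The three strategies can be put side by side on one data set. The sketch below uses a compact rebuild of a data set shaped like data1 (the names d, pooled, unpooled, mixed, and slopes are assumptions for this example) and reports, for each model, the slope estimate and the number of mean-structure coefficients it has to estimate.

```r
library(nlme)

# Rebuild a data set shaped like data1: true slope 2, random intercepts.
set.seed(1)
n <- c(13, 14, 14, 15, 12, 13)
people <- rep(seq_along(n), times = n)
x <- unlist(lapply(n, function(k) sort(runif(k, min = 1, max = 5))))
bi <- rnorm(length(n), mean = 0, sd = 10)
y <- 9.9 + bi[people] + 2 * x + rnorm(sum(n), mean = 0, sd = 2)
d <- data.frame(yall = y, peopleall = people, xall = x)

pooled <- lm(yall ~ xall, data = d)                        # complete pooling
unpooled <- lm(yall ~ xall * factor(peopleall), data = d)  # no pooling
mixed <- lme(yall ~ xall, random = ~ 1 | peopleall, data = d)

slopes <- data.frame(
  model = c("complete pooling", "no pooling (person 1 only)", "mixed"),
  slope = c(coef(pooled)[["xall"]],
            coef(unpooled)[["xall"]],
            fixef(mixed)[["xall"]]),
  n_coef = c(length(coef(pooled)),
             length(coef(unpooled)),
             length(fixef(mixed)))
)
print(slopes)
```

In the no pooling fit, the xall coefficient is the slope of the first person only, and the model spends 12 coefficients on 6 people. The mixed model keeps 2 fixed coefficients and describes the differences between people with a single variance.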
General Mixed effect model
Logistic Mixed effect model
P(Yij = 1) = exp(eta_ij)/(1+exp(eta_ij))
eta_ij = b0 + b1*Xij + bi
· b0: fixed intercept
· b1: fixed slope
· Xij: fixed effect
· bi: random effect (influences the intercept)
· Yij: binary response; the noise comes from drawing Yij as 0 or 1 with this probability, so eta_ij has no separate eij term
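The transformation exp(eta)/(1+exp(eta)) is the inverse logit, which base R provides as plogis(); its inverse is qlogis(). A small check, with eta values chosen arbitrarily for illustration:

```r
# A few values of the linear predictor, chosen only for illustration.
eta <- c(-6, -2, 0, 2, 6)

# Inverse logit written out by hand, and the built-in equivalent.
p_manual <- exp(eta) / (1 + exp(eta))
p_builtin <- plogis(eta)
print(round(p_builtin, 4))

# The two agree, and qlogis() maps the probabilities back to eta.
print(all.equal(p_manual, p_builtin))
print(all.equal(qlogis(p_builtin), eta))
```

Using plogis() also avoids overflow when eta is very large, where exp(eta) computed by hand would become Inf.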
Data generation of Logistic Mixed effect model
    b0 <- -6
    b1 <- 2.1
    set.seed(1)
    n <- c(12, 13, 14, 15, 16, 13)
    npeople <- length(n)
    x <- matrix(rep(0, length = max(n) * npeople), ncol = npeople)
    bi <- rnorm(npeople, mean = 0, sd = 1.5)
    for (i in 1:npeople){
      x[1:n[i], i] <- runif(n[i], min = 1, max = 5)
      x[1:n[i], i] <- sort(x[1:n[i], i])
    }
Data generation of Logistic Mixed effect model
    xall <- NULL
    yall <- NULL
    peopleall <- NULL
    for (i in 1:npeople){
      xall <- c(xall, x[1:n[i], i])
      y <- NULL
      for(j in 1:n[i]){
        eta1 <- b0 + b1 * x[j, i] + bi[i]
        y <- c(y, rbinom(n = 1, size = 1, prob = exp(eta1)/(exp(eta1) + 1)))
      }
      yall <- c(yall, y)
      people <- rep(i, length = n[i])
      peopleall <- c(peopleall, people)
    }
    data3 <- data.frame(xall, peopleall, yall)
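The nested loops above draw one Bernoulli value at a time. Since rbinom() accepts a vector of probabilities, the same process can be sketched in vectorized form. This is an alternative written for illustration, with the hypothetical name data3_vec, and it is not claimed to reproduce the exact random draws of the loop version.

```r
set.seed(1)
b0 <- -6
b1 <- 2.1
n <- c(12, 13, 14, 15, 16, 13)   # measurements per person
npeople <- length(n)

# Person index and one random intercept per person.
people <- rep(seq_len(npeople), times = n)
bi <- rnorm(npeople, mean = 0, sd = 1.5)

# x values, drawn and sorted separately within each person.
x <- unlist(lapply(n, function(k) sort(runif(k, min = 1, max = 5))))

# Linear predictor and success probability for every row at once.
eta <- b0 + b1 * x + bi[people]
prob <- plogis(eta)

# One 0/1 draw per row, each with its own probability.
y <- rbinom(length(prob), size = 1, prob = prob)

data3_vec <- data.frame(xall = x, peopleall = people, yall = y)
print(table(data3_vec$peopleall, data3_vec$yall))
```

The only randomness in y beyond bi is the Bernoulli draw itself, which is why the linear predictor carries no additional noise term.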
Coefficient estimation of Logistic Mixed effect model
    library(lme4)
    # the formula syntax is different from lme
    lmer3 <- glmer(yall~xall+(1|peopleall),data=data3,family=binomial)
    print(summary(lmer3))

    Generalized linear mixed model fit by maximum likelihood ['glmerMod']
     Family: binomial ( logit )
    Formula: yall ~ xall + (1 | peopleall)
       Data: data3

         AIC      BIC   logLik deviance
        69.8     77.1    -31.9     63.8

    Random effects:
     Groups    Name        Variance Std.Dev.
     peopleall (Intercept) 3.94     1.98
    Number of obs: 83, groups: peopleall, 6
Plot of Logistic Mixed effect model
Case study