Lecture slides stats1.13.l23.air

Post on 29-Jun-2015

Statistics One

Lecture 23 Generalized Linear Model

Two segments

•  Overview
•  Examples

Lecture 23 ~ Segment 1

Generalized Linear Model Overview

Generalized Linear Model

•  An extension of the General Linear Model that allows for non-normal distributions in the outcome variable and therefore also allows testing of non-linear relationships between a set of predictors and the outcome variable

Generalized Linear Model

•  Generalized Linear Model: GLM*
•  General Linear Model: GLM

General Linear Model (GLM)

•  GLM is the mathematical framework used in many common statistical analyses, including multiple regression and ANOVA

Characteristics of GLM

•  Linear: pairs of variables are assumed to have linear relations

•  Additive: if one set of variables predicts another variable, the effects are thought to be additive

Characteristics of GLM

•  BUT! This does not preclude testing non-linear or non-additive effects

Characteristics of GLM

•  GLM can accommodate such tests, for example, by

•  Transformation of variables
   –  Transform a variable so a non-linear relation becomes linear

•  Moderation analysis
   –  Fake the GLM into testing non-additive effects (for example, by adding a product term)
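
The transformation idea can be sketched in a few lines of Python. The data below are made up and follow an exponential curve exactly (an assumption for illustration, not something from the lecture). The point is only that the slope of Y against X changes with X, while the slope of ln(Y) against X stays constant, so the transformed variable fits a linear model.

```python
import math

# Hypothetical data following Y = 2 * exp(0.5 * X) exactly (made up for illustration).
xs = [0, 1, 2, 3, 4]
ys = [2 * math.exp(0.5 * x) for x in xs]


def slopes(x_vals, y_vals):
    """Slope between each pair of adjacent points."""
    return [
        (y_vals[i + 1] - y_vals[i]) / (x_vals[i + 1] - x_vals[i])
        for i in range(len(x_vals) - 1)
    ]


raw_slopes = slopes(xs, ys)                         # grows with X: not linear
log_slopes = slopes(xs, [math.log(y) for y in ys])  # constant: linear in X

print([round(s, 3) for s in raw_slopes])
print([round(s, 3) for s in log_slopes])
```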

GLM example

•  Simple regression
•  Y = B0 + B1X1 + e

•  Y = faculty salary
•  X1 = years since PhD

GLM example

•  Multiple regression
•  Y = B0 + B1X1 + B2X2 + B3X3 + e

•  Y = faculty salary
•  X1 = years since PhD
•  X2 = number of publications
•  X3 = (years × pubs)
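
To see why the product term X3 makes the model non-additive, here is a minimal Python sketch. The coefficient values and the function names are hypothetical, chosen only for illustration; they are not estimates from any data. With B3 in the model, the predicted effect of one more year since PhD is B1 + B3 × pubs, so it depends on the number of publications.

```python
def predicted_salary(years, pubs, b0, b1, b2, b3):
    """Y-hat = B0 + B1*X1 + B2*X2 + B3*X3, where X3 = X1 * X2 (the product term)."""
    x3 = years * pubs
    return b0 + b1 * years + b2 * pubs + b3 * x3


# Hypothetical coefficients, chosen only for illustration (not estimates from data).
coefs = dict(b0=50000.0, b1=1000.0, b2=500.0, b3=20.0)


def years_effect(pubs):
    """Change in predicted salary for one extra year since PhD, at a given pubs."""
    return predicted_salary(11, pubs, **coefs) - predicted_salary(10, pubs, **coefs)


print(years_effect(0))   # B1 alone
print(years_effect(50))  # B1 + B3 * 50
```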

Generalized linear model (GLM*)

•  Appropriate when simple transformations or product terms are not sufficient

Generalized linear model (GLM*)

•  The “linear” model is allowed to generalize to other forms by adding a “link function”

Generalized linear model (GLM*)

•  For example, in binary logistic regression, the logit function was the link function

Binary logistic regression

•  ln(Ŷ / (1 - Ŷ)) = B0 + Σ(BkXk)

Ŷ = predicted value on the outcome variable Y
B0 = predicted value on Y when all X = 0
Xk = predictor variables
Bk = unstandardized regression coefficients
(Y – Ŷ) = residual (prediction error)
k = the number of predictor variables
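
As a rough sketch of how the logit link works, the Python below converts between the probability scale and the log-odds scale. It assumes Ŷ is read as the predicted probability of the outcome. The coefficient and predictor values are hypothetical, as are the function names.

```python
import math


def logit(p):
    """Link function: maps a probability in (0, 1) to the log-odds scale."""
    return math.log(p / (1 - p))


def inverse_logit(eta):
    """Inverse link: maps a linear predictor back to a probability."""
    return 1 / (1 + math.exp(-eta))


def predict_probability(b0, coefs, xs):
    """Y-hat from ln(Y-hat / (1 - Y-hat)) = B0 + sum(Bk * Xk)."""
    eta = b0 + sum(b * x for b, x in zip(coefs, xs))
    return inverse_logit(eta)


# Hypothetical coefficients and predictor values, for illustration only.
p = predict_probability(b0=-1.0, coefs=[0.8, -0.3], xs=[2.0, 1.0])
print(round(p, 3))
```

The right-hand side of the equation stays linear in the predictors; only the scale of the left-hand side changes, which is what the link function contributes.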

Segment summary

•  GLM* is an extension of GLM that allows for non-normal distributions in the outcome variable and therefore also allows testing of non-linear relationships between a set of predictors and the outcome variable

Segment summary

•  Appropriate when simple transformations or product terms are not sufficient

Segment summary

•  The “linear” model is allowed to generalize to other forms by adding a “link function”

END SEGMENT

Lecture 23 ~ Segment 2

Generalized Linear Model Examples

GLM* Examples

•  GLM* is an extension of GLM that allows for non-normal distributions in the outcome variable and therefore also allows testing of non-linear relationships between a set of predictors and the outcome variable

GLM* Examples

•  Appropriate when simple transformations or product terms are not sufficient

GLM* Examples

•  The “linear” model is allowed to generalize to other forms by adding a “link function”

GLM* Examples

•  Binary logistic regression

Binary logistic regression

GLM* Examples

•  In binary logistic regression, the logit function served as the link function

GLM* Examples

•  ln(Ŷ / (1 - Ŷ)) = B0 + Σ(BkXk)

Ŷ = predicted value on the outcome variable Y
B0 = predicted value on Y when all X = 0
Xk = predictor variables
Bk = unstandardized regression coefficients
(Y – Ŷ) = residual (prediction error)
k = the number of predictor variables

GLM* Examples

•  More than 2 categories on the outcome
   –  Multinomial logistic regression
•  A-1 logistic regression equations are formed
   –  Where A = # of groups
   –  One group serves as reference group
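
One way to picture the A-1 equations is the Python sketch below. It assumes each equation gives the log-odds of one non-reference group against the reference group, whose own linear predictor is fixed at 0. The input values and the function name are hypothetical.

```python
import math


def multinomial_probabilities(linear_predictors):
    """Probabilities for A groups from A-1 linear predictors.

    Each entry is ln(P(group a) / P(reference group)) for one non-reference
    group; the reference group's linear predictor is fixed at 0.
    Returns [P(reference), P(group 1), ..., P(group A-1)].
    """
    exps = [1.0] + [math.exp(eta) for eta in linear_predictors]  # exp(0) = 1
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical values of the A-1 = 2 linear predictors for one case (A = 3 groups).
probs = multinomial_probabilities([0.5, -1.0])
print([round(p, 3) for p in probs])
```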

GLM* Examples

•  Another example is Poisson regression
•  Poisson distributions are common with “count data”
   –  A number of events occurring in a fixed interval of time

GLM* Examples

•  For example, the number of traffic accidents as a function of weather conditions
   –  Clear weather
   –  Rain
   –  Snow

Poisson regression

GLM* Examples

•  In Poisson regression, the log function serves as the link function

•  Note: this example also has a categorical predictor and would therefore also require dummy coding
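
The traffic-accident example can be sketched in Python as follows. The counts are made up, and the choice of clear weather as the reference group is an assumption. The sketch also relies on a shortcut that holds only when a single categorical predictor is the whole model: the fitted Poisson means then equal the group means, so the coefficients can be read off the logs of those means without an iterative fit.

```python
import math

# Hypothetical accident counts per day, grouped by weather (made-up numbers).
counts = {
    "clear": [2, 1, 3, 2],
    "rain": [4, 5, 3, 4],
    "snow": [7, 9, 8, 8],
}


def dummy_code(weather):
    """Two dummy variables; clear weather is the reference group (0, 0)."""
    return (1 if weather == "rain" else 0, 1 if weather == "snow" else 0)


# With one categorical predictor as the whole model, the fitted Poisson means
# equal the group means, so the coefficients follow from the logs of the means.
means = {w: sum(ys) / len(ys) for w, ys in counts.items()}
b0 = math.log(means["clear"])
b1 = math.log(means["rain"]) - b0
b2 = math.log(means["snow"]) - b0


def predicted_count(weather):
    """Y-hat from ln(Y-hat) = B0 + B1*rain + B2*snow (log link)."""
    rain, snow = dummy_code(weather)
    return math.exp(b0 + b1 * rain + b2 * snow)


for w in counts:
    print(w, round(predicted_count(w), 2))
```

Because of the log link, exp(B1) and exp(B2) are rate ratios: the factor by which the expected count changes relative to clear weather.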

Segment summary

•  GLM* is an extension of GLM that allows for non-normal distributions in the outcome variable and therefore also allows testing of non-linear relationships between a set of predictors and the outcome variable

Segment summary

•  Appropriate when simple transformations or product terms are not sufficient

Segment summary

•  The “linear” model is allowed to generalize to other forms by adding a “link function”

END SEGMENT

END LECTURE 23