
Generalized Linear Models

Date post: 08-Jan-2016
Upload: danil

Generalized Linear Models

• All the regression models treated so far share a common structure, which can be split into two parts: the random part (the distribution assumed for the response) and the systematic part (how the covariates enter the mean of the response).

• These two elements are the basic building blocks of generalized linear models.

The systematic part

• Generalized linear model, systematic part: the covariates influence the distribution of the response through the linear predictor $\eta_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip}$.

• A link function $g$ links the expectation $\mu_i = E(Y_i)$ to the linear predictor: $g(\mu_i) = \eta_i$.
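As a minimal sketch of the systematic part, the linear predictor can be built from a design matrix and mapped to the mean through the inverse link. The toy data, the coefficient values and the choice of a log link are assumptions made only for illustration.

```r
# Hypothetical toy data: one numeric covariate and a two-level factor.
dat <- data.frame(x = c(0.5, 1.0, 1.5, 2.0),
                  g = factor(c("a", "a", "b", "b")))

# Assumed coefficients, in the order of the design-matrix columns:
# (Intercept), x, gb
beta <- c(-1, 0.8, 0.5)

X   <- model.matrix(~ x + g, data = dat)  # design matrix
eta <- drop(X %*% beta)                   # linear predictor
mu  <- exp(eta)                           # mean under a log link: log(mu) = eta

print(cbind(dat, eta = eta, mu = mu))
```

The covariates enter only through `eta`; changing the link changes how `eta` is translated into a mean, not how the covariates are combined.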

The generalization from linear models to GLM

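• GLMs are a generalization of linear normal models in two directions: the response may follow any distribution from the exponential family, not only the normal, and the mean is related to the linear predictor through a link function that need not be the identity.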

Example: binomial distribution

• Definition: the binomial distribution is the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p.

Example

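• For the binomial distribution, $Y \sim \mathrm{Bin}(n, p)$, the mean is $\mu = E(Y) = np$.

• The variance is a function of the mean: $\mathrm{Var}(Y) = np(1-p) = \mu(1 - \mu/n)$.

• The linear model for the logit, $\log\frac{p}{1-p} = \eta$, is a non-linear model for the probability, $p = \frac{e^{\eta}}{1+e^{\eta}}$, where $\eta$ is the linear predictor.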
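The two points above can be checked numerically. The sketch below uses assumed values of n, p and the linear predictor; `plogis` and `qlogis` are the inverse logit and logit in base R.

```r
# Binomial variance written as a function of the mean (assumed n and p).
n  <- 10
p  <- c(0.1, 0.5, 0.9)
mu <- n * p                 # mean
v  <- mu * (1 - mu / n)     # variance as a function of the mean

# Equally spaced values on the logit scale ...
eta  <- seq(-4, 4, by = 2)
prob <- plogis(eta)         # ... are not equally spaced on the probability scale
print(rbind(eta = eta, prob = prob))

# The logit maps the probabilities back to the linear scale.
eta_back <- qlogis(prob)
```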

The exponential family

• Many distributions encountered in practice (e.g., the normal, binomial, Poisson and Gamma distributions) share a common structure:
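One common way to write this shared structure is shown below. The notation is an assumption, chosen to agree with the function b referred to later: θ is the natural parameter, φ a dispersion parameter, and a, b, c are known functions.

```latex
f(y;\theta,\phi) = \exp\left\{ \frac{y\theta - b(\theta)}{a(\phi)} + c(y,\phi) \right\}
```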

Example of the exponential family: Normal distribution
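A sketch of the normal case, assuming the usual exponential-family notation with natural parameter θ, dispersion term a(φ) and functions b and c: expanding the square in the density gives

```latex
f(y;\mu,\sigma^2)
  = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(y-\mu)^2}{2\sigma^2} \right\}
  = \exp\left\{ \frac{y\mu - \mu^2/2}{\sigma^2}
      - \frac{y^2}{2\sigma^2} - \frac{1}{2}\log(2\pi\sigma^2) \right\}
```

so that θ = μ, b(θ) = θ²/2, a(φ) = σ² and c(y, φ) = −y²/(2σ²) − ½ log(2πσ²).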

Example of the exponential family: Binomial
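A sketch of the binomial case, again assuming the notation θ, b, a(φ), c for the exponential family: for Y ~ Bin(n, p),

```latex
f(y;p)
  = \binom{n}{y} p^{y} (1-p)^{n-y}
  = \exp\left\{ y \log\frac{p}{1-p} + n\log(1-p) + \log\binom{n}{y} \right\}
```

so that θ = log(p/(1−p)) is the logit, b(θ) = n log(1 + e^θ), a(φ) = 1 and c(y, φ) = log C(n, y).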

Example of the exponential family

• The Poisson distribution: a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed period of time, if these events occur with a known average rate and independently of the time since the last event.

• Examples: the number of phone calls received by a telephone operator in a 10-minute period; the number of typos per page made by a secretary.

Poisson distribution

• The Poisson distribution belongs to the exponential family:
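To see the membership, the Poisson probability function with rate λ can be rewritten as follows (a sketch, assuming the notation θ, b, a(φ), c for the exponential family):

```latex
f(y;\lambda)
  = \frac{\lambda^{y} e^{-\lambda}}{y!}
  = \exp\left\{ y\log\lambda - \lambda - \log y! \right\}
```

so that θ = log λ, b(θ) = e^θ, a(φ) = 1 and c(y, φ) = −log y!.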

Mean and variance in the exponential family

• It can be shown that the mean and variance in the exponential family are:
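In the notation assumed here (natural parameter θ, dispersion term a(φ), cumulant function b), the standard result is:

```latex
E(Y) = \mu = b'(\theta), \qquad
\operatorname{Var}(Y) = a(\phi)\, b''(\theta)
```

Writing θ as a function of μ turns b''(θ) into the variance function V(μ), so the variance is a(φ) V(μ).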

Mean and variance example: Poisson

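• For the Poisson model, $b(\theta) = e^{\theta}$ with $\theta = \log\lambda$, so the mean and variance are $E(Y) = b'(\theta) = \lambda$ and $\mathrm{Var}(Y) = b''(\theta) = \lambda$.

• To summarize, for any given distribution we obtain a specific form of b, which in turn determines the variance function.

• The converse is also true: a given variance function determines b, and hence the distribution.

• Hence specifying a distribution and specifying a variance function are two sides of the same coin, as long as we work with exponential families.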
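The equality of mean and variance for the Poisson model can be checked directly from the probability function. This is a small numerical sketch with an assumed rate of 4; the sum is truncated at 200, where the remaining probability mass is negligible.

```r
lambda <- 4                    # assumed rate
y <- 0:200                     # truncated support
p <- dpois(y, lambda)          # Poisson probabilities

m <- sum(y * p)                # mean
v <- sum((y - m)^2 * p)        # variance

print(c(mean = m, variance = v))
```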

Various variance functions
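The variance functions of the standard families can be inspected in R, where each family object carries a `variance` component. The sketch below evaluates them at a few assumed values of the mean; it shows R's built-in families and is not necessarily the table from the original slide.

```r
mu <- c(0.2, 0.5, 0.8)   # assumed values of the mean

variance_table <- data.frame(
  mu               = mu,
  gaussian         = gaussian()$variance(mu),          # constant, 1
  binomial         = binomial()$variance(mu),          # mu * (1 - mu)
  poisson          = poisson()$variance(mu),           # mu
  Gamma            = Gamma()$variance(mu),             # mu^2
  inverse.gaussian = inverse.gaussian()$variance(mu)   # mu^3
)
print(variance_table)
```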

The link function

• The link function is a function which relates the mean to the linear predictor:

• Various link functions have been illustrated so far:
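As an illustration, base R exposes link functions through `make.link`, which returns the link (`linkfun`) and its inverse (`linkinv`). The selection of links and the values of the mean below are assumptions for the example.

```r
mu <- c(0.1, 0.25, 0.5, 0.9)   # assumed means, all inside (0, 1)
link_names <- c("identity", "log", "logit", "probit", "cloglog", "inverse")

# Linear predictor implied by each link: eta = g(mu); one column per link.
eta_by_link <- sapply(link_names, function(nm) make.link(nm)$linkfun(mu))
print(round(eta_by_link, 3))

# Applying the inverse link recovers the mean.
round_trip <- sapply(link_names, function(nm) {
  lk <- make.link(nm)
  lk$linkinv(lk$linkfun(mu))
})
```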

Canonical link

• For each distribution there is a specific link function which yields “nice” mathematical and numerical properties in connection with the estimation process. This link function is called the canonical link:
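In R, the default link of each built-in family is its canonical link, which can be read off the family objects. This sketch covers R's standard families only.

```r
fams <- list(
  gaussian         = gaussian(),
  binomial         = binomial(),
  poisson          = poisson(),
  Gamma            = Gamma(),
  inverse.gaussian = inverse.gaussian()
)

# Name of the default link stored in each family object.
default_links <- sapply(fams, function(f) f$link)
print(default_links)
```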

Specification of GLM

• In practice, a GLM is specified in three steps: the distribution of the response (equivalently, the variance function), the link function, and the linear predictor.

• Note that most statistical packages use the canonical link function by default unless another one is explicitly provided.

R code

• The glm function in R is used for fitting generalized linear models.

• Specification of the linear predictor: through a model formula (response ~ covariates).

• Specification of the distribution and the link function: through the family argument, e.g.

    family = Gamma(link = "log")
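A minimal sketch of a complete call, using simulated data: the sample size, the coefficients 1 and 2, and the Gamma shape 5 are all assumptions chosen for illustration.

```r
set.seed(1)
n  <- 500
x  <- runif(n)
mu <- exp(1 + 2 * x)                        # true mean under a log link
y  <- rgamma(n, shape = 5, rate = 5 / mu)   # Gamma response with mean mu

# Linear predictor via the formula, distribution and link via family.
fit <- glm(y ~ x, family = Gamma(link = "log"))
print(coef(fit))                            # estimates on the log scale
```

The estimated coefficients should lie close to the values used in the simulation.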

• Remember that the specification of a distribution yields a specific variance function. Not all possible combinations of a distribution and a link function are allowed in R.

Special aspects for binomial data

• Simulate artificial Bernoulli observations with different event probabilities for two groups (the number of trials N is equal to 1):

R code:

    group <- rep(c("A", "B"), c(30, 45))              # 30 observations in A, 45 in B
    logit.pi <- ifelse(group == "B", 0.7, 0.7 + 0.5)  # log-odds: 0.7 in B, 1.2 in A
    group <- factor(group)
    pi <- plogis(logit.pi)                            # event probabilities (masks the constant pi)
    N <- rep(1, length(group))                        # one trial per observation
    events <- rbinom(length(group), size = N, prob = pi)
    dat <- data.frame(group, N, events)

Analysis of simulated data

• Model: $\operatorname{logit}(\pi_i) = \beta_0 + \beta_1 \, \mathbb{1}\{\text{group}_i = \text{B}\}$

• The response is a two-column matrix containing events and non-events:

    f1 <- glm(cbind(events, N - events) ~ group, family = binomial, data = dat)

• Define proportions:

    dat$prop <- with(dat, events / N)

and use these as the response, with the number of trials N as weights in the fit:

    f2 <- glm(prop ~ group, family = binomial, weights = N, data = dat)

• Use the number of events directly as the response (valid only because N = 1 here, so that events is 0 or 1):

    f3 <- glm(events ~ group, family = binomial, data = dat)
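The three specifications describe the same model, so they should give the same estimates. The sketch below repeats the simulation in self-contained form and compares the coefficients; the seed and the name `prob_i` (used instead of `pi` to avoid masking the built-in constant) are assumptions of this example.

```r
set.seed(1)
group    <- rep(c("A", "B"), c(30, 45))
logit_pi <- ifelse(group == "B", 0.7, 0.7 + 0.5)
group    <- factor(group)
prob_i   <- plogis(logit_pi)
N        <- rep(1, length(group))
events   <- rbinom(length(group), size = N, prob = prob_i)
dat      <- data.frame(group, N, events)
dat$prop <- with(dat, events / N)

f1 <- glm(cbind(events, N - events) ~ group, family = binomial, data = dat)
f2 <- glm(prop ~ group, family = binomial, weights = N, data = dat)
f3 <- glm(events ~ group, family = binomial, data = dat)

# One column of coefficients per specification.
print(cbind(f1 = coef(f1), f2 = coef(f2), f3 = coef(f3)))
```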

Fitting GLMs: logistic regression

• Consider a data set where the response variable takes only the values 0 or 1 and the single covariate is continuous.

• If we apply a simple linear regression model, $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, to such data, there are some problems: for instance, the fitted values are not restricted to the interval [0, 1], and the error variance cannot be constant.

• Conclusion: simple linear regression is not appropriate for modelling data with binary responses.
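One of the problems can be seen on a small constructed data set. The data below are an assumption made for illustration: the response is 1 exactly when the covariate is positive.

```r
# Binary response, continuous covariate (constructed toy data).
x <- seq(-3, 3, by = 0.5)
y <- as.numeric(x > 0)

lin <- lm(y ~ x)               # simple linear regression on 0/1 data
fitted_values <- fitted(lin)

# Some fitted "probabilities" fall outside [0, 1].
print(range(fitted_values))
```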

Logistic regression

• The solution is to use the logistic function:

• The formal definition of the logistic model for a binary response with p covariates:
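In standard notation (an assumption here: π(x) denotes the success probability, to keep it apart from the number of covariates p), the logistic function and the resulting model can be written as:

```latex
h(\eta) = \frac{e^{\eta}}{1 + e^{\eta}}, \qquad
\pi(x) = P(Y = 1 \mid x)
       = \frac{\exp(\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p)}
              {1 + \exp(\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p)}
```

Equivalently, logit π(x) = log[π(x) / (1 − π(x))] = β₀ + β₁x₁ + … + βₚxₚ, which is linear in the covariates.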

Logistic regression

• How to interpret the model?

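• In the logistic model, the odds of “success” are $\frac{\pi(x)}{1-\pi(x)} = \exp(\beta_0 + \beta_1 x_1 + \dots + \beta_p x_p)$, where $\pi(x) = P(Y = 1 \mid x)$.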

• The logistic model for binary data can be slightly modified
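For the odds interpretation, a short numerical sketch with one covariate and assumed coefficients shows that increasing the covariate by one unit multiplies the odds by exp(β₁), whatever the starting value.

```r
beta0 <- -1     # assumed intercept
beta1 <- 0.7    # assumed slope

# Odds of success at covariate value x under the logistic model.
odds <- function(x) {
  p <- plogis(beta0 + beta1 * x)
  p / (1 - p)
}

odds_ratio_a <- odds(3) / odds(2)     # one-unit increase from 2 to 3
odds_ratio_b <- odds(-1) / odds(-2)   # one-unit increase from -2 to -1
print(c(odds_ratio_a, odds_ratio_b, exp(beta1)))
```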

Modified to cover binomial data

Bernoulli and Poisson distribution

• Likelihood:

• MLE estimates:
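A sketch for the simplest setting, assuming an i.i.d. sample y₁, …, yₙ without covariates (so a single parameter p or λ is estimated):

```latex
\text{Bernoulli:}\quad
L(p) = \prod_{i=1}^{n} p^{y_i} (1-p)^{1-y_i}, \qquad
\ell(p) = \sum_{i=1}^{n} y_i \log p + \Bigl(n - \sum_{i=1}^{n} y_i\Bigr) \log(1-p), \qquad
\hat{p} = \bar{y}

\text{Poisson:}\quad
L(\lambda) = \prod_{i=1}^{n} \frac{\lambda^{y_i} e^{-\lambda}}{y_i!}, \qquad
\ell(\lambda) = \sum_{i=1}^{n} y_i \log\lambda - n\lambda - \sum_{i=1}^{n} \log y_i!, \qquad
\hat{\lambda} = \bar{y}
```

In both cases the estimate follows from setting the derivative of the log-likelihood ℓ to zero.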

Parameter estimation in GLMs

IWLS Algorithm

• Iteratively weighted least squares (IWLS) algorithm:
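A minimal sketch of the idea for logistic regression with the canonical logit link: at each step a working response is regressed on the covariates by weighted least squares, and the weights are updated from the current fit. The function name `iwls_logistic`, the starting values, the stopping rule and the simulated data are assumptions of this example; it is not the code used inside `glm`.

```r
# IWLS for a Bernoulli response with logit link (illustrative sketch).
iwls_logistic <- function(X, y, tol = 1e-10, max_iter = 50) {
  beta <- rep(0, ncol(X))                  # start from all-zero coefficients
  for (iter in seq_len(max_iter)) {
    eta <- drop(X %*% beta)                # linear predictor
    mu  <- plogis(eta)                     # fitted probabilities
    w   <- mu * (1 - mu)                   # working weights
    z   <- eta + (y - mu) / w              # working response
    # Weighted least squares step: solve (X' W X) beta = X' W z.
    beta_new <- drop(solve(crossprod(X, w * X), crossprod(X, w * z)))
    converged <- max(abs(beta_new - beta)) < tol
    beta <- beta_new
    if (converged) break
  }
  beta
}

set.seed(42)
x <- rnorm(200)
y <- rbinom(200, size = 1, prob = plogis(-0.5 + 1.2 * x))
X <- cbind(1, x)                           # design matrix with intercept

beta_iwls <- unname(iwls_logistic(X, y))
beta_glm  <- unname(coef(glm(y ~ x, family = binomial)))
print(rbind(iwls = beta_iwls, glm = beta_glm))
```

The two rows should agree closely, since `glm` fits the model by the same kind of iteration.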

