Predicting Count Data

Poisson Regression

Review: Confusing Statistical Terms

General Linear Model (GLM)

- Anything that can be written like this: Y = B0 + B1X1 + … + BkXk + e
- Solved using ordinary least squares
- Assumptions revolve around the Normal distribution

Generalized Linear Model

- Anything that can be written like this: g(μ) = B0 + B1X1 + … + BkXk, where μ is the mean of Y and g is a link function
- Solved using maximum likelihood
- Assumptions use many different distributions

Remember: Why These Models?

• Linear Regression: Assuming normal errors around the predicted score

• When we violate this assumption, our estimates of the distributions of the B’s are incorrect

• Also, in some cases our estimates of the effect size are inaccurate (usually too small)

Linear Regression

• Linear regression is really a predictive model before anything else. (The statistical aspect is extra).

[Figure: regression line with intercept B0 and slope B1]

Count Data

Examples

• (Criminal Justice) Number of offenses per year

• (Domestic Violence) Number of DV events per person

• (Epidemiology) Number of seizures per week

Count Data

• This type of data can only have discrete values that are greater than or equal to zero.

• In some situations, this type of data follows the Poisson distribution

Poisson Distribution

• The Poisson random variable is defined by one parameter: the mean (μ)

• It has the strong assumption that the mean is equal to the variance

μ = σ²
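The mean-equals-variance property can be checked numerically. The sketch below is only an illustration: the mean of 3 is an arbitrary assumed value, and the sum over possible counts is cut off at 100, where the remaining probability is negligible for a mean this small.

```python
import math

def poisson_pmf(k, mu):
    """Probability of observing exactly k events when the mean count is mu."""
    return math.exp(-mu) * mu**k / math.factorial(k)

mu = 3.0            # assumed mean count, chosen only for illustration
ks = range(0, 100)  # truncated support; the tail beyond 100 is negligible here
probs = [poisson_pmf(k, mu) for k in ks]

# Mean and variance computed directly from the probabilities
mean = sum(k * p for k, p in zip(ks, probs))
variance = sum((k - mean) ** 2 * p for k, p in zip(ks, probs))

print("mean:", mean)
print("variance:", variance)
```

Both quantities come out equal to μ up to floating-point error, which is the single-parameter behaviour described above.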

Poisson Regression

• In this model, instead of predicting the mean of a normal distribution, you are predicting the mean of a Poisson distribution (given some predictors)

Fundamental Equation

• In linear regression: Ŷ = B0 + B1X

• In Poisson regression: ln(μ) = B0 + B1X, so the predicted mean count is μ = e^(B0 + B1X)
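As a rough sketch of what fitting such a model involves, assume the Poisson model takes the usual log-linear form ln(μ) = B0 + B1X and is estimated by maximum likelihood. The code below uses Newton-Raphson with one predictor. The data are made up, and the function name `fit_poisson` is hypothetical; in practice a statistics package would do this step.

```python
import math

# Hypothetical data: x is a predictor, y is an observed count
x = [0, 1, 2, 3, 4, 5]
y = [1, 1, 2, 4, 5, 9]

def fit_poisson(x, y, iterations=25):
    """Maximum likelihood estimates of B0 and B1 in ln(mu) = B0 + B1*x."""
    b0 = math.log(sum(y) / len(y))  # start from the log of the overall mean
    b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Gradient of the Poisson log-likelihood
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix (negative Hessian), a symmetric 2 x 2 matrix
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 system by hand
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

b0, b1 = fit_poisson(x, y)
fitted = [math.exp(b0 + b1 * xi) for xi in x]
print("B0:", b0, "B1:", b1)
print("predicted means:", fitted)
```

The predictions are always positive because they pass through the exponential, which a straight-line prediction on the raw count scale cannot guarantee.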

Assumptions

• In your outcome variable (Y), the mean equals the variance. (There is a test for this)
– For violations you can use Negative Binomial, which is just a Poisson where the variance is separate from the mean.

• Observations are independent (as with most analyses)

• And, basically, that the predictive model makes sense
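A quick, informal screen for the mean-equals-variance assumption is to compare the sample variance of the counts to their sample mean. This is not the formal test mentioned above, only a rough first look, and the counts below are hypothetical. A ratio well above 1 suggests overdispersion, the situation where Negative Binomial is usually considered.

```python
import statistics

# Hypothetical outcome variable: number of events per person
counts = [0, 1, 1, 2, 0, 3, 7, 0, 1, 12, 0, 2]

mean = statistics.mean(counts)
variance = statistics.variance(counts)  # sample variance (n - 1 denominator)
ratio = variance / mean                 # close to 1 under a Poisson distribution

print("mean:", mean)
print("variance:", variance)
print("variance / mean:", ratio)

if ratio > 1.5:  # arbitrary illustrative cutoff, not a formal criterion
    print("Counts look overdispersed; a Negative Binomial model may fit better.")
```

This looks at Y alone. In a regression the assumption concerns the variance around the predicted mean, so a formal dispersion test on the fitted model is still needed.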

Interpreting Parameters

• Like logistic, we have to interpret the EXP(B)
– (This is the notation for e^B)

• Instead of an odds ratio, this is a rate ratio (relative risk): the factor by which the expected count is multiplied for a one unit increase in X

• 1 is the null hypothesis (no change in the rate)

• 1.2 would be an increase of .2 (20%) in the relative rate for a one unit increase
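To make the EXP(B) reading concrete, here is a small sketch assuming the log-linear form ln(μ) = B0 + B1X. The coefficient values are hypothetical, with B1 chosen so that EXP(B1) is exactly 1.2.

```python
import math

b0 = 0.5            # hypothetical intercept on the log scale
b1 = math.log(1.2)  # hypothetical slope, chosen so that EXP(B1) = 1.2

def expected_count(x):
    """Predicted mean count at predictor value x."""
    return math.exp(b0 + b1 * x)

rate_ratio = math.exp(b1)

# Ratio of predicted counts for each one unit step in X
ratios = [expected_count(x + 1) / expected_count(x) for x in range(4)]

print("EXP(B1):", rate_ratio)
print("ratios for successive one unit steps:", ratios)
```

Every one unit step multiplies the expected count by the same factor, which is why the effect is read as a ratio rather than as a fixed number of extra events.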

Really, why the trouble?

• Turns out that not using Poisson isn’t the worst thing ever.
– You actually get alpha deflation

• BUT: many journals that are used to this kind of data will reject articles that do not use the proper technique

Date post:	30-Dec-2015
Upload:	howard-stanley