HRP 223 - 2008

Date post: 25-Jan-2016
Upload: saeran
Copyright © 1999-2008 Leland Stanford Junior University. All rights reserved. Warning: This presentation is protected by copyright law and international treaties. Unauthorized reproduction of this presentation, or any portion of it, may result in severe civil and criminal penalties and will be prosecuted to maximum extent possible under the law.
Transcript
Page 1: HRP 223 - 2008

HRP 223 - 2008

Topic 9 - Regression

Page 2: HRP 223 - 2008

HRP223 2008

Height and Resting Pulse

The spreadsheet RESTING.xls has height and pulse measures on 50 people. On average, does pulse go up or down with height?
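The question on this slide comes down to the sign of a least-squares slope. The following is a minimal Python sketch of that calculation, not the Enterprise Guide workflow the course uses. The function name `least_squares_line` and its arguments are illustrative assumptions; `x` would hold the heights and `y` the pulse measures from RESTING.xls.

```python
def least_squares_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x.

    x and y are equal-length sequences of numbers (assumed here to be
    height and resting pulse; the names are illustrative only).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

A negative slope would mean that, on average, pulse goes down as height goes up; a positive slope would mean the opposite.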

Page 3: HRP 223 - 2008
Page 4: HRP 223 - 2008
Page 5: HRP 223 - 2008

Look before you leap!

Page 6: HRP 223 - 2008
Page 7: HRP 223 - 2008

Page 8: HRP 223 - 2008
Page 9: HRP 223 - 2008
Page 10: HRP 223 - 2008
Page 11: HRP 223 - 2008
Page 12: HRP 223 - 2008
Page 13: HRP 223 - 2008
Page 14: HRP 223 - 2008
Page 15: HRP 223 - 2008

Root MSE = Estimated standard deviation of the error in the model (eta)

Dependent Mean = Mean of the outcome

CV = ratio of the above (Root MSE / Dependent Mean) * 100

In general r2 is interpreted as:
– .1 small effect, .3 medium effect, .5 large effect

Adjusted R-square = 1 - ( (1 - rsquare) * (n - 1) / (n - m - 1) )
n = subjects, m = variables
– It penalizes you for putting extra terms in the model.
– R-squared is typically reported if you have a single predictor variable.
– Adjusted R-square is typically reported if you have several predictors.
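The fit statistics defined on this slide can be reproduced by hand. The sketch below is a minimal Python illustration, assuming a model with an intercept so that the error degrees of freedom are n - m - 1. The function name `fit_summary` and its parameter names are assumptions made for this example, not SAS output names.

```python
import math


def fit_summary(sse, dependent_mean, r_square, n, m):
    """Return (root_mse, cv, adj_r_square) for a linear model with an intercept.

    sse            -- error (residual) sum of squares
    dependent_mean -- mean of the outcome
    r_square       -- ordinary R-square of the model
    n              -- number of subjects
    m              -- number of predictor variables
    """
    df_error = n - m - 1                      # error degrees of freedom
    root_mse = math.sqrt(sse / df_error)      # estimated SD of the model error
    cv = 100 * root_mse / dependent_mean      # Root MSE as a percent of the mean
    adj_r_square = 1 - (1 - r_square) * (n - 1) / df_error
    return root_mse, cv, adj_r_square
```

With a single predictor the adjustment is small; as m grows relative to n, the factor (n - 1) / (n - m - 1) grows and pulls the adjusted value further below R-square.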

Page 16: HRP 223 - 2008

Oxygen

The next set of data looks at the relationship between oxygen inhaled and exhaled. You would hope that there would be close to a perfect relationship between the two factors.

Page 17: HRP 223 - 2008

Add the library to a new flowchart.

Add the SAS data set to the project.

Page 18: HRP 223 - 2008

Look at the Data

This is bad news….

At least it is symmetric.

Page 19: HRP 223 - 2008

Simple correlation is questionable.

Page 20: HRP 223 - 2008
Page 21: HRP 223 - 2008

Page 22: HRP 223 - 2008
Page 23: HRP 223 - 2008
Page 24: HRP 223 - 2008
Page 25: HRP 223 - 2008
Page 26: HRP 223 - 2008
Page 27: HRP 223 - 2008

Are the residuals about normal?

Page 28: HRP 223 - 2008
Page 29: HRP 223 - 2008

Leave yourself a note on how to interpret the output.

Right click on the flowchart and choose New > Note. Leave yourself some notes. Right click on the Note icon > Link Note to > Quadratic

Page 30: HRP 223 - 2008

Ice cream!

In this example you will predict ice cream sales based on factors like price and temperature.

Start by making a library (or copy and paste the existing one) in a new flowchart.

The data is in a text file. Import the data.

Page 31: HRP 223 - 2008

Load the Data

Page 32: HRP 223 - 2008

Add Celsius

Celsius is ( (5/9) * (Fahr-32) )

Page 33: HRP 223 - 2008

Celsius is ( (5/9) * (Fahr-32) )
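For reference, the same computed column can be written as a short Python function. This is only a sketch of the formula on the slide; the function name `fahrenheit_to_celsius` is an assumption, and the slides themselves build the column inside the Enterprise Guide query tool.

```python
def fahrenheit_to_celsius(fahr):
    """Convert a Fahrenheit temperature to Celsius: (5/9) * (Fahr - 32)."""
    return (5 / 9) * (fahr - 32)
```

Freezing (32 °F) maps to 0 °C and boiling (212 °F) maps to 100 °C, up to floating-point rounding, which makes a quick sanity check for the new column.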

Page 34: HRP 223 - 2008
Page 35: HRP 223 - 2008
Page 36: HRP 223 - 2008
Page 37: HRP 223 - 2008
Page 38: HRP 223 - 2008

Page 39: HRP 223 - 2008
Page 40: HRP 223 - 2008
Page 41: HRP 223 - 2008
Page 42: HRP 223 - 2008

Some people say VIF > 10 is a problem but that is arbitrary.

If VIF is > 1/(1 - R-squared), where R-squared is from the overall model, then the factors are more related to the other predictors than to the outcome.
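The VIF rule on this slide can be sketched in a few lines of Python. This assumes the standard definition of the variance inflation factor for a predictor, 1 / (1 - R-squared), where that R-squared comes from regressing the predictor on the other predictors. The function names are hypothetical.

```python
def vif(r_square_j):
    """Variance inflation factor for one predictor.

    r_square_j is the R-squared from regressing that predictor
    on all of the other predictors in the model.
    """
    return 1 / (1 - r_square_j)


def vif_exceeds_model_cutoff(r_square_j, model_r_square):
    """Apply the slide's rule: flag the predictor if its VIF is
    greater than 1 / (1 - R-squared of the overall model)."""
    return vif(r_square_j) > 1 / (1 - model_r_square)
```

Because 1 / (1 - x) increases with x, the check is the same as asking whether the predictor's R-squared with the other predictors is larger than the model's R-squared with the outcome, which is what the slide's interpretation says.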

Page 43: HRP 223 - 2008

Severely Dehydrated Children

Page 44: HRP 223 - 2008

A Look

Do univariate descriptive statistics.
– Things look reasonable.

Do bivariate correlations.
– Age and weight are correlated.

Do univariate modeling.
– There is a weak but statistically significant association.

Build a model with all 3 predictors and check variance inflation.

Page 45: HRP 223 - 2008

A Simpler Model

It explains a fair amount of the variability (45%). How can I check to make sure the model is working well and is not being driven by outliers?

Page 46: HRP 223 - 2008

Outliers

Images from: Statistics I: Introduction to ANOVA, Regression, and Logistic Regression Course Notes (2005) and Categorical Data Analysis Using Logistic Regression Course Notes (2005), SAS Press.

Page 47: HRP 223 - 2008

First Check Residuals

Page 48: HRP 223 - 2008
Page 49: HRP 223 - 2008

What is influential?

Freund and Littell, SAS System for Regression, 3rd edition, page 70:

Variance inflation:
– vifcheck = 1 / (1 - r2)

Leverage greater than this value:
– leverageCheck = 2 * (predictors + 1) / records

Covariance ratio more extreme than:
– cov1Check = 1 + 3 * (predictors + 1) / records
– cov1Check = 1 - 3 * (predictors + 1) / records

DFFITS values with absolute value bigger than:
– dffitsCheck = 2 * ((predictors + 1) / records) ** .5
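The cutoffs above depend only on the number of predictors, the number of records, and the model R-squared, so they can be computed in one place. The sketch below is a Python illustration and not the course's SAS influence code. The function name `influence_cutoffs` and the keys `covHighCheck` and `covLowCheck` are assumptions introduced here to give the upper and lower covariance-ratio bounds distinct names.

```python
def influence_cutoffs(predictors, records, r_square):
    """Return the rule-of-thumb cutoffs listed on the slide.

    predictors -- number of predictor variables in the model
    records    -- number of observations
    r_square   -- R-squared of the overall model
    """
    p = predictors + 1  # parameters, counting the intercept
    return {
        "vifCheck": 1 / (1 - r_square),
        "leverageCheck": 2 * p / records,
        "covHighCheck": 1 + 3 * p / records,   # upper bound (hypothetical name)
        "covLowCheck": 1 - 3 * p / records,    # lower bound (hypothetical name)
        "dffitsCheck": 2 * (p / records) ** 0.5,
    }
```

An observation would then be flagged when its leverage exceeds `leverageCheck`, when its covariance ratio falls outside the two bounds, or when the absolute value of its DFFITS exceeds `dffitsCheck`.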

Page 50: HRP 223 - 2008

Influence Code

Page 51: HRP 223 - 2008
