Date post: 03-Aug-2020
Predicting Income and Employment in the US
● Determine factors that predict Income/Unemployment

Data Cleaning

Exploratory Analysis

Variable Selection

Data Modeling

Visualization Analysis

Conclusion

[Figure: axis label "Frequency"]

Variable Selection

Multiple Linear Regression - Income

Data Modeling-Income

Transforming the Data

AIC/BIC Model

Data Modeling-Income

Fit the LASSO, Ridge, and Elastic Net models:

Create 10-fold cross-validation for each alpha:

Plot the solution path and cross-validated MSE as a function of λ
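
The three steps above could be sketched in R roughly as follows. This is a minimal illustration, not the presenters' actual script: the data are synthetic stand-ins for `x.train` and `y.train`, and the alpha grid 0, 0.1, ..., 1 and the list name `cv.fits` are assumptions chosen to match the "yhat0 to yhat10" wording used later in the slides.

```r
library(glmnet)

set.seed(1)
# Synthetic stand-in data (hypothetical): 200 rows, 10 predictors
n <- 200
p <- 10
x.train <- matrix(rnorm(n * p), n, p)
y.train <- drop(x.train[, 1:3] %*% c(3, -2, 1)) + rnorm(n)

# Full regularization paths for the three penalties
fit.lasso <- glmnet(x.train, y.train, family = "gaussian", alpha = 1)
fit.ridge <- glmnet(x.train, y.train, family = "gaussian", alpha = 0)
fit.elnet <- glmnet(x.train, y.train, family = "gaussian", alpha = 0.5)

# 10-fold cross-validation for each alpha on an assumed grid 0, 0.1, ..., 1
alphas <- seq(0, 1, by = 0.1)
cv.fits <- lapply(alphas, function(a) {
  cv.glmnet(x.train, y.train, family = "gaussian", alpha = a,
            nfolds = 10, type.measure = "mse")
})

# Solution path (coefficients against log lambda) and the
# cross-validated MSE curve for the LASSO fit (alpha = 1, last grid entry)
plot(fit.lasso, xvar = "lambda")
plot(cv.fits[[length(alphas)]])
```

Each element of `cv.fits` stores the cross-validated error in `cvm` and the selected penalties in `lambda.min` and `lambda.1se`.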

[Figures: LASSO, Ridge, and Elastic Net]

Prediction-Income

Predict yhat0 to yhat10 using the fit for each alpha

Compute the Mean Absolute Error (MAE) and Mean Squared Error (MSE) for each yhat
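
As a small worked sketch of the two error measures, the base R helpers below compute MAE and MSE for one prediction vector. The function names `mae` and `mse` and the toy numbers are illustrative assumptions; in the actual workflow each `yhat` would presumably come from `predict()` on a fitted glmnet model.

```r
# Hypothetical helper functions (not from the original slides)
mae <- function(y, yhat) mean(abs(y - yhat))   # mean absolute error
mse <- function(y, yhat) mean((y - yhat)^2)    # mean squared error

# Toy observed values and predictions
y.test <- c(10, 20, 30, 40)
yhat   <- c(12, 18, 33, 39)

mae(y.test, yhat)  # errors are 2, 2, 3, 1, so the MAE is 2
mse(y.test, yhat)  # squared errors are 4, 4, 9, 1, so the MSE is 4.5
```

Because MSE squares each error, it penalizes large misses more heavily than MAE, which is one reason to report both.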

Fitting the Income Model

fit.AIC.BIC <- step(lm_manual2, direction = "both", k = 1, trace = 0)

MAE = 3171, MSE = 19660025

fit.lasso <- glmnet(x.train, y.train, family = 'gaussian', alpha = 1)

MAE = 3187.912, MSE = 18104664

fit.ridge <- glmnet(x.train, y.train, family = 'gaussian', alpha = 0)

MAE = 43654.52, MSE = 19009876

fit.elnet <- glmnet(x.train, y.train, family = 'gaussian', alpha = 0.5)

MAE = 3152.585, MSE = 17675860
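
One way to compare the four fits is to tabulate the MSE values reported above and pick the smallest. The sketch below does this in base R. The data frame name `income.results` and the model labels are illustrative assumptions, while the MSE figures are copied from the slide.

```r
# Reported test MSE for each income model (values from the slide)
income.results <- data.frame(
  model = c("stepwise", "lasso", "ridge", "elastic net"),
  mse   = c(19660025, 18104664, 19009876, 17675860),
  stringsAsFactors = FALSE
)

# RMSE is on the same scale as income, which makes it easier to read
income.results$rmse <- sqrt(income.results$mse)

# Model with the lowest reported MSE
best <- income.results$model[which.min(income.results$mse)]
print(income.results[order(income.results$mse), ])
print(best)
```

Ranking by MSE puts the elastic net fit first.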

Elastic Net

Logistic Regression - Unemployment

Data Manipulation

● Had to create a new binary variable in the dataset
● National unemployment rate in January of 2015 was 5.7%
● Created a binary variable that took the value 1 when the unemployment rate was greater than or equal to 5.7, and 0 when the unemployment rate was less than 5.7
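
The thresholding step described above might look like this in base R. The data frame `counties` and the column names `unemployment_rate` and `high_unemployment` are hypothetical, since the slides do not show the actual variable names.

```r
# Hypothetical county-level data; rates are in percent
counties <- data.frame(
  county = c("A", "B", "C", "D"),
  unemployment_rate = c(4.2, 5.7, 6.3, 5.69)
)

# National rate in January 2015, used as the cutoff
national_rate <- 5.7

# 1 when the county rate is at or above the national rate, 0 otherwise
counties$high_unemployment <-
  as.integer(counties$unemployment_rate >= national_rate)

print(counties)
```

Using `>=` rather than `>` matters here: a county at exactly 5.7 is coded 1, as stated in the slide.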

Data Modeling-Unemployment

Fit the LASSO, Ridge, and Elastic Net models:

Create 10-fold cross-validation for each alpha:

Plot the solution path and cross-validated MSE as a function of λ

[Figures: LASSO, Ridge, and Elastic Net]

Prediction-Unemployment

Predict yhat0 to yhat10 using the fit for each alpha

Compute the ROC curve and AUC for each model
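
To make the ROC and AUC step concrete, here is a base R sketch that does not rely on any ROC package. The helper names `roc_points` and `auc` and the toy labels and scores are assumptions for illustration. In the real workflow the scores would presumably be predicted probabilities, for example from `predict(..., type = "response")` on a binomial glmnet fit.

```r
# ROC points: true and false positive rates at each distinct score threshold
roc_points <- function(y, score) {
  thr <- sort(unique(score), decreasing = TRUE)
  tpr <- sapply(thr, function(t) sum(score >= t & y == 1) / sum(y == 1))
  fpr <- sapply(thr, function(t) sum(score >= t & y == 0) / sum(y == 0))
  data.frame(threshold = thr, fpr = fpr, tpr = tpr)
}

# AUC via the rank-sum (Mann-Whitney) identity; ties receive average ranks
auc <- function(y, score) {
  r  <- rank(score)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Toy example: two negatives, two positives
y.test2 <- c(0, 0, 1, 1)
score   <- c(0.1, 0.4, 0.35, 0.8)

roc_points(y.test2, score)
auc(y.test2, score)  # 3 of the 4 positive/negative pairs are ordered correctly: 0.75
```

The AUC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one.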

Fitting the Unemployment Model

fit.lasso2 <- glmnet(x.train2, y.train2, family = 'binomial', alpha = 1)

AUC = 0.9070913

fit.ridge2 <- glmnet(x.train2, y.train2, family = "binomial", alpha = 0)

AUC = 0.903377

fit.elnet2 <- glmnet(x.train2, y.train2, family = 'binomial', alpha = 0.5)

AUC = 0.9048422

LASSO Regression

Final Models

Model for predicting income (Elastic Net)

Model for predicting unemployment rate (LASSO)

Conclusion

● Use of these models:
○ If you have current county information, you can predict income and unemployment levels
○ If you have a projection of where the county is going in the future, these models can determine what the unemployment and income levels may be
○ Look at variables to determine which conditions could be improved to increase income or lower unemployment

● Future study:
○ Refit these models when the 2020 census data comes out
○ Use these models to predict what income and unemployment may look like for the 2020 census

