Post on 05-Dec-2014
Mother Nature’s Impact on Bike Ridership
Jackie Zajac
Kays Fattal
Naumaan Nasir
Does weather have a relationship with bike ridership?
Can we predict bike usage based on weather?
INTRODUCTION
• Our team
• Research questions
• Picking datasets
• Our audience
METHODOLOGY
• Why linear regression?
• How we manipulated the data
• MySQL engine aggregated the 3M-row table into sums of rental counts and duration
• Mashed up with 731 rows of weather data (2011, 2012)
• Added a Year field
• Tools: Excel, MySQL database, R (Rattle)
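The aggregation and mash-up steps above can be sketched in a few lines. This is only an illustration, not the team's actual MySQL query: the record layout, the field names (`rentals`, `duration`, `max_temp`) and the sample values are all hypothetical, and daily aggregation is an assumption suggested by the 731 weather rows (one per day across 2011 and 2012).

```python
from collections import defaultdict

# Hypothetical trip records: (date, duration in seconds).
# The real table's columns are not shown in the slides.
trips = [
    ("2011-01-01", 600),
    ("2011-01-01", 900),
    ("2011-01-02", 300),
]

# Hypothetical daily weather rows keyed by date.
weather = {
    "2011-01-01": {"max_temp": 45.0},
    "2011-01-02": {"max_temp": 38.0},
}


def aggregate_trips(trips):
    """Collapse individual trips into per-day rental count and total duration."""
    daily = defaultdict(lambda: {"rentals": 0, "duration": 0})
    for date, duration in trips:
        daily[date]["rentals"] += 1
        daily[date]["duration"] += duration
    return dict(daily)


def join_with_weather(daily, weather):
    """Join daily totals with weather rows on date and add a Year field."""
    rows = []
    for date in sorted(daily):
        if date in weather:  # keep only days present in both datasets
            row = {"date": date, "year": int(date[:4])}
            row.update(daily[date])
            row.update(weather[date])
            rows.append(row)
    return rows


rows = join_with_weather(aggregate_trips(trips), weather)
```

Each element of `rows` then holds one day's rental count, total duration, weather readings and year, which is the shape a regression tool expects as input.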
METHODOLOGY
• Picking our best configuration
• Categoric vs. numeric variables
• Must decide how to measure bike usage
• Must pick best variables
• Error analysis
PHASE I
• Began with a broad study of six regressions
• Two target variables (rental counts, duration)
• Three temperature measures
• Minimum, Average, Maximum
• Chunked the day into three time ranges to reflect temperature during bike rides
• Evaluated multiple weather variables’ effect on regressions
• Ignored Date field
Plots
PHASE II
• Combining the data sets
• Picking best variables:
• Bike rental counts as sole target variable
• Maximum temperature
• Utilized date/year field
• Switched Snow to categoric variable
• Analyzed and refined our regression
• Higher accuracy – R-squared = .8374 or 83.74%
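Switching Snow from a numeric to a categoric variable means mapping each snow depth to a labelled range. A minimal sketch follows, using the ranges that appear in the final model's snow terms. How the original tool treated range boundaries and days outside these ranges is not stated in the slides, so the rules below (each bin includes its lower edge, the last bin also includes 3.9, anything else maps to `None`) are assumptions, and the name `snow_bin` is hypothetical.

```python
# Bin edges taken from the snow ranges listed in the deck's final model.
SNOW_EDGES = [0.0, 1.2, 2.0, 3.1, 3.9]


def snow_bin(inches):
    """Return a label such as '[1.2-2.0]' for a snow depth, or None if out of range."""
    if inches < SNOW_EDGES[0] or inches > SNOW_EDGES[-1]:
        return None
    for lo, hi in zip(SNOW_EDGES, SNOW_EDGES[1:]):
        # Assumption: a bin includes its lower edge; the last bin also includes 3.9.
        if lo <= inches < hi or (hi == SNOW_EDGES[-1] and inches == hi):
            return f"[{lo}-{hi}]"
    return None
```

A regression tool would then turn each label into its own indicator (0/1) variable, which is why the final model carries one weight per snow range.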
MSE and R-squared
• A measure of accuracy in one dataset predicting another
• Relationship between R-squared and MSE
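The relationship between the two measures can be made concrete: when both are computed on the same data, R-squared equals 1 minus MSE divided by the (population) variance of the actual values. The sketch below uses the standard textbook definitions; the sample numbers are made up for illustration and are not from the Bikeshare data.

```python
def mse(actual, predicted):
    """Mean squared error: average of squared prediction errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """R-squared: 1 minus residual sum of squares over total sum of squares."""
    mean = sum(actual) / len(actual)
    ss_tot = sum((a - mean) ** 2 for a in actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return 1 - ss_res / ss_tot


# Made-up rental counts and predictions, for illustration only.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

mean = sum(actual) / len(actual)
variance = sum((a - mean) ** 2 for a in actual) / len(actual)

error = mse(actual, predicted)
r2 = r_squared(actual, predicted)
# r2 and 1 - error / variance agree up to floating-point rounding.
```

A lower MSE therefore means a higher R-squared for the same target data, so the two measures rank candidate models in the same order.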
FINAL MODEL
Y =
Weight       Variable
-4004.501    Intercept
62.118       Maximum Temperature
-132.741     Average Wind
93.162       Precipitation
416.818      Visibility
2063.069     Year
-161.038     Snow [0.0-1.2] inches
-4.945       Snow [1.2-2.0] inches
-588.349     Snow [2.0-3.1] inches
-5.390       Snow [3.1-3.9] inches
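Read as a linear model, Y is the intercept plus each weight multiplied by its variable. The sketch below applies the weights from the table. Several details are assumptions because the slides do not state them: the units of each input, how Year is encoded (for example 0 for 2011 and 1 for 2012), and which snow condition serves as the baseline. For that reason the snow range is passed in as an optional label, and the function and parameter names are hypothetical.

```python
# Weights copied from the final model table.
COEFFICIENTS = {
    "intercept": -4004.501,
    "max_temp": 62.118,
    "avg_wind": -132.741,
    "precipitation": 93.162,
    "visibility": 416.818,
    "year": 2063.069,
}

# One weight per snow range (an indicator variable for each range).
SNOW_WEIGHTS = {
    "[0.0-1.2]": -161.038,
    "[1.2-2.0]": -4.945,
    "[2.0-3.1]": -588.349,
    "[3.1-3.9]": -5.390,
}


def predict_rentals(max_temp, avg_wind, precipitation, visibility, year,
                    snow_bin=None):
    """Evaluate Y = intercept + sum(weight * variable) for one day.

    Assumption: `year` uses whatever numeric coding the model was trained
    with, and `snow_bin` is None when no snow range applies.
    """
    y = COEFFICIENTS["intercept"]
    y += COEFFICIENTS["max_temp"] * max_temp
    y += COEFFICIENTS["avg_wind"] * avg_wind
    y += COEFFICIENTS["precipitation"] * precipitation
    y += COEFFICIENTS["visibility"] * visibility
    y += COEFFICIENTS["year"] * year
    if snow_bin is not None:
        y += SNOW_WEIGHTS[snow_bin]  # add the weight of the matching range
    return y
```

Under this reading, each additional degree of maximum temperature adds about 62 rentals to the prediction, with all other variables held fixed.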
LESSONS LEARNED
• Too many independent variables to incorporate crime dataset in addition to weather dataset
• Mean Squared Error (MSE), R-squared
• Only two years’ worth of data was available due to Bikeshare’s short history (2011, 2012)
• Final model would be even more accurate with additional historical data
CONCLUSION
• Our hypotheses proved true: weather does affect bike ridership
• Why is Maximum Temperature better?
• Why does the Year improve accuracy?
• The categorical range of snow inches
QUESTIONS?
Thanks!