
2015 Sport Analysis for March Madness

Date post: 10-Aug-2015
Upload: yi-chun-nancy-chien
March Data Crunch Madness. Team Coach K.: Yi Chun Chien, Xiayu Zeng, Feifei Chen, Xiaoshan Jin. March 2015.

March Data Crunch Madness

Team Coach K.
Yi Chun Chien, Xiayu Zeng, Feifei Chen, Xiaoshan Jin

March 2015

Introduction

❖ Background: The NCAA Men’s Basketball Tournament is a single-elimination tournament, currently featuring 68 college teams.

❖ Objective: Create an effective model that examines the factors contributing to a team’s performance, based on data from 2001-2014.

❖ Result: The model indicates that box-score statistics have a large effect on a team’s result in 2015, which helps to predict:
➢ Win/Lose
➢ Winning Probability
➢ Sweet Sixteen

Independent & Dependent Variables

❖ Independent Variables:
➢ Seed: Seed #, whether the team is a Top 5 seed, whether the team is a 15/16 seed
➢ Location: Latitude, Longitude, Distance Difference
➢ Box Score: Assists, Steals, Blocked Shots, 2-/3-Point Field Goal %, Free Throw %, Tempo

❖ Dependent Variable: Win/Lose
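The slides do not say how the location variables were turned into a distance. A minimal sketch, assuming a great-circle (haversine) distance from each team's campus to the game venue, and assuming "Distance Difference" means the team's distance minus its opponent's. The function names and the coordinates in the usage lines are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_difference(team, opponent, venue):
    """Team's distance to the venue minus the opponent's (assumed definition).

    Each argument is a (latitude, longitude) pair in degrees.
    A negative value means the team is closer to the venue than its opponent.
    """
    return haversine_km(*team, *venue) - haversine_km(*opponent, *venue)


# Illustrative coordinates only, not taken from the deck's data.
team, opponent, venue = (36.0, -79.0), (43.0, -89.0), (39.8, -86.2)
print(round(distance_difference(team, opponent, venue), 1))
```

Under this definition a negative value marks the team that travels less, which matches the direction of the deck's claim that proximity to the stadium helps.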

Analysis Process

❖ Define Data Group:
➢ Training (80%)
➢ Validation (20%)
➢ Testing (2015)

❖ Build 5 Models:
➢ Decision Tree
➢ Bootstrap Forest
➢ Boosted Tree
➢ Neural Network
➢ Nominal Logistic

❖ Performance Validation:
➢ Accuracy
➢ ROC Curve
➢ AUC
➢ RMSE

❖ Prediction:
➢ Probability
➢ Win/Lose
➢ Top 16
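The data grouping in the analysis process (80% training, 20% validation, 2015 held out for testing) could be sketched as follows. This assumes the 80/20 split is a random split over the 2001-2014 games, which the slides do not confirm. The `season` key, the seed value, and the function names are hypothetical.

```python
import random


def split_train_validation(rows, train_fraction=0.8, seed=2015):
    """Shuffle rows reproducibly and cut them into training and validation parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def split_by_season(games, test_season=2015):
    """Hold out one season for testing and split the remaining games 80/20."""
    history = [g for g in games if g["season"] != test_season]
    test = [g for g in games if g["season"] == test_season]
    train, validation = split_train_validation(history)
    return train, validation, test


# Synthetic rows standing in for the real game records.
games = [{"season": 2001 + i % 15, "id": i} for i in range(150)]
train, validation, test = split_by_season(games)
print(len(train), len(validation), len(test))
```

Keeping the 2015 games out of both the training and validation sets means the reported 2015 accuracy reflects games the models never saw while being fitted or compared.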

Distribution and Correlation

● Distribution Review: Most variables are normally distributed
● Scatter Matrix: Few variables have a linear correlation

5 Models Performance

[Figure: performance validation of the five models on the training and validation sets, with the ROC curve for validation]

● Nominal Logistic Regression accuracy (validation): 72%
● Nominal Logistic Regression has the best performance
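The validation metrics named in the deck (accuracy, AUC, RMSE) can be computed from a model's predicted win probabilities. The sketch below is a plain-Python illustration, not the statistical software's own implementation. It assumes a 0.5 cutoff for accuracy, RMSE taken between the 0/1 outcome and the predicted probability, and AUC computed as the share of (win, loss) pairs ranked correctly. The sample values are made up.

```python
import math


def accuracy(y_true, p_win, threshold=0.5):
    """Share of games whose predicted class (p >= threshold means win) is correct."""
    predicted = [1 if p >= threshold else 0 for p in p_win]
    return sum(int(a == b) for a, b in zip(y_true, predicted)) / len(y_true)


def rmse(y_true, p_win):
    """Root mean squared error between 0/1 outcomes and predicted probabilities."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, p_win)) / len(y_true))


def auc(y_true, p_win):
    """Area under the ROC curve as the fraction of (win, loss) pairs ranked correctly.

    Ties between a win and a loss count as half a correct pair.
    """
    wins = [p for y, p in zip(y_true, p_win) if y == 1]
    losses = [p for y, p in zip(y_true, p_win) if y == 0]
    if not wins or not losses:
        raise ValueError("AUC needs at least one win and one loss")
    ranked = sum(1.0 if w > l else 0.5 if w == l else 0.0
                 for w in wins for l in losses)
    return ranked / (len(wins) * len(losses))


# Made-up outcomes (1 = win) and predicted win probabilities.
y = [1, 0, 1, 0]
p = [0.9, 0.2, 0.4, 0.6]
print(accuracy(y, p), auc(y, p), round(rmse(y, p), 4))
```

Accuracy depends on the chosen cutoff, while AUC summarizes ranking quality across all cutoffs, which is one reason to report both when comparing the five models.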

Prediction

● 2015 Forecast Top 16 team
● 2015 Forecast Result: 73% accuracy

Result   Lose   Win
Lose     6      6
Win      5      24
Total    11     30
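The 73% forecast accuracy can be checked against the result counts on this slide. The sketch assumes the Lose/Lose and Win/Win cells are the correctly predicted games. The slide does not say which axis holds the actual result and which the predicted one, but the accuracy is the same either way.

```python
# Counts from the slide's result table: outer key = row, inner key = column.
confusion = {
    "Lose": {"Lose": 6, "Win": 6},
    "Win": {"Lose": 5, "Win": 24},
}

# Diagonal cells are assumed to be the correctly predicted games.
correct = confusion["Lose"]["Lose"] + confusion["Win"]["Win"]
total = sum(sum(row.values()) for row in confusion.values())
accuracy = correct / total

print(f"{correct}/{total} = {accuracy:.1%}")  # prints "30/41 = 73.2%"
```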

Model Explanation

Defensive efficiency, offensive efficiency, opponent’s blocked shots, and assists are the most important attributes, based on their individual p-values

According to our analysis results, good offensive efficiency contributes more to a team’s success than defensive efficiency

The closer a team is to the stadium, the better it performs

Interesting Analysis

● The average score difference is narrowing
● The score pattern for the Top 5 seeds is less volatile than the one for the bottom 2 seeds
● 9 out of 16 are predicted correctly
● Only Georgetown shows a declining pattern of winning probability

Result and Conclusion

❖ Whether a team wins or loses is positively related to four primary factors:
➢ offensive efficiency
➢ defensive efficiency
➢ blocked shots
➢ assists

❖ Accuracy: Our model is 72.19% accurate in predicting a team’s result for 2015.

