Date post: 07-Jul-2020
Compare NBA Players with LeBron James using Statistical Analyses from 2007 - 2017 Weibin Ma, Yujia Lian, James Kong



Outline

● Measure and predict the efficiency of each player position

● Measure all players’ statistics

● Visualize the correlations among all players’ statistics, including LeBron James’s


Data Dictionary

Pos: Position
TM: Team
G: Games
GS: Games started
MP: Minutes played
FG: Field goals
eFG%: Effective field goal percentage
PTS: Total points
ORB: Offensive rebounds
DRB: Defensive rebounds
TRB: Total rebounds
AST: Assists
STL: Steals
BLK: Blocks
TOV: Turnovers
PF: Personal fouls
USG%: Usage percentage
PER: Player Efficiency Rating
OWS: Offensive win shares
DWS: Defensive win shares
WS: Win shares
VORP: Value Over Replacement Player


EDA - Boxplots

● The primary goal is to show, with simple score distributions, which positions the best players play, using variables such as PTS and AST

● In our project, we focus on player positions, all players, and LeBron James’s statistics
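A boxplot per position draws a five-number summary of one statistic. A minimal sketch of that computation, using only Python's standard library, follows. The `pts_by_pos` values are made-up illustration data, not figures from the dataset, and `five_number_summary` is a hypothetical helper name.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a boxplot draws."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

# Hypothetical season point totals grouped by position (illustration only).
pts_by_pos = {
    "PG": [400, 800, 1200, 1600, 2000],
    "SF": [500, 900, 1100, 1500, 1900],
    "C": [300, 600, 900, 1000, 1400],
}

summaries = {pos: five_number_summary(pts) for pos, pts in pts_by_pos.items()}
for pos, summary in summaries.items():
    print(pos, summary)
```

Comparing the medians and interquartile ranges across positions gives the same reading as placing the boxplots side by side.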


Decision Tree - Regular fit vs Overfit

● Predict each position's efficiency to answer some questions:
○ SF is considered a well-rounded position. Is that true?
○ Which other positions are particularly strong in certain stats?


Random Forest (RF)

● One problem with a single big (deep) DT is that it can overfit.
● A RF is a collection, or ensemble, of decision trees.
○ In a RF, each tree is trained on a randomly selected fraction of the rows
○ The point of RF is to reduce overfitting.
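The row-sampling idea can be sketched in plain Python. This is a simplified illustration, not the authors' model: each "tree" is a one-split stump on a single hypothetical feature, rows are resampled with replacement (bootstrap), and the forest predicts by majority vote. A full random forest also samples features and grows deeper trees. The data and function names are made up for the example.

```python
import random
from collections import Counter

def fit_stump(rows):
    """Fit a one-split tree on (x, label) rows by minimizing training errors."""
    xs = sorted({x for x, _ in rows})
    majority = Counter(label for _, label in rows).most_common(1)[0][0]
    best = (len(rows) + 1, None, majority, majority)  # (errors, threshold, left, right)
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [lab for x, lab in rows if x <= t]
        right = [lab for x, lab in rows if x > t]
        l_lab = Counter(left).most_common(1)[0][0]
        r_lab = Counter(right).most_common(1)[0][0]
        errors = sum(lab != l_lab for lab in left) + sum(lab != r_lab for lab in right)
        if errors < best[0]:
            best = (errors, t, l_lab, r_lab)
    return best[1:]

def predict_stump(stump, x):
    t, left, right = stump
    return left if t is None or x <= t else right

def fit_forest(rows, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    return [fit_stump([rng.choice(rows) for _ in rows]) for _ in range(n_trees)]

def predict_forest(forest, x):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(stump, x) for stump in forest)
    return votes.most_common(1)[0][0]

# Hypothetical one-feature data: class 0 has low values, class 1 has high values.
rows = [(x, 0) for x in range(1, 11)] + [(x, 1) for x in range(21, 31)]
forest = fit_forest(rows)
print(predict_forest(forest, 0), predict_forest(forest, 31))
```

Because every stump sees a different sample, no single noisy row can dominate the vote, which is the mechanism the slide credits for reducing overfitting.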


Decision Trees and Random Forest Results Comparison

● The random forest performs better than the single decision tree

[Figure panels: Regular Fit, Overfit, Random Forest]


Analysis 1: Find the relationship between all players’ shot attempts and field goals

● p-value < 2.2e-16 (much less than 0.05)
● R-squared = 0.9851

Thus, a linear regression with a positive slope fits the relationship between shot attempts and field goals well (see picture below)
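For reference, a least-squares line and its R-squared can be computed by hand as below. This is a sketch on made-up numbers: the `fga` and `fg` lists are hypothetical, not the 2007-2017 data, so the printed values will not match the slide's R-squared of 0.9851.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x, plus R-squared."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot

# Hypothetical shot attempts (FGA) and made field goals (FG) for five players.
fga = [100, 400, 800, 1200, 1600]
fg = [44, 180, 370, 540, 730]

slope, intercept, r_squared = fit_line(fga, fg)
print(f"slope={slope:.3f} intercept={intercept:.2f} R^2={r_squared:.4f}")
```

The slope reads as the share of attempts that become made field goals, which is why a strong positive linear fit is expected here.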


Plot 1 shows a linear relationship; Plot 2 shows no linear relationship

● There are no players who:
○ had many shot attempts but made very few field goals
○ had few shot attempts but made the majority of them


Analysis 2: Show the top 5 players’ 2-point and 3-point shot attempts

● Stephen Curry's 2-point and 3-point shot attempts are identical, while LeBron James prefers the 2-point shot more than the other 4 players do.


● To extend the histogram analysis of shot attempts, we include the 2-point and 3-point field goals made by each player, so that their shooting percentages and shot attempts can be compared clearly.


Analysis 3: LDA classification model for Player Efficiency Rating (PER)

Parameters

Response : PER

Discriminators : Age+MP+TS.+X3PAr+FTr+ORB.+DRB.+TRB.+AST.+STL.+BLK.+TOV.+USG.+WS.48+FG+FGA+X3P+eFG.+FT+FT.+DRB+AST+STL+TOV+PF

Dataset : 2007~2016 SF Position Players

Building the LDA model gives test_error = 0.00619195, which means the model classifies about 99.4% of the test cases correctly.

Purpose : Predict whether LeBron James’s PER in 2017 is > 25 or not
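To make the setup concrete, here is a minimal two-class LDA written from scratch. It is a sketch, not the authors' code: it uses only two hypothetical discriminators (usage rate and win shares per 48 minutes) instead of the full list above, the training and test rows are invented, and `fit_lda` and `predict_lda` are illustrative names. Class 1 stands for PER > 25.0 and class 0 for PER <= 25.0.

```python
import math

def fit_lda(X, y):
    """Two-class LDA on two features with a pooled (shared) covariance matrix."""
    groups = {c: [x for x, lab in zip(X, y) if lab == c] for c in (0, 1)}
    means = {c: [sum(col) / len(rows) for col in zip(*rows)] for c, rows in groups.items()}
    s00 = s01 = s11 = 0.0  # entries of the pooled within-class scatter matrix
    for c, rows in groups.items():
        for a, b in rows:
            da, db = a - means[c][0], b - means[c][1]
            s00 += da * da
            s01 += da * db
            s11 += db * db
    dof = len(X) - 2  # number of rows minus number of classes
    s00, s01, s11 = s00 / dof, s01 / dof, s11 / dof
    det = s00 * s11 - s01 * s01
    d0, d1 = means[1][0] - means[0][0], means[1][1] - means[0][1]
    # w = S^-1 (mu1 - mu0), using the closed-form inverse of a 2x2 matrix
    w = ((s11 * d0 - s01 * d1) / det, (s00 * d1 - s01 * d0) / det)
    mid = [(means[0][i] + means[1][i]) / 2 for i in range(2)]
    prior_term = math.log(len(groups[1]) / len(groups[0]))
    bias = -(w[0] * mid[0] + w[1] * mid[1]) + prior_term
    return w, bias

def predict_lda(model, x):
    """Return 1 when the linear discriminant score is positive, else 0."""
    w, bias = model
    return 1 if w[0] * x[0] + w[1] * x[1] + bias > 0 else 0

# Hypothetical (USG%, WS/48) rows; label 1 means PER > 25.0.
train_X = [(20, 0.08), (22, 0.10), (19, 0.07), (24, 0.12), (21, 0.11),
           (30, 0.25), (32, 0.28), (29, 0.22), (31, 0.27), (33, 0.24)]
train_y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
test_X = [(31, 0.26), (20, 0.09), (28, 0.21), (23, 0.10)]
test_y = [1, 0, 1, 0]

model = fit_lda(train_X, train_y)
predictions = [predict_lda(model, x) for x in test_X]
test_error = sum(p != t for p, t in zip(predictions, test_y)) / len(test_y)
print(predictions, test_error)
```

The test error is the fraction of held-out rows that are misclassified, so a value of 0.00619195 corresponds to an accuracy of 1 - 0.00619195, or about 99.4%.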


Assumption -> 100% success rate, which means LeBron James’s PER > 25.0 (as of now, LBJ’s PER in 2017 is 27)

This indicates that LBJ had a top-tier PER in 2017 (the 2016-2017 season).

Use this model to estimate the desired range and desired probability.

Predict result:

Class 1 <- (PER>25.0)

Class 0 <- (PER<=25.0)



Analysis 4: LDA classification model for Points (PTS) > 1600 and < 2100

Response : PTS

Discriminators : Pos+G+GS+MP+TS.+AST.+STL.+TOV.+USG.+VORP+FG+FGA+X3P+FT+STL+BLK+X2P

Dataset : 2007~2016 Players

Test_error : 0.004095563, which means the model classifies about 99.6% of the test cases correctly

Purpose : Predict whether LeBron James’s PTS in 2017 is > 1600 and < 2100


The probability of (PTS < 2100) is almost equal to 1, and the probability of (PTS > 1600) is equal to 0.92 (as of now, LBJ’s PTS in 2017 is 1954).

Summary:

Prediction -> LeBron James’s PTS in 2017 is between 1600 and 2100 with more than 90% probability
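One way to combine the two reported probabilities into a single range probability is sketched below. It assumes that both LDA outputs describe the same predicted PTS value and treats "almost equal to 1" as exactly 1; neither assumption is stated in the slides.

```latex
\[
P(1600 < \mathrm{PTS} < 2100)
  = P(\mathrm{PTS} < 2100) - P(\mathrm{PTS} \le 1600)
  \approx 1 - (1 - 0.92)
  = 0.92
\]
```

Under those assumptions the range probability is about 0.92, consistent with the "more than 90%" summary.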



Visualizing Correlation Between Variables

● Analyzed variables: points, rebounds, free throws, blocks, assists (totals)

[Figure panels: other players’ stats (2017); LeBron James’s stats (2017)]
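The numbers behind such a correlation plot are pairwise Pearson coefficients. A small sketch follows, assuming made-up season totals for five players; the `stats` values and the `pearson` helper are illustrative, not taken from the dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical season totals for five players (illustration only).
stats = {
    "PTS": [2000, 1200, 800, 1500, 600],
    "TRB": [650, 400, 350, 300, 200],
    "AST": [600, 150, 100, 300, 80],
}

# Correlation matrix keyed by (row variable, column variable).
corr = {(a, b): pearson(stats[a], stats[b]) for a in stats for b in stats}
for (a, b), r in corr.items():
    print(f"{a}-{b}: {r:.2f}")
```

Each cell ranges from -1 to 1, and the diagonal is always 1 because every variable is perfectly correlated with itself.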


Regression Analysis: Age VS Performance Stats

● LeBron James vs. all other small forwards in the NBA
● Analyzed variables: age vs. points, assists, rebounds, steals, personal fouls

[Figure panels: other SFs; LeBron James]


3D Scatterplot with Regression Plane

● Drop the non-significant variables: steals and personal fouls.
● Focus on LeBron James’s career total rebounds and assists vs. other SFs in the 2016-2017 season
● Fit a linear regression for all other small forwards in the league and draw the regression plane on the 3D scatter plot.


Compare James with Other Greats

● Analyzed variables: Offensive Win Shares (OWS) and Defensive Win Shares (DWS).


Using KNN to Predict Whether James Will Make the 2017 All-NBA team

Dataset: all key rotation players in the 2015-2016 NBA season (over 30 games as a starter)

Discriminators: points, minutes played, PER, and efficiency, where efficiency = (PTS+AST+TRB+STL+BLK) - (FGA-FG) - (FTA-FT) - TOV

Response: whether or not a player is selected to the All-NBA team (0 or 1)

Total number of players in all-NBA team in 2016: 10. Total

Split the dataset equally into training and test sets, fit the KNN model to the training data, and evaluate its predictions on the test data.

When k = 5, the model performs best.
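The efficiency formula and the k = 5 vote can be sketched as follows. This is an illustration on invented rows with only two discriminators (points per game and PER), not the authors' dataset or code; `efficiency` and `knn_predict` are hypothetical helper names.

```python
import math
from collections import Counter

def efficiency(s):
    """Efficiency as defined above: positive box-score totals minus misses and turnovers."""
    return ((s["PTS"] + s["AST"] + s["TRB"] + s["STL"] + s["BLK"])
            - (s["FGA"] - s["FG"]) - (s["FTA"] - s["FT"]) - s["TOV"])

def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k training rows closest to x (Euclidean distance)."""
    by_distance = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical (points per game, PER) rows; 1 = All-NBA, 0 = not selected.
train_X = [(10, 12), (12, 14), (11, 13), (14, 15), (13, 12), (9, 11),
           (25, 24), (27, 26), (26, 25), (28, 27), (24, 23), (29, 28)]
train_y = [0] * 6 + [1] * 6

star = knn_predict(train_X, train_y, (27, 27))
role_player = knn_predict(train_X, train_y, (11, 13))

# Hypothetical season totals for one player.
season = {"PTS": 2000, "AST": 600, "TRB": 600, "STL": 100, "BLK": 50,
          "FGA": 1400, "FG": 750, "FTA": 500, "FT": 350, "TOV": 300}
print(star, role_player, efficiency(season))
```

With real data the discriminators sit on very different scales (minutes versus PER, for example), so standardizing them before computing distances is usually advisable. An odd k such as 5 also avoids tied votes between the two classes.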

Result:


Confusion matrix:

Predict whether LeBron James will be on the 2017 All-NBA team or not: the result is 1, which means yes. He will be on the All-NBA team.


Conclusion

Player Positions:
● Small forwards hold a critical position on a team
○ They are versatile and balanced players
● A decision tree can show which position does best on certain variables
○ A random forest works better than a decision tree at preventing overfitting

All Players including LeBron James:
● Shot attempts and field goals have a linear relationship
● We successfully predicted that LeBron James has a top PER compared with other players and that his PTS falls within our predicted range in the 2016-2017 season

LeBron James:
● James’s rebounds and assists reach their highest in the 2016-2017 season.
● We predict that LeBron James will be selected to the 2017 All-NBA team
● For his age, he is the best player of any era, using OWS and DWS as our measurement


Critical Questions

1. We used a decision tree and a random forest to predict player positions’ efficiency. Would they also be useful for analyzing individual players? Why or why not?

2. When we perform KNN classification, is there another way of choosing variables, since we have so many of them? Why or why not?

3. We want to predict how great LeBron James is. When we used LDA to predict LeBron James’s Player Efficiency Rating (PER) and total points (PTS), we got high accuracy. Could this model be used to predict other variables?

