![Page 1: Compare NBA Players with LeBron James using Statistical ......Focus on Lebron James’s career total rebounds and assists vs other SFs in 2016-2017 season Doing linear regression for](https://reader035.vdocument.in/reader035/viewer/2022062311/5f2895aa90a5aa244546efb4/html5/thumbnails/1.jpg)
Compare NBA Players with LeBron James using Statistical Analyses from 2007 - 2017
Weibin Ma, Yujia Lian, James Kong
Outline
● Measure and predict player positions’ efficiency
● Measure all players’ statistics
● Visualize correlations among the statistics of all players, including LeBron James
Data Dictionary
Pos: Position
eFG%: Effective Field Goal Percentage
TM: Team
VORP: Value Over Replacement Player
USG%: Usage Percentage
PF: Personal Fouls
G: Games
MP: Minutes Played
GS: Games Started
TOV: Turnovers
AST: Assists
STL: Steals
BLK: Blocks
PER: Player Efficiency Rating
OWS: Offensive Win Shares
DWS: Defensive Win Shares
WS: Win Shares
FG: Field Goals
PTS: Total Points
ORB: Offensive Rebounds
DRB: Defensive Rebounds
TRB: Total Rebounds
EDA - Boxplots
● The primary goal is to create simple score distributions that show which positions the best players play, using variables such as PTS and AST.
● In our project, we focus on player positions, all players, and LeBron James’s statistics.
Decision Tree - Regular fit vs Overfit
● Predict positions’ efficiency to answer some questions:
○ SF is considered a well-rounded position. Is that true?
○ Which other positions are particularly strong in certain stats?
Random Forest (RF)
● One problem with a single big (deep) decision tree (DT) is that it can overfit.
● An RF is a collection, or ensemble, of decision trees.
○ Unlike a single DT, each tree in an RF is built from a fraction of the rows, selected at random.
○ The point of an RF is to reduce overfitting.
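The row-sampling idea can be sketched in a few lines of plain Python. This is a minimal illustration, not the model used in the slides. Each "tree" is a one-split stump, each stump sees a random 70% of the rows, and the forest predicts by majority vote. The toy rows (rebounds and assists per game, labelled "C" or "PG"), the function names, and the 70% fraction are all assumptions made for the example. A full random forest would also sample features at each split and usually draws rows with replacement.

```python
import random
from collections import Counter

def fit_stump(rows):
    """Fit a one-split tree on (features, label) rows by minimising training errors."""
    best = None
    for j in range(len(rows[0][0])):
        for x, _ in rows:
            t = x[j]
            left = [y for xi, y in rows if xi[j] <= t]
            right = [y for xi, y in rows if xi[j] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, j, t, l_lab, r_lab)
    if best is None:  # no usable split: always predict the majority label
        lab = Counter(y for _, y in rows).most_common(1)[0][0]
        return (0, float("inf"), lab, lab)
    return best[1:]

def predict_stump(stump, x):
    j, t, l_lab, r_lab = stump
    return l_lab if x[j] <= t else r_lab

def fit_forest(rows, n_trees=25, frac=0.7, seed=0):
    """Train each stump on a random fraction of the rows (sampled without replacement)."""
    rng = random.Random(seed)
    k = max(1, int(frac * len(rows)))
    return [fit_stump(rng.sample(rows, k)) for _ in range(n_trees)]

def predict_forest(forest, x):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(s, x) for s in forest)
    return votes.most_common(1)[0][0]

# Made-up toy rows: (rebounds per game, assists per game) -> position
rows = [
    ((11.0, 1.5), "C"), ((10.2, 2.0), "C"), ((12.1, 1.1), "C"), ((9.5, 2.4), "C"),
    ((3.1, 8.0), "PG"), ((2.8, 9.2), "PG"), ((4.0, 7.1), "PG"), ((3.5, 6.5), "PG"),
]
forest = fit_forest(rows)
print(predict_forest(forest, (10.0, 2.0)), predict_forest(forest, (2.0, 9.0)))
```

Because every stump sees a different subset of rows, no single tree can memorise the whole training set, which is the intuition behind the reduced overfitting.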
Decision Trees and Random Forest Results Comparison
● The random forest performs better than the decision tree.
[Figure panels: Regular Fit, Overfit, Random Forest]
Analysis 1: Find the relationship between all players’ shot attempts and field goals
● p-value < 2.2e-16 (much less than 0.05)
● R-squared = 0.9851
Thus, the relationship between shot attempts and field goals is well fitted by a linear regression with a positive trend (see picture below).
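The fit in Analysis 1 is a simple linear regression of field goals on shot attempts. A minimal sketch of how the slope, intercept, and R-squared are obtained is given here, using made-up toy numbers rather than the project data. The function name `fit_line` is an assumption, and the p-value reported on the slide is not computed.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x; returns (slope, intercept, r2)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_tot = sum((y - my) ** 2 for y in ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, 1 - ss_res / ss_tot

# Toy data (not from the slides): shot attempts and field goals made
fga = [100, 400, 800, 1200, 1600]
fg = [45, 180, 370, 540, 730]
slope, intercept, r2 = fit_line(fga, fg)
print(f"slope={slope:.3f} intercept={intercept:.1f} R^2={r2:.4f}")
```

A slope between 0 and 1 with R-squared close to 1 is what a tight, positive linear trend between attempts and makes looks like.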
Plot 1 shows a linear relationship.
Plot 2 shows no linear relationship.
● There are no players who:
○ had many shot attempts but made very few field goals
○ had few shot attempts but made the majority of the field goals
Analysis 2: Show the top 5 players’ 2-point and 3-point shot attempts
● Stephen Curry’s 2-point and 3-point shot attempts are identical, while LeBron James prefers 2-point shots more than the other 4 players.
● To extend the histogram analysis of shot attempts, we included the 2-point and 3-point field goals made by each individual, so their shot percentages and shot attempts can be compared clearly.
Analysis 3: LDA classification model for Player Efficiency Rating (PER)
Parameters
Response : PER
Discriminators : Age+MP+TS.+X3PAr+FTr+ORB.+DRB.+TRB.+AST.+STL.+BLK.+TOV.+USG.+WS.48+FG+FGA+X3P+eFG.+FT+FT.+DRB+AST+STL+TOV+PF
Dataset : 2007~2016 SF Position Players
Building the LDA model gives test_error = 0.00619195, which means the model classifies about 99.4% of the test cases correctly.
Purpose : Predict whether LeBron James’s PER > 25 or not in 2017
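To make the setup concrete: PER is first turned into a class label (1 if PER > 25, else 0), an LDA classifier is fitted on training rows, and the test error is the fraction of misclassified test rows. The sketch below is a simplified, single-feature version. The slides use many discriminators, whereas this uses one (a WS/48-like value) so the LDA arithmetic stays readable. All numbers and function names are made up for illustration.

```python
import math

def fit_lda_1d(xs, labels):
    """Two-class LDA on one feature: per-class mean and prior, plus pooled variance."""
    classes = sorted(set(labels))
    n = len(xs)
    stats, pooled = {}, 0.0
    for c in classes:
        vals = [x for x, y in zip(xs, labels) if y == c]
        mean = sum(vals) / len(vals)
        pooled += sum((v - mean) ** 2 for v in vals)
        stats[c] = (mean, len(vals) / n)
    return stats, pooled / (n - len(classes))

def predict_lda_1d(model, x):
    """Pick the class with the largest linear discriminant score."""
    stats, var = model
    def score(c):
        mu, prior = stats[c]
        return x * mu / var - mu * mu / (2 * var) + math.log(prior)
    return max(stats, key=score)

# Toy (feature, PER) pairs; the label is 1 when PER > 25.0, else 0
train = [(0.05, 11.0), (0.08, 13.5), (0.10, 15.2), (0.12, 17.0),
         (0.15, 19.8), (0.24, 25.9), (0.27, 27.1), (0.30, 29.5)]
test = [(0.09, 14.0), (0.26, 26.5), (0.13, 18.0), (0.29, 28.0)]

model = fit_lda_1d([x for x, _ in train], [int(per > 25.0) for _, per in train])
wrong = sum(predict_lda_1d(model, x) != int(per > 25.0) for x, per in test)
test_error = wrong / len(test)
print("toy test error:", test_error)

# The slide's "reliability" is 1 - test_error:
reliability = 1 - 0.00619195
print(f"{reliability:.4f}")  # about 0.9938, i.e. roughly 99.4%
```

Note that when few players exceed PER 25, a low test error can partly reflect the class imbalance, so the accuracy figure is best read alongside a confusion matrix.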
Result -> 100% success rate, which means LeBron James’s PER > 25.0 (as of now, LBJ’s PER in 2017 is 27).
Thus, this reflects that LBJ had a top-tier PER in 2017 (the 2016-2017 season).
Use this model to estimate the desired range and desired probability.
Predict result:
Class 1 <- (PER>25.0)
Class 0 <- (PER<=25.0)
Analysis 4: LDA classification model for Points (PTS) > 1600 and <2100
Response : PTS
Discriminators : Pos+G+GS+MP+TS.+AST.+STL.+TOV.+USG.+VORP+FG+FGA+X3P+FT+STL+BLK+X2P
Dataset : 2007~2016 Players
Test_error : 0.004095563, which means the model classifies about 99.6% of the test cases correctly
Purpose : Predict whether LeBron James’s PTS > 1600 and < 2100 in 2017
The probability of (PTS < 2100) is almost equal to 1, and the probability of (PTS > 1600) is equal to 0.92 (as of now, LBJ’s PTS in 2017 is 1954).
Summary:
Prediction -> LeBron James’s PTS in 2017 is between 1600 and 2100 with more than 90% probability.
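The two reported probabilities can be combined into one range probability. A short derivation follows, assuming both probabilities describe the same underlying distribution of PTS. The slides appear to fit two separate classifiers, so this consistency is an assumption.

```latex
\begin{aligned}
P(1600 < \mathrm{PTS} < 2100)
  &= P(\mathrm{PTS} > 1600) - P(\mathrm{PTS} \ge 2100) \\
  &= P(\mathrm{PTS} > 1600) + P(\mathrm{PTS} < 2100) - 1 \\
  &\approx 0.92 + 1 - 1 = 0.92
\end{aligned}
```

This agrees with the "more than 90%" figure stated in the summary.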
Visualizing Correlation Between Variables
● Analyzed variables: Points, Rebounds, Free Throws, Blocks, Assists (totals)
[Figure panels: other players’ stats (2017); LeBron James’s stats (2017)]
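The quantity behind a correlation plot is the pairwise Pearson correlation between stat columns. A minimal sketch of that computation is given here. The column values are made-up toy totals, and the helper names `pearson` and `correlation_matrix` are assumptions for the example.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def correlation_matrix(columns):
    """Nested dict of pairwise correlations for a {name: values} mapping."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Toy season totals for five hypothetical players
stats = {
    "PTS": [500, 900, 1200, 1600, 2000],
    "TRB": [150, 300, 310, 500, 620],
    "AST": [400, 120, 300, 200, 350],
}
corr = correlation_matrix(stats)
for name, row in corr.items():
    print(name, {k: round(v, 2) for k, v in row.items()})
```

Values near 1 or -1 indicate a strong linear association between two stats, and values near 0 indicate little linear association.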
Regression Analysis: Age VS Performance Stats
● LeBron James vs. all other small forwards in the NBA
● Analyzed variables: Age vs. Points, Assists, Rebounds, Steals, Personal Fouls
[Figure panels: other SFs; LeBron James]
3D Scatterplot with Regression Plane
● Drop the non-significant variables: steals and personal fouls.
● Focus on LeBron James’s career total rebounds and assists vs. other SFs in the 2016-2017 season.
● Fit a linear regression for all other small forwards in the league and draw the regression plane on the 3D scatter plot.
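A regression plane is a linear regression with two predictors, z = b0 + b1·x + b2·y. The slides do not state which variable is the response, so this sketch treats the three axes generically. It solves the normal equations with a small Gaussian elimination routine. The function names and the toy points are assumptions for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_plane(xs, ys, zs):
    """Least-squares plane z = b0 + b1*x + b2*y via the normal equations."""
    rows = [(1.0, x, y) for x, y in zip(xs, ys)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xtz = [sum(r[i] * z for r, z in zip(rows, zs)) for i in range(3)]
    return solve(xtx, xtz)

# Toy points lying exactly on z = 1 + 2x + 3y
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 6]
zs = [1 + 2 * x + 3 * y for x, y in zip(xs, ys)]
b0, b1, b2 = fit_plane(xs, ys, zs)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

With real data the points scatter around the plane, and a player far above or below it stands out from the fitted trend.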
Compare James with Other Greats
● Analyzed variables: Offensive Win Shares (OWS) and Defensive Win Shares (DWS).
Using KNN to Predict Whether James Will Make the 2017 All-NBA team
Dataset: all key rotation players in the 2015-2016 NBA season (more than 30 games as a starter)
Discriminators: points, minutes played, PER, and efficiency, where efficiency = (PTS+AST+TRB+STL+BLK) - (FGA-FG) - (FTA-FT) - TOV
Response: whether or not a player is selected to the All-NBA team (0 or 1)
Total number of players in all-NBA team in 2016: 10. Total
Divide the dataset equally into training and test sets, fit the KNN model to the training data, and use it to predict on the test data.
When k = 5, the model performs best.
Result:
Confusion matrix:
Predict whether LeBron James will be on the 2017 All-NBA team: the result is 1, which is yes. He will be on the All-NBA team.
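The KNN procedure described in these slides (efficiency as a derived feature, k = 5, a confusion matrix on held-out data) can be sketched in plain Python. This is an illustrative sketch under assumptions. The stat line, feature vectors, and labels are invented, the feature vectors are assumed to be already scaled to comparable ranges, and the function names are not from the original project.

```python
import math
from collections import Counter

def efficiency(p):
    """Efficiency as defined in the slides: positive counting stats minus misses and turnovers."""
    return ((p["PTS"] + p["AST"] + p["TRB"] + p["STL"] + p["BLK"])
            - (p["FGA"] - p["FG"]) - (p["FTA"] - p["FT"]) - p["TOV"])

def knn_predict(train_x, train_y, x, k=5):
    """Majority label among the k training points closest to x (Euclidean distance)."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]

def confusion_matrix(actual, predicted):
    """2x2 matrix m[actual][predicted] for 0/1 labels."""
    m = [[0, 0], [0, 0]]
    for a, p in zip(actual, predicted):
        m[a][p] += 1
    return m

# Hypothetical season stat line (not a real player's numbers)
player = {"PTS": 2000, "AST": 500, "TRB": 600, "STL": 100, "BLK": 50,
          "FGA": 1400, "FG": 700, "FTA": 500, "FT": 400, "TOV": 250}
print("efficiency:", efficiency(player))

# Toy, already-scaled feature vectors; label 1 = selected, 0 = not selected
train_x = [(1.0, 1.0), (1.5, 0.5), (0.5, 1.2), (1.2, 1.8), (0.8, 0.7), (2.0, 1.0),
           (9.0, 9.5), (9.5, 9.0), (10.0, 10.2), (10.5, 9.8), (9.8, 10.5)]
train_y = [0] * 6 + [1] * 5
test_x = [(9.9, 9.9), (1.0, 1.0)]
test_y = [1, 0]

pred = [knn_predict(train_x, train_y, x, k=5) for x in test_x]
print("predictions:", pred)
print("confusion matrix:", confusion_matrix(test_y, pred))
```

Since KNN relies on distances, features on very different scales (for example total points versus PER) would normally be standardized first, otherwise the largest-valued feature dominates the neighbour search.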
Conclusion
Player Positions:
● Small forwards hold a critical position on a team
○ They are versatile and balanced players
● A decision tree can show which player position does best on certain variables
○ Random forest works better than a decision tree at preventing overfitting
All Players including LeBron James:
● Shot attempts and field goals show a linear relationship
● We successfully predicted that LeBron James has a top PER compared to other players and that his PTS falls within our predicted range in the 2016-2017 season
LeBron James:
● James’s rebounds and assists reach their highest in the 2016-2017 season.
● We predict that LeBron James will be selected to the 2017 All-NBA team
● For his age, he is the best player of any era using OWS and DWS as our measurement
Critical Questions
1. When we used decision tree and random forest to predict player positions’ efficiency, would it be useful for analyzing individual players as well? Why or why not?
2. When we perform KNN classification and selection, is there another way of choosing variables, since we have too many of them? Why or why not?
3. We want to predict how great LeBron James is. When we used LDA to predict LeBron James’s Player Efficiency Rating (PER) and total points (PTS), we got high accuracy. Could this model be used to predict other variables?