gauravjagtap-2611 / Archtype-of-NBA-Players

This project aims to cluster players by their performance on the offensive side of the court and to give the winning proportion of these groups in a team.

ARCHETYPES OF NATIONAL BASKETBALL ASSOCIATION (NBA) PLAYERS

Motivation

For years, players have been identified by their defensive position on court and, informally, by names such as Rim Runner or Spot Up Big, and by their ball movement. These are the results Wikipedia gives when we search for types of players in basketball:

  • 1 – Point guard
  • 2 – Shooting guard
  • 3 – Small forward
  • 4 – Power forward
  • 5 – Center

But these are defensive positions, not player types, and the official NBA website gives similar results.

Thus there is an urgent need to analyse players by their performance on court rather than by position, in order to understand players and teams better and to take decisive action toward improvement.

Scraped Data

We scrape the data from the Basketball Reference website and refer to the official NBA website for further help.
To access the data on the website, we used urlopen (from Python's urllib.request module) and the BeautifulSoup library.
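
As a rough illustration of this step, the sketch below parses the first HTML table on a page into a header and a list of rows. The function names `parse_stats_table` and `fetch_stats`, the sample HTML, and the assumed table layout (a header row of `<th>` cells followed by rows of `<td>` cells) are hypothetical; the notebook's actual scraping code may differ.

```python
from urllib.request import urlopen

from bs4 import BeautifulSoup


def parse_stats_table(html):
    """Return (header, rows) from the first HTML table found in `html`."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    # Column names come from the <th> cells of the first table row.
    header = [th.get_text(strip=True) for th in table.find("tr").find_all("th")]
    rows = []
    for tr in table.find_all("tr")[1:]:
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # skip rows that contain no <td> cells, e.g. repeated headers
            rows.append(cells)
    return header, rows


def fetch_stats(url):
    """Download a page and parse its first table (performs a network request)."""
    with urlopen(url) as response:
        return parse_stats_table(response.read())


# Offline demonstration on a tiny hand-written table (assumed layout).
sample_html = """
<table>
  <tr><th>Player</th><th>Pos</th><th>PTS</th></tr>
  <tr><td>Steven Adams</td><td>C</td><td>13.9</td></tr>
  <tr><td>Bam Adebayo</td><td>C</td><td>8.9</td></tr>
</table>
"""
header, rows = parse_stats_table(sample_html)
print(header)
print(rows)
```

The parsing is kept separate from the download so that it can be tried on a saved page without hitting the website.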

Player Pos Age Tm G GS MP FG FGA FG% 3P 3PA 3P% 2P 2PA 2P% eFG% FT FTA FT% ORB DRB TRB AST STL BLK TOV PF PTS
0 Álex Abrines SG 25 OKC 31 2 19.0 1.8 5.1 .357 1.3 4.1 .323 0.5 1.0 .500 .487 0.4 0.4 .923 0.2 1.4 1.5 0.6 0.5 0.2 0.5 1.7 5.3
1 Quincy Acy PF 28 PHO 10 0 12.3 0.4 1.8 .222 0.2 1.5 .133 0.2 0.3 .667 .278 0.7 1.0 .700 0.3 2.2 2.5 0.8 0.1 0.4 0.4 2.4 1.7
2 Jaylen Adams PG 22 ATL 34 1 12.6 1.1 3.2 .345 0.7 2.2 .338 0.4 1.1 .361 .459 0.2 0.3 .778 0.3 1.4 1.8 1.9 0.4 0.1 0.8 1.3 3.2
3 Steven Adams C 25 OKC 80 80 33.4 6.0 10.1 .595 0.0 0.0 .000 6.0 10.1 .596 .595 1.8 3.7 .500 4.9 4.6 9.5 1.6 1.5 1.0 1.7 2.6 13.9
4 Bam Adebayo C 21 MIA 82 28 23.3 3.4 5.9 .576 0.0 0.2 .200 3.4 5.7 .588 .579 2.0 2.8 .735 2.0 5.3 7.3 2.2 0.9 0.8 1.5 2.5 8.9
5 Deng Adel SF 21 CLE 19 3 10.2 0.6 1.9 .306 0.3 1.2 .261 0.3 0.7 .385 .389 0.2 0.2 1.000 0.2 0.8 1.0 0.3 0.1 0.2 0.3 0.7 1.7
6 DeVaughn Akoon-Purcell SG 25 DEN 7 0 3.1 0.4 1.4 .300 0.0 0.6 .000 0.4 0.9 .500 .300 0.1 0.3 .500 0.1 0.4 0.6 0.9 0.3 0.0 0.3 0.6 1.0
7 LaMarcus Aldridge C 33 SAS 81 81 33.2 8.4 16.3 .519 0.1 0.5 .238 8.3 15.8 .528 .522 4.3 5.1 .847 3.1 6.1 9.2 2.4 0.5 1.3 1.8 2.2 21.3
8 Rawle Alkins SG 21 CHI 10 1 12.0 1.3 3.9 .333 0.3 1.2 .250 1.0 2.7 .370 .372 0.8 1.2 .667 1.1 1.5 2.6 1.3 0.1 0.0 0.8 0.7 3.7
9 Grayson Allen SG 23 UTA 38 2 10.9 1.8 4.7 .376 0.8 2.6 .323 0.9 2.1 .443 .466 1.2 1.6 .750 0.1 0.5 0.6 0.7 0.2 0.2 0.9 1.2 5.6

Data Pre-processing, Feature Engineering and EDA

A detailed analysis can be found in the Jupyter notebook attached above, Archtype of NBA Players; here are some findings from this section.
This correlogram shows the correlation between all the features in the data.

correlogram

I also tried to model the linear relationship between all the scoring variables in the processed data.

scatter plot
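
The correlation matrix behind a correlogram can be computed along these lines. This is a minimal sketch assuming the statistics are held in a pandas DataFrame; the three rows and four columns are copied from the per-game table above purely for illustration, and the variable names are hypothetical.

```python
import pandas as pd

# Three rows copied from the per-game table above (a subset of columns).
stats = pd.DataFrame(
    {
        "FGA": [10.1, 5.9, 16.3],
        "3PA": [0.0, 0.2, 0.5],
        "AST": [1.6, 2.2, 2.4],
        "PTS": [13.9, 8.9, 21.3],
    },
    index=["Steven Adams", "Bam Adebayo", "LaMarcus Aldridge"],
)

# Pairwise Pearson correlation between every pair of numeric columns.
corr = stats.corr()
print(corr.round(2))
```

With only three players the coefficients are not meaningful; on the full season table the same call gives the matrix that a correlogram visualises.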

Clustering Players on the Basis of Their Similarities and Dissimilarities

Clustering can be broadly classified into the following four types, each of which uses its own technique to measure differences between data points:

  • Exclusive Clustering
  • Overlapping Clustering
  • Hierarchical Clustering
  • Probabilistic Clustering

We will use all of the above methods except Overlapping Clustering.

- K-Means Clustering

We use the elbow plot and the silhouette coefficient to decide the number of clusters.

Elbow Plot

elbow plot

Silhouette Plot & Coefficient for different values of clusters

Silhouette
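
A sketch of how the two diagnostics can be computed with scikit-learn follows. The data is synthetic (three well-separated groups standing in for the player feature matrix), and the range of k, the scaling step, and the random seeds are assumptions rather than the notebook's settings.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the player feature matrix: three well-separated groups.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [8.0, 8.0], [0.0, 8.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(40, 2)) for c in centers])
X = StandardScaler().fit_transform(X)

inertias = {}
silhouettes = {}
for k in range(2, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    # Within-cluster sum of squares: the quantity plotted in an elbow plot.
    inertias[k] = model.inertia_
    # Mean silhouette coefficient over all points for this k.
    silhouettes[k] = silhouette_score(X, model.labels_)

best_k = max(silhouettes, key=silhouettes.get)
print("inertia by k:", inertias)
print("silhouette by k:", silhouettes)
print("k with the highest silhouette:", best_k)
```

Inertia always falls as k grows, so the elbow is read by eye, while the silhouette coefficient gives a single number per k that can be compared directly.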

- Hierarchical Agglomerative Clustering

This is a "bottom-up" approach: each observation starts in its own cluster, and pairs of clusters are merged as one moves up the hierarchy.
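
The bottom-up merging can be seen on a toy example with scikit-learn's `AgglomerativeClustering`. The six points are illustrative, and Ward linkage is an assumption; the linkage used in the notebook is not stated here.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Six toy points forming two obvious groups (illustrative data, not player stats).
X = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
              [5.0, 5.0], [5.1, 4.8], [4.9, 5.2]])

# Ward linkage merges, at each step, the pair of clusters whose union
# increases the within-cluster variance the least.
model = AgglomerativeClustering(n_clusters=2, linkage="ward").fit(X)
print(model.labels_)     # cluster index assigned to each point
print(model.children_)   # the pair of nodes merged at each of the n - 1 steps
```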

- Gaussian Mixture Models

Comparison Between These Methods

We will move forward in our analysis with the clustering technique that has the highest silhouette coefficient.

Comparison
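
One way to make this comparison is to fit each method with the same number of clusters and score the resulting labels. The sketch below does so on synthetic data; the choice of four clusters, the default hyperparameters, and the variable names are assumptions, not the notebook's configuration.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for the processed player features: four separated groups.
rng = np.random.default_rng(1)
centers = np.array([[0.0, 0.0], [6.0, 6.0], [0.0, 6.0], [6.0, 0.0]])
X = np.vstack([c + rng.normal(scale=0.6, size=(30, 2)) for c in centers])

k = 4  # assumed number of clusters
labelings = {
    "k-means": KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X),
    "agglomerative": AgglomerativeClustering(n_clusters=k).fit_predict(X),
    "gaussian mixture": GaussianMixture(n_components=k, random_state=0).fit_predict(X),
}

# Score every labelling with the same metric and keep the best one.
scores = {name: silhouette_score(X, labels) for name, labels in labelings.items()}
best_method = max(scores, key=scores.get)
print(scores)
print("highest silhouette:", best_method)
```

The Gaussian mixture model assigns each point a probability for every component; `fit_predict` keeps only the most likely component so that its labels can be scored like the others.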

Results:

Parallel Plot of all players in NBA season 2018-19

A parallel plot gives a very good visualisation of the similarities and dissimilarities between clusters. Here every line describes a player and every colour describes a cluster. This is an interactive plot, so you can slide along the axis of any feature, fix its limits, and analyse the cluster you wish.

Analysis of Results

boxplot

Conclusion

- From the above boxplot we obtain valuable information that NBA coaches can use to review their team and strengthen it further by removing players and drafting new players of a particular type.

- If we take the mean as the optimum value, then from the above boxplot we can conclude that a strong team should have the following proportion of each type of player:

TYPE 1 : 25%
TYPE 2 : 46%-49%
TYPE 3 : 15%
TYPE 4 : less than 10%
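
Proportions of this kind can be derived by computing each type's share within every team and then averaging those shares across teams. The sketch below uses only the standard library; the team codes and type assignments are hypothetical, and which teams enter the average in the notebook is not stated here.

```python
from collections import Counter, defaultdict

# Hypothetical (team, player_type) pairs; the real ones come from the clustering step.
roster = [
    ("AAA", 1), ("AAA", 2), ("AAA", 2), ("AAA", 3),
    ("BBB", 1), ("BBB", 2), ("BBB", 2), ("BBB", 4),
]

# Count how many players of each type every team has.
counts = defaultdict(Counter)
for team, player_type in roster:
    counts[team][player_type] += 1

types = sorted({t for _, t in roster})

# Share of each type within each team (0 when a team has no player of that type).
team_share = {
    team: {t: c[t] / sum(c.values()) for t in types}
    for team, c in counts.items()
}

# Mean share of each type across teams.
mean_share = {
    t: sum(shares[t] for shares in team_share.values()) / len(team_share)
    for t in types
}
print(mean_share)
```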

- Every team consists of roughly 16 to 18 players, of whom only 5 are on court at any time. Given the proportions above, it is very important that a team have or draft the elite players of every type in order to form a strong team. Many other factors can also decide the fate of a team in a season, for example: the way coaches make decisions on court according to the situation, the team's defence, the understanding between the players, and many more.

- What I have concluded is just one aspect of improvement; it will not always guarantee a team's success.

Languages

Language:Jupyter Notebook 100.0%