calvindoesdata / messi_shots_kmeans

Basic K-Means clustering model for Lionel Messi's historic shot data.

K-Means Shot Clusters: Lionel Messi

A basic K-Means clustering model that identifies the optimal number of clusters for categorising the locations of shots taken during Lionel Messi's career.

This project contains a modular Python code base and some open source football event data generated by StatsBomb. The code performs the following functions:

  • Iterates through StatsBomb's event data, identifies all shots by Lionel Messi and extracts the (x, y) coordinates of each shot location
  • Performs the K-Means elbow test on the Messi shot data across a range of n_clusters values to find the optimal value
  • Runs the K-Means clustering algorithm on the Messi shot data using the optimal n_clusters value, then plots the clusters and their centres on a half-pitch
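
The steps above can be sketched roughly as follows. This is an illustrative outline, not the repository's actual code: the function names are hypothetical, scikit-learn and NumPy are assumed to be among the dependencies, and the event layout (a "type" name of "Shot", a "player" name and a two-element "location") and Messi's full player name are assumptions about StatsBomb's open data that should be checked against the files you use. Synthetic events stand in for the real data, and the half-pitch plot is left out.

```python
import numpy as np
from sklearn.cluster import KMeans

# Assumed StatsBomb player name; verify it against the event files you use.
MESSI = "Lionel Andrés Messi Cuccittini"


def extract_shot_locations(events, player_name=MESSI):
    """Return an (n_shots, 2) array of x, y shot locations for one player."""
    locations = [
        event["location"][:2]
        for event in events
        if event.get("type", {}).get("name") == "Shot"
        and event.get("player", {}).get("name") == player_name
        and "location" in event
    ]
    return np.array(locations, dtype=float).reshape(-1, 2)


def elbow_inertias(points, k_values, seed=0):
    """Fit K-Means once per k and return each fit's inertia
    (the within-cluster sum of squared distances)."""
    return [
        KMeans(n_clusters=k, n_init=10, random_state=seed).fit(points).inertia_
        for k in k_values
    ]


# Synthetic stand-in for StatsBomb events: three groups of 30 shots each,
# plus one pass and one shot by another player that should be filtered out.
rng = np.random.default_rng(0)
centres = np.array([[110.0, 40.0], [95.0, 25.0], [95.0, 55.0]])
sample_events = [
    {
        "type": {"name": "Shot"},
        "player": {"name": MESSI},
        "location": (centre + rng.normal(0, 1.5, 2)).tolist(),
    }
    for centre in centres
    for _ in range(30)
]
sample_events.append(
    {"type": {"name": "Pass"}, "player": {"name": MESSI}, "location": [60.0, 40.0]}
)
sample_events.append(
    {"type": {"name": "Shot"}, "player": {"name": "Another Player"}, "location": [100.0, 40.0]}
)

shots = extract_shot_locations(sample_events)

# Elbow test: inertia for each candidate number of clusters.
k_values = list(range(1, 7))
inertias = elbow_inertias(shots, k_values)
for k, inertia in zip(k_values, inertias):
    print(k, round(inertia, 1))

# The elbow is usually read off a plot of inertia against k; the synthetic
# data was built with three groups, so 3 is used here.
best_k = 3
model = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit(shots)
print(model.cluster_centers_)
```

With real data, `sample_events` would be replaced by the events loaded from StatsBomb's JSON files, and `model.labels_` and `model.cluster_centers_` would supply the colours and centre markers for the pitch plot.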

Install requirements

It is recommended to create a virtual environment and install the dependencies listed in requirements.txt. From the root of the cloned repository, run the following in the command line:

python3 -m venv my_venv
source my_venv/bin/activate
pip3 install -r requirements.txt

Accessing the code base

The code base can be accessed locally by cloning the repository. After navigating to your local directory of choice, run the following in the command line:

git clone https://github.com/calvindoesdata/messi_shots_kmeans.git

Alternatively, the project can be downloaded as a .zip from the repository home page by selecting 'Code' > 'Download ZIP'.

Running the project

This project can be run from the command line using the following commands:

cd .../messi_shots_kmeans/
python3 main.py
