opunsoars / soccer_analytics

Soccer analytics

soccer_analytics is a Python project that aims to facilitate analytics projects in soccer and to serve as a starting point for them.

  • Extensive number of helper functions for visualization and animation of soccer events
  • Calculation of relevant soccer KPIs for event data (tracking data to come)
  • Pre-processed Wyscout event data lets you dive into the analyses immediately
  • Detailed tutorials in the form of notebooks that help you get started with this project and soccer analytics in general
  • Intended as a starting point for projects rather than a "hidden" library
  • Set up so that functions are easily extendable
  • All plots and animations are created with plotly and can therefore be easily integrated into Dash dashboards
  • Supports Python 3.6 - 3.8

Tutorial

This project includes a number of notebooks that serve as tutorials on how to use the helper functions and might be a good starting point for soccer analytics in general. The notebooks can be found here and I recommend going through them in the following order:

  1. Exploratory analysis event data: This notebook gives you an overview of the pre-processed Wyscout data and runs a rudimentary exploratory analysis using pandas-profiling

  2. Goal kick analysis: In this notebook we identify the best teams w.r.t. goal kicks in the Bundesliga. Along the way we learn how to:

    • Use bar plots in plotly
    • Visualize events on a soccer field through graphs and animations
    • Draw heatmaps on a soccer field
  3. Passing analysis: We continue our journey by looking at passes between players and analyzing one match in more detail. Technically, we learn how to use the helper functions to:

    • Compute statistics efficiently
    • Draw position plots of players
    • Visualize passing lines and passing zones
  4. Expected goal model with logistic regression: While the previous notebooks were mostly about visualization, in this notebook we start looking into machine learning. We jointly build an expected goal model using logistic regression and learn about fundamentals of machine learning, e.g.:

    • Feature engineering
    • Multivariate analysis
    • Metrics
    • Model interpretation
  5. Challenges using gradient boosters: In this rather technical notebook we look into some of the challenges that often arise in real-life situations when using gradient boosters like LightGBM or XGBoost, such as:

    • Overfitting
    • Feature interpretation
    • Monotonicity
    • Extrapolation
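
To give a flavour of the feature engineering step in the expected goal notebook, here is a minimal sketch of two features commonly used in such models: the distance to the goal and the angle under which the shooter sees the goal. It is not taken from the notebooks. The function names, the assumed 105 m x 68 m pitch, and the assumption that the attacked goal sits at x=100, y=50 in percentage coordinates are illustrative choices, and the coefficients passed to the logistic function would have to come from fitting a model on the shot data.

```python
import math

# Assumption: a 105 m x 68 m pitch. Event positions are taken to be
# percentages (0-100) of pitch length and width, so a real-world size
# has to be assumed to obtain metres.
PITCH_LENGTH_M = 105.0
PITCH_WIDTH_M = 68.0
GOAL_WIDTH_M = 7.32  # distance between the goalposts


def shot_features(x_pct, y_pct):
    """Return (distance in metres, goal angle in radians) for a shot.

    Hypothetical helper: assumes the attacked goal is centred at
    x=100, y=50 in percentage coordinates.
    """
    # Distance to the goal line and lateral offset from the goal centre
    dx = (100.0 - x_pct) / 100.0 * PITCH_LENGTH_M
    dy = (y_pct - 50.0) / 100.0 * PITCH_WIDTH_M
    distance = math.hypot(dx, dy)
    # Angle subtended by the two goalposts as seen from the shot location
    angle = math.atan2(
        GOAL_WIDTH_M * dx,
        dx ** 2 + dy ** 2 - (GOAL_WIDTH_M / 2) ** 2,
    )
    return distance, angle


def xg_probability(distance, angle, intercept, w_distance, w_angle):
    """Logistic model: map a linear score to a goal probability.

    The intercept and weights are inputs here; in practice they are
    the result of fitting a logistic regression on labelled shots.
    """
    score = intercept + w_distance * distance + w_angle * angle
    return 1.0 / (1.0 + math.exp(-score))


if __name__ == "__main__":
    # A shot from the penalty spot, 11 m in front of the goal centre
    penalty_x = 100.0 - 11.0 / PITCH_LENGTH_M * 100.0
    distance, angle = shot_features(penalty_x, 50.0)
    print(f"distance: {distance:.2f} m, angle: {math.degrees(angle):.1f} deg")
```

Both features are cheap to compute and easy to interpret, which makes them a convenient starting point before adding further variables.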

Examples

Event visualisation

Heatmap

Passing map

Polar charts

Installation

If you are new to Python and soccer analytics, I recommend downloading the Anaconda distribution and following the instructions under Conda.

Conda

  1. Open the Anaconda Prompt and cd to the project folder
  2. Create a new conda environment "soccer_analytics"
    conda create -n soccer_analytics python=3.6
  3. Activate the conda environment
    conda activate soccer_analytics
  4. Install all required packages
    pip install -r requirements.txt
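
If you want to double-check that the activated environment uses a supported interpreter, a small script along the following lines can help. It is only a sketch and not part of the project: the function name is hypothetical and the version bounds are simply the 3.6 - 3.8 range stated above.

```python
import sys

# Supported range as stated in this README (Python 3.6 - 3.8)
SUPPORTED_MIN = (3, 6)
SUPPORTED_MAX = (3, 8)


def is_supported(version=None):
    """Return True if the (major, minor) version lies in the supported range.

    Hypothetical helper: defaults to the running interpreter's version.
    """
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return SUPPORTED_MIN <= major_minor <= SUPPORTED_MAX


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported() else "outside the supported range"
    print(f"Python {current} is {status}")
```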

Acknowledgements

Data sources

Event data: Wyscout
