huzaifakhan04 / exploratory-data-analysis-of-european-football-database-using-sql-and-python

This repository contains the results of an exploratory data analysis, with visualisations, performed on nearly 11,000,000 entries in the European Soccer (Football) Database from Kaggle. The database is built on the SQLite database engine, and the analysis is written in SQL (SQLite) and run from Python.

Exploratory Data Analysis of European Football Database (2008 ‒ 2016) Using SQL & Python:

Dependencies:

What is the European Soccer Database?

The European Soccer (Football) Database is an open-source database built on the SQLite database engine and available on Kaggle for use in data analysis and machine learning projects. It contains information, sourced from multiple websites, about more than 25,000 matches, more than 10,000 players, and nearly 300 teams across football leagues in 11 European countries between the 2008/2009 and 2015/2016 seasons. The database also contains player attributes and ratings, updated weekly and sourced from the corresponding yearly instalments of the FIFA franchise from Electronic Arts (EA SPORTS), and betting odds from up to 10 odds providers.
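Because the database is a single SQLite file, it can be inspected from Python with the standard-library `sqlite3` module. The following is a minimal sketch of listing each table and its row count. The helper name `table_row_counts` and the toy in-memory tables are illustrative assumptions, not code from this repository; with the Kaggle download, the connection would point at the downloaded file instead (commonly named `database.sqlite`, which is also an assumption here).

```python
import sqlite3


def table_row_counts(conn):
    """Return {table_name: row_count} for every user table in the database."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    # The names come from sqlite_master itself, so they are quoted as
    # identifiers rather than passed as bound parameters.
    return {
        name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }


# Toy stand-in for the real file (hypothetical tables and rows).
# For the real data: conn = sqlite3.connect("database.sqlite")
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Team (team_api_id INTEGER, team_long_name TEXT);
    CREATE TABLE Player (player_api_id INTEGER, player_name TEXT);
    INSERT INTO Team VALUES (1, 'Team A'), (2, 'Team B');
    INSERT INTO Player VALUES (10, 'Player X');
    """
)

counts = table_row_counts(conn)
for name, count in counts.items():
    print(f"{name}: {count} rows")
```

Summing the counts over all tables gives a quick way to check the overall number of entries against the figure quoted above.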

Questions Explored:

  • What were the win percentage and the aggregated numbers of wins, losses, goals scored, and goals conceded for each team in each season of each league that it is associated with?
  • Which were the top ten teams with the most wins across all seasons and leagues during the entire duration?
  • Which team attributes played a major part in determining the performance (aggregated goals scored) of the teams?
  • How did the aggregated goals scored for the top five highest-scoring teams in a specific league (Italy Serie A) change over the seasons?
  • Did the Body Mass Index (BMI) of the players affect their performance score (overall rating)?
  • Who were the top ten players with the highest performance scores (overall ratings) across all seasons and leagues during the entire duration?
  • How do the player attributes of the consistently highest-rated players compare to the average player attributes?
  • Which player attributes set the consistently highest-rated player apart from the average player, and by how much?
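As an illustration of how the first question above might be approached, the sketch below computes wins, losses, goals scored, goals conceded, and win percentage per team and season with a single SQL query run from Python. It is not the notebook's actual query. The `Match` table and its column names (`season`, `home_team_api_id`, `away_team_api_id`, `home_team_goal`, `away_team_goal`) are assumed to mirror the Kaggle schema, the league dimension is omitted for brevity, and the rows are made-up toy data.

```python
import sqlite3

# Toy in-memory table standing in for the real Match table (assumed schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Match (
        season TEXT,
        home_team_api_id INTEGER,
        away_team_api_id INTEGER,
        home_team_goal INTEGER,
        away_team_goal INTEGER
    );
    INSERT INTO Match VALUES
        ('2008/2009', 1, 2, 2, 0),
        ('2008/2009', 2, 1, 1, 1),
        ('2008/2009', 1, 3, 0, 1),
        ('2008/2009', 3, 2, 3, 2);
    """
)

# Each match is unpivoted into two rows (one per participating team) so that
# home and away results can be aggregated together.
QUERY = """
WITH team_match AS (
    SELECT season,
           home_team_api_id AS team_id,
           home_team_goal   AS goals_for,
           away_team_goal   AS goals_against
    FROM Match
    UNION ALL
    SELECT season, away_team_api_id, away_team_goal, home_team_goal
    FROM Match
)
SELECT season,
       team_id,
       SUM(goals_for > goals_against) AS wins,
       SUM(goals_for < goals_against) AS losses,
       SUM(goals_for)                 AS goals_scored,
       SUM(goals_against)             AS goals_conceded,
       ROUND(100.0 * SUM(goals_for > goals_against) / COUNT(*), 1) AS win_pct
FROM team_match
GROUP BY season, team_id
ORDER BY season, team_id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

In SQLite a comparison evaluates to 1 or 0, so `SUM(goals_for > goals_against)` counts wins directly; draws are simply the matches counted in neither `wins` nor `losses`.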

License: BSD 3-Clause "New" or "Revised" License


Languages

Language: Jupyter Notebook 100.0%