mathurinm / datacamp-olympics

Datacamp - Olympic Games 🥇 🥈 🥉

This repository contains data on the Olympic Games (up to the 2016 Rio Games):
data/
  athlete_events.csv   # one row ~ one participation
  noc_regions.csv    # maps each National Olympic Committee (NOC) to its region
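
As a starting point, the two files can be joined so that each participation row carries its region. The sketch below is only illustrative: it uses small in-memory samples instead of the real files, and the column names (`Name`, `NOC`, `Year`, `Sport`, `Medal`, `region`) are assumptions that should be checked against the actual CSV headers.

```python
import csv
import io

# Tiny in-memory stand-ins for the two files. The column names used here
# are assumptions, not something this README confirms.
EVENTS_CSV = """Name,NOC,Year,Sport,Medal
Athlete A,FRA,2016,Judo,Gold
Athlete B,USA,2016,Athletics,NA
Athlete C,XXX,2016,Athletics,Silver
"""
REGIONS_CSV = """NOC,region
FRA,France
USA,USA
"""


def read_rows(handle):
    """Read a CSV file object into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def add_region(events, regions):
    """Attach a 'region' field to each participation row via its NOC code.

    Rows whose NOC code has no match get region=None (a left join).
    """
    noc_to_region = {row["NOC"]: row["region"] for row in regions}
    return [{**row, "region": noc_to_region.get(row["NOC"])} for row in events]


events = read_rows(io.StringIO(EVENTS_CSV))
regions = read_rows(io.StringIO(REGIONS_CSV))
joined = add_region(events, regions)

for row in joined:
    print(row["Name"], row["NOC"], row["region"])
```

With the real data, the `io.StringIO(...)` objects would be replaced by files opened with `open("data/athlete_events.csv", newline="")` and `open("data/noc_regions.csv", newline="")`. Keeping unmatched codes as `None` rather than dropping them makes it easy to spot NOC codes that are missing from the regions file.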

Many questions can be answered with this data. Here are a few examples:

  • Dominating countries. Which countries currently dominate the Olympic Games? Thanks to which events? How has this evolved over the years? Which countries have the highest success rate relative to their number of participations?
  • Winning strategies. Which attribute is most correlated with performance: country, age, height or weight? An analysis targeted at Athletics events.
  • Participants' morphology. How has the morphology of skiers changed over the years? Does it seem related to performance? Break down by event.
  • Sports availability. When did Judo events start for men and for women? Are there other sports with similar disparities? Are there sports that were once in the Olympics and have since disappeared?
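
To make the "rate of success compared to participation" idea from the first question concrete, here is one possible definition: the number of medal-winning participations divided by the total number of participations, per NOC. This is a minimal sketch on made-up rows; the `NOC` and `Medal` column names and the use of `"NA"` for "no medal" are assumptions to verify against the real file.

```python
from collections import Counter

# Hypothetical participation rows (one dict per row of athlete_events.csv).
# The NOC codes below are invented for illustration.
rows = [
    {"NOC": "AAA", "Medal": "Gold"},
    {"NOC": "AAA", "Medal": "NA"},
    {"NOC": "AAA", "Medal": "NA"},
    {"NOC": "AAA", "Medal": "Bronze"},
    {"NOC": "BBB", "Medal": "Silver"},
    {"NOC": "BBB", "Medal": "NA"},
    {"NOC": "BBB", "Medal": "NA"},
    {"NOC": "BBB", "Medal": "NA"},
]


def success_rates(rows, no_medal=("NA", "")):
    """Return {NOC: medal-winning participations / all participations}."""
    participations = Counter()
    medals = Counter()
    for row in rows:
        participations[row["NOC"]] += 1
        if row["Medal"] not in no_medal:
            medals[row["NOC"]] += 1
    return {noc: medals[noc] / n for noc, n in participations.items()}


rates = success_rates(rows)

# Print NOCs from highest to lowest success rate.
for noc, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{noc}: {rate:.2f}")
```

Since one row is roughly one participation, a medal in a team event is counted once per team member under this definition. Counting one medal per event instead would require de-duplicating rows first, and which choice is right depends on the question being asked.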

Your task, should you choose to accept it, is to investigate one question in depth using descriptive statistics, data visualization, clustering and dimension reduction. You will work in groups of 3 or 4, and we will help you throughout the day. The question can be one of the suggestions above, but we strongly encourage you to pick one that your group is genuinely interested in 🚀
