Mihaiii / tennis_slam_pointbypoint

Point-by-point data for Grand Slams, 2011-15

Grand Slam Point-by-Point Data, 2011-15

This repo contains point-by-point data for most[1] main-draw singles Grand Slam matches since 2011. It was scraped from the four Grand Slam websites shortly after each event.

There are two files for each tournament. The file ending in "-matches.csv" contains metadata for all the matches included from the tournament, and the file ending in "-points.csv" contains all the available data for each point.
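As a rough sketch of how the two files per tournament might be paired up before any analysis, the snippet below scans a directory and groups files by whatever comes before the "-matches.csv" and "-points.csv" suffixes. The function name `pair_tournament_files` and the directory layout (all CSVs in one folder) are assumptions made for illustration, not something the repo specifies.

```python
from pathlib import Path


def pair_tournament_files(data_dir):
    """Map each tournament prefix to its (matches, points) file paths.

    Hypothetical helper: assumes all CSV files sit directly in data_dir.
    """
    data_dir = Path(data_dir)
    pairs = {}
    for matches_path in sorted(data_dir.glob("*-matches.csv")):
        # Strip the suffix to recover the shared tournament prefix.
        prefix = matches_path.name[: -len("-matches.csv")]
        points_path = data_dir / f"{prefix}-points.csv"
        # Only keep tournaments where both files are present.
        if points_path.exists():
            pairs[prefix] = (matches_path, points_path)
    return pairs
```

Calling `pair_tournament_files(".")` from the repo root would then return a dictionary keyed by tournament prefix, which can be looped over to load each pair with the CSV reader of your choice.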

Unfortunately, much of the most useful data isn't available for every tournament. (For instance, there is no first/second serve indicator for many events, and rally length isn't included after the first few tournaments.) Much of the metadata isn't available for the last few years of tournaments, and some point-level data (such as winner type) isn't represented consistently across the dataset.
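Because the available fields vary from tournament to tournament, it can help to check how well populated each column is before relying on it. The following is a minimal sketch using only the standard library; `column_coverage` is a hypothetical helper, and it treats only empty or whitespace-only cells as missing, which may not match how every file in the dataset encodes absent values.

```python
import csv
from collections import Counter


def column_coverage(csv_file):
    """Return {column: fraction of rows with a non-empty value}.

    csv_file is any open text file or file-like object containing CSV
    data with a header row. Hypothetical helper for illustration.
    """
    reader = csv.DictReader(csv_file)
    filled = Counter()
    total = 0
    for row in reader:
        total += 1
        for column, value in row.items():
            # Skip overflow cells (column is None) and short rows (value is None).
            if column is not None and value is not None and value.strip():
                filled[column] += 1
    columns = reader.fieldnames or []
    return {c: (filled[c] / total if total else 0.0) for c in columns}
```

A typical call would be `with open(path, newline="") as f: coverage = column_coverage(f)`, run once per "-points.csv" file; columns with coverage near zero are probably not recorded for that tournament.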

Still, there's a lot that can be done with this[2], especially since point-by-point tennis data is not readily available.

I'll try to keep this updated after each tournament, but I can't make any promises as to punctuality.

License

Tennis databases, files, and algorithms by Jeff Sackmann / Tennis Abstract is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Based on a work at https://github.com/JeffSackmann.

In other words: Attribution is required. Non-commercial use only.


[1] In general, this data is available for matches on courts with the Hawk-Eye system installed. The vast majority of missing matches are first-rounders.

[2] For instance:
http://heavytopspin.com/2011/09/16/win-probability-graphs-and-stats/
http://heavytopspin.com/2011/08/07/do-points-get-shorter-as-the-match-progresses/
http://heavytopspin.com/2011/06/06/fun-with-french-open-rally-length/
