6ord / NFLfootball

Match up analysis and recommendation

September 22, 2019 Update

  1. nflTeamBoxscoreScrape_v3.py now captures the league schedule and generates game boxscore links from it, so the user no longer has to prepare a schedule file to feed into the script. The snapcountScrape() method is still under construction.
  2. Shiny App moved to https://datacritic6ord.shinyapps.io/nflmatchups_2019_v13/
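The schedule-driven link generation described in item 1 can be pictured roughly as follows. This is a minimal sketch, not the code in nflTeamBoxscoreScrape_v3.py: the URL template, the `build_boxscore_links` name, the team codes, and the schedule row layout (game date, road team, home team) are all assumptions made for illustration.

```python
from datetime import date

# Hypothetical URL template (assumption); the real site's link format may differ.
URL_TEMPLATE = "https://example.com/nfl/boxscore/NFL_{day}_{road}@{home}/"


def build_boxscore_links(schedule):
    """Turn (game_date, road_team, home_team) rows into boxscore URLs."""
    links = []
    for game_date, road, home in schedule:
        day = game_date.strftime("%Y%m%d")  # e.g. 20190922
        links.append(URL_TEMPLATE.format(day=day, road=road, home=home))
    return links


# Illustrative schedule rows (assumed layout, not scraped data).
schedule = [
    (date(2019, 9, 22), "AAA", "BBB"),
    (date(2019, 9, 23), "CCC", "DDD"),
]
links = build_boxscore_links(schedule)
```

Each generated link would then be fetched and parsed by the scraping step, one game at a time.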

Full write up here: https://datacritics.com/2018/11/06/data-science-fantasy-football/

November 6, 2018 Update

  1. nflTeamBoxscoreScrape_v2.py now includes snapcountScrape() method that collects snap counts
  2. Shiny App moved to https://datacritic6ord.shinyapps.io/nflmatchups_v22/

October 24, 2018 Update

  1. nflTeamBoxscoreFunctions.r and app.r updated. The user-set parameters (current week and number of past weeks on which to build the analysis) are now contained in a list called 'vars', which is initialized in nflTeamBoxscoreFunctions.r.
  2. nflBoxscoreScrape2018.csv updated using the Python scraping script and now includes week 7 boxscores. (The Read Me will not be updated with every weekly data refresh. Please check the published Shiny app, see the October 22, 2018 update, for the latest.)

October 22, 2018 Update

  1. Renamed nflTeamBoxscoreOutput.r to app.r for shinyapp server deploy
  2. Absolute local paths replaced with relative paths in nflTeamBoxscoreStats.r & app.r
  3. shinyapp deployed on https://datacritic6ord.shinyapps.io/nflmatchups_v1/

October 15, 2018 Update

Project now scrapes team boxscores a week at a time, in appending fashion. The following is the end-to-end process from data collection to match-up recommendation/ranking:

  1. boxscoreScrapeWk(wkNum) function in nflTeamBoxscoreScrape_v2.py collects boxscore data. See the September 3, 2018 update for more detail.
  2. nflTeamBoxscoreOutput.r sources:
    • nflTeamBoxscoreFunctions.r to build user defined functions
    • nflTeamBoxscoreStats.r to import, clean and build teams' 'allowed' stats
    • nflTeamBoxscoreAnalysis.r to extract recent weeks and build pair-wise variables for current week matchups
  3. nflTeamBoxscoreOutput.r uses Shiny package to output recommendations based on average ranking of pair-wise variable groups.
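The "average ranking" idea in step 3 can be sketched as below. The project's actual logic lives in the R scripts; Python is used here only for consistency with the scraper, and the function names, the variable names (`rush_edge`, `pass_edge`), and all numbers are hypothetical.

```python
def rank_desc(values):
    """Rank values so the largest gets rank 1 (ties share the best rank)."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]


def average_rank(matchups, variables):
    """Average each team's rank across the given pair-wise variables."""
    teams = list(matchups)
    totals = {team: 0 for team in teams}
    for var in variables:
        ranks = rank_desc([matchups[team][var] for team in teams])
        for team, rank in zip(teams, ranks):
            totals[team] += rank
    return {team: totals[team] / len(variables) for team in teams}


# Made-up pair-wise variables for this week's matchups (assumed definition:
# a larger value means a more favorable matchup for that team).
matchups = {
    "Team A": {"rush_edge": 30.0, "pass_edge": -10.0},
    "Team B": {"rush_edge": 5.0, "pass_edge": 40.0},
    "Team C": {"rush_edge": -20.0, "pass_edge": 15.0},
}
scores = average_rank(matchups, ["rush_edge", "pass_edge"])
best_first = sorted(scores, key=scores.get)  # lowest average rank first
```

Averaging ranks rather than raw values keeps variables on different scales (yards, downs, minutes) from dominating one another.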

NOTE: SOME PRIMETIME GAMES ARE TBD on 2018_schedule_single.csv file. Will need updating later on in the season.

Future Enhancement:

  • merge in and analyze players' snap count for player picking as well as team picking
  • public online access to Shiny app

September 3, 2018 Update

2018_schedule_single.csv

nflTeamBoxscoreScrape_v2.py

  • Updated to contain two scraping functions:
    1. RegSeason2017BoxscoreScrape()

      • single function that scrapes boxscores for entire regular season. See original documentation on the nflBoxscoreScrape() function for more details.
      • Uses the following Python Modules: re, requests, bs4, csv, datetime
    2. boxscoreScrapeWk(wkNum)

      • scrapes boxscores for a given week (wkNum) of the season, appends output to .csv file in working path
      • Uses the following Python Modules: re, requests, bs4, csv, datetime
      • Warning: Column headings are appended along with the data and will need to be removed when importing for analysis.
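One way to handle the repeated headings at import time is to drop any row identical to the first header row. The following is a minimal sketch under that assumption; the function name and the column names in the sample are hypothetical, not the scraper's real output layout.

```python
import csv
import io


def read_without_repeated_headers(handle):
    """Return data rows as dicts, skipping header rows repeated by appending."""
    reader = csv.reader(handle)
    header = next(reader)  # the first line is the real header
    rows = []
    for row in reader:
        if not row or row == header:
            continue  # blank line or a heading re-appended by a later run
        rows.append(dict(zip(header, row)))
    return rows


# Stand-in for a weekly-appended file (column names are assumptions).
sample = io.StringIO(
    "week,team,rush_yds\n"
    "1,AAA,120\n"
    "week,team,rush_yds\n"
    "2,AAA,95\n"
)
rows = read_without_repeated_headers(sample)
```

With a real file, pass `open(path, newline="")` instead of the in-memory sample.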

Work In Progress: nflTeamBoxscoreStats.r

  • The next iteration will convert scraped character data to numeric types where appropriate, organize the results into a data frame, and incorporate analysis for recommendations.

nflFootball

Web-scraped NFL (American football) stats, plus match-up analysis and recommendation. See the Projects tab.

2017_schedule.csv

2017_TNFMNF.csv

  • Manually written text file identifying the dates of non-Sunday games (Thursday Night Football & Monday Night Football), including week and teams.

nflTeamBoxscoreStats.r

  • Pre-Python Scrape:

    • Uses 2017_schedule.csv and 2017_TNFMNF.csv to construct a dataframe with Game Date, Road Team and Home Team. This dataframe is then written to 2017_schedule_single.csv.
  • Post-Python Scrape:

    • Analysis Script In Progress

2017_schedule_single.csv

nflTeamBoxscoreScrape.py

  • Contains a single function nflBoxscoreScrape(), which scrapes box score Team Stats from CBSSports.com. Data is captured in a List of Lists (List of games, where each element is a List of stat categories such as Team Rushed Yards, Team Passed Yards, number of 1st Downs, Time of Possession etc.). This data structure is then written to a flat nflBoxscoreScrape.csv, where each row is a team's stat observations per game.
  • Games in scope depend on 2017_schedule_single.csv
  • Uses the following Python Modules: re, requests, bs4, csv, datetime
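The list-of-lists structure and its flat CSV output can be illustrated with the standard csv module. This is a sketch only: the column names, team codes, and values are made up, and `write_boxscores` is a hypothetical helper rather than a function from nflTeamBoxscoreScrape.py.

```python
import csv
import io

# One inner list per team per game; the stat categories here are assumptions.
header = ["team", "rush_yds", "pass_yds", "first_downs", "time_of_possession"]
games = [
    ["AAA", "120", "250", "21", "31:10"],
    ["BBB", "85", "310", "19", "28:50"],
]


def write_boxscores(handle, header, games):
    """Write the header once, then one CSV row per team-game observation."""
    writer = csv.writer(handle)
    writer.writerow(header)
    writer.writerows(games)


# An in-memory buffer stands in for the output file.
buffer = io.StringIO()
write_boxscores(buffer, header, games)
lines = buffer.getvalue().splitlines()
```

When writing to disk, open the file with `open(path, "w", newline="")` so the csv module controls line endings.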

Languages

Python 58.9%, R 41.1%