kuhlenh / basketball

Public subset of my basketball github

The shell script "demo.sh" loads the scraped data along with the alias tables
that link the two data sets (NCAA and Basketball Reference)
for schools and players. It then runs sample R code that performs
a simple stepwise regression to identify NCAA features
that affect NBA playing time one year after the draft.
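
For readers unfamiliar with stepwise regression, here is a minimal sketch
of the idea in Python. It is not the repository's R code. It assumes forward
selection scored by AIC, which may differ from the direction and criterion
the R script uses. The data is synthetic, and the feature names (log_pick,
height, assists_pg, noise) are made up for illustration. It uses only the
standard library so it runs without extra packages.

```python
import math
import random


def ols_rss(columns, y):
    """Residual sum of squares of a least-squares fit of y on the columns.

    Solves the normal equations by Gauss-Jordan elimination with pivoting.
    """
    k, n = len(columns), len(y)
    # Augmented matrix [X'X | X'y]
    a = [[sum(columns[i][t] * columns[j][t] for t in range(n)) for j in range(k)]
         + [sum(columns[i][t] * y[t] for t in range(n))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * p for x, p in zip(a[r], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((y[t] - sum(b * c[t] for b, c in zip(beta, columns))) ** 2
               for t in range(n))


def aic(columns, y):
    """AIC of the fit, up to an additive constant."""
    n = len(y)
    return n * math.log(ols_rss(columns, y) / n) + 2 * len(columns)


def forward_stepwise(features, y):
    """Add the feature that lowers AIC most; stop when none lowers it."""
    design = [[1.0] * len(y)]          # start from an intercept-only model
    best = aic(design, y)
    selected, remaining = [], list(features)
    while remaining:
        scores = {name: aic(design + [features[name]], y) for name in remaining}
        candidate = min(scores, key=scores.get)
        if scores[candidate] >= best:
            break
        best = scores[candidate]
        design.append(features[candidate])
        selected.append(candidate)
        remaining.remove(candidate)
    return selected


# Synthetic data only: these numbers do not come from the scraped database.
rng = random.Random(0)
n = 300
pick = [rng.randint(1, 60) for _ in range(n)]
features = {
    "log_pick": [math.log(p) for p in pick],
    "height": [rng.gauss(78, 3) for _ in range(n)],
    "assists_pg": [rng.gauss(3, 1) for _ in range(n)],
    "noise": [rng.gauss(0, 1) for _ in range(n)],
}
# Made-up response: minutes depend on log(pick) and assists, plus noise.
minutes = [2000 - 400 * lp + 50 * a + rng.gauss(0, 100)
           for lp, a in zip(features["log_pick"], features["assists_pg"])]

selected = forward_stepwise(features, minutes)
print(selected)
```

The features are listed in the order they were added, so the strongest
predictor in the synthetic data comes first.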

You won't be able to run these without installing PostgreSQL,
R etc., but I've included two text files showing the results. The
first is "script_output.txt", which shows the output of the "demo.sh"
script (including the total time taken, about 12 seconds).

The file "feature_selection.txt" shows the results of the stepwise
regression.

The final model holds no surprises: the pick number dominates,
in a non-linear way. The stepwise procedure also retained height,
position, games, assists per game and steals per game. I did not
examine any interaction terms, nor did I look at other measures of
NBA value, but both are straightforward to add given the database
(up to the limitations of my scraped data, of course).

I haven't adjusted college performance for NCAA strength of
schedule yet.

License: MIT License

Languages: PLpgSQL 57.1%, Ruby 23.6%, R 11.7%, Shell 6.3%, Python 1.3%