Ben Shen's repositories
2016-04-federal-surveillance-planes
The data and analysis referenced in the Apr. 6, 2016 BuzzFeed News article, "Spies in the Skies." https://www.buzzfeed.com/peteraldhous/spies-in-the-skies
2016-01-tennis-betting-analysis
Methodology and code supporting the BuzzFeed News/BBC article, "The Tennis Racket," published Jan. 17, 2016.
agency-loan-level
Loan-level analysis of Fannie Mae and Freddie Mac data
Big-Data-Bowl
Homepage for the National Football League's Big Data Bowl
ChainLadder
Claims reserving models in R
Coursera_Programming_Languages
Materials and code snippets for the Programming Languages course on Coursera
currency-portfolio-optimization
Currency Portfolio Optimization - IPython notebook and data
fatal-car-crashes
Diving into the data behind signs on Illinois highways that say "957 TRAFFIC DEATHS IN 2012." #peoplenotdata
ggplot-tutorial
Repository for ggplot2 tutorial
jsDataV.is-source
Source code for jsDataV.is visualizations
Kaggle-Ensemble-Guide
Code for the Kaggle Ensembling Guide Article on MLWave
kaggle-Rain
Winning solution to the Kaggle competition - How Much Did It Rain? II
kaggle-titanic
Predict survival on the Titanic.
Kaggle_CrowdFlower
1st Place Solution for Search Results Relevance Competition on Kaggle (https://www.kaggle.com/c/crowdflower-search-relevance)
kaggle_kobe
Kobe Bryant Shot Selection
learning-from-imbalanced-classes
Learning From Imbalanced Classes
matplotlib_for_papers
Handout for the tutorial "Creating publication-quality figures with matplotlib"
py-Goldsberry
Python Package for facilitating analysis of NBA Data
py_ml_utils
Some small utility modules to help with pandas, numpy and sklearn usage in other projects
Santander-Product-Recommendation
2nd Place Solution of the Kaggle Competition - Santander Product Recommendation
sklearn_pycon2015
Materials for my PyCon 2015 scikit-learn tutorial.
tutorial_biz_dynamics
A Python tutorial from DataScience.com on business survival rates, built on the Census Bureau's Business Dynamics API