- Scikit-Learn (or sklearn) library
- Overview of k-NN (sklearn's documentation)
- Overview of Linear Models (sklearn's documentation)
- Overview of Decision Trees (sklearn's documentation)
- Overview of algorithms and parameters in H2O documentation
- Vowpal Wabbit repository
- XGBoost repository
- LightGBM repository
- Interactive demo of a simple feed-forward neural net
- Frameworks for Neural Nets: Keras, PyTorch, TensorFlow, MXNet, Lasagne
- Example from sklearn with different decision surfaces -- see the model-comparison sketch after this list
- Arbitrary order factorization machines
- Basic SciPy stack (ipython, numpy, pandas, matplotlib)
- Jupyter Notebook
- Stand-alone Python tSNE package
- Libraries to work with sparse CTR-like data: LibFM, LibFFM
- Another tree-based method: RGF (implementation, paper)
- Python distribution with all-included packages: Anaconda
- Blog "datas-frame" (contains posts about effective Pandas usage)
- Preprocessing in Sklearn
- Andrew Ng on gradient descent and feature scaling
- Feature Scaling and the effect of standardization for machine learning algorithms -- see the scaling sketch after this list
- Discover Feature Engineering, How to Engineer Features and How to Get Good at It
- Discussion of feature engineering on Quora
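To make the model overviews above concrete, here is a minimal sketch fitting sklearn's k-NN, linear model, and decision tree on a toy two-class dataset; the dataset and hyperparameters are illustrative, not recommendations:

```python
# Fit the three sklearn model families linked above on a toy dataset
# and compare mean accuracy. Dataset and parameters are illustrative.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "k-NN": KNeighborsClassifier(n_neighbors=5),
    "linear model": LogisticRegression(),
    "decision tree": DecisionTreeClassifier(max_depth=5),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, model.score(X_test, y_test))  # mean accuracy on the test split
```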
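And a minimal sketch of the feature scaling discussed in the preprocessing links above: fit the scaler on the training split only, then reuse its statistics on the test split (the toy arrays are made up):

```python
# Feature scaling sketch: statistics come from the training data only.
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0]])
X_test = np.array([[1.5, 250.0]])

scaler = StandardScaler()                  # or MinMaxScaler() for [0, 1] scaling
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)   # reuse training mean and std

print(X_train_scaled.mean(axis=0))  # ~0 per feature
print(X_train_scaled.std(axis=0))   # ~1 per feature
```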
Bag of words
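A minimal illustration of the idea, using sklearn's CountVectorizer on a made-up two-document corpus (the `get_feature_names_out` call assumes sklearn 1.0+):

```python
# Bag of words: each document becomes a sparse vector of token counts.
from sklearn.feature_extraction.text import CountVectorizer

docs = ["the cat sat on the mat", "the dog sat on the log"]
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)        # sparse matrix, shape (2, n_tokens)
print(vectorizer.get_feature_names_out())  # the learned vocabulary
print(X.toarray())                         # per-document token counts
```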
Word2vec
- Tutorial on Word2vec
- Tutorial on word2vec usage
- Text Classification With Word2Vec
- Introduction to Word Embedding Models with Word2Vec -- see the gensim sketch after this list
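A minimal gensim sketch in the spirit of the tutorials above, assuming gensim 4.x (the `vector_size`/`epochs` argument names); the toy corpus is far too small for meaningful embeddings and is only there to show the API:

```python
# Train word2vec on a toy corpus; real use needs a large corpus.
from gensim.models import Word2Vec

sentences = [
    ["cat", "sat", "on", "the", "mat"],
    ["dog", "sat", "on", "the", "log"],
    ["cat", "and", "dog", "are", "friends"],
]
model = Word2Vec(sentences, vector_size=16, window=2, min_count=1, epochs=50)
print(model.wv["cat"])               # the learned embedding vector
print(model.wv.most_similar("cat"))  # nearest words by cosine similarity
```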
NLP Libraries
Pretrained models
Finetuning
- How to Retrain Inception's Final Layer for New Categories in TensorFlow
- Fine-tuning Deep Learning Models in Keras -- see the sketch after this list
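A minimal fine-tuning sketch in the spirit of the Keras article above, assuming tf.keras with a pretrained ImageNet backbone: freeze the backbone and train only a new classification head. The backbone choice, head size, and learning rate are illustrative:

```python
# Fine-tuning sketch: pretrained backbone frozen, new head trained on new classes.
from tensorflow import keras

base = keras.applications.VGG16(include_top=False, weights="imagenet",
                                input_shape=(224, 224, 3), pooling="avg")
base.trainable = False  # freeze pretrained weights

num_classes = 10  # hypothetical number of new categories
model = keras.Sequential([
    base,
    keras.layers.Dense(num_classes, activation="softmax"),  # new head
])
model.compile(optimizer=keras.optimizers.Adam(1e-3),
              loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, epochs=3)  # then optionally unfreeze top layers
# base.trainable = True          # and continue with a lower learning rate
```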
- Perfect score script by Oleg Trott -- used to probe the leaderboard
- Page about data leakages on Kaggle
- Evaluation Metrics for Classification Problems: Quick Examples + References
- Decision Trees: “Gini” vs. “Entropy” criteria
- Understanding ROC curves -- see the metrics sketch after this list
- Learning to Rank using Gradient Descent -- the original paper on a pairwise method for AUC optimization
- Overview of further developments of RankNet
- RankLib (implementations of the two papers above)
- Learning to Rank Overview
- Tuning the hyper-parameters of an estimator (sklearn)
- Optimizing hyperparameters with hyperopt
- Complete Guide to Parameter Tuning in Gradient Boosting (GBM) in Python -- see the tuning sketch after this list
- Far0n's framework for Kaggle competitions "kaggletils"
- 28 Jupyter Notebook tips, tricks and shortcuts
- Multicore t-SNE implementation
- Comparison of Manifold Learning methods (sklearn)
- How to Use t-SNE Effectively (distill.pub blog)
- tSNE homepage (Laurens van der Maaten)
- Example: tSNE with different perplexities (sklearn) -- see the t-SNE sketch after this list
- Facebook Research's paper about extracting categorical features from trees
- Example: Feature transformations with ensembles of trees (sklearn) -- see the leaf-encoding sketch after this list
- Kaggle ensembling guide at MLWave.com (overview of approaches) -- see the blending sketch after this list
- StackNet — a computational, scalable and analytical meta modelling framework (by KazAnova)
- Heamy — a set of useful tools for competitive data science (including ensembling)
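A minimal sketch of the classification-metrics point above: accuracy depends on a decision threshold, while ROC AUC scores the ranking of predicted probabilities directly (the labels and probabilities are toy values):

```python
# Accuracy vs. ROC AUC on the same toy predictions.
from sklearn.metrics import accuracy_score, roc_auc_score

y_true = [0, 0, 1, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8, 0.9]       # predicted P(class = 1)
y_pred = [int(p >= 0.5) for p in y_prob]  # threshold at 0.5

print(accuracy_score(y_true, y_pred))  # depends on the chosen threshold
print(roc_auc_score(y_true, y_prob))   # threshold-free ranking quality
```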
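For the hyperparameter-tuning links, a minimal sklearn `GridSearchCV` sketch; the model and grid are illustrative, and hyperopt (linked above) would instead sample from a search space rather than enumerate a grid:

```python
# Exhaustive grid search with cross-validation over a small toy grid.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=500, random_state=0)
grid = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid={"max_depth": [2, 3, 5], "learning_rate": [0.03, 0.1]},
    scoring="roc_auc", cv=3,
)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)  # best setting and its CV AUC
```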
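For the t-SNE links, a minimal sklearn sketch; as the distill.pub article stresses, perplexity strongly changes the picture, so it is worth trying several values (the dataset here is sklearn's small digits set):

```python
# Embed the digits dataset into 2-D at several perplexities.
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)
for perplexity in (5, 30, 50):
    emb = TSNE(n_components=2, perplexity=perplexity,
               random_state=0).fit_transform(X)
    print(perplexity, emb.shape)  # 2-D coordinates, e.g. for plotting
```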
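For the tree-based feature extraction links, a minimal sketch of the leaf-encoding trick from the Facebook paper and the sklearn example: each sample is described by the index of the leaf it reaches in every tree, and those indices are then one-hot encoded (model and sizes are illustrative):

```python
# Extract categorical "which leaf?" features from a trained GBM.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.preprocessing import OneHotEncoder

X, y = make_classification(n_samples=500, random_state=0)
gbm = GradientBoostingClassifier(n_estimators=20, random_state=0).fit(X, y)

leaves = gbm.apply(X)[:, :, 0]           # leaf index per sample per tree
X_leaves = OneHotEncoder().fit_transform(leaves)
print(X_leaves.shape)                    # new sparse categorical features
```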
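Finally, for the ensembling links, a minimal blending sketch: average out-of-fold predictions of two diverse models. The 0.6/0.4 weights are made up and would normally be tuned on validation data:

```python
# Blend out-of-fold predictions of two diverse models and compare AUC.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict

X, y = make_classification(n_samples=500, random_state=0)
p1 = cross_val_predict(RandomForestClassifier(random_state=0), X, y,
                       cv=5, method="predict_proba")[:, 1]
p2 = cross_val_predict(LogisticRegression(max_iter=1000), X, y,
                       cv=5, method="predict_proba")[:, 1]
blend = 0.6 * p1 + 0.4 * p2  # illustrative weights

for name, p in [("rf", p1), ("logreg", p2), ("blend", blend)]:
    print(name, roc_auc_score(y, p))
```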
You can often find solutions to a competition you're interested in on its forum. Here we link to collections of such solutions that you may find useful.