- Scikit-Learn (or sklearn) library
- Overview of k-NN (sklearn's documentation)
- Overview of Linear Models (sklearn's documentation)
- Overview of Decision Trees (sklearn's documentation)
- Overview of algorithms and parameters in H2O documentation
- Vowpal Wabbit repository
- XGBoost repository
- LightGBM repository
- Interactive demo of a simple feed-forward neural net
- Frameworks for Neural Nets: Keras, PyTorch, TensorFlow, MXNet, Lasagne
- Example from sklearn with different decision surfaces -- see the model-comparison sketch after this list
- Arbitrary order factorization machines
- Basic SciPy stack (ipython, numpy, pandas, matplotlib)
- Jupyter Notebook
- Stand-alone Python tSNE package
- Libraries to work with sparse CTR-like data: LibFM, LibFFM
- Another tree-based method: RGF (implementation, paper)
- Python distribution with all-included packages: Anaconda
- Blog "datas-frame" (contains posts about effective Pandas usage)
- Preprocessing in Sklearn
- Andrew Ng on gradient descent and feature scaling
- Feature Scaling and the effect of standardization for machine learning algorithms -- see the scaling sketch after this list
- Discover Feature Engineering, How to Engineer Features and How to Get Good at It
- Discussion of feature engineering on Quora
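To make the model overviews above concrete, here is a minimal sketch fitting sklearn's k-NN, linear model, and decision tree on a toy two-class dataset; the dataset and hyperparameters are illustrative, not recommendations:

```python
# Fit the three sklearn model families linked above on a toy dataset
# and compare mean accuracy. Dataset and parameters are illustrative.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=500, noise=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    "k-NN": KNeighborsClassifier(n_neighbors=5),
    "linear model": LogisticRegression(),
    "decision tree": DecisionTreeClassifier(max_depth=5),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, model.score(X_test, y_test))  # mean accuracy on the test split
```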
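And a minimal sketch of the feature scaling discussed in the preprocessing links above: fit the scaler on the training split only, then reuse its statistics on the test split (the toy arrays are made up):

```python
# Feature scaling sketch: statistics come from the training data only.
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 400.0]])
X_test = np.array([[1.5, 250.0]])

scaler = StandardScaler()                  # or MinMaxScaler() for [0, 1] scaling
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)   # reuse training mean and std

print(X_train_scaled.mean(axis=0))  # ~0 per feature
print(X_train_scaled.std(axis=0))   # ~1 per feature
```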
Bag of words
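A minimal illustration of the idea, using sklearn's CountVectorizer on a made-up two-document corpus (the `get_feature_names_out` call assumes sklearn 1.0+):

```python
# Bag of words: each document becomes a sparse vector of token counts.
from sklearn.feature_extraction.text import CountVectorizer

docs = ["the cat sat on the mat", "the dog sat on the log"]
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)        # sparse matrix, shape (2, n_tokens)
print(vectorizer.get_feature_names_out())  # the learned vocabulary
print(X.toarray())                         # per-document token counts
```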
Word2vec
- Tutorial on Word2vec
- Tutorial on word2vec usage
- Text Classification With Word2Vec
- Introduction to Word Embedding Models with Word2Vec -- see the gensim sketch after this list
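A minimal gensim sketch in the spirit of the tutorials above, assuming gensim 4.x (the `vector_size`/`epochs` argument names); the toy corpus is far too small for meaningful embeddings and is only there to show the API:

```python
# Train word2vec on a toy corpus; real use needs a large corpus.
from gensim.models import Word2Vec

sentences = [
    ["cat", "sat", "on", "the", "mat"],
    ["dog", "sat", "on", "the", "log"],
    ["cat", "and", "dog", "are", "friends"],
]
model = Word2Vec(sentences, vector_size=16, window=2, min_count=1, epochs=50)
print(model.wv["cat"])               # the learned embedding vector
print(model.wv.most_similar("cat"))  # nearest words by cosine similarity
```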
NLP Libraries
Pretrained models
Finetuning
- How to Retrain Inception's Final Layer for New Categories in TensorFlow
- Fine-tuning Deep Learning Models in Keras -- see the sketch after this list
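A minimal fine-tuning sketch in the spirit of the Keras article above, assuming tf.keras with a pretrained ImageNet backbone: freeze the backbone and train only a new classification head. The backbone choice, head size, and learning rate are illustrative:

```python
# Fine-tuning sketch: pretrained backbone frozen, new head trained on new classes.
from tensorflow import keras

base = keras.applications.VGG16(include_top=False, weights="imagenet",
                                input_shape=(224, 224, 3), pooling="avg")
base.trainable = False  # freeze pretrained weights

num_classes = 10  # hypothetical number of new categories
model = keras.Sequential([
    base,
    keras.layers.Dense(num_classes, activation="softmax"),  # new head
])
model.compile(optimizer=keras.optimizers.Adam(1e-3),
              loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, epochs=3)  # then optionally unfreeze top layers
# base.trainable = True          # and continue with a lower learning rate
```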
- Perfect score script by Oleg Trott -- used to probe the leaderboard
- Page about data leakages on Kaggle
- Evaluation Metrics for Classification Problems: Quick Examples + References
- Decision Trees: “Gini” vs. “Entropy” criteria
- Understanding ROC curves -- see the metrics sketch after this list
- Learning to Rank using Gradient Descent -- the original paper on a pairwise method for AUC optimization
- Overview of further developments of RankNet
- RankLib (implementations of the two papers above)
- Learning to Rank Overview
- Tuning the hyper-parameters of an estimator (sklearn)
- Optimizing hyperparameters with hyperopt
- Complete Guide to Parameter Tuning in Gradient Boosting (GBM) in Python -- see the tuning sketch after this list
- Far0n's framework for Kaggle competitions "kaggletils"
- 28 Jupyter Notebook tips, tricks and shortcuts
- Multicore t-SNE implementation
- Comparison of Manifold Learning methods (sklearn)
- How to Use t-SNE Effectively (distill.pub blog)
- tSNE homepage (Laurens van der Maaten)
- Example: tSNE with different perplexities (sklearn) -- see the t-SNE sketch after this list
- Facebook Research's paper about extracting categorical features from trees
- Example: Feature transformations with ensembles of trees (sklearn) -- see the leaf-encoding sketch after this list
- Kaggle ensembling guide at MLWave.com (overview of approaches) -- see the blending sketch after this list
- StackNet — a computational, scalable and analytical meta modelling framework (by KazAnova)
- Heamy — a set of useful tools for competitive data science (including ensembling)
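A minimal sketch of the classification-metrics point above: accuracy depends on a decision threshold, while ROC AUC scores the ranking of predicted probabilities directly (the labels and probabilities are toy values):

```python
# Accuracy vs. ROC AUC on the same toy predictions.
from sklearn.metrics import accuracy_score, roc_auc_score

y_true = [0, 0, 1, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8, 0.9]       # predicted P(class = 1)
y_pred = [int(p >= 0.5) for p in y_prob]  # threshold at 0.5

print(accuracy_score(y_true, y_pred))  # depends on the chosen threshold
print(roc_auc_score(y_true, y_prob))   # threshold-free ranking quality
```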
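For the hyperparameter-tuning links, a minimal sklearn `GridSearchCV` sketch; the model and grid are illustrative, and hyperopt (linked above) would instead sample from a search space rather than enumerate a grid:

```python
# Exhaustive grid search with cross-validation over a small toy grid.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=500, random_state=0)
grid = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid={"max_depth": [2, 3, 5], "learning_rate": [0.03, 0.1]},
    scoring="roc_auc", cv=3,
)
grid.fit(X, y)
print(grid.best_params_, grid.best_score_)  # best setting and its CV AUC
```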
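For the t-SNE links, a minimal sklearn sketch; as the distill.pub article stresses, perplexity strongly changes the picture, so it is worth trying several values (the dataset here is sklearn's small digits set):

```python
# Embed the digits dataset into 2-D at several perplexities.
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)
for perplexity in (5, 30, 50):
    emb = TSNE(n_components=2, perplexity=perplexity,
               random_state=0).fit_transform(X)
    print(perplexity, emb.shape)  # 2-D coordinates, e.g. for plotting
```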
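For the tree-based feature extraction links, a minimal sketch of the leaf-encoding trick from the Facebook paper and the sklearn example: each sample is described by the index of the leaf it reaches in every tree, and those indices are then one-hot encoded (model and sizes are illustrative):

```python
# Extract categorical "which leaf?" features from a trained GBM.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.preprocessing import OneHotEncoder

X, y = make_classification(n_samples=500, random_state=0)
gbm = GradientBoostingClassifier(n_estimators=20, random_state=0).fit(X, y)

leaves = gbm.apply(X)[:, :, 0]           # leaf index per sample per tree
X_leaves = OneHotEncoder().fit_transform(leaves)
print(X_leaves.shape)                    # new sparse categorical features
```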
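Finally, for the ensembling links, a minimal blending sketch: average out-of-fold predictions of two diverse models. The 0.6/0.4 weights are made up and would normally be tuned on validation data:

```python
# Blend out-of-fold predictions of two diverse models and compare AUC.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict

X, y = make_classification(n_samples=500, random_state=0)
p1 = cross_val_predict(RandomForestClassifier(random_state=0), X, y,
                       cv=5, method="predict_proba")[:, 1]
p2 = cross_val_predict(LogisticRegression(max_iter=1000), X, y,
                       cv=5, method="predict_proba")[:, 1]
blend = 0.6 * p1 + 0.4 * p2  # illustrative weights

for name, p in [("rf", p1), ("logreg", p2), ("blend", blend)]:
    print(name, roc_auc_score(y, p))
```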
You can often find solutions to a competition you're interested in on its forum. Here we link to collections of such solutions that you may find useful.