michael0905 / Zillow-Prize

A repository of my code for the Kaggle competition Zillow Prize.

Zillow-Prize

Leaderboard

180/3799 (Top 5%)

Architecture

StackNet Architecture

The final output is a weighted average of four models (CatBoost, Genetic Programming, LightGBM, StackNet). StackNet is a great framework for ensembling models. For this competition, I used 12 models for 5-fold stacking (the level-2 model does not use the level-1 models' features).
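The stacking and blending steps described above can be sketched in plain Python. This is a minimal illustration, not the code used in the competition and not StackNet's API: the two toy level-1 "models", the fold assignment by `i % n_folds`, and the blend weights are all assumptions made for the example. It shows how out-of-fold predictions become the only inputs to a level-2 model, and how a weighted average combines model outputs.

```python
from statistics import mean
from typing import Callable, List, Sequence

# A "fit" function trains on (X, y) and returns a predict function for one row.
Predict = Callable[[Sequence[float]], float]
Fit = Callable[[List[Sequence[float]], List[float]], Predict]


def out_of_fold_predictions(fit: Fit, X, y, n_folds: int = 5) -> List[float]:
    """Predict each row with a model that never saw that row during training."""
    oof = [0.0] * len(y)
    for fold in range(n_folds):
        # Row i belongs to fold i % n_folds (a simple, unshuffled assignment).
        train_idx = [i for i in range(len(y)) if i % n_folds != fold]
        valid_idx = [i for i in range(len(y)) if i % n_folds == fold]
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        for i in valid_idx:
            oof[i] = predict(X[i])
    return oof


def weighted_average(predictions: Sequence[Sequence[float]],
                     weights: Sequence[float]) -> List[float]:
    """Blend per-model prediction lists row by row; weights are normalized."""
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, row)) / total
        for row in zip(*predictions)
    ]


# Two toy level-1 "models" (hypothetical stand-ins for real learners).
def fit_mean(X, y):
    m = mean(y)
    return lambda row: m


def fit_first_feature(X, y):
    # Predicts a scaled copy of the first feature; the scale is fit on y.
    scale = sum(y) / sum(r[0] for r in X)
    return lambda row: scale * row[0]


X = [[float(i)] for i in range(1, 11)]
y = [2.0 * r[0] for r in X]

# Level-2 inputs are only the level-1 out-of-fold predictions,
# one column per level-1 model; the original features are not included.
level1_oof = [out_of_fold_predictions(f, X, y) for f in (fit_mean, fit_first_feature)]
level2_inputs = [list(row) for row in zip(*level1_oof)]

# Final blend with made-up weights.
blend = weighted_average(level1_oof, weights=[0.25, 0.75])
```

In a real pipeline the toy fit functions would be replaced by actual learners, and the blend weights would typically be chosen on validation data rather than fixed by hand.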

Some Thoughts and Tricks

  • The Kaggle community is very helpful. I learned a lot from shared kernels and discussions.
  • CatBoost seems to be the best single model for this competition.
  • Feature engineering is important, but I could not come up with any "magic" features by combining the original ones. Dropping some features helped, though.
  • Choose different features for different models to increase model diversity.
  • Ensemble as many good, uncorrelated models as possible. Tree-based models are well suited to Kaggle competitions; try different depths, losses, bagging strategies, etc.
  • Cut off outliers with high |logerror|.
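As an illustration of the last tip, the outlier cutoff can be sketched as a simple filter on the training rows. The cutoff value of 0.4, the sample rows, and the helper name `drop_logerror_outliers` are assumptions for this example; the source does not state which threshold was used.

```python
from typing import Dict, List


def drop_logerror_outliers(rows: List[Dict[str, float]],
                           max_abs_logerror: float) -> List[Dict[str, float]]:
    """Keep only training rows whose |logerror| is within the cutoff."""
    return [r for r in rows if abs(r["logerror"]) <= max_abs_logerror]


# Made-up training rows for illustration only.
train = [
    {"parcelid": 1, "logerror": 0.02},
    {"parcelid": 2, "logerror": -0.01},
    {"parcelid": 3, "logerror": 2.50},   # extreme positive error
    {"parcelid": 4, "logerror": -1.80},  # extreme negative error
    {"parcelid": 5, "logerror": 0.30},
]

# The 0.4 cutoff is an assumed value, not one taken from the repository.
trimmed = drop_logerror_outliers(train, max_abs_logerror=0.4)
kept_ids = [r["parcelid"] for r in trimmed]
```

A filter like this would normally be applied to the training set only, so that validation scores still reflect the full target distribution.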

Languages

Language:Python 100.0%