lampda / stacking

Template for Stacking (Stacked Generalization) Ensemble Method

#Template for Stacking (Stacked Generalization) Ensemble Method

##What is Stacking?

Stacking, also known as Stacked Generalization, is an ensemble method that combines the outputs of several machine learning algorithms (the base classifiers) by using them as inputs to another machine learning algorithm (the combiner).

##The Goal

Use this project as a template for your stacking implementation. In this project, StackingClassifier provides only the fit and predict_proba functions and works with binary labels (0 and 1). The predict_proba function returns the probability of label 1. To learn how to use it, see test/test_stackingclassifier.py

##Simple Explanation About The Project

The StackingClassifier class acts as a proxy that delegates to one of two classes, each implementing a different stacking technique.

  • In the technique called ONEGO, the prediction dataset is created in one go. Here are the steps in this technique:

    • 1. Split the train dataset into 3 parts: train1, train2, train3
    • 2. Fit a base classifier on train1 and create predictions for the rest of the train dataset
    • 3. Fit the same classifier on train2 and create predictions for the rest of the train dataset
    • 4. Fit the same classifier on train3 and create predictions for the rest of the train dataset
    • 5. Fit the same classifier on the entire train set and create predictions for the prediction dataset
    • 6. Repeat steps 2, 3, 4 and 5 for each base classifier
    • 7. Create a dataset called stacking_data from the outputs of steps 2, 3 and 4
    • 8. Create a dataset called stacking_prediction from the output of step 5
    • 9. Fit the combiner classifier on stacking_data
    • 10. Use the fitted combiner classifier to predict on stacking_prediction
  • In the technique called OUTOFFOLDS, the prediction dataset is created by averaging the predictions of the out-of-fold predictors
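
The two techniques above can be sketched in plain Python. This is only an illustration under stated assumptions, not the project's actual code: stack_predict and CentroidClassifier are hypothetical names, the toy classifier stands in for any model that offers fit and predict_proba (returning one value per row for label 1), and the parts are formed by taking every third row. The steps do not say how the two predictions each training row receives in steps 2 to 4 are merged, so the sketch averages them. For OUTOFFOLDS it assumes that the per-part classifiers each predict on the prediction dataset and that those predictions are averaged in place of step 5.

```python
import math
import random


class CentroidClassifier:
    """Hypothetical toy model: scores a row by its distance to each class mean."""

    def __init__(self, features=None):
        self.features = features  # column indices to use; None means all columns

    def _pick(self, row):
        return [row[j] for j in self.features] if self.features else list(row)

    def fit(self, X, y):
        self.means = {}
        for label in (0, 1):
            rows = [self._pick(r) for r, t in zip(X, y) if t == label]
            self.means[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_proba(self, X):
        """Return one score in [0, 1] per row, standing in for P(label 1)."""
        probs = []
        for r in X:
            v = self._pick(r)
            d0 = math.dist(v, self.means[0])
            d1 = math.dist(v, self.means[1])
            probs.append(d0 / (d0 + d1) if d0 + d1 else 0.5)
        return probs


def stack_predict(base_clfs, combiner, X, y, X_pred, n_parts=3, out_of_folds=False):
    n = len(X)
    # Step 1: split the row indices into parts (every n_parts-th row; an assumption).
    parts = [set(range(k, n, n_parts)) for k in range(n_parts)]
    stacking_data = [[0.0] * len(base_clfs) for _ in range(n)]
    stacking_prediction = [[0.0] * len(base_clfs) for _ in range(len(X_pred))]

    for j, clf in enumerate(base_clfs):  # step 6: repeat for each base classifier
        fold_preds = [0.0] * len(X_pred)
        for part in parts:  # steps 2-4: fit on one part, predict the rest
            rest = [i for i in range(n) if i not in part]
            clf.fit([X[i] for i in sorted(part)], [y[i] for i in sorted(part)])
            for i, p in zip(rest, clf.predict_proba([X[i] for i in rest])):
                # Assumption: average the n_parts - 1 predictions each row receives.
                stacking_data[i][j] += p / (n_parts - 1)
            if out_of_folds:
                # Assumed OUTOFFOLDS variant: average the per-part predictions.
                for i, p in enumerate(clf.predict_proba(X_pred)):
                    fold_preds[i] += p / n_parts
        if not out_of_folds:
            clf.fit(X, y)  # step 5: refit on the entire train set
            fold_preds = clf.predict_proba(X_pred)
        for i, p in enumerate(fold_preds):  # steps 7-8: one column per base classifier
            stacking_prediction[i][j] = p

    combiner.fit(stacking_data, y)  # step 9
    return combiner.predict_proba(stacking_prediction)  # step 10


# Synthetic data: label 1 rows are shifted upward in both columns.
rng = random.Random(0)
y = [i % 2 for i in range(60)]
X = [[t + rng.gauss(0, 0.3), 2 * t + rng.gauss(0, 0.6)] for t in y]
X_pred = [[0.0, 0.0], [1.0, 2.0]]

base = [CentroidClassifier((0,)), CentroidClassifier((1,))]
probs_onego = stack_predict(base, CentroidClassifier(), X, y, X_pred)
probs_oof = stack_predict(base, CentroidClassifier(), X, y, X_pred, out_of_folds=True)
print(probs_onego, probs_oof)
```

Any object with the same fit and predict_proba shape could replace the toy classifier here. Models from other libraries may return one column per class, in which case the column for label 1 would need to be selected first.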

Author: Necati Demir ndemir@demir.web.tr

Languages

Language:Python 100.0%