learn-co-students / dsc-ensemble-methods-section-intro-seattle-ds-080519

Ensembles - Introduction

Introduction

In this section, you'll learn about some of the most powerful machine learning algorithms: ensemble models! This lesson summarizes the topics we'll be covering in this section.

Ensembles

The idea behind ensembles is to combine multiple models so that their joint predictions are better than those of any single model. In many real-world problems and Kaggle competitions, ensemble methods tend to outperform any single model.

Ensemble Methods

We start the section with an introduction to the concept of ensemble methods, explaining how they take advantage of the Delphi technique (or "wisdom of crowds"), where the average of multiple independent estimates is usually more accurate, and more consistently so, than the individual estimates.
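The "wisdom of crowds" effect can be checked with a small simulation. The sketch below uses only the Python standard library; the true value, noise level, crowd size, and number of trials are all hypothetical numbers chosen for illustration, and the guesses are assumed to be independent and unbiased.

```python
import random
import statistics

random.seed(0)

TRUE_VALUE = 100.0   # the quantity being estimated (hypothetical)
NOISE_SD = 20.0      # spread of each individual's guess (hypothetical)
N_ESTIMATORS = 25    # size of the "crowd"
N_TRIALS = 2000      # number of repeated experiments

individual_errors = []
crowd_errors = []
for _ in range(N_TRIALS):
    guesses = [random.gauss(TRUE_VALUE, NOISE_SD) for _ in range(N_ESTIMATORS)]
    # Error of one arbitrary individual (the first guesser).
    individual_errors.append(abs(guesses[0] - TRUE_VALUE))
    # Error of the crowd's averaged guess.
    crowd_errors.append(abs(statistics.mean(guesses) - TRUE_VALUE))

mean_individual_error = statistics.mean(individual_errors)
mean_crowd_error = statistics.mean(crowd_errors)
print(f"mean individual error: {mean_individual_error:.2f}")
print(f"mean crowd error:      {mean_crowd_error:.2f}")
```

The benefit depends on the independence assumption: when the guesses share the same bias or are strongly correlated, averaging helps much less.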

We also provide an introduction to the idea of bagging (Bootstrap Aggregation).
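As a rough illustration of bagging, the sketch below draws bootstrap samples (sampling with replacement), fits one very simple model on each, and aggregates the models by majority vote. The toy data, the threshold "stump" used as the base model, and all function names are assumptions made for this example, not part of the lesson materials.

```python
import random
from collections import Counter

random.seed(1)

def make_data(n):
    """Toy 1-D data: label is 1 when x > 0.5, with 15% of labels flipped."""
    data = []
    for _ in range(n):
        x = random.random()
        y = int(x > 0.5)
        if random.random() < 0.15:
            y = 1 - y
        data.append((x, y))
    return data

def fit_stump(sample):
    """Return the threshold t that best fits the sample with the rule x > t."""
    best_t, best_correct = 0.5, -1
    for t, _ in sample:
        correct = sum(int(x > t) == y for x, y in sample)
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t

def bootstrap(data):
    """Draw len(data) points from data, with replacement."""
    return [random.choice(data) for _ in data]

def bagged_predict(thresholds, x):
    """Aggregate the individual stumps by majority vote."""
    votes = Counter(int(x > t) for t in thresholds)
    return votes.most_common(1)[0][0]

train = make_data(200)
# One stump per bootstrap sample; an odd count avoids tied votes.
thresholds = [fit_stump(bootstrap(train)) for _ in range(25)]

print(bagged_predict(thresholds, 0.9), bagged_predict(thresholds, 0.1))
```

Because every stump sees a slightly different resample of the training data, the individual thresholds vary, and the vote smooths out that variation.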

Random Forests

We then look at random forests, an ensemble method for decision trees that combines bagging with the subspace sampling method to create a "forest" of decision trees that typically gives better predictions than a single decision tree.
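A minimal sketch of this comparison with scikit-learn is shown below. The synthetic dataset and the specific parameter values are assumptions chosen for illustration; the relative scores will vary with the data, so treat the printed numbers as one example rather than a guarantee.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, used here only for illustration.
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Baseline: a single decision tree.
tree = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

# bootstrap=True trains each tree on its own bootstrap sample (bagging);
# max_features="sqrt" considers a random subset of features at each split
# (subspace sampling).
forest = RandomForestClassifier(
    n_estimators=100, max_features="sqrt", bootstrap=True, random_state=42
).fit(X_train, y_train)

tree_acc = tree.score(X_test, y_test)
forest_acc = forest.score(X_test, y_test)
print(f"single tree accuracy:   {tree_acc:.3f}")
print(f"random forest accuracy: {forest_acc:.3f}")
```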

GridSearchCV

We also introduce some of the common hyperparameters for tuning decision trees, and look at how you can use GridSearchCV to perform an exhaustive search over every combination of candidate values for multiple hyperparameters in order to find a better-performing model.
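The sketch below shows what such a search might look like for a decision tree in scikit-learn. The grid values and the synthetic dataset are hypothetical; in practice you would choose ranges that make sense for your own data.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, used here only for illustration.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Hypothetical grid: 3 * 3 * 2 = 18 combinations, each scored with 3-fold CV.
param_grid = {
    "max_depth": [2, 4, 6],
    "min_samples_split": [2, 10, 20],
    "criterion": ["gini", "entropy"],
}

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X, y)

print(search.best_params_)
print(f"best mean CV accuracy: {search.best_score_:.3f}")
```

Since every combination is fitted once per fold, the cost grows multiplicatively with each hyperparameter added to the grid, which is worth keeping in mind before widening the search.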

Gradient Boosting and Weak Learners

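Next up, we introduce the concept of boosting, which is at the heart of some of the most powerful ensemble methods, such as AdaBoost and Gradient Boosted Trees. Boosting trains a sequence of weak learners, each one focusing on the errors left by the learners before it.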
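To make the idea concrete, here is a from-scratch sketch of gradient boosting for regression with squared-error loss, where each weak learner (a one-split "stump") is fitted to the current residuals. It uses only the standard library; the toy data, learning rate, and number of rounds are assumptions for illustration, and this is not how any particular library implements boosting. AdaBoost follows a different recipe, reweighting training examples rather than fitting residuals.

```python
import math
import random

random.seed(2)

# Toy 1-D regression data (hypothetical): a noisy sine curve.
xs = [i / 50 for i in range(100)]
ys = [math.sin(3 * x) + random.gauss(0, 0.1) for x in xs]

def fit_stump(xs, targets):
    """Weak learner: one split, predicting the mean target on each side."""
    best = None
    for t in xs[1:]:
        left = [r for x, r in zip(xs, targets) if x < t]
        right = [r for x, r in zip(xs, targets) if x >= t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda x: lm if x < t else rm

def mse(preds):
    """Mean squared error of predictions against the training targets."""
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)

LEARNING_RATE = 0.5
preds = [sum(ys) / len(ys)] * len(xs)  # start from a constant prediction
history = [mse(preds)]
for _ in range(20):
    # For squared error, the negative gradient is simply the residual.
    residuals = [y - p for y, p in zip(ys, preds)]
    stump = fit_stump(xs, residuals)
    preds = [p + LEARNING_RATE * stump(x) for p, x in zip(preds, xs)]
    history.append(mse(preds))

print(f"training MSE after 0 rounds:  {history[0]:.4f}")
print(f"training MSE after 20 rounds: {history[-1]:.4f}")
```

Each stump on its own is a poor model, but adding many small corrections steadily reduces the training error. The numbers here are training error only; held-out data is needed to judge overfitting.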

XGBoost

Finally, we end this section by introducing XGBoost (eXtreme Gradient Boosting), one of the most widely used gradient boosting implementations.

Summary

You will often find yourself using a range of ensemble techniques to improve the performance of your models. This section introduces the main ones: bagging, random forests, hyperparameter tuning with GridSearchCV, and boosting with AdaBoost, Gradient Boosted Trees, and XGBoost.
