tomwhitbrook / ds-skills-regression-summary-london-ds-091018

Regression Summary

Recall our general outline to date for creating regression models:

  • Define X and y
    • y should be a continuous numeric variable
    • X should be a number of numeric features
      • X features may need substantial preprocessing, including:
        • Transforming datetime values
        • Normalizing feature ranges/distributions
        • Creating dummy variables
        • Ensuring there are no categorical variables coded (misleadingly) as numbers
  • Train / Test Split
  • Fit algorithm on training data
    • Cross Validation
    • Feature Engineering
      • Synthetic Polynomial Features / Polynomial Regression
  • Evaluate Model Performance on Test Set
    • Repeat process with a separate train/test split; do results hold?
  • Continue feature engineering, tuning, etc.
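
The outline above can be sketched end to end with pandas and scikit-learn. This is a minimal illustration, not the course's reference solution: the data is synthetic, the column names (`sold_on`, `sqft`, `zone`, `price`) and the helper `fit_and_score` are hypothetical, and the choice of a degree-2 polynomial with 5-fold cross validation is an assumption made only for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic stand-in data (hypothetical columns, not from the course).
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "sold_on": pd.Timestamp("2018-01-01")
    + pd.to_timedelta(rng.integers(0, 365, n), unit="D"),
    "sqft": rng.uniform(30, 200, n),
    "zone": rng.choice(["A", "B", "C"], n),
})
zone_effect = df["zone"].map({"A": 0.0, "B": 20.0, "C": 40.0})
df["price"] = (50 + 2.0 * df["sqft"] + 0.01 * df["sqft"] ** 2
               + zone_effect + rng.normal(0, 10, n))

# Define X and y: y is continuous, X must end up fully numeric.
y = df["price"]
X = df[["sqft", "zone"]].copy()
# Transform the datetime column into a numeric feature.
X["days_since_first_sale"] = (df["sold_on"] - df["sold_on"].min()).dt.days
# Create dummy variables for the categorical column.
X = pd.get_dummies(X, columns=["zone"], drop_first=True, dtype=float)


def fit_and_score(seed):
    """Split, cross-validate on the training data, then score on the test set."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    # Normalize feature ranges, add synthetic polynomial features, fit.
    model = make_pipeline(
        StandardScaler(),
        PolynomialFeatures(degree=2, include_bias=False),
        LinearRegression(),
    )
    cv_r2 = cross_val_score(model, X_train, y_train, cv=5, scoring="r2").mean()
    model.fit(X_train, y_train)
    test_r2 = model.score(X_test, y_test)
    return cv_r2, test_r2


# Repeat the process with separate train/test splits; do results hold?
seeds = (0, 1, 2)
results = [fit_and_score(seed) for seed in seeds]
for seed, (cv_r2, test_r2) in zip(seeds, results):
    print(f"seed={seed}  cv R^2={cv_r2:.3f}  test R^2={test_r2:.3f}")
```

Putting the scaler and polynomial expansion inside the pipeline means they are refit on each cross-validation fold, so no information from held-out rows leaks into preprocessing. If the cross-validated and test scores diverge noticeably across seeds, that is a cue to revisit the features or model complexity before tuning further.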

Use this time to review all of your notes to date regarding these varied techniques.
You should also continue to practice outlining work for your midterm.
The general outline of the next few days looks like this:

Class 8: Perform Exploratory Data Analysis and Fit / Tune a Regression Model (Final Practice for Midterm)
Class 9: In-class time devoted to working on Midterm project
Class 10: Discussion / Presentations of Midterm Projects

Midterm Project Rubric
