kymy91 / lab-cross-validation

Lab | Cross Validation

For this lab, we will build a model for a customer churn binary classification problem. You will use the files_for_lab/Customer-Churn.csv file.

Instructions

  1. Apply SMOTE for upsampling the data

    • Use logistic regression to fit the model and compute the accuracy of the model.
    • Use a decision tree classifier to fit the model and compute the accuracy of the model.
    • Compare the accuracies of the two models.
  2. Apply TomekLinks for downsampling

    • Remember that TomekLinks does not make the two classes equal in size. It only removes majority-class points that form Tomek links, that is, points whose nearest neighbor belongs to the minority class and vice versa.
    • Use logistic regression to fit the model and compute the accuracy of the model.
    • Use a decision tree classifier to fit the model and compute the accuracy of the model.
    • Compare the accuracies of the two models.
    • You can also apply this algorithm one more time and check how the imbalance between the two classes changed compared with the first pass.
