laiagomezmessia / lab-random-forests

Lab | Random Forests

For this lab, you will use the CSV files provided in the files_for_lab folder. These are cleaned versions of the learningSet data from the case study 'Healthcare for All'.
Fork and clone this repo, then begin a new Jupyter Notebook.

Instructions

  • Apply the Random Forest algorithm, but this time deal with the class imbalance only by upsampling the data.
  • Use the feature selection techniques you have learned in class (Variance Threshold, RFE, PCA, etc.) to decide whether you want to use all of the features.
  • Re-run the Random Forest algorithm to determine whether the feature selection has improved the results.
  • Discuss the output and its impact on the business scenario. Is the cost of a false positive equal to the cost of a false negative? How would you change your algorithm or data in order to maximize the return for the business?
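
The steps above can be sketched end to end as follows. This is only a minimal illustration, not the expected solution: it uses a synthetic imbalanced dataset from make_classification in place of the lab CSVs, picks RFE as the feature selection technique (Variance Threshold or PCA would slot in the same way), and the helper names (upsample, fit_and_score, total_cost), the number of features kept, and the two cost values are all hypothetical choices made for the example.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.utils import resample

# Synthetic stand-in for the lab data (assumption: the real CSVs are not loaded here).
X, y = make_classification(n_samples=2000, n_features=20, n_informative=5,
                           weights=[0.95, 0.05], random_state=42)
X = pd.DataFrame(X, columns=[f"f{i}" for i in range(X.shape[1])])
y = pd.Series(y, name="target")

# Split first so that duplicated minority rows never leak into the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)


def upsample(X_part, y_part, random_state=42):
    """Resample the minority class with replacement up to the majority size."""
    data = pd.concat([X_part, y_part], axis=1)
    counts = data[y_part.name].value_counts()
    majority = data[data[y_part.name] == counts.idxmax()]
    minority = data[data[y_part.name] == counts.idxmin()]
    minority_up = resample(minority, replace=True,
                           n_samples=len(majority), random_state=random_state)
    balanced = pd.concat([majority, minority_up]).sample(
        frac=1, random_state=random_state)  # shuffle the rows
    return balanced.drop(columns=y_part.name), balanced[y_part.name]


def fit_and_score(X_tr, y_tr, X_te, y_te):
    """Train a Random Forest and return the test confusion matrix as a dict."""
    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_tr, y_tr)
    tn, fp, fn, tp = confusion_matrix(
        y_te, model.predict(X_te), labels=[0, 1]).ravel()
    return {"tn": tn, "fp": fp, "fn": fn, "tp": tp}


# 1. Upsample the training data only, then fit on all features.
X_bal, y_bal = upsample(X_train, y_train)
baseline = fit_and_score(X_bal, y_bal, X_test, y_test)

# 2. Feature selection with RFE (keeping 8 features is an arbitrary choice).
selector = RFE(RandomForestClassifier(n_estimators=50, random_state=42),
               n_features_to_select=8, step=4)
selector.fit(X_bal, y_bal)
cols = X_bal.columns[selector.support_]

# 3. Re-run the Random Forest on the selected features.
selected = fit_and_score(X_bal[cols], y_bal, X_test[cols], y_test)

# 4. Compare under asymmetric error costs (hypothetical values, not from the case study).
COST_FP, COST_FN = 1.0, 10.0


def total_cost(cm):
    """Weighted cost of the errors in a confusion-matrix dict."""
    return COST_FP * cm["fp"] + COST_FN * cm["fn"]


print("all features     :", baseline, "cost =", total_cost(baseline))
print("selected features:", selected, "cost =", total_cost(selected))
```

Upsampling is applied after the train/test split so that the test set keeps the original class proportions. If the two error types do not cost the same, comparing models on a weighted cost like total_cost (or moving the decision threshold on predict_proba) is usually more informative than comparing accuracy.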
