peeush-agarwal / amlp_book_code

Practice code for the book "Approaching (Almost) Any ML Problem"

Approaching (Almost) Any ML Problem: book code for practice

Practice code from the book above, organized by chapter:

  1. Supervised and unsupervised learning
  2. Cross validation
  3. Evaluation metrics
  4. Project structure for any ML project
  5. Approaching categorical variables
    1. OneHot encoding + Logistic Regression model
      This gives an AUC score of ~0.78, which is good: AUC ranges from 0 to 1, where 1 indicates a perfect model and 0.5 is no better than random guessing.
    2. LabelEncoding
      1. Random Forest model
        • This gives an AUC score of ~0.71, which is worse than the Logistic Regression model.
        • This model also takes more time and space than the Logistic Regression model.
        • This implies that we should never ignore simple baseline models when approaching a problem.
      2. XGBoost model
        • This gives an AUC score of ~0.76, which is better than the Random Forest model but still not better than the Logistic Regression model.
        • This model also takes more time and space than both the Logistic Regression and Random Forest models.
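
The comparison in section 5 can be sketched roughly as follows. This is a minimal illustration, not the book's or this repository's actual code: the data is synthetic, the column names (`color`, `size`, `city`) are made up, and scikit-learn's `OrdinalEncoder` stands in for per-column label encoding because it fits inside a pipeline. XGBoost is left out to avoid an extra dependency. The AUC values it prints will not match the ~0.78 / ~0.71 figures above, which come from the book's dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Synthetic, purely illustrative data: three categorical columns
# (hypothetical names) and a noisy binary target.
rng = np.random.default_rng(42)
n_rows = 600
X = pd.DataFrame({
    "color": rng.choice(["red", "green", "blue"], size=n_rows),
    "size": rng.choice(["S", "M", "L", "XL"], size=n_rows),
    "city": rng.choice(["A", "B", "C", "D", "E"], size=n_rows),
})
signal = (X["color"] == "red").astype(int) + (X["size"] == "XL").astype(int)
y = ((signal + rng.normal(0, 0.8, size=n_rows)) > 0.7).astype(int)

# Two encoding + model combinations, mirroring the list above.
models = {
    "one-hot + logistic regression": make_pipeline(
        OneHotEncoder(handle_unknown="ignore"),
        LogisticRegression(max_iter=1000),
    ),
    "ordinal codes + random forest": make_pipeline(
        OrdinalEncoder(),
        RandomForestClassifier(n_estimators=100, random_state=42),
    ),
}

# Stratified folds keep the class ratio similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

results = {}
for name, model in models.items():
    # scoring="roc_auc" reports the area under the ROC curve per fold.
    scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
    results[name] = scores.mean()
    print(f"{name}: mean AUC = {results[name]:.3f}")
```

Putting the encoder inside the pipeline means it is re-fit on each training fold, so no information from the validation fold leaks into the encoding.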

Languages

Jupyter Notebook 96.8%, Python 3.0%, Shell 0.2%