mahtaz / Simboost-ML_project-

Classification using Machine Learning

Simboost(ML_project)

SimBoost

Drug discovery is a time-consuming, laborious, costly, and high-risk process. According to a report by the Eastern Research Group (ERG), it usually takes 10-15 years to develop a new drug. Even then, the success rate of developing a new molecular entity is only 2.01%.
Finding a compound that selectively binds to a particular protein is a highly challenging and typically expensive procedure in the drug development process.
In this project we implement SimBoost, a machine-learning approach for predicting drug–target binding affinities using gradient boosting.
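To make the gradient boosting idea concrete, here is a minimal sketch in plain Python. It is not the project's implementation (the table of contents indicates the notebook uses XGBoost); it only illustrates the principle of fitting small trees to the residuals of the current prediction. The feature values, affinities, and all function names below are made up for illustration.

```python
# Toy gradient boosting for regression: squared-error loss, depth-1 trees (stumps).
# Everything here (data, names, hyperparameters) is an illustrative assumption.

def fit_stump(X, residuals):
    """Return (feature, threshold, left_value, right_value) minimizing squared error."""
    mean_r = sum(residuals) / len(residuals)
    # Fallback: a constant stump (every row goes left) in case no split is possible.
    best = (sum((r - mean_r) ** 2 for r in residuals), 0, float("inf"), mean_r, mean_r)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if sse < best[0]:
                best = (sse, j, thr, lm, rm)
    return best[1:]

def predict_stump(stump, row):
    j, thr, left_value, right_value = stump
    return left_value if row[j] <= thr else right_value

def fit_gradient_boosting(X, y, n_rounds=50, learning_rate=0.1):
    base = sum(y) / len(y)          # start from the mean affinity
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, X)]
    return base, stumps, learning_rate

def predict(model, row):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)

def mse(model, X, y):
    return sum((predict(model, row) - yi) ** 2 for row, yi in zip(X, y)) / len(y)

# Hypothetical drug-target pairs: two made-up similarity features per pair,
# with a made-up binding affinity as the regression target.
X = [[0.1, 0.2], [0.2, 0.8], [0.4, 0.3], [0.5, 0.9], [0.7, 0.4], [0.9, 0.7]]
y = [5.0, 5.6, 6.1, 6.9, 7.4, 8.2]

model = fit_gradient_boosting(X, y, n_rounds=50, learning_rate=0.1)
baseline = fit_gradient_boosting(X, y, n_rounds=0)  # predicts the mean only
print("baseline training MSE:", mse(baseline, X, y))
print("boosted training MSE:", mse(model, X, y))
```

Each round adds a small correction, scaled by the learning rate, so the training error shrinks gradually. Libraries such as XGBoost build on the same principle with deeper trees, regularization, and second-order gradient information.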

Table of contents

  • 1. Setup

  • 2. Feature Engineering

    • 2.1 Average Similarities and Binding values

    • 2.2 Drug/Target Similarity Networks

    • 2.3 Non-negative Matrix Factorization

    • 2.4 Building Train, Validation and Test Datasets using extracted features

  • 3. XGBoost

    • 3.1 Tune Hyperparameters

    • 3.2 Plotting Feature importance

    • 3.3 Evaluation

  • 4. Classification

Languages

Language:Jupyter Notebook 100.0%