hyojin0912 / HJ-DTA-DataBias

BASE: a web service for providing compound-protein binding affinity prediction datasets with reduced similarity bias

Home Page: https://fundis.kaist.ac.kr/base

Overview

Deep learning-based drug-target affinity (DTA) prediction models have shown high performance but suffer from dataset bias. Our study investigates this bias using comprehensive databases and demonstrates that compound-protein binding affinity can often be predicted from compound features alone, because the target proteins are highly similar to one another. We developed bias-reduced datasets by decreasing protein similarity between the training and test sets, which improved model performance and balanced the importance of compound and protein features.
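As a rough illustration of what reducing protein similarity between training and test sets can look like, the sketch below drops training pairs whose protein is too similar to any test protein. It is not the procedure used to build the BASE datasets: the function name `reduce_similarity_bias`, the `(compound_id, protein_id, affinity)` tuple layout, the similarity lookup keyed by protein pairs, and the threshold value are all assumptions made for this example, and treating a missing similarity score as 0.0 is a simplification.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical record layout: (compound_id, protein_id, affinity)
Pair = Tuple[str, str, float]


def reduce_similarity_bias(
    train: List[Pair],
    test: List[Pair],
    similarity: Dict[FrozenSet[str], float],
    threshold: float,
) -> List[Pair]:
    """Keep only training pairs whose protein stays below `threshold`
    similarity to every protein that appears in the test set."""
    test_proteins: Set[str] = {protein for _, protein, _ in test}

    def max_similarity_to_test(protein: str) -> float:
        # A protein that is itself in the test set is maximally similar.
        if protein in test_proteins:
            return 1.0
        # Assumption: protein pairs without a score count as 0.0.
        return max(
            (similarity.get(frozenset((protein, t)), 0.0) for t in test_proteins),
            default=0.0,
        )

    return [pair for pair in train if max_similarity_to_test(pair[1]) < threshold]


# Toy data with made-up identifiers and scores, for illustration only.
train = [("c1", "P1", 7.2), ("c2", "P2", 5.1), ("c3", "P3", 6.4)]
test = [("c4", "P4", 8.0)]
similarity = {
    frozenset(("P1", "P4")): 0.9,
    frozenset(("P2", "P4")): 0.3,
}

kept = reduce_similarity_bias(train, test, similarity, threshold=0.5)
print(kept)
```

With this toy input, the pair involving `P1` is removed because its similarity to the test protein `P4` exceeds the threshold, while the pairs for `P2` and `P3` are kept. A lower threshold gives a stricter split at the cost of fewer training examples.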

We introduce the Binding Affinity Similarity Explorer (BASE) web service, which offers bias-reduced datasets and prediction results to aid in the development of generalized and robust DTA models. BASE is freely available at https://fundis.kaist.ac.kr/base.

Figure 9

Installation Instructions

To run the project locally, clone the repository:

git clone https://github.com/hyojin0912/HJ-DTA-DataBias.git
cd HJ-DTA-DataBias

Journal & Contact Info

Korea Advanced Institute of Science and Technology (KAIST)

Synergistic Bioinformatics Laboratory

Hyojin Son*, Sechan Lee, Jaeuk Kim, Haangik Park, Myeong-Ha Hwang and Gwan-Su Yi†

Acknowledgement

This work was supported by the BK-21 program through the National Research Foundation of Korea (NRF) under the Ministry of Education.

License

The code in this repository is licensed under the MIT License. See the LICENSE file for more details.

The data in the data folder is licensed under the CC0 1.0 Universal (CC0 1.0) Public Domain Dedication. See the LICENSE file for more details.

Languages

Jupyter Notebook 98.7%, R 1.3%