wellingtonyl / DTINet

A Network Integration Approach for Drug-Target Interaction Prediction

DTINet: A Network Integration Approach for Drug-Target Interaction Prediction

DTINet is a computational pipeline for predicting novel drug-target interactions (DTIs) from a heterogeneous network. It learns a low-dimensional feature vector for each node in the heterogeneous network, and then predicts the likelihood of a new DTI from these representations via a vector space projection scheme. See our paper in Nature Communications and the preprint on bioRxiv:100305.
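As a rough illustration of what a vector space projection can look like, the Python sketch below scores a drug-protein pair with a bilinear form, `drug_vec^T * Z * protein_vec`. The bilinear form, the matrix `Z`, and all numbers are assumptions made for this example; the repository's MATLAB code and the paper define the actual scheme.

```python
def projection_score(drug_vec, Z, protein_vec):
    """Score a pair as drug_vec^T * Z * protein_vec (assumed bilinear form)."""
    # Project the drug vector into the protein feature space.
    projected = [
        sum(drug_vec[i] * Z[i][j] for i in range(len(drug_vec)))
        for j in range(len(protein_vec))
    ]
    # Inner product of the projected drug vector with the protein vector.
    return sum(p * y for p, y in zip(projected, protein_vec))


# Toy values, purely hypothetical: a 2-d drug vector, a 3-d protein vector,
# and a 2x3 projection matrix Z.
drug_vec = [1.0, 0.0]
protein_vec = [0.5, 0.5, 1.0]
Z = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0]]

score = projection_score(drug_vec, Z, protein_vec)
print(score)
```

A higher score would be read as a more likely interaction. In practice `Z` would be learned from known interactions, not written by hand.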

Quick start

We provide an example script to run experiments on our dataset:

  • Run run_DTINet.m: predict drug-target interactions, and evaluate the results with cross-validation.

Note: See the "Tutorial" section below for detailed instructions on how to specify the parameters of DTINet and how to run DTINet on your own dataset.
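For readers unfamiliar with this style of evaluation, the Python sketch below shows one way to split items into folds and to score predictions with the area under the ROC curve. It is an illustration only: the function names are hypothetical, and the assumption that the evaluation metric is AUROC is inferred from the file name auc.m, not confirmed here.

```python
import random


def k_fold_indices(n_items, k, seed=0):
    """Shuffle item indices and deal them into k disjoint folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def auc_score(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data: 20 items split into 10 folds, and four scored pairs.
folds = k_fold_indices(20, 10)
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
example_auc = auc_score(labels, scores)
print(len(folds), example_auc)
```

Each fold would serve once as the held-out test set while the remaining folds are used for training.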

Supplementary Information

supplementary/ directory

  • Supplementary_Data_1.xlsx: The list of the top 150 novel drug-target interactions predicted by DTINet, which was trained on all drugs and targets that have at least one known interacting pair. Known drug-target pairs (corresponding to the non-zero entries in the drug-target interaction matrix) and novel predicted DTIs that share homologous proteins (with sequence identity scores >40%) with known DTIs were excluded from the list.
  • Supplementary_Data_2.xlsx: The entire list of novel drug-target interactions predicted by DTINet, which was trained on all drugs and targets that have at least one known interacting pair.
  • Supplementary_Data_3.xlsx: Examples of novel predictions that are supported by previously known evidence in the literature.

Code and data

src/ directory

  • DTINet.m: predict drug-target interactions (DTIs)
  • DCA.m: compact feature learning by integrating heterogeneous networks
  • diffusionRWR.m: network diffusion algorithm (random walk with restart)
  • compute_similarity.m: compute Jaccard similarity based on interaction/association network
  • auc.m: evaluation script
  • run_DCA.m: example code of running DCA.m for feature learning
  • run_DTINet.m: example code of running DTINet.m for drug-target prediction
  • train_mf.mexa64: pre-built binary file of inductive matrix completion algorithm (downloaded from here)
  • download_imc.sh: download the inductive matrix completion source and build the executable library from source.
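To make the idea of random walk with restart concrete, here is a minimal Python sketch on a toy graph. It is not a port of diffusionRWR.m: the function name, the row-normalized transition matrix, the self-loop for isolated nodes, and the default parameter values are all assumptions chosen for this example.

```python
def random_walk_with_restart(adjacency, restart_prob=0.5, max_iter=100, tol=1e-10):
    """Return one diffusion state (a probability vector) per start node."""
    n = len(adjacency)

    # Row-normalize the adjacency matrix into transition probabilities.
    # Isolated nodes get a self-loop so every row remains a distribution.
    transition = []
    for i, row in enumerate(adjacency):
        total = sum(row)
        if total == 0:
            transition.append([1.0 if j == i else 0.0 for j in range(n)])
        else:
            transition.append([value / total for value in row])

    states = []
    for start in range(n):
        restart = [1.0 if j == start else 0.0 for j in range(n)]
        state = restart[:]
        for _ in range(max_iter):
            # One walk step, then mix with the restart distribution.
            walked = [
                sum(state[i] * transition[i][j] for i in range(n))
                for j in range(n)
            ]
            new_state = [
                (1 - restart_prob) * w + restart_prob * e
                for w, e in zip(walked, restart)
            ]
            delta = max(abs(a - b) for a, b in zip(new_state, state))
            state = new_state
            if delta < tol:
                break
        states.append(state)
    return states


# Toy path graph: node 0 - node 1 - node 2.
adjacency = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]
diffusion = random_walk_with_restart(adjacency)
for row in diffusion:
    print([round(value, 4) for value in row])
```

Each row describes how probability mass from one start node spreads over the graph. Nodes closer to the start receive more mass, which is what makes these states useful as a basis for feature learning.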

data/ directory

  • drug.txt: list of drug names
  • protein.txt: list of protein names
  • disease.txt: list of disease names
  • se.txt: list of side effect names
  • drug_dict_map: a complete ID mapping between drug names and DrugBank ID
  • protein_dict_map: a complete ID mapping between protein names and UniProt ID
  • mat_drug_se.txt: Drug-SideEffect association matrix
  • mat_protein_protein.txt: Protein-Protein interaction matrix
  • mat_protein_drug.txt: Protein-Drug interaction matrix
  • mat_drug_protein.txt: Drug-Protein interaction matrix (transpose of the above matrix)
  • mat_drug_protein_remove_homo.txt: Drug-Protein interaction matrix, in which homologous proteins with identity score >40% were excluded (see the paper).
  • mat_drug_drug.txt: Drug-Drug interaction matrix
  • mat_protein_disease.txt: Protein-Disease association matrix
  • mat_drug_disease.txt: Drug-Disease association matrix
  • Similarity_Matrix_Drugs.txt: Drug similarity scores based on chemical structures of drugs
  • Similarity_Matrix_Proteins.txt: Protein similarity scores based on primary sequences of proteins

Note: drugs, proteins, diseases and side-effects are organized in the same order across all files, including name lists, ID mappings and interaction/association matrices.
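Because the ordering is shared across files, row i of a drug-by-protein matrix refers to line i of the drug list, and column j to line j of the protein list. The Python sketch below illustrates that lookup on in-memory toy text. The whitespace-separated matrix layout, the helper names, and the example drug and protein names are assumptions made for illustration.

```python
def parse_names(text):
    """Return one name per non-empty line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def parse_matrix(text):
    """Parse a whitespace-separated numeric matrix, one row per line."""
    return [
        [float(token) for token in line.split()]
        for line in text.splitlines()
        if line.strip()
    ]


def check_shape(matrix, row_names, col_names):
    """Confirm that matrix dimensions agree with the two name lists."""
    return len(matrix) == len(row_names) and all(
        len(row) == len(col_names) for row in matrix
    )


def lookup(matrix, row_names, col_names, row_name, col_name):
    """Read one entry by name, relying on the shared ordering."""
    return matrix[row_names.index(row_name)][col_names.index(col_name)]


# Hypothetical stand-ins for drug.txt, protein.txt and mat_drug_protein.txt.
drug_text = "drugA\ndrugB\n"
protein_text = "protX\nprotY\nprotZ\n"
matrix_text = "1 0 0\n0 1 1\n"

drugs = parse_names(drug_text)
proteins = parse_names(protein_text)
drug_protein = parse_matrix(matrix_text)

shape_ok = check_shape(drug_protein, drugs, proteins)
entry = lookup(drug_protein, drugs, proteins, "drugB", "protZ")
print(shape_ok, entry)
```

A shape check like this is a cheap way to catch a name list and a matrix that have drifted out of sync when preparing your own dataset.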

feature/ directory

We provide the pre-trained vector representations for drugs and proteins that were used to produce the results in our paper.

  • drug_vector_d100.txt
  • protein_vector_d400.txt

Third-party software

Our implementation requires the Inductive Matrix Completion (IMC) library. For convenience, we provide an executable binary file in the src/ folder. The binary was built on a typical Ubuntu 14.04 (64-bit) system. If you are using another Linux platform, please consider building the library from source by running bash download_imc.sh.

Tips: We recommend that users download and install the IMC library using the download_imc.sh script. If you download the library yourself from the IMC website, please be aware that DTINet requires the C/C++ version (with Python and Matlab interfaces). Please do not use the other version, i.e., the pure MATLAB implementation. The pure MATLAB version treats the unknown/missing entries in the interaction matrix as zeros, which is not what DTINet requires.
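The distinction between a missing entry and a zero entry can change the training objective considerably. The Python sketch below computes a squared-error loss twice on a toy matrix: once over every entry (unknowns treated as zeros) and once over observed entries only. The function name, the toy values, and the choice of which entry is unknown are hypothetical, and the sketch does not reproduce the IMC objective itself.

```python
def squared_error(truth, predicted, observed=None):
    """Sum of squared errors over all entries, or only over observed (i, j) pairs."""
    if observed is None:
        observed = [
            (i, j)
            for i in range(len(truth))
            for j in range(len(truth[0]))
        ]
    return sum((truth[i][j] - predicted[i][j]) ** 2 for i, j in observed)


# Toy interaction matrix; suppose entry (0, 1) is actually unknown, not a
# confirmed non-interaction, and is merely stored as 0.
interactions = [[1, 0],
                [0, 1]]
predicted = [[0.9, 0.8],
             [0.1, 0.9]]
observed = [(0, 0), (1, 0), (1, 1)]

loss_all = squared_error(interactions, predicted)
loss_observed = squared_error(interactions, predicted, observed)
print(round(loss_all, 4), round(loss_observed, 4))
```

When every entry is counted, the confident prediction at the unknown position is penalized as if it were wrong. When only observed entries are counted, that prediction is left unconstrained, which is the behavior one wants when looking for novel interactions.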

Tutorial

  1. Put interaction/association matrices in the data/ folder.
  2. Create a network/ folder under DTINet/ and run compute_similarity.m, which computes the Jaccard similarity of drugs and proteins based on the interaction/association matrices.
  3. Specify the parameters (number of dimensions of the feature vectors, restart probability, and maximum number of iterations) and run run_DCA.m, which learns the feature vectors of drugs and proteins and saves them in the feature/ folder.
  4. Set the path of the feature vectors and the corresponding parameters in run_DTINet.m and execute it. This script predicts the drug-target interactions and evaluates the results using ten-fold cross-validation.
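The Jaccard similarity computed in step 2 can be sketched in a few lines of Python. This is an illustration and not a port of compute_similarity.m: the function name, the toy drug-disease matrix, and the convention that two rows with no associations have similarity 0 are assumptions made for this example.

```python
def jaccard_similarity(matrix):
    """Pairwise Jaccard similarity between the rows of a binary matrix."""
    # Represent each row by the set of column indices with a non-zero entry.
    sets = [{j for j, value in enumerate(row) if value != 0} for row in matrix]
    n = len(sets)
    similarity = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a, n):
            union = sets[a] | sets[b]
            # Assumed convention: an empty union yields similarity 0.
            value = len(sets[a] & sets[b]) / len(union) if union else 0.0
            similarity[a][b] = similarity[b][a] = value
    return similarity


# Toy drug-disease association matrix: 3 drugs (rows), 3 diseases (columns).
drug_disease = [[1, 1, 0],
                [1, 0, 0],
                [0, 0, 1]]
sim = jaccard_similarity(drug_disease)
for row in sim:
    print(row)
```

The result is a symmetric drug-by-drug matrix. Applying the same function to each association matrix produces one similarity network per data source.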

Citation

Luo, Y., Zhao, X., Zhou, J., Yang, J., Zhang, Y., Kuang, W., Peng, J., Chen, L. & Zeng, J. A network integration approach for drug-target interaction prediction and computational drug repositioning from heterogeneous information. Nature Communications 8, (2017).

@article{Luo2017,
  author = {Yunan Luo and Xinbin Zhao and Jingtian Zhou and Jinglin Yang and Yanqing Zhang and Wenhua Kuang and Jian Peng and Ligong Chen and Jianyang Zeng},
  title = {A network integration approach for drug-target interaction prediction and computational drug repositioning from heterogeneous information},
  doi = {10.1038/s41467-017-00680-8},
  url = {https://doi.org/10.1038/s41467-017-00680-8},
  year  = {2017},
  month = {sep},
  publisher = {Springer Nature},
  volume = {8},
  number = {1},
  journal = {Nature Communications}
}

Contacts

If you have any questions or comments, please feel free to email Yunan Luo (luoyunan[at]gmail[dot]com) and/or Jianyang Zeng (zengjy321[at]tsinghua[dot]edu[dot]cn).

License

GNU General Public License v3.0

