yehlincho / GLAMOUR

Graph Learning over Macromolecule Representations

GLAMOUR: Graph Learning over Macromolecule Representations

Somesh Mohapatra, Joyce An, Rafael Gómez-Bombarelli

Department of Materials Science and Engineering, Massachusetts Institute of Technology

This repository and its tutorial accompany the paper "Chemistry-informed Macromolecule Graph Representation for Similarity Computation, Unsupervised and Supervised Learning".


In this work, we developed a graph representation for macromolecules. Leveraging this representation, we developed methods for:

  • Similarity Computation: Chemical similarity between monomers is computed from cheminformatic fingerprints, and topologies are compared using exact graph edit distances (GED) or graph kernels. Together, these quantify the chemical and structural similarity of two arbitrary macromolecule topologies.
  • Unsupervised Learning: Dimensionality reduction of the similarity matrices, followed by coloring the points by their labels, shows distinct regions for different classes of macromolecules.
  • Supervised Learning: The representation was coupled to supervised GNN models to learn structure-property relationships in glycans and anti-microbial peptides.
  • Attribution: Attribution methods highlight the regions of the macromolecules and the substructures within the monomers that are most responsible for the predicted properties.
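As a rough illustration of the similarity computation described above, the sketch below combines a Tanimoto similarity between monomer fingerprints with a brute-force exact graph edit distance. It is not the repository's implementation: the function names (`tanimoto`, `edit_distance`, `bonds`), the monomer labels "A" and "B", the toy fingerprint bit sets, and the cost scheme (node substitution costs 1 minus the Tanimoto similarity, every insertion or deletion costs 1) are all assumptions made for this example. In practice the fingerprints would come from a cheminformatics toolkit such as RDKit, and larger graphs would need a dedicated GED or graph kernel library.

```python
from itertools import permutations


def tanimoto(bits_a, bits_b):
    """Tanimoto similarity between two sets of fingerprint on-bits."""
    if not bits_a and not bits_b:
        return 1.0
    return len(bits_a & bits_b) / len(bits_a | bits_b)


def bonds(*pairs):
    """Build an undirected edge set from (u, v) pairs (no self-loops)."""
    return {frozenset(pair) for pair in pairs}


def edit_distance(nodes1, edges1, nodes2, edges2, fps):
    """Exact graph edit distance by exhaustive search (tiny graphs only).

    nodes*: dict mapping node id -> monomer label (ids must not be None)
    edges*: set of frozenset({u, v}) undirected bonds
    fps:    dict mapping monomer label -> set of fingerprint on-bits
    Assumed costs: substitution = 1 - Tanimoto, insertion/deletion = 1.
    """
    # Pad each side with None so any node can be deleted or inserted.
    slots1 = list(nodes1) + [None] * len(nodes2)
    slots2 = list(nodes2) + [None] * len(nodes1)
    best = float("inf")
    for perm in permutations(slots2):
        cost = 0.0
        mapping = {}  # real node in graph 1 -> real node in graph 2
        for u, v in zip(slots1, perm):
            if u is not None and v is not None:
                cost += 1.0 - tanimoto(fps[nodes1[u]], fps[nodes2[v]])
                mapping[u] = v
            elif u is not None or v is not None:
                cost += 1.0  # node deletion or insertion
        mapped = set()
        for u, w in edges1:
            if u in mapping and w in mapping:
                mapped.add(frozenset((mapping[u], mapping[w])))
            else:
                cost += 1.0  # edge removed together with a deleted endpoint
        # Mapped edges missing from graph 2 are deleted; extra edges inserted.
        cost += len(mapped ^ edges2)
        best = min(best, cost)
    return best


# Hypothetical monomers with toy fingerprints (not real chemistry).
fps = {"A": {1, 2, 3, 4}, "B": {1, 2, 3, 5}}

# Two linear three-monomer chains: A-A-B and A-B-B.
chain_aab = ({0: "A", 1: "A", 2: "B"}, bonds((0, 1), (1, 2)))
chain_abb = ({0: "A", 1: "B", 2: "B"}, bonds((0, 1), (1, 2)))

print(tanimoto(fps["A"], fps["B"]))
print(edit_distance(*chain_aab, *chain_abb, fps))
```

The search enumerates every node assignment, so its cost grows factorially with the number of monomers. It is only meant to make the cost model concrete on graphs with a handful of nodes.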

Using the codebase

To use the code with an Anaconda environment, run the following installation commands:

conda create -n GLAMOUR python=3.6.12
conda activate GLAMOUR
conda install pytorch==1.7.1 cudatoolkit=10.1 -c pytorch
conda install -c conda-forge matplotlib==3.2.2
conda install -c rdkit rdkit==2018.09.3
conda install -c dglteam dgl-cuda10.1
conda install -c dglteam dgllife
conda install captum -c pytorch
conda install -c anaconda scikit-learn==0.23.2
conda install -c anaconda networkx
conda install seaborn
conda install -c conda-forge svglib
conda install -c conda-forge umap-learn
conda install -c conda-forge grakel

If you are new to Anaconda, install it first from the official Anaconda website.
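After installation, a quick way to confirm that the environment is complete is to check that each package can be found on the import path. The script below is a minimal sketch, not part of the repository. The helper `missing_modules` is hypothetical, and the import names in `REQUIRED_MODULES` are the usual ones for the conda packages listed above (for example, scikit-learn imports as `sklearn` and umap-learn as `umap`).

```python
import importlib.util

# Import names assumed for the conda packages in the installation commands.
REQUIRED_MODULES = [
    "torch", "matplotlib", "rdkit", "dgl", "dgllife", "captum",
    "sklearn", "networkx", "seaborn", "svglib", "umap", "grakel",
]


def missing_modules(names):
    """Return the names that cannot be located on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

Run it inside the activated GLAMOUR environment. It only checks that the modules can be located, not that their versions match the pinned ones.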

How to cite

@article{mohapatra2022chemistry,
  title={Chemistry-informed Macromolecule Graph Representation for Similarity Computation, Unsupervised and Supervised Learning},
  author={Mohapatra, Somesh and An, Joyce and G{\'o}mez-Bombarelli, Rafael},
  journal={Machine Learning: Science and Technology},
  year={2022},
  publisher={IOP Publishing}
}

License

MIT License
