maduprey / protein-structure-learning

Protein classification with deep learning and boosted trees using topological features

Topological protein classification

This repo provides notebooks for protein classification using topological features with data from the Protein Classification Benchmark PCB00019 test.

  • protein-classification-benchmark.ipynb compares several boosted-tree models, an SVM, and a TensorFlow MLP, all trained on persistence curves, across the 55-task benchmark.
  • protein-classification-deep-learning.ipynb builds a multiclass classifier using a TensorFlow MLP trained on persistence curves. An SVM is trained on the same data for comparison.
  • protein-3D.ipynb provides tools for visualizing protein structure using py3Dmol in a Colab notebook setting.
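
As a rough illustration of the persistence curves the first two notebooks train on, here is a minimal sketch of two common ones, the Betti curve and the lifespan curve, computed from a persistence diagram. Which curves the notebooks actually use is not stated here, so the choice is an assumption; the function names and the toy diagram are hypothetical, and the code is kept dependency-free rather than mirroring the notebooks' own stack.

```python
def betti_curve(diagram, grid):
    """Count the (birth, death) intervals alive at each grid value t.

    An interval is treated as alive at t when birth <= t < death.
    """
    return [
        sum(1 for birth, death in diagram if birth <= t < death)
        for t in grid
    ]


def lifespan_curve(diagram, grid):
    """Sum the lifespans (death - birth) of the intervals alive at each t."""
    return [
        sum(death - birth for birth, death in diagram if birth <= t < death)
        for t in grid
    ]


# Toy persistence diagram: three (birth, death) pairs made up for illustration,
# not taken from the PCB00019 data.
toy_diagram = [(0.0, 1.0), (0.5, 2.0), (1.5, 3.0)]
grid = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]

betti = betti_curve(toy_diagram, grid)
life = lifespan_curve(toy_diagram, grid)

print(betti)  # [1, 2, 1, 2, 1, 1]
print(life)   # [1.0, 2.5, 1.5, 3.0, 1.5, 1.5]
```

Each curve is a fixed-length vector (one value per grid point), which is what allows models such as boosted trees, an SVM, or an MLP to consume it as an ordinary feature vector.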

Notes

Originally inspired by Barnes et al.'s Frontiers paper, A Comparative Study of Machine Learning Methods for Persistence Diagrams.

Look into incorporating PersLay: A Neural Network Layer for Persistence Diagrams and New Graph Topological Signatures. It is a deep learning approach cited in the Frontiers paper, and it needs an in-depth review. It even uses protein data for classification (with roughly 75% accuracy, it seems). Main paper. Supp.

Languages

Language:Jupyter Notebook 100.0%