ryan-hays / core

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Deep learning on protein images

First application of deep convolutional neural networks for drug-protein interaction prediction has appeared in the paper of AtomWise when the small 3D AlexNets have been trained on atoms of drug-protein complexes. We have experimented with the network structure and obtained predictions of a very high accuracy.

alt tag Fig1: One of the first networks we have trained is a 3D modification of the AlexNet as it was described in: Krizhevsky et al.

alt tag Fig2: Network Vision. To represent input to the network, coordinates of atoms determined by crystallographers have to be converted to an image made of cubic pixels. Since on a small scale the world is very sparse, approximately 1 real-valued pixel in 10^7 zeros, an approximation has to be made. In our case every atom is assigned a numeric tag that fills the value of the cubic pixel with the size of 0.5A. This picture depicts the structure of an aminoglycoside nucleotidyltransferase - an enzyme important for nucleic acids metabolism with the substrate in it's binding site. Image of protein have been rendered in VMD in a form common for scientific literature. Small differently colored spheres represent an approximation of atomic coordinates that the network sees.

alt_tag Fig3: The structure of the network that performed very well on our previous Kaggle competition. Significant improvements comapred have been achieved by the smaller pixel size and very wide first layer convolutions. An interactive demo of the same network can be seen here.

Here we provide a working example of the code that distinguishes correct docked positions of ligands from incorrect with an AUC of 0.934 after 80 epochs of training.

Usage:

git clone https://github.com/mitaffinity/core.git

Navigate your browser to https://inclass.kaggle.com/c/affinity4, and follow the registration steps to get the data. Afterwards, change the path in FLAGS.database_path to the location on your local machine, and type

python av4_main.py

this script should automatically index folders in your database and start training.

Our old Kaggle competition, and software can be found here. Enjoy!

About


Languages

Language:Python 99.9%Language:Shell 0.1%