Alternative Usage

Question

protbiochem opened this issue 3 years ago · comments

Our lab recently collected a large amount of data on a specific class of proteins and what they bind to. This protein class is not represented in the TOUGH or VERTEX datasets, so when we try to distinguish binding pockets within this class using DeeplyTough, it fails, presumably because the model was not trained on this class of proteins. My question is: is there some way to train DeeplyTough on our own data, so that the resulting model can distinguish between these protein classes?

Joshua Meyers · Answer 1 · Mon Oct 25 2021 20:45:58 GMT+0800 (China Standard Time)

Hey guys, apologies for the slow reply. This repo contains all the code required for training DeeplyTough, so in theory it is possible. However, it is not a trivial change and, unfortunately, we cannot support this work. To give an indication of what it entails: you would have to implement your own dataset class (analogous to vertex.py) and then plug it into the various training and evaluation scripts.
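To make the "own dataset class" idea a bit more concrete, here is a rough, standalone sketch of the kind of information such a class would need to expose: a list of pocket structures and labelled pocket pairs. Everything in it is an assumption for illustration: the names `PocketEntry`, `CustomPocketDataset`, `get_structures`, `labelled_pairs`, the CSV columns, and the "same ligand means similar pockets" labelling rule are all hypothetical. The interface the training and evaluation scripts actually expect should be read from vertex.py, not from this sketch.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, TextIO, Tuple


@dataclass(frozen=True)
class PocketEntry:
    """One pocket in the (hypothetical) index file."""
    code: str          # unique identifier for the pocket
    protein_path: str  # path to the protein structure file
    pocket_path: str   # path to the pocket definition file
    ligand: str        # bound ligand, used here as the similarity label


class CustomPocketDataset:
    """Hypothetical dataset wrapper; NOT the DeeplyTough interface."""

    def __init__(self, index_file: TextIO):
        # Assumed CSV columns: code, protein, pocket, ligand
        reader = csv.DictReader(index_file)
        self.entries = [
            PocketEntry(row["code"], row["protein"], row["pocket"], row["ligand"])
            for row in reader
        ]

    def get_structures(self) -> List[PocketEntry]:
        """Return every pocket entry listed in the index."""
        return list(self.entries)

    def labelled_pairs(self) -> List[Tuple[str, str, int]]:
        """Return all unordered pocket pairs with label 1 if both pockets
        bind the same ligand, else 0 (an assumed labelling rule)."""
        pairs = []
        for i, a in enumerate(self.entries):
            for b in self.entries[i + 1:]:
                pairs.append((a.code, b.code, int(a.ligand == b.ligand)))
        return pairs


# Tiny in-memory index standing in for a real file on disk.
index_csv = (
    "code,protein,pocket,ligand\n"
    "a,a.pdb,a_pocket.pdb,ATP\n"
    "b,b.pdb,b_pocket.pdb,ATP\n"
    "c,c.pdb,c_pocket.pdb,NAD\n"
)
dataset = CustomPocketDataset(io.StringIO(index_csv))
pairs = dataset.labelled_pairs()
```

How positives and negatives are defined for your protein class is the real design decision here; the ligand-identity rule above is only a placeholder.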

From a scientific point of view, I would have expected the existing trained networks to distinguish these pockets, so I am surprised by your result! When retraining, you would presumably lose the advantages of such a large training dataset (TOUGH-M1), and care would have to be taken to design an appropriate splitting strategy.
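On the splitting point, one common way to avoid leakage is to split by group rather than by individual pocket, so that closely related proteins never end up on both sides of the train/test boundary. The sketch below is a generic illustration of that idea in plain Python, not the procedure used in this repo. The function name `split_by_group`, the `pocket_to_group` mapping, and the family labels are hypothetical; in practice the groups might come from sequence clustering or protein family annotations.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple


def split_by_group(
    pocket_to_group: Dict[str, str],
    test_fraction: float = 0.2,
    seed: int = 0,
) -> Tuple[List[str], List[str]]:
    """Split pockets into train/test so that no group spans both sets."""
    # Collect the pockets belonging to each group.
    groups = defaultdict(list)
    for pocket, group in pocket_to_group.items():
        groups[group].append(pocket)

    # Shuffle the group ids reproducibly.
    group_ids = sorted(groups)
    random.Random(seed).shuffle(group_ids)

    # Assign whole groups to the test set until it reaches the target size;
    # all remaining groups go to the training set.
    target = test_fraction * len(pocket_to_group)
    train, test = [], []
    for gid in group_ids:
        if len(test) < target:
            test.extend(groups[gid])
        else:
            train.extend(groups[gid])
    return train, test


# Hypothetical pockets labelled with hypothetical protein families.
pocket_to_group = {
    "p1": "famA", "p2": "famA",
    "p3": "famB",
    "p4": "famC", "p5": "famC",
    "p6": "famD",
}
train, test = split_by_group(pocket_to_group, test_fraction=0.2, seed=0)
```

Because whole groups are assigned at once, the test set can come out somewhat larger than the requested fraction; with few, uneven groups it is worth checking the resulting sizes by hand.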

Best of luck!