ml-jku / clamp

Code for the paper Enhancing Activity Prediction Models in Drug Discovery with the Ability to Understand Human Language

Home Page: https://arxiv.org/abs/2303.03363


Pretraining and computational resources?


Hi, thank you for sharing your great work!
I am interested in the concept of your paper and would like to try the pretraining described in it.
How can I pretrain using this repository?
My other question is about computational resources.
In your paper, the experiments took a total of 170 days across 800 runs. Does the pretraining alone require that much compute?
Is it possible to pretrain using a single GPU?

Thank you in advance:)

Hi concon23,
pretraining on the full PubChem18 dataset should take around 2-5 days on a modest consumer GPU once the data is preprocessed.
You can follow the instructions in the reproduce section of the readme.
Hope you manage; otherwise I'm happy to help.
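
For reference, a pretraining run would look roughly like the command below; the flags are taken from the command discussed later in this thread, and the dataset path ./data/pubchem18 is only a placeholder for wherever your preprocessed PubChem data lives, so check the readme's reproduce section for the exact invocation.

python clamp/train.py --dataset=./data/pubchem18 --assay_mode=clip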

Hi @phseidl
Thank you for your kind reply.
Understood, thank you.
That is great for users with modest computational resources!

Sincerely:)

Hi @phseidl
Sorry for asking another question.
python clamp/train.py --dataset=./data/fsmol --assay_mode=clip --split=FSMOL_split
Does the above command run the pretraining?
Or does it run few-shot training or something else?

Thank you in advance:)

Hi @concon23,
this performs pretraining and then evaluates the model in the zero-shot setting.
To run few-shot evaluation, you can add --support_set_size=k, where k is the number of support samples you want.
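
For example, a few-shot run with eight support samples per task (the value 8 is only illustrative) would be:

python clamp/train.py --dataset=./data/fsmol --assay_mode=clip --split=FSMOL_split --support_set_size=8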
Best, Philipp