Add the DeepDDI model

Question

benedekrozemberczki opened this issue 2 years ago · comments

Benedek Rozemberczki commented 2 years ago

Dear @hzcheney,

Please read the paper first. It is here.
After that read the contributing guidelines.
If there is an existing open source version of the model please take a look.
ChemicalX is built on top of PyTorch 1.10 and torchdrug.
A similar model is which uses to generate drug representations. Take a look at the layer definition here.
The library builds heavily on top of torchdrug, and molecules in batches are PackedGraphs.
There is already a model class under ./chemicalx/models/.
Context features, drug level features and labels are all FloatTensors.
Look at the examples and tests under ./examples/ and ./tests/.
Add auxiliary layers as you see fit - please document these, add tests and add these layers to the main readme.md if needed.
Add typing to the initialisation and forward pass.
Non data dependent hyperparameters should have default values.
Please add tests under ./tests/ and make sure that your model/layer is tested with real data.
Write an example under ./examples/. What is the AUC on the test set? Is it reasonable?
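
A few of the points above (typed initialisation and forward pass, defaults for non data dependent hyperparameters, FloatTensor inputs) could look roughly like the sketch below. This is only an illustration on a plain torch.nn.Module: the class name, the argument names and the default values are assumptions made for the example, and it does not use the actual ChemicalX model class under ./chemicalx/models/.

```python
from typing import List

import torch
from torch import nn


class PairwiseFeedForwardSketch(nn.Module):
    """Hypothetical feed-forward scorer for a drug pair (illustration only)."""

    def __init__(
        self,
        *,
        context_channels: int,  # data dependent, so no default
        drug_channels: int,  # data dependent, so no default
        hidden_channels: int = 128,  # assumed default
        hidden_layers: int = 2,  # assumed default
        out_channels: int = 1,  # assumed default
    ) -> None:
        super().__init__()
        layers: List[nn.Module] = []
        in_channels = context_channels + 2 * drug_channels
        for _ in range(hidden_layers):
            layers += [nn.Linear(in_channels, hidden_channels), nn.ReLU()]
            in_channels = hidden_channels
        # Sigmoid keeps the outputs in [0, 1] so they can be scored with AUC.
        layers += [nn.Linear(in_channels, out_channels), nn.Sigmoid()]
        self.network = nn.Sequential(*layers)

    def forward(
        self,
        context_features: torch.FloatTensor,
        drug_features_left: torch.FloatTensor,
        drug_features_right: torch.FloatTensor,
    ) -> torch.FloatTensor:
        # Concatenate the context and both drug feature vectors per sample.
        hidden = torch.cat(
            [context_features, drug_features_left, drug_features_right], dim=1
        )
        return self.network(hidden)


model = PairwiseFeedForwardSketch(context_channels=3, drug_channels=5)
scores = model(torch.rand(4, 3), torch.rand(4, 5), torch.rand(4, 5))
```

Only the two data dependent sizes have to be passed in; everything else falls back to a default, which keeps the examples and tests short.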

Han · Answer 1 · Thu Dec 23 2021 21:26:19 GMT+0800 (China Standard Time)

Hi! 😊 Is this repo open to contributions?

Benedek Rozemberczki · Answer 2 · Thu Dec 23 2021 22:33:11 GMT+0800 (China Standard Time)

Hi @hzcheney,

We are architecting the data loaders in January 2022 and after that, we will have a board with outstanding features and issues. I will get back to you!

Thank you for your interest! We want to hit KDD 2022 Applied Track.

Benedek

Han · Answer 3 · Fri Dec 24 2021 13:32:37 GMT+0800 (China Standard Time)

That will be great! Good luck on your paper!

Benedek Rozemberczki · Answer 4 · Fri Jan 14 2022 17:43:27 GMT+0800 (China Standard Time)

Hi @hzcheney,

Are you interested in contributing?

Han · Answer 5 · Sat Jan 15 2022 00:05:04 GMT+0800 (China Standard Time)

@benedekrozemberczki Yeah, I will try.

Benedek Rozemberczki · Answer 6 · Mon Jan 24 2022 05:48:51 GMT+0800 (China Standard Time)

@YuWVandy what do you think?

Han · Answer 7 · Fri Jan 28 2022 00:35:02 GMT+0800 (China Standard Time)

Hi! @benedekrozemberczki Sorry about the late response, I have already finished the model part. There is a problem with the input feature named SSP (structural similarity profile). It consists of drug similarity vectors that are based on the drugs' fingerprints. The problem is that I can't find a straightforward way to calculate the SSP, any idea?

Benedek Rozemberczki · Answer 8 · Fri Jan 28 2022 01:19:43 GMT+0800 (China Standard Time)

It is the following:

For each drug a fingerprint is generated, which gives a D × n matrix, where D is the number of drugs and n is the fingerprint dimensionality.
Using the fingerprints you define a D × D similarity matrix.
You then use PCA to reduce the dimensionality of the similarity matrix.
On my side, this would require adding a key to the dataset which we could use to retrieve the SSP vectors.

I would say using the drug feature vectors is sufficient to develop this.
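
The steps above can be sketched with NumPy as follows. This is a minimal sketch under some assumptions: the thread does not say which similarity measure to use, so Tanimoto similarity on binary fingerprints is assumed here, PCA is done through an SVD of the centered matrix, and all function names and the default number of components are made up for the example.

```python
import numpy as np


def tanimoto_similarity(fingerprints: np.ndarray) -> np.ndarray:
    """Return the D x D Tanimoto similarity of binary fingerprints shaped (D, n)."""
    bits = fingerprints.astype(np.float64)
    intersection = bits @ bits.T  # shared on-bits for every drug pair
    counts = bits.sum(axis=1)  # on-bits per drug
    union = counts[:, None] + counts[None, :] - intersection
    # Pairs with an empty union (two all-zero fingerprints) are set to 0.
    return np.divide(
        intersection, union, out=np.zeros_like(intersection), where=union > 0
    )


def pca_reduce(matrix: np.ndarray, components: int) -> np.ndarray:
    """Project the rows of `matrix` onto its leading principal axes."""
    centered = matrix - matrix.mean(axis=0, keepdims=True)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:components].T


def structural_similarity_profiles(
    fingerprints: np.ndarray, components: int = 50  # assumed default
) -> np.ndarray:
    """Fingerprints (D, n) -> similarity (D, D) -> reduced profiles (D, components)."""
    similarity = tanimoto_similarity(fingerprints)
    components = min(components, similarity.shape[0])
    return pca_reduce(similarity, components)


rng = np.random.default_rng(0)
toy_fingerprints = rng.integers(0, 2, size=(8, 32))  # 8 toy drugs, 32 bits each
ssp = structural_similarity_profiles(toy_fingerprints, components=4)
```

Each row of `ssp` is then one drug's reduced similarity profile, which could be stored next to the other drug features and looked up by drug identifier.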

Charles Tapley Hoyt · Answer 9 · Fri Jan 28 2022 01:29:54 GMT+0800 (China Standard Time)

I would say don’t consider the drug featurization as a part of the model. Whether you use MACCS, Morgan, or SSP shouldn’t make a difference.

Charles Tapley Hoyt · Answer 10 · Fri Jan 28 2022 01:33:42 GMT+0800 (China Standard Time)

So you could just submit the PR to take in whatever drug features are available from the data loader (currently Morgan fingerprints) and in future work we could add different featurizations to the data loader.

Benedek Rozemberczki · Answer 11 · Fri Jan 28 2022 01:55:53 GMT+0800 (China Standard Time)

Completely agree with @cthoyt about this. It should not be on the model side.

Benedek Rozemberczki · Answer 12 · Sat Jan 29 2022 00:27:46 GMT+0800 (China Standard Time)

@hzcheney Are you going to open a PR with your code?

Han · Answer 13 · Sat Jan 29 2022 00:57:12 GMT+0800 (China Standard Time)

@benedekrozemberczki Yeah! I have already opened a PR. Please review it!