About the trained models and ROC curve

Question

wliusjtu opened this issue 3 years ago · comments

Dear authors,

We have recently been working with your paper and the released code, and we have two questions that we hope you can help us with:

1. Training models for all the protein sequences from scratch is quite time-consuming. Are trained models for all the protein sequences available alongside the released code?
2. You report AUC and ROC results in your paper, but the released code does not seem to include the corresponding evaluation code. Could you please provide it?

Many thanks.

Pascal Notin · Answer 1 · Wed Jun 15 2022 10:19:40 GMT+0800 (China Standard Time)

Dear Wei,
Apologies for the delayed response. Regarding your two questions:

1. We have made available all scores for the proteins we have trained models for at https://evemodel.org/. You can also download from that website the MSAs that we used to train the different models with the training script in the repo (latest version available at https://github.com/OATML-Markslab/EVE).
2. Whenever we mention AUC in the paper, we mean the area under the Receiver Operating Characteristic (ROC) curve. We use the default implementation from sklearn.metrics; see, for example, the performance_helpers.py script under utils (https://github.com/OATML-Markslab/EVE/blob/master/utils/performance_helpers.py#L42).
Kind regards,
Pascal
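
For readers who want a starting point, the AUC computation described in the second point can be sketched as below. This is only an illustration and not the code from performance_helpers.py: it assumes that the "default implementation from sklearn.metrics" refers to roc_auc_score and roc_curve, and the labels and scores are made-up toy values (1 = pathogenic, 0 = benign is a hypothetical encoding).

```python
import numpy as np
from sklearn.metrics import auc, roc_auc_score, roc_curve

# Toy data, invented for illustration only.
# Assumed encoding: 1 = pathogenic, 0 = benign.
labels = np.array([0, 0, 1, 1])
# Hypothetical model scores, where a higher score means "more likely pathogenic".
scores = np.array([0.1, 0.4, 0.35, 0.8])

# Area under the ROC curve, computed directly from labels and scores.
auc_value = roc_auc_score(labels, scores)

# Points of the ROC curve itself (false positive rate vs. true positive rate),
# which can be passed to any plotting library.
fpr, tpr, thresholds = roc_curve(labels, scores)

# Integrating the curve with the trapezoidal rule gives the same area.
auc_from_curve = auc(fpr, tpr)

print(f"AUC = {auc_value:.3f}")
```

If the model's scores are oriented the other way (lower means more pathogenic), negate them before calling these functions; otherwise the reported AUC is one minus the intended value.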

Pascal Notin · Answer 2 · Thu Dec 08 2022 18:55:01 GMT+0800 (China Standard Time)

Closing the issue now but feel free to re-open if needed.