Jupyter notebooks for playing with Mpro virtual screening info

This notebook is a collection of examples for processing the Mpro fragment screen virtual screening follow-up data.

It is not meant to be the definitive approach, just a collection of suggestions.

See these links for background on the project:

If you are running a local Jupyter environment, create a conda environment:

conda env create -f environment.yml

Then activate the environment:

conda activate jupyter-xchem

To get NGLViewer working you might need to run this:

jupyter-nbextension enable nglview --py --sys-prefix

Then start Jupyter:

jupyter notebook

Notebooks

The following notebooks may be of interest:

Score_distrbutions.ipynb - initial playground with a hotch-potch of approaches that are used in the main notebooks.
1_DataPrep.ipynb - initial data merging and preparation.
2_InititalDataAnalysis.ipynb - basic analysis of the results.
3_AugmentationAndFiltering.ipynb - augmentation and filtering of the results.

Datasets

The following datasets may be of interest:

Mpro_16_data.sdf.gz - SD file containing the output of the 1_DataPrep notebook
Mpro_16_data.smi.gz - file with the SMILES from the output from the 1_DataPrep notebook
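
As a rough sketch of how the SMILES could be pulled out of the SD file without a cheminformatics toolkit: this assumes the title line (first line) of each record holds the SMILES string and that records end with the standard `$$$$` delimiter. The function name `iter_sdf_titles` is hypothetical, and the notebooks themselves may well read the file differently (for example with RDKit).

```python
import gzip


def iter_sdf_titles(path):
    """Yield the title line (first line) of each record in a gzipped SD file.

    Hypothetical helper: assumes records are terminated by a '$$$$' line.
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        at_record_start = True
        for line in handle:
            if at_record_start:
                # The first line of every SD record is its title.
                yield line.strip()
                at_record_start = False
            elif line.strip() == "$$$$":
                # Record delimiter: the next line starts a new record.
                at_record_start = True


# Example usage (assumes the file is in the working directory):
# for title in iter_sdf_titles("Mpro_16_data.sdf.gz"):
#     print(title)
```

Reading only the title lines keeps memory use low, since the file is streamed rather than loaded whole.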

The following datasets have been provided to supplement the data with ADMET data:

data/enalos/16/Enalos_data.csv.gz - data provided by NovaMechanics Ltd through Enalos Suite (analysis here)
data/prosilico/16/predictions.csv.gz - data provided by Prosilico (analysis here)
data/marionegri/16/EPA_tox_class_1_to_20000.txt.gz - EPA tox class predictions generated by The Mario Negri IRCCS Institute (analysis here)

These are intended to be used in the 3_AugmentationAndFiltering notebook.

If you want to generate data that can be used when selecting compounds (see the datasets above for examples), start from one of the two Mpro_16_data files listed above and PLEASE make sure the SMILES string (the title line of each SDF record) is included in your data so that it can be merged into the main data.
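
The reason the SMILES string matters can be sketched as follows: supplementary data is matched to the main data by exact string comparison on the SMILES, so it must be copied through unchanged. This is only an illustration using the standard library. The column name `SMILES` and the helper names `load_by_smiles` and `merge_on_smiles` are assumptions, not taken from the notebooks, and the actual headers in the supplied CSV files may differ.

```python
import csv
import gzip


def load_by_smiles(path, smiles_column="SMILES"):
    """Index the rows of a gzipped CSV file by their SMILES column.

    The column name is an assumption; adjust it to the real header.
    """
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        return {row[smiles_column]: row for row in csv.DictReader(handle)}


def merge_on_smiles(main_smiles, extra_rows):
    """Pair each main SMILES with its supplementary row, or None if absent.

    Matching is by exact string equality, so a re-canonicalised SMILES
    would not be found.
    """
    return [(smi, extra_rows.get(smi)) for smi in main_smiles]
```

Because the lookup is a plain dictionary match, any change to how a SMILES is written (even a chemically equivalent form) leaves that row unmatched.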

Updates

This data is continually being updated. We try to keep this README up to date.
