This repository contains the source of pyscreener, both a library and software for conducting high-throughput virtual screening (HTVS) via Python calls.
- python>=3.6
- all virtual screening software and openbabel located on your PATH
- (if using DOCK6-based HTVS) the location of the DOCK6 parent directory in your environment variables
pyscreener is not a virtual screening software in itself. Rather, it is a wrapper around common VS software that provides a simple, common interface between them, so that users do not have to learn the ins and outs of the preparation+simulation pipeline for each different software. With that in mind, it is up to the user to install the appropriate virtual screening software and place it on their PATH.
After you have downloaded the software needed to prepare inputs for and run your VS software of choice, you have two options:
- append the directory containing this software to your PATH via the command export PATH=$PATH:<dir>, where <dir> is the directory containing the software in question.
- alternatively, and perhaps more easily, copy the software to a directory that is already on your PATH. Typically this will be either ~/bin or ~/.local/bin. To see which directories are currently on your PATH, type echo $PATH. There will typically be a lot of directories on your PATH, and it is best to avoid creating files in any directory above your home directory ($HOME on most *nix-based systems).
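As a quick sanity check after either option, the lookup the shell performs can be sketched in Python. This is a minimal sketch, not part of pyscreener: the function name find_missing is hypothetical, and the executable names in the example are assumptions that should be replaced with the names of the software you actually installed.

```python
import shutil


def find_missing(executables):
    """Return the names in `executables` that cannot be found on PATH."""
    # shutil.which mirrors the shell's lookup: it returns the full path of the
    # first match on PATH, or None when nothing matches.
    return [name for name in executables if shutil.which(name) is None]


if __name__ == "__main__":
    # Hypothetical executable names; substitute those of your own software.
    required = ["obabel", "vina"]
    missing = find_missing(required)
    if missing:
        print("not found on PATH:", ", ".join(missing))
    else:
        print("all required executables were found on PATH")
```

Anything reported as missing either has not been installed or lives in a directory that has not yet been added to your PATH.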
Due to some wonkiness with getting DOCK6 to work with pyscreener, the DOCK6 environment variable must be set to the location of the DOCK6 parent folder (the folder that is unpacked after downloading the original zip file and that contains both the bin and parameters subdirectories). To set the environment variable, enter the following command: export DOCK6=<path/to/dock6>. (Note: this environment variable must always be set before running pyscreener, so it's probably best to place this command inside your .bashrc or .bash_profile.)
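The requirement above can be verified before launching a screen. The following is a minimal sketch under the assumption that a valid DOCK6 parent folder is simply a directory containing bin and parameters subdirectories, as described above; the function name check_dock6_home is hypothetical and is not part of pyscreener.

```python
import os
from pathlib import Path


def check_dock6_home(environ=os.environ):
    """Return a list of problems with the DOCK6 variable (empty if none)."""
    value = environ.get("DOCK6")
    if not value:
        return ["the DOCK6 environment variable is not set"]
    home = Path(value).expanduser()
    if not home.is_dir():
        return [f"DOCK6={value} is not a directory"]
    # The parent folder is expected to contain both of these subdirectories.
    return [
        f"missing subdirectory: {home / sub}"
        for sub in ("bin", "parameters")
        if not (home / sub).is_dir()
    ]


if __name__ == "__main__":
    for problem in check_dock6_home():
        print(problem)
```

An empty result means the variable is set and points at a folder with the expected layout; it does not confirm that the DOCK6 binaries themselves work.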
Note: both of the above steps must be satisfied before using pyscreener.
To avoid having to do this every time you start a new shell, you can add the commands you typed to your shell's startup file (e.g., .bash_profile for a bash shell). You can also add them to the non-login shell startup file, but it's not a good idea to recursively edit your PATH in these files.
The first step in installing pyscreener is to clone this repository: git clone <this_repo>
The easiest way to install all dependencies is to use conda along with the supplied environment.yml file, but you may also install them manually if desired. All libraries listed in that file must be installed before using pyscreener.
- (if necessary) install conda
- cd /path/to/pyscreener
- conda env create -f environment.yml
Before running pyscreener, be sure to first activate the environment: conda activate pyscreener
- vina-type software
  - install ADFR Suite for receptor preparation
  - install any of the following docking software: vina, qvina2, smina, psovina and ensure the desired software is located on your PATH
- DOCK6
  - install DOCK6 and specify the DOCK6 environment variable as the path of the parent folder (the one containing bin, install, etc.) as detailed above
  - install sphgen_cpp and place the executable inside the bin subdirectory of the DOCK6 parent directory
  - install chimera and place the executable on your PATH (either by moving the executable to a folder already on your PATH or by adding the folder containing the executable to your PATH)
pyscreener was designed to have a minimal interface under the principle that a high-throughput virtual screen is intended to be a broad-strokes technique to gauge ligand favorability. With that in mind, all one really needs to get going are the following:
- the PDB ID of your receptor of interest or a PDB format file of the specific structure
- a file containing the ligands you would like to dock, in SDF, SMI, or CSV format
- the coordinates of your docking box (center + size), a PDB format file containing the coordinates of a previously bound ligand, or a numbered list of residues from which to construct the docking box (e.g., [42, 64, 117, 169, 191])
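To make the relationship between a previously bound ligand and a docking box (center + size) concrete, here is a minimal sketch that derives an axis-aligned box from the atom records of a PDB format file. It is an illustration only and not necessarily how pyscreener constructs its box: the function name box_from_pdb_lines and the buffer parameter (with its default of 10 angstroms) are assumptions.

```python
def box_from_pdb_lines(lines, buffer=10.0):
    """Compute (center, size) of an axis-aligned box enclosing PDB atoms.

    `buffer` (in angstroms) is added to each dimension; the default is an
    arbitrary illustrative value.
    """
    coords = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            # PDB is a fixed-width format: x, y, and z occupy
            # columns 31-38, 39-46, and 47-54, respectively.
            coords.append(
                (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            )
    if not coords:
        raise ValueError("no ATOM/HETATM records found")

    mins = [min(axis) for axis in zip(*coords)]
    maxs = [max(axis) for axis in zip(*coords)]
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    size = tuple((hi - lo) + buffer for lo, hi in zip(mins, maxs))
    return center, size
```

Because an open file iterates over its lines, the function can be called directly on a file handle, e.g. with open("ligand.pdb") as fid: center, size = box_from_pdb_lines(fid), where ligand.pdb is a placeholder filename.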
There are a variety of other options you can specify as well (including how to score a ligand given that multiple scored conformations are output, how many times to repeatedly dock a given ligand, etc.). To see all of these options and what they do, use the following command: python run.py --help
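The idea of scoring a ligand from multiple scored conformations, and from repeated docking runs, can be sketched as a two-stage reduction. This is a hedged illustration rather than pyscreener's implementation: the function names and the mode names "best" and "average" are hypothetical, and the sketch assumes that lower (more negative) scores are better, as with Vina-type binding energies. Consult python run.py --help for the options pyscreener actually provides.

```python
from statistics import mean


def reduce_scores(scores, mode="best"):
    """Collapse the scores of several docked conformations into one value.

    Assumes lower scores are better. The mode names are hypothetical.
    """
    scores = list(scores)
    if not scores:
        return None  # nothing was docked successfully
    if mode == "best":
        return min(scores)
    if mode == "average":
        return mean(scores)
    raise ValueError(f"unknown mode: {mode!r}")


def score_over_repeats(runs, mode="best", repeat_mode="average"):
    """Reduce the scores within each run, then reduce across the runs."""
    per_run = [reduce_scores(run, mode) for run in runs]
    return reduce_scores([s for s in per_run if s is not None], repeat_mode)
```

For example, score_over_repeats([[-8.0, -7.0], [-6.0, -5.0]]) takes the best score of each run (-8.0 and -6.0) and then averages them.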
All of these options may be specified on the command line, but they may also be placed in a configuration file that accepts YAML, INI, and argparse syntaxes. Example configuration files are located in test_configs. Assuming everything is working and installed properly, you can run any of these files via the following command: python run.py --config test_configs/<config>
At the core of the pyscreener software is the pyscreener library, which enables the running of docking software from input preparation all the way to output file parsing. The workhorse class is the Screener ABC, which handles all of this for a user. To actually initialize a screener object, use either of the derived classes: Vina or DOCK. Vina is the Screener class for performing docking simulations using any software derived from AutoDock Vina, and it accepts the software keyword argument to its initializer. Currently, the list of supported Vina-type software is as follows: AutoDock Vina, Smina, QVina2, and PSOVina. DOCK is the Screener class for performing docking using the DOCK software from UCSF. The input preparation pipeline for this software is a little more involved, so we encourage readers to look at the file to see what these additional parameters are.
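The general pattern described above, an abstract base class that drives the pipeline while derived classes supply the software-specific steps, can be illustrated with a toy example. The classes below are deliberately named ToyScreener and ToyVina to make clear that they are not pyscreener's Screener or Vina: every method name is hypothetical, and the "docking" step is faked so that the sketch runs without any docking software. Only the software keyword argument is taken from the description above.

```python
from abc import ABC, abstractmethod


class ToyScreener(ABC):
    """Illustrative stand-in for a screener base class (not pyscreener's)."""

    def screen(self, ligands):
        """Template method: prepare inputs, run docking, parse outputs."""
        scores = {}
        for ligand in ligands:
            prepared = self.prepare(ligand)
            raw_output = self.run(prepared)
            scores[ligand] = self.parse(raw_output)
        return scores

    @abstractmethod
    def prepare(self, ligand):
        """Convert a ligand into the input the docking software expects."""

    @abstractmethod
    def run(self, prepared):
        """Run the docking software and return its raw output."""

    @abstractmethod
    def parse(self, raw_output):
        """Extract a numeric score from the raw output."""


class ToyVina(ToyScreener):
    def __init__(self, software="vina"):
        self.software = software

    def prepare(self, ligand):
        return ligand.strip()

    def run(self, prepared):
        # A real implementation would invoke the docking program here.
        # This fake "score" is just the negated length of the input string.
        return f"{self.software} score: {-float(len(prepared))}"

    def parse(self, raw_output):
        return float(raw_output.rsplit(":", 1)[1])
```

With this structure, ToyVina(software="smina").screen(["CCO"]) walks a ligand through all three steps, and supporting another docking program only requires a new subclass that implements the three abstract methods.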
Copyright (c) 2020, david graff
Project based on the Computational Molecular Science Python Cookiecutter version 1.5.