Automation of AMOEBA Polarizable Force Field for Small Molecules - Poltype 2

Objective

Given an input chemical structure, all parameters can be automatically assigned from a database or derived via fitting to ab initio data generated on the fly.

Please Cite

Walker, B., Liu, C., Wait, E., Ren, P., J. Comput. Chem. 2022, 1. https://doi.org/10.1002/jcc.26954

Wu JC, Chattree G, Ren P. Automation of AMOEBA polarizable force field parameterization for small molecules. Theor Chem Acc. 2012 Feb 26;131(3):1138. doi: 10.1007/s00214-012-1138-6. PMID: 22505837; PMCID: PMC3322661.

License

Tinker License

📚 Documentation Overview

Please read 👇🙏

Features

Parameterization Input Features
- Automated total charge assignment
- Dominant ionization state enumeration (pH=7) / tautomer enumeration
- Smart memory resource defaults for QM jobs
Parameterization Features
- Molecule fragmenter to speed up QM calculations
- Parallelized job submission for QM jobs
- Psi4/Gaussian quantum package support
- QM dimer data generation for vdW fitting
- Torsion-Torsion coupling
- Expanded torsion database
Host-guest Modelling Features
- Docking with GOLD, AutoDock4, AutoDock Vina, Vinardo
- Protein-ligand interaction profiler and visualization with BINANA and ProLif
- Missing residues & loop modeling with Modeller
- Protonation assignment via propka/pdb2pqr
- Tinker box set up
- BAR Free energy estimation file setup

💻 Program Installation

Automated AMOEBA Ligand Parameterization (Main function of Poltype)

Parameterization Input Preparation

Ligand Protonation State Generation

Minimum Example Usage Parameterization

Recommended Poltype inputs

Default Resource Consumption

💻 Advanced Program Usage

Parameterization Output Files

Parameterization Sanity Checks

Parameterization Examples

Automated AMOEBA Ligand Parameterization How It Works

Automated AMOEBA Molecular Dynamics and Free Energy Prediction (addon features)

Reference 1

Protein Ligand Interaction Visualization

Protein Input Preparation

Minimum Input Example Docking

Molecular Dynamics Input Preparation

Minimum Input Example Binding Free Energy

Minimum Input Example Solvation Free Energy

Minimum Input Example Neat Liquid Simulation

Molecular Dynamics Sanity Checks

Free Energy Output Files

Binding Free Energy Table Output Example
Hydration Free Energy Table Output Example

Hydration Free Energy Examples

Binding Free Energy Examples

💻 Advanced Program Usage

Automated AMOEBA Molecular Dynamics and Free Energy Prediction How It Works

Parameterization Input Preparation

The input structure can be given in any file type that carries bond order information (SDF, MOL, MOL2, etc.).
Cartesian XYZ, Tinker XYZ, and PDB inputs are not recommended: they are converted to an SDF file and bond orders are guessed from interatomic distances.
If a 2D structure is given, Poltype will generate 3D coordinates for you.
Total charge is determined by summing the formal charge of each atom.
The formal charge of each atom is assigned from its element, its number of bonds, and the bond orders of those bonds.
Optional keywords exist to add missing hydrogens.
If carbon or nitrogen atoms are negatively charged, Poltype assumes you meant to have hydrogens on those atoms and protonates them to neutral charge. For all other elements, if a formal charge exists (from the input bond orders and the element's valence electrons), Poltype detects it and computes the total charge as the sum of the formal atom charges. A warning message is printed when hydrogen atoms are added.
Special radical charge states require additional information in the input file specifying which atom is a radical.
If you do not want charged atoms to be protonated, use addhydrogentocharged=False
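
The charge bookkeeping described above can be illustrated with a minimal sketch. It is not Poltype's implementation: it assumes every atom has a complete octet (a duet for hydrogen), in which case the formal charge reduces to the sum of the atom's bond orders minus the number of electrons it needs to fill its shell. The function names and the atom representation are hypothetical.

```python
# Valence electrons for a few common elements (periodic-table group values).
VALENCE_ELECTRONS = {"H": 1, "B": 3, "C": 4, "N": 5, "O": 6, "F": 7, "Cl": 7}


def formal_charge(element, bond_orders):
    """Formal charge of one atom, assuming a complete octet (duet for H)."""
    shell = 2 if element == "H" else 8
    electrons_needed = shell - VALENCE_ELECTRONS[element]
    return int(sum(bond_orders)) - electrons_needed


def total_charge(atoms):
    """Sum formal charges over (element, bond_orders) pairs."""
    return sum(formal_charge(element, orders) for element, orders in atoms)


# Methylammonium, CH3-NH3+: carbon and nitrogen each have four single bonds.
methylammonium = [("C", [1, 1, 1, 1]), ("N", [1, 1, 1, 1])] + [("H", [1])] * 6
# Acetate, CH3-COO-: one oxygen is double bonded, the other has one single bond.
acetate = ([("C", [1, 1, 1, 1]), ("C", [1, 2, 1]), ("O", [2]), ("O", [1])]
           + [("H", [1])] * 3)

print(total_charge(methylammonium), total_charge(acetate))
```

Atoms that do not satisfy the octet rule (carbocations, radicals, neutral boranes) fall outside this simplification, which is consistent with the note above that radical states need extra input.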

Ligand Protonation State Generation

Dominant ionization states at pH 7 are enumerated and SDF files are generated via Dimorphite-DL (IonizationState_0.sdf, IonizationState_1.sdf, ...).
Tautomer states are enumerated via RDKit; the first tautomer, TautomerState_0.sdf, is the canonical tautomer.
Use genprotstatesonly to quit the program after generating the dominant ionization states at pH 7 and the tautomers.

Minimum Example Usage Parameterization

All input arguments are specified in the poltype.ini file:

structure=methylamine.sdf

Navigate to the directory containing poltype.ini and the .sdf file, and run:

nohup python /path_to_poltype/poltype.py &

final.xyz and final.key are the resulting structure and parameter files you will need.

After poltype finishes, check the OPENME folder for torsion fitting and ESP fitting results.
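
For scripted runs, the same steps can be driven from Python. The sketch below writes a poltype.ini from a dictionary of keywords and builds the launch command. The helper names are hypothetical, /path_to_poltype is the same placeholder used above, and nothing is started until launch is called.

```python
import subprocess
from pathlib import Path


def write_poltype_ini(directory, keywords):
    """Write one keyword=value pair per line to poltype.ini in `directory`."""
    ini_path = Path(directory) / "poltype.ini"
    lines = [f"{key}={value}" for key, value in keywords.items()]
    ini_path.write_text("\n".join(lines) + "\n")
    return ini_path


def poltype_command(poltype_dir="/path_to_poltype"):
    """Command line equivalent to `python /path_to_poltype/poltype.py`."""
    return ["python", str(Path(poltype_dir) / "poltype.py")]


def launch(directory, poltype_dir="/path_to_poltype"):
    """Start Poltype in `directory` without waiting for it to finish."""
    return subprocess.Popen(poltype_command(poltype_dir), cwd=directory)
```

A typical call would be write_poltype_ini(".", {"structure": "methylamine.sdf"}) followed by launch("."), with the structure file already present in that directory.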

Recommended Poltype inputs

By default, Poltype uses the old GDMA algorithm with non-bonded parameters mostly from amoeba09.

To use the new GDMA algorithm and non-bonded parameters adjusted to reproduce hydration free energies:

new_gdma=True
gdmacommand_Radius_S=0.80
prmmodfile=dma4_hfe2023

Adjustments include multipole and vdW scaling.

To use the new GDMA algorithm with default GDMA radius, no scaling of multipole or vdW parameters:

new_gdma=True

Default Resource Consumption

By default, Poltype sets the number of fragment Poltype jobs (or QM jobs, if the fragmenter is not being used) to run in parallel to the floor of the input number of cores divided by the number of cores per job (default of 2).
RAM, cores, and disk space can all be detected, and a consumption ratio of 80% is used by default. Fragment RAM, disk, and cores are divided evenly among the Poltype jobs running in parallel.
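
The arithmetic above can be sketched as follows. This is an illustration only: the function name and the returned dictionary are hypothetical, and applying the 80% consumption ratio to the detected totals before the floor division is an assumption about the order of operations.

```python
import math


def plan_parallel_jobs(total_cores, total_ram_gb, total_disk_gb,
                       cores_per_job=2, consumption_ratio=0.8):
    """Split usable resources evenly across parallel jobs."""
    # Assumption: the consumption ratio is applied to the detected totals first.
    usable_cores = math.floor(total_cores * consumption_ratio)
    jobs = usable_cores // cores_per_job  # floor(cores / cores per job)
    if jobs < 1:
        raise ValueError("not enough cores for a single job")
    return {
        "jobs": jobs,
        "cores_per_job": usable_cores // jobs,
        "ram_gb_per_job": total_ram_gb * consumption_ratio / jobs,
        "disk_gb_per_job": total_disk_gb * consumption_ratio / jobs,
    }


plan = plan_parallel_jobs(total_cores=16, total_ram_gb=64, total_disk_gb=500)
print(plan)
```

Under these assumptions, a 16-core machine has 12 usable cores and runs 6 jobs in parallel with 2 cores each.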

Parameterization Sanity Checks

MM = Molecular Mechanics (AMOEBA model), QM = Quantum Mechanics
Checks for 2D coordinates and generates 3D coordinates at the beginning of the program
Checks that the refined multipoles let MM model the QM potential grid well and raises an error if not
Checks that the QM and MM dipoles are very similar and raises an error if not
Checks that the final minimized MM structure is similar to the geometry-optimized QM structure and raises an error if not
Checks for any missing van der Waals parameters at the end of the program and raises an error
Checks for any missing multipole parameters at the end of the program and raises an error
Checks for any zeroed-out torsion parameters in the final key file, and checks whether the fragmenter is transferring torsions properly

Protein Ligand Interaction Visualization

Navigate to the folder VisualizationNotebooks
Install the conda environment from notebookenvironment.yml: conda env create -f notebookenvironment.yml
Activate the conda environment: conda activate pymolenv
Move the protein-ligand complexed PDB to the VisualizationNotebooks folder.
Ensure that the ligand residues are labeled LIG.
Launch the Jupyter notebook: jupyter-notebook Protein-Ligand-Interactions.ipynb
Enter the name of your complexed PDB in the variable ligandreceptorfilename
Run the cell; the output from the BINANA interaction profiler and ProLif is shown.

Protein Input Preparation

If your protein PDB has missing residues, Poltype wraps Modeller to fill them in for you (experimental). Use the keyword pdbcode: Poltype downloads the PDB, calls Modeller, and quits after the PDB has been filled in and optimized with Modeller.
An academic license is required for Modeller. After installing it with conda install -c salilab modeller, the screen will tell you which file to insert your license key in.
Afterwards, check the output PDB.

pdbcode=5l2t

Remove the keyword pdbcode from the Poltype input file if you wish to perform further computations.
Protonation state assignment and adding the ligand to the protein pocket are the next steps toward binding simulations.
The usepdb2pqr keyword can be used with uncomplexedproteinpdbname to estimate pKa values of titratable residues via propka and then protonate the PDB for you. The output PDB will have the extension "_final.pdb".
pdb2pqr can be installed via conda install -c conda-forge pdb2pqr, or use the yaml file.

uncomplexedproteinpdbname=5l2t_filled.BL00020001.pdb

After adding any missing residues and assigning the protonation state, the ligand needs to be added to the protein pocket and given as input complexedproteinpdbname for binding computations.
If your protonated uncomplexedproteinpdbname no longer contains ligands, you can add them back with the alignment procedure in Poltype to generate a new complexedproteinpdbname as follows.

complexedproteinpdbname=pdb3srd.pdb
uncomplexedproteinpdbname=3srd_filled.BL00020001_final.pdb

The output file name will contain _align.pdb. Poltype aligns complexedproteinpdbname to uncomplexedproteinpdbname and then adds the ligands.

Minimum Input Example Docking

If you want to use GOLD, ensure that its bin folder is in your PATH.
A protein complexed with the ligand needs to be provided (it should already be protonated before being given as input).
For generating PDBQT files with AutoDock4 and AutoDock Vina, a separate Python 2 environment (dockingprep.yml) needs to be installed.
Optional keywords exist to change the docking grid center (default is the center of the ligand in the input protein-ligand complex), the docking grid length (how far the grid extends), the grid spacing (controls point density on the grid), and the number of poses generated (default 10).
Final scores, structures, and rankings are given in a file called DockingReport.txt

complexedproteinpdbname=complex.pdb
listofligands=lig1.sdf,lig2.sdf
usead4=True
usevina=True
usevinardo=True
usegold=True

Molecular Dynamics Input Preparation

If Tinker9 executables are in PATH, the program switches to the GPU executables analyze9, dynamic9, and minimize9.
Make a folder separate from the one where the Poltype parameterization files were made (with a new poltype.ini file too).
Ligand XYZ and key files are required (such as final.xyz and final.key from Poltype parameterization).
Besides the ligandxyzfilenamelist for all ligands, there is also annihilateligandxyzfilenamelist which tells the program which ligand types to annihilate. By default, annihilateligandxyzfilenamelist=ligandxyzfilenamelist.
If there are duplicate molecules (same parameters and XYZ), make copies of each XYZ and key file and provide them as inputs. The program will then change the type numbers so that none of the individual molecules have overlapping type numbers. This way you can choose to disappear one or more of the ligands with the same parameters (but different types). If there are duplicate ligands in your system, make sure the order of ligandxyzfilenamelist and annihilateligandxyzfilenamelist matches the order in which the ligands occur in the input PDB file. This lets the program determine which of the duplicates in annihilateligandxyzfilenamelist to disappear and distinguish them from other duplicates in ligandxyzfilenamelist that are not in annihilateligandxyzfilenamelist.
Waters and ions within 8 angstroms of any atom in the ligand-protein PDB are automatically detected and added to the protein-ligand XYZ file.
For binding free energy computations, either a host PDB or a premade Tinker XYZ is required.
Inputs are inside poltype.ini
Make sure pdb files (complexed and uncomplexed) have no missing residues or atoms.
Charge is read from input XYZ files generated.
If using custom receptor parameters, make sure to add them either to keyfilenamelist or to prmfilepath.
Use submitlocally=False if you do not wish to submit dynamics jobs locally. This is already the default for production dynamics and for BAR; for minimization and equilibration the default is True and jobs are submitted locally. When jobs are not submitted locally, the program waits for you to complete the jobs listed in a text file (such as _proddynamicsjobs.txt).
For HFE, if your ligand is charged and you want to compute the salt hydration free energy, add "salthfe=True"
Some files with .out extensions are used to determine whether dynamics has already been done, to prevent rerunning dynamics each time the program is called. Tinker key files are always recreated every time the program is rerun.
Minimized and equilibrated PDB (ligand complexed with host) and XYZ structures for viewing in PyMOL have _pymol in their names. PDB trajectory files are also generated for the complexed host-ligand (if the input complexedproteinpdbname is given) from the equilibrated ARC and the production dynamics ARC with full electrostatics and vdW interactions. The PDB trajectory files are in the same location as the ARC file.
If your dynamics jobs are killed prematurely, then upon rerunning, Poltype reads the number of steps in your output dynamics files and determines how many more steps are needed to finish dynamics.
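
The 8 angstrom selection of waters and ions mentioned above can be sketched as a simple distance filter. This is a brute-force illustration with hypothetical names and data layout, not Poltype's code, and treating the cutoff as inclusive is an assumption.

```python
import math


def molecules_near_complex(complex_coords, molecules, cutoff=8.0):
    """Names of molecules with any atom within `cutoff` of any complex atom.

    complex_coords: list of (x, y, z) tuples for the ligand-protein atoms.
    molecules: dict mapping a water or ion label to its list of (x, y, z).
    """
    kept = []
    for name, atom_coords in molecules.items():
        if any(math.dist(atom, ref) <= cutoff
               for atom in atom_coords
               for ref in complex_coords):
            kept.append(name)
    return kept
```

For a real protein the double loop is slow; a spatial grid or k-d tree would be the usual replacement, but the selection rule stays the same.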

Minimum Input Example Binding Free Energy

complexedproteinpdbname=complex.pdb 
keyfilenamelist=Mol1.key , Mol2.key
ligandxyzfilenamelist=Mol1.xyz , Mol2.xyz

receptorligandxyzfilename=complex.xyz
prmfilepath=prmfile # with absolute path for receptor
keyfilenamelist=complex.key 
ligandxyzfilenamelist=ligand.xyz

Minimum Input Example Solvation Free Energy

keyfilenamelist=Mol.key
ligandxyzfilenamelist=Mol.xyz

Minimum Input Example Neat Liquid Simulation

density=997
keyfilenamelist=Mol.key
ligandxyzfilenamelist=Mol.xyz
equilibriatescheme=50,100,150,200,300,300

Navigate to directory containing poltype.ini, and run:

nohup python /path_to_poltype/poltype.py &

Molecular Dynamics Sanity Checks

Check for missing parameters
Check that all boxes used for binding free energy alchemical perturbation have a net total charge of zero

Free Energy Output Files

Binding Free Energy Table Output Example

ΔGˢᵒˡᵛ = Change in solvation free energy (does not contain gas phase component by default)
ΔGˢᵒˡᵛᵉʳʳ = Change in solvation free energy error
ΔGᶜᵒᵐᵖᶜᵒʳʳ = Change in complexation free energy with analytical correction
ΔGᶜᵒᵐᵖᵘⁿᶜᵒʳʳ = Change in complexation free energy without analytical correction
ΔGᶜᵒᵐᵖᶜᵒʳʳᵉʳʳ = Change in complexation free energy error with analytical correction
ΔGᵃⁿᵃᶜᵒᵐᵖᶜᵒʳʳ = Analytical correction to complexation free energy
ΔGᵇᶦⁿᵈᶜᵒʳʳ = Change in binding free energy corrected
ΔGᵇᶦⁿᵈᶜᵒʳʳᵉʳʳ = Change in binding free energy corrected error
ΔGˢᵒˡᵛᵉˡᵉ = Change in solvation free energy electrostatic component
ΔGˢᵒˡᵛᵛᵈʷ = Change in solvation free energy van der Waals component
ΔGᶜᵒᵐᵖᵉˡᵉ = Change in complexation free energy electrostatic component
ΔGᶜᵒᵐᵖᵛᵈʷ = Change in complexation free energy van der Waals component
ΔHˢᵒˡᵛ = Change in solvation enthalpy
ΔHˢᵒˡᵛᵉʳʳ = Change in solvation enthalpy error
ΔSˢᵒˡᵛ = Change in solvation entropy
ΔSˢᵒˡᵛᵉʳʳ = Change in solvation entropy error
ΔHᶜᵒᵐᵖ = Change in complexation enthalpy
ΔHᶜᵒᵐᵖᵉʳʳ = Change in complexation enthalpy error
ΔSᶜᵒᵐᵖ = Change in complexation entropy
ΔSᶜᵒᵐᵖᵉʳʳ = Change in complexation entropy error
ΔHᵇᶦⁿᵈ = Change in binding enthalpy
ΔHᵇᶦⁿᵈᵉʳʳ = Change in binding enthalpy error
ΔSᵇᶦⁿᵈ = Change in binding entropy
ΔSᵇᶦⁿᵈᵉʳʳ = Change in binding entropy error

Gibbs free energy table
Gibbs_Free_Energy_Change_Table.csv

Enthalpy, Entropy and Gibbs free energy table
Enthalpy,_Entropy,_Gibbs_Energy_Change_Table.csv

Generic simulation information
Solvation_Simulation_Info_Table.csv

Generic simulation information
Complexation_Simulation_Info_Table.csv

Hydration Free Energy Table Output Example

ΔGˢᵒˡᵛ = Change in solvation free energy
ΔGˢᵒˡᵛᵉʳʳ = Change in solvation free energy error
ΔHˢᵒˡᵛ = Change in solvation enthalpy
ΔHˢᵒˡᵛᵉʳʳ = Change in solvation enthalpy error
ΔSˢᵒˡᵛ = Change in solvation entropy
ΔSˢᵒˡᵛᵉʳʳ = Change in solvation entropy error
ΔGˢᵒˡᵛᵉˡᵉ = Change in solvation free energy electrostatic component
ΔGˢᵒˡᵛᵛᵈʷ = Change in solvation free energy van der Waals component
ΔGᵉˡᵉᵍᵃˢ = Change in solvation free energy electrostatic component gas phase
ΔGᵉˡᵉˢᵒˡ = Change in solvation free energy electrostatic component solution phase
ΔGᵛᵈʷᵍᵃˢ = Change in solvation free energy van der Waals component gas phase
ΔGᵛᵈʷˢᵒˡ = Change in solvation free energy van der Waals component solution phase
ΔGˢᵒˡ = Change in solvation free energy solution phase
ΔGᵍᵃˢ = Change in solvation free energy gas phase
ΔGˢᵒˡᵛᶠʷᵈ = Change in forward solvation free energy (A->B)
ΔGˢᵒˡᵛᵇʷᵈ = Change in backward solvation free energy (B->A)
SolvOverlap = Solvation overlap between neighboring states, value 0-1

Gibbs free energy table
Gibbs_Free_Energy_Change_Table.csv

Enthalpy, Entropy and Gibbs free energy table
Enthalpy,_Entropy,_Gibbs_Energy_Change_Table.csv

Generic simulation information
Solvation_Simulation_Info_Table.csv

Individual BAR step free energy computations
Useful for troubleshooting if free energy is incorrect
CompBARResults.csv,SolvBARResults.csv

Citations

Molecular dynamics. Florent Barbault. (2010, September 10). Retrieved November 8, 2021, from https://florentbarbault.wordpress.com/research/mova-2/.
