pharmai / plip

Protein-Ligand Interaction Profiler - Analyze and visualize non-covalent protein-ligand interactions in PDB files according to 📝 Adasme et al. (2021), https://doi.org/10.1093/nar/gkab294

Home Page: http://plip.biotec.tu-dresden.de

conf.NOHYDRO and aromatics

matteoferla opened this issue

Describe the bug
If a structure containing aromatic rings is provided in protonated form, the ring atoms are not perceived as aromatic.
As a result, when a protonated complex with aromatic rings is used (regardless of whether config.NOHYDRO = True is set),
all pi-pi interactions are lost.

To Reproduce
This toy model shows what is going on:

from plip.basic import config
from plip.structure.preparation import PDBComplex
from rdkit import Chem
from rdkit.Chem import AllChem
from openbabel import pybel

# Protonated benzene (explicit hydrogens) as the test molecule
benzene = Chem.AddHs(Chem.MolFromSmiles('c1ccccc1'))

block = Chem.MolToPDBBlock(benzene, flavor=0)
mol = pybel.readstring('pdb', block)
print([a.type for a in mol.atoms])  # ['Car', 'Car', 'Car', 'Car', 'Car', 'Car', 'H', 'H', 'H', 'H', 'H', 'H']

block = Chem.MolToPDBBlock(benzene, flavor=0)
mol = pybel.readstring('pdb', block, {'s': None})
# Read option 's': output single bonds only.
# https://openbabel.org/docs/current/FileFormats/Protein_Data_Bank_format.html
# this option is passed only to readfile not readstring though.
print([a.type for a in mol.atoms])  # ['Car', 'Car', 'Car', 'Car', 'Car', 'Car', 'H', 'H', 'H', 'H', 'H', 'H']

block = Chem.MolToPDBBlock(benzene, flavor=8)
# flavor & 8 : Don't use multiple CONECTs to encode bond order
# flavor 0: CONECT    1    2    2    6    7
# flavor 8: CONECT    1    2    6    7
mol = pybel.readstring('pdb', block, {'s': None})
print([a.type for a in mol.atoms])  # ['C3', 'C3', 'C3', 'C3', 'C3', 'C3', 'H', 'H', 'H', 'H', 'H', 'H']

block = Chem.MolToPDBBlock(benzene, flavor=8)
mol = pybel.readstring('pdb', block, {'s': None})  # 's': single bonds only, see the link above
mol.OBMol.PerceiveBondOrders()
print([a.type for a in mol.atoms])  # ['C2', 'C2', 'C2', 'C2', 'C2', 'C2', 'H', 'H', 'H', 'H', 'H', 'H']

block = Chem.MolToPDBBlock(benzene, flavor=0)
config.NOHYDRO = True  # as in: "don't add hydrogen"
pdb_complex = PDBComplex()
pdb_complex.load_pdb(block, as_string=True)
print([a.type for a in pdb_complex.protcomplex.atoms]) # ['Car', 'Car', 'Car', 'Car', 'Car', 'Car', 'H', 'H', 'H', 'H', 'H', 'H']
print([a.type for a in pdb_complex.ligands[0].mol.atoms])  # ['C2', 'C2', 'C2', 'C2', 'C2', 'C2']

Expected behavior
When Pybel reads a protonated PDB block without the repeated CONECT records that encode bond order (flavor & 8 in RDKit, and with the -s flag for Open Babel), it correctly assumes the benzene ring is formed of six sp3 atoms (each with a proton and a radical). OBMol.PerceiveBondOrders() corrects these to sp2 atoms, but they are still not aromatic. I am not sure what the pybel command to perceive Kekulé bonds is.

I mention this because the extracted ligand has sp2 carbons, even when the parsed protcomplex has aromatic carbons.

Additional context
A simple solution is to strip all hydrogens.
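As a rough illustration of that workaround, here is a minimal sketch that strips hydrogens from a PDB block with plain string handling before it is handed to PLIP. The function name `strip_hydrogens` and the demo helper `_hetatm` are hypothetical, not part of PLIP, RDKit or Open Babel. The sketch assumes the element columns (77-78) are filled in, and it does not update any MASTER record.

```python
def strip_hydrogens(pdb_block: str) -> str:
    """Drop H/D atom records and remove their serials from CONECT records.

    Hypothetical helper; assumes the element symbol is in columns 77-78.
    """
    h_serials = set()
    kept = []
    for line in pdb_block.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            element = line[76:78].strip().upper()
            if element in ("H", "D"):
                h_serials.add(int(line[6:11]))  # serial number, columns 7-11
                continue
        kept.append(line)

    out = []
    for line in kept:
        if line.startswith("CONECT"):
            body = line.rstrip()
            # CONECT serials are fixed-width, five characters each, from column 7
            fields = [body[i:i + 5] for i in range(6, len(body), 5)]
            serials = [int(f) for f in fields if f.strip()]
            if not serials or serials[0] in h_serials:
                continue
            partners = [s for s in serials[1:] if s not in h_serials]
            if not partners:
                continue
            line = "CONECT" + "".join(f"{s:5d}" for s in [serials[0]] + partners)
        out.append(line)
    return "\n".join(out) + "\n"


def _hetatm(serial: int, name: str, element: str) -> str:
    """Build a column-aligned HETATM record for the demo (hypothetical helper)."""
    return ("HETATM" + f"{serial:5d}" + " " + f"{name:<4s}" + " UNL" + "  "
            + f"{1:4d}" + " " * 4 + f"{0.0:8.3f}" * 3
            + f"{1.0:6.2f}{0.0:6.2f}" + " " * 10 + f"{element:>2s}")


# Tiny C-C-H fragment; all coordinates are zero because only the records matter here
demo = "\n".join([
    _hetatm(1, "C1", "C"),
    _hetatm(2, "C2", "C"),
    _hetatm(3, "H1", "H"),
    "CONECT    1    2    3",
    "CONECT    3    1",
    "END",
])
stripped = strip_hydrogens(demo)
print(stripped)
```

Repeated serials inside a CONECT record (the bond-order encoding from flavor=0) are preserved, since only hydrogen serials are filtered out.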
This sounds a bit wasteful, especially since, as far as I know, there is no way to provide a reference SMILES or a reference molecule to assign the bond orders. Is that right?
In my case, my ligands may be strained, and I was hoping the hydrogens would help PerceiveBondOrders get things mostly right.
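For comparison, outside of PLIP, RDKit offers template-based bond-order assignment through AllChem.AssignBondOrdersFromTemplate. The sketch below shows the kind of behaviour I have in mind, assuming a benzene block written with flavor=8 and read back with its hydrogens kept. It says nothing about what PLIP does internally. The variable names and the random seed are arbitrary.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

# Protonated benzene with 3D coordinates, written without bond-order
# information (flavor=8: no repeated CONECT records)
benzene = Chem.AddHs(Chem.MolFromSmiles('c1ccccc1'))
AllChem.EmbedMolecule(benzene, randomSeed=42)  # arbitrary seed for reproducibility
block = Chem.MolToPDBBlock(benzene, flavor=8)

# Read the block back, keeping hydrogens; bonds come from CONECT only,
# so every bond is a single bond at this point
unperceived = Chem.MolFromPDBBlock(block, removeHs=False, proximityBonding=False)

# Reference molecule built from SMILES, used as the bond-order template
template = Chem.MolFromSmiles('c1ccccc1')
fixed = AllChem.AssignBondOrdersFromTemplate(template, unperceived)

# Count aromatic atoms before and after the template is applied
print(sum(a.GetIsAromatic() for a in unperceived.GetAtoms()))
print(sum(a.GetIsAromatic() for a in fixed.GetAtoms()))
```

Something equivalent on the PLIP side, accepting a reference SMILES per ligand, would avoid relying on PerceiveBondOrders alone.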