conf.NOHYDRO and aromatics
matteoferla opened this issue · comments
Describe the bug
If a structure with aromatic rings is provided in protonated form the atoms will not be aromatic.
As result when an aromatic protonated complex is used (regardless of config.NOHYDRO = True
being set),
all pi-pi interactions are lost.
To Reproduce
This toy model shows what is going on:
from plip.basic import config
from plip.structure.preparation import PDBComplex
from rdkit import Chem
from rdkit.Chem import AllChem
from openbabel import pybel
benzene = AllChem.AddHs( Chem.MolFromSmiles('c1ccccc1') )
block = Chem.MolToPDBBlock(benzene, flavor=0)
mol = pybel.readstring('pdb', block)
print([a.type for a in mol.atoms]) # ['Car', 'Car', 'Car', 'Car', 'Car', 'Car', 'H', 'H', 'H', 'H', 'H', 'H']
block = Chem.MolToPDBBlock(benzene, flavor=0)
mol = pybel.readstring('pdb', block, {'s': None})
# s Output single bonds only.
# https://openbabel.org/docs/current/FileFormats/Protein_Data_Bank_format.html
# this option is passed only to readfile not readstring though.
print([a.type for a in mol.atoms]) # ['Car', 'Car', 'Car', 'Car', 'Car', 'Car', 'H', 'H', 'H', 'H', 'H', 'H']
block = Chem.MolToPDBBlock(benzene, flavor=8)
# flavor & 8 : Don't use multiple CONECTs to encode bond order
# flavor 0: CONECT 1 2 2 6 7
# flavor 8: CONECT 1 2 6 7
mol = pybel.readstring('pdb', block, {'s': None})
print([a.type for a in mol.atoms]) # ['C3', 'C3', 'C3', 'C3', 'C3', 'C3', 'H', 'H', 'H', 'H', 'H', 'H']
obc = pybel.ob.OBConversion()
obc.SetInFormat('pdb')
block = Chem.MolToPDBBlock(benzene, flavor=8)
mol = pybel.readstring('pdb', block, {'s': None}) # s Output single bonds only. https://openbabel.org/docs/current/FileFormats/Protein_Data_Bank_format.html
mol.OBMol.PerceiveBondOrders()
print([a.type for a in mol.atoms]) # ['C2', 'C2', 'C2', 'C2', 'C2', 'C2', 'H', 'H', 'H', 'H', 'H', 'H']
block = Chem.MolToPDBBlock(benzene, flavor=0)
config.NOHYDRO = True # as in: "don't add hydrogen"
pdb_complex = PDBComplex()
pdb_complex.load_pdb(block, as_string=True)
print([a.type for a in pdb_complex.protcomplex.atoms]) # ['Car', 'Car', 'Car', 'Car', 'Car', 'Car', 'H', 'H', 'H', 'H', 'H', 'H']
print([a.type for a in pdb_complex.ligands[0].mol.atoms]) # ['C2', 'C2', 'C2', 'C2', 'C2', 'C2']
Expected behavior
Pybel when reading a protonated PDB block with repeated CONECT for bond order (flavor &8 in RDKit and with the -s flag for OpenBabel, will correctly assume the benzene ring is formed of 6x sp3 atoms (w/ a proton and a radical), this is corrected to non-aromatic sp2 atoms w/ OBMol.PerceiveBondOrders()
but these are not aromatic. I am not sure what the pybel command to perceive Kekule bonds is.
I mention this because the extracted ligand has sp2 carbons, even when the parsed protcomplex has aromatic carbons.
Additional context
A simple solution is strip all hydrogens.
This sounds a bit wasteful especially as far as I know there's not a way to provide a reference SMILES or a reference Molecule to assign the bond order —right?
In my case, my ligands may be strained and was hoping the hydrogens would help PerceiveBondOrders
get things mostly right.