ContactCounter

Default values for optional arguments

-cutoff: activesite cutoff distance = 5 Angstroms

-heavy_atoms: number of heavy atoms to be considered a ligand = 10

--download_pdbs, -d: download PDBs from internet = False

--keep_pdbs: keep PDB files after download = False

--keep_clean_pdbs: keep cleaned-up versions of the PDB files after use = False

--ignore_glycosylated_proteins, -i: when determining a ligand, ignore covalenlty bound residues = False

Example ways to run this program

Use default arguments and no downloading of PDBs

./program_name pdb_list

Use default arguments and download PDBs

./program_name pdb_list --download_pdbs/-d

Don't download PDBs but change activesite cutoff distance to 7 Ang and heavy atom cutoff to 13

./program_name pdb_list -cutoff=7 -heavy_atoms=13

Download and keep PDBs using default values

./program_name pdb_list -d --keep_pdbs

Download and keep both downloaded and cleaned-up PDBs using default values

./program_name pdb_list -d --keep_pdbs --keep_clean_pdbs

Function overview

split_pdb_file

self.protein_lines : an instance of PDB_line for each line in the PDB starting with ATOM self.ligand_lines : an instance of PDB_line for each line in the PDB starting with HETATM, unless it is a water residue, metal residue, amino acid residue, or an unknown residue self.protein : a dictionary where each key is a unique protein name (resname_reschain_resnum) and the values are the PDB_line instances of each corresponding line from the PDB self.ligand : a dictionary where each key is a unique ligand name (resname_reschain_resnum) and the values are the PDB_line instances of each corresponding line from the PDB

get_ligand_residues

self.ligand_dict : a dictionary where each key is a unqiue ligand name (resname_reschain_resnum) that passes the user specified heavy atom cutoff, and the values are the PDB_line instances of each corresponding line from the PDB self.lig_res_names : a list of the 3-letter names for each ligand, where repeats are allowed self.uniq_lig_res_names : a list of the 3-letter names for each ligand, where no repeats are allowed - this is to be used to count the number of each ligand residue type in the protein. Essentially, this is the result of self.lig_res_names.count( "name" ) self.num_ligand_residues : the number of ligand residues that satisfied the heavy atom cutoff in the protein self.num_ligand_atoms : the number of atoms in each ligand residue from self.num_ligand_residues

get_activesite

self.activesite_dict : a dictionary where each key is a unique protein name (resname_reschain_resnum) and each value is a list of the corresponding PDB lines self.activesite_lig_pro_res_dict : a dictionary where each key is a unique ligand name (resname_reschain_resnum) and each value is a list of the 3-letter name of each protein residue within the specified cutoff distance self.activesite_lig_pro_atms_dict : a dictionary where each key is a unique ligand name (resname_reschain_resnum) and each value is a list of the PDB lines of each protein residue within the specified cutoff distance self.num_activesite_res : the number of residues within the user cutoff's activesite for each ligand residue self.num_activesite_atms : the number of atoms of each residue within the user cutoff's activesite for each ligand residue self.activesite_residues : the unique name (resname_reschain_resnum) of each protein residue within the user cutoff's activesite

mlnance / Contact_Counter