CSB-KaracaLab / CSB-WS

Complex modeling using ColabFold

This page shows how to model the complex of the phage G3P protein and its E. coli coreceptor TolA using ColabFold.

Biological Investigation:

  • Monomer 1: Attachment protein G3P, Escherichia phage If1 (Bacteriophage If1)

UniProt ID: O80297 https://www.uniprot.org/uniprot/O80297

  • Monomer 2: Cell envelope integrity inner membrane protein TolA, E. coli

UniProt ID: Q8X965 https://www.uniprot.org/uniprot/Q8X965

  • The N1 domain of G3P (residues 17-81) is known to bind the C-terminal (CT) domain of TolA (residues 268-394), so we will use these two domains to model the complex structure. You can read the relevant paper.
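The residue ranges above are 1-based and inclusive, whereas Python slices are 0-based and end-exclusive. A minimal sketch of cutting a domain out of a full-length sequence, assuming the sequence is already available as a plain string. The function name `extract_domain` and the toy sequence are hypothetical and serve only as an illustration:

```python
def extract_domain(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a protein sequence."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError(f"range {start}-{end} is outside 1-{len(sequence)}")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


# Toy 10-residue sequence, NOT a real protein.
toy = "ACDEFGHIKL"
print(extract_domain(toy, 3, 5))  # prints DEF (residues 3, 4 and 5)
```

With the full-length UniProt sequences loaded as strings, the same call with the ranges 17-81 and 268-394 would give the two domains used in this workshop.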

Preparing input fasta files

  • The input should be provided as a single line.
  • The sequences should be separated by a colon (:).

  • Final input sequence in ColabFold format: select the interacting domains from the input sequences and join them into the ColabFold input shown below:

TTDAECLSKPAFDGTLSNVWKEGDSRYANFENCIYELSGIGIGYDNDTSWNGHWTPVRAAD:SGADINNYAGQIKSAIESKFYDASSYAGKTCTLRIKLAPDGMLLDIKPEGGDPALCQAALAAAKLAKIPKPPSQAVYEVFKNAPLDFKPA

  • Please copy and paste the above sequence into the query_sequence field and give the job a name.
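The two formatting rules (a single line, chains separated by a colon) can be sketched in a few lines of Python. This is an illustration under assumptions, not part of ColabFold: the function name `build_colabfold_query` is hypothetical, and restricting the input to the 20 standard one-letter amino acid codes is a choice made here for the example.

```python
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")  # assumption: standard residues only


def build_colabfold_query(*chains: str) -> str:
    """Join chain sequences into one line, separated by colons."""
    cleaned = []
    for chain in chains:
        # Drop whitespace and line breaks left over from FASTA files.
        seq = "".join(chain.split()).upper()
        unknown = set(seq) - VALID_RESIDUES
        if not seq or unknown:
            raise ValueError(f"invalid chain: {chain!r}")
        cleaned.append(seq)
    return ":".join(cleaned)


# Toy sequences, NOT the workshop domains.
print(build_colabfold_query("ACDE\nFGH", "klmn pq"))  # prints ACDEFGH:KLMNPQ
```

The returned string is what would be pasted into the query_sequence field.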

Choosing the AlphaFold version

  • alphafold2_ptm: For monomer modeling

  • alphafold2_multimer_v1: Initial release of AF2-multimer for complex prediction.

  • alphafold2_multimer_v2: Correction of clashes seen in alphafold2_multimer_v1 release. Generally, we prefer to use this version.

  • alphafold2_multimer_v3: Broadens sampling by changing two settings: num-recycle, which specifies the number of recycles to run, and recycle-early-stop-tolerance, which specifies when to stop recycling. This version is recommended for very large or difficult targets, but think twice before using it because of the increased computational time.

  • After choosing the best version for your case (alphafold2_multimer_v2 for this workshop), please run all cells.
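The choice between the monomer and multimer versions follows from the number of chains in the query. A small sketch of that rule, assuming the colon-separated input format described earlier. The function name `suggest_model_type` is hypothetical, and defaulting to alphafold2_multimer_v2 for complexes simply mirrors the preference stated in this workshop:

```python
def suggest_model_type(query_sequence: str) -> str:
    """Suggest an AlphaFold version from the number of chains in the query."""
    # Chains are separated by colons; ignore empty pieces such as a trailing colon.
    n_chains = len([chain for chain in query_sequence.split(":") if chain.strip()])
    if n_chains == 0:
        raise ValueError("empty query")
    # One chain: monomer model. Several chains: the multimer version preferred here.
    return "alphafold2_ptm" if n_chains == 1 else "alphafold2_multimer_v2"


print(suggest_model_type("ACDEFGH"))         # single chain
print(suggest_model_type("ACDEFGH:KLMNPQ"))  # two-chain complex
```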

  • You can follow the progress of your run in the Run Prediction section.

  • When the run is completed, you will see the results in the Display 3D structure section.

  • The results are ranked according to their pLDDT score, which ranges from 0 to 100; a higher score means higher confidence. You can read more about it here.

  • If you are using Chrome, the results are automatically downloaded to your computer as a zip file. Otherwise, you can download the zip file manually from the panel on the left.

  • You can also download the result file from here.
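To make the ranking idea concrete, the sketch below orders models by their mean pLDDT and attaches a confidence label. It is an illustration only: the function names and the per-residue scores are made up, and the thresholds (90, 70, 50) are the bands commonly used for AlphaFold predictions rather than something ColabFold reports in this form.

```python
from statistics import mean


def confidence_band(plddt: float) -> str:
    """Map a pLDDT value (0-100) to a commonly used confidence label."""
    if plddt > 90:
        return "very high"
    if plddt > 70:
        return "confident"
    if plddt > 50:
        return "low"
    return "very low"


def rank_by_mean_plddt(models: dict) -> list:
    """Return (model name, mean pLDDT) pairs, best model first."""
    scored = [(name, mean(values)) for name, values in models.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Made-up per-residue scores, for illustration only.
example = {
    "model_b": [62.0, 70.0, 66.0],
    "model_a": [91.0, 88.0, 94.0],
}
for name, score in rank_by_mean_plddt(example):
    print(name, round(score, 1), confidence_band(score))
```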

  • Additional ColabFold tips:

The sequence length that can be modelled in ColabFold depends on the GPU you are assigned. To check which GPU you got, open a new code cell and type:

!nvidia-smi

and run the cell.

For a Tesla T4 or Tesla P100 with ~16 GB of memory, the maximum sequence length is ~1400.

For a Tesla K80 with ~12 GB of memory, the maximum sequence length is ~1000.
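The limits above can be turned into a quick check before starting a run. The sketch below is a rough guide under two assumptions: that the limit applies to the combined length of all chains in the query, and that the approximate numbers quoted above hold for your session. The names `APPROX_MAX_LENGTH` and `fits_on_gpu` are hypothetical.

```python
# Approximate limits quoted in the tips above (not exact values).
APPROX_MAX_LENGTH = {
    "Tesla T4": 1400,
    "Tesla P100": 1400,
    "Tesla K80": 1000,
}


def fits_on_gpu(query_sequence: str, gpu_name: str) -> bool:
    """Compare the combined chain length with the approximate GPU limit."""
    # Assumption: all chains of the complex count towards the limit.
    total_length = sum(len(chain) for chain in query_sequence.split(":"))
    return total_length <= APPROX_MAX_LENGTH[gpu_name]


# Toy two-chain query of 13 residues in total.
print(fits_on_gpu("ACDEFGH:KLMNPQ", "Tesla K80"))  # prints True
```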

Comparing the generated models with the experimental structure using PyMOL

  • Name of the example protein structure: Crystal structure of G3P from phage If1 in complex with its coreceptor, the C-terminal domain of TolA
  • PDB ID of the TolA:G3P complex structure: 2X9A

  • Download PyMOL with a 15-day trial license, or request a free academic license!

  • Open both the crystal structure and the generated models in PyMOL.

  • Let's go deeper in PyMOL!

  • Useful PyMOL tips:

To show the structures as cartoon:

as cartoon

To align one structure to another:

align model_name, target_name

To color the structure by the B-factor column, which holds the pLDDT scores, in a scheme similar to the AF2 colors:

spectrum b, rainbow_rev, model_name

Alternatively, you can color the obtained models according to their pLDDT scores using this GitHub repo.
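Because the pLDDT scores are stored in the B-factor column of the model files, they can also be read outside PyMOL. The sketch below averages the C-alpha B-factors per chain from PDB-format text. It assumes the standard fixed-width PDB layout (atom name in columns 13-16, chain ID in column 22, B-factor in columns 61-66); the function name `mean_plddt_per_chain` and the file name in the comment are hypothetical.

```python
def mean_plddt_per_chain(pdb_text: str) -> dict:
    """Average the B-factor column of C-alpha atoms for each chain."""
    values = {}
    for line in pdb_text.splitlines():
        # Fixed-width PDB columns: atom name 13-16, chain 22, B-factor 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain = line[21]
            values.setdefault(chain, []).append(float(line[60:66]))
    return {chain: sum(v) / len(v) for chain, v in values.items()}


# Example use with a hypothetical file name:
# with open("model_rank_1.pdb") as handle:
#     print(mean_plddt_per_chain(handle.read()))
```

Using one value per residue (the C-alpha atom) avoids weighting large side chains more heavily than small ones.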
