RosettaCommons / RoseTTAFold

This package contains deep learning models and related scripts for RoseTTAFold

Using a GPU-capable version of sequence alignment (e.g. comer2) instead of hhblits

ragunyrasta opened this issue

Most of my runtime is spent in hhblits, which runs on the CPU rather than the GPU. Has anyone tried using a GPU-capable equivalent (such as comer2) for this? Any thoughts or comments are appreciated.

Hi Ragunyrasta, I have a similar problem. Have you found an answer? I am curious to know. Thanks! David.

Unfortunately, the HHBlits developers have told me that they lost funding to develop a GPU-capable version.
So I'm about to find out more about Comer2 and try it. In the meantime, here's what I've found:

  1. HHBlits is memory-bound: the CPU is mostly idle while waiting on memory. So an easy fix may be to make sure your computer has the maximum amount of RAM it can take (64GB on standard motherboards, more on the expensive ones).
  2. Given that it's memory-bound, unless comer2 uses a fundamentally different algorithm, I don't expect it to help much. Graphics cards have much less memory than the host system, and their memory is soldered on, so it is impossible to upgrade. I hope I'm wrong.

Glad that it helped. One thing to try: set the number of CPUs to a valid value in your run*.py script. While it runs hhblits, run 'top' in your terminal and note how idle the CPU is. Kill the job, then increase or decrease the number of CPUs given to the script and repeat. Try to find the efficient frontier (i.e. the setting where the CPU is maximally busy). That is likely the optimal number of CPUs for running hhblits as quickly as possible on your system.
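
If you'd rather not watch 'top' by hand, the same sweep can be roughly automated. The sketch below is only an illustration and makes a few assumptions: it compares wall-clock time per CPU count instead of reading CPU idleness from 'top', the helper names (time_command, sweep, best_cpu_count) are made up for this example, and the file and database paths in the usage comment are placeholders. It relies on hhblits taking its thread count via -cpu.

```python
import shlex
import subprocess
import time


def time_command(argv):
    """Run argv to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def sweep(template, cpu_counts, runner=time_command):
    """Time the command once per CPU count.

    template is a command line containing the placeholder '{cpus}'.
    Returns a dict mapping each CPU count to its measured seconds.
    """
    return {n: runner(shlex.split(template.format(cpus=n))) for n in cpu_counts}


def best_cpu_count(timings):
    """Return the CPU count with the shortest time (smallest count wins ties)."""
    return min(sorted(timings), key=lambda n: timings[n])


# Example usage (paths are placeholders, adjust to your setup):
# timings = sweep(
#     "hhblits -i query.fasta -d /path/to/db -oa3m out.a3m -cpu {cpus}",
#     [2, 4, 8, 16],
# )
# print(timings, "-> fastest with", best_cpu_count(timings), "CPUs")
```

Keep in mind that the first run may be slower than later ones because the database has not yet been read into the OS file cache, so it can be worth repeating the sweep before trusting the numbers.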

@ragunyrasta How long does it take you to run RoseTTAFold to predict a protein of 300 or 400 amino acids? And why does it (just the e2e shell script) take me up to three or four hours on an A100 card instead of a few minutes as reported? This is really confusing to me, because I need to analyze a lot of proteins.