BenevolentAI / DeeplyTough

DeeplyTough: Learning Structural Comparison of Protein Binding Sites

Vertex dataset lost some pocket PDB files

mylRalph opened this issue

Hi, glad to see your excellent work!
I followed your code to preprocess TOUGH-M1 and Vertex for training and evaluation with --db_preprocessing set to 0, trying to reproduce the same splits used in your work. However, I encountered some problems, listed below:

  1. I can't find the corresponding pocket PDB files at several pocket paths, e.g. DeeplyTough/STRUCTURE_DATA_DIR/Vertex/4cmt/4cmt_site_2.pdb, DeeplyTough/STRUCTURE_DATA_DIR/Vertex/4a9t/4a9t_site_2.pdb, and DeeplyTough/STRUCTURE_DATA_DIR/Vertex/4anu/4anu_site_2.pdb, etc. This results in 23,380 invalid pocket pairs, so the total doesn't match the number in the paper (1,461,668 positive and 102,935 negative pocket pairs, 1,564,603 pairs in total).
  2. After filtering TOUGH-M1 for the Vertex evaluation with --db_exclude_vertex set to 'seqclust', I got 6580 structures left for training, which doesn't match your result of 6548 structures. However, the number of pocket pairs constructed from these 6580 TOUGH-M1 structures, 710,009 pairs, exactly matches your result, which confuses me.
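
For reference, the missing files in point 1 could be enumerated with a short check along these lines. This is only a minimal sketch and not part of DeeplyTough: the function names are hypothetical, and it assumes the Vertex pairs are available as a list of `(pocket_a, pocket_b)` paths relative to the Vertex data directory, which would have to be built from however the pairs are actually loaded.

```python
from pathlib import Path


def find_missing_pockets(pairs, root):
    """Return the sorted pocket paths referenced by `pairs` that are absent under `root`.

    `pairs` is a list of (pocket_a, pocket_b) relative paths. This input
    format is an assumption for illustration, not DeeplyTough's own structure.
    """
    root = Path(root)
    missing = set()
    for pocket_a, pocket_b in pairs:
        for rel in (pocket_a, pocket_b):
            # A pocket counts as missing if no regular file exists at its path.
            if not (root / rel).is_file():
                missing.add(rel)
    return sorted(missing)


def count_invalid_pairs(pairs, missing):
    """Count the pairs in which at least one pocket file is missing."""
    missing = set(missing)
    return sum(1 for a, b in pairs if a in missing or b in missing)
```

Calling `find_missing_pockets(pairs, "DeeplyTough/STRUCTURE_DATA_DIR/Vertex")` and passing the result to `count_invalid_pairs` would show which pocket files account for the invalid pairs.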

I would really appreciate it if I could get your help! Looking forward to your reply!