Vertex dataset is missing some pocket PDB files
mylRalph opened this issue
Yuanle Mo commented
Hi, glad to see your excellent work!
I followed your code to preprocess TOUGH-M1 and Vertex for training and evaluation with `--db_preprocessing` set to `0`, trying to obtain the same splits used in your work. However, I encountered the problems below:
- I can't find the corresponding pocket PDB files at several pocket paths, e.g. `DeeplyTough/STRUCTURE_DATA_DIR/Vertex/4cmt/4cmt_site_2.pdb`, `DeeplyTough/STRUCTURE_DATA_DIR/Vertex/4a9t/4a9t_site_2.pdb`, and `DeeplyTough/STRUCTURE_DATA_DIR/Vertex/4anu/4anu_site_2.pdb`. This resulted in 23,380 invalid pocket pairs, so my count does not match the number in the paper (1,461,668 positive and 102,935 negative pocket pairs, 1,564,603 pairs in total).
- I got 6580 structures left for training (your result is 6548 structures) after filtering TOUGH-M1 for the evaluation on Vertex with `--db_exclude_vertex` set to `'seqclust'`. However, the number of pocket pairs constructed from these 6580 TOUGH-M1 structures exactly matches your result, 710,009 pairs, which confuses me.
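In case it helps to reproduce the first problem, here is a minimal sketch of how the missing pocket files can be listed. It is not part of DeeplyTough: the helper name `find_missing_pockets` and the short `expected` list are hypothetical, and it assumes each pocket named `<pdb_code>_site_<n>` is stored at `<root>/<pdb_code>/<pdb_code>_site_<n>.pdb`, as in the paths above.

```python
from pathlib import Path
from typing import Iterable, List


def find_missing_pockets(root: Path, pockets: Iterable[str]) -> List[Path]:
    """Return the expected pocket PDB paths that do not exist under root.

    Assumption: a pocket named '<pdb_code>_site_<n>' lives at
    root/<pdb_code>/<pdb_code>_site_<n>.pdb (layout inferred from the
    paths quoted in this issue, not from the DeeplyTough code itself).
    """
    missing = []
    for name in pockets:
        pdb_code = name.split("_", 1)[0]  # text before the first underscore
        path = root / pdb_code / f"{name}.pdb"
        if not path.is_file():
            missing.append(path)
    return missing


if __name__ == "__main__":
    root = Path("DeeplyTough/STRUCTURE_DATA_DIR/Vertex")
    # Hypothetical subset of pocket names; the full list would come from
    # the Vertex pair definitions.
    expected = ["4cmt_site_2", "4a9t_site_2", "4anu_site_2"]
    for path in find_missing_pockets(root, expected):
        print("missing:", path)
```

Running this over the full set of pocket names referenced by the Vertex pairs should show which files account for the invalid pairs.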
I would really appreciate it if I could get your help! Looking forward to your reply!