questions regarding the reproduction of your test results

Question

pengzhangzhi opened this issue a year ago · comments

Hi. I am trying to reproduce your test results for generating antibody CDRs (sequence-structure co-design) with the DiffAb model.
Using the design_testset.py script, index 10 (PDB 7bwj_H_L_E), and the codesign_single checkpoint, the results on CDR-H3 are unacceptably bad.
The following table shows the Cα RMSD between the generated structures and the native structure.

         H_CDR1     H_CDR2      H_CDR3     L_CDR1     L_CDR2     L_CDR3
mean    1.428283   1.659423   52.084544   1.605134   0.385445   3.645147
min     0.861669   0.935878   28.523428   0.730051   0.208125   1.603799
max     2.923090   3.224749  153.499802   2.118108   0.752684   6.416183

The Cα RMSD is calculated with the following code.

generate_flags = variant['data']['generate_flag']
native_atom_positions = variant['data']['pos_heavyatom'][...,BBHeavyAtom.CA,:][generate_flags]
# native_atom_positions = native_atom_positions[mask_ha[generate_flags]]
pred_atom_positions = pos_ha[...,BBHeavyAtom.CA,:][generate_flags]
# pred_atom_positions = pred_atom_positions[mask_ha[generate_flags]]
rmsd = ((native_atom_positions - pred_atom_positions)**2).sum(-1).mean()

If this case has such a high RMSD, I suspect that the test-set RMSD reported in Table 1 of your paper would also be high.
No offense intended; I am trying to find out what is wrong with my reproduction.
Let me know if you want more details about my setup.

Shitong Luo · Answer 1 · Tue Dec 20 2022 11:54:58 GMT+0800 (China Standard Time)

I git-cloned this repo, ran python design_testset.py 10, and got the following results:

RMSD min=3.528, max=11.301, avg=6.349

The unreasonably high RMSDs clearly indicate that there are issues with your reproduction. Could you at least visualize your generated samples to see what is happening?

Shitong Luo · Answer 2 · Tue Dec 20 2022 12:00:03 GMT+0800 (China Standard Time)

rmsd = ((native_atom_positions - pred_atom_positions)**2).sum(-1).mean()

RMSD stands for root-mean-square deviation. Where is the square root?
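
To make the point concrete, here is a minimal sketch of the calculation with the square root included. It uses plain Python lists instead of the tensors in the question, and the function name ca_rmsd and the toy coordinates are hypothetical, not part of the DiffAb code. If native_atom_positions and pred_atom_positions are PyTorch tensors, the equivalent fix would presumably be to append .sqrt() to the expression quoted above.

```python
import math

def ca_rmsd(native, pred):
    """RMSD between two equal-length lists of (x, y, z) coordinates."""
    if len(native) != len(pred) or not native:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Squared distance per atom: sum over the x, y, z components.
    sq_dists = [
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q in zip(native, pred)
    ]
    # Mean over atoms, then the square root that the quoted line omits.
    return math.sqrt(sum(sq_dists) / len(sq_dists))

# Hypothetical toy coordinates: every atom is displaced by 3 along x.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pred = [(3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
toy_rmsd = ca_rmsd(native, pred)
print(toy_rmsd)  # 3.0 (the mean squared deviation would be 9.0)

# H_CDR3 min and max from the table in the question, read as mean
# squared deviations (an assumption based on the quoted code).
h3_min = math.sqrt(28.523428)
h3_max = math.sqrt(153.499802)
print(round(h3_min, 2), round(h3_max, 2))
```

Under that assumption, the H_CDR3 minimum and maximum become about 5.34 and 12.39, which is in roughly the same range as the 3.5 to 11.3 reported in Answer 1. The mean column cannot be converted this way, because the square root of an average of squared deviations is not the average of the per-sample RMSDs.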

Fred · Answer 3 · Tue Dec 20 2022 12:23:45 GMT+0800 (China Standard Time)

Oops, my bad! Thank you sooooo much~