tICA plot is not the same as Figure S4
Paulie-ai opened this issue · comments
Hello Jiarui,
Recently I used mdtraj to extract 1000 frames as the reference, and I sampled 1000 frames for each of the 12 fast-folding proteins. Specifically, I subsampled the microsecond-scale MD data at a fixed interval to get 1000 frames. But the tICA plot is not even close to Figure S4. I used the default eval.py and metrics.py, and I am confused about the reason for these results. Can you offer some help? Thanks.
metrics_dev_0318-05-27.csv
Hi @Paulie-ai, for tICA analysis we select a trajectory with more samples from D. E. Shaw's trajectories (Science 2011 and Science 2010) to ensure correctness. For the fast-folding proteins, we set stride=50 to get a trajectory with more than 10,000 samples. You can use the following command for your extraction and analysis:
mdconvert -t [topology file] -o [output_pdb] [trajectories] -s 50
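If it is more convenient to stay in Python, the same strided extraction can probably be done with mdtraj directly. This is only a sketch: the function names and the file paths in the usage comment are hypothetical, and it assumes the trajectory format is one that `mdtraj.load` can read with a separate topology file.

```python
def strided_frame_count(n_frames: int, stride: int) -> int:
    """Number of frames kept when taking every `stride`-th frame, starting at frame 0."""
    return len(range(0, n_frames, stride))


def extract_strided(traj_path: str, top_path: str, out_pdb: str, stride: int = 50) -> int:
    """Load every `stride`-th frame, write a multi-model PDB, return the frame count."""
    # Imported here so strided_frame_count can be used without mdtraj installed.
    import mdtraj as md

    traj = md.load(traj_path, top=top_path, stride=stride)
    traj.save_pdb(out_pdb)
    return traj.n_frames


# Hypothetical usage (paths are placeholders, not files from this repo):
# n = extract_strided("protein.dcd", "protein_topology.pdb", "protein_ref.pdb", stride=50)
# print(n)  # should be above 10,000 for the tICA reference
```

`strided_frame_count` is just a quick way to check, before converting, whether a given stride leaves more than 10,000 frames.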
1000 samples are not enough for the tICA method, and a lag time of 20 could be too high when using only 1000 samples.
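To make the sample-size point concrete, here is a small arithmetic sketch. It assumes the lag is counted in frames of the strided trajectory, and the helper names are made up for illustration; it is not taken from eval.py or metrics.py.

```python
def lagged_pairs(n_frames: int, lag: int) -> int:
    """Number of (x_t, x_{t+lag}) pairs available to estimate the time-lagged covariance."""
    return max(n_frames - lag, 0)


def lag_fraction(n_frames: int, lag: int) -> float:
    """Lag expressed as a fraction of the whole trajectory length."""
    return lag / n_frames


for n in (1000, 10000):
    print(n, lagged_pairs(n, 20), f"{lag_fraction(n, 20):.1%}")
# 1000 980 2.0%
# 10000 9980 0.2%
```

With the same lag of 20 frames, the 1000-frame trajectory gives roughly ten times fewer pairs, and each lag step spans ten times more of the trajectory, so the estimated components can look quite different.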