Seeking guidance on use of custom STR bed files
Johnymcb opened this issue · comments
Great job. I'm interested in using DeepRepeat to call genome-wide STRs. I have had a go at your example tutorials, and they worked well for me. I still have some basic questions that I would like your guidance on:
- Can I run DeepRepeat on most custom-defined STR loci with <= 6bp repeat motifs? I have some STR catalogues of 2-6bp repeat motifs that have been run on other tools, but I'm not sure whether all of these motifs are included in your training dataset.
- Alternatively, is it possible to get a bed file of the STRs that are included in your whole genome training data set or some handy scripts/tutorials for creating a new model?
Hi @Johnymcb, thank you for your interest in the tool. Your two questions have the same answer. We provide many well-trained models. For a quick reference, I suggest checking https://github.com/WGLab/DeepRepeat/blob/master/docs/Reproducibility.md and downloading the well-trained models with the commands below.
wget https://www.openbioinformatics.org/hx1/drmodel/trainedmod_32_0.2.tar.gz
tar -xvf trainedmod_32_0.2.tar.gz
From there, you can find whether the motif you are interested in has a trained model. All the trained motifs are also included in trainedmod_32_0.2.
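For anyone checking a custom catalogue against the downloaded models, here is a minimal Python sketch of one way to look for a motif in the extracted folder. It rests on an assumption that is not confirmed in this thread: that the file or directory names inside trainedmod_32_0.2 contain the motif sequence. The helpers `motif_rotations` and `find_model_entries` are hypothetical names for illustration and are not part of DeepRepeat. Cyclic rotations are checked because the same repeat can be written starting from any base (for example CAG, AGC, GCA).

```python
import os


def motif_rotations(motif):
    """Return all cyclic rotations of a motif, e.g. CAG -> {CAG, AGC, GCA}."""
    motif = motif.upper()
    return {motif[i:] + motif[:i] for i in range(len(motif))}


def find_model_entries(model_dir, motif):
    """List paths under model_dir whose name contains any rotation of motif.

    ASSUMPTION: trained models are stored under names that include the
    motif sequence. Inspect the extracted folder to confirm the real layout.
    """
    rotations = motif_rotations(motif)
    hits = []
    for root, dirs, files in os.walk(model_dir):
        for name in dirs + files:
            upper_name = name.upper()
            if any(rotation in upper_name for rotation in rotations):
                hits.append(os.path.join(root, name))
    return sorted(hits)


if __name__ == "__main__":
    # Example motifs; replace with the motifs from your own catalogue.
    for motif in ["CAG", "AAAG"]:
        hits = find_model_entries("trainedmod_32_0.2", motif)
        print(motif, "->", hits if hits else "no matching entry found")
```

Because this is a plain substring match, very short motifs (such as 2bp repeats) may also match unrelated file names, so treat the hits as candidates to verify by hand rather than a definitive list.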
Hi @liuqianhn, thank you.