gymrek-lab / LongTR

Tandem repeat genotyping with long reads

Test data

HLHsieh opened this issue

Hi there,

I am interested in this amazing tool. I am wondering whether you can provide test data.

Thank you,
Hsin

Hello,

Thank you for your interest in LongTR. I just added a small test dataset to the test_data folder. Please let me know if you need anything else.

Best,
Helia

Hi Helia,

Thank you for your quick response. I did not see hg38.analysisSet.fa and test.vcf.gz under the test_data folder, so the following command does not work:

./LongTR --bams HG002_sample_reads.bam,HG003_sample_reads.bam,HG004_sample_reads.bam  \
--fasta hg38.analysisSet.fa \
--regions test_regions_hg38.bed \
--tr-vcf test.vcf.gz \
--bam-samps HG002,HG003,HG004 --bam-libs HG002,HG003,HG004 \
--min-reads 5 \
--max-tr-len 10000 \
--skip-assembly \
--phased-bam

Best,
Hsin

Hello,

That file is the hg38 human reference genome. You can download it from the UCSC Genome Browser:

https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/analysisSet/

Alternatively, if you already have an hg38 reference genome on your system, that should work as well.

Best,
Helia
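
For readers following along, fetching the reference could look roughly like the sketch below. The file name hg38.analysisSet.fa.gz is an assumption based on the directory linked above, and the DOWNLOAD switch is a hypothetical guard added here so that the large download only starts when explicitly requested.

```shell
# Sketch only: the file name is assumed from the UCSC analysisSet directory.
REF_URL="https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/analysisSet/hg38.analysisSet.fa.gz"
REF_GZ="${REF_URL##*/}"   # strip the URL path, keeping hg38.analysisSet.fa.gz
REF_FA="${REF_GZ%.gz}"    # name of the uncompressed FASTA passed to --fasta

echo "Reference will be saved as: $REF_FA"

# Hypothetical guard: run with DOWNLOAD=1 to actually fetch and unpack the file.
if [ "${DOWNLOAD:-0}" = "1" ]; then
  curl -L -O "$REF_URL"
  gunzip "$REF_GZ"
fi
```

The resulting hg38.analysisSet.fa is the path given to --fasta in the test command.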

Hi Helia,

Thank you, the test data works. However, I get the following error on my own data. I only have one sample, so I am wondering whether read groups or --bam-samps are required, or whether a single sample can be used as input.

Using phased BAM tags to genotype and phase TRs (WARNING: Any arguments provided to --snp-vcf will be ignored)
Detected 1 BAM/CRAM files
ERROR: Provided BAM/CRAM files don't contain read groups in the header and the --bam-samps flag was not specified
Exiting...

My code:

LongTR --bams test.sorted.bam --fasta genome.fa --regions test.bed --tr-vcf test.vcf.gz --phased-bam

Best,
Hsin

Hi,

I would like to follow up on this issue. Any suggestions would be appreciated.

Best,
Hsin

Hello,

As the error suggests, you have to provide the --bam-samps flag, as in the initial test script:

--bam-samps HG002,HG003,HG004 --bam-libs HG002,HG003,HG004

Please replace the names above with the sample name in your BAM file.

Best,
Helia
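
Applied to the single-sample command from earlier in the thread, the suggestion might look like the sketch below. The sample name "mysample" is a placeholder, not something from the thread, and passing the same name to --bam-libs simply mirrors the test script; whether --bam-libs is strictly required for one sample is not confirmed here.

```shell
# Sketch only: "mysample" is a hypothetical sample name; replace it with your own.
SAMPLE="mysample"
BAM="test.sorted.bam"

# Same flags as the original single-sample command, plus --bam-samps/--bam-libs.
LONGTR_ARGS="--bams $BAM --fasta genome.fa --regions test.bed --tr-vcf test.vcf.gz --bam-samps $SAMPLE --bam-libs $SAMPLE --phased-bam"

echo "LongTR $LONGTR_ARGS"

# Run only if LongTR is on PATH and the BAM exists; otherwise just show the command.
if command -v LongTR >/dev/null 2>&1 && [ -f "$BAM" ]; then
  # Unquoted on purpose so the argument string splits into separate flags.
  LongTR $LONGTR_ARGS
fi
```

With one BAM file, a single name is given to each flag instead of the comma-separated list used for HG002, HG003 and HG004.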