Test data
HLHsieh opened this issue
Hi there,
I am interested in this amazing tool. I am wondering whether you can provide test data.
Thank you,
Hsin
Hello,
Thank you for your interest in LongTR. I just added a small test dataset to the test_data folder. Please let me know if you need anything else.
Best,
Helia
Hi Helia,
Thank you for your quick response. I do not see hg38.analysisSet.fa or test.vcf.gz in the test_data folder, so the following command does not work:
./LongTR --bams HG002_sample_reads.bam,HG003_sample_reads.bam,HG004_sample_reads.bam \
--fasta hg38.analysisSet.fa \
--regions test_regions_hg38.bed \
--tr-vcf test.vcf.gz \
--bam-samps HG002,HG003,HG004 --bam-libs HG002,HG003,HG004 \
--min-reads 5 \
--max-tr-len 10000 \
--skip-assembly \
--phased-bam
Best,
Hsin
Hello,
That file is the hg38 human reference genome. You can download it from the UCSC Genome Browser:
https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/analysisSet/
Alternatively, if you already have an hg38 reference genome on your system, that should work as well.
Best,
Helia
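A minimal sketch of fetching the reference from the directory linked above. The archive name `hg38.analysisSet.fa.gz` is an assumption about what that directory contains, and `fetch_reference` is a hypothetical helper name. Check the directory listing before relying on either.

```shell
# Directory URL from the post above; the archive name is an assumption.
BASE_URL="https://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/analysisSet"
ARCHIVE="hg38.analysisSet.fa.gz"
REF_URL="$BASE_URL/$ARCHIVE"

# Download the archive into the current directory and decompress it.
# gunzip strips the .gz suffix, leaving hg38.analysisSet.fa for --fasta.
fetch_reference() {
    curl -L -O "$REF_URL" && gunzip "$ARCHIVE"
}

echo "Reference URL: $REF_URL"
# The download is large, so it is not started automatically.
# Call fetch_reference when you are ready to download.
```

The function is defined but not called, so sourcing the snippet only prints the URL.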
Hi Helia,
Thank you, the test data works. However, I get the following error on my own data. I only have one sample, and I am wondering whether read groups or --bam-samps are required, or whether it is possible to use a single sample as input.
Using phased BAM tags to genotype and phase TRs (WARNING: Any arguments provided to --snp-vcf will be ignored)
Detected 1 BAM/CRAM files
ERROR: Provided BAM/CRAM files don't contain read groups in the header and the --bam-samps flag was not specified
Exiting...
My code:
LongTR --bams test.sorted.bam --fasta genome.fa --regions test.bed --tr-vcf test.vcf.gz --phased-bam
Best,
Hsin
Hi,
I would like to follow up on this issue. Any suggestions would be appreciated.
Best,
Hsin
Hello,
As the error suggests, you have to provide the --bam-samps flag, as in the initial test script:
--bam-samps HG002,HG003,HG004 --bam-libs HG002,HG003,HG004
Please replace the names above with the sample name for your BAM file.
Best,
Helia
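For a single BAM file, the advice above might translate into the sketch below. It first checks whether the BAM header has any `@RG` (read group) lines, since the error is raised only when they are missing and --bam-samps is not given. It then rebuilds the command from the question with one sample name. The name `sample1` is a placeholder that does not come from this thread, `has_read_groups` is a hypothetical helper, and the header check assumes samtools is installed.

```shell
# Placeholder sample name (an assumption); replace it with your own.
SAMPLE="sample1"
BAM="test.sorted.bam"

# Succeeds if the SAM header text on stdin contains at least one @RG line.
has_read_groups() {
    grep -q '^@RG'
}

# Inspect the header only when samtools and the BAM file are both available.
if command -v samtools >/dev/null 2>&1 && [ -f "$BAM" ]; then
    if samtools view -H "$BAM" | has_read_groups; then
        echo "Header has read groups"
    else
        echo "No read groups found; --bam-samps and --bam-libs are needed"
    fi
fi

# Same flags as the command in the question, plus one name for the single BAM.
CMD="LongTR --bams $BAM --fasta genome.fa --regions test.bed --tr-vcf test.vcf.gz --bam-samps $SAMPLE --bam-libs $SAMPLE --phased-bam"
echo "$CMD"
# To run it once LongTR is on your PATH, uncomment the next line:
# $CMD
```

The command is only printed by default, so it can be reviewed before it is run.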