fasterq-dump --split-3 on Ubuntu (v 3.10.0) writes different quality scores than on macOS (v 3.0.1 / v 3.1.0)
Jyi-Yang opened this issue · comments
Hi there,
I used fasterq-dump --split-3 SRR15347541 to download FASTQ files from SRA on both a Linux server and my own laptop.
But when I checked the base quality scores, I ran into a problem.
On Mac(v 3.0.1):
(qiime2-amplicon-2024.2) apple@Iris 2024_1 % head -20 SRR15347541_1_mac.fastq
@SRR15347541.1 1 length=250
TNCGTAGGGTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGTGCGCAGGCGGTTATGCAAGACAGATGTGAAATCCCCGGGCTCAACCTGGGAACTGCATTTGTGACTGTATGGCTAGAGTACGGTAGAGGGGGATGGAATTCCGCGTGTAGCAGTGAAATGCGTAGATATGCGGAGGAACACCGATGGCGAAGGCAATCCCCTGGACCTGTACTGACGCTCATGCACGAAAGCGTGGGGAGCAAAC
+SRR15347541.1 1 length=250
C#>>AABCCFBCGGGGGGGGGGHGGGGGHHHHGHHHGGGGHHHGGGGGGGGGGGGFGEFFDFGE2GGFFF@GFBGGHHHHGGGGCGHFHHFGHHGFHGHHGHHHHHHHGFFDGHHFHFFGFG2@><?E?GGFDGGGG@GEG/DGHDGBA@DC:CCGBHF/BGHFFFFDA?AFBBGFGCCFG.9AFFF-9>DFB>-C@DE?FFBFFFBEFFFFFB/9/;FFBADDFFBFF/BAADFDFFFFFFFFDDAFFF
On Mac(v 3.1.0):
@SRR15347541.1 1 length=250
TNCGTAGGGTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGTGCGCAGGCGGTTATGCAAGACAGATGTGAAATCCCCGGGCTCAACCTGGGAACTGCATTTGTGACTGTATGGCTAGAGTACGGTAGAGGGGGATGGAATTCCGCGTGTAGCAGTGAAATGCGTAGATATGCGGAGGAACACCGATGGCGAAGGCAATCCCCTGGACCTGTACTGACGCTCATGCACGAAAGCGTGGGGAGCAAAC
+SRR15347541.1 1 length=250
C#>>AABCCFBCGGGGGGGGGGHGGGGGHHHHGHHHGGGGHHHGGGGGGGGGGGGFGEFFDFGE2GGFFF@GFBGGHHHHGGGGCGHFHHFGHHGFHGHHGHHHHHHHGFFDGHHFHFFGFG2@><?E?GGFDGGGG@GEG/DGHDGBA@DC:CCGBHF/BGHFFFFDA?AFBBGFGCCFG.9AFFF-9>DFB>-C@DE?FFBFFFBEFFFFFB/9/;FFBADDFFBFF/BAADFDFFFFFFFFDDAFFF
Both files above show normal-looking quality strings.
On Linux:
@SRR15347541.1 1 length=250
TNCGTAGGGTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGTGCGCAGGCGGTTATGCAAGACAGATGTGAAATCCCCGGGCTCAACCTGGGAACTGCATTTGTGACTGTATGGCTAGAGTACGGTAGAGGGGGATGGAATTCCGCGTGTAGCAGTGAAATGCGTAGATATGCGGAGGAACACCGATGGCGAAGGCAATCCCCTGGACCTGTACTGACGCTCATGCACGAAAGCGTGGGGAGCAAAC
+SRR15347541.1 1 length=250
??????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
On Mac:
To rule out a display problem, I transferred the FASTQ file from the Linux server to the Mac and ran head -20 SRR15347541_1_linux.fastq to check:
@SRR15347541.1 1 length=250
TNCGTAGGGTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGTGCGCAGGCGGTTATGCAAGACAGATGTGAAATCCCCGGGCTCAACCTGGGAACTGCATTTGTGACTGTATGGCTAGAGTACGGTAGAGGGGGATGGAATTCCGCGTGTAGCAGTGAAATGCGTAGATATGCGGAGGAACACCGATGGCGAAGGCAATCCCCTGGACCTGTACTGACGCTCATGCACGAAAGCGTGGGGAGCAAAC
+SRR15347541.1 1 length=250
??????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????
The quality line from the Linux download consists entirely of "?" characters, so something seems to be wrong.
I have tried both fasterq-dump --split-3 and fastq-dump --split-3, and both show the same problem.
Could you help me with this?
Thanks a lot.
You are dumping different files for the same run:
- on Mac - SRA Normalized Format files with full base quality scores,
- on Linux - SRA Lite files with simplified base quality scores.
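To make the difference concrete, here is a minimal sketch that decodes the first few quality characters from the two outputs above. It assumes the usual Phred+33 FASTQ encoding; the function name `phred33_scores` is just an illustrative helper, not part of sra-tools.

```python
def phred33_scores(quality):
    """Decode a FASTQ quality string, assuming Phred+33 (Sanger) encoding."""
    return [ord(ch) - 33 for ch in quality]


# First 10 quality characters of read SRR15347541.1 from each download.
mac_prefix = "C#>>AABCCF"   # Mac output (full quality scores)
linux_prefix = "?" * 10     # Linux output (every character is '?')

print(phred33_scores(mac_prefix))    # [34, 2, 29, 29, 32, 32, 33, 34, 34, 37]
print(phred33_scores(linux_prefix))  # [30, 30, 30, 30, 30, 30, 30, 30, 30, 30]
```

So the "?" line is not corrupted text: under that encoding it is a constant score of 30 at every position, which fits the simplified scores described above.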
Run vdb-config --interactive on both systems. I think you will find that the setting for "Prefer SRA Lite files ..." differs between them: on for the Linux host and off for the Mac. If so, that would be the cause of the difference in the output.
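If you want to check an already downloaded file without re-running the dump, a rough heuristic is to see whether every sampled read has a single repeated quality character. The sketch below assumes plain 4-line FASTQ records as in the output above; `quality_lines` and `looks_simplified` are hypothetical helper names, and a True result only suggests simplified scores, it does not prove the file came from SRA Lite.

```python
import itertools


def quality_lines(path, max_reads=1000):
    """Yield the quality line (every 4th line) of up to max_reads records."""
    with open(path) as handle:
        for i, line in enumerate(itertools.islice(handle, 4 * max_reads)):
            if i % 4 == 3:
                yield line.rstrip("\r\n")


def looks_simplified(path, max_reads=1000):
    """Return True if each sampled read uses one repeated quality character.

    Returns False for an empty file or as soon as a read with varying
    quality characters is found.
    """
    seen_any = False
    for qual in quality_lines(path, max_reads):
        seen_any = True
        if len(set(qual)) > 1:
            return False
    return seen_any
```

For the files in this thread, `looks_simplified("SRR15347541_1_linux.fastq")` would be expected to return True and `looks_simplified("SRR15347541_1_mac.fastq")` False, judging by the first records shown above.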