Not sorting /1 and /2 reads properly.

Question

jallmer opened this issue 3 years ago · comments

I just cloned and built seqtk.

I want to use it to split a mixed FASTQ file into two files, one for each read of a pair.
According to seqtk seq -h, the -1 and -2 options should do that:
seqtk seq -1 toAssemble_mixedPairs.fastq > toAssemble_1.fastq
seqtk seq -2 toAssemble_mixedPairs.fastq > toAssemble_2.fastq

I ran 'grep /1 toAssemble_2.fastq' to confirm that there is no /1 in that file. That seems fine.

The toAssemble_1.fastq file, however, contains /2 entries.
Counting them gives 70 million /1 and 280 million /2 reads (the mixed file is ~80 GB).

Any ideas?

Heng Li · Answer 1 · Wed Apr 14 2021 22:08:16 GMT+0800 (China Standard Time)

The input fastq has to be interleaved.
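
For context: the seqtk seq help text describes -1 as "output the 2n-1 reads only" and -2 as "output the 2n reads only", so the selection is by record position, not by the /1 or /2 suffix in the read name. If the file is not interleaved, one workaround is to split on the name suffix instead. The following is a minimal sketch and is not part of seqtk. It assumes plain 4-line FASTQ records whose read names end in /1 or /2, and the function name split_by_suffix is made up for illustration. It only inspects header lines, since quality strings can themselves contain the characters "/" and "1".

```python
import itertools


def split_by_suffix(src, out1, out2):
    """Route 4-line FASTQ records to out1 or out2 by the /1 or /2 name suffix.

    Hypothetical helper, not part of seqtk. Assumes every record is exactly
    four lines (no wrapped sequence or quality lines).
    """
    counts = {"/1": 0, "/2": 0, "other": 0}
    while True:
        record = list(itertools.islice(src, 4))
        if not record:
            break
        if len(record) < 4:
            raise ValueError("truncated FASTQ record")
        # The read name is the first whitespace-delimited token of the header.
        fields = record[0].split()
        name = fields[0] if fields else ""
        if name.endswith("/1"):
            out1.writelines(record)
            counts["/1"] += 1
        elif name.endswith("/2"):
            out2.writelines(record)
            counts["/2"] += 1
        else:
            # Reads without a recognizable suffix are counted and skipped.
            counts["other"] += 1
    return counts


def split_file(mixed_path, path1, path2):
    """Open the three files and split mixed_path by name suffix."""
    with open(mixed_path) as src, open(path1, "w") as o1, open(path2, "w") as o2:
        return split_by_suffix(src, o1, o2)
```

For the files in this thread it could be called as split_file("toAssemble_mixedPairs.fastq", "toAssemble_1.fastq", "toAssemble_2.fastq"). Note that this only separates the reads. It does not put mates in the same order in both outputs or drop reads whose mate is missing, which a 70 million versus 280 million split suggests will matter here.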