lh3 / seqtk

Toolkit for processing sequences in FASTA/Q formats

subseq silent fail on malformatted fq

cmsoulette opened this issue · comments

seqtk fails silently when given a malformed FASTQ file.

An aligned BAM was converted to FASTQ with pysam, using the .get_forward_sequence() and .get_forward_qualities() functions. The latter returns an array rather than a string, which produces a malformed FASTQ record if it is written out without first being converted to a string. Running seqtk subseq on the malformed file does not raise any error.

Example FQ entry:

@SRR.ABC.123
AGGGCAATGTACTTCGTTCA.....
+SRR.ABC.123
array('B', [3, 3, 3, 3, 2,....])
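
For reference, here is a minimal sketch of how the quality array could be converted to a string before the record is written. It assumes the usual Phred+33 encoding. The helper names `qualities_to_fastq_string` and `format_fastq_record` are hypothetical and not part of pysam or seqtk, and the literal array stands in for the value returned by .get_forward_qualities().

```python
from array import array


def qualities_to_fastq_string(quals, offset=33):
    """Encode numeric Phred scores as a FASTQ quality string.

    Assumes Phred+33 encoding (offset=33); hypothetical helper, not a pysam API.
    """
    return "".join(chr(q + offset) for q in quals)


def format_fastq_record(name, seq, quals):
    """Build one four-line FASTQ record, refusing mismatched lengths."""
    qual_str = qualities_to_fastq_string(quals)
    if len(qual_str) != len(seq):
        raise ValueError(
            f"{name}: sequence length {len(seq)} != quality length {len(qual_str)}"
        )
    return f"@{name}\n{seq}\n+\n{qual_str}\n"


# Stand-in for the array that .get_forward_qualities() returns.
example_quals = array("B", [3, 3, 3, 3, 2])
record = format_fastq_record("read1", "AGGGC", example_quals)
print(record, end="")
```

The length check is there because seqtk does not report the malformed input, so any validation has to happen on the writing side.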