griffithlab / rnaseq_tutorial

Informatics for RNA-seq: A web resource for analysis on the cloud. Educational tutorials and working pipelines for RNA-seq analysis including an introduction to: cloud computing, critical file formats, reference genomes, gene annotation, expression, differential expression, alternative splicing, data visualization, and interpretation.

sequences dropped from the index

fklirono opened this issue

Hello,

kallisto (0.44.0) seems to be silently dropping sequences from the index.

Working example:

Is there a reason why some sequences are not indexed?

Code to reproduce the example:

wget -O - 'http://www.circbase.org/download/human_hg19_circRNAs_putative_spliced_sequence.fa.gz' | gzip -d -c > human_hg19_circRNAs_putative_spliced_sequence.fa

sed -n '/^>/p' human_hg19_circRNAs_putative_spliced_sequence.fa | wc -l

kallisto index -i human_hg19_circRNAs_putative_spliced_sequence.fa.fai human_hg19_circRNAs_putative_spliced_sequence.fa

kallisto inspect human_hg19_circRNAs_putative_spliced_sequence.fa.fai
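
Before comparing the header count with what `kallisto inspect` reports, it may help to summarise the FASTA file itself. The sketch below is an illustration and not part of the original report: `fasta_summary` is a hypothetical helper that counts records and flags those shorter than a k-mer length `k`. It assumes kallisto's default k of 31. Whether short records account for the missing sequences is not confirmed anywhere in this thread.

```shell
#!/bin/sh
# fasta_summary: hypothetical helper, not part of kallisto.
# Prints the total number of FASTA records and how many are shorter than k.
# Assumption: k defaults to 31 (kallisto's default k-mer length); that short
# records explain the discrepancy is a guess, not a confirmed cause.
fasta_summary() {
    # $1: FASTA file, $2: k-mer length (optional, defaults to 31)
    awk -v k="${2:-31}" '
        # A header line closes the previous record (if any) and starts a new one.
        /^>/ { if (seen) { total++; if (len < k) short++ }; seen = 1; len = 0; next }
        # Sequence lines may wrap, so accumulate the length across lines.
             { len += length($0) }
        # Close the final record at end of file.
        END  { if (seen) { total++; if (len < k) short++ }
               printf "records=%d shorter_than_k=%d\n", total, short }
    ' "$1"
}
```

Calling `fasta_summary human_hg19_circRNAs_putative_spliced_sequence.fa` prints a single line of the form `records=N shorter_than_k=M`, which can be set against the `wc -l` header count above.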

Sorry for opening this issue here. I should have opened it directly on the kallisto GitHub... I had not had enough coffee yet to wake up.