High percentage of reads not mapped, considered "too short"
thallinger opened this issue
I am aligning human stranded RNA-seq data (PE 101) from ~30 samples to the chm13v2.0 assembly with the command below, and 15-28% of the reads are reported as unmapped, "too short":
STAR --runThreadN 6 --runMode alignReads --genomeDir ${refdir} --outFileNamePrefix ${outputdir} \
--readFilesIn ${readfiles} --readFilesCommand zcat --outSAMunmapped Within KeepPairs \
--outFilterMatchNminOverLread 0.0 --outFilterMatchNmin 26 \
--outSAMtype BAM SortedByCoordinate --quantMode GeneCounts
I applied different values for the --outFilterMatchNminOverLread and --outFilterMatchNmin parameters (0.66/0, 0.40/0, 0.20/0 and finally 0.0/26), but none of the combinations had any influence on the number of "too short" unmapped reads.
Am I using the right parameters with incorrect values, or are there other parameters related to filtering "too short" reads?
Execution environment:
more /etc/debian_version
10.1
STAR --version
2.7.11a
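To compare the "too short" fraction across parameter combinations like the ones above, the percentage can be read from each run's Log.final.out. This is a minimal sketch: the function name too_short_pct is made up for illustration, and it assumes the usual "label |<TAB>value" layout of that file.

```shell
# Print the "% of reads unmapped: too short" value from a STAR Log.final.out.
# too_short_pct is a hypothetical helper name, not part of STAR.
too_short_pct() {
    # Split each line on "|": the label is field 1, the value is field 2.
    awk -F'|' '/% of reads unmapped: too short/ {
        gsub(/[ \t%]/, "", $2)   # strip whitespace and the percent sign
        print $2
    }' "$1"
}

# Example call (path assumes the --outFileNamePrefix used in the command above):
# too_short_pct "${outputdir}Log.final.out"
```

Running it once per parameter combination makes it easy to see whether a change had any effect on this category.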
Hi @thallinger
You also need to set --outFilterScoreMinOverLread 0, since the outFilter parameters are combined with AND logic.
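The AND logic can be illustrated with a small sketch. It is a simplified model based on the documented parameters (--outFilterScoreMin, --outFilterScoreMinOverLread, --outFilterMatchNmin, --outFilterMatchNminOverLread; defaults 0, 0.66, 0, 0.66), not STAR's actual code, and the function name passes_out_filters is hypothetical. For paired-end reads the read length is assumed to be the sum of the mates' lengths (202 for PE 101).

```shell
# Simplified model: an alignment is kept only if it passes ALL four thresholds.
# Args: score n_match read_len score_min score_min_over_lread match_nmin match_nmin_over_lread
passes_out_filters() {
    awk -v score="$1" -v nmatch="$2" -v lread="$3" \
        -v smin="$4" -v sfrac="$5" -v mmin="$6" -v mfrac="$7" 'BEGIN {
        ok = (score + 0 >= smin + 0) && (score + 0 >= sfrac * lread) &&
             (nmatch + 0 >= mmin + 0) && (nmatch + 0 >= mfrac * lread)
        exit(ok ? 0 : 1)
    }'
}

# A pair where only ~100 bases align (score 99, 100 matched bases, Lread 202).
# Match thresholds relaxed (26 / 0.0), score fraction left at the default 0.66:
if passes_out_filters 99 100 202 0 0.66 26 0; then echo "mapped"; else echo "too short"; fi
# Same alignment with the score fraction also set to 0:
if passes_out_filters 99 100 202 0 0 26 0; then echo "mapped"; else echo "too short"; fi
```

Under this model the first call prints "too short" and the second prints "mapped": relaxing only the match-based thresholds leaves the score-based threshold (0.66 * 202, about 133) in force, which would explain why the combinations tried above made no difference.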
I used to think this parameter was meant to filter out low-quality alignments. If it is set to 0, will that result in a lot of low-quality alignments? I use STAR for two ChIP-seq datasets and get >30% "too short" reads, although the data otherwise performed well.