High % of reads assigned to no feature in gene counts

Question

teryyoung opened this issue 5 months ago · comments

This is my mapping result. The percentage of mapped reads looks fine, but a lot of these reads are assigned to no feature. How do I fix this problem?

This is my command:

STAR --twopassMode Basic \
    --quantMode TranscriptomeSAM GeneCounts \
    --runThreadN 6 \
    --genomeDir ${ref_dir} \
    --alignIntronMin 5 \
    --alignIntronMax 60000 \
    --outSAMtype BAM Unsorted \
    --outSAMattrRGline ID:"${sample}" SM:"${sample}" PL:ILLUMINA \
    --outFilterMismatchNmax 2 \
    --outSJfilterReads Unique \
    --outSAMmultNmax 1 \
    --outFileNamePrefix ./alignment/${sample}/${sample} \
    --outSAMmapqUnique 60 \
    --readFilesCommand zcat \
    --readFilesIn "$r1" "$r2" \
    --outTmpDir ./tmp

samtools sort -@ 12 -o ./alignment/${sample}/${sample}.sorted.bam ./alignment/${sample}/${sample}Aligned.out.bam

Alexander Dobin · Answer 1 · Sat Feb 24 2024 03:00:49 GMT+0800 (China Standard Time)

This means that many reads in your library map to intronic or intergenic regions. If you have a more complete set of annotations, you may be able to increase the % of exonic reads, but this is more likely an issue with the library prep.
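To put a number on this, the four summary rows at the top of the `ReadsPerGene.out.tab` file written by `--quantMode GeneCounts` (`N_unmapped`, `N_multimapping`, `N_noFeature`, `N_ambiguous`) can be compared with the per-gene counts below them. According to the STAR manual, the three count columns correspond to unstranded counting, counting on the read 1 strand, and counting on the reverse strand, so it is worth checking the no-feature share in each column separately. The following is a minimal sketch, not part of the original thread: the function name `summarize_reads_per_gene` is hypothetical, the demo numbers are invented for illustration, and the percentages are taken over uniquely mapped reads only (assigned + no feature + ambiguous), which is an assumption about what is most useful to look at.

```python
import io

# Summary rows that STAR writes before the per-gene rows.
SUMMARY_KEYS = ("N_unmapped", "N_multimapping", "N_noFeature", "N_ambiguous")
# Labels for the three count columns (hypothetical names chosen for this sketch).
COLUMNS = ("unstranded", "forward", "reverse")


def summarize_reads_per_gene(handle):
    """Return, per count column, how uniquely mapped reads were classified."""
    summary = {col: {} for col in COLUMNS}
    assigned = {col: 0 for col in COLUMNS}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        name = fields[0]
        counts = [int(x) for x in fields[1:4]]
        for col, n in zip(COLUMNS, counts):
            if name in SUMMARY_KEYS:
                summary[col][name] = n
            else:
                assigned[col] += n  # a gene row: reads assigned to a feature
    report = {}
    for col in COLUMNS:
        no_feature = summary[col].get("N_noFeature", 0)
        ambiguous = summary[col].get("N_ambiguous", 0)
        unique = assigned[col] + no_feature + ambiguous
        report[col] = {
            "assigned": assigned[col],
            "uniquely_mapped": unique,
            "pct_assigned": 100.0 * assigned[col] / unique if unique else 0.0,
            "pct_noFeature": 100.0 * no_feature / unique if unique else 0.0,
        }
    return report


# Invented demo data in the ReadsPerGene.out.tab layout (tab-separated).
DEMO = (
    "N_unmapped\t100\t100\t100\n"
    "N_multimapping\t50\t50\t50\n"
    "N_noFeature\t200\t700\t250\n"
    "N_ambiguous\t20\t5\t10\n"
    "GENE_A\t500\t200\t480\n"
    "GENE_B\t280\t95\t260\n"
)

for column, stats in summarize_reads_per_gene(io.StringIO(DEMO)).items():
    print(column, stats)
```

With the command in the question, the real file would be expected at `./alignment/${sample}/${sample}ReadsPerGene.out.tab`; passing `open(path)` instead of the `io.StringIO` demo gives the same report. If the no-feature share is high in every column, that is consistent with reads falling in intronic or intergenic regions rather than with a strandedness mismatch.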

teryyoung · Answer 2 · Sat Feb 24 2024 16:32:16 GMT+0800 (China Standard Time)

@alexdobin thanks for your reply. I used the <hg38.ncbiRefSeq.gtf> annotation file.
I have now found that the library is a commercial reference RNA that contains many fusion genes. I think this may be why STAR cannot assign the reads to gene features. Can STAR recognize fusion genes, or could there be another issue?

Alexander Dobin · Answer 3 · Sat Mar 02 2024 00:19:18 GMT+0800 (China Standard Time)

Hi @teryyoung
STAR can detect chimeric alignments, and you would need some downstream processing to detect fusions (e.g. STAR-Fusion).
However, I doubt this will explain what you see.