twlab / TEProf2Paper

TEProf2 Pipeline used to find promoters and predict protein sequences from RNA-sequencing data

How to get raw reads counts for final all TE-chimeric transcripts in different samples?

31474molly opened this issue

Dear professors,
Thank you for your outstanding work! I now have the All TE-derived Alternative Isoforms Statistics.xlsx and allCandidateStatistics.tsv files for many samples from three different stages, and I want to run a differential transcript expression analysis between the stages. However, the only expression values I can find for downstream analysis are TPM. Could you suggest a way to obtain raw read counts? For example, could we use a tool such as featureCounts or HTSeq with the transcript annotations in All TE-derived Alternative Isoforms Statistics.xlsx to assign the reads in each sample's BAM file to the annotated transcripts and calculate a raw read count for each transcript?
I look forward to your reply.

Hello, thank you for your comments! Since we use StringTie for quantification, you can use the ballgown package (https://github.com/alyssafrazee/ballgown) for differential transcript expression analysis. The StringTie documentation also has instructions on how to run the analysis with DESeq2 using their custom Python script (https://ccb.jhu.edu/software/stringtie/index.shtml?t=manual). TEProf2 already performs all the steps needed to run ballgown, so you should be able to use ballgown with the files that were created when you ran the pipeline.
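
For readers wondering how raw counts can be recovered from StringTie output at all: the StringTie manual describes approximating the reads per transcript as coverage * transcript_len / read_len. Below is a minimal Python sketch of that arithmetic only. The function name, the example transcript IDs, and the numbers are hypothetical, and rounding up is an assumption here, so treat it as an illustration rather than a replacement for the script referenced in the StringTie manual.

```python
import math


def estimate_read_count(coverage: float, transcript_length: int, read_length: int) -> int:
    """Approximate the raw read count of one transcript.

    Uses the relation given in the StringTie manual:
        reads_per_transcript = coverage * transcript_len / read_len
    Rounding up to a whole read is an assumption of this sketch.
    """
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    if coverage < 0 or transcript_length < 0:
        raise ValueError("coverage and transcript_length must be non-negative")
    return math.ceil(coverage * transcript_length / read_length)


# Hypothetical per-transcript values (coverage, transcript length in bp),
# e.g. as they might be read from the `cov` attribute and exon lengths
# of a StringTie GTF for one sample.
example_transcripts = {
    "TE_chimeric_1": (10.0, 1500),
    "TE_chimeric_2": (1.0, 100),
}

READ_LENGTH = 75  # assumed average read length for this sample

estimated_counts = {
    transcript_id: estimate_read_count(cov, length, READ_LENGTH)
    for transcript_id, (cov, length) in example_transcripts.items()
}
```

Counts estimated this way for every sample can be arranged into a transcript-by-sample matrix, which is the kind of input that count-based tools such as DESeq2 expect.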