zavolanlab / PAQR_KAPAC

Scripts, pipelines and documentation to run PAQR and KAPAC. KAPAC infers regulatory sequence motifs implicated in 3’ end processing changes; PAQR quantifies poly(A) site usage from standard RNA-seq data.

Intronic polyadenylation quantification

vvdnoord opened this issue

Hello,

First of all: thank you very much for sharing this code, along with such a clear description!

After running the PAQR pipeline successfully, I get a list of differential polyA site usage for the different terminal exon (TE) isoforms.

Actually, I am especially interested in quantifying differential intronic polyA site usage (e.g. switching from a regular TE isoform to an intronic polyA isoform). Therefore, I was wondering whether this pipeline could also report the relative usage of the intronic polyA site isoforms that are included in the polyAsite atlas?

I noticed that the .bed annotation file contains only the TE polyA sites. Would it be possible to extend this file with the intronic polyadenylation clusters (IN) from the polyAsite atlas and run the script as normal? Or would that cause other issues? If it is possible: could you give me some advice on an easy way to annotate the .bed file from the polyA atlas with transcript/gene IDs, so that the intronic sites can be loaded in the same way as the TE sites (I am quite new to these genomic bioinformatics tools)?

Hopefully you can help with this!

Best regards,
Vera

Hi Vera

Many thanks for your interest in PAQR.

Unfortunately, it is not straightforward to apply the implementation to the inference of intronic poly(A) site usage. The main problem is that the method assesses read coverage profiles for exons. If the same approach were extended to include introns, I would expect immediate issues due to the vast variety of coverage profiles that can occur at introns and intron-exon boundaries.

My ad hoc idea to infer intronic poly(A) site usage would be to use an existing tool like DEXSeq and to treat our annotated intronic poly(A) sites, plus some up- and downstream regions, as self-defined "exons". In this case, DEXSeq could, in principle, allow you to infer differential intronic poly(A) site usage.
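
The windowing step of this idea could look roughly like the sketch below. It is only an illustration under several assumptions, not part of PAQR: it assumes the intronic sites are available as standard 6-column BED lines (the atlas file may carry extra columns), and the flank sizes of 200 nt upstream and 100 nt downstream are arbitrary placeholders. The names `Site`, `parse_bed_line`, `flank` and `to_bed_line` are hypothetical helpers made up for this example. Converting the resulting windows into the annotation format that DEXSeq's counting step expects, and resolving windows that overlap each other or annotated exons, is deliberately left out.

```python
from typing import NamedTuple


class Site(NamedTuple):
    """One poly(A) site cluster in BED coordinates (0-based, end-exclusive)."""
    chrom: str
    start: int
    end: int
    name: str
    strand: str


def parse_bed_line(line: str) -> Site:
    """Parse a 6-column BED line (assumed layout: chrom, start, end, name, score, strand)."""
    fields = line.rstrip("\n").split("\t")
    return Site(fields[0], int(fields[1]), int(fields[2]), fields[3], fields[5])


def flank(site: Site, upstream: int = 200, downstream: int = 100) -> Site:
    """Extend a site into a pseudo-"exon" window.

    "Upstream" and "downstream" are relative to the transcript, so on the
    minus strand the upstream flank is added to the right-hand side.
    The flank sizes are placeholders, not recommended values.
    """
    if site.strand == "+":
        start, end = site.start - upstream, site.end + downstream
    elif site.strand == "-":
        start, end = site.start - downstream, site.end + upstream
    else:
        raise ValueError(f"unexpected strand {site.strand!r} for {site.name}")
    # Clip at the chromosome start; clipping at the chromosome end would
    # need chromosome lengths, which this sketch does not have.
    return site._replace(start=max(0, start), end=end)


def to_bed_line(site: Site) -> str:
    """Write the window back as a 6-column BED line with a dummy score of 0."""
    return "\t".join(
        [site.chrom, str(site.start), str(site.end), site.name, "0", site.strand]
    )


if __name__ == "__main__":
    # Made-up example records, not real atlas entries.
    example = [
        "chr1\t1000\t1010\tsiteA\t0\t+",
        "chr1\t1000\t1010\tsiteB\t0\t-",
    ]
    for record in example:
        print(to_bed_line(flank(parse_bed_line(record))))
```

The strand handling is the one design choice worth noting: because reads that support usage of a poly(A) site accumulate upstream of it in transcript orientation, a symmetric window in genomic coordinates would not mean the same thing on both strands.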

Let us know if you have additional questions,
best regards,

Ralf

Dear Ralf,

Thank you for your help! Extending the intronic polyA sites with some regions up- and downstream and using them in DEXSeq as "exons" indeed sounds like a reasonable alternative. Thank you for this suggestion, I will try it out!

Best,
Vera