comprna / ORQAS

ORF Quantification pipeline for Alternative Splicing

Home Page: https://github.com/comprna/ORQAS

Determination of uniformity and periodicity thresholds

marcasriv opened this issue

Dear authors,

I have a question about the procedure you followed to classify ORFs as translated or untranslated after running ORQAS. How exactly do you determine the cut-off values for uniformity and periodicity? I've also noticed that in Supplementary Figure 1 each dataset has different cutoffs, so do the cutoffs need to be recalculated for each dataset?

Thanks so much,

Marina

To determine which values of uniformity and periodicity are indicative of an isoform being translated, we selected the set of positive controls that appear in blue in fig. 1B and sup. fig. 1. Those are ORFs with evidence of protein expression from mass spectrometry (MS), immunohistochemistry (IHC) and UniProt in all 37 tissues available in the Human Protein Atlas (HPA). For the mouse samples, the positive controls are genes with one-to-one orthology with the human positive controls. Since every dataset can have different coverage or protocol differences that could affect uniformity and periodicity, we calculate our thresholds separately in each dataset, based on the values of the ORFs we expect to be translated. That said, we considered as translated those ORFs that fall within 90% of the periodicity and uniformity distribution of the positive controls, which allows for some variability from the samples or sequencing even in the positive set. So, yes, we recommend recalculating thresholds for each dataset, and we strongly encourage it when the data come from different experiments (e.g., you can see how different the thresholds are between our glia/glioma and hippocampus datasets).
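As a rough illustration of the per-dataset recalculation described above, here is a minimal Python sketch. It is not code from ORQAS. It assumes that "within 90% of the distribution" means a lower cutoff placed so that 90% of the positive controls pass on each metric, which is only one possible reading. The function names, the `(periodicity, uniformity)` tuple layout, and the ORF identifiers are all hypothetical.

```python
import math
from typing import Dict, Iterable, Tuple


def lower_cutoff(values: Iterable[float], keep_fraction: float = 0.9) -> float:
    """Smallest cutoff such that at least keep_fraction of values are >= it.

    Assumption: the threshold is a one-sided lower bound on the
    positive-control distribution.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one positive-control value")
    # Number of positive controls that must pass the cutoff.
    n_keep = math.ceil(round(len(ordered) * keep_fraction, 9))
    n_keep = max(1, min(n_keep, len(ordered)))
    # The n_keep largest values start at this index in the sorted list.
    return ordered[len(ordered) - n_keep]


def classify(
    orfs: Dict[str, Tuple[float, float]],
    positive_ids: Iterable[str],
    keep_fraction: float = 0.9,
) -> Tuple[float, float, Dict[str, bool]]:
    """Derive per-dataset cutoffs from positive controls, then call every ORF.

    orfs maps a (hypothetical) ORF id to (periodicity, uniformity)
    measured in ONE dataset; run this once per dataset.
    """
    positives = [orfs[i] for i in positive_ids if i in orfs]
    p_cut = lower_cutoff((p for p, _ in positives), keep_fraction)
    u_cut = lower_cutoff((u for _, u in positives), keep_fraction)
    # An ORF is called translated only if it passes both cutoffs.
    calls = {
        orf_id: (p >= p_cut and u >= u_cut)
        for orf_id, (p, u) in orfs.items()
    }
    return p_cut, u_cut, calls
```

Calling `classify` separately on each dataset (for example, once for glia/glioma and once for hippocampus) gives each dataset its own pair of cutoffs, since the positive-control distributions differ between experiments.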