haessar / peaks2utr

A robust Python tool for the annotation of 3’ UTRs

Home Page: https://doi.org/10.1093/bioinformatics/btad112

Recommended way to convert output GFF file back to GTF?

jasonleongbio opened this issue

Hi, I was testing whether the updated genome annotation file would improve mapping statistics, so I tried to re-map my single-cell data with it. As CellRanger only accepts genome annotation files in GTF format (https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/tutorial_mr), which tool would you recommend for converting the output of peaks2utr (which is in GFF format) back to GTF?

I tried gffread, but it seems the newly annotated 3' UTRs are either merged with exons or discarded (possibly similar to the issue reported in gpertea/gffread#74, though I am not sure), which adds another layer of uncertainty to my comparison. I would appreciate your thoughts on this, or on alternatives such as AGAT or other tools. Looking forward to any advice - thanks so much.

I actually have a feature branch (not yet merged into master) that deals explicitly with gff/gtf inputs/outputs. For example, you can use a gff3 as input and specify the --gtf flag when calling peaks2utr, and it will output the file in gtf format.

You are welcome to try installing peaks2utr from this branch in another conda env. Simply follow the instructions in the README, but do a git checkout after cloning (I recommend doing this in a clean working directory):

git clone https://github.com/haessar/peaks2utr.git
cd peaks2utr
git checkout feature-gtf-output
python -m build
pip install dist/*.tar.gz

Then try running with the --gtf flag. Let me know how you get on.

This is now implemented as of the latest release (v1.0.0). To output GTF (regardless of whether GFF or GTF is given as input), specify the --gtf flag on the command line when running peaks2utr.
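Given the concern raised above about 3' UTR features being merged or dropped during conversion, one way to sanity-check a GTF output is to compare feature-type counts (column 3) between two annotation files. The following is a minimal sketch, not part of peaks2utr: the function names and file paths are hypothetical, and the exact feature-type label used for 3' UTRs (for example `three_prime_UTR`) is an assumption that may differ between GFF3 and GTF files.

```python
from collections import Counter


def count_feature_types(lines):
    """Count the feature type (column 3) of each record in GFF/GTF lines.

    Comment lines, blank lines, and lines with fewer than 9 tab-separated
    fields are skipped.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 9:
            continue
        counts[fields[2]] += 1
    return counts


def compare_feature_types(first_path, second_path):
    """Print per-feature-type counts for two annotation files side by side.

    Both paths are hypothetical placeholders supplied by the caller.
    """
    with open(first_path) as handle:
        first = count_feature_types(handle)
    with open(second_path) as handle:
        second = count_feature_types(handle)
    for feature_type in sorted(set(first) | set(second)):
        print(f"{feature_type}\t{first[feature_type]}\t{second[feature_type]}")
```

For instance, calling `compare_feature_types("annotation.gff", "annotation.gtf")` (placeholder file names) would list each feature type with its count in both files, so a missing or reduced 3' UTR row stands out. Note that matching counts do not guarantee identical coordinates; this only flags features that were dropped or relabelled.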