gpertea / gffread

GFF/GTF utility providing format conversions, region filtering, FASTA sequence extraction and more

-E prints errors but cannot find a way to write errors to a file

Angsty-Wondergirl opened this issue

I ran the following...
gffread -E problemGTF.gtf -o test.gtf
...in an effort to find transcripts with zero exons. I cannot find any error flags in test.gtf, so the errors were only printed to the screen and not saved to any file. There are too many to copy and paste manually into a text editor. I am not interested in the errors about duplicates, only in those concerning column 3 of the GTF, which should read transcript or exon (specifically, which transcripts lack exons), because I know my file contains transcripts lacking exons that I need to remove.

The -E flag should identify these, but I need a way to get the printed errors into a text file, since I cannot copy them all from the screen. I then tried this method from Stack Exchange, which results in an empty output file.

Does anyone have any ideas on how to get the -E output into a text file? I cannot find a way to do this. It seems odd that there is no way to save the error messages to a file, since those errors inform how a GTF should be edited for use with your data.
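One thing that may explain the empty file: command-line tools usually print warnings on standard error, and a plain `>` redirect only captures standard output. Here is a minimal sketch of the difference. `demo_tool` is a hypothetical stand-in, not part of gffread. That gffread reports its -E messages on standard error is an assumption, and so is the word "exon" appearing in the messages of interest.

```shell
# Hypothetical stand-in for a tool that writes results to stdout
# and warnings to stderr (this is NOT gffread).
demo_tool() {
  echo "result line"
  echo "Warning: example message" >&2
}

# ">" captures stdout only; the warning still appears on the terminal.
demo_tool > stdout_only.txt

# "2>" captures stderr, so the warning ends up in warnings.txt.
demo_tool > results.txt 2> warnings.txt

# The same redirect applied to the command from this issue. It runs only
# if gffread and the input file are present, and assumes that gffread
# prints its -E messages on stderr.
if command -v gffread >/dev/null 2>&1 && [ -f problemGTF.gtf ]; then
  gffread -E problemGTF.gtf -o test.gtf 2> gffread_errors.txt
  # Keep only the messages mentioning exons; the pattern is a guess at
  # the wording, so adjust it after looking at gffread_errors.txt.
  grep -i "exon" gffread_errors.txt > exon_errors.txt || true
fi
```

If both streams are wanted in one file, `> all_output.txt 2>&1` would do that in a POSIX shell.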

(I have found a workaround, but it seems silly given that gffread is designed to show me where these errors are. For those it may help: I can pull out the GTF lines for IDs that have both transcripts and exons by first identifying every ID that has an exon. Listing the lines with exon in $3 yields IDs only for genes that have both transcripts and exons, and not IDs associated only with transcripts:
awk '$3 == "exon" {print}' problemGTF.gtf > test.txt
Then use test.txt to make a list of these IDs (listofIDs.txt, i.e. the IDs associated with exon in $3 of the GTF). Grep the rows from problemGTF.gtf using listofIDs.txt to make a new GTF containing only IDs associated with an exon. The same grep command with -v added gives the rows for IDs that have transcripts but no exons, for the sake of record keeping.
e.g.: grep -w -f listofIDs.txt problemGTF.gtf > transcriptswithexons.gtf)
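The workaround above can also be sketched as a single awk command that reads the file twice. This assumes a tab-separated GTF whose column 9 contains `transcript_id "..."`. The file names `demo.gtf`, `transcripts_with_exons.gtf` and `transcripts_without_exons.txt` are made up for the example, which builds a tiny GTF so it can be run as is.

```shell
# Tiny example GTF: tx1 has an exon, tx2 has a transcript line but no exon.
printf 'chr1\tsrc\ttranscript\t1\t100\t.\t+\t.\tgene_id "g1"; transcript_id "tx1";\n' > demo.gtf
printf 'chr1\tsrc\texon\t1\t100\t.\t+\t.\tgene_id "g1"; transcript_id "tx1";\n' >> demo.gtf
printf 'chr1\tsrc\ttranscript\t200\t300\t.\t+\t.\tgene_id "g2"; transcript_id "tx2";\n' >> demo.gtf

awk -F'\t' -v keep=transcripts_with_exons.gtf -v drop=transcripts_without_exons.txt '
  # Return the value of transcript_id from the attribute column, or "".
  function tid(attrs) {
    if (match(attrs, /transcript_id "[^"]+"/))
      return substr(attrs, RSTART + 15, RLENGTH - 16)
    return ""
  }
  # First pass: remember every transcript_id seen on an exon line.
  NR == FNR { if ($3 == "exon") has_exon[tid($9)] = 1; next }
  # Second pass: keep lines whose transcript has an exon ...
  { id = tid($9) }
  id != "" && (id in has_exon) { print > keep; next }
  # ... and record the IDs of transcripts that have none.
  $3 == "transcript" && id != "" { print id > drop }
' demo.gtf demo.gtf
```

Note that lines without a transcript_id (gene lines, for instance) are left out of the kept file in this sketch, so it is worth checking the result against the original before using it.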