dputhier / libgtftk

gtftk C Library and program

Processing time in add_attr

dputhier opened this issue

This may help with issue #57. I found another problem that seems to occur only with the file Global_GTF_category_moreThan200nc.gtf.

It does not happen with another large file (e.g. all human transcripts from Ensembl release 90).
When I convert this file (Global_GTF_category_moreThan200nc.gtf) to Ensembl format and try to join the attached file, it takes a long time (about 10 minutes?). This time, the time is spent in the add_attribute C function.

    gtftk join_attr -i Global_GTF_category_moreThan200nc_ens.gtf -j to_join.txt -k transcript_id -n tx_geno_size -t transcript -V 2

to_join.txt

This one seems to be fixed in 2714b90