nf-core / bacass

Simple bacterial assembly and annotation pipeline

Home Page: https://nf-co.re/bacass

Removal of Prokka in favor of Bakta

Daniel-VM opened this issue

Issue description

Prokka is no longer maintained, and Bakta seems to be a reasonable replacement for genome annotation that incorporates several improvements.

Describe the solution you'd like

Remove Prokka from nf-core/bacass and add Bakta instead.

Additional notes

It seems that Bakta needs a database to perform the annotations. However, even the light version of its database is somewhat heavy and could slow down the testing process.

Another option is to keep Prokka and add Bakta as an additional tool for annotation.

I am open to suggestions.

I don't know if you'll find a newer bacterial annotator that comes with a database the size of Prokka's, though.

If it helps, the Bakta database is a lot smaller than the Kraken2 ones.

Thanks for your input, @erinyoung!

I see... Well, at some point I noticed that downloading the Kraken2 database (8 GB) took less time than downloading the Bakta database (1.3 GB) using nf-core/module/bakta/baktadbdownload. Could it be that AWS S3 speeds up the Kraken2 database download (hosted at: https://genome-idx.s3.amazonaws.com/kraken/k2_standard_8gb_20210517.tar.gz)?

Anyway, let's give it a try 👍🏾.

Instead of removing Prokka, Bakta has been added as an additional tool for gene annotation (#95)