nf-core / bacass

Simple bacterial assembly and annotation pipeline

Home Page: https://nf-co.re/bacass

Can we access kraken2db from S3 bucket?

piroonj opened this issue

I just started using bacass on AWS and tried to point it at a kraken2 database stored in an S3 bucket. Is it possible to access the kraken2 database from S3? I ran bacass via Nextflow Tower with
--kraken2db 's3://kraken2_db/minikraken2_v2_8GB_201904_UPDATE'

I got the error message below.
Command error:
kraken2: database ("s3://kraken2_db/minikraken2_v2_8GB_201904_UPDATE") does not contain necessary file taxo.k2d

Thank you very much,
Piroon

Hi @piroonj !

What does the content of the S3 bucket look like? The folder s3://kraken2_db/minikraken2_v2_8GB_201904_UPDATE should contain taxo.k2d and all the other database files required by kraken2. Is that the case?

Also: Do you have your AWS credentials set up properly so that your permissions allow you to access the file(s) there?
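
For example, a quick way to check both points from the environment that launches the pipeline (a sketch, assuming the AWS CLI is installed and uses the same credentials) is to list the database folder:

aws s3 ls s3://kraken2_db/minikraken2_v2_8GB_201904_UPDATE/

This should show taxo.k2d together with the other .k2d files (hash.k2d, opts.k2d); an empty listing or an access-denied error would point to a path or permissions problem instead.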

You could also use the public AWS kraken2 indices if you want to; check out this website for more information. The S3 / tar.gz URLs listed there are usually fine to use:

https://benlangmead.github.io/aws-indexes/k2

Hi @apeltzer

Thank you for your response.

  1. The folder s3://kraken2_db/minikraken2_v2_8GB_201904_UPDATE contains taxo.k2d and the other related files. I successfully ran it on a local server.
  2. The AWS credentials are set up.

I think the issue might be due to how Docker mounts the file system. I tried running it locally on my server, and it runs fine as long as kraken2_db is a sub-folder of the working directory. What do you think?
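
If Docker mounting is indeed the culprit, one interim workaround (a sketch; the local target path is illustrative) would be to stage the database under the launch directory before starting the run, so it sits inside the directory tree that is mounted into the containers:

aws s3 sync s3://kraken2_db/minikraken2_v2_8GB_201904_UPDATE ./kraken2_db/minikraken2_v2_8GB_201904_UPDATE

and then pass --kraken2db './kraken2_db/minikraken2_v2_8GB_201904_UPDATE'.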

The latest release is able to use the https://benlangmead.github.io/aws-indexes/k2 indices directly, e.g. by loading the compressed archive straight from its URL; see the changelog: https://github.com/nf-core/bacass/releases/tag/2.0.0
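
A minimal sketch of such an invocation (the archive URL is taken from that index listing, but the exact file name and the other parameters are illustrative and should be adapted to your setup):

nextflow run nf-core/bacass -r 2.0.0 -profile docker \
    --input samplesheet.tsv \
    --kraken2db 'https://genome-idx.s3.amazonaws.com/kraken/k2_standard_8gb_20210517.tar.gz'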