gbouras13 / hybracter

Automated long-read first bacterial genome assembly tool implemented in Snakemake using Snaketool.

[BUG] mash not found

CorentinEscobar opened this issue · comments

Hi everyone,
I have a problem when I use hybracter.

I created the environment and installed hybracter with conda as described on GitHub. However, when I run the program, it works fine at first and then stops because it can't find mash. Yet mash is installed correctly, because I can run it on its own, outside hybracter.

You can see the error message in the attached hybracter.log file.

Have you ever had this problem or do you know how to fix it?

Thanks for your help

hybracter.log

Hi @CorentinEscobar ,

Thanks for uploading the log and trying Hybracter.

This is a bug with installing Plassembler. Mash is a dependency of Plassembler, and for whatever reason Mash sometimes doesn't install with conda (I have found this bug in my other tool Pharokka e.g. here gbouras13/pharokka#235).

The way Hybracter works is that it makes different conda environments for each rule inside hybracter.

You can see the path to the conda environment for each rule in the log, where it is listed as conda-env.

Therefore, to fix this, you will need to activate the Plassembler conda environment and install mash (I'd try mash==2.2 as that normally works).

e.g. for you

conda activate /home/horigene/hybracter/hybracter/workflow/conda/67da6b66edaa0b7b03c36d43c63ed64f_
mamba install mash==2.2
mash -h
plassembler -h
conda deactivate

then hopefully re-running hybracter will work.
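
To find the environment path without scrolling through the log, something like the following may help. It is only a sketch: it assumes the log contains lines of the form `conda-env: <path>`, as Snakemake prints in its rule error output, and the function name `list_conda_envs` is made up for illustration.

```shell
# List the unique conda environment paths mentioned in a Snakemake log.
# Assumption: each relevant line looks like "conda-env: <path>".
list_conda_envs() {
    grep -o 'conda-env: .*' "$1" | sed 's/^conda-env: //' | sort -u
}
```

Usage would be `list_conda_envs hybracter.log`, after which the printed path can be passed to `conda activate`.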

Let me know how you go.

George

I've added some instructions to the documentation under the header 'Errors with Installing Dependencies & Environments'

https://hybracter.readthedocs.io/en/latest/install/

George

Hi @gbouras13,

Thank you for your quick response and your precise explanations.
I was able to resolve the dependencies issue. I had the same problem with canu but I solved this problem in the same way.

However, I have another problem. It seems that all the dependencies are installed correctly, but plassembler cannot find the database. I don't understand where it expects the database to be installed or what its name should be.

Do you have any idea how to fix this error?
The log file is attached.

Thank you in advance for your help
Corentin

hybracter.log

Hi @CorentinEscobar ,

Have you run

hybracter install

That should install the DB in the correct place.

George

Yes, but it doesn't work, and the terminal shows me this message:

2024-01-11 11:44:41.809 | ERROR | plassembler.utils.db:check_db_installation:44 - Database directory is missing /home/horigene/anaconda3/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/../databases/plsdb.msh. Plassembler database needs to be downloaded using the plassembler download command.

Corentin

That is very strange - maybe try to specify a directory for the database using -d or --databases

hybracter install -d <path>

then

hybracter long ... -d <path>

George

I did, but I still get the same error message.

Corentin

I notice you have Plassembler v1.4.1 installed

It is looking for the v1.4.1 DB, which was updated in v1.5.0.

Therefore, perhaps try:

conda activate /home/horigene/hybracter/hybracter/workflow/conda/67da6b66edaa0b7b03c36d43c63ed64f_
plassembler download -d /home/horigene/anaconda3/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/../databases 
conda deactivate

and then run hybracter
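
Before re-running, it may be worth confirming that the download landed where Plassembler looks. The sketch below checks only for `plsdb.msh`, the one path named in the error message above; Plassembler may need other files too, and the function name `check_plassembler_db` is hypothetical.

```shell
# Check that plsdb.msh exists under the given database directory.
# Assumption: this is the only path tested; it is the one the error message names.
check_plassembler_db() {
    db_dir="$1"
    if [ -e "$db_dir/plsdb.msh" ]; then
        echo "found: $db_dir/plsdb.msh"
    else
        echo "missing: $db_dir/plsdb.msh" >&2
        return 1
    fi
}
```

Call it with the same directory that was passed to `plassembler download -d`.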

George

Hi,

Thanks for your help. The plassembler update has indeed resolved the database problem. However, I have a new problem with medaka: it does not find certain dependencies, even though I then installed them separately.

Here is the error message:


[Mon Jan 15 13:18:40 2024]
Finished job 11.
3 of 18 steps (17%) done
Checking program versions
This is medaka 1.11.3
Program    Version    Required   Pass     
bcftools   Not found  1.11       False    
bgzip      1.17       1.11       True     
minimap2   2.26       2.11       True     
samtools   1.18       1.11       True     
tabix      1.17       1.11       True     
[Mon Jan 15 13:18:41 2024]
Error in rule medaka_round_1:
    jobid: 25
    input: /mnt/data/horigene/Corentin/20231214_24_genomes_GW/hybracter/processing/pre_polish/hybracter_RB04_test_chromosome_plus_plasmids.fasta, /mnt/data/horigene/Corentin/20231214_24_genomes_GW/hybracter/processing/qc/hybracter_RB04_test_filt_trim.fastq.gz
    output: /mnt/data/horigene/Corentin/20231214_24_genomes_GW/hybracter/processing/complete/medaka_rd_1/hybracter_RB04_test/consensus.fasta, /mnt/data/horigene/Corentin/20231214_24_genomes_GW/hybracter/versions/hybracter_RB04_test/medaka_complete.version
    log: /mnt/data/horigene/Corentin/20231214_24_genomes_GW/hybracter/stderr/medaka_round_1/hybracter_RB04_test.log (check log file(s) for error details)
    conda-env: /home/horigene/anaconda3/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/conda/348d2d1aa0134c8a68065158ce832239_
    shell:
        
        medaka_consensus -i /mnt/data/horigene/Corentin/20231214_24_genomes_GW/hybracter/processing/qc/hybracter_RB04_test_filt_trim.fastq.gz -d /mnt/data/horigene/Corentin/20231214_24_genomes_GW/hybracter/processing/pre_polish/hybracter_RB04_test_chromosome_plus_plasmids.fasta -o /mnt/data/horigene/Corentin/20231214_24_genomes_GW/hybracter/processing/complete/medaka_rd_1/hybracter_RB04_test -m r1041_e82_400bps_sup_v4.2.0  -t 16 2> /mnt/data/horigene/Corentin/20231214_24_genomes_GW/hybracter/stderr/medaka_round_1/hybracter_RB04_test.log
        medaka --version > /mnt/data/horigene/Corentin/20231214_24_genomes_GW/hybracter/versions/hybracter_RB04_test/medaka_complete.version
        touch /mnt/data/horigene/Corentin/20231214_24_genomes_GW/hybracter/processing/complete/medaka_rd_1/hybracter_RB04_test/calls_to_draft.bam
        rm /mnt/data/horigene/Corentin/20231214_24_genomes_GW/hybracter/processing/complete/medaka_rd_1/hybracter_RB04_test/calls_to_draft.bam
        touch /mnt/data/horigene/Corentin/20231214_24_genomes_GW/hybracter/processing/complete/medaka_rd_1/hybracter_RB04_test/consensus_probs.hdf
        rm /mnt/data/horigene/Corentin/20231214_24_genomes_GW/hybracter/processing/complete/medaka_rd_1/hybracter_RB04_test/consensus_probs.hdf
        
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Logfile /mnt/data/horigene/Corentin/20231214_24_genomes_GW/hybracter/stderr/medaka_round_1/hybracter_RB04_test.log:
================================================================================
Cannot import pyabpoa, some features may not be available.
Cannot import pyabpoa, some features may not be available.
Cannot import pyabpoa, some features may not be available.
Cannot import pyabpoa, some features may not be available.
bcftools: error while loading shared libraries: libgsl.so.25: cannot open shared object file: No such file or directory
================================================================================

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-01-15T131838.221728.snakemake.log
WorkflowError:
At least one job did not complete successfully.
[2024:01:15 13:18:41] ERROR: Snakemake failed

Have you ever had this problem? Do you know how to fix it?
Thanks for your help,
Corentin

The medaka version you are using is 1.11.3. I'm not sure how that happened, as hybracter uses 1.8.

I would try:

conda activate /home/horigene/anaconda3/envs/hybracterENV/lib/python3.12/site-packages/hybracter/workflow/conda/348d2d1aa0134c8a68065158ce832239_
mamba install medaka==1.8.0
conda deactivate

and try again.
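
The log also shows bcftools failing to load `libgsl.so.25`. To see which shared libraries a program in the activated environment cannot resolve, a diagnostic along these lines might be useful. It is a sketch that assumes a Linux system with `ldd` available; the helper names `parse_missing` and `missing_libs` are invented here.

```shell
# Print the names of libraries that ldd reports as "not found" (reads ldd output on stdin).
parse_missing() {
    awk '/not found/ { print $1 }'
}

# Run ldd on a program found on PATH and list its unresolved libraries.
# Assumption: the relevant conda environment is already activated.
missing_libs() {
    ldd "$(command -v "$1")" 2>/dev/null | parse_missing
}
```

With the medaka environment active, `missing_libs bcftools` prints nothing if every library resolves.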

Hi @gbouras13 ,
I finally found where my problem comes from: the priority order of the channels needed adjusting. After uninstalling and reinstalling everything from scratch with the channel order below, it works without error for me.

channels:
  - conda-forge
  - bioconda
  - defaults
  - r
channel_priority: strict
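
The same channel order can also be set from the command line rather than by editing the file by hand. This is a sketch, not something taken from the hybracter docs: the function name `set_strict_channels` is made up, and the target file defaults to `~/.condarc` unless another path is given.

```shell
# Write the channel order shown above into a condarc file.
# Assumption: the target defaults to ~/.condarc; pass another path to try it safely.
set_strict_channels() {
    condarc="${1:-$HOME/.condarc}"
    # --add puts a channel at the top of the list, so add the lowest priority first.
    for ch in r defaults bioconda conda-forge; do
        conda config --file "$condarc" --add channels "$ch"
    done
    conda config --file "$condarc" --set channel_priority strict
}
```

Running `conda config --show channels channel_priority` afterwards shows the settings conda will actually use.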

Thanks a lot for your help!
Now I just have to play with hybracter ;)

Corentin

Awesome, glad you solved this - hope you like Hybracter!

George