Merck / deepbgc

BGC Detection and Classification Using Deep Learning

Home Page: https://doi.org/10.1093/nar/gkz654

DeepBGC failed with Exception: Unexpected error detecting protein domains using HMMER hmmscan

Bollie15 opened this issue

Hi, I am having trouble using DeepBGC to identify the BGCs in Mycobacterium tuberculosis H37Rv (high GC Gram+).

Here is my DeepBGC version information:

(base) mima@123456:~/BGC/jieheganjun$ deepbgc info
 _____                  ____    ____   ____ 
 |  _ \  ___  ___ ____ | __ )  / ___) / ___)
 | | \ \/ _ \/ _ \  _ \|  _ \ | |  _ | |    
 | |_/ /  __/  __/ |_) | |_) || |_| || |___ 
 |____/ \___|\___| ___/|____/  \____| \____)
=================|_|===== version 0.1.26 =====
INFO    08/04 16:55:44   Available data files: ['Pfam-A.31.0.hmm', 'Pfam-A.31.0.hmm.h3f', 'Pfam-A.31.0.hmm.h3m', 'Pfam-A.31.0.hmm.h3i', 'Pfam-A.31.0.clans.tsv', 'Pfam-A.31.0.hmm.h3p']
INFO    08/04 16:55:44   ================================================================================
INFO    08/04 16:55:44   Available detectors: ['clusterfinder_retrained', 'clusterfinder_original', 'deepbgc', 'product_class', 'product_activity', 'clusterfinder_geneborder']
INFO    08/04 16:55:44   --------------------------------------------------------------------------------
INFO    08/04 16:55:44   Model: clusterfinder_retrained
INFO    08/04 16:55:44   Loading model from: /home/mima/.local/share/deepbgc/data/0.1.0/detector/clusterfinder_retrained.pkl
WARNING 08/04 16:55:44   Model not supported: ('Package "hmmlearn" needs to be installed to run ClusterFinder HMM. ', 'Install extra dependencies using: \n    pip install "deepbgc[hmm]"')
INFO    08/04 16:55:44   --------------------------------------------------------------------------------
INFO    08/04 16:55:44   Model: clusterfinder_original
INFO    08/04 16:55:44   Loading model from: /home/mima/.local/share/deepbgc/data/0.1.0/detector/clusterfinder_original.pkl
WARNING 08/04 16:55:44   Model not supported: ('Package "hmmlearn" needs to be installed to run ClusterFinder HMM. ', 'Install extra dependencies using: \n    pip install "deepbgc[hmm]"')
INFO    08/04 16:55:44   --------------------------------------------------------------------------------
INFO    08/04 16:55:44   Model: deepbgc
INFO    08/04 16:55:44   Loading model from: /home/mima/.local/share/deepbgc/data/0.1.0/detector/deepbgc.pkl
Using TensorFlow backend.
WARNING 08/04 16:55:49   From /home/mima/miniconda3/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING 08/04 16:55:49   From /home/mima/miniconda3/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:3445: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
INFO    08/04 16:55:50   Type: KerasRNN
INFO    08/04 16:55:50   Version: 0.1.0
INFO    08/04 16:55:50   Timestamp: 1551305667.986168 (2019-02-28T06:14:27.986168)
INFO    08/04 16:55:50   --------------------------------------------------------------------------------
INFO    08/04 16:55:50   Model: product_class
INFO    08/04 16:55:50   Loading model from: /home/mima/.local/share/deepbgc/data/0.1.0/detector/product_class.pkl
/home/mima/miniconda3/lib/python3.7/site-packages/sklearn/base.py:306: UserWarning: Trying to unpickle estimator DecisionTreeClassifier from version 0.18.2 when using version 0.21.3. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)
/home/mima/miniconda3/lib/python3.7/site-packages/sklearn/base.py:306: UserWarning: Trying to unpickle estimator RandomForestClassifier from version 0.18.2 when using version 0.21.3. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)
INFO    08/04 16:55:50   Type: RandomForestClassifier
INFO    08/04 16:55:50   Version: 0.1.0
INFO    08/04 16:55:50   Timestamp: 1551781410.019103 (2019-03-05T18:23:30.019103)
INFO    08/04 16:55:50   --------------------------------------------------------------------------------
INFO    08/04 16:55:50   Model: product_activity
INFO    08/04 16:55:50   Loading model from: /home/mima/.local/share/deepbgc/data/0.1.0/detector/product_activity.pkl
INFO    08/04 16:55:50   Type: RandomForestClassifier
INFO    08/04 16:55:50   Version: 0.1.0
INFO    08/04 16:55:50   Timestamp: 1551781433.886473 (2019-03-05T18:23:53.886473)
INFO    08/04 16:55:50   --------------------------------------------------------------------------------
INFO    08/04 16:55:50   Model: clusterfinder_geneborder
INFO    08/04 16:55:50   Loading model from: /home/mima/.local/share/deepbgc/data/0.1.0/detector/clusterfinder_geneborder.pkl
WARNING 08/04 16:55:50   Model not supported: ('Package "hmmlearn" needs to be installed to run ClusterFinder HMM. ', 'Install extra dependencies using: \n    pip install "deepbgc[hmm]"')
INFO    08/04 16:55:50   ================================================================================
INFO    08/04 16:55:50   Available classifiers: ['product_class', 'product_activity']
INFO    08/04 16:55:50   --------------------------------------------------------------------------------
INFO    08/04 16:55:50   Model: product_class
INFO    08/04 16:55:50   Loading model from: /home/mima/.local/share/deepbgc/data/0.1.0/classifier/product_class.pkl
INFO    08/04 16:55:50   Type: RandomForestClassifier
INFO    08/04 16:55:50   Version: 0.1.0
INFO    08/04 16:55:50   Timestamp: 1551781410.019103 (2019-03-05T18:23:30.019103)
INFO    08/04 16:55:51   --------------------------------------------------------------------------------
INFO    08/04 16:55:51   Model: product_activity
INFO    08/04 16:55:51   Loading model from: /home/mima/.local/share/deepbgc/data/0.1.0/classifier/product_activity.pkl
INFO    08/04 16:55:51   Type: RandomForestClassifier
INFO    08/04 16:55:51   Version: 0.1.0
INFO    08/04 16:55:51   Timestamp: 1551781433.886473 (2019-03-05T18:23:53.886473)
INFO    08/04 16:55:51   ================================================================================
WARNING 08/04 16:55:51   Some warnings detected, check the output above

I then ran the command deepbgc pipeline ./GCF_000195955.2_ASM19595v2_genomic.fna. Unfortunately, it failed.

(base) mima@123456:~/BGC/jieheganjun$ deepbgc pipeline ./GCF_000195955.2_ASM19595v2_genomic.fna 
 _____                  ____    ____   ____ 
 |  _ \  ___  ___ ____ | __ )  / ___) / ___)
 | | \ \/ _ \/ _ \  _ \|  _ \ | |  _ | |    
 | |_/ /  __/  __/ |_) | |_) || |_| || |___ 
 |____/ \___|\___| ___/|____/  \____| \____)
=================|_|===== version 0.1.26 =====
INFO    08/04 16:32:36   Loading model from: /home/mima/.local/share/deepbgc/data/0.1.0/detector/deepbgc.pkl
Using TensorFlow backend.
WARNING 08/04 16:32:36   From /home/mima/miniconda3/lib/python3.7/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING 08/04 16:32:36   From /home/mima/miniconda3/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py:3445: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
INFO    08/04 16:32:37   Loading model from: /home/mima/.local/share/deepbgc/data/0.1.0/classifier/product_class.pkl
/home/mima/miniconda3/lib/python3.7/site-packages/sklearn/base.py:306: UserWarning: Trying to unpickle estimator DecisionTreeClassifier from version 0.18.2 when using version 0.21.3. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)
/home/mima/miniconda3/lib/python3.7/site-packages/sklearn/base.py:306: UserWarning: Trying to unpickle estimator RandomForestClassifier from version 0.18.2 when using version 0.21.3. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)
INFO    08/04 16:32:37   Loading model from: /home/mima/.local/share/deepbgc/data/0.1.0/classifier/product_activity.pkl
INFO    08/04 16:32:37   Processing input file 1/1: ./GCF_000195955.2_ASM19595v2_genomic.fna
INFO    08/04 16:32:37   ================================================================================
INFO    08/04 16:32:37   Processing record #1: NC_000962.3
WARNING 08/04 16:32:37   Updating record alphabet to generic_dna
INFO    08/04 16:32:37   Finding genes in record: NC_000962.3
INFO    08/04 16:32:47   Detecting Pfam domains in "NC_000962.3" using HMMER hmmscan, this might take a while...
WARNING 08/04 16:42:14   == HMMER hmmscan Error: ================
WARNING 08/04 16:42:14   
WARNING 08/04 16:42:14   == End HMMER hmmscan Error. ============
ERROR   08/04 16:42:14   Unexpected error detecting protein domains using HMMER hmmscan
Traceback (most recent call last):
  File "/home/mima/miniconda3/lib/python3.7/site-packages/deepbgc/main.py", line 113, in main
    run(argv)
  File "/home/mima/miniconda3/lib/python3.7/site-packages/deepbgc/main.py", line 102, in run
    args.func.run(**args_dict)
  File "/home/mima/miniconda3/lib/python3.7/site-packages/deepbgc/command/pipeline.py", line 177, in run
    step.run(record)
  File "/home/mima/miniconda3/lib/python3.7/site-packages/deepbgc/pipeline/annotator.py", line 35, in run
    pfam_annotator.annotate()
  File "/home/mima/miniconda3/lib/python3.7/site-packages/deepbgc/pipeline/pfam.py", line 97, in annotate
    self._run_hmmscan(protein_path, domtbl_path)
  File "/home/mima/miniconda3/lib/python3.7/site-packages/deepbgc/pipeline/pfam.py", line 73, in _run_hmmscan
    raise Exception("Unexpected error detecting protein domains using HMMER hmmscan")
Exception: Unexpected error detecting protein domains using HMMER hmmscan
ERROR   08/04 16:42:14   ================================================================================
ERROR   08/04 16:42:14   DeepBGC failed with Exception: Unexpected error detecting protein domains using HMMER hmmscan
ERROR   08/04 16:42:14   ================================================================================

I don't know how to solve this problem, so please help me. (By the way, I also submitted this genome sequence to antiSMASH, and it worked.) Thanks in advance.

Hi @Bollie15, I tried running deepbgc on those sequences and didn't get any error. Can you try running it again? It looks like hmmscan crashed; maybe there was not enough memory available?
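
An empty hmmscan error block like the one in the log is consistent with the process being killed from outside rather than failing on its own, since a killed process has no chance to write to stderr. One way to check this is to re-run the failing command by hand and look at its return code. The following is a minimal sketch, not DeepBGC's own code: describe_exit and run_and_diagnose are hypothetical helper names, and it assumes a POSIX system, where Python's subprocess module reports death by signal as a negative return code.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Translate a subprocess return code into a short diagnosis."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # POSIX only: a negative code means the child was killed by a signal.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "signal {}".format(-returncode)
        # SIGKILL with no stderr output is typical of the kernel OOM killer,
        # although any other sender of SIGKILL looks the same from here.
        hint = " (possibly the out-of-memory killer)" if name == "SIGKILL" else ""
        return "killed by {}{}".format(name, hint)
    return "failed with exit status {}".format(returncode)


def run_and_diagnose(cmd):
    """Run cmd (a list of arguments) and return (diagnosis, stderr text)."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return describe_exit(proc.returncode), proc.stderr
```

Passing the hmmscan command line as the cmd list would show whether the process exited with an error status of its own or was killed. If it was killed, the kernel log (for example via dmesg) may confirm whether the out-of-memory killer was responsible.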

Thanks! It ran successfully after I closed all background programs.
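
Since freeing memory resolved the failure, it may help to check how much memory is available before starting a long run. The sketch below parses /proc/meminfo-style text and compares the MemAvailable value against a threshold. It assumes Linux with a kernel that reports MemAvailable, the function names are hypothetical, and required_gib is an illustrative parameter: this thread does not say how much memory hmmscan actually needs.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kiB values."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def available_gib(meminfo_text):
    """Return the MemAvailable value converted from kiB to GiB."""
    return parse_meminfo(meminfo_text)["MemAvailable"] / (1024 ** 2)


def enough_memory(meminfo_text, required_gib):
    """True if available memory meets the caller-chosen threshold.

    required_gib is an assumption supplied by the caller, not a documented
    hmmscan or DeepBGC requirement.
    """
    return available_gib(meminfo_text) >= required_gib


def read_meminfo(path="/proc/meminfo"):
    """Read the kernel's memory report (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

For example, enough_memory(read_meminfo(), 4.0) returns False when less than 4 GiB is available, which would be a cue to close other programs first, as was done here.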