Rostlab / nalaf

NLP framework in python for entity recognition and relationship extraction

Bug trying to run example (related to GNormPlus)

juanmirocks opened this issue:

python example_annotate.py -c /usr/local/bin/ -s "This is c.A1003G an example"
Due to a dependence on GNormPlus, running nalaf with -s and -d switches might take a long time.
Traceback (most recent call last):
  File "example_annotate.py", line 61, in <module>
    GNormPlusGeneTagger().tag(dataset, uniprot=True)
  File "/Users/jmcejuela/Work/hck/nalaf/nalaf/learning/taggers.py", line 153, in tag
    abstract = next(parts)
StopIteration
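
A note on the `StopIteration` above: a bare `next(parts)` raises it when the iterator has no further item. That would happen if a document built from the `-s` string has only one part, so no abstract follows the title. A minimal sketch of a more forgiving read, assuming `parts` is a plain iterable of text parts (the helper name `first_two_parts` is hypothetical and not part of nalaf):

```python
def first_two_parts(parts):
    """Return (title, abstract); either is None when the part is missing.

    Hypothetical helper, not nalaf code: it only illustrates passing a
    default to next() so an exhausted iterator does not raise StopIteration.
    """
    it = iter(parts)
    title = next(it, None)     # None if there are no parts at all
    abstract = next(it, None)  # None if the document has a single part
    return title, abstract


# A document created from a single string has one part only.
title, abstract = first_two_parts(["This is c.A1003G an example"])
if abstract is None:
    print("document has no abstract part; skipping abstract handling")
```

Whether a missing abstract should be skipped or treated as an error is a design decision for the tagger. The sketch only shows how to avoid the unhandled exception.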

Today (2016-02-04)

python3 example_annotate.py -s "This is c.A1003G an example"
Due to a dependence on GNormPlus, running nalaf with -s and -d switches might take a long time.
starting feature generator: <class 'nalaf.features.simple.SimpleFeatureGenerator'>
starting feature generator: <class 'nalaf.features.stemming.PorterStemFeatureGenerator'>
starting feature generator: <class 'nalaf.features.window.WindowFeatureGenerator'>
reading from cache /Users/jmcejuela/.nalaf/GNormPlus_cache.json
writing the cache /Users/jmcejuela/.nalaf/GNormPlus_cache.json
Traceback (most recent call last):
  File "example_annotate.py", line 56, in <module>
    GNormPlusGeneTagger().tag(dataset, uniprot=True)
  File "/Users/jmcejuela/Work/hck/nalaf/nalaf/learning/taggers.py", line 148, in tag
    genes, gnorm_title, gnorm_abstract = gnorm.get_genes_for_pmid(doc_id, postproc=True)
  File "/Users/jmcejuela/Work/hck/nalaf/nalaf/utils/ncbi_utils.py", line 39, in get_genes_for_pmid
    while not lines[line_counter]:
IndexError: list index out of range

The same happens with a file as input:

python3 example_annotate.py -d resources/example.txt
Due to a dependence on GNormPlus, running nalaf with -s and -d switches might take a long time.
starting feature generator: <class 'nalaf.features.simple.SimpleFeatureGenerator'>
starting feature generator: <class 'nalaf.features.stemming.PorterStemFeatureGenerator'>
starting feature generator: <class 'nalaf.features.window.WindowFeatureGenerator'>
reading from cache /Users/jmcejuela/.nalaf/GNormPlus_cache.json
writing the cache /Users/jmcejuela/.nalaf/GNormPlus_cache.json
Traceback (most recent call last):
  File "example_annotate.py", line 56, in <module>
    GNormPlusGeneTagger().tag(dataset, uniprot=True)
  File "/Users/jmcejuela/Work/hck/nalaf/nalaf/learning/taggers.py", line 148, in tag
    genes, gnorm_title, gnorm_abstract = gnorm.get_genes_for_pmid(doc_id, postproc=True)
  File "/Users/jmcejuela/Work/hck/nalaf/nalaf/utils/ncbi_utils.py", line 39, in get_genes_for_pmid
    while not lines[line_counter]:
IndexError: list index out of range
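
Both tracebacks end in `while not lines[line_counter]:`, which indexes past the end of `lines` when every remaining line is empty, for example when the response for a document has no content. A minimal sketch of a bounds-checked version of that skip loop follows. The helper name `skip_blank_lines` and the sample input are assumptions for illustration, not the actual code in `ncbi_utils.py`:

```python
def skip_blank_lines(lines, start=0):
    """Return the index of the first non-empty line at or after `start`.

    Returns len(lines) when no such line exists, instead of raising
    IndexError. Hypothetical helper, not part of nalaf.
    """
    i = start
    while i < len(lines) and not lines[i]:
        i += 1
    return i


# Assumed failure case: an empty response body splits into one empty line.
lines = "".split("\n")
line_counter = skip_blank_lines(lines)
if line_counter == len(lines):
    print("no annotations returned for this document")
```

The caller can then decide to return an empty result for the document or to raise a more descriptive error.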

@abojchevski this likely has to do with a wrong document format structure.

The -d switch is broken. I was aware of this but did not document it, sorry. It needs to be fixed either way.

Excuse me, has this bug been solved? I ran the code today, and the console shows the same errors.

Why do only the PMIDs given in the example (15878741 and 12625412) work, while others produce errors?

@yanqiangmiffy thanks for reporting -- first, curious: what is your use of nalaf?

The bug is actually somewhat obsolete: after refactoring the nalaf code, the dependency on GNormPlus should not exist in nalaf, only in nala.

So the real solution would be to remove GNormPlus as a dependency. Nonetheless, do you still need this code to run?