Tian312 / PICO_Parser

A clinical BERT-based NLP tool for parsing clinical trial abstracts following the PICO framework

No result for BERT Parser

miladnouriezade opened this issue

Hi, I used the BERT parser as you described and tried to get predictions on the test data you provided, but the output directory contained nothing except exceptionlist.txt.
I replaced the downloaded bluebert_pretrained_ori.tar.gz with the same folder in the directory, put the NCBI_BERT files in the bert_init_models folder, and set bluebert_dir = "bluebert_pretrained_ori". Is that right?
I would be grateful if you could help me.

Hi @miladnouriezade, thanks for your interest in this project. This could be caused by a mismatch between the BERT BPE tokenizer and the parser's tokenizer in some special cases of the text. Some abstracts work, while others containing certain text structures don't. I'm trying to fix it; in the meantime, you can try different abstract text of your own.

Hi @miladnouriezade, the bugs have been fixed in the latest version. You can try running the example input in the BERT-based parser by following the instructions.

Hi @Tian312, I tried your new update to the BERT parser today, but I still got nothing except exceptionlist.txt.
Here is my output:

Loading customized config and text tokenizer... 
ln: QuickUMLS: File exists
WARNING:tensorflow:Estimator's model_fn (<function model_fn_builder.<locals>.model_fn at 0x133a22730>) includes params argument, but params are not passed to Estimator.
WARNING:tensorflow:eval_on_tpu ignored because use_tpu is False.
Start parsing PICO elements.
Saved all parsing results in test/json3

@miladnouriezade, could you change the try: on line 795 to if 1==1:, comment out the except: on lines 792-793 of run_bluebert_ner_predict.py, run it again, and paste the error message here?
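
The edit suggested above exposes an exception that the script was silently swallowing. A gentler alternative is to keep the try/except but record the full traceback, so a failure shows up in exceptionlist.txt with its cause. The following is only a minimal sketch of that idea, not the code in run_bluebert_ner_predict.py; the names parse_all and parse_one are hypothetical, and the log format is an assumption.

```python
import traceback


def parse_all(paths, parse_one, log_path="exceptionlist.txt"):
    """Parse each input; on failure, log the path and the full traceback.

    parse_one is a hypothetical callable standing in for the real parser.
    """
    results, failed = {}, []
    for path in paths:
        try:
            results[path] = parse_one(path)
        except Exception:
            failed.append(path)
            # Append the traceback so the root cause is not lost.
            with open(log_path, "a", encoding="utf-8") as log:
                log.write(f"{path}\n{traceback.format_exc()}\n")
    return results, failed
```

With logging like this, a missing-resource error would appear in the log file directly, without editing the script to disable the exception handler.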

@Tian312 Hi again, I did what you said. Here is the error:

Loading customized config and text tokenizer... 
ln: QuickUMLS: File exists
WARNING:tensorflow:Estimator's model_fn (<function model_fn_builder.<locals>.model_fn at 0x128add730>) includes params argument, but params are not passed to Estimator.
WARNING:tensorflow:eval_on_tpu ignored because use_tpu is False.
Start parsing PICO elements.
Traceback (most recent call last):
  File "run_bluebert_ner_predict.py", line 801, in <module>
    main()
  File "run_bluebert_ner_predict.py", line 765, in main
    predict_examples = processor.get_pred_examples(input_file)
  File "run_bluebert_ner_predict.py", line 266, in get_pred_examples
    self._read_data(self.txt2conll(data_dir)), "test")
  File "run_bluebert_ner_predict.py", line 185, in txt2conll
    sents = sent_tokenize(raw_text) 
  File "/Users/miladnourizade/anaconda3/envs/PICO_Parser/lib/python3.6/site-packages/nltk/tokenize/__init__.py", line 106, in sent_tokenize
    tokenizer = load("tokenizers/punkt/{0}.pickle".format(language))
  File "/Users/miladnourizade/anaconda3/envs/PICO_Parser/lib/python3.6/site-packages/nltk/data.py", line 752, in load
    opened_resource = _open(resource_url)
  File "/Users/miladnourizade/anaconda3/envs/PICO_Parser/lib/python3.6/site-packages/nltk/data.py", line 877, in _open
    return find(path_, path + [""]).open()
  File "/Users/miladnourizade/anaconda3/envs/PICO_Parser/lib/python3.6/site-packages/nltk/data.py", line 585, in find
    raise LookupError(resource_not_found)
LookupError: 
**********************************************************************
  Resource punkt not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('punkt')
  
  For more information see: https://www.nltk.org/data.html

  Attempted to load tokenizers/punkt/PY3/english.pickle

  Searched in:
    - '/Users/miladnourizade/nltk_data'
    - '/Users/miladnourizade/anaconda3/envs/PICO_Parser/nltk_data'
    - '/Users/miladnourizade/anaconda3/envs/PICO_Parser/share/nltk_data'
    - '/Users/miladnourizade/anaconda3/envs/PICO_Parser/lib/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
    - ''
**********************************************************************


@Tian312 I used these commands to obtain the punkt model, and it worked.

  >>> import nltk
  >>> nltk.download('punkt')
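
For anyone scripting the setup, the same fix can be applied only when the resource is actually missing. This is a minimal sketch: it relies on nltk.data.find raising LookupError for a missing resource (as in the traceback above) and on nltk.download; the helper name ensure_nltk_resource and its find/download parameters are hypothetical, added so the logic can be exercised without NLTK installed.

```python
def ensure_nltk_resource(resource_path="tokenizers/punkt", package="punkt",
                         find=None, download=None):
    """Download an NLTK resource only if it is not already available.

    Returns True if a download was triggered, False otherwise.
    """
    if find is None or download is None:
        import nltk  # imported lazily; only needed for the real lookup
        find = find or nltk.data.find
        download = download or nltk.download
    try:
        find(resource_path)      # raises LookupError when the resource is absent
        return False
    except LookupError:
        download(package)        # same effect as nltk.download('punkt')
        return True
```

Calling ensure_nltk_resource() once before running the parser should avoid the "Resource punkt not found" error shown earlier.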

I am unable to get TensorFlow 1.0. Can you help me out, please?